Our Quantized Llama 3 Models

Download our quantized models from Hugging Face at https://huggingface.co/intelpen. We provide models quantized with GPTQ to 4 and 8 bits and fine-tuned on WikiText. We also have Bits & …
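To show what 4-bit weight quantization does at a low level, here is a minimal sketch of symmetric per-group quantization in NumPy. It is a simplified stand-in for GPTQ (which additionally minimizes layer output error), and the group size and rounding scheme are illustrative assumptions, not the exact settings used for our models:

```python
import numpy as np

def quantize_4bit(w, group_size=64):
    """Symmetric per-group 4-bit quantization (simplified GPTQ-style sketch)."""
    w = w.reshape(-1, group_size)
    # One fp scale per group so the 4-bit range (-8..7) covers the group's values.
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q, scale):
    """Reconstruct approximate fp32 weights from 4-bit codes and scales."""
    return (q.astype(np.float32) * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)
q, scale = quantize_4bit(w)
w_hat = dequantize_4bit(q, scale)
# The reconstruction error is bounded by half a quantization step per group.
max_err = float(np.abs(w - w_hat).max())
```

Each stored weight needs only 4 bits plus a shared per-group scale, which is where the roughly 8x memory saving over fp32 comes from.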

Test our LLMs

This is a preview of our 8B LLM models fine-tuned on customer data. The models are quantized to 4-bit (reducing the memory requirement from 48 GB to ~6.5 GB), and the last …
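As a back-of-envelope illustration of the memory savings, the sketch below estimates the raw weight footprint of an 8B-parameter model at full precision versus 4-bit. The group size and fp16 per-group scales are assumptions for illustration; the runtime figures quoted above are larger because they also include activations, KV cache, and framework overhead:

```python
def weight_bytes(n_params, bits_per_param, group_size=64, scale_bytes=2):
    """Rough raw-weight footprint: packed weights plus one scale per group."""
    packed = n_params * bits_per_param / 8
    scales = n_params / group_size * scale_bytes
    return packed + scales

n = 8e9  # 8B parameters; weights only, no activations or KV cache
fp32_bytes = weight_bytes(n, 32, scale_bytes=0)  # full precision needs no scales
int4_bytes = weight_bytes(n, 4)
print(f"fp32: {fp32_bytes/2**30:.1f} GiB, 4-bit: {int4_bytes/2**30:.1f} GiB")
```

Packing weights into 4 bits cuts raw storage roughly 8x relative to fp32, with a small additional cost for the per-group scales.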