Try it live on Gradio. How does it work? As easy as 1, 2, 3: load our 4-bit model from Hugging Face, using the standard Rollama2 tokenizer. …
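As a rough illustration of the loading step, the sketch below uses the Hugging Face `transformers` auto classes. It is a minimal sketch under assumptions, not the project's own loading script: the repository id is a hypothetical placeholder (the page names only the `intelpen` organization), the helper names `load_quantized_model` and `generate` are made up for this example, and it assumes the tokenizer is stored in the same repository as the quantized weights.

```python
# Hypothetical repository id: only the "intelpen" organization is named on this
# page, so replace the placeholder with a real model name from that organization.
MODEL_ID = "intelpen/<model-name>"


def load_quantized_model(model_id: str = MODEL_ID):
    """Load a pre-quantized causal LM and its tokenizer from the Hugging Face Hub."""
    # Imported lazily so this file can be read or imported without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the tokenizer files live in the same repo as the model weights.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" needs the accelerate package; quantized checkpoints also
    # need the matching backend installed (a GPTQ backend or bitsandbytes).
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    return model, tokenizer


def generate(model, tokenizer, prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation of `prompt` and return it as plain text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With a real repository id in place, `model, tokenizer = load_quantized_model()` followed by `generate(model, tokenizer, "...")` would run one prompt through the model.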
Download our quantized models from Hugging Face at https://huggingface.co/intelpen. We have models generated with GPTQ at 4 and 8 bits, which were fine-tuned on wikitext. We also have Bits & …
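To give a feel for what storing weights on 4 or 8 bits means, here is a small pure-Python sketch of plain round-to-nearest quantization with a min/max scale. This is deliberately not GPTQ, which chooses the quantized values using calibration data. It only shows the storage idea and the accuracy trade-off between bit widths. The function names and the sample weights are invented for the example.

```python
from typing import List, Tuple


def quantize(weights: List[float], bits: int) -> Tuple[List[int], float, float]:
    """Map floats to integer codes in [0, 2**bits - 1] using a min/max scale."""
    levels = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    # Guard against a constant vector, where hi == lo would give a zero scale.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes: List[int], scale: float, lo: float) -> List[float]:
    """Reconstruct approximate floats from the integer codes."""
    return [c * scale + lo for c in codes]


def max_error(weights: List[float], bits: int) -> float:
    """Largest absolute difference between original and reconstructed weights."""
    codes, scale, lo = quantize(weights, bits)
    restored = dequantize(codes, scale, lo)
    return max(abs(w - r) for w, r in zip(weights, restored))


# Made-up sample weights, purely for illustration.
sample = [-1.0, -0.3, 0.0, 0.42, 1.0]
err4 = max_error(sample, 4)  # 16 levels: smaller storage, coarser values
err8 = max_error(sample, 8)  # 256 levels: larger storage, finer values
```

Each weight is stored as a small integer code plus a shared scale and offset, so fewer bits mean a smaller model and a larger rounding error per weight.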