Can't believe that worked lol! Thank you so much. The speed increased significantly!
Thanks. Will try this. No idea how these really work, so that is why I am asking :)
I am talking about this particular model:
https://huggingface.co/TheBloke/goliath-120b-GGUF
I specifically use: goliath-120b.Q4_K_M.gguf
I can run it on runpod.io on this A100 instance at a "humane" speed, but it is way too slow for creating long-form text.
https://preview.redd.it/fz28iycv860c1.png?width=350&format=png&auto=webp&s=cd034b6fb6fe80f209f5e6d5278206fd714a1b10
These are my settings in text-generation-webui:
https://preview.redd.it/vw53pc33960c1.png?width=833&format=png&auto=webp&s=0fccbeac0994447cf7b7462f65d79f2e8f8f1969
Any advice? Thanks