kryptkpr@alien.top

DeepSeek is not based on any Llama training; it's their own 2T-token pretrain with 16k context. All this info is at the top of their model card.


"We present GoLLIE, a Large Language Model trained to follow annotation guidelines. GoLLIE outperforms previous approaches on zero-shot Information Extraction and allows the user to perform inferences with annotation schemas defined on the fly. Unlike previous approaches, GoLLIE is able to follow detailed definitions and does not rely only on the knowledge already encoded in the LLM."

Paper: https://huggingface.co/papers/2310.03668

GitHub: https://github.com/hitz-zentroa/GoLLIE

Models (7B, 13B, 34B): https://huggingface.co/collections/HiTZ/gollie-651bf19ee315e8a224aacc4f

7B and 13B GGUF: https://huggingface.co/s3nh/HiTZ-GoLLIE-7B-GGUF https://huggingface.co/s3nh/HiTZ-GoLLIE-13B-GGUF

Nobody has done quants for the 34B yet 😞
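
For anyone who wants to try it, here's a minimal sketch of schema-on-the-fly inference with the 7B checkpoint. The prompt layout (a dataclass whose docstring carries the annotation guideline, then the text to label) mirrors the code-style format shown in the GoLLIE repo, but the `Launcher` class, the sample sentence, and the exact template here are illustrative assumptions; check the repo's notebooks for the real prompt template.

```python
# Sketch of zero-shot IE with an annotation schema defined on the fly.
# Assumes the 7B model is published as HiTZ/GoLLIE-7B and roughly
# follows the code-style prompt format from the GoLLIE repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HiTZ/GoLLIE-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the `accelerate` package
)

# The guideline lives in the docstring; GoLLIE is trained to follow
# it rather than rely only on knowledge baked into the weights.
prompt = '''# The following lines describe the task definition
@dataclass
class Launcher(Template):
    """Refers to a vehicle designed to carry a payload into space."""
    mention: str

# This is the text to analyze
text = "The Ariane 5 lifted off from Kourou on Tuesday morning."

# The annotation instances that take place in the text above
result ='''

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens (the predicted result list)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```

If it works as advertised, the completion should look something like `[Launcher(mention="Ariane 5")]`, i.e. instances of the dataclass you defined in the prompt.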