kryptkpr@alien.top

DeepSeek is not based on any Llama training; it's their own 2T-token pretrain with 16k context. All this info is at the top of their model card.


"We present GoLLIE, a Large Language Model trained to follow annotation guidelines. GoLLIE outperforms previous approaches on zero-shot Information Extraction and allows the user to perform inferences with annotation schemas defined on the fly. Unlike previous approaches, GoLLIE is able to follow detailed definitions and does not rely only on the knowledge already encoded in the LLM."

Paper: https://huggingface.co/papers/2310.03668

GitHub: https://github.com/hitz-zentroa/GoLLIE

Models (7B, 13B, 34B): https://huggingface.co/collections/HiTZ/gollie-651bf19ee315e8a224aacc4f

7B and 13B GGUF: https://huggingface.co/s3nh/HiTZ-GoLLIE-7B-GGUF https://huggingface.co/s3nh/HiTZ-GoLLIE-13B-GGUF

Nobody has done quants for the 34B yet 😞
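
For anyone who wants to try it, here's a minimal sketch of schema-on-the-fly inference with the 7B checkpoint. The prompt layout (a dataclass whose docstring carries the annotation guideline, then the text to label) mirrors the code-style format shown in the GoLLIE repo, but the `Launcher` class, the sample sentence, and the exact template here are illustrative assumptions; check the repo's notebooks for the real prompt template.

```python
# Sketch of zero-shot IE with an annotation schema defined on the fly.
# Assumes the 7B model is published as HiTZ/GoLLIE-7B and roughly
# follows the code-style prompt format from the GoLLIE repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HiTZ/GoLLIE-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the `accelerate` package
)

# The guideline lives in the docstring; GoLLIE is trained to follow
# it rather than rely only on knowledge baked into the weights.
prompt = '''# The following lines describe the task definition
@dataclass
class Launcher(Template):
    """Refers to a vehicle designed to carry a payload into space."""
    mention: str

# This is the text to analyze
text = "The Ariane 5 lifted off from Kourou on Tuesday morning."

# The annotation instances that take place in the text above
result ='''

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens (the predicted result list)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```

If it works as advertised, the completion should look something like `[Launcher(mention="Ariane 5")]`, i.e. instances of the dataclass you defined in the prompt.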