This post was submitted on 30 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.

[–] PookaMacPhellimen@alien.top 1 points 11 months ago (2 children)

https://github.com/QwenLM/Qwen

Also released was a 1.8B model.

From Binyuan Hui’s Twitter announcement:

“We are proud to present our sincere open-source works: Qwen-72B and Qwen-1.8B! Including Base, Chat and Quantized versions!

🌟 Qwen-72B has been trained on high-quality data consisting of 3T tokens, boasting a larger parameter scale and more training data to achieve a comprehensive performance upgrade. Additionally, we have expanded the context window length to 32K and enhanced the system prompt capability, allowing users to customize their own AI assistant with just a single prompt.

🎁 Qwen-1.8B is our additional gift to the research community, striking a balance between maintaining essential functionalities and maximizing efficiency, generating 2K-length text content with just 3GB of GPU memory.

We are committed to continuing our dedication to the open-source community and thank you all for your enjoyment and support! 🚀 Finally, Happy 1st birthday ChatGPT. 🎂”
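
If you want to try the single-prompt customization right away, here's a minimal sketch against the Hugging Face transformers API. The model ID and the `chat()` helper follow Qwen's published README usage, but treat the exact arguments as assumptions that may vary by release:

```python
# Minimal sketch, assuming the custom chat() helper that Qwen ships in
# its Hugging Face modeling code; arguments may differ by release.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-1_8B-Chat"  # the small "gift" model from the announcement

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    trust_remote_code=True,  # Qwen ships its own modeling code
).eval()

# The `system` argument is the "customize your own AI assistant with
# just a single prompt" feature mentioned in the announcement.
response, history = model.chat(
    tokenizer,
    "Introduce yourself in one sentence.",
    history=None,
    system="You are a terse assistant who answers like a pirate.",
)
print(response)
```

The quantized chat release (Qwen/Qwen-1_8B-Chat-Int4 on the Hub, assuming the usual Qwen naming) is presumably what the 3GB memory figure refers to.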

[–] candre23@alien.top 1 points 11 months ago (1 children)

we have expanded the context window length to 32K

Kinda buried the lede here. This is far and away the biggest feature of this model. Here's hoping it's actually decent as well!

[–] jeffwadsworth@alien.top 1 points 11 months ago (1 children)

Well, it depends on how well it actually recalls details across that long a context. Did you see that comparison sheet on Claude and GPT-4? Astounding.
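
For context, those comparison sheets come from "needle in a haystack" probes: bury one fact deep in filler text and ask the model to retrieve it. A toy version is sketched below; the model choice, filler sizing, and `chat()` call are all illustrative assumptions, not Qwen's official evaluation:

```python
# Toy "needle in a haystack" probe for long-context recall.
import random

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-1_8B-Chat"  # swap in Qwen-72B-Chat to exercise the full 32K
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", trust_remote_code=True
).eval()

needle = "The secret passphrase is 'violet-kumquat-42'."
filler = "The quick brown fox jumps over the lazy dog. "

# ~2000 filler sentences is roughly 20K tokens; scale the count toward
# whatever context limit the loaded model supports.
chunks = [filler] * 2000
chunks.insert(random.randrange(len(chunks)), needle + " ")
prompt = "".join(chunks) + "\nWhat is the secret passphrase?"

response, _ = model.chat(tokenizer, prompt, history=None)
print(response)  # good long-context recall should surface the passphrase
```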

[–] rePAN6517@alien.top 1 points 11 months ago

My heart skipped a beat because I thought it was Qwen-1.8T.