this post was submitted on 30 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.

[–] drooolingidiot@alien.top 1 points 1 year ago (1 children)

This is amazing. Yesterday we got Deepseek, and today we're getting Qwen. Thank you for releasing this model!

I'm looking forward to seeing comparisons

[–] lunar2solar@alien.top 1 points 1 year ago (1 children)

Is there any free website where I can test those Chinese models? Thanks.

[–] roselan@alien.top 1 points 1 year ago

For Deepseek there is https://chat.deepseek.com/; for Qwen I don't know.

[–] PookaMacPhellimen@alien.top 1 points 1 year ago (2 children)

https://github.com/QwenLM/Qwen

Also released was a 1.8B model.

From Binyuan Hui’s Twitter announcement:

“We are proud to present our sincere open-source works: Qwen-72B and Qwen-1.8B! Including Base, Chat and Quantized versions!

🌟 Qwen-72B has been trained on high-quality data consisting of 3T tokens, boasting a larger parameter scale and more training data to achieve a comprehensive performance upgrade. Additionally, we have expanded the context window length to 32K and enhanced the system prompt capability, allowing users to customize their own AI assistant with just a single prompt.

🎁 Qwen-1.8B is our additional gift to the research community, striking a balance between maintaining essential functionalities and maximizing efficiency, generating 2K-length text content with just 3GB of GPU memory.

We are committed to continuing our dedication to the open-source community and thank you all for your enjoyment and support! 🚀 Finally, Happy 1st birthday ChatGPT. 🎂”
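For anyone wondering what that single-prompt customization looks like in practice, here's a minimal sketch based on the chat API in the QwenLM/Qwen README; treat the exact `system` keyword and `chat()` signature as my assumption rather than gospel:

```python
# Minimal sketch of Qwen's system-prompt customization, based on the
# QwenLM/Qwen README. trust_remote_code is required because the chat()
# helper ships with the model's own repository code. Assumes the
# "Qwen/Qwen-72B-Chat" checkpoint and that chat() accepts a `system` kwarg.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-72B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-72B-Chat", device_map="auto", trust_remote_code=True
).eval()

# A single system prompt "customizes" the assistant's persona.
response, history = model.chat(
    tokenizer,
    "How should I plan a week-long trip to Chengdu?",
    history=None,
    system="You are a terse travel agent who answers in bullet points.",
)
print(response)
```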

[–] rePAN6517@alien.top 1 points 1 year ago

My heart skipped a beat because I thought it said Qwen-1.8T.

[–] candre23@alien.top 1 points 1 year ago (1 children)

we have expanded the context window length to 32K

Kinda buried the lede here. This is far and away the biggest feature of this model. Here's hoping it's actually decent as well!

[–] jeffwadsworth@alien.top 1 points 1 year ago (1 children)

Well, it depends on how well it actually retains information across that context. Did you see that comparison sheet on Claude and GPT-4? Astounding.

[–] PookaMacPhellimen@alien.top 1 points 1 year ago (3 children)
[–] a_slay_nub@alien.top 1 points 1 year ago (1 children)

Bit disappointed by the coding performance, but it is a general-use model. It's insane how good GPT-3.5 is for how fast it is.

[–] ambient_temp_xeno@alien.top 1 points 1 year ago

Apparently the chat version scores about 64 on HumanEval.

[–] Secret_Joke_2262@alien.top 1 points 1 year ago (1 children)

What do these benchmarks mean for an LLM? There are many values, and I see that in most cases Qwen is better than GPT-4; in others it is worse, or much worse.

[–] rileyphone@alien.top 1 points 1 year ago

All the cases where it is better than GPT-4 are benchmarks involving the Chinese language. OpenAI is going to have a hard time getting access to extensive Chinese-language datasets, so it's not surprising a 72B model can beat GPT-4 there, though it's still impressive in its own right.

[–] Disastrous_Elk_6375@alien.top 1 points 1 year ago

big if true

[–] Secret_Joke_2262@alien.top 1 points 1 year ago

Now everyone is most interested in how much better it is than Llama 70B.

[–] perlthoughts@alien.top 1 points 1 year ago
[–] extopico@alien.top 1 points 1 year ago (1 children)

I wonder what the performance degradation is after quantizing. For other models, some users reported that quantization greatly affected other-language capabilities, and this model seems to be at least 50% Chinese.
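If you want to eyeball the damage yourself, one rough approach is to load the same checkpoint at fp16 and in 4-bit via bitsandbytes and compare outputs on Chinese prompts. A minimal sketch; the model ID and prompt are just placeholders:

```python
# Rough A/B sketch for eyeballing quantization damage to Chinese output.
# The 4-bit load uses transformers' BitsAndBytesConfig; the model ID is a
# placeholder, and the fp16 reference run needs enough VRAM for the model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Qwen/Qwen-72B-Chat"  # placeholder; any causal LM works
prompt = "请用中文简要介绍一下大熊猫。"  # "Briefly introduce the giant panda in Chinese."

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

def generate(model):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(out[0], skip_special_tokens=True)

# Full-precision (fp16) reference.
fp16 = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True
)
print("fp16:", generate(fp16))

# 4-bit NF4 quantized load via bitsandbytes.
quant = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
    trust_remote_code=True,
)
print("4-bit:", generate(quant))
```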

[–] Art10001@alien.top 1 points 1 year ago

I've seen that ChatGLM began talking in mixed Chinese/English when asked "What tips do you have for a mountaineering trip?"

[–] EnergyUnlucky@alien.top 1 points 1 year ago

Just when I'd talked myself out of getting a second 3090

[–] QuieselWusul@alien.top 1 points 1 year ago (1 children)

Why did so many new Chinese ~70B foundation models release in one day (this one, Deepseek, XVERSE)? Is there a reason they all released in such a short window?

[–] RayIsLazy@alien.top 1 points 1 year ago

Today is the one-year anniversary of ChatGPT.

[–] a_beautiful_rhind@alien.top 1 points 1 year ago

Heh, 72B with 32K context and GQA seems reasonable. It will make for interesting tunes if it's not super restricted.
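For anyone who hasn't run into GQA: groups of query heads share a single key/value head, which shrinks the KV cache and is a big part of why a 32K context stays affordable. A toy sketch of the bookkeeping, with made-up dimensions:

```python
# Toy sketch of grouped-query attention (GQA) head bookkeeping.
# Dimensions are made up; the point is that n_kv_heads < n_q_heads,
# so the KV cache shrinks by a factor of n_q_heads // n_kv_heads.
import torch

batch, seq, head_dim = 1, 16, 64
n_q_heads, n_kv_heads = 8, 2          # 4 query heads share each KV head
group = n_q_heads // n_kv_heads

q = torch.randn(batch, n_q_heads, seq, head_dim)
k = torch.randn(batch, n_kv_heads, seq, head_dim)   # only 2 KV heads cached
v = torch.randn(batch, n_kv_heads, seq, head_dim)

# Duplicate each KV head across its query group before attending.
k = k.repeat_interleave(group, dim=1)  # (1, 8, 16, 64)
v = v.repeat_interleave(group, dim=1)

attn = torch.softmax(q @ k.transpose(-2, -1) / head_dim**0.5, dim=-1) @ v
print(attn.shape)  # torch.Size([1, 8, 16, 64])
```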

[–] pseudonym325@alien.top 1 points 1 year ago

The last Qwen didn't really take off as a base model for further fine-tunes.

Looking forward to the results on the German data protection training benchmark ;)

[–] balianone@alien.top 1 points 1 year ago

Can it beat GPT-3.5-turbo?

[–] polawiaczperel@alien.top 1 points 1 year ago

Would it be possible to merge it with Deepseek Coder 33B?

[–] Postorganic666@alien.top 1 points 1 year ago

Is it censored?

[–] ambient_temp_xeno@alien.top 1 points 1 year ago

The first thing I looked for was the number of training tokens. I think Yi-34B got a lot of benefit from 3 trillion tokens, so this model having 3 trillion bodes well.

[–] ASL_Dev@alien.top 1 points 1 year ago
[–] norsurfit@alien.top 1 points 1 year ago (1 children)

In my informal testing, Qwen-72B is quite good. Anecdotally, I rate it stronger than Llama 2 from the few tests I have conducted.

[–] Secret_Joke_2262@alien.top 1 points 1 year ago

What tests have you run it on?

I'm very interested in storytelling and RP.

[–] Wonderful_Ad_5134@alien.top 1 points 1 year ago

If the US keeps going full woke and is too afraid to work as hard as possible on the LLM ecosystem, China won't think twice before winning this battle (which is basically the 21st-century battle in technology).

Feels sad to see the US decline like that...

[–] carbocation@alien.top 1 points 1 year ago

It would be great to see GGUF versions. (At least, my workflow right now goes via ollama.) How are people running Qwen-72B locally right now?
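If/when a GGUF conversion lands, the usual route would be llama.cpp or its Python bindings. A minimal sketch with llama-cpp-python; the filename is hypothetical and it assumes llama.cpp gains Qwen architecture support:

```python
# Minimal sketch of running a hypothetical Qwen-72B GGUF quant with
# llama-cpp-python. The model file name is made up; it assumes someone
# has converted and quantized the weights and that llama.cpp supports
# the Qwen architecture.
from llama_cpp import Llama

llm = Llama(
    model_path="qwen-72b-chat.Q4_K_M.gguf",  # hypothetical file
    n_ctx=32768,        # the advertised 32K context window
    n_gpu_layers=-1,    # offload as many layers as fit to the GPU
)

out = llm("Q: What is the capital of Australia?\nA:", max_tokens=32, stop=["\n"])
print(out["choices"][0]["text"])
```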

[–] logicchains@alien.top 0 points 1 year ago (1 children)

China seems to be pulling ahead in the open AI race. It now has three weights-available, non-undertrained ~70B models: Qwen, Deepseek, and XVERSE. The West, on the other hand, only has Llama 1 & 2 (and Falcon 180B, but the UAE barely counts as "the West").

[–] MangoReady901@alien.top 1 points 1 year ago

No one uses Falcon lmao

[–] omniron@alien.top 0 points 1 year ago (2 children)
[–] matsu-morak@alien.top 1 points 1 year ago (1 children)

I couldn't understand it. Is this true audio (can it differentiate a helicopter sound from a fire engine, for example, or a dog bark), or does it just transform speech into text and then feed that to the model?

[–] omniron@alien.top 1 points 1 year ago

It’s the former. It’s looking at audio data.

So you can ask it about sentiment, determine if someone is giggling, crying, or laughing, and maybe even detect a condescending or flirtatious tone, etc.

[–] Gigiboi@alien.top 1 points 1 year ago (1 children)
[–] kxtclcy@alien.top 1 points 1 year ago

Maybe for audio data that has both sound and words? For example, if you want to summarize a concert or something.