this post was submitted on 30 Nov 2023
LocalLLaMA
Community to discuss Llama, the family of large language models created by Meta AI.
There’s an audio multimodal model too:
https://github.com/QwenLM/Qwen-Audio
Use cases??
Maybe for audio data that has both sounds and words? For example, if you want to summarize a concert or something.
I couldn’t understand it. Is this true audio understanding (can it differentiate a helicopter sound from a fire engine, for example, or a dog bark), or does it just transcribe speech into text and then feed that to the model?
It’s the former. It’s looking at the raw audio data.
So you can ask it about sentiment, determine whether someone is giggling, crying, or laughing, and maybe even detect a condescending or flirtatious tone, etc.
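Concretely, asking the model about a raw sound (rather than a transcript) looks something like the sketch below. It follows the usage pattern in the Qwen-Audio repo’s README (the `Qwen/Qwen-Audio-Chat` checkpoint loaded with `trust_remote_code`, and its `model.chat()` / `tokenizer.from_list_format()` API); the file name `siren.wav` is a made-up example, and the details here are not re-verified against the current repo. The model download is large, so the actual run is gated behind an environment variable.

```python
# Hedged sketch: querying Qwen-Audio-Chat about a non-speech sound.
# Assumes the Qwen/Qwen-Audio-Chat checkpoint and its README chat API.
import os


def build_query(tokenizer, audio_path, question):
    """Interleave an audio file reference with a text question using the
    tokenizer's multimodal list format."""
    return tokenizer.from_list_format([
        {"audio": audio_path},  # path or URL to a sound file
        {"text": question},
    ])


def ask_about_sound(audio_path, question):
    # Imports deferred so the sketch can be read without the model installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(
        "Qwen/Qwen-Audio-Chat", trust_remote_code=True
    )
    model = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen-Audio-Chat", device_map="auto", trust_remote_code=True
    ).eval()
    response, _history = model.chat(
        tok, query=build_query(tok, audio_path, question), history=None
    )
    return response


if __name__ == "__main__" and os.environ.get("RUN_QWEN_AUDIO"):
    # e.g. distinguishing sounds instead of transcribing speech:
    print(ask_about_sound("siren.wav", "Is this a helicopter or a fire engine?"))
```

Because the question and the audio are interleaved in one prompt, you can also ask tone/sentiment questions ("Does the speaker sound sarcastic?") about the same clip in a follow-up turn via the returned `history`.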