I have an app on the App Store that does exactly that. It ships with a 4-bit quantised 3B-parameter LLM baked in (the app is a 1.67GB download), and users on newer phones (iPhone 14/15 Pro and Pro Max) can optionally download a 3-bit quantised 7B-parameter LLM.
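Those download sizes line up with a simple back-of-envelope estimate: parameters times bits-per-weight, divided by 8, plus some overhead for embeddings and quantization scales stored at higher precision. A minimal sketch of that arithmetic (the `overhead` factor here is an illustrative assumption, not the app's actual packaging math):

```python
# Rough on-disk size estimate for a quantised LLM:
#   params * bits_per_weight / 8 bytes, plus ~10% assumed overhead
# for embeddings, quantization scales, and metadata.

def model_size_gb(params_billion: float, bits_per_weight: float,
                  overhead: float = 1.1) -> float:
    """Approximate file size in GB for a quantised model."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8 * overhead
    return bytes_total / 1e9

# 4-bit 3B model: lands around 1.65 GB, close to the 1.67GB download above.
print(round(model_size_gb(3, 4), 2))   # → 1.65
# 3-bit 7B model: roughly 2.89 GB.
print(round(model_size_gb(7, 3), 2))   # → 2.89
```

The same estimate also explains why a 3-bit quant is needed to make a 7B model fit comfortably in the RAM budget of a phone.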
LocalLLaMA
Community to discuss Llama, the family of large language models created by Meta AI.
Hey, I actually just tried this on my iPhone SE 2nd gen to see if it would run the 3B model, even slowly, but it says it's not compatible. Any suggestions?
TinyLlama 1.1B may have potential (see the TinyLlama 1.1B project).
TheBloke has already made a GGUF of the v0.3 chat model.
Looking on HuggingFace, there may be more that have been fine-tuned for instruct, etc.
Check out the TinyLlama project! 1.1B parameters, pretty solid performance for its size, and the currently available checkpoints are only about halfway through the full pre-training run.
Lol, sounds rough. 3B is better than no B. And that should mean I can have several models loaded at once.
It will be very unfriendly for the user to have a 1-2GB app that eats RAM and battery like a mobile game, and you still won't get good or fast results. Check Replicate, RunPod, or vast.ai for cheap GPUs instead.