this post was submitted on 21 Nov 2023

LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.


With high-end Android phones now packing upwards of 24GB of RAM, I think there's huge potential for an app like this. It would be amazing to have something as powerful as the future Mistral 13B model running natively on smartphones!

You could interact with it privately without an internet connection. The convenience and capabilities would be incredible.

top 14 comments
[–] GermanK20@alien.top 1 points 10 months ago (1 children)

People are always building, but the smaller models are kinda pointless

[–] Winter_Tension5432@alien.top 1 points 10 months ago (1 children)

Smaller models are the future of smartphones. Everyone will be running 10B models on their phones by 2025, and these are more than enough for writing emails, doing translations, and just asking questions. A lot more useful than Siri and Alexa.

[–] GermanK20@alien.top 1 points 10 months ago (1 children)

Well, I've just tested a few models for my workflows and found that only 70B cuts it.

[–] Winter_Tension5432@alien.top 1 points 10 months ago

For now, but you will have 13B models as good as 70B models by the end of next year.

[–] SlowSmarts@alien.top 1 points 10 months ago (1 children)

The direction I took was to start building a Kivy app that connects to an LLM API at home via OpenVPN. I have Ooba and llama.cpp API servers that I can point the Android app at, so it works on old or new phones and runs at the speed of the server.

The downsides are that you need a static IP address or DDNS for the VPN to connect to, and cell reception can cause issues.

I have a static IP to my house, but you could host the API server in the cloud with a static IP if you wanted to do things similarly.
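
The client side is really just an HTTP call once the VPN is up. A minimal sketch of the idea in Python/Kivy, assuming the llama.cpp server's /completion endpoint and a made-up VPN address (Ooba's API would need its own endpoint and parameters):

```python
# Minimal sketch: Kivy client posting a prompt to a home llama.cpp server over the VPN.
# The address/port are hypothetical; a real app would do the request on a background
# thread so the UI doesn't freeze while the server generates.
import requests
from kivy.app import App
from kivy.uix.boxlayout import BoxLayout
from kivy.uix.button import Button
from kivy.uix.label import Label
from kivy.uix.textinput import TextInput

API_URL = "http://10.8.0.1:8080/completion"  # hypothetical OpenVPN address of the home server

class ChatClient(App):
    def build(self):
        root = BoxLayout(orientation="vertical")
        self.prompt = TextInput(hint_text="Ask something...", multiline=True)
        self.output = Label(text="")
        send = Button(text="Send", size_hint_y=0.2)
        send.bind(on_release=self.ask)
        root.add_widget(self.prompt)
        root.add_widget(send)
        root.add_widget(self.output)
        return root

    def ask(self, *_):
        # llama.cpp's server exposes POST /completion and returns {"content": ...}
        resp = requests.post(
            API_URL,
            json={"prompt": self.prompt.text, "n_predict": 256},
            timeout=120,
        )
        self.output.text = resp.json().get("content", "")

if __name__ == "__main__":
    ChatClient().run()
```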

[–] Winter_Tension5432@alien.top 1 points 10 months ago (1 children)

A normal person would not be able to do that. The first people to create an Oobabooga app for Android and iPhone and put it on the store at $15 will have my money for sure, and probably the money of a million other people too.

[–] SlowSmarts@alien.top 1 points 10 months ago

🤔 hmmm... I have some ideas to test...

[–] MrOogaBoga@alien.top 1 points 10 months ago (1 children)

Why isn't anyone building an Oogabooga-like app

You spoke the sacred words, so here I am.

[–] Winter_Tension5432@alien.top 1 points 10 months ago

I am dreaming of an S24 Ultra with an app that lets me run a hypothetical future Mistral 13B at 15 tokens/sec with TTS. A person can dream.

[–] a_beautiful_rhind@alien.top 1 points 10 months ago

Apple is literally doing this with their ML framework built into devices, but for tool applications, not a chatbot.

[–] BlackSheepWI@alien.top 1 points 10 months ago (1 children)

It's a lot of work. Phones use a different OS and a different processor instruction set. The latter can be a big pain, especially if you're really dependent on low-level optimizations.

I also feel that *most* people who would choose a phone over a PC for this kind of thing would rather just use a high-quality, easily accessible commercial option (ChatGPT, etc.) than a homebrew option that requires some work to get running. So demand for such a thing is pretty low.

[–] Winter_Tension5432@alien.top 1 points 10 months ago

I'm not so sure. ChatGPT has privacy issues, and a small but completely uncensored model has value too. There is a market for this: convenience and privacy.

[–] Nixellion@alien.top 1 points 10 months ago

Check out Ollama. They have links on their GitHub page to projects using it, and they have an Android app that I believe runs locally on the phone. It uses llama.cpp.
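
If you just want to hit it programmatically, Ollama also exposes a local REST API on port 11434. A quick sketch (the model name is just an example, use whatever you've pulled):

```python
# Sketch: calling a locally running Ollama server's generate endpoint.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "mistral",  # example; substitute whatever you've pulled with `ollama pull`
        "prompt": "Draft a short email declining a meeting.",
        "stream": False,
    },
    timeout=120,
)
print(resp.json()["response"])
```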

[–] _Lee_B_@alien.top 1 points 10 months ago

It's not just RAM; you also need the processing power. Phones can't run *good* LLMs yet.

If you watch the ChatGPT voice chat mode closely on Android, what it does is listen with a local voice model (whisper.cpp), and then answer generically and quickly, LOCALLY, for the first response/paragraph. While that's happening, it sends what you asked to the servers, where the real text processing takes place. By the time your phone has run the simple local model and read a simple first sentence to you, it has MOSTLY gotten the full paragraphs of text back from the server and can read those. Even then, you still notice a slight delay.
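
Schematically, that split looks something like this. A rough sketch, using the openai-whisper Python package to stand in for whisper.cpp and a placeholder endpoint/key for the remote model:

```python
# Sketch of the "transcribe locally, answer remotely" pattern described above.
# The local step is fast and runs on-device; the remote step is where the big model lives.
import requests
import whisper  # openai-whisper; needs ffmpeg installed for audio decoding

# Fast local step: a small speech-to-text model turns the recording into text.
stt = whisper.load_model("tiny")
question = stt.transcribe("question.wav")["text"]

# Slow remote step: ship the transcript to the server-side model (placeholder URL/key).
resp = requests.post(
    "https://api.example.com/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_KEY"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": question}],
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```

In the real app the quick local reply is spoken while this request is still in flight, which is why the delay is mostly hidden.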