SlowSmarts

joined 1 year ago
[–] SlowSmarts@alien.top 1 points 11 months ago

🤔 hmmm... I have some ideas to test...

[–] SlowSmarts@alien.top 1 points 11 months ago (2 children)

The direction I took was to start making a Kivy app that connects to an LLM API at home via OpenVPN. I have Ooba and llama.cpp API servers that I can point the Android app to. So it works on old or new phones and runs at the speed of the server.

The downsides: you need a static IP address or DDNS for the VPN to connect to, and cell reception can cause issues.

I have a static IP to my house, but you could also put the API server in the cloud with a static IP if you wanted to do things similarly.
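
To give an idea of what the app does under the hood, here's a rough sketch of the phone-side request. This is just an illustration; it assumes a llama.cpp server exposing an OpenAI-style /v1/chat/completions endpoint, and the VPN address, port, and generation settings are placeholders, not my actual setup:

```python
# Minimal sketch of the phone-side request a Kivy app could wrap.
# Assumes a llama.cpp (or similar OpenAI-compatible) server reachable over the VPN.
import requests

API_URL = "http://10.8.0.1:8080/v1/chat/completions"  # hypothetical OpenVPN-side address

def ask_llm(prompt: str, timeout: float = 120.0) -> str:
    """Send one chat turn to the home LLM server and return the reply text."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "temperature": 0.7,
    }
    resp = requests.post(API_URL, json=payload, timeout=timeout)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask_llm("Hello from my phone!"))
```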

[–] SlowSmarts@alien.top 1 points 11 months ago

Huh... I figured this has already been happening for a while on closed-dataset LLMs. In my experience, the leaderboard hasn't directly indicated a model's ability to do real-world work. Some of the lower-ranking models seem to do better with what I put them through than the top-ranking models. Just my personal opinion and observation.

[–] SlowSmarts@alien.top 1 points 1 year ago (1 children)

GPT-4 going woke is, unfortunately, the least of the problems happening these days.

[–] SlowSmarts@alien.top 1 points 1 year ago

TinyLlama 1.1b may have potential - Tiny Llama 1.1b project

TheBloke has made a GGUF of v0.3 chat already.

Looking on HuggingFace, there may be more that have been fine-tuned for instruct, etc.
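
If you want to see what else is out there, here's a quick way to poke around from Python. Just a sketch; the search term is a guess at how the fine-tunes will be named, so double-check the results on the Hub:

```python
# Rough sketch: list the most-downloaded models matching "TinyLlama" on the Hub.
from huggingface_hub import HfApi

api = HfApi()
for model in api.list_models(search="TinyLlama", sort="downloads", direction=-1, limit=10):
    print(model.id)
```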

[–] SlowSmarts@alien.top 1 points 1 year ago

Looks very interesting!

Will this work on a pre-AVX, CPU-only machine? (I happen to be far away from a computer right now to test.)

[–] SlowSmarts@alien.top 1 points 1 year ago

I ran a 13B Q4 on a Raspberry Pi 4 8GB with llama.cpp with no special settings; it just automatically cached from disk... It was mega slow and got worse with more tokens, but it did it. I don't know if it was llama.cpp or Raspberry Pi OS that did the caching.

You can cmake llama.cpp on many platforms.
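
If you'd rather not build the binaries yourself, roughly the same thing can be done through the llama-cpp-python bindings. A minimal sketch, assuming a local 13B Q4 GGUF file; the path, context size, and thread count are placeholders:

```python
# Rough sketch using the llama-cpp-python bindings instead of a cmake build.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-13b.Q4_K_M.gguf",  # hypothetical 13B Q4 GGUF file
    n_ctx=512,        # small context keeps memory pressure down on a Pi
    n_threads=4,      # Pi 4 has 4 cores
    use_mmap=True,    # default: the OS page cache serves the weights from disk
)

out = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
print(out["choices"][0]["text"])
```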

[–] SlowSmarts@alien.top 1 points 1 year ago

I made a mild wishlist in another thread - Cool things for 100k LLM

If I were making an expensive LLM from scratch, these would be some of my thoughts before spending the dough:

  • A very large percentage of people use OSS LLMs for roleplay or coding, so you might as well bake that into the base
  • Most coding examples and general programming data are years old and lack knowledge of many new and groundbreaking projects and technologies; updated scrapes of coding sites need to be made
  • Updated coding examples need to be generated
  • Longer coding examples are needed that can deal with multiple files in a codebase
  • Longer examples of summarizing code need to be generated (like book summing, but for long scripts)
  • Fine-tuning datasets need a lot of cleaning to remove incorrect examples, bad math, and political or sexual bias/agendas injected by wackjobs
  • Older math datasets seem way more error-prone than newer ones
  • GPT-4 is biased, and that will carry through into synthetic datasets; anything from it will likely taint the LLM, even if subtly, so more creative dataset cleaning is needed
  • Stop having datasets that contain stupid things like "As an AI...." (see the filtering sketch after this list)
  • Excessive alignment is like sponsoring from birth a highly prized and educated genius, just to give them a lobotomy on graduation day
  • People regularly circumvent censorship and sensationalize "jailbreaking" it anyway, so might as well leave the base model "uncensored" and advertise it as such
  • Cleaner datasets seem more important than maximizing the number of tokens trained
  • Multimodal and tool-wielding is the future, bake some cutting edge examples into the base
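
On the "As an AI..." point, here's the sort of quick-and-dirty filter I mean. A minimal sketch only; the phrase list and the "output" field name are assumptions about whatever instruct-dataset format you happen to be cleaning:

```python
# Minimal sketch of stripping refusal/boilerplate examples from a fine-tuning set.
import json

REFUSAL_PHRASES = [
    "as an ai",
    "as a language model",
    "i cannot fulfill",
    "i'm sorry, but i can't",
]

def is_clean(example: dict) -> bool:
    """Drop any example whose response contains obvious assistant boilerplate."""
    text = example.get("output", "").lower()  # field name is an assumption
    return not any(phrase in text for phrase in REFUSAL_PHRASES)

with open("train.jsonl") as src, open("train.cleaned.jsonl", "w") as dst:
    for line in src:
        example = json.loads(line)
        if is_clean(example):
            dst.write(json.dumps(example) + "\n")
```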

Speaking of clean datasets, have you checked out the new RedPajama-Data v2? There's your 10T+ tokens of clean data.