this post was submitted on 30 Nov 2023
LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.


Just read about this project on Twitter and it sounds really interesting:

https://github.com/mozilla-Ocho/llamafile

What do you guys think? Could it be even simpler than Ollama?

top 4 comments
[–] mcbagz@alien.top 1 points 11 months ago

It is probably better to ask here than to create a new post:

They have noted that Windows cannot run executables larger than 4 GB. Can anyone put together a really straightforward explanation of downloading a llamafile that is not bundled with weights, and then pointing it at a separate weights file?
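A minimal sketch of the unbundled workflow, assuming the runner accepts llama.cpp-style flags (the URLs and file names here are placeholders, not real release assets; check the repo's releases page for the actual ones):

```shell
# 1. Grab the bare runner (no weights bundled) from the project's
#    releases page. RELEASE_URL is a placeholder for the real asset URL.
curl -L -o llamafile "$RELEASE_URL"
chmod +x llamafile        # on Windows, rename the file to llamafile.exe instead

# 2. Download a quantized GGUF model separately (any llama.cpp-compatible
#    .gguf file; MODEL_URL is a placeholder).
curl -L -o model.gguf "$MODEL_URL"

# 3. Point the runner at the external weights. Since the model stays
#    outside the executable, the 4 GB Windows limit only applies to the
#    small runner itself.
./llamafile -m model.gguf
```

The key point is that only the executable is size-limited on Windows; the weights file can be as large as you like when it is passed in externally.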

[–] silenceimpaired@alien.top 1 points 11 months ago

Exciting and worrying… I have gone to great efforts to use safetensors, and I would hate to see every model packaged in executable format. Then again, I have seen comments about llama.cpp's behavior changing for the same model and settings (not sure if that is true, but it could be bad).

[–] CheatCodesOfLife@alien.top 1 points 11 months ago

I wish Mozilla would just stick to Firefox and invest the rest of the money in a dividend-paying fund, so they aren't so reliant on Google to fund their software engineers.

[–] FPham@alien.top 1 points 11 months ago

I find it very strange to attach the gguf file to an exe. It's a bad security idea (your antivirus needs to hash a 10 GB file), and on Windows you still need to split it into an exe and a data file anyway, because the exe size limit is 4 GB. So basically, instead of llama.cpp you are now using llamafile, which is llama.cpp. Or am I missing something?