If you're interested in doing this, you can write me a PM.
fuck that noise
There are already a couple of startups working on similar things; check https://withmartian.com for example (not a reason not to do anything, of course). Interested in what it becomes!
I like the idea. I think it's similar to something I'm already discussing with some other people; DM me if you want and I'll introduce you.
This is what my hobby project essentially does. I'm running a single chat from 3 different servers in my network, all serving different LLMs that are given a role in the chat pipeline. I can send the same prompt to multiple models so they can work on it concurrently, or have them hand off each other's responses to continue elaborating, validating, or whatever that LLM's job is. Since each server is serving an API and websocket route, all I need to do is put it behind a proxy and port forward them to the public internet. Anyone here could visit the public URL and run inference workflows in my homelab (theoretically speaking). They could also spin up an instance on their side and we could have our servers talk to each other.
Of course that's highly insecure and just bait for bad actors. So I will scale it using an overlay network that requires a key exchange and runs over a VPN.
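The two orchestration patterns described above (concurrent fan-out of one prompt to several models, and a hand-off chain where each model elaborates on the previous response) can be sketched roughly like this. This is just an illustrative sketch, not the actual project code: each "server" is stood in for by a plain callable, where a real setup would make an HTTP or websocket call to a model endpoint.

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(prompt, servers):
    """Send the same prompt to every server concurrently.

    `servers` maps a role name to a callable standing in for a
    request to that server's inference endpoint (hypothetical).
    """
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in servers.items()}
        return {name: f.result() for name, f in futures.items()}

def handoff(prompt, pipeline):
    """Feed each server's response to the next server in the chain."""
    text = prompt
    for fn in pipeline:
        text = fn(text)
    return text

# Stand-in "models" for demonstration only
drafter = lambda p: f"draft({p})"
validator = lambda p: f"validated({p})"

fan_out("summarize this", {"drafter": drafter, "validator": validator})
handoff("summarize this", [drafter, validator])
```

In a real deployment the callables would wrap requests to each server's API route, and the thread pool keeps the fan-out concurrent so slow models don't serialize the whole round.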
Any startup thinking they are going to profit from this idea will only burn investor money and waste their own time. This will all be free and it’s only a matter of time before the open source community cuts into their hopes and dreams.