So real quick, I've been exploring local LLMs for a while now. In this video I get into what I think is the future for LLMs, but in a nutshell: I think Microsoft will eventually ship a local LLM to machines to cut down on cloud resources and cost. In doing so, it will likely become possible for developers to tap into that local LLM for their games.
The worries I've seen brought up are:
- Spoilers - As mentioned in the video, it is currently possible (and should always be possible) to solve this by controlling what gets sent to the LLM. The LLM can't talk about what it doesn't know.
- The NPC talks about stuff it shouldn't - Fine-tuning solves this to an extreme degree. The better you prep the model, the less likely it is to go off script, and careful handling on your end of the code helps even more.
- Story lines shouldn't be dynamic - The answer to this is simple: don't use the LLM for those lines or those NPCs.
- Cost - Assuming I'm right that Microsoft and others will add a local LLM, the local part removes this problem.
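The spoiler point above can be sketched in code. This is a minimal, hypothetical example (the NPC names, facts, and `build_prompt` helper are all made up for illustration): each NPC gets its own knowledge list, and only those facts are assembled into the prompt sent to the local LLM, so a spoiler the NPC was never given simply cannot appear in its context window.

```python
# Hypothetical sketch: per-NPC knowledge lists keep spoilers out of the
# context sent to the LLM. The model can't reveal what it never sees.

NPC_KNOWLEDGE = {
    "blacksmith": ["The mines closed last winter.", "Ore prices have doubled."],
    "king": ["The mines closed last winter.", "The heir is secretly alive."],
}

def build_prompt(npc: str, player_line: str) -> str:
    """Assemble the LLM prompt from only this NPC's known facts."""
    facts = "\n".join(NPC_KNOWLEDGE.get(npc, []))
    return (
        f"You are the {npc} in a fantasy village. "
        f"You know ONLY these facts:\n{facts}\n"
        f"Player says: {player_line}\n"
        f"Reply in character, using only what you know."
    )

# The blacksmith's prompt never contains the spoiler, so the model
# has nothing to leak no matter what the player asks.
prompt = build_prompt("blacksmith", "Is the heir alive?")
```

However you wire this to an actual local model, the key design choice is that spoiler control happens in your game code, before the model is ever involved.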
https://www.youtube.com/watch?v=N31x4qHBsNM
It is also possible to have a given NPC show different emotions, and to direct those emotions, as shown here where I tested it with anger:
https://www.youtube.com/shorts/5mPjOLT7H-Q
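One way to direct emotion like in the anger test above is to inject an emotion directive into the prompt. This is just a sketch under my own assumptions (the `EMOTIONS` set and `emotional_prompt` helper are hypothetical, not from the video), but it shows the shape of the idea: the game picks the emotion from its own state, and the model only has to act it out.

```python
# Hypothetical sketch: the game engine chooses the NPC's current emotion
# and injects it as a directive into the prompt for the local LLM.

EMOTIONS = {"anger", "joy", "fear", "neutral"}

def emotional_prompt(npc_name: str, emotion: str, player_line: str) -> str:
    """Build a prompt that steers the NPC's tone toward one emotion."""
    if emotion not in EMOTIONS:
        raise ValueError(f"unknown emotion: {emotion}")
    return (
        f"You are {npc_name}. Respond with strong {emotion} in your tone "
        f"and word choice. Stay in character.\n"
        f"Player: {player_line}\n{npc_name}:"
    )

angry = emotional_prompt("Guard", "anger", "Let me through.")
```

The emotion could just as easily come from a game variable (the player stole from this NPC, a quest failed, etc.), which keeps the emotional state deterministic and under the developer's control.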
I don't agree with the assumption that there is pressure for companies like MS to reduce costs via local models. Compute on the gamer's PC is probably the biggest problem right now, especially since in a game pretty much all of the hardware is already used to the limit. And then you throw a 10GB LLM on top, maybe even loading different fine-tunes for different jobs? Then the TTS model? That won't result in reasonable response times any time soon, at least not with somewhat generalist models.
On the other hand, cloud compute is something MS must like a whole lot. What you see as "optimizing costs" is optimizing their profit away: they can sell that compute to you. That's great for them, not something to be optimized away. And it's the best DRM ever, too.
Microsoft has said many times that the next version (which many expect to be Windows 12) is going to focus on AI. Not only that, some Linux distributions are building around it as well, aiming for something like Her (the movie).
Keep in mind that if you're offline you can't use a cloud version, and the current version they put out has the Bing budget limitations. So if you ask it to put your computer in dark mode, that counts against your budget for the day. The budget exists because of the cost of cloud resources.
So now assume they (Microsoft) and others are right that the main way people will interact with computers in the future is through these models: you interface with the system like you would with a human, it figures out what you want, and it does it. If it's stuck on the cloud, you're limited to a budget of how many times you can use it per day, or it costs you money. If it's local, it uses your own resources, you can use it as much or as little as you want, and you can use it offline.
TL;DR: this stuff is going local.