This post was submitted on 01 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.


I'm exploring techniques to improve memory handling in LLMs without resorting to vector databases like Pinecone. In an ongoing conversation lasting days or weeks, earlier chats roll off the context window. The idea would be for a conversation manager (which could be the LLM prompting itself as space fills up) to reserve a pre-set fraction of the context window for storing memories.
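Here's a rough sketch of what I mean by the manager. Everything in it is hypothetical: the 25% ratio is just an example, the word-split token counter stands in for the model's real tokenizer, and the `_compress` placeholder is where an actual summarization call (the LLM prompting itself) would go.

```python
MEMORY_RATIO = 0.25       # example pre-set fraction of context reserved for memories
CONTEXT_TOKENS = 8192     # assumed model context window

def n_tokens(text: str) -> int:
    return len(text.split())  # crude stand-in for a real tokenizer

class ConversationManager:
    def __init__(self, context_tokens=CONTEXT_TOKENS, memory_ratio=MEMORY_RATIO):
        self.live_budget = int(context_tokens * (1 - memory_ratio))
        self.memory_budget = int(context_tokens * memory_ratio)
        self.turns: list[str] = []      # recent chat, kept verbatim
        self.memories: list[str] = []   # compressed older turns

    def add_turn(self, turn: str) -> None:
        self.turns.append(turn)
        # Evict oldest turns into the memory region once the live region overflows.
        while sum(n_tokens(t) for t in self.turns) > self.live_budget:
            oldest = self.turns.pop(0)
            self.memories.append(self._compress(oldest))
        # Keep the memory region within its own budget, too.
        while sum(n_tokens(m) for m in self.memories) > self.memory_budget:
            self.memories.pop(0)  # naive: drop oldest; scoring could decide instead

    def _compress(self, text: str) -> str:
        # Placeholder: in practice a summarization call, not naive truncation.
        words = text.split()
        return " ".join(words[: max(1, len(words) // 4)])

    def build_prompt(self) -> str:
        return ("MEMORIES:\n" + "\n".join(self.memories)
                + "\n\nRECENT CHAT:\n" + "\n".join(self.turns))
```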

Two techniques I've thought about (rough sketch after the list):

- Memory hierarchization based on keywords, timestamps, or subjective importance scores

- Text compression via techniques such as syntactic/semantic shrinking, tokenization, or substitution
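For the hierarchization side, a toy scoring function might blend recency, keyword overlap, and an importance score assigned when the memory was stored. The weights, half-life, and keyword set below are made-up tuning knobs, not anything established; the compression side would plug in where the first sketch calls its summarizer.

```python
import time
from dataclasses import dataclass, field

KEYWORDS = {"deadline", "budget", "password", "name"}  # example salient terms
HALF_LIFE_S = 7 * 24 * 3600                            # recency halves each week

@dataclass
class Memory:
    text: str
    created: float = field(default_factory=time.time)
    importance: float = 0.5   # 0..1, assigned at store time (e.g. by the LLM)

def score(mem: Memory, query: str, now: float | None = None) -> float:
    now = now or time.time()
    recency = 0.5 ** ((now - mem.created) / HALF_LIFE_S)   # exponential decay
    words = set(mem.text.lower().split())
    keyword_hits = len(words & KEYWORDS) / len(KEYWORDS)
    query_overlap = len(words & set(query.lower().split())) / (len(words) or 1)
    # Made-up weights; they'd need tuning for a real conversation manager.
    return 0.3 * recency + 0.2 * keyword_hits + 0.2 * query_overlap + 0.3 * mem.importance

def select_memories(memories: list[Memory], query: str, token_budget: int) -> list[Memory]:
    # Greedily keep the highest-scoring memories that fit the memory budget.
    kept, used = [], 0
    for mem in sorted(memories, key=lambda m: score(m, query), reverse=True):
        cost = len(mem.text.split())   # crude token estimate
        if used + cost <= token_budget:
            kept.append(mem)
            used += cost
    return kept
```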

Surely this has been done before. Any experience with it?

1 comment
AsliReddington@alien.top 1 point 1 year ago

All you need is a 32K-context LLM. Everything beyond that needs a tool invocation that can pull from the archived texts. You'll have to make your orchestrator smart enough to know there is content beyond the window that needs to be pulled in.
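Roughly something like this, I'd guess; `archive_search`, the `[NEED_ARCHIVE]` marker, and the keyword lookup are placeholder conventions I'm assuming, not a real API.

```python
ARCHIVE: list[str] = []   # turns that have rolled out of the 32K window

def archive_search(query: str, k: int = 3) -> list[str]:
    # Toy keyword lookup; a real version might use BM25 or grep over chat logs.
    q = set(query.lower().split())
    scored = sorted(ARCHIVE, key=lambda t: len(q & set(t.lower().split())), reverse=True)
    return scored[:k]

def orchestrate(model_reply: str, user_query: str) -> str | None:
    # The model is prompted to emit a marker (assumed convention) when it
    # suspects the answer lives outside the current context window.
    if "[NEED_ARCHIVE]" in model_reply:
        hits = archive_search(user_query)
        return "Relevant archived context:\n" + "\n---\n".join(hits)
    return None  # nothing to pull; the reply stands as-is
```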