I think you might be able to plug in another model as a chat agent there. LangChain is pretty flexible, but I remember being confused about the difference between chat models and plain LLMs. I think you can plug in any of these: https://python.langchain.com/docs/integrations/chat/
I quickly gave up on LangChain and went with a custom llama-cpp-python setup instead, because it was too difficult to figure out what LangChain was doing under the hood and to customize the behavior.
But I also never got around to conversation memory, because my RAG prompt alone took a minute to start getting a response on my poor little laptop haha
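For what it's worth, basic conversation memory with llama-cpp-python can be pretty simple: just keep a rolling window of messages in the `{"role": ..., "content": ...}` format that `create_chat_completion()` accepts. Here's a rough pure-Python sketch of what I had in mind (the names and the window size are made up, and the actual model call is only shown in a comment):

```python
# Minimal sketch of sliding-window conversation memory. Messages use the
# {"role": ..., "content": ...} dict format that llama-cpp-python's
# create_chat_completion() (and most chat APIs) accept.

SYSTEM = {"role": "system", "content": "You are a helpful assistant."}

# Running transcript of user/assistant messages (system prompt kept separate).
history = []

def trimmed_history(history, max_turns=4):
    """Keep the system prompt plus only the last `max_turns` messages,
    so the prompt doesn't grow without bound."""
    return [SYSTEM] + history[-max_turns:]

def chat_turn(user_text, reply_fn):
    """Append the user message, build the trimmed prompt, record the reply.

    `reply_fn` stands in for the real model call, e.g. something like:
        llm.create_chat_completion(messages=messages)
    """
    history.append({"role": "user", "content": user_text})
    messages = trimmed_history(history)
    reply = reply_fn(messages)
    history.append({"role": "assistant", "content": reply})
    return reply
```

With llama-cpp-python you'd pass a `reply_fn` that wraps `llm.create_chat_completion(messages=messages)` and pulls the text out of the response; `max_turns=4` is just a guess at what would fit my context window and time budget, which is exactly the part I never tuned.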