The r1 has goldfish syndrome: it forgets earlier conversations. RAG and knowledge graphs can extend its memory permanently and improve retrieval accuracy. If they run in parallel with response generation, they should not significantly affect response times, and they would make responses more relevant to the individual user.
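To illustrate the "in parallel" point, here is a minimal sketch using Python's `asyncio`. The function names and the sleeps are stand-ins I made up for illustration; nothing here is the r1's actual API or the AssistAF code. It assumes the memory write is the part that runs alongside generation, so the turn takes roughly as long as the slower of the two tasks rather than their sum.

```python
import asyncio


async def generate_response(prompt: str) -> str:
    # Hypothetical stand-in for the model call.
    await asyncio.sleep(0.2)
    return f"reply to: {prompt}"


async def update_memory(prompt: str, store: list) -> None:
    # Hypothetical stand-in for categorizing and storing the prompt.
    await asyncio.sleep(0.2)
    store.append(prompt)


async def handle_turn(prompt: str, store: list) -> str:
    # Run both tasks concurrently and wait for both to finish.
    reply, _ = await asyncio.gather(
        generate_response(prompt),
        update_memory(prompt, store),
    )
    return reply
```

Calling `asyncio.run(handle_turn("what's the weather?", store))` returns the reply and leaves the prompt in `store`. Retrieval that feeds the current reply would still have to finish before generation starts, so that step needs to be fast on its own.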
I’ve built a memory system that implements dynamic memory management via prompt categorization. The categories are used to build a knowledge graph that organizes what has been learned about the user. A lightweight categorization agent can generate responses in a few seconds, powering a robust memory system. This would allow the r1 to retrieve relevant previous conversations. I designed my system for a bot that can talk to multiple people simultaneously in the same conversation, across multiple rooms on Discord, with a consolidated memory across all conversations (a conversation in one room can inform a conversation in another). However, it would still be very effective for a single-channel, single-user conversation like the r1's. Naturally, every user would need their own database to prevent leaks.
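As a rough sketch of the idea (not the AssistAF implementation), the snippet below tags each message with categories, files it under those categories in a per-user store, and recalls by category regardless of which room the memory came from. The keyword lists, `categorize`, and `UserMemory` are all hypothetical; in my system the categorization is done by an agent, not keyword matching, and the store is a real database.

```python
import re
from collections import defaultdict

# Hypothetical keyword lists standing in for the categorization agent.
CATEGORY_KEYWORDS = {
    "music": {"song", "band", "guitar"},
    "food": {"pizza", "recipe", "coffee"},
}


def categorize(text: str) -> set:
    """Return the categories whose keywords appear in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    hits = {cat for cat, kws in CATEGORY_KEYWORDS.items() if words & kws}
    return hits or {"general"}


class UserMemory:
    """One store per user, so nothing leaks between users."""

    def __init__(self):
        # category -> list of (room, text) memories
        self.by_category = defaultdict(list)

    def remember(self, room: str, text: str) -> None:
        for cat in categorize(text):
            self.by_category[cat].append((room, text))

    def recall(self, prompt: str) -> list:
        """Return memories sharing a category with the prompt, from any room."""
        results = []
        for cat in sorted(categorize(prompt)):
            for item in self.by_category.get(cat, []):
                if item not in results:
                    results.append(item)
        return results


# user_id -> that user's private memory
memories = defaultdict(UserMemory)

memories["alice"].remember("room-a", "I love pizza with extra cheese")
memories["alice"].remember("room-b", "My band plays on Friday")
print(memories["alice"].recall("Any good pizza places?"))
# [('room-a', 'I love pizza with extra cheese')]
```

A question about pizza asked in any room pulls back what Alice said in room-a, while a different user's store stays empty.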
The code is here: GitHub - DataBassGit/AssistAF. The license is GPL, so no unattributed use. AgentForge is an AI consultation company, and we are happy to provide consultation for a fee.