Thoughts about making a multi-bot chat with fictional characters

justynasty@lemmy.kya.moe · edit-2 1 year ago

Thoughts about making a multi-bot chat with fictional characters

justynasty@lemmy.kya.moe · 1 year ago

I’m looking for feedback on whether or not this could be improved. Regarding open-sourcing this, I’d first need some suggestions for an existing (hackable, documented) user interface that I could use to integrate this into. Because that’s how this would become an end-to-end solution. There are some storytelling projects already in development, such as LlamaTale or Mantella, but they put LLM in an existing world, so there isn’t much code I could use from those projects. I’m suggesting something that could be used to create a world while I chat with my characters. I use MoE (mixture of models) for this, but without having to worry about setting up the infrastructure (no separate redis, sql, or vector databases). The chat (flutter) interface above is only for testing, until I find something better.

rufus@discuss.tchncs.de · edit-2 1 year ago

Ah, alright. I’ve been playing a bit with Python, Langchain, had a look at Microsoft Guidance and vector databases. And tried to implement some companion chatbot that would communicate with me via my favorite chat app. But I dislike many of the libraries I used and learned a few ways not to do it. And I got a bit lost evaluating different models (fine-tunes of llama-based models) and prompt engineering.

It’s quite a complex thing. Let me read your article again and see if I got some constructive thing left to say. Seams you’ve figured most of the stuff out already. And made some decent choices.

Edit: Since you’re mentioning previous posts… Can you link them so we have more context?

rufus@discuss.tchncs.de · edit-2 1 year ago

Well, judging by your text, you did your research and read the most obvious papers highlighting the design of agents. In case you missed something, related papers I found are:

Generative Agents: Interactive Simulacra of Human Behavior
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models * *
babyagi / AutoGPT
ReAct: Synergizing Reasoning and Acting in Language Models (probably not state of the art anymore)

And whatever I can’t come up with in this exact moment. This isn’t my area of expertise anyways. A probably good summary is here: https://lilianweng.github.io/posts/2023-06-23-agent/

Your characters/agents obviously need a personality and some kind of internal state / memory and a way to reason / predict their response. So I’d agree with your figures. The file format doesn’t really matter.

Summarization/compression is needed because text tokens are a limited resource and we can’t waste them on mundane stuff.

Storing information depends on the kind of knowledge and how you want to retrieve it. You can use a graph, match keywords. Also a vector database is aligned with the way LLMs work and having to deal with unstructured data. Or just progressively summarize stuff and let older memories degrade. You can combine these approaches.

So most of the things you mention are needed and I don’t see any better alternative.

I didn’t get how the tagging, tracking of things and prioritization works. Especially not how you retrieve that information later. Your hashtags seem like keywords that are the keys of your database. It’s a creative idea to do it this way. I didn’t think of doing that. One thing you can’t do that way is query information like: “Remember the last time we went hiking?” But most things can probably be queried by a place or a person. You could also just hand everything over to a vector database and hope it returns the correct things when needed. Or, better, let your LLM help you with processing and structuring the information. Have a look at the Generative Agents paper and their GitHub page with the demo. Those agents reflect and process memories before storing them in their memory. They do way more than summarization and then store that. If you need to pay attention to latency, don’t do it after each chat message. One idea would be to let it reflect on what happened and process the memories during the night for example, when your server isn’t in use. You can then strip that from the current chat history and put that summary into your database / long-term memory. You can also try and ask your LLM to tag sentences for you.

I’m not sure if it’s really necessary to forget everything about a person while they’re absent. It’s not how the real world works. But your mileage may vary. I’m sure this solves a few issues.

Speaking of the papers: I tried to take some inspiration from the Generative Agents, AutoGPT and BabyAGI. I fiddled around a bit with the LangChain implementations of those. But, I had quite some issues. I think those 33b parameter Llama models aren’t the same as ChatGPT or GPT4. A lot of scientific results from those papers can’t be transferred one to one to what we’re doing here. At least that’s my experience. Also those smaller models aren’t super intelligent, so you need to get your prompts right. Many things don’t work well the way they’re done in LangChain for example if you’re not using ChatGPT.

But I think you can take some inspiration from LangChain. And SillyTavern has examples for prompts specific to roleplay/dialogue. And interesting extensions (SillyTavern-extras) you can have a look at.

If you write your code modular, you can try a few things and swap in and out modules and different ways of storing memories etc.

I also had to waste some time until I found out that using a Llama model for summarization isn’t a good choice. Same applies to embeddings. Use something that is specifically made for the task and it’ll work way, way better. (Or even at all.)

Regarding the UI: Idk. I’d just take some web framework I like or am familiar with. Kinda depends on the programming language. I took some integration to talk to my chatbot via chat. But that turns out not to be the best idea. A chat app doesn’t have features like regenerating the last answer or giving a feedback, or editing their reply. All things that are pretty useful and available in all the proper LLM frontends.

justynasty@lemmy.kya.moe · edit-2 1 year ago

Previous posts have been uploaded in pdf format, which can be read online without downloading them first. https://docdro.id/UlaU7wy Multi-bot chat log is stored in a single file/note, that’s the only change since then.