Personality #6

Open
opened 2025-11-07 00:26:35 -08:00 by jshiffer · 1 comment
Owner

Make the chatbot capable of maintaining conversations (not just responding to individual prompts with no context). It should also know our inside jokes and similar shared references.

How would we do this? First, experiment with in-context learning to see how well the bot can fit in as one of us. For a more involved approach, we could start by fine-tuning on a representative subset of conversations from LUG chat (see #3) and add a RAG system of some sort so it can recall facts about users and events that happened in chat. RAG is an attractive option because we can keep adding to the vector database without retraining the model.
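To make the in-context plus RAG idea concrete, here is a minimal sketch of the retrieval step and prompt assembly. Everything in it is an assumption for illustration: `MemoryStore`, `build_prompt`, and the sample users and facts are hypothetical, and the bag-of-words `embed` function is only a stand-in for whatever real embedding model and vector database we end up choosing.

```python
import math
import re
from collections import Counter, deque


def embed(text):
    """Toy bag-of-words 'embedding' (assumption: a real system would use a learned model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class MemoryStore:
    """Hypothetical in-memory stand-in for the vector database."""

    def __init__(self):
        self.items = []  # list of (text, vector) pairs

    def add(self, text):
        # New facts can be appended at any time; no retraining is involved.
        self.items.append((text, embed(text)))

    def search(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(store, history, new_message, k=2):
    """Combine retrieved memories with recent chat so the model sees context."""
    lines = ["Relevant memories:"]
    lines += [f"- {fact}" for fact in store.search(new_message, k)]
    lines.append("Recent chat:")
    lines += list(history)
    lines.append(new_message)
    return "\n".join(lines)


# Made-up sample data, not real chat history.
store = MemoryStore()
store.add("alice once deleted her bootloader during a live demo")
store.add("bob insists that tabs are better than spaces")
store.add("the group meets on thursdays")

history = deque(maxlen=4)  # rolling window of the most recent messages
history.append("bob: anyone free thursday?")
prompt = build_prompt(store, history, "alice: my bootloader is broken again")
print(prompt)
```

The rolling `deque` covers the "maintain a conversation" part, and the store covers long-term recall; the resulting prompt would then be passed to whichever model we use.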

As for reinforcement learning, which would be better: a human-in-the-loop setup where we run sentiment analysis on people's interactions with the bot, or GRPO? For GRPO, would it even be possible to mask parts of past conversations, have the bot generate several candidate messages to fill in the gaps, and then rate the relative suitability of each candidate?
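A rough sketch of what the GRPO-style scoring could look like: hide one message, score a group of candidates, and turn the scores into group-relative advantages by normalizing each reward against the group's mean and standard deviation. The helper names, the sample conversation, and especially `overlap_reward` are assumptions for illustration; how we would actually rate suitability is still an open question.

```python
import statistics


def mask_message(conversation, index, mask_token="[MASK]"):
    """Return (conversation with one message hidden, the hidden message)."""
    masked = list(conversation)
    target = masked[index]
    masked[index] = mask_token
    return masked, target


def overlap_reward(candidate, target):
    """Placeholder reward (assumption): Jaccard word overlap with the hidden message."""
    c, t = set(candidate.lower().split()), set(target.lower().split())
    return len(c & t) / len(c | t) if c | t else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Made-up conversation and candidates, not real chat data.
conversation = [
    "a: did the server reboot?",
    "b: yes, after the kernel update",
    "a: nice",
]
masked, target = mask_message(conversation, 1)

# In practice these would be sampled from the bot given `masked` as context.
candidates = ["b: yes, after the kernel update", "b: no idea", "b: it did"]
rewards = [overlap_reward(c, target) for c in candidates]
advantages = group_relative_advantages(rewards)

for cand, adv in zip(candidates, advantages):
    print(f"{adv:+.3f}  {cand}")
```

Because the advantages are relative within the group, only the ranking quality of the reward matters, which is why a noisy signal such as sentiment analysis or human ratings could plausibly be swapped in for `overlap_reward`.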

jshiffer added the machine-learning label 2025-11-07 00:26:35 -08:00
jshiffer self-assigned this 2025-11-07 00:26:48 -08:00
azhang was assigned by jshiffer 2025-11-07 00:26:48 -08:00
Author
Owner

Also, my thought is that the bot should be the average of all our personalities. If we decide to make it a girl or otherwise "out-of-distribution", we can prompt-engineer that.
