Keep Track of Conversations with BufferWindowMemory
Duplicate the "ChatGPT Example" script to create a new "Buffer Window Demo" script.
Note that this script uses BufferWindowMemory:
let memory = new BufferWindowMemory({ returnMessages: true })
To demonstrate how this works, update the system message prompt to only respond with one-word answers.
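For reference, here's a rough sketch of what the duplicated Buffer Window Demo script can look like. The exact imports and model setup should follow your original ChatGPT example script (these import paths match the older 0.0.x LangChain JS releases; adjust for your version), and the one-word system message is the only intentional change:

```js
import { ConversationChain } from "langchain/chains"
import { ChatOpenAI } from "langchain/chat_models/openai"
import { BufferWindowMemory } from "langchain/memory"
import {
  ChatPromptTemplate,
  SystemMessagePromptTemplate,
  HumanMessagePromptTemplate,
  MessagesPlaceholder,
} from "langchain/prompts"

// System message changed so every reply is a single word, easy to spot in the logs
let prompt = ChatPromptTemplate.fromPromptMessages([
  SystemMessagePromptTemplate.fromTemplate(
    "Only respond with one-word answers, no matter what I type."
  ),
  new MessagesPlaceholder("history"),
  HumanMessagePromptTemplate.fromTemplate("{input}"),
])

// Defaults to k: 5, i.e. the last 5 human/AI pairs are kept in the window
let memory = new BufferWindowMemory({ returnMessages: true })

let chain = new ConversationChain({ llm: new ChatOpenAI(), prompt, memory })
```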
Transcript
00:00 Let's duplicate our ChatGPT example. We'll call this Buffer Window Demo. And you can see
00:07 the main ChatGPT example uses buffer window memory. Now to demonstrate this, I'm going to change this message to only respond with one-word answers, no matter what I type. So to test it, I'm going to type 1, hit enter,
00:26 2, 3,
00:28 and that looks perfect. So inside of our handleLLMStart we'll bring back the prompts and log them out. So we're going to log a new line plus the prompts, which is an array, and we're going to join those. And we'll go ahead and open our buffer window demo log. And once we run this, I'll type
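The callback piece looks roughly like this, assuming Script Kit's global log helper, which writes to the script's .log file (use console.log elsewhere). handleLLMStart receives the array of prompt strings that are about to be sent to the model, which is exactly what we want to inspect:

```js
let chain = new ConversationChain({
  llm: new ChatOpenAI(),
  prompt,
  memory,
  callbacks: [
    {
      // prompts is a string[] containing everything about to be sent to OpenAI
      handleLLMStart: async (llm, prompts) => {
        log("\n" + prompts.join("\n"))
      },
    },
  ],
})
```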
00:52 1, 2.
00:54 Let's check the formatting. So this looks good. So our prompt is currently the system message with the main prompt, and then human messages 1 and 2. So let's run through this
01:05 1, 2, 3, 4, 5, 6, 7.
01:11 Now if we check the log here and I scroll down to the bottom you'll see the main system message at the top, and then 2 through
01:20 7.
01:21 Now the default setting for buffer window memory is a k of 5, which means 5 pairs of human and AI interactions. So this would be 1 pair. So we have 1, 2, 3, 4, 5 pairs of human messages with AI responses, and then this last one is the most recent call. So these are all in memory.
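The window size is the k option on BufferWindowMemory, so shrinking it is a one-line change:

```js
// Keep only the last 2 human/AI pairs instead of the default 5
let memory = new BufferWindowMemory({ k: 2, returnMessages: true })
```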
01:46 This is the initial prompt, and then this is the most recent call. So the buffer window memory removed this first interaction once it got past 5 interactions. So if we lower this down to 2, we'll clear out the log, I'll run the script, and I'll type
02:04 1, 2, 3, 4, 5.
02:10 Close this out. Scroll down to the bottom. You'll see the main prompt, and then
02:15 1, 2,
02:17 from the buffer window memory, and then the most recent input from the chain call. So OpenAI is only going to know about what it sees here in handleLLMStart, these prompts. So as you're juggling the initial prompt, which will always be sent, the history, which is coming from the buffer window memory, and the current prompt here, you have to play a balancing act: either limit the history to keep the token cost down, or dump everything into your prompt until you start hitting those token limits so that OpenAI has the most information. And you can see that's what we're doing in continueAdventure: we're passing in all of the previous human and AI messages and placing them before the history instead of adding them to the history. So if you're more worried about the cost of something like this and you want to use the buffer window memory, you could delete all of this, then come down to the buffer memory and change it to buffer window memory if it's not already.
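As a simplified illustration of the "dump everything into the prompt" idea (not the literal continueAdventure code), previous turns can be sent straight to the model as messages rather than stored in memory. The previousMessages, mainPrompt, and currentInput names here are hypothetical:

```js
import { ChatOpenAI } from "langchain/chat_models/openai"
import {
  SystemChatMessage,
  HumanChatMessage,
  AIChatMessage,
} from "langchain/schema"

// Assumed: previousMessages is an array of { human, ai } pairs loaded from disk
let messages = [
  new SystemChatMessage(mainPrompt),
  // Every previous turn goes straight into the prompt...
  ...previousMessages.flatMap(({ human, ai }) => [
    new HumanChatMessage(human),
    new AIChatMessage(ai),
  ]),
  // ...followed by the current input
  new HumanChatMessage(currentInput),
]

let response = await new ChatOpenAI().call(messages)
```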
03:18 Make sure to import that from up here: BufferWindowMemory. And then down on our buffer window memory, the current solution is to, and I'll paste this in, loop through all the messages, and on the memory's chat history add the user message and add the AI message. Now previous messages that we loaded in will start dropping off once this default k of 5 gets hit, because we're now using buffer window memory. So you have to find your own balance between the number of messages in your buffer window memory, the size of your original prompt, how the prompt combines with the history (which is the memory), and your current input.
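That pasted-in loop does roughly the following: replay the saved conversation into the memory's chat history so BufferWindowMemory can window it for you. The shape of previousMessages is an assumption; addUserMessage and addAIChatMessage are the LangChain JS chat history methods from this era:

```js
let memory = new BufferWindowMemory({ returnMessages: true })

// Assumed: previousMessages is an array of { human, ai } pairs loaded from disk.
// Replaying them through chatHistory lets BufferWindowMemory decide which of
// them still fit inside the k-sized window on the next call.
for (let { human, ai } of previousMessages) {
  await memory.chatHistory.addUserMessage(human)
  await memory.chatHistory.addAIChatMessage(ai)
}
```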
04:03 Because all of those things put together will create the tokens that are sent to OpenAI, and more tokens equals more cost. But if it's just for yourself or for your dev team, it's usually only a few cents. And it's worth noting that this stuff will only get cheaper and faster and accept larger and larger prompts. Also, the dev team over at LangChain mentioned that they are working on different ways of quote-unquote hydrating the memory instead of manually adding messages like this, and I'll make a follow-up lesson if that happens.