Conversational Summarization: Summarizing the conversation so far condenses its length while retaining the important points, allowing the model to stay within the token limit.
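A minimal sketch of this idea is below. It assumes the chat history is a list of role/content dicts and that the caller supplies a `summarize` callable (for example, a call to an LLM); the token estimate is a rough character-based heuristic, not a real tokenizer.

```python
def approx_tokens(text: str) -> int:
    # Rough heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

def compress_history(messages, summarize, budget=3000, keep_recent=4):
    """Replace older messages with a summary once the history exceeds `budget` tokens."""
    total = sum(approx_tokens(m["content"]) for m in messages)
    if total <= budget or len(messages) <= keep_recent:
        return messages

    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    transcript = "\n".join(f'{m["role"]}: {m["content"]}' for m in older)
    summary = summarize(
        "Summarize the key facts, decisions, and open questions in this conversation:\n"
        + transcript
    )
    # The summary stands in for the older turns as a single system message.
    return [{"role": "system", "content": f"Summary of earlier conversation: {summary}"}] + recent
```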
Relevant Context Lookup: This involves using models to identify and retrieve the sections of the conversation that are relevant to the current turn. By representing text as vectors (embeddings), earlier questions and answers can be compared with the current query to find those most closely related to the present context.
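As a sketch of how such a lookup might work, the following assumes a caller-supplied `embed` function that maps a string to a fixed-length vector (for instance, a sentence-embedding model) and ranks past turns by cosine similarity to the current query.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def most_relevant_turns(history, query, embed, top_k=3):
    """Return the `top_k` past turns whose embeddings are closest to the query."""
    query_vec = embed(query)
    scored = [(cosine(embed(turn), query_vec), turn) for turn in history]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [turn for _, turn in scored[:top_k]]

# Usage: prepend the retrieved turns to the prompt before calling the model.
# relevant = most_relevant_turns(past_turns, user_question, embed)
```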
Clever Prompting: Prompts can be designed to "remind" the model of any important context that may influence its response, carrying relevant details forward throughout the conversation.
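One way this can look in practice is sketched below: key facts captured earlier (names, goals, constraints) are re-injected at the top of every prompt so the model does not have to recall them from distant turns. The field names and layout here are illustrative, not a fixed scheme.

```python
def build_prompt(key_facts: dict, recent_turns: list[str], user_message: str) -> str:
    # Pin the important facts at the top of every prompt as a reminder.
    reminder = "\n".join(f"- {name}: {value}" for name, value in key_facts.items())
    recent = "\n".join(recent_turns)
    return (
        "Important context to keep in mind:\n"
        f"{reminder}\n\n"
        "Recent conversation:\n"
        f"{recent}\n\n"
        f"User: {user_message}\nAssistant:"
    )

# Example:
# build_prompt({"user goal": "plan a 5-day Kyoto trip", "budget": "$1500"},
#              ["User: What about day 2?", "Assistant: Day 2 covers Arashiyama."],
#              "Can we swap day 2 and day 3?")
```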
Context Window Adjustment: Some models use a sliding window approach where, as the conversation continues, the window "slides" to always include the latest inputs and outputs at the expense of older ones.
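A simple sketch of such a sliding window follows, assuming turns are appended as plain strings and token counts are approximated with a character-based heuristic; a real implementation would use the model's tokenizer.

```python
from collections import deque

class SlidingWindow:
    def __init__(self, max_tokens=4000):
        self.max_tokens = max_tokens
        self.turns = deque()
        self.token_count = 0

    @staticmethod
    def _approx_tokens(text: str) -> int:
        return max(1, len(text) // 4)   # rough heuristic: ~4 chars per token

    def add(self, text: str) -> None:
        self.turns.append(text)
        self.token_count += self._approx_tokens(text)
        # Slide: evict the oldest turns until the window is back under budget.
        while self.token_count > self.max_tokens and len(self.turns) > 1:
            removed = self.turns.popleft()
            self.token_count -= self._approx_tokens(removed)

    def context(self) -> str:
        return "\n".join(self.turns)
```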
External Memory Architectures: These are designed to allow models to "remember" and pull in information from earlier in the conversation, or even from other conversations or sources of knowledge.
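The sketch below illustrates the basic shape of such a memory under simplifying assumptions: facts are written to a store outside the context window and read back by embedding similarity when needed. The store here is an in-process list and `embed` is a caller-supplied text-to-vector function; a production system would more likely use a vector database.

```python
import numpy as np

class ExternalMemory:
    def __init__(self, embed):
        self.embed = embed          # caller-supplied text -> vector function
        self.entries = []           # list of (vector, text) pairs

    def write(self, text: str) -> None:
        self.entries.append((self.embed(text), text))

    def read(self, query: str, top_k: int = 3) -> list[str]:
        q = self.embed(query)
        def score(vec):
            return float(np.dot(vec, q) / (np.linalg.norm(vec) * np.linalg.norm(q) + 1e-9))
        ranked = sorted(self.entries, key=lambda e: score(e[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]

# Retrieved memories can then be inserted into the prompt, e.g.:
# "Previously noted: " + "; ".join(memory.read(user_question))
```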
While these techniques have potential, they also introduce new complexities and challenges, such as deciding what information to keep or discard, maintaining coherence, and managing computational resources.