
Implement "Continue". Final output left truncated and unfinished because LLM max_token was reached. #504

MinhNgyuen opened this issue Apr 24, 2024 · 1 comment

@MinhNgyuen

Right now, if you give an Agent a task to write a long report with a lot of context, it will run out of output tokens and stop midway through its response. There is no option to have the LLM continue generating the report when it runs out of output tokens.

Similar to the ChatGPT web UI, add an option to continue output generation and finalize the output.

Here's some example code of how it's done with the OpenAI client:

from openai import OpenAI


def openai_write_long_response(prompt: str, context: str) -> str:
    client = OpenAI()
    messages = [
        {
            "role": "user",
            "content": f"Below is some relevant information that you can use to craft your report\n\n{context}",
        },
        {
            "role": "user",
            "content": prompt,
        },
    ]

    completion = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=messages,
    )
    output = completion.choices[0].message.content

    # finish_reason == "length" means generation stopped because the model
    # hit its output-token limit, so append the partial answer to the
    # conversation and ask the model to pick up where it left off.
    while completion.choices[0].finish_reason == "length":
        messages += [
            {
                "role": "assistant",
                "content": completion.choices[0].message.content,
            },
            {
                "role": "user",
                "content": "Your message was truncated. Please continue from where you left off.",
            },
        ]
        completion = client.chat.completions.create(
            model="gpt-4-turbo",
            messages=messages,
        )
        output += completion.choices[0].message.content

    return output
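
A minimal usage sketch (the prompt and context below are purely illustrative; assumes OPENAI_API_KEY is set in the environment):

if __name__ == "__main__":
    # Hypothetical inputs; in practice the Agent would supply these.
    context = "Q1 revenue grew 12% year over year. ..."
    prompt = "Write a detailed, multi-section report based on the information above."
    print(openai_write_long_response(prompt, context))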
@MinhNgyuen (Author)

If you could point to where in the code we could implement this, I can take a stab at adding this functionality.
