create_partial streaming not behaving as expected #665

Open · 5 of 8 tasks
jakevollkommer opened this issue May 13, 2024 · 12 comments
Labels: bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@jakevollkommer

  • This is actually a bug report.
  • I am not getting good LLM Results
  • I have tried asking for help in the community on discord or discussions and have not received a response.
  • I have tried searching the documentation and have not found an answer.

What Model are you using?

  • gpt-3.5-turbo
  • gpt-4-turbo
  • gpt-4
  • Other (gpt-4o)

Describe the bug
Partial streaming yields an extraction object on every generation, but every field stays None until the final token for that field arrives, at which point it fills in all at once. This essentially defeats the purpose of streaming.

To Reproduce

from pydantic import BaseModel, Field
from instructor import from_openai
from openai import OpenAI

class Summary(BaseModel):
    summary: str = Field(description="A detailed summary")

def main():
    client = from_openai(OpenAI())
    extraction_stream = client.chat.completions.create_partial(
        model="gpt-4o",
        response_model=Summary,
        messages=[
             {"role": "system", "content": "You summarize text"},
            {"role": "user", "content": " Summarize: Mary had a little lamb"}
        ],
        stream=True,
    )

    for extraction in extraction_stream:
        print(extraction)

main()

Expected behavior
Each yielded object used to add new content on every token, but now fields are always None until the last token for that field completes.
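
For illustration (the values below are made up, not actual model output), the stream would be expected to grow roughly like this as tokens arrive:

summary=None
summary='Mary had'
summary='Mary had a little'
summary='Mary had a little lamb...'

Instead, every yielded object shows summary=None until the very last chunk.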

Screenshots
[Screenshot: 2024-05-13 at 5:57:47 PM]

@jxnl (Owner) commented May 16, 2024

I was able to reproduce this but I don't really have time this week to take a look.

@jxnl added the bug (Something isn't working) and help wanted (Extra attention is needed) labels on May 16, 2024
@jakevollkommer (Author)

@jxnl Not sure I'll have time before you do, but if I can dive in, any quick tips on where to look?

@jxnl (Owner) commented May 16, 2024

import pydantic_core

print(
    pydantic_core.from_json(
        '{"name": "jaso',
        allow_partial=True,
    )
)
# > {}


print(
    pydantic_core.from_json(
        '{"name": "jaso"',
        allow_partial=True,
    )
)
# > {'name': 'jaso'}

@jxnl (Owner) commented May 16, 2024

@jakevollkommer (Author)

@jxnl I see, so does the partial stream have unclosed JSON up until the last token? Maybe we have to downgrade pydantic.

@jakevollkommer (Author)

@jxnl Looks like pydantic/pydantic-core#1293 was closed; can you check if it solved the issue with partial streaming?

@jakevollkommer (Author)

pydantic/jiter#101

@antondkg

The fix was released 18 hours ago: https://github.com/pydantic/jiter/releases/tag/v0.4.0

@jakevollkommer (Author)

Need to use jiter.from_json instead of pydantic_core.from_json.
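
For example, a minimal sketch assuming jiter >= 0.4.0 (where the "trailing-strings" partial mode was added; exact parameter names should be checked against the jiter docs):

import jiter

# pydantic_core.from_json(..., allow_partial=True) drops a key while its string
# value is still unterminated; jiter's trailing-strings mode keeps the partial value.
print(
    jiter.from_json(
        b'{"name": "jaso',
        partial_mode="trailing-strings",
    )
)
# > {'name': 'jaso'}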

@jxnl (Owner) commented May 23, 2024

ah! can you make a PR for this?

@jakevollkommer (Author)

Haven't had a chance yet, but I might be able to this week.

@jxnl (Owner) commented May 29, 2024

Need to use jiter.from_json instead of pydantic_core.from_json.

@ellipsis-dev can you try to make this change, and make sure the imports are correct?
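
A rough sketch of the kind of swap being discussed (the helper name below is hypothetical; instructor's actual partial-parsing code may be structured differently):

import jiter

def parse_partial_json(buffer: str) -> dict:
    # Hypothetical helper: parse the accumulated, possibly incomplete JSON buffer
    # with jiter so that unterminated trailing strings are kept, instead of
    # pydantic_core.from_json(buffer, allow_partial=True), which drops a key
    # whose string value is still open.
    return jiter.from_json(
        buffer.encode("utf-8"),
        partial_mode="trailing-strings",
    )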
