[Bug]: UnboundLocalError: cannot access local variable 'x' where it is not associated with a value #13133

Open
schneiderfelipe opened this issue Apr 26, 2024 · 6 comments
Labels
bug (Something isn't working), triage (Issue needs to be triaged/prioritized)

Comments

@schneiderfelipe

schneiderfelipe commented Apr 26, 2024

Bug Description

x below is unbound in a very well-defined case:

for x in f_return_val:
    dispatcher.event(
        LLMChatInProgressEvent(
            messages=messages,
            response=x,
            span_id=span_id,
        )
    )
    yield cast(ChatResponse, x)
    last_response = x

callback_manager.on_event_end(
    CBEventType.LLM,
    payload={
        EventPayload.MESSAGES: messages,
        EventPayload.RESPONSE: last_response,
    },
    event_id=event_id,
)
dispatcher.event(
    LLMChatEndEvent(
        messages=messages,
        response=x,

namely, when the for body never runs (see reproduction below).
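For illustration only, the same failure mode can be reproduced without llama-index at all; this is a minimal sketch of the underlying Python behavior (the names here are made up, not library code):

def empty_stream():
    # Stands in for f_return_val when the request produces no chunks.
    yield from ()

def consume():
    for x in empty_stream():
        print(x)
    # The loop body never ran, so x was never bound:
    print(x)  # UnboundLocalError: cannot access local variable 'x' ...

consume()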

Version

llama-index==0.10.32

Steps to Reproduce

from llama_index.llms.ollama import Ollama
from llama_index.core.llms import ChatMessage

llm = Ollama(model="phi3", request_timeout=30.0)
response = llm.stream_chat([ChatMessage(role="user", content="")])
for r in response:
    print(r.delta, end="", flush=True)

Relevant Logs/Tracebacks

❯ python
Python 3.12.3 (main, Apr 15 2024, 18:25:56) [Clang 17.0.6 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from llama_index.llms.ollama import Ollama
>>> from llama_index.core.llms import ChatMessage
>>> llm = Ollama(model="phi3", request_timeout=30.0)
>>> response = llm.stream_chat([ChatMessage(role="user", content="")])
>>> for r in response:
...     print(r.delta, end="", flush=True)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/schneider/Dropbox/eleanor/.venv/lib/python3.12/site-packages/llama_index/core/ll
ms/callbacks.py", line 172, in wrapped_gen
    response=x,
             ^
UnboundLocalError: cannot access local variable 'x' where it is not associated with a value
schneiderfelipe added the bug and triage labels on Apr 26, 2024

dosubot bot commented Apr 26, 2024

Hey there, @schneiderfelipe! 🎉 I'm Dosu, a bot here to lend a hand with bugs, answer your questions, and guide you through contributing while we wait for a human maintainer to jump in. Think of me as your friendly neighborhood code helper. I'm diving into your issue as we speak and will get back to you with a full answer shortly. Stay tuned!

@logan-markewich
Collaborator

why would the for loop not run? response timeout?


dosubot bot commented Apr 26, 2024

The UnboundLocalError you're seeing happens because the variable x is being referenced before it's assigned any value. This issue arises specifically in the scenario where the for loop in llama_index/core/llms/callbacks.py doesn't execute because the generator f_return_val yields no values. This could be due to the generator being empty or an exception occurring before any yield statements are executed.

To address this, consider the following steps:

  1. Adjust the request_timeout parameter: If it's too short, the request might fail before any data is received, leading to an empty generator and thus the UnboundLocalError.
  2. Validate API key and model compatibility: Ensure they are correctly set up for use with the stream_chat method.
  3. Check external dependencies and network issues: These could affect the execution of the stream_chat method.
  4. Debug the method and file: Look into stream_chat and llama_index/core/llms/callbacks.py to verify if f_return_val yields values and x gets assigned.

If after these steps the issue remains, providing more details like the full traceback, the llama_index library version, and any other relevant context would be helpful for further diagnosis.
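Until the library guards against an empty stream, a defensive check at the call site sidesteps the error. This is only a sketch of a caller-side workaround, reusing the objects from the reproduction above and assuming that empty or blank messages are what leave the generator empty:

from llama_index.llms.ollama import Ollama
from llama_index.core.llms import ChatMessage

llm = Ollama(model="phi3", request_timeout=30.0)
messages = [ChatMessage(role="user", content="Hello")]

# Skip the call entirely when there is nothing meaningful to send.
if messages and any(m.content for m in messages):
    for r in llm.stream_chat(messages):
        print(r.delta, end="", flush=True)
else:
    print("nothing to send")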


@schneiderfelipe
Author

schneiderfelipe commented Apr 26, 2024

why would the for loop not run? response timeout?

This happens whenever the generator f_return_val never yields, which apparently occurs when messages is either empty (see below) or consists of a single message with empty ("") content (as in the original bug report above).

Definitely not because of timeout.

Variations of the example given in the issue (full output shown below):

Forcing a request timeout:
from llama_index.llms.ollama import Ollama
from llama_index.core.llms import ChatMessage

llm = Ollama(model="phi3", request_timeout=0.1)  # <--
response = llm.stream_chat([ChatMessage(role="user", content="")])
for r in response:
    print(r.delta, end="", flush=True)
Python 3.12.3 (main, Apr 26 2024, 13:22:08) [GCC 13.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.24.0 -- An enhanced Interactive Python. Type '?' for help.
              
In [1]: from llama_index.llms.ollama import Ollama
   ...: from llama_index.core.llms import ChatMessage
   ...:
   ...: llm = Ollama(model="phi3", request_timeout=0.1)  # <--
   ...: response = llm.stream_chat([ChatMessage(role="user", content="")])
   ...: for r in response:
   ...:     print(r.delta, end="", flush=True)
   ...:
---------------------------------------------------------------------------
ReadTimeout                               Traceback (most recent call last)
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_transports/default.py:69, in map_httpcore_exceptions()
     68 try:
---> 69     yield
     70 except Exception as exc:
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_transports/default.py:233, in HTTPTransport.handle_request(self, request)
    232 with map_httpcore_exceptions():
--> 233     resp = self._pool.handle_request(req)
    235 assert isinstance(resp.stream, typing.Iterable)
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py:216, in ConnectionPool.handle_request(self, request)
    215     self._close_connections(closing)
--> 216     raise exc from None
    218 # Return the response. Note that in this case we still have to manage
    219 # the point at which the response is closed.
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py:196, in ConnectionPool.handle_request(self, request)
    194 try:
    195     # Send the request on the assigned connection.
--> 196     response = connection.handle_request(
    197         pool_request.request
    198     )
    199 except ConnectionNotAvailable:
    200     # In some cases a connection may initially be available to
    201     # handle a request, but then become unavailable.
    202     #
    203     # In this case we clear the connection and try again.
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/connection.py:101, in HTTPConnection.handle_request(self, request)
     99     raise exc
--> 101 return self._connection.handle_request(request)
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/http11.py:143, in HTTP11Connection.handle_request(self, request)
    142         self._response_closed()
--> 143 raise exc
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/http11.py:113, in HTTP11Connection.handle_request(self, request)
    104 with Trace(
    105     "receive_response_headers", logger, request, kwargs
    106 ) as trace:
    107     (
    108         http_version,
    109         status,
    110         reason_phrase,
    111         headers,
    112         trailing_data,
--> 113     ) = self._receive_response_headers(**kwargs)
    114     trace.return_value = (
    115         http_version,
    116         status,
    117         reason_phrase,
    118         headers,
    119     )
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/http11.py:186, in HTTP11Connection._receive_response_headers(self, request)
    185 while True:
--> 186     event = self._receive_event(timeout=timeout)
    187     if isinstance(event, h11.Response):
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/http11.py:224, in HTTP11Connection._receive_event(self, timeout)
    223 if event is h11.NEED_DATA:
--> 224     data = self._network_stream.read(
    225         self.READ_NUM_BYTES, timeout=timeout
    226     )
    228     # If we feed this case through h11 we'll raise an exception like:
    229     #
    230     #     httpcore.RemoteProtocolError: can't handle event type
   (...)
    234     # perspective. Instead we handle this case distinctly and treat
    235     # it as a ConnectError.
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_backends/sync.py:124, in SyncStream.read(self, max_bytes, timeout)
    123 exc_map: ExceptionMapping = {socket.timeout: ReadTimeout, OSError: ReadError}
--> 124 with map_exceptions(exc_map):
    125     self._sock.settimeout(timeout)
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/contextlib.py:158, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    157 try:
--> 158     self.gen.throw(value)
    159 except StopIteration as exc:
    160     # Suppress StopIteration *unless* it's the same exception that
    161     # was passed to throw().  This prevents a StopIteration
    162     # raised inside the "with" statement from being suppressed.
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_exceptions.py:14, in map_exceptions(map)
     13     if isinstance(exc, from_exc):
---> 14         raise to_exc(exc) from exc
     15 raise
                                                                                              
ReadTimeout: timed out
                                                                                              
The above exception was the direct cause of the following exception:
                                                                                              
ReadTimeout                               Traceback (most recent call last)
Cell In[1], line 6
      4 llm = Ollama(model="phi3", request_timeout=0.1)  # <--
      5 response = llm.stream_chat([ChatMessage(role="user", content="")])
----> 6 for r in response:
      7     print(r.delta, end="", flush=True)
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/llama_index/core/llms/callbacks.py:150, in llm_chat_callback.<locals>.wrap.<locals>.wrapped_llm_chat.<locals>.wrapped_gen()
    148 def wrapped_gen() -> ChatResponseGen:
    149     last_response = None
--> 150     for x in f_return_val:
    151         dispatcher.event(
    152             LLMChatInProgressEvent(
    153                 messages=messages,
   (...)
    156             )
    157         )
    158         yield cast(ChatResponse, x)
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/llama_index/llms/ollama/base.py:140, in Ollama.stream_chat(self, messages, **kwargs)
    124 payload = {
    125     "model": self.model,
    126     "messages": [
   (...)
    136     **kwargs,
    137 }
    139 with httpx.Client(timeout=Timeout(self.request_timeout)) as client:
--> 140     with client.stream(
    141         method="POST",
    142         url=f"{self.base_url}/api/chat",
    143         json=payload,
    144     ) as response:
    145         response.raise_for_status()
    146         text = ""
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/contextlib.py:137, in _GeneratorContextManager.__enter__(self)
    135 del self.args, self.kwds, self.func
    136 try:
--> 137     return next(self.gen)
    138 except StopIteration:
    139     raise RuntimeError("generator didn't yield") from None
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_client.py:870, in Client.stream(self, method, url, content, data, files, json, params, headers, cookies, auth, follow_redirects, timeout, extensions)
    847 """
    848 Alternative to `httpx.request()` that streams the response body
    849 instead of loading it into memory at once.
   (...)
    855 [0]: /quickstart#streaming-responses
    856 """
    857 request = self.build_request(
    858     method=method,
    859     url=url,
   (...)
    868     extensions=extensions,
    869 )
--> 870 response = self.send(
    871     request=request,
    872     auth=auth,
    873     follow_redirects=follow_redirects,
    874     stream=True,
    875 )
    876 try:
    877     yield response
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_client.py:914, in Client.send(self, request, stream, auth, follow_redirects)
    906 follow_redirects = (
    907     self.follow_redirects
    908     if isinstance(follow_redirects, UseClientDefault)
    909     else follow_redirects
    910 )
    912 auth = self._build_request_auth(request, auth)
--> 914 response = self._send_handling_auth(
    915     request,
    916     auth=auth,
    917     follow_redirects=follow_redirects,
    918     history=[],
    919 )
    920 try:
    921     if not stream:
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_client.py:942, in Client._send_handling_auth(self, request, auth, follow_redirects, history)
    939 request = next(auth_flow)
    941 while True:
--> 942     response = self._send_handling_redirects(
    943         request,
    944         follow_redirects=follow_redirects,
    945         history=history,
    946     )
    947     try:
    948         try:
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_client.py:979, in Client._send_handling_redirects(self, request, follow_redirects, history)
    976 for hook in self._event_hooks["request"]:
    977     hook(request)
--> 979 response = self._send_single_request(request)
    980 try:
    981     for hook in self._event_hooks["response"]:
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_client.py:1015, in Client._send_single_request(self, request)
   1010     raise RuntimeError(
   1011         "Attempted to send an async request with a sync Client instance."
   1012     )
   1014 with request_context(request=request):
-> 1015     response = transport.handle_request(request)
   1017 assert isinstance(response.stream, SyncByteStream)
   1019 response.request = request
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_transports/default.py:232, in HTTPTransport.handle_request(self, request)
    218 assert isinstance(request.stream, SyncByteStream)
    220 req = httpcore.Request(
    221     method=request.method,
    222     url=httpcore.URL(
   (...)
    230     extensions=request.extensions,
    231 )
--> 232 with map_httpcore_exceptions():
    233     resp = self._pool.handle_request(req)
    235 assert isinstance(resp.stream, typing.Iterable)
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/contextlib.py:158, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    156     value = typ()
    157 try:
--> 158     self.gen.throw(value)
    159 except StopIteration as exc:
    160     # Suppress StopIteration *unless* it's the same exception that
    161     # was passed to throw().  This prevents a StopIteration
    162     # raised inside the "with" statement from being suppressed.
    163     return exc is not value
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_transports/default.py:86, in map_httpcore_exceptions()
     83     raise
     85 message = str(exc)
---> 86 raise mapped_exc(message) from exc
                                                                                              
ReadTimeout: timed out

Giving no message at all:
from llama_index.llms.ollama import Ollama
from llama_index.core.llms import ChatMessage

llm = Ollama(model="phi3", request_timeout=30.0)
response = llm.stream_chat([])  # <--
for r in response:
    print(r.delta, end="", flush=True)
ipython
Python 3.12.3 (main, Apr 26 2024, 13:22:08) [GCC 13.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.24.0 -- An enhanced Interactive Python. Type '?' for help.
                                                                                              
In [1]: from llama_index.llms.ollama import Ollama
   ...: from llama_index.core.llms import ChatMessage
   ...:
   ...: llm = Ollama(model="phi3", request_timeout=30.0)
   ...: response = llm.stream_chat([])  # <--
   ...: for r in response:
   ...:     print(r.delta, end="", flush=True)
   ...:
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
Cell In[1], line 6
      4 llm = Ollama(model="phi3", request_timeout=30.0)
      5 response = llm.stream_chat([])  # <--
----> 6 for r in response:
      7     print(r.delta, end="", flush=True)
                                                                                              
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/llama_index/core/llms/callbacks.py:172, in llm_chat_callback.<locals>.wrap.<locals>.wrapped_llm_chat.<locals>.wrapped_gen()
    159     last_response = x
    161 callback_manager.on_event_end(
    162     CBEventType.LLM,
    163     payload={
   (...)
    167     event_id=event_id,
    168 )
    169 dispatcher.event(
    170     LLMChatEndEvent(
    171         messages=messages,
--> 172         response=x,
    173         span_id=span_id,
    174     )
    175 )
                                                                                              
UnboundLocalError: cannot access local variable 'x' where it is not associated with a value

@RussellLuo
Contributor

Although using empty messages or a single message with empty content serves no practical purpose, I personally feel there is indeed a slight issue from Python's perspective: in theory, a for loop might not execute at all.

I guess a possible solution is to use last_response, which is guaranteed to be predefined, instead of x:

dispatcher.event(
    LLMChatEndEvent(
        messages=messages,
-       response=x,
+       response=last_response,
        span_id=span_id,
    )
)

But I'm not sure whether this is acceptable as it will introduce a change to LLMChatEndEvent:

class LLMChatEndEvent(BaseEvent):
    messages: List[ChatMessage]
-   response: ChatResponse
+   response: Optional[ChatResponse] = None
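
Put together, the end of wrapped_gen might then look roughly like this (a sketch that assumes the surrounding code shown in the excerpt above, not a merged change):

last_response = None
for x in f_return_val:
    dispatcher.event(
        LLMChatInProgressEvent(
            messages=messages,
            response=x,
            span_id=span_id,
        )
    )
    yield cast(ChatResponse, x)
    last_response = x

callback_manager.on_event_end(
    CBEventType.LLM,
    payload={
        EventPayload.MESSAGES: messages,
        EventPayload.RESPONSE: last_response,
    },
    event_id=event_id,
)
dispatcher.event(
    LLMChatEndEvent(
        messages=messages,
        response=last_response,  # None when the loop never ran
        span_id=span_id,
    )
)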

@schneiderfelipe
Author

But I'm not sure whether this is acceptable as it will introduce a change to LLMChatEndEvent:

class LLMChatEndEvent(BaseEvent):
    messages: List[ChatMessage]
-   response: ChatResponse
+   response: Optional[ChatResponse] = None

I think this is the way to go. There are situations where there is simply nothing to put in response.
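
For what it's worth, downstream code reading the event would then need to tolerate a missing response; a hypothetical handler (not an API shown in this thread) might do:

def on_llm_chat_end(event) -> None:
    # With response: Optional[ChatResponse] = None, the field can be absent
    # when the stream yielded nothing.
    if event.response is None:
        print("stream ended without producing a response")
    else:
        print(event.response)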
