
[Question]: Getting JSONDecodeError while using "LLMTextCompletionProgram" along with "Mistral AI 7B" LLM #13080

Open
rvssridatta opened this issue Apr 24, 2024 · 1 comment
Labels
question Further information is requested

Comments


Question Validation

  • I have searched both the documentation and discord for an answer.

Question

I have followed, step by step, the guide at
https://docs.llamaindex.ai/en/stable/examples/pipeline/query_pipeline_sql/

but I used Mistral as my LLM, served through Ollama. Below is the code showing how I import the Mistral LLM:

from llama_index.llms.ollama import Ollama
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

Settings.llm = Ollama(model="mistral", request_timeout=1200000.0)
Settings.embed_model = HuggingFaceEmbedding(
model_name="BAAI/bge-small-en-v1.5"
)
At this code block:
https://docs.llamaindex.ai/en/stable/examples/pipeline/query_pipeline_sql/#extract-table-name-and-summary-from-each-table

I get the error below:
JSONDecodeError Traceback (most recent call last)
File ~/miniconda3/envs/cummins_mistral_py3.11/lib/python3.11/site-packages/pydantic/v1/main.py:539, in BaseModel.parse_raw(cls, b, content_type, encoding, proto, allow_pickle)
538 try:
--> 539 obj = load_str_bytes(
540 b,
541 proto=proto,
542 content_type=content_type,
543 encoding=encoding,
544 allow_pickle=allow_pickle,
545 json_loads=cls.config.json_loads,
546 )
547 except (ValueError, TypeError, UnicodeDecodeError) as e:

File ~/miniconda3/envs/cummins_mistral_py3.11/lib/python3.11/site-packages/pydantic/v1/parse.py:37, in load_str_bytes(b, content_type, encoding, proto, allow_pickle, json_loads)
36 b = b.decode(encoding)
---> 37 return json_loads(b)
38 elif proto == Protocol.pickle:

File ~/miniconda3/envs/cummins_mistral_py3.11/lib/python3.11/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
343 if (cls is None and object_hook is None and
344 parse_int is None and parse_float is None and
345 parse_constant is None and object_pairs_hook is None and not kw):
--> 346 return _default_decoder.decode(s)
347 if cls is None:

File ~/miniconda3/envs/cummins_mistral_py3.11/lib/python3.11/json/decoder.py:337, in JSONDecoder.decode(self, s, _w)
333 """Return the Python representation of s (a str instance
334 containing a JSON document).
335
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()

File ~/miniconda3/envs/cummins_mistral_py3.11/lib/python3.11/json/decoder.py:353, in JSONDecoder.raw_decode(self, s, idx)
352 try:
--> 353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:

JSONDecodeError: Invalid \escape: line 2 column 7 (char 8)

During handling of the above exception, another exception occurred:

ValidationError Traceback (most recent call last)
Cell In[14], line 27
25 while True:
26 df_str = df.head(10).to_csv()
---> 27 table_info = program(
28 table_str=df_str,
29 exclude_table_name_list=str(list(table_names)),
30 )
31 table_name = table_info.table_name
32 print(f"Processed table: {table_name}")

File ~/miniconda3/envs/cummins_mistral_py3.11/lib/python3.11/site-packages/llama_index/core/program/llm_program.py:103, in LLMTextCompletionProgram.__call__(self, llm_kwargs, *args, **kwargs)
99 response = self._llm.complete(formatted_prompt, **llm_kwargs)
101 raw_output = response.text
--> 103 output = self._output_parser.parse(raw_output)
104 if not isinstance(output, self._output_cls):
105 raise ValueError(
106 f"Output parser returned {type(output)} but expected {self._output_cls}"
107 )

File ~/miniconda3/envs/cummins_mistral_py3.11/lib/python3.11/site-packages/llama_index/core/output_parsers/pydantic.py:62, in PydanticOutputParser.parse(self, text)
60 """Parse, validate, and correct errors programmatically."""
61 json_str = extract_json_str(text)
---> 62 return self._output_cls.parse_raw(json_str)

File ~/miniconda3/envs/cummins_mistral_py3.11/lib/python3.11/site-packages/pydantic/v1/main.py:548, in BaseModel.parse_raw(cls, b, content_type, encoding, proto, allow_pickle)
539 obj = load_str_bytes(
540 b,
541 proto=proto,
(...)
545 json_loads=cls.config.json_loads,
546 )
547 except (ValueError, TypeError, UnicodeDecodeError) as e:
--> 548 raise ValidationError([ErrorWrapper(e, loc=ROOT_KEY)], cls)
549 return cls.parse_obj(obj)

ValidationError: 1 validation error for TableInfo
root
Invalid \escape: line 2 column 7 (char 8) (type=value_error.jsondecode; msg=Invalid \escape; doc={
"table_name": "FilmAndRoleTable",
"table_summary": "A table of films and the roles played in them, with release year and cast members"
}; pos=8; lineno=2; colno=7)

Steps to reproduce the error:

  1. Follow the LlamaIndex documentation linked above.
  2. Instead of paid GPT keys, use the open-source Mistral LLM (code given above). With GPT keys it executes successfully, but with the Mistral LLM it fails every time.

Additional source followed for importing the Mistral LLM: https://medium.com/@parmarshyamsinh/mistral-7b-llm-run-locally-with-ollama-bf10494be857

rvssridatta added the question label Apr 24, 2024

dosubot bot commented Apr 24, 2024

The JSONDecodeError you're encountering suggests an issue with the formatting or handling of JSON strings in the response from the "mistral ai 7b" LLM when used with the LLMTextCompletionProgram. This error typically occurs when the JSON string is not properly formatted, which can happen if the response from the LLM contains unexpected characters or structures not compliant with JSON standards.
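As a concrete illustration of the failure mode (the string below is hypothetical, not the actual Mistral output): Mistral models often escape underscores markdown-style as `\_`, which is not a valid JSON escape sequence, so Python's `json` module rejects the string with exactly this error:

```python
import json

# Hypothetical model output: "\_" is markdown escaping, not a valid JSON escape.
raw_output = '{\n "table\\_name": "FilmAndRoleTable"\n}'

try:
    json.loads(raw_output)
except json.JSONDecodeError as e:
    print(e.msg)  # Invalid \escape
```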

To address this issue, you should first ensure that the response from the "mistral ai 7b" LLM is correctly formatted as valid JSON. This involves verifying that the extract_json_str function accurately extracts a JSON string from the LLM's response. Additionally, ensure that this JSON string matches the expected schema required by the Pydantic model you're using for parsing the response.
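For reference, the extraction step simply pulls the first `{...}` span out of the model's free-text reply; a simplified stand-in for `extract_json_str` (not LlamaIndex's exact implementation) looks roughly like this:

```python
import re

def extract_json_block(text: str) -> str:
    # Simplified stand-in: grab the outermost {...} span from the reply.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError(f"Could not extract JSON string from output: {text}")
    return match.group()

reply = 'Sure! Here is the table info:\n{"table_name": "FilmAndRoleTable"}'
print(extract_json_block(reply))  # {"table_name": "FilmAndRoleTable"}
```

Note that surrounding prose is stripped by this step, so the failure reported here is inside the JSON object itself, not in the text around it.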

If the error persists, consider examining the raw output from the LLM before it's passed to the parse method to identify any discrepancies in the expected JSON format. This can help pinpoint whether the issue lies in the extraction process or if the LLM's response itself is malformed.

It's also worth reviewing the specific configurations or settings required for the "mistral ai 7b" LLM within the LlamaIndex framework to ensure compatibility. Incorrect configurations might lead to responses that are not properly formatted for the expected JSON parsing process.

Lastly, since the error message points to an invalid escape character in the JSON string, it's crucial to check for any escape sequences in the LLM's response that might not be correctly handled or represented in the JSON format. Correcting these issues should help resolve the JSONDecodeError and allow for successful parsing of the LLM's response.
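One pragmatic workaround (an assumption on my part, not an official LlamaIndex fix) is to sanitize the raw output before it reaches the parser. Since Mistral's stray backslashes are markdown escapes (`\_` means `_`), deleting any backslash that does not begin a valid JSON escape sequence recovers the intended text:

```python
import json
import re

def fix_invalid_escapes(json_str: str) -> str:
    # Delete any backslash that does not start a valid JSON escape
    # sequence (\" \\ \/ \b \f \n \r \t \uXXXX).
    return re.sub(r'\\(?!["\\/bfnrtu])', "", json_str)

# Hypothetical Mistral output with a markdown-style "\_" escape:
raw = '{"table_name": "Film\\_Roles", "table_summary": "Films and their roles"}'
parsed = json.loads(fix_invalid_escapes(raw))
print(parsed["table_name"])  # Film_Roles
```

You could apply this to the raw output in a small subclass of `PydanticOutputParser` before `parse_raw` is called, or simply tighten the prompt (e.g. "respond with raw JSON only, no markdown") so the model stops emitting `\_` in the first place.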

About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

