Describe the bug
An internal error occurs when adding documents (add_documents) to an index whose split_length is less than its split_overlap. This issue was raised on our forums.
Reproducing the issue
To reproduce:
# create index:
curl -XPOST -H 'Content-type: application/json' http://localhost:8882/indexes/text-index -d '{ "index_defaults": { "text_preprocessing": { "split_length": 2, "split_overlap": 5, "split_method": "word" }, "treat_urls_and_pointers_as_images": false, "model": "hf/all_datasets_v4_MiniLM-L6", "normalize_embeddings": true, "image_preprocessing": { "patch_method": null }, "ann_parameters" : { "space_type": "cosinesimil", "parameters": { "ef_construction": 128, "m": 16 } } }, "number_of_shards": 3, "number_of_replicas": 0 }'
# add docs
curl -XPOST -H 'Content-type: application/json' http://localhost:8882/indexes/text-index/documents -d '{ "documents" : [{"_id":"1","title":"Fat cat","description":"The fat cat sits on the mat in the sunshine"},{"_id":"2","title":"Brown fox","description":"The quick brown fox jumps over the lazy dog"}], "tensorFields" : ["description"] }'
Yields this error:
Marqo logs:
File "/app/src/marqo/tensor_search/tensor_search.py", line 522, in add_documents
content_chunks = text_processor.split_text(field_content, split_by=split_by,
File "/app/src/marqo/s2_inference/processing/text.py", line 147, in split_text
segments = list(windowed(split_text, n=split_length, step=split_length - split_overlap))
File "/usr/local/lib/python3.8/dist-packages/more_itertools/more.py", line 841, in windowed
raise ValueError('step must be >= 1')
ValueError: step must be >= 1
The message returned to the client is unhelpful: Internal Server Error.
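The arithmetic behind the traceback can be sketched in isolation. The snippet below is only an illustration, not Marqo code: `window_step` and `check_step` are hypothetical helpers that mirror the `split_length - split_overlap` expression and the `step must be >= 1` check visible in the logs.

```python
def window_step(split_length: int, split_overlap: int) -> int:
    """Hypothetical helper: the step expression seen in the traceback."""
    return split_length - split_overlap


def check_step(step: int) -> int:
    """Hypothetical stand-in for the check that raises in the traceback."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return step


# Settings from the reproduction above: split_length=2, split_overlap=5.
step = window_step(split_length=2, split_overlap=5)

try:
    check_step(step)
    error_message = None
except ValueError as exc:
    # Same message as in the Marqo logs.
    error_message = str(exc)
```

With the reproduction settings the step is negative, so the windowing call cannot advance through the text. By the same check, equal values (a step of 0) would also be rejected.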
Expected behavior
Index-creation-time validation should prevent creating an index with these problematic settings.
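A minimal sketch of what such a check might look like, assuming the settings arrive as the JSON body used in the reproduction. `InvalidIndexSettingsError` and `validate_text_preprocessing` are hypothetical names introduced for illustration, not existing Marqo functions, and the rule that the overlap must be strictly smaller than the length is inferred from the `step must be >= 1` error.

```python
from typing import Any, Dict


class InvalidIndexSettingsError(ValueError):
    """Hypothetical error type for rejected index settings."""


def validate_text_preprocessing(index_settings: Dict[str, Any]) -> None:
    """Hypothetical check to run when an index is created.

    Raises InvalidIndexSettingsError if the text splitting settings
    would produce a window step smaller than 1.
    """
    text_prep = index_settings.get("index_defaults", {}).get("text_preprocessing", {})
    split_length = text_prep.get("split_length")
    split_overlap = text_prep.get("split_overlap")

    # Assumption: if either value is omitted, defaults apply and are valid.
    if split_length is None or split_overlap is None:
        return

    if split_length < 1:
        raise InvalidIndexSettingsError(
            f"split_length must be >= 1, got {split_length}"
        )
    if split_overlap < 0:
        raise InvalidIndexSettingsError(
            f"split_overlap must be >= 0, got {split_overlap}"
        )
    if split_overlap >= split_length:
        raise InvalidIndexSettingsError(
            f"split_overlap ({split_overlap}) must be smaller than "
            f"split_length ({split_length})"
        )


# The settings from the reproduction would be rejected up front.
bad_settings = {
    "index_defaults": {
        "text_preprocessing": {
            "split_length": 2,
            "split_overlap": 5,
            "split_method": "word",
        }
    }
}
```

Calling `validate_text_preprocessing(bad_settings)` raises an error with a descriptive message, which the API layer could return as a client error instead of Internal Server Error.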
Additional context