initial load of index documents - errors with more than 2000 records #877

beardsleym · 2020-07-22T03:21:28Z

beardsleym
Jul 22, 2020

Hi guys, so great to find MeiliSearch.
I used the 1-click DigitalOcean droplet which was so simple and I was suitably impressed, so now I have spun up MeiliSeacrh on a 1GB VM in Oracle Cloud (free-tier) for longer-term use as per the 'production' docs.
On both, I really struggled initially getting my data from Firestore to MeiliSearch. I'll share a few my different attempts. All running on my local Mac machine.

Get all documents from Firestore collection (25,000) and save to an array, after that add with index.addDocuments().
This would only work if I limited to 2000 documents from Firestore at a time.
Get all documents from Firestore collection (25,000) and loop through each one, adding a single document to MeiliSearch
This worked up till about 1956 records, then I received an error from MeiliSearch, I think it was along the lines of rate limit exceeded.
Get all documents from Firestore collection (25,000) and save in .json file. Read .json file (3.3MB) and upload with addDocuments()
This returned a 413 error (request entity too large) from MeiliSearch. I tried to change the upload limit as suggested in the docs, --http-payload-size-limit=100000000, but that has had no effect, I couldn't actually work out how to confirm the limit that is in place.

FYI, I have an Algolia account and I checked that the .json file was ok by uploading it in their console and all was fine.

eskombro · 2020-07-23T10:49:03Z

eskombro
Jul 23, 2020
Collaborator

@beardsleym I have indexed myself datasets of 20k documents in one single update, but those documents were small and simple. So I don't think there is a problem purely coming from the number of documents. But indeed, one single update with thousands of heavy documents may require huge amounts of RAM available. The alternative of iterating and indexing the documents one by one seems very cost-effective too. What I normally do is trying to index those documents in batches of, for example, 1000 for each request. This normally works fast and smooth. Have you tried this?

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

initial load of index documents - errors with more than 2000 records #877

{{title}}

Replies: 1 comment

{{title}}

Select a reply

initial load of index documents - errors with more than 2000 records #877

beardsleym Jul 22, 2020

Replies: 1 comment

eskombro Jul 23, 2020 Collaborator

beardsleym
Jul 22, 2020

eskombro
Jul 23, 2020
Collaborator