Error running 03_glove_build_counts.py #140

Open
myeghaneh opened this issue Jul 1, 2021 · 3 comments
Comments

@myeghaneh

I followed your steps to train my own S2V vectors for my corpus with my customized NER model. Up to step 2 everything is fine.

corpusMODELV05.spacy was created, and so was corpusMODELV05-1.s2v.

But in step 3 I ran into this error:

ℹ Using 1 input files
✔ Created output directory data/S2VVocabMODELV05
ℹ Creating vocabulary counts
cat data\S2vcorpusMODELV05\corpusMODELV05-1.s2v | data/glove.6B.200d.txt/vocab_count -min-count 5 -verbose 2 > data\S2VVocabMODELV05\vocab.txt

✘ Failed creating vocab counts

I am working on a Windows 10 machine and used this version of GloVe:

Wikipedia 2014 + Gigaword 5 (6B tokens, 400K vocab, uncased, 50d, 100d, 200d, & 300d vectors, 822 MB download): glove.6B.zip

https://nlp.stanford.edu/projects/glove/

It seems that the vocabulary size in glove.6B.200d.txt/vocab_count is not consistent with something.

Can someone help me?

Many thanks in advance.

@myeghaneh
Author

Any ideas? :)

@saimmehmood

I am facing the same issue.
Let me know if you've been able to solve it.

@agonzalezreyes

agonzalezreyes commented Oct 20, 2022

To run scripts/03_glove_build_counts.py successfully, do the following and make sure you pass the correct GloVe build folder:

  1. Verify you have the GloVe submodule (git submodule add https://github.com/stanfordnlp/GloVe.git).
  2. Build it by running cd GloVe && make, which creates a GloVe/build directory. Then go back to the parent directory (cd ..).
  3. Pass the build folder GloVe/build as the GloVe directory argument of 03_glove_build_counts.py: python scripts/03_glove_build_counts.py GloVe/build source_folder output_folder

This is basically what the comments on lines 20-28 of 03_glove_build_counts.py describe.
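The log in the original post shows the script piping into data/glove.6B.200d.txt/vocab_count, which suggests that a pretrained vectors file was passed where the build folder was expected. A quick way to check this before running step 3 could look like the sketch below. The helper name find_glove_tool is hypothetical (it is not part of sense2vec or GloVe), and the .exe fallback is only an assumption about how a Windows build might name the binary.

```python
from pathlib import Path


def find_glove_tool(glove_dir, tool="vocab_count"):
    """Return the path of a compiled GloVe tool inside glove_dir, or raise.

    Hypothetical helper for illustration; not part of sense2vec or GloVe.
    """
    glove_dir = Path(glove_dir)
    # A pretrained vectors file such as glove.6B.200d.txt is not a directory,
    # so it fails this first check.
    if not glove_dir.is_dir():
        raise NotADirectoryError(f"{glove_dir} is not a directory")
    # Assumption: a Windows build may add an .exe suffix to the binary.
    for name in (tool, tool + ".exe"):
        candidate = glove_dir / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(
        f"No '{tool}' binary in {glove_dir}. Did you run make in the GloVe folder?"
    )
```

For example, find_glove_tool("GloVe/build") should return the path to the binary after a successful make, and raise an error if the argument points at a vectors file or at a folder that has not been built yet.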
