This repository has been archived by the owner on Jun 11, 2020. It is now read-only.

Problems with training Tibetan pictures #70

Open
hsyy673150343 opened this issue Apr 13, 2020 · 3 comments

Comments


hsyy673150343 commented Apr 13, 2020

Do you still remember me? I spoke with you on your other open source project last time. This time I wanted to train with the pictures with Tibetan text that I generated last time, but I ran into a problem.
I used the following command:
trdg -c 200000 -i dicts/zzwt_tibetan_sub_string.txt -ft fonts/latin/Qomolangma-UchenSarchung.ttf -t 8 --word_split

But the following error appears:
Missing modules for handwritten text generation.
3%|████▎ | 5527/200000 [00:14<08:25, 384.98it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/site-packages/trdg/data_generator.py", line 21, in generate_from_tuple
    cls.generate(*t)
  File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/site-packages/trdg/data_generator.py", line 230, in generate
    final_image.convert("RGB").save(os.path.join(out_dir, image_name))
  File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/site-packages/PIL/Image.py", line 2099, in save
    fp = builtins.open(filename, "w+b")
OSError: [Errno 36] File name too long: 'out/་ཡི་གེ\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 唇音\xa0\xa0 pa\xa0 pha\xa0 ba\xa0 bha\xa0\xa0 ma\xa0\xa0\xa0\xa0 སྒྲ་ཕྱེད་ཀྱི་ཡི་གེ།\0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 半元音ya\xa0 ra la\xa0\xa0 vaསྒྲ་མེད་ཀྱི་_4343.jpg'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/hs/anaconda3/envs/data_generate/bin/trdg", line 8, in <module>
    sys.exit(main())
  File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/site-packages/trdg/run.py", line 414, in main
    total=args.count,
  File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/site-packages/tqdm/std.py", line 1127, in iter
    for obj in iterable:
  File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/multiprocessing/pool.py", line 699, in next
    raise value
OSError: [Errno 36] File name too long: 'out/་ཡི་གེ\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 唇音\xa0\xa0 pa\xa0 pha\xa0 ba\xa0 bha\xa0\xa0 ma\xa0\xa0\xa0\xa0 སྒྲ་ཕྱེད་ཀྱི་ཡི་གེ།\0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 半元音ya\xa0 ra la\xa0\xa0 vaསྒྲ་མེད་ཀྱི་_4343.jpg'

Can this problem be solved by modifying the code? If yes, how can I modify it? Can you give me a simple guide?

Owner

Belval commented Apr 13, 2020

Use -na 2 to have the labels written to another file.
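
For context on why the save fails: Errno 36 is raised when a single file name component exceeds the filesystem limit, which is 255 bytes on most Linux filesystems (an assumption about the reporter's setup; the limit is not stated in this thread). With the default naming, the whole text sample becomes part of the file name, and each Tibetan code point takes 3 bytes in UTF-8, so long dictionary lines overflow quickly. A minimal Python sketch of that arithmetic follows; `NAME_MAX_BYTES` and both helper names are illustrative and not part of trdg.

```python
# Assumed limit for one file name component, in bytes (typical on Linux).
NAME_MAX_BYTES = 255


def filename_bytes(label, index, ext="jpg"):
    """Return the UTF-8 byte length of a '[LABEL]_[NUMBER].[EXT]' style name."""
    name = "{}_{}.{}".format(label, index, ext)
    return len(name.encode("utf-8"))


def fits_in_filename(label, index, ext="jpg"):
    """True if the name built from this label stays within the assumed limit."""
    return filename_bytes(label, index, ext) <= NAME_MAX_BYTES


# Five Tibetan code points, written as escapes; each is 3 bytes in UTF-8.
syllables = "\u0f61\u0f72\u0f0b\u0f42\u0f7a"
short_label = syllables          # 15 bytes of label text
long_label = syllables * 20      # 300 bytes of label text

print(fits_in_filename(short_label, 4343))
print(fits_in_filename(long_label, 4343))
```

With -na 2 the file name no longer contains the text, so the length of a label stops mattering for the save step.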

Author

hsyy673150343 commented Apr 14, 2020

Use -na 2 to have the labels written to another file.

What is the format of your training data? I saw in the issues that you said the label format of the training set is [LABEL]_[NUMBER].[EXT].

Can the separate label file produced by -na 2 also be used as the label format of the training set for this project?

Owner

Belval commented Apr 14, 2020

You would have to edit the data manager to load the label file, but keep in mind that this project was built for Latin-based languages, and I do not know if it will work at all with Tibetan. You would also have to change the CHAR_VECTOR to match your characters.
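
A rough sketch of those two changes, under stated assumptions: the label file written by -na 2 is taken to hold one "file name, space, text" pair per line (worth checking against the generated file), and CHAR_VECTOR is taken to be a plain string of allowed characters. The function names are hypothetical and do not exist in the CRNN repository.

```python
def load_labels(path):
    """Map image file name -> label text from a '<file name> <text>' file."""
    labels = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            # Split on the first space only; the label itself may contain spaces.
            file_name, _, text = line.partition(" ")
            labels[file_name] = text
    return labels


def build_char_vector(texts):
    """Collect every distinct character seen in the labels, in sorted order."""
    return "".join(sorted(set("".join(texts))))
```

The string returned by `build_char_vector(load_labels(path).values())` could then stand in for the Latin CHAR_VECTOR, provided every character the model should predict appears somewhere in the labels.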

Also, you could try using the programmable API instead of pre-generated data. By editing this line: https://github.com/Belval/CRNN/blob/master/CRNN/data_manager.py#L50 to generate Tibetan data, you could avoid having to pre-generate your data.
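
A hedged sketch of the on-the-fly approach, assuming trdg's `GeneratorFromStrings` class in `trdg.generators` accepts a list of strings, a `count`, and a `fonts` list of font paths, and yields (image, label) pairs as the trdg README describes. `load_strings`, `make_generator`, and `take_batch` are illustrative helpers, not part of either project, and how the result plugs into data_manager.py is left open.

```python
from itertools import islice


def load_strings(path):
    """Read one text sample per line, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def make_generator(strings, font_path, count):
    """Build a trdg generator that renders the given strings with one font."""
    # Imported lazily so the other helpers work without trdg installed.
    from trdg.generators import GeneratorFromStrings

    return GeneratorFromStrings(strings, count=count, fonts=[font_path])


def take_batch(generator, batch_size):
    """Pull up to batch_size items from any iterator of (image, label) pairs."""
    return list(islice(generator, batch_size))
```

Because labels come back alongside the images, nothing is written to disk and the file name limit never comes into play.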
