Problems with training Tibetan pictures #70

hsyy673150343 · 2020-04-13T13:24:09Z

Do you still remember me? We spoke on your other open source project last time. This time I wanted to train on the Tibetan text images I generated then, but I ran into a problem.
I used the following command:
trdg -c 200000 -i dicts/zzwt_tibetan_sub_string.txt -ft fonts/latin/Qomolangma-UchenSarchung.ttf -t 8 --word_split

But the following error appears:
Missing modules for handwritten text generation.
3%|████▎ | 5527/200000 [00:14<08:25, 384.98it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/site-packages/trdg/data_generator.py", line 21, in generate_from_tuple
cls.generate(*t)
File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/site-packages/trdg/data_generator.py", line 230, in generate
final_image.convert("RGB").save(os.path.join(out_dir, image_name))
File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/site-packages/PIL/Image.py", line 2099, in save
fp = builtins.open(filename, "w+b")
OSError: [Errno 36] File name too long: 'out/་ཡི་གེ\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 唇音\xa0\xa0 pa\xa0 pha\xa0 ba\xa0 bha\xa0\xa0 ma\xa0\xa0\xa0\xa0 སྒྲ་ཕྱེད་ཀྱི་ཡི་གེ།\0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 半元音ya\xa0 ra la\xa0\xa0 vaསྒྲ་མེད་ཀྱི་_4343.jpg'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/hs/anaconda3/envs/data_generate/bin/trdg", line 8, in <module>
sys.exit(main())
File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/site-packages/trdg/run.py", line 414, in main
total=args.count,
File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/site-packages/tqdm/std.py", line 1127, in __iter__
for obj in iterable:
File "/home/hs/anaconda3/envs/data_generate/lib/python3.6/multiprocessing/pool.py", line 699, in next
raise value
OSError: [Errno 36] File name too long: 'out/་ཡི་གེ\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 唇音\xa0\xa0 pa\xa0 pha\xa0 ba\xa0 bha\xa0\xa0 ma\xa0\xa0\xa0\xa0 སྒྲ་ཕྱེད་ཀྱི་ཡི་གེ།\0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 半元音ya\xa0 ra la\xa0\xa0 vaསྒྲ་མེད་ཀྱི་_4343.jpg'

Can this problem be solved by modifying the code? If yes, how can I modify it? Can you give me a simple guide?

Belval · 2020-04-13T20:49:50Z

Use -na 2 to have the labels written to another file.
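
For context on why the error occurs: the default naming scheme puts the whole label in the file name, and many Linux filesystems cap a single file name at 255 bytes (an assumption about the reporter's filesystem; the limit varies). Tibetan characters take three bytes each in UTF-8, so long labels exceed that limit quickly. A minimal sketch of the check, using hypothetical helper names that are not part of trdg:

```python
# Assumption: 255 bytes is the per-name limit (typical for ext4 and similar).
NAME_MAX_BYTES = 255


def filename_bytes(label, index, ext="jpg"):
    """Return the UTF-8 byte length of a '[LABEL]_[NUMBER].[EXT]' file name."""
    return len("{}_{}.{}".format(label, index, ext).encode("utf-8"))


def fits_in_filename(label, index, ext="jpg", limit=NAME_MAX_BYTES):
    """True if the label-based file name stays within the byte limit."""
    return filename_bytes(label, index, ext) <= limit


# A short Tibetan label fits; the same label repeated many times does not.
short_label = "སྒྲ་ཕྱེད་ཀྱི་ཡི་གེ།"
long_label = short_label * 10

for label in (short_label, long_label):
    print(len(label), "characters,",
          filename_bytes(label, 4343), "bytes,",
          "ok" if fits_in_filename(label, 4343) else "too long")
```

Labels that fail this check are the ones that need `-na 2` (or shorter input lines in the dictionary file).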

hsyy673150343 · 2020-04-14T06:37:44Z

> Use -na 2 to have the labels written to another file.

What is the format of your training data? I saw in the issues that you said the label format of the training set is [LABEL]_[NUMBER].[EXT].

> Use -na 2 to have the labels written to another file.

Can this also be used for the label format of the training set for this project?

Belval · 2020-04-14T17:59:57Z

You would have to edit the data manager to load the label file, but keep in mind that this project was built for Latin-based languages, and I do not know if it will work at all with Tibetan. You would also have to change the CHAR_VECTOR to match your characters.
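
A minimal sketch of what loading the label file might look like, assuming `-na 2` writes a `labels.txt` with one `<file name> <label>` pair per line (worth verifying against your generated output). The helper names `load_labels` and `build_char_vector` are hypothetical and not part of CRNN or trdg. The second helper derives a candidate CHAR_VECTOR from the labels, treating every Unicode code point as its own symbol, which may or may not suit stacked Tibetan letters:

```python
def load_labels(labels_path):
    """Map image file name -> label text from a 'name label' per-line file."""
    labels = {}
    with open(labels_path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            # Split on the first space only: labels may contain spaces themselves.
            name, _, text = line.partition(" ")
            labels[name] = text
    return labels


def build_char_vector(texts):
    """Collect every distinct character in the labels, in sorted order."""
    return "".join(sorted({ch for text in texts for ch in text}))
```

With these, something like `labels = load_labels("out/labels.txt")` followed by `build_char_vector(labels.values())` would give a starting point for the character set, assuming the images were generated into `out/`.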

Also, you could try using the programmatic API instead of pre-generated data. By editing this line: https://github.com/Belval/CRNN/blob/master/CRNN/data_manager.py#L50 to generate Tibetan data, you could avoid having to pre-generate your data.
