Segmentation fault happening inside docker container but not outside #12128
Comments
[2024/05/16 11:00:15] ppocr ERROR: error in loading image:abc/four-part.1.3.jpg
Traceback (most recent call last):
  File "paddle_and_annotate.py", line 25, in <module>
    result = ocr_model.ocr(img_path)
  File "/usr/local/lib/python3.8/site-packages/paddleocr/paddleocr.py", line 660, in ocr
    img = preprocess_image(img)
  File "/usr/local/lib/python3.8/site-packages/paddleocr/paddleocr.py", line 650, in preprocess_image
    _image = alpha_to_color(_image, alpha_color)
  File "/usr/local/lib/python3.8/site-packages/paddleocr/ppocr/utils/utility.py", line 86, in alpha_to_color
    if len(img.shape) == 3 and img.shape[2] == 4:
AttributeError: 'NoneType' object has no attribute 'shape'

Based on this error, make sure the image path you provide is correct. The error shows that PaddleOCR did not read the image correctly.
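One way to rule out a path problem before calling `ocr_model.ocr` is to check what the glob actually matched inside the container. The sketch below is only an illustration using the standard library: `split_readable_images` and `IMAGE_EXTENSIONS` are hypothetical names, and the extension list is an assumption about the dataset. It does not decode the image, so a file that passes can still fail to load if it is corrupt.

```python
import glob
import os

# Assumption: the extensions the dataset is expected to contain; adjust as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def split_readable_images(src_dir):
    """Classify everything matched by src_dir/* (hypothetical helper).

    Returns (ok, rejected): ok is a list of paths that look usable,
    rejected is a list of (path, reason) tuples.
    """
    ok, rejected = [], []
    for path in sorted(glob.glob(os.path.join(src_dir, "*"))):
        ext = os.path.splitext(path)[1].lower()
        if not os.path.isfile(path):
            # Directories and broken symlinks end up here.
            rejected.append((path, "not a regular file"))
        elif ext not in IMAGE_EXTENSIONS:
            rejected.append((path, "unexpected extension"))
        elif not os.access(path, os.R_OK):
            rejected.append((path, "not readable by this user"))
        elif os.path.getsize(path) == 0:
            rejected.append((path, "empty file"))
        else:
            ok.append(path)
    return ok, rejected
```

Running it against `abc` both inside and outside the container and comparing the two results should show whether the container sees the same files.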
Understood. I was able to resolve it by changing the version of paddlepaddle-gpu to 2.5.2.
This is a C++ error that seems to be related to file reading. More context is needed to determine the specific problem.
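To supply that context, it may help to post the same version report from both environments; the paths in the logs suggest the host uses Python 3.10 while the container uses Python 3.8. A minimal sketch, assuming the distribution names listed in `PACKAGES` are the relevant ones (the helper `collect_versions` is hypothetical):

```python
import platform
from importlib import metadata

# Assumption: distribution names that are likely relevant to this report.
PACKAGES = [
    "paddlepaddle-gpu",
    "paddlepaddle",
    "paddleocr",
    "opencv-python",
    "opencv-python-headless",
    "numpy",
]


def collect_versions(packages=PACKAGES):
    """Return a dict of interpreter, platform, and installed package versions."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            report[name] = "not installed"
    return report
```

Printing `collect_versions()` on the host and in the container gives two dictionaries that can be compared line by line.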
Hi,
Here's a very simple PaddleOCR script I am running outside the Docker container:
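`from paddleocr import PaddleOCR
from matplotlib import pyplot as plt
import cv2
import os
import numpy as np
import glob
from time import time
ocr_model = PaddleOCR(lang='en', use_angle_cls=True, use_gpu=True)
src_dir = 'abc'
dst_dir = 'xyz'
os.makedirs(dst_dir, exist_ok=True)
image_paths = glob.glob(os.path.join(src_dir, '*'))
for img_path in image_paths:
    # The loop body was cut off in the post. The lines below are reconstructed
    # from the traceback (line 25) and the "PaddleOCR time" log output.
    start = time()
    result = ocr_model.ocr(img_path)
    print('PaddleOCR time:', time() - start)`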
Running this code produces no errors:
[2024/05/16 16:27:43] ppocr DEBUG: Namespace(help='==SUPPRESS==', use_gpu=True, use_xpu=False, use_npu=False, ir_optim=True, use_tensorrt=False, min_subgraph_size=15, precision='fp32', gpu_mem=500, gpu_id=0, image_dir=None, page_num=0, det_algorithm='DB', det_model_dir='/home/shubankar/.paddleocr/whl/det/en/en_PP-OCRv3_det_infer', det_limit_side_len=960, det_limit_type='max', det_box_type='quad', det_db_thresh=0.3, det_db_box_thresh=0.6, det_db_unclip_ratio=1.5, max_batch_size=10, use_dilation=False, det_db_score_mode='fast', det_east_score_thresh=0.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_sast_score_thresh=0.5, det_sast_nms_thresh=0.2, det_pse_thresh=0, det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, scales=[8, 16, 32], alpha=1.0, beta=1.0, fourier_degree=5, rec_algorithm='SVTR_LCNet', rec_model_dir='/home/shubankar/.paddleocr/whl/rec/en/en_PP-OCRv4_rec_infer', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_batch_num=6, max_text_length=25, rec_char_dict_path='/home/shubankar/Desktop/envdori/lib/python3.10/site-packages/paddleocr/ppocr/utils/en_dict.txt', use_space_char=True, vis_font_path='./doc/fonts/simfang.ttf', drop_score=0.5, e2e_algorithm='PGNet', e2e_model_dir=None, e2e_limit_side_len=768, e2e_limit_type='max', e2e_pgnet_score_thresh=0.5, e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_pgnet_valid_set='totaltext', e2e_pgnet_mode='fast', use_angle_cls=True, cls_model_dir='/home/shubankar/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_image_shape='3, 48, 192', label_list=['0', '180'], cls_batch_num=6, cls_thresh=0.9, enable_mkldnn=False, cpu_threads=10, use_pdserving=False, warmup=False, sr_model_dir=None, sr_image_shape='3, 32, 128', sr_batch_num=1, draw_img_save_dir='./inference_results', save_crop_res=False, crop_res_save_dir='./output', use_mp=False, total_process_num=1, process_id=0, benchmark=False, save_log_path='./log_output/', show_log=True, use_onnx=False, output='./output', table_max_len=488, table_algorithm='TableAttn', table_model_dir=None, merge_no_span_structure=True, table_char_dict_path=None, layout_model_dir=None, layout_dict_path=None, layout_score_threshold=0.5, layout_nms_threshold=0.5, kie_algorithm='LayoutXLM', ser_model_dir=None, re_model_dir=None, use_visual_backbone=True, ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ocr_order_method=None, mode='structure', image_orientation=False, layout=True, table=True, ocr=True, recovery=False, use_pdf2docx_api=False, invert=False, binarize=False, alphacolor=(255, 255, 255), lang='en', det=True, rec=True, type='ocr', ocr_version='PP-OCRv4', structure_version='PP-StructureV2')
[2024/05/16 16:27:45] ppocr DEBUG: dt_boxes num : 20, elapsed : 0.26409912109375
[2024/05/16 16:27:45] ppocr DEBUG: cls num : 20, elapsed : 0.03380465507507324
[2024/05/16 16:27:45] ppocr DEBUG: rec_res num : 20, elapsed : 0.0963449478149414
PaddleOCR time: 0.4418179988861084
[2024/05/16 16:27:45] ppocr DEBUG: dt_boxes num : 26, elapsed : 0.037672996520996094
[2024/05/16 16:27:45] ppocr DEBUG: cls num : 26, elapsed : 0.01969003677368164
[2024/05/16 16:27:45] ppocr DEBUG: rec_res num : 26, elapsed : 0.10423398017883301
PaddleOCR time: 0.20465898513793945
[2024/05/16 16:27:45] ppocr DEBUG: dt_boxes num : 46, elapsed : 0.04460334777832031
[2024/05/16 16:27:45] ppocr DEBUG: cls num : 46, elapsed : 0.042574167251586914
[2024/05/16 16:27:46] ppocr DEBUG: rec_res num : 46, elapsed : 0.17322564125061035
PaddleOCR time: 0.30808138847351074
[2024/05/16 16:27:46] ppocr DEBUG: dt_boxes num : 29, elapsed : 0.03878664970397949
[2024/05/16 16:27:46] ppocr DEBUG: cls num : 29, elapsed : 0.030057191848754883
[2024/05/16 16:27:46] ppocr DEBUG: rec_res num : 29, elapsed : 0.11749863624572754
PaddleOCR time: 0.22779440879821777
[2024/05/16 16:27:46] ppocr DEBUG: dt_boxes num : 20, elapsed : 0.03751087188720703
[2024/05/16 16:27:46] ppocr DEBUG: cls num : 20, elapsed : 0.0183870792388916
[2024/05/16 16:27:46] ppocr DEBUG: rec_res num : 20, elapsed : 0.07560086250305176
PaddleOCR time: 0.17191839218139648
[2024/05/16 16:27:46] ppocr DEBUG: dt_boxes num : 36, elapsed : 0.038983821868896484
[2024/05/16 16:27:46] ppocr DEBUG: cls num : 36, elapsed : 0.029161930084228516
[2024/05/16 16:27:46] ppocr DEBUG: rec_res num : 36, elapsed : 0.15292072296142578
PaddleOCR time: 0.26297926902770996
[2024/05/16 16:27:47] ppocr DEBUG: dt_boxes num : 21, elapsed : 0.03756856918334961
[2024/05/16 16:27:47] ppocr DEBUG: cls num : 21, elapsed : 0.02182292938232422
[2024/05/16 16:27:47] ppocr DEBUG: rec_res num : 21, elapsed : 0.08386802673339844
PaddleOCR time: 0.18845915794372559
[2024/05/16 16:27:47] ppocr DEBUG: dt_boxes num : 19, elapsed : 0.046122074127197266
[2024/05/16 16:27:47] ppocr DEBUG: cls num : 19, elapsed : 0.02371668815612793
[2024/05/16 16:27:47] ppocr DEBUG: rec_res num : 19, elapsed : 0.07404541969299316
PaddleOCR time: 0.18595504760742188
[2024/05/16 16:27:47] ppocr DEBUG: dt_boxes num : 47, elapsed : 0.04474592208862305
[2024/05/16 16:27:47] ppocr DEBUG: cls num : 47, elapsed : 0.03702855110168457
[2024/05/16 16:27:47] ppocr DEBUG: rec_res num : 47, elapsed : 0.174943208694458
PaddleOCR time: 0.30758070945739746
[2024/05/16 16:27:48] ppocr DEBUG: dt_boxes num : 24, elapsed : 0.0385584831237793
[2024/05/16 16:27:48] ppocr DEBUG: cls num : 24, elapsed : 0.020543336868286133
[2024/05/16 16:27:48] ppocr DEBUG: rec_res num : 24, elapsed : 0.09856367111206055
PaddleOCR time: 0.19896602630615234
[2024/05/16 16:27:48] ppocr DEBUG: dt_boxes num : 27, elapsed : 0.0385439395904541
[2024/05/16 16:27:48] ppocr DEBUG: cls num : 27, elapsed : 0.023215293884277344
[2024/05/16 16:27:48] ppocr DEBUG: rec_res num : 27, elapsed : 0.10276460647583008
PaddleOCR time: 0.2052445411682129
[2024/05/16 16:27:48] ppocr DEBUG: dt_boxes num : 20, elapsed : 0.035872697830200195
[2024/05/16 16:27:48] ppocr DEBUG: cls num : 20, elapsed : 0.018585205078125
[2024/05/16 16:27:48] ppocr DEBUG: rec_res num : 20, elapsed : 0.07669782638549805
PaddleOCR time: 0.17434000968933105
[2024/05/16 16:27:48] ppocr DEBUG: dt_boxes num : 36, elapsed : 0.0416712760925293
[2024/05/16 16:27:48] ppocr DEBUG: cls num : 36, elapsed : 0.030147075653076172
[2024/05/16 16:27:49] ppocr DEBUG: rec_res num : 36, elapsed : 0.14519858360290527
PaddleOCR time: 0.25902247428894043
[2024/05/16 16:27:49] ppocr DEBUG: dt_boxes num : 24, elapsed : 0.0388178825378418
[2024/05/16 16:27:49] ppocr DEBUG: cls num : 24, elapsed : 0.021373271942138672
[2024/05/16 16:27:49] ppocr DEBUG: rec_res num : 24, elapsed : 0.10416460037231445
PaddleOCR time: 0.20946550369262695
[2024/05/16 16:27:49] ppocr DEBUG: dt_boxes num : 26, elapsed : 0.03809237480163574
[2024/05/16 16:27:49] ppocr DEBUG: cls num : 26, elapsed : 0.028120994567871094
[2024/05/16 16:27:49] ppocr DEBUG: rec_res num : 26, elapsed : 0.09642601013183594
PaddleOCR time: 0.20334506034851074
[2024/05/16 16:27:49] ppocr DEBUG: dt_boxes num : 24, elapsed : 0.03925800323486328
[2024/05/16 16:27:49] ppocr DEBUG: cls num : 24, elapsed : 0.019763946533203125
[2024/05/16 16:27:49] ppocr DEBUG: rec_res num : 24, elapsed : 0.09666180610656738
PaddleOCR time: 0.19680356979370117
However, when I try to reproduce the same results inside the Docker container, I hit the following error:
[2024/05/16 11:00:15] ppocr ERROR: error in loading image:abc/four-part.1.3.jpg
Traceback (most recent call last):
  File "paddle_and_annotate.py", line 25, in <module>
    result = ocr_model.ocr(img_path)
  File "/usr/local/lib/python3.8/site-packages/paddleocr/paddleocr.py", line 660, in ocr
    img = preprocess_image(img)
  File "/usr/local/lib/python3.8/site-packages/paddleocr/paddleocr.py", line 650, in preprocess_image
    _image = alpha_to_color(_image, alpha_color)
  File "/usr/local/lib/python3.8/site-packages/paddleocr/ppocr/utils/utility.py", line 86, in alpha_to_color
    if len(img.shape) == 3 and img.shape[2] == 4:
AttributeError: 'NoneType' object has no attribute 'shape'
So I tried restarting the Docker container, but got the following:
`root@4eb0df9924f1:/code/build/data# python3 paddle_and_annotate.py
[2024/05/16 11:00:36] ppocr DEBUG: Namespace(alpha=1.0, alphacolor=(255, 255, 255), benchmark=False, beta=1.0, binarize=False, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir='/code/build/data/3', cls_thresh=0.9, cpu_threads=10, crop_res_save_dir='./output', det=True, det_algorithm='DB', det_box_type='quad', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_limit_side_len=960, det_limit_type='max', det_model_dir='/code/build/data/1', det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, det_pse_thresh=0, det_sast_nms_thresh=0.2, det_sast_score_thresh=0.5, draw_img_save_dir='./inference_results', drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=False, fourier_degree=5, gpu_id=0, gpu_mem=500, help='==SUPPRESS==', image_dir=None, image_orientation=False, invert=False, ir_optim=True, kie_algorithm='LayoutXLM', label_list=['0', '180'], lang='en', layout=True, layout_dict_path=None, layout_model_dir=None, layout_nms_threshold=0.5, layout_score_threshold=0.5, max_batch_size=10, max_text_length=25, merge_no_span_structure=True, min_subgraph_size=15, mode='structure', ocr=True, ocr_order_method=None, ocr_version='PP-OCRv4', output='./output', page_num=0, precision='fp32', process_id=0, re_model_dir=None, rec=True, rec_algorithm='SVTR_LCNet', rec_batch_num=6, rec_char_dict_path='/usr/local/lib/python3.8/site-packages/paddleocr/ppocr/utils/en_dict.txt', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_model_dir='/code/build/data/2', recovery=False, save_crop_res=False, save_log_path='./log_output/', scales=[8, 16, 32], ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ser_model_dir=None, show_log=True, sr_batch_num=1, sr_image_shape='3, 32, 128', sr_model_dir=None, structure_version='PP-StructureV2', table=True, table_algorithm='TableAttn', table_char_dict_path=None, table_max_len=488, table_model_dir=None, total_process_num=1, type='ocr', use_angle_cls=True, use_dilation=False, use_gpu=True, use_mp=False, use_npu=False, use_onnx=False, use_pdf2docx_api=False, use_pdserving=False, use_space_char=True, use_tensorrt=False, use_visual_backbone=True, use_xpu=False, vis_font_path='./doc/fonts/simfang.ttf', warmup=False)
C++ Traceback (most recent call last):
0 inflateReset2
Error Message Summary:
FatalError: Segmentation fault is detected by the operating system.
[TimeInfo: *** Aborted at 1715857238 (unix time) try "date -d @1715857238" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0x5895aaca3904) received by PID 94 (TID 0x709c6334fb80) from PID 18446744072279963908 ***]
Segmentation fault (core dumped)`
I have dealt with segfaults before, and they usually go away after downgrading to a previous library version through trial and error, but that did not seem to work here either.
Kindly help me troubleshoot this (language preference: English)
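A possible way to get more than the single `inflateReset2` frame is Python's built-in `faulthandler`, which dumps the Python-level stack when the process receives SIGSEGV. The sketch below assumes a POSIX system; `run_with_faulthandler` and `describe_exit` are hypothetical helper names. It runs the script in a child interpreter with `-X faulthandler` so the crash does not take down the caller:

```python
import signal
import subprocess
import sys


def run_with_faulthandler(script_args):
    """Run a Python script in a child interpreter with faulthandler enabled.

    script_args is the argument list that would follow "python3" on the
    command line. Returns (returncode, stderr_text). On POSIX, a negative
    return code -N means the child was killed by signal N.
    """
    cmd = [sys.executable, "-X", "faulthandler", *script_args]
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stderr


def describe_exit(returncode):
    """Turn a subprocess return code into a short readable description."""
    if returncode < 0:
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status " + str(returncode)
```

For example, `run_with_faulthandler(["paddle_and_annotate.py"])` inside the container returns the exit code and the captured stderr; the Python frames printed there may point at the import or call that triggers the crash.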