ZeroDivisionError Traceback (most recent call last)
/tmp/ipykernel_7338/1281916851.py in <module>
----> 1 tables_cam=camelot.read_pdf(filepath='pdfs_files/fulltext.pdf',
2 pages="9,10",
3 flavor='stream',
4 edge_tol=500
5 )
~/anaconda3/envs/test/lib/python3.8/site-packages/camelot/io.py in read_pdf(filepath, pages, password, flavor, suppress_stdout, layout_kwargs, **kwargs)
111 p = PDFHandler(filepath, pages=pages, password=password)
112 kwargs = remove_extra(kwargs, flavor=flavor)
--> 113 tables = p.parse(
114 flavor=flavor,
115 suppress_stdout=suppress_stdout,
~/anaconda3/envs/test/lib/python3.8/site-packages/camelot/handlers.py in parse(self, flavor, suppress_stdout, layout_kwargs, **kwargs)
174 parser = Lattice(**kwargs) if flavor == "lattice" else Stream(**kwargs)
175 for p in pages:
--> 176 t = parser.extract_tables(
177 p, suppress_stdout=suppress_stdout, layout_kwargs=layout_kwargs
178 )
~/anaconda3/envs/test/lib/python3.8/site-packages/camelot/parsers/stream.py in extract_tables(self, filename, suppress_stdout, layout_kwargs)
461 sorted(self.table_bbox.keys(), key=lambda x: x[1], reverse=True)
462 ):
--> 463 cols, rows = self._generate_columns_and_rows(table_idx, tk)
464 table = self._generate_table(table_idx, cols, rows)
465 table._bbox = tk
~/anaconda3/envs/test/lib/python3.8/site-packages/camelot/parsers/stream.py in _generate_columns_and_rows(self, table_idx, tk)
323 # select elements which lie within table_bbox
324 t_bbox = {}
--> 325 t_bbox["horizontal"] = text_in_bbox(tk, self.horizontal_text)
326 t_bbox["vertical"] = text_in_bbox(tk, self.vertical_text)
327
~/anaconda3/envs/test/lib/python3.8/site-packages/camelot/utils.py in text_in_bbox(bbox, text)
374 if bbox_intersect(ba, bb):
375 # if the intersection is larger than 80% of ba's size, we keep the longest
--> 376 if (bbox_intersection_area(ba, bb) / bbox_area(ba)) > 0.8:
377 if bbox_longer(bb, ba):
378 rest.discard(ba)
ZeroDivisionError: float division by zero
I would expect it to report that no tables were found (as it normally does) rather than crash with a division by zero. How can I prevent this?
PDF_FILE: fulltext.pdf
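A possible stopgap, sketched here under assumptions rather than as a fix in camelot itself: the traceback shows the division failing in `text_in_bbox` when `bbox_area(ba)` evaluates to zero, so the caller can catch `ZeroDivisionError` and treat it as "no tables found". The helper name `tables_or_empty` is hypothetical and is not part of camelot.

```python
def tables_or_empty(reader, *args, **kwargs):
    """Call reader(*args, **kwargs) and return its tables as a list.

    `reader` is any callable that returns an iterable of tables,
    for example camelot.read_pdf. If the call raises ZeroDivisionError
    (as in the traceback above), return an empty list instead.
    """
    try:
        return list(reader(*args, **kwargs))
    except ZeroDivisionError:
        # Treat the failed division as "no tables found".
        return []


# Intended use with camelot (left as a comment so the sketch runs without it):
#
#   import camelot
#   tables = tables_or_empty(
#       camelot.read_pdf,
#       "pdfs_files/fulltext.pdf",
#       pages="9,10",
#       flavor="stream",
#       edge_tol=500,
#   )
```

Catching the error around a multi-page call discards every page in that call, including pages that would have parsed cleanly. Calling the helper once per page (`pages="9"`, then `pages="10"`) keeps the tables from the pages that do not fail.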
PS: If I convert a PDF page into an image, find the table area on the image, and then pass the corresponding area to camelot to extract the tables, is the conversion from the position on the image to the position in the PDF just pos_image * size_pdf / size_img?
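For the conversion, scaling alone is probably not enough. As far as I understand the camelot documentation, `table_areas` takes strings of the form `x1,y1,x2,y2` in PDF coordinate space, where (x1, y1) is the left-top corner and (x2, y2) the right-bottom corner, and PDF space has its origin at the bottom-left, while image coordinates start at the top-left. So the y values would need to be flipped as well as scaled. A minimal sketch, assuming the image covers the whole page, the page is not rotated or cropped, and the page origin is at (0, 0); the function name `image_box_to_pdf_area` is hypothetical:

```python
def image_box_to_pdf_area(box_px, image_size_px, page_size_pt):
    """Convert an image-space box to an 'x1,y1,x2,y2' string in PDF space.

    box_px        -- (left, top, right, bottom) in pixels, origin top-left
    image_size_px -- (width, height) of the rendered page image in pixels
    page_size_pt  -- (width, height) of the PDF page in points

    Assumes the image shows the full, unrotated page.
    """
    left, top, right, bottom = box_px
    img_w, img_h = image_size_px
    page_w, page_h = page_size_pt

    # Scale factors: this is the pos_image * size_pdf / size_img part.
    sx = page_w / img_w
    sy = page_h / img_h

    # x only needs scaling.
    x1 = left * sx
    x2 = right * sx

    # y needs scaling and flipping, because PDF y grows upward.
    y1 = page_h - top * sy      # top edge of the box
    y2 = page_h - bottom * sy   # bottom edge of the box

    return f"{x1:.2f},{y1:.2f},{x2:.2f},{y2:.2f}"


# A 612 x 792 pt page rendered at twice that size in pixels:
area = image_box_to_pdf_area((100, 200, 500, 600), (1224, 1584), (612, 792))
print(area)  # 50.00,692.00,250.00,492.00
```

The resulting string could then be passed as `table_areas=[area]` to `camelot.read_pdf`, after checking it against one of camelot's debug plots to confirm the box lands where expected.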
Hello, I'm trying to extract tables with default parameters in stream mode. The call I ran and the ZeroDivisionError it returns are shown in the traceback above.