Tables with nested columns (almost right) #155

danielduckworth · 2024-04-23T12:03:36Z

First of all, this is the best indexing service for documents with tables that I've tried so far. The use of markdown is highly appreciated: in my own testing, I've found markdown to be much more effective than plain text when it comes to tables. So kudos!

I just wanted to give some feedback about what the LlamaParse pipeline got wrong. The PDF document I uploaded (an appendix from a public research report) is attached for reference.

ICILS 2018 Appendix E Student percentages.pdf

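Below are the parsing instruction and the resulting markdown. Interestingly, the second table was closer to the original than the first: Table E.1 came back with only the first five of its ten sub-columns, while Table E.2 kept all eight sub-columns but dropped the "Study of ICT-related subject (%)" group header.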

For comparison, I have also included a similar prompt used with ChatGPT4 and the result (which accurately represented the structure of both tables). If LlamaParse uses a system prompt, tuning the instruction to flatten nested columns might help.
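To make "flatten nested columns" concrete, here is a minimal Python sketch of the idea: each group name is prefixed to its sub-column names so the table needs only one header row. The function names `flatten_headers` and `markdown_header` are hypothetical helpers written for this illustration; they are not part of LlamaParse, and I don't know how the pipeline builds headers internally.

```python
def flatten_headers(groups, first_column="Country"):
    """Flatten {group: [sub-column, ...]} into a single row of column names."""
    columns = [first_column]
    for group, sub_columns in groups.items():
        for sub in sub_columns:
            # Prefix the group name so each flat column stays unambiguous.
            columns.append(f"{group} {sub}")
    return columns


def markdown_header(columns):
    """Render the header row and the divider row of a markdown table."""
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    return header + "\n" + divider


# Groups and sub-columns of Table E.2, taken from the parsing instruction.
table_e2_groups = {
    "Immigrant background (%)": [
        "No immigrant background",
        "Immigrant background",
    ],
    "Test language (%)": [
        "Students speaking another language at home",
        "Student speaking the survey language at home",
    ],
    "Computer resources at home (%)": [
        "Fewer than two computers",
        "Two computers or more",
    ],
    "Study of ICT-related subject (%)": [
        "Not studying ICT-related subject",
        "Studying ICT-related subject",
    ],
}

columns = flatten_headers(table_e2_groups)
print(markdown_header(columns))
```

For Table E.2 this yields nine flat columns (Country plus four groups of two), which is the shape a plain markdown table can represent without merged cells.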

Parsing instruction:

The provided document is Appendix E from the ICILS 2018 international report. It contains tables. The columns after "Country" are groups with two sub-columns each. Table E.1 has the following groups and sub-columns: Gender (%): Males | Females, Parental occupation (%): SEI below 50 | SEI of 50 or above, Parental education (%): No parents with university degree | At least one parent with university degree, Books at home (%): 25 books or less | More than 25 books, Computer resources at home (%): Fewer than two computers | Two computers or more. Table E.2 has the following groups and sub-columns: Immigrant background (%): No immigrant background | Immigrant background, Test language (%): Students speaking another language at at home | Student speaking the survey language at home, Computer resources at home (%): Fewer than two computers | Two computers or more, Study of ICT-related subject (%): Not studying ICT-related subject | Studying ICT-related subject

Result:

ICILS 2018 International Report - Appendix E

Table E.1: Percentages of students in categories for dichotomous variables used in Chapters 3, 4, 5, and 6

Country	Gender (%)	Parental occupation (%)	Parental education (%)	Books at home (%)	Computer resources at home (%)
	Males	Females	SEI below 50	SEI of 50 or above	No parents with university degree
Chile	51 (1.7)	49 (1.7)	68 (1.7)	32 (1.7)	75 (1.4)
Denmark †¹	51 (0.9)	50 (0.9)	37 (1.3)	63 (1.3)	59 (1.6)
Finland	51 (0.8)	49 (0.8)	52 (1.4)	48 (1.4)	45 (1.3)
France	50 (0.6)	50 (0.6)	50 (1.1)	50 (1.1)	69 (1.2)
Germany	52 (0.9)	49 (0.9)	54 (1.4)	46 (1.4)	70 (1.4)
Italy²	52 (0.6)	48 (0.6)	59 (1.5)	41 (1.5)	69 (1.3)
Kazakhstan¹	52 (0.7)	48 (0.7)	51 (1.4)	49 (1.4)	65 (1.5)
Korea, Republic of	53 (0.9)	47 (0.9)	43 (1.5)	57 (1.5)	27 (1.4)
Luxembourg	53 (0.7)	47 (0.7)	58 (0.6)	43 (0.6)	56 (0.5)
Portugal ††¹	51 (1.0)	49 (1.0)	56 (1.5)	44 (1.5)	67 (1.6)
Uruguay	49 (1.0)	51 (1.0)	71 (1.5)	30 (1.5)	78 (1.6)
ICILS 2018 average	51 (0.3)	49 (0.3)	54 (0.4)	46 (0.4)	62 (0.4)
Not meeting sample participation requirements	United States	51 (0.4)	49 (0.4)	39 (1.2)	62 (1.2)
Benchmarking participants meeting sample participation requirements	Moscow (Russian Federation)	51 (0.7)	49 (0.7)	21 (1.1)	79 (1.1)
	North Rhine-Westphalia (Germany)	52 (1.1)	48 (1.1)	46 (1.5)	54 (1.5)

Notes: Standard errors appear in parentheses. Because results are rounded to the nearest whole number, some totals may appear inconsistent. SEI = socioeconomic index.

† Met guidelines for sampling participation rates only after replacement schools were included.

†† Nearly met guidelines for sampling participation rates after replacement schools were included.

1 National defined population covers 90% to 95% of the national target population.

2 Country surveyed target grade in the first half of the school year.

ICILS 2018 International Report - Appendix E

Table E.2: Percentages of students in categories for dichotomous variables used in Chapters 3, 4, 5, and 6

Country	Immigrant background (%)			Test language (%)			Computer resources at home (%)
	No immigrant background	Immigrant background	Students speaking another language at home	Student speaking the survey language at home	Fewer than two computers	Two computers or more	Not studying ICT-related subject	Studying ICT-related subject
Chile	95 (1.2)	5 (1.2)	3 (0.3)	98 (0.3)	37 (1.4)	63 (1.4)	14 (2.1)	86 (2.1)
Denmark † ¹	90 (0.8)	10 (0.8)	7 (0.6)	93 (0.6)	8 (0.6)	92 (0.6)	85 (1.0)	15 (1.0)
Finland	97 (0.4)	3 (0.4)	7 (0.7)	93 (0.7)	26 (1.1)	74 (1.1)	70 (1.5)	30 (1.5)
France	85 (1.1)	15 (1.1)	13 (0.9)	87 (0.9)	29 (1.0)	72 (1.0)	18 (1.0)	82 (1.0)
Germany	77 (1.1)	23 (1.1)	20 (0.9)	80 (0.9)	27 (1.1)	73 (1.1)	63 (2.4)	37 (2.4)
Italy²	90 (0.6)	11 (0.6)	21 (1.0)	79 (1.0)	42 (1.1)	58 (1.1)	10 (0.9)	90 (0.9)
Kazakhstan¹	92 (0.8)	9 (0.8)	8 (1.0)	93 (1.0)	71 (1.1)	29 (1.1)	9 (0.8)	91 (0.8)
Korea, Republic of	100 (0.1)	0 (0.1)	1 (0.2)	99 (0.2)	39 (1.3)	61 (1.3)	73 (2.9)	27 (2.9)
Luxembourg	50 (0.8)	50 (0.8)	81 (0.6)	20 (0.6)	20 (0.5)	80 (0.5)	51 (0.5)	49 (0.5)
Portugal †† ¹	93 (0.6)	7 (0.6)	4 (0.4)	96 (0.4)	33 (1.0)	67 (1.0)	12 (1.7)	89 (1.7)
Uruguay	98 (0.3)	2 (0.3)	3 (0.4)	97 (0.4)	45 (1.2)	55 (1.2)	2 (0.8)	98 (0.8)
ICILS 2018 average	88 (0.2)	12 (0.2)	15 (0.2)	85 (0.2)	34 (0.3)	66 (0.3)	37 (0.5)	63 (0.5)
United States	94 (0.5)	6 (0.5)	17 (1.2)	83 (1.2)	30 (0.8)	70 (0.8)	51 (1.3)	49 (1.3)
Moscow (Russian Federation)	90 (0.9)	10 (0.9)	4 (0.5)	96 (0.5)	29 (1.4)	71 (1.4)	2 (0.3)	98 (0.3)
North Rhine-Westphalia (Germany)	73 (1.8)	27 (1.8)	23 (1.8)	77 (1.8)	28 (1.3)	72 (1.3)	67 (2.2)	33 (2.2)

Notes: Standard errors appear in parentheses. Because results are rounded to the nearest whole number, some totals may appear inconsistent.

† Met guidelines for sampling participation rates only after replacement schools were included.

†† Nearly met guidelines for sampling participation rates after replacement schools were included.

ChatGPT prompt:

The provided document is Appendix E from the ICILS 2018 international report. It contains tables. The columns after "Country" are groups with two sub-columns each. Table E.1 has the following groups and sub-columns: Gender (%): Males | Females, Parental occupation (%): SEI below 50 | SEI of 50 or above, Parental education (%): No parents with university degree | At least one parent with university degree, Books at home (%): 25 books or less | More than 25 books, Computer resources at home (%): Fewer than two computers | Two computers or more. Table E.2 has the following groups and sub-columns: Immigrant background (%): No immigrant background | Immigrant background, Test language (%): Students speaking another language at at home | Student speaking the survey language at home, Computer resources at home (%): Fewer than two computers | Two computers or more, Study of ICT-related subject (%): Not studying ICT-related subject | Studying ICT-related subject

Convert the tables to markdown tables.

Result:

Here are the converted markdown tables based on the information from Appendix E of the ICILS 2018 international report.

Table E.1: Percentages of students in categories for dichotomous variables used in Chapters 3, 4, 5, and 6

Country	Gender (%) Males	Gender (%) Females	Parental Occupation (%) SEI Below 50	Parental Occupation (%) SEI 50 or Above	Parental Education (%) No University Degree	Parental Education (%) University Degree	Books at Home (%) 25 or Less	Books at Home (%) More Than 25	Computer Resources at Home (%) Fewer Than Two	Computer Resources at Home (%) Two or More
Chile	51 (1.7)	49 (1.7)	68 (1.7)	32 (1.7)	75 (1.4)	26 (1.4)	63 (1.8)	37 (1.8)	37 (1.4)	63 (1.4)
Denmark	51 (0.9)	50 (0.9)	37 (1.3)	63 (1.3)	59 (1.6)	41 (1.6)	28 (1.1)	72 (1.1)	8 (0.6)	92 (0.6)
Finland	51 (0.8)	49 (0.8)	52 (1.4)	48 (1.4)	45 (1.3)	55 (1.3)	27 (1.3)	73 (1.3)	26 (1.1)	74 (1.1)
France	50 (0.6)	50 (0.6)	50 (1.1)	50 (1.1)	69 (1.2)	31 (1.2)	41 (1.2)	59 (1.2)	29 (1.0)	72 (1.0)
Germany	52 (0.9)	49 (0.9)	54 (1.4)	46 (1.4)	70 (1.4)	30 (1.4)	24 (1.2)	76 (1.2)	27 (1.1)	73 (1.1)
Italy	52 (0.6)	48 (0.6)	59 (1.5)	41 (1.5)	69 (1.3)	31 (1.3)	31 (1.3)	69 (1.3)	42 (1.1)	58 (1.1)
Kazakhstan	52 (0.7)	48 (0.7)	51 (1.4)	49 (1.4)	65 (1.5)	35 (1.5)	55 (1.6)	45 (1.6)	71 (1.1)	29 (1.1)
Korea, Republic of	53 (0.9)	47 (0.9)	43 (1.5)	57 (1.5)	27 (1.4)	73 (1.4)	13 (0.8)	88 (0.8)	39 (1.3)	61 (1.3)
Luxembourg	53 (0.7)	47 (0.7)	58 (0.6)	42 (0.6)	56 (0.5)	44 (0.5)	28 (0.5)	72 (0.5)	20 (0.5)	80 (0.5)
Portugal	51 (1.0)	49 (1.0)	56 (1.5)	44 (1.5)	67 (1.6)	33 (1.6)	38 (1.2)	62 (1.2)	33 (1.0)	67 (1.0)
Uruguay	49 (1.0)	51 (1.0)	71 (1.5)	30 (1.5)	78 (1.6)	23 (1.6)	69 (1.5)	31 (1.5)	45 (1.2)	55 (1.2)
United States	51 (0.4)	49 (0.4)	39 (1.2)	62 (1.2)	52 (1.3)	48 (1.3)	37 (0.8)	63 (0.8)	30 (0.8)	70 (0.8)

Table E.2: Percentages of students in categories for dichotomous variables used in Chapters 3, 4, 5, and 6

Country	Immigrant Background (%) No	Immigrant Background (%) Yes	Test Language (%) Other	Test Language (%) Survey	Computer Resources at Home (%) Fewer Than Two	Computer Resources at Home (%) Two or More	Study of ICT-Related Subject (%) No	Study of ICT-Related Subject (%) Yes
Chile	95 (1.2)	5 (1.2)	3 (0.3)	98 (0.3)	37 (1.4)	63 (1.4)	14 (2.1)	86 (2.1)
Denmark	90 (0.8)	10 (0.8)	7 (0.6)	93 (0.6)	8 (0.6)	92 (0.6)	85 (1.0)	15 (1.0)
Finland	97 (0.4)	3 (0.4)	7 (0.7)	93 (0.7)	26 (1.1)	74 (1.1)	70 (1.5)	30 (1.5)
France	85 (1.1)	15 (1.1)	13 (0.9)	87 (0.9)	29 (1.0)	72 (1.0)	18 (1.0)	82 (1.0)
Germany	77 (1.1)	23 (1.1)	20 (0.9)	80 (0.9)	27 (1.1)	73 (1.1)	63 (2.4)	37 (2.4)
Italy	90 (0.6)	11 (0.6)	21 (1.0)	79 (1.0)	42 (1.1)	58 (1.1)	10 (0.9)	90 (0.9)
Kazakhstan	92 (0.8)	9 (0.8)	8 (1.0)	93 (1.0)	71 (1.1)	29 (1.1)	9 (0.8)	91 (0.8)
Korea, Republic of	100 (0.1)	0 (0.1)	1 (0.2)	99 (0.2)	39 (1.3)	61 (1.3)	73 (2.9)	27 (2.9)
Luxembourg	50 (0.8)	50 (0.8)	81 (0.6)	20 (0.6)	20 (0.5)	80 (0.5)	51 (0.5)	49 (0.5)
Portugal	93 (0.6)	7 (0.6)	4 (0.4)	96 (0.4)	33 (1.0)	67 (1.0)	12 (1.7)	89 (1.7)
United States	94 (0.5)	6 (0.5)	17 (1.2)	83 (1.2)	30 (0.8)	70 (0.8)	51 (1.3)	49 (1.3)

These tables are formatted to display the percentage of students within each sub-category along with the standard error in parentheses, following the structure provided in the ICILS 2018 Appendix E.
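Either result can also be checked mechanically. The sketch below is only an illustration, with a hypothetical helper name `check_row`, and it assumes each data row is a tab-separated line of the form country followed by "percent (standard error)" cells. It relies on two properties of these tables: every row should have one country cell plus two cells per group, and because the variables are dichotomous, each pair of percentages should sum to 100 give or take 1 for rounding.

```python
import re

# A value cell looks like "51 (1.7)": whole-number percent, then the standard error.
CELL = re.compile(r"(\d+) \((\d+(?:\.\d+)?)\)")


def check_row(cells, n_groups):
    """Return a list of problems found in one parsed data row.

    cells: the row split on tabs; cells[0] is expected to be the country name.
    n_groups: number of two-column groups the table should have.
    """
    problems = []
    expected = 1 + 2 * n_groups
    if len(cells) != expected:
        problems.append(f"expected {expected} cells, got {len(cells)}")

    percents = []
    for cell in cells[1:]:
        match = CELL.fullmatch(cell.strip())
        if match is None:
            # A label in a value position usually means the row is shifted.
            problems.append(f"not a 'percent (SE)' cell: {cell!r}")
            return problems
        percents.append(int(match.group(1)))

    # Each group is a pair of complementary percentages.
    for i in range(0, len(percents) - 1, 2):
        total = percents[i] + percents[i + 1]
        if abs(total - 100) > 1:
            problems.append(f"pair starting at value {i + 1} sums to {total}")
    return problems


# Chile as returned by LlamaParse for Table E.1 (five groups expected).
e1_row = "Chile\t51 (1.7)\t49 (1.7)\t68 (1.7)\t32 (1.7)\t75 (1.4)".split("\t")
# Chile as returned by LlamaParse for Table E.2 (four groups expected).
e2_row = (
    "Chile\t95 (1.2)\t5 (1.2)\t3 (0.3)\t98 (0.3)"
    "\t37 (1.4)\t63 (1.4)\t14 (2.1)\t86 (2.1)"
).split("\t")

print(check_row(e1_row, n_groups=5))
print(check_row(e2_row, n_groups=4))
```

With the rows above, the Table E.1 row is flagged for having 6 cells instead of 11, while the Table E.2 row passes. A check like this says nothing about whether the header text is right, so it would not catch the missing group header in Table E.2.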

hexapode self-assigned this Apr 23, 2024

logan-markewich added the pdf_debug label May 14, 2024
