First of all, this is the best document (with tables) indexing service I've tried so far. The use of markdown is highly appreciated. In my own testing, I've found markdown to be much more effective than plain text when it comes to tables. So Kudos!
I just wanted to give some feedback about what the LlamaParse pipeline got wrong. The PDF document I uploaded (an appendix from a public research report) is attached for reference: ICILS 2018 Appendix E Student percentages.pdf
Below are the parsing instruction and the resulting markdown. Interestingly, the second table was closer to the original than the first.
For comparison, I have also included a similar prompt used with ChatGPT4 and the result (which accurately represented the structure of both tables). If LlamaParse uses a system prompt, tuning the instruction to flatten nested columns might help.
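To make "flatten nested columns" concrete, here is a minimal Python sketch of the header transformation I have in mind, using the Table E.2 groups named in the parsing instruction. The function names (`flatten_headers`, `markdown_header`) and the "Group: Sub-column" naming scheme are my own assumptions for illustration; this is not how LlamaParse or ChatGPT work internally.

```python
def flatten_headers(first_column, groups):
    """Return one flat header per column, e.g. 'Group: Sub-column'.

    `groups` maps each group header to its list of sub-column headers
    (hypothetical structure, chosen only for this illustration).
    """
    headers = [first_column]
    for group, sub_columns in groups.items():
        for sub in sub_columns:
            headers.append(f"{group}: {sub}")
    return headers


def markdown_header(headers):
    """Build the header line and separator line of a markdown table."""
    line = "| " + " | ".join(headers) + " |"
    separator = "|" + "|".join([" --- "] * len(headers)) + "|"
    return line + "\n" + separator


# Groups and sub-columns of Table E.2, as described in the parsing instruction.
TABLE_E2_GROUPS = {
    "Immigrant background (%)": [
        "No immigrant background",
        "Immigrant background",
    ],
    "Test language (%)": [
        "Students speaking another language at home",
        "Students speaking the survey language at home",
    ],
    "Computer resources at home (%)": [
        "Fewer than two computers",
        "Two computers or more",
    ],
    "Study of ICT-related subject (%)": [
        "Not studying ICT-related subject",
        "Studying ICT-related subject",
    ],
}

E2_HEADERS = flatten_headers("Country", TABLE_E2_GROUPS)

if __name__ == "__main__":
    print(markdown_header(E2_HEADERS))
```

With one flat header per column, every data row lines up with exactly one header cell, which is what a single-header-row markdown table needs.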
Parsing instruction:
The provided document is Appendix E from the ICILS 2018 international report. It contains tables. The columns after "Country" are groups with two sub-columns each. Table E.1 has the following groups and sub-columns: Gender (%): Males | Females, Parental occupation (%): SEI below 50 | SEI of 50 or above, Parental education (%): No parents with university degree | At least one parent with university degree, Books at home (%): 25 books or less | More than 25 books, Computer resources at home (%): Fewer than two computers | Two computers or more. Table E.2 has the following groups and sub-columns: Immigrant background (%): No immigrant background | Immigrant background, Test language (%): Students speaking another language at at home | Student speaking the survey language at home, Computer resources at home (%): Fewer than two computers | Two computers or more, Study of ICT-related subject (%): Not studying ICT-related subject | Studying ICT-related subject
Result:
ICILS 2018 International Report - Appendix E
Table E.1: Percentages of students in categories for dichotomous variables used in Chapters 3, 4, 5, and 6
Notes: Standard errors appear in parentheses. Because results are rounded to the nearest whole number, some totals may appear inconsistent. SEI = socioeconomic index.
† Met guidelines for sampling participation rates only after replacement schools were included.
†† Nearly met guidelines for sampling participation rates after replacement schools were included.
1 National defined population covers 90% to 95% of the national target population.
2 Country surveyed target grade in the first half of the school year.
ICILS 2018 International Report - Appendix E
Table E.2: Percentages of students in categories for dichotomous variables used in Chapters 3, 4, 5, and 6
| Country | Immigrant background (%) | | Test language (%) | | Computer resources at home (%) | | Study of ICT-related subject (%) | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | No immigrant background | Immigrant background | Students speaking another language at home | Student speaking the survey language at home | Fewer than two computers | Two computers or more | Not studying ICT-related subject | Studying ICT-related subject |
| Chile | 95 (1.2) | 5 (1.2) | 3 (0.3) | 98 (0.3) | 37 (1.4) | 63 (1.4) | 14 (2.1) | 86 (2.1) |
| Denmark † ¹ | 90 (0.8) | 10 (0.8) | 7 (0.6) | 93 (0.6) | 8 (0.6) | 92 (0.6) | 85 (1.0) | 15 (1.0) |
| Finland | 97 (0.4) | 3 (0.4) | 7 (0.7) | 93 (0.7) | 26 (1.1) | 74 (1.1) | 70 (1.5) | 30 (1.5) |
| France | 85 (1.1) | 15 (1.1) | 13 (0.9) | 87 (0.9) | 29 (1.0) | 72 (1.0) | 18 (1.0) | 82 (1.0) |
| Germany | 77 (1.1) | 23 (1.1) | 20 (0.9) | 80 (0.9) | 27 (1.1) | 73 (1.1) | 63 (2.4) | 37 (2.4) |
| Italy² | 90 (0.6) | 11 (0.6) | 21 (1.0) | 79 (1.0) | 42 (1.1) | 58 (1.1) | 10 (0.9) | 90 (0.9) |
| Kazakhstan¹ | 92 (0.8) | 9 (0.8) | 8 (1.0) | 93 (1.0) | 71 (1.1) | 29 (1.1) | 9 (0.8) | 91 (0.8) |
| Korea, Republic of | 100 (0.1) | 0 (0.1) | 1 (0.2) | 99 (0.2) | 39 (1.3) | 61 (1.3) | 73 (2.9) | 27 (2.9) |
| Luxembourg | 50 (0.8) | 50 (0.8) | 81 (0.6) | 20 (0.6) | 20 (0.5) | 80 (0.5) | 51 (0.5) | 49 (0.5) |
| Portugal †† ¹ | 93 (0.6) | 7 (0.6) | 4 (0.4) | 96 (0.4) | 33 (1.0) | 67 (1.0) | 12 (1.7) | 89 (1.7) |
| Uruguay | 98 (0.3) | 2 (0.3) | 3 (0.4) | 97 (0.4) | 45 (1.2) | 55 (1.2) | 2 (0.8) | 98 (0.8) |
| ICILS 2018 average | 88 (0.2) | 12 (0.2) | 15 (0.2) | 85 (0.2) | 34 (0.3) | 66 (0.3) | 37 (0.5) | 63 (0.5) |
| United States | 94 (0.5) | 6 (0.5) | 17 (1.2) | 83 (1.2) | 30 (0.8) | 70 (0.8) | 51 (1.3) | 49 (1.3) |
| Moscow (Russian Federation) | 90 (0.9) | 10 (0.9) | 4 (0.5) | 96 (0.5) | 29 (1.4) | 71 (1.4) | 2 (0.3) | 98 (0.3) |
| North Rhine-Westphalia (Germany) | 73 (1.8) | 27 (1.8) | 23 (1.8) | 77 (1.8) | 28 (1.3) | 72 (1.3) | 67 (2.2) | 33 (2.2) |
Notes: Standard errors appear in parentheses. Because results are rounded to the nearest whole number, some totals may appear inconsistent.
† Met guidelines for sampling participation rates only after replacement schools were included.
†† Nearly met guidelines for sampling participation rates after replacement schools were included.
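Since every cell has the form "percentage (standard error)" and each group is a dichotomous pair, a parsed row can be sanity-checked mechanically. The sketch below is only an illustration under my own assumptions: the helper names (`parse_cell`, `check_row`) are hypothetical, the two percentages of a pair are expected to sum to 100 within a rounding tolerance of 1, and both halves of a pair are assumed to share the same standard error (which holds for the rows shown here).

```python
import re

# Matches cells such as "95 (1.2)": a whole-number percentage followed by
# a standard error in parentheses.
CELL_PATTERN = re.compile(r"^(\d+)\s*\((\d+(?:\.\d+)?)\)$")


def parse_cell(text):
    """Split a cell like '95 (1.2)' into (percentage, standard_error)."""
    match = CELL_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"Unexpected cell format: {text!r}")
    return int(match.group(1)), float(match.group(2))


def check_row(cells, tolerance=1):
    """Return the indices of the column pairs that look wrong.

    A pair is flagged when its percentages do not sum to 100 within
    `tolerance` (rounding can make totals 99 or 101), or when the two
    standard errors differ (an assumption that holds for these tables).
    """
    problems = []
    for i in range(0, len(cells) - 1, 2):
        pct_a, se_a = parse_cell(cells[i])
        pct_b, se_b = parse_cell(cells[i + 1])
        if abs(pct_a + pct_b - 100) > tolerance or se_a != se_b:
            problems.append(i // 2)
    return problems


if __name__ == "__main__":
    chile = ["95 (1.2)", "5 (1.2)", "3 (0.3)", "98 (0.3)",
             "37 (1.4)", "63 (1.4)", "14 (2.1)", "86 (2.1)"]
    print(check_row(chile))
```

A row whose cells were shifted into the wrong columns during parsing would most likely fail this check, so it is a cheap way to spot structural errors without comparing against the PDF by hand.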
ChatGPT prompt:
The provided document is Appendix E from the ICILS 2018 international report. It contains tables. The columns after "Country" are groups with two sub-columns each. Table E.1 has the following groups and sub-columns: Gender (%): Males | Females, Parental occupation (%): SEI below 50 | SEI of 50 or above, Parental education (%): No parents with university degree | At least one parent with university degree, Books at home (%): 25 books or less | More than 25 books, Computer resources at home (%): Fewer than two computers | Two computers or more. Table E.2 has the following groups and sub-columns: Immigrant background (%): No immigrant background | Immigrant background, Test language (%): Students speaking another language at at home | Student speaking the survey language at home, Computer resources at home (%): Fewer than two computers | Two computers or more, Study of ICT-related subject (%): Not studying ICT-related subject | Studying ICT-related subject
Convert the tables to markdown tables.
Result:
Here are the converted markdown tables based on the information from Appendix E of the ICILS 2018 international report.
Table E.1: Percentages of students in categories for dichotomous variables used in Chapters 3, 4, 5, and 6
| Country | Gender (%) Males | Gender (%) Females | Parental Occupation (%) SEI Below 50 | Parental Occupation (%) SEI 50 or Above | Parental Education (%) No University Degree | Parental Education (%) University Degree | Books at Home (%) 25 or Less | Books at Home (%) More Than 25 | Computer Resources at Home (%) Fewer Than Two | Computer Resources at Home (%) Two or More |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Chile | 51 (1.7) | 49 (1.7) | 68 (1.7) | 32 (1.7) | 75 (1.4) | 26 (1.4) | 63 (1.8) | 37 (1.8) | 37 (1.4) | 63 (1.4) |
| Denmark | 51 (0.9) | 50 (0.9) | 37 (1.3) | 63 (1.3) | 59 (1.6) | 41 (1.6) | 28 (1.1) | 72 (1.1) | 8 (0.6) | 92 (0.6) |
| Finland | 51 (0.8) | 49 (0.8) | 52 (1.4) | 48 (1.4) | 45 (1.3) | 55 (1.3) | 27 (1.3) | 73 (1.3) | 26 (1.1) | 74 (1.1) |
| France | 50 (0.6) | 50 (0.6) | 50 (1.1) | 50 (1.1) | 69 (1.2) | 31 (1.2) | 41 (1.2) | 59 (1.2) | 29 (1.0) | 72 (1.0) |
| Germany | 52 (0.9) | 49 (0.9) | 54 (1.4) | 46 (1.4) | 70 (1.4) | 30 (1.4) | 24 (1.2) | 76 (1.2) | 27 (1.1) | 73 (1.1) |
| Italy | 52 (0.6) | 48 (0.6) | 59 (1.5) | 41 (1.5) | 69 (1.3) | 31 (1.3) | 31 (1.3) | 69 (1.3) | 42 (1.1) | 58 (1.1) |
| Kazakhstan | 52 (0.7) | 48 (0.7) | 51 (1.4) | 49 (1.4) | 65 (1.5) | 35 (1.5) | 55 (1.6) | 45 (1.6) | 71 (1.1) | 29 (1.1) |
| Korea, Republic of | 53 (0.9) | 47 (0.9) | 43 (1.5) | 57 (1.5) | 27 (1.4) | 73 (1.4) | 13 (0.8) | 88 (0.8) | 39 (1.3) | 61 (1.3) |
| Luxembourg | 53 (0.7) | 47 (0.7) | 58 (0.6) | 42 (0.6) | 56 (0.5) | 44 (0.5) | 28 (0.5) | 72 (0.5) | 20 (0.5) | 80 (0.5) |
| Portugal | 51 (1.0) | 49 (1.0) | 56 (1.5) | 44 (1.5) | 67 (1.6) | 33 (1.6) | 38 (1.2) | 62 (1.2) | 33 (1.0) | 67 (1.0) |
| Uruguay | 49 (1.0) | 51 (1.0) | 71 (1.5) | 30 (1.5) | 78 (1.6) | 23 (1.6) | 69 (1.5) | 31 (1.5) | 45 (1.2) | 55 (1.2) |
| United States | 51 (0.4) | 49 (0.4) | 39 (1.2) | 62 (1.2) | 52 (1.3) | 48 (1.3) | 37 (0.8) | 63 (0.8) | 30 (0.8) | 70 (0.8) |
Table E.2: Percentages of students in categories for dichotomous variables used in Chapters 3, 4, 5, and 6
| Country | Immigrant Background (%) No | Immigrant Background (%) Yes | Test Language (%) Other | Test Language (%) Survey | Computer Resources at Home (%) Fewer Than Two | Computer Resources at Home (%) Two or More | Study of ICT-Related Subject (%) No | Study of ICT-Related Subject (%) Yes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Chile | 95 (1.2) | 5 (1.2) | 3 (0.3) | 98 (0.3) | 37 (1.4) | 63 (1.4) | 14 (2.1) | 86 (2.1) |
| Denmark | 90 (0.8) | 10 (0.8) | 7 (0.6) | 93 (0.6) | 8 (0.6) | 92 (0.6) | 85 (1.0) | 15 (1.0) |
| Finland | 97 (0.4) | 3 (0.4) | 7 (0.7) | 93 (0.7) | 26 (1.1) | 74 (1.1) | 70 (1.5) | 30 (1.5) |
| France | 85 (1.1) | 15 (1.1) | 13 (0.9) | 87 (0.9) | 29 (1.0) | 72 (1.0) | 18 (1.0) | 82 (1.0) |
| Germany | 77 (1.1) | 23 (1.1) | 20 (0.9) | 80 (0.9) | 27 (1.1) | 73 (1.1) | 63 (2.4) | 37 (2.4) |
| Italy | 90 (0.6) | 11 (0.6) | 21 (1.0) | 79 (1.0) | 42 (1.1) | 58 (1.1) | 10 (0.9) | 90 (0.9) |
| Kazakhstan | 92 (0.8) | 9 (0.8) | 8 (1.0) | 93 (1.0) | 71 (1.1) | 29 (1.1) | 9 (0.8) | 91 (0.8) |
| Korea, Republic of | 100 (0.1) | 0 (0.1) | 1 (0.2) | 99 (0.2) | 39 (1.3) | 61 (1.3) | 73 (2.9) | 27 (2.9) |
| Luxembourg | 50 (0.8) | 50 (0.8) | 81 (0.6) | 20 (0.6) | 20 (0.5) | 80 (0.5) | 51 (0.5) | 49 (0.5) |
| Portugal | 93 (0.6) | 7 (0.6) | 4 (0.4) | 96 (0.4) | 33 (1.0) | 67 (1.0) | 12 (1.7) | 89 (1.7) |
| United States | 94 (0.5) | 6 (0.5) | 17 (1.2) | 83 (1.2) | 30 (0.8) | 70 (0.8) | 51 (1.3) | 49 (1.3) |
These tables are formatted to display the percentage of students within each sub-category along with the standard error in parentheses, following the structure provided in the ICILS 2018 Appendix E.