Include glossed examples in CLDF dataset #7

Open
xrotwang opened this issue Mar 4, 2020 · 7 comments

Comments

@xrotwang
Member

xrotwang commented Mar 4, 2020

The paper contains about 35 glossed examples. It would be nice to scrape them and add them to the dataset as an ExampleTable.

@Wu-Urbanek

Something like this?
[Image: Screenshot 2020-03-04 at 10 07 02]

@Wu-Urbanek

Or this? If you mean this one, I can try to convert the PDF to Word or Excel and then extract the examples from that.
[Image: Screenshot 2020-03-04 at 10 19 20]

@xrotwang
Member Author

xrotwang commented Mar 4, 2020

The latter, i.e. the IGT examples. I played a bit with copy & paste from PDF to txt, and that worked sort of OK, particularly considering the small number of examples. But then again, since there are so few examples, it may not be worth it.

Regarding the frequency table above, that's what I was talking about in #6.
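
A minimal sketch of the copy-and-paste route mentioned above, assuming the pasted text has been cleaned up by hand so that each example is a block of three lines (analyzed words, glosses, free translation) separated by a blank line. The function names, the `ex…` ID scheme, the `lang1` language ID and the sample text are all hypothetical and not taken from the paper; only the column names follow the CLDF ExampleTable component.

```python
import csv
import io
import re

# Column names follow the CLDF ExampleTable component.
COLUMNS = ["ID", "Language_ID", "Primary_Text", "Analyzed_Word", "Gloss", "Translated_Text"]

# Invented sample, NOT from the paper. The second block is deliberately broken.
SAMPLE = """
puella-m vide-o
girl-ACC see-1SG
'I see the girl.'

canis curr-it
dog
'The dog runs.'
"""


def parse_igt(text, language_id):
    """Turn blank-line-separated three-line blocks into ExampleTable-style rows."""
    rows = []
    blocks = [b for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    for number, block in enumerate(blocks, start=1):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if len(lines) != 3:
            continue  # not a clean three-line example; fix by hand
        words, glosses = lines[0].split(), lines[1].split()
        if len(words) != len(glosses):
            continue  # words and glosses are misaligned; fix by hand
        rows.append({
            "ID": f"ex{number}",  # hypothetical ID scheme: position in the pasted text
            "Language_ID": language_id,
            "Primary_Text": lines[0].replace("-", ""),  # drop morpheme boundaries
            "Analyzed_Word": words,
            "Gloss": glosses,
            "Translated_Text": lines[2].strip("'\"‘’“”"),
        })
    return rows


def to_csv(rows):
    """Serialize rows as CSV, joining the list-valued columns with tabs."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        flat = dict(row)
        # Assumption: tab is the separator declared for these columns in the metadata.
        flat["Analyzed_Word"] = "\t".join(row["Analyzed_Word"])
        flat["Gloss"] = "\t".join(row["Gloss"])
        writer.writerow(flat)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(parse_igt(SAMPLE, "lang1")))
```

Blocks that do not line up are skipped rather than guessed at, which seems reasonable for roughly 35 examples where manual fixes are cheap.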

@Wu-Urbanek

I just converted it to XML. Maybe one can extract the examples from the tag?

@xrotwang
Member Author

xrotwang commented Mar 4, 2020

Don't know. Can you send me the XML?

@Wu-Urbanek

I made a pull request :) Please have a look.

@xrotwang
Member Author

xrotwang commented Mar 4, 2020

Never mind. Found it :)
