Elisions not performed correctly by latin HexameterScanner and VerseScanner #1124

Open
bblumenfelder opened this issue Aug 28, 2021 · 1 comment

bblumenfelder commented Aug 28, 2021

Describe the bug
In two cases, elisions are not performed correctly:

  1. When a word ending in -em is followed by a word beginning with a vowel, the final -em is not elided.
  2. When an elision should take place before a word beginning with h, the h is treated as an ordinary consonant.

To Reproduce Bug 1
Steps to reproduce the behavior:

from cltk.prosody.lat.hexameter_scanner import HexameterScanner
hexameter_scanner = HexameterScanner()

verse1 = "quem quidem ego actutum (modo vos absistite) cogam"
print(hexameter_scanner.scan(verse1).working_line)
# Out:          quēm quidem eg  āctutūm  modo vos ābsīstite  cogam
# Should be:    quēm quid eg  āctutūm  modo vos ābsīstite  cogam

verse2 = "Qui potis est, inquis? Quod amantem iniuria talis"
print(hexameter_scanner.scan(verse2).working_line)
# Out:          Qui potis ēst  īnquīs  Quod amāntem īnjuria talis
# Should be:    Qui potis ēst  īnquīs  Quod amānt īnjuria talis

verse3 = "ille, datis vadibus qui rure extractus in urbem est"
print(hexameter_scanner.scan(verse3).working_line)
# Out:          īlle  datīs vadibūs qui rur  ēxtrāctus in ūrbem ēst
# Should be:    īlle  datīs vadibūs qui rur  ēxtrāctus in ūrb ēst

To Reproduce Bug 2

verse4 = "non potuisse, tuaque animam hanc effundere dextra"
print(hexameter_scanner.scan(verse4).working_line)
# Out:          nōn potuisse  tuaqu  animam hānc ēffūndere dēxtra
# Should be:    nōn potuisse  tuaqu  anim ānc ēffūndere dēxtra

verse5 = "perque hiemes aestusque et inaequalis autumnos"
print(hexameter_scanner.scan(verse5).working_line)
# Out:          pērque hiemes aestūsqu  et inaequalis autūmnos
# Should be:    pērqu iemes aestūsqu  et inaequalis autūmnos

verse6 = "monstrum horrendum, informe, ingens, cui lumen ademptum"
print(hexameter_scanner.scan(verse6).working_line)
# Out:          mōnstr   hōrrēnd    īnfōrm   īngēns  cui lumen adēmptum
# Should be:    mōnstr   ōrrēnd    īnfōrm   īngēns  cui lumen adēmptum

The same bugs occur with VerseScanner:

from cltk.prosody.lat.verse_scanner import VerseScanner
verse_scanner = VerseScanner()
print(verse_scanner.elide_all(verse1))
# Out:   quem quidem eg  actutum modo vos absistite cogam
print(verse_scanner.elide_all(verse2))
# Out:   Qui potis est inquis Quod amantem iniuria talis
print(verse_scanner.elide_all(verse3))
# Out:   ille datis vadibus qui rur  extractus in urbem est

Environment:

  • CLTK 1.0.17 in different environments

Additional context
Elision rules are specified here (unfortunately in German) and here
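
For reference, the two rules can be sketched in plain Python, independent of CLTK. This is only an illustration of the expected behavior, not CLTK's implementation: the names `elide_sketch`, `FINAL` and `INITIAL` are hypothetical, macrons are ignored, and the elided word is shortened while the following word (including its initial h) is left untouched. It does not handle final diphthongs, consonantal i/u at the start of a word (e.g. "iam"), or deliberate hiatus.

```python
import re

# A word-final vowel, optionally followed by m, is elidable.
# The lookbehind keeps one-letter words such as "e" or "a" intact.
FINAL = re.compile(r"(?<=.)[aeiouy]m?$", re.IGNORECASE)

# Elision is triggered by a following word that starts with a vowel,
# or with h plus a vowel (h does not block elision).
INITIAL = re.compile(r"^h?[aeiouy]", re.IGNORECASE)


def elide_sketch(line):
    """Return the words of `line` with elidable endings removed."""
    # Keep letters only, so punctuation and brackets are dropped.
    words = re.findall(r"[^\W\d_]+", line)
    result = []
    # Pair every word with its successor; the last word gets "".
    for current, following in zip(words, words[1:] + [""]):
        if FINAL.search(current) and INITIAL.match(following):
            current = FINAL.sub("", current)
        result.append(current)
    return result


print(" ".join(elide_sketch("quem quidem ego actutum (modo vos absistite) cogam")))
print(" ".join(elide_sketch("non potuisse, tuaque animam hanc effundere dextra")))
```

Under these assumptions, the first line comes out with "quid eg" and the second with "tuaqu anim", which matches the elisions expected above apart from the retained h.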

@todd-cook todd-cook self-assigned this Sep 1, 2021
@kylepjohnson (Member)

@bblumenfelder thanks for opening this, and apologies that I did not reply sooner.

@todd-cook do you have an idea of the effort necessary to fix this? If it is too much, then let's think about a way to track this long-term.

Benedikt, if you would like to help out with some development work, please reach out. Even just finding bugs like this is very helpful.
