Another root error with Collatinus.Decliner #1221
Comments
I've been trying to work out exactly how the decliner works (and how it works in Collatinus itself), and I may have got this wrong, but it seems to me that if the lemma entry contains a value for geninf, this should be assigned to root_id 1. I don't know quite what is happening in lines 126-129 of lat.py:
but it looks as if we end up with multiple options for various root_ids (e.g. original_roots[1] = "parent, parei"). Replacing the above lines with the following seems to fix this, though I don't know whether it breaks something else (note that the first line is at the same indent level as 'original_roots.update(returned_roots)' in lat.py):
However, there also seems to be a problem in the collected.json data: under the model for 'infans', the ending for the neuter singular nominative, vocative, and accusative (pos 37, 38, 39) is given as root ID 1 with ending "-ns". The ending should be the same as for the masc./fem. nom./voc. singular, correctly given at pos 13/14 and 25/26 as root ID 4 with no ending (i.e. the canonical form). As a result, the output for the neuter form is either *parentns or *pareins instead of the expected parens.
This error seems to be in Collatinus itself: in the modeles.la file, the entry for infans reads:
In Collatinus, des 37-39 refer to the neut. sing. nom./voc./acc.: either the root ID should be 0 (remove two letters, then add ns) or 37-39 should be 4:K (I think).
EDIT: in the most up-to-date branch of Collatinus (the Medieval one), this error with infans has been corrected.
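To make the neuter problem concrete, here is a minimal sketch that does not call CLTK. The `ROOTS` dictionary and the `build` helper are hypothetical stand-ins for illustration only; root 0 is taken to be the lemma minus two letters and root 4 the canonical form, as described in this comment.

```python
# Hypothetical stand-in for the roots computed for 'parens' (not CLTK's internals).
ROOTS = {
    0: ["pare"],             # lemma minus two letters (assumed reading of root 0)
    1: ["parei", "parent"],  # the two competing roots reported in this issue
    4: ["parens"],           # canonical form (rule "K")
}


def build(root_id, ending):
    """Attach one ending to every root stored under root_id."""
    return [root + ending for root in ROOTS[root_id]]


# Current data for pos 37-39: root ID 1 + "ns" gives the non-existent forms.
print(build(1, "ns"))

# Either suggested correction yields the expected form.
print(build(0, "ns"))  # root 0 + "ns"
print(build(4, ""))    # root 4 with no ending
```

Both corrections give the same single form, so the choice between them only matters for how the model data is expressed.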
Collatinus.Decliner produces incorrect results with words such as 'omnis' and 'parens'. This is a separate problem to the one reported at #1127 (problems with 'puer'), and is not solved by the changes to lat.py recommended there.
Python version: 3.9.13
CLTK version: CLTK 1.1.6
Windows 10
from cltk.morphology.lat import CollatinusDecliner
decliner = CollatinusDecliner()
print(decliner.decline("omnis", False, False))
We see the following erroneous output:
[('omnis', '--s---mn-'), ('omnis', '--s---mv-'), ('omnem', '--s---ma-'), ('omniem', '--s---ma-'), ('omnis', '--s---mg-'), ('omniis', '--s---mg-') etc.]
There is no such form as 'omniem' or 'omniis'. CollatinusDecliner has created both "omn-" and "omni-" as roots for the same root_id.
We should expect to see (and Collatinus gets this right):
[('omnis', '--s---mn-'), ('omnis', '--s---mv-'), ('omnem', '--s---ma-'), ('omnis', '--s---mg-'), etc.]
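A small helper like the following could be used to pick out the spurious forms automatically. It is only a sketch: `spurious_forms` is a hypothetical name, and the two lists contain just the pairs quoted above, not the full decliner output.

```python
# Pairs copied from the observed and expected output quoted in this issue.
observed = [
    ("omnis", "--s---mn-"), ("omnis", "--s---mv-"),
    ("omnem", "--s---ma-"), ("omniem", "--s---ma-"),
    ("omnis", "--s---mg-"), ("omniis", "--s---mg-"),
]
expected = [
    ("omnis", "--s---mn-"), ("omnis", "--s---mv-"),
    ("omnem", "--s---ma-"), ("omnis", "--s---mg-"),
]


def spurious_forms(observed, expected):
    """Return the (form, tag) pairs that appear in observed but not in expected."""
    expected_set = set(expected)
    return [pair for pair in observed if pair not in expected_set]


print(spurious_forms(observed, expected))
```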
from cltk.morphology.lat import CollatinusDecliner
decliner = CollatinusDecliner()
print(decliner.decline("parens", False, False))
We see the following output:
[('parens', '--s---mn-'), ('parens', '--s---mv-'), ('pareiem', '--s---ma-'), ('parentem', '--s---ma-'), ('pareiis', '--s---mg-'), ('parentis', '--s---mg-')...
There are no forms built on 'parei-'.
We would expect to see:
[('parens', '--s---mn-'), ('parens', '--s---mv-'), ('parentem', '--s---ma-'), ('parentis', '--s---mg-')
Once again, CollatinusDecliner has created both "parent-" and "parei-" as roots.
Certainly in the case of 'parens' there seems to be a problem in the "cltk_data\lat\model\lat_models_cltk\lemmata\collatinus\collected.json" file. The model for 'parens' is given as 'infans', and in the models section for 'infans' we see the following root info:
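The doubled output is what one would expect if every root stored under a root_id is combined with every ending. The sketch below illustrates that idea only; `build_forms` and the data are hypothetical and are not taken from lat.py.

```python
def build_forms(roots, endings):
    """Combine each (ending, tag) pair with every root held for one root_id."""
    forms = []
    for ending, tag in endings:
        for root in roots:
            forms.append((root + ending, tag))
    return forms


# Two of the endings seen in the output above (masc. sing. acc. and gen.).
ENDINGS = [("em", "--s---ma-"), ("is", "--s---mg-")]

# Two roots under the same root_id reproduce the erroneous pairs.
print(build_forms(["parei", "parent"], ENDINGS))

# A single root gives the expected forms only.
print(build_forms(["parent"], ENDINGS))
```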
"infans": {"R": {"0": ["2", ""], "1": ["2", "i"], "2": ["2", "issim"], "4": ["K", null], "5": ["2", "i"]}
This cannot be right, as there is no circumstance in which we would remove two letters and replace them with an 'i' (which is what has happened here).
Oddly we find the same root info for 'fortis' which is the model for 'omnis' (and 'fortis' also declines incorrectly):
"fortis": {"R": {"0": ["2", ""], "1": ["2", "i"], "2": ["2", "issim"], "4": ["K", null], "5": ["2", "i"]}
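For illustration, the root rules can be applied by hand. This sketch assumes the reading given above, namely that `["2", "i"]` means "remove two letters, then append 'i'" and that `"K"` keeps the canonical form; `apply_root_rule` is a hypothetical helper, not a CLTK function.

```python
from typing import Optional, Sequence

# Root rules as quoted from collected.json (identical for 'infans' and 'fortis').
ROOT_RULES = {
    "0": ["2", ""],
    "1": ["2", "i"],
    "2": ["2", "issim"],
    "4": ["K", None],
    "5": ["2", "i"],
}


def apply_root_rule(lemma: str, rule: Sequence[Optional[str]]) -> str:
    """Apply one [strip, suffix] rule to a lemma (assumed semantics, see above)."""
    strip, suffix = rule
    if strip == "K":
        # "K" is taken to mean: keep the canonical form unchanged.
        return lemma
    count = int(strip)
    base = lemma[:-count] if count else lemma
    return base + (suffix or "")


for lemma in ("parens", "omnis"):
    roots = {rid: apply_root_rule(lemma, rule) for rid, rule in ROOT_RULES.items()}
    print(lemma, roots)
```

Under these assumptions, root 1 comes out as "parei" for 'parens' and "omni" for 'omnis', which matches the unexpected roots seen in the output.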