Draft proposal for short-term inflection enhancements for MF2 #10

Open
macchiati opened this issue Mar 13, 2024 · 5 comments


@macchiati
Member

macchiati commented Mar 13, 2024

Here is a draft proposal for information that we could use in the very short term with MF2, to improve grammar of messages. While we would target MF2, the information could be used more broadly.

A. Enhance grammatical feature information

We could use more gender / noun-class information in order to switch among appropriate variant messages. To do this well, we need to know what the grammatical categories are. The localization tooling can then expand or contract the variant messages to be appropriate for the locale, much as it does for plurals right now in MF1.

Sample message

.match {$person-nc}
animate-masculine {{{$person} needs it: give it to him.}}
feminine {{{$person} needs it: give it to her.}}
…

$user-gender

  1. For each locale, provide data for what the user-gender categories are (eg, for “you” or imperatives).
  2. We are only concerned with categories that grammatically affect the rest of a message.
  3. The fallback category is “other”, and we only need categories that would be distinct from the fallback.
  4. Some locales, like English, will have just {other}; others, like French, will have {feminine, other}; and others might have {masculine, feminine, other}.
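
As a rough illustration of points 3 and 4, here is a minimal Python sketch of how per-locale user-gender data with an "other" fallback might be consulted. The table layout and the names `USER_GENDER_CATEGORIES` and `resolve_user_gender` are assumptions made for this example; only the English and French category sets come from the list above.

```python
# Hypothetical per-locale data. Only the "en" and "fr" sets are taken from
# the proposal text; the dictionary layout itself is an assumption.
USER_GENDER_CATEGORIES = {
    "en": {"other"},
    "fr": {"feminine", "other"},
}


def resolve_user_gender(locale: str, gender: str) -> str:
    """Return the category to select on for this locale.

    A gender that the locale does not distinguish grammatically
    collapses to the fallback category "other".
    """
    # Locales with no data are treated as distinguishing nothing.
    categories = USER_GENDER_CATEGORIES.get(locale, {"other"})
    return gender if gender in categories else "other"
```

With this data, a feminine user selects the "feminine" variant in French, while every user falls through to "other" in English.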

$person-noun-class

  1. For each locale, provide data for what the person-noun-class categories are (eg, for “Pat Smith”).
  2. We are only concerned with categories that grammatically affect the rest of a message.
  3. Some locales, like Japanese, will have just {other}; others, like French, will have {feminine, other}; and others, like English, will have {masculine, feminine, other}.
  4. This can be more than gender, eg for Polish {animate-masculine feminine neuter}
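
To make the idea of tooling that expands or contracts variants concrete, here is a small Python sketch under stated assumptions. The function `fit_variants` and the table name are hypothetical, the category sets are the ones listed above for Japanese, French, and English, and the English "other" text in the usage note is made up for illustration.

```python
# Category sets from the list above; the table layout is an assumption.
PERSON_NOUN_CLASS = {
    "ja": ["other"],
    "fr": ["feminine", "other"],
    "en": ["masculine", "feminine", "other"],
}


def fit_variants(locale: str, variants: dict[str, str]) -> dict[str, str]:
    """Return variants keyed by exactly the categories the locale needs.

    Keys the locale does not distinguish are dropped (contraction).
    Keys the source message lacks are seeded from the "other" variant,
    so a translator has a starting point to edit (expansion).
    """
    fallback = variants["other"]
    return {cat: variants.get(cat, fallback) for cat in PERSON_NOUN_CLASS[locale]}
```

For example, passing a source message with "masculine", "feminine", and "other" variants and the locale "ja" would leave a single "other" variant for the translator, while "fr" would keep "feminine" and "other".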

$object-noun-class

  1. For each locale, provide data for what the object-noun-class categories are (eg, for “Paris”, or “basketball”).
  2. This can have different ‘scopes’ for types of objects (we currently have a scope for units), but the scopes should be locale-independent.
  3. We are only concerned with categories that grammatically affect the rest of a message.
  4. This can be more than gender, eg for Polish {inanimate-masculine feminine neuter}

B. Test data for gender detection

We could prepare test data for at least one locale where we can derive the gender of people or objects, and use them in messages with a new function :noun-class.

Sample message

.input {$person}
.local $person-nc = {$person :u:noun-class}
.match {$person-nc}
animate-masculine {{{$person} needs it: give it to him.}}
feminine {{{$person} needs it: give it to her.}}
…
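
The flow in this sample message (derive a noun class from test data, then match on it) could be prototyped roughly as follows. This is only a sketch: `NOUN_CLASS_TEST_DATA`, `noun_class`, and `format_message` are hypothetical names, the two Polish entries are hand-written illustrations rather than real test data, and the "other" variant stands in for the catch-all that the sample elides.

```python
# Hypothetical, hand-written test data for illustration only.
NOUN_CLASS_TEST_DATA = {
    "pl": {"Jan": "animate-masculine", "Anna": "feminine"},
}


def noun_class(locale: str, term: str) -> str:
    """Stand-in for the proposed :u:noun-class function: a table lookup."""
    return NOUN_CLASS_TEST_DATA.get(locale, {}).get(term, "other")


def format_message(locale: str, person: str, variants: dict[str, str]) -> str:
    """Select the variant keyed by the person's noun class, else "other"."""
    key = noun_class(locale, person)
    pattern = variants.get(key, variants["other"])
    return pattern.format(person=person)


VARIANTS = {
    "animate-masculine": "{person} needs it: give it to him.",
    "feminine": "{person} needs it: give it to her.",
    # Assumed catch-all; the sample message above leaves it elided.
    "other": "{person} needs it: give it to them.",
}
```

A term missing from the test data falls back to "other", which mirrors the fallback rule described in section A.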

C. Test data for case inflections

We could prepare test data for at least one locale where we can support case as an option:

Sample message

.input {$person}
{{Give it to {$person u:case=dative}.}}

(We do have case data for units in CLDR, but it would be better if we had some more general examples.)
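
A minimal Python sketch of what a case option could do with such test data follows. Everything here is an assumption for illustration: the names `CASE_TEST_DATA` and `inflect`, the two hand-written Polish dative entries, and the Polish template in the usage note; none of it is CLDR data.

```python
# Hypothetical, hand-written entries for illustration only (not CLDR data).
CASE_TEST_DATA = {
    "pl": {
        "Anna": {"dative": "Annie"},
        "Jan": {"dative": "Janowi"},
    },
}


def inflect(locale: str, term: str, case: str) -> str:
    """Return the inflected form of the term, or the term unchanged.

    Falling back to the uninflected term keeps the message usable when
    no data exists for the locale, the term, or the requested case.
    """
    return CASE_TEST_DATA.get(locale, {}).get(term, {}).get(case, term)
```

With a Polish pattern such as "Daj to {person}.", calling `inflect("pl", "Anna", "dative")` supplies "Annie" for the placeholder, while a name with no entry is inserted as-is.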

@BrunoCartoni

Principal categories that can be affected by Gender:

  • nouns
  • pronouns
  • adjectives
  • verbs

I'd also suggest limiting our scope to human gender (so only "you", "he/she", and "I" will be taken into account, not "it").

@macchiati
Member Author

macchiati commented Mar 13, 2024 via email

@BrunoCartoni

This example is debatable (both "si" and "es" are possible).

As always, we should prioritize according to use case (e.g., do we need subject pronoun reference? Or is it more object pronoun?).

@macchiati
Member Author

macchiati commented Mar 13, 2024 via email

@grhoten
Member

grhoten commented Mar 13, 2024

This topic around the choice of words and associated human gender seems related to #7.
