Order languages in one list #924

Open
wants to merge 3 commits into
base: master
Conversation

tobias-klein
Member

This fixes #923.

tobias-klein requested a review from zhuiks on May 22, 2023, 20:21
@tobias-klein
Member Author

@zhuiks Do you think this change makes sense? Was there a particular reason to have the languages grouped in the categories of iso6391, iso6392T, iso6393?

@zhuiks
Collaborator

zhuiks commented May 24, 2023

I'll take a look today

@zhuiks
Collaborator

zhuiks commented May 25, 2023

> @zhuiks Do you think this change makes sense? Was there a particular reason to have the languages grouped in the categories of iso6391, iso6392T, iso6393?

@tobias-klein From what I can recall, plus a little digging in the code, the different ISO standards come with different levels of language detail and of language/region name localization. If you switch the locale to one that does not use Latin characters (e.g. Ukrainian or Russian), the last ISO group, or maybe the last two, would not have localized names.

Also, the ISO groups roughly reflect the number of speakers (more and less widely used languages). As far as I know, speakers of most iso6393 languages that have no representation in iso6391 or iso6392T are bilingual or trilingual in languages from the more "common" groups.
It's much easier to find a language in a smaller group than in one big "merged" group. (E.g. compare the UX of scrolling down to Tatar, Turkish, or Uzbek in the merged group versus in groups divided by number of speakers.)

I hope what I wrote makes sense. Let me know what you think.

@tobias-klein
Member Author

@zhuiks I understand your point. However, this is not self-explanatory to the user, is it? When you scroll through and see that the sorting suddenly starts again and then again, you don't know why it is like that and what each block of languages entails. If we keep it like this, we should either add an explanation above the language list or maybe additional headlines over each block, or maybe both?!

@zhuiks
Collaborator

zhuiks commented May 31, 2023

@tobias-klein yes, I agree! I just didn't know how to phrase it.
This article seems to give good background on the different ISO 639 standards:

> ISO 639-1 ...There are 183 two-letter codes registered as of June 2021. The registered codes cover the world's major languages

So maybe instead of iso6391-languages we can use common-languages with the heading "Common Languages".

> The ISO 639-2 firstly includes identifiers for languages from the 639-1 and additional languages which have a relevant amount of literature. This list provides identifiers for language families, and so covers almost all languages of the world...

Instead of iso6392T-languages, use less-common-languages with the heading "Less Common Languages".
As I've stated above, language names from these two groups can be translated to the locales we have, based on the Internationalization API (which I think derives its data from the commonly used Unicode CLDR project). I think the total number of languages in these groups should be around 450.
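
To illustrate the localization point, here is a minimal sketch using the standard `Intl.DisplayNames` API. The helper name `localizedLanguageName` is hypothetical and not taken from the code base; which codes actually have a localized name depends on the CLDR data shipped with the runtime.

```javascript
// Hypothetical helper (not from the code base): returns the localized name
// of a language code, or null when the runtime has no name for it.
function localizedLanguageName(code, locale) {
  try {
    const names = new Intl.DisplayNames([locale], {
      type: 'language',
      fallback: 'none', // yield undefined instead of echoing the code back
    });
    return names.of(code) ?? null;
  } catch (error) {
    // Intl.DisplayNames throws a RangeError for structurally invalid
    // language codes or locales, e.g. some non-standard codes.
    if (error instanceof RangeError) return null;
    throw error;
  }
}

console.log(localizedLanguageName('de', 'en')); // name of German in English
console.log(localizedLanguageName('de', 'uk')); // name of German in Ukrainian
console.log(localizedLanguageName('not a code', 'en')); // invalid code
```

With `fallback: 'none'`, a code without localized data comes back as `null` here, which would be one way to detect the entries that cannot be shown in the user's locale.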

> The ISO 639-3 list covers individual languages from ISO 639-2, but includes also extinct, ancient, historic, and constructed languages (conlangs). This language code set recognizes the lesser-known languages.

Instead of iso6393-languages, use rear-languages with the heading "Rear Languages". This group has close to 2K languages.

unknown-languages are "Non-Standard Language Codes". These codes do not have an equivalent in the ISO 639-3 standard.
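
The grouping proposed above could be sketched roughly as follows. This is only an illustration under assumptions: the entry shape (`name`, `iso6391`, `iso6392T`, `iso6393`) and the function `groupLanguages` are hypothetical and may not match what languages.json actually contains, and the third heading uses the "Rare Languages" spelling.

```javascript
// Hypothetical entry shape: { name, iso6391?, iso6392T?, iso6393? }.
// The headings follow the proposal in this comment.
const GROUPS = [
  { key: 'iso6391', heading: 'Common Languages' },
  { key: 'iso6392T', heading: 'Less Common Languages' },
  { key: 'iso6393', heading: 'Rare Languages' },
];
const FALLBACK_HEADING = 'Non-Standard Language Codes';

function groupLanguages(entries) {
  // A Map keeps insertion order, so the groups stay in the order above.
  const buckets = new Map();
  for (const { heading } of GROUPS) buckets.set(heading, []);
  buckets.set(FALLBACK_HEADING, []);

  for (const entry of entries) {
    // The first matching standard wins, so each language appears only once.
    const group = GROUPS.find(({ key }) => Boolean(entry[key]));
    buckets.get(group ? group.heading : FALLBACK_HEADING).push(entry);
  }

  // Drop empty groups and sort each remaining group by name.
  return [...buckets]
    .filter(([, languages]) => languages.length > 0)
    .map(([heading, languages]) => ({
      heading,
      languages: languages.sort((a, b) => a.name.localeCompare(b.name)),
    }));
}
```

Each returned item carries its heading, so the UI could render a headline above each block and restart the alphabetical order visibly per group.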

The language data is taken from languages.json, a compilation of all SWORD locales with extra info from the Internationalization API and ISO 639-3, auto-generated by sword_locales_to_json.js.

I can put together a PR with these proposed headings above each language group and add a paragraph with a bit of explanation.

I don't mind making fewer groups. I just think that keeping the ~450 common languages separate from the ~2000 rear languages is better UX.

@tobias-klein
Member Author

@zhuiks Thanks for your feedback. I understand the reasoning behind limiting the number of entries in the groups shown. Using an approach that enhances the usability is good. We should definitely make the logic for the language groups transparent to the user, though.

However, before moving forward with implementation it may be interesting to put this up for discussion in the SWORD mailing list and get some feedback there about the idea with the groups and the headers. What do you think?

When you say "Rear languages" do you actually mean "Rare languages"?

@zhuiks
Collaborator

zhuiks commented Jun 10, 2023

> However, before moving forward with implementation it may be interesting to put this up for discussion in the SWORD mailing list and get some feedback there about the idea with the groups and the headers. What do you think?

@tobias-klein sounds like a great idea. Do you mind communicating that?

> When you say "Rear languages" do you actually mean "Rare languages"?

😁 yes!

Successfully merging this pull request may close these issues.

Long list of 'other' languages is rendered in several groups