Added Faroese language #1914
Looks good. Things to do before approval:
1a10e46 to 7c48e88
Also: include the language in the Windows build files in src/windows as well.
Thank you, Juho, for the response. Regarding the inclusion of the language in the Windows build files in src/windows: is there a guide or documentation for this part? I assume I need to add my language to the src/windows/installer/Product.wxs file. For the Guid value, should I generate a new GUID myself, or should I reuse an existing value?

Additionally, I have added some new phonemes and the pronunciation of around 225,000 words in Faroese. Can I include that update in this pull request, or would you prefer a separate pull request for these additions?

I haven't used git extensively, so I'm still getting familiar with the workflow and processes in the open-source environment. Any guidance you can provide would be greatly appreciated. Thank you again for your help.
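On the GUID question, one possible approach is to generate a fresh random GUID for the new entry. This assumes the installer expects a unique value per entry, which this thread does not confirm; copying the pattern used by the existing languages remains the reference. A minimal Python sketch, where `new_guid` is a hypothetical helper name:

```python
import uuid


def new_guid() -> str:
    """Return a random (version 4) GUID as an upper-case, hyphenated string.

    Assumption: the installer file accepts GUIDs in this textual form;
    compare with the existing entries before using the value.
    """
    return str(uuid.uuid4()).upper()


if __name__ == "__main__":
    # Prints a different value on every run.
    print(new_guid())
```

Each call returns a new value, so the result should be generated once and then committed, not regenerated on every build.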
adf2f00 to b20b710
I believe I have completed the requested steps to align the branch with the base branch, and I have adjusted the changelog.
Now that the PR is ready for acceptance I ran the pipeline. The following tests FAILED:

13c22fcd8aa140bd22e3299fdcc75b5b2c2308ca != 25a10409481c8874d4c0b9c46a70e185d0b5f40f

All projects have their own processes and culture. The important part is to use the commit messages and PR description to explain what the commits do and why. You have done it well.

I think we don't have documentation for the Windows installer. Just copying what other languages do should be ok.

As for the large word list, do you think it's necessary? Check what other languages have done; most have multiple rules and fewer word exceptions. On the other hand, Russian, for example, needs a large dictionary because of the way word stress is handled in Russian. I don't know how Faroese works. It might be easier to have general rules for 99% of the words and then just fix the exceptions.
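The "general rules plus a short exception list" idea can be illustrated with a small sketch. This is a toy model only: the function name `pronounce`, the rule table, and the phoneme symbols are all hypothetical, and it reflects neither espeak-ng's actual rule syntax nor real Faroese phonology.

```python
def pronounce(word, exceptions, rules):
    """Return a phoneme string for `word`.

    The exception list is consulted first; otherwise letter-group rules
    are applied left to right, preferring the longest match.
    """
    key = word.lower()
    if key in exceptions:
        return exceptions[key]

    phonemes = []
    i = 0
    while i < len(key):
        # Try the longest letter group starting at position i first.
        for length in range(len(key) - i, 0, -1):
            chunk = key[i:i + length]
            if chunk in rules:
                phonemes.append(rules[chunk])
                i += length
                break
        else:
            # No rule matched: pass the letter through unchanged.
            phonemes.append(key[i])
            i += 1
    return "".join(phonemes)


if __name__ == "__main__":
    toy_rules = {"sh": "S"}            # hypothetical rule
    toy_exceptions = {"abc": "XYZ"}    # hypothetical exception entry
    print(pronounce("shab", toy_exceptions, toy_rules))
    print(pronounce("ABC", toy_exceptions, toy_rules))
```

With this split, a large dictionary becomes necessary only where the rules cannot predict the pronunciation, which is the trade-off raised above for Russian word stress.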
Thank you Juho, I forgot about the language-phonemes.test now that the pronunciations have been improved. I will change the expected value to match the new output.

Regarding the phoneme dictionary: it was prepared by professional linguists over a 3½-year period from 2019 to 2022 (http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.495.pdf), so I assumed it would be of great value to add it to the fo_list, as it didn't seem to impact the performance of espeak-ng. The Faroese language does have grammar rules, and many of those rules have exceptions. Since I'm not a linguist, I tried to make the rules match as well as I could, but with the phoneme dictionary the speech improved considerably.

On the other hand, I agree that it would be very good to have rules that work for 99% of the words and a list for the exceptions. I think that should also be the aim. Meanwhile, as I work on those improvements, I was hoping that this version would pass, as Faroese is somewhat neglected in the TTS world, and implementing a good voice in espeak would open many opportunities, instead of training TTS voices with Norwegian or Icelandic as the base for Faroese :)

Regarding the language-phonemes.test, this should just be a change of the checksum in the .test file, since with the dictionary and the added phonemes the produced phonemes no longer give the same checksum, right?
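To illustrate why the stored checksum has to change, here is a minimal Python sketch. It assumes the test compares a SHA-1 digest of the generated phoneme output against a stored value; the two 40-hex-digit values in the failure message are consistent with SHA-1, but the thread does not state the algorithm. The function name `phoneme_checksum` and the sample strings are placeholders.

```python
import hashlib


def phoneme_checksum(phoneme_output: str) -> str:
    """Return the SHA-1 hex digest of the phoneme output text (assumed algorithm)."""
    return hashlib.sha1(phoneme_output.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Placeholder strings standing in for the output before and after
    # the pronunciation changes; they are not real espeak-ng output.
    old_output = "placeholder phonemes, old rules\n"
    new_output = "placeholder phonemes, new rules\n"

    expected = phoneme_checksum(old_output)   # value stored in the .test file
    actual = phoneme_checksum(new_output)     # value computed after the change

    if expected != actual:
        print(f"{expected} != {actual}")
```

Any change to the produced phonemes, however small, yields a different digest, so updating the stored value is expected once the new output has been reviewed and judged correct.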
I added initial support for the Faroese language. I implemented rules for the most common phonetic patterns in Faroese and made it easier to improve the phonemes, rules, and exceptions. The current status is early testing, but the voice is mostly understandable at this point and now pronounces numbers by the correct rules.