Support ignoretext for the translator #160

Open
thejeff77 opened this issue Jan 5, 2020 · 6 comments

Comments

@thejeff77

Support strings to ignore in the translation.

Example: you may not want to translate certain substrings in a localization, such as:

Company Name
Person Name
\(swiftVar), %d, %@, etc.

This could also be accomplished by letting the developer hook into the translation step, e.g. with a callback that can mutate the string before it is translated and/or a callback that can mutate it after it is translated.
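A minimal sketch of the before/after callback idea, written in Python for brevity. Everything here is hypothetical: `translate_with_hooks`, the `Hook` type and `fake_translate` are illustrative names, not part of BartyCrouch or any translation API.

```python
from typing import Callable, Optional

# Hypothetical hook type: takes a string, returns a (possibly mutated) string.
Hook = Callable[[str], str]


def translate_with_hooks(
    text: str,
    translate: Callable[[str], str],
    before: Optional[Hook] = None,
    after: Optional[Hook] = None,
) -> str:
    """Run an optional hook before and after the translation call."""
    if before is not None:
        text = before(text)
    translated = translate(text)
    if after is not None:
        translated = after(translated)
    return translated


def fake_translate(text: str) -> str:
    """Stand-in for a real translation backend; it only upper-cases its input."""
    return text.upper()


# Hide a company name behind a token before translating, restore it afterwards.
result = translate_with_hooks(
    "Hello from Acme",
    fake_translate,
    before=lambda s: s.replace("Acme", "__0__"),
    after=lambda s: s.replace("__0__", "Acme"),
)
print(result)  # HELLO FROM Acme
```

With hooks like these, ignoring text would not need to be built into the translator itself; a project could supply its own masking logic.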

Thanks for any help!!!

@Jeehut
Member

Jeehut commented Jan 27, 2020

@thejeff77 I'm not sure if I understand your request correctly. Could you please give a more thorough example and also point out how this could be implemented? I don't quite understand your description yet.

To me it sounds like you're looking for something like a "Glossary" feature which could be implemented within your app and then %@ could be used where you need e.g. your company name in the translation strings, no?

@thejeff77
Author

So I guess a bit of background might help:

Currently I use an npm package called xlifftranslate (uses google translate api v3), and a custom script which exports all localizations, translates them then throws them back into the app.
https://github.com/jasonruesch/xlifftranslate

This works pretty well. That said, BartyCrouch is awesome and seems faster and more incremental, so I'd love to try it out, but it seems to be missing options that I use.

One issue I've had to solve for with this program is the translation of strings that have variable placeholders in them, or strings that should not be translated. Some examples might be:

%@, %d, %f, %1$d:%2$02d, %d:%02d.%01d, %2$@, %1$d, $(PRODUCT_NAME)

A lot of these are objc style placeholders, and swift placeholders matter too.

Or additional strings that you just don't want translated but you want to include them in the *.strings files too for reference (don't lose track).

When I translated strings containing many of the values mentioned above, they came back garbled. For instance, the string replacement done for %d:%02d.%01d needs that exact sequence to be in the string: it can't have added spaces, URL-encoded versions of any of its characters, etc.

So I guess I'd like to know more about how BartyCrouch's translate feature supports variable placeholders in auto-translated strings. Does the Microsoft API handle this well (i.e. no garbled placeholders returned)? Or does BartyCrouch handle this explicitly to guard against garbled placeholders in translated strings?

With the solution I have in place (xlifftranslate), I can use the --ignore-text " %d:%02d.%01d %d" option, so that a string such as "seconds: %d:%02d.%01d days: %d" translated into Spanish always comes back as "segundos: %d:%02d.%01d dias: %d" (i.e. the placeholders are intact).
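A rough sketch of how placeholders could be protected automatically rather than listed by hand, assuming a mask-then-restore approach. The regular expression is an approximation of printf-style format specifiers plus `$(NAME)` references, the `<x0/>` token format is arbitrary, and `fake_translate_es` is a stand-in for a real translation call; none of this is taken from xlifftranslate or BartyCrouch.

```python
import re

# Approximate pattern: "%%", printf-style specifiers such as %d, %02d, %1$d,
# %2$@, and build-setting references such as $(PRODUCT_NAME).
PLACEHOLDER = re.compile(
    r"%%"
    r"|%(?:\d+\$)?[-+0 #]*\d*(?:\.\d+)?(?:hh|h|ll|l|q|z|t|j|L)?"
    r"[@dDiuUxXoOfFeEgGaAcCsSp]"
    r"|\$\([A-Za-z_][A-Za-z0-9_]*\)"
)


def mask_placeholders(text):
    """Replace each placeholder with a numbered token; return text and originals."""
    found = []

    def repl(match):
        found.append(match.group(0))
        return f"<x{len(found) - 1}/>"

    return PLACEHOLDER.sub(repl, text), found


def unmask_placeholders(text, found):
    """Put the original placeholders back in place of the numbered tokens."""
    return re.sub(r"<x(\d+)/>", lambda m: found[int(m.group(1))], text)


def fake_translate_es(text):
    """Stand-in for a translation API; it only knows two words."""
    return text.replace("seconds", "segundos").replace("days", "dias")


source = "seconds: %d:%02d.%01d days: %d"
masked, found = mask_placeholders(source)
translated = unmask_placeholders(fake_translate_es(masked), found)
print(masked)      # seconds: <x0/>:<x1/>.<x2/> days: <x3/>
print(translated)  # segundos: %d:%02d.%01d dias: %d
```

The pattern is deliberately loose and has false positives: in "50% discount", for example, it would read `% d` as a specifier. A real implementation would also need to check that the service returns every token unchanged.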

If your product name contained dictionary words, but you didn't want to translate your product name, it would be nice to be able to specify that.

Ex: Ignore the string "Picture Palace"

"Picture Palace is the best" (en) -> "Picture Palace es la mejor" (es)

Currently I don't see a way to extend the translator that you configure with BartyCrouch, so that I could accomplish this by mutating the strings before and after translation. For Google Translate it's just a matter of keeping track of the ignore strings and replacing them with unique HTML tags, e.g. <_>
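One possible variant of the tag approach, sketched under an assumption that should be checked against the current API documentation: that the translation service is called in an HTML input mode and leaves elements marked `translate="no"` untouched. The helper names and the fake translator below are hypothetical.

```python
import html
import re


def wrap_terms(text, terms):
    """Escape the text as HTML and wrap each ignored term in a no-translate span."""
    # Escape first so literal &, < and > in the source are not read as markup.
    escaped = html.escape(text, quote=False)
    if not terms:
        return escaped
    # Longest terms first, so a longer term wins over a shorter one it contains.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        "|".join(re.escape(html.escape(term, quote=False)) for term in ordered)
    )
    return pattern.sub(
        lambda m: f'<span translate="no">{m.group(0)}</span>', escaped
    )


def unwrap_terms(text):
    """Strip the no-translate spans and undo the HTML escaping."""
    stripped = re.sub(r'<span translate="no">(.*?)</span>', r"\1", text)
    return html.unescape(stripped)


def fake_translate_es(text):
    """Stand-in for an HTML-aware translation API."""
    return text.replace(" is the best", " es la mejor")


wrapped = wrap_terms("Picture Palace is the best", ["Picture Palace"])
print(wrapped)
# <span translate="no">Picture Palace</span> is the best
print(unwrap_terms(fake_translate_es(wrapped)))
# Picture Palace es la mejor
```

Compared with opaque tokens such as <_>, this keeps the protected words visible to the translator, which may help it choose the surrounding grammar; whether a given API honors the attribute is the part to verify.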

Thank you for any further information that might help me understand better.

@Jeehut
Member

Jeehut commented Jan 30, 2020

Thank you for that thorough explanation, @thejeff77.

I understand what you want to achieve now; I didn't earlier, so you can ignore what I said before. Actually, the translation support in BartyCrouch isn't very elaborate yet: there's no special handling for those placeholders at all. But since the placeholders are all clearly documented, we could add a feature that automatically applies the --ignore-text function (or something similar) to all of them. Basic support should be easy, although some cases might be quite tricky to implement.

I think ignoring some (I will call them) "glossary" words that are fixed names or brands is a completely different feature, although related because we would use the same API for that. It's a different feature because it can't be automated and needs some way to specify the words to ignore, either project-wide or per-string. Project-wide, we could add an array to the configuration file, something like non-translatables. For per-string configuration, I think we could add a hint to the comment for localizers, something like #bartycrouch-non-translatables or the shorthand #bc-nt, followed by a Swift-like array literal of terms like this: #bc-nt["Picture Palace"].
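To make the per-string hint concrete, here is a small sketch of how such a comment could be parsed. The hint syntax is only the proposal from this comment, not an existing BartyCrouch feature, and the parser assumes the array contains plain double-quoted strings, so that it can be read as JSON.

```python
import json
import re

# Matches the proposed long form or the shorthand, followed by an array literal.
HINT = re.compile(r"#(?:bartycrouch-non-translatables|bc-nt)(\[[^\]]*\])")


def parse_non_translatables(comment):
    """Return the terms listed in any non-translatable hints found in a comment."""
    terms = []
    for match in HINT.finditer(comment):
        try:
            values = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # Skip malformed hints instead of failing the whole run.
        terms.extend(value for value in values if isinstance(value, str))
    return terms


print(parse_non_translatables('Shown on the home screen. #bc-nt["Picture Palace"]'))
# ['Picture Palace']
```

A term containing `]` would need a proper parser rather than this regular expression.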

Basically, those features could be added, which would make two different PRs. Also, we should consider adding Google Translation API support as well.

While all of this definitely makes sense to add to BartyCrouch, I don't think I will have much time to tackle it myself any time soon. But I would love to review a good PR if someone would take the time. I will probably focus on a new major version of BartyCrouch sometime after WWDC this year.

@thejeff77
Author

thejeff77 commented Jan 30, 2020

I dig it. I am busy too. Can we mark it as a feature or "good first feature"? Perhaps you, I, or someone else will pick it up, and we can see if there is any interest in this beyond me. It seems to me that there are a few features discussed:

Ignore string placeholders (feature):

  • Automatically exclude common string placeholders from translation; add a config option and turn it on by default. I.e. substrings that start with '%' (e.g. %@ in Objective-C style) and Swift interpolations such as \(varname).

Additional custom non-translatable substrings (feature):

  • Investigate the Microsoft API and learn how to exclude a string from translation, e.g. replace it with an HTML tag before translating and restore it after translation (Google API v2).
  • Create a config for ignore strings that applies to all translations (.bartycrouch.toml? a Swift BartyCrouch translation config file?); see the comment from @Jeehut above.

Add Google Translate (v3?):

  • Add Google Translate.
  • Bonus: use the bulk translation API.

@ownmas

ownmas commented Apr 25, 2020

I agree with @thejeff77 and @Jeehut!

It would be cool to add Google Translate support, because I think Google Translate is better than Microsoft Translate.

And there is one lifehack. It would be nice to add integration with Google Apps Script, because there is a free unlimited Google Translate. Here is an example article to understand what I mean.

@Jeehut
Member

Jeehut commented Apr 25, 2020

I've just added the "good first issue" label, although I think I would prefer it if a separate issue were created for each of the above-mentioned features. The title here isn't very discoverable. @ownmas Could you create a separate issue for the Google Translate feature so we can prioritize and discuss it separately?

And as I said, I would be happy to review a PR anytime. I just invested quite some time in fixing several bugs recently, and I don't plan to make any bigger changes myself before summer this year.
