Translate source code without breaking it (too badly). Supports Chinese, Russian and really anything that isn't ASCII.

ip-rw/translate_code

Source code translator

Updated to include a GPT version with significantly better performance and accuracy than the original. You can see an example here: https://github.com/ip-rw/yakit_english/. This is a large, mature Electron GUI translated from Chinese into English without manual intervention.

A rough and ready way to translate source code into English. It extracts blocks of non-ASCII text, translates them, and writes everything back to the file. It expects a list of file paths to translate, either piped in or passed as arguments. It uses GoogleTranslator from the deep_translator library and works best through a pool of rotating proxies. There's no reason the other deep_translator backends wouldn't work; they're just untested.
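The extract, translate and write-back idea can be sketched roughly as below. This is not the code in main.py, just a minimal illustration under some assumptions: the names `NON_ASCII_RUN`, `translate_text` and `translate_file` are made up for this example, the file is assumed to be UTF-8, and `translate` is any callable that takes a string and returns its translation (for instance a thin wrapper around a deep_translator backend).

```python
import re
from pathlib import Path

# Hypothetical pattern: one or more non-ASCII characters, optionally joined
# by a single ASCII space or comma so that short phrases stay in one piece.
NON_ASCII_RUN = re.compile(r"[^\x00-\x7f]+(?:[ ,][^\x00-\x7f]+)*")


def translate_text(text, translate):
    """Replace every non-ASCII run in text with translate(run).

    Everything that is plain ASCII (keywords, identifiers, quotes, flags)
    is left untouched, which is what keeps the source from breaking.
    """
    cache = {}  # identical runs are only sent to the translator once

    def _swap(match):
        run = match.group(0)
        if run not in cache:
            cache[run] = translate(run)
        return cache[run]

    return NON_ASCII_RUN.sub(_swap, text)


def translate_file(path, translate, encoding="utf-8"):
    """Translate a file in place; return True if anything changed."""
    p = Path(path)
    # Work on bytes so line endings are written back exactly as found.
    original = p.read_bytes().decode(encoding)
    updated = translate_text(original, translate)
    if updated != original:
        p.write_bytes(updated.encode(encoding))
    return updated != original
```

With a stand-in translator such as `lambda s: {"域名": "Domain name"}.get(s, s)`, the line `Usage: "域名"` comes back as `Usage: "Domain name"`, with the quotes and the ASCII prefix unchanged.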

It's been tested with Russian and Chinese source and does as good a job as one could hope. YMMV, but it seems to be okay at not mangling files.

It batches things up and handles Google's antics as best it can. There's a fair bit of juggling, but it should go as quickly as it can without filling the files with nonsense.
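To give a feel for what batching with retries can look like, here is a small sketch. It is an assumption about the general shape, not the logic in main.py: `chunked`, `translate_all`, the 4000-character limit and the retry counts are all invented for illustration, and `translate_batch` is a hypothetical callable that takes a list of strings and returns a list of translations in the same order.

```python
import time


def chunked(items, max_chars=4000):
    """Group strings into batches whose combined length stays within max_chars."""
    batch, size = [], 0
    for item in items:
        if batch and size + len(item) > max_chars:
            yield batch
            batch, size = [], 0
        batch.append(item)
        size += len(item)
    if batch:
        yield batch


def translate_all(strings, translate_batch, retries=3, delay=1.0):
    """Return a dict mapping each unique string to its translation.

    A batch that keeps failing is mapped back to the original text, so a
    flaky backend leaves the file untranslated rather than full of nonsense.
    """
    unique = list(dict.fromkeys(strings))  # de-duplicate, keep order
    result = {}
    for batch in chunked(unique):
        for attempt in range(retries):
            try:
                translated = translate_batch(batch)
                if len(translated) != len(batch):
                    raise ValueError("batch came back with the wrong length")
                result.update(zip(batch, translated))
                break
            except Exception:
                # Back off a little longer on each failed attempt.
                time.sleep(delay * (attempt + 1))
        else:
            # Every attempt failed: keep the original strings.
            result.update((s, s) for s in batch)
    return result
```

The fallback to the original text is the design choice that matters here: a missed translation is easy to spot and rerun, while a half-translated or misaligned batch written into source files is not.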

I made it because the Chinese in particular release a lot of interesting code now, and unfortunately it's just squiggles to me.

Usage

I use it like this:

find ~/ksubdomain -type f |grep -v 'git\|svg'|  python3 main.py

If you want to supercharge things (and have a rotating proxy handy), use xargs, but beware of unescaped file paths (I avoid spaces):

find ~/ksubdomain -type f |grep -v 'git\|svg'|  xargs -n 30 -P5 python3 main.py 
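Both invocations above depend on the script taking paths either from stdin or as arguments. A minimal sketch of how that can be handled with argparse follows; `collect_paths` is a hypothetical helper, not a function from main.py, and the real script may do this differently.

```python
import argparse
import sys


def collect_paths(argv=None, stdin=None):
    """Return file paths from the command line, or from stdin if none are given."""
    parser = argparse.ArgumentParser(description="Translate non-ASCII text in source files")
    parser.add_argument("paths", nargs="*", help="files to translate")
    args = parser.parse_args(argv)
    if args.paths:
        return args.paths
    # No arguments: read one path per line, skipping blank lines.
    stream = sys.stdin if stdin is None else stdin
    return [line.rstrip("\n") for line in stream if line.strip()]
```

Reading one path per line from stdin copes with spaces in file names, whereas xargs splits its input on whitespace by default, which is where the warning about unescaped paths comes from.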

Prerequisites

You'll need Python 3 along with the following packages:

  • argparse (part of the Python 3 standard library, so no separate install is needed)
  • cypunct
  • charset-normalizer
  • deep_translator
  • thefuzz

Before:

user@flex:~/ksubdomain$ go run cmd/ksubdomain/*.go e
NAME:
   cmd enum - 枚举域名

USAGE:
   cmd enum [command options] [arguments...]

OPTIONS:
   --domain value, -d value        域名
   --band value, -b value          宽带的下行速度,可以5M,5K,5G (default: "2m")
   --resolvers value, -r value     dns服务器文件路径,一行一个dns地址,默认会使用内置dns
   --output value, -o value        输出文件名
   --silent                        使用后屏幕将仅输出域名 (default: false)
   --retry value                   重试次数,当为-1时将一直重试 (default: 3)
   --timeout value                 超时时间 (default: 6)
   --stdin                         接受stdin输入 (default: false)
   --only-domain, --od             只打印域名,不显示ip (default: false)
   --not-print, --np               不打印域名结果 (default: false)
   --dns-type value                dns类型 可以是a,aaaa,ns,cname,txt (default: "a")
   --domainList value, --dl value  从文件中指定域名
   --filename value, -f value      字典路径
   --skip-wild                     跳过泛解析域名 (default: false)
   --ns                            读取域名ns记录并加入到ns解析器中 (default: false)
   --level value, -l value         枚举几级域名,默认为2,二级域名 (default: 2)
   --level-dict value, --ld value  枚举多级域名的字典文件,当level大于2时候使用,不填则会默认
   --help, -h                      show help (default: false)

After:

user@flex:~/ksubdomain$ go run cmd/ksubdomain/*.go e
NAME:
   cmd enum - Enumerate the domain name

USAGE:
   cmd enum [command options] [arguments...]

OPTIONS:
   --domain value, -d value        Domain name
   --band value, -b value          broadband downlink speed, can be 5M, 5K, 5G (default: "2m")
   --resolvers value, -r value     dns server file path, one dns address per line, the default will use the built-in dns
   --output value, -o value        output file name
   --silent                        After using it, the screen will only output the domain name (default: false)
   --retry value                   retry times, when it is -1, (default: 3)
   --timeout value                 timeout Time (default: 6)
   --stdin                         accepts stdin input (default: false)
   --only-domain, --od             only prints the domain name, does not display the ip (default: false)
   --not-print, --np               does not print the domain name result (default: false)
   --dns-type value                dns type can be a, aaaa, ns, cname, txt (default: "a")
   --domainList value, --dl value  Specify the domain name
   --filename value, -f value      dictionary path
   --skip-wild                     skip pan-analysis domain name (default: false)
   --ns                            Read the ns record of the domain name and add it to the ns parser (default: false)
   --level value, -l value         Enumerate several levels of domain names, the default is 2, the second-level domain name (default: 2)
   --level-dict value, --ld value  Enumerate the dictionary file of multi-level domain names, used when the level is greater than 2, if not filled, it will default to
   --help, -h                      show help (default: false)
