Data Corruptors a la GeCO #175
I've been developing some data corruption algorithms, inspired by the documentation at https://dmm.anu.edu.au/geco/flex-data-gen-manual.pdf but written without looking at the source code, since it has an unusual license. I wonder whether your excellent project would be interested in some pull requests that add Python implementations to your `recordlinkage.datasets` submodule.

I'm imagining methods such as `corrupt.ocr_noise(s: str) -> str`. If this sounds of interest, I can put together a PR or use this ticket to discuss the design further. And if this is beyond the scope of what you want for your module, I understand!
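To make the proposal concrete, here is a minimal sketch of what a function with that signature might look like. Everything beyond `ocr_noise(s: str) -> str` is an assumption made for illustration: the confusion table, the `p` substitution probability, and the `rng` parameter are hypothetical and are not taken from GeCo or from the existing `recordlinkage` API.

```python
import random
from typing import Optional

# Hypothetical OCR confusion pairs, chosen for illustration only;
# they are not taken from GeCo or from recordlinkage.
OCR_CONFUSIONS = {
    "0": "O", "O": "0",
    "1": "l", "l": "1",
    "5": "S", "S": "5",
    "m": "rn", "rn": "m",
    "w": "vv", "vv": "w",
}


def ocr_noise(s: str, p: float = 0.1, rng: Optional[random.Random] = None) -> str:
    """Return a copy of `s` with OCR-style character confusions.

    Each confusable chunk is replaced with probability `p` (assumed
    parameter). Pass a seeded `random.Random` for reproducible output.
    """
    rng = rng or random.Random()
    out = []
    i = 0
    while i < len(s):
        # Try two-character patterns (e.g. "rn") before single characters.
        for width in (2, 1):
            chunk = s[i:i + width]
            if (
                len(chunk) == width
                and chunk in OCR_CONFUSIONS
                and rng.random() < p
            ):
                out.append(OCR_CONFUSIONS[chunk])
                i += width
                break
        else:
            # No substitution applied at this position: copy the character.
            out.append(s[i])
            i += 1
    return "".join(out)
```

With `p=0.0` the input comes back unchanged, and with `p=1.0` every confusable chunk is swapped, so `ocr_noise("m0rn", p=1.0)` returns `"rnOm"`. Taking an explicit `rng` rather than relying on the global random state is one possible design choice; it would let generated datasets be reproduced from a seed.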