moj-analytical-services / splink
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends.
Topics: data-science, spark, record-linkage, entity-resolution, fuzzy-matching, deduplication, em-algorithm, data-matching, deduplicate-data, duckdb, uk-gov-data-science
Language: Python. Stars: 1.1k. Updated May 17, 2024.
bmiller1009 / deduper
General deduping engine for JDBC sources with output to JDBC/CSV targets.
Topics: data-deduplication, deduplication, deduplicate, deduplicate-data
Language: Kotlin. Stars: 5. Updated Dec 21, 2020.
gochore / uniq
Sort and deduplicate data.
Topics: go, golang, sort, uniq, deduplicate-data
Language: Go. Stars: 0. Updated Feb 3, 2021.