Repository for the Data-Centric AI Competition
Code for a top 5% finish in the Data-Centric AI Competition organized by Andrew Ng and DeepLearning.AI
Customer churn training and prediction library with automatic dataset size optimisation features.
A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀
An Empirical Study of Memorization in NLP (ACL 2022)
Jupyter book showing how to build an ML powered book genre classifier
Find illustrations in historic books using computer vision
📕 flyswot book on developing a pragmatic machine learning workflow in a library setting
Lab assignments for Introduction to Data-Centric AI, MIT IAP 2023 👩🏽‍💻
nbsynthetic is a simple and robust tabular synthetic data generation library for small and medium-sized datasets
Data-SUITE: Data-centric identification of in-distribution incongruous examples (ICML 2022)
Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022)
Data-Centric AI Competition 2023, organised by Cleanlab and MachineHack. This is one of the solutions I tried; it achieved 13th rank.
Implementation of data typology for imbalanced datasets.
[ECCV 2022] Official Implementation for Unsupervised Selective Labeling for More Effective Semi-Supervised Learning
Input-Agnostic Face Detection
A better Alpaca Model Trained with Less Data (only 9k instructions of the original set)
Quickly set up an image labelling web application for manually tagging images for machine learning tasks.