KLAP - Horse races prediction

Student research project that aims to build predictive models to beat the bookmakers. Our client, Alezan.IA, is a computer vision startup working on horse races that wants to explore betting strategies. They gathered data scraped from online sources, which we extracted, cleaned, and processed to build our models.

Project Organization

├── Makefile           <- Makefile with commands like `make data` or `make train`
├──          <- The top-level README for developers using this project.
├── data
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
├── models             <- Trained and serialized models, model predictions, or model summaries
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
├──           <- makes project pip installable (pip install -e .) so src can be imported
├── src                <- Source code for use in this project.
│   ├──    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   └──
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └──
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   ├──
│   │   └──
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └──

Feature extraction and preprocessing



Several models have been implemented.


To estimate each horse's skill, we created an Elo rating inspired by the one used in chess. We applied the model to three years of horse races.
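
As an illustration of the idea, here is a minimal sketch of an Elo update for one race. It is not the project's actual implementation: treating a race as a set of pairwise duels between horses, the function names `expected_score` and `update_ratings`, the K-factor of 32, the starting rating of 1500, and the division of K by the number of opponents are all assumptions made for this example.

```python
from itertools import combinations


def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(ratings, finishing_order, k=32.0, default=1500.0):
    """Return a new ratings dict after one race.

    finishing_order lists horse names from first to last place.
    Each pair of horses is treated as a duel won by the better-placed one
    (an assumption of this sketch, not necessarily the project's method).
    """
    deltas = {horse: 0.0 for horse in finishing_order}
    for i, j in combinations(range(len(finishing_order)), 2):
        winner, loser = finishing_order[i], finishing_order[j]
        exp_win = expected_score(
            ratings.get(winner, default), ratings.get(loser, default)
        )
        deltas[winner] += 1.0 - exp_win
        deltas[loser] -= 1.0 - exp_win

    # Scale K by the number of opponents so large fields do not cause huge jumps.
    scale = k / max(len(finishing_order) - 1, 1)
    new_ratings = dict(ratings)
    for horse, delta in deltas.items():
        new_ratings[horse] = ratings.get(horse, default) + scale * delta
    return new_ratings


ratings = update_ratings({}, ["A", "B", "C"])
```

With three unrated horses, the winner gains exactly what the last-placed horse loses, and the middle horse is unchanged, so the total rating in the field is conserved.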

Machine Learning Models

SVM, MLP, k-NN comparisons

Multivariate sequence prediction

---- TO DO

Model evaluation

Input: a horse race, containing the horses' data produced by feature engineering
Output: a predicted ranking of the race
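
The input/output contract above could be sketched as follows. This is only an illustration: the function name `predict_ranking`, the dictionary layout of a race, the `horse` and `elo` keys, and the use of a single scoring function are hypothetical and not taken from the project's code.

```python
def predict_ranking(race, score_fn):
    """Return horse names ordered from predicted first to predicted last.

    race: list of dicts, one per horse, each with a "horse" key plus features
          (hypothetical layout for this sketch).
    score_fn: callable mapping one horse's dict to a number; higher is better.
    """
    ordered = sorted(race, key=score_fn, reverse=True)
    return [entry["horse"] for entry in ordered]


# Example race with a single made-up feature, the horse's Elo rating.
race = [
    {"horse": "A", "elo": 1480.0},
    {"horse": "B", "elo": 1530.0},
    {"horse": "C", "elo": 1500.0},
]
ranking = predict_ranking(race, score_fn=lambda entry: entry["elo"])
```

Any model that produces a per-horse score can be plugged in as `score_fn`, which keeps the ranking step independent of the model being compared.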

Project based on the cookiecutter data science project template. #cookiecutterdatascience