15-Scene Image Classification with SIFT and SVM
This repository contains the Python code for a machine learning project that classifies images from the 15-Scene dataset using Scale-Invariant Feature Transform (SIFT) features and a Support Vector Machine (SVM) classifier. The project uses OpenCV for image processing, scikit-learn for clustering and classification, and joblib for caching.
The dataset used is the 15-Scene Image Dataset, which can be downloaded from: 15-Scene Image Dataset
To run this project, you need Python 3.x and the following packages:
- OpenCV
- NumPy
- scikit-learn
- joblib
- tqdm
You can install the required packages using pip:
pip install numpy opencv-python scikit-learn joblib tqdm
- First, clone the repository to your local machine:
  git clone [email protected]:hammershock/VisionVocabClassifier.git
  cd VisionVocabClassifier
- Download the 15-Scene Image Dataset and extract it into the project directory under ./15-Scene Image Dataset.
- Run the main script:
  python main.py
This runs data loading, feature extraction, training, and evaluation in sequence. The results, including the model's accuracy and confusion matrix, are printed to the console.
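For readers unfamiliar with the two reported metrics, the following is a minimal NumPy-only sketch of how an accuracy score and a confusion matrix relate to each other. It is not taken from main.py: the `confusion_matrix` helper and the toy label arrays are hypothetical, and the project itself may compute these values differently (for example with scikit-learn's metrics module).

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when an index pair repeats.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


# Hypothetical labels for six images across three scene classes.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(y_true, y_pred, n_classes=3)
# Correct predictions lie on the diagonal, so accuracy is trace / total.
accuracy = np.trace(cm) / cm.sum()

print(cm)
print(f"accuracy = {accuracy:.3f}")
```

Off-diagonal entries show which scene categories are confused with each other, which a single accuracy number cannot reveal.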
- Data Loading: Automatically loads and splits the dataset.
- Feature Extraction: Extracts SIFT descriptors from images.
- Clustering: Uses MiniBatchKMeans for clustering descriptors into visual words.
- Histogram Generation: Builds histogram features from clustered descriptors.
- Classification: Trains an SVM classifier with an RBF kernel.
- Evaluation: Computes accuracy and displays a confusion matrix.
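The steps above can be sketched end to end on synthetic data. This is an illustrative outline under stated assumptions, not the code in main.py: random 128-dimensional vectors stand in for real SIFT descriptors (which the project extracts from images with OpenCV), and the helper names `fake_descriptors` and `bow_histogram`, the vocabulary size `k = 8`, and the train/test split are all hypothetical choices made for brevity.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)


def fake_descriptors(label, n=60, dim=128):
    """Stand-in for one image's SIFT descriptors (an n x 128 array)."""
    return rng.normal(loc=3.0 * label, scale=1.0, size=(n, dim)).astype(np.float32)


# Twenty synthetic "images" alternating between two classes.
labels = np.array([0, 1] * 10)
descriptors = [fake_descriptors(y) for y in labels]
n_train = 14

# Clustering: learn k visual words from the training descriptors only.
k = 8
kmeans = MiniBatchKMeans(n_clusters=k, n_init=3, random_state=0)
kmeans.fit(np.vstack(descriptors[:n_train]))


def bow_histogram(desc):
    """Histogram generation: count visual words, then L1-normalize."""
    words = kmeans.predict(desc)
    hist = np.bincount(words, minlength=k).astype(np.float64)
    return hist / hist.sum()


X = np.array([bow_histogram(d) for d in descriptors])

# Classification: RBF-kernel SVM on the histogram features.
clf = SVC(kernel="rbf")
clf.fit(X[:n_train], labels[:n_train])
pred = clf.predict(X[n_train:])

print("held-out accuracy:", np.mean(pred == labels[n_train:]))
```

Normalizing each histogram makes images with different numbers of keypoints comparable. Fitting the vocabulary on training images only keeps information from the test set out of the features.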
Contributions to this project are welcome. Please fork the repository and submit a pull request with your changes.
This project is open-sourced under the MIT License. See the LICENSE file for more details.