Skip to content

lightly-ai/lightly-insights

Repository files navigation

Lightly Insights

GitHub Unit Tests PyPI Code style: black

Get quick insights about your ML dataset.

Lightly Insights visualises basic statistics of an image dataset. You provide a folder with images and object detection labels, and it generates a static HTML webpage with metrics and plots.

Features

  • Supports all object detection label formats that can be read with Labelformat package. That includes YOLO, COCO, KITTI, PascalVOC, Lightly and Labelbox.
  • Shows the image, object and class counts
  • Analyzes how many images have no labels, and provides their filenames.
  • Shows image samples
  • Shows an analysis of image and object sizes
  • Shows an analysis per class with object sizes, counts per image, location heatmap and other.
  • Typed
  • MIT licensed

Preview

See a live example report for a small dataset.

Screenshots

Lightly Insights Screenshot Lightly Insights Screenshot Lightly Insights Screenshot

Installation

pip install lightly-insights

Usage

Lightly Insights report is generated by a python script. This is necessary to support different label formats.

The example below uses PascalVOC 2007 dataset. You can follow the example by downloading it (~450MB):

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
tar -xvf VOCtrainval_06-Nov-2007.tar

To run Lightly Insights, we need to provide:

  • Image folder. In our case that is ./VOCdevkit/VOC2007/JPEGImages.
  • Object detection labels. They are ingested as Labelformat ObjectDetectionInput class. For PascalVOC the constructor needs the folder with annotations ./VOCdevkit/VOC2007/Annotations and the list of classes. For other format classes, please refer to Labelformat formats.
from pathlib import Path
from labelformat.formats import PascalVOCObjectDetectionInput
from lightly_insights import analyze, present

# Analyze an image folder.
image_analysis = analyze.analyze_images(
    image_folder=Path("./VOCdevkit/VOC2007/JPEGImages")
)

# Analyze object detections.
label_input = PascalVOCObjectDetectionInput(
    input_folder=Path("./VOCdevkit/VOC2007/Annotations"),
    category_names=(
        "person,bird,cat,cow,dog,horse,sheep,aeroplane,bicycle,boat,bus,car,"
        + "motorbike,train,bottle,chair,diningtable,pottedplant,sofa,tvmonitor"
    )
)
od_analysis = analyze.analyze_object_detections(label_input=label_input)

# Create HTML report.
present.create_html_report(
    output_folder=Path("./html_report"),
    image_analysis=image_analysis,
    od_analysis=od_analysis,
)

To view the report, open ./html_report/index.html.

Development

The library targets python 3.7 and higher. We use poetry to manage the development environment.

Here is an example development workflow:

# Create a virtual environment with development dependencies
poetry env use python3.7
poetry install

# Make changes
...

# Autoformat the code
poetry run make format

# Run tests
poetry run make all-checks

Maintained By

Lightly is a spin-off from ETH Zurich that helps companies build efficient active learning pipelines to select the most relevant data for their models.

You can find out more about the company and it's services by following the links below:

About

Easily get basic insights about your ML dataset

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published