Crowd_counting_from_scratch

This is an overview and tutorial on crowd counting. In this repository, you can learn how to estimate the number of pedestrians in crowd scenes using computer vision and deep learning.

How to do crowd counting?

Crowd counting has a long research history. For about twenty years or even longer, researchers have been interested in developing methods to automatically count the number of pedestrians in an image.
There are three main categories of methods for counting pedestrians in a crowd.

Pedestrian detector. You can use a traditional HOG-based detector or a deep-learning-based detector such as the YOLO or R-CNN families. However, the performance of this category of methods is seriously degraded by occlusion in crowd scenes.
Number regression. This category of methods extracts features from the original image and uses a machine-learning model to map those features to a count. An improved, deep-learning version maps the original image directly to its count. Before deep learning, regression-based methods were state of the art (SOTA), and researchers focused on finding more effective features to produce more accurate estimates. Once deep learning became popular and achieved better results, regression-based methods received less attention because effective hand-crafted features are hard to design.
Density map. This category is the mainstream approach in crowd counting today. Compared with detector-based and regression-based methods, a density map not only gives the number of pedestrians but also reflects their spatial distribution, which helps the model better fit the original image and its corresponding density.

What is a density map?

Simply speaking, we use a Gaussian kernel to simulate a head at the corresponding position in the original image. After doing this for every head in the image, we normalize the matrix composed of all these Gaussian kernels, so that the density map sums to the number of heads. The sample picture is as follows:
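
In formula form, the construction described above can be sketched as follows. The notation is an assumption chosen for illustration: $x_i$ is the annotated position of the $i$-th head, $N$ is the number of heads, $\delta$ is a unit impulse, and $G_{\sigma}$ is a 2D Gaussian kernel normalized to sum to 1.

```latex
H(x) = \sum_{i=1}^{N} \delta(x - x_i), \qquad
D(x) = H(x) * G_{\sigma}(x) = \sum_{i=1}^{N} G_{\sigma}(x - x_i), \qquad
\sum_{x} D(x) = N
```

Because each kernel has unit mass, summing the density map over all pixels recovers the head count, up to any kernel mass that falls outside the image border.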

Further, there are three strategies for generating a density map.

Fixed-size density map. Use the same Gaussian kernel to simulate every head. This method suits scenes without severe perspective distortion. [fixed_kernel_code]
Perspective density map. Use a perspective map (generated by linear regression on pedestrians' heights) to assign Gaussian kernels of different sizes to different heads. This method suits fixed scenes. See [perspective_kernel_code]; [paper-zhang-CVPR2015] gives detailed instructions on how to generate a perspective density map.
KNN density map. Use the distances to each head's k nearest heads to assign Gaussian kernels of different sizes to different heads. This method suits very crowded scenes. See [k_nearset_kernel_code]; [paper-MCNN-CVPR2016] gives detailed instructions on how to generate a k-nearest density map.
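
As a rough illustration of the fixed-size and KNN strategies, here is a minimal NumPy sketch. It is not the code behind the links above: the function names, the default `sigma=4.0`, `k=3`, `beta=0.3`, the lower bound of 1.0 on the adaptive sigma, and the renormalization of kernels clipped at the image border are all assumptions made for this example. The perspective strategy is not shown because it needs a scene-specific perspective map.

```python
import numpy as np


def gaussian_kernel(size, sigma):
    """Return a (size x size) Gaussian kernel normalized to sum to 1."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def add_head(density, row, col, sigma):
    """Stamp one unit-mass Gaussian centred on (row, col) into density."""
    h, w = density.shape
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(2 * radius + 1, sigma)
    # Clip the kernel window to the image borders.
    r0, r1 = max(0, row - radius), min(h, row + radius + 1)
    c0, c1 = max(0, col - radius), min(w, col + radius + 1)
    patch = kernel[r0 - (row - radius): r1 - (row - radius),
                   c0 - (col - radius): c1 - (col - radius)]
    # Assumption: renormalize so a head near the border still contributes 1.
    density[r0:r1, c0:c1] += patch / patch.sum()


def fixed_density_map(shape, heads, sigma=4.0):
    """Fixed-size strategy: the same sigma for every head."""
    density = np.zeros(shape, dtype=np.float64)
    for row, col in heads:
        add_head(density, row, col, sigma)
    return density


def knn_density_map(shape, heads, k=3, beta=0.3):
    """KNN strategy: sigma proportional to the mean distance to k nearest heads."""
    heads = np.asarray(heads)
    density = np.zeros(shape, dtype=np.float64)
    # Pairwise distances between all heads (fine for small examples).
    diffs = heads[:, None, :] - heads[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    for i, (row, col) in enumerate(heads):
        # Sorted distances; index 0 is the head itself (distance 0).
        nearest = np.sort(dists[i])[1:k + 1]
        sigma = beta * nearest.mean() if nearest.size else 4.0
        add_head(density, int(row), int(col), max(sigma, 1.0))
    return density


# Hypothetical head annotations as (row, col) pixel positions.
heads = [(20, 30), (22, 34), (60, 90), (5, 5)]
fixed = fixed_density_map((80, 120), heads)
adaptive = knn_density_map((80, 120), heads)
print(fixed.sum(), adaptive.sum())  # both equal the head count up to float error
```

In both maps the sum equals the number of annotated heads, which is the property a counting model relies on. Only the spread of each head differs: isolated heads get wider kernels in the KNN version.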

DataLoader for loading an image and its corresponding density map

After generating the density maps, we need to write a dataloader that loads each image and its corresponding density map for forward and backward propagation in every batch. For images with the same resolution, we can use batch_size=32, 64, or even larger. Otherwise, we just use batch_size=1. We strongly recommend using torch.utils.data.Dataset and torch.utils.data.DataLoader to implement your own dataloader. Example code showing how to construct a dataloader is in [dataloader_example_code].
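
The idea can be sketched roughly as below. This is not the repository's dataloader: the class name `CrowdDataset` is hypothetical, and the images and density maps are synthetic in-memory arrays so that the snippet is self-contained. A real dataset would read them from disk inside `__getitem__`.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class CrowdDataset(Dataset):
    """Hypothetical dataset pairing each image with its density map."""

    def __init__(self, images, density_maps):
        # images: list of HxWx3 uint8 arrays; density_maps: list of HxW arrays.
        assert len(images) == len(density_maps)
        self.images = images
        self.density_maps = density_maps

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        image = self.images[index].astype(np.float32) / 255.0
        density = self.density_maps[index].astype(np.float32)
        # HWC -> CHW for the image; add a channel axis to the density map.
        image_tensor = torch.from_numpy(image).permute(2, 0, 1).contiguous()
        density_tensor = torch.from_numpy(density).unsqueeze(0)
        return image_tensor, density_tensor


# Synthetic stand-in data: 8 images that all share one resolution.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(64, 96, 3), dtype=np.uint8) for _ in range(8)]
density_maps = [rng.random((64, 96)) for _ in range(8)]

# Same resolution everywhere, so a batch size larger than 1 is possible.
loader = DataLoader(CrowdDataset(images, density_maps), batch_size=4, shuffle=True)
image_batch, density_batch = next(iter(loader))
print(image_batch.shape, density_batch.shape)
```

With images of differing resolutions the default collation cannot stack the tensors, which is why batch_size=1 is the simple fallback in that case.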

Deep-learning model for crowd counting

For beginners, [paper-MCNN-CVPR2016] is the most suitable model for learning crowd counting. The model is not complex and has acceptable accuracy. We provide a simple [MCNN_model_code] to help you understand MCNN quickly, and a simple full implementation in [MCNN-pytorch].
If you want more accurate results, [paper-CSRNet-CVPR2018] is a deeper model for crowd counting. It uses dilated convolution to avoid frequent pooling and upsampling. We provide a simple full implementation in [CSRNet-pytorch].
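
To make the role of dilation concrete, the standard definition of a 2D dilated convolution and its effect on the covered area can be written as below. The symbols are chosen for illustration: $x$ is the input, $w$ is a $k \times k$ filter, $r$ is the dilation rate, and $k_{\text{eff}}$ is the side length of the area the filter covers.

```latex
y(m, n) = \sum_{i=1}^{k} \sum_{j=1}^{k} x(m + r\,i,\; n + r\,j)\, w(i, j), \qquad
k_{\text{eff}} = k + (k - 1)(r - 1)
```

For example, a 3x3 filter with r = 2 covers a 5x5 area while still using 9 weights, and with stride 1 a padding of r(k - 1)/2 = 2 keeps the feature-map size unchanged. This is how the receptive field can grow without extra pooling and upsampling layers.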

Some Crowd-counting Datasets

UCSD: we provide a processed version with images and point annotations, like other crowd-counting datasets. Link: https://pan.baidu.com/s/1rykWyMYHMLr99W5CCEXeBQ Extraction code: 4u66 (Note: I haven't achieved the same performance as shown in [paper-MCNN-CVPR2016].)

Some Tricks

Use data augmentation such as horizontal flip, crop, illumination change, etc.
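
One detail worth illustrating is that geometric augmentations must be applied identically to the image and its density map, while an illumination change touches only the image. The NumPy sketch below shows this under stated assumptions: the function names, the 0.5 flip probability, and the brightness range 0.7 to 1.3 are illustrative choices, not values from this repository.

```python
import numpy as np


def random_flip_and_crop(image, density, crop_size, rng):
    """Apply the same horizontal flip and crop to an image and its density map."""
    if rng.random() < 0.5:
        # Flip along the width axis; copy to get contiguous arrays.
        image = image[:, ::-1].copy()
        density = density[:, ::-1].copy()
    crop_h, crop_w = crop_size
    h, w = density.shape
    top = int(rng.integers(0, h - crop_h + 1))
    left = int(rng.integers(0, w - crop_w + 1))
    image = image[top:top + crop_h, left:left + crop_w]
    density = density[top:top + crop_h, left:left + crop_w]
    return image, density


def change_illumination(image, rng, low=0.7, high=1.3):
    """Scale the brightness of a uint8 image; the density map is untouched."""
    factor = rng.uniform(low, high)
    return np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)


# Synthetic example: a random image and a density map with two unit impulses.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 96, 3), dtype=np.uint8)
density = np.zeros((64, 96))
density[10, 20] = 1.0
density[40, 70] = 1.0

aug_image, aug_density = random_flip_and_crop(image, density, (32, 48), rng)
aug_image = change_illumination(aug_image, rng)
print(aug_image.shape, aug_density.shape)
```

After cropping, the training target is the sum of the cropped density map rather than the count of the full image, since heads outside the crop no longer contribute.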

Other Crowd Counting Branch

Multi-view crowd counting [link]

Learn More

If you want to learn more about crowd counting, you can visit [Awesome Crowd Counting]. It collects papers and their corresponding code in the field of crowd counting.
