Paper | Code | Dataset |
---|---|---|
DensePeds, IROS’19 | GitHub Link | India-Walk (more details below) |
We present a pedestrian tracking algorithm, DensePeds, that tracks individuals in highly dense crowds (greater than 2 pedestrians per square meter). Our approach is designed for videos captured from front-facing or elevated cameras. We present a new motion model called Front-RVO (FRVO) for predicting pedestrian movements in dense situations using collision avoidance constraints, and combine it with state-of-the-art Mask R-CNN to compute sparse feature vectors that reduce the loss of pedestrian tracks (false negatives). We evaluate DensePeds on the standard MOT benchmarks as well as a new dense crowd dataset. In practice, our approach is 4.5 times faster than prior tracking algorithms on the MOT benchmark and improves on the state of the art in dense crowd videos by over 2.6% on the absolute scale, on average.
Please cite our work if you find it useful:
@article{chandra2019densepeds,
title={DensePeds: Pedestrian Tracking in Dense Crowds Using Front-RVO and Sparse Features},
author={Chandra, Rohan and Bhattacharya, Uttaran and Bera, Aniket and Manocha, Dinesh},
journal={arXiv preprint arXiv:1906.10313},
year={2019}
}
Dataset License
Available for download here.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The labels corresponding to each video inside the folder videos_original are stored in the files annotations/*_gt.txt.
Each line in each *_gt.txt file contains the following information, in the specified order:
<frame number>,<#agents in frame>,<bbox top left x>,<bbox top left y>,<bbox bottom right x>,<bbox bottom right y>,<agent1 ID>,...,<bbox top left x>,<bbox top left y>,<bbox bottom right x>,<bbox bottom right y>,<agentN ID>
- bbox: bounding box of the agent.
- All x and y values are in pixels, measured from the top-left corner of the corresponding image frame.
- N = number of agents in the frame.
- Agents belong to one of the following classes: ped, cycle, scooter, bike, rick, car, bus, truck, others. Agent ID is the class name followed by a zero-based index in tracking order. For example, the first tracked pedestrian in a video has the ID ped0, the 5th tracked rickshaw has the ID rick4, etc.
- Number of lines in each file = number of frames in the corresponding video.
- The dataset contains 8 annotated dense pedestrian videos so far.
- Each video is processed at 20 fps, and the videos range roughly between 200 and 700 frames.
- The videos are extremely dense, with roughly 70 to 80 pedestrians per frame on average.
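The annotation format described above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the names `Agent` and `parse_gt_line` are hypothetical, and it assumes that each line has no trailing comma, that coordinates are numeric, and that every agent ID is a lowercase class name followed by an integer index (as in ped0 or rick4).

```python
import re
from typing import List, NamedTuple, Tuple


class Agent(NamedTuple):
    """One annotated agent in a frame (hypothetical helper type)."""
    agent_id: str  # full ID, e.g. "ped0"
    cls: str       # class part of the ID, e.g. "ped"
    index: int     # zero-based index part of the ID, e.g. 0
    x1: float      # bbox top-left x (pixels)
    y1: float      # bbox top-left y (pixels)
    x2: float      # bbox bottom-right x (pixels)
    y2: float      # bbox bottom-right y (pixels)


# Assumption: IDs look like "<class><index>", e.g. "ped0", "rick4".
_ID_RE = re.compile(r"^([a-z]+)(\d+)$")


def parse_gt_line(line: str) -> Tuple[int, List[Agent]]:
    """Parse one line of a *_gt.txt file into (frame number, agents)."""
    fields = [f.strip() for f in line.strip().split(",")]
    frame = int(fields[0])
    n_agents = int(fields[1])
    rest = fields[2:]
    # Each agent contributes five fields: four bbox values and an ID.
    if len(rest) != 5 * n_agents:
        raise ValueError(
            f"frame {frame}: expected {5 * n_agents} agent fields, got {len(rest)}"
        )
    agents = []
    for i in range(n_agents):
        x1, y1, x2, y2, agent_id = rest[5 * i : 5 * i + 5]
        match = _ID_RE.match(agent_id)
        if match is None:
            raise ValueError(f"frame {frame}: unexpected agent ID {agent_id!r}")
        agents.append(
            Agent(agent_id, match.group(1), int(match.group(2)),
                  float(x1), float(y1), float(x2), float(y2))
        )
    return frame, agents


# Usage on a made-up line with two agents (values are illustrative only).
frame, agents = parse_gt_line("12,2,10,20,50,120,ped0,200,40,260,180,rick4")
pedestrians = [a for a in agents if a.cls == "ped"]
```

Filtering on `cls == "ped"` per frame, as in the last line, is one way to count pedestrians per frame from the annotations.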