This is a machine learning course project on computer vision.
It may look a bit complicated. The Jupyter notebook is the core file: it ties the other files together and handles training, validation, and prediction on video data.
The main idea is to use YOLOv5 as the base detector and count the heads in a video, in order to make sure the number of people does not exceed a set limit.
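As a rough illustration of the counting step, here is a minimal sketch and not the code from the notebook. It assumes each detection for a frame is a row of the form `(x1, y1, x2, y2, confidence, class_id)`; the function names, the 0.5 confidence threshold, and the sample values are all hypothetical.

```python
from typing import Sequence, Tuple

# Assumed detection layout: (x1, y1, x2, y2, confidence, class_id).
Detection = Tuple[float, float, float, float, float, int]


def count_heads(detections: Sequence[Detection], conf_threshold: float = 0.5) -> int:
    """Count detections whose confidence is at least conf_threshold."""
    return sum(1 for det in detections if det[4] >= conf_threshold)


def exceeds_limit(
    detections: Sequence[Detection], max_people: int, conf_threshold: float = 0.5
) -> bool:
    """Return True when the head count in a frame is above max_people."""
    return count_heads(detections, conf_threshold) > max_people


# Made-up detections for a single frame (illustrative values only).
frame_detections = [
    (10, 20, 50, 60, 0.91, 0),
    (100, 40, 140, 80, 0.78, 0),
    (200, 30, 230, 65, 0.32, 0),  # below the threshold, so not counted
]

print(count_heads(frame_detections))
print(exceeds_limit(frame_detections, max_people=1))
```

Running the check per frame keeps the logic simple; smoothing the count over several frames would be a natural extension if single-frame detections turn out to be noisy.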
create_dataset.py converts the XML labels to YOLO-format labels.
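The conversion can be sketched roughly as follows. This is not the actual content of create_dataset.py; it assumes the XML follows the Pascal VOC layout (`size/width`, `size/height`, and `object/bndbox` with `xmin`, `ymin`, `xmax`, `ymax`) and that the target is the usual YOLO text format of `class x_center y_center width height`, normalized by image size. The function name, the `head` class name, and the sample annotation are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def voc_xml_to_yolo(xml_text: str, class_ids: Dict[str, int]) -> List[str]:
    """Convert one VOC-style annotation into YOLO label lines."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.findtext("width"))
    img_h = float(size.findtext("height"))

    lines = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        if name not in class_ids:
            continue  # skip classes we are not training on
        box = obj.find("bndbox")
        xmin = float(box.findtext("xmin"))
        ymin = float(box.findtext("ymin"))
        xmax = float(box.findtext("xmax"))
        ymax = float(box.findtext("ymax"))

        # YOLO uses the box center and size, normalized to [0, 1].
        x_center = (xmin + xmax) / 2 / img_w
        y_center = (ymin + ymax) / 2 / img_h
        width = (xmax - xmin) / img_w
        height = (ymax - ymin) / img_h
        lines.append(
            f"{class_ids[name]} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
        )
    return lines


# Hypothetical annotation, for illustration only.
sample_xml = """
<annotation>
  <size><width>200</width><height>100</height></size>
  <object>
    <name>head</name>
    <bndbox><xmin>50</xmin><ymin>20</ymin><xmax>150</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>
"""

yolo_lines = voc_xml_to_yolo(sample_xml, {"head": 0})
print("\n".join(yolo_lines))  # prints: 0 0.500000 0.400000 0.500000 0.400000
```

Each image would then get one `.txt` file containing these lines, with the same base name as the image.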
The datasets I mainly used are SCUT_HEAD and Hollywood Heads; the first is available on GitHub and the second on the web.
The slides (.pptx) give a quick overview of how I did it and the major steps.
You can contact me at firstname(dot)lastname at nyu(dot)edu with copyright concerns or for help.