Dataset development that supports both offline and online input. #28
Conversation
class TCNDataset(Dataset):
    def __init__(self, kwcoco_path: str, sample_rate: int, window_size: int):
I was thinking that this dataset could take three paths as input: one for the object detection COCO, one for the pose COCO, and one for the activity truth COCO. If this is one combined file, we will still have to create that file outside of this dataset whenever we change our detection or pose sources. If they are separate inputs, we could more easily mix and match for training experiments. Thoughts?
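A minimal sketch of what that three-path constructor could look like. All names and defaults here are illustrative assumptions for the sake of discussion, not the actual repo API:

```python
from dataclasses import dataclass

# Hypothetical configuration for a dataset that takes the three kwcoco
# sources separately, so detection/pose results can be mixed and matched
# without regenerating a combined file. Field names are assumptions.
@dataclass
class TCNDatasetConfig:
    det_kwcoco_path: str       # object detection results COCO
    pose_kwcoco_path: str      # pose estimation results COCO
    activity_kwcoco_path: str  # activity ground-truth COCO
    sample_rate: int = 1
    window_size: int = 25

# Swapping detection sources is then just a path change:
cfg = TCNDatasetConfig(
    det_kwcoco_path="yolov8_dets.mscoco.json",
    pose_kwcoco_path="pose.mscoco.json",
    activity_kwcoco_path="activity.mscoco.json",
)
```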
The current offline pipeline works sequentially: we generate a detection kwcoco and feed it to the pose generation script, which populates it further with pose annotations. The script I wrote above takes the combined detection+pose kwcoco and populates it with activity GT. Changing to 3 different COCO files would mean changing the way we save our feature-specific (i.e. pose, detection, activity) kwcocos as well, and regenerating them before continuing.
For offline training there shouldn't be a change of data sources or of the kwcoco generation pipeline. Outside offline training, I currently set up a "collect_input" function outside the dataset class, which takes care of data standardization for either offline or online data inputs.
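A hedged sketch of what such a standardization function might do. This is not the repo's actual collect_input; the COCO-style keys (image_id, bbox, keypoints) are assumptions for illustration:

```python
def collect_inputs(detections, poses):
    """Illustrative sketch: normalize per-annotation detection and pose
    records (offline kwcoco rows or online messages, assumed to share
    COCO-style keys) into one common per-frame structure."""
    frames = {}
    for det in detections:
        frame = frames.setdefault(det["image_id"], {"dets": [], "poses": []})
        frame["dets"].append(det["bbox"])
    for kp in poses:
        frame = frames.setdefault(kp["image_id"], {"dets": [], "poses": []})
        frame["poses"].append(kp["keypoints"])
    return frames

frames = collect_inputs(
    detections=[{"image_id": 1, "bbox": [0, 0, 10, 10]}],
    poses=[{"image_id": 1, "keypoints": [5, 5, 2]}],
)
```

The point of the single entry function is that the offline and online paths only differ in how the raw records arrive, not in the structure handed to the model.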
My thought was along the lines that one thing we are achieving here is removing the "bundle" meta-files that were an intermediate step between processing detection/pose/activity-truth combinations. As this currently stands, we have reduced the preprocessing, but still have to do some to create this combined COCO file. Going forward, I was assuming we would be changing our detection and pose sources (yolov7 vs. yolov8 vs. yolov11, mmpose vs. yolov11 pose, etc.), in which case being able to just point the dataset at whichever results version to use would make iterating easier than having to remember to create a preprocessed unified COCO file. Is that conceptualization off the mark?
It is not off the mark, though it depends on how the current code will be changed. If you simply take the new pose/detection models and swap them into the current code, then the pipeline stays the same (i.e. sequential input-output of kwcoco files, ending in a unified kwcoco). It is probably easy enough to create a new kwcoco file for each task (instead of adding annotations to one), but then we generate more kwcoco files without improving or adding functionality.
The way I see it, there isn't anything inherently different about creating 3 kwcoco files versus 1 kwcoco file, since this is only used by the offline system (the online system gets its own logic in the "collect_inputs" function). Also, with 3 kwcoco files you will need to cross-reference pose, detections, and activity GT to the same images, and will also need to ensure that frames have the same ids in all three kwcocos.
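The cross-referencing burden described here can be sketched as a small alignment check. This is a hypothetical helper, assuming each input maps image id to file name:

```python
def check_frame_alignment(det_imgs, pose_imgs, act_imgs):
    """Sketch of the consistency check three separate kwcoco files would
    require: every image id must exist in all three inputs, and ids that
    do must refer to the same file name."""
    shared = set(det_imgs) & set(pose_imgs) & set(act_imgs)
    mismatched = [i for i in shared
                  if not (det_imgs[i] == pose_imgs[i] == act_imgs[i])]
    missing = (set(det_imgs) | set(pose_imgs) | set(act_imgs)) - shared
    return sorted(missing), mismatched

missing, mismatched = check_frame_alignment(
    det_imgs={1: "f1.png", 2: "f2.png"},
    pose_imgs={1: "f1.png", 2: "f2.png"},
    act_imgs={1: "f1.png"},  # activity GT missing frame 2
)
```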
tcn_hpl/data/test.mscoco.json
Separate question: are the individual detection / pose / activity classification COCO formats (all merged into this file, seemingly) following any standards? If so, could you remind me what/where those standards are? Last I heard, the activity classification standard was one that we made up. If that is true, is there a better activity classification standard to use? I would almost think that this is basically image classification, so would it be better to follow that established format?
The training pipeline is set up as a classification problem, yes. I am not sure I understand what you mean by the individual annotation COCO format standards? If you mean in terms of metadata and category information: it is set up as a mix of pose and detection, without activity classification metadata (i.e. a dict with the id and name of the activity classification categories) baked into the metadata. We use a separate config file for that (e.g. /home/local/KHQ/peri.akiva/projects/angel_system/config/activity_labels/medical/r18.yaml)
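A sketch of what baking those categories into the COCO metadata might look like, as an alternative to the separate yaml config. The helper and the (id, name) pairs are hypothetical stand-ins for what the config file holds:

```python
def inject_activity_categories(coco_meta, activity_labels):
    """Hypothetical helper: write activity-classification categories
    (id + name dicts) into the kwcoco metadata so the file is
    self-describing, instead of relying on an external yaml config."""
    coco_meta.setdefault("activity_categories", [
        {"id": cat_id, "name": name} for cat_id, name in activity_labels
    ])
    return coco_meta

meta = inject_activity_categories(
    {}, [(0, "background"), (1, "open-package")]
)
```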
…file, config file corrections
Fixup commit (with integrating rebase): removed COCO file and added some nitpick whitespace updates
@periakiva I've done some minor updates based on my few comments. I'm happy to merge this unless you have any issues.
What does this PR do?
Fixes #<issue_number>
Before submitting
- Did you make sure all tests pass with the pytest command?
- Did you run the pre-commit run -a command?
Did you have fun?
Make sure you had fun coding 🙃