Goal: given data in v2d format, produce a video_description/ modality data folder of the following format:
root/video_description/shard-00000.tar
| ├── 00000.jsonl # this corresponds to one video. each line within it corresponds to one subsequence of frames.
| ├── 00001.jsonl
| └── ...
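As a rough illustration of the layout above, here is a minimal sketch that packs one `.jsonl` member per video into a tar shard. The function name `write_description_shard` and the record fields (`description`, `start_frame`, `end_frame`) are assumptions made for this example; the actual per-line schema is not settled in this issue.

```python
import io
import json
import tarfile


def write_description_shard(shard_path, videos):
    """Write one tar shard containing one .jsonl member per video.

    `videos` maps a video key such as "00000" to a list of dicts, one dict
    per subsequence of frames. The dict fields are hypothetical placeholders.
    """
    with tarfile.open(shard_path, "w") as tar:
        for key in sorted(videos):
            # One JSON object per line, one line per subsequence of frames.
            payload = "".join(json.dumps(rec) + "\n" for rec in videos[key])
            data = payload.encode("utf-8")
            info = tarfile.TarInfo(name=f"{key}.jsonl")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))


if __name__ == "__main__":
    # Hypothetical records; field names are illustrative only.
    example = {
        "00000": [
            {"description": "a person opens a door", "start_frame": 0, "end_frame": 48},
            {"description": "the person walks outside", "start_frame": 48, "end_frame": 120},
        ],
        "00001": [
            {"description": "a dog runs across a field", "start_frame": 0, "end_frame": 90},
        ],
    }
    write_description_shard("shard-00000.tar", example)
```

Building each member in memory keeps the sketch short; for large shards it may be preferable to stream records instead.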
Note that the txt/jsons in the v2d might not correspond exactly to the representation we want here (e.g., we might need some logic to determine the start/end frame indices from timestamps).
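One possible shape for that timestamp logic is sketched below. It assumes timestamps are given in seconds, that the video has a constant frame rate, and that the end index is exclusive; none of these conventions are confirmed in this issue, and `timestamps_to_frame_range` is a hypothetical helper name.

```python
import math


def timestamps_to_frame_range(start_s, end_s, fps, num_frames=None):
    """Map a time span in seconds to (start_frame, end_frame).

    end_frame is exclusive. Assumes a constant frame rate `fps`.
    If `num_frames` is given, both indices are clamped to it.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    if start_s < 0 or end_s < start_s:
        raise ValueError("expected 0 <= start_s <= end_s")
    # Floor the start and ceil the end so the span is fully covered.
    start_frame = math.floor(start_s * fps)
    end_frame = math.ceil(end_s * fps)
    if num_frames is not None:
        start_frame = min(start_frame, num_frames)
        end_frame = min(end_frame, num_frames)
    return start_frame, end_frame


if __name__ == "__main__":
    print(timestamps_to_frame_range(1.5, 4.0, 30))  # (45, 120)
```

Floating-point products such as `0.1 * 30` can land slightly above an integer, so a real implementation may want to round before applying floor and ceil, or work with rational timestamps.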
Where are these descriptions coming from? Do we pseudolabel them out with another description model? @smontariol?
Child issue of #3.
kdu4108 changed the title from "Transform from v2d format into video_transcript format and save in video_description/ directory." to "Transform from v2d format into video_description format and save in video_description/ directory." on Jul 15, 2024
kdu4108 changed the title from "Transform from v2d format into video_description format and save in video_description/ directory." to "Transform from v2d format into video_description format and save in video_description/ directory." on Jul 19, 2024