Consider defining clearly the PTS and DTS sourcing in each bytestream spec #292

Open
wolenetz opened this issue Aug 13, 2021 · 1 comment

@wolenetz
Member

To improve readability, especially since other specifications such as WebCodecs, WebRTC, HTML.rVFC, etc. may attach different semantics to "timestamp" or "presentation timestamp", it would be good for each MSE bytestream format to further clarify where the PTS and DTS of an MSE coded frame come from in the underlying format.

See also w3c/webcodecs#107 (comment)

@cconcolato

For ISOBMFF, here is what I would propose:

  • MSE's "Decoding Timestamp" should be equal to ISOBMFF's "decoding time", i.e. for a fragmented file, the value derived from the tfdt box of the fragment containing the coded frame plus the sum of the durations of the previous samples in that fragment (as indicated by trun, tfhd, or trex).
  • MSE's "Presentation Timestamp" should be ISOBMFF's "presentation time", as defined in ISOBMFF, i.e. for a fragmented file, derived from the decoding time (see above), the composition time offset, if any (from trun), and the edit list, if any.
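
To make the proposal concrete, here is a rough sketch of the derivation for one track fragment. It is not taken from any spec text or implementation: `Sample`, `fragment_timestamps` and `presentation_shift` are hypothetical names, the sample durations are assumed to be already resolved from trun/tfhd/trex, and the edit list is assumed to have been reduced to a single constant shift in seconds.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import List, Tuple


@dataclass
class Sample:
    # Duration in media timescale units (resolved from trun, else tfhd, else trex).
    duration: int
    # Composition time offset in media timescale units (from trun, 0 if absent).
    composition_offset: int = 0


def fragment_timestamps(
    base_media_decode_time: int,   # value carried in the tfdt box
    samples: List[Sample],
    timescale: int,                # media timescale (ticks per second)
    presentation_shift: Fraction = Fraction(0),  # assumed: net edit list shift, seconds
) -> List[Tuple[Fraction, Fraction]]:
    """Return (DTS, PTS) in seconds for each sample, in decode order."""
    result = []
    decode_time = base_media_decode_time
    for sample in samples:
        dts = Fraction(decode_time, timescale)
        pts = Fraction(decode_time + sample.composition_offset, timescale) + presentation_shift
        result.append((dts, pts))
        # The next sample's decoding time is this one plus its duration.
        decode_time += sample.duration
    return result


# Hypothetical fragment: tfdt = 90000 at a 90 kHz timescale, three samples of 3000 ticks.
timestamps = fragment_timestamps(
    90_000,
    [Sample(3000, 3000), Sample(3000, 9000), Sample(3000, 0)],
    90_000,
)
```

Using `Fraction` keeps the tick-to-seconds conversion exact; a real implementation would also have to handle multi-entry edit lists, which this sketch deliberately ignores.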

But reading MSE, I see the following definition:

The decode timestamp indicates the latest time at which the frame needs to be decoded assuming instantaneous decoding and rendering of this and any dependant frames (this is equal to the presentation timestamp of the earliest frame, in presentation order, that is dependant on this frame)

The part in parentheses is not always true for ISOBMFF. Consider an edit list that shifts the presentation forward by 10s (media rate 1, media time -1, edit_duration 10s): the presentation time of the first frame will be 10s while its decode time will be 0. IIUC, this is not explicitly permitted, but not excluded either, in the ISOBMFF Byte Stream Format spec.
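
The edit list case above can be worked through numerically. This is only an illustration under simplifying assumptions: `presentation_shift` is a hypothetical helper, it only handles leading empty edits followed by at most one normal edit at media rate 1, and the timescales (1000 for the movie, 90000 for the media) are made up for the example.

```python
from fractions import Fraction
from typing import List, Tuple


def presentation_shift(
    edits: List[Tuple[int, int]],  # (segment_duration, media_time) per edit list entry
    movie_timescale: int,          # segment_duration is in movie timescale units
    media_timescale: int,          # media_time is in media timescale units
) -> Fraction:
    """Net shift in seconds to add to composition time (simplified model)."""
    shift = Fraction(0)
    for segment_duration, media_time in edits:
        if media_time == -1:
            # Empty edit: the presentation is delayed by the edit duration.
            shift += Fraction(segment_duration, movie_timescale)
        else:
            # First normal edit: media before media_time is skipped.
            shift -= Fraction(media_time, media_timescale)
            break
    return shift


# The case from this comment: one empty edit of 10 s (media time -1, media rate 1).
shift = presentation_shift([(10_000, -1)], movie_timescale=1000, media_timescale=90_000)

first_dts = Fraction(0, 90_000)  # first frame decodes at time 0
first_pts = first_dts + shift    # no composition offset on the first frame

print(float(first_dts), float(first_pts))  # prints: 0.0 10.0
```

Here the decode timestamp (0) does not equal the presentation timestamp of the earliest frame that depends on this frame (10 s at the earliest), which is the mismatch with the parenthetical in the MSE definition.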
