-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSE-for-WebCodecs draft feature specification #302
base: main
Are you sure you want to change the base?
Conversation
* Updates MediaSource and SourceBuffer sections' IDL to reference the new methods and types. * Updates SOTD substantives list format and content to include this feature. See w3c#184 for the spec issue tracking this feature's addition. (Remove this set of lines during eventual squash and merge:) * Adds placeholder notes including references to the WebCodecs spec to let the updated IDLs' references to definitions from that spec succeed. Upcoming commits will remove the placeholders and include exposition on the behavior of the updated IDL, possibly also refactoring reused steps into subalgorithms.
Needs verification later of correct linkage to WebCodecs dfns for "valid AudioDecoderConfig" and "valid VideoDecoderConfig" once WebRef updates. (See their associated exports added in w3c/webcodecs#372).
Adds TODO items to track more work necessary in this draft. Fixes the addSourceBuffer parameter type to be TypeOrConfig. References a not-yet-defined internal slot that tracks the pending append chunks promise, and adds steps to reject that promise on abort() or removeSourceBuffer(), if necessary. Updates various places that describe BSF to also mention the possibility that a SourceBuffer may instead be configured to expect WebCodecs encoded chunks appends. Adds a note to isTypeSupported() for how to use the WebCodecs isConfigSupported() methods to proactively check support for buffering encoded chunks into a SourceBuffer.
Fixes reference to the sourceBuffer in removeSourceBuffer()'s promise abort step. Updates the coded frame eviction algorithm's definition of |new data| to possibly be EncodedChunks being appended. Adds top-level steps for appendEncodedChunks(). Updates abort() and removeSourceBuffer() handling to conditionally reject promise vs abort / updateend firing. Updates reset parser state algorithm to clear any EncodedChunks from the input configs and chunks slot, but leave any WebCodecs config there. References w3c#301 for discussion from each of appendBuffer() and appendEncodedChunks().
Defined [[pending append chunks promise]]. Defined [[input webcodecs configs and chunks]] Enforced all EncodedChunks have non-null duration during synchronous appendEncodedChunks(), after the step that runs the Prepare Append algorithm. Updated init-segment-received to mention [=chunks append=] algorithm can also call it. Updated coded-frame-processing to mention [=chunks append=] algorithm can also call it. Defined [=chunks append=] algorithm, Defined generation of DTS for frames from EncodedChunks (needed by coded frame processing algorithm) to be simply 0. Noted this assumes no discontinuity within or across appendEncodedChunks() calls, except as may be explicitly signalled by the app calling abort() or changeType(). Disambiguated (slightly) "SourceBuffer track configuration" versus "SourceBufferConfig". Updated changeType. Some notes on the current (M95) Chrome prototype implementation of this feature; all are expected to be fixed very soon: * crbug.com/1255048: it doesn't support changeType(SourceBufferConfig)) * crbug.com/1255050: it doesn't support h.264 EncodedVideoChunks' append support, and it assumes encoded chunks' DTS==PTS instead of using just 0 for all DTS of EncodedChunks' frames sent to the coded frame processing algorithm: this may require further refinement as noted in the spec as well if it is not working as expected once the prototype is updated. * crbug.com/1255052: it still hardcodes EncodedAudioChunk durations to be 22ms coded frames due to duration field originally not in EncodedAudioChunk specification, and it checks for EncodedVideoChunk duration in the middle of the Prepare Append steps instead of after that subalgorithm.
@mwatson2, please review this full feature. There are probably some edge cases in pre-existing text that I missed that might still need some updating. Hopefully, the sequence of changes I put together in this PR enable easier piecemeal review. @tidoust, please review for respec and standards-track aspects. Thank you both! |
appended in the same {{SourceBuffer/appendEncodedChunks()}} call: the application should instead wait for the | ||
{{SourceBuffer}} to no longer be {{SourceBuffer/updating}}, and then call {{SourceBuffer/abort()}} or | ||
{{SourceBuffer/changeType()}} before appending {{EncodedChunks}} that are discontinuous with those in the | ||
previous {{SourceBuffer/appendEncodedChunks()}} call.</p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why EncodedChunks cannot be appended in any order like media segments can ?
Does the above mean that to append, say, overlapping media (replacing some existing frames) we need to call changeType
even if the type has not changed ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this is not ergonomic/compatible with previous behavior, and should be improved.
If I recall from when I wrote this draft and the prototype in Chromium, sequences of out-of-order presentation time versus the decode sequence must still be supported. The issue, I think, arises from the lack of explicit "decode timestamp" in EncodedChunks, and the prototype/draft API instead assumes the order of chunk appends describes their decode sequence. There is no ability of the coded frame processing algorithm to detect discontinuities in EncodedChunk decode time, unlike bytestream-formatted media. Without some ability to determine actual discontinuity implicitly, explicit discontinuity signaling is needed (abort() or changeType()). Even so, changeType
isn't required if type isn't changing, but there is indeed a discontinuity. Instead, abort() signals the discontinuity explicitly without changing type.
This could be improved/simplified for "segments" mode: keyframes themselves could be used as the implicit signal of potential discontinuity, especially since there is allowance for only 1 audio config or only 1 video config in a SourceBuffer
configured to allow EncodedChunk appends (and therefore there is no need to have such per-keyframe discontinuity also impacting the processing of coded frames for any other track buffers in that SourceBuffer
.)
I might be missing some other detail in my recollection of this old PR and prototype though. For example, "sequence" mode appends that are not discontinuous at keyframes, but might appear to be if they are from SAP-Type-2 encode sequences, could cause interop issues if keyframes are used as implicit potential discontinuities. Conversely, without implicit discontinuity signaling, lack of app explicitly signaling could impair reaching the intent of "sequence" mode buffering.
One solution might be to use keyframes as implicit potential discontinuity signaling, but disallow use of appending chunks in "sequence" mode.
// that constraint: the implementation MUST reject a sequence containing both | ||
// audio and video chunks. | ||
typedef (sequence< (EncodedAudioChunk or EncodedVideoChunk) > | ||
or EncodedAudioChunk or EncodedVideoChunk) EncodedChunks; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we write sequence<EncodedAudioChunk> or sequence<EncodedVideoChunk> or EncodedAudioChunk or EncodedVideoChunk
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I tried that originally in the prototype, but the IDL bindings generator failed. I fuzzily recall maybe it was due to an identifier length limit on the generated type name. Worth reconsidering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like leaving this with just a fuzzy recollection, so I checked the Chromium prototype's commit description for the change I landed that added this IDL:
Caveat: "
sequence<A> or sequence<V>
" cannot be disambiguated by the
bindings, sosequence<A or V>
is used in this change. Regardless, the
eventual implementation would need to validate that all in the
sequence are either A or all are V (along with the usual validation
that appended chunks or frames also appear to use the most recent
SourceBufferConfig).
Likewise, my fuzzy recollection of identifier length was fuzzy. In that same change I did run into identifier length limits, but solved that by adding the long identifier into an exception list for that check in the generator:
Caveat: The bindings generator requires help when generated union type
identifiers are too long for some platforms. This change adds a
seventh case to the existing hard-coded lists of names that need
shortening with the generator.
Preview | Diff