Proposal: ManagedMediaSource API #320
Comments
If there is no objection, @jyavenard and I would like to submit a Pull Request to the spec outlining the details of how it might work to solicit further feedback from working group members. We are also happy to provide a test suite in the form of Web Platform Tests. We have a reference implementation in WebKit if folks want to try it out. |
Given positive reactions to @jyavenard's post above, I think a PR would be welcome. I'd like to arrange to talk through the proposal in an upcoming Media WG meeting, if you'd be happy to? We're also going to need someone in the WG as a new co-editor. |
Thanks for working on this! The buffering events look great. I do have some minor concerns, but no blocking objections:
|
I can't wait to see the outcome of this new proposal. I think what's being proposed here is really what MSE is lacking most today, so I'm all for it. Closely related, but a little off topic: is this API currently deployed in WebKit, or is it just an experiment for now? The MSE API has been disabled on the iPhone for a few years, so I was wondering if the arrival of this new proposal, as well as the arrival of |
That's great to hear! Managed Media Source is enabled in Safari 17 beta on macOS and iPadOS, and is behind an experimental flag on iOS for now. A WWDC talk presenting it, "Explore media formats for the web", will be available on developer.apple.com on Thursday, June 8th. |
Anticipating that there may be a period when there are browsers or platforms that only support For instance, is the pull request that proposes to add support for I'm also wondering about the overall message down the road. Is it "This strikes a better balance than the previous version of MSE, use |
The message we want to convey is to use Managed Media Source first if available, and only fall back to MSE if that's the only option available. Any logic to decide which resolution variant is suitable to use would be common between the two as far as bandwidth management is concerned. As mentioned by @dalecurtis the |
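A minimal sketch of the "Managed Media Source first, MSE as fallback" pattern described above. The helper name `chooseMediaSource` and the `MediaSourceScope` shape are hypothetical, introduced only for illustration; the constructors are typed locally so the sketch does not depend on DOM typings that may not yet include ManagedMediaSource.

```typescript
// Minimal constructor shape, declared locally (assumption: we only need `new`).
type MediaSourceCtor = new () => object;

// Hypothetical view of the global scope: either constructor may be missing.
interface MediaSourceScope {
  ManagedMediaSource?: MediaSourceCtor;
  MediaSource?: MediaSourceCtor;
}

interface MediaSourceChoice {
  ctor: MediaSourceCtor;
  managed: boolean;
}

// Prefer ManagedMediaSource when present, fall back to MediaSource,
// and return null when neither is available.
function chooseMediaSource(scope: MediaSourceScope): MediaSourceChoice | null {
  if (typeof scope.ManagedMediaSource === "function") {
    return { ctor: scope.ManagedMediaSource, managed: true };
  }
  if (typeof scope.MediaSource === "function") {
    return { ctor: scope.MediaSource, managed: false };
  }
  return null;
}
```

In a page, the scope argument would presumably be the global object (for example `globalThis as unknown as MediaSourceScope`). Variant selection and bandwidth logic can stay shared between both paths, as the comment above suggests.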
Improved exposure of and flexibility for memory constraints would be great. Regarding browser optimization of network request timing, isn't this a general concern rather than one that is specific to streaming? The key property of streaming network requests is that they are (often) not urgent, because we have lots of buffered data, and so we're happy to trade some latency for improved overall throughput or some other benefit such as battery life. The proposed start / stop streaming events don't prevent the site from downloading, only from appending. And it could presumably happen that a network request issued during a "streaming allowed" period doesn't complete within that period, but we should still be allowed to append it. An alternative, admittedly more radical, approach would be a way to tag network requests as "non-urgent" and so eligible for delay within the browser until a time when they can be made more efficiently. Of course the browser should only delay such requests if there is some pay-off that the site could later observe. Regarding the quality hints, I think that as with the network request timing there needs to be some measurable benefit to the site to start listening to these and I am not sure what that is. Does it need to be a new class? Or could these just be discoverable extensions to the existing MediaSource? |
(Implementor hat on) @mwatson2 said:
Correct, they don't prevent the site from downloading. The current language allows a UA to prevent appending, but in our implementation experience, that wasn't actually necessary. We left the ability to block appends in the proposal should a UA decide it was necessary or desirable to implement, but that could be pulled out into a separate proposal and removed from this one.
Seems like a good idea to do whether or not we do ManagedMediaSource, and also something very outside the Media WG's bailiwick. Doing this without wreaking havoc on sites' bandwidth estimation would also be difficult. The current proposal allows UAs to incorporate buffer water-levels in their decision to fire
We discussed this with @wolenetz, and the alternative would be to pass a dictionary containing configuration modes into the |
You mean in addition to this proposition?
I will add to Jer's answers: |
When I read the comment from @mwatson2 above I vaguely remembered that such a mechanism already exists. There is a Sorry if that was obvious to you all. |
@jernoble As a site implementor I would be very concerned about the user agent making assumptions about "buffer level". The buffer state consists of both media that is appended and media that has been downloaded and not appended. When considering non-trivial playback scenarios (anything where the download is more complex than a straightforward linear sequence of media blocks) the site is managing what is downloaded and what is appended. For example, sometimes we append media "just-in-time" for playback, in which case the UA has no useful information about the true buffer level. Looked at another way, if the UA starts treating media differently based on the UA's perception of buffer level, sites are just going to optimize when they append to get to the site's idea of optimum performance. A better concept is the "urgency" of requests, i.e. some information about the actual earliest time the response might be needed. @jyavenard About the memory constraints, I meant this proposal. However, ideally, if there are memory constraints it would be nice for the site to be able to know about them in advance so we can consider them in our adaptive streaming choices. On very constrained devices we sometimes stream at a lower bitrate even when throughput is high so as to be able to store enough media to cover adaptations in future. Of course, a site can heuristically work out what the constraint is by observing the UA behavior when it comes to removals. From your description, it sounds like what the @chrisguttandin Yes, it certainly seems like a UA could defer "low" |
@mwatson2 said:
No, the fetch spec does not allow that currently. The priority is only used to prioritize fetch requests relative to each other. It doesn't appear to allow UAs to delay fetches indefinitely if conditions are not "ripe" for a low priority request. The notes from Chrome explicitly state that |
The UA has to make those assumptions in order to implement things like
It seems that by not appending data that has been downloaded and that the site intends to play, the resulting problem is one of the site's own making. It is easily avoided by just appending that downloaded data rather than saving it for a "just in time" append when the buffered level becomes critically low.
Yes, that is literally the point. :) A site that can "lie" to the UA by fetching a ton of data up front, and appending that previously downloaded data whenever they receive In the end, |
@jernoble wrote:
Yep, but I don't think anyone is going to hold back on appending data that they 100% intend to play. The use-case for "just-in-time" appending is when you are not sure until that time what media is to be played. I appreciate that an alternative is to append anyway and then replace if you change your mind, but this has its own complexities. The point is that MSE provides sites with the flexibility to compose media streams in whatever way they choose in a manner that is decoupled from the download strategy. This is very useful. UA assumptions about network requests made based on what has been appended are likely to be incorrect. I'm assuming that a possible UA algorithm would be to turn on the expensive radio when buffer levels get low and to switch to a longer on / off duty cycle the higher the buffer level. A site that wanted to game this would hold back appends to gain access to the expensive fast radio more often, optimizing for their own goal (throughput) but defeating the UA's objective to save battery life. A more enlightened site developer might share your desire to preserve battery, but in that case wouldn't it be better to give control of the duty cycle to the site, which knows more about its data needs ? |
Yes, that is the preferred pattern of use. It doesn't seem reasonable to design and implement a complicated network API because of "complexities" in how overlapping appends work. We should just address those complexities directly!
No, because the site isn't the only application on the system driving the radio, nor is the ability for a website to control the duty cycle of an expensive modem a desirable thing (IMO) for the web platform. |
We considered this, but the risk here is
|
I agree with this. However the risks here are equivalent to the streaming event approach if we require some percentage of fetched bytes to be appended. I.e., with either solution a page could use canned data to simulate the buffering levels required to get 5G if they really wanted to.
The risks here also seem the same.
Yes, I was expecting the fetch version to delay in the same way, but with added benefit of knowing the download rate so that fetches can be scheduled with transfer time in mind. I don't quite follow how you expect to be able to deliver |
@jernoble My point is that - at least - if you embed assumptions into the design - like the assumption that the site is downloading a single simple linear media sequence and will append media as soon as it is downloaded - you'd better be explicit about that assumption. So then sites that do not conform to that can avoid using the new API. But I'd prefer a solution that did not embed such an assumption because that is clearly just one specific use case - albeit a common one. The fundamental problem here is one where you want to schedule downloads to take advantage of a resource that is slow and/or expensive to enable, use and disable. As a result, we get the best results when the resource is intermittently available and fully utilized when it is available (i.e. we want to aggregate the idle times, compared to current download scheduling). This problem has very little to do with streaming media, except that media is one example where the application is (sometimes) robust to downloads being scheduled this way. Ideally, the site would simply be able to provide each download with a wallclock deadline. This would give the UA perfect knowledge of when each request was required and it could schedule in the most efficient way. What's proposed is that during the |
To be fair, this use case is the overwhelmingly most common one. Linear playback by appending chunks as soon as they are received is far and away the most common mode of operation. The solution proposed is incredibly simple, easy to specify, implement, and use for the most common use case of the API. We can debate whether a more complicated networking coalescing API (defined outside of this specification and by a completely separate working group) would help solve the remaining (and much, much less common) use cases, but I don't believe that should prevent this proposal from moving forward. Meanwhile, those use cases that don't fit neatly into the "fetch, append, throw away" mode above can... just continue on exactly as they have been with |
@dalecurtis said:
True. The UA has a great deal of latitude about both when to fire the
In our implementation, the times at which the Other UAs may make different (and more advanced!) decisions about when to schedule those events. A hypothetical browser may make note of the speed at which fetches made between the streaming events take place, and allow the buffer levels to more completely empty before triggering However, a non-goal of our implementation is to facilitate pages staying as close to the edge of the buffered range as possible, delivering data from the network "just in time" to avoid underruns. So the requirement to closely monitor and predict download speeds simply isn't present. |
@mwatson2 said:
I'm curious about this phrasing. Do you mean "avoid using ManagedMediaSource" entirely? Or just avoid listening to the The design of this |
I'm not saying the proposal shouldn't move forward. I was trying to see if there was any scope for something more flexible. I do think that the assumptions should be made explicit.
Unless I'm misunderstanding, if you use ManagedMediaSource and then ignore the startstreaming and stopstreaming events then throughput measurements are going to be a bit messed up because the site's requests will randomly fall into the 5G streaming windows or not. Or is the radio on / off behavior essentially the same with MediaSource and the difference is just whether you tell the site about it or not? In the former case, it would be good if the |
@jernoble said:
Thanks that explains a lot. I can see how this system works well enough for VOD playbacks. As you note, a live stream might have to ignore the streaming events to maintain buffering. That seems in conflict with the language around the UA being able to block appends. How do you see that being reconciled? Is stopstreaming never fired since the forward buffering level remains too low? The text around how the streaming events are to be used will need some care to ensure developers are aware that the streaming events can't function as the sole buffering mechanism during live streaming. It'll be a bit surprising to first time authors I expect; but after ten years, there aren't many non-library based players so maybe that's no big deal. |
As I mentioned upthread, we found during implementation that blocking appends was unnecessary, and I suggested that we remove that language from the proposal and track it in another issue. Something @jyavenard and I thought about earlier was an explicit signal to the UA that the client would be doing live streaming; something like |
Ah, I understand now, thanks.
This will vary from platform to platform, but on platforms with multiple radios, it's certainly possible for network speeds to vary wildly as data is routed on one or the other with different capabilities. And of course on mobile devices, users can travel in and out of coverage with varying levels of signal quality and capabilities. In my own neighborhood, I noticed that available bandwidth dropped off a cliff sometimes even when not moving, presumably as other people around me all tried to use the network simultaneously. So yes, it could cause bandwidth measurements to change as radios were activated and deactivated. But I don't believe this is a new problem, nor one that sites are unprepared to deal with.
The proposed |
There is also a flag that says 'Managed Media Source requires AirPlay Source', which is checked. May I ask the purpose of that dependency? I could not find any mention of AirPlay in this proposal. But if I overlooked it, please guide me. Besides striking a jarring note with our developers, it makes it harder to explain to our users. |
It is not part of this proposal. I explained the reason behind it in https://developer.apple.com/videos/play/wwdc2023/10122. I don't see how this has any negative effect on your users that requires an explanation; quite the opposite. In order to avoid a usability regression (today, all videos can be used with AirPlay), when using ManagedMediaSource on iPhone you need to provide an alternative video source that is compatible with remote media playback. |
I tried to use the ManagedMediaSource API on iOS 17 in the simulator, with the feature flag for the ManagedMediaSource API enabled. When I call addSourceBuffer() with any MIME type, it reports that the type is not supported and the source buffer cannot be created. isTypeSupported() also returns false every time. I tried different combinations of MIME types, with and without codecs, and it always returns false. The only MIME type that returns true is "video/webm" without any codecs. So my question is: which codecs and MIME types does iOS support, and do others have the same issue of not being able to use Managed Media Source on iOS? Or maybe I am doing something wrong when I use addSourceBuffer() and isTypeSupported(). |
You need to use iOS 17.1 beta 2, but this isn't the place to ask those questions; please use bugs.webkit.org. Thank you. |
jyavenard, thank you for the prompt reply.
Thanks, I viewed that video twice. There is no mention of the flag 'Managed Media Source requires AirPlay Source'. Perhaps you could convey the explanation here?
This proposal opened with a presentation of the new feature, an invitation to test a reference implementation, and a pointer to the Feature Flag. The 'Managed Media Source requires AirPlay Source' flag was not mentioned but is absolutely required to test the reference implementation. Probably good to fill in that part too.
Respectfully, it does. As soon as we mention AirPlay, users have a dozen questions which we cannot answer. "Is this turning off AirPlay?" "Do I need to use AirPlay?" "Is your product using AirPlay?" This is not hypothetical, we have encountered it. I believe that AirPlay is a branded Apple service, is it not? We wish to show our users that we are delivering something that uses standards and is independent of any proprietary vendor product/service.
I am suddenly feeling concerned. If we ask a user to test Managed Media Source using the reference implementation introduced here, are we changing the behavior of their iPhone vis-a-vis AirPlay? Thank you. This is not particularly relevant, but we are streaming audio, not video. |
There definitely is; you should watch again from the 20-minute mark, or read the transcript: When designing Managed MSE, we wanted to make sure that nothing was left out by accident and that users continue to get the same level of features as they did in the past. So to activate Managed MSE on Mac, iPad, and iPhone, your player must provide an AirPlay source alternative. You can still have access to Managed MSE without it, but you must explicitly disable AirPlay by calling disableRemotePlayback on your media element from the Remote Playback API. Currently, on iPhone, only plain MP4 or HLS is supported, and those inherently work with AirPlay (Apple's version of the spec's remote playback). AirPlay is a very popular (and widely used) feature; most A/V receivers support it these days. We didn't want functionality to break overnight once ManagedMediaSource became available. We hope that the solution adopted will be the former (providing an alternative source, which likely already exists since earlier iOS versions need to be supported), so that users can continue to listen or view on their preferred A/V equipment.
So in all honesty, I believe your concerns are unwarranted. In the worst case, you need to set a single attribute on your audio element for things to work as you expect. And again, this has nothing to do with this proposal, so this will be my last answer on this topic here. |
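The two options described in the comment above (provide an AirPlay-compatible alternative source, or explicitly disable remote playback) can be sketched as follows. This is an illustration under stated assumptions, not Safari's documented setup: `configureRemotePlayback` and `RemotePlaybackTarget` are hypothetical names, and the only real API relied on is the `disableRemotePlayback` attribute from the Remote Playback API.

```typescript
// Local stand-in for the one media element attribute this sketch touches.
interface RemotePlaybackTarget {
  disableRemotePlayback: boolean;
}

type RemoteStrategy =
  | { kind: "alternative-source"; url: string }
  | { kind: "remote-disabled" };

// If the player has an alternative URL that works with remote playback
// (for example an HLS playlist), report it so the caller can add it as an
// extra source. Otherwise opt out of remote playback explicitly.
function configureRemotePlayback(
  element: RemotePlaybackTarget,
  alternativeUrl?: string,
): RemoteStrategy {
  if (alternativeUrl !== undefined && alternativeUrl !== "") {
    // Hypothetical contract: the caller attaches this URL to the element.
    return { kind: "alternative-source", url: alternativeUrl };
  }
  element.disableRemotePlayback = true;
  return { kind: "remote-disabled" };
}
```

For an audio-only player with no alternative stream, this reduces to setting the single attribute mentioned above.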
jyavenard, thank you for the patient, informative response. Sorry for hijacking this channel for a moment. We will get back to the lab and finish building it into our software. We are delighted to have this proposal, BTW. More than you can imagine. |
The spec can be seen at w3c/media-source#320 Closes #5271
I've written a first draft here: https://jyavenard.github.io/media-source/media-source-respec.html |
@jyavenard - nice work. It might be useful to also extend the Examples section of https://github.com/jyavenard/media-source/tree/managed_mse to include an example showing the use of ManagedMediaSource and the onstartstreaming and onendstreaming events. |
Done |
Amended the proposal so that the two TimeRanges on BufferedChangeEvent are no longer optional. |
Hey folks, the PR is now up: Thanks again to everyone that's provided feedback and helped shape the overall design. Would love to get a second (or third!) implementer commenting on the PR before landing it - as well as potential developers. We intend to prepare some tests in parallel, so we'd love some feedback on the overall design in the meantime. |
The spec can be seen at w3c/media-source#320 Closes shaka-project#5271
Related: w3c/webtransport#522 |
@chrisn, @jyavenard, it's probably fine to close this, and to do the remaining work (if any) in follow-up. |
Thanks @padenot, agreed. We welcome new issues if anyone has points to follow up on. |
Definitions
A “managed” MediaSource is one where more control over the MediaSource and its associated objects has been given over to the User Agent.
Introduction
The explicit goal of the Media Source Extensions specification is to transfer more control over the streaming of media data from the User Agent to the application running in the page. This transfer of control can add, and has added, points of inefficiency, where the page does not have the same level of capabilities, knowledge, or even goals as the User Agent.
Examples of these inefficiencies include the management of buffer levels, the timing and amount of network access, and media variant selection. These inefficiencies have largely been immaterial on relatively powerful devices like modern general purpose computers. However, on devices with narrower capabilities, it can be difficult to achieve the same quality of playback with the MediaSource API as is possible with native playback paths provided by the User Agent.
The goal of the ManagedMediaSource API is to transfer some control back to the User Agent from the application for the purpose of increasing playback efficiency and performance, while retaining the ability for pages to control streaming of media data.
Goals
Non-Goals
Scenario
Low-memory availability
A user loads an MSE-based website on a device with a limited amount of physical memory, and no ability to swap. The user plays some content on that website, and pauses that content. The system subsequently faces memory pressure, and requires applications (including the User Agent) to purge unused memory. If not enough memory is made available, those applications (including the User Agent) may be killed by the system in order to free up enough memory to perform the operation triggering this memory pressure.
The User Agent runs a version of the “Coded Frame Eviction” algorithm, removing ranges of buffered data in order to free memory for use by the system. At the end of this algorithm, the User Agent fires a “bufferedchange” event at every SourceBuffer affected by this algorithm, allowing the web application to be notified that it may need to re-request purged media data from the server before beginning playback.
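As a rough sketch of how a page might react to “bufferedchange”, the helper below works out which evicted ranges overlap the window the player wants buffered ahead of the playhead. The names `rangesToRefetch`, `TimeRangesLike`, and the `lookahead` parameter are hypothetical; it assumes the event exposes the removed ranges as a TimeRanges-like object.

```typescript
// Local stand-in for the parts of TimeRanges this sketch needs.
interface TimeRangesLike {
  readonly length: number;
  start(index: number): number;
  end(index: number): number;
}

interface Interval {
  start: number;
  end: number;
}

// Clip each removed range to [currentTime, currentTime + lookahead] and keep
// the non-empty results: these are the spans worth requesting again.
function rangesToRefetch(
  removed: TimeRangesLike,
  currentTime: number,
  lookahead: number,
): Interval[] {
  const windowEnd = currentTime + lookahead;
  const result: Interval[] = [];
  for (let i = 0; i < removed.length; i++) {
    const start = Math.max(removed.start(i), currentTime);
    const end = Math.min(removed.end(i), windowEnd);
    if (start < end) {
      result.push({ start, end });
    }
  }
  return result;
}
```

A handler registered for “bufferedchange” on each SourceBuffer could pass the event's removed ranges and the media element's current time to this helper, then schedule downloads for whatever it returns.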
Memory availability notification
When a call to appendBuffer is rejected with a QuotaExceededError exception, the exception can indicate the number of excess bytes, or the amount of excess time, that caused the error.
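A minimal sketch of detecting that rejection on the page side. It only checks the exception's `name`; how the excess bytes or time would be exposed is not specified in this text, so the sketch deliberately reads no such field. `tryAppend` and `AppendTarget` are hypothetical names.

```typescript
// Local stand-in for the single SourceBuffer method used here.
interface AppendTarget {
  appendBuffer(data: ArrayBuffer): void;
}

type AppendResult = "appended" | "quota-exceeded";

// True when the thrown value looks like a QuotaExceededError.
function isQuotaExceeded(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { name?: unknown }).name === "QuotaExceededError"
  );
}

// Attempt the append; report a quota rejection to the caller so it can free
// space (or wait) and retry, and rethrow anything else unchanged.
function tryAppend(target: AppendTarget, segment: ArrayBuffer): AppendResult {
  try {
    target.appendBuffer(segment);
    return "appended";
  } catch (err) {
    if (isQuotaExceeded(err)) {
      return "quota-exceeded";
    }
    throw err;
  }
}
```

On "quota-exceeded" a player would typically remove already-played ranges or hold the segment back before trying again.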
Network Streaming changes
Currently, a web application is allowed to append media data into a Source Buffer at any time, up until that Source Buffer’s “buffer full flag” is set, indicating no additional data is allowed to be appended. However, a constrained device may want to coalesce network use into a small window, and allow the network to quiesce for battery and bandwidth reasons.
Alternatively, a device may have access to a high-speed network with high power use while the relevant communications interface is active (as can happen on 5G cellular). Using such a network may be beneficial in some circumstances:
To get these benefits without excessive battery drain, it's necessary to buffer more at once, and to limit streaming activity to specific windows so that the device's radio can be cycled on and off.
The User Agent would fire a “startstreaming” event at the MediaSource, indicating that the web application should begin streaming new media data. It would be up to the User Agent to determine when streaming should start, and it could take into account current buffer levels, current time, network conditions, and other networking activity on the system.
When the User Agent determines that no further media streaming should take place, it would fire a “stopstreaming” event at the MediaSource, indicating to the web application that enough media data had been buffered to allow playback to continue successfully.
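The start/stop flow above could be consumed by a page roughly as follows. This is a sketch under stated assumptions: `StreamingGate` and `StreamingEventSource` are hypothetical names, and the event names are taken from this text (a comment in the thread refers to `onendstreaming`, so the stop event's name may differ in a given draft or implementation).

```typescript
type StreamingEventName = "startstreaming" | "stopstreaming";

// Local stand-in for the event-target side of a managed MediaSource.
interface StreamingEventSource {
  addEventListener(type: StreamingEventName, listener: () => void): void;
}

// Tracks whether the User Agent currently wants the page to stream, and
// notifies the player only when that state actually changes.
class StreamingGate {
  private allowed = false;
  private readonly onChange: (allowed: boolean) => void;

  constructor(source: StreamingEventSource, onChange: (allowed: boolean) => void) {
    this.onChange = onChange;
    source.addEventListener("startstreaming", () => this.update(true));
    source.addEventListener("stopstreaming", () => this.update(false));
  }

  get isAllowed(): boolean {
    return this.allowed;
  }

  private update(next: boolean): void {
    if (next === this.allowed) {
      return; // Ignore repeated events that do not change the state.
    }
    this.allowed = next;
    this.onChange(next);
  }
}
```

A player would start its segment download loop when the callback receives `true` and let in-flight requests finish, without issuing new ones, when it receives `false`. As discussed in the thread, a live-streaming player may choose to treat this as a hint rather than a hard gate.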
Usage example
Privacy considerations
TODO: discuss potential privacy protections if multiple origins try to poke at this at the same time.
One concern is exposing the preferred quality when it is based on network conditions, such as whether the device is on cellular or Wi-Fi.
Other
To consider from MSE v2: