At the heart of MSE is the MediaSource object. This object is created by the application and attached to the media element. Its purpose is to provide the media data for playback as requested by the media element.
The MediaSource object maintains a collection of SourceBuffers. These are the interface through which the application appends media data to the source, and they provide methods to insert, remove and manage that data. Each one is essentially an abstraction of a timeline: media data can be appended to the buffer at the position given by its playback timestamp, or it can be appended sequentially, ignoring timestamps. The latter mode enables unrelated media to be spliced together, which allows uses such as advert insertion or even video editing in the browser.
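The difference between the two append modes can be illustrated with a small model of the timeline. This is a simplified sketch, not the specification's algorithm: the mode names mirror the "segments" and "sequence" values of the SourceBuffer `mode` attribute, while `Segment`, `Placement` and `placeSegments` are hypothetical names, and the assumption that a sequential timeline starts at zero is made only for illustration.

```typescript
// Simplified model of where appended media lands on a SourceBuffer timeline.
// "segments": each segment is placed at its own timestamp.
// "sequence": each segment is placed directly after the previous one,
//             and its own timestamp is ignored.
type AppendMode = "segments" | "sequence";

interface Segment {
  timestamp: number; // start time carried in the media, in seconds
  duration: number;  // length of the segment, in seconds
}

interface Placement {
  start: number;
  end: number;
}

function placeSegments(segments: Segment[], mode: AppendMode): Placement[] {
  const placed: Placement[] = [];
  let nextStart = 0; // assumption: a sequential timeline begins at zero
  for (const segment of segments) {
    const start = mode === "segments" ? segment.timestamp : nextStart;
    const end = start + segment.duration;
    placed.push({ start, end });
    nextStart = end;
  }
  return placed;
}
```

For example, an advert whose own timestamps start at zero would collide with the opening of the programme when placed by timestamp, but lands after the programme content when appended sequentially.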
The application handles the requesting of media data from the server and appends the response to the SourceBuffer. Decoupling the fetching of media data from playback allows the media data to be sourced using novel transport mechanisms or from different locations.
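One way the application might feed fetched data into a SourceBuffer is through a small queue, since a SourceBuffer accepts only one append at a time and signals completion with an `updateend` event. The sketch below is illustrative only: `SourceBufferLike` and `AppendQueue` are hypothetical names, and the interface lists just the members the sketch relies on (`updating`, `appendBuffer` and `addEventListener`). How the segments are fetched is left to the caller, which is the decoupling described above.

```typescript
// Minimal subset of the SourceBuffer interface used by this sketch.
interface SourceBufferLike {
  readonly updating: boolean;
  appendBuffer(data: ArrayBuffer): void;
  addEventListener(type: "updateend", listener: () => void): void;
}

// Queues segments and appends them one at a time, waiting for the
// buffer to finish each append before starting the next.
class AppendQueue {
  private pending: ArrayBuffer[] = [];
  private buffer: SourceBufferLike;

  constructor(buffer: SourceBufferLike) {
    this.buffer = buffer;
    // When an append completes, try to start the next one.
    buffer.addEventListener("updateend", () => this.flush());
  }

  // Called by the application whenever a segment arrives,
  // from whatever transport or location it was fetched.
  push(segment: ArrayBuffer): void {
    this.pending.push(segment);
    this.flush();
  }

  get length(): number {
    return this.pending.length;
  }

  private flush(): void {
    if (this.buffer.updating) return; // an append is still in progress
    const next = this.pending.shift();
    if (next !== undefined) this.buffer.appendBuffer(next);
  }
}
```

Because the queue only sees byte buffers, the same code would serve segments fetched over HTTP, received over a WebSocket or read from local storage.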
SourceBuffers can contain audio, video or timed text and an instance is created for each stream that needs to be presented. Typically there might be one video stream, one audio stream and perhaps a subtitle stream. Since each media type is handled separately, access services such as audio description or subtitling can be selected simply by requesting a different stream.
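Creating one SourceBuffer per stream might look like the following sketch. `MediaSourceLike`, `StreamSpec`, `createBuffers` and `exampleStreams` are hypothetical names introduced for illustration; the interface covers only `readyState` and `addSourceBuffer`, and the MIME types shown are example values rather than anything the text prescribes.

```typescript
// Minimal subset of the MediaSource interface used by this sketch.
// B stands for whatever buffer type addSourceBuffer returns.
interface MediaSourceLike<B> {
  readonly readyState: string; // "closed", "open" or "ended"
  addSourceBuffer(mimeType: string): B;
}

interface StreamSpec {
  kind: string;     // label chosen by the application, e.g. "video"
  mimeType: string; // container and codec of the stream
}

// Creates one buffer per stream and returns them keyed by kind.
function createBuffers<B>(
  mediaSource: MediaSourceLike<B>,
  streams: StreamSpec[],
): Map<string, B> {
  // Buffers can only be added while the source is open.
  if (mediaSource.readyState !== "open") {
    throw new Error("MediaSource must be open before adding SourceBuffers");
  }
  const buffers = new Map<string, B>();
  for (const stream of streams) {
    buffers.set(stream.kind, mediaSource.addSourceBuffer(stream.mimeType));
  }
  return buffers;
}

// Example stream set (assumed values): one video and one audio stream.
const exampleStreams: StreamSpec[] = [
  { kind: "video", mimeType: 'video/mp4; codecs="avc1.64001f"' },
  { kind: "audio", mimeType: 'audio/mp4; codecs="mp4a.40.2"' },
];
```

In a browser, the application would typically wait for the MediaSource's `sourceopen` event before calling such a function. Selecting an audio description track then amounts to fetching segments from a different audio stream and appending them to the "audio" buffer.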
Finally, the specification also includes extensions to the HTMLVideoElement that allow video decode and rendering performance to be measured. These measurements could be used to help decide the most appropriate video stream to present if a number of options are available.
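As a rough illustration of how such measurements could drive stream selection, the sketch below compares two samples of the frame counters that browsers expose through `getVideoPlaybackQuality()` (`totalVideoFrames` and `droppedVideoFrames`) and steps down one rendition when too many frames are being dropped. The function names, the 10% threshold and the idea of a quality ladder indexed from lowest to highest are all assumptions made for this example.

```typescript
// The two counters this sketch reads from a playback quality sample.
interface PlaybackQualitySample {
  totalVideoFrames: number;
  droppedVideoFrames: number;
}

// Fraction of frames dropped between two samples (0 if no new frames).
function droppedFrameRatio(
  previous: PlaybackQualitySample,
  current: PlaybackQualitySample,
): number {
  const total = current.totalVideoFrames - previous.totalVideoFrames;
  if (total <= 0) return 0;
  const dropped = current.droppedVideoFrames - previous.droppedVideoFrames;
  return dropped / total;
}

// Renditions are assumed to be ordered from lowest (index 0) to highest
// quality. Steps down one level when the device is struggling to decode
// or render; otherwise keeps the current rendition.
function nextRenditionIndex(
  currentIndex: number,
  ratio: number,
  maxDroppedRatio: number = 0.1, // assumed threshold, tune per application
): number {
  if (ratio > maxDroppedRatio && currentIndex > 0) {
    return currentIndex - 1;
  }
  return currentIndex;
}
```

An application might sample the counters every few seconds and combine this signal with its bandwidth estimate before switching streams.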
An additional benefit of not hardcoding features into the browser is that functionality upgrades, such as improved adaptive algorithms or defect fixes, only require updating the JavaScript application. The application is fetched afresh each time the page is loaded, so users do not need to upgrade their browsers. Software updates to the browser itself might be fairly easy on a PC, but they happen infrequently on a smart TV or set-top box.