What’s the deal with Media Over QUIC?
- Brett Bralley, Thought Leadership Content Writer, Cisco
25 Jan 2024
In 2022, the IETF formed a working group for Media Over QUIC (MoQ)—a media delivery solution that has the potential to transform how we send and receive media during live streaming, real-time collaboration, gaming, and more.
For those of you unfamiliar with this IETF work, here’s a primer on what you need to know: from what MoQ is, to how it works, to how it can completely change the game of media delivery.
What is Media Over QUIC?
We have many media streaming services, like Netflix and YouTube, on the Internet today. We also have services, like Webex and Zoom, that allow low-latency video to flow quickly to sets of users.
“Meanwhile, the video conferencing services that have low latency can’t scale out in a cost-effective way,” he adds.
“What MoQ is trying to do is build a new protocol that allows low latency for streaming services and higher scale for real-time media conferencing. We want to bring together the best of both of these to create one set of technology that replaces these two silos of existing technology. MoQ can stream media from a user that is producing the media content up into the cloud, and then take that media from the cloud and distribute it down to many viewers.”
According to the MoQ Working Group Charter, MoQ is a “simple low-latency media delivery solution for ingest and distribution of media.” It includes a single protocol for sending and receiving high-quality media (including audio, video, and timed metadata, such as closed captions and cue points) in a way that provides ultra low latency for the end user.
That publisher/subscriber protocol supports multiple media formats, interoperability when indicating the media and media format being sent, rate adaptation strategies based on changing codec rates or chosen media encoding/qualities, and cache-friendly mechanisms.
To achieve low latency and high quality, the protocol works in conjunction with installed media relays throughout the data distribution chain. Those relays function as cache mechanisms, where copies of media are stored and then distributed as needed.
As its name indicates, MoQ is laid on top of QUIC mechanisms (QUIC streams and/or QUIC datagrams), and can be used over raw QUIC or WebTransport. All media sent and received using the protocol can also be encrypted at the transport layer using standard QUIC, and can even be end-to-end encrypted.
“MoQ is inspired by many existing design patterns,” says Jennings. “It uses much of what has been learned about real-time media from RTP, it takes the scaling ideas from HLS/DASH and HTTP CDNs, and in some ways it can be considered an application-level multicast overlay as well as an application layer Named Data Networking. It works by subscribing to named data, caching, and real-time publishing to subscribers looking for named data. This makes it a very flexible architecture for many uses on the Internet beyond just media.”
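To make the "subscribing to named data, caching, and real-time publishing" idea concrete, here is a minimal sketch in Python. It is an illustration only, not the MoQ wire protocol or any real MoQ API: the `NamedDataRelay` class, its methods, and the track names are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical illustration only: not the MoQ wire protocol or a real MoQ API.
Callback = Callable[[str, bytes], None]


class NamedDataRelay:
    """Toy relay: subscribers ask for data by name, publishers push data by name."""

    def __init__(self) -> None:
        self.subscribers: Dict[str, List[Callback]] = defaultdict(list)
        self.cache: Dict[str, List[bytes]] = defaultdict(list)

    def subscribe(self, name: str, callback: Callback) -> None:
        # Replay anything already cached for this name, then register for live data.
        for obj in self.cache[name]:
            callback(name, obj)
        self.subscribers[name].append(callback)

    def publish(self, name: str, obj: bytes) -> None:
        # Keep a copy, then push it to everyone currently subscribed to the name.
        self.cache[name].append(obj)
        for callback in self.subscribers[name]:
            callback(name, obj)


relay = NamedDataRelay()
received: List[bytes] = []

relay.publish("meeting/alice/audio", b"frame-0")   # published before anyone subscribes
relay.subscribe("meeting/alice/audio", lambda name, obj: received.append(obj))
relay.publish("meeting/alice/audio", b"frame-1")   # delivered live to the subscriber
relay.publish("meeting/bob/audio", b"other")       # different name, not delivered

print(received)  # [b'frame-0', b'frame-1']
```

The subscriber never addresses the publisher directly; it only names the data it wants, which is what lets a relay sit in between and serve cached copies.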
How does it work?
To understand how MoQ can completely transform media delivery, it’s important to take a look at how media travels now without it. When, for example, a live streamer sends content out over YouTube Live, the journey it travels from the publisher to the individual viewer looks something like this:
Data packets inevitably get lost in the commute, which results in rebuffering video and audio that cuts in and out. To alleviate this problem now, the user’s video streaming application will send another request to the video streaming service for the missing data packets.
The request has to travel all the way from the user to the CDN, then the missing data packets make the journey once again back to the user.
We know this can be a thousands-of-miles journey, and that’s what causes latency problems. What’s more, the home connection of a user might be subpar, and the delay can worsen.
That’s where the installed relays come in: MoQ allows the media to flow through multiple relays, which can help “fan out” the data to many downstream users. They also form a point for re-requesting the lost data.
The relays can be placed throughout the data transportation route, including at the CDN level, within a 5G network, and even on a user’s local Wi-Fi. (This is not a new idea: HTTP has enabled the same thing for a long time, but MoQ introduces this idea to low-latency media.)
As media travels, copies of the data are stored in these relays. When data packets are lost, instead of the user sending a request that must travel thousands of miles, it might only have to travel a few hundred to the local 5G network, or better yet, a few feet to the user’s local Wi-Fi router. MoQ can therefore enable error recovery so quickly that we don’t have to wait for the playout of data, so we experience less latency, which leads to a better user experience.
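The recovery path described above can be sketched as a chain of caching relays, where a request for a missing object is answered by the nearest relay that holds a copy. This is a simplified model under stated assumptions, not how any MoQ implementation is written: the `Relay` class, the relay names, and the hop counting are hypothetical.

```python
from typing import Dict, Optional, Tuple


class Relay:
    """Toy relay that caches objects and asks its upstream relay on a miss."""

    def __init__(self, name: str, upstream: Optional["Relay"] = None) -> None:
        self.name = name
        self.upstream = upstream
        self.cache: Dict[int, bytes] = {}

    def fetch(self, seq: int) -> Tuple[bytes, str, int]:
        """Return (data, name of the relay that had it, number of relays visited)."""
        if seq in self.cache:
            return self.cache[seq], self.name, 1
        if self.upstream is None:
            raise KeyError(f"object {seq} not found anywhere in the chain")
        data, source, hops = self.upstream.fetch(seq)
        self.cache[seq] = data  # keep a copy for the next downstream request
        return data, source, hops + 1


# Hypothetical three-level chain: CDN <- 5G edge <- home Wi-Fi relay.
cdn = Relay("cdn")
edge = Relay("5g-edge", upstream=cdn)
wifi = Relay("home-wifi", upstream=edge)
cdn.cache[7] = b"video-object-7"

_, source, hops = wifi.fetch(7)
print(source, hops)  # cdn 3  (first request travels the whole chain)

_, source, hops = wifi.fetch(7)
print(source, hops)  # home-wifi 1  (re-request is answered by the nearest relay)
```

The second lookup is the interesting one: once a copy sits in the nearby relay, a re-request for that object never leaves the local network.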
The relays also help the data efficiently scale, as they are designed to need minimal processing of the data. If many downstream clients are receiving the same media, a relay only needs to receive one copy of the media and then send it out to the many clients downstream. In some situations, this also helps reduce bandwidth and general power consumption of the Internet.
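A quick back-of-the-envelope calculation shows why receiving one copy and fanning it out matters. The sketch below is only arithmetic under assumed numbers: the function name, the 100 viewers, and the 2 Mbps stream bitrate are hypothetical and do not come from the MoQ drafts.

```python
def upstream_load_mbps(viewers: int, stream_mbps: float, local_relay: bool) -> float:
    """Bandwidth one stream needs on the link upstream of the viewers."""
    # With a relay, at most one copy crosses the upstream link;
    # without one, every viewer pulls its own copy.
    copies = min(viewers, 1) if local_relay else viewers
    return copies * stream_mbps


viewers = 100        # hypothetical number of clients behind one relay
stream_mbps = 2.0    # hypothetical bitrate of a single stream

print(upstream_load_mbps(viewers, stream_mbps, local_relay=False))  # 200.0
print(upstream_load_mbps(viewers, stream_mbps, local_relay=True))   # 2.0
```

In this toy model the upstream load stays flat no matter how many viewers sit behind the relay, while the relay's downstream side still sends one copy per client.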
What other cool things can MoQ enable?
1. As partially explained above, MoQ can enable fan-out, meaning one copy of media can be distributed hundreds, even thousands of times, instead of many copies traveling through a network.
For example, imagine a branch office at a major corporation like Cisco, where 100 people in the same building might be on the same conference call. As it is now, the content of that call has to travel 100 times, arriving at each individual user’s device.
With MoQ, a media relay installed at that local branch office would enable fan out, meaning the content from that call only travels one time. Copies are stored in the local relay, and then are distributed within the branch office as many times as necessary.
2. With MoQ, media content can be end-to-end encrypted. But even when this is the case, media relays can still access information like the priority field needed for caching, and that can reveal helpful information that allows the relays to make forwarding decisions.
So if a network is congested, the media relay can decide whether to drop or delay certain media as needed.
3. MoQ also creates a solution to another problem: different applications have different latency needs, and MoQ would enable “tunable latency,” explains Will Law, a MoQ Working Group participant and Cloud Technology Group chief architect at Akamai Technologies.
“A stack such as WebRTC is optimized for playback at the live-edge only; it is difficult to use it for near-live and VOD playback,” he says. “Segmented formats (HLS, DASH, HESP) can operate at scale in the one-to-few-seconds latency range, but not for real-time.
“MoQ is being built so that it can be used across all three latency regimes: real-time, interactive, and VOD. Far more video by volume is consumed as VOD playback (which you can conceptualize as infinite latency) than as live. Peak traffic, however, is driven by live consumption. So a format in which the latency can be tuned to the application requirements is particularly useful.”
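The second point above, a relay deciding what to drop under congestion while the media itself stays encrypted, can be sketched as follows. This is a toy model with assumptions labeled in the code: the `MediaObject` fields, the byte budget, and the convention that a lower number means higher priority are all hypothetical choices for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MediaObject:
    name: str
    priority: int    # readable by the relay; lower number = more important (assumed)
    payload: bytes   # opaque to the relay; may be end-to-end encrypted


def forward_under_congestion(
    queue: List[MediaObject], budget_bytes: int
) -> Tuple[List[MediaObject], List[MediaObject]]:
    """Forward the most important objects that fit in the budget; drop the rest."""
    forwarded: List[MediaObject] = []
    dropped: List[MediaObject] = []
    remaining = budget_bytes
    # Only the priority and the payload size are inspected, never the contents.
    for obj in sorted(queue, key=lambda o: o.priority):
        if len(obj.payload) <= remaining:
            forwarded.append(obj)
            remaining -= len(obj.payload)
        else:
            dropped.append(obj)
    return forwarded, dropped


queue = [
    MediaObject("video-enhancement", priority=2, payload=b"x" * 800),
    MediaObject("audio", priority=0, payload=b"x" * 100),
    MediaObject("video-base", priority=1, payload=b"x" * 400),
]
forwarded, dropped = forward_under_congestion(queue, budget_bytes=600)

print([o.name for o in forwarded])  # ['audio', 'video-base']
print([o.name for o in dropped])    # ['video-enhancement']
```

A real relay could delay rather than drop, or use a smarter policy; the point is only that the decision needs the unencrypted priority metadata and nothing else.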
Why MoQ? Why now?
For the past 22 years, Law has been involved with streaming media, “a period over which we have seen the rise of Internet-delivered media and the advent of HTTP delivery and adaptive bitrate streaming,” he says.
The last decade has seen a refinement in methods and protocols used, such as HTTP/3, “but progress has been incremental in terms of improving QoE, reducing rebuffering, and lowering latency.”
Law co-chairs the WebTransport working group at the World Wide Web Consortium, and he notes that WebTransport, “offers an efficient connection between a client and a server, but it requires an application to give it utility.
“It became evident to me early on in that project that the prime candidate application was video delivery, especially live video delivery. Various companies were experimenting independently with leveraging QUIC for media delivery via a transport other than HTTP.”
That’s why he joined the Birds of a Feather discussion of MoQ in 2022 and has been involved ever since.
A powerful piece of the puzzle is the fact that a single protocol can be used, he adds.
“Today media delivery is a mixture of different protocols and formats for contribution, real-time distribution, and non-real time and VOD (video on demand),” he says.
Each of those protocols comes with its own boundaries, complexities, and latencies, he notes. “Having a single format and protocol across all elements of the distribution chain (contribution, processing, distribution) can improve the efficiency, reduce the errors and conversion costs and improve the overall performance.”
Luke Curley is a software engineer at Discord as well as a MoQ Working Group participant. When he was an engineer at Twitch, he came across a need for a better user experience than current protocols allow.
“We hit a latency boundary of HLS, and WebRTC was really the only option available in the browser,” he recalls. “However, the user experience just wasn’t good enough for our use-case, so I had to search for something else.”
He started investigating QUIC and the QuicTransport origin trial, which indicated that “browser support was a real possibility,” he says. “We gambled on it, it worked great, and soon enough, we had Warp running in production. We realized that we had to advocate for these new standards (ex. WebTransport, WebCodecs) otherwise they might never see the light of day.”
Curley says MoQ goes beyond what other protocols thus far have been able to offer.
“WebRTC is a monolith that owns the entire pipeline: capturing, encoding, transmitting, receiving, decoding, rendering. If you want a Google Meets clone, then WebRTC does a fantastic job. However, if you want to make even the slightest modification, then WebRTC is too brittle. It's a very narrowly defined blackbox that doesn't allow for any experimentation or divergent use-cases.
“MoQ is a devolution, relying on individual web technologies instead. QUIC provides the networking (via WebTransport), WebCodecs provides the encoding/decoding, and the application is free to choose how to capture/render. Your application is actually responsible for the user experience, which involves more work, of course, but creates a huge opportunity.
“And critically, there's an explicit layer for relays in the form of MoQTransport. It's a generic live pub/sub framework designed for CDNs and mass fanout. It's quite a pain to build a WebRTC CDN because of how coupled it is to the conferencing use-case, while MoQ aims to offer a more generic experience more like HTTP (but live).”
How will different industries benefit?
There are two industries that will benefit greatly from MoQ: real-time collaboration apps like Webex, Zoom, Microsoft Teams, etc., and live streaming, explains Jennings.
“We have two converging problems,” he says. “Real-time collaboration apps want low latency, high quality, and a wider reach. The live-streaming industry would like to bring down latency to allow for real-time interactivity. Neither has been able to make significant progress, and they probably won’t if they continue to make incremental improvements (in other words, scaling up to 2,000 from 1,000 participants).”
MoQ is a disruptive solution that will “bring together the best of both worlds — the responsiveness of real-time voice and video conferencing and the scale and reach of larger-scale streaming — solving problems for both industries,” he adds.
“If used the right way, Media Over QUIC has the potential to yield an incredible real-time experience on a wide scale, no matter your Internet connection. It also forms the basis for highly scalable and low-latency publish/subscribe networks that can be used by many types of applications,” Jennings says.
CDNs also have the potential to reap the benefits of MoQ in a big way, explains Law.
“Today, CDNs must deploy separate networks to deliver real-time media, VOD movies, and live sports events, due to the different protocols and formats involved in delivering those content types,” he explains. “Idle machines in one network cannot serve traffic on another and inter-network capacity must be carefully managed.
“The allure of MoQ is that a single delivery network, protocol, and format can address multiple market segments concurrently. This affords lower network OPEX and better capacity management while maximizing the addressable market.”
Jennings says that even though MoQ is designed for improving media delivery, it will enable so much more and benefit other markets.
“MoQ is actually a very generic mechanism that allows us to do even more than deliver media,” he explains. “MoQ allows us to build a publisher/subscriber network across the internet that works on low latency, high fan out, and high scalability, and this can be used for lots of applications.”
Other use cases include:
- IoT: “IoT devices report tons of information up their local edge network, with different monitoring systems trying to subscribe to and monitor pieces of that information,” he says. This will make that process simpler for the systems subscribing to that media.
- Push notifications: “Many applications have the issue of when events happen, many clients have to be notified, and there isn’t a simple way to do that,” Jennings explains. “MoQ can enable an easier and simpler way to accomplish this.”
- 5G Networks: Modern 5G networks have opened up the opportunity of using servers that provide very low latency to clients. “However, there isn’t a good programming model of how to take advantage of this,” says Jennings. “In the same way CDNs gave a simple way for applications to use data centers all over the world, deploying MoQ relay nodes in the 5G edge would give applications an easy way to use the low latency capabilities of the 5G networks.”
- Text messaging protocols: “MoQ allows the most used messaging apps like WhatsApp to create a way to deliver text messages to millions of people in a very scalable way.”
What will the end-user experience with MoQ?
In a recent blog post authored by Jennings, he quotes user experience designer Vanessa Costa-Massimo on the psychological impacts of latency and how it can impede hybrid work:
“When we experience latency [on a real-time video conferencing app], our first instinct is to blame not a faulty network connection, but the person we’re talking to,” she says. “We assume that if it takes a couple seconds for someone to respond, they’re not paying attention, or they’re lazy, rude, or some other negative quality, before we realize it could actually be a latency issue.”
This indicates that collaboration is most certainly affected by latency, Jennings says.
“MoQ is a powerful solution in this way. If we can get rid of this hindrance to meaningful collaboration and trust in one another, we can enable end-users to accomplish their best work together.”
Realistically, most end-users won’t know they’re using MoQ at all, except that they’ll be happier with their experiences, from live streaming to gaming to video-conferencing calls, says Law.
“The old cliche of hearing your sports event on your neighbor’s OTA broadcast ahead of your OTT broadcast will have inverted, with OTT becoming significantly faster than any cable, satellite, or over-the-air based distribution channel.”
Where is MoQ now? What’s next?
Now, the group can shift their focus to the media streaming formats that will ride on top of it, he says.
The group will hold an interim meeting Feb. 6–8 in the United States. The on-site portion of the hybrid meeting will be hosted by Comcast at their offices in Denver, Colorado. The meeting includes a one-day hackathon/interop on Feb. 6, followed by two days of interim meetings on Feb. 7–8. Those interested in participating can learn more here.
The mailing list is also open and active (click here to subscribe), along with GitHub issues. Anyone with interest in the future of Internet-delivered media is encouraged to review the Internet-Drafts currently being considered by the MoQ working group, and to get involved.