You have a backup stream. You haven’t checked it in six months, the credentials are probably expired, and the person who set it up left the company. The one time live went down it wouldn’t have made a difference as the failure was elsewhere.
What took you down usually wasn’t the thing you prepared for. Most backup streams are designed for one scenario: the encoder dies, the switch flips, life goes on. That’s the clean failure. It’s also the rare one. What actually happens in production is messier, slower, and harder to call. The stream is alive but broken. The dashboard is green. The audience is already gone.
The mentality
…we probably don’t need it
The primary never failed. Not last week, not last month, not at the big event in March. The encoder is solid. The CDN is enterprise-grade. The team knows what they’re doing.
Hard to argue with. It’s not laziness. It’s a reasonable reading of a track record that hasn’t included a failure yet.
The economics
…it’s not worth the extra money/effort
The backup stream has a cost. Second encoder, second ingest, second CDN, second everything if you do it properly. That’s a real line item on a budget that already has competing priorities.
The outage has a cost too. But it’s theoretical. It might never happen. And if it does, maybe it’s a short one, maybe viewers come back, maybe the client doesn’t notice. 😈
This is the math that kills backup streams. Not negligence. A rational calculation made by people who have never experienced the failure they’re deciding not to protect against.
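To make that calculation less abstract, here is a minimal sketch of the comparison. Every number and name in it is a made-up assumption (outage frequency, cost per minute, how much downtime a backup leaves you with); plug in your own.

```python
def expected_outage_cost(outages_per_year, minutes_per_outage, cost_per_minute):
    """Expected yearly loss from outages, in the currency of cost_per_minute."""
    return outages_per_year * minutes_per_outage * cost_per_minute


def backup_pays_off(backup_cost_per_year, outages_per_year, minutes_per_outage,
                    cost_per_minute, minutes_with_backup=1.0):
    """True when paying for the backup is cheaper than eating the outages.

    minutes_with_backup is an assumption: the downtime viewers still see
    per incident once a working backup path exists.
    """
    without_backup = expected_outage_cost(
        outages_per_year, minutes_per_outage, cost_per_minute)
    residual = expected_outage_cost(
        outages_per_year, min(minutes_with_backup, minutes_per_outage),
        cost_per_minute)
    return backup_cost_per_year + residual < without_backup


# Hypothetical event business: one 30-minute outage every two years,
# 2000 per minute of lost air time, 12000 per year for the backup path.
print(backup_pays_off(12000, 0.5, 30, 2000))
```

The catch is the first argument of the second line: people who have never lived through the outage tend to enter zero for its frequency.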
Single points of failure
…the backup failed too
The backup encoder is on the same switch as the primary? The same venue WiFi? Pushing to the same ingest endpoint, just with a different stream key? When the network goes down, both go down. When the ingest malfunctions, both fail. The backup existed. It was wired to the same failure.
A backup that fails the same way as the primary isn’t a backup. But…
True separation at every layer (encoder, network, ingest, origin, CDN) is complex to build and expensive to run. Most operations can’t justify it and don’t need it.
The gaps are rarely invisible. Most engineers can tell you where the stack is fragile. The backup just never got prioritized over everything else that needed building.
Operational readiness
…somebody knows how to switch, I think
The backup stream is configured. The runbook exists somewhere. The vendor console has a button for it. But the one person who knows how it works is not on call tonight. The doc is three product iterations old. Nobody has actually switched to the backup since the initial test months ago. The failure doesn’t wait for the right person to be available.
Automated failover is worse in a sneaky way. You stop thinking about it entirely. The script will handle it. Except the detection logic was written for a clean failure, not a half-dead stream that’s technically alive. The threshold never triggers. The switch never happens. And the humans who could override it manually are asleep, unreachable, or no longer with the company.
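A sketch of detection logic that looks beyond the clean failure might go like this. The `Sample` fields, the window size and the bitrate floor are all assumptions for illustration; your monitoring will expose different signals.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    connected: bool       # encoder connection is up
    newest_segment: int   # highest media sequence number seen in the manifest
    bitrate_kbps: float   # measured ingest bitrate


def should_fail_over(samples, window=3, min_bitrate_kbps=500.0):
    """Decide on a switch from periodic health samples, oldest first."""
    if not samples:
        return False
    if not samples[-1].connected:
        return True  # the clean failure: the encoder is gone
    recent = samples[-window:]
    if len(recent) < window:
        return False  # not enough history to call a half-dead stream
    # Half-dead case 1: connected, but the manifest stopped advancing.
    stalled = recent[-1].newest_segment == recent[0].newest_segment
    # Half-dead case 2: connected, but barely any data is coming through.
    starved = all(s.bitrate_kbps < min_bitrate_kbps for s in recent)
    return stalled or starved


frozen = [Sample(True, 42, 3000.0)] * 3
print(should_fail_over(frozen))  # True: alive on paper, stalled in practice
```

Checking only `connected` is the version that never triggers on the stream that is technically alive.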
What does backup even mean
…we have redundancy
Backup encoder? Backup ingest? Backup origin? Backup CDN? Backup player URL? Each one protects against a different failure. Most teams pick one, check the box, and call it redundancy.
A backup encoder does nothing when the CDN degrades. A backup CDN does nothing when the origin serves a stale manifest. A backup ingest does nothing when the venue network goes down. The stack has layers and each layer can fail on its own.
Knowing which layer to back up requires knowing which layer is most likely to fail for your specific setup. That answer is different for a stadium broadcast, a remote satellite uplink, and a cloud-only pipeline. There is no universal answer. There is just the layer you skipped.
Select your redundancy plan
I’ve been wanting to lay this out as a diagram for a while. Clients always assume backup is a yes/no question. It’s not. Here are your options.
0. No real backup
Cheap until it is not. One failure anywhere takes the stream down. No recovery path, no fallback. A deliberate risk acceptance, not a resilience strategy.
1. Restart-and-recover
Someone notices, someone restarts. No parallel path, just a documented procedure and an operator who knows what to do. About five minutes of downtime if everything goes right. Longer if the right person is not reachable.
2. Fire-exit backup
A separate, deliberately degraded path. Different encoder, different network, different ingest endpoint. Not a mirror – lower quality, simpler setup – but genuinely independent. When the primary goes down, viewers stay on air within about a minute. This is the minimum viable backup for anything with a real audience.
3. Mirrored backup
A full-quality duplicate of the primary path. Separate encoder, separate ingest, separate origin. The CDN is typically still shared, which remains a single point. Recovery is faster and the experience is seamless, but the cost is substantially higher.
4. Full active-active
Both paths running simultaneously. Automated failover routes to the best available path. Viewers never see a switch because there is no switch – continuity is built into the architecture. The right choice when downtime is not an option and the budget reflects that.
Let’s gamify it 🙂
Of all these, the economics hurt the most, and we know it.
This tool gives you a hint of what option you should deploy depending on what you’re doing and how much it costs to fail. Source is here.
You’re at that point where your single-server or clustered streaming setup just can’t keep up with the spikes and you know for a fact that you need a CDN, kudos for getting this far. Or maybe you’re already using one but wondering if perhaps you’ve missed out on a better deal from the other guys.
Navigating public price offerings can be challenging. Between the assortment of parameters (ingress, egress, requests, transfer), hidden fees, offers too good to be true, the temptation to give in to long term commitments etc., it’s quite tough to make a decision.
This piece will focus on comparing the true pay-as-you-go CDN vendors with public and transparent pricing models. By ‘true pay as you go’ I mean the flexibility to pay zero if you don’t use the service at all in a certain month. That’s important if you’re broadcasting occasionally (festivals, seasonal sporting events), or you just don’t know if your business will still be alive and kicking in a few months from now.
And here it gets graphed, comparing the few mainstream providers. Just adjust the checks and sliders to match your use case and make your pick, today.
Lessons learned
Stick with the big names
Unless you know your game really well that is. Smaller players will do their best to lure you into admittedly appealing deals, yet most of these will be either
(A) resellers – nothing wrong with that per se, except there may be a better offer from the very CDN they’re reselling; that’s not always the case, as they can negotiate better pricing than you ever could by leveraging big volumes and upfront commitments
(B) maintaining their own infrastructure – nothing wrong with that either, yet do expect inferior throughput due to the reduced footprint and peering capabilities; also they may run out of capacity when you need it most – at peak; sadly, the cards were dealt in this industry more than a decade ago, and there’s no way to stand up to the giants unless you’re a giant yourself
(C) hybrid – relying on both own and 3rd party infrastructure and trying to make the best of each; that’s admirable, still… they need to walk a fine line prioritizing either quality or profit, as it’ll be very tempting to try max out the (usually) inferior inbuilt capacity before racking up the upstream bill
(D) tricksters – still sharing traits with one of the above categories, yet at the very dishonest end of the scale; expect generally poor quality for the buck, slowdowns, interruptions, being throttled in favor of other customers, untruthful traffic measurements
Not meaning to scare you off, and there are surely exceptions; you’ll be able to find gold if you take the time to dig, especially among the local providers.
Always have a backup
There are many ways a network can fail, get saturated, or otherwise work against your best interest. Be prepared to switch or offload to some other vendor, even if it’s more expensive. There’s no good excuse for not being able to deliver the service you promised, and the chance to earn back the trust after a big fail may not come easy, if at all.
Where you deliver matters
While North America and Europe are well covered, providing fast connectivity elsewhere is often not straightforward.
The geographical regions discussed are those commonly offered by all suppliers. Yet some will cater distinctly to destinations like Africa, Middle East, India, Japan etc. You need to do more in-depth research if you focus on any of those, and look into local dealers too.
What goes into the graphs and what doesn’t?
The per-GB egress price – this makes the bulk of the pricing. Varies between $0.02 and $0.466, depending on region and consumption (i.e. the more you use the less you pay per unit)
The per-request price – varies between $0.6 and $2.2 per million HTTPS requests depending on region and offering. HTTP requests are cheaper with AWS but have not been considered here.
The per-GB ingress price (aka cache fill), where applicable – varies between $0.01 and $0.04 depending on ingress and egress location, only applied to google’s offering
The somewhat hidden $0.075 per hour for google’s ‘forwarding rule’ – a must-have paid-for link in their CDN chain.
The licensing price for a month in the case of Wowza
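Put together, the components above add up roughly as in the sketch below. The function and its parameter names are mine, and the example prices are placeholders picked from within the ranges quoted above, not any vendor's actual rate card. Volume tiers are not modeled.

```python
def monthly_cdn_cost(egress_gb, https_requests, egress_price_per_gb,
                     price_per_million_requests, ingress_gb=0.0,
                     ingress_price_per_gb=0.0, hourly_fee=0.0,
                     hours_active=730, license_fee=0.0):
    """Rough monthly bill from flat per-unit prices (no volume tiers)."""
    egress = egress_gb * egress_price_per_gb
    requests = https_requests / 1_000_000 * price_per_million_requests
    ingress = ingress_gb * ingress_price_per_gb  # cache fill, where charged
    fixed = hourly_fee * hours_active + license_fee
    return egress + requests + ingress + fixed


# Hypothetical month: 10 TB out, 50M HTTPS requests, an hourly fee of
# 0.075 left running all month (about 730 hours).
cost = monthly_cdn_cost(10_000, 50_000_000, 0.08, 1.0, hourly_fee=0.075)
print(round(cost, 2))
```

Note how the hourly fee keeps ticking even in a month with zero traffic, which is exactly what ‘true pay as you go’ is supposed to rule out.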
What about Akamai, CloudFlare, Comcast, Fastly, Level3, LimeLight etc
They don’t have public pricing, so we won’t discuss them here. Also some will require commitments and/or longer term contracts to have you as a customer. That does make sense for a company that offers this as its main service, as they need to rely on a somewhat predictable income to invest in capacity.
Do realize that there’s reselling (explicit or not) even at the highest levels, Azure and Wowza among them.
Can I use more than one?
Most certainly can, and you should if it’s feasible. Also know how much you’re paying each for exactly what, and over time use the information as leverage for a possible better deal. And stay on the lookout, the market continuously evolves and all this may be obsolete in a few months.
Since we last brought up the topic, the industry has evolved a bit. Most of the big live streaming and social media players now routinely stream at under 5 seconds end-to-end latency, and your modest platform may be laughed at or lose business if still relying on the good old HLS/DASH and its inbuilt huge delays.
The technological background hasn’t changed much, yet the emergence of ‘cord cutting’ has put the spotlight on the annoyingly big delays and pushed OTT providers to adapt and innovate. Where it could, LL-DASH has been implemented with relative success, Periscope’s LHLS has had (and still has) its own success stories, and eventually Apple had to step into the game and put together its own LL-HLS, which is already a published standard and deployed in the latest iOS.
As we speak, there are a few factors at play that may set back your roadmap to low latency
Support for proprietary WebSocket based streaming is going away, most notable possibly being Wowza’s announcement to discontinue its ‘ultra low latency’ thing; it makes sense in light of the market-driven evolution of alternatives and the fact that this was a stand-in solution from the get-go, with obvious drawbacks
WebRTC is not yet a grownup; while it has been standardized and has taken a giant step forward since becoming available in Safari, remarkable implementations are taking a while
Player and server support of LL-HLS is still limited to commercial products
LL-DASH support is still not ubiquitous
The treat
To the rescue, a friendly wrapped POC solution based on the rather amazing open source OvenMediaEngine. It supports both WebRTC and LL-DASH egress from a RTMP source, amongst other cool stuff.
The WebRTC output lets you stream with sub-second latencies (!), and the LL-DASH can be configured to use a playback buffer of 1 second or less.
As long as you can deploy/make use of reverse proxies that support chunked-transfer, scaling is a breeze. Nginx can do it, as do most CDNs – go for it.
The larger shortcoming of WebRTC is that it’s been designed for peer-to-peer and one-to-one; twisting it to support one-to-many means impersonating multiple one-to-one endpoints, each mildly resource consuming, to the point where it’ll choke any one server.
Capabilities will largely vary depending on actual hardware, and stream characteristics. Consider just 200 viewers per cpu core when budgeting, any betterment will make your customer happy.
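As a back-of-the-envelope helper for that budgeting, something like the following would do. The 200 viewers per core comes from the rule of thumb above; the 20% headroom is my own assumption.

```python
def cores_needed(peak_viewers, viewers_per_core=200, headroom_percent=20):
    """CPU cores to budget for a WebRTC audience, rounded up."""
    padded = peak_viewers * (100 + headroom_percent)
    per_core = 100 * viewers_per_core
    return -(-padded // per_core)  # integer ceiling division


print(cores_needed(5000))  # 30
```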
There’s also the hot topic of transcoding. While AVC is (at long last) ubiquitous in WebRTC, you’ll need to transcode the audio to Opus. That’s surely a breeze for any CPU but it won’t scale, so the number of streams you can run on a server is limited.
Is it worth it?
If you absolutely need cheap/free low latency, it is.
The biggest conundrum is that DASH won’t work on iOS and WebRTC is harder/more expensive to scale. May I suggest you use both (have iOS users play the WebRTC feed) and see where your scalability needs take you. Provided you’re running a small/medium platform or just starting up, the odds are you’re better off this way than giving in to commercial offerings.
What about LL-HLS?
In OvenMedia, it’s reportedly in the works and may be available soon. In general, it may still be a while before we see it thriving. Partly due to its initial intent to mandate HTTP/2, the industry has been slow to adopt it, and the couple of implementations I’ve seen still get laggy even with near-perfect networking and encoding setups.
Adaptive bitrate anyone?
Not supported with this product but it may soon be.
Let me point out though that the two (ABR and extremely low latency) don’t go particularly well together. Think that
The need to transcode for ABR will add to the latency
Determining network capabilities and switching between ABR renditions is way tougher to properly plan ahead and execute given sub-second delays and buffers
In the big picture, you’re trading every second of latency for quality of experience or cost. Please don’t make it a whim and seriously assess how bad and how low you absolutely need it. Delivering near-instant high quality uninterrupted video over the open internet requires sophisticated/expensive tech, and even the most state of the art won’t deliver flawlessly to all.
You’ve seen it around on big guys’ platforms. But when trying to put together your own you may have hit the price, maintenance, or scalability wall.
The following solution is no magic fix to these and surely not special, yet it may help you understand the pitfalls and tell apart the clues ahead of time.
How it works
As simple as you’d imagine. Each viewer announces its presence to a central authority, let’s call that the counter. As soon as a new presence is announced, all the viewers are notified that the audience count has increased. Likewise, as soon as any viewer disengages, the others get notified that the count has decreased.
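Stripped of any networking, the bookkeeping could look like the sketch below. It is not the packaged solution mentioned further down; `ViewerCounter` and the `notify` callback are hypothetical names, and in a real deployment `notify` would write to that viewer's open connection.

```python
class ViewerCounter:
    """Tracks present viewers and pushes every new count to all of them."""

    def __init__(self):
        self._notify = {}  # viewer id -> callable that delivers a count

    def join(self, viewer_id, notify):
        self._notify[viewer_id] = notify
        self._broadcast()

    def leave(self, viewer_id):
        # Ignore viewers we never saw, so a duplicate leave changes nothing.
        if self._notify.pop(viewer_id, None) is not None:
            self._broadcast()

    def _broadcast(self):
        count = len(self._notify)
        for notify in list(self._notify.values()):
            notify(count)


counter = ViewerCounter()
seen_by_a, seen_by_b = [], []
counter.join("a", seen_by_a.append)
counter.join("b", seen_by_b.append)
counter.leave("a")
print(seen_by_a, seen_by_b)  # [1, 2] [2, 1]
```

Every join or leave touches every remaining viewer, which is worth keeping in mind when the audience grows.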
Persistent connection
To facilitate instant updates, a continuously open connection is required between any one viewer and the centralized counter. Having the former just ask around for the number every once in a while (i.e. polling) is still an option but won’t be nearly as smooth or fast.
Sockets
Such connectivity can generally be accomplished by means of sockets. Long story short, a socket is a kind of nearly-instant bi-directional data channel between 2 network-connected devices.
WebSockets
Most apps can freely create and use regular sockets, but code running in the restrictive context of an internet browser cannot. Special abstractions had to be devised to bring socket-like functionality to the browser, of which the WebSocket has surfaced and is finally widely supported.
The server
The so called counter is merely a piece of software; it has to reside on a publicly accessible, always-on computer or device that oldsters like to call a server; while it does a lot of things, a server’s main job is to ‘serve’ common needs of various other (not necessarily so public or available) devices, generally referred to as ‘clients’.
The ready to use solution
Is here for grabs. Variations have been implemented on multiple platforms and it’s stood the test of time.
How cheap is it?
You’ll just be paying for the ‘server/s’. Unless you can run the counter on one of your existing computers [remember it needs to be public] or take advantage of some cloud’s free tier (in which case it’s free, as in beer).
In production, consider budgeting $1 per 1k simultaneous viewers per month, prorated.
Does it scale?
Not without headaches.
The boxed solution is well optimized and proven to accommodate some 10k viewers when running onto the smallest available cloud instance (with just 0.5 GB of RAM!)
It can stretch to take up to maybe 4-5 times that much on a single computer, but the truly scalable setup takes an autoscaled cluster of ‘servers’. Not too complex really; it has been done repeatedly, and I hope to get the time to dust one off and make it public soon.
Is it stable/reliable?
Up to a point…
You’ll see it hogging the host’s CPU way before it starts being laggy to your viewers. Run it on a more powerful computer next time you expect a similar or larger audience
Memory use hasn’t been a real concern in any of the implementations
If noticing a rather constant limit your counter never goes above, your setup (either the software, OS or NIC) may be running out of sockets it can simultaneously keep open. There are many ways to mitigate that, details vary with specifics of the environment
Before WebSockets were ubiquitous, long-polling (and at times its creepier cousin short-polling) was the norm for setting up persistency in the browser; these put a more severe burden on the server and are safe to be avoided, at long last; don’t give in to the likes of socket.io unless you really know what you’re doing
Is it secure?
In the example it’s not. Meaning that if one wanted to impersonate extra viewers into your pool they could easily do so. They could also go the extra (DoS) mile and try to bring the ‘counter’ and the server hosting it to their knees, by impersonating a jolly bunch of extra viewers.
Not to say it can’t be made safe. CORS and SSL are the first things to consider. Also some simple way to limit rate and payload size.
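For the rate and payload part, a per-connection token bucket is one common option. This is a generic sketch, not what the example ships with; the limits and the `accept_message` helper are assumptions.

```python
import time


class TokenBucket:
    """Allows `burst` messages at once, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.burst = burst
        self.clock = clock  # injectable so the bucket can be tested
        self.tokens = float(burst)
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


MAX_PAYLOAD_BYTES = 256  # assumed cap; a presence message is tiny


def accept_message(bucket, payload):
    """Drop oversized payloads first, then apply the per-connection rate."""
    return len(payload) <= MAX_PAYLOAD_BYTES and bucket.allow()
```

Keep one bucket per connection (or per IP), and close the connection when it keeps getting refused.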
Next up, any extra validation, authentication, tokenization etc. will take a slight toll on the server resources, multiplied by one of the numbers above. So be wary and benchmark each addition.
Is it fast?
Yesss! As fast as you’d expect an update to propagate over the internet these days, at half the speed of light if lucky.
Sounds like a simple task, why is it so hard to scale?
Consider the following scenario: 100 average viewers, each coming in or out every 10 minutes. That’s 10 updates per minute, to propagate to all of the 100, for a total of 1000 updates per minute.
Now for 1000 average viewers, each also coming in or out every 10 minutes. That’s 100 updates per minute, to propagate to all of the 1000, for a total of 100k updates per minute.
Take that for 10k average viewers, it’ll be 10M (!) updates per minute. And that’s just averages, real life will show you that the audience tends to flock in the beginning and key moments of an event.
Ok, there are tricks to smooth out the treacherous quadratic growth there, and you know one of them already. Display 1.7k viewers instead of 1745 viewers. That’s a hundred-fold reduction in the number of updates, out of the box! And there’s more to be done of course.
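The arithmetic above, plus the rounding trick, in a few lines. The function names are mine; the 10-minute figure is the one from the scenario.

```python
def fanout_per_minute(viewers, minutes_between_changes=10):
    """Messages per minute if every join/leave is pushed to every viewer."""
    changes_per_minute = viewers // minutes_between_changes
    return changes_per_minute * viewers


def display_count(n):
    """Coarse label for large audiences, e.g. 1745 becomes '1.7k'."""
    if n < 1000:
        return str(n)
    return f"{n / 1000:.1f}k"


for viewers in (100, 1000, 10_000):
    print(viewers, fanout_per_minute(viewers))

print(display_count(1745))  # 1.7k
```

The saving comes from broadcasting only when the label changes, so 1745 becoming 1746 costs nothing.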
There’s a shred of misunderstanding, to say the least, when it comes to grasping and facing the codec licensing topic. The general perception is that if you’re just starting out you don’t need to worry about it; the warning here is that it may creep up on you as you grow, depending on how you put that codec to good use and especially how you monetize it. Let’s start with the basics though.
Intellectual Property
Many video compression techniques included in a codec are patented inventions. To use the codec, you’d have to license the patents from their creator or representative. Fair enough, except we are talking about a few thousand patents from a few dozen companies.
To simplify licensing, patent holders ‘pooled’ the patents through organizations that license them collectively on behalf of their members.
While there’s more than a single pool, and some patents are unaffiliated, it is commonly agreed that you only need to reach out to MPEG-LA to license H.264 (aka AVC), while in the case of H.265 (aka HEVC) you need to pay at least the 3 big pools (MPEG-LA, HEVC Advance, Velos), of which the latter does not even publicly disclose prices.
Known pricing
Terms under which a license is sold are rather complex and highly nuanced. Cost will vary depending on the context in which the respective codec is being used, the volume, and the revenue you may derive from it.
Very much notable, some use cases bear no cost, while others carry a generous entry level threshold. Nevertheless, do pay attention, and let’s take these one by one.
Per-Device
Applies to smartphones, tablets, digital and smart TVs, computers, video players and anything with a hardware encoder or decoder of the respective codec. Royalties are owed by the device supplier and not by the encoding/decoding chip or module manufacturer.
Also applies to software products that include an encoder or decoder. The royalty is owed by the product vendor/distributor, whether the product itself is commercial or free. Notable exception: free products (truly free, like Firefox) may include the OpenH264 binary, in which case royalties will be generously covered by Cisco.
Per-Title PPV
These include platforms that sell access to content on a per-title basis. Royalties are either (A) a fixed value per sale or (B) a percentage of sales to end-users, in some cases the lesser of the two. Note that titles (i.e. videos) of 12 minutes or less are exempt from such royalties.
Subscription-Based PPV
Royalties apply to subscription platforms like Netflix and vary depending on codec and number of subscribers. There’s a zero cost entry level for AVC if one has less than 100K subscribers.
Free Television Broadcast
Applies to terrestrial, cable and satellite broadcasters, with pricing per encoder or size of the audience
Free Internet Broadcast
You owe no royalties if encoding content to be distributed for free over the internet.
Real world (small) business models, and how much they may owe
Mobile apps
We’re obviously talking about mobile apps that either play or broadcast/manipulate video through either one of these codecs.
If you rely on a hardware or OS exposed encoder/decoder to do the job, you don’t owe anybody anything, godspeed!
If you include a software encoder or decoder in your app, you fit into the ‘Per-Device’ category. For AVC you don’t pay anything until you reach 100K units (i.e. actively installed apps).
For HEVC, you’ll be paying from the ground up, think $1.5 to $4 per unit.
Streaming platforms
You owe royalties if you distribute AVC or HEVC encoded content, unless it’s free as in YouTube.
A TVOD platform (or the live streaming pay-per-title equivalent) should pay MPEG-LA 2¢ or 2% per title sale for H264 and/or 2.5¢ to HEVC Advance for H265. There is no entry level freebie for this model.
A SVOD platform (or the live streaming subscription-based equivalent) starts owing MPEG-LA between $25-100K for AVC after they go over 100K paying customers. HEVC is not as friendly to newcomers and you owe HEVC Advance ¢0.5-2.5 per customer from day one.
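To get a feel for the per-title numbers, here is a rough sketch. It hard-codes only the figures quoted in this article, and assumes (a) that ‘2¢ or 2%’ means the lesser of the two and (b) that the 12-minute exemption applies to both codecs. Neither assumption replaces reading the actual license terms.

```python
def tvod_royalty_cents(price_cents, duration_minutes, codec="avc"):
    """Estimated royalty per title sale, in cents (illustration only)."""
    if duration_minutes <= 12:
        return 0.0  # short titles are exempt, per the note above
    if codec == "avc":
        # Assumption: the lesser of 2 cents or 2% of the sale price.
        return min(2.0, price_cents * 2 / 100)
    if codec == "hevc":
        return 2.5  # flat figure quoted above for HEVC Advance
    raise ValueError(f"no figures for codec {codec!r}")


print(tvod_royalty_cents(499, 95))          # a 4.99 rental in AVC
print(tvod_royalty_cents(499, 95, "hevc"))  # the same rental in HEVC
```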
Cloud encoding
If you operate a service that sells the encoding/transcoding service explicitly (like encoding.com does), you definitely do owe royalties. How you will be billed is however rather uncertain. You’ll ultimately have to reach out to licensors and ask; I have at least 2 customers being charged very differently for quite similar business models. Common sense would even so dictate that
If you charge for encoding by the item (title) you will pay royalties per title
If you charge for encoding via a subscription, you will fit into the subscription-based royalty model
If you charge for encoding by the minute, you may (possibly) fit into the per-device category, where each encoding server counts as one such device
If you transcode video internally, as part of a larger streaming platform, there’s no clear rule/guideline on how licensing works and you also have to ask. A couple customer stories would lead yours truly to believe that
If the platform distributes paid content (SVOD or TVOD) and is already paying per-title or per-subscription royalties in that respect, there is no extra charge for the encoding part
If the platform distributes free or AVOD content, it may owe per device (i.e. transcoding server or server core/thread) royalties; or it may not 😐
Online TV Stations
If it’s free to watch, you’re in the clear, no royalties.
If it’s a paid service (i.e. subscription) you do owe it. Even if streaming is powered by a 3rd party platform and/or commercial player, the organization that labels the content also has to license the technology. Now you know.
Will They Come After Me?
Possibly not. Interestingly enough, the ‘pool’ organizations cannot and do not deal with litigation.
I’ve never heard of any small player being anywhere close to getting sued, but still…
As your startup begins to grow, you should start being aware of how much you owe and consider that you might someday need to pay it all retroactively. Balance your encoding needs and don’t shoot for the mightiest codec unless you really need it. Explore alternatives and know your options.
Are there free alternatives?
Sure!
AV1 is everyone’s dream: royalty free, and backed by an alliance of 48+ members; but it’s rather new, half baked, and it will probably be long before you’ll find a decoder for it in every device out there; definitely one to watch in the years to come.
VP8 and VP9 are roughly comparable in quality to AVC and HEVC respectively, and also royalty free; except they’re only supported by Google. While they admirably carried out the complex (and expensive) job of bringing these to market and safeguarding them from patent claims, they failed to convince the other big boys to adopt them; so hardware support is still scarce some 10 years later.
Where to go next?
See Jina’s article on the matter of AVC licensing, it may help clear up any remaining concerns. Also a couple of great articles here and there.
Perhaps you should. More so if you’re annoyed at the inconsistent latency of RTMP, dealing with packet loss over RTP/RTSP or tired of running long SDI or HDMI cables.
Although rather different, these 2 share common traits, which is possibly why they’re sometimes discussed together or confused with one another. They’re both designed for low latency video transmission, both came around in an attempt to fill a gap in existing technology, they’re free and widely adopted. And the similarities stop here.
Internet vs LAN
By design, SRT is intended to transport video via the open internet and other unpredictable (i.e. prone to bandwidth jitter and packet loss) networks. In turn, NDI works at full potential over consistently fast (read Gigabit or better) internal networks that can guarantee extremely low transmission error and congestion rates.
UDP vs TCP
You know already, TCP is fully reliable while UDP is not. The former will retransmit lost packets until successfully delivered, the latter will just let them get lost and not retransmit anything. Retransmission introduces delays, and the frequency and length of these delays is unpredictable.
SRT runs on top of UDP, NDI runs on top of TCP; NDI is arguably faster than SRT. Wait, what?
Latency
SRT allows you to set a fixed latency (120ms by default) for your video transmission. It will retransmit the packets that were lost over UDP only if not too late to stay within that latency margin. Depending on the speed and quality of the network between the 2 endpoints, setting a slightly higher latency may visibly reduce the amount of video and audio ‘glitches’ caused by undelivered packets.
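A deliberately simplified model shows why the latency setting matters. It assumes a lost packet costs one full round trip to recover (loss report one way, resend the other) and ignores everything else; it is not SRT’s actual algorithm, and the function names are mine.

```python
def can_recover(packet_due_ms, loss_detected_ms, rtt_ms):
    """True if a resend requested now can still arrive before play-out."""
    return loss_detected_ms + rtt_ms <= packet_due_ms


def max_retransmit_attempts(latency_ms, rtt_ms):
    """How many resend round trips fit inside the configured latency."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return int(latency_ms // rtt_ms)


print(max_retransmit_attempts(120, 50))   # 2 tries on a 50 ms round trip
print(max_retransmit_attempts(120, 150))  # 0: the default is too tight here
print(max_retransmit_attempts(500, 150))  # 3 after raising the latency
```

On a long-haul link the 120ms default leaves no room for even one resend, which is when the glitches show up.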
NDI advertises near-zero latency. It however demands a very fast and non-jittery network, or it’ll degrade ungracefully. The fact that it rides over TCP is therefore rather irrelevant, as the low latency is only achievable if packet loss and retransmission are minimal.
Encoding
SRT is allegedly codec and format agnostic and only takes care of transporting video payloads across the network. There’s room for debate but let’s just mention that most of the tried and true implements gracefully deal with AVC and HEVC wrapped in MPEG-TS.
NDI manages encoding, transport and decoding, thus it takes and outputs raw video. Internally a subset of H264 is used, more often than not trying to take advantage of built-in hardware encoders and decoders on host devices.
Video Quality
In the case of SRT, what you send is what you get. If it’s not well tuned with the network capabilities (or if these degrade beyond ordinary) you’ll see or hear the ‘jerks’ commonly associated with packet loss.
NDI is so called ‘visually lossless’, meaning the encoding artifacts will be virtually unobservable by the naked eye. Surely at the expense of…
Bandwidth Consumption
For NDI, it’s huge; think 100-150Mbps for a full HD stream and at least double that much for 4K.
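A quick way to see what that means for your switch ports. The 150Mbps figure is the upper end quoted above; treating only 75% of the link as usable is my own assumption, leaving room for everything else on the network.

```python
def ndi_streams_per_link(link_mbps=1000, stream_mbps=150, usable_percent=75):
    """Full HD NDI streams that fit on one link, rounded down."""
    usable = link_mbps * usable_percent
    return usable // (100 * stream_mbps)


print(ndi_streams_per_link())           # 5 on Gigabit
print(ndi_streams_per_link(10_000))     # 50 on 10 Gigabit
```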
With SRT, it’s up to you, as you’d need to figure out how to encode and mux the video before sending it off to transport. Do take into account the overhead introduced by retransmissions, especially if your network is busy.
Can I play with it?
Sure can! I put together a simple low latency streaming platform proof of concept. Based on Nimble (hey, it’s still free, while Wowza more than tripled their price over the last few years), featuring SRT ingress and SLDP egress, you can deploy it in a few clicks and test it in a few minutes. It’s a mere adaptation/simplification of the earlier larger scalable low latency project.
Fine, I’m interested, which best suits my use case?
Let’s recap the differences, and this time bring our old friend RTMP into the mix; that should help you make an educated choice
| | RTMP | SRT | NDI |
|---|---|---|---|
| Designed for | Streaming over Internet | Streaming over Internet | Streaming in LAN |
| Transport | TCP | UDP | TCP |
| Latency | Higher, Unpredictable | Low, Customizable | Lowest |
| Signal Quality | High | Good | Nearly Lossless |
| Deals with | Transport | Transport | Encode and Transport |
| Bandwidth consumption | Encoding Dependant | Encoding Dependant | High |
Are they expensive?
They’re both free, although NDI is proprietary yet royalty free, while SRT is fully open.
Are they secure?
SRT supports AES encryption.
NDI has no built in encryption mechanism afaik, if that’s critical you can run it over a VPN of some kind.
Long gone are the days when you could whip up an RTMP broadcast in (sort of) any browser with a few lines of code, not that we honestly really miss those times. Ironically, it’s been more than 10 years since the F word became taboo, and a quick search revealed that Flash (there I said it) will finally start resting in peace at the end of the year.
Despite that, the protocol that’s been built for it – RTMP – lives on and there is no end to it in sight. Actually it’s by all means the de facto ingest standard for all streaming platforms, big and small, and it’s implemented by many hundreds of broadcast products. Because it’s unsophisticated, it works more often than it fails, it’s open (though it hasn’t been from the beginning), offers a relatively low latency, and supports a couple of the most common codecs.
It’s not all good. As it runs on top of TCP, it can’t feasibly be tuned to operate at a fixed/predictable latency, plus it degrades ungracefully under fluctuating, unstable, or otherwise poor connections. And worst of all, there is no browser support for it, now or ever.
But…
WebRTC is at long last supported in all modern browsers, with some players being particularly late to the game. And while it’s no RTMP (in fact it’s vastly superior to it) it lets you grab a live feed of your camera and transmit it to a fellow WebRTC endpoint, be it a browser or anything else.
Thus, there’s no stopping us from putting together a proxy that sets up one such WebRTC endpoint (connecting to the broadcasting browser), and also converts and repackages the incoming feed into working RTMP to be pushed to a 3rd party platform or server. Like this
The fabulous perk is that most WebRTC enabled browsers can, according to standard, encode in H264 so there will be no need to transcode the video at all. Audio coming out of the browser is usually Opus and that we’ll want to transcode into AAC for compatibility. That’s still a big win as audio transcoding requires a lot less processing than video would.
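To make the copy-video, transcode-audio idea concrete, here is a minimal sketch of how the repackaging step could be expressed as an ffmpeg command line. It assumes the proxy forwards the browser feed as plain RTP described by an SDP file and that ffmpeg does the conversion; the text doesn’t say which tool the POC actually uses. The file name `feed.sdp`, the RTMP address and the helper name are all made up for illustration.

```python
def build_rtmp_relay_args(sdp_path, rtmp_url, audio_bitrate="128k"):
    """Build an ffmpeg argument list that relays an RTP feed to RTMP.

    Assumption: the incoming video is already H264, so it is copied as-is;
    only the Opus audio is re-encoded to AAC.
    """
    return [
        "ffmpeg",
        # SDP-described RTP inputs need these protocols whitelisted
        "-protocol_whitelist", "file,udp,rtp",
        "-i", sdp_path,
        "-c:v", "copy",         # no video transcode
        "-c:a", "aac",          # Opus -> AAC for RTMP compatibility
        "-b:a", audio_bitrate,
        "-f", "flv",            # RTMP expects FLV packaging
        rtmp_url,
    ]


# Hypothetical input file and destination, for illustration only
args = build_rtmp_relay_args("feed.sdp", "rtmp://example.com/live/streamkey")
print(" ".join(args))
```

If the browser can’t send H264, the `copy` on the video side would have to become a real encode, which is where the processing bill grows.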
The nugget
As with other solutions, tried to smooth out the learning curve by offering a working prototype to be deployed asap. It sets up the webpage and ‘proxy’ bundle in a few clicks, effectively making it look like you’re broadcasting your webcam to a customizable RTMP address, all from the comfort of your own browser. You’ll still have to go through the hurdle of setting up and providing a key/certificate pair as WebRTC requires HTTPS more often than not.
POC is powered by the amazing MediaSoup framework, and much of the code is recycled from this handy sample. Work is part of a bigger effort to rejuvenate a commercial browser broadcasting enabled product, originally built on top of Kurento.
Does it scale?
With a bit of care and planning it will, the bottleneck being the processing toll of transcoding the audio. Think a midrange computer/server will easily proxy 20 simultaneous streams.
Also need to take into account that some browsers still won’t encode H264 (e.g. Firefox on Android), so they’ll have to default to VP8/9, which needs to be transcoded to work with RTMP.
Is it stable/reliable?
Better than the real thing! As the last mile (first actually) is delivered via WebRTC — which features UDP transport, adjustable bitrate, congestion control, forward error correction and other cool stuff — the overall quality of service will be superior to the scenario where you would have broadcast the same camera with (e.g.) Wirecast, given the same encoding profile (i.e. constrained baseline) and resolution, especially over unpredictable networks or when employed by your non-streaming-savvy website users.
Is it fast?
Relatively so. Think sub-second WebRTC browser-to-proxy and 2-3 seconds RTMP proxy-to-platform. Not good enough for a video call but at least bound to be consistent, as compared to an unpredictable 2-30 second direct RTMP connection. There is extra delay introduced by the need to transcode, but that’s like half a second at most.
Is it expensive?
I say no, but depends on what you’d be using it for, and the scale you’d be using it at. Processing juice for transcoding the audio would set you back under a cent per hour per stream; if it were to be separate that is, yet the proxying and transcoding needs to be thought of as part of the bigger picture and possibly run on a system that’s often idle.
Is it secure?
Yes, actually. WebRTC is always encrypted, unlike RTMP.
Hey, streaming is a complex matter. Broadcasters are ever increasing and so are the options for platforms and equipment. While setting up a FB/IG live session from the app is fairly intuitive, stepping up one’s game to offer a more ‘pro’ gig can be challenging. And no wonder, media acquisition, encoding, transport, processing, delivery have to be well tuned and work in sync to ensure crisp and smooth playback.
Far from able to summarize all it takes to make your transmission crystal clear and free from ‘lagging’ or ‘freezing’, will in turn try to outline the most common rookie wrongdoings when setting up a broadcast. Though the following may come to you as common sense, I still get a huge share of ‘ahaaaa’ moments when I tell people to…
Turn on the lights!
Yes, the difference may amaze you. The blinding lights studios use are no accident.
Thing is, regardless of its sensitivity, the digital sensor (and the film before it) of any camera will output a grainy/noisy picture under low lighting. And if you’re streaming it, noise does not encode particularly well and you end up with an unexpectedly poor frame quality, either ‘grainy’, or ‘blocky’, or both. Tech details aside, use the brightest light you can find and see for yourself. And if you truly need to shoot in the dark, realize you may need specialized (i.e. expensive) gear.
Bonus: you’ll get a less ‘choppy’ video. Many low-end consumer webcams do not have an adjustable aperture and will in turn vary the exposure time to tune the brightness of a frame; in low light this will lead to longer exposures and reduced framerates.
Get a better connection
Please, stop being convinced your internet is amazing simply because your provider or the default speed test told you so. In the case of the latter, what you’re seeing is merely the speed to the nearest PoP, which is usually way off your true internet speed. There’s a lot more to networking than raw (average) speed unfortunately, and before going scientific at least try running the same test against a farther ‘server’ like one in Australia or South America.
Moreover, if you’re wireless (may that be wifi or cellular) keep in mind that the quality of your connection varies with position, obstacles, interference, weather, conspiracists’ tinfoils. And unlike browsing or tweeting, streaming works best under constant and predictable network speeds and latencies.
So I beg you, especially since nowadays anyone can whip up a high speed mobile hotspot, try another network, you may be in for a surprise.
Get a better camera!
Sure, that’s obvious. Yet the characteristics will vary widely, and often the quality/performance of the lens, sensor, and sometimes electronic post-processing will make a remarkable difference, between cameras with identical specs nonetheless.
But you don’t have to break the bank in the process. A used/aftermarket DSLR or ‘handycam’ will output a crisper picture than many high-end webcams or smartphone cameras. There’s nothing wrong with borrowing or trying out a few to see what looks/works best for your needs. Or ask around to find what worked for others. Just don’t run your broadcast business around that same camera unless you understand it very well; it may be suboptimal for a million reasons, like having been pointed at the sun.
Reduce the resolution
But hey, isn’t HD and lots of megapixels what everybody’s after nowadays? It is, but the full broadcast chain has to support it in harmony. That is the camera, the capture device, the encoder, and the upload bandwidth. If any of these isn’t up to the task you’ll end up with a sub-par HD picture that is either wasteful or poor. Depending on your setup, a high quality SD may look better and get enjoyed by more.
Bonus: try streaming at 540p. Often unknown or overlooked, you can think of it as near-HD. It’s suitable for most unremarkable needs, and it’ll take slightly more than half of 720p’s bandwidth to encode at the same quality.
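The ‘slightly more than half’ figure can be sanity-checked with a quick pixel count. This is a rough sketch that assumes 16:9 frames (960x540 and 1280x720) and that bitrate at a given quality scales roughly with the number of pixels, which is only a rule of thumb.

```python
def pixel_count(width, height):
    """Number of pixels in a single frame."""
    return width * height


pixels_540p = pixel_count(960, 540)    # assumed 16:9 frame size for 540p
pixels_720p = pixel_count(1280, 720)

ratio = pixels_540p / pixels_720p
print(f"540p carries {ratio:.2%} of the pixels of 720p")
```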
Reduce the bitrate
I know, it’s counterintuitive. Higher bitrate always means higher quality, all else being equal. But it’s not proportional. Depending on your content (and the equipment, remember?), there will be a sweet spot beyond which increasing the bitrate will result in little to no visual improvement. There are tools and metrics pros use to gauge that (see PSNR) but the naked eye can still be a good judge, just run a few tests.
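For the curious, PSNR boils down to a short formula: 10 · log10(MAX² / MSE), where MSE is the mean squared difference between the original and the encoded pixels. Below is a minimal sketch on two tiny made-up rows of 8-bit luma samples; real tools run this over full frames, and the sample values here are purely illustrative.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio, in dB, of two equal-length pixel sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs, no distortion at all
    return 10 * math.log10(max_value ** 2 / mse)


# Made-up sample values, for illustration only
original = [52, 55, 61, 66, 70, 61, 64, 73]
encoded = [54, 55, 60, 66, 69, 62, 64, 75]
print(f"PSNR: {psnr(original, encoded):.2f} dB")
```

Higher is better. Encode the same clip at a few bitrates and the sweet spot shows up as the point where the number stops climbing in any meaningful way.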
Don’t beat up your encoder
If using a computer for streaming, make sure its CPU never runs above 80%, ideally even lower. Else, it will drop frames (‘laggy’ again) or otherwise degrade the performance of your stream. Dedicated encoder boxes and smartphone apps tend to automatically pick the encoding profile to match the hardware capabilities so you don’t have to worry that much, but do keep the same in mind and measure it if possible. Now you know.
Use (better) microphones
This one’s easy. If your sound is poor you’re probably too far away from the mic, or you need a better one. Pay particular attention to the wireless kinds as some may introduce delays, and getting it in sync is kind of an advanced topic.
Bonus: Be careful not to introduce echo/feedback. Mute all your players and always use headsets if you really need to monitor the transmission’s audio in the vicinity of the microphone.
Reboot everything
Sketchy topic… For reasons only understood by masterminds, electronics with a reset button don’t just crash and freeze, they may also malfunction in weird, unobvious and unexpected ways, more so if they’re low end and have been running for a longer while. Especially if you’re not an expert, do yourself a favor and take the time to restart that router, computer, smartphone or gadget before the big event.
Expect to fail
Things will go wrong, mercilessly, when and where you least expect it. Ensuring redundancy/failover for every scenario is overkill and overly expensive. Do prepare for the most common mishaps (internet/electricity going out) but not the apocalypse. When it hits the fan, deal with it the best you can and don’t freak out; your viewers are forgiving and your reputation is salvageable. Apologize if that’s the case, and be honest about what went wrong and the steps you’re taking to avoid it in the future.
The peer-to-peer realm… It goes so far off the ‘classic’ client-server paradigm it’s just a world in itself. Even if you’re not familiar with the topic, you will surely have heard of Bitcoin, Tor, Skype, BitTorrent, DC++ or Napster. What do you know, they all rely on…
P2P
Computers, smartphones and IoT devices can connect to each other and exchange data. The closer they are to one another, the faster and more efficiently they can communicate. Ha, that’s actually not true 🙂 Open internet connectivity, routing and network peering are optimized for end-user devices to efficiently reach service providers’ servers. As for connecting to each other, it’s hit and miss. Particularly due to extensive NATting and sometimes deliberate ISP blocks, but also for other reasons, two random internet connected devices may or may not be able to communicate with each other directly.
Swarming
That’s ok. Except for special circumstances, an IP connected device can still connect to a bunch of quasi-random like-minded fellow devices to share data in a partly-predictable fashion. If you lived through the age of torrent downloads, you probably do have a certain understanding of how individual clients team up and help each other towards a common goal by sharing pieces of that same individual content. Results will vary; peers in close vicinity to each other (network-wise, not necessarily geographic) do share faster while others have to wait, sometimes more than they would if downloading from an actual server. Regardless, pressure and traffic on the seed(s) is heavily reduced as compared to the scenario where all clients would have to download directly.
The video streaming context
The amounts of data trafficked by video streaming are enormous. Meaning that somewhere there’s an enormous traffic bill. Forget the giants as they can strike nice deals with the CDNs, the average players will end up paying lots for broadcasting their venue outside of the ad-driven free services like YT and FB.
So what if we could put some of that p2p magic to good use…?
Peer-assisted video streaming is not a new idea. Sure, unlike a torrent download you can’t afford your viewers to buffer a lot or play at low quality just because of inadequate peer availability. Instead, rely on your friendly CDN to quickly grab the first part of the video and, in parallel, start downloading latter pieces from peers as soon as you have secured a comfortable buffer to ensure smooth playback for a while; fall back to CDN if peering capabilities degrade.
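The decision described above can be sketched in a few lines. This is a simplified illustration, not how any particular player implements it; the function name, the 10 second ‘comfortable’ threshold and the health flag are all assumptions.

```python
def pick_source(buffer_seconds, peer_has_segment, peers_healthy=True,
                comfortable_buffer=10.0):
    """Decide where the next video segment should be downloaded from.

    comfortable_buffer is a hypothetical threshold, in seconds of buffered video.
    """
    if buffer_seconds < comfortable_buffer:
        return "cdn"    # startup or low buffer: take the fast, predictable path
    if peer_has_segment and peers_healthy:
        return "peer"   # enough headroom to try the free path
    return "cdn"        # peering degraded or segment unavailable: fall back


print(pick_source(2.0, peer_has_segment=True))                        # startup
print(pick_source(15.0, peer_has_segment=True))                       # steady state
print(pick_source(15.0, peer_has_segment=True, peers_healthy=False))  # fallback
```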
The live video streaming context
Particular to live streaming, all viewers will be consuming the same pieces of content at the same time. This is of utmost importance and makes for a particularly interesting use case, as the sharing is way more straightforward. Think there’s just 5-10 pieces of video being circulated in the ‘swarm’ at any given time, as opposed to hundreds in an hour long VOD.
If not clear by now, peer-to-peer traffic is free. From the standpoint of the provider that is. Any slice of video downloaded from a peer rather than from a server is a penny saved. And the overall potential savings are huge! Think large communities of ‘neighbors’, like in a campus or compound, downloading that content just once or twice and sharing it among each other in a fast and fairly efficient network.
And it gets more spectacular as the viewer count increases. For events with enormous audiences like the World Cup or the Super Bowl, the sheer number of devices watching will lead to a high incidence of high-speed peering and massive savings, all at a scale that might get a traditional client-proxy-server network to just crumble.
Convinced yet? There’s more! It’s not just sheer savings on the content provider’s end. There are also faster starts, reduced buffering, and superior quality for many of the viewers. For some it’s just as if they were connected to a faster network, with all the benefits of that.
The readily available technology: HTML5 and WebRTC
WebRTC is finally part of almost any modern browser. And surprisingly unknown to many, it incorporates advanced peering capabilities. Details aside, a piece of JS code can drive swarming between browsers and get them to speed up, cut costs and improve the quality of video playback in a manner that’s transparent to the viewer. And it’s happening. For quite a while already WebTorrent has been around, ventures have tried to capitalize on the tech by selling it as a service, and ready-to-use open solutions eventually surfaced.
The Nugget
Here it is, your very own p2p-enabled streaming platform, ready to deploy and start broadcasting in minutes. It’s based on this free and open initiative; though built and promoted by a private company there are no strings attached afaik.
Does it scale?
Beautifully! As mentioned, this could actually sustain numbers that would overwhelm even the mightiest CDN, and that’s no overstatement. Minor note though, the proof of concept makes use of public trackers and if you need to stream to more than a couple thousand you’ll have to deploy your own. Scalability of that will be your bottleneck so take good care of it.
How much can I save?
Hard to say. Some will advertise figures of ‘up to’ 90% or more, but your mileage will vary. The more watching the better, and the more concentrated into metropolitan areas or individual networks your viewers are, the more and faster they will peer.
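To get a feel for the orders of magnitude, here is a back-of-the-envelope sketch. Every number in it (audience, bitrate, duration, price per GB, share of traffic served by peers) is a made-up input for illustration, not a measurement or an advertised figure.

```python
def cdn_traffic_gb(viewers, bitrate_kbps, hours, peer_share=0.0):
    """Gigabytes served by the CDN when peer_share of the traffic comes from peers."""
    bytes_per_viewer = bitrate_kbps * 1000 / 8 * hours * 3600
    return viewers * bytes_per_viewer * (1 - peer_share) / 1e9


# Hypothetical event: 1000 viewers, 2 Mbps, 2 hours, $0.05 per GB
price_per_gb = 0.05
without_p2p = cdn_traffic_gb(1000, 2000, 2) * price_per_gb
with_p2p = cdn_traffic_gb(1000, 2000, 2, peer_share=0.5) * price_per_gb
print(f"CDN bill without peering: ${without_p2p:.2f}")
print(f"CDN bill with half the traffic from peers: ${with_p2p:.2f}")
```

Plug in your own audience and price; the peer share is the one input you can only learn by measuring.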
What’s the catch?
There isn’t one, everybody wins. Except… 🙂
Extra traffic usage (think double) on most of the viewers due to the fact that they have to upload video pieces to others; not a problem for unlimited wifi but possibly problematic for those on a metered connection
Extra overhead on each of the clients in terms of CPU and memory consumption; that’s needed to initiate and maintain tracker and peer connections and also manage and relay the extensive amount of data
Is the free solution inferior to commercial alternatives?
May very well be. As is the case with other tech, it’s easier and faster to build a proprietary system, and monetizing it may fuel further innovation. P2P is still a matter that spurs academic research, trade secrets and patents. At the core of any solution there’s a tracker and some replication algorithms that can vary immensely in key areas like central coordination, congestion avoidance, mesh optimization etc. Long story short it once again depends, and your use case is possibly very different from others’.
Couldn’t stress it enough, except for a few borderline cases Adaptive Bitrate is simply a must have. Your viewers need to start fast and be able to watch the game on poor or fluctuating networks.
Revisiting the topic as there are ever so many angles to approach it. For this one, (unsurprisingly) cost was the primary factor, and had to pull out all the tricks to get it that cheap.
In no particular order, and not necessarily to be used all together (actually some are incompatible), following are tactics to reduce the cost of an ABR setup:
Reduce the number of ABR renditions
The main point of ABR is to allow bandwidth-challenged viewers to play your content smoothly, may that be at a lower quality. The more renditions employed, the closer you will be able to match one’s capabilities (it’s at times not just bandwidth but also decoding horsepower and video canvas resolution) and offer them the best possible viewing experience.
On the other hand, having fewer renditions may lock some users into a less than ideal quality setting, yet still fluid, of reasonable visual quality, and with good audio. Especially if your content or programming does not mandate top quality (e.g. news), this may save a lot on transcoding in the long run.
Add to that, much of the public has grown to instinctively realize a better network will lead to better quality and will make voluntary efforts to better their connection if they want a better video.
Recycle the original encode
This won’t work for all scenarios but…
Particularly when the source feed is encoded by known studio equipment (unlike user-contributed feeds, which tend to be less predictable) you can transmux the original video and make it the highest quality variant in the ABR set. Depending on the profile, this may save up to half the overall processing power needed for transcoding.
Recycle the original audio
Simple, just mandate a middle ground audio bitrate/quality at the source and use it for all renditions. Not always ideal for audiophiles but good enough for us humans.
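The two recycling tactics above could look like this when expressed as an ffmpeg command line. It’s a sketch under assumptions: the source is H264 video with AAC audio, ffmpeg is the transcoder, and the URLs, heights and bitrates are invented for illustration. The `veryfast` preset is one example of picking a cheaper encode for the lower renditions.

```python
def build_abr_args(source_url, outputs):
    """Build one ffmpeg command producing several renditions.

    outputs is a list of (target_url, height, video_bitrate); a height of
    None means the original video is reused (transmuxed) for that rendition.
    Audio is copied for every rendition.
    """
    args = ["ffmpeg", "-i", source_url]
    for target_url, height, video_bitrate in outputs:
        args += ["-map", "0:v:0", "-map", "0:a:0"]
        if height is None:
            args += ["-c:v", "copy"]  # recycle the original encode
        else:
            args += ["-c:v", "libx264", "-preset", "veryfast",
                     "-vf", f"scale=-2:{height}", "-b:v", video_bitrate]
        args += ["-c:a", "copy", "-f", "flv", target_url]  # recycle the audio
    return args


# Hypothetical source and targets, for illustration only
abr_args = build_abr_args("rtmp://localhost/live/source", [
    ("rtmp://localhost/abr/stream_src", None, None),
    ("rtmp://localhost/abr/stream_540", 540, "1200k"),
    ("rtmp://localhost/abr/stream_360", 360, "600k"),
])
print(" ".join(abr_args))
```

Only two of the three renditions cost any video encoding here; the top one and all the audio are passed through.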
Use less complex transcoding
Encoding is a fine trade-off between quality, bitrate (which translates into bandwidth required for transport) and processing needs. In the special case of live streaming, the transcoding device has to offer enough power to process the content in real time. Choosing a less complex transcoding profile, while requiring less computing resources, will lead to video that has a higher bitrate for the same quality, or lower quality for the same bitrate. Sure thing, traffic costs too, and it may be unwise to save pennies on one transcode and pay for the extra traffic multiplied by the number of viewers. Yet every case is different and numbers may be in favor of this approach at times.
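A quick way to check which side of that trade-off you’re on is to compute the audience size at which the extra traffic eats up the transcoding savings. The sketch below uses invented figures (hourly saving, extra bitrate, price per GB) purely to show the arithmetic.

```python
def extra_traffic_cost_per_viewer_hour(extra_kbps, price_per_gb):
    """Cost of the additional bitrate for one viewer watching for one hour."""
    gigabytes = extra_kbps * 1000 / 8 * 3600 / 1e9
    return gigabytes * price_per_gb


def breakeven_viewers(transcode_saving_per_hour, extra_kbps, price_per_gb):
    """Average concurrent viewers above which the cheaper transcode costs more overall."""
    per_viewer = extra_traffic_cost_per_viewer_hour(extra_kbps, price_per_gb)
    return transcode_saving_per_hour / per_viewer


# Hypothetical: save $0.09/hour on transcoding, pay for 400 kbps extra at $0.05/GB
print(f"Break-even at about {breakeven_viewers(0.09, 400, 0.05):.0f} viewers")
```

Below the break-even audience the lighter profile wins; above it, the traffic bill takes the savings back.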
Use the GPU
Modern GPUs have had built-in dedicated video encoders for quite a while now and they can be put to good use in many scenarios. Just off the top of my head, you’d be able to transcode 2-4 times cheaper; the real number depends on a huge number of factors, most notably the cost of actually buying or renting the respective GPUs.
Use the cloud
That’s a no brainer nowadays, I guess. Even if you have some idle dedicated servers lying around, it would be hard to set up a scalable solution around them. Between SaaS cloud transcoding and running custom software on cloud virtual servers, the latter is cheaper by far though it comes with extra headaches.
Use free software
Duh, doesn’t get any cheaper than that. You don’t get to call support when something goes wrong, but hey, maybe your team is too good to ever need that. Encoders in ffmpeg and gstreamer (e.g. x264) are hardly inferior to their commercial counterparts and also mature and stable, so no real worries there; most software transcoders are built on top of them anyway.
Use a separate virtual server for every stream
That’s not necessarily a winner for every scenario. In fact it’s always more economical to be doing multiple transcodes on a more powerful machine. But that’s only if you can use that to full capacity, otherwise you’re as efficient as flying a large plane half empty.
This one is very close to a hack. Only particular to AWS, the older virtual server types (since the days they billed by the hour) will let you burst some cheap CPU credits and throw the instance away when depleted. There’s a limit to how much you can abuse the ‘feature’ but good enough to get you started at a real bargain.
Putting it all together (actually just some) …
…the solution is here, up for grabs. It deploys in a few clicks and sets you up with a rather generous ABR profile for as little as 2-5¢ (!!!) per hour of live streaming, or well within the free tier if you still have that.
Does it scale?
Not in all directions. Long story short, you can stay on the cheap end of the spectrum if you transcode up to a couple hundred hours of content per day, after that the perks start to run out.
Is it worth it?
Oh yess! If only I could pocket the savings it’s brought…
Is it stable/reliable?
Should be. For a while I monitored it in production and noticed no issues. See for yourself.
Is this blog sponsored by AWS?
No. And by no other company for that matter. I just happen to have been exposed to Amazon’s much more than to others’, but (except perhaps for very specific use cases) do not believe it’s any better than other cloud platforms. Will gladly take on the challenge to deploy solutions in any environment or to objectively choose one that best suits particular needs.