Free* Zoom-grade video conferencing platform

*Free as in beer

In the vast sea of video conferencing tools, each one carries its own set of quirks and surprises. Some go all-in on scalability, others nail simplicity, and a few hand you the keys for full customization — if you’re up for the challenge. To keep things lively, let’s break these down into some neat categories

Free vs paid vs freemium

Truly free

Either open source code that you can deploy on your own infrastructure (we’ll talk about these in a bit) or very simple hosted services. Truth be told, top notch video conferencing needs innovation and uses up resources, none of which are free. If it’s free for you it’s either because it’s crappy (i.e. reduced feature set) or somebody else is paying for it.
Examples: Jitsi Meet, MediaSoup, BigBlueButton, Talky, Briefing, Palava.tv

Freemium

If you’ve been on Zoom or Google Meet a few times, you know what it’s like. Same product and mostly the same features for the free and paid tiers, yet longer meetings and more participants/features for the latter. There’s also FreeConference, Microsoft Teams, GoToMeeting etc. These services are operated by major companies on their own infrastructure, so you’ll get good quality meetings even when using the free tier.

Paid

No free tier. They provide comprehensive features like advanced security, large-scale meetings, dedicated support, and deep integration with other business tools. They are geared toward enterprise-level use or large organizations.
In this category: Microsoft Teams, Webex, BlueJeans, AnyMeeting, Connect, Dialpad, Adobe Connect etc

Open source or not?

Ok, the open source ones are just code. It’s only after you deploy it on a server (or more) that it becomes a platform. Jitsi Meet, BigBlueButton, LiveKit, Apache OpenMeetings are all part of this pool. You can tailor them to your liking and fit them into custom workflows, but that can take a bit or a lot of work. Some are lacking features and many are not straightforward to scale.

In turn, most closed-source solutions are already established platforms. Think Zoom, Microsoft Teams, Discord, Cisco Webex, GoToMeeting – they all exist as entities that you just use, not software that you host somewhere. There are a few exceptions to this (Cisco Meeting Server, TrueConf Server), but they tend to cater to very specific niches.

Branded vs White Label

When setting up a platform, you have the option to either prominently display the Zoom logo and make it clear that you’re using their service for the conference, or alternatively, you can create a fully integrated experience that mirrors Zoom’s functionality but showcases only your branding, without any mention of Zoom itself. The choice to go with white label or not is about deciding between keeping control of your brand or going for something quicker and easier to customize.

We already mentioned the branded choices; white label solutions include Daily.co, Agora, Twilio, LiveKit, Pexip, TrueConf, Whereby.

Is it all WebRTC already?

Kinda. Though you can’t communicate through other means from a browser, in the realm of native apps some just use a browser wrapper while others actually make use of proprietary protocols for enhanced performance.
I know, it’s a bit sad that WebRTC hasn’t quite become the magic bullet we were promised, and at this rate, we might be waiting a couple more decades for it to finally live up to the hype. Why so? Well, issues with standardization, leading to inconsistent browser support, combined with performance and stability challenges when scaling, and ongoing difficulties in managing latency and bandwidth, all of which make it hard to maintain a seamless experience across different networks and devices. It’s much easier to mitigate all these in the shadow of a proprietary protocol, especially if you’re a powerhouse like Zoom or Webex.

Why it’s hard to make videoconferencing flawless

For the most part, unpredictable network connectivity. Not everyone has a connection fast enough to ingest the AV data others are pushing, be that for the whole session or parts of it. When a slow user comes into the meeting, one of the following needs to happen

  • lower the quality for everyone (not cool)
  • find a way to send lower quality video to the slow user while still sending good quality video to the others (cool)

Well, the latter is complicated, requiring either server-side transcoding, SVC, or simulcast, coupled with complex logic to always be on the lookout for congestion and compensate; none of these are cheap, straightforward, or widely supported at the same time. And that’s where the big players make the difference – they can afford to pioneer and push the boundaries, with higher costs and a more sophisticated tech stack and infrastructure.
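To make the simulcast option a bit more concrete, here is a minimal Python sketch of the per-viewer decision a media server would have to make. The layer names, bitrates and the headroom factor are made up for illustration, and a real server would react to continuous congestion feedback rather than to a single bandwidth figure.

```python
# Hypothetical simulcast layers published by the sender, as (name, kbps).
LAYERS = [("low", 150), ("medium", 500), ("high", 1700)]


def pick_layer(estimated_kbps, headroom=0.8):
    """Return the best layer that fits within a fraction of the viewer's bandwidth."""
    budget = estimated_kbps * headroom
    best = LAYERS[0][0]  # always fall back to the lowest layer
    for name, kbps in LAYERS:
        if kbps <= budget:
            best = name
    return best


# Each viewer gets their own layer, so the slow one no longer drags the others down.
viewers = {"alice": 5000, "bob": 900, "carol": 120}
choices = {name: pick_layer(kbps) for name, kbps in viewers.items()}
print(choices)  # {'alice': 'high', 'bob': 'medium', 'carol': 'low'}
```

The hard part the sketch leaves out is exactly the "complex logic" mentioned above: estimating each viewer's bandwidth reliably and switching layers without visible glitches.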

So what gives?

Among the many mentioned above, LiveKit stands a bit taller — and no, this isn’t a sponsored take! What sets it apart?

  • It’s fairly new, so it got to learn from others’ mistakes.
  • It’s open source, but without the fragmented progress seen in similar projects.
  • Although it’s pure WebRTC, they stretched it to the limit and keep doing so;
  • It’s still a commercial product, which allows it to finance fresh thinking; you pay if you want to use their infrastructure, otherwise you’re free to use yours

Ok, where’s the free conferencing platform?

Here. You can deploy it in a few minutes and play with it.
If you don’t have the few minutes, play with this first – it’s the hosted demo of what you can deploy on your own infrastructure and adjust to your liking.

Why bother?

If you just found out about LiveKit, it’ll help you bypass the steepest part of its (still lean) learning curve. As you now have it set up and working, you can move on to have it fit your context. Or just use it as is, the defaults are very well suited for real-world use.

Also, now that you’re hosting it yourself (at some cost), you can get a rough idea of how much of what you’re paying Zoom goes toward operating costs and how much might be profit, possibly funneled back into innovation.

Not least, you can now experience first hand how a top-notch open source app stacks up against its commercial counterparts and decide if the added cost is worth it.

Does it scale?

Glad you’ve asked. While the demo is a single-server take, LiveKit itself supports distributed setup.

But… there had to be a catch. When using the open source product, there’s a limit (think hundreds) to how many can join the same meeting, dictated by the capabilities of the servers you’re deploying it on and the stated fact that “a room must fit on a single node”. The (paid) hosted service does not have this limitation 😊

Is it secure?

It is. Checks all boxes you’d expect from a mature product, including end-to-end encryption.

How free* is it really?

Once again, you’re only paying for the infrastructure, i.e. the servers you’re hosting it on.

Budget 1¢/user/hour and take it from there. Your mileage will greatly vary with app specifics and users’ traffic patterns. LiveKit provides a benchmarking tool.
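As a back-of-the-envelope aid, the rule of thumb above can be turned into a tiny Python helper. The function name and the example meeting figures are made up; the only number taken from the text is the 1¢ per user per hour.

```python
def monthly_infra_budget(avg_meeting_size, meetings_per_month, hours_per_meeting,
                         cents_per_user_hour=1.0):
    """Rough monthly budget in dollars, based on a per-user-hour rule of thumb."""
    user_hours = avg_meeting_size * meetings_per_month * hours_per_meeting
    return user_hours * cents_per_user_hour / 100


# Example: 20 one-hour meetings a month, 10 people each.
print(monthly_infra_budget(10, 20, 1))  # 2.0 (dollars)
```

Swap in your own per-user-hour figure once you have benchmarked your actual setup.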

Cheapest pay as you go CDN for streaming

There is zero activity in February and August

You’re at that point where your single-server or clustered streaming setup just can’t keep up with the spikes and you know for a fact that you need a CDN, kudos for getting this far. Or maybe you’re already using one but wondering if perhaps you’ve missed out on a better deal from the other guys.

Navigating public price offerings can be challenging. Between the assortment of parameters (ingress, egress, requests, transfer), hidden fees, offers too good to be true, the temptation to give in to long term commitments etc, one may find it’s quite tough to make a decision. 

This piece will focus on comparing the true pay-as-you-go CDN vendors with public and transparent pricing models. By ‘true pay as you go’ I mean the flexibility to pay zero if you don’t use the service at all in a certain month. That’s important if you’re broadcasting occasionally (festivals, seasonal sporting events), or you just don’t know if your business will still be alive and kicking in a few months from now. 

And here it gets graphed, comparing the few mainstream providers. Just adjust the checks and sliders to match your use case and make your pick, today.

Lessons learned

Stick with the big names

Unless you know your game really well that is. Smaller players will do their best to lure you into admittedly appealing deals, yet most of these will be either 

(A) resellers – nothing wrong with that per se, except there may be a better offer from the very CDN they’re reselling; that’s not always the case, as they can negotiate better pricing than you ever could by leveraging big volumes and upfront commitments

(B) maintaining their own infrastructure – nothing wrong with that either, yet do expect inferior throughput due to the reduced footprint and peering capabilities; also they may run out of capacity when you need it most – at peak; sadly the cards were dealt in the industry more than a decade ago and there’s no way to stand up to the giants unless you’re a giant yourself

(C) hybrid – relying on both own and 3rd party infrastructure and trying to make the best of each; that’s admirable, still… they need to walk a fine line prioritizing either quality or profit, as it’ll be very tempting to try to max out the (usually) inferior inbuilt capacity before racking up the upstream bill

(D) tricksters – still sharing traits with one of the above categories, yet at the very dishonest end of the scale; expect generally poor quality for the buck, slowdowns, interruptions, being throttled in favor of other customers, untruthful traffic measurements

Not meaning to scare you off, and there are surely exceptions; you’ll be able to find gold if you take the time to dig, especially among the local providers.

Always have a backup 

There are many ways a network can fail, get saturated, or otherwise work against your best interest. Be prepared to switch or offload to some other vendor, even if it is more expensive. There’s no good excuse for not being able to deliver the service you promised, and the chance to earn back the trust after a big fail may not come easy, if at all.  

Where you deliver matters

While North America and Europe are well covered, providing fast connectivity elsewhere is often not straightforward.  

Geographical regions discussed are those commonly offered by all suppliers. Yet some will cater distinctly to destinations like Africa, Middle East, India, Japan etc. You need to do more in-depth research if you focus on any of those; look into local dealers too.

What goes into the graphs and what doesn’t?

  1. The per-GB egress price – this makes the bulk of the pricing. Varies between $0.02 and $0.466, depending on region and consumption (i.e. the more you use the less you pay per unit)
  2. The per-request price – varies between $0.6 and $2.2 per million HTTPS requests depending on region and offering. HTTP requests are cheaper with AWS but have not been considered here.
  3. The per-GB ingress price (aka cache fill), where applicable – varies between $0.01 and $0.04 depending on ingress and egress location, only applied to Google’s offering
  4. The somewhat hidden $0.075 per hour for Google’s ‘forwarding rule’ – a must-have paid-for link in their CDN chain.
  5. The licensing price for a month in the case of Wowza
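If you’d rather crunch the numbers yourself, the components listed above add up roughly as in the Python sketch below. The function and parameter names are my own, and the example prices are hypothetical values picked from within the quoted ranges, not any vendor’s actual rates; tiered (volume-based) egress pricing is left out for brevity.

```python
def monthly_cdn_cost(egress_gb, egress_per_gb, https_requests, per_million_requests,
                     ingress_gb=0.0, ingress_per_gb=0.0,
                     hourly_fee=0.0, hours_in_month=0, monthly_license=0.0):
    """Sum the usage-based and fixed components of a monthly CDN bill, in dollars."""
    egress = egress_gb * egress_per_gb                      # bulk of the bill
    requests = https_requests / 1_000_000 * per_million_requests
    ingress = ingress_gb * ingress_per_gb                   # cache fill, where charged
    fixed = hourly_fee * hours_in_month + monthly_license   # hourly rule, license
    return egress + requests + ingress + fixed


# 1 TB out at a hypothetical $0.08/GB, 5M HTTPS requests at $1 per million.
usage_only = monthly_cdn_cost(1000, 0.08, 5_000_000, 1.0)
# Same traffic, plus an hourly fee of $0.075 over a 720-hour month.
with_hourly_fee = monthly_cdn_cost(1000, 0.08, 5_000_000, 1.0,
                                   hourly_fee=0.075, hours_in_month=720)
print(round(usage_only, 2), round(with_hourly_fee, 2))
```

Note how a small hourly fee can weigh as much as a good part of the traffic bill at low volumes.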

What about Akamai, CloudFlare, Comcast, Fastly, Level3, LimeLight etc

They don’t have public pricing, so we won’t discuss them here. Also some will require commitments and/or longer term contracts to have you as a customer. That does make sense for a company that offers this as its main service, as they need to rely on a somewhat predictable income to invest in capacity. 

Do realize that there’s reselling (explicit or not) even at the highest levels, Azure and Wowza among them. 

Can I use more than one?

Most certainly can, and you should if it’s feasible. Also know how much you’re paying each for exactly what, and over time use the information as leverage for a possible better deal. And stay on the lookout, the market continuously evolves and all this may be obsolete in a few months.

Another free low latency solution

We really need to do something about this delay

Since we last brought up the topic, the industry has evolved a bit. Most of the big live streaming and social media players now routinely stream at under 5 seconds end-to-end latency, and your modest platform may be laughed at or lose business if still relying on the good old HLS/DASH and its inbuilt huge delays. 

The technological background hasn’t changed much, yet the emergence of ‘cord cutting’ has emphasized the annoyingly big delays and pushed OTT providers to adapt and innovate. Where it could, LL-DASH has been implemented with relative success, Periscope’s LHLS has had (and still has) its own success stories, and eventually Apple had to step into the game and put together its own LL-HLS, which is already a published standard and deployed in the latest iOS. 

As we speak, there are a few factors at play that may set back your roadmap to low latency

  • Support for proprietary WebSocket based streaming is going away, the most notable sign possibly being Wowza’s announcement to discontinue its ‘ultra low latency’ thing; it makes sense in light of the market-driven evolution of alternatives and the fact that this was a stand-in solution from the get-go, with obvious drawbacks
  • WebRTC is not yet a grownup; while it has been standardized and has taken a giant step since becoming available in Safari, remarkable implementations are taking a while
  • Player and server support of LL-HLS is still limited to commercial products
  • LL-DASH support is still not ubiquitous

The treat

To the rescue, a friendly wrapped POC solution based on the rather amazing open source OvenMediaEngine. It supports both WebRTC and LL-DASH egress from an RTMP source, amongst other cool stuff. 

The WebRTC output lets you stream with sub-second latencies (!), and the LL-DASH can be configured to use a playback buffer of 1 second or less.

It’s here, to use as such or inspire from, enjoy!

Does it scale?

LL-DASH – scale with ease

As long as you can deploy/make use of reverse proxies that support chunked-transfer, scaling is a breeze. Nginx can do it, as do most CDNs – go for it. 

WebRTC – not as easy but it can be made to

The larger shortcoming of WebRTC is that it’s been designed for peer-to-peer and one-to-one; twisting it to support one-to-many means impersonating multiple one-to-one endpoints, each mildly resource consuming, to the point where it’ll choke any one server. 

Capabilities will largely vary depending on actual hardware, and stream characteristics. Consider just 200 viewers per cpu core when budgeting, any betterment will make your customer happy. 
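For budgeting purposes, that rule of thumb translates into the following Python one-liner of sorts. The function name is made up and the 200 viewers per core default is just the conservative figure from above; measure your own setup before trusting it.

```python
import math


def cores_needed(concurrent_viewers, viewers_per_core=200):
    """Number of CPU cores to budget for a given WebRTC audience."""
    return math.ceil(concurrent_viewers / viewers_per_core)


print(cores_needed(1500))  # 8
```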

There’s also the hot topic of transcoding. While AVC is (at long last) ubiquitous in WebRTC, you’ll need to transcode the audio to Opus. That’s surely a breeze for any CPU but it won’t scale, so the number of streams you can run on a server is limited. 

Is it worth it?

If you absolutely need cheap/free low latency, it is. 

The biggest conundrum is that DASH won’t work on iOS and WebRTC is harder/more expensive to scale. May I suggest you use both (have iOS users play the WebRTC feed) and see where your scalability needs take you. Provided you’re running a small/medium platform or just starting up, the odds are you’re better off than giving in to commercial offerings. 

What about LL-HLS?

In OvenMedia, it’s reportedly in the works and may be available soon. In general, it may still be a while before we see it thriving. Partly due to its initial intent to mandate HTTP/2, the industry has been slow to adopt it, and the couple of implementations I’ve seen still get laggy even with near-perfect networking and encoding setups. 

Adaptive bitrate anyone?

Not supported with this product but it may soon be. 

Let me point out though that the two (ABR and extremely low latency) don’t go particularly well together. Think that

  • The need to transcode for ABR will add to the latency
  • Determining network capabilities and switching between ABR renditions is way tougher to properly plan ahead and execute given sub-second delays and buffers

In the big picture, you’re trading every second of latency for quality of experience or cost. Please don’t make it a whim and seriously assess how bad and how low you absolutely need it. Delivering near-instant high quality uninterrupted video over the open internet requires sophisticated/expensive tech, and even the most state of the art won’t deliver flawlessly to all.

Setting Up A Live Viewer Count, For Cheap

Millions, with an M

You’ve seen it around on big guys’ platforms. But when trying to put together your own you may have hit the price, maintenance, or scalability wall. 

The following solution is no magic fix to these and surely not special, yet it may help you understand the pitfalls and tell apart the clues ahead of time.

How it works

As simple as you’d imagine. Each viewer announces its presence to a central authority, let’s call that the counter. As soon as a new presence is announced, all the viewers are notified that the audience count has increased. Likewise, as soon as any viewer disengages, the others get notified that the count has decreased.
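Stripped of all networking, the logic described above fits in a few lines of Python. This is only an in-memory sketch: the class and method names are made up, and plain callbacks stand in for the open connections a real counter would keep to each viewer.

```python
class ViewerCounter:
    """In-memory stand-in for the central counter."""

    def __init__(self):
        self._viewers = {}  # viewer id -> callback used to push the count

    def _broadcast(self):
        # Every change is pushed to every viewer currently connected.
        count = len(self._viewers)
        for notify in self._viewers.values():
            notify(count)

    def join(self, viewer_id, notify):
        self._viewers[viewer_id] = notify
        self._broadcast()

    def leave(self, viewer_id):
        self._viewers.pop(viewer_id, None)
        self._broadcast()


seen = []  # every count pushed to any viewer ends up here
counter = ViewerCounter()
counter.join("a", seen.append)
counter.join("b", seen.append)
counter.leave("a")
print(seen)  # [1, 2, 2, 1]
```

Notice that the second join already produced two notifications, one per connected viewer; that fan-out is what makes scaling interesting later on.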

Persistent connection

To facilitate instant updates, a continuously open connection is required between any one viewer and the centralized counter. Having the former just ask around for the number every once in a while (i.e. polling) is still an option but won’t be nearly as smooth or fast. 

Sockets

Such connectivity can generally be accomplished by means of sockets. Long story short, a socket is a kind of nearly-instant bi-directional data channel between 2 network-connected devices. 

WebSockets

Most apps can freely create and use regular sockets, however code running in the restrictive context of an internet browser cannot. Special abstractions had to be devised to bring socket-like functionality to the browser, of which the WebSocket has surfaced and is finally widely supported.

The server

The so called counter is merely a piece of software; it has to reside on a publicly accessible, always-on computer or device that oldsters like to call a server; while it does a lot of things, a server’s main job is to ‘serve’ common needs of various other (not necessarily so public or available) devices, generally referred to as ‘clients’.

The ready to use solution

Is here for grabs. Variations have been implemented on multiple platforms and it’s stood the test of time. 

How cheap is it?

You’ll just be paying for the ‘server/s’. Unless you can run the counter on one of your existing computers [remember it needs to be public] or take advantage of some cloud’s free tier (in which case it’s free, as in beer). 

In production, consider budgeting $1 per 1k simultaneous viewers per month, prorated.

Does it scale?

Not without headaches. 

The boxed solution is well optimized and proven to accommodate some 10k viewers when running on the smallest available cloud instance (with just 0.5 GB of RAM!)

It can stretch to take up to maybe 4-5 times that much on a single computer but the truly scalable setup takes an autoscaled cluster of ‘servers’. Not too complex really, it has been done repeatedly and I hope to get the time to dust one up and make it public soon.

Is it stable/reliable?

Up to a point…

  • You’ll see it hogging the host’s CPU way before it starts being laggy to your viewers. Run it on a more powerful computer next time you expect a similar or larger audience
  • Memory use hasn’t been a real concern in any of the implements
  • If noticing a rather constant limit your counter never goes above, your setup (either the software, OS or NIC) may be running out of sockets it can simultaneously keep open. There are many ways to mitigate that, details vary with specifics of the environment
  • Before WebSockets were ubiquitous, long-polling (and at times its creepier cousin short-polling) was the norm for setting up persistency in the browser; these put a more severe burden on the server and are safe to be avoided, at long last; don’t give in to the likes of socket.io unless you really know what you’re doing
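If you suspect you’re hitting the socket ceiling mentioned above, the per-process file descriptor limit is a reasonable first thing to check on Linux or macOS, since every open socket uses one descriptor. A minimal sketch using Python’s standard resource module (Unix only); it covers just the per-process limit, not OS-wide or NIC-level ones, and your counter may well run on a different stack altogether.

```python
import resource  # standard library, Unix only

# Each open socket consumes one file descriptor of the process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open files/sockets per process: soft limit {soft}, hard limit {hard}")
```

A viewer count that plateaus suspiciously close to the soft limit is a strong hint that this is the wall you’re hitting.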

Is it secure?

In the example it’s not. Meaning that if one wanted to impersonate extra viewers into your pool they could easily do so. They could also go the extra (DoS) mile and try to bring the ‘counter’ and the server hosting it to their knees, by impersonating a jolly bunch of extra viewers.

Not to say it can’t be made safe. CORS and SSL are the first things to consider. Also some simple way to limit rate and payload size. 

Next up, any extra validation, authentication, tokenization etc. will take a slight toll on the server resources, multiplied by one of the numbers above. So be wary and benchmark each addition. 

Is it fast?

Yesss! As fast as you’d expect an update to propagate over the internet these days, at half the speed of light if lucky. 

Sounds like a simple task, why is it so hard to scale?

Think the following scenario: 100 average viewers, each coming in or out every 10 minutes. That’s 10 updates per minute, to propagate to all of the 100, for a total of 1000 updates per minute.

Now for 1000 average viewers, each also coming in or out every 10 minutes. That’s 100 updates per minute, to propagate to all of the 1000, for a total of 100k updates per minute. 

Take that for 10k average viewers, it’ll be 10M (!) updates per minute. And that’s just averages, real life will show you that the audience tends to flock in the beginning and key moments of an event.
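The arithmetic in the three scenarios above can be reproduced with a short Python sketch. The function name is made up; the 10-minute figure is the one assumed in the scenarios. Since both the number of changes and the number of recipients grow with the audience, the total grows with its square.

```python
def updates_per_minute(avg_viewers, minutes_between_changes=10):
    """Messages the counter must push per minute for a given average audience."""
    changes_per_minute = avg_viewers / minutes_between_changes
    # Every change has to reach every connected viewer.
    return changes_per_minute * avg_viewers


for audience in (100, 1_000, 10_000):
    print(audience, int(updates_per_minute(audience)))
# 100 1000
# 1000 100000
# 10000 10000000
```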

Ok, there are tricks to smooth out the treacherous quadratic growth there, and you know one of them already. Display 1.7k viewers instead of 1745 viewers. That’s a hundred-fold reduction in the number of updates, out of the box! And there’s more to be done of course.
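Here is a minimal Python sketch of that rounding trick, assuming the counter only broadcasts when the displayed label actually changes. Function names are made up, and the label format is just one possible choice.

```python
def display_label(count):
    """Format 1745 as '1.7k'; counts under 1000 are shown exactly."""
    if count < 1000:
        return str(count)
    return f"{count // 100 / 10:.1f}k"


def broadcasts_needed(counts):
    """Count how many times the displayed label changes over a series of counts."""
    sent, last = 0, None
    for count in counts:
        label = display_label(count)
        if label != last:
            sent += 1
            last = label
    return sent


# 100 consecutive joins, from 1700 to 1799 viewers, all display as '1.7k'.
print(display_label(1745), broadcasts_needed(range(1700, 1800)))  # 1.7k 1
```

One broadcast instead of a hundred, and nobody in the audience can tell the difference.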

As a small business, must I pay royalties for H264 and H265?

Will they come after me?

There’s a shred of misunderstanding, to say the least, when it comes to grasping and facing the codec licensing topic. The general perception is that if you’re just starting out you don’t need to worry about it; the warning here is that it may creep up on you as you grow, depending on how you put that codec to good use and especially how you monetize it. Let’s start with the basics though.

Intellectual Property

Many video compression techniques included in a codec are patented inventions. To use the codec, you’d have to license the patents from their creator or representative. Fair enough, except we are talking about a few thousand patents from a few dozen companies.

Patent Pools

To simplify licensing, patent holders ‘pooled’ the patents through organizations that license these collectively on behalf of their members. 

While there’s more than a single pool, and some patents are unaffiliated, it is commonly agreed that you only need to reach out to MPEG-LA to license H.264 (aka AVC), while in the case of H.265 (aka HEVC) you need to pay at least the 3 big pools (MPEG-LA, HEVC Advance, Velos), of which the latter does not even publicly disclose prices. 

Known pricing

Terms under which a license is sold are rather complex and highly nuanced. Cost will vary depending on the context in which the respective codec is being used, the volume, and the revenue you may derive from it. 

Very much notable, some use cases bear no cost, while others carry a generous entry level threshold. Nevertheless, do pay attention, and let’s take these one by one.

Per-Device 

Applies to smartphones, tablets, digital and smart TVs, computers, video players and anything with a hardware encoder or decoder of the respective codec. Royalties are owed by the device supplier and not by the encoding/decoding chip or module manufacturer.

Also applies to software products that include an encoder or decoder. The royalty is owed by the product vendor/distributor, whether the product itself is commercial or free. Notable exception: free products (truly free, like Firefox) may include the OpenH264 binary, in which case royalties will be generously covered by Cisco. 

Per-Title PPV

These include platforms that sell access to content on a per-title basis. Royalties are either (A) a fixed value per sale or (B) a percent from sales to end-users, in some cases the lesser of the two. Note that titles (i.e. videos) 12 minutes or less are exempt from such royalties.

Subscription-Based PPV

Royalties apply to subscription platforms like Netflix and vary depending on codec and number of subscribers. There’s a zero cost entry level for AVC if one has less than 100K subscribers. 

Free Television Broadcast

Applies to terrestrial, cable and satellite broadcasters, with pricing per encoder or size of the audience

Free Internet Broadcast

You owe no royalties if encoding content to be distributed for free over the internet. 

Real world (small) business models, and how much they may owe

Mobile apps

We’re obviously talking about mobile apps that either play or broadcast/manipulate video through either one of these codecs.

If you rely on a hardware or OS exposed encoder/decoder to do the job, you don’t owe anybody anything, godspeed!

If you include a software encoder or decoder in your app, you fit into the ‘Per-Device’ category. For AVC you don’t pay anything until you reach 100K units (i.e. actively installed apps). 

For HEVC, you’ll be paying from the ground up, think $1.5 to $4 per unit.

Streaming platforms

You owe royalties if you distribute AVC or HEVC encoded content, unless it’s free as in YouTube.

A TVOD platform (or the live streaming pay-per-title equivalent) should pay MPEG-LA 2¢ or 2% per title sale for H264 and/or 2.5¢ to HEVC Advance for H265. There is no entry level freebie for this model.

A SVOD platform (or the live streaming subscription-based equivalent) starts owing MPEG-LA between $25K and $100K for AVC once it goes over 100K paying customers. HEVC is not as friendly to newcomers and you owe HEVC Advance 0.5¢ to 2.5¢ per customer from day one.
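To get a feel for the pay-per-title numbers, here is a rough Python estimate based only on the figures quoted above. The function name is made up, and it assumes the ‘lesser of the two’ rule applies to the AVC royalty, which per the earlier section is only true in some cases; actual terms come from the licensors.

```python
def tvod_royalty(sales, price, avc=True, hevc=False):
    """Estimated royalties in dollars for a number of pay-per-title sales."""
    total = 0.0
    if avc:
        # 2 cents or 2% of the sale price; assuming the lesser of the two applies.
        total += sales * min(0.02, 0.02 * price)
    if hevc:
        # Flat 2.5 cents per title sold.
        total += sales * 0.025
    return total


# 10,000 sales at $4.99 each, content delivered in both codecs.
print(round(tvod_royalty(10_000, 4.99, avc=True, hevc=True), 2))  # 450.0
```

On roughly $50K of sales that is under 1%, yet it is owed from the very first sale.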

Cloud encoding

  1. If you operate a service that sells the encoding/transcoding service explicitly (like encoding.com does),  you definitely do owe royalties. How you will be billed is however rather uncertain. You’ll ultimately have to reach out to licensors and ask, I have at least 2 customers being charged very differently for quite similar business models. Common sense would even so dictate that
  • If you charge for encoding by the item (title) you will pay royalties per title
  • If you charge for encoding via a subscription, you will fit into the subscription-based royalty model
  • If you charge for encoding by the minute, you may (possibly) fit into the per-device category, where each encoding server counts as one such device
  2. If you transcode video internally, as part of a larger streaming platform, there’s no clear rule/guideline on how licensing works and you also have to ask. A couple of customer stories would lead yours truly to believe that
  • If the platform distributes paid content (SVOD or TVOD) and is already paying per-title or per-subscription royalties in that respect, there is no extra charge for the encoding part
  • If the platform distributes free or AVOD content, it may owe per device (i.e. transcoding server or server core/thread) royalties; or it may not 😐

Online TV Stations

If it’s free to watch, you’re in the clear, no royalties. 

If it’s a paid service (i.e. subscription) you do owe it. Even if streaming is powered by a 3rd party platform and/or commercial player, the organization that labels the content also has to license the technology. Now you know.

Will They Come After Me?

Possibly not. Interestingly enough, the ‘pool’ organizations cannot and do not deal with litigation. 

Never heard of any small player being anywhere close to indicted but still…

As your startup begins to grow, you should start being aware of how much you owe and consider that you might someday need to pay it all retroactively. Balance your encoding needs and don’t shoot for the mightiest codec unless you really need it. Explore alternatives and know your options.

Are there free alternatives?

Sure! 

AV1 is everyone’s dream: royalty free, and backed by an alliance of 48+ members; but it’s rather new, half baked, and it will probably be long before you’ll find a decoder for it in every device out there; but definitely one to look out for in the years to come.

VP8 and VP9 are roughly comparable in quality to AVC and HEVC respectively, and also royalty free; except they’re only supported by Google. While Google admirably carried out the complex (and expensive) job of bringing these to market and safeguarding them from patent claims, it failed to convince the other big boys to adopt them; so hardware support is still scarce some 10 years later.

Where to go next?

See Jina’s article on the matter of AVC licensing, it may help clear out extra concerns. Also a couple of great articles here and there

SRT vs NDI streaming, should I care?

Perhaps you should. More so if you’re annoyed at the inconsistent latency of RTMP, dealing with packet loss over RTP/RTSP or tired of running long SDI or HDMI cables.

Although rather different, these 2 share common traits, which is possibly why they’re sometimes discussed together or confused with one another. They’re both designed for low latency video transmission, both came around in an attempt to fill a gap in existing technology, they’re free and widely adopted. And the similarities stop here. 

Internet vs LAN

By design, SRT is intended to transport video via the open internet and other unpredictable (i.e. prone to bandwidth jitter and packet loss) networks. In turn, NDI works at full potential over consistently fast (read Gigabit or better) internal networks that can guarantee extremely low transmission error and congestion rates. 

UDP vs TCP

You know already, TCP is fully reliable while UDP is not. The former will retransmit lost packets until successfully delivered, the latter will just let them get lost and not retransmit anything. Retransmission introduces delays, and the frequency and length of these delays is unpredictable. 

SRT runs on top of UDP, NDI runs on top of TCP; NDI is arguably faster than SRT. Wait, what?

Latency

SRT allows you to set a fixed latency (120ms by default) for your video transmission. It will retransmit the packets that were lost over UDP only if not too late to stay within that latency margin. Depending on the speed and quality of the network between the 2 endpoints, setting a slightly higher latency may visibly reduce the amount of video and audio ‘glitches’ caused by undelivered packets. 
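The trade-off can be pictured with a deliberately simplified model in Python. This is not SRT’s actual algorithm, just the timing constraint it has to respect: a retransmission request travels back to the sender and the resent packet travels forward again, so recovery costs roughly one extra round trip. The function and parameter names are made up.

```python
def can_recover(ms_since_sent, rtt_ms, latency_ms=120):
    """True if a packet found missing now could still be resent and arrive in time.

    ms_since_sent: time elapsed since the packet originally left the sender.
    rtt_ms: round trip time between the two endpoints.
    latency_ms: the fixed latency configured for the transmission.
    """
    return ms_since_sent + rtt_ms <= latency_ms


print(can_recover(30, 40))                   # True: 70 ms fits within 120 ms
print(can_recover(30, 100))                  # False: 130 ms is too late
print(can_recover(30, 100, latency_ms=200))  # True once the latency is raised
```

The last two lines show why a slightly higher latency setting helps on slower links: the same lost packet goes from unrecoverable to recoverable.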

NDI advertises near-zero latency. It however demands a very fast and non-jittery network, or it’ll degrade ungracefully. The fact that it rides over TCP is therefore rather irrelevant as the low latency is only achievable if the packet loss and retransmission are minimal. 

Encoding

SRT is allegedly codec and format agnostic and only takes care of transporting video payloads across the network. There’s room for debate but let’s just mention that most of the tried and true implements gracefully deal with AVC and HEVC wrapped in MPEG-TS.

NDI manages encoding, transport and decoding, thus it takes and outputs raw video. Internally a subset of H264 is used, more often than not trying to take advantage of built-in hardware encoders and decoders on host devices. 

Video Quality

In the case of SRT, what you send is what you get. If it’s not well tuned with the network capabilities (or if these degrade beyond ordinary) you’ll see or hear the ‘jerks’ commonly associated with packet loss. 

NDI is so called ‘visually lossless’, meaning the encoding artifacts will be virtually unobservable by the naked eye. Surely at the expense of…

Bandwidth Consumption

For NDI, it’s huge; think 100-150Mbps for a full HD stream and at least double that much for 4K.

With SRT, it’s up to you, as you’d need to figure out how to encode and mux the video before sending it off to transport. Do take into account the overhead introduced by retransmissions, especially if your network is busy.

Can I play with it?

Sure can! I put together a simple low latency streaming platform proof of concept. Based on Nimble (hey, it’s still free, while Wowza more than tripled their price over the last few years), featuring SRT ingress and SLDP egress, you can deploy it in a few clicks and test it in a few minutes. It’s a mere adaptation/simplification of the earlier, larger scalable low latency project.

Fine, I’m interested, which best suits my use case?

Let’s recap the differences, and this time bring our old friend RTMP into the mix, which should help you make an educated choice.


                      | RTMP                    | SRT                     | NDI
Designed for          | Streaming over Internet | Streaming over Internet | Streaming in LAN
Transport             | TCP                     | UDP                     | TCP
Latency               | Higher, Unpredictable   | Low, Customizable       | Lowest
Signal Quality        | High                    | Good                    | Nearly Lossless
Deals with            | Transport               | Transport               | Encode and Transport
Bandwidth consumption | Encoding Dependent      | Encoding Dependent      | High

Are they expensive?

They’re both free, although NDI is proprietary yet royalty free, while SRT is fully open. 

Are they secure?

SRT supports AES encryption. 

NDI has no built-in encryption mechanism afaik; if that’s critical you can run it over a VPN of some kind.

Broadcasting RTMP from the browser

Long gone are the days when you could whip up an RTMP broadcast in (sort of) any browser with a few lines of code, not that we honestly really miss those times. Ironically, it’s been more than 10 years since the F word became taboo, and a quick search revealed that Flash (there I said it) will finally start resting in peace at the end of the year. 

Despite that, the protocol that’s been built for it – RTMP – lives on and there is no end to it in sight. Actually it’s by all means the de facto ingest standard for all streaming platforms, big and small, and it is implemented by many hundreds of broadcast products. That’s because it’s unsophisticated, it works more often than it fails, it’s open (though it hasn’t been from the beginning), offers a relatively low latency, and supports a couple of the most common codecs.

It’s not all good. As it runs on top of TCP, it can’t feasibly be tuned to operate at a fixed/predictable latency, plus it degrades ungracefully under fluctuating, unstable, or otherwise poor connections. And worst of all, there is no browser support for it, now or ever. 

But…

WebRTC is at long last supported in all modern browsers, with some players being particularly late to the game. And while it’s no RTMP (in fact it’s vastly superior to it) it lets you grab a live feed of your camera and transmit it to a fellow WebRTC endpoint, be it a browser or anything else. 

Thus, there’s no stopping us from putting together a proxy that sets up one such WebRTC endpoint (connecting to the broadcasting browser), and also converts and repackages the incoming feed into working RTMP to be pushed to a 3rd party platform or server.

The fabulous perk is that most WebRTC enabled browsers can, according to standard, encode in H264 so there will be no need to transcode the video at all. Audio coming out of the browser is usually Opus and that we’ll want to transcode into AAC for compatibility. That’s still a big win as audio transcoding requires a lot less processing than video would. 
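For a feel of what the repackaging step boils down to, here is a minimal sketch that builds an ffmpeg command line passing the H264 video through untouched and transcoding only the audio to AAC. It assumes the proxy hands the browser’s media to ffmpeg as plain RTP described by an SDP file, which is just one way to wire it up; the function name, the file path, and the RTMP address are placeholders.

```python
def build_proxy_command(sdp_path: str, rtmp_url: str,
                        audio_bitrate: str = "128k") -> list[str]:
    """Build an ffmpeg argument list: copy video, transcode audio, push RTMP."""
    return [
        "ffmpeg",
        # Allow ffmpeg to read the SDP file and the RTP/UDP streams it describes.
        "-protocol_whitelist", "file,udp,rtp",
        "-i", sdp_path,
        "-c:v", "copy",        # H264 from the browser goes through untouched
        "-c:a", "aac",         # Opus is transcoded to AAC
        "-b:a", audio_bitrate,
        "-f", "flv",           # RTMP carries FLV-packaged media
        rtmp_url,
    ]


# Hypothetical paths and addresses, for illustration only.
cmd = build_proxy_command("broadcast.sdp", "rtmp://example.com/live/streamkey")
print(" ".join(cmd))
```

The list can be handed to `subprocess.Popen(cmd)` once the RTP ports in the SDP file are actually receiving media.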

The nugget

As with other solutions, I tried to smooth out the learning curve by offering a working prototype to be deployed asap. It sets up the webpage and ‘proxy’ bundle in a few clicks, effectively making it look like you’re broadcasting your webcam to a customizable RTMP address, all from the comfort of your own browser. You’ll still have to go through the hurdle of setting up and providing a key/certificate pair as WebRTC requires HTTPS more often than not.

POC is powered by the amazing MediaSoup framework, and much of the code is recycled from this handy sample. Work is part of a bigger effort to rejuvenate a commercial browser broadcasting enabled product, originally built on top of Kurento.

Does it scale?

With a bit of care and planning it will. The bottleneck is the processing toll of transcoding the audio; think a midrange computer/server will easily proxy 20 simultaneous streams. 

You also need to take into account that some browsers still won’t encode H264 (e.g. Firefox on Android), in which case it’ll have to default to VP8/9, which needs to be transcoded to work with RTMP.

Is it stable/reliable?

Better than the real thing! As the last mile (first actually) is delivered via WebRTC — which features UDP transport, adjustable bitrate, congestion control, forward error correction and other cool stuff — the overall quality of service will be superior to the scenario where you would have broadcast the same camera with (e.g.) Wirecast, given the same encoding profile (i.e. constrained baseline) and resolution, especially over unpredictable networks or when employed by your non-streaming-savvy website users.

Is it fast?

Relatively so. Think sub-second WebRTC browser-to-proxy and 2-3 seconds RTMP proxy-to-platform. Not good enough for a video call but at least bound to be consistent, as compared to an unpredictable 2-30 second direct RTMP connection. There is extra delay introduced by the need to transcode, but that’s like half a second at most. 

Is it expensive?

I say no, but it depends on what you’d be using it for, and the scale you’d be using it at. Processing juice for transcoding the audio would set you back under a cent per hour per stream, if it were to run separately that is; yet the proxying and transcoding need to be thought of as part of the bigger picture and possibly run on a system that’s often idle.

Is it secure?

Yes, actually. WebRTC is always encrypted, unlike RTMP.

Turn on the lights

But I have 100mbps

Hey, streaming is a complex matter. Broadcasters are ever increasing and so are the options for platforms and equipment. While setting up a FB/IG live session from the app is fairly intuitive, stepping up one’s game to offer a more ‘pro’ gig can be challenging. And no wonder, media acquisition, encoding, transport, processing, delivery have to be well tuned and work in sync to ensure crisp and smooth playback.

Far from being able to summarize all it takes to make your transmission crystal clear and free from ‘lagging’ or ‘freezing’, I will instead try to outline the most common rookie mistakes when setting up a broadcast. Though the following may come to you as common sense, I still get a huge share of ‘ahaaaa’ moments when I tell people to…

Turn on the lights!

Yes, the difference may amaze you. The blinding lights studios use are no mistake. 

Thing is, regardless of its sensitivity, the digital sensor (and the film before it) of any camera will output a grainy/noisy picture under low lighting. And if you’re streaming it, noise does not encode particularly well and you end up with an unexpectedly poor frame quality, either ‘grainy’, or ‘blocky’, or both. Tech details aside, use the brightest light you can find and see for yourself. And if you truly need to shoot in the dark, realize you may need specialized (i.e. expensive) gear. 

Bonus: you’ll get a less ‘choppy’ video. Many low-end consumer webcams do not have an adjustable aperture and will in turn vary the exposure time to tune the brightness of a frame; in low light this will lead to longer exposures and reduced framerates.
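The arithmetic behind that is simple: a frame can’t be exposed for longer than the interval between frames, so a long exposure caps the framerate. A minimal sketch, with a made-up function name and sample exposure times:

```python
def max_framerate(exposure_ms: float) -> float:
    """Upper bound on frames per second for a given per-frame exposure time."""
    if exposure_ms <= 0:
        raise ValueError("exposure_ms must be positive")
    return 1000.0 / exposure_ms


# A well lit scene might need about 10ms per frame, a dim one 100ms or more.
for exposure in (10, 33, 100):
    print(f"{exposure}ms exposure -> at most {max_framerate(exposure):.0f} fps")
```

At 100ms per frame the webcam can’t deliver more than 10 fps, no matter what the spec sheet says.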

Get a better connection

Please, stop being convinced your internet is amazing simply because your provider or the default speed test told you so. In the case of the latter, what you’re seeing is merely the speed to the nearest PoP, which is usually way off your true internet speed. There’s a lot more to networking than raw (average) speed unfortunately, and before going scientific at least try running the same test against a farther ‘server’ like one in Australia or South America. 

Moreover, if you’re wireless (may that be wifi or cellular) keep in mind that the quality of your connection varies with position, obstacles, interference, weather, conspiracists’ tinfoils. And unlike browsing or tweeting, streaming works best under constant and predictable network speeds and latencies.

So I beg you, especially since nowadays anyone can whip up a high speed mobile hotspot, try another network, you may be in for a surprise. 

Get a better camera!

Sure, that’s obvious. Yet the characteristics will vary widely, and often the quality/performance of the lens, sensor, and sometimes electronic post-processing will make a remarkable difference, between cameras with identical specs nonetheless.

But you don’t have to break the bank in the process. A used/aftermarket DSLR or ‘handycam’ will output a crisper picture than many high-end webcams or smartphone cameras. There’s nothing wrong with borrowing or trying out a few to see what looks/works best for your needs. Or ask around to find what worked for others. Just don’t run your broadcast business around that same camera unless you understand it very well; it may be suboptimal for a million reasons, like having been pointed at the sun.

Reduce the resolution

But hey, isn’t HD and lots of megapixels what everybody’s after nowadays? It is, but the full broadcast chain has to support it in harmony. That is the camera, the capture device, the encoder, and the upload bandwidth. If any of these isn’t up to the task you’ll end up with a sub-par HD picture that is either wasteful or poor. Depending on your setup, a high quality SD may look better and get enjoyed by more.

Bonus: try streaming at 540p. Often unknown or overlooked, you can think of it as near-HD. It’s suitable for most unremarkable needs, and it’ll take slightly more than half of 720p’s bandwidth to encode at the same quality.
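The ‘slightly more than half’ figure follows from the pixel counts. A quick check, assuming the usual 16:9 frame sizes and that bitrate at a given quality scales roughly with the number of pixels (an approximation, real encoders don’t scale perfectly linearly):

```python
# Common 16:9 frame sizes (width, height).
RESOLUTIONS = {
    "540p": (960, 540),
    "720p": (1280, 720),
    "1080p": (1920, 1080),
}


def pixel_ratio(smaller: str, larger: str) -> float:
    """Ratio of pixels per frame between two named resolutions."""
    ws, hs = RESOLUTIONS[smaller]
    wl, hl = RESOLUTIONS[larger]
    return (ws * hs) / (wl * hl)


print(pixel_ratio("540p", "720p"))   # 0.5625
print(pixel_ratio("540p", "1080p"))  # 0.25
```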

Reduce the bitrate

I know, it’s counterintuitive. Higher bitrate always means higher quality, all else being equal. But it’s not proportional. Depending on your content (and the equipment, remember?), there will be a sweet spot beyond which increasing the bitrate will result in little to no visual improvement. There are tools and metrics pros use to gauge that (see PSNR) but the naked eye can still be a good judge; just run a few tests. 
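For the curious, PSNR itself is a short formula: ten times the base-10 logarithm of the squared peak sample value over the mean squared error between the original and the encoded frame. A minimal pure-Python sketch over flat lists of 8-bit samples; in practice you’d run it on real decoded frames with a proper tool, and the sample values below are made up:

```python
import math


def psnr(original: list[int], encoded: list[int], max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if not original or len(original) != len(encoded):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, encoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_value ** 2 / mse)


frame = [100, 120, 140, 160]
mild = [101, 119, 141, 159]    # small encoding error
harsh = [110, 110, 150, 150]   # larger encoding error
print(psnr(frame, mild) > psnr(frame, harsh))  # True
```

Encode the same clip at a few bitrates and watch where the number stops climbing; that’s roughly your sweet spot.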

Don’t beat up your encoder

If using a computer for streaming, make sure its CPU never runs above 80%, ideally even lower. Else, it will drop frames (‘laggy’ again) or otherwise degrade the performance of your stream. Dedicated encoder boxes and smartphone apps tend to automatically pick the encoding profile to match the hardware capabilities so you don’t have to worry that much, but do keep the same in mind and measure it if possible. Now you know. 

Use (better) microphones

This one’s easy. If your sound is poor you’re probably too far away from the mic, or you need a better one. Pay particular attention to the wireless kinds as some may introduce delays, and getting it in sync is kind of an advanced topic. 

Bonus: Be careful not to introduce echo/feedback. Mute all your players and always use headsets if you really need to monitor the transmission’s audio in the vicinity of the microphone. 

Reboot everything

Sketchy topic… For reasons only understood by masterminds, electronics with a reset button don’t just crash and freeze, they may also malfunction in weird, unobvious and unexpected ways, more so if they’re low end and have been running for a longer while. Especially if you’re not an expert, do yourself a favor and take the time to restart that router, computer, smartphone or gadget before the big event.

Expect to fail

Things will go wrong, mercilessly, when and where you least expect it. Ensuring redundancy/failover for every scenario is overkill and overly expensive. Do prepare for the most common mishaps (internet/electricity going out) but not the apocalypse. When it hits the fan, deal with it the best you can and don’t freak out; your viewers are forgiving and your reputation is salvageable. Apologize if that’s the case, and be honest about what went wrong and the steps you’re taking to avoid that in the future. 

Free peer-to-peer assisted live streaming

So it’s just like Napster?

The peer-to-peer realm… It goes so far off the ‘classic’ client-server paradigm it’s just a world in itself. Even if you’re not familiar with the topic, you will surely have heard of Bitcoin, Tor, Skype, BitTorrent, DC++ or Napster. What do you know, they all rely on… 

P2P

Computers, smartphones and IoT devices can connect to each other and exchange data. The closer they are to one another, the faster and more efficiently they can communicate. Ha, that’s actually not true 🙂 Open internet connectivity, routing and network peering are optimized for end-user devices to efficiently reach service providers’ servers. As for connecting to each other, it’s hit and miss. Particularly due to extensive NATting and sometimes deliberate ISP blocks, but also for other reasons, 2 random internet connected devices may or may not be able to communicate with each other directly.

Swarming

That’s ok. Except for special circumstances, an IP connected device can still connect to a bunch of quasi-random like-minded fellow devices to share data in a partly-predictable fashion. If you lived through the age of torrent downloads, you probably do have a certain understanding of how individual clients team up and help each other towards a common goal by sharing pieces of that same individual content. Results will vary; peers in close vicinity to each other (network-wise, not necessarily geographic) do share faster while others have to wait, sometimes more than they would if downloading from an actual server. Regardless, pressure and traffic on the seed(s) is heavily reduced as compared to the scenario where all clients would have to download directly. 

The video streaming context

The amounts of data trafficked by video streaming are enormous. Meaning that somewhere there’s an enormous traffic bill. Forget the giants as they can strike nice deals with the CDNs, the average players will end up paying lots for broadcasting their venue outside of the ad-driven free services like YT and FB. 

So what if we could put some of that p2p magic to good use…?

Peer-assisted video streaming is not a new idea. Sure, unlike a torrent download you can’t afford your viewers to buffer a lot or play at low quality just because of inadequate peer availability. Instead, rely on your friendly CDN to quickly grab the first part of the video and, in parallel, start downloading latter pieces from peers as soon as you have secured a comfortable buffer to ensure smooth playback for a while; fall back to CDN if peering capabilities degrade.
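The decision described above fits in a few lines. This is a minimal sketch of the idea only, not how any particular player implements it; the function name, the 10 second buffer threshold, and the peer count input are assumptions made for illustration.

```python
def pick_source(buffer_seconds: float, peers_with_segment: int,
                min_buffer: float = 10.0) -> str:
    """Decide where the next video segment should be fetched from."""
    # Startup or a draining buffer: don't gamble, go straight to the CDN.
    if buffer_seconds < min_buffer:
        return "cdn"
    # Comfortable buffer but nobody nearby has the piece: fall back to the CDN.
    if peers_with_segment == 0:
        return "cdn"
    # Comfortable buffer and at least one peer has it: save on traffic.
    return "peer"


print(pick_source(2.0, 5))    # cdn  (still filling the buffer)
print(pick_source(15.0, 5))   # peer
print(pick_source(15.0, 0))   # cdn  (no peer has the segment)
```

A real client would also time out slow peer downloads and retry them against the CDN before the buffer runs dry.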

The live video streaming context

Particular to live streaming, all viewers will be consuming the same pieces of content at the same time. This is of utmost importance and makes for a particularly interesting use case, as the sharing is way more straightforward. Think there’s just 5-10 pieces of video being circulated in the ‘swarm’ at any given time, as opposed to hundreds in an hour long VOD.

If not clear by now, peer-to-peer traffic is free. From the standpoint of the provider that is. Any slice of video downloaded from a peer rather than from a server is a penny saved. And the overall potential savings are huge! Think large communities of ‘neighbors’, like in a campus or compound, downloading that content just once or twice and sharing it among each other in a fast and fairly efficient network. 

And it gets more spectacular as the viewer count increases. For events with enormous audiences like the World Cup or the Super Bowl, the sheer number of devices watching will lead to a high incidence of high-speed peering and massive savings, all at a scale that might get a traditional client-proxy-server network to just crumble. 

Convinced yet? There’s more! It’s not just sheer savings on the content provider’s end. There’s also faster starts, reduced buffering, and superior quality for many of the viewers. For some it’s just as if they were connected to a faster network, with all the benefits of that. 

The readily available technology: HTML5 and WebRTC

WebRTC is finally part of almost any modern browser. And surprisingly unknown to many, it incorporates advanced peering capabilities. Details aside, a piece of JS code can drive swarming between browsers and get them to speed up, cut costs and improve the quality of video playback in a manner that’s transparent to the viewer. And it’s happening. For quite a while already WebTorrent has been around, ventures have tried to capitalize on the tech by selling it as a service, and ready-to-use open solutions eventually surfaced.

The Nugget 

Here it is, your very own p2p-enabled streaming platform, ready to deploy and start broadcasting in minutes. It’s based on this free and open initiative; though built and promoted by a private company there are no strings attached afaik. 

Does it scale?

Beautifully! As mentioned, this could actually sustain numbers that would overwhelm even the mightiest CDN, and that’s no overstatement. Minor note though, the proof of concept makes use of public trackers and if you need to stream to more than a couple thousand you’ll have to deploy your own. Scalability of that will be your bottleneck so take good care of it. 

How much can I save?

Hard to say. Some will advertise figures of ‘up to’ 90% or more, but your mileage will vary. The more watching the better, and the more concentrated into metropolitan areas or individual networks your viewers are, the more and faster they will peer. 
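To get a ballpark for your own case, you can plug your numbers into something like the sketch below. Every figure in it (audience size, bitrate, price per GB, and especially the peer offload ratio) is a made-up assumption; measure your own before trusting the output.

```python
def cdn_bill(viewers: int, hours: float, bitrate_mbps: float,
             price_per_gb: float, peer_offload: float = 0.0) -> float:
    """Estimated CDN traffic cost when a share of the bytes comes from peers."""
    if not 0.0 <= peer_offload <= 1.0:
        raise ValueError("peer_offload must be between 0 and 1")
    # Mbit/s -> Mbit per event -> MB -> GB (decimal units).
    gb_per_viewer = bitrate_mbps * 3600 * hours / 8 / 1000
    return viewers * gb_per_viewer * (1.0 - peer_offload) * price_per_gb


# Hypothetical event: 1000 viewers, 2 hours, 3 Mbps, $0.05 per GB.
without_p2p = cdn_bill(1000, 2, 3.0, 0.05)
with_p2p = cdn_bill(1000, 2, 3.0, 0.05, peer_offload=0.6)
print(f"CDN only: ${without_p2p:.2f}, with 60% peer offload: ${with_p2p:.2f}")
```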

What’s the catch?

There isn’t one, everybody wins. Except… 🙂

  1. Extra traffic usage (think double) on most of the viewers due to the fact that they have to upload video pieces to others; not a problem for unlimited wifi but possibly problematic for those on a metered connection
  2. Extra overhead on each of the clients in terms of CPU and memory consumption; that’s needed to initiate and maintain tracker and peer connections and also manage and relay the extensive amount of data

Is the free solution inferior to commercial alternatives?

May very well be. As is the case with other tech, it’s easier and faster to build a proprietary system, and monetizing it may fuel further innovation. P2P is still a matter that spurs academic research, trade secrets and patents. At the core of any solution there’s a tracker and some replication algorithms that can vary immensely in key areas like central coordination, congestion avoidance, mesh optimization etc. Long story short it once again depends, and your use case is possibly very different from others’.

Dirt cheap live transcoding for ABR

Need to pitch this at the lowest possible cost

Couldn’t stress it enough: except for a few borderline cases, Adaptive Bitrate is simply a must have. Your viewers need to start fast and be able to watch the game on poor or fluctuating networks. 

Revisiting the topic as there are ever so many angles to approach it from. For this one, (unsurprisingly) cost was the primary factor and I had to pull all the tricks to get it that cheap. 

In no particular order, and not necessarily to be used all together (actually some are incompatible), following are tactics to reduce the cost of an ABR setup:

Reduce the number of ABR renditions

The main point of ABR is to allow bandwidth-challenged viewers to play your content smoothly, albeit at a lower quality. The more renditions employed, the closer you will be able to match one’s capabilities (it’s at times not just bandwidth but also decoding horsepower and video canvas resolution) and offer them the best possible viewing experience.

On the other hand, having fewer renditions may lock some users into a less than ideal quality setting, yet still fluid, of reasonable visual quality, and with good audio. Especially if your content or programming does not mandate top quality (e.g. news), this may save a lot on transcoding in the long run. 

Add to that, much of the public has grown to instinctively realize a better network will lead to better quality and will make voluntary efforts to better their connection if they want a better video.  

Recycle the original encode 

This won’t work for all scenarios but…

Particularly when source feed is encoded by known studio equipment (unlike user contributed which tends to be less predictable) you can transmux the original video and make it the highest quality variant in the ABR set. Depending on the profile, this may save up to half the overall processing power needed for transcode.

Recycle the original audio

Simple, just mandate a middle ground audio bitrate/quality at the source and use it for all renditions. Not always ideal for audiophiles but good enough for us humans. 
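Both ‘recycle’ tactics can be expressed in a single ffmpeg invocation. The sketch below only builds the argument list; it assumes an H264/AAC source that is already fit to serve as the top rendition, and the function name, URLs, heights and bitrates are placeholders rather than recommendations.

```python
def build_abr_command(source_url: str, renditions: list[tuple[int, str]],
                      out_prefix: str) -> list[str]:
    """Build ffmpeg arguments: passthrough top rendition plus scaled variants."""
    cmd = ["ffmpeg", "-i", source_url]
    # Top rendition: the original encode, transmuxed (no video or audio transcode).
    cmd += ["-c:v", "copy", "-c:a", "copy",
            "-f", "mpegts", f"{out_prefix}_source.ts"]
    for height, video_bitrate in renditions:
        cmd += [
            "-vf", f"scale=-2:{height}",   # keep aspect ratio, even width
            "-c:v", "libx264",
            "-preset", "veryfast",         # cheaper on CPU, less efficient output
            "-b:v", video_bitrate,
            "-c:a", "copy",                # original audio reused for every rendition
            "-f", "mpegts", f"{out_prefix}_{height}p.ts",
        ]
    return cmd


# Hypothetical ingest address and ladder, for illustration only.
cmd = build_abr_command("rtmp://localhost/live/in",
                        [(540, "1200k"), (360, "700k")], "out")
print(" ".join(cmd))
```

Only the two lower renditions cost any video encoding here, and none of the outputs touch the audio.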

Use less complex transcoding

Encoding is a fine trade-off between quality, bitrate (which translates into bandwidth required for transport) and processing needs. In the special case of live streaming, the transcoding device has to offer enough power to process the content in real time. Choosing a less complex transcoding profile, while requiring less computing resources, will lead to video that has a higher bitrate for the same quality, or lower quality for the same bitrate. Sure thing, traffic costs too, and it may be unwise to save pennies on one transcode and pay for the extra traffic multiplied by the number of viewers. Yet every case is different and numbers may be in favor of this approach at times. 

Use the GPU

Modern GPUs have had built-in dedicated video encoders for quite a while now and they can be put to good use in many scenarios. Just off the top of my head, you’d be able to transcode 2-4 times cheaper; the real number depends on a huge number of factors, most notably the cost of actually buying or renting the respective GPUs.

Use the cloud

That’s a no brainer nowadays, I guess. Even if you have some idle dedicated servers lying around, it would be hard to set up a scalable solution around them. Between SaaS cloud transcoding and running custom software on cloud virtual servers, the latter is cheaper by far though it comes with extra headaches. 

Use free software

Duh, doesn’t get any cheaper than that. You don’t get to call support when something goes wrong, but hey, maybe your team is too good to ever need that. Encoders in ffmpeg and gstreamer (e.g. x264) are hardly inferior to their commercial counterparts and are also mature and stable, so no real worries there; most software transcoders are built on top of them anyway.

Use a separate virtual server for every stream

That’s not necessarily a winner for every scenario. In fact it’s always more economical to be doing multiple transcodes on a more powerful machine. But that’s only if you can use that to full capacity, otherwise you’re as efficient as flying a large plane half empty.
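The half empty plane argument in numbers: a minimal sketch comparing the per-stream cost of one big shared machine against one small server per stream. The hourly prices and capacities are invented for illustration and don’t correspond to any real instance type.

```python
def cost_per_stream_hour(instance_price_per_hour: float,
                         capacity_streams: int, active_streams: int) -> float:
    """Hourly cost per stream when one instance is shared by active streams."""
    if not 0 < active_streams <= capacity_streams:
        raise ValueError("active_streams must be between 1 and capacity")
    return instance_price_per_hour / active_streams


# Hypothetical: a $0.40/h box that fits 10 transcodes vs a $0.05/h box that fits 1.
small = cost_per_stream_hour(0.05, 1, 1)
big_full = cost_per_stream_hour(0.40, 10, 10)
big_partial = cost_per_stream_hour(0.40, 10, 4)
print(f"small: {small:.3f}, big (full): {big_full:.3f}, big (4 of 10): {big_partial:.3f}")
```

With these made-up figures the big machine wins at full load but loses as soon as it runs less than about 80% full.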

Take advantage of the Launch Credits

This one is very close to a hack. Particular to AWS, the older virtual server types (since the days they billed by the hour) will let you burst some cheap CPU credits and throw the instance away when depleted. There’s a limit to how much you can abuse the ‘feature’ but it’s good enough to get you started at a real bargain.

Putting it all together (actually just some) …

…the solution is up for grabs. It deploys in a few clicks and sets you up with a rather generous ABR profile for as little as 2-5¢ (!!!) per hour of live streaming, or well within the free tier if you still have that. 

Does it scale?

Not in all directions. Long story short, you can stay on the cheap end of the spectrum if you transcode up to a couple hundred hours of content per day, after that the perks start to run out.

Is it worth it?

Oh yess! If only I could pocket the savings it’s brought…

Is it stable/reliable?

Should be. For a while I monitored it in production and noticed no issues. See for yourself.

Is this blog sponsored by AWS?

No. And by no other company for that matter. I just happen to have been exposed to Amazon’s much more than to others’, but (except perhaps for very specific use cases) I do not believe it’s any better than other cloud platforms. Will gladly take on the challenge to deploy solutions in any environment or to objectively choose one that best suits particular needs.