Web Real Time Communication
WebRTC provides a feature-rich framework that makes it easy for developers to build browser applications that communicate in real time. It provides the fundamental building blocks for transmitting audio, video and text from client to client.
When implemented in a browser, these components are accessed through JavaScript APIs, which makes the developer's job a lot simpler.
Other programming languages will also have their own implementations of WebRTC. This means that we can have a C++ or Go client interacting with a user’s web browser, or even two Go clients (or any programming language that has a WebRTC implementation) communicating with each other in real time.
I love Go and have recently been using Go Pion, which is a pure Go implementation of the WebRTC API.
Peer-to-peer connections: cutting out the middleman reduces latency.
One advantage is that WebRTC doesn't require the user to download any third-party software: it just works in the browser, with no application to install.
Connections are peer-to-peer, so traffic doesn't need to go through a central server. Cutting out the middleman saves time, so very low latency can be achieved.
Some ambitious projects are possible thanks to WebRTC, such as:
Some concepts to understand before building a WebRTC application include:
Today nearly everyone sits behind Network Address Translation (NAT). This simply means that your machine's private IP address is not visible to the outside world; instead, your router's public IP address is visible, and the router manages the traffic on your behalf.
Machine 10.0.0.2 wants to make a connection to machine 4.4.4.4:80. In order to do this the machine will try to make a connection request.
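To make the translation step concrete, here is a minimal sketch of a NAT table in JavaScript. It is only an illustration: the router address 5.5.5.5, the source port 8992, the starting external port 3333 and the function names are all assumptions I have made up, and real routers pick ports in their own way.

```javascript
// A toy NAT table. All names and addresses here are hypothetical.
const ROUTER_PUBLIC_IP = "5.5.5.5"; // assumed public address of the router
const natTable = [];
let nextExternalPort = 3333; // assumed starting port, chosen arbitrarily

// Rewrite an outbound packet so it appears to come from the router,
// and remember the mapping so replies can be routed back.
function translateOutbound(packet) {
  let entry = natTable.find(
    (e) => e.internalIp === packet.srcIp && e.internalPort === packet.srcPort
  );
  if (!entry) {
    entry = {
      internalIp: packet.srcIp,
      internalPort: packet.srcPort,
      externalIp: ROUTER_PUBLIC_IP,
      externalPort: nextExternalPort++,
      destIp: packet.dstIp,
      destPort: packet.dstPort,
    };
    natTable.push(entry);
  }
  return { ...packet, srcIp: entry.externalIp, srcPort: entry.externalPort };
}

// Rewrite a reply addressed to the router back to the internal machine.
function translateInbound(packet) {
  const entry = natTable.find(
    (e) => e.externalIp === packet.dstIp && e.externalPort === packet.dstPort
  );
  if (!entry) return null; // no mapping: the router drops the packet
  return { ...packet, dstIp: entry.internalIp, dstPort: entry.internalPort };
}

// 10.0.0.2 connects to 4.4.4.4:80, and the server replies to the router.
const sent = translateOutbound({
  srcIp: "10.0.0.2", srcPort: 8992, dstIp: "4.4.4.4", dstPort: 80,
});
const reply = translateInbound({
  srcIp: "4.4.4.4", srcPort: 80, dstIp: sent.srcIp, dstPort: sent.srcPort,
});
console.log(sent, reply);
```

The remote machine only ever sees the router's address, and the router uses the table to find its way back to 10.0.0.2.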
There are four different types of translation method, and each router's implementation will vary slightly. It is important to understand them so that we know why we are building certain functionality into our WebRTC application.
In a one-to-one (full cone) NAT, packets sent to the external IP:Port on the router always map to the internal IP:Port, without exception.
This means that the router does not check where an external packet has come from; it always accepts the incoming packet. For example, if we had these incoming packets, all would be allowed to pass, because the external IP and port match the ones in the table. It doesn't matter which remote host sent them.
In an address-restricted NAT, packets to the external IP:Port on the router map to the internal IP:Port only if the packet's source address matches an entry in the table (regardless of port).
In other words, a packet is allowed only if we have communicated with that host before.
Only the first packet here would be accepted, because its source IP matches the destination IP recorded in the table.
A port-restricted NAT places restrictions on both the IP address and the port.
Once again, only the first packet would be allowed through, because both its source IP and its source port match the destination recorded in the table.
Symmetric NAT is by far the strictest implementation. It is very similar to port-restricted NAT, but the table must be completely symmetrical: the router creates a separate external mapping for each destination, and only that destination can send packets back through it.
If a client is behind symmetric NAT, a direct peer-to-peer connection is generally not possible, and a TURN server must be used to work around this.
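The four behaviours can be compared side by side with a small sketch. This is my own simplification rather than how any particular router is implemented; the table entry, the addresses and the function names are hypothetical.

```javascript
// One row of a NAT table, using made-up addresses: the internal machine was
// given external port 3333 and has previously sent a packet to 4.4.4.4:80.
const tableEntry = {
  externalIp: "5.5.5.5", externalPort: 3333, destIp: "4.4.4.4", destPort: 80,
};

// Decide whether the NAT lets an inbound packet through.
function allowsInbound(natType, entry, packet) {
  // Every NAT type requires the packet to target the mapped external IP:Port.
  const hitsMapping =
    packet.dstIp === entry.externalIp && packet.dstPort === entry.externalPort;
  if (!hitsMapping) return false;

  switch (natType) {
    case "full-cone":
      return true; // the source is never checked
    case "address-restricted":
      return packet.srcIp === entry.destIp; // the source port is ignored
    case "port-restricted":
    case "symmetric":
      return packet.srcIp === entry.destIp && packet.srcPort === entry.destPort;
    default:
      throw new Error(`unknown NAT type: ${natType}`);
  }
}

// What identifies a mapping. Cone NATs reuse one external port per internal
// IP:Port; a symmetric NAT allocates a new one for every destination.
function mappingKey(natType, packet) {
  const internal = `${packet.srcIp}:${packet.srcPort}`;
  return natType === "symmetric"
    ? `${internal}->${packet.dstIp}:${packet.dstPort}`
    : internal;
}

const toRouter = { dstIp: "5.5.5.5", dstPort: 3333 };
const fromKnownHost = { srcIp: "4.4.4.4", srcPort: 80, ...toRouter };
const sameHostOtherPort = { srcIp: "4.4.4.4", srcPort: 8080, ...toRouter };
const stranger = { srcIp: "9.9.9.9", srcPort: 80, ...toRouter };

for (const natType of ["full-cone", "address-restricted", "port-restricted", "symmetric"]) {
  const results = [fromKnownHost, sameHostOtherPort, stranger].map((packet) =>
    allowsInbound(natType, tableEntry, packet)
  );
  console.log(natType, results);
}
```

The `mappingKey` function hints at why symmetric NAT is a problem: the external port a STUN server sees is not the port the router would use when talking to the remote peer.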
STUN stands for Session Traversal Utilities for NAT. STUN servers give clients the information they need to set up a peer-to-peer connection. It is important to note that STUN will not work for symmetric NAT.
This information includes the client's public IP address and port, and the type of NAT it is behind.
Once the packet arrives at the STUN server:
This green arrow would be the shortest path to the peer, which keeps latency to a minimum. Because no middleman sits between the two peers, communication is much faster.
There isn't a vast amount of functionality in a STUN server, so they are very cheap to run. They are so cheap that Google and other providers offer public STUN servers that are free to use.
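To show how little a STUN exchange involves, here is a rough sketch of the two halves: building a Binding Request and decoding the public address from an XOR-MAPPED-ADDRESS attribute, following the message layout in RFC 5389. The browser does all of this for you, so the function names are my own, the transaction ID is supplied by the caller, and only IPv4 is handled.

```javascript
const MAGIC_COOKIE = 0x2112a442; // fixed value defined by RFC 5389

// Build a 20-byte STUN Binding Request with no attributes.
function buildBindingRequest(transactionId) {
  if (transactionId.length !== 12) {
    throw new Error("transaction ID must be 12 bytes");
  }
  const message = new Uint8Array(20);
  const view = new DataView(message.buffer);
  view.setUint16(0, 0x0001); // message type: Binding Request
  view.setUint16(2, 0); // message length: nothing follows the header
  view.setUint32(4, MAGIC_COOKIE);
  message.set(transactionId, 8);
  return message;
}

// Decode the value of an XOR-MAPPED-ADDRESS attribute (IPv4 only).
// Layout: 1 reserved byte, 1 family byte, 2 bytes of port, 4 bytes of address.
function decodeXorMappedAddress(value) {
  const view = new DataView(value.buffer, value.byteOffset, value.byteLength);
  if (view.getUint8(1) !== 0x01) {
    throw new Error("only IPv4 is handled in this sketch");
  }
  // The port is XORed with the top 16 bits of the magic cookie,
  // and the address with the whole cookie.
  const port = view.getUint16(2) ^ (MAGIC_COOKIE >>> 16);
  const address = (view.getUint32(4) ^ MAGIC_COOKIE) >>> 0;
  const ip = [24, 16, 8, 0].map((shift) => (address >>> shift) & 0xff).join(".");
  return { ip, port };
}

const request = buildBindingRequest(new Uint8Array(12).fill(7));
console.log(request.length); // 20

// A made-up attribute value that encodes the public address 5.5.5.5:3333.
const mapped = decodeXorMappedAddress(
  new Uint8Array([0x00, 0x01, 0x2c, 0x17, 0x24, 0x17, 0xa1, 0x47])
);
console.log(mapped); // { ip: "5.5.5.5", port: 3333 }
```

The server simply reads the source address of the request it received and echoes it back, which is why these servers need so few resources.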
Recall that this type of communication is not possible if the client is behind symmetric NAT. If this is the case, there is an alternative that can be used.
TURN stands for Traversal Using Relays around NAT. In the case of symmetric NAT, we use TURN: a server that relays packets between the peers. Because TURN puts a middleman back in the path, it adds latency and is not desirable, but in some cases it must be used.
TURN servers are more expensive to maintain, so I think it is unlikely that providers would give them away for free. XIRSYS does provide a free TURN server with a basic developer account, which is good for a personal project; it is very simple to set up and doesn't require a credit or debit card. However, production TURN servers will cost you.
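In the browser, STUN and TURN servers are handed to the peer connection through its `iceServers` configuration. The sketch below assumes one of Google's public STUN servers and a placeholder TURN server; the TURN hostname, username and password are made up and must be replaced with the details from your own provider.

```javascript
// Hypothetical values: swap in the details from your TURN provider.
const rtcConfiguration = {
  iceServers: [
    { urls: "stun:stun.l.google.com:19302" }, // a public Google STUN server
    {
      urls: "turn:turn.example.com:3478", // placeholder TURN server
      username: "demo-user", // placeholder credential
      credential: "demo-password", // placeholder credential
    },
  ],
};

// In a browser, the configuration is passed to the peer connection constructor.
function createPeerConnection() {
  return new RTCPeerConnection(rtcConfiguration);
}
```

Listing both lets the browser try a direct path first and fall back to the relay only when it has to.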
With STUN and TURN both available, how do we know which option to pick? This is the job of Interactive Connectivity Establishment (ICE). ICE collects all available candidates, such as:
These are all known as ICE candidates. All the collected addresses are then sent to the remote peer via SDP.
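Each candidate travels as a single line of text, and its `typ` field says where the address came from: `host` for a local address, `srflx` for a public address discovered via STUN, and `relay` for an address on a TURN server. As a rough sketch, the parser below pulls out the main fields; the function name and the sample candidates (including their addresses and priorities) are made up for illustration.

```javascript
// Parse the leading fields of an ICE candidate line:
// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
function parseCandidate(line) {
  const fields = line.replace(/^candidate:/, "").trim().split(/\s+/);
  const [foundation, component, protocol, priority, address, port, typKeyword, type] = fields;
  if (typKeyword !== "typ" || type === undefined) {
    throw new Error("not a valid ICE candidate line");
  }
  return {
    foundation,
    component: Number(component),
    protocol: protocol.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type,
  };
}

// Hypothetical candidates for a machine at 10.0.0.2 behind a router at 5.5.5.5.
const sampleCandidates = [
  "candidate:1 1 udp 2122260223 10.0.0.2 54321 typ host",
  "candidate:2 1 udp 1686052607 5.5.5.5 3333 typ srflx raddr 10.0.0.2 rport 54321",
  "candidate:3 1 udp 41885439 203.0.113.10 50000 typ relay raddr 5.5.5.5 rport 3333",
];

for (const line of sampleCandidates) {
  const candidate = parseCandidate(line);
  console.log(candidate.type, `${candidate.address}:${candidate.port}`);
}
```

In a real application you rarely need to parse these yourself, but reading them in the console is a quick way to see whether STUN or TURN is actually being used.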
So, what is SDP? The almighty Session Description Protocol is arguably the most important part of WebRTC. It is a format that describes:
A lot of information is included in the SDP. The goal is to take the SDP generated by one user and send it to the other party. The other party then generates its own SDP and sends it back to the first user to establish the peer-to-peer connection.
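Under the hood, an SDP is just lines of text in the form `<letter>=<value>`, with a session-level part followed by one section per `m=` (media) line. The sketch below splits a heavily shortened, made-up SDP into those parts; a real one generated by a browser contains many more lines, and the function name is my own.

```javascript
// Split an SDP into its session-level fields and its media sections.
function parseSdp(sdp) {
  const session = [];
  const media = [];
  let current = session;
  for (const line of sdp.split(/\r?\n/)) {
    if (line.length < 2 || line[1] !== "=") continue; // skip blank or malformed lines
    const field = { type: line[0], value: line.slice(2) };
    if (field.type === "m") {
      current = []; // every m= line starts a new media section
      media.push(current);
    }
    current.push(field);
  }
  return { session, media };
}

// A shortened, hypothetical SDP describing a single data channel.
const sampleSdp = [
  "v=0",
  "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=application 9 UDP/DTLS/SCTP webrtc-datachannel",
  "c=IN IP4 0.0.0.0",
  "a=mid:0",
  "a=sctp-port:5000",
].join("\r\n");

const description = parseSdp(sampleSdp);
console.log(description.session.length, "session fields");
console.log(description.media.map((section) => section[0].value));
```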
It is even possible to add your own custom information to the protocol. Check out this interesting article from the Discord blog.
Signalling is the process of sending the user-generated SDP to the other party that we wish to communicate with. I will not cover the details of how to build a signalling server because this blog post is already becoming lengthy enough.
Signalling can be done in several ways, but a common and convenient approach is to use a WebSocket to exchange the two SDPs.
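Although building the server is out of scope, the client side of a WebSocket exchange can be sketched in a few lines. This assumes a hypothetical signalling server that simply forwards every text message to the other peer; the function names are mine, and the sketch waits for ICE gathering to finish so that each description already carries its candidates.

```javascript
// Serialise a session description (an offer or an answer) for the socket.
function encodeDescription(description) {
  return JSON.stringify({ type: description.type, sdp: description.sdp });
}

function decodeDescription(text) {
  const { type, sdp } = JSON.parse(text);
  if (type !== "offer" && type !== "answer") {
    throw new Error(`unexpected description type: ${type}`);
  }
  return { type, sdp };
}

// Resolve once the browser has finished gathering ICE candidates.
function iceGatheringComplete(pc) {
  if (pc.iceGatheringState === "complete") return Promise.resolve();
  return new Promise((resolve) => {
    const onChange = () => {
      if (pc.iceGatheringState !== "complete") return;
      pc.removeEventListener("icegatheringstatechange", onChange);
      resolve();
    };
    pc.addEventListener("icegatheringstatechange", onChange);
  });
}

// Calling side: create an offer and send it through the socket.
async function sendOffer(socket, pc) {
  await pc.setLocalDescription(await pc.createOffer());
  await iceGatheringComplete(pc);
  socket.send(encodeDescription(pc.localDescription));
}

// Both sides: apply whatever description arrives, and answer any offer.
function attachSignalling(socket, pc) {
  socket.addEventListener("message", async (event) => {
    const description = decodeDescription(event.data);
    await pc.setRemoteDescription(description);
    if (description.type === "offer") {
      await pc.setLocalDescription(await pc.createAnswer());
      await iceGatheringComplete(pc);
      socket.send(encodeDescription(pc.localDescription));
    }
  });
}
```

In a browser, `socket` would be something like `new WebSocket("wss://signalling.example.com")`, where the URL is a placeholder for your own server. Both peers call `attachSignalling`, and only the caller also calls `sendOffer`.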
This example can be followed in the browser's web console, which makes it quick and easy to try. The goal is to connect two browsers so that they can send messages to each other via WebRTC.
Peer A will create an offer, which is an SDP, and set its local description to that offer. Peer B will receive the offer and use it to set its remote description. Peer B will then create an answer (which is just another SDP), and the answer will become Peer B's local description. Peer B will signal this answer to Peer A, and Peer A will set its remote description to Peer B's answer.
Peer A Code
This code can be typed in line by line in the web console.
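As a minimal sketch of Peer A's side, the steps could look like this. It is wrapped in a function so it can be pasted in one go; the function name, the channel label "chat" and the log messages are my own choices.

```javascript
// Run in the first browser's console (Peer A).
function startPeerA() {
  const pc = new RTCPeerConnection();

  // Peer A creates the data channel; "chat" is an arbitrary label.
  const channel = pc.createDataChannel("chat");
  channel.onopen = () => console.log("Connection established");
  channel.onmessage = (event) => console.log("Peer B says:", event.data);

  // Each time a candidate is gathered, print the updated offer.
  // The last one printed is the one to hand to Peer B.
  pc.onicecandidate = () => console.log(JSON.stringify(pc.localDescription));

  // Create the offer and use it as Peer A's local description.
  pc.createOffer().then((offer) => pc.setLocalDescription(offer));

  return { pc, channel };
}
```

Calling `const peerA = startPeerA();` in the console kicks things off, and keeping hold of the returned object means the connection and channel are still available for the later steps.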
Once these steps have been completed, the console should print a JSON structure that includes values for the type and sdp keys.
Peer B Code
Now that Peer A has created an offer, Peer B needs to use it. For this example we will just copy and paste the offer, although I wouldn't recommend this approach in production for obvious reasons.
The offer that is printed in the console will be longer than the offer I have set in the example code. This is just to make the code readable.
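A minimal sketch of Peer B's side might look like the following. Here the offer is passed in as an argument instead of being pasted into the code; the function name and log messages are my own, and `offer` is assumed to be the object with type and sdp keys that Peer A printed.

```javascript
// Run in the second browser's console (Peer B).
// `offer` is the { type, sdp } object printed by Peer A.
function startPeerB(offer) {
  const pc = new RTCPeerConnection();
  const peer = { pc, channel: null };

  // Peer B does not create a channel; it receives the one Peer A opened.
  pc.ondatachannel = (event) => {
    peer.channel = event.channel;
    peer.channel.onopen = () => console.log("Connection established");
    peer.channel.onmessage = (e) => console.log("Peer A says:", e.data);
  };

  // Print the updated answer as candidates are gathered.
  // The last one printed is the one to hand back to Peer A.
  pc.onicecandidate = () => console.log(JSON.stringify(pc.localDescription));

  // Apply the offer, then create the answer and make it the local description.
  pc.setRemoteDescription(offer)
    .then(() => pc.createAnswer())
    .then((answer) => pc.setLocalDescription(answer));

  return peer;
}
```

To use it, call `startPeerB` with the JSON object copied from Peer A's console as the argument, and keep the returned object so the channel can be used once it opens.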
The console should print a JSON object with type and sdp keys, where the type is answer.
The connection is not quite established yet. We need to set the remote description in Peer A with the answer that has just been generated.
Peer A
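This final step is a single call on Peer A's connection. As a small sketch, with a helper name of my own: `pc` is assumed to be the RTCPeerConnection on which Peer A created its offer, and `answer` is the object Peer B printed.

```javascript
// Run in Peer A's console once Peer B has produced its answer.
function acceptAnswer(pc, answer) {
  if (answer.type !== "answer") {
    throw new Error("expected a description of type answer");
  }
  // Returns a promise that resolves once the answer has been applied.
  return pc.setRemoteDescription(answer);
}
```

Once the data channel reports that it is open, either side can call `send` on its RTCDataChannel, for example `channel.send("Hello")`, and the text arrives in the other peer's `onmessage` handler.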
Both consoles should print a message stating a connection has been established. You can send messages between the peers using the function.
If you would like any more information on WebRTC, please get in touch with our Future Networks Consultant!