AWS Direct Connect (DX) is a service that provides a dedicated, private network connection from your on-premises environment (branch office, data center, or colocation facility) to an AWS Region, bypassing your Internet service provider. In particular, it lets you connect your premises to a DX Location in a specific Region over standard Ethernet fiber at high speed (up to 10 Gbps, or 100 Gbps in some Locations).
AWS DX is mainly used for critical, high-throughput, and latency-sensitive workloads and, more generally, whenever you need a more stable, reliable, and secure network connection than an Internet-based one, thanks to the dedicated nature of DX connectivity.
With Internet-based connections you have no guarantee on the actual bandwidth or on the latency, since different paths may be used to route traffic to and from your workloads in AWS at different points in time. With Direct Connect, instead, you can select which data uses the dedicated connection, and your DX traffic is routed to the selected Region over the shortest path.
Direct Connect natively provides a bandwidth of 1 or 10 Gbps (even 100 Gbps in some Locations) for a single connection, and you can use a Link Aggregation Group (LAG) to aggregate multiple connections into a single managed connection with higher bandwidth capacity. Note that AWS Managed VPN supports only up to 1.25 Gbps per VPN tunnel, so you should avoid it if your workloads need much more bandwidth, even when it is used only as a backup for the main Direct Connect link.
If you have workloads that consume a lot of bandwidth on your Internet link, AWS Direct Connect helps you reduce Internet costs: incoming and outgoing traffic is transferred to and from AWS directly rather than through your Internet provider, where it would often lead to higher bandwidth commitments and data transfer charges paid to the ISP.
It is compatible with all AWS services, both private services (e.g. Amazon EC2, VPC, etc.) and public services (e.g. Amazon S3). In particular, by leveraging virtual interfaces and VLANs, Direct Connect lets you directly access either multiple VPCs inside a Region (using private virtual interfaces) or any public AWS service (using public virtual interfaces).
Not only does Direct Connect offer a high-bandwidth, low-latency and stable connection, but it also provides you with secure connectivity to multiple VPCs by ensuring network segmentation and isolation.
Storm Reply, an AWS Premier Consulting Partner since 2014, has developed strong expertise in AWS Direct Connect implementations for several customers. Through years of projects, we have developed a set of best practices that strike the right balance between cost and reliable connectivity to the cloud, leveraging our partnership with Equinix®, which offers both hosted and dedicated connections to AWS.
In a nutshell, a Direct Connect Gateway is a redundant and distributed peering device which is essentially used to mediate and aggregate the Direct Connect peering and the VPC attachment from a networking perspective.
Using DX gateway instead of direct VPC attachment
When your on-premises network has a DX connection to a given Region to directly reach VPCs in that Region, it may also need to connect to VPCs in other Regions. The Direct Connect gateway enables you to do exactly that, without having to use a direct VPC attachment: each VPC in any Region is connected to the DX gateway through its virtual private gateway, and the DX gateway is connected to the DX Location of the main Region through a private virtual interface (VIF). Additionally, with a DX gateway you can use a single VIF to access multiple VPCs, whereas with direct VPC attachments you would have to create a distinct VIF for every VPC you want to connect to.
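
The difference in the number of private VIFs can be illustrated with a small sketch. This is only a counting model: the function names and VIF labels are hypothetical, and service quotas are ignored.

```python
from typing import List


def vifs_with_direct_attachment(vpcs: List[str]) -> List[str]:
    """Direct VPC attachment: one private VIF per VPC (hypothetical labels)."""
    return [f"vif-to-{vpc}" for vpc in vpcs]


def vifs_with_dx_gateway(vpcs: List[str]) -> List[str]:
    """DX gateway: a single private VIF, however many VPCs sit behind it."""
    return ["vif-to-dx-gateway"] if vpcs else []


vpcs = ["vpc-a", "vpc-b", "vpc-c"]
print(len(vifs_with_direct_attachment(vpcs)), "VIFs with direct attachment")
print(len(vifs_with_dx_gateway(vpcs)), "VIF with a DX gateway")
```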
Configuration using Transit-GW
If you want to enable mesh connectivity among your on-premises networks and your VPCs, while managing a single DX connection and VIF and advertising prefixes to and from AWS, you need a Transit Gateway (TGW) to which you attach all of your VPCs. The TGW has to be associated with the DX Gateway of the DX connection you want to route traffic over, and a single transit virtual interface has to be created to connect the DX gateway with the DX Location.
Configuration in multi-account
In many real-world scenarios, the Direct Connect Gateway (DXGW) and the Transit Gateway (TGW) may have different owner accounts. Let’s suppose, for example, that a first account, say Account-A, owns the DXGW, while another account, say Account-B, owns a TGW with many VPCs attached. For those VPCs to use Direct Connect, Account-B has to send an association proposal to Account-A. If Account-A accepts the proposal, the VPCs can route traffic toward the DXGW, which controls the routing and hence decides which prefixes to advertise in both directions.
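
The proposal workflow can be sketched as a toy state model. It does not call any AWS API; the class, function, and field names are hypothetical, and the prefix override on acceptance is an assumption used to illustrate that the DXGW owner has the final say on the advertised prefixes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AssociationProposal:
    proposer_account: str        # owner of the TGW (Account-B in the text)
    dxgw_owner_account: str      # owner of the DXGW (Account-A in the text)
    allowed_prefixes: List[str]  # prefixes the proposer asks to advertise
    state: str = "requested"


def accept(proposal: AssociationProposal, acting_account: str,
           override_prefixes: Optional[List[str]] = None) -> AssociationProposal:
    """Accept a pending proposal; only the DXGW owner is allowed to do so."""
    if acting_account != proposal.dxgw_owner_account:
        raise PermissionError("only the DXGW owner can accept the proposal")
    if proposal.state != "requested":
        raise ValueError(f"proposal is already {proposal.state}")
    if override_prefixes is not None:
        # Assumption: the DXGW owner may replace the requested prefixes.
        proposal.allowed_prefixes = list(override_prefixes)
    proposal.state = "accepted"
    return proposal


proposal = AssociationProposal("account-b", "account-a", ["10.0.0.0/16"])
accept(proposal, "account-a", override_prefixes=["10.0.0.0/24"])
print(proposal.state, proposal.allowed_prefixes)
```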
Link aggregation is a common networking technique that combines several physical links into a single channel, in order to increase bandwidth incrementally and to improve resilience in case of a link failure. If you have multiple dedicated connections from your Customer Gateway to a specific DX Location for redundancy, all of them with the same bandwidth (i.e. 1, 10, or 100 Gbps), you can create a Link Aggregation Group (LAG) for those connections and treat it as a single logical link to simplify configuration and management.
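
As a minimal sketch of the two conditions stated above (same bandwidth, same DX Location), the following check computes the aggregate capacity of a candidate LAG. The `Connection` type and the location label are hypothetical, and other service limits (such as the maximum number of links per LAG) are deliberately not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Connection:
    name: str
    location: str  # DX Location where the connection terminates
    gbps: int      # port speed of the dedicated connection


def lag_capacity_gbps(connections: List[Connection]) -> int:
    """Return the aggregate bandwidth if the connections can form one LAG."""
    if not connections:
        raise ValueError("a LAG needs at least one connection")
    if len({c.gbps for c in connections}) != 1:
        raise ValueError("all connections must have the same bandwidth")
    if len({c.location for c in connections}) != 1:
        raise ValueError("all connections must use the same DX Location")
    return sum(c.gbps for c in connections)


links = [Connection("dx-1", "location-1", 10), Connection("dx-2", "location-1", 10)]
print(lag_capacity_gbps(links), "Gbps aggregate")
```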
High resilience in network connectivity to the cloud is crucial for a well-architected system, and here are some best practices to follow in order to reach this goal. First, it is a best practice to have multiple, physically separated on-premises networks connected to AWS through different DX Locations, in order to guarantee resilience not only to device failures but also to the failure of an entire physical location. Second, it is also advisable to have redundant hardware and telco providers, e.g. multiple customer gateways connected to as many DX routers in every DX Location (and also when connecting from a single physical location); this configuration provides the maximum resilience for your critical workloads.
Additionally, it is important to use dynamic routing (rather than static routing) to perform load balancing and failover automatically across multiple redundant connections. Finally, you should avoid using an AWS VPN as a backup for a primary DX connection with a throughput greater than 1 Gbps; more generally, you should provision enough network bandwidth that the failure of one connection does not exceed the capacity of the remaining redundant connections.
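
The capacity rule can be checked with simple arithmetic. The sketch below is a simplification under stated assumptions: it assumes traffic can be spread perfectly over the surviving links and considers only the loss of the single largest link. The function name is hypothetical; the 1.25 Gbps figure is the per-tunnel VPN limit mentioned earlier.

```python
from typing import List

VPN_TUNNEL_GBPS = 1.25  # per-tunnel limit of AWS Managed VPN, as stated above


def survives_largest_link_failure(link_gbps: List[float], peak_demand_gbps: float) -> bool:
    """True if the remaining links can still carry the peak demand
    after the largest link fails (ideal load distribution assumed)."""
    if not link_gbps:
        return False
    remaining = sum(link_gbps) - max(link_gbps)
    return remaining >= peak_demand_gbps


# A 10 Gbps DX link backed up by a single VPN tunnel, with 5 Gbps of peak traffic.
print(survives_largest_link_failure([10, VPN_TUNNEL_GBPS], 5))
# Two 10 Gbps DX links, with 8 Gbps of peak traffic.
print(survives_largest_link_failure([10, 10], 8))
```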