The following article is part 2 in a 3-part series about CDNs (Content Delivery Networks), and it's a guest post by Matt Conran of Network Insight. In this article, Matt covers CDN PoP (Points of Presence) architectures, with a detailed discussion of the methods for directing users to the best PoP, including the pros and cons of DNS load balancing versus Anycast.
CDN PoP Generic Architecture
CDNs operate a distributed edge PoP design, strategically placing PoP infrastructure as close as possible to eyeball networks, where most of their users are located. The core of a CDN architecture relies on PoPs, which are very small data centres that might contain only a few x86 servers connected with some network equipment.
For example, if you have a large customer base in Germany, you would install local PoPs in Germany, rather than having those users' requests for content travel to a different country or continent. So now we have some smaller PoP locations closer to the user sites. What benefits does this bring to user experience?
Firstly, the PoPs store and cache static content. Static content is easily cacheable while dynamic content is not. Instead of the user's requests travelling to a data centre in a faraway location, they can be served from the local PoP, which is a shorter distance away. This reduces the round-trip time (RTT) of each exchange, resulting in better application performance.
If the local PoP does not have the requested content available locally, it may request the content from the central data centre location. The local PoP and primary data centre are already talking to each other, enabling the reuse of an existing TCP connection. This avoids a new handshake and a fresh slow start, so the transfer begins with the congestion window already at its maximum size. When the TCP congestion window is at its peak, data can really start flowing between two endpoints.
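To put rough numbers on the benefit of a warm connection, here is a minimal sketch that counts round trips under an idealised slow start. The function name, the segment count and the window sizes are illustrative assumptions, and loss, delayed ACKs and congestion avoidance are ignored.

```python
def round_trips_to_send(total_segments: int, initial_cwnd: int, max_cwnd: int) -> int:
    """Count the RTTs needed to deliver total_segments under idealised slow start.

    Each round trip sends `cwnd` segments, then cwnd doubles up to max_cwnd.
    """
    if initial_cwnd < 1 or max_cwnd < 1:
        raise ValueError("congestion windows must be at least 1 segment")
    cwnd = initial_cwnd
    sent = 0
    rtts = 0
    while sent < total_segments:
        sent += cwnd
        rtts += 1
        cwnd = min(cwnd * 2, max_cwnd)
    return rtts


# Hypothetical object: roughly 1 MB split into 700 segments.
SEGMENTS = 700

# New connection: starts at 10 segments and has to ramp up.
cold = round_trips_to_send(SEGMENTS, initial_cwnd=10, max_cwnd=640)

# Reused PoP-to-origin connection: window is already fully open.
warm = round_trips_to_send(SEGMENTS, initial_cwnd=640, max_cwnd=640)

print(f"cold connection: {cold} RTTs, warm connection: {warm} RTTs")
```

With these assumed values the cold connection needs 7 round trips and the warm one needs 2, before even counting the extra round trip for the TCP handshake on a new connection.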
Let's get rid of state
Logically, PoP locations can take on a number of designs. But usually, CDNs try to remove as much state as possible.
State is anything a device has to keep in memory or registers between packets. A stateless device does not need to remember earlier packets, so each packet may be processed independently. Many security devices hold a lot of state due to stateful functions and Deep Packet Inspection (DPI). They are slower than a device that simply forwards packets based on the destination IP address. Too much state in the network destroys forwarding performance and increases complexity. Devices that hold state require more processing power, are generally expensive and are hard to manage.
Most CDNs aim to operate in a clean stateless design, offering better operations and cleaner networking. There will be state somewhere, but it's not going to be right in the middle, clogging up network performance.
Many CDN PoP architectures run equal-cost multi-path routing (ECMP) right down to the server hosts at the local PoP. ECMP allows packets for a single destination to be forwarded over multiple next-hop paths of equal cost. Massive leaf and spine data centres are now designed with large ECMP deployments for load balancing, and this is the only real way to scale a data centre efficiently. Within a CDN, ECMP removes the need to run heavy load balancers and expensive routers. Everything is terminated on the host, including any Secure Sockets Layer (SSL) connections. All this is achieved at reduced cost by deploying cheaper Layer 3 switches and standards-based BGP.
The designs consist of a BGP route reflector (RR) deployed in each CDN PoP, which runs iBGP to the local hosts and eBGP out to the WAN. A BGP RR is an alternative to a full mesh of iBGP speakers. It acts as the centrepiece for all iBGP sessions, increasing scalability and performance.
Load balancing is now based on pure IP and can be done in hardware, without specialist load balancing devices and Layer 3 routers.
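The ECMP load balancing described above can be sketched as a hash over the flow's 5-tuple. This is only an illustration: real switches use their own hardware hash functions, and SHA-256, the function name and the addresses here are stand-in assumptions.

```python
import hashlib


def ecmp_next_hop(flow, next_hops):
    """Pick one of several equal-cost next hops by hashing the flow's 5-tuple.

    No per-flow state is stored: the same flow always hashes to the same
    next hop for as long as the set of next hops stays the same.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical server hosts in one PoP, all reachable at equal cost.
SERVERS = ["10.0.0.11", "10.0.0.12", "10.0.0.13", "10.0.0.14"]

# (source IP, source port, destination IP, destination port, protocol)
flow = ("198.51.100.7", 51514, "203.0.113.10", 443, "tcp")

first = ecmp_next_hop(flow, SERVERS)
again = ecmp_next_hop(flow, SERVERS)
print(first == again)  # every packet of the flow lands on the same host

# Different flows (here: different source ports) spread across the hosts.
used = {
    ecmp_next_hop(("198.51.100.7", port, "203.0.113.10", 443, "tcp"), SERVERS)
    for port in range(50000, 50200)
}
print(sorted(used))
```

Because the choice is a pure function of the packet headers, the switch needs no connection table, which fits the stateless design goal.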
Directing Users to PoPs
So now we have discussed the PoP physical layout and logical design with ECMP and BGP, but how do we direct user requests to the correct PoP? After all, this is the purpose of the PoP. We have two primary methods:
- DNS-based load balancing.
- Anycast.
1) DNS-based load balancing
Traditionally, CDN designs started with each PoP advertising a different IP address to the WAN, informing users of its location. This is combined with geolocation DNS-based load balancing, to send the client requests to one of those data centres. Each data centre is configured with a different IP address.
The DNS-based load balancing method has plenty of shortcomings. The DNS server responds to the client based on the IP of the resolver, not the actual client IP address.
A client based in Europe could be configured to use a resolver in the USA. As a result, the data centre IP returned to the client will be based in the USA and not Europe. This is suboptimal in the sense that you can only run performance metrics to the DNS resolver IP and not to the client's IP address.
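The resolver mismatch can be illustrated with a small sketch. The geolocation table, the PoP addresses and the function names are all hypothetical, and the prefixes are documentation ranges rather than real allocations.

```python
import ipaddress

# Hypothetical geolocation table: network prefix -> region.
GEO_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): "us",
    ipaddress.ip_network("198.51.100.0/24"): "eu",
}

# Hypothetical PoP address per region (one IP per data centre).
POP_BY_REGION = {"us": "203.0.113.10", "eu": "203.0.113.20"}


def region_of(ip, default="us"):
    """Look up the region of an IP address in the toy geolocation table."""
    addr = ipaddress.ip_address(ip)
    for network, region in GEO_TABLE.items():
        if addr in network:
            return region
    return default


def resolve(resolver_ip):
    """Answer a DNS query: the server only sees the resolver's IP address."""
    return POP_BY_REGION[region_of(resolver_ip)]


client_ip = "198.51.100.25"   # client sits in Europe
resolver_ip = "192.0.2.53"    # but is configured to use a resolver in the USA

answer = resolve(resolver_ip)
ideal = POP_BY_REGION[region_of(client_ip)]
print(f"returned PoP: {answer}, closest PoP: {ideal}")
```

In this sketch the European client is handed the US PoP address, because the decision was made on the resolver's location.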
There are also lingering issues with failover, and performance issues with low DNS TTLs. Failing over with DNS is much slower than with, for example, pure BGP, which is designed to fail over quickly.
The main advantage of DNS-based selection is full control over where to place users. It's not an organic placement method, and users are explicitly directed, leaving nothing to chance. This brings many advantages to capacity management as there is direct control over user placement mapping to individual data centres. If a data centre is overloaded you simply don't send users there.
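As a rough sketch of this kind of explicit placement, the DNS layer could skip any data centre above a utilisation threshold. The class, the threshold and the load figures are assumptions made for illustration, not how any particular CDN implements it.

```python
from dataclasses import dataclass


@dataclass
class DataCentre:
    name: str
    ip: str
    capacity: int  # maximum users it can serve (hypothetical unit)
    load: int      # users currently placed there


def pick_data_centre(preferred_order, max_utilisation=0.8):
    """Return the first data centre, in preference order, with spare capacity."""
    for dc in preferred_order:
        if dc.load / dc.capacity < max_utilisation:
            return dc
    # Everything is busy: fall back to the least utilised data centre.
    return min(preferred_order, key=lambda dc: dc.load / dc.capacity)


frankfurt = DataCentre("frankfurt", "203.0.113.20", capacity=1000, load=950)
amsterdam = DataCentre("amsterdam", "203.0.113.30", capacity=1000, load=400)

# Frankfurt is preferred for this user but is overloaded, so DNS answers
# with Amsterdam's address instead.
chosen = pick_data_centre([frankfurt, amsterdam])
print(chosen.name, chosen.ip)
```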
2) Anycast
Instead of using DNS to select the best PoP, Anycast uses the natural flow of the Internet for PoP selection. Traditionally, with the DNS-based approach, each PoP is assigned a different IP address, but with Anycast each PoP is assigned the same IP address. Anycast is nothing new: much of the global DNS infrastructure, including the root servers, is built using it.
The single IP address is advertised from multiple PoP locations, and users follow the natural flow of the Internet for PoP selection, which is based on hop count (in BGP terms, typically the shortest AS path). DNS is still used, but instead of returning many IP addresses, DNS returns only one IP address. The Anycast approach does not use the resolver IP, but rather uses the client IP for Anycast routing. This provides a better picture of where users are located, enabling performance metrics to be run to the user's IP, not the resolver IP.
There is rarely just a single Anycast IP address for an entire domain. This would surely be suboptimal, especially when using the same IP across different continents with flaky peering arrangements. Regional Anycast is often deployed instead, with the PoPs in each region sharing one address. This could lead to four or five Anycast regions, each with its own IP address.
Anycast is a great DDoS mitigation tool. As IoT-fueled botnets launch terabit-scale DDoS attacks, it's becoming increasingly hard to mitigate these attacks with centralised-only designs that don't have sufficient scale-up and scale-out appliance models.
Anycast naturally absorbs DDoS attacks, because the attack traffic is spread across the PoPs, with every PoP taking a little hit. The size of DDoS attacks keeps increasing, so a distributed architecture is the only real way to deal with attacks of this scale.
Centralised designs might work to deflect some DDoS attacks, but as billions of unsecured light bulbs come online, the only way is to distribute the architecture and remove as much state from forwarding as possible.
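A back-of-the-envelope sketch shows why distribution helps. The attack size and the share of bot traffic each PoP attracts are made-up numbers, and the function name is an assumption.

```python
def per_pop_load(attack_gbps, pop_shares):
    """Split an attack across PoPs in proportion to the bot traffic each attracts.

    With Anycast, each bot reaches whichever PoP routing selects for it, so
    the shares reflect where the bots are, not where the defender wants them.
    """
    total = sum(pop_shares.values())
    return {pop: attack_gbps * share / total for pop, share in pop_shares.items()}


# Hypothetical 1 Tbps (1000 Gbps) attack and relative bot distribution.
shares = {"frankfurt": 3, "london": 2, "new-york": 3, "singapore": 2}
load = per_pop_load(1000, shares)

for pop, gbps in load.items():
    print(f"{pop}: {gbps:.0f} Gbps")
```

With these assumed shares, no single PoP sees more than 300 Gbps of the 1000 Gbps attack, whereas a centralised design would have to absorb all of it in one place.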
Anycast does require some stickiness: all packets of a connection must reach the same PoP, so per-packet load balancing will break it. But in all honesty, we are building better networks these days and per-packet load balancing is rarely seen.
Anycast directs users based on fewest hops, but this does not necessarily mean lowest latency.
Latency is a more meaningful performance metric than hop count, because a single hop may have higher latency than, say, a path of ten hops. For example, an intercontinental link could be one hop with very high latency.
However, more often than not, user requests don't need to travel across high-latency intercontinental connections, thanks to strategically placed CDN PoPs.
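The difference between fewest hops and lowest latency can be shown with a toy comparison. The PoP names and per-hop latencies are invented for illustration.

```python
# Hypothetical paths from one client to two PoPs that announce the same
# Anycast address. Each list holds the latency of every hop in milliseconds.
PATHS = {
    "pop-remote": [85.0],  # a single intercontinental hop
    "pop-local": [2.0, 1.5, 3.0, 2.5, 1.0, 2.0, 1.5, 3.0, 2.0, 1.5],  # ten short hops
}

# Selection by hop count, which is how the article describes Anycast routing.
by_hops = min(PATHS, key=lambda pop: len(PATHS[pop]))

# Selection by total latency, which is what the user actually experiences.
by_latency = min(PATHS, key=lambda pop: sum(PATHS[pop]))

print(f"fewest hops:    {by_hops} ({sum(PATHS[by_hops])} ms)")
print(f"lowest latency: {by_latency} ({sum(PATHS[by_latency])} ms)")
```

Here the one-hop path costs 85 ms while the ten-hop path costs 20 ms, so a hop-based choice picks the slower PoP.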
Anycast does not suffer from the mismatch between resolver and client location, and it can use high DNS TTLs. This enables the resolver to cache the response for longer, giving a better overall user experience.
Stay tuned for more! This article is the second in a series about content delivery networks.