Hosting a Website Through Amazon CloudFront
In yesterday's post I mentioned that we are now hosting our corporate website through the Amazon CloudFront content delivery network (CDN). Today I would like to share some observations we made while creating our new setup.
Speed and Performance
The basic idea of a CDN is to deliver content to web surfers faster by hosting the files at various locations around the globe so latencies caused by large geographical distances are avoided.
When visitors enter our URL, www.paessler.com, into their browser, their computer connects to the geographically nearest CloudFront server, the so-called "edge server." Only if this edge server does not already have a recent copy of the webpage (or images, CSS files, etc.) will it connect to our web servers to get the latest content before sending it back to the browser.
This means that our web servers no longer have any direct connection with the visitor's browser when the website is delivered.
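The edge-server behavior described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how CloudFront is implemented: the `EdgeCache` class, the `fetch_from_origin` callback, and the per-object TTL it returns are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of one edge server: serve fresh copies, otherwise ask the origin."""

    def __init__(self,
                 fetch_from_origin: Callable[[str], Tuple[bytes, int]],
                 clock: Callable[[], float] = time.time) -> None:
        # fetch_from_origin(path) returns (body, ttl_seconds); hypothetical callback
        # standing in for a request to the origin web servers.
        self._fetch = fetch_from_origin
        self._clock = clock
        self._store: Dict[str, Tuple[bytes, float]] = {}  # path -> (body, expires_at)

    def get(self, path: str) -> bytes:
        now = self._clock()
        cached = self._store.get(path)
        if cached is not None and cached[1] > now:
            # Fresh copy: the origin web servers are never contacted.
            return cached[0]
        # Missing or expired: fetch the latest content from the origin.
        body, ttl = self._fetch(path)
        self._store[path] = (body, now + ttl)
        return body
```

In this model the origin only sees a request when an object is missing or has expired, which is why the browser and the web servers never talk to each other directly.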
This setup has one clear advantage and several drawbacks:
- The website is fast: Most users don't have to wait for content to be delivered from Dallas, TX, anymore.
- We never know when objects in the CloudFront cache will expire, and Amazon does not offer an API to clear the whole cache for one domain (only for single URLs). So handling object expiry times in the HTTP headers is crucial.
- Dynamic pages (e.g. our knowledge base) cannot be hosted through a CDN.
- We no longer have reliable log files, because requests answered from the CloudFront cache never reach our web servers.
So we had to apply some special configurations:
- Our webpages expire after one hour; all page assets expire after one week. This means that it can take up to one hour until changes to the homepage are visible around the globe.
- We are hosting the interactive part of the knowledge base under a different domain name that is not routed through CloudFront.
- For website analytics we use Google Analytics and Urchin, with a dedicated domain, utm.paessler.com, for the tracking images.
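The expiry rules above translate into HTTP `Cache-Control` headers sent by the origin. The following is a minimal sketch, assuming the CDN honors the origin's `max-age` value. The function name `cache_control_for` and the list of asset extensions are assumptions made for illustration; the text only names images and CSS files.

```python
import posixpath

PAGE_MAX_AGE = 60 * 60            # one hour for webpages
ASSET_MAX_AGE = 7 * 24 * 60 * 60  # one week for page assets

# Assumed asset extensions; only images and CSS files are named in the text.
ASSET_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".gif", ".ico"}


def cache_control_for(path: str) -> str:
    """Return a Cache-Control header value for the given URL path."""
    extension = posixpath.splitext(path)[1].lower()
    # Anything that is not a known asset type is treated as a webpage.
    max_age = ASSET_MAX_AGE if extension in ASSET_EXTENSIONS else PAGE_MAX_AGE
    return f"public, max-age={max_age}"
```

For example, `cache_control_for("/index.html")` yields `public, max-age=3600`, while `cache_control_for("/css/site.css")` yields `public, max-age=604800`.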
Using a CDN is not only faster, it is also more reliable. Because we now effectively host our website on CloudFront, so much caching is involved that a short downtime of our web servers may not be noticed by the outside world at all.
We have also set up additional redundancy:
- Fail-over load balancing of the web servers: Look at the network diagram above and you will notice that we are using two web servers behind a load balancer. Under normal load the load balancer simply distributes requests from CloudFront between both servers, since both have the same configuration. In case one server fails (or is undergoing maintenance), the load balancer automatically switches all traffic over to the remaining server.
- Setup of a fall-back CDN: In case the CloudFront CDN has technical problems, we have also completed the full setup process with another CDN provider, NetDNA. This enables us to switch over to the fail-over CDN in a matter of minutes by changing the DNS records for paessler.com.
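The fail-over behavior of the load balancer can be illustrated with a small sketch. It is not the configuration of our actual load balancer: the `FailoverBalancer` class, the `is_healthy` callback, and the server names in the usage note are hypothetical, and round-robin is just one assumed way to distribute requests.

```python
from itertools import count
from typing import Callable, List


class FailoverBalancer:
    """Round-robin over healthy servers; unhealthy ones are skipped."""

    def __init__(self, servers: List[str], is_healthy: Callable[[str], bool]) -> None:
        self._servers = list(servers)
        self._is_healthy = is_healthy  # hypothetical health-check callback
        self._counter = count()

    def pick(self) -> str:
        """Return the server that should handle the next request."""
        healthy = [s for s in self._servers if self._is_healthy(s)]
        if not healthy:
            raise RuntimeError("no healthy web server available")
        # With two healthy servers requests alternate; with one, it gets everything.
        return healthy[next(self._counter) % len(healthy)]
```

With two servers named `web1` and `web2`, calls to `pick()` alternate between them while both pass the health check; as soon as one fails, every request goes to the remaining server.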
Our website is now effectively hosted on a globally distributed system consisting of many servers, ours and Amazon's. You cannot monitor such a system from just one location anymore, because a single location shows you only one of many perspectives.
For our own monitoring we are now using a cluster installation of our software, PRTG Network Monitor, with cores in California, Texas, Ireland, Singapore, and Sydney. With five locations around the globe we have a good perspective on the availability of our website.
Additionally, we use our demo installation at http://prtg.paessler.com, which uses 16 probes around the globe to track performance.
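To see why several vantage points matter, consider a small sketch that combines per-location check results into one verdict. It does not reflect how PRTG evaluates sensors; the function `summarize_checks` and its three result categories are assumptions made for illustration.

```python
from typing import Dict


def summarize_checks(results: Dict[str, bool]) -> str:
    """Classify availability from per-location check results (True = reachable)."""
    if not results:
        raise ValueError("at least one location is required")
    failed = sorted(loc for loc, ok in results.items() if not ok)
    if not failed:
        return "up everywhere"
    if len(failed) == len(results):
        return "down everywhere"
    # Only some locations fail: a single monitoring location could miss this.
    return "partial outage: " + ", ".join(failed)
```

A monitor located only in Texas would report the site as healthy in a case where, say, the checks from Singapore and Sydney fail, whereas the combined view reports a partial outage for those two locations.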
This is a screenshot of a map that we use in our NOC to keep an eye on our website and all its parts.