It's not the size of your pipe, it's the way you use it

How to manhandle CDN bandwidth like a pro


WAR on the cloud 6 In part 5, I tuned my site's home/entry page to load faster than Google's, in part because page-load time and general responsiveness are important to retaining users and those "sticky eyeballs".

Now I want to make better use of my already-paid-for resources to handle assets that I didn't feel comfortable putting on a third-party network due to lack of control (potentially big bills from misuse by third parties for example). More stick for free...

In the end I punted only relatively less valuable media off to Rackspace's CDN (content delivery network) because of the lack of barriers to even casual abuse such as hotlinking – ie, embedding my images in someone else's page so that I end up paying for bandwidth every time their page is viewed.

Rackspace has now shown some interest in providing controls (such as a regex check on the Referer header), which is good, but it's still not enough for me to put my more valuable and larger assets (primarily images) on their CDN.

For interactivity it is usually most important to minimise latency (of which network round-trip-time is a big component), and thus serve the user's requests from a mirror as geographically close to the user as possible.

For larger objects (ie, more bytes, such as big images or videos) high bandwidth is more important to minimise the time to complete a request. (There are other issues such as packet loss and network congestion on long routes to be considered too, though.)

DIY CDN

I already bounce users to mirrors closer to them to minimise latency, so now I want to selectively serve (large) content from remote mirrors to maximise bandwidth.

Other than some new and exciting bugs I wrote into the code (and still haven't fully squashed), the split between the two modes was relatively simple. The results started to show fairly quickly after I rolled the code out on Monday to several of the mirrors. The bandwidth curve for the main UK mirror is starting to flatten to follow local time-of-day use less, in this case the UK mirror taking some load from the US mirror in the main I suspect.

Looking at the underlying numbers I can also see the two largest/fastest (UK and US) mirrors closer in their spare capacity more of the time than before.

_bw

See the bandwidth profile change a little at the start of the week when the new mechanism went (partially) live: it's flatter as more non-local traffic is carried.

The major beneficiaries are those viewers bounced to a mirror close to them that is relatively small (with low-ish bandwidth): since a lot of the material that they see on a page is now served from the Rackspace CDN or my internal 'high-bandwidth' CDN, those mirrors feel a whole lot faster. This makes them seem as snappy as I had intended all along. And it costs nothing extra beyond a little coding time.

To avoid confusing search-engine spiders (and end users), and to maximise robustness on a human's first page hit, I turn off most of the high-bandwidth re-routing magic in these two cases.

Tinker, Tailor

There is further optimisation (aka tinkering) possible of course: for example, for a mirror to continue to route high-bandwidth requests to itself if near the top of the spare-bandwidth league (eg, within say 25 per cent), not only if at the very top.

Another tweak to marginally improve responsiveness and avoid hiccups in underlying TCP connections when a very long or congested geographical route would otherwise be taken, is to instead pick a physically close mirror near to top of the bandwidth rankings, rather than just the top. Improved proximity may somewhat compensate for the nominally lower bandwidth and help to complete a user's page display quickly.

Another tweak would be to spread traffic between all mirrors near the top to induce less sloshing back and forth between candidates.

These changed should result in more robust and often smaller pages and more intuitive dependencies.

Lower carbon packet-print?

Another wrinkle – which is more a problem for the likes of Google and Facebook which have data centres for which power rates are time-dependent - is to move processing where electricity is cheaper and/or has lower carbon intensity (less CO2 emitted per kWh consumed and thus user action performed). I already have my servers report a slightly lower available bandwidth (to draw in slightly less traffic) when running in energy-conserving mode for reasons like those, so while it doesn't save me any pennies it may well be trimming my footprint a little, assuming that the extra distance that packets may travel doesn't outweigh that.

Audience participation

Thanks for interesting comments on previous parts to this series.

One point that I didn't make clear is that security is not an enormous issue for the site described in this series beyond avoiding vandalism, XSS attacks, and generally being used as a weapon to attack my users and other sites with. The site doesn't use SSL and doesn't hold any personal details, and the only cookies used are benign session cookies, mainly noting a user's locale override, if anything.

I have put together sites, such as for retail finance, which would require serious effort on security issues, including the possibilities of an attacker using IPv6 and IPv4 together in unusual ways to find chinks in the armour.

Tagging for cache

In the comments to part 5, "theodore" asked what I meant by "redoing all my Last-Modified and ETag headers and my response to a browser's If -Modified-Since and If-None-Match" and if I meant making sure that all of my mirrored copies of files are identical both in content and timestamp?

I do indeed make sure that timestamps and content are identical between sites (except for brief transients when updates are being propagated to them all), though identical is a bit subtle here for pages with ads in them for example. Images will in the main be bit-for-bit identical (strongly identical in ETag terms) whichever mirror you visit, but the HTML pages will only be weakly identical because although the main information content may be the same, ads and links to other internal pages may be somewhat different for a number of reasons.

But the real issues that caught me out this time around were:

  • Being sure not to use the Last-Modified-Since header at all once any ETag matching had been done.
  • Actually making sure that all significantly-trafficked pages had a sensible ETag generated for them, which is harder than it appears at first blush with a heavily templated site and because of weakness issues mentioned above.
  • Finding that some objects could not realistically use ETag/Last-Modified at all if I was to avoid issues with browsers requesting duplicate copies when hopping between mirrors behind one 'main' DNA alias.
  • Lastly, finding that I have to repeat these headers and Cache-Control even on a 304 Not Modified response (as an upcoming IETF draught suggests should become mandatory) to avoid buggy browsers not updating their internal notion of when to re-request (ie, they should add on the original max-age but forget and so wrongly re-request every time after the initial expiry).

After having done that, I think I have gone a little bit too far, and have to row back on cache lifetime a touch, but generally I feel more in control of the process and have less random logic scattered around. ®

Broader topics


Other stories you might like

  • Tesla driver charged with vehicular manslaughter after deadly Autopilot crash

    Prosecution seems to be first of its kind in America

    A Tesla driver has seemingly become the first person in the US to be charged with vehicular manslaughter for a deadly crash in which the vehicle's Autopilot mode was engaged.

    According to the cops, the driver exited a highway in his Tesla Model S, ran a red light, and smashed into a Honda Civic at an intersection in Gardena, Los Angeles County, in late 2019. A man and woman in the second car were killed. The Tesla driver and a passenger survived and were taken to hospital.

    Prosecutors in California charged Kevin George Aziz Riad, 27, in October last year though details of the case are only just emerging, according to AP on Tuesday. Riad, a limousine service driver, is facing two counts of vehicular manslaughter, and is free on bail after pleading not guilty.

    Continue reading
  • AMD returns to smartphone graphics with new Samsung chip for your pocket computer

    We're back in black

    AMD's GPU technology is returning to mobile handsets with Samsung's Exynos 2200 system-on-chip, which was announced on Tuesday.

    The Exynos 2200 processor, fabricated using a 4nm process, has Armv9 CPU cores and the oddly named Xclipse GPU, which is an adaptation of AMD's RDNA 2 mainstream GPU architecture.

    AMD was in the handheld GPU market until 2009, when it sold the Imageon GPU and handheld business for $65m to Qualcomm, which turned the tech into the Adreno GPU for its Snapdragon family. AMD's Imageon processors were used in devices from Motorola, Panasonic, Palm and others making Windows Mobile handsets.

    Continue reading
  • Big shock: Guy who fled political violence and became rich in tech now struggles to care about political violence

    'I recognize that I come across as lacking empathy,' billionaire VC admits

    Billionaire tech investor and ex-Facebook senior executive Chamath Palihapitiya was publicly blasted after he said nobody really cares about the reported human rights abuse of Uyghur Muslims in China.

    The blunt comments were made during the latest episode of All-In, a podcast in which Palihapitiya chats to investors and entrepreneurs Jason Calacanis, David Sacks, and David Friedberg about technology.

    The group were debating the Biden administration’s response to what's said to be China's crackdown of Uyghur Muslims when Palihapitiya interrupted and said: “Nobody cares about what’s happening to the Uyghurs, okay? ... I’m telling you a very hard ugly truth, okay? Of all the things that I care about … yes, it is below my line.”

    Continue reading

Biting the hand that feeds IT © 1998–2022