BBC makes switch to AWS, serverless for new website architecture, observers grumble about the HTML

News aggregator says it's 'way more complicated and much harder to parse'


Updated The BBC website, the sixth most popular in the UK, has mostly migrated from the broadcaster's bit barns to Amazon Web Services (AWS), with around half the site now rendered using AWS Lambda, Amazon's serverless compute platform.

"Until recently much of the BBC website was written in PHP and hosted on two data centres near London," Matthew Clark, head of architecture, said lately. "Almost every part has been rebuilt on the cloud."

PHP runs fine in the cloud, but this is not a matter of lift and shift. Instead, the BBC team devised a new architecture based on serverless computing. It also endeavoured to combine what used to be several sites – such as News, Sport, and so on – into one, though Clark said the World Service, iPlayer video, and the radio site BBC Sounds remain separate.

The rest have been combined into a new thing called WebCore. "By focussing on creating one site, rather than several, we're seeing significant improvements in performance, reliability, and SEO," said Clark.

Web traffic initially hits a Global Traffic Manager (GTM), an in-house solution based on the Nginx web server and running partly on-premises (showing that the BBC has not entirely ditched its data centres) and partly on AWS. GTM handles "tens of thousands of requests a second," said Clark. A second layer on AWS handles caching and routing, before hitting functions running on AWS Lambda, which perform server-side rendering (SSR) of dynamic content using React, a JavaScript framework.
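Clark's post stops short of implementation detail, but the shape of that caching-and-routing step can be sketched. The TypeScript below is purely illustrative, not BBC code: the function name and the in-memory cache are assumptions standing in for whatever the real routing layer uses. It serves a cached page when it can and invokes a rendering Lambda, via the AWS SDK for JavaScript v3, only on a miss.

```typescript
// Illustrative sketch only: the BBC has not published its routing code.
// Shows a routing layer that checks a cache and, on a miss, invokes a
// rendering Lambda. "webcore-page-renderer" is a hypothetical name.
import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({ region: "eu-west-1" });
const cache = new Map<string, string>(); // stand-in for a real shared cache

export async function renderPage(path: string): Promise<string> {
  const cached = cache.get(path);
  if (cached) return cached; // cache hit: no Lambda invocation needed

  // Cache miss: ask a rendering Lambda to produce the page HTML
  const response = await lambda.send(
    new InvokeCommand({
      FunctionName: "webcore-page-renderer", // hypothetical function name
      Payload: Buffer.from(JSON.stringify({ path })),
    })
  );

  const html = Buffer.from(response.Payload ?? new Uint8Array()).toString("utf-8");
  cache.set(path, html);
  return html;
}
```

In a real deployment the cache would sit in a CDN or a shared store rather than in process memory, which is part of why the number of Lambda invocations per second becomes an interesting question later on.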

Server-side rendering means the browser receives a page that is ready to view without having to do much work itself, so it should appear almost instantly, though it increases the burden on the server – caching mitigates this, we note. Walmart engineer Alex Grigoryan, who also oversaw a migration to SSR, tested SSR against client-side rendering (CSR) and said: "When we did A/B tests on SSR vs CSR... our numbers showed better engagement from the customer with rendering early," though he noted increased server load as a major disadvantage.
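To make the SSR-on-Lambda idea concrete, here is a minimal sketch of a Lambda handler that renders a React component to HTML on the server. It is an illustration built on assumptions rather than the BBC's actual code: the ArticlePage component and the API Gateway event shape are invented for the example.

```typescript
// Minimal sketch of server-side rendering React inside a Lambda handler.
// The component and event shape are assumptions, not the BBC's code.
import { createElement } from "react";
import { renderToString } from "react-dom/server";
import type { APIGatewayProxyHandler } from "aws-lambda";

// Hypothetical page component standing in for a real article page
function ArticlePage(props: { headline: string }) {
  return createElement("main", null, createElement("h1", null, props.headline));
}

export const handler: APIGatewayProxyHandler = async (event) => {
  const headline = event.queryStringParameters?.headline ?? "Hello from Lambda";

  // renderToString produces finished markup on the server, so the browser
  // can display the page without running the React app first
  const html = renderToString(createElement(ArticlePage, { headline }));

  return {
    statusCode: 200,
    headers: { "content-type": "text/html; charset=utf-8" },
    body: `<!doctype html><html><body>${html}</body></html>`,
  };
};
```

In the usual pattern, React then "hydrates" the same markup in the browser to make it interactive, so the server-rendered HTML is only the first step.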

In the BBC's case, Lambda is used, which is able to auto-scale on demand. "About 2,000 lambdas run every second to create the BBC website; a number that we expect to grow," said Clark. He added that Lambda scales better than VMs on the AWS Elastic Compute Cloud (EC2), saying that "our traffic levels can rocket in an instant; Lambda can handle this in a way that EC2 auto-scaling cannot."

Another aspect of the BBC site is the logic that goes into requesting content, which Clark calls the "business layer". Content is provided to the web rendering layer via a REST API, and a solution called Fast Agnostic Business Layer "allows different teams to create their own business logic," he said, so that different requirements are met while still sharing the same system for things like access control and caching. Clark didn't say much about how the content itself is stored, though he promised to return to this topic in future posts.
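As a rough illustration of what a call from the rendering layer into such a REST business layer might look like, here is a hedged TypeScript sketch; the endpoint URL and response shape are hypothetical, since the actual API contract has not been published.

```typescript
// Hypothetical sketch of the rendering layer fetching content from a REST
// "business layer" endpoint. URL and response shape are assumptions.
interface ArticleContent {
  headline: string;
  body: string;
}

export async function fetchArticle(id: string): Promise<ArticleContent> {
  // Per the BBC's description, the business layer sits behind a REST API
  // and shares concerns like caching and access control across teams.
  const response = await fetch(`https://business-layer.example/articles/${id}`);
  if (!response.ok) {
    throw new Error(`Business layer returned ${response.status}`);
  }
  return (await response.json()) as ArticleContent;
}
```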

The WebCore platform uses CI/CD to enable rapid iteration, and Clark showed an example monthly report listing 110 releases, or around three per day. Builds take around 3.5 minutes, and in that particular month the average time from a pull request (a request to merge new code) to running it in production was one day and 23 minutes. On average, 67 per cent of pull requests were actually merged into the codebase.

A small section of the HTML delivered for a news article on the BBC site today; a news aggregator says it is much harder to parse than before

Great work? Comments on Hacker News show that opinions vary. "Running a site the size of the BBC on Lambda is nothing short of an exuberant waste of a government-subsidized budget, it's absolutely crazy. Lambda VM time has a massive markup compared to regular compute... IMHO this is the epitome of serverless gone wrong," said one.

Another comment from John Leach, who runs a headline aggregation site called News Sniffer, said that the generated HTML is not easy to analyze. "I run the News Sniffer project which has to parse BBC News pages and I knew about this rollout a few weeks ago when the HTML all changed format completely and my parsers broke. As a side note, the new HTML is way more complicated and much harder to parse than before – I know the aim isn't to help parsing for content, but I was still saddened to see how it's ended up."
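For context, scrapers like News Sniffer's typically pull content out of pages with CSS selectors run against the served HTML, along the lines of the entirely hypothetical TypeScript sketch below using the cheerio library. When generated markup changes shape, selectors written against the old structure simply stop matching.

```typescript
// Hypothetical sketch of the kind of scraping News Sniffer-style tools do:
// load article HTML and extract the headline and paragraphs with CSS
// selectors. The selectors are made up for illustration.
import * as cheerio from "cheerio";

export function extractArticle(html: string) {
  const $ = cheerio.load(html);
  return {
    headline: $("h1").first().text().trim(),
    paragraphs: $("article p")
      .map((_, el) => $(el).text().trim())
      .get(),
  };
}
```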

There is also curiosity about unanswered questions. What is the cost impact of moving from on-premises to AWS? What is the cost impact of Lambda versus EC2? And why, if the caching and content delivery network is working as expected, are 2,000 Lambdas a second required?

We have asked the BBC for more details. ®

Updated at 16:02 UTC on 5 November 2020 to add

The BBC's Matthew Clark got in touch to say: "Although Lambda compute cost is higher, the amount you need is less, offsetting this." He added, somewhat mysteriously since EC2 can autoscale: "With EC2, we provision web servers with plenty of capacity to handle sudden traffic changes (e.g. due to breaking news). Whereas with Lambda, we only pay for what we actually use."

To the question of why the org didn't use the opportunity of server-side rendering to deliver more human-readable HTML that would be better for parsing and accessibility tools, he responded: "The web page HTML looks different as it's largely generated by the framework used (React). The BBC has a very high bar for accessibility and performance, and we continue to test the site to ensure that it works well across browsers and screen readers." Lastly, we asked why, if the caching and content delivery network was working as expected, 2,000 Lambdas a second were required.

Clark claimed: "The Lambdas are essential at handling updates so that the site remains up to date. Each BBC page typically involves multiple simple Lambda executions - the majority of which complete in under 150ms."

