
Google cloud: rubbish at updates, world-class at rapid rollbacks

Another borked software upgrade gives Google's cloud hiccups

Google's revealed that it has once again borked its own cloud with an update.

The latest incident hit last Thursday, when the company made what it's calling “A routine software upgrade to the authorization process in BigQuery”.

That update “had a side effect of reducing the cache hit rate of dataset permission validation … triggered a cascade of live authorization checks that fanned out and amplified throughout the BigQuery service, eventually causing user visible errors as the authorization backends became overwhelmed.”

“As a byproduct, error rates for the service increased as individual requests failed to authorize.”
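To see why a cache change can snowball like that, consider a rough back-of-the-envelope sketch. This is illustrative arithmetic only, with hypothetical numbers, not Google's actual code or traffic figures: every permission check that misses the cache becomes a live query against the authorization backend, so even a modest drop in hit rate multiplies backend load.

```python
# Illustrative only: how a small drop in cache hit rate multiplies
# backend load. Function name and numbers are hypothetical.

def backend_queries(requests: int, hit_rate: float) -> int:
    """Authorization checks that miss the cache hit the backend live."""
    return round(requests * (1 - hit_rate))

# At a 99% hit rate, a million permission checks mean 10,000 live lookups.
normal = backend_queries(1_000_000, 0.99)

# Drop the hit rate to 90% and backend traffic grows tenfold,
# before counting the retries that failed checks then trigger.
degraded = backend_queries(1_000_000, 0.90)

print(normal, degraded)  # 10000 100000
```

Add in clients retrying their failed checks and the fan-out compounds, which is the "cascade" Google describes.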

Users experienced problems for six hours.

Google's promising it will “change the structure of permissions validation so that continual retries will not destabilize the entire service”.
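Google hasn't published the details of that fix, but the textbook defence against retry storms of this kind is exponential backoff with jitter, so that clients don't hammer a struggling backend in lockstep. A minimal sketch, with all names and parameters our own rather than anything from BigQuery:

```python
import random
import time

def call_with_backoff(check, max_attempts=5, base=0.1, cap=5.0):
    """Retry a flaky call with exponential backoff plus full jitter,
    so synchronized client retries don't pile onto a sick backend."""
    for attempt in range(max_attempts):
        try:
            return check()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Sleep a random slice of an exponentially growing window.
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

The jitter matters as much as the backoff: without it, every client that failed at the same instant retries at the same instant, recreating the spike the backoff was meant to smooth out.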

The Reg's cloud desk hopes it is also changing the way it tests patches, because it is racking up quite a list of self-inflicted outages. In March, for example, one flawed fix caused a lengthy VM brownout and another took down App Engine for a time.

Come April, the company said routine maintenance “resulted in traffic being sent to a datacenter router that was running a test configuration” and caused packet loss.

The good news is that Google fixes this stuff quickly: six-hour wobbles like the BigQuery incident are rather longer than the company's usual time to restore a distressed service, and it has avoided outright outages. On the downside, the company keeps experiencing wobbles of its own making; it's just fortunate that a world-class rapid response and rollback capability keeps things moving along. ®
