AWS catches up to Azure and GCP with CloudShell, adds deliberate injection of chaos

Plus: Managed Grafana service for observability

re:Invent Amazon Web Services CTO Dr Werner Vogels has opened up on CloudShell, a Linux environment accessed through the browser which gives users a command-line and scripting environment for all AWS services.

At a re:Invent keynote yesterday, the exec also described Fault Injection Simulator - chaos engineering as a service, intended to help customers build resilient applications.

Dr Werner Vogels expounding the benefits of observability at an ancient food processing factory near his home town of Amsterdam, NL

Vogels’ keynote was a welcome relief from the relentless marketing that has characterised many other re:Invent keynotes, focusing mainly on technology and software engineering. Speaking from a 19th-century sugar beet processing plant in Haarlem in the Netherlands, he used the industrial background to discuss matters such as operations, observability and reliability in today’s context of cloud services.

CloudShell is a Linux instance for managing or developing on AWS services, accessed through the browser

First up though was AWS CloudShell, a Linux terminal which is available from the AWS console (a search for CloudShell brings it up, provided you are in a supported region, such as eu-west-1).

The OS is Amazon Linux 2, and the AWS Command Line Interface (CLI) is pre-installed. There are options for Bash, zsh and PowerShell, and yum is available for installing applications. If you want to use the nano editor, for example, sudo yum install nano will do it.

Changes to the environment do not persist, but there is 1GB of persistent storage (deleted after 120 days of no access) so that scripts and files can be saved. There is no charge other than for data transfer or other AWS resources used. One CloudShell VM is available per user, with up to 10 concurrent sessions. The VM has 1 vCPU (virtual CPU) and 2GB RAM. An Actions menu lets users upload or download files.

A limitation in the current release is that there is no access to resources in a VPC (Virtual Private Cloud). That said, the internet is accessible, so users can grab files with wget, for example. Git is pre-installed.

AWS seems to have developers in mind as much as, if not more than, administrators. Node and npm (the Node package manager) are pre-installed, as are Python (2 and 3) and pip, the Python package installer.

AWS is not the first cloud pusher to offer a Linux shell and in some ways this is a catch-up. Azure Cloud Shell was first previewed in 2017, and runs an Ubuntu instance, while Google Cloud Platform (GCP) also has a Cloud Shell with kubectl pre-installed for Kubernetes management. The Azure and GCP creations have 5GB of persistent storage. Both are very useful in their respective environments and no doubt AWS CloudShell will be the same.

Agents of chaos: This is not a simulation

The Fault Injection Simulator is not really a simulator: it injects actual faults into running services, such as throttling API calls

Vogels also enthused about Fault Injection Simulator (FIS), coming in “early 2021” he said. “Out of all things we are announcing this year, this is the one I am most excited about,” we were assured.

Chaos Engineering is about deliberately introducing failures into a system in order to test its resilience, as described to The Register by Gremlin’s SRE principal Tammy Bryant (Butow) last year. As she explained, chaos engineering was pioneered by Netflix, where Adrian Cockcroft, now an AWS VP, used to be cloud architect. Cockcroft is a chaos engineering advocate, and likely had some influence on the new service.

According to Vogels, “there is no better way to test your systems than chaos engineering,” and he added that it is good for teams as well as for the software itself. “Most underrated is the experience teams get responding to infrequent critical issues,” he said.

Product manager Laura Thomson said at re:Invent that “what we’re trying to do with FIS is to offer managed fault injection actions and composable experiment templates that make it easier to get started with best practice chaos testing.”

As Thomson noted, “the faults are really happening”; they are not simulated, which is why caution is required when performing chaos tests.
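To make the idea concrete, here is a minimal sketch of fault injection in plain Python. It is not the FIS API, which had not shipped at the time of writing: the `inject_throttling` wrapper, the `ThrottledError` exception and the `get_order_status` service call are all hypothetical names invented for illustration. The sketch wraps a healthy function so that a chosen fraction of calls fail as if throttled, then measures how often a client with retries still succeeds.

```python
import random


class ThrottledError(Exception):
    """Hypothetical error standing in for a throttled API call."""


def inject_throttling(func, failure_rate, rng):
    """Wrap func so that roughly failure_rate of calls raise ThrottledError."""
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise ThrottledError("injected throttling fault")
        return func(*args, **kwargs)
    return wrapper


def call_with_retries(func, max_attempts):
    """Call func, retrying on ThrottledError up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ThrottledError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the fault to the caller


def get_order_status():
    """Hypothetical service call that always works when no fault is injected."""
    return "shipped"


def run_experiment(failure_rate, max_attempts, calls, seed=0):
    """Return the fraction of calls that succeed under injected throttling."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    flaky = inject_throttling(get_order_status, failure_rate, rng)
    succeeded = 0
    for _ in range(calls):
        try:
            call_with_retries(flaky, max_attempts)
            succeeded += 1
        except ThrottledError:
            pass  # the client gave up: count it as a failed request
    return succeeded / calls


if __name__ == "__main__":
    for attempts in (1, 3, 5):
        rate = run_experiment(failure_rate=0.5, max_attempts=attempts, calls=1000)
        print(f"max_attempts={attempts}: success rate {rate:.3f}")
```

Running the script compares success rates for one, three and five attempts with half of all calls throttled. A real experiment would add back-off between retries and, as Thomson warns, a way to stop the test if things go badly wrong.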

Is this bad news for Gremlin, which now has to compete with AWS? The company declared itself “excited to see this announced today,” and will no doubt be happy to see more prominence given to chaos engineering; what tends to happen is that loss of business is mitigated by increased demand for chaos engineering skills and services, so it may not be all bad news.

The third piece of significant news was the preview of a managed service for Grafana, in partnership with Grafana Labs. Grafana is an open source application for making better sense of metrics, logs and traces via data visualization, analysis and alarms. The service is in preview until February 15 2021, during which time it is free for up to 100 users per workspace and two workspaces per account.

After that, which we presume is the projected date for general availability, there will be licences for editors and viewers, at $9.00 and $5.00 per month respectively. Prices jump if users upgrade to Grafana Enterprise: $3,500 per month plus $36.00 and $10.00 for the editor and viewer licences. The Enterprise version is supported by Grafana Labs and includes plugins and on-demand training. These plugins include integration with third-party data sources such as Splunk, DataDog and AppDynamics.
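For a rough idea of what those figures mean for a team, the arithmetic can be sketched in a few lines of Python. This assumes the licence prices are charged per user per month, and that the Enterprise per-user prices replace the standard ones rather than being added to them; neither point is spelled out in the announcement, and `monthly_cost` is our own hypothetical helper, not anything AWS provides.

```python
# Prices in whole US dollars per month, taken from the figures quoted above.
STANDARD_EDITOR = 9
STANDARD_VIEWER = 5
ENTERPRISE_BASE = 3500
ENTERPRISE_EDITOR = 36
ENTERPRISE_VIEWER = 10


def monthly_cost(editors, viewers, enterprise=False):
    """Estimate the monthly bill in dollars.

    Assumption: licences are per user per month, and Enterprise per-user
    prices replace the standard ones.
    """
    if editors < 0 or viewers < 0:
        raise ValueError("user counts cannot be negative")
    if enterprise:
        return (ENTERPRISE_BASE
                + ENTERPRISE_EDITOR * editors
                + ENTERPRISE_VIEWER * viewers)
    return STANDARD_EDITOR * editors + STANDARD_VIEWER * viewers


if __name__ == "__main__":
    print(monthly_cost(5, 20))                   # 9*5 + 5*20 = 145
    print(monthly_cost(5, 20, enterprise=True))  # 3500 + 36*5 + 10*20 = 3880
```

On those assumptions, a team of five editors and 20 viewers would pay $145 a month on the standard tier and $3,880 on Enterprise.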

A related new service, also in preview, is the Managed Service for Prometheus, the open source monitoring solution. The service is built on Cortex, a Cloud Native Computing Foundation (CNCF) project for scalable storage and query of Prometheus data. The managed Grafana service can monitor operations on other clouds such as Azure and GCP as well as AWS, another nod towards multi-cloud.

AWS open source executive Matt Asay promised that “AWS, working with Grafana Labs, will be contributing licensing revenue and code to help make Grafana even better, not just for the AWS service, but also for open source users.”

In this instance, it seems that AWS has found a way to work with rather than against open source companies (unlike the situation with Elastic, whose search engine AWS has forked). ®

Biting the hand that feeds IT © 1998–2021