All your database are belong to us: AWS wants every data silo on its platform
Also: Custom SQL Server service, free SageMaker ML, and experts on hand to label data
RE:INVENT AWS has introduced a flurry of new database and ML services at its Re:invent conference, including a migration service targeting every database in an organization.
Swami Sivasubramanian, VP of ML (machine learning), gave the data keynote today. He claimed that Aurora, a service that is compatible with either MySQL or PostgreSQL, has “5 x the performance of MySQL and 3 x the performance of PostgreSQL,” and “is still the fastest-growing service in AWS history.”
Nevertheless he was concerned that “there are certain customers who are held back from migrating to databases in the cloud.” He introduced two new services aimed at getting an even greater proportion of an organization’s databases onto the AWS cloud.
The first is RDS (Relational Database Service) Custom for Microsoft SQL Server. This joins the existing RDS Custom for Oracle, and the two services are for customers with “critical applications such as Oracle E-Business suite, or Microsoft Dynamics, or SharePoint, that were designed to run on commercial databases with very specific configurations.”
RDS Custom is a hybrid between RDS, where the customer does not have direct access to the operating system, and running SQL Server or Oracle directly on EC2, where every detail is the customer’s responsibility. RDS Custom allows full privileged access to the operating system, enabling custom applications and agents to be installed, but still provides some of the benefits of RDS such as automated point-in-time recovery and health monitoring. Ideal, said Sivasubramanian, for applications like SharePoint, Dynamics, PowerBI and Polybase.
Next, Sivasubramanian said that “we have heard from customers that building a migration plan for their entire fleet of databases is challenging,” the assumption being that organizations want all their databases on AWS if they can only figure out how. “Today we are launching in preview AWS DMS Fleet Advisor,” he said, where DMS is the Database Migration Service.
Fleet Advisor is “a new capability of AWS DMS that automates migration planning for an entire fleet of databases. DMS Fleet Advisor automatically builds an inventory of your on-prem database and analytics servers by streaming data from on-prem to Amazon S3. We analyze them to match with the appropriate AWS data store and customize migration plans. All of this now just takes hours,” he promised. “You don’t have to rely on a third-party tool or an expensive migration consultant.”
Trust the agent
Database administrators may be sceptical, but the idea is that users install an agent on their networks, in one or more locations, which collects data and builds an inventory, uploading its findings to S3.
From the resulting inventory, “you can convert schemas for migration by using the AWS Schema Conversion Tool,” say the docs. This has numerous options, including for example conversion from SQL Server to Aurora MySQL; or databases could simply be migrated to the same database manager running on AWS.
Typically such tools turn out to have limitations, but there is no doubting the company’s determination to provide for every variety of database on its ever-expanding platform. “Over 500,000 databases have been migrated to AWS with AWS Database Migration Service,” said Sivasubramanian.
Another statistic mentioned is that there are more than 200,000 data lakes on AWS. Sivasubramanian made a point of the number of different types of database AWS offers, including DocumentDB for JSON documents, Neptune for graph data, Timestream for time-series data, and ElastiCache for in-memory caching.
There are a ridiculous number of new features announced here at Re:invent, and the data and ML category is no exception. DevOps Guru for RDS is a new service for detecting and analysing issues with Aurora performance or operation. DynamoDB gets a new infrequent access table class which, it is claimed, could reduce costs by “up to 60 percent for tables that store infrequently accessed data.”
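For readers who want to try the new table class, here is a minimal sketch, assuming the boto3 SDK for Python. The table name and key name are hypothetical; the `TableClass` value is the one the DynamoDB API uses for the infrequent access class. The parameter builder runs on its own, and the call that actually touches AWS is kept in a separate function that needs boto3 and credentials.

```python
# Sketch only: "audit-log-archive" and "record_id" are hypothetical names.
def infrequent_access_table_params(table_name: str, key_name: str) -> dict:
    """Build keyword arguments for boto3's DynamoDB create_table call."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": key_name, "AttributeType": "S"}
        ],
        "KeySchema": [{"AttributeName": key_name, "KeyType": "HASH"}],
        "BillingMode": "PAY_PER_REQUEST",
        # Selects the infrequent access class instead of the default STANDARD.
        "TableClass": "STANDARD_INFREQUENT_ACCESS",
    }


def create_table(params: dict) -> None:
    """Send the request to AWS; requires boto3 and valid credentials."""
    # Imported here so the builder above runs without boto3 installed.
    import boto3

    boto3.client("dynamodb").create_table(**params)


params = infrequent_access_table_params("audit-log-archive", "record_id")
print(params["TableClass"])
```

Whether the claimed saving materialises depends on the ratio of storage to read and write traffic, so it is worth checking a table's access pattern before switching.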
On the ML side, SageMaker, a tool for developing ML models, gets a new training compiler that could double performance via better use of GPU instances; and SageMaker Studio Canvas, a no-code drag and drop environment for building SageMaker models, which Sivasubramanian positioned as an alternative to trying to figure out predictions using figures in spreadsheets. There is also SageMaker serverless inference, which lets users deploy models without having to provision compute resources.
Crypto currency mining is prohibited
SageMaker Studio Lab is a free tier ML service based on the open source JupyterLab, limited to one project and up to 15GB storage. Users can select a CPU or GPU runtime and use it for up to 12 hours (CPU) or 4 hours (GPU). Files are persisted though the runtimes are closed down after each session. Availability is not guaranteed, and “crypto currency mining is prohibited.” Support is community-only.
A service called SageMaker Ground Truth Plus involves actual people, with AWS promising to hire experts to label a customer’s data in order to build a model. “For example, if you need medical experts to label radiology images, you can specify that in the guidelines you provide to Ground Truth Plus. The service will then automatically select labelers trained in radiology to label your data,” said the post today.
How much? As one might expect, after applying “our team of AWS Experts will schedule a call to discuss your data labeling project.” The existing Ground Truth service already offered the possibility of using “Mechanical Turk, third-party vendors, or your own private workforce” to label data, but in the new service AWS itself is taking responsibility for this critical task.
Sivasubramanian also introduced new features for Kendra, the AWS enterprise search service. Apparently customers have struggled to build a user interface to Kendra, so now the Experience Builder will allow a UI to be built in a few clicks and without code.
In addition, Search Analytics provides data on how Kendra is used, and Custom Document Enrichment will pre-process documents with automatic classification, conversion of images to text, and additional metadata using custom processes.
Making it easier to move databases onto AWS is obviously good business for the company. The innovation, though, is in the area of making ML more accessible, right down to providing experts to label data. ®