Redmond turns to Linux AGAIN for Azure data science primer

CentOS-powered big data playground aims to rescue devs from dependency hell

Microsoft has taken a data science bundle it crafted last November and put it onto an Azure-hosted Linux VM.

The combo, announced on Microsoft's Cortana blog, takes CentOS 7.2, runs it up as an Azure virtual machine image, and packages it with a slew of data science tools.

Microsoft had already run up a Windows Server 2012-based offering, which it announced in November 2015.

If nothing else, a “turn it on and use it” Linux data science bundle will save experimenters from dropping into dependency hell while assembling a list of tools that includes R Open with the math kernel library, Python 2.7 and 3.5 (the Anaconda distribution), the Azure CLI and Storage Explorer, and PostgreSQL.

The machine learning suite includes Azure ML, Redmond's GitHub-posted Computational Network Toolkit, Vowpal Wabbit (with hashing, allreduce, reductions, learning2search, active, and interactive learning), the XGBoost boosted tree implementation, and the Rattle R GUI tool.

Gopi Kumar, senior program manager for Microsoft's Data Group, says in the announcement post that users should be able to get things up and running in 15 minutes, so “you can standup your own data science VM within your subscription and you’ll be ready to jump right into data exploration and modelling immediately”.

The developer tool list is also fairly long: “Azure SDK in Java, Python, Node.js, Ruby, PHP; Eclipse IDE with Azure Toolkit plugin; code editors like vim, gedit and Emacs (with ESS, auctex add-ons); SQL Server drivers and command line tools like bcp (Bulk Copy), sqlcmd (text based SQL Server query utility); SQuirreL SQL graphical client to access various databases”.

The VM doesn't have a separate fee – users just pay for compute usage based on its size. Prices run from US$0.018 (1.8 cents) per hour to run it on a single core, up to $8.69 for a 32-core Xeon E5v3-based host.
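
For a rough sense of what those hourly rates add up to, here is a back-of-the-envelope sketch in Python. Only the two hourly prices come from the figures quoted above; the 730-hour month and the `compute_cost` helper are our own assumptions for illustration, and real bills will depend on region, storage and how long the VM is actually left running.

```python
# Hourly rates quoted in the article (US dollars per hour).
SINGLE_CORE_RATE = 0.018   # smallest, single-core size
LARGEST_RATE = 8.69        # 32-core Xeon E5v3-based host

# Assumption: an average month of roughly 730 hours (8,760 hours / 12).
HOURS_PER_MONTH = 730


def compute_cost(hourly_rate_usd, hours):
    """Return the compute charge for running a VM for the given hours.

    Hypothetical helper: it multiplies rate by time and nothing more,
    so storage, bandwidth and any other Azure charges are ignored.
    """
    return hourly_rate_usd * hours


if __name__ == "__main__":
    for label, rate in (("single core", SINGLE_CORE_RATE),
                        ("32 cores", LARGEST_RATE)):
        monthly = compute_cost(rate, HOURS_PER_MONTH)
        print(f"{label}: ${monthly:,.2f} for {HOURS_PER_MONTH} hours")
```

Left on around the clock, the gap between the cheapest and the priciest size is considerable, so anyone just kicking the tyres will want to start small and shut the VM down between sessions.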

All files are saved if the VM is turned off, and there are extra services available in Azure and in the Cortana Intelligence Suite, he writes. ®

