Snowflake wrestles Python, chases China, and ingests unstructured data
Cloudy analytics contender is also having a look at Amazon's Graviton silicon
Cloudy data-cruncher Snowflake has added Python support to its "Snowpark" developer toolkit.
Clive Astbury, a regional manager for sales engineering, told The Register that customers have expressed frustration at needing to export data from Snowflake to put the popular programming language to work. Adding support for the language solves that issue. Snowflake worked with Anaconda for access to pre-groomed Python libraries in an arrangement that, it is hoped, will reduce dependency dramas.
The Snowpark dev kit can now also use Java Functions to process unstructured data – vids, pics, and docs – so they too can be fed into Snowflake's analytics and ML engines. Another new feature is stored procedures, which let code run inside Snowflake instead of in an external client.
"We also have the ability to run our own native algorithms in Python, (statistical language) R, and SQL but can also run open-source algorithms from Python and R libraries directly on all data that [cloud platform] Vantage will have access to."
All the new stuff mentioned above is in private preview, meaning select customers can play with it ahead of a formal debut to all users.
Some of those users may soon be in China – The Register has learned that a translation of Snowflake for that market is under way. The company held back on its entry to Japan while it worked on translation, on the grounds that targeting only English-speaking users would limit growth potential. Similar logic applies to Snowflake's entry to China.
Greg Roodt, head of data platforms at SaaS-y graphics application Canva, told The Register he eagerly awaits the debut of the service as he's keen to operate a single analytics setup instead of maintaining the data warehouse the company currently operates in the Middle Kingdom.
Snowflake is playing catch-up with other data warehouse vendors in terms of Python support. Teradata has had a Python connector since 2015. It also allows users to run Python scripts directly on Vantage, either through a client interface or through Python installed directly on Vantage nodes "for highly efficient, parallel execution," a Teradata spokesperson said.
Most Snowflake customers prefer to use the service in Amazon Web Services, which led The Register to ask if the analytics vendor has pondered using the cloud colossus's homegrown Graviton2 CPU, as AWS claims it offers superior price/performance – just the ticket for any SaaS-slinger.
Snowflake's Astbury said he is aware Snowflake engineers have started to consider the Arm-powered silicon, but could offer no detail on whether it features in Snowflake's plans, or ever will.
Hyoun Park, CEO and chief analyst at Amalgam Insights, said: "Snowflake has established itself as a dominant cloud data warehouse that is capable of storing structured data and becoming the trusted repository for traditional enterprise data in cloud settings. However, to continue evolving, Snowflake must also become a central working environment for data science."
He said today's news from Snowflake "allows data analysts and scientists to use Python within Snowflake more easily and in a more governed fashion to support production-grade data preparation and data science on enterprise data."
Park added that Python had become the most common language to support modern applications due to its "ease of syntax and its value in supporting machine learning."
"With this announcement, Snowflake shows its ability to not just be the modern data warehouse, but also a central location for building and supporting modern applications on top of trusted data." ®