Snowflake wrestles Python, chases China, and ingests unstructured data

Cloudy analytics contender is also having a look at Amazon's Graviton silicon


Cloudy data-cruncher Snowflake has added Python support to its "Snowpark" developer toolkit.

Clive Astbury, a regional manager for sales engineering, told The Register that customers have expressed frustration at needing to export data from Snowflake to put the popular programming language to work. Adding support for the language solves that issue. Snowflake worked with Anaconda for access to pre-groomed Python libraries, in an arrangement it hopes will reduce dependency dramas.

The Snowpark dev kit has also gained the ability to use Java Functions to enhance its ability to process unstructured data – vids, pics, and docs – so they too can be fed into Snowflake's analytics and ML engines. Another new feature is stored procedures, which let code run inside Snowflake instead of in an external client.

All the new stuff mentioned above is in private preview, meaning select customers can play with it ahead of a formal debut to all users.

Some of those users may soon be in China – The Register has learned that a translation of Snowflake for that market is under way. The company held back on its entry to Japan while it worked on translation, on the grounds that targeting English-speaking users would limit growth potential. Similar logic applies to Snowflake's entry to China.

Greg Roodt, head of data platforms at SaaS-y graphics application Canva, told The Register he eagerly awaits the debut of the service as he's keen to operate a single analytics setup instead of maintaining the data warehouse the company currently operates in the Middle Kingdom.

Snowflake is playing catch-up with other data warehouse vendors in terms of Python support. Teradata has had a Python connector since 2015. It also allows users to run Python scripts directly on Vantage, either through a client interface or through Python installed directly on Vantage nodes "for highly efficient, parallel execution," a Teradata spokesperson said.

"We also have the ability to run our own native algorithms in Python, (statistical language) R, and SQL but can also run open-source algorithms from Python and R libraries directly on all data that [cloud platform] Vantage will have access to."

Most Snowflake customers prefer to use the service in Amazon Web Services, which led The Register to ask if the analytics vendor has pondered using the cloud colossus's homegrown Graviton2 CPU, as AWS claims it offers superior price/performance – just the ticket for any SaaS-slinger.

Snowflake's Astbury said he is aware Snowflake engineers have started to consider the Arm-powered silicon, but could offer no detail on whether it features in Snowflake's plans, or ever will.

Hyoun Park, CEO and chief analyst, Amalgam Insights, said: "Snowflake has established itself as a dominant cloud data warehouse that is capable of storing structured data and becoming the trusted repository for traditional enterprise data in cloud settings. However, to continue evolving, Snowflake must also become a central working environment for data science."

He said today's news from Snowflake "allows data analysts and scientists to use Python within Snowflake more easily and in a more governed fashion to support production-grade data preparation and data science on enterprise data."

Park added that Python had become the most common language to support modern applications due to its "ease of syntax and its value in supporting machine learning."

"With this announcement, Snowflake shows its ability to not just be the modern data warehouse, but also a central location for building and supporting modern applications on top of trusted data." ®



Biting the hand that feeds IT © 1998–2021