Data scientists: Do they even exist?

Data, data everywhere, but not a drop to shrink


Open ... and Shut Big Data is all the rage. Now if only someone had a clue what to do with it.

According to a new survey of senior executives by Big Data consultancy NewVantage, Big Data is "top of mind for leading industry executives," but these same executives struggle to find the right people to analyse their data. In fact, while 70 per cent of the organisations surveyed plan to hire data scientists, 100 per cent of them said they find it at least "somewhat challenging" to hire competent data scientists.

Given the difficulty in finding qualified people to analyse data, it's perhaps not surprising that only 0.5 per cent of enterprise data gets analysed, according to IDC. But if this is the case, why is Big Data so big?

After all, Gartner expects Big Data to drive $34bn in IT spending in 2013. Some companies, like Sears, clearly "get" Big Data and are putting it to work. But for the unwashed masses of enterprise IT, it sounds like Big Data is an aspiration, not a reality.

Still, it's an aspiration that has hard dollars chasing it. Of the top-10 job skills in demand on Indeed.com's job boards, two are Big Data-related. Over time, however, I suspect this data scientist arms race will be absorbed by two other trends:

1. Big Data technology being embedded into applications and

2. Enterprises training existing employees on Big Data technologies rather than hiring data scientists.

On the first trend, Cloudera chief executive Mike Olson perhaps said it best when he argued that the value of big-data technology like Hadoop will increasingly be delivered through applications. Enterprises won't need data scientists as their applications will process and analyse the data for them. Yes, someone will still need to know which questions to ask of the data, but the hard-core science of it should be rendered simpler by applications.

The second trend is equally important, and was called out by Gartner analyst Svetlana Sicular, who posits that "Organisations already have people who know their own data better than mystical data scientists" and that "Learning Hadoop is easier than learning the company’s business." So enterprises should focus on training employees to use tools like Hadoop rather than wasting cycles and recruiting fees scouring the planet for mythical data scientists.

All of which should provide some comfort to those organisations that have been struggling to find data scientists to analyse their data. It may turn out that the "mythical data scientist" is actually Lily who works one cubicle over. ®

Matt Asay is vice president of corporate strategy at 10gen, the MongoDB company. Previously he was SVP of business development at Nodeable, which was acquired in October 2012. He was formerly SVP of biz dev at HTML5 start-up Strobe (now part of Facebook) and chief operating officer of Ubuntu commercial operation Canonical. With more than a decade spent in open source, Asay served as Alfresco's general manager for the Americas and vice president of business development, and he helped put Novell on its open source track. Asay is an emeritus board member of the Open Source Initiative (OSI). His column, Open...and Shut, appears three times a week on The Register. You can follow him on Twitter @mjasay.

Biting the hand that feeds IT © 1998–2022