There’s a lot of hype around data scientists. You can blame big data and the cloud. Data scientists are lauded, hunted and positively desired by those wanting to squeeze the most from their information.
In accordance with such demand come large salaries – the average is $123,000 in the US. Like the Yanks of WWII, data scientists are overpaid, oversexed and over here – "here" being the cosy IT establishment we have been familiar with for decades as firms recruit outside the comp-sci arena.
But take a moment: do you actually know what these data scientists do?
Finding patterns in big data, yeah, but that's only one small part of a larger subject. Data scientists shape your world in imperceptible ways.
Data science is a broad term that finds a home in web analytics, machine learning, healthcare and biotechnology, amongst other fields. The only commonality is huge amounts of data from which to extract useful information.
We all know that as consumers we are mined for data when we connect to any website or service. What you probably don’t appreciate, however, is the degree to which data science wizards have been deployed, and how far their work extends on the other side of that site or service.
Take for example the happy prospect of booking a holiday. With big data, it becomes a whole new proposition. The website becomes all about the highly personalised experience for you, and only you. Frictionless is the word that is frequently used.
The site automatically adjusts itself around items that may appeal to you specifically. Obviously, if you are presented with something compelling, you are much more likely to buy it. The personalised experience is a byword for increased profits and also enhanced user happiness. It needs to be done right from the start, though.
This is where a data scientist earns his pay. The personalisation effect comes from a number of sources. The data collected and used includes obvious information such as IP address, visit frequency and location (even without GPS, location is now fairly accurate thanks to extensive work on geo-location).
You probably knew this much. What you may not have expected, however, is how this data is used.
Using advanced (usually proprietary) algorithms, data scientists can make some educated assumptions about you and where you buy stuff. For example, if you frequently log in from two different places, it is usually fairly safe to deduce that one location is home and the other is work.
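The home-versus-work deduction can be sketched in a few lines of Python. This is only an illustration, not how any real site does it: the real algorithms are proprietary, and the `guess_home_and_work` function, the `(timestamp, location)` login records and the weekday office-hours rule used to tell the two places apart are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime


def guess_home_and_work(logins, office_hours=(9, 17)):
    """Guess which of a visitor's two most frequent locations is home or work.

    `logins` is a list of (datetime, location_label) pairs. Both the record
    shape and the office-hours rule are assumptions made for illustration.
    Returns a dict with "home" and "work" keys, or None when fewer than two
    distinct locations have been seen.
    """
    counts = Counter(location for _, location in logins)
    if len(counts) < 2:
        return None

    # The two places the visitor logs in from most often.
    top_two = [location for location, _ in counts.most_common(2)]

    def office_share(location):
        # Fraction of logins from this place made on a weekday in office hours.
        times = [ts for ts, loc in logins if loc == location]
        in_office = sum(
            1
            for ts in times
            if ts.weekday() < 5 and office_hours[0] <= ts.hour < office_hours[1]
        )
        return in_office / len(times)

    # Assumed rule: the place used mostly during office hours is work.
    work = max(top_two, key=office_share)
    home = top_two[1] if work == top_two[0] else top_two[0]
    return {"home": home, "work": work}
```

Fed a week of logins, with location "A" seen only on weekday daytimes and location "B" only on evenings and weekends, the function labels A as work and B as home. A production system would presumably weigh far more signals than the clock.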
At this point algorithms can make an educated guess about your socio-economic grouping, based on aggregate data collected from nearby IP addresses, their search history and the like.
All this may sound scary. But because big data logic decides on facts rather than emotion, and has more facts at its disposal, it may well choose a better holiday for you than you would have picked without the technology.
This data crunching allows the website to present offers that have proven popular with people who share not only your economic grouping but also your location, lifestyle and aspirations. In short, it can potentially pick a better experience.
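As a rough sketch of "popular with people like you", the snippet below counts past bookings made by visitors in the same grouping and region and returns the most booked offers. The field names (`segment`, `region`, `offer`), the function name `top_offers_for` and the exact-match notion of "similar" are hypothetical simplifications; a real site would likely use a far richer similarity measure.

```python
from collections import Counter


def top_offers_for(visitor, past_bookings, n=3):
    """Return up to `n` offers most booked by visitors similar to `visitor`.

    "Similar" here means the same (assumed) socio-economic segment and
    region, which is a deliberately crude stand-in for real profiling.
    """
    similar_offers = [
        booking["offer"]
        for booking in past_bookings
        if booking["segment"] == visitor["segment"]
        and booking["region"] == visitor["region"]
    ]
    # Rank offers by how often similar visitors booked them.
    return [offer for offer, _ in Counter(similar_offers).most_common(n)]
```

Calling `top_offers_for({"segment": "family", "region": "north"}, bookings)` would then yield the handful of customised offers to put in front of that visitor, with anything booked only by other groupings left out.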
To build up an even more complete picture (and this is where it gets a bit scary), after you have done some initial browsing a batch job in the system can go out and trawl publicly available databases, such as the electoral roll, to find out more about you.
If it finds data about you, the offers can become even more personalised. It will know not only your name but also your household make-up.
Therefore, if your household has two adults and two children under 16, it will know that certain places are likely to have big appeal. There are other, closed databases that hold even more aggregated data than this, but these vendors don’t advertise the fact for obvious reasons.
If you leave feedback about a holiday you went on, it will aggregate that and know that similar people will share similar viewpoints.
As the site collects more and more data on you, it gets to know you better and better. At some point it may actually know you better than you know yourself. It applies cold, hard logic to what it presents, so the system knows that if it shows you three or four customised offers you will more than likely love at least one of them.
That data feeds back for your next visit so it can fine-tune what it shows you in the future.