Thank heavens for the silicon chip: A brief history of data

You have a pair of bones in Africa to thank for Larry Ellison

Seize the data

Some people argue that networking gave us the internet and the web, so networking is the most important feature of computing. UI advocates counter that, without a good interface, a computer is useless, so the GUI is king. Both of these aspects of computing are important, but data is the true ground zero for computers, upon which everything else is built. If you disagree, there is a simple test: try to think of a computer application that doesn’t manipulate data. It is a short test, because it is impossible.

There are many computers not connected to the internet, many that don’t have GUIs, but ALL computer programs manipulate data. Even the humble “Hello World” that, for many of us, was our first computer program, manipulates data.

The mistake that the early programmers made was to underestimate the importance of data: they tended to write programs that mixed the data, the application code and the UI all together.

However, because they were smart, they noticed that every single application they wrote manipulated data. So eventually some bright spark wrote a specialised program, the sole purpose of which was to store and manipulate data. This was retrospectively called a Database Management System (DBMS). Its creation meant that subsequent applications could be much smaller because, instead of including all the code necessary to store and manipulate data, they could simply send the data to the DBMS and, when they wanted the data, they could call it back.
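That division of labour is easiest to see with a modern sketch. Here Python’s built-in sqlite3 module stands in for the DBMS — an illustrative assumption only, not what those early systems looked like. The application keeps no storage code of its own; it sends the data to the DBMS and calls it back when needed:

```python
import sqlite3

# The application no longer carries its own storage and retrieval code;
# it delegates all of that to the DBMS (here, an in-memory SQLite database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, amount REAL)")

# Send the data to the DBMS...
conn.execute("INSERT INTO sales VALUES (?, ?)", ("widget", 9.99))
conn.commit()

# ...and call it back when the application wants it.
rows = list(conn.execute("SELECT item, amount FROM sales"))
print(rows)
conn.close()
```

The table name and values are invented for the example; the point is only that the application shrinks to a few requests, while storage, indexing and retrieval live inside the DBMS.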

DBMSs are vital to the story of data because they didn’t just make data handling easier and faster, they fundamentally changed how we thought about data itself. We started to think about data as an abstract concept rather than as a set of numbers or words and in turn that led us to think about and develop data models.

Data models are formal descriptions of how data will be stored and manipulated. In 1969, IBM’s Ted Codd described the relational data model; the first commercial Relational DBMSs were produced in the 1980s, and RDBMSs rapidly became the tool of choice for handling transactional data, with not a split tally stick in sight.

At the same time, systems like word processors and paint packages were appearing (particularly with the advent of the PC around 1980), so Big Data – in the form of documents, images and so on – was also being produced on an increasingly industrial scale. However, there was a distinction. Both forms of data were being produced but only the tabular data was being analysed; the Big Data was simply being used for its intended purpose and then stored: what the military might call “file and forget”.

There are several reasons why we weren’t analysing Big Data at that time.

The first is that analysing tabular data was the most obviously profitable – most of that data was business data and commercial organisations could readily see the benefit of analysing, say, sales data.

Another reason is that analysing tabular data is easier than analysing Big Data. However, it is worth pointing out that easier doesn’t mean easy. In fact, tabular data is inherently much more complex than it first appears, and a great deal of our energy in the 1980s and 1990s was devoted to making sure we were doing the basics correctly. As an example, a desirable characteristic of transactional systems is that transactions have the so-called ACID properties – Atomicity, Consistency, Isolation and Durability.
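Atomicity – the “A” in ACID – means a transaction either completes in full or leaves no trace. A minimal sketch, again using Python’s bundled sqlite3 purely as a convenient stand-in for a transactional DBMS (the account names and amounts are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100.0), ("bob", 50.0)])
conn.commit()

# A transfer is one transaction: both steps must succeed together.
try:
    with conn:  # commits on success, rolls back if anything fails
        conn.execute(
            "UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
        # Simulate a failure mid-transfer: this violates the PRIMARY KEY
        # constraint, so the whole transaction is aborted.
        conn.execute("INSERT INTO accounts VALUES ('alice', 0)")
except sqlite3.IntegrityError:
    pass  # the DBMS rolled the failed transaction back automatically

# Atomicity: alice's debit did not survive the failure.
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
conn.close()
```

After the failed transfer, both balances are exactly as they were before it began – the half-completed debit was undone by the rollback, which is precisely what atomicity guarantees.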
