Embracing the data explosion
Acting on data as it is created is the key to exploiting the oncoming information tsunami
Sponsored Feature Data volumes globally are ramping up at a staggering rate. According to market intelligence company IDC, the world's data will grow to 175 zettabytes by 2025 – a volume of data that, if copied onto Blu-ray discs, would create a pile spanning the distance between the Earth and the Moon 23 times.
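(As a rough sanity check of that illustration: assuming 25 GB per single-layer disc and around 1.2 mm of thickness per disc, 175 ZB works out to roughly seven trillion discs, and a stack that size reaches some 8.4 million kilometers – about 22 times the 384,000 km distance to the Moon, in the same ballpark as IDC's figure.)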
This information tsunami shows no sign of abating, with total data expected to more than double in size between 2022 and 2026, says John Rydning, research vice president of IDC's Global DataSphere. He predicts that enterprise data will grow more than twice as fast as consumer data over the next five years, "putting even more pressure on enterprise organizations to manage and protect the world's data while creating opportunities to activate data for business and societal benefits."
Johnson Noel is a senior manager for technical solutions at Hazelcast, the company behind the real-time stream processing platform of the same name. He agrees that enterprises are having to contend with nothing less than a "data explosion".
"If you can't deal with data volumes as they currently stand now - if you cannot meet your service level agreements now – you have got no chance in five years' time when that volume of data has quadrupled, or increased even more," warns Noel.
"A lot of businesses just don't know how to handle this information. Some may not even be aware of the value that this data has, and generally they do nothing about it. But that information is vital because that's the heart and lungs of their business."
Waking up to real-time data processing
In light of these concerns, Noel believes that forward-looking enterprises are waking up to the importance of "real-time" data processing as a way to maximize the commercial value of their data. Real-time platforms are an increasingly popular choice for organizations that store and process high volumes of digital transactions, but while some databases feature native caching options, these can add critical microseconds of delay and latency into the equation. The issue can be particularly pronounced for financial services, ecommerce, healthcare and government agencies, where the ability to read and write data in real time dictates the success or failure of "a moment in time".
"When we start talking about some of the data challenges facing enterprises, it is clear there have always been problems to overcome, but now when we're in the real-time world these issues are much more serious, and much more prevalent. So much more data is being generated and it is moving much faster than before. And more data introduces greater varieties, sources, formats and the increased possibility of data sparsity and veracity." adds Noel.
"If companies were able to deliver insights in real-time, they would get benefits, but invariably at the moment a lot of them aren't able to do that because they haven't got the capabilities."
To help address the issues associated with processing spiraling volumes of data in real time, Hazelcast has developed a real-time stream processing platform, integrated with low-latency storage and machine learning inference, that lets enterprises easily and quickly build applications that take action on data immediately.
In addition to distributing data on low-latency storage (a combination of memory and high-performance SSDs), Hazelcast provides a set of APIs to access the CPUs in a cluster for maximum processing speed. The RAM and local storage of all cluster members are combined into a single low-latency data store to provide fast access to data. This distributed model is designed to make organizations' data fault-tolerant and scalable, because if a single member goes down, the backups of the primary mission-critical data are rebalanced across the remaining members.
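As a rough illustration of that model, here is a minimal sketch using Hazelcast's open source Java API (class names per Hazelcast 5.x; the map name and values are invented for the example):

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;

public class ClusterDemo {
    public static void main(String[] args) {
        // Start (or join) a cluster member. Members running on the same
        // network discover each other, and map data is partitioned and
        // backed up across them automatically.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // An IMap is a key-value store held in the cluster's combined RAM;
        // "account-balances" is an invented name for this example.
        IMap<String, Double> balances = hz.getMap("account-balances");
        balances.put("acct-1001", 2500.00);

        // Reads are served by the member that owns the key's partition.
        System.out.println("Balance: " + balances.get("acct-1001"));

        hz.shutdown();
    }
}
```

Running the same program on several machines is enough to form a cluster; the "account-balances" map is then spread, with backups, across all of them.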
Noel notes that the advantage of distributed low-latency storage is that it provides very high-performance access to data, while offering the ability to elastically grow and shrink capacity.
He adds that Hazelcast provides the foundation to develop and deploy fast, scalable applications that let enterprises run large-scale calculations, simulations, and other data- and compute-intensive workloads in real time. It also accelerates enterprises' transactions by significantly reducing data access latency via tiered storage of data sourced from many disparate data stores.
Of course, some technology providers claim, but do not deliver, real-time data processing: "There are some providers out there who say that they do real-time, but all they do is collect the information in real time. Then they end up having to load it into a database or some other service before doing the processing, and then of course, that's too late," explains Noel.
"True real-time involves continuously collecting, analyzing, canonicalizing and performing functions (like processing against rules or machine learning models) on the data 'in-flow' outside of a database or any other service – in other words, the entire process should be run on very low latency storage and delivered to users or downstream processes long before it touches a database."
Noel adds that Hazelcast's approach to processing delivers commercial benefits: "Processing is an area where we differentiate from other conventional solutions. The idea of processing data is really all about being able to immediately access created data, enrich it and then provide information in real time, so that action can be taken immediately to deliver value that would otherwise have been missed."
"It's about making sure that you make the right calls in time, all the time."
Banks improve customer response times
The commercial benefits of real-time data processing were certainly compelling for BNP Paribas Bank Polska, which has deployed an event-driven architecture to increase revenue opportunities. The financial institution, a member of the BNP Paribas banking group whose footprint spans 71 countries, recently undertook an initiative to increase the adoption of its products.
That included promoting personal loans to any customer whose bank account balance was too low to cover a requested cash withdrawal at an ATM. However, the bank's previously deployed batch-oriented infrastructure, based on CRM and data warehouse technology, meant that it would typically take up to two days to present the customer with an offer.
To speed up this response process, the BNP Paribas IT team plugged Hazelcast's real-time stream processing platform into its existing publish/subscribe messaging bus and turned the environment into an event-driven architecture. This gave the bank the ability to act on events in real time, especially since it was already capturing information about customer interactions.
As event data was read by Hazelcast, it was quickly supplemented with data lookups in the low-latency storage, which provided the context necessary to make better decisions on how to respond to the customer. The enriched event could then be published back to the bus for downstream processes to use. One example of a downstream action is sending the customer an SMS message about a product offer immediately after the event.
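A condensed sketch of that read-enrich-republish pattern follows. The article does not name the bank's bus technology, so this assumes, purely for illustration, that it is Kafka, accessed via Hazelcast's Kafka connector; the topic and map names are invented:

```java
import java.util.Map.Entry;
import java.util.Properties;

import com.hazelcast.jet.Util;
import com.hazelcast.jet.kafka.KafkaSinks;
import com.hazelcast.jet.kafka.KafkaSources;
import com.hazelcast.jet.pipeline.Pipeline;

public class OfferEnrichment {
    // Builds the event-driven enrichment job: bus -> enrich -> bus.
    public static Pipeline build(Properties kafkaProps) {
        Pipeline p = Pipeline.create();
        p.readFrom(KafkaSources.<String, String>kafka(kafkaProps, "atm-events"))
         .withoutTimestamps()
         // Look up customer context in the low-latency "customer-profiles"
         // map (hypothetical name), keyed by the customer ID in the event.
         .mapUsingIMap("customer-profiles",
                 Entry::getKey,
                 (event, profile) -> Util.entry(event.getKey(),
                         event.getValue() + "|profile=" + profile))
         // Publish the enriched event back to the bus for downstream
         // consumers, such as the SMS offer service.
         .writeTo(KafkaSinks.kafka(kafkaProps, "offer-events"));
        return p;
    }
}
```

The key design point is that the enrichment lookup hits the cluster's in-memory map rather than a remote database, which is what keeps the round trip from event to offer within real-time bounds.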
"When we started, we didn't know if the system could support the different types of business logic and the expected campaign volumes for this to be a viable effort," stated Szymon Domagala, Enterprise Architect, BNP Paribas Bank Polska.
"But it was easy and relatively cheap to get started to see how the software could work. And we obtained good results, as the offer conversion rate is four times higher than before and the campaigns are profitable."
Cloud-based services add more flexibility
Hazelcast's Noel goes on to point out that, while the majority of his company's enterprise customers currently deploy Hazelcast technology on-premises, with in-house IT teams responsible for software and hardware rollouts and management, there is a growing trend towards adoption of cloud-based software-as-a-service offerings. One of these is Hazelcast Viridian Serverless, a self-service, pay-as-you-go, cloud managed service. Another is Hazelcast Viridian Dedicated, the contractually licensed version of the company's cloud managed service, which offers dedicated servers (cloud instances) for customers.
"We expect to see moves [towards cloud-based services] from customers who are using on premises at that moment. Perhaps they don't want to manage the environment anymore. Our cloud services allow them to focus on their applications, rather than focusing on infrastructure issues. We've given them the flexibility to choose." explains Noel.
Looking to the next logical step beyond real-time data, Noel points to operationalizing machine learning models. Using Hazelcast's technologies, he explains, allows enterprises to build solutions that extrapolate from real-time data to predict events that have not yet happened.
For example, applications are being considered for the healthcare industry that can predict whether an illness affecting a patient will improve or worsen, based on information that is being collected in real time. These applications could monitor the vital signs of hospital patients, including oxygen saturation, heart rate, temperature and blood pressure, with this real-time data loaded into Hazelcast and enriched with context and meaning record by record, immediately, ready for clinicians to act on.
"With all of this information, a clinician can tell if a patient's health is changing and the application can categorize that in terms of urgency - high, medium or low - to show if a person requires, or will require urgent attention," says Noel.
"You can build healthcare applications that can predict (using a machine learning model) if that patient might need urgent attention, so you're becoming more proactive rather than reactive. Real time means that you're responding to someone who's just fallen sick. Predicting that they are going to fall sick is even better than real time, and we provide the platform to operationalize pre-built models to run in-flow when the data is created".
The advantages provided by real time data processing platforms aren't limited to banking and healthcare of course – any organization struggling to accommodate huge volumes of constantly updated information can benefit. They just have to know the best way to handle it first.
You can find out more about the Hazelcast Real-Time Stream Processing platform by attending this free conference, either virtually or in-person, in London on 9th March – head here to register and access more details.
Sponsored by Hazelcast.