The business interest in big data is driven by the demand for greater insight to remain competitive
In the last two years, the frenzy around big data has reached fever pitch, with investment and technology advances moving at a staggering pace. Much of the focus in this fast-moving world is on using analytics to analyse structured, semi-structured and unstructured data in big data platforms like Hadoop, in order to produce new high-value insights that add to what is already known about customers and business operations. For many companies, one of the top priorities in the use of big data is to deepen customer insight to improve customer experience. There is also a drive to improve operational effectiveness through better operational decision making, and a rise in demand for real-time analytics. Examples include real-time analysis of market data for more timely and accurate trading decisions, tackling financial crimes such as credit card fraud or money laundering, monitoring of asset and utility grid usage for field service optimisation and preventative maintenance, and improving click-through in on-line advertising.
Yet the focus is often on defining requirements for big data analytics, with too little attention given to defining requirements for capturing the transactional and non-transactional data needed, or for delivering right-time, in-context insights to front-line decision makers via integration with transaction processing systems.
In this context, you can break a big data strategy into two major parts:
- Big data analytical processing
- Big data operational transaction and non-transactional event processing
The focus of this paper is on the latter.
BIG DATA ANALYTICS – THE DEPENDENCY ON TRANSACTION DATA
In order to analyse unstructured or semi-structured big data on platforms like Hadoop, many organisations also load structured transaction and master data into this environment and integrate it with the multi-structured data to provide a context for analysis.
To analyse customer on-line behaviour, companies often want to combine customer data, product data, non-transactional click stream data and transaction data. For example, many organisations want to analyse:
- The paths taken on a website that led to purchasing of specific products
- The paths taken on a website that led to abandoned shopping carts
- The products placed into and taken out of shopping carts en route to purchasing
- Customer profitability on-line versus other channels
It is clear from both of these examples and surveys that big data analytics is very much dependent on transaction data. In addition, given that many organisations using Hadoop are now offloading data warehouse ETL processing to Hadoop, transaction data may also end up on Hadoop systems first for data cleansing and data integration en route to traditional data warehouses.
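As a minimal sketch of why this kind of analysis depends on transaction data, the Python below combines click stream events with order records to separate paths that led to a purchase from paths that ended in an abandoned cart. The session identifiers, page names, the `add_to_cart` action and the order amounts are all hypothetical sample data, not taken from any real system.

```python
from collections import defaultdict

# Hypothetical click stream events in time order: (session_id, page or action).
clicks = [
    ("s1", "home"), ("s1", "product/42"), ("s1", "add_to_cart"), ("s1", "checkout"),
    ("s2", "home"), ("s2", "search"), ("s2", "product/7"), ("s2", "add_to_cart"),
    ("s3", "home"), ("s3", "product/42"),
]

# Hypothetical transaction data: session_id -> order total.
orders = {"s1": 59.90}


def paths_by_session(events):
    """Group click events into an ordered path per session."""
    paths = defaultdict(list)
    for session_id, step in events:
        paths[session_id].append(step)
    return dict(paths)


def classify(paths, orders):
    """Split paths into those that led to a purchase and those that abandoned a cart."""
    purchased, abandoned = {}, {}
    for session_id, path in paths.items():
        if session_id in orders:
            purchased[session_id] = path
        elif "add_to_cart" in path:
            abandoned[session_id] = path
    return purchased, abandoned


paths = paths_by_session(clicks)
purchased, abandoned = classify(paths, orders)
print("purchased:", purchased)
print("abandoned:", abandoned)
```

Without the order records, the click stream alone cannot distinguish session `s1` from session `s2`: both placed an item in the cart, and only the transaction data shows which one completed the purchase.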
THE IMPACT OF BIG DATA ON TRANSACTION SYSTEMS
Since the late 1950s and the birth of mainframe technology, software applications have been built to process and record business transactions in database management systems (DBMSs). Initially this was done with application programs processing groups of transactions in batch with input and output files being used in process execution.
Then on-line transaction processing (OLTP) systems emerged and user interfaces appeared on the scene to capture transaction data. Today these user interfaces have progressed to being web and mobile based with OLTP systems being used to record transaction activity across almost every aspect of business operations including sales force automation, order-to-cash, fulfilment (materials-to-finished goods), shipment-to-delivery, procure-to-pay, customer service, billing, finance and HR e.g. hire-to-fire/retire.
Historically, transaction data was recorded in non-relational database systems such as IBM IMS, CA DatacomDB, Burroughs DMSII and Cincom Total until the arrival of relational DBMSs such as IBM DB2 in the early 1980s. Today, although non-relational products like IBM IMS are still very much alive and well, relational DBMSs underpin the vast majority of the world’s transaction systems supporting on-line transaction processing.
A transaction can be defined as: “An input message to a computer system dealt with as a single unit of work performed by a database management system”.
Traditional on-line transaction processing systems usually support ACID properties, which stipulate that a transaction must be:
- Atomic – the transaction executes completely or not at all
- Consistent – the transaction must obey legal protocols and preserve the internal consistency of the database
- Isolated – the transaction executes as if it were running alone, with no other transactions
- Durable – the transaction’s results will not be lost in a failure
Traditional ACID transactions are typically associated with structured data. All relational DBMSs support ACID properties to guarantee transaction integrity.
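To make atomicity and consistency concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `account` table, the balances and the `transfer` helper are hypothetical and exist only for illustration. A transfer credits one account and debits another as a single unit of work; if the debit would break the table's consistency rule, the earlier credit is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Consistency rule (hypothetical): a balance may never go negative.
conn.execute(
    "CREATE TABLE account ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one unit of work; return True if committed."""
    try:
        # The connection context manager commits on success and
        # rolls back the whole transaction if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


ok = transfer(conn, "A", "B", 30)       # both updates are applied
after_ok = balances(conn)
failed = transfer(conn, "A", "B", 500)  # debit violates the CHECK, credit is undone
after_failed = balances(conn)
print(ok, after_ok)
print(failed, after_failed)
```

In the failed transfer the credit to account B has already executed when the debit is rejected, so the unchanged balances afterwards show the "completely or not at all" behaviour described above.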
Since the emergence of the web in the 1990s there has been explosive growth in the rate (or velocity) at which data is being generated. Data velocity has increased both in terms of traditional on-line transaction rates and also in terms of so-called ‘non-transaction’ data.
BIG DATA TRANSACTIONS – THE IMPACT OF DEVICES, SENSORS AND THE INTERNET OF THINGS
Since the birth of the Internet there has been a relentless rise in on-line shoppers and on-line transaction processing as more and more websites offer e-commerce capabilities. However, it is the arrival of mobile devices, together with the convenience of mobile commerce, that is taking the growth in transaction volumes to new levels. It is therefore no surprise that transaction data is rapidly on the increase, but even that is small compared to what is happening to click stream data in web logs, created from the on-line browsing that happens before transactions occur. Transaction levels and the accompanying click stream data are reaching unprecedented volumes. In addition, edge devices in a network such as network switches, routers and multiplexers, or devices at the edge of a utility grid such as meters, re-closers, capacitor banks, solar inverters, street lights, and even buildings or homes, now support the embedding of traditional DBMSs to capture data at the edge for further analysis.
Just as the web and mobile/edge devices have driven the growth of traditional transaction rates, so the arrival of sensor networks and sensors embedded in ‘smart’ products has driven up the rate at which non-transactional big data is generated.
Sensor data can be created at very high rates. 50,000 sensors each emitting readings 5 times per second create 250,000 pieces of data per second. That is 15,000,000 pieces of data a minute. Now consider the emerging world of smart products such as smart phones, smart meters for use in energy, smart cars, smart buildings, and smart production lines. Smart products have sensors built-in that can send back data for collection and analysis. Just imagine the GPS sensor data from all the mobile smart phones in the world or from all the vehicles on the move or all the smart meters in the world. The need to analyse this type of data means that the systems that capture it have to scale to handle both the volumes of data being generated and the velocity at which it is being created.
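The sensor arithmetic above can be reproduced, and extended to a daily figure, with a few lines of Python. The sensor count and emission rate come from the text; the 100 bytes per reading used for the storage estimate is purely an assumption for illustration, since real payload sizes vary widely by device.

```python
SECONDS_PER_MINUTE = 60
MINUTES_PER_DAY = 60 * 24

# Figures from the text.
sensors = 50_000
readings_per_sensor_per_second = 5

# Assumption for illustration only: average size of one reading in bytes.
BYTES_PER_READING = 100

per_second = sensors * readings_per_sensor_per_second
per_minute = per_second * SECONDS_PER_MINUTE
per_day = per_minute * MINUTES_PER_DAY

bytes_per_second = per_second * BYTES_PER_READING
bytes_per_day = per_day * BYTES_PER_READING

print(f"{per_second:,} readings per second")
print(f"{per_minute:,} readings per minute")
print(f"{per_day:,} readings per day")
print(f"{bytes_per_second / 1e6:.1f} MB per second (assuming {BYTES_PER_READING} bytes per reading)")
print(f"{bytes_per_day / 1e12:.2f} TB per day (assuming {BYTES_PER_READING} bytes per reading)")
```

Under that assumed payload size, even this modest sensor network would need to ingest on the order of terabytes per day, which illustrates why the capture systems must scale for both volume and velocity.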
THE NEED TO MODERNISE AND SCALE TRANSACTION SYSTEMS FOR BIG DATA CAPTURE AND EXTREME TRANSACTION PROCESSING
Given what we have discussed so far, there is no question that for many organisations, an important factor in business success with big data is modernising the data platform underpinning transaction processing systems. This needs to happen to:
- Scale to capture and serve up non-transactional data prior to and during transaction processing
- Scale to handle ever-increasing transaction rates from desktop and mobile devices with full transactional integrity
- Scale to capture, ingest and record non-transactional structured, semi-structured and unstructured data for big data analysis e.g. click stream data and sensor data, to produce the new insights needed for competitive advantage
- Capture, ingest, record and serve up structured, semi-structured and unstructured non-transactional data at scale to enrich OLTP system user interfaces
- Leverage enriched data in enhanced OLTP user interfaces to provide everything a user needs for informed transaction processing. This includes:
  - Integrating transaction processing systems with non-transactional information services
  - Integrating transaction processing systems with traditional and big data analytical systems to leverage real-time analytics and actionable prescriptive analytics to guide operational decision making (including recommendations to tempt customers)
Real-time analytics on streaming data in motion may also become an increasingly important requirement.
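As one illustration of what real-time analytics on data in motion can look like, the sketch below applies a simple sliding-window "velocity check" to a stream of events, flagging a key (for example a payment card) when too many events arrive within a short window. The class name, the 60-second window, the threshold of 3 and the sample events are all hypothetical; a production system would typically use a dedicated stream processing platform rather than in-process Python.

```python
from collections import defaultdict, deque


class VelocityCheck:
    """Flag a key when more than max_events arrive within window_seconds."""

    def __init__(self, window_seconds, max_events):
        self.window_seconds = window_seconds
        self.max_events = max_events
        self._recent = defaultdict(deque)  # key -> timestamps inside the window

    def observe(self, key, timestamp):
        """Record one event and return True if the key now exceeds the threshold."""
        window = self._recent[key]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while window and window[0] <= timestamp - self.window_seconds:
            window.popleft()
        return len(window) > self.max_events


# Hypothetical event stream: (card id, time in seconds), in arrival order.
events = [
    ("card-1", 0), ("card-1", 10), ("card-2", 15),
    ("card-1", 20), ("card-1", 25), ("card-1", 100),
]

check = VelocityCheck(window_seconds=60, max_events=3)
flags = [(key, ts, check.observe(key, ts)) for key, ts in events]
for key, ts, flagged in flags:
    print(key, ts, "FLAGGED" if flagged else "ok")
```

Each event is evaluated as it arrives, using only a small amount of state per key, so the decision is available while the transaction is still in flight rather than after a batch load into an analytical system.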
EXPLOITING RICHER TRANSACTION DATA
With the ability to capture non-transactional and transactional data together in real-time and enrich transaction data, there are new opportunities to gain considerable business value whether that be in the form of improved customer satisfaction and loyalty, increased revenue, reduced cost, optimised operations, reduced risk or better compliance. This is especially the case if you also add real-time analytics into the mix.
LEVERAGING OPERATIONAL ANALYTICS IN TRANSACTIONAL ENVIRONMENTS
As mentioned previously, analytical queries on live transaction data are often discouraged to avoid the impact they may have on the performance of transaction processing. In this scenario, data is copied from the transaction system to an analytics system (e.g. data warehouse, data mart or Hadoop system), introducing a time lag or latency between a transaction occurring and its data becoming available for analysis. For many of the real-time analytical applications we have discussed, eliminating this time lag can create significant competitive advantage:
- Faster analytics on new transaction data in financial trading platforms could mean the difference between trading gains or losses
- Advertising and promotions can be served up more rapidly to help on-line media organizations improve ad response rates and increase revenue
- Gaming firms can better optimize their user experience and win more acceptances of ‘in-play’ offers
- On-line and mobile retailers can personalise offers more quickly to grow their business
It is the emergence of these smart transaction systems that can capture data at scale and leverage insight in real-time that will drive business value in top performing companies. Smart OLTP systems will be able to leverage transactions and big data analytics on-demand, on an event-driven basis and in real-time for competitive advantage. The figure below shows how operational OLTP applications can integrate with traditional and Big Data analytical platforms to facilitate smart business.
Intelligent Business Strategies is a research and consulting company whose goal is to help companies understand and exploit new developments in business intelligence, analytical processing, data management and enterprise business integration. Together, these technologies help an organisation become an intelligent business.