Using Data Blocks and Data Versioning to deliver real time analytics


This is part of a series of articles describing how the Meniscus Analytics Platform (MAP) works. These articles look at the features that make MAP different to other analytics applications by providing an Integrated Analytics Stack that delivers real time analytics. In this article we discuss Data Blocks and Data Versioning.

In delivering real time analytics, disk IOPS (Input/Output Operations Per Second) is one of the main rate-limiting steps in achieving the calculation speeds required when processing high volume, high velocity raw data. An example of such data is radar rainfall data, where new values covering a large area arrive every 5 minutes.

To help reduce disk IOPS, we built the concepts of Data Blocks and Data Versioning into MAP. Together they speed up data access, increase calculation speed and reduce the volume of data written back to the database.

Data Blocks

Rather than loading and persisting all data for an Item, data can be broken up into chunks called Blocks. Only the chunks of data needed for a query, or as an input to a calculation, are loaded from the database (i.e. delay loading), and only the chunks that actually change need to be persisted. Blocks are typically used with unbounded, time-related data such as sample arrays. The size of a Block is limited, so the maximum number of samples in a Block depends on the size of a sample. This provides efficiencies in real-time processing, where data changes are localised and typically occur at the end of the data.
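The loading and persisting behaviour described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MAP's implementation: the names FakeDatabase, BlockedSeries and BLOCK_SIZE are hypothetical, and the Block size is kept tiny so the effect is easy to see.

```python
BLOCK_SIZE = 4  # samples per Block (hypothetical; kept tiny for illustration)


class FakeDatabase:
    """Stand-in for the real store; counts reads and writes per Block."""

    def __init__(self):
        self.blocks = {}  # block index -> list of samples
        self.reads = 0
        self.writes = 0

    def load(self, index):
        self.reads += 1
        return list(self.blocks.get(index, []))

    def save(self, index, samples):
        self.writes += 1
        self.blocks[index] = list(samples)


class BlockedSeries:
    """Sample array split into fixed-size Blocks that are loaded on demand."""

    def __init__(self, db):
        self.db = db
        self.loaded = {}    # block index -> samples currently in memory
        self.dirty = set()  # indexes of Blocks changed since the last persist

    def _block(self, index):
        if index not in self.loaded:  # delay loading: fetch only when needed
            self.loaded[index] = self.db.load(index)
        return self.loaded[index]

    def read(self, start, stop):
        """Return samples [start, stop), touching only the overlapping Blocks."""
        out = []
        for index in range(start // BLOCK_SIZE, (stop - 1) // BLOCK_SIZE + 1):
            block = self._block(index)
            lo = max(start - index * BLOCK_SIZE, 0)
            hi = min(stop - index * BLOCK_SIZE, BLOCK_SIZE)
            out.extend(block[lo:hi])
        return out

    def write(self, position, value):
        """Overwrite a sample, or append one at the end of its Block."""
        index, offset = divmod(position, BLOCK_SIZE)
        block = self._block(index)
        if offset < len(block):
            block[offset] = value
        elif offset == len(block):
            block.append(value)
        else:
            raise IndexError("samples must be written contiguously")
        self.dirty.add(index)

    def persist(self):
        """Write back only the Blocks that changed."""
        for index in sorted(self.dirty):
            self.db.save(index, self.loaded[index])
        self.dirty.clear()


db = FakeDatabase()
db.blocks = {0: [1, 2, 3, 4], 1: [5, 6, 7, 8], 2: [9, 10]}
series = BlockedSeries(db)
tail = series.read(8, 10)  # only the last Block is loaded
series.write(10, 11)       # new sample arrives at the end of the data
series.persist()           # only the last Block is written back
```

In this sketch, reading the latest samples and appending a new one costs one Block read and one Block write, however long the full series is.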

Data Blocks are transparent to the user; they are purely an internal mechanism to reduce traffic to and from the database. When requested or persisted, Data Blocks are held in memory for a time. This makes subsequent retrieval faster for a while, as recently used data is expected to be in demand again.
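One way to picture "held in memory for a time" is a cache whose entries expire after a fixed hold period. The Python sketch below is illustrative only: TimedBlockCache, hold_seconds and the choice to restart the hold period on every access are assumptions, not a description of MAP's internals.

```python
import time


class TimedBlockCache:
    """Keeps recently used Blocks in memory for a limited hold period."""

    def __init__(self, loader, hold_seconds, clock=time.monotonic):
        self.loader = loader              # function that fetches a Block from the database
        self.hold_seconds = hold_seconds  # hypothetical hold period
        self.clock = clock                # injectable so the example can control time
        self.entries = {}                 # key -> (expiry time, data)

    def get(self, key):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and entry[0] > now:
            data = entry[1]               # still held in memory: no database traffic
        else:
            data = self.loader(key)       # expired or never loaded: go to the database
        # Assumption: each access restarts the hold period.
        self.entries[key] = (now + self.hold_seconds, data)
        return data

    def evict_expired(self):
        """Drop Blocks whose hold period has passed."""
        now = self.clock()
        for key in [k for k, (expiry, _) in self.entries.items() if expiry <= now]:
            del self.entries[key]


loads = []
now = [0.0]


def load_block(key):
    loads.append(key)
    return f"data-{key}"


cache = TimedBlockCache(load_block, hold_seconds=60, clock=lambda: now[0])
cache.get("b1")   # first request loads from the database
now[0] = 30.0
cache.get("b1")   # within the hold period: served from memory
now[0] = 200.0
cache.get("b1")   # hold period has passed: loaded again
```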

Data Versioning

Data Blocks are complemented by the MAP concept of Data Versioning. All Item data in MAP is versioned, including Blocks (referred to as child data). A version is simply a unique timestamp. It allows users to query the relative age of data: specifically, when the data last changed and, for calculated Items, when the last calculation started and completed. A client application can then tell whether data has changed without having to load the data itself. An Item also carries additional non-data versions, i.e. when its properties or its list of child Items last changed.
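The client-side benefit can be shown with a short Python sketch. It is illustrative only: VersionedItem, Client and next_version are hypothetical names, and a strictly increasing integer stands in for the unique timestamp.

```python
import itertools

_counter = itertools.count(1)


def next_version():
    """Hypothetical stand-in for a unique timestamp: a strictly increasing integer."""
    return next(_counter)


class VersionedItem:
    """Item with separate versions for its data and for its properties."""

    def __init__(self, name):
        self.name = name
        self._data = []
        self.properties = {}
        self.data_version = next_version()
        self.properties_version = self.data_version

    def set_data(self, samples):
        self._data = list(samples)
        self.data_version = next_version()

    def set_property(self, key, value):
        self.properties[key] = value
        self.properties_version = next_version()  # the data version is untouched

    def load_data(self):
        return list(self._data)


class Client:
    """Keeps a local copy and reloads only when the data version has moved on."""

    def __init__(self, item):
        self.item = item
        self.seen_version = None
        self.copy = None
        self.loads = 0

    def refresh(self):
        if self.item.data_version != self.seen_version:  # cheap check, no data loaded
            self.copy = self.item.load_data()
            self.seen_version = self.item.data_version
            self.loads += 1
        return self.copy


item = VersionedItem("rainfall")
item.set_data([0.0, 1.2])
client = Client(item)
client.refresh()                     # first call loads the data
client.refresh()                     # version unchanged: nothing is loaded
item.set_property("units", "mm")     # non-data change: data version stays the same
client.refresh()                     # still nothing to load
item.set_data([0.0, 1.2, 3.4])
latest = client.refresh()            # data version changed: reload
```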

It is this versioning technique that allows MAP to efficiently detect when calculated Items need recalculating (referred to as dirtying a calculation).
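A minimal Python sketch of version-based dirty detection follows. The comparison rule used here (a calculated Item is dirty if any input changed after its last calculation started) is an assumption chosen for illustration, as are the names RawItem, CalcItem and is_dirty; MAP's actual rules may differ.

```python
import itertools

_clock = itertools.count(1)  # hypothetical version source: strictly increasing integers


class RawItem:
    """Holds a raw value and the version at which it last changed."""

    def __init__(self):
        self.value = 0.0
        self.data_version = next(_clock)

    def set(self, value):
        self.value = value
        self.data_version = next(_clock)


class CalcItem:
    """Calculated Item that recalculates only when an input has a newer version."""

    def __init__(self, inputs, func):
        self.inputs = inputs
        self.func = func
        self.value = None
        self.calc_started = 0    # version at which the last calculation began
        self.calc_completed = 0  # version at which the last calculation finished
        self.runs = 0

    def is_dirty(self):
        # Assumption: comparing against the start version also catches
        # inputs that changed while a calculation was running.
        return any(i.data_version > self.calc_started for i in self.inputs)

    def get(self):
        if self.is_dirty():
            self.calc_started = next(_clock)
            self.value = self.func(*[i.value for i in self.inputs])
            self.calc_completed = next(_clock)
            self.runs += 1
        return self.value


a, b = RawItem(), RawItem()
a.set(2.0)
b.set(3.0)
total = CalcItem([a, b], lambda x, y: x + y)
first = total.get()    # inputs are newer than the (non-existent) last run: calculate
second = total.get()   # nothing changed: the stored result is reused
a.set(10.0)            # a newer input version dirties the calculation
third = total.get()    # recalculated
```

Only versions are compared when deciding whether to recalculate; the input data itself is read only when a recalculation actually runs.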

About MAP

MAP is an Integrated Analytics Stack providing a framework for users to create and deploy calculations at scale using any source of raw data. MAP is based on IoT principles and uses Items as the underlying building blocks to store either RAW or CALCulated data. Users create an Entity Template or Thing using these Items and then replicate this template hundreds of thousands of times using an ItemFactory.



Flood Forecasting part 1

MAP-Rain is an existing flood forecasting and rainfall prediction solution based on the proven Meniscus Analytics Platform (MAP), which is a high performance, generic, Big Data, cloud based real time calculation/analytics platform.

MAP-Rain is a generic, initial solution for flood forecasting. It integrates additional datasets and models from existing systems, APIs and databases that are already in use, as well as a range of new datasets.

MAP-Rain delivers the following real-time flood forecasting solutions:


  • Real-time prediction of flooding in sewer networks by using simplified hydraulic models, real-time actual radar-based rainfall, forecast rainfall and sewer pumping station operational data. MAP is currently able to process over 6,000 sewer catchments with the latest rainfall data within five minutes.
  • Aggregating real-time radar rainfall data (five-minute updates for 65,000 km² at 1 km² pixel resolution) into over 1,000 polygon-based sewer catchments. The aggregated data is used by modellers to build a range of hydraulic models, and a range of rainfall-related metrics is calculated for each polygon.
  • On-demand calculation of rainfall return periods (using the Flood Estimation Handbook (FEH) methodology) for any point in a region at any time during the past three years, and comparison of the result with a similar calculation from the local Environment Agency rain gauge.
  • Integrating a complex, open source, third party pollution transportation and hydraulic model (Soil and Water Assessment Tool) to predict the impact that rainfall has on pesticide runoff (in particular metaldehyde) concentrations at key water abstraction points in sensitive river catchments.
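As an illustration of the aggregation step in the list above, the Python sketch below reduces a gridded radar frame to per-catchment metrics. It is a simplified sketch under stated assumptions, not MAP-Rain's method: each catchment polygon is assumed to have been converted in advance to the list of 1 km² pixels it covers, each pixel is assumed to hold a five-minute rainfall depth in mm, and the function name, metric names and sample values are hypothetical.

```python
def catchment_metrics(grid, pixels):
    """Summarise one radar frame over the pixels that make up a catchment.

    grid   -- 2D list of five-minute rainfall depths in mm (assumed units)
    pixels -- list of (row, col) grid cells covered by the catchment polygon
    """
    values = [grid[row][col] for row, col in pixels]
    return {
        "mean_mm": sum(values) / len(values),  # areal average depth
        "max_mm": max(values),                 # wettest pixel in the catchment
        "wet_fraction": sum(1 for v in values if v > 0) / len(values),
    }


# One made-up radar frame and two made-up catchments.
frame = [
    [0.0, 1.0, 0.0],
    [2.0, 1.0, 0.0],
    [0.0, 0.0, 4.0],
]
catchments = {
    "north_west": [(0, 0), (0, 1), (1, 0), (1, 1)],
    "south_east": [(1, 2), (2, 2)],
}
metrics = {name: catchment_metrics(frame, px) for name, px in catchments.items()}
```

Because the pixel list for each polygon is fixed, it can be computed once and reused for every five-minute frame, which keeps the per-frame work to a simple pass over the pixels.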

MAP-Rain is also being used as part of an InnovateUK funded Smart City project (Hyperlocal Rainfall) that aims to increase the use of sustainable transport in cities. This solution predicts the path of actual rainfall over the next hour, at five-minute increments, and relates this to specific journeys that users can create. The aim is to increase the use of cycling and walking by answering the question, “Will it rain during my journey?” The solution uses a combination of real-time radar rainfall data, local wind speed and direction data from an existing local network of weather stations, and high altitude wind forecasts. The radar rainfall data is also being ground-truthed against local rain gauge data to increase accuracy. As part of this Hyperlocal Rainfall project, Meniscus has developed an Android mobile app that lets users create journeys and track and plan them around rainfall. MAP-Rain is integrating a third party personalisation engine that learns users’ behaviour and personalises the app based on the insights it gains. The project is initially focused on Peterborough but is also being tested using the entire radar rainfall dataset for England and Wales.