Generating Big Data

The Value Enhancement for Data from Assets & Transactions (VEDAT) project is a good example of where the Digital Research Service (DRS) enabled new discoveries and generated insights from 'Big Data'.

The project's aim was to "provide tools that enable intelligent, predictive modelling capabilities, including the integration and analysis of heterogeneous data types" in the heavy goods vehicle (HGV) sector. The project was completed in partnership with the company Microlise, which controls more than 30% of the HGV and van fleets across the UK, generating millions of miles' worth of telematics data per day.

One key element of this project was combining previously unconnected, disparate data into a platform that delivers high-value benefits both to individual data owners and to the data-owning community as a whole. Data exploration in this project resulted in novel techniques and tools for tackling HGV transportation challenges in the context of Big Data, particularly what are often referred to as "data silo" black holes. The developed solutions also proved useful across a wide range of applications in other sectors with complex, disparate data environments, such as finance, engineering, biotechnology and informatics.

The data captured by Microlise included information on driving style, risk areas and vehicle flow rates. The main challenges addressed by VEDAT were (i) identifying driving behaviour and establishing industry standards for safe driving; (ii) creating a framework to rank and shortlist the best drivers engaging with Microlise telematics solutions in the UK; and (iii) detecting road segments with a high likelihood of HGV driving incidents, accidents and crime (hot spot identification).

The University team, supported by DRS, provided innovative advanced data analytics solutions, which are now being used by Microlise, its customers and its business partners to understand drivers, vehicle usage, safety procedures and service levels.


HGV Incident Hotspot Analysis

Hot spot identification (HSID) problems are present across several domains, such as health care, security, maintenance, energy or transport. A hot spot can be defined as an area with high likelihood of occurrence for a certain event. In public health care, algorithms to determine hot spots could be employed for early detection of locations of an epidemic outbreak.

In the context of intelligent transportation systems (ITSs), HSID can be applied to telematics data obtained from HGV fleets with the aim of creating methods, processes and devices that improve driving performance as well as road economy and safety.

The DRS enabled the identification of hot spot areas of potential danger to drivers across the whole country, and this information is now also being used by the police. An adaptive algorithm, inspired by the immune system and pheromones, was developed to dynamically determine the importance of hot spots based on current and past data, eliminating old hot spots and adding new relevant locations. The algorithm is implemented with Apache Spark Streaming and uses several MapReduce operations to parallelise the most time-consuming steps, enabling online detection of hot spots in big data streams.
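
The pheromone idea described above can be illustrated with a minimal sketch in plain Python. It is not the project's algorithm and does not use Spark: the grid cell size, the decay factor, the deposit amount and the pruning threshold are all assumptions chosen for illustration, and the function names are hypothetical. Each batch of incidents "deposits" score on the cells where incidents occurred, all existing scores "evaporate" by a fixed factor, and cells whose score falls below the threshold are dropped, so old hot spots fade while new ones appear.

```python
def to_cell(lat, lon, size=0.01):
    """Map a coordinate to a coarse grid cell (cell size is an assumed value)."""
    return (round(lat / size), round(lon / size))


def update_hotspots(scores, incident_cells, decay=0.5, deposit=1.0, threshold=0.5):
    """Process one batch of a stream and return the updated hot spot scores.

    scores         -- dict mapping cell -> current score (past data)
    incident_cells -- iterable of cells with an incident in this batch
    decay          -- evaporation factor applied to every existing score
    deposit        -- score added per new incident
    threshold      -- cells scoring below this are no longer hot spots
    """
    # Evaporation: every known hot spot loses importance over time.
    updated = {cell: score * decay for cell, score in scores.items()}

    # Deposit: new incidents reinforce existing cells or create new ones.
    for cell in incident_cells:
        updated[cell] = updated.get(cell, 0.0) + deposit

    # Pruning: forget cells that have not been reinforced recently.
    return {cell: score for cell, score in updated.items() if score >= threshold}


# Example usage with made-up incident batches.
scores = {}
batches = [
    [to_cell(52.9548, -1.1581), to_cell(52.9548, -1.1581), to_cell(53.4808, -2.2426)],
    [to_cell(52.9548, -1.1581)],
    [],
]
for batch in batches:
    scores = update_hotspots(scores, batch)
    print(sorted(scores.items(), key=lambda item: item[1], reverse=True))
```

In a streaming setting, the per-batch counting of incidents per cell is the step that lends itself to map and reduce operations, while the decay and pruning are cheap updates to the running state.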

This led to a better understanding of HGV driving behaviour and to the creation of new driving and safety standards within the HGV industry. It also promotes societal impact by encouraging changes in driving safety and economy in the UK.

The DRS is continuing its exploratory data analysis to further improve the understanding of driving behaviour and vehicle health.


Helping to find the UK's best HGV driver

The hotspot analysis and HGV driver profiling work also contributed to Microlise's Driver of the Year competition, which recognises the UK’s most talented HGV drivers. Data analysis undertaken by members of the Digital Research Service generates an initial shortlist of fifteen drivers in each of three categories: short, medium and long-distance drivers.

The data was analysed to identify the best criteria for establishing the top-performing drivers. The initial selection was based on a minimum number of miles driven in each of the four quarters of 2014. Each qualifying professional in one of the three categories was then assessed against a range of criteria relative to other drivers in the same category. The Digital Research Service was able to bring a range of data analytics knowledge and expertise together to provide objective and scientific data to help Microlise identify the winner of the Driver of the Year award.
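
The two-stage selection described above (a mileage filter, then ranking within each category) can be sketched as follows. This is an illustrative outline only: the record fields, the mileage threshold and the single composite "score" are hypothetical stand-ins, since the actual assessment criteria are not given here.

```python
from collections import defaultdict


def shortlist(drivers, min_miles_per_quarter, top_n=15):
    """Return the top_n qualifying drivers per category.

    drivers -- list of dicts with hypothetical keys:
               "id", "category", "quarterly_miles" (four values), "score"
    """
    by_category = defaultdict(list)
    for driver in drivers:
        # Stage 1: keep only drivers who met the minimum in every quarter.
        if all(miles >= min_miles_per_quarter for miles in driver["quarterly_miles"]):
            by_category[driver["category"]].append(driver)

    # Stage 2: rank each driver only against peers in the same category.
    return {
        category: sorted(group, key=lambda d: d["score"], reverse=True)[:top_n]
        for category, group in by_category.items()
    }


# Example usage with made-up records.
example = [
    {"id": "d1", "category": "short", "quarterly_miles": [5000, 5200, 5100, 5300], "score": 0.90},
    {"id": "d2", "category": "short", "quarterly_miles": [5000, 100, 5100, 5300], "score": 0.99},
    {"id": "d3", "category": "long", "quarterly_miles": [20000, 21000, 19000, 22000], "score": 0.70},
]
for category, group in shortlist(example, min_miles_per_quarter=1000).items():
    print(category, [d["id"] for d in group])
```

Ranking within a category rather than across the whole fleet keeps the comparison fair, because short, medium and long-distance driving produce very different telematics profiles.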

The developed methodologies have since been reused and developed further, and have contributed to subsequent Driver of the Year awards.