01.03.15
Big data risk analysis for rail
Source: Rail Technology Magazine Feb/March 2015
Dr Coen van Gulijk, reader in Railway Safety at the University of Huddersfield, and Marcus Dacre, the Safety Risk Assessment Manager at RSSB, discuss a newly launched rail safety research group that will assess Big Data Risk Analysis for the rail sector.
Harnessing computer power for rail
The term ‘big data’ is cropping up with increasing regularity, not just at specialist conferences but also in the mainstream media. Advances in data science and the availability of relatively cheap computer power, which have helped to bring about the big data revolution, also create new opportunities for the rail industry.
Big data is often described in terms of the ‘four Vs’: volume (the amount of information available), velocity (some datasets are now continually streamed), variety (information comes in many different forms) and veracity (information may be of varying quality). Big data techniques have been applied in many domains. One of the most cited examples, both by enthusiasts and sceptics, is the use of data on Google searches to monitor the spread of flu.
On the railway, internet services already offer passengers real-time pricing and train times, and engineering decisions are informed by asset condition monitoring systems and on-train data recorders. However, these successes do not transfer easily to other domains. Risk analysis, for example, is different in nature, and careful thought is required before we can exploit cheap computer power effectively.
The search for optimal use of computer power has wide support in the GB rail industry. Network Rail and ATOC are paving the way with the ORBIS programme and DARWIN project, but nearly every stakeholder is involved in the railway data revolution in one way or another.
Harnessing computer power for safety: BDRA
RSSB and the Institute of Railway Research (IRR) at the University of Huddersfield have joined forces to drive the data revolution for safety and risk in the railways. RSSB has a key role in supporting the industry to deliver a safe railway and is a centre of expertise in risk management and risk modelling. The IRR contributes a newly launched rail safety research group comprising safety researchers from the UK, the Netherlands, Australia and Spain.
The working title for the research programme is BDRA, short for Big Data Risk Analysis. The work aims to incorporate ‘big data’ sources into risk modelling and to develop methods to handle the information and extract intelligence from it. The purpose is to provide the rail industry with tools for rapid access to relevant safety information in a cost-effective way to support effective and efficient decision making.
Blueprint for a technical system
Development is at an early stage, but we now have a first blueprint for BDRA. At its centre is a cheap computer system formed from a collection of standard PCs that are hooked up in what is known as a Hadoop cluster. Hadoop is the name of a software package that distributes computing power and disk space over the individual computers in the cluster. This makes computer power extremely cheap, since any off-the-shelf PC (even a used one will do) may be combined with 10, 20 or 100 others.
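To give a feel for how work can be split across many machines, here is a minimal sketch of the map-and-reduce pattern that Hadoop popularised. It is plain Python, not Hadoop itself: a thread pool stands in for the PCs in the cluster, and the function names and the sample signal IDs are illustrative assumptions.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(records):
    """Map step: turn one chunk of records into (key, 1) pairs."""
    return [(signal_id, 1) for signal_id in records]


def reduce_pairs(mapped_chunks):
    """Reduce step: add up the values emitted for each key."""
    totals = defaultdict(int)
    for chunk in mapped_chunks:
        for key, value in chunk:
            totals[key] += value
    return dict(totals)


def count_by_key(chunks, workers=2):
    """Process each chunk on a separate worker, then merge the results.

    Assumption: threads stand in for cluster nodes; a real cluster would
    also distribute the data itself across the machines' disks.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    return reduce_pairs(mapped)


if __name__ == "__main__":
    # Hypothetical data: each inner list is the share held by one machine.
    chunks = [["S1", "S2", "S1"], ["S2", "S3"], ["S1"]]
    print(count_by_key(chunks))
```

Because each chunk is processed independently, adding more machines mainly means adding more chunks, which is why a pile of inexpensive PCs can be enough.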
The inputs for BDRA are ‘big data’ sources and in most cases, rather than keeping dedicated risk data on a local computer, they are accessed through the internet. To date, demonstrator projects have been applied to two sources.
The first data source is Network Rail’s Train Describer (TD) live-stream. The TD live-stream provides train-position data at track berth level as well as information on signal aspects. Dedicated software called Red Aspect Approaches to Signals (RAATS) has been developed to estimate the frequency with which trains approach particular signals at red. It is hoped that this will lead to better understanding and management of SPAD (Signal Passed at Danger) risk.
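The idea of counting red-aspect approaches can be illustrated with a short sketch. This is not the RAATS software: the event format, the field names and the berth-to-signal lookup below are all simplifying assumptions made for illustration, and the real feed is considerably messier.

```python
from collections import Counter


def red_approach_rates(events, berth_to_signal):
    """Estimate, per signal, the share of train approaches made at red.

    Assumptions (hypothetical format): `events` is a time-ordered list of
    dicts, either {"type": "aspect", "signal": ..., "aspect": ...} or
    {"type": "step", "to_berth": ...}. `berth_to_signal` maps a berth to
    the signal a train in that berth is approaching.
    """
    aspect = {}                 # latest known aspect of each signal
    approaches = Counter()      # all approaches, per signal
    red_approaches = Counter()  # approaches while the signal showed red

    for event in events:
        if event["type"] == "aspect":
            aspect[event["signal"]] = event["aspect"]
        elif event["type"] == "step":
            signal = berth_to_signal.get(event["to_berth"])
            if signal is None:
                continue  # berth is not on the approach to a known signal
            approaches[signal] += 1
            if aspect.get(signal) == "red":
                red_approaches[signal] += 1

    return {s: red_approaches[s] / approaches[s] for s in approaches}


if __name__ == "__main__":
    events = [
        {"type": "aspect", "signal": "S1", "aspect": "red"},
        {"type": "step", "to_berth": "B1"},
        {"type": "aspect", "signal": "S1", "aspect": "green"},
        {"type": "step", "to_berth": "B1"},
    ]
    print(red_approach_rates(events, {"B1": "S1"}))
```

A rate per signal, rather than a raw count, makes busy and quiet signals comparable.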
The second data source is the Close Call System. Network Rail and its principal contractors report close calls at work to a central system supported by RSSB, and are encouraging train and freight operators to also use the system. This has had a positive effect on safety culture and provides a mechanism for raising specific concerns. However, because most of the information is in the form of free text it has so far been difficult to extract intelligence from the aggregate dataset. In our current Learning from Close Calls project, software was developed to automatically analyse the free text entries. Automated text analysis is far from trivial since computers do not interpret free text the way humans do. The software combines the knowledge of safety specialists with Natural Language Processing software to yield meaningful safety learning.
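As a rough illustration of combining specialist knowledge with automated text handling, the sketch below tags free-text reports using an expert-supplied word list. It is an assumption-laden toy, far simpler than the Natural Language Processing used in the project: the categories, the keywords and the sample report are all invented for the example.

```python
import re
from collections import Counter

# Hypothetical lexicon: in practice safety specialists would supply
# and refine these categories and terms.
HAZARD_TERMS = {
    "slip_trip_fall": {"slip", "slipped", "trip", "tripped", "fell"},
    "vehicle": {"van", "lorry", "reversing", "vehicle"},
}


def tokenize(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def tag_report(text, lexicon=HAZARD_TERMS):
    """Return the sorted hazard categories whose terms occur in the text."""
    words = tokenize(text)
    return sorted(cat for cat, terms in lexicon.items() if words & terms)


def summarise(reports, lexicon=HAZARD_TERMS):
    """Count how many reports mention each hazard category."""
    counts = Counter()
    for report in reports:
        counts.update(tag_report(report, lexicon))
    return dict(counts)


if __name__ == "__main__":
    report = "Worker tripped over cable near the reversing van"
    print(tag_report(report))
```

Exact keyword matching misses misspellings, abbreviations and context, which is one reason the real task is far from trivial.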
In the longer term the BDRA programme aims to integrate these individual programmes and add further data sources. The safety-relevant information that is extracted will be made available to safety experts and decision makers. Ideally this would be via a dashboard that could be accessed through the internet or maybe even through mobile phones. We have some initial ideas about the design of the dashboard but it will ultimately depend on what the end-users need. We are currently starting a project to draft the requirements and are looking for rail industry decision makers who are willing to participate.
BDRA supports the data revolution in rail
BDRA paves the way for rapid access to relevant safety information, in much the same way that passengers already have access to train departure times. Its unique network-centred approach makes it possible to make use of existing risk models and, in principle, any other safety-relevant data sources. Though the development process has just started, we believe that BDRA is part of a data revolution that will help the railway continuously improve its performance, efficiency and safety.