Data-Centric Safety: Challenges, Approaches, and Incident Investigation

Length:
1,126 pages
9 hours
Publisher:
Elsevier
Published:
27 May 2020
ISBN:
9780128233221
Format:
Book

Description

Data-Centric Safety presents core concepts and principles of system safety management, and then guides the reader through the application of these techniques and measures to Data-Centric Systems (DCS). The authors have compiled their decades of experience in industry and academia to provide guidance on the management of safety risk. Data Safety has become increasingly important as many solutions depend on data for their correct and safe operation and assurance. The book’s content covers the definition and use of data. It recognises that data is frequently used as the basis of operational decisions and that DCS are often used to reduce user oversight. This data is often invisible, hidden. DCS analysis is based on a Data Safety Model (DSM). The DSM provides the basis for a toolkit leading to improvement recommendations. It also discusses operation and oversight of DCS and the organisations that use them. The content covers incident management, providing an outline for incident response. Incident investigation is explored to address evidence collection and management.

Current standards do not adequately address how to manage data (and the errors it may contain) and this leads to incidents, possibly loss of life. The DSM toolset is based on Interface Agreements to create soft boundaries to help engineers facilitate proportionate analysis, rationalisation and management of data safety. Data-Centric Safety is ideal for engineers who are working in the field of data safety management.

This book will help developers and safety engineers to:

  • Determine what data can be used in safety systems, and what it can be used for
  • Verify that the data being used is appropriate and has the right characteristics, illustrated through a set of application areas
  • Engineer their systems to ensure they are robust to data errors and failures

About the author

Dr. Alastair Faulkner is a Consultant Engineer at Abbeymeade Limited. He has more than 30 years of experience in senior management and has specialist knowledge of data-centric systems. He specialises in system safety and systems engineering. He supports clients with business planning, execution, delivery, risk assessment and management.



Data-Centric Safety - Alastair Faulkner

Data-Centric Safety

Challenges, Approaches, and Incident Investigation

First edition

Alastair Faulkner

Mark Nicholson

Table of Contents

Cover image

Title page

Copyright

Preface

Readership

Directed Reading

Bibliography

It's Monday Morning …

Bibliography

Acknowledgements

List of Figures

List of Tables

Part I: Data-Centric Safety

1: Introduction

Abstract

1.1. Logic and Rationality

1.2. Data

1.3. Data, Information, Knowledge and Wisdom

1.4. Systems Reliant on Data

1.5. Data becomes the Dominant Systems Component

Bibliography

2: System Safety Management

Abstract

2.1. Safety Management Systems

2.2. Hazard, Opportunity, Incident

2.3. Decision, Confidence and Uncertainty

2.4. Errors, Faults, Failures and Anomalies

2.5. 4Plus1 Safety Assurance Principles

2.6. Risk Management Model

2.7. Safety Justification

2.8. Maturity Modelling for Data-centric Systems

2.9. Safety Management Paradigms

Bibliography

3: Challenges to Systems Engineering

Abstract

3.1. Systems Science

3.2. Systems Engineering

3.3. Cyber Security Management

3.4. Identity Model

3.5. Information Systems

3.6. Emerging Disciplines

3.7. The Accidental System

3.8. Change in the Systems Domain

Bibliography

Part II: Data-Centric Fundamentals

4: Data Fundamentals

Abstract

4.1. Data Quality

4.2. Value (Economics) of Data

4.3. High-Integrity Data

Bibliography

5: Data-Centric Systems

Abstract

5.1. Classification of Data

5.2. Decision Model

5.3. Uncertainty

5.4. Autonomy and Perception

5.5. Safety Management of Adaptive Systems

Bibliography

6: System Context

Abstract

6.1. Mature Context

6.2. Multiple Contexts

6.3. Context Switch

6.4. Learning, Adaptive and Autonomous

6.5. Indeterminate Context

6.6. Summary

Bibliography

7: System Definition

Abstract

7.1. Requirements

7.2. Requirements Management

7.3. Data Definition Languages (DDL)

7.4. Supervisory Model

7.5. Service Provision

7.6. Rely-Guarantee

7.7. Performance

7.8. Metamodels and Metadata

7.9. Safety-Related Application Conditions

7.10. Security Requirements

7.11. Summary

Bibliography

Part III: Data-Centric Design

8: Data-Centric Architecture

Abstract

8.1. Computational Models

8.2. Diversity

8.3. Architecture Styles and Patterns

8.4. Interfaces and Interface Agreements (IA)

8.5. Critical Control Points

8.6. Metamodel Architectures

8.7. Metadata for IA

8.8. Data Paths

8.9. Summary

Bibliography

9: Development

Abstract

9.1. Operational Context

9.2. Architecture and the Operational Context

9.3. Project Management

9.4. Life Cycle Models

9.5. Configuration Management

9.6. Data Path Implementation

9.7. Analysis

9.8. Threat Identification

Bibliography

10: Acceptance and Approval

Abstract

10.1. Policy, Strategy and Planning

10.2. Assessment of Design and Implementation

10.3. Assessment against 4Plus1 Principles

10.4. Evaluation of Risk Assessment of Design

10.5. Assessment of Implementation

10.6. Assessment of Safety Management System

Bibliography

Part IV: Operational Management and Maintenance

11: Operational Matters

Abstract

11.1. Business Model and Data Metamodel

11.2. Data-Centric Operational Organisation

11.3. Business Management

11.4. Organisational Metamodel

11.5. Self-consistent Organisation

11.6. Operational Modes

11.7. Emergency Preparedness

Bibliography

12: Live Management and Control

Abstract

12.1. Data Management Plans

12.2. Business Continuity

12.3. Safety-related System Continuity

12.4. Data Integration

12.5. Managing Data Change

12.6. Operational Safety Management

12.7. Authentication

12.8. Competency

12.9. Maintenance of Data as a (Virtual) Asset

12.10. Data Obsolescence and Destruction

Bibliography

Part V: Incident Investigation

13: Major Incident Response

Abstract

13.1. Incident Response

13.2. Effective Response and Recovery

13.3. Immediate Aftermath

Bibliography

14: Investigation Management

Abstract

14.1. Planning

14.2. Strategy

14.3. Execution

Bibliography

15: DCI Investigation Methodologies

Abstract

15.1. Derivation of an Incident Model

15.2. Classification

15.3. AcciMap

15.4. Systems-Theoretic Accident Model and Processes

15.5. Functional Resonance Analysis Method (FRAM)

15.6. Network Theory

15.7. Systems Dynamics

15.8. Applying DSM to Incident Investigation

Bibliography

16: Incident Investigation

Abstract

16.1. Investigation Planning

16.2. Validation of the System Context

16.3. Validation of the System Definition

16.4. Analysability

16.5. Access, Security and Authorities

16.6. Ongoing Data Safety Incidents

16.7. Data Safety Incident Investigation

Bibliography

17: Investigation Methodology Maturity

Abstract

17.1. Validation

17.2. Investigation Repeatability

17.3. Education and Training Requirements

Bibliography

18: Analysis as Part of a DCI

Abstract

18.1. Evidence Directed Analysis

18.2. Root Cause Analysis (RCA)

18.3. Incident Model Validation

18.4. Replicating the Incident

Bibliography

19: Incident Report

Abstract

19.1. Evidence Navigation

19.2. Incident Report

19.3. Escalation and Resolution

Bibliography

Part VI: Data Safety Model

20: Data Safety Model

Abstract

20.1. Model Elements

20.2. Transformation Model (T-axis)

20.3. Abstraction Model (A-axis)

20.4. Product, Installation and Maintenance (P-axis)

20.5. Interface Agreements (IA)

20.6. Critical Control Points

20.7. Metadata and Metamodels

20.8. Data-Centric Decisions

20.9. Identity and Identity Management

20.10. Implementing Permit to Work

20.11. Triplet Relationships

20.12. Time, Change and Maintenance

Bibliography

21: Using the DSM

Abstract

21.1. Initial TAP Identification

21.2. Analysis of P-TAP

21.3. Confidence in Risk Assessment over DSM

21.4. Impact of Change on the DSM (Brownfield Sites)

Bibliography

22: Validation

Abstract

22.1. AcciMap

22.2. STAMP

22.3. FRAM

22.4. Network Theory

22.5. System Dynamics

22.6. Weinberg's Categorisation of System Complexity

22.7. Resilience Engineering

22.8. Data Security

22.9. Explanation and Communication

Bibliography

Part VII: Application Areas

23: Autonomous Flight

Abstract

23.1. Introduction

23.2. System Description

23.3. Normal Operation

23.4. An Airspace Described in Data

23.5. Applying DSM

23.6. Metamodel

23.7. Metadata

23.8. Incident Investigation

23.9. Expressing the Supervisory Model in Metadata

Bibliography

24: Enterprise

Abstract

24.1. Introduction

24.2. System Description

24.3. Normal Operation

24.4. Multi-layer Error Management

24.5. Acquisition and Merger

24.6. Divestment

24.7. Permit-to-Work Failure

24.8. Safe-Method-of-Work Failure

24.9. Emergency Response

Bibliography

25: Healthcare

Abstract

25.1. Introduction

25.2. System Definition

25.3. Metamodel Integration

25.4. Vertical Integration

25.5. Horizontal Integration

25.6. ‘Product Line’ Integration

25.7. Cyber Physical System Threats

25.8. Healthcare Incident

25.9. Summary

Bibliography

Part VIII: References

Bibliography

Bibliography

Abbreviations

Definitions

Index

Postface

Bibliography

Copyright

Elsevier

Radarweg 29, PO Box 211, 1000 AE Amsterdam, Netherlands

The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom

50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

Copyright © 2020 Alastair Faulkner and Mark Nicholson. Published by Elsevier Ltd. All rights reserved.

data-centric-safety.com

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data

A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library

ISBN: 978-0-12-820790-1

For information on all Elsevier publications visit our website at https://www.elsevier.com/books-and-journals

Publisher: Susan Dennis

Acquisitions Editor: Anita Koch

Editorial Project Manager: Kelsey Connors

Production Project Manager: Poulouse Joseph

Designer: Victoria Pearson

Typeset by VTeX

Preface

Technology evolves, shaped by its use, often in unexpected ways. Products once constrained by ‘air gaps’ are enabled by communications-based infrastructural technologies and data ecosystems. Yet data is too broad a term as it does not address its many roles within systems. Historically, in most systems, data is merely consumed and processed, and some action is performed based on predetermined criteria. In this case data is passive and inert, and needs to be consumed to participate in or to direct actions or activities. However, data often exhibits many degrees of freedom that include the description of functionality, performance, capability, capacity and constraint. Data may also include temporal (sequence or order) or time-based (time, rate or calendar) properties. Data has a mercurial property. It is challenging to manage and control; it has a habit of being consumed by systems that it was not produced for, by omission or by design, perhaps without the awareness of the system designer. It is common for it to pass (often unchecked or even unwittingly) across system and organisational boundaries.

Data (in all its forms) is often unchallenged, unverified, ubiquitous, unrecorded and invisible. Yet this data increasingly determines the behaviour of systems and through this behaviour our access to products (goods and services). Data may be internal or fed to systems with a safety responsibility. As a result, data error or omission may go undetected with potentially hazardous or catastrophic consequences. There may also be consequent damage to assets. Failure of such systems may also contribute to harm indirectly through incorrect decisions made by actors (human or computer) who rely on, or trust, these systems and the data they supply. How should safety justifications reason about Data-Centric Systems (DCS) so that our reliance on, or trust in their correct operation can be justified?

In using the term DCS we acknowledge the ever-increasing volumes of data. Data may be structured or unstructured. However, not all data has value to us; not all data is fit to be used in a system with safety implications or as part of the assurance of such systems. So how should we determine what data can be used, and what it can be used for? How do we assure ourselves that the data used is appropriate and has the right characteristics? How do we engineer our systems to ensure they are robust and resilient to data errors and failures?

As these DCSs grow, they experience a change of scale, consuming (and potentially producing) vast quantities of data. As a result automated methods are required to ensure and assure the contribution of data to system safety in such systems. Furthermore, how do we ensure that actors using the data generated by such systems do so in the intended way and with the appropriate level of criticality?

Currently, no mature methods exist to address these issues, and guidance in this area is very immature. Careful development of data-intensive systems will improve an organisation's ability to ensure and assure the safety of systems. We address these issues in this book.

System Safety Engineering (SSE) is applicable across the entire life cycle of a product, from concept to disposal. In this book we address data safety issues relating to both physical goods and service elements of a product within an SSE framework. However, the emerging field of data safety means that in this first edition there are aspects of data safety that we do not address.

WARNING: This book contains Scary Monsters

Where the authors have paused for thought, tea and/or discussion …

A scary monster is used to identify open research questions, open certification issues, a requirement for an in-depth discussion to take an issue further than it is explored in this text and Key Safe Behaviour (KSB) deficiency in current SSE / Safety Management System (SMS) practice.

Readership

System Safety Engineering (SSE) has its origins in very high-impact activities, such as controlling nuclear power plants or aircraft. As the use of Computer-Based Technologies (CBTs) to control systems and provide services has spread, so has the range of systems that need to consider system safety. Communications-enabled CBTs combine to create infrastructural technologies. Infrastructural technologies are one foundation of the data ecosystem; they deliver the performance and low latency required to handle extensive, complex data (1.2.1). Infrastructural technologies make it possible to run applications on systems with thousands of nodes, involving vast quantities of data. The emergence of data as a determinant of system behaviour has followed the spread of these technologies. This book is therefore relevant for practitioners in the classical system safety industries, but also for an ever-increasing set of providers of products (goods and services) outside these industries. It is also an area that has not been adequately addressed by the academic community.

As a result, the audience for this book includes, but is not limited to, the following communities:

Academia

•  Post-graduate students undertaking or wishing to undertake research into Safety Management of data-centric and data-intensive organisations and systems.

•  Safety Engineers / Professionals studying the development, operation or oversight of data-centric or data-intensive systems and organisations as part of Continuing Professional Development (CPD).

Industrial practitioners

•  Incident Investigators who need to address data issues in their causal analysis and improvement recommendations.

•  Safety Engineers / Professionals working in the development, operation or oversight of data-centric or data-intensive systems and organisations.

•  Software Engineers who have to develop software that interfaces with and uses data to determine the set of services provided and the results of the services provided by data-centric and data-intensive systems.

•  Data scientists who have a role in safety-related information systems or data-intensive control systems development or operation.

•  Enterprise architects who wish to determine how data-centric and data-intensive their architecture should be. There are implications for enterprise architectures derived from the move towards data-centric and data-intensive activities.

•  Enterprise Architects whose data has the potential to be used for functions beyond their original intention. Boundaries for the data use may be defined or additional efforts made to improve integrity, etc., where such limits cannot be imposed.

Corporate and management practitioners

•  Operational Managers who have System Safety Management responsibility for data-centric and data-intensive systems and organisations.

•  Information managers of data and metadata interested in the link between data and the trust and reliance that can, or should, be placed on the information extracted from it.

•  Enterprise Architects whose data, metadata and metamodels have the potential to describe and shape an organisation or enterprise to be safe by design and to ensure that boundaries are enforced.

•  Project Managers who wish to set the competencies of staff engaged in data safety activities, ideally before the start of such activities. Furthermore, they will be interested in identifying, understanding and controlling the risks that exist with data, and the risks associated with data safety and the current status of the identification and control of those risks.

•  Data and Commercial Managers who are interested in the impact of the safety ensurance and assurance work on cost, value, timeliness, logistics and disposal / retention issues relating to safety.

•  Data Asset Managers who are interested in the through-life management of data.

•  Lawyers addressing liability issues implied by the use of data (and faulty states induced by data and data errors) in data-centric and data-intensive systems. Liability accrues proportionally to the contribution of the activity / element to safety risk. Data crosses boundaries, which makes intellectual property, copyright and theft issues relevant.

•  Training and education departments within data-centric and data-intensive organisations who have to provide, or commission, training towards competence.

•  Corporate Managers interested in acquisition or divestment, especially the legal and corporate issues associated with the management of data, metadata and particularly metamodels, as these contain Intellectual Property (IP).

Societal Guardians

•  Regulators and approval bodies. What should they be asking for, and how will they know that applicants have addressed data safety appropriately in their safety and compliance cases?

•  Policymakers who are interested in updating existing regulations to incorporate data contributions to safety. Typically, this will have international and national contributions.

Directed Reading

The discipline of data safety is immature and needs to be improved as a matter of urgency. In this book, we attempt to raise awareness of data (1.2.1), metadata (7.8.3), metamodels (7.8.2) and the social influences and impacts of data, and Interface Agreements (IA) (8.4.3) (or their absence and enforcement) in Data-Centric Systems (DCS) (1.5.9). This book provides a structure within which proposed solutions can be analysed. Experience is not available at the time of writing as to the effectiveness of these approaches other than on individual system exemplars. Where experience is available, it is highlighted.

This book could not, and should not, be read in isolation. We have deliberately built on existing material (and the concepts that they contain); therefore, there are many references to external sources. While recognising that the application domain is continuously changing, the core safety concepts, techniques and measures are incorporated into mature Safety Management Systems (SMS) (2.1.1) (typically arranged as sequences of processes), which are to be adapted to support DCS in their operational contexts. Situations where current SMS practices may no longer be applicable or sufficient are subject to increasing research activity. For example, systems may employ Machine Learning and as a result are highly dynamic in the evolution of their safety characteristics. Systems use data as a critical enabler. The primary challenge of the developer is to know the methodology of the learning and the associated integrity / criticality, which could be assured by such methods.

The Reader is reminded that established practices and SMS apply equally to all components of the system (hardware, software, people, process and data (including metadata and metamodels)). We note that many established standards offer little guidance that explicitly addresses data. Data's absence from standards (and guidance notes) does not provide the basis for credible claims that data is outside the confines of safety management, and therefore few safety resources, if any, are required to be allocated to data. This text focuses on the data (including metadata and metamodels) as the emerging and soon-to-be-dominant system safety component.

System Safety Practitioners

Section 2.9 (Safety Management Paradigms) expresses the evolution of system safety management and the challenges that lie ahead. These challenges are explored through headline issues.

•  Boundaries: Large datasets obscure boundaries (2.1.12), and without clear boundaries hazard (2.1.10) management is problematic. Section 8.4 (Interface Agreements) (IA) provides one means of managing and controlling real and virtual boundaries.

•  Identity: Increases in the number of elements give rise to identity (1.0.4) and identity management requirements. Section 3.4 (Identity Model) provides one means of expressing issues associated with identity.

•  Safe Method of Work: Existing SMS requires high integrity implementation of Permit-to-Work (24.7.1). The increased span of control requires that these practices be reinforced in the DCS. Section 24.7 (Safe Method of Work) addresses these issues.

•  Data Safety Model (DSM): The interconnected nature of DCS requires a way to express the data element of a product (1.5.4), the operational process and organisational hierarchy. These issues are expressed in Section 20 (Data Safety Model).

•  Using the DSM: In a complex context using the DSM becomes challenging. Section 21 (Using the DSM) provides initial guidance on its application.

•  Data, Metadata and Metamodels: It is becoming clearer that managing data through content (1.3.3) (Data Quality) is no longer enough. A dependency on data and its ever-growing volume inevitably draw comparisons with machine code and the use of abstraction in software engineering. Data should be abstracted into metadata and metadata abstracted into metamodels.

•  Autonomy and Automation: A growing reliance on data requires transparency, visibility of the influence of data and the errors that data may contain.

•  Incident Investigation: Finally, data-centric systems will fail; this failure will lead to harm (1.0.3). Part V (Incident Investigation) provides one way to investigate data incidents.

System Safety Acceptance and Approvals

Independent review is one of the cornerstones of System Safety practice. The Safety Assessor will be a System Safety Practitioner; therefore, the guidance in the preceding paragraph applies. In addition, the Assessors need to be satisfied that the element is suitably and sufficiently described in its context and that its features, functions, dependencies and failure modes are understood well enough to manage safety risk.

The use of Autonomy and Automation presents particular difficulties as to the nature and form of the safety case. Safety I (S.2.9.1), in which the set of hazards (2.1.10) is sufficiently well known, represents the current footprint for Independent assessment and review. Products are complete; their failure mechanisms are known. Required mitigations and barriers to escalation (2.2.1) are also known.

Autonomy, the use of Artificial Intelligence (AI) and Machine Learning (ML) will result in unfinished elements. At the point they go operational, they learn and adapt their behaviours, and in doing so give rise to new hazards and new combinations of hazards. The Safety Assessor will be expected to express a professional opinion as to the safety risks involved in such systems (see Section 10 (Acceptance and Approval)).

Incident Management and Investigation

The Safety Investigator will be a System Safety Practitioner; therefore, the guidance in the preceding two paragraphs applies. Evidence, in the form of witness marks on physical components and eye witness statements, has been pivotal in determining the root causes of many fatal accidents (13.0.2). A reliance on data may mean a reduction in the availability of physical evidence to the extent that the absence of physical evidence is an important feature. An incident (13.0.1) involving an Autonomous Vehicle (AV) may not include skid marks, which would indicate a failure to brake.

As reliance on data increases, the probability of systematic data failure increases. Therefore, rather than single incidents at single locations and points in time, multiple incidents may manifest at many times and locations. Investigating the underlying data causes from a set of complex situations is challenging. Section 15 describes a range of incident investigation methodologies. The investigation methodology should be documented to ensure repeatability and audit. This methodology may be a combination of existing approaches, a hybrid or something new.
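As a rough illustration of why systematic data failures call for looking across incidents rather than at each one in isolation, the sketch below groups incident records by the dataset version that was in use and flags versions linked to incidents at several distinct locations. It is not one of the methodologies described in Section 15; the `Incident` record, its fields, the `min_locations` threshold and the sample values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Incident(NamedTuple):
    """Hypothetical incident record (fields are assumptions, not from the book)."""
    incident_id: str
    location: str
    dataset_version: str  # version of the data the system was using at the time


def group_by_dataset(incidents):
    """Map each dataset version to the incidents that occurred while it was in use."""
    groups = defaultdict(list)
    for inc in incidents:
        groups[inc.dataset_version].append(inc)
    return dict(groups)


def candidate_systematic_failures(incidents, min_locations=2):
    """Return dataset versions linked to incidents at several distinct locations.

    A shared dataset version is only a lead for the investigator, not a cause.
    """
    result = {}
    for version, group in group_by_dataset(incidents).items():
        locations = {inc.location for inc in group}
        if len(locations) >= min_locations:
            result[version] = sorted(inc.incident_id for inc in group)
    return result


# Invented sample data: two incidents share a dataset version across locations.
incidents = [
    Incident("I-1", "depot_north", "map-v7"),
    Incident("I-2", "depot_south", "map-v7"),
    Incident("I-3", "depot_north", "map-v6"),
]
print(candidate_systematic_failures(incidents))
```

Grouping on a single attribute is deliberately naive; a real investigation would need to consider many shared data dependencies, and the absence of a shared version does not rule out a data cause.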

Corporate and Management Practitioners

Autonomy and the use of AI and ML require the application of system safety to evolve, recognising that treating only product-based hazards may not be enough. This places additional responsibilities on operational and corporate management and requires operational managers to become Duty Holders (2.0.4).

•  Safety I (S.2.9.1): This is a conventional view, represented in many system safety standards, where all hazards are known, managed, mitigated or removed such that the residual risk is at least tolerable (2.0.5). Products and systems (1.5.5) are ‘finished’ and are supported by operational processes faithfully executed by competent, trained and experienced users. Section 2.9 (Safety Management Paradigms) expresses the evolution of SSM and the challenges that lie ahead. Highly configurable data systems present significant management challenges.

•  Safety II (S.2.9.2): Hollnagel [302,299] recognises that safety systems are not perfect and that users play an important role in the resilience of the safety system. One extension of resilience is the implementation of products that are unfinished at the point they are set to work. These issues recognise a shift in emphasis towards adaptive requirements placed on operations (Section 11 (Operational Matters)) and maintenance (Section 12 (Live Management and Control)). Who will be liable for incidents involving these unfinished products?

•  Safety II+ (S.2.9.3): Reduced oversight and an increased span of control require tasks to be automated. To what degree should these tasks be automated, and how is this automation to be supported by autonomous systems? What contribution can data assurance make to the assurance of AS?

•  Safety III (S.2.9.4): This is an area for academic research. The use of Safety III implies that autonomous behaviours also have input to SMS. Current implementations of autonomy are changing safety practice. In which other SMS elements (philosophy, policy, procedure, practice) or responses should we permit autonomy to change? (see Figure 2.1)

Academia

The scope for further academic work is extensive. Solution constraints formerly imposed by hardware, software and limited communications infrastructures are significantly diminished. As a result, highly connected and adaptive systems are emerging, as embodied in technologies such as the Internet of Things (IoT). Several fundamental building blocks are incomplete and require academic research.

•  Scary Monsters: This text contains many ‘Scary Monsters’. They represent the unasked and unanswered questions; where possible, we try to isolate them to formulate problem descriptions for academic consideration.

•  Teaching and Training: DCS offer an unprecedented opportunity to refresh and revise curriculum. SSE has to evolve to encompass DCS. This text is a reference work collating and collecting many sources.

New to System Safety?

We hope that you find our writing style readable. While we do include introductory material, beginning with Section 1 and reading to the end will present you with a substantial learning curve. Before you apply any of the concepts contained in this text, we recommend you consult a System Safety Practitioner familiar with DCTs and its application domain.

Bibliography

[299] Erik Hollnagel, Safety-I and Safety-II. Routledge; 2014. ISBN 978-1472423085.

[302] Erik Hollnagel, Jean Paries, John Wreathall, Resilience Engineering in Practice: A Guidebook. Ashgate Studies in Resilience Engineering. CRC Press; 2013. ISBN 978-1472423085.

It's Monday Morning …

You have read the book (hopefully you found it interesting), and arrived at work. You're in a data-centric organisation (DCO) (1.5.3) with many data-centric systems (DCS) (1.5.9). You've got a data-centric problem … where do you start?

This problem is enormous … big enough for you to reopen this book …

There is no easy answer; much depends on the industry sector (regulated or unregulated), the nature of the safety problem (in its operational context (9.1.1) and its position within the Data Safety Model (DSM) (20.0.1) and one or more TAP points). It would be unreasonable of us to be prescriptive …

What we can do is outline a process, a place to start, and to issue a stern warning: you must adapt this process to your data-centric problem; we cannot do this for you.

Develop a Remit

It is important that you establish what it is that you want from this investigation. Data Safety (DS) assurance and associated investigations have a propensity to consume resources, not because data (1.2.1) is more complex than other system components, but simply because of its potentially extensive technical footprint. It is all too easy for data, metadata (7.8.3) or an element of the metamodel (7.8.2) to be shared by multiple DCSs and DCOs. Some of these uses will be explicit and some implicit; hopefully, only rarely will they be ‘unintended', accessible through sneak circuits (15.8.2).

It is essential to set a boundary (2.1.12) on your remit and the ‘area of interest'. The identification of context (6.0.2) is of concern as the data may not be valid outside the context and uses it was created for. It is common for a system (1.5.5) to be within a hierarchy. We can no longer assume that the user will be human. To reflect the increased use of automation, the term ‘user' is replaced by actor (1.0.1).

Existing Safety Management

All operational domains contain risk (2.0.2). Regulated domains include at least one Duty Holder (2.0.4) and Designer (2.0.6) identifying their roles and responsibilities. Systems in these regulated domains are associated with one or more Safety Cases (2.7.4), addressing their use by competent and trained actors. Therefore, your context may contain some or all of the following existing safety documents:

1.  Safety Management System (SMS) (2.1.1)

2.  Safety Management Manual (SMM) (2.1.2)

3.  Safety Management Plan (SMP) (2.1.3)

4.  one or more existing Safety Cases

Figure 2.1 illustrates the relationships between these documents. Your ‘area of interest' may be associated with a Hazard Log (2.1.11) to track all hazards (2.1.10), hazard analysis, risk assessment and risk reduction activities for the ‘whole-of-life' of the safety-related system (SRS) (2.7.7) for any conditions that can potentially lead to harm (1.0.3), including identification of those at risk.

Enabling Works

The remit is extended and elaborated to identify the infrastructural technologies (4.0.2). These are the underlying, often ignored, communications systems that form the foundation of DCS. This examination is to confirm that the topology and configuration contain no errors (2.4.1) that might permit sneak circuits and hence unintended (rogue) data paths (8.8.1). Use Network Theory (S.15.6) to construct the initial network representation of your ‘area of interest’.
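As a minimal sketch of what such an initial network representation might look like, the following Python models the ‘area of interest’ as a directed network and flags connections that exist in the configured topology but cannot be explained by the documented data paths. The system names, the `configured_links` and `documented_links` sets and all function names are hypothetical illustrations; they are not taken from S.15.6.

```python
from collections import defaultdict


def build_network(links):
    """Build an adjacency map from (source, destination) link pairs."""
    network = defaultdict(set)
    for src, dst in links:
        network[src].add(dst)
    return network


def reachable(network, start):
    """Return every node reachable from start by following directed links."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in network.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def candidate_sneak_paths(configured_links, documented_links):
    """Pairs (a, b) where b can be reached from a as configured,
    but not by following documented links alone."""
    configured = build_network(configured_links)
    documented = build_network(documented_links)
    nodes = {n for link in configured_links for n in link}
    findings = set()
    for node in nodes:
        extra = reachable(configured, node) - reachable(documented, node)
        findings.update((node, target) for target in extra)
    return findings


# Hypothetical example: one configured link is missing from the documentation.
documented_links = {("sensor", "gateway"), ("gateway", "historian")}
configured_links = documented_links | {("historian", "controller")}

sneak_paths = candidate_sneak_paths(configured_links, documented_links)
for source, target in sorted(sneak_paths):
    print(f"undocumented path: {source} -> {target}")
```

Each finding is only a candidate for review: an undocumented path may be a genuine sneak circuit or simply a gap in the documentation.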

Context

Stepwise decomposition of the ‘area of interest' is used to refine and create one or more hierarchies based on the A-axes (of the DSM). Each of these hierarchies will contain one or more systems and actors that use them. Choose the hierarchies carefully as further decomposition simply reinforces the choices you have made, and therefore the cost of any rework. It is good practice to create several (say three) first-level decompositions so that you can evaluate them and choose the ‘best fit' for further decomposition. Develop a context and boundary for each of the systems identified. Create the initial System Definition (7.0.1) for the ‘area of interest'.

Enterprises and Organisations

For your chosen hierarchy, identify the enterprises (1.5.1) and organisations (1.5.2) (their respective boundaries within the context). This provides demarcation between the Duty Holders and Designers, and between their respective roles and responsibilities and any Safety Cases.

This is the process at the systems level. Now identify and locate any potential or actual ‘incident harm' within the context. These may already be described in the list of top-level hazards for the ‘area of interest' as part of the SRM. Refine the initial System Definition.

Constituent Systems

For the chosen hierarchy, use stepwise refinement to decompose the hierarchy into its constituent systems. Create a System Definition for each of the constituent systems. Identify the systems directly associated with ‘incident harm’, the top-level hazards and the hazard records. The goal is to provide a basis for the identification of interfaces (8.4.1) that will be used in the next step.

Interfaces

For each interface, identify the ‘Owner’ [System] and the connected systems. Identify, describe and document the Interface Agreements (IA) (8.4.3). Examination of the interfaces provides a check on the system and its description. Therefore, if necessary, refine the top-level System Definition, its network representation (see ‘Enabling Works’ above) and the System Definitions for each of the constituent systems.
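The cross-check between interfaces and System Definitions can be sketched in a few lines of Python. This is an illustrative assumption about how the records might be held: the `Interface` fields, the `check_interfaces` function and the example system names are hypothetical and do not reproduce the IA template used elsewhere in this text.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Interface:
    name: str
    owner: str                          # the 'Owner' [System]
    connected: FrozenSet[str]           # systems connected through the interface
    agreement_id: Optional[str] = None  # identifier of the documented IA, if any


def check_interfaces(interfaces, defined_systems):
    """List interfaces with no IA, and systems with no System Definition."""
    findings = []
    for itf in interfaces:
        if itf.agreement_id is None:
            findings.append(f"{itf.name}: no Interface Agreement documented")
        unknown = ({itf.owner} | itf.connected) - defined_systems
        for system in sorted(unknown):
            findings.append(f"{itf.name}: system '{system}' has no System Definition")
    return findings


# Hypothetical example data.
defined_systems = {"flight_planner", "flight_controller"}
interfaces = [
    Interface("plan_upload", "flight_planner",
              frozenset({"flight_controller"}), "IA-001"),
    Interface("telemetry", "flight_controller",
              frozenset({"ground_station"})),
]

findings = check_interfaces(interfaces, defined_systems)
for finding in findings:
    print(finding)
```

A non-empty list of findings is a prompt to refine the System Definitions or to document the missing agreement.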

Actors, Identities and Authentication

Consider how you might gain access to a computer system. Typically, you would log on at a keyboard with a ‘username’ and ‘password’. In this example, the ‘username’ is your identity (1.0.4) and the ‘password’ provides a means of authentication (3.3.4). In DCSs and DCOs, identity is not limited to people: each system, subsystem, product (1.5.4), interface and IA will also have an identity.

For each interface, establish the actors, their identities, their authentication and the authorities (3.3.3) used with that interface. From these lists construct the following:

1.  initial Identity Model (3.4.2);

2.  initial Security Model (3.3.2).

It cannot be assumed that the identity model and security model will be homogeneous, that is, uniform and applied across the whole ‘area of interest'. A ‘triplet' [35] access strategy can be used to access Information Systems (3.5.1), including legacy systems, with a minimum of intervention and change to those legacy systems. Therefore, part of this process is to identify and document these ‘triplet' systems. The initial security model should address the following processes:

1.  Ensuring that all connected systems are supported by, and protected by, a suitable security model (the capability of the security model is to be supported by a suitable and sufficient risk and threat assessment);

2.  Determining access requirements;

3.  Identifying the types of searches (to develop an index [for ‘triplet' hops across intermediate systems to destination retrieval system(s)]);

4.  Identifying the types of access (read only; read and update; read, write, create and delete);

5.  Specifying the unique identity (R.3.4.1) of the ‘triplet' access agent (for security and logging, and to support subsequent audit requirements).
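The access types and the logging requirement in the list above can be illustrated with a small Python sketch. It is a simplified assumption, not the Security Model defined in 3.3.2: the `Access` levels mirror the three access types listed, while the `SecurityModel` class, the operation names, the agent identity and the interface name are hypothetical.

```python
from datetime import datetime, timezone
from enum import Enum


class Access(Enum):
    """The three access types listed above, as sets of permitted operations."""
    READ_ONLY = frozenset({"read"})
    READ_UPDATE = frozenset({"read", "update"})
    FULL = frozenset({"read", "write", "create", "delete"})


class SecurityModel:
    def __init__(self):
        self._grants = {}    # (identity, interface) -> Access
        self.audit_log = []  # one entry per access decision, kept for audit

    def grant(self, identity, interface, access):
        self._grants[(identity, interface)] = access

    def request(self, identity, interface, operation):
        """Decide an access request and record the decision."""
        access = self._grants.get((identity, interface))
        allowed = access is not None and operation in access.value
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "identity": identity,
            "interface": interface,
            "operation": operation,
            "allowed": allowed,
        })
        return allowed


# Hypothetical 'triplet' access agent with read-only access to a legacy system.
model = SecurityModel()
model.grant("triplet-agent-01", "legacy_records", Access.READ_ONLY)

can_read = model.request("triplet-agent-01", "legacy_records", "read")
can_delete = model.request("triplet-agent-01", "legacy_records", "delete")
```

Refused requests are logged as well as permitted ones, since both may be relevant to a later audit or investigation.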

Data, metadata and elements of the metamodel

Each interface is examined to determine what data, metadata and elements of the metamodel flow within the context and its hierarchy. This may require ‘recursion’, that is, stepping along the interfaces until the source is determined (the ‘stopping condition’). It also may involve many subsidiary data paths as different ‘threads’ are combined. In this way the documented description of data, metadata and elements of the metamodel is created.

The use of infrastructural technologies, which are often associated with data ecosystems (4.0.1), enables the creation of architectures that employ highly adaptive applications. If these data ecosystems are beyond the boundary of the ‘area of interest’ then the ‘stopping condition’ is the IA at that boundary.
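The recursion and its two stopping conditions can be sketched as follows. This is a minimal illustration under stated assumptions: the `upstream` mapping (from each system to its supplying interfaces), the `boundary_ias` set, the cycle guard and the example names are all hypothetical.

```python
def trace_to_sources(system, upstream, boundary_ias, path=()):
    """Step upstream along interfaces from `system`.

    Returns one tuple per 'thread'. Each thread ends at a source (a system
    with no suppliers) or at the IA on the boundary of the area of interest.
    """
    path = path + (system,)
    suppliers = upstream.get(system, [])
    if not suppliers:
        return [path]                     # stopping condition: the source
    threads = []
    for ia, supplier in suppliers:
        if ia in boundary_ias:
            threads.append(path + (ia,))  # stopping condition: boundary IA
        elif supplier in path:
            # Guard against loops in the network (an added assumption).
            threads.append(path + (supplier, "<cycle>"))
        else:
            threads.extend(
                trace_to_sources(supplier, upstream, boundary_ias, path))
    return threads


# Hypothetical example: 'display' is fed by 'fusion', which combines a local
# radar source with a feed from a data ecosystem outside the boundary.
upstream = {
    "display": [("IA-3", "fusion")],
    "fusion": [("IA-1", "radar"), ("IA-2", "cloud_feed")],
}
boundary_ias = {"IA-2"}

threads = trace_to_sources("display", upstream, boundary_ias)
for thread in threads:
    print(" <- ".join(thread))
```

The thread that ends at `IA-2` records that the trace stopped at the boundary rather than at the true origin of the data.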

It is now possible to use Root Cause Analysis (RCA) (S.18.2) to trace the information used in the causal chains (2.1.16). The analysis should consider errors, faults (2.4.3) and failures (2.4.5) and security issues such as authentication failure, all of which can prevent access. The analysis should also consider employing a form of ‘Reverse Engineering' and Sneak Circuit (15.8.2) analysis.

This process step has done the following:

1.  Established what systems are involved in the ‘area of interest' by identifying

(a)  data, metadata and elements of the metamodel

(b)  candidate ‘triplets' relationships (S.20.11)

(c)  IA

(d)  the identities used to access data, metadata and elements of the metamodel

2.  Outlined the steps required

(a)  to select the minimum set of relevant data via navigation of an appropriate set of ‘triplets'

(b)  to identify relevant TAP point(s) on DSM

i.  characterising the data requirements over each TAP point

ii.  selecting an appropriate ‘triplet' interface point to an adjoining TAP point

iii.  navigating outward through ‘triplet' set(s) until reaching stopping criteria

iv.  repeating for each relevant ‘triplet' set interfacing directly with TAP point

v.  collecting back to central location or running of applications remotely on data

What's Next

This is a starting point. With the results of this process you are in a position to analyse the effects of proposed changes, to look at the issues associated with corporate acquisition and divestment and to have a firm basis from which to participate in the discussion about the impact of automation. One of the possible uses for this process is incident investigation.

Bibliography

[35] Gerard Askew, Triangulation: Navigation of Information Contexts Using Triplet Relationships. [UNPUBLISHED] 2016.

Acknowledgements

Alastair Faulkner

This book could not have been completed without the support and patience of my wife Cheryl, and my children Eamon and Grace.

I would like to thank my colleague Ron Pierce for his patience and understanding. Ron is my industrial mentor, initially from my doctorate, and has extensive experience of systems, software and safety issues. I would also like to thank Andy Harrison who has witnessed the journey, recognised its importance and offered help and assistance.

Mark Nicholson

Writing books of this scope is a long haul, rather than a sprint. Thank you to Rachel for her input and her entreaties to get on with it. I would like to thank my colleagues, those who have talked to me at meetings, conferences and the odd bar, for their patience, helpful discussions and polite pointers as to the errors in my approach. Robust but helpful scepticism is the lifeblood of these endeavours.

This book marks the start of a journey as the horizon for this work expands to include the industrialisation of Autonomous Systems, and the assurance thereof. I would therefore like to thank my colleagues in the Assurance of Autonomy Programme a priori for their patience, discussions and robust scepticism as the journey to the second edition of this book unfolds.

Why write this book

Safety management must evolve to address the challenges posed by a reliance on data, enabled by infrastructural technologies, data ecosystems and autonomy. An awareness of this growing gap gives rise to a chronic unease where data-centric autonomous agents are used in safety systems.

To our proofreaders

Developing in the abstract is one thing; writing it down concisely and unambiguously so that the text communicates the intent is another. We would like to thank our proofreaders:

List of Figures

1.1  Bow-tie Diagram

1.2  Liew (2013) DIKIW Elements and Linking Statements

1.3  Surface and Deep Learning in the DIKIW

1.4  DIKIW and Human Centred Competence

1.5  Broad Comparison of DIKIW and Semiotic Model

1.6  An Intelligent (Expert) System based on Symbolic AI

1.7  A Cyber-Physical System based on Smart AI

2.1  Safety Management System – Basics

2.2  Hazard – Incident Sequence

2.3  Hazards in a Systems Hierarchy

2.4  Data Error Traversing a Data Path

2.5  Extending Villemeur: Primary-Secondary-Command-Decision Failures

2.6  Decomposition of Safety Requirements

2.7  Safety Triumvirate

2.8  Fragment of Decision Making Pattern Based on a Information System

2.9  Dynamic Safety Assurance Process

2.10  TRL versus IRL versus SRL

2.11  Safety Related Information System

3.1  A Communications Mesh – Physical and Logical Address

3.2  Example Interface Agreement Implementation

3.3  Safety Related Information System in Safety Decision Context

3.4  Safety Management via Operational Hazard Logs

3.5  Accidental System

4.1  Data Sources and the IoT

4.2  Deep Learning Modelling Life cycle

4.3  DS Integrity Resolution within Safety Case Regime

5.1  Veracity Challenges

7.1  Service Provision Actors

7.2  Service Specification

7.3  A Model Constructed and Interpreted

8.1  DSM and the TAP Axis

8.2  Implementation Model for Interface Agreements

8.3  Example Interface Agreement Implementation

8.4  Interface Agreement Service Provision

8.5  An Array of Processes with a Hierarchy

8.6  A Simple Linear Metamodel Architecture

8.7  Maritime Data Path

9.1  A Set of Operational Contexts for Nested Systems

9.2  Development Milestone Terms

9.3  Architectures in the Development Context

9.4  Project Management System – Basics

9.5  Data Path Layer Model

9.6  Identify the Data Origins

9.7  Identify the Boundaries

9.8  Identify the Transformations and Processing of the Datasets

9.9  Apportion the Integrity Requirements

9.10  Identify Evidence Requirements

9.11  Specify Corrective Action Process

9.12  Completed Data Path

11.1  DCO Operating Canvas

12.1  System or Organisation's Resilience

12.2  Motor as a Line Replaceable Unit

12.3  Data (Asset) Management System – Basics

12.4  Data (Asset) at TAP (t, a, p)

12.5  Use of ETL to create virtual schemas

12.6  Near Real-time Monitoring of Operational Safety Management

14.1  Evidence Management in Context

15.1  A Small Network with Both Multi-edges and Self-edges

15.2  System and Subsystem as a Directed Network

15.3  SoI Viewpoints

16.1  Incident Investigation Management in Context

16.2  Incident Footprint in Time, Space, Complexity and Severity

18.1  Incident Analysis in Context

19.1  Incident Report in Context

20.1  TAP axes of the DSM

20.2  Computer-Based Technology and / or Human Systems

20.3  A Layered Model for a Hierarchy of Systems

20.4  System Interfaces

20.5  System Boundary Issues

21.1  Generation of an Initial List of TAP Points

21.2  Illustration of Relationships between PoI and TAP points

21.3  Illustration of Generation of TAP Critical Control Points

21.4  Safety Argument over TAPs and 4plus1 Principles

23.1  qCopter (Physical) Context

23.2  Entity-relationships for the qCopter System

23.3  Example of qSpace Airspace Segments

23.4  Fragment of an Entity-relationship Diagram of a Flight Plan

23.5  Example Airspace – Physical Data

23.6  Initial P-TAP DSM Representation of qPilot Flight Data

23.7  V-TAPs Identified for P-TAP P-Airspace

23.8  D-TAPs Identified for P-TAP P-Airspace V-TAPs

23.9  qPilot with an Initial Set of CCPs

23.10  Initial DSM Representation of qPilot

23.11  qCopter Incident

23.12  Initial Network Representation of the qCopter Incident System

24.1  Well-formed Enterprise

24.2  Logical and Physical Production

24.3  A Manufacturing Cell

24.4  A DSM Hierarchy of Cells

24.5  A DSM Hierarchy of Cells

24.6  Vertical and Horizontal Integration

24.7  Ring-fencing an Acquired Organisation

24.8  Transformation using Critical Control Points

24.9  Vertical and Horizontal Divestment

24.10  Divestment Threat Assessment Process

24.11  Implementation of the Partition Barrier

24.12  A timeline for the use of Permit to Work

24.13  Interface Agreement in Normal Operation

24.14  Permit to Work Interface Agreement

25.1  A Broad Categorisation of Healthcare Provision

25.2  High-level Context for UK Healthcare Provision

25.3  A Simplified Supplier, Secondary and Tertiary Delivery, and Primary Provision Model

25.4  Mass Casualty Event

List of Tables

1  Key to use of italics, referencing and hyperlinks

1.1  Argument Terminology for Logic

1.2  13 Types of Knowledge Based on Source

1.3  State of Knowledge

1.4  Human Centred Competence

1.5  Types of Actor Exposed to DCS

2.1  DMMi Risk Management Support Function

2.2  Data-centric Organisation Data Safety Risk Management

2.3  Strategies for Controlling Safety Risk

2.4  Situation Awareness as Product and Process

3.1  Minimum Set of Elements an Identity Model

3.2  Interface Agreement Acronyms

3.3  Four Types of SoS

5.1  Ethics of Uncertainty

7.1  Characteristics of Good Requirements

7.2  Additional Characteristics of Good Requirements

7.3  Desirable Properties of DCS Performance

7.4  Metamodel Category Descriptions

7.5  Metadata Category Descriptions

7.6  Typical SRACs Address

8.1  Selection of Common Definitions of Architectures

8.2  Characteristics of Storage Types

8.3  IA Template

8.4  IA – Sample Implementations

8.5  An Incomplete Selection of ISO Standards Relevant to Metadata and Metamodels

9.1  Data Path Symbols

9.2  Data Path Layers

10.1  Description of Assessment Life Cycle Model Phases

10.2  4Plus1 Data Safety Principles

10.3  Risk Assessment Maturity Model Categories

11.1  Components of the Business Model Canvas

11.2  Business Policy Features

11.3  Value Map

11.4  Customer Profile

11.5  Operating Model Canvas Components

11.6  Operational Modes

11.7  Non-operational Modes

11.8  Emergency Preparedness

12.1  Change Management in Adaptive Systems

12.2  Start-up Mode After Data Modification

12.3  Areas for Discussion for Adaptive SMS

13.1  An Overview of Incident Types

13.2  Principles of Effective Response and Recovery

15.1  Partial Classification of Incident Models

15.5  Healthcare Epidemiological Models

15.6  Additional Data-Centric Systems Dynamics Terms

15.7  Annotation of the Application of the DSM

18.1  Root Cause Analysis – Sample Questions

20.1  DSM Metadata Category Descriptions

20.2  DSM Metamodel Category Descriptions

21.1  Model Categories

21.2  Generalised Process Frame for the DSM

21.4  Key to Identities Used in TAP Identification Outline

22.1  Partial Classification of Model Viewpoints

23.1  qCopter System Entities Descriptions

23.2  qCopter System Organisations

23.3  Summary of Autonomous Flight Roles

23.4  Typical Air Traffic Control – Top Level Hazards

23.5  An Initial an Abstract Hierarchy (A-axis)

23.6  qPilot Interfaces

23.7  Initial qPilot Operational Context

23.8  qPilot: Steps a ‘day-in-the-life’ for a qCopter Flight Use Case

23.9  qPilot: Data Instantiations within P-axis Entities for Element of Flight

23.10  qPilot: Flight Steps 8 to 11 and Associated Data Instantiations

23.11  qPilot P-TAP Entities Required to Execute Autonomous Flight

23.12  V-TAPs identified for P-TAP P-Airspace

23.13  D-TAPs Identified for P-TAP P-Airspace

23.14  Airspace Enterprise Metamodel

23.15  Extract of the Airspace Segment Metamodel

23.16  Extract from the Flight Controller Metamodel

23.17  Extract from the Airspace Metadata

23.18  Extract from the Airspace Metadata

23.19  Extract from qCopter #QC007 and #QC901 Flight Data

23.20  Application of the DSM to Autonomous Flight Incident

23.21  qPilot Interface Agreements

23.22  Actor and Identity

24.1  Typical People-related Safety Hazards

24.2  Typical Top Level Hazards for Autonomous Asset Movements

24.3  Shop Floor Value Proposition

24.4  Organisational Metamodel Category Descriptions

24.5  Examples of Metamodel Errors

24.6  Examples of Metamodel Errors Exposed by ‘Change’ and Scale

24.7  Examples of Errors in Acquired and Merged Systems

24.8  Divestment Areas for Consideration

24.9  Assessment of Divestment Threats

24.10  Enterprise – Emergency Preparedness

25.1  Domains of Healthcare Provision

25.2  Desirable Properties of Healthcare Performance

25.3  Healthcare Management Metamodels

25.4  Initial Healthcare ‘Enterprise’ Metamodel

25.5  Initial Healthcare ‘Organisational Unit’ Metamodel

25.6  Typical Issues that can arise in the Healthcare Context

25.7  ICP expressed in the Healthcare TAP DSM

25.8  NHS Incident Classification

25.9  NHS Incident Level

Part I: Data-Centric Safety

Outline

1. Introduction

2. System Safety Management

3. Challenges to Systems Engineering

1: Introduction

Abstract

This chapter provides an introduction to the overall content of the book. Many of the core terms employed in the book are defined, and the importance of data is explored. Data is employed extensively by systems, enterprises and organisations. Data is used to configure and characterise. Data is produced, passed across interfaces, stored, processed, transformed and consumed. Improvements in communications technologies allow the interconnection of systems into ever greater Systems of Systems. In these large-scale System-of-Systems domains, we manage their scale by creating a series of abstractions; such as metadata and metamodels. The chapter also identifies the core use of data as an element in an information processing chain involving the collection of the data, transformation into information to be internalised to provide knowledge that can be employed to interact effectively within a given context. The Semiotic and DIKW models are provided as examples.

Keywords

Definitions; Data centric organisations; Metadata; Human learning; Machine learning

Data is becoming a more and more important element of modern life.

Data is at the core of science. – Neil deGrasse Tyson

Data is important: the contributions to (safety) risk associated with the use of data (1.2.1) are real [227,229] and are beginning to be recognised [238,139,168]. What is more problematic is the lack of consensus, within the safety community, regarding the design, management and treatment of data consumed and produced by systems (1.5.5) with potential safety consequences.

We acknowledge that data (like software) cannot directly harm (1.0.3) you, without an operational environment. We note that data requires an actor (1.0.1) to interpret and act on the data. However, access denial, delay or misdirected data services may contribute to harm (particularly where time is a factor). In a data-centric world, how might data constrain or enable individual actors to interact safely with their operational environment and how will omissions or data errors (2.4.2) influence safety?

Definition 1.0.1

Actor

an individual, entity, or combination of product (1.5.4), people and process.

The role of actor (1.0.1) will increasingly be undertaken by one or more Autonomous Agents (1.0.2).

Definition 1.0.2

Autonomous Agent

(AA) an entity operating on the owner's behalf, as an actor (1.0.1) without interference from the ownership entity. Typically, these are products (1.5.4) that incorporate varying degrees of Artificial Intelligence (AI) and Machine Learning (ML) [241,391].

AAs are often the controlling entity in Autonomous Systems (AS) (3.2.2) reliant on data products (3.6.2).

Where's the harm?

Definition 1.0.3

Harm

physical injury or damage to the health of people or damage to property or the environment [319].

Technology continues to evolve, enabled primarily through infrastructural technologies (4.0.2) connecting the Information Systems domain dominated by service. Data is the common factor in the provision of these integrated services. Commercial entities have become dependent on this data, as without it they could not operate. The provision of data services has also evolved from client-server, multi-tier systems, data warehousing to enterprise and organisational data architectures. Data Safety (DS) continues to evolve as decision-making (autonomous) technologies mature. All these changes are taking place while data volumes are rising exponentially, in turn giving rise to the new disciplines of Data Science (3.6.1) and Data Engineering [638].

Infrastructural technologies provide a way to share resources and information about resources. Shared resources require the use of identity (1.0.4) and therefore an identity model (3.4.2). It is desirable that these identities are unique (R.3.4.1) (within specified boundaries (2.1.12) and, where appropriate, the entire system). The use of identity requires consideration of access. Access should be controlled to create privileged areas, functions, applications, systems and groups of systems (including the data (1.2.1), metadata (7.8.3) and metamodels (7.8.2) they may contain). Access Control applies to all elements. As a result, identity and Access Control are a critical interface with Cyber Security (3.3.1) Management.

Definition 1.0.4

Identity

A unique labelling of attributes of the object (system resource) being accessed and of the actor (1.0.1) requesting access in a given context (6.0.2).

Communications-enabled technologies also change organisational structures and the enterprises (1.5.1) that use them. Internet-based services are perhaps the most recognisable of these changes with the rise of online shopping. The execution of a retail website transaction includes an array of services on the website, from payment, goods selection and dispatch to confirmation of delivery, often across several commercial entities. Data is the common factor in the provision of these integrated services. These business entities have become data-centric, as without this (correct) data they could not operate. As a result, data errors and failures (2.4.5) become a significant feature of incidents (13.0.1). Therefore, data errors may be a part of a direct causal chain (2.1.16) and contribute to harm.

Threat Identification and Risk Management

A combination of factors and circumstances will be required to give rise to an incident with data as a contributing cause. One simple model to represent this is the bow-tie diagram [119]. Figure 1.1 represents the contribution of data error, failures and malicious threat events to incidents. The left-hand side of the bow-tie diagram shows data errors, failures or malicious threat events. Furthermore, the diagram can represent the impact of data errors, failures or malicious threat events on the effectiveness of mitigation (2.2.3). Incident sequences (13.0.3), and therefore safety risk management, are significantly affected by data.

Figure 1.1 Bow-tie Diagram

System Safety Principles are incorporated into mature Safety Management Systems (SMS) (2.1.1). Similar risk management systems are used to address disciplines from asset management to enterprise and organisational (1.5.2) risk management. A greater reliance on data affects all these risk management frameworks.

Where does all this data come from?

Data has always been present. The volumes of data used in protection systems have been limited to data generated by dedicated sensors and exchanged through limited interfaces (8.4.1). Larger volumes of data are common in Air Traffic Control for the management of navigation data, flight planning and operations. Typically, these are closed systems (1.5.8).

Infrastructural technologies and data ecosystems (4.0.1) enable the creation of architectures that employ highly adaptive applications that are data-dependent, if not data-centric. In parallel, developments in hardware and operating systems have allowed the creation of low-cost platforms that form the Internet of Things (IoT) (3.8.1). The ubiquity of the IoT has the potential to produce vast quantities of data in open systems (1.5.7) and environments. Often IoT devices exploit remote cloud data storage and Fog Computing [70].

This storage and computing capability changes the nature and possible uses of data, as well as the potential impact of data error. Data growth is exponential. Many datasets make reference to other data, often across organisational domains and system boundaries. Data volume growth provides a multiplier for the data references that it contains.

What does this mean for Safety Management?

In addressing Data-Centric Systems (DCS) (1.5.9), the Safety Principles embodied in many mature SMS remain unchanged, but it will be more challenging to marshal and control the resources required to create and maintain the integrity of systems. This is especially true where the capability to evolve rapidly using ML exists. The system becomes highly dynamic.

The introduction of new technology is often associated with step change. Over time these technologies mature and the processes related to them become normalised, requiring the standardisation of components. Economic factors drive component inventories to minimum levels; such pressures encourage re-use and give rise to the requirements to reduce costs associated with change [230].

The sheer range of applications of Computer Based Technologies (CBT) creates issues of scope and applicability. System safety management has its foundation in protection systems, typically fast-acting rule-based technologies. The implementation of configurable CBTs changes the risk profiles associated with this established domain. This is in part due to changes in Supervision, Optimisation and Control (SOC), creating more complex requirements for vertical and
