
Data-Centric Safety: Challenges, Approaches, and Incident Investigation
Ebook · 1,126 pages · 9 hours


About this ebook

Data-Centric Safety presents the core concepts and principles of system safety management, and then guides the reader through the application of these techniques and measures to Data-Centric Systems (DCS). The authors have compiled their decades of experience in industry and academia to provide guidance on the management of safety risk.

Data safety has become increasingly important, as many solutions depend on data for their correct and safe operation and assurance. The book's content covers the definition and use of data. It recognises that data is frequently used as the basis of operational decisions and that DCS are often used to reduce user oversight; this data is often invisible or hidden. DCS analysis is based on a Data Safety Model (DSM), which provides the basis for a toolkit leading to improvement recommendations. The book also discusses the operation and oversight of DCS and the organisations that use them, covering incident management and providing an outline for incident response. Incident investigation is explored to address evidence collection and management.

Current standards do not adequately address how to manage data (and the errors it may contain), and this leads to incidents, possibly with loss of life. The DSM toolset is based on Interface Agreements that create soft boundaries, helping engineers facilitate proportionate analysis, rationalisation and management of data safety. Data-Centric Safety is ideal for engineers working in the field of data safety management.

This book will help developers and safety engineers to:

  • Determine what data can be used in safety systems, and what it can be used for
  • Verify that the data being used is appropriate and has the right characteristics, illustrated through a set of application areas
  • Engineer their systems to ensure they are robust to data errors and failures
Language: English
Release date: May 27, 2020
ISBN: 9780128233221
Author

Alastair Faulkner

Dr. Alastair Faulkner is a Consultant Engineer at Abbeymeade Limited. He has more than 30 years of experience in senior management and has specialist knowledge of data-centric systems. He specialises in system safety and systems engineering. He supports clients with business planning, execution, delivery, risk assessment and management.



    Data-Centric Safety

    Challenges, Approaches, and Incident Investigation

    First edition

    Alastair Faulkner

    Mark Nicholson

    Table of Contents

    Cover image

    Title page

    Copyright

    Preface

    Readership

    Directed Reading

    Bibliography

    It's Monday Morning …

    Bibliography

    Acknowledgements

    List of Figures

    List of Tables

    Part I: Data-Centric Safety

    1: Introduction

    Abstract

    1.1. Logic and Rationality

    1.2. Data

    1.3. Data, Information, Knowledge and Wisdom

    1.4. Systems Reliant on Data

    1.5. Data becomes the Dominant Systems Component

    Bibliography

    2: System Safety Management

    Abstract

    2.1. Safety Management Systems

    2.2. Hazard, Opportunity, Incident

    2.3. Decision, Confidence and Uncertainty

    2.4. Errors, Faults, Failures and Anomalies

    2.5. 4Plus1 Safety Assurance Principles

    2.6. Risk Management Model

    2.7. Safety Justification

    2.8. Maturity Modelling for Data-centric Systems

    2.9. Safety Management Paradigms

    Bibliography

    3: Challenges to Systems Engineering

    Abstract

    3.1. Systems Science

    3.2. Systems Engineering

    3.3. Cyber Security Management

    3.4. Identity Model

    3.5. Information Systems

    3.6. Emerging Disciplines

    3.7. The Accidental System

    3.8. Change in the Systems Domain

    Bibliography

    Part II: Data-Centric Fundamentals

    4: Data Fundamentals

    Abstract

    4.1. Data Quality

    4.2. Value (Economics) of Data

    4.3. High-Integrity Data

    Bibliography

    5: Data-Centric Systems

    Abstract

    5.1. Classification of Data

    5.2. Decision Model

    5.3. Uncertainty

    5.4. Autonomy and Perception

    5.5. Safety Management of Adaptive Systems

    Bibliography

    6: System Context

    Abstract

    6.1. Mature Context

    6.2. Multiple Contexts

    6.3. Context Switch

    6.4. Learning, Adaptive and Autonomous

    6.5. Indeterminate Context

    6.6. Summary

    Bibliography

    7: System Definition

    Abstract

    7.1. Requirements

    7.2. Requirements Management

    7.3. Data Definition Languages (DDL)

    7.4. Supervisory Model

    7.5. Service Provision

    7.6. Rely-Guarantee

    7.7. Performance

    7.8. Metamodels and Metadata

    7.9. Safety-Related Application Conditions

    7.10. Security Requirements

    7.11. Summary

    Bibliography

    Part III: Data-Centric Design

    8: Data-Centric Architecture

    Abstract

    8.1. Computational Models

    8.2. Diversity

    8.3. Architecture Styles and Patterns

    8.4. Interfaces and Interface Agreements (IA)

    8.5. Critical Control Points

    8.6. Metamodel Architectures

    8.7. Metadata for IA

    8.8. Data Paths

    8.9. Summary

    Bibliography

    9: Development

    Abstract

    9.1. Operational Context

    9.2. Architecture and the Operational Context

    9.3. Project Management

    9.4. Life Cycle Models

    9.5. Configuration Management

    9.6. Data Path Implementation

    9.7. Analysis

    9.8. Threat Identification

    Bibliography

    10: Acceptance and Approval

    Abstract

    10.1. Policy, Strategy and Planning

    10.2. Assessment of Design and Implementation

    10.3. Assessment against 4Plus1 Principles

    10.4. Evaluation of Risk Assessment of Design

    10.5. Assessment of Implementation

    10.6. Assessment of Safety Management System

    Bibliography

    Part IV: Operational Management and Maintenance

    11: Operational Matters

    Abstract

    11.1. Business Model and Data Metamodel

    11.2. Data-Centric Operational Organisation

    11.3. Business Management

    11.4. Organisational Metamodel

    11.5. Self-consistent Organisation

    11.6. Operational Modes

    11.7. Emergency Preparedness

    Bibliography

    12: Live Management and Control

    Abstract

    12.1. Data Management Plans

    12.2. Business Continuity

    12.3. Safety-related System Continuity

    12.4. Data Integration

    12.5. Managing Data Change

    12.6. Operational Safety Management

    12.7. Authentication

    12.8. Competency

    12.9. Maintenance of Data as a (Virtual) Asset

    12.10. Data Obsolescence and Destruction

    Bibliography

    Part V: Incident Investigation

    13: Major Incident Response

    Abstract

    13.1. Incident Response

    13.2. Effective Response and Recovery

    13.3. Immediate Aftermath

    Bibliography

    14: Investigation Management

    Abstract

    14.1. Planning

    14.2. Strategy

    14.3. Execution

    Bibliography

    15: DCI Investigation Methodologies

    Abstract

    15.1. Derivation of an Incident Model

    15.2. Classification

    15.3. AcciMap

    15.4. Systems-Theoretic Accident Model and Processes

    15.5. Functional Resonance Analysis Method (FRAM)

    15.6. Network Theory

    15.7. Systems Dynamics

    15.8. Applying DSM to Incident Investigation

    Bibliography

    16: Incident Investigation

    Abstract

    16.1. Investigation Planning

    16.2. Validation of the System Context

    16.3. Validation of the System Definition

    16.4. Analysability

    16.5. Access, Security and Authorities

    16.6. Ongoing Data Safety Incidents

    16.7. Data Safety Incident Investigation

    Bibliography

    17: Investigation Methodology Maturity

    Abstract

    17.1. Validation

    17.2. Investigation Repeatability

    17.3. Education and Training Requirements

    Bibliography

    18: Analysis as Part of a DCI

    Abstract

    18.1. Evidence Directed Analysis

    18.2. Root Cause Analysis (RCA)

    18.3. Incident Model Validation

    18.4. Replicating the Incident

    Bibliography

    19: Incident Report

    Abstract

    19.1. Evidence Navigation

    19.2. Incident Report

    19.3. Escalation and Resolution

    Bibliography

    Part VI: Data Safety Model

    20: Data Safety Model

    Abstract

    20.1. Model Elements

    20.2. Transformation Model (T-axis)

    20.3. Abstraction Model (A-axis)

    20.4. Product, Installation and Maintenance (P-axis)

    20.5. Interface Agreements (IA)

    20.6. Critical Control Points

    20.7. Metadata and Metamodels

    20.8. Data-Centric Decisions

    20.9. Identity and Identity Management

    20.10. Implementing Permit to Work

    20.11. Triplet Relationships

    20.12. Time, Change and Maintenance

    Bibliography

    21: Using the DSM

    Abstract

    21.1. Initial TAP Identification

    21.2. Analysis of P-TAP

    21.3. Confidence in Risk Assessment over DSM

    21.4. Impact of Change on the DSM (Brownfield Sites)

    Bibliography

    22: Validation

    Abstract

    22.1. AcciMap

    22.2. STAMP

    22.3. FRAM

    22.4. Network Theory

    22.5. System Dynamics

    22.6. Weinberg's Categorisation of System Complexity

    22.7. Resilience Engineering

    22.8. Data Security

    22.9. Explanation and Communication

    Bibliography

    Part VII: Application Areas

    23: Autonomous Flight

    Abstract

    23.1. Introduction

    23.2. System Description

    23.3. Normal Operation

    23.4. An Airspace Described in Data

    23.5. Applying DSM

    23.6. Metamodel

    23.7. Metadata

    23.8. Incident Investigation

    23.9. Expressing the Supervisory Model in Metadata

    Bibliography

    24: Enterprise

    Abstract

    24.1. Introduction

    24.2. System Description

    24.3. Normal Operation

    24.4. Multi-layer Error Management

    24.5. Acquisition and Merger

    24.6. Divestment

    24.7. Permit-to-Work Failure

    24.8. Safe-Method-of-Work Failure

    24.9. Emergency Response

    Bibliography

    25: Healthcare

    Abstract

    25.1. Introduction

    25.2. System Definition

    25.3. Metamodel Integration

    25.4. Vertical Integration

    25.5. Horizontal Integration

    25.6. ‘Product Line’ Integration

    25.7. Cyber Physical System Threats

    25.8. Healthcare Incident

    25.9. Summary

    Bibliography

    Part VIII: References

    Bibliography

    Bibliography

    Abbreviations

    Definitions

    Index

    Postface

    Bibliography

    Copyright

    Elsevier

    Radarweg 29, PO Box 211, 1000 AE Amsterdam, Netherlands

    The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom

    50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

    Copyright © 2020 Alastair Faulkner and Mark Nicholson. Published by Elsevier Ltd. All rights reserved.

    data-centric-safety.com

    No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

    This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

    Notices

    Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

    Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

    To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

    Library of Congress Cataloging-in-Publication Data

    A catalog record for this book is available from the Library of Congress

    British Library Cataloguing-in-Publication Data

    A catalogue record for this book is available from the British Library

    ISBN: 978-0-12-820790-1

    For information on all Elsevier publications visit our website at https://www.elsevier.com/books-and-journals

    Publisher: Susan Dennis

    Acquisitions Editor: Anita Koch

    Editorial Project Manager: Kelsey Connors

    Production Project Manager: Poulouse Joseph

    Designer: Victoria Pearson

    Typeset by VTeX

    Preface

    Technology evolves, shaped by its use, often in unexpected ways. Products once constrained by 'air gaps' are now enabled by communications-based infrastructural technologies and data ecosystems. Yet 'data' is too broad a term, as it does not capture the many roles data plays within systems. Historically, in most systems data is merely consumed and processed, and some action is performed based on predetermined criteria. In this case data is passive and inert, and needs to be consumed to participate in or to direct actions or activities. However, data often exhibits many degrees of freedom, including the description of functionality, performance, capability, capacity and constraint. Data may also include temporal (sequence or order) or time-based (time, rate or calendar) properties. Data has a mercurial quality: it is challenging to manage and control; it has a habit of being consumed by systems it was not produced for, by omission or by design, perhaps without the awareness of the system designer. It commonly passes (often unchecked, or even unwittingly) across system and organisational boundaries.

    Data (in all its forms) is often unchallenged, unverified, ubiquitous, unrecorded and invisible. Yet this data increasingly determines the behaviour of systems and through this behaviour our access to products (goods and services). Data may be internal or fed to systems with a safety responsibility. As a result, data error or omission may go undetected with potentially hazardous or catastrophic consequences. There may also be consequent damage to assets. Failure of such systems may also contribute to harm indirectly through incorrect decisions made by actors (human or computer) who rely on, or trust, these systems and the data they supply. How should safety justifications reason about Data-Centric Systems (DCS) so that our reliance on, or trust in their correct operation can be justified?

    In using the term DCS we acknowledge the ever-increasing volumes of data. Data may be structured or unstructured. However, not all data has value to us; not all data is fit to be used in a system with safety implications or as part of the assurance of such systems. So how should we determine what data can be used, and what it can be used for? How do we assure ourselves that the data used is appropriate and has the right characteristics? How do we engineer our systems to ensure they are robust and resilient to data errors and failures?
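    The questions above can be made concrete with a small sketch. The following is a hypothetical illustration (not from the book) of an interface-agreement-style check that a data item has the required characteristics before a system consumes it; the record type, field names and tolerances are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical data item a safety-related system might consume.
@dataclass
class SpeedLimitRecord:
    segment_id: str
    limit_kph: float
    valid_from: datetime

def check_record(rec: SpeedLimitRecord, now: datetime) -> list:
    """Return the reasons a record fails the agreement (empty list = usable)."""
    errors = []
    if not rec.segment_id:
        errors.append("missing segment identity")
    if not (0 < rec.limit_kph <= 130):      # plausible-range check on content
        errors.append(f"limit out of range: {rec.limit_kph}")
    if rec.valid_from > now:                # temporal property check
        errors.append("record not yet valid")
    return errors

# A record with an implausible limit is flagged rather than silently used.
rec = SpeedLimitRecord("A1-042", 260.0, datetime(2020, 1, 1, tzinfo=timezone.utc))
print(check_record(rec, datetime.now(timezone.utc)))
```

    The point of the sketch is that "robust to data errors" implies an explicit, checkable statement of what the consuming system requires of its data, rather than implicit trust.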

    As these DCS grow, they experience a change of scale, consuming (and potentially producing) vast quantities of data. As a result, automated methods are required to ensure and assure the contribution of data to system safety. Furthermore, how do we ensure that actors using the data generated by such systems do so in the intended way and with the appropriate level of criticality?

    No mature methods currently exist to address these issues, and guidance in this area is immature. Careful development of data-intensive systems will improve an organisation's ability to ensure and assure the safety of its systems. We address these issues in this book.

    System Safety Engineering (SSE) is applicable across the entire life cycle of a product, from concept to disposal. In this book we address data safety issues relating to both physical goods and service elements of a product within an SSE framework. However, the emerging field of data safety means that in this first edition there are aspects of data safety that we do not address.

    WARNING: This book contains Scary Monsters

    Where the authors have paused for thought, tea and/or discussion …

    A scary monster is used to identify open research questions, open certification issues, a requirement for an in-depth discussion to take an issue further than it is explored in this text, and Key Safe Behaviour (KSB) deficiencies in current SSE / Safety Management System (SMS) practice.

    Readership

    System Safety Engineering (SSE) has its origins in very high-impact activities, such as controlling nuclear power plants or aircraft. As the use of Computer-Based Technologies (CBTs) to control systems and provide services has spread, so has the range of systems that need to consider system safety. Communications-enabled CBTs combine to create infrastructural technologies. These technologies are one foundation of the data ecosystem, delivering the performance and low latency required to handle extensive, complex data (1.2.1). They make it possible to run applications on systems with thousands of nodes, involving vast quantities of data. The emergence of data as a determinant of system behaviour has followed the spread of these technologies. This book is therefore relevant not only for practitioners in the classical system safety industries, but also for an ever-increasing set of providers of products (goods and services) outside them. It is also an area that has not been adequately addressed by the academic community.

    As a result, the audience for this book includes, but is not limited to, the following communities:

    Academia

    •  Post-graduate students undertaking or wishing to undertake research into Safety Management of data-centric and data-intensive organisations and systems.

    •  Safety Engineers / Professionals studying the development, operation or oversight of data-centric or data-intensive systems and organisations as part of Continuing Professional Development (CPD).

    Industrial practitioners

    •  Incident Investigators who need to address data issues in their causal analysis and improvement recommendations.

    •  Safety Engineers / Professionals working in the development, operation or oversight of data-centric or data-intensive systems and organisations.

    •  Software Engineers who have to develop software that interfaces with and uses data to determine the set of services provided and the results of the services provided by data-centric and data-intensive systems.

    •  Data scientists who have a role in safety-related information systems or data-intensive control systems development or operation.

    •  Enterprise Architects who wish to determine how data-centric and data-intensive their architecture should be. There are implications for enterprise architectures derived from the move towards data-centric and data-intensive activities.

    •  Enterprise Architects whose data has the potential to be used for functions beyond their original intention. Boundaries for the data use may be defined or additional efforts made to improve integrity, etc., where such limits cannot be imposed.

    Corporate and management practitioners

    •  Operational Managers who have System Safety Management responsibility for data-centric and data-intensive systems and organisations.

    •  Information managers of data and metadata interested in the link between data and the trust and reliance that can, or should, be placed on the information extracted from it.

    •  Enterprise Architects whose data, metadata and metamodels have the potential to describe and shape an organisation or enterprise to be safe by design and to ensure that boundaries are enforced.

    •  Project Managers who wish to set the competencies of staff engaged in data safety activities, ideally before the start of such activities. Furthermore, they will be interested in identifying, understanding and controlling the risks that exist with data, and the risks associated with data safety and the current status of the identification and control of those risks.

    •  Data and Commercial Managers who are interested in the impact of the safety ensurance and assurance work on cost, value, timeliness, logistics and disposal / retention issues relating to safety.

    •  Data Asset Managers who are interested in the through-life management of data.

    •  Lawyers addressing liability issues implied by the use of data (and faulty states induced by data and data errors) in data-centric and data-intensive systems. Liability accrues proportionally to the contribution of the activity / element to safety risk. Data crosses boundaries, which makes intellectual property, copyright and theft issues relevant.

    •  Training and education departments within data-centric and data-intensive organisations who have to provide, or commission, training towards competence.

    •  Corporate Managers interested in acquisition or divestment. Especially the legal and corporate issues associated with the management of data, metadata and particularly metamodels as these contain Intellectual Property (IP).

    Societal Guardians

    •  Regulators and approval bodies. What should they be asking for, and how will they know that applicants have addressed data safety appropriately in their safety and compliance cases?

    •  Policymakers who are interested in updating existing regulations to incorporate data contributions to safety. Typically, this will have international and national contributions.

    Directed Reading

    The discipline of data safety is immature and needs to be improved as a matter of urgency. In this book, we attempt to raise awareness of data (1.2.1), metadata (7.8.3), metamodels (7.8.2) and the Social influences and impacts of data, and Interface Agreements (IA) (8.4.3) (or their absence and enforcement) in Data-Centric Systems (DCS) (1.5.9). This book provides a structure within which proposed solutions can be analysed. Experience is not available at the time of writing as to the effectiveness of these approaches other than on individual system exemplars. Where experience is available, it is highlighted.

    This book could not, and should not, be read in isolation. We have deliberately built on existing material (and the concepts it contains); therefore, there are many references to external sources. While recognising that the application domain is continuously changing, the core safety concepts, techniques and measures are incorporated into mature Safety Management Systems (SMS) (2.1.1) (typically arranged as sequences of processes), which are to be adapted to support DCS in their operational contexts. Situations where current SMS practices may no longer be applicable or sufficient are the subject of increasing research activity. For example, systems may employ Machine Learning and, as a result, be highly dynamic in the evolution of their safety characteristics. Such systems use data as a critical enabler. The primary challenge for the developer is to know the learning methodology and the integrity / criticality that can be assured by it.

    The Reader is reminded that established practices and SMS apply equally to all components of the system (hardware, software, people, process and data (including metadata and metamodels)). We note that many established standards offer little guidance that explicitly addresses data. Data's absence from standards (and guidance notes) does not provide a credible basis for claims that data is outside the confines of safety management, and that few safety resources, if any, need therefore be allocated to it. This text focuses on data (including metadata and metamodels) as the emerging, and soon-to-be-dominant, system safety component.

    System Safety Practitioners

    Section 2.9 (Safety Management Paradigms) expresses the evolution of system safety management and the challenges that lie ahead. These challenges are explored through headline issues.

    •  Boundaries: Large datasets obscure boundaries (2.1.12), and without clear boundaries hazard (2.1.10) management is problematic. Section 8.4 (Interface Agreements) (IA) provides one means of managing and controlling real and virtual boundaries.

    •  Identity: Increases in the number of elements give rise to identity (1.0.4) and identity management requirements. Section 3.4 (Identity Model) provides one means of expressing issues associated with identity.

    •  Safe Method of Work: Existing SMS requires high-integrity implementation of Permit-to-Work (24.7.1). The increased span of control requires that these practices be reinforced in the DCS. Section 24.7 (Safe Method of Work) addresses these issues.

    •  Data Safety Model (DSM): The interconnected nature of DCS requires a way to express the data element of a product (1.5.4), the operational process and organisational hierarchy. These issues are expressed in Section 20 (Data Safety Model).

    •  Using the DSM: In a complex context using the DSM becomes challenging. Section 21 (Using the DSM) provides initial guidance on its application.

    •  Data, Metadata and Metamodels: It is becoming clearer that managing data through content (1.3.3) (Data Quality) is no longer enough. A dependency on data and its ever-growing volume inevitably draw comparisons with machine code and the use of abstraction in software engineering. Data should be abstracted into metadata and metadata abstracted into metamodels.

    •  Autonomy and Automation: A growing reliance on data requires transparency, visibility of the influence of data and the errors that data may contain.

    •  Incident Investigation: Finally, data-centric systems will fail; this failure will lead to harm (1.0.3). Part V (Incident Investigation) provides one way to investigate data incidents.
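    The 'Data, Metadata and Metamodels' point above can be sketched in a few lines. This is a hypothetical illustration (not the book's DSM), assuming invented field names and integrity labels, of the abstraction chain the authors describe: data is abstracted into metadata, and metadata into a metamodel that constrains what the metadata may claim.

```python
# Raw data item, as consumed by a system.
data = {"temperature_c": 87.5}

# Metadata: describes the data item (unit, provenance, integrity claim).
# The "source" and "integrity" labels are invented for illustration.
metadata = {
    "field": "temperature_c",
    "unit": "celsius",
    "source": "sensor_12",
    "integrity": "SIL2",
}

# Metamodel: the rules any metadata record must satisfy before the data
# it describes can be trusted.
metamodel = {
    "required_keys": {"field", "unit", "source", "integrity"},
    "allowed_integrity": {"SIL1", "SIL2", "SIL3", "SIL4"},
}

def conforms(md: dict, mm: dict) -> bool:
    """Check a metadata record against the metamodel."""
    return (mm["required_keys"] <= md.keys()
            and md["integrity"] in mm["allowed_integrity"])

print(conforms(metadata, metamodel))
```

    The design point mirrors the comparison with software engineering drawn above: just as abstraction tamed machine code, checking data indirectly, via metadata against a metamodel, scales better than inspecting ever-growing data content directly.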

    System Safety Acceptance and Approvals

    Independent review is one of the cornerstones of System Safety practice. The Safety Assessor will be a System Safety Practitioner; therefore, the guidance in the preceding paragraph applies. In addition, the Assessors need to be satisfied that the element is suitably and sufficiently described in its context and that its features, functions, dependencies and failure modes are understood well enough to manage safety risk.

    The use of Autonomy and Automation presents particular difficulties as to the nature and form of the safety case. Safety I (S.2.9.1), in which the set of hazards (2.1.10) is sufficiently well known, represents the current footprint for independent assessment and review. Products are complete; their failure mechanisms are known. Required mitigations and barriers to escalation (2.2.1) are also known.

    Autonomy, the use of Artificial Intelligence (AI) and Machine Learning (ML) will result in unfinished elements. At the point they go operational, they learn and adapt their behaviours, and in doing so give rise to new hazards and new combinations of hazards. The Safety Assessor will be expected to express a professional opinion as to the safety risks involved in such systems (see Section 10 (Acceptance and Approval)).

    Incident Management and Investigation

    The Safety Investigator will be a System Safety Practitioner; therefore, the guidance in the preceding two paragraphs applies. Evidence, in the form of witness marks on physical components and eyewitness statements, has been pivotal in determining the root causes of many fatal accidents (13.0.2). A reliance on data may mean a reduction in the availability of physical evidence, to the extent that the absence of physical evidence is itself an important feature. An incident (13.0.1) involving an Autonomous Vehicle (AV), for example, may leave no skid marks; that absence is itself evidence, indicating a failure to brake.

    As reliance on data increases, so does the probability of systematic data failure. Rather than single incidents at single locations and points in time, multiple incidents may therefore manifest at many times and locations. Investigating the underlying data causes of such complex situations is challenging. Section 15 describes a range of incident investigation methodologies. The investigation methodology should be documented to ensure repeatability and audit; it may be a combination of existing approaches, a hybrid or something new.

    Corporate and Management Practitioners

    Autonomy and the use of AI and ML require the application of system safety to evolve, recognising that treating only product-based hazards may not be enough. This places additional responsibilities on operational and corporate management and requires operational managers to become Duty Holders (2.0.4).

    •  Safety I (S.2.9.1): This is a conventional view, represented in many system safety standards, where all hazards are known, managed, mitigated or removed such that the residual risk is at least tolerable (2.0.5). Products and systems (1.5.5) are ‘finished’ and are supported by operational processes faithfully executed by competent, trained and experienced users. Section 2.9 (Safety Management Paradigms) expresses the evolution of SSM and the challenges that lie ahead. Highly configurable data systems present significant management challenges.

    •  Safety II (S.2.9.2): Hollnagel [302,299] recognises that safety systems are not perfect and that users play an important role in the resilience of the safety system. One extension of resilience is the implementation of products that are unfinished at the point they are set to work. These issues reflect a shift in emphasis towards adaptive requirements placed on operations (Section 11 (Operational Matters)) and maintenance (Section 12 (Live Management and Control)). Who will be liable for incidents involving these unfinished products?

    •  Safety II+ (S.2.9.3): Reduced oversight and an increased span of control require tasks to be automated. To what degree should these tasks be automated, and how is this automation to be supported by autonomous systems? What contribution can data assurance make to the assurance of autonomous systems?

    •  Safety III (S.2.9.4): This is an area for academic research. The use of Safety III implies that autonomous behaviours also have input to the SMS. Current implementations of autonomy are already changing safety practice. Which other SMS elements (philosophy, policy, procedure, practice) or responses should we permit autonomy to change? (see Figure 2.1)

    Academia

    The scope for further academic work is extensive. Solution constraints formerly imposed by hardware, software and limited communications infrastructures are significantly diminished. As a result, highly connected and adaptive systems are emerging, as embodied in technologies such as the Internet of Things (IoT). Several fundamental building blocks are incomplete and require academic research.

    •  Scary Monsters: This text contains many ‘Scary Monsters’. They represent the unasked and unanswered questions; where possible, we try to isolate them to formulate problem descriptions for academic consideration.

    •  Teaching and Training: DCS offer an unprecedented opportunity to refresh and revise curricula. SSE has to evolve to encompass DCS. This text is a reference work collating and collecting many sources.

    New to System Safety?

    We hope that you find our writing style readable. While we do include introductory material, beginning with Section 1 and reading to the end will present you with a substantial learning curve. Before you apply any of the concepts contained in this text, we recommend you consult a System Safety Practitioner familiar with DCTs and their application domain.

    Bibliography

    [299] Erik Hollnagel. Safety-I and Safety-II. Routledge; 2014. ISBN 978-1472423085.

    [302] Erik Hollnagel, Jean Paries, John Wreathall. Resilience Engineering in Practice: A Guidebook. Ashgate Studies in Resilience Engineering. CRC Press; 2013. ISBN 978-1472423085.

    It's Monday Morning …

    You have read the book (hopefully you found it interesting), and arrived at work. You're in a data-centric organisation (DCO) (1.5.3) with many data-centric systems (DCS) (1.5.9). You've got a data-centric problem …where do you start?

    This problem is enormous …big enough for you to reopen this book …

    There is no easy answer; much depends on the industry sector (regulated or unregulated) and the nature of the safety problem: its operational context (9.1.1), its position within the Data Safety Model (DSM) (20.0.1) and one or more TAP points. It would be unreasonable of us to be prescriptive …

    What we can do is outline a process, a place to start, and to issue a stern warning: you must adapt this process to your data-centric problem; we cannot do this for you.

    Develop a Remit

    It is important that you establish what it is that you want from this investigation. Data Safety (DS) assurance and associated investigations have a propensity to consume resources, not because data (1.2.1) is more complex than other system components, but simply because of its potentially extensive technical footprint. It is all too easy for data, metadata (7.8.3) or an element of the metamodel (7.8.2) to be shared by multiple DCSs and DCOs. Some of these uses will be explicit and some implicit; hopefully, only rarely will they be ‘unintended', accessible through sneak circuits (15.8.2).

    It is essential to set a boundary (2.1.12) on your remit and the ‘area of interest'. The identification of context (6.0.2) is of concern, as the data may not be valid outside the context and uses for which it was created. It is common for a system (1.5.5) to sit within a hierarchy. We can no longer assume that the user will be human; to reflect the increased use of automation, the term ‘user' is replaced by actor (1.0.1).

    Existing Safety Management

    All operational domains contain risk (2.0.2). Regulated domains include at least one Duty Holder (2.0.4) and Designer (2.0.6), with their respective roles and responsibilities identified. Systems in these regulated domains are associated with one or more Safety Cases (2.7.4), addressing their use by competent and trained actors. Therefore, your context may contain some or all of the following existing safety documents:

    1.  Safety Management System (SMS) (2.1.1)

    2.  Safety Management Manual (SMM) (2.1.2)

    3.  Safety Management Plan (SMP) (2.1.3)

    4.  one or more existing Safety Cases

    Figure 2.1 illustrates the relationships between these documents. Your ‘area of interest' may be associated with a Hazard Log (2.1.11) to track all hazards (2.1.10), hazard analysis, risk assessment and risk reduction activities for the ‘whole-of-life' of the safety-related system (SRS) (2.7.7) for any conditions that can potentially lead to harm (1.0.3), including identification of those at risk.

    Enabling Works

    The remit is extended and elaborated to identify the infrastructural technologies (4.0.2). These are the underlying, often ignored, communications systems that form the foundation of DCS. This examination is to confirm that the topology and configuration contain no errors (2.4.1) that might permit sneak circuits and hence unintended (rogue) data paths (8.8.1). Use Network Theory (S.15.6) to construct the initial network representation of your ‘area of interest'.
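    The network representation described above can be sketched as a small directed-graph exercise. This is a minimal illustration in Python, with hypothetical system names; any system reachable from a data source but not on the intended data path is a candidate sneak circuit.

```python
from collections import defaultdict, deque

def build_network(interfaces):
    """Build a directed adjacency map from (source, destination) interface pairs."""
    graph = defaultdict(set)
    for src, dst in interfaces:
        graph[src].add(dst)
    return graph

def data_paths_from(graph, start):
    """Return every system reachable from `start` via breadth-first traversal."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def sneak_paths(graph, start, intended):
    """Reachable systems NOT on the intended data path: candidate sneak circuits."""
    return data_paths_from(graph, start) - set(intended)
```

    For example, if a sensor's data is intended to flow only to a logger and an archive, a discovered edge from the logger to a billing system would be flagged for investigation.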

    Context

    Stepwise decomposition of the ‘area of interest' is used to refine and create one or more hierarchies based on the A-axes (of the DSM). Each of these hierarchies will contain one or more systems and actors that use them. Choose the hierarchies carefully, as further decomposition simply reinforces the choices you have made, and therefore increases the cost of any rework. It is good practice to create several (say three) first-level decompositions so that you can evaluate them and choose the ‘best fit' for further decomposition. Develop a context and boundary for each of the systems identified. Create the initial System Definition (7.0.1) for the ‘area of interest'.
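    A System Definition hierarchy of this kind can be sketched as a simple recursive record. The field names below are our own illustrative choices, not the book's System Definition (7.0.1) template.

```python
from dataclasses import dataclass, field

@dataclass
class SystemDefinition:
    """Minimal System Definition record: name, boundary notes, actors, children."""
    name: str
    boundary: str = ""
    actors: list = field(default_factory=list)
    constituents: list = field(default_factory=list)  # child SystemDefinitions

    def decompose(self, name, boundary=""):
        """Stepwise decomposition: add and return a constituent system."""
        child = SystemDefinition(name, boundary)
        self.constituents.append(child)
        return child

    def walk(self):
        """Yield every system in the hierarchy, depth first."""
        yield self
        for child in self.constituents:
            yield from child.walk()
```

    Creating several candidate first-level decompositions is then cheap: each is just an alternative tree rooted at the ‘area of interest', which can be compared before committing to one.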

    Enterprises and Organisations

    For your chosen hierarchy, identify the enterprises (1.5.1) and organisations (1.5.2) (their respective boundaries within the context). This provides demarcation between the Duty Holders and Designers, and between their respective roles and responsibilities and any Safety Cases.

    This is the process at the systems level. Now identify and locate any potential or actual ‘incident harm' within the context. These may already be described in the list of top-level hazards for the ‘area of interest' as part of the SRM. Refine the initial System Definition.

    Constituent Systems

    For the chosen hierarchy, use stepwise refinement to decompose the hierarchy into its constituent systems. Create a System Definition for each of the constituent systems. Identify the systems directly associated with ‘incident harm', the top-level hazards and the hazard records. The goal is to provide a basis for the identification of interfaces (8.4.1) that will be used in the next step.

    Interfaces

    For each interface, identify the ‘Owner' [System] and the connected systems. Identify, describe and document the Interface Agreements (IA) (8.4.3). Examination of the interfaces provides a check on the system and its description. Therefore, if necessary, refine the top-level System Definition, its network representation (see ‘Enabling Works' above) and the System Definitions for each of the constituent systems.
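    An IA record, and the cross-check it provides against the System Definitions, might be sketched as follows. The field names are illustrative assumptions, not the IA (8.4.3) template itself.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InterfaceAgreement:
    """One IA per interface: the owning system, the connected system, and what flows."""
    interface_id: str
    owner: str        # the 'Owner' [System] responsible for the interface
    connected: str    # the system on the other side of the soft boundary
    data_items: tuple # data, metadata or metamodel elements carried

def cross_check(agreements, system_definitions):
    """Flag IAs referencing a system missing from the System Definitions.

    Any hit means either the IA is wrong or a System Definition needs refining.
    """
    known = set(system_definitions)
    return [ia.interface_id for ia in agreements
            if ia.owner not in known or ia.connected not in known]
```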

    Actors, Identities and Authentication

    Consider how you might gain access to a computer system. Typically, you would log on at a keyboard with a ‘username' and ‘password'. In this example, the ‘username' is your identity (1.0.4) and the ‘password' provides a means of authentication (3.3.4). In DCSs and DCOs, identity applies more widely: each system, subsystem, product (1.5.4), interface and IA will also have an identity.

    For each interface, establish the actors, their identities, their authentication and the authorities (3.3.3) used with that interface. From these lists construct the following:

    1.  initial Identity Model (3.4.2);

    2.  initial Security Model (3.3.2).

    It cannot be assumed that the identity model and security model will be homogeneous, that is, uniform and applied across the whole ‘area of interest'. A ‘triplet' [35] access strategy can be used to access Information Systems (3.5.1), including legacy systems, with a minimum of intervention and change to those legacy systems. Therefore, part of this process is to identify and document these ‘triplet' systems. The initial security model should address the following processes:

    1.  Ensuring that all connected systems are supported by, and protected by, a suitable security model (the capability of the security model is to be supported by a suitable and sufficient risk and threat assessment);

    2.  Determining access requirements;

    3.  Identifying the types of searches (to develop an index [for ‘triplet' hops across intermediate systems to destination retrieval system(s)]);

    4.  Identifying the types of access (read only; read and update; read, write, create and delete);

    5.  Specifying the unique identity (R.3.4.1) of the ‘triplet' access agent (for security and logging, and to support subsequent audit requirements).
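    Items 4 and 5 — graded access types and a uniquely identified, logged ‘triplet' access agent — can be illustrated with a minimal authorisation check. The access-type names and agent identifiers here are hypothetical.

```python
import logging

# Illustrative access-type grades (item 4): names are our own, not from a standard.
ACCESS_TYPES = {
    "read-only": {"read"},
    "read-update": {"read", "update"},
    "full": {"read", "write", "create", "delete"},
}

log = logging.getLogger("triplet-access")

def authorise(agent_id, granted_type, operation):
    """Allow `operation` only if it falls within the agent's granted access type.

    Every decision is logged against the agent's unique identity (item 5),
    supporting subsequent audit requirements.
    """
    allowed = operation in ACCESS_TYPES.get(granted_type, set())
    log.info("agent=%s type=%s op=%s allowed=%s",
             agent_id, granted_type, operation, allowed)
    return allowed
```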

    Data, metadata and elements of the metamodel

    Each interface is examined to determine what data, metadata and elements of the metamodel flow within the context and its hierarchy. This may require ‘recursion', that is, stepping along the interfaces until the source is determined (the ‘stopping condition'). It may also involve many subsidiary data paths as different ‘threads' are combined. In this way the documented description of data, metadata and elements of the metamodel is created.
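    The ‘recursion' described above — stepping along interfaces until the source or a boundary is reached — can be sketched as a simple backwards walk. The `upstream` mapping and the system names are illustrative assumptions.

```python
def trace_to_source(upstream, item, start, stop_at=frozenset()):
    """Walk a data item backwards along interfaces to its source.

    `upstream` maps (system, item) -> the system that provides the item.
    The walk ends when there is no provider (the source) or when a boundary
    system in `stop_at` is reached -- the 'stopping condition'.
    """
    path, seen = [start], {start}
    node = start
    while (node, item) in upstream:
        node = upstream[(node, item)]
        path.append(node)
        if node in stop_at or node in seen:  # boundary IA, or a circular data path
            break
        seen.add(node)
    return path
```

    Passing the boundary systems of the ‘area of interest' as `stop_at` reflects the rule below: where a data ecosystem lies beyond the boundary, the walk stops at the IA on that boundary.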

    The use of infrastructural technologies, which are often associated with data ecosystems (4.0.1), enables the creation of architectures that employ highly adaptive applications. If these data ecosystems lie beyond the boundary of the ‘area of interest', then the ‘stopping condition' is the IA at that boundary.

    It is now possible to use Root Cause Analysis (RCA) (S.18.2) to trace the information used in the causal chains (2.1.16). The analysis should consider errors, faults (2.4.3), failures (2.4.5) and security issues such as authentication failure, all of which can prevent access. The analysis should also consider employing a form of ‘Reverse Engineering' and Sneak Circuit (15.8.2) analysis.

    This process step has done the following:

    1.  Established what systems are involved in the ‘area of interest' by identifying

    (a)  data, metadata and elements of the metamodel

    (b)  candidate ‘triplets' relationships (S.20.11)

    (c)  IA

    (d)  the identities used to access data, metadata and elements of the metamodel

    2.  Outlined the steps required

    (a)  to select the minimum set of relevant data via navigation of an appropriate set of ‘triplets'

    (b)  to identify relevant TAP point(s) on DSM

    i.  characterising the data requirements over each TAP point

    ii.  selecting an appropriate ‘triplet' interface point to an adjoining TAP point

    iii.  navigating outward through ‘triplet' set(s) until reaching stopping criteria

    iv.  repeating for each relevant ‘triplet' set interfacing directly with TAP point

    v.  collecting back to central location or running of applications remotely on data

    What's Next

    This is a starting point. With the results of this process you are in a position to analyse the effects of proposed changes, to look at the issues associated with corporate acquisition and divestment and to have a firm basis from which to participate in the discussion about the impact of automation. One of the possible uses for this process is incident investigation.

    Bibliography

    [35] Gerard Askew, Triangulation: Navigation of Information Contexts Using Triplet Relationships. [UNPUBLISHED] 2016.

    Acknowledgements

    Alastair Faulkner

    This book could not have been completed without the support and patience of my wife Cheryl, and my children Eamon and Grace.

    I would like to thank my colleague Ron Pierce for his patience and understanding. Ron is my industrial mentor, initially from my doctorate, and has extensive experience of systems, software and safety issues. I would also like to thank Andy Harrison who has witnessed the journey, recognised its importance and offered help and assistance.

    Mark Nicholson

    Writing books of this scope is a long haul, rather than a sprint. Thank you to Rachel for her input and her entreaties to get on with it. I would like to thank my colleagues, those who have talked to me at meetings, conferences and the odd bar, for their patience, helpful discussions and polite pointers as to the errors in my approach. Robust but helpful scepticism is the lifeblood of these endeavours.

    This book marks the start of a journey as the horizon for this work expands to include the industrialisation of Autonomous Systems, and the assurance thereof. I would therefore like to thank my colleagues in the Assurance of Autonomy Programme a priori for their patience, discussions and robust scepticism as the journey to the second edition of this book unfolds.

    Why write this book

    Safety management must evolve to address the challenges posed by a reliance on data, enabled by infrastructural technologies, data ecosystems and autonomy. An awareness of this growing gap gives rise to a chronic unease where data-centric autonomous agents are used in safety systems.

    To our proofreaders

    Developing in the abstract is one thing; writing it down concisely and unambiguously so that the text communicates the intent is another. We would like to thank our proofreaders:

    List of Figures

    1.1  Bow-tie Diagram 5

    1.2  Liew (2013) DIKIW Elements and Linking Statements 13

    1.3  Surface and Deep Learning in the DIKIW 14

    1.4  DIKIW and Human Centred Competence 14

    1.5  Broad Comparison of DIKIW and Semiotic Model 17

    1.6  An Intelligent (Expert) System based on Symbolic AI 17

    1.7  A Cyber-Physical System based on Smart AI 18

    2.1  Safety Management System – Basics 26

    2.2  Hazard – Incident Sequence 29

    2.3  Hazards in a Systems Hierarchy 30

    2.4  Data Error Traversing a Data Path 33

    2.5  Extending Villemeur: Primary-Secondary-Command-Decision Failures 35

    2.6  Decomposition of Safety Requirements 38

    2.7  Safety Triumvirate 40

    2.8  Fragment of Decision Making Pattern Based on an Information System 42

    2.9  Dynamic Safety Assurance Process 43

    2.10  TRL versus IRL versus SRL 46

    2.11  Safety Related Information System 52

    3.1  A Communications Mesh – Physical and Logical Address 60

    3.2  Example Interface Agreement Implementation 63

    3.3  Safety Related Information System in Safety Decision Context 66

    3.4  Safety Management via Operational Hazard Logs 68

    3.5  Accidental System 70

    4.1  Data Sources and the IoT 83

    4.2  Deep Learning Modelling Life cycle 83

    4.3  DS Integrity Resolution within Safety Case Regime 91

    5.1  Veracity Challenges 100

    7.1  Service Provision Actors 122

    7.2  Service Specification 123

    7.3  A Model Constructed and Interpreted 131

    8.1  DSM and the TAP Axis 140

    8.2  Implementation Model for Interface Agreements 150

    8.3  Example Interface Agreement Implementation 151

    8.4  Interface Agreement Service Provision 152

    8.5  An Array of Processes with a Hierarchy 154

    8.6  A Simple Linear Metamodel Architecture 156

    8.7  Maritime Data Path 159

    9.1  A Set of Operational Contexts for Nested Systems 162

    9.2  Development Milestone Terms 163

    9.3  Architectures in the Development Context 167

    9.4  Project Management System – Basics 169

    9.5  Data Path Layer Model 177

    9.6  Identify the Data Origins 177

    9.7  Identify the Boundaries 178

    9.8  Identify the Transformations and Processing of the Datasets 178

    9.9  Apportion the Integrity Requirements 179

    9.10  Identify Evidence Requirements 179

    9.11  Specify Corrective Action Process 180

    9.12  Completed Data Path 180

    11.1  DCO Operating Canvas 198

    12.1  System or Organisation's Resilience 212

    12.2  Motor as a Line Replaceable Unit 212

    12.3  Data (Asset) Management System – Basics 213

    12.4  Data (Asset) at TAP (t, a, p) 214

    12.5  Use of ETL to create virtual schemas 217

    12.6  Near Real-time Monitoring of Operational Safety Management 221

    14.1  Evidence Management in Context 243

    15.1  A Small Network with Both Multi-edges and Self-edges 259

    15.2  System and Subsystem as a Directed Network 260

    15.3  SoI Viewpoints 268

    16.1  Incident Investigation Management in Context 272

    16.2  Incident Footprint in Time, Space, Complexity and Severity 273

    18.1  Incident Analysis in Context 282

    19.1  Incident Report in Context 287

    20.1  TAP axes of the DSM 292

    20.2  Computer-Based Technology and / or Human Systems 294

    20.3  A Layered Model for a Hierarchy of Systems 295

    20.4  System Interfaces 297

    20.5  System Boundary Issues 298

    21.1  Generation of an Initial List of TAP Points 308

    21.2  Illustration of Relationships between PoI and TAP points 310

    21.3  Illustration of Generation of TAP Critical Control Points 311

    21.4  Safety Argument over TAPs and 4plus1 Principles 319

    23.1  qCopter (Physical) Context 332

    23.2  Entity-relationships for the qCopter System 333

    23.3  Example of qSpace Airspace Segments 334

    23.4  Fragment of an Entity-relationship Diagram of a Flight Plan 335

    23.5  Example Airspace – Physical Data 338

    23.6  Initial P-TAP DSM Representation of qPilot Flight Data 343

    23.7  V-TAPs Identified for P-TAP P-Airspace 344

    23.8  D-TAPs Identified for P-TAP P-Airspace V-TAPs 345

    23.9  qPilot with an Initial Set of CCPs 346

    23.10  Initial DSM Representation of qPilot 347

    23.11  qCopter Incident 352

    23.12  Initial Network Representation of the qCopter Incident System 354

    24.1  Well-formed Enterprise 359

    24.2  Logical and Physical Production 360

    24.3  A Manufacturing Cell 364

    24.4  A DSM Hierarchy of Cells 364

    24.5  A DSM Hierarchy of Cells 365

    24.6  Vertical and Horizontal Integration 371

    24.7  Ring-fencing an Acquired Organisation 372

    24.8  Transformation using Critical Control Points 374

    24.9  Vertical and Horizontal Divestment 375

    24.10  Divestment Threat Assessment Process 378

    24.11  Implementation of the Partition Barrier 380

    24.12  A timeline for the use of Permit to Work 381

    24.13  Interface Agreement in Normal Operation 383

    24.14  Permit to Work Interface Agreement 383

    25.1  A Broad Categorisation of Healthcare Provision 391

    25.2  High-level Context for UK Healthcare Provision 394

    25.3  A Simplified Supplier, Secondary and Tertiary Delivery, and Primary Provision Model 404

    25.4  Mass Casualty Event 409

    List of Tables

    1  Key to use of italics, referencing and hyperlinks viii

    1.1  Argument Terminology for Logic 9

    1.2  13 Types of Knowledge Based on Source 11

    1.3  State of Knowledge 12

    1.4  Human Centred Competence 14

    1.5  Types of Actor Exposed to DCS 15

    2.1  DMMi Risk Management Support Function 47

    2.2  Data-centric Organisation Data Safety Risk Management 48

    2.3  Strategies for Controlling Safety Risk 49

    2.4  Situation Awareness as Product and Process 51

    3.1  Minimum Set of Elements of an Identity Model 61

    3.2  Interface Agreement Acronyms 62

    3.3  Four Types of SoS 72

    5.1  Ethics of Uncertainty 101

    7.1  Characteristics of Good Requirements 115

    7.2  Additional Characteristics of Good Requirements 116

    7.3  Desirable Properties of DCS Performance 130

    7.4  Metamodel Category Descriptions 132

    7.5  Metadata Category Descriptions 134

    7.6  Typical SRACs Address 135

    8.1  Selection of Common Definitions of Architectures 140

    8.2  Characteristics of Storage Types 144

    8.3  IA Template 151

    8.4  IA – Sample Implementations 153

    8.5  An Incomplete Selection of ISO Standards Relevant to Metadata and Metamodels 157

    9.1  Data Path Symbols 175

    9.2  Data Path Layers 176

    10.1  Description of Assessment Life Cycle Model Phases 187

    10.2  4Plus1 Data Safety Principles 188

    10.3  Risk Assessment Maturity Model Categories 189

    11.1  Components of the Business Model Canvas 202

    11.2  Business Policy Features 203

    11.3  Value Map 204

    11.4  Customer Profile 205

    11.5  Operating Model Canvas Components 205

    11.6  Operational Modes 207

    11.7  Non-operational Modes 208

    11.8  Emergency Preparedness 209

    12.1  Change Management in Adaptive Systems 219

    12.2  Start-up Mode After Data Modification 219

    12.3  Areas for Discussion for Adaptive SMS 221

    13.1  An Overview of Incident Types 233

    13.2  Principles of Effective Response and Recovery 236

    15.1  Partial Classification of Incident Models 248

    15.5  Healthcare Epidemiological Models 263

    15.6  Additional Data-Centric Systems Dynamics Terms 264

    15.7  Annotation of the Application of the DSM 267

    18.1  Root Cause Analysis – Sample Questions 283

    20.1  DSM Metadata Category Descriptions 300

    20.2  DSM Metamodel Category Descriptions 301

    21.1  Model Categories 306

    21.2  Generalised Process Frame for the DSM 307

    21.4  Key to Identities Used in TAP Identification Outline 309

    22.1  Partial Classification of Model Viewpoints 323

    23.1  qCopter System Entities Descriptions 333

    23.2  qCopter System Organisations 335

    23.3  Summary of Autonomous Flight Roles 335

    23.4  Typical Air Traffic Control – Top Level Hazards 337

    23.5  An Initial Abstract Hierarchy (A-axis) 339

    23.6  qPilot Interfaces 340

    23.7  Initial qPilot Operational Context 341

    23.8  qPilot: Steps in a ‘day-in-the-life’ for a qCopter Flight Use Case 341

    23.9  qPilot: Data Instantiations within P-axis Entities for Element of Flight 342

    23.10  qPilot: Flight Steps 8 to 11 and Associated Data Instantiations 342

    23.11  qPilot P-TAP Entities Required to Execute Autonomous Flight 343

    23.12  V-TAPs identified for P-TAP P-Airspace 344

    23.13  D-TAPs Identified for P-TAP P-Airspace 345

    23.14  Airspace Enterprise Metamodel 348

    23.15  Extract of the Airspace Segment Metamodel 349

    23.16  Extract from the Flight Controller Metamodel 350

    23.17  Extract from the Airspace Metadata 350

    23.18  Extract from the Airspace Metadata 351

    23.19  Extract from qCopter #QC007 and #QC901 Flight Data 352

    23.20  Application of the DSM to Autonomous Flight Incident 353

    23.21  qPilot Interface Agreements 355

    23.22  Actor and Identity 356

    24.1  Typical People-related Safety Hazards 360

    24.2  Typical Top Level Hazards for Autonomous Asset Movements 361

    24.3  Shop Floor Value Proposition 366

    24.4  Organisational Metamodel Category Descriptions 368

    24.5  Examples of Metamodel Errors 370

    24.6  Examples of Metamodel Errors Exposed by ‘Change’ and Scale 370

    24.7  Examples of Errors in Acquired and Merged Systems 373

    24.8  Divestment Areas for Consideration 376

    24.9  Assessment of Divestment Threats 379

    24.10  Enterprise – Emergency Preparedness 385

    25.1  Domains of Healthcare Provision 391

    25.2  Desirable Properties of Healthcare Performance 392

    25.3  Healthcare Management Metamodels 393

    25.4  Initial Healthcare ‘Enterprise’ Metamodel 394

    25.5  Initial Healthcare ‘Organisational Unit’ Metamodel 395

    25.6  Typical Issues that can arise in the Healthcare Context 397

    25.7  ICP expressed in the Healthcare TAP DSM 403

    25.8  NHS Incident Classification 407

    25.9  NHS Incident Level 407

    Part I: Data-Centric Safety

    Outline

    1. Introduction

    2. System Safety Management

    3. Challenges to Systems Engineering

    1: Introduction

    Abstract

    This chapter provides an introduction to the overall content of the book. Many of the core terms employed in the book are defined, and the importance of data is explored. Data is employed extensively by systems, enterprises and organisations. Data is used to configure and characterise. Data is produced, passed across interfaces, stored, processed, transformed and consumed. Improvements in communications technologies allow the interconnection of systems into ever greater Systems of Systems. In these large-scale System-of-Systems domains, we manage their scale by creating a series of abstractions, such as metadata and metamodels. The chapter also identifies the core use of data as an element in an information processing chain involving the collection of the data, transformation into information to be internalised to provide knowledge that can be employed to interact effectively within a given context. The Semiotic and DIKW models are provided as examples.

    Keywords

    Definitions; Data centric organisations; Metadata; Human learning; Machine learning

    Data is becoming a more and more important element of modern life.

    Data is at the core of science. – Neil deGrasse Tyson

    Data is important: the contribution of data (1.2.1) to (safety) risks is real [227,229] and is beginning to be recognised [238,139,168]. What is more problematic is the lack of consensus, within the safety community, regarding the design, management and treatment of data consumed and produced by systems (1.5.5) with potential safety consequences.

    We acknowledge that data (like software) cannot directly harm (1.0.3) you, without an operational environment. We note that data requires an actor (1.0.1) to interpret and act on the data. However, access denial, delay or misdirected data services may contribute to harm (particularly where time is a factor). In a data-centric world, how might data constrain or enable individual actors to interact safely with their operational environment and how will omissions or data errors (2.4.2) influence safety?

    Definition 1.0.1

    Actor

    an individual, entity, or combination of product (1.5.4), people and process.

    The role of actor (1.0.1) will increasingly be undertaken by one or more Autonomous Agents (1.0.2).

    Definition 1.0.2

    Autonomous Agent

    (AA) an entity operating on the owner's behalf, as an actor (1.0.1) without interference from the ownership entity. Typically, these are products (1.5.4) that incorporate varying degrees of Artificial Intelligence (AI) and Machine Learning (ML) [241,391].

    AAs are often the controlling entity in Autonomous Systems (AS) (3.2.2) reliant on data products (3.6.2).

    Where's the harm?

    Definition 1.0.3

    Harm

    physical injury or damage to the health of people or damage to property or the environment [319].

    Technology continues to evolve, enabled primarily through infrastructural technologies (4.0.2) connecting an Information Systems domain dominated by services. Data is the common factor in the provision of these integrated services. Commercial entities have become dependent on this data, as without it they could not operate. The provision of data services has also evolved from client-server and multi-tier systems, through data warehousing, to enterprise and organisational data architectures. Data Safety (DS) continues to evolve as decision-making (autonomous) technologies mature. All these changes are taking place while data volumes are rising exponentially, in turn giving rise to the new disciplines of Data Science (3.6.1) and Data Engineering [638].

    Infrastructural technologies provide a way to share resources and information about resources. Shared resources require the use of identity (1.0.4) and therefore an identity model (3.4.2). It is desirable that these identities are unique (R.3.4.1) (within specified boundaries (2.1.12) and, where appropriate, the entire system). The use of identity requires consideration of access. Access should be controlled to create privileged areas, functions, applications, systems and groups of systems (including the data (1.2.1), metadata (7.8.3) and metamodels (7.8.2) they may contain). Access Control applies to all elements. As a result, identity and Access Control are a critical interface with Cyber Security (3.3.1) Management.

    Definition 1.0.4

    Identity

    A unique labelling of attributes of the object (system resource) being accessed and of the actor (1.0.1) requesting access in a given context (6.0.2).
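    As an illustration of this definition, an identity can be modelled as an immutable record whose uniqueness is enforced within a context boundary. The field names and registry mechanism are our own illustrative choices.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Identity:
    """A labelling of an actor or resource within a given context."""
    label: str
    context: str
    attributes: tuple = ()

def register(registry, identity):
    """Enforce uniqueness of identities within their context boundary."""
    key = (identity.context, identity.label)
    if key in registry:
        raise ValueError(
            f"duplicate identity {identity.label!r} in context {identity.context!r}")
    registry[key] = identity
    return registry
```

    Note that the same label may legitimately recur in different contexts; uniqueness is only required within the specified boundary.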

    Communications-enabled technologies also change organisational structures and the enterprises (1.5.1) that use them. Internet-based services are perhaps the most recognisable of these changes with the rise of online shopping. The execution of a retail website transaction includes an array of services on the website, from payment, goods selection and dispatch to confirmation of delivery, often across several commercial entities. Data is the common factor in the provision of these integrated services. These business entities have become data-centric, as without this (correct) data they could not operate. As a result, data errors and failures (2.4.5) become a significant feature of incidents (13.0.1). Therefore, data errors may be a part of a direct causal chain (2.1.16) and contribute to harm.

    Threat Identification and Risk Management

    A combination of factors and circumstances will be required to give rise to an incident with data as a contributing cause. One simple model to represent this is the bow-tie diagram [119]. Figure 1.1 represents the contribution of data errors, failures and malicious threat events to incidents. The left-hand side of the bow-tie diagram shows data errors, failures or malicious threat events. Furthermore, the diagram can represent the impact of data errors, failures or malicious threat events on the effectiveness of mitigation (2.2.3). Incident sequences (13.0.3), and therefore safety risk management, are significantly affected by data.

    Figure 1.1 Bow-tie Diagram
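    The bow-tie structure can be expressed as a small calculation: the top event occurs only if a threat defeats every preventive barrier, and each mitigation that still holds suppresses its consequence. This is a sketch with hypothetical barrier states and outcome labels, not a model taken from the text; it also shows how a data error that disables a barrier changes the residual outcomes.

```python
def residual_outcomes(threat_active, preventive, mitigations):
    """Evaluate one thread of a bow-tie diagram.

    Left side: the top event occurs if the threat is active and every
    preventive barrier (a bool: True = barrier holds) has failed.
    Right side: `mitigations` is a list of (mitigation_holds, outcome)
    pairs; a holding mitigation suppresses its outcome.
    """
    top_event = threat_active and all(not barrier for barrier in preventive)
    if not top_event:
        return []
    return [outcome for holds, outcome in mitigations if not holds]
```

    A data error on the left-hand side is modelled by flipping a preventive barrier to `False`; a data error on the right-hand side degrades a mitigation the same way, enlarging the set of residual outcomes.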

    System Safety Principles are incorporated into mature Safety Management Systems (SMS) (2.1.1). Similar risk management systems are used to address disciplines from asset management to enterprise and organisational (1.5.2) risk management. A greater reliance on data affects all these risk management frameworks.

    Where does all this data come from?

    Data has always been present. The volumes of data used in protection systems have been limited to data generated by dedicated sensors and exchanged through limited interfaces (8.4.1). Larger volumes of data are common in Air Traffic Control for the management of navigation data, flight planning and operations. Typically, these are closed systems (1.5.8).

    Infrastructural technologies and data ecosystems (4.0.1) enable the creation of architectures that employ highly adaptive applications that are data-dependent, if not data-centric. In parallel, developments in hardware and operating systems have allowed the creation of low-cost platforms that form the Internet of Things (IoT) (3.8.1). The ubiquity of the IoT has the potential to produce vast quantities of data in open systems (1.5.7) and environments. Often IoT devices exploit remote cloud data storage and Fog Computing [70].

    This storage and computing capability changes the nature and possible uses of data, as well as the potential impact of data errors. Data growth is exponential. Many datasets make reference to other data, often across organisational domains and system boundaries. Data volume growth provides a multiplier for the data references that it contains.

    What does this mean for Safety Management?

    In addressing Data-Centric Systems (DCS) (1.5.9), the Safety Principles embodied in many mature SMS remain unchanged, but it will be more challenging to marshal and control the resources required to create and maintain the integrity of systems. This is especially true where the capability to evolve rapidly using ML exists. The system becomes highly dynamic.

    The introduction of new technology is often associated with step change. Over time these technologies mature and the processes related to them become normalised, requiring the standardisation of components. Economic factors drive component inventories to minimum levels; such pressures encourage re-use and give rise to the requirements to reduce costs associated with change [230].

    The sheer range of applications of Computer Based Technologies (CBT) creates issues of scope and applicability. System safety management has its foundation in protection systems, typically fast-acting rule-based technologies. The implementation of configurable CBTs changes the risk profiles associated with this established domain. This is in part due to changes in Supervision, Optimisation and Control (SOC), creating more complex requirements for vertical and
