
Safety Driven Design

Margaret Stringfellow, MIT


Dr. Nancy G. Leveson, MIT
Dr. Brandon Owens, Berkeley

Complex Systems Research Laboratory

Outline
Safety Engineering and its application to Software
Safety Driven Design
The Process
Example: Martian Lander
Comparison to other Methods and Results

Software in Automotive
and Aerospace Systems
Lines of Code:
MER (Mars Rovers): 428,000
F-35 (Joint Strike Fighter): 5.7 million
Modern-day car: 100 million

How can we be sure the software is safe (that it will not
cause a loss event)?
Testing?
Probability of software failure?

Safety Engineering
Broad Definition of Safety
Loss events (accidents) can be:
A car that won't start because of a software error in the
computer (recall!)
A spacecraft that crashes into the surface of the planet

Hazard: System state that may permit an Accident


The purpose of Safety Engineering is to identify system
hazards and prevent systems from transitioning to an
unsafe (hazardous) state.

STAMP Accident Model


Systems-Theoretic Accident Model and Processes
(STAMP)

Accidents are not the last event in a chain of events


Accidents are the result of the inadequate control of
system state
Basic premise is to prevent accidents by enforcing safety
constraints on system behavior (controlling hazardous
system states)
Safety is viewed as a control problem, not a failure
problem

System Safety
System accidents:
Catastrophic outcome arising from interactions between
operating components
Each component functions within an acceptable
performance range, or in the context of an appropriate
objective

Safety is Emergent
Safety must be Built-in From the Beginning
Cheaper
More Effective

STAMP-Based Hazard Analysis (STPA)


Goals (same as any hazard analysis)
Identification of system hazards and related safety constraints
necessary to ensure acceptable risk
Accumulation of information about how hazards can occur.
Use info to eliminate, mitigate and control hazards in system
design, development, manufacturing, and operations

Controlling States
Since hazardous states can be prevented through
appropriate control (enforcing safety constraints), this
hazard analysis method seeks to find instances of
Inadequate Control
Inadequate control occurs when there are state transitions to
hazardous states
The commands or actions that lead to violation of safety
constraints:
Inadequate Control Actions

Inadequate Control Actions


Identify inadequate control actions
1. A required control action is not provided or not
followed
2. An incorrect or unsafe control action is provided
3. A potentially correct control action is provided too
late or too early (at the wrong time)
4. A correct control action is stopped too soon.
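The four categories above can serve as a checklist applied to each control action in turn. The following is a minimal Python sketch of that idea, not part of the original slides; the names `ICAType` and `candidate_icas` and the sample control action are assumptions made for illustration.

```python
from enum import Enum


class ICAType(Enum):
    """The four ways a control action can be inadequate, as listed above."""
    NOT_PROVIDED = "a required control action is not provided or not followed"
    UNSAFE = "an incorrect or unsafe control action is provided"
    WRONG_TIMING = "a potentially correct control action is provided too late or too early"
    STOPPED_TOO_SOON = "a correct control action is stopped too soon"


def candidate_icas(control_action):
    """Return one review prompt per ICA type for a single control action."""
    return [f"{control_action}: {ica.value}" for ica in ICAType]


# Hypothetical control action, used only for illustration.
prompts = candidate_icas("Engage descent control")
for prompt in prompts:
    print(prompt)
```

Each prompt is a question for the analyst to answer; the code only enumerates the cases and does not decide whether a given case is hazardous.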

Control Structure


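Control Flaws and Generic Control Loop

[Figure: generic control loop. The Controller (control algorithm and process model) sends commands to the Actuator(s), which act on the Controlled Process; Sensor(s) return feedback to the Controller. Control flaws marked on the loop:]
Control input: wrong or missing
Control algorithm: inadequate
Process model: wrong
Control commands: inadequate
Actuator(s): inadequate actuator operation
Process input: wrong or missing
Disturbances: unidentified or out of range
Process output: wrong or missing
Sensor(s): inadequate sensor operation
Feedback: wrong or missing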
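The flaw categories on the generic control loop can be treated as a lookup table that is walked once per inadequate control action. This is a rough Python sketch under that assumption; the dictionary keys paraphrase the figure labels, and `flaw_checklist` and the sample ICA wording are hypothetical.

```python
# Loop element -> flaw category, transcribed from the generic control loop.
CONTROL_FLAWS = {
    "control input": "wrong or missing",
    "control algorithm": "inadequate",
    "process model": "wrong",
    "control commands": "inadequate",
    "actuator operation": "inadequate",
    "process input": "wrong or missing",
    "disturbances": "unidentified or out of range",
    "process output": "wrong or missing",
    "sensor operation": "inadequate",
    "feedback": "wrong or missing",
}


def flaw_checklist(ica):
    """Return one question per loop element for a given inadequate control action."""
    return [
        f"Could '{ica}' be caused by {element} being {flaw}?"
        for element, flaw in CONTROL_FLAWS.items()
    ]


# Hypothetical ICA wording, for illustration only.
checklist = flaw_checklist("descent control is not engaged")
print(len(checklist))
print(checklist[0])
```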


How to Perform STPA


1. High-level Hazard Analysis:
Identify Accidents or Loss Events
Hazards
High-level Safety Constraints
2. Create and Analyze Control Structure to Identify
Inadequate Control Actions
3. Identify Control Flaws
In the design
4. Change design to eliminate, mitigate, or control
potentially unsafe control actions and behaviors.
Or accept
5. Iterate
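Steps 3 to 5 can be read as a worklist: each control flaw gets a response, and a design change may introduce new flaws that must be analyzed in turn. The sketch below illustrates that reading in Python. The function names, the response labels, and the example flaws are all assumptions for illustration, not the authors' tooling.

```python
from collections import deque

RESPONSES = {"eliminate", "mitigate", "control", "accept"}


def resolve_flaws(initial_flaws, choose_response, new_flaws_from):
    """Work through control flaws until none remain open (steps 3 to 5).

    choose_response(flaw) returns one of RESPONSES (step 4).
    new_flaws_from(flaw, response) returns flaws introduced by the change.
    """
    open_flaws = deque(initial_flaws)
    seen = set(initial_flaws)
    log = []
    while open_flaws:  # step 5: iterate
        flaw = open_flaws.popleft()
        response = choose_response(flaw)
        if response not in RESPONSES:
            raise ValueError(f"unknown response: {response}")
        log.append((flaw, response))
        for new_flaw in new_flaws_from(flaw, response):
            if new_flaw not in seen:  # avoid re-analyzing the same flaw
                seen.add(new_flaw)
                open_flaws.append(new_flaw)
    return log


# Hypothetical flaws and decisions, for illustration only.
def choose(flaw):
    return "accept" if flaw == "redundant inputs disagree" else "mitigate"


def follow_ups(flaw, response):
    if flaw == "controller receives no input":
        return ["redundant inputs disagree"]  # side effect of adding redundancy
    return []


log = resolve_flaws(["controller receives no input"], choose, follow_ups)
print(log)
```

The point of the sketch is the loop shape: a mitigation is itself a design change, so it goes back through the analysis rather than closing the item outright.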


Design For Safety


Goals: To get safety designed into the system rather than
added on at the end.
Most hazard analyses can only be applied to systems that
already exist.
FMEA
HAZOP
Design for Safety attempts to get safety considerations
made at the same time performance trades are made.
How? Use STPA to drive design decisions.

Process Overview

Identify and characterize the problem to be solved: system-level goals, loss events, hazards, safety constraints, and requirements.
Create design.
Use STPA (Inadequate Control Actions and Control Flaws) to analyze the high-level design and refine safety constraints, or change the design.
Iterate.


Characterize the Problem to be Solved


Simple Martian Lander Example:


System Characterization
Mission Goals
G1 Land on the surface of Mars and collect needed scientific
data.
G2 Transmit data back to Earth.


Loss Event, Hazard, Safety Constraints


Loss Event/Accident.1 Spacecraft experiences uncontrolled descent
into the surface of Mars and is consequently destroyed.

Hazard.1 Spacecraft comes in contact with the surface with an
impact force greater than 100 N.
SafetyConstraint.1 The spacecraft must control its descent to the
surface of Mars so that its impact force is less than 100 N.
SafetyConstraint.2 The spacecraft must be protected from impact
with the surface. Rationale: The spacecraft structure is susceptible
to damage even with gentle impacts and must have some type of
protection.
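The loss event, hazard, and safety constraints form a traceability chain that can be checked mechanically. The following Python sketch assumes a simple three-level data model; the class names, the `uncovered_hazards` helper, and the placeholder hazard are hypothetical, and the constraint texts are abbreviated from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class SafetyConstraint:
    ident: str
    text: str


@dataclass
class Hazard:
    ident: str
    text: str
    constraints: list = field(default_factory=list)


@dataclass
class LossEvent:
    ident: str
    text: str
    hazards: list = field(default_factory=list)


def uncovered_hazards(loss):
    """Return the ids of hazards that no safety constraint addresses."""
    return [h.ident for h in loss.hazards if not h.constraints]


hazard_1 = Hazard(
    "Hazard.1",
    "Spacecraft contacts the surface with an impact greater than 100 N.",
    [
        SafetyConstraint("SafetyConstraint.1", "Control descent so impact force is less than 100 N."),
        SafetyConstraint("SafetyConstraint.2", "Protect the spacecraft from impact with the surface."),
    ],
)
loss_1 = LossEvent("Accident.1", "Uncontrolled descent into the surface of Mars.", [hazard_1])
print(uncovered_hazards(loss_1))

# Hypothetical draft with an extra hazard that has no constraint yet.
draft = LossEvent("Accident.1", loss_1.text, [hazard_1, Hazard("Hazard.X", "placeholder hazard")])
print(uncovered_hazards(draft))
```

A check like this only confirms that each hazard has at least one constraint attached; whether the constraints are sufficient remains an engineering judgment.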


Mission Level Requirements:


The mission shall collect and analyze soil samples at XYZ
coordinates.
Rationale: Scientists believe this location may contain ice, and
discovering the presence of water on Mars is of great interest.

Customer-derived system design constraints


DC1. The mission must be carried out with existing technologies and
space exploration infrastructures as needed (i.e., technologies rated
at Technology Readiness Level TBD as defined by NASA).
Rationale: While technology development is expected to be an
ongoing activity of NASA, it is assumed to be beyond the mandate of
the mission.

Customer programmatic constraints (e.g., budgets)

High Level Design


Design High-Level System Control Structure


Create High-level Design to Enforce Safety Constraints
SafetyConstraint.1: The spacecraft must control its
descent to the surface of Mars so that its impact force
is less than 100N.
Design Decision 1: Use Thrusters to Control Descent
rate of Spacecraft.


STPA


Perform 1st Iteration of STPA


(How can constraints be violated?)
SafetyConstraint.1: The spacecraft must control its
descent to the surface of Mars so that its impact force
is less than 100 N.
ICA.1 Spacecraft descent control is not engaged.
ICA.2 Spacecraft descent control allows descent
velocities that are too fast.
ICA.3 Spacecraft descent control is activated too
late.
ICA.4 Spacecraft descent control is de-activated too
soon.

Perform 1st Iteration of STPA


(Control Flaws in the Design?)
SC: The spacecraft must control its descent to the surface
of Mars so that its impact force is less than 100N.
ICA: Spacecraft descent control is not engaged.
CF 1: The spacecraft controller does
not receive an initial descent control input.
CF 2: The spacecraft controller does not
command the descent actuators to activate.


Use STPA to
Refine Constraints or Change Design
Create a new safety constraint, modify the related
safety constraint, or refine the related
safety constraint to better enforce control.
Create new design or modify existing design to
eliminate, prevent or mitigate the effect
of the control flaw
Accept the design as is and record the rationale
for doing so.
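The three options above suggest a small record type in which every control flaw ends up with exactly one resolution and a recorded justification. This is a minimal Python sketch of such a record; the `Resolution` class, the option labels, and the example design change are assumptions for illustration and do not come from the slides.

```python
from dataclasses import dataclass

OPTIONS = ("refine_constraint", "change_design", "accept")


@dataclass(frozen=True)
class Resolution:
    control_flaw: str
    option: str  # one of OPTIONS
    detail: str  # constraint text, design change, or rationale for acceptance

    def __post_init__(self):
        if self.option not in OPTIONS:
            raise ValueError(f"option must be one of {OPTIONS}")
        if not self.detail.strip():
            raise ValueError("a resolution needs a recorded detail or rationale")


# Hypothetical resolution of CF 1; the design change is invented for illustration.
r = Resolution(
    control_flaw="Controller does not receive an initial descent control input",
    option="change_design",
    detail="Add a second, independent source for the descent control input",
)
print(r.option)

try:
    Resolution("some control flaw", "accept", "")
except ValueError as err:
    print("rejected:", err)
```

Rejecting an empty `detail` mirrors the requirement that accepting a design as is must come with a recorded rationale.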


Iterate STPA Analysis and Design Process on Lower-level Components
For each Subsystem Define:
Goals
Requirements
Safety Constraints
until design is set and all hazards are
eliminated, mitigated or controlled.
May result in changes to any part of the specification.


Comparisons and Results


STPA Comparison with Traditional HA Techniques
Top-down (vs. bottom-up like FMECA)
Considers more than just component failure and failure events
(includes these but more general)
Guidance in doing analysis (vs. FTA)
Handles dysfunctional interactions and system accidents,
software, management, etc.


STPA Comparisons (2)


Concrete model (not just in head)
Not physical structure (HAZOP) but control (functional)
structure
General model of inadequate control (based on control
theory)
HAZOP guidewords based on model of accidents being
caused by deviations in system variables
Includes HAZOP model but more general

Compared with TCAS II Fault Tree (MITRE)


STPA results more comprehensive
Included the Überlingen accident


Ballistic Missile Defense System (BMDS)


Non-Advocate Safety Assessment using STPA
A layered defense to defeat all ranges of threats in all
phases of flight (boost, mid-course, and terminal)
Made up of many existing systems (BMDS Elements):
Early warning radars
Aegis
Ground-Based Midcourse Defense (GMD)
Command and Control Battle Management and
Communications (C2BMC)
Others

MDA used STPA to evaluate the residual safety risk of
inadvertent launch prior to deployment and test

Results
Deployment and testing were held up for 6 months because so
many scenarios were identified for inadvertent launch. In many of
these scenarios:
All components were operating exactly as intended
Complexity of component interactions led to unanticipated
system behavior

STPA also identified component failures that could cause
inadequate control (most analysis techniques consider only
these failure events)
As changes are made to the system, the differences are
assessed by updating the control structure diagrams and
assessment analysis templates.
Adopted as primary safety approach for BMDS

Thank you

Questions?

