Structured Investigation of Digital Incidents in Complex Computing Environments

Peter Reynolds Stephenson
School of Technology, Oxford Brookes University
Thesis submitted in partial fulfilment of the requirements of the award of Doctor of Philosophy as awarded by Oxford Brookes University
October, 2004
DEDICATION
This effort requires more than a single dedication. First and foremost, I dedicate this to my wife Deborah who, for nearly 30 years, has believed that I can do more than I have often believed myself. Her tolerance for my endless days on the road, interspersed with time spent in my office or lab, has been truly remarkable. Her support for this work is as much a credit to her as to any effort I may have made over the past four years of research. However, there is an underlying motivator for the research that culminates in this thesis: the digital forensic community. Digital forensic science is young and, like most youth, is struggling to find itself. Therefore, I further dedicate this thesis and the research leading up to it to the digital forensic community at large, and I charge you, as forensic scientists and practitioners, to develop reliable methods and a rigorous scientific approach to our discipline.
CONTENTS
DEDICATION
CONTENTS ..... I
ILLUSTRATIONS ..... XVI
ABSTRACT ..... XVIII
1. INTRODUCTION ..... 1
1.1 BACKGROUND ..... 1
1.2 PROBLEM STATEMENT ..... 3
1.2.1 Scientific Rigour in Forensic Digital Analysis ..... 5
1.3 THESIS STATEMENT ..... 6
1.4 MOTIVATIONS ..... 6
1.4.1 Selected Approach ..... 8
1.4.1.1 General Research Plan ..... 8
1.4.1.1.1 Selecting a Framework ..... 9
1.4.1.1.2 Developing Process Definition and Description ..... 10
1.4.1.1.3 Formalization and Modelling ..... 12
1.4.1.1.4 Selection of Tools and Techniques ..... 12
1.5 CONTRIBUTIONS ..... 13
1.6 TERMINOLOGY ..... 14
1.6.1 DEFINITION 1: Primary evidence ..... 14
1.6.2 DEFINITION 2: Secondary evidence ..... 14
1.6.3 DEFINITION 3: Digital forensic science ..... 15
1.6.4 DEFINITION 4: Forensic digital evidence collection ..... 15
1.6.5 DEFINITION 5: Digital forensic correlation ..... 15
1.6.6 DEFINITION 6: Digital forensic normalization ..... 15
1.6.7 DEFINITION 7: Digital forensic deconfliction ..... 16
1.6.8 DEFINITION 8: Digital forensic data fusion ..... 16
1.6.9 DEFINITION 9: Digital media ..... 16
1.6.10 DEFINITION 10: Digital investigative process ..... 16
1.7 ACKNOWLEDGEMENTS ..... 16
2. RELATED WORK ..... 19
2.1 INTRODUCTION ..... 19
2.2 FORENSIC DIGITAL ANALYSIS ..... 19
2.2.1 Forensic Computer Analysis ..... 20
2.2.2 Forensic Network Analysis ..... 20
2.3 FORMAL ANALYSIS OF FORENSIC EVENTS ..... 22
2.3.1 Formal Methods ..... 22
2.3.1.1 Communicating Sequential Processes (CSP) ..... 23
2.3.1.2 Coloured Petri Nets (CPN) ..... 23
2.3.1.3 Lambda Calculus ..... 24
2.3.2 Investigative Process ..... 25
2.3.3 Synthesis Between Process and Formalization ..... 25
2.4 RELATIONSHIPS TO OTHER APPROACHES ..... 31
2.4.1 Existing Standards ..... 31
2.4.1.1 ISO9000 Series ..... 32
2.4.1.2 ISO15408 ..... 33
2.4.1.3 ISO17025 ..... 34
2.4.1.4 Fraud Investigation Techniques ..... 35
2.4.2 Best Practices ..... 35
3. STRUCTURED DIGITAL INVESTIGATION ..... 36
3.1 INTRODUCTION ..... 36
3.2 UNDERLYING DIGITAL INVESTIGATIVE PROCESS FRAMEWORKS ..... 37
3.2.1 The EEDI Domain Concept ..... 39
3.2.2 Applying the DFRWS Framework to EEDI ..... 40
3.2.2.1 The Investigative Narrative ..... 42
3.2.2.2 DIPL Characterization ..... 43
3.2.2.3 CPN Modelling ..... 44
3.2.2.4 The DFRWS Framework Classes ..... 45
3.2.2.4.1 The Identification Class ..... 45
3.2.2.4.2 The Preservation Class ..... 47
3.2.2.4.3 The Collection Class ..... 49
3.2.2.4.4 The Examination Class ..... 52
3.2.2.4.5 The Analysis Class ..... 55
3.2.2.4.6 The Presentation Class ..... 55
3.3 THE EEDI PROCESS ..... 56
3.3.1 Collecting Evidence ..... 56
3.3.2 Analysis of Individual Events ..... 57
3.3.2.1 Preliminary Correlation ..... 57
3.3.2.1.1 Event Normalizing ..... 57
3.3.2.1.2 Event Deconfliction ..... 58
3.3.2.1.3 Second Level Correlation ..... 58
3.3.2.1.4 Timeline Analysis ..... 58
3.3.2.1.5 Chain of Evidence Construction ..... 58
3.3.2.1.6 Corroboration ..... 58
4. DIGITAL INVESTIGATION PROCESS LANGUAGE (DIPL) ..... 60
4.1 INTRODUCTION ..... 60
4.1.1 DIPL Design Objectives ..... 60
4.1.1.1 Justification of the DIPL Design Objectives ..... 61
4.1.1.2 Testing the DIPL Design Objectives ..... 61
4.2 DIPL - A DIGITAL INVESTIGATION PROCESS LANGUAGE ..... 61
4.2.1 S-Expressions and DIPL ..... 62
4.2.1.1 The Concepts of S-Expressions and Semantic Identifiers ..... 62
4.2.1.2 Data Typing in DIPL ..... 63
4.2.1.2.1 SID Classes ..... 64
4.2.1.2.2 Data Types ..... 64
4.2.1.3 General DIPL Constructs ..... 66
4.2.2 SID Descriptions ..... 68
4.2.2.1 Verb SIDs ..... 68
4.2.2.2 Role SIDs ..... 69
4.2.2.3 Adverb SIDs ..... 69
4.2.2.4 Attribute SIDs ..... 69
4.2.2.5 Atom SIDs ..... 69
4.2.2.6 Conjunction SIDs ..... 69
4.2.2.7 Referent SIDs ..... 70
4.2.2.8 SID Worlds and the World SID ..... 71
4.3 A TOP-LEVEL DIPL INVESTIGATION REFERENCE MODEL ..... 71
4.3.1 The Identification Class ..... 72
4.3.2 The Preservation Class ..... 74
4.3.3 The Collection Class ..... 76
4.3.4 The Examination and Analysis Classes ..... 78
4.3.5 The Presentation Class ..... 79
5. MODELLING OF DIGITAL INVESTIGATIVE AND FORENSIC PROCESSES ..... 80
5.1 INTRODUCTION ..... 80
5.2 NOTATION AND MATHEMATICAL PROCESSES ..... 82
5.2.1 Logic and Sets ..... 82
5.2.1.1 Special Notation ..... 84
5.2.1.1.1 The Symbol ..... 84
5.2.1.1.2 The Function f( ) ..... 86
5.2.1.1.3 From Initiator to Target $(\overrightarrow{i, t_1, \ldots, t_n})$ ..... 88
5.2.1.1.4 Implies, is the same as, and If and Only If (iff) ..... 89
5.2.2 Example Mathematical Definitions ..... 89
5.2.2.1 Definition 1.1 - Potentially Successful Attack Process ..... 89
5.2.2.2 Definition 1.2 - Authentication of a Block of Digital Data ..... 90
5.2.2.3 Definition 1.3 - Acquisition of Proxy ..... 91
5.2.3 Coloured Petri Nets ..... 91
5.2.3.1 Dynamic Properties of Coloured Petri Nets ..... 94
5.2.3.1.1 Boundedness ..... 95
5.2.3.1.2 Liveness ..... 95
5.3 A PENETRATION ATTACK ..... 95
5.3.1 DIPL Characterization ..... 96
5.3.2 Mathematical Analysis ..... 96
5.3.3 Petri Net Model ..... 99
5.4 A MORE COMPLEX EXAMPLE ATTACK ..... 102
5.4.1 DIPL Characterization ..... 102
5.4.2 Mathematical Analysis ..... 103
5.4.3 Petri Net Model ..... 106
5.5 EXAMPLE OF THE USE OF MODELLING IN INCIDENT POST MORTEMS ..... 109
5.5.1 Simple Petri Net Interpretation of an Incident Post Mortem ..... 110
6. VALIDATION OF RESEARCH RESULTS ..... 115
6.1 INTRODUCTION AND DISCUSSION OF VALIDATION APPROACH ..... 116
6.2 FORMAL VALIDATION OF SAMPLE INVESTIGATIONS ..... 117
6.3 VALIDATION OF SELECTED SIDS ..... 118
6.4 PRACTITIONER EVALUATIONS ..... 118
6.5 ON THE USE OF GRAPHICS IN COURTS OF LAW ..... 119
6.6 COMPARISON WITH OTHER VALIDATION APPROACHES ..... 120
6.7 COMPARISON WITH INVESTIGATIVE PROCESS MODELS ..... 122
7. SUMMARY, CONCLUSIONS AND FUTURE DIRECTIONS ..... 124
7.1 CONCLUSIONS ..... 124
7.2 ADVANTAGES AND DISADVANTAGES OF THE PROPOSED APPROACH ..... 125
7.3 SUMMARY OF MAIN CONTRIBUTIONS ..... 127
7.3.1 A Digital Investigative Process ..... 127
7.3.2 A Process Language ..... 128
7.3.3 Mathematical Modelling ..... 128
7.3.4 A Presentation Approach ..... 129
7.4 FUTURE DIRECTIONS ..... 129
8. REFERENCES ..... 132
8.1 WORKS CITED WITHIN THIS THESIS ..... 132
8.2 ADDITIONAL PUBLICATIONS BY THE AUTHOR ..... 137
APPENDIX 1 - DIPL SEMANTIC IDENTIFIER (SID) LISTING ..... 140
VERB SIDS ..... 140
File Verb SIDs ..... 140
Name: Copy ..... 140
Name: Move ..... 141
Name: Delete ..... 142
Process Verb SIDs ..... 142
Name: Execute ..... 142
Name: Suspend ..... 143
Name: Resume ..... 144
Name: Terminate ..... 145
Forensic Identification Verb SIDs ..... 145
Name: ResolveSignature ..... 145
Name: DetectProfile ..... 146
Name: DetectAnomaly ..... 147
Name: MonitorSystem ..... 148
Forensic Preservation Verb SIDs ..... 149
Name: ImageUsing ..... 149
Name: SynchronizeTime ..... 150
Forensic Collection Verb SIDs ..... 150
Name: CollectData ..... 150
Name: SampleData ..... 151
Name: ReduceData ..... 151
Name: RecoverData ..... 152
Forensic Examination Verb SIDs ..... 153
Name: FilterData ..... 153
Name: MatchPattern ..... 153
Name: DiscoverData ..... 154
Name: ExtractData ..... 155
Forensic Analysis Verb SIDs ..... 155
Name: ConstructTimeline ..... 155
Investigation Identification Verb SIDs ..... 156
Name: DetectEvent ..... 156
Name: ReceiveComplaint ..... 156
Investigation Preservation Verb SIDs ..... 157
Name: ManageCase ..... 157
Investigation Collection Verb SIDs ..... 158
Name: TraceAuthority ..... 158
Name: ConductInterview ..... 158
Investigation Presentation Verb SIDs ..... 159
Name: Clarify ..... 159
Name: UseStatistics ..... 159
Name: CreateReport ..... 160
Host Status Verb SIDs ..... 161
Name: Reboot ..... 161
Name: ShutDown ..... 161
Name: Boot ..... 162
TCP Connection Verb SIDs ..... 163
Name: TCPConnect ..... 163
HTTP Verb SIDs ..... 164
Name: HTTPPost ..... 164
Name: HTTPGet ..... 165
Application Session Verb SIDs ..... 166
Name: OpenApplicationSession ..... 166
Name: CloseApplicationSession ..... 167
Name: Login ..... 167
Name: OpenFTP ..... 168
Name: SendMail ..... 169
State Assertion Verb SIDs ..... 170
Name: ObserveState ..... 170
Name: ChangeState ..... 171
Authorization and Policy Verb SIDs ..... 171
Name: AcquireProxy ..... 171
Name: ReleaseProxy ..... 172
Name: Request ..... 173
Name: Require ..... 174
Name: Allow ..... 175
Name: Forbid ..... 176
Auditing Verb SIDs ..... 177
Name: AuditAccount ..... 177
Name: TraceMessage ..... 178
Analysis Verb SIDs ..... 179
Name: Attack ..... 179
Command Verb SIDs ..... 180
Name: Do ..... 180
Name: Did ..... 181
Name: Authenticate ..... 181
ROLE SIDS ..... 182
General Purpose Role SIDs ..... 182
Name: Initiator ..... 182
Name: Observer ..... 183
File-Related Role SIDs ................................................................................................................184

Name: FileSource..............................................................................................................................184 Name: FileDestination ......................................................................................................................185
Process-Related Role SIDs ............................................................................................................185

Name: Process...................................................................................................................................185 Name: Tool .......................................................................................................................................186
User-Related Role SIDs ...............................................................................................................187

Name: Account .................................................................................................................................187 Name: Proxy .....................................................................................................................................188
Forensic Preservation Role SIDs.....................................................................................................189

Name: Data .......................................................................................................................................189
Forensic Collection Role SIDs........................................................................................................189

Name: ApprovedMethod...................................................................................................................190 Name: ApprovedSoftware.................................................................................................................190 Name: ApprovedHardware ...............................................................................................................191
Forensic Examination Role SIDs...................................................................................................192

Name: Validation ..............................................................................................................................192
Forensic Analysis Role SIDs.........................................................................................................193

Name: Link .......................................................................................................................................193
Investigation Collection Role SIDs ..................................................................................................194

Name: Citation ..................................................................................................................................194 Name: Certification...........................................................................................................................194 Name: Policy.....................................................................................................................................195
Investigation Analysis Role SIDs ...................................................................................................195

Name: Subject...................................................................................................................................195 Name: Suspect ..................................................................................................................................196
Investigation Presentation Role SIDs ...............................................................................................196

Name: Expert ....................................................................................................................................196 Name: Countermeasure.....................................................................................................................197
Network-Related Role SIDs..........................................................................................................198
Name: Receiver.................................................................................................................................198
Messaging-Related Role SIDs ........................................................................................................199

Name: Message.................................................................................................................................199 Name: MailMessage .........................................................................................................................199
State-Related Role SIDs...............................................................................................................200

Name: OldState.................................................................................................................................200 Name: CurrentState...........................................................................................................................200 Name: Machine.................................................................................................................................201
Analysis-Related Role SIDs..........................................................................................................202

Name: AttackSpecifics......................................................................................................................202 Name: Target ....................................................................................................................................202
Auditing Role SIDs ....................................................................................................................203

Name: AttackPath .............................................................................................................................203 Name: TracePath...............................................................................................................................204 Name: ReceivedVia ..........................................................................................................................204 Name: ReceivedFrom .......................................................................................................................204
ADVERB SIDS ..............................................................................................................................205

Name: Outcome ................................................................................................................................205 Name: When .....................................................................................................................................206
ATTRIBUTE SIDS .........................................................................................................................206

Name: Owner ....................................................................................................................................206 Name: Certifier .................................................................................................................................207 Name: Developer ..............................................................................................................................207
ATOM SIDS..................................................................................................................................208 General Purpose Atom SIDs.........................................................................................................208

Name: Comment ...............................................................................................................................208 Name: World.....................................................................................................................................208 Name: Multiplier...............................................................................................................................208
File Descriptor Atom SIDs...........................................................................................................209

Name: FileName ...............................................................................................................................209
Name: FullFileName.........................................................................................................................209 Name: ByteSize ................................................................................................................................209 Name: TimeCreated ..........................................................................................................................209 Name: TimeModified........................................................................................................................210 Name: TimeAccessed .......................................................................................................................210 Name: DirectoryName ......................................................................................................................210 Name: FullDirectoryName................................................................................................................210 Name: AccessPermission ..................................................................................................................211
Program Descriptor Atom SIDs.....................................................................................................211

Name: ProgramName........................................................................................................................211 Name: VersionNumber .....................................................................................................................211 Name: LanguageName......................................................................................................................212
Process Descriptor Atom SIDs.......................................................................................................212

Name: ProcessID ..............................................................................................................................212 Name: ProcessName .........................................................................................................................212 Name: ProcessStatus .........................................................................................................................212
User Descriptor Atom SIDs..........................................................................................................213

Name: UserName..............................................................................................................................213 Name: RealName ..............................................................................................................................213 Name: UserID ...................................................................................................................................213 Name: GroupName ...........................................................................................................................214 Name: GroupID ................................................................................................................................214 Name: EMailAddress........................................................................................................................214
Forensic Collection Atom SIDs......................................................................................................214

Name: LosslessCompression ............................................................................................................214 Name: Hash.......................................................................................................................................215 Name: VirusSignature.......................................................................................................................215 Name: VolumeID..............................................................................................................................215 Name: DiskID ...................................................................................................................................215 Name: MediaID ................................................................................................................................215
Name: Device....................................................................................................................................216 Name: ProcedureName .....................................................................................................................216 Name: CourtDecision........................................................................................................................216 Name: Jurisdiction ............................................................................................................................216
Forensic Analysis Atom SIDs.......................................................................................................217

Name: StatisticalMethod...................................................................................................................217
Investigation Preservation Atom SIDs .............................................................................................217

Name: ChainOfCustody ....................................................................................................................217 Name: CaseName..............................................................................................................................217 Name: EvidenceID............................................................................................................................217 Name: CertType................................................................................................................................218 Name: CertNumber ...........................................................................................................................218 Name: PolicyName ...........................................................................................................................218 Name: PolicyDate .............................................................................................................................218 Name: BackupImageType.................................................................................................................218 Name: ImageType.............................................................................................................................219 Name: CaseNotes..............................................................................................................................219
Investigation Presentation Atom SIDs .............................................................................................219

Name: Document ..............................................................................................................................219 Name: MediaID ................................................................................................................................219 Name: MissionImpactStatement .......................................................................................................220
Time Descriptor Atom SIDs .........................................................................................................220

Name: Time ......................................................................................................................................220 Name: BeginTime .............................................................................................................................220 Name: EndTime ................................................................................................................................220 Name: Duration.................................................................................................................................221
Host Descriptor Atom SIDs ........................................................................................................221

Name: HostName..............................................................................................................................221 Name: FQHostName.........................................................................................................................221 Name: IPv4Address ..........................................................................................................................222
Name: ArchitectureName..................................................................................................................222 Name: OSName ................................................................................................................................222
Domain Descriptor Atom SIDs.....................................................................................................222

Name: DomainName.........................................................................................................................222 Name: FQDomainName ...................................................................................................................223 Name: IPv4Mask ..............................................................................................................................223
Protocol Descriptor Atom SIDs .....................................................................................................223

Name: DataLinkProtocol ..................................................................................................................223 Name: StandardTCPPort...................................................................................................................224 Name: StandardUDPPort ..................................................................................................................224
Ethernet Header Atom SIDs ........................................................................................................224

Name: EtherAddress .........................................................................................................................224
IPv4 Header Atom SIDs .............................................................................................................225

Name: SourceIPv4Address ...............................................................................................................225 Name: DestinationIPv4Address ........................................................................................................225 Name: SourceIPv4Mask....................................................................................................................225 Name: DestinationIPv4Mask ............................................................................................................225
ICMP Descriptor Atom SIDs.......................................................................................................225

Name: ICMPType .............................................................................................................................226
TCP Header Atom SIDs.............................................................................................................226

Name: TCPPort.................................................................................................................................226 Name: TCPSourcePort......................................................................................................................226 Name: TCPDestinationPort...............................................................................................................226 Name: TCPPortRange.......................................................................................................................226 Name: TCPSourcePortRange............................................................................................................227 Name: TCPDestinationPortRange.....................................................................................................227
UDP Header Atom SIDs............................................................................................................227

Name: UDPPort ................................................................................................................................227 Name: UDPSourcePort .....................................................................................................................227 Name: UDPDestinationPort ..............................................................................................................228
Name: UDPPortRange ......................................................................................................................228 Name: UDPSourcePortRange ...........................................................................................................228 Name: UDPDestinationPortRange ....................................................................................................228 Name: UDPLength............................................................................................................................229
Mail Descriptor Atom SIDs .........................................................................................................229

Name: MailMessageID .....................................................................................................................229 Name: ByteSize ................................................................................................................................229 Name: ContentType ..........................................................................................................................229
HTTP Descriptor Atom SIDs ......................................................................................................230

Name: URL.......................................................................................................................................230
Authorization Descriptor Atom SIDs.............................................................................................230

Name: ACL.......................................................................................................................................230
Statistics Atom SIDs...................................................................................................................230

Name: TotalCount.............................................................................................................................230
Attack Descriptor Atom SIDs ......................................................................................................231

Name: IPv4Path ................................................................................................................................231 Name: AttackNickname ....................................................................................................................231 Penetration ...................................................................................................................................231 Denial of Service .........................................................................................................................234 Unusual Access............................................................................................................................235 Flooding.......................................................................................................................................236 Probe............................................................................................................................................236
Alert Descriptor Atom SIDs.........................................................................................................237

Name: Certainty ................................................................................................................................237 Name: Severity..................................................................................................................................237
Outcome or Status Descriptor Atom SIDs .......................................................................................238

Name: ReturnCode............................................................................................................................238 Name: TCPConnectionStatus............................................................................................................238
Conjunction SIDs........................................................................................................................239
Name: And........................................................................................................................................239
Name: HelpedCause..........................................................................................................................239 Name: ByMeansOf ...........................................................................................................................240
Referent SIDs.............................................................................................................................240
Name: ReferAs..................................................................................................................................240 Name: ReferTo..................................................................................................................................240
APPENDIX 2 CASE STUDY EXAMPLE FRAGMENTS......................................................241 A.2.1 AN EXAMPLE INCIDENT POST MORTEM FRAGMENT .........................................................241 A.2.1.1 Introduction...............................................................................................................241 A.2.1.2 DIPL SID Listing ......................................................................................................241
A.2.1.2.1 Identification ....................................................................................................................243 A.2.1.2.2 Preservation......................................................................................................................245 A.2.1.2.3 Collection .........................................................................................................................246
A.2.2 AN EXAMPLE INCIDENT INVESTIGATION FRAGMENT ........................................................249 A.2.2.1 Introduction...............................................................................................................249 A.2.2.2 Incident Background ..................................................................................................250
A.2.2.2.1 Investigative Narrative ......................................................................................................251 A.2.2.2.2 DIPL Characterization.......................................................................................................254 A.2.2.2.3 Selected Modelling............................................................................................................264
ILLUSTRATIONS
FIGURE 1 - RELATIONSHIP OF THE EEDI DOMAINS TO THE DFRWS FRAMEWORK .......... 40
FIGURE 2 - THE DFRWS INVESTIGATION FRAMEWORK MATRIX .......... 41
FIGURE 3 - THE GENERALIZED EEDI PROCESS FLOW .......... 42
FIGURE 4 - DIPL CODE FRAGMENT .......... 44
FIGURE 5 - DIPL DATA TYPES .......... 65
FIGURE 6 - DIPL LISTING OF A TOP LEVEL REFERENCE MODEL FOR THE IDENTIFICATION CLASS .......... 74
FIGURE 7 - DIPL LISTING OF A TOP LEVEL REFERENCE MODEL FOR THE PRESERVATION CLASS .......... 75
FIGURE 8 - DIPL LISTING OF A TOP LEVEL REFERENCE MODEL FOR THE COLLECTION CLASS .......... 78
FIGURE 9 - DIPL LISTING OF A TOP LEVEL REFERENCE MODEL FOR THE EXAMINATION AND ANALYSIS CLASSES .......... 79
FIGURE 10 - DIPL LISTING FOR A POTENTIALLY SUCCESSFUL PENETRATION ATTACK .......... 96
FIGURE 11 - COLORED PETRI NET DESCRIBING A SIMPLE PENETRATION ATTACK (DESIGN/CPN) .......... 99
FIGURE 12 - COLORED PETRI NET SHOWING THE TOKEN ARMING THE TRANSITION .......... 100
FIGURE 13 - COLORED PETRI NET SHOWING THE TOKEN IN THE OUTPUT PLACE .......... 101
FIGURE 14 - DIPL LISTING FOR A SUCCESSFUL DENIAL OF SERVICE ATTACK .......... 103
FIGURE 15 - COLORED PETRI NET DESCRIBING A DENIAL OF SERVICE ATTACK .......... 107
FIGURE 16 - COLORED PETRI NET SHOWING TOKEN READY FOR ATTACK .......... 108
FIGURE 17 - COLORED PETRI NET SHOWING TOKEN AT TARGET - ATTACK SUCCESSFUL .......... 108
FIGURE 18 - CPN1 COLORED PETRI NET FOR SQLSLAMMER SIMULATION: COUNTERMEASURES IN PLACE .......... 110
FIGURE 19 - CPN2 COLORED PETRI NET FOR SQLSLAMMER SIMULATION: COUNTERMEASURES FAIL .......... 113
FIGURE 20 - VALIDATION RESULTS .......... 117
FIGURE 21 - GENERAL FORMAT OF DIPL SID LISTING .......... 242
FIGURE 22 - DIPL LISTING OF OBSERVED STATE CHANGE IN VICTIM NETWORK .......... 244
FIGURE 23 - DIPL LISTING FOR CAPTURE OF WORM PACKET BY REMOTE SITE .......... 244
FIGURE 24 - CASE MANAGEMENT DIPL TEMPLATE .......... 246
FIGURE 25 - DIPL LISTING FOR THE VERIFICATION OF POLICY .......... 248
FIGURE 26 - SUCCESSFUL SQLSLAMMER ATTACK AGAINST EXAMPLE NET WITH FAILED COUNTERMEASURES .......... 249
FIGURE 27 - GENERALIZED EEDI PROCESS FLOW .......... 250
FIGURE 28 - DIPL LISTING OF EXAMPLE INVESTIGATION .......... 264
FIGURE 29 - CPNET FOR THE INVESTIGATION IN A.2.2 IN PRE-SET STATE .......... 265
FIGURE 30 - CPNET FOR THE INVESTIGATION IN A.2.2 IN SUCCESSFUL POST-SET STATE .......... 266
FIGURE 31 - DECLARATIONS FOR THE CPNETS IN FIGURES 28 AND 29 .......... 267
ABSTRACT
Stephenson, Peter. October, 2004. Structured Investigation of Digital Incidents in Complex Computing Environments. This thesis introduces a novel approach to the digital investigative process (DIP) for investigating and analyzing complex computer-related security incidents. The pre-eminent challenge, in such investigations, is the creation of a corroborated chain of digital evidence, collected, analyzed and managed with scientific rigor to a set of appropriate and consistent standards. We first describe a digital investigative structure that meets the criteria for a digital investigation framework with a rigorous method of process definition and verification. This structure considers platforms and networks generically, allowing specific procedures to be developed within the framework for the collection, analysis and management of digital evidence within a particular environment. We then introduce a process language for describing the digital investigative process rigorously. This process language provides a structure that may be proven formally. Finally, we introduce a method for submitting an investigation, described in detail by the process language, to a formal model of the investigative process for verification. The investigation may be tested against investigative process standards, such as those derived by the digital investigative community and the courts, for rigor and compliance. The model is modular, allowing individual modules, already court-tested, to be reused in new investigations.
1. INTRODUCTION
1.1 Background
Forensic digital analysis is the analysis of digital data such that information regarding its nature, contents and origin may be presented reliably at a court of enquiry. The Digital Forensics Research Workshop (DFRWS) defines digital forensic science as:
"The use of scientifically derived and proven methods toward the preservation, collection, validation, identification, analysis, interpretation, documentation and presentation of digital evidence derived from digital sources for the purpose of facilitating or furthering the reconstruction of events found to be criminal, or helping to anticipate unauthorized actions shown to be disruptive to planned operations." [DFRW01]
However, this definition itself requires some scrutiny. The notions of preservation, collection, validation, identification, analysis, interpretation, documentation and presentation of digital evidence seem to relate to a process that goes beyond the customary notion of forensic science. If we accept James and Nordby's definition of forensic science [JN01] (see footnote 1 below) as applied to the digital domain, we find that we really are concerned only with collection, analysis, interpretation and presentation of digital evidence. The rest of the DFRWS definition seems to apply more to the investigative process of discovery, evidence management and reporting of the results of the investigation. This notion is borne out by the International Organization on Computer Evidence (IOCE) [IOC00]. It is exactly this type of fuzzy definition that we seek to clarify and distil into a coherent process of digital investigation using the tools of digital forensic science as part of that process. Thus, we must focus at least part of our attention upon digital forensics. Bearing the DFRWS definition in mind, however, we show a process for using scientifically derived and proven methods toward conducting a digital investigation.
The process framework begins with the structuring of the digital investigative process (DIP) after the DFRWS recommendations [DFRW01] and extends that process to a semiformal process language and, finally, to a formal mathematical model. It is important to recognize that there is, today, the beginning of a divergence between the conduct of digital investigations and the analysis of digital evidence using digital forensic science. In this thesis we must, necessarily, treat both of these components because the interrelationships between them are strong. In the unpublished draft report of the Digital Evidence Certification Roundtable held 5 and 6 May 2004 at the National Center for Forensic Science (NCFS), chaired by Carrie Whitcomb (Director, NCFS) and Mark Pollitt (FBI, ret.), we see that "There are many roles in the forensic process," supporting that contention. Additionally, the draft report states, "Certification must not be defined or determined by the employment, profession or function of the certificant," and defines "employment, profession or function" to include "investigator, laboratory examiner, private investigator, information security professional, technician, examiner, analyst." However, this research is directed at describing a structured process for conducting a digital investigation in a complex computing environment. Such a process, in many cases, must derive its strength from digital forensic science as much as it does from investigative tradition. Therefore, this thesis is not directed at the development of digital forensics as a science except as that development is necessary to the investigative process. It is an ongoing debate within the digital forensic community whether it is easier to teach a technologist how to conduct an investigation or to teach a seasoned investigator technology. In this research we acknowledge the dual roles of investigator and technologist and focus our attention upon the unique aspects of a digital investigation. 
It is those dual roles, perhaps, that enable a clearer understanding of the unique aspects of a digital investigation and the need to develop a structured approach that is equally unique. Finally, and critically important, applications of digital forensic analysis and digital investigation are not limited to the courtroom. There are numerous potential applications of both digital investigative and forensic techniques. These potential applications primarily include, but are not limited to:
- Application of computer science to matters of law (see footnote 1)
- Conducting post-digital incident root cause analysis (incident post mortems) [PS03B] [PS04B]
- Performing enterprise-wide risk analysis [PS04]
In this thesis we focus upon the investigative process and enabling techniques rather than specific applications. While we may use applications as case studies or illustrative examples, our concern here is with the underlying digital investigative process.
1.2 Problem Statement

Today there is virtually no agreed-upon structure in the digital investigative process. The conclusion of the Digital Forensics Research Workshop [DFRW01], Reith et al. [RCG02], Talleur [TT02], Palmer [GP02], Giordano and Maciag [GM02] and others in the field is that the practice of forensic digital analysis is in its infancy, lacking scientific rigor, a universally accepted process, appropriate training and certification or, indeed, a universally accepted and understood vocabulary of its own. In short, digital forensic science is not, by any measure, a real science. A review of the current state of the practice shows that this challenge is not being met with consistency. The Digital Forensics Research Workshop supports this notion: "analytical procedures and protocols are not standardized nor do practitioners and researchers use standard terminology" [DFRW01]. A result is that complex cases of computer related crime are challenging to try in most courts. There is general agreement in the digital forensic community that the digital investigative process comprises several elements, among them elements of digital forensic science. The DFRWS definition discussed above, for example, clearly includes elements of digital forensics in the overall investigative process. Thus, the study of the digital investigative process must, necessarily, include some allusion to the study of digital forensics.
Footnote 1: This is an extension of the definition of forensic science in general as stated by James and Nordby [JN01]: "the application of natural science to matters of law."
Most important, however, is the question of whether the divergence of digital forensics and digital investigation is, or in fact can be, complete. It is illogical to expect that the digital investigative process and digital forensic science could exist in mutually exclusive vacuums. In order to conduct a credible digital investigation, eventual recourse to elements of digital forensics must be admitted. Conversely, the acquisition and management of digital evidence that can be credibly analyzed using digital forensics strongly implies an equally rigorous digital investigative process, which to date does not exist. In order for the digital investigative process to become integrated with accepted science many changes must occur and many challenges must be met. Primary among those challenges is the construction of a scientifically acceptable and rigorously provable framework for conducting a digital investigation, managing the evidence, analyzing digital evidence and presenting the evidence credibly in a court of enquiry. It would be well if such a framework were based upon other disciplines, such as mathematics, that have proven and acceptable standing in the scientific community. Until this challenge is met, the veracity of digital evidence in the courts will be challenged, likely with increasing success. In addressing these challenges we ask, and, along with other investigators, must answer, the following questions:
- Is it feasible to construct a scientifically acceptable and rigorously provable framework for conducting a digital investigation, managing and analyzing the evidence and presenting the evidence credibly in a court of inquiry?
- Is it feasible to mature digital investigation into a science that meets guidelines such as those in Daubert v. Merrell Dow Pharmaceuticals [DVM93] for admission as scientific evidence in a US court?
- Can such maturing occur as well relative to non-US courts? (Virtually all of the work reported here has been done in the context of the United States.)
- What foundations are required to achieve scientific acceptability for digital investigation, and how do those guidelines impact the digital investigative process?
- What must we develop to build upon the foundations required to achieve scientific acceptability for digital forensics and, coincidently, the digital investigative process?
In these questions there is an inevitable interlinking between digital forensic science and the digital investigative process. It is, perhaps, ironic that while attempting to disengage one from the other we must necessarily return to their commonalities. This is because forensic analysis of digital evidence begins with its proper discovery and management. Thus there are elements of digital forensics that touch digital investigation in important ways. We discuss those touch-points in detail in section 3.2.
1.2.1 Scientific Rigour in Forensic Digital Analysis

Scientific rigour in the digital investigative process refers to the strictness with which the investigation follows pre-established, accepted, process guidelines. Those guidelines must have been established appropriately and the investigative process must be proven to follow them. As there are inevitable linkings between digital forensic analysis and the digital investigative process, it follows that both must adhere to principles of scientific rigour. In some cases the requirements for such rigor in digital forensic analysis define the requirements for rigor in the investigative process that feeds that analysis. The appropriate establishment of investigative guidelines may, and probably should, be accomplished by the courts and consensus of the digital investigative community of experts. Once those guidelines have been established, there is need for a process framework capable of establishing that an investigation did, indeed, follow those guidelines. There are examples of acceptable guidelines for the management of digital evidence (US Department of Justice [DOJ02] and US National Institute of Standards and Technology [NIST01], to name two). Additionally, organizations such as the DFRWS are in the process of institutionalizing a statement that encompasses a set of digital investigative guidelines. Finally, we may look to the courts for examples of case law that have established firm legal precedent for validating the conduct of digital investigations. With such a body of guidelines, there is a requirement for a structured, formal process framework for applying those guidelines, showing that they have been followed and defining in formal detail the actual process that was followed in the conduct of a digital investigation. Such a process framework, in order to be credible, may not be constructed out of whole cloth. Rather, it must be based upon some accepted body of scientific proofs that may, where appropriate, be applied to an actual investigation for the purpose of validating its process against accepted guidelines. Failing that, the framework simply will take its place with numerous other unproven and untested methodologies, the validity of whose results is open to question and debate based upon credibility of method.
1.3 Thesis Statement

It is possible to specify a framework for the conduct of a digital investigation and a method for formal verification that the framework has been followed. Such a framework may be represented as a formal model relative to the guidelines against which such processes are to be measured. Finally, the investigation can be tested against a reference model using scientifically proven, accepted and rigorous mathematical processes.
1.4 Motivations
The formalization of the digital investigative process is an important prerequisite for the development and verification of a digital investigation. Lacking a structured process, evidence presented in a court of inquiry is subject to challenge based upon a variety of criteria (see [DOJ02, BK02, DVM93] for examples). Providing a model of an investigation, either before or after the fact, is important to the validation of the evidence. An additional important use of a formalized investigative process is in the conduct of digital incident post mortems. The reconstruction and analysis of a complex digital event, possibly distributed across multiple complex computer networks, is a daunting task using the ad hoc approaches common today. A structured, formalized approach provides appropriate credibility and rigor to the investigative and analytical processes. The use of an investigative framework to plan an investigation before the fact helps assure the quality of the investigative process, resulting in fewer exposures to court challenges of the evidence based upon violations of accepted practices in evidence collection, analysis and management. The use of formal modelling to test a completed investigative process allows for validation against accepted practices, either validating or invalidating the digital investigation. Additionally, such modelling may indicate gaps or flaws in the investigative process that could affect the validity of the evidence itself and the conclusions drawn by the investigator therefrom. Perhaps the most important potential growing out of formal modelling may be the ability to simulate the investigative process during an actual investigation. Adding formal modelling and simulation to machine learning techniques may form the basis for next generation investigative tools that help the investigator of digital incidents to complete investigations faster and more accurately, and to gain more convictions where legal proceedings are involved. In Section 2 we will discuss current references to the digital investigative process in somewhat more detail. However, there are several specific aspects of digital investigation that require formal modelling, some of which are included in the following list:
1. Evidence collection procedures, including ensuring the use of approved forensic tools by trained examiners, verification of appropriate policies or the use of appropriate warrants for evidence search and seizure, and the use of approved evidence collection techniques.
2. Evidence management procedures, including chain of custody and evidence storage and preservation.
3. Appropriate training administered to digital evidence technicians.
4. Traceability of collection and analysis processes to standardized procedures and practices accepted by the digital forensic community and the courts.
5. Aspects of the digital event that triggered the investigation and the investigator's response to those aspects.
Finally, the application of formal modelling to the digital investigative process helps reduce ambiguity deriving from procedural issues and allows investigators, advocates (such as attorneys), the courts, and other finders of fact to focus upon interpretation of the evidence rather than the processes used to collect, preserve and analyze it.
1.4.1 Selected Approach

The research underlying this thesis required some choices of approach that, because of the immaturity of the digital forensic discipline, are as yet not generally well-defined within the forensic community. Within various sections of the thesis we discuss briefly our choices for tools such as Petri Nets (Section 2.3 on selection of particular formal methods as potential tools, for example). However, the overall selection of methods, tools and approaches was, as would be expected, as much a part of the research as the research itself. The evolution of foundational approaches to an emerging science requires that those involved in that evolution establish starting points upon which those who follow can build reliably. We have attempted to make such a contribution here.
1.4.1.1 General Research Plan

Once we established a generalized research objective, that of formalizing the structure of a digital investigation and the forensic process that supports it, we searched for an investigative framework that was consistent with the criminal investigative process accepted generally by the traditional investigative community. We felt that this was the optimum starting point for imposing structure on digital investigations since a large percentage of digital investigators come from a background of traditional investigation. Indeed, traditional investigation of physical crimes, we learned, must play a major role in a high technology investigation of any kind, regardless of the digital tools used in the investigative or forensic process. Because of the lack of maturity of the digital investigative process and the underlying digital forensic process, we were hard-pressed to develop reliable tests for the validity of our conclusions. Very little hard data as relates to investigative process is available. While, to be sure, the outcomes of investigations may be deduced from the results of court cases, reliance on such outcomes is ill-advised since the outcome of a court action may depend more upon lawyering than it does upon the process by which the case was investigated. We discuss the validation process and make suggestions for future research techniques in Chapter 6. As well, at this point it is worthwhile noting that the work being reported in this thesis is intended to be foundational in nature. The immaturity of digital forensics and digital
investigation forces a return, in many cases, to first principles as a means of laying the groundwork for a structured, scientific approach to conducting digital enquiries. It is the intent of this thesis to lay those foundations in the form of a digital investigative process based upon the scientific method of developing hypotheses and testing those hypotheses for validity. The author intends that the work reported here will be a basis upon which additional structure and rigor may be developed by future researchers.
1.4.1.1.1 Selecting a Framework

Because there are special techniques and procedures required for digital investigations, as with any specialized investigation, we looked for an existing process that favoured inclusion of those techniques and procedures. We found such a process in the Digital Forensics Research Workshop's (DFRWS) Digital Investigation Framework [DFRW01, DFRW03]. The Framework retains a general investigative process (identify the existence of a crime, manage and preserve possible evidence, collect evidence, examine evidence, analyze the evidence and draw conclusions, and present the evidence and the conclusions to a court of enquiry), while adding those elements that relate, specifically, to the investigation of digital events and processing of digital forensic evidence. The Framework, however, presents a challenge in that it is ambiguous as to whether it refers to the digital forensic process or the digital investigative process. In the 2003 meeting of the DFRWS considerable discussion centred around that ambiguity [DFRW03 annotated]. The workshop was unable to draw a satisfactory conclusion. However, examining the individual elements of the Framework, one finds that the ambiguity, while unintentional, is consistent both with investigation and forensic process. We feel that this relates directly to the current state of digital investigative and forensic practice. In other, more mature, branches of forensic science, the investigator and the forensic examiner are not, usually, the same person. The training and experience required for investigators differs from the training and experience required for forensic examiners. Due to the youth of digital investigation and forensic practice, however, in most cases investigators of digital incidents perform their own forensic examination. While this has proven acceptable for simple investigations, we suspect that it will be ineffective for investigation of complex incidents. 
Today, we see the beginnings of a divergence of the functions of the investigator and the digital forensic analyst. It appears inevitable that, as digital forensic science matures, the Framework will need to be broadened and, perhaps, evolved into two frameworks, one for investigation and one
for forensic examination. However, the selection of the Framework as an investigative framework for this research seems appropriate in the context of developing a foundational departure point for future refinement of digital forensics as a scientific discipline, and the evolution of a form of the Framework to support forensic examination.
1.4.1.1.2 Developing Process Definition and Description

Once we established a platform (the Framework) upon which to build procedural structure, we considered ways to extend the Framework into a set of well-defined processes. Because the investigative process, generally, is defined only in the broadest terms (seasoned investigators follow an approach rather than a rigorous set of predetermined processes, while forensic examiners tend to take a more structured procedural approach), we hypothesized that, early in the evolution of digital investigation and forensics, the Framework would require interpretation in the context of its specific use. Taking this approach required that we develop a method of measuring the rigor with which the investigator or examiner approaches a digital incident and the evidence involved therewith. This approach is not inconsistent with current trends toward standardization in the International Organization for Standardization (ISO) community. For example, the Common Criteria (ISO Standard 15408) [CC03] provides a set of benchmarks against which an evaluator can measure the security of a device, system or process, based upon a set of judgments by the evaluator as to what level of rigor needs to be applied in a specific instance. We also sought an approach that, because there is a relatively low level of rigor in today's digital investigations and forensic examinations, would encourage appropriate maturation and separation of those two processes over time. Thus, rather than defining a new science, we seek to provide the tools that will allow and encourage practitioners to participate in its evolution. We hypothesize that the nature of those tools will involve measurement of process structure, definition of process and modelling of investigative and forensic processes. Thus, our next research objective was to define those tools. 
Because the investigative process is, by its nature, less rigorous than the forensic process, we sought an approach that would enable detailed description of either depending upon the specific application. In other words, we wanted the tool used to describe an investigation to be applicable to describing forensic procedure as well. Since the primary differences between investigation and forensic analysis are rigor of procedure and granularity of examination of details, we hypothesize that a process language, sufficiently
rich in descriptive vocabulary and syntax, would provide a core platform for characterization of investigation and analysis of a digital event. Rivest's concept of S-Expressions [RR97] offers a useful basis for building a description of a process such as those encountered in investigation and forensic analysis. Rivest's application of S-Expressions is covered in Section 4.2.1.1. The key benefit we derive from S-Expressions is that they allow a sufficiently granular description of a complex process without demanding such granularity of the process. Rivest has refined the use of LISP S-Expressions in the context of addressing digital processes relating to such areas as computer systems, information security and encryption. His paper specifically addresses the use of S-Expressions to transport data between computers. That use implies application of protocols, potentially a complex application requiring a granular process language to describe. Rather than create a new process language out of whole cloth, we sought an existing language that could be fit to our purpose, that retained the construct of S-Expressions (or a similar construct) and that would allow us to benefit from a rigorous approach to process characterization when required. We found such a language in the Common Intrusion Specification Language (CISL) [FKP+99]. However, Doyle [JD99] found that the CISL was insufficiently robust for its intended purpose of communicating intrusion detection data between dissimilar devices and the CISL effort was, eventually, abandoned. Because we do not use the CISL for its original intended purpose of inter-device communications, the issues that led to Doyle's conclusions (all of which related to those communications) do not apply to us. Instead, we extract syntax, some vocabulary and language structure, and use them to suit our purposes, which relate explicitly to the description of events and processes. 
The CISL, although deemed insufficient for its intended purpose, is a granular, descriptive process language, rich in vocabulary appropriate to describing digital events. Its foundational basis in Rivest's S-Expressions offers the promise of formal structure and we hypothesized that, with appropriate modifications, the CISL could be adapted for use as a forensic and investigative process language. Much of the CISL's vocabulary was inappropriate to our needs and was removed for that reason. Portions of its syntactical structure were unnecessary to our purpose and simply added complexity. Where we could without damaging underlying structure, we eliminated those portions as well. Finally, the CISL's vocabulary, while rich in its ability to describe an intrusion process, was
deficient in its ability to describe portions of the investigative process and virtually all of the forensic process. To remedy that deficiency we extended the CISL vocabulary significantly (by adding over 60 new semantic identifiers <SIDs>) while retaining the syntactic basis upon which it was built originally. In order to maintain consistency between the investigative and forensic processes and the process language, we structured the additions to the CISL vocabulary to be consistent with the structure of the DFRWS Framework. Vocabulary additions generally follow the elements of the Framework. We renamed the CISL the Digital Investigation Process Language (DIPL).
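The point that S-Expressions permit, but do not demand, granular description can be illustrated with a minimal sketch. The Python below is a generic S-Expression reader, not the thesis's tooling, and the identifiers in the sample expressions (Incident, Observer, HostName, Attack, Target) are hypothetical placeholders chosen for illustration; they are not taken from the CISL or DIPL vocabulary.

```python
def tokenize(text):
    """Split an S-Expression string into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Recursively build a nested list from a token list (consumed in place)."""
    if not tokens:
        raise ValueError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens and tokens[0] != ")":
            node.append(parse(tokens))
        if not tokens:
            raise ValueError("missing closing parenthesis")
        tokens.pop(0)  # discard the matching ")"
        return node
    if token == ")":
        raise ValueError("unexpected closing parenthesis")
    return token  # an atom


def depth(node):
    """Nesting depth, used here as a rough proxy for descriptive granularity."""
    if not isinstance(node, list):
        return 0
    return 1 + max((depth(child) for child in node), default=0)


# Hypothetical identifiers, for illustration only (not real CISL/DIPL SIDs).
coarse = parse(tokenize("(Incident (Attack worm))"))
fine = parse(tokenize(
    "(Incident (Observer (HostName sensor1)) (Attack (Target (HostName db1))))"
))
```

Both expressions describe the same kind of event with the same syntax; the second simply nests further. That is the property relied upon above: one notation can serve a loosely described investigation and a closely specified forensic procedure alike.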
1.4.1.1.3 Formalization and Modelling

Although we believe that the characterization of the investigative and forensic processes using a process language is useful, especially for describing very complex investigations, we hypothesize that the ability to extend that characterization to a more mathematically robust formalism would allow additional certainty of the results of an investigation or forensic analysis. The nature of digital incidents is that they contain very large amounts of data, most of which is of limited or no usefulness. Processing that data with acceptable certainty of results presents significant challenges. The use of formal modelling offers promise as a method for describing with appropriate rigor the events in an incident as well as the investigative or forensic process used to address the incident. Thus, we view the addition of formal methods to the digital investigation and forensics tool kit as an additional step in reducing the level of abstraction from the general case (observing the nature of a digital incident and describing it using a narrative) to the specific (a mathematical characterization that allows for explicit discrimination between what is surmised and what may be established with acceptable certainty).
1.4.1.1.4 Selection of Tools and Techniques

We have described the rationale behind the selection of the DFRWS Framework, the CISL and the notion of formal modelling. However, we selected a few specific tools for the purpose of our research. We describe the selection of Coloured Petri Nets as our preferred formalism in Section 2.3. In the course of evaluating applicable formal methods, we examined FDR 2 as a tool to use with CSP as a formalism.
Footnote 2: FDR 2, Formal Systems (Europe) Ltd: http://www.fsel.com/software.html
Because, as we describe in Section 2.3, we selected Coloured Petri Nets, we opted for a tool that would allow us to create a graphical representation of a formal model that could be presented acceptably to a lay audience. There are two primary freeware tools that fit that requirement: Design/CPN and CPNTools. Both of these tool sets are maintained by the CPN Group, University of Aarhus, Denmark (see footnote 3). We selected Design/CPN because it is more mature than CPNTools, which currently is in beta and is not recommended for project work by the maintainers. Additionally, Design/CPN has been used extensively both in research and industrial projects, many of which are referenced on the CPN Group's web site.
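As a rough indication of what such a tool animates, the sketch below implements a plain place/transition net in Python. It is deliberately simpler than a Coloured Petri Net (tokens carry no data values) and it does not reproduce Design/CPN's modelling language; the place and transition names (attacker, target, exploit) are hypothetical and chosen only for illustration.

```python
from collections import Counter


class PetriNet:
    """Minimal place/transition net: a marking maps places to token counts."""

    def __init__(self, marking):
        self.marking = Counter(marking)
        self.transitions = {}  # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        """Register a transition that consumes from inputs and produces to outputs."""
        self.transitions[name] = (Counter(inputs), Counter(outputs))

    def enabled(self, name):
        """A transition is enabled when every input place holds enough tokens."""
        inputs, _ = self.transitions[name]
        return all(self.marking[place] >= n for place, n in inputs.items())

    def fire(self, name):
        """Consume tokens from the input places and add them to the output places."""
        if not self.enabled(name):
            raise RuntimeError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        self.marking.subtract(inputs)
        self.marking.update(outputs)


# Hypothetical model: one token in "attacker" enables the "exploit" transition.
net = PetriNet({"attacker": 1})
net.add_transition("exploit", ["attacker"], ["target"])
```

Calling `net.fire("exploit")` moves the single token from the input place to the output place, after which the transition is no longer enabled. This is the same progression, token arming the transition and then token in the output place, that the Coloured Petri Net figures listed in the illustrations depict for a simple penetration attack.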
1.5 Contributions
We make the following specific contributions to the body of knowledge regarding forensic digital investigation:
1. The first reliable, structured process for using scientifically derived and proven methods and/or techniques toward the conducting of an investigation or enquiry in a digital environment for the investigation of digital security incidents in a complex network environment such as the Internet, and
2. the first process language specifically derived for use in characterizing a digital forensic investigation in support of contribution number 1, and
3. the first structured process for creating a formal mathematical model of a digital investigation in support of contribution number 1, and
4. an approach to presenting the results of a formally modelled and proven digital investigation to a court of enquiry.
These contributions form the basis for further research into standardized practices that may be used as the reference variables in idealized formal models for the digital
http://www.daimi.au.dk/CPnets
investigative process. Additionally, they form, for the first time, the structure of a verification process for a digital investigation, allowing validation of the investigative process before, during and after the investigation.
1.6 Terminology
This section presents some important definitions that we will use throughout this thesis. Specific terms not discussed here will be defined as they first appear in the body of the text. The definitions that follow are the definitions we will use in the context of the forensic digital environment.
1.6.1 DEFINITION 1: Primary evidence

Primary evidence is evidence that is corroborated by other pieces of primary evidence and, in turn, corroborates additional primary evidence in a chain of evidence. Primary evidence makes up the evidence chain in a digital investigation. Primary evidence may, in turn, be corroborated additionally by secondary evidence. In special circumstances, such as the first piece of evidence in a chain, sufficiently clear and obvious evidence (such as evidence that a computer has been the victim of an attack) may be considered primary evidence if it is corroborated by a significant body of secondary evidence and, in turn, corroborates other primary evidence.
1.6.2 DEFINITION 2: Secondary evidence

Secondary evidence is evidence that is not, itself, corroborated but may serve to corroborate primary evidence. Secondary evidence rarely stands alone credibly since it does not have anything to support it directly. Secondary evidence may be circumstantial, for example. The presence of secondary evidence in sufficient quantity and of sufficient quality may, however, serve to tell a compelling story of how a series of digital events occurred.
These first two definitions lead to the First Rule of End-to-End forensic digital analysis: Primary evidence should be corroborated by at least one other piece of relevant primary evidence to be considered a valid part of the evidence chain. Evidence that does not fit this description, but does serve to corroborate some other piece of evidence without itself being corroborated, is considered to be secondary evidence. [PS02]
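The distinction drawn by Definitions 1 and 2 can be illustrated with a small sketch. It is not part of the End-to-End method itself: the function name, the item names and the corroboration links are all hypothetical, and the rule is deliberately simplified so that an item counts as primary as soon as any other item corroborates it, whereas the First Rule asks for corroboration by other primary evidence.

```python
from collections import defaultdict


def classify_evidence(corroborations):
    """Label each evidence item 'primary' or 'secondary'.

    corroborations: iterable of (source, target) pairs, read as
    'source corroborates target'. Simplifying assumption: an item is
    primary if at least one *other* item corroborates it; otherwise
    it is secondary.
    """
    corroborated_by = defaultdict(set)
    items = set()
    for source, target in corroborations:
        items.update((source, target))
        if source != target:  # an item cannot corroborate itself
            corroborated_by[target].add(source)
    return {
        item: "primary" if corroborated_by[item] else "secondary"
        for item in items
    }


# Hypothetical evidence items and corroboration links.
links = [
    ("firewall_log", "ids_alert"),
    ("ids_alert", "host_login_record"),
    ("host_login_record", "ids_alert"),
    ("witness_note", "host_login_record"),
]
labels = classify_evidence(links)
```

Here the IDS alert and the host login record corroborate one another and so sit in the evidence chain, while the firewall log and the witness note only lend support and are labelled secondary.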
1.6.3 DEFINITION 3: Digital forensic science

Digital forensic science encompasses the use of scientific techniques to perform analysis of digital events or data, whether on computer networks, computer media or on non-computer devices such as PDAs, cell phones and digital cameras. We define digital forensic science as: The application of computer science and mathematics to the reliable and unbiased collection, analysis, interpretation and presentation of digital evidence. Note that nothing in this definition restricts the use of digital forensic science to matters of law. However, the appropriateness of the application of digital forensic science to matters of law is implicit in the definition's use of the qualifiers "reliable" and "unbiased". The allusion to collection clearly includes elements of the investigative process since a key element of that process is the collection of digital evidence for further forensic processing.
1.6.4 DEFINITION 4: Forensic digital evidence collection

The use of approved tools and techniques by trained technicians to obtain digital evidence from computer devices, networks and media. By approved we mean those tools and techniques generally accepted by the discipline and the courts where collected evidence will be presented.
1.6.5 DEFINITION 5: Digital forensic correlation

The comparison of evidentiary information from a variety of sources with the objective of discovering information that stands alone, in concert with other information, or corroborates or is corroborated by other evidentiary information.
1.6.6 DEFINITION 6: Digital forensic normalization

The combining of evidentiary data of the same type from different sources with different vocabularies into a single, integrated terminology that can be used effectively in the correlation process.
1.6.7 DEFINITION 7: Digital forensic deconfliction

The combining of multiple reportings of the same evidentiary event by the same or different reporting sources, into a single, reported, normalized evidentiary event.
1.6.8 DEFINITION 8: Digital forensic data fusion

The process by which all of the available evidentiary data is analyzed and correlated into a single consistent representational model such as a timeline.
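Definitions 6 through 8 can be read as successive stages of one pipeline. The following is a minimal sketch under stated assumptions: the vocabulary table, the event fields, the two-second matching window and all function names are hypothetical choices made for illustration, not part of any tool described in this thesis.

```python
from dataclasses import dataclass

# Hypothetical mapping from source-specific terms to one shared vocabulary.
VOCABULARY = {
    "portsweep": "port_scan",
    "SCAN": "port_scan",
    "ssh_login_fail": "failed_login",
    "auth failure": "failed_login",
}


@dataclass(frozen=True)
class Event:
    timestamp: int  # seconds on a clock assumed to be shared by all sources
    source: str     # reporting source, e.g. "ids" or "firewall"
    kind: str       # event type in the source's own vocabulary
    subject: str    # e.g. the remote address involved


def normalize(event):
    """Normalization: translate the event type into the shared vocabulary."""
    kind = VOCABULARY.get(event.kind, event.kind)
    return Event(event.timestamp, event.source, kind, event.subject)


def deconflict(events, window=2):
    """Deconfliction: merge multiple reports of the same event.

    Assumption: two reports describe the same event when kind and subject
    match and the timestamps differ by at most `window` seconds.
    """
    merged = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        for i, kept in enumerate(merged):
            same = kept.kind == ev.kind and kept.subject == ev.subject
            if same and ev.timestamp - kept.timestamp <= window:
                sources = sorted(set(kept.source.split("+")) | {ev.source})
                merged[i] = Event(kept.timestamp, "+".join(sources),
                                  kept.kind, kept.subject)
                break
        else:
            merged.append(ev)
    return merged


def fuse(events):
    """Data fusion: produce a single timeline of normalized, deconflicted events."""
    return sorted(deconflict([normalize(e) for e in events]),
                  key=lambda e: e.timestamp)


raw = [
    Event(100, "ids", "portsweep", "192.0.2.5"),
    Event(101, "firewall", "SCAN", "192.0.2.5"),
    Event(160, "host", "ssh_login_fail", "192.0.2.5"),
]
timeline = fuse(raw)
```

In this example the IDS and firewall reports collapse into one port scan event credited to both sources, followed on the timeline by the failed login.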
1.6.9 DEFINITION 9: Digital media

Any method of storing data digitally. This includes but is not limited to magnetic media (such as disks or tape), optical media, and memory (whether volatile or non-volatile).
1.6.10 DEFINITION 10: Digital investigative process

The digital investigative process (DIP) is the framework around which we investigate a digital incident. It can include incident post mortems, criminal investigations, data discovery in a digital environment, and incident response among other applications. It is characterized by reliability, structure, rigor and the use of appropriate scientific techniques. We define the digital investigative process as: A reliable, structured process for using scientifically derived and proven methods and/or techniques toward the conducting of an investigation or enquiry in a digital environment.
1.7 Acknowledgements
There are almost too many people, as one would expect, to acknowledge in a research work of this type. However, there are several people who stand out in this writer's mind as being exceptionally important to this effort.
At the top of the list are Dr. Joy Reed and Dr. Eugene (Spaf) Spafford. Dr. Reed has never hesitated to support my work and has always asked the hard questions as the research progressed. A phone conversation with Joy would never have been complete without her admonishment to "Finish your degree!" Dr. Reed also introduced me to the joys (and frustrations) of formal methods, and for that I shall always be grateful. Dr. Spafford started my work off on the right foot when he listened patiently to what I was trying to accomplish and responded with "You're asking all the wrong questions." Hopefully, I have it right this time, Spaf. See section 1.2.
Julie Hogan and Pam Salaway at the Computer Security Institute took a chance on letting me try out the fruits of this research in CSI's conferences. I was able to teach seminars on EEDI and DIPL with enough success that I was asked back, and the students seemed to get something out of the experience. It's nice to know when one has developed a new approach that practitioners actually find useful.
Dr. Mich Kabay, always supportive, has invited me repeatedly to present my research at Norwich University and we have spent long hours debating technology in the process. My understanding of the academic environment has come largely from Mich.
My former colleagues at QinetiQ in the UK all deserve my gratitude and lasting respect and friendship. They include, especially, Andy Bates, who supported my work strongly; Andy Jones, who was always ready with useful comments about research in general; Phil Turner, a computer forensic engineer of prodigious accomplishment (welcome to the PhD program, Phil); Dr. Nic Peeling, an intellectual with a rare sense of humour; and, most important, my mates in the managed services group, Al Hood, Bob Halsall and Simon Pearce, all of whose irreverent love of technology and research made for many long conversations over pints at the Three Kings in Hanley Castle.
Two other colleagues at QinetiQ bear special mention because of their support for my theories and their willingness to spend more time than they probably should have discussing them with me and applying their special knowledge of formal methods to the problems I was trying to address. For that support, Irfan Zakiuddin (Zak) and Dr. Sadie Creese will always be special for me. No list of acknowledgements would be complete without the most important contributors, Drs David Duce and Catherine Hobbs, and Ken Brownsey, my academic advisors from Brookes. While I'm sure it was not their intent, their success at supporting my efforts in digital forensic science has led to the second forensics PhD candidate. We may be on to something here.
Finally, I am certain that I have left someone important off of this list, but if we have ever crossed paths over my nearly 40 years in technology, rest assured that you contributed in some way to this work. Know that I appreciated you and what you had to say, whether I agreed with it or not.
2. RELATED WORK
2.1 Introduction
Related work falls into two distinct categories: forensic and formalization. Until recently virtually no attempts have been made to bring the two together.
2.2 Forensic Digital Analysis

Forensic digital analysis is the evolution of the techniques formerly referred to as computer forensics [RCG02]. The concept of forensic digital analysis grew out of the original discipline of computer forensic analysis and the newer notion of network forensic analysis. However, the two concepts, along with software forensic analysis [IK94], have become vague with the emergence of complex computing environments. Thus, although we are concerned with the broader concept of digital forensic analysis, we must look to the other, earlier, descriptions for a clear view of work leading up to the research reported in this thesis.
When dealing with any scientific technique that is likely to be used in a court of law in the United States, one must defer to the Daubert tests [DVM93]. These four tests form a basis of acceptance of scientific evidence in a court action:
1. Whether the theory or technique in question can be and has been tested.
2. Whether it has been subjected to peer review and publication.
3. Its known or potential rate of error along with the existence and maintenance of standards controlling the technique's operation.
4. The degree of acceptance within the relevant scientific community (may not be the only means to examine expert testimony as in Frye v. US [JV23]).
Because Daubert applies broadly to all types of forensic science, we consider it, as well, to be a benchmark for digital forensic science. There are, however, questions of applicability to digital forensics and to the digital investigative process. We discuss those in greater detail in section 2.3.3.
2.2.1 Forensic Computer Analysis

Much early work in computer forensics was done by Michael Anderson as a founder of the International Association of Computer Investigative Specialists, an organization dedicated to the training of law enforcement officers in computer forensic techniques. Other early work was done by Dan Farmer and Wietse Venema in the specific area of Unix forensic analysis. Much of their work was documented in columns written for Dr. Dobb's Journal. Farmer, in particular, documented his Unix forensics work there [DF01] when he introduced The Coroner's Toolkit, a set of tools for performing forensic analysis in a Unix environment. Application of computer forensics in the investigation of computer-related crime was the focus of work by Rosenblatt [KR95]. Rosenblatt, an assistant district attorney, concentrated heavily on the legal aspects of high technology crime investigation, and had early success in establishing himself as an expert in computer-related crime investigation. Another early contributor to computer forensics was Ken Diliberto, who, with Franklin Clark, formed one of the early West Coast computer investigation teams [CD96]. Additional contributions, initially to law enforcement (the FBI) but eventually to the general public, came from Icove, Seger and VonStorch. They introduced law enforcement procedure to the investigation of computer-related crime [ISV95]. Finally, in today's practitioner environment there is a library of works, all addressing computer forensic practice at a fairly basic level. Space prohibits listing these redundant works and there is no compelling reason to do so.
2.2.2 Forensic Network Analysis

While most work in this area is not overtly forensic, we must look to such techniques as network back tracing for an understanding of what we mean by forensic network analysis. Simply, forensic analysis of networks involves understanding, identifying and extracting evidence of a possible crime from information travelling on, or stored on, a network.
This causes a certain amount of difficulty since we could view forensic information stored on a network device as being the province of forensic computer analysis, not forensic network analysis. Commonly, we consider such information, specifically information that offers evidence about events occurring on the network or in the network environment, as being network events. For example, if a computer residing on a network logs an intruder passing through the computer (as we might see when an attacker penetrates a computer to use as a launching pad for further attacks), we clearly have a computer event, not a network event. However, if that same computer has a sniffer installed and that sniffer detects anomalous activity on the network that may be considered direct evidence of a network event, that evidence is network evidence even though it resides on a computer. This is common with intrusion detection systems (IDSs). Early approaches to our research considered focusing upon the IDS as a pure source of network forensic data. However, a very short way into the research it became obvious that, in the context of a complete approach to digital forensic analysis, the concept of an intrusion detection system was, at best, fuzzy, encompassing everything from system loggers on individual devices to firewall logging systems and traditional intrusion detection systems. The extension of the IDS to an embedded agent within the kernel of an operating system [DZ01] stretches the traditional model of an IDS far enough to suggest that such a narrow scope for this investigation would not produce useful information. Thus, the scope was broadened to comprise the overall digital investigative process and the formalization of the entire data collection process that the investigator intends to use to develop evidence in an investigation. We consider work done in the area of network intrusion back tracing to be of interest.
Probably the purest network forensics are found in the traces of network intrusion detection systems (NIDS). These traces may tell a strong story of activity on a network. While much work has been done by intrusion detection practitioners in the study of NIDS data, the leading chronicler of NIDS trace analysis is Stephen Northcutt [NCF+01, NN00]. Significant work in collecting and interpreting actual attack traces has been done by the Honeynet Project [HP01]. The useful areas of the Honeynet Project's research are not limited to forensic analysis of network events. Rather, they include much information
about the actual behaviour of criminal hackers as they attack networks set up on the Internet for the purpose of attracting and monitoring attack activity. Not all network events are simple enough to be analyzed using IDS traces. Two particular areas of difficulty, from a forensic perspective, are onion routers and spoofed packets. In the former, a group of routers, distributed around a network, encrypt socket connections and act as proxies, making back tracing very difficult [DF02]. In the latter, an attacker seeks to alter source information in packet headers such that back tracing is difficult or impossible [TD01]. The final important area of network forensics involves network routers. Packets passing through network routers may leave traces in router logs that can help the forensic analyst determine the nature of network activity [TA02]. Interesting research is being done currently by Dr. David Nicol at Dartmouth College in the U.S. on Border Gateway Protocol (BGP) instability forensics [DN02]. Dr. Nicol's research is examining relationships between behaviors of border routers in the presence of large scale worm infestations. Early results suggest that back tracing a worm's origins may be possible using techniques Dr. Nicol's team is developing. This is important because of the stealth techniques used by worm writers to obfuscate the worm's origin.
2.3 Formal Analysis of Forensic Events

Prior work in this area falls into three categories: formal methods, investigative process and a combination of the two. Because work specific to the first two is so broad and disparate, we will concern ourselves only with that which supports our thesis directly. As to the third, the combination of the two categories, the lack of prior work in the synthesis of formalization and the investigative process is the motivation for research reported in this thesis.
2.3.1 Formal Methods

There are numerous examples of formal mathematical modelling techniques. We have selected three plus a process language to examine and we believe that there is application for each of them in the process of formalizing the digital investigative process. First, we determined that there is a need for some approach to structuring (or analyzing) a digital investigation in such a way that every step may be examined and that the process of
the investigation as well as the triggering event itself may be laid out in sufficient detail that gaps in process may be identified. We examined a number of approaches, starting with population of the DFRWS framework [DFRW01] and settling, finally, on a semi-formal process language [FKP+99]. That language, CISL (Common Intrusion Specification Language), was determined to be inadequate for its intended purpose [JD99], but, for our purposes, proved acceptably rich. However, as we will describe in Chapter 4, it was necessary to extend the language to encompass, specifically, processes dealing with forensics and investigation.
2.3.1.1 Communicating Sequential Processes (CSP)

"Communicating Sequential Processes is a calculus for studying processes which interact with each other and their environment by means of communication" [AWR98]. CSP was introduced in 1978 by C. A. R. Hoare [CAH78] and has been used in information security for the analysis of encryption protocols [RS01]. Hoare's Communicating Sequential Processes [CAH85] is the foundational work on CSP. CSP is an appropriate vehicle for formalizing the investigative process in the context of a completed investigation since such an investigation does, in fact, comprise processes which interact with each other and their environment. Further, FDR, a commercially available model checker, may be used as a tool for verifying the correctness of an investigation modelled in CSP. As used here, the term "correctness" refers to the one-to-one relationship of the process of an actual investigation to a reference model of the process of an idealized investigation of the same type. Verification of this relationship helps to establish that the actual investigation process was carried out in accordance with accepted procedures. Although we believe that CSP is appropriate as an avenue for further research, we did not select it as our initial formal process due to its complexity as a forensic practitioner tool and the difficulty of using it to communicate with a jury.
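The notion of correctness used above, an actual investigation matching a reference model step for step, can be illustrated with a far weaker check than anything CSP or FDR provides. The sketch below is only a deterministic trace check against a hand-written state table; the state names, the event names (loosely borrowed from the DFRWS framework classes) and the function are assumptions made for illustration.

```python
# Hypothetical reference model of an idealized investigation:
# state -> {permitted event: next state}.
REFERENCE_MODEL = {
    "start": {"identify": "identified"},
    "identified": {"preserve": "preserved"},
    "preserved": {"collect": "collected"},
    "collected": {"examine": "examined"},
    "examined": {"analyze": "analyzed"},
    "analyzed": {"present": "done"},
}


def conforms(trace, model, start="start", accepting=("done",)):
    """Check a recorded sequence of investigative steps against the model.

    Returns (True, None) when the trace follows the model to an accepting
    state, otherwise (False, index), where index is the position of the
    first step the model does not permit (or len(trace) when the trace
    simply stops too early).
    """
    state = start
    for index, event in enumerate(trace):
        permitted = model.get(state, {})
        if event not in permitted:
            return False, index
        state = permitted[event]
    if state in accepting:
        return True, None
    return False, len(trace)


actual = ["identify", "preserve", "collect", "examine", "analyze", "present"]
skipped = ["identify", "collect", "examine", "analyze", "present"]

ok_result = conforms(actual, REFERENCE_MODEL)
bad_result = conforms(skipped, REFERENCE_MODEL)
```

The second trace omits preservation, so the check fails at index 1, the point at which the recorded investigation departs from the reference model. A refinement check in a model checker covers concurrency and nondeterminism that this sketch ignores.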
2.3.1.2 Coloured Petri Nets (CPN)

Coloured Petri Nets offer an alternative to CSP with the advantage that the technique is graphical in nature. Although CPN uses an underlying calculus, interaction with it usually is accomplished by the use of graphical diagrams. There are several tools, many free, available for processing CPN models. One of the easiest to use is Design/CPN, a freeware program that runs on an Apple Macintosh or in X-Windows. The Design/CPN
tutorial describes a Petri Net as: "... a network of interconnected locations and activities, with rules that determine when an activity can occur, and specify how its occurrence changes the states of the locations associated with it. Petri nets originated in the work of C. A. Petri in 1962, and have since been developed by many researchers in many countries." [MSC93]
The tutorial goes on to say: "Petri nets can be used to model and simulate systems of any type. They are particularly useful in facilitating the design and analysis of complex distributed systems that handle discrete flows of objects and/or information. There is an extensive mathematical formalism associated with Petri nets. This formalism completely defines what a Petri net is and how it behaves. Although Petri nets are typically represented as graphs drawn on paper or on a computer screen, a Petri net is actually a mathematical object that exists independently of any physical representation." [MSC93]
We selected Petri Nets, specifically Coloured Petri Nets, as our formalism for two primary reasons. First, it appears to be less difficult than a pure mathematical representation to describe to judges and juries who may not have an extensive mathematical background. Second, although graphical in representation, as pointed out by the Design/CPN tutorial, a Petri Net actually is a mathematical object. This satisfies our requirement for a formal representation of the digital investigative process. Most Coloured Petri Net examples used in this thesis were developed using Design/CPN running in an X-Windows environment.
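The tutorial's description of locations, activities and firing rules can be made concrete with a minimal sketch of an ordinary (uncoloured) place/transition net. This is not Design/CPN and it omits token colours, guards and arc expressions entirely; the class, the place names and the transition names are hypothetical.

```python
from collections import Counter


class PetriNet:
    """A minimal place/transition net: places hold counts of plain tokens."""

    def __init__(self, marking):
        self.marking = Counter(marking)  # place -> number of tokens
        self.transitions = {}            # name -> (inputs, outputs)

    def add_transition(self, name, inputs, outputs):
        """inputs/outputs map a place to the tokens consumed/produced."""
        self.transitions[name] = (Counter(inputs), Counter(outputs))

    def enabled(self, name):
        """A transition is enabled when every input place holds enough tokens."""
        inputs, _ = self.transitions[name]
        return all(self.marking[place] >= n for place, n in inputs.items())

    def fire(self, name):
        """Consume tokens from the input places and produce tokens in the outputs."""
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        self.marking.subtract(inputs)
        self.marking.update(outputs)


# Hypothetical two-step fragment of an investigation.
net = PetriNet({"evidence_located": 1})
net.add_transition("collect", {"evidence_located": 1}, {"evidence_collected": 1})
net.add_transition("examine", {"evidence_collected": 1}, {"evidence_examined": 1})

examine_enabled_before = net.enabled("examine")  # nothing collected yet
net.fire("collect")
net.fire("examine")
```

The marking is the state of the net, and the enabling rule is what prevents examination from occurring before collection. A coloured net would additionally attach data values to tokens so that, for example, individual evidence items could be distinguished.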
2.3.1.3 Lambda Calculus

The Lambda Calculus was developed in 1936 by Alonzo Church. It is the underlying mathematics upon which several functional programming languages are based [BB94]. The Lambda Calculus also is the underlying mathematics of LISP, the basis upon which the Common Intrusion Specification Language (CISL) and the Digital Investigation Process Language (DIPL) were built. Early references to DIPL were as the Digital Forensics Process Language (DFPL), which now has been broadened to DIPL. Thus, Lambda Calculus may be used to verify a process characterized using LISP and, probably, CISL and DIPL. However, because Lambda Calculus would be difficult to explain to a lay jury, we have opted to save such proofs for future work.
2.3.2 Investigative Process

The state of the practice at this writing is the structuring of the digital investigative process into a multi-level framework [DFRW01, RCG02]. Additionally, the courts in the United States have given us some guidelines in case law. For example, Katz v. U.S., 389 U.S. 347 (1967) describes a two-part test for questions of legal search and seizure (Fourth Amendment to the US Constitution). Such specific case law, even though it may not relate explicitly to the digital investigative process, provides guidelines that may not, in many cases, be breached. The Internet Society has developed a Request For Comments (RFC) that addresses the collection and management of digital evidence [BK02]. Finally, important aspects of statutory law (USC18-II, USC 18-I), define general investigative procedures, protections and statutory crimes. These guidelines offer specifics that investigators may apply to an investigation and that may be used as tests of the correctness of the investigation.
2.3.3 Synthesis Between Process and Formalization

The synthesis between process and formalization is the motivation for this thesis. To date virtually no prior work has been done in this area. The key concept in this regard is the notion of formalization. Within the investigative community that notion is ill-defined at present. By formalization we mean the use of underlying mathematical formalisms as the defining benchmarks. The term "formal" as used within the investigative community is defined more accurately as "structured". Structure in the investigative process, as described above, comes largely from practitioner experience and guidance from the courts. Unfortunately, this structure in today's digital investigations is more closely related to traditional investigative procedures learned by investigators pursuing solutions to non-digital crimes. Such structure tends to be ad hoc in nature and is not directly applicable to advanced mathematical characterizations of explicit processes. Simply, current
investigative structure operates at a higher level of abstraction than we feel is appropriate for digital investigations. Since digital systems are inherently numeric in nature, we suggest that the application of mathematical analysis would be an appropriate approach to the analysis of digital processes. Investigative processes involving digital systems may, therefore, be shown formally to follow the digital processes under investigation. A key element in the synthesis between the investigative process and formalization is the ease with which the end process or technique satisfies the Daubert criteria (see Section 2.2). Since virtually no work has been done in this area, it is an open question as to how Daubert will apply. However, the application of established mathematics, such as Coloured Petri Nets, might be expected to be acceptable as scientific proof. For example, analyzing our approach against the four Daubert criteria we find:
1. Whether the theory or technique in question can be and has been tested. Certainly the technique can be tested using Petri nets as a measure, and the underlying mathematics of Petri Nets have been tested thoroughly [TM89]. It is unclear what the concept of testing under the Daubert criteria really means. Justice Blackmun, writing for the unanimous court [DVM93], quotes several sources that, essentially, view testing as generating hypotheses and seeing if they can be falsified. However, in the case of the formalisms that underlie Petri Nets, it is clear that a process of formal mathematical proof has been applied, satisfying the requirement for "seeing if they [i.e., the formalisms] can be falsified". If we then apply those formalisms to the investigative process, by modeling the process using Petri Nets, we test the investigative process formally to see if it can be falsified. Since the digital investigative process may be shown to be composed of a number of techniques used in concert to conduct a complex digital investigation, there is the implication that each of the techniques must be tested individually and collectively with the other techniques used (to ensure that interactions cause no unintended consequences) for formal verification to be considered valid.
2. Whether it has been subjected to peer review and publication. The techniques described in this thesis have been published, both by this document and a paper by the author [PRS02]. That the underlying mathematics has been published and peer reviewed extensively there can be no doubt.
3. Its known or potential rate of error along with the existence and maintenance of standards controlling the technique's operation. The error rate for the technique has not been tested in actual practice. The underlying formalisms, however, provide the accepted methods for controlling the technique's operation.
4. The degree of acceptance within the relevant scientific community (may not be the only means to examine expert testimony as in Frye v. US). This, of course, is the open question. This test has been found to be subordinate to the other three, however, based upon the finding by the Court in Daubert that Frye's "general acceptance" test was superseded by the Rules' adoption (the Rules referring to the Federal Rules of Evidence) [DVM93].
Subsequent work in the area has been published by the author and has been peer reviewed [PRS02], [PS03A], [PS04], [PS03B], [PS04B]. The author's subsequent work [PS03B] has been cited by researchers in the Computer Networks and Distributed Systems Research Group, University College, Dublin, as one of only four major developments in the use of formality for analysis and corroboration of digital evidence [GP04].
There is some question as to the validity of the Daubert tests in the US legal system in the contexts of digital forensic analysis and, specifically, the digital investigative process. Certainly, although the investigative process includes, and, indeed, may be dependent upon, digital forensic science, there may be an argument that Daubert is not applicable because the investigative process is not an individual scientific (or purportedly scientific) technique. We would not agree with that position on the following basis. First, in Daubert v. Merrell Dow [DVM93], the Court held that: "The Federal Rules of Evidence, not Frye, provide the standard for admitting expert scientific testimony in a federal trial." Thus, the so-called Daubert criteria are represented as being derived, directly, from the Rules. The Court goes on to say, in paragraph (c) of the opening summary of Justice Blackmun's opinion:
"Faced with a proffer of expert scientific testimony under Rule 702, the trial judge, pursuant to Rule 104(a), must make a preliminary assessment of whether the testimony's underlying reasoning or methodology is scientifically valid and properly can be applied to the facts at issue. [emphasis ours] Many considerations will bear on the inquiry, including whether the theory or technique in question can be (and has been) tested, whether it has been subjected to peer review and publication, its known or potential error rate and the existence and maintenance of standards controlling its operation, and whether it has attracted widespread acceptance within a relevant scientific community. The inquiry is a flexible one, and its focus must be solely on principles and methodology, not on the conclusions that they generate. [emphasis ours] Throughout, the judge should also be mindful of other applicable Rules."
The reader will note the emphasized section where we point out that the focus is solely
on principles and methodology. The Merriam-Webster On-Line Dictionary (http://www.m-w.com/) defines methodology as: "1 : a body of methods, rules, and postulates employed by a discipline : a particular procedure or set of procedures". Certainly the investigative process fits this definition and, on its face, as a set of procedures, is potentially subject to a Daubert finding. The key advance in Daubert over Frye is that Frye is superseded by the Federal Rules of Evidence. The use of the term methodology is not accidental. Earlier in his opinion (above) Blackmun refers to "underlying reasoning or methodology". However, to avoid becoming entangled in semantics, we may also note the spirit of Blackmun's opinion: "the trial judge, pursuant to Rule 104(a), must make a preliminary assessment of whether the testimony's underlying reasoning or methodology is scientifically valid and properly can be applied to the facts at issue." Thus, as Sommer points out [SO98], it is the judge who acts as a gatekeeper in evaluating whether the expert evidence comes from generally accepted scientific principles. There is no mention, or indeed any implication, of whether such evidence is a process rather than an individual technique. The mentions of a method or technique imply strongly that the notion of a process based upon generally accepted scientific principles is acceptable under both Daubert and Rule 702, Federal Rules of Evidence (the Rules being the underlying basis for the Daubert decision). It has been suggested additionally by some reviewers of the applicability of Daubert to the digital investigation and digital forensics processes that Kumho Tire Company Ltd. v. Carmichael [KVC99] is a more appropriate case for evaluating the digital investigative process, or indeed any digital forensic process, because it specifically upholds Daubert, saying that: "The Daubert factors may apply to the testimony of engineers and other experts who are not scientists."
In this context the Court specifically is applying Rule 702. The Court continues:
http://www.m-w.com/
"The Daubert gatekeeping obligation applies not only to scientific testimony, but to all expert testimony. Rule 702 does not distinguish between scientific knowledge and technical or other specialized knowledge, but makes it clear that any such knowledge might become the subject of expert testimony." However, we believe that the combination of Daubert and Kumho clearly places the determination as to admissibility of digital forensic findings in the context of the digital investigative process under the general purview of the Daubert criteria (and the Court as gatekeeper), whether we consider them to be scientific or held to some lesser technical standard. This does not mean, of course, that there is any less burden upon the rigour of the digital investigative process than if the courts chose to interpret Daubert rigidly. The notion that Daubert can and, in fact, does apply to the digital forensic process is further supported by Rogers and Seigfried [MRS04]. Because digital forensic science does not comprise a single technique but, rather, is made up of many techniques, the notion of a process not being subject to Daubert simply does not make sense. Additionally, other forensic sciences, such as DNA, comprise multiple techniques as part of an overall forensic process. These sciences have been shown repeatedly to be subject to the Daubert tests. In that context, as Smith points out [SS04], the important factor in the admissibility of DNA evidence is not chemistry. Rather, it is the underlying statistical analysis of the evidence. As there is a formal mathematics supporting the digital investigative approach reported here, Smith contends that there is a direct parallel implying admissibility of this investigative process on the same basis.
Smith (an Assistant United States Attorney in New Mexico, and an expert on admissibility of scientific and technical evidence and expert testimony [SB03]), however, supports Sommer's characterization of the Court as gatekeeper. Thus, we conclude that it simply is not credible that a digital investigative process made up of multiple purported scientific techniques, and underpinned by accepted mathematics, is not potentially admissible based upon tests such as Daubert and Kumho Tire. However, we must wait until this argument is settled in the courts and becomes part of the body of case law for a final determination.
2.4 Relationships to Other Approaches

When examining a process purported to be a new and improved approach to addressing some existing problem, it is reasonable to ask, "How was the problem being solved before?" and "Is there a current method, perhaps not in general use for our purposes, that could solve the problem better than the current approach, obviating the need to develop something new?" Fundamentally, there are three other general types of approaches that could be said to be similar to the underlying approaches taken in this research, and which should be examined both for redundancy with, and contribution to, the digital investigative process. We identify them as:
Standards (such as ISO9000, ISO15408 and ISO17025)
Best practices (such as those of the International Organization on Computer Evidence)
Work of other researchers
For the third general type of approach, we refer the reader to section 6.7, Comparison With Investigative process Models.
2.4.1 Existing Standards

We examine three ISO standards to understand their positions relative to the work reported in this thesis. Those standards are the ISO9000 series, ISO15408 (the Common Criteria) and ISO17025 (standards for testing laboratories). Since it is reasonable to assert that the work reported here represents a rigorous set of standardized approaches to conducting a digital investigation, those standards that address such approaches and the maintenance of a quality process relative to those approaches are of interest. The measurement of the End-to-End process developed under the current research as reported in this thesis is discussed in more detail in Chapter 6. However, it is difficult to make a quantitative comparison between the process developed in this research and the approaches discussed below in sections 2.4.1.1 through 2.4.1.4 because there is no history of quantitative analysis of the standards themselves. Additionally, as we show below, there is little direct correlation between our process and the quality management standards discussed in these sections. We, therefore, did not attempt a quantitative comparison because such a comparison would be meaningless in the context of our research.
2.4.1.1 ISO9000 Series

ISO9000 or, more correctly, ISO9001:2000 5 (ISO9000:1994 applied only to manufacturing and has been replaced by ISO9001:2000) applies to any type of business, whether it is a service or manufacturing business. Its purpose is to provide a framework for documenting and monitoring a quality process. However, it is not, in itself, a quality process. ISO9001:2000 certification says, simply, that the organization has a structured quality process in place, a method of managing and documenting that process, and a method of monitoring its success. Thus, for ISO9001:2000 to work, there must be some sort of underlying quality process to be measured and documented. As will be shown in Chapter 3, the digital investigation process developed in this research is capable of providing such an underlying process. By applying the End-to-End Digital Investigation process to the conduct of digital investigations, a framework for the development of a quality process is possible.
An important question, however, is the applicability of an ISO9000-type structure to the investigative process. Clearly, an organization whose business is conducting digital investigations could benefit from imposing the quality control requirements of ISO9001:2000 on the underlying quality process. Such an organization could be in the private sector, such as a consulting company, or it could be an element of law enforcement, the military or some other investigative agency. There is a question, of course, as to where such a quality management process might fit with laboratory quality processes such as will be discussed below.
The most important question as regards the current work is whether there exists a potential redundancy of the work with existing standards such as ISO9001:2000. Here the answer is emphatically that there is no overlap. While ISO9001:2000 is intended to monitor, document and manage a quality process, the investigative process discussed in this thesis is, arguably, a quality process that could be so managed, monitored and documented.
ISO9001 is described in some detail by Praxiom Research Group Ltd. http://www.praxiom.com/iso-9001.htm
Because the current work provides a very structured, formally provable investigative process, it offers a platform upon which ISO9001:2000 could perform the tasks for which it was intended. The role of CPNets in this context also is important. Here it is clear that the use of CPNets is not, in itself, a quality process (although one might be tempted to interpret the successful modelling of an investigation as a test of the quality of that investigation). It is, rather, an underlying mathematical representation of the process. That the mathematical proof returns a correct response says little about the quality of the process. It merely says that the process performed as expected. It could be a very bad investigation that was mathematically complete while being very low in quality. In other words, rather than being redundant with the ISO9001:2000 process, digital investigation, if viewed as a quality process, could be subject to the standard. The application of a quality control and documentation mechanism to the digital investigative process and, indeed, the digital forensic process, is one that currently is being explored by a number of forensic laboratory managers. Most emphasis to date, however, has been on ISO17025 (see 2.4.1.3 below) rather than ISO9001:2000.
2.4.1.2 ISO15408
ISO15408 is, perhaps, better known as the Common Criteria (CC). The CC is a set of standards by which an information security product may be measured to some level of assurance. At its highest assurance levels it requires the use of formal verification methods. The CC is interesting in that it does not mandate a particular level of security requirements. It simply says that a standard, called a Protection Profile, assured to a particular level of assurance, may be created for some product type and that actual products of that type may be tested against that standard using a very structured testing approach. The mechanism for accomplishing this is to create the Protection Profile and then create a Security Target for the product to be tested that meets, at some level, the Protection Profile. If certifiers agree that the Security Target meets the Protection Profile, actual products may be tested against the Security Target and certified at some level of compliance.
While it may be argued that a Protection Profile could be written for the investigative process, there are two important reasons why this is not appropriate. First, the CC was developed for products, not processes. That said, there are some processes that may be compatible with, if not certifiable by, the CC. Generally, these processes are standards of some type, the most interesting one being HIPAA (the Health Insurance Portability and Accountability Act). At its core, however, HIPAA is a standard, not a process. The second argument against using the CC to write a Protection Profile for a digital investigation is that the issues addressed by the CC, specifically in the so-called catalogue of security elements, are not fully consistent with the issues addressed in a digital investigation as embodied in the work reported here.
The most important question, as in the discussion of ISO9001:2000 above, is whether the CC and the process reported here are redundant. Clearly, that is not the case since there are fundamental incompatibilities of purpose between the two. Simply, the CC is not, as relates to the digital investigative process, fit for purpose. It may, however, be used to validate products used in the investigative or forensic processes. In that context, ISO15408 may have the potential for some indirect impact.
2.4.1.3 ISO17025
ISO17025 is a laboratory standard and is becoming very popular with organizations developing digital forensic laboratories. Here is where the issue of a quality process becomes both important and useful in the context of the current work. ISO17025 may, in the case of a testing laboratory, be considered a subset of ISO9001:2000. Thus, a forensic laboratory complying with ISO17025 very nearly complies with ISO9001:2000. The notion that a testing laboratory may use an underlying process such as the End-to-End process reported in this thesis is valid. The use of such a process, if appropriate, may provide an underlying quality process that must be managed, documented and monitored. The structured nature of the End-to-End process, if applied in a laboratory setting, certainly would comprise a quality structure that easily could be managed using a standard such as either ISO9001:2000 or ISO17025.
The bottom line, therefore, when discussing the work reported in this thesis is that the processes developed during this research can, in fact, offer a basis for compliance with a certification program such as those provided by applicable ISO standards, but are not, in themselves, such a quality management program.
2.4.1.4 Fraud Investigation Techniques

There is a potential convergence of the End-to-End process as described in this thesis with the techniques for investigating complex frauds. This convergence lies in the need to manage the details of a very complex investigation that seeks to discover not a needle in a haystack, but a needle in a stack of needles. Experienced fraud investigators use techniques such as link analysis to discover connections between apparently unconnected subjects in an investigation. We view this approach as tool-based in that it builds upon the procedures inherent in the use of a particular tool, in this case a link analyzer. We view the process described in this thesis as tool-independent. In actual practice, specific tools, such as link analyzers, can be used within a particular investigative process which the structure developed in this research can report. For example, as will be seen, the application of a link analyzer simply is the application of a particular tool which implements a particular investigative technique, which, in turn, produces results. This entire process may be characterized using the process described in this thesis. That characterization is highly structured and may, in fact, be modelled.
2.4.2 Best Practices

There are few examples of coherent best practices in digital investigation. This is partly due to the ambiguous nature of the notion of best practices (What is best? Who says it's best? What are the criteria for determining that the practice is a best practice?), and partly due to the immaturity of digital investigation in general. However, there are a few examples of so-called best practices that could, legitimately, be examined for comparison with the EEDI process. The primary source for such best practices is the International Organization on Computer Evidence (IOCE) [IOC00], [IOC02]. However, these best practices pertain largely to the handling of individual items of electronic or digital evidence. As in many other documents, the emphasis is on the evidence, not on the digital investigative process. Other, similar, documents such as the Internet Society's RFC3227 (Request for Comments) address the issue similarly. Thus, it is safe to say that there is little, if any, evidence of a set of best practices that addresses the digital investigative process holistically, as is the aim of this research.
3. STRUCTURED DIGITAL INVESTIGATION
3.1 Introduction
It is desirable to structure a digital investigation such that it is consistent with other, physical, investigations [RCG02]. To do otherwise could compromise the investigation itself, the evidence collected and, ultimately, the outcome. An important problem, however, that faces digital investigators is the virtual nature of the crime scene. Most of the paradigms of the physical world do not exist in cyberspace. For example, a virtual crime scene may exist in multiple places: the source of the attack, the victim of the attack and several intermediate devices. This is in contrast to the localized nature of a physical crime scene. Additionally, in a physical crime there usually is physical evidence. In a virtual crime the evidence may be in the form of data that exists only representationally (i.e., physical representation such as a print-out of logical data within a computer). This complicates the process of discovering, collecting, analyzing and preserving evidence. Given this challenge, digital investigators have developed techniques that mimic physical world investigative practices. Even so, digital investigations lack the consistency of physical world investigations. Some researchers have begun the process of structuring the digital investigative process sufficiently to provide a framework for managing a virtual environment. The DFRWS and Reith, et al, have led that effort in recent months. However, the US Department of Justice [DOJ02] has added materially to the process with considerable reference to existing case law as well as parallels drawn from long experience in physical investigations. These procedures, especially the Department of Justice recommendations, have set the stage for the actions of law enforcement investigators faced with digital crimes. 
Because virtually all efforts to date have focused upon the physical crime scene (the involved computers and other devices), the concept of network forensics has received less attention from the investigative community. Even with the acceptance of the network as part of the digital crime scene, the full impact of the total path is not realized in most digital investigations. As a basis for the research underlying this thesis we have developed
an approach to digital investigation called End-to-End Digital Investigation (EEDI) [PS02A]. The EEDI process takes into account a structured process meeting the definition in Section 1.6.10, the network involved, the attack computer, the victim computer and all of the intermediate devices on the network. EEDI is described in detail in topics within this section. The EEDI approach is the cornerstone upon which the processes described in this thesis are built. Without a coherent end-to-end process, investigations may be fragmented into sub-investigations, centering upon individual devices, which the investigator attempts to connect at the end of the investigation. Such connections may be tenuous in a complicated investigation involving many devices of different types, multiple networks, and interconnections over a large public network such as the Internet. EEDI extends the DFRWS framework to a coherent, single investigative process wherein every device in the end-to-end chain of events becomes connected from an evidentiary perspective, and the evidence gathered may be used, in turn, to form a single, corroborated chain that supports the facts of the event. Simply put, EEDI combines the many individual events and pieces of evidence in a complex computer crime into a single chain that can be characterized and proven to meet appropriate investigative criteria. EEDI scales very well to large network investigations, but involves a number of practical network complexities that can complicate its use over large public networks such as the Internet. Chief among these complexities is the ability (or lack thereof) to access other computing devices that may be intermediate platforms between the attacker and the victim. Often these devices are not owned by the investigator and, therefore, are inaccessible to him or her. Where official law enforcement is involved in the investigation, of course, these barriers do not, usually, exist. 
A second challenge to the EEDI process is exactly the opposite of those challenges on large scale networks. Applying EEDI to an attack against a single computer, where only that computer was involved (an attack from the computer's console, or a local attack), requires some interpretation of the EEDI paradigm. These challenges aside, EEDI is the only current structured digital investigative technique that admits of the entire attacker-to-victim environment, both physical and logical, as the digital crime scene.
3.2 Underlying Digital Investigative Process Frameworks

Before we can structure an end-to-end process, we need to designate a generalized framework for an investigation. The DFRWS [DFRW01] and Reith, et al [RCG02] each
have approached that challenge with an individual framework for the digital investigative process (DIP). The DFRWS Framework was developed first and Reith et al built upon it, adding some additional functions. The DFRWS framework is:
Identification
Preservation
Collection
Examination
Analysis
Presentation
Decision
Because we used the DFRWS Framework for the digital investigative process reported in this work, we will address it in far more detail in 3.2.2. However, the Reith Framework is very similar to the DFRWS Framework, having been derived from it, and provides a good starting point for understanding the current state of structured investigative frameworks in general. Reith consists of:
Identification: recognizing an incident from indicators and determining its type. This is not explicitly within the field of forensics, but significant because it impacts other steps.
Preparation: preparing tools, techniques, search warrants, and monitoring authorizations and management support.
Approach strategy: dynamically formulating an approach based on potential impact on bystanders and the specific technology in question. The goal of the strategy should be to maximize the collection of untainted evidence while minimizing impact to the victim.
Preservation: isolate, secure and preserve the state of physical and digital evidence. This includes preventing people from using the digital device or allowing other electromagnetic devices to be used within an affected radius.
Collection: record the physical scene and duplicate digital evidence using standardized and accepted procedures.
Examination: in-depth systematic search of evidence relating to the suspected crime. This focuses on identifying and locating potential evidence, possibly within unconventional locations. Construct detailed documentation for analysis.
Analysis: determine significance, reconstruct fragments of data and draw conclusions based on evidence found. It may take several iterations of examination and analysis to support a crime theory. The distinction of analysis is that it may not require high technical skills to perform and thus more people can work on this case.
Presentation: summarize and provide explanation of conclusions. This should be written in a layperson's terms using abstracted terminology. All abstracted terminology should reference the specific details.
Returning evidence: ensuring physical and digital property is returned to proper owner as well as determining how and what criminal evidence must be removed. Again not an explicit forensics step, however any model that seizes evidence rarely addresses this aspect.
The Reith framework adds three steps to the DFRWS framework: preparation, approach strategy and returning evidence. We include the approach strategy step in both the preservation and collection steps. The preparation step we include in the preservation step. The step of returning evidence does not directly impact the investigation and, in fact, may not be desirable, as when evidence must be preserved for a number of years pending appeals through the courts [DOJ02]. Thus, for practical purposes, the DFRWS framework, with concessions to processes within the Reith framework, will be the basis for the EEDI approach.
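The folding of the Reith steps into the DFRWS steps described above can be written down as a simple lookup table. The following Python sketch is illustrative only: the names REITH_TO_DFRWS and reith_steps_absorbed_by are hypothetical and are not part of EEDI or of either framework.

```python
# DFRWS steps as listed in the text above.
DFRWS_STEPS = [
    "Identification", "Preservation", "Collection",
    "Examination", "Analysis", "Presentation", "Decision",
]

# Hypothetical lookup: where each Reith step lands in the DFRWS framework,
# following the text. An empty list means the step is not carried over.
REITH_TO_DFRWS = {
    "Identification": ["Identification"],
    "Preparation": ["Preservation"],
    "Approach strategy": ["Preservation", "Collection"],
    "Preservation": ["Preservation"],
    "Collection": ["Collection"],
    "Examination": ["Examination"],
    "Analysis": ["Analysis"],
    "Presentation": ["Presentation"],
    "Returning evidence": [],
}


def reith_steps_absorbed_by(dfrws_step):
    """List the Reith steps that are handled inside one DFRWS step."""
    return [reith for reith, targets in REITH_TO_DFRWS.items()
            if dfrws_step in targets]


if __name__ == "__main__":
    for step in DFRWS_STEPS:
        print(step, "<-", reith_steps_absorbed_by(step))
```

Keeping the mapping as data rather than prose makes it easy to see that Preservation absorbs the most Reith activity, and that returning evidence has no counterpart.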
3.2.1 The EEDI Domain Concept

We have taken the position that the EEDI process allows a full analysis of the investigation process, and provides for the presentation of the results of the investigation on a timeline or other presentation medium. However, process, evidence and presentation clearly comprise different data types. While it may theoretically be possible to combine those data types into a single analysis process, from a practical perspective the results of the process would, at best, be very complex and, at worst, be ambiguous. Therefore, the EEDI process divides the body of investigative data into three separate, but interrelated, domains: Process, Evidence and Temporal. The Process Domain describes, formally, the investigative process and is the subject of this thesis. The Evidence Domain describes the incident and applies formal modelling techniques to analysis of the evidence describing the incident. The Evidence Domain, with the exception of its relationship to the Process Domain, is a topic for future research. The Temporal Domain extracts the chain of events from the Evidence Domain and
places them on a timeline for presentation to a finder of fact 6. The Temporal Domain is one embodiment of the Presentation Class of the DFRWS framework. The data in the Evidence Domain is developed through the correct use of the EEDI process as described in the Process Domain. If modelling of the Process Domain indicates that gaps or errors may exist in the investigative process, it may be that the evidence developed by the process either is erroneous or missing. The relationships of the three EEDI domains and the classes of the DFRWS framework are depicted in Figure 1. The DFRWS framework is described in the next section and is the basis for the EEDI process. All classes (except Decision) are included in the Process Domain, while the Evidence Domain is developed from the Analysis class of the process and the Temporal Domain is derived from both the Analysis and Presentation classes. Note that the Analysis class is the glue that holds all three domains together. A failure in process, therefore, impacts both the Evidence and Temporal Domains.
[Figure 1 diagram. Labels: TEMPORAL DOMAIN; EVIDENCE DOMAIN; PROCESS DOMAIN; classes IDENTIFICATION, PRESERVATION, COLLECTION, EXAMINATION, ANALYSIS, PRESENTATION]
Figure 1 - Relationship of the EEDI Domains to the DFRWS Framework
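The relationships stated in the text can also be expressed as a small table, which makes it easy to see which domains a failure in a given class would touch. This is a minimal sketch based only on the prose description above; the names DOMAIN_CLASSES and domains_touching are hypothetical.

```python
# Which DFRWS classes feed each EEDI domain, as described in the text:
# all classes except Decision belong to the Process Domain, the Evidence
# Domain is developed from Analysis, and the Temporal Domain is derived
# from both Analysis and Presentation.
DOMAIN_CLASSES = {
    "Process": ["Identification", "Preservation", "Collection",
                "Examination", "Analysis", "Presentation"],
    "Evidence": ["Analysis"],
    "Temporal": ["Analysis", "Presentation"],
}


def domains_touching(framework_class):
    """Return the domains that draw on the given framework class."""
    return [domain for domain, classes in DOMAIN_CLASSES.items()
            if framework_class in classes]


if __name__ == "__main__":
    # Analysis is the only class shared by all three domains.
    print(domains_touching("Analysis"))
    print(domains_touching("Collection"))
```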
3.2.2 Applying the DFRWS Framework to EEDI

The DFRWS framework, referred to as the Investigative Process for Digital Forensic Science [DFRW01], is depicted in Figure 2 below. The process is divided into six
The term finder or trier of fact refers to the legal concept of a judge or jury.
investigative steps. The framework characterizes the Process Domain. Below each of the steps are the draft topics that the steps address. We expect these topics to be refined further in subsequent DFRWS conferences and as researchers progress with a more granular definition of the process. However, for our purposes, the steps and the topics are adequate. As the reader will note in Chapter 4, we have used many of the DFRWS steps and draft topics as the basis for constructing the extensions to the CISL that comprise the DIPL. Additionally, although Reith refers to both the Reith process and the DFRWS process as models, they are both more correctly process frameworks. The DFRWS Roadmap acknowledges this and does not refer to the table below as a model. We will refer to it as a process or a framework for digital investigations. The DFRWS adds the Decision step; however, we do not view that as part of the investigative process. Rather, it is the inevitable outcome of the execution of a digital investigation. The decision may comprise the opinion of a court, arbitrator, mediator or other finder of fact. It may also be the result of a confession, plea bargain or closed case due to inability to proceed. EEDI is concerned with each of the six steps shown in the framework in Figure 2.
IDENTIFICATION: Event/Crime Detection; Resolve Signature; Profile Detection; Anomalous Detection; Complaints; System Monitoring; Audit Analysis
PRESERVATION: Case Management; Imaging Technologies; Chain of Custody; Time Synch.
COLLECTION: Preservation; Approved Methods; Approved Software; Approved Hardware; Legal Authority; Lossless Compression; Sampling; Data Reduction; Recovery Techniques
EXAMINATION: Preservation; Traceability; Validation Techniques; Filtering Techniques; Pattern Matching; Hidden Data Discovery; Hidden Data Extraction
ANALYSIS: Preservation; Traceability; Statistical; Protocols; Data Mining; Timeline; Link; Spatial
PRESENTATION: Documentation; Expert Testimony; Clarification; Mission Impact Statement; Recommended Countermeasure; Statistical Interpretation
Figure 2 - The DFRWS Investigation Framework Matrix
Procedurally, we begin by charting out the steps in the investigation using a narrative. Our objective is to address each of the appropriate cells in the process represented by Figure 2. The narrative may be created before or after the investigation depending upon the purpose of the analysis. If the purpose is the design of an investigative process, we would develop the model prior to conducting the investigation. This approach is appropriate for creating a model framework for future use. When analyzing an existing investigation, we apply the actual narrative of the steps taken to the framework in preparation for modelling the actual investigation.
Once the narrative is complete, we translate it into the Digital Investigation Process Language (DIPL). The DIPL is a semi-formal process language derived from the Common Intrusion Specification Language (CISL) [FKP+99]. The DIPL is described in detail with an example investigation in Section 4.
The final step is the formal modelling of the investigation using the DIPL as input for a formal modelling program. For our purposes we have selected Design/CPN, a freeware modeller for Coloured Petri Nets. Design/CPN uses a graphical paradigm which we posit is appropriate for explanations to lay audiences such as one would encounter in a court of law. The Design/CPN process is described fully in Section 5 using the DIPL listing from Section 4. The process offers the ability to identify investigative process flaws that could compromise the investigation procedurally, lead to the development of flawed evidence, or cause important evidence to be missed. The chain of evidence developed in the EEDI process depends upon the full, complete and correct use of the process from beginning to end. The generalized EEDI process is illustrated in Figure 3.
INVESTIGATIVE NARRATIVE -> DIPL CHARACTERIZATION -> CPN MODELING -> FORMAL MODEL OF INVESTIGATION
Figure 3 - The Generalized EEDI Process Flow
3.2.2.1 The Investigative Narrative

The investigative narrative usually comprises the investigator's notes. The EEDI process supports the construction of an investigation around an investigative framework. For the purposes of this thesis we use the DFRWS framework shown in Figure 2. The framework includes the basic areas where investigative and forensic controls are required. For example, under the Collection class we find reference to approved software, hardware and methods. This indicates that the forensic software, hardware and methods used by the investigator or digital forensic examiner must meet some standard of acceptance within the investigative community. That standard usually refers to court testing. Should the investigator or forensic examiner not adhere to that standard, there is the possibility that the evidence collected will be subject to challenge. At critical points in the investigation, such as the collection of primary evidence, such a lapse could jeopardize the outcome materially. It should be noted that the framework does not necessarily alter the generalized investigative techniques of experienced investigators and forensic examiners. Rather, it adds a dimension of rigour and quality assurance to the digital investigative process. It also ties the functions of the forensic examiner and the investigator tightly together, ensuring that the chain of evidence is properly supported, developed and maintained.
3.2.2.2 DIPL Characterization

The Digital Investigation Process Language is discussed in detail in Section 4. Its purpose is to allow a formal description of the investigative process, whether planned or actually executed. The language provides for a formal description of the investigative process using a vocabulary of Semantic Identifiers (SIDs) to describe tasks, functions and actions. The language is derived from the Common Intrusion Specification Language (CISL) [FKP+99] which, in turn, was derived from LISP. Using a simple example of DIPL as an illustration:
    (ManageCase
        (Initiator
            (RealName Joe Investigator)
        )
        (Link 123.221.3.2, 222.122.56.4)
        (Data
            (ChainOfCustody Joe Investigator)
            (CaseName A237-4)
            (EvidenceID A237-4-1)
        )
        .
        .
        .
        (BeginTime [16:54:23 GMT 03052002])
        (EndTime [21:13:45 GMT 03052002])
        (Comment This is the source and destination of the attack from firewall logs on FW-3. The logs are entered as evidence.)
        (When
            (Time [09:30:00 GMT 05052002])
        )
    )
Figure 4 - DIPL Code Fragment
The narrative for this DIPL fragment is: The investigator (Joe Investigator) is entering his case notes for a particular event. The event is a link between IP address 123.221.3.2 and 222.122.56.4 (not real addresses). Joe Investigator places the evidence into chain of custody and assigns it the evidence number A237-4-1 in case number A237-4. If there were other pieces of evidence the investigator wished to include, they would go where the dots indicate in the listing. The time period covered by the events in the evidence is from 16:54:23 GMT on 03 May 2002 to 21:13:45 GMT on the same day. The investigator performed these actions at 09:30:00 GMT on 05 May 2002. The evidence, per the investigator's comments, consists of a log from firewall FW-3 that covers the period in question.
We will show the syntax of each DIPL SID in Appendix 1. The DIPL uses some of the same SIDs as the CISL plus a number of new SIDs developed particularly for the EEDI process.
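Because the fragment in Figure 4 is a nested, parenthesized form, it can be read mechanically into a tree. The Python sketch below is not the DIPL grammar from Appendix 1. It assumes only what the fragment shows: each form opens with a parenthesis followed by a SID, and contains free text, nested forms, or both. The names parse_dipl and find are hypothetical, and the elided lines of the fragment and its Comment form are left out of the sample input.

```python
import re

FRAGMENT = """
(ManageCase
  (Initiator (RealName Joe Investigator))
  (Link 123.221.3.2, 222.122.56.4)
  (Data
    (ChainOfCustody Joe Investigator)
    (CaseName A237-4)
    (EvidenceID A237-4-1))
  (BeginTime [16:54:23 GMT 03052002])
  (EndTime [21:13:45 GMT 03052002])
  (When (Time [09:30:00 GMT 05052002])))
"""


def parse_dipl(source):
    """Parse one DIPL-style form into nested dicts (illustrative sketch)."""
    # Split into '(', ')' and runs of text; drop whitespace-only runs.
    tokens = [t for t in re.findall(r"\(|\)|[^()]+", source) if t.strip()]
    pos = 0

    def parse_form():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        if pos >= len(tokens) or tokens[pos] in ("(", ")"):
            raise ValueError("expected a SID after '('")
        parts = tokens[pos].split(None, 1)  # first word is the SID
        pos += 1
        node = {"sid": parts[0],
                "text": parts[1].strip() if len(parts) > 1 else "",
                "children": []}
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                node["children"].append(parse_form())
            else:
                # Free text between child forms is appended to the node text.
                node["text"] = (node["text"] + " " + tokens[pos].strip()).strip()
                pos += 1
        if pos >= len(tokens):
            raise ValueError("missing ')'")
        pos += 1  # consume ')'
        return node

    root = parse_form()
    if pos != len(tokens):
        raise ValueError("trailing input after the outermost form")
    return root


def find(node, sid):
    """Return the first node with the given SID (depth-first), or None."""
    if node["sid"] == sid:
        return node
    for child in node["children"]:
        hit = find(child, sid)
        if hit is not None:
            return hit
    return None


if __name__ == "__main__":
    case = parse_dipl(FRAGMENT)
    print(case["sid"], find(case, "EvidenceID")["text"])
```

A tree of this kind is one plausible intermediate form between the DIPL listing and a formal model, since each SID and its arguments can then be inspected individually.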
3.2.2.3 CPN Modelling

While the DIPL focuses upon providing a structured description of the investigative process, we perform formal modelling using Coloured Petri Nets (CPN). The CPN modeller, Design/CPN, is used to build a formal model of the process described by the DIPL. We can use the modeller in two ways. First, we can create a model of the actual DIPL investigation. Often, this does little more than create the formal model of the actual process. It can, however, be used to simulate the probable outcome of a set of investigative actions. This capability promises to be of significant value in the investigative process. The second approach, and by far the more useful, is to create a CPN template of an acceptable investigation and load the actual DIPL process into it. Where the actual
process has failed, the modeller will fail as well, showing clearly where the process flaw exists. This is the approach we have taken in the research on which this thesis is based. An example of this approach is discussed in Section 5. This approach works most effectively if we break the process template into modules. The modular approach has two benefits. First, it allows for easy updating of the model template and rapid processing of the actual investigation. Second, and most important, individual modules are far easier than the entire investigation for a lay audience to visualize. Thus, it aids in the clear presentation of an investigation to individuals not familiar with the underlying mathematics of formal models.
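The template idea can be illustrated with a much simpler net than the one built in Design/CPN. The Python sketch below uses an ordinary place/transition net, not a Coloured Petri Net, and it does not use Design/CPN in any way. It assumes, purely for illustration, a strictly linear template in which each DFRWS class is a transition that needs a token from the class before it. The names TEMPLATE and replay are hypothetical.

```python
# Each entry is (transition, input place, output place). A transition is
# enabled only when its input place holds a token.
TEMPLATE = [
    ("Identification", "start", "identified"),
    ("Preservation", "identified", "preserved"),
    ("Collection", "preserved", "collected"),
    ("Examination", "collected", "examined"),
    ("Analysis", "examined", "analyzed"),
    ("Presentation", "analyzed", "presented"),
]


def replay(actual_steps, template=TEMPLATE):
    """Fire template transitions in the order the investigation took them.

    Returns (ok, message). The replay stops at the first step that is not
    enabled, which is where the process departs from the template.
    """
    marking = {"start": 1}  # one token in the initial place
    arcs = {name: (src, dst) for name, src, dst in template}
    for index, step in enumerate(actual_steps):
        if step not in arcs:
            return False, f"step {index}: '{step}' is not in the template"
        src, dst = arcs[step]
        if marking.get(src, 0) < 1:
            return False, (f"step {index}: '{step}' is not enabled "
                           f"(no token in '{src}')")
        marking[src] -= 1
        marking[dst] = marking.get(dst, 0) + 1
    final_place = template[-1][2]
    if marking.get(final_place, 0) < 1:
        return False, f"process stopped before reaching '{final_place}'"
    return True, "template completed"


if __name__ == "__main__":
    complete = [name for name, _, _ in TEMPLATE]
    print(replay(complete))
    # An investigation that skipped Preservation fails at Collection.
    print(replay([s for s in complete if s != "Preservation"]))
```

The point of the sketch is the failure mode: when the actual sequence of steps is loaded into the template, the net cannot proceed past the first missing step, and the blocked transition names the flaw.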
3.2.2.4 The DFRWS Framework Classes

The DFRWS Framework classes [DFRW01] contain key elements that are under constant review by the digital forensics community. However, there is a continuity between the classes that is important. For example, we note that the Preservation class continues as an element of the Collection, Examination and Analysis classes. This indicates that preservation of evidence, as characterized by case management, imaging technologies, chain of custody and time synchronization, is an ongoing requirement throughout the digital investigative process. Thus preservation is "a guarded principle across forensic categories" [DFRW01]. Traceability, likewise, is a guarded principle, but not across all forensic categories. The following topics discuss each of the DFRWS Framework classes in more detail. The reader may refer to Figure 2 above for the graphical representation of the Framework.
3.2.2.4.1 The Identification Class

The identification class describes the method by which the investigator is notified of a possible incident. Since about 50% of all reported incidents have benign explanations [7], processing evidence in this class is critical to the rest of the investigation. Likewise, as it is the first step in the EEDI process, it is the only primary evidence not corroborated directly by other primary evidence. Therefore, a more significant amount of secondary evidence is needed to validate the existence of an actual event. The DFRWS gives the following definition of the Identification Class [DFRW03]:

"Determining items, components and data possibly associated with the allegation or incident. Perhaps employing triage techniques."

[7] Author's experience over 20 years of conducting incident response exercises.

The descriptions that follow of the elements of the individual Framework classes are those that we have adopted as specific definitions for the purposes of EEDI. The DFRWS has, as of this writing, not developed such definitions. Elements marked with an asterisk (*) are required elements within the DFRWS Class. In other words, to satisfy the Class, at least those elements must be present. The elements of the Identification class are:

*Event/Crime Detection. This element implies direct evidence of an event. An example of such direct evidence is discovery of a large number of credit card numbers having been downloaded from a server.
Resolve Signature. This applies to the use of some automated event detection system such as an intrusion detection system or antivirus software program. The system in use must make its determination (of the presence of an event of interest) by means of signature analysis and mapping.
Profile Detection. Like signature resolution, profile detection usually relies upon some automated event detection system. However, in this instance, the event will be characterized through matching with a particular profile as opposed to an explicit signature. Signatures generally apply to an individual event. Events, however, may come together in an attack scenario, or attack profile. Such a profile may consist of a number of events, a pattern of behaviour, or pattern of specific results of an attack.
Anomalous Detection. Again, like the preceding two elements, this usually relies upon a detection system. However, in the case of anomalous detection, the event is deduced from the detection of patterns of behaviour outside of the observed norm. This is the classic Sherlockian case of the dog that did not bark [ACD02].
Complaints. This element relies upon the direct reporting of a potential event by an observer. The observer may observe the event directly or simply the end result of the event.
System Monitoring. System monitoring explicitly requires some sort of intrusion detection, anti-virus or similar system in place. It is less specific than other elements requiring a specific action (e.g., anomaly, profile or signature detection) and may be used together with another element of this class.
Audit Analysis. This element refers particularly to the analysis of various audit logs produced by source, target and intermediate devices.
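The rule that a Class is satisfied only when its asterisked elements are present can be expressed as a simple check. The sketch below is illustrative only: the element names come from the list above, while the dictionary layout and the function names are assumptions and not part of the DIPL or of the Framework.

```python
# Elements of the Identification class; True marks an asterisked
# (required) element, as given in the list above.
IDENTIFICATION_ELEMENTS = {
    "Event/Crime Detection": True,
    "Resolve Signature": False,
    "Profile Detection": False,
    "Anomalous Detection": False,
    "Complaints": False,
    "System Monitoring": False,
    "Audit Analysis": False,
}


def missing_required(elements, documented):
    """Return the required elements that have not been documented."""
    unknown = set(documented) - set(elements)
    if unknown:
        raise ValueError(f"not elements of this class: {sorted(unknown)}")
    return sorted(
        name
        for name, required in elements.items()
        if required and name not in documented
    )


def satisfies_class(elements, documented):
    """A class is satisfied when no required element is missing."""
    return not missing_required(elements, documented)
```

The same function could be applied to the other Framework classes by supplying their own element tables.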
3.2.2.4.2 The Preservation Class

The Preservation Class deals with those elements that relate to the management of items of evidence. The DFRWS describes this class as "a guarded principle across forensic categories" [DFRW01]. The requirement for proper evidence handling is basic to the digital investigative process as it relates to legal actions. The DFRWS defines this class as [DFRW03]:

"Ensuring evidence integrity or state"

*Case Management. This element covers the management of the investigative process by investigators and digital forensic examiners. Typical in this element are investigator notes, process controls, quality controls, and procedural issues.
Imaging Technologies. This element is separate from the elements in the Collection Class in that it does not refer to specific hardware, software or techniques. The imaging technologies element refers to the technology used for imaging computer media. For example, physical imaging or bitstream backup may be considered an appropriate imaging technology whereas a logical backup would not be. The term imaging as used here is rather broad. It encompasses not only the technology used to create an image of computer media, but also the technology used to extract such items as logs from a device. In this case the log might be extracted from a bitstream image or it might be read out of the device to a peripheral as a result of a keystroke command issued by the investigator.
*Chain of Custody. This element refers to the process of limiting access to and subsequent alteration of evidence. In most jurisdictions chain of custody rules require that the evidence custodian be able to account for all accesses or possible accesses to items of evidence within his or her care from the time it is collected until the time it is used in a legal proceeding.
*Time Synchronization. This element refers to the synchronizing of evidence items to a common time base. Since logs and other evidence are collected from a number of devices during the conduct of an investigation, it is clear that those devices can differ from each other in terms of time base. If all devices are in a single time-synchronized network, they will not, of course, differ. However, that rarely is the case and some effort must be made to obtain a common time base for all devices. There are two approaches one might take. The first approach is to normalize all times on evidence (such as logs) to a common device clock such as that of the victim computer. The second is to use a common time zone (TZ) such as Universal Time (UT) or Greenwich Mean Time (GMT) as a baseline. No evidence is modified. The investigator simply notes the variance of a particular log or other piece of digital evidence from the pre-determined time standard. This also is referred to as normalizing time stamps.
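As a minimal sketch of the second approach, the variance of a device clock from the chosen time standard can be recorded once and then applied when reading the log, leaving the log itself untouched. The function names and the clock readings below are hypothetical; only the device name FW-3 is taken from the earlier example.

```python
from datetime import datetime, timezone


def clock_variance(device_reading, reference_reading):
    """Variance of a device clock from the time standard.
    A positive result means the device clock runs fast."""
    return device_reading - reference_reading


def normalized_view(log_times, variance):
    """Pair each original time stamp with its normalized value.
    The original values are kept, so no evidence is modified."""
    return [(stamp, stamp - variance) for stamp in log_times]


utc = timezone.utc
# Hypothetical readings taken at the same instant by the investigator.
reference = datetime(2002, 5, 5, 9, 30, 0, tzinfo=utc)   # time standard
fw3_clock = datetime(2002, 5, 5, 9, 32, 10, tzinfo=utc)  # FW-3 clock

variance = clock_variance(fw3_clock, reference)           # 130 seconds fast
fw3_log = [datetime(2002, 5, 3, 16, 56, 33, tzinfo=utc)]
view = normalized_view(fw3_log, variance)
```

Keeping the original and normalized values side by side lets the investigator document the variance without altering the log.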
3.2.2.4.3 The Collection Class

The Collection Class is concerned with the specific methods and products used by the investigator and forensic examiner to acquire evidence in a digital environment. As has been noted, the Preservation Class continues as an element of this Class. With the exception of the Legal Authority element, the elements of this class are largely technical. The DFRWS defines this class as [DFRW03]:

"Extracting or harvesting individual items or groupings."

*Approved Methods. This element refers to the techniques used by the forensic examiner or investigator to extract digital evidence. The concept of being approved is somewhat different than one might expect. Approval refers to the general acceptance in courts of the techniques and training or certifications of the individual performing the evidence collection. The most rigorous test of methods and technologies is the Daubert test [DVM93]. However, due largely to the immaturity of digital forensic science, most court tests have not had this level of rigor applied. For this reason, those elements in this class that relate to approval derive their authority from cases where the technique, technology or product has been challenged in a court of the same level as the case in question and has survived the test.
Approved Software. This element addresses the specific software product used to collect evidence. The discussion of the approval process above applies. There is an issue specifically involving software used for digital forensic data collection. In order for a software program to be considered approved it must be identical in every way to the software that has survived either a Daubert hearing or a court challenge. Failing that, the program may need to undergo its own court testing. For the purposes of the Framework and subsequent EEDI procedures, however, a program that has any differences (i.e., version level, bug fixes, source code changes, etc.) from the program tested originally is not considered to be Approved Software.
Approved Hardware. This element describes the hardware, if any, used to collect evidence. Usually this is not an issue unless the hardware is designed specifically for use in a digital forensic evidence collection environment. To a lesser extent the caveats of sameness that apply to approved software apply to approved hardware. The hardware device used must in every way be identical to instances of the device that have survived court challenges. The Approved Hardware element does not apply to simple computers, disks or other media used by the examiner to collect evidence unless the device was developed explicitly for digital forensic evidence collection and contains special unique features for use in that environment only.
*Legal Authority. The Legal Authority element is the only element of this class that is non-technical. In most jurisdictions some legal authority is required prior to extracting information from computer media. This authority could be a policy, a subpoena or a search warrant as examples. Failure to comply with applicable laws may render the evidence collected useless in a court of law.
Lossless Compression. This element refers to the compression techniques, if any, used by backup, encryption or digital signature software used to collect and/or preserve evidence. If the software program uses compression, it must be proven to be lossless, that is, to have no impact whatever upon the evidence on which it is used.
Sampling. If sampling techniques are used to collect evidence, it must be shown that the technique has no impact upon the evidence collected, or, if it has, that the impact can be demonstrated clearly and unambiguously. It must also be shown that the sampling method is valid (generally accepted by the mathematical community) and that the conclusions that may be drawn from the sample are defined clearly.
Data Reduction. When techniques and/or programs (such as normalization) are used to reduce data that contains or may contain evidence, it must be shown that such techniques or programs produce valid, repeatable, provable results that do not affect, in any way whatever, the evidence being collected. For example, using data reduction directly on evidence would alter the evidence and would not be acceptable. However, using such methods or tools on a copy of the evidence would have no direct effect upon the evidence. Its effect upon the analysis of the evidence (the validity of conclusions, for example) is an issue for the Examination and/or Analysis Class(es).
Recovery Techniques. This element refers to the recovery of data that may contain evidence from a digital device. It specifically describes the methods used by the forensic examiner to extract evidence using approved hardware, software and methods. While the elements of approved hardware, software and methods refer to the naming (or brief description of) the element and the connection between the element and the appropriate court test by which it is approved, this element describes in detail the actual process used to recover the evidence. By extension, when non-forensic methods are used to collect information (traditional investigation methods such as interviewing), we consider these techniques to be Recovery Techniques and we apply the same rules to them (e.g., Approved Methods, Legal Authority, etc.) as we would in a digital environment. However, we apply the rules in the context of the technique used.
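The Lossless Compression requirement can be illustrated with a round-trip check: compress an item, restore it, and compare digests of the original and the restored data. In this sketch zlib stands in for whatever compression a collection tool actually uses, and SHA-256 is chosen only for illustration; neither is named in the Framework. A successful round trip demonstrates losslessness for the item tested and is not, by itself, a general proof.

```python
import hashlib
import zlib


def sha256_hex(data):
    """Hex digest used to compare the data before and after."""
    return hashlib.sha256(data).hexdigest()


def round_trip_preserves(evidence, compress, decompress):
    """True if decompress(compress(evidence)) is digest-identical
    to the original evidence bytes."""
    restored = decompress(compress(evidence))
    return sha256_hex(restored) == sha256_hex(evidence)


# Hypothetical evidence item (a copy, never the original).
sample = b"FW-3 log: link 123.221.3.2 -> 222.122.56.4\n" * 100

lossless_ok = round_trip_preserves(sample, zlib.compress, zlib.decompress)

# A deliberately lossy stand-in that drops trailing whitespace.
lossy_ok = round_trip_preserves(sample, lambda d: d.rstrip(), lambda d: d)
```

The lossy stand-in fails the check because the restored bytes no longer match the original digest.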
3.2.2.4.4 The Examination Class

The Examination Class deals with the tools and techniques used to examine evidence. It is concerned with evidence discovery and extraction rather than the conclusions to be drawn from the evidence (Analysis Class). While the Collection Class deals with gross procedures to collect data that may contain evidence (such as imaging of computer media), the Examination Class is concerned with the examination of that data and the identification and extraction of possible evidence from it. Note that the Preservation Class continues to be present in this class. The DFRWS gives the following description of this class [DFRW03]:

"Closer scrutiny of items and their attributes (characteristics)"

*Traceability. This element is, arguably, the most important element in the EEDI process. It is the traceability and continuity of a chain of evidence throughout an investigation that leads to the credibility and correctness of the conclusions. According to the DFRWS, "Traceability (cross referencing and linking) is key as evidence unfolds" [DFRW01].
Validation Techniques. This element refers to techniques used to corroborate evidence. Evidence may be corroborated in a variety of ways. Traditionally, evidence is corroborated by other, relevant evidence. However, digital evidence may stand on its own merit if its technical validity can be established. For example, a fragment of text extracted from an image of a computer disk may be shown to be a valid piece of evidence through various technical validation techniques. Its applicability or usefulness as an element of proof in an investigation may be open to interpretation, but that it is valid data would not be in dispute. A log, however, if extracted from a device that had been penetrated by a criminal hacker would require additional corroboration (validation) to show that the hacker had not altered its contents.
Filtering Techniques. When dealing with evidence acquired from certain types of digital systems (such as intrusion detection systems) it is not uncommon to find that the gross data has been filtered for expediency by the system. While many intrusion detection experts would agree that filtering at the source (the incoming data flow from sensors) is not as appropriate as filtering the display while preserving the original data, such source filtering does occur. This element requires that the investigator and/or forensic examiner determine and describe the filtering techniques used, if any, and apply the results of that description to the determination of the validity of the data as evidence. Another application of filtering is the extraction of potential evidence from a gross data collection [8] such as a bit-stream image of digital media. Some digital forensic tools use filters to extract data of a particular type such as graphical images. This element requires that the filtering technique be defined clearly and understood by the investigator or forensic examiner. These tools may also use the filtering technique of matching a known hash value to digital items on a gross data collection. Items that match the known hash are presumed to be the same as the item for which the hash value was originally generated. Again, the techniques and tools applied must be clearly understood by the investigator or the forensic examiner.
Pattern Matching. This element addresses methods used to identify potential events by some pre-determined signature or pattern. Examples are pattern-based intrusion detection systems and signature-based virus checkers. When the pattern or signature is unclear, ambiguous or demonstrates a large number of false positives or negatives, the evidence and conclusion following from it are open to challenge.
Hidden Data Discovery. This element refers to the discovery of evidence that is hidden in some manner on computer media. The data may be hidden using encryption, steganography or any other data hiding technique. It may also include data that has been deleted but is forensically recoverable.
Hidden Data Extraction. This element addresses the extraction of hidden evidence from a gross data collection.

[8] A gross data collection is a file or files containing data collected from a digital source that may contain individual evidentiary data.
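The hash-based filtering technique mentioned under Filtering Techniques can be sketched as follows. The choice of SHA-256, the function names and the item names are assumptions made for illustration; the text does not name a hash algorithm or a tool.

```python
import hashlib


def sha256_hex(data):
    """Hex digest of one item in the gross data collection."""
    return hashlib.sha256(data).hexdigest()


def match_known(items, known_hashes):
    """items maps an item name to its bytes. Return the names of items
    whose digest matches a known hash value."""
    return sorted(
        name for name, data in items.items()
        if sha256_hex(data) in known_hashes
    )


# Hypothetical items extracted from a copy of a bit-stream image.
collection = {
    "item-001": b"contents of a file of interest",
    "item-002": b"unrelated system file",
    "item-003": b"contents of a file of interest",
}
known = {sha256_hex(b"contents of a file of interest")}
hits = match_known(collection, known)
```

Items that match are presumed to be the same as the item from which the known hash was generated, so the strength of that presumption depends on the hash function used.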
3.2.2.4.5 The Analysis Class

The Analysis Class refers to those elements that are involved in the analysis of evidence collected, identified and extracted from a gross data collection. The validity of techniques used in analysis of potential evidence impacts directly the validity of the conclusions drawn from the evidence and the credibility of the evidence chain constructed therefrom. The Analysis Class contains, and is dependent upon, the Preservation Class and the Traceability element of the Examination Class. The various elements of the Analysis Class refer to the means by which a forensic examiner or investigator might develop a set of conclusions regarding evidence presented from the other five classes. As with all elements of the Framework a clear understanding of the applicable process is required. Wherever possible, adherence to standard tools, technologies and techniques is critical. Finally, when mapping this class to the DIPL or when performing model checking, we are concerned solely with the process, not the results of the analysis or the detailing of the contents of evidentiary items. The Link element is the key element used to form a chain of evidence. It is related to traceability and, as such, is a required element. This class is described by the DFRWS as [DFRW03]: "Fusion, correlation and assimilation of material for reasoned conclusions."
3.2.2.4.6 The Presentation Class

This class refers to the tools and techniques used to present the conclusions of the investigator and the digital forensic examiner to a court of enquiry or other finder of fact. Each of these techniques has its own elements and a discussion of expert witnessing is beyond the scope of this thesis. However, for our purposes we will stipulate that the EEDI process emphasizes the use of timelines as an embodiment of the Clarification element of this class. This class has the following DFRWS description [DFRW03]:
Reporting facts in an organized, clear, concise and objective manner.
3.3 The EEDI Process [9]

The End-to-End Digital Investigation process is a collection of generalized steps to be taken in conjunction with the DFRWS Framework. While the Framework gives a roadmap for addressing those issues comprising a formal investigation, the EEDI process provides a set of steps the investigator must perform in order to preserve, collect, examine and analyze digital evidence. The basic End-to-End evidence collection and analysis process consists of:
Collecting evidence
Analysis of individual events
Preliminary correlation
Event normalizing
Event deconfliction
Second level correlation (consider both normalized and non-normalized events)
Timeline analysis
Chain of evidence construction
Corroboration (consider only non-normalized events)
3.3.1 Collecting Evidence

The collection of evidence in a computer security incident is very time sensitive. When an event occurs we have the first warning of a potential incident. An event may not be, by itself, particularly noteworthy. However, taken in the context of other events, it may become extremely important. From the forensic perspective we want to consider all relevant events whether they appear to have been tied to an incident or not. From the definitive point of view, then, events are the most granular elements (at the atomic level) of an incident.
[9] This section (as edited for publication) has subsequently been published in Computer Fraud & Security [PS02A].
An incident is defined as a collection of events that lead to, or could lead to, a compromise of some sort. That compromise may include unauthorized disclosure or modification of a system or its data or destruction of the system or its data. An incident becomes a crime when one or more laws are violated. As soon as possible, in the context of an incident, collecting evidence from all possible locations where it may reside must begin. The methods vary according to the type of evidence (forensic, logs, indirect, traditionally developed, etc.). It is important to emphasize that EEDI is concerned not only with digital evidence. Gathering witness information should be accomplished as early in the evidence collection process as possible. Witness impressions and information play a crucial role in determining the steps the forensic examiner must take to uncover digital evidence. Critical in this process are:
Images of affected computers
Logs of intermediate devices, especially those on the Internet
Logs of affected computers
Logs and data from intrusion detection systems, firewalls, etc.
3.3.2 Analysis of Individual Events

An alert or incident is made up of one or more individual events. These events may be duplicates reported in different logs from different devices. These events and duplications have value both as they appear and normalized (see below). The first analysis effort should be to examine these isolated events and assess what value they may have to the overall investigation and how they may tie into each other.
3.3.2.1 Preliminary Correlation

The first correlation step is to examine the individual events and see how they may correlate into a chain of evidence. The main purpose is to understand in broad terms what happened, what systems or devices were involved and when the events occurred.
3.3.2.1.1 Event Normalizing

There may be some events that are reported from multiple sources. During part of the analysis (timeline analysis) these duplications must be eliminated. This process is known as normalizing. EEDI uses, eventually, both normalized and non-normalized events.
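A minimal sketch of normalizing is shown below. It assumes that two reports describe the same event when their time, addresses and event kind match exactly, and that all times have already been placed on a common time base; both are simplifying assumptions. The field names and the device name IDS-1 are hypothetical. The original reports are left untouched so that the non-normalized events remain available.

```python
from collections import defaultdict


def normalize_events(reports):
    """Merge reports of the same event made by different sources into
    one normalized event that records who reported it."""
    merged = defaultdict(set)
    for r in reports:
        key = (r["time"], r["src"], r["dst"], r["kind"])
        merged[key].add(r["source"])
    return [
        {"time": t, "src": s, "dst": d, "kind": k,
         "reported_by": sorted(sources)}
        for (t, s, d, k), sources in sorted(merged.items())
    ]


reports = [
    {"source": "FW-3", "time": "2002-05-03T16:54:23Z",
     "src": "123.221.3.2", "dst": "222.122.56.4", "kind": "connect"},
    {"source": "IDS-1", "time": "2002-05-03T16:54:23Z",
     "src": "123.221.3.2", "dst": "222.122.56.4", "kind": "connect"},
    {"source": "FW-3", "time": "2002-05-03T21:13:45Z",
     "src": "123.221.3.2", "dst": "222.122.56.4", "kind": "disconnect"},
]
normalized = normalize_events(reports)
```

Recording the reporting sources on each normalized event preserves the link back to the duplicates, which is useful later when corroborating evidence.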
3.3.2.1.2 Event Deconfliction

Sometimes events are reported multiple times from the same source. An example is a denial of service attack where multiple packets are directed against a target and each one is reported by a reporting resource. The EEDI process should not count each of those packets as a separate event. The process of viewing the packets as a single event instead of multiple events is called deconfliction.
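Deconfliction can be sketched as grouping repeated reports from a single source into one event with a count and a time span. The grouping key (source, addresses and event kind) and the field names are assumptions for illustration; a practical implementation would probably also bound each group by a time window.

```python
def deconflict(reports):
    """Collapse repeated reports of the same activity from the same
    source into one event with a count and first/last times."""
    groups = {}
    for r in sorted(reports, key=lambda r: r["time"]):
        key = (r["source"], r["src"], r["dst"], r["kind"])
        if key not in groups:
            groups[key] = {
                "source": r["source"], "src": r["src"], "dst": r["dst"],
                "kind": r["kind"], "first_seen": r["time"],
                "last_seen": r["time"], "count": 0,
            }
        groups[key]["last_seen"] = r["time"]
        groups[key]["count"] += 1
    return list(groups.values())


# Three packets of a hypothetical denial of service attack, each
# reported separately by the same firewall.
packets = [
    {"source": "FW-3", "time": "2002-05-03T16:54:25Z",
     "src": "123.221.3.2", "dst": "222.122.56.4", "kind": "syn"},
    {"source": "FW-3", "time": "2002-05-03T16:54:23Z",
     "src": "123.221.3.2", "dst": "222.122.56.4", "kind": "syn"},
    {"source": "FW-3", "time": "2002-05-03T16:54:24Z",
     "src": "123.221.3.2", "dst": "222.122.56.4", "kind": "syn"},
]
events = deconflict(packets)
```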
3.3.2.1.3 Second Level Correlation

This is just an extension of earlier correlation efforts. However, at this point views of various events have been refined through normalization or deconfliction.
3.3.2.1.4 Timeline Analysis

In this step normalized and deconflicted events are used to build a timeline. This is an iterative process and should be updated constantly as the investigation continues to develop new evidence. The entire event analysis, correlation, deconfliction and timeline analysis is iterative.
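Because the timeline is updated as new evidence arrives, one simple way to hold it is as a sorted list into which each new event is inserted at its chronological position. The sketch below uses the standard bisect module; the tuple layout and the event descriptions are assumptions.

```python
import bisect
from datetime import datetime


def add_event(timeline, when, description):
    """Insert an event into the timeline in chronological order.
    An exact duplicate of an existing entry is not added again."""
    entry = (when, description)
    i = bisect.bisect_left(timeline, entry)
    if i == len(timeline) or timeline[i] != entry:
        timeline.insert(i, entry)
    return timeline


timeline = []
# Evidence rarely arrives in chronological order.
add_event(timeline, datetime(2002, 5, 3, 21, 13, 45), "link closed")
add_event(timeline, datetime(2002, 5, 3, 16, 54, 23), "link opened")
add_event(timeline, datetime(2002, 5, 3, 18, 0, 0), "file transferred")
add_event(timeline, datetime(2002, 5, 3, 16, 54, 23), "link opened")
```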
3.3.2.1.5 Chain of Evidence Construction

Once there is a preliminary timeline of events, the process of developing a coherent chain of evidence begins. Ideally each link in the chain, supported by one or more pieces of evidence, will lead to the next link. That rarely happens in large-scale network traces, however, because there often are gaps in the evidence-gathering process due to lack of logs or other missing event data.
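The idea that each supported link should lead to the next can be checked mechanically. In this sketch a link is a hypothetical tuple of (from, to, evidence numbers); a gap is reported wherever one link does not end where the next begins, and a link with no evidence is reported as unsupported. The addresses other than those used in the earlier example, and the evidence number A237-4-2, are invented for illustration.

```python
def find_gaps(links):
    """Return (position, end_of_link, start_of_next) for every place
    where a link does not lead directly to the next one."""
    gaps = []
    for i in range(len(links) - 1):
        end, start = links[i][1], links[i + 1][0]
        if end != start:
            gaps.append((i, end, start))
    return gaps


def unsupported_links(links):
    """Return the positions of links with no supporting evidence."""
    return [i for i, (_, _, evidence) in enumerate(links) if not evidence]


chain = [
    ("123.221.3.2", "222.122.56.4", ["A237-4-1"]),
    ("222.122.56.4", "192.0.2.5", []),            # no log available
    ("192.0.2.9", "192.0.2.7", ["A237-4-2"]),     # does not follow on
]
gaps = find_gaps(chain)
weak = unsupported_links(chain)
```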
3.3.2.1.6 Corroboration
In this stage we attempt to corroborate each piece of evidence and each event in our chain with other, independent, evidence or events. For this we use the non-correlated event data as well as any other evidence developed either digitally or traditionally. The best evidence is that which has been developed digitally and corroborated through traditional investigation or vice versa. The final evidence chain consists of primary evidence corroborated by additional secondary evidence. This chain will consist of both digital and traditional evidence. The overall process does not differ materially between an investigation and an event post mortem.
4. DIGITAL INVESTIGATION PROCESS LANGUAGE (DIPL)
4.1 Introduction
The Digital Investigation Process Language (DIPL), arguably the core product of this research, is derived as a heavily-modified subset of the Common Intrusion Specification Language (CISL) [FKP+99]. The DIPL is documented in the Digital Investigation Process Language (DIPL) Language Definition Document [PRS03]. In this section we describe the language and provide an example of a reference model for a typical investigation. The descriptions below include a discussion of DIPL grammar, but do not include the detailed vocabulary of semantic identifiers. That vocabulary is provided in Appendix 1.
4.1.1 DIPL Design Objectives

Our general objectives for the design of a digital investigation process language included:
Ability to characterize the investigative process completely and with sufficient granularity to be useful.
Ability to feed a formal modelling process that could be used as a second order analysis tool.
Ease of use in terms of being English-like in syntax and vocabulary.
Relatively free of unnecessary elements but easy to extend if necessary.
Applicable to moderately detailed description of attack processes for inclusion in the investigative characterization.
Not overly granular in descriptions of attack processes so as to avoid focus on the attack rather than the investigative process.
Inclusive of both forensic and investigative elements.
Easily extensible and maintainable.
An accepted structure and syntax approach.

An important element of the language includes its ability to be descriptively rich without being overly complex in areas where complexity adds nothing to the characterization of an investigative process.
4.1.1.1 Justification of the DIPL Design Objectives

Since the overall aim of this research is the formalization of the digital investigative process, the characterization of such a process in practice is an important aspect of the verification that the investigative process meets the framework developed here. The DIPL, when applied to an investigation using the EEDI process, is intended to provide verification that the investigation has been conducted in accordance with EEDI principles. Additionally, since a goal of the EEDI process is the formal modelling of the investigation and its results, it is important that some tools exist to translate the unstructured investigative narrative into a structure that can feed a modelling process such as CPNets.
4.1.1.2 Testing the DIPL Design Objectives

During the research we applied the DIPL to two existing investigations as well as to several hypothetical investigative fragments to test for its applicability and its compliance with our design objectives. During that testing we modified the DIPL considerably from its original draft vocabulary and modified its existing constructs slightly to remain consistent with its objectives and for consistency within its overall use. Those modifications are described in more detail in the sections that follow.
4.2 DIPL - A Digital Investigation Process Language

As demonstrated in Section 4.2.1.1 the additions to, deletions from and modifications to the original CISL are significant. In terms of SIDs retained and added, we deleted 74 of the original CISL SIDs as not applicable to DIPL requirements and added 60 new SIDs related directly to digital investigation or forensics. In the area of syntax and structure, we characterize the parts of speech referred to in the original document as "types" more correctly as "classes" of SIDs. We reserve the term "type" to refer to the more traditional data type (see Section 4.2.1.2.2) adhering to Rivest's data types. The use of traditional data types allows us to simplify somewhat the content of atom SIDs in that atoms are explicitly typed and, as the objects of an S-Expression sentence, they determine much of the syntax of the sentence. Where practical we intend that the syntax of a sentence be intuitive from the syntax of the SIDs that comprise it. Additionally, we remove those parts of SID construction reserved by the CISL developers for communicating S-Expressions between computing devices, the original motivation for the development of the CISL. Those constructs are meaningless in our use of the language and add unnecessary complexity. The most obvious change to the CISL in this regard was the removal of octet encoding. Finally, we simplified the original guidelines for constructing sentences and clauses [FKP+99, section 4.6 ff] such that they apply only to the context in which the DIPL uses CISL constructs. We describe those constructs explicitly in Section 4.2.1.3 and we describe our interpretation of the classes of SIDs in Section 4.2.2. Figure 20 in appendix A.2.1.2 shows the general syntax of a DIPL sentence as:
(Conjunction SID if used (Verb SID (Adverb SIDs if used (Role SID (Objects such as Atom SIDs) ) ) ) )
Other SID classes not shown (i.e., referent and attribute) are inserted as necessary to simplify complex references (see Sections 4.2.2.4 and 4.2.2.7 for complete descriptions of their uses).
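To make the nested structure of a sentence concrete, the following sketch reads a parenthesized expression into nested lists. It is a generic S-expression reader, not the DIPL definition: it recognizes only parentheses, single-quoted strings and bare tokens, and the sample sentence is a hypothetical arrangement of SIDs named in this section (Delete, Initiator, UserName) rather than syntax taken from Appendix 1.

```python
import re

# Parentheses, single-quoted strings, or runs of other characters.
TOKEN = re.compile(r"\(|\)|'[^']*'|[^\s()']+")


def parse(text):
    """Parse one S-expression into nested Python lists of strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == ")":
            raise ValueError("unexpected closing parenthesis")
        if tok != "(":
            return tok
        items = []
        while True:
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            if tokens[pos] == ")":
                pos += 1
                return items
            items.append(read())

    result = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return result


# Hypothetical sentence, for illustration only.
sentence = "(Delete (Initiator (UserName 'joe')))"
tree = parse(sentence)
```

In the resulting tree the first item of each list is the SID and the remaining items are its contents, which mirrors the verb, role and atom nesting of the general syntax shown above.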
4.2.1 S-Expressions and DIPL
4.2.1.1 The Concepts of S-Expressions and Semantic Identifiers

The notion of the S-Expression was developed in 1997 by Ronald Rivest [RR97]. Quoting Rivest: "S-expressions are a data structure for representing complex data. They are a variation on LISP S-expressions." The concepts of S-expressions and Semantic Identifiers (SIDs) are preserved from the original CISL, although we have made significant changes to the CISL vocabulary. We retained roughly 130 of the original SIDs and added over 60 new ones. However, we deleted those portions of the SID format that related explicitly to the transfer of data between computing devices such as intrusion detection systems. The result is that the DIPL SIDs are more compact than the original CISL SIDs. Additionally, where we retained SIDs from the original CISL, we retained, essentially, their individual meanings. In a very few cases we limited the original meanings somewhat to address the needs of digital investigation and forensic analysis.

As we describe in Section 4.2.1.2.2 below we have applied a subset of data typing as originally specified by Rivest. This is a departure from the original CISL in terms of structure in that CISL data types do not appear to follow Rivest directly. The original CISL appears from the definition document to be less strongly typed than DIPL, although its authors claim strong data typing. Unfortunately, the definition document is somewhat unclear in this regard. The DIPL associates a particular data type with every SID, either explicitly (as in the case of atom SIDs) or implicitly through the structure of the S-Expression. Finally, we have organized the additions to the CISL consistently with the DFRWS Framework and added specific syntax for each SID. More details regarding the DIPL are in sections that follow.

S-expressions (S-exp) in DIPL are groupings of tags and data listed using parentheses. A DIPL S-exp may be compared to a sentence in a spoken language. The S-exp is built using a vocabulary of Semantic Identifiers that serve the syntactical purposes of verbs, objects and subjects.
Because the DIPL SIDs form a structured vocabulary of process concepts, they may be used to represent a structured process.
4.2.1.2 Data Typing in DIPL

DIPL is a strongly typed language. Each element of data in DIPL has a data type associated with it. DIPL uses the following data types (see Figure 5 for details):
Boolean
String
Timestamp
Character
Numeric Decimal
Ushort
Ulong
Number
4.2.1.2.1 SID Classes

The following classes, not data types in themselves, are associated specifically with DIPL semantic identifiers: Verb, Role, Adverb, Attribute, Atom, Referent and Conjunction.
These classes are described in more detail below (Section 4.2.1.3). Generally, these classes refer to the SID itself as opposed to the data type the SID may contain. The exception is the atom SID which, generally, is associated with a specific data type. In the SID listing (Appendix 1) we show both the SID class and, where appropriate, the associated data type. Although we have added significantly to the original CISL SID listing using new SIDs created to be consistent with the Framework, we have retained the seven general SID classes.
4.2.1.2.2 Data Types

SIDs contain data that may be associated with a specific data type. Data types associated with the various SIDs are listed in the table below. These data types are extensions of Rivest's original advanced transport types [RR97]. We show the original source type in the Comments column in the table below in parentheses.

DATA TYPE | DESCRIPTION | COMMENTS
Boolean (bool) | Single true or false value | (Token)
String | A character string of variable length | Enclose in single quotes: [string] (Quoted-string)
Timestamp (time) | Time of day (including time zone - TZ) and date | Format is: hh:mm:ss [TZ] ddmmyyyy (Token)
Character (char) | A fixed n length character string | Not enclosed; for char containing numbers only, use Number (Token)
Numeric Decimal (dec) | An exact numeric type with arbitrary precision and length | Not enclosed (Token)
Ushort | An unsigned 16 bit integer | (Hexadecimal without the # delimiters)
Ulong | An unsigned 64 bit integer | (Hexadecimal without the # delimiters)
Number | A fixed n length character string containing numbers only | (Decimal)

Figure 5 - DIPL Data Types
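Two of the formats in Figure 5 lend themselves to a simple check, sketched below in Python. The function names are hypothetical, and two assumptions go beyond the table: the time zone is taken to be a single required token between the time and the date, and a Ushort is accepted when its hexadecimal value fits in 16 bits.

```python
import re
from datetime import datetime

# Assumed layout of a Timestamp: hh:mm:ss, one time zone token, ddmmyyyy.
TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}) (\S+) (\d{8})")

def is_timestamp(value):
    """True when value has the form hh:mm:ss TZ ddmmyyyy with a real time and date."""
    match = TIMESTAMP.fullmatch(value)
    if match is None:
        return False
    try:
        datetime.strptime(match.group(1) + " " + match.group(3), "%H:%M:%S %d%m%Y")
    except ValueError:
        return False
    return True

def is_ushort(value):
    """True when value is hexadecimal (no # delimiters) and fits in 16 bits."""
    try:
        return 0 <= int(value, 16) <= 0xFFFF
    except ValueError:
        return False

print(is_timestamp("10:15:00 GMT 01102004"))  # prints True
print(is_ushort("10000"))                     # prints False
```

The time zone token itself is not validated here; a fuller check would compare it against whatever list of zone names the investigator has adopted.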
4.2.1.3 General DIPL Constructs

DIPL uses seven classes of SIDs: verb SIDs, such as Delete and OpenApplicationSession; role SIDs, such as Initiator and FileDestination; the adverb SIDs, Outcome and When; attribute SIDs, such as Owner; atom SIDs, such as UserName and Time; the referent SIDs, ReferTo and ReferAs; and conjunction SIDs, such as And.
These seven classes of SIDs allow us to create process descriptions that clearly characterize the process and are, at the same time, reasonably easy for an investigator to use. The general DIPL constructs are below [FKP+ 99].
Construct 1 - DIPL is linear. By that we mean that it does not contain looping constructs or constructs such as if-then-else. Because it is a process language, we represent specific processes by creating a DIPL listing and, if the circumstances of the investigation require that a process be repeated, that DIPL listing is added an additional time. This provides a true representation of the actual events that occurred in the investigation.
Construct 2 - DIPL constructs may be compared to sentences. Each sentence must begin with a verb and must contain an object, usually a role SID. A role SID parent may not contain a role SID child. However, other classes of SIDs may be, and routinely are, nested. The classes of SIDs that may be nested are atoms, adverbs, conjunctions and verbs. Verbs are special in that they are nested using conjunctions. The typical construction of a DIPL sentence is a verb, one or more role objects and atoms describing the role. Each role heads a DIPL clause and multiple sentences are tied together with conjunctions. The conjunction heads the group of sentences, and parent SIDs and their children are enclosed as lists using parentheses.
Construct 3 - SIDs must be used with the number of arguments required by the syntax of the individual SID. Those arguments must have the correct syntax and meaning as described in the SID's definition. This does not mean that all of the subordinate SIDs called for in the May Contain portion of the syntax must always be used. It simply means that certain types of SIDs may not stand alone without at least one argument.
Construct 4 - The Principle of Connectedness. This principle states that when one encounters a SID one does not understand, one must strictly ignore the S-expression that the SID heads. One must not reject the total expression solely on this basis.
Construct 5 - The Distinct Child Rule. When there is more than a single child SID under a parent, each child must be unique and may not be reused within the same level. In the pseudo listing below ChildSID1 through ChildSIDn may not be reused under ParentSID1. Each child must be unique and distinct from the other child SIDs.
(ParentSID1
    (ChildSID1 ...)
    (ChildSID2 ...)
    ...
    (ChildSIDn ...)
)
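The Distinct Child Rule can be checked mechanically. The Python sketch below walks a listing held as nested lists (SID first, children after) and reports every parent under which a child SID repeats. It also allows for the two exceptions that the text goes on to describe: conjunctions and verbs repeated under a conjunction, and pluralizable roles repeated under a verb. The function name and the idea of passing in the sets of verb and pluralizable role SIDs are assumptions; in practice those sets would come from the language listing in Appendix 1.

```python
from collections import Counter

CONJUNCTIONS = {"And", "ByMeansOf", "HelpedCause"}

def distinct_child_violations(node, verbs=frozenset(), pluralizable=frozenset()):
    """Return (parent, child) pairs where a child SID repeats under one parent."""
    if not isinstance(node, list) or not node:
        return []
    parent = node[0]
    # Only child S-expressions headed by a SID name are subject to the rule.
    children = [c for c in node[1:] if isinstance(c, list) and c and isinstance(c[0], str)]
    counts = Counter(child[0] for child in children)
    violations = []
    for sid, count in counts.items():
        if count < 2:
            continue
        # Exception 1: conjunctions and verbs may repeat under a conjunction.
        if parent in CONJUNCTIONS and (sid in CONJUNCTIONS or sid in verbs):
            continue
        # Exception 2: pluralizable roles may repeat under their parent verb.
        if parent in verbs and sid in pluralizable:
            continue
        violations.append((parent, sid))
    for child in children:
        violations.extend(distinct_child_violations(child, verbs, pluralizable))
    return violations

# Hypothetical listing; treating Target as pluralizable is an assumption.
listing = ["And",
           ["Attack", ["Target", ["HostName", "[a]"]], ["Target", ["HostName", "[b]"]]],
           ["Attack", ["Target", ["HostName", "[c]"], ["HostName", "[d]"]]]]
print(distinct_child_violations(listing, verbs={"Attack"}, pluralizable={"Target"}))
# prints [('Target', 'HostName')]
```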
There are two exceptions to the distinct child rule: conjunctions, and subjects and objects below verbs. Where the parent SID is one of the conjunctions (And, ByMeansOf and HelpedCause), the distinct child rule does not apply to other conjunctions and verbs appearing below it as child SIDs. Those child SIDs may appear multiple times under a single parent. For subjects and objects below verbs, the exception depends upon whether or not those subjects and objects are pluralizable. For verbs, there are three types of roles (Role Class) that may be pluralizable: Subject, Direct Object and Indirect Object.
At least one of these roles will, usually, be present for each verb. The roles that may be pluralizable are indicated in the language listing (Appendix 1). If the role may be pluralized, it may appear multiple times beneath the verb acting as its parent. Only one of the direct or indirect object roles may be pluralized within the parent verb and we interpret multiple subjects/objects within the parent verb as the association of the subject or object with the verb.
Construct 6 - The Multiplier SID. Multiplier is a special SID used to express several occurrences of the same clause in a single sentence. There are four possible locations for the Multiplier SID: directly under a verb; in role SIDs designated as pluralizable subjects; in role SIDs designated as pluralizable direct objects; and in role SIDs designated as pluralizable indirect objects.
The general format of Multiplier used under a verb is:

(<Verb>
    (Multiplier <n>)
    <RestOfSentence>
)
When Multiplier is used with role SIDs, the generalized format is:
(<Verb>
    (<PluralizableRole>
        (Multiplier <n>)
        <RestOfRole>
    )
    <RestOfSentence>
)
Construct 7 - Attribute inferences. Inferences may not be made across multiple simple sentences (those with one verb) even when those sentences are connected by conjunctions. An S-expression appearing in a subsequent sentence does not imply an inference of data from an S-expression in an earlier sentence. To preserve such an inference, one must either code it explicitly or use referent SIDs.
The following sections describe the classes of SIDs used in DIPL [FKP+ 99].
4.2.2 SID Descriptions
4.2.2.1 Verb SIDs

Every DIPL expression must contain at least one verb. The verb is the action part of the DIPL expression. When the expression, or sentence, contains exactly one verb, that expression is a simple sentence and the verb must head the expression. A verb, together with a sequence of modifying S-expressions, makes up a complete sentence. The fragments that these modifying S-expressions head (usually Role Class SIDs) make up the clauses in the sentence.
4.2.2.2 Role SIDs

Role SIDs usually are the objects of the verb and describe the relationships to the action indicated by the verb. Most DIPL expressions contain at least one role SID. The role SIDs may or may not be pluralizable depending upon the parent verb. Role SIDs may take as arguments those SIDs indicated in the role SID's syntax. An S-expression under a role SID is a role clause.
4.2.2.3 Adverb SIDs

As in many spoken languages, adverb SIDs modify verbs, not objects. There are two adverb SIDs, When and Outcome. When is used frequently in DIPL to establish a timestamp for an action.
4.2.2.4 Attribute SIDs

Attribute SIDs modify role SIDs. They must be placed directly under the role SID they modify.
4.2.2.5 Atom SIDs

Atom SIDs instantiate other SIDs, usually role SIDs. Unlike other SID classes, atom SIDs have a data type associated with them directly and adherence to that type is important. Placement of the atom SID is dependent upon the s-expression it modifies. If it modifies a verb, it should be placed in an adverb clause. If it modifies a role, it should be placed in the role clause.
4.2.2.6 Conjunction SIDs

There are three conjunction SIDs: And, ByMeansOf and HelpedCause. These SIDs are used to connect multiple sentences into larger expressions. Conjunctions head the list of sentences and they may be nested.
The use of the And conjunction implies that all sentences beneath it hold true. There is no order or causal relationship implied.
The use of the ByMeansOf conjunction means that the events in a subsequent sentence contributed to the success of events in the immediately preceding sentence. While there is an explicit order, the causal relationship may be somewhat more indirect. For example, using the verb AcquireProxy, we may want to know something about the attack itself. Since the verb only tells us that we have determined that an attack occurred, we can use the ByMeansOf conjunction to allow us to add additional information in the form of additional verb clauses such as Attack.
The use of HelpedCause implies a direct causal relationship between two or more sentences. To use the HelpedCause conjunction there must be an explicit chain of causality.
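As a rough illustration of how a conjunction heads a group of sentences, the Python sketch below joins two sentences under ByMeansOf and writes the result back out as a DIPL S-expression. The helper names conjoin and to_dipl are hypothetical, and the role clauses shown under AcquireProxy and Attack are placeholders chosen for illustration; the permitted children of each verb are given in the language listing (Appendix 1).

```python
CONJUNCTIONS = ("And", "ByMeansOf", "HelpedCause")

def conjoin(conjunction, *sentences):
    """Place two or more sentences under a conjunction SID, which heads the list."""
    if conjunction not in CONJUNCTIONS:
        raise ValueError("not a conjunction SID: " + conjunction)
    if len(sentences) < 2:
        raise ValueError("a conjunction connects at least two sentences")
    return [conjunction, *sentences]

def to_dipl(node):
    """Serialize nested lists back into parenthesized DIPL text."""
    if isinstance(node, str):
        return node
    return "(" + " ".join(to_dipl(child) for child in node) + ")"

# Placeholder sentences; the second adds detail about the attack itself.
proxy = ["AcquireProxy", ["Observer", ["RealName", "[Name Here]"]]]
attack = ["Attack", ["Target", ["HostName", "[Name Here]"]]]
print(to_dipl(conjoin("ByMeansOf", proxy, attack)))
# prints (ByMeansOf (AcquireProxy (Observer (RealName [Name Here]))) (Attack (Target (HostName [Name Here]))))
```

Because conjunctions may be nested, the result of one conjoin call can itself be passed as a sentence to another.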
4.2.2.7 Referent SIDs

Referent SIDs are used to shorten complex expressions where an object is used repeatedly. The two referent SIDs, ReferTo and ReferAs, are of type ulong using a format of [0x12345678]. ReferTo is used to reference an object caught by ReferAs. There are some specific rules for using referent SIDs [FKP+ 99]:
Referents Under a Verb Rule. If a ReferAs clause is placed into a sentence, it refers to that sentence, except for any ReferAs clauses. Thereafter, the corresponding ReferTo clause can be used in place of that sentence except as below.
Referents in a Role Clause Rule. If a ReferAs clause is placed into a role clause, it refers to the object described by the sequence of S-expressions following that role, except for any ReferAs clauses. Thereafter, the corresponding ReferTo clause can be used in place of that object description. A ReferTo clause which points to a role ReferAs of this kind is subject to the following rule: the ReferTo clause can only appear directly below a role SID, and it must be the only clause appearing below that role SID.
Semantic Rule. The referent SIDs carry actual semantics, and are not simply macros. If a ReferAs clause is placed into a sentence (i.e., directly under a verb), and that sentence refers to an event, the ReferTo clause refers explicitly to that specific event, and not simply to an event with the same attributes. If a ReferAs clause is placed into a role clause, and that role clause describes an object, the ReferTo clause refers explicitly to the same object, and not simply to an object with the same attributes.
Scope Rule. The value of a referent clause is the verb or role within which it is found, provided that that verb or role is in the same S-expression. A referent may not be reused within the same thread. A ReferTo clause must not be placed anywhere within the scope of any ReferAs clause.
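The referent rules above can be sketched in Python as a table that maps each ReferAs identifier to the sentence or role clause containing it, with the ReferAs clause itself left out. The function names are hypothetical and the listing is a made-up example held as nested lists. In keeping with the Semantic Rule, every lookup of the same identifier returns the same Python object rather than a copy, and a reused identifier is rejected.

```python
def is_clause(node, sid):
    """True when node is an S-expression headed by the given SID."""
    return isinstance(node, list) and bool(node) and node[0] == sid

def collect_referents(node, table=None):
    """Map each ReferAs id to its enclosing clause, minus any ReferAs clauses."""
    if table is None:
        table = {}
    if isinstance(node, list):
        for child in node[1:]:
            if is_clause(child, "ReferAs"):
                ref_id = child[1]
                if ref_id in table:
                    raise ValueError("referent reused: " + ref_id)
                table[ref_id] = [c for c in node if not is_clause(c, "ReferAs")]
            else:
                collect_referents(child, table)
    return table

def resolve(refer_to, table):
    """Return the event or object that a ReferTo clause points at."""
    ref_id = refer_to[1]
    if ref_id not in table:
        raise KeyError("no ReferAs clause for " + ref_id)
    return table[ref_id]

# Hypothetical listing: the target host is described once and referred to later.
listing = ["And",
           ["Attack", ["Target", ["ReferAs", "[0x00000001]"], ["HostName", "[victim]"]]],
           ["CollectData", ["Target", ["ReferTo", "[0x00000001]"]]]]
table = collect_referents(listing)
print(resolve(["ReferTo", "[0x00000001]"], table))
# prints ['Target', ['HostName', '[victim]']]
```

The sketch does not enforce the placement rules (for example, that a ReferTo must be the only clause below its role SID); those would need a separate pass over the listing.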
4.2.2.8 SID Worlds and the World SID

The World SID is intended to allow generic representation of concepts within a specified environment. The two most common SID worlds are Microsoft and Unix. However, any explicit environment (e.g., Linux) may be designated as a SID world. The concept of SID worlds, as designated by the World SID, allows generic statements and S-expressions that are to be interpreted within the context of the specific SID world. SID worlds should not be used extensively as a matter of course. They are reserved for those situations where it is important for clarity to differentiate between environments. One and only one World clause may appear in any S-expression except an atom. The syntax of the World SID is:
World [world name],[modifier 1],...[modifier n]
An example is:
World Unix, Solaris 7.0
If multiple modifiers appear after a World SID, they must refer to the specific SID world, and the modifiers must go from the most generic to the most specific.
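The World syntax is simple enough to split mechanically. The Python sketch below separates the world name from its comma-separated modifiers; the function name parse_world and the returned dictionary layout are assumptions for illustration. The ordering rule (most generic modifier first) cannot be checked without knowledge of the environment, so the sketch only preserves the order given.

```python
def parse_world(clause):
    """Split 'World name, modifier 1, ..., modifier n' into its parts."""
    keyword, _, rest = clause.strip().partition(" ")
    if keyword != "World" or not rest.strip():
        raise ValueError("expected: World <world name>[, <modifier>, ...]")
    parts = [part.strip() for part in rest.split(",")]
    if any(not part for part in parts):
        raise ValueError("empty world name or modifier")
    # Modifiers are kept in the order written: most generic to most specific.
    return {"world": parts[0], "modifiers": parts[1:]}

print(parse_world("World Unix, Solaris 7.0"))
# prints {'world': 'Unix', 'modifiers': ['Solaris 7.0']}
```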
4.3 A Top-Level DIPL Investigation Reference Model

In creating a reference model for an idealized investigation using DIPL we concentrate upon two specific objectives. First, we ensure that each class of the DFRWS framework is represented, including that class's required elements. Second, we include the ability to add appropriate modules for the non-required elements as necessary. The role of the reference model is to characterize an idealized investigative process. It is not to demonstrate all possible approaches or potential outcomes. However, individual investigators may develop locally appropriate reference models specific to the types of investigations they conduct and the laws and rules under which they conduct them.
4.3.1 The Identification Class

The Identification Class has a single required element: Event/Crime Detection. This element implies direct evidence of an event. The required element should be expanded using one or more of the other elements in this class. A possible reference model for the Identification Class containing only this required element is shown below.
The listing begins with the attack. The attack causes the architecture to change state. The And conjunction connects the four sentences beginning with Attack, ChangeState, DetectEvent and ManageCase. The target is one or more specific hosts, the attack against which causes a state change in the enterprise upon which, usually, the host resides. If the attack against the host does not result in such a change in the enterprise, because it is directed against specific data on the host, use Data instead of ArchitectureName as in the example below. Any of the SIDs listed in the state-related role SIDs may also be used as appropriate.
The DetectEvent SID refers to the Initiator (the entity that detected the event and reported it to the Observer) and the Observer (the entity that recorded the process as reported, presumably the investigator). Additional details, per the DetectEvent SID, may be used as available. The more detail the investigator has, the more complete the investigative process record will be.
ManageCase usually starts at the beginning of an investigation with limited information. As the case progresses and there is more detail available, such as specific pieces of evidence, the ManageCase SID becomes more specific. It may always be used to enter investigator notes into the otherwise formal listing. The Comment SID may be used for this purpose. Note the use of the Data role SID in the ManageCase verb. Use of this SID assumes that the evidence involved comprises data of some kind. The Data SID must take appropriate SIDs as arguments to describe fully the sort of data involved.
(And
    (Attack
        (AttackSpecifics
            (AttackNickName [Name Here])
            (Comment [Additional Information])
            (BeginTime [hh:mm:ss TZ ddmmyyyy])
            (EndTime [hh:mm:ss TZ ddmmyyyy])
            (FileName [Name Here])
        )
        (Target
            (HostName [Name Here])
        )
    )
    (ChangeState
        (Observer
            (RealName [Name Here])
        )
        (OldState [Old State]
            (ArchitectureName VictimEnterprise)
        )
        (CurrentState [New State]
            (ArchitectureName VictimEnterprise)
        )
        (When
            (Time [hh:mm:ss TZ ddmmyyyy])
        )
    )
    (DetectEvent
        (Initiator
            (RealName [Name Here])
        )
        (Observer
            (RealName [Name Here])
        )
        (When
            (Time [hh:mm:ss TZ ddmmyyyy])
        )
    )
    (ManageCase
        (Initiator
            (RealName [Name Here])
        )
        (Data
            (ChainOfCustody [Custodian Name])
            (CaseName [Identifier of the Case])
            (EvidenceID [ID Number])
            ([Other SIDs as appropriate])
        )
        . . .
        (BeginTime [hh:mm:ss TZ ddmmyyyy])
        (EndTime [hh:mm:ss TZ ddmmyyyy])
        (Comment [Additional Information Case Notes])
        (When
            (Time [hh:mm:ss TZ ddmmyyyy])
        )
    )
)
Figure 6 - DIPL Listing of a Top Level Reference Model for the Identification Class
4.3.2 The Preservation Class

The Preservation Class has three required elements: Case Management, Chain of Custody and Time Synch. There is one required element that one might consider to be optional in the case of an investigation that will never reach a courtroom: Chain of Custody. In reality, this element is equally important in an investigation not intended to end up in a legal proceeding as it is in a court of law. The preservation process demands that evidence, in order to preserve its reliability and integrity, be handled appropriately. Whether the investigation will or will not end up in a legal tribunal rarely is known at the start of the investigative process.
The Preservation Class deals with those elements that relate to the management of items of evidence. The DFRWS describes this class as a guarded principle across forensic categories [DFRW01]. The requirement for proper evidence handling is basic to the digital investigative process as it relates to legal actions. The three required elements in this class use the ManageCase and SynchronizeTime verbs. Additionally, there are several atom SIDs that support these verbs. The obvious example is ChainOfCustody [PS03]. The listing that follows is a top level DIPL reference model for the Preservation Class.
(And
    (ManageCase
        (Initiator
            (RealName [Name Here])
        )
        (Link [Argument1], [Argument2])
        (Data
            (ChainOfCustody [Custodian Name])
            (CaseName [Identifier of the Case])
            (EvidenceID [ID Number])
            ([Other SIDs as appropriate])
        )
        . .
        (BeginTime [hh:mm:ss TZ ddmmyyyy])
        (EndTime [hh:mm:ss TZ ddmmyyyy])
        (Comment [Case Notes or Other Comment])
        (When
            (Time [hh:mm:ss TZ ddmmyyyy])
        )
    )
    (SynchronizeTime
        (Initiator
            (RealName [Name Here])
        )
        (ReturnCode [1|0])
        (When
            (Time [hh:mm:ss TZ ddmmyyyy])
        )
        (BeginTime [hh:mm:ss TZ ddmmyyyy])
        (EndTime [hh:mm:ss TZ ddmmyyyy])
    )
)
Figure 7 - DIPL Listing of a Top Level Reference Model for the Preservation Class
The core of the class is the ManageCase verb. This verb will be seen repeatedly throughout any DIPL characterization. There are several possible SIDs that the verb may include as arguments in its S-expression, and we have only shown some of them here. The general rule is, if one would put a particular atom SID's contents in case notes, use that SID here in ManageCase. The Initiator role will be present in all cases of the ManageCase verb. This refers to the investigator. The ChainOfCustody atom and its accompanying atoms, CaseName and EvidenceID, usually are included as well to ensure that the listing is properly annotated for the particular case. BeginTime and EndTime refer to the start and end times of the period covered by the case notes. The When SID refers to the date and time that the investigator made the entries in the case notes.
Link, while not always present, is important when it is. Linking is key to the Analysis Class (though not required) and is an important aspect of traceability, pervasive through the Examination and Analysis classes. The SID has a syntax of Link [Arg1], [Arg2] where Arg1 is the start of a link and Arg2 is the end of the link. When using a sophisticated link analysis tool in a complex case there may be multiple links discovered. In that case the Link role may be used as many times as necessary to establish all applicable links discovered by the investigator or the tool. The arguments are strings.
The SynchronizeTime verb also has some special considerations. The Initiator is the investigator or analyst. The BeginTime is the time showing on the log, computer or any other piece of evidence or device containing a gross data set of interest. The EndTime is the time to which the BeginTime is synchronized or normalized. This is the standard time that the investigator will use throughout the investigation. It may be a fixed time such as GMT or it may be variable such as the clock on the victim. These times are never really altered. Rather, the discrepancy between the two times is normalized. The ReturnCode indicates whether or not the times initially were synchronized. If they were, ReturnCode is 0 (zero). If not, it is 1. A ReturnCode of 0 results in no need for a BeginTime or EndTime. Only the time that the synchronization test was made (When) is necessary [PS03].
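The SynchronizeTime logic described above can be sketched in Python as follows. The function names, the returned dictionary and the tolerance parameter are assumptions introduced for illustration; the text specifies only the meaning of BeginTime, EndTime and ReturnCode. As in the text, neither clock reading is altered: the sketch records the discrepancy and applies it when an evidence timestamp needs to be expressed in the standard time.

```python
from datetime import datetime, timedelta

def synchronize_time(evidence_clock, standard_clock, tolerance=timedelta(0)):
    """Build the SynchronizeTime fields for one piece of evidence."""
    offset = standard_clock - evidence_clock
    if abs(offset) <= tolerance:
        # Already synchronized: ReturnCode 0, no BeginTime or EndTime needed.
        return {"ReturnCode": 0, "offset": timedelta(0)}
    return {
        "ReturnCode": 1,
        "BeginTime": evidence_clock,  # time showing on the evidence
        "EndTime": standard_clock,    # standard time used in the investigation
        "offset": offset,
    }

def normalize(event_time, offset):
    """Express an evidence timestamp in the investigation's standard time."""
    return event_time + offset

# Hypothetical readings: the evidence clock is 2 minutes 30 seconds slow.
record = synchronize_time(datetime(2004, 10, 1, 10, 15, 0),
                          datetime(2004, 10, 1, 10, 17, 30))
print(record["ReturnCode"])                                           # prints 1
print(normalize(datetime(2004, 10, 1, 9, 0, 0), record["offset"]))    # prints 2004-10-01 09:02:30
```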
4.3.3 The Collection Class

The Collection Class contains three required elements: Preservation, Approved Methods and Legal Authority. Preservation, as has been noted, is pervasive across the Collection, Examination and Analysis classes. Because the details of Preservation will vary each time it is invoked, it must be explicitly included in the DIPL code each time it is used. Approved Methods refers to the method used to perform a forensic or investigative process and the training and/or certification of the individual performing the process. As with other approved elements, there may be legal, professional or other citations to support the contention that the approval is genuine and relevant. It is in the Collection Class that the examiner or investigator collects data from the gross data set(s) that will be the basis for evidence in any proceeding. The Legal Authority element may consist of a legal document such as a subpoena or warrant, or it may be a local policy that permits examination of computer media as an exception to an expectation of privacy. A top level reference model for the Collection Class follows.
(And
    (CollectData
        (Data
            ([SIDs as appropriate to the data collected])
        )
        (Initiator
            (RealName [Name Here])
        )
        (Target
            ([SIDs as appropriate to the data collected])
        )
        (ApprovedMethod
            ([Tool and/or ProgramName SID as appropriate])
            ([Certification and/or Citation as appropriate])
        )
    )
    (TraceAuthority
        (Initiator
            (RealName [Name Here])
        )
        (Observer
            (RealName [Name Here])
        )
        ([Citation and/or Policy as appropriate])
    )
    (ManageCase
        (Initiator
            (RealName [Name Here])
        )
        (Link [Argument1], [Argument2])
        . . .
        (Data
            (ChainOfCustody [Custodian Name])
            (CaseName [Identifier of the Case])
            (EvidenceID [ID Number])
            ([Other SIDs as appropriate])
        )
        . . .
        (BeginTime [hh:mm:ss TZ ddmmyyyy])
        (EndTime [hh:mm:ss TZ ddmmyyyy])
        (Comment [Case Notes or Other Comment])
        (When
            (Time [hh:mm:ss TZ ddmmyyyy])
        )
    )
    (SynchronizeTime
        (Initiator
            (RealName [Name Here])
        )
        (ReturnCode [1|0])
        (When
            (Time [hh:mm:ss TZ ddmmyyyy])
        )
        (BeginTime [hh:mm:ss TZ ddmmyyyy])
        (EndTime [hh:mm:ss TZ ddmmyyyy])
    )
)
Figure 8 - DIPL Listing of a Top Level Reference Model for the Collection Class
4.3.4 The Examination and Analysis Classes

These two classes have required elements that mirror the Preservation Class and the Traceability element. As such, their required elements are identical. As in the Collection Class, the preservation element is identical in structure to the Preservation Class. The Link SID satisfies a portion of the traceability requirement; however, traceability comprises more than evidence linking. Traceability may be tied to the legal requirement for establishing that evidence, once collected, can be shown not to have been altered prior to its presentation in a court or other tribunal. This means that arguments to the ManageCase verb must be richer than would be required in the Preservation and Collection Classes. It is always best, of course, to include as much information as is available at every stage of an investigation, but once the traceability requirement appears that approach becomes a requirement.
Essentially, these two classes look like the Preservation Class with a richer set of SIDs under the ManageCase verb. Additionally, as in all classes, those elements not required, but appropriate to a particular investigative situation, must be added in the appropriate order. Where a data file or other digital (as opposed to printed) data comprises the content of the Data role, the Hash SID is required to maintain traceability.
(And
    (ManageCase
        (Initiator
            (RealName [Name Here])
        )
        (Link [Argument1], [Argument2])
        . .
        (Data
            (ChainOfCustody [Custodian Name])
            (CaseName [Identifier of the Case])
            (EvidenceID [ID Number])
            ([Other SIDs as appropriate])
        )
        . .
        (BeginTime [hh:mm:ss TZ ddmmyyyy])
        (EndTime [hh:mm:ss TZ ddmmyyyy])
        (Comment [Case Notes or Other Comment])
        (When
            (Time [hh:mm:ss TZ ddmmyyyy])
        )
    )
    (SynchronizeTime
        (Initiator
            (RealName [Name Here])
        )
        (ReturnCode [1|0])
        (When
            (Time [hh:mm:ss TZ ddmmyyyy])
        )
        (BeginTime [hh:mm:ss TZ ddmmyyyy])
        (EndTime [hh:mm:ss TZ ddmmyyyy])
    )
)
Figure 9 - DIPL Listing of a Top Level Reference Model for the Examination and Analysis Classes
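The role of the Hash SID in traceability can be illustrated with a short Python sketch that computes a digest when evidence is collected and compares it later to show that the data has not been altered. The choice of SHA-256, the function names and the bracketed argument format of the Hash clause are all assumptions; the text above does not specify the algorithm or the exact syntax of the SID.

```python
import hashlib

def hash_evidence(data, algorithm="sha256"):
    """Return the hexadecimal digest of the evidence bytes (algorithm is assumed)."""
    return hashlib.new(algorithm, data).hexdigest()

def hash_clause(data):
    """Build a Hash clause as a nested list; the argument format is hypothetical."""
    return ["Hash", "[" + hash_evidence(data) + "]"]

def unaltered(data, recorded_digest):
    """True when the evidence still matches the digest recorded at collection."""
    return hash_evidence(data) == recorded_digest

evidence = b"contents of a collected log file"
recorded = hash_evidence(evidence)
print(unaltered(evidence, recorded))                 # prints True
print(unaltered(evidence + b" (edited)", recorded))  # prints False
```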
4.3.5 The Presentation Class

There are no required elements in the Presentation Class since the form of presentation of evidence is dependent upon the circumstances under which it is presented.
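The required elements named in Sections 4.3.1 to 4.3.5 can be gathered into a simple lookup, sketched here in Python, so that a record of an investigation can be checked for omissions class by class. The table and function names are hypothetical. The entries for the Examination and Analysis classes reflect one reading of the statement that those classes mirror the Preservation Class and the Traceability element.

```python
# Required elements per class, as named in Sections 4.3.1 to 4.3.5.
REQUIRED_ELEMENTS = {
    "Identification": {"Event/Crime Detection"},
    "Preservation": {"Case Management", "Chain of Custody", "Time Synch"},
    "Collection": {"Preservation", "Approved Methods", "Legal Authority"},
    "Examination": {"Preservation", "Traceability"},  # assumed reading
    "Analysis": {"Preservation", "Traceability"},     # assumed reading
    "Presentation": set(),
}

def missing_elements(framework_class, recorded):
    """List the required elements of a class that the record does not contain."""
    return sorted(REQUIRED_ELEMENTS[framework_class] - set(recorded))

print(missing_elements("Collection", ["Preservation", "Legal Authority"]))
# prints ['Approved Methods']
```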
5. MODELLING OF DIGITAL INVESTIGATIVE AND FORENSIC PROCESSES
5.1 Introduction
Much of the promise of the research behind this thesis rests in the ability to formalize the investigative process. By subjecting an investigation or digital forensic examination to the structure and rigour of reliable methods of inquiry (see footnote 10) we help ensure that the outcome of the investigation or examination is based upon solid scientific foundations and consequently can withstand the scrutiny of the courts. Additionally, the ability to characterize an investigation or examination mathematically suggests that we may be able to model the process and/or outcome formally. Modelling an investigation, including making adjustments for the randomness of investigative results as they appear, offers promise for improving both the speed and accuracy of digital inquiries. While the DIPL is simply a first step in the modelling process, there are ways to formalize at least some DIPL code. It remains for future work to address the entire Digital Investigative Process Language as a whole mathematically (see Chapter 6).
It is important, to avoid any confusion, that we place the current state of our work clearly in the overall picture of the formalization process. The work reported in this thesis is preliminary to the complete formalization of the digital forensic and digital investigative processes. We have conducted this research largely as a proof of concept preliminary to determining whether extensions to classical set theory and logic will be required to characterize digital forensic and digital investigative processes fully. We do not intend to imply that such formalization has been completed at this time.
Further, we demonstrate the use of Coloured Petri Nets as an appropriate formalism to describe the investigative process. That usage may take either of two forms. In the first instance, we may use Coloured Petri Nets to characterize the outcome of an investigation or incident post mortem. We use such a post mortem, or post incident root cause analysis, as an example in this chapter and in the appendix.
Footnote 10: Nordby [JN03] states that reliable methods of inquiry possess the characteristics of integrity, competence, defensible technique and relevant experience.
In the second instance, we may use Coloured Petri Nets to describe an investigation process. In this case we use the DIPL somewhat differently than we do in the case of a post mortem. In a post mortem, the DIPL is used to conduct the investigation, characterize the evidence and the investigative process and cull that evidence which is meaningful to the investigation into a single characterization. This characterization allows the investigator to construct a Coloured Petri Net that describes the various states of the victim system (pre-attack, post-attack, effects of countermeasures, and, possibly, one or more failure states). When using Coloured Petri Nets to model the investigation itself, we conform more closely to the DIPL characterization of the actual investigation. In this case we create a CPNet of a correct and complete investigation and use the DIPL characterization to modify the behaviour of the Net in accordance with the actual investigation. The simulation of the actual investigation may show deviations from the ideal, pointing to flaws in the actual investigative process.
In this chapter we examine fragments of DIPL S-expressions as an early proof of concept. In examples 5.3 and 5.4 we first present the DIPL listing for the example, followed by the mathematical analysis of the DIPL code, and ending with a Coloured Petri Net model. The exception is example 5.4 which is a fragment of an actual incident. For illustrative purposes we simply show and describe the Petri Net model of the attack fragment and the countermeasures in place.
We have, as explained earlier, selected Coloured Petri Nets (CPNets or CPN) as our formal modelling approach. Also, as mentioned earlier, we use Design/CPN as our modelling and diagramming tool. Design/CPN requires the user to build the Coloured Petri Net manually, after which the tool verifies the syntax of markings and guards and performs an automated simulation of the model. Since the representation is graphical in nature and easy to visualize, Design/CPN diagrams should be relatively easy to present in a courtroom. In informal presentations in training classes to lay individuals, understanding of the process being represented or modelled has been relatively easy to achieve. However, acceptance of Coloured Petri Net evidence as illustrated using Design/CPN, or similar tools, will depend upon the results of court challenges and Daubert [DVM93] hearings, as discussed in Section 2.2 and Section 2.3.3.
Because Design/CPN simplifies the CPNet representation significantly, and, therefore, offers an acceptable platform for explanation to a lay audience, the actual mathematical process is hidden. Therefore, preparatory to the examples that follow we show, in addition to the Design/CPN representations, the more traditional high level Petri Net graphs and the notation that underlies them. The first two examples describe a theoretical attack while the third is taken from an actual incident.
The first DIPL fragment we will use describes a potentially successful penetration attack of unknown type. We will examine it from a variety of perspectives starting with a pure mathematical characterization. However, before we continue, it is necessary to define the notation conventions that we will use and to describe the mathematical approach that we apply to the discussion of digital forensics. At this point we do not propose a new mathematical paradigm. Rather, we expand existing traditional mathematical logic to form a foundation for future work and to describe, as completely as practical, the processes we are using.
5.2 Notation and Mathematical Processes

We begin with a discussion of the notation that we will use in the models that follow. Much of the notation is standard, either as used in traditional logic and formal mathematics or as used in describing CPNets. However, in some cases we have applied special meanings to be consistent with the objectives of digital forensic analysis. In this section, we present the notation and describe our use of it in the context of digital forensic analysis and modelling.
5.2.1 Logic and Sets

The development of a mathematical representation specific to the digital investigative and forensic processes poses some interesting challenges. First, the research resulting in this thesis suggests that, in order to characterize completely both the investigative process and the forensic process, traditional logic and set theory alone do not offer a robust enough mathematical environment. There clearly is a requirement to extend classical notation, definitions and theorems to encompass the entire digital forensic and investigative processes. While a superficial representation of a process or outcome can be made using traditional core mathematical techniques, it is difficult to convey an entire investigation or forensic examination clearly and completely. Thus, we believe that there is fruitful ground for the extension of set theory to support forensic analysis. Much of this work, our research suggests, has been done using the various applications of Petri Nets. One of the issues that confronts us when developing a mathematical characterization of
an investigation or forensic examination is the notion of concurrency. It is quite normal for several threads of an investigation or forensic examination to be conducted simultaneously, and equally common for those threads to converge at some point, often in a particular and important order. A second issue with which we must deal mathematically is the notion of deadlock. While most mathematical treatments view deadlock as a failure, when we apply countermeasures to a vulnerable system (as in the analysis of an incident root cause), the desired outcome may be deadlock. Such deadlock represents the failure of an attack to succeed due to the placement of appropriate countermeasures. In CPNets we say that, when such a deadlock occurs, the net no longer displays the property of liveness. Dynamic properties of CPNets are discussed in more detail in section 5.2.3.1. Livelock, where the system seeks indefinitely to resolve a problem without success, is not desirable, however. While it may show, for example, that a countermeasure has prevented penetration into the system, it also may represent an immense waste of system resources resulting in, effectively, a denial of service. In fact, many denial of service attacks, when modelled, exhibit livelock characteristics. The third, and most difficult, issue with which we must contend is that of randomness. Attacks are not always predictable, even if the attack mechanism is well known. It is common for attackers to rely upon a variant of a particular attack or reconnaissance technique in order to avoid detection. Thus, it is usual that an investigation into such a complex attack may itself become complex and unpredictable. Seasoned investigators tend to resist structured investigative processes due to the unpredictability of the thread of an investigation or forensic examination.
In these three areas at least we are faced with the need for extending traditional mathematical representations to accommodate the special requirements of digital investigation and forensic examination. In the mathematical examples that follow we introduce two important notions. First, we show that it is feasible, though, perhaps, not completely practical to characterize even a simple investigation mathematically. We have found in our research that this feasibility tends to break down rapidly as the investigation or forensic analysis becomes increasingly complex. The characterizations we show in the following sections are on the border between being useful and being, simply, suggestive of the feasibility of such characterizations if extended to accommodate the issues described above.
Second, by applying Coloured Petri Net modelling we show that a more extensive mathematical representation is, indeed, appropriate for dealing with the practicalities of digital investigation and forensic examination. However, arguably the most important outcome of the use of CPNets is the ability to model the outcome of an investigation or forensic examination taking into account temporal issues as well as issues of concurrency and, potentially, randomness. We believe that there are several other process algebras that could be equally, or perhaps more, useful. CPNets, as mentioned earlier, however, have the advantage of being graphical in nature, a characteristic that lends them to use before a lay audience. One final note on the extensibility and usefulness of modelling techniques to digital forensic science is in order. In our research we have discovered that there are a number of potential permutations of the techniques we describe in this thesis that have applicability within the broader field of information security and assurance. The ability to impose theoretical attacks upon a supposedly secure enterprise and model the outcome of the attack as if we were conducting a forensic investigation shows promise as an advanced method of enterprise-wide risk management. We hypothesize that we may extend this notion to include vulnerability assessment of complex enterprises, threat and impact analysis and risk analysis based upon the forensic modelling of the failed state of an enterprise following an attack scenario. Given such a model, the application of theoretical countermeasures may be modelled and the outcome analyzed. While this work is outside the scope of this thesis, it is interesting to contemplate additional uses for the techniques described herein.
5.2.1.1 Special Notation
5.2.1.1.1 The Symbol
We use the symbol to designate the set of all successful attacks. We may add a subscript to assist in interpreting the nature of the successful attack, for example p for a set containing successful penetration attacks. For simplicity of notation and clarity of analysis, attacks fall into three categories:
The set of all attacks a particular type such as all penetration

attacks. We notate these attacks , often with a subscript
such as pen for the set of all penetration attacks. The attacks may or may not be successful; success or failure will be determined by the rest of the expression.
The subset of attacks of a particular class such as the subset

of buffer overflow attacks that could lead to a penetration. We notate that subset as , again adding a subscript for clarity as in buff . We show that buff pen .
A specific attack. We notate individual attacks as , the

subscript for example, dtspcd , relating to the dtspcd exploit 11. Thus, dtspcd buff .
The rationale behind breaking the attack paradigm into three levels is that attacks may be arbitrarily complex and their makeup unpredictable. Thus, there may be many instantiations of a particular type of attack depending upon how the attacker chooses to obfuscate key elements of the attack that could lead to early detection and deployment of countermeasures by defenders. Thus, as a general case we break attacks up into three levels of abstraction:
1. ATTACK TYPE: A generic classification sufficiently broad to accommodate types of attack as seen by most attack or threat classification systems 12.
2. ATTACK CLASS: A specific classification that allows a more focused definition of an attack without taking into account specific instantiations of the attack that would, presumably, result in the same impact upon the target. If we view the taxonomy of an explicit attack or exploit as an ontology tree, we could say that most of the tree is collapsed into this level of abstraction.
3. SPECIFIC ATTACK: The explicit instantiation of an attack class, dependent upon specific attack code or manual attack steps. At this level we generally refer to the attack as a specific exploit.
Using the CC-PKB as an example, from the General Threats database we select Threat number 16: Malicious Code Exploitation as the Attack Type. For the Attack Class we select, from the CC-PKB Specific Attacks database, Attack number 64: A perpetrator executes malicious code either remotely or locally. This precludes all other classes of attack and focuses our attention only upon those that result from the execution of malicious code by a perpetrator/hacker. The CC-PKB, for example, lists at least six classes of attack resulting from malicious code. This particular class, however, results from both malicious code and malicious intent. Typical of this class of attack are buffer overflows. Finally, we select a Specific Attack, in this case the dtspcd exploit. A benefit of this approach is that, when determining failure states of a victim enterprise, we are able to be very specific about the cause of the failed state. While we may see that the specific exploit was the proximate cause of the failed state, when determining the countermeasure, or lack thereof, that allowed the failed state to occur, we may be more concerned about the class or type of attack. When we analyze a digital incident with a view to preventing such incidents in the future, or determining root cause, we may be more concerned about protecting against an entire attack type. Clearly, this is more desirable than attempting to predict every potential specific exploit that an attacker might devise to achieve penetration via a buffer overflow, as an example. As with much of the work in this thesis, we view this approach as preliminary to complete attack modelling.
11 See http://www.cert.org/advisories/CA-2001-31.html for a description of this attack.
12 For example, the NIAP Common Criteria Profiling Knowledge Base (CC-PKB); see http://niap.nist.gov/niap/archive/iccc/CC_PKB.htm for details.
Indeed, for the purpose of characterizing an investigation or forensic examination, we need not model the attack itself in detail. It is sufficient only to model the existence of the attack and those characteristics that allow it to be identified forensically. The detailed modelling of an explicit attack, while interesting, is beyond the scope of this thesis.
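The three levels of abstraction can be pictured as a small containment hierarchy. The following is a minimal sketch only, assuming hypothetical class names (`AttackType`, `AttackClass`, `SpecificAttack`) and a hypothetical helper `protects_against`; the labels are taken from the CC-PKB example in the text, and nothing here is part of DIPL or the CC-PKB itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackType:
    """Broadest level, e.g. a CC-PKB general threat (hypothetical class)."""
    name: str


@dataclass(frozen=True)
class AttackClass:
    """Middle level: a focused class that belongs to one attack type."""
    name: str
    attack_type: AttackType


@dataclass(frozen=True)
class SpecificAttack:
    """Lowest level: an explicit exploit that instantiates one attack class."""
    name: str
    attack_class: AttackClass

    def lineage(self):
        # (type, class, specific) from most general to most specific
        return (self.attack_class.attack_type.name,
                self.attack_class.name,
                self.name)


def protects_against(scope, attack):
    """True when a countermeasure scoped to a type, class or exploit covers the attack."""
    return scope in (attack, attack.attack_class, attack.attack_class.attack_type)


malicious_code = AttackType("Malicious Code Exploitation")
executes_code = AttackClass(
    "A perpetrator executes malicious code either remotely or locally",
    malicious_code)
dtspcd = SpecificAttack("dtspcd exploit", executes_code)
other_overflow = SpecificAttack("some other buffer overflow", executes_code)
```

A countermeasure scoped to the whole type covers both exploits, whereas one scoped to the dtspcd exploit alone does not cover `other_overflow`, which mirrors the argument for protecting against a type rather than predicting every exploit.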
5.2.1.1.2 The Function

Classically, we define a function, $f(\,)$, as a set consisting of ordered pairs where for every $x$ there is at most one $y$ such that $(x, y) \in f$ [HH83]. Within our approach to mathematical notation for digital forensics, we use the function notation, $f(\,)$, to designate a process of some sort applied to the notation within the parentheses. For example, within an attack, $f(x)$ refers to some explicit process applied to a class of attack, $x$, resulting in a specific attack defined later in the expression. The specific process may be explicitly defined or not depending upon how it is used. In the example in 5.2.2.1 we could say:
f ( buff ) = dtspcd
meaning that some specific programming or technique is applied to the general class of buffer overflow attacks to achieve a specific attack called the dtspcd attack. Relating this application of functions to the classic definition of a function, we begin with the following definition of a function expressed consistently with our particular circumstances [HH83]: A set consisting of ordered pairs is a function if for every attack there is at most one set of code or manual procedures y such that the pair formed by the attack and y is a member of f. For example, let p be a specific Ping of Death attack, and let yp be specific program code that generates the attack. For every Ping of Death attack, p, there is, at most, a single program code listing, yp. If there is a different yp, the code or attack procedure will be slightly (or, perhaps, greatly) different, resulting in a different attack p'. We may consider yp a member of DoS, where DoS is the set of all denial of service attacks. Applying function $f(\,)$ to DoS selects specific attack program code yp, resulting in a unique attack p. Note that this approach says nothing about the outcome of an attack. The outcome of any of several instantiations of a Ping of Death attack may be, substantially, the same. It is also important to distinguish the term attack as being the description of an exploit as opposed to being an instance of attacking or launching an exploit. In other words, a function, $f(\,)$, is the process of creating program code or attack steps in general. If the process is different (i.e., the function is different) the code or procedure will be different and will result in a different, unique attack. Generally, we distinguish between a generic attack and some specific attack y based upon the explicit programming code or manual process used to execute the attack. The dtspcd attack referred to in our example becomes specific based upon the code used to execute it. Until we apply the specific code or manual procedure, it simply remains an exploit that can, theoretically, be applied to a vulnerability. This distinction is not, in terms of digital forensic analysis, trivial. The ability to identify an exploit forensically may depend upon the ability to extract evidence in the form of the specific code used or steps taken to achieve an exploit, successful or not. We may wish to expand the notation to describe the function in greater detail if we have that level of information; however, the statement as shown above is sufficient, at a high level of abstraction, to show that a specific attack exists and is part of a particular class of attacks.
5.2.1.1.3 From Initiator i to Target t - $\underrightarrow{(i, t_1, \ldots, t_n)}$

We need a special notation for describing an action such as an attack that originates at one point and terminates at another. Typically, such an action would begin at an initiator or an attacker and terminate at some target or targets, intended or not. We use the notation $\underrightarrow{(i, t_1, \ldots, t_n)}$ to describe such a condition. It is important to note that this notation uses the right arrow underbar and is not the same as the predicate notation $P(i, t_1, \ldots, t_n)$ used in the predicate calculus. In order to make the notation meaningful, we must prepend an action such as an attack, resulting in, for example, an attack symbol followed by $\underrightarrow{(i, t)}$, meaning some attack from an initiator $i$ to a target $t$, both initiator and target being previously defined. An n-place expression, $\underrightarrow{(i, t_1, \ldots, t_n)}$, allows us to express a multiple target attack, for example a distributed denial of service attack against several targets simultaneously. The notation for such an attack against three targets would be the attack symbol followed by $\underrightarrow{(i, t_1, t_2, t_3)}$. In order to comprise a complete statement, this expression must yield a result such as shown below in 5.2.2.1.
5.2.1.1.4 Implies, is the same as, and If and Only If (iff)

We use the notation $\Rightarrow$ for implies and the notation $\Leftrightarrow$ for if and only if (iff). In the context of digital forensics and attack description, implies can also mean yields or leads to. We use the symbol $\equiv$ for is the same as.
5.2.2 Example Mathematical Definitions

In order to apply classical logic and set theory to the digital forensic and investigative processes, we must create a few extensions to the traditional mathematics. The definitions that follow do not represent all possible definitions for a specialized mathematics to be applied to forensic use. They do, however, serve to provide those definitions we will require for the examples that follow as well as examples of mapping DIPL s-expressions to mathematical expressions. The mapping of DIPL s-expressions to mathematical expressions is not, however, a direct translation of any DIPL SID. Rather, we use a mathematical definition to express a DIPL process built around a DIPL verb.
5.2.2.1 Definition 1.1 Potentially Successful Attack Process

Definition 1.1: A potentially successful attack process consists of an attack, an initiator, one or more targets and an outcome in the form of a state change in the target(s)
: ( i, t1, t 2, , tr ) ( ( , spre1, spost1)( , spre 2, spost 2 ) ( , spren, spostn ) ) uuuuuuuuuuun

such that: (i) An attack is a member of the set of successful attacks. (ii) An attack is applied by an initiator i against one or more targets ( t1, t 2, , tn ). (iii) The attack results in a state change in the target(s) from the pre-attack state spre to a post-attack state spost (attack in state spre leads to state spost is true) for each instance of the attack, i to t1 , t 2 , tn . Note that this defines a potentially successful attack. The actual outcome of the attack would depend upon the nature of the post-attack state. An attack could be applied against
an enterprise and the state change could be trivial and not considered a successful attack. For example, if the attack were to fail in its intended purpose but, nonetheless, cause a trivial change in state in the target, the conditions of the definition would be met but the attack itself might not be considered successful in the context of a complete analysis of its outcome. An example of this condition is an attack intended to penetrate a password file and allow the attacker to extract passwords from it. If the attack penetrated the password file but the attacker was unable to extract any passwords, the condition for success, a change in state (the pre-attack state was that the password file had not been penetrated), would have been met. The attack, however, could not be considered successful. Examples of the use of this definition in the characterization of DIPL s-expressions appear below in sections 5.3 and 5.4. Definition 1.1 is consistent with the use of the DIPL Attack verb, although it is not, as stated above, a direct translation of the verb.
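The three conditions of Definition 1.1 can be checked mechanically once states are recorded. The sketch below is illustrative only: the function name `potentially_successful`, the state labels and the dictionary layout are assumptions made here, not part of DIPL. It also reproduces the password-file caveat: the definition is satisfied by any state change, whether or not the attacker's goal was met.

```python
def potentially_successful(attack, successful_attacks, initiator, targets,
                           pre_states, post_states):
    """Check the conditions of Definition 1.1 (hypothetical helper).

    (i)   the attack belongs to the set of successful attacks,
    (ii)  an initiator applies it to one or more targets,
    (iii) every target changes from its pre-attack to a different post-attack state.
    """
    if attack not in successful_attacks:
        return False
    if initiator is None or not targets:
        return False
    return all(pre_states[t] != post_states[t] for t in targets)


# Illustrative data only: one target whose password file is penetrated.
targets = ["123.222.3.4"]
pre = {"123.222.3.4": "password file not penetrated"}
post = {"123.222.3.4": "password file penetrated"}

meets_definition = potentially_successful(
    "password file attack", {"password file attack"}, "63.36.3.3",
    targets, pre, post)

# The attacker's real goal is judged separately from the definition.
passwords_extracted = False
attack_considered_successful = meets_definition and passwords_extracted
```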
5.2.2.2 Definition 1.2 Authentication of a Block of Digital Data

Definition 1.2: A block of digital data is authenticated as identical to another finite block of digital data ' iff a function f ( ) produces a unique result equal to a unique result produced by f ( ' )
' f ( ) = f ( ')
such that: (i) and ' are blocks of digital data of finite length. (ii) f ( ) and f ( ') are the results of the application of some algorithm sufficiently complex as to produce a result unique to some predefined standard. Definition 1.2 is an example of the mapping of a DIPL s-expression built upon the verb Authenticate. This is a very simple, in fact, intuitive definition. It is, however, important to the digital forensic investigator due to the need to authenticate copies of digital data such as bit stream backups of files or disks using various hash functions. In practical use, the digital forensic investigator will determine the algorithm (hashing function) in advance and apply
the same hashing program to both sets of data. The definition that follows is somewhat more complex and demonstrates the mapping of s-expressions using the verb AcquireProxy to mathematical expressions.
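Definition 1.2 corresponds directly to the everyday practice of comparing hash digests. The following is a minimal sketch, assuming SHA-256 as the agreed algorithm and a hypothetical function name `authenticate`; the thesis does not prescribe a particular hash function.

```python
import hashlib


def authenticate(block_a: bytes, block_b: bytes, algorithm: str = "sha256") -> bool:
    """Return True when f(block_a) == f(block_b) for the chosen hash function f.

    The algorithm is fixed in advance and applied identically to both blocks,
    as the definition requires. SHA-256 is an assumption made for this sketch.
    """
    digest_a = hashlib.new(algorithm, block_a).digest()
    digest_b = hashlib.new(algorithm, block_b).digest()
    return digest_a == digest_b


# Illustrative data standing in for a disk image and its bit stream copy.
original = b"bit stream image of evidence disk"
copy = bytes(original)
tampered = original + b"\x00"
```

Here `authenticate(original, copy)` holds while `authenticate(original, tampered)` does not. In practice the blocks would be read from the evidence media in chunks rather than held in memory.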
5.2.2.3 Definition 1.3 Acquisition of Proxy

Definition 1.3: A user or process gains the ability to act as another user or process iff user or process , through use of some specific attack code or procedure acquires the set of access rights and permissions to a system belonging to user or process
: (, , )
such that: (i) Attack is a member of the set of all potentially successful attacks. (ii) User or process has the access rights and permissions . (iii) Attack by user or process in user access rights and permissions is successful leading to acquisition of access rights and permissions . (iv) Access rights and permissions are the same as access rights and permissions . For the purposes definition 1.3, an attack need not necessarily be malicious. The explicit granting of to user or process qualifies under this definition.
5.2.3 Coloured Petri Nets

We take the notation for Coloured Petri Nets from a variety of sources [CLP02], [GMU03], [KJ96], [PB03], [UA03], [KJ97], [KJ94]. The reader is encouraged to refer to these sources for an exhaustive discussion of Coloured Petri Nets and their relationships to classical place-transition and high level Petri Nets. For a more complex representation, Haas's chapter on Coloured Stochastic Petri Nets 13 adds the element of randomness to Jensen's CPNets. Additionally, we wish to point out that the tool Design/CPN, and its successor CPNTools, simply provide a graphical generator for CPNets along with the ability to check the syntax of the graphical model and simulate the operation of the net. Because the tool translates the graphical representation into the underlying CPNet formalism, there is no need for the user to understand the mathematics involved. However, CPNTools has the ability to generate complete program listings representing the net and its behaviour in an extension of the ML language. For additional details and references the reader should refer to the CPNet web site 14.
13 Haas, Peter J. Stochastic Petri Nets, Springer, 2002, Chapter 9, Coloured Stochastic Petri Nets.
14 http://www.daimi.au.dk/CPnets/
We use Jensen's definition of Coloured Petri Nets as 9-tuple nets, $CPN = (\Sigma, P, T, A, N, C, G, E, I)$ [KJ96], where:
$\Sigma$ is a finite set of non-empty types called colour sets.
$P$ is a finite set of places $\{p_1, p_2, \ldots, p_n\}$.
$T$ is a finite set of transitions $\{t_1, t_2, \ldots, t_n\}$.
$A$ is a finite set of arcs such that:
$P \cap T = P \cap A = T \cap A = \emptyset$
$N$ is a node function defined from $A$ into $P \times T \cup T \times P$.
$C$ is a colour function defined from $P$ into $\Sigma$.
$G$ is a guard function defined from $T$ into expressions such that:
$\forall t \in T : Type(G(t)) = Bool \wedge Type(Var(G(t))) \subseteq \Sigma$
(Bool = Boolean, Type represents a Colour Set)
$E$ is an arc expression function defined from $A$ into expressions such that:
$\forall a \in A : Type(E(a)) = C(p(a))_{MS} \wedge Type(Var(E(a))) \subseteq \Sigma$
where $p(a)$ is the place of $N(a)$.
$I$ is an initialization function defined from $P$ into closed expressions such that:
$\forall p \in P : Type(I(p)) = C(p)_{MS}$
MS means Multi Set. A multi set generally refers to the number of tokens of a particular type (colour). In the expression above we refer to $C(p)_{MS}$, meaning the multi set of colour $C$ at place $p$. Generally, we follow the rules for Coloured Petri Nets set down by Jensen [KJ94], [KJ96], [KJ97]. We represent markings, $M_m$, as:
$CPN_n : M_m = [p_1\ p_2\ p_3 \ldots p_p]$, where each place $p$ is represented by the number of tokens in that place. $M_0$ represents the initial marking of the CPNet. We represent the firing sequence $s$ of a set of transitions as $s = t_1\, t_2 \ldots t_n$. We represent the set of all input places of a transition $t_n$, or the preset, as ${}^{\bullet}t_n$, and the set of all output places of a transition $t_n$, or the postset, as $t_n^{\bullet}$. For places, we designate the set of all input transitions to the place the preset, ${}^{\bullet}p_n$, and the set of all output transitions of the place the postset, $p_n^{\bullet}$. Therefore, as an example:
${}^{\bullet}p_1 = \{t_1, t_3\}$, $t_2^{\bullet} = \{p_2, p_3\}$
for the classical Petri Net graph:
[Figure: classical Petri Net graph with transitions t1, t2, t3 and places p1, p2, p3]
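Presets and postsets follow mechanically from the arc list. The sketch below assumes an arc list for the example fragment; the arc from p1 to t2 is an assumption made here to connect the fragment, and the function names are hypothetical.

```python
# Arcs as (source, destination) pairs. The arc ("p1", "t2") is an assumption;
# the other four follow from the preset and postset given in the text.
arcs = [("t1", "p1"), ("t3", "p1"), ("p1", "t2"), ("t2", "p2"), ("t2", "p3")]


def preset(node, arcs):
    """All nodes with an arc leading into `node`."""
    return {src for src, dst in arcs if dst == node}


def postset(node, arcs):
    """All nodes reached by an arc leaving `node`."""
    return {dst for src, dst in arcs if src == node}


preset_p1 = preset("p1", arcs)     # input transitions of place p1
postset_t2 = postset("t2", arcs)   # output places of transition t2
```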
The same example applied to a CPNet graph:

DECLARATIONS colour COLOR1 = with c1|c2; var color1 : COLOR1;
The places p1, p2 and p3 are typed (colour) COLOR1 and can contain tokens of type c1 or c2. The color1 variable defines the arcs. There is an initial marking in place p1 of one token c1. As with classical net representation, the CPNet representation is only a net fragment. We must assume that transitions t3 and t1 have input places of some sort. An additional element in CPNets is the binding element. Bindings assign types (colours) and quantities to the variables in transitions. A binding element is a pair, ( t , b ) , t being a particular transition and b being a binding for the variables of t . If a binding element is enabled, the transition may fire. For the binding element to be enabled, there must be enough tokens on the input place to the transition to satisfy the transition conditions (the variable on the input arc) and they must be of the correct type (colour). Additionally, the guard on the transition must evaluate (Boolean) to True. A step, Y , is a multi-set of binding elements and represents the possible firing of a transition. If the transition fires (i.e., a step Y occurs), the marking preceding the step changes to a different marking defined by:
$\forall p \in P : M_1(p) = \Big( M_0(p) - \sum_{(t,b) \in Y} E(p,t)\langle b \rangle \Big) + \sum_{(t,b) \in Y} E(t,p)\langle b \rangle$
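The enabling test and the marking update can be sketched with multisets. This is a simplified illustration, not the Design/CPN implementation: it handles one binding element at a time, represents arc expressions as Python functions of the binding, and uses `collections.Counter` as the multiset. The net is the COLOR1 fragment above, with transition t2 assumed to consume from p1 and produce into p2 and p3.

```python
from collections import Counter


def enabled(marking, inputs, guard, binding):
    """A binding element (t, b) is enabled when the guard holds and every input
    place holds at least the tokens its arc expression demands."""
    if not guard(binding):
        return False
    return all(marking[place][token] >= count
               for place, expr in inputs.items()
               for token, count in expr(binding).items())


def fire(marking, inputs, outputs, guard, binding):
    """Return M1 = (M0 - input arc tokens) + output arc tokens for one binding element."""
    if not enabled(marking, inputs, guard, binding):
        raise ValueError("binding element is not enabled")
    new = {place: Counter(tokens) for place, tokens in marking.items()}
    for place, expr in inputs.items():
        new[place] = new[place] - expr(binding)   # remove E(p, t)<b>
    for place, expr in outputs.items():
        new[place] = new[place] + expr(binding)   # add E(t, p)<b>
    return new


# Initial marking: one c1 token in p1, as in the example.
m0 = {"p1": Counter({"c1": 1}), "p2": Counter(), "p3": Counter()}
t2_inputs = {"p1": lambda b: Counter({b["color1"]: 1})}
t2_outputs = {"p2": lambda b: Counter({b["color1"]: 1}),
              "p3": lambda b: Counter({b["color1"]: 1})}


def no_guard(binding):
    return True


m1 = fire(m0, t2_inputs, t2_outputs, no_guard, {"color1": "c1"})
```

Binding `color1` to c2 is not enabled in `m0`, because p1 holds no c2 token.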
5.2.3.1 Dynamic Properties of Coloured Petri Nets

There are a number of dynamic properties of CPNets. These properties characterize the behaviour of individual nets [KJ94], [KJ97-1]. For our purposes, however, we describe only those two properties most important to the examples that follow. The reader is referred to [KJ97-1] for a more exhaustive discussion of CPNet dynamic properties.
5.2.3.1.1 Boundedness
The Boundedness property simply describes how many tokens may exist at a given place. Definition 4.1 [KJ94] states:
Definition 4.1: Let a place $p \in P$, a non-negative integer $n \in \mathbb{N}$ and a multi-set $m \in C(p)_{MS}$ be given.
(i) $n$ is an integer bound for $p$ iff:
$\forall M \in [M_0\rangle : |M(p)| \leq n$
(ii) $m$ is a multi-set bound for $p$ iff:
$\forall M \in [M_0\rangle : M(p) \leq m$
The notation $[M_0\rangle$ refers to reachability or an occurrence sequence [KJ96]. Generally, we say that when we enable a step $Y$ in some marking $M_1$, that step may occur. If the step occurs, it changes marking $M_1$ to some other marking, $M_2$. We may then say that marking $M_2$ is directly reachable from marking $M_1$ by means of step $Y$. We symbolize this process: $M_1 [Y\rangle M_2$. Thus, in definition (i) we say that $n$ is an integer bound for place $p$ iff, for all markings $M$ reachable from the initial marking $M_0$, the number of tokens in place $p$ is less than or equal to $n$. In other words, no marking reachable from the initial marking can hold more than $n$ tokens in place $p$.
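For a small net, an integer bound can be found by enumerating the reachable markings. The sketch below is a simplification that ignores token colours and treats a marking as a tuple of token counts; the function names and the two-place example net are assumptions made for illustration.

```python
from collections import deque


def reachable_markings(m0, transitions, limit=10_000):
    """Breadth-first enumeration of the markings reachable from m0.

    Each transition is a pair (consume, produce) of per-place token counts.
    The limit guards against nets whose state space does not terminate.
    """
    seen, queue = {m0}, deque([m0])
    while queue:
        marking = queue.popleft()
        for consume, produce in transitions:
            if all(have >= need for have, need in zip(marking, consume)):
                nxt = tuple(h - c + p for h, c, p in zip(marking, consume, produce))
                if nxt not in seen:
                    if len(seen) >= limit:
                        raise RuntimeError("state space too large; net may be unbounded")
                    seen.add(nxt)
                    queue.append(nxt)
    return seen


def integer_bound(place_index, m0, transitions):
    """Smallest n with |M(p)| <= n over every reachable marking M."""
    return max(m[place_index] for m in reachable_markings(m0, transitions))


# Hypothetical net: places (p1, p2), one transition moving a token from p1 to p2.
move = ((1, 0), (0, 1))
bound_p2 = integer_bound(1, (2, 0), [move])
```

Place p2 starts empty yet has integer bound 2, which shows that the bound is a property of the reachable markings rather than of the initial marking alone.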
5.2.3.1.2 Liveness
Liveness refers to the property of a CPNet that describes a set of binding elements that remain active. For liveness to occur it only is necessary that the binding elements (transitions) can become enabled.
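The earlier observation that a countermeasure may deliberately drive a net into deadlock can be illustrated by exploring the reachable markings and looking for dead ones, that is, markings in which no transition is enabled. This is a sketch under simplifying assumptions: colours are ignored, the net is assumed to be bounded so that exploration terminates, and the names `explore` and `is_deadlocked` are hypothetical.

```python
def enabled_transitions(marking, transitions):
    """Names of the transitions whose guard holds and whose inputs are available."""
    return [name for name, (consume, produce, guard) in transitions.items()
            if guard(marking) and all(h >= c for h, c in zip(marking, consume))]


def explore(m0, transitions):
    """Map every reachable marking to the transitions enabled in it (bounded nets only)."""
    graph, stack = {}, [m0]
    while stack:
        marking = stack.pop()
        if marking in graph:
            continue
        graph[marking] = enabled_transitions(marking, transitions)
        for name in graph[marking]:
            consume, produce, _ = transitions[name]
            stack.append(tuple(h - c + p for h, c, p in zip(marking, consume, produce)))
    return graph


def is_deadlocked(m0, transitions, goal_place):
    """True when the net halts without ever putting a token in goal_place."""
    graph = explore(m0, transitions)
    never_reaches_goal = all(m[goal_place] == 0 for m in graph)
    has_dead_marking = any(not names for names in graph.values())
    return never_reaches_goal and has_dead_marking


# Places are (Initiator, Attack Successful). The guard models a countermeasure.
unprotected = {"Attack": ((1, 0), (0, 1), lambda m: True)}
protected = {"Attack": ((1, 0), (0, 1), lambda m: False)}
```

With the countermeasure in place the initial marking is already dead and the Attack Successful place is never marked, which is the desired outcome from the defender's point of view.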
5.3 A Penetration Attack

DIPL easily can describe an attack process, including the type of attack, the source and target of the attack and details about the attack, in sufficient detail for forensic analysis.
The following DIPL fragment describes, for the purpose of illustrating the use of the DIPL, a penetration attack that may or may not have been successful. The purpose of the attack is to acquire the proxy of a legitimate user, in other words, to penetrate that user's account and masquerade as the legitimate user.
5.3.1 DIPL Characterization

The attacker (Initiator) appears to originate at IP address 63.36.3.3. The victim (Target) is at IP address 123.222.3.4. These, of course, are not intended to represent any real addresses. The type of penetration attack, as well as its actual success, is not known and we have not identified the account. For our purposes, in this initial example, we simply are concerned with characterizing a potentially successful penetration attack for the purpose of masquerading. Much more detail could be added to the listing; however, for demonstration purposes that is not necessary here. The DIPL listing follows.
(And
  (AcquireProxy
    (ByMeansOf
      (Attack
        (AttackSpecifics
          (AttackNickname unknown penetration attack)
        )
        (Target
          (IPv4Address 123.222.3.4)
        )
        (Initiator
          (IPv4Address 63.36.3.3)
        )
      )
    )
  )
)
Figure 10 - DIPL Listing for a Potentially Successful Penetration Attack
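Because a DIPL listing is an s-expression, its fields can be pulled out with a very small parser. The following is an illustrative sketch only, not a DIPL tool: `parse_sexpr` and `find` are hypothetical helpers, and bare words inside an expression are simply joined into one string value.

```python
import re


def parse_sexpr(text):
    """Parse one s-expression into nested lists: [name, child..., optional value]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def parse():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        node = [tokens[pos]]            # the verb or identifier name
        pos += 1
        words = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(parse())    # nested expression
            else:
                words.append(tokens[pos])
                pos += 1
        pos += 1                        # consume ')'
        if words:
            node.append(" ".join(words))
        return node

    return parse()


def find(node, name):
    """Depth-first search for the first sub-expression called `name`."""
    if node[0] == name:
        return node
    for child in node[1:]:
        if isinstance(child, list):
            hit = find(child, name)
            if hit:
                return hit
    return None


listing = """(And (AcquireProxy (ByMeansOf (Attack
  (AttackSpecifics (AttackNickname unknown penetration attack))
  (Target (IPv4Address 123.222.3.4))
  (Initiator (IPv4Address 63.36.3.3))))))"""

tree = parse_sexpr(listing)
target_ip = find(find(tree, "Target"), "IPv4Address")[1]
initiator_ip = find(find(tree, "Initiator"), "IPv4Address")[1]
nickname = find(tree, "AttackNickname")[1]
```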
5.3.2 Mathematical Analysis

We can interpret this particular DIPL listing mathematically. First, we must decide what we wish to interpret. For the purpose of establishing that an unknown penetration attack
could occur and potentially be successful we do not need to include such specific details as the source and target addresses, although there is no particular reason why we could not do so. Our purpose in attempting a mathematical representation is, simply, to establish the potential for success and the parameters that would indicate that success. This representation is useful for exploring the question, could such an attack occur and what would it take to make it successful? We begin with some assumptions. Let p = the set of all penetration attacks, successful or unsuccessful Let t = the target address (123.222.3.4) Let i = the initiator (attacker) address (63.36.3.3) Let prox = a class of penetration attack that acquires proxy (i.e., penetrates the target and gains control of an account). Let p = some specific attack Let prox = the set of successful attacks that can acquire proxy Let f ( prox ) = some specific attack process applied to the general class of penetration attacks, for example a buffer overflow attack, intended to acquire proxy on the target.
We then establish that the attack itself exists:
prox p : f ( prox ) p
We pronounce this as: The class of attacks, prox , intended to acquire proxy control of an account is subset of the set of all penetration attacks, p , successful or not. There exists an attack, , such that some process applied to the class of penetration attacks, prox , intended to acquire proxy yields the specific attack,
p .
Next, we can establish an expression that defines success:
p prox : p ( i,r ) ( p, spret , spostt ) uut

We pronounce this as: There exists an attack, p, a member of the set of successful proxy attacks prox such that the attack p from initiator to target, $\underrightarrow{(i, t)}$, yields a change in state in the target from the pre-attack state $s_{pre_t}$ to a post-attack state $s_{post_t}$. Now we concatenate the two expressions to show the property that must be satisfied for a successful attack:
p prox : p ( i,r ) ( p, spret , spostt ) ( : f ( prox ) p ) p ( i,r ) uut uut

There exists an attack, p, a member of the set of successful proxy attacks prox such that the attack p from initiator to target, $\underrightarrow{(i, t)}$, yields a change in state in the target from the pre-attack state $s_{pre_t}$ to a post-attack state $s_{post_t}$, if and only if there exists an attack such that some process applied to the class of penetration attacks, prox, intended to acquire proxy yields the specific attack, p, and the attack p from initiator to target evaluates to True.
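As a typeset restatement of the three spoken readings above, the following sketch may help. The base symbols are placeholders assumed here purely for illustration: $A_p$ for the set of all penetration attacks, $C_{prox}$ for the class of proxy-acquiring attacks, $a_p$ for the specific attack, and $S_{prox}$ for the set of successful proxy attacks. The connectives ($\subset$, $\wedge$, $\Rightarrow$, $\Leftrightarrow$) are inferred from the spoken readings and are likewise assumptions. The `\underrightarrow` command requires the amsmath package.

```latex
% Existence: the proxy class is a subset of all penetration attacks, and
% some process applied to the class yields a specific attack.
C_{prox} \subset A_p \;\wedge\; \exists a : f(C_{prox}) \Rightarrow a_p

% Success: delivering the attack from i to t yields a state change in t.
\exists a_p \in S_{prox} : a_p\,\underrightarrow{(i,t)}
  \Rightarrow (a_p,\ s_{pre_t},\ s_{post_t})

% Combined: success holds iff the attack exists and can be delivered.
\exists a_p \in S_{prox} : a_p\,\underrightarrow{(i,t)}
  \Rightarrow (a_p,\ s_{pre_t},\ s_{post_t})
  \;\Leftrightarrow\;
  \bigl(\exists a : f(C_{prox}) \Rightarrow a_p\bigr) \wedge a_p\,\underrightarrow{(i,t)}
```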
Note that we have said only that the attack could be successful. In the example DIPL listing we only characterize the attack. The same is true with the logical representation of it. We simply have said that the attack is possible and that, if it is to succeed, it must meet specific conditions. An important example of one or the other of these conditions not being met is the difference between a local attack and a remote attack. While both may be equally capable of succeeding, a local attack attempted from a remote location (over a network) must fail because the second condition, the ability to deliver the attack, does not exist: the process for attacking locally cannot be used remotely.
For investigators, this last statement is very important. If the investigator suspects that there has been a particular type of attack and finds that no such attack actually exists, or that it cannot be delivered as suspected, the obvious conclusion is that his or her initial assumption is in error. Likewise, from the perspective of applying countermeasures, the ability to cause either condition to fail constitutes an effective countermeasure. While this example is trivial and the outcome is obvious, in complex investigations the process can be quite helpful.
5.3.3 Petri Net Model

Petri Nets offer a formalism that, for a lay audience, is simple to grasp due to its graphical nature. The Coloured Petri Net for our trivial example is shown below. Note that the graphical representation has no formal meaning in itself. We show the formal verification of the graphic following the graphical example. This example is a combination of figures 11, 12 and 13. The three figures are simply three different views of the same Net. Figure 11 simply shows the Net in general with no tokens in place (although the initial marking of the Initiator place implies a single pen_attack token in that place), figure 12 shows the token in the input place arming the transition, and figure 13 shows the transition firing and the token appearing in the output place. We use the Design/CPN tool, its successor CPNTools, and Jensen's graphical representation for CPNets. We refer the reader to section 5.2.3 for additional detail regarding Coloured Petri Nets.
Declarations:
color Source = with pen_attack | other_attack;
color Target = Source;
var attack : Source;
[Figure: place Initiator (colour set Source, initial marking 1`pen_attack), input arc 1`attack, transition Attack with guard [attack = pen_attack], output arc 1`attack, place Attack Successful (colour set Target)]
Figure 11 - Coloured Petri Net Describing a Simple Penetration Attack (Design/CPN)
The Initiator place is typed (coloured) by Source and there is a single token (pen_attack) as its initial marking. We have a choice of two possible kinds of tokens in Initiator: pen_attack and other_attack. The variable, attack, on the input arc connecting the input place with the transition Attack is of type Source. If the transition fires, the output arc between the transition and the output place, Attack Successful (also typed by Source), will add the token to the output place and subtract it from the input place. The guard on the transition, [attack=pen_attack], says that, if the token arming the transition by the input arc is pen_attack, the transition will fire and pass the token on to the output place. If the token appears in the output place, the attack has been successful. In the next figure (12), the token has armed the transition.
[Figure: the net of Figure 11 with the 1`pen_attack token from the Initiator place shown on the input arc, arming the Attack transition.]
Figure 12 - Coloured Petri Net Showing the Token Arming the Transition
Note that one pen_attack token is shown on the input arc. Since the token meets the criteria of the transition's guard (i.e., the Boolean expression evaluates to True), the transition will fire and pass the token to the output place. The figure below shows the successful firing of the transition, placing the token in the output place.
[Figure: the net of Figure 11 after the Attack transition has fired. The Initiator place holds 0`pen_attack and the Attack Successful place holds 1`pen_attack.]
Figure 13 - Coloured Petri Net Showing the Token in the Output Place
The appearance of the token in the output place demonstrates that the attack was successful. Next, we use Jensen's CPNet notation to describe formally the graphical representation for the simple example in Figure 11.
Defining the declarations:
\[
\begin{aligned}
p_1 &= \text{Initiator place} \\
p_2 &= \text{Attack Successful place} \\
t_1 &= \text{Attack transition} \\
a_i &= (p_1, t_1) \\
a_o &= (t_1, p_2) \\
\Sigma &= \{Source(p_1),\ Target(p_2)\} \\
C(p_1) &= \{pen\_attack,\ other\_attack\} \\
C(p_2) &= \{p_1\} \\
Var_{attack} &= C(p_1) \\
E_i(a_i) &= C(p_1(a_i)) \wedge Var_{attack}(E_i(a_i)) \\
G(t_1) &= Bool \wedge Var_{attack}(G(t_1)) \\
I(p_1) &= C(p_1) \wedge pen\_attack \\
E_o(a_o) &= C(p_2(a_o)) \wedge Var_{attack}(E_o(a_o))
\end{aligned}
\]
The initial marking, $M_0$, of the CPNet, $CPN_1$, is:
\[ CPN_1 : M_0 = [10] \]
where the token in $p_1$ is a $pen\_attack$ token.
Let $G(t_1) = Var_{attack}(pen\_attack)$
Let binding $b(a_i) = (Source, 1)$
\[ Y = \sum_{(t,b) \in Y} E_i(p_1, t_1)\langle b \rangle \le M \]
Since one pen_attack token resides in place $p_1$ (the initial marking $M_0$), the binding on the input arc $a_i$ is one token of colour Source, and the number of tokens demanded by the binding of the input arc (one) is less than or equal to the number of tokens in the input place (one), the step (transition) $Y$ is enabled, i.e., the transition $t_1$ is armed. Because the guard $G(t_1)$ evaluates to True, the transition fires, placing one token in the output place and removing one from the input place. This verifies the graphical representation of the CPNet. For future examples we will not perform the preceding analysis. It is sufficient to use the preceding to establish the verification process. In subsequent examples we will depend upon the graphical representation and a brief description.
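The enabling and firing steps described above can be sketched in a few lines of Python. This is a minimal illustration only, not the Design/CPN or CPNTools implementation. The function names `enabled` and `fire`, and the representation of a marking as a dictionary of token multisets, are assumptions introduced here for the example.

```python
from collections import Counter

def enabled(marking, place, demand, guard, binding):
    """A binding is enabled when the guard holds and the input place
    holds at least the tokens that the input arc demands."""
    have = marking[place]
    return guard(binding) and all(have[c] >= n for c, n in demand.items())

def fire(marking, src, dst, demand):
    """Return the successor marking: the demanded tokens are removed from
    the input place and the same tokens are added to the output place."""
    nxt = {p: Counter(m) for p, m in marking.items()}
    nxt[src] -= demand
    nxt[dst] += demand
    return nxt

# Initial marking M0: one pen_attack token in Initiator, none in Attack Successful.
m0 = {"Initiator": Counter({"pen_attack": 1}), "Attack Successful": Counter()}

def guard(b):
    # Guard on the Attack transition: [attack = pen_attack]
    return b["attack"] == "pen_attack"

binding = {"attack": "pen_attack"}          # variable attack bound to the token colour
demand = Counter({binding["attack"]: 1})    # arc inscription 1`attack

m1 = fire(m0, "Initiator", "Attack Successful", demand) \
    if enabled(m0, "Initiator", demand, guard, binding) else m0
```

Binding `attack` to `other_attack` instead leaves the transition disabled, because the guard fails and the Initiator place holds no token of that colour.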
5.4 A More Complex Example Attack

We expand upon the preceding example to include the investigator's observation of a state change in the Target. In this example the description of the attack, which is more detailed than in the preceding example, is preceded by an observed change of state on the victim computer.
5.4.1 DIPL Characterization

The attack is a giant ping packet (a malformed ICMP type 8 packet) sometimes called a Ping Of Death. The certainty that the attack will succeed is 100% and the severity of the attack is 100 on a scale of 1 to 100. We now no longer are concerned solely with the ability of an attack to succeed. We see that the attack is a Ping Of Death and that it has, in fact, succeeded. This changes the outcome of our Petri Net and, likewise, the complexity of the mathematical representation. Expanding upon the previous example gives us a more complex DIPL listing:
(And
  (ChangeState
    (OldState proper operation
      (HostName Server1) )
    (CurrentState system crashed)
      (HostName Server1) )
    (Observer
      (RealName Joe Admin) )
    (When 03:15:25 GMT 02042002) )
  (ByMeansOf
    (Attack
      (AttackSpecifics
        (AttackNickname Ping Of Death attack) )
      (ICMPType 8)
      (Target
        (IPv4Address 123.222.3.4) )
      (Initiator
        (IPv4Address 63.36.3.3) )
      (Certainty 100)
      (Severity 100) ) ) ) )
Figure 14 - DIPL Listing for a Successful Denial of Service Attack
5.4.2 Mathematical Analysis

This DIPL listing is a bit more complex than the previous example, so, in order to handle additional complexity, we break the listing into sections. We begin with the following assumptions:
Let $s_1$ = pre-attack operating state of the host
Let $s_2$ = post-attack operating state of the host
Let $s_f$ = failed state of the host
Let $S_h$ = the set of all operating states of the host
Let $\alpha_{dos}$ = a class of attack that can cause a host to change states. $f(\alpha_{dos})$, therefore, describes a function that changes $\alpha_{dos}$ to an attack $p$ that explicitly causes this host to change state
Let $p$ = a Ping Of Death attack
Let $A_{dos}$ = the set of all denial of service attacks
Let $A^{s}_{dos}$ = the set of successful denial of service attacks
Let $i$ = the Initiator
Let $t$ = the Target (the host)
These assumptions address four aspects of the DIPL listing:
1. The state of the host can be made to change
2. There is a specific denial of service attack called a Ping Of Death
3. A Ping Of Death attack will be successful, and,
4. Such an attack against the host will cause it to change states to a failed state.
First, we establish that the state of the host can be made to change:
\[ \exists s_1 \in S_h,\ \exists s_2 \in S_h,\ \exists \alpha : (\alpha, s_1) \rightarrow (s_2 : s_1 \neq s_2) \]
We pronounce this as: There is a pre-attack operating state, $s_1$, of the host, a member of the set $S_h$ of all operating states of the host. There is a post-attack operating state, $s_2$, of the host, a member of the set $S_h$ of all operating states of the host. There exists a class of attack such that the attack applied to the pre-attack state $s_1$ of the host yields a change in the host's state to a post-attack state $s_2$ such that the pre-attack state is different from the post-attack state.
Next, we establish that the specific attack is the Ping Of Death attack:
\[ \alpha_{dos} \subseteq A_{dos},\ \exists \alpha : f(\alpha_{dos}) = p \]

The pronunciation of this statement is similar to that in our previous example: The class of attacks, $\alpha_{dos}$, intended to cause denial of service to a host is a subset of the set of all denial of service attacks, $A_{dos}$, successful or not. There exists an attack, $\alpha$, such that some process applied to the class of attacks, $\alpha_{dos}$, intended to cause denial of service against a host, yields the specific attack, $p$.
We define a successful Ping Of Death attack:
\[ \exists p \in A^{s}_{dos} : p(\overrightarrow{i,t}) \rightarrow (p, s_1, s_2) \]

The pronunciation is: There is an attack $p$, a member of the set of all successful denial of service attacks $A^{s}_{dos}$, such that an attack from the Initiator against the Target results in a change of state in the target.
Finally, we show that a successful Ping Of Death attack, $p$, a member of the set $A^{s}_{dos}$, will cause the target host's state $s_2$ to change to a failed state $s_f$:
\[ \forall p \in A^{s}_{dos} : p(\overrightarrow{i,t}) \rightarrow ((p, s_1, s_2) \wedge s_2 \equiv s_f) \]

For all Ping Of Death attacks $p$ that are members of the set of successful denial of service attacks $A^{s}_{dos}$, a Ping Of Death attack from initiator $i$ to target $t$ results in a change of state in the target host from the pre-attack state $s_1$ to the post-attack state $s_2$, and state $s_2$ is the same as failed state $s_f$.
Summarizing:
1. $\exists \alpha : (\alpha, s_1) \rightarrow (s_2 : s_1 \neq s_2)$ :: It is possible to cause a state change in the host
2. $\exists \alpha : f(\alpha_{dos}) = p$ :: There is a denial of service attack called a Ping Of Death
3. $\exists \alpha : \alpha(i, t) \rightarrow p \in A^{s}_{dos}$ :: A Ping Of Death from an Initiator to a Target will be successful
4. $\forall p \in A^{s}_{dos} : p(\overrightarrow{i,t}) \rightarrow ((p, s_1, s_2) \wedge s_2 \equiv s_f)$ :: A successful Ping Of Death attack will cause the target host's state to change to a failed state.
The issues here are threefold:
1. Can the state of the host be changed?
2. Is there a denial of service attack called a Ping Of Death?
3. Will a Ping Of Death be successful?
As we see, if all three conditions are met (possible to change state, Ping Of Death exists and Ping Of Death against the target will succeed), the host's state will change and the post-attack state will be a failed state. This interpretation is valuable for the investigator, but far more valuable for system administrators simulating the results of a denial of service attack against a host. Knowing that the attack will succeed allows the administrators to take pre-emptive actions to prevent the consequences of the attack.
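The chain of reasoning in the four statements can be checked against a small finite model. The following Python sketch is illustrative only: the state names, the attack names and the functions `apply_attack` and `host_fails` are hypothetical and are not taken from the DIPL or from the investigation.

```python
# Hypothetical finite model of the four statements; all names are illustrative.
S_H = {"operating", "failed"}            # operating states of the host
S_F = "failed"                           # the failed state
A_DOS = {"ping_of_death", "other_dos"}   # all denial of service attacks
A_SUCCESSFUL = {"ping_of_death"}         # attacks assumed to succeed

def apply_attack(attack, s1):
    """Return the post-attack state for an attack applied to state s1."""
    if attack in A_SUCCESSFUL and s1 != S_F:
        return S_F
    return s1

def host_fails(attack, s1):
    """Evaluate the chain: the state can change, the attack exists, the
    attack succeeds, and the resulting state equals the failed state."""
    state_can_change = any(apply_attack(a, s1) != s1 for a in A_DOS)  # statement 1
    attack_exists = attack in A_DOS                                   # statement 2
    attack_succeeds = attack in A_SUCCESSFUL                          # statement 3
    s2 = apply_attack(attack, s1)
    return (state_can_change and attack_exists
            and attack_succeeds and s2 == S_F)                        # statement 4
```

Under these assumptions `host_fails("ping_of_death", "operating")` holds, while an attack outside the successful set, or a host that is already in the failed state, does not satisfy the chain.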
5.4.3 Petri Net Model

Like the mathematical interpretation of this attack, the Petri Net interpretation is somewhat more complex than the previous example. Although the next Petri Net is, as was the preceding net, simplistic and shows only the outcome for a vulnerable host, a more complex net could be used to introduce countermeasures as inhibitors that would prevent the final transition (Deliver Attack) from succeeding, as discussed below. The Petri Net for the above attack is shown below.
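[Figure: three places joined by two transitions. DOS Attacks (colour set Attacks, initial marking 1`pod_attack) feeds the Select Attack transition, guard [attack = pod_attack], whose output place is Initiator (colour set Pod_selected). Initiator feeds the Deliver Attack transition, guard [attack = pod_attack], whose output place is Target (colour set Successful). All arcs carry the inscription 1`attack.]
Declarations:
color Attacks = with pod_attack | other_attack;
color pod_selected = Attacks;
color Successful = Attacks;
var attack : Attacks;
Figure 15 - Coloured Petri Net Describing a Denial of Service Attack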
Here, we have two transitions. The first transition establishes that there is, in fact, a denial of service (DOS) attack called a Ping Of Death. We allow the Initiator to select the Ping Of Death (pod) attack from the set of all available DOS attacks. Because the attack exists (the initial marking of DOS Attacks is 1`pod_attack), the token 1`pod_attack moves from the DOS Attacks input place to the Select Attack transition. Since the selected attack is a Ping Of Death (i.e., the Ping Of Death attack $p$ exists in the set $A_{dos}$ of all possible DOS attacks) the transition fires and the Initiator is ready for the next step. The first stage of the pod_attack token is shown next.
[Figure: the net of Figure 15 after the Select Attack transition has fired. The DOS Attacks place holds 0`pod_attack and the Initiator place holds the pod_attack token, arming the Deliver Attack transition.]
Figure 16 - Coloured Petri Net Showing Token Ready for Attack
The Initiator now launches the attack. If there are no inhibitors (countermeasures) on the Deliver Attack transition, the attack will succeed and will change the state of the Target as shown below.
[Figure: the net of Figure 15 after the Deliver Attack transition has fired. The DOS Attacks place holds 0`pod_attack and the Target place holds one pod_attack token.]
Figure 17 - Coloured Petri Net Showing Token at Target - Attack Successful
The state of the output place Target is changed as shown by the single pod_attack token in the Target place. The Target is of type Successful which is of the same type Attacks as the DOS Attacks input place. There no longer is a pod_attack token in the DOS Attacks place since there was only one there initially and it has moved through two transitions to the Target. From the perspective of an incident root cause investigation (post mortem) there are two points at which the attack might have been stopped. The first is the Select Attack transition. This is the transition that represents the statement,
\[ \exists \alpha : f(\alpha_{dos}) = p \]
that the Ping Of Death attack exists. It is not possible to change that fact, so we move to the next transition, Deliver Attack. Here it is possible to inhibit the success of the attack. We interpret the statement "deliver attack" to mean that the attacker has successfully delivered the attack to the victim and caused the victim's state to change. If we introduce countermeasures at this transition, we inhibit the successful delivery of the attack and the victim's post-attack state does not become the failed state.
5.5 Example of the Use of Modelling in Incident Post Mortems

The use of models, such as Coloured Petri Nets, is supported by the DIPL. Typically, these models are applied to the verbs and are modified by the DIPL roles. There is no direct ability to include levels of detail such as When (although relative times can be shown) or other atoms. For the most part, this is not a failing since we are not interested in modelling the flow of events. The When SID does not specify a flow of events. It simply places the state described by the DIPL listing at a point in time. The DIPL represents, essentially, a snapshot, not a temporal ordering of events. However, the concept of inhibitors is important and allows us to consider specific SIDs as countermeasures. Conversely, some SIDs may be used as enablers. Formalisms allow us to construct fairly complex representations of events and, as is a goal of the EEDI process, develop simulations that explain complex behaviours during an attack. This capability is extremely important in root cause investigations, or incident post mortems.
5.5.1 Simple Petri Net Interpretation of an Incident Post Mortem

We describe the use of the Petri Net formalism in a fragment of an actual incident post mortem. Note that, for brevity, this example represents only a portion of the overall investigation. The victim was attacked with the SQLSlammer worm [QTIM03]. In the portion of the root cause investigation we describe, the set of conditions that permitted the attack to be successful is mapped onto a Petri Net similar to the nets described earlier in this chapter. The investigation found that there were limited countermeasures in place to prevent the incident. These countermeasures focused upon the victim's firewalls. These firewalls were configured to deny UDP and TCP packets incoming through port 1434, the port used by the worm to enter a system. The Petri Net of the actual penetration mechanism, and countermeasures in place against it, is shown below.
[Figure: the attack path runs from DOS Attacks (colour set Attacks, initial marking 1`sqlslammer_attack) through the Select Attack transition, guard [attack = sqlslammer_attack], to Initiator (colour set Slammer_selected), and through the Deliver Attack transition, guard [(attack = sqlslammer_attack), (countermeasure <> configured)], to Target (colour set Successful). Four places of colour set Inhibitors (Perimeter Firewalled, Laptops not Virus Checked, VPN Firewalled and Wireless Firewalled), each with initial marking 1`configured, feed the Deliver Countermeasures transition over arcs inscribed 1`perimeter, 1`laptop, 1`vpn and 1`wireless. Deliver Countermeasures places 4`countermeasure in the Countermeasures place (colour set Inhibited), shown holding 4`configured, which is an input to Deliver Attack over an arc inscribed 1`countermeasure.]
Declarations:
color Attacks = with sqlslammer_attack | other_attack;
color slammer_selected = Attacks;
color Successful = Attacks;
color Inhibitors = with configured | not_configured;
color Inhibited = Inhibitors;
var attack : Attacks;
var countermeasure : Inhibitors;
var perimeter : Inhibitors;
var vpn : Inhibitors;
var wireless : Inhibitors;
var laptop : Inhibitors;
Figure 18 - CPN1 Coloured Petri Net for SQLSlammer Simulation: Countermeasures in Place
In this net we are analyzing several events, all of which must occur to protect the Target. First, we show that the Initiator has chosen to launch a SQLSlammer attack against the Target. He or she selects the attack from the set of all available denial of service attacks (DOS Attacks). The transition Select Attack fires if there exists a SQLSlammer attack in the DOS Attacks place. There is, so one sqlslammer_attack token appears in the Initiator place, arming the Deliver Attack transition. The Deliver Attack transition will fire iff it sees at least one sqlslammer_attack token and at least one not_configured token. Since all four places coloured Inhibitors (Perimeter Firewalled, VPN Firewalled, Laptops Not Virus-Checked and Wireless Firewalled) are set to an initial marking of configured, four configured tokens are presented at the input of the Deliver Countermeasures transition, causing it to fire. Note that there is no guard on the Deliver Countermeasures transition, allowing it to fire whenever it is armed. The four configured tokens in the Countermeasures place are presented to the Deliver Attack transition. Since every token at that input place is a configured token, the Deliver Attack transition cannot fire: the guard requires that the variable attack be satisfied by a sqlslammer_attack token, and that the countermeasure variable not be satisfied by a configured token. The Deliver Attack transition does not fire, no token is in the Target output place, and we have shown that the attack fails. The failure is a result of deadlock caused by the inability of the Deliver Attack transition to fire.
The places of this net (CPN1) are numbered as follows:
Place p1 = DOS Attacks
Place p2 = Initiator
Place p3 = Target
Place p4 = Countermeasures
Place p5 = Perimeter Firewalled
Place p6 = Laptops not Virus Checked (a configured token here means that the laptop has been checked)
Place p7 = VPN Firewalled
Place p8 = Wireless Firewalled
The place markings, M0 being the initial marking, are:
CPN1: M0 = [10001111] (1 token each in places p1, p5, p6, p7, p8, none in the rest)
CPN1: M1 = [10040000] (occurs after the firing of the first transition, here t3)
CPN1: M2 = [01040000] (occurs after the firing of the remaining transition, here t1)
There are two potential firing sequences s for transitions t1 (Select Attack), t2 (Deliver Attack) and t3 (Deliver Countermeasures):
s = t1 t3 t2 or s = t3 t1 t2
The desired result, that there is no token in place p3, has been achieved. Transition t2 has been unable to fire. However, if even one of the places coloured Inhibitors is wrongly configured for this attack (i.e., the initial marking is not_configured), the Deliver Countermeasures transition places a not_configured token in the Countermeasures place. The Deliver Attack transition sees in the variable countermeasure a token (not_configured) that is not equal to configured and allows the attack (the state of the Target place is changed). Thus, because the Wireless Firewalled place is not configured (i.e., the investigation revealed that there was no firewall between the wireless network and the internal network) the attack succeeds. The Petri Net, CPN2, for the successful attack is shown below. We have simplified the Net a bit for space considerations by removing one of the inhibitor input places to the Deliver Countermeasures transition. Also, for ease of reading, we show only one not_configured token in the Countermeasures place. There are, additionally, two configured tokens. However, we are concerned only with the not_configured token since it satisfies the guard condition on the Deliver Attack transition for a token that does not equal configured.
[Figure: the net of Figure 18 with the Laptops not Virus Checked place removed. Perimeter Firewalled and VPN Firewalled have initial marking 1`configured and Wireless Firewalled has initial marking 1`not_configured. Each inhibitor place feeds Deliver Countermeasures over an arc inscribed 1`countermeasure, and Deliver Countermeasures places 3`countermeasure in the Countermeasures place (colour set Inhibited), shown holding 1`not_configured. The Deliver Attack transition, guard [(attack = sqlslammer_attack), (countermeasure <> configured)], has fired and the Target place holds 1`sqlslammer_attack.]
Declarations:
color Attacks = with sqlslammer_attack | other_attack;
color slammer_selected = Attacks;
color Successful = Attacks;
color Inhibitors = with configured | not_configured;
color Inhibited = Inhibitors;
var attack : Attacks;
var countermeasure : Inhibitors;
var perimeter : Inhibitors;
var vpn : Inhibitors;
var wireless : Inhibitors;
Figure 19 - CPN2 Coloured Petri Net for SQLSlammer Simulation: Countermeasures Fail
The place markings for this net (CPN2), M0 being the initial marking, are:
CPN2: M0 = [1000111]
CPN2: M1 = [1003000]
CPN2: M2 = [0103000]
CPN2: M3 = [0012000]
Note that there is now a token in place p3, indicating that the net is live and the attack has succeeded. The possible firing sequences s have not changed except that, in this net, transition t2 has been enabled by a not_configured token in the input place. As can be seen from the Petri Net, the attack must have succeeded. The selection of the Inhibitors coloured places and their initial markings comes directly from the incident post mortem investigation as characterized in a DIPL listing. Taking the evidence unambiguously from the process language and performing a formal analysis ensures that the data used for the analysis is correct. The process can be worked in reverse. Given the symptoms and other evidence of an attack, network configuration and observations of witnesses, a Petri Net may be constructed from a detailed DIPL listing that, when simulated, points to the possible failed countermeasures or other possible root causes. This process allows investigators to simulate system behaviour based upon available evidence and gain insight into possible root causes or other avenues of investigation.
We can extrapolate the above for the purposes of an example. Consider a situation where one of the inhibitors was a place we will describe as SQL Servers Patched. We will add another inhibitor place called MSDE Patched. The former place represents only the SQL Servers in the organization. The latter place represents all other MSDE (Microsoft SQL Server Desktop Engine) devices. In the actual investigation there was considerable speculation as to whether patching all of the SQL Servers would have prevented the consequences of the attack. The investigation showed that there were 500 SQL Servers and an additional !,000 MSDE devices. Since it would be impossible to patch every MSDE device (some are not patchable due to the nature of the MSDE implementation), the CPNet would show configured tokens in all inhibitor places except MSDE Patched. If there was only a single not_configured token in that place the attack would succeed. This is an example of creating a CPNet that describes the enterprise environment in a perfect state of operation and then, based upon collected evidence, enabling transitions (or not) based upon the investigation of what was found to be the actual state of the enterprise. The CPNet needs to represent the infrastructure correctly as a starting point; however, the configuration of the enterprise devices will affect the behaviour of the Net.
Should the underlying architecture of the enterprise be flawed, a CPN representing the actual enterprise will show, by its behaviour, where the flaws exist. By remediating the flaws in the CPNet, we see how to remediate them in the actual enterprise. This remediation represents the application of countermeasures that will prevent successful compromise of the enterprise.
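The behaviour of the two SQLSlammer nets can be approximated with a short Python sketch. This is an assumption-laden illustration and not a substitute for simulation in a CPNet tool: the function `simulate` and its list-based token handling are introduced here for the example, and the three transitions are simply tried in turn until none can fire.

```python
def simulate(inhibitors):
    """Run the net until no transition can fire.

    inhibitors maps each inhibitor place name to its initial token,
    either 'configured' or 'not_configured'.
    Returns the tokens left in the Target and Countermeasures places.
    """
    dos_attacks = ["sqlslammer_attack"]       # initial marking of DOS Attacks
    initiator, target, countermeasures = [], [], []
    pending = dict(inhibitors)                # tokens still in the inhibitor places
    progressed = True
    while progressed:
        progressed = False
        # Select Attack, guard [attack = sqlslammer_attack]
        if "sqlslammer_attack" in dos_attacks:
            dos_attacks.remove("sqlslammer_attack")
            initiator.append("sqlslammer_attack")
            progressed = True
        # Deliver Countermeasures, no guard: moves one token from every inhibitor place
        if pending:
            countermeasures.extend(pending.values())
            pending = {}
            progressed = True
        # Deliver Attack, guard [(attack = sqlslammer_attack), (countermeasure <> configured)]
        if "sqlslammer_attack" in initiator and "not_configured" in countermeasures:
            initiator.remove("sqlslammer_attack")
            countermeasures.remove("not_configured")
            target.append("sqlslammer_attack")
            progressed = True
    return target, countermeasures

# All four inhibitor places configured: Deliver Attack can never fire.
cpn1 = simulate({"Perimeter Firewalled": "configured",
                 "Laptops not Virus Checked": "configured",
                 "VPN Firewalled": "configured",
                 "Wireless Firewalled": "configured"})

# Wireless network not firewalled: one not_configured token enables Deliver Attack.
cpn2 = simulate({"Perimeter Firewalled": "configured",
                 "VPN Firewalled": "configured",
                 "Wireless Firewalled": "not_configured"})
```

The same function can be used to explore the extrapolated case: adding an inhibitor entry such as `"MSDE Patched": "not_configured"` to an otherwise fully configured set is enough for the token to reach the Target.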
6. VALIDATION OF RESEARCH RESULTS

In this chapter we discuss the approaches taken to support the results of this work. Much of the validation of any research results can come from the community of peers that reviews the work and its various applications. The underlying techniques, technologies, mathematics, and models reported here have been applied by the author, subsequent to the original research, in other venues such as post-incident root cause analysis and information systems risk analysis and management. Those applications have resulted in an important citation by other researchers in the field of digital investigation [GP04], and numerous peer-reviewed and published papers by the author [PRS02], [PS03A], [PS04], [PS03B], [PS04B] (papers reported in Section 8.2, below). This chapter begins with a general discussion of some of the validation approaches used (6.1). Because the work reported here is foundational, there is little documented prior experience, and organizations and law enforcement agencies tend not to provide details of investigations and digital incidents. Thus, we have turned to other methods to validate the work. We describe those below. In each of the sections that follow 6.1, we take up an approach used to validate a particular aspect of the work (Sample Investigations: 6.2, SIDs: 6.3, evaluations by practitioners: 6.4, etc.). While any single validation approach as reported below may seem incomplete as regards the whole of the work (i.e., addresses only a portion of the research directly), taken together in the context of peer reviewed publications relating to the research a pattern of validation emerges. The author makes certain claims regarding the research in section 1.5 and the results of the research supporting those claims appear in section 7.3 below. The reader will note that the individual validation approaches reported below support the contentions made in 7.3. For example, the first claim posits a structured, reliable digital investigative process. 
We validated this claim through feedback from investigative professionals being trained on the process and through the use of formal modelling. The success of the claim was supported further by its appearance in peer reviewed journals. Where possible, we have attempted to use more than one method of validation against any individual claim.
6.1 Introduction and discussion of validation approach

There are a number of ways one can validate the results of research such as that reported in this thesis, for example:
- Statistical validation against historical data
- Formal (mathematical) verification
- Application of hypotheses against existing case studies
- Third party response to research results (less rigorous than other methods)
Unfortunately, due to the lack of applicable historical data we needed to develop other verification approaches. We believe that, over time and once the methods reported here begin to gain some credibility in the field, useful investigation data will become available. The primary difficulty with historical data lies in two main areas. First, it is not common for investigators to maintain investigation notes in cyber investigations in a predictable, consistent manner. Thus, comparison of investigations is limited by the rigour with which investigators document digital investigations. The second challenge is that most investigators either cannot or will not share investigation data. For those reasons, we have applied three specific approaches in this work to validation of the methodologies developed during the research. Those approaches are:
1. Validation of sample investigations, conducted by the writer, using the DIPL and formal methods such as Coloured Petri Nets
2. Validation of samples of DIPL SIDs using mathematics and/or formal methods such as Coloured Petri Nets
3. Validation of methods using evaluations of practitioners in training classes on those methods
In the following sections we discuss each of those methods and their results. Figure 20 below summarizes results of methods one and three.
DESCRIPTION | VERIFICATION METHOD (Narrative) | VERIFICATION OUTCOME (Narrative) | % Supporting
Incident post mortem reported in A.2.1 | DIPL characterization and CPNet model applied to original investigation and outcome | Incident solved using DIPL and CPNets - not solved originally | N/A
Incident investigation reported in A.2.2 | DIPL characterization and CPNet model applied to original investigation and outcome | Original solution verified | N/A
1 hour training class - DIPL | DIPL presented following which the class of approximately 20 was asked to rate the value of the DIPL as an original and useful investigative technique. | The class supported the DIPL approach | 75%
1 hour training class - incident post mortems using EEDI and CPNets | EEDI was presented following which the class of approximately 40 was asked to rate the value of the EEDI process in the context of incident post mortem investigations as an original and useful investigative process | The class supported the EEDI approach for incident post mortems | 87.50%
2-day training class - End-to-End Digital Investigation Process | EEDI was presented in detail in a two-day training class following which the class of 22 was asked to rate the session in general and the key points in particular | Those reporting (7 of the class of 22) overwhelmingly rated the key points as good and the overall session's usefulness as an average of 19.57 out of a possible 20 points | Key points: 100% Overall: 97.9%
Figure 20 - Validation Results
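As a check on the arithmetic behind the two-day class entry, the overall percentage follows directly from the reported mean score, and the share of the class that returned evaluations can be computed the same way. The response rate is derived here from the figures given and is not a number reported in the original evaluation.

```latex
\[
\frac{19.57}{20} \times 100\% = 97.85\% \approx 97.9\%,
\qquad
\frac{7}{22} \times 100\% \approx 31.8\%
\]
```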
6.2 Formal Validation of Sample Investigations

In order to test the validity of the processes presented in this thesis, we selected two investigations conducted by the writer. Neither of these investigations was conducted using the methods discussed here. Rather, the investigation notes were subjected to analysis using techniques including DIPL characterization and Coloured Petri Net modelling. The investigations were of two types: an incident post mortem and an intrusion investigation. These two investigations and the formal verification processes are described in detail in Appendices A.2.1 and A.2.2. Upon reviewing the detailed investigator notes of the two actual investigations, DIPL representations were created and CPNet models developed from the notes and the DIPL characterizations following the EEDI process. In the first case the lack of an original solution gave way to a hypothetical solution using the EEDI procedure. That solution was subsequently verified, through discussions with the original victim, as being correct or, at the least, credible. In the second case the original solution was verified. Since the original solution resulted in the admission of the suspect, we conclude that the original solution was, in fact, correct. We used this investigation as a control since we were well acquainted with the actual outcome and could test the EEDI process directly against it.
6.3 Validation of Selected SIDs

In section 5.2.2 we demonstrate the mathematical verification of some common DIPL SIDs. The incorporation of DIPL SIDs and s-expressions into functional CPNets appears to validate their use further. While it is easier, due to the nature of the DIPL, to verify SIDs mathematically than it is using CPNets, it is intuitive that if a CPNet incorporating a DIPL expression functions correctly the included DIPL expression is likely to be correct as well. Experiments in constructing a CPNet using malformed DIPL s-expressions showed that the net failed to function correctly. Since the DIPL s-expression is intended to express a process, and since CPNets may be used to model a process, it appears logical that such behaviour would occur.
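One simple class of malformation, unbalanced parentheses, can be detected before a listing is mapped onto a net. The Python sketch below is a generic s-expression reader written for illustration. It checks only parenthesis structure and does not implement the DIPL grammar or validate SID names; the functions `parse_sexpr` and `is_well_formed` are hypothetical helpers.

```python
import re

def parse_sexpr(text):
    """Parse a parenthesised listing into nested Python lists.
    Raises ValueError when the parentheses do not balance."""
    tokens = re.findall(r'\(|\)|"[^"]*"|[^\s()]+', text)
    stack, current = [], []
    for tok in tokens:
        if tok == "(":
            stack.append(current)     # remember the enclosing expression
            current = []
        elif tok == ")":
            if not stack:
                raise ValueError("unexpected ')'")
            parent = stack.pop()
            parent.append(current)    # attach the finished expression
            current = parent
        else:
            current.append(tok.strip('"'))
    if stack:
        raise ValueError("missing ')'")
    return current

def is_well_formed(text):
    """Return True when the listing parses, False otherwise."""
    try:
        parse_sexpr(text)
        return True
    except ValueError:
        return False
```

For example, `is_well_formed("(Attack (ICMPType 8) (Certainty 100))")` holds, while a listing with a missing or surplus closing parenthesis is rejected.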
6.4 Practitioner Evaluations

Three specific training classes on the EEDI process were conducted in two different venues, with three different audiences, covering EEDI from three different perspectives in sessions lasting from one hour to two days. The attendees then were asked to fill out evaluation forms that, among other questions, addressed the attendee's opinion of the validity of the process and the applicability to their work or profession. The results are summarized in Figure 20 above. While this is a small-scale and subjective evaluation, the attendees all were professionals and academics in a position to form a critical opinion about the process. The attendee evaluation forms (not available due to their proprietary nature) addressed standard concerns of training institutions implementing a quality process. Questions referred directly to the attendee's impressions of the instructor's knowledge and abilities, the validity of the topic material, the applicability of the material to the attendee's work and the attendee's opinion of the value of the material to him or her. The answers were presented on a scale of 1 to 10 (10 being the most positive), 1 to 15 (15 being most positive), or 1 to 20 (20 being the most positive). While students were aware of the instructor's development of the techniques being taught, the attendees of classes provided by the various training organizations represented have a long-standing reputation for being candid in their evaluations.
6.5 On the Use of Graphics in Courts of Law

Smith and Bace spend significant time in their book A Guide to Forensic Testimony [SB03] discussing the benefits of using graphics to support expert testimony. The authors sum up the use of graphics in a chapter 4 section titled "Showing and Telling is Better than Just Telling": "Our approach, in addition to careful preparation for direct examination and the anticipation of all the imagined problems in preparing for cross-examination, is to construct simple and honest graphics to allow the expert and the attorney and then the jurors to put it all together as the elements are presented through the testimony of witnesses." Technical issues can be very difficult to present to a lay audience, and the concepts discussed in this thesis are no exception. Because our techniques depend upon complex mathematical modelling and technical investigative concepts that are, at best, difficult to explain, the use of graphics offers an opportunity to support testimony in a simple to understand way that does not appear to talk down to or patronize the finders of fact. For these reasons we selected tools, such as Coloured Petri Nets, that are well-respected in their scientific, technical and mathematical communities, and present their conclusions graphically. Subjective analysis of the use of graphical methods in training classes vs. direct mathematical representation demonstrated consistently that practitioners attending the classes had difficulty with many concepts until the graphical tools were applied to the explanations. After describing in mathematical detail the modelling of the data flows in a compromised network, such as that in appendix A.2.1, the writer posed several questions to the class in the form of an oral quiz. The responses indicated a clear lack of understanding of the process. Upon loading the CPNet modelling tool and demonstrating the simulation of the same data flows, the writer again questioned the class. 
The resulting responses showed a much clearer understanding of the process. Further, the application of an accepted modelling/simulation program to the problem seemed to lend credibility to the results. We concluded from this informal experiment that the contentions of Smith and Bace in regard to graphics in the courtroom appeared to be consistent with our direct experience
119
in the classroom. Subsequent to the class evaluations reported above, we have refined the presentation of the CPNet models for new training sessions with new students. We found that by representing the CPNets simply as models with no explanation of the underlying mathematics, and omitting references to CPNet jargon (such as places, transitions, arcs and initial markings) in favour of common information security terminology (places become policy domains and transitions become channels, for example), class acceptance of the models appeared to improve.
6.6 Comparison With Other Validation Approaches

Some research validation approaches require statistical sampling and analysis. However, the most appropriate type of research for the type of work reported here is the case study. An approach to case studies as a research method is outlined by McNamara [CM99]. McNamara suggests that a case study might be used to evaluate a program's strengths and weaknesses. The application of investigative techniques such as those described in this thesis is not dissimilar to the application of some program to an organization. In both cases we begin with a system/organizational state, apply some state-changing event and then attempt to assess the nature and impact of that event. A case study of that process would, as McNamara suggests: "organize a wide range of information about a case and then analyze the contents by seeking patterns and themes in the data, and by further analysis through cross comparison with other cases. A case can be individuals, programs, or any unit, depending on what the program evaluators want to examine through in-depth analysis and comparison."
Phillips and Pugh [PP00] discuss various types of research appropriate to the writing of a PhD thesis and conclude that the testing-out approach is most appropriate. They describe this type of research as finding the limits of previously proposed generalizations. Applied to the work reported in this thesis, we have attempted to assess the current approaches to digital investigation and have posited a new methodology. Because there are few documented examples of existing digital investigations with sufficient detail to be useful as a universe for comparison, we have taken two examples from our own personal knowledge as benchmarks for comparison. To these two examples, we have applied the methods described herein and reported the results.
However, to extend the testing-out method of validation we need a larger and more representative universe of digital cases for comparison between various investigative methods. While the method used here is, in reality, a hybrid of Phillips and Pugh's testing-out approach and their problem solving approach, it might be argued that a pure testing-out methodology would be preferable. We do not support that conclusion for three reasons.
First, although quantitative methods are used in the analysis of digital incidents described here, digital investigation and, in reality, all investigation is, at its heart, qualitative in nature. That qualitative nature tends to imply a more qualitative approach to research validation. The type of case study approach suggested by McNamara seems to us to be consistent with a qualitative problem and its associated vagaries.
Second, a pure testing-out approach to validating an investigation (a process that is, by its nature, somewhat intuitive) may leave untested those methods of drawing conclusions typical of seasoned investigators. An important part of the development of a novel approach to digital investigation is the preservation, where appropriate, of the intuitive process of the experienced investigator. Analysis of final conclusions in the context of the methods used to arrive at those conclusions is a more reliable, but arguably more difficult, approach to their validation and the validation of the methods used to arrive at them.
The third and, perhaps, the most important reason for avoiding a pure testing-out research methodology is that investigations, digital or otherwise, do not follow a strict, predictable format. Although we have proposed a framework in this research, we emphasize that it is a framework only. The ability to model the application of that framework, along with the use of the DIPL to characterize its application in an actual investigation, says little about the rigor of the investigative process itself. In fact, it is exactly that lack of rigor that this work addresses. Because traditionally that predictable rigor has not been present, there is no pre-established baseline against which to test or measure a new approach. As previously noted, we are forced to work from the conclusions of previous investigations instead of a detailed, well-documented process.
For these reasons, we suggest that, while future work may be structured to take advantage of detailed investigations, well-documented for use as research subjects, a hybrid approach to the current research is, in this foundational stage, more productive.
6.7 Comparison With Investigative Process Models

The notion of formalizing a process model for digital investigation is extremely new. At the time of the original research the closest model (actually a framework) for digital investigation was the DFRWS framework. Very close to the DFRWS framework was the work of Reith et al. [RCG02]. Although there are minimal extensions to the DFRWS framework, Reith's framework is essentially the same. Within the literature, we find isolated examples of individual techniques that an investigator may apply to a digital investigation. We do not find a structured process (other than this work) until the late (November) 2003 publication by Carrier and Spafford [CS03] suggesting a framework based upon the physical examination of a crime scene and the conduct of digital investigations patterned after the conduct of physical criminal investigations. First publication by the writer of the EEDI process began in October 2002 [PS02A], nearly a year prior to the Carrier/Spafford paper. Interestingly, although the process suggested by Carrier and Spafford mimics the physical investigation process, and is extremely complex, comprising 15 steps, those functions that relate directly to digital investigation are very similar to the EEDI process. Where the two differ significantly is in the granularity of physical steps in the process.
In addition to the more structured processes described above, the U.S. Department of Justice suggests a digital investigative process, specifically structured for first responders [TWG01]. This document focuses primarily upon the handling of individual items of evidence potentially located at an electronic crime scene. The framework suggested relates specifically to the physical management of the crime scene by a first responder and does not address the overall digital investigative framework directly.
There are three significant differences between these other approaches and the EEDI process:
1. The so-called process models are not models but, rather, are frameworks. EEDI is a structured framework that contains elements of formal modelling where the other processes do not. Thus, the EEDI approach meets criteria for scientific rigor that the other frameworks do not.
2. The process models are structured (with the exception of the DFRWS framework) around the traditional approach to investigating a physical crime scene. While it is obvious that there are elements of the physical scene in a digital investigation, the application of physical investigation and evidence management techniques is simply a manifestation of the needs of the specific crime scene. The EEDI approach takes this into consideration but focuses upon the digital investigation process as a whole, not being limited by physical requirements but admitting of them as appropriate.
3. The process models are not rigorously structured with the ability to test their outcome objectively. The EEDI approach incorporates elements that not only allow objective scientific review but depend upon it.
7. SUMMARY, CONCLUSIONS AND FUTURE DIRECTIONS
7.1 Conclusions
The digital investigative process is not, at this writing, defined formally as an explicit discipline. Digital investigators often perform their own digital forensic analysis of data collected in investigations. That process differs from most other forensic sciences, where the investigator investigates, crime scene specialists collect evidence and forensic laboratory examiners analyze the evidence collected. For simple investigations, the status quo works acceptably well. However, digital crime in general, and digital attacks in particular, are increasing in complexity, and increasingly complex solutions in the form of digital investigative and forensic techniques and tools are required to keep pace with the evolving state of electronic crime. In order for digital forensics to take its place with other branches of forensic science it is necessary to evolve it in terms of scientific rigor, discipline and structured process. Digital forensic science is born out of computer science and, as such, exhibits characteristics unique among the forensic sciences. Some of the unique characteristics include, but certainly are not limited to:
Massive amounts of data collected in digital investigations juxtaposed against the potentially small amount of useful information buried somewhere in the data
Difficulty in recognizing the value of collected information early in the investigative/evidence collection process
Decentralized nature of a digital/virtual crime scene
Difficulty in establishing the extent of, and certainly the perimeter of, the digital crime scene
Difficulty in locating material evidence
Mathematical nature of computer science in general and computer systems in particular
Interpretive nature of digital information in various contexts
Evidence management
Validity and originality of evidence
Organizational conflicts between information security, electronic business operations and forensic readiness
Limited digital case law
It is these unique characteristics that simultaneously create challenges for the investigator/analyst, and offer opportunities for scientific analysis not present in some other forensic sciences. For example, the mathematical nature of information technology provides a solid platform for a reasonably informed analysis of a digital event. Unlike the case of other forensic sciences, digital forensic scientists need not answer questions about the probability that an event occurred in a certain manner. If the evidence is available, the probability, due to the mathematically structured nature of data and the systems that manage it, is certain. The problem arises in the collection, recognition, management and analysis of the evidence.
In this thesis we have examined that final problem: the collection, recognition, management and analysis of electronic evidence. We have proposed a structure built around a consensus process framework. We have proposed a language that allows that structured framework to be applied consistently. Finally, we have proposed an approach for rigorously testing hypotheses based upon evidence collected and managed around the structured framework. In legal terms, addressing such court issues as Daubert tests now can rest upon established scientific and mathematical techniques. While not every case is of sufficient complexity or importance to warrant detailed examination using formal techniques, investigation of every digital incident now can follow a predetermined and structured pattern, inserting additional techniques as necessary to address incident complexities.
7.2 Advantages and Disadvantages of the Proposed Approach

The advantages of the proposed approach are, at this point, obvious and have been well documented in this thesis. They include but are not limited to:
Documentation of a structured and reliable digital investigative process.
Application of sound scientific and mathematical principles to the digital investigative process.
The ability to test a digital investigation for completeness.
The ability to extend digital forensic principles to those portions of the conduct of a digital investigation to which they apply.
The ability to derive measures of reliability in a specific digital investigation and to test that investigation for its reliability.
The ability to show conclusively the scope, process and results of a digital investigation to a lay audience such as a trial jury such that the audience comprehends the complexities of the scope and process, and accepts the results as conclusive.
There are, however, some subtle disadvantages that have emerged during the research reported here. Many of these disadvantages come from the nature of the processes and the tools necessary to effect the process in an actual investigation. Some of those disadvantages include:
The process is complex and, in some cases, difficult to follow for lay investigators (whether or not they are experienced in investigation) not schooled in the tools and techniques of the EEDI process.
The use of the DIPL can be extremely tedious in a large investigation (the type for which the DIPL can be the most beneficial).
Not all investigations are the same or even similar. Application of the EEDI techniques described here may be awkward in some cases.
Application of the process, including modelling, is very subjective and requires a subtle grasp of the investigative environment by investigators participating in a particular digital investigation. These subtleties can result in incorrect application of the modelling process and lead to incorrect conclusions.
We believe that most, if not all, of these disadvantages can be addressed by creating software tools that execute the techniques described in this thesis. Some examples might include a data store tool that can be used to collect and correlate data from large-scale investigations in a single place, a tool for collecting, normalizing and analyzing multiple logs from multiple types of devices, and a tool that allows large quantities of disparate data, from disparate sources, presented in disparate formats to be normalized, mined and analyzed. However, at this point such software tools do not exist and these disadvantages may pose a barrier to the use of the EEDI techniques.
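To make the idea of a log normalization tool concrete, the following is a minimal sketch only. It assumes two hypothetical log formats (an epoch-stamped, pipe-delimited firewall log and an ISO 8601 server log) and shows how lines from both could be reduced to a common record and merged into a single timeline. The formats, field names, device names and sample lines are invented for illustration and do not describe any existing tool or any data from this research.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True, order=True)
class Event:
    when: datetime   # timezone-aware UTC timestamp (first field, so it drives sorting)
    device: str
    message: str


def parse_firewall(line: str) -> Event:
    # Hypothetical format: "<epoch seconds>|<device>|<message>"
    epoch, device, message = line.split("|", 2)
    return Event(datetime.fromtimestamp(int(epoch), tz=timezone.utc), device, message)


def parse_server(line: str) -> Event:
    # Hypothetical format: "<ISO 8601 UTC timestamp> <device> <message>"
    stamp, device, message = line.split(" ", 2)
    when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return Event(when, device, message)


def normalize(sources):
    """Merge (parser, lines) pairs into one chronologically ordered list."""
    events = [parser(line) for parser, lines in sources for line in lines]
    return sorted(events)


# Invented sample lines; the server entry is listed first but occurred later.
timeline = normalize([
    (parse_server, ["2003-01-25T05:30:40Z db1 service restarted"]),
    (parse_firewall, ["1043472632|fw1|inbound packet allowed"]),
])
```

Converting every timestamp to UTC before sorting is the design choice that matters here: without a common time base, events recorded by different devices cannot be placed on one timeline.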
7.3 Summary of Main Contributions

In Section 1.5 we presented the following claims of specific contributions to the body of knowledge regarding digital investigation:
1. The first reliable, structured process for using scientifically derived and proven methods and/or techniques toward the conducting of an investigation or enquiry in a digital environment for the investigation of digital security incidents in a complex network environment such as the Internet (Chapter 3), and
2. the first process language specifically derived for use in characterizing a digital forensic investigation (Chapter 4) in support of contribution number 1, and
3. the first structured process for creating a formal mathematical model of a digital investigation (Chapter 5) in support of contribution number 1, and
4. an approach to presenting the results of a formally modelled and proven digital investigation to a court of enquiry (Chapter 5, use of Coloured Petri Net graphical representations).
7.3.1 A Digital Investigative Process

The primary objective of this work is the development of a digital investigative process that meets the criteria of the definition in section 1.6.10. The End-to-End Digital Investigation (EEDI) process satisfies that requirement. It is based upon the DFRWS investigative framework, a structured investigative process and the concept that an incident begins with an attacker or Initiator, and ends with the victim or Target and includes all networks and devices in between. An end-to-end investigation depends upon the collection, management and analysis of evidence throughout the entire end-to-end chain. Each link in the incident evidence chain participates in the corroboration process, yielding a complete, corroborated chain of evidence that may be viewed both as a process chain (i.e., chain of corroborated cause and effect events) and as a temporal chain (i.e., a corroborated timeline of relevant events).
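The notion of a complete, corroborated chain from Initiator to Target can be illustrated with a short sketch. The code below is not part of the EEDI process as defined in this thesis. It assumes, purely for illustration, that a link counts as corroborated when at least two independent evidence sources report it, and that a chain is temporally consistent when each link occurs no earlier than the one before it. The class, function, host and source names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Link:
    src: str
    dst: str
    when: datetime
    sources: frozenset  # names of independent evidence sources reporting this link


def corroborated_chain(links, initiator, target, min_sources=2):
    """Return the links of a corroborated, time-ordered path, or None if none exists."""
    # Assumption: a link is usable only if enough independent sources report it.
    usable = [link for link in links if len(link.sources) >= min_sources]

    def extend(node, after, path, seen):
        if node == target:
            return path
        for link in usable:
            in_order = after is None or link.when >= after
            if link.src == node and link.dst not in seen and in_order:
                found = extend(link.dst, link.when, path + [link], seen | {link.dst})
                if found is not None:
                    return found
        return None

    return extend(initiator, None, [], {initiator})


# Invented example: two hops, each reported by two sources.
t0 = datetime(2003, 1, 25, 5, 30, 0)
t1 = datetime(2003, 1, 25, 5, 30, 5)
links = [
    Link("attacker", "router", t0, frozenset({"firewall log", "IDS alert"})),
    Link("router", "server", t1, frozenset({"server log", "flow record"})),
]
chain = corroborated_chain(links, "attacker", "server")
```

A result of None indicates a gap: either some link lacks corroboration or the surviving links cannot be ordered in time, which is the point at which further evidence collection would be needed.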
7.3.2 A Process Language

The Digital Investigation Process Language (DIPL) is a derivation of the Common Intrusion Specification Language (CISL) which is, in turn, a derivation of Lisp. DIPL preserves much of the syntax of CISL but less than half its vocabulary. Replacing CISL vocabulary that is of little use in an investigative or forensic environment, we have added over sixty new SIDs while retaining just under 130, many of which have been modified significantly. DIPL, in its current form, is well suited to characterizing a digital investigation from the perspective of process. It is organized to support both the End-to-End approach and the DFRWS framework.
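Because DIPL, like CISL, is written as s-expressions, a DIPL description can in principle be read by any s-expression parser. The following is a minimal sketch of such a reader, offered only to show the nested-list structure involved. It handles parentheses and bare atoms only (no quoted strings), and the identifiers in the sample expression are placeholders invented for this example; they are not taken from the DIPL or CISL vocabularies.

```python
def tokenize(text):
    """Split an s-expression into parentheses and bare atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one s-expression from a token list, consuming it left to right."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # discard the closing ')'
        return items
    if token == ")":
        raise SyntaxError("unexpected ')'")
    return token


# Placeholder identifiers only; not real DIPL or CISL vocabulary.
expr = parse(tokenize("(example-event (example-source fw1) (example-time t0))"))
```

Each parenthesized form becomes a Python list whose first element is the identifier and whose remaining elements are its arguments, which is the shape a later analysis or modelling step would traverse.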
7.3.3 Mathematical Modelling

DIPL is, likewise, well suited to formatting a digital investigation such that it can act as input for a mathematically sound model of the investigation. Research for this thesis focused upon the use of Coloured Petri Nets as the formalism applied to this purpose. However, other formal methods may work equally well. While we create Coloured Petri Nets manually, there are some tools (see Section 1.4.1.1.4) that allow us to automate the modelling process. As we see in Section 5.4.1, we can use standard Coloured Petri Net notation as a manual check on the model as checked by the Design/CPN Simulator tool (integrated within the Design/CPN package). The format for Coloured Petri Net graphs is standard practice. Also, as shown in Section 5.5.1, the successful use of Coloured Petri Nets in an actual investigation suggests that the process is applicable and appropriate. The use of formal modelling techniques allows the verification of an investigation as well as simulation of investigation direction and verification of hypotheses. Coloured Petri Nets were selected largely due to their graphical nature that makes them well suited to demonstration of complex processes in front of a lay audience such as a courtroom jury.
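For readers unfamiliar with the formalism, the token game at the heart of a Coloured Petri Net can be sketched in a few lines. The code below is a deliberately simplified illustration and is not the Design/CPN tool or the models built in this research: each transition has exactly one input place and one output place, a colour is just a string label, and a guard function stands in for arc expressions. Following the classroom vocabulary described in Section 6.5, the places model policy domains and the transition models a channel. All names and colours are hypothetical.

```python
from collections import Counter


class Net:
    """A much-reduced coloured Petri net: one input and one output place per transition."""

    def __init__(self, marking):
        # marking: place name -> iterable of token colours (a multiset per place)
        self.marking = {place: Counter(tokens) for place, tokens in marking.items()}
        self.transitions = {}

    def add_transition(self, name, source, target, guard):
        self.transitions[name] = (source, target, guard)

    def enabled(self, name):
        """Colours present in the source place that satisfy the transition's guard."""
        source, _, guard = self.transitions[name]
        return sorted(c for c, n in self.marking[source].items() if n > 0 and guard(c))

    def fire(self, name, colour):
        """Move one token of the given colour from the source to the target place."""
        if colour not in self.enabled(name):
            raise ValueError(f"{name} is not enabled for {colour!r}")
        source, target, _ = self.transitions[name]
        self.marking[source][colour] -= 1
        self.marking[target][colour] += 1


# Hypothetical model: a channel between two policy domains that passes only tcp/80.
net = Net({"internet": ["tcp/80", "udp/1434"], "dmz": []})
net.add_transition("firewall", "internet", "dmz", lambda colour: colour == "tcp/80")
before = net.enabled("firewall")
net.fire("firewall", "tcp/80")
```

In this toy model the udp/1434 token can never reach the dmz place, so a hypothesis that such traffic crossed this channel would be rejected by simulation; testing hypotheses about data flows against a model in this way is the kind of question the full formalism is used to answer.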
7.3.4 A Presentation Approach

The use of Coloured Petri Nets (CPN) offers a graphical representation of the causal chain of events. Cause and effect are easily represented in the CPN environment. Equally important, the EEDI process supports temporal (timeline) and cause-and-effect analysis. Presentation of both of these analysis methods using a graphical medium has been shown to be effective with lay juries because they are, as Smith and Bace suggest, visual [SB03]. The EEDI process is designed to be effective both in classroom and courtroom environments, especially when presenting complex events. The process has been used experimentally over the past two years in classrooms supporting different levels of audience knowledge and experience with success.
7.4 Future Directions

The research presented in this thesis is intended to be foundational. Because there has been little formal work done to date in developing digital forensics and digital investigation into a legitimate forensic science, it has been necessary to establish a foundation upon which digital investigators, practitioners and researchers can build a common body of knowledge. From that foundation issues of scientific credibility may be addressed and the practice of digital forensics will gain increased acceptability in the forensic community.
Two important areas of future study, worthy of more detailed mention, are repeatability and scalability of the investigative techniques that support the digital investigative process. Repeatability is discussed in Chapter 6 and comprises methods of verifying more rigorously the core hypotheses and supporting techniques that make up the digital investigative process generally and, specifically, the EEDI process described here. To date that has been difficult due to the paucity of applicable and relevant data upon which to base a comparison with current unstructured approaches. Scalability is an issue that relates explicitly to the application of some of the underlying support techniques, such as the DIPL, in extremely large and complex network environments. While it appears clear from field testing that the process works in such an environment, those same field tests suggested that an automated method of analyzing DIPL results is necessary for practical application of that technique.
Additionally and, perhaps, more importantly, it is from these foundations that research into the solution of difficult problems in digital forensics and investigation may develop. There are several other difficult problems that the digital investigative and forensic communities must address through research and practice:
End-to-end trace back of event sources
Formal verification of forensic tools and processes
Court acceptability of digital forensic tools and methods
Acceptability of practitioner training and certification
Admissibility and credibility of digital evidence collected, processed and presented as the result of an EEDI investigation
Not all of these are directly related to the work reported here. However, there exist opportunities for:
Further application of formal methods to modelling and characterizing digital investigations and forensic analysis of evidence.
Explicit application of formal mathematics to the DIPL for the purpose of verifying its mathematical and logical correctness; this includes development of an appropriate logic for formal proofs, as well as direct mapping of DIPL s-expressions to specific formal mathematical processes.
Application of the foundational processes reported here to other areas of digital forensics such as trace back and tool verification.
Application of the foundational processes reported here to information security, assurance and risk management.
Refinement of the DIPL and the EEDI process concurrently with the refinement of digital forensic and investigative process.
Refinement of the DFRWS framework to support the separation of digital forensics from digital investigation.
Examination of other paradigms, such as physical investigative procedures [CS03], for consistency with the process described in this thesis.
The approaches described in chapters 3, 4 and 5 and, especially, the results reported in part in Section 5.4.1 suggest that it is feasible to formalize and structure the digital forensic and investigation processes. Other researchers have begun examining the digital forensic process and some of the important challenges it faces. For example, Dr. Sarah Mocas has begun investigating the relationships between digital forensic theory and practice [SM03]. Dr. Thomas Daniels has developed a reference model for passive network origin identification [TED03]. We believe that the foundational work reported here will constitute a contribution to those early efforts and will help shape the future direction of digital forensic science.
Finally, although individual items of digital evidence and the techniques used to extract, preserve and manage them are fairly well understood, if not exactly mature, there exists fertile ground for investigating the possible relationships that may exist between those items of evidence in the context of a particular investigation. Such complex relationships may also exist between the multiple techniques, and those relationships may have an impact on the management, using those techniques, of digital evidence.
8. REFERENCES
8.1 Works Cited Within This Thesis

[ACD02] Doyle, Arthur Conan. The Hound of the Baskervilles. 1st ed. London: G. Newnes, Ltd., 1902. [BB94] Barendgregt, Henk, and Erik Barendsen. "Introduction to the Lambda Calculus.", 1994 [BK02] Brezinski, D., and T. Killalea. "Guidelines for Evidence Collection and Archiving - RFC 3227", 2002 [CAH78] Hoare, C. A. R. "Communicating Sequential Processes." Communications of the ACM 21.8 (1978): 666-77. [CAH85] Hoare, C. A. R. Communicating Sequential processes. 1 ed. Prentice Hall, 1985. [CC03] The Common Criteria web site. http://www.commoncriteria.org. Searched 11 October 2003. [CLP02] Conway, Christopher, Li, Chung-Hong and Pengelly, Megan. Pencil: A Petri Net Specification Language for Java. 8 October 2002 [CM99] McNamara, Carter. Basic of Developing Case Studies.
http://www.mapnp.org/library/evaluatn/casestdy.htm accessed 8 may 2004 [CS03] Carrier, Brian, and Eugene H. Spafford. "Getting Physical with the Digital Investigation Process." International Journal of Digital Evidence 2.2 (2003): 22 November 2003 <http://www.ijde.org/current_home.html>. [DC96] Diliberto, Ken, and Franklin Clark. Investigating Computer Crime. 1 ed. CRC Press, 1996. [DF01] Farmer, Dan. "Bring Out Your Dead." Dr. Dobb's Journal January (2001). [DF02] Forte, Dario. "Analyzing the Difficulties in Backtracing Onion Router Traffic." International Journal of Digital Evidence 1.3 (2002) on-line at http://www.ijde.org .
132
[DFRW01] Digital Forensics Research Workshop. "A Road Map for Digital Forensics Research 2001." Digital Forensics Research Workshop 6 November (2001) . [DFRW03] Digital Forensics Research Workshop. Day 1-DF-Science, Group A Session 1 (D1-A1), Digital Forensic Framework. Work notes for the 2003 Digital Forensics Research Workshop unpublished 6 August 2003: . [DN02] Nicol, David. "BGP Instability Forensics (unpublished presentation)." Dartmouth College, 2002 [DOJ02] United States Department of Justice, Computer Crime and Intellectual Property Section, Criminal Division. "Searching and Seizing Computers and Obtaining Electronic Evidence in Criminal Investigations.", 2002 [DVM93] Blackmun, J. "Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 US 579.", 1993 [FBI02] FBI Crime Scene Search. 6 June 02. Federal Bureau of Investigation. 10 December 2002 <http://www.fbi.gov/hq/lab/handbook/scene1.htm>. [FKP+99] Feiertag, Rich, et al. "A Common Intrusion Specification language
(CISL), 1999 [GM02]Giordano, Joseph, and Chester Maciag. "Cyber Forensics: A Military Operations Perspective." International Journal of Digital Evidence 1.2 (2002) on-line at http://www.ijde.org. [GMU03] Lecture notes from George mason University. An Introduction to Petri Nets. http://viking.gmu.edu/http/syst511/vg511/AppC.html. [GP02] Palmer, Gary. "Forensic Analysis in the Digital World." International Journal of Digital Evidence 1.1 (2002) on-line at http://www.ijde.org . [GP04] Gladyshev, Pavel, Ahmed Patel. Finite State Machine Approach to Digital Even Reconstruction, Digital Investigation, volume 1, number 2, pp 130149. Elsevier, Ltd. [HH83] Hajnal, Andras, Hamburger, Peter. Set Theory. English edition, Cambridge University Press, 1999 [HP01] The Honeynet Project. Know Your Enemy; Revealing the Security Tools,
133
Tactics and Motives of the BlackHat Community. 1 ed. Addison Wesley, 2001. [IK94] Krsul, Ivan. Authorship Analysis: Identifying the Author of a Program. Thesis. Purdue U, 1994. West Lafayette: privately printed. [ISV95] Icove, David, Karl Seger, and William VonStorch. Computer Crime - A Crimefighter's Handbook. 1 ed. O'Reilly & Associates, Inc., 1995. [IOC00] IOCE Best Practice Guide v 1.0, International Organization for Computer Evidence, May 2000. [IOC02] Guidelines for Best Practice in the Forensic Examination of Digital Technology, International Organization for Computer Evidence, 2002. [JD99] Doyle, Jon. "Some Representational Limitations of the Common Intrusion Specification Language." Laboratory for Computer Science, Massachusetts Institute of Technology. 1999 [JN03] Forensic Science An Introduction to Scientific and Investigative Techniques. 1 ed. Boca Raton: CRC Press, 2003. James, Stuart H., Nordby, Jon J. editors [JV23] Vanorsdel, J. Frye v United States No. 3968 D.C. Circuit Court of Appeals 1923 [KJ94] Jensen, Kurt. "An Introduction to the Theoretical Aspects of Coloured Petri Nets." A Decade of Concurrency, Lecture Notes in Computer Science 803 p230-272: Springer-Verlag, 1994. [KJ96] Jensen, Kurt. Coloured Petri Nets Lecture Notes, Computer Science Department, University of Aarhaus, Denmark. 1996. [KJ97] Jensen, Kurt. Coloured Petri Nets - Basic Concepts, Analysis Methods and Practical Use Volume 2. 1 ed. Vol. 2. Berlin Heidelberg: Springer-Verlag, 1997. [KJ97-1] Jensen, Kurt. Coloured Petri Nets - Basic Concepts, Analysis Methods and Practical Use Volume 1. 1 ed. Vol. 1. Berlin Heidelberg: Springer-Verlag, 1997. [KR95] Rosenblatt, Kenneth. High Technology Crime. 1 ed. San Jose: KSK
134
Publishing, 1995. [KVC99] Breyer, J. Kuomo Tire Company Ltd., et al v. Patrick Carmichael, etc. et al on Writ of Certiorari to the United States Court of Appeals for the Eleventh Circuit, 23 March 1999 [MRS04] Rogers, Marcus K., and Kate Seigfried. The Future of Computer
Forensics: A Needs Analysis Survey. Computers & Security 2004, volume 23, pp 12-16. Elsevier Advanced Technology. 2004. [MSC93] Meta Software Corporation. "Design/CPN Tutorial for X-Windows Version 2.0.", 1993 [NCF+01] Northcutt, Stephen, et al. Intrusion Signatures and Analysis. 1 ed. Indianapolis: New Riders, 2001. [NN00] Northcutt, Stephen, and Judy Novak. Network Intrusion Detection, An Analyst's Handbook. 2 ed. Indianapolis: New Riders, 2000. [PB03] Buchholz, Peter. Petri Nets. Lecture Notes CL Dipl. Informatik (fakultativ), Modellierung und Simulation, Institute for Applied Computer Science. 26 may 2003. [PH03] Pearce, Simon, and Simon Halsall. "The SQLSlammer Worm.", Unpublished Paper, 2003, QinetiQ [PP00] Phillips, Estelle M. and Derek S. Pugh. How to get a PhD. 3rd ed.
Buckingham: Open University Press, 2000. [PRS02]* Stephenson, Peter. "Structured Investigation of Digital Incidents in Complex Computing Environments." Information Systems Security, Auerbach Publications Volume 12, Number 2 (2003) pp 29-38 . [PRS03] Stephenson, Peter R. "Digital Investigation Process Language (DIPL) Language Definition Document." 2003. [PS02] Stephenson, Peter R. "Collecting Evidence of a Computer Crime." Computer Fraud and Security, Elsevier Advanced Technology November (2002) pp1719 . [PS02A] Stephenson, Peter R. "The Forensic Investigation Steps." Computer Fraud & Security, Elsevier Advanced Technology October (2002) pp17-19 .
135
[PS03] Stephenson, Peter R. Continuing the Post Mortem. Computer Fraud & Security, Elsevier Advanced Technology September (2003) pp 17-20.
[PS03A]* Stephenson, Peter R. A Comprehensive Approach to Digital Investigation, Information Security Technical Report, Elsevier Advanced Technology Vol.8 No.2 2004.
[PS04]* Stephenson, Peter R. Forensic Analysis of Risks in Enterprise Systems. Information Systems Security, Auerbach Publications May 2004 (scheduled for publication).
[PS03B]* Stephenson, Peter R. Modelling of Post Incident Root Cause Analysis, International Journal of Digital Evidence Fall 2003 Volume 2 Issue 2.
[PS04B] Stephenson, Peter R. A Technique for Root Cause Analysis of Complex Digital Incidents, Institutional Investors Guide to Establishing Corporate Accountability 2004 (scheduled for publication).
[QTIM03] QinetiQ Trusted Information Management. Victim Corporation SQL Slammer Incident Report Technical Analysis and Report (Unpublished confidential document). 2003.
[RCG02] Reith, Mark, Clint Carr, and Cregg Gunsch. "An Examination of Digital Forensic Models." International Journal of Digital Evidence 1.3 (2002) online at http://www.ijde.org.
[RR97] SEXP---(S-expressions). Ed. Ronald Rivest. 3 May 1997. MIT. 16 April 2003 <http://theory.lcs.mit.edu/~rivest/sexp.html>.
[RS01] Ryan, Peter, and Steve Schneider. Modelling and Analysis of Security Protocols. 1 ed. n.p.: Addison-Wesley, 2001.
[SB03] Smith, Fred Chris, and Rebecca Gurley Bace. A Guide to Forensic Testimony. 1 ed. Boston: Addison Wesley, 2003.
[SM03] Mocas, Sarah. Building Theoretical Underpinnings for Digital Forensics. 2003. Digital Forensics Research Workshop. 6 August 2003 <http://www.dfrws.org/>.
[SO98] Sommer, Peter. Intrusion Detection Systems as Evidence. 1998. Recent Advances In Intrusion Detection RAID Symposium, 1998.
[SS04] Smith, Fred Chris. Phone interview with Peter Stephenson on the topic of admissibility, 4 October 2004.
[SYM03] W32.SQLExp.Worm. Ed. Symantec. February 04, 2003 04:26:32 PM. 18 Apr. 2003 <http://www.symantec.com/avcenter/venc/data/w32.sqlexp.worm.html>.
[TA02] Akin, Thomas. Cisco Router Forensics. 2002. Blackhat Briefings. 3 Dec. 2002 <http://www.blackhat.com/presentations/bh-usa-02/bh-us-02-akin-cisco/bh-us-02-akin-cisco.ppt>.
[TD01] Dunigan, Tom. "Backtracing Spoofed Packets." Network Research Group, Oak Ridge National Laboratory, 2001.
[TED03] Daniels, Thomas E. A Functional Model of Passive Network Origin Identification. 2003. Digital Forensics Research Workshop 6 August 2003. <http://www.dfrws.org>.
[TM89] Murata, Tadao. "Petri Nets: Properties, Analysis and Applications." Proceedings of the IEEE 77.4 (1989).
[TT02] Talleur, Thomas. "Digital Evidence: The Moral Challenge." International Journal of Digital Evidence 1.1 (2002).
[UA03] Assmann, Uwe. Petri nets for Dynamic Semantics of Components. Lecture Slides. Research Center for Integrational Software Engineering. 8 May 2003.
[TWG01] The Technical Working Group for Electronic Crime Scene Investigation, U.S. Department of Justice. Electronic Crime Scene Investigation: A Guide for First Responders. U.S. Department of Justice, July, 2001. <http://www.ncjrs.org/pdffiles1/nij/187736.pdf> accessed 8 May 2004.
8.2 Additional Publications by the Author

The following references comprise publications of work by Peter R. Stephenson relating to the topic of this thesis, beginning at the start of the research (May 2000) and continuing through 2 April 2004, that are not included in the listing in Section 8.1 above. This listing is in no particular order. An asterisk (*) and bold typeface indicate that the paper or talk was peer reviewed. Listings of Stephenson's work referenced in Section 8.1 above that were peer reviewed are also indicated in bold type with an asterisk (*).
(Stephenson, 2003-1)* A Structured Approach to Incident Post Mortems, Information Systems Security, Auerbach Publications September/October 2003 pp 50-56
(Stephenson, 2002-1) End-to-End Digital Forensics, Computer Fraud & Security, Elsevier Advanced Technology September 2002 pp 17-19
(Stephenson, 2002-2) The Forensic Investigation Steps, Computer Fraud & Security, Elsevier Advanced Technology October 2002 pp 17-19
(Stephenson, 2003-1) Normalization and Deconfliction, Computer Fraud & Security, Elsevier Advanced Technology January 2003
(Stephenson, 2003-2) Data Analysis First Steps, Computer Fraud & Security, Elsevier Advanced Technology February 2003 pp 18-19
(Stephenson, 2003-3) Using Evidence Effectively, Computer Fraud & Security, Elsevier Advanced Technology March 2003 pp 17-19
(Stephenson, 2003-4) Using a Formalized Approach to Digital Investigation, Computer Fraud & Security, Elsevier Advanced Technology July 2003 pp 17-20
(Stephenson, 2003-5) Applying DIPL to an Incident Post Mortem, Computer Fraud & Security, Elsevier Advanced Technology August 2003 pp 17-20
(Stephenson, 2003-6) Completing the Post Mortem Investigation, Computer Fraud & Security, Elsevier Advanced Technology October 2003 pp 17-20
(Stephenson, 2003-7) Modelling the Post Mortem, Computer Fraud & Security, Elsevier Advanced Technology November 2003 pp 17-20
(Stephenson, 2003-8) Applying Forensic Techniques to Information System Risk Management First Steps, Computer Fraud & Security, Elsevier Advanced Technology December 2003 pp 17-19
(Stephenson, 2004-1) The Right Tools for the Job, Digital Investigation, Elsevier Advanced Technology Vol.1 No.1 2004
(Stephenson, 2003-9) Conducting an Incident Post Mortem, Invited talk at Computer Security Institute 30th Annual Conference 4 November 2003
(Stephenson, 2003-10) DIPL, The Digital Investigation Process Language, Invited talk at Computer Security Institute 30th Annual Conference 4 November 2003
(Stephenson, 2004-2) Digital Post Mortems and Incident Response, Invited talk at SCInfoSecurity News Executive Roundtable, January 2004.
(Stephenson, 2003-11) Introduction to EEDI, Invited seminar at Computer Security Institute NetSec 2003 6-7 November 2003
(Stephenson, 2002-4, 2003-12, 2004-3) Intro to End-to-End Digital Forensic Analysis, Invited Seminar at eSecureIT conferences 2002, 2003, 2004, Norwich University.
(Stephenson, 2002-5) Intro to End-to-End Digital Forensic Analysis, Invited seminar, American Express Company, March 2002
APPENDIX 1 DIPL SEMANTIC IDENTIFIER (SID) LISTING

The DIPL SID listing given here is a modification and extension of the CISL SID listing from the original CISL document [FKP+99].
Verb SIDs
File Verb SIDs
Name: Copy
Class: verb Syntax: Copy Description: A file, or set of files, is copied from one location to another.
May Contain:
Outcome: The exit status of the copy. When: This contains the time at which the copy was completed. Observer: The entity which observed and/or recorded this occurrence. Initiator: The user or process responsible for copying the files. (Subject SID. Not pluralizable.) FileSource: The source of the copy. This may specify either a file or a directory; if a directory is indicated, then all files in the directory are copied. The host of the source files should be given here. When copying multiple files (which may be from different directories), use multiple FileSource roles. (Direct object SID. Pluralizable.) FileDestination: The destination of the copy. This may specify either a single directory or a single file. If a directory, then all the files in the FileSource(s) are copied to the named directory. If a file, then there may only be one FileSource, and it must name a single file. (Otherwise, the sentence is malformed.) The host of the destination should be specified here. (Indirect object SID. Not pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. Subject to interpretation. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
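The well-formedness rules stated for Copy lend themselves to a mechanical check. The following Python sketch is illustrative only: the dictionary layout, the is_directory flag and the function name check_copy are assumptions made for this example and are not part of the DIPL definition.

```python
# Illustrative sketch only: the data layout is an assumption, not DIPL syntax.
# Each role name maps to a list of values supplied for that role.

def check_copy(roles):
    """Return a list of problems found in the roles of a Copy sentence."""
    problems = []
    sources = roles.get("FileSource", [])
    destinations = roles.get("FileDestination", [])
    initiators = roles.get("Initiator", [])

    # Initiator and FileDestination are not pluralizable.
    if len(initiators) > 1:
        problems.append("Initiator is not pluralizable")
    if len(destinations) > 1:
        problems.append("FileDestination is not pluralizable")

    # A file destination allows exactly one FileSource naming a single file.
    if len(destinations) == 1 and not destinations[0]["is_directory"]:
        if len(sources) != 1 or sources[0]["is_directory"]:
            problems.append("file destination requires one single-file FileSource")
    return problems


# Hypothetical sentences: two source files copied into a directory (valid),
# and the same two files copied onto a single file (malformed).
ok = {
    "Initiator": ["user-a"],
    "FileSource": [{"path": "/tmp/a", "is_directory": False},
                   {"path": "/tmp/b", "is_directory": False}],
    "FileDestination": [{"path": "/backup", "is_directory": True}],
}
bad = dict(ok, FileDestination=[{"path": "/backup/a", "is_directory": False}])

print(check_copy(ok))   # []
print(check_copy(bad))  # one problem: the malformed case
```

The same check would apply to Move, whose FileSource and FileDestination roles carry identical constraints.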
Name: Move
Class: verb Syntax: Move Description: A file, or set of files, is moved from one location to another. May Contain: Outcome: The exit status of the move. When: This contains the time at which the move was completed. Observer: The entity which observed and/or recorded this occurrence. Initiator: The user or process responsible for moving the files. (Subject SID. Not pluralizable.) FileSource: The source of the move. This may specify either a file or a directory; if a directory is indicated, then all files in the directory are moved. When moving multiple files (which may be from different directories), use multiple FileSource roles. The host of the source file(s) should be given here. (Direct object SID. Pluralizable.) FileDestination: The destination of the move. This may specify either a single directory or a single file. If a directory, then all the files in the FileSource(s) are moved to the named directory. If a file, then there may only be one FileSource, and it must name a single file. (Otherwise, the sentence is malformed.) The host of the destination should be given here. (Indirect object SID. Not pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. Subject to
interpretation. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Delete
Class: verb Syntax: Delete Description: A file, or set of files, is deleted. May Contain: Outcome: The exit status of the delete. When: This contains the time at which the delete was completed. Observer: The entity which observed and/or recorded this occurrence. Initiator: The user or process responsible for deleting the files. (Subject SID. Not pluralizable.) FileSource: The source of the deleted files. This may specify either a file or a directory; if a directory is indicated, then all files in the directory are deleted. When deleting multiple files (which may be from different directories), use multiple FileSource roles. The host of the deleted files should be given here. (Direct object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. Subject to interpretation. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
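CISL, from which this listing is derived, expresses sentences as nested S-expressions [RR97]. Assuming DIPL sentences keep that nested form, a Delete sentence might be assembled as in the Python sketch below. The tuple representation, the helper name to_sexp, the quoting of leaf data, and the atom names UserName, HostName and FileName are assumptions for illustration; the authoritative syntax is given in the language definition [PRS03].

```python
# Illustrative sketch only: renders nested (SID, child, ...) tuples as
# S-expression text. The quoting convention for leaf data is an assumption.

def to_sexp(node):
    """Render a (SID, child, ...) tuple, or a leaf string, as text."""
    if isinstance(node, str):
        return "'" + node + "'"
    sid, *children = node
    return "(" + " ".join([sid] + [to_sexp(child) for child in children]) + ")"


# Hypothetical Delete sentence using the roles listed above.
sentence = (
    "Delete",
    ("Initiator", ("UserName", "jdoe")),
    ("FileSource",
     ("HostName", "host1.example.com"),
     ("FileName", "/var/log/messages")),
    ("Outcome", ("ReturnCode", "0")),
)

print(to_sexp(sentence))
```

Keeping each role as its own sub-expression mirrors the listing: the verb SID is the head of the sentence and every role it may contain appears as a child.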
Process Verb SIDs

Name: Execute
Class: verb Syntax: Execute
Description: A program is executed. May Contain: Outcome: The success or failure of the action of executing the code. When: This contains the time at which the program was initiated. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity executing the program. (Subject SID. Not pluralizable.) Process: The program being executed. The name of the program may be indicated either by ProgramName or by FileName. A ProcessID may be indicated here. The host on which the process is running should also be given here. (Direct object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. Subject to interpretation. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Suspend
Class: verb Syntax: Suspend Description: A program is suspended. May Contain: Outcome: The success or failure of the action of suspending the program. When: This contains the time at which the program was suspended. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity suspending the program. (Subject SID. Not pluralizable.)
Process: The program or process being suspended, since strictly speaking the Initiator can only suspend specific instances of a program in execution. This means that ProcessID should be given. (Direct object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Resume
Class: verb Syntax: Resume Description: A program in suspension is resumed. May Contain: Outcome: The success or failure of the action of resuming the program. When: This contains the time at which the program was resumed. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity resuming the program. (Subject SID. Not pluralizable.) Process: The program or process being resumed, since strictly speaking the Initiator can only resume specific instances of a program in suspension. This means that ProcessID should be given. (Direct object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs:
Comment World Time BeginTime EndTime Authenticate When
Name: Terminate
Class: verb Syntax: Terminate Description: A program in execution (or in suspension) is terminated. May Contain: Outcome: The success or failure of the action of terminating the program. When: This contains the time at which the program was terminated. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity terminating the program. (Subject SID. Not pluralizable.) Process: The program or process being terminated, since strictly speaking the Initiator can only terminate specific instances of a program in execution. This means that ProcessID should be given. (Direct object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
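The four process verbs imply an ordering: a program must be executing before it can be suspended, suspended before it can be resumed, and either executing or suspended before it can be terminated. A minimal Python sketch of that ordering follows. The state names and the function name replay are assumptions introduced for the example; DIPL itself does not define a state machine.

```python
# Illustrative sketch only: state names are assumptions for this example.
TRANSITIONS = {
    ("not running", "Execute"): "running",
    ("running", "Suspend"): "suspended",
    ("suspended", "Resume"): "running",
    ("running", "Terminate"): "terminated",
    ("suspended", "Terminate"): "terminated",
}


def replay(verbs, state="not running"):
    """Apply process verbs in order; raise ValueError on an impossible one."""
    for verb in verbs:
        key = (state, verb)
        if key not in TRANSITIONS:
            raise ValueError(f"{verb} is not valid while {state}")
        state = TRANSITIONS[key]
    return state


print(replay(["Execute", "Suspend", "Resume", "Terminate"]))  # terminated
```

A check of this kind could flag a recorded sequence for a single ProcessID, such as a Resume with no preceding Suspend, as a gap in the evidence rather than as an error in the language.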
Forensic Identification Verb SIDs

Name: ResolveSignature
Class: verb Syntax: ResolveSignature Description: An event is forensically identified by resolving an
intrusion detection system or log signature. May Contain: AttackSpecifics: High level information about an attack Tool: The tool used to create the signature Data: The data containing the signature Target: The target of the attack Initiator: The entity thought to be responsible for the attack UDPSourcePort UDPDestinationPort TCPSourcePort TCPDestinationPort ICMPType EtherAddress Observer: The entity which observed and/or recorded this occurrence. Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: DetectProfile
Class: verb Syntax: DetectProfile Description: An event is forensically identified by detecting a match with a known event profile. This usually refers to an event or attack scenario as opposed to a discrete event. May Contain: AttackSpecifics: High level information about an attack Tool: The tool used to create the profile Data: The data containing the profile
Target: The target of the attack Initiator: The entity thought to be responsible for the attack UDPSourcePort UDPDestinationPort TCPSourcePort TCPDestinationPort ICMPType EtherAddress Observer: The entity which observed and/or recorded this occurrence. Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: DetectAnomaly
Class: verb Syntax: DetectAnomaly Description: An event is forensically identified by detecting an anomalous departure from normal activity. May Contain: AttackSpecifics: High level information about an attack OldState: The previous state of the target prior to the anomalous observation CurrentState: The state of the target after the anomalous observation, i.e., the observed anomaly. Target: The target of the attack Initiator: The entity thought to be responsible for the attack UDPSourcePort UDPDestinationPort TCPSourcePort
TCPDestinationPort ICMPType EtherAddress Observer: The entity which observed and/or recorded this occurrence. Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: MonitorSystem
Class: verb Syntax: MonitorSystem Description: An event is forensically identified by detecting the output of a monitoring device such as an Intrusion Detection System. May Contain: AttackSpecifics: High level information about an attack OldState: The previous state of the target prior to the anomalous observation CurrentState: The state of the target after the anomalous observation, i.e., the observed anomaly. Target: The target of the attack Initiator: The entity thought to be responsible for the attack UDPSourcePort UDPDestinationPort TCPSourcePort TCPDestinationPort ICMPType EtherAddress Observer: The type of monitoring entity which observed and/or recorded this occurrence.
Tool: The specific monitoring tool. Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Forensic Preservation Verb SIDs

Name: ImageUsing
Class: verb Syntax: ImageUsing Description: Take a bitstream image of computer media using a particular imaging product May Contain: Initiator: The entity that performed the imaging process. Machine: The computer upon which the image is performed VolumeID: The identifier for the volume imaged DiskID: The identifier for the disk imaged MediaID: The identifier for the media other than a computer disk imaged Tool: The tool used to create the image, ApprovedSoftware, ApprovedHardware, ApprovedMethod LosslessCompression: The lossless compression technique used to compress the image if used. Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs:
Comment World Time BeginTime EndTime Authenticate When
Name: SynchronizeTime
Class: verb Syntax: SynchronizeTime Description: Indication that the times of various evidentiary devices (or devices containing evidence) are synchronized or normalized, either to each other or to a common time source. May Contain: Outcome: A ReturnCode of 0 for times synchronized, 1 for times not synchronized When: This contains the time that synchronization was verified Initiator: The entity that performed the synchronization or normalization. Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
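The Outcome convention for SynchronizeTime (a ReturnCode of 0 when times are synchronized, 1 when they are not) can be illustrated with a short Python sketch. The one-second tolerance, the device names, the sample timestamps and the function name synchronize_outcome are all assumptions for the example; the listing does not prescribe how agreement between clocks is to be judged.

```python
from datetime import datetime, timedelta, timezone


def synchronize_outcome(device_times, reference, tolerance=timedelta(seconds=1)):
    """Return 0 if every device clock is within tolerance of reference, else 1."""
    for reading in device_times.values():
        if abs(reading - reference) > tolerance:
            return 1
    return 0


# Hypothetical clock readings taken against a common time source.
reference = datetime(2004, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
readings = {
    "firewall": reference + timedelta(milliseconds=200),
    "ids": reference - timedelta(milliseconds=400),
}
print(synchronize_outcome(readings, reference))  # 0

# A router clock that has drifted by three minutes fails the check.
readings["router"] = reference + timedelta(minutes=3)
print(synchronize_outcome(readings, reference))  # 1
```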
Forensic Collection Verb SIDs

Name: CollectData
Class: verb Syntax: CollectData Description: Perform a collection of data suspected of containing evidence by any of several means including, but not limited to, imaging of computer media, log extraction, or packet interception. May Contain:
Initiator: The person performing the collection Tool: The tool used to perform the collection Target: The device containing the data to be collected Data: The nature of the data to be collected Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: SampleData
Class: verb Syntax: SampleData Description: Perform sampling of a large dataset such as a large RAID or oversized medium. May Contain: Initiator: The person performing the sampling Tool: The tool used to perform the sampling Target: The device containing the data to be sampled Data: The nature of the data to be sampled Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: ReduceData
Class: verb Syntax: ReduceData Description: Normalize and deconflict forensic data from multiple sources, extract human-readable data from binary data (as with certain types of
firewall logs) May Contain: Initiator: The person performing the normalization and deconfliction Tool: The tools used to perform the deconfliction, normalization or other data extraction FileSource: The set of files containing pre-normalized/deconflicted data FileDestination: The post-processed file containing normalized/deconflicted data Data: The nature of the data to be normalized/deconflicted Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
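As an illustration of what ReduceData records, the Python sketch below normalizes and deconflicts two log records. It assumes that normalization means expressing records from different sources in one common form (here, UTC timestamps and lower-case event names) and that deconfliction means collapsing multiple reports of the same event into one. The record layout, the event name and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def normalize(record):
    """Express the timestamp in UTC and the event name in lower case."""
    return {
        "time": record["time"].astimezone(timezone.utc),
        "event": record["event"].lower(),
        "source": record["source"],
    }


def deconflict(records):
    """Keep one entry per (time, event), noting every source that reported it."""
    merged = {}
    for rec in map(normalize, records):
        key = (rec["time"], rec["event"])
        entry = merged.setdefault(
            key, {"time": rec["time"], "event": rec["event"], "sources": []})
        entry["sources"].append(rec["source"])
    return sorted(merged.values(), key=lambda entry: entry["time"])


# Hypothetical input: the same event logged by two devices in different zones.
est = timezone(timedelta(hours=-5))
raw = [
    {"time": datetime(2004, 1, 1, 7, 0, tzinfo=est),
     "event": "PORT-SCAN", "source": "firewall"},
    {"time": datetime(2004, 1, 1, 12, 0, tzinfo=timezone.utc),
     "event": "port-scan", "source": "ids"},
]
for row in deconflict(raw):
    print(row["time"].isoformat(), row["event"], row["sources"])
```

In DIPL terms the raw records correspond to FileSource and the merged output to FileDestination; exact timestamp equality is used here only to keep the sketch short.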
Name: RecoverData
Class: verb Syntax: RecoverData Description: The process by which data is extracted from various forensic sources such as logs, images, etc. May Contain: Initiator: The entity that extracted the data FileSource: The file from which the data is to be extracted FileDestination: The file in which the extracted data is preserved Data: The data to be recovered Tool: The tool(s) used, if any, to extract the data Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Forensic Examination Verb SIDs

Name: FilterData
Class: verb Syntax: FilterData Description: The process by which data sets are passed through filters to segregate desired or potentially desired data from the rest of the data set May Contain: Initiator: The entity that filtered the data FileSource: The file containing the data set from which the desired data is to be filtered FileDestination: The file in which the desired (filtered) data is preserved Data: The data to be filtered Tool: The tool(s) used, if any, to perform the filtering Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: MatchPattern
Class: verb Syntax: MatchPattern Description: The process of matching data to some predetermined pattern or signature such as a hash, virus signature or other attack pattern May Contain:
Initiator: The entity that performed the pattern matching FileSource: The file containing the data set to be checked for a pattern match FileDestination: The file in which the matched data is preserved Data: The data whose pattern is being matched Hash: The hash to be used as a pattern VirusSignature: The virus signature to be used as a pattern AttackSpecifics: The attack used as a signature or pattern Tool: The tool(s) used, if any, to perform the pattern matching Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
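Where the pattern is a hash, the activity that MatchPattern records amounts to comparing a digest of the data against a set of known digests. A minimal Python sketch follows; the choice of SHA-256, the sample data and the function name match_hash are assumptions for the example, since the listing does not fix a hash algorithm.

```python
import hashlib


def match_hash(data: bytes, known_hashes: set) -> bool:
    """Return True if the SHA-256 digest of data appears in known_hashes."""
    return hashlib.sha256(data).hexdigest() in known_hashes


# Hypothetical pattern set built from one known sample.
known = {hashlib.sha256(b"suspicious payload").hexdigest()}

print(match_hash(b"suspicious payload", known))  # True
print(match_hash(b"benign content", known))      # False
```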
Name: DiscoverData
Class: verb Syntax: DiscoverData Description: The process of discovering hidden data such as deleted, encrypted or otherwise hidden information within a medium or file under examination. May Contain: Initiator: The entity that performed the data extraction FileSource: The file containing the hidden data Data: Description of the hidden data Tool: The tool(s) used, if any, to perform the discovery Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs:
Comment World Time BeginTime EndTime Authenticate When
Name: ExtractData
Class: verb Syntax: ExtractData Description: The process of extracting hidden data such as deleted, encrypted or otherwise hidden information within a medium or file under examination. May Contain: Initiator: The entity that performed the data extraction FileSource: The file containing the hidden data FileDestination: The file in which the extracted data is preserved Data: Description of the hidden data Tool: The tool(s) used, if any, to perform the extraction Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Forensic Analysis Verb SIDs

Name: ConstructTimeline
Class: verb Syntax: ConstructTimeline Description: The process of constructing a timeline of events from primary and secondary evidence May Contain: Initiator: The entity that constructed the timeline
Link: A link in the chain of evidence Tool: The tool(s) used, if any, to perform the discovery Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Investigation Identification Verb SIDs

Name: DetectEvent
Class: verb Syntax: DetectEvent Description: The reporting of a suspected event based upon direct evidence that the event occurred. May Contain: Observer: The entity which observed and/or recorded this process. Initiator: The entity that detected the event and reported it to the observer Link: A link in the chain of evidence Tool: The tool(s) used, if any, to perform the discovery Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: ReceiveComplaint
Class: verb
Syntax: ReceiveComplaint Description: The reporting of a suspected event based upon a complaint from a third party that the event occurred. May Contain: Receiver: The entity that received the complaint. Initiator: The entity that reported the event (i.e., issued the complaint) to the receiver. Link: A link in the chain of evidence Tool: The tool(s) used, if any, to perform the discovery Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Investigation Preservation Verb SIDs

Name: ManageCase
Class: verb Syntax: ManageCase Description: The recording of details such as Custody, EvidenceIdentifier, etc. used to track the investigative process. May Contain: Initiator: The entity that managed the case Link: A link in the chain of evidence Tool: The tool(s) used, if any, to manage the case Multiplier: Means that this action was carried out the indicated number of times. Extending Detail Using Atom SIDs: Any appropriate atom SID may be used to extend the detail present in ManageCase. --- The following are referent SIDs:
ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Investigation Collection Verb SIDs

Name: TraceAuthority
Class: verb Syntax: TraceAuthority Description: The inclusion of legal authority to proceed including laws, policies, regulations, etc. May Contain: Initiator: The entity that created the authority Observer: The entity that verified the authority Citation: The court case, law or regulation that contains the authority Policy: The organizational policy that contains the authority Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: ConductInterview
Class: verb Syntax: ConductInterview Description: The conduct of an interview with a suspect, witness or expert advisor. The notes from the interview may be recorded as comments or, where appropriate, as other SIDs. May Contain:
Initiator: The entity that is being interviewed Observer: The entity that conducted the interview Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Investigation Presentation Verb SIDs

Name: Clarify
Class: verb Syntax: Clarify Description: Add detail to a report or presentation to clarify particular investigative points May Contain: Initiator: The entity that is adding the clarification Data: Any additional information in the form of any atom SID allowed by the Data role SID Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: UseStatistics
Class: verb Syntax: UseStatistics
Description: Statistical interpretations of forensic data are used to present investigation results or draw conclusions May Contain: Initiator: The entity that is being analyzed statistically Observer: The entity that performed the analysis Tool: The tool, if any, used to perform the statistical interpretation Data: Information, in the form of any atom SID allowed by the Data role SID, being analyzed or interpreted statistically. Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: CreateReport
Class: verb Syntax: CreateReport Description: Create an investigation report. May Contain: Initiator: The entity that is creating the report Receiver: The entity that is to receive the report Tool: The tool, if any, used to create the report Data: Information, in the form of any atom SID allowed by the Data role SID, being included in the report. Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Host Status Verb SIDs
Name: Reboot
Class: verb Syntax: Reboot Description: A machine is rebooted, either deliberately or accidentally, either legitimately or illegitimately. May Contain: Outcome: This indicates the success or failure of the reboot. When: Indicates when the machine was rebooted. Machine: This contains the host that was rebooted. (Direct object SID. Pluralizable.) Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity responsible for performing the reboot. (Subject SID. Not pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: ShutDown
Class: verb Syntax: ShutDown Description: A machine is powered down, either deliberately or accidentally, either legitimately or illegitimately. May Contain:
Outcome: This indicates the success or failure of the shutdown. When: This indicates when the machine was shut down. Machine: This contains the host that was shut down. (Direct object SID. Pluralizable.) Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity responsible for performing the shutdown. (Subject SID. Not pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Boot
Class: verb Syntax: Boot Description: A machine is powered up. May Contain: Outcome: This indicates the success or failure of the bootup. When: This indicates when the machine was booted up. Machine: This contains the host that was booted up. (Direct object SID. Pluralizable.) Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity responsible for performing the bootup. (Subject SID. Not pluralizable.) Multiplier: Means that this action was carried out the indicated
number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
TCP Connection Verb SIDs

Name: TCPConnect
Class: verb Syntax: TCPConnect Description: This verb indicates that, in the belief of the observer, the Initiator began a TCP connection to the Receiver (as defined in RFC 793). May Contain: Outcome: Contains the status of the connection which should usually be specified with TCPConnectionStatus. If a Time atom appears in the When role, the status is as of that time. Otherwise, the status reflects the most successful status the connection achieved before being ended. When: Contains the time. Appearance of BeginTime indicates the time of the first packet that initiated the connection. Appearance of EndTime indicates the time of a packet which was believed to end the connection. Appearance of Time implies any time at which the connection was believed to be in existence. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity opening the session--that is, the entity sending the packet with initial Syn. Source addresses, ports, etc. go in here (even if it is possible that they are forged). (Subject SID. Pluralizable.) Receiver: The entity serving the session request. The actual port serving the session will be given here, as well as the standard port which
specifies the protocol believed to be used on the connection. (Indirect object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
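A minimal sketch of how an encoder might assemble a TCPConnect sentence from a connection record is given here in Python. The helper names (sexp, tcp_connect), the timestamp format, and the choice of IPV4Address and TCPPort as the atoms placed under Receiver are assumptions made for illustration; they are not taken from the SID definitions.

```python
def sexp(head, *children):
    """Render a SID and its children as one S-expression string."""
    parts = [head] + [str(c) for c in children]
    return "(" + " ".join(parts) + ")"


def tcp_connect(src_ip, src_port, dst_ip, dst_port, begin_time):
    # Initiator carries the source address and port, even if possibly forged.
    initiator = sexp("Initiator",
                     sexp("IPV4Address", src_ip),
                     sexp("TCPPort", src_port))
    # Receiver carries the address and the port actually serving the session.
    receiver = sexp("Receiver",
                    sexp("IPV4Address", dst_ip),
                    sexp("TCPPort", dst_port))
    # BeginTime marks the first packet that initiated the connection.
    when = sexp("When", sexp("BeginTime", begin_time))
    return sexp("TCPConnect", when, initiator, receiver)


# Hypothetical addresses and timestamp, for illustration only.
sentence = tcp_connect("10.0.0.5", 40312, "10.0.0.9", 80,
                       "2004-10-01T12:00:00Z")
print(sentence)
```

An Outcome role carrying a TCPConnectionStatus could be added in the same way once the status of the connection is known.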
HTTP Verb SIDs
Name: HTTPPost
Class: verb Syntax: HTTPPost Description: Someone sends an HTTP POST request (e.g., as defined in RFC 2068). May Contain: Outcome: This indicates the success or failure of the POST. The HTTPStatusCode atom SID is designed specifically for this purpose, though other less specific SIDs may be used if the status code is unavailable. When: The time at which or during which the POST took place. Observer: The entity making the report. Message: Additional information about the POST message. The HTTP descriptor SIDs are designed for this purpose. Initiator: The client performing the POST. (Subject SID. Pluralizable.) Receiver: The entity receiving the POST request. (Indirect object SID. Pluralizable.)
Name: HTTPGet
Class: verb Syntax: HTTPGet Description: Someone sends an HTTP GET request (e.g., as defined in RFC 2068). May Contain: Outcome: This indicates the success or failure of the GET. The HTTPStatusCode atom SID is designed specifically for this purpose, though other less specific SIDs may be used if the status code is unavailable. When: The time at which or during which the GET took place. Observer: The entity making the report. Message: Additional information about the GET message. The HTTP descriptor SIDs are designed for this purpose. Initiator: The client performing the GET. (Subject SID. Pluralizable.) Receiver: The entity receiving the GET request. (Indirect object SID. Pluralizable.) --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Application Session Verb SIDs

Name: OpenApplicationSession
Class: verb Syntax: OpenApplicationSession Description: An entity opens an application session with another entity. To distinguish this from OpenTCPConnection, success or failure of this open is not based on the status of the underlying TCP connection, but on application-specific criteria. For instance, login requests are handled using this verb (OpenApplicationSession); the request succeeds or fails on the user supplying the correct username-password pair. May Contain: Outcome: Contains the status of the request (success, failure, etc.). When: Contains the time. Hostnames and addresses--anything identifying the location of the Initiator or Receiver--should be placed with their respective roles and not here. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity opening the session--that is, the entity sending the open session request. Source addresses, ports, etc. go in here (even if it is possible that they are forged). (Subject SID. Pluralizable.) Receiver: The entity serving the session request. The actual port serving the session will be given here, as well as the standard port used to identify the protocol believed to be used in the connection. (Direct object SID. Pluralizable.) Account: Information about the account being logged into, if applicable. (Indirect object SID. Not pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo
--- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: CloseApplicationSession
Class: verb Syntax: CloseApplicationSession Description: A user closes a session. This may be any of the sessions opened via OpenApplicationSession. May Contain: Outcome: ReturnCode if present, represents the success or failure of the attempt to close the session. When: This contains the time at which the session was closed. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity operating the session (the client). The host starting the session should be given here. (Subject SID. Not pluralizable.) Receiver: ProcessID should be used to identify the server session that was closed. The host serving the session should be given here. (Direct object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Login
Class: verb Syntax: Login Description: A user logs in or attempts to log in to an account. Login should be used whenever a new shell is started. Therefore, for example, SU should be expressed using Login.
May Contain: Outcome: The exit status of the login attempt. When: This contains the time of the login attempt, and also the host serving the login session. Observer: The entity which observed and/or recorded this occurrence. Initiator: The user attempting to log in. The host starting the login attempt also should be put here (i.e., User@SourceHost). (Subject SID. Not pluralizable.) Account: Information about the user account being logged into. The host of the account also should be given here (i.e., AccountName@TargetHost). (Indirect object SID. Not pluralizable.) Receiver: ProcessID should be used to identify the login session process. The host serving the login session should also be given here (even if same as preceding). (Direct object SID. Not pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: OpenFTP
Class: verb Syntax: OpenFTP Description: A user opens or attempts to open an FTP session to an account. Any session using the FTP protocol should be expressed using OpenFTP. May Contain: Outcome: The exit status of the FTP open attempt. When: This contains the time of the FTP open attempt. Observer: The entity which observed and/or recorded this
occurrence. Initiator: The user attempting to log in. The host that the user is directly opening the session from should be placed here. (Subject SID. Not pluralizable.) Account: Information about the user account being logged into. If this is an anonymous FTP session, the AnonFTPEMailAddr SID should be used. The host of the FTP account should be given here. (Indirect object SID. Not pluralizable.) Receiver: ProcessID should be used to identify the FTP daemon session process. The host serving the session should be identified here (even if same as preceding). (Direct object SID. Not pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: SendMail
Class: verb Syntax: SendMail Description: A user sends mail to others. May Contain: Outcome: ReturnCode, if present, represents the success or failure of the attempt to send the mail. The actual return code should not be placed here. Instead, the encoder should use ByMeansOf to indicate the command by which the mail would be sent, and use ActualReturnCode there. When: The time at which the mail was sent. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity sending the mail message. (Subject SID. Not pluralizable.) Receiver: An entity designated to receive the mail message. (Indirect
object SID. Pluralizable.) MailMessage: Information about the e-mail message. A mail message can be referred to with HostName and MailMessageID. (Direct object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
State Assertion Verb SIDs

Name: ObserveState
Class: verb Syntax: ObserveState Description: This gives the current state of the system. May Contain: When: This contains the time for which "current state" has meaning. It also delineates the "system." Observer: The entity observing the system. (Subject SID. Not pluralizable.) CurrentState: The current state of the system. Any attributes of the system may be given here. The host or network defined as "the system" should be identified here. (Direct object SID. Not pluralizable.) --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: ChangeState
Class: verb Syntax: ChangeState Description: This reports a state change in the system. May Contain: When: This contains the time at which the CurrentState is observed. Observer: The entity observing the system. (Subject SID. Not pluralizable.) OldState: The previous state of the system. If there is a time associated with the old state, it should be given here. Any old attributes of the system may be given here. The system (host or network) is also identified here. (Direct object SID. Not pluralizable.) CurrentState: The current state of the system. Any current attributes of the system may be given here. The encoder should, however, use the same attributes which were given in the OldState. The system (host or network) is also identified here, even if same as preceding. (Direct object SID. Not pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
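The ChangeState entry asks the encoder to describe the current state with the same attributes used for the old state. The following Python sketch shows one way such a check might be made on a sentence that has already been parsed into nested lists. The nested-list representation, the function names, and the sample attribute values are assumptions made for illustration only.

```python
def attribute_heads(role):
    """Collect the SID names of the direct children of a role."""
    return {child[0] for child in role[1:] if isinstance(child, list)}


def state_attribute_mismatch(change_state):
    """Return the attribute SIDs present in only one of the two states."""
    roles = {c[0]: c for c in change_state[1:] if isinstance(c, list)}
    old = attribute_heads(roles["OldState"])
    current = attribute_heads(roles["CurrentState"])
    # Symmetric difference: attributes given in one state but not the other.
    return sorted(old ^ current)


# Hypothetical sentence: OSName appears in OldState only.
change = ["ChangeState",
          ["OldState", ["HostName", "web1"], ["OSName", "Linux"]],
          ["CurrentState", ["HostName", "web1"]]]
print(state_attribute_mismatch(change))
```

An empty result indicates that both states are described with the same set of attributes.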
Authorization and Policy Verb SIDs
Name: AcquireProxy
Class: verb Syntax: AcquireProxy Description: A user gains or tries to gain the ability to act as another.
May Contain: Outcome: ReturnCode, if present, represents the success or failure of the attempt to obtain the proxy. The actual return code should not be placed here. Instead, the encoder should use ByMeansOf to indicate the command by which the proxy would be acquired, and use ActualReturnCode there. When: This contains the time at which the proxy was obtained (or the attempt was made). Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity gaining the proxy. The host where the user is acting should be placed here. (Subject SID. Not pluralizable.) Proxy: The entity or entities on whose behalf the Initiator may now act. For example, if a UserName of 'root' is specified here, the Initiator has the ability to act as root. (Direct object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: ReleaseProxy
Class: verb Syntax: ReleaseProxy Description: A user loses or releases the ability to act as another. May Contain: Outcome: ReturnCode represents the success or failure of the attempt to release the proxy. The actual return code should not be placed here. Instead, the encoder should use ByMeansOf to indicate the command by which the proxy would be released, and use ActualReturnCode there. When: This contains the time at which the proxy was released (or the attempt was made).
Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity releasing the proxy. The host where the user is acting should be placed here. (Subject SID. Not pluralizable.) Proxy: The entity or entities on whose behalf the Initiator could previously act. For example, if a UserName of 'root' is specified here, the Initiator releases the ability to act as root. (Direct object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Request
Class: verb Syntax: Request Description: An entity makes a request of another entity. "Request" here is taken in its informal sense. Any directly included S-expression headed by a verb is construed as a sentence indicating what the Initiator requests of the Receiver. There may be multiple sentences; these are treated as pluralized objects. This SID is used to express "third-party" observations about policy decisions imposed by one party on another. It is NOT used to express an action that the sender wants the receiver to perform; for that purpose, the Do SID should be used instead. May Contain: When: This contains the time that the request was made. Appropriate hostnames should be included with the Initiator and Receiver roles. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity making a request. (Subject SID. Not pluralizable.) Receiver: The entity receiving the request. (Indirect object SID.
Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When --- An S-expression headed by any of the following SIDs may be used to represent the actual request: And HelpedCause ByMeansOf Copy Move Delete Execute Suspend Resume Terminate Reboot Shutdown Boot TCPConnect SendMail AcquireProxy ReleaseProxy AuditAccount TraceMessage
Name: Require
Class: verb Syntax: Require Description: An entity requires another entity to perform an action. Any included S-expression headed by a verb is construed as a sentence indicating what the Initiator requires of the Receiver. There may be multiple sentences; these are treated as pluralized objects. This SID is used to express "third-party" observations about policy decisions imposed by one party on another. It is NOT used to express an action that the sender wants the receiver to perform; for that purpose, the Do SID should be used instead. May Contain: When: This contains the time that the requiring was made. Appropriate hostnames should be included with the Initiator and Receiver roles. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity making the requirement. (Subject SID. Not pluralizable.) Receiver: The entity receiving the requirement. (Indirect object SID. Pluralizable.)
Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When --- An S-expression headed by any of the following SIDs may be used to represent the actual requirement: And HelpedCause ByMeansOf Copy Move Delete Execute Suspend Resume Terminate Reboot Shutdown Boot TCPConnect SendMail AcquireProxy ReleaseProxy AuditAccount TraceMessage
Name: Allow
Class: verb Syntax: Allow Description: An entity permits another entity to perform an action. Any included S-expression headed by a verb is construed as a sentence indicating what the Initiator allows to the Receiver. There may be multiple sentences; these are treated as pluralized objects. This SID is used to express "third-party" observations about policy decisions imposed by one party on another. It is NOT used to express an action that the sender wants the receiver to perform; for that purpose, the Do SID should be used instead. May Contain: When: This contains the time that the permission was granted. Appropriate hostnames should be included with the Initiator and Receiver roles. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity granting the allowance. (Subject SID. Not pluralizable.) Receiver: The entity receiving the allowance. (Indirect object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated
number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When --- An S-expression headed by any of the following SIDs may be used to represent the actual allowed action: And HelpedCause ByMeansOf Copy Move Delete Execute Suspend Resume Terminate Reboot Shutdown Boot TCPConnect SendMail AcquireProxy ReleaseProxy AuditAccount TraceMessage
Name: Forbid
Class: verb Syntax: Forbid Description: An entity forbids another entity to perform an action. Any included S-expression headed by a verb is construed as a sentence indicating what the Initiator forbids to the Receiver. There may be multiple sentences; these are treated as pluralized objects. This SID is used to express "third-party" observations about policy decisions imposed by one party on another. It is NOT used to express an action that the sender wants the receiver to perform; for that purpose, the Do SID should be used instead. May Contain: When: This contains the time that the forbidding was done. Appropriate hostnames should be included with the Initiator and Receiver roles. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity forbidding the action. (Subject SID. Not pluralizable.) Receiver: The entity being forbidden. (Indirect object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times.
--- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When --- An S-expression headed by any of the following SIDs may be used to represent the actual forbidden action: And HelpedCause ByMeansOf Copy Move Delete Execute Suspend Resume Terminate Reboot Shutdown Boot TCPConnect SendMail AcquireProxy ReleaseProxy AuditAccount TraceMessage
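Request, Require, Allow and Forbid share the same structure: role SIDs plus one or more directly included sentences whose head SID comes from a fixed list. The Python sketch below separates those embedded sentences from the roles, assuming the sentence has already been parsed into nested lists. The nested-list representation, the function name, and the sample host and user names are hypothetical.

```python
# Head SIDs that may introduce the embedded sentence, as listed above.
ALLOWED_HEADS = {
    "And", "HelpedCause", "ByMeansOf", "Copy", "Move", "Delete", "Execute",
    "Suspend", "Resume", "Terminate", "Reboot", "Shutdown", "Boot",
    "TCPConnect", "SendMail", "AcquireProxy", "ReleaseProxy",
    "AuditAccount", "TraceMessage",
}
POLICY_VERBS = {"Request", "Require", "Allow", "Forbid"}


def embedded_sentences(policy):
    """Return the directly included sentences of a policy verb."""
    head, *children = policy
    if head not in POLICY_VERBS:
        raise ValueError(f"{head} is not one of the policy verbs")
    # Only direct children count; roles such as Initiator are skipped.
    return [c for c in children
            if isinstance(c, list) and c[0] in ALLOWED_HEADS]


# Hypothetical observation: a gateway forbids a guest two actions.
policy = ["Forbid",
          ["Initiator", ["HostName", "gateway"]],
          ["Receiver", ["UserName", "guest"]],
          ["TCPConnect", ["Receiver", ["TCPPort", "23"]]],
          ["SendMail"]]
print([s[0] for s in embedded_sentences(policy)])
```

Two sentences are returned for this input, which corresponds to the pluralized-object reading described in the verb definitions.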
Auditing Verb SIDs

Name: AuditAccount
Class: verb Syntax: AuditAccount Description: A user account is audited. May Contain: When: This contains the time at which (or during which) the auditing was performed. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity performing the audit. (Subject SID. Not pluralizable.) Account: The account being audited. The host on which the account resides should be given here. (Direct object SID. Pluralizable.) Tool: The mechanism used to perform the auditing. (Indirect object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs:
ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: TraceMessage
Class: verb Syntax: TraceMessage Description: Specified messages are traced (to their source). May Contain: When: This contains the time at which (or during which) the tracing was performed. Observer: The entity which observed and/or recorded this occurrence. Initiator: The entity tracing the messages. (Subject SID. Not pluralizable.) Message: The message(s) being traced. (Direct object SID. Pluralizable.) Tool: The mechanism used to trace the messages. (Indirect object SID. Pluralizable.) AttackPath: The path traversed by the attack being traced. TracePath: The path traversed by the attack through the Observer. Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Analysis Verb SIDs
Name: Attack
Class: verb Syntax: Attack Description: This represents a diagnosis of an attack. Commonly it is used as the front end of a ByMeansOf structure--roughly speaking, "An attack occurred, by means of the following action." May Contain: Outcome: This indicates the success or failure of the attack. If the attack could be judged to have succeeded, then (for instance) the ReturnCode would be 0. Actual return codes should accompany the specifics of the attack in the tail ends of a ByMeansOf construct. When: This contains the time at which (or during which) the attack took place. Observer: The entity making the attack diagnosis. (Subject SID. Not pluralizable.) AttackSpecifics: High-level information about the attack. For example, the AttackNickname would go here. This is intended to be a "quick look" category so that experts in particular attacks can look for the related parameters (if any). (Indirect object SID. Not pluralizable.) Initiator: The entity (if known) responsible for carrying out the attack. (Subject SID. Pluralizable.) Target: The entity (if known) that is the target of the attack. (Direct object SID. Pluralizable.) Multiplier: Means that this action was carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Command Verb SIDs
Name: Do
Class: verb Syntax: Do Description: This is an explicit request for an action. This is distinguished from Request in the following way: Request is used to express the thought, "X requests Y to do Z." Do is used to express the thought, "Do Z." In other words, a Request is a passive assertion of fact, that a request has taken place; a Do is an active request. Before acceding to the request, the actual receiver of the message should try to match itself to the Receiver, and to authenticate the Initiator, and to check the authorization of the Initiator to request the action. Any included S-expression headed by a verb SID is construed as a sentence indicating what it is that the Receiver should do. The time and location of the action are not included in the Do verb directly, but are instead placed in the included sentences. May Contain: Initiator: The entity making the request. (Subject SID. Not pluralizable.) Receiver: The entity receiving the request. (Direct object SID. Pluralizable.) Multiplier: Means that this action is to be carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When --- An S-expression headed by any of the following SIDs may be used to represent the requested action: And HelpedCause ByMeansOf Copy Move Delete Execute Suspend Resume Terminate Reboot Shutdown Boot TCPConnect SendMail
AcquireProxy ReleaseProxy AuditAccount TraceMessage
Name: Did
Class: verb Syntax: Did Description: Did is used only to provide a description of what was done in response to a request for an action (i.e., a sentence with the 'Do' verb). Did is used to express the thought, "I Did Z", where Z is the sentence represented by the S-expression following the Did verb. May Contain: --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When --- An S-expression headed by any of the following SIDs may be used to describe the performed action: And HelpedCause ByMeansOf Copy Move Delete Execute Suspend Resume Terminate Reboot Shutdown Boot TCPConnect SendMail AcquireProxy ReleaseProxy AuditAccount TraceMessage
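The Do entry describes three checks a receiver should make before acceding to a request: match itself to the Receiver, authenticate the Initiator, and check the Initiator's authorization. The Python sketch below shows one possible ordering of those checks followed by a Did reply. It is an illustration under stated assumptions: sentences are nested lists, the Receiver is matched on HostName, authorization is a set of user names, and authentication is delegated to a caller-supplied function. All function and parameter names are hypothetical.

```python
ACTION_HEADS = {
    "And", "HelpedCause", "ByMeansOf", "Copy", "Move", "Delete", "Execute",
    "Suspend", "Resume", "Terminate", "Reboot", "Shutdown", "Boot",
    "TCPConnect", "SendMail", "AcquireProxy", "ReleaseProxy",
    "AuditAccount", "TraceMessage",
}


def find(node, head):
    """Return the first direct child of node headed by head, or None."""
    for child in node[1:]:
        if isinstance(child, list) and child[0] == head:
            return child
    return None


def handle_do(do, my_host, authorized_users, authenticate):
    """Return a Did sentence if the Do is accepted, otherwise None."""
    receiver, initiator = find(do, "Receiver"), find(do, "Initiator")
    if receiver is None or initiator is None:
        return None
    # 1. Match ourselves to the Receiver (assumed: by HostName).
    host = find(receiver, "HostName")
    if host is None or host[1] != my_host:
        return None
    # 2. Authenticate the Initiator (mechanism supplied by the caller).
    if not authenticate(initiator):
        return None
    # 3. Check that the Initiator is authorized to request the action.
    user = find(initiator, "UserName")
    if user is None or user[1] not in authorized_users:
        return None
    actions = [c for c in do[1:]
               if isinstance(c, list) and c[0] in ACTION_HEADS]
    # A real receiver would carry out each action before reporting it.
    return ["Did"] + actions


# Hypothetical request: admin asks host-a to shut itself down.
do = ["Do",
      ["Initiator", ["UserName", "admin"]],
      ["Receiver", ["HostName", "host-a"]],
      ["Shutdown", ["Machine", ["HostName", "host-a"]]]]
print(handle_do(do, "host-a", {"admin"}, lambda initiator: True))
```

The reply reuses the requested sentences unchanged, which follows the "I Did Z" reading of the Did verb.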
Name: Authenticate
Class: verb Syntax: Authenticate Description: This is an explicit authentication. The authentication may be by an individual or a program or process. It differs from Login in that the Initiator may be authenticating an application or other code through the use of a hash or other authenticating method. When an entity is authenticating to a process, program or application, use Login. May Contain: Initiator: The entity performing the authentication. (Subject SID.
Not pluralizable.) Receiver: The entity receiving the authentication. (Direct object SID. Pluralizable.) Observer: The entity recording the authentication. Process: The process performing the authentication. Data: The data being authenticated. (Direct object SID. Pluralizable.) Tool: The tool being authenticated. (Direct object SID. Pluralizable.) Multiplier: Means that this action is to be carried out the indicated number of times. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime
Role SIDs
General Purpose Role SIDs
Name: Initiator
Class: role Syntax: Initiator Description: The process or entity responsible for "doing" the containing verb. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the user executing the initiating process: UserName RealName UserID GroupName GroupID EMailAddress --- The following SIDs are used to describe the process itself:
ProcessID ProcessName ProcessStatus TCPPort TCPPortRange UDPPort UDPPortRange --- The following SIDs are used to describe the machine on which the user is executing the initiating process: HostName FQHostName IPV4Address ArchitectureName OSName --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Observer
Class: role Syntax: Observer Description: The process or entity observing an action or a system state, or analyzing a sequence of events. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the user executing the observing process: UserName RealName UserID GroupName GroupID EMailAddress --- The following SIDs are used to describe the process itself: ProcessID ProcessName ProcessStatus TCPPort TCPPortRange UDPPort UDPPortRange --- The following SIDs are used to describe the machine on which the user is executing the observing process: HostName FQHostName IPV4Address ArchitectureName OSName --- The following are referent SIDs: ReferAs ReferTo
--- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
File-Related Role SIDs
Name: FileSource
Class: role Syntax: FileSource Description: The source of a file operation. This may be the source of a copy or a move, or it may be the target of a delete. It may contain the host on which the file resides (or at least can be accessed). More specific information about the interpretation of FileSource is included with each applicable verb description. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the file or files being copied, moved, or deleted: FileName FullFileName ByteSize TimeCreated TimeModified TimeAccessed DirectoryName FullDirectoryName AccessPermission --- The following SIDs are used to describe the host on which those files reside: HostName FQHostName IPV4Address ArchitectureName OSName --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: FileDestination
Class: role Syntax: FileDestination Description: The destination of a file operation. It may contain the host on which the file is to reside (or at least to be accessed). More specific information about the interpretation of FileDestination is included with each applicable verb description. May Contain: --- The following SIDs are used to describe the file or files which are newly placed as a result of a file copy or move: FileName FullFileName ByteSize TimeCreated TimeModified TimeAccessed DirectoryName FullDirectoryName AccessPermission --- The following SIDs are used to describe the host on which those files now reside: HostName FQHostName IPV4Address ArchitectureName OSName --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Process-Related Role SIDs

Name: Process
Class: role Syntax: Process Description: A process executing on a single host. May Contain:
Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the process running on the host. ProcessID ProcessName SystemTime UserTime CPUTime TCPPort TCPPortRange UDPPort UDPPortRange --- The following SIDs are used to describe the program being run in the process. ProgramName VersionNumber LanguageName --- The following SIDs are used to describe the file containing the executable running as the process. FileName FullFileName ByteSize TimeCreated TimeModified TimeAccessed DirectoryName FullDirectoryName AccessPermission --- The following SIDs are used to describe the host on which the process is running. HostName FQHostName IPV4Address ArchitectureName OSName --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Tool
Class: role Syntax: Tool [generic tool name] Type: The argument [generic tool name] is of type string Example:
(Tool SafeBack (ProgramName safeback) (FileName safeback.exe) (VersionNumber 2.1) )
Description: An application used to perform an action.

May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the process the tool is running as: ProcessID ProcessName SystemTime UserTime CPUTime TCPPort TCPPortRange UDPPort UDPPortRange --- The following SIDs are used to describe the program used as the tool: ProgramName VersionNumber LanguageName ApprovedSoftware ApprovedHardware ApprovedMethod --- The following SIDs are used to describe the file containing the executable for the tool: FileName FullFileName ByteSize TimeCreated TimeModified TimeAccessed DirectoryName FullDirectoryName AccessPermission --- The following SIDs are used to describe the host on which the tool is running: HostName FQHostName IPV4Address ArchitectureName OSName --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
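The Tool example given in this entry can be read into a tree with a very small S-expression reader. The Python sketch below is a minimal illustration and not a complete parser for the language: it assumes atoms contain no whitespace, parentheses or quoted strings, and the function name parse is hypothetical.

```python
def parse(text):
    """Parse one S-expression into nested lists of strings."""
    # Pad parentheses with spaces so that split() yields clean tokens.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        if tokens[pos] != "(":
            return tokens[pos], pos + 1      # a bare atom
        node, pos = [], pos + 1
        while tokens[pos] != ")":            # IndexError if unbalanced
            child, pos = read(pos)
            node.append(child)
        return node, pos + 1                 # skip the closing parenthesis

    tree, end = read(0)
    if end != len(tokens):
        raise ValueError("trailing tokens after the first S-expression")
    return tree


tool = parse("(Tool SafeBack (ProgramName safeback) "
             "(FileName safeback.exe) (VersionNumber 2.1) )")
print(tool)
```

The first element of each list is the SID, so tool[0] is the role name, tool[1] is the generic tool name argument, and the remaining elements are the descriptive SIDs.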
User-Related Role SIDs
Name: Account
Class: role Syntax: Account [account name] Type: The argument [account name] is of type string
Description: A user account. May Contain: --- The following SIDs describe the user who the account represents: UserName RealName UserID GroupName GroupID EMailAddress --- The following SIDs describe the host on which that account resides: HostName FQHostName IPV4Address ArchitectureName OSName --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Proxy
Class: role Syntax: Proxy Description: A principal on behalf of whom an entity acts. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs describe the user whose privileges have been acquired or released: UserName RealName UserID GroupName GroupID EMailAddress --- The following SIDs describe the host on which those privileges apply: HostName FQHostName IPV4Address ArchitectureName OSName --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs:
Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Forensic Preservation Role SIDs
Name: Data
Class: role Syntax: Data Description: A dataset potentially containing evidence. The dataset may be a gross dataset, such as a complete fixed disk, or it may be all or a portion of a file such as a log file. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the data. VolumeID DiskID MediaID Hash Device FileSource FileName FullFileName ByteSize TimeCreated TimeModified TimeAccessed --- The following SIDs are used to describe the method or procedure for preserving the data. ProcedureName Tool ApprovedMethod ApprovedSoftware ApprovedHardware Observer Initiator --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Forensic Collection Role SIDs
Name: ApprovedMethod
Class: role Syntax: ApprovedMethod Description: A forensic collection method or procedure that has been performed by a forensic examiner trained formally in the method or procedure. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the program being used in the method or process. Tool ProgramName VersionNumber --- The following SIDs are used to describe the method or procedure. ProcedureName --- The following SIDs are used to describe the approval method and certification if any. Certification Citation --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: ApprovedSoftware
Class: role Syntax: ApprovedSoftware Description: Forensic collection software that has been used to collect data from computer media or memory. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used.
--- The following SIDs are used to describe the program being used. Tool ProgramName VersionNumber --- The following SIDs are used to describe the approval method and certification if any. Certification Citation --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: ApprovedHardware
Class: role Syntax: ApprovedHardware Description: Forensic collection hardware that has been used to collect data from computer media or memory. The hardware must be an inseparable part of the collection procedure. It may not be, simply, the computer upon which collection or imaging software is running. The hardware must participate in the collection process itself beyond simply being a platform for image collection or storage. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the hardware being used. Tool HostName VersionNumber --- The following SIDs are used to describe the approval method and certification if any. Certification Citation --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier
Forensic Examination Role SIDs

Name: Validation
Class: role Syntax: Validation Description: Techniques used to corroborate, or show the validity of, evidence. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe evidence being validated: FileName FullFileName ByteSize TimeCreated TimeModified TimeAccessed DirectoryName FullDirectoryName Hash VolumeID DiskID MediaID PhysicalLocation Initiator Observer --- The following SIDs are used to describe the approval method and certification if any. Certification Citation ProcedureName Tool ApprovedMethod ApprovedSoftware --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Forensic Analysis Role SIDs

Name: Link
Class: role Syntax: Link [Argument1], [Argument2] Type: The arguments [Argument1] and [Argument2] are of type string. Description: Forensic links between pieces of potential evidence. A link may be forged as part of the chain of evidence or it may be the result of a link analysis procedure. When using a sophisticated link analysis tool in a complex case there may be multiple links discovered. In that case the Link role may be used as many times as necessary to establish all applicable links discovered by the investigator or the tool. The arguments are strings. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe entities in the linking process. Subject Suspect Data FileName FullFileName Hash UserName RealName UserID HostName FQHostName IPV4Address DomainName FQDomainName SourceIPV4Address DestinationIPV4Address MailMessageID Initiator Observer Receiver --- The following SIDs are used to describe the approval method and certification if any. Certification Citation Tool --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Investigation Collection Role SIDs

Name: Citation
Class: role Syntax: Citation Description: The legal citation that describes a court case or other source that validates ApprovedHardware, ApprovedSoftware or ApprovedMethod. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the Citation role. CourtDecision Jurisdiction Observer --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Certification
Class: role Syntax: Certification Description: The certification process that validates ApprovedMethod. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the Certification role. CourtDecision Jurisdiction CertType CertNumber RealName Certifier Observer --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Policy
Class: role Syntax: Policy Description: The organizational policy used to justify or permit an investigation under TraceAuthority. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the Policy role. PolicyName PolicyDate Observer Initiator Receiver --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Investigation Analysis Role SIDs
Name: Subject
Class: role Syntax: Subject Description: The entity who is the target of an interview or interrogation. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the Subject role. Observer RealName In this SID the Observer is the interviewer.
Name: Suspect
Class: role Syntax: Suspect Description: The subject who is the target of an interview or interrogation and is a suspect in the event or incident investigation. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the Suspect role. Observer RealName In this SID the Observer is the interviewer. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Investigation Presentation Role SIDs

Name: Expert
Class: role Syntax: Expert Description: The entity who is an acknowledged expert and may testify in court as such. May Contain:
Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the Expert role. CourtDecision Jurisdiction RealName --- The following are attribute SIDs: Certifier --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Countermeasure
Class: role Syntax: Countermeasure [generic name or term] Type: The argument [generic name or term] is of type string. Description: The countermeasure recommended to mitigate future damage from similar events or incidents to those being investigated. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the Countermeasure role. FileName FullFileName ProgramName VersionNumber HostName FQHostName ArchitectureName OSName DomainName FQDomainName TCPPort UDPPort TCPSourcePort UDPSourcePort TCPPortRange UDPPortRange Initiator Observer Tool Process Receiver --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Network-Related Role SIDs
Name: Receiver
Class: role Syntax: Receiver Description: The process receiving a message or being subjected to a policy. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs are used to describe the receiving user: UserName RealName UserID GroupName GroupID EMailAddress --- The following SIDs are used to describe the process representing the receiver: ProcessID ProcessName Priority SystemTime UserTime CPUTime TCPPort TCPPortRange UDPPort UDPPortRange --- The following SIDs are used to describe the machine on which that process is running: HostName FQHostName IPV4Address ArchitectureName OSName --- The following SIDs are used to describe the domain in which the host resides. DomainName FQDomainName IPV4Mask --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Messaging-Related Role SIDs
Name: Message
Class: role Syntax: Message Description: Information about a specific message referred to as part of a sentence. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs give the header fields of the message: EtherAddress SourceIPV4Address DestinationIPV4Address SourceIPV4Mask DestinationIPV4Mask IPV4Protocol IPV4Checksum TCPPort TCPSourcePort TCPDestinationPort TCPPortRange TCPSourcePortRange TCPDestinationPortRange UDPPort UDPSourcePort UDPDestinationPort UDPPortRange UDPSourcePortRange UDPDestinationPortRange UDPLength ICMPType --- The following SIDs give information about HTTP fields: URL --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: MailMessage
Class: role Syntax: MailMessage Description: Information about an e-mail message referred to as part of a SendMail sentence. May Contain:
Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used. --- The following SIDs describe the e-mail message being sent: MailMessageID Content ContentType --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
State-Related Role SIDs

Name: OldState
Class: role Syntax: OldState [description] Type: The [description] argument is of type string. Description: A previous state of the system. May Contain: --- The following SIDs identify the host, data or network being defined as the "system": HostName FQHostName IPV4Address ArchitectureName OSName Data --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: CurrentState
Class: role Syntax: CurrentState [description] Type: The argument [description] is of type string.
Description: A current state of the system (current defined with respect to the time given in the When). May Contain: --- The following SIDs identify the host, data or network being defined as the "system": HostName FQHostName IPV4Address ArchitectureName OSName Data --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Machine
Class: role Syntax: Machine Description: A specific computer (host). This does not include networking devices such as routers or switches. For those and similar devices use Device. May Contain: --- The following SIDs identify the host. HostName FQHostName IPV4Address ArchitectureName OSName --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Analysis-Related Role SIDs
Name: AttackSpecifics
Class: role Syntax: AttackSpecifics Description: High-level information about an attack. May Contain: --- The following required SID is used to describe the attack: AttackNickname --- The following SIDs are used to describe the analyzer's appraisal of the seriousness of the attack: Certainty Severity --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Target
Class: role Syntax: Target Description: The target of an attack, as described in the Attack verb. Also may be used as the target of an investigative or forensic process. May Contain: Multiplier: If this role is designated as pluralizable under the containing verb, this SID may be used.
--- The following SIDs are used to describe the user that is the target of the attack: UserName RealName UserID GroupName GroupID EMailAddress --- The following SIDs are used to describe the targeted program, data or process: ProcessID ProcessName Priority SystemTime UserTime CPUTime TCPPort TCPPortRange UDPPort UDPPortRange Data Tool ProgramName ICMPType --- The following SIDs are used to describe the targeted machine: HostName FQHostName IPV4Address ArchitectureName OSName Device --- The following are referent SIDs: ReferAs ReferTo --- The following are attribute SIDs: Owner Certifier Developer --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Auditing Role SIDs
Name: AttackPath
Class: role Syntax: AttackPath Description: Information about the path that an attack travels. This may not be the actual final path but a best guess. May Contain: IPv4Path: Identifies the path along which the attack traveled. Required. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: TracePath
Class: role Syntax: TracePath Description: Information about how an event traveled through the observing component. May Contain: IPv4Path: Contains two IPv4 addresses, one identifying the connection where the event entered, the second where the event exited. Required. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: ReceivedVia
Class: role Syntax: ReceivedVia Description: The connection from which an event travels to the entering connection (see TracePath). This usually is the last hop in a TracePath Sentence. May Contain: IPv4Address: The IPv4 address of the sourcing connection. Required. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: ReceivedFrom
Class: role Syntax: ReceivedFrom
Description: The device that sent a trace request (see TracePath). May Contain: IPv4Address: The IPv4 address of the requesting device. Required. --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Adverb SIDs
Name: Outcome
Class: adverb Syntax: Outcome Description: Information about the return status of the containing verb. This may affect the interpretation of the sentence: if the verb is Delete, but there is an Outcome including a clause such as (ReturnCode failed), then the interpretation is that a Delete was attempted but did not succeed. Conversely, if there is no Outcome role in a sentence, then the verb, if it represents an event, can be presumed to have happened successfully. May Contain: --- The following SIDs describe the observer's appraisal of the seriousness of the report: Certainty Severity --- When the verb represents an action, the following SIDs denote the status or the result of the attempted action: ReturnCode ActualReturnCode TCPConnectionStatus --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
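The interpretation rule given for Outcome can be sketched in Python. The dictionary layout used here for a sentence (a "verb" key plus an optional "Outcome" mapping) and the function name verb_succeeded are hypothetical, chosen only for illustration; DIPL does not define them. The only ReturnCode value taken from the text is "failed".

```python
def verb_succeeded(sentence):
    """Apply the Outcome rule to a sentence held in a hypothetical dict form.

    Example input: {"verb": "Delete", "Outcome": {"ReturnCode": "failed"}}
    """
    outcome = sentence.get("Outcome")
    if outcome is None:
        # No Outcome role: the event is presumed to have happened successfully.
        return True
    # An Outcome containing (ReturnCode failed) means the action was only attempted.
    return outcome.get("ReturnCode") != "failed"


attempted = {"verb": "Delete", "Outcome": {"ReturnCode": "failed"}}
completed = {"verb": "Delete"}
print(verb_succeeded(attempted), verb_succeeded(completed))
```

The sketch treats any Outcome without a failed ReturnCode as success, which is an assumption; a fuller treatment would also consult ActualReturnCode and TCPConnectionStatus.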
Name: When
Class: adverb Syntax: When Description: Information about when an event took place, or when a response is to take place (if the sentence is a prescription). May Contain: --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Attribute SIDs
Name: Owner
Class: attribute Syntax: Owner Description: An entity with ownership rights to the object specified in the containing role clause. If there is no specific assignment of ownership, then this refers to an entity with the right to control access to the object. May Contain: --- The following SIDs describe the user who owns the parent object: UserName RealName UserID GroupName GroupID EMailAddress --- The following SIDs describe the host on which that user resides: HostName FQHostName IPV4Address ArchitectureName OSName --- The following are referent SIDs: ReferAs ReferTo
Name: Certifier
Class: attribute Syntax: Certifier Description: An entity which vouches for the identity of the object. May Contain: --- The following SIDs describe the entity who certifies the parent object: UserName RealName UserID GroupName GroupID EMailAddress Certification --- The following SIDs describe the host on which that user resides: HostName FQHostName IPV4Address ArchitectureName OSName --- The following are referent SIDs: ReferAs ReferTo --- The following are universally usable SIDs: Comment World Time BeginTime EndTime Authenticate When
Name: Developer
Class: attribute Syntax: Developer Description: An entity which developed the object (typically a program). May Contain: --- The following SIDs describe the user who developed the parent object: UserName RealName UserID GroupName GroupID EMailAddress --- The following SIDs describe the host on which that user resides: HostName FQHostName IPV4Address ArchitectureName OSName
Atom SIDs
Generally speaking, unless noted differently, the syntax for an atom SID is:
SID [arg. 1], [arg. 2], ... [arg. n]
The arguments, unless stated differently, take the type associated with the SID.
General Purpose Atom SIDs

Name: Comment
Class: atom Type: string Description: A comment in text. Short comments relating to a line of DIPL code also may be entered at the end of the line using the double colon ( :: ) preceding the comment.
Name: World
Class: atom Type: string Description: This SID takes a list of worlds as described below. Each world is represented by a name. Currently defined worlds are Unix, Microsoft
Name: Multiplier
Class: atom Type: ushort Description: Briefly, pluralizes either an entire sentence or a single role clause within a sentence.
File Descriptor Atom SIDs
Name: FileName
Class: atom Type: string Description: The name of a file. It may be incompletely specified; i.e., in the Unix world, it may be either 'passwd' or '/etc/passwd'; both are allowed.
Name: FullFileName
Class: atom Type: string Description: The name of a file. It must be completely specified; i.e., in the Unix world, it must be '/etc/passwd', not just 'passwd'.
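The difference between FileName and FullFileName in the Unix world can be checked mechanically. The sketch below assumes that "completely specified" means an absolute POSIX path; the function name is_full_file_name is hypothetical and not part of DIPL.

```python
import posixpath


def is_full_file_name(name):
    """Return True when a Unix-world name is completely specified.

    Assumption: "completely specified" is read as "absolute POSIX path".
    """
    return posixpath.isabs(name)


for candidate in ("passwd", "/etc/passwd"):
    print(candidate, is_full_file_name(candidate))
```

A value that passes this check is acceptable for both SIDs, while a bare name such as 'passwd' is acceptable only for FileName.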
Name: ByteSize
Class: atom Type: ushort Description: The length of a file, data structure (or in general any object whose size can be measured this way) in bytes (octets).
Name: TimeCreated
Class: atom Type: string Description: The time at which an object (such as a file) was created. Expressed as hh:mm:ss [TZ] ddmmyyyy.
Name: TimeModified
Class: atom Type: string Description: The time at which an object (such as a file) was last modified. Expressed as hh:mm:ss [TZ] ddmmyyyy.
Name: TimeAccessed
Class: atom Type: string Description: The time at which an object (such as a file) was last accessed (read). Expressed as hh:mm:ss [TZ] ddmmyyyy.
Name: DirectoryName
Class: atom Type: string Description: As with FileName, but for a directory.
Name: FullDirectoryName
Class: atom Type: string Description: As with FullFileName, but for a directory.
Name: AccessPermission
Class: atom Type: character (single character) Description: The access permission granted on a directory or file as given by the following table: r = Read. l = Lookup. i = Insert. d = Delete. w = Write. k = locK. a = Administer.
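The AccessPermission table lends itself to a simple lookup. The following Python sketch is illustrative only: the names ACCESS_PERMISSIONS and describe_permission are hypothetical, and treating the codes as case-sensitive lowercase letters is an assumption.

```python
# Single-character codes taken from the AccessPermission table.
ACCESS_PERMISSIONS = {
    "r": "Read",
    "l": "Lookup",
    "i": "Insert",
    "d": "Delete",
    "w": "Write",
    "k": "Lock",
    "a": "Administer",
}


def describe_permission(code):
    """Translate one AccessPermission character into its meaning."""
    if code not in ACCESS_PERMISSIONS:
        raise ValueError(f"unknown AccessPermission code: {code!r}")
    return ACCESS_PERMISSIONS[code]


print(describe_permission("k"))
```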
Program Descriptor Atom SIDs
Name: ProgramName
Class: atom Type: string Description: A name (which may be colloquial) of a program (e.g., 'MS Word 5.0' or simply 'MS Word'), as opposed to the filename at which it resides on a particular host. Therefore, all instances of a single program should have the same ProgramName (as expressed by a given observer).
Name: VersionNumber
Class: atom Type: string Description: A version number associated with a program.
Name: LanguageName
Class: atom Type: string Description: A name of a programming language (e.g., 'Pascal').
Process Descriptor Atom SIDs
Name: ProcessID
Class: atom Type: ulong Description: The ID number of a process.
Name: ProcessName
Class: atom Type: string Description: The name of a process (typically used for daemon or service name).
Name: ProcessStatus
Class: atom Type: number (single character) Description: The status of a process, as given by the following table:
0 = active (running)
1 = killed (terminated by external signal)
2 = suspended (awaiting an OS action)
3 = finished (terminated internally)
4 = unknown (no such process)
5-9 = undefined
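The ProcessStatus codes can be represented as an enumeration. This Python sketch is one possible reading of the table; the class name ProcessStatus and the helper decode_process_status are hypothetical, and returning None for the undefined codes 5 to 9 is a design choice of the example rather than something DIPL specifies.

```python
from enum import IntEnum


class ProcessStatus(IntEnum):
    """Defined ProcessStatus values (codes 5-9 are undefined)."""
    ACTIVE = 0      # running
    KILLED = 1      # terminated by external signal
    SUSPENDED = 2   # awaiting an OS action
    FINISHED = 3    # terminated internally
    UNKNOWN = 4     # no such process


def decode_process_status(code):
    """Return the matching ProcessStatus, or None for the undefined codes 5-9."""
    if not 0 <= code <= 9:
        raise ValueError("ProcessStatus is a single digit, 0 to 9")
    return ProcessStatus(code) if code <= 4 else None


print(decode_process_status(2))
```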
User Descriptor Atom SIDs
Name: UserName
Class: atom Type: string Description: The name of a user account.
Name: RealName
Class: atom Type: string Description: A real name of (ordinarily) a human being (e.g., 'Joe Smith').
Name: UserID
Class: atom Type: number (5 characters maximum) Description: An ID number associated with a user account. It can be counted on to be unique per machine at any time.
Name: GroupName
Class: atom Type: string Description: A name associated with a user group. This can be bound to either a user or an object such as a file.
Name: GroupID
Class: atom Type: number (5 characters maximum) Description: A user group ID number. (See GroupName.)
Name: EMailAddress
Class: atom Type: string Description: A standard-format e-mail address (name@domain).
Forensic Collection Atom SIDs
Name: LosslessCompression
Class: atom Type: string Description: The name of a lossless compression program, process or algorithm.
Name: Hash
Class: atom Type: string Description: The name of a hash type or algorithm.
Name: VirusSignature
Class: atom Type: string Syntax: VirusSignature [antivirus software name], [virus name], [signature file date] Description: The common name of a virus as identified from a signature.
Name: VolumeID
Class: atom Type: string Description: The name or other identifier of a disk volume.
Name: DiskID
Class: atom Type: string Description: The name or other identifier (such as manufacturer, model number and serial number) of a disk drive.
Name: MediaID
Class: atom Type: string Description: The name or other identifier (such as media type, manufacturer, model number and serial number) of a computer media device other than a disk drive.
Name: Device
Class: atom Type: string Description: The name or other identifier (such as device type, manufacturer, model number and serial number) of a computer device other than a disk drive. May include network devices such as routers and switches, etc.
Name: ProcedureName
Class: atom Type: string Description: The name of a test, evaluation, investigation, forensic or other procedure.
Name: CourtDecision
Class: atom Type: string Description: The specific citation of a court decision.
Name: Jurisdiction
Class: atom Type: string Description: The jurisdiction within which a court decision was rendered.
Forensic Analysis Atom SIDs

Name: StatisticalMethod
Class: atom Type: string Description: The name or description of the method used to perform a statistical analysis on forensic data.
Investigation Preservation Atom SIDs
Name: ChainOfCustody
Class: atom Type: string Syntax: ChainOfCustody [Custodian Name] Description: The name or other identifier of the entity taking custody of an item of evidence or potential evidence.
Name: CaseName
Class: atom Type: string Description: The name or other identifier of a case under investigation. Do not use CaseName to refer to court cases. Instead, use CourtDecision and Jurisdiction.
Name: EvidenceID
Class: atom Type: string Description: The name or other identifier of a piece of evidence or potential evidence in an investigation.
Name: CertType
Class: atom Type: string Description: The name or other identifier of a certification as applied to an individual, process, program or tool.
Name: CertNumber
Class: atom Type: string Description: The number or other identifier of a specific certification.
Name: PolicyName
Class: atom Type: string Description: The name or other identifier of an organizational policy.
Name: PolicyDate
Class: atom Type: timestamp Description: The effective date of a specific organizational policy.
Name: BackupImageType
Class: atom Type: string Description: The name of a backup image including type (physical or logical), format (by tool developer and tool version) and any other important distinguishing information relative to the type (not the length, content, etc.) of image.
Name: ImageType
Class: atom Type: string Description: The name or other identifier of a graphic image using the filename extension (e.g., jpg, gif, tif) and any additional required specifics (e.g., Microsoft bmp).
Name: CaseNotes
Class: atom Type: string Description: The name or other identifier of an investigator's case notes including the identifier and the name of the investigator.
Investigation Presentation Atom SIDs
Name: Document
Class: atom Type: string Description: The name, description or other identifier of a relevant document.
Name: MediaID
Class: atom Type: string Description: The name or other identifier (such as media type, manufacturer, model number and serial number) of a computer media device other than a disk drive.
Name: MissionImpactStatement
Class: atom Type: string Description: A statement of the impact of the event or incident on the victim's mission. Includes the element of the mission impacted and the nature of the impact.
Time Descriptor Atom SIDs

Name: Time
Class: atom Type: timestamp Description: A moment in time, given in a specified time zone (TZ). If an action necessarily occupies an interval of time, and no other interval is given, this specifies the end of that interval. The format for Time is hh:mm:ss [TZ] ddmmyyyy.
Name: BeginTime
Class: atom Type: timestamp Description: A moment in time, given in a specified time zone (TZ). The format for BeginTime is hh:mm:ss [TZ] ddmmyyyy.
Name: EndTime
Class: atom Type: timestamp Description: A moment in time, given in a specified time zone (TZ). If an action necessarily occupies an interval of time, and no other interval is given, this specifies the end of that interval. The format for EndTime is hh:mm:ss [TZ] ddmmyyyy.
Name: Duration
Class: atom Type: numeric decimal Description: The length of an interval which an event spans. Expressed in seconds.
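The timestamp format hh:mm:ss [TZ] ddmmyyyy used by Time, BeginTime and EndTime can be parsed as in the Python sketch below. Several points are assumptions of this example: the square brackets are read as placeholder notation, so TZ appears as a bare token such as GMT; the time zone label is returned as text rather than resolved to an offset; and the function name parse_dipl_time and the sample value are hypothetical.

```python
import re
from datetime import datetime

# hh:mm:ss, one time zone token, then ddmmyyyy.
_TIME_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+(\S+)\s+(\d{8})$")


def parse_dipl_time(text):
    """Split a timestamp into a naive datetime and its time zone label."""
    match = _TIME_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not in hh:mm:ss TZ ddmmyyyy form: {text!r}")
    clock, tz_label, date = match.groups()
    # strptime also rejects impossible values such as hour 25 or month 13.
    moment = datetime.strptime(f"{clock} {date}", "%H:%M:%S %d%m%Y")
    return moment, tz_label


moment, tz_label = parse_dipl_time("14:05:09 GMT 25102004")
print(moment.isoformat(), tz_label)
```

A Duration, being expressed in seconds, could then be added to a parsed BeginTime with datetime.timedelta(seconds=...) to obtain the corresponding EndTime.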
Host Descriptor Atom SIDs
Name: HostName
Class: atom Type: string Description: The name of a host. It may be incompletely qualified; i.e., it may be 'first' instead of 'first.example.com'; both are allowed.
Name: FQHostName
Class: atom Type: string Description: As above, but a fully qualified host name (e.g., 'ten.ada.net.'). The ending dot is assumed if absent.
Name: IPv4Address
Class: atom Type: number (use dotted decimal format: aaa.bbb.ccc.ddd) Description: The IPv4 address of the host. However, if this SID is accompanied by an IPv4Mask, then the combination refers to a network or subnet.
Name: ArchitectureName
Class: atom Type: string Description: A name of a machine architecture (e.g., 'Intel 586').
Name: OSName
Class: atom Type: string Description: A name of an operating system (e.g., 'SunOS 4.1.3').
Domain Descriptor Atom SIDs
Name: DomainName
Class: atom Type: string Description: A (DNS) name of a domain.
Name: FQDomainName
Class: atom Type: string Description: As above, but fully qualified (e.g., 'ada.net.'). Again, the ending dot is assumed if absent.
Name: IPv4Mask
Class: atom Type: number (use dotted decimal format: aaa.bbb.ccc.ddd) Description: Makes a domain (typically a subnet) out of an accompanying IPv4Address (when used where allowed).
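How an IPv4Address combined with an IPv4Mask denotes a network or subnet can be illustrated with Python's standard ipaddress module. The function name subnet_from_sids and the sample addresses (taken from the 192.0.2.0/24 documentation range) are assumptions of this sketch, not part of DIPL.

```python
import ipaddress


def subnet_from_sids(ipv4_address, ipv4_mask):
    """Combine dotted-decimal address and mask values into a network object."""
    # strict=False allows a host address; the host bits are masked off.
    return ipaddress.IPv4Network(f"{ipv4_address}/{ipv4_mask}", strict=False)


subnet = subnet_from_sids("192.0.2.77", "255.255.255.0")
print(subnet)
print(ipaddress.IPv4Address("192.0.2.200") in subnet)
```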
Protocol Descriptor Atom SIDs
Name: DataLinkProtocol
Class: atom Type: number (0 to 255 only) Description: The data link protocol, as indicated by the following table:
0 = reserved
1 = Ethernet
2 = Token ring
3 = ARC net
4 = IEEE 802.5, SNAP header
5 = IEEE 802.2, FDDI
6 = IEEE 802.3, MAN
7 = SLIP
8 = PPP
9-255 = reserved
Name: StandardTCPPort
Class: atom Type: number (0 to 65,535 only) Description: The standard TCP port for a session protocol, as defined in RFC 1700, or in the "living document" extensions, given as an FTP URL at the end of the port section in RFC 1700. They are used to identify the protocol (e.g., 23 for telnet) in a session.
Name: StandardUDPPort
Class: atom Type: number (0 to 65,535 only) Description: The standard UDP port for a session protocol, as defined in RFC 1700, or in the "living document" extensions, given as an FTP URL at the end of the port section in RFC 1700. They are used to identify the protocol (e.g., 23 for telnet) in a session.
Ethernet Header Atom SIDs
Name: EtherAddress
Class: atom Type: number Description: Ethernet header field. The number is an array 6 octets long separated by colons ( : ). Example: aa:bb:cc:dd:ee:ff
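The EtherAddress layout of six colon-separated octets can be validated as sketched below in Python. The sketch assumes each octet is written as exactly two hexadecimal digits, which the example aa:bb:cc:dd:ee:ff suggests but the text does not state; the function name parse_ether_address is hypothetical.

```python
import string

_HEX_DIGITS = set(string.hexdigits)


def parse_ether_address(text):
    """Convert 'aa:bb:cc:dd:ee:ff' into a 6-byte value."""
    parts = text.strip().split(":")
    well_formed = len(parts) == 6 and all(
        len(part) == 2 and set(part) <= _HEX_DIGITS for part in parts
    )
    if not well_formed:
        raise ValueError(f"not a 6-octet colon-separated address: {text!r}")
    return bytes(int(part, 16) for part in parts)


print(parse_ether_address("aa:bb:cc:dd:ee:ff").hex())
```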
IPv4 Header Atom SIDs
Name: SourceIPv4Address
Class: atom Type: number (use dotted decimal format: aaa.bbb.ccc.ddd) Description: IPv4 header field.
Name: DestinationIPv4Address
Class: atom Type: number (use dotted decimal format: aaa.bbb.ccc.ddd) Description: IPv4 header field.
Name: SourceIPv4Mask
Class: atom Type: number (use dotted decimal format: aaa.bbb.ccc.ddd) Description: Mask for interpreting the accompanying IPv4 address field.
Name: DestinationIPv4Mask
Class: atom Type: number (use dotted decimal format: aaa.bbb.ccc.ddd) Description: Mask for interpreting the accompanying IPv4 address field.
ICMP Descriptor Atom SIDs
Name: ICMPType
Class: atom Type: number Description: ICMP header field.
TCP Header Atom SIDs
Name: TCPPort
Class: atom Type: number (0 to 65,535 only) Description: TCP header field.
Name: TCPSourcePort
Name: TCPDestinationPort
Name: TCPPortRange
Class: atom Type: number Syntax: TCPPortRange [TCPPort 1] [TCPPort 2] Description: Provides the capability to specify a range of TCP ports.
Name: TCPSourcePortRange
Class: atom Type: number Syntax: TCPSourcePortRange [TCPSourcePort 1] [TCPSourcePort 2] Description: Provides the capability to specify a range of TCP source ports.
Name: TCPDestinationPortRange
Class: atom Type: number Syntax: TCPDestinationPortRange [TCPDestinationPort 1] [TCPDestinationPort 2] Description: Provides the capability to specify a range of TCP destination ports.
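To make the port range SIDs concrete, the sketch below parses a range sentence and tests whether a port falls inside it. It rests on several assumptions: the bounds are treated as inclusive, the sentence is written as a name followed by two numbers, the function name parse_port_range is hypothetical, and the upper limit of 65535 is the largest value a 16-bit port field can hold.

```python
MAX_PORT = 65535  # largest value a 16-bit port field can hold

def parse_port_range(sentence: str) -> range:
    """Parse e.g. 'TCPPortRange 1024 2048' into an inclusive range of ports."""
    name, low_text, high_text = sentence.split()
    if not name.endswith("PortRange"):
        raise ValueError(f"not a port range SID: {name}")
    low, high = int(low_text), int(high_text)
    if not 0 <= low <= high <= MAX_PORT:
        raise ValueError(f"invalid port bounds: {low}, {high}")
    # range() excludes its upper bound, so add one to keep both ends.
    return range(low, high + 1)

ports = parse_port_range("TCPDestinationPortRange 1433 1434")
print(1434 in ports)
```

The same function accepts the UDP range SIDs, since it only inspects the name suffix and the two bounds.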
UDP Header Atom SIDs
Name: UDPPort
Class: atom Type: number (0 to 65,535 only) Description: UDP header field.
Name: UDPSourcePort
Name: UDPDestinationPort
Name: UDPPortRange
Class: atom Type: number Syntax: UDPPortRange [UDPPort 1] [UDPPort 2] Description: Provides the capability to specify a range of UDP ports.
Name: UDPSourcePortRange
Class: atom Type: number Syntax: UDPSourcePortRange [UDPSourcePort 1] [UDPSourcePort 2] Description: Provides the capability to specify a range of UDP source ports.
Name: UDPDestinationPortRange
Class: atom Type: number Syntax: UDPDestinationPortRange [UDPDestinationPort 1] [UDPDestinationPort 2] Description: Provides the capability to specify a range of UDP destination ports.
Name: UDPLength
Class: atom Type: number Description: UDP header field.
Mail Descriptor Atom SIDs
Name: MailMessageID
Class: atom Type: string Description: The tag used by an e-mail sending process (e.g., sendmail) to refer to a message, qualified by an associated HostName.
Name: ByteSize
Class: atom Type: numeric decimal Description: The length of a file, data structure (or in general any object whose size can be measured this way) in bytes (octets).
Name: ContentType
Class: atom Type: string Description: The MIME type of an object (e.g., 'text/html').
HTTP Descriptor Atom SIDs

Name: URL
Class: atom Type: string Description: The Uniform Resource Locator for a web, ftp or telnet site.
Authorization Descriptor Atom SIDs
Name: ACL
Class: atom Type: string Description: An access control list, in some format.
Statistics Atom SIDs
Name: TotalCount
Class: atom Type: numeric decimal Description: The total number of objects sampled in a statistics-gathering session.
Attack Descriptor Atom SIDs

Name: IPv4Path
Class: Atom Syntax: IPv4Path [source IPv4 address], [destination IPv4 address] Type: The two arguments are of type string. Description: The path an attacker took, or appeared to take, in an attack. This SID allows only two arguments: the apparent source and apparent destination of the attack. For multiple hops along a suspected path, use multiple instances of IPv4Path.
Name: AttackNickname
Class: atom Type: string Description: A nickname associated with a perceived attack (e.g., 'Ping of Death').
Penetration
Unknown penetration attack (i.e., not in current list) Password guessing attack Use of an unserviced port or port that is prohibited by policy Attempt to use well-known passwords against well-known accounts Send an ICMP redirect message Generic routing protocol attack Generic NFS attack RFC 822 mail from a pipe Fingerd attack TCP hi-jacking Enable or disable features and set values RPC.Admind BackOrifice command usage DNS hostname overflow attack (CA-98.05) DNS length overflow attack (CA-98.05)
E-mail debug attack SMTP decode attack E-mail listserv buffer overflow attack E-mail WIZ attack Perl fingerd attack FTP args core dump attack FTP bounce attack (CA-97.27) FTP privileged port bounce attack FTP privileged port attack FTP CWD ~root attack FTP site exec .. attack FTP site exec tar attack HTTP campas cgi-bin attack HTTP count cgi-bin attack (CA-97.24) HTTP .. attack HTTP glimpse cgi-bin attack HTTP Internet Explorer .BAT file attack HTTP Internet Explorer 3.0 .URL/.LNK attack HTTP IIS 3.0 ASP %2e attack HTTP IIS 3.0 ASP . attack HTTP NCSA httpd buffer overflow attack HTTP Novell convert cgi-bin attack HTTP PHF attack HTTP PHP buffer overflow attack HTTP PHP cgi-bin file read attack HTTP SCO view-source cgi-bin attack HTTP SGI handler cgi-bin attack HTTP SGI Webdist cgi-bin attack HTTP access of Unix password file HTTP webSite Win-C-Sample attack HTTP webSite uploader file upload Ident newlines attack Ident buffer overflow attack IMAP buffer overflow attack INN control message attack INN buffer overflow attack
IP fragmentation attack IRCd buffer overflow attack Kerberos IV user snarf attack (CA-96.03) NFS file handle guess attack NFS mknod attempt NFS UID bug attack NIS buffer overflow attack (CA-98.06) PCNFSd exec attack (CA-96.08) POP buffer overflow attack (CA-97.09 and CA-98.08) HP/UX remote watch attack Rlogin -froot attack SMB SessionSetupAndX password overflow Statd file creation attack Statd buffer overflow attack Sun SNMP Backdoor TFTP get command TFTP put command Windows .pwl password file access attempt Ypupdated exec attack (CA-95.17) Mountd buffer overflow (CA-98.12) Tooltalk stack overflow (CA-98.11) MIME buffer overflow (CA-98.10) FTP Signal Handling vulnerability (CA-97.16) Metamail MIME vulnerability (CA-97.14) Rlogin buffer overflow (CA-97.06) Email MIME buffer overflow (CA-97.05) Talkd stack smashing (CA-97.04) FTP site exec .. attack General buffer overflow, such as eject, ffbconfig, fdformat, etc. Reserved account (not intended to run processes) executed code Authority violation (EUID not Author) User altered environment configuration of other user Unix ps vulnerability Unauthorized password modification Unauthorized modification to system executable Root was acquired by user not designated as an administrator
Root was acquired by an unknown method (not SU, setuid) Anonymous FTP user modified Filesystem FTP login using reserved account name FTP sensitive file retrieval FTP site exec attack FTP core attack Syslog buffer overflow (CA-95.13)
Denial of Service
Unknown denial of service attack (i.e., not in current list) DNS cache poisoning (CA-98.05) SYN flood attack (CA-96.21) Ping Of Death attack (CA-96.26) Land denial of service attack (CA-97.28) Smurf denial of service attack (CA-98.01) Chargen/echo denial of service attack, aka Pepsi (96.01) Ascend kill denial of service attack SMTP Qmail length denial of service attack SMTP Qmail RCPT denial of service attack Redirecting finger HTTP Apache denial of service attack Rwhod buffer overflow attack SNMP attack against WINS server Talk flash attack TearDrop fragmentation attack & variations (CA-97.28) UDP mal-formed packet attack Windows out-of-band denial of service attack (WinNuke) Resource exhaustion: process table Resource exhaustion: file system FTP NLIST denial of service Solaris mail bomb attack
Unusual Access
Unknown unusual access attack (i.e., not in current list) Unusual get/put of a file, such as /etc/passwd Connection not usually expected, but permitted by policy - connection need not complete Unusual traffic pattern No ARP reply indicating host is down HTTP Shell Interpreter Accesses Ident error in request Duplicate IP's Unknown IP protocol RealSecure session kills SelSvc remote holdfile attack Source-routed connections Windows remote registry read ActiveX controls in HTTP traffic Java in HTTP traffic ShockWave applets in HTTP traffic Vulnerable HTTP Client Malformed packets that violate TCP/IP rules Remote use of packet capturing tools Local use of packet capturing tools Errors while connecting to Windows servers Windows null session login (possible anonymous user backdoor) Attempted illegal login Successful illegal login Illegal privilege escalation Local host clock has been set back more than "cnt" seconds Suspicious Unix SYSCALL argument name Unix root core file creation Unix root core file access Suspicious private file alteration Suspicious file creation Modification of system resource Suspicious SETUID file created
Warez client activity Warez server activity Possible Trojan horse execution Critical process killed Possible Loadmodule attack Access denied to a file or object Process subversion Process abuse
Flooding
Unknown flooding attack (i.e., not in current list) Unusually high number of UDP datagrams Unusually high number of ICMP datagrams HTTP Get
Probe
Unknown probe attack (i.e., not in current list) Standard port scan DNS requests for host information DNS Zone Xfer from high port number DNS Zone transfers SMTP Expn: line SMTP Vrfy: line Use of FTP SYST command HTTP nph-test-cgi attack (CA-97.07) HTTP test-cgi attack TCP half scan attack ISS scan (CA-93.14) Portmap dump attack Normal or heavy SATAN scan of a machine (CA-95.07a) Traceroute being used to map the net FTP directory probing
Light SATAN scan of a machine Ping scan Finger probe Port map dump Ruser Mount scan Ping sweep Horizontal scan
Alert Descriptor Atom SIDs
Name: Certainty
Class: atom Type: number Description: The certainty with which an analysis result is believed to hold, as measured by the observer. A value of 100 means absolute certainty. A value of 0 means no certainty; it does not mean that there is absolute certainty that the analysis does not hold (this shouldn't be used very often).
Name: Severity
Class: atom Type: number Description: The severity of an event, as measured by the observer. A value of 0 means that the event is believed to represent no risk to the system (again, this shouldn't be used very often). A value of 100 expresses maximum severity. What represents maximum severity will clearly vary from system to system.
Outcome or Status Descriptor Atom SIDs

Name: ReturnCode
Class: number Type: number (0-6) Description: This is a generic return code value, but designed to supply more information than a bare ReturnCode. This is an enumerated value, according to the following table:
0 = success
1 = pending
2 = failed (no reason given)
3 = failed due to server error
4 = failed, rejected by server
5 = failed due to client error
6 = failed, interrupted by user
Name: TCPConnectionStatus
Class: atom Type: number (0-4) Description: One of the following enumerated values:
0: Connection in an otherwise not-enumerated state.
1: Initial SYN packet sent only
2: Initial SYN and SYN-ACK response from receiver only
3: Three way handshake achieved but no application data sent
4: Connection established in good health and application data transmitted
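The two enumerated SIDs above map naturally onto integer enumerations. The following sketch is only one way to express them; the member names and the helper handshake_completed are assumptions introduced for this example.

```python
from enum import IntEnum

class ReturnCode(IntEnum):
    SUCCESS = 0
    PENDING = 1
    FAILED = 2                # no reason given
    FAILED_SERVER_ERROR = 3
    REJECTED_BY_SERVER = 4
    FAILED_CLIENT_ERROR = 5
    INTERRUPTED_BY_USER = 6

class TCPConnectionStatus(IntEnum):
    OTHER = 0                  # otherwise not-enumerated state
    SYN_SENT = 1               # initial SYN only
    SYN_ACK_RECEIVED = 2       # SYN and SYN-ACK only
    HANDSHAKE_ONLY = 3         # three way handshake, no application data
    ESTABLISHED_WITH_DATA = 4  # application data transmitted

def handshake_completed(status: TCPConnectionStatus) -> bool:
    """True for the two states in which the three way handshake was achieved."""
    return status >= TCPConnectionStatus.HANDSHAKE_ONLY

print(ReturnCode(4).name)
print(handshake_completed(TCPConnectionStatus(2)))
```

Constructing an enumeration from a value outside the listed range raises ValueError, which gives a simple validity check for incoming data.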
Conjunction SIDs
Name: And
Class: conjunction Syntax: (And <Sentence 1> ... <Sentence n>) Description: The 'And' conjunction takes two or more sentences as arguments. If each of the sentences is an event, then the compound sentence headed by 'And' asserts that all the events took place. If each of the sentences is a prescription, then the compound sentence prescribes all of the prescriptions. May Contain: --- The following are referent SIDs: ReferAs ReferTo --- The And SID may contain any number of sentences.
Name: HelpedCause
Class: conjunction Syntax: (HelpedCause <Sentence 1> ... <Sentence n>) Description: Sentence 1 through Sentence n are all true, and furthermore, Sentence2 through SentenceN all represent events. In addition, Sentence 1 helped cause Sentence 2 to happen, Sentence 2 helped cause Sentence 3 to happen, and so forth. Note that Sentence 1 does not have to be an event; it could be a system state that set the stage for Sentence2. May Contain: --- The following are referent SIDs: ReferAs ReferTo --- The HelpedCause SID must contain two sentences.
Name: ByMeansOf
Class: conjunction Syntax: (ByMeansOf <Sentence 1> ... <Sentence n>) Description: Sentence 1 through Sentence n are all events. More precisely, Sentence 1 occurred by means of Sentence 2, and Sentence 2 occurred by means of Sentence 3, and so forth. For example, one logs into a machine by means of opening a telnet session with that machine. May Contain: --- The following are referent SIDs: ReferAs ReferTo --- The ByMeansOf SID may contain any number of sentences.
Referent SIDs
Name: ReferAs
Class: referent Type: ulong
Name: ReferTo
Class: referent Type: ulong
APPENDIX 2 CASE STUDY EXAMPLE FRAGMENTS
A.2.1 An Example Incident Post Mortem Fragment

A.2.1.1 Introduction
In this section we present fragments of an example investigation analyzed using the Digital Investigation Process Language. This analysis is typical in that it represents a review of an investigation that, for a variety of reasons, did not conclude successfully. In virtually every such case, the failure to collect and analyze relevant evidence derives from one or more failures in the investigative process. The inability of the investigation team to draw acceptable conclusions often is a direct outgrowth of these process failures. Because of the length and complexity of the actual investigation, only representative fragments are shown here.
A.2.1.2 DIPL SID Listing

The investigation we use as an example is a post mortem of an infestation of a large enterprise network (VictimEnterprise) by the SQLSlammer worm in January 2003. The SQLSlammer (Sapphire) worm was a very small worm (376 bytes) that infected the Internet very rapidly. According to a paper on the worm prepared by QinetiQ Trusted Information Management in the UK [PH03]: "The worm's targeting and replication algorithms were so aggressive that they were the cause of the problems not the exploitation of the SQL server vulnerability. The worm spread so fast and so far that the networks that it used were so overloaded that they could barely cope. A number of the major backbones were handling so much traffic that they started to degrade, some observing up to 20% packet loss."
The worm entered the victim enterprise at approximately 03:30 GMT on Monday, 28 January 2003 and the victim network was fully saturated by 03:38 GMT on the same day [QTIM03]. A post mortem investigation was conducted in early February 2003. This example analyzes portions of that investigation using the EEDI process. The technical report [QTIM03] comprises the investigation narrative 15 and the DIPL characterization is taken from a sanitized version of that report. The characterization is organized by the DFRWS Framework classes, is modularized, and a brief summary of the narrative is given at the start of the DIPL listing.
We show the example at the top level of abstraction. Each subsequent level of detail represents individual analysis of the components that make up that level of detail. The purposes for increasing levels of detail vary. For example, we may want to detail the specific type of approved software used along with the legal citations proving that it has been court-tested. That level of detail is very specific and is useful for modelling the details of an actual investigation. However, for the purpose of creating a reference model of an idealized investigation against which to compare an actual investigation, we must stay at a higher level of abstraction and abstract out those additional details that have other purposes. Typical reference models in DIPL are shown in Chapter 4. Generally speaking, DIPL SID listings take the form of:
(Conjunction SID if used
  (Verb SID
    (Adverb SIDs if used
      (Role SID
        (Objects such as Atom SIDs)
      )
    )
  )
)
Figure 21 - General Format of DIPL SID Listing
15 The technical report is not available for public viewing due to its confidential nature.
There may be more than one verb if the listing begins with a conjunction such as And or ByMeansOf. Conjunctions may be nested. There may be multiple role SIDs beneath a verb and a role may have multiple objects.
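Because a DIPL listing is a nest of parenthesized sentences, it can be read mechanically into a nested structure. The sketch below is an illustrative reader and not a tool from this research: the function name parse_dipl is hypothetical, and it assumes that every token other than a parenthesis is plain text belonging to the enclosing SID.

```python
import re

def parse_dipl(text: str) -> list:
    """Read a DIPL listing into nested Python lists, checking parentheses."""
    tokens = re.findall(r"[()]|[^\s()]+", text)
    stack = [[]]  # stack[0] collects the top-level sentences
    for token in tokens:
        if token == "(":
            stack.append([])
        elif token == ")":
            if len(stack) == 1:
                raise ValueError("unmatched ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("unmatched '('")
    return stack[0]

print(parse_dipl("(Attack (Target (UDPPort 1434) ) )"))
```

A reader of this kind also makes nesting explicit: a conjunction appears as a list whose later elements are themselves verb lists, and each role list holds its objects.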
A.2.1.2.1 Identification
The worm was not reported officially to security until after it had saturated the network (04:00 GMT, 28 January). It was reported by PCO 16 to information security. At 04:05 GMT security determined from behaviour (per Symantec Virus Exploit Alert V012503) [SYM03] that the attack likely was the W32/SQLSlammer Worm. DIPL listing for this event:
(And
  (ChangeState
    (OldState Correct network operation
      (ArchitectureName VictimEnterprise)
    )
    (CurrentState Network saturated
      (ArchitectureName VictimEnterprise)
      (When (Time 04:00 GMT 28 January 2003) )
    )
    (Observer (RealName PCO) )
  )
  (ByMeansOf
    (Attack
      (AttackSpecifics
        (AttackNickname Unknown denial of service attack)
        (Comment Likely to be W32/SQLSlammer Worm Symantec Virus Exploit Alert V012503)
        (Observer (RealName Security) )
        (When (Time 04:05 GMT 28 January 2003) )
      )
    )
  )
)
Figure 22 - DIPL Listing of Observed State Change in Victim Network
16 The victim's Production Control Office
At 03:34 GMT, 28 January 2003, a subsidiary of the victim reported that a packet containing the worm code was captured by the subsidiary's intrusion detection system showing a source address of computer1 from within the victim's internal network. DIPL listing for this event:
(Attack
  (AttackSpecifics
    (AttackNickname Unknown denial of service attack)
    (Comment W32/SQLSlammer Worm Symantec Virus Exploit Alert V012503 and intrusion detection log of packet)
    (Observer (RealName Victim Subsidiary) )
    (When (Time 04:05 GMT 28 January 2003) )
    (Initiator
      (ArchitectureName VictimEnterprise)
      (HostName computer1)
    )
    (Target (UDPPort 1434) )
  )
)
Figure 23 - DIPL Listing for Capture of Worm Packet by Remote Site
The preceding events in the Identification Class satisfy portions of the following elements:
Event/Crime detection
Resolve signature
Anomalous detection
System monitoring
The SID listings for this and remaining classes either may be concatenated to form a single SID listing or may remain as individual modules that, taken together, characterize the entire investigation. For the purpose of analyzing the overall investigation using model checking (as described in the next section) individual modules are more convenient and easier to explain to a lay audience.
A.2.1.2.2 Preservation
When the investigation team arrived on site they opened a case notes repository and set up an evidence management and logging system to establish chain of custody. At the end of each day, the case notes were placed into the evidence locker and logged into the evidence log. The evidence and the log were maintained by the quality manager on the team. Each time a new piece of evidence was collected (such as a log or sniffer trace) it was entered into the chain of custody. The DIPL listing that follows is the module for entry into the chain of custody and entry into case notes. This module is repeated as necessary throughout the investigation when testing the actual investigation against a structured model of a correct process. The RealName of the initiator (the individual creating the case notes or submitting evidence) and the observer (the individual receiving evidence or case notes for submission to chain of custody) are entered for each instance. There was no time synchronization performed. The DIPL listings for these modules are:
(ManageCase
  (Initiator (RealName [Name Here]) )
  (Data
    (ChainOfCustody [Custodian Name])
    (CaseName [Identifier of the Case])
    (EvidenceID [ID Number])
    ([Other SIDs as appropriate])
  )
  . . .
  (BeginTime [hh:mm:ss TZ ddmmyyyy])
  (EndTime [hh:mm:ss TZ ddmmyyyy])
  (Comment [Additional Information Case Notes])
  (When (Time [hh:mm:ss TZ ddmmyyyy]) )
)

Figure 24 - Case Management DIPL Template
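Since the case management module is repeated for every entry, it lends itself to being generated from a template. The sketch below shows one way this might be done; the function names dipl_time and manage_case are hypothetical, the optional "Other SIDs" slot is omitted, and the entry time (When) is assumed to equal the end time. The sample values are placeholders only.

```python
from datetime import datetime

def dipl_time(moment: datetime, tz: str = "GMT") -> str:
    """Render a timestamp in the template's 'hh:mm:ss TZ ddmmyyyy' layout."""
    return f"{moment:%H:%M:%S} {tz} {moment:%d%m%Y}"

def manage_case(initiator, custodian, case_name, evidence_id, begin, end, comment):
    """Return a ManageCase listing; When is taken to be the end time (assumption)."""
    return " ".join([
        "(ManageCase",
        f"(Initiator (RealName {initiator}) )",
        f"(Data (ChainOfCustody {custodian}) (CaseName {case_name})"
        f" (EvidenceID {evidence_id}) )",
        f"(BeginTime {dipl_time(begin)})",
        f"(EndTime {dipl_time(end)})",
        f"(Comment {comment})",
        f"(When (Time {dipl_time(end)}) )",
        ")",
    ])

# Placeholder values, for illustration only.
listing = manage_case(
    "Investigator1", "Custodian1", "Victim Post Mortem", "Notes-31Jan03-1",
    datetime(2003, 1, 31, 20, 25), datetime(2003, 1, 31, 21, 30),
    "Interview notes",
)
print(listing)
```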
The preceding events in the Preservation class satisfy the following elements:
Case management
Chain of custody
There was no time synchronization performed although this is a mandatory element of the Preservation Class.
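The gap noted above can be expressed as a simple set comparison between the elements a class requires and the elements the investigation satisfied. In the sketch below the reference set is deliberately limited to the three elements named in this section; it is an assumption for illustration and not the full Preservation Class, and the function name missing_elements is hypothetical.

```python
# Illustrative subset only: the three Preservation elements named in this section.
REFERENCE_PRESERVATION = {
    "Case management",
    "Chain of custody",
    "Time synchronization",
}

def missing_elements(reference: set, observed: set) -> list:
    """Return the reference elements that the investigation did not satisfy."""
    return sorted(reference - observed)

observed = {"Case management", "Chain of custody"}
print(missing_elements(REFERENCE_PRESERVATION, observed))
```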
A.2.1.2.3 Collection
The Collection Class comprises most of the investigation. Collecting gross data elements to examine for applicable evidence is the most time-consuming and critical portion of any investigation. It is in this class that we observe the actual step-by-step process of collecting and preserving the data from which we will, eventually, extract evidence for analysis. Thus, it is the longest SID listing. For simplification, we will annotate the blocks of SID objects. Note that Preservation as an element is pervasive in this class. That means that the elements of the Preservation Class are preserved in the Collection Class. For that reason, we will see the SID listings for chain of custody and case notes as they appear above repeated for specific actions within this class. Also note that we begin this class (because we are collecting data that may contain evidence) with the TraceAuthority SID. Before the investigator can extract evidence he or she must be certain that he or she is acting within the legal framework that allows this action. In this case the authority is an internal policy on information protection and privacy. Following the verification of legal authority, the collection process begins. Following the first interview we see that the interview notes have been created and placed into chain of custody. The DIPL listing begins here:
(TraceAuthority
  (Observer (RealName Investigator1) )
  (PolicyName Information Protection and Privacy Policy 17)
  (PolicyDate 1 January 2001)
  (And
    (Interview
      (InterviewSubject (RealName Subject1) )
      (BeginTime 20:25 GMT 31 Jan 2003)
      (EndTime 21:30 GMT 31 Jan 2003)
      (Observer (RealName Investigator1) )
      (Comment Timeline of events and interviewee duties, other key individuals and key meetings conducted or attended)
      (ManageCase
        (Initiator (RealName Investigator1) (Comment Created Notes) )
        (CaseName (RealName Victim Post Mortem)
          (CaseNotes (Time 20:25 GMT 31 Jan 2003) )
          (Custody (EvidenceID Notes-31Jan03-1) (Time 21:35 GMT 31 Jan 2003)
            (Comment Evidence Description)
          )
        )
      )
Figure 25 - DIPL Listing for the Verification of Policy
17 Actual policy names and dates, interviewee names and interviewer names are masked for reasons of privacy and non-disclosure requirements.
The Examination and Analysis classes were, in the case of this investigation, particularly complex and the remaining DIPL listings will not be shown here due to the confidential nature of the information involved. However, each of the steps of the investigation was mapped against the DIPL reference models for the particular classes of the Framework and numerous gaps were discovered. Figures 18 and 19 in Section 5.4.1 show the formal models for failed and successful SQLSlammer attacks against an enterprise such as the one in this example, with the exception that one possible countermeasure, patching of SQL server, is not shown. The full Coloured Petri Net of the simplified failure analysis for this specific incident is shown below. Note that there were four potential countermeasures in place:
Perimeter firewall
VPN firewall
Wireless network firewall
SQLServer patches installed
However, the victim had implemented only one of the potential countermeasures: the perimeter firewall. Thus, as can be seen from the token 1 `sqlslammer_attack in place Target, of colour Successful, the attack had to succeed.
Declarations:
color Attacks = with sqlslammer_attack | other_attack;
color slammer_selected = Attacks;
color Successful = Attacks;
color Inhibitors = with configured | not_configured;
color Inhibited = Inhibitors;
var attack : Attacks;
var countermeasure : Inhibitors;
var perimeter : Inhibitors;
var vpn : Inhibitors;
var wireless : Inhibitors;
[Figure: Coloured Petri Net diagram. Recoverable labels: DOS Attacks, Select Attack, Initiator, Deliver Attack, Target, Countermeasures, Perimeter Firewalled, VPN Firewalled, Wireless Firewalled. Guard on Select Attack: [attack = sqlslammer_attack]. Guard on Deliver Attack: [(attack = sqlslammer_attack), (countermeasure <> configured)].]
Figure 26 - Successful SQLSlammer Attack Against Example Net With Failed Countermeasures
Because the Deliver Attack transition saw at least 1 countermeasure token that did not equal configured, (potentially, it could have seen three) the transition fired and the sqlslammer_attack token was allowed to pass on to place Target.
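The firing condition described above can be illustrated outside a Petri net tool. The sketch below is a simplified reading of the Deliver Attack guard in plain Python, not a Coloured Petri Net simulator; the function names are hypothetical, and the marking of one configured and three not_configured countermeasure tokens follows the four potential countermeasures listed earlier, of which only the perimeter firewall was implemented.

```python
def deliver_attack_enabled(attack: str, countermeasure: str) -> bool:
    """Guard of Deliver Attack for one binding of the two variables."""
    return attack == "sqlslammer_attack" and countermeasure != "configured"

def attack_reaches_target(attack: str, countermeasures: list) -> bool:
    """The transition can fire if any countermeasure token satisfies the guard."""
    return any(deliver_attack_enabled(attack, token) for token in countermeasures)

# Perimeter firewall configured; VPN, wireless and SQL Server patch not configured.
marking = ["configured", "not_configured", "not_configured", "not_configured"]
print(attack_reaches_target("sqlslammer_attack", marking))
```

Under this reading, the attack token is blocked only when every countermeasure token carries the value configured.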
A.2.2 An Example Incident Investigation Fragment

A.2.2.1 Introduction
The example in section A.2.1 concludes with a formal model of the failure analysis for the incident post mortem. In this example we show an investigation of a security breach that resulted in the apprehension of the perpetrator. As in the preceding example we demonstrate fragments only, since the entire investigation was confidential and conducted under non-disclosure. As proof of concept we will, in this example, model some of the DIPL fragments using Coloured Petri Nets. For the purpose of validation of the DIPL we suggest that this modelling has value. Also, similar to A.2.1, we use a small investigation as an example, even though the EEDI process in general and the DIPL in particular may be considered overkill for an investigation of this limited scope. The reasoning, as in the previous example, is that full DIPL characterization of an investigation of appropriate complexity would be well beyond the length allowed for this thesis. While in A.2.1 we excerpted from a larger, more complex investigation for the sake of brevity, this example is relatively complete. Exceptions are in areas where we needed to protect the anonymity of the victim and where expansion for the purposes of this thesis would serve no useful purpose.
Finally, again as in the above example, this investigation was completed without the use of either the EEDI process or the DIPL. The investigation was revisited using the processes and techniques described in this thesis and, in the case of A.2.1, results were obtained using modelling that the investigators were unable to obtain during the actual investigation. In this example (A.2.2) the analysis supported the results obtained during the actual investigation. These two different outcomes are discussed briefly in Chapter 6. In this example we refer to the End-to-End Digital Investigation process illustrated in Figure 27.
[Figure: process flow from INVESTIGATIVE NARRATIVE to DIPL CHARACTERIZATION to CPN MODELLING to FORMAL MODEL OF INVESTIGATION]
Figure 27 - Generalized EEDI Process Flow
A.2.2.2 Incident Background

The incident took place in the early 1990s. An MSNT system administrator for a credit union associated with a large corporation left his employment during a disagreement with his management. Subsequently he went to work as a contract programmer for the large corporation. Shortly after his move to the large corporation, he attacked the NT server that he had administered at his former employer. The credit union had not changed the administrator password and there was network access between the large corporation and the credit union that facilitated the attack.
The investigation was performed using the tools available at the time. It was not characterized using the techniques described in this thesis. Subsequently, as part of the research for this thesis, we revisited the notes from that investigation and used the EEDI process to characterize it. Additionally, as proof of concept and verification of the process and the DIPL, we modelled the investigation, some of which is shown below. For the purposes of this thesis, we have limited the characterization and modelling somewhat to conserve space. In the following sections we follow the general process flow shown above in Figure 27. Dates, IP addresses and other explicit information have been masked to preserve the confidentiality of the organizations involved.
A.2.2.2.1 Investigative Narrative

In this example we refer to the credit union simply as "Credit Union". We refer to the large corporation as "Corporation". The narrative is somewhat simplified from the original case notes to allow for DIPL characterization in a reasonable space. On January 22, 1992, at 12:25, investigator Stephenson was contacted by the Credit Union. The temporary system administrator for the Credit Union had been notified earlier that the mortgage database was not responding. At 13:05 on that date investigator Stephenson opened a case file. Upon arriving at the site, investigator Stephenson verified the presence of an appropriate privacy policy and proceeded, at 14:52 on January 22, 1992, to take an image of the affected server using SafeBack version 3.0. Server logs for server "Server 1" were extracted and preserved from the image at 16:20. The image and the server logs were placed into chain of custody at 16:25 on January 22, 1992. Entries were made in the case log. At 16:30 on January 22, 1992, investigator Stephenson
251
commenced
interviews
with
operator
personnel.
Interviews lasted until 18:25 on January 22, 1992. At 18:25 on January 22, 1992, investigator Stephenson made entries in the case log. At 09:30 on January 23, 1992, investigator Stephenson resumed interviews with operator personnel. Interviews lasted until 14:15 on January 23, 1992. Entries were made in the case log. At 15:10 on January 23, 1992, investigator Stephenson commenced interviews with administrative and
management personnel. The interviews continued until 17:30 on January 23, 1992. Entries were made in the case log. At 17:35 on January 23, 1992 investigator Stephenson made entries in the case log. At 09:15 on January 24, 1992 investigator Stephenson analyzed the SafeBack image of Server 1, and determined that the attacker had logged in as user: Administrator. There was no evidence of hacking. At that time investigator Stephenson determined the machine name of the attack computer, the IP address of the attack computer, and its probable location, as well as the probable path of the attack. Entries were made in the case log at 15:30. At 15:45 investigator Stephenson received the log from Banyan Vines (network system manufacturer) gateway between the Corporation and the Credit Union extracted the MAC address of the attack computers IP address from that log. Made an entry in the case log. At 09:00 on January 25, 1992 investigator Stephenson
252
visited the site where the attack computer was located. At 09:30 on January 25, 1992 investigator Stephenson verified the privacy policies of the Corporation (the location of the suspected attack computer) and performed an image of the attack computer using SafeBack version 3.0. Entries were made in the case log at 12:15. At 13:00, the IP address, machine name, and MAC address of the suspected attack computer were verified. Entries were made in the case log at 13:10. At 14:05 on January 25, 1992 investigator Stephenson analyzed the image from the attack computer. An entry was made in the case log at 16:25. At 09:00 on January 26, 1992 investigator Stephenson analyzed the swipe card log for the card access on the floor where the attack computer was located, and verified that the user of the attack computer, now the primary suspect, was present on the floor at the time of the incident. An entry was made in the case log at 09:15. At 09:30 on January 26, 1992 investigator Stephenson interviewed co-workers of the suspect and determined that the suspect was present at his desk within three hours of the incident. Interview notes were entered into the case log at 11:50. At 14:30 on January 26, 1992 investigator Stephenson interviewed the suspect's supervisor and determined that the supervisor had placed a telephone call to the suspect at his desk (i.e. at his computer) within three minutes of the incident. An entry was made in the case
log at 15:30. At 09:00 on January 27, 1992 investigator Stephenson and the suspect's supervisor interviewed the suspect, presented the evidence against the suspect, and obtained a confession from the suspect. Entries were made in the case log at 10:30 on January 27, 1992.
A.2.2.2.2 DIPL Characterization

For the purposes of this example, we break the investigation down from the narrative above. Presenting the DIPL listing adjacent to the narrative, as in the figure below, allows the reader to match each narrative entry with its listing easily. In practice, the DIPL characterization for each entry in the narrative is produced at the time the narrative entry is written. The result is a collection of sequential DIPL characterizations for the overall investigation. These characterizations represent the detailed sequence of investigative steps and may, as will be seen in A.2.2.2.3, be modelled and compared against a reference model if desired.
NARRATIVE (numbered entries) and DIPL LISTING (indented beneath each entry)

1. Call received by investigator Stephenson, January 22, 1992, at 12:25

    (And
    (ReceiveComplaint
      (Initiator (RealName Joe Operator) )
      (Receiver (RealName Peter Stephenson) )
      (When (BeginTime 12:25 22 Jan 1992) )
      (AttackNickName access denied to a file or object)

2. Mortgage database not responding

      (FileName Mortgages.db)
      (Target (HostName Server-1) )
      (When (BeginTime 10:32 22 Jan 1992) ) )

3. January 22, 1992, 13:05 case file opened by investigator Stephenson

    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 13:05 22 Jan 1992) )

4. 14:52 verified policy

    (TraceAuthority
      (Policy
        (PolicyName Credit Union Information privacy Policy)
        (PolicyDate 1 Jan 1990)
        (Observer (RealName Stephenson) )
        (BeginTime 15:42 22 Jan 1992) ) )

5. 14:52 imaged Server 1 using SafeBack v3.0

    (ImageUsing
      (Initiator (RealName Stephenson)
        (ApprovedMethod
          (Certification (Certifier (RealName NTI) ) (CertType NTI Training) (CertNumber Course 1-190) ) ) )
      (BeginTime 14:52 22 Jan 1992) )
      (Machine (HostName Server 1) (VolumeID Applications) (DiskID SEA1294-9832-110A) )
      (Tool SafeBack (ProgramName SafeBack) (FileName safeback.exe) (VersionNumber 3.0)
        (ApprovedSoftware (Citation (CaseName Joe v Volcano) ) ) )
      (ReferAs 0x12345678) )

6. Extracted logs from Server 1 image at 16:20.

    (CollectData
      (Initiator (RealName Stephenson) )
      (When (BeginTime 16:20 22 Jan 1992) )
      (Tool TextSearch (ProgramName TextSearch) (FileName txtsrch.exe) (VersionNumber 1.0) )
      (Target (ReferTo 0x12345678) )
      (Data Server Logs (ReferAs 0x87654321) ) )

7. Placed logs and image in chain of custody at 16:25

    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (ChainOfCustody Stephenson)
      (Data (ReferTo 0x12345678) (ReferTo 0x87654321) )
      (When (BeginTime 16:25 22 Jan 1992) ) )

8. Entry made in case log

    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 16:25 22 Jan 1992) )

9. 16:30-18:25, 22 January 1992 Stephenson interviewed operators

    (ConductInterview
      (Initiator (RealName Joe Operator) (RealName Jane Operator) )
      (Observer (RealName Stephenson) )
      (When (BeginTime 16:30 22 Jan 1992) (EndTime 18:25 22 Jan 1992) ) )

10. 18:25 entry in case log

    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 18:25 22 Jan 1992) )

11. 09:30-14:15, 23 January 1992, interviews with operators

    (ConductInterview
      (Initiator (RealName James Operator) (RealName JoAnn Operator) )
      (Observer (RealName Stephenson) )
      (When (BeginTime 09:30 23 Jan 1992) (EndTime 14:15 23 Jan 1992) ) )
    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 14:15 23 Jan 1992) )

13. 15:10-17:30 interviews with administrators and managers

    (ConductInterview
      (Initiator (RealName Greg Admin) (RealName Mary Manager) )
      (Observer (RealName Stephenson) )
      (When (BeginTime 15:10 23 Jan 1992) (EndTime 17:30 23 Jan 1992) ) )
    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 17:30 23 Jan 1992) )

15. 09:15 on January 24, 1992 Stephenson analyzed the SafeBack v 3.0 image of Server 1, and determined that the attacker had logged in as Administrator. Determined the machine name of the attack computer is Corp-HP100-100, the IP address of the attack computer is 10.10.10.212, and its probable location is at the Corporation room 1212, as well as the probable path of the attack through Banyan Vines gateway between Corporation and Credit Union.

    (RecoverData
      (Initiator (RealName Stephenson) )
      (BeginTime 09:15 24 Jan 1992)
      (FileSource (ReferTo 0x12345678) )
      (Link Attacker: Corp-HP100-100, Victim: Server 1)
      (Link Login Attacker, Administrator)
      (Link Corp-HP100-100, IP Address 10.10.10.212)
      (Link Corp-HP100-100 Location, Corporation Room 1212)
      (Link IPGateway-1, IP Address 10.1.1.1)
      (Link Server 1, IP Address 10.20.20.34)
      (TracePath (IPv4Path 10.10.10.212, 10.1.1.1.) (IPv4Path 10.1.1.1., 10.20.20.34) ) )
    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 15:30 24 Jan 1992) )

17. 15:45 Stephenson received gateway log and extracted MAC address for 10.10.10.212 from a session through the gateway at the time of the incident.

    (RecoverData
      (Initiator (RealName Stephenson) )
      (BeginTime 15:45 24 Jan 1992)
      (FileSource (FileName IPGateway1\adm\logs\authlog-1.txt) )
      (Link IP Address 10.10.10.212, MAC Address 00:08:02:d5:47:b8)
      (Data (ReferAs 0x56781234) ) )

18. 15:45 entry in the case log

    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 15:45 24 Jan 1992) )

19. 09:30 on January 25, 1992 investigator Stephenson verified the privacy policies of the Corporation (the location of the suspected attack computer).

    (TraceAuthority
      (Policy
        (PolicyName Corporation Information privacy Policy)
        (PolicyDate 1 Jan 1985)
        (Observer (RealName Stephenson) )
        (BeginTime 09:30 25 Jan 1992) ) )

20. Performed an image of the attack computer using SafeBack version 3.0 at 09:35.

    (ImageUsing
      (Initiator (RealName Stephenson)
        (ApprovedMethod
          (Certification (Certifier (RealName NTI) ) (CertType NTI Training) (CertNumber Course 1-1-90) ) ) )
      (BeginTime 09:35 25 Jan 1992) )
      (Machine (HostName Corp-HP100-100) (VolumeID Volume-1) (DiskID SEA1294-11783-220) )
      (Tool SafeBack (ProgramName SafeBack) (FileName safeback.exe) (VersionNumber 3.0)
        (ApprovedSoftware (Citation (CaseName Joe v Volcano) ) ) )
      (ReferAs 0x34567812) )

21. Entries were made in the case log at 12:15.

    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 12:15 25 Jan 1992) )

22. At 13:00, the IP address, machine name, and MAC address of the suspected attack computer were verified against logs.

    (MatchPattern
      (Initiator (RealName Stephenson) )
      (FileSource (ReferTo 0x56781234) )
      (Comment Match to IP Address, MAC Address, and Machine Name of Attack Machine)
      (BeginTime 13:00 25 Jan 1992) )

23. Entries were made in the case log at 13:10.

    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 13:10 25 Jan 1992) )

24. At 14:05 on January 25, 1992 investigator Stephenson analyzed the image from the attack computer.

    (RecoverData
      (Initiator (RealName Stephenson) )
      (BeginTime 14:05 25 Jan 1992)
      (FileSource (ReferTo 0x34567812) ) )

25. An entry was made in the case log at 16:25.

    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 16:25 25 Jan 1992) )

26. At 09:00 on January 26, 1992 Stephenson analyzed the swipe card log for the card access on the floor where the attack computer was located.

    (RecoverData
      (Initiator (RealName Stephenson) )
      (BeginTime 09:00 26 Jan 1992)
      (FileSource (FileName CardAccessSystem\logs\1212\012692)
        (Suspect (RealName Tom Suspect) )
        (Link Tom Suspect, 10:32 22 Jan 1992) ) )
    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 09:15 26 Jan 1992) )

28. At 09:30 on January 26, 1992 investigator Stephenson interviewed coworkers of the suspect.

    (ConductInterview
      (Initiator (RealName Minnie Coworker) (RealName John Coworker) )
      (Observer (RealName Stephenson) )
      (When (BeginTime 09:30 26 Jan 1992) ) )

29. Interview notes were entered into the case log at 11:50.

    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 11:50 26 Jan 1992) )

30. At 14:30 on January 26, 1992 investigator Stephenson interviewed the suspect's supervisor.

    (ConductInterview
      (Initiator (RealName Michelle Supervisor) )
      (Observer (RealName Stephenson) )
      (When (BeginTime 14:30 26 Jan 1992) ) )
    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 15:30 26 Jan 1992) )

32. At 09:00 on January 27, 1992 Stephenson and the suspect's supervisor interviewed the suspect, and obtained a confession from the suspect.

    (ConductInterview
      (Initiator (RealName Tom Suspect) )
      (Observer (RealName Stephenson) (RealName Michelle Supervisor) )
      (When (BeginTime 14:30 26 Jan 1992) )
      (Comment Obtained confession in writing from Tom Suspect) )

33. Entries were made in the case log at 10:30 on January 27, 1992.

    (ManageCase
      (Initiator (RealName Stephenson) )
      (CaseName Case 123)
      (BeginTime 10:30 27 Jan 1992)
      (Comment Case closed) ) )
Figure 28 - DIPL Listing of Example Investigation
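As an illustration of how a DIPL sentence such as those in the listing might be handled by software, the following is a minimal Python sketch that reads one parenthesised sentence into a nested list. It is not part of the thesis tooling: the function names tokenize, parse and find are hypothetical, and treating a run of words such as "Case 123" as a single string value is an assumption made for this example only.

```python
def tokenize(text):
    """Split a DIPL string into '(' , ')' and word tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one parenthesised expression into [SID, item, ...].

    Each item is either a nested list (a sub-clause) or a string made of
    consecutive words joined by spaces. Consumes tokens from the front.
    """
    if not tokens or tokens.pop(0) != "(":
        raise ValueError("expected '('")
    if not tokens or tokens[0] in ("(", ")"):
        raise ValueError("expected a SID after '('")
    node = [tokens.pop(0)]  # first word names the verb, role or atom
    words = []
    while tokens:
        if tokens[0] == "(":
            if words:
                node.append(" ".join(words))
                words = []
            node.append(parse(tokens))
        elif tokens[0] == ")":
            tokens.pop(0)
            if words:
                node.append(" ".join(words))
            return node
        else:
            words.append(tokens.pop(0))
    raise ValueError("missing ')'")


def find(node, sid):
    """Return the first sub-clause named sid (depth-first), or None."""
    for item in node[1:]:
        if isinstance(item, list):
            if item[0] == sid:
                return item
            hit = find(item, sid)
            if hit is not None:
                return hit
    return None


# One case-log sentence taken from the listing above.
sentence = ("(ManageCase (Initiator (RealName Stephenson) ) "
            "(CaseName Case 123) (BeginTime 16:25 22 Jan 1992) )")
tree = parse(tokenize(sentence))
```

With the sentence above, tree[0] is the verb ManageCase and find(tree, "BeginTime") returns the time clause. A sentence with a missing closing parenthesis raises ValueError, which gives a simple check that a characterization is well-formed before it is modelled.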
A.2.2.2.3 Selected Modelling

In this section we select portions of the DIPL listing in A.2.2.2.2 above and model them using Coloured Petri Nets. As discussed in Chapter 6, this modelling comprises one method of validating the DIPL characterization of the EEDI process. However, the reader will have noted that the example used above is an investigation that was not, originally, conducted using the EEDI methodology. Another useful approach in cases such as this one is to create an idealized investigative reference model and measure the actual investigation against it. This could be used to validate (or invalidate) the work of another, perhaps opposing, investigator.

There are two possible approaches to modelling an investigation. The first is modelling each DIPL sentence as a CPNet module or page and connecting the individual modules through one or more transitions. The second is modelling the investigation as a whole, modelling individual clauses where necessary. For small investigations, the latter makes more sense since the investigation often may fit onto a single CPNet page. For large, complex investigations a variant of the former is the better technique. In that approach individual clauses may be modelled as re-usable modules and used as necessary.

For the purposes of this example, we excerpt sections of the DIPL characterization above and, as proof of concept, develop a model of the actual process. Rather than sticking
exclusively to one or the other of the two approaches, we mix them as necessary to get the most efficient and simplest CPNet. This example differs from the example in A.1 in that this example models the investigation while the A.1 example models the outcome. The CPNet for this investigation appears in Figure 29 below. The declarations for the CPNets in Figures 29 and 30 are shown in Figure 31.
Figure 29 - CPNet for the Investigation in A.2.2 in Pre-set State
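The idea, raised earlier in this section, of measuring an actual investigation against an idealized reference model can be sketched very simply outside of a CPNet. The Python fragment below is only an illustration under stated assumptions: the prerequisite rules in REFERENCE are hypothetical and are not the reference model of this thesis, and the function name deviations is invented for the example. It takes the sequence of DIPL verbs from the first sentences of the listing and reports any verb that appears before its required predecessors.

```python
# Verbs of the first seven DIPL sentences in the listing, in order.
ACTUAL = [
    "ReceiveComplaint", "ManageCase", "TraceAuthority", "ImageUsing",
    "CollectData", "ManageCase", "ManageCase",
]

# Hypothetical reference model: each verb maps to the verbs that must
# already have occurred before it. These rules are assumptions made for
# illustration only.
REFERENCE = {
    "ManageCase": {"ReceiveComplaint"},
    "TraceAuthority": {"ManageCase"},
    "ImageUsing": {"TraceAuthority"},
    "CollectData": {"ImageUsing"},
}


def deviations(actual, reference):
    """Return (position, verb, missing prerequisites) for each violation."""
    seen = set()
    problems = []
    for position, verb in enumerate(actual, start=1):
        missing = reference.get(verb, set()) - seen
        if missing:
            problems.append((position, verb, sorted(missing)))
        seen.add(verb)
    return problems


# The example investigation satisfies the hypothetical rules.
conforming = deviations(ACTUAL, REFERENCE)

# An investigation that images a machine without first verifying policy.
flawed = deviations(["ReceiveComplaint", "ManageCase", "ImageUsing"], REFERENCE)
```

Here conforming is an empty list, while flawed reports that the third step, ImageUsing, lacks a preceding TraceAuthority. A CPNet expresses the same kind of constraint through places and transitions and, unlike this sketch, can also represent concurrency and timing.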
The CPNet in Figure 29 shows the first six steps of the example investigation. It begins with the notification of an event and applies all of the proper tests (policy verified, approved method, approved software and entry made in case log), leading to the point of completing
an image of the server (Server 1). The simulator completes successfully, validating the investigation to that point, if the appropriate requirements (policy verified, approved method and case log entry made) are set to an initial marking of true, the software used is one of the approved versions, and the court citation validating it is correct. If any of these tests fails, the image cannot be made. Figure 29 shows the CPNet in its pre-set state. Figure 30 shows the same CPNet in a successful post-set state.
Figure 30 - CPNet for the Investigation in A.2.2 in Successful Post-set State
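The gating behaviour described above, in which the imaging step can occur only when every test passes, can be illustrated with a small Python sketch. This is a simplification and not the CPNet of Figures 29 and 30: the place names, the APPROVED table and the functions can_image and fire_image are hypothetical, the marking is reduced to a dictionary of booleans, and firing does not consume tokens as a true Petri net transition would.

```python
# Places that must hold a true marking before the image can be made.
PRECONDITIONS = ("policy_verified", "approved_method", "case_log_entry")

# Hypothetical table of approved software versions and the court citation
# that validates each one, using the values that appear in the listing.
APPROVED = {("SafeBack", "3.0"): "Joe v Volcano"}


def can_image(marking, tool, version, citation):
    """Return True if the imaging transition is enabled."""
    tests_pass = all(marking.get(place, False) for place in PRECONDITIONS)
    key = (tool, version)
    software_ok = key in APPROVED and APPROVED[key] == citation
    return tests_pass and software_ok


def fire_image(marking, tool, version, citation):
    """Return the post-set marking; unchanged copy if not enabled."""
    post = dict(marking)
    if can_image(marking, tool, version, citation):
        post["image_complete"] = True
    return post


# Pre-set state with all requirements marked true.
pre_set = {
    "policy_verified": True,
    "approved_method": True,
    "case_log_entry": True,
    "image_complete": False,
}
post_set = fire_image(pre_set, "SafeBack", "3.0", "Joe v Volcano")

# The same attempt with the policy test failing leaves the image unmade.
blocked = fire_image(dict(pre_set, policy_verified=False),
                     "SafeBack", "3.0", "Joe v Volcano")
```

In this sketch post_set["image_complete"] becomes true, while blocked["image_complete"] stays false. An unapproved version or an incorrect citation blocks the step in the same way.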
In this case, the model is pre-made and the initial markings of the places determine whether or not the investigation has progressed correctly. In a complex investigation the CPNet in Figures 29 and 30 might represent a single page of many. These pages then converge in a top-level page where the correctness of the investigative process as a whole may be seen. More complex CPNets, including such enhancements as timing relationships, evaluation of functions and tests for concurrency, allow the handling of more complex investigations.
Here, we show a simple net for illustrative purposes. The reader should have no trouble understanding the markings for this CPNet.
Figure 31 - Declarations for the CPNets in Figures 29 and 30