Sang H. Son Department of Computer Science University of Virginia Charlottesville, Virginia 22903 son@cs.virginia.edu
Outline
Introduction: real-time database systems and real-time data services
Why real-time databases?
Misconceptions about real-time DBS
Paradigm comparison
Characteristics of data and transactions in real-time DBS
Origins of time constraints
Temporal consistency and data freshness
Time constraints of transactions
Real-time transaction processing
Priority assignment
Scheduling and concurrency control
Overload management and recovery
Outline (contd)
Advanced real-time applications
Active, object-oriented, main-memory databases
Flexible security paradigm for real-time databases
Embedded databases
Real-world applications and examples
Real-time database projects and research prototypes
BeeHive system
Research issues, trends, and challenges
Exercises
I. Introduction
Outline
Motivation: why real-time databases and data services?
A brief review: real-time systems
Misconceptions about real-time DBS
Comparison of different paradigms:
real-time systems vs. real-time database systems
conventional DBS vs. real-time DBS
Real-time ≠ fast: real-time means meeting timing constraints, which may be on the order of nanoseconds or of seconds
Rapid growth in research and development: workshops, conferences, journals, commercial products, standards (POSIX, RT-Java, RT-CORBA, etc.)
Time Constraints
[Figure: value v(t) of a result as a function of completion time t. With a soft deadline, the value v0 begins to diminish after deadline d1 and drops to zero by d2.]
But
Not all data in RTDB is durable: different types of data need to be handled differently (discussed further later)
Correctness can be traded for timeliness. Which is more important? It depends on the application, but timeliness is more important in many cases
Atomicity can be relaxed: monotonic queries and transactions
Isolation of transactions may not always be needed: temporally-correct serializable schedules rather than all serializable schedules
designed to provide good average response time, while possibly yielding unacceptable worst-case execution times
resource management and concurrency control in conventional database systems do not support timeliness and predictability
not sufficient: although main-memory-resident databases eliminate disk delays, conventional databases still have many sources of unpredictability, such as delays due to blocking on locks and transaction scheduling
increases in performance cannot completely make up for the lack of time-cognizant protocols in conventional database systems
while both temporal DBs and RTDBs support time-specific data operations, they support different aspects of time:
in an RTDB, timely execution is the primary concern; in a temporal DB, fairness, resource utilization, and the ACID properties of transactions are more important
An RTDBS guarantee is meaningless unless H/W and S/W never fail?
true, in part, due to the complexity involved in predictable and timely execution
but it does not justify the designer failing to reduce the odds of missing critical timing constraints
Reference: Stankovic, Son, and Hansson, "Misconceptions About Real-Time Databases," IEEE Computer, June 1999.
Outline
Real-time transactions
Issues to be discussed:
1. What are the origins of (the semantics of) time constraints of the data, events, and actions?
Example #1 (contd)
Features of an object must be collected while the object is still in front of the camera.
The current object and its features apply only to the object in front of the camera; they lose validity once a different object enters the system.
Designer Artifacts
Subsequent decisions of the database system designer introduce additional constraints:
the type of computing platform used (e.g., centralized vs. distributed)
the type of software design methodology used (e.g., data-centric vs. action-centric)
the (pre-existing) subsystems used in composing the system
the nature of the actions (e.g., monolithic actions vs. graph-structured or triggered actions)
Time constraints reflect the specific design strategy and the subsystems chosen as much as the externally imposed timing requirements
Determining all related time constraints in an optimal fashion for non-trivial systems is intractable => divide and conquer (and live with acceptable decisions)
Multi-layer decision process The decisions made at one level affect those at the other level(s) While no decision at any level is likely to be unchangeable, cost and time considerations will often prevent overhaul of prior decisions
Specifications of minimal separation between response to a stimulus and the next stimulus -> property of the sporadic activity that deals with that stimulus
An Example
Data object is specified by (value, absolute validity interval, time-stamp)
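This triple lends itself to a direct freshness check. A minimal sketch in Python; the class and field names are illustrative, not from the slides:

```python
from dataclasses import dataclass

@dataclass
class DataObject:
    """A real-time data object: (value, absolute validity interval, timestamp)."""
    value: float
    avi: float        # absolute validity interval, in time units
    timestamp: float  # time of the last update

    def is_fresh(self, now: float) -> bool:
        # The value is temporally valid while its age does not exceed the avi.
        return now - self.timestamp <= self.avi

temp = DataObject(value=350.0, avi=5.0, timestamp=100.0)
print(temp.is_fresh(103.0))  # True: the value is 3 time units old, avi = 5
print(temp.is_fresh(106.0))  # False: 6 > 5, the value is stale
```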
Aperiodic
- If temperature > 1000 within 10 secs add coolant to reactor
Based on value:
Hard: must execute before deadline
Firm: abort if not completed by deadline
Soft: diminished value if completed after deadline
Result useful even after deadline => a soft-deadline system must reassign successors' parameters so that the overall end-to-end time constraints are satisfied
Firm and soft time constraints offer the system flexibility - not present with hard or safety-critical time constraints
Outline
Priority assignment
Scheduling paradigms
Priority inversion problem
Concurrency control protocols
Predictability issues
Overload management and recovery
Priority Assignment
Different approaches:
EDF: earliest deadline first
highest value (benefit) first
highest (value / computation time) first
complex function of deadline, value, and slack time
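These priority assignment policies can be expressed as key functions for sorting a ready queue (smaller key = higher priority). A sketch with illustrative transaction fields; the weighting in the composite function is an arbitrary choice, not from the slides:

```python
def edf_key(txn, now):
    # Earliest deadline first: smaller deadline = higher priority.
    return txn["deadline"]

def value_key(txn, now):
    # Highest value (benefit) first; negated so smaller key = higher priority.
    return -txn["value"]

def value_density_key(txn, now):
    # Highest (value / computation time) first.
    return -txn["value"] / txn["exec_time"]

def composite_key(txn, now, w=0.5):
    # One possible complex function of deadline, value, and slack time.
    slack = txn["deadline"] - now - txn["exec_time"]
    return w * slack - (1 - w) * txn["value"]

txns = [
    {"id": "T1", "deadline": 20, "value": 10, "exec_time": 5},
    {"id": "T2", "deadline": 15, "value": 4, "exec_time": 2},
]
# Dispatch order under EDF:
order = sorted(txns, key=lambda t: edf_key(t, now=0))
print([t["id"] for t in order])  # ['T2', 'T1']
```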
Scheduling Paradigms
Scheduling analysis, or feasibility checking, of real-time computations can predict whether timing constraints will be met. Several scheduling paradigms emerge, depending on:
whether a system performs schedulability analysis
if it does, whether it is done statically or dynamically
whether the result of the analysis itself produces a schedule or plan according to which computations are dispatched at run time
Different Paradigms
1. Static table-driven approaches:
perform static schedulability analysis
the resulting schedule is used at run time to decide when a computation must begin execution
2. Static priority-driven preemptive approaches:
perform static schedulability analysis but, unlike the previous approach, construct no explicit schedule
at run time, computations are (typically) executed highest-priority-first
example: rate-monotonic priority assignment, where priority is assigned proportional to frequency
All transactions have to meet their timing constraints: best-effort is not enough
For soft deadlines, the transaction is in general allowed to run to completion, even if the deadline is missed
Various time-cognizant concurrency control protocols have been developed, many of which are extensions of two-phase locking (2PL), timestamp ordering, and optimistic concurrency control
[Timeline diagrams: a transaction T is activated and later begins execution; it reads data objects X and Y, each with a data deadline (DD(X), DD(Y)) that may fall before the deadline of T. If a data deadline expires before the corresponding read, T must be aborted; otherwise T can complete by its deadline. DD = data deadline.]
T2:
read_lock(X); read_object(X);
write_lock(Y); unlock(X);
read_object(Y);
Y = X + Y;
write_object(Y); unlock(Y);
T2:
read_lock(Y); read_object(Y);
...
2PL (like any other locking scheme) relies on blocking the requesting transaction if the data is already locked in an incompatible mode. What if a high-priority transaction needs a lock held by a low-priority transaction? Possibilities are:
let the high-priority transaction wait
abort the low-priority transaction
let the low-priority transaction inherit the high priority and continue execution
The first approach results in a situation called priority inversion. Several conflict resolution techniques are available, but the ones that use both deadline and value show better performance
T2:
read_lock(X); read_object(X);
write_lock(Y); unlock(X);
[Timeline: execution over time.]
Priority abort:
abort the low-priority transaction - no blocking at all
quick resolution, but wasted resources
Priority inheritance:
execute the blocking (low-priority) transaction with the priority of the blocked (high-priority) transaction
intermediate blocking is eliminated
Conditional priority inheritance:
based on the estimated length of the transaction
inherit the priority only if the blocking transaction is close to completion; abort it otherwise
Ti requests a data object locked by Tj:
if Priority(Ti) < Priority(Tj) then
    block Ti
else if (remaining portion of Tj > threshold) then
    abort Tj
else
    Ti waits while Tj inherits the priority of Ti to execute
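This conflict-resolution rule can be sketched as a small decision function. The field names and the convention that a larger number means higher priority are assumptions for illustration:

```python
def resolve_conflict(ti, tj, threshold):
    """Conditional priority inheritance (sketch): Ti requests a lock held by Tj.
    'remaining' is Tj's estimated remaining work; larger priority = higher."""
    if ti["priority"] < tj["priority"]:
        return "block Ti"                    # the requester has lower priority
    if tj["remaining"] > threshold:
        return "abort Tj"                    # Tj is far from completion
    tj["priority"] = ti["priority"]          # Tj inherits Ti's priority
    return "Ti waits; Tj inherits priority"

hi = {"priority": 9, "remaining": 0}
lo = {"priority": 2, "remaining": 1}
print(resolve_conflict(hi, lo, threshold=3))  # Ti waits; Tj inherits priority
print(lo["priority"])  # 9: the low-priority holder now runs at Ti's priority
```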
Potential problems of (blind) priority inheritance:
life-long blocking - a transaction may hold a lock during its entire execution (e.g., under strict 2PL)
a low-priority transaction may inherit the high priority early in its execution and block all other transactions with priority higher than its original priority
especially severe if low-priority transactions are long
Conditional priority inheritance is a trade-off between priority inheritance and priority abort; it is not sensitive to the accuracy of the estimate of transaction length
Performance Results
Priority inheritance does reduce blocking times. However, it is inappropriate under strict 2PL due to life-long blocking of the high-priority transaction; it performs even worse than simple waiting when data contention is high
Priority abort is sensitive to the level of data contention
Conditional priority inheritance is better than priority abort when data contention becomes high
Blocking is a more serious problem than resource waste, especially when deadlines are not tight
In general, priority abort and conditional priority inheritance are better than simple waiting and priority inheritance
Deadlock detection and restart policies appear to have little impact
Write phase:
if validation is OK, then local copies are written to the DB;
otherwise, discard the updates and (re)start the transaction
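One common way to realize the validation step is backward validation: a transaction fails if any transaction that committed during its read phase wrote an object it read. A sketch under that assumption; the timestamp and set representation is illustrative:

```python
def validate(txn, committed):
    """Backward validation (sketch): txn fails if any transaction that
    committed after txn started wrote an object that txn read."""
    for other in committed:
        if other["commit_ts"] > txn["start_ts"] and \
           other["write_set"] & txn["read_set"]:
            return False  # conflict: restart the transaction
    return True           # validation OK: write local copies to the DB

t = {"start_ts": 10, "read_set": {"X", "Y"}, "write_set": {"Y"}}
done = [{"commit_ts": 12, "write_set": {"X"}}]
print(validate(t, done))  # False: X was read by t and written after t started
```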
OCC Example
T1:
read_object(X); X = X + 1; write_object(X);
validation <conflict resolution, e.g., restart transaction>
T2:
read_object(X); read_object(Y);
T3:
read_object(Y); Y = Y + 1; write_object(Y); ...
OCC: Comparison
Broadcasting commit (no priority consideration): not effective in real-time databases
Sacrifice policy: wasteful - there is no guarantee that a transaction in H will actually commit; if all in H abort, T is aborted for nothing
Wait policy: addresses the above problem
if T commits after waiting, it aborts lower-priority transactions, which may not have enough time to restart and commit
the longer T stays, the higher the probability of conflicts
Wait-X policy: compromise between sacrifice and wait
X=0: sacrifice policy; X=100: wait policy
performance studies show X=50 gives the best results
Why? To provide the blocking-at-most-once property: the system can compute (pre-analyze) the worst-case blocking time of a transaction, so schedulability analysis for a set of transactions is feasible
Complete knowledge of data and real-time transactions is necessary: for each data object, all the transactions that might access it need to be known
true in certain applications (hard real-time applications)
not applicable to other, general applications
For each data object O:
write priority ceiling: the priority of the highest-priority transaction that may write O
absolute priority ceiling: the priority of the highest-priority transaction that may read or write O
r/w priority ceiling: a dynamically determined priority that equals the absolute priority ceiling if O is write-locked and the write priority ceiling if O is read-locked
Ceiling rule: a transaction cannot lock a data object unless its priority is higher than the current highest r/w priority ceiling of objects locked by other transactions
Inheritance rule: a low-priority transaction inherits the higher priority from the transactions it blocks
Good predictability, but high overhead
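The ceiling computation and the ceiling rule can be sketched as follows. The object fields and the numeric (larger = higher) priority convention are assumptions for illustration:

```python
def rw_ceiling(obj):
    """r/w priority ceiling of a data object (sketch): absolute ceiling
    if the object is write-locked, write ceiling if it is read-locked."""
    if obj["write_locked"]:
        return obj["absolute_ceiling"]
    if obj["read_locked"]:
        return obj["write_ceiling"]
    return None  # unlocked objects impose no ceiling

def may_lock(txn_priority, locked_objects):
    # Ceiling rule: a transaction may lock only if its priority exceeds
    # the highest r/w ceiling among objects locked by other transactions.
    ceilings = [rw_ceiling(o) for o in locked_objects
                if rw_ceiling(o) is not None]
    return txn_priority > max(ceilings, default=-1)

x = {"write_locked": False, "read_locked": True,
     "absolute_ceiling": 9, "write_ceiling": 5}
print(may_lock(7, [x]))  # True: 7 > write ceiling 5 (X is read-locked)
print(may_lock(4, [x]))  # False: 4 does not exceed the ceiling
```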
Managing Overloads
The result of overload is slow response for the duration of the overload. In real-time databases, catastrophic consequences may arise:
hard real-time transactions must be guaranteed to meet deadlines under overloads
transaction values must be considered when deciding which transactions to shed
missing too many low-valued transactions with soft deadlines may eventually degrade system performance
Dealing with overloads is complex, and practical solutions are needed
Background:
Dynamic real-time systems are prone to transient overloads: requests for service exceed available resources for a limited time, causing missed deadlines
may occur when faults in the computational environment reduce the computational resources available to the system
may occur when emergencies arise that require computational resources beyond the capacity of the system
Overloads cause performance degradation
Schedulers are generally not overload-tolerant
Scheduling Module
The scheduler consists of several components:
Pre-analysis of schedulability: critical transactions are pre-analyzed to check whether they can be executed properly and how much reduction in resource requirements can be achieved by using contingency transactions
Admission controller: determines which transactions will be eligible for scheduling
Scheduler: can schedule according to different metrics (deadline-driven, value-driven)
Overload resolver: decides the overload resolution actions
Dispatcher: dispatches from the top of the ready queue (highest priority)
Scheduling Components
[Figure: incoming transactions pass through the Admission Controller (rejected ones go to a rejection queue), then to the Transaction Scheduler working with the Overload Resolver; the Dispatcher takes transactions from the ready queue.]
Recovery Issues
Recovery of temporal as well as static data is necessary. It is not always necessary to recover the original state, because of temporal validity intervals and application semantics: if recovery takes longer than the absolute validity interval, it would be a waste to recover that value.
Example: recovery from a telephone connection switch failure
if the connection was already established: recover billing information and resources, but no need to recover connection information
if the connection was being established: recover assigned resources
The database manager reacts to events; transactions can trigger other transactions (triggers and alerters). Actions are specified as rules:
ECA rules (event - condition - action): upon Event occurrence, evaluate Condition, and if the condition is satisfied, trigger Action
Coupling modes: immediate (the triggered action is executed right away), deferred (executed at the end of the current transaction), detached (scheduled as a separate transaction)
Cascaded triggering is possible
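A minimal ECA rule engine with immediate and deferred coupling might look like this. The Rule class and the reactor-style example data are illustrative, not from a particular active-database system:

```python
class Rule:
    """A minimal ECA rule (sketch): on Event, if Condition, do Action."""
    def __init__(self, event, condition, action, coupling="immediate"):
        self.event, self.condition = event, condition
        self.action, self.coupling = action, coupling

def signal(rules, event, db, deferred):
    for r in rules:
        if r.event == event and r.condition(db):
            if r.coupling == "immediate":
                r.action(db)        # immediate: run right away
            else:
                deferred.append(r)  # deferred: run at end of transaction
                                    # (detached would spawn a new transaction)

db = {"temperature": 1200, "coolant": 0}
rules = [Rule("temp_update",
              condition=lambda d: d["temperature"] > 1000,
              action=lambda d: d.__setitem__("coolant", d["coolant"] + 1))]
deferred = []
signal(rules, "temp_update", db, deferred)
print(db["coolant"])  # 1: immediate coupling fired the action
```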
Temporal scope...
timeline: event occurrence -> event detection -> composite event detection -> event delivery -> rule retrieval -> condition evaluation
Concurrency control:
since lock duration is short, using small locking granules to reduce contention is not effective --- large lock granules are appropriate in MM-RTDBS
even serial execution is a possibility, eliminating the cost of concurrency control
potential problems of serial execution: long transactions cannot run concurrently with short ones; synchronization is needed for multiprocessor systems
lock information can be included in the data object, reducing the number of instructions for lock handling --- a performance improvement
Commit processing:
to protect against failures, logging/backup is necessary --- the log/backup must reside in stable storage (e.g., disks); before a transaction commits, its activity must be written in the log: write-ahead logging (WAL)
logging threatens to undermine the performance advantage: a transaction must wait until logging is done on disk, so logging can become a performance bottleneck
possible solutions: a small in-memory log using non-volatile memory (e.g., flash memory); pre-commit and group commit strategies
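The pre-commit/group-commit idea can be sketched as a buffer that flushes several transactions' log records with one stable-storage write, amortizing the disk delay that WAL imposes on commit. The class, group size, and the simulated "write" are illustrative:

```python
class GroupCommitLog:
    """Group commit (sketch): buffer log records in memory and flush a
    whole group of transactions with one stable-storage write."""
    def __init__(self, group_size=3):
        self.buffer, self.flushed = [], []
        self.group_size = group_size

    def commit(self, txn_id, records):
        # Pre-commit: the transaction is logically done but its records
        # wait in the group buffer until the group is full.
        self.buffer.append((txn_id, records))
        if len(self.buffer) >= self.group_size:
            self.flush()

    def flush(self):
        # One simulated stable-storage write for the whole group.
        self.flushed.extend(self.buffer)
        self.buffer.clear()

log = GroupCommitLog(group_size=2)
log.commit("T1", ["w(X)"])
print(len(log.flushed))  # 0: T1 waits in the group buffer
log.commit("T2", ["w(Y)"])
print(len(log.flushed))  # 2: both flushed with a single write
```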
Object-Oriented RTDBS
OO data models support modeling, storing, and manipulating complex objects and semantic information in databases (encapsulated objects)
For RT applications, OO data models need time constraints on objects, i.e., on attributes and methods
Objects are more complex -> the unit of locking is the object -> less concurrency
a memory-resident RTDB may fit well with this restriction
inter-object consistency management could be difficult
Need better solutions to provide higher concurrency and predictable execution for RT applications
Database Security
Security services
Trends
An increasing number of systems operate in unpredictable (even hostile) environments: task set, resource requirements (e.g., worst-case execution time), ...
High assurance is required for performance-critical applications
[Figure: two transactions, T1 and T2, each requesting access to a shared resource.]
Both require a lock on the resource. How to resolve this conflict?
if the lock is given to T1: security violation
if the lock is given to T2: priority inversion
Research Issues
Flexible security vs. absolute security:
a paradigm for flexible security services
identifying correct metrics for security level
Adaptive security policies
Mechanisms to enforce the required level of security while trading off with other requirements:
access control, authentication, encryption, ...
time-cognizant protocols, data deadlines, ...
replication, primary-backup, ...
Specification to express desired system behavior; verification of consistency/completeness of the specification
Flexible vs. absolute (binary) security:
the traditional notion of security is binary: secure or not
problem with the binary notion: it is difficult to provide an acceptable level of security while satisfying other, conflicting requirements
research issue: quantitative, flexible security levels
One approach: represent security in terms of the percentage of potential security violations
problem: not precise --- a percentage alone reveals nothing about the implications for system security; e.g., a 1% violation may leak the most sensitive data
Improved Functionality
Exploiting real-time properties for improved/new features. Example: intrusion detection
sensitive data objects are tagged with time semantics that capture normal read/update behavior
the time semantics should be unknown to the intruder
violations of the security policy can be detected: a suspicious update request can be detected using the periodic update rate
the tolerance in the deviation from normal behavior can be parameterized
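Detecting a suspicious update from a deviation in the periodic update rate can be sketched as follows. The function and parameter names are illustrative:

```python
def suspicious(update_times, expected_period, tolerance):
    """Flag inter-update gaps that deviate from the expected periodic
    update rate by more than the parameterized tolerance (sketch)."""
    gaps = [b - a for a, b in zip(update_times, update_times[1:])]
    return [g for g in gaps if abs(g - expected_period) > tolerance]

# A data object tagged with a 10-time-unit update period, tolerance 2:
print(suspicious([0, 10, 20, 23, 33], 10, 2))  # [3] -> update at t=23 is suspect
```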
Need for resource tradeoffs in database services
An adaptable security manager fits well with the concept of multiple service levels in real-time secure database systems
Short-term relaxation of security could be preferable to missed critical deadlines:
aircraft attack warning during a burst of battlefield updates
loss of production time for a missed agile manufacturing command
[Figure: BeeHive data flow - a client table (client security level & key) and a session table (session keys & status) feed execution control; transactions are handed off to worker threads (thread n, thread n-1, ...), which perform object reads and writes on the DB and return transaction results.]
[Figure: level-switching protocol timeline for client X. Legend: transaction request, request with switch, acknowledgment, transaction response message, response with switch command; Sn/Rn denote sending/receiving the nth message. Between t1 (prepare to switch, last message accounted for) and t5 (switch), client X's security level changes among levels 3 to 0.]
[Graph: level switching for a 100% adaptive client - percentage of deadlines made over time, together with the security level (3 down to 0). It shows the security level change and the miss-ratio change.]
Performance Results
Good performance gains are achievable in a soft real-time system during overload conditions
when the overload is not severe, switching the security level can bring the desired performance back (as shown in the graph)
if the system is severely overloaded or some component has failed, then even reducing the security level to 0 cannot keep the system working properly (meeting critical deadlines)
The performance gain also depends on other factors such as message size and I/O cost: significant improvement with large message sizes and large I/O overhead
Device-embedded databases
Embedded systems Strict timing constraints involved
Applications
Air traffic control
Aircraft mission control
Command and control systems
Distributed name servers
Manufacturing plants
Navigation and positioning systems: automobiles, airplanes, ships, space station
Network management systems
Applications (2)
Real-time process control
Spacecraft control
Telecommunication: cellular phones, normal PBX
Training simulation systems: pilot training, battlefield training
TCM - Transmission Control Module
TCS - Traction Control System
CBC - Corner Braking Control
DCS - Dynamic Safety Control
ESP - Electronic Stabilization Program
Car diagnosis systems
Hard and soft TCs Significant interaction with external environment Distributed
Commercial RTDBs
Polyhedra: http://www.polyhedra.com/
Tachys (Probita): http://www.probita.com/tachys.html
ClustRa: http://www.clustra.com
DBx
EagleSpeed (Lockheed Martin)
RTDBMS (NEC)
(Mitsubishi)
Applications of BeeHive
Real-time process control: hard deadlines, main memory, need atomicity and persistence; limited or no (i) schema, (ii) query capability
Agile manufacturing
Business decision support systems: information dominance
Intelligence community
Global virtual multimedia real-time DBs
Satellite Imagery
[Timeline: successive updates of X (every 10 time units) and Y (every 20 time units).]
Absolute validity interval (X) = 10
Absolute validity interval (Y) = 20
Relative validity interval (X-Y) < 15
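Using the intervals from this example, a combined absolute/relative validity check can be sketched as follows (the function name and default arguments are illustrative):

```python
def pair_valid(x_ts, y_ts, now, avi_x=10, avi_y=20, rvi=15):
    """Check the example's temporal consistency constraints (sketch):
    avi(X)=10, avi(Y)=20, and the X-Y timestamps within rvi=15."""
    # Absolute consistency: each value is younger than its avi.
    fresh = (now - x_ts <= avi_x) and (now - y_ts <= avi_y)
    # Relative consistency: the pair was sampled close together in time.
    relatively_consistent = abs(x_ts - y_ts) < rvi
    return fresh and relatively_consistent

print(pair_valid(x_ts=95, y_ts=90, now=100))  # True
print(pair_valid(x_ts=99, y_ts=82, now=100))  # False: |99 - 82| = 17 >= 15
```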
Data in BeeHive
Data from sensors (including audio/video)
Derived data
Time-stamped data
Absolute consistency - environment and DB
Relative consistency - multiple pieces of data
Schema and metadata
User objects (with semantics)
Pre-declared transactions (with semantics)
Some Enterprise
Interact with external DBs
Utilize distributed execution platforms
Properties: real-time QoS, fault tolerance, security
BeeHive System
[Figure: the BeeHive system connects through wrappers (BW) to an RDBMS, an OODB, and raw (RAW) data sources.]
A BeeHive object is specified by <N, A, M, CF, T>:
N: the object ID
A: set of attributes (name, domain, values); value -> value and validity interval, plus semantic information
M: set of methods: name and location of code, parameters, execution time, resource needs, other semantic information
CF: compatibility function
T: timestamp
BeeHive Transactions
A BeeHive transaction is specified by <TN, XT, I, RQ, P>:
TN: unique ID
XT: execution code
I: importance
RQ: set of requirements (for each of RT, QoS, FT, and security) and optional pre- and post-conditions
P: policy for tradeoffs
Example: if all resources cannot be allocated, reduce the FT requirement from 3 to 2 copies.
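The tradeoff policy P in this example can be sketched as an admission function that degrades the fault-tolerance requirement before rejecting the transaction. All field names and costs here are illustrative assumptions, not from the BeeHive design:

```python
def admit(txn, available):
    """Sketch of a tradeoff policy P: if all resources cannot be
    allocated, reduce the FT requirement (number of copies) down to a
    minimum before rejecting the transaction outright."""
    rq = dict(txn["RQ"])  # work on a copy of the requirements
    while (rq["ft_copies"] * txn["copy_cost"] > available
           and rq["ft_copies"] > txn["min_copies"]):
        rq["ft_copies"] -= 1  # e.g., reduce FT requirement from 3 to 2 copies
    if rq["ft_copies"] * txn["copy_cost"] > available:
        return None           # still infeasible: reject
    return rq                 # admitted with (possibly degraded) requirements

t = {"RQ": {"ft_copies": 3}, "copy_cost": 10, "min_copies": 2}
print(admit(t, available=25))  # {'ft_copies': 2}
print(admit(t, available=5))   # None
```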
Resolve inter-transaction conflicts in a time-cognizant manner (concurrency control)
Assign transaction priorities (CPU scheduling)
Goals
Maximize the number of TRs (sensor and user) that meet deadlines
Keep data temporally valid; on overload, allow invalid intervals on data (note that data with an invalid interval may not be used during that invalid time)
[Figure: the Cogency Monitor (external to BeeHive) sits between external data sources and BeeHive. Structured data is mapped into objects (data manipulation, BeeHive integration); unstructured/raw data and returned data are maintained by the cogency monitor.]
Cogency Monitor
Supports value-added services: RT, FT, QoS, and security
Executes client-supplied functionality
Maps incoming data into BeeHive objects
Monitors the incoming data for correctness and possibly makes decisions based on the returned data
Not just a firewall
Real-time timeout
Periodic activation
Start time
Return partial results
Timestamp incoming data
Broadcast in parallel to multiple sites for faster response
Monitor response times (and/or adjust deadlines dynamically)
[Figure: RTDB internals - admission control, RT threads, security, and the database itself.]
Resource management and scheduling:
temporal consistency guarantees (especially relative validity intervals)
interactions between hard and soft/firm RT transactions
transient overload handling
I/O and network scheduling
models that maximize both concurrency and resource utilization
support for different transaction types for flexible scheduling: alternative, compensating, imprecise
Recovery:
(partial) availability of critical data during recovery
semantic-based logging and recovery
Concurrency control:
alternative correctness models (relaxing ACID properties)
integrated and flexible schemes for concurrency control
Fault tolerance and security: models to interact with RTDBS
Query languages: explicit specification of real-time constraints -> RT-SQL
Distributed real-time databases:
commit processing
distribution/replication of data
recovery after site failure
predictable (bounded) communication delays
Data models to support complex multimedia objects
Schemes to process a mixture of hard, soft, and firm timing constraints and complex transaction structures
Support for integrating the requirements of security and fault tolerance with real-time constraints
Performance models and benchmarks
Support for more active features in a real-time context:
techniques for bounding time in event detection, rule evaluation, rule processing mode, etc.
associating timing constraints with triggering mechanisms
Interaction with legacy systems (conventional databases)
VIII. Exercises
Exercise (1)
Suppose we have periodic processes P1 and P2, which measure pressure and temperature, respectively. The absolute validity interval of both of these parameters is 100 ms. The relative validity interval of a temperature-pressure pair is 50 ms. What is the maximum period of P1 and P2 that ensures that the database system always has a valid temperature-pressure pair reading?
Exercise (2)
Sometimes a transaction that would have been aborted under the two-phase locking protocol can commit successfully under the optimistic protocol. Why is that? Develop a scenario in which such a transaction execution occurs.
Exercise (3)
Explain why EDF does not work well in heavily loaded real-time database systems, and propose how you can improve the success rate by adapting EDF. Will your new scheme work as well as EDF in lightly loaded database systems? Will it work well in real-time applications other than database systems?
Exercise (4)
Give examples of applications where it is permissible to relax one or more ACID properties of transactions in real-time database systems.
Exercise (5)
Suppose a transaction T has a timestamp of 100. Its read set has X1 and X2, and its write set has X3, X4, and X5. The read timestamps of these data objects are (prior to adjustment for the commitment of T) 5, 10, 15, 16, and 18; their write timestamps are 90, 200, 250, 300, and 5, respectively. What should the read and write timestamps be after the successful commitment of T? Will the values of X3, X4, and X5 be changed when T commits?
Exercise (6)
Why are the concurrency control protocols used in conventional database systems not very useful for real-time database systems? What information can be used by real-time database schedulers?
Exercise (7)
Compare pessimistic and optimistic approaches in concurrency control when applied to real-time database systems. Discuss different policies in optimistic concurrency control and their relative merits.
Exercise (8)
Are the techniques developed for real-time operating systems schedulers directly applicable to real-time database schedulers? Why or why not?
Exercise (9)
Discuss design criteria for real-time database systems that differ from those of conventional database systems. Why may conventional recovery techniques based on logging and checkpointing not be appropriate for real-time database systems?
Exercise (10)
What are the problems in achieving predictability in real-time database systems? What are the limitations of the transaction classification method we discussed in this course to support predictability?