414 IEEE TRANSACTIONS ON COMPUTERS, VOL. C-28, NO. 6, JUNE 1979

DBC - A Database Computer for Very Large Databases
JAYANTA BANERJEE, DAVID K. HSIAO, SENIOR MEMBER, IEEE, AND KRISHNAMURTHI KANNAN

Abstract: Design considerations of a database computer are presented in this paper. The overall architecture of the computer as well as the organization of its individual components are discussed. Several key concepts which are vital to database management are incorporated in the design and organization of the components. The concepts of tracks-in-parallel read-out and logic-per-some-track processing are provided in an on-line database store for the purpose of achieving high-volume content-addressability. The use of auxiliary information about the database for access precision and control has resulted in the design of a structure memory, an array of content-addressable memory and processor pairs, for large collections of indices. The choice of technologies for the implementation of these components is considered in terms of their cost and performance. Modified moving-head disk technology is chosen in order to support the very large on-line database store. Emerging technologies such as magnetic bubbles and CCD's are chosen for the structure memory on the basis of their matching performance with the on-line database store and their capability for parallel-in-blocks-and-serial-within-block processing. Five other important components are also discussed in the paper. Their role in the database computer and relationship with the structure memory and on-line database store are delineated.

The database computer is meant to be a back-end machine which interfaces with front-end general-purpose computers. To this end, the paper attempts to show that the database computer provides a very high-level instruction repertoire for interfacing with the front-end, a set of elaborate security mechanisms, and an effective cluster mechanism. These built-in capabilities tend to allow the database computer to support existing and new database applications with better throughput and higher security.

Index Terms: Clustering mechanism, computer architecture, content-addressable memory, database computer, logic-per-track, mass memory, security enforcement, structure memory, tracks-in-parallel read-out.

Manuscript received May 22, 1978; revised November 13, 1978. This work was performed at The Ohio State University and was supported by the Office of Naval Research under Contract N00014-75-C0573. A version of this paper was presented at the 5th Annual Symposium on Computer Architecture, Palo Alto, CA, April 3-5, 1978.
J. Banerjee and D. K. Hsiao are with the Department of Computer and Information Science, The Ohio State University, Columbus, OH 43210.
K. Kannan is with the IBM T. J. Watson Research Center, Yorktown Heights, NY 10598.
0018-9340/79/0600-0414$00.75 © 1979 IEEE

I. BASIC DESIGN GOALS

Database machines are special-purpose computers which may have been prompted [1] in recent years by the following factors.

1) The change of data-processing-oriented information management to database-management-oriented information management: Traditional data processing is essentially a closed-shop operation which is supported and managed by computer professionals. The user of information must interface with computer professionals for problem solving and decision making. Essentially, computer professionals attempt to understand the problems and needs of the user, devise programs to solve the problems, run the programs for the user, and return the results to the user. This entire cycle is repeated many times until the information needs of the user are met. Modern database management is not meant to be a closed-shop operation. Instead, it allows multiple users to have access to a shared database. Although computer professionals are still needed to support and manage the facility, they are primarily involved in the design and creation of the shareable database, development of high-level data languages and software aids for the ease of user-database interactions, and incorporation of effective access control measures and reliable security provisions so that access to sensitive information can be regulated and protected. This multiaccess operation requires considerable new software development and hardware support.

This change also requires that the off-line mode of operations be replaced by an on-line one. In other words, the software must be capable of supporting on-line databases and interacting with the user in real time.

2) The availability and variety of memory and processor technology: Typically, the software-laden database management system has been large in size and complex in structure, which not only overtaxes the hosting hardware, but also overshadows the hosting operating system. Excluding the database, they are still large relative to the hosting operating systems and thereby utilize considerable main memory and auxiliary storage as the operating systems do. It is therefore not surprising that attempts have been made to remove the software-laden database management system from the general-purpose computer and replace it with a specialized machine [3], [4], [6]-[8]. In addition, we have at present a wide choice of emerging technology such as charge-coupled devices, magnetic bubbles, electron beam addressable memories, dynamic RAM's, and modifiable moving-head disks [17], [18], [20]. It may thus be possible to design and configure a special-purpose computer which can perform database management tasks cost-effectively. By eliminating much of its software, the database management system can perhaps now interface with the host computer and the host operating system more reliably with better response time and throughput.

The database computer (DBC) [9]-[11] to be discussed in this paper is an attempt to incorporate as much specialized hardware for data management as possible. As a back-end
machine, the DBC attempts to achieve high performance and low cost. There are five basic goals in the design of the database computer (DBC). The first goal is to design it with the capability of handling a very large on-line database of 10^10 bytes or beyond since special-purpose machines are not likely to be cost-effective for small databases. The second goal is to build the database computer now. This implies that only emerging technology and modifications of the existing technology may be considered for the hardware design. No reliance is to be placed on distant technology. The third goal is that the DBC must compete favorably with existing software-laden database management systems (which are run on general-purpose computers) in terms of system throughput and cost of database storage. The fourth goal is to design at the outset a security mechanism as an integral part of the DBC since a modern database must have security and control for sharing and protection. The final goal is that the DBC, working as a back-end computer, must provide a repertoire of very high-level commands to interface with front-end computers and support different types of database management applications (in particular, those applications utilizing the hierarchical [12], CODASYL [13], and relational [5] data models). As we progress through the remaining sections in this paper, we will attempt to show how the DBC design meets the first four goals. We will not elaborate on the DBC design in meeting the fifth goal. This study is voluminous [14]-[16] and is being published elsewhere.

II. AN OVERVIEW OF THE DBC ARCHITECTURE

Fig. 1 is a complete diagram of the major DBC components. The DBC acts as a back-end machine to one or more front-end general-purpose computers which are jointly referred to as the program execution system (PES). Users' programs reside in the PES, and are executed by the PES using the DBC as one of its various resources. The PES communicates with the DBC by way of DBC commands, and the DBC responds either by returning a group of records or parts of such records (i.e., the response set), or by indicating successful or unsuccessful execution of a command.

[Fig. 1. Architecture of the DBC. The diagram shows the structure loop and the data loop, with information paths and control paths, and the connections from and to the PES. Legend: DBCCP: Database Command and Control Processor; KXU: Keyword Transformation Unit; SM: Structure Memory; SMIP: Structure Memory Information Processor; IXU: Index Translation Unit; MM: Mass Memory; SFP: Security Filter Processor; PES: Program Execution System.]

The DBC makes use of two loops of processors and memories in executing the commands. The data loop, which consists of the database command and control processor (DBCCP), mass memory (MM), and security filter processor (SFP), is used for storing and accessing the database, for post-processing of retrieved records, and for enforcing field-level security (known as the type B control). The structure loop, which consists of the database command and control processor (DBCCP), keyword transformation unit (KXU), structure memory (SM), structure memory information processor (SMIP), and index translation unit (IXU), is used for limiting the mass memory search space (through the determination of cylinder numbers), for determining the authorized records for accesses (known as the type A control), and for clustering records received for insertion into the database.

The DBC design exploits both existing and emerging technologies. The on-line mass memory (MM) is made from moving-head disks, perhaps the least expensive of all large-capacity on-line storage devices. The disks, however, are modified to allow parallel read-out of an entire cylinder in one revolution time, instead of one track at a time. The parallel read-out capability of the DBC provides rapid access to a relatively large block of data. These data can now be content-addressed simultaneously by a set of track information processors (TIP's) in the same revolution. It seems adequate that access is limited to one or a few cylinders since single user transactions seldom refer to data beyond megabytes in size. As long as data are not physically scattered, sweeping of a large number of disk cylinders can be avoided. The physical dispersion of related data is prevented by a built-in clustering mechanism in the database command and control processor (DBCCP) which uses information provided by the creators of the database via the program execution system (PES).

The DBC needs the use of some structural information about the database. Without the help of such information, every request would require all the cylinders (that constitute the database) to be accessed whether there is any clustering or not. Furthermore, preprocessing of the user's access authorization in determining well-compartmentalized data aggregates for security purposes may not be possible (known as type A control). Although both the access and security-related information are likely to be at most 1 percent of the size of the database [14]-[16], they are still quite large since the database itself is of 10^10 bytes. Furthermore, since there may be a number of accesses to the information for every access to the database, it must be possible to access them very fast. Therefore, the structure memory (SM), which is the repository of all structural information, has to provide a
large capacity and good access speed. Such a performance can be achieved through the use of emerging technology, such as charge-coupled devices or magnetic bubble memory devices.

The DBC is the first database machine with security mechanisms being incorporated in it at the outset. Generality in security enforcement is allowed through the record-at-a-time post-checking for field-level security in the security filter processor (SFP) and the more efficiently implemented security control for compartmentalizing records of the same security specifications (in the structure loop). Post-processing of records and data items constitutes some other functions provided by the SFP.

Other components such as the structure memory information processor (SMIP), the index translation unit (IXU), and the keyword transformation unit (KXU) are functionally specialized in the DBC. They are pipelined and multiprocessed by the database command control processor (DBCCP) for concurrency that enhances the overall performance of the DBC. The DBCCP is therefore charged with the synchronization and control of all the DBC components so that they can work concurrently on one or more commands. The variable-length commands are sent to the DBCCP by the front-end program execution system (PES). The DBCCP interfaces with the PES by receiving commands and returning appropriate responses, such as sets of records, diagnostic messages, etc. Other functions of the DBCCP include the clustering of records during insertion, preprocessing of the record-level (type A) and field-level (type B) security specifications, coordinating the task of security checking during database accesses, instructing the SFP to post-check the response set for the field-level (type B) control, and performing certain essential bookkeeping chores.

Without belaboring the terminology and details of the various components which will be provided in later sections, let us first gain an overview of the flow of command execution of the DBC. The database stored in the mass memory (MM) is made of records. Every record consists of a record body, a set of attribute-value pairs (known as keywords), and a number representing the record set (known as security atom) of which all the records satisfy the same security specifications. The set of all security atoms makes a logical partition of the database such that all records belonging to an atom are protected in an identical manner with respect to a given user. Since the database resides on many cylinders and one cylinder is searched at a time, keyword indices are maintained in the structure memory (SM). For a keyword K, an entry of the structure memory consists of a list of index terms of the form (f, s) where the cylinder f and security atom s contain records with the keyword K.

Given the Boolean conjunctions of keyword predicates (known as query conjunctions) as part of an input command, the database command and control processor (DBCCP) considers each query conjunction in turn. For each predicate of the conjunction, the KXU uses the attribute of the predicate and the file name to determine the whereabouts (in the structure memory) of the keywords that satisfy the predicate. The aggregates of all index terms for the keywords satisfying a predicate in a query conjunction are retrieved from the structure memory and transmitted to the structure memory information processor (SMIP). The SMIP, then, intersects the aggregates of index terms. There are as many aggregates as there are predicates in the query conjunction. After the intersection, the resultant set of index terms is further filtered by the DBCCP. The DBCCP deletes all those index terms that have numbers of the atoms to which the user (i.e., the issuer of the query conjunction) does not have the authorized access right. This final set of index terms, together with the complete query conjunction, is now sent to the mass memory for content search. Output from the mass memory may be post-processed by the SFP before routing to the front-end PES.

As depicted in Fig. 2, there are two classes of input commands recognized by the DBCCP: access commands and preparatory commands. (For a complete repertoire of DBC commands, see [14].) Access commands are those that require accesses to the mass memory. Preparatory commands, on the other hand, convey information about the database such as the names and attributes of files to be created, characteristics of the attributes, space requirement of files, and security specifications. Each access command is executed in a pipelined fashion by the various components of the DBC. The DBCCP coordinates the operation of the other components and keeps track of the status of the commands that are currently being executed. The information received in the preparatory commands is organized in a random access memory of the DBCCP. This information is referenced frequently during the execution of access commands.

[Fig. 2. Execution of commands received from a front-end computer. The diagram is drawn for the Database Command and Control Processor (DBCCP).]

Records to be inserted in the database are physically clustered by the DBCCP according to their primary and secondary clustering attributes. We will return to Figs. 1 and 2 in later sections when we discuss the individual components of the DBC.

Both the clustering and security mechanisms are illustrated by way of an example in the Appendix of [9]. In that Appendix, the execution of a number of queries through major stages of the DBC is also illustrated. The reader may refer to [9] for a more theoretical discussion of DBC concepts.

III. DESIGN CONSIDERATIONS OF THE ON-LINE MASS MEMORY

The design of the mass memory (MM) is heavily dictated by the storage and processor technologies, database size, and processing characteristics. Let us consider each of these factors in the sequel.

A. The Use of Moving-Head Disks

A survey of the current and emerging technologies indicates that the various on-line memory technologies may be divided into three major classes on the basis of their cost and performance. At the higher end of the cost-performance spectrum, there are the magnetic core, MOS, and bipolar technologies. In the middle, there is the fixed-head disk technology and its potential replacements, namely, charge-coupled devices (CCD's), dynamic RAM's, magnetic bubbles, and electron beam addressable memories (EBAM's). In terms of low cost per bit and high storage capacity, however, there is no known or emerging technology in sight that can compete with the moving-head disk technology which occupies the lower end of the cost-performance spectrum. Thus, moving-head disks seem to be the only alternative for large on-line database store. We have thus chosen moving-head disks for the DBC mass memory.

Once the technology is chosen, we then ask what kind of modifications of the moving-head disk are necessary in order to support database management. The performance gain due to such modifications must be cost-and-performance-effective so that the cost-performance projection of the modified disks will not exceed either the fixed-head disk or its replacements.

Typical database management operations require the processing of 90-95 percent of related data for the purpose of producing 5-10 percent of useful information (known as the 90-10 rule). It is desirable that the mass memory should process the related data rapidly so that the results can be obtained without being delayed by the sheer volume of the related data. This calls for high-volume read-out and processing capabilities.

B. The Tracks-in-Parallel Read-Out Capability

Conventional moving-head disks, as well as fixed-head disks, allow the read-out of only one track per disk revolution. By modifying the read-out mechanism of moving-head disks, the mass memory can read, instead of one track per disk revolution, all the tracks of a cylinder in the same revolution. This modification is called tracks-in-parallel read-out. Such modification is known, at the time of this writing, to be feasible and relatively low in cost [17] since some of the read/write electronics are already a part of the moving-head disks. Modifications are necessary so that all the read/write heads can be triggered to read simultaneously and so that the data buses are enlarged for accommodating the increased data rate.

C. The Dynamically Associated Logic-per-Track Approach

With the moving-head disks modified for high-volume read-out, the mass memory must now provide high-volume processing. The mass memory information processor (MMIP) obtains and processes an entire cylinder of information in one disk rotation time. Since the rotation speed of the disks is relatively slow, it is possible to process information "on the fly." Processing on the fly is possible because every track of the cylinder is actually processed by a separate processing unit called a track information processor (TIP) having some amount of buffer space. For instance, considering a disk rotation speed of 3000 revolutions/min and a track capacity of 30 000 bytes, we require a processing speed (for comparison-type operations) of no more than 1.5 Mbytes/s from each track information processor. This is within the present state of the art of microprocessor technology. Furthermore, if there are 40 tracks in a cylinder, then there will be 40 TIP's in the MMIP. The MMIP is time-shared among all the cylinders of the mass memory.

D. The Content-Addressable Capability

In data management, processing means content-addressable search, retrieval, and update. With the mass memory modified for high-volume read-out and with the high-performance processors, we now illustrate how the mass memory (MM) performs content-addressing. For this discussion, we must introduce some notions and terminology.

The DBC accepts and stores a database as a collection of records. Each record consists of a record body and a set of variable-length attribute-value pairs where the attribute may represent the type, quality, or characteristic of the value. The record body is composed of a (possibly empty) string of characters which are ignored by the DBC for search purposes. For logical reasons, all the attributes in a record are required to be distinct. An example of a record is shown below:

(<RELATION, EMP>, <JOB, MGR>, <DEPT, TOY>, <SALARY, 15000>).

The record consists of four attribute-value pairs. The value of the attribute JOB, for instance, is MGR. Attribute-value pairs are called, for short, keywords. They obviously characterize records and may be used as "keys" in a search operation.

The DBC interfaces with the front-end computers by accepting a large repertoire of high-level database management commands [14], by delivering collections of records as response sets, and by indicating successful or unsuccessful execution of the commands in messages. Some of the commands, called record access commands, may be used for specifying a collection of records in the database and for carrying out an intended operation on these records, such as retrieval, deletion, and modification. Other commands may be used for database loading, record insertion, initialization, etc.

An important feature of the DBC record access commands is that they allow natural expressions for specifying a record collection. A record collection may be specified in terms of a keyword predicate, or simply, predicate, which is a triple consisting of an attribute, a relational operator (such as =, ≠, >, ≥, <, ≤), and a value. For example, the predicate

(SALARY > 10000)

may be used to indicate all records that have SALARY as one of the attributes, the value of that attribute being greater than 10000.

A record collection may also be specified in terms of a conjunction of predicates called the query conjunction. An example of a query conjunction is

(SALARY > 25000) ∧ (JOB ≠ MGR) ∧ (RELATION = EMP).

Carefully planned physical layouts of the record are used in the DBC to eliminate unnecessary disk revolutions and to reduce the cost and size of the TIP's buffers. Each attribute is first encoded by the DBC so that it has a unique numerical identifier. The attribute-value pairs (keywords) in a record as shown in Fig. 3(a) are now arranged in an ascending order of the attribute identifiers. The cluster number and the security atom number of a record, seen in the record layout of Fig. 3(a), will be discussed later in this paper. The layout of a query conjunction is depicted in Fig. 3(b). The predicates in a query conjunction, like the keywords in a record, are arranged in an ascending order based on the attribute identifiers. A query conjunction is stored in a sequentially accessed memory. The track information processor (TIP) reads a record from the track as a part of one data stream and the query conjunction from the sequentially accessed buffer as another data stream and carries out a simple bit-by-bit comparison of the two streams. Whenever there is a match between an attribute identifier in the record and an attribute identifier in the conjunction, the TIP then compares the value parts to determine if the corresponding predicate is satisfied. If the attribute identifier in the record is less than the attribute identifier in the conjunction, then the TIP skips over the corresponding value to the next attribute identifier of the same record. If the attribute identifier in the record is greater than the one in the conjunction, then the TIP skips the entire record. The above logic is repeated until either all predicates in the conjunction are satisfied or the record does not satisfy the conjunction. The scheme just described will result in a simple serial-by-bit comparison.

[Fig. 3. Internal formats of records and query conjunctions.
(a) The format of a record R in the mass memory. Fields: Cluster Number; Security Atom Identifier; Number k of Keywords in Record; Attribute-Value Pairs a1 v1, a2 v2, ..., ak vk; Record Body.
    ai: fixed-length attribute identifier of the i-th keyword of the record
    vi: variable-length value with length indicator of the i-th keyword
    a1 < a2 < a3 < ... < ak
(b) The format of a query conjunction. Fields: Number m of Predicates in the Conjunction; Predicates a1 r1 v1, a2 r2 v2, ..., am rm vm.
    ai: fixed-length attribute identifier of the i-th predicate of the conjunction
    ri: relational operator of the i-th predicate
    vi: variable-length value with length indicator of the i-th predicate
    a1 < a2 < a3 < ... < am]

A conjunction Q, after it is broadcast by the mass memory controller, is stored in each of the TIP's. All the track information processors (TIP's) simultaneously evaluate the query conjunction against their corresponding incoming record streams. For example, the first TIP searches the records of the first track of the cylinder. At the same time, the ith TIP searches all the records in the ith track of the same cylinder. In one disk revolution, all tracks of an entire cylinder are thus searched in parallel by the TIP's.

IV. THE OVERALL ORGANIZATION OF THE MASS MEMORY

The overall organization of the mass memory is shown in Fig. 4. The database resides in data volumes mounted on moving-head disk drives. A volume is composed of 200-400 cylinders. Data transfer to/from a cylinder is achieved by activating all the read/write heads of the access mechanism concurrently.

[Fig. 4. The mass memory organization. Legend: DDC: Disk Drive Controller; TIP: Track Information Processor; t = # of tracks per cylinder; m = # of disk drives per disk drive controller; n = # of disk drive controllers for the entire database.]

Although other attempts [3] have taken advantage of the fact that the read and write heads on a track could be positioned a short distance from each other, we do not favor such an arrangement. This is because, at high track densities (1000 tracks/in or higher), the required mechanical tolerances for sustaining separate read and write heads may well deprive the disk technology of much of the cost-effectiveness brought about by the higher densities [18]. In this design, a combined read/write mechanism is assumed. The implication of such a decision is that a disk device in the mass memory can either be read from or written into at a given time. Reading and writing cannot be performed simultaneously.

The set of disk drives is partitioned into groups of 8-16 drives for access and control purposes. Each group of disk drives is controlled by a disk drive controller (DDC). A drive selector determines at any instant a particular disk drive controller which, in turn, determines a disk drive from/to which data are being transferred. The drive selector also routes data in parallel to/from all the track information processors (TIP's) that constitute the mass memory information processor (MMIP). Finally, there is a mass memory controller (MMC) to receive requests, broadcast query conjunctions and commands to the track information processors, and control the operations of the mass memory information processor, the drive selector, and the disk drive controllers. Data are transferred between the mass memory controller (MMC) and the track information processors (TIP's) via the I/O bus.

Recall that, in this design, a single cylinder is content-addressed at a time. Therefore, assuming that there are t tracks to a disk cylinder, a data transfer path consists of a 1-bit line from each of the t tracks of a cylinder belonging to a particular disk drive, t 1-bit lines from the corresponding disk drive controller, and all the t 1-bit lines from the drive selector to the individual track information processors. The above approach provides for a simple design of the disk drive controllers. In fact, t 1-bit registers are all that is needed in each disk drive controller for buffering the data between the drive selector and a selected disk drive.

Although a bit- (or byte-) length buffer in each TIP is sufficient for the evaluation of a query conjunction, a record-length random access buffer is provided in each TIP. This is necessary for performing updates as well as for holding on to a record during the query evaluation process. If a record satisfies the query conjunction, then it may be transferred to the mass memory controller (MMC). During insertion, a record received from the MMC is held in the record-length buffer before being written into the track. During updates, a record is modified in the buffer only if it satisfies a selection criterion (i.e., query conjunction). The updated record is written back in place during the next revolution, as long as it does not increase in size. If the record does increase in size, then the original record is tagged for deletion in the next revolution. The updated record is then sent from the TIP buffer to the MMC for insertion. (Since record insertion involves clustering, it is dealt with in more detail later in this paper.) We note that if no more than one record from each track of the content-addressable cylinder requires update, then the process is usually completed in two disk revolutions. If some track has more than one record for update, then more revolutions will be required. To approach an update speed of two disk revolutions per cylinder, it may be desirable to increase the buffer size in each TIP, perhaps to a multiple of the record length.
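
The structure-memory index and its use in the structure loop, as outlined in Section II, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the function names, and the sample index terms are hypothetical and are not taken from the DBC design documents. The sketch aggregates the index terms (f, s) of all keywords satisfying each predicate, intersects the aggregates (the role the paper assigns to the SMIP), and then drops terms whose security atom the user may not access (the role assigned to the DBCCP).

```python
import operator

# Hypothetical structure-memory contents:
# keyword (attribute, value) -> set of index terms (cylinder f, security atom s).
STRUCTURE_MEMORY = {
    ("JOB", "MGR"): {(3, 1), (7, 2)},
    ("SALARY", 15000): {(3, 1), (5, 1)},
    ("SALARY", 30000): {(7, 2), (9, 3)},
}

# Relational operators, written in ASCII for convenience.
OPS = {"=": operator.eq, "!=": operator.ne, ">": operator.gt,
       ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def index_terms_for(predicate, sm):
    """Aggregate the index terms of every keyword that satisfies one predicate."""
    attribute, op, value = predicate
    terms = set()
    for (attr, val), entry in sm.items():
        if attr == attribute and OPS[op](val, value):
            terms |= entry
    return terms


def structure_loop(conjunction, sm, authorized_atoms):
    """Intersect the per-predicate aggregates, then filter by security atom."""
    aggregates = [index_terms_for(p, sm) for p in conjunction]
    surviving = set.intersection(*aggregates) if aggregates else set()
    return {(f, s) for (f, s) in surviving if s in authorized_atoms}


query = [("JOB", "=", "MGR"), ("SALARY", ">", 10000)]
terms = structure_loop(query, STRUCTURE_MEMORY, authorized_atoms={1})
cylinders_to_search = sorted({f for f, _ in terms})
```

In this sketch only the cylinders named by the surviving index terms would be passed to the mass memory for content search, which is the search-space reduction the section describes.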
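
As a worked check of the per-track figure quoted in Section III-C, the rate follows directly from the rotation speed and track capacity given there. The symbols C (track capacity) and ω (rotation speed) are introduced here only for illustration.

```latex
\[
R_{\text{track}} = C \cdot \omega
  = 30\,000\ \frac{\text{bytes}}{\text{rev}} \times \frac{3000\ \text{rev/min}}{60\ \text{s/min}}
  = 30\,000 \times 50\ \frac{\text{bytes}}{\text{s}}
  = 1.5 \times 10^{6}\ \text{bytes/s}.
\]
```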
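
The comparison logic that Section III-D assigns to each TIP can be sketched as follows. This is a minimal illustration, not the hardware algorithm: it works on already-decoded tuples rather than on serial bit streams, and the attribute identifiers, the `encode` helper, and the function names are hypothetical. Both the record keywords and the predicates are assumed to be in ascending order of attribute identifier, as the section requires.

```python
import operator

# Hypothetical numeric attribute identifiers; the paper only states that
# each attribute is encoded with a unique numerical identifier.
ATTRIBUTE_IDS = {"DEPT": 1, "JOB": 2, "RELATION": 3, "SALARY": 4}

OPS = {"=": operator.eq, "!=": operator.ne, ">": operator.gt,
       ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def encode(items):
    """Replace attribute names by identifiers and sort ascending by identifier."""
    coded = [(ATTRIBUTE_IDS[item[0]],) + tuple(item[1:]) for item in items]
    return sorted(coded, key=lambda t: t[0])


def tip_matches(keywords, conjunction):
    """Return True if the record's keywords satisfy every predicate."""
    i = 0
    for attr_id, op, wanted in conjunction:
        # Record identifier smaller than the predicate's: skip that value.
        while i < len(keywords) and keywords[i][0] < attr_id:
            i += 1
        # Record exhausted or identifier larger: attribute absent, skip record.
        if i == len(keywords) or keywords[i][0] > attr_id:
            return False
        # Identifiers match: compare the value parts.
        if not OPS[op](keywords[i][1], wanted):
            return False
        i += 1
    return True


def search_cylinder(tracks, conjunction):
    """One list of hits per track; the DBC scans the tracks in parallel."""
    return [[r for r in track if tip_matches(r, conjunction)] for track in tracks]


record = encode([("RELATION", "EMP"), ("JOB", "MGR"),
                 ("DEPT", "TOY"), ("SALARY", 15000)])
query = encode([("SALARY", ">", 25000), ("JOB", "!=", "MGR"),
                ("RELATION", "=", "EMP")])
```

With the example record and query conjunction from the text, `tip_matches(record, query)` rejects the record at the JOB predicate, while the single predicate (SALARY > 10000) accepts it.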
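
The update handling in the TIP buffer described in this section can be sketched as a small decision rule. The `Record` class, the function name, and the queue standing in for the path back to the MMC are hypothetical names chosen for illustration. The sketch models only the choice between rewriting in place and tagging for deletion, not the timing of disk revolutions.

```python
from dataclasses import dataclass


@dataclass
class Record:
    data: bytes
    tagged_for_deletion: bool = False


def tip_update(record, selects, modify, mmc_insert_queue):
    """Return the record image to be written to the track on the next revolution.

    selects: callable deciding whether the record meets the selection criterion.
    modify:  callable producing the updated record contents.
    mmc_insert_queue: list standing in for records sent to the MMC for insertion.
    """
    if not selects(record):
        return record  # not selected: left untouched
    new_data = modify(record.data)
    if len(new_data) <= len(record.data):
        return Record(new_data)  # no growth: rewritten in place
    # Growth: tag the original for deletion and reinsert the updated
    # record through the MMC, where clustering is applied.
    record.tagged_for_deletion = True
    mmc_insert_queue.append(new_data)
    return record
```

Keeping the size test as the only branch mirrors the text: an update that does not enlarge the record stays within the track, and only enlarged records go back through the insertion path.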
A. Two Modes of Operation

The mass memory operates in two basic modes: the normal mode and the
compaction mode. In the normal mode, input requests are decoded by the
mass memory controller (MMC) and are queued according to the cylinders
referenced by the requests. For each cylinder for which a queue of
requests exists, the MMC asks the appropriate disk drive controller (if
free) to position the read/write heads to the cylinder. When the cylinder
is thus accessed, the MMC sends the requests one at a time to the mass
memory information processor (MMIP). While the track information
processors (TIP's) of the MMIP are executing the requests, the MMC can
ask the disk drive controllers to position the read/write mechanisms to
other cylinders for which there are nonempty queues. Thus the access time
with respect to a cylinder is at least partly overlapped by useful work
performed by the MMIP. The extent of overlap is determined by such
factors as the average number of different cylinders for which there are
nonempty queues.

Records which are identified by a delete command under the normal mode
are tagged by the track information processors (TIP's) for later removal
during the compaction mode. Since reading and writing are not done
simultaneously, the record deletion process involves two disk revolutions
per cylinder. During the first revolution, each TIP creates a bit-map of
tag bits (there is a bit position in the bit-map for each record position
in a track). The bit-maps are created by the TIP's and inserted in the
beginning of the tracks during the second revolution. When the mass
memory controller is ordered to reclaim the space occupied by tagged
records, it enters the compaction mode. During this mode, cylinders with
tagged records (this information being maintained by the mass memory
controller using a bit-map, with one bit for each cylinder) are read into
the mass memory via the TIP's. The mass memory controller then sends back
to the TIP's only the untagged records.

There are two reasons for handling deletions in this manner. First, if
reclamation of space were to be attempted in the normal mode, one of two
undesirable things will occur: 1) we will have to provide a track-size
buffer with each TIP, resulting in low utilization of the buffer during
retrieval; and 2) we will have to reclaim space in segments of the track,
each segment size being equal to the size of a TIP buffer. In the latter
case, the number of revolutions required to "sweep" the entire track for
reclamation will be a multiple of the ratio of the track size to the TIP
buffer size. During the normal mode of operation, a single delete
operation could hold up retrievals for several revolutions. This is
undesirable. On the other hand, we might expect during the course of a
24-hour day periods of light load. Such periods usually result in low
utilization of system resources. By operating the mass memory in the
compaction mode during these intervals of light load, we may be able to
achieve a more equitable distribution of load on the mass memory.

B. The Need for Search Space Reduction

Despite all the improvements that can be made to the moving-head disk
technology, there is still one fundamental limitation of the technology:
the time delay in repositioning the read/write heads from one
content-addressable cylinder to another. This delay is particularly acute
if the number of cylinders to be addressed is large. There are two
factors which may cause the unnecessary search of a large number of
cylinders: 1) the database creator inadvertently scatters his records
over a large number of cylinders, thus requiring the mass memory (MM) to
"sweep" through all those cylinders; and 2) for a given query
conjunction, the MM does not have any knowledge of those records which
may satisfy the query conjunction. If, on the other hand, it knows which
cylinders may contain the desired records, then the MM can restrict its
content-addressable search to just those cylinders, instead of the entire
cylinder space.

1) The Clustering Mechanism: To eliminate problem 1), the database
computer (DBC) provides a clustering mechanism in the database command
and control processor (DBCCP). With the clustering mechanism, the DBC
allows physical grouping of records that are likely to be retrieved and
updated together into as few content-addressable cylinders of the MM as
possible. The DBC provides two levels of clustering: first, by a primary
clustering attribute, and second, by a secondary clustering attribute.
Clustering attributes are supplied by the front-end system (PES) based on
a knowledge of the access pattern. In other words, clustering attributes
are chosen on the basis of the frequency of access of various record
collections. This choice may be straightforward, as demonstrated in
[14]-[16].

The DBC attempts to store all records with the same value for the primary
clustering attribute into as few cylinders as possible. Therefore, given
a query conjunction involving a primary clustering attribute, the search
space is limited to a very few cylinders, even if there is no further
knowledge about the database. For example, in the DBC implementation of a
relational database, each record (corresponding to a relational tuple)
contains a keyword <RELATION, relation-name> where RELATION is an
attribute and relation-name is the relation to which the record (tuple)
belongs. If RELATION is declared as a primary clustering attribute, then
every single-relation query can be executed by searching adjacently only
as few cylinders as are required to store the entire relation.

At the second level of clustering, the secondary clustering attribute
provides a further degree of search precision. In fact, since the
cylinder size is very large (say, 1/2 megabyte), the two levels of
clustering should allow most queries to be executed in only one cylinder
access. A more detailed example of the clustering process is included in
[9].

2) The Maintenance of Indices: To address problem 2), the database
computer (DBC) maintains some auxiliary information about the database in
a separate component known as the structure memory (SM). Indices are
maintained in the SM on selected attributes of the records and their
value ranges. Clustering attributes are likely candidates for indices,
since most queries are expected to refer to these attributes.
Furthermore, each query conjunction is recommended to include at least
the primary clustering attribute.

An index term for a selected attribute-value (range) pair

consists of, among other items, the cylinder number of the cylinder
containing at least one record having the selected attribute-value pair.
For a query conjunction, it is now feasible to consult the SM for the
purpose of obtaining just those cylinder numbers of the index terms whose
attribute-value (range) pairs satisfy the query conjunction.

V. DESIGN CONSIDERATIONS OF THE STRUCTURE MEMORY

The structure memory (SM) is the repository of auxiliary information
about the database. This information is concerned with search precision
and access control. For improving search precision, the SM is employed by
the database computer (DBC) to determine the mass memory cylinders that
need be content-addressed. For access control, the SM is again used by
the DBC to determine whether an access operation is an authorized one and
whether access is permitted to the records involved. The use of cylinder
numbers as a part of the index term for search precision has been
discussed in the previous section. In the following section, we will
concentrate on the discussion of the access control feature of the SM.

A. Pre- and Post-Checking for Access Control

The DBC provides two types of access control. Access requests with the
type B control are slower to execute because such requests require
post-checking of every retrieved record for field-level security
clearance. This type of security enforcement is performed by a special
processor known as the security filter processor (SFP) which also does
some other post-processing of records retrieved from the mass memory (see
Section VII). Further, these requests may result in access imprecision
since some of the retrieved records may have to be discarded by the SFP
due to security violation. The type A control, on the other hand,
requires no post-checking of records. It works solely on the basis of the
access control-related information stored in the structure memory (SM)
and in the database command and control processor (DBCCP). During
database creation time, the access control-related information is
extracted from the new records and stored in the structure memory. The
effect is that of prechecking of records. Thus, at query execution time,
security clearance may be made even before records are actually retrieved
from the mass memory. Since the type A control incurs no access
imprecision, it should be used regularly. However, to use type A control,
the database creator must understand the notion of security atoms and be
willing to designate certain keywords of his records as security
keywords. With the security atoms and keywords, the DBC can then
construct access control-related information and place the information in
the SM and DBCCP for subsequent use.

B. The Notion of Security Atom

A security keyword of a record is a keyword of the record which is
designated by the database creator to reflect his security requirements.
All records having the same canonical expression of security keywords
form a record set called a security atom. The advantageous properties of
the security atom [19] are as follows.
1) Security atoms represent disjoint record sets, i.e., a record belongs
to one and only one security atom.
2) The database can be partitioned into security atoms, with all records
in an atom having the same security attributes.
3) With proper choice of security attributes, the partitioning (i.e., the
sizes of security atoms) can be made from very fine to very coarse,
depending on the security requirements.
4) Usually the total number of security atoms in the database is much
smaller than the total number of records in the database.
5) For any arbitrary query conjunction made up of security keywords, the
records of a security atom will have the following exclusive property:
either all or none of the records of the security atom will satisfy the
query conjunction.

For this type of access control, a user of the database is always
provided with a database capability. Each element of the capability
consists of a query conjunction (made up of security keywords) and a set
of access rights. A security atom expression may satisfy a number of
query conjunctions in the database capability. The access rights on a
security atom for the user are therefore the intersection of the sets of
access rights corresponding to the query conjunctions that are satisfied
by the atom expression. Consequently, for each user, a list can be
created indicating the access rights on each security atom. This list is
called the atomic access privilege list of the user. Using this list, the
database computer can now process a user request by first determining
whether there is any atom expression that satisfies the request. If there
is such an expression, then the access requested by the user is compared
with the access rights assigned to the atom. If the requested access is
an authorized one, then access to the atom (i.e., record set) is
permitted. Subsequently, the record set is accessed by the mass memory
(MM). A detailed illustration of the security atom concept for access
control is included in [9].

C. The Structure Information

For every keyword designated for indexing, there is an entry in the
structure memory (SM) consisting of the keyword itself and a list of
index terms. An index term is composed of a cylinder number f and a
security atom number s. An index term (f, s) for a keyword K, therefore,
indicates that there exist one or more records containing the keyword K
that are residing in the cylinder f of the mass memory (MM), and that
belong to the security atom s.

For type A control, the query conjunction of a user is processed as
follows. For each predicate with an indexed attribute, the structure
memory (SM) determines all those keywords which satisfy the predicate.
Corresponding to each of the satisfying keywords, a set of index terms is
retrieved. The sets of index terms for all such predicates are then
intersected (by the structure memory information processor to be
discussed in Section VII). The result of the

intersection is a list L of index terms for the given query conjunction.

This list L of index terms is compared [by the database command and
control processor (DBCCP)] to the user's atomic access privilege list to
determine the final list L'. The list L' includes only those (f, s)-pairs
of L where the required access is permitted on the security atom s. The
list L' together with the query conjunction and the requested access are
now forwarded to the mass memory (MM).

As we stated earlier, the mass memory stores a record as variable-length
attribute-value pairs, together with a record body. For the purpose of
identifying the security atom to which it belongs, each record is also
tagged with the security atom number as depicted earlier in Fig. 3(a).
Given a query conjunction Q and a list L of index terms (f, s), the mass
memory can then narrow its content-addressable search to those cylinders
whose numbers appear in L. For each unique cylinder number f in L, the
mass memory will access cylinder f, skip those records that are not
tagged with one of the corresponding security atom numbers s, and output
only those that satisfy the conjunction.

D. The Performance Requirement and Choice of Technology

Typically, indices for conventional databases range from 1 to 10 percent
of the size of the database [22]. In the DBC, the database needs to be
indexed to the level of cylinders (instead of tracks, pages, and offsets
within pages as in conventional systems). The total number of index terms
for the database is therefore smaller. In fact, the size of the indices
in the SM should not exceed 1 percent of the size of the database. This
has been verified for realistic applications on the DBC [14]-[16].
Therefore, the capacity requirement of the SM for a 10^10-byte database
is at most 10^8 bytes.

Another important feature required in the SM is that it should provide
sufficient search and retrieval speed, so that query conjunctions may be
processed at a rate commensurate with that of the mass memory. While the
mass memory is working on the current request, the structure memory can
work on the next request. Normally, a query conjunction contains no more
than two predicates of indexed attributes, as seen in [14]-[16]. If each
of these predicates is satisfied by 5-10 keywords, then at most 10 or 20
sets of index terms need be referenced per query conjunction.
Consequently, for accessing a set of index terms, the structure memory
requires a speed of 1 to 2 ms since all the 10 or 20 sets of index terms
must be accessed in 20 ms, which corresponds to the time required for one
disk revolution.

The above performance requirement can be met at a relatively low cost by
using one of the emerging technologies such as bubble memories and
charge-coupled devices (CCD's). According to a recent survey [20], CCD's
can access a random block in 100 μs, and their costs are projected to be
50 mcents/bit. Bubble memories can also access random blocks, but in
1 ms, and their costs are coming down to 10 or 20 mcents/bit. At the
system level, the cost of CCD memories is about 250 mcents/bit, while the
cost of bubble memories may be 30-50 mcents/bit. Since the block-oriented
bubble memories provide the required access speed at a lower cost than
CCD's, they are a very good choice for the structure memory technology.
Electron beam addressable memories (EBAM's) have also been studied in
[10] for their applicability in structure memory design. Although such
memories are expected to provide the lowest cost per bit (about 10 to 20
mcents/bit), the reliability of these memories is still uncertain.
Furthermore, to absorb the high cost of their complex circuitry, EBAM's
are cost-effective only for very large memories. In our implementation of
the DBC, either bubble memories or CCD's are the present choice for the
structure memory design.

VI. THE OVERALL ORGANIZATION OF THE STRUCTURE MEMORY

From our discussion in Section V-D, it is apparent that the structure
memory should provide for a high search speed at a low cost. With the
total size of the structure memory being of the order of 100 Mbytes, the
speed requirement implies that the memory must be content-addressable and
that the content-search operation should be carried out by multiple
processing elements. The structure memory may, therefore, be split up
into a number of sections (later called memory units), and each section
may be assigned to a separate processor.

The structure memory is made up of a segmented sequential memory (e.g.,
CCD's or bubbles). Hence, any search on such a memory can be carried out
no sooner than the data transfer time of a single physical segment. The
larger the number of segments to be serially searched, the longer will be
the total search time. It is, therefore, reasonable to try and assign a
separate processor to each physical segment. Unfortunately, a segment is
normally quite small, say up to 2 Kbytes, while the entire structure
memory size is up to 100 Mbytes. Consequently, the above assignment would
call for an extremely large number of processing elements. On the other
hand, it would be cost effective to: 1) utilize a small number of
processors, 2) assign a number of segments (later called memory modules)
to each processor, and 3) provide a mechanism to identify a single
segment (if possible) for search by each processor in response to an
index search request. The structure memory organization presented below
adheres to these guidelines.

The structure memory is organized as an array of memory unit-processor
pairs which are managed by a controller. A memory unit, in turn, is
composed of a set of memory modules. All memory modules are of the same
fixed size. A processor can address any memory module within its memory
unit, and then content-address the entire module. Furthermore, the
structure memory controller can trigger all the processors to
content-address their corresponding modules simultaneously.

A. The Notion of Bucket and Parallel Array of Memory Unit-Processor Pairs

Whenever possible, searching of the structure memory on the basis of a
given keyword should be restricted to at most one module from each memory
unit. To achieve this goal, all keywords and their index terms
corresponding to a particular attribute (and lying within a given value
range) will constitute a bucket. Each bucket is physically distributed

among the various memory units in order that it may be searched in
parallel by all the processors. Ideally, a bucket is placed in n modules,
one from each of the n different memory units.

Unfortunately, buckets are not necessarily equal in size. Therefore, a
mechanism needs to be provided for dynamically varying the amount of
physical space that is to be assigned to each bucket. The above structure
memory organization with small module size allows for such variability of
bucket size. Each bucket may be placed in one or more modules (as many as
necessary) evenly distributed among different memory units. The concept
is illustrated in Fig. 5 where the "shaded" modules contain a single
bucket.

[Fig. 5. Organization of the structure memory. Labeled elements: bucket
memory system (memory units made of memory modules (blocks), each unit
paired with a processor), look-aside buffer, and structure memory
controller. Input: request keywords for search and updates of directory
entries. Output: index terms for further processing. Note: shaded modules
constitute a single physical bucket.]

C. The Look-Aside Buffer

The look-aside buffer is used for enhancing the performance of the
structure memory. During normal operations of the database, the retrieval
of information from the structure memory is likely to be more frequent
than the update of information in the structure memory, especially
because update operations are also preceded by search and retrieval.
However, it is conceivable that during short intervals of time, a large
number of updates may have to be carried out. Such an event may adversely
affect the average retrieval rate. The use of a look-aside buffer,
implemented with fast random-access memory, is aimed at alleviating such
a degradation in structure memory performance.

When an update request is received by the structure memory, it is
temporarily placed in the look-aside buffer. The information in the
bucket memory is not immediately updated. The contents of the look-aside
buffer, therefore, represent pending updates which are yet to be
permanently recorded in the bucket memory system. Execution of the
requests in the buffer is delayed until either of the following two
conditions occurs: 1) the loading of the buffer reaches a certain
threshold value, or 2) the structure memory encounters a slack period
with no new requests awaiting execution.

Execution of a retrieval request, then, is carried out in the following
manner. Given a keyword K, the processors are simultaneously activated to
determine the set of index terms of K stored in the bucket memory. The
structure memory controller then adds to this set, if necessary, extra
index terms as a consequence of the insert requests stored in the
look-aside buffer that affect K. Similarly, delete requests stored in the
look-aside buffer may cause the deletion of some index terms from the
final set of index terms prepared for output.
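The look-aside buffer behavior (deferred updates, with pending inserts and deletes applied to each retrieval result) can be sketched in software. This is only an illustrative model under stated assumptions: the class name, the `threshold` parameter, and the use of Python sets in place of content-addressed memory modules are all hypothetical, and the parallel search by the processors is reduced to a dictionary lookup.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

IndexTerm = Tuple[int, int]  # (cylinder number f, security atom number s)


@dataclass
class StructureMemorySketch:
    """Hypothetical software model of bucket memory plus look-aside buffer."""
    bucket_memory: Dict[str, Set[IndexTerm]] = field(default_factory=dict)
    look_aside: List[Tuple[str, str, IndexTerm]] = field(default_factory=list)
    threshold: int = 4  # assumed buffer-loading threshold

    def update(self, op: str, keyword: str, term: IndexTerm) -> None:
        """Queue an insert or delete; bucket memory is not touched yet."""
        if op not in ("insert", "delete"):
            raise ValueError("op must be 'insert' or 'delete'")
        self.look_aside.append((op, keyword, term))
        if len(self.look_aside) >= self.threshold:  # condition 1: threshold
            self.flush()

    def flush(self) -> None:
        """Apply pending updates; a caller may also invoke this when idle
        (condition 2: slack period)."""
        for op, keyword, term in self.look_aside:
            terms = self.bucket_memory.setdefault(keyword, set())
            if op == "insert":
                terms.add(term)
            else:
                terms.discard(term)
        self.look_aside.clear()

    def retrieve(self, keyword: str) -> Set[IndexTerm]:
        """Index terms in bucket memory, corrected by pending updates."""
        result = set(self.bucket_memory.get(keyword, set()))
        for op, kw, term in self.look_aside:
            if kw != keyword:
                continue
            if op == "insert":
                result.add(term)
            else:
                result.discard(term)
        return result


sm = StructureMemorySketch(threshold=3)
sm.bucket_memory["K"] = {(1, 7), (2, 7)}
sm.update("insert", "K", (3, 9))
sm.update("delete", "K", (1, 7))
print(sorted(sm.retrieve("K")))  # pending updates are reflected in the result
```

Pending requests are applied in arrival order, so a later delete overrides an earlier insert of the same index term; the source does not specify this ordering, and it is a choice made here for the sketch.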
The bucket to which a keyword and its index terms belong is determined by
a separate component of the DBC, called the keyword transformation unit
(KXU) which we will discuss in Section VII. One of the functions of the
structure memory is to map a bucket name into the memory modules
allocated to the bucket. For this purpose, the controller has a small
random access memory in which it records a bucket name and stores the
corresponding module numbers. Thus, given a bucket name, all the
processors can work simultaneously on the modules which contain the
bucket.

B. The Use of Emerging Technologies

The processors of the bucket memory system must be sufficiently fast so
that the data in each memory module can be processed on the fly. Shift
register memories, made of bubble memories or CCD's, commonly have a
module size of 2 Kbytes. For an access time of 1 ms, each processor must,
therefore, be able to process data (with comparison-type operations) at
the rate of 0.5 μs/byte. This speed should be easily achievable with
relatively powerful microprocessors (or a few of them working in parallel
as a single processing element). If module size is larger, then data may
be processed in a buffered mode, with each processing element having a
random access store equal in size to a module of the bucket memory.

In summary, the complete structure memory organization is also shown in
Fig. 5. It consists of a bucket memory system, a structure memory
controller, and a look-aside buffer. Input requests are received by the
structure memory controller either in the form of keywords for subsequent
search for their index terms, or in the form of keyword-index term pairs
for intended update. Output from the structure memory consists of one or
more sets of index terms for further processing. Thus, the responsibility
of the structure memory controller consists of maintaining the
bucket-to-module maps, controlling the bucket memory system, maintaining
the look-aside buffer, taking input requests from the database command
and control processor (DBCCP), and transferring index terms to another
DBC component, namely, the structure memory information processor (SMIP)
(to be discussed in Section VII). In response to requests for keyword
search, the structure memory controller activates the processors and then
broadcasts the keyword to them for the required content-search of index
terms.

VII. THE FIVE OTHER COMPONENTS OF THE DATABASE COMPUTER

We have so far discussed the organization of the mass memory (MM) and the
structure memory (SM). But from time to time we have made reference to
the fact that some

other components are also necessary. In particular, we have
referred to the database command and control processor
(DBCCP), the security filter processor (SFP), the keyword
transformation unit (KXU), the structure memory informa-
tion processor (SMIP), and the index translation unit (IXU).
In referring to Fig. 1, we note that the structure loop, which
consists of the KXU, SM, SMIP, IXU, and DBCCP, is used
for limiting the mass memory search space, for determining
the security atoms allowed for accesses with the type A
control, and for clustering records received for insertion into
the database.
A. The Keyword Transformation Unit

The keyword transformation unit (KXU) allows the structure memory first
to readily identify the modules which contain the index terms of the
keywords by providing the associated bucket name, and then to process
index terms and keywords rapidly since the KXU transforms all information
to be stored in the structure memory into fixed-length fields.

Each attribute in the database has a unique identifier. Information about
the various attributes, supplied by the program execution system (PES),
is stored in a table of the KXU, called the attribute information table.
It includes for each attribute the minimum and maximum values, the type
of these values (numeric, floating point, alphanumeric, etc.), and the
number of ranges into which these values may be divided. For different
attributes, different hash algorithms may be used to hash the
variable-length values into fixed-length codes. These hash algorithms
constitute a hash algorithm library. We observe that in the above
process, a keyword, which is a variable-length attribute-value pair, is
transformed into a fixed-length triple (a, r, v) where a is the attribute
identifier, r is the range number in which the value belongs, and v is
the hash code of the value. The pair (a, r) is the bucket name of the
keyword. Due to hashing, the structure memory may not be able to
distinguish between values of two keywords whose attribute and range
number are identical. However, this will not result in the retrieval of
unnecessary records by the mass memory since the values of the keywords
are used and stored in the mass memory in their complete variable-length
form.

The organization of the KXU is shown in Fig. 6. It consists of a
quasi-random access memory for storing the hash algorithm library, a
random access memory for storing the attribute information table, and the
KXU control processor for performing keyword transformation and for
interfacing with the database command and control processor and structure
memory. An LSI bit-slice microprocessor may be sufficient for the
arithmetic capabilities required in the KXU control processor.

[Fig. 6. Organization of the keyword transformation unit (KXU). Output:
fixed-length keywords or logical bucket name.]

B. The Structure Memory Information Processor

The structure memory information processor (SMIP) performs intersection
on the sets of index terms delivered by the structure memory. For an
understanding of the operation of the SMIP, let us consider a query
conjunction Q,

Q = P1 ∧ P2 ∧ ... ∧ Pn

where each Pi is a predicate. The database command and control processor
(DBCCP) makes use of the structure memory and the SMIP to determine the
set of index terms to be sent to the mass memory. After the SMIP memory
is cleared, the first set of index terms for keywords satisfying P1,
called the argument set of P1, is provided by the structure memory and
then stored in the SMIP memory. Each of the stored index terms is
initially associated with a count of one, indicating the number of
predicates it has satisfied.

Next, the argument set of P2 is provided by the structure memory and sent
to the SMIP. The associated count of an existing index term in the SMIP
memory is incremented by one if the index term matches an index term of
the argument set of P2. The process for P2 is repeated for each of the
other predicates. At the end of this entire process, the stored index
terms, those whose counts are n, represent a refined list applicable to
the evaluation of Q. This list of index terms is then retrieved by the
SMIP and forwarded to the database command and control processor (DBCCP).
Subsequently, the list is checked by the DBCCP for security clearance,
before being transmitted to the mass memory.

The most important part of the above procedure is the determination of
whether an index term already exists in the SMIP memory. To perform this
task rapidly, the SMIP is implemented as a set of MU-PE pairs where MU is
a memory unit and PE is a processing element. Since the total number of
index terms stored in the SMIP memory is small (in fact, this number is
never more than the largest number of index terms of a single attribute),
the memory units (MU's) forming the SMIP memory can be made from fast
random access memory. A "double hashing" method may now be applied for
the set intersection operation. An index term (f, s) may be treated as a
single key and hashed into a number between 1 and m where m is the number
of MU-PE pairs. The index term is thus assigned to an MU-PE pair. Having
received the first argument set (that of P1), the SMIP controller hashes
each index term of this set and thereby assigns it to an MU-PE pair.
After receiving an index term of the argument set of P1, each PE uses a
second hashing

algorithm to determine the address in its MU where the
index term is to be stored together with an associated count
of one. Thus, the first argument set is distributed among the
m memory units. Index terms that hash to the same address
in an MU are chained together within the MU itself. In case
an MU runs out of space, then a chain can be extended into a
less-filled MU.
The ith argument set (namely, that of Pi, for i > 1) is
treated as follows. Each index term of this set is hashed
(using the first algorithm) by the SMIP controller and given
to the PE to which the term is hashed. All the PE's can be
working in parallel, yet searching for different index terms
(in contrast to the structure memory where all the proces-
sors search for the same keyword). After receiving an index
term, each PE applies on it the second hashing algorithm to
determine the address in its MU which starts a chain of
stored index terms. If the given index term is found in this
chain and its associated count is (i - 1), then the count is
incremented by one; otherwise, no action is taken.

[Fig. 7. Organization of the structure memory information processor
(SMIP).]

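The counting intersection with two levels of hashing can be sketched sequentially in software. This is a minimal sketch under assumptions, not the SMIP hardware: the two hash functions are arbitrary placeholders, MU-PE pairs are numbered from 0 rather than 1, the PE's work one after another instead of in parallel, and chain overflow into a less-filled MU is omitted. All function and parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple

IndexTerm = Tuple[int, int]  # (cylinder number f, security atom number s)


def first_hash(term: IndexTerm, m: int) -> int:
    """Controller-level hash (placeholder): pick one of m MU-PE pairs."""
    f, s = term
    return (31 * f + s) % m


def second_hash(term: IndexTerm, slots: int) -> int:
    """PE-level hash (placeholder): pick a chain head inside the MU."""
    f, s = term
    return (17 * f + 7 * s) % slots


def smip_intersect(argument_sets: Sequence[Sequence[IndexTerm]],
                   m: int = 4, slots: int = 8) -> List[IndexTerm]:
    """Return index terms present in every argument set, using counts."""
    # memory_units[u][a] is the chain of [term, count] entries at address a of MU u.
    memory_units = [[[] for _ in range(slots)] for _ in range(m)]
    for i, arg_set in enumerate(argument_sets, start=1):
        for term in set(arg_set):  # assumption: duplicates within a set are ignored
            chain = memory_units[first_hash(term, m)][second_hash(term, slots)]
            if i == 1:
                chain.append([term, 1])  # first set: store with a count of one
                continue
            for entry in chain:
                # Increment only if the term matched all i - 1 earlier predicates.
                if entry[0] == term and entry[1] == i - 1:
                    entry[1] = i
                    break
    n = len(argument_sets)
    return sorted(entry[0]
                  for unit in memory_units
                  for chain in unit
                  for entry in chain
                  if entry[1] == n)


sets = [
    [(1, 7), (2, 7), (3, 9)],  # argument set of P1
    [(2, 7), (3, 9), (4, 1)],  # argument set of P2
    [(3, 9), (2, 7), (5, 5)],  # argument set of P3
]
print(smip_intersect(sets))
```

Requiring the count to equal i - 1 before incrementing ensures that a term absent from an intermediate argument set can never reach a count of n.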
When all the argument sets have been processed in this
fashion, the stored index terms, having an associated count
of n, are output for further processing. The hardware
organization of the SMIP is shown in Fig. 7. Each memory unit is a single
module of random access memory. The processing elements are made of
microprocessors and are capable of doing comparison-type operations as
well as executing the second hashing algorithm. The SMIP controller must
be quite fast since it executes the first hashing algorithm on all the
index terms of the argument sets. However, a very simple but effective
algorithm [10] may be used for this purpose, so that the SMIP controller
can process index terms at the same rate as it receives them. The common
memory bus is used for data transfer when an MU overflows and requires
space within another MU.

C. The Index Translation Unit

The index terms stored in the structure memory (SM) and manipulated by
the structure memory information processor (SMIP) are actually
represented in an intermediate form. The purpose of the index translation
unit (IXU) is to translate them into a usable form for the mass memory
(MM). The other function of the IXU is the assignment and release of
cluster identifiers and security atom names, on demand from the database
command and control processor (DBCCP).

The DBC allows different users to create files of the database. A user
may create one or more files. The creator of a file determines the
attributes of the file, the clustering needs, and the access rights of
the users of the file. The use of files with different primary clustering
attributes allows the database computer (DBC) to support different types
of data structures such as hierarchical, relational, and network data
models. Furthermore, it may allow different security provisions to be
assigned at the file-level of the database.

There can be a large saving of storage in the structure memory (SM) and
in the structure memory information

of mass memory cylinders. In this case, for an index term of a keyword of
some file, instead of storing the absolute cylinder number, only a
relative number is stored with respect to other cylinders occupied by the
same file. However, since these relative numbers have to be converted
into absolute cylinder numbers before being passed on to the mass memory
(MM), a cylinder address table is maintained by the IXU for every file of
the database.

For an estimate of the type of storage savings that may be achieved,
consider a large database with 40 000 cylinders. An absolute cylinder
number, then, is 16 bits long. If a file is limited to at most 256
cylinders, then only 8 bits are sufficient for a relative cylinder
number. Therefore, a 50 percent saving can be achieved in storing
cylinder numbers in the structure memory information processor (SMIP).

In addition to cylinder address tables, the IXU also maintains a cluster
identifier bit map and a security atom name bit map. These bit maps are
used to keep track of the allocation and release of cluster identifiers
and security atom names.

Index terms from the structure memory information processor (SMIP) are
received in a burst mode and stored in a buffer made of sequential access
memory. These index terms are expanded by the IXU control processor, one
at a time, by making use of the cylinder address table. The expanded
index terms are sent to the database command and control processor
(DBCCP). The IXU also receives requests from the DBCCP for allocation and
release of cluster identifiers and security atom names. The bit maps are
used for answering such requests. The size of the bit maps and cylinder
address table of each file is estimated to be less than 1K bytes. Hence,
a small random access memory is used for storing this information about
the "current" file. However, because there may be hundreds of files in the
processor (SMIP) if the index terms are reduced in size. This database, the information about the aggregate of all files is
is possible since files are allowed to occupy only disjoint sets stored in a bulk memory.
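
The relative-to-absolute cylinder translation and the 50 percent storage estimate described in this section can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the IXU design itself: the class name CylinderAddressTable, the methods assign and expand, the function bits_needed, and the file name "payroll" are all hypothetical. The sketch assumes that a relative cylinder number is simply the position of a cylinder in its file's table.

```python
def bits_needed(count):
    """Number of bits required to distinguish `count` distinct numbers."""
    return max(1, (count - 1).bit_length())


class CylinderAddressTable:
    """Hypothetical per-file table: entry i of a file's list holds the
    absolute number of the cylinder whose relative number is i."""

    def __init__(self, max_cylinders_per_file=256):
        self.max_cylinders_per_file = max_cylinders_per_file
        self._by_file = {}  # file name -> list of absolute cylinder numbers
        self._owner = {}    # absolute cylinder number -> owning file name

    def assign(self, file_name, absolute):
        """Give cylinder `absolute` to a file and return its relative number.

        Files occupy disjoint sets of cylinders, so a cylinder that already
        belongs to some file is rejected.
        """
        if absolute in self._owner:
            raise ValueError("cylinder %d is already assigned" % absolute)
        cylinders = self._by_file.setdefault(file_name, [])
        if len(cylinders) >= self.max_cylinders_per_file:
            raise ValueError("file %r has no relative numbers left" % file_name)
        cylinders.append(absolute)
        self._owner[absolute] = file_name
        return len(cylinders) - 1

    def expand(self, file_name, relative):
        """Translate a relative cylinder number into an absolute one."""
        return self._by_file[file_name][relative]


table = CylinderAddressTable()
table.assign("payroll", 31000)  # becomes relative cylinder 0
table.assign("payroll", 120)    # becomes relative cylinder 1

absolute_bits = bits_needed(40000)  # whole database of 40 000 cylinders
relative_bits = bits_needed(256)    # at most 256 cylinders per file
saving = 1 - relative_bits / absolute_bits
```

With the figures quoted in the text, bits_needed gives 16 bits for an absolute number and 8 bits for a relative one, so saving evaluates to 0.5, which matches the 50 percent estimate.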

D. The Security Filter Processor

The major function of the security filter processor (SFP) is to enforce
the field-level (i.e., type B) security of the database. After the
records have been retrieved from the database by the mass memory (MM) in
response to a user query conjunction, they are individually checked for
security clearance. The SFP is capable of extracting (removing) specified
attribute-value pairs from the retrieved records and sending only (or
none of) these keywords to the database command and control processor
(DBCCP).

It might appear that, unlike record retrieval requests, record update and
record deletion requests may cause difficult problems if they are to be
checked for type B security. This misconception is based on the notion
that, once in a while, original copies of deleted or modified records
would have to be restored if they have violated the type B security.
However, such a problem never appears in the DBC. We recall that record
update and record deletion take place in two distinct steps. Both of
these operations require that records be first selected on the basis of a
given criterion (query conjunction). This is the selection phase (or read
phase). The retrieved records are post-checked for type B clearance, and
only those that are cleared may then be modified or deleted from the
database. This modification or deletion is the write phase, and it
signals the end of the update or deletion operation. In other words, no
deletion or modification of the original database takes place prior to
the post-checking for the type B clearance. If there is an overwhelmingly
large number of records to be updated by the SFP due to field-level
security control and processing, the mass memory (MM) may neither send
the next "batch" of records, if any, to the SFP nor write newly modified
records back to MM since the SFP is still busy. In this case, the MM
misses a disk revolution and attempts to either send the retrieved
records or write the modified records in the next revolution. There is no
outstanding problem. The lesson to learn is that large amounts of updates
due to the type B control will take a longer amount of time. However,
typical updates follow the 90-10 rule (see Section III-A), i.e., only
5-10 percent of data is required to be written back to the MM. Therefore,
contention for the MM is typically not present.

The organization of the SFP is shown in Fig. 8. Input to the SFP consists
of records retrieved from the mass memory (MM) and commands and the type
B security specification from the database command and control processor
(DBCCP). Input records that form the response set of a query conjunction
are stored by the SFP in a random access memory and, thus, are accessed
by all the processing units of the SFP. The type B security
specifications are stored in a quasi-random access storage. Whenever
needed, the specifications for a user are loaded by the
access-authorization unit for the type B security checking. Records that
do not qualify for access are deleted from the random access memory. The
post-processing unit performs set functions (such as maximum and average)
on the response set of a query conjunction. The records in their entirety
or certain portions of the records, extracted by the field-extraction
unit, are sent back to the database command and control processor
(DBCCP). Each of the three processing units is implemented as pairs of a
circulating memory and a processing element. Thus, each of these units
can carry out fast comparison-type operations simultaneously on a number
of records, thereby providing rapid response to the user request.

Fig. 8. Organization of the security filter processor (SFP). [Figure
labels: input of commands and security specifications from the DBCCP;
input of records; output to the DBCCP of records or portions thereof;
output to the MM of records cleared by type B control for update;
post-processing unit; field-extraction unit.]

E. The Database Command and Control Processor

The database command and control processor (DBCCP) provides the control
of the entire system as discussed in Section II in referring to Figs. 1
and 2. In addition, the DBCCP performs clustering. Records to be inserted
in the database are physically clustered by the DBCCP according to their
primary and secondary clustering attributes. In doing the clustering, the
DBCCP maintains a cylinder space table, indicating the space available in
each mass memory cylinder, and a cluster information table, showing,
first, the definition of each cluster in terms of the keywords with
primary and secondary clustering attributes and, second, the numbers of
the cylinders currently occupied by the cluster. These tables, together
with the PES-supplied estimates on the space requirement of the files,
support the clustering mechanism of the DBCCP.

Whenever a record is to be inserted in the database, its cluster number
is first determined by reference to the cluster information table. The
corresponding cylinder numbers found in this table represent candidate
cylinders in which the new record may be inserted. The space vacancy of
the candidate cylinders is reflected in the cylinder space table. Once a
cylinder is determined, it is accessed by the mass memory. The detailed
space availability data of each track is found in the header information
of the track. The DBCCP then selects a track with the maximum amount of
available space. The header for that track is updated and the new record
is stored in the track. A detailed algorithm for cylinder selection is
presented in [9].

For the type A access control, the DBCCP performs the following. For each
query conjunction in an access
command, a set of index terms is received from the structure memory via
the structure memory information processor (SMIP) and the index
translation unit (IXU) in a pipelined fashion. These index terms carry
information on the security atoms to which the records satisfying the
query conjunction may belong. Accordingly, only those index terms are
sent to the mass memory whose atoms are authorized for access. The DBCCP
checks the access authorization by using atomic access privilege lists
which show, for every user, the access rights on each atom of a file.
Such a list is prepared by the DBCCP on a one-time basis for every user
of a file. Finally, the mass memory does its share in security checking
by accessing the records that not only satisfy the given query
conjunction, but are also tagged with the numbers of the atoms authorized
for access. For the type B security, checks on any access are done solely
by the security filter processor (SFP). In performing this operation, the
SFP makes use of the security specifications supplied on a one-time basis
by the DBCCP.

In Fig. 2, we have sketched the path in the DBC data loop through which
commands and data flow. Access commands are security-checked in the DBCCP
unless they have the type B security requirement. Insert commands result
in activating the record clustering mechanism of the DBCCP.

The DBCCP can be implemented on a moderately powerful minicomputer with
sufficient random access memory to store the information on the
characteristics of only the active files and active users. Other
information may be stored in a conventional disk. The minicomputer should
preferably be microprogrammable, so that the various functions of the
DBCCP may be directly implemented in firmware.

Although it is in charge of a number of different tasks, the DBCCP
performs only a limited number of tasks during the processing of a single
command. If a command, on the average, requires access to one or two
content-addressable cylinders in the mass memory, then the DBCCP should
be able to handle a command within the time it takes for one or two disk
revolutions (i.e., 20-40 ms). By using a minicomputer and implementing
the various tasks in firmware, it is anticipated that the DBCCP will be
able to cope with the above performance requirement.

VIII. CONCLUDING REMARKS

Since a large number of common database management functions are
implemented in hardware, the DBC is expected to perform appreciably
better than the computers that provide these functions by software means.
High cost of and long delay in software security enforcement may also be
absorbed by the hardware. In addition, it should be
performance-and-cost-effective to support very large databases in an
on-line and interactive mode, since the DBC's database is stored in
relatively low-cost and simply modified moving-head disks. The mass
memory information processor (MMIP), if need arises, may be expanded to
simultaneously handle disk cylinders each of which is from a separate
disk drive. In this expansion, it is only necessary that the number of
track information processors (TIP's) in the MMIP be increased
accordingly, i.e., one set of TIP's for each drive. Although the mass
memory is expanded into even larger content-addressable blocks (each
block being made up of several cylinders), the need for a structure
memory is still there, since no two blocks may be accessed concurrently.
However, as the size of these blocks grows, the need for clustering and
the amount of indexing decrease. Thus, the structure memory may decrease
in size. Another benefit may occur if there is a multiplicity of MMIP's
where each MMIP handles a separate query conjunction, thereby allowing
user queries to be multiprocessed.

A. A Raw Estimate of the Hardware Performance¹

A rather gross first-order analysis of the DBC hardware may proceed as
follows. The mass memory logic is designed to process an entire cylinder
in one revolution. Because a cylinder generally consists of between 20
and 40 tracks, and because conventional disk systems process one track at
a time, we can expect a performance improvement factor of between 20 and
40 over conventional disk systems. Furthermore, since the structure loop
can be processing a current request while the mass memory is processing a
previous one, a performance improvement factor of 2 can be expected over
conventional systems which process or store both the indices and database
at the same time or on the same storage medium. In addition, the high
degree of pipelining of the DBC components and the clear delineation of
front-end general-purpose processing from back-end special-purpose
database management may allow a performance improvement factor of 2.
Thus, the DBC is likely to have a hardware processing power which is
(20, or 40, x 2 x 2 =) 80 to 160 times that of conventional
software-based systems.

¹ This estimate was suggested to us by Gordon Bell during a presentation
of DBC architecture at DEC by one of the authors.

B. Hardware Performance and Limitations

Several simulation experiments [21] have been carried out to determine
the response times to query conjunctions and possible bottlenecks in the
DBC hardware. In the simulation study, record retrieval requests to the
DBC were assumed to represent 50 percent of all requests. Since the DBC
is designed primarily to respond to the retrieval requests rapidly and
the update requests adequately, this low retrieval percentage was
expected to be a worst-case performance measure. Retrieval requests as
well as update requests may require the use of query conjunctions. A job
in the simulation model consists of a single query conjunction and its
associated access operation.

A request is processed first in the structure loop of the DBC and then in
the data loop. When a job, i.e., a query conjunction, is scheduled for
processing by the structure loop, its predicates are first translated by
the keyword transformation unit (KXU), index terms for keywords
satisfying the predicates are then retrieved from the structure memory
(SM) and intersected in the structure memory information processor
(SMIP), and finally, the resulting index terms are translated by the
index translation unit (IXU). In the data loop, a job is associated with
a cylinder number.
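
As a quick check of the first-order estimate in Section VIII-A and of the 20-40 ms DBCCP budget given in Section VII-E, the arithmetic can be written out in a few lines of Python. This is only a sketch: the function names estimated_improvement and dbccp_time_budget_ms are hypothetical, and the factors are the round numbers quoted in the text (20 to 40 tracks per cylinder, a factor of 2 for loop overlap, a factor of 2 for pipelining, and 20 ms per disk revolution).

```python
def estimated_improvement(tracks_per_cylinder, loop_overlap=2, pipelining=2):
    """First-order improvement factor over a conventional system.

    tracks_per_cylinder: tracks read in parallel in one revolution.
    loop_overlap: gain from overlapping the structure loop and data loop.
    pipelining: gain from pipelined components and the front-end/back-end
        separation.
    """
    return tracks_per_cylinder * loop_overlap * pipelining


def dbccp_time_budget_ms(cylinders_accessed, revolution_ms=20):
    """Time available to the DBCCP per command, at one revolution per
    content-addressable cylinder accessed."""
    return cylinders_accessed * revolution_ms


low = estimated_improvement(20)
high = estimated_improvement(40)
budget = (dbccp_time_budget_ms(1), dbccp_time_budget_ms(2))
```

Here low and high evaluate to 80 and 160, and budget to (20, 40), which agree with the figures stated in the text. The three factors are multiplied because the text treats them as independent gains, which is itself a simplifying assumption of the estimate.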

The results of the simulation are as follows. Assuming that the SMIP and
the IXU can match the processing speed of the structure memory (SM) and
that the KXU provides a fixed processing delay, the response time to
requests in the structure loop increases rather rapidly as the access
time of the structure memory increases. For instance, for a KXU
processing delay of 1 ms, the response time is about 35 ms when the
structure memory access time is 1 ms. The response time increases to
about 120 ms when the structure memory access time is 2 ms and KXU delay
is 1 ms. The structure memory reaches 90 percent or greater utilization
with a 2 ms access time. The response times given above are measured for
requests that are composed of 50 percent retrieval requests, with query
conjunctions being made up of an average of 4 predicates of indexed
attributes. The response time is improved by 10-20 percent when a
look-aside buffer is used.

The data loop is slightly slower because of the assumption that the disk
revolution time is 20 ms and a processing time of 15 ms is required by
the security filter processor (SFP). Jobs arriving at the mass memory may
be placed in one of several queues based on the cylinder to be accessed.
Good performance can be achieved by executing in sequence all those jobs
that are queued up to the same cylinder. In general, the wait time of
jobs improves rapidly until the number of queues reaches 4 or 5, and
there is very little improvement beyond that point. Due to a limited
buffer space in the track information processors (TIP's) and a limited
capacity of the bus carrying information from the TIP's to the mass
memory controller and beyond, it is not always possible to execute a job
in one disk revolution time even if it refers to a single cylinder.
However, jobs requiring the read-out of complete cylinders are very rare.
Therefore, the average number of disk revolutions per job (i.e., query
conjunction) remains very close to 1.

C. Performance Evaluation of the DBC in Supporting the Existing
Applications

We have also investigated the manner in which the DBC supports
hierarchical [12], CODASYL [13], and relational [5] databases. An
existing database may be supported on the DBC by converting the database
to conform to the DBC representation of data. This one-time conversion is
known as database transformation. We do not require the user to reprogram
his database management applications. Instead, we provide an interface
which in real-time translates the database management calls issued by the
application programs into DBC commands. Because DBC commands constitute a
high-level data language which closely resembles many high-level data
languages and calls of contemporary systems, the translation is
straightforward and the interface requires minimal software. Such a
process is known as query translation. Both the tasks of database
transformation and query translation are charged to a software package
called the DBC interface which resides in the front-end computer system.
Thus, the interface, together with the database computer, replaces a
full-scale software database management system and its conventional disk
storage. However, it does not replace the application programs written
for the database and run in the general-purpose front-end computers.

It has been estimated [14]-[16] that in supporting these applications on
the DBC, the database transformation may result in a database storage
requirement as much as 1.5 or 2 times that in a conventional system. This
excess storage requirement, however, is adequately offset by one or more
orders of magnitude improvement in the execution time of user
transactions. Furthermore, the storage requirement for the indices
decreases by one or more orders of magnitude. Finally, the size of the
software (i.e., the DBC interface) is expected to be several orders of
magnitude smaller than conventional database management software.

D. Future Work

Certain important problems such as recovery from failure, concurrency
control, and integrity validation are currently being delegated to
software in the front-end system. Future research is anticipated,
therefore, in improving the DBC to provide some hardware solutions to the
aforementioned problems and relieve the front-end system further from
much of database software. We would also like to investigate more
thoroughly the performance bottlenecks of the DBC, in particular, the
mass memory, the database command and control processor, and the security
filter processor due to their complexity in design and elaborate usage.
The anticipated security cost in utilizing both types A and B will be
studied. Preliminary analysis of DBC performance and capability, however,
tends to indicate that the DBC may indeed perform very well in realizing
the conventional database management applications. This leads us to
believe that database machines in general and the DBC in particular may
become viable special-purpose computers for very large database
management.

ACKNOWLEDGMENT

The work reported here is the result of research initiated by D. K.
Hsiao, contributed first by R. I. Baum, expanded by K. Kannan, and
continued by J. Banerjee under the supervision of D. K. Hsiao. The
authors of this paper are listed alphabetically.

The authors thank R. I. Baum for his contributions to the database
computer project. Portions of this paper are derived from project reports
available either through NTIS under AD-A03415, AD-A035178, and
AD-A036217, or from The Ohio State University under OSU-CISRC-TR-76-1,
OSU-CISRC-TR-76-2, and OSU-CISRC-TR-76-3. These reports were issued in
September, October, and December of 1976, respectively, and were
coauthored by either R. I. Baum, D. K. Hsiao and K. Kannan, or D. K.
Hsiao and K. Kannan.

REFERENCES

[1] D. K. Hsiao and S. E. Madnick, "Database machine architecture in the
    context of information technology evaluation," in Proc. 3rd Int.
    Conf. on Very Large Data Bases, ACM, NY, 1977, pp. 63-84.
[2] R. I. Baum and D. K. Hsiao, "Database computers - A step toward data
    utilities," IEEE Trans. Comput., vol. C-25, pp. 1254-1259, Dec. 1976.
[3] S. Y. W. Su and G. J. Lipovski, "CASSM: A cellular system for very
    large data bases," in Proc. 1st Int. Conf. on Very Large Data Bases,
    ACM, NY, Sept. 1975, pp. 456-472.
[4] C. S. Lin, D. C. P. Smith, and J. M. Smith, "The design of a rotating
    associative memory for relational database applications," ACM Trans.
    Database Syst., vol. 1, pp. 53-65, Mar. 1976.
[5] E. F. Codd, "A relational model of data for large shared data banks,"
    Commun. ACM, vol. 13, pp. 377-387, June 1970.
[6] E. A. Ozkarahan, S. A. Schuster, and K. C. Smith, "RAP - Associative
    processor for data base management," in AFIPS Conf. Proc., vol. 44,
    1975, pp. 379-388.
[7] E. A. Ozkarahan and K. C. Sevcik, "Analysis of architectural features
    for enhancing the performance of a database machine," ACM Trans.
    Database Syst., vol. 2, pp. 297-316, Dec. 1977.
[8] R. Moulder, "An implementation of a data management system on an
    associative processor," in Proc. AFIPS Nat. Comput. Conf., vol. 42,
    1973, pp. 171-176.
[9] J. Banerjee, R. I. Baum, and D. K. Hsiao, "Concepts and capabilities
    of a database computer," ACM Trans. Database Syst., vol. 3, pp.
    347-384, Dec. 1978. Also available in R. I. Baum, D. K. Hsiao, and
    K. Kannan, "The architecture of a database computer - Part I:
    Concepts and capabilities," The Ohio State Univ., Columbus, Tech.
    Rep. OSU-CISRC-TR-76-1, Sept. 1976.
[10] K. Kannan, D. K. Hsiao, and D. S. Kerr, "A microprogrammed keyword
    transformation unit for a database computer," in Proc. 10th Annu.
    Workshop on Microprogramming, Oct. 1977, Niagara Falls, NY; and D. K.
    Hsiao, K. Kannan, and D. S. Kerr, "Structure memory designs for a
    database computer," in Proc. ACM 77 Conf., Oct. 1977, Seattle, WA.
    Also available in D. K. Hsiao and K. Kannan, "The architecture of a
    database computer - Part II: The design of the structure memory and
    its related processors," The Ohio State Univ., Columbus, Tech. Rep.
    OSU-CISRC-TR-76-2, Oct. 1976.
[11] K. Kannan, "The design of a mass memory for a database computer," in
    Proc. 5th Annu. Symp. on Computer Architecture, Apr. 1978, Palo Alto,
    CA. Also available in D. K. Hsiao and K. Kannan, "The architecture of
    a database computer - Part III: The design of the mass memory and its
    related processors," The Ohio State Univ., Columbus, Tech. Rep.
    OSU-CISRC-TR-76-3, Dec. 1976.
[12] IBM, Information Management System/Virtual Storage (IMS/VS) Version
    1, General Information Manual, GH20-1260-4.
[13] CODASYL Data Base Task Group Report, ACM, NY, Apr. 1971.
[14] J. Banerjee, D. K. Hsiao, and F. K. Ng, "Data network - A computer
    network of general-purpose front-end computers and special-purpose
    back-end database machines," in Proc. Int. Symp. on Comput. Network
    Protocols (A. Danthine, Ed.), Liege, Belgium, Feb. 1978, pp.
    D6-1-D6-12. Also available in D. K. Hsiao, D. S. Kerr, and F. K. Ng,
    "DBC software requirements for supporting hierarchical databases,"
    The Ohio State Univ., Columbus, Tech. Rep. OSU-CISRC-TR-77-1, Apr.
    1977.
[15] J. Banerjee, D. K. Hsiao, and D. S. Kerr, "DBC software requirements
    for supporting network databases," The Ohio State Univ., Columbus,
    Tech. Rep. OSU-CISRC-TR-77-4, June 1977.
[16] J. Banerjee and D. K. Hsiao, "Performance evaluation of a database
    computer in supporting relational databases," in Proc. 4th Int. Conf.
    on Very Large Data Bases, Berlin, Germany, Sept. 13-15, 1978; and J.
    Banerjee and D. K. Hsiao, "The use of a 'non-relational' database
    machine for supporting relational databases," in Proc. 4th Workshop
    on Comput. Architecture for Non-Numeric Processing, Syracuse, NY,
    Aug. 1-3, 1978. Also available in J. Banerjee and D. K. Hsiao, "DBC
    software requirements for supporting relational databases," The Ohio
    State Univ., Columbus, Tech. Rep. OSU-CISRC-TR-77-7, Nov. 1977.
[17] PTD-9300 Parallel Transfer Disk Drive, Ampex Corporation, Redwood
    City, CA. (A product announcement communicated to the authors in May
    1978.)
[18] A. S. Hoagland, "Magnetic recording storage," IEEE Trans. Comput.,
    vol. C-25, pp. 1283-1289, Dec. 1976.
[19] E. J. McCauley, III, "Highly secure attribute-based file
    organization," in Proc. 2nd USA-Japan Comput. Conf., Aug. 1975, pp.
    497-501.
[20] L. Altman, "New arrival in the bulk storage inventory," Electronics,
    vol. 51, pp. 106-113, Apr. 13, 1978.
[21] D. K. Hsiao and K. Kannan, "Simulation studies of the database
    computer (DBC)," The Ohio State Univ., Columbus, Tech. Rep.
    OSU-CISRC-TR-78-1, Feb. 1978.
[22] G. F. Coulouris et al., "Towards content-addressing in data bases,"
    Comput. J., vol. 15, pp. 95-98, Feb. 1972.

Jayanta Banerjee received the Bachelor of Technology degree in
electronics and electrical communication engineering in 1973 and the
Master of Technology degree in computer science both from the Indian
Institute of Technology, Kharagpur, in 1975.
He held a research assistantship involving work on a PDP-1 computer at
the Indian Institute of Technology from August 1974 until May 1975, and
also did work on building up a time-sharing system on the PDP-1 computer.
In September 1975 he joined the Department of Computer and Information
Science, The Ohio State University, Columbus, as a Graduate Teaching
Associate. Subsequently, he passed the General Examination and became a
Ph.D. candidate. He is presently a Graduate Research Associate. Since
then he has been working on his doctoral in the area of computer
architecture and systems programming. He has written two technical
reports and several papers with D. K. Hsiao and other project members on
database computers. In the summer of 1978 he visited Univac for the
purpose of helping the Univac staff in a joint study.
Mr. Banerjee is a member of the Association for Computing Machinery,
Sigma Xi, and the IEEE Computer Society.

David K. Hsiao (M'68-SM'77) received the Ph.D. degree from the University
of Pennsylvania, Philadelphia.
He conducted research at the Honeywell Information Sciences Research
Center and taught at the Moore School of Electrical Engineering at the
University of Pennsylvania. In the fall of 1975 he was a Visiting
Professor of Management at the Sloan School of M.I.T. In the summer of
1976 he was a Faculty Associate at the IBM Research Laboratory, San Jose,
CA. He is currently a Professor of Computer and Information Science at
The Ohio State University, Columbus.
Dr. Hsiao has published widely in the area of database systems design and
engineering. He is the author of the textbook, Systems Programming -
Concepts of Operating and Database Systems (Reading, MA: Addison-Wesley),
and is coauthor of the monograph, Computer Security (New York: Academic,
ACM Monograph Series).

Krishnamurthi Kannan was born in Madras, India, on July 25, 1948. He
received the B.Tech. and the M.Tech. degrees in electrical engineering in
1970 and 1972, both from the Indian Institute of Technology at Bombay and
Kanpur, respectively, and the Ph.D. degree in computer science from The
Ohio State University, Columbus, in 1977.
He joined the Research Staff of the IBM T. J. Watson Research Center,
Yorktown Heights, NY, in 1977. His main research interests are database
systems, computer architecture, and distributed systems.
