
Heterogeneous Distributed Database Management: The HD-DBMS

ALFONSO F. CARDENAS
Invited Paper

The proliferation of different DBMS and advances in computer networking and communications have led to increasing heterogeneous distributed DBMS network scenarios. Major heterogeneity problems and challenges include: different database models, syntactically and semantically different DBMS, different types of controls (recovery, etc.), etc. We address herein the long-range goal for a heterogeneous distributed DBMS (HD-DBMS) to be able to support a network in which any user in any node can be given an integrated and tailored view or schema, while in reality the data may reside in one single database or in physically separated databases, managed individually by the same type of DBMS (by the only one the user understands) or by different DBMS. We cite the major approaches to data sharing and accessing: from the primitive commercial file and database unload/load and PC download, to common interfaces on top of existing DBMS, to the R&D and prototype efforts toward the long-range desires. Commercial availability of the more encompassing thrusts may become a reality with the mounting problems, opportunity costs, and demand for data sharing in the heterogeneous world. Major research and development projects in this arena are leading toward some partial attainment of the long-range objective. The UCLA HD-DBMS project is highlighted herein, with a presentation of its status, progress, and plans. It is a longer range project, with the unique feature of allowing any user in the network to use a preferred database model and DML to access or update any data in the heterogeneous network. HD-DBMS is to provide a multilingual interface to heterogeneous distributed databases.

I. INTRODUCTION

The use of different generalized database management systems (DBMS) has proliferated in recent years. As a result, the heterogeneous distributed database management system scenario has emerged. An example is shown in Fig. 1. A variety of large and small computers and even personal computers, most of them with their own and incompatible DBMS, may be tied together in a network as shown. Satellite communication may be involved between distant nodes. Local networks of computers might be involved, such as at location X in the figure. Database machines may be involved in managing the database(s) at a node, or in a local network.

The heterogeneous database environment has emerged in many organizations, governmental environments, and computer networks due to a) the proliferation of databases, b) the proliferation of different DBMS, c) the proliferation of a variety of minicomputers and personal computers, d) the emergence of networks tying together heterogeneous hardware and software, e) advances in data communications, f) distributed databases, and g) lack of overall (not just local) database planning and control.

This environment adds to all the challenges and problems for the homogeneous distributed environment the problems of heterogeneity of DBMS: different data models (network, hierarchical, relational, etc.), syntactically and semantically different DBMS (e.g., even within the relational model family there are significant differences between SQL and QBE), different types of controls in each GDBMS (e.g., backup and recovery, locking and synchronization, etc.). It is desired that a future heterogeneous distributed DBMS (HD-DBMS) provide not only distribution transparency but also heterogeneity transparency.

The example, in Fig. 1, shows four databases involved: at location X there is a database managed by a relational DBMS and another managed by a network DBMS (e.g., a CODASYL system) on another local computer, and at two other remote locations there are two separate databases, each managed by a hierarchical DBMS such as IMS. With current technologies every user accessing any database is expected to use the facilities and abide by the syntactic and semantic regulations of the DBMS which created each database, unless some interface software is developed by the installation. Although some such interface software is, of necessity, being developed frequently by user installations, thus far it allows only cosmetic variations from the syntax and semantics of the DBMS managing the particular database.

Manuscript received October 22, 1985; revised November 26, 1986.

The author is with the Computer Science Department, University of California, Los Angeles, CA 90024, USA, and with Computomata International Corp., Los Angeles, CA 90025, USA. IEEE Log Number 8714292.

PROCEEDINGS OF THE IEEE, VOL. 75, NO. 5, MAY 1987


Fig. 1. Heterogeneous database management system scenario-an example.

What would be greatly desired to enhance the attractiveness and usefulness of sharing data resources in a heterogeneous network, as shown in Fig. 1, is the ability for a user to access any database as if it were managed under any one of the DBMS at one central location. Thus a user could have access to any database through a relational view at one of the minicomputers in the local network at location X, while another set of users, at nodes where IMS databases reside, could have access to any database as if it were managed by IMS. Ideally, a user anywhere could look at any database through his favorite DBMS, whether or not it was the preferred one at his site.

There will be, of course, many users who will confine their database accesses to a local database managed by the local DBMS. In fact, they will undoubtedly constitute the majority of the bulk applications. However, there is a growing population of users across the heterogeneous scenario whose needs we address herein.

In a nutshell, the ideal long-range goals would be for an HD-DBMS to be able to support a network in which any user in any node can be given an integrated and tailored view or schema, while in reality the data may reside in one single database or in physically separated databases, managed individually by the same type of DBMS (by the only one the user understands) or by a different DBMS.

No HD-DBMS with such full capabilities is available today. There are many unsolved problems, and others remain to be uncovered. However, major research and development projects in this arena are leading toward some partial attainment of the previous long-range objectives. Section II outlines the range of approaches to the heterogeneous challenge, from the extreme of database unload/load, to a common interface for DBMS, to the top-of-the-line and long-range R&D and prototype efforts. Section III outlines the UCLA HD-DBMS project and its progress toward the longer range goals.
II. APPROACHES TO COMMUNICATION IN A HETEROGENEOUS ENVIRONMENT

A. File and Database Unload/Load

One extreme and simplistic approach to accessing data in a heterogeneous environment is to physically unload the data from the source hardware/software environment, then store them in a common format understood and handled by both source and target environments, and load them into the target environment. This approach in fact has been used to unload/load data files across heterogeneous environments for several years. The common format has been usually ASCII. In a number of cases, specialized types of data are unloaded/loaded via common formats specially designed and tailored to carry data descriptions and other semantic information from source to target. Examples are satellite telemetry data, geographical data types, etc.

With the emergence and proliferation of 1) personal computers and the many different types (IBM PC, Apple's Macintosh, etc.), 2) Local Area Networks (LANs), and 3) incompatible software packages (spread sheets, word processors, file and database managers, etc.), the need for unload/load has increased. There is an increasing number of commercial file transfer programs whose task is to help transfer files from one machine to another, providing increasing levels of help and transparency over the many details that heterogeneity springs onto the unload/load process.

An examination of the generalized file transfer technology commercially available shows that the data that can be easily transferred are essentially sequential files. Even random-access files are not easily transferable. The transfer of more sophisticated files such as indexed sequential files, e.g., VSAM in large IBM operating systems, is not transparent and usually is not automated (try, for example, to transfer such files between Honeywell and IBM environments). A usual approach is to unload such indexed files to sequential files, stripping all indexing and other vendor-specific control information, use the sequential ASCII file transfer route, and load into the equivalent version (including indexing) on the target environment. Conversion software houses and specialists are usually necessary to do this.

Simplistic unload/load or file transfer programs are of little help in a database environment. All the crucial relatability know-how, indexing, and/or hashing, would be lost in converting the database into a number of individual


sequential ASCII files. The process of loading the database into the target environment would involve new database definitions, new indexing definitions, invocation of loading utilities that might require special formatting over the files being transferred, etc. In all, it is a practically most difficult process. Try converting a CODASYL database schema and contents from any vendor you select to IMS or vice versa.

Specialized Database Load/Unload: A number of pioneering efforts on the subject of file description and translation in a true database environment were started in the 1970s, [38] and others. Other more recent efforts include IBM's Express [42]. Due to commercial interests, a number of specialized database unload/load packages have been developed by a number of vendors. The predominant ones are

1) those that unload from a nonrelational database system and load to a relational database system;
2) those that download a database or portions of a database from a mainframe computer to a smaller computer or PC.

The subtle difference between these two types of packages is that the latter are more numerous, usually less sophisticated, and generally download into sequential ASCII files for input to simple file handlers such as spread sheets, graphics packages, etc.

Among the most frequently cited mainframe database load/unload software bridges are IMS's Extract to unload portions from an IMS database and load it as an equivalent SQL/DS or DB2 database [22], and Honeywell's PDQ facility to unload portions from an IDS/DM IV database to a relational IQ database (essentially SQL) [19].

There is a growing number of mainframe-PC data downloading packages. In fact, a growing number of DBMS vendors now offer such capability from their DBMS to sequential files for use at the PC level. In a number of cases the download may be invoked from the PC, and data are then downloaded from the database into the following:

a) ASCII or DIF format for use with popular spreadsheets, word processors, and even the dBase relational micro DBMS; an example is Informatics' Answer/DB for downloading IMS data [23]-[25], and its 123/Answer and dBase/Answer packages that translate the ASCII files retrieved by Answer/DB into the proper internal format of these packages.
b) Special vendor format for use with the vendor's own PC software packages; an example is Cullinet's facility to download data from its IDMS/R database into its PC Goldengate software packages (including graphics, spreadsheets, etc.) [12].

A fundamental problem or challenge with downloaded data is, since it is a redundant copy of mainframe data, the maintenance of consistency or synchronization in an updating environment. The usual current commercial approach is to download and propagate pertinent updates from the main database periodically, and either a) not permit updating from the PC level or b) permit updating on the PC level and not reflect the updates upstream.

B. Relational Interfaces to DBMS

One of the hopes of the advocates of relational database management is that it will be widely adopted. The success and attractiveness of relational data management has led to many commercial relational DBMS. Relational micro-DBMS now predominate at the PC level. Furthermore, there is a tendency for various vendors to provide a relational interface or view on top of their existing nonrelational DBMS. Examples of this are Cullinet's IDMS/R providing a relational interface to the internal CODASYL IDMS database [11], and Honeywell's PDQ permitting a relational interface to operate directly on the native CODASYL IDS/DM IV database (or, as another option, on a copy of the IDS/DM IV database) [19]. A few vendors have first provided a relational interface on a native nonrelational system and then have redone it into a more native relational system.

Unfortunately, there is no relational standard. Although the relational structure and relational calculus and algebra are the common thread, and IBM's SQL and QBE may be largely taken as de facto standards, the fact is that there are many variations of SQL and QBE. Except for the simpler read-only command SELECT ... FROM ... WHERE ..., there are noticeable variations in other areas, such as on updating commands whose semantics and integrity controls vary greatly among implementations. Thus a standard relational interface among DBMS vendors has not evolved, and it appears that it will not evolve. Nevertheless, there will be a common general way of structuring databases and supporting operations, specifically project and join. Furthermore, relational interfaces for DBMS and mainframe-to-PC bridges may be exercised jointly in some cases. For example, download data from the mainframe nonrelational database via its relational interface into the PC environment, where the copy might be manipulated via another relational DBMS (like the dBase family).

In spite of the relational DBMS differences, the interconnection of DBMS with relational interfaces may be augmented further by a network-wide generalized relational interface that may provide a user at any node in Fig. 1 transparency over the relational DBMS differences. Such generalized relational interface will not have the challenge of mapping schemas between different models, and translating between widely different database access languages; it only has to be concerned with relatively simpler differences between the various relational interfaces of each DBMS in the network. Such generalized relational interface or front-end to a distributed relational DBMS network is exemplified by the SDC project outlined in the article by Templeton et al. in this issue.

C. Research and Prototype Projects

A number of longer range R&D and prototype projects are aimed at achieving the goals cited in the Introduction. They do not entail data unload/load or download, nor the existence of relational DBMS or relational interfaces to every DBMS in the network. Such long-range projects address and perform in various ways the mapping or translation of database structures and corresponding data-accessing language commands illustrated in Fig. 2. Most projects approach this by introducing intermediate database model and database access language levels. Both the types of intermediate models and languages and the number of levels vary, with the number of levels usually ranging from three to five depending on the project. Major efforts include UCLA's HD-DBMS project [4], the main focus of Section III;


Computer Corporation of America's MULTIBASE [13], [14], [29], [31], [MI; INRIA's heterogeneous SIRIUS-DELTA [Iq; and Informatics' MARK V DAG [24]. In addition to these projects, a number of authors have also addressed the challenge [1], [18], [26]-[28], [30], [33], WI, [45], [46].

Fig. 2. Relationship between schema translation and DML translation.

The majority of the current research and development efforts and initial commercial support expected simplify the task by requiring every user to communicate using a common language and data model [MULTIBASE, DAG, SIRIUS-DELTA]. A frequent choice is a relational model [SIRIUS-DELTA]. MULTIBASE further simplifies the task for a more near-term achievable system by handling only read type of global database requests; all updates are managed locally by individual sites. The complexity and restrictions of updating through user views in relational DBMS is acknowledged. The initial commercial version of MULTIBASE may be available in the near future. It will provide distribution transparency and heterogeneity transparency for read-only global queries using DAPLEX as a common language and data model [43]. (See the article by Chan et al. in this issue which includes a synopsis of DAPLEX.) In contrast, HD-DBMS provides a multilingual interface to heterogeneous distributed databases, while these other systems provide only a monolingual interface to heterogeneous distributed databases.

DAG (Distributed Application Generator) [24] intends to be a generator of applications and also of the necessary DBMS commands embedded in the application program to access databases managed by IMS and/or SQL/DS. The database view to the application is a logically integrated hierarchical IBM database, although it may be composed of portions residing in several separate IMS and/or SQL/DS databases at different sites and under different IBM operating systems and data communications software (CICS, IMS/DC).

III. THE UCLA HD-DBMS PROJECT

A. Overall Architecture

The UCLA HD-DBMS project is a multi-year, long-range project started in the late 1970s. Since 1983 part of the project has involved collaboration and support from Informatics General Corp. (now Sterling Software). Several publications on it have appeared in the open literature [4]-[6], [20], [35]. This section provides a status of the project, progress, and near-term plans.

The HD-DBMS strives to achieve the major long-range goals cited in Section I, not constraining the user to a common arbitrary language nor to read-only queries; however, it is a very-long-range possibility, beyond the more achievable MULTIBASE and SIRIUS-DELTA tasks. Its primary focus is on the heterogeneity challenge, not on the database physical distribution challenge taken up by other efforts assuming a homogeneous or common DBMS environment.

The HD-DBMS approach entails a global (network-wide) conceptual model of data and a global internal model of data. The global conceptual model is a highly logical model of the information content of the integrated system. It is used as a vehicle in the process of understanding user queries and decomposing them to extract information from the individual databases. The global internal model is the access-path oriented model of the structure of the integrated system showing precisely the data structures and access paths actually available (e.g., network-wide access routes, local database relationships, inter-database relationships, etc.), but independent of a specific implementation. The global internal model is the union of the internal models of each participating database. It is used as a vehicle in the process of identifying the specific access paths through the different databases that should be followed to answer user queries, while shielding the user from the need to know the intricacies of the access path implementation and physical storage of data. The global internal model identifies major elements outside the realm or interests of each local DBMS: relationships between entities in different DBMS, logical replication, and perhaps physical replication of entities and relationships in heterogeneous databases.

An extension of the ER model proposed by Chen [q is fundamentally used for the conceptual level, rather than other models [1], [2]. Our model for the internal level [20] is an evolution of our earlier proposal [35]; it was inspired by and includes ingredients from DIAM (Data Independent Accessing Model) [32], [39], and [40].

Other significant efforts toward heterogeneous DBMS networks propose providing users with a new model view, typically a relational view (MARK V DAG a hierarchical view), of every database, and one query language to be eventually translated into search programs to access the actual databases. A crucial difference between our project and others is that we wish to permit each user or program at a node to view and access data in the database model and language desired rather than force learning another language or reprogramming for another model and language. The desired languages would be constrained to a few, of course, but not to only one in a given database model.

Fig. 3 shows the proposed system architecture. The global query translator processes the query initially submitted by a user and, with the knowledge of the virtual database model associated to that query, translates it to the form acceptable by the global conceptual model (an ER model) and global internal model. The query is then decomposed by a query decomposer and access path selector, a translator, into the appropriate subquery(ies). The subquery(ies) will then have to be translated into the query language or data manipulation language of a specific DBMS, so as to then be processed by the corresponding node(s) to extract the information from the specific physical database(s) involved. The answers to the subquery(ies) are then joined together and reformatted by the query composer, a translator, according to the virtual database model. The result is the answer to the original query based on the user's virtual model.

Fig. 3. System architecture and building blocks to support communication in a heterogeneous database environment.

There will be, of course, many users who will confine their queries locally to a given physical database managed by a given DBMS. They will undoubtedly constitute the majority of the bulk volume applications. In this case, the local DBMS will process their queries directly and completely. The global query translator, the query decomposer and access path selector, and the query composer will not be needed for such cases.

HD-DBMS Layered Architecture: A number of important catalogs or directories and mapping or translation procedures for data structures and data access commands are necessary. Fig. 4 shows the five different layers of our architecture and their associated models. The local layer contains the physical databases actually stored. The outermost layer is the collection of virtual databases as seen by the users of the heterogeneous database network; to them, the outermost layer is the database network. The user deals with the outermost level, called the virtual model (VM), and the system should handle all the necessary mapping to extract information from the local physical databases.

Fig. 4. Layered architecture for the HD-DBMS.

Fig. 5. DB1 definition: An SQL relational database. (Figure body garbled in extraction; the legible column descriptions are PART NUMBER, PART DESCRIPTION, PART CLASSIFICATION, WAREHOUSE NUMBER, WAREHOUSE DESCRIPTION, and a quantity ON HAND, each marked NULL or NON-NULL.)

Following Fig. 4:

1) An application program database view is defined using the data definition language of a host DBMS. This view is defined to the HD-DBMS at the virtual layer (VL).
2) An application program query (DML or query command) enters the virtual layer and is transformed by the HD-DBMS into a unified virtual layer (UVL) query. This layer is an ER representation of the application program's virtual layer view.
3) The UVL query is then mapped into a unified global layer (UGL) query. The UGL is an ER conceptual representation of the entire heterogeneous database. It represents the union of individual unified local layer (ULL) database views.
4) The UGL query is transformed into a set of one or more ULL queries and an access plan. A ULL definition exists for each physical database. Externally, a ULL definition of a physical database is an ER view of that database. Internally, ULL access path specifications exist for data within a single physical database and for each interdatabase relationship between two or more physical databases.
5) A ULL query is transformed into a local layer (LL) DBMS dependent query, and then sent to the local DBMS. The ULL queries are performed according to the precedence established by the access plan.

Once the results of the original query are obtained, the data are translated back through the layered architecture into the form expected by the application program. This involves both structural and data translation.
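The five-step flow through the layers can be traced with a small sketch. This is only an illustration under stated assumptions, not the HD-DBMS implementation: the `Query` class, the function names, the `ENTITY_LOCATIONS` and `LOCAL_LANGUAGE` catalogs, and the string wrapping used to stand in for real translation are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Query:
    layer: str                      # one of "VL", "UVL", "UGL", "ULL", "LL"
    text: str                       # placeholder for the real query representation
    database: Optional[str] = None  # set once a physical database is chosen


# Hypothetical catalogs (assumptions for this sketch only).
ENTITY_LOCATIONS = {"PART": ["DB1", "DB2"]}            # ER entity -> physical databases
LOCAL_LANGUAGE = {"DB1": "SQL", "DB2": "CODASYL DML"}  # database -> local DML


def to_uvl(q: Query) -> Query:
    """Step 2: virtual layer query -> ER-based unified virtual layer query."""
    assert q.layer == "VL"
    return Query("UVL", f"ER[{q.text}]")


def to_ugl(q: Query) -> Query:
    """Step 3: map the UVL query onto the unified global layer."""
    assert q.layer == "UVL"
    return Query("UGL", q.text)


def to_ull(q: Query, entity: str) -> List[Query]:
    """Step 4: one ULL query per physical database; list order stands in
    for the access plan."""
    assert q.layer == "UGL"
    return [Query("ULL", q.text, db) for db in ENTITY_LOCATIONS[entity]]


def to_ll(q: Query) -> Query:
    """Step 5: ULL query -> DBMS-dependent local layer query."""
    assert q.layer == "ULL"
    return Query("LL", f"{LOCAL_LANGUAGE[q.database]}<{q.text}>", q.database)


def translate(vl_text: str, entity: str) -> List[Query]:
    """Run a virtual layer request down through all layers."""
    ugl = to_ugl(to_uvl(Query("VL", vl_text)))
    return [to_ll(ull) for ull in to_ull(ugl, entity)]
```

Calling `translate("find PART 112", "PART")` yields one local query per database that holds the entity. The return path, in which results are composed and translated back to the virtual model, is omitted here for brevity.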

B. Example Heterogeneous Database Network

The following is an example of a close-to-reality heterogeneous database network. It will be used in subsequent sections. The scenario consists of four databases under different DBMS: SQL, CODASYL (two databases), and IMS. Each of the databases is defined in Figs. 5-8. Fig. 9 presents the unified global conceptual ER model (UGCM) that covers the four databases; note that the partitioned global conceptual model shows the contribution of each of the four databases to the UGCM.

    SCHEMA NAME IS DB2.
    AREA NAME IS DB-AREA.
    RECORD NAME IS PART.
        LOCATION MODE IS CALC HASH-P# USING P# IN PART.
        DUPLICATES NOT ALLOWED.
        WITHIN DB-AREA.
        02 P# TYPE IS CHAR 5.
        02 PD TYPE IS CHAR 25.
        02 CL TYPE IS CHAR 2.
    RECORD NAME IS WH.
        WITHIN DB-AREA.
        02 W# TYPE IS CHAR 5.
        02 WD TYPE IS CHAR 25.
    SET NAME IS INVENTORY.
        OWNER IS PART.
        MEMBER IS WH.
        MANDATORY AUTOMATIC.
        ASCENDING KEY IS W#.
        DUPLICATES ARE NOT ALLOWED.
        SET OCCURRENCE SELECTION IS THRU LOCATION MODE OF OWNER.

Fig. 6. DB2 definition: A CODASYL network database.

    DBD     NAME = DB3, ACCESS = HISAM
    DATASET DD1 = DEPTDD1, DEVICE = 3380, OVFLW = DEPTOVF
    SEGM    NAME = PART, BYTES = 32
    LCHILD  NAME = (COMP-ASSEMB, DB3), PAIR = ASSEMB-COMP
    FIELD   NAME = (P#, SEQ), BYTES = 5, START = 1
    FIELD   NAME = PD, BYTES = 25, START = 6
    FIELD   NAME = CL, BYTES = 2, START = 31
    SEGM    NAME = ASSEMB-COMP, BYTES = 10, POINTER = (LPARNT, TWIN, LTWIN),
            PARENT = ((PART), (PART, PHYSICAL, DB3))
    FIELD   NAME = (P#, SEQ), BYTES = 5, START = 1
    FIELD   NAME = QTY, BYTES = 5, START = 6
    SEGM    NAME = COMP-ASSEMB, BYTES = 10, POINTER = PAIRED, PARENT = PART,
            SOURCE = (ASSEMB-COMP, DB3)
    FIELD   NAME = (P#, SEQ), BYTES = 5, START = 1
    FIELD   NAME = QTY, BYTES = 5, START = 6

    Segment hierarchy: PART (P#, PD, CL), with child segments ASSEMB-COMP (P#, QTY) and COMP-ASSEMB (P#, QTY).

Fig. 7. DB3 definition: An IMS/DB hierarchical database.

    SCHEMA NAME IS PART-WAREHOUSE
    AREA NAME IS DATA-AREA
    RECORD NAME IS PART
        LOCATION MODE IS CALC HASH-P# USING P# IN PART
        DUPLICATES ARE NOT ALLOWED
        WITHIN DATA-AREA
        02 P#    : TYPE IS CHAR 16
        02 PD    : TYPE IS CHAR [length illegible]
        02 CLASS : TYPE IS CHAR 1
    RECORD NAME IS WH
        02 W#    : TYPE IS CHAR 5
        02 WD    : TYPE IS CHAR 16
        02 QTY   : TYPE IS DEC 6
    SET NAME IS INVENTORY
        OWNER IS PART
        MEMBER IS WH MANDATORY AUTOMATIC
        ASCENDING KEY IS W#
        DUPLICATES ARE ALLOWED
        SET OCCURRENCE SELECTION IS THRU LOCATION MODE OF OWNER

Fig. 8. DB4 definition: A CODASYL network database.

Fig. 9. Global conceptual model in the HD-DBMS. (Diagram not recoverable. Legible labels: entities E1: PART, with attributes P#, PD, CL, and E2: WAREHOUSE; relationship R1: COMPOSED OF; a second relationship R2; the partitioned global conceptual model; and the UGCM, the unified global conceptual model formed by joining the ULCMs of DB(1) through DB(4).)

A sample of queries issued at the virtual conceptual model is shown in Fig. 10, with a trace of the data accessed through the various heterogeneous databases.

    QUERY 1: FIND THE STOCK STATUS OF THE PART WITH P# = 112
    NOTES: The stock status may be found in DB(1), DB(2), and DB(4), each under a different GDBMS.
    QUERY 2: (text not recoverable)

Fig. 10. Sample queries.

C. Database Mapping/Translation

The UGCM is the conceptual model of the integrated database. It is formed by the union of the ULCMs of the participating databases, and any inter-database relationships. A VM can be derived from the UGCM so that a VM is independent of the organization or physical disposition of the underlying database(s). Thus a number of crucial database mappings or translation procedures are needed. These translations in a few cases may be more like reformatting. The mappings should be kept at least in the network data dictionary/catalog, Fig. 3.

The data model (schema or subschema) mappings or translations have been identified or developed thus far, from the user view through the various data model layers, to the individual DBMS and back to the user. We assessed our work and work by others in the field and opted for using algorithms for the following specific translations proposed by Dumpala and Arora [15]:

    Mapping Relational Schema into ER Schema
    Mapping Network Schema into ER Schema
    Mapping Hierarchical Schema into ER Schema
    Mapping ER Schema into Relational Schema
    Mapping ER Schema into Network Schema
    Mapping ER Schema into Hierarchical Schema

These algorithms are ready for implementation. The following is just an example of the mapping between relational and ER schemas. A relation in a relational schema will correspond to one of the following ER constructs:

    an entity
    a k-ary relationship
    a binary relationship with attributes (1:N or N:1)
    an M:N binary relationship set without attributes
    an entity, plus key attributes of some other entities.

Thus a relational query targeted at a relation will be translated into different query commands at the ER level, depending on which of the above data constructs are involved. More on this in Section III-E.
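To make the dependence on the ER construct concrete, the sketch below looks up a relation in a mapping catalog and returns the ER objects a query against it would touch. It is a minimal illustration, not the Dumpala and Arora algorithm: the catalog layout, the relation names `PART`, `WAREHOUSE`, and `INVENTORY`, and the function `er_targets` are assumptions made up for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class ERConstruct(Enum):
    """The five ER constructs a relation may correspond to."""
    ENTITY = "entity"
    K_ARY_RELATIONSHIP = "k-ary relationship"
    BINARY_WITH_ATTRIBUTES = "binary relationship with attributes (1:N or N:1)"
    MN_WITHOUT_ATTRIBUTES = "M:N binary relationship set without attributes"
    ENTITY_PLUS_KEYS = "entity plus key attributes of other entities"


@dataclass(frozen=True)
class CatalogEntry:
    construct: ERConstruct
    er_name: str
    participants: Tuple[str, ...] = ()  # related entities, if any


# Hypothetical schema-mapping catalog for a relational database.
CATALOG = {
    "PART": CatalogEntry(ERConstruct.ENTITY, "PART"),
    "WAREHOUSE": CatalogEntry(ERConstruct.ENTITY, "WAREHOUSE"),
    "INVENTORY": CatalogEntry(
        ERConstruct.BINARY_WITH_ATTRIBUTES, "INVENTORY", ("PART", "WAREHOUSE")
    ),
}


def er_targets(relation: str) -> List[Tuple[str, str]]:
    """Return the ER-level objects a query on `relation` must address."""
    entry = CATALOG[relation]
    if entry.construct is ERConstruct.ENTITY:
        return [("entity", entry.er_name)]
    if entry.construct is ERConstruct.ENTITY_PLUS_KEYS:
        # The relation is an entity that also carries keys of other entities.
        return [("entity", entry.er_name)] + [
            ("entity-key", p) for p in entry.participants
        ]
    # All remaining constructs are relationships over participating entities.
    return [("relationship", entry.er_name)] + [
        ("entity", p) for p in entry.participants
    ]
```

A query on `PART` therefore addresses a single entity, while a query on `INVENTORY` addresses a relationship together with the two entities it connects.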
D. Query/DML Translation

The terms "data manipulation language (DML)" or "query language" shall be used synonymously to refer to any of the data access languages of the major types of DBMS: CODASYL DML, relational SQL, or IMS DL/I. The terms "database request" and "query" will also be used synonymously.

As per Fig. 4, the queries made by a user on a VM should be translated to the equivalent queries at the UGCM level, then at the UGIM level, then at the ULIM level, and finally at the LLM level for processing by the particular DBMS involved; the answer is then composed or reformatted to adhere to the original VM level. Thus a number of crucial mappings or translation procedures are needed. Fig. 2 shows the relationship between schema translation and DML translation. We provide our progress in the following sections.

1) The ER DML Global Conceptual Language: The HD-DBMS architecture uses an ER DML as the global conceptual language (GCL), at the unified global conceptual level. This is the DML into which all virtual layer DMLs are translated. This is also the DML whose queries are decomposed and distributed to various local physical databases. Two of the most important justifications for a GCL are the following. First, a GCL reduces the number of translations (both schema translation and DML translations) necessary within a distributed database system. It is easy to understand that, without a GCL, m x n translators would be needed in an HD-DBMS that has n physical databases and supports m virtual model databases, while with a GCL, only m + n translators would be needed. Secondly, a GCL allows for a single, conceptual view of the whole database, which, in reality, consists of a group of heterogeneous physical databases.

Functional Requirements of a GCL: The single most important functional requirement of a GCL is that it be semantically "rich" enough to express queries from all the virtual level DMLs. This means that, for any existing virtual level DML, any DML statement may find its equivalent in the GCL. It is not necessary to have a one-to-one correspondence between the GCL and other virtual level DMLs so long as the GCL is able to express any statement expressed by a virtual level DML.
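Returning to the first justification for a GCL, the translator counts can be checked by enumerating the translators each design needs. This is a small sketch only; the function names and the model and database labels are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def direct_translators(
    virtual_models: Sequence[str], physical_dbs: Sequence[str]
) -> List[Tuple[str, str]]:
    """Without a GCL: one translator per (virtual model, physical database) pair."""
    return list(product(virtual_models, physical_dbs))


def gcl_translators(
    virtual_models: Sequence[str], physical_dbs: Sequence[str], gcl: str = "GCL"
) -> List[Tuple[str, str]]:
    """With a GCL: each virtual model translates into the GCL, and the GCL
    translates into each physical database."""
    return [(v, gcl) for v in virtual_models] + [(gcl, p) for p in physical_dbs]


# Example with m = 3 virtual models and n = 4 physical databases.
models = ["relational", "network", "hierarchical"]
databases = ["DB1", "DB2", "DB3", "DB4"]

print(len(direct_translators(models, databases)))  # m x n = 12
print(len(gcl_translators(models, databases)))     # m + n = 7
```

The gap widens as the network grows: adding a fifth physical database adds m translators in the direct design but only one when a GCL is used.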
How do we know if a GCL meets this requirement? There has not been a satisfactory answer to this question despite various attempts that have been made. One of them is the introduction of the term "completeness" [3], [36]. Informally, a DML is complete if, for a database, any piece of information stored in the database can be retrieved using that DML. A GCL that is complete should meet this requirement. Unfortunately, there is no consensus on the definition of completeness for an ER based DML.

In addition to the above requirement, it is desirable for a GCL to be as independent of the physical aspects of the database as possible. The reason for this is that a GCL is a DML against a conceptual database only. This requirement alone excludes the possibility of using a procedural type DML (record-at-a-time DML) as GCL, since a procedural DML ties itself too closely to the physical aspects of a database. There are thus two choices for a GCL: 1) an algebraic type of DML, and 2) a calculus type of DML.

There have been several proposals for "ER algebra" in the literature [8], [36]. All those proposals are clearly inspired by the relational algebra proposed by Codd [9], [10]. However, the situation is different in the ER model as opposed to the relational model. In the relational model, the only data entity is a relation. All the operations in the relational algebra apply to relations only. The result of any relational algebraic operation is also a relation. In contrast, there are two basic data entities in the ER data model: entity and relationship. Semantically, they are different. An algebra that applies on

PROCEEDINGS OF THE IEEE, VOL. 75, NO. 5, MAY 1987

two data entities is considerably more difficult to define than one that applies on a single data entity, since as the number of data entities increases, the types of the output data entity and their semantic meanings seem to grow rapidly. The area of ER algebra is still in its infancy. More research is needed to find a good definition of ER algebra. Consequently, we have not adopted any existing ER algebra for the GCL.

The ER DML: Our choice for GCL is a calculus type language. Fig. 11 shows a summary of the GCL. We call it calculus type because there is a natural correspondence between this type of DML and the relational calculus. A fundamental aspect of a calculus-based DML is the notion of the tuple variable: in relational calculus, a tuple variable is a variable that ranges over some named relation. In a calculus type ER DML (i.e., the proposed GCL), the a-list in the GET statement plays a similar role. An a-list is a variable that ranges over a specified set of paths, where a path is a traversal of an ER diagram. The results from our research have demonstrated that, with a few modifications, most DMLs (DL/I, CODASYL DML, SQL, and relational algebra) against the corresponding data model (hierarchical, network, relational) find their equivalence in this GCL. Therefore, this GCL satisfies the first requirement posed earlier. This GCL has little, if anything at all, to do with the physical aspects of the database, which is the second requirement.

In arriving at our required ER DML, we also analyzed four earlier relational-type languages proposed by other authors: EAS-E [34], GORDAS [16], ERL [21], and DAPLEX [29], [43]. EAS-E is very English-like, but seems best suited to serve as an interactive query language rather than as a good intermediary language. DAPLEX is the query language based on the CCA functional data model. GORDAS is a read-only query language. However, the GET command it uses seems very powerful, and so our language patterns the GET command after GORDAS. Our language is very similar to ERL. ERL claims to be a complete query language (READ, INSERT, MODIFY, DELETE), but there are a few features we dropped. The language presented will be seen to approach a relational language with the major addition of commands using interentity relations.

Fig. 11. The ER DML global conceptual language.

2) Query/DML Translation Algorithms: We have identified characteristics and requirements of algorithms to translate between the various model and language layers of Fig. 4. Our major approach/objective is to develop a DDL/DML compiler-compiler workbench from which we can more easily develop the desired translations. Thus we have completed translation algorithms for:
Hierarchical IMS DL/I (except logical database and Fast Path commands) into the ER DML
CODASYL DML into the ER DML
Relational Algebra into the ER DML
Relational SQL into the ER DML.
Some of this work, that focusing on the translation from SQL to ER DML, is presented in [6]; examples of it are provided in the next section. Translation algorithms for the following are now being developed:
ER DML into relational SQL
ER DML into hierarchical DL/I
ER DML into CODASYL DML.
Our next task is to start prototype implementation of a subset of the following algorithms for proof of concept:
CODASYL DML into ER DML into SQL
SQL into ER DML into CODASYL DML.
Small programs in languages such as COBOL and C with CODASYL DML and SQL embedded in them would be used to test the translation paths.
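The two-step translation paths listed above (for example, CODASYL DML into ER DML into SQL) amount to composing a translator into the GCL with a translator out of it. The following is a minimal sketch of that composition, assuming a simple registry keyed by language name; the translator bodies are placeholders that merely tag the query text, since the actual algorithms are those described in [6] and are not reproduced here.

```python
from typing import Callable, Dict

Translator = Callable[[str], str]

# Hypothetical registries. Real translators would rewrite the query;
# these placeholders only wrap it so the composition order is visible.
TO_ER_DML: Dict[str, Translator] = {
    "SQL": lambda q: f"ER_DML[{q}]",
    "CODASYL DML": lambda q: f"ER_DML[{q}]",
}
FROM_ER_DML: Dict[str, Translator] = {
    "SQL": lambda q: f"SQL[{q}]",
    "CODASYL DML": lambda q: f"CODASYL[{q}]",
}

def translate(query: str, source: str, target: str) -> str:
    """Translate via the ER DML hub: source -> ER DML -> target."""
    if source not in TO_ER_DML:
        raise KeyError(f"no translator from {source} into ER DML")
    if target not in FROM_ER_DML:
        raise KeyError(f"no translator from ER DML into {target}")
    return FROM_ER_DML[target](TO_ER_DML[source](query))

print(translate("FIND ANY EMP", "CODASYL DML", "SQL"))
```

Supporting a new language in this arrangement means registering one translator in each direction, without touching the existing ones.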

E. SQL to ER DML Translation and Examples


Herein we provide some insight into the translations involved, by outlining SQL/DS to ER DML translation. The translation environment and scheme from SQL/DS to ER DML has the following characteristics:
It is composed of a set of 10 basic rules.
Each SQL statement is one of six types of commands.
Each SQL statement applies to one of five types of relations.
A rule may, in turn, cause other rules to be invoked.
Fig. 12 outlines the translation matrix. It portrays the ten rules that compose the overall SQL to ER DML translation algorithm. Our translation covers all SQL DML commands except GROUP BY and aggregate functions, which we may add later.

Let us look at three example translations. Fig. 13 provides two sample ER schemas and corresponding relational schemas. Figs. 14-16 provide three DML translation examples. The translation scheme for SQL read-type commands follows, explaining in detail the examples in Figs. 14 and 15, and much of Fig. 12. The translation detail for all SQL commands appears in [6].

Fig. 12. SQL to ER DML translation scheme matrix.

Fig. 13. Sample schema and corresponding relational schema. (Schema 1 and Schema 2.)

Fig. 14. Example of SQL to ER DML translation. (Example 1: single mapping with more than one relation, using rule 3; an SQL SELECT statement and the corresponding ER DML GET statement.)

Fig. 15. Example of SQL to ER DML translation. (Example 2: a nested SQL SELECT; the GET statement for the innermost mapping is generated first and is then nested in the WHERE clause of the outer GET statement.)

Fig. 16. Example of SQL to ER DML translation. (Example 3: update of a type 5 relation, using rules 9 and 10; an SQL UPDATE statement and the corresponding ER DML DISCONNECT, MODIFY, and CONNECT statements.)

In our data model translation strategy, adapted from [15], a relation in a relational schema corresponds to one of the five following ER constructs:
1) An entity
2) A k-ary relationship
3) A binary relationship with attributes (1:N or N:1)
4) An N:M binary relationship set without attributes
5) An entity, plus key attributes of some other relations.
We shall call the respective relations type 1, type 2, ..., and type 5 (see Fig. 13). We now discuss, for each type of relation, how a single mapping involving such a relation can be mapped into the ER DML.

Type 1 Relation: In this case, a relation, R, with attributes A1, A2, ..., An, corresponds to exactly an entity, E, with attributes A1, A2, ..., An. For example, for the following relation in a relational schema:
Relation: EMP(EMPNO,NAME,DNO,SAL)
there exists an entity in the ER schema having the following format:
Entity EMP(EMPNO,NAME,DNO,SAL).
The attribute names need not be exactly the same so long as their semantics remain the same, for example, SAL in the relation versus SALARY in the entity. The translation of an SQL query involving this type of relation into the ER DML is straightforward since in both data models only one data entity is involved (a relation in the relational model and an entity in the ER model). The following rule is designed to guide such translation:

Rule 1: For a single mapping involving a type 1 relation, generate a GET statement in the ER DML. The a-list in the GET statement takes the form of the select-clause in the single mapping. The WHERE clause in the GET statement includes two parts. The first is the name of the entity involved. The second takes the form of the WHERE-clause in the single mapping.

Type 2 Relation: A type 2 relation in the relational schema corresponds to a k-ary relationship in the ER schema. An attribute of a type 2 relation is either one of the attributes of that k-ary relationship or one of the key attributes of the entities connected by the k-ary relationship. The rule for translating a single mapping involving a type 2 relation into the ER DML is as follows:

Rule 2: For a single mapping involving a type 2 relation, generate a GET statement in the ER DML. The a-list in the GET statement takes the form of the select-clause in the single mapping. The WHERE clause in the GET statement includes two parts. The first consists of the corresponding relationship name and the names of the k entities connected by this relationship. The second part takes the form of the WHERE-clause in the single mapping.
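Rules 1 and 2 differ only in the first part of the generated WHERE clause. A minimal sketch of both follows. The textual layout of the GET statement used here (`GET (a-list) WHERE (names & condition)`) is an assumption made for illustration, and the relation, entity, and relationship names in the usage lines are hypothetical; the actual ER DML syntax is the one summarized in Fig. 11.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SingleMapping:
    """A single SQL mapping: SELECT <select> FROM <relation> WHERE <where>."""
    select: List[str]
    relation: str
    where: str

def _get(a_list: List[str], first_part: str, where: str) -> str:
    # Assumed rendering of a GET statement; not the authors' exact syntax.
    qualifier = f"{first_part} & {where}" if where else first_part
    return f"GET ({', '.join(a_list)}) WHERE ({qualifier})"

def rule1(mapping: SingleMapping, entity: str) -> str:
    """Type 1 relation: the first part is the name of the entity involved."""
    return _get(mapping.select, entity, mapping.where)

def rule2(mapping: SingleMapping, relationship: str, entities: List[str]) -> str:
    """Type 2 relation: the first part names the relationship and its k entities."""
    return _get(mapping.select, " ".join([relationship, *entities]), mapping.where)

emp_query = SingleMapping(["NAME", "SAL"], "EMP", "DNO = 'D23'")
print(rule1(emp_query, "EMP"))

# Hypothetical k-ary relationship SUPPLY connecting PART and WAREHOUSE.
supply_query = SingleMapping(["P#", "QTY"], "R3", "QTY > 10")
print(rule2(supply_query, "SUPPLY", ["PART", "WAREHOUSE"]))
```

In both rules the select-clause and the WHERE-clause are carried over unchanged; only the lookup from relation name to ER construct requires knowledge of the schema mapping.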

Type 3 Relation: A type 3 relation in the relational schema comes from (is mapped from) a binary relationship with attributes in the ER schema. The binary relationship is of either type 1:N or type N:1, but not of type N:M, which is mapped into a type 4 relation. An attribute of a type 3 relation is either one of the attributes of the binary relationship or one of the key attributes of the two entities this binary relationship connects. Clearly, the relationship in this case (binary) is a special case of that in the previous case (k-ary). Therefore, the translation of a single mapping involving a type 3 relation can be done by using rule 2.

Type 4 Relation: A type 4 relation in the relational schema corresponds to an N:M binary relationship in the ER schema. Again, this is a special case of a k-ary relationship. The translation of a single mapping involving a type 4 relation can, therefore, also be done by using rule 2.

Type 5 Relation: In order to understand the formation of a type 5 relation, the concepts of source and target entities need to be introduced. Let E1 and E2 be the entity sets involved in relationship set R, of type 1:N. Then E1 is referred to as the source entity set and E2, the target entity set. When an ER schema is mapped into a relational schema, for each type 1:N relationship set without attributes, a type 5 relation is created in the relational schema. The attributes of the type 5 relation consist of all the attributes of the target entity plus the key attribute of the source entity. To translate a single mapping involving a type 5 relation into the ER DML, the three clauses (SELECT, FROM, and WHERE) of the single mapping are examined first; if the key attribute of the source entity appears in one or more of the clauses, then rule 2 is used to guide the translation, otherwise rule 1 is used.

Single Mapping Involving More than One Relation: Using the rules developed so far, we are able to translate a single mapping involving a single relation into the ER DML. These rules alone have limited use since most queries, when expressed in terms of SQL, involve more than one relation. Let us discuss how this kind of multi-relation mapping can be mapped into the ER DML. To start, we note that at the global conceptual model level we have an ER schema which is a connected ER diagram. By connected we mean that any two entity sets in the diagram are connected via some relationship sets and some entity sets. This is important to our developing the translation rules since it guarantees that there is at least some direct (through a single relationship set) or indirect (through more than one relationship set and some entity sets) relationship between any two relations in the relational schema. This suggests that we should try to find such a relationship when we have a single mapping that involves more than one relation. As we have indicated earlier, a relation in the relational schema corresponds to one of the five ER segments (parts of an ER diagram) in the ER schema. For a single mapping involving more than one relation, we first find all ER segments in the ER schema corresponding to those relations in the single mapping. Once we have all the ER segments, we find a traversal of the ER diagram that includes all the ER segments. This traversal will then contain the relationship between the relations in the single mapping. The next thing to do is to carry the qualifier (WHERE clause) in the single mapping over into the qualifier on the traversal (the part of the ER diagram that encompasses all the ER segments). The following rule summarizes the above and can be used to guide the translation of a single mapping involving more than one relation into the ER DML.

Rule 3: For a single mapping involving more than one relation, each of which is of one of the five types, generate a GET statement in the ER DML. The a-list of the GET statement takes the form of the select-clause in the single mapping. The WHERE clause of the GET statement includes two parts. The first part contains the traversal of the ER schema. The second part takes the form of the WHERE-clause in the single mapping. The traversal of the ER schema is generated by first finding the corresponding ER segments for the relations in the single mapping and then taking the part of the ER diagram that includes all the ER segments involved.

Example: See the example in Fig. 15.

Nested Mapping: With SQL it is possible to use the result of a mapping in the WHERE clause of another mapping. This operation is called nested mapping. Nested mappings are not restricted to only two levels. When processing a nested mapping, the innermost mapping is executed as though it were a single mapping; the result of the mapping is passed to the outer mapping, and the outer mapping then proceeds as though it were given a set of constants in place of the inner mapping. This continues from the innermost mapping out until it reaches the outermost mapping. Similarly, the ER DML (the Global Conceptual Language) allows for the embedment of a GET statement in the WHERE clause of another GET statement. This nested GET statement feature makes it possible to map a nested mapping in SQL into the ER DML. The following rule guides such a translation.

Rule 4: For each nested mapping, generate a nested GET statement in the following manner. Working from the innermost mapping out, for each mapping seen, which is a single mapping, generate a GET statement using the rules described earlier for single mappings. If the current single mapping has a single mapping in its WHERE clause, which should already have been mapped into a GET statement because we work from the inside out, then the WHERE clause of the current GET statement is combined with the inner GET statement to form the new WHERE clause. This process continues until the outermost mapping is mapped into the ER DML.
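The inside-out procedure of Rule 4 maps naturally onto recursion: an inner mapping is turned into a GET statement before the condition that contains it is assembled. The sketch below assumes that each mapping already carries the first part of its WHERE clause as produced by rules 1-3 (the `traversal` field) and uses an assumed textual layout for GET statements; the relation and attribute names in the usage lines are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Condition:
    left: str
    op: str
    right: Union[str, "Mapping"]   # a constant or a nested mapping

@dataclass
class Mapping:
    select: List[str]
    traversal: str                 # first WHERE part, from rules 1-3
    conditions: List[Condition] = field(default_factory=list)

def to_get(mapping: Mapping) -> str:
    """Generate a (possibly nested) GET statement for an SQL mapping."""
    parts = [mapping.traversal]
    for cond in mapping.conditions:
        if isinstance(cond.right, Mapping):
            # The inner mapping is translated first (inside out) ...
            right = to_get(cond.right)
        else:
            right = cond.right
        # ... and then combined into the WHERE clause of the current GET.
        parts.append(f"{cond.left} {cond.op} {right}")
    return f"GET ({', '.join(mapping.select)}) WHERE ({' & '.join(parts)})"

inner = Mapping(["P#"], "R3", [Condition("W#", "=", "'W123'")])
outer = Mapping(["*"], "R1", [Condition("P#", "IN", inner)])
print(to_get(outer))
```

Because the recursion bottoms out at mappings with only constant conditions, nesting depth is not limited to two levels, which matches the remark above about nested mappings.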

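Rule 3 relies on finding a traversal of the connected ER diagram that includes all the ER segments of the relations involved, but the text does not say how the traversal is chosen. One simple possibility, shown here purely as an assumption, is to take the union of breadth-first shortest paths from one segment node to each of the others. This is a heuristic and does not claim to find the smallest covering part of the diagram. The diagram is modeled as an undirected graph whose nodes are entity sets and relationship sets; all node names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Set

def shortest_path(graph: Dict[str, List[str]], start: str, goal: str) -> List[str]:
    """Breadth-first search for a shortest path between two diagram nodes."""
    previous: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        node: Optional[str] = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    raise ValueError(f"{goal} is not reachable from {start}")

def find_traversal(graph: Dict[str, List[str]], segment_nodes: Set[str]) -> Set[str]:
    """Return a connected set of nodes covering every (non-empty) segment node."""
    covered = set(segment_nodes)
    anchor, *others = sorted(segment_nodes)
    for target in others:
        covered.update(shortest_path(graph, anchor, target))
    return covered

# Hypothetical diagram: E1 -- R12 -- E2 -- R23 -- E3
ER_DIAGRAM = {
    "E1": ["R12"],
    "R12": ["E1", "E2"],
    "E2": ["R12", "R23"],
    "R23": ["E2", "E3"],
    "E3": ["R23"],
}
print(sorted(find_traversal(ER_DIAGRAM, {"E1", "E3"})))
```

Since the diagram is assumed to be connected, the search always succeeds for nodes that belong to it; the `ValueError` only guards against a malformed input graph.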
Example: See the example in Fig. 16.

We stress that the overall translation approach in our HD-DBMS effort will hold even if the source relational language and the target ER DML were to vary. This has been one of our requirements. Thus the translation would be extended to other relational materializations. The same holds for the other types of DML and corresponding translation schemes within our scope.

F. View Update
While we have stated the ideal long-range goals, we have identified problems that may impose limits on the types of user views of the databases and particularly on the types of data accessing commands that may be issued from the VM user level. We have sorted out the various problems, assessed the possibility and cost of solution, identified the limitations on types of commands and data model mapping if such problems are not solved, and outlined possible solution approaches. As an example, the magnitude of foreseen and unsolved problems appears to have led most efforts not to permit updating a database, even while forcing each user wishing access to a heterogeneous database to abide by a new or common model and query language. The view update problem in relational systems is one major problem in the distributed heterogeneous case even if relational systems are not involved; constraining the differences permitted between user views and local logical models alleviates the problem and makes it more solvable.

We have now formally identified the rules of the game to permit 1) updating commands to various degrees and 2) differences in mapping between the user view and the underlying participating database schemas, while preserving integrity constraints. We first assessed actual view updating in IMS, SQL/DS, DB2, Oracle, Ingres, and QBE. We also analyzed paper approaches proposed by various authors. We are now designing the mechanisms for DBAs or users for logically and easily expressing various limitations or controls on the types of user views, data accessing commands, and updates so as to preserve stated integrity controls and various degrees of transparency of distribution and heterogeneity. The role of the Prolog language, or of some of its mechanisms, as an internal mechanism to formally express such controls is being considered. We are now identifying the translation of such controls to corresponding controls (DDL and/or application programs) on specific DBMS. We have identified the major issues referred to as the view update problem and also most of the required integrity controls or database update decisions that DBAs or users must make to solve most, if not all, realistic view update problems.

G. Further Features
A very brief synopsis of work we have done in two major areas follows.

Protocols: We have identified the protocol information needed to implement the HD-DBMS. In developing these protocols, the logical components within the HD-DBMS to implement these protocols were also defined. The protocols defined describe the information exchange needed to enable the various logical components of the HD-DBMS to handshake or communicate so as to maintain data integrity in the system and also to handle the translations. The protocols allow the components to implement: queries and/or updates on data within the system; aborts on queried updates; delayed updates; broadcasting and handling systems status (as in up/down/recovering). In addition to defining the protocols, the format by which the protocols travel between the logical components was also defined. Ample example scenarios of events within the HD-DBMS have been created. Each scenario contains detailed illustration of the protocols needed to handle the event and the sequence in which they are used.

Internal Model: A major model of the HD-DBMS is the internal model, both at the global and the local levels. A generalized database access path model has been defined for the purpose of representing relationships between data entities in the HD-DBMS [20]. This data model, termed the Generalized Data Access Graph (GDAG), is a major architectural component. The GDAG is maintained by the HD-DBMS as part of the network data dictionary (catalog). It encompasses the capability of modeling the access paths of the three major data models, via a common data independent notation. A salient capability is the modeling of inter-database relationships using an equivalent notation.

IV. CONCLUDING REMARKS
We have outlined the language desiderata for data sharing and accessing in the increasing scenarios of heterogeneous databases. We have cited the major approaches to data sharing and accessing: from the primitive commercial file and database unload/load and PC download, to common interfaces on top of existing DBMS, to the R&D and prototype efforts toward the long-range goals. Commercial availability of the more encompassing thrusts may become a reality with the mounting problems, opportunity costs, and demand for data sharing in the heterogeneous world.

The HD-DBMS project is highlighted herein, with a presentation of its status, progress, and plans. It is a longer range project, with the unique feature of allowing any user in the network to use his preferred database model and DML to access any data in the heterogeneous network; another distinguishing feature, thus far, is its support for updating, not only for read-type accessing.

Prototype implementation of the HD-DBMS for proof of concept will follow. The first thread probably will be to translate:
from a CODASYL DML at the virtual level into ER DML into SQL
from SQL at the virtual level into ER DML into CODASYL DML.
Prototyping will first face read-only commands and immediately thereafter updating commands. A robust data dictionary will be used, undoubtedly extending its model, to implement the crucial network-wide dictionary/catalog.

We intend to use graphical mouse-oriented tools to paint ER database models. ER data definitions and graphical ER diagrams should eventually be generated automatically from existing DDLs, and DDLs should also be generated automatically from ER data definitions and graphical ER diagrams. Schema integration into the global conceptual model should be semi-automated; the reverse process should also be automated.

Although the flavor of presentation is bottom-up, that is, starting with existing individually designed heterogeneous databases, the system is also targeted for new databases being designed globally from the start, and then being distributed in the heterogeneous environment. The latter will be a growing case as the flexibility of heterogeneous distributed systems becomes available.

ACKNOWLEDGMENT
The author wishes to acknowledge the contribution of the following past and current members of the HD-DBMS project: E. Nahouraii and M. H. Pirahesh (IBM Corp.), J. Ben-Zvi and J. Horowitz (Informatics), G. Chen (Hughes Aircraft), W. Johnson (Lockheed), A. Chen, and G. Wang. The collaboration and support of Informatics General Corporation is appreciated. Finally, he wishes to thank the two anonymous reviewers for their comments.

REFERENCES
[1] M. Adiba and D.
Portal, "A cooperation system for heterogeneous data base management systems," Informat. Syst., vol. 3, no. 3, pp. 209-215, 1978.
[2] J. R. Abrial, "Data semantics," in Conf. Proc. IFIP-TC2 Working Conf. on Data Base Management (Cargese, Corsica, Apr. 1974), J. W. Klimbie and L. Koffeman, Eds. Amsterdam, The Netherlands: North-Holland, 1974.
[3] P. Atzeni and P. P. Chen, "Completeness of query languages for the entity-relationship model," in Proc. 2nd Int. Conf. on Entity-Relationship Approach, P. P. Chen, Ed., ER Institute, 1981.
[4] A. F. Cardenas and M. H. Pirahesh, "Database communication in a heterogeneous database management system network," Informat. Syst., vol. 5, no. 1, pp. 55-79, 1980.
[5] -, "The E-R model in a heterogeneous data base management system network architecture," in Proc. Int. Conf. on Entity-Relationship Approach to System Analysis and Design, P. Chen, Ed. Amsterdam, The Netherlands: North-Holland, 1980, pp. 577-583.
[6] A. F. Cardenas and G. Wang, "Translation of SQL/DS data access/update into entity/relationship data access/update," in Proc. 4th Int. Conf. on the E-R Approach (Chicago, IL, Oct. 28-30, 1985).
[7] P. P. Chen, "The entity-relationship model-Toward a unified view of data," ACM Trans. Database Syst., vol. 1, no. 1, Mar. 1976.
[8] -, "An algebra for a directional binary entity-relationship model," in Proc. 1st IEEE COMPDEC (Los Angeles, CA, Apr. 1984), pp. 37-40.
[9] E. F. Codd, "A relational model of data for large shared data banks," Commun. ACM, vol. 13, no. 6, 1970.
[10] -, "Relational completeness of data base sublanguages," in Data Base Systems, R. Rustin, Ed. Englewood Cliffs, NJ: Prentice-Hall, 1972.
[11] Cullinet Software Inc., "IDMS/R, summary description," Westwood, MA.
[12] Cullinet Software Inc., "Goldengate, summary description," Westwood, MA.
[13] U. Dayal and H. Y. Hwang, "View definition and generalization for database integration in a multidatabase system," IEEE Trans. Software Eng., vol. SE-10, no. 6, pp. 628-645, Nov. 1984.
[14] U. Dayal, "Query processing in a multidatabase system," in Query Processing in Data Systems, W. Kim, D. Reiner, and D. Batory, Eds. New York, NY: Springer-Verlag, 1985.
[15] S. R. Dumpala and S. K. Arora, "Schema translation using the entity-relationship approach," in Proc. 2nd Int. Conf. on Entity-Relationship Approach, P. P. Chen, Ed., ER Institute, 1981.
[16] R. Elmasri and G. Wiederhold, "GORDAS: A formal high-level query language for the entity-relationship model," in Proc. 2nd Int. Conf. on Entity-Relationship Approach (Washington, DC, 1981).
[17] A. Ferrier and C. Stangret, "Heterogeneity in the distributed database management systems SIRIUS-DELTA," in Proc. 8th Int. Conf. on Very Large Data Bases (Mexico City, Mexico, Sept. 8-10, 1982), pp. 45-53.
[18] V. D. Gligor and G. L. Luckenbaugh, "Interconnecting heterogeneous data base management system," IEEE Computer, vol. 22, pp. 33-43, Jan. 1984.
[19] Honeywell Information Systems, "Relational query/interactive query reference manual," Manual #DR52.
[20] J. Horowitz and A. F. Cardenas, "Relationships in a heterogeneous distributed database environment," submitted for publication to Informat. Syst.
[21] H. Y. Hwang and U. Dayal, "Using the entity-relationship model for implementing multiple model database system," in Proc. 2nd Int. Conf. on Entity-Relationship Approach, P. P. Chen, Ed., 1981.
[22] IBM Corp., "SQL/DS, concepts and facilities," Reference Manual GH24-5013.
[23] Informatics General Corp., "Answer/DB reference manual," Canoga Park, CA.
[24] Informatics General Corp., "Distributed application generator, technical system description," Canoga Park, CA.
[25] Informatics General Corp., "Lotus/Answer, Visi/Answer, and dBase II/Answer," Reference Manuals, Canoga Park, CA.
[26] J. Iossiphidis, "A translation to convert the DDL of ERM to the DDL of System 2000," in Proc. Int. Conf. on Entity-Relationship Approach to System Analysis and Design, P. P. Chen, Ed. (Los Angeles, CA, 1979).
[27] B. E. Jacobs, "On database logic," J. ACM, vol. 29, no. 2, pp. 310-332, Apr. 1982.
[28] R. H. Katz, "Database design and translation for multiple data models," Ph.D. dissertation, UC Berkeley, 1980.
[29] R. Katz and N. Goodman, "View processing in multibase-A heterogeneous database system," in Entity-Relationship Approach to Information Modeling and Analysis, P. P. Chen, Ed., ER Institute, 1981.
[30] R. H. Katz and E. Wong, "Decompiling CODASYL DML into relational queries," ACM Trans. Database Syst., vol. 7, no. 1, pp. 1-23, 1982.
[31] T. A. Landers and R. L. Rosenberg, "An overview of multibase," in Distributed Databases, H. J. Schneider, Ed. Amsterdam, The Netherlands: North-Holland, 1982.
[32] M. Levin, "The DIAM theory of algebraic access graphics," Sterling Systems, Inc., Denver, CO, 1980.
[33] Y. D. Lien, "Hierarchical schemata for relational databases," ACM Trans. Database Syst., vol. 6, no. 1, pp. 48-69, Mar. 1981.
[34] H. M. Markowitz, A. Malhotra, and D. P. Pazel, "The ER and EAS formalisms for system modeling, and the EAS-E language," in Proc. 2nd Int. Conf. on Entity-Relationship Approach (Washington, DC, 1981).
[35] E. Z. Nahouraii, L. O. Brooks, and A. F. Cardenas, "An approach to data communication between different GDBMS," in Proc. 2nd Int. Conf. on Very Large Data Bases (Brussels, Belgium, Sept. 1976).
[36] C. Parent and S. Spaccapietra, "An entity-relationship algebra," in Proc. 1st IEEE Conf. on Data Engineering (Los Angeles, CA, Apr. 24-27, 1984), pp. 500-507.
[37] L. S. Schneider, "A relational query compiler for distributed heterogeneous databases," IFIP TC 2.6, NASWG, Jan. 1977.
[38] SDDTG of CODASYL Systems Committee, "A stored data definition language for the translation of data," Informat. Syst., vol. 2, no. 3, 1977.
[39] M. E. Senko, E. B. Altman, M. M. Astrahan, and P. L. Fehder, "Data structures and accessing in database systems," IBM Syst. J., vol. 12, no. 1, 1973.
[40] M. E. Senko, "DIAM as a detailed example of the ANSI/SPARC architecture," in Proc. IFIP-TC2 Working Conf. Modeling in Data Base Management Systems (Freudenstadt, Germany, Jan. 1976), G. M. Nijssen, Ed. Amsterdam, The Netherlands: North-Holland, 1976.
[41] N. Shu, B. Housel, and V. Lum, "CONVERT: A high level translation definition language for data conversion," IBM Corp. Res. Rep. RJ 1500, San Jose, CA, Jan. 1975.
[42] N. Shu et al., "EXPRESS: A data extraction, processing and restructuring system," ACM Trans. Database Syst., vol. 2, no. 2, June 1977.
[43] D. W. Shipman, "The functional data model and the language DAPLEX," ACM Trans. Database Syst., vol. 6, no. 1, pp. 140-173, Mar. 1981.
[44] J. M. Smith et al., "MULTIBASE-Integrating heterogeneous distributed database systems," in Proc. 1981 Nat. Computer Conf. Reston, VA: AFIPS Press, pp. 487-499.
[45] G. Sockut, "A framework for logical-level changes within data base systems," IEEE Computer, vol. 23, pp. 9-27, May 1985.
[46] E. Wong and R. H. Katz, "Logical design and schema conversion for relational and DBTG databases," in Proc. Int. Conf. on Entity-Relationship Approach to System Analysis and Design, P. P. Chen, Ed., Los Angeles, CA, 1979.
