(12) United States Patent
     Suresh et al.

(10) Patent No.: US 6,208,990 B1
(45) Date of Patent: Mar. 27, 2001

(54) METHOD AND ARCHITECTURE FOR AUTOMATED OPTIMIZATION OF ETL THROUGHPUT IN DATA WAREHOUSING APPLICATIONS

(75) Inventors: Sankaran Suresh, Santa Clara; Jyotindra Pramathnath Gautam, Fremont; Girish Pancha, San Francisco; Frank Joseph DeRose, Fremont; Mohan Sankaran, Union City, all of CA (US)

(73) Assignee: Informatica Corporation, Palo Alto, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 09/116,426

(22) Filed: Jul. 18, 1998

(51) Int. Cl.7: G06F 17/30

(52) U.S. Cl.: 707/6; 707/10; 707/100; 707/102; 707/104; 707/200

(58) Field of Search: 707/102, 10, 100-3, 707/104, 200, 201; 395/149; 414/786

(56) References Cited

U.S. PATENT DOCUMENTS

5,403,147 *   4/1995   Tanaka            414/786
5,563,999 *  10/1996   Yaksich et al.    395/149
5,675,785 *  10/1997   Hall et al.       395/613
5,781,911 *   7/1998   Young et al.      707/201
6,014,670 *   1/2000   Zamanian et al.   707/101
6,032,158 *   2/2000   Mukhopadhyay      707/201
6,044,374 *   3/2000   Nesamoney et al.  707/10

FOREIGN PATENT DOCUMENTS

99/24922   5/1999 (WO)   G06F/17/30

OTHER PUBLICATIONS

Weyman, P.J., "The Case For A Process-Driven Approach to Data Warehousing", Database and Network Journal, vol. 7, No. 1, Feb. 1, 1997, pp. 3-6.
Squire, C., "Data Extraction and Transformation for the Data Warehouse", ACM Proceedings of SIGMOD, International Conference on Management of Data, vol. 24, No. 1, Mar. 1, 1995, p. 446/447.
White, C., "Managing Data Transformations", Byte, vol. 22, No. 13, Dec. 1, 1997, p. 53/54.
White, C., "Data Warehousing: Cleaning and Transforming Data", INFO DB, vol. 10, No. 6, Apr. 1997, p. 11/12.

* cited by examiner

Primary Examiner: Thomas G. Black
Assistant Examiner: Thuy Do
(74) Attorney, Agent, or Firm: Wagner, Murabito & Hao LLP

(57) ABSTRACT

A computer software architecture to automatically optimize the throughput of the data extraction/transformation/loading (ETL) process in data warehousing applications. This architecture has a componentized aspect and a pipeline-based aspect. The componentized aspect refers to the fact that every transformation used in this architecture is built up with transformation components selected from an extensible set of transformation components. Besides simplifying source code maintenance and adjustment for the data warehouse users, these transformation components also provide these users the building blocks to effectively construct pertinent and functionally sophisticated transformations in a pipelined manner. Within a pipeline, each transformation component automatically stages or streams its data to optimize ETL throughput. Furthermore, each transformation either pushes data to another transformation component, pulls data from another transformation component, or performs a push/pull operation on the data. Thereby, the pipelining; staging/streaming; and pushing/pulling features of the transformation components effectively optimizes the throughput of the ETL process.

16 Claims, 9 Drawing Sheets

U.S. Patent   Mar. 27, 2001   Sheet 1 of 9   US 6,208,990 B1
[Figure 1: block diagram. Recoverable labels: processor, volatile memory, non-volatile memory, bus, storage (optional), display unit (optional), alphanumeric input (optional), cursor control (optional), signal input/output.]

U.S. Patent   Mar. 27, 2001   Sheet 2 of 9   US 6,208,990 B1
[Figure 2: block diagram. Recoverable labels: Source A, Source B, a third source, transformation engine server, targets.]
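
The pipelining, staging/streaming, and push/pull ideas named in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `stream_filter`, `stage_sort`, `ListTarget`, and `run_pipeline`, the row layout, and the choice of a filter and a sort as example components are all assumptions made for illustration only. It treats a "streaming" component as one that emits each row as soon as it is read, a "staging" component as one that must buffer its whole input first, "pull" as iterating over an upstream component, and "push" as calling a method on a downstream one.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Row = Dict[str, Any]  # hypothetical row representation


def stream_filter(rows: Iterable[Row], keep: Callable[[Row], bool]) -> Iterator[Row]:
    """Streaming component (illustrative): pulls one row at a time from
    upstream and emits it immediately, without buffering."""
    for row in rows:
        if keep(row):
            yield row


def stage_sort(rows: Iterable[Row], key: str) -> Iterator[Row]:
    """Staging component (illustrative): a sort cannot emit its first row
    until it has seen every input row, so it buffers the full input."""
    staged = list(rows)
    staged.sort(key=lambda r: r[key])
    yield from staged


class ListTarget:
    """Hypothetical target that receives rows pushed from upstream."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def push(self, row: Row) -> None:
        self.rows.append(row)


def run_pipeline(source: Iterable[Row], target: ListTarget) -> List[Row]:
    """Chain the components: each stage pulls from the one before it,
    and the final loop pushes every row into the target."""
    filtered = stream_filter(source, lambda r: r["amount"] > 0)
    ordered = stage_sort(filtered, "amount")
    for row in ordered:
        target.push(row)
    return target.rows


source_rows = [
    {"id": 1, "amount": 30},
    {"id": 2, "amount": -5},
    {"id": 3, "amount": 10},
]
result = run_pipeline(source_rows, ListTarget())
```

In this sketch the filter never holds more than one row in memory, while the sort holds all of them; a real engine would presumably choose between the two behaviors per component, which is the trade-off the abstract refers to as staging versus streaming.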
