RAHIKA PATIL APURVA SHAMRAJ RUTUJA SHIRPEWAR SHRIKANT GHAYTADAK SAHIL RAMTAKE SHAIKH GULAM MUKASSAR
DBMS
Data is one of the most important assets of a company. It is very important to make sure data is stored and maintained accurately and quickly. A DBMS (Database Management System) is a set of programs used to store and manipulate data. Manipulation of data includes the following:
Adding new data, for example adding the details of a new student.
Deleting unwanted data, for example deleting the details of students who have completed the course.
Changing existing data, for example modifying the fee paid by a student.
A DBMS provides various functions such as data security, data integrity, data sharing, data concurrency, data independence, and data recovery. However, the database management systems now available in the market, such as Sybase, Oracle, and MS-Access, do not all provide the same set of functions, though all are meant for data management. Database management systems such as Oracle and DB2 are more powerful and meant for bigger companies, whereas database management systems such as MS-Access are meant for small companies. So one has to choose a DBMS depending upon the requirement.
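The three kinds of manipulation listed above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the student table, its columns, and the sample rows are hypothetical and do not come from any particular system.

```python
import sqlite3

# In-memory database; the "student" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "id INTEGER PRIMARY KEY, name TEXT, fee_paid REAL, completed INTEGER)"
)

# Adding new data: details of new students.
conn.executemany(
    "INSERT INTO student (id, name, fee_paid, completed) VALUES (?, ?, ?, ?)",
    [(1, "Asha", 5000.0, 0), (2, "Ravi", 7500.0, 1)],
)

# Changing existing data: modify the fee paid by a student.
conn.execute("UPDATE student SET fee_paid = ? WHERE id = ?", (6000.0, 1))

# Deleting unwanted data: remove students who have completed the course.
conn.execute("DELETE FROM student WHERE completed = 1")
conn.commit()

rows = conn.execute(
    "SELECT id, name, fee_paid FROM student ORDER BY id"
).fetchall()
print(rows)
```

Larger systems such as Oracle or DB2 accept similar SQL statements, although connection handling and data types differ from this sketch.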
Features of DBMS
The following are the main features offered by a DBMS. These are general features of database management systems, and each DBMS has its own way of implementing them. A DBMS may offer more features than those discussed here and may also enhance these features. For instance, Oracle is increasingly being fine-tuned to be the database for Internet applications, which may not be found in other database management systems.
Data Security
While a DBMS allows data to be shared, it also ensures that data is accessed only by authorized users. A DBMS provides the features needed to implement security at the enterprise level. By default, the data of a user cannot be accessed by other users unless the owner explicitly grants them permission to do so.
Data Integrity
Maintaining the integrity of data is an important process. If data loses integrity, it becomes unusable garbage. A DBMS provides means to implement rules that maintain the integrity of the data. Once we specify which rules are to be implemented, the DBMS makes sure that these rules are always enforced.
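As a small sketch of how such rules might be declared, the example below uses SQLite constraints through Python's sqlite3 module. The table and the specific rules (a name is required, the fee cannot be negative, ids are unique) are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rules: every student needs a name, the fee cannot be
# negative, and the primary key must be unique.
conn.execute(
    """
    CREATE TABLE student (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        fee_paid REAL NOT NULL CHECK (fee_paid >= 0)
    )
    """
)

conn.execute("INSERT INTO student (id, name, fee_paid) VALUES (1, 'Asha', 5000.0)")

# Each of these rows breaks one rule, so the DBMS should refuse it.
bad_rows = [(2, None, 100.0), (3, "Ravi", -50.0), (1, "Dup", 10.0)]
rejected = []
for row in bad_rows:
    try:
        conn.execute(
            "INSERT INTO student (id, name, fee_paid) VALUES (?, ?, ?)", row
        )
    except sqlite3.IntegrityError:
        rejected.append(row[0])

count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, count)
```

Once the rules are part of the table definition, every insert is checked by the database itself rather than by each application.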
Apart from supporting a non-procedural language like SQL to access and manipulate data, a DBMS nowadays also provides a procedural language for data processing. Oracle supports PL/SQL and SQL Server provides T-SQL.
Data Warehousing
A data warehouse can be defined as any centralized data repository which can be queried for business benefit.
Dramatic advances in data capture, processing power, data transmission, and storage capabilities are enabling organizations to integrate their various databases into data warehouses. Data warehousing is defined as a process of centralized data management and retrieval. Data warehousing, like data mining, is a relatively new term, although the concept itself has been around for years. Data warehousing represents an ideal vision of maintaining a central repository of all organizational data. Centralization of data is needed to maximize user access and analysis. Dramatic technological advances are making this vision a reality for many companies, and equally dramatic advances in data analysis software are allowing users to access this data freely. The data analysis software is what supports data mining. According to Bill Inmon, author of Building the Data Warehouse and the guru who is widely considered to be the originator of the data warehousing concept, there are generally four characteristics that describe a data warehouse:
Subject-oriented: Data are organized according to subject instead of application, e.g. an insurance company using a data warehouse would organize their data by customer, premium, and claim, instead of by different products (auto, life, etc.). The data organized by subject contain only the information necessary for decision support processing.
Integrated: When data resides in many separate applications in the operational environment, encoding of data is often inconsistent. For instance, in one application gender might be coded as "m" and "f", in another by 0 and 1. When data are moved from the operational environment into the data warehouse, they assume a consistent coding convention, e.g. gender data is transformed to "m" and "f".
Time-variant: The data warehouse contains a place for storing data that are five to 10 years old, or older, to be used for comparisons, trends, and forecasting. These data are not updated.
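The consistent coding convention described under "Integrated" can be sketched in a few lines of Python. The two source applications, the customer ids, and in particular which number stands for which gender are assumptions made only for this illustration.

```python
# Hypothetical source encodings. Which number means which gender is an
# assumption; the text only says one application uses 0 and 1.
APP_A_CODES = {"m": "m", "f": "f"}
APP_B_CODES = {0: "m", 1: "f"}


def to_warehouse_gender(value, mapping):
    """Translate a source-specific gender code to the warehouse convention."""
    try:
        return mapping[value]
    except KeyError:
        raise ValueError(f"unknown gender code: {value!r}") from None


# Sample rows as (customer id, gender code) from each operational application.
app_a_rows = [("C001", "m"), ("C002", "f")]
app_b_rows = [("C003", 1), ("C004", 0)]

# Moving data into the warehouse applies one convention to every row.
warehouse_rows = [
    (cid, to_warehouse_gender(code, APP_A_CODES)) for cid, code in app_a_rows
]
warehouse_rows += [
    (cid, to_warehouse_gender(code, APP_B_CODES)) for cid, code in app_b_rows
]
print(warehouse_rows)
```

Raising an error on an unknown code, rather than passing it through, is one possible design choice; it keeps inconsistent encodings from silently entering the warehouse.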
However, continuous innovations in computer processing power, disk storage, and statistical software are dramatically increasing the accuracy of analysis while driving down the cost.
APPLICATIONS
Data Warehousing
Insulate data - i.e. the current operational information
    Gives access to the broadest possible base of data.
Retrieve data - from a variety of heterogeneous operational databases
    Data is transformed and delivered to the data warehouse/store based on a selected model (or mapping definition)
Metadata - information describing the model and definition of the source data elements
Data cleansing - removal of certain aspects of operational data, such as low-level transaction information, which slow down the query times.
Transfer - processed data transferred to the data warehouse, a large database on a high performance box.
Data Mining
Medicine - drug side effects, hospital cost analysis, genetic sequence analysis, prediction etc.
Marketing/sales - product analysis, buying patterns, sales prediction, target mailing, identifying `unusual behavior' etc.
Knowledge Acquisition
Scientific discovery - superconductivity research, etc.
Engineering - automotive diagnostic expert systems, fault detection etc.
ADVANTAGES:
Enhances end-user access to a wide variety of data.
Business decision makers can obtain various kinds of trend reports, e.g. the item with the most sales in a particular area / country for the last two years.
A data warehouse can be a significant enabler of commercial business applications, most notably Customer Relationship Management (CRM).
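A trend report of the kind mentioned above, the item with the most sales in one country over two years, might look like the following query. This is a sketch only: the sales table, its columns, and all figures are invented for illustration, and SQLite stands in for a real warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical warehouse table holding yearly sales per item and country.
conn.execute(
    "CREATE TABLE sales (item TEXT, country TEXT, year INTEGER, units INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("pen", "India", 2009, 400),
        ("pen", "India", 2010, 350),
        ("notebook", "India", 2009, 300),
        ("notebook", "India", 2010, 500),
        ("pen", "India", 2007, 900),      # outside the two-year window
        ("stapler", "Nepal", 2010, 999),  # different country
    ],
)


def top_item(country, first_year, last_year):
    """Return (item, total units) for the best-selling item in the window."""
    return conn.execute(
        """
        SELECT item, SUM(units) AS total
        FROM sales
        WHERE country = ? AND year BETWEEN ? AND ?
        GROUP BY item
        ORDER BY total DESC
        LIMIT 1
        """,
        (country, first_year, last_year),
    ).fetchone()


best = top_item("India", 2009, 2010)
print(best)
```

Because the warehouse keeps historical data, the same query can be rerun with a different year range or country without touching the operational systems.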
DISADVANTAGES:
Data mining systems rely on databases to supply the raw data for input, and this raises problems in that databases tend to be dynamic, incomplete, noisy, and large. Other problems arise as a result of the adequacy and relevance of the information stored.
Limited Information
A database is often designed for purposes different from data mining, and sometimes the properties or attributes that would simplify the learning task are not present, nor can they be requested from the real world. Inconclusive data causes problems because, if some attributes essential to knowledge about the application domain are not present in the data, it may be impossible to discover significant knowledge about a given domain. For example, malaria cannot be diagnosed from a patient database if that database does not contain the red blood cell count of the patients.
SINHGAD INSTITUTE OF MANAGEMENT
Missing data can be treated by discovery systems in a number of ways, such as:
Simply disregard missing values
Omit the corresponding records
Infer missing values from known values
Treat missing data as a special value to be included additionally in the attribute domain
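The four treatments listed above can be illustrated on a tiny, made-up set of patient records. The field names and values are hypothetical, and using the mean to infer a missing value is just one simple choice among many.

```python
# Hypothetical patient records; None marks a missing red blood cell count.
records = [
    {"patient": "P1", "rbc": 4.8},
    {"patient": "P2", "rbc": None},
    {"patient": "P3", "rbc": 5.2},
]

# 1. Simply disregard missing values: compute statistics over known values.
known = [r["rbc"] for r in records if r["rbc"] is not None]
mean_rbc = sum(known) / len(known)

# 2. Omit the corresponding records.
complete_records = [r for r in records if r["rbc"] is not None]

# 3. Infer missing values from known values (here: the mean of known values).
imputed = [
    {**r, "rbc": mean_rbc if r["rbc"] is None else r["rbc"]} for r in records
]

# 4. Treat missing data as a special value added to the attribute domain.
flagged = [
    {**r, "rbc": "UNKNOWN" if r["rbc"] is None else r["rbc"]} for r in records
]

print(mean_rbc, len(complete_records), imputed[1], flagged[1])
```

Each option trades something off: omitting records shrinks the data set, inferring values can introduce bias, and a special value has to be handled explicitly by whatever analysis follows.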