Database Management
In all information systems, data resources must be organized and structured in a logical manner so that they can be accessed easily, processed efficiently, retrieved quickly and managed effectively. A character is the most basic element of data that can be observed and manipulated. A field, or data item, is a grouping of related characters, for example Name. It represents an attribute (a characteristic or quality) of some entity (an object, person, place or event). For example, an employee's salary is an attribute that describes an employee. Fields are organized in a logical order, for example last_name, first_name and so on. A record is a collection of attributes that describe one instance of an entity. An example is a person's payroll record. Variable-length records contain a variable number of fields and field lengths. Normally, the first field in a record stores a unique identifier for the record and is called the primary key. A student ID can serve as the primary key as long as no two students share the same ID. If no natural unique identifier exists, the designer can assign a sequential number to each record.
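The relationship between fields, records and primary keys can be sketched in a few lines of Python. This is only an illustration: the class names `EmployeeRecord` and `RecordStore`, the field names and the sample values are assumptions, not part of any particular DBMS.

```python
from dataclasses import dataclass


@dataclass
class EmployeeRecord:
    """One record: a collection of fields describing one employee."""
    employee_id: int   # primary key, unique for each record
    last_name: str     # field (attribute of the entity)
    first_name: str
    salary: float


class RecordStore:
    """Hypothetical in-memory collection of records indexed by primary key."""

    def __init__(self):
        self._records = {}
        self._next_id = 1  # sequential number used when no natural key exists

    def add(self, last_name, first_name, salary, employee_id=None):
        if employee_id is None:
            # No natural identifier supplied: assign the next unused number.
            while self._next_id in self._records:
                self._next_id += 1
            employee_id = self._next_id
        if employee_id in self._records:
            # A primary key must never be shared by two records.
            raise ValueError(f"duplicate primary key: {employee_id}")
        record = EmployeeRecord(employee_id, last_name, first_name, salary)
        self._records[employee_id] = record
        return record

    def get(self, employee_id):
        """Retrieve a record directly by its primary key."""
        return self._records[employee_id]
```

Calling `add` without an `employee_id` mirrors the case where the designer assigns a sequential number; passing an identifier that already exists is rejected, which is the uniqueness rule for a primary key.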
Database Structures
Pictures, videos, songs, messages, chats, icons, email addresses and other items on popular social networking websites are stored as fields, records, files or objects in large databases. Data are stored so that they are easy to access, can be shared by their respective owners and are protected from unauthorized access or use. Database management system (DBMS) packages use a logical data structure to give end users quick, easy access to information stored in databases. Early mainframe DBMS packages used a hierarchical structure, in which the relationships between records form a tree-like structure. There is one root record and any number of subordinate levels, and all relationships are one-to-many. Any data element can be accessed by moving progressively down from the root and along the branches of the tree until the desired record is located. The network structure is more complex and is still used by some mainframe DBMS packages. It allows many-to-many relationships. For example, a department record can be related to more than one employee record, and an employee record can be related to more than one project record.
Figure: Hierarchical structure (Project A, Project B; Employee 1, Employee 2, Employee 3) and network structure (Employee 1, Employee 2; Project A, Project B).
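The difference between the two structures can be sketched as follows. The root record name "Department", the assignment of employees to projects and the helper names are assumptions made for illustration; only the record labels come from the figure.

```python
class Node:
    """A record in a hierarchical structure: one parent, many children."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child


def find_path(root, target):
    """Walk down from the root along the branches until target is found.

    Returns the list of record names on the path, or None if absent.
    """
    if root.name == target:
        return [root.name]
    for child in root.children:
        path = find_path(child, target)
        if path is not None:
            return [root.name] + path
    return None


# Hierarchical: every employee record hangs under exactly one project.
root = Node("Department")  # assumed root record
project_a = root.add(Node("Project A"))
project_b = root.add(Node("Project B"))
project_a.add(Node("Employee 1"))
project_a.add(Node("Employee 2"))
project_b.add(Node("Employee 3"))

# Network: many-to-many links, stored here as (employee, project) pairs.
assignments = {
    ("Employee 1", "Project A"),
    ("Employee 1", "Project B"),
    ("Employee 2", "Project A"),
}


def projects_of(employee):
    """All projects linked to one employee."""
    return sorted(p for e, p in assignments if e == employee)


def employees_on(project):
    """All employees linked to one project."""
    return sorted(e for e, p in assignments if p == project)
```

In the tree, a record is reachable only through its single parent, whereas the pair set lets Employee 1 belong to both projects without duplicating the employee record.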
Relational model
The relational model is the most widely used. All data elements are stored in simple two-dimensional tables, called relations. Tables are flat files in which each row represents a record and each column represents a field. A database can specify data attributes for many files simultaneously and can relate data elements in one file to those in one or more other files. For example, a manager may retrieve an employee's name and salary from the employee table, as well as the department name from the department table. Three basic operations can be performed on a relational database to create useful data sets. The select operation creates a subset of records that meet stated criteria; for example, it may be used on an employee database to find all employees who have served 2 years and earn more than Rs. 3 lakhs per year. The join operation temporarily combines two or more tables so that a user can see relevant data from all of them. The project operation creates a subset of the columns contained in the temporary tables created by the select and join operations. Large mainframe relational databases include Oracle 10g from Oracle and DB2 from IBM. A popular midrange database application is SQL Server from Microsoft. A common database for the PC is Microsoft Access.
Relational structure
Department Table
Deptno    Dname    Dloc    Dmgr
Dept A
Dept B
Dept C
Employee Table
Empno    Ename    Etitle    Esalary    Deptno
Emp 1                                  Dept A
Emp 2                                  Dept A
Emp 3                                  Dept B
Emp 4                                  Dept B
Emp 5                                  Dept C
Emp 6                                  Dept B
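The select, join and project operations described above can be tried with Python's built-in sqlite3 module. This is a minimal sketch: the column names follow the tables above, but the `years` column, the sample rows and the "at least 2 years" reading of the select criterion are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (deptno TEXT PRIMARY KEY, dname TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " empno TEXT PRIMARY KEY, ename TEXT, esalary INTEGER,"
    " years INTEGER, deptno TEXT REFERENCES department(deptno))"
)
# Illustrative rows only; salaries are in rupees (3 lakhs = 300000).
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [("A", "Sales"), ("B", "Finance")],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?)",
    [
        ("E1", "Asha", 350000, 3, "A"),
        ("E2", "Ravi", 280000, 5, "A"),
        ("E3", "Meena", 420000, 1, "B"),
    ],
)

# Select: a subset of rows that meet the criteria.
selected = conn.execute(
    "SELECT * FROM employee WHERE years >= 2 AND esalary > 300000"
).fetchall()

# Join: temporarily combine both tables on the shared deptno column.
joined = conn.execute(
    "SELECT e.empno, e.ename, e.esalary, e.years, d.deptno, d.dname"
    " FROM employee AS e JOIN department AS d ON e.deptno = d.deptno"
    " ORDER BY e.empno"
).fetchall()

# Project: keep only some columns of the joined result.
projected = conn.execute(
    "SELECT e.ename, e.esalary, d.dname"
    " FROM employee AS e JOIN department AS d ON e.deptno = d.deptno"
    " ORDER BY e.empno"
).fetchall()
```

In SQL all three operations are expressed through one `SELECT` statement: the `WHERE` clause performs the select, the `JOIN` clause performs the join, and the column list performs the project.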
Multidimensional database
Figure: Four multidimensional views that combine the dimensions product (Camera, TV, VCR, Audio), region (East, West, South, Total), time period (January, February, March, Qtr 1), measure (Sales, Margin) and scenario (Actual, Budget, Forecast, Variance).
Object-oriented structure
Bank Account Object
Attributes (Customer, Balance, Interest) Operations (Deposit, Withdraw, Get owner)
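The bank account object above maps naturally onto a class, which bundles the attributes with the operations that act on them. The following is a sketch only: treating `interest` as an annual rate expressed as a fraction, and rejecting overdrafts, are assumptions that the figure does not state.

```python
class BankAccount:
    """An object that encapsulates data (attributes) and behaviour (operations)."""

    def __init__(self, customer, balance=0.0, interest=0.0):
        self.customer = customer
        self.balance = balance
        self.interest = interest  # assumed: annual rate as a fraction, e.g. 0.04

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount

    def withdraw(self, amount):
        if amount <= 0:
            raise ValueError("withdrawal must be positive")
        if amount > self.balance:
            # Assumed rule: the account may not be overdrawn.
            raise ValueError("insufficient funds")
        self.balance -= amount

    def get_owner(self):
        return self.customer
```

Other code changes the balance only through `deposit` and `withdraw`, which is the encapsulation that distinguishes an object from a plain record.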
Database development
DBMS packages such as Microsoft Access or Lotus Approach allow end users to develop databases easily. Large organizations place control of enterprise database development in the hands of database administrators (DBAs) and other database specialists. This improves the security and integrity of organizational databases. Database developers use a data definition language (DDL) to develop and specify the data contents, relationships and structure of each database, and to modify them when necessary. This information is cataloged and stored in a database of data definitions and specifications called a data dictionary or metadata repository. The data dictionary contains metadata (data about data). It holds the names and descriptions of all types of data records and their relationships; the requirements for end users' access and use of application programs; and information on database maintenance and security. An active data dictionary enforces these definitions automatically: it would not allow a data entry program to use a non-standard definition of a customer record, nor to enter a customer name that exceeds the defined size of that data element.
Figure: Entity relationship diagram with the entities Purchase order, Product, Stock and Warehouse, linked by the relationships Ordered on, Contains, Stocked as and Holds.
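The idea of defining a customer record with DDL and then having the definition enforced can be approximated with SQLite. This is a sketch under assumptions: the 30-character limit, the table layout and the helper `add_customer` are invented for illustration, and SQLite's built-in catalog only stands in for a full data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: specify the contents and structure of the customer record.
conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL CHECK (length(name) <= 30))"  # assumed size limit
)


def add_customer(conn, name):
    """Try to enter a customer; return False if the definition is violated."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO customer (name) VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        return False


# Metadata (data about data): ask the catalog which columns the table has.
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
```

A name longer than the defined size is rejected by the database itself rather than by each data entry program, which is the behaviour attributed to an active data dictionary above.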
Types of databases
Operational databases store the detailed data needed to support the business processes and operations of a company. They are also called subject area databases, transaction databases and production databases. For example, a human resource database would include data identifying each employee and his or her time worked, compensation, benefits, performance appraisals, and training and development status.
Many organizations replicate copies or parts of databases to servers at different sites. These distributed databases can reside on network servers on the World Wide Web, on corporate intranets or on extranets. They may be copies of operational or analytical databases, hypermedia or discussion databases, or any other type of database. Replication improves database performance at work sites. A company with many branch operations may distribute data so that each branch is the location of its own branch database. Distribution also protects valuable data: if all data reside in a single physical location, a catastrophe such as a fire or damage to the data media can result in data loss. Keeping all copies of the data consistent and concurrently updated is the major challenge. Replication uses a special software application that looks at each distributed database and finds the changes made to it. Once the changes are identified, the replication process makes all the distributed databases look the same by applying the appropriate changes to each one. This takes time and computer resources, depending on the number and size of the distributed databases.
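The replication process described above can be sketched with in-memory dictionaries standing in for the distributed databases. Everything here is an assumption made for illustration: the `Site` class, the caller-supplied timestamps and the rule that the latest write wins when two sites change the same item. Real replication software is considerably more involved.

```python
class Site:
    """Hypothetical distributed database at one location."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.log = []  # changes made locally since the last replication

    def write(self, key, value, timestamp):
        self.data[key] = value
        self.log.append((timestamp, key, value))


def replicate(sites):
    """Make all sites look the same; return the number of changes applied."""
    # Step 1: look at each database and find the changes made to it.
    changes = []
    for site in sites:
        changes.extend(site.log)
        site.log = []
    # Step 2: apply every change to every site in timestamp order,
    # so that the most recent write to a key wins (assumed policy).
    changes.sort(key=lambda change: change[0])
    for site in sites:
        for _, key, value in changes:
            site.data[key] = value
    return len(changes)
```

The cost grows with both the number of sites and the number of changes, which matches the remark that replication takes time and computer resources.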
Duplication identifies one database as the master and then duplicates it at a prescribed time after hours.
External databases are available for a fee from commercial online services, and with or without charge from many sources on the World Wide Web.
A hypermedia database stores hyperlinked pages of multimedia (text, graphic and photographic images, video clips and audio segments). A Web browser on a client PC connects to a Web network server. This server runs Web server software to access and transfer the Web pages you request. Web page content may be described in HTML or XML.
Data warehouses
A data warehouse stores data extracted from multiple databases. It is a central source of data that have been cleaned, transformed and cataloged so that managers and business professionals can use them for data mining, online analytical processing, and other forms of business analysis, market research and decision support. Data warehouses may be subdivided into data marts, which focus on specific aspects of a company, such as a department or a business process. Metadata (data that define the data in the warehouse) are stored in a metadata repository and cataloged by a directory. Unlike data in operational databases, data in a warehouse are static. This restriction allows queries to be made on the data to look for complex patterns or historical trends.
Data mining
In data mining, the data in a warehouse are analyzed to reveal hidden correlations, patterns and trends in historical business activity. Data mining software uses advanced pattern recognition algorithms, along with various mathematical and statistical techniques, to sift through terabytes of data. Companies use data mining to (1) perform market-basket analysis to identify new product bundles, (2) find the root causes of quality or manufacturing problems, and (3) prevent customer attrition and acquire new customers.
Figure: Data mining, with the labels Databases, Selection, Target data, Data transformation, Data Warehouse and Patterns.
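As a small illustration of the market-basket analysis mentioned above, the sketch below counts how often two products appear in the same purchase. It is plain co-occurrence counting, not the algorithm of any particular data mining package; the sample baskets, the function names and the support threshold are assumptions.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each pair of products."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes repeats and gives each pair one fixed order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(baskets, min_support):
    """Pairs bought together in at least min_support (a fraction) of baskets."""
    total = len(baskets)
    return {
        pair: count / total
        for pair, count in pair_counts(baskets).items()
        if count / total >= min_support
    }


# Illustrative purchase records, one list per transaction.
baskets = [
    ["camera", "tv"],
    ["tv", "vcr"],
    ["camera", "tv", "vcr"],
    ["audio", "tv", "vcr"],
]
bundles = frequent_pairs(baskets, 0.5)
```

Pairs that clear the threshold are candidates for new product bundles; on warehouse-scale data the same idea requires far more efficient algorithms than this exhaustive count.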
File processing
Earlier, each business application was designed to use one or more data files containing specific data records. Such file processing systems had the following problems. Independent data files included a lot of duplicated data. The same data (such as a customer's name and address) were recorded and stored in many files. This data redundancy required file maintenance programs to ensure that each file was properly updated. Keeping data in independent files also made it difficult to provide end users with information for ad hoc requests that required access to data stored in many files. Special programs had to be written to retrieve data from each independent file. The organization of the files, their physical location on storage hardware and the application software used to access them all depended on one another, so changes in the format and structure of the data and records in a file required program maintenance efforts. Different users and applications could define data elements such as stock number and customer address differently. This lack of standards caused inconsistency problems in data access. The integrity (accuracy and completeness) of the data was suspect because there was no control over their use and maintenance by authorized end users.