For example, suppose we want to aggregate the REVENUE table. The DDL (Data Definition
Language) statement to create the view would look similar to this:
create materialized view revenue_summary
build immediate
refresh complete
as
select r.product_id, r.location_id, sum(r.revenue_dollars) sum_revenue_dollars
from revenue r
group by r.product_id, r.location_id;
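This statement defines the materialized view revenue_summary and, because of the
build immediate clause, populates it as soon as the script is executed. As written, the
view is refreshed only on demand, which is the default. For the view to be updated
whenever changes to the REVENUE table are committed, the refresh clause must also
specify on commit.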
For Oracle 8i to perform a fast refresh (that is, one that doesn't rebuild the whole table),
the DBA must define a materialized view log on the source table. For the
revenue_summary example above, the log would be created as follows:
create materialized view log on revenue
with rowid (product_id, location_id, revenue_dollars)
including new values;
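With the log in place, a fast-refreshable variant of the view might look like the sketch below. It assumes the usual requirement for single-table aggregate views that count(*) and a count of each summed column appear in the select list; the view name revenue_summary_fast and the count column aliases are hypothetical, and the exact restrictions should be checked against the documentation for your release.

```sql
-- Sketch only: revenue_summary_fast and the cnt_* aliases are made-up names.
-- Assumes the materialized view log on revenue shown above already exists.
create materialized view revenue_summary_fast
build immediate
refresh fast on commit
as
select r.product_id, r.location_id,
       count(*) cnt_rows,                             -- needed for fast refresh
       count(r.revenue_dollars) cnt_revenue_dollars,  -- paired with the sum below
       sum(r.revenue_dollars) sum_revenue_dollars
from revenue r
group by r.product_id, r.location_id;
```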
Parallelism
Parallelism involves the ability of software to take advantage of multi-CPU machines to
reduce response time when queries are passed to the database engine for processing.
The two most commonly used multi-CPU machines are:
A number of operations can be parallelized using Oracle 8i, and some of them are crucial
to the loading, transformation and population of the data warehouse. Oracle 8i can
parallelize more than 20 operations. Some of the operations are:
i. Table scan
ii. Not in
iii. Group by
iv. Select distinct
v. Aggregation
vi. Order by
vii. Create table as select
viii. Index maintenance
ix. Inserting rows from other tables
x. Enabling constraints
xi. Star optimization
Degree of Parallelism
The degree of parallelism is the number of query processes associated with a single
operation.
It can be set at:
The statement level or
The object level.
At the Statement Level
This can be accomplished during any phase of data warehouse activity by using hints.
Hints are special keywords used to influence the way the optimizer processes queries.
The optimizer is a set of routines invoked when a query is passed to Oracle; it ensures
that the query is processed as efficiently as possible, based on the nature of the data in
the tables the query references.
Using hints, the developer can influence the degree of parallelism to be used on a query
and which structures should be parallelized to what degree.
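As an illustration, a statement-level hint on the REVENUE table from the earlier example might look like this. The degree of 4 is an arbitrary value chosen for the sketch, not a recommendation.

```sql
-- full(r) asks for a full table scan of revenue (alias r);
-- parallel(r, 4) asks for 4 query processes on that scan (4 is an assumed value).
select /*+ full(r) parallel(r, 4) */
       r.product_id, sum(r.revenue_dollars) sum_revenue_dollars
from revenue r
group by r.product_id;
```

Because a hint is written as a comment, a misspelled hint is silently ignored rather than reported as an error, so it is worth checking the execution plan.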
At the Object Level
This is the best place to define the degree of parallelism. The familiar create table
statement includes a parallel (degree n) clause, where n refers to the optimal number of
query processes that will be used to process queries against the table.
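A minimal sketch of the clause follows. The table revenue_history, its column types, and the degree of 4 are all assumptions made for the example.

```sql
-- Hypothetical table; column types and degree 4 are assumed for illustration.
create table revenue_history (
  product_id      number,
  location_id     number,
  revenue_dollars number(12,2)
)
parallel (degree 4);

-- The same attribute can be set on a table that already exists.
alter table revenue parallel (degree 4);
```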
Turning on Parallel Query at the Instance Level
A database administrator places the following entries in the initialization parameter file:
PARALLEL_MIN_SERVERS
PARALLEL_MAX_SERVERS
PARALLEL_SERVER_IDLE_TIME
PARALLEL_MIN_SERVERS
This entry determines the number of query processes to spawn when the database is
started. Starting these processes and keeping them running requires additional
memory.
On a UNIX machine, the query server processes are identified as P000 through P0XX,
where XX is the setting for the parameter minus one. Thus, a setting of 12 will spawn
query processes P000 to P011.
PARALLEL_MAX_SERVERS
This determines the maximum number of query processes that can run, including any
extra processes started over and above the number set by the previous parameter.
PARALLEL_MAX_SERVERS is cumulative: it specifies the total number of processes
that may run, not the number of extra processes to start.
Oracle will spawn processes beyond the minimum setting, up to the maximum setting, if
query processing requires more than the minimum. Suppose PARALLEL_MIN_SERVERS
is set to 12 and PARALLEL_MAX_SERVERS to 24; Oracle will spawn up to 12 additional
processes when needed.
PARALLEL_SERVER_IDLE_TIME
This parameter sets the length of time that an extra query process (one started beyond
PARALLEL_MIN_SERVERS, up to the limit set by PARALLEL_MAX_SERVERS) may
remain idle before Oracle terminates it.
It is measured in minutes. In a warehouse with sporadically heavy usage, the number of
P0XX processes may well differ when monitored at different times of day.
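To see which values a running instance has picked up, the parallel parameters can be listed from the v$parameter view. In the sketch below, the commented entries show how the settings might appear in the initialization parameter file; 12 and 24 are the values used in the text, while the idle time of 5 is an assumed value.

```sql
-- Possible initialization parameter file entries (idle time of 5 is assumed):
--   parallel_min_servers      = 12
--   parallel_max_servers      = 24
--   parallel_server_idle_time = 5

-- List the parallel-related parameters currently in effect.
select name, value
from v$parameter
where name like 'parallel%'
order by name;
```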
Tablespace Segregation
The association between the files making up each tablespace is logical rather than
physical. Two types of tablespaces are required in the warehouse:
SYSTEM
The heart of the database, this tablespace contains the data dictionary and the
objects owned by the special users SYS and SYSTEM.
ROLLBACK
This is the tablespace where the rollback segments are stored. These are used to
save pre-update copies of rows changed by users of the database before they
commit or roll back a transaction. Commit is the activity of saving changes to the
database; rollback is the act of returning the data to the state it was in
before the changes were initiated.
TEMPORARY
This is Oracle's scratch pad, where temporary tables are created for the life of the
processing cycle of each query, then cleaned up when no longer required.
TOOLS
This is where Oracle places tables that are used by its own tools delivered with
the database; Oracle suggests that you place other vendors' objects here as well,
rather than in the SYSTEM tablespace.
USERS
This area is set aside for any non-system objects required by applications.
Application Tablespace
The tablespaces that contain the warehouse data must be created manually using an
interface like Oracle Enterprise Manager.
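The same result can be reached with a plain SQL statement. The following is a sketch only: the tablespace name dwh_data, the file path, and all sizes are hypothetical and would need to match your own disk layout and space estimates.

```sql
-- Hypothetical name, path and sizes; adjust to your own environment.
create tablespace dwh_data
  datafile '/u02/oradata/dwh/dwh_data01.dbf' size 500m
  default storage (initial 10m next 10m pctincrease 0);
```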
Guidelines for setting up Tablespaces
Estimate the space required for your data and indexes. Formulae for these
calculations are in works such as the Oracle 8i Server Administrator's Guide.
Often, the row sizes in the warehouse can be estimated from their counterparts in
the operational source systems.
Create an Oracle account to be the keeper of the data for each section of your data
warehouse, or for each individual data mart.
Separate the data and index containers.
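These guidelines can be sketched in SQL as follows. Every name here is hypothetical: the account dwh_owner, its password, the tablespaces dwh_data, dwh_index and temp (all assumed to exist already), and the index revenue_product_idx. The sizing query gives only a rough figure and assumes that statistics have already been gathered on the tables being measured.

```sql
-- Rough size estimate from gathered statistics (run in the schema being measured).
select table_name, num_rows, avg_row_len,
       round(num_rows * avg_row_len / 1024 / 1024) approx_mb
from user_tables
where num_rows is not null
order by approx_mb desc;

-- One owning account per warehouse section or data mart (names are assumed).
create user dwh_owner identified by change_me
  default tablespace dwh_data
  temporary tablespace temp
  quota unlimited on dwh_data
  quota unlimited on dwh_index;

-- Keep indexes in their own tablespace, apart from the table data.
create index revenue_product_idx on revenue (product_id)
  tablespace dwh_index;
```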