ASCI Enterprise Job Scheduling Software White Paper

Job Scheduling: A Strategic Pathway for Improved Data Warehouse/Business Intelligence Performance

Contents
Introduction
The Real Time Enterprise
Data Management Options
Increasing DBMS Efficiency
ActiveBatch: Unique Value for BI
Conclusion

Introduction
It's hard to believe, in an era when data arrives at the end of a fiber optic cable and seconds-old information is considered dated, that the term Business Intelligence is more than 50 years old. As far back as 1958, businesspeople realized the need to create a computing infrastructure that would allow their organizations to systematically gather, access and analyze operational data in order to run better.

Over the years, as the ability for companies to collect Business Intelligence (BI) data has grown exponentially, so has the need to efficiently organize and store it. This need has led to the notion of the data warehouse: a single, centralized repository of the data that describes an organization's full gamut of activities. Bill Inmon, generally regarded as the father of data warehousing, says that information in a data warehouse must be four things: subject-oriented (organized by topic); time-variant (tracked over time); non-volatile (read-only); and integrated (complete and consistent).

Information in a Data Warehouse Must Be Four Things:
1. Subject-oriented: organized by topic
2. Time-variant: tracked over time
3. Non-volatile: read-only
4. Integrated: complete and consistent

Clearly, BI systems are only as effective as the quality and timeliness of the data supporting them. But as information continues to flood in to the typical data warehouse from conventional sources as well as newer points of origin such as RFID readers and Web Services applications, increasingly sophisticated technologies and standards are needed to manage the swelling flood of information. Time to insight, a key goal of BI, is often compromised when IT systems cannot deliver adequate storage and processing capabilities.

Master Data Management (MDM), which seeks to implement an official, consistent set of identifiers and hierarchies to a company's data sets, is another example of the coping disciplines currently gaining currency. Effective resource utilization, workload balancing, and scalability also become critical, both for storage and data processing resources.

The Real Time Enterprise
Perhaps the biggest single evolutionary driver in the expansion of modern data warehouse management is the democratization of Business Intelligence. Where BI was once the purview of specially trained business analysts who used sophisticated software to slice-and-dice highly complex data sets, the advent of easy-to-use, multidimensional analytical applications has extended BI to the farthest corners of the enterprise.

Today, managers at all levels can instantly call up sales, manufacturing, finance, human resource and other kinds of data to pose queries and gain insight for better decision-making. Online Analytical Processing (OLAP), Google Docs, Service Oriented Architecture (SOA) implementations and other developments have given, and in some cases mandated, BI availability to nearly everyone whose job description involves some level of business management.

No longer is analysis one step removed from daily operations; in fact, Gartner predicts that by the end of 2009, 90% of Global 2000 companies will have implemented some type of mission-critical dependency between the [data] warehouse and at least one revenue-supporting or cost-controlling operational application, up from less than 25% in 2007. It also finds that the same percentage of Global 2000 companies already have analytics engines built into their operational applications, or plan to have them by the end of 2010 (Operational Analytics and the Emerging Mission-Critical Data Warehouse, May 2007).

Increasing Demand for Real-Time Data Integration

With the shift from backroom strategic application to essential daily management tool comes the need for real-time, or near real-time, data collection. In fact, another Gartner study researched the need for low-latency data delivery (Gartner Study on Data Integration Identifies Key Usage Trends, February 2006). The report found strong demand for low-latency data delivery on a global basis, with organizations in aggregate indicating that more than 60 percent of their data integration activities must happen with latency of one hour or less, and over 35 percent with latency of less than one minute. This significant shift in the past several years can be attributed to greater levels of competition, customer demand for rapid service in all industries, and the overall business climate. Federal/state/local governments, financial services, manufacturing, retail, transportation and utilities were found to be the segments with the highest need for real-time requirements.

Increasing Business Demands Require That Many Industries Perform with Real-Time Data Integration

The number of enterprises that can get by with simple batch-oriented, high-latency data refresh programs continues to drop. Those that still populate their data warehouses nightly are primarily using them for long-term strategic BI. More common today are refresh rates that are semi-daily, hourly, or semi-hourly.

It's important to note that true real-time, or instantaneous, data availability is oftentimes more an ideal driven by perceived competitive pressures than a necessary or even desirable goal. While many enterprises report that users are demanding data refresh rates down to the millisecond, only those individuals who depend on business-aware applications, in fields like production management or transactional processing, are likely to need such immediate and fluid information. Most operational applications, and nearly all BI tools, can fulfill their tasks with periodic or near real-time data.

Data Management Options

To provide the relevant, timely and optimized data businesspeople need to do their jobs, it's necessary to continually process and update the information stored in a data warehouse. A number of data management applications and architectures are available to fulfill these functions; most Database Management Systems (DBMS), and even operating systems like UNIX and Windows, offer some sort of scheduler to coordinate processing tasks. Yet because most of these are either attached to a single database server, focused on database maintenance only, or restricted to software from a specific vendor, they often are too myopic for broader, enterprise-wide data management requirements.

Dedicated job scheduling applications, on the other hand, have the necessary power and capability to perform far beyond the constraints of OS- or DBMS-based scheduling tools. Job schedulers can trigger tasks based on events, rather than simply date or time, and can accommodate unpredictable or one-time occurrences. They can recover/restart automatically in case of job failure, generate execution reports on scheduled tasks, and provide audit trails for compliance purposes.

In an era when quick data loads are essential to BI performance, yet manual involvement is often necessary, job schedulers can completely automate the loading and execution process. Even more importantly, job schedulers can leverage and load-balance large numbers of jobs across multiple servers and storage devices, effectively increasing the efficiency of an enterprise's IT infrastructure while also completing jobs faster, all at little or no additional cost.

When selecting a job scheduler for data warehouse management, it's important to seek one that can separate different tasks, e.g., optimizing data for quick loads, yet also offer a framework for tying together many kinds of configurations and defining job streams as tightly as needed. It should also have the ability to pick off events from different machines in a multi-OS environment, and run flat file (two-dimensional) as well as multi-dimensional database jobs.

Job schedulers can allow data warehouse administrators to take advantage of processing capacity beyond the server or system in question, and maximize use of available resources. Best-of-breed schedulers also support heterogeneous OS environments (e.g., Linux, UNIX, Windows, z/OS and OpenVMS), thereby removing roadblocks to important applications and data.
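Two of the capabilities described in this section, load balancing across servers and automatic restart after a job failure, can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not a description of how any particular scheduler works: the function names, the job cost estimates and the server names are all hypothetical, and a real scheduler would measure load at run time rather than rely on fixed estimates.

```python
from __future__ import annotations

import heapq
from typing import Callable


def assign_least_loaded(job_costs: dict[str, float],
                        servers: list[str]) -> dict[str, list[str]]:
    """Greedy balancing: hand each job, largest first, to the server
    that currently carries the smallest total estimated cost."""
    heap = [(0.0, name) for name in servers]  # (current load, server name)
    heapq.heapify(heap)
    plan: dict[str, list[str]] = {name: [] for name in servers}
    ordered = sorted(job_costs.items(), key=lambda kv: kv[1], reverse=True)
    for job, cost in ordered:
        load, name = heapq.heappop(heap)      # least-loaded server so far
        plan[name].append(job)
        heapq.heappush(heap, (load + cost, name))
    return plan


def run_with_restart(job: Callable[[], None], max_attempts: int = 3) -> int:
    """Run a job, restarting it after a failure, and return the number
    of attempts used. The last failure is re-raised to the caller."""
    for attempt in range(1, max_attempts + 1):
        try:
            job()
            return attempt
        except Exception:
            if attempt == max_attempts:
                raise
    raise ValueError("max_attempts must be at least 1")
```

Sorting the jobs largest-first before assigning them is a common heuristic for keeping the servers evenly loaded; it does not guarantee an optimal split.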

Increasing DBMS Efficiency

Event-based scheduling, of the type found in Advanced Systems Concepts' ActiveBatch Job Scheduling and Workload Automation application, is perhaps the most distinctive and valuable advantage of job scheduling software. Event triggers can not only shrink the total daypart(s) devoted to processing, but also increase responsiveness. Furthermore, depending on how widely the scheduler's agents are implemented, the processing environment can approach cloud computing status and even create a greener IT infrastructure.

Job streams can be assembled with triggers linked to dozens of events, from system startup and file creation/modification/deletion to runaway processes, job failures and much more. With the addition of active variables, it's also possible to interrogate data as a means of triggering a task. Examples of active variables include Date Expression, File Contents, SQL Record Set, Web Service Request Result, WMI Query, XML Query, and File System Information.

Event-driven scheduling, because it responds to the fluid and sometimes unpredictable pace of business, is ideal for near real-time data inserts for mission-critical applications. Because low-latency environments often necessitate the combination of more volatile data with stable master data repositories, event-based job schedulers can be used to establish holding areas at set times of day, integrating the two on a scheduled basis.

Other pluses exist as well for the event-based model. While many IT departments use their DBMS to build processing jobs, it can be advantageous to use a job scheduler to tie together smaller packages of jobs, since it effectively separates the execution logic from the data transfer logic. In this way the job scheduler's event-based scheme can manage even large job packages, reporting on success/fail and minimizing checkpointing. Coding and maintenance are also reduced.

Of course, virtually all job schedulers can also accommodate date/time scheduling for routine database refreshes. The combination of the two, as found in ActiveBatch, creates maximum opportunity to furnish relevant and actionable data to decision makers across the enterprise.
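To make the idea of file creation, modification and deletion triggers more concrete, the following Python sketch compares two snapshots of a directory and maps the resulting events to job names. It is an illustration only: the function names, the event labels and the trigger table are assumptions made for this example and do not represent ActiveBatch's interface. A production scheduler would typically rely on operating system notifications rather than polling.

```python
from __future__ import annotations

import os


def snapshot(directory: str) -> dict[str, float]:
    """Map each regular file in the directory to its last-modified time."""
    with os.scandir(directory) as entries:
        return {e.name: e.stat().st_mtime for e in entries if e.is_file()}


def diff_events(before: dict[str, float],
                after: dict[str, float]) -> list[tuple[str, str]]:
    """Compare two snapshots and list (event kind, file name) pairs."""
    events: list[tuple[str, str]] = []
    for name in sorted(after.keys() - before.keys()):
        events.append(("created", name))
    for name in sorted(before.keys() - after.keys()):
        events.append(("deleted", name))
    for name in sorted(before.keys() & after.keys()):
        if after[name] != before[name]:
            events.append(("modified", name))
    return events


def jobs_to_run(events: list[tuple[str, str]],
                triggers: dict[str, list[str]]) -> list[str]:
    """Look up the jobs registered for each event kind, without repeats.
    The trigger table (event kind -> job names) is a hypothetical format."""
    jobs: list[str] = []
    for kind, _file_name in events:
        for job in triggers.get(kind, []):
            if job not in jobs:
                jobs.append(job)
    return jobs
```

In use, a polling loop would call snapshot() at a fixed interval, pass the previous and current results to diff_events(), and submit whatever jobs_to_run() returns. A date/time schedule can sit alongside this loop for routine refreshes.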

ActiveBatch: Unique Value for BI

In the modern enterprise, data warehouses used for Business Intelligence must collect data from a bewildering array of sources. The need for fast loading of product transactions, for example, is critical; in some cases, order data must be integrated into production processes even before the sale is complete. ETL (Extract/Transform/Load) tools found in IBM WebSphere DataStage, Microsoft SQL Server Integration Services and the open source Apatar application can accomplish many of these tasks; however, the right job scheduler can simplify the creation of a job stream framework that tightly coordinates ETL workflows, ensuring that such processes are handled efficiently.

ActiveBatch, because of its easy-to-use management environment, can add new data sources automatically, with little or no human decision-making. The elimination of manual intervention allows enterprises to configure and integrate data from more sources, more quickly than ever before. In addition, ActiveBatch's ability to balance workloads across many servers has allowed users to improve service levels by completing jobs in less time, and with fewer errors.

The unique Integrated Jobs Library in ActiveBatch can be used to quickly create ETL workflows without custom scripting. For example, it can be used to quickly create workflows that reach across multiple SQL Servers to job-chain multiple Data Transformation Services jobs. Alternatively, it can create job plans that pass information from one database to another or to various applications. It can also integrate management tasks with other scripts or applications.

Workload balancing, higher server utilization and scalability are of utmost importance to most enterprises. ActiveBatch's strengths in these areas have given users better use of their IT resources; in fact, ActiveBatch has been proven to reliably run over one million jobs per day and connect to 2,000 servers, ensuring its efficiency and dependability in production environments of all sizes.
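The job-chaining idea described above, where one step of an ETL workflow runs only after the steps it depends on have succeeded, can be sketched in a few lines of Python. This is a generic illustration built on the standard library's graphlib module, not the Integrated Jobs Library itself. The function name, the step names and the status labels are hypothetical, and the sketch assumes every dependency is itself listed as a step.

```python
from __future__ import annotations

from graphlib import TopologicalSorter
from typing import Callable


def run_plan(steps: dict[str, Callable[[], bool]],
             depends_on: dict[str, set[str]]) -> dict[str, str]:
    """Run each step after its prerequisites, in dependency order.

    steps maps a step name to a callable that returns True on success.
    depends_on maps a step name to the names it must wait for.
    Returns a status per step: "ok", "failed" or "skipped".
    """
    graph = {name: depends_on.get(name, set()) for name in steps}
    status: dict[str, str] = {}
    for name in TopologicalSorter(graph).static_order():
        prerequisites = depends_on.get(name, set())
        if any(status[p] != "ok" for p in prerequisites):
            status[name] = "skipped"   # an upstream step did not succeed
        else:
            status[name] = "ok" if steps[name]() else "failed"
    return status
```

Keeping the chain definition (depends_on) separate from the step bodies mirrors the separation of execution logic from data transfer logic: the steps can change without the ordering rules being rewritten.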

ActiveBatch also minimizes investment risk by maximizing utilization of existing computing power, reducing elapsed job completion time and allowing more business processes to be executed. ETL, for example, is particularly well-suited to distributed process scheduling on commodity (Intel, Windows or Linux) servers surrounding the central data warehouse.

For large enterprises with discrete data marts or operational data stores, it's often necessary or desirable to allow multiple departments to create workflows, or to manipulate objects such as calendars, alerts, or schedules. ActiveBatch can also accommodate this need with its Virtual Root, a capability that gives individuals, work teams, departments and even business units log-in protected access to those objects, jobs and plans appropriate to their job descriptions. Permissions in the Virtual Root are controlled centrally and can be set by user, group or other established unit on a granular basis, allowing those closest to the BI or operational need to make necessary changes without being able to access, modify or even see objects or workflows pertaining to other parts of the enterprise.
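The scoping behavior described for the Virtual Root can be approximated with a simple path-prefix filter, shown below in Python. This is a conceptual sketch only and makes no claim about how ActiveBatch implements the feature: the Grant class, the path layout and the function name are all assumptions introduced for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """A centrally managed permission (hypothetical structure)."""
    principal: str  # a user name or a group name
    root: str       # path prefix the principal may see, e.g. "/finance"


def visible_objects(objects: list[str], grants: list[Grant],
                    user: str, groups: list[str]) -> list[str]:
    """Return only the object paths that fall under a root granted to
    the user directly or through one of the user's groups."""
    principals = {user, *groups}
    roots = [g.root.rstrip("/") for g in grants if g.principal in principals]
    return [
        path for path in objects
        if any(path == root or path.startswith(root + "/") for root in roots)
    ]
```

Matching on the root followed by a slash, rather than on the bare prefix, keeps a grant on "/finance" from exposing an unrelated tree such as "/financeX".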

[Figure: Architecture chart, cross-platform ETL]

Conclusion
As data warehouses become increasingly mission-critical in nature, and as BI extends its march across the modern enterprise, database administrators face the daunting task of making their repositories increasingly agile and responsive. At the same time, as any IT professional knows, global business imperatives demand that more be done with less.

Many organizations already use dedicated job schedulers like ActiveBatch to manage their data processing workloads. By leveraging their existing job schedulers for data warehouse management, IT organizations can improve their ability to handle more sophisticated database needs while also reducing the time devoted to script writing and routine job management.

The inclusion of an existing job scheduler, or installation of a new job scheduler, can optimize data warehouse performance, shortening BI time to insight in an era when the need for timely, accurate and user-optimized business information has never been greater. Ultimately, an organization's ability to compete is only as great as the information it has available to it, which is perhaps the best reason of all to put a best-of-breed job scheduler into the data warehouse management mix.
Copyright Advanced Systems Concepts, Inc. All rights reserved
