
Technical White Paper Series

Building Highly Scalable, Highly Available, High-Performance eBusiness Applications

Using BroadVision One-to-One


In today's wired world, the business value of a one-to-one eCommerce web site is enormous. Any web site can leverage different channels for conducting business, but personalized one-to-one web sites introduce the potential to increase sales, improve customer service, and gain valuable user data. To realize this potential, however, a web site must meet three requirements: it must be highly available (HA), high performance (HP), and highly scalable (HS). This document discusses the BroadVision One-To-One architecture, outlines techniques for improving system performance and throughput, and covers methods of ensuring 7 x 24 availability through system monitoring.

Scalability and Availability
Scalability and load balancing are crucial to a highly available, high-performance system. A well-tuned BroadVision One-To-One application should be highly scalable and limited only by the available CPU cycles. BroadVision One-To-One can be tuned for scalability and load balancing in five principal ways: Scalable Components Architecture, Multiple Database Accessors, Multi-Tier Architecture, Sticky Load Balancing, and Service Level Process Grouping.

Scalable Components Architecture
One-To-One is based on a scalable components architecture. Because One-To-One separates the business logic from the presentation logic, and the actual work is performed by efficient C++ components, additional C++ components can easily be added to the system. These C++ components can also take full advantage of platform-specific features.

Multiple Database Accessors
One-To-One provides a distributed CORBA server architecture, so additional accessors can be added to the system to support load balancing through process-level parallelism. Multiple database accessor processes, running on multiple machines, can access the same database.

Multi-Tier Architecture
One performance goal for a one-to-one web site is to optimize dynamic page throughput, that is, the dynamic generation and delivery of the page seen by the visitor. To achieve this goal, One-To-One balances load by supporting flexible configurations in which multiple process instances run on multiple machines. The HTTP Server, Interaction Manager, One-To-One Servers, and Database can all run on different machines. Figure 1 illustrates a typical large site configuration.

IT Solutions. Guaranteed. www.eforceglobal.com


Sticky Load Balancing
Sticky load balancing involves running One-To-One Interaction Manager servers on machines separate from those hosting the HTTP servers. Each Interaction Manager machine can run multiple instances, or engines. A visitor session stays on the engine that created it: all subsequent requests during the same session are routed back to that engine. The advantage is that data cached in the engine process can be reused, avoiding expensive data retrieval from the backends.

Service Level Process Grouping
Service Level Process Grouping is a method of load balancing in which service-level processes are optionally grouped into clusters using the group construct. This lets programmers identify which processes to use with which services.
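
The sticky routing rule (a session stays on the engine that created it) can be sketched in a few lines of plain JavaScript. This is only an illustration under stated assumptions: StickyRouter, the engine names, and the round-robin choice for new sessions are hypothetical and are not part of the BroadVision product.

```javascript
// Hypothetical sketch of sticky routing; not BroadVision code.
class StickyRouter {
  constructor(engines) {
    this.engines = engines;            // list of available engine names
    this.sessionToEngine = new Map();  // session ID -> engine that created the session
    this.next = 0;                     // round-robin position for new sessions
  }

  route(sessionId) {
    // Existing session: always return the engine that created it.
    if (this.sessionToEngine.has(sessionId)) {
      return this.sessionToEngine.get(sessionId);
    }
    // New session: pick the next engine in round-robin order and remember it.
    const engine = this.engines[this.next];
    this.next = (this.next + 1) % this.engines.length;
    this.sessionToEngine.set(sessionId, engine);
    return engine;
  }
}

const router = new StickyRouter(["engineA", "engineB", "engineC"]);
const first = router.route("session-1");
router.route("session-2");
const again = router.route("session-1");
console.log(first === again); // true
```

Because every request of a session reaches the same engine, anything that engine has cached for the session is still in its process memory on the next request.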

Caching
Caching is one of the most successful techniques for improving system performance. Instead of making expensive calls to the database, the system keeps data items in memory. Figure 2 demonstrates the caching process. One-To-One caches data in five ways: Collection Cache, Queries Cache, Content Cache, Page Request Cache, and Category Cache.

Collection Cache
After evaluating rules in a Matching collection, the Matching system caches the parsed results. You can configure the number of rules kept in the cache by setting the following parameter in bv1to1.conf:
rule_cache_size=500

The following table summarizes the database accessor processes and their functions:

Process                          Function
Content Accessor (cntdb)         Access One-to-One Content Database
Profile Accessor (cmsdb)         Access One-to-One Profile Database
Generic Accessor (genericdb)     Handle Content Query Requests
External Accessor (extdbacc)     Access External Database

Queries Cache
The query cache stores the result of a query. You can change the following parameters in bv1to1.conf:
query_cache_size=500
query_limit=2000
query_cache_timeout=60

Figure 1. BroadVision One-To-One Architecture. The figure shows browsers connecting through the web over HTTP/HTTPS to two HTTP servers on two hosts; TCP/IP connections through a firewall to four Interaction Manager engines on three hosts; and CORBA connections to One-To-One servers on two hosts, two of which are database accessors. The One-To-One Command Center is also shown.



Generic database access caches are configured with:
gdb_query_cache_size=10000
gdb_query_limit=1000
gdb_query_cache_limit=60

Although caching improves system performance by avoiding expensive backend calls, it can leave the cache (in memory) inconsistent with the persistent storage (database). To force the system to empty or reload a cache, use the cache_utl utility or the DCC's Notify Servers command. For example, to clear the request cache, invoke:
cache_utl -e request_cache
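
The interplay of a cache size, a cache timeout, and an explicit flush can be illustrated with a small in-memory cache. This is a minimal sketch, not the product's implementation: TimedCache and its arguments are hypothetical, and they mirror settings such as query_cache_size and query_cache_timeout only by analogy (the timeout is assumed here to be in seconds).

```javascript
// Hypothetical in-memory cache with a size limit and a per-entry timeout.
class TimedCache {
  constructor(maxEntries, timeoutSeconds, now = () => Date.now()) {
    this.maxEntries = maxEntries;
    this.timeoutMs = timeoutSeconds * 1000;
    this.now = now;            // injectable clock, useful for testing
    this.entries = new Map();  // key -> { value, storedAt }; Map keeps insertion order
  }

  get(key, loadFromBackend) {
    const hit = this.entries.get(key);
    if (hit && this.now() - hit.storedAt < this.timeoutMs) {
      return hit.value;                      // fresh entry: no backend call
    }
    const value = loadFromBackend(key);      // miss or stale entry: go to the backend
    this.entries.delete(key);                // re-insert so the key becomes the newest
    this.entries.set(key, { value, storedAt: this.now() });
    if (this.entries.size > this.maxEntries) {
      // Evict the oldest inserted entry to respect the size limit.
      this.entries.delete(this.entries.keys().next().value);
    }
    return value;
  }

  flush() {
    this.entries.clear();  // empty the cache, for example after a database update
  }
}

let backendCalls = 0;
const cache = new TimedCache(500, 60);
const load = (key) => { backendCalls++; return "result for " + key; };
cache.get("q1", load);
cache.get("q1", load);     // served from memory
cache.flush();
cache.get("q1", load);     // reloaded after the flush
console.log(backendCalls); // 2
```

The flush is what keeps the cache consistent with the database: until it runs (or the timeout expires), readers keep seeing the cached value.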

Guidelines For Using Cache
Caching helps you improve performance by avoiding expensive backend calls. However, when caching you must take care to protect data integrity and synchronization. Use these guidelines to get the most from caching:
Data synchronization: when data is updated in the database, flush the cache to ensure data consistency.
Cache size: caching increases performance but uses up disk space (virtual memory) and resident memory.
Avoid caching frequently changed items, such as stock quotes.
Avoid caching unused data, such as unused content types or offline data not visible to customers.
Do not cache page requests that require backend access and processing, such as a login script.

Content Cache
The content cache lets you specify the number of content items cached across all content types:
default_cnt_cache_size=1000

You can achieve finer control by overriding the default content cache and specifying the number of items to hold for each content type:
cnt_type_cache_type=AD=100 PRODUCT=200

This is useful when you do not want to waste cache space on content types that are infrequently accessed or frequently changing, such as stock quotes.

Category Cache
The category cache allows certain categories, or semantically defined units, of items to be cached. Two parameters control it:
cat_cache_size: specifies the number of categories to retain.
cnt_type_cache_size: because a category is another content type, it can also be controlled by setting this parameter.

Page Request Cache
Most pages are dynamically generated based on the visitor profile. In many cases, however, the page contains the same results for multiple visitors, so you can significantly improve performance by caching page requests. The request cache is configured in a file. By default, the file is named bvsm.req and is located in the same directory as the Interaction Manager's configuration file. The file contains two sections:

Cache Settings
Cache settings define the behavior and size of the caches. You can specify the timeout, maximum request size, maximum number of generated requests, and new cache name.

Cache List
The cache list defines which requests qualify to be placed in the request cache. You can specify:
type: defines the type of item to cache. It is always J, for JavaScript.
request path: sets the path of the JavaScript.
cache rule: specifies the conditions under which the generated result will be cached.
fixup path: allows you to make small modifications to the cached request, such as adding the current time to the page.
request size: sets the maximum request size to be cached.
timeout: times out a cached request.
cache name: identifies which request cache will hold the result.

Figure 2. The Caching Process (components shown: Interaction Manager; One-To-One servers, which cache content and collection data; database accessor, which builds an object model that represents the relational database system; One-To-One database)
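
The request cache settings described under Page Request Cache (cache rule, maximum request size, timeout, and fixup) can be pictured with the following plain JavaScript sketch. It does not reproduce the bvsm.req syntax, which is not shown here; RequestCache, its option names, and the {{time}} placeholder are all hypothetical.

```javascript
// Hypothetical request cache; the option names are illustrative only.
class RequestCache {
  constructor({ timeoutSeconds, maxRequestSize, cacheRule, fixup }, now = () => Date.now()) {
    this.timeoutMs = timeoutSeconds * 1000;
    this.maxRequestSize = maxRequestSize; // largest page (in characters) worth caching
    this.cacheRule = cacheRule;           // decides whether a request may be cached
    this.fixup = fixup;                   // small per-response modification
    this.now = now;                       // injectable clock
    this.pages = new Map();               // request path -> { html, storedAt }
  }

  serve(path, generatePage) {
    const cached = this.pages.get(path);
    let html;
    if (cached && this.now() - cached.storedAt < this.timeoutMs) {
      html = cached.html;                 // reuse the previously generated page
    } else {
      html = generatePage(path);          // expensive dynamic generation
      if (this.cacheRule(path) && html.length <= this.maxRequestSize) {
        this.pages.set(path, { html, storedAt: this.now() });
      }
    }
    return this.fixup(html);              // applied to cached and fresh pages alike
  }
}

let generated = 0;
const requestCache = new RequestCache({
  timeoutSeconds: 300,
  maxRequestSize: 10000,
  cacheRule: (path) => !path.includes("login"),
  fixup: (html) => html.replace("{{time}}", new Date().toISOString()),
});
const render = (path) => { generated++; return "<p>" + path + " at {{time}}</p>"; };
requestCache.serve("/catalog.jsp", render);
requestCache.serve("/catalog.jsp", render); // served from the cache
requestCache.serve("/login.jsp", render);
requestCache.serve("/login.jsp", render);   // regenerated: the rule excludes it
console.log(generated); // 3
```

The fixup step runs after the cache lookup, which is why a cached page can still show the current time.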



Performance Tuning
Observation Logs Tuning
To perform customer profiling, we must record events (or observations) initiated by visitor actions. One-To-One provides an observation logging mechanism that logs raw data to a file; the data can later be aggregated and uploaded into the observation database by utilities such as obs_dbload and obs_aggr. Because any I/O operation may reduce the throughput of the system, you can turn observation logging on or off:
observation_flag=1

To improve performance, the Interaction Manager (IM) collects observation information and caches the data in a large buffer. It writes the data to the file system whenever the buffer is full or after a specified period of time. You can configure the time period in bv1to1.conf:
observation_flush_time=5

Access Control Tuning
One-To-One provides flexible access control for each request and determines the scripts, page templates, and dynamic objects that visitors may run. Because access checking is one of the most frequent operations, you can improve performance by simplifying your permission rules and reducing the number of entries in your permission lists.

NSAPI versus CGI Performance Tuning
In a production system, use the Connection Module (written in NSAPI) to connect the HTTP Server to the Interaction Manager. This method is preferred to CGI, which has a large overhead because it launches a separate process for each request.

Database Server Tuning
To improve database throughput and concurrency, you can increase the number of database accessor process instances. To take advantage of multiple database accessors you must have at least as many Interaction Manager engines, because each Interaction Manager engine uses just one database accessor. You need more database accessors if:
You haven't reached your throughput goal.
The Interaction Manager machines still have plenty of idle CPU time.
The cmsdb (visitor profiles) or cntdb (content) process consumes a lot of CPU time.
You can also change the database cache parameters to increase the throughput of the system, or you can place the database accessors on another machine.

JSP Scripting Techniques
JavaScript is at the core of generating dynamic pages. This section outlines four techniques for improving system performance by optimizing the way the JavaScript is written.

Avoid Crossing the JavaScript/C++ Boundary
Calls that cross the JavaScript/C++ boundary are expensive: they incur the overhead of a component reference lookup in a hash table, data type conversion, and string copying. For optimal system performance, make such calls as infrequently as possible. A script crosses the JavaScript/C++ boundary for every component instantiation, property access, and method invocation. One technique that helps you avoid crossing the boundary is to assign all output to a JavaScript variable and invoke Response.write() once to display it. Another is to avoid object creation by setting the cursor on the list rather than calling get(). For example:
var len = contentList.length;
for (var i = 0; i < len; i++) {
    // content = contentList.get(i);   // avoid this: it creates an object on every pass
    contentList.cursor = i;             // do this instead
}

Another way to optimize JavaScript performance is to store commonly used components in Session and Process objects. In this way you can avoid re-instantiating the components later.

Use JSPOPT
The jspopt utility optimizes JavaScript by stripping out JavaScript and HTML comments, cleaning up white space, and removing uses of the <%= syntax. Do not modify the output file that jspopt produces. The utility is an "as is" product and is not supported by BroadVision.

Use Fastconcat
Fastconcat is a C++ utility that provides fast string concatenation. Its usage is similar to concat().
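
Two of the ideas in this section, building the page in a JavaScript variable before writing it once and concatenating strings efficiently, can be illustrated in plain JavaScript. The sketch does not use BroadVision's Response object or Fastconcat, whose exact interfaces are not shown here; makeWriter is a hypothetical stand-in that simply counts how often the expensive output call is made.

```javascript
// Hypothetical stand-in for an expensive output call; it only counts invocations.
function makeWriter() {
  return {
    calls: 0,
    output: "",
    write(text) { this.calls++; this.output += text; },
  };
}

// One write per row: in a real script each call would cross the boundary.
function renderRowByRow(items, writer) {
  for (const item of items) {
    writer.write("<li>" + item + "</li>");
  }
}

// Collect the fragments in JavaScript, join them once, then write once.
function renderBuffered(items, writer) {
  const parts = [];
  for (const item of items) {
    parts.push("<li>" + item + "</li>");
  }
  writer.write(parts.join(""));
}

const items = ["shirt", "hat", "scarf"];
const slow = makeWriter();
const fast = makeWriter();
renderRowByRow(items, slow);
renderBuffered(items, fast);
console.log(slow.calls, fast.calls);      // 3 1
console.log(slow.output === fast.output); // true
```

Both versions produce the same page; the buffered one makes a single output call regardless of the number of rows.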

Monitor Script Performance


We can measure the processing time of a script by enabling message set 14 (Performance Measurements) in .bvlog.conf and then adding BVI_Log at the beginning and end of the script.
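
The idea of logging at the beginning and end of a script can be sketched as a small timing wrapper. Since the signature of BVI_Log is not given here, the sketch uses a hypothetical log callback and the standard Date.now() clock instead; timed and its parameters are assumptions made for illustration.

```javascript
// Hypothetical timing wrapper: reports elapsed milliseconds for a block of script work.
function timed(label, work, log = console.log, now = () => Date.now()) {
  const start = now();               // taken at the beginning of the script
  try {
    return work();
  } finally {
    const elapsed = now() - start;   // taken at the end, even if work() throws
    log(label + " took " + elapsed + " ms");
  }
}

const total = timed("sum loop", () => {
  let sum = 0;
  for (let i = 1; i <= 1000; i++) sum += i;
  return sum;
});
console.log(total); // 500500
```

Wrapping individual sections of a page this way helps locate which part of a script dominates the processing time.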


Server Process Tuning
Three tips for server process tuning:
1. As previously discussed, One-To-One software can run on multiple machines. If a machine is close to 100% CPU utilization, consider running the HTTP servers, Interaction Manager servers, and One-To-One servers on different machines. If the backend One-To-One servers use close to 100% of the CPU cycles, move some of them onto another machine. You can also improve system performance by running multiple instances of a server process, particularly the database accessors.
2. If CPU utilization is low but response time is slow, the network is the probable cause. Try upgrading the external network bandwidth and increasing the internal network capacity.
3. If response time is slow and the network is fine, the probable cause is long blocking operations in your application. You can execute long blocking operations in parallel mode by writing components that allow threads to run in parallel. Such components let other request threads proceed while they perform time-intensive processing. Components do this by calling the parallel mode API.

System Monitoring
High availability, or 7 x 24, systems must be constantly monitored. One-To-One provides several tools for monitoring the system and gathering statistics:
bvconf monitor: reports the usage statistics of One-To-One servers and processes. A common usage is plain bvconf monitor, which shows process, host, virtual memory size, resident memory size, CPU usage, number of lightweight processes, and so on.
bvconf monitor -m BV_DB_STAT -a: monitors the database queries of an accessor.
bvconf monitor -p bvmgr -m BV_CACHE_STAT: monitors the request cache.
watcher: allows you to monitor the state of the Interaction Managers.
bvping: verifies that One-To-One servers are up and running and, optionally, can launch a server that is not already running.
bvconf ps: similar to Unix ps, it displays information about the One-To-One servers.

You can also monitor operating-system-level information. Unix has several useful tools for monitoring systems:
top: displays the top 15 processes on the system and periodically updates this information. Raw CPU percentage is used to rank the processes. It also displays other information such as virtual memory size and resident memory size.
mpstat: reports per-processor statistics in tabular form, including percent user time, percent system time, percent wait time, and percent idle time.
ps: reports the process status, process ID, virtual memory size, resident memory size, and number of lightweight processes.
swap: provides a method of adding, deleting, and monitoring the system swap areas used by the Memory Manager.

Performance Analysis and Stress Testing
To build a successful eCommerce site, you must determine the system workload that drives capacity planning for the hardware and the network. Here are some important questions you must answer:
What are the peak, mean, standard deviation, and frequency distribution of the request arrival rate?
What is the total number of customers?
What is the total number of active customers?
What is the number of sessions per day?
What is the projected customer base growth rate?
What are the peak and average numbers of concurrent sessions?
What is the mix of transactions or requests per session?
What is the accepted response time for different transactions?
What is the maintenance outage window per week? Two hours? Four hours?

Of these, the arrival rate is the most important. Although a requirement may specify the number of concurrent users, this number does not represent the actual number of concurrent requests. Even though the system may have 500 concurrent users, it is unlikely that all users are submitting page requests at the same time. Statistical models such as the Poisson distribution can help determine the probability of a given arrival rate.

Stress testing is extremely important to a highly available, high-performance eCommerce site. To perform stress testing, we use a test driver to simulate a large volume of traffic. We also design and induce data conditions that support different test cases and concurrent logons. This is most often done with the assistance of a software testing package.


The following table lists popular performance testing software packages:

Product                  Company                   Web Site
SilkPerformer            Segue Inc.                www.segue.com
LoadRunner               Mercury Interactive       www.merc-int.com
ETest, Eload             RSW Inc.                  www.rswsoftware.com
WebLoad                  Radview Software Inc.     www.webload.com
Web Performance Center   Web Performance Center    www.webperfcenter.com
WebStone                 Silicon Graphics          www.sgi.com
WorkBench                SES Inc.                  www.ses.com
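
The distinction between concurrent users and concurrent requests, raised in the performance analysis discussion, can be made concrete with the Poisson model mentioned there. The sketch below assumes, purely for illustration, that requests arrive independently at a mean rate of lambda per second; the workload figures (500 users, one request every 30 seconds each) are made-up inputs, not measurements.

```javascript
// P(X = k) for a Poisson distribution with mean lambda, computed iteratively
// to avoid large factorials.
function poissonPmf(k, lambda) {
  let p = Math.exp(-lambda);
  for (let i = 1; i <= k; i++) {
    p *= lambda / i;
  }
  return p;
}

// P(X > k): probability that more than k requests arrive in one interval.
function probMoreThan(k, lambda) {
  let cumulative = 0;
  for (let i = 0; i <= k; i++) {
    cumulative += poissonPmf(i, lambda);
  }
  return 1 - cumulative;
}

// Hypothetical workload: 500 active users, each sending one request every 30 seconds.
const users = 500;
const secondsBetweenRequests = 30;
const lambda = users / secondsBetweenRequests; // mean requests per second

console.log(lambda.toFixed(1));                   // 16.7
console.log(probMoreThan(30, lambda).toFixed(4)); // chance of more than 30 requests in one second
```

Under these assumptions the system has to be sized for a request rate in the tens per second, not for 500 simultaneous requests, and the tail probability indicates how much headroom above the mean is worth planning for.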

Summary
Attaining optimal 7 x 24 system performance requires a highly available, high-performance, scalable one-to-one system. In this document we outlined important points, in the context of BroadVision's One-To-One application, that will help you optimize web site performance and meet your strategic goals. In summary:
Plan your system carefully and conduct thorough stress and functional tests before deploying the system in production.
Deploy the right number of BroadVision server components in the right configuration.
Configure and, more importantly, maintain your various BroadVision caches appropriately.
Follow best coding practices when developing JSPs and BroadVision components. Rather than trying to embed all your business logic in JSPs, make use of C++ components and their multi-threading capabilities.
Know your BroadVision APIs: using the right API calls at the right time will substantially improve your system performance.
Remember that no amount of BroadVision performance tuning will yield the desired results unless you have carefully tuned your network, hardware, operating system, input/output, third-party applications, and, most importantly, your database.

About eFORCE
eFORCE specializes in the Design, Development, Migration/Modernization, Maintenance and Support of IT products and solutions in the areas of Enterprise Portals, Digital Information Management, Customer Relationship Management, Enterprise Application Integration, Business Intelligence and Enterprise Infrastructure. Combining expertise in business architecture, technical architecture, design, deployment and maintenance, eFORCE delivers production-scale products/solutions that result in measurable ROI. eFORCE customers include leading Global 1000 end-users such as Alcatel, AT&T, Avaya, Baker Hughes, Bank of America, DHL, Fleet Bank, France Telecom, GE Capital, Hilton, HP, Janssen, Janus, Mazda, Mitsubishi, Novartis, Viacom, and Visa; and innovative Software Product companies such as Annexient, BEA Systems, Checkfree, GMS360, eTeam, Infonet, Matrics, MatrixOne, Netbrowser, Reuters and WorldGroup. eFORCE delivers solutions based on best-in-class enabling technologies such as ATG, BEA Systems, BroadVision, E.piphany, HP, IBM, Interwoven, Mercury, Netegrity, Savvion, Siebel Systems, Sun Microsystems, Microsoft, Oracle, Stellent, TIBCO, Verity and webMethods. eFORCE (www.eforceglobal.com) is headquartered in Silicon Valley, has Development Centers in North America, Europe and India, and, through its Global Delivery Model(TM), provides both onshore and offshore design and development as well as full lifecycle deployment, maintenance and support.


Contact eFORCE
510.265.5800 sales@eforceglobal.com
