
Web Sites Performance Measurement

Deng Zheng Zhong
MSc Student
Royal Institute of Technology
zdeng@kth.se

Ehiwere Mathew
Graduate Member IEEE
Royal Institute of Technology
ehiwere@kth.se

Akunna Udochukwu
Graduate Member IEEE
Royal Institute of Technology
akunna@kth.se

ABSTRACT
This paper presents a simple performance evaluation of different websites, measured under uniform conditions: the same local machine and the same network connectivity. The many websites available today provide services ranging from information, education, business, and social interaction to entertainment and religion, and they present varying access experiences to different users. In general, the architecture of a website, the server load and hardware characteristics, the backbone network characteristics, user-end parameters, and many other non-generic factors contribute to the overall performance of web services. We present key user-experience metrics, such as average response time, HTTP errors, average download time, and hits per second, for selected websites subjected to the same test conditions.

Categories and Subject Descriptors


H.3.4 [Systems and Software]: Performance evaluation (efficiency and effectiveness)

General Terms
Measurement, Performance, Human Factors

Keywords
HTTP, Web page, Web services

1. INTRODUCTION
A web service is a core technology for sharing information resources and integrating processes in companies, institutions, and organizations [2]. Web services leverage the global accessibility of the World Wide Web (WWW), a repository of information spread all over the world and linked together [1]. As the number of applications and the amount of content connected by web services increase, the number of users also increases, and with it the concern over performance. Web service quality factors can be classified as interoperability, manageability, security, business processing capability, business value, and measurable quality [2]. In this work our main concern is measurable quality, which is the factor that determines the visible user experience. The performance characteristics of websites and intranet applications directly affect the quality of service, staff productivity, the volume of online sales, and the general utility of a website. Web technology has now advanced to what is generally known as Web 2.0, with an increased volume of user-generated content (UGC) as seen in social networks and the blogosphere, for example wikis, YouTube, Facebook, and MySpace.

It is in the interest of the user that a requested web page responds within the expected time and with the expected result. If client-side performance issues are ignored, the architecture and composition of a website largely determine how its pages respond when requested by the client's web browser. A website may consist of static, dynamic, or interactive text, images, or programs that run on the server or are downloaded to run on the client, using Java, XML, HTML, or AJAX (Asynchronous JavaScript and XML). The mix of dynamic and static content and images affects the response time of a server as seen from the user's perspective. A unit of hypertext or hypermedia available on the web is called a page. Pages are accessed by the client's browser using HTTP, FTP, Gopher, or TELNET [1]; HTTP over TCP is the most common. Each page requires a TCP connection to be established between the client and the server (see Figure 1), so several TCP connections are required to download the home page of a website.

In this work, the following parameters are measured for different categories of websites: average response time, pages per second, HTTP errors, hits per second, received rate (kb/s), sent rate (kb/s), received traffic per user, and sent traffic per user. The websites chosen for our test span education (www.kth.se), social networking (www.facebook.com), business (www.amazon.com), corporate (www.cisco.com), technical (www.ietf.org), and entertainment (www.msnbc.com). The idea behind this broad choice is to cover all sorts of websites, from those characterized by simple plain text to those that run interactive and real-time multimedia, such as cisco.com and msnbc.com. The measurement tools used are pathload2 and a free online bandwidth test (www.speedtest.net), which determine the available capacity of the local connection, and WAPT (Web Application Test), which is used for the actual measurement.
Section 2 presents the measurement procedure and the test results. Section 3 discusses why website performance should be measured and presents the factors that determine the performance of a website. We conclude the paper with recommendations for improving website performance.

2. MEASUREMENT PROCEDURE AND TEST RESULTS


As mentioned in Section 1, the websites were chosen across a broad spectrum of organizations: www.kth.se, www.ietf.org, www.cisco.com, www.msnbc.com, and www.amazon.com. The real-world load from active users, which varies between websites, was ignored since we have no control over it. A test tool called WAPT (Web Application Test) was employed for the test runs. WAPT is a load, stress, and performance testing tool for websites and intranet applications with a web interface. HTTP/1.1 and SSL3 were enabled for all test scenarios. The local machine used for the test was a dual-core 2.8 GHz computer with 4 GB of RAM running Windows XP. The local LAN interface was a 10 Gigabit Ethernet card. Pathload2 and speedtest.net were employed to determine the average available throughput during the test period, which was an average of 300 Mb/s download. Factors related to the local ISP were ignored, since all test scenarios were subject to the same conditions in the university laboratory, over the holiday period when internal load was at a minimum.

Virtual users were emulated in the test. For each scenario, two profiles were created. The first used a fixed number of users (emulating 20 host computers), each with a think time (the time a user takes to move to a link or another page, as in the real world) of 1 to 3 seconds. The second used a ramp increase, starting with 1 user, with further users joining at 10-second intervals. A total of 20 virtual users were emulated in each test, which ran for 4 minutes. These scenarios replicate how web servers experience simultaneous load or ramp-up load in the real world. Before the test was run, the distance from the client machine to the closest server of each website was determined with a trace route, as shown in Table 1. The reason for the trace route is that more distant servers are expected to exhibit slightly more delay in their responses.
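The two load profiles can be sketched as functions of elapsed test time. The sketch below is an illustration only and is not how WAPT is implemented: the function names are hypothetical, the ramp is assumed to add exactly one user every 10 seconds and to hold at 20 users once reached, and the think time is assumed to be drawn uniformly from the 1 to 3 second range.

```python
import random

TOTAL_USERS = 20        # virtual users per test (from the text)
TEST_DURATION_S = 240   # 4-minute test run (from the text)
RAMP_INTERVAL_S = 10    # assumption: exactly one more user joins every 10 s


def fixed_profile_users(t: float) -> int:
    """Active users at time t for the fixed profile: all 20 from the start."""
    return TOTAL_USERS if 0 <= t <= TEST_DURATION_S else 0


def ramp_profile_users(t: float) -> int:
    """Active users at time t for the ramp profile: 1 user at t=0, +1 every 10 s."""
    if not 0 <= t <= TEST_DURATION_S:
        return 0
    return min(TOTAL_USERS, 1 + int(t // RAMP_INTERVAL_S))


def think_time(rng: random.Random) -> float:
    """Pause between page requests; assumed uniform between 1 and 3 seconds."""
    return rng.uniform(1.0, 3.0)
```

Under these assumptions the ramp profile reaches all 20 users at 190 seconds, so only the last 50 seconds of the 4-minute run are at full load, whereas the fixed profile is at full load throughout.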

Table 1. Number of hops to each server, determined by tracert in the Windows shell

| Web server | No. of hops |
| --- | --- |
| www.kth.se | 4 |
| www.ietf.org | 15 |
| www.cisco.com | 17 |
| www.msnbc.com | > 19 (times out) |
| www.amazon.com | 8 |

2.1 Explanation of Measurement Parameters


WAPT, a standard web analysis package, provides a wealth of information about a site's performance. In this subsection we briefly explain the statistical parameters used, so that the results are meaningful when the generated report is analysed.

Session: the number of unique accesses of a site by a particular user identifier, that is, one browser visiting the site one time. It equals the average number of unique visits times the number of unique visitors, and may be a better indicator of site popularity than page views for a site whose visitors, on average, do not visit more than a few times during the measured period.

HTTP error percentage: the percentage of responses with HTTP errors out of the total number of hits. An HTTP error indicates a problem with the operation of the web server.

Pages per second: the average rate at which the current page is executed, over all users who requested that page.

Hit: a single request for a resource (page code, image, script, and so on) sent to the server. Each page may include many hits.

Hits per second: how many hits were executed per unit of time for the current page.

Average and maximum response times: the most important results of load testing. Response time is the time from the first byte of the page request being sent until the last byte of the server response is received; in other words, the time from clicking a link or button in the browser until the page is downloaded. WAPT measures two types of response time: without images and including images. They tell how long a user waits for the server's response to a request. Website owners should ensure that users of a site or application get a response in acceptable time, since these values measure the web user experience.

There are three defined yardsticks for judging the performance of a website by response time: ideal (0.1 s), fairly acceptable (1.0 s), and unacceptable (10 s). A response time greater than 2 s creates user dissatisfaction. It is worth noting that WAPT measures download time without images: the time from the moment a user sees the web page title (in the browser's title bar) until the moment the user can start reading the page, i.e. when the readable content of the web page is displayed on the user's monitor.
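To illustrate how these statistics relate to raw hit records, the following minimal sketch computes hits per second, HTTP error percentage, and average and maximum response time from a list of hits. It is not WAPT's implementation: the `Hit` record, its field names, and the rule that status codes of 400 and above count as HTTP errors are assumptions made for the example, as is the label used for response times that fall between the 1.0 s and 10 s yardsticks.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One request for a resource; the field names are hypothetical."""
    response_time_s: float  # first request byte sent -> last response byte received
    status: int             # HTTP status code of the response


def summarize(hits: list[Hit], duration_s: float) -> dict[str, float]:
    """Compute the per-test statistics described in the text."""
    if not hits or duration_s <= 0:
        raise ValueError("need at least one hit and a positive duration")
    # Assumption: 4xx and 5xx responses are what the tool reports as HTTP errors.
    errors = sum(1 for h in hits if h.status >= 400)
    times = [h.response_time_s for h in hits]
    return {
        "hits_per_second": len(hits) / duration_s,
        "http_error_pct": 100.0 * errors / len(hits),
        "avg_response_s": sum(times) / len(times),
        "max_response_s": max(times),
    }


def rate_response(seconds: float) -> str:
    """Place a response time against the 0.1 s / 1.0 s / 10 s yardsticks."""
    if seconds <= 0.1:
        return "ideal"
    if seconds <= 1.0:
        return "fairly acceptable"
    if seconds < 10.0:
        return "between yardsticks"  # assumption: the text names no band here
    return "unacceptable"
```

For example, `rate_response(0.007)` falls in the ideal band, while a response of 12 seconds is rated unacceptable.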

Web services are accessed from a client computer through a web browser, and different browsers exhibit different response times for a given web server. For this reason, Internet Explorer 7 was used for all scenarios. We ensured that the Java plug-in was installed in the browser and that persistent cookies were disabled, so that the actual GET and response behaviour was measured without any enhancement from persistent cookies. The results obtained in our test are presented below.

Table 2. Home page response time (ramp profile)

| Web site | Max. (seconds) | Min. (seconds) | Mean deviation (seconds) |
| --- | --- | --- | --- |
| Kth.se | 0.01 | 0.004 | 0.007 |
| Ietf.org | 0.41 | 0.34 | 0.375 |
| Cisco.com | 1.04 | 0.74 | 0.89 |
| Msnbc.com | 1.19 | 0.73 | 0.96 |
| Amazon.com | 1.52 | 0.41 | 0.965 |

Table 3. Home page response time (simultaneous users)

| Web site | Max. (seconds) | Min. (seconds) | Mean deviation (seconds) |
| --- | --- | --- | --- |
| Kth.se | 0.009 | 0.005 | 0.007 |
| Ietf.org | 1.25 | 0.34 | 0.795 |
| Cisco.com | 1.23 | 0.98 | 1.105 |
| Msnbc.com | 1.09 | 0.73 | 0.91 |
| Amazon.com | 1.94 | 0.41 | 1.175 |
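A quick arithmetic check on Tables 2 and 3 suggests that the column labelled "Mean deviation" is the midpoint of the maximum and minimum response times, (max + min) / 2, rather than a statistical deviation. This reading is inferred from the numbers alone. The sketch below, with values copied from the two tables and hypothetical function names, verifies it for every row.

```python
import math

# (max, min, reported "mean deviation") in seconds, copied from Tables 2 and 3
TABLE_2_RAMP = {
    "Kth.se": (0.01, 0.004, 0.007),
    "Ietf.org": (0.41, 0.34, 0.375),
    "Cisco.com": (1.04, 0.74, 0.89),
    "Msnbc.com": (1.19, 0.73, 0.96),
    "Amazon.com": (1.52, 0.41, 0.965),
}
TABLE_3_SIMULTANEOUS = {
    "Kth.se": (0.009, 0.005, 0.007),
    "Ietf.org": (1.25, 0.34, 0.795),
    "Cisco.com": (1.23, 0.98, 1.105),
    "Msnbc.com": (1.09, 0.73, 0.91),
    "Amazon.com": (1.94, 0.41, 1.175),
}


def midpoint(maximum: float, minimum: float) -> float:
    """Midpoint of the observed response-time range."""
    return (maximum + minimum) / 2


def column_is_midpoint(table: dict[str, tuple[float, float, float]]) -> bool:
    """True if every reported value equals (max + min) / 2 for its row."""
    return all(
        math.isclose(midpoint(mx, mn), reported, abs_tol=1e-9)
        for mx, mn, reported in table.values()
    )
```

For Amazon.com under the ramp profile, for instance, (1.52 + 0.41) / 2 = 0.965 seconds, which matches the tabulated value.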

Table 4. Pages performed and error levels for the ramp profile

| Web site | Pages performed | Pages with errors |
| --- | --- | --- |
| Kth.se | 495 | 1 |
| Ietf.org | 956 | 4 |
| Cisco.com | 95 | 0 |
| Msnbc.com | 1749 | 2 |
| Amazon.com | 952 | 20 |

Table 5. Pages performed and error levels for simultaneous users

| Web site | Pages performed | Pages with errors |
| --- | --- | --- |
| Kth.se | 887 | 0 |
| Ietf.org | 1226 | 2 |
| Cisco.com | 197 | 0 |
| Msnbc.com | 3200 | 235 |
| Amazon.com | 1212 | 124 |

Several units of hypertext and hypermedia make up the complete displayed page of a website. Each of these units is called a page. For instance, a JPEG or GIF image component makes up a page, and one or several TCP connections are required to open a page. This implies that a completely displayed home page involves several HTTP/1.1 GET-response exchanges. Each request to the server, together with its response, counts as a hit. Dynamic graphics and interactive content on a website increase the number of connections required to open its pages. Figure 1 shows a Wireshark capture of the IETF (www.ietf.org) home page download, showing the TCP connections used for page downloads. The quality of a website affects the number of errors returned from the server to the client. The table below shows the errors obtained from our test results.

Table 6. HTTP and hit errors measured from different web sites

| Web site | % hit errors (ramp) | % hit errors (simultaneous) | HTTP errors, average (ramp) | HTTP errors, average (simultaneous) |
| --- | --- | --- | --- | --- |
| Kth.se | 6.348 | 6.276 | 6.33 | 6.26 |
| Ietf.org | 0.046 | 0.018 | 0.05 | 0 |
| Cisco.com | 0.08 | 0.846 | 7.30 | 0.79 |
| Msnbc.com | 1.277 | 0.522 | 1.26 | 0.51 |
| Amazon.com | 0.082 | 0.238 | 0.07 | 0.23 |

2.2 Overall Performance


Some performance characteristics of the scenarios and profiles, and their variation over the duration of the test, are shown graphically below. More graphs can be found in Appendix 1 to Appendix 10.

Figure 1. Part of the www.ietf.org home page capture using Wireshark, showing the TCP and HTTP connections

Several techniques in Web 1.0 and Web 2.0 design enhance the performance of websites, among them web prefetching [3], the use of AJAX for dynamic and interactive content, and the use of Content Delivery Networks. In general these enhance performance regardless of the nature of the content and services on the website. A typical example is the result obtained from www.msnbc.com. Msnbc.com contains archived videos and live video streams, whereas Amazon.com is characterized mainly by static images and less interactive content. From Tables 1 to 6 it can be observed that, despite being more than 19 hops away compared with 8 hops for amazon.com, msnbc.com exhibited better overall performance than amazon.com (ignoring real-world load on the servers during the test period). Furthermore, www.kth.se had the highest HTTP and hit error rates of all the websites tested.

Figure 2. Hits/s and pages/s performance of www.ietf.org, tested with 24 simultaneous users

Figure 3. Hits/s and pages/s performance of www.ietf.org, tested with a ramp increase from 0 to 20 users

Figure 5. Hits/s and pages/s performance of www.kth.se, tested with 24 simultaneous users

Comparing the graphs for kth.se and ietf.org, kth.se had more hits per second than ietf.org. This can be explained by the trace route result in Table 1: the server www.ietf.org is 15 hops away from the client machine, compared with 4 hops for kth.se, so the overall latency is expected to be higher for ietf.org. However, looking at the pages successfully downloaded per second, fewer pages were downloaded from www.kth.se despite its higher hit rate. This can be explained by the fact that there is more hypermedia content (e.g. images) on the kth.se website. Hence the user tends to perceive faster performance from the ietf.org website than from the kth.se website, despite the proximity of the kth.se server and its measured server response time (see Tables 2 and 3).
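The relationship between hits per second and pages per second in this comparison can be made concrete. Since every page consists of several hits, the page rate is roughly the hit rate divided by the number of hits per page. The figures in the sketch below are hypothetical and chosen only to illustrate the effect; they are not measurements from our test, and the function name is an assumption.

```python
def pages_per_second(hits_per_second: float, hits_per_page: float) -> float:
    """Approximate page rate when each page needs `hits_per_page` requests."""
    if hits_per_page <= 0:
        raise ValueError("hits_per_page must be positive")
    return hits_per_second / hits_per_page


# Hypothetical sites: A serves more hits per second but has image-heavy pages.
site_a = pages_per_second(hits_per_second=60.0, hits_per_page=30.0)
site_b = pages_per_second(hits_per_second=40.0, hits_per_page=8.0)
```

With these illustrative numbers, site A completes fewer pages per second than site B even though it serves more hits per second, which is the pattern described above for kth.se and ietf.org.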

3. WHY MEASURE THE PERFORMANCE OF A WEB SITE


The interdependence between day-to-day activities and web services is growing exponentially. From automated business processes, e-health delivery, information and entertainment, social networking and education, to online trading, e-government, and online shopping, human lives have become tied to web services. These web services are accessed through their respective websites. By definition, web services are software services exposed on the web through SOAP (Simple Object Access Protocol), described in WSDL (Web Services Description Language), and registered in a UDDI (Universal Description, Discovery and Integration) node [7]. The increasing utility and number of users of web-based services create performance challenges. For example, worldwide business activity could be disrupted if online stock trading websites failed to perform optimally. The visions of the creators of YouTube, Facebook, and the like (which are characterized by user-generated images, interactive content, and videos) would be undermined if these websites exhibited performance bottlenecks. This would not only reduce the number of users of the website but would also have a ripple effect on horizontally and vertically integrated businesses; because of the level of dependence, relationship problems and psychological health cases could result all over the world. Measuring the stress and overall performance characteristics of websites enables owners of web services and websites to adjust to unpredictable changes in the volume of access and to changes in the complex internet architecture.

3.1 Factors that Determine the Performance of a Website


Website performance as seen on client computers can be traced to the local machine, the network (local or remote), or the web server. The popularity of a website determines the expected load on its server. The characteristics of the machines hosting the web server and of its backbone network also affect the performance of the server. Most important are the content and services delivered by the web server and the technique used to publish the content. Today's websites are more interactive and contain more user-generated content, videos, and static images than websites from the early days of the World Wide Web. JavaScript, XML, PHP, AJAX, and similar technologies enable the creation and publishing of this interactive content. The mix of dynamic and static content and images affects the response time of a server as seen from the user's perspective. The load on the network of the web server's ISP can also threaten the overall performance of the server. On the client side, the hardware and software of the client machine, browser capability, and the local ISP are among the factors that affect the user experience of a website and its web services.

4. RECOMMENDATIONS FOR IMPROVED WEBSITE PERFORMANCE


Based on the results obtained and on previous work by others [3], we propose some simple techniques to improve website performance. Web prefetching enhances the general performance of a website at the front end. It is a technique aimed at reducing the user's perceived latency by downloading web objects during idle navigation time, before the user asks for them. Several works have addressed web prefetching, such as [5,6]. In [3], a framework called Delfos, which implements web prefetching techniques in real environments, is proposed.

Our second recommendation is the use of a Content Delivery Network (CDN), as seen with cisco.com. Despite the many interactive and dynamic video contents on www.cisco.com, the error level (hit and HTTP) recorded from our lab was zero (0). Its response time also compares favourably, despite its rich content and a distance of 15 hops from our client machine. A CDN is a collection of distributed web servers used to deliver content to users more efficiently [4]. Examples are Akamai, Limelight Networks, and SAVVIS. Yahoomail.com evidently employs the Akamai CDN. The main performance advantage of a CDN is that static resources are delivered from a server that is geographically closer to the end user. Other benefits include backup, caching, and the ability to better absorb traffic spikes [4].

Rich graphics and interactive and dynamic content should be added to a website with performance in mind. This does not imply that websites should look pale, with still and dull images. Improved techniques such as AJAX and JavaScript should be well integrated when dynamic and video content is necessary on a website. AJAX caching is expected to further improve the perceived performance of websites. More information on web performance improvement can be found in [4].

5. CONCLUSIONS
In this paper, we have shown the results of performance tests run on a broad spectrum of websites with varying characteristics: www.kth.se, www.amazon.com, www.ietf.org, www.msnbc.com, and www.cisco.com. We first determined the distance of our local client machine from the various web servers using a trace route. Although latency and response time can be a function of distance, other factors strongly determined the response time, error level, pages/s, hits/s, and overall performance of the websites measured in our test. We have also shown that different websites react differently to simultaneous load and ramp load. For example, there were more HTTP and % hit errors for msnbc.com with ramp load than with 20 simultaneous users. This could be explained by the several successive ICMP multicast joins required to join the group, as seen in our Wireshark capture. The quality of the website architecture also affects the user experience in the client browser. Cisco.com, characterized by rich dynamic graphics and videos, gave better error performance than kth.se, which contains little or no video. Finally, we have identified and recommended simple methods of enhancing the performance of a website.

6. ACKNOWLEDGMENT
The motivation for this work came from the series of lectures on TCP/IP and Internet Services by Mr. Thomas Lindh at the Royal Institute of Technology, Campus Haninge, Stockholm.

7. REFERENCES
[1] Behrouz A.F. and Sophia C.H. 2003. TCP/IP Protocol Suite. McGraw-Hill Forouzan Network Series, 2 (2003), 649-698.

[2] Youngkon L. 2008. Quality Context Taxonomy for Web Service Classification. In Proceedings of the Third International Conference on Convergence and Hybrid Information Technology, 2008.

[3] Ossa B., Gil J.A., and Sahuquillo J. 2007. Web Prefetch Performance Evaluation in a Real Environment. In Proceedings of the 4th International IFIP/ACM Latin American Conference on Networking, 2007. http://portal.acm.org/citation.cfm?id=1384117.138412

[4] Steve S. 2008. High Performance Websites. Communications of the ACM, Vol. 51, No. 12 (Dec. 2008).

[5] Palpanas T. and Mendelzon A. 1999. Web Prefetching Using Partial Match Prediction. In Proceedings of the 4th International

[6] Zukerman I., Albrecht D.W., and Nicholson A.E. 1999. Predicting Users' Requests on the WWW. In Proceedings of the Seventh International Conference on User Modeling, Secaucus, NJ, USA, 1999.

[7] Ana C.C. and Carlos A.G. 2005. Guidelines for Performance Evaluation of Websites. In Proceedings of the 11th Brazilian Symposium on Multimedia and Web Services, 2005.