Impact of Two Levels Switching in a LAN for Cluster Computing
Mamta #1, Charanjit Singh *2
# Research Scholar, Computer Science, RIMT, Mandigobindgarh, India
* Assistant Professor, Computer Science, RIMT, Mandigobindgarh, India
Abstract
Cluster computing is widely used in the world of computing for compute-intensive applications. In this approach the compute nodes are connected with Fast/Gigabit Ethernet in a local area network. The activities of the nodes are orchestrated by "clustering middleware", a software layer that sits atop the nodes and allows the users to treat the cluster as, by and large, one cohesive computing unit.
Keywords: Bandwidth measurement, latency, network performance, parallel computing.

I. INTRODUCTION

The use of interconnected PCs as a single logical computational resource has become a widespread, cost-effective approach to compute-intensive problems. Though dedicated high-end supercomputers still have their place in the market, the combined unused CPU cycles of desktop PCs available in a campus network can form comparable virtual supercomputers. Consequently, parallel processing in a network of PCs has attracted a surge of attention and is becoming one of the most promising areas of large-scale scientific computing.
A cluster is made up of identical computers connected by a fast network, and building an efficient cluster means studying the issues related to performance measures. In a collection of computer nodes interconnected by a LAN and/or a high-speed switching network, all nodes can be used individually or collectively as a cluster. All computers in the cluster are glued together with middleware support for collective usage as a single computing resource, in addition to the traditional usage as individual computers.
A cluster is a type of parallel or distributed processing system which consists of a collection of interconnected stand-alone computers cooperatively working together as a single integrated computing resource, and it provides scalability by adding servers to it or by adding more clusters to the network as the need arises. It offers high system availability and reliability due to the redundancy of hardware, operating systems and applications, and it provides data support and high-performance massive storage by running cluster-enabled programs. Cluster architecture is needed for different tasks and adopts the following system:
End systems: A single computer can be described as the ideal cluster [1], or not a cluster at all. All components in a computer have been standardized and are well integrated, and a single operating system has absolute control.
Clusters: A cluster is made of identical computers connected by a fast network, and the cluster and its computers are centrally controlled. As there are many computers in a cluster, algorithms are needed to propagate [1] and execute control signals and to manage resources. Integration is reduced compared to the end system.
Intranets: An intranet cluster is made of computers under a centralized administration, but its parts are heterogeneous and are not fully controlled by that central administration. An intranet cluster [1] is more focused on maintaining and administrating itself than on performing high-performance parallel computing.
Internet clusters: There are many difficulties to overcome in an internet cluster. There is no central administration, and the cluster is spread out all over the world. The distances limit the bandwidth [1], so latency will be high. The scale is both larger and more unpredictable than in an intranet. Any attempt to limit the scale of an internet cluster to an absolute number will either produce a virtual intranet or expose the cluster to availability problems.

Cluster computing is potentially able to deliver high performance at an unbeatable price/performance ratio, thus providing a low-cost alternative to both shared-memory multiprocessors and distributed-memory Massively Parallel Processors.

II. RELATED WORK

Gupta et al. studied the effect of TCP socket size for local nodes and non-local nodes in a grid-enabled PC cluster for parallel computing [2]. That work shows that the obvious answer is to use a level-1 Ethernet network with a socket size of 64 KB, along with the TCP window size parameter and the additional GlobalWindowSize parameter. Desk one performed better in a peer-to-peer topology, but when the client moves across the switches, Desk one is not suitable, as the bandwidth chokes.

International Journal of Engineering Trends and Technology (IJETT), volume 5, number 3, Nov 2013
In [3, 4] the authors proposed a scheme in which multiple programs or tasks are carried out simultaneously, which makes programs or tasks run faster because there are more CPUs or cores running them without interfering with each other.
This approach relieves the user from worrying about bottlenecks. But since the demand for high-performance computing is always a hunt for high-speed, low-latency interconnection networks, we analyse whether it could be better to modify Level 1, and whether performance can be enhanced by using Level 2, which has high-speed, low-latency interconnections.

III. TECHNIQUES

Parallel Performance Metrics
The execution time of a parallel algorithm depends on the input size, the number of processing elements and their computation speeds. A number of speed-up laws have been proposed, such as Amdahl's law (1967) [4], based on a fixed workload or problem size; Gustafson's law (1988) [4], based on scaled problem size, where the problem size increases with the increase in machine size; and the Sun and Ni (1993) law [4] for scaled problems bounded by memory availability. In comparison to a serial algorithm, a parallel algorithm spends more time in inter-processor interaction and in idling due to load imbalance. The presence of serial components in a program sometimes generates the need to carry out excess computation, due to the absence of reusability of results by different processing elements. The various measures that have been used to evaluate a scheduling policy based on the outcome of performance analysis are execution time, speed-up, efficiency and cost; a trade-off among these performance measures is required to achieve optimal output. Execution time is the time interval between the beginning of the parallel computation and the time at which the last processing element finishes execution. During the execution, balancing of the workload is carried out by identifying concurrency among tasks to reduce idling time and communication time. Speed-up is another measure, defined as the ratio of the time taken to execute a problem on a single processor to the time required to solve the same problem on a parallel computer with p identical processing elements. It is generally denoted as S(p) and expressed as
S(p) = Execution time using a single processor / Execution time using a multiprocessor with p processors = t(s) / t(p)
where t(s) is the execution time on a single processor and t(p) is the execution time on the multiprocessor. In practice the whole program cannot be executed in parallel, as there are some operations in a computation that must be performed sequentially.
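The ratio above can be computed directly from two measured execution times. A minimal sketch follows; the function name and timing values are illustrative assumptions, not measurements from this paper.

```python
def speedup(t_serial, t_parallel):
    """S(p) = t(s) / t(p): single-processor time over p-processor time."""
    return t_serial / t_parallel

# Illustrative values only: 120 s serially, 40 s on a 4-processor cluster.
print(speedup(120.0, 40.0))  # 3.0
```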
Amdahl's law
In 1967, Amdahl presented Amdahl's law, which attempts to calculate the speed-up based on the nature of the algorithm. This law suggests the need to identify the fraction of operations in the algorithm that must be executed sequentially, denoted as f; the remaining fraction (1 - f) of the algorithm can be executed in parallel. Let p be the number of available processing elements, and let t(s) and t(p) be the execution times on a single processing element and on p processing elements respectively. Assuming no overhead occurs, the time required to perform the computation with p processing elements as per Amdahl's law is
t(p) = f * t(s) + ( 1 - f ) * t(s) / p
and the speed-up is

S(p) = t(s) / t(p) = 1 / ( f + ( 1 - f ) / p )
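This formula makes it easy to see how quickly the speed-up saturates as p grows. A minimal sketch, with a function name and serial fractions of our own choosing:

```python
def amdahl_speedup(f, p):
    """Amdahl's law: S(p) = 1 / (f + (1 - f) / p),
    where f is the serial fraction and p the number of processing elements."""
    return 1.0 / (f + (1.0 - f) / p)

# With a 5% serial fraction, even 1000 processors give a speed-up below 20.
for p in (4, 16, 1000):
    print(p, round(amdahl_speedup(0.05, p), 2))
```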
The formulation of speed-up using Amdahl's law is based on a fixed problem size, and the speed-up drops very rapidly with an increase in the sequential fraction. So a fixed load acts as a deterrent to achieving scalability in performance.

John Gustafson's law
John Gustafson (1988) proposed a fixed-time law [4] to scale the speed-up model and remove the fixed-load restriction. This law states that the problem size scales with the number of processors, and the idea is to keep all processors busy by increasing the problem size. The sequential fraction is no longer a problem if the problem size can be scaled to match the available computing power.
As per Gustafson's law, the scaled speed-up is

S(n) = ( s + ( p * n ) ) / ( s + p )
     = s + ( p * n )          since s + p = 1
     = n + ( 1 - n ) * s
where s is the time for executing the serial part of the computation, p is the time for executing the parallel part of the computation on the multiprocessor, and s + p is the total time, equal to 1. The expression ( s + ( p * n ) ) denotes the execution time on a single computer, as the n parallel parts are executed sequentially. So the speed-up according to Gustafson's law is much higher than the bound Amdahl's law predicts for the same serial fraction. Another law, by Sun and Ni (1993), suggested memory-bounded speed-up [4] to optimize the use of both processor and memory. In memory-constrained scaling, the problem is made to fit in the available memory. With the increase in the number of processors, the memory also grows and enables the system to solve a scaled problem through program and data decomposition.
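The contrast with Amdahl's fixed-load prediction can be sketched as follows; the function name and the 5% serial fraction are our own illustrative choices.

```python
def gustafson_speedup(s, n):
    """Gustafson's law: S(n) = n + (1 - n) * s,
    where s is the serial-time fraction (with s + p = 1) and n the processor count."""
    return n + (1 - n) * s

# For a 5% serial fraction on 1000 processors, the scaled speed-up stays
# close to n, far above Amdahl's fixed-size bound of roughly 20.
print(gustafson_speedup(0.05, 1000))
```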
Efficiency is another measure: the fraction of the processors' time spent in useful computation, defined as the ratio of the speed-up to the number of processing elements. Cost is the next measure, defined as the product of the parallel execution time and the number of processing elements used. It is expressed as

Cost = Execution time * total number of processing elements used

The cost of solving a problem on a single processor is simply its execution time t(s), and the cost of solving a problem on a parallel processor is t(p) * n, where the parallel execution time is t(p) = t(s) / S(n). Utilization is the measure for evaluating resource utilization and may be defined as the ratio of the total usage time over the total available time of the resources (processors, memory). So it measures the percentage of time the resources are kept busy during the execution of the parallel algorithm. Utilization is denoted as U and expressed as
U = O(n) / ( n * T(n) )

where O(n) is the actual number of operations performed by the n processors and n * T(n) is the total number of operations that could be performed by n processors in T(n) time.
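These three measures can be tied together in a short sketch. All names and numbers below are invented for illustration, including the assumed per-processor operation rate used in the utilization example.

```python
def efficiency(s_p, p):
    """E = S(p) / p: fraction of processor time spent in useful computation."""
    return s_p / p

def cost(t_parallel, p):
    """Cost = parallel execution time * number of processing elements."""
    return t_parallel * p

def utilization(ops_done, n, ops_capacity_per_proc):
    """U = O(n) / (n * T(n)): operations actually performed over the
    maximum the n processors could perform in the same interval."""
    return ops_done / (n * ops_capacity_per_proc)

# Illustrative run: t(s) = 100 s, t(p) = 30 s on 4 processors, each assumed
# able to retire 1e9 operations per second (so 30e9 ops capacity per processor).
s_p = 100.0 / 30.0
print(efficiency(s_p, 4))                 # ~0.83
print(cost(30.0, 4))                      # 120.0 processor-seconds (serial cost: 100.0)
print(utilization(1.0e11, 4, 30.0e9))     # ~0.83
```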
In addition to the above metrics, some other general measures, such as CPU time and CPI (clock cycles per instruction), play an important role in measuring the success of a parallel implementation of a problem. Different implementations of the same algorithm on a parallel system may produce different outputs.

IV. METHODOLOGY

In network evaluation, latency and asymptotic bandwidth are the two parameters which affect the performance of both data-intensive and compute-intensive applications. In this research, certain parametric values of TCP/IP are suggested which improve the latency and bandwidth. "Ping-pong" is a standard tool to check one-way [5] and two-way communication in a network of PCs. A ping utility sends specifically marked packets from the local computer to a remote computer/device. Besides determining whether the remote computer is currently 'alive', ping also provides indicators of the general speed or reliability of the network connection. The ping utility will be used to observe the network traffic, the number of hops, and zero-TTL conditions. As a cluster is based on networked PCs, the time of sending and receiving messages should be analysed based on the TCP/IP parameters defined above [6, 8], and the suitability of Ethernet/Fast Ethernet for coarse-grain applications will be tested.

V. CONCLUSION

Cluster computing based on loosely coupled PCs in a LAN is a relatively new area of interest which is required to accomplish tasks in a minimum amount of time, as this is of critical importance nowadays. For this we will attempt to solve the communication bottlenecks in a cluster-enabled local area network.

REFERENCES

[1] Chalmers University of Technology, "Cluster Systems", 13th April 2007.
[2] O.P. Gupta, "Performance Evaluation of LAN for Parallel Computing", Pb. Univ. Res. J. (Sci.), vol. 56, pp. 207-210, 2006.
[3] Brent Wilson, "Introduction to Parallel Programming Using Message-Passing", Journal of Computing Sciences in Colleges, vol. 21, issue 1, 2005.
[4] Speedup performance laws: fixed workload (Amdahl's law), scaled problems (Gustafson's law), memory-bounded speedup model (Sun and Ni, 1993). URL: www.comp.nus.edu.sg
[5] TCP/IP test: ping-pong test, the classic latency and bandwidth test. URL: www.jncasr.ac.in
[6] The TCP/IP Guide. URL: www.tcpipguide.com
[7] Introduction to parallel programming paradigms: message passing, data-parallel implementation. URL: www.msi.umn.edu
[8] An overview of TCP/IP. URL: www.wilyhacker.com
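To make the ping-pong measurement described in Section IV concrete, the following is a minimal TCP sketch over the loopback interface. The payload size, round count and use of a single machine are our own illustrative assumptions; a real cluster test would run the client and server on separate nodes across the switch under study.

```python
import socket
import threading
import time

PAYLOAD = b"x" * 1024   # 1 KB message (illustrative size)
ROUNDS = 200            # number of round trips (illustrative count)

def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed")
        buf += chunk
    return buf

def pong_server(srv):
    """Echo each incoming message back to the sender."""
    conn, _ = srv.accept()
    with conn:
        for _ in range(ROUNDS):
            conn.sendall(recv_exact(conn, len(PAYLOAD)))

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
srv.listen(1)
port = srv.getsockname()[1]
threading.Thread(target=pong_server, args=(srv,), daemon=True).start()

cli = socket.create_connection(("127.0.0.1", port))
start = time.perf_counter()
for _ in range(ROUNDS):
    cli.sendall(PAYLOAD)            # ping
    recv_exact(cli, len(PAYLOAD))   # pong
elapsed = time.perf_counter() - start
cli.close()
srv.close()

latency_us = elapsed / (2 * ROUNDS) * 1e6             # one-way latency estimate
bandwidth_mbs = 2 * ROUNDS * len(PAYLOAD) / elapsed / 1e6
print(f"one-way latency ~{latency_us:.1f} us, bandwidth ~{bandwidth_mbs:.2f} MB/s")
```

On loopback both figures reflect only the host's TCP stack; measured between two PCs over Fast/Gigabit Ethernet, the same loop exposes the switch-level latency and bandwidth the paper sets out to compare.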