A Basic Look at Competing Architectures:

Before diving into specifics, let's define some basic terms and the main
architectural differences:

SMP (symmetric multiprocessing): single-node machines that scale up
(more CPUs), with a high-speed bus and high bandwidth; it is not
necessary to share disk.

Figure 1: SMP Processing

Clustered: requires shared disk across all nodes; memory and processing
must be synchronized across all nodes; a high-speed interconnect is
required.

Figure 2: Clustered Processing

MPP (massively parallel processing): usually independent SMP
components, capable of scaling out and up (within a single SMP node);
doesn't share disk; splits processing into parallel components across the
architecture.

Figure 3: Massively Parallel Processing
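
The MPP definition above (independent nodes, no shared disk, processing split into parallel components) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, rows are assigned to nodes by key modulo node count, and threads stand in for separate machines. It is not how any particular MPP product works.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, node_count):
    """Assign each (key, value) row to exactly one node; nothing is shared."""
    nodes = [[] for _ in range(node_count)]
    for key, value in rows:
        # Assumed placement rule: integer key modulo the number of nodes.
        nodes[key % node_count].append((key, value))
    return nodes


def local_sum(node_rows):
    """Work one node does on its own slice, without touching other nodes' data."""
    return sum(value for _, value in node_rows)


def parallel_total(rows, node_count):
    """Scatter rows to nodes, aggregate locally in parallel, then combine."""
    nodes = partition(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(local_sum, nodes))
    # Only the small partial results cross node boundaries.
    return sum(partials)


rows = [(i, i * 10) for i in range(1, 101)]
total = parallel_total(rows, node_count=4)
```

Each node aggregates only the rows it owns, and only one partial sum per node is combined at the end. A clustered design, by contrast, has to keep memory and disk consistent across all nodes.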

SMP Processing Pros and Cons

Pros
Large scalability
Extremely fast access
Logical partitioning
Single unit
Single point of management

Cons
Upper size limit (processing and data) due to hardware bus size.
All CPUs and all RAM must be of the same make, speed, and size.
Number of I/O channels must be kept in concert with CPUs in order to avoid bottlenecking.
Amount of RAM should be at least 1.3x the number of CPUs once 32 CPUs are reached (costly).
Expensive solution to purchase outright.
Doesn't seem to be possible to scale to MPP today (without special hardware).

Clustering Pros and Cons

Pros
Mid-tier scalability
LARGE clustered SMP machines work similarly to large MPP nodes
Single point of management
Cheaper than single SMP or MPP as an entry point

Cons
Hot disk spots with large batch loads.
Cannot effectively control slices of data via partitioning, making it a challenge to balance data sets across bandwidth.
Hot disk spots with large data queries (a query that aggregates many hundreds of thousands of rows has to synchronize all that data in RAM before aggregating it).
Once a mixed workload is put on the single cluster and large data sets are being written and collected, the engine spends more time synchronizing across the network than it does answering the needs of either the load or the query.
As volumes increase, increased network traffic means increased CPU utilization. It also means a law of diminishing returns: the higher the volume (both query and load), the closer performance and responsiveness come to the top of the curve.

MPP Pros and Cons

Pros
Scale-out solution
Doesn't require big-iron nodes
Single point of management
Everything is run in parallel, so supporting a mixed workload is easy

Cons
To add another node, it is necessary to repartition and redistribute data sets.
Costly to engineer, which results in high up-front costs that appear to be inhibitors to entry.
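
The repartitioning cost listed under Cons can be made concrete with a small sketch. It assumes rows are placed on nodes by integer key modulo node count. That rule is chosen only for illustration (the source does not say how data is distributed), and the function names are hypothetical.

```python
def node_for(key, node_count):
    """Assumed placement rule: integer key modulo the number of nodes."""
    return key % node_count


def moved_fraction(keys, old_nodes, new_nodes):
    """Fraction of rows whose owning node changes when the node count changes."""
    moved = sum(
        1 for k in keys if node_for(k, old_nodes) != node_for(k, new_nodes)
    )
    return moved / len(keys)


keys = range(1000)
# Growing from 4 to 5 nodes under the modulo rule.
fraction = moved_fraction(keys, old_nodes=4, new_nodes=5)
print(f"{fraction:.0%} of rows must move")  # prints "80% of rows must move"
```

Under this simple rule, adding a fifth node to a four-node system relocates 80% of the rows, which is why adding a node means redistributing data sets rather than just plugging in hardware. Other placement schemes can move fewer rows, but some redistribution is always needed so that the new node holds its share.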
