ENTERPRISE GRID COMPUTING VS. CLUSTERING
Clustering is the use of multiple computers, typically PCs or UNIX
workstations, multiple storage devices, and redundant interconnections,
to form what appears to users as a single, highly available system.
Cluster computing can be used for load balancing as well as for high
availability. Advocates of clustering suggest that the approach can
help an enterprise achieve 99.999% availability in some cases. One of
the main ideas of cluster computing
is that, to the outside world, the cluster appears to be a single
system.
This differs from Enterprise Grid Computing where resources can
enter and leave the pool as necessary.
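To put the 99.999% availability figure in perspective, the following minimal sketch converts an availability percentage into a yearly downtime budget. The function name and the choice of an average 365.25-day year are assumptions made for illustration; they do not come from the original text.

```python
# Convert an availability percentage into an allowed-downtime budget.
# Assumption: an average year of 365.25 days (includes leap days).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, implied by an availability level."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    # Compare "two nines" through "five nines".
    for pct in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% availability -> {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, 99.999% availability leaves a little over five minutes of downtime per year, which is why it is usually quoted as a best-case target.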
Cluster computing can't truly be characterized as a distributed
computing solution; however, it's useful to understand the relationship
of grid computing to cluster computing. Often, people confuse grid
computing with cluster-based computing, but there are important
differences.
Grids consist of heterogeneous resources. Cluster computing is
primarily concerned with computational resources; grid computing
integrates storage, networking, and computation resources. Clusters
usually contain a single type of processor and operating system;
grids can contain machines from different vendors running various
operating systems. (Grid workload-management software from IBM,
Platform Computing, DataSynapse, and United Devices can
distribute workload to a multitude of machine types and configurations.)
Grids are dynamic by their nature. Clusters typically contain a
static number of processors and resources; resources come and go
on the grid. Resources are provisioned onto and removed from the
grid on an ongoing basis.
Grids are inherently distributed over a local, metropolitan, or
wide-area network. Usually, clusters are physically contained in
the same complex in a single location; grids can be (and are) located
everywhere. Cluster interconnect technology is built for extremely low
network latency, which becomes a problem if the machines in a cluster
are not close together.
Grids offer increased scalability. Physical proximity and network
latency limit the ability of clusters to scale out; due to their
dynamic nature, grids offer the promise of high scalability.
For example, recently, IBM, United Devices, and multiple life-science
partners completed a grid project designed to identify promising
drug compounds to treat smallpox. The grid consisted of approximately
two million personal computers. Using conventional means, the project
most probably would have taken several years; on the grid,
it took six months. Imagine what could have happened if there had
been 20 million PCs on the grid. Taken to the extreme, the smallpox
project could have been completed in minutes.
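As a rough back-of-the-envelope check on this scaling argument, the sketch below extrapolates from the figures quoted above (about two million PCs, six months). It assumes perfectly linear scaling, which is an idealization: real grids lose time to scheduling, data transfer, and work that cannot be split. The function names and the average-month constant are hypothetical, chosen only for this illustration.

```python
# Ideal (linear) scaling estimate based on the smallpox grid figures in the text.
# Assumption: runtime is inversely proportional to the number of PCs.
BASELINE_PCS = 2_000_000      # approximate grid size quoted in the text
BASELINE_MONTHS = 6.0         # runtime quoted in the text
MINUTES_PER_MONTH = 365.25 * 24 * 60 / 12  # average month, in minutes


def ideal_runtime_months(pcs: int) -> float:
    """Estimated runtime in months on a grid of `pcs` machines, assuming linear scaling."""
    if pcs <= 0:
        raise ValueError("pcs must be positive")
    return BASELINE_MONTHS * BASELINE_PCS / pcs


def pcs_needed_for_minutes(target_minutes: float) -> float:
    """PCs needed to finish in `target_minutes`, assuming linear scaling."""
    if target_minutes <= 0:
        raise ValueError("target_minutes must be positive")
    baseline_minutes = BASELINE_MONTHS * MINUTES_PER_MONTH
    return BASELINE_PCS * baseline_minutes / target_minutes


if __name__ == "__main__":
    print(f"20 million PCs: {ideal_runtime_months(20_000_000):.2f} months")
    print(f"PCs needed for a 10-minute run: {pcs_needed_for_minutes(10):.3g}")
```

Under this idealized model, 20 million PCs would cut the run to roughly 18 days, while finishing in minutes would take a grid on the order of tens of billions of machines.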
Cluster and grid computing are completely complementary; many grids
incorporate clusters among the resources they manage. Indeed, a
grid user may be unaware that his workload is in fact being executed
on a remote cluster. And while there are differences between grids
and clusters, the two remain closely related: there will always be
a place for clusters, because certain problems will always require
a tight coupling of processors.
However, as networking capability and bandwidth advance, problems
that were previously the exclusive domain of cluster computing will
be solvable by grid computing. It is vital to comprehend the balance
between the inherent scalability of grids and the performance advantages
of tightly coupled interconnections that clusters offer.
Although these workday tasks are clustering's greatest hits, another
application often gets more press: grid computing. The two terms
are often used interchangeably, since both involve multiple systems
working together to carry out a similar set of functions,
but there are differences. You can think of a cluster as grid computing
under one roof: One company or department sets up a cluster and
controls the whole, usually localized or centralized, system.
Grid computing is more far-reaching; individual systems can be
added or subtracted without a central control. What's more, miles
can separate grid participants as long as there's a network connection
between them. An example on a massive (nay, cosmic)
scale is the SETI@Home project, which enlists PC users all over
the Internet to download a screen saver that uses extra clock cycles
to sort through radio-telescope data.
CLUSTERING
In simple terms, clustering means connecting two or more computers
so that they behave like a single computer.
Clustering refers to a number of ways to group servers in order
to distribute load and eliminate single points of failure within
a business-critical system.
Clustering solutions are employed for parallel processing, load balancing,
and, most commonly, fault tolerance. Proponents of clustering suggest
that the approach can help an enterprise achieve close to 100% availability
in some cases. One of the attributes of clustering is that, to the
outside observer, the cluster appears to be a single system.
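The link between eliminating single points of failure and approaching 100% availability can be illustrated with a small sketch. It assumes that nodes fail independently and that any one surviving node can carry the full load; both are simplifying assumptions rarely met exactly in practice, and the function name is hypothetical.

```python
# Availability of a redundant cluster under simplifying assumptions:
# nodes fail independently, and one surviving node is enough to serve the load.
def cluster_availability(node_availability: float, nodes: int) -> float:
    """Probability that at least one of `nodes` identical nodes is up."""
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # The cluster is down only when every node is down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} node(s) at 99% each -> {cluster_availability(0.99, n):.6f}")
```

Under these assumptions, each added node multiplies the probability of a total outage by the single-node failure probability, so even modestly reliable machines can combine into a highly available system.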