Page 593 - (ISC)² CISSP Certified Information Systems Security Professional Official Study Guide
Large-Scale Parallel Data Systems
A parallel data system, or parallel computing system, is a computation
system designed to perform numerous calculations simultaneously. But
parallel data systems often go far beyond basic multiprocessing
capabilities. They often include the concept of dividing up a large task
into smaller elements, and then distributing each subelement to a
different processing subsystem for parallel computation. This
implementation is based on the idea that some problems can be solved
efficiently if broken into smaller tasks that can be worked on
concurrently. Parallel data processing can be accomplished by using
distinct CPUs or multicore CPUs, using virtual systems, or any
combination of these. Large-scale parallel data systems must also be
concerned with performance, power consumption, and
reliability/stability issues.
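The divide-and-distribute idea described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a description of any particular product: the function names (split_into_chunks, sum_of_squares, parallel_sum_of_squares) and the choice of summing squares as the workload are invented for the example. It relies on the executor classes in Python's standard concurrent.futures module to stand in for the "processing subsystems."

```python
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(data, parts):
    """Divide a large list into at most `parts` contiguous subelements."""
    size = max(1, -(-len(data) // parts))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(chunk):
    """The work performed on one subelement (an illustrative workload)."""
    return sum(n * n for n in chunk)


def parallel_sum_of_squares(data, workers=4, executor_cls=ProcessPoolExecutor):
    """Distribute each chunk to a separate worker, then combine the results."""
    chunks = split_into_chunks(data, workers)
    with executor_cls(max_workers=workers) as pool:
        # Each chunk is handed to a worker; results come back in chunk order.
        partial_results = list(pool.map(sum_of_squares, chunks))
    return sum(partial_results)
```

With the default process pool, the call should be made from under an if __name__ == "__main__": guard, because some platforms start worker processes by re-importing the main module. The executor_cls parameter is a design choice for the sketch: it lets the same split, distribute, and combine steps run on threads or processes without changing the logic.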
Within the arena of multiprocessing or parallel processing there are
several divisions. The first division is between asymmetric
multiprocessing (AMP) and symmetric multiprocessing (SMP). In
AMP, the processors are often operating independently of each other.
Usually each processor has its own OS and/or task instruction set.
Under AMP, processors can be configured to execute only specific
code or operate on specific tasks (or specific code or tasks are allowed
to run only on specific processors; this might be called affinity in some
circumstances). In SMP, the processors each share a common OS and
memory. The collection of processors also works collectively on a
single task, code, or project. A variation of AMP is massively parallel
processing (MPP), where numerous SMP systems are linked together
in order to work on a single primary task across multiple processes in
multiple linked systems. An MPP traditionally involved multiple
chassis, but modern MPPs are commonly implemented on the same
chip.
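The notion of affinity mentioned above, where a task is allowed to run only on a specific processor, can be illustrated on a single Linux host. The sketch below is an approximation made for illustration only: true AMP may give each processor its own OS, which this example does not attempt. The helper names (allowed_cpus, pin_to_cpu) are hypothetical; the calls os.sched_getaffinity and os.sched_setaffinity are part of Python's standard library but are available only on some platforms (notably Linux), so the code checks for them first.

```python
import os


def allowed_cpus():
    """Return the set of CPU indices this process may run on, or None
    if the platform does not expose scheduler affinity (non-Linux)."""
    if hasattr(os, "sched_getaffinity"):
        return set(os.sched_getaffinity(0))  # 0 means the calling process
    return None


def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU.

    Returns True if the process was pinned, False if affinity is not
    supported here or the requested CPU is not available to the process.
    """
    available = allowed_cpus()
    if available is None or cpu not in available:
        return False
    os.sched_setaffinity(0, {cpu})
    return True
```

A worker process that calls pin_to_cpu before starting its task will then be scheduled only on that processor, which is one way an administrator might dedicate a processor to a sensitive or time-critical workload.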
The arena of large-scale parallel data systems is still evolving. It is
likely that many management issues are yet to be discovered and
solutions to known issues are still being sought. Large-scale parallel
data management is likely a key tool in managing big data and will
often involve cloud computing, grid computing, or peer-to-peer
computing solutions. These three concepts are covered in the

