Oracle RAC 11g Overview
Article by author Jeff Hunter
A cluster is a group of two or more interconnected computers or
servers that appear as if they are one server to end users and
applications and generally share the same set of physical disks.
The key benefit of clustering is to provide a highly available
framework where the failure of one node (for example a database
server) does not bring down an entire application. If one of
the servers fails, the surviving server (or servers) can take
over the workload from the failed server, and the application
continues to function normally as if nothing had happened.
The concept of clustering computers actually started several
decades ago. The first successful cluster product, ARCnet, was
developed by DataPoint in 1977. ARCnet enjoyed much success
among academics in research labs, but didn't really take off in
the commercial market. It wasn't until the 1980s, when Digital
Equipment Corporation (DEC) released its VAX cluster product
for the VAX/VMS operating system, that clustering took off
commercially.
With the release of Oracle 6 for the Digital VAX cluster
product, Oracle became the first commercial database to support
clustering at the database level. It wasn't long, however,
before Oracle realized the need for a more efficient and
scalable distributed lock manager (DLM), as the one included
with the VAX/VMS cluster product was not well suited for
database applications. Oracle decided to design and write its
own DLM for the VAX/VMS cluster product, which provided the
fine-grain block level locking required by the database.
Oracle's own DLM was included in Oracle 6.2, which gave birth to
Oracle Parallel Server (OPS), the first database to run the
parallel server.
By Oracle 7, OPS was extended to include support for not
only the VAX/VMS cluster product but also most flavors of
UNIX. This framework required vendor-supplied clusterware, which
worked well but made for a complex environment to set up and
manage given the multiple layers involved. By Oracle 8, Oracle
introduced a generic lock manager which was integrated into the
Oracle kernel. In later releases of Oracle, this became known as
the Integrated Distributed Lock Manager (IDLM) and relied on an
additional layer known as the Operating System Dependent (OSD)
layer. This new model paved the way for Oracle to not only have
its own DLM, but to also create its own clusterware product
in future releases.
Oracle Real Application Clusters (RAC), introduced with
Oracle 9i, is the successor to Oracle Parallel Server.
Using the same IDLM, Oracle 9i could still rely on
external clusterware but was the first release to include its
own clusterware product, named Cluster Ready Services (CRS). With
Oracle 9i, CRS was only available for Windows and Linux.
By Oracle 10g, Oracle's clusterware product was available
for all operating systems. With the release of Oracle Database
10g Release 2 (10.2), Cluster Ready Services was renamed
to Oracle Clusterware. When using Oracle 10g or higher,
Oracle Clusterware is the only clusterware that you need for
most platforms on which Oracle RAC operates (except for
TruCluster, where you need vendor clusterware). You can still use
clusterware from other vendors if the clusterware is certified
for Oracle RAC. This guide uses Oracle Clusterware 11g.
Like OPS, Oracle RAC allows multiple instances to access the
same database (storage) simultaneously. RAC provides fault
tolerance, load balancing, and performance benefits by allowing
the system to scale out. At the same time, because all
instances access the same database, the failure of one node will
not cause the loss of access to the database.
At the heart of Oracle RAC is a shared disk subsystem. Each
instance in the cluster must be able to access all of the data,
redo log files, control files and parameter file for all other
instances in the cluster. The data disks must be globally
available in order to allow all instances to access the
database. Each instance has its own redo log files and UNDO
tablespace, which are locally read-writable. The other instances
in the cluster must be able to access them (read-only) in order
to recover that instance in the event of a system failure. The
redo log files for an instance are writable only by that
instance and are read by another instance only during
system failure. The UNDO, on the other hand, is read all the
time during normal database operation (e.g. for consistent-read
(CR) fabrication).
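The access rules described above can be summarized in a small
Python sketch. This is only an illustrative model of which
instance may read or write which files, not how Oracle enforces
access. The function name can_access, the file-kind labels and
the instance names are hypothetical.

```python
# Toy model of the shared-disk access rules described in the text.
# All names here are illustrative assumptions, not Oracle APIs.

SHARED = "shared"  # data files, control files, parameter file
REDO = "redo"      # per-instance redo log files
UNDO = "undo"      # per-instance UNDO tablespace


def can_access(requester, owner, kind, mode, owner_failed=False):
    """Return True if `requester` may open a file of `kind` that
    belongs to `owner` in the given `mode` ("read" or "write")."""
    if kind not in (SHARED, REDO, UNDO):
        raise ValueError(f"unknown file kind: {kind!r}")
    if mode not in ("read", "write"):
        raise ValueError(f"unknown mode: {mode!r}")

    # Globally available files, or an instance's own redo/UNDO.
    if kind == SHARED or requester == owner:
        return True

    # Another instance's redo or UNDO is never writable.
    if mode == "write":
        return False

    # UNDO is readable at any time; redo only to recover a failed owner.
    return kind == UNDO or owner_failed


if __name__ == "__main__":
    print(can_access("inst2", "inst1", UNDO, "read"))
    print(can_access("inst2", "inst1", REDO, "read"))
    print(can_access("inst2", "inst1", REDO, "read", owner_failed=True))
```

The point of the sketch is the asymmetry between the two
per-instance structures: UNDO is readable by other instances
during normal operation, while redo is read by another instance
only when its owner has failed.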
The biggest difference between Oracle RAC and OPS is the
addition of Cache Fusion. With OPS, a request for data from one
instance to another required the data to be written to disk
first; only then could the requesting instance read that data
(after acquiring the required locks). With Cache Fusion, data is
passed directly between instances along a high-speed
interconnect using a sophisticated locking algorithm.
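To make the contrast concrete, here is a minimal Python sketch
that counts disk I/O and interconnect transfers when one
instance requests a modified block held by another. It is a toy
model under simplifying assumptions: locking is omitted
entirely, and the class, function and counter names are
hypothetical rather than anything taken from Oracle.

```python
# Toy comparison of an OPS-style block handoff (via disk) with a
# Cache Fusion-style handoff (via the interconnect).
# Locking is deliberately left out; all names are illustrative.


class Instance:
    """A database instance with a private buffer cache."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # block id -> block contents


def transfer_block(holder, requester, block_id, disk, stats, cache_fusion):
    """Give `requester` the current copy of a block cached by `holder`."""
    if cache_fusion:
        # Ship the block from cache to cache over the interconnect.
        requester.cache[block_id] = holder.cache[block_id]
        stats["interconnect_transfers"] += 1
    else:
        # OPS style: the holder writes the block to disk first,
        # then the requester reads it back from disk.
        disk[block_id] = holder.cache[block_id]
        stats["disk_writes"] += 1
        requester.cache[block_id] = disk[block_id]
        stats["disk_reads"] += 1
    return requester.cache[block_id]


def simulate(cache_fusion):
    """Run one block request and return the I/O counters."""
    inst1, inst2 = Instance("inst1"), Instance("inst2")
    disk = {}
    stats = {"disk_writes": 0, "disk_reads": 0, "interconnect_transfers": 0}
    inst1.cache[42] = "row v2"  # modified block, not yet on disk
    transfer_block(inst1, inst2, 42, disk, stats, cache_fusion)
    return stats


if __name__ == "__main__":
    print("OPS style:   ", simulate(cache_fusion=False))
    print("Cache Fusion:", simulate(cache_fusion=True))
```

In this model the OPS-style path costs one disk write and one
disk read per request, while the Cache Fusion path costs a
single interconnect transfer and no disk I/O.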
Not all database clustering solutions use shared storage.
Some vendors use an approach known as a Federated Cluster,
in which data is spread across several machines rather than
shared by all. With Oracle RAC, however, multiple instances use
the same set of disks for storing data. Oracle's approach to
clustering leverages the collective processing power of all the
nodes in the cluster and at the same time provides failover
security.
Pre-configured Oracle RAC solutions are available from
vendors such as Dell, IBM and HP for production environments. It
is possible to put together your own Oracle RAC 11g
environment for development and testing by using Linux servers
and a low-cost shared disk solution: iSCSI.