Inside Oracle Data Guard
Article by author Chris Foot
From the book "OCP Instructors Guide for DBA Certification"
When a mission-critical application becomes
unavailable, it can threaten the survivability of the
organization. The financial impact of downtime is not the only
issue facing a company whose critical applications are offline.
Loss of customer goodwill, bad press, idle
employees and legal penalties (lawsuits, fines, and so on) must
also be considered. It is up to the database administrator to
recommend and implement technical solutions that deal with these
unforeseen "data disruptions."
Introducing Oracle Data Guard
Oracle's Data Guard is becoming a popular
solution to the problem of providing highly available
architectures at a reasonably low cost. Oracle Data Guard is a
passive failover environment that uses a single system to run
the user applications until a failure occurs. Then the backup
system is engaged and takes over for the primary system. The
Data Guard primary system can then be repaired or replaced.
Passive failover systems are designed to recover from faults,
not to compute through them. This means that there will be
an outage if a problem occurs on the primary server. The length
of the outage depends on the length of time it takes for the
problem to be identified (either by the administrator or the
software) and the time it takes for the failover system to be
brought online.
Because the systems are not mirror images of each other, data
loss is also a concern with Data Guard failover architectures.
How much data is lost as a result of the failure depends upon
how the failover environment is designed and configured. Oracle
Data Guard can be configured to provide different levels of
protection that range from minimal to zero data loss. But as it
is with everything in life, there is a trade-off between zero
data loss configurations and production system performance.
But Data Guard is more than just failover
software; it is a software architecture that creates, supports
and monitors a failover environment that protects data from
hardware failures, human errors and corruptions that might
otherwise cause a critical application failure to occur.
Oracle Data Guard Architecture
Let's continue by breaking the Data Guard architecture down
into its main components:
The primary database is the live production system. Every
standby database is associated with one (and only one) primary
database. In Oracle9i Release 2, up to 9 physical and logical
Data Guard standby databases can be associated with a single
primary database. As changes are being made to the primary
database, LGWR or ARCH transfers a copy of those changes (in the
form of redo log entries) to the standby databases.
A physical standby database is identical to the primary database
on a block-by-block basis. A physical standby database is
updated by applying redo log entries that are received from the
primary database. A Data Guard physical standby database
must be in recovery mode while applying the redo. It cannot
be used for reporting while it is recovering data.
A logical standby database is an independent database that
contains the same data as the primary database. The logical
standby database uses LogMiner technology to convert the log
information received from the primary database into SQL
statements. The SQL statements are then applied to the logical
standby database. The tables in a logical standby database can
be simultaneously used for end-user reporting. Additional
indexes and materialized views can be created in the database to
increase query performance. All tables in the standby database
that are protecting primary database tables are read-only.
Tables that are not protecting primary database tables are
read-write.
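The read-only versus read-write split described above corresponds to the database guard setting on a logical standby. The statements below are a minimal sketch, assuming an Oracle9i Release 2 logical standby and a session with the privileges needed to run ALTER DATABASE; check the documentation for your release before relying on them.

```sql
-- Protect only the tables that log apply maintains;
-- tables that exist only on the standby remain writable.
ALTER DATABASE GUARD STANDBY;

-- Confirm the current guard setting (ALL, STANDBY, or NONE).
SELECT guard_status FROM v$database;
```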
If the environment is configured for maximum protection, log
writer (LGWR) will ship transaction redo data directly to the
standby's Remote File Server Process (RFS) via Oracle Net. LGWR
will transmit the redo information to the destination
as the online redo log is populated. Administrators
are able to specify synchronous or asynchronous network
transmission of redo data to the remote destinations.
The environment can also be configured to have archiver (ARCH)
ship full archived redo logs to the standby server's Remote File
Server Process via Oracle Net. Administrators configure ARCH to
ship archived redo logs to the standby server by placing
additional entries in the parameter file. The full archived logs
can only be sent to the Remote File Server Process using
synchronous network transmission. Since only completed archived
redo logs are sent to the standby server, data changes on the
standby will lag behind the primary.
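The choice between LGWR and ARCH transport is made through the attributes of a remote archive destination. The following is a sketch only: it assumes a server parameter file is in use (so that SCOPE=BOTH is accepted), that destination number 2 is free, and that stby1 is a hypothetical Oracle Net service name that resolves to the standby.

```sql
-- Option 1: LGWR ships redo as it is generated, synchronously.
-- AFFIRM asks the standby to confirm the redo has been written to disk.
ALTER SYSTEM SET log_archive_dest_2 = 'SERVICE=stby1 LGWR SYNC AFFIRM' SCOPE=BOTH;

-- Option 2: ARCH ships each redo log only after it has been archived.
-- ALTER SYSTEM SET log_archive_dest_2 = 'SERVICE=stby1 ARCH' SCOPE=BOTH;

-- Enable the destination.
ALTER SYSTEM SET log_archive_dest_state_2 = ENABLE SCOPE=BOTH;
```

The same attribute string can be placed directly in a text parameter file instead of being set with ALTER SYSTEM.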
The standby server's Remote File Server Process (RFS) is
responsible for receiving the archived or online redo log data
from the primary server.
Depending on how the redo log data was shipped from the primary
server (LGWR or ARCH), administrators are able to store the
shipped redo data as standby online redo logs or standby
archived redo logs. The standby database will still use
conventional online redo logs (required for normal database
operations) but can be configured to use both online redo logs
and standby online redo logs. The following conditions must
occur before standby online redo logs can be used as the
repository for shipped redo log data:
- The Data Guard primary database must be configured to use LGWR
to ship redo log data from the primary server to the standby.
- The size of the Data Guard standby redo log must match the
size of at least one of the primary online redo logs.
- The standby redo log must be archived on the standby server
before its contents can be applied to the standby database.
The standby database server will use the Managed Recovery
Process (MRP) to apply the redo information if the standby
database is a physical standby and will use the Logical
Standby Process (LSP) to apply redo information if the standby
database is a logical standby.
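The size-matching condition and the two apply processes can be illustrated with a few statements. This is a hedged sketch rather than a complete procedure: the group number, file path, and size are placeholders, and the standby redo log size should be taken from the primary's own online redo logs.

```sql
-- On the primary: list the online redo log sizes that the
-- standby redo logs should match.
SELECT group#, bytes FROM v$log;

-- On the physical standby: add a standby redo log group of the same size.
-- Group number, path, and size are placeholders.
ALTER DATABASE ADD STANDBY LOGFILE GROUP 4
  ('/u01/oradata/stby/srl04.log') SIZE 100M;

-- Physical standby: start managed recovery (MRP) in the background.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

-- Logical standby: start log apply (LSP) instead.
-- ALTER DATABASE START LOGICAL STANDBY APPLY;
```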
The Fetch Archive Log Process (FAL) is a background Oracle
process that runs on the primary database server. If ARCH is
used to ship archived redo logs to the standby server there is a
possibility of log gaps occurring during network failures. The
standby environment can be configured to detect network failures
and initiate requests to the FAL server process to send the
missing archived redo logs.
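Gap resolution is driven by two parameters set on the standby, and a physical standby exposes any detected gap through a dynamic view. In this sketch, prim and stby1 are hypothetical Oracle Net service names for the primary and the standby, and a server parameter file is assumed.

```sql
-- On the standby: name the Oracle Net services used for gap resolution.
-- fal_server points at the primary; fal_client is the standby's own
-- service name as the primary knows it.
ALTER SYSTEM SET fal_server = 'prim' SCOPE=BOTH;
ALTER SYSTEM SET fal_client = 'stby1' SCOPE=BOTH;

-- On the physical standby: list any range of archived log
-- sequence numbers that is currently missing.
SELECT thread#, low_sequence#, high_sequence# FROM v$archive_gap;
```

An empty result from the last query suggests that no gap is currently blocking recovery.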
Data Guard Protection Modes
Oracle Data Guard offers three modes of data protection. The
ultimate goal of any failover system is to keep the primary and
standby databases as identical as possible. But the key to
success is to balance the needs of transaction protection with
transaction performance. Administrators use the ALTER DATABASE
SET STANDBY DATABASE TO MAXIMIZE {PROTECTION | AVAILABILITY |
PERFORMANCE}; statement to tune the Data Guard environment for
data protection, availability, or performance.
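As a concrete instance of the statement above, the sequence below raises a primary database to maximum availability. It is a sketch under stated assumptions: it is run from SQL*Plus as SYSDBA (SHUTDOWN and STARTUP are SQL*Plus commands, not SQL statements), the redo transport destinations already meet the requirements of the chosen mode, and the release requires the database to be mounted but not open when the mode is raised. Verify that last point against the documentation for your release.

```sql
-- Run from SQL*Plus as SYSDBA on the primary.
SHUTDOWN IMMEDIATE
STARTUP MOUNT

-- Choose one of PROTECTION, AVAILABILITY, or PERFORMANCE.
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;

ALTER DATABASE OPEN;
```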
Data Guard Maximum Protection
Maximum protection ensures the highest
level of data availability for the primary database. In maximum
protection mode, redo log records are synchronously sent by LGWR
to the standby database. Primary database changes are not
committed until it has been confirmed that the data is available
on at least one standby database. The redo log data does not
have to be applied to the standby database; it only has to be
acknowledged as received on the standby server.
If Oracle determines that the redo data
can't be transferred from the primary server to the standby
servers, Data Guard will automatically stop the primary database
instance. This ensures that no transaction data is lost when the
primary and standby databases are unable to communicate. In
order to prevent unwanted primary database shutdowns from
occurring, administrators should configure more than one Data
Guard standby database when creating an Oracle Data Guard
environment that will be configured for maximum protection.
Standby servers that participate in a
maximum protection environment must use standby online redo
logs. Because logical standby databases cannot be configured to
use standby online redo logs, they are unable to participate in
maximum protection configurations.
Maximum protection configurations have the greatest impact on
transaction performance. Ensuring there is a high-speed
connection between the primary and standby servers can lessen
this impact.
Data Guard Maximum Availability
Maximum availability provides the second
highest level of data availability. As with its maximum
protection counterpart, redo data is synchronously transmitted
from the primary database to the standby database by LGWR.
Primary database changes are not committed until it has been
confirmed that the data is available on at least one standby
database.
The standby database may temporarily lag behind, or diverge
from, the primary database without negatively impacting the
production environment.
If the standby database becomes unavailable for any reason, the
Data Guard protection mode is temporarily lowered to maximum
performance until the problem has been corrected. Once
connectivity is reestablished, the Data Guard standby database
will automatically synchronize with the primary database and no
data will be lost. If the primary database fails during a
primary/standby communication outage, all transactions that
occurred on the primary server after the communication outage
could be lost.
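One way to see whether the protection mode has been temporarily lowered is to compare the configured mode with the level currently in effect. The query below is a small sketch that assumes the PROTECTION_MODE and PROTECTION_LEVEL columns of V$DATABASE, which are available from Oracle9i Release 2 onward.

```sql
-- protection_mode  : the mode the administrator configured
-- protection_level : the level actually in effect right now
SELECT protection_mode, protection_level FROM v$database;
```

A difference between the two columns would suggest that the standby is unreachable or still resynchronizing.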
The use of standby online redo logs is optional for maximum
availability mode. This means that logical standby databases can
participate in maximum availability configurations. Oracle does
recommend that physical standby servers be configured to use
standby online redo logs in maximum availability configurations.
Data Guard Maximum Performance
Maximum performance is the default protection mode. It offers
lower data availability and higher performance than its
counterparts. Redo log data is asynchronously shipped to the
standby database by either LGWR or ARCH. The commit operation on
the primary database is not contingent upon the data being
received by the standby server.
If all of the standby servers become unavailable, processing
will continue on the primary database. The use of standby
online redo logs is also optional for this mode. As a result,
logical standby databases are able to participate in maximum
performance configurations. Physical standby databases can use
standby redo logs if redo log data is shipped from the primary
database by LGWR.
Data Guard Broker
Oracle's Data Guard Broker is the management framework that is
used to create, configure, administer and monitor a Data Guard
environment. The Data Guard Broker provides the following
benefits:
- Simplifies the creation of Data Guard environments by
providing wizards to create and configure physical or logical
standby databases. Data Guard is able to generate all of the
files necessary (parameter, tnsnames.ora, etc.) to establish the
connectivity between the standby and primary database servers.
- Allows administrators to invoke a failover or switchover
operation with a single command and control complex role changes
across all systems in the configuration. A switchover is a
planned transfer of control from the primary to the standby
while a failover is an unplanned transfer of control due to some
unforeseen event. By automating Data Guard activities such as
failover and switchover, the possibility of errors is reduced.
- Provides performance-monitoring tools to monitor log transport
and log apply times.
- Provides a GUI tool (Data Guard Manager) that allows
DBAs to administer a primary/multiple-standby configuration
with a simple point-and-click interface.
Administrators are able to manage all components of the
configuration, including primary and standby servers and
databases, log transport services, and log apply services.
- Data Guard is highly integrated with Oracle Enterprise Manager
to provide e-mail and paging capabilities.
An Oracle background server process called DMON is started on
every site that is managed by the broker. The DMON process is
created when the Data Guard Broker monitor is started on the
primary or standby database servers. The DMON process is
responsible for interacting with the local instance and the DMON
processes running on the other servers to perform the functions
requested by the Data Guard Manager or command line interface.
The DMON process is also responsible for monitoring the health
of the broker configuration.
DMON maintains a persistent configuration
file on all of the servers managed by the Data Guard Broker
framework. The configuration file contains entries that provide
details on all objects in the configuration and their statuses.
The broker uses this information to send information back to the
Data Guard Manager, configure and start the site and database
resource objects and control each object's behavior.
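The DMON process and its configuration files are controlled by initialization parameters. The statements below are a minimal sketch, assuming a server parameter file is in use; they would be run on the primary and on each standby that the broker manages.

```sql
-- Start the broker monitor (DMON) on this instance.
ALTER SYSTEM SET dg_broker_start = TRUE SCOPE=BOTH;

-- Show the broker settings, including the locations of the
-- persistent configuration files.
SELECT name, value FROM v$parameter WHERE name LIKE 'dg_broker%';
```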