Lewis Cunningham - 11/10/2005
Oracle RAC & Grid Tuning with Solid State
Disk, sub-titled: Expert Secrets for High
Performance Clustered Grid Computing is a
pretty specific book covering RAC and how it
can make the most of Solid State Disk (SSD).
The goal of this book is to show how SSD can
improve the performance of an Oracle
database, specifically a clustered one. This
book is published by Rampant Press and is
part of the Oracle In-Focus series. The
book is written by Mike Ault and Donald K.
Burleson. It's 200 pages and is liberally
sprinkled with graphs, listings and
benchmark results. TPC benchmarks are used
extensively throughout. Like all Rampant
Press books, the price is excellent at
US$16.95. A book of this size on an esoteric
topic like this would probably cost US$30 or
more from most publishers.
The book is seven chapters and includes
two appendices and a decent index. The text
starts off with some detailed explanatory
text and architectural overviews and moves
into tuning and TPC benchmarks.
The chapters are:
- Chapter 1: Solid State Disk with
Oracle - This is mainly an Oracle
Architecture Overview
- Chapter 2: SSD and Bandwidth - This
is mainly an SSD Overview
- Chapter 3: Solid-State Disk with RAC
- This is getting to the meat of the
matter
- Chapter 4: TPC-C Online Benchmark
with Solid-State Disk - The first test
- Chapter 5: TPC-H Warehouse Benchmark
with SSD - The big data
- Chapter 6: Oracle Tuning With
Selective Application of SSD - The most
important chapter to me, what to put
where
- Chapter 7: No-disk Oracle
Architectures - The future?
- Index
- Appendix A: TPC-C ERD and Tables
- Appendix B: Example AWRRPT
Chapter 1: Solid State Disk with
Oracle
Chapter 1 starts off with some commentary
about Moore's law as it pertains to memory
speed and prices and a description about why
RAM-SAN is better than disk SAN. A central
theme of the book is the proper use of SSD.
Just throwing SSD at a database MIGHT help
but proper use will raise the odds that it
will help. From the book:
The proper use of SSD is the central
question for this benchmark. Traditional
architectures of the 1990's have left
users with duplicate cache areas such
as web cache, Oracle buffer cache,
on-board disk cache, etc., and it is now
the challenge of the Oracle DBA to
exploit SSD for the most benefit for
their database application.
Each benchmark includes an introduction
to the issue, a predictive hypothesis, the
test methodology and the results and final
conclusions.
In this chapter, the authors say that
today, 128 GB of SSD can be purchased for
about US$150k. That amount of memory could
easily hold many databases. Databases are
getting larger and larger but there are many
much smaller than 128GB.
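As a rough check on those numbers, here is a minimal Python sketch of the arithmetic. The price and capacity are the figures quoted above. The database sizes in the loop are made-up examples, not figures from the book.

```python
SSD_CAPACITY_GB = 128      # capacity quoted in the chapter
SSD_PRICE_USD = 150_000    # approximate price quoted in the chapter


def cost_per_gb(price_usd, capacity_gb):
    """Price of one gigabyte of SSD at the quoted figures."""
    return price_usd / capacity_gb


def fits_on_ssd(db_size_gb, capacity_gb=SSD_CAPACITY_GB):
    """True if a database of db_size_gb fits entirely on the SSD."""
    return db_size_gb <= capacity_gb


print(f"US${cost_per_gb(SSD_PRICE_USD, SSD_CAPACITY_GB):,.2f} per GB")

# Hypothetical database sizes, chosen only to illustrate the check.
for db_size_gb in (2.4, 40, 300):
    print(db_size_gb, "GB fits:", fits_on_ssd(db_size_gb))
```

At the quoted price that works out to roughly US$1,172 per GB, which is why the question of what to put on SSD matters so much.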
Most of the remainder of chapter 1 covers
using SSD as a tuning tool and the
architecture of buffers and caching. The
authors ask and explain, "Why is Oracle
Logical I/O So Slow". The posited answer is
the overhead required to maintain read
consistency and data concurrency.
Chapter 1 ends with a review of some
existing studies by James Morle, Dr. Paul
Dorsey and Woody Hutsell (for Texas Memory
Systems) and presents some conclusions that
can be drawn from those studies.
Chapter 2: SSD and Bandwidth
Chapter 2 is all about bandwidth and how
SSD can help. The chapter starts with a
review of Oracle I/O and how bandwidth
impacts it. The author even goes into a bit
of RAM bandwidth and access speed history,
starting with 8-bit systems in the 1970s
through the 64-bit systems of today.
The gist of this chapter is that the
bottlenecks in data processing have shifted
away from CPU and memory and towards
storage.
For years, storage architects have
observed the growing divide between
processor performance and storage access
times. Remember, when the CPU waits on
storage, the users are waiting on
storage.
There is also coverage of super-large
disks and the problems inherent in a
database placed on a single pair of mirrored
devices. The issue here is a data
transmission bottleneck due to excessive
read-write head movement and controller
bandwidth limitations. I would like to have
seen a comparison of the super-large disk
with a SAN.
The chapter goes on to suggest ways to
use SSD to remove, or at least alleviate,
the issue of bandwidth saturation. One way
is to move all concurrent access data files
to SSD. That would get expensive in larger
systems.
The final parts of this chapter deal
directly with RAC. Specifically, Cache
Fusion and I/O bandwidth and finding the
source of bandwidth bottlenecks in a RAC
cluster.
The author provides several scripts to
gather statistics on I/O, contention, wait
events, etc. All of these scripts are
available at what the author calls "The Code
Depot". You can get access to this code
repository by purchasing the book.
Chapter 3: Solid-State Disk with RAC
Chapter 3 is a short chapter. This
chapter gets into the nitty gritty of RAC
and I/O. The author discusses how disks are
striped (8k vs. 128k) and even the
mechanical nature of disks. He makes the
point that it is difficult enough to decide
where SSD can help in your basic Oracle set
up. Bringing in RAC, with its specific
requirements like the high speed
interconnect between nodes, makes that
decision harder still.
Latency seems to be a key calculation for
using SSD in a RAC environment. With many
users and many large data files, reducing
latency seems to, statistically at least, be
the place where SSD helps the most.
Another point here ties to the cost of
SSD as opposed to regular disk
technology.
Experts agree that for optimal
performance, no disk should be filled to
more than 60% of its total capacity,
which is rather like buying a six
passenger car and being told that only
four people should be transported in it.
On SSD technology, the complete
capacity is usable since there is no
positional or rotational latency, and
the number of simultaneous reads/writes
is only dependent on bandwidth since
there is no [head] that needs to be
repositioned after each read or write
operation.
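To put rough numbers on the quoted passage, here is a small Python sketch. The 60% ceiling comes from the quote. The 250GB drive size and the 7,200 RPM spindle speed are assumed example values, not figures taken from this chapter.

```python
def usable_capacity_gb(raw_gb, max_fill=0.60):
    """Capacity left once a disk is capped at max_fill of its raw size."""
    return raw_gb * max_fill


def avg_rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


# Assumed example values: a 250GB drive spinning at 7,200 RPM.
print(round(usable_capacity_gb(250), 1), "GB usable at a 60% ceiling")
print(round(avg_rotational_latency_ms(7200), 2), "ms average rotational latency")
```

A mechanical disk pays that rotational delay, plus seek time, on every random read. SSD has no platter or head, so neither term applies and the full capacity can be used.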
I will leave this here for now. In Part
2, I will finish covering each of the
chapters in detail and then make some
comments on the usability of the book and
its suitability in my, or your, library.
And, if you've made it this far, I have a
favor to ask of you. My 100th entry is
coming up soon. I would like some ideas of what
you would like for that entry. I want it to
be my best yet and since I try to make this
blog a tool for all of you, let me know what
you would like to see.
Chapter 4: TPC-C Online Benchmark With Solid-State Disk
This chapter, as the title says, is about the results
of TPC-C benchmarking tests in a 10g RAC and SSD
environment. The benchmark is to show the effects of
varying the SGA in both RAID array and SSD
configurations.
A TPC-C benchmark utilizes nine tables in a typical
OLTP scenario. The tables used are listed in an
appendix. Insert and Delete operations were performed.
The chapter begins with the test configuration and
setup. The hardware was:
- Two Dual AMD 244 Processor 1.7 GHz Opteron servers
- 1MB CPU cache
- Red Hat Linux EL (kernel 2.4.21-27.ELsmp)
- 2GB RAM
- dual port Qlogic HBA (2Gbps Fiber Channel)
The HBA was attached to both a RAID 5 array and the
RamSan. The array used a 64k stripe across five disks for
64GB of space. The disks were Maxtor MaxLine Plus II 250GB
SATA.
The RamSan was a RamSan400 with up to 128GB storage
and an expected 3GB/sec bandwidth. That's kind of a
sweet setup. I would love to have this setup as my test
environment. They don't mention cost.
The database itself was somewhat small. The database
was 2.4GB with indexes. The book says it took 6 hours to
install. The schema was loaded using Quest's Benchmark
Factory tool. From the book:
Many TPC-C tests utilize larger test databases,
however, the memory of this test configuration was a
total of 4 gigabytes and the test team intended to
only utilize 2-3 gigabytes of this memory for the
Oracle system to allow for large numbers of users.
The only parameters that were changed during the test
were SGA_MAX_SIZE and SGA_TARGET. Automatic Memory
Management was used.
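For readers who have not varied these parameters before, here is a minimal Python sketch that prints the kind of ALTER SYSTEM statements such a sweep would need. The sizes are hypothetical, since the review does not list the values the authors tried, and this is not the authors' own script.

```python
def sga_statements(size_mb):
    """Return the ALTER SYSTEM statements for one test run at size_mb."""
    return [
        # SGA_MAX_SIZE is a static parameter, so it is written to the
        # spfile and takes effect after the instance is restarted.
        f"ALTER SYSTEM SET sga_max_size={size_mb}M SCOPE=SPFILE;",
        f"ALTER SYSTEM SET sga_target={size_mb}M SCOPE=SPFILE;",
    ]


# Hypothetical sweep of SGA sizes in megabytes.
for size_mb in (512, 1024, 1536):
    for statement in sga_statements(size_mb):
        print(statement)
```

Each pair of statements would be followed by an instance restart before the next benchmark run.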
Clients were a combination of Windows 2000 and Windows XP
desktops.
The tests were run over 100 times to minimize the
impact of any network or configuration issues. The tests
were also run with various user loads from 10 to 600
users.
The chapter ends with the results of the tests.
First, the RAID results are graphed for RAID throughput,
Bytes Per Second, Average Transaction Time and Average
Response Time followed by graphs for the same categories
but for the SSD.
I'll let you read the book to get the actual results
but one interesting note the author makes is that for a
RAM SAN environment it makes sense to reduce the server
cache and drive I/O to the SSD array in some situations.
Very interesting.
Chapter 5: TPC-H Warehouse Benchmark With
Solid-State Disk
A TPC-H test differs from the TPC-C test in that it's
more concerned with data loading and querying. The
authors used tools provided by TPC (dbgen and qgen).
The tests ran 22 standard queries that included
aggregation, subqueries, group bys and other SQL
features in a DSS environment.
It's not entirely clear to me if the same hardware
from the TPC-C test was used for the TPC-H tests. I
would be interested in knowing that but I don't think
it was the same.
Anyway, the authors started with a SCSI array that
failed and was replaced by an ATA array. The entire
SCSI/ATA test took 58 days to run. The authors don't say
exactly how many days the SSD runs took but the book does
say at one point that the SSD outperformed the normal disk
by a factor of 179.
The chapter ends with the authors' conclusions about
SSD in a data warehouse environment.
Chapter 6: Oracle Tuning With Selective
Application of SSD
The chapter starts with analyzing what to put on SSD.
With the size of databases in the real world, it's rare
that someone can afford to put an entire database on
RAMSAN. The choices of what to put on SSD break down
into: Data, Indexes, Redo and Temp.
The authors make the point that several tools may be
available to analyze a system, including: Custom
Scripts, OEM, Third-party tools and AWRRPT reports. The
authors chose to use Custom Scripts (in some cases
utilizing statspack) and the AWRRPT reports. There is
even a quick Statspack install section.
It's obvious from this chapter, if not from previous
ones, that I/O is the key. All of the custom scripts
cover getting the nitty gritty details about I/O.
This is not a small chapter but is mainly filled with
report results comparing Disk with SSD.
The chapter ends with some conclusions, including
when NOT to move your application to SSD. This is an
informative chapter to get some hints about how to use
STATSPACK for tuning if you don't already know and use
STATSPACK.
Chapter 7: No-disk Oracle Architectures
This chapter is a look into a possible future for SSD.
The author feels, and believes that Oracle feels, that
disks will soon be relegated to the job of backing up
SSD. He makes the point that the Oracle CBO is moving
away from I/O based costing to a CPU based costing.
There are two graphics in this chapter that I like. A
modern Oracle-RAC configuration with normal disk and the
same configuration but with RAMSAN. I think it's amusing
that the only difference is some text.
The author covers what this move to SSD will mean for
Oracle and points out that there will be no more
disk failures (the SSD uses redundant memory) and PGA
will take over as the main memory area, amongst other
points.
He also covers what will change when a DBA monitors a
database. Pretty much everything that needs to be
monitored becomes memory structures running at memory
speeds. Oracle is constantly adding auto-tuning
features. This makes the point, like so many others,
that a DBA's job in the future will require more
knowledge than just being a good database caretaker or
administrative clerk.
The author concludes this chapter with this
statement:
It is time for progressive users to consider using
SSD technology in those areas of their databases
where it can deliver outstanding performance and
bring their databases into the 21st century.
The final sections of the book are the index and
appendices.
My Conclusions
This book is well written and provides one of the
best high-level overviews of what RAC is and how SSD can
help a database that I have seen. This book is not your
typical reference work. If you are curious about how SSD
can help you, or even curious about how SSD works, this
is a book you should read.
This is not a book that you will refer back to over
time. You will use it to sate your curiosity or perhaps
write a justification to acquire SSD but I doubt you
will ever read it again. However, at US$16.95, it's
worth the purchase cost. I now know significantly more
about both RAC and SSD than I did before reading it.
This is also not a large book. It's 200 pages front
to back and is fairly large type. It reads, to me,
almost as a series of essays rather than a traditional
reference book. I like the conversational style although
others may not.
Even if you don't agree with all of the opinions
presented in the book, it is a good read for those
interested in RAC and/or SSD technology. I am a
technophile and love this kind of stuff.
The ISBN is 0976157357.