Note: The
Source Code file
that accompanies
this article
contains the
following:
- Lab1.sql-PL/SQL
package header
spec
- Lab1body.sql-PL/SQL
package body
definition
- Lab1.java-Java
test application
Many
articles have been
written about the
use of PL/SQL vs.
Java. These articles
tend to focus on the
intricacies of each
language and their
strengths and
weaknesses as the
server-side logic.
This article won't
focus on this issue
but will instead
concentrate on an
issue that faces
Java and PL/SQL
developers alike --
where do you put the
SQL? The SQL is a
major part of each
application, and its
location can greatly
enhance or degrade
performance,
distribution, and
maintenance.
Java
is certainly
mainstream at this
point and is a very
powerful language.
Even though it's a
personal favorite of
this author, it
still has its
disadvantages.
PL/SQL, while
obviously
database-centric
(Oracle), also has
some enormous
benefits. Good
developers use both
where appropriate
and often find
themselves perplexed
with the same issue
-- the location of
the SQL statements.
When
I say "the
location of the
SQL," I'm
referring to its
storage location.
The developer has
many options when
writing Java
applications,
servlets, and
applets that access
databases. You can
store SQL in the
application itself
(Java side) and
access the database
directly from the
applet via a SQL
statement. Or the
programmer can make
a request from the
applet to the
servlet, which then
sends the SQL to the
database. Another
option within Java
is to simply make a
procedure or
function call
(Callable Statement
in JDBC Driver) and
process the result
set. In the last
scenario, the SQL
would reside inside
the database in an
Oracle package. This
article explores the
advantages and
disadvantages of
these options and
includes performance
benchmarks. It's my
intention to
objectively answer
my own questions
while helping to
make your decision
easier when you
encounter this issue
in your own
projects.
The
Power of a Package
If
you have the option
of installing a
PL/SQL package as
part of your
application, you
should seriously
consider installing
it. (Although the
focus of this
article is on
PL/SQL, the package
could also be
written in Java in
versions of Oracle
8.1 and later.)
PL/SQL packages have
many advantages:
- Privilege
Management-Instead
of being
concerned about
whether each
user has the
rights to
perform a
function and
trapping
exceptions
throughout your
code, you can
grant execute on
a package. The
user inherits
rights to all of
the underlying
objects
indirectly
through package
execution. For
example, let's
assume that part
of your code
issued a
TRUNCATE command
on a table. If
the command is
issued as the
connected user,
you can expect
privilege
problems and
would need to
resolve these
problems by
granting the
proper
privileges to
the user. You
would either
have to connect
as someone with
privileges
behind the
scenes or put
the TRUNCATE
command in a
procedure call
in the package.
The procedure
becomes the
gatekeeper of
the transaction
and ensures that
any access to
the underlying
objects goes
through the
procedure.
- Global
Location-Having
the SQL in one
spot is the most
flexible option:
By having
calc_inventory()
as a procedure
call, any
application that
can issue a
database call
can benefit from
the procedure.
Your Java apps,
applications,
Oracle*Forms,
Visual Basic, or
any application
that can issue a
SQL statement
can easily
access the same
result because
it's the same
code that
produces the
result. This
solution is
preferable to
reproducing the
same algorithm
in different
code bases
maintained by
different
people. And it
surpasses trying
to access
methods in one
language from
the architecture
of another by
leveraging APIs
and RPCs-a
strategy that's
more complex to
build and
maintain. Put
the SQL in the
database and be
done with it.
- Performance-Most
of the time,
performance will
be the driving
issue in the
decision of how
to partition an
application.
Very few users
are interested
in elegant
architectures
when their query
takes five
minutes to
return after
they clicked a
button. The fact
is that you
can't run a SQL
statement any
faster than
running it
inside the
Oracle kernel.
But at what
point is
performance
really an issue?
Is a package
faster when
submitting a
single SQL
statement? Or is
it only when
submitting two
or more that a
package is
faster? When is
it clearly the
best solution
for performance
reasons? I'll
show later in
this article
that PL/SQL is
faster with some
tasks and slower
with others.
- Interface
Agnostic-Software
architects
learned long ago
that it makes
sense to
separate
back-end logic
from the
interface.
Interface
technologies
change much more
frequently than
databases.
Whether it's
Java Swing, AWT,
Visual Basic,
PowerBuilder, or
Oracle*Forms, a
well-designed
application can
support them
all. Decouple
the business
logic from the
user interface
controls and
you're well on
your way.
- Global
Variables-Global
variables are
useful, and yes,
Java has static
variables as
well. The
difference is
that global
package
variables apply
to anyone who
accesses them
regardless of
the application.
This allows you
to maintain data
across and
between
transactions
through
persistent data.
A user in
SQL*Plus, JDBC,
Java, Visual
Basic, and ODBC
will inherit the
performance
benefits. The
following code
shows an example
since it only
calculates the
global cost when
the global cost
variable hasn't
been set (=0).
Function x returns VARCHAR2
BEGIN
If global_cost = 0 then -- global_cost is
defined outside the scope
Calc_global_cost(); -- (first time in only)
endif;
END
The
Flexibility of Java
Java
also has some
benefits as a home
for your Oracle SQL
statements. These
benefits include:
- Simplicity-Embedding
SQL statements
in native Java
code is much
easier than
building
additional
pieces, like
PL/SQL packages.
- Debugging-Although
tools exist that
enable the
debugging of
PL/SQL, they're
less robust than
the
functionality in
existing Java
IDEs, like
JBuilder. Being
able to set
breakpoints and
inspect variable
values is a
critical
requirement for
any developer,
and it's
preferable to do
that from a
single
development
environment.
- Distribution-Having
the code
encapsulated in
one place makes
the distribution
and management
of the
application much
easier. If a
user can simply
access a URL and
use your
application
instantly, then
you have a
satisfied user.
If you create a
package in the
database, you'll
have to install
that package in
each database to
be accessed. By
keeping the SQL
in the Java
code, you can
connect to any
database and
don't have to
worry about
PL/SQL package
maintenance on
each node.
- Performance-When
performing many
inserts or
updates, Java
performs better
because it has
the ability to
batch these
statements. The
JDBC driver
provides the
ability to queue
the request and
when the number
of queued
requests reaches
the batch size,
JDBC sends them
to the database
for execution.
Performance
At
first glance, the
main performance
degradation for a
SQL request is
sending and
receiving the
request across the
network. We also
know that most
boosts in
performance rely on
the architect
eliminating or
reducing the number
of round trips to
the database.
A
logical assumption
is that batching
requests and then
sending them as a
group would greatly
enhance performance.
But how much does it
enhance performance?
And what's the
threshold that seems
to be the decision
point? Is a single
statement faster in
a package? Two
statements? Or is it
only when you have
25-50 statements
that a package
becomes the ideal
choice? These are
the questions that I
hope to answer in
the benchmark tests
in the following
section.
Benchmarks
The
timings for these
benchmark tests are
given in
milliseconds and are
calculated by making
the call
System.currentTimeMillis()
in the Java code.
Each test was
performed 10 times,
and the average
result is reported.
Each test was
performed locally,
over an internal
network, and
remotely, over a
broadband connection
(DSL).
Environment
Database:
Oracle 8.1.5 on HP-UX
B.11.00 A 9000/785
JDBC
Driver: Oracle Thin
8.1.6.0
Client
machine: Pentium II
NEC Laptop 366 MHz
with 256MB RAM
Java
Virtual Machine:
1.3.0_01
The
Tests
My
test scenarios
consisted of timing
the particular
statements during
their execution
only. I tried to
eliminate the
definition of the
statement as well as
processing and
closing of any
result sets. You'll
notice in the Java
code that the clock
is started right
before the execute
statement and
stopped after it
returns.
I
also used prepared
statements in my
tests instead of
regular statements.
It's more efficient
to use
PreparedStatement
with bind variables
for frequently
executed statements.
Although
PreparedStatement is
inherited from
Statement, it's
different in the
following two ways:
- Each
time you execute
a Statement
object, its SQL
statement is
compiled.
However, when
you execute a
PreparedStatement,
its SQL
statement is
only compiled
when you first
prepare the
PreparedStatement.
- You
can specify
parameters in
the
PreparedStatement
SQL string, but
not in a
Statement SQL
string. Single
statement:
The
single statement
test is a very
simple test. I tried
to create a
statement that
everyone could run
on their machines,
and one that
wouldn't be answered
immediately to
ensure that the time
reported wasn't
solely network
communication time.
At the time I ran
this query, I had
3,194 objects
reported in
DBA_OBJECTS (see
Listing 1).
Java:
SQLText = "select count(*) from dba_objects";
pstmt = databaseConnection.prepareStatement(SQLText);
startTime = System.currentTimeMillis();
ResultSet rs = pstmt.executeQuery();
rs.next();
x = rs.getString(1);
PL/SQL:
FUNCTION single_statement RETURN VARCHAR
IS
row_count PLS_INTEGER := 0;
BEGIN
select count(*) into row_count from dba_objects;
return row_count;
END single_statement;
Listing
1: The single
statement test.
Multiple
Statements
The
multiple statement
test was a bit more
difficult. At first,
I used 10 different
queries. Later I
decided to use the
same query 10 times
and place it in a
loop instead of
coding it 10 times
(see Listing 2).
Java:
SQLText = "select count(*) from dba_objects";
pstmt = databaseConnection.prepareStatement(SQLText);
ResultSet rs = null;
startTime = System.currentTimeMillis();
while (multiCount < 10) {
rs = pstmt.executeQuery();
/*
don't include result set processing in the timings since we do not
do it in the PL/SQL
rs.next();
x = rs.getString(1);
*/
multiCount ++;
}
multiStatementJava = multiStatementJava + System.currentTimeMillis() - startTime;
PL/SQL:
FUNCTION multiple_statements
RETURN VARCHAR
IS
row_count PLS_INTEGER := 0;
num_objects VARCHAR2(20);
BEGIN
WHILE row_count < 10
LOOP
select count(*) into num_objects from dba_objects;
row_count := row_count + 1;
END LOOP;
return row_count;
END multiple_statements;
Listing
2: The multiple
statement test.
Truncate
The truncate
test simply truncates a table. No result set is involved. I included this
test to observe the benchmarks when no rows are returned (see Listing
3).
Java:
SQLText = "TRUNCATE TABLE SOOTHSAYER.BMC$PKK_INSTANCE_STATS";
pstmt = databaseConnection.prepareStatement(SQLText);
startTime = System.currentTimeMillis();
ResultSet rs = pstmt.executeQuery();
truncateJava = truncateJava + System.currentTimeMillis() - startTime;
PL/SQL:
Procedure truncate_table IS
trunc_command varchar2(100);
BEGIN
trunc_command := 'TRUNCATE TABLE BMC$PKK_INSTANCE_STATS';
execute immediate (trunc_command);
END truncate_table; Java Oracle package
Listing
3: The truncate
test.
Benchmark
Results
The
following table
shows the average
local test results,
in milliseconds, for
each type of test:
|
Java |
Oracle
package |
DB
on HP
machine |
|
|
Single
statement |
47 |
48 |
Multiple
statements |
448 |
376 |
TRUNCATE |
88 |
82 |
Single
Statement Results
The
single statement
test shows nearly
equal results after
10 executions. The
difference of 1 ms
seems negligible
(although the remote
test provides a very
different result). I
was a bit surprised
at how close these
two results were, so
I decided to see how
much time was spent
going to the server
and not executing
the SQL statement. I
did this by
commenting out the
one line of work in
the function:
FUNCTION single_statement RETURN VARCHAR
IS
row_count PLS_INTEGER := 0;
BEGIN
-- select count(*) into row_count from dba_objects;
return row_count;
END get_row_count;
It
took an average of 8
ms (~16% of total
execution time) to
access the procedure
and return, without
actually executing
the statement. This
result tells me that
each trip I can
eliminate will save
8 ms.
Multiple
Statements Results
The
results of the
multiple statement
test confirm my
assumption. The
Oracle package is
faster because it
makes one trip to
the database, does
its work, and then
returns. This result
clearly shows that a
package should be
used when the
complete unit of
work can be
performed on the
database. Larger
units of work will
result in more
significant
performance gains.
To
determine whether
any overhead was
incurred because the
procedure was inside
a package, I
eliminated the
package and created
a stand-alone
procedure. The
results showed no
impact; the numbers
were the same.
Truncate
Results
The
results of the
truncate test were
surprising-the
execution is
actually faster in
Java than in PL/SQL.
Oracle must be
eliminating some
overhead in the JDBC
driver that isn't
eliminated in
PL/SQL.
Remote
Results
The
following table
shows the average
remote test results,
in milliseconds, for
each type of test:
HP
machine via
broadband
connection (DSL) |
Java |
Oracle
package |
Single
statement |
286 |
113 |
Multiple
statements |
1662 |
506 |
TRUNCATE |
217 |
332 |
The
test results from
the broadband
connection are
revealing:
- A
1 ms difference
in the single
statement test
equates to 173
ms with a slow
connection
speed.
- The
results of the
multiple
statement test
are
overwhelming,
showing that the
PL/SQL package
is three times
faster.
- Once
again, the
TRUNCATE command
is faster in
Java
than in PL/SQL.
Although
we might easily
discount a
difference of a few
milliseconds as
being "close
enough," as in
the first single
statement test, the
remote test shows
that we should
always consider the
faster approach. A
query that executes
only 5 ms slower in
Java than in a
PL/SQL package (or
vice versa) might
not seem like an
issue. However, if
the query is
executed 5,000 times
per day, that
difference affects
performance
considerably. Also
consider that the
difference of 5 ms
might occur when
you're testing the
code at work on a T1
line with the server
three feet from your
desk. But when you
test the code over
an ISDN, cable, or
DSL connection, that
5 ms can become 55
ms or 300 ms. Every
millisecond counts
when performance
tuning.
Conclusion
It's
hard to declare a
clear winner in this
topic. Many factors
demand a combination
of strategies. The
ultimate decision
should weigh the
following factors
and their
applicability to the
application:
- Unit
of Work (UOW)-If
the UOW is one
SQL statement,
then creating a
function solely
for it makes
little sense.
However, if the
unit of work is
a series of SQL
statements with
processing in
between, a
package might
provide the best
solution. This
solution assumes
that:
1.
the Graphical User
Interface doesn't
need to be informed
of the status of the
work, as is typical
in a progress bar of
a GUI control; and
2.
the database can
perform all
processing required.
- Network
Speed-We
witnessed the
impact of
running the same
program over the
internal network
vs. a DSL
connection. If
network speed
becomes an
issue, the use
of the package
is preferable.
- Database
Accessibility-If
the application
gives the option
to connect to
any database or
a large number
of databases,
having the code
in the Java
eliminates the
distribution and
maintenance of
the package. All
of the code used
in the tests is
available in the
Source Code file
at
www.oracleprofessionalnewsletter.com
and can be used
as a template to
test your SQL
statements in
your
environment.
 |
For more details on Oracle utilities, see the book "Advanced
Oracle Utilities" by Bert Scalzo, Donald K. Burleson, and Steve Callan.
You can buy it direct from the publisher for 30% off directly from
Rampant TechPress.
|
|