Hotsos Symposium, Day 3

by Jonathan Gennick

Day 3 of the The Hotsos
The last day. I began it with Lex de Haan's session
on "Null Values: Nothing to Worry About". A nice title, but it turns out that
nulls are something to worry about, and they're not going away, so we
need to be cognizant of the sort of trouble they can cause. Lex pointed out
many cases in which the possibility of nulls can lead to subtle issues that
you must consider when writing a query. For example, suppose you wish to find
all employees in the scott/tiger emp table who have no subordinates. The
following query, using NOT IN, returns no rows:

select e1.*
from emp e1
where e1.empno NOT IN
(select e2.mgr
from emp e2);

But use NOT EXISTS and you do get rows back:

select e1.*
from emp e1
(select 'x'
from emp e2
where e2.mgr = e1.empno);

Which query is correct? The NOT IN query fails to return rows because there
is one (and only one) employee in the emp table with a null in the mgr column.
Who does that employee report to? It could be anybody, and, thus, you can't
really know for sure whether any other given employee has subordinates. Which
query is correct comes down to whether you consider null to mean "has no
manager" or "we don't know the manager". And if you use null for both those
cases, well, then you tell me which of the above queries gives the "correct"

Lex also spoke later in the day about the ISO SQL standard in a
presentation titled "Writing Portable SQL'. One fascinating tidbit Lex pointed
out was that if you set event 10407, you get access to a TIME datatype. I've
long surmised the existence of such a datatype, as Oracle's support (beginning
in Oracle 9i Database) of ISO TIME literals practically demands that
such a datatype exist. Else how would the Oracle kernel be able to evaluate
expressions involving such literals. Please don't use the TIME datatype in any
production code though. It's not supported. Maybe someday.

Tom Kyte gave an excellent presentation on the importance of using
bind variables in OLTP applications. His demonstrations of their importance to
scaleability were most convincing. In one example, 10 sessions using a single
SQL statement with bind variables were able to insert 25,000 records each into
a table in about the same time that a single session was able to insert 25,000
records without using bind variables. Tom then demonstrated the
non-linear negative impact on scaleability when many sessions at once
are not using bind variables. He also pointed out that bind variables sidestep
many SQL injection problems, because you are not stringing together
user-supplied values in order to build up SQL statements, and thus users
cannot slip in their own SQL text. Yet bind variables aren't always the right
answer. For data warehousing queries you are often best off not using bind
variables. In the end, Tom suggested the following rule-of-thumb: "seconds per
query: don't bind, queries per second: do bind".

Cary Millsap's presentation on "How to Make an Application Easy to
Diagnose" was my last for the day, and my last for the Symposium. I should
have taken better notes for this one. Cary gave his thoughts on instrumenting
applications that you write, so that they can be traced. In his opinion:

  • Trace files should be designed to economize on space. Rather than:

    date=10-Mar-2005, time=10:00:00.00am, ela=.05, cpu=.01

    Cary prefers a format that does not repeat the labels:

    10-Mar-2005, 10:00:00.00am, .05, .01

  • Trace files should all begin with some sort of "key line" that describes
    the format of the data that follows. For example:

    date, time, ela, cpu
    10-Mar-2005, 10:00:00.00am, .05, .01
    10-Mar-2005, 10:00:00.05am, .04, .02

    This enables trace file
    format to change while making it possible to recognize which format is used in
    a given file.

  • Users must have the option to initiate and stop tracing, to facilitate
    capturing trace data of the correct scope.

  • The option must exist to write trace files in an unbufferred fashion, so
    that trace data can be viewed in real-time.

Cary also showed how DBMS_MONITOR can be used in conjunction with
DBMS_APPLICATION_INFO to trace applications that you write. Use DBMS_MONITOR
to start and stop tracing. Invoke DBMS_APPLICATION_INFO from within your
application to keep track of where you are in the application.
DBMS_APPLICATION_INFO calls get logged to the trace file.

And that's it! As I write this, I hear the hotel's housekeeping staff
closing in on my room. I've got to get out of here. I've got a plane to catch.
I can't wait for next year's symposium.