A COMMIT is generally a very fast operation, regardless of
the transaction size. You might think that the bigger a transaction (in other
words, the more data it affects), the longer a COMMIT would take. This is not
true. The response time of a COMMIT is generally “flat,” regardless of the
transaction size. This is because a COMMIT does not really have too much work
to do, but what it does do is vital.
One of the reasons this is an important fact to understand
and embrace is that it will lead to let your transactions be as big as they
should be. Many developers artificially constrain the size of their
transactions, committing every so many rows, instead of committing when a
logical unit of work has been performed. They do this in the mistaken belief
that they are preserving scarce system resources, when in fact they are
increasing them. If a COMMIT of one row takes X units of time, and the COMMIT
of 1,000 rows takes the same X units of time, then performing work in a manner
that does 1,000 one-row COMMITs will take an additional 1,000*X units of time
to perform. By committing only when you have to (when the logical unit of work
is complete), you will not only increase performance, and you’ll also reduce
contention for shared resources (log files, various internal latches, and the
like).
The more often you commit, the longer you
wait (your mileage will vary on this). And the amount of time you wait is more
or less directly proportional to the number of times you commit .Remember, this
is just a single-user scenario; with multiple users doing the same work, all
committing too frequently, the numbers will go up rapidly. We’ve heard the same
story, time and time again, with similar situations. We must perform operations only when we need to—a COMMIT is just
another such operation. It is best to size our transactions based on business
need, not based on misguided attempts to lessen resource usage on the database.
There are two factors contributing to the expense of the
COMMIT in this example
- We’ve obviously increased the round-trips to and from the database. If we commit every record, we are generating that much more traffic back and forth. I didn’t even measure that, which would add to the overall runtime.
- Every time we commit, we must wait for our redo to be written to disk. This will result in a “wait.” In this case, the wait is named “log file sync.”
So, we committed after every INSERT, we waited every time
for a short period of time—and if you wait a little bit of time but you wait
often, it all adds up. Fully thirty seconds of our runtime was spent waiting
for a COMMIT to complete when we committed 10,000 times—in other words, waiting
for LGWR to write the redo to disk. In stark contrast, when we committed once,
we didn’t wait very long (not a measurable amount of time actually). This
proves that a COMMIT is a fast operation; we expect the response time to be
more or less flat, not a function of the amount of work we’ve done.
So, why is a COMMIT’s response time fairly flat,
regardless of the transaction size? It is because before we even go to COMMIT in the database, we’ve already done the
really hard work. We’ve already modified the data in the database, so we’ve
already done 99.9 percent of the work. For example, operations such as the following
have already taken place:
- Undo blocks have been generated in the SGA.
- Modified data blocks have been generated in the SGA.
- Buffered redo for the preceding two items has been generated in the SGA.
- Depending on the size of the preceding three items and the amount of time spent, some combination of the previous data may be flushed onto disk already.
- All locks have been acquired.
When we COMMIT, all that is left to happen is the
following:
- A System Change Number (SCN) is generated for our transaction. In case you are not familiar with it, the SCN is a simple timing mechanism Oracle uses to guarantee the ordering of transactions and to enable recovery from failure. It is also used to guarantee read-consistency and checkpointing in the database.Think of the SCN as a ticker; every time someone COMMITs, the SCN is incremented by one.
- LGWR writes all of our remaining buffered redo log entries to disk and records the SCN in the online redo log files as well. This step is actually the COMMIT. If this step occurs, we have committed. Our transaction entry is “removed” from V$TRANSACTION—this shows that we have committed.
- All locks recorded in V$LOCK held by our session are released, and everyone who was enqueued waiting on locks we held will be woken up and allowed to proceed with their work.
- Some of the blocks our transaction modified will be visited and “cleaned out” in a fast mode if they are still in the buffer cache. Block cleanout refers to the lockrelated information we store in the database block header. Basically, we are cleaning out our transaction information on the block, so the next person who visits the block won’t have to. We are doing this in a way that need not generate redo log information, saving considerable work later
As you
can see, there is very little to do to process a COMMIT.
The lengthiest operation is, and always will be, the activity performed by LGWR, as this is
physical disk I/O. The amount of time spent by LGWR
here will be greatly reduced by the fact that it has already been flushing the
contents of the redo log buffer on a recurring basis. LGWR will not buffer
all of the work you do for as long as you do it. Rather, it will incrementally
flush the contents of the redo log buffer in the background as you are going
along.
This is
to avoid having a COMMIT wait
for a very long time in order to flush all of your redo at once.So, even if we
have a long-running transaction, much of the buffered redo log it generates would
have been flushed to disk, prior to committing.
On the flip side is the fact that when we COMMIT,
we must typically wait until all buffered
redo that has not been written yet is safely on disk. That is, our call to LGWR is by default a
synchronous one. While LGWR may use
asynchronous I/O to write in parallel to our log files, our transaction will
normally wait for LGWR to
complete all writes and receive confirmation that the data exists on disk
before returning.
No comments:
Post a Comment