The LGWR dilemma


This was discussed at Jonathan Lewis's excellent seminar. We've all had beaten into us over the years that the fundamental requirement for database integrity and recoverability is that any change must always be guaranteed to be in the redo log files. This means that every time you do a commit, your session is blocked until the LGWR has successfully written the information from the log buffer to the redo log. (In fact, a common performance problem is that people commit far too frequently).

But as was demonstrated at the seminar, there is some funny business going on within PL/SQL. Consider the following example. First some setup work:

create table blah ( x number );
insert into blah values (1);

create procedure spin(p number) is 
begin
  for j in 1 .. p loop 
     null; 
  end loop;
end;
/

create procedure redo_test(p number) is
begin
  for i in 1 .. 20 loop
    update blah set x = x + 1;
    commit; 
    spin(p);
  end loop;
end;
/

So the 'spin' procedure just serves to burn some cpu (and hence take up a little time), and the 'redo_test' will perform 20 updates and issue a commit after each update. Now then, onto the test

SQL> exec redo_test(100000);

ie 20 updates, commit after each one, with a few milliseconds delay between each commit. A look at v$sysstat gives:

redo writes = 20
redo blocks written = 20
user commits = 20

Fair enough... This is what we would expect. But then if we do NOT induce such a 'large' delay between the commits...

SQL> exec redo_test(5000);

redo writes = 8
redo blocks written = 20
user commits = 20

and pow! LGWR appears not to have done as many writes as commits! And if we have NO delay between commits, then

SQL> exec redo_test(0);

redo writes = 2
redo blocks written = 20

No solutions yet... If and when information comes to hand, it will be posted.


Update:

The moment you accept the possibility of a PL/SQL routine continuing on relentlessly AFTER a commit has been issued, you immediately introduce the concept of being able to issue asynchronous requests to the log writer. This of course breaks the fundamental rule that we've all had bashed into us over the years - that a commit is NEVER complete until the log writer comes back and tell us so. An asynchronous facility introduces of the possibility of a process thinking that a commit is completed (or at least not really caring if it has or not), when in fact, the appropiate log entries are not yet written.

After the initial panic of this scenario falls away, you can start to see the marvellous ingenuity of this "cheat" solution from the folks at Oracle. Consider a "worst case" scenario, where the PL/SQL program "thinks" that 10 changes have been committed, but really on 6 have been done and then a spectacular database crash occurs. One would assume that we have a database corruption but in fact we don't. What we actually have is something akin to the Heisenberg Uncertainty principle. We can't tell that there were 4 "half-committed" changes done - from our perspective, only 6 will have been completed at the time of the crash. To trap the "cheat" in such a way that the evidence would be persistent after the crash basically stops the cheat from being detectable (due to the very small timings involved).

All that the PL/SQL needs to do is ensure that the last commit (or at least at some stage before it returns control to the calling environment) that it has waited on the log writer to successfully do all of it work. This can in fact be observed from the 'redo synch writes' statistic. After the 'redo_test' procedure is run, this goes up by 1, independent of the number of 'redo writes' that have occurred.

And the real beauty of this, is that you'll see half the work on the redo allocation latch - one of the main potential bottlenecks in an Oracle instance.

Update:

All of this has finally become well known and documented with version 10.2. The COMMIT command has been extended to allow developer control over the synch / asynch nature of commit.