It appears that there's a special optimization for predicates of the form:
col like 'pq%'
where the number of leading characters is 2. Consider the following example (8174 and 9204 show the same behaviour).
SQL> create table o3 as 2 select chr(trunc(rownum/1000)+65)||to_char(rownum,'fm000000000000000') x, 3 'pppppppppppppppppp' padding 4 from all_Objects 5 where rownum <= 26000 6 order by dbms_random.value; Table created.
So we've got a table with 1000 rows for each letter of the alphabet,and we index on col X
SQL> create index o3x on o3 ( x ); Index created. SQL> analyze table o3 compute statistics; Table analyzed. Test 1: Single leading char --------------------------- SQL> set autotrace traceonly explain SQL> select /*+ INDEX(o3) */ * from o3 where x like 'J%'; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=300 Card=1002 Bytes=34068) 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'O3' (Cost=300 Card=1002 Bytes=34068) 2 1 INDEX (RANGE SCAN) OF 'O3X' (NON-UNIQUE) (Cost=5 Card=1002) Test 2: Two leading chars --------------------------- SQL> select /*+ INDEX(o3) */ * from o3 where x like 'J0%'; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=5 Bytes=170) 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'O3' (Cost=3 Card=5 Bytes=170) 2 1 INDEX (RANGE SCAN) OF 'O3X' (NON-UNIQUE) (Cost=2 Card=5) Wow! Look at the "brilliant" cost Test 3: Three leading chars -------------------------- SQL> select /*+ INDEX(o3) */ * from o3 where x like 'J00%'; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=326 Card=1085 Bytes=36890) 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'O3' (Cost=326 Card=1085 Bytes=36890) 2 1 INDEX (RANGE SCAN) OF 'O3X' (NON-UNIQUE) (Cost=6 Card=1085) Test 4: Four leading chars -------------------------- SQL> select /*+ INDEX(o3) */ * from o3 where x like 'J000%'; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=173 Card=575 Bytes=19550) 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'O3' (Cost=173 Card=575 Bytes=19550) 2 1 INDEX (RANGE SCAN) OF 'O3X' (NON-UNIQUE) (Cost=4 Card=575) Test 5: Five leading chars -------------------------- SQL> select /*+ INDEX(o3) */ * from o3 where x like 'J0000%'; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=92 Card=305 Bytes=10370) 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'O3' (Cost=92 Card=305 Bytes=10370) 2 1 INDEX (RANGE SCAN) OF 'O3X' (NON-UNIQUE) (Cost=3 Card=305) Test 6: Six leading chars -------------------------- SQL> select /*+ INDEX(o3) */ * from o3 where x like 'J00000%'; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=49 Card=162 Bytes=5508) 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'O3' (Cost=49 Card=162 Bytes=5508) 2 1 INDEX (RANGE SCAN) OF 'O3X' (NON-UNIQUE) (Cost=2 Card=162)
With all the tests, as you increase the number of leading chars, the index gets cheaper which seems reasonable. The exception is the "magical" 2-leading char case which always comes ultra-cheap. A 10053 trace showed the computed cardinality to about spot-on for a single leading char (ie cmptcd=1000), but this drops to 5 for the 2-char case.
A friend has hypothesized that this may be linked to the fact that for years Oracle Apps would do case insensitive searching by generating predicates such as:
col1 like 'AB%'
or col1 like 'aB%'
or col1 like 'Ab%'
or col1 like 'ab%'
and upper (col1) like upper('AbcdE%')'
in the times before function-based indexes became available. This allowed for the use of four index probes rather than a scan to do case insensitve searching.