Skip to content
accelerando

Monthly Archives: October 2010

Optimizing the Oracle Query Optimizer

20-Oct-10

The Oracle Query Optimizer does a great job in rewriting Queries. For example the nested subquery in

SELECT C.cust_last_name, C.country_id 
FROM    customers C 
WHERE EXISTS (SELECT 1 
                          FROM sales S 
                          WHERE S.quantity_sold > 1000 AND 
                                        S.cust_id = C.cust_id);

will be rewritten into the following query, using a semi-join:

SELECT C.cust_last_name, C.country_id 
FROM customers C, sales S 
WHERE S.quantity_sold > 1000 AND 
C.cust_id S= S.cust_id;

(Example from Optimizer Transformations: Subquery Unnesting part 1)

In the first form, the subquery will be invoked for each entry in customers, replacing the value of the correlated column C.cust_id. Thus the query cannot be parallelized among other implications. Most of the time, if the columns are correctly indexed, the semijoin will be much faster.

This form of rewriting is called “unnesting”.

Consider the following query:

SELECT C.cust_last_name, C.country_id 
FROM    customers C, (SELECT f_compute_some_complicated_value(s.quantity_sold), S.cust_id
                          FROM sales S                          
                          WHERE S.quantity_sold > 1000) S
WHERE  S.cust_id = C.cust_id

The subquery in the from clause its called a derived table. It will be rewritten as:

SELECT C.cust_last_name, C.country_id, f_compute_some_complicated_value(s.quantity_sold), S.cust_id
FROM    customers C, Sales S
WHERE  S.cust_id = C.cust_id 
AND S.quantity_sold > 1000)

This technique is called query merging. Here the join should also be faster in most cases.

Both techniques are good and will improve many queries.

But consider a scenario where the subqueries query remote tables (via database link for example) or call PL/SQL or even Java stored procedures and the main table (customes in the examples) contains many more rows than the joined table or the where clause is less selective then the where clause of the subquery: The subquery will be executed as many times as there are rows in the main table. This can lead to very poor performance.

Luckily, the optimizer can be given hints:

To stop unnesting, use the NO_UNNEST hint:

SELECT C.cust_last_name, C.country_id 
FROM    customers C 
WHERE EXISTS (SELECT /*+ NO_UNNEST */ 1 
                          FROM sales S 
                          WHERE S.quantity_sold > 1000 AND 
                                        S.cust_id = C.cust_id);

Furthermore, i’d rewrite the query like so:

SELECT C.cust_last_name, C.country_id 
FROM    customers C 
WHERE C.cust_id =ANY (SELECT /*+ NO_UNNEST */ S.cust_id 
                                       FROM sales S 
                                       WHERE S.quantity_sold > 1000);

Where the EXISTS clause triggers the subquery for each item in customes, the =ANY clause will trigger the subquery only once. Both queries are semantically identical.

To stop the merging of derived tables, use the NO_MERGE hint in the surrounding select clause:

SELECT /*+ NO_MERGE(S) */ C.cust_last_name, C.country_id 
FROM    customers C, (SELECT f_compute_some_complicated_value(s.quantity_sold), S.cust_id
                          FROM sales S                          
                          WHERE S.quantity_sold > 1000) S
WHERE  S.cust_id = C.cust_id

The NO_MERGE hint takes the alias of the derived table as a parameter.

Combining both techniques brought a complex query down from 40s to just about 1s. We are using an Oracle 10g Database, but the hints also apply to 11g.

Unicode substrings in Ruby 1.8.x

15-Oct-10

Here is how to get correct unicode substrings in Ruby 1.8.x

require 'jcode'
$KCODE = 'u'
 
"öäüÖÄÜß".split(//)[4,2].join
# ÄÜ

Don’t know, if this performes well… In case you have better ideas, please drop me a comment.