The Oracle Query Optimizer does a great job in rewriting Queries. For example the nested subquery in
SELECT C.cust_last_name, C.country_id FROM customers C WHERE EXISTS (SELECT 1 FROM sales S WHERE S.quantity_sold > 1000 AND S.cust_id = C.cust_id); |
will be rewritten into the following query, using a semi-join:
SELECT C.cust_last_name, C.country_id FROM customers C, sales S WHERE S.quantity_sold > 1000 AND C.cust_id S= S.cust_id; |
(Example from Optimizer Transformations: Subquery Unnesting part 1)
In the first form, the subquery will be invoked for each entry in customers, replacing the value of the correlated column C.cust_id. Thus the query cannot be parallelized among other implications. Most of the time, if the columns are correctly indexed, the semijoin will be much faster.
This form of rewriting is called “unnesting”.
Consider the following query:
SELECT C.cust_last_name, C.country_id FROM customers C, (SELECT f_compute_some_complicated_value(s.quantity_sold), S.cust_id FROM sales S WHERE S.quantity_sold > 1000) S WHERE S.cust_id = C.cust_id |
The subquery in the from clause its called a derived table. It will be rewritten as:
SELECT C.cust_last_name, C.country_id, f_compute_some_complicated_value(s.quantity_sold), S.cust_id FROM customers C, Sales S WHERE S.cust_id = C.cust_id AND S.quantity_sold > 1000) |
This technique is called query merging. Here the join should also be faster in most cases.
Both techniques are good and will improve many queries.
But consider a scenario where the subqueries query remote tables (via database link for example) or call PL/SQL or even Java stored procedures and the main table (customes in the examples) contains many more rows than the joined table or the where clause is less selective then the where clause of the subquery: The subquery will be executed as many times as there are rows in the main table. This can lead to very poor performance.
Luckily, the optimizer can be given hints:
To stop unnesting, use the NO_UNNEST hint:
SELECT C.cust_last_name, C.country_id FROM customers C WHERE EXISTS (SELECT /*+ NO_UNNEST */ 1 FROM sales S WHERE S.quantity_sold > 1000 AND S.cust_id = C.cust_id); |
Furthermore, i’d rewrite the query like so:
SELECT C.cust_last_name, C.country_id FROM customers C WHERE C.cust_id =ANY (SELECT /*+ NO_UNNEST */ S.cust_id FROM sales S WHERE S.quantity_sold > 1000); |
Where the EXISTS clause triggers the subquery for each item in customes, the =ANY clause will trigger the subquery only once. Both queries are semantically identical.
To stop the merging of derived tables, use the NO_MERGE hint in the surrounding select clause:
SELECT /*+ NO_MERGE(S) */ C.cust_last_name, C.country_id FROM customers C, (SELECT f_compute_some_complicated_value(s.quantity_sold), S.cust_id FROM sales S WHERE S.quantity_sold > 1000) S WHERE S.cust_id = C.cust_id |
The NO_MERGE hint takes the alias of the derived table as a parameter.
Combining both techniques brought a complex query down from 40s to just about 1s. We are using an Oracle 10g Database, but the hints also apply to 11g.
One comment
Hey, great tip man! I came across your blog surfing for tuning tips and I really wish you posted a LOT more like these.
I am certain in what you are doing you have come across many more pieces of sage advice that would help others like me out here. So please post more (on my knees pleading)
Post a Comment