ONJava.com -- The Independent Source for Enterprise Java
oreilly.comSafari Books Online.Conferences.

advertisement

AddThis Social Bookmark Button
Article:
  Emulating Analytic (AKA Ranking) Functions with MySQL
Subject:   MySQL version
Date:   2007-04-04 19:51:13
From:   jaypipes
Response to: MySQL version

Hi there! Great article, Stephane! Reminds me of the examples I put into Chapter 8 of "Pro MySQL"...


Couple things I wanted to note, however.


First, the number one reason you see such a performance difference with adding an index *isn't* actually the (correct) point you make about the subquery being fired once for each subset row. The reason is because you are using the InnoDB storage engine, and InnoDB has a known issue with doing a SELECT COUNT(*) when there is no WHERE condition on an indexed field. InnoDB is forced to do a full table scan, as you point out, but for a different reason, each time a SELECT COUNT(*) is used in this manner. See http://dev.mysql.com/doc/refman/5.0/en/innodb-restrictions.html for more info on that.


Second, is that you can get *much* better performance from your MySQL SELECTs by avoiding correlated subqueries and rewriting as a derived table... It's a bug in the MySQL optimizer unfortunately. Even better performance would be from creating a temporary table in place of the derived table and having an appropriate index on the temporary table. Unfortunately, a derived table is created internally with no indexes.


Here's your correlated subquery rewritten as a derived table. Let me know if you get much better performance from it, though I think the biggest performance benefit comes from adding an index because of the InnoDB limitation mentioned above:


original SQL:


select *
from (select a.DEPTNO, a.EMPNO, a.LASTNAME, a.FIRSTNAME, a.SAL,
(select 1 + count(*)
from EMPLOYEES b
where b.DEPTNO = a.DEPTNO
and b.SAL > a.SAL) RANK
from EMPLOYEES as a) as x
where x.RANK <= 5
order by x.DEPTNO, x.RANK;


Rewritten SQL:


select a.DEPTNO, a.EMPNO, a.LASTNAME, a.FIRSTNAME, a.SAL, x.emp_count + 1 as RANK
from EMPLOYEES as a
inner join (
select count(*) as emp_count
from EMPLOYEES
) as x
on x.DEPTNO = a.DEPTNO
and x.SAL > a.SAL
where RANK <= 5
order by a.DEPTNO, RANK;


Do the comparison after adding that index, of course. :)


Cheers,


Jay Pipes
jay at mysql dot com


1 to 1 of 1
  1. Stephane Faroult photo MySQL version
    2007-04-06 10:16:23  Stephane Faroult | O'Reilly Author [View]

1 to 1 of 1