A Little Noise


Quiz: A More Perfect UNION

Today I saw a query like this:

Performance was terrible. I ran out of patience after several minutes and killed the thread.

I changed the query to this:

It completed in under 3 seconds.

Can you explain how a no-op UNION so dramatically improved performance? (I couldn't have, without help from Jesper Krogh and James Day).


Comments (4)
  1. Simple answer: Caches

    If it is InnoDB, your first query loaded the table into the buffer pool. If not, you most likely have disk caches, OS caches, MySQL caches, and so on. The first run would have populated the caches between you and the data.

  2. Caches are not relevant here. I can run these two versions in either order, multiple times, and performance is the same as initially stated.

  3. I’m guessing, as there is not much info:
    The UNION ALL query will first write your data into a temporary table; only then will the data be returned to you.
    This means that if your result set is huge and your network is slow, query #1 suffers from the time it takes to send the results over to you; query #2 does not, since the temporary table frees up the underlying table.
    Can you try SELECT SQL_BUFFER_RESULT d FROM t; to see if this is indeed the case?

    Otherwise more information is needed: MyISAM? InnoDB? Write contention? Size?

  4. Not a network issue; this is on localhost.

    Storage engine of table `t` can be either MyISAM or InnoDB, and we’ll get the same results. But storage engines and temporary tables are nevertheless key to the quiz!

    Size is important, but size of what? For my test, I had a million rows in `t`, but I can reduce that to 65K, and the first approach still takes over a minute, while the second approach with the UNION takes a quarter of a second.
