A Little Noise

December 4, 2009

On unions and derivatives

Filed under: MySQL — snoyes @ 4:25 pm

On occasion I need to generate a number of rows. I don’t care about the contents, just how many rows there are. Maybe I want to create a table filled with a sequence of numbers. One common way to do this is to join a bunch of subqueries built out of unions:


SET @count := 0;
INSERT INTO table SELECT @count := @count + 1
FROM (SELECT 1 UNION SELECT 2) a
JOIN (SELECT 1 UNION SELECT 2) b
JOIN (SELECT 1 UNION SELECT 2) c

Of course, I could union more than two selects (e.g. SELECT 1 UNION SELECT 2 UNION SELECT 3). That makes each derived table longer, but reduces how many must be joined.

Question: What’s the optimal size of each derived table?

Defining “optimal”: since this is a one-off query, I don’t much care about performance (within reason), but I do care about how much typing I have to do. So I will favor “fewer characters” over “fewer joins”.

Each derived table in the query takes up 15 * numSelects + 2 characters.

To generate a given number of rows requires CEILING(LOG(numRows) / LOG(numSelects)) derived tables.

If I want 1000 rows in the result, I’d need 10 derived tables of two rows each (320 characters), or 7 of three rows (329 characters), or 3 of ten rows (456 characters).

For any given numRows, what’s the minimum of CEILING(LOG(numRows) / LOG(numSelects)) * (15 * numSelects + 2) ?

The answer varies with numRows – 2, 3, or 4 are alternately best when generating less than 130 rows, but any higher than that and the best choice usually turns out to be 3.

(SELECT 1 UNION SELECT 2 UNION SELECT 3)

Powered by WordPress