On occasion I need to generate a number of rows. I don’t care about the contents, just how many rows there are. Maybe I want to create a table filled with a sequence of numbers. One common way to do this is to join a bunch of subqueries built out of unions:
SET @count := 0;
INSERT INTO table SELECT @count := @count + 1
FROM (SELECT 1 UNION SELECT 2) a
JOIN (SELECT 1 UNION SELECT 2) b
JOIN (SELECT 1 UNION SELECT 2) c
…
Of course, I could union more than two selects (e.g. SELECT 1 UNION SELECT 2 UNION SELECT 3
). That makes each derived table longer, but reduces how many must be joined.
Question: What’s the optimal size of each derived table?
Defining “optimal”: since this is a one-off query, I don’t much care about performance (within reason), but I do care about how much typing I have to do. So I will favor “fewer characters” over “fewer joins”.
Each derived table in the query takes up 15 * numSelects + 2 characters.
To generate a given number of rows requires CEILING(LOG(numRows) / LOG(numSelects)) derived tables.
If I want 1000 rows in the result, I’d need 10 derived tables of two rows each (320 characters), or 7 of three rows (329 characters), or 3 of ten rows (456 characters).
For any given numRows, what’s the minimum of CEILING(LOG(numRows) / LOG(numSelects)) * (15 * numSelects + 2) ?
The answer varies with numRows – 2, 3, or 4 are alternately best when generating less than 130 rows, but any higher than that and the best choice usually turns out to be 3.
(SELECT 1 UNION SELECT 2 UNION SELECT 3)