A Little Noise

July 20, 2007

Complete Set

Filed under: MySQL — snoyes @ 11:40 am

Occasionally in #mysql the question pops up, “How can I get all the values from a table that satisfy all of multiple conditions?”

Then we beat the questioner with a large stick, until they give an example of what they really mean.

SELECT * FROM quizAnswers;
+-------------+----------+
| studentName | question |
+-------------+----------+
| seekwill    | A        |
| seekwill    | B        |
| seekwill    | C        |
| roxlu       | A        |
| fury        | B        |
| fury        | B        |
+-------------+----------+

Find all the students who have answered both questions ‘A’ and ‘B’.

First shot at it is to group them together, and count their answers:

SELECT studentName FROM quizAnswers GROUP BY studentName HAVING COUNT(*) = 2;

Sometimes that works, but it won’t in this case: fury has 2 answers, but to the same question. seekwill has 3 answers. We’ll get the wrong result.

UPDATE: As Arlen notes, this query works just fine if we change COUNT(*) to COUNT(DISTINCT question). Now back to the regularly scheduled exercise in obfuscation.

Next we turn to subqueries:

SELECT DISTINCT studentName 
FROM quizAnswers 
WHERE 
    studentName IN (
        SELECT studentName FROM quizAnswers WHERE question = 'A'
    ) 
    AND studentNAme IN (
        SELECT studentName FROM quizAnswers WHERE question = 'B'
);
+-------------+
| studentName |
+-------------+
| seekwill    |
+-------------+

Gets the right answer. But it doesn’t work in 4.0 or earlier (and people on Freenode always seem to use a shared host running 3.23). So we’ll rewrite it to use JOIN instead.

SELECT DISTINCT studentName 
FROM 
    quizAnswers AS a 
    JOIN quizAnswers AS b USING (studentName) 
WHERE 
    a.question = 'A' 
    AND b.question = 'B';
+-------------+
| studentName |
+-------------+
| seekwill    |
+-------------+

“But wait!” they say. “This example only had 2 values. My real data has 22 possible values. I don’t want to self-join the table 22 times (or write 22 subqueries). And the real values are random strings, not nice and sequential, so don’t try building the query in a loop or anything!”

Never fear! Obscure and infrequently used string and aggregate functions to the rescue!

SELECT studentName 
FROM quizAnswers 
GROUP BY studentName 
HAVING BIT_OR(1 << FIELD(question, 'A', 'B') - 1) = 3;
+-------------+
| studentName |
+-------------+
| seekwill    |
+-------------+

Handling the rest of those 22 values is a simple matter of adding them to the parameters to FIELD(), and changing that 3 to (2^number_of_values - 1). We'll even put the comma-separated values into a user variable and let MySQL figure out the rest (note the switch to FIND_IN_SET here, since now we're using a single string of values):

SET @q = 'A,B';

SELECT studentName 
FROM quizAnswers 
GROUP BY studentName 
HAVING 
        BIT_OR(1 << FIND_IN_SET(question, @q) - 1) 
    = 
        (1 << LENGTH(@q) - LENGTH(REPLACE(@q, ',', '')) + 1) - 1; -- This is 2^numValues - 1
+-------------+
| studentName |
+-------------+
| seekwill    |
+-------------+

How does it work? I'll let you look up the functions on your own, but here are the values each generates:

SELECT 
    studentName, 
    question, 
    FIND_IN_SET(question, 'A,B,C') AS position, 
    1 << FIND_IN_SET(question, 'A,B,C') - 1 AS bitmap 
FROM quizAnswers;
+-------------+----------+----------+--------+
| studentName | question | position | bitmap |
+-------------+----------+----------+--------+
| seekwill    | A        |        1 |      1 |
| seekwill    | B        |        2 |      2 |
| seekwill    | C        |        3 |      4 |
| roxlu       | A        |        1 |      1 |
| fury        | B        |        2 |      2 |
| fury        | B        |        2 |      2 |
+-------------+----------+----------+--------+
SELECT 
    studentName, 
    BIT_OR(1 << FIND_IN_SET(question, 'A,B,C') - 1) AS groupBitMap 
FROM quizAnswers 
GROUP BY studentName;
+-------------+-------------+
| studentName | groupBitMap |
+-------------+-------------+
| fury        |           2 |
| roxlu       |           1 |
| seekwill    |           7 |
+-------------+-------------+

1 Comment »

  1. Hi Scott. Liked the article – would like to feature it in the Dec issue of the Pipelines Newsletter. Here is our November issue:

    http://www.quest-pipelines.com/newsletter-v8/newsletter_1107.htm

    Look forward to hearing back from you.

    Comment by Robin — November 15, 2007 @ 9:49 am

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress