A Little Noise

March 2, 2007

Octets from INET_ATON

Filed under: MySQL — snoyes @ 6:02 pm

So you’ve follow Arjen’s suggestion and are storing all your IP addresses in a nice integer format, having converted them with INET_ATON and selecting them with INET_NTOA.

But now you want to use one of the octets, maybe to find out how many connections you get from 192.x.x.x.

One way is to turn the integer back into a string, and use one of the substring functions to grab the parts you need. Let’s try that out for speed (I’ll use 192.168.1.1 in my examples, which translates to 3232235777):

mysql> SELECT BENCHMARK(10000000, SUBSTRING_INDEX(INET_NTOA(3232235777), '.', 1) = '192');
+-----------------------------------------------------------------------------+
| BENCHMARK(10000000, SUBSTRING_INDEX(INET_NTOA(3232235777), '.', 1) = '192') |
+-----------------------------------------------------------------------------+
|                                                                           0 |
+-----------------------------------------------------------------------------+
1 row in set (6.28 sec)

Is there a better way? Of course; I’d hardly be writing this blog entry if there weren’t. According to the manual, the calculation for INET_ATON is
192x256^3 + 168x256^2 + 1x256^1 + 1x256^0
Notice all those 256s. 256 is 2^8, so all we have to do is shift the bits to the right by the appropriate multiple of 8, and then throw away anything higher than the 8th bit. We’ll check real quick that gives us the right number:

mysql> SELECT
    ->   (3232235777 >> 24) & 255 AS oct1,
    ->   (3232235777 >> 16) & 255 AS oct2,
    ->   (3232235777 >> 8) & 255 AS oct3,
    ->   (3232235777 >> 0) & 255 AS oct4;
+------+------+------+------+
| oct1 | oct2 | oct3 | oct4 |
+------+------+------+------+
|  192 |  168 |    1 |    1 |
+------+------+------+------+

Looks like the right answer. Let’s see how it compares.

mysql> SELECT BENCHMARK(10000000, (3232235777 >> 24) & 255 = 192);
+-----------------------------------------------------+
| BENCHMARK(10000000, (3232235777 >> 24) & 255 = 192) |
+-----------------------------------------------------+
|                                                   0 |
+-----------------------------------------------------+
1 row in set (1.61 sec)

Noticeably faster, and you get to use those bitwise operators that you’ve always wondered about but never had a reason to use.

3 Comments »

  1. SELECT BENCHMARK(100000000 (3232235777 & 4278190080) = 3221225472);

    .. would be even faster, for what it’s worth. These values are obviously calculated in beforehand, though.

    The right-right answer would be to have the ip field indexed and then e.g.:
    SELECT .. WHERE ip BETWEEN 192*POW(256,3) AND 193*POW(256,3)-1

    In that case the index is used. That would not be the case if the field is changed in the statement (by passing the value through a bitwise operator or converting it with INET_NTOA() ).

    Comment by Peter Brodersen — March 2, 2007 @ 7:11 pm

  2. The index sounds good. Let’s check that BETWEEN statement though:

    mysql> SELECT BENCHMARK(10000000, 3232235777 BETWEEN 192*POW(256,3) AND 193*POW(256, 3)-1);
    +------------------------------------------------------------------------------+
    | BENCHMARK(10000000, 3232235777 BETWEEN 192*POW(256,3) AND 193*POW(256, 3)-1) |
    +------------------------------------------------------------------------------+
    |                                                                            0 |
    +------------------------------------------------------------------------------+
    1 row in set (7.30 sec)

    Ick, even worse than the substring. We can improve on it some by pulling out that slow POW function.

    mysql> SELECT BENCHMARK(10000000, 3232235777 BETWEEN 192*16777216 AND 193*16777216-1);
    +-------------------------------------------------------------------------+
    | BENCHMARK(10000000, 3232235777 BETWEEN 192*16777216 AND 193*16777216-1) |
    +-------------------------------------------------------------------------+
    |                                                                       0 |
    +-------------------------------------------------------------------------+
    1 row in set (3.00 sec)

    Better, still not as good as bit-shifting. But the index lookup instead of a data seek will more than make up for that, provided MySQL chooses the index on that field. And if you take the math out altogether and supply constants for the BETWEEN, that’s the fastest of all.

    Is MySQL smart enough to evaluate a deterministic expression and use that instead? Will POW(256, 3) really be evaluated for each row, or just once?

    Comment by snoyes — March 2, 2007 @ 7:50 pm

  3. Nope, it’s not smart enough… EXPLAIN EXTENDED, then SHOW WARNINGS. I don’t think it does constant folding for any functions, from what I’ve seen. Could be wrong though.

    Comment by Xaprb — March 8, 2007 @ 5:55 pm

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress