Category: Technical

Anything that uses 1s and 0s

  • mysqldump privileges required

    “mysqldump requires at least the SELECT privilege for dumped tables, SHOW VIEW for dumped views, TRIGGER for dumped triggers, and LOCK TABLES if the –single-transaction option is not used. Certain options might require other privileges as noted in the option descriptions.”
    http://dev.mysql.com/doc/refman/5.6/en/mysqldump.html

    Format Default Privileges Required
    –add-drop-database Off  
    –add-drop-table On in –opt  
    –add-drop-trigger Off  
    –add-locks On in –opt  
    –all-databases Off SELECT, SHOW DATABASES ON *.*
    –allow-keywords Off  
    –apply-slave-statements Off  
    –bind-address=ip_address Off  
    –comments On  
    –compact Off  
    –compatible=name[,name,…] Off  
    –complete-insert Off  
    –create-options On in –opt  
    –databases Off  
    –debug[=debug_options] Off  
    –debug-check Off  
    –debug-info Off  
    –default-auth=plugin Off  
    –default-character-set=charset_name utf8/latin1  
    –delayed-insert Off  
    –delete-master-logs Off SUPER ON *.*
    –disable-keys On in –opt  
    –dump-date On in –comments  
    –dump-slave[=value] Off SUPER or REPLICATION CLIENT ON *.*
    –events Off EVENT
    –extended-insert On in –opt  
    –fields-enclosed-by=string ” in –tab  
    –fields-escaped-by ‘\\’ in –tab  
    –fields-optionally-enclosed-by=string Off  
    –fields-terminated-by=string ‘\t’ in –tab  
    –flush-logs Off RELOAD ON *.*
    –flush-privileges Off  
    –help Off  
    –hex-blob Off  
    –host localhost  
    –ignore-table=db_name.tbl_name Off  
    –include-master-host-port Off  
    –insert-ignore Off  
    –lines-terminated-by=string ‘\n’ in –tab  
    –lock-all-tables Off LOCK TABLES ON *.*
    –lock-tables On in –opt LOCK TABLES
    –log-error=file_name Off  
    –login-path=name Off controlled at OS level
    –master-data Off RELOAD ON *.*
    SUPER or REPLICATION CLIENT ON *.*
    –max_allowed_packet=value 24MB  
    –net_buffer_length=value 1022KB  
    –no-autocommit Off  
    –no-create-db Off  
    –no-create-info Off  
    –no-data Off  
    –no-set-names Off  
    –no-tablespaces Off  
    –opt On  
    –order-by-primary Off  
    –password[=password] Off  
    –pipe Off  
    –plugin-dir=path Off  
    –port=port_num 3306  
    –quick On in –opt  
    –quote-names On  
    –replace Off  
    –result-file=file Off  
    –routines Off SELECT ON mysql.proc
    –set-charset On  
    –set-gtid-purged=value Auto  
    –single-transaction Off  
    –skip-add-drop-table Off in –opt  
    –skip-add-locks Off in –opt  
    –skip-comments Off  
    –skip-compact On  
    –skip-disable-keys Off in –opt  
    –skip-extended-insert Off in –opt  
    –skip-opt Off  
    –skip-quick Off in –opt  
    –skip-quote-names Off  
    –skip-set-charset Off  
    –skip-triggers Off  
    –skip-tz-utc Off  
    –ssl-ca=file_name Off  
    –ssl-capath=dir_name Off  
    –ssl-cert=file_name Off  
    –ssl-cipher=cipher_list Off  
    –ssl-crl=file_name Off  
    –ssl-crlpath=dir_name Off  
    –ssl-key=file_name Off  
    –ssl-verify-server-cert Off  
    –tab=path Off  
    –tables Off  
    –triggers On TRIGGER since 5.1.23, SUPER ON *.* earlier
    –tz-utc On  
    –user=user_name system user on Linux, ‘ODBC’ on Windows  
    –verbose Off  
    –version Off  
    –where=’where_condition’ Off  
    –xml Off  
  • Where’s my line?

    mysql -e "select * from test.t where d < '2013-07-17 17:00:00'"
    +---------------------+
    | d                   |
    +---------------------+
    | 2013-07-17 15:34:19 |
    +---------------------+
    
    mysqldump -t --compact test t --where="d < '2013-07-17 17:00:00'"
    (no output)

    Where's my line?

    --tz-utc

  • Five Number Summary

    From Freenode: how do you generate a five number summary in MySQL? There is no “median” aggregate function built in. You could do some clever things involving self joins or temporary tables, or build an aggregate UDF – see the comments section in the manual for those approaches.

    Here’s another way using a single query. Be sure to set group_concat_max_len high enough for your data, and since it relies on string manipulation, it’s probably not a good choice if your data is millions of rows.

    First, a helper function to get the Nth element of a comma-delimited string, just to make the query shorter:

    CREATE FUNCTION LIST_ELEM(inString text, pos int) 
    RETURNS TEXT DETERMINISTIC 
    RETURN SUBSTRING_INDEX(SUBSTRING_INDEX(inString, ',', pos), ',', -1);
    

    Now, fetching the min, max, median, and first and third quartiles (computing method 2) for each group:

    SELECT 
    
      groupId,
    
      GROUP_CONCAT(data ORDER BY data) AS dataSet,
    
      MIN(data) AS min,
    
      (
        LIST_ELEM(GROUP_CONCAT(data ORDER BY data), CEIL(COUNT(*)/4))
        + LIST_ELEM(GROUP_CONCAT(data ORDER BY data), FLOOR(COUNT(*)/4 + 1))
      ) / 2 AS q1,
    
      (
        LIST_ELEM(GROUP_CONCAT(data ORDER BY data), CEIL(COUNT(*)/2))
        + LIST_ELEM(GROUP_CONCAT(data ORDER BY data), FLOOR(COUNT(*)/2 + 1))
      ) / 2 AS median,
    
      (
        LIST_ELEM(GROUP_CONCAT(data ORDER BY data DESC), CEIL(COUNT(*)/4))
        + LIST_ELEM(GROUP_CONCAT(data ORDER BY data DESC), FLOOR(COUNT(*)/4 + 1))
      ) / 2 AS q3,
    
      MAX(data) AS max
    
    FROM t 
    GROUP BY groupId;
    +---------+---------------------+------+------+--------+------+------+
    | groupId | dataSet             | min  | q1   | median | q3   | max  |
    +---------+---------------------+------+------+--------+------+------+
    |       1 | 0,0,1,2,13,27,61,63 |    0 |  0.5 |    7.5 |   44 |   63 |
    |       2 | 0,0,1,2,25          |    0 |    0 |      1 |    2 |   25 |
    +---------+---------------------+------+------+--------+------+------+
  • Default user

    It came up twice in two days: if you do not specify the user name when connecting, what gets picked?

    The manual says:
    “On Unix, most MySQL clients by default try to log in using the current Unix user name as the MySQL user name, but that is for convenience only.”
    http://dev.mysql.com/doc/refman/5.6/en/user-names.html

    “The default user name is ODBC on Windows or your Unix login name on Unix.”
    http://dev.mysql.com/doc/refman/5.6/en/connecting.html

    Let’s be a little more specific. The relevant section of code is in libmysql/libmysql.c

    On Linux, we check the following in this order:
    – if (geteuid() == 0), user is “root”
    – getlogin()
    – getpwuid(geteuid())
    – environment variables $USER, $LOGNAME, $LOGIN
    If none of those return non-NULL results, use “UNKNOWN_USER”

    On Windows:
    – environment variable $USER
    If that’s not set, use “ODBC”.

    I wondered why on Windows we check $USER but not $USERNAME. I gather that it’s an ODBC thing.

  • Connection Conundrum

    Define a user like this:

    GRANT ALL ON *.* TO 'myuser'@'localhost' IDENTIFIED BY 'super$ecret';

    Then try a PHP script like this:

    <?php
    mysqli_connect("localhost", "myuser", "super$ecret");
    ?>

    What happens and why? How could you avoid it?

    And for glory, what single line could you add that would prevent the error, without making any changes to the mysqli_connect line?

  • Auto_increment gaps

    Why are there gaps in my auto_increment sequence, even if there are no deletes or rolled back transactions?

    Is it a bug?

    The manual says, “For lock modes 1 or 2, gaps may occur between successive statements because for bulk inserts the exact number of auto-increment values required by each statement may not be known and overestimation is possible.”

    Where does that overestimation come from?

    An example to illustrate:

    DROP TABLE IF EXISTS t;
    CREATE TABLE t (a bigint unsigned auto_increment primary key) ENGINE=InnoDB SELECT NULL AS a;
    /* #1 */ INSERT INTO t SELECT NULL FROM t;
    /* #2 */ INSERT INTO t SELECT NULL FROM t;
    /* #3 */ INSERT INTO t SELECT NULL FROM t;
    /* #4 */ INSERT INTO t SELECT NULL FROM t;
    SELECT * FROM t;

    +----+
    | a |
    +----+
    | 1 |
    | 2 |
    | 3 |
    | 4 |
    | 6 |
    | 7 |
    | 8 |
    | 9 |
    | 13 |
    | 14 |
    | 15 |
    | 16 |
    | 17 |
    | 18 |
    | 19 |
    | 20 |
    +----+
    16 rows in set (0.02 sec)

    Notice that 5 and 10-12 are missing. If we did another insert, we’d be missing 21-27 (try it and see!)

    Here’s a model of what MySQL is doing:

    Create the table and simultaneously insert a single row. That is the row where a=1.

    #1: Insert as many rows as there are in the table (it’s one row, but MySQL doesn’t know that.)
    – Grab a chunk of auto_increment values. How many in the chunk? One – the value ‘2’. Insert it (one row inserted).
    – No more rows to insert, so all done.

    #2: Insert as many rows as there are in the table (it’s two rows, but MySQL doesn’t know that.)
    – Grab a chunk of auto_increment values. How many in the chunk? One – the value ‘3’. Insert it (one row inserted).
    – Still more rows to insert. Grab another chunk, twice as big as before – two values, ‘4’ and ‘5’. Insert the ‘4’ (two rows inserted).
    – No more rows to insert. Discard the left over ‘5’.

    #3: Insert as many rows as there are in the table (it’s four rows, but MySQL doesn’t know that.)
    – Grab a chunk of auto_increment values. How many in the chunk? One – the value ‘6’. Insert it (one row inserted).
    – Still more rows to insert. Grab another chunk, twice as big as before – two values, ‘7’ and ‘8’. Insert them (three rows inserted).
    – Still more rows to insert. Grab another chunk, twice as big as before – four values, ‘9’, ’10’, ’11’, ’12’. Insert the ‘9’ (four rows inserted).
    – No more rows to insert. Discard the left over ’10’, ’11’, and ’12’.

    #4: Insert as many rows as there are in the table (it’s eight rows, but MySQL doesn’t know that.)
    – Grab a chunk of auto_increment values. How many in the chunk? One – the value ’13’. Insert it (one row inserted).
    – Still more rows to insert. Grab another chunk, twice as big as before – two values, ’14’ and ’15’. Insert them (three rows inserted).
    – Still more rows to insert. Grab another chunk, twice as big as before – four values, ’16’, ’17’, ’18’, ’19’. Insert them (seven rows inserted).
    – Still more rows to insert. Grab another chunk, twice as big as before – eight values, ’20’, ’21’, ’22’, …, ’27’. Insert the ’20’ (eight rows inserted).
    – No more rows to insert. Discard the left over ’21’, ’22’, etc.

    The gap can get as big as 65535 (I didn’t look in the code to confirm that, it’s just what running the test above a few more times seems to suggest).

    When innodb_autoinc_lock_mode=1, there can be gaps between statements, but not within a statement, because there’s a lock on the auto_increment until we’re done. #4 above is guaranteed to get 8 consecutive values, you just might not be sure where they are going to start (well, now you are, because you read this post).

    When innodb_autoinc_lock_mode=2, there can be gaps within a statement, because the auto_increment is not locked. Imagine we’re in the middle of #4 above. Our statement is inserting the ’14’ and ’15’, when another statement comes along wanting just a single auto_increment value. It gets the ’16’. Now it’s our turn to ask for another chunk, and we get ’17’, ’18’, ’19’, ’20’. While we’re doing those, another statement comes along and steals our ’21’. So the last row for our statement is ’22’.

  • MidPoint of a LineString

    CREATE FUNCTION MidPoint(ls LINESTRING)
    RETURNS POINT
    DETERMINISTIC
    BEGIN
    DECLARE len double;
    DECLARE workLength double DEFAULT 0;
    DECLARE workPoint int DEFAULT 1;
    DECLARE point0 POINT;
    DECLARE point1 POINT;
    DECLARE distanceOver double;
    DECLARE segmentLength double;

    IF (NumPoints(ls) = 1) THEN return PointN(ls, 1); END IF;

    -- find half the length of the linestring
    SET len := GLength(ls) / 2;

    -- walk the linestring until we exceed the distance
    WHILE (workLength < len) DO SET point0 = PointN(ls, workPoint); SET workPoint := workPoint + 1; SET point1 = PointN(ls, workPoint); SET segmentLength = GLength(LineString(point0, point1)); SET workLength := workLength + segmentLength; END WHILE; -- distance to backup SET distanceOver = workLength - len; -- midpoint is distanceOver back down the last segement RETURN POINT( X(point1) - distanceOver / segmentLength * (X(point1) - X(point0)), Y(point1) - distanceOver / segmentLength * (Y(point1) - Y(point0)) ); END

  • Quiz: A More Perfect UNION

    Today I saw a query like this:
    SELECT d FROM t;

    Performance was terrible. I ran out of patience after several minutes and killed the thread.

    I changed the query to this:
    (SELECT d FROM t) UNION ALL (SELECT NULL LIMIT 0);

    It completed in under 3 seconds.

    Can you explain how a no-op UNION so dramatically improved performance? (I couldn’t have, without help from Jesper Krogh and James Day).

    http://dev.mysql.com/doc/refman/5.5/en/cursor-restrictions.html
    The field `d` is a varchar and is bigger than it needs to be.
    http://dev.mysql.com/doc/refman/5.5/en/memory-storage-engine.html
    http://dev.mysql.com/doc/refman/5.5/en/internal-temporary-tables.html
    The query is in a cursor. MySQL materializes cursors to a temporary table, using the MEMORY engine if it can, then converting to MyISAM if it gets too big. The MEMORY engine doesn’t support variable length rows, so VARCHARs are converted to CHARs. The field `d` only contains one character in each row, but is defined as a VARCHAR(20000) – which means the temporary table will grow very large. Even when converted to MyISAM, it retains that fixed width format, so the temp table is huge.

    The “presence of any column larger than 512 bytes in the SELECT list, if UNION or UNION ALL is used,” prevents use of an in-memory temporary table. That means the temporary table will be created directly as MyISAM on the disk, and will keep the VARCHAR format, so it will be much smaller. Converting the field to TEXT would have the same effect.

  • Yet another way to count occurences of a substring in a string

    In response to http://forge.mysql.com/snippets/view.php?id=60 – adding support to count strings, not just characters

    (
      character_length(<string1>)
      - character_length(REPLACE(<string1>, <searchString>, '')
    ) / character_length(<searchString>)
  • Easter date

    Finds the date for Easter (Western) given a year.
    Sample usage: SELECT easter(YEAR(NOW()));

    DELIMITER //
    CREATE FUNCTION easter(inYear YEAR) RETURNS DATE DETERMINISTIC
    BEGIN
        DECLARE a, b, c, d, e, k, m, n, p, q INT;
    
        DECLARE easter DATE;
    
        SET k = FLOOR(inYear / 100);
        SET a = MOD(inYear, 19);
        SET b = MOD(inYear, 4);
        SET c = MOD(inYear, 7);
        SET q = FLOOR(k / 4);
        SET p = FLOOR((13 + 8 * k) / 25);
        SET m = MOD((15-p+k-q), 30);
        SET d = MOD((19 * a + m), 30);
        SET n = MOD((4+k-q), 7);
        SET e = MOD((2*b+4*c+6*d+n), 7);
    
        SET easter = CASE
            WHEN d + e <= 9 THEN CONCAT_WS('-', inYear, '03', 22 + d + e)
            WHEN d = 29 AND e = 6 THEN CONCAT_WS('-', inYear, '04-19')
            WHEN d = 28 AND e = 6 AND a > 10 THEN CONCAT_WS('-', inYear, '04-18')
            ELSE CONCAT_WS('-', inYear, '04', LPAD(d + e - 9, 2, 0))
        END;
    
        RETURN easter;
    END
    //
    DELIMITER ;