Blog

  • Make an ALTER TABLE script with MySQL Workbench

    So you have two versions of a database sitting around – maybe you’ve been trying things out on your development box, and now you want to move it all over to production. If it’s just a new table here and an extra column there, you could do it by hand, but if you’ve made lots of changes, it would be nice to let the computer do it for you.

    Enter MySQL Workbenchmysqldiff.

  • Trigger on firing table

    CREATE TABLE t1 (id int);
    CREATE TRIGGER t1_ai AFTER INSERT ON t1 FOR EACH ROW 
      INSERT INTO t1 VALUES (NEW.id);
    

    That’s not allowed; it won’t let you create the trigger.

    CREATE TABLE t1 (id int);
    CREATE TABLE f1 (id int) ENGINE=FEDERATED CONNECTION='localhost/test/t1';
    CREATE TRIGGER t1_ai AFTER INSERT ON t1 FOR EACH ROW 
      INSERT INTO f1 VALUES (NEW.id);

    The server lets that through. Don’t do it.

  • On unions and derivatives

    On occasion I need to generate a number of rows. I don’t care about the contents, just how many rows there are. Maybe I want to create a table filled with a sequence of numbers. One common way to do this is to join a bunch of subqueries built out of unions:


    SET @count := 0;
    INSERT INTO table SELECT @count := @count + 1
    FROM (SELECT 1 UNION SELECT 2) a
    JOIN (SELECT 1 UNION SELECT 2) b
    JOIN (SELECT 1 UNION SELECT 2) c

    Of course, I could union more than two selects (e.g. SELECT 1 UNION SELECT 2 UNION SELECT 3). That makes each derived table longer, but reduces how many must be joined.

    Question: What’s the optimal size of each derived table?

    Defining “optimal”: since this is a one-off query, I don’t much care about performance (within reason), but I do care about how much typing I have to do. So I will favor “fewer characters” over “fewer joins”.

    Each derived table in the query takes up 15 * numSelects + 2 characters.

    To generate a given number of rows requires CEILING(LOG(numRows) / LOG(numSelects)) derived tables.

    If I want 1000 rows in the result, I’d need 10 derived tables of two rows each (320 characters), or 7 of three rows (329 characters), or 3 of ten rows (456 characters).

    For any given numRows, what’s the minimum of CEILING(LOG(numRows) / LOG(numSelects)) * (15 * numSelects + 2) ?

    The answer varies with numRows – 2, 3, or 4 are alternately best when generating less than 130 rows, but any higher than that and the best choice usually turns out to be 3.

    (SELECT 1 UNION SELECT 2 UNION SELECT 3)

  • group date

    A non-rigorous, non-scientific, totally off-the-cuff test of which function to pick when you need to group by year and month.

    I populated a table with 262K rows of random dates, and then ran

    SELECT %s, COUNT(*) FROM table GROUP BY %s ORDER BY NULL

    with various functions, which should all result in the same grouping. I repeated each query five times and show the average time, using three different column types DATE, DATETIME, and TIMESTAMP.

    expression DATE DATETIME TIMESTAMP
    EXTRACT(YEAR_MONTH FROM d) 0.362 0.369 0.581
    LAST_DAY(d) 0.374 0.389 0.582
    DATE_SUB(d, INTERVAL DAY(d) DAY) 0.429 0.882 1.452
    d – INTERVAL DAY(d) DAY 0.429 0.887 1.535
    SUBSTRING(d, 1, 7) 0.454 0.488 0.719
    YEAR(d), MONTH(d) 1.046 1.126 2.045
    MONTH(d), YEAR(d) 1.116 1.196 2.112
    LEFT(d, 7) 1.307 1.405 2.123
    DATE_FORMAT(d, ‘%Y%m’) 1.480 1.565 2.112
    DATE_FORMAT(d, ‘%Y-%m’) 1.514 1.615 2.564
    DATE_FORMAT(d, ‘%m%Y’) 1.517 1.604 2.420
    DATE_FORMAT(d, ‘%m-%Y’) 1.562 1.656 2.465
    MONTHNAME(d), YEAR(d) 1.613 1.713 2.812
    YEAR(d), MONTHNAME(d) 1.663 1.766 2.873

    And just in case you want to extract a full date (which really only makes sense for datetime and timestamp):

    expression DATE DATETIME TIMESTAMP
    DATE(d) 0.357 0.374 0.591
    EXTRACT(YEAR_MONTH FROM d), EXTRACT(DAY FROM d) 0.377 0.407 0.730
    EXTRACT(DAY FROM d), EXTRACT(YEAR_MONTH FROM d) 0.395 0.422 0.751
    MONTH(d), YEAR(d), DAY(d) 0.395 0.426 0.870
    YEAR(d), DAY(d), MONTH(d) 0.398 0.440 0.862
    YEAR(d), MONTH(d), DAY(d) 0.406 0.441 0.867
    DAY(d), YEAR(d), MONTH(d) 0.409 0.444 0.859
    LEFT(d, 10) 0.437 0.487 0.728
    SUBSTRING_INDEX(d, ‘ ‘, 1) 0.439 0.475 0.743
    DAY(d), MONTH(d), YEAR(d) 0.441 0.477 0.901
    MONTH(d), DAY(d), YEAR(d) 0.442 0.472 0.914
    SUBSTRING(d, 1, 10) 0.460 0.496 0.729
    DATE_FORMAT(d, ‘%d%Y%m’) 0.539 0.571 0.852
    DATE_FORMAT(d, ‘%m%Y%d’) 0.542 0.570 0.841
    DATE_FORMAT(d, ‘%m%d%Y’) 0.543 0.572 0.846
    DATE_FORMAT(d, ‘%Y%m%d’) 0.544 0.570 0.842
    DATE_FORMAT(d, ‘%Y%d%m’) 0.544 0.572 0.843
    DATE_FORMAT(d, ‘%d-%Y-%d’) 0.547 0.574 0.848
    DATE_FORMAT(d, ‘%d%m%Y’) 0.549 0.573 0.842
    DATE_FORMAT(d, ‘%m-%Y-%d’) 0.552 0.583 0.869
    DATE_FORMAT(d, ‘%Y-%m-%d’) 0.558 0.583 0.854
    DATE_FORMAT(d, ‘%Y-%d-%m’) 0.569 0.593 0.867
    d – INTERVAL HOUR(d) HOUR – INTERVAL MINUTE(d) MINUTE – INTERVAL SECOND(d) SECOND 0.573 0.653 1.249
    DATE_FORMAT(d, ‘%d-%m-%Y’) 0.598 0.637 0.920
    DATE_FORMAT(d, ‘%m-%d-%Y’) 0.601 0.633 0.904
  • Drilling a hole in a stainless steel sink

    I needed to drill a couple of 1″ holes in the stainless steel kitchen sink to install a pair of soap and lotion dispensers.

    The most popular answer online was “use a bi-metal hole saw, low speed, lots of pressure, lots of lubricant.” I tried that, and the pilot bit wouldn’t even scratch it.

    I happened to have a hammer drill and some 3/8″ bits, and those went through … eventually. So I tried to enlarge it with a 1″ hammer drill bit. I got a jagged hole and a burned out hammer drill motor, which I have to ship off for repair now, having used it only three times.

    The next most popular answer seemed to be a conduit knockout punch, but that requires a pilot hole of 3/4″ (if I could make that, I wouldn’t need this punch, now would I?) and Lowe’s only carries them in a set of 4 for $150.

    Fortunately, the manager appeared and talked me down to a $55 Greenlee step bit. Went home, popped it in my little “backup” drill, and had two nice clean holes in well under ten minutes.

    Some tips:

    – I bought a Greenlee step drill bit, $55 at Lowe’s. They also had an Irwin for $53, but the manager said the Greenlee is twice the quality of the Irwin.
    – Measure the hole location every which way you can (remember, you only get one shot at this!), and then make a little starter divot with a hammer and nail or punch so the drill doesn’t wander over the surface.
    – Don’t run the drill at full speed; you will work-harden the sink and dull the bit. If it’s smoking, you were going too fast.
    – Don’t run the drill too slow. Apply firm downward pressure (we’re talking “both hands on the drill” firm, not “the combined body weight of you and your cousin Earl” firm.) If you run it too slow or don’t press firmly, when you finish one step and drop to the next step, it will gouge two chunks out of the side of the hole and lock the bit in place.
    – Use plenty of lubricant. WD-40 is fine; any oil ought to suffice. Spray it on liberally before and during drilling.
    – Don’t forget to mark on the drill bit where to stop – wrap a piece of masking tape around it at the right height, or you may get to repeat this procedure on a new sink.

  • GROUP UNCONCAT

    Dunno why you’d rather do this in SQL than in your application layer, but if you do, here’s one way to turn a delimited string of values back into multiple rows – just the opposite of GROUP_CONCAT:

    SET @sourceString = 'a,b,c,d,e';
    SET @sql = CONCAT('INSERT INTO t VALUES (\\'', REPLACE(@sourceString, ',', '\\'),(\\''), '\\')');
    PREPARE myStmt FROM @sql;
    EXECUTE myStmt;

    Just to show what’s going on:

    mysql> SELECT @sql;
    +----------------------------------------------------+
    | @sql                                               |
    +----------------------------------------------------+
    | INSERT INTO t VALUES ('a'),('b'),('c'),('d'),('e') |
    +----------------------------------------------------+
    
  • Fine-tuning FULLTEXT: Adding word characters

    Perhaps you want “.” or “:” or “-” or something to be treated as a word character so that you can do a fulltext match against an IP or MAC address.

    If you want to change the set of characters that are considered word characters, you can do so in two ways…
    * Modify the MySQL source … and recompile MySQL.
    * Modify a character set file: This requires no recompilation.

    Thus sayeth the Manual. What it doesn’t say is just what to modify in that character set file. (Edit: Now it does.) I had to ask MySQL developer Alexander Barkov for help to get it right.

    In the user comments on that manual page, John Navratil has provided most of the answer, but there’s an important step missing: adding a new collation.
    (more…)

  • Choosing index prefix length

    It can be handy to index just part of a long character column. The strings might vary enough in just the first few characters to get the same cardinality out of a partial index as a full index would give, and the partial index takes up much less disk space and memory.

    Indexes in The Manual

    But how long should that prefix be? Should it be the same cardinality as the original data? Within 10%?

    I dunno; somebody else can blog about that decision. I’m just interested in a query that finds the number which satisfies your requirements.

    I have a table called ‘numbers’ with an int field ‘num’ that just holds values 1 through a large number; for this case, it needs to be no larger than the length of the longest string in the table.

    Then:

    SELECT
      numbers.num AS prefixLength,
      dt.originalCardinality,
      (SELECT COUNT(DISTINCT LEFT(stringField, numbers.num)) FROM yourTable) AS prefixCardinality
    FROM
      numbers
      JOIN (SELECT COUNT(DISTINCT stringField) AS originalCardinality FROM yourTable) AS dt
    WHERE
      (SELECT COUNT(DISTINCT LEFT(stringField, numbers.num)) FROM yourTable) >= .90 * originalCardinality
    ORDER BY numbers.num
    LIMIT 1;
    
  • Who’s there?

    While playing under the dining room table:

    Eden (squeaky voice): You can’t come in!
    Eden (deep voice): Why not?
    Eden (squeaky voice): Because you’re a monster!
    Eden (deep voice): Oh. Ok.

  • Careful with those required files in Ruby

    Getting Ruby to talk to MySQL was pretty easy in irb:

    C:\>irb
    irb(main):001:0> require 'rubygems'
    => true
    irb(main):002:0> require 'mysql'
    => true
    irb(main):003:0> db = Mysql.connect('localhost', 'root', 'password', 'test')
    => #<Mysql:0x2cc5270>
    irb(main):004:0>

    But then I tried the very same code from a file:

    C:\>ruby mysql.rb
    ./mysql.rb:3: uninitialized constant Mysql (NameError)
            from C:/Ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
            from C:/Ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
            from mysql.rb:2

    That require 'mysql' line looks in the current directory first … and I’d used the name mysql.rb for the file I was coding … so my code just included itself, and not the gem I wanted. Doh!