mysql’s --xml and some XSLT

January 29th, 2010

Somebody asked in Freenode the other day how to get their data out of MySQL into a specific XML format.

Both mysqldump and the mysql client have --xml options. The output from the cli looks like this:

<resultset statement="SELECT here">
  <row>
    <field name="field1">value</field>
    <field name="field2">value</field>
  </row>
  <row>
    <field name="field1">value</field>
    <field name="field2">value</field>
  </row>
</resultset>

I wanted to change those <field name=”fieldName”> into <fieldName> tags, and all it takes is a little XSLT, which I’ve uploaded here.

On Linux you can do something like this:

mysql -e "SELECT * FROM dbname.tablename" | xsltproc mysql.xsl -

Make an ALTER TABLE script with MySQL Workbench

January 28th, 2010

So you have two versions of a database sitting around - maybe you’ve been trying things out on your development box, and now you want to move it all over to production. If it’s just a new table here and an extra column there, you could do it by hand, but if you’ve made lots of changes, it would be nice to let the computer do it for you.

Enter MySQL Workbench.
Read the rest of this entry »

Trigger on firing table

January 26th, 2010
CREATE TABLE t1 (id int);
CREATE TRIGGER t1_ai AFTER INSERT ON t1 FOR EACH ROW
  INSERT INTO t1 VALUES (NEW.id);

That’s not allowed; it won’t let you create the trigger.

CREATE TABLE t1 (id int);
CREATE TABLE f1 (id int) ENGINE=FEDERATED CONNECTION='localhost/test/t1';
CREATE TRIGGER t1_ai AFTER INSERT ON t1 FOR EACH ROW
  INSERT INTO f1 VALUES (NEW.id);

The server lets that through. Don’t do it.

On unions and derivatives

December 4th, 2009

On occasion I need to generate a number of rows. I don’t care about the contents, just how many rows there are. Maybe I want to create a table filled with a sequence of numbers. One common way to do this is to join a bunch of subqueries built out of unions:


SET @count := 0;
INSERT INTO table SELECT @count := @count + 1
FROM (SELECT 1 UNION SELECT 2) a
JOIN (SELECT 1 UNION SELECT 2) b
JOIN (SELECT 1 UNION SELECT 2) c

Of course, I could union more than two selects (e.g. SELECT 1 UNION SELECT 2 UNION SELECT 3). That makes each derived table longer, but reduces how many must be joined.

Question: What’s the optimal size of each derived table?

Defining “optimal”: since this is a one-off query, I don’t much care about performance (within reason), but I do care about how much typing I have to do. So I will favor “fewer characters” over “fewer joins”.

Each derived table in the query takes up 15 * numSelects + 2 characters.

To generate a given number of rows requires CEILING(LOG(numRows) / LOG(numSelects)) derived tables.

If I want 1000 rows in the result, I’d need 10 derived tables of two rows each (320 characters), or 7 of three rows (329 characters), or 3 of ten rows (456 characters).

For any given numRows, what’s the minimum of CEILING(LOG(numRows) / LOG(numSelects)) * (15 * numSelects + 2) ?

The answer varies with numRows - 2, 3, or 4 are alternately best when generating less than 130 rows, but any higher than that and the best choice usually turns out to be 3.

(SELECT 1 UNION SELECT 2 UNION SELECT 3)

group date

November 24th, 2009

A non-rigorous, non-scientific, totally off-the-cuff test of which function to pick when you need to group by year and month.

I populated a table with 262K rows of random dates, and then ran

SELECT %s, COUNT(*) FROM table GROUP BY %s ORDER BY NULL

with various functions, which should all result in the same grouping. I repeated each query five times and show the average time, using three different column types DATE, DATETIME, and TIMESTAMP.

expression DATE DATETIME TIMESTAMP
EXTRACT(YEAR_MONTH FROM d) 0.362 0.369 0.581
LAST_DAY(d) 0.374 0.389 0.582
DATE_SUB(d, INTERVAL DAY(d) DAY) 0.429 0.882 1.452
d - INTERVAL DAY(d) DAY 0.429 0.887 1.535
SUBSTRING(d, 1, 7) 0.454 0.488 0.719
YEAR(d), MONTH(d) 1.046 1.126 2.045
MONTH(d), YEAR(d) 1.116 1.196 2.112
LEFT(d, 7) 1.307 1.405 2.123
DATE_FORMAT(d, ‘%Y%m’) 1.480 1.565 2.112
DATE_FORMAT(d, ‘%Y-%m’) 1.514 1.615 2.564
DATE_FORMAT(d, ‘%m%Y’) 1.517 1.604 2.420
DATE_FORMAT(d, ‘%m-%Y’) 1.562 1.656 2.465
MONTHNAME(d), YEAR(d) 1.613 1.713 2.812
YEAR(d), MONTHNAME(d) 1.663 1.766 2.873

And just in case you want to extract a full date (which really only makes sense for datetime and timestamp):

expression DATE DATETIME TIMESTAMP
DATE(d) 0.357 0.374 0.591
EXTRACT(YEAR_MONTH FROM d), EXTRACT(DAY FROM d) 0.377 0.407 0.730
EXTRACT(DAY FROM d), EXTRACT(YEAR_MONTH FROM d) 0.395 0.422 0.751
MONTH(d), YEAR(d), DAY(d) 0.395 0.426 0.870
YEAR(d), DAY(d), MONTH(d) 0.398 0.440 0.862
YEAR(d), MONTH(d), DAY(d) 0.406 0.441 0.867
DAY(d), YEAR(d), MONTH(d) 0.409 0.444 0.859
LEFT(d, 10) 0.437 0.487 0.728
SUBSTRING_INDEX(d, ' ', 1) 0.439 0.475 0.743
DAY(d), MONTH(d), YEAR(d) 0.441 0.477 0.901
MONTH(d), DAY(d), YEAR(d) 0.442 0.472 0.914
SUBSTRING(d, 1, 10) 0.460 0.496 0.729
DATE_FORMAT(d, '%d%Y%m') 0.539 0.571 0.852
DATE_FORMAT(d, '%m%Y%d') 0.542 0.570 0.841
DATE_FORMAT(d, '%m%d%Y') 0.543 0.572 0.846
DATE_FORMAT(d, '%Y%m%d') 0.544 0.570 0.842
DATE_FORMAT(d, '%Y%d%m') 0.544 0.572 0.843
DATE_FORMAT(d, '%d-%Y-%d') 0.547 0.574 0.848
DATE_FORMAT(d, '%d%m%Y') 0.549 0.573 0.842
DATE_FORMAT(d, '%m-%Y-%d') 0.552 0.583 0.869
DATE_FORMAT(d, '%Y-%m-%d') 0.558 0.583 0.854
DATE_FORMAT(d, '%Y-%d-%m') 0.569 0.593 0.867
d - INTERVAL HOUR(d) HOUR - INTERVAL MINUTE(d) MINUTE - INTERVAL SECOND(d) SECOND 0.573 0.653 1.249
DATE_FORMAT(d, '%d-%m-%Y') 0.598 0.637 0.920
DATE_FORMAT(d, '%m-%d-%Y') 0.601 0.633 0.904

Drilling a hole in a stainless steel sink

November 21st, 2009

I needed to drill a couple of 1″ holes in the stainless steel kitchen sink to install a pair of soap and lotion dispensers.

The most popular answer online was “use a bi-metal hole saw, low speed, lots of pressure, lots of lubricant.” I tried that, and the pilot bit wouldn’t even scratch it.

I happened to have a hammer drill and some 3/8″ bits, and those went through … eventually. So I tried to enlarge it with a 1″ hammer drill bit. I got a jagged hole and a burned out hammer drill motor, which I have to ship off for repair now, having used it only three times.

The next most popular answer seemed to be a conduit knockout punch, but that requires a pilot hole of 3/4″ (if I could make that, I wouldn’t need this punch, now would I?) and Lowe’s only carries them in a set of 4 for $150.

Fortunately, the manager appeared and talked me down to a $55 Greenlee step bit. Went home, popped it in my little “backup” drill, and had two nice clean holes in well under ten minutes.

Some tips:

- I bought a Greenlee step drill bit, $55 at Lowe’s. They also had an Irwin for $53, but the manager said the Greenlee is twice the quality of the Irwin.
- Measure the hole location every which way you can (remember, you only get one shot at this!), and then make a little starter divot with a hammer and nail or punch so the drill doesn’t wander over the surface.
- Don’t run the drill at full speed; you will work-harden the sink and dull the bit. If it’s smoking, you were going too fast.
- Don’t run the drill too slow. Apply firm downward pressure (we’re talking “both hands on the drill” firm, not “the combined body weight of you and your cousin Earl” firm.) If you run it too slow or don’t press firmly, when you finish one step and drop to the next step, it will gouge two chunks out of the side of the hole and lock the bit in place.
- Use plenty of lubricant. WD-40 is fine; any oil ought to suffice. Spray it on liberally before and during drilling.
- Don’t forget to mark on the drill bit where to stop - wrap a piece of masking tape around it at the right height, or you may get to repeat this procedure on a new sink.

GROUP UNCONCAT

November 20th, 2009

Dunno why you’d rather do this in SQL than in your application layer, but if you do, here’s one way to turn a delimited string of values back into multiple rows - just the opposite of GROUP_CONCAT:

SET @sourceString = 'a,b,c,d,e';
SET @sql = CONCAT('INSERT INTO t VALUES (\'', REPLACE(@sourceString, ',', '\'),(\''), '\')');
PREPARE myStmt FROM @sql;
EXECUTE myStmt;

Just to show what’s going on:

mysql> SELECT @sql;
+----------------------------------------------------+
| @sql                                               |
+----------------------------------------------------+
| INSERT INTO t VALUES ('a'),('b'),('c'),('d'),('e') |
+----------------------------------------------------+

Fine-tuning FULLTEXT: Adding word characters

November 19th, 2009

Perhaps you want “.” or “:” or “-” or something to be treated as a word character so that you can do a fulltext match against an IP or MAC address.

If you want to change the set of characters that are considered word characters, you can do so in two ways…
* Modify the MySQL source … and recompile MySQL.
* Modify a character set file: This requires no recompilation.

Thus sayeth the Manual. What it doesn’t say is just what to modify in that character set file. I had to ask MySQL developer Alexander Barkov for help to get it right.

In the user comments on that manual page, John Navratil has provided most of the answer, but there’s an important step missing: adding a new collation.
Read the rest of this entry »

Choosing index prefix length

November 11th, 2009

It can be handy to index just part of a long character column. The strings might vary enough in just the first few characters to get the same cardinality out of a partial index as a full index would give, and the partial index takes up much less disk space and memory.

Indexes in The Manual

But how long should that prefix be? Should it be the same cardinality as the original data? Within 10%?

I dunno; somebody else can blog about that decision. I’m just interested in a query that finds the number which satisfies your requirements.

I have a table called ‘numbers’ with an int field ‘num’ that just holds values 1 through several thousand.

Then:

SELECT
  num AS prefixLength,
  COUNT(DISTINCT LEFT(stringField, num)) AS prefixCardinality,
  dt.originalCardinality
FROM
  yourTable
  JOIN numbers
  JOIN (SELECT COUNT(DISTINCT stringField) AS originalCardinality FROM yourTable) AS dt
GROUP BY prefixLength
HAVING
  prefixCardinality >= .90 * originalCardinality
LIMIT 1;

Adjust the HAVING clause according to your needs.

Power Play, Eating a Hole in my Pocket

August 16th, 2009

The budget is a big obstacle in getting to Skokomish. We’re trying to save nearly 1/3 of my income. We’re not big spenders as a general rule - we rarely buy name brand anything, we don’t spend a lot on entertainment, and we don’t carry credit card balances so our only interest payments are the home mortgage and a student loan. Still, there doesn’t seem to be much accumulating in the savings account at the end of each month.

I looked at last month’s bills. There were two that immediately stood out where we could find the savings we need: electricity and food.

The power company took $400 last month. Most of that was to run the air conditioner; we’ve turned the temperature up 5 degrees and all the ceiling fans on high. It costs 76 cents to dry a load of clothes; the clothesline that sterilizes and bleaches our cloth diapers will see more use. My computer burns about $7 each month just sitting on all night; I’ve started shutting it down before bed. We already use CFLs instead of incandescent bulbs; we might burn those even less by adjusting to a ‘dawn to dusk’ schedule instead of a ‘mid-morning to well after dark’ schedule.

Food cost us about $1500 last month - $300 apiece, and that includes the baby still on just milk. About 2/3 was groceries, and the rest was eating out. For the next year, we’re going to eat out only Sunday lunch and maybe dinner, and all other meals at home. We’re going to keep our monthly grocery bill under $250, and we’re not going to sacrifice adequate nutrition and variety while doing it. How? We’re starting by eating through everything we have stockpiled in the house before we make another shopping trip.