A Little Noise

November 24, 2009

group date

Filed under: MySQL — snoyes @ 8:49 am

A non-rigorous, non-scientific, totally off-the-cuff test of which function to pick when you need to group by year and month.

I populated a table with 262K rows of random dates, and then ran

SELECT %s, COUNT(*) FROM table GROUP BY %s ORDER BY NULL

with various functions, which should all result in the same grouping. I repeated each query five times and show the average time, using three different column types DATE, DATETIME, and TIMESTAMP.

expression DATE DATETIME TIMESTAMP
EXTRACT(YEAR_MONTH FROM d) 0.362 0.369 0.581
LAST_DAY(d) 0.374 0.389 0.582
DATE_SUB(d, INTERVAL DAY(d) DAY) 0.429 0.882 1.452
d – INTERVAL DAY(d) DAY 0.429 0.887 1.535
SUBSTRING(d, 1, 7) 0.454 0.488 0.719
YEAR(d), MONTH(d) 1.046 1.126 2.045
MONTH(d), YEAR(d) 1.116 1.196 2.112
LEFT(d, 7) 1.307 1.405 2.123
DATE_FORMAT(d, ‘%Y%m’) 1.480 1.565 2.112
DATE_FORMAT(d, ‘%Y-%m’) 1.514 1.615 2.564
DATE_FORMAT(d, ‘%m%Y’) 1.517 1.604 2.420
DATE_FORMAT(d, ‘%m-%Y’) 1.562 1.656 2.465
MONTHNAME(d), YEAR(d) 1.613 1.713 2.812
YEAR(d), MONTHNAME(d) 1.663 1.766 2.873

And just in case you want to extract a full date (which really only makes sense for datetime and timestamp):

expression DATE DATETIME TIMESTAMP
DATE(d) 0.357 0.374 0.591
EXTRACT(YEAR_MONTH FROM d), EXTRACT(DAY FROM d) 0.377 0.407 0.730
EXTRACT(DAY FROM d), EXTRACT(YEAR_MONTH FROM d) 0.395 0.422 0.751
MONTH(d), YEAR(d), DAY(d) 0.395 0.426 0.870
YEAR(d), DAY(d), MONTH(d) 0.398 0.440 0.862
YEAR(d), MONTH(d), DAY(d) 0.406 0.441 0.867
DAY(d), YEAR(d), MONTH(d) 0.409 0.444 0.859
LEFT(d, 10) 0.437 0.487 0.728
SUBSTRING_INDEX(d, ‘ ‘, 1) 0.439 0.475 0.743
DAY(d), MONTH(d), YEAR(d) 0.441 0.477 0.901
MONTH(d), DAY(d), YEAR(d) 0.442 0.472 0.914
SUBSTRING(d, 1, 10) 0.460 0.496 0.729
DATE_FORMAT(d, ‘%d%Y%m’) 0.539 0.571 0.852
DATE_FORMAT(d, ‘%m%Y%d’) 0.542 0.570 0.841
DATE_FORMAT(d, ‘%m%d%Y’) 0.543 0.572 0.846
DATE_FORMAT(d, ‘%Y%m%d’) 0.544 0.570 0.842
DATE_FORMAT(d, ‘%Y%d%m’) 0.544 0.572 0.843
DATE_FORMAT(d, ‘%d-%Y-%d’) 0.547 0.574 0.848
DATE_FORMAT(d, ‘%d%m%Y’) 0.549 0.573 0.842
DATE_FORMAT(d, ‘%m-%Y-%d’) 0.552 0.583 0.869
DATE_FORMAT(d, ‘%Y-%m-%d’) 0.558 0.583 0.854
DATE_FORMAT(d, ‘%Y-%d-%m’) 0.569 0.593 0.867
d – INTERVAL HOUR(d) HOUR – INTERVAL MINUTE(d) MINUTE – INTERVAL SECOND(d) SECOND 0.573 0.653 1.249
DATE_FORMAT(d, ‘%d-%m-%Y’) 0.598 0.637 0.920
DATE_FORMAT(d, ‘%m-%d-%Y’) 0.601 0.633 0.904

November 21, 2009

Drilling a hole in a stainless steel sink

Filed under: Habitat — snoyes @ 10:32 pm

I needed to drill a couple of 1″ holes in the stainless steel kitchen sink to install a pair of soap and lotion dispensers.

The most popular answer online was “use a bi-metal hole saw, low speed, lots of pressure, lots of lubricant.” I tried that, and the pilot bit wouldn’t even scratch it.

I happened to have a hammer drill and some 3/8″ bits, and those went through … eventually. So I tried to enlarge it with a 1″ hammer drill bit. I got a jagged hole and a burned out hammer drill motor, which I have to ship off for repair now, having used it only three times.

The next most popular answer seemed to be a conduit knockout punch, but that requires a pilot hole of 3/4″ (if I could make that, I wouldn’t need this punch, now would I?) and Lowe’s only carries them in a set of 4 for $150.

Fortunately, the manager appeared and talked me down to a $55 Greenlee step bit. Went home, popped it in my little “backup” drill, and had two nice clean holes in well under ten minutes.

Some tips:

– I bought a Greenlee step drill bit, $55 at Lowe’s. They also had an Irwin for $53, but the manager said the Greenlee is twice the quality of the Irwin.
– Measure the hole location every which way you can (remember, you only get one shot at this!), and then make a little starter divot with a hammer and nail or punch so the drill doesn’t wander over the surface.
– Don’t run the drill at full speed; you will work-harden the sink and dull the bit. If it’s smoking, you were going too fast.
– Don’t run the drill too slow. Apply firm downward pressure (we’re talking “both hands on the drill” firm, not “the combined body weight of you and your cousin Earl” firm.) If you run it too slow or don’t press firmly, when you finish one step and drop to the next step, it will gouge two chunks out of the side of the hole and lock the bit in place.
– Use plenty of lubricant. WD-40 is fine; any oil ought to suffice. Spray it on liberally before and during drilling.
– Don’t forget to mark on the drill bit where to stop – wrap a piece of masking tape around it at the right height, or you may get to repeat this procedure on a new sink.

November 20, 2009

GROUP UNCONCAT

Filed under: MySQL — snoyes @ 3:18 pm

Dunno why you’d rather do this in SQL than in your application layer, but if you do, here’s one way to turn a delimited string of values back into multiple rows – just the opposite of GROUP_CONCAT:

SET @sourceString = 'a,b,c,d,e';
SET @sql = CONCAT('INSERT INTO t VALUES (\\'', REPLACE(@sourceString, ',', '\\'),(\\''), '\\')');
PREPARE myStmt FROM @sql;
EXECUTE myStmt;

Just to show what’s going on:

mysql> SELECT @sql;
+----------------------------------------------------+
| @sql                                               |
+----------------------------------------------------+
| INSERT INTO t VALUES ('a'),('b'),('c'),('d'),('e') |
+----------------------------------------------------+

November 19, 2009

Fine-tuning FULLTEXT: Adding word characters

Filed under: MySQL FAQ — snoyes @ 8:27 am

Perhaps you want “.” or “:” or “-” or something to be treated as a word character so that you can do a fulltext match against an IP or MAC address.

If you want to change the set of characters that are considered word characters, you can do so in two ways…
* Modify the MySQL source … and recompile MySQL.
* Modify a character set file: This requires no recompilation.

Thus sayeth the Manual. What it doesn’t say is just what to modify in that character set file. (Edit: Now it does.) I had to ask MySQL developer Alexander Barkov for help to get it right.

In the user comments on that manual page, John Navratil has provided most of the answer, but there’s an important step missing: adding a new collation.
(more…)

November 11, 2009

Choosing index prefix length

Filed under: MySQL — snoyes @ 9:12 am

It can be handy to index just part of a long character column. The strings might vary enough in just the first few characters to get the same cardinality out of a partial index as a full index would give, and the partial index takes up much less disk space and memory.

Indexes in The Manual

But how long should that prefix be? Should it be the same cardinality as the original data? Within 10%?

I dunno; somebody else can blog about that decision. I’m just interested in a query that finds the number which satisfies your requirements.

I have a table called ‘numbers’ with an int field ‘num’ that just holds values 1 through a large number; for this case, it needs to be no larger than the length of the longest string in the table.

Then:

SELECT
  numbers.num AS prefixLength,
  dt.originalCardinality,
  (SELECT COUNT(DISTINCT LEFT(stringField, numbers.num)) FROM yourTable) AS prefixCardinality
FROM
  numbers
  JOIN (SELECT COUNT(DISTINCT stringField) AS originalCardinality FROM yourTable) AS dt
WHERE
  (SELECT COUNT(DISTINCT LEFT(stringField, numbers.num)) FROM yourTable) >= .90 * originalCardinality
ORDER BY numbers.num
LIMIT 1;

Powered by WordPress