A little two-part quiz. If you get the first one without peeking, you’re worth your pay as a DBA. If you get the second one without peeking, you may tell your boss that some random guy on the Internet says you deserve a raise.
Start with a text file, ‘test.txt’, with these three lines:
1
1
2
Set up the test in MySQL:
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (id int primary key);
LOAD DATA INFILE 'test.txt' INTO TABLE t1;
This gives “ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'
“, which is expected.
What’s in the table?
Show Answer ▼
It depends. If the engine is MyISAM, then you’ll have one row: the first ‘1’ from the file was inserted, everything else was skipped. If the engine is InnoDB, you’ll have no rows, because the transaction would rollback. So either 1 row or 0 rows.
Now, pretend you’re setting up a slave, or there was a crash and you’re recovering from binary logs:
mysqlbinlog bin.000001 | mysql
How many rows are in t1 now?
Show Answer ▼
mysql> SELECT * FROM t1;
+----+
| id |
+----+
| 1 |
| 2 |
+----+
2 rows in set (0.00 sec)
Why? The manual says, “mysqlbinlog converts LOAD DATA INFILE statements to LOAD DATA LOCAL INFILE statements,” and, “with LOCAL, the default duplicate-key handling behavior is the same as if IGNORE is specified.”
http://dev.mysql.com/doc/refman/5.1/en/mysqlbinlog.html
http://dev.mysql.com/doc/refman/5.1/en/load-data.html
Note that a replicating slave will handle it correctly – if the master used LOCAL (and therefore IGNORE), the slave will do IGNORE. If the master did not use LOCAL or IGNORE and so got the error above, the slave will do the same, and so the data will match. So be advised: replication and mysqlbinlog | mysql
may not give the same results.
Related
[…] LOAD DATA and recovery « A Little Noise […]
Pingback by Computer Security Devices « Network And Security | Network Security — March 25, 2010 @ 2:22 am
This probably falls under the category:
“It isn’t a bug if we document it…”
MySQL is really bad about documenting bad behavior instead of fixing it.
Comment by Justin Swanhart — March 25, 2010 @ 2:46 pm
Wouldn’t the second answer depend on the type of replication (row/statement)?
Comment by Patrice Levesque — March 20, 2014 @ 4:41 pm