A Little Noise


ERROR 3037 (22023): Invalid GIS data provided to function st_geometryfromtext.

1. Watch the parentheses. It's:

ST_GeomFromText('POLYGON((outerRing), (innerRing), (innerRing), ...)')

The inner rings are optional. If you have just the outer ring, then it's still:

ST_GeomFromText('POLYGON((outerRing))')

and not:

ST_GeomFromText('POLYGON(outerRing)')

2. Each ring has to start and end at the same point.

3. Watch the commas. Rings are comma-delimited sets of whitespace-delimited coordinate pairs:

x1 y1, x2 y2, x3 y3, x4 y4

and not:

x1 y1 x2 y2 x3 y3
x1, y1, x2, y2, x3, y3, x4, y4
(x1, y1), (x2, y2), (x3, y3)

or other variations on that theme.
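The rules above can be checked mechanically. Here's a minimal sketch in Python (a hand-rolled check for illustration, not MySQL's actual WKT parser):

```python
def valid_ring(ring):
    """Check one WKT ring: comma-delimited 'x y' pairs, closed."""
    points = [p.split() for p in ring.split(",")]
    # Rule 3: each comma-delimited piece must be a whitespace-delimited x y pair.
    if any(len(p) != 2 for p in points):
        return False
    # Rule 2: the ring must start and end at the same point.
    if len(points) < 4 or points[0] != points[-1]:
        return False
    return True

# Rule 1: even a lone outer ring keeps its own parentheses inside POLYGON((...)).
print(valid_ring("0 0, 0 1, 1 1, 0 0"))      # True
print(valid_ring("0 0, 0 1, 1 1"))           # False: not closed
print(valid_ring("0, 0, 0, 1, 1, 1, 0, 0"))  # False: commas between x and y
```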

Filed under: MySQL

On slave_parallel_workers and the logical clock

How can you tell if a given workload on the master could be replicated with many parallel workers on the slave?

The slave_parallel_type=LOGICAL_CLOCK is an implementation of a Lamport clock. The implementation is described in WL #7165 - including a neat little ASCII-art graphic.

Each event in the binary log is stamped with two values:
- The sequence_number increments for each commit.
- The last_committed is the sequence_number of the most recent commit at the time this transaction entered its prepare phase.

A slave worker cannot begin executing a transaction until its last_committed value is smaller than the sequence_number of every other transaction still executing.

mysqlbinlog mysql-bin.0000x | grep last_committed

#160118 15:31:34 server id 3  end_log_pos 1527 CRC32 0xcdf6bd8d         GTID    last_committed=0        sequence_number=1
#160118 15:31:34 server id 3  end_log_pos 2627 CRC32 0x72e5fa80         GTID    last_committed=1        sequence_number=2
#160119  9:59:17 server id 3  end_log_pos 2952 CRC32 0x7290d02f         GTID    last_committed=2        sequence_number=3
#160119  9:59:17 server id 3  end_log_pos 3248 CRC32 0x227fe513         GTID    last_committed=3        sequence_number=4
#160119  9:59:17 server id 3  end_log_pos 3544 CRC32 0x8a6a2b78         GTID    last_committed=4        sequence_number=5
#160119  9:59:17 server id 3  end_log_pos 5315 CRC32 0x35bca421         GTID    last_committed=4        sequence_number=6
#160119  9:59:17 server id 3  end_log_pos 7094 CRC32 0x75584a42         GTID    last_committed=4        sequence_number=7
#160119  9:59:17 server id 3  end_log_pos 8865 CRC32 0xe4ee1b54         GTID    last_committed=6        sequence_number=8
#160119  9:59:17 server id 3  end_log_pos 10636 CRC32 0xed1f7c48        GTID    last_committed=6        sequence_number=9
#160119  9:59:17 server id 3  end_log_pos 12409 CRC32 0x7802f625        GTID    last_committed=9        sequence_number=10

In the first transaction, last_committed=0, sequence_number=1
The first worker thread takes that transaction and goes to work.

In the second transaction, last_committed=1, sequence_number=2
This transaction cannot start until the first one finishes, because last_committed=1 is not smaller than the previous sequence_number=1. So the first two transactions must be done in series.

In the third transaction, last_committed=2, sequence_number=3. Again, no worker could start this transaction until that previous trx with sequence_number=2 finishes.

Although these first four transactions might actually be assigned to different workers, effectively they execute in series, just as they would under single-threaded replication.

Now, we come to the transactions with last_committed=4. As soon as that transaction with sequence_number=4 finishes, these next three transactions can go in parallel.

last_committed=4        sequence_number=5
last_committed=4        sequence_number=6
last_committed=4        sequence_number=7

Once the first two finish, two more can start:

last_committed=6        sequence_number=8
last_committed=6        sequence_number=9

They can run even while the one with sequence_number=7 is still running.

But this one has to wait for everything before it:

last_committed=9        sequence_number=10

We can see that for these 10 transactions, only three workers could ever be operating at the same time.

You might have spotted a pattern: if the difference (sequence_number - last_committed) = 1, then the transaction has to wait for everything before it to finish. If the difference is 2, then the transactions can run in parallel with just the previous transaction, and so on.

You can count how much of your transaction history falls into each category:

mysqlbinlog mysql-bin.0000x | grep -o 'last_committed.*' | sed 's/=\|\s/ /g' | awk '{print $4-$2}' | sort -g | uniq -c

 20953 1
  8998 2
  6021 3
  4209 4
  3083 5
  2180 6
  1403 7
   820 8
   418 9
   176 10
     7 11
     2 12
     1 13
     1 15
     1 16

In that binary log, 20953 transactions had to wait for all previous transactions to commit before starting. 8998 could run in parallel with one previous transaction. Only a very few could run in parallel with more than 10 previous transactions. Just 3 worker threads would cover 75% of the possible parallel transactions; 8 worker threads would cover 99%.
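The same distribution can be computed without the shell pipeline. A quick Python sketch over mysqlbinlog output (matching the GTID header lines shown above):

```python
import re
from collections import Counter

def parallelism_histogram(binlog_text):
    """Count sequence_number - last_committed over all GTID lines.
    A gap of 1 means fully serial; larger gaps mean more parallelism."""
    gaps = Counter()
    for lc, sn in re.findall(r"last_committed=(\d+)\s+sequence_number=(\d+)",
                             binlog_text):
        gaps[int(sn) - int(lc)] += 1
    return gaps

# Two serial transactions, then one that can run alongside its predecessor:
sample = """
GTID last_committed=0 sequence_number=1
GTID last_committed=1 sequence_number=2
GTID last_committed=1 sequence_number=3
"""
print(parallelism_histogram(sample))  # Counter({1: 2, 2: 1})
```

Feed it the output of `mysqlbinlog mysql-bin.0000x` to reproduce the table above.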

See also Percona's blog for discussion on how to view similar information via the performance_schema on a running slave.

Filed under: MySQL



BIT values cannot be loaded using binary notation (for example, b'011010').

$ cat test.txt
b'101010'
b'111000'

CREATE TABLE loadTest (b BIT(6));

LOAD DATA LOCAL INFILE 'test.txt' INTO TABLE loadTest;
SHOW WARNINGS;

| Level   | Code | Message                               |
| Warning | 1406 | Data too long for column 'b' at row 1 |
| Warning | 1406 | Data too long for column 'b' at row 2 |

-- Note the wrong values:

SELECT BIN(b) FROM loadTest;
| BIN(b) |
| 111111 |
| 111111 |

TRUNCATE loadTest;

LOAD DATA LOCAL INFILE 'test.txt' INTO TABLE loadTest (@b) 
  SET b = CAST(CONV(TRIM('''' FROM SUBSTRING_INDEX(@b, 'b', -1)), 2, 10) AS UNSIGNED);

SELECT BIN(b) FROM loadTest;
| BIN(b) |
| 101010 |
| 111000 |


SUBSTRING_INDEX(@b, 'b', -1)

removes the leading 'b' or '0b'.

TRIM('''' FROM ...)

removes the quotes if there are any.

CONV(..., 2, 10)

converts the string of 0s and 1s into a string of decimal digits.

CAST(... AS UNSIGNED)

turns the string of decimal digits into an integer.
MySQL automatically casts integers into bits when inserting into a BIT type.
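The SQL expression chain mirrors this small Python sketch (same three steps: strip the b prefix, strip the quotes, parse base 2):

```python
def parse_bit_literal(s):
    """Turn textual b'101010' (or 0b101010, or bare 101010) into an int,
    analogous to the SUBSTRING_INDEX/TRIM/CONV/CAST chain in SQL."""
    s = s.split("b")[-1]   # like SUBSTRING_INDEX(@b, 'b', -1): drop leading b or 0b
    s = s.strip("'")       # like TRIM('''' FROM ...): drop surrounding quotes
    return int(s, 2)       # like CONV(..., 2, 10) + CAST(... AS UNSIGNED)

print(parse_bit_literal("b'101010'"))  # 42
print(parse_bit_literal("0b111000"))   # 56
```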

Filed under: MySQL, One liners

Quiz: Drop non-global users

Somebody asked on Freenode. I don't know why they wanted it. How would you drop all MySQL users who do not have "GRANT ALL ON *.* ... WITH GRANT OPTION"? That is, drop any users who have 'N' in any of the privilege columns in `mysql`.`user`.

My solution is shown below. Did you think of a different approach?

My solution ▼
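Since the solution sits behind an expander, here's one way to sketch the idea in Python, not necessarily the same as the hidden solution. The privilege rows below are hypothetical stand-ins for a query against `mysql`.`user`:

```python
# Hypothetical rows standing in for SELECT * FROM mysql.user.
users = [
    {"User": "root", "Host": "localhost", "Select_priv": "Y", "Grant_priv": "Y"},
    {"User": "app",  "Host": "%",         "Select_priv": "Y", "Grant_priv": "N"},
]

def to_drop(rows):
    """Return DROP USER statements for any account with an 'N'
    in any of its privilege columns."""
    return [
        "DROP USER '%s'@'%s';" % (r["User"], r["Host"])
        for r in rows
        if "N" in (v for k, v in r.items() if k.endswith("_priv"))
    ]

print(to_drop(users))  # ["DROP USER 'app'@'%';"]
```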

Filed under: MySQL, One liners

Square Root of a Complex Number using Compass and Straight Edge Geometry

Also available as an interactive applet (scrollwheel or pinch to zoom so you can adjust the axes as you like).

Descartes allowed himself a line segment of unit length for finding roots. The rest of the construction is completed with just Euclid's compass and straight edge.

Given: a complex number C plotted on the complex plane.
Bisect the angle formed between C, the origin O, and the positive X (real) axis.
Extend OC past O by a unit line segment to U.
Find the midpoint, M, between U and C.
Construct a half-circle centered at M through U and C.
Draw the perpendicular to UC through O to intersect the half-circle at I.
Draw a circle centered at O through I.
The circle intersects the bisector at S.
S is a square root of C.
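Why this works: OI is the altitude to the hypotenuse of the right triangle inscribed in the half-circle on UC, so |OI|² = |OU|·|OC| = 1·|OC|, and bisecting the angle halves the argument of C. A numeric sanity check with Python's cmath (the sample point C is my own choice):

```python
import cmath, math

C = 3 + 4j                    # arbitrary sample point on the complex plane
r, theta = cmath.polar(C)     # |OC| and the angle at the origin O

# |OI|^2 = |OU| * |OC| = 1 * r: the geometric mean with the unit segment.
OI = math.sqrt(1 * r)

# S lies on the circle of radius |OI|, along the bisected angle.
S = cmath.rect(OI, theta / 2)

print(abs(S * S - C) < 1e-12)  # True: S is a square root of C
```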

Filed under: Technical

Debugging Large Data with Rewriter

A customer showed that a particular client reported a less-than-helpful error message when it tried to display some meta-data about a table.

A less-than-helpful error message that hints at an int.

I couldn't repeat the behavior with just a copy of the schema, so I suspected it was because of the size of data in the customer's server - somebody had used an int where they needed a long.

The customer's data was quite large - many hundreds of GB - more than I could easily whip up on my laptop to test. But, I didn't really need all that data, or even any data at all; I just needed MySQL to pretend it had all that data. Specifically, I needed information_schema to report a large data_length.

Enter Rewriter, the query rewrite plugin that ships with MySQL 5.7 and later.

First, the general query log gave the exact query sent by the client:

select * FROM information_schema.partitions WHERE TABLE_SCHEMA = 'schemaNameHere' AND TABLE_NAME = 'tableNameHere'

Create a copy of that table:

CREATE DATABASE debug_schema;
CREATE TABLE debug_schema.partitions LIKE information_schema.partitions;
INSERT INTO debug_schema.partitions SELECT * FROM information_schema.partitions;

Adjust the copy to report a different value:

UPDATE debug_schema.partitions SET data_length = POW(2, 32);

Install the Rewriter plugin, and add a rule to modify the schema name:

INSERT INTO query_rewrite.rewrite_rules(pattern, replacement) VALUES (
'select * FROM information_schema.partitions WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ?',
'select * FROM debug_schema.partitions WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ?'
);

CALL query_rewrite.flush_rewrite_rules();

Now when I try the client, it thinks information_schema has reported a huge data_length, and I get the same error which my customer had reported. Success!

Filed under: MySQL

Aggregate JSON function in MySQL

There is not yet an equivalent to GROUP_CONCAT that produces a JSON array. (MySQL 8 adds JSON_ARRAYAGG, but that's not GA yet.) Until then, you can hack it together with string functions:

| id   | data   |
|    1 | First  |
|    2 | Second |

SELECT CONCAT('[', GROUP_CONCAT(JSON_OBJECT('id', id, 'value', data) SEPARATOR ', '), ']') AS j FROM t;
| j                                                           |
| [{"id": 1, "value": "First"}, {"id": 2, "value": "Second"}] |

Or you can use all JSON functions but hack the grouping:

SELECT j FROM (
       SELECT @c := @c + 1 AS c,
              @j := JSON_MERGE(@j, JSON_OBJECT('id', id, 'value', data)) AS j
         FROM t
         JOIN (SELECT @c := 0, @j := JSON_ARRAY()) dt1
     ) dt2 ORDER BY c DESC LIMIT 1;
| j                                                           |
| [{"id": 1, "value": "First"}, {"id": 2, "value": "Second"}] |
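Either way, the target is just an array of objects; here is the same result assembled with Python's json module (same two rows), useful for checking what the SQL should emit:

```python
import json

rows = [(1, "First"), (2, "Second")]

# The GROUP_CONCAT hack assembles exactly this array, object by object.
j = json.dumps([{"id": i, "value": v} for i, v in rows])
print(j)  # [{"id": 1, "value": "First"}, {"id": 2, "value": "Second"}]
```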
Filed under: MySQL FAQ

Disassembly of the Royal P80 Electric Pencil Sharpener

(the one you bought at Sam's Club)

Remove both of the screws marked with red arrows.
Remove either pair of the screws marked with green or yellow arrows (or both pairs, but then the whole thing falls apart and goes bang.)
Royal P80 screws

Push down on the down arrow side and up on the up arrow side to unhook the little plastic hook thing.
Royal P80 tabs

If you still have the P50 instead of the P80, visit John's blog instead.

Filed under: Technical

I’m really quite good with maps

Workbench announced support for a spatial view in 6.2, but examples are somewhat lacking. Just how do you get a SHP into MySQL?


Download and unpack a SHP file such as these country boundaries.

In the Workbench installation directory, you'll find a program "ogr2ogr" that can convert .shp to .csv. Run it like this:

"C:\Program Files\MySQL\MySQL Workbench 6.3\ogr2ogr.exe" -f CSV countries.csv countries.shp -lco GEOMETRY=AS_WKT

Now create a table and load the CSV.

CREATE TABLE worldmap (
	OBJECTID smallint unsigned,
	NAME varchar(50),
	ISO3 char(3),
	ISO2 char(2),
	FIPS varchar(5),
	COUNTRY varchar(50),
	ENGLISH varchar(50),
	FRENCH varchar(50),
	SPANISH varchar(50),
	LOCAL varchar(50),
	FAO varchar(50),
	WAS_ISO varchar(3),
	SOVEREIGN varchar(50),
	CONTINENT varchar(15),
	UNREG1 varchar(30),
	UNREG2 varchar(15),
	EU boolean,
	SQKM decimal(20,11),
	g geometry
);

LOAD DATA LOCAL INFILE 'countries.csv' INTO TABLE worldmap
	FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
	IGNORE 1 LINES
	(@WKT, OBJECTID, NAME, ISO3, ISO2, FIPS, COUNTRY, ENGLISH, FRENCH, SPANISH, LOCAL, FAO, WAS_ISO, SOVEREIGN, CONTINENT, UNREG1, UNREG2, EU, SQKM)
SET g = ST_GeomCollFromText(@WKT);

Now just select rows of interest in Workbench, click the Spatial View format button, and there's your world map.

You can run multiple selects (such as the citylot data from yesterday's post) to overlay on top of the world map.


Filed under: MySQL

The world is not in your books and maps.

MySQL 5.7 came out with support for JSON, improved geometry, and virtual columns. Here's an example showing them all playing together.


Download citylots.json.

It comes as one big object, so we'll break it up into separate lines:
grep "^{ .type" citylots.json > properties.json
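If the grep trick feels fragile, the split can also be done by actually parsing the JSON. A hedged Python alternative (assumes the top-level object holds a `features` array, as a GeoJSON FeatureCollection does; the inline sample stands in for citylots.json):

```python
import json

def split_features(collection):
    """Yield one JSON line per feature from a GeoJSON FeatureCollection,
    suitable for LOAD DATA ... INTO TABLE citylots (j)."""
    for feature in collection["features"]:
        yield json.dumps(feature)

# Inline sample standing in for the real citylots.json:
sample = {"type": "FeatureCollection",
          "features": [{"type": "Feature", "properties": {"BLKLOT": "0001001"}}]}
for line in split_features(sample):
    print(line)  # one feature object per line
```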

Connect to a 5.7 instance of MySQL.

CREATE TABLE citylots (id serial, j json, p geometry as (ST_GeomFromGeoJSON(j, 2)));
LOAD DATA LOCAL INFILE 'properties.json' INTO TABLE citylots (j);

A few of the rows don't contain useful data:
DELETE FROM citylots WHERE j->'$.geometry.type' IS NULL;

In MySQL Workbench, do:
SELECT id, p FROM citylots;

Then click on Spatial View. It takes a couple of minutes for 200k rows, but there's a map of San Francisco.

The default projection, 'Robinson', is designed for showing the whole world at once and so is pretty distorted for this particular data set. Mercator or Equirectangular are better choices. Fortunately, Workbench repaints the data in just a few seconds.

If you selected some other fields, you can click on the map and see the relevant data for that particular geometry.

Filed under: MySQL