A Little Noise

April 11, 2022

Date ranges in JSON

Filed under: MySQL — snoyes @ 9:50 am

Suppose you have these documents stored in a JSON column:

CREATE TABLE tableName (id int auto_increment primary key, data JSON);
INSERT INTO tableName (data) VALUES ('
{
    "2022/03/29": ["macguffin", "plot coupon"], 
    "2022/03/31": ["lorem", "ipsum"], 
    "2022/04/01": ["dolor", "sit amet"], 
    "2022/04/02": ["silent engine", "radioactive ore"], 
    "2022/04/03": ["pulpy briefcase", "Schmilblick"]
}
'), ('
{
    "2022/03/30": ["Big Dumb Object", "Unobtanium"], 
    "2022/03/31": "glowing orb", 
    "2022/04/02": "blinker fluid",
    "2022/04/03": "hunted snipe",
    "2022/04/04": ["wild goose", "alien space bat"]
}
');

And suppose you want the values for each row from the range 29 March through 01 April.

JSON has no idea what a “date” is, nor how to cover a “range” of values. But MySQL does. So, we can turn the keys into something that MySQL can work with, filter those keys with SQL, and then look up the values in JSON again.

SELECT
 JSON_ARRAYAGG(
  JSON_EXTRACT(data, CONCAT('$."', DATE_FORMAT(d, '%Y/%m/%d'), '"'))
 ) AS importantValues 
FROM
 tableName
 JOIN JSON_TABLE(
    REPLACE(JSON_KEYS(data), '/', '-'),
    "$[*]" COLUMNS (d DATE PATH "$")
 ) AS jt 
WHERE d BETWEEN '2022-03-29' AND '2022-04-01'
GROUP BY id;
+---------------------------------------------------------------------------+
| importantValues                                                           |
+---------------------------------------------------------------------------+
| [["macguffin", "plot coupon"], ["lorem", "ipsum"], ["dolor", "sit amet"]] |
| [["Big Dumb Object", "Unobtanium"], "glowing orb"]                        |
+---------------------------------------------------------------------------+

Let’s break that down.

JSON_KEYS gets an array of keys from a JSON object:

SELECT JSON_KEYS(data) FROM tableName;
+------------------------------------------------------------------------+
| JSON_KEYS(data)                                                        |
+------------------------------------------------------------------------+
| ["2022/03/29", "2022/03/31", "2022/04/01", "2022/04/02", "2022/04/03"] |
| ["2022/03/30", "2022/03/31", "2022/04/02", "2022/04/03", "2022/04/04"] |
+------------------------------------------------------------------------+

At the moment, MySQL still understands any delimiter character between the date parts, in this case a forward slash. However, as of 8.0.29, this allowance is deprecated, so to be proper the slashes must be replaced with dashes:

SELECT REPLACE(JSON_KEYS(data), '/', '-') FROM tableName;
+------------------------------------------------------------------------+
| REPLACE(JSON_KEYS(data), '/', '-')                                     |
+------------------------------------------------------------------------+
| ["2022-03-29", "2022-03-31", "2022-04-01", "2022-04-02", "2022-04-03"] |
| ["2022-03-30", "2022-03-31", "2022-04-02", "2022-04-03", "2022-04-04"] |
+------------------------------------------------------------------------+

JSON_TABLE turns JSON data into a table. The “$[*]” path selects the part of the expression that will be turned into the table. The COLUMNS section turns those values into MySQL dates.

SELECT * FROM 
 JSON_TABLE(
    '["2022/03/29", "2022/03/31", "2022/04/01", "2022/04/02", "2022/04/03"]',
    "$[*]" COLUMNS (d DATE PATH "$")
 ) AS jt;
+------------+
| d          |
+------------+
| 2022-03-29 |
| 2022-03-31 |
| 2022-04-01 |
| 2022-04-02 |
| 2022-04-03 |
+------------+

You can provide a column name to JSON_TABLE, provided it comes from a table listed earlier in the FROM clause. Filtering and grouping is ordinary SQL. JSON_ARRAYGG is an aggregate function that turns rows into a JSON array; it’s the JSON version of GROUP_CONCAT.

SELECT
 id, JSON_ARRAYAGG(d)
FROM
 tableName
 JOIN JSON_TABLE(
    JSON_KEYS(data),
    "$[*]" COLUMNS (d DATE PATH "$")
 ) AS jt 
WHERE d BETWEEN '2022-03-29' AND '2022-04-01'
GROUP BY id;

JSON_EXTRACT returns the values from a given path. It requires an exact match for looking up keys, so those dashes must be turned back into slashes – REPLACE would work, but DATE_FORMAT is more obvious. It also needs the key wrapped in double quotes, and prepend “$.” to make it into a proper path.

No Comments »

No comments yet.

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress