No doubt everybody has solved it before, but ten seconds on Google didn’t show any obvious results, so here we go: How to pick the top 5 records of each group.
We’ll define “top 5” as “the first 5 records when ordering by a field called `rating`”.
For MySQL 8.0 and later, use the RANK() window function inside either a common table expression or a derived table:
WITH cte AS ( SELECT *, RANK() OVER(PARTITION BY groupId ORDER BY rating) AS rank FROM theTable ) SELECT * FROM cte WHERE rank <= 5;
5.7 and earlier didn’t have CTEs. Subqueries will suffice.
SELECT theTable.* FROM theTable JOIN ( SELECT t1.groupId, t1.rating, COUNT(t2.groupId) AS theCount FROM theTable t1 LEFT JOIN theTable t2 ON t1.groupId = t2.groupId AND t1.rating > t2.rating GROUP BY t1.groupId, t1.rating HAVING theCount < 5 ) AS dt USING (groupId, rating);
Pre-4.1, you couldn’t even use subqueries.
SELECT @oldGroup := 0, @count := 0; SELECT @oldGroup := groupId, theTable.* FROM theTable HAVING (@count := IF(groupId <=> @oldGroup, @count+1, 0)) < 5 ORDER BY groupId, rating;
This is one of the numerous examples showing why MySQL is vastly inferior compared to real RDBMS … you can solve this on Oracle, for example, with a simple subquery.
Comment by nah — August 22, 2006 @ 5:09 pm
Nah — you did not read the post — he said specifically “And we’ll do it without subqueries, for the pre-4.1 folks.” It can be done the same way in Oracle as it can in MySQL — with a correlated subquery that is ANSI SQL Standard.
Comment by Sheeri — August 23, 2006 @ 3:18 am
Here’s another form of the solution without subquery that also resolves matching ratings by using the primary key:
DROP DATABASE IF EXISTS topn;
CREATE DATABASE topn;
USE topn;
CREATE TABLE theTable (
pKey int PRIMARY KEY
, groupId int
, rating int
);
INSERT INTO theTable (
pKey
, groupId
, rating
) VALUES
( 1, 1, 55 )
, ( 2, 1, 53 )
, ( 3, 1, 51 )
, ( 4, 1, 59 )
, ( 5, 1, 58 )
, ( 6, 1, 58 )
, ( 7, 1, 53 )
, ( 8, 1, 55 )
, ( 9, 1, 55 )
, ( 10, 1, 53 )
, ( 11, 2, 52 )
, ( 12, 2, 53 )
, ( 13, 2, 54 )
, ( 14, 2, 59 )
, ( 15, 2, 58 )
, ( 16, 2, 57 )
, ( 17, 2, 56 )
;
SELECT t1.pKey
, t1.rating
, t1.groupId
, COUNT(t2.pKey) AS cnt
FROM theTable AS t1
LEFT JOIN theTable AS t2
ON (t1.rating, t1.pKey)
Comment by Jon Armstrong (Xgc) — February 12, 2009 @ 12:56 pm
Looks like the comment was cut off. Just the query:
SELECT t1.pKey
, t1.rating
, t1.groupId
, COUNT(t2.pKey) AS cnt
FROM theTable AS t1
LEFT JOIN theTable AS t2
ON (t1.rating, t1.pKey)
Comment by Jon Armstrong (Xgc) — February 12, 2009 @ 12:57 pm
Sorry. You have a problem with inserting greater than and less than characters in this blog. That’s what cut off the query.
SELECT t1.pKey
, t1.rating
, t1.groupId
, COUNT(t2.pKey) AS cnt
FROM theTable AS t1
LEFT JOIN theTable AS t2
ON (t1.rating, t1.pKey) >= (t2.rating, t2.pKey)
AND t1.groupId = t2.groupId
GROUP BY t1.pKey
, t1.rating
, t1.groupId
HAVING cnt >= 5
ORDER BY t1.groupId, cnt
;
Comment by Jon Armstrong (Xgc) — February 12, 2009 @ 12:59 pm
Sorry. Please remove all the above mess. This replaces it.
Here’s another form of the solution without subquery that also resolves matching ratings by using the primary key:
DROP DATABASE IF EXISTS topn;
CREATE DATABASE topn;
USE topn;
CREATE TABLE theTable (
pKey int PRIMARY KEY
, groupId int
, rating int
);
INSERT INTO theTable (
pKey
, groupId
, rating
) VALUES
( 1, 1, 55 )
, ( 2, 1, 53 )
, ( 3, 1, 51 )
, ( 4, 1, 59 )
, ( 5, 1, 58 )
, ( 6, 1, 58 )
, ( 7, 1, 53 )
, ( 8, 1, 55 )
, ( 9, 1, 55 )
, ( 10, 1, 53 )
, ( 11, 2, 52 )
, ( 12, 2, 53 )
, ( 13, 2, 54 )
, ( 14, 2, 59 )
, ( 15, 2, 58 )
, ( 16, 2, 57 )
, ( 17, 2, 56 )
;
SELECT t1.pKey
, t1.rating
, t1.groupId
, COUNT(t2.pKey) AS cnt
FROM theTable AS t1
LEFT JOIN theTable AS t2
ON (t1.rating, t1.pKey) <= (t2.rating, t2.pKey)
AND t1.groupId = t2.groupId
GROUP BY t1.pKey
, t1.rating
, t1.groupId
HAVING cnt <= 5
ORDER BY t1.groupId, cnt
;
Comment by Jon Armstrong (Xgc) — February 12, 2009 @ 1:05 pm
[…] subquery’ restriction. But there’s always some way around. Thx to Jon Armstrong from A little Noise for the code […]
Pingback by ActiveSPLIT.com » MySQL – Get the Top N of a Group — March 26, 2010 @ 4:43 am
thank you,thank you,thank you,thank you….
Comment by John — September 28, 2014 @ 3:39 pm
See also:
ROW_NUMBER() OVER (PARTITION BY groupId ORDER BY rating DESC, pKey) AS n
Place that in a CTE term or derived table. Now select all results with n <= 5.
Easily done. Note: To use Window Functions and Common Table Expressions, MySQL 8.0 or at least MariaDB 10.2+ or 10.3 is required.
Comment by Jon Armstrong (Xgc) — April 25, 2018 @ 9:10 pm
Using the prior topn schema / data (hopefully, the logic / symbols will survive the paste, even if the format and spacing does not):
— Top 5 Items (by rating) per Group – Using Window Functions and Common Table Expressions
WITH ranks AS (
SELECT t1.pKey
, t1.rating
, t1.groupId
, ROW_NUMBER() OVER (PARTITION BY groupId ORDER BY rating DESC, pKey) AS n
FROM theTable AS t1
)
SELECT *
FROM ranks
WHERE n <= 5
ORDER BY groupId, n
;
Comment by Jon Armstrong (Xgc) — April 25, 2018 @ 9:36 pm