MySQL 5.7 でクエリだけでランキングを実現する方法

ここ半年、競プロをこつこつ頑張っているエンジニアの id:dorapon2000 です。好きなアルゴリズムは累積和です。

解決したい課題

「MySQL 5.7 内で完結できるように、クエリだけでランキングを取得したい」

データベースのデータを使って調査をする際、ランキングを出したいこと、あるいはランキングを基にした処理をしたいことがあります¹。アプリケーション側で集計・ソートして値を取得することもできますが、1回限りの調査の場合、わざわざコードを書くことが億劫です。

MySQL 8.0ではRANK関数が導入されたため、そちらで簡単にランキングを取得できます（記事末尾に参考で記載）。今回は、RANK関数導入以前のMySQL 5.7 で、クエリから一発でランキングを取得することを目指します。

検証環境

MySQL 5.7.36

scoreというテーブルを作成し、1万レコードのscore（0〜9999のランダム）をインサートしました。このscoreの降順のランキングを取得することを目指します。

mysql> desc score;
+---------+---------+------+-----+---------+----------------+
| Field   | Type    | Null | Key | Default | Extra          |
+---------+---------+------+-----+---------+----------------+
| user_id | int(11) | NO   | PRI | NULL    | auto_increment |
| score   | int(11) | NO   |     | NULL    |                |
+---------+---------+------+-----+---------+----------------+
2 rows in set (0.00 sec)

mysql> show create table score\G;
*************************** 1. row ***************************
       Table: score
Create Table: CREATE TABLE `score` (
  `user_id` int(11) NOT NULL AUTO_INCREMENT,
  `score` int(11) NOT NULL,
  PRIMARY KEY (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=16394 DEFAULT CHARSET=utf8 COMMENT='score'
1 row in set (0.00 sec)

mysql> select count(*) from score;
+----------+
| count(*) |
+----------+
|    10000 |
+----------+
1 row in set (0.01 sec)

方法① SQL標準

MySQLに限らずどのDBでも実行できる、SQL標準でのランキング取得方法です。

SELECT
    s1.score,
    s1.user_id,
    (SELECT COUNT(s2.score)
     FROM score AS s2
     WHERE s1.score < s2.score) + 1 AS r
FROM score AS s1
ORDER BY r
LIMIT 10;

そこまで複雑なクエリではないですが、ポイントはサブクエリの外と内で別々に2つのscoreを呼び出していることです。外側のscore AS s1で選ばれたs1.scoreがscore AS s2の中でどの順位にいるのかをWHERE句で判定します。scoreを使った2重のfor文と考えるとイメージしやすいかもしれません。正確ではないかもしれませんが、時間計算量はO(N²)だと考えられます。

実行結果は以下のようになります。

mysql> select s1.score, s1.user_id, (select count(s2.score) from score as s2 where s1.score < s2.score) + 1 as r from score as s1 order by r limit 10;
+-------+---------+------+
| score | user_id | r    |
+-------+---------+------+
|  9999 |    2080 |    1 |
|  9999 |    9919 |    1 |
|  9999 |    9507 |    1 |
|  9997 |    6792 |    4 |
|  9997 |    7412 |    4 |
|  9995 |     532 |    6 |
|  9993 |    5835 |    7 |
|  9990 |    8757 |    8 |
|  9990 |    8890 |    8 |
|  9988 |    2596 |   10 |
+-------+---------+------+
10 rows in set (17.23 sec)

mysql> explain select s1.score, s1.user_id, (select count(s2.score) from score as s2 where s1.score < s2.score) + 1 as r from score as s1 order by r limit 10\G;
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: s1
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 10000
     filtered: 100.00
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: s2
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 10000
     filtered: 33.33
        Extra: Using where
2 rows in set, 2 warnings (0.00 sec)

時間計算量が大きいため、たった1万レコードの集計でもかなり時間がかかってしまうことが難点です。同スコアの場合は同順位になります。

方法② ユーザー定義変数

SET @r = 0;
SELECT score, user_id, @r:=@r+1
FROM score
ORDER BY score DESC
LIMIT 10;

MySQLでは、そのセッションでのみ有効なユーザー定義変数というものをクエリの中で使うことができます。ここに順位を一時格納することで、ランキングを実現します。時間計算量はO(N)です。

実行結果は以下のようになります。

mysql> set @r = 0;
Query OK, 0 rows affected (0.00 sec)

mysql> select score, user_id, @r:=@r+1 from score order by score desc limit 10;
+-------+---------+----------+
| score | user_id | @r:=@r+1 |
+-------+---------+----------+
|  9999 |    9507 |        1 |
|  9999 |    2080 |        2 |
|  9999 |    9919 |        3 |
|  9997 |    7412 |        4 |
|  9997 |    6792 |        5 |
|  9995 |     532 |        6 |
|  9993 |    5835 |        7 |
|  9990 |    8890 |        8 |
|  9990 |    8757 |        9 |
|  9988 |    2596 |       10 |
+-------+---------+----------+
10 rows in set (0.01 sec)

mysql> explain select score, user_id, @r:=@r+1 from score order by score desc limit 10\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: score
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 10000
     filtered: 100.00
        Extra: Using filesort
1 row in set, 1 warning (0.00 sec)

同じscoreの場合、同順位にならない点が意図しないかもしれませんが、時間計算量が小さいため爆速で集計が終わります。

（参考）MySQL 8.0の場合

MySQL :: MySQL 8.0 リファレンスマニュアル :: 12.21.1 Window 関数の説明

mysql> select score, user_id, rank() over w as 'rank' from score window w as (order by score desc) limit 10;
+-------+---------+------+
| score | user_id | rank |
+-------+---------+------+
|  9995 |    5081 |    1 |
|  9994 |    3665 |    2 |
|  9994 |    7494 |    2 |
|  9993 |     558 |    4 |
|  9993 |    6981 |    4 |
|  9992 |    3584 |    6 |
|  9992 |    8848 |    6 |
|  9990 |    9185 |    8 |
|  9989 |     525 |    9 |
|  9989 |    4740 |    9 |
+-------+---------+------+
10 rows in set (0.00 sec)

Redisのsorted setなどランキングに適したデータストアを使う方がより良いと思いますが↩

Mobile Factory Tech Blog

技術好きな方へ！モバイルファクトリーのエンジニアたちが楽しい技術話をお届けします！