我的查询要花特别长的时间(15+秒),并且随着时间的推移,随着数据集的增长,查询只会变得越来越糟。我过去对此进行了优化,并添加了索引,代码级排序和其他优化,但是还需要进一步完善。
SELECT sounds.*, avg(ratings.rating) AS avg_rating, count(ratings.rating) AS votes FROM `sounds`
INNER JOIN ratings ON sounds.id = ratings.rateable_id
WHERE (ratings.rateable_type = 'Sound'
AND sounds.blacklisted = false
AND sounds.ready_for_deployment = true
AND sounds.deployed = true
AND sounds.type = "Sound"
AND sounds.created_at > "2011-03-26 21:25:49")
GROUP BY ratings.rateable_id
查询的目的是让我获得sound id
以及最近发布的声音的平均评分。大约有1500种声音和200万种评级。
我有几个指数 sounds
mysql> show index from sounds;
+--------+------------+------------------------------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+————+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+------------------------------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+————+
| sounds | 0 | PRIMARY | 1 | id | A | 1388 | NULL | NULL | | BTREE | |
| sounds | 1 | sounds_ready_for_deployment_and_deployed | 1 | deployed | A | 5 | NULL | NULL | YES | BTREE | |
| sounds | 1 | sounds_ready_for_deployment_and_deployed | 2 | ready_for_deployment | A | 12 | NULL | NULL | YES | BTREE | |
| sounds | 1 | sounds_name | 1 | name | A | 1388 | NULL | NULL | | BTREE | |
| sounds | 1 | sounds_description | 1 | description | A | 1388 | 128 | NULL | YES | BTREE | |
+--------+------------+------------------------------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+
还有几个 ratings
mysql> show index from ratings;
+---------+------------+-----------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+————+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+-----------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+————+
| ratings | 0 | PRIMARY | 1 | id | A | 2008251 | NULL | NULL | | BTREE | |
| ratings | 1 | index_ratings_on_rateable_id_and_rating | 1 | rateable_id | A | 18 | NULL | NULL | | BTREE | |
| ratings | 1 | index_ratings_on_rateable_id_and_rating | 2 | rating | A | 9297 | NULL | NULL | YES | BTREE | |
+---------+------------+-----------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
这里是 EXPLAIN
mysql> EXPLAIN SELECT sounds.*, avg(ratings.rating) AS avg_rating, count(ratings.rating) AS votes FROM sounds INNER JOIN ratings ON sounds.id = ratings.rateable_id WHERE (ratings.rateable_type = 'Sound' AND sounds.blacklisted = false AND sounds.ready_for_deployment = true AND sounds.deployed = true AND sounds.type = "Sound" AND sounds.created_at > "2011-03-26 21:25:49") GROUP BY ratings.rateable_id;
+----+-------------+---------+--------+--------------------------------------------------+-----------------------------------------+---------+-----------------------------------------+---------+——————+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+--------+--------------------------------------------------+-----------------------------------------+---------+-----------------------------------------+---------+——————+
| 1 | SIMPLE | ratings | index | index_ratings_on_rateable_id_and_rating | index_ratings_on_rateable_id_and_rating | 9 | NULL | 2008306 | Using where |
| 1 | SIMPLE | sounds | eq_ref | PRIMARY,sounds_ready_for_deployment_and_deployed | PRIMARY | 4 | redacted_production.ratings.rateable_id | 1 | Using where |
+----+-------------+---------+--------+--------------------------------------------------+-----------------------------------------+---------+-----------------------------------------+---------+-------------+
我确实会缓存获得的结果,因此站点性能不是什么大问题,但是由于此调用花费的时间太长,因此缓存预热器的运行时间越来越长,这已成为一个问题。在一个查询中似乎没有很多数字可以处理...
我还能做些什么来使它更好地执行?
EXPLAIN
输出吗?EXPLAIN SELECT sounds.*, avg(ratings.rating) AS avg_rating, count(ratings.rating) AS votes FROM sounds INNER JOIN ratings ON sounds.id = ratings.rateable_id WHERE (ratings.rateable_type = 'Sound' AND sounds.blacklisted = false AND sounds.ready_for_deployment = true AND sounds.deployed = true AND sounds.type = "Sound" AND sounds.created_at > "2011-03-26 21:25:49") GROUP BY ratings.rateable_id