Single queries run in 10 ms, but with UNION ALL they take 290 ms or more (MySQL DB with 7.7 million records). How to optimize?



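I have a table that stores the appointments available for teachers, allowing two kinds of insertion:

  1. Hourly based: complete freedom to add an unlimited number of slots per day per teacher (as long as the slots don't overlap). On 15 April a teacher may have slots at 10:00, 11:00, 12:00 and 16:00. A person is served after choosing a specific teacher and time slot.

  2. Time period/range: on 15 April another teacher may work from 10:00 to 12:00 and then from 14:00 to 18:00. People are served by order of arrival, so if a teacher works from 10:00 to 12:00, everyone who arrives in that period is attended in order of arrival (a local queue).

Since a search has to return all available teachers, I need the hourly slots and the order-of-arrival ranges saved in the same table. That way I can order by date_from ASC and show the first available slots first in the search results.

Current table structure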

CREATE TABLE `teacher_slots` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `teacher_id` mediumint(8) unsigned NOT NULL,
  `city_id` smallint(5) unsigned NOT NULL,
  `subject_id` smallint(5) unsigned NOT NULL,
  `date_from` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `date_to` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `status` tinyint(4) NOT NULL DEFAULT '0',
  `order_of_arrival` tinyint(1) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `by_hour_idx` (`teacher_id`,`order_of_arrival`,`status`,`city_id`,`subject_id`,`date_from`),
  KEY `order_arrival_idx` (`order_of_arrival`,`status`,`city_id`,`subject_id`,`date_from`,`date_to`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

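Search query

I need to filter by: the actual datetime, city_id, subject_id, and whether the slot is available (status = 0).

For hourly based slots, I have to show, for every teacher, all available slots on the closest available day (all time slots of that day, and never more than one day per teacher). (I got this query with help from mattedgod.)

For range based slots (order_of_arrival = 1), I have to show the closest available range, only once per teacher.

Run individually, the first query takes around 0.10 ms and the second 0.08 ms, while the UNION ALL takes 300 ms on average.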

(
    SELECT id, teacher_slots.teacher_id, date_from, date_to, order_of_arrival
    FROM teacher_slots
    JOIN (
        SELECT DATE(MIN(date_from)) as closestDay, teacher_id
        FROM teacher_slots
        WHERE   date_from >= '2014-04-10 08:00:00' AND order_of_arrival = 0
                AND status = 0 AND city_id = 6015 AND subject_id = 1
        GROUP BY teacher_id
    ) a ON a.teacher_id = teacher_slots.teacher_id
    AND DATE(teacher_slots.date_from) = closestDay
    WHERE teacher_slots.date_from >= '2014-04-10 08:00:00'
        AND teacher_slots.order_of_arrival = 0
        AND teacher_slots.status = 0
        AND teacher_slots.city_id = 6015
        AND teacher_slots.subject_id = 1
)

UNION ALL

(
    SELECT id, teacher_id, date_from, date_to, order_of_arrival
    FROM teacher_slots
    WHERE order_of_arrival = 1 AND status = 0 AND city_id = 6015 AND subject_id = 1
        AND (
            (date_from <= '2014-04-10 08:00:00' AND  date_to >= '2014-04-10 08:00:00')
            OR (date_from >= '2014-04-10 08:00:00')
        )
    GROUP BY teacher_id
)

ORDER BY date_from ASC;

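Is there a way to optimize the UNION so that I get a reasonable response of at most ~20 ms, or even to return range based + hourly based slots in a single query (with an IF, etc.)?

SQL Fiddle: http://www.sqlfiddle.com/#!2/59420/1/0

EDIT:

I tried some denormalization by creating a field "only_date_from" that stores only the date, so I could change this...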

DATE(MIN(date_from)) as closestDay / DATE(teacher_slots.date_from) = closestDay

...to this:

MIN(only_date_from) as closestDay / teacher_slots.only_date_from = closestDay

That already saved me 100 ms! The average is still 200 ms.
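A minimal sketch of that denormalization, for readers who want to reproduce it. The column name only_date_from comes from the edit above; its definition (a nullable DATE), the backfill statement and the index name by_day_idx are assumptions, not something stated in the question.

```sql
-- Assumption: a plain nullable DATE column; the question does not show its definition.
ALTER TABLE teacher_slots
  ADD COLUMN only_date_from DATE NULL AFTER date_from;

-- Backfill the new column from the existing datetime.
UPDATE teacher_slots
SET only_date_from = DATE(date_from);

-- Hypothetical index (by_day_idx is an illustrative name) so the join on
-- teacher_id plus only_date_from = closestDay can be resolved from an index
-- instead of evaluating DATE(date_from) for every candidate row.
ALTER TABLE teacher_slots
  ADD KEY by_day_idx (teacher_id, order_of_arrival, status,
                      city_id, subject_id, only_date_from);
```

Whether the extra index pays off depends on the data distribution, so it is worth checking the plan with EXPLAIN before and after adding it.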

Answers:


1

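Firstly, I think your original query may not be "correct". With reference to your SQL Fiddle, it looks to me as though it should be returning the rows with ID = 2, 3 and 4 (in addition to the row with ID = 1 that we are already getting from this half of the UNION). Your existing logic reads as though you intended those other rows to be included, since they explicitly meet the OR (date_from >= '2014-04-10 08:00:00') part of your second WHERE clause.

The GROUP BY teacher_id clause in the second part of your UNION is what makes you lose those rows. You are not aggregating any of the columns in your select list, and in that case GROUP BY results in "difficult to define" behaviour: MySQL keeps one arbitrary row per teacher_id.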
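If the intent is in fact one row per teacher (the question asks for the closest range "only once per teacher"), the choice of row can be made explicit instead of being left to GROUP BY. The following is only a sketch under that assumption: the aliases ts and f and the column alias first_from are illustrative names, and the filter is copied unchanged from the second half of the UNION.

```sql
-- Sketch: pick, per teacher, the matching range with the earliest date_from.
SELECT ts.id, ts.teacher_id, ts.date_from, ts.date_to, ts.order_of_arrival
FROM teacher_slots ts
JOIN (
    -- Earliest qualifying range start for each teacher.
    SELECT teacher_id, MIN(date_from) AS first_from
    FROM teacher_slots
    WHERE order_of_arrival = 1 AND status = 0
      AND city_id = 6015 AND subject_id = 1
      AND (
            (date_from <= '2014-04-10 08:00:00' AND date_to >= '2014-04-10 08:00:00')
            OR date_from >= '2014-04-10 08:00:00'
          )
    GROUP BY teacher_id
) f ON f.teacher_id = ts.teacher_id
   AND f.first_from = ts.date_from
WHERE ts.order_of_arrival = 1 AND ts.status = 0
  AND ts.city_id = 6015 AND ts.subject_id = 1
ORDER BY ts.date_from ASC;
```

Here every selected column comes from a concrete row, so the result no longer depends on which row the server happens to keep. If a teacher had two qualifying ranges with the same date_from, both would be returned. Note also that with ONLY_FULL_GROUP_BY enabled (the default from MySQL 5.7.5), the original form of the query is rejected rather than silently picking a row.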

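Also, while I can't explain the poor performance of your UNION, I can work around it by removing the UNION from your query altogether:

Rather than using two separate (and partly repeated) sets of logic to get rows from the same table, I have consolidated your logic into a single query, with the differences between the two halves ORed together, i.e. a row is included if it meets one or the other of your original WHERE clauses. This is possible because I have replaced the (INNER) JOIN you were using to find closestDay with a LEFT JOIN.

The LEFT JOIN means we can now also tell which set of logic should be applied to a row: if the join succeeds (closestDay IS NOT NULL), we apply the logic from the first half; if the join fails (closestDay IS NULL), we apply the logic from the second half.

So this returns all the rows that your query returned (in the fiddle), and it also picks up those additional rows.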

  SELECT
    *

  FROM 
    teacher_slots ts

    LEFT JOIN 
    (
      SELECT 
        teacher_id,
        DATE(MIN(date_from)) as closestDay

      FROM 
        teacher_slots

      WHERE   
        date_from >= '2014-04-10 08:00:00' 
        AND order_of_arrival = 0
        AND status = 0 
        AND city_id = 6015 
        AND subject_id = 1

      GROUP BY 
        teacher_id

    ) a
    ON a.teacher_id = ts.teacher_id
    AND a.closestDay = DATE(ts.date_from)

  WHERE 
    /* conditions that were common to both halves of the union */
    ts.status = 0
    AND ts.city_id = 6015
    AND ts.subject_id = 1

    AND
    (
      (
        /* conditions that were from above the union 
           (ie when we joined to get closest future date) */
        a.teacher_id IS NOT NULL
        AND ts.date_from >= '2014-04-10 08:00:00'
        AND ts.order_of_arrival = 0
      ) 
      OR
      (
        /* conditions that were below the union 
          (ie when we didn't join) */
        a.teacher_id IS NULL       
        AND ts.order_of_arrival = 1 
        AND 
        (
          (
            date_from <= '2014-04-10 08:00:00' 
            AND  
            date_to >= '2014-04-10 08:00:00'
          )

          /* rows that met this condition were being discarded 
             as a result of 'difficult to define' GROUP BY behaviour. */
          OR date_from >= '2014-04-10 08:00:00' 
        )
      )
    )

  ORDER BY 
   ts.date_from ASC;

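Furthermore, you could "tidy up" your query so that you don't need to "plug in" your status, city_id and subject_id parameters more than once.

To do this, change the subquery a to also select those columns, and to group on them as well. The JOIN's ON clause then needs to map those columns to their ts.xxx equivalents.

I don't think this would hurt performance, but I can't be sure without testing it on a large dataset.

So your join would look more like: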

LEFT JOIN 
(
  SELECT 
    teacher_id,
    status,
    city_id,
    subject_id,
    DATE(MIN(date_from)) as closestDay

  FROM 
    teacher_slots

  WHERE   
    date_from >= '2014-04-10 08:00:00' 
    AND order_of_arrival = 0
  /* These no longer required here...
    AND status = 0 
    AND city_id = 6015 
    AND subject_id = 1
  */

  GROUP BY 
    teacher_id,
    status,
    city_id,
    subject_id

) a
ON a.teacher_id = ts.teacher_id
AND a.status = ts.status 
AND a.city_id = ts.city_id 
AND a.subject_id = ts.subject_id
AND a.closestDay = DATE(ts.date_from)
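One way to check the performance question on the real data is to compare the plans of the two forms of the subquery with EXPLAIN. This is only a sketch of how to run that comparison; nothing here was measured against the 7.7 million row table.

```sql
-- Filtered form: status, city_id and subject_id are fixed inside the subquery.
EXPLAIN
SELECT teacher_id, DATE(MIN(date_from)) AS closestDay
FROM teacher_slots
WHERE date_from >= '2014-04-10 08:00:00'
  AND order_of_arrival = 0
  AND status = 0 AND city_id = 6015 AND subject_id = 1
GROUP BY teacher_id;

-- "Tidied" form: the filters are gone and the columns are grouped instead.
EXPLAIN
SELECT teacher_id, status, city_id, subject_id,
       DATE(MIN(date_from)) AS closestDay
FROM teacher_slots
WHERE date_from >= '2014-04-10 08:00:00'
  AND order_of_arrival = 0
GROUP BY teacher_id, status, city_id, subject_id;
```

The key, rows and Extra columns of the output are the ones to compare. Because the tidied form no longer restricts city_id and subject_id inside the subquery, it may have to group far more rows than the filtered form, so the shorter query is not necessarily the faster one.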

2

Try this query:

(
select * from (SELECT id, teacher_slots.teacher_id, date_from, date_to,  order_of_arrival
FROM teacher_slots  WHERE teacher_slots.date_from >= '2014-04-10 08:00:00'
    AND teacher_slots.order_of_arrival = 0
    AND teacher_slots.status = 0
    AND teacher_slots.city_id = 6015
    AND teacher_slots.subject_id = 1) 
 teacher_slots
JOIN (
    SELECT DATE(MIN(date_from)) as closestDay, teacher_id
    FROM teacher_slots
    WHERE   date_from >= '2014-04-10 08:00:00' AND order_of_arrival = 0
            AND status = 0 AND city_id = 6015 AND subject_id = 1
    GROUP BY teacher_id
) a ON a.teacher_id = teacher_slots.teacher_id
AND DATE(teacher_slots.date_from) = closestDay

)

UNION ALL

(
SELECT id, teacher_id, date_from, date_to, order_of_arrival
FROM teacher_slots
WHERE order_of_arrival = 1 AND status = 0 AND city_id = 6015 AND subject_id = 1
    AND (
        (date_from <= '2014-04-10 08:00:00' AND  date_to >= '2014-04-10 08:00:00')
        OR (date_from >= '2014-04-10 08:00:00')
    )
GROUP BY teacher_id
)

ORDER BY date_from ASC;
Licensed under cc by-sa 3.0 with attribution required.