我在将DATETIME(甚至日期)作为PRIMARY KEY的第一部分时遇到问题。
我使用MySQL 5.5
这是我的两张桌子:
-- This is my standard table with dateDim as a dateTime
CREATE TABLE `stats` (
`dateDim` datetime NOT NULL,
`accountDim` mediumint(8) unsigned NOT NULL,
`execCodeDim` smallint(5) unsigned NOT NULL,
`operationTypeDim` tinyint(3) unsigned NOT NULL,
`junkDim` tinyint(3) unsigned NOT NULL,
`ipCountryDim` smallint(5) unsigned NOT NULL,
`count` int(10) unsigned NOT NULL,
`amount` bigint(20) NOT NULL,
PRIMARY KEY (`dateDim`,`accountDim`,`execCodeDim`,`operationTypeDim`,`junkDim`,`ipCountryDim`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
-- Here is a copy with datDim as an integer
CREATE TABLE `stats_todays` (
`dateDim` int(11) unsigned NOT NULL,
`accountDim` mediumint(8) unsigned NOT NULL,
`execCodeDim` smallint(5) unsigned NOT NULL,
`operationTypeDim` tinyint(3) unsigned NOT NULL,
`junkDim` tinyint(3) unsigned NOT NULL,
`ipCountryDim` smallint(5) unsigned NOT NULL,
`count` int(10) unsigned NOT NULL,
`amount` bigint(20) NOT NULL,
PRIMARY KEY (`dateDim`,`accountDim`,`execCodeDim`,`operationTypeDim`,`junkDim`,`ipCountryDim`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
我用完全相同的数据填充了两个表(接近1亿)
但:
- 统计信息表使用DATETIME作为dateDim
- stats_todays将un INTEGER与TO_DAYS()一起用作dateDim
我的问题是:当索引的第一部分是日期时间时,为什么MySQL不使用PRIMARY KEY?这是非常奇怪的,因为使用相同的数据,但使用INTEGER和TO_DAYS(dateDim)合并了相同的请求。
统计信息表(和日期时间)的示例:
SELECT *
FROM `stats`
WHERE
dateDim = '2014-04-03 00:00:00'
AND accountDim = 4
AND execCodeDim = 9
AND operationTypeDim = 1
AND junkDim = 5
AND ipCountryDim = 3
=> 1 result (4.5sec)
Explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE stats ALL NULL NULL NULL NULL 8832329 Using where
另一个表stats_todays上的相同请求(使用INTEGER和TO_DAYS())
EXPLAIN SELECT *
FROM `stats_todays`
WHERE
dateDim = TO_DAYS('2014-04-03 00:00:00')
AND accountDim = 4
AND execCodeDim = 9
AND operationTypeDim = 1
AND junkDim = 5
AND ipCountryDim = 3
=> Result 1 row (0.0003 sec)
Explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE stats_todays const PRIMARY PRIMARY 13 const,const,const,const,const,const 1
如果您阅读了全文,则可以理解这不是一个低基数问题,因为该请求使用的整数与INTEGER dateDim字段完全相同。
以下是一些高级详细信息:
SELECT COUNT( DISTINCT dateDim )
FROM stats_todays
UNION ALL
SELECT COUNT( DISTINCT dateDim )
FROM stats;
Result:
COUNT(DISTINCT dateDim)
2192
2192
这是INDEX描述:
SHOW INDEXES FROM `stats`
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
stats 0 PRIMARY 1 dateDim A 6921 NULL NULL BTREE
stats 0 PRIMARY 2 accountDim A 883232 NULL NULL BTREE
stats 0 PRIMARY 3 execCodeDim A 8832329 NULL NULL BTREE
stats 0 PRIMARY 4 operationTypeDim A 8832329 NULL NULL BTREE
stats 0 PRIMARY 5 junkDim A 8832329 NULL NULL BTREE
stats 0 PRIMARY 6 ipCountryDim A 8832329 NULL NULL BTREE
SHOW INDEXES FROM `stats_todays`
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
stats_todays 0 PRIMARY 1 dateDim A 7518 NULL NULL BTREE
stats_todays 0 PRIMARY 2 accountDim A 4022582 NULL NULL BTREE
stats_todays 0 PRIMARY 3 execCodeDim A 8045164 NULL NULL BTREE
stats_todays 0 PRIMARY 4 operationTypeDim A 8045164 NULL NULL BTREE
stats_todays 0 PRIMARY 5 junkDim A 8045164 NULL NULL BTREE
stats_todays 0 PRIMARY 6 ipCountryDim A 8045164 NULL NULL BTREE
从统计资料中选取dateDim,COUNT(*),并以ROLLUP替换dateDim
- 告诉您有2192个不同的日期,并且分区很平滑(按日期约3000-4000行)
- 表格中有8 831 990行
- 其他表也一样
- 我尝试了COVERING INDEX(用所有PK列替换*)=>没有任何变化
- 我试过了|使用索引=>没有任何变化
- 与日期字段相同,而不是日期时间
- 与INDEX或UNIQUE代替主键相同
如果您跑步
—
ypercubeᵀᴹ
WHERE dateDim = DATE('2014-04-03 00:00:00')
?
date
代替会发生同样的事情datetime
吗?