# Why do non-digits match LIKE [0-9]?


``SELECT SERVERPROPERTY('Collation') AS Collation;``

### Filtering for digits returns non-digit characters

```sql
-- Build a 256-row tally by repeatedly cross-joining a 2-row seed (2 -> 4 -> 16 -> 256).
WITH P0(_) AS (SELECT 0 UNION ALL SELECT 0),
P1(_) AS (SELECT 0 FROM P0 AS L CROSS JOIN P0 AS R),
P2(_) AS (SELECT 0 FROM P1 AS L CROSS JOIN P1 AS R),
P3(_) AS (SELECT 0 FROM P2 AS L CROSS JOIN P2 AS R),
Tally(Number) AS (
    -- Number the rows from 0 to 255.
    SELECT -1 + ROW_NUMBER() OVER (ORDER BY (SELECT 0))
    FROM P3
)
-- Store each single-byte code point next to the character CHAR() maps it to.
SELECT Number AS CodePoint, CHAR(Number) AS Symbol
INTO #CodePage
FROM Tally
WHERE Number >= 0 AND Number <= 255;
```

```
0
1
2
...
32
33  !
34  "
35  #
...
48  0
49  1
50  2
...
65  A
66  B
67  C
...
253 ý
254 þ
255 ÿ
```

```sql
SELECT CodePoint, Symbol
FROM #CodePage
WHERE Symbol LIKE '[0-9]';
```

```
CodePoint   Symbol
48  0
49  1
50  2
51  3
52  4
53  5
54  6
55  7
56  8
57  9
178 ²
179 ³
185 ¹
188 ¼
189 ½
190 ¾
```

### Using a binary collation as a workaround

```sql
-- With a binary collation the range is evaluated on code points, not linguistic sort order.
SELECT CodePoint, Symbol
FROM #CodePage
WHERE Symbol LIKE '[0-9]' COLLATE Latin1_General_BIN;
```

```
CodePoint   Symbol
48  0
49  1
50  2
51  3
52  4
53  5
54  6
55  7
56  8
57  9
```
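The same workaround carries over to the common "is this string all digits?" check. The following is only a sketch: the table variable `@Samples` and its column `Candidate` are hypothetical names made up for illustration, and it assumes a database whose default code page is 1252, so that `CHAR(178)` is the superscript two.

```sql
-- Hypothetical sample data; CHAR(178) is assumed to be SUPERSCRIPT TWO (code page 1252).
DECLARE @Samples TABLE (Candidate varchar(10));

INSERT INTO @Samples (Candidate)
VALUES ('12345'),
       ('12' + CHAR(178) + '45'),
       ('12a45');

-- A value counts as all digits when it is non-empty and contains
-- no character outside the binary range 0-9.
SELECT Candidate,
       CASE
           WHEN Candidate <> ''
                AND Candidate NOT LIKE '%[^0-9]%' COLLATE Latin1_General_BIN
           THEN 1
           ELSE 0
       END AS IsAllDigits
FROM @Samples;
```

Under these assumptions, the intent is that only the first row is flagged as all digits. Without the `COLLATE` clause, the row containing the superscript would likely pass as well on a server with a linguistic collation such as the one in the question.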


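`[0-9]` is not some kind of regular expression that is defined to match only digits.

Any range in a `LIKE` pattern matches the characters that fall between the start character and the end character according to the collation's sort order.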

```sql
SELECT CodePoint,
       Symbol,
       RANK() OVER (ORDER BY Symbol COLLATE Latin1_General_CI_AS) AS Rnk
FROM   #CodePage
WHERE  Symbol LIKE '[0-9]' COLLATE Latin1_General_CI_AS
ORDER  BY Symbol COLLATE Latin1_General_CI_AS;
```

```
CodePoint            Symbol Rnk
-------------------- ------ --------------------
48                   0      1
188                  ¼      2
189                  ½      3
190                  ¾      4
185                  ¹      5
49                   1      5
50                   2      7
178                  ²      7
179                  ³      9
51                   3      9
52                   4      11
53                   5      12
54                   6      13
55                   7      14
56                   8      15
57                   9      16
```
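One way to get a feel for this range behaviour is to write the range as an ordinary comparison under the same collation. This is a rough sketch rather than a statement of how `LIKE` is implemented; it assumes the `#CodePage` temp table from the question still exists in the session.

```sql
-- Compare each symbol against the range endpoints using the collation's sort order.
-- COLLATE is applied to the column so that both halves of BETWEEN use it.
SELECT CodePoint, Symbol
FROM   #CodePage
WHERE  Symbol COLLATE Latin1_General_CI_AS BETWEEN '0' AND '9'
ORDER  BY Symbol COLLATE Latin1_General_CI_AS;
```

I would expect this to return the same sixteen rows as the `LIKE '[0-9]'` query under `Latin1_General_CI_AS`, since the fractions and superscripts all sort between `0` and `9` in that collation, but treat that as something to verify on your own server.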

Listing the digits explicitly avoids the range altogether:

```sql
SELECT CodePoint, Symbol
FROM #CodePage
WHERE Symbol LIKE '[0123456789]' COLLATE Latin1_General_CS_AS;
```


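Latin1 here means code page 1252, in which code point 178 is 'SUPERSCRIPT TWO'. This is a Unicode superscript, that is, the character "2" written as a superscript. According to Unicode Technical Standard #10 it should compare equal to 2; see section 8.1, Collation Folding.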
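The comparison can be checked directly. This is a small sketch that assumes the database default code page is 1252, so that `CHAR(178)` really is the superscript two; the column aliases are made up for illustration.

```sql
-- Compare SUPERSCRIPT TWO with the digit 2 under a linguistic and a binary collation.
SELECT CASE WHEN CHAR(178) = '2' COLLATE Latin1_General_CI_AS
            THEN 'equal' ELSE 'different'
       END AS LinguisticComparison,
       CASE WHEN CHAR(178) = '2' COLLATE Latin1_General_BIN
            THEN 'equal' ELSE 'different'
       END AS BinaryComparison;
```

The tied ranks for `2` and `²` in the `RANK()` output earlier in the thread suggest the linguistic comparison should report them as equal, while the binary comparison works on code points (50 versus 178) and should not.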