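A connecting character is used to connect two characters.
In Java, a connecting character is one for which Character.getType(int codePoint) / Character.getType(char ch) returns a value equal to Character.CONNECTOR_PUNCTUATION.
Note that character information in Java is based on the Unicode Standard, which identifies connecting characters by assigning them the general category Pc, an alias for Connector_Punctuation.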
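As a minimal sketch of that definition, the check can be wrapped in a small helper. The class name ConnectorTypeDemo and the method isConnector are hypothetical names chosen for illustration; only Character.getType and Character.CONNECTOR_PUNCTUATION come from the Java API.

```java
public class ConnectorTypeDemo {
    // Hypothetical helper: true when the code point's general category
    // is Pc (Connector_Punctuation).
    static boolean isConnector(int codePoint) {
        return Character.getType(codePoint) == Character.CONNECTOR_PUNCTUATION;
    }

    public static void main(String[] args) {
        System.out.println(isConnector('_'));    // U+005F LOW LINE
        System.out.println(isConnector(0x203F)); // U+203F UNDERTIE
        System.out.println(isConnector('-'));    // U+002D HYPHEN-MINUS is a dash (Pd), not Pc
        System.out.println(isConnector('a'));    // a letter, not punctuation
    }
}
```

The int overload is used so that supplementary code points (above U+FFFF) can be tested as well.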
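The following code snippet lists every connecting character that can start an identifier:

    // Scan every Unicode code point and keep the connector punctuation
    // characters that may begin a Java identifier.
    for (int i = Character.MIN_CODE_POINT; i <= Character.MAX_CODE_POINT; i++) {
        if (Character.getType(i) == Character.CONNECTOR_PUNCTUATION
                && Character.isJavaIdentifierStart(i)) {
            System.out.println("character: " + String.valueOf(Character.toChars(i))
                    + ", codepoint: " + i + ", hexcode: " + Integer.toHexString(i));
        }
    }

On jdk1.6.0_45 it prints: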

    character: _, codepoint: 95, hexcode: 5f
    character: ‿, codepoint: 8255, hexcode: 203f
    character: ⁀, codepoint: 8256, hexcode: 2040
    character: ⁔, codepoint: 8276, hexcode: 2054
    character: ・, codepoint: 12539, hexcode: 30fb
    character: ︳, codepoint: 65075, hexcode: fe33
    character: ︴, codepoint: 65076, hexcode: fe34
    character: ﹍, codepoint: 65101, hexcode: fe4d
    character: ﹎, codepoint: 65102, hexcode: fe4e
    character: ﹏, codepoint: 65103, hexcode: fe4f
    character: _, codepoint: 65343, hexcode: ff3f
    character: ・, codepoint: 65381, hexcode: ff65

The following declaration compiles on jdk1.6.0_45:

    int _, ‿, ⁀, ⁔, ・, ︳, ︴, ﹍, ﹎, ﹏, _, ・ = 0;

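Apparently, the above declaration fails to compile on jdk1.7.0_80 and jdk1.8.0_51 because of the following two connecting characters (backward compatibility... oops!):

    character: ・, codepoint: 12539, hexcode: 30fb
    character: ・, codepoint: 65381, hexcode: ff65
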
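To see how a particular JDK treats these two code points, a small probe along the following lines may help. The class MiddleDotCheck and the method describe are hypothetical names used only for this sketch. The output depends on the Unicode data shipped with the running JDK, so no expected result is shown.

```java
public class MiddleDotCheck {
    // Hypothetical helper: summarises what the running JDK reports for a code point.
    static String describe(int codePoint) {
        return String.format("U+%04X", codePoint)
                + " type=" + Character.getType(codePoint)
                + " isConnector=" + (Character.getType(codePoint) == Character.CONNECTOR_PUNCTUATION)
                + " identifierStart=" + Character.isJavaIdentifierStart(codePoint);
    }

    public static void main(String[] args) {
        System.out.println("java.version=" + System.getProperty("java.version"));
        System.out.println(describe(0x30FB)); // KATAKANA MIDDLE DOT
        System.out.println(describe(0xFF65)); // HALFWIDTH KATAKANA MIDDLE DOT
    }
}
```

Running it under different JDK versions shows whether the two middle dots are still classified as connector punctuation and accepted as identifier starts.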
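Anyway, details aside, the exam focuses only on the Basic Latin character set.
Also, the rules for legal identifiers in Java are given in the Java Language Specification (section 3.8, Identifiers). Use the Character class API to get more details.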
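As a rough illustration of using the Character class API for this, the sketch below checks whether a string consists of an identifier-start character followed by identifier-part characters. The class IdentifierCheck and the method looksLikeIdentifier are hypothetical names. The check looks only at character classes, so it does not reject keywords or the literals true, false and null, which the specification also excludes.

```java
public class IdentifierCheck {
    // Hypothetical helper: character-class check only.
    // Keywords and the literals true/false/null are NOT rejected here.
    static boolean looksLikeIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Walk by code point so supplementary characters are handled correctly.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIdentifier("_count"));
        System.out.println(looksLikeIdentifier("foo_bar1"));
        System.out.println(looksLikeIdentifier("1abc"));
        System.out.println(looksLikeIdentifier("a-b"));
    }
}
```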