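Inconsistent truncation of unsigned bit-field integer expressions between C++ and C in different compilers

Edit 2:

I was debugging a strange test failure: a function that previously lived in a C++ source file started returning incorrect results after it was moved verbatim into a C file. The MVE below reproduces the problem with GCC. However, when I compiled the example with Clang on a whim (and later with VS), I got a different result! I cannot figure out whether to treat this as a bug in one of the compilers, or as a manifestation of an undefined result allowed by the C or C++ standard. Strangely, none of the compilers gave me any warning about the expression.

The culprit is this expression: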

ctl.b.p52 << 12;

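Here, p52 is declared as a 52-bit bit-field of type uint64_t; it is also part of a union (see control_t below). The shift operation does not lose any data, because the result still fits into 64 bits. However, GCC decides to truncate the result to 52 bits when I use the C compiler! With the C++ compiler, all 64 bits of the result are preserved.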

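To illustrate this, the example program below compiles two functions with identical bodies and then compares their results. c_behavior() is placed in a C source file, cpp_behavior() in a C++ file, and main() does the comparison.

Repository with the example code: https://github.com/grigory-rechistov/c-cpp-bitfields

The header common.h defines a union of a 64-bit wide set of bit-fields and an integer, and declares the two functions: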

#ifndef COMMON_H
#define COMMON_H

#include <stdint.h>

typedef union control {
        uint64_t q;
        struct {
                uint64_t a: 1;
                uint64_t b: 1;
                uint64_t c: 1;
                uint64_t d: 1;
                uint64_t e: 1;
                uint64_t f: 1;
                uint64_t g: 4;
                uint64_t h: 1;
                uint64_t i: 1;
                uint64_t p52: 52;
        } b;
} control_t;

#ifdef __cplusplus
extern "C" {
#endif

uint64_t cpp_behavior(control_t ctl);
uint64_t c_behavior(control_t ctl);

#ifdef __cplusplus
}
#endif

#endif // COMMON_H

The functions have identical bodies; the only difference is that one is compiled as C and the other as C++.

c-part.c:

#include <stdint.h>
#include "common.h"
uint64_t c_behavior(control_t ctl) {
    return ctl.b.p52 << 12;
}

cpp-part.cpp:

#include <stdint.h>
#include "common.h"
uint64_t cpp_behavior(control_t ctl) {
    return ctl.b.p52 << 12;
}

main.c:

#include <stdio.h>
#include "common.h"

int main() {
    control_t ctl;
    ctl.q = 0xfffffffd80236000ull;

    uint64_t c_res = c_behavior(ctl);
    uint64_t cpp_res = cpp_behavior(ctl);
    const char *announce = c_res == cpp_res? "C == C++" : "OMG C != C++";
    printf("%s\n", announce);

    return c_res == cpp_res? 0: 1;
}

GCC shows a difference between the results they return:

$ gcc -Wpedantic main.c c-part.c cpp-part.cpp

$ ./a.exe
OMG C != C++

However, with Clang, C and C++ behave identically and as expected:

$ clang -Wpedantic main.c c-part.c cpp-part.cpp

$ ./a.exe
C == C++

With Visual Studio, I get the same result as with Clang:

C:\Users\user\Documents>cl main.c c-part.c cpp-part.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.00.24234.1 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

main.c
c-part.c
Generating Code...
Compiling...
cpp-part.cpp
Generating Code...
Microsoft (R) Incremental Linker Version 14.00.24234.1
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:main.exe
main.obj
c-part.obj
cpp-part.obj

C:\Users\user\Documents>main.exe
C == C++

I tried the examples on Windows, even though the original problem with GCC was discovered on Linux.

— Grigory Rechistov

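Bit-fields are notoriously bogus for large widths. I came across similar issues in this question: stackoverflow.com/questions/58846584/...

— chqrlie

@chqrlie I read the C << operator as requiring the truncation.

— Andrew Henle

Please post a stackoverflow.com/help/minimal-reproducible-example . The current code has no main.c and might cause undefined behaviour in a number of ways. IMO it would be clearer to post a single-file MRE that produces different output when compiled with each compiler, because C/C++ interop is not well specified by the standards. Also note that union aliasing causes UB in C++.

— MM

@MM Yes, it slipped when I was posting the question. I have added it now, and I also think that having a small repository with it might be a good idea.

— Grigory Rechistov

@MM "IMO it would be clearer to post a single-file MRE that produces different output when compiled with each compiler." I did not think about that, because I was reducing production code into something smaller, but it should be possible to reformat the reproducer into a single file.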

— Grigory Rechistov

Answers:

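C and C++ treat the types of bit-field members differently.

C 2018 6.7.2.1 10 says:

A bit-field is interpreted as having a signed or unsigned integer type consisting of the specified number of bits…

Observe that this is not specific about the type (it is some integer type), and it does not say the type is the one used to declare the bit-field, as in the uint64_t a : 1; shown in the question. This apparently leaves it open to the implementation to choose the type.

C++ 2017 draft n4659 12.2.4 [class.bit] 1 says, of a bit-field declaration:

…The bit-field attribute is not part of the type of the class member…

This implies that, in a declaration such as uint64_t a : 1;, the : 1 is not part of the type of the class member a, so the type is as if it had been declared uint64_t a;, and thus the type of a is uint64_t.

So it appears that GCC treats a bit-field in C as some integer type 32 bits or narrower if it fits, and a bit-field in C++ as its declared type, and this does not appear to violate the standards.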

— Eric Postpischil

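I read the truncation in C as mandatory per 6.5.7 4 (the C18 wording is similar): "The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type." E1 in this case is a 52-bit bit-field.

— Andrew Henle

@AndrewHenle: I see what you are saying. The type of an n-bit bit-field is "n-bit integer" (neglecting signedness for now). I was interpreting it as: the type of an n-bit bit-field is some integer type, which the implementation chooses. Based solely on the wording in 6.7.2.1 10, I favor your interpretation. But a problem with that is that, given a uint64_t a : 33 set to 2^33−1 in a structure s, then, in a C implementation with 32-bit int, s.a+s.a should yield 2^33−2 due to wrapping, but Clang produces 2^34−2; it apparently treats it as uint64_t.

— Eric Postpischil

@AndrewHenle: (More on the reasoning: in s.a+s.a, the usual arithmetic conversions would not change the type of s.a, since it is wider than unsigned int, so the arithmetic would be done in the 33-bit type.)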

— Eric Postpischil

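"but Clang produces 2^34−2; it apparently treats it as uint64_t." If that is a 64-bit compilation, it seems to make Clang consistent with how GCC treats 64-bit compiles, by not truncating. Does Clang treat 32-bit and 64-bit compiles differently? (And it seems I have just learned another reason to avoid bit-fields...)

— Andrew Henle

@AndrewHenle: Well, old Apple Clang 1.7 produces 2^32−2 (not 2^33−2; it lost a bit!) with both -m32 and -m64, with a warning that the type is a GCC extension. With Apple Clang 11.0, I do not have libraries to run 32-bit code, but the generated assembly shows pushl $3 and pushl $-2 before the call to printf, so I think that is 2^34−2. So Apple Clang does not differ between 32-bit and 64-bit targets, but it did change over time.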

— Eric Postpischil

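Andrew Henle suggested a strict interpretation of the C Standard: the type of a bit-field is a signed or unsigned integer type with exactly the specified width.

Here is a test that supports this interpretation: using the C1x _Generic() construct, I try to determine the type of bit-fields of different widths. I had to define them with the type long long int to avoid warnings when compiling with clang.

Here is the source: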

#include <stdint.h>
#include <stdio.h>

#define typeof(X)  _Generic((X),                         \
                       long double: "long double",       \
                       double: "double",                 \
                       float: "float",                   \
                       unsigned long long int: "unsigned long long int",  \
                       long long int: "long long int",   \
                       unsigned long int: "unsigned long int",  \
                       long int: "long int",             \
                       unsigned int: "unsigned int",     \
                       int: "int",                       \
                       unsigned short: "unsigned short", \
                       short: "short",                   \
                       unsigned char: "unsigned char",   \
                       signed char: "signed char",       \
                       char: "char",                     \
                       _Bool: "_Bool",                   \
                       __int128_t: "__int128_t",         \
                       __uint128_t: "__uint128_t",       \
                       default: "other")

#define stype long long int
#define utype unsigned long long int

struct s {
    stype s1 : 1;
    stype s2 : 2;
    stype s3 : 3;
    stype s4 : 4;
    stype s5 : 5;
    stype s6 : 6;
    stype s7 : 7;
    stype s8 : 8;
    stype s9 : 9;
    stype s10 : 10;
    stype s11 : 11;
    stype s12 : 12;
    stype s13 : 13;
    stype s14 : 14;
    stype s15 : 15;
    stype s16 : 16;
    stype s17 : 17;
    stype s18 : 18;
    stype s19 : 19;
    stype s20 : 20;
    stype s21 : 21;
    stype s22 : 22;
    stype s23 : 23;
    stype s24 : 24;
    stype s25 : 25;
    stype s26 : 26;
    stype s27 : 27;
    stype s28 : 28;
    stype s29 : 29;
    stype s30 : 30;
    stype s31 : 31;
    stype s32 : 32;
    stype s33 : 33;
    stype s34 : 34;
    stype s35 : 35;
    stype s36 : 36;
    stype s37 : 37;
    stype s38 : 38;
    stype s39 : 39;
    stype s40 : 40;
    stype s41 : 41;
    stype s42 : 42;
    stype s43 : 43;
    stype s44 : 44;
    stype s45 : 45;
    stype s46 : 46;
    stype s47 : 47;
    stype s48 : 48;
    stype s49 : 49;
    stype s50 : 50;
    stype s51 : 51;
    stype s52 : 52;
    stype s53 : 53;
    stype s54 : 54;
    stype s55 : 55;
    stype s56 : 56;
    stype s57 : 57;
    stype s58 : 58;
    stype s59 : 59;
    stype s60 : 60;
    stype s61 : 61;
    stype s62 : 62;
    stype s63 : 63;
    stype s64 : 64;

    utype u1 : 1;
    utype u2 : 2;
    utype u3 : 3;
    utype u4 : 4;
    utype u5 : 5;
    utype u6 : 6;
    utype u7 : 7;
    utype u8 : 8;
    utype u9 : 9;
    utype u10 : 10;
    utype u11 : 11;
    utype u12 : 12;
    utype u13 : 13;
    utype u14 : 14;
    utype u15 : 15;
    utype u16 : 16;
    utype u17 : 17;
    utype u18 : 18;
    utype u19 : 19;
    utype u20 : 20;
    utype u21 : 21;
    utype u22 : 22;
    utype u23 : 23;
    utype u24 : 24;
    utype u25 : 25;
    utype u26 : 26;
    utype u27 : 27;
    utype u28 : 28;
    utype u29 : 29;
    utype u30 : 30;
    utype u31 : 31;
    utype u32 : 32;
    utype u33 : 33;
    utype u34 : 34;
    utype u35 : 35;
    utype u36 : 36;
    utype u37 : 37;
    utype u38 : 38;
    utype u39 : 39;
    utype u40 : 40;
    utype u41 : 41;
    utype u42 : 42;
    utype u43 : 43;
    utype u44 : 44;
    utype u45 : 45;
    utype u46 : 46;
    utype u47 : 47;
    utype u48 : 48;
    utype u49 : 49;
    utype u50 : 50;
    utype u51 : 51;
    utype u52 : 52;
    utype u53 : 53;
    utype u54 : 54;
    utype u55 : 55;
    utype u56 : 56;
    utype u57 : 57;
    utype u58 : 58;
    utype u59 : 59;
    utype u60 : 60;
    utype u61 : 61;
    utype u62 : 62;
    utype u63 : 63;
    utype u64 : 64;
} x;

int main(void) {
#define X(v)  printf("typeof(" #v "): %s\n", typeof(v))
    X(x.s1);
    X(x.s2);
    X(x.s3);
    X(x.s4);
    X(x.s5);
    X(x.s6);
    X(x.s7);
    X(x.s8);
    X(x.s9);
    X(x.s10);
    X(x.s11);
    X(x.s12);
    X(x.s13);
    X(x.s14);
    X(x.s15);
    X(x.s16);
    X(x.s17);
    X(x.s18);
    X(x.s19);
    X(x.s20);
    X(x.s21);
    X(x.s22);
    X(x.s23);
    X(x.s24);
    X(x.s25);
    X(x.s26);
    X(x.s27);
    X(x.s28);
    X(x.s29);
    X(x.s30);
    X(x.s31);
    X(x.s32);
    X(x.s33);
    X(x.s34);
    X(x.s35);
    X(x.s36);
    X(x.s37);
    X(x.s38);
    X(x.s39);
    X(x.s40);
    X(x.s41);
    X(x.s42);
    X(x.s43);
    X(x.s44);
    X(x.s45);
    X(x.s46);
    X(x.s47);
    X(x.s48);
    X(x.s49);
    X(x.s50);
    X(x.s51);
    X(x.s52);
    X(x.s53);
    X(x.s54);
    X(x.s55);
    X(x.s56);
    X(x.s57);
    X(x.s58);
    X(x.s59);
    X(x.s60);
    X(x.s61);
    X(x.s62);
    X(x.s63);
    X(x.s64);

    X(x.u1);
    X(x.u2);
    X(x.u3);
    X(x.u4);
    X(x.u5);
    X(x.u6);
    X(x.u7);
    X(x.u8);
    X(x.u9);
    X(x.u10);
    X(x.u11);
    X(x.u12);
    X(x.u13);
    X(x.u14);
    X(x.u15);
    X(x.u16);
    X(x.u17);
    X(x.u18);
    X(x.u19);
    X(x.u20);
    X(x.u21);
    X(x.u22);
    X(x.u23);
    X(x.u24);
    X(x.u25);
    X(x.u26);
    X(x.u27);
    X(x.u28);
    X(x.u29);
    X(x.u30);
    X(x.u31);
    X(x.u32);
    X(x.u33);
    X(x.u34);
    X(x.u35);
    X(x.u36);
    X(x.u37);
    X(x.u38);
    X(x.u39);
    X(x.u40);
    X(x.u41);
    X(x.u42);
    X(x.u43);
    X(x.u44);
    X(x.u45);
    X(x.u46);
    X(x.u47);
    X(x.u48);
    X(x.u49);
    X(x.u50);
    X(x.u51);
    X(x.u52);
    X(x.u53);
    X(x.u54);
    X(x.u55);
    X(x.u56);
    X(x.u57);
    X(x.u58);
    X(x.u59);
    X(x.u60);
    X(x.u61);
    X(x.u62);
    X(x.u63);
    X(x.u64);

    return 0;
}

Here is the program's output when compiled with 64-bit clang:

typeof(x.s1): long long int
typeof(x.s2): long long int
typeof(x.s3): long long int
typeof(x.s4): long long int
typeof(x.s5): long long int
typeof(x.s6): long long int
typeof(x.s7): long long int
typeof(x.s8): long long int
typeof(x.s9): long long int
typeof(x.s10): long long int
typeof(x.s11): long long int
typeof(x.s12): long long int
typeof(x.s13): long long int
typeof(x.s14): long long int
typeof(x.s15): long long int
typeof(x.s16): long long int
typeof(x.s17): long long int
typeof(x.s18): long long int
typeof(x.s19): long long int
typeof(x.s20): long long int
typeof(x.s21): long long int
typeof(x.s22): long long int
typeof(x.s23): long long int
typeof(x.s24): long long int
typeof(x.s25): long long int
typeof(x.s26): long long int
typeof(x.s27): long long int
typeof(x.s28): long long int
typeof(x.s29): long long int
typeof(x.s30): long long int
typeof(x.s31): long long int
typeof(x.s32): long long int
typeof(x.s33): long long int
typeof(x.s34): long long int
typeof(x.s35): long long int
typeof(x.s36): long long int
typeof(x.s37): long long int
typeof(x.s38): long long int
typeof(x.s39): long long int
typeof(x.s40): long long int
typeof(x.s41): long long int
typeof(x.s42): long long int
typeof(x.s43): long long int
typeof(x.s44): long long int
typeof(x.s45): long long int
typeof(x.s46): long long int
typeof(x.s47): long long int
typeof(x.s48): long long int
typeof(x.s49): long long int
typeof(x.s50): long long int
typeof(x.s51): long long int
typeof(x.s52): long long int
typeof(x.s53): long long int
typeof(x.s54): long long int
typeof(x.s55): long long int
typeof(x.s56): long long int
typeof(x.s57): long long int
typeof(x.s58): long long int
typeof(x.s59): long long int
typeof(x.s60): long long int
typeof(x.s61): long long int
typeof(x.s62): long long int
typeof(x.s63): long long int
typeof(x.s64): long long int
typeof(x.u1): unsigned long long int
typeof(x.u2): unsigned long long int
typeof(x.u3): unsigned long long int
typeof(x.u4): unsigned long long int
typeof(x.u5): unsigned long long int
typeof(x.u6): unsigned long long int
typeof(x.u7): unsigned long long int
typeof(x.u8): unsigned long long int
typeof(x.u9): unsigned long long int
typeof(x.u10): unsigned long long int
typeof(x.u11): unsigned long long int
typeof(x.u12): unsigned long long int
typeof(x.u13): unsigned long long int
typeof(x.u14): unsigned long long int
typeof(x.u15): unsigned long long int
typeof(x.u16): unsigned long long int
typeof(x.u17): unsigned long long int
typeof(x.u18): unsigned long long int
typeof(x.u19): unsigned long long int
typeof(x.u20): unsigned long long int
typeof(x.u21): unsigned long long int
typeof(x.u22): unsigned long long int
typeof(x.u23): unsigned long long int
typeof(x.u24): unsigned long long int
typeof(x.u25): unsigned long long int
typeof(x.u26): unsigned long long int
typeof(x.u27): unsigned long long int
typeof(x.u28): unsigned long long int
typeof(x.u29): unsigned long long int
typeof(x.u30): unsigned long long int
typeof(x.u31): unsigned long long int
typeof(x.u32): unsigned long long int
typeof(x.u33): unsigned long long int
typeof(x.u34): unsigned long long int
typeof(x.u35): unsigned long long int
typeof(x.u36): unsigned long long int
typeof(x.u37): unsigned long long int
typeof(x.u38): unsigned long long int
typeof(x.u39): unsigned long long int
typeof(x.u40): unsigned long long int
typeof(x.u41): unsigned long long int
typeof(x.u42): unsigned long long int
typeof(x.u43): unsigned long long int
typeof(x.u44): unsigned long long int
typeof(x.u45): unsigned long long int
typeof(x.u46): unsigned long long int
typeof(x.u47): unsigned long long int
typeof(x.u48): unsigned long long int
typeof(x.u49): unsigned long long int
typeof(x.u50): unsigned long long int
typeof(x.u51): unsigned long long int
typeof(x.u52): unsigned long long int
typeof(x.u53): unsigned long long int
typeof(x.u54): unsigned long long int
typeof(x.u55): unsigned long long int
typeof(x.u56): unsigned long long int
typeof(x.u57): unsigned long long int
typeof(x.u58): unsigned long long int
typeof(x.u59): unsigned long long int
typeof(x.u60): unsigned long long int
typeof(x.u61): unsigned long long int
typeof(x.u62): unsigned long long int
typeof(x.u63): unsigned long long int
typeof(x.u64): unsigned long long int

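All bit-fields seem to have the declared type rather than a type specific to the declared width.

Here is the program's output when compiled with 64-bit gcc: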

typestr(x.s1): other
typestr(x.s2): other
typestr(x.s3): other
typestr(x.s4): other
typestr(x.s5): other
typestr(x.s6): other
typestr(x.s7): other
typestr(x.s8): signed char
typestr(x.s9): other
typestr(x.s10): other
typestr(x.s11): other
typestr(x.s12): other
typestr(x.s13): other
typestr(x.s14): other
typestr(x.s15): other
typestr(x.s16): short
typestr(x.s17): other
typestr(x.s18): other
typestr(x.s19): other
typestr(x.s20): other
typestr(x.s21): other
typestr(x.s22): other
typestr(x.s23): other
typestr(x.s24): other
typestr(x.s25): other
typestr(x.s26): other
typestr(x.s27): other
typestr(x.s28): other
typestr(x.s29): other
typestr(x.s30): other
typestr(x.s31): other
typestr(x.s32): int
typestr(x.s33): other
typestr(x.s34): other
typestr(x.s35): other
typestr(x.s36): other
typestr(x.s37): other
typestr(x.s38): other
typestr(x.s39): other
typestr(x.s40): other
typestr(x.s41): other
typestr(x.s42): other
typestr(x.s43): other
typestr(x.s44): other
typestr(x.s45): other
typestr(x.s46): other
typestr(x.s47): other
typestr(x.s48): other
typestr(x.s49): other
typestr(x.s50): other
typestr(x.s51): other
typestr(x.s52): other
typestr(x.s53): other
typestr(x.s54): other
typestr(x.s55): other
typestr(x.s56): other
typestr(x.s57): other
typestr(x.s58): other
typestr(x.s59): other
typestr(x.s60): other
typestr(x.s61): other
typestr(x.s62): other
typestr(x.s63): other
typestr(x.s64): long long int
typestr(x.u1): other
typestr(x.u2): other
typestr(x.u3): other
typestr(x.u4): other
typestr(x.u5): other
typestr(x.u6): other
typestr(x.u7): other
typestr(x.u8): unsigned char
typestr(x.u9): other
typestr(x.u10): other
typestr(x.u11): other
typestr(x.u12): other
typestr(x.u13): other
typestr(x.u14): other
typestr(x.u15): other
typestr(x.u16): unsigned short
typestr(x.u17): other
typestr(x.u18): other
typestr(x.u19): other
typestr(x.u20): other
typestr(x.u21): other
typestr(x.u22): other
typestr(x.u23): other
typestr(x.u24): other
typestr(x.u25): other
typestr(x.u26): other
typestr(x.u27): other
typestr(x.u28): other
typestr(x.u29): other
typestr(x.u30): other
typestr(x.u31): other
typestr(x.u32): unsigned int
typestr(x.u33): other
typestr(x.u34): other
typestr(x.u35): other
typestr(x.u36): other
typestr(x.u37): other
typestr(x.u38): other
typestr(x.u39): other
typestr(x.u40): other
typestr(x.u41): other
typestr(x.u42): other
typestr(x.u43): other
typestr(x.u44): other
typestr(x.u45): other
typestr(x.u46): other
typestr(x.u47): other
typestr(x.u48): other
typestr(x.u49): other
typestr(x.u50): other
typestr(x.u51): other
typestr(x.u52): other
typestr(x.u53): other
typestr(x.u54): other
typestr(x.u55): other
typestr(x.u56): other
typestr(x.u57): other
typestr(x.u58): other
typestr(x.u59): other
typestr(x.u60): other
typestr(x.u61): other
typestr(x.u62): other
typestr(x.u63): other
typestr(x.u64): unsigned long long int

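This is consistent with each width having a different type.

The expression E1 << E2 has the type of the promoted left operand, so any width less than INT_WIDTH is promoted to int by the integer promotions, and any width greater than INT_WIDTH is left alone. The result of the expression should indeed be truncated to the width of the bit-field if that width is greater than INT_WIDTH. More precisely, it should be truncated for an unsigned type, and it might be implementation-defined for signed types.

The same should happen for E1 + E2 and other arithmetic operators if E1 or E2 is a bit-field with a width greater than that of int. The operand with the smaller width is converted to the type with the larger width, and the result has that type too. This very counter-intuitive behavior, which causes many unexpected results, may be the reason for the widespread belief that bit-fields are bogus and should be avoided.

Many compilers do not seem to follow this interpretation of the C Standard, nor is this interpretation obvious from the current wording. It would be useful to clarify the semantics of arithmetic operations involving bit-field operands in a future version of the C Standard.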

— chqrlie

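I think the key term is "integer promotions". The discussion of bit-fields with integer promotions (C11 §6.3.1.1: "If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions."; see also §6.3.1.8 and §6.7.2.1) does not cover the case where a bit-field is wider than an int.

— Jonathan Leffler

It does not help that the standard leaves undefined (at best implementation-defined) which types are allowed for bit-fields other than int, unsigned int and _Bool.

— Jonathan Leffler

"any width less than 32", "any width greater than 32", and "if this width is greater than 32" should presumably reflect the number of bits in plain int, rather than a fixed 32.

— Ben Voigt

I agree that there is a problem (of oversight) in the C standard. There might be room to argue that, since the standard does not sanction the use of uint64_t bit-fields, it does not have to say anything about them; that should be covered by the implementation's documentation of the implementation-defined parts of the behaviour of bit-fields. In particular, just because the 52 bits of the bit-field do not fit in a (32-bit) int, it should not mean that they are squeezed into a 32-bit unsigned int, but that is what a literal reading of 6.3.1.1 says.

— Jonathan Leffler

Also, if C++ has resolved the "big bit-field" issues explicitly, then C should follow that lead as closely as it can, unless there is something inherently C++-specific about that resolution (which is not likely).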

— Jonathan Leffler

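The problem seems to be specific to gcc's 32-bit code generator in C mode:

You can compare the assembly code using Godbolt's Compiler Explorer.

Here is the source code for this test: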

#include <stdint.h>

typedef union control {
    uint64_t q;
    struct {
        uint64_t a: 1;
        uint64_t b: 1;
        uint64_t c: 1;
        uint64_t d: 1;
        uint64_t e: 1;
        uint64_t f: 1;
        uint64_t g: 4;
        uint64_t h: 1;
        uint64_t i: 1;
        uint64_t p52: 52;
    } b;
} control_t;

uint64_t test(control_t ctl) {
    return ctl.b.p52 << 12;
}

The output in C mode (flags -xc -O2 -m32):

test:
        push    esi
        push    ebx
        mov     ebx, DWORD PTR [esp+16]
        mov     ecx, DWORD PTR [esp+12]
        mov     esi, ebx
        shr     ebx, 12
        shr     ecx, 12
        sal     esi, 20
        mov     edx, ebx
        pop     ebx
        or      esi, ecx
        mov     eax, esi
        shld    edx, esi, 12
        pop     esi
        sal     eax, 12
        and     edx, 1048575
        ret

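The problem is the last instruction before ret, and edx, 1048575, which clips the 12 most significant bits.

The output in C++ mode is identical except for that last instruction: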

test(control):
        push    esi
        push    ebx
        mov     ebx, DWORD PTR [esp+16]
        mov     ecx, DWORD PTR [esp+12]
        mov     esi, ebx
        shr     ebx, 12
        shr     ecx, 12
        sal     esi, 20
        mov     edx, ebx
        pop     ebx
        or      esi, ecx
        mov     eax, esi
        shld    edx, esi, 12
        pop     esi
        sal     eax, 12
        ret

The output in 64-bit mode is much simpler and correct, yet it differs between the C and C++ compilers:

# C code:
test:
        movabs  rax, 4503599627366400
        and     rax, rdi
        ret

# C++ code:
test(control):
        mov     rax, rdi
        and     rax, -4096
        ret

You should file a bug report on the gcc bug tracker.

— chqrlie

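My experiments were only for 64-bit targets, but your 32-bit case is even more bizarre. I guess a bug report is due. First, I need to recheck it on the latest available GCC version.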

— Grigory Rechistov

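@GrigoryRechistov Given the wording in the C standard, the bug may well be the 64-bit target failing to truncate the result to 52 bits. I personally would view it that way.

— Andrew Henle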