Inline version of a function returns a different value than the non-inline version

Question 1

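How can two versions of the same function, differing only in that one is inline and the other is not, return different values? Here is some code I wrote today, and I am not sure how it works.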

#include <cmath>
#include <iostream>

bool is_cube(double r)
{
    return floor(cbrt(r)) == cbrt(r);
}

bool inline is_cube_inline(double r)
{
    return floor(cbrt(r)) == cbrt(r);
}

int main()
{
    std::cout << (floor(cbrt(27.0)) == cbrt(27.0)) << std::endl;
    std::cout << (is_cube(27.0)) << std::endl;
    std::cout << (is_cube_inline(27.0)) << std::endl;
}

I expected all three outputs to be 1, but the program actually prints this (g++ 8.3.1, no flags):

1
0
1

instead of

1
1
1

Edit: clang++ 7.0.0 outputs this:

0
0
0

and g++ -Ofast outputs this:

1
1
1

Question 2

Explanation

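Some compilers (notably GCC) use higher precision when evaluating expressions at compile time. If an expression depends only on constant inputs and literals, it may be evaluated at compile time even if it is not assigned to a constexpr variable. Whether this happens depends on:

The complexity of the expression
The threshold the compiler uses as a cutoff when attempting compile-time evaluation
Other heuristics used in special cases (such as when clang elides loops)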

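If an expression is written out explicitly, as in the first case, it has lower complexity and the compiler is likely to evaluate it at compile time.

Similarly, if a function is marked inline, the compiler is more likely to evaluate it at compile time, because inline functions raise the threshold at which such evaluation can occur.

Higher optimization levels also raise this threshold, as in the -Ofast example, where all three expressions evaluate to true on gcc because of the higher-precision compile-time evaluation.

We can observe this behavior on Compiler Explorer. When compiled with -O1, only the function marked inline is evaluated at compile time, whereas at -O3 both functions are evaluated at compile time.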

-O1: https://godbolt.org/z/u4gh0g
-O3: https://godbolt.org/z/nVK4So

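Note: in the Compiler Explorer examples I use printf instead of iostream because it reduces the complexity of the main function, which makes the effect easier to see.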
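The effect can also be probed locally by hiding the argument from the optimizer. The following is only a sketch: `is_cube_exact_opaque` is a hypothetical helper that is not part of the original code, and it assumes that reading the value back through a `volatile` object is enough to stop the compiler from folding the `cbrt` call at compile time. Whether the two functions agree for an input such as 27.0 depends on the compiler, the flags and the math library, so no output is claimed here.

```cpp
#include <cmath>

// The exact-equality test from the question. When this is called with a
// literal, the compiler may be able to fold the whole call at compile time.
bool is_cube_exact(double r)
{
    return std::floor(std::cbrt(r)) == std::cbrt(r);
}

// Hypothetical helper: the same test, but the argument is read back through
// a volatile object, so the compiler has to treat it as unknown and call the
// run-time math library.
bool is_cube_exact_opaque(double r)
{
    volatile double opaque = r;
    const double value = opaque;
    return std::floor(std::cbrt(value)) == std::cbrt(value);
}
```

Printing `is_cube_exact(27.0)` next to `is_cube_exact_opaque(27.0)` at -O0, -O1 and -O3 should show whether constant folding is what changes the answer on a given toolchain.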

Demonstrating that `inline` does not affect run-time evaluation

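We can make sure that none of the expressions is evaluated at compile time by reading the value from standard input. When we do this, all three expressions return false, as demonstrated here: https://ideone.com/QZbv6X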

#include <cmath>
#include <iostream>

bool is_cube(double r)
{
    return floor(cbrt(r)) == cbrt(r);
}
 
bool inline is_cube_inline(double r)
{
    return floor(cbrt(r)) == cbrt(r);
}

int main()
{
    double value;
    std::cin >> value;
    std::cout << (floor(cbrt(value)) == cbrt(value)) << std::endl; // false
    std::cout << (is_cube(value)) << std::endl; // false
    std::cout << (is_cube_inline(value)) << std::endl; // false
}

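Contrast this with the case where we use the same compiler settings but provide the value at compile time, which results in the higher-precision compile-time evaluation.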
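To see why the run-time comparison fails, it helps to print the cube root with enough digits to expose a difference in the last bit. The sketch below assumes two hypothetical helpers, `format_full` and `runtime_cbrt_text`, neither of which appears in the original answer. It prints 17 significant digits, which is enough to tell any two distinct double values apart.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// Hypothetical helper: format a double with 17 significant digits.
std::string format_full(double x)
{
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.17g", x);
    return std::string(buffer);
}

// Hypothetical helper: the cube root as the run-time math library computes it.
// The volatile read keeps the compiler from folding the call at compile time.
std::string runtime_cbrt_text(double r)
{
    volatile double opaque = r;
    const double value = opaque;
    return format_full(std::cbrt(value));
}
```

If `runtime_cbrt_text(27.0)` prints anything other than 3, the library result is not exactly 3, and `floor(cbrt(r)) == cbrt(r)` is false at run time.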

Question 3

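As observed, comparing floating-point values with the == operator has produced different outputs with different compilers and at different optimization levels.

A good way to compare floating-point values is the relative tolerance test outlined in the article "Floating-point tolerances revisited".

We first calculate the Epsilon (relative tolerance) value, which in this case is: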

double Epsilon = std::max(std::cbrt(r), std::floor(std::cbrt(r))) * std::numeric_limits<double>::epsilon();

and then use it in both the inline and the non-inline function in this manner:

return (std::fabs(std::round(std::cbrt(r)) - std::cbrt(r)) < Epsilon);

The functions now are:

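#include <algorithm>
#include <cmath>
#include <limits>

bool is_cube(double r)
{
    // Relative tolerance: machine epsilon scaled to the magnitude of the cube root.
    double Epsilon = std::max(std::cbrt(r), std::floor(std::cbrt(r))) * std::numeric_limits<double>::epsilon();
    // Compare against the nearest integer, so a root slightly below a whole number also passes.
    return (std::fabs(std::round(std::cbrt(r)) - std::cbrt(r)) < Epsilon);
}

bool inline is_cube_inline(double r)
{
    // Same test as is_cube; only the inline specifier differs.
    double Epsilon = std::max(std::cbrt(r), std::floor(std::cbrt(r))) * std::numeric_limits<double>::epsilon();
    return (std::fabs(std::round(std::cbrt(r)) - std::cbrt(r)) < Epsilon);
}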

The output is now as expected ([1 1 1]) with different compilers and at different optimization levels.

Live demo
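For reference, here is a minimal self-contained sketch of the tolerance test. The name `is_cube_tolerant` is hypothetical, and the function makes three assumptions of its own: it compares against `std::round` rather than `std::floor`, it takes absolute values so that negative inputs get a positive tolerance, and it uses `<=` so that zero passes. It also assumes the library's `cbrt` is accurate to within about one unit in the last place.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical variant of the relative tolerance test for perfect cubes.
bool is_cube_tolerant(double r)
{
    const double root = std::cbrt(r);
    const double nearest = std::round(root);
    // Scale machine epsilon by the larger magnitude of the two values compared.
    const double tolerance = std::max(std::fabs(root), std::fabs(nearest))
                             * std::numeric_limits<double>::epsilon();
    return std::fabs(nearest - root) <= tolerance;
}
```

With this version, 27.0 and -27.0 are accepted while 28.0 is rejected, regardless of whether the call is folded at compile time.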
