How to calculate the execution time of a code snippet in C++

121

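I have to compute the execution time of a C++ code snippet in seconds. It must work on both Windows and Unix machines.

I use the following code to do this (the required headers are included beforehand).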

clock_t startTime = clock();
// some code here
// to compute its execution duration in runtime
cout << double( clock() - startTime ) / (double)CLOCKS_PER_SEC<< " seconds." << endl;

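However, for small inputs or short statements such as a = a + 1, I get "0 seconds" as the result. I think it must be something like 0.0000001 seconds.

I remember that System.nanoTime() in Java works pretty well in this case. However, I can't get the same exact functionality from the clock() function of C++.

Do you have a solution?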

c++ benchmarking

— AhmetB-Google

29

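Keep in mind that any comparison based on time differences may be inaccurate, because the operating system may not run your thread from start to finish. It may interrupt it and run other threads interleaved with yours, which has a significant impact on the actual time taken to complete your operation. You can run the code multiple times and average the results, and you can minimize the number of other running processes. But none of this eliminates the effect of thread suspension entirely.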

— Mordachai

14

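Mordachai, why would you want to eliminate it? You want to see how your function performs in a real-world environment, not in a magic realm where threads never get interrupted. As long as you run it several times and take the average, it will be very accurate.

— Thomas Bonini, 2009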

Yes, I run it several times and average the results.

— AhmetB-Google

14

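Andreas, Mordachai's comment is relevant if the OP wants to compare the performance of his code with other algorithms. For example, if he runs several clock tests this afternoon and then tests a different algorithm tomorrow morning, his comparison may not be reliable, because he may be sharing resources with many more processes in the afternoon than in the morning. Or perhaps one set of code causes the OS to give it less processing time. There are many reasons why this kind of performance measurement is unreliable if he wants to perform a time-based comparison.

— weberc2, 2012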

4

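@Mordachai I know I am replying to an old comment, but for anyone who stumbles on this as I did: to time the performance of an algorithm you want the minimum of several runs, not the average. The minimum is the run with the fewest interruptions by the OS, so it is the one that mostly times your own code.

— Baruch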
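
The "minimum of several runs" idea from the comment above can be sketched as follows. This is only an illustration: `min_seconds` is a hypothetical helper name, and `std::chrono::steady_clock` is assumed as the time source.

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Hypothetical helper: runs `work` `runs` times and returns the fastest
// wall-clock time in seconds, as measured by std::chrono::steady_clock.
template <typename F>
double min_seconds(F&& work, int runs) {
    using clock = std::chrono::steady_clock;
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < runs; ++r) {
        const auto t0 = clock::now();
        work();
        const auto t1 = clock::now();
        best = std::min(best, std::chrono::duration<double>(t1 - t0).count());
    }
    return best;  // stays at infinity if runs <= 0
}
```

A call might look like `double s = min_seconds([] { /* code under test */ }, 20);`. Whether the minimum or the average is the better summary depends on what you want to learn, as the comments above discuss.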

115

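You can use this function I wrote. You call GetTimeMs64(), and it returns the number of milliseconds elapsed since the UNIX epoch using the system clock, just like time(NULL), except in milliseconds.

It works on both Windows and Linux, and it is thread safe.

Note that the granularity is 15 ms on Windows; on Linux it is implementation dependent, but it is usually 15 ms as well.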

#ifdef _WIN32
#include <Windows.h>
#else
#include <sys/time.h>
#include <ctime>
#endif

/* Remove if already defined */
typedef long long int64; typedef unsigned long long uint64;

/* Returns the amount of milliseconds elapsed since the UNIX epoch. Works on both
 * windows and linux. */

uint64 GetTimeMs64()
{
#ifdef _WIN32
 /* Windows */
 FILETIME ft;
 LARGE_INTEGER li;

 /* Get the amount of 100 nano seconds intervals elapsed since January 1, 1601 (UTC) and copy it
  * to a LARGE_INTEGER structure. */
 GetSystemTimeAsFileTime(&ft);
 li.LowPart = ft.dwLowDateTime;
 li.HighPart = ft.dwHighDateTime;

 uint64 ret = li.QuadPart;
 ret -= 116444736000000000LL; /* Convert from file time to UNIX epoch time. */
 ret /= 10000; /* From 100 nano seconds (10^-7) to 1 millisecond (10^-3) intervals */

 return ret;
#else
 /* Linux */
 struct timeval tv;

 gettimeofday(&tv, NULL);

 uint64 ret = tv.tv_usec;
 /* Convert from micro seconds (10^-6) to milliseconds (10^-3) */
 ret /= 1000;

 /* Adds the seconds (10^0) after converting them to milliseconds (10^-3) */
 ret += (tv.tv_sec * 1000);

 return ret;
#endif
}

— Thomas Bonini
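
The granularity note in the answer above can be checked on a given machine rather than assumed. The following is a rough sketch, not part of the original answer: `observed_tick_seconds` is a hypothetical helper that spins until a clock's reading changes and reports the size of that step. The result also includes the cost of calling the clock, so treat it as an upper bound on the tick size.

```cpp
#include <chrono>

// Hypothetical helper: spins until Clock::now() changes and returns the
// size of that step in seconds, i.e. the smallest increment observed.
template <typename Clock>
double observed_tick_seconds() {
    const auto t0 = Clock::now();
    auto t1 = Clock::now();
    while (t1 == t0) {
        t1 = Clock::now();
    }
    return std::chrono::duration<double>(t1 - t0).count();
}
```

For example, `observed_tick_seconds<std::chrono::system_clock>()` and `observed_tick_seconds<std::chrono::steady_clock>()` can be compared on the target platform.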

1

For future reference: I just throw it into a header file and use it. Glad to have it.

— 丹尼尔·汉托霍, Feb 24 '16

1

I believe the gettimeofday method can give unintended results if the system clock is changed. If that would be a problem for you, you may want to look at clock_gettime instead.

— Azmisov '16
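
A minimal POSIX-only sketch of what the comment above suggests: reading `CLOCK_MONOTONIC` through `clock_gettime`, which is not affected by changes to the system's wall-clock time. The helper name `monotonic_seconds` is hypothetical, and on older glibc versions linking may additionally need `-lrt`.

```cpp
#include <time.h>  // POSIX: clock_gettime, CLOCK_MONOTONIC

// Hypothetical helper: seconds from a monotonic clock with an unspecified
// starting point; only differences between two calls are meaningful.
double monotonic_seconds() {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<double>(ts.tv_sec) + ts.tv_nsec / 1e9;
}
```

Elapsed time is then `monotonic_seconds()` after the code minus `monotonic_seconds()` before it.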

Does this method have any advantage over GetTickCount on Windows?

— MicroVirus '16

Does not compile with gcc -std=c99

— -Assimilater

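@MicroVirus: Yes. GetTickCount is the time elapsed since the system was started, while my function returns the time since the UNIX epoch, which means you can use it for dates and times. If you are only interested in the time elapsed between two events, mine is still the better choice because it is an int64. GetTickCount is an int32 and overflows every 50 days, which means you can get weird results if the two events you recorded fall on either side of an overflow.

— Thomas Bonini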
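
The arithmetic behind the "50 days" figure in the comment above can be written out as below. This is an illustrative sketch with hypothetical names (`wrap_days`, `elapsed_ms`). It also shows one detail worth knowing: if the difference is taken in unsigned 32-bit arithmetic, a single wrap between the two readings still gives the right answer, while two or more wraps cannot be detected.

```cpp
#include <cstdint>

// Days until a 32-bit millisecond counter wraps around: 2^32 ms.
constexpr double wrap_days = 4294967296.0 / 1000.0 / 60.0 / 60.0 / 24.0;  // about 49.7

// Elapsed milliseconds between two readings of a 32-bit tick counter.
// Unsigned subtraction is modulo 2^32, so one wrap between the readings
// is handled correctly; longer intervals are silently truncated.
std::uint32_t elapsed_ms(std::uint32_t start, std::uint32_t end) {
    return end - start;
}
```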

43

I have another working example that uses microseconds (UNIX, POSIX, etc.).

    #include <sys/time.h>
    typedef unsigned long long timestamp_t;

    static timestamp_t
    get_timestamp ()
    {
      struct timeval now;
      gettimeofday (&now, NULL);
      return  now.tv_usec + (timestamp_t)now.tv_sec * 1000000;
    }

    ...
    timestamp_t t0 = get_timestamp();
    // Process
    timestamp_t t1 = get_timestamp();

    double secs = (t1 - t0) / 1000000.0L;

Here is the file where we coded this:

https://github.com/arhuaco/junkcode/blob/master/emqbit-bench/bench.c

— Arhuaco

5

You should add #include <sys/time.h> at the beginning of your example.

— niekas, 2015

40

Here is a simple solution in C++11 that gives you satisfying resolution.

#include <iostream>
#include <chrono>

class Timer
{
public:
    Timer() : beg_(clock_::now()) {}
    void reset() { beg_ = clock_::now(); }
    double elapsed() const { 
        return std::chrono::duration_cast<second_>
            (clock_::now() - beg_).count(); }

private:
    typedef std::chrono::high_resolution_clock clock_;
    typedef std::chrono::duration<double, std::ratio<1> > second_;
    std::chrono::time_point<clock_> beg_;
};

Or on *nix, for C++03:

#include <iostream>
#include <ctime>

class Timer
{
public:
    Timer() { clock_gettime(CLOCK_REALTIME, &beg_); }

    double elapsed() {
        clock_gettime(CLOCK_REALTIME, &end_);
        return end_.tv_sec - beg_.tv_sec +
            (end_.tv_nsec - beg_.tv_nsec) / 1000000000.;
    }

    void reset() { clock_gettime(CLOCK_REALTIME, &beg_); }

private:
    timespec beg_, end_;
};

Here is the example usage:

int main()
{
    Timer tmr;
    double t = tmr.elapsed();
    std::cout << t << std::endl;

    tmr.reset();
    t = tmr.elapsed();
    std::cout << t << std::endl;

    return 0;
}

From https://gist.github.com/gongzhitaao/7062087

— gongzhitaao

I am getting this error with your C++11 solution: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version GLIBCXX_3.4.19 not found (required by ../cpu_2d/g500)

— user9869932, 2015

@julianromera What platform are you using? Did you install the libstdc++ library and g++?

— gongzhitaao, 2015

It's a Slurm grid running Linux Ubuntu 12. I just got it fixed: I added -static-libstdc++ at the end of the linker command. Thank you for asking, @gongzhitaao.

— user9869932

18

#include <boost/progress.hpp>

using namespace boost;

int main (int argc, const char * argv[])
{
  progress_timer timer;

  // do stuff, preferably in a 100x loop to make it take longer.

  return 0;
}

When progress_timer goes out of scope, it prints the time elapsed since its creation.

UPDATE: Here is a version that works without Boost (tested on macOS/iOS):

#include <chrono>
#include <string>
#include <iostream>
#include <math.h>
#include <unistd.h>

class NLTimerScoped {
private:
    const std::chrono::steady_clock::time_point start;
    const std::string name;

public:
    NLTimerScoped( const std::string & name ) : name( name ), start( std::chrono::steady_clock::now() ) {
    }


    ~NLTimerScoped() {
        const auto end(std::chrono::steady_clock::now());
        const auto duration_ms = std::chrono::duration_cast<std::chrono::milliseconds>( end - start ).count();

        std::cout << name << " duration: " << duration_ms << "ms" << std::endl;
    }

};

int main(int argc, const char * argv[]) {

    {
        NLTimerScoped timer( "sin sum" );

        float a = 0.0f;

        for ( int i=0; i < 1000000; i++ ) {
            a += sin( (float) i / 100 );
        }

        std::cout << "sin sum = " << a << std::endl;
    }



    {
        NLTimerScoped timer( "sleep( 4 )" );

        sleep( 4 );
    }



    return 0;
}

— Tomas Andrle

2

This works, but note that progress_timer is deprecated (since sometime before Boost 1.50); auto_cpu_timer may be more appropriate.

— davidA, 2012

3

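@meowsqueak Hmm, auto_cpu_timer seems to require the Boost system library to be linked, so it is no longer a header-only solution. Too bad... it suddenly makes the other options more appealing.

— Tomas Andrle, 2012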

1

Yes, that's a good point. If you don't already link Boost, it is more trouble than it's worth. But if you already do, it works quite nicely.

— davidA

@meowsqueak Yeah, or for some quick benchmark tests, just get an older version of Boost.

— Tomas Andrle, 2013

@TomasAndrle The link no longer exists.

— Zheng Qu

5

Windows provides the QueryPerformanceCounter() function, and Unix has gettimeofday(). Both functions can measure a difference of at least 1 microsecond.

— Captain Comic

But using windows.h is restrictive. The same compiled source must run on both Windows and Unix. How can this be handled?

— AhmetB-Google, 2009

2

Then look for a wrapper library: stackoverflow.com/questions/1487695/...

— Captain Comic

4

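"The same compiled source" sounds as if you want to run the same binary on both systems, which doesn't seem to be the case. If you meant the same source, then an #ifdef must be OK (and it is, judging from the answer you accepted), and then I don't see the problem: #ifdef WIN32 #include <windows.h> ... #else ... #endif.

— 有人, 2010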
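
The #ifdef approach from the comment above could look roughly like this. It is a sketch under assumptions, not a reference implementation: `now_seconds` is a hypothetical wrapper name, and it simply selects between the two functions named in the answer, QueryPerformanceCounter on Windows and gettimeofday elsewhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/time.h>
#endif

// Hypothetical wrapper: the same source compiles on both platforms, and each
// build uses its native timer. The zero point differs between the branches,
// so only differences between two calls are meaningful.
double now_seconds() {
#ifdef _WIN32
    LARGE_INTEGER freq, count;
    QueryPerformanceFrequency(&freq);  // ticks per second
    QueryPerformanceCounter(&count);   // current tick count
    return static_cast<double>(count.QuadPart) / static_cast<double>(freq.QuadPart);
#else
    timeval tv;
    gettimeofday(&tv, nullptr);        // seconds and microseconds since the epoch
    return static_cast<double>(tv.tv_sec) + tv.tv_usec / 1e6;
#endif
}
```

The calling code stays identical on both systems: `double t0 = now_seconds(); /* work */ double elapsed = now_seconds() - t0;`.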

3

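In some programs I wrote, I used RDTSC for this purpose. RDTSC is not about time but about the number of cycles since the processor started. You have to calibrate it on your system to get a result in seconds, but it is really handy when you want to evaluate performance, and it is even better to use the cycle count directly without trying to convert it back to seconds.

(The link above is to a French Wikipedia page, but it has C++ code samples; the English version is here.)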

— 克里斯
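
A hedged sketch of the two steps described in the answer above: reading the counter and calibrating it against a clock. The names `read_tsc` and `tsc_ticks_per_second` are hypothetical; `__rdtsc` is the compiler intrinsic provided by `<x86intrin.h>` (GCC/Clang) or `<intrin.h>` (MSVC). The non-x86 branch is only an assumption added so that the sketch still compiles elsewhere.

```cpp
#include <chrono>
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
  #if defined(_MSC_VER)
    #include <intrin.h>
  #else
    #include <x86intrin.h>
  #endif
  // Reads the x86 time-stamp counter: a tick count, not a time.
  inline std::uint64_t read_tsc() { return __rdtsc(); }
#else
  // Assumption: on non-x86 targets fall back to steady_clock ticks so the
  // sketch still compiles; this is not a cycle counter.
  inline std::uint64_t read_tsc() {
      return static_cast<std::uint64_t>(
          std::chrono::steady_clock::now().time_since_epoch().count());
  }
#endif

// Calibration: counts how many ticks elapse during roughly 50 ms of wall
// time and scales that to ticks per second.
inline double tsc_ticks_per_second() {
    using clock = std::chrono::steady_clock;
    const auto t0 = clock::now();
    const std::uint64_t c0 = read_tsc();
    while (clock::now() - t0 < std::chrono::milliseconds(50)) {
        // busy-wait
    }
    const std::uint64_t c1 = read_tsc();
    const double secs = std::chrono::duration<double>(clock::now() - t0).count();
    return static_cast<double>(c1 - c0) / secs;
}
```

A tick difference divided by `tsc_ticks_per_second()` gives an estimate in seconds, although, as the answer says, comparing raw tick counts is often enough.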

2

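I suggest using the standard library functions to obtain time information from the system.

If you want finer resolution, perform more iterations. Instead of running the program once and taking a sample, run it 1000 times or more.

— Thomas Matthews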
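
One way to read the advice above is to repeat the code many times between a single pair of clock reads and divide by the repetition count, so that even a coarse clock yields a usable per-call figure. A minimal sketch, assuming `std::chrono::steady_clock` and a hypothetical helper name `mean_seconds_per_call`:

```cpp
#include <chrono>

// Hypothetical helper: times `iterations` back-to-back calls of `work`
// with a single pair of clock reads and returns the mean seconds per call.
// Assumes iterations > 0.
template <typename F>
double mean_seconds_per_call(F&& work, unsigned long iterations) {
    using clock = std::chrono::steady_clock;
    const auto t0 = clock::now();
    for (unsigned long i = 0; i < iterations; ++i) {
        work();
    }
    const auto t1 = clock::now();
    return std::chrono::duration<double>(t1 - t0).count() / iterations;
}
```

Note that an optimizing compiler may remove work whose result is never used, so the code under test should produce an observable effect.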

2

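It is better to run the inner loop several times with the performance timing taken only once, and then average by dividing by the number of inner-loop repetitions, than to run the whole thing (loop + performance timing) several times and average. This reduces the overhead of the timing code relative to the section you are actually profiling.

Wrap your timer calls for the appropriate system. On Windows, QueryPerformanceCounter is pretty fast and "safe" to use.

You can also use "rdtsc" on any modern x86 PC, but there may be issues on some multicore machines (hopping between cores may change the timer) or if you have speed-step of some sort turned on.

— Adisak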

2

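(Windows-specific solution) The current (circa 2017) way to get accurate timings under Windows is to use "QueryPerformanceCounter". This approach has the benefit of giving very accurate results and is recommended by MS. Just plop the code blob into a new console app to get a working sample. There is a lengthy discussion here: Acquiring High-Resolution Time Stamps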

#include <iostream>
#include <tchar.h>
#include <windows.h>

int main()
{
constexpr int MAX_ITER{ 10000 };
constexpr __int64 us_per_hour{ 3600000000ull }; // 3.6e+09
constexpr __int64 us_per_min{ 60000000ull };
constexpr __int64 us_per_sec{ 1000000ull };
constexpr __int64 us_per_ms{ 1000ull };

// easy to work with
__int64 startTick, endTick, ticksPerSecond, totalTicks = 0ull;

QueryPerformanceFrequency((LARGE_INTEGER *)&ticksPerSecond);

for (int iter = 0; iter < MAX_ITER; ++iter) {// start looping
    QueryPerformanceCounter((LARGE_INTEGER *)&startTick); // Get start tick
    // code to be timed
    std::cout << "cur_tick = " << iter << "\n";
    QueryPerformanceCounter((LARGE_INTEGER *)&endTick); // Get end tick
    totalTicks += endTick - startTick; // accumulate time taken
}

// convert to elapsed microseconds
__int64 totalMicroSeconds =  (totalTicks * 1000000ull)/ ticksPerSecond;

__int64 hours = totalMicroSeconds / us_per_hour;
totalMicroSeconds %= us_per_hour;
__int64 minutes = totalMicroSeconds / us_per_min;
totalMicroSeconds %= us_per_min;
__int64 seconds = totalMicroSeconds / us_per_sec;
totalMicroSeconds %= us_per_sec;
__int64 milliseconds = totalMicroSeconds / us_per_ms;
totalMicroSeconds %= us_per_ms;


std::cout << "Total time: " << hours << "h ";
std::cout << minutes << "m " << seconds << "s " << milliseconds << "ms ";
std::cout << totalMicroSeconds << "us\n";

return 0;
}

2

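A completely unfailing solution to thread scheduling, one that should yield exactly the same time for every test, is to compile your program to be OS independent and boot your computer so that the program runs in an OS-free environment. However, this is largely impractical and would be difficult at best.

A good substitute for going OS-free is to set the affinity of the current thread to one core and its priority to the highest. This alternative should give consistent-enough results.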
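
For readers on Linux, the affinity and priority steps just described might be sketched as follows. This is an assumption-laden illustration rather than part of the answer: `pin_to_one_cpu` and `raise_priority` are hypothetical names, `sched_setaffinity` is Linux-specific, and raising the priority normally needs elevated privileges, so both helpers report failure instead of assuming success.

```cpp
#include <sched.h>         // sched_getaffinity, sched_setaffinity (Linux)
#include <sys/resource.h>  // setpriority

// Hypothetical helper, Linux only: restricts the calling thread to the first
// CPU it is currently allowed to run on. Returns false if anything fails.
bool pin_to_one_cpu() {
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return false;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed)) {
            cpu_set_t one;
            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            return sched_setaffinity(0, sizeof(one), &one) == 0;
        }
    }
    return false;
}

// Hypothetical helper: asks for the lowest nice value, i.e. the highest
// ordinary scheduling priority. Usually fails without root privileges.
bool raise_priority() {
    return setpriority(PRIO_PROCESS, 0, -20) == 0;
}
```

Both would be called once before the timed section begins.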

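Also, you should turn off the optimizations that interfere with debugging, which for g++ or gcc means adding -Og to the command line, to prevent the code being tested from being optimized out. The -O0 flag should not be used, because it introduces extra unneeded overhead that would be included in the timing results and would skew the measured speed of the code.

On the other hand, assuming that you use -Ofast (or at the very least -O3) for the final production build, and ignoring the issue of "dead" code elimination, -Og performs very few optimizations compared to -Ofast. Thus -Og can misrepresent the real speed of the code in the final product.

Further, all speed tests perjure to some extent: in a final production build compiled with -Ofast, each snippet, section, or function of code is not isolated. Rather, each snippet flows continuously into the next, which allows the compiler to join, merge, and optimize together pieces of code from all over the place.

At the same time, if you are benchmarking a snippet that makes heavy use of realloc(), the snippet may run slower in a production product with sufficiently high memory fragmentation. Hence the expression "the whole is more than the sum of its parts" applies here: code in the final production build may run noticeably faster or slower than the individual snippet you are speed testing.

A partial solution that may lessen the incongruity is to use -Ofast for speed testing, with the addition of asm volatile("" :: "r"(var)) on the variables involved in the test to prevent dead code and loop elimination.

Here is an example of how to benchmark square root functions on a Windows computer.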

// set USE_ASM_TO_PREVENT_ELIMINATION  to 0 to prevent `asm volatile("" :: "r"(var))`
// set USE_ASM_TO_PREVENT_ELIMINATION  to 1 to enforce `asm volatile("" :: "r"(var))`
#define USE_ASM_TO_PREVENT_ELIMINATION 1

#include <iostream>
#include <iomanip>
#include <cstdio>
#include <chrono>
#include <cmath>
#include <windows.h>
#include <intrin.h>
#pragma intrinsic(__rdtsc)
#include <cstdint>

class Timer {
public:
    Timer() : beg_(clock_::now()) {}
    void reset() { beg_ = clock_::now(); }
    double elapsed() const { 
        return std::chrono::duration_cast<second_>
            (clock_::now() - beg_).count(); }
private:
    typedef std::chrono::high_resolution_clock clock_;
    typedef std::chrono::duration<double, std::ratio<1> > second_;
    std::chrono::time_point<clock_> beg_;
};

unsigned int guess_sqrt32(register unsigned int n) {
    register unsigned int g = 0x8000;
    if(g*g > n) {
        g ^= 0x8000;
    }
    g |= 0x4000;
    if(g*g > n) {
        g ^= 0x4000;
    }
    g |= 0x2000;
    if(g*g > n) {
        g ^= 0x2000;
    }
    g |= 0x1000;
    if(g*g > n) {
        g ^= 0x1000;
    }
    g |= 0x0800;
    if(g*g > n) {
        g ^= 0x0800;
    }
    g |= 0x0400;
    if(g*g > n) {
        g ^= 0x0400;
    }
    g |= 0x0200;
    if(g*g > n) {
        g ^= 0x0200;
    }
    g |= 0x0100;
    if(g*g > n) {
        g ^= 0x0100;
    }
    g |= 0x0080;
    if(g*g > n) {
        g ^= 0x0080;
    }
    g |= 0x0040;
    if(g*g > n) {
        g ^= 0x0040;
    }
    g |= 0x0020;
    if(g*g > n) {
        g ^= 0x0020;
    }
    g |= 0x0010;
    if(g*g > n) {
        g ^= 0x0010;
    }
    g |= 0x0008;
    if(g*g > n) {
        g ^= 0x0008;
    }
    g |= 0x0004;
    if(g*g > n) {
        g ^= 0x0004;
    }
    g |= 0x0002;
    if(g*g > n) {
        g ^= 0x0002;
    }
    g |= 0x0001;
    if(g*g > n) {
        g ^= 0x0001;
    }
    return g;
}

unsigned int empty_function( unsigned int _input ) {
    return _input;
}

unsigned long long empty_ticks=0;
double empty_seconds=0;
Timer my_time;

template<unsigned int benchmark_repetitions>
void benchmark( const char* function_name, auto (*function_to_do)( auto ) ) {
    register unsigned int i=benchmark_repetitions;
    register unsigned long long start=0;
    my_time.reset();
    start=__rdtsc();
    while ( i-- ) {
        auto result = (*function_to_do)( i << 7 );
        #if USE_ASM_TO_PREVENT_ELIMINATION == 1
            asm volatile("" :: "r"(
                // Passing result as a register input forces the compiler to
                //  actually compute it, so the call above cannot be removed
                //  as dead code. The asm statement itself emits nothing.
                result
            ));
        #endif
    }
    if ( function_name == nullptr ) {
        empty_ticks = (__rdtsc()-start);
        empty_seconds = my_time.elapsed();
        std::cout<< "Empty:\n" << empty_ticks
              << " ticks\n" << benchmark_repetitions << " repetitions\n"
               << std::setprecision(15) << empty_seconds
                << " seconds\n\n";
    } else {
        std::cout<< function_name<<":\n" << (__rdtsc()-start-empty_ticks)
              << " ticks\n" << benchmark_repetitions << " repetitions\n"
               << std::setprecision(15) << (my_time.elapsed()-empty_seconds)
                << " seconds\n\n";
    }
}


int main( void ) {
    void* Cur_Thread=   GetCurrentThread();
    void* Cur_Process=  GetCurrentProcess();
    unsigned long long  Current_Affinity;
    unsigned long long  System_Affinity;
    unsigned long long furthest_affinity;
    unsigned long long nearest_affinity;

    if( ! SetThreadPriority(Cur_Thread,THREAD_PRIORITY_TIME_CRITICAL) ) {
        SetThreadPriority( Cur_Thread, THREAD_PRIORITY_HIGHEST );
    }
    if( ! SetPriorityClass(Cur_Process,REALTIME_PRIORITY_CLASS) ) {
        SetPriorityClass( Cur_Process, HIGH_PRIORITY_CLASS );
    }
    GetProcessAffinityMask( Cur_Process, &Current_Affinity, &System_Affinity );
    furthest_affinity = 0x8000000000000000ULL>>__builtin_clzll(Current_Affinity);
    nearest_affinity  = 0x0000000000000001ULL<<__builtin_ctzll(Current_Affinity);
    SetProcessAffinityMask( Cur_Process, furthest_affinity );
    SetThreadAffinityMask( Cur_Thread, furthest_affinity );

    const int repetitions=524288;

    benchmark<repetitions>( nullptr, empty_function );
    // Only guess_sqrt32 is defined in this listing; add one line per function to compare.
    benchmark<repetitions>( "Guess Square Root", guess_sqrt32 );


    SetThreadPriority( Cur_Thread, THREAD_PRIORITY_IDLE );
    SetPriorityClass( Cur_Process, IDLE_PRIORITY_CLASS );
    SetProcessAffinityMask( Cur_Process, nearest_affinity );
    SetThreadAffinityMask( Cur_Thread, nearest_affinity );
    for (;;) { getchar(); }

    return 0;
}

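Also, credit to Mike Jarvis for his Timer.

Please note (this is very important) that if you are going to run bigger code snippets, you really must turn down the number of iterations to prevent your computer from freezing up.

— Jack Giffin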

2

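Good answer except for disabling optimization. Benchmarking -O0 code is a big waste of time, because the overhead of -O0 compared with a normal -O2 or -O3 -march=native varies wildly depending on the code and the workload. For example, extra named temporary variables cost time at -O0. There are other ways to keep things from being optimized away, such as hiding them from the optimizer with volatile, non-inline functions, or empty inline asm statements. -O0 is not even close to usable, because code has different bottlenecks at -O0, not the same ones only worse.

— Peter Cordes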

1

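Ugh, -Og is still not very realistic, depending on the code. At least -O2, preferably -O3, is more realistic. Use asm volatile("" : "+r"(var)) or something similar to make the compiler materialize a value in a register and to defeat constant propagation through it.

— Peter Cordes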
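
The "+r" technique from the comment above can be sketched for GCC and Clang as follows. The helper name `hide_from_optimizer` and the toy function `half` are hypothetical. Note that the operand goes in the output section of the asm statement (after the first colon), because "+r" declares the variable as both read and written.

```cpp
// GCC/Clang only: "+r" marks `value` as both read and written by the (empty)
// asm statement, so the optimizer can no longer assume it knows the value.
// No instruction is emitted for the asm statement itself.
inline void hide_from_optimizer(unsigned int& value) {
    asm volatile("" : "+r"(value));
}

// Toy function standing in for the code under test.
unsigned int half(unsigned int x) { return x / 2; }

unsigned int run_once() {
    unsigned int input = 1000;
    hide_from_optimizer(input);    // the compiler must treat input as unknown
    unsigned int result = half(input);
    hide_from_optimizer(result);   // result must be materialized in a register
    return result;
}
```

Without the first call, a compiler would be free to fold `half(1000)` into a constant at compile time, and the benchmark would measure nothing.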

@PeterCordes Thank you again for your insights. I have updated the content to use -O3 and updated the code snippet to use asm volatile("" : "+r"(var)).

— Jack Giffin

1

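asm volatile("" : "+r"( i )); seems unnecessary. In optimized code there is no reason to force the compiler to materialize i as well as i<<7 inside the loop. You are stopping it from optimizing to tmp -= 128 instead of shifting every time. Using the result of a function call is good, though, if it is non-void, like int result = (*function_to_do)( i << 7 );. You could use an asm statement on that result.

— Peter Cordes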

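@PeterCordes Thank you very much again for your insights. My post now contains the correction for the return value from function_to_do, so that function_to_do can be inlined without being eliminated. Please let me know if you have any further suggestions.

— Jack Giffin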

1

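For cases where you want to time the same stretch of code every time it is executed (e.g. when profiling code that you think might be a bottleneck), here is a wrapper around (a slight modification of) Andreas Bonini's function that I find useful: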

#include <iostream>
#include <string>

#ifdef _WIN32
#include <Windows.h>
#else
#include <sys/time.h>
#endif

/*
 *  A simple timer class to see how long a piece of code takes. 
 *  Usage:
 *
 *  {
 *      static Timer timer("name");
 *
 *      ...
 *
 *      timer.start()
 *      [ The code you want timed ]
 *      timer.stop()
 *
 *      ...
 *  }
 *
 *  At the end of execution, you will get output:
 *
 *  Time for name: XXX seconds
 */
class Timer
{
public:
    Timer(std::string name, bool start_running=false) : 
        _name(name), _accum(0), _running(false)
    {
        if (start_running) start();
    }

    ~Timer() { stop(); report(); }

    void start() {
        if (!_running) {
            _start_time = GetTimeMicroseconds();
            _running = true;
        }
    }
    void stop() {
        if (_running) {
            unsigned long long stop_time = GetTimeMicroseconds();
            _accum += stop_time - _start_time;
            _running = false;
        }
    }
    void report() { 
        std::cout<<"Time for "<<_name<<": " << _accum / 1.e6 << " seconds\n"; 
    }
private:
    // cf. http://stackoverflow.com/questions/1861294/how-to-calculate-execution-time-of-a-code-snippet-in-c
    unsigned long long GetTimeMicroseconds()
    {
#ifdef _WIN32
        /* Windows */
        FILETIME ft;
        LARGE_INTEGER li;

        /* Get the amount of 100 nano seconds intervals elapsed since January 1, 1601 (UTC) and copy it
         *   * to a LARGE_INTEGER structure. */
        GetSystemTimeAsFileTime(&ft);
        li.LowPart = ft.dwLowDateTime;
        li.HighPart = ft.dwHighDateTime;

        unsigned long long ret = li.QuadPart;
        ret -= 116444736000000000LL; /* Convert from file time to UNIX epoch time. */
        ret /= 10; /* From 100 nano seconds (10^-7) to 1 microsecond (10^-6) intervals */
#else
        /* Linux */
        struct timeval tv;

        gettimeofday(&tv, NULL);

        unsigned long long ret = tv.tv_usec;
        /* Adds the seconds (10^0) after converting them to microseconds (10^-6) */
        ret += (tv.tv_sec * 1000000);
#endif
        return ret;
    }
    std::string _name;
    long long _accum;
    unsigned long long _start_time;
    bool _running;
};

— Mike Jarvis

1

Just a simple class that benchmarks a code block:

#include <chrono>
#include <cstdio>

using namespace std::chrono;

class benchmark {
  public:
  time_point<high_resolution_clock>  t0, t1;
  unsigned int *d;
  benchmark(unsigned int *res) : d(res) { 
                 t0 = high_resolution_clock::now();
  }
  ~benchmark() { t1 = high_resolution_clock::now();
                  milliseconds dur = duration_cast<milliseconds>(t1 - t0);
                  *d = dur.count();
  }
};
// simple usage 
// unsigned int t;
// { // put the code in a block
//  benchmark bench(&t);
//  // ...
//  // code to benchmark
// }
// HERE the t contains time in milliseconds

// one way to use it can be :
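// (the two-level concat is needed so that __LINE__ is expanded before pasting)
#define BENCH_CONCAT_(a, b) a##b
#define BENCH_CONCAT(a, b) BENCH_CONCAT_(a, b)
#define BENCH(TITLE,CODEBLOCK) \
  unsigned int BENCH_CONCAT(bench_time_, __LINE__) = 0;  \
  { benchmark bench(&BENCH_CONCAT(bench_time_, __LINE__)); \
      CODEBLOCK \
  } \
  printf("%s took %u ms\n",(TITLE),BENCH_CONCAT(bench_time_, __LINE__));


int main(void) {
  const int testcount = 1000000; // number of loop iterations to time
  BENCH("TITLE",{
    for(int n = 0; n < testcount; n++ )
      int a = n % 3;
  });
  return 0;
}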

— 空曲

0

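boost::timer will probably give you as much accuracy as you need. It is nowhere near accurate enough to tell you how long a = a+1; takes, but what reason would you have to time something that takes a couple of nanoseconds?

— Brendan Long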

It relies on the clock() function from the C++ standard header.

— Petter, 2012
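
Since the comment above points out the reliance on clock(), it may help to see what clock() actually reports. On POSIX systems clock() measures processor time used by the program, not elapsed wall time, so a sleeping program accumulates almost none (the Microsoft C runtime is a known exception and reports elapsed time). The sketch below uses hypothetical names (`Timings`, `time_a_sleep`) and compares clock() against `std::chrono::steady_clock` across a sleep.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

struct Timings {
    double cpu_seconds;   // difference reported by std::clock()
    double wall_seconds;  // difference reported by steady_clock
};

// Hypothetical helper: sleeps for `ms` milliseconds and reports how much
// time std::clock() and steady_clock each say has passed.
Timings time_a_sleep(int ms) {
    const std::clock_t c0 = std::clock();
    const auto w0 = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
    const std::clock_t c1 = std::clock();
    const auto w1 = std::chrono::steady_clock::now();
    return { static_cast<double>(c1 - c0) / CLOCKS_PER_SEC,
             std::chrono::duration<double>(w1 - w0).count() };
}
```

This difference matters when the timed code waits on I/O or on other threads.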

0

I created a lambda that calls your function N times and returns the average.

double c = BENCHMARK_CNT(25, fillVectorDeque(variable));

You can find the C++11 header here.

— 刻录机

0

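I created a simple utility for measuring the performance of blocks of code, using the chrono library's high_resolution_clock: https://github.com/nfergu/codetimer.

Timings can be recorded against different keys, and an aggregated view of the timings for each key can be displayed.

Usage is as follows: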

#include <chrono>
#include <iostream>
#include "codetimer.h"

int main () {
    auto start = std::chrono::high_resolution_clock::now();
    // some code here
    CodeTimer::record("mykey", start);
    CodeTimer::printStats();
    return 0;
}

— Neil

0

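You could also look at cxx-rtimers on GitHub, which provides some header-only routines for gathering statistics on the run time of any code block in which you can create a local variable. The timers come in versions that use std::chrono on C++11, timers from the Boost library, or standard POSIX timer functions. They report the average, maximum, and minimum duration spent within a function, as well as the number of times it was called. They can be used as simply as follows: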

#include <rtimers/cxx11.hpp>

void expensiveFunction() {
    static rtimers::cxx11::DefaultTimer timer("expensive");
    auto scopedStartStop = timer.scopedStart();
    // Do something costly...
}

— w

0

This is how I do it: not much code, easy to understand, and it fits my needs:

#include <chrono>
#include <cstdio>
#include <functional>
#include <string>

void bench(std::function<void()> fnBench, std::string name, size_t iterations)
{
    if (iterations == 0)
        return;
    if (fnBench == nullptr)
        return;
    std::chrono::high_resolution_clock::time_point start, end;
    if (iterations == 1)
    {
        start = std::chrono::high_resolution_clock::now();
        fnBench();
        end = std::chrono::high_resolution_clock::now();
    }
    else
    {
        start = std::chrono::high_resolution_clock::now();
        for (size_t i = 0; i < iterations; ++i)
            fnBench();
        end = std::chrono::high_resolution_clock::now();
    }
    printf
    (
        "bench(*, \"%s\", %zu) = %4.6lfs\r\n", // %zu is the size_t format specifier
        name.c_str(),
        iterations,
        std::chrono::duration_cast<std::chrono::duration<double>>(end - start).count()
    );
}

Usage:

bench
(
    []() -> void // function
    {
        // Put your code here
    },
    "the name of this", // name
    1000000 // iterations
);

— cisco211

0

#include <omp.h>

double start = omp_get_wtime();

// code 

double finish = omp_get_wtime();

double total_time = finish - start;

— Nate Frisch

2

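While this code may solve the question, including an explanation of how and why it solves the problem would really help to improve the quality of your post, and would probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply.

— Dharman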