Find out which task is generating a lot of context switches on Linux

11

According to vmstat, my Linux server (2x Core2 Duo 2.5 GHz) is constantly doing around 20k context switches per second.

# vmstat 3
procs -----------memory----------  ---swap-- -----io----  -system-- ----cpu----
 r  b   swpd   free   buff  cache    si   so    bi    bo   in    cs us sy id wa
 2  0   7292 249472  82340 2291972    0    0     0     0    0     0  7 13 79  0
 0  0   7292 251808  82344 2291968    0    0     0   184   24 20090  1  1 99  0
 0  0   7292 251876  82344 2291968    0    0     0    83   17 20157  1  0 99  0
 0  0   7292 251876  82344 2291968    0    0     0    73   12 20116  1  0 99  0

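...but uptime shows a small load: load average: 0.01, 0.02, 0.01, and top does not show any process with high CPU usage.

How do I find out what exactly is generating these context switches? Which process or thread?

I tried to analyze the pidstat output: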

# pidstat -w 10 1

12:39:13          PID   cswch/s nvcswch/s  Command
12:39:23            1      0.20      0.00  init
12:39:23            4      0.20      0.00  ksoftirqd/0
12:39:23            7      1.60      0.00  events/0
12:39:23            8      1.50      0.00  events/1
12:39:23           89      0.50      0.00  kblockd/0
12:39:23           90      0.30      0.00  kblockd/1
12:39:23          995      0.40      0.00  kirqd
12:39:23          997      0.60      0.00  kjournald
12:39:23         1146      0.20      0.00  svscan
12:39:23         2162      5.00      0.00  kjournald
12:39:23         2526      0.20      2.00  postgres
12:39:23         2530      1.00      0.30  postgres
12:39:23         2534      5.00      3.20  postgres
12:39:23         2536      1.40      1.70  postgres
12:39:23        12061     10.59      0.90  postgres
12:39:23        14442      1.50      2.20  postgres
12:39:23        15416      0.20      0.00  monitor
12:39:23        17289      0.10      0.00  syslogd
12:39:23        21776      0.40      0.30  postgres
12:39:23        23638      0.10      0.00  screen
12:39:23        25153      1.00      0.00  sshd
12:39:23        25185     86.61      0.00  daemon1
12:39:23        25190     12.19     35.86  postgres
12:39:23        25295      2.00      0.00  screen
12:39:23        25743      9.99      0.00  daemon2
12:39:23        25747      1.10      3.00  postgres
12:39:23        26968      5.09      0.80  postgres
12:39:23        26969      5.00      0.00  postgres
12:39:23        26970      1.10      0.20  postgres
12:39:23        26971     17.98      1.80  postgres
12:39:23        27607      0.90      0.40  postgres
12:39:23        29338      4.30      0.00  screen
12:39:23        31247      4.10     23.58  postgres
12:39:23        31249     82.92     34.77  postgres
12:39:23        31484      0.20      0.00  pdflush
12:39:23        32097      0.10      0.00  pidstat

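It looks like some PostgreSQL tasks are doing more than 10 context switches per second, but even added together the total comes nowhere near 20k.

Any idea how to dig a little deeper for an answer?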

— za
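The comparison the question makes by eye (the per-task rates against the roughly 20k/s that vmstat reports) can be checked mechanically. The following is only a sketch: it assumes the five-column `pidstat -w` layout shown above (time, PID, cswch/s, nvcswch/s, Command), whereas other sysstat versions may print extra columns. The function name `sum_pidstat_rates` and the two-row sample are made up for illustration.

```python
def sum_pidstat_rates(report):
    """Sum the cswch/s and nvcswch/s columns of `pidstat -w` style output."""
    voluntary = 0.0
    involuntary = 0.0
    for line in report.splitlines():
        fields = line.split()
        # Data rows look like: time PID cswch/s nvcswch/s Command.
        # The header row is skipped because its second field is "PID".
        if len(fields) < 5 or not fields[1].isdigit():
            continue
        voluntary += float(fields[2])
        involuntary += float(fields[3])
    return voluntary, involuntary


# Two rows copied from the output above, used as a small sample.
SAMPLE = """\
12:39:13          PID   cswch/s nvcswch/s  Command
12:39:23        25185     86.61      0.00  daemon1
12:39:23        25190     12.19     35.86  postgres
"""

vol, invol = sum_pidstat_rates(SAMPLE)
print(f"voluntary/s={vol:.2f} involuntary/s={invol:.2f} total/s={vol + invol:.2f}")
```

If the total stays far below the `cs` column of vmstat, the switches are probably coming from something this view does not list, for example individual threads.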

The thing about the postgres entries is that they have different PIDs, so they are entirely separate processes.

— Gopoi

1

For a single process: unix.stackexchange.com/questions/39342/...

— Ciro Santilli 冠状病毒审查六四事件法轮功

5

Well, this is quite interesting. Try watching watch -tdn1 cat /proc/interrupts. Do you see any notable changes there?

— Poige
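What `watch -d` highlights visually can also be turned into numbers by taking two snapshots of /proc/interrupts and dividing the difference by the interval. This is a rough sketch, not a polished tool: the helper names (`parse_interrupts`, `interrupt_rates`, `sample`) are invented here, and the parser assumes the usual layout of a CPU header row followed by `label: count count ... description` lines.

```python
import time


def parse_interrupts(text):
    """Map each interrupt label (e.g. 'LOC', '19') to its count summed over CPUs."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = []
        # Rows such as ERR/MIS carry fewer counters than there are CPUs.
        for token in rest.split()[:ncpu]:
            if not token.isdigit():
                break
            counts.append(int(token))
        totals[label.strip()] = sum(counts)
    return totals


def interrupt_rates(before, after, seconds):
    """Per-second increase of every counter present in both snapshots."""
    return {k: (after[k] - before[k]) / seconds for k in after if k in before}


def read_interrupts(path="/proc/interrupts"):
    with open(path) as f:
        return parse_interrupts(f.read())


def sample(interval=1.0):
    """Return (label, rate) pairs sorted by rate, highest first."""
    first = read_interrupts()
    time.sleep(interval)
    second = read_interrupts()
    rates = interrupt_rates(first, second, interval)
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `sample(3.0)[:5]` on the affected machine would list the five interrupt sources that grew fastest over three seconds.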

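"Local timer interrupts" generate a few hundred (200-800) interrupts on each CPU core. What does that mean? Also, eth0-rx/tx generates some interrupts because of the traffic on this server, but not many.

— grzaks, 2011

What about "Function call interrupts"?

— poige, 2011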

10

Try using

pidstat -wt

The 't' option shows threads as well. It may be a thread that is doing the context switching.

— German Garcia
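For readers without sysstat installed, a per-thread view similar in spirit to `pidstat -wt` can be approximated by reading the counters the kernel exposes under /proc/<pid>/task/<tid>/status. The sketch below is an illustration only, not how pidstat itself is implemented. The function names are invented, and threads that start or exit between the two snapshots are simply ignored.

```python
import os
import time


def parse_ctxt_switches(status_text):
    """Return (voluntary, nonvoluntary) from the text of a /proc status file."""
    wanted = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")
    counts = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in wanted:
            counts[key] = int(value)
    return counts.get(wanted[0], 0), counts.get(wanted[1], 0)


def snapshot_threads(proc_root="/proc"):
    """Map (pid, tid) to the total context switches recorded so far."""
    totals = {}
    for pid in filter(str.isdigit, os.listdir(proc_root)):
        task_dir = os.path.join(proc_root, pid, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # the process exited while we were scanning
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "status")) as f:
                    vol, nonvol = parse_ctxt_switches(f.read())
            except OSError:
                continue  # the thread exited while we were scanning
            totals[(int(pid), int(tid))] = vol + nonvol
    return totals


def busiest_threads(interval=5.0, top=10):
    """Return (switches_per_second, (pid, tid)) for the most active threads."""
    before = snapshot_threads()
    time.sleep(interval)
    after = snapshot_threads()
    rates = [((after[k] - before[k]) / interval, k) for k in after if k in before]
    return sorted(rates, reverse=True)[:top]
```

Running `busiest_threads()` as root gives the widest coverage; as an unprivileged user some status files may be unreadable and are skipped.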

1

Running pidstat -wt | sort -n -k4 is better.

— Ismael Vacco

2

On newer kernel versions:

sudo perf record -e context-switches -a  # record the events

# then ctrl+c

sudo perf report # inspect the result

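This gives you exact results for the context-switch events.

Appending the "-g" flag records call graphs, so you can see what caused each context switch (the results are only readable when symbol information is available):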

sudo perf record -e context-switches -a -g

— ny

1

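Context switching is normal. A process is assigned a time quantum, and when it has finished what it has to do (or has to pause because it needs a resource), it gives up the processor.

That said, counting how many context switches are done (this is turning into an answer for stackoverflow.com) would require the kernel's internal schedule() function to write to a process table. You will not see this kind of thing unless you program your own kernel, and that is quite hard.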

— Gopoi

1

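OK. I know what a context switch is and how it affects system performance. I just need a way to measure how many context switches are being done on Linux. I have already found the raw csw counters in /proc/*/status (voluntary_ctxt_switches).

— grzaks, 2010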
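Alongside the per-task counters mentioned in this comment, the kernel also publishes a system-wide total on the `ctxt` line of /proc/stat, counted since boot. Sampling it twice gives a rate comparable to the `cs` column of vmstat. A minimal sketch, with the function names `parse_total_ctxt` and `ctxt_rate` chosen only for this example:

```python
import time


def parse_total_ctxt(stat_text):
    """Return the system-wide context-switch count from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def ctxt_rate(interval=1.0, path="/proc/stat"):
    """Context switches per second across all CPUs over `interval` seconds."""
    with open(path) as f:
        first = parse_total_ctxt(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_total_ctxt(f.read())
    return (second - first) / interval
```

Comparing `ctxt_rate(3.0)` with the sum of the per-task rates shows how much of the system-wide figure the visible tasks account for.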

0

powertop can tell you how often a process wakes up the CPU.

— Hubert Kario