Kernel panic - not syncing: softlockup: hung tasks

I recently ran into a kernel crash, so I am writing it down here.

Analysing the dump with crash gives the following:

crash> sys
      KERNEL: vmlinux
    DUMPFILE: kernel_dump_file_debug  [PARTIAL DUMP]
        CPUS: 32
        DATE: Thu Jul  8 16:06:13 2021
      UPTIME: 12 days, 01:19:36
LOAD AVERAGE: 4.57, 5.64, 5.97
       TASKS: 832
    NODENAME: localhost
     RELEASE: 2.6.39-gentoo-r3-wafg2-47137
     VERSION: #18 SMP Wed Dec 30 21:37:53 JST 2020
     MACHINE: x86_64  (2599 Mhz)
      MEMORY: 128 GB
       PANIC: "[1039338.727675] Kernel panic - not syncing: softlockup: hung tasks"
crash> bt
PID: 22501  TASK: ffff881ff4340690  CPU: 1   COMMAND: "xxxxproess"
 #0 [ffff88107fc238b0] machine_kexec at ffffffff810243b6
 #1 [ffff88107fc23920] crash_kexec at ffffffff810773b9
 #2 [ffff88107fc239f0] panic at ffffffff815f35e0
 #3 [ffff88107fc23a70] watchdog_timer_fn at ffffffff81089a38
 #4 [ffff88107fc23aa0] __run_hrtimer.clone.28 at ffffffff8106303a
 #5 [ffff88107fc23ad0] hrtimer_interrupt at ffffffff81063541
 #6 [ffff88107fc23b30] smp_apic_timer_interrupt at ffffffff81020b92
 #7 [ffff88107fc23b50] apic_timer_interrupt at ffffffff815f6553
 #8 [ffff88107fc23bb8] igb_xmit_frame_ring at ffffffffa006a754 [igb]
 #9 [ffff88107fc23c70] igb_xmit_frame at ffffffffa006ada4 [igb]
#10 [ffff88107fc23ca0] dev_hard_start_xmit at ffffffff814d588d
#11 [ffff88107fc23d10] sch_direct_xmit at ffffffff814e87f7
#12 [ffff88107fc23d60] dev_queue_xmit at ffffffff814d5c2e
#13 [ffff88107fc23db0] transmit_skb at ffffffffa0111032 [wafg2]
#14 [ffff88107fc23dc0] forward_skb at ffffffffa01113b4 [wafg2]
#15 [ffff88107fc23df0] dev_rx_skb at ffffffffa0111875 [wafg2]
#16 [ffff88107fc23e40] igb_poll at ffffffffa006d6fc [igb]
#17 [ffff88107fc23f10] net_rx_action at ffffffff814d437a
#18 [ffff88107fc23f60] __do_softirq at ffffffff8104f3bf
#19 [ffff88107fc23fb0] call_softirq at ffffffff815f6d9c
--- <IRQ stack> ---
#20 [ffff881f2ebcfae0] __skb_queue_purge at ffffffff8153af65
#21 [ffff881f2ebcfb00] do_softirq at ffffffff8100d1c4
#22 [ffff881f2ebcfb20] _local_bh_enable_ip.clone.8 at ffffffff8104f311
#23 [ffff881f2ebcfb30] local_bh_enable at ffffffff8104f336
#24 [ffff881f2ebcfb40] inet_csk_listen_stop at ffffffff8152a94b
#25 [ffff881f2ebcfb80] tcp_close at ffffffff8152c8aa
#26 [ffff881f2ebcfbb0] inet_release at ffffffff8154a44d
#27 [ffff881f2ebcfbd0] sock_release at ffffffff814c409f
#28 [ffff881f2ebcfbf0] sock_close at ffffffff814c4111
#29 [ffff881f2ebcfc00] fput at ffffffff810d4c85
#30 [ffff881f2ebcfc50] filp_close at ffffffff810d1ea0
#31 [ffff881f2ebcfc80] put_files_struct at ffffffff8104d4d9
#32 [ffff881f2ebcfcd0] exit_files at ffffffff8104d5b4
#33 [ffff881f2ebcfcf0] do_exit at ffffffff8104d821
#34 [ffff881f2ebcfd70] do_group_exit at ffffffff8104df5c
#35 [ffff881f2ebcfda0] get_signal_to_deliver at ffffffff810570b2
#36 [ffff881f2ebcfe20] do_signal at ffffffff8100ae52
#37 [ffff881f2ebcff20] do_notify_resume at ffffffff8100b47e
#38 [ffff881f2ebcff50] int_signal at ffffffff815f5e63
    RIP: 00007fd9e52e1cdd  RSP: 00007fd9a7cfa370  RFLAGS: 00000293
    RAX: 000000000000001b  RBX: 00000000000000fb  RCX: ffffffffffffffff
    RDX: 000000000000001b  RSI: 00007fd96a77e05e  RDI: 00000000000000fb
    RBP: 00007fd9a8513e80   R8: 00000000007a7880   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000293  R12: 000000000000001b
    R13: 00007fd96a77e05e  R14: 000000000000001b  R15: 0000000000735240
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
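
A backtrace this long is easier to read once it is split into frames. The snippet below is only a sketch of how the `bt` output above could be post-processed: `parse_bt` and `FRAME_RE` are hypothetical names, not part of crash, and labelling frames without a `[module]` suffix as "vmlinux" is my own assumption (they belong to the core kernel image). The sample lines are copied from the trace above.

```python
import re
from collections import Counter

# One frame per line: "#N [stack-address] function at text-address [module]".
# The trailing "[module]" part is only present for frames inside a loadable module.
FRAME_RE = re.compile(
    r"^\s*#(?P<num>\d+)\s+\[(?P<sp>[0-9a-f]+)\]\s+(?P<func>\S+)\s+at\s+"
    r"(?P<addr>[0-9a-f]+)(?:\s+\[(?P<module>[^\]]+)\])?\s*$"
)


def parse_bt(text):
    """Return (frame number, function, module) for every frame line in text."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:  # skips headers, register dumps and the "--- <IRQ stack> ---" marker
            frames.append((int(m["num"]), m["func"], m["module"] or "vmlinux"))
    return frames


SAMPLE = """\
 #7 [ffff88107fc23b50] apic_timer_interrupt at ffffffff815f6553
 #8 [ffff88107fc23bb8] igb_xmit_frame_ring at ffffffffa006a754 [igb]
#13 [ffff88107fc23db0] transmit_skb at ffffffffa0111032 [wafg2]
--- <IRQ stack> ---
#20 [ffff881f2ebcfae0] __skb_queue_purge at ffffffff8153af65
"""

frames = parse_bt(SAMPLE)
modules = Counter(module for _, _, module in frames)
print(frames)
print(modules)
```

Run over the full trace, a count like this shows at a glance which modules (here igb and wafg2) sit between the timer interrupt and the softirq entry.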

First, I need to understand how the message "Kernel panic - not syncing: softlockup: hung tasks" comes about and what it means. In other words, translate that conclusion into plain terms.

Lockups come in two kinds: soft lockup and hard lockup.

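A soft lockup means that a kernel bug keeps a CPU looping in kernel mode for more than n seconds (n is a configurable parameter), so that other tasks on that CPU get no chance to run. How it is detected: the kernel has one monitoring thread per CPU, watchdog/x, which periodically records a timestamp when it gets to run. Comparing that timestamp with the current time shows whether the CPU is still scheduling tasks. In the backtrace above the comparison happens in watchdog_timer_fn (frame #3), which then calls panic.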
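
The timestamp comparison can be modelled in a few lines. This is a simplified simulation, not kernel code: `CpuWatchdog`, `simulate` and the 10-second threshold are made up for illustration, and time advances in whole seconds.

```python
from dataclasses import dataclass


@dataclass
class CpuWatchdog:
    threshold_s: int      # stand-in for the configurable n (hypothetical value)
    touch_ts: int = 0     # last time the watchdog thread got to run

    def touch(self, now):
        """What the per-CPU watchdog thread does whenever it is scheduled."""
        self.touch_ts = now

    def timer_check(self, now):
        """What the periodic timer callback does: return the stall length
        in seconds if it exceeds the threshold, otherwise 0."""
        stalled = now - self.touch_ts
        return stalled if stalled > self.threshold_s else 0


def simulate(threshold_s, busy_from, busy_until, end):
    """The CPU is stuck in kernel code during [busy_from, busy_until), so the
    watchdog thread cannot run there, while the timer interrupt still fires."""
    wd = CpuWatchdog(threshold_s)
    for now in range(end + 1):
        if not (busy_from <= now < busy_until):
            wd.touch(now)
        stalled = wd.timer_check(now)
        if stalled:
            return now, stalled   # the point where the kernel would report a soft lockup
    return None


print(simulate(threshold_s=10, busy_from=5, busy_until=100, end=60))
print(simulate(threshold_s=10, busy_from=5, busy_until=8, end=60))
```

The first call reports a stall because the watchdog thread is starved for longer than the threshold. The second call returns None because the busy period ends well before the threshold is reached.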

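A hard lockup happens when all interrupts on a CPU stay disabled for longer than a certain time (a few seconds). During that time interrupts from external devices cannot be handled, and the kernel considers this a hard lockup. That is not the case here: the panic was raised from a timer interrupt (frames #3 to #7 of the backtrace), so interrupts were still being delivered, which matches the "softlockup" in the panic message.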
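
As I understand it, hard lockups are caught by a check that still fires while ordinary interrupts are masked (on x86, a non-maskable interrupt), which looks at whether the timer interrupt counter has advanced since the previous check. The model below is a sketch under that assumption; `HardLockupDetector` and its method names are hypothetical.

```python
class HardLockupDetector:
    """Toy model: a timer interrupt bumps a counter, a non-maskable check
    compares the counter with the value it saw last time."""

    def __init__(self):
        self.timer_interrupts = 0   # incremented by every timer interrupt
        self.saved = 0              # counter value seen at the previous check

    def on_timer_interrupt(self):
        self.timer_interrupts += 1

    def on_nmi(self):
        """Return True if no timer interrupt arrived since the last check."""
        if self.timer_interrupts == self.saved:
            return True             # interrupts appear to be stuck
        self.saved = self.timer_interrupts
        return False


det = HardLockupDetector()
det.on_timer_interrupt()            # interrupts are still being delivered
first_check = det.on_nmi()
second_check = det.on_nmi()         # nothing arrived in between
print(first_check, second_check)
```

In this model the first check passes and the second one flags a lockup, because the counter did not move between the two checks.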

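So the next question is why nothing else got scheduled on this CPU. I took a first look and have no idea yet! Off to lunch, to be continued this afternoon.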

 
