OP # 2024-08-01 22:06:31

zenghw
Member
Registered: 2024-08-01
Posts: 3
Points: 13

t113i + openwrt tina

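Several boards are reporting RCU stall warnings. In the worst case the board hangs completely and the system can no longer be reached over either the serial console or the network interface. The relevant log is below. How should a problem like this be investigated?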

[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.err kernel: [167675.824288] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.err kernel: [167675.831188] rcu:     Tasks blocked on level-0 rcu_node (CPUs 0-1): P12756
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838564]     (detected by 0, t=2103 jiffies, g=9414229, q=4234)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.info kernel: [167675.838570] wc              R  running task        0 12756  12755 0x00000000
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838600] [<c010f8b0>] (unwind_backtrace) from [<c010b2f0>] (show_stack+0x10/0x14)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838613] [<c010b2f0>] (show_stack) from [<c017da50>] (rcu_sched_clock_irq+0xad4/0xaf8)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838625] [<c017da50>] (rcu_sched_clock_irq) from [<c01839a0>] (update_process_times+0x2c/0x80)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838636] [<c01839a0>] (update_process_times) from [<c0194d88>] (tick_sched_timer+0x7c/0x118)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838647] [<c0194d88>] (tick_sched_timer) from [<c01843dc>] (__hrtimer_run_queues+0x15c/0x218)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838656] [<c01843dc>] (__hrtimer_run_queues) from [<c0185148>] (hrtimer_interrupt+0x128/0x2a8)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838667] [<c0185148>] (hrtimer_interrupt) from [<c0639e1c>] (arch_timer_handler_phys+0x28/0x30)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838679] [<c0639e1c>] (arch_timer_handler_phys) from [<c017051c>] (handle_percpu_devid_irq+0x90/0x160)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838691] [<c017051c>] (handle_percpu_devid_irq) from [<c016add4>] (__handle_domain_irq+0x90/0xfc)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838704] [<c016add4>] (__handle_domain_irq) from [<c040e35c>] (gic_handle_irq+0x40/0x7c)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838713] [<c040e35c>] (gic_handle_irq) from [<c01021cc>] (__irq_svc+0x6c/0xa8)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838718] Exception stack(0xdbd6dd40 to 0xdbd6dd88)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838727] dd40: dcef1c5a 00000032 00000000 00000020 dbd6dd98 00000c84 00000402 00000406
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838735] dd60: 00000003 c0d66eff c09facf0 61c88647 00000000 dbd6dd90 c087bae4 c087baf4
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838739] dd80: a00f0013 ffffffff
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838750] [<c01021cc>] (__irq_svc) from [<c087baf4>] (xas_load+0x1c/0x6c)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838763] [<c087baf4>] (xas_load) from [<c01bf75c>] (find_get_entry+0x44/0x130)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838775] [<c01bf75c>] (find_get_entry) from [<c01bfd70>] (pagecache_get_page+0x3c/0x388)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838784] [<c01bfd70>] (pagecache_get_page) from [<c01c1504>] (generic_file_read_iter+0x200/0xaf4)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838794] [<c01c1504>] (generic_file_read_iter) from [<c021a444>] (do_iter_readv_writev+0x144/0x1a0)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838804] [<c021a444>] (do_iter_readv_writev) from [<c021b4f8>] (do_iter_read+0xe4/0x19c)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838815] [<c021b4f8>] (do_iter_read) from [<c0353b90>] (ovl_read_iter+0x90/0xe0)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838825] [<c0353b90>] (ovl_read_iter) from [<c021c078>] (__vfs_read+0x11c/0x19c)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838834] [<c021c078>] (__vfs_read) from [<c021c18c>] (vfs_read+0x94/0x110)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838842] [<c021c18c>] (vfs_read) from [<c021c45c>] (ksys_read+0x50/0xbc)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838851] [<c021c45c>] (ksys_read) from [<c0101000>] (ret_fast_syscall+0x0/0x54)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838854] Exception stack(0xdbd6dfa8 to 0xdbd6dff0)
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838862] dfa0:                   01eed190 b6e86824 00000003 01eed2d0 00001000 00000000
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838869] dfc0: 01eed190 b6e86824 b6fac550 00000003 beaf5e1b 01eed190 00000000 00000000
[2024/3/18 17:55:24] Sun Mar 17 21:12:22 2024 kern.warn kernel: [167675.838875] dfe0: 0008dc28 beaf5b00 b6dae644 b6e07944

#1 2024-08-01 22:11:34

晕哥
Administrator
Registered: 2017-09-06
Posts: 9,344
Points: 9202

Re: t113i + openwrt tina

Lower the CPU and DDR clock frequencies first to rule out a hardware problem.

OP #2 2024-08-09 10:30:47

zenghw

Re: t113i + openwrt tina

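晕哥 wrote:

Lower the CPU and DDR clock frequencies first to rule out a hardware problem.

Thanks for the reply. I have a few follow-up questions:

1. The CPU frequency can be adjusted on the running system. Is there a similar method for DDR, or does that require reconfiguring the kernel, rebuilding, and reflashing?
2. Is the purpose of lowering these two frequencies to relax the RCU grace periods?

3. I also captured the log again, this time together with the ps process list.
ps output:
...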
11963 root      3148 D    ifconfig wwan0
11965 root      2288 R    /sbin/modprobe -q -- wwan0
12126 root      3148 S    /bin/sh /usr/sbin/mobile_monitor
12129 root      3148 D    ifconfig wwan0
12131 root      3148 S    grep -w inet addr
12134 root      2288 R    /sbin/modprobe -q -- netdev-wwan0
...


Excerpt from dmesg:
==========================dmesg info=========================
[16663.972322] 1fc0: 00000000 b6fbfc90 0000002a 00000000 b6fbc000 00000000 b6fc1df0 bed8acbc
[16663.981440] 1fe0: ffffffff bed8aa40 00011e38 00012658 60080010 ffffffff
[16704.544170] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P12134 P11965 } 335028 jiffies s: 1873 root: 0x0/T
[16704.624042] rcu: blocking rcu_node structures:
[16726.764089] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[16726.770898] rcu:     Tasks blocked on level-0 rcu_node (CPUs 0-1): P11965
[16726.778181]     (detected by 1, t=550763 jiffies, g=803281, q=1065792)
[16726.785167] modprobe        R  running task        0 11965   8550 0x00000000
[16726.793052] [<c087efc4>] (__schedule) from [<c087f8a8>] (preempt_schedule_irq+0x58/0x6c)
[16726.802078] [<c087f8a8>] (preempt_schedule_irq) from [<c0102210>] (svc_preempt+0x8/0x18)
[16726.811094] Exception stack(0xde591db0 to 0xde591df8)
[16726.816722] 1da0:                                     dd5efe42 00000003 00000000 00000001
[16726.825838] 1dc0: 00000000 defc0ec0 defc0ea4 00000003 00000402 00000406 00000003 c0d66eff
[16726.834952] 1de0: ffffffc0 de591e00 c087bae4 c087baac 20080113 ffffffff
[16726.842330] [<c0102210>] (svc_preempt) from [<c087baac>] (xas_start+0x94/0xc0)
[16726.850382] [<c087baac>] (xas_start) from [<c087bae4>] (xas_load+0xc/0x6c)
[16726.858048] [<c087bae4>] (xas_load) from [<c01bf75c>] (find_get_entry+0x44/0x130)
[16726.866393] [<c01bf75c>] (find_get_entry) from [<c01bfd70>] (pagecache_get_page+0x3c/0x388)
[16726.875705] [<c01bfd70>] (pagecache_get_page) from [<c01c0594>] (filemap_fault+0x88/0x868)
[16726.884921] [<c01c0594>] (filemap_fault) from [<c01e4ee4>] (__do_fault+0x38/0x138)
[16726.893360] [<c01e4ee4>] (__do_fault) from [<c01e90e8>] (handle_mm_fault+0x7f0/0xba4)
[16726.902092] [<c01e90e8>] (handle_mm_fault) from [<c01107c8>] (do_page_fault+0x114/0x2a4)
[16726.911114] [<c01107c8>] (do_page_fault) from [<c0110b0c>] (do_DataAbort+0x3c/0xbc)
[16726.919648] [<c0110b0c>] (do_DataAbort) from [<c010255c>] (__dabt_usr+0x3c/0x40)
[16726.927888] Exception stack(0xde591fb0 to 0xde591ff8)
[16726.933516] 1fa0:                                     00000000 b6fc3144 00000000 00003c90
[16726.942633] 1fc0: 00000000 b6fbfc90 0000002a 00000000 b6fbc000 00000000 b6fc1df0 bed8acbc
[16726.951747] 1fe0: ffffffff bed8aa40 00011e38 00012658 60080010 ffffffff

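The dmesg errors name two tasks, P12134 and P11965; both are modprobe processes in the ps output above. wwan0 is the interface name of the 4G module. In addition, the ifconfig wwan0 processes have gone into state D. Could that be what is causing the RCU warnings on modprobe?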
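
One way to follow up on the D-state question is to list every task in uninterruptible sleep and then check where each one is blocked. The following is a minimal POSIX shell sketch, assuming a BusyBox-style userland. The function name list_d_state and its optional directory argument are illustrative and not part of Tina or OpenWrt; the only kernel interface used is the State: line in /proc/<pid>/status.

```shell
#!/bin/sh
# List tasks in uninterruptible sleep (state D) as "<pid> <name>".
# proc_root defaults to /proc; the argument exists only so the function
# can also be pointed at a fake directory tree (hypothetical helper).
list_d_state() {
    proc_root="${1:-/proc}"
    for status in "$proc_root"/[0-9]*/status; do
        # Skip tasks that exited while we were scanning.
        [ -r "$status" ] || continue
        # The line looks like "State:<TAB>D (disk sleep)"; keep the letter.
        state=$(sed -n 's/^State:[[:space:]]*\([A-Za-z]\).*/\1/p' "$status")
        if [ "$state" = "D" ]; then
            pid=${status%/status}
            pid=${pid##*/}
            name=$(sed -n 's/^Name:[[:space:]]*//p' "$status")
            echo "$pid $name"
        fi
    done
}
```

Calling list_d_state with no argument scans /proc. For each PID it reports, cat /proc/<pid>/stack shows the kernel stack if the kernel was built with CONFIG_STACKTRACE, and echo w > /proc/sysrq-trigger dumps all blocked tasks to dmesg if CONFIG_MAGIC_SYSRQ is enabled. Whether this Tina kernel has either option enabled is not known from the thread.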

#3 2024-08-09 10:51:18

晕哥

Re: t113i + openwrt tina

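1. The CPU frequency can be adjusted on the running system. Is there a similar method for DDR, or does that require reconfiguring the kernel, rebuilding, and reflashing?

I did not understand this sentence.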

OP #4 2024-08-10 18:52:54

zenghw

Re: t113i + openwrt tina

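晕哥 wrote:

1. The CPU frequency can be adjusted on the running system. Is there a similar method for DDR, or does that require reconfiguring the kernel, rebuilding, and reflashing?

I did not understand this sentence.

The CPU frequency can be changed from within the running system (by writing to the files under /sys/devices/system/cpu/cpufreq/policy0/). Is there a similar way to lower the DDR frequency?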
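
For reference, the CPU-side adjustment described above can be sketched with the standard cpufreq sysfs files. This is a minimal example, not a confirmed procedure for the T113i: scaling_max_freq is a standard cpufreq attribute whose values are in kHz, while the helper name cap_cpu_freq is hypothetical and the frequencies actually supported on this board are not stated in the thread.

```shell
#!/bin/sh
# Cap the maximum CPU frequency via cpufreq sysfs and print the value
# the kernel reports afterwards.
#   $1: cap in kHz (cpufreq sysfs values are in kHz)
#   $2: policy directory, defaulting to the path quoted in the post
# cap_cpu_freq is a hypothetical helper name.
cap_cpu_freq() {
    max_khz="$1"
    policy_dir="${2:-/sys/devices/system/cpu/cpufreq/policy0}"
    if [ -z "$max_khz" ]; then
        echo "usage: cap_cpu_freq <max_khz> [policy_dir]" >&2
        return 1
    fi
    if [ ! -w "$policy_dir/scaling_max_freq" ]; then
        echo "cannot write $policy_dir/scaling_max_freq" >&2
        return 1
    fi
    echo "$max_khz" > "$policy_dir/scaling_max_freq"
    # Read back: the kernel may adjust the request to a supported value.
    cat "$policy_dir/scaling_max_freq"
}
```

For example, cap_cpu_freq 600000 would request a 600 MHz cap; that number is only an illustration. If the driver exposes scaling_available_frequencies in the same directory, that file lists the valid steps.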

#5 2024-08-10 19:38:07

晕哥

Re: t113i + openwrt tina

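This is the DDR adjustment method for the A523:

1. List the available DDR frequency points
a523-pro:/ # cat /sys/class/devfreq/3120000.dmcfreq/available_frequencies
154000000 528000000 739200000 924000000
2. Enable the node that prints the live frequency parameters
echo 1 > /sys/module/sun55iw3_devfreq/parameters/dbg_level
or
echo 1 > /sys/module/ccu_ddr/parameters/dbg_level
If neither path exists, run find -name dbg_level under /sys/module/ to locate the actual path, then echo 1 to it.
The corresponding serial console output: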

[  434.856467][  T496] drate:154M load:57 rw:705M total:1232M
[  434.870411][  T496] drate:154M load:55 rw:684M total:1232M
[  434.889498][  T496] drate:154M load:42 rw:521M total:1232M
[  434.899553][  T496] drate:154M load:62 rw:771M total:1232M
[  434.909495][  T496] drate:154M load:41 rw:506M total:1232M
This shows that the current DDR frequency is 154 MHz.

The T113 may not support this.
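
Whether a given kernel exposes anything comparable can be checked directly. The sketch below mirrors the A523 steps above by looking for devfreq devices and for dbg_level module parameters. The function name probe_ddr_nodes and its optional root argument are illustrative, and nothing here is confirmed to exist on the T113i.

```shell
#!/bin/sh
# Print every devfreq device that lists available_frequencies, and every
# dbg_level module parameter, under the given sysfs root (default /sys).
# Prints nothing when neither kind of node exists.
# probe_ddr_nodes is a hypothetical helper name.
probe_ddr_nodes() {
    sys_root="${1:-/sys}"
    for dev in "$sys_root"/class/devfreq/*; do
        # An unmatched glob stays literal, so test before using it.
        [ -e "$dev/available_frequencies" ] || continue
        echo "devfreq: ${dev##*/}: $(cat "$dev/available_frequencies")"
    done
    if [ -d "$sys_root/module" ]; then
        find "$sys_root/module" -name dbg_level 2>/dev/null | sed 's/^/dbg_level: /'
    fi
}
```

Empty output from probe_ddr_nodes would be consistent with the T113 not offering runtime DDR frequency control, although it does not prove it.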





#6 2024-10-25 12:34:12

freedombye
Member
Registered: 2024-09-27
Posts: 5
Points: 0

Re: t113i + openwrt tina

What other problems does the T113i have? It would be great if someone could put together a summary of known T113i issues.
