
J6X: How to Analyze Memory Corruption Issues

费小财, 2025-09-30

Memory corruption

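When the system shows random, unexplainable faults caused by invalid pointer accesses or corrupted data, the first things to suspect are memory misuse bugs: UAF (Use-After-Free) or OOB (Out-of-Bounds) accesses.

In this chapter, "memory corruption" refers specifically to corruption on the Linux kernel side. For memory stomping between subsystems, see Firewall.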

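General approach

When you suspect an OOB or UAF bug, enable CONFIG_KASAN and reproduce the problem.

When a memory corruption issue occurs, the system triggers BUG_ON by default.

Check the panic log. For UAF and OOB on slub objects, the stack, buddy pages, and global variables, the report points out the key information, and the log alone is usually enough to locate the problem.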

Typical issue

[ 6.262525] BUG: KASAN: global-out-of-bounds in __of_match_node+0x70/0xb8
[ 6.263391] Read of size 1 at addr ffffff9008d153a8 by task swapper/0/1
[ 6.264231]
[ 6.264439] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 6.1.94-rt33-gac7c113a9bab #2
[ 6.265488] Hardware name: Horizon Robotics J6E Evaluation Module Board (DT)
[ 6.266362] Call trace:
[ 6.266694] [] dump_backtrace+0x0/0x538
[ 6.267391] [] show_stack+0x14/0x20
[ 6.268048] [] dump_stack+0xa4/0xc8
[ 6.268703] [] print_address_description+0x1e4/0x250
[ 6.269539] [] kasan_report+0x2cc/0x300
[ 6.270240] [] __asan_load1+0x44/0x50
[ 6.270912] [] __of_match_node+0x70/0xb8
[ 6.271617] [] of_match_node+0x38/0x60
[ 6.272301] [] of_match_device+0x3c/0x50
[ 6.273012] [] platform_match+0x64/0x118
[ 6.273719] [] __driver_attach+0x40/0x140
[ 6.274435] [] bus_for_each_dev+0xcc/0x140
[ 6.275164] [] driver_attach+0x30/0x40
[ 6.275848] [] bus_add_driver+0x220/0x388
[ 6.276566] [] driver_register+0x108/0x170
[ 6.277295] [] __platform_driver_register+0x7c/0x88
[ 6.278122] [] j6_wdt_driver_init+0x34/0x4c
[ 6.278861] [] do_one_initcall+0xe4/0x1b8
[ 6.279581] [] kernel_init_freeable+0x1ac/0x260
[ 6.280363] [] kernel_init+0x10/0x118
[ 6.281036] [] ret_from_fork+0x10/0x18
[ 6.281713]
[ 6.281911] The buggy address belongs to the variable:
[ 6.282570] 0xffffff9008d153a8
[ 6.282974]
[ 6.283173] Memory state around the buggy address:
[ 6.283794] ffffff9008d15280: fa fa fa fa 07 fa fa fa fa fa fa fa 00 00 00 00
[ 6.284715] ffffff9008d15300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6.285636] >ffffff9008d15380: 00 00 00 00 00 fa fa fa fa fa fa fa 00 00 00 00
[ 6.286552]                                   ^
[ 6.287137] ffffff9008d15400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6.288057] ffffff9008d15480: 00 00 00 fa fa fa fa fa 00 00 fa fa fa fa fa fa
[ 6.288974] ==================================================================
[ 6.289890] Disabling lock debugging due to kernel taint
The error type is global-out-of-bounds: an out-of-bounds access to a global variable, in this case a 1-byte read.
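When many reports need triage, the two header lines can be decoded mechanically. The following is a minimal sketch in user-space C, assuming the header keeps the layout shown in the log above; `struct kasan_header`, `parse_bug_line`, and `parse_access_line` are hypothetical names introduced for illustration and are not part of any kernel tool.

```c
#include <stdio.h>
#include <string.h>

/* Fields taken from the first two lines of a KASAN report (hypothetical type). */
struct kasan_header {
    char bug_type[64];        /* e.g. "global-out-of-bounds" */
    char function[128];       /* symbol in which the access happened */
    char access[8];           /* "Read" or "Write" */
    unsigned long size;       /* access size in bytes */
    unsigned long long addr;  /* address that was accessed */
};

/* Parse "BUG: KASAN: <type> in <func>+<off>/<len>". Returns 0 on success. */
static int parse_bug_line(const char *line, struct kasan_header *h)
{
    const char *p = strstr(line, "BUG: KASAN: ");
    unsigned long off, len;

    if (!p)
        return -1;
    if (sscanf(p, "BUG: KASAN: %63s in %127[^+]+%lx/%lx",
               h->bug_type, h->function, &off, &len) != 4)
        return -1;
    return 0;
}

/* Parse "<Read|Write> of size <n> at addr <hex> ...". Returns 0 on success. */
static int parse_access_line(const char *line, struct kasan_header *h)
{
    /* Skip the "[ timestamp]" prefix if the line has one. */
    const char *p = strchr(line, ']');

    p = p ? p + 1 : line;
    if (sscanf(p, " %7s of size %lu at addr %llx",
               h->access, &h->size, &h->addr) != 3)
        return -1;
    return 0;
}
```

Fed the first two lines of the report above, these helpers yield the bug type `global-out-of-bounds`, the function `__of_match_node`, and a `Read` of size 1 at `ffffff9008d153a8`.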

The call trace shows that the access happens while the driver core is matching devices against the driver during probing.

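Complex problems may require combining the trace with the buggy-address information printed after it (these bytes are not the actual data but its mapping in the shadow region). In this case, the shadow byte for the accessed address ffffff9008d153a8 is 0xFA, and 0xFA marks the redzone (out-of-bounds guard) of a global variable.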
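The arithmetic for locating that shadow byte in the dump can be sketched as follows. This assumes generic KASAN, where one shadow byte describes an 8-byte granule, and a dump row of 16 shadow bytes (128 bytes of memory), as in the log above. `shadow_index_in_row` and `marked_row` are names introduced only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Generic KASAN: one shadow byte describes one 8-byte granule of memory. */
#define GRANULE_SIZE 8u

/* Index, within one dump row, of the shadow byte that covers addr. */
static size_t shadow_index_in_row(uint64_t row_base, uint64_t addr)
{
    return (size_t)((addr - row_base) / GRANULE_SIZE);
}

/* Shadow bytes of the row marked '>' in the report (row base ffffff9008d15380). */
static const uint8_t marked_row[16] = {
    0x00, 0x00, 0x00, 0x00, 0x00, 0xfa, 0xfa, 0xfa,
    0xfa, 0xfa, 0xfa, 0xfa, 0x00, 0x00, 0x00, 0x00,
};
```

For the address in this report, (0x3a8 - 0x380) / 8 = 5, so the sixth byte of the marked row applies. It is the first 0xfa after five 00 bytes, meaning the read lands in the first granule just past 40 bytes of valid global data.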

Looking at the logic, __of_match_node loops over every entry in of_match_table and exits only when it reaches an entry whose members are all empty.

const struct of_device_id *__of_match_node(const struct of_device_id *matches,
					   const struct device_node *node)
{
	const struct of_device_id *best_match = NULL;
	int score, best_score = 0;

	if (!matches)
		return NULL;

	/* The loop stops only at an entry whose name, type and compatible are all empty. */
	for (; matches->name[0] || matches->type[0] || matches->compatible[0]; matches++) {
		score = __of_device_is_compatible(node, matches->compatible,
						  matches->type, matches->name);
		if (score > best_score) {
			best_match = matches;
			best_score = score;
		}
	}

	return best_match;
}
From the code, the out-of-bounds access happens because the of_match_table array is not terminated by a zero-filled entry.
#ifdef CONFIG_OF
static const struct of_device_id j6_wdt_of_match[] = {
-	{ .compatible = "snps,j6_wdt", }
-	/* sentinel */
+	{ .compatible = "snps,j6_wdt", },
+	{/* sentinel */} /* PRQA S 1041 */
};
MODULE_DEVICE_TABLE(of, j6_wdt_of_match); /* PRQA S 0605 */
#endif
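The termination rule and the effect of the sentinel can be reproduced in user space. The sketch below is not kernel code: `struct demo_device_id` is a simplified stand-in for `struct of_device_id`, and `count_entries`, `is_terminated`, `fixed_table`, and `broken_table` are hypothetical names. `count_entries` uses the same stop condition as `__of_match_node`, so it must only be called on a table that ends with an empty entry.

```c
#include <stddef.h>

/* Simplified stand-in for struct of_device_id (illustration only). */
struct demo_device_id {
    char name[32];
    char type[32];
    char compatible[128];
};

/* Same stop condition as __of_match_node: walk until an all-empty entry. */
static size_t count_entries(const struct demo_device_id *m)
{
    size_t n = 0;

    for (; m->name[0] || m->type[0] || m->compatible[0]; m++)
        n++;
    return n;
}

/* With the array length known, check that the last slot is an empty sentinel. */
static int is_terminated(const struct demo_device_id *tbl, size_t len)
{
    const struct demo_device_id *last;

    if (len == 0)
        return 0;
    last = &tbl[len - 1];
    return !(last->name[0] || last->type[0] || last->compatible[0]);
}

/* Fixed form: one real entry followed by a zero-filled sentinel. */
static const struct demo_device_id fixed_table[] = {
    { .compatible = "snps,j6_wdt" },
    { .compatible = "" },  /* sentinel: every field is zero */
};

/* Buggy form: no sentinel, so a walk would run past the end of the array. */
static const struct demo_device_id broken_table[] = {
    { .compatible = "snps,j6_wdt" },
};
```

`is_terminated` needs the array length, which is only available where the table is defined, so a check like this fits best next to the table itself (for example in a driver self-test) rather than inside the generic matching loop.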
