RDK X5: grid_sample raises an error during model check

Resolved

z 2025-07-21

I'm using an RDK X5. The relevant code is:
grid = self.normgrid(pos, H, W).unsqueeze(-2).to(x.dtype)
x = F.grid_sample(x, grid, mode=self.mode, align_corners=False)
I printed the input x here; across the calls its sizes are:
torch.Size([1, 1, 288, 384])
torch.Size([1, 1, 36, 48])
torch.Size([1, 64, 36, 48])
The values all look normal, but running 01_check.sh fails with the following error:
2025-07-21 09:57:27,466 INFO log will be stored in /open_explorer/xfeat/hb_mapper_checker.log
2025-07-21 09:57:27,466 INFO Start hb_mapper…
2025-07-21 09:57:27,466 INFO hbdk version 3.49.15
2025-07-21 09:57:27,466 INFO horizon_nn version 1.1.0
2025-07-21 09:57:27,466 INFO hb_mapper version 1.24.3
2025-07-21 09:57:27,480 INFO Model type: onnx
2025-07-21 09:57:27,480 INFO input names
2025-07-21 09:57:27,480 INFO input shapes {}
2025-07-21 09:57:27,480 INFO Begin model checking…
2025-07-21 09:57:27,491 INFO Start to Horizon NN Model Convert.
2025-07-21 09:57:27,570 INFO Loading horizon_nn debug methods:set()
2025-07-21 09:57:27,571 INFO The activation calibration parameters:
calibration_type: fixed
2025-07-21 09:57:27,571 INFO The specified model compilation architecture: bayes-e.
2025-07-21 09:57:27,571 INFO The specified model compilation optimization parameters: .
2025-07-21 09:57:27,572 INFO Start to prepare the onnx model.
2025-07-21 09:57:27,600 INFO Input ONNX Model Information:
ONNX IR version: 6
Opset version: ['ai.onnx v11', 'horizon v1']
Producer: pytorch v1.13.0
Domain: None
Version: None
Graph input:
image0: shape=[1, 3, 300, 400], dtype=FLOAT32
image1: shape=[1, 3, 300, 400], dtype=FLOAT32
Graph output:
mkpts0: shape=['num_keypoints', 'Gathermkpts0_dim_1'], dtype=FLOAT32
mkpts1: shape=['num_keypoints', 'Gathermkpts1_dim_1'], dtype=FLOAT32
2025-07-21 09:57:27,600 INFO Modify argmax output element type from int32 to int64 to make sure this onnx model will be valid.
2025-07-21 09:57:27,735 INFO End to prepare the onnx model.
2025-07-21 09:57:27,766 INFO Saving model to: ./.hb_check/original_float_model.onnx.
2025-07-21 09:57:27,766 INFO Start to optimize the onnx model.
2025-07-21 09:57:28,828 INFO End to optimize the onnx model.
2025-07-21 09:57:28,853 INFO Saving model to: ./.hb_check/optimized_float_model.onnx.
2025-07-21 09:57:28,853 INFO Start to calibrate the model.
2025-07-21 09:57:28,994 WARNING The input0 of Node(name:/GridSample_1, type:GridSample) does not support data type: int16
2025-07-21 09:57:28,995 WARNING The input0 of Node(name:/interpolator/GridSample, type:GridSample) does not support data type: int16
2025-07-21 09:57:28,995 WARNING The input0 of Node(name:/GridSample_3, type:GridSample) does not support data type: int16
2025-07-21 09:57:28,996 WARNING The input0 of Node(name:/interpolator_1/GridSample, type:GridSample) does not support data type: int16
2025-07-21 09:57:28,996 WARNING The output of Node(name:/net/heatmap_head/heatmap_head.3/Sigmoid) is int16, then requantized to int8
2025-07-21 09:57:28,996 WARNING The output of Node(name:/Div_mul) is int16, then requantized to int8
2025-07-21 09:57:28,997 INFO There are 1 samples in the data set.
2025-07-21 09:57:28,997 INFO Run calibration model with fixed thresholds method.
Layer /GridSample
/GridSample layer, feature size in axis 1 should in range [1, 4096]. But given size 4294967295
/GridSample layer, feature size in axis 1 should in range [1, 4096]. But given size 4294967295

Layer /GridSample_1
/GridSample_1 layer, feature size in axis 1 should in range [1, 4096]. But given size 4294967295
/GridSample_1 layer, feature size in axis 1 should in range [1, 4096]. But given size 4294967295

2025-07-21 09:57:29,073 WARNING The output of Node(name:/net/heatmap_head/heatmap_head.3_1/Sigmoid) is int16, then requantized to int8
2025-07-21 09:57:29,073 WARNING The output of Node(name:/Div_3_mul) is int16, then requantized to int8
2025-07-21 09:57:29,077 ERROR *** ERROR-OCCUR-DURING {horizon_nn.build_onnx} ***, error message: ERROR: The size of specified_perm is different with input dim
The error model has been saved as complement_calibration_node_for_graph_input_pass_fail.onnx
2025-07-21 09:57:29,077 INFO End to calibrate the model.
2025-07-21 09:57:29,078 INFO End to Horizon NN Model Convert.

How can I fix this? Is it something like an overflow when quantizing to integers? The value 4294967295 is huge.
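A note on that number: 4294967295 equals 2^32 - 1, which is what -1 looks like when the same 32 bits are read as an unsigned integer. Since -1 is a common placeholder for an unknown dimension, this may point to an unresolved shape rather than a quantization overflow. That reading is an assumption; the log itself does not confirm it. A quick check in plain Python:

```python
import struct

reported = 4294967295  # size printed by the checker for axis 1

# Reinterpret the same 32 bits as a signed integer.
as_signed = struct.unpack("<i", struct.pack("<I", reported))[0]

print(reported == 2**32 - 1)  # True
print(as_signed)              # -1
```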

Algorithm toolchain

In-depth technical analysis


z
Lv.1
In the Netron visualization I noticed unk showing up. Could that be the cause? What does unk mean?
2025-07-21
Huanghui
Lv.5
Hi, judging from the error, a dynamic shape may not have been resolved correctly.
2025-07-21
- z replying to Huanghui:
  But my shapes are fixed.
  2025-07-21
- z:
  torch.Size([1, 1, 288, 384])
  torch.Size([1, 1, 36, 48])
  torch.Size([1, 64, 36, 48])
  There are three outputs here because the interpolation is called three times.
  2025-07-21
- Huanghui replying to z:
  Could you send over the model and the ONNX export code? There isn't enough here to tell what is going on.
  2025-07-21
Huanghui
Lv.5
Try exporting the ONNX model with tracing.
2025-07-21
- z replying to Huanghui:
  I'm using Horizon's export:
  export_onnx(xfeat, (image0, image1), output_path, verbose=False, opset_version=11, do_constant_folding=True, input_names=["image0", "image1"], output_names=["mkpts0", "mkpts1"], dynamic_axes=dynamic_axes)
  2025-07-21
- z:
  torch's own exporter can't export grid_sample at opset 11.
  2025-07-21
- Huanghui replying to z:
  You need to export it with Horizon's ONNX export tool.
  2025-07-21
- z replying to Huanghui:
  Yes, that is exactly what I'm using.
  2025-07-21
- Huanghui replying to z:
  I remember giving you that solution last week. Did the export not succeed?
  2025-07-21
- z:
  dynamic_axes is set to false.
  2025-07-21
- z replying to Huanghui:
  The export succeeded; it is the check that reports the error.
  2025-07-21
- Huanghui replying to z:
  Is this the error it reports?
  2025-07-21
- z replying to z:
  Layer /GridSample
  /GridSample layer, feature size in axis 1 should in range [1, 4096]. But given size 4294967295
  /GridSample layer, feature size in axis 1 should in range [1, 4096]. But given size 4294967295
  Layer /GridSample_1
  /GridSample_1 layer, feature size in axis 1 should in range [1, 4096]. But given size 4294967295
  /GridSample_1 layer, feature size in axis 1 should in range [1, 4096]. But given size 4294967295
  2025-07-21 11:07:45,462 ERROR *** ERROR-OCCUR-DURING {horizon_nn.build_onnx} ***, error message: ERROR: The size of specified_perm is different with input dim
  I still can't find the cause of this error.
  2025-07-21
- z replying to Huanghui:
  Yes.
  2025-07-21
- Huanghui replying to z:
  How about posting the export code and the original model? I'll verify it on my side.
  2025-07-21
- Huanghui:
  Without actually trying it, I can't find the cause either.
  2025-07-21
- z replying to Huanghui:
  OK, one moment.
  2025-07-21
- z replying to Huanghui:
  I've shared the rdk files with you via Aliyun Drive: https://www.alipan.com/t/APCbyIOp5bTJ8u7tfWl7
  Sorry for the trouble, please take a look.
  2025-07-21
- Huanghui replying to z:
  Can't you add an attachment here?
  2025-07-21
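- z replying to Huanghui:
  There are several files, and they are hard to organize as direct attachments. Should I reorganize them and then upload them as attachments?
  2025-07-21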
- z replying to Huanghui:
  Attachments can be added, but at most 3, and I have more than 3 files.
  2025-07-21
- z replying to Huanghui:
  Would Baidu Netdisk work for you?
  2025-07-21
- Huanghui replying to z:
  Downloads there are quite slow without a membership.
  2025-07-21
- Huanghui replying to z:
  Can't you add attachments when creating a post?
  2025-07-21
- z replying to Huanghui:
  It works now. I've uploaded a compressed archive.
  2025-07-21
- z replying to Huanghui:
  Can you see the attachment I uploaded?
  2025-07-21
- Huanghui replying to z:
  No, I don't see it.
  2025-07-21
- z replying to Huanghui:
  It won't attach. I've tried several times.
  2025-07-21
- Huanghui replying to z:
  Let's go with Baidu Netdisk then.
  2025-07-21
- z replying to Huanghui:
  OK, here it is. File shared via Baidu Netdisk: test. Link: https://pan.baidu.com/s/1irPA6NKkdrRqyLjLOzbdzA?pwd=g24g Extraction code: g24g
  2025-07-21
- Huanghui replying to z:
  Which ONNX model is the one you generated and will use in the next step?
  2025-07-21
- z replying to Huanghui:
  xfeat_e2e_300x400_sim.onnx
  2025-07-21
- z:
  This is xfeat_e2e_300x400.onnx after simplification with onnxsim (from onnxsim import simplify). Either of the two can be used.
  2025-07-21
- z replying to Huanghui:
  2025-07-21 09:57:29,077 ERROR *** ERROR-OCCUR-DURING {horizon_nn.build_onnx} ***, error message: ERROR: The size of specified_perm is different with input dim
  This error was caused by If operators in the ONNX model. After I removed all the If operators the check got further, but then hit another GridSample-related error:
  2025-07-21 15:00:59,706 ERROR {/Where_1} unsupported broadcast mode.
  2025-07-21 15:00:59,711 ERROR *** ERROR-OCCUR-DURING {runtime.runtime_model_generation} ***, error message: HorizonRT not support these cpu operators: GridSample NonZero
  2025-07-21
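To see at a glance whether an exported model still contains the operators that came up in this thread, one option is to count node types in the graph. The sketch below is only an illustration: `SUSPECT_OPS`, `count_suspect_ops` and `report` are hypothetical names, the op list is taken from the errors above and is not an official list of unsupported operators, and only top-level nodes are counted (nodes nested inside an If subgraph are not visited).

```python
from collections import Counter

# Op types mentioned in this thread; NOT an official list of unsupported ops.
SUSPECT_OPS = ("If", "NonZero", "GridSample", "Where")


def count_suspect_ops(nodes, suspects=SUSPECT_OPS):
    """Count how often each suspect op type appears among graph nodes.

    `nodes` is any iterable of objects with an `op_type` attribute,
    for example `model.graph.node` of a loaded ONNX model.
    """
    counts = Counter(node.op_type for node in nodes)
    return {op: counts[op] for op in suspects if counts[op] > 0}


def report(model_path):
    """Load an ONNX file and count suspect ops among its top-level nodes."""
    import onnx  # imported lazily so the helper above has no hard dependency

    model = onnx.load(model_path)
    return count_suspect_ops(model.graph.node)
```

Calling `report("xfeat_e2e_300x400_sim.onnx")` after each change to the export code would show whether the If and NonZero nodes are actually gone.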
- Huanghui replying to z:
  hb_compile didn't run through on my side.
  2025-07-21
- Huanghui replying to z:
  How about looking for an alternative?
  2025-07-21
- z replying to Huanghui:
  An alternative for what?
  2025-07-21
- Huanghui replying to z:
  The GridSample operator.
  2025-07-21
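As one possible alternative, bilinear grid sampling can be written with basic tensor ops (pad, floor, gather) so that no GridSample node is exported. The sketch below is not from the thread and makes several assumptions: `self.mode` is bilinear, padding is zeros, and `align_corners=False` as in the original call. The function name `bilinear_grid_sample` is illustrative, and whether the resulting ops are all supported by the toolchain has not been verified here.

```python
import torch
import torch.nn.functional as F


def bilinear_grid_sample(x: torch.Tensor, grid: torch.Tensor) -> torch.Tensor:
    """Bilinear sampling with zero padding and align_corners=False, built from gather."""
    n, c, h, w = x.shape
    _, gh, gw, _ = grid.shape

    # Map normalized coordinates in [-1, 1] to pixel coordinates (align_corners=False).
    px = (((grid[..., 0] + 1) * w - 1) / 2).reshape(n, -1)
    py = (((grid[..., 1] + 1) * h - 1) / 2).reshape(n, -1)

    x0 = torch.floor(px)
    y0 = torch.floor(py)
    x1 = x0 + 1
    y1 = y0 + 1

    # Interpolation weights of the four neighbours, each of shape (n, 1, gh*gw).
    w00 = ((x1 - px) * (y1 - py)).unsqueeze(1)  # neighbour (x0, y0)
    w01 = ((x1 - px) * (py - y0)).unsqueeze(1)  # neighbour (x0, y1)
    w10 = ((px - x0) * (y1 - py)).unsqueeze(1)  # neighbour (x1, y0)
    w11 = ((px - x0) * (py - y0)).unsqueeze(1)  # neighbour (x1, y1)

    # Pad one zero pixel on each side so out-of-range neighbours read zeros.
    padded = F.pad(x, (1, 1, 1, 1)).reshape(n, c, -1)
    pw, ph = w + 2, h + 2

    def to_index(xi, yi):
        # Shift by the padding, clamp into the padded image, flatten to 1-D indices.
        xi = (xi + 1).clamp(0, pw - 1).long()
        yi = (yi + 1).clamp(0, ph - 1).long()
        return (yi * pw + xi).unsqueeze(1).expand(-1, c, -1)

    v00 = torch.gather(padded, 2, to_index(x0, y0))
    v01 = torch.gather(padded, 2, to_index(x0, y1))
    v10 = torch.gather(padded, 2, to_index(x1, y0))
    v11 = torch.gather(padded, 2, to_index(x1, y1))

    out = v00 * w00 + v01 * w01 + v10 * w10 + v11 * w11
    return out.reshape(n, c, gh, gw)
```

Comparing its output against `F.grid_sample(x, grid, mode="bilinear", padding_mode="zeros", align_corners=False)` on a few random inputs is a reasonable sanity check before swapping it into the model.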
- Huanghui replying to z:
  Hi, I've located the problem. The platform currently doesn't support the NonZero operator, so that operator has to be changed.
  2025-07-21
- Huanghui replying to z:
  2025-07-21
- Huanghui replying to z:
  See whether you can fix the dimensions at the NonZero node.
  2025-07-21
- z replying to Huanghui:
  Will do.
  2025-07-21
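- z replying to z:
  No matter what I change, NonZero is still there. Switching to torch.where still produces it. Which specific operations make the ONNX export emit NonZero?
  2025-07-21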
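On the question of where NonZero comes from: to my knowledge, the usual sources are operations whose output size depends on the data, namely `torch.nonzero`, one-argument `torch.where(cond)`, and boolean-mask indexing such as `x[mask]`. Three-argument `torch.where(cond, a, b)` keeps the input shape instead. The small example below, with made-up values, shows the difference in eager mode; it does not run the Horizon exporter.

```python
import torch

x = torch.tensor([0.2, 0.8, 0.1, 0.9])
mask = x > 0.5

# Data-dependent output sizes: these forms typically become NonZero on export.
idx_nonzero = torch.nonzero(mask)  # shape (N, 1), N = number of True entries
idx_where = torch.where(mask)[0]   # one-argument where acts like nonzero(as_tuple=True)
picked = x[mask]                   # boolean-mask indexing, length N

# Fixed output size: three-argument where keeps the shape of x.
kept = torch.where(mask, x, torch.zeros_like(x))

print(idx_nonzero.shape, idx_where.shape, picked.shape, kept.shape)
```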
- z replying to Huanghui:
  Have you ever dealt with removing NonZero? No matter how I change the code, I can't get rid of it.
  2025-07-22
- Huanghui replying to z:
  2025-07-22
- Huanghui replying to z:
  Try using masked_fill.
  2025-07-22
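A minimal sketch of what the masked_fill suggestion could look like, assuming the NonZero comes from thresholding keypoint scores with a boolean mask (the thread does not show that code). Instead of dropping low scores, they are overwritten with -inf and a fixed number `k` of candidates is kept, so every output has a static shape. The names `select_fixed`, `threshold` and `k` are hypothetical, and it is untested whether every op involved is accepted by the toolchain.

```python
import torch


def select_fixed(scores: torch.Tensor, threshold: float, k: int):
    """Keep the k best scores with static shapes instead of scores[scores > threshold]."""
    # Overwrite entries at or below the threshold; the tensor shape is unchanged.
    masked = scores.masked_fill(scores <= threshold, float("-inf"))
    # Always return exactly k candidates, whatever the data looks like.
    values, indices = torch.topk(masked.flatten(), k)
    # Slots filled with -inf are padding; this mask marks the real detections.
    valid = torch.isfinite(values)
    return values, indices, valid


scores = torch.tensor([0.1, 0.9, 0.5, 0.3])
values, indices, valid = select_fixed(scores, threshold=0.4, k=3)
print(values.shape, valid)
```

Downstream code then uses `valid` to ignore the padded slots, which can be done on the host after inference rather than inside the exported graph.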