QAT of CenterPoint on the KITTI dataset on J6E fails with "Calculated padded input size per channel: (32 x 0)"

Status: being answered
Posted by 默认80256 on 2026-05-29
I am running QAT for CenterPoint on the KITTI dataset on J6E:

1. Chip model: J6M

2. OpenExplorer (天工开物) development kit version: v3.0.31

3. Stage where the problem occurs: QAT

4. Problem encountered:

  File "/root/.local/lib/python3.10/site-packages/hat/engine/ddp_trainer.py", line 457, in _with_exception
    fn(*args)
  File "/open_explorer/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/train.py", line 186, in train_entrance
    trainer.fit()
  File "/root/.local/lib/python3.10/site-packages/hat/engine/loop_base.py", line 557, in fit
    self.batch_processor(
  File "/root/.local/lib/python3.10/site-packages/hat/utils/deterministic.py", line 253, in wrapper
    result = func(*args, **kwargs)
  File "/root/.local/lib/python3.10/site-packages/hat/engine/processors/processor.py", line 785, in __call__
    model_outs = model(*_as_list(batch_i))
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/parallel/distributed.py", line 1593, in forward
    else self._run_ddp_forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/parallel/distributed.py", line 1411, in _run_ddp_forward
    return self.module(*inputs, **kwargs) # type: ignore[index]
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.local/lib/python3.10/site-packages/hat/models/structures/detectors/centerpoint.py", line 106, in forward
    input_features = self.reader(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.local/lib/python3.10/site-packages/hat/models/task_modules/lidar/pillar_encoder.py", line 225, in forward
    features = self._extract_feature(features)
  File "/root/.local/lib/python3.10/site-packages/hat/models/task_modules/lidar/pillar_encoder.py", line 240, in _extract_feature
    features = pfn(features)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.local/lib/python3.10/site-packages/hat/models/task_modules/lidar/pillar_encoder.py", line 89, in forward
    x = self.linear(inputs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py", line 460, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Calculated padded input size per channel: (32 x 0). Kernel size: (1 x 1). Kernel size can't be greater than actual input size
Comments
  • 费小财

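    The (32 x 0) in the message means that one spatial dimension of the tensor reaching the 1 x 1 convolution in pillar_encoder.py is zero. This is usually caused by one of the following two situations:

    Empty batch / invalid data augmentation: some augmentation operations (such as random cropping, flipping, or scaling) can filter out every point of a sample in the current batch, or leave 0 pillars, and the downstream code does not handle this edge case correctly (for example, max_points_per_pillar or num_pillars is computed incorrectly).
    Configuration mismatch: the dataset preprocessing configuration (such as the point cloud range point_cloud_range and the voxel size voxel_size) does not match the input the model definition expects, so the computed pillar grid size is abnormal.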
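To check the configuration-mismatch case by hand, the pillar grid size can be derived from point_cloud_range and voxel_size. The sketch below is plain Python and is not part of HAT: pillar_grid_size and grid_problems are hypothetical helper names, and the example values are the commonly used PointPillars settings for KITTI, not necessarily the ones in your config.

```python
def pillar_grid_size(point_cloud_range, voxel_size):
    """Return (nx, ny, nz): the number of grid cells along each axis.

    point_cloud_range is [x_min, y_min, z_min, x_max, y_max, z_max];
    voxel_size is [dx, dy, dz], in the same unit as the range.
    """
    x_min, y_min, z_min, x_max, y_max, z_max = point_cloud_range
    extents = (x_max - x_min, y_max - y_min, z_max - z_min)
    grid = []
    for extent, size in zip(extents, voxel_size):
        if size <= 0:
            raise ValueError(f"voxel size must be positive, got {size}")
        # round() absorbs floating-point noise such as 431.99999999999994
        grid.append(int(round(extent / size)))
    return tuple(grid)


def grid_problems(grid):
    """Return one message per axis whose cell count is zero or negative."""
    names = ("x", "y", "z")
    return [
        f"{name} axis has {count} cells"
        for name, count in zip(names, grid)
        if count <= 0
    ]


if __name__ == "__main__":
    grid = pillar_grid_size([0, -39.68, -3, 69.12, 39.68, 1], [0.16, 0.16, 4])
    print(grid, grid_problems(grid))
```

If the grid computed from the dataset-side values differs from the one computed from the model-side values, or any axis comes out as zero or negative, the two configs are worth comparing line by line.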

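    Things to check:

    1. Add debug prints: print the input shape in pillar_encoder.py.
    2. Check data augmentation: temporarily disable all random augmentations (RandomFlip, RandomScale, etc.) and see whether the error still occurs. If it no longer occurs, one of the augmentation strategies is producing invalid data (such as an empty point cloud).
    3. Filter empty samples: add a check in the Dataset's __getitem__; if too few points remain after point cloud processing, discard the sample or return a default minimal valid Tensor.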
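Checks 1 and 3 can be sketched with two small framework-independent helpers. This is a minimal illustration under assumptions, not HAT code: zero_sized_dims, count_points_in_range and is_usable_sample are hypothetical names, points are assumed to be rows whose first three values are x, y, z, and the min_points threshold is an arbitrary choice.

```python
def zero_sized_dims(shape):
    """Return the indices of all dimensions in `shape` whose size is 0."""
    return [i for i, n in enumerate(shape) if n == 0]


def count_points_in_range(points, point_cloud_range):
    """Count points whose (x, y, z) lies inside the half-open range box."""
    x_min, y_min, z_min, x_max, y_max, z_max = point_cloud_range
    return sum(
        1
        for p in points
        if x_min <= p[0] < x_max
        and y_min <= p[1] < y_max
        and z_min <= p[2] < z_max
    )


def is_usable_sample(points, point_cloud_range, min_points=1):
    """True if at least `min_points` points survive the range filter."""
    return count_points_in_range(points, point_cloud_range) >= min_points


if __name__ == "__main__":
    # Check 1: a shape tuple as it might be printed before the 1 x 1 conv.
    print(zero_sized_dims((1, 4, 32, 0)))

    # Check 3: one point inside the range, one far outside it.
    cloud = [(1.0, 0.0, 0.0, 0.5), (100.0, 0.0, 0.0, 0.5)]
    print(is_usable_sample(cloud, [0, -39.68, -3, 69.12, 39.68, 1]))
```

In a debug session, zero_sized_dims(tuple(features.shape)) could be called right before the failing layer, and is_usable_sample could decide inside __getitem__ whether to pick another index instead of returning the sample. Where exactly these hooks belong depends on your dataset and transform pipeline.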
    15 hours ago