
Training bev_mt_lss

Resolved
llll, 2023-07-18


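1. Chip model: 4090
2. OpenExplorer (天工开物) development package version: J5_OE_1.1.57, etc.
3. Where the problem occurs: while learning bev_mt_lss, during floating-point training
4. Problem description: after configuring everything according to the documentation, running train.py inside the provided docker produces the following error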

  File "/usr/local/lib/python3.8/dist-packages/hat/engine/ddp_trainer.py", line 420, in _with_exception
    fn(*args)
  File "/mnt/yrfs/J5/bev_release_package/tools/train.py", line 187, in train_entrance
    trainer = build_from_registry(trainer)
  File "/usr/local/lib/python3.8/dist-packages/hat/registry.py", line 250, in build_from_registry
    return _impl(x)
  File "/usr/local/lib/python3.8/dist-packages/hat/registry.py", line 210, in _impl
    build_x = dict(((key, _impl(value)) for key, value in x.items())) # noqa
  File "/usr/local/lib/python3.8/dist-packages/hat/registry.py", line 210, in <genexpr>
    build_x = dict(((key, _impl(value)) for key, value in x.items())) # noqa
  File "/usr/local/lib/python3.8/dist-packages/hat/registry.py", line 210, in _impl
    build_x = dict(((key, _impl(value)) for key, value in x.items())) # noqa
  File "/usr/local/lib/python3.8/dist-packages/hat/registry.py", line 210, in <genexpr>
    build_x = dict(((key, _impl(value)) for key, value in x.items())) # noqa
  File "/usr/local/lib/python3.8/dist-packages/hat/registry.py", line 193, in _impl
    x = type(x)((_impl(x_i) for x_i in x))
  File "/usr/local/lib/python3.8/dist-packages/hat/registry.py", line 193, in <genexpr>
    x = type(x)((_impl(x_i) for x_i in x))
  File "/usr/local/lib/python3.8/dist-packages/hat/registry.py", line 227, in _impl
    _raise_invalid_type_error(object_type)
  File "/usr/local/lib/python3.8/dist-packages/hat/registry.py", line 75, in _raise_invalid_type_error
    raise TypeError(
TypeError: BevRotate has not registered in any of registry ['HAT_OBJECT_REGISTRY'] and is not a class, which is not allowed

ERROR:__main__:launch trainer failed! process 0 terminated with exit code 1

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/.vscode-server/extensions/ms-python.python-2023.12.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/root/.vscode-server/extensions/ms-python.python-2023.12.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/root/.vscode-server/extensions/ms-python.python-2023.12.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/root/.vscode-server/extensions/ms-python.python-2023.12.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/root/.vscode-server/extensions/ms-python.python-2023.12.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/root/.vscode-server/extensions/ms-python.python-2023.12.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "/mnt/yrfs/J5/bev_release_package/tools/train.py", line 279, in <module>
    train(
  File "/mnt/yrfs/J5/bev_release_package/tools/train.py", line 274, in train
    raise e
  File "/mnt/yrfs/J5/bev_release_package/tools/train.py", line 257, in train
    launch(
  File "/usr/local/lib/python3.8/dist-packages/hat/engine/ddp_trainer.py", line 376, in launch
    mp.spawn(
  File "/root/.local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/root/.local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/root/.local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 149, in join
    raise ProcessExitedException(
torch.multiprocessing.spawn.ProcessExitedException: process 0 terminated with exit code 1
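
The traceback shows a recursive builder walking the config and failing on a `type` value that is neither a registered name nor a class. The sketch below is a toy stand-in for that pattern, not the HAT implementation: `REGISTRY`, `register`, `build`, and the class name `RenamedBevRotate` are all hypothetical and exist only to show why a config written for one package version can fail against another.

```python
# Toy registry-driven config builder. This is NOT the HAT code; every name
# here (REGISTRY, register, build, RenamedBevRotate) is hypothetical.
REGISTRY = {}


def register(cls):
    """Record a class under its own name so configs can refer to it by string."""
    REGISTRY[cls.__name__] = cls
    return cls


@register
class RenamedBevRotate:
    """Stands in for a transform whose name changed between package versions."""

    def __init__(self, max_angle=0.0):
        self.max_angle = max_angle


def build(cfg):
    """Recursively turn dicts that carry a 'type' key into objects."""
    if isinstance(cfg, dict):
        built = {key: build(value) for key, value in cfg.items()}
        if "type" not in built:
            return built
        type_name = built.pop("type")
        if isinstance(type_name, str):
            if type_name not in REGISTRY:
                # Same failure mode as in the traceback: unknown name, not a class.
                raise TypeError(f"{type_name} has not registered in any registry")
            return REGISTRY[type_name](**built)
        return type_name(**built)  # 'type' was already a class object
    if isinstance(cfg, (list, tuple)):
        return type(cfg)(build(item) for item in cfg)
    return cfg


# A config using the name known to this "version" builds fine ...
new_cfg = {"transforms": [{"type": "RenamedBevRotate", "max_angle": 10.0}]}
# ... while a config written for another version names a class that is absent.
old_cfg = {"transforms": [{"type": "BevRotate", "max_angle": 10.0}]}
```

Calling `build(new_cfg)` returns the constructed objects, whereas `build(old_cfg)` raises `TypeError`, which is the kind of mismatch a config and package from different releases would produce.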

Comments (2)
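  • 颜值即正义

    Hello. Starting with J5 OE 1.1.57, bev is merged into the OE package, and the names and paths of some classes have changed. If you are using the 1.1.57 docker, please use the matching bev reference algorithm from the 1.1.57 OE package, under ddk\samples\ai_toolchain\horizon_model_train_sample\scripts.

    2023-07-18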
    • llll, replying to 颜值即正义:

      The OE package is 1.1 GB, and this directory is empty.

      2023-07-19
  • llll

    The OE package is 1.1 GB, and this directory is empty.

    2023-07-19
    • 颜值即正义, replying to llll:

      Please compare the md5 of your OE archive against: 76ab34abb9fc4bce3e0ba8870f24fe13

      2023-07-19
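
One way to run the md5 comparison suggested in the last reply is a short script using Python's standard `hashlib`. This is a minimal sketch: the helper names `file_md5` and `matches_expected` are made up for illustration, and the archive path is a placeholder because the thread does not give the file name.

```python
import hashlib

# md5 value quoted in the reply above.
EXPECTED_MD5 = "76ab34abb9fc4bce3e0ba8870f24fe13"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex md5 of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's md5 equals the expected value (case-insensitive)."""
    return file_md5(path) == expected.lower()
```

For example, `matches_expected("path/to/your_oe_archive")` (placeholder path) returns `False` when the download is incomplete or corrupted, in which case downloading the archive again would be the next thing to try.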