While using "hb_mapper makertbin" to convert my *.onnx to *.bin, I found that one operator, "GlobalAveragePool_380", has low cosine similarity (about 0.84). This operator is what causes my single object tracking model to generate invalid outputs. Running "GlobalAveragePool_380" on the CPU reduced the precision loss, but it also increases the latency of my *.bin. How can I run this operator on the BPU with both low latency and high cosine similarity (0.90 may be enough)?
Some additional details:
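For clarity, by cosine similarity I mean the usual measure between the float model's output and the quantized model's output for the same node, with both tensors flattened. A minimal NumPy sketch of that measure follows; the function name and the sample arrays are my own illustration and are not part of the Horizon toolchain, which may compute its reported value differently.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity between two tensors, compared as flat vectors."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Guard against division by zero when either tensor is all zeros.
    return float(np.dot(a, b) / max(denom, eps))

# Hypothetical stand-ins for a node's float output and quantized output.
float_out = np.array([1.0, 2.0, 3.0])
quant_out = np.array([1.1, 1.9, 3.2])
print(cosine_similarity(float_out, quant_out))
```

A value close to 1.0 means the two outputs point in nearly the same direction; the measure ignores overall scale, so a high value alone does not guarantee matching magnitudes.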
My DCMT_sim.onnx has other GlobalAveragePool operators, but those (such as GlobalAveragePool_306) have high cosine similarity.
GlobalAveragePool_380 meets the restrictions listed in the Horizon supported_op_list_and_restrictions document (https://developer.horizon.ai/api/v1/fileData/documents_pi/ai_toolchain_develop/horizon_ai_toolchain_user_guide/supported_op_list_and_restrictions.html).
Even with GlobalAveragePool_380 on the CPU, its cosine similarity is still not very high (about 0.90).
No matter what the inputs are, the second output (named output2) of the *.bin with GlobalAveragePool_380 on the BPU has constant values. You can see this strange behavior by changing the inputs in debug.py provided below.
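To make the "constant values" observation concrete, here is a minimal sketch of the kind of check I mean. It assumes the output2 arrays from several runs with different inputs have already been collected as NumPy arrays; the helper name and the tolerance are hypothetical and not taken from debug.py.

```python
import numpy as np

def is_constant_across_inputs(outputs, atol=1e-6):
    """Return True if every output array matches the first one within atol."""
    first = np.asarray(outputs[0], dtype=np.float64)
    return all(
        np.allclose(first, np.asarray(o, dtype=np.float64), rtol=0.0, atol=atol)
        for o in outputs[1:]
    )

# Hypothetical output2 tensors from two runs with different inputs.
run_a = np.array([0.25, 0.25, 0.50])
run_b = np.array([0.25, 0.25, 0.50])
print(is_constant_across_inputs([run_a, run_b]))
```

If this returns True for clearly different inputs, the output does not depend on the input at all, which is what I see for output2 when the operator runs on the BPU.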
My config.yaml is as follows:
DCMT_sim.onnx and the calibration files can be found at the link below:
Link: https://pan.baidu.com/s/1xh4KNGrXgJkPHrrdYX6oZg Extraction code: zshn
PS: DCMT_sim.onnx and the calibration files have been validated and are fine~
To verify the *.bin, you can use the code of debug.py as follows:
x.bin, z.bin and b.bin can be found at https://developer.horizon.ai/forumDetail/146176815327779277
THANKS VERY MUCH~

