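Hello. The profiling log shows that some kernels still have no non-JIT implementation, which means FLAGS_npu_jit_compile=False has no effect on kernels without the aclnn prefix. Those kernels are still recompiled whenever the input shape changes, and that increases latency. PaddleCustomDevice is steadily reducing the number of such kernels, so please try updating to the latest version.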
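One way to limit how many distinct input shapes the JIT-compiled kernels see is to pad each crop up to a small set of bucketed sizes before inference. The sketch below only computes the padded size. `bucket_size`, `distinct_shapes`, and the multiple of 64 are illustrative assumptions, not something PaddleOCR or PaddleCustomDevice provides, and whether padding is acceptable for the table model's accuracy would need to be checked separately.

```python
def bucket_size(height, width, multiple=64):
    """Round a crop's (height, width) up to the next multiple.

    Hypothetical helper: the multiple of 64 is an arbitrary choice.
    """
    if height <= 0 or width <= 0 or multiple <= 0:
        raise ValueError("height, width and multiple must be positive")

    def round_up(value):
        return ((value + multiple - 1) // multiple) * multiple

    return round_up(height), round_up(width)


def distinct_shapes(crop_sizes, multiple=64):
    """Return the set of padded shapes a list of (height, width) crops maps to."""
    return {bucket_size(h, w, multiple) for h, w in crop_sizes}
```

With a multiple of 64, crops of 100x130, 120x190 and 65x129 all map to the same padded shape, so a shape-specialized kernel would be compiled once for that bucket rather than once per crop.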
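I updated to the latest version and sent requests with the same image, but the result is unchanged. With the same image and the same request, the shapes should in theory be identical, yet each request still takes just over 6 s.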
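To tell a one-off compile cost apart from a cost that recurs on every request, it can help to time the same call several times in one process. This is a minimal sketch: `time_repeated` and `summarize` are hypothetical helpers, and `fn` stands in for whatever callable runs the table pipeline on one image.

```python
import statistics
import time


def time_repeated(fn, arg, runs=5):
    """Call fn(arg) `runs` times and return the wall-clock seconds of each call."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(arg)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Separate the first call (which may include compilation) from later calls."""
    rest = timings[1:]
    return {
        "first_call_s": timings[0],
        "steady_median_s": statistics.median(rest) if rest else None,
    }
```

If `steady_median_s` stays close to `first_call_s` for an identical image, the slowdown is not limited to warm-up, which matches what is reported here.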
ronny1996
On an Ascend 910B, reading a table with PaddleOCR takes about 6 s; on an NVIDIA GPU it takes only 0.8 s
Environment: cann80RC1-ubuntu20-paddleocr2.7.3-paddlepaddle(3.0.0.dev20240527)
Environment variables:
ENV FLAGS_npu_jit_compile=False
ENV FLAGS_npu_scale_aclnn=True
ENV CUSTOM_DEVICE_BLACK_LIST="set_value,set_value_with_tensor"
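When the container's ENV lines cannot be changed, the same variables can be set from Python. This is a sketch under the assumption that the flags are read from the environment when paddle is imported, so the call has to happen before `import paddle` or `import paddleocr`. `apply_npu_flags` is a hypothetical helper, not part of either library.

```python
import os

# Values copied from the ENV lines in this issue.
NPU_FLAGS = {
    "FLAGS_npu_jit_compile": "False",
    "FLAGS_npu_scale_aclnn": "True",
    "CUSTOM_DEVICE_BLACK_LIST": "set_value,set_value_with_tensor",
}


def apply_npu_flags(env=os.environ):
    """Set each flag unless it is already defined, and return the effective values."""
    for name, value in NPU_FLAGS.items():
        env.setdefault(name, value)
    return {name: env[name] for name in NPU_FLAGS}
```

Returning the effective values makes it easy to log what the process actually runs with, which is useful when a value set in the image overrides the default here.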
Steps to reproduce:
Command line:
paddleocr --image_dir tts --use_npu true --type=structure
The tts directory contains 2 images, both of which contain tables.
Code:
```python
table_engine = NewPPStructure(
    show_log=True,
    lang='ch',
    recovery=False,
    det_limit_side_len=1920,
    use_npu=is_use_npu(),
    det_model_dir='ch_PP-OCRv4_det_infer',
    rec_model_dir='ch_PP-OCRv4_rec_infer',
    layout_model_dir='picodet_lcnet_x1_0_fgd_layout_cdla_infer',
    layout_dict_path='/opt/anaconda3/envs/paddle_env/lib/python3.8/site-packages/paddleocr/ppocr/utils/dict/layout_dict/layout_cdla_dict.txt',
    # table_char_dict_path='table_structure_dict_ch.txt',
    table_model_dir='ch_ppstructure_mobile_v2.0_SLANet_infer',
    layout_score_threshold=0.25,
    layout_nms_threshold=0.5)

# Extract table information (fragment from inside a method; `self` is the pipeline object)
table_extract_num = 0
for b in final_layouts:
    if str(b.type).lower() == "table":
        x1, y1, x2, y2 = b.block.x_1, b.block.y_1, b.block.x_2, b.block.y_2
        # Run table recognition on a copy of the cropped table region
        res, table_time_dict = self.table_system(
            ori_im[y1:y2, x1:x2].copy(), return_ocr_result_in_table)
        b.text = res['html']
        # Blank out the table region in the original image with white pixels
        ori_im[y1:y2, x1:x2] = np.ones((y2 - y1, x2 - x1, 3), dtype=np.uint8) * 255
        table_extract_num += 1
    else:
        b.text = []
get_logger().info("extract table nums:{}".format(table_extract_num))
```
Profiling
I ran one profiling pass. The log data can be provided; the profiling data is large, so leave a comment and I will send it.
paddleocr_6.log