(PS: I am using Google Colab for all of this.)
I was trying to use SparseZoo and SparseML to finetune one of the models (https://sparsezoo.neuralmagic.com/models/yolov8-n-coco-pruned49_quantized?hardware=deepsparse-c6i.12xlarge) on my custom dataset. Training and export worked, but my finetuned model.onnx is ~4.7 MB, while the original pretrained model.onnx is ~3.5 MB. Shouldn't the size stay the same after finetuning?
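For context, the training and export steps were the standard SparseML YOLOv8 flow, roughly like this (the zoo stub is my best reading of the model page, and I may be misremembering minor flags):

sparseml.yolov8.train \
  --model zoo:yolov8-n-coco-pruned49_quantized \
  --recipe zoo:yolov8-n-coco-pruned49_quantized \
  --data my_dataset.yaml

sparseml.yolov8.export --model sparse_model_runs/detect/sparse49_56epochs/weights/last.pt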
I also tried benchmarking it on the CPU with deepsparse.benchmark.
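The invocation was essentially the default one, pointed at the exported deployment folder (I'm reconstructing it from the logged settings, so I may have omitted a flag):

deepsparse.benchmark /content/drive/MyDrive/sparse_model_runs/detect/sparse49_56epochs/deployment/model.onnx

These are the results: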
2024-06-26 03:39:24 deepsparse.benchmark.helpers INFO Thread pinning to cores enabled
DeepSparse, Copyright 2021-present / Neuralmagic, Inc. version: 1.7.1 COMMUNITY | (3904e8ec) (release) (optimized) (system=avx2, binary=avx2)
2024-06-26 03:39:25 deepsparse.benchmark.benchmark_model INFO deepsparse.engine.Engine:
onnx_file_path: /content/drive/MyDrive/sparse_model_runs/detect/sparse49_56epochs/deployment/model.onnx
batch_size: 1
num_cores: 1
num_streams: 1
scheduler: Scheduler.default
fraction_of_supported_ops: 0.0
cpu_avx_type: avx2
cpu_vnni: False
2024-06-26 03:39:25 deepsparse.utils.onnx INFO Generating input 'images', type = uint8, shape = [1, 3, 640, 640]
2024-06-26 03:39:25 deepsparse.benchmark.benchmark_model INFO Starting 'singlestream' performance measurements for 10 seconds
Original Model Path: /content/drive/MyDrive/sparse_model_runs/detect/sparse49_56epochs/deployment/model.onnx
Batch Size: 1
Scenario: sync
Throughput (items/sec): 3.6589
Latency Mean (ms/batch): 273.2898
Latency Median (ms/batch): 235.4801
Latency Std (ms/batch): 58.9786
Iterations: 37
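As a cross-check I can also run the engine from Python; this is a minimal sketch based on the deepsparse examples (I'm assuming Engine.benchmark and generate_random_inputs behave as documented):

from deepsparse import Engine
from deepsparse.utils import generate_random_inputs

model_path = "/content/drive/MyDrive/sparse_model_runs/detect/sparse49_56epochs/deployment/model.onnx"

# Compile the model with the DeepSparse engine (the same engine the CLI uses)
engine = Engine(model_path, batch_size=1)

# Time the forward pass on random inputs matching the model's input signature
inputs = generate_random_inputs(model_path, batch_size=1)
results = engine.benchmark(inputs, num_iterations=20, num_warmup_iterations=5)
print(results)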
Why is fraction_of_supported_ops = 0.0?
When exporting the model, I get these warnings:
2024-06-26 01:49:44.989820: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-26 01:49:47.102603: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Could it be related to this? I also tried exporting the model.pt file of the original pretrained model, and that exported model.onnx was still around ~4.7 MB, even though the documented size is ~3.5 MB.
I then benchmarked my exported version of the original pretrained model against the pretrained model.onnx downloaded from the website, and I see the same differences: fraction_of_supported_ops = 0.0 for my export, but 1.0 for the model.onnx from the site.
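If it helps with diagnosing this, I can compare the op types in the two graphs with a small script like the following (standard onnx API; the paths are placeholders for my export and the zoo download):

from collections import Counter

import onnx

# Placeholder paths: my export (~4.7 MB) vs. the SparseZoo download (~3.5 MB)
MY_EXPORT = "deployment/model.onnx"
ZOO_MODEL = "zoo_download/model.onnx"

for path in (MY_EXPORT, ZOO_MODEL):
    graph = onnx.load(path).graph
    ops = Counter(node.op_type for node in graph.node)
    # A quantized graph should show QuantizeLinear/DequantizeLinear
    # (or QLinear*/ConvInteger) nodes; their absence would explain the size gap
    print(path, dict(ops))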
I'd appreciate any help and can provide any additional information.