
[Re-Opened] Added support for standalone mlir-conv (used to be #2110) #2142

Merged 10 commits into develop on Sep 11, 2023

Conversation

ravil-mobile (Contributor) commented Aug 31, 2023

The old discussion is in PR #2110, which was created from a forked project. That was blocking some pipelines from starting. To avoid this, I created a PR from my local branch of the migraphx repo instead.

@ravil-mobile changed the title from "Ravil/standalone mlir" to "[Re-Opened] Added support for standalone mlir-conv" on Aug 31, 2023
@ravil-mobile changed the title from "[Re-Opened] Added support for standalone mlir-conv" to "[Re-Opened] Added support for standalone mlir-conv (used to be #2110)" on Aug 31, 2023
@ravil-mobile force-pushed the ravil/standalone-mlir branch from a1ba90c to 8f09284 on August 31, 2023 15:00
migraphx-bot (Collaborator) commented Aug 31, 2023

| Test | Batch | Rate new (3da39e) | Rate old (dfd95c) | Diff |
|---|---|---|---|---|
| torchvision-resnet50 | 64 | 2,284.73 | 2,282.93 | 0.08% |
| torchvision-resnet50_fp16 | 64 | 5,353.45 | 5,367.51 | -0.26% |
| torchvision-densenet121 | 32 | 1,834.21 | 1,823.27 | 0.60% |
| torchvision-densenet121_fp16 | 32 | 3,384.65 | 3,387.68 | -0.09% |
| torchvision-inceptionv3 | 32 | 1,343.00 | 1,342.40 | 0.04% |
| torchvision-inceptionv3_fp16 | 32 | 2,584.58 | 2,587.53 | -0.11% |
| cadene-inceptionv4 | 16 | 679.49 | 681.13 | -0.24% |
| cadene-resnext64x4 | 16 | 590.46 | 590.61 | -0.03% |
| slim-mobilenet | 64 | 7,215.02 | 7,223.96 | -0.12% |
| slim-nasnetalarge | 64 | 236.90 | 237.31 | -0.17% |
| slim-resnet50v2 | 64 | 2,527.94 | 2,529.37 | -0.06% |
| bert-mrpc-onnx | 8 | 720.81 | 720.88 | -0.01% |
| bert-mrpc-tf | 1 | 391.36 | 389.42 | 0.50% |
| pytorch-examples-wlang-gru | 1 | 305.05 | 301.89 | 1.05% |
| pytorch-examples-wlang-lstm | 1 | 313.25 | 314.33 | -0.34% |
| torchvision-resnet50_1 | 1 | 553.61 | 554.98 | -0.25% |
| torchvision-inceptionv3_1 | 1 | 306.50 | 307.16 | -0.21% |
| cadene-dpn92_1 | 1 | 351.51 | 351.25 | 0.07% |
| cadene-resnext101_1 | 1 | 220.19 | 220.62 | -0.20% |
| slim-vgg16_1 | 1 | 224.46 | 224.37 | 0.04% |
| slim-mobilenet_1 | 1 | 1,505.38 | 1,470.16 | 2.40% |
| slim-inceptionv4_1 | 1 | 220.66 | 218.52 | 0.98% |
| onnx-taau-downsample | 1 | 247.95 | 247.05 | 0.36% |
| dlrm-criteoterabyte | 1 | 21.68 | 21.71 | -0.14% |
| dlrm-criteoterabyte_fp16 | 1 | 40.66 | 40.63 | 0.09% |
| agentmodel | 1 | 5,757.05 | 5,862.59 | -1.80% |
| unet_fp16 | 2 | 55.09 | 55.01 | 0.15% |

This build is OK for merge ✅

migraphx-bot (Collaborator) commented Aug 31, 2023


✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

🔴 torchvision-inceptionv3_1: FAILED: MIGraphX is not within tolerance - check verbose output

🔴 cadene-dpn92_1: FAILED: MIGraphX is not within tolerance - check verbose output

✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

✅ slim-vgg16_1: PASSED: MIGraphX meets tolerance

✅ slim-mobilenet_1: PASSED: MIGraphX meets tolerance

🔴 slim-inceptionv4_1: FAILED: MIGraphX is not within tolerance - check verbose output

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

@@ -21,6 +21,7 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "migraphx/shape.hpp"
Collaborator commented:

Use angle brackets, but I don't think this header is needed, as it's already included by the other headers.

ravil-mobile (Contributor, Author) commented:

I just removed it. My VS Code probably included it automatically.

};

MIGRAPHX_DECLARE_ENV_VAR(MIGRAPHX_MLIR_ENABLE_OPS);
bool is_self_decide() { return env(MIGRAPHX_MLIR_ENABLE_OPS::value()).empty(); }
Collaborator commented:

This should use string_value_of; otherwise this will read the env every time, which is slow.

Also, what does is_self_decide mean?
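The reviewer's point about repeated env reads can be illustrated with a minimal sketch. The names here (`mlir_enable_ops_value`) are hypothetical and this is not the actual MIGraphX `string_value_of` implementation; it just shows the caching idea: a function-local static reads the environment variable once, so later calls avoid a `getenv()` lookup.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (illustrative, not the real MIGraphX API): read the
// environment variable on first use and cache the result in a function-local
// static, so repeated queries do not call std::getenv() again.
static const std::string& mlir_enable_ops_value()
{
    static const std::string cached = [] {
        const char* v = std::getenv("MIGRAPHX_MLIR_ENABLE_OPS");
        return std::string{v == nullptr ? "" : v};
    }();
    return cached;
}

// Mirrors the predicate under review: "self-decide" when the variable is unset.
bool is_self_decide() { return mlir_enable_ops_value().empty(); }
```

The function-local static is initialized exactly once (thread-safely since C++11), which is why it is a common pattern for caching environment lookups.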

ravil-mobile (Contributor, Author) commented Sep 4, 2023:

Hi,

1. OK, changed.

2. MIGRAPHX_MLIR_ENABLE_OPS means that the user specifies the concrete operations that rocMLIR has to handle. For example, MIGRAPHX_MLIR_ENABLE_OPS="conv" will generate only convolutions with rocMLIR; fusing will be switched off.

If MIGRAPHX_MLIR_ENABLE_OPS is not specified by the user, then we decide ourselves (hence the name self_decide) which operations to enable for rocMLIR: always use fused operations and, if the underlying GPU is from the Navi3x family, also use standalone convolutions.

If you want, I can rename MIGRAPHX_MLIR_ENABLE_OPS to MIGRAPHX_MLIR_USE_SPECIFIC_OPS.
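The decision logic described above can be sketched as follows. All names are illustrative (this is not the actual MIGraphX code); it assumes the variable holds a comma-separated list of operator names and that an empty value means "self-decide":

```cpp
#include <sstream>
#include <string>
#include <unordered_set>

// Hypothetical sketch of the described semantics. Split a comma-separated
// list of operator names (e.g. "convolution,dot") into a set.
std::unordered_set<std::string> parse_enabled_ops(const std::string& env_value)
{
    std::unordered_set<std::string> ops;
    std::stringstream ss(env_value);
    std::string op;
    while(std::getline(ss, op, ','))
        if(!op.empty())
            ops.insert(op);
    return ops;
}

// Empty value -> self-decide: standalone convolutions only on Navi3x GPUs.
// Non-empty value -> standalone convolution only if the user listed it.
bool use_standalone_convolution(const std::string& env_value, bool is_navi3x)
{
    if(env_value.empty())
        return is_navi3x;
    return parse_enabled_ops(env_value).count("convolution") > 0;
}
```

Under these assumptions, an unset variable enables standalone convolutions only on Navi3x, while an explicit list overrides the heuristic entirely.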

Collaborator commented:

Sure, either name is fine. Can you add a comment to the source code to explain how this is supposed to work and what the valid values are? I would expect to use MIGRAPHX_MLIR_ENABLE_OPS=convolution rather than MIGRAPHX_MLIR_ENABLE_OPS=conv, since that's the way we spell the operator; but if it's spelled differently for this variable, we should have comments documenting it.

Contributor commented:

Let's not introduce another indirection here and stick to MIGraphX op names?

ravil-mobile (Contributor, Author) commented:

Hi @pfultz2,

> Can you add a comment to the source code to explain how this is supposed to work and what the valid values are?

Done.

> I would expect to use MIGRAPHX_MLIR_ENABLE_OPS=convolution

Fixed.

codecov bot commented Sep 4, 2023

Codecov Report

Merging #2142 (d134a5a) into develop (72b691a) will not change coverage.
The diff coverage is n/a.

❗ Current head d134a5a differs from pull request most recent head 3da39e8. Consider uploading reports for the commit 3da39e8 to get more accurate results

@@           Coverage Diff            @@
##           develop    #2142   +/-   ##
========================================
  Coverage    91.48%   91.48%           
========================================
  Files          424      424           
  Lines        15873    15873           
========================================
  Hits         14521    14521           
  Misses        1352     1352           

@ravil-mobile requested a review from pfultz2 on September 4, 2023
@ravil-mobile force-pushed the ravil/standalone-mlir branch 6 times, most recently from 1c447ff to 954d1c5, on September 11, 2023 12:55
@causten requested a review from manupak on September 11, 2023
@causten merged commit ea97ce5 into develop on Sep 11, 2023
12 checks passed
@causten deleted the ravil/standalone-mlir branch on September 11, 2023
5 participants