
[Re-Opened] Added support for standalone mlir-conv (used to be #2110) #2142

Merged 10 commits into develop on Sep 11, 2023

Conversation

ravil-mobile (Contributor) commented Aug 31, 2023

The old discussion is in PR #2110, which was created from a forked project. That was blocking some pipelines from starting. To avoid this, I created a PR from my local branch of the migraphx repo instead.

@ravil-mobile changed the title from "Ravil/standalone mlir" to "[Re-Opened] Added support for standalone mlir-conv" on Aug 31, 2023
@ravil-mobile changed the title from "[Re-Opened] Added support for standalone mlir-conv" to "[Re-Opened] Added support for standalone mlir-conv (used to be #2110)" on Aug 31, 2023
@ravil-mobile force-pushed the ravil/standalone-mlir branch from a1ba90c to 8f09284 on August 31, 2023 15:00
migraphx-bot (Collaborator) commented Aug 31, 2023

| Test | Batch | Rate new (3da39e) | Rate old (dfd95c) | Diff |
|---|---|---|---|---|
| torchvision-resnet50 | 64 | 2,284.73 | 2,282.93 | 0.08% |
| torchvision-resnet50_fp16 | 64 | 5,353.45 | 5,367.51 | -0.26% |
| torchvision-densenet121 | 32 | 1,834.21 | 1,823.27 | 0.60% |
| torchvision-densenet121_fp16 | 32 | 3,384.65 | 3,387.68 | -0.09% |
| torchvision-inceptionv3 | 32 | 1,343.00 | 1,342.40 | 0.04% |
| torchvision-inceptionv3_fp16 | 32 | 2,584.58 | 2,587.53 | -0.11% |
| cadene-inceptionv4 | 16 | 679.49 | 681.13 | -0.24% |
| cadene-resnext64x4 | 16 | 590.46 | 590.61 | -0.03% |
| slim-mobilenet | 64 | 7,215.02 | 7,223.96 | -0.12% |
| slim-nasnetalarge | 64 | 236.90 | 237.31 | -0.17% |
| slim-resnet50v2 | 64 | 2,527.94 | 2,529.37 | -0.06% |
| bert-mrpc-onnx | 8 | 720.81 | 720.88 | -0.01% |
| bert-mrpc-tf | 1 | 391.36 | 389.42 | 0.50% |
| pytorch-examples-wlang-gru | 1 | 305.05 | 301.89 | 1.05% |
| pytorch-examples-wlang-lstm | 1 | 313.25 | 314.33 | -0.34% |
| torchvision-resnet50_1 | 1 | 553.61 | 554.98 | -0.25% |
| torchvision-inceptionv3_1 | 1 | 306.50 | 307.16 | -0.21% |
| cadene-dpn92_1 | 1 | 351.51 | 351.25 | 0.07% |
| cadene-resnext101_1 | 1 | 220.19 | 220.62 | -0.20% |
| slim-vgg16_1 | 1 | 224.46 | 224.37 | 0.04% |
| slim-mobilenet_1 | 1 | 1,505.38 | 1,470.16 | 2.40% |
| slim-inceptionv4_1 | 1 | 220.66 | 218.52 | 0.98% |
| onnx-taau-downsample | 1 | 247.95 | 247.05 | 0.36% |
| dlrm-criteoterabyte | 1 | 21.68 | 21.71 | -0.14% |
| dlrm-criteoterabyte_fp16 | 1 | 40.66 | 40.63 | 0.09% |
| agentmodel | 1 | 5,757.05 | 5,862.59 | -1.80% |
| unet_fp16 | 2 | 55.09 | 55.01 | 0.15% |

This build is OK for merge ✅

migraphx-bot (Collaborator) commented Aug 31, 2023


✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

🔴 torchvision-inceptionv3_1: FAILED: MIGraphX is not within tolerance - check verbose output

🔴 cadene-dpn92_1: FAILED: MIGraphX is not within tolerance - check verbose output

✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

✅ slim-vgg16_1: PASSED: MIGraphX meets tolerance

✅ slim-mobilenet_1: PASSED: MIGraphX meets tolerance

🔴 slim-inceptionv4_1: FAILED: MIGraphX is not within tolerance - check verbose output

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

@@ -21,6 +21,7 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "migraphx/shape.hpp"
Collaborator commented:

Use angle brackets, but I don't think this header is needed, as it's already included by the other headers.

ravil-mobile (Contributor, Author) commented:

I just removed it. My VS Code probably included it automatically.

};

MIGRAPHX_DECLARE_ENV_VAR(MIGRAPHX_MLIR_ENABLE_OPS);
bool is_self_decide() { return env(MIGRAPHX_MLIR_ENABLE_OPS::value()).empty(); }
Collaborator commented:

This should use string_value_of; otherwise this will read the env every time, which is slow.

Also, what does is_self_decide mean?
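The reviewer's point about repeated env reads can be illustrated with a minimal sketch. The names here (`mlir_enable_ops_value`) are hypothetical and this is not the actual MIGraphX `string_value_of` implementation; it just shows the caching idea: a function-local static reads the environment variable once, so later calls avoid a `getenv()` lookup.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (illustrative, not the real MIGraphX API): read the
// environment variable on first use and cache the result in a function-local
// static, so repeated queries do not call std::getenv() again.
static const std::string& mlir_enable_ops_value()
{
    static const std::string cached = [] {
        const char* v = std::getenv("MIGRAPHX_MLIR_ENABLE_OPS");
        return std::string{v == nullptr ? "" : v};
    }();
    return cached;
}

// Mirrors the predicate under review: "self-decide" when the variable is unset.
bool is_self_decide() { return mlir_enable_ops_value().empty(); }
```

The function-local static is initialized exactly once (thread-safely since C++11), which is why it is a common pattern for caching environment lookups.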

ravil-mobile (Contributor, Author) commented Sep 4, 2023:

Hi,

1. OK, changed.

2. MIGRAPHX_MLIR_ENABLE_OPS means that the user specifies the concrete operations that rocMLIR has to handle. For example, MIGRAPHX_MLIR_ENABLE_OPS="conv" will generate only convolutions with rocMLIR; fusing will be switched off.

If MIGRAPHX_MLIR_ENABLE_OPS is not specified by the user, then we decide ourselves (hence the name self_decide) which operations to enable for rocMLIR: always use fused operations and, if the underlying GPU is from the Navi3x family, also use standalone convolutions.

If you want, I can rename MIGRAPHX_MLIR_ENABLE_OPS to MIGRAPHX_MLIR_USE_SPECIFIC_OPS.
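The decision logic described above can be sketched as follows. All names are illustrative (this is not the actual MIGraphX code); it assumes the variable holds a comma-separated list of operator names and that an empty value means "self-decide":

```cpp
#include <sstream>
#include <string>
#include <unordered_set>

// Hypothetical sketch of the described semantics. Split a comma-separated
// list of operator names (e.g. "convolution,dot") into a set.
std::unordered_set<std::string> parse_enabled_ops(const std::string& env_value)
{
    std::unordered_set<std::string> ops;
    std::stringstream ss(env_value);
    std::string op;
    while(std::getline(ss, op, ','))
        if(!op.empty())
            ops.insert(op);
    return ops;
}

// Empty value -> self-decide: standalone convolutions only on Navi3x GPUs.
// Non-empty value -> standalone convolution only if the user listed it.
bool use_standalone_convolution(const std::string& env_value, bool is_navi3x)
{
    if(env_value.empty())
        return is_navi3x;
    return parse_enabled_ops(env_value).count("convolution") > 0;
}
```

Under these assumptions, an unset variable enables standalone convolutions only on Navi3x, while an explicit list overrides the heuristic entirely.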

Collaborator commented:

Sure, either name is fine. Can you add a comment to the source code to explain how this is supposed to work and what the valid values are? I would expect to use MIGRAPHX_MLIR_ENABLE_OPS=convolution rather than MIGRAPHX_MLIR_ENABLE_OPS=conv, since that's the way we spell the operator; but if it's spelled differently for this variable, we should have comments documenting it.

Contributor commented:

Let's not introduce another indirection here and stick to MIGraphX op names?

ravil-mobile (Contributor, Author) commented:

Hi @pfultz2,

> Can you add a comment to the source code to explain how this is supposed to work and what the valid values are?

Done.

> I would expect to use MIGRAPHX_MLIR_ENABLE_OPS=convolution

Fixed.

codecov bot commented Sep 4, 2023

Codecov Report

Merging #2142 (d134a5a) into develop (72b691a) will not change coverage.
The diff coverage is n/a.

❗ Current head d134a5a differs from pull request most recent head 3da39e8. Consider uploading reports for the commit 3da39e8 to get more accurate results

@@           Coverage Diff            @@
##           develop    #2142   +/-   ##
========================================
  Coverage    91.48%   91.48%           
========================================
  Files          424      424           
  Lines        15873    15873           
========================================
  Hits         14521    14521           
  Misses        1352     1352           

@ravil-mobile requested a review from pfultz2 on September 4, 2023
@ravil-mobile force-pushed the ravil/standalone-mlir branch 6 times, most recently from 1c447ff to 954d1c5, on September 11, 2023 12:55
@causten requested a review from manupak on September 11, 2023
@causten merged commit ea97ce5 into develop on Sep 11, 2023
12 checks passed
@causten deleted the ravil/standalone-mlir branch on September 11, 2023
5 participants