Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/missing miopen dll #2227

Closed
wants to merge 3 commits into from
Closed

Fix/missing miopen dll #2227

wants to merge 3 commits into from

Conversation

tvukovic-amd
Copy link
Collaborator

Fixed test_gpu_* issue: Exit code 0xc0000135 which is caused by missing MIOpen.dll inside AMDMIGraphX/build/bin directory. Added inside src/targets/gpu/CMakeLists.txt MIOpen.dll copy to appropriate directory (in Windows case). This solution fixed MIGraphX test issue which appeared in the following tests:

  • test_gpu_adjust_allocation
  • test_gpu_context_serialize
  • test_gpu_hip
  • test_gpu_literal
  • test_gpu_mlir
  • test_gpu_pack_args
  • test_gpu_pack_int8_args
  • test_gpu_quantization

@codecov
Copy link

codecov bot commented Sep 21, 2023

Codecov Report

Merging #2227 (cb7d662) into develop (482e8d6) will increase coverage by 0.01%.
Report is 4 commits behind head on develop.
The diff coverage is 96.77%.

❗ Current head cb7d662 differs from pull request most recent head c0ddcfc. Consider uploading reports for the commit c0ddcfc to get more accurate results

@@             Coverage Diff             @@
##           develop    #2227      +/-   ##
===========================================
+ Coverage    91.49%   91.51%   +0.01%     
===========================================
  Files          427      427              
  Lines        15953    15972      +19     
===========================================
+ Hits         14596    14616      +20     
+ Misses        1357     1356       -1     
Files Changed Coverage Δ
src/propagate_constant.cpp 79.48% <66.66%> (ø)
src/include/migraphx/op/roialign.hpp 99.16% <100.00%> (ø)
src/onnx/parse_constant.cpp 100.00% <100.00%> (ø)
src/onnx/parse_roialign.cpp 92.30% <100.00%> (ø)
src/optimize_module.cpp 100.00% <100.00%> (ø)
src/simplify_reshapes.cpp 98.71% <100.00%> (+0.04%) ⬆️

... and 1 file with indirect coverage changes

@TedThemistokleous TedThemistokleous added roadmap Tasks to finish for a release Continous Integration Pull request updates parts of continous integration pipeline Windows Related changes for Windows Environments labels Sep 21, 2023
@migraphx-bot
Copy link
Collaborator

Test Batch Rate new
c0ddcf
Rate old
ce2ada
Diff Compare
torchvision-resnet50 64 2,285.01 2,282.71 0.10%
torchvision-resnet50_fp16 64 5,355.83 5,355.47 0.01%
torchvision-densenet121 32 1,838.81 1,820.41 1.01%
torchvision-densenet121_fp16 32 3,393.31 3,391.58 0.05%
torchvision-inceptionv3 32 1,338.49 1,347.15 -0.64%
torchvision-inceptionv3_fp16 32 2,583.21 2,590.46 -0.28%
cadene-inceptionv4 16 678.47 680.47 -0.29%
cadene-resnext64x4 16 589.65 590.45 -0.14%
slim-mobilenet 64 7,220.02 7,220.57 -0.01%
slim-nasnetalarge 64 236.90 237.06 -0.07%
slim-resnet50v2 64 2,528.57 2,529.32 -0.03%
bert-mrpc-onnx 8 721.19 840.05 -14.15% 🔴
bert-mrpc-tf 1 391.28 389.57 0.44%
pytorch-examples-wlang-gru 1 306.20 303.52 0.88%
pytorch-examples-wlang-lstm 1 313.37 312.69 0.22%
torchvision-resnet50_1 1 555.70 559.28 -0.64%
torchvision-inceptionv3_1 1 308.22 308.73 -0.16%
cadene-dpn92_1 1 356.84 349.09 2.22%
cadene-resnext101_1 1 220.29 220.76 -0.21%
slim-vgg16_1 1 224.75 224.44 0.14%
slim-mobilenet_1 1 1,513.28 1,474.94 2.60%
slim-inceptionv4_1 1 218.57 220.74 -0.98%
onnx-taau-downsample 1 322.58 322.55 0.01%
dlrm-criteoterabyte 1 21.67 21.69 -0.06%
dlrm-criteoterabyte_fp16 1 40.65 40.62 0.07%
agentmodel 1 5,917.23 5,701.68 3.78% 🔆
unet_fp16 2 55.13 55.24 -0.20%

This build is not recommended to merge 🔴

@migraphx-bot
Copy link
Collaborator


    :white_check_mark:bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

    :white_check_mark:bert-mrpc-tf: PASSED: MIGraphX meets tolerance

    :white_check_mark:pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

    :white_check_mark:pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

    :white_check_mark:torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

🔴torchvision-inceptionv3_1: FAILED: MIGraphX is not within tolerance - check verbose output


🔴cadene-dpn92_1: FAILED: MIGraphX is not within tolerance - check verbose output


    :white_check_mark:cadene-resnext101_1: PASSED: MIGraphX meets tolerance

    :white_check_mark:slim-vgg16_1: PASSED: MIGraphX meets tolerance

    :white_check_mark:slim-mobilenet_1: PASSED: MIGraphX meets tolerance

🔴slim-inceptionv4_1: FAILED: MIGraphX is not within tolerance - check verbose output


    :white_check_mark:dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

    :white_check_mark:agentmodel: PASSED: MIGraphX meets tolerance

    :white_check_mark:unet: PASSED: MIGraphX meets tolerance

@pfultz2
Copy link
Collaborator

pfultz2 commented Sep 21, 2023

Why do we need this for MIOpen and not for rocblas or hip?

Why doesn't setting the PATH work to find MIOpen correctly?

Also, MIOpen uses extra files(like perfdb, finddb, kerneldb, etc) which wont be used if we copy it our build directory.

@apwojcik
Copy link
Collaborator

apwojcik commented Sep 25, 2023

Why do we need this for MIOpen and not for rocblas or hip?

Why doesn't setting the PATH work to find MIOpen correctly?

Also, MIOpen uses extra files(like perfdb, finddb, kerneldb, etc) which wont be used if we copy it our build directory.

UIF is a single team porting MIOpen and MIGraphX (and dependencies) to Windows. Therefore, there is no specific revision (commit) of MIOpen we base development of MIGraphX.
We need the change from this PR for the seamless debugging experience of MIGraphiX and MIOpen from the build directory of MIGraphX with the help of all sorts of debugging tools available.
PATH will not work because there are several builds of MIOpen we want to test against MIGraphX without vast reconfiguration of the whole environment.
It is enough to copy only the MIOpen library to let us run MIGraphX tests. If copying DB files becomes required, we will do it too. It will work on Windows because the location of DB files is not fixed like it is on Linux.

@TedThemistokleous
Copy link
Collaborator

Why do we need this for MIOpen and not for rocblas or hip?
Why doesn't setting the PATH work to find MIOpen correctly?
Also, MIOpen uses extra files(like perfdb, finddb, kerneldb, etc) which wont be used if we copy it our build directory.

UIF is a single team porting MIOpen and MIGraphX (and dependencies) to Windows. Therefore, there is no specific revision (commit) of MIOpen we base development of MIGraphX. We need the change from this PR for the seamless debugging experience of MIGraphiX and MIOpen from the build directory of MIGraphX with the help of all sorts of debugging tools available. PATH will not work because there are several builds of MIOpen we want to test against MIGraphX without vast reconfiguration of the whole environment. It is enough to copy only the MIOpen library to let us run MIGraphX tests. If copying DB files becomes required, we will do it too. It will work on Windows because the location of DB files is not fixed like it is on Linux.

I think this makes sense, since it looks like you've also contained your changes under IF(WIN32) block and shouldn't break existing builds that we use on *Nix right now

Copy link
Collaborator

@pfultz2 pfultz2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PATH will not work because there are several builds of MIOpen we want to test against MIGraphX without vast reconfiguration of the whole environment.

I dont understand why you need a vast reconfiguration. You just need to run with the new path:

cmake -E env PATH=$MIOPEN_PATH;$PATH cmake --build . --config Debug --target check

You could also have a script you can run to copy MIOpen over from the version you want and then you dont need to reconfigure anything or run the build from the command-line.

Rebuilding migraphx to use a different MIOpen seems to require more reconfiguration.

This also seems like a hack very specific to your workflow and is not something we want external users to use when we build for windows, especially since MIOpen will not be performant due to missing DB files and kernel caches.

I dont think this should be merged in.

@apwojcik
Copy link
Collaborator

We are almost not using the command line in favor of IDEs for development. However, that requires support in CMakes to facilitate work with IDEs. The CMake team knows that and offers tooling to help developers work with IDEs. In this PR, we used one of the tools provided by the CMake team to make working with Visual Studio much easier.

You could also have a script you can run to copy MIOpen over from the version you want, and then you don't need to reconfigure anything or run the build from the command-line.

That will not work when we generate Visual Studio project files. It has to be part of CMakeLists.txt to correctly generate post-build steps for the project. Access to MIOpen is required to use the built-in debugger from Visual Studio IDE. So, doing this via script will make it even more complicated. Ultimately, we must add the call to the script as a post-build step. It is much better to do it in CMake natively.

@pfultz2
Copy link
Collaborator

pfultz2 commented Sep 25, 2023

It is not standard practice to copy dependencies artifcats into the build directory. Plus, messing with MIOpen's installation files can bring lots of problems and tightly couples the migraphx and miopen version. So we dont want to have that in our cmake build scripts. It should be done outside of the CMakeLists.txt.

We are almost not using the command line in favor of IDEs for development.

This is really a false dichotomy. You should use both to get the best programmer productivity.

That will not work when we generate Visual Studio project files.

Why?

It has to be part of CMakeLists.txt to correctly generate post-build steps for the project.

I dont see why this should be a post-build step. You just need to run the copy before you run the test executable. No need for any rebuilds whatsoever.

Access to MIOpen is required to use the built-in debugger from Visual Studio IDE.

Sure thats why you have a script to copy it or set the PATH variable to find it.

So, doing this via script will make it even more complicated.

I dont see how calling a script before launching the debugger is complicated

Although setting the PATH in the launch.json seems much simpler to me than making copies.

@tperry-amd
Copy link
Contributor

tperry-amd commented Sep 26, 2023

It is not standard practice to copy dependencies artifcats into the build directory. Plus, messing with MIOpen's installation files can bring lots of problems and tightly couples the migraphx and miopen version. So we dont want to have that in our cmake build scripts. It should be done outside of the CMakeLists.txt.

We are trying to do standard methods to solve this problem. PyTorch provides an example. Another problem that can crop up from path manipulation is the reason PyTorch made this change in the first place related to Windows finding the wrong DLL's here. PyTorch does use MSVC instead of WIN32 which is a change I think we should consider.

This is really a false dichotomy. You should use both to get the best programmer productivity.

Agreed, but we should also not neglect IDE's and provide a good experience for them as well.

I dont see why this should be a post-build step. You just need to run the copy before you run the test executable. No need for any rebuilds whatsoever.

Sure thats why you have a script to copy it or set the PATH variable to find it.

I dont see how calling a script before launching the debugger is complicated

Although setting the PATH in the launch.json seems much simpler to me than making copies.

It's a little hacky when you need to write custom scripts or do manual steps just to get debugging working. Ideally, we have a solution that works great with command line and IDE's that is industry standard for how to solve this problem on Windows. Modifying the launch.json may help this issue and we can definitely get it to work but it is not a great experience for easier developer workflows on Windows similar to what PyTorch provided.

Hopefully I made a convincing argument. If it is still a problem, we will pull the change and create a Python script or something to do the build/copy in a single step.

@tperry-amd
Copy link
Contributor

If it is still a problem, we will pull the change and create a Python script or something to do the build/copy in a single step.

I should clarify that we won't add anything to the repo for this. We'll just have manual steps, launch.json, environment variables, or scripts to solve the developer problem locally.

@pfultz2
Copy link
Collaborator

pfultz2 commented Sep 26, 2023

PyTorch provides an example.

Thats not an example of good practice in cmake: modifying CMAKE_CXX_FLAGS, using variables instead of imported targets, etc. I also do not see pytorch or any other rocm component copying external dependencies into the build directory.

MIOpen defines its installation with cmake --build . --target install(or through its generated packages). Writing our own custom install(via post build copies), breaks the encapsulation and is not a supported way of installing miopen. Any changes to miopen's installation(ie new files installed or removed) requires updates to migraphx(which miopen will not do for us since this is an unsupported way of installing miopen). This leads to migraphx and miopen version being tightly coupled, but we want the cmake to work with any future versions of miopen. It also requires extra maintenance to support newer versions of miopen.

It's a little hacky when you need to write custom scripts or do manual steps just to get debugging working.

The ability to modify configurations like launch.json or write custom scripts is not a sign of a "hacky" approach. It's a testament to the flexibility and extensibility of many modern development environments.

Ideally, we have a solution that works great with command line and IDE's that is industry standard for how to solve this problem on Windows.

Tightly coupling the versions between miopen and migraphx is not a great solution.

@tperry-amd
Copy link
Contributor

Understood, we'll pull the change and stick with developers configuring their environment.

@apwojcik apwojcik closed this Sep 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Continous Integration Pull request updates parts of continous integration pipeline roadmap Tasks to finish for a release Windows Related changes for Windows Environments
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants