
Specify folder for git clone in compile-pytorch-ipex.sh; update IPEX pin #2763

Merged
merged 2 commits into main from amyachev/issue2651 on Nov 20, 2024

Conversation

anmyachev
Contributor

Closes #2651

@@ -106,8 +106,7 @@ if [[ $BUILD_PYTORCH = true ]]; then
   rm -rf $PYTORCH_PROJ

   echo "**** Cloning $PYTORCH_PROJ ****"
   cd $BASE
-  git clone --single-branch -b dev/triton-test-3.0 --recurse-submodules --jobs 8 https://github.com/Stonepia/pytorch.git
+  git clone --single-branch -b dev/triton-test-3.0 --recurse-submodules --jobs 8 https://github.com/Stonepia/pytorch.git $PYTORCH_PROJ
anmyachev
Contributor Author

By default, the clone target directory was pytorch.
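
To make the effect of this change concrete, here is a minimal sketch; the values assigned to BASE and PYTORCH_PROJ below are illustrative assumptions, not taken from compile-pytorch-ipex.sh.

# Illustrative values; in the real script BASE and PYTORCH_PROJ are set elsewhere.
BASE=$HOME/triton-build
PYTORCH_PROJ=$BASE/pytorch-stonepia

mkdir -p "$BASE"
cd "$BASE"

# Without an explicit target, git derives the directory from the repository name,
# so the sources land in $BASE/pytorch no matter what $PYTORCH_PROJ is set to.
# Passing the target explicitly puts the checkout where the rest of the script
# (e.g. the earlier `rm -rf $PYTORCH_PROJ`) expects it:
git clone --single-branch -b dev/triton-test-3.0 --recurse-submodules --jobs 8 \
    https://github.com/Stonepia/pytorch.git "$PYTORCH_PROJ"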

@anmyachev anmyachev marked this pull request as ready for review November 19, 2024 21:38
@whitneywhtsang
Contributor

Closes #2651

The change LGTM, but it doesn't resolve the problem reported by #2651.

Signed-off-by: Anatoly Myachev <[email protected]>
@anmyachev
Contributor Author

@Stonepia just to clarify, did you force-push to the dev/triton-test-3.0 branch? intel/intel-extension-for-pytorch@15ef7db looks similar to the previous commit that we used.

@anmyachev anmyachev changed the title from "Specify folder for git clone in compile-pytorch-ipex.sh" to "Specify folder for git clone in compile-pytorch-ipex.sh; update IPEX pin" on Nov 19, 2024
@anmyachev
Contributor Author

@whitneywhtsang could you try with the latest change?

@whitneywhtsang
Contributor

@whitneywhtsang could you try with the latest change?

It gets past the reported problem. It has been at the step [ 74%] Built target dnnl_gpu_ocl for a while; I will reply again when it works e2e.

@whitneywhtsang
Contributor

[ 74%] Built target dnnl_gpu_jit
make: *** [Makefile:136: all] Error 2
Traceback (most recent call last):
  File "/home/jovyan/intel-xpu-backend-for-triton/.scripts_cache/intel-extension-for-pytorch/setup.py", line 1168, in <module>
    setup(
  File "/home/jovyan/.conda/envs/python-3.12/lib/python3.12/site-packages/setuptools/__init__.py", line 117, in setup
    return distutils.core.setup(**attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@Stonepia
Contributor

Oh, sorry about that. Yes, we force-pushed to the dev/triton-test-3.0 branch because there was a token leak; that was our fault. We will fix it.

@Stonepia
Contributor

Hi @anmyachev , I really apologize for not informing you about the force push. I thought Triton had already switched to the stock PT.

Regarding this build problem specifically, I have checked the code changes and there doesn't seem to be anything that would cause the break; I don't see anything that we missed.

Most likely, with that version of IPEX, the CPU build breaks the XPU build of IPEX. It might also be related to the oneAPI version.

Could you try the following? I can build on my machine when the CPU part of IPEX is disabled.

  1. What oneAPI version are you using? The dev bundle 0.5.3 works on my machine.
  2. Could you try setting the following flags before building?
# I think this flag should be fine; we don't need the CPU build.
export BUILD_WITH_CPU=OFF
# More flags to disable other components. You can turn these off;
# I don't think they affect Triton development.
export USE_XETLA=OFF
export USE_PTI=OFF
export USE_KINETO=OFF

Again, sorry for the issue. If there is anything I can help with, please let me know.
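
As a usage sketch only — the flags come from the suggestion above, but the checkout path and the script location below are assumptions, not details confirmed in this thread:

# Assumed workflow: disable the CPU part of IPEX and the optional components,
# then rerun the build script from the Triton checkout.
cd ~/intel-xpu-backend-for-triton   # illustrative checkout path

export BUILD_WITH_CPU=OFF   # skip the CPU build of IPEX
export USE_XETLA=OFF        # optional components; should not matter for Triton dev
export USE_PTI=OFF
export USE_KINETO=OFF

bash scripts/compile-pytorch-ipex.sh   # script location is an assumption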

@Stonepia
Contributor

I didn't try building with an older version of oneAPI (like 2023.2). I think the oneAPI version might explain why the build passed before and fails now. From the code side, I checked and there should not be any missing parts.

@anmyachev
Contributor Author

Hi @Stonepia,

thanks for confirming! Don't worry, it's not that bad. I was able to build the code with PTDB, even with Python 3.12.

[ 74%] Built target dnnl_gpu_jit
make: *** [Makefile:136: all] Error 2
Traceback (most recent call last):
  File "/home/jovyan/intel-xpu-backend-for-triton/.scripts_cache/intel-extension-for-pytorch/setup.py", line 1168, in <module>
    setup(
  File "/home/jovyan/.conda/envs/python-3.12/lib/python3.12/site-packages/setuptools/__init__.py", line 117, in setup
    return distutils.core.setup(**attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@whitneywhtsang I can't reproduce that problem. If it's still there, I guess we can continue in a separate thread.

@anmyachev anmyachev merged commit 80fd6c0 into main Nov 20, 2024
4 checks passed
@anmyachev anmyachev deleted the amyachev/issue2651 branch November 20, 2024 10:38
@whitneywhtsang
Contributor

@whitneywhtsang I can't reproduce that problem. If it's still there, I guess we can continue in a separate thread.

Sure, my environment is Agama 1032.19, DLE 2025.0.0.

@anmyachev
Contributor Author

@whitneywhtsang I can't reproduce that problem. If it's still there, I guess we can continue in a separate thread.

Sure, my environment is Agama 1032.19, DLE 2025.0.0.

Well, this is somewhat expected; our IPEX branch isn't meant to work with the 2025 compiler. Or am I missing something?

@whitneywhtsang
Contributor

Well, this is somewhat expected; our IPEX branch isn't meant to work with the 2025 compiler. Or am I missing something?

Makes sense. Verified that it works for me with PTDB.


Successfully merging this pull request may close these issues.

Cannot build pytorch with IPEX using compile-pytorch-ipex.sh
3 participants