Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement]: IBM Power (ppc64le) support #29566

Open
1 task done
sumitd2 opened this issue Dec 28, 2023 · 84 comments
Open
1 task done

[Enhancement]: IBM Power (ppc64le) support #29566

sumitd2 opened this issue Dec 28, 2023 · 84 comments
Labels
kind/enhancement Issues or changes related to enhancement

Comments

@sumitd2
Copy link

sumitd2 commented Dec 28, 2023

Is there an existing issue for this?

  • I have searched the existing issues

What would you like to be added?

I belong to the IBM Power porting team. We have recently developed patches to build Milvus on ppc64le, and would like to open a PR to upstream the changes to this repo. The major changes were adjusting the versions for some conan installed dependencies, and creating a local cmake repo for conan because the conan binary package for cmake currently does not have power binaries. So we need to somehow find a way to ensure cmake supports power, and any future dependency version modifications work on all architectures. Do you have any suggestions?

Why is this needed?

There are customers who wish to use Milvus on Power.

Anything else?

ppc64le/build-scripts#3467

@sumitd2 sumitd2 added the kind/enhancement Issues or changes related to enhancement label Dec 28, 2023
@sumitd2
Copy link
Author

sumitd2 commented Dec 28, 2023

cc: @seth-priya

Copy link

stale bot commented Jan 27, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label Jan 27, 2024
@xiaofan-luan
Copy link
Collaborator

Is there an existing issue for this?

  • I have searched the existing issues

What would you like to be added?

I belong to the IBM Power porting team. We have recently developed patches to build Milvus on ppc64le, and would like to open a PR to upstream the changes to this repo. The major changes were adjusting the versions for some conan installed dependencies, and creating a local cmake repo for conan because the conan binary package for cmake currently does not have power binaries. So we need to somehow find a way to ensure cmake supports power, and any future dependency version modifications work on all architectures. Do you have any suggestions?

Why is this needed?

There are customers who wish to use Milvus on Power.

Anything else?

ppc64le/build-scripts#3467

Great work.
The only concern is how we can deploy ci/cid to make sure the Power deployment will not be broken by some new PRs.
We don't have any IBM Power machines so that might be really challenge

@stale stale bot removed the stale indicates no udpates for 30 days label Jan 27, 2024
@sumitd2
Copy link
Author

sumitd2 commented Feb 8, 2024

Hi @xiaofan-luan You can request a ppc64le node for your CI from this link:
https://osuosl.org/services/powerdev/request_hosting/ (Please put "Gerrit Huizenga" in the IBM advocate field.)

Also, I have successfully built and tested Milvus 2.3.3 using our patches on Ubuntu 22.04 and RHEL 9.3, on both x86_64 and ppc64le. You will need to create a repo for "cmake/3.28.3@milvus/dev" to host cmake binaries for both Intel and Power. I hope that's fine.

cc: @seth-priya

Copy link

stale bot commented Mar 10, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label Mar 10, 2024
@stale stale bot closed this as completed Mar 17, 2024
@xiaofan-luan
Copy link
Collaborator

keep this

@xiaofan-luan xiaofan-luan reopened this Mar 17, 2024
@stale stale bot removed the stale indicates no udpates for 30 days label Mar 17, 2024
@xiaofan-luan
Copy link
Collaborator

@sumitd2
sorry for the late reply.

I'm not very familiar with the compilation part. We can keep it as long as there is simple script that can run for milvus to build on Power series.

This seems to be a large pr so let's try to do this

  1. change knowhere to support POWER -> especially for the same functionality of SIMD
  2. make a script that can compile milvus
  3. dockerfile

@sumitd2
Copy link
Author

sumitd2 commented Apr 4, 2024

Hi @xiaofan-luan, we are working on the SIMD optimizations and the PR. Will get back to you on this.

@xiaofan-luan
Copy link
Collaborator

@alexanderguzhva
can help on this if necessary

@alexanderguzhva
Copy link
Contributor

@sumitd2 I've noticed a wave of AIX-related PRs to the Faiss baseline. I will propagate these changes to our Faiss fork in Knowhere. Please feel free to let me know if you need any assistance with Faiss/Knowhere as well.

Copy link

stale bot commented May 6, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label May 6, 2024
@sumitd2
Copy link
Author

sumitd2 commented May 7, 2024

@sumitd2 sorry for the late reply.

I'm not very familiar with the compilation part. We can keep it as long as there is simple script that can run for milvus to build on Power series.

This seems to be a large pr so let's try to do this

  1. change knowhere to support POWER -> especially for the same functionality of SIMD
  2. make a script that can compile milvus
  3. dockerfile

Hi @xiaofan-luan, couple of quick points here:

  1. Our team that was working on SIMD has abandoned the idea. They say the compiler generated code is better than handwritten.
  2. A script here will not work, as it has the chance of being broken by future PRs. This will have to be a proper port with the changes going into master, and a CI node (which we can provide on osuosl.org) to include the Power CI as a job. The conan binary package for cmake currently does not have Power binaries, so we will have to find a solution for this - many of our customers are interested in Milvus on Power

Let me know your thoughts. cc: @seth-priya

@stale stale bot removed the stale indicates no udpates for 30 days label May 7, 2024
@xiaofan-luan
Copy link
Collaborator

@alexanderguzhva could you help with the IBM team with the milvus side change?

@xiaofan-luan
Copy link
Collaborator

@locustbaby could you try to request a powerPC machine for the basic ci/cd?

@xiaofan-luan
Copy link
Collaborator

@sumitd2这么晚才回复很抱歉。
我对编译部分不是很熟悉。只要有简单的脚本可以运行 milvus 在 Power 系列上构建,我们就可以保留它。
这似乎是一个很大的公关,所以让我们尝试这样做

  1. 更改knowhere以支持POWER -> 特别是对于SIMD的相同功能
  2. 制作一个可以编译milvus的脚本
  3. docker文件

你好@xiaofan-luan,这里有几个要点:

  1. 我们致力于 SIMD 的团队已经放弃了这个想法。他们说编译器生成的代码比手写的更好。
  2. 这里的脚本不起作用,因为它有可能被未来的 PR 破坏。这必须是一个正确的端口,更改将进入主节点,并且必须是一个 CI 节点(我们可以在 osuosl.org 上提供)以将 Power CI 作为作业包含在内。 cmake 的 conan 二进制包目前没有 Power 二进制文件,因此我们必须为此找到解决方案 - 我们的许多客户对 Milvus on Power 感兴趣

让我知道你的想法。抄送:@seth-priya

@sumitd2 sorry for the late reply.
I'm not very familiar with the compilation part. We can keep it as long as there is simple script that can run for milvus to build on Power series.
This seems to be a large pr so let's try to do this

  1. change knowhere to support POWER -> especially for the same functionality of SIMD
  2. make a script that can compile milvus
  3. dockerfile

Hi @xiaofan-luan, couple of quick points here:

  1. Our team that was working on SIMD has abandoned the idea. They say the compiler generated code is better than handwritten.
  2. A script here will not work, as it has the chance of being broken by future PRs. This will have to be a proper port with the changes going into master, and a CI node (which we can provide on osuosl.org) to include the Power CI as a job. The conan binary package for cmake currently does not have Power binaries, so we will have to find a solution for this - many of our customers are interested in Milvus on Power

Let me know your thoughts. cc: @seth-priya

we also need your help to finish the power pc compile. is there a pr for this issue?

@sumitd2
Copy link
Author

sumitd2 commented May 7, 2024

@xiaofan-luan We built 2.3.3 and its rather old now. I am building 2.4.1 on RHEL 9.3 since the afternoon and on the face of it, it looks like it requires minimal changes.

@xiaofan-luan
Copy link
Collaborator

sure @locustbaby can help you to do the setup

@locustbaby
Copy link
Contributor

@sumitd2 I've submitted a ticket#33429 for a powerpc

@sumitd2
Copy link
Author

sumitd2 commented May 10, 2024

@sumitd2 I've submitted a ticket#33429 for a powerpc

@locustbaby We have received your request, and it has been approved. Now its a matter of the administrator creating the VM, shouldn't be long.

@sumitd2
Copy link
Author

sumitd2 commented May 16, 2024

@locustbaby OSL has created the VM but they are waiting on a GPG key from you to provide access:

Your Openstack account is set up as well. Please send us your public GPG key,
in order to allow us to securely provide you the login details.
Checking back to see if you were able to log in and send us a gpg key.

@locustbaby
Copy link
Contributor

@sumitd2 I'm not very familiar with OSL, where should I submit the GPG key, I didn't receive any email about this.
And should I add the GPG key to our project? It would be great if you could describe the steps to follow!

@alexanderguzhva
Copy link
Contributor

alexanderguzhva commented Aug 7, 2024

@sumitd2 zilliztech/knowhere#754
Please let me know if it works. I will ask guys to stamp if it does.

@sumitd2
Copy link
Author

sumitd2 commented Aug 7, 2024

@alexanderguzhva it worked, thank you!

There is another error during the build (make). Not sure if you can fix it but posting it:

Building Milvus ...
# github.com/bytedance/sonic/internal/rt
/root/go/pkg/mod/github.com/bytedance/[email protected]/internal/rt/gcwb.go:102:6: missing function body
/root/go/pkg/mod/github.com/bytedance/[email protected]/internal/rt/gcwb.go:104:6: missing function body
/root/go/pkg/mod/github.com/bytedance/[email protected]/internal/rt/gcwb.go:114:6: missing function body
make: *** [Makefile:82: milvus] Error 1

@xiaofan-luan

@alexanderguzhva
Copy link
Contributor

@sumitd2 yep, I believe that this is out of my immediate knowledge, sorry about that

Copy link

stale bot commented Sep 7, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label Sep 7, 2024
@xiaofan-luan
Copy link
Collaborator

@alexanderguzhva it worked, thank you!

There is another error during the build (make). Not sure if you can fix it but posting it:

Building Milvus ...
# github.com/bytedance/sonic/internal/rt
/root/go/pkg/mod/github.com/bytedance/[email protected]/internal/rt/gcwb.go:102:6: missing function body
/root/go/pkg/mod/github.com/bytedance/[email protected]/internal/rt/gcwb.go:104:6: missing function body
/root/go/pkg/mod/github.com/bytedance/[email protected]/internal/rt/gcwb.go:114:6: missing function body
make: *** [Makefile:82: milvus] Error 1

@xiaofan-luan

@haorenfsa see that there is a problem for sonic on ppc arch. guess it dosn't support ppc as well. can we compile it as an optional?

@stale stale bot removed the stale indicates no udpates for 30 days label Sep 9, 2024
@sumitd2
Copy link
Author

sumitd2 commented Sep 10, 2024

@alexanderguzhva it worked, thank you!
There is another error during the build (make). Not sure if you can fix it but posting it:

Building Milvus ...
# github.com/bytedance/sonic/internal/rt
/root/go/pkg/mod/github.com/bytedance/[email protected]/internal/rt/gcwb.go:102:6: missing function body
/root/go/pkg/mod/github.com/bytedance/[email protected]/internal/rt/gcwb.go:104:6: missing function body
/root/go/pkg/mod/github.com/bytedance/[email protected]/internal/rt/gcwb.go:114:6: missing function body
make: *** [Makefile:82: milvus] Error 1

@xiaofan-luan

@haorenfsa see that there is a problem for sonic on ppc arch. guess it dosn't support ppc as well. can we compile it as an optional?

@xiaofan-luan I am currently working on building master, and have already fixed this error by creating a sonic_ppc64le.go which uses encoding/json.

@haorenfsa
Copy link
Contributor

/assign

@haorenfsa
Copy link
Contributor

@sumitd2 to save the work, I think we could just disable sonic json for ppc64le.

@haorenfsa
Copy link
Contributor

@sumitd2 I just checked the code, and it's already optional. So we can just use make MILVUS_GO_BUILD_TAGS=dynamic to disable sonic.
image

@sumitd2
Copy link
Author

sumitd2 commented Sep 10, 2024

@sumitd2 I just checked the code, and it's already optional. So we can just use make MILVUS_GO_BUILD_TAGS=dynamic to disable sonic.

@haorenfsa That will disable it for every architecture, so why would you want to do that? I have already done the work, just needs a PR.

@haorenfsa
Copy link
Contributor

haorenfsa commented Sep 10, 2024

@sumitd2 I just checked the code, and it's already optional. So we can just use make MILVUS_GO_BUILD_TAGS=dynamic to disable sonic.

@haorenfsa That will disable it for every architecture, so why would you want to do that? I have already done the work, just needs a PR.

Um, I think you may have misunderstood. What I meant is that you can turn it off when compiling on powerPC. We still keep it on by default for other architectures.

@haorenfsa
Copy link
Contributor

And of course it's great that you implement it for ppc. Your patch is very welcomed.

But for other not covered architectures, like mips or riscv maybe, if one also want to build milvus on that, one can use the command I mentioned above. There won't be any difference as long as one use GRPC sdk not RESTful API.

@sumitd2
Copy link
Author

sumitd2 commented Sep 11, 2024

@xiaofan-luan
I have the latest master building on Power, and the C++ tests are passing (except for the Azure ones). This feels like the right time to set up your CI. Unfortunately GHA runner cannot be installed on the OSU VM we have provided you because it does not support Power. So the only option I see is Buildkite, which is free for open source projects. It integrates well with GHA so you will be able to see the Power job in the list on the PR page. Let me know if you are fine with it.

@locustbaby
I need access to the OSU VM that has been assigned to you, to be able to setup the CI agent for the Power builds.

cc: @seth-priya

@xiaofan-luan
Copy link
Collaborator

@chuanfeng

could you helo to invesigate on that? Let's merge this pr asap

Copy link

stale bot commented Nov 9, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label Nov 9, 2024
@xiaofan-luan
Copy link
Collaborator

@sumitd2 @locustbaby
what's the progress of this issue?

@haorenfsa haorenfsa removed their assignment Nov 12, 2024
@stale stale bot removed the stale indicates no udpates for 30 days label Nov 12, 2024
@sumitd2
Copy link
Author

sumitd2 commented Dec 2, 2024

@locustbaby I need access to the ppc64le OSU server you were assigned. Thanks.

@locustbaby
Copy link
Contributor

locustbaby commented Dec 3, 2024 via email

@sumitd2
Copy link
Author

sumitd2 commented Dec 3, 2024

yes, I need the IP and the credentials. You can send it on sumit.dubey2 at ibm dot com

@locustbaby
Copy link
Contributor

@sumitd2 Just sent it to you, contact me if any issue.

@sumitd2
Copy link
Author

sumitd2 commented Dec 12, 2024

Hi @alexanderguzhva , I am in the process of setting up a ci job for ppc64le which would run on an ubuntu 22.04 agent, and getting the following errors while building knowhere:

  1. <openblas/cblas.h> not found
    Add a special OpenBLAS header handling for PowerPC zilliztech/knowhere#754
    Just <cblas.h> is needed on ubuntu. <openblas/cblas.h> is required on RHEL.

xmmintrin.h
#error "Please read comment above. Use -DNO_WARN_X86_INTRINSICS to disable this error."

This header is distributed to simplify porting x86_64 code that
   makes explicit use of Intel intrinsics to powerpc64le.
   It is the user's responsibility to determine if the results are
   acceptable and make additional changes as necessary.
   Note that much code that uses Intel intrinsics can be rewritten in
   standard C or GNU C extensions, which are more portable and better
   optimized across multiple targets.
   ...

This will have to be defined to silence this error.

Can you please help fix them? Thanks.

@alexanderguzhva
Copy link
Contributor

@sumitd2 I got a bit confused.

  1. Do I get it correct that the include file must take a linux distro into account? Some construct that has the following result:
#ifdef USE_RPM
#include <openblas/cblas.h>
#elif USE_DEB
#include <cblas.h>
#else
static_assert("?!!");
#endif
  1. It is needed to add NO_WARN_X86_INTRINSICS to cmake flags for powerpc, right?

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement Issues or changes related to enhancement
Projects
None yet
Development

No branches or pull requests

7 participants