End-to-end code generation example #610
I've converted this to draft for now until we resolve getting access to hardware.
@kurapov-peter which versions of LLVM is this known to work with?
This one is for LLVM 10 through 14 (I used 14 to test it). I can update it to support higher versions if you like.
Ah okay, that probably explains why LLVM 16 wasn't working for me locally. I feel like the CMake for finding the necessary packages will need to be more forgiving before we merge this; maybe disable building the example if LLVM/SPIRV-LLVM-Translator aren't found instead of raising an error. No rush though, I was mainly looking at this locally because we've recently introduced some changes to how UR is initialized and wanted to see if things still work with some modifications.
@kbenzie, sure, will do.
ecf3c71 to 1a723c2
@kbenzie, I tested with llvm-16, should work now.
Done. I had to add extra parenthesis for the |
Thanks. Yeah, that's very odd - CMake 🤷 I've been testing this rebased on a more recent commit on the adapters branch; this requires some changes to the loader/platform initialization, which I'm happy to contribute. I am however seeing a SEGFAULT in the exit handlers which I'm still debugging, so I will hold off until I've figured out what's going on there.
I reinstalled my Level Zero driver and this went away. I've been successfully running the example in Debug mode but get SEGFAULTs in Release or RelWithDebInfo modes. I've created kurapov-peter#1 to update the example code with the adapter handles.
Sounds good! I wasn't able to rebase without conflicts, so I will need to resolve them first. Should I just merge the patch, or do I need to rebase the current branch on top of the latest adapters branch first?
I figured these out. There's no dependency on building the adapters for the examples, so a full build is required; after that the SEGFAULTs went away.
We have some guidance on dealing with merge conflicts in the Forks and Pull Requests section of the contribution guide. Hopefully that will help with resolving those.
Whatever works for you. If you manage to rebase on the adapters branch, it's probably easiest to cherry-pick the top commit from my patch branch and ignore the merge in from the adapters branch in my PR.
I'm going on holiday now, so I've assigned this to @veselypeta to work on getting this merged with you, @kurapov-peter.
1a723c2 to 80f6eb3
@veselypeta, I've rebased to the latest |
LGTM!
I added a few minor comments. I don't want to put more work on you, so I'm OK with leaving these as todo for later.
@@ -0,0 +1,37 @@
name: examples
`scripts` is where we keep the specification and associated tooling. The `third_party` directory might be a better place for this (and a short README.md about how to use this would be very useful).
auto adapters = get_adapters();
auto platforms = get_platforms(adapters);
auto gpus = get_gpus(platforms.front());
Would it make sense to filter out any platforms that aren't level-zero (or platforms that don't support spir64)? How would this behave on CUDA/HIP?
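A minimal sketch of the filtering suggested here, not the example's actual code. `Backend`, `PlatformInfo`, and `filter_spirv_platforms` are hypothetical stand-ins introduced for illustration; the real `get_platforms()` helper returns Unified Runtime platform handles, and how the backend or SPIR-V support is queried from the runtime is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real Unified Runtime platform handles.
enum class Backend { LevelZero, Cuda, Hip, OpenCL };

struct PlatformInfo {
    std::string name;
    Backend backend;
    bool supports_spirv; // assumption: queried from the runtime up front
};

// Keep only the platforms that can consume the SPIR-V module the example
// generates, instead of blindly taking platforms.front().
inline std::vector<PlatformInfo>
filter_spirv_platforms(const std::vector<PlatformInfo> &platforms) {
    std::vector<PlatformInfo> result;
    std::copy_if(platforms.begin(), platforms.end(),
                 std::back_inserter(result), [](const PlatformInfo &p) {
                     return p.backend == Backend::LevelZero &&
                            p.supports_spirv;
                 });
    return result;
}
```

With a filter like this, a machine that only exposes CUDA or HIP platforms would yield an empty list, which the example could report as "no suitable platform" rather than failing later at program creation.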
ur_check(urQueueFinish(queue));

for (int i = 0; i < a_size; ++i) {
This prints out the result but never prints out the initial array. We should either check for the expected value programmatically or be consistent with how the data is printed.
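One way the programmatic check could look, as a sketch only. It assumes the kernel's contract is the one stated in the PR description (each output element equals the matching input element plus 1); `first_mismatch` is a hypothetical helper, and the use of `std::vector<int>` is an assumption about how the host-side data is held.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the first element where out[i] != in[i] + 1,
// or -1 when the whole output matches the expected "+1" result.
inline std::ptrdiff_t first_mismatch(const std::vector<int> &in,
                                     const std::vector<int> &out) {
    assert(in.size() == out.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (out[i] != in[i] + 1) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;
}
```

The example could call this after `urQueueFinish` and return a non-zero exit code on a mismatch, which would also make it usable as a test.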
    return devices;
}

template <typename T, size_t N> struct alignas(4096) AlignedArray {
Is this for page alignment? `getpagesize()`?
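For context on this question: `alignas` needs a compile-time constant, so a runtime value such as `getpagesize()` cannot be used there directly. A sketch of how the fixed 4096 could be checked against the real page size at runtime follows; the `data` member is an assumption (the struct body is not shown in the diff), `is_page_aligned` is a hypothetical helper, and the code is POSIX-only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unistd.h>

// Same declaration as in the diff; the member below is an assumption.
template <typename T, std::size_t N> struct alignas(4096) AlignedArray {
    T data[N];
};

// Checks a pointer against the page size reported by the OS at runtime.
// On systems with pages larger than 4096 bytes, a 4096-aligned object
// is not guaranteed to pass this check.
inline bool is_page_aligned(const void *ptr) {
    const long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    return reinterpret_cast<std::uintptr_t>(ptr) %
               static_cast<std::uintptr_t>(page) == 0;
}
```

If page alignment really is the intent, a check like this at startup would catch platforms whose page size differs from 4096.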
@@ -6,6 +6,9 @@
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/include)

add_subdirectory(hello_world)
if(UR_BUILD_EXAMPLE_CODEGEN)
    add_subdirectory(codegen)
Now that we have GPU-equipped runners, it should be possible to create a workflow where this example is built and run as part of tests. Otherwise, it will be easy to accidentally break it.
I've added #859 to track this.
I was able to come up with an end-to-end pipeline of code generation and execution of the kind that typically occurs in a JIT engine (our use case is analytical query compilation in HDK). It creates a simple function (adding 1 to each element of an array and storing the result in a second one) via the LLVM API, converts it to SPIR-V, and submits it to the runtime.
The example is a bit complex in terms of dependencies: it uses the LLVM API, the SPIR-V translator library, and the Intel GPU stack for execution. I found managing those in conda was the easiest way (I'm a bit biased though :) ). Nevertheless, I'm attaching the exact environment in scripts/deps.yml for reproduction simplicity (it doesn't include the toolchain, since any dev env for UR would already have it).
Use as:
Configure with:
Ideally, we would want to have the same flow for all devices (including NVIDIA, for which we would also convert to SPIR-V if needed).
Corresponding L0-specific examples can be found here.