Fix backends in flash_attention and gemm #58

Closed. xuzhao9 wants to merge 16 commits.

xuzhao9 (Contributor) commented on Nov 19, 2024:

To run the PT2 CUTLASS backend, we have to add a cutlass submodule pinned to the same version that PyTorch uses: https://github.com/pytorch/pytorch/tree/main/third_party

PyTorch's pinned version points to
https://github.com/NVIDIA/cutlass/tree/bbe579a9e3beb6ea6626d9227ec32d0dae119a49, which is 9 months old.
The cutlass version that FBGEMM uses is much newer.

Test plan:

$ python run.py --op gemm --mode fwd --only pt2_cutlass_matmul --num-inputs 1
      (M, N, K)    pt2_cutlass_matmul-speedup    pt2_cutlass_matmul-tflops    pt2_cutlass_matmul-gbps
---------------  ----------------------------  ---------------------------  -------------------------
(256, 256, 256)                                                    3.51871                    41.2349
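
As a sanity check on the output above, here is a minimal sketch of how the tflops and gbps columns can be related for a single shape. It assumes the usual 2*M*N*K operation count for a GEMM, 2-byte (fp16/bf16) elements, and that A, B, and C each move through memory once. None of these are stated in this PR, and the helper names `gemm_flops` and `gemm_bytes` are made up for illustration; they are not the benchmark's own functions.

```python
# Sketch: relate the tflops and gbps columns for one GEMM shape.
# Assumptions (not confirmed by the PR): 2*M*N*K floating-point operations,
# 2-byte elements, and A (MxK), B (KxN), C (MxN) each read or written once.

def gemm_flops(m, n, k):
    """Multiply-add count for C = A @ B, counted as 2 operations each."""
    return 2 * m * n * k

def gemm_bytes(m, n, k, bytes_per_element=2):
    """Bytes moved if A, B, and C each pass through memory exactly once."""
    return (m * k + k * n + m * n) * bytes_per_element

m = n = k = 256
tflops = 3.51871  # reported pt2_cutlass_matmul-tflops
gbps = 41.2349    # reported pt2_cutlass_matmul-gbps

# Latency implied by the reported TFLOPS, then the bandwidth that latency implies.
latency_s = gemm_flops(m, n, k) / (tflops * 1e12)
implied_gbps = gemm_bytes(m, n, k) / latency_s / 1e9

print(f"implied latency:   {latency_s * 1e6:.2f} us")
print(f"implied bandwidth: {implied_gbps:.2f} GB/s (reported: {gbps})")
```

Under these assumptions the implied bandwidth agrees with the reported gbps value, which suggests both columns derive from the same measured latency of roughly 9.5 microseconds.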

Fixes #17

xuzhao9 temporarily deployed to docker-s3-upload on November 20, 2024 00:01 with GitHub Actions (inactive)
xuzhao9 requested a review from FindHao on November 20, 2024 00:07
xuzhao9 changed the title from "Fix pt2-cutlass gemm backend" to "Fix backends in flash_attention and gemm" on Nov 20, 2024
xuzhao9 temporarily deployed to docker-s3-upload on November 20, 2024 00:51 with GitHub Actions (inactive)
facebook-github-bot (Contributor) commented:

@xuzhao9 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

FindHao (Member) left a review comment:

LGTM

xuzhao9 force-pushed the xz9/fix-pt2-cutlass branch from 37fc98c to f54c49d on November 20, 2024 18:58
xuzhao9 temporarily deployed to docker-s3-upload on November 20, 2024 18:58 with GitHub Actions (inactive)
facebook-github-bot (Contributor) commented:

@xuzhao9 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot (Contributor) commented:

@xuzhao9 merged this pull request in 17b38a4.

xuzhao9 deleted the xz9/fix-pt2-cutlass branch on November 21, 2024 03:39
Development

Successfully merging this pull request may close these issues.

[Installation][non-reproducible]: Op Flash Attention
3 participants