Enable bwd for flash_attention #74
Conversation
Can you please share a screenshot of the output or confirm it ran as expected?
@adamomainz Sorry, it is not ready yet and I am still debugging - I will request review again when it is ready.
@xuzhao9 No problem at all, happy to review again once ready!
@xuzhao9 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Some backends (e.g., flash_attention, xformers_splitk) don't have a backward pass. For those backends, add the `fwd_only=True` flag, and we will skip the backward pass automatically. If the user specifies `--only xformers_splitk --bwd`, we will still run the backend's backward pass since it is user-specified. Otherwise, we will always skip it.

Test plan:
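The skip logic described above can be sketched roughly as follows. This is a minimal illustration, not the actual tritonbench implementation: the `FWD_ONLY_BACKENDS` set, the `should_run_bwd` helper, and the argument names are assumptions made for this sketch; only the `--only`/`--bwd` flags and the `fwd_only=True` idea come from the PR description.

```python
import argparse

# Hypothetical set of backends marked with fwd_only=True; the real code
# attaches the flag to each backend's registration rather than using a set.
FWD_ONLY_BACKENDS = {"flash_attention", "xformers_splitk"}

def should_run_bwd(backend: str, args: argparse.Namespace) -> bool:
    """Decide whether to run the backward pass for a backend (sketch)."""
    if not args.bwd:
        return False
    if backend not in FWD_ONLY_BACKENDS:
        return True
    # fwd-only backend: run bwd only when the user explicitly selected it
    # via --only; otherwise it is skipped automatically.
    return args.only is not None and backend in args.only

parser = argparse.ArgumentParser()
parser.add_argument("--only", nargs="*", default=None)
parser.add_argument("--bwd", action="store_true")

# Mirrors the example from the description: --only xformers_splitk --bwd
args = parser.parse_args(["--only", "xformers_splitk", "--bwd"])
print(should_run_bwd("xformers_splitk", args))  # user-specified: bwd runs
print(should_run_bwd("flash_attention", args))  # fwd-only, not selected: skipped
```

The point of the design is that `fwd_only=True` is a default, not a hard block: an explicit `--only <backend> --bwd` from the user overrides the automatic skip.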