
Bertsquad fails assertion during lowering #2487

Closed
umangyadav opened this issue Nov 29, 2023 · 7 comments

umangyadav (Member) commented Nov 29, 2023

Steps to reproduce:

Build the develop branch of MIGraphX with a debug build.

Run bertsquad-12.onnx using the following command:

./bin//migraphx-driver perf bertsquad-12.onnx --fp16 --fill1 input_ids:0 --fill1 input_mask:0 --fill1 segment_ids:0 --fill1 unique_ids_raw_output___9:0
Compiling ...
Reading: bertsquad-12.onnx
migraphx-driver: /home/umayadav/repo/AMDMIGraphX/src/targets/gpu/lowering.cpp:76: void migraphx::gpu::miopen_apply::check_shape(shape, instruction_ref): Assertion `x == i->get_shape()' failed.
Aborted (core dumped)

The reason is that the reshape copy operator is not producing a standard-shaped output, and during lowering it gets replaced with `contiguous + reshape_lazy + contiguous`, which makes the output standard-shaped.

The following is the input reshape:

reshape[dims={1, 1, 256, 256}](multibroadcast[out_lens={1, 256, 256},out_dyn_dims={}]) -> float_type, {1, 1, 256, 256}, {256, 0, 0, 1}
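For context, a "standard" shape here means packed, row-major strides: each dimension's stride is the product of the dimensions to its right. A minimal illustrative sketch (not MIGraphX code) shows what the strides for `{1, 1, 256, 256}` should be, versus the `{256, 0, 0, 1}` actually reported above, where the zero strides are inherited from the multibroadcast:

```python
def standard_strides(dims):
    """Row-major (C-order) strides for a packed shape: the stride of
    each dimension is the product of all dimensions to its right."""
    strides = [1] * len(dims)
    for i in range(len(dims) - 2, -1, -1):
        strides[i] = strides[i + 1] * dims[i + 1]
    return strides

dims = [1, 1, 256, 256]
print(standard_strides(dims))  # [65536, 65536, 256, 1]
# The reshape above instead reports strides [256, 0, 0, 1]: the zero
# strides mean the output is a broadcast view, not a standard shape.
```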

Lowering replaces it with the following:

gpu::contiguous(reshape_lazy[dims={1, 1, 256, 256}], allocate[shape=float_type, {1, 1, 256, 256}, {65536, 65536, 256, 1},buf_type=nullopt]) -> float_type, {1, 1, 256, 256}, {65536, 65536, 256, 1}

and then it fails this assertion:
https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/blob/44463b94989bfe3f3849ed29629576abd53a9976/src/targets/gpu/lowering.cpp#L76
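In effect, `check_shape` compares the shape the instruction had before lowering against the shape of the replacement instruction. A simplified model of that comparison (illustrative only, using the two shapes printed above) shows why the assertion fires: the lens match but the strides do not, so the shapes compare unequal:

```python
# Hypothetical simplification of the check in check_shape: the shape
# recorded before lowering must equal the replacement's output shape.
original = {"lens": (1, 1, 256, 256), "strides": (256, 0, 0, 1)}
replacement = {"lens": (1, 1, 256, 256), "strides": (65536, 65536, 256, 1)}

print(original["lens"] == replacement["lens"])        # True
print(original["strides"] == replacement["strides"])  # False
# Same lens, different strides -> shape equality fails, mirroring
# `Assertion x == i->get_shape() failed` in lowering.cpp.
```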

pfultz2 (Collaborator) commented Nov 29, 2023

Reshape copy should just produce a standard shape.

umangyadav (Member, Author):
It started breaking after this PR: #2319

bpickrel (Contributor):
> It started breaking after this PR: #2319

@causten I'd like to revert to the `remove_contiguous_from_passes` branch from PR 2319 for debugging, but it's no longer on GitHub. Do you have a way to recover it?

bpickrel (Contributor) commented Dec 1, 2023

> It started breaking after this PR: #2319

@umangyadav can you point me to a working commit? I'm finding that with commits from before PR 2319, loading either of the models I'm testing fails, but with different error messages. I wonder if this change just masks a pre-existing error with different assertions.

umangyadav (Member, Author):
When I tested locally, it worked up until the commit before #2319. You can try going further back in time and see if it works for you (it should).

pfultz2 (Collaborator) commented Dec 1, 2023

The problem is not #2319. The reshape operator is not implemented correctly. Not only is it not giving correct output, but it's also completely broken when collapsing across non-packed dimensions.
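The point about collapsing across non-packed dimensions can be illustrated with a numpy analogy (not MIGraphX code): a broadcast view has zero strides, and collapsing such dimensions cannot be expressed as a stride-only view, so reshaping it forces a data copy:

```python
import numpy as np

# Broadcast a (256,) row to (256, 256): the broadcast axis has stride 0.
row = np.arange(256, dtype=np.float32)
bcast = np.broadcast_to(row, (256, 256))
print(bcast.strides)  # (0, 4)

# Collapsing both axes into one cannot be a stride-only view: no single
# stride steps through 65536 elements of a 256-element buffer, so numpy
# falls back to copying the data.
flat = bcast.reshape(-1)
print(np.may_share_memory(flat, bcast))  # False: a copy was made
```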

pfultz2 (Collaborator) commented Dec 2, 2023

Although, #2038 should fix the assertion.

causten closed this as completed on May 29, 2024