
Report debugging information to mitigate memory overrun #235

Open
fedorov opened this issue Sep 26, 2024 · 15 comments
Labels: enhancement (New feature or request)

fedorov (Member) commented Sep 26, 2024

Large ANN objects can exhaust browser RAM and, at least in some situations, lock up the computer entirely.

To help debug this, we discussed implementing the following logging in the developer console:

  • size of the data that would be downloaded if the annotation is toggled
  • memory consumed by, and available to, the browser or the current tab
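A minimal sketch of the second item, assuming Chromium's non-standard `performance.memory` API (it is not available in Firefox or Safari, which is part of the inconsistency discussed below). `formatMiB` is a hypothetical helper for readable output.

```javascript
// Convert a byte count into a human-readable MiB string.
function formatMiB(bytes) {
  return (bytes / (1024 * 1024)).toFixed(1) + ' MiB';
}

// Log current tab heap usage, if the browser exposes it.
function logTabMemory() {
  const mem = typeof performance !== 'undefined' && performance.memory;
  if (!mem) {
    console.warn('performance.memory is not supported in this browser');
    return;
  }
  console.log(
    'JS heap used:', formatMiB(mem.usedJSHeapSize),
    'of limit:', formatMiB(mem.jsHeapSizeLimit)
  );
}
```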

Related to but distinct from #230.

pedrokohler (Collaborator)

@fedorov I've created #236 to address this issue partially.

We need to discuss this further, because I ran into a few challenges while trying to deliver an adequate solution:

  1. Logs that monitor tab memory usage and availability are now in place. However, I've found that the values are not updated during the HTTP request itself. I suspect the request is delegated to the OS, and the response only takes up memory allocated to the browser once the request finishes. I've researched alternatives, but the memory-related APIs exposed by modern browsers are very limited and inconsistent across browsers.

  2. One thing we could attempt is to stream the HTTP response. I believe tab memory usage would then grow gradually instead of in a sudden spike, allowing a more accurate measurement during the request. However, I'd like to research and test this option more thoroughly before assuring you it would work. It could also make a long-term solution easier, because processing the data in chunks would let us relieve memory pressure by storing the chunks with an API like IndexedDB. Again, I'd like to research further before making any promises.

  3. I attempted HEAD requests against the https://testing-proxy.canceridc.dev/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb DICOM server to gauge response sizes before making the actual GET requests, but I consistently got 404 responses, no matter which resource I tried to reach with the HEAD method. I've halted the implementation of these HEAD requests until you can confirm the server implements the method. Maybe I'm testing against the wrong server or doing something else wrong; please let me know.
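Point 2 can be sketched with the Fetch API's `ReadableStream` interface, so memory grows chunk by chunk instead of in one spike. The URL and `onProgress` callback here are hypothetical, and error handling is omitted.

```javascript
// Stream a response body chunk by chunk, reporting bytes received so far.
async function fetchInChunks(url, onProgress) {
  const response = await fetch(url);
  const reader = response.body.getReader();
  const chunks = [];
  let loaded = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value); // could instead be persisted, e.g. to IndexedDB
    loaded += value.byteLength;
    onProgress(loaded);
  }
  return concatChunks(chunks, loaded);
}

// Reassemble the received chunks into a single Uint8Array.
function concatChunks(chunks, totalLength) {
  const result = new Uint8Array(totalLength);
  let offset = 0;
  for (const chunk of chunks) {
    result.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return result;
}
```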

pieper (Member) commented Oct 5, 2024

Thanks for the investigation @pedrokohler.

If we can get it to work, I think option 2, streaming, is the best.

Regarding point 3, can you try a HEAD request directly against a Google DICOM store, without going through the proxy?
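The HEAD probe being discussed could look like the sketch below. The URL is a placeholder, and whether the server answers HEAD at all (and includes Content-Length) is exactly what needs to be verified.

```javascript
// Probe a resource's size with a HEAD request before the real GET.
async function getResourceSize(url) {
  const response = await fetch(url, { method: 'HEAD' });
  if (!response.ok) return null; // e.g. the 404s observed in this thread
  return parseContentLength(response.headers.get('Content-Length'));
}

// Parse a Content-Length header value into a byte count, or null if absent/invalid.
function parseContentLength(value) {
  const n = Number.parseInt(value, 10);
  return Number.isFinite(n) && n >= 0 ? n : null;
}
```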

pedrokohler (Collaborator) commented Oct 8, 2024

@pieper I've just tried making requests directly against the Google DICOM store with no luck; take a look:

[screenshot of the failed requests]

Still looking at the streaming approach, but it looks promising.

pieper (Member) commented Oct 8, 2024

Interesting, thanks for checking this @pedrokohler. It looks like you checked a QIDO studies endpoint. I wonder if a WADO request for ANN data would behave the same; QIDO could be streaming results from a database, while WADO would be serving basically static content.

pedrokohler (Collaborator)

> Interesting, thanks for checking this @pedrokohler. It looks like you checked a QIDO studies endpoint. I wonder if a WADO request for ANN data would behave the same. Just thinking that QIDO could be streaming results from a database while WADO would be serving basically static content.

@pieper if you take a close look, my second request in the screenshot above is a WADO request:

[screenshot]

pieper (Member) commented Oct 8, 2024

Oh, okay. Then I guess it's a limitation of the DICOMweb implementation. I'm not sure the standard says anything about whether this should be supported. Maybe it's something Google would agree has utility.

Did you happen to try range requests? Maybe that would work even if you don't know the total size.

pedrokohler (Collaborator)

@pieper what do you mean by range requests?

igoroctaviano (Collaborator)

> @pieper what do you mean by range requests?

Instead of requesting the entire binary, you can retrieve it in chunks.
@pedrokohler https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests
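A range request just adds a `Range` header asking for a byte window of the binary. A sketch (note that, as found later in this thread, the Google DICOM store ignores this header, so a 200 with the full body, rather than a 206, must be handled):

```javascript
// Request only bytes [start, end] of a resource.
async function fetchRange(url, start, end) {
  const response = await fetch(url, {
    headers: { Range: buildRangeHeader(start, end) },
  });
  if (response.status !== 206) {
    console.warn('Server ignored the Range header; got status', response.status);
  }
  return new Uint8Array(await response.arrayBuffer());
}

// Build the header value, e.g. "bytes=0-1023" for the first KiB.
function buildRangeHeader(start, end) {
  return `bytes=${start}-${end}`;
}
```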

fedorov (Member, Author) commented Oct 8, 2024

@pedrokohler to help debug this in an alternative way, can you please log to the JS console the number of annotations reported in this attribute: https://dicom.innolitics.com/ciods/microscopy-bulk-simple-annotations/microscopy-bulk-simple-annotations/006a0002/006a000c ? This would be in parallel with what you are trying to do with bulk data.
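Reading that attribute from the DICOMweb JSON metadata could look like the sketch below. Tag (006A,0002) is AnnotationGroupSequence and (006A,000C) is NumberOfAnnotations; the `metadata` variable and where it comes from in the viewer are assumptions.

```javascript
// Extract NumberOfAnnotations for each annotation group from a
// Microscopy Bulk Simple Annotations dataset in DICOMweb JSON form.
function getAnnotationCounts(dataset) {
  const groups = dataset['006A0002']?.Value ?? []; // AnnotationGroupSequence
  return groups.map((item) => item['006A000C']?.Value?.[0] ?? 0); // NumberOfAnnotations
}

// e.g. in the viewer, before fetching bulk data:
// console.log('Annotations per group:', getAnnotationCounts(metadata));
```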

pedrokohler (Collaborator)

@pieper the Range header seems to be completely ignored by the Google DICOM store.

pieper (Member) commented Oct 9, 2024

Thanks @pedrokohler - that makes sense, but it was worth trying. Another thing that might help: we should be able to get progress events from dicomweb-client to give feedback to the user, and potentially abort if we think there's too much data. I took a look, but I don't think Slim currently reports progress for dicomweb-client requests. Do you think that would help?

This might help in addition to using the NumberOfAnnotations value to estimate the size of the bulk data.

pedrokohler (Collaborator) commented Oct 9, 2024

@pieper unfortunately, the Google DICOM store does not send back the Content-Length header when we make the bulk data request.

Because of this, the total attribute of the progress event is always 0 and progress is not computable. See:

[screenshot of the progress event]

However, I think it's still possible to add a button that lets the user abort the request. We could also show the number of bytes downloaded so far (the loaded property above) in this interface.

igoroctaviano (Collaborator)

> @pieper unfortunately, the google dicom store does not send us back the Content-length header when we make the bulk data request.
>
> Because of this, the total attribute of the progress event is always 0 and progress is not computable. See:
>
> [screenshot]
>
> However I think it's still possible to add some button to allow the user to abort the request. We could also add the amount of bytes downloaded so far to this interface (the loaded property above).

The idea is to abort automatically if the data exceeds a given threshold, e.g. NumberOfAnnotations > X.

pieper (Member) commented Oct 9, 2024

Can't we also watch the "loaded" parameter in the progress event and abort if it starts getting too large?

pedrokohler (Collaborator)

> Can't we also watch the "loaded" parameter in the progress event and abort if it starts getting too large?

Yes, we can do that.
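The abort-on-loaded idea could be sketched with fetch streaming plus an `AbortController`; since Content-Length is missing, the threshold check uses only the bytes received so far. The 256 MiB cap is a placeholder to be tuned.

```javascript
// Hypothetical byte cap; to be tuned based on real-world annotation sizes.
const MAX_BYTES = 256 * 1024 * 1024;

// Pure threshold check on bytes downloaded so far.
function exceedsThreshold(loaded, maxBytes = MAX_BYTES) {
  return loaded > maxBytes;
}

// Stream a response and abort the underlying request once the cap is exceeded.
async function fetchWithCap(url, maxBytes = MAX_BYTES) {
  const controller = new AbortController();
  const response = await fetch(url, { signal: controller.signal });
  const reader = response.body.getReader();
  const chunks = [];
  let loaded = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    loaded += value.byteLength;
    if (exceedsThreshold(loaded, maxBytes)) {
      controller.abort(); // stop the request instead of exhausting RAM
      throw new Error(`Download aborted after ${loaded} bytes`);
    }
    chunks.push(value);
  }
  return chunks;
}
```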
