This repository has been archived by the owner on Apr 11, 2022. It is now read-only.

Reject download if incoming Content-Type blatantly contradicts expected media type #850

Open
leonardr opened this issue Apr 18, 2018 · 0 comments


leonardr commented Apr 18, 2018

Representation.get takes a presumed_media_type argument, which is used if the server doesn't provide a media type. This works well when the server serves the right content but doesn't send any metadata about it.

However, sometimes the server does provide a media type, and the content it serves is completely wrong. The most common case is that the server advertises an EPUB at the other end of a link, but when you follow the link you get an HTML page that tells you how to download the EPUB, or that makes you solve a CAPTCHA.

It should be possible to abort the representation fetch if the media type received from the server blatantly contradicts what was expected.
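As a rough illustration of the most common case above (HTML served where something else was expected), here is a minimal sketch. The names `MediaTypeMismatch`, `base_media_type` and `check_received_media_type` are hypothetical and are not part of `Representation.get`; treating `application/xhtml+xml` as HTML is also an assumption.

```python
class MediaTypeMismatch(Exception):
    """Hypothetical exception: the server's Content-Type blatantly
    contradicts the media type we expected."""


# Assumption: XHTML counts as "HTML" for the purposes of this check.
HTML_TYPES = frozenset(["text/html", "application/xhtml+xml"])


def base_media_type(content_type):
    """Strip parameters such as '; charset=utf-8' and normalize case.

    Returns None if no usable media type was given.
    """
    if not content_type:
        return None
    return content_type.split(";", 1)[0].strip().lower() or None


def check_received_media_type(expected, content_type_header):
    """Return the media type to use, or raise MediaTypeMismatch.

    If the server sent no Content-Type, fall back to the expected type,
    which mirrors how presumed_media_type is described in this issue.
    """
    expected_base = base_media_type(expected)
    received = base_media_type(content_type_header)
    if received is None:
        return expected_base
    # Only the blatant case is rejected here: we got HTML but expected
    # something that is not HTML.
    if (expected_base is not None
            and received in HTML_TYPES
            and expected_base not in HTML_TYPES):
        raise MediaTypeMismatch(
            "Expected %s but server sent %s" % (expected_base, received)
        )
    return received
```

A caller could run this check on the response headers before reading the body, and abort the fetch if the exception is raised.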

presumed_media_type may not be the best tool for this, because some mismatches are tolerable. If the presumed media type is image/jpeg and the file is actually image/png, there's no problem. If the presumed media type is EPUB and the file is actually a PDF, that's disappointing but we can handle it. But if you think you're getting a PDF and you actually get an image, or you get HTML when you expected anything other than HTML, there's a problem.

So a better solution might be to abort the fetch if the media type received doesn't fit with both the expected media type and the relation of the link in question. But we could also just create clusters of 'media types that are substitutable for other media types' and decide on that basis.
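The cluster idea could look something like the sketch below. The cluster contents are taken from the examples in this issue (plus `application/xhtml+xml`, which is an assumption), and `MEDIA_TYPE_CLUSTERS`, `cluster_for` and `is_substitutable` are hypothetical names, not existing code. How to treat media types that belong to no cluster is also an assumption: here they only match themselves.

```python
# Hypothetical clusters of media types that can stand in for one another.
MEDIA_TYPE_CLUSTERS = [
    # An image is an image: JPEG vs. PNG is no problem.
    frozenset(["image/jpeg", "image/png"]),
    # Getting a PDF instead of an EPUB is disappointing but workable.
    frozenset(["application/epub+zip", "application/pdf"]),
    # HTML only substitutes for HTML (XHTML included as an assumption).
    frozenset(["text/html", "application/xhtml+xml"]),
]


def cluster_for(media_type):
    """Return the cluster containing media_type, or None if there is none."""
    for cluster in MEDIA_TYPE_CLUSTERS:
        if media_type in cluster:
            return cluster
    return None


def is_substitutable(expected, received):
    """True if `received` is acceptable where `expected` was advertised."""
    expected = expected.strip().lower()
    received = received.strip().lower()
    if expected == received:
        return True
    cluster = cluster_for(expected)
    # Assumption: a type outside every cluster only matches itself.
    return cluster is not None and received in cluster
```

With these clusters, `is_substitutable("image/jpeg", "image/png")` and `is_substitutable("application/epub+zip", "application/pdf")` are accepted, while a PDF link that returns an image, or an EPUB link that returns HTML, is rejected. This version ignores the link relation entirely, so it covers only the second of the two options above.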
