Reject download if incoming Content-Type blatantly contradicts expected media type #850

leonardr · 2018-04-18T15:37:06Z

Representation.get takes a presumed_media_type argument which is used if the server doesn't provide a media type. This is nice when the server serves the right thing but doesn't provide any metadata.

However, sometimes the server does provide metadata, and the content it serves is completely wrong. The most common case is that the server advertises an EPUB at the other end of a link, but when you follow the link you get an HTML page that tells you how to download the EPUB, or that makes you solve a CAPTCHA.

It should be possible to abort the representation fetch if the media type received from the server blatantly contradicts what was expected.

It may not be best to use presumed_media_type for this, because some mismatches are tolerable. If the presumed media type is image/jpeg and the file is actually image/png, there's no problem. If the presumed media type is EPUB and the file is actually a PDF, that's disappointing but we can handle it. But if you think you're getting a PDF and you actually get an image, or you get HTML and you expected anything other than HTML, there's a problem.

So a better solution might be to abort the fetch if the media type received doesn't fit with the expected media type and the relation of the link in question. But we could also just create clusters of 'media types that are substitutable for other media types' and decide on that basis.
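
A rough sketch of the 'clusters' idea, to make the discussion concrete. Everything here is hypothetical: `SUBSTITUTABLE_CLUSTERS`, `HTML_TYPES`, `normalize`, and `contradicts` are illustrative names, not part of `Representation.get`, and the groupings are just the examples from this issue. It ignores the link relation entirely.

```python
# Hypothetical clusters: media types in the same cluster are treated as
# acceptable substitutes for one another. Groupings are illustrative only.
HTML_TYPES = frozenset({"text/html", "application/xhtml+xml"})
SUBSTITUTABLE_CLUSTERS = [
    frozenset({"image/jpeg", "image/png", "image/gif"}),
    frozenset({"application/epub+zip", "application/pdf"}),
    HTML_TYPES,
]


def normalize(media_type):
    """Drop parameters such as '; charset=utf-8' and lowercase the type."""
    if not media_type:
        return None
    return media_type.split(";", 1)[0].strip().lower() or None


def contradicts(expected, received):
    """Return True if `received` blatantly contradicts `expected`."""
    expected = normalize(expected)
    received = normalize(received)
    if expected is None or received is None:
        # Nothing to compare, so there is no basis for rejecting.
        return False
    if expected == received:
        return False
    for cluster in SUBSTITUTABLE_CLUSTERS:
        if expected in cluster:
            # A known expected type only tolerates its own cluster.
            return received not in cluster
    # Unknown expected type: only unexpected HTML counts as a contradiction.
    return received in HTML_TYPES
```

With these assumed clusters, `contradicts("image/jpeg", "image/png")` is false, while `contradicts("application/epub+zip", "text/html; charset=utf-8")` is true, so the caller could abort the fetch in the second case.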
