[Bug] Plugin chooses wrong decryption key #1862

Open · 3 tasks done
Leseratte10 opened this issue Oct 1, 2021 · 5 comments

Leseratte10 commented Oct 1, 2021

CheckList

  • The Title and the Log Title are set correctly.
  • Clarified my environment.
  • A code block is used for the log.

[Bug] Plugin chooses wrong decryption key for PDF

I'm adding a PDF to Calibre. The modifications from #1689 are included, so PDF decryption works in the first place.
I have multiple different ADE keys imported into the plugin, with the "correct" one for this PDF being the last one in that list.

However, the plugin seems to think that the very first key it tries is the correct one. Probably, just by chance, decryption with that wrong key produces a file that at least looks like it could be a valid PDF; in reality it results in a PDF with all pages empty.

When I delete all the other keys from the plugin settings so only the correct key is left, the plugin correctly decrypts the book.

My Environment

Calibre: 5.28

Kindle: N/A

DeDRM: 7.2.1 plus PR #1689

Log

Log Title
DeDRM v7.0.3: Trying to decrypt mzgibmyn.pdf
DeDRM v7.0.3: mzgibmyn.pdf is a PDF ebook
DeDRM v7.0.3: Trying Encryption key key1
DeDRM v7.0.3: Finished after 0.8 seconds
Traceback (most recent call last):
  File "calibre/customize/ui.py", line 433, in get_file_type_metadata
  File "calibre/customize/builtins.py", line 318, in get_metadata
  File "calibre/ebooks/metadata/pdf.py", line 126, in get_metadata
ValueError: Could not read info dict from PDF

pdfinfo returned no UTF-8 data
Added Book to db in: 0.8
Added 1 books in 22.0 seconds

(That "key1" that it tried to use is definitely, 100%, the wrong key for that book).

Leseratte10 (Author):

Addition: Just had the same bug happen with an EPUB file, so the bug is not limited to just PDFs.

Leseratte10 changed the title from "[Bug] Plugin chooses wrong decryption key for PDF" to "[Bug] Plugin chooses wrong decryption key" on Oct 26, 2021

Leseratte10 commented Nov 13, 2021

Okay, due to the questionable legality of tools like this, I do not want to make a PR, but here's what I think is going wrong:
Right now, both ineptepub and ineptpdf (PDF inside the "initialize_ebx" function, EPUB inside the "decryptBook" function) assume that if the 17th-to-last byte of the RSA-decrypted data is a NULL byte, then the following 16 bytes (the rest of the data) must be the decrypted book key.
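For reference, a minimal sketch of that heuristic in isolation (the function name and exact slicing are my own illustration, not the plugin's actual code):

```python
def extract_book_key_current(decrypted: bytes):
    """Current heuristic (sketch): accept the candidate if the 17th-to-last
    byte of the RSA-decrypted blob is a NULL byte, and treat the last 16
    bytes as the AES book key."""
    if len(decrypted) >= 17 and decrypted[-17] == 0:  # NULL separator right before the key?
        return decrypted[-16:]                        # assume the rest is the book key
    return None                                       # reject, try the next user key
```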

(Btw., does anyone know what the heck all that data before that NULL byte is?)

However, if you take that encrypted book key and decrypt it with a random, non-matching key for a different Adobe account, you get 128 essentially random bytes instead, due to the wrong key. If the 17th-to-last byte of these 128 random bytes happens to be a NULL byte, too - which probably has a chance of 1/256 - the plugin assumes it has found the correct book key and then continues to decrypt the book with the wrong key. Of course, the more "wrong" user keys you have imported into the plugin settings, the higher the chance of running into this issue.

Without knowing what all the data before the book key is, there's not really a way to completely prevent that. However, I noticed that that data seems to always start with "00 02" for EPUBs and with "02" for PDFs. This means that by modifying the code to not only check for that NULL byte in the 17th-to-last place but also check the one (or two) bytes at the beginning, you decrease the chance of issues from 1/256 to 1/(256^2) = 1/65536 for PDF files, or down to 1/(256^3) = 1/16777216 for EPUB files, which should be way more reliable.
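A rough sketch of that stricter check (byte values taken from the observation above; whether the PDF blob really starts at the bare 02, i.e. with the leading 00 already stripped, is an assumption on my side):

```python
def extract_book_key_stricter(decrypted: bytes, is_epub: bool):
    """Sketch: additionally require the observed prefix (00 02 for EPUB,
    02 for PDF) before trusting the NULL-byte heuristic, lowering the
    false-positive rate from ~1/256 to ~1/256**3 (EPUB) or ~1/256**2 (PDF)."""
    prefix = b"\x00\x02" if is_epub else b"\x02"
    if decrypted.startswith(prefix) and len(decrypted) >= 17 and decrypted[-17] == 0:
        return decrypted[-16:]
    return None
```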

EDIT:
An ideal solution would probably be to store not only the raw encryption keys in dedrm.json but also the Adobe user UUID for each key. That way the plugin could check the rights.xml for the UUID the book is licensed to, and then directly pick the correct key from the list instead of having to try them all (see the sketch below). Though that would not work when the user manually imports an existing DER file, as that doesn't contain the UUID ...
Maybe one would need statistics / heuristics like "Okay, for different books all with UUID X we've successfully used key A 100 times and key B a single time, so key A is probably the correct one (for the future) and B was such a 1/256 error".
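A hedged sketch of the UUID-matching part of that idea, assuming each entry in dedrm.json gained a "uuid" field next to the raw key and that rights.xml contains the licensed user as a urn:uuid value (the element lookup below is illustrative, not the plugin's actual schema handling):

```python
import xml.etree.ElementTree as ET

def keys_for_book(rights_xml: str, stored_keys: list) -> list:
    """Return the stored keys whose Adobe user UUID matches the UUID in
    rights.xml; fall back to all keys if nothing matches (e.g. keys that
    were imported from a bare .der file and carry no UUID)."""
    book_uuid = None
    for elem in ET.fromstring(rights_xml).iter():
        text = (elem.text or "").strip()
        if text.startswith("urn:uuid:"):        # licensed-user reference (assumed)
            book_uuid = text[len("urn:uuid:"):]
            break
    matching = [k for k in stored_keys if k.get("uuid") == book_uuid]
    return matching or stored_keys
```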

j-howell commented:

(Btw., does anyone know what the heck all that data before that NULL byte is?)

Those extra bytes are the result of RSAES-PKCS1-v1_5 encryption.

The decrypted string should start with 00 02, followed by at least eight randomly chosen non-zero bytes, followed by a 00 byte, and then the 16-byte key. A more rigorous verification could look at all of those bytes prior to the key.
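A minimal sketch of such a rigorous check, verifying the whole RSAES-PKCS1-v1_5 encoding instead of only the final NULL byte (the 16-byte key length is taken from this thread; everything else follows the padding layout described above):

```python
def unpad_pkcs1_v15(decrypted: bytes, key_len: int = 16):
    """Verify 00 02 || >= 8 non-zero padding bytes || 00 || key and return
    the key, or None if any part of the encoding does not match."""
    if len(decrypted) < key_len + 11:                 # 2 header + 8 padding + 1 separator
        return None
    if decrypted[0] != 0x00 or decrypted[1] != 0x02:  # PKCS#1 v1.5 block type 2 header
        return None
    padding = decrypted[2:-(key_len + 1)]
    separator = decrypted[-(key_len + 1)]
    if separator != 0x00 or len(padding) < 8 or 0x00 in padding:
        return None
    return decrypted[-key_len:]
```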


Leseratte10 commented Nov 14, 2021

Yeah, the decrypted string starts with 00 02 for EPUBs (or just 02 for PDFs, apparently), followed by 109 bytes of data that doesn't seem to have a pattern. According to your link, all these 109 bytes would then be junk data / padding data without a purpose. After that there's the NULL byte and then the key. Thanks for that link; I did google for PKCS#1 v1.5 (as mentioned in the comment in the source code) but failed to figure out that all the other data is just padding.

Verifying these 00 02 bytes (EPUB) or the 02 byte (PDF) would bring the chance of errors from 1/256 to 1/(256*256) for PDFs and down to 1/(256*256*256) for EPUBs, which is what I talked about in the previous comment. Not 100% foolproof, but way better than the current implementation.

With "more rigorous" you mean verifying that none of the junk bytes are zero? I believe that was included in the past but has been removed because there were some buggy books that did have a zero in that padding data. EDIT: Apparently I was wrong about that.

j-howell commented:

With "more rigorous" you mean verifying that none of the junk bytes are zero? I believe that was included in the past but has been removed because there were some buggy books that did have a zero in that padding data.

That is what I meant. Perhaps that check could be reinstated and only be relaxed if none of the possible results pass inspection.
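A sketch of that strict-first, relaxed-fallback selection across all configured user keys (rsa_decrypt is a placeholder for the plugin's existing RSA step, and unpad_pkcs1_v15 refers to the strict check sketched earlier in this thread):

```python
def find_book_key(encrypted_key: bytes, user_keys: list):
    """Try every user key with the strict PKCS#1 v1.5 check first; only if
    no candidate passes, fall back to the old NULL-byte-only heuristic so
    books with non-conforming padding still decrypt."""
    candidates = [rsa_decrypt(encrypted_key, k) for k in user_keys]  # placeholder RSA step
    for blob in candidates:                       # pass 1: rigorous padding check
        key = unpad_pkcs1_v15(blob)
        if key is not None:
            return key
    for blob in candidates:                       # pass 2: relaxed legacy heuristic
        if len(blob) >= 17 and blob[-17] == 0:
            return blob[-16:]
    return None
```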
