Some EPUB2 publications do not use proper NCX content-type in OPF => navigation document gets encrypted, not compliant with the LCP specification #236

Open
danielweck opened this issue Feb 4, 2021 · 2 comments

The LCP specification:
https://github.com/readium/lcp-specs/blob/master/releases/lcp/latest.md#21-encrypted-resources

In addition, this specification defines that the following files must not be encrypted:
META-INF/license.lcpl.
Navigation Documents referenced in any Package Document from the Publication (all Publication Resources listed in the Publication manifest with the "nav" property)
NCX documents referenced in any Package Document from the Publication (all Publication Resources listed in the Publication manifest with the media type “application/x-dtbncx+xml”)
Cover images (all Publication Resources listed in the Publication manifest with the "cover-image" property)

To reproduce:

LSD:
https://lsd-test.edrlab.org/licenses/999806a1-62cd-4f95-a455-6fd9099929c7/status

LCP license:
https://front-test.edrlab.org/api/v1/licenses/999806a1-62cd-4f95-a455-6fd9099929c7

EPUB download link:
https://lcp-test.edrlab.org/contents/1957d6ca-12b3-4918-baa5-41810a2e9e8e

content.opf:

<manifest>
  <item id="ncx" href="toc.ncx" media-type="text/xml"/>
...
 </manifest>
 <spine toc="ncx">
...

Note media-type="text/xml" instead of media-type="application/x-dtbncx+xml": the Go code therefore does not recognize the manifest item as the NCX, and goes on to encrypt the resource.

Code references:

// addCleartextResources searches for resources which must not be encrypted
// i.e. cover, nav and NCX
func addCleartextResources(ep *Epub, p opf.Package) {
	var coverImageID string
	coverImageID = "cover-image"
	for _, meta := range p.Metadata.Metas {
		if meta.Name == "cover" {
			coverImageID = meta.Content
		}
	}
	// Look for cover, nav and NCX items
	for _, item := range p.Manifest.Items {
		if strings.Contains(item.Properties, "cover-image") ||
			item.ID == coverImageID ||
			strings.Contains(item.Properties, "nav") ||
			item.MediaType == ContentType_NCX {
			// re-construct a path, avoid insertion of backslashes as separator on Windows
			path := filepath.ToSlash(filepath.Join(p.BasePath, item.Href))
			ep.addCleartextResource(path)
		}
	}
}

ContentType_NCX = "application/x-dtbncx+xml"

danielweck commented Feb 4, 2021

Possible solution: discover the NCX via the spine@toc="ncx" attribute, which references the manifest item with id="ncx". This feels heavy-handed, but it is a viable workaround for badly-authored EPUB2 publications (assuming media-type="application/x-dtbncx+xml" is mandatory for NCX; I must admit I haven't checked the specification).
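A minimal sketch of the spine@toc lookup, to make the idea concrete. The Item and Package types below are hypothetical, simplified stand-ins and not the actual opf package types (in particular, the SpineToc field is an assumption; the real package may expose the spine's toc attribute differently, or not at all):

```go
package main

import "strings"

// Item is a hypothetical, simplified manifest item (not the real opf type).
type Item struct {
	ID, Href, MediaType, Properties string
}

// Package is a hypothetical, simplified OPF package.
// SpineToc is assumed to hold the value of <spine toc="...">.
type Package struct {
	Items    []Item
	SpineToc string
}

const contentTypeNCX = "application/x-dtbncx+xml"

// isNCX reports whether item is the NCX, either because it declares the
// NCX media type or because spine@toc points at its manifest id.
func isNCX(item Item, p Package) bool {
	if item.MediaType == contentTypeNCX {
		return true
	}
	toc := strings.TrimSpace(p.SpineToc)
	return toc != "" && item.ID == toc
}

// ncxHrefs returns the href of every manifest item identified as an NCX.
func ncxHrefs(p Package) []string {
	var hrefs []string
	for _, item := range p.Items {
		if isNCX(item, p) {
			hrefs = append(hrefs, item.Href)
		}
	}
	return hrefs
}
```

With the content.opf from this issue (id="ncx", media-type="text/xml", spine toc="ncx"), isNCX would match toc.ncx through the spine reference even though the media type is wrong. In addCleartextResources, the same check could presumably sit next to the existing item.MediaType == ContentType_NCX condition.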

Other solution: fix reading systems (e.g. Thorium, which currently fails to open the publication), either by ignoring the encrypted NCX or by implementing a delayed parsing strategy (I think the ReadiumSDK C/C++ code implemented an alternative codepath that waits for the LCP passphrase before parsing core resources such as the NCX / NavDoc or cover image).

Other solution: do nothing. Content creators / publishers have the responsibility to fix their publications (the reality is that many "legacy" publications exist and will never be updated).

Other solution: the Go code rejects such a publication when it detects an inconsistency between spine@toc and the media-type of the referenced manifest item.

... or alternatively the Go code patches the incorrect media-type (a bad idea I think, not just because it would break resource-level signatures, but also because this is not the responsibility of the LCP encryptor).
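For the "reject" option, the inconsistency check could look roughly like the sketch below. The manifestItem type and the checkNCXMediaType function are hypothetical names introduced for illustration; they are not part of the existing encryptor code:

```go
package main

import "fmt"

// manifestItem is a hypothetical, simplified manifest entry.
type manifestItem struct {
	ID, Href, MediaType string
}

const ncxMediaType = "application/x-dtbncx+xml"

// checkNCXMediaType returns an error when spine@toc references a manifest
// item that does not declare the NCX media type, or references no item at all.
// An empty spineToc (no toc attribute) is not treated as an error.
func checkNCXMediaType(items []manifestItem, spineToc string) error {
	if spineToc == "" {
		return nil
	}
	for _, it := range items {
		if it.ID != spineToc {
			continue
		}
		if it.MediaType != ncxMediaType {
			return fmt.Errorf("manifest item %q (%s) is referenced by spine@toc but has media-type %q, expected %q",
				it.ID, it.Href, it.MediaType, ncxMediaType)
		}
		return nil
	}
	return fmt.Errorf("spine@toc references unknown manifest item %q", spineToc)
}
```

The encryptor would call this before encrypting anything and abort on a non-nil error, leaving the publication itself untouched.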

Related issue: #129
