EPUB3/HTML support #9

Closed
bertfrees opened this issue Mar 15, 2016 · 52 comments

@bertfrees
Contributor

bertfrees commented Mar 15, 2016

If SBS also intends to use EPUB3 as input format, pipeline-mod-sbs should also have an epub3-to-pef script and corresponding HTML translator.

@mixa72

mixa72 commented Mar 15, 2016

Yes, it's quite likely that we will also use EPUB3 as input format in the long run.

@bertfrees bertfrees added this to the 2016/Q4 milestone Jul 26, 2016
@bertfrees bertfrees added the L label Jul 26, 2016
@bertfrees
Contributor Author

bertfrees commented Jul 26, 2016

Other agencies have done this by making some or all of their XSLT and CSS files work with both DTBook and HTML (EPUB). Maintainability and readability do seem to benefit from that, because DTBook is quite similar to HTML.

(The alternative to making the XSLT and CSS files work with both DTBook and HTML is to have separate stylesheets for DTBook and HTML; because they have a lot in common, both could import a common stylesheet.)

See for example:

For us, it means the following files either have to be made to work with HTML or ported to HTML.

  • default.scss
  • block-translate.xsl
  • functions.xsl
  • handle-downgrading.xsl
  • handle-elements.xsl
  • handle-prodnote.xsl
  • insert-boilerplate.xsl
  • select-braille-table.xsl
  • group-starting-with-linenum.xsl
  • handle-toc-and-running-line.xsl

@bertfrees
Contributor Author

bertfrees commented Oct 10, 2016

There are a number of SBS-specific extensions to DTBook for which we need to find an alternative in EPUB 3. I will create a table here with the mapping and we can discuss the problematic ones.

[table moved to wiki page]

@bertfrees
Contributor Author

@egli Can I get that pointer to the Nordic EPUB specification?

@bertfrees
Contributor Author

@egli The progress can be seen on branch sbs-9. I'm holding off on merging it because I still can't run all the tests at once, even after increasing the memory.

@bertfrees
Contributor Author

If at some point we want to let Mischa try it, and I haven't found a real solution to the problem yet, we could merge it but with the new tests disabled.

@egli
Member

egli commented Oct 31, 2016

We do have an EPUB test document that was produced in India

@bertfrees
Contributor Author

@egli and @mixa72 Please have a look at my table above, I've updated it. Maybe you have some more ideas.

@mixa72

mixa72 commented Nov 3, 2016

That looks pretty good. It is interesting how many standards there are for the different purposes. Compared to our current DTBook, EPUB3 will apparently involve a lot more namespaces and the terminology will be a varied mix. So whatever you decide for the elements brl:select, brl:running-line, brl:toc-line, brl:time is fine by me since it's not possible to find a uniform naming anyway.
BTW: AFAIK @brl:class is indeed only used for SBSForm.

@bertfrees
Contributor Author

bertfrees commented Nov 3, 2016

Coherence is indeed something we need to think about carefully. You need to work with this every day, so your opinion is important. At the same time, using standards is also important, and, last but not least, so is compatibility with the Nordic guidelines. Changing the Nordic guidelines is possible but apparently a slow process.

The Nordic guidelines have apparently chosen to use "class" for some semantics instead of a custom "epub:type" prefixed with "nordic:". I'm not sure what the motives were. However they do use epub:types that are available in either the default or the z3998 vocabulary. Moreover, they do have a "nordic:" prefix but they only use it for some of the metadata, not for epub:types.

Nordic's use of class is not always appropriate in my opinion, but I think we have to live with this. It's also hard to avoid the mix of different attributes and prefixes, because this is just how EPUB works and because of the compatibility requirement with Nordic. What we could do to simplify things a bit is to not use our own "sbs:" prefix and use classes instead. This is semantically not optimal, but at least it creates some coherence with the Nordic guidelines. In addition, we can try to completely eliminate "brl:" elements and attributes.

@egli
Member

egli commented Nov 3, 2016

I would not take the Nordic guidelines as the be-all-end-all truth. While they are useful and most likely will define the shape of the EPUB we will get from our providers I would also be forward looking and improve things where you think it makes sense.

@bertfrees
Contributor Author

bertfrees commented Nov 3, 2016

We could of course have a converter from "Nordic EPUB 3" to "SBS EPUB 3". But this makes interchanging files a bit difficult unless we have the conversion in the two directions.

@mixa72

mixa72 commented Nov 3, 2016

Doing the markup with Oxygen is very user-friendly. DTBooks can be validated against both our inhouse minimal schema and the classic DTD. The most important feature is that the editor displays a list with all the possible elements at any place in the document (auto completion). If Oxygen also behaves like that with EPUB3 files then I don't see any problems for the users. It will take some time to learn and memorize the new markup, that's obvious, but after a while everybody will get used to it.

@egli
Member

egli commented Nov 22, 2016

I talked to @mixa72 about this yesterday and the consensus seems to be that the actual names of the elements that we will use in the EPUB are not so important to the transcribers, as long as oXygen does the auto completion.

@bertfrees
Contributor Author

Yes that's what Mischa said last time. But still we should think it through. What about the things where I have put question marks?

@mixa72

mixa72 commented Nov 22, 2016

It's OK with me if you use the following for EPUB3:
brl:class --> @class (no prefix)

brl:select --> brl:select (or solution with span)
brl:when-braille --> brl:when-braille (or solution with span)
brl:literal[@brl:grade=...] --> brl:literal[@brl:grade=...] (or solution with span)
brl:otherwise --> brl:otherwise (or solution with span)

brl:running-line --> brl:running-line
brl:toc-line --> brl:toc-line
brl:volume[@brl:grade=...] --> br[@class='braille-volume-break-grade-...']

brl:time --> brl:time (if we keep brl:date; if we use sbs:date instead, I'd also prefer sbs:time)

But I'm open to accepting anything as long as there is no loss in functionality compared to the current system.

@bertfrees
Contributor Author

Okay.

@bertfrees
Contributor Author

bertfrees commented Dec 6, 2016

An important remark that was made in our call today is that what we use as authoring format does not need to be standards compliant, as long as what we distribute or exchange with Nordic countries is standards compliant. So it is no problem if the authoring format has really SBS-specific things such as brl:select in it, as long as we remove them when distributing/exchanging. The same can be said about the whole markup. In theory we could have two completely separate types of EPUB: one with all the brl:* that we are used to in DTBook, and one that is standards compliant, with conversion scripts to go from one format to the other and back.

@bertfrees
Contributor Author

Test suite works again (#52 (comment)).

@bertfrees
Contributor Author

bertfrees commented May 5, 2017

All the existing unit tests pass now. I'm going to merge the sbs-9 branch even though some things might not work yet, and even though the exact EPUB 3 format (see wiki page) hasn't been decided yet.

We can move the issue back to "Backlog" if Mischa finds issues, or if we want to make changes to the EPUB 3 format.

@mixa72

mixa72 commented May 14, 2018

I found some issues in the EPUB3 output. I first created a file as DTB and an identical one as EPUB. Here are the differences I found. Possibly my markup is wrong, please take a look at it.
test_epub3_html.zip

 Output from DTB                   Output from EPUB


                               |
         *H:TSV7Z34X           |           *H:TSV7Z34X
         -----------           |           -----------
                               |
           7]7 B+D             |             7]7 B+D
                               |
 TO'CL*E ................ #*A  |   TO'CL*E ................ #*A
                               |
          ZW3T7 B+D            |            ZW3T7 B+D
                               |
 TO'CL*E ............... #,,C  |   TO'CL*E ............... #,,C
                               |
         ::::::::::::          |           ::::::::::::
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
         *H:TSV7Z34X       >I  |           *H:TSV7Z34X       >I
p                              |  p
p                              |  p
       HEAD*G VOLUME #A        |         HEAD*G VOLUME #A
       ----------------        |         ----------------
                               |
 SPAN-+SW7 ---                 |   SPAN-+SW7 ---
 SPAN-+SW7'#A -                |   SPAN-+SW7'#A -
 SPAN-BO'X ---                 |   SPAN-BO'X ---
 LI-BRL-'CLA^                  |   LI-BRL-'CLA^                             <-- brl:class not working in EPUB (css was specified, but has no effect)
 'A-PA&REF #A                  |   'A-PA&REF #A
 BRL-HOMOGRAPH W<]UBE          |   BRL-HOMOGRAPH W<]UBE
 BRL-'V-F?M $S                 |   BRL-'V-F?M S                             <-- brl:v-form not working in EPUB
 BRL-NUM                       |   BRL-NUM                                  <-- brl:num not working in EPUB
   'C)D*AL #E                  |   'C)D*AL #E
   ?D*AL #?                    |   ?D*AL #E.                                <-- brl:num not working in EPUB
   ROMAN >II.                  |   ROMAN II.                                <-- brl:num not working in EPUB
   PHONE #JDC.CCC.CB.CB        |   PHONE #JDC !, #CCC #CB #CB               <-- brl:num not working in EPUB
   ISBN #IGH.C.DIB.BDJGB.G     |   ISBN #IGH-#C-#DIB-#BDJGB-#G              <-- brl:num not working in EPUB
   MEASURE #D'DL               |   MEASURE #D DL                            <-- brl:num not working in EPUB
   FRA'CTJ #C/                 |   FRA'CTJ #C!,#D                           <-- brl:num not working in EPUB
   MI'XED #H#A;                |   MI'XED #H #A!,#B                         <-- brl:num not working in EPUB
 BRL-PLA'CE M+NH3M             |   BRL-PLA'CE M+NH3M
 BRL-SYE'CT KZ                 |   BRL-SYE'CT BASIS Q KZ                    <-- brl:select not working in EPUB
 BRL-[PH                       |   BRL-[PH
   _[PH                        |   [PH                                      <-- brl:emph not working in EPUB
   '(S*GLE'QUO(')              |   S*GLE'QUO(                               <-- brl:emph not working in EPUB
   ('QUO()                     |   'QUO(                                    <-- brl:emph not working in EPUB
   IGN?E                       |   IGN?E
 _#I,)       R/N*GL*E      #A  |                                            <-- brl:running-line not working in EPUB (1 was selected)
p                              |  p
p                              |  p
         *H:TSV7Z34X           |           *H:TSV7Z34X
         -----------           |           -----------
                               |
          ZW3T7 B+D            |            ZW3T7 B+D
                               |
 TO'CL*E ............... #,,C  |   TO'CL*E ............... #,,C
                               |
         ::::::::::::          |           ::::::::::::
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
         *H:TSV7Z34X       >I  |           *H:TSV7Z34X       >I
p                              |  p
p                              |  p
       HEAD*G VOLUME #B        |         HEAD*G VOLUME #B
       ----------------        |         ----------------
                               |
 BRL-A'C'CCTS-SPAN R"EDUIT     |   BRL-A'C'CCTS-SPAN R"EDUIT
   D"%TAIQ"%                   |   D"%TAIQ"%
 BRL-'COMPUT7 '$WWW.SBS.CH     |   BRL-'COMPUT7 WWW.SBS.'4                  <-- brl:computer not working in EPUB
 BRL-DA( #,=AJ#BJJD            |   BRL-DA( #AG.AJ.BJJD                      <-- brl:date not working in EPUB
 BRL-TIME #E.AE                |   BRL-TIME #JE":#JE                        <-- brl:time not working in EPUB
 BRL-NAME K1FM+N               |   BRL-NAME K1FMN                           <-- brl:name not working in EPUB
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
                               |
 _#AJ,;      R/N*GL*E      #C  |
p                              |  p

@bertfrees
Contributor Author

OK thanks for the heads up!

@bertfrees
Contributor Author

@mixa72 What is supposed to happen with brl:class? As far as I remember this had something to do with macros in dtbook2sbsform, which I guess would translate to CSS in the new system. If you want to select an element with a brl:class in CSS you should do it like this:

@namespace brl url(http://www.daisy.org/z3986/2009/braille/);
[brl|class~='myclass'] {
   ...
}

@bertfrees
Contributor Author

OK I see what you are trying to do. You put this in the EPUB:

<style>
@namespace xml "http://www.w3.org/XML/1998/namespace";
@namespace brl url(http://www.daisy.org/z3986/2009/braille/);

li[brl|class='myclass'] {
   margin-left:2;
}
   </style>

The problem is that this CSS is not enabled unless you specify the "apply-document-specific-stylesheets" option (which is currently not available in the SBS version of the script).

@mixa72

mixa72 commented Jul 9, 2018

@bertfrees Thanks for the hint about the syntax. However, it appears that any CSS instruction in the style element is ignored by the system. I even tried

      @namespace xml "http://www.w3.org/XML/1998/namespace";
      @namespace brl url(http://www.daisy.org/z3986/2009/braille/);
      li{
        margin-top:2 !important;
      }

but nothing changes. Is that possible?

@bertfrees
Contributor Author

Well, there are two problems. Firstly, as I said above, you need the "apply-document-specific-stylesheets" option. (I will add it.) Secondly, you need to add type="text/css" to the style element to make it work. (Preferably also add media="embossed" so that the style does not influence the rendering on screen.)
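
Putting both points together, the embedded style sheet would look roughly like the sketch below. It reuses the `myclass` rule from the earlier comment and only takes effect once the "apply-document-specific-stylesheets" option is enabled.

```html
<!-- type="text/css" is needed for the style to be picked up;
     media="embossed" keeps it from affecting on-screen rendering -->
<style type="text/css" media="embossed">
@namespace brl url(http://www.daisy.org/z3986/2009/braille/);

/* list items that carry brl:class="myclass" */
li[brl|class~='myclass'] {
   margin-left: 2;
}
</style>
```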

@mixa72

mixa72 commented Jul 9, 2018

I think I understand now: since "apply-document-specific-stylesheets" is currently disabled, I'll have to test the brl:class attribute via an external stylesheet (SCSS), right?

@bertfrees
Contributor Author

Indeed.

@bertfrees
Contributor Author

However I think there is another issue, which might also explain why the elements like brl:v-form, brl:num etc. don't work. I'm investigating it now.

@bertfrees
Contributor Author

Never mind, forget that last comment.

@bertfrees
Contributor Author

OK so I've added the "apply-document-specific-stylesheets" option and that solves the brl:class issue.

All the other issues occur because brl:* elements are not valid in HTML, and as a result the prefixes are removed in the load step. (brl:* attributes are also invalid, but there the prefixes are retained.) One solution is to make the translator and the style sheets work regardless of whether the "brl:" prefix is present. It would of course be better to create valid HTML, for example by using epub:type or class attributes.
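
As a rough illustration of the valid-HTML option, the markup could take a shape like the one below. This is hypothetical: the class and epub:type values and the element contents are placeholders chosen for the example, not the agreed SBS EPUB 3 format, and epub:type assumes the epub namespace is declared on the root element.

```html
<!-- DTBook-style element (not valid in HTML): <brl:num>5.</brl:num> -->
<!-- Placeholder class name, for illustration only -->
<span class="num">5.</span>

<!-- DTBook-style element (not valid in HTML): <brl:name>Example Name</brl:name> -->
<!-- Placeholder epub:type value, for illustration only -->
<span epub:type="z3998:personal-name">Example Name</span>
```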

Another issue I found in your EPUB is that it uses <list type="pl">. In EPUB use <ul style="list-style-type: none"> instead. NLB has a "list-style-type-none" class for it:

.list-style-type-none {
    list-style-type: none;
}
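
In the markup, that class would be applied roughly like this (the list items are placeholders):

```html
<!-- replaces the DTBook <list type="pl"> -->
<ul class="list-style-type-none">
  <li>First item</li>
  <li>Second item</li>
</ul>
```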

@mixa72

mixa72 commented Jul 9, 2018

OK. I'll adjust my EPUB accordingly. Thanks!

@mixa72

mixa72 commented Jul 10, 2018

BTW is the apply-document-specific-stylesheets option also visible in the GUI or just available in the background?

@bertfrees
Contributor Author

Yes it will be visible in the GUI.

@bertfrees
Contributor Author

bertfrees commented Jul 19, 2018

Done.

I had to make some small adjustments to the EPUB in order to make it behave exactly as the DTBook: see chapter.xhtml.

@mixa72

mixa72 commented Jul 20, 2018

EPUB3 to PEF Conversion works now. All the above mentioned inline elements are translated as in the DTB to PEF Conversion. CSS Support for stylesheets inside EPUB3 also works. Thanks.

The embedded braille rendition from the EPUB3 to EPUB3 conversion differs a bit from the output in the PEF in that some inline elements are not translated accordingly:
brl:num (ordinal, phone, isbn, measure, fraction, mixed)
em (strong) (brl:emph)
brl:date
brl:time
brl:name
The brl:select element should only render the braille in the corresponding grade (not each literal element).
The rest looks good.

@bertfrees
Contributor Author

Yes, I haven't applied the fix to the epub3-to-epub3 script yet. We'll track that in issue #58.

@mixa72

mixa72 commented Jul 20, 2018

OK. Then I'll close this issue now.

@mixa72 mixa72 closed this as completed Jul 20, 2018