Static site generation: Ant task for previewing articles #38

Draft
wants to merge 200 commits into
base: encoding_workflow


amclark42
Contributor

@amclark42 amclark42 commented Apr 24, 2023

First steps to creating a static site generator using Apache Ant. The build file has several stub Ant tasks, but the only one that works is the `previewArticle` task, which is also the default. This task prompts for an article ID (e.g. "000612"), then runs the XSLT to create an HTML preview of that article. To test, run `ant -lib common/lib` or `ant -lib common/lib previewArticle` from the repository main directory, and fill in an article identifier when prompted. An alternative is to set the `article.id` property when calling Ant, e.g. `ant -lib common/lib -Darticle.id=000612`. The HTML will be saved to `dhq-journal/dhq-preview/ARTICLE-ID.html`.
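
The prompt-then-transform behavior described above could be sketched in Ant roughly as follows. This is a minimal illustration under stated assumptions, not the build file from this pull request: the project name, the source path `articles/${article.id}/${article.id}.xml`, and the stylesheet location `common/xslt/template_article.xsl` are guesses, and the real task likely passes more parameters.

```xml
<project name="dhq-preview-sketch" default="previewArticle" basedir=".">
  <target name="previewArticle">
    <!-- Prompt for the ID. Ant skips the prompt if article.id is already
         set, e.g. via -Darticle.id=000612 on the command line. -->
    <input message="Article ID (e.g. 000612):" addproperty="article.id"/>

    <!-- The in/style paths are assumed for illustration only. -->
    <xslt in="articles/${article.id}/${article.id}.xml"
          out="dhq-preview/${article.id}.html"
          style="common/xslt/template_article.xsl">
      <!-- Ask Ant to use Saxon HE instead of the JDK's built-in processor. -->
      <factory name="net.sf.saxon.TransformerFactoryImpl"/>
      <!-- $id is one of the stylesheet parameters mentioned in this PR. -->
      <param name="id" expression="${article.id}"/>
    </xslt>
  </target>
</project>
```

Because `<input>` does nothing when its `addproperty` target already has a value, supplying `-Darticle.id=000612` bypasses the prompt without any extra logic in the target.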

Changes:

  • Added the Saxon HE JAR (v11.5) to common/lib, as well as Saxon's dependencies and licenses
  • Added dhq-preview to the .gitignore file
  • Started a new XSLT common/xslt/generate_static_articles.xsl, which will use the DHQ table of contents to fill out the static site HTML with transformed articles
  • Small tweaks to head.xsl:
    • Add a parameter $assets-path which points to the DHQ assets directory. This value is used for <link>s and <script>s within the HTML <head>
    • Take the first //titleStmt/title to fill in the HTML <title>
  • Small tweaks to template_article.xsl:
    • Commented out parameters $vol, $issue and $id, since they are already set (with default values) in the included XSLT dhq2html.xsl
    • Added default value for parameter $fpath, which uses volume, issue, and ID values to figure out the article's filepath
    • Use included parameter $vol_no_zeroes from dhq2html.xsl when checking to make sure an article is in the TOC

Possible issues:

  • Still needs testing on Windows machines. An earlier concern was that the Ant task might not work there because the directory separator was a forward slash; the build file now uses the file.separator property for OS-agnostic filepaths
  • The previewArticle task does not fail on transformation error, which may not be helpful in the long run
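
Both points in the list above can be illustrated with a small Ant sketch. The property names `preview.dir`, `src.file`, and `stylesheet` are hypothetical and would need to be supplied by the caller; only `file.separator`, `failOnError`, and `failOnTransformationError` are standard Ant or Java names.

```xml
<project name="dhq-paths-sketch" default="showPaths" basedir=".">
  <!-- ${file.separator} is a Java system property, so it resolves to "/"
       or "\" depending on the operating system Ant is running on. -->
  <property name="preview.dir"
            value="${basedir}${file.separator}dhq-preview"/>

  <target name="showPaths">
    <echo message="Previews would be written to ${preview.dir}"/>
  </target>

  <!-- Hypothetical stricter variant: stop the build if the XSLT fails.
       Expects src.file, stylesheet, and article.id to be set by the caller. -->
  <target name="strictPreview">
    <xslt in="${src.file}"
          out="${preview.dir}${file.separator}${article.id}.html"
          style="${stylesheet}"
          failOnError="true"
          failOnTransformationError="true"/>
  </target>
</project>
```

Whether a hard failure is preferable to a logged warning is a judgment call for the DHQ team; the sketch only shows where that switch lives.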

Only working task so far is to generate a preview version of a
given article.
Some of the parameters defined in template_article.xsl already had
fallback values in imported stylesheets.

head.xsl can take an optional $assets-path.
The XSLT now passes parameters onto template_article.xsl, and
generates a mapping between source and static article directories.

Fix bug in head.xsl: when multiple <title>s are provided, only the
first is used.
@amclark42 amclark42 requested a review from jawalsh April 24, 2023 14:40
The resulting build file is hard to read, but would hopefully make
the Ant tasks usable on Windows machines.
Since the xmlresolver JAR needs to be loaded when Ant starts up,
the new target tests for the required Java class and, if it's not
available, provides instructions for running Ant with the JAR. The
new task is a dependency for the targets which use Saxon HE.
@amclark42
Contributor Author

amclark42 commented May 2, 2023

Thanks to @joelsjlee, we discovered that Saxon HE requires that the xmlresolver JAR be available in the Java environment, or the Ant transformation fails with an ambiguous error:

[xslt] Caught an error during transformation: java.lang.reflect.InvocationTargetException

While Ant provides two methods for setting the classpath for an <xslt> task, neither actually pulls in the xmlresolver JAR. Saxon's documentation says that these methods are "unreliable", stating "the safest approach is to ensure that the Jar files needed to run Saxon are present on the externally-specified classpath (the classpath at the point where Ant is invoked), rather than relying on the task-specific classpath." Getting access to the JAR therefore requires user intervention at a technical level, such as by placing the JAR in ~/.ant/lib.

This Ant build file is intended for use by people who may not have a lot of confidence with the command line or Java's complex needs, and so I added a check for the Java class org.xmlresolver.Resolver. If the class is available, transformation can occur. If not, the new checkXmlResolver task describes how to run Ant with the -lib option, e.g. ant -lib common/lib. (On Windows machines, the forward slash will display as a backward slash.)
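
A class check of this kind might look something like the sketch below. It is not the actual `checkXmlResolver` target from this pull request: the property name `xmlresolver.available` and the wording of the message are assumptions, and the `previewArticle` body is a placeholder.

```xml
<project name="dhq-resolver-sketch" default="previewArticle" basedir=".">
  <!-- Sets the (hypothetical) property only if the class can be loaded
       from the classpath that Ant was started with. -->
  <available classname="org.xmlresolver.Resolver"
             property="xmlresolver.available"/>

  <target name="checkXmlResolver">
    <!-- Stop early with instructions instead of letting the transformation
         fail later with an InvocationTargetException. -->
    <fail unless="xmlresolver.available">The xmlresolver JAR was not found on Ant's classpath.
Re-run Ant with the -lib option, for example:
  ant -lib common${file.separator}lib</fail>
  </target>

  <!-- Any target that uses Saxon HE depends on the check. -->
  <target name="previewArticle" depends="checkXmlResolver">
    <echo message="xmlresolver is available; the transformation can run."/>
  </target>
</project>
```

Using `${file.separator}` inside the message is what would make the suggested command show a backslash on Windows.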

Files are copied into the static site directory and static HTML is
wrapped in DHQ trappings.
When restored, `generateSite` will try to find a derived build file
"article-mapper.xml" which maps article directories from their
source to the expected web directories.
@amclark42 amclark42 linked an issue May 12, 2023 that may be closed by this pull request
14 tasks
@amclark42 amclark42 marked this pull request as draft May 12, 2023 20:57
amclark42 and others added 10 commits May 22, 2023 09:46
The new task is intended to create a standalone HTML preview of an
article, which can be sent to the article's author(s) for proofing.
Default Ant task is back to 'previewArticle'
To reduce confusion and duplication, "toDir.base" now refers to the
path to the "dhq-static" directory, which contains a directory of
static files AND the derived Ant file AND the compressed ZIP.
"toDir.static" refers more specifically to the directory of static
files within toDir.base.

Also, added an XSL message to show progress through the TOC.
…works EXCEPT that we have not generated the individual volume/issue index pages yet.
amclark42 and others added 30 commits July 22, 2024 15:34
The custom stylesheet for 000150 had a duplicate template which
overrode the usual sidebar nav. I've skimmed it and there doesn't
seem to be anything special about it, so I commented it out. Better
to maintain the sidebar navigation in one place, not two.
…preview-site

Make internal preview version of DHQ for proofing the static site
Added an introductory comment with changelog.
The stylesheet now does a better job of handling language changes
and Unicode characters with accent marks etc. However, the output
doesn't yet work with author_sort.xsl.
Rather than relying on the existence of <dhq:family>,
<dhq:author_name> now tries to apply templates on that element.
This fixes an outstanding bug where organizational authors (e.g.
"DHQ editorial team" in 000493) were described with leading commas,
separating the name from a family name that didn't exist.

Starting to set up to do sorting within author_index.xsl, rather
than requiring another step with author_sort.xsl (which works now,
btw).
The navigation aids haven't been added yet, but the author entries
look great and appear to be sorting correctly.

generate_static_issues.xsl does not transform the results of
author_index.xsl with author_sort.xsl.
author_index.xsl now includes a navigation bar (redesigned for
accessibility), and headings for the alphabetical groupings of
authors.

The navbar is now represented with a list inside <nav>, rather than
a table. I've adjusted the CSS to make the navbar look much like it
used to, though it will now wrap on smaller screens.
Fixes bug due to ADHO's sort key starting with an underscore.

Also, added a bunch more comments.
As I put in a comment, this isn't an ideal solution. I'd be more
comfortable separating the heading from the link, except that I'd
need buy-in from the DHQ team. So instead: a slightly more
accessible implementation of what DHQ already had, plus a comment
musing about how one could do better.
I expect these will need to be used in other stylesheets, so I'm
preparing to move them to a common stylesheet.
Quotation marks and language changes are carried into the HTML
output of the author index. I've added a function from dhq2html.xsl
into common-components.xsl to determine which quotation marks get
used.

Also, whitespace is stripped from the end of the last text node
inside `//titleStmt/title`. This solves a common issue where a
single space would appear between the title and the comma following
it.
The title index will also need to be able to generate a link from
an article title. It's easier to maintain that logic in a single
place.
The title_sort.xsl phase has been removed from
generate_static_issues.xsl, but hasn't yet been added to the main
stylesheet. The sort key also needs to be adjusted to factor in
leading stopwords.
The implementation works for titles with language codes placed
directly on them. It will fail for titles that are mostly in
English but which lead with some text marked as being in another
language.

Also, moved the logic for generating links to articles into a
common function.
DHQ doesn't have any articles fitting this use case, so I wrote a
test to make sure this works. (It does.) I'll comment it out or
delete it later.
It looks like `expression()` was used prior to Internet Explorer 8.

In dhq_screen.css, I replaced these with CSS media queries, which
are a standardized way of capturing the same behavior (no greater-
than symbol necessary).

In dhq.css, I used `width` and `max-width` rules to capture what I
_think_ the expression was trying to do. We don't need to support
IE6 anymore, I hope.

I also removed the full-page background image in favor of a
background color and border on the sidebar. The static site now
looks much the same as the Cocoon site.
Fixes problem discovered by @jawalsh where the Javascript did not
load when previewing only the Internal Preview area.
After inquiries by @sydb and @jawalsh, I decided the stylesheets
needed more explanation about how `$path_to_home` should be set and
where it is used.
…make-standalone

Make static site standalone
Development

Successfully merging this pull request may close these issues.

Set up static site generator for DHQ
5 participants