static search with proper look-and-feel #91

Open
wants to merge 107 commits into
base: static_site_generation
from

Conversation

sydb
Contributor

@sydb sydb commented Oct 29, 2024

No description provided.

sydb and others added 30 commits February 19, 2024 09:11
 * Separate out compression of generated site into new target so that I do not have to wait for ZIPping every time I test
 * Update help output to match
 * Fix typo in topnavigation
 * mostly documentation updated, but also
 * made "help" the default target
 * rename the *.NOT files back
This commit adds the University of Victoria Endings Project Static Search codebase, version 1.4.5, with no mods. Thus this version of DHQ requires ant-contrib be installed on the local machine in order to build the "generateSearchable" target.
…tever dir hold the staticSearch version you want to use
Only working task so far is to generate a preview version of a
given article.
Some of the parameters defined in template_article.xsl already had
fallback values in imported stylesheets.

head.xsl can take an optional $assets-path.
The XSLT now passes parameters onto template_article.xsl, and
generates a mapping between source and static article directories.

Fix bug in head.xsl: when multiple <title>s are provided, only the
first is used.
The resulting build file is hard to read, but would hopefully make
the Ant tasks usable on Windows machines.
Since the xmlresolver JAR needs to be loaded when Ant starts up,
the new target tests for the required Java class and, if it's not
available, provides instructions for running Ant with the JAR. The
new task is a dependency for the targets which use Saxon HE.
Files are copied into the static site directory and static HTML is
wrapped in DHQ trappings.
When restored, `generateSite` will try to find a derived build file
"article-mapper.xml" which maps article directories from their
source to the expected web directories.
The new task is intended to create a standalone HTML preview of an
article, which can be sent to the article's author(s) for proofing.
Default Ant task is back to 'previewArticle'
To reduce confusion and duplication, "toDir.base" now refers to the
path to the "dhq-static" directory, which contains a directory of
static files AND the derived Ant file AND the compressed ZIP.
"toDir.static" refers more specifically to the directory of static
files within toDir.base.

Also, added an XSL message to show progress through the TOC.
…works EXCEPT that we have not generated the individual volume/issue index pages yet.
sydb and others added 25 commits April 3, 2024 14:22
I had a conversation with Martin a few weeks ago and learned that "raw" is the generally
accepted scoring algorithm and (to my surprise) "tf-idf" is considered somewhat experimental.
Also decided to just leave JSON indentation on during development. We probably want to turn it
off in production, so files are smaller.
site generation branch (which has just gotten the latest from main, itself)
These files often shadow the actual HTML files we want to index (I think because they
are one level above the desired files with the same name), and even when they do not
shadow the desired file, these preview files are files we do not want to index. So
delete them first.
 * Show only 5 (not 12) hits.
 * Use the word “hits” instead of “Score” to describe how many hits.
 * Because the above improvement required a change in the staticSearch/ code itself, add
   a new README file to document what change needs to be made each time a new version of
   UVEPSS is installed.
Starting point for advanced search page. This page undergoes processing by DHQ and staticSearch XSLT to generate user-facing advanced search page.
Add links to ssHighlight.js
 * new recommendation code from @joelsjlee
 * tweaks to dhq2html we made this morning
and maybe other stuff, at least, in theory.
Add some additional styling to search features.
difficult manual merges (to build.xml and common/xslt/head.xsl).
@sydb sydb requested a review from amclark42 October 29, 2024 15:05
sydb and others added 3 commits November 5, 2024 11:23
Including not loading staticSearch highlighting JavaScript when in proofing mode
1) Update to v. 1.4.9 of staticSearch
2) No longer use symlink and numbered version directory
3) Change a few keywords from capitalized to lower-case.
@sydb
Contributor Author

sydb commented Nov 12, 2024

I have done a bunch of ad hoc testing of the generateSearchable output, and then ran the following methodical tests on this branch.

  1. Cloned the repo, called it ./dhq-journal_ssg/.
  2. Made a copy, called it ./dhq-journal_jaw/.
  3. In ./dhq-journal_ssg/ checked out the static_site_generation branch.
  4. In ./dhq-journal_jaw/ checked out the jawalsh_uvepss_01 branch.
  5. Each of the following commands was issued in each of the two directories.
  6. ant -lib common/lib/saxon -Darticle.id=000334 previewArticle
    • The output from ant is essentially identical
    • The HTML files differ in CSS, fields, and the search button
  7. ant -lib common/lib/saxon -Ddo.proofing.full=false makeInternalPreview
    • The output from ant is essentially identical
    • Not surprisingly all the HTML files in dhq-static_???/dhq-proofing/editorial/ are different, but it looks like the vast majority of differences are in ID values.
    • I guess not surprisingly, all the .xhtml and .xml files are the same.
  8. ant -lib common/lib/saxon -Ddo.proofing.full=true makeInternalPreview
    • The output from ant is mostly identical, but
      • the msg from target zipArticleXml is different, and
      • the generateSite target created an (empty) directory at ./dhq-journal/dhq in the jawalsh_uvepss_01 but not the static_site_generation branch.
    • Of the nearly 6,000 output files, fewer than 1,000 are different
      • I took a quick scan of the differences in one case: they seemed to be almost all CSS, fields, & ID values
  9. ant -lib common/lib/saxon generateIssues
    • The output from ant is essentially identical
    • I spot-checked one of the 180 HTML files that differ; the differences again seemed to be almost all CSS, fields, & ID values
  10. ant -lib common/lib/saxon generateSite
    • The output from ant is mostly identical, but
      • the msg from target zipArticleXml is different
    • I spot-checked one of the 902 HTML files that differ; the differences again seemed to be almost all CSS, fields, & ID values
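
The file-by-file comparisons in steps 6 through 10 could be automated along the following lines. This is only a sketch, not the procedure actually used for the tests above: the function names `differing_files` and `summarize_by_extension` are hypothetical, and it assumes the output trees for both branches have already been built on disk.

```python
import filecmp
import os
from collections import Counter


def differing_files(left_root, right_root):
    """Yield relative paths of files that exist in both trees but differ.

    Files present in only one tree are ignored here; this sketch only
    answers "which shared output files changed between the branches?"
    """
    for dirpath, _dirnames, filenames in os.walk(left_root):
        for name in filenames:
            left = os.path.join(dirpath, name)
            rel = os.path.relpath(left, left_root)
            right = os.path.join(right_root, rel)
            # shallow=False forces a byte-by-byte comparison instead of
            # trusting file size and modification time.
            if os.path.isfile(right) and not filecmp.cmp(left, right, shallow=False):
                yield rel


def summarize_by_extension(paths):
    """Count differing files per extension (e.g. '.html', '.xml')."""
    return Counter(os.path.splitext(p)[1].lower() or "(none)" for p in paths)
```

Calling `summarize_by_extension(differing_files(dir_a, dir_b))` on the two generated output directories (whatever paths the local builds use) would give a per-extension count of changed files, which makes it easy to confirm observations such as "all the .xhtml and .xml files are the same". It does not say what changed inside a file, so spot-checking individual diffs is still needed.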


3 participants