Commit

Built site for gh-pages
Quarto GHA Workflow Runner committed Aug 9, 2024
1 parent f5091a1 commit 94b665a
Showing 15 changed files with 49 additions and 49 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
@@ -1 +1 @@
-234bf980
+5acb0938
2 changes: 1 addition & 1 deletion content/ComparingResults-Options1&2.html
@@ -207,7 +207,7 @@
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../content/Text_Difference_Checker.html" class="sidebar-item-text sidebar-link">
-<span class="menu-text">difflib Transkribus Output Text Checker</span></a>
+<span class="menu-text">difflib Transkribus Output Text Checker Notebook</span></a>
</div>
</li>
</ul>
@@ -204,7 +204,7 @@
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../content/Text_Difference_Checker.html" class="sidebar-item-text sidebar-link">
-<span class="menu-text">difflib Transkribus Output Text Checker</span></a>
+<span class="menu-text">difflib Transkribus Output Text Checker Notebook</span></a>
</div>
</li>
</ul>
2 changes: 1 addition & 1 deletion content/Option1.html
@@ -204,7 +204,7 @@
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../content/Text_Difference_Checker.html" class="sidebar-item-text sidebar-link">
-<span class="menu-text">difflib Transkribus Output Text Checker</span></a>
+<span class="menu-text">difflib Transkribus Output Text Checker Notebook</span></a>
</div>
</li>
</ul>
2 changes: 1 addition & 1 deletion content/Option2final.html
@@ -204,7 +204,7 @@
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../content/Text_Difference_Checker.html" class="sidebar-item-text sidebar-link">
-<span class="menu-text">difflib Transkribus Output Text Checker</span></a>
+<span class="menu-text">difflib Transkribus Output Text Checker Notebook</span></a>
</div>
</li>
</ul>
6 changes: 3 additions & 3 deletions content/PyTesseract_OCR.html
@@ -203,7 +203,7 @@
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../content/Text_Difference_Checker.html" class="sidebar-item-text sidebar-link">
-<span class="menu-text">difflib Transkribus Output Text Checker</span></a>
+<span class="menu-text">difflib Transkribus Output Text Checker Notebook</span></a>
</div>
</li>
</ul>
@@ -3666,8 +3666,8 @@ <h1>bj170wc5114</h1>
</script>
<nav class="page-navigation">
<div class="nav-page nav-page-previous">
-<a href="../content/Text_Difference_Checker.html" class="pagination-link" aria-label="difflib Transkribus Output Text Checker">
-<i class="bi bi-arrow-left-short"></i> <span class="nav-page-text">difflib Transkribus Output Text Checker</span>
+<a href="../content/Text_Difference_Checker.html" class="pagination-link" aria-label="difflib Transkribus Output Text Checker Notebook">
+<i class="bi bi-arrow-left-short"></i> <span class="nav-page-text">difflib Transkribus Output Text Checker Notebook</span>
</a>
</div>
<div class="nav-page nav-page-next">
4 changes: 2 additions & 2 deletions content/Text_Difference_Checker.html
@@ -117,7 +117,7 @@
<button type="button" class="quarto-btn-toggle btn" data-bs-toggle="collapse" role="button" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<i class="bi bi-layout-text-sidebar-reverse"></i>
</button>
-<nav class="quarto-page-breadcrumbs" aria-label="breadcrumb"><ol class="breadcrumb"><li class="breadcrumb-item"><a href="../content/image_gallery.html">Text Checker Outputs</a></li><li class="breadcrumb-item"><a href="../content/Text_Difference_Checker.html">difflib Transkribus Output Text Checker</a></li></ol></nav>
+<nav class="quarto-page-breadcrumbs" aria-label="breadcrumb"><ol class="breadcrumb"><li class="breadcrumb-item"><a href="../content/image_gallery.html">Text Checker Outputs</a></li><li class="breadcrumb-item"><a href="../content/Text_Difference_Checker.html">difflib Transkribus Output Text Checker Notebook</a></li></ol></nav>
<a class="flex-grow-1" role="navigation" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
</a>
<button type="button" class="btn quarto-search-button" aria-label="Search" onclick="window.quartoOpenSearch();">
@@ -204,7 +204,7 @@
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../content/Text_Difference_Checker.html" class="sidebar-item-text sidebar-link active">
-<span class="menu-text">difflib Transkribus Output Text Checker</span></a>
+<span class="menu-text">difflib Transkribus Output Text Checker Notebook</span></a>
</div>
</li>
</ul>
2 changes: 1 addition & 1 deletion content/experiments-preview.html
@@ -300,7 +300,7 @@ <h6><i class="bi bi-journal-code"></i> experiments</h6>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../content/Text_Difference_Checker.html" class="sidebar-item-text sidebar-link">
-<span class="menu-text">difflib Transkribus Output Text Checker</span></a>
+<span class="menu-text">difflib Transkribus Output Text Checker Notebook</span></a>
</div>
</li>
</ul>
2 changes: 1 addition & 1 deletion content/experiments.html
@@ -209,7 +209,7 @@
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../content/Text_Difference_Checker.html" class="sidebar-item-text sidebar-link">
-<span class="menu-text">difflib Transkribus Output Text Checker</span></a>
+<span class="menu-text">difflib Transkribus Output Text Checker Notebook</span></a>
</div>
</li>
</ul>
10 changes: 5 additions & 5 deletions content/experiments.out.ipynb
@@ -6,7 +6,7 @@
"source": [
"# experiments"
],
-"id": "2dbc8ac2-cbb8-4b0a-abea-79a7b9f48fb3"
+"id": "bf3cfde9-f8d0-4aad-bbb5-a8bc7a08b2cd"
},
{
"cell_type": "code",
@@ -59,7 +59,7 @@
"\n",
"# Results"
],
-"id": "42854845-b54c-412f-9d39-dbda2ae843fb"
+"id": "26e51538-4c3f-4e3c-bc00-6a03f9c5cbc1"
},
{
"cell_type": "code",
@@ -111,7 +111,7 @@
"Store, 5. Retrieve (Vector store query). We came up with four main\n",
"options (below) with some possible variations (see the yellow arrows)."
],
-"id": "03e56a3a-b2f6-494d-b364-7793716d0549"
+"id": "268a6ec4-c684-4e77-9f97-878591238fb1"
},
{
"cell_type": "code",
@@ -219,7 +219,7 @@
"}\n",
"`"
],
-"id": "fc1d0535-2d11-4d5a-9147-70215e67dfe1"
+"id": "a1678600-c031-4a8a-8ad2-1cba3d6e9bcb"
},
{
"cell_type": "code",
@@ -235,7 +235,7 @@
"source": [
"neato = require(\"@observablehq/[email protected]\")"
],
-"id": "ba6830a0-0572-4db4-9182-ffe40a28c6f4"
+"id": "4094ee4c-68a7-4f5b-a798-24694632520e"
}
],
"nbformat": 4,
8 changes: 4 additions & 4 deletions content/image_gallery.html
@@ -170,7 +170,7 @@
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../content/Text_Difference_Checker.html" class="sidebar-item-text sidebar-link">
-<span class="menu-text">difflib Transkribus Output Text Checker</span></a>
+<span class="menu-text">difflib Transkribus Output Text Checker Notebook</span></a>
</div>
</li>
</ul>
@@ -230,7 +230,7 @@ <h2 class="anchored" data-anchor-id="motivation-for-checker">Motivation for Chec
</ul>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
-<p><img src="bv172tc9618/0001_bv172tc9618_0001_diff.jpg" class="img-fluid figure-img"></p>
+<p><img src="bv172tc9618/0001_bv172tc9618_0001_diff.jpg" class="img-fluid figure-img" width="600"></p>
<figcaption>bv172tc9618_0001</figcaption>
</figure>
</div>
@@ -810,8 +810,8 @@ <h2 class="anchored" data-anchor-id="motivation-for-checker">Motivation for Chec
</a>
</div>
<div class="nav-page nav-page-next">
-<a href="../content/Text_Difference_Checker.html" class="pagination-link" aria-label="difflib Transkribus Output Text Checker">
-<span class="nav-page-text">difflib Transkribus Output Text Checker</span> <i class="bi bi-arrow-right-short"></i>
+<a href="../content/Text_Difference_Checker.html" class="pagination-link" aria-label="difflib Transkribus Output Text Checker Notebook">
+<span class="nav-page-text">difflib Transkribus Output Text Checker Notebook</span> <i class="bi bi-arrow-right-short"></i>
</a>
</div>
</nav>
2 changes: 1 addition & 1 deletion content/plan.html
@@ -170,7 +170,7 @@
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../content/Text_Difference_Checker.html" class="sidebar-item-text sidebar-link">
-<span class="menu-text">difflib Transkribus Output Text Checker</span></a>
+<span class="menu-text">difflib Transkribus Output Text Checker Notebook</span></a>
</div>
</li>
</ul>
2 changes: 1 addition & 1 deletion index.html
@@ -169,7 +169,7 @@
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./content/Text_Difference_Checker.html" class="sidebar-item-text sidebar-link">
-<span class="menu-text">difflib Transkribus Output Text Checker</span></a>
+<span class="menu-text">difflib Transkribus Output Text Checker Notebook</span></a>
</div>
</li>
</ul>
2 changes: 1 addition & 1 deletion search.json
@@ -147,7 +147,7 @@
"text": "Inspiration for this notebook: https://medium.com/@zhangkd5/a-tutorial-for-difflib-a-powerful-python-standard-library-to-compare-textual-sequences-096d52b4c843\n\nfrom difflib import HtmlDiff, SequenceMatcher\n\n# Define the paths to the text files\npath_to_file_a = '0003_bv172tc9618_0003_print.txt'\npath_to_file_b = '0003_bv172tc9618_0003_print.txt'\n\n# Read the contents of the files\nwith open(path_to_file_a, 'r', encoding='utf-8') as file:\n a = file.read()\n\nwith open(path_to_file_b, 'r', encoding='utf-8') as file:\n b = file.read()\n\n# Calculate the similarity ratio using SequenceMatcher\nseq_match = SequenceMatcher(None, a, b)\nratio = seq_match.ratio()\n\n# Create the HtmlDiff object\nd = HtmlDiff()\n\n# Generate the HTML diff\nhtml_diff = d.make_file(a.splitlines(), b.splitlines())\n\n# Add the similarity ratio to the HTML diff\nratio_html = f\"&lt;h2&gt;Similarity Ratio: {ratio:.2f}&lt;/h2&gt;\\n\" + html_diff\n\n# Save the diff to an HTML file\nwith open('0003_bv172tc9618_0003_print_verify.html', 'w', encoding='utf-8') as f:\n f.write(ratio_html)\n\n\nScript to read in directories (folders) for each paper\n\nimport os\nfrom difflib import HtmlDiff, SequenceMatcher\n\n# Define the paths to the directories\npath_to_folder_a = './PrintModelText/PrintModelText/zz472cp8582_jpg/txt'\npath_to_folder_b = './PrivateModel_V3/PrivateModel_V3/zz472cp8582_jpg_v3/txt'\n\n# Create the HtmlDiff object\nd = HtmlDiff()\n\n# Get the list of files in each directory\nfiles_a = sorted(os.listdir(path_to_folder_a))\nfiles_b = sorted(os.listdir(path_to_folder_b))\n\n# Ensure both directories have the same number of files\nif len(files_a) != len(files_b):\n print(\"Error: The directories do not contain the same number of files.\")\n exit() # exits program is both directories dont have the same number of files\n\n# Compare each pair of files\nfor file_a, file_b in zip(files_a, files_b):\n # Read the contents of the files\n with open(os.path.join(path_to_folder_a, 
file_a), 'r', encoding='utf-8') as fa:\n a = fa.read()\n \n with open(os.path.join(path_to_folder_b, file_b), 'r', encoding='utf-8') as fb:\n b = fb.read()\n \n # Calculate the similarity ratio using SequenceMatcher\n seq_match = SequenceMatcher(None, a, b)\n ratio = seq_match.ratio() #calculates the similarity ratios between a & b\n ratio_percentage = ratio * 100\n \n # Generate the HTML diff report on the differences. splitlines() is used to split the content into individual lines\n html_diff = d.make_file(a.splitlines(), b.splitlines())\n \n # Add the similarity ratio to the HTML diff. Ratio is formatted to two decimal places\n ratio_html = f\"&lt;h2&gt;Similarity Ratio: {ratio_percentage:.2f}%&lt;/h2&gt;\\n\" + html_diff\n \n # Define the output file name\n output_file_name = os.path.splitext(file_a)[0] + '_diff.html'\n \n # Save the diff to an HTML file\n with open(output_file_name, 'w', encoding='utf-8') as f:\n f.write(ratio_html)\n \n print(f\"Processed: {output_file_name}\")\n\nprint(\"All files have been processed.\")\n\nProcessed: 0001_zz472cp8582_0001_diff.html\nProcessed: 0002_zz472cp8582_0002_diff.html\nProcessed: 0003_zz472cp8582_0003_diff.html\nProcessed: 0004_zz472cp8582_0004_diff.html\nProcessed: 0005_zz472cp8582_0005_diff.html\nProcessed: 0006_zz472cp8582_0006_diff.html\nProcessed: 0007_zz472cp8582_0007_diff.html\nProcessed: 0008_zz472cp8582_0008_diff.html\nProcessed: 0009_zz472cp8582_0009_diff.html\nProcessed: 0010_zz472cp8582_0010_diff.html\nProcessed: 0011_zz472cp8582_0011_diff.html\nProcessed: 0012_zz472cp8582_0012_diff.html\nProcessed: 0013_zz472cp8582_0013_diff.html\nProcessed: 0014_zz472cp8582_0014_diff.html\nProcessed: 0015_zz472cp8582_0015_diff.html\nProcessed: 0016_zz472cp8582_0016_diff.html\nProcessed: 0017_zz472cp8582_0017_diff.html\nAll files have been processed.\n\n\n\n\nScript to print out Similarity Ratio (%) for each page per paper\n\nimport os\nfrom difflib import HtmlDiff, SequenceMatcher\n\n# Define the paths to the 
directories\npath_to_folder_a = './PrintModelText/PrintModelText/bv172tc9618_jpg/txt'\npath_to_folder_b = './PrivateModel_V3/PrivateModel_V3/bv172tc9618_jpg_v3/txt'\n\n# Create the HtmlDiff object\nd = HtmlDiff()\n\n# Get the list of files in each directory\nfiles_a = sorted(os.listdir(path_to_folder_a))\nfiles_b = sorted(os.listdir(path_to_folder_b))\n\n# Ensure both directories have the same number of files\nif len(files_a) != len(files_b):\n print(\"Error: The directories do not contain the same number of files.\")\n exit()\n\n# Store results for the table\nresults = []\n\n# Compare each pair of files\nfor file_a, file_b in zip(files_a, files_b):\n # Read the contents of the files\n with open(os.path.join(path_to_folder_a, file_a), 'r', encoding='utf-8') as fa:\n a = fa.read()\n \n with open(os.path.join(path_to_folder_b, file_b), 'r', encoding='utf-8') as fb:\n b = fb.read()\n \n # Calculate the similarity ratio using SequenceMatcher\n seq_match = SequenceMatcher(None, a, b)\n ratio = seq_match.ratio()\n ratio_percentage = ratio * 100\n \n # Generate the HTML diff\n html_diff = d.make_file(a.splitlines(), b.splitlines())\n \n # Add the similarity ratio to the HTML diff\n ratio_html = f\"&lt;h2&gt;Similarity Ratio: {ratio_percentage:.2f}%&lt;/h2&gt;\\n\" + html_diff\n \n # Define the output file name\n output_file_name = os.path.splitext(file_a)[0] + '_comp.html'\n \n # Save the diff to an HTML file\n with open(output_file_name, 'w', encoding='utf-8') as f:\n f.write(ratio_html)\n \n # Print out the processed file\n print(f\"Processed: {output_file_name}\")\n \n # Append the result to the list\n file_base_name = os.path.splitext(file_a)[0]\n results.append((file_base_name, ratio_percentage))\n\nprint(\"All files have been processed.\")\n\n# Print out the table\nprint(\"\\nComparison Results:\")\nprint(f\"{'File Name and Page Number':&lt;30} {'Similarity Ratio (%)':&lt;20}\")\nprint(\"=\" * 50)\nfor result in results:\n print(f\"{result[0]:&lt;30} 
{result[1]:&lt;20.2f}\")\n\n\nProcessed: 0001_bv172tc9618_0001_comp.html\nProcessed: 0002_bv172tc9618_0002_comp.html\nProcessed: 0003_bv172tc9618_0003_comp.html\nProcessed: 0004_bv172tc9618_0004_comp.html\nProcessed: 0005_bv172tc9618_0005_comp.html\nProcessed: 0006_bv172tc9618_0006_comp.html\nProcessed: 0007_bv172tc9618_0007_comp.html\nProcessed: 0008_bv172tc9618_0008_comp.html\nProcessed: 0009_bv172tc9618_0009_comp.html\nProcessed: 0010_bv172tc9618_0010_comp.html\nProcessed: 0011_bv172tc9618_0011_comp.html\nProcessed: 0012_bv172tc9618_0012_comp.html\nProcessed: 0013_bv172tc9618_0013_comp.html\nProcessed: 0014_bv172tc9618_0014_comp.html\nAll files have been processed.\n\nComparison Results:\nFile Name and Page Number Similarity Ratio (%)\n==================================================\n0001_bv172tc9618_0001 15.33 \n0002_bv172tc9618_0002 8.80 \n0003_bv172tc9618_0003 19.17 \n0004_bv172tc9618_0004 37.14 \n0005_bv172tc9618_0005 9.09 \n0006_bv172tc9618_0006 39.30 \n0007_bv172tc9618_0007 4.11 \n0008_bv172tc9618_0008 27.47 \n0009_bv172tc9618_0009 3.04 \n0010_bv172tc9618_0010 11.35 \n0011_bv172tc9618_0011 8.27 \n0012_bv172tc9618_0012 50.11 \n0013_bv172tc9618_0013 21.28 \n0014_bv172tc9618_0014 44.60",
"crumbs": [
"Text Checker Outputs",
-"difflib Transkribus Output Text Checker"
+"difflib Transkribus Output Text Checker Notebook"
]
},
{