From e338f9771b27c837d433bd5dc608b1b9e37e3157 Mon Sep 17 00:00:00 2001
From: Eric Joanis <eric.joanis@nrc-cnrc.gc.ca>
Date: Thu, 20 Jun 2024 12:41:34 -0400
Subject: [PATCH 1/5] refactor(docs): automatically convert from Sphinx .rst to
 mkdocs .md

Converted using `rst2myst convert`, installed with `pip install rst-to-myst`.
---
 docs/{README.md => Contributing.md}           |   0
 docs/{advanced-use.rst => advanced-use.md}    |  81 ++-
 docs/cli-guide.md                             | 463 ++++++++++++++++
 docs/cli-guide.rst                            | 507 ------------------
 docs/cli-ref.md                               |  53 ++
 docs/cli-ref.rst                              |  39 --
 docs/index.md                                 |  24 +
 docs/index.rst                                |  26 -
 docs/installation.md                          |   5 +
 docs/installation.rst                         |   6 -
 docs/outputs.md                               |  50 ++
 docs/outputs.rst                              |  52 --
 docs/start.md                                 |  41 ++
 docs/start.rst                                |  45 --
 ...troubleshooting.rst => troubleshooting.md} |  87 +--
 15 files changed, 721 insertions(+), 758 deletions(-)
 rename docs/{README.md => Contributing.md} (100%)
 rename docs/{advanced-use.rst => advanced-use.md} (53%)
 create mode 100644 docs/cli-guide.md
 delete mode 100644 docs/cli-guide.rst
 create mode 100644 docs/cli-ref.md
 delete mode 100644 docs/cli-ref.rst
 create mode 100644 docs/index.md
 delete mode 100644 docs/index.rst
 create mode 100644 docs/installation.md
 delete mode 100644 docs/installation.rst
 create mode 100644 docs/outputs.md
 delete mode 100644 docs/outputs.rst
 create mode 100644 docs/start.md
 delete mode 100644 docs/start.rst
 rename docs/{troubleshooting.rst => troubleshooting.md} (58%)

diff --git a/docs/README.md b/docs/Contributing.md
similarity index 100%
rename from docs/README.md
rename to docs/Contributing.md
diff --git a/docs/advanced-use.rst b/docs/advanced-use.md
similarity index 53%
rename from docs/advanced-use.rst
rename to docs/advanced-use.md
index 2f6e3e23..f010c094 100644
--- a/docs/advanced-use.rst
+++ b/docs/advanced-use.md
@@ -1,27 +1,24 @@
-.. _advanced-use:
+(advanced-use)=
 
-Advanced topics
-===============
+# Advanced topics
 
-.. _adding-a-lang:
+(adding-a-lang)=
 
-Adding a new language to g2p
-----------------------------
+## Adding a new language to g2p
 
 If you want to align an audio book in a language that is not yet supported by
 the g2p library, you will have to write your own g2p mapping for that language.
 
 References:
- - The `g2p library <https://github.com/roedoejet/g2p>`__ and its
-   `documentation <https://g2p.readthedocs.io/>`__.
- - The `7-part blog post on creating g2p mappings <https://blog.mothertongues.org/g2p-background/>`__ on the `Mother Tongues Blog <https://blog.mothertongues.org/>`__.
+: - The [g2p library](https://github.com/roedoejet/g2p) and its
+    [documentation](https://g2p.readthedocs.io/).
+  - The [7-part blog post on creating g2p mappings](https://blog.mothertongues.org/g2p-background/) on the [Mother Tongues Blog](https://blog.mothertongues.org/).
 
 Once you have created a g2p mapping for your language, please consider
-`contributing it to the project <https://blog.mothertongues.org/g2p-contributing/>`__
+[contributing it to the project](https://blog.mothertongues.org/g2p-contributing/)
 so others can also benefit from your work!
 
-Pre-processing your data
-------------------------
+## Pre-processing your data
 
 Manipulating the text and/or audio data that you are trying to align can
 sometimes produce longer, more accurate ReadAlongs, that throw less
@@ -29,56 +26,52 @@ errors when aligning. While some of the most successful techniques we
 have tried are outlined here, you may also need to customize your
 pre-processing to suit your specific data.
 
-Audio pre-processing
-~~~~~~~~~~~~~~~~~~~~
+### Audio pre-processing
 
-Adding silences
-^^^^^^^^^^^^^^^
+#### Adding silences
 
 Adding 1 second segments of silence in between phrases or paragraphs
 sometimes improves the performance of the aligner. We do this using the
-`Pydub <https://github.com/jiaaro/pydub>`__ library which can be
+[Pydub](https://github.com/jiaaro/pydub) library which can be
 pip-installed. Keep in mind that Pydub uses milliseconds.
 
 If your data is currently 1 audio file, you will need to split it into
 segments where you want to put the silences.
 
-::
-
-   ten_seconds = 10 * 1000
-   first_10_seconds = soundtrack[:ten_seconds]
-   last_5_seconds = soundtrack[-5000:]
+```python
+ten_seconds = 10 * 1000
+first_10_seconds = soundtrack[:ten_seconds]
+last_5_seconds = soundtrack[-5000:]
+```
 
 Once you have your segments, create an MP3 file containing only 1 second
 of silence.
 
-::
-
-   from pydub import AudioSegment
+```python
+from pydub import AudioSegment
 
-   wfile = "appended_1000ms.mp3"
-   silence = AudioSegment.silent(duration=1000)
-   soundtrack = silence
+wfile = "appended_1000ms.mp3"
+silence = AudioSegment.silent(duration=1000)
+soundtrack = silence
+```
 
 Then you loop the audio files you want to append (segments and silence).
 
-::
-
-   seg = AudioSegment.from_mp3(mp3file)
-   soundtrack = soundtrack + silence + seg
+```python
+seg = AudioSegment.from_mp3(mp3file)
+soundtrack = soundtrack + silence + seg
+```
 
 Write the soundtrack file as an MP3. This will then be the audio input
 for your Read-Along.
 
-::
+```python
+soundtrack.export(wfile, format="mp3")
+```
 
-   soundtrack.export(wfile, format="mp3")
+### Text pre-processing
 
-Text pre-processing
-~~~~~~~~~~~~~~~~~~~
-
-Numbers
-^^^^^^^
+#### Numbers
 
 ReadAlong Studio cannot align numbers written as digits (ex. "123").
 Instead, you will need to write them out (ex. "one two three" or "one
@@ -87,10 +80,10 @@ file.
 
 If you have lots of data, and the numbers are spoken in English (or any
 of their supported languages), consider adding a library like
-`num2words <https://github.com/savoirfairelinux/num2words>`__ to your
+[num2words](https://github.com/savoirfairelinux/num2words) to your
 pre-processing.
 
-::
-
-   num2words 123456789
-   one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine
+```
+num2words 123456789
+one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine
+```
diff --git a/docs/cli-guide.md b/docs/cli-guide.md
new file mode 100644
index 00000000..a335fb91
--- /dev/null
+++ b/docs/cli-guide.md
@@ -0,0 +1,463 @@
+(cli-guide)=
+
+# Command line interface (CLI) user guide
+
+This page contains guidelines on using the ReadAlongs CLI. See also
+{ref}`cli-ref` for the full CLI reference.
+
+The ReadAlongs CLI has two main commands: `readalongs make-xml` and
+`readalongs align`.
+
+- If your data is a plain text file, you can run `make-xml` to turn
+  it into ReadAlongs XML, which you can then align with
+  `align`. Doing this in two steps allows you to modify the XML file
+  before aligning it (e.g., to mark that some text is in a different
+  language, to flag some do-not-align text, or to drop anchors in).
+- Alternatively, if your plain text file does not need to be modified, you can
+  run `align` directly on it, since it also accepts plain text input.  You'll
+  need the `-l <language(s)>` option to indicate what language your text is in.
+
+Two additional commands are sometimes useful: `readalongs tokenize` and
+`readalongs g2p`.
+
+- `tokenize` takes the output of `make-xml` and tokenizes it, wrapping each
+  word in the text in a `<w>` element.
+- `g2p` takes the output of `tokenize` and maps each word to its
+  phonetic transcription using the g2p library. The phonetic transcription is
+  represented using ARPABET phonetic codes and is added as the `ARPABET`
+  attribute of each `<w>` element.
+
+The result of `tokenize` or `g2p` can be fixed manually if necessary and
+then used as input to `align`.
+
+## Getting from TXT to XML with readalongs make-xml
+
+Run {ref}`cli-make-xml` to make the ReadAlongs XML file for `align` from a TXT file.
+
+`readalongs make-xml [options] [story.txt] [story.readalong]`
+
+`[story.txt]`: path to the plain text input file (TXT)
+
+`[story.readalong]`: path to the XML output file
+
+The input file must be plain text encoded in `UTF-8` with one
+sentence per line. Paragraph breaks are marked by a blank line, and page
+breaks are marked by two blank lines.
+
+| Key Options                    | Option descriptions                                                                                                   |
+| ------------------------------ | --------------------------------------------------------------------------------------------------------------------- |
+| `-l, --language(s)` (required) | The language code for story.txt. Specifying multiple comma- or colon-separated languages triggers {ref}`g2p-cascade`. |
+| `-f, --force-overwrite`        | Force overwrite output files (handy if you're troubleshooting and will be aligning repeatedly)                        |
+| `-h, --help`                   | Displays CLI guide for `make-xml`                                                                                     |
+
+The `-l, --language` option takes a language’s three-letter [ISO
+code](https://en.wikipedia.org/wiki/ISO_639-3) as its argument.
+
+The languages supported by RAS can be listed by running `readalongs make-xml -h`
+and they can also be found in the {ref}`cli-make-xml` reference.
+
+So, a full command for a story in Algonquin, with an implicit g2p fallback to
+Undetermined, would be something like:
+
+`readalongs make-xml -l alq Studio/story.txt Studio/story.readalong`
+
+The generated XML will be split into sentences. At this stage you can
+edit the XML to make any modifications, such as adding `do-not-align`
+as an attribute of any element in your XML.
+
+The format of the generated XML is based on [TEI
+Lite](https://tei-c.org/guidelines/customization/lite/) but is
+considerably simplified.  The DTD (document type definition) can be
+found in the ReadAlong Studio source code under
+`readalongs/static/read-along-1.0.dtd`.
+
+(dna)=
+
+### Handling mismatches: do-not-align
+
+There are two types of "do-not-align" (DNA) content: DNA audio and DNA text.
+
+To use DNA text, add `do-not-align` as an attribute to any
+element in the XML (word, sentence, paragraph, or page).
+
+```xml
+<w do-not-align="true" id="t0b0d0p0s0w0">dog</w>
+```
+
+If you have already run `readalongs make-xml`, there will be
+documentation for DNA text in comments at the beginning of the generated
+XML file.
+
+```xml
+<!-- To exclude any element from alignment, add the do-not-align="true" attribute to
+     it, e.g., <p do-not-align="true">...</p>, or
+     <s>Some text <foo do-not-align="true">do not align this</foo> more text</s> -->
+```
+
+To use DNA audio, specify in the `config.json` file the time spans, in
+milliseconds, that you want the aligner to ignore.
+
+```
+"do-not-align":
+    {
+    "method": "remove",
+    "segments":
+    [
+        {
+            "begin": 1,
+            "end": 17000
+        }
+    ]
+    }
+```
+
+#### Use cases for DNA
+
+- Spoken introduction in the audio file that has no accompanying text
+  (DNA audio)
+- Text that has no matching audio, such as credits/acknowledgments (DNA
+  text)
+
+## Aligning your text and audio with readalongs align
+
+Run {ref}`cli-align` to align a text file (RAS or TXT) and an audio file to
+create a time-aligned audiobook.
+
+`readalongs align [options] [story.txt/xml] [story.mp3/wav] [output_base]`
+
+`[story.txt/xml]`: path to the text file (plain text or ReadAlongs XML)
+
+`[story.mp3/wav]`: path to the audio file (MP3, WAV or any format
+supported by ffmpeg)
+
+`[output_base]`: path to the directory where the output files will be
+created, as `output_base*`
+
+| Key Options             | Option descriptions                                                                                                                                     |
+| ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `-l, --language(s)`     | The language code for story.txt. Specifying multiple comma- or colon-separated languages triggers {ref}`g2p-cascade`. (required if input is plain text) |
+| `-c, --config PATH`     | Use ReadAlong-Studio configuration file (in JSON format)                                                                                                |
+| `--debug-g2p`           | Display verbose g2p debugging messages                                                                                                                  |
+| `-s, --save-temps`      | Save intermediate stages of processing and temporary files (dictionary, FSG, tokenization, etc.)                                                        |
+| `-f, --force-overwrite` | Force overwrite output files (handy if you’re troubleshooting and will be aligning repeatedly)                                                          |
+| `-h, --help`            | Displays CLI guide for `align`                                                                                                                          |
+
+See above for more information on the `-l, --language` argument.
+
+A full command could be something like:
+
+`readalongs align -f -c config.json story.readalong story.mp3 story-aligned`
+
+**Is the text file plain text or XML?**
+
+`readalongs align` accepts its text input as a plain text file or a ReadAlongs XML file.
+
+- If the file name ends with `.txt`, it will be read as plain text.
+- If the file name ends with `.xml` or `.readalong`, it will be read as ReadAlongs XML.
+- With other extensions, the beginning of the file is examined to
+  automatically determine if it's XML or plain text.
+
+## Supported languages
+
+The `readalongs langs` command can be used to list all supported languages.
+
+Here is that list at the time of compiling this documentation:
+
+```{eval-rst}
+.. command-output:: readalongs langs
+```
+
+See {ref}`adding-a-lang` for references on adding new languages to that list.
+
+## Adding titles, images and do-not-align segments via the config.json file
+
+Some additional parameters can be specified via a config file: create
+a JSON file called `config.json`, possibly in the same folder as
+your other ReadAlong input files for convenience. The config file
+currently accepts a few components: adding titles and headers, adding
+images to your ReadAlongs, and DNA audio (see {ref}`dna`).
+
+To add a title and headers to the output HTML, you can use the keys
+`"title"`, `"header"`, and `"subheader"`, for example:
+
+```json
+{
+  "title": "My awesome read-along",
+  "header": "A story in my language",
+  "subheader": "Read by me"
+}
+```
+
+To add images, indicate the page number as the key, and the name of the image
+file as the value, as an entry in the `"images"` dictionary.
+
+```json
+{ "images": { "0": "p1.jpg", "1": "p2.jpg" } }
+```
+
+Both images and DNA audio can be specified in the same config file, such
+as in the example below:
+
+```json
+{
+    "images":
+        {
+            "0": "image-for-page1.jpg",
+            "1": "image-for-page1.jpg",
+            "2": "image-for-page2.jpg",
+            "3": "image-for-page3.jpg"
+        },
+
+    "do-not-align":
+        {
+        "method": "remove",
+        "segments":
+            [
+                {   "begin": 1,     "end": 17000   },
+                {   "begin": 57456, "end": 68000   }
+            ]
+        }
+}
+```
+
+Warning: mind your commas! The JSON format is very picky: commas
+separate elements in a list or dictionary, but if you accidentally have
+a comma after the last element (e.g., by cutting and pasting whole
+lines), you will get a syntax error.
+
+(g2p-cascade)=
+
+## The g2p cascade
+
+Sometimes the g2p conversion of the input text will not succeed, for
+various reasons. A word might use characters not recognized by the g2p mapping
+for the language, or it might be in a different language. Whatever the
+reason, the output for the g2p conversion will not be valid ARPABET, and
+so the system will not be able to proceed to alignment by the
+aligner, SoundSwallower.
+
+If you know the language for that text, you can mark it as such in the
+XML. E.g.:
+
+```xml
+<s xml:lang="eng">This sentence is in English.</s>
+```
+
+The `xml:lang` attribute can be added to any element in the XML structure
+and will apply to text at any depth within that element, unless the
+attribute is specified again at a deeper level, e.g.:
+
+```xml
+<s xml:lang="eng">English mixed with <foo xml:lang="fra">français</foo>.</s>
+```
+
+There is also a simpler option available: the g2p cascade. When the g2p
+cascade is enabled, the g2p mapping will be done by first trying the
+language specified by the `xml:lang` attribute in the XML file
+(or with the first language provided to the `-l` flag on the
+command line, if the input is plain text). For each word where the
+result is not valid ARPABET, the g2p mapping will be attempted again
+with each of the languages specified in the g2p cascade, in order, until
+a valid ARPABET conversion is obtained. If no valid conversion is
+possible, an error message is printed and alignment is not attempted.
+
+To enable the g2p cascade, provide multiple languages via the `-l` switch
+(for plain text input) or add the `fallback-langs="l2,l3,..."` attribute to
+any element in the XML file:
+
+```xml
+<s xml:lang="eng" fallback-langs="fra,und">English mixed with français.</s>
+```
+
+These command line examples will set the language to `fra`, with the g2p cascade
+falling back to `eng` and then `und` (see below) when needed.
+
+```bash
+readalongs make-xml -l fra,eng myfile.txt myfile.readalong
+readalongs align -l fra,eng myfile.txt myfile.wav output-dir
+```
+
+### The "Undetermined" language code: und
+
+Notice how the sample XML snippet above has `und` as the last language in the
+cascade. `und`, for Undetermined, is a special language mapping that
+uses the definition of all characters in all alphabets that are part of the
+Unicode standard, and
+maps each character as if its name were how it is pronounced.
+While crude, this mapping works surprisingly well for the purposes of
+forced alignment, and allows `readalongs align` to successfully align
+most text with a few foreign words without any manual intervention.
+
+Since we recommend systematically using `und` at the end of the cascade, it
+is now added by default after the languages specified with the `-l`
+switch to both `readalongs align` and `readalongs make-xml`. Note that
+adding other languages after `und` will have no effect, since the
+Undetermined mapping will map any string to valid ARPABET.
+
+In the unlikely event that you want to disable adding `und`, add the hidden
+`--lang-no-append-und` switch, or delete `und` from the `fallback-langs`
+attribute in your XML input.
+
+### Debugging g2p mapping issues
+
+The warning messages issued by `readalongs g2p` and `readalongs align`
+indicate which words are causing g2p problems and what fallbacks were tried.
+It can be worth inspecting the input text to fix any encoding or spelling
+errors highlighted by these warnings. More detailed messages can be
+produced by adding the `--debug-g2p` switch, which prints a lot more
+information about the g2p conversion of words in each language where it
+was attempted unsuccessfully.
+
+## Breaking up the pipeline
+
+The CLI also provides commands that let you run the processing one step
+at a time.
+
+The following series of commands:
+
+```bash
+readalongs make-xml -l l1,l2 file.txt file.readalong
+readalongs tokenize file.readalong file.tokenized.readalong
+readalongs g2p file.tokenized.readalong file.g2p.readalong
+readalongs align file.g2p.readalong file.wav output
+```
+
+is equivalent to the single command:
+
+```bash
+readalongs align -l l1,l2 file.txt file.wav output
+```
+
+except that when running the pipeline as four separate commands, you can
+edit the XML files between each step to make manual adjustments and
+corrections if you want, like inserting anchors, silences, changing the
+language for individual elements, or even manually editing the ARPABET encoding
+for some words.
+
+## Anchors: marking known alignment points
+
+Long audio/text file pairs can sometimes be difficult to align
+correctly, because the aligner might get lost part way through the
+alignment process. Anchors can be used to tell the aligner about known
+correspondence points between the text and the audio stream.
+
+### Anchor syntax
+
+Anchors are inserted in the XML file (the output of
+`readalongs make-xml`, `readalongs tokenize` or `readalongs g2p`)
+using the following syntax: `<anchor time="3.42s"/>` or
+`<anchor time="3420ms"/>`. The time can be specified in seconds (this
+is the default) or milliseconds.
+
+Anchors can be placed anywhere in the XML file: between/before/after any
+element or text.
+
+Example:
+
+```xml
+<?xml version='1.0' encoding='utf-8'?>
+<read-along version="1.0"> <text xml:lang="eng"> <body>
+    <anchor time="143ms"/>
+    <div type="page">
+    <p>
+        <s>Hello.</s>
+        <anchor time="1.62s"/>
+        <s>This is <anchor time="3.81s"/> <anchor time="3.94s"/> a test</s>
+        <s><anchor time="4123ms"/>weirdword<anchor time="4789ms"/></s>
+    </p>
+    </div>
+    <anchor time="6.74s"/>
+</body> </text> </read-along>
+```
+
+### Anchor semantics
+
+When anchors are used, the alignment task is divided at each anchor,
+creating a series of segments that are aligned independently from one
+another. When alignment is performed, the aligner sees only the audio
+and the text from the segment being processed, and the results are
+joined together afterwards.
+
+The beginning and end of files are implicit anchors: *n* anchors define
+*n+1* segments: from the beginning of the audio and text to the first
+anchor, between pairs of anchors, and from the last anchor to the end of
+the audio and text.
+
+Special cases equivalent to do-not-align audio:
+
+- If an anchor occurs before the first word in the text, the audio up to that
+  anchor’s timestamp is excluded from alignment.
+- If an anchor occurs after the last word, the end of the audio is excluded
+  from alignment.
+- If two anchors occur one after the other, the time span between them in the
+  audio is excluded from alignment.
+
+Using anchors to define do-not-align audio segments is effectively the same as
+marking them as "do-not-align" in the `config.json` file, except that DNA
+segments declared using anchors have a known alignment with respect to the
+text, while the position of DNA segments declared in the config file is
+inferred by the aligner.
+
+### Anchor use cases
+
+1. Alignment fails because the stream is too long or too difficult to
+   align.
+
+   When alignment fails, listen to the audio stream and try to identify
+   where some words you can pick up start or end. Even if you don’t
+   understand the language, there might be some words you’re able to
+   pick up and use as anchors to help the aligner.
+
+2. You already know where some words/sentences/paragraphs start or end,
+   because the data came with some partial alignment information. For
+   example, the data might come from an ELAN file with sentence
+   alignments.
+
+   These known timestamps can be converted to anchors.
+
+## Silences: inserting pause-like silences
+
+There are times when you might want a read-along to pause at a particular
+place for a specific time and then resume. This can be accomplished by
+inserting silences in your audio stream. You can do it manually by editing your
+audio file ahead of time, but you can also have `readalongs align` insert the
+silences for you.
+
+### Silence syntax
+
+Silences are inserted in the audio stream wherever a `silence` element is
+found in the XML input.
+**TODO say something about how the silence placement is determined.**
+The syntax is like the anchor syntax: `<silence dur="4.2s"/>` or
+`<silence dur="100ms"/>`. Like anchors, silence elements can be inserted
+anywhere.
+
+Example:
+
+```xml
+<?xml version='1.0' encoding='utf-8'?>
+<read-along version="1.0"> <text xml:lang="eng"> <body>
+    <silence dur="1s"/>
+    <div type="page">
+    <p>
+        <s>Hello.</s>
+        <silence dur="10s"/>
+        <s>After this pregnant pause, <silence dur="100ms"/> we'll pause
+           again before it's all over!</s>
+    </p>
+    <silence dur="1s"/>
+    </div>
+</body> </text> </read-along>
+```
+
+### Silence use cases
+
+1. Your read-along has a title page that is not read out in the audio stream:
+   insert a silence at the beginning so that it stays on the first page for
+   the specified time.
+   **TODO: test that a silence before the first word really keeps the RA on the
+   first page during that silence, even if all text on the first page is DNA.**
+2. Your read-along has a credits page at the end that is not read out in the
+   audio stream: insert a silence at the end so that people see that credits
+   page for the specified time before the streaming ends.
+   **TODO: also test that this use case works as described.**
diff --git a/docs/cli-guide.rst b/docs/cli-guide.rst
deleted file mode 100644
index b92d580b..00000000
--- a/docs/cli-guide.rst
+++ /dev/null
@@ -1,507 +0,0 @@
-.. _cli-guide:
-
-Command line interface (CLI) user guide
-=======================================
-
-This page contains guidelines on using the ReadAlongs CLI. See also
-:ref:`cli-ref` for the full CLI reference.
-
-The ReadAlongs CLI has two main commands: ``readalongs make-xml`` and
-``readalongs align``.
-
-- If your data is a plain text file, you can run ``make-xml`` to turn
-  it into ReadAlongs XML, which you can then align with
-  ``align``. Doing this in two steps allows you to modify the XML file
-  before aligning it (e.g., to mark that some text is in a different
-  language, to flag some do-not-align text, or to drop anchors in).
-
-- Alternatively, if your plain text file does not need to be modified, you can
-  run ``align`` directly on it, since it also accepts plain text input.  You'll
-  need the ``-l <language(s)>`` option to indicate what language your text is in.
-
-Two additional commands are sometimes useful: ``readalongs tokenize`` and
-``readalongs g2p``.
-
-- ``tokenize`` takes the output of ``make-xml`` and tokenizes it, wrapping each
-  word in the text in a ``<w>`` element.
-
-- ``g2p`` takes the output of ``tokenize`` and mapping each word to its
-  phonetic transcription using the g2p library. The phonetic transcription is
-  represented using the ARPABET phonetic codes and are added in the ``ARPABET``
-  attribute to each ``<w>`` element.
-
-The result of ``tokenize`` or ``g2p`` can be fixed manually if necessary and
-then used as input to ``align``.
-
-Getting from TXT to XML with readalongs make-xml
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Run :ref:`cli-make-xml` to make the ReadAlongs XML file for ``align`` from a TXT file.
-
-``readalongs make-xml [options] [story.txt] [story.readalong]``
-
-``[story.txt]``: path to the plain text input file (TXT)
-
-``[story.readalong]``: Path to the XML output file
-
-The plain text file must be plain text encoded in ``UTF-8`` with one
-sentence per line. Paragraph breaks are marked by a blank line, and page
-breaks are marked by two blank lines.
-
-+-----------------------------------+-----------------------------------------------+
-| Key Options                       | Option descriptions                           |
-+===================================+===============================================+
-| ``-l, --language(s)`` (required)  | The language code for story.txt.              |
-|                                   | Specifying multiple comma- or colon-separated |
-|                                   | languages triggers :ref:`g2p-cascade`.        |
-+-----------------------------------+-----------------------------------------------+
-| ``-f, --force-overwrite``         | Force overwrite output files                  |
-|                                   | (handy if you're troubleshooting              |
-|                                   | and will be aligning repeatedly)              |
-+-----------------------------------+-----------------------------------------------+
-| ``-h, --help``                    | Displays CLI guide for                        |
-|                                   | ``make-xml``                                  |
-+-----------------------------------+-----------------------------------------------+
-
-The ``-l, --language`` argument requires a language’s 3 character `ISO
-code <https://en.wikipedia.org/wiki/ISO_639-3>`__ as an argument.
-
-The languages supported by RAS can be listed by running ``readalongs make-xml -h``
-and they can also be found in the :ref:`cli-make-xml` reference.
-
-So, a full command for a story in Algonquin, with an implicit g2p fallback to
-Undetermined, would be something like:
-
-``readalongs make-xml -l alq Studio/story.txt Studio/story.readalong``
-
-The generated XML will be parsed in to sentences. At this stage you can
-edit the XML to have any modifications, such as adding ``do-not-align``
-as an attribute of any element in your XML.
-
-The format of the generated XML is based on [TEI
-Lite](https://tei-c.org/guidelines/customization/lite/) but is
-considerably simplified.  The DTD (document type definition) can be
-found in the ReadAlong Studio source code under
-`readalongs/static/read-along-1.0.dtd`.
-
-.. _dna:
-
-Handling mismatches: do-not-align
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-There are two types of "do-not-align" (DNA) content: DNA audio and DNA text.
-
-To use DNA text, add ``do-not-align`` as an attribute to any
-element in the xml (word, sentence, paragraph, or page).
-
-::
-
-   <w do-not-align="true" id="t0b0d0p0s0w0">dog</w>
-
-If you have already run ``readalongs make-xml``, there will be
-documentation for DNA text in comments at the beginning of the generated
-xml file.
-
-::
-
-   <!-- To exclude any element from alignment, add the do-not-align="true" attribute to
-        it, e.g., <p do-not-align="true">...</p>, or
-        <s>Some text <foo do-not-align="true">do not align this</foo> more text</s> -->
-
-To use DNA audio, you can specify a timeframe in milliseconds in the
-``config.json`` file which you want the aligner to ignore.
-
-::
-
-   "do-not-align":
-       {
-       "method": "remove",
-       "segments":
-       [
-           {
-               "begin": 1,
-               "end": 17000
-           }
-       ]
-       }
-
-Use cases for DNA
-'''''''''''''''''
-
--  Spoken introduction in the audio file that has no accompanying text
-   (DNA audio)
--  Text that has no matching audio, such as credits/acknowledgments (DNA
-   text)
-
-Aligning your text and audio with readalongs align
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Run :ref:`cli-align` to align a text file (RAS or TXT) and an audio file to
-create a time-aligned audiobook.
-
-``readalongs align [options] [story.txt/xml] [story.mp3/wav] [output_base]``
-
-``[story.txt/ras]``: path to the text file (TXT or RAS)
-
-``[story.mp3/wav]``: path to the audio file (MP3, WAV or any format
-supported by ffmpeg)
-
-``[output_base]``: path to the directory where the output files will be
-created, as ``output_base*``
-
-+-----------------------------------+-----------------------------------------------+
-| Key Options                       | Option descriptions                           |
-+===================================+===============================================+
-| ``-l, --language(s)``             | The language code for story.txt.              |
-|                                   | Specifying multiple comma- or colon-separated |
-|                                   | languages triggers :ref:`g2p-cascade`.        |
-|                                   | (required if input is plain text)             |
-+-----------------------------------+-----------------------------------------------+
-| ``-c, --config PATH``             | Use ReadAlong-Studio                          |
-|                                   | configuration file (in JSON                   |
-|                                   | format)                                       |
-+-----------------------------------+-----------------------------------------------+
-| ``--debug-g2p``                   | Display verbose g2p debugging messages        |
-+-----------------------------------+-----------------------------------------------+
-| ``-s, --save-temps``              | Save intermediate stages of                   |
-|                                   | processing and temporary files                |
-|                                   | (dictionary, FSG, tokenization,               |
-|                                   | etc.)                                         |
-+-----------------------------------+-----------------------------------------------+
-| ``-f, --force-overwrite``         | Force overwrite output files                  |
-|                                   | (handy if you’re troubleshooting              |
-|                                   | and will be aligning repeatedly)              |
-+-----------------------------------+-----------------------------------------------+
-| ``-h, --help``                    | Displays CLI guide for ``align``              |
-+-----------------------------------+-----------------------------------------------+
-
-See above for more information on the ``-l, --language`` argument.
-
-A full command could be something like:
-
-``readalongs align -f -c config.json story.readalong story.mp3 story-aligned``
-
-**Is the text file plain text or XML?**
-
-``readalongs align`` accepts its text input as a plain text file or a ReadAlongs XML file.
-
-- If the file name ends with ``.txt``, it will be read as plain text.
-- If the file name ends with ``.xml`` or ``.readalong``, it will be read as ReadAlongs XML.
-- With other extensions, the beginning of the file is examined to
-  automatically determine if it's XML or plain text.
-
-Supported languages
-~~~~~~~~~~~~~~~~~~~
-
-The ``readalongs langs`` command can be used to list all supported languages.
-
-Here is that list at the time of compiling this documentation:
-
-.. command-output:: readalongs langs
-
-See :ref:`adding-a-lang` for references on adding new languages to that list.
-
-
-Adding titles, images and do-not-align segments via the config.json file
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Some additional parameters can be specified via a config file: create
-a JSON file called ``config.json``, possibly in the same folder as
-your other ReadAlong input files for convenience. The config file
-currently accepts a few components: adding titles and headers, adding
-images to your ReadAlongs, and DNA audio (see :ref:`dna`).
-
-To add a title and headers to the output HTML, you can use the keys
-`"title"`, `"header"`, and `"subheader"`, for example::
-
-  {
-    "title": "My awesome read-along",
-    "header": "A story in my language",
-    "subheader": "Read by me"
-  }
-
-To add images, indicate the page number as the key, and the name of the image
-file as the value, as an entry in the ``"images"`` dictionary.
-
-::
-
-   { "images": { "0": "p1.jpg", "1": "p2.jpg" } }
-
-Both images and DNA audio can be specified in the same config file, such
-as in the example below:
-
-::
-
-   {
-       "images":
-           {
-               "0": "image-for-page1.jpg",
-               "1": "image-for-page1.jpg",
-               "2": "image-for-page2.jpg",
-               "3": "image-for-page3.jpg"
-           },
-
-       "do-not-align":
-           {
-           "method": "remove",
-           "segments":
-               [
-                   {   "begin": 1,     "end": 17000   },
-                   {   "begin": 57456, "end": 68000   }
-               ]
-           }
-   }
-
-Warning: mind your commas! The JSON format is very picky: commas
-separate elements in a list or dictionnary, but if you accidentally have
-a comma after the last element (e.g., by cutting and pasting whole
-lines), you will get a syntax error.
-
-.. _g2p-cascade:
-
-The g2p cascade
-~~~~~~~~~~~~~~~
-
-Sometimes the g2p conversion of the input text will not succeed, for
-various reasons. A word might use characters not recognized by the g2p mapping
-for the language, or it might be in a different language. Whatever the
-reason, the output for the g2p conversion will not be valid ARPABET, and
-so the system will not be able to proceed to alignment by the
-aligner, SoundSwallower.
-
-If you know the language for that text, you can mark it as such in the
-XML. E.g.:
-
-.. code-block:: xml
-
-   <s xml:lang="eng">This sentence is in English.</s>
-
-The ``xml:lang`` attribute can be added to any element in the XML structure
-and will apply to text at any depth within that element, unless the
-attribute is specified again at a deeper level, e.g.:
-
-.. code-block:: xml
-
-   <s xml:lang="eng">English mixed with <foo xml:lang="fra">français</foo>.</s>
-
-There is also a simpler option available: the g2p cascade. When the g2p
-cascade is enabled, the g2p mapping will be done by first trying the
-language specified by the `xml:lang` attribute in the XML file
-(or with the first language provided to the ``-l`` flag on the
-command line, if the input is plain text). For each word where the
-result is not valid ARPABET, the g2p mapping will be attempted again
-with each of the languages specified in the g2p cascade, in order, until
-a valid ARPABET conversion is obtained. If no valid conversion is
-possible, are error message is printed and alignment is not attempted.
-
-To enable the g2p cascade, provide multiple languages via the ``-l`` switch
-(for plain text input) or add the ``fallback-langs="l2,l3,...`` attribute to
-any element in the XML file:
-
-.. code-block:: xml
-
-   <s xml:lang="eng" fallback-langs="fra,und">English mixed with français.</s>
-
-These command line examples will set the language to ``fra``, with the g2p cascade
-falling back to ``eng`` and then ``und`` (see below) when needed.
-
-.. code-block:: bash
-
-   readalongs make-xml -l fra,eng myfile.txt myfile.readalong
-   readalongs align -l fra,eng myfile.txt myfile.wav output-dir
-
-The "Undetermined" language code: und
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Notice how the sample XML snippet above has ``und`` as the last language in the
-cascade. ``und``, for Undetermined, is a special language mapping that
-uses the definition of all characters in all alphabets that are part of the
-Unicode standard, and
-maps them as if the name of that character was how it is pronounced.
-While crude, this mapping works surprisingly well for the purposes of
-forced alignment, and allows ``readalongs align`` to successfully align
-most text with a few foreign words without any manual intervention.
-
-Since we recommend systematically using ``und`` at the end of the cascade, it
-is now added by default after the languages specified with the ``-l``
-switch to both ``readalongs align`` and ``readalongs make-xml``. Note that
-adding other languages after ``und`` will have no effect, since the
-Undetermined mapping will map any string to valid ARPABET.
-
-In the unlikely event that you want to disable adding ``und``, add the hidden
-``--lang-no-append-und`` switch, or delete ``und`` from the ``fallback-langs``
-attribute in your XML input.
-
-Debugging g2p mapping issues
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The warning messages issued by ``readalongs g2p`` and ``readalongs align``
-indicate which words are causing g2p problems and what fallbacks were tried.
-It can be worth inspecting to input text to fix any encoding or spelling
-errors highlighted by these warnings. More detailed messages can be
-produced by adding the ``--debug-g2p`` switch, to obtain a lot more
-information about g2p'ing words in each language g2p was unsucessfully
-attempted.
-
-Breaking up the pipeline
-~~~~~~~~~~~~~~~~~~~~~~~~
-
-Some commands were added to the CLI in the last year to break processing up step
-by step.
-
-The following series of commands:
-
-::
-
-   readalongs make-xml -l l1,l2 file.txt file.readalong
-   readalongs tokenize file.readalong file.tokenized.readalong
-   readalongs g2p file.tokenized.readalong file.g2p.readalong
-   readalongs align file.g2p.readalong file.wav output
-
-is equivalent to the single command:
-
-::
-
-   readalongs align -l l1,l2 file.txt file.wav output
-
-except that when running the pipeline as four separate commands, you can
-edit the XML files between each step to make manual adjustments and
-corrections if you want, like inserting anchors, silences, changing the
-language for indivual elements, or even manually editting the ARPABET encoding
-for some words.
-
-Anchors: marking known alignment points
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Long audio/text file pairs can sometimes be difficult to align
-correctly, because the aligner might get lost part way through the
-alignment process. Anchors can be used to tell the aligner about known
-correspondance points between the text and the audio stream.
-
-Anchor syntax
-^^^^^^^^^^^^^
-
-Anchors are inserted in the XML file (the output of
-``readalongs make-xml``, ``readalongs tokenize`` or ``readalongs g2p``)
-using the following syntax: ``<anchor time="3.42s"/>`` or
-``<anchor time="3420ms"/>``. The time can be specified in seconds (this
-is the default) or milliseconds.
-
-Anchors can be placed anywhere in the XML file: between/before/after any
-element or text.
-
-Example:
-
-.. code-block:: xml
-
-   <?xml version='1.0' encoding='utf-8'?>
-   <read-along version="1.0"> <text xml:lang="eng"> <body>
-       <anchor time="143ms"/>
-       <div type="page">
-       <p>
-           <s>Hello.</s>
-           <anchor time="1.62s"/>
-           <s>This is <anchor time="3.81s"/> <anchor time="3.94s"/> a test</s>
-           <s><anchor time="4123ms"/>weirdword<anchor time="4789ms"/></s>
-       </p>
-       </div>
-       <anchor time="6.74s"/>
-   </body> </text> </read-along>
-
-Anchor semantics
-^^^^^^^^^^^^^^^^
-
-When anchors are used, the alignment task is divided at each anchor,
-creating a series of segments that are aligned independently from one
-another. When alignment is performed, the aligner sees only the audio
-and the text from the segment being processed, and the results are
-joined together afterwards.
-
-The beginning and end of files are implicit anchors: *n* anchors define
-*n+1* segments: from the beginning of the audio and text to the first
-anchor, between pairs of anchors, and from the last anchor to the end of
-the audio and text.
-
-Special cases equivalent to do-not-align audio:
-
-- If an anchor occurs before the first word in the text, the audio up to that
-  anchor’s timestamps is excluded from alignment.
-- If an anchor occurs after the last word, the end of the audio is excluded
-  from alignment.
-- If two anchors occur one after the other, the time span between them in the
-  audio is excluded from alignment.
-
-Using anchors to define do-not-align audio segments is effectively the same as
-marking them as "do-not-align" in the ``config.json`` file, except that DNA
-segments declared using anchors have a known alignment with respect to the
-text, while the position of DNA segments declared in the config file are
-inferred by the aligner.
-
-Anchor use cases
-^^^^^^^^^^^^^^^^
-
-1. Alignment fails because the stream is too long or too difficult to
-   align.
-
-   When alignment fails, listen to the audio stream and try to identify
-   where some words you can pick up start or end. Even if you don’t
-   understand the language, there might be some words you’re able to
-   pick up and use as anchors to help the aligner.
-
-2. You already know where some words/sentences/paragraphs start or end,
-   because the data came with some partial alignment information. For
-   example, the data might come from an ELAN file with sentence
-   alignments.
-
-   These known timestamps can be converted to anchors.
-
-Silences: inserting pause-like silences
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-There are times where you might want a read-along to pause at a particular
-place for a specific time and resume again after. This can be accomplished by
-inserting silences in your audio stream. You can do it manually by editing your
-audio file ahead of time, but you can also have ``readalongs align`` insert the
-silences for you.
-
-Silence syntax
-^^^^^^^^^^^^^^
-
-Silences are inserted in the audio stream wherever a ``silence`` element is
-found in the XML input.
-**TODO say something about how the silence placement determined.**
-The syntax is like the anchor syntax: ``<silence dur="4.2s"/>`` or
-``<silence dur="100ms"/>``. Like anchors, silence elements can be inserted
-anywhere.
-
-Example:
-
-.. code-block:: xml
-
-   <?xml version='1.0' encoding='utf-8'?>
-   <read-along version="1.0"> <text xml:lang="eng"> <body>
-       <silence dur="1s"/>
-       <div type="page">
-       <p>
-           <s>Hello.</s>
-           <silence dur="10s"/>
-           <s>After this pregnant pause, <silence dur="100ms"/> we'll pause
-              again before it's all over!</s>
-       </p>
-       <silence dur="1s"/>
-       </div>
-   </body> </text> </read-along>
-
-Silence use cases
-^^^^^^^^^^^^^^^^^
-
-1. Your read along has a title page that is not read out in the audio stream:
-   insert a silence at the beginning so that it stays on the first page for
-   the specified time.
-   **TODO: test that a silence before the first word really keeps the RA on the
-   first page during that silence, even if all text on the first page is DNA.**
-
-2. Your read along has a credits page at the end that is not read out in the
-   audio stream: insert a silence at the end so that people see that credits
-   page for the specified time before the streaming end.
-   **TODO: also test that this use case works as described.**
diff --git a/docs/cli-ref.md b/docs/cli-ref.md
new file mode 100644
index 00000000..2291a14c
--- /dev/null
+++ b/docs/cli-ref.md
@@ -0,0 +1,53 @@
+(cli-ref)=
+
+# Command line interface (CLI) reference
+
+This page contains the full reference documentation for each command in the CLI.
+See also {ref}`cli-guide` for guidelines on using the CLI.
+
+The ReadAlongs CLI has five key commands:
+
+- {ref}`cli-align`: full alignment pipeline, from plain text or XML to a
+  viewable readalong
+- {ref}`cli-make-xml`: convert a plain text file into XML, for align
+- {ref}`cli-tokenize`: tokenize an XML file
+- {ref}`cli-g2p`: g2p a tokenized XML file
+- {ref}`cli-langs`: list supported languages
+
+Each command can be run with `-h` or `--help` to display its usage manual,
+e.g., `readalongs -h`, `readalongs align --help`.
+
+(cli-align)=
+
+```{eval-rst}
+.. click:: readalongs.cli:align
+  :prog: readalongs align
+```
+
+(cli-make-xml)=
+
+```{eval-rst}
+.. click:: readalongs.cli:make_xml
+  :prog: readalongs make-xml
+```
+
+(cli-tokenize)=
+
+```{eval-rst}
+.. click:: readalongs.cli:tokenize
+  :prog: readalongs tokenize
+```
+
+(cli-g2p)=
+
+```{eval-rst}
+.. click:: readalongs.cli:g2p
+  :prog: readalongs g2p
+```
+
+(cli-langs)=
+
+```{eval-rst}
+.. click:: readalongs.cli:langs
+  :prog: readalongs langs
+```
diff --git a/docs/cli-ref.rst b/docs/cli-ref.rst
deleted file mode 100644
index 0b94f686..00000000
--- a/docs/cli-ref.rst
+++ /dev/null
@@ -1,39 +0,0 @@
-.. _cli-ref:
-
-Command line interface (CLI) reference
-======================================
-
-This page contains the full reference documentation for each command in the CLI.
-See also :ref:`cli-guide` for guidelines on using the CLI.
-
-The ReadAlongs CLI has five key commands:
-
-- :ref:`cli-align`: full alignment pipeline, from plain text or XML to a
-  viewable readalong
-- :ref:`cli-make-xml`: convert a plain text file into XML, for align
-- :ref:`cli-tokenize`: tokenize an XML file
-- :ref:`cli-g2p`: g2p a tokenized XML file
-- :ref:`cli-langs`: list supported languages
-
-Each command can be run with ``-h`` or ``--help`` to display its usage manual,
-e.g., ``readalongs -h``, ``readalongs align --help``.
-
-.. _cli-align:
-.. click:: readalongs.cli:align
-  :prog: readalongs align
-
-.. _cli-make-xml:
-.. click:: readalongs.cli:make_xml
-  :prog: readalongs make-xml
-
-.. _cli-tokenize:
-.. click:: readalongs.cli:tokenize
-  :prog: readalongs tokenize
-
-.. _cli-g2p:
-.. click:: readalongs.cli:g2p
-  :prog: readalongs g2p
-
-.. _cli-langs:
-.. click:: readalongs.cli:langs
-  :prog: readalongs langs
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 00000000..16f2d459
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,24 @@
+# Welcome to ReadAlong-Studio's documentation
+
+Audiobook alignment for Indigenous languages
+
+This site provides the full user documentation for ReadAlong-Studio.
+
+```{toctree}
+:caption: 'Contents:'
+:maxdepth: 2
+
+start
+installation
+cli-guide
+cli-ref
+outputs
+advanced-use
+troubleshooting
+```
+
+# Indices and tables
+
+- {ref}`genindex`
+- {ref}`modindex`
+- {ref}`search`
diff --git a/docs/index.rst b/docs/index.rst
deleted file mode 100644
index a99a1d92..00000000
--- a/docs/index.rst
+++ /dev/null
@@ -1,26 +0,0 @@
-Welcome to ReadAlong-Studio's documentation
-===========================================
-
-Audiobook alignment for Indigenous languages
-
-This site provides the full user documentation for ReadAlongs-Studio.
-
-.. toctree::
-   :maxdepth: 2
-   :caption: Contents:
-
-   start
-   installation
-   cli-guide
-   cli-ref
-   outputs
-   advanced-use
-   troubleshooting
-
-
-Indices and tables
-==================
-
-* :ref:`genindex`
-* :ref:`modindex`
-* :ref:`search`
diff --git a/docs/installation.md b/docs/installation.md
new file mode 100644
index 00000000..84bed1d7
--- /dev/null
+++ b/docs/installation.md
@@ -0,0 +1,5 @@
+(installation)=
+
+# Installation
+
+See [ReadAlongs/Studio/README.md](https://github.com/ReadAlongs/Studio#install)
diff --git a/docs/installation.rst b/docs/installation.rst
deleted file mode 100644
index 3245e0d6..00000000
--- a/docs/installation.rst
+++ /dev/null
@@ -1,6 +0,0 @@
-.. _installation:
-
-Installation
-============
-
-See `ReadAlongs/Studio/README.md <https://github.com/ReadAlongs/Studio#install>`__
diff --git a/docs/outputs.md b/docs/outputs.md
new file mode 100644
index 00000000..73c196fc
--- /dev/null
+++ b/docs/outputs.md
@@ -0,0 +1,50 @@
+% outputs:
+
+# Output Realizations
+
+One of the main motivations for ReadAlong-Studio was to provide a one-stop-shop for audio/text alignment.
+With that in mind, there are a variety of different output formats that can be created. Here are a few:
+
+## Elan/Praat files
+
+## Web Component
+
+When you have standard output from ReadAlong-Studio, consisting of 1) a ReadAlong file (XML) and 2) an audio file,
+you can mobilize these files to the web or hybrid mobile apps quickly and painlessly.
+
+This is done using the ReadAlong WebComponent. Web components are reusable, custom-defined HTML elements that you can embed in any HTML, regardless of which
+framework you used to build your site, whether React, Angular, Vue, or just Vanilla HTML/CSS/JS.
+
+Below is an example of a minimal implementation in a basic standalone HTML page. Please visit <https://stenciljs.com/docs/overview> for more information on framework integrations.
+
+```html
+<!DOCTYPE html>
+<html>
+
+    <head>
+        <!-- Import fonts. Material Icons are needed by the web component -->
+        <link href="https://fonts.googleapis.com/css?family=Lato|Material+Icons|Material+Icons+Outlined" rel="stylesheet">
+    </head>
+
+    <body>
+        <!-- Here is how you declare the Web Component -->
+        <read-along href="assets/sample.readalong" audio="assets/sample.wav"></read-along>
+    </body>
+    <!-- The last step needed is to import the package -->
+   <script type="module" src='https://unpkg.com/@readalongs/web-component@^1.4.0/dist/web-component/web-component.esm.js'></script>
+</html>
+```
+
+The above assumes the following structure:
+
+web
+
+├── assets
+
+│   ├── sample.wav
+
+│   ├── sample.readalong
+
+├── index.html
+
+Then you can host your site anywhere, or run it locally (`cd web && python3 -m http.server`, for example).
diff --git a/docs/outputs.rst b/docs/outputs.rst
deleted file mode 100644
index eaa338dc..00000000
--- a/docs/outputs.rst
+++ /dev/null
@@ -1,52 +0,0 @@
-.. outputs:
-
-Output Realizations
-===================
-
-One of the main motivations for ReadAlong-Studio was to provide a one-stop-shop for audio/text alignment.
-With that in mind, there are a variety of different output formats that can be created. Here are a few:
-
-Elan/Praat files
-----------------
-
-Web Component
--------------
-
-When you have standard output from ReadAlong-Studio, consisting of 1) a ReadALong file (XML) and 2) an audio file
-you can mobilize these files to the web or hybrid mobile apps quickly and painlessly.
-
-This is done using the ReadAlong WebComponent. Web components are re-useable, custom-defined HTML elements that you can embed in any HTML, regardless of which
-framework you used to build your site, whether React, Angular, Vue, or just Vanilla HTML/CSS/JS.
-
-Below is an example of a minimal implementation in a basic standalone html page. Please visit https://stenciljs.com/docs/overview for more information on framework integrations.
-
-.. code-block:: html
-
-    <!DOCTYPE html>
-    <html>
-
-        <head>
-            <!-- Import fonts. Material Icons are needed by the web component -->
-            <link href="https://fonts.googleapis.com/css?family=Lato|Material+Icons|Material+Icons+Outlined" rel="stylesheet">
-        </head>
-
-        <body>
-            <!-- Here is how you declare the Web Component -->
-            <read-along href="assets/sample.readalong" audio="assets/sample.wav"></read-along>
-        </body>
-        <!-- The last step needed is to import the package -->
-       <script type="module" src='https://unpkg.com/@readalongs/web-component@^1.4.0/dist/web-component/web-component.esm.js'></script>
-    </html>
-
-
-The above assumes the following structure:
-
-| web
-| ├── assets
-| │   ├── sample.wav
-| │   ├── sample.readalong
-| ├── index.html
-|
-|
-
-Then you can host your site anywhere, or run it locally (``cd web && python3 -m http.server`` for example)
diff --git a/docs/start.md b/docs/start.md
new file mode 100644
index 00000000..cbb7d765
--- /dev/null
+++ b/docs/start.md
@@ -0,0 +1,41 @@
+% start:
+
+# Getting Started
+
+This library is an end-to-end audio/text aligner. It is meant to be used
+together with the ReadAlong-Web-Component to interactively visualize the
+alignment.
+
+## Background
+
+The concept is a web application with a series of stages of processing,
+which ultimately leads to a time-aligned audiobook, i.e., a package of:
+
+- ReadAlong XML file describing text
+- Audio file (WAV or MP3)
+- HTML file describing the web component
+
+This package can be loaded using the [read-along web
+component](https://github.com/roedoejet/ReadAlong-Web-Component).
+
+A book is generated as a standalone HTML page by default, but can
+optionally be generated as an ePub file.
+
+## Required knowledge
+
+- How to use a [Command-line interface (CLI)](https://en.wikipedia.org/wiki/Command-line_interface).
+- How to edit and manipulate plain text, [XML](https://www.w3.org/standards/xml/core) and [SMIL](https://www.w3.org/TR/smil/) files using a text editor or a code editor.
+- How to edit and examine an audio file with [Audacity](https://www.audacityteam.org/) or similar software.
+- How to spin up a local web server (e.g., see [How do you set up a local testing server?](https://developer.mozilla.org/en-US/docs/Learn/Common_questions/set_up_a_local_testing_server))
+
+## What you need to make a ReadAlong
+
+In order to create a ReadAlong you will need two files:
+
+- A text file, either in plain text (`.txt`) or in ReadAlong XML (`.readalong`)
+- Clear audio in any format supported by [ffmpeg](https://ffmpeg.org/ffmpeg-formats.html)
+
+The content of the text file should be a transcription of the audio
+file. The audio can be spoken or sung, but if there is background music
+or noise of any kind, the aligner is likely to fail. Clearly enunciated
+audio is also likely to increase accuracy.
diff --git a/docs/start.rst b/docs/start.rst
deleted file mode 100644
index e3274d07..00000000
--- a/docs/start.rst
+++ /dev/null
@@ -1,45 +0,0 @@
-.. start:
-
-Getting Started
-================
-
-This library is an end-to-end audio/text aligner. It is meant to be used
-together with the ReadAlong-Web-Component to interactively visualize the
-alignment.
-
-Background
-----------
-
-The concept is a web application with a series of stages of processing,
-which ultimately leads to a time-aligned audiobook, i.e., a package of:
-
--  ReadAlong XML file describing text
--  Audio file (WAV or MP3)
--  HTML file describing the web component
-
-Which can be loaded using the `read-along web
-component <https://github.com/roedoejet/ReadAlong-Web-Component>`__.
-
-A book is generated as a standalone HTML page by default, but can
-optionally be generated as an ePub file.
-
-Required knowledge
-------------------
-
--  How to use a `Command-line interface (CLI) <https://en.wikipedia.org/wiki/Command-line_interface>`__.
--  How to edit and manipulate plain text, `XML <https://www.w3.org/standards/xml/core>`__ and `SMIL <https://www.w3.org/TR/smil/>`__ files using a text editor or a code editor.
--  How to edit and examine an audio file with `Audacity <https://www.audacityteam.org/>`__ or similar software.
--  How to spin up a local web server (e.g., see `How do you set up a local testing server? <https://developer.mozilla.org/en-US/docs/Learn/Common_questions/set_up_a_local_testing_server>`__)
-
-What you need to make a ReadAlong
----------------------------------
-
-In order to create a ReadAlong you will need two files:
-
-- A text file, either in plain text (``.txt``) or in ReadAlong XML (``.readalong``)
-- Clear audio in any format supported by `ffmpeg <https://ffmpeg.org/ffmpeg-formats.html>`__
-
-The content of the text file should be a transcription of the audio
-file. The audio can be spoken or sung, but if there is background music
-or noise of any kind, the aligner is likely to fail. Clearly enunciated
-audio is also likely to increase accuracy.
diff --git a/docs/troubleshooting.rst b/docs/troubleshooting.md
similarity index 58%
rename from docs/troubleshooting.rst
rename to docs/troubleshooting.md
index ae79310d..6df8c32f 100644
--- a/docs/troubleshooting.rst
+++ b/docs/troubleshooting.md
@@ -1,22 +1,31 @@
-.. _troubleshooting:
+---
+substitutions:
+  image1: |-
+    ```{image} https://i.imgur.com/vKPhTud.png
+    ```
+---
 
-.. note:: This troubleshooting guide is under construction.
+(troubleshooting)=
 
-Troubleshooting
-===============
+:::{note}
+This troubleshooting guide is under construction.
+:::
+
+# Troubleshooting
 
 Here are three types of common errors you may encounter when trying to
 run ReadAlongs, and ways to debug them.
 
-Phones missing in the acoustic model
-------------------------------------
+## Phones missing in the acoustic model
 
-.. note:: Troubleshooting item under construction
+:::{note}
+Troubleshooting item under construction
+:::
 
-You may get an error that looks like this:|image1|
+You may get an error that looks like this:{{ image1 }}
 
 The general structure of your error would look like
-``Phone [character] is missing in the acoustic model; word [index] ignored``
+`Phone [character] is missing in the acoustic model; word [index] ignored`
 This error is most likely caused not by a bug in your ReadAlong input
 files, but by an error in one of your g2p mappings. The error message is
 saying that there is a character in your ReadAlong text that is not
@@ -29,16 +38,16 @@ Follow these steps to debug the issue **in g2p**.
 1. Identify which characters in each line of the error message are
    **not** being converted to eng-arpabet. These will either be:
 
-   a. characters that are not in caps (for example ``g`` in the string
-      ``gUW`` in the error message shown above.)
-   b. a character not traditionally used in English (for example é or Ŧ,
-      or ``ʰ`` in the error message shown above.) You can confirm you
+   1. characters that are not in caps (for example `g` in the string
+      `gUW` in the error message shown above.)
+   2. a character not traditionally used in English (for example é or Ŧ,
+      or `ʰ` in the error message shown above.) You can confirm you
       have isolated the right characters by ensuring every other
       character in your error message appears as an **output** in the
-      `eng-ipa-to-arpabet
-      mapping <https://github.com/roedoejet/g2p/blob/main/g2p/mappings/langs/eng/eng_ipa_to_arpabet.json>`__.
+      [eng-ipa-to-arpabet
+      mapping](https://github.com/roedoejet/g2p/blob/main/g2p/mappings/langs/eng/eng_ipa_to_arpabet.json).
       These are the problematic characters we need to debug in the error
-      message shown above: ``g`` and ``ʰ``.
+      message shown above: `g` and `ʰ`.
 
 2. Once you have isolated the characters that are not being converted to
    eng-arpabet, you are ready to begin debugging the issue. Start at
@@ -48,55 +57,55 @@ Follow these steps to debug the issue **in g2p**.
    problematic characters incorrectly. Most of the time, the issue will
    be in either the first or the second of the following mappings:
 
-   i.   *xyz-ipa* (where xyz is the ISO language code for your mapping)
-   ii.  *xyz-equiv* (if you have one)
-   iii. *xyz-ipa_to_eng-ipa* (this mapping must be generated
-        automatically in g2p. Refer //here_in_the_guide to see how to do
-        this.)
-   iv.  `eng-ipa-to-arpabet
-        mapping <https://github.com/roedoejet/g2p/blob/main/g2p/mappings/langs/eng/eng_ipa_to_arpabet.json>`__
-        (The issue is rarely found here, but it doesn’t hurt to check.)
+   1. *xyz-ipa* (where xyz is the ISO language code for your mapping)
+   2. *xyz-equiv* (if you have one)
+   3. *xyz-ipa_to_eng-ipa* (this mapping must be generated
+      automatically in g2p. Refer //here_in_the_guide to see how to do
+      this.)
+   4. [eng-ipa-to-arpabet
+      mapping](https://github.com/roedoejet/g2p/blob/main/g2p/mappings/langs/eng/eng_ipa_to_arpabet.json)
+      (The issue is rarely found here, but it doesn’t hurt to check.)
 
 4. Find a word in your text that uses the problematic character. For the
-   sake of example, let us assume the character I am debugging is ``g``,
+   sake of example, let us assume the character I am debugging is `g`,
    that appears in the word "dog", in language "xyz".
 
 5. Make sure you are in the g2p repository and run the word through
-   ``g2p convert`` to confirm you have isolated the correct characters
-   to debug: ``g2p convert dog xyz eng-arpabet``. Best practice is to
+   `g2p convert` to confirm you have isolated the correct characters
+   to debug: `g2p convert dog xyz eng-arpabet`. Best practice is to
    copy+paste the word directly from your text instead of retyping it.
    Make sure to use the ISO code for your language in place of "xyz".
    *If the word converts cleanly into eng-arpabet characters, your issue
    does not lie in your mapping. //Refer to other potential RA issues*
 
 6. From the result of the command run in 5, note the characters that do
-   **not** appear as **inputs** in the `eng-ipa-to-arpabet
-   mapping <https://github.com/roedoejet/g2p/blob/main/g2p/mappings/langs/eng/eng_ipa_to_arpabet.json>`__.
+   **not** appear as **inputs** in the [eng-ipa-to-arpabet
+   mapping](https://github.com/roedoejet/g2p/blob/main/g2p/mappings/langs/eng/eng_ipa_to_arpabet.json).
    These are the characters that have not been converted into characters
    that eng-ipa-to-arpabet can read. These should be the same characters
    you identified in step 2.
 
-7. Run ``g2p convert dog xyz xyz-ipa``. Ensure the result is what you
+7. Run `g2p convert dog xyz xyz-ipa`. Ensure the result is what you
    expect. If not, your error may arise from a problem in this mapping.
    refer_to_g2p_troubleshooting. If the result is what you expect,
    continue to the next step.
 
 8. Note the result from running the command in 7. Check that the
-   characters [TODO-fix this text] (appear/being mapped by generated --
+   characters \[TODO-fix this text\] (appear/being mapped by generated --
    use debugger or just look at mapping)
 
-.. |image1| image:: https://i.imgur.com/vKPhTud.png
-
-Type 2
-------
+## Type 2
 
-.. note:: TODO
+:::{note}
+TODO
+:::
 
 Common error type 2...
 
-Type 3
-------
+## Type 3
 
-.. note:: TODO
+:::{note}
+TODO
+:::
 
 Common error type 3...

From 191e1fbe292c413c8586a6ee90920bc26b9e419f Mon Sep 17 00:00:00 2001
From: Eric Joanis <eric.joanis@nrc-cnrc.gc.ca>
Date: Thu, 20 Jun 2024 13:54:54 -0400
Subject: [PATCH 2/5] refactor(docs): configure mkdocs and fix the .md files
 for it

Also remove the now obsolete .readthedocs.yml
---
 .readthedocs.yml        | 18 ---------
 docs/Contributing.md    | 27 +++++---------
 docs/advanced-use.md    | 22 +++++------
 docs/cli-guide.md       | 22 +++++------
 docs/cli-ref.md         | 72 +++++++++++++++---------------------
 docs/index.md           | 21 +----------
 docs/installation.md    |  2 -
 docs/outputs.md         | 27 +++++++-------
 docs/requirements.txt   | 10 +++--
 docs/start.md           |  6 +--
 docs/troubleshooting.md | 82 +++++++++++++++--------------------------
 mkdocs.yml              | 31 ++++++++++++++++
 12 files changed, 142 insertions(+), 198 deletions(-)
 delete mode 100644 .readthedocs.yml
 create mode 100644 mkdocs.yml

diff --git a/.readthedocs.yml b/.readthedocs.yml
deleted file mode 100644
index 4660f296..00000000
--- a/.readthedocs.yml
+++ /dev/null
@@ -1,18 +0,0 @@
-version: 2
-
-build:
-  os: ubuntu-20.04
-  tools:
-    python: "3.8"
-  jobs:
-    post_install:
-      - echo "Installing Studio itself in its current state"
-      - which pip python
-      - pip install -e .
-
-sphinx:
-  configuration: docs/conf.py
-
-python:
-  install:
-    - requirements: docs/requirements.txt
diff --git a/docs/Contributing.md b/docs/Contributing.md
index 2b4cb723..a1da745e 100644
--- a/docs/Contributing.md
+++ b/docs/Contributing.md
@@ -2,38 +2,31 @@
 
 ## Edit the files
 
-To contribute to the ReadAlongs Studio documentation, edit the `.rst` files in
+To contribute to the ReadAlongs Studio documentation, edit the `.md` files in
 this folder.
 
+The configuration is found in `../mkdocs.yml`.
+
 ## Build and view the documentation locally
 
 To build the documentation and review your own changes locally:
 
-1. Install the required build software, Sphinx:
+1. Install the required build software, mkdocs and friends:
 
-       pip install -r requirements.txt
+    pip install -r requirements.txt
 
 2. Install Studio itself
 
-       (cd .. && pip install -e .)
-
-3. Run one of these commands, which will build the documentation in `./_build/html/`
-   or `./_build/singlehtml/`:
-
-       make html  # multi-page HTML site
-       make singlehtml  # single-page HTML document
+    (cd .. && pip install -e .)
 
-2. View the documentation by running an HTTP server in the directory where the
-   build is found, e.g.,
+3. Run this command to serve the documentation locally:
 
-       cd _build/html
-       python3 -m http.server
+    (cd .. && mkdocs serve)
 
-   and navigating to http://127.0.0.1:8000 (or whatever port your local web
-   server displays).
+4. View the documentation by browsing to <http://localhost:8000>.
 
 ## Publish the changes
 
 Once your changes are pushed to GitHub and merged into `main` via a Pull
 Request, the documentation will automatically get built and published to
-https://readalong-studio.readthedocs.io/en/latest/
+<https://readalong-studio.readthedocs.io/en/latest/>
diff --git a/docs/advanced-use.md b/docs/advanced-use.md
index f010c094..0d925878 100644
--- a/docs/advanced-use.md
+++ b/docs/advanced-use.md
@@ -1,18 +1,16 @@
-(advanced-use)=
-
 # Advanced topics
 
-(adding-a-lang)=
-
 ## Adding a new language to g2p
 
 If you want to align an audio book in a language that is not yet supported by
 the g2p library, you will have to write your own g2p mapping for that language.
 
 References:
-: - The [g2p library](https://github.com/roedoejet/g2p) and its
-    [documentation](https://g2p.readthedocs.io/).
-  - The [7-part blog post on creating g2p mappings](https://blog.mothertongues.org/g2p-background/) on the [Mother Tongues Blog](https://blog.mothertongues.org/).
+
+ - The [g2p library](https://github.com/roedoejet/g2p) and its
+   [documentation](https://roedoejet.github.io/g2p).
+ - The [7-part blog post on creating g2p mappings](https://blog.mothertongues.org/g2p-background/)
+   on the [Mother Tongues Blog](https://blog.mothertongues.org/).
 
 Once you have created a g2p mapping for your language, please consider
 [contributing it to the project](https://blog.mothertongues.org/g2p-contributing/)
@@ -38,7 +36,7 @@ pip-installed. Keep in mind that Pydub uses milliseconds.
 If your data is currently 1 audio file, you will need to split it into
 segments where you want to put the silences.
 
-```
+```py
 ten_seconds = 10 * 1000
 first_10_seconds = soundtrack[:ten_seconds]
 last_5_seconds = soundtrack[-5000:]
@@ -47,7 +45,7 @@ last_5_seconds = soundtrack[-5000:]
 Once you have your segments, create an MP3 file containing only 1 second
 of silence.
 
-```
+```py
 from pydub import AudioSegment
 
 wfile = "appended_1000ms.mp3"
@@ -57,7 +55,7 @@ soundtrack = silence
 
 Then you loop the audio files you want to append (segments and silence).
 
-```
+```py
 seg = AudioSegment.from_mp3(mp3file)
 soundtrack = soundtrack + silence + seg
 ```
@@ -65,7 +63,7 @@ soundtrack = soundtrack + silence + seg
 Write the soundtrack file as an MP3. This will then be the audio input
 for your Read-Along.
 
-```
+```py
 soundtrack.export(wfile, format="mp3")
 ```
 
@@ -83,7 +81,7 @@ of their supported languages), consider adding a library like
 [num2words](https://github.com/savoirfairelinux/num2words) to your
 pre-processing.
 
-```
+```txt
 num2words 123456789
 one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine
 ```
diff --git a/docs/cli-guide.md b/docs/cli-guide.md
index a335fb91..5c0f1583 100644
--- a/docs/cli-guide.md
+++ b/docs/cli-guide.md
@@ -1,9 +1,7 @@
-(cli-guide)=
-
 # Command line interface (CLI) user guide
 
 This page contains guidelines on using the ReadAlongs CLI. See also
-{ref}`cli-ref` for the full CLI reference.
+[Command line interface (CLI) reference](cli-ref.md) for the full CLI reference.
 
 The ReadAlongs CLI has two main commands: `readalongs make-xml` and
 `readalongs align`.
@@ -32,7 +30,7 @@ then used as input to `align`.
 
 ## Getting from TXT to XML with readalongs make-xml
 
-Run {ref}`cli-make-xml` to make the ReadAlongs XML file for `align` from a TXT file.
+Run [`readalongs make-xml`][readalongs-make-xml] to make the ReadAlongs XML file for `align` from a TXT file.
 
 `readalongs make-xml [options] [story.txt] [story.readalong]`
 
@@ -46,7 +44,7 @@ breaks are marked by two blank lines.
 
 | Key Options                    | Option descriptions                                                                                                   |
 | ------------------------------ | --------------------------------------------------------------------------------------------------------------------- |
-| `-l, --language(s)` (required) | The language code for story.txt. Specifying multiple comma- or colon-separated languages triggers {ref}`g2p-cascade`. |
+| `-l, --language(s)` (required) | The language code for story.txt. Specifying multiple comma- or colon-separated languages triggers the [g2p cascade][the-g2p-cascade]. |
 | `-f, --force-overwrite`        | Force overwrite output files (handy if you're troubleshooting and will be aligning repeatedly)                        |
 | `-h, --help`                   | Displays CLI guide for `make-xml`                                                                                     |
 
@@ -54,7 +52,7 @@ The `-l, --language` argument requires a language’s 3 character [ISO
 code](https://en.wikipedia.org/wiki/ISO_639-3) as an argument.
 
 The languages supported by RAS can be listed by running `readalongs make-xml -h`
-and they can also be found in the {ref}`cli-make-xml` reference.
+and they can also be found in the [`readalongs make-xml`][readalongs-make-xml] reference.
 
 So, a full command for a story in Algonquin, with an implicit g2p fallback to
 Undetermined, would be something like:
@@ -65,8 +63,8 @@ The generated XML will be parsed in to sentences. At this stage you can
 edit the XML to have any modifications, such as adding `do-not-align`
 as an attribute of any element in your XML.
 
-The format of the generated XML is based on \[TEI
-Lite\](<https://tei-c.org/guidelines/customization/lite/>) but is
+The format of the generated XML is based on [TEI
+Lite](https://tei-c.org/guidelines/customization/lite/) but is
 considerably simplified.  The DTD (document type definition) can be
 found in the ReadAlong Studio source code under
 `readalongs/static/read-along-1.0.dtd`.
@@ -120,7 +118,7 @@ To use DNA audio, you can specify a timeframe in milliseconds in the
 
 ## Aligning your text and audio with readalongs align
 
-Run {ref}`cli-align` to align a text file (RAS or TXT) and an audio file to
+Run [`readalongs align`][readalongs-align] to align a text file (RAS or TXT) and an audio file to
 create a time-aligned audiobook.
 
 `readalongs align [options] [story.txt/xml] [story.mp3/wav] [output_base]`
@@ -135,7 +133,7 @@ created, as `output_base*`
 
 | Key Options             | Option descriptions                                                                                                                                     |
 | ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `-l, --language(s)`     | The language code for story.txt. Specifying multiple comma- or colon-separated languages triggers {ref}`g2p-cascade`. (required if input is plain text) |
+| `-l, --language(s)`     | The language code for story.txt. Specifying multiple comma- or colon-separated languages triggers the [g2p cascade][the-g2p-cascade]. (required if input is plain text) |
 | `-c, --config PATH`     | Use ReadAlong-Studio configuration file (in JSON format)                                                                                                |
 | `--debug-g2p`           | Display verbose g2p debugging messages                                                                                                                  |
 | `-s, --save-temps`      | Save intermediate stages of processing and temporary files (dictionary, FSG, tokenization, etc.)                                                        |
@@ -167,7 +165,7 @@ Here is that list at the time of compiling this documentation:
 .. command-output:: readalongs langs
 ```
 
-See {ref}`adding-a-lang` for references on adding new languages to that list.
+See [Adding a new language to g2p][adding-a-new-language-to-g2p] for references on adding new languages to that list.
 
 ## Adding titles, images and do-not-align segments via the config.json file
 
@@ -225,8 +223,6 @@ separate elements in a list or dictionnary, but if you accidentally have
 a comma after the last element (e.g., by cutting and pasting whole
 lines), you will get a syntax error.
 
-(g2p-cascade)=
-
 ## The g2p cascade
 
 Sometimes the g2p conversion of the input text will not succeed, for
diff --git a/docs/cli-ref.md b/docs/cli-ref.md
index 2291a14c..7c2f3011 100644
--- a/docs/cli-ref.md
+++ b/docs/cli-ref.md
@@ -1,53 +1,41 @@
-(cli-ref)=
-
 # Command line interface (CLI) reference
 
 This page contains the full reference documentation for each command in the CLI.
-See also {ref}`cli-guide` for guidelines on using the CLI.
+See also [Command line interface (CLI) user guide](cli-guide.md) for guidelines on using the CLI.
 
 The ReadAlongs CLI has five key commands:
 
-- {ref}`cli-align`: full alignment pipeline, from plain text or XML to a
+- [`readalongs align`][readalongs-align]: full alignment pipeline, from plain text or XML to a
   viewable readalong
-- {ref}`cli-make-xml`: convert a plain text file into XML, for align
-- {ref}`cli-tokenize`: tokenize an XML file
-- {ref}`cli-g2p`: g2p a tokenized XML file
-- {ref}`cli-langs`: list supported languages
+- [`readalongs make-xml`][readalongs-make-xml]: convert a plain text file into XML, for align
+- [`readalongs tokenize`][readalongs-tokenize]: tokenize an XML file
+- [`readalongs g2p`][readalongs-g2p]: g2p a tokenized XML file
+- [`readalongs langs`][readalongs-langs]: list supported languages
 
 Each command can be run with `-h` or `--help` to display its usage manual,
 e.g., `readalongs -h`, `readalongs align --help`.
 
-(cli-align)=
-
-```{eval-rst}
-.. click:: readalongs.cli:align
-  :prog: readalongs align
-```
-
-(cli-make-xml)=
-
-```{eval-rst}
-.. click:: readalongs.cli:make_xml
-  :prog: readalongs make-xml
-```
-
-(cli-tokenize)=
-
-```{eval-rst}
-.. click:: readalongs.cli:tokenize
-  :prog: readalongs tokenize
-```
-
-(cli-g2p)=
-
-```{eval-rst}
-.. click:: readalongs.cli:g2p
-  :prog: readalongs g2p
-```
-
-(cli-langs)=
-
-```{eval-rst}
-.. click:: readalongs.cli:langs
-  :prog: readalongs langs
-```
+::: mkdocs-click
+    :module: readalongs.cli
+    :command: align
+    :prog_name: readalongs align
+
+::: mkdocs-click
+    :module: readalongs.cli
+    :command: make_xml
+    :prog_name: readalongs make-xml
+
+::: mkdocs-click
+    :module: readalongs.cli
+    :command: tokenize
+    :prog_name: readalongs tokenize
+
+::: mkdocs-click
+    :module: readalongs.cli
+    :command: g2p
+    :prog_name: readalongs g2p
+
+::: mkdocs-click
+    :module: readalongs.cli
+    :command: langs
+    :prog_name: readalongs langs
diff --git a/docs/index.md b/docs/index.md
index 16f2d459..749f2478 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -2,23 +2,4 @@
 
 Audiobook alignment for Indigenous languages
 
-This site provides the full user documentation for ReadAlongs-Studio.
-
-```{toctree}
-:caption: 'Contents:'
-:maxdepth: 2
-
-start
-installation
-cli-guide
-cli-ref
-outputs
-advanced-use
-troubleshooting
-```
-
-# Indices and tables
-
-- {ref}`genindex`
-- {ref}`modindex`
-- {ref}`search`
+This site provides the user documentation for ReadAlongs-Studio.
diff --git a/docs/installation.md b/docs/installation.md
index 84bed1d7..b530c237 100644
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -1,5 +1,3 @@
-(installation)=
-
 # Installation
 
 See [ReadAlongs/Studio/README.md](https://github.com/ReadAlongs/Studio#install)
diff --git a/docs/outputs.md b/docs/outputs.md
index 73c196fc..86093136 100644
--- a/docs/outputs.md
+++ b/docs/outputs.md
@@ -1,16 +1,12 @@
-% outputs:
-
 # Output Realizations
 
 One of the main motivations for ReadAlong-Studio was to provide a one-stop-shop for audio/text alignment.
 With that in mind, there are a variety of different output formats that can be created. Here are a few:
 
-## Elan/Praat files
-
 ## Web Component
 
-When you have standard output from ReadAlong-Studio, consisting of 1) a ReadALong file (XML) and 2) an audio file
-you can mobilize these files to the web or hybrid mobile apps quickly and painlessly.
+The standard output from ReadAlong-Studio consists of 1) a ReadAlong file (XML) and 2) an audio file,
+which you can mobilize to the web or hybrid mobile apps quickly and painlessly.
 
 This is done using the ReadAlong WebComponent. Web components are re-useable, custom-defined HTML elements that you can embed in any HTML, regardless of which
 framework you used to build your site, whether React, Angular, Vue, or just Vanilla HTML/CSS/JS.
@@ -20,7 +16,6 @@ Below is an example of a minimal implementation in a basic standalone html page.
 ```html
 <!DOCTYPE html>
 <html>
-
     <head>
         <!-- Import fonts. Material Icons are needed by the web component -->
         <link href="https://fonts.googleapis.com/css?family=Lato|Material+Icons|Material+Icons+Outlined" rel="stylesheet">
@@ -31,20 +26,26 @@ Below is an example of a minimal implementation in a basic standalone html page.
         <read-along href="assets/sample.readalong" audio="assets/sample.wav"></read-along>
     </body>
     <!-- The last step needed is to import the package -->
-   <script type="module" src='https://unpkg.com/@readalongs/web-component@^1.4.0/dist/web-component/web-component.esm.js'></script>
+    <script type="module" src='https://unpkg.com/@readalongs/web-component@^1.4.0/dist/web-component/web-component.esm.js'></script>
 </html>
 ```
 
 The above assumes the following structure:
 
+```txt
 web
-
 ├── assets
-
 │   ├── sample.wav
+│   └── sample.readalong
+└── index.html
+```
 
-│   ├── sample.readalong
+Then you can host your site anywhere, or run it locally (`cd web && python3 -m http.server` for example)
 
-├── index.html
+## Single-file HTML
 
-Then you can host your site anywhere, or run it locally (`cd web && python3 -m http.server` for example)
+With `-o html`, the ReadAlongs Studio will output a completely self-contained HTML file that can be shared by e-mail and used without requiring any access to the Internet.
+
+## ELAN/Praat files
+
+With `-o eaf` or `-o TextGrid`, the ReadAlongs Studio can output files in the ELAN eaf and Praat TextGrid file formats.
diff --git a/docs/requirements.txt b/docs/requirements.txt
index 16c6e0eb..251201ca 100644
--- a/docs/requirements.txt
+++ b/docs/requirements.txt
@@ -1,5 +1,7 @@
-Sphinx
-guzzle_sphinx_theme
-sphinx-click
-sphinxcontrib-programoutput
+mkdocs>=1.6.0
+mkdocs-click>=0.8.1
+mkdocs-material>=9.5.27
+mkdocs-autorefs>=1.0.1
+mkdocstrings[python]>=0.25.1
+mike>=2.1.1
 -r ../requirements.min.txt
diff --git a/docs/start.md b/docs/start.md
index cbb7d765..e5bd9775 100644
--- a/docs/start.md
+++ b/docs/start.md
@@ -1,5 +1,3 @@
-% start:
-
 # Getting Started
 
 This library is an end-to-end audio/text aligner. It is meant to be used
@@ -16,7 +14,7 @@ which ultimately leads to a time-aligned audiobook, i.e., a package of:
 - HTML file describing the web component
 
 Which can be loaded using the [read-along web
-component](https://github.com/roedoejet/ReadAlong-Web-Component).
+component](https://github.com/ReadAlongs/Studio-Web/tree/main/packages/web-component).
 
 A book is generated as a standalone HTML page by default, but can
 optionally be generated as an ePub file.
@@ -24,7 +22,7 @@ optionally be generated as an ePub file.
 ## Required knowledge
 
 - How to use a [Command-line interface (CLI)](https://en.wikipedia.org/wiki/Command-line_interface).
-- How to edit and manipulate plain text, [XML](https://www.w3.org/standards/xml/core) and [SMIL](https://www.w3.org/TR/smil/) files using a text editor or a code editor.
+- How to edit and manipulate plain text and [XML](https://www.w3.org/standards/xml/core) files using a text editor or a code editor.
 - How to edit and examine an audio file with [Audacity](https://www.audacityteam.org/) or similar software.
 - How to spin up a local web server (e.g., see [How do you set up a local testing server?](https://developer.mozilla.org/en-US/docs/Learn/Common_questions/set_up_a_local_testing_server))
 
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
index 6df8c32f..f29bce6a 100644
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -1,28 +1,19 @@
----
-substitutions:
-  image1: |-
-    ```{image} https://i.imgur.com/vKPhTud.png
-    ```
----
-
-(troubleshooting)=
-
-:::{note}
-This troubleshooting guide is under construction.
-:::
+!!! note
+    This troubleshooting guide is under construction.
 
 # Troubleshooting
 
-Here are three types of common errors you may encounter when trying to
+This document is intended to list common errors you may encounter when trying to
 run ReadAlongs, and ways to debug them.
 
+So far it covers only one type of error, but more can be added here as needed.
+
 ## Phones missing in the acoustic model
 
-:::{note}
-Troubleshooting item under construction
-:::
+!!! note
+    Troubleshooting item under construction
 
-You may get an error that looks like this:{{ image1 }}
+You may get an error that looks like this: ![error screen capture](https://i.imgur.com/vKPhTud.png)
 
 The general structure of your error would look like
 `Phone [character] is missing in the acoustic model; word [index] ignored`
@@ -36,18 +27,19 @@ because it cannot understand what sound the text is meant to represent.
 Follow these steps to debug the issue **in g2p**.
 
 1. Identify which characters in each line of the error message are
-   **not** being converted to eng-arpabet. These will either be:
-
-   1. characters that are not in caps (for example `g` in the string
-      `gUW` in the error message shown above.)
-   2. a character not traditionally used in English (for example é or Ŧ,
-      or `ʰ` in the error message shown above.) You can confirm you
-      have isolated the right characters by ensuring every other
-      character in your error message appears as an **output** in the
-      [eng-ipa-to-arpabet
-      mapping](https://github.com/roedoejet/g2p/blob/main/g2p/mappings/langs/eng/eng_ipa_to_arpabet.json).
-      These are the problematic characters we need to debug in the error
-      message shown above: `g` and `ʰ`.
+   **not** being converted to eng-arpabet. These will either be:
+
+    1. characters that are not in caps (for example `g` in the string
+       `gUW` in the error message shown above.)
+
+    2. a character not traditionally used in English (for example `é` or `Ŧ`,
+       or `ʰ` in the error message shown above.) You can confirm you
+       have isolated the right characters by ensuring every other
+       character in your error message appears as an **output** in the
+       [eng-ipa-to-arpabet
+       mapping](https://github.com/roedoejet/g2p/blob/main/g2p/mappings/langs/eng/eng_ipa_to_arpabet.json).
+       These are the problematic characters we need to debug in the error
+       message shown above: `g` and `ʰ`.
 
 2. Once you have isolated the characters that are not being converted to
    eng-arpabet, you are ready to begin debugging the issue. Start at
@@ -57,14 +49,14 @@ Follow these steps to debug the issue **in g2p**.
    problematic characters incorrectly. Most of the time, the issue will
    be in either the first or the second of the following mappings:
 
-   1. *xyz-ipa* (where xyz is the ISO language code for your mapping)
-   2. *xyz-equiv* (if you have one)
-   3. *xyz-ipa_to_eng-ipa* (this mapping must be generated
-      automatically in g2p. Refer //here_in_the_guide to see how to do
-      this.)
-   4. [eng-ipa-to-arpabet
-      mapping](https://github.com/roedoejet/g2p/blob/main/g2p/mappings/langs/eng/eng_ipa_to_arpabet.json)
-      (The issue is rarely found here, but it doesn’t hurt to check.)
+    1. *xyz-ipa* (where xyz is the ISO language code for your mapping)
+    2. *xyz-equiv* (if you have one)
+    3. *xyz-ipa_to_eng-ipa* (this mapping must be generated
+       automatically in g2p. Refer //here_in_the_guide to see how to do
+       this.)
+    4. [eng-ipa-to-arpabet
+       mapping](https://github.com/roedoejet/g2p/blob/main/g2p/mappings/langs/eng/eng_ipa_to_arpabet.json)
+       (The issue is rarely found here, but it doesn’t hurt to check.)
 
 4. Find a word in your text that uses the problematic character. For the
    sake of example, let us assume the character I am debugging is `g`,
@@ -93,19 +85,3 @@ Follow these steps to debug the issue **in g2p**.
 8. Note the result from running the command in 7. Check that the
    characters \[TODO-fix this text\] (appear/being mapped by generated --
    use debugger or just look at mapping)
-
-## Type 2
-
-:::{note}
-TODO
-:::
-
-Common error type 2...
-
-## Type 3
-
-:::{note}
-TODO
-:::
-
-Common error type 3...
diff --git a/mkdocs.yml b/mkdocs.yml
new file mode 100644
index 00000000..5b83d0b9
--- /dev/null
+++ b/mkdocs.yml
@@ -0,0 +1,31 @@
+site_name: ReadAlong-Studio
+theme:
+  name: material
+  features:
+    - content.code.copy
+    - navigation.instant
+plugins:
+  - search
+  - autorefs
+  - mkdocstrings:
+      default_handler: python
+      handlers:
+        python:
+          paths: [readalongs]
+extra:
+  version:
+    provider: mike
+    default: stable
+markdown_extensions:
+  - mkdocs-click
+  - admonition
+  - def_list
+nav:
+  - Home: index.md
+  - Start: start.md
+  - Installation: installation.md
+  - CLI Guide: cli-guide.md
+  - CLI reference: cli-ref.md
+  - Output Realizations: outputs.md
+  - Advanced Topics: advanced-use.md
+  - Troubleshooting: troubleshooting.md

From da8b6b1291568be2c0c4e9680b12664cae5de5b4 Mon Sep 17 00:00:00 2001
From: Eric Joanis <eric.joanis@nrc-cnrc.gc.ca>
Date: Fri, 21 Jun 2024 12:54:41 -0400
Subject: [PATCH 3/5] docs: add a link to the old documentation site

---
 docs/index.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/index.md b/docs/index.md
index 749f2478..2f652070 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -3,3 +3,6 @@
 Audiobook alignment for Indigenous languages
 
 This site provides the user documentation for ReadAlongs-Studio.
+
+The documentation for versions starting from v1.0.20230228 can be found here.
+The documentation for older versions remains available at <https://readalong-studio.readthedocs.io>.

From 91ee9e5e9597369707807dfe9add26bbf33fb8a5 Mon Sep 17 00:00:00 2001
From: Eric Joanis <eric.joanis@nrc-cnrc.gc.ca>
Date: Fri, 21 Jun 2024 14:47:04 -0400
Subject: [PATCH 4/5] docs: change the docs badge to the new docs location and
 workflow

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 8f1b6113..15f7b4cf 100644
--- a/README.md
+++ b/README.md
@@ -6,7 +6,7 @@
 [![Deploy web-api](https://img.shields.io/badge/%E2%86%91_Deploy_to-Heroku-7056bf.svg)](https://readalong-studio.herokuapp.com/api/v1/docs)
 [![GitHub license](https://img.shields.io/github/license/ReadAlongs/Studio)](https://github.com/ReadAlongs/Studio/blob/main/LICENSE)
 [![standard-readme compliant](https://img.shields.io/badge/readme%20style-standard-brightgreen.svg)](https://github.com/ReadAlongs/Studio)
-[![Documentation Status](https://readthedocs.org/projects/readalong-studio/badge/)](https://readalong-studio.readthedocs.io)
+[![Documentation](https://github.com/ReadAlongs/Studio/actions/workflows/docs.yml/badge.svg)](https://readalongs.github.io/Studio/)
 
 > Audiobook alignment for Indigenous languages!
 

From f3d2f05bab73e5b594f8cfb873e6953e271d2558 Mon Sep 17 00:00:00 2001
From: Eric Joanis <eric.joanis@nrc-cnrc.gc.ca>
Date: Fri, 21 Jun 2024 17:44:45 -0400
Subject: [PATCH 5/5] ci: update latest dev docs on push to main

---
 .github/workflows/docs.yml | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)
 create mode 100644 .github/workflows/docs.yml

diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
new file mode 100644
index 00000000..a819973c
--- /dev/null
+++ b/.github/workflows/docs.yml
@@ -0,0 +1,30 @@
+name: Deploy docs
+on:
+  push:
+    branches:
+      - main
+jobs:
+  docs:
+    # Create latest docs
+    runs-on: ubuntu-latest
+    permissions:
+      contents: write  # to push to the gh-pages branch
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0  # needed to get the gh-pages branch
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.8"
+      - name: Install dependencies and Studio
+        run: |
+          python -m pip install --upgrade pip
+          pip install wheel
+          pip install -r docs/requirements.txt -e .
+      - name: Setup doc deploy
+        run: |
+          git config user.name 'github-actions[bot]'
+          git config user.email 'github-actions[bot]@users.noreply.github.com'
+      - name: Deploy docs with mike 🚀
+        run: |
+          mike deploy --push --update-aliases dev latest