
4. Marks

nguyenvukhang edited this page Oct 21, 2024 · 1 revision

Every mark has a unique 7-character hexadecimal identifier, much like an abbreviated git commit hash. Naming this 7-character entity was a tough decision, but I've decided to call it a SHA.

Here's an example of a mark of type Theorem with the title Liouville's Theorem, and with a SHA of cf6d8a9:

\Theorem{Liouville's Theorem}\label{cf6d8a9}

A bounded \href{d508dc8}{entire} function is constant.

\begin{proof}
...
\end{proof}

With this setup, we can uniquely identify marks by their SHA. Referencing this theorem in other parts of the book becomes as simple as

... \href{cf6d8a9}{Clearly}, $f$ is a constant and hence ...

And with some LaTeX magic, this project also supports \autoref referencing:

... By \autoref{cf6d8a9}, $f$ is a constant and hence ...

where pdfTeX will take over at compile-time and figure out the display text of the link. For instance, the snippet above could appear in the output PDF as

... By Theorem 4.1.8, f is a constant and hence ...

Alternative referencing methods

Previously, I've tried some other methods of referencing marks in my notes.

  1. Use some shortened abbreviation of the mark's name (e.g. liouvilles-theorem)
  2. Manually assign an index number (e.g. 1.2.4) to each mark.

The issue with abbreviations is that they have inconsistent lengths in code and hence add visual noise while editing. Furthermore, deciding how to abbreviate a given mark is one extra subjective thing to think about, not to mention the possible collisions between abbreviations. It's enough overhead as it is to come up with a title for a mark with an obscure claim.

As for manual indexing, while it may look simpler than SHAs at first, consider the consequences when we want to move one particular mark from one chapter to another. That's gonna be an O(n) operation in the number of marks that come after it.

If we move Theorem 3.1.2 out, we would have to manually rename all the marks numbered 3.1.3, 3.1.4, and so on. This is repeated on the receiving end too.

SHAs help to keep this operation O(1), since we can sit back and let the LaTeX engine handle the theorem numbering at PDF build-time.

Another issue with using index numbers is their searchability. It happens that the period symbol . in regex matches any character, so querying 3.1.2 might yield more results than what we want. In addition, there will be index number collisions across chapters unless we extend the index numbering to 4 numbers: 10.3.1.2, at which point we might as well use SHAs.
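
The over-matching is easy to reproduce. Here is a minimal Python sketch (the sample lines are invented for illustration, not taken from the project) comparing an unescaped index query, an escaped one, and a SHA query:

```python
import re

# Hypothetical source lines, invented for this illustration.
lines = [
    r"\Theorem{Some theorem}\label{3.1.2}",
    r"See equation 3a1b2 in the appendix.",
    r"The value is 30102 in this table.",
    r"\Theorem{Liouville's Theorem}\label{cf6d8a9}",
]

def hits(pattern):
    """Return the lines that contain a match for the regex pattern."""
    return [line for line in lines if re.search(pattern, line)]

# Unescaped: each '.' matches any character, so unrelated lines match too.
unescaped = hits("3.1.2")

# Escaped: only the literal string "3.1.2" matches.
escaped = hits(re.escape("3.1.2"))

# A SHA contains no regex metacharacters, so it needs no escaping.
sha = hits("cf6d8a9")

print(len(unescaped), len(escaped), len(sha))  # prints: 3 1 1
```

The index number only behaves once every period is escaped, while the SHA can be typed into a search as-is.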

Finally, the minimath binary offers auto-generation of new SHAs. Running minimath label at the root of this project will automatically label all unlabelled marks. If it sees a line that goes

\Proposition{The empty event has probability zero}

It will overwrite it with

\Proposition{The empty event has probability zero}\label{a0a9280}

This annihilates the mental overhead of thinking of unique identifiers for marks.
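
How minimath picks and inserts a new SHA isn't described on this page, so the following is only a sketch of the idea in Python, not the tool's implementation. It assumes that a mark occupies a whole line starting with a `\Kind{Title}` macro, that a SHA is 7 random hex characters, and that the mark kinds are those in the hypothetical `KINDS` tuple:

```python
import re
import secrets

# Assumed set of mark macros; the real project may define others.
KINDS = ("Theorem", "Lemma", "Proposition", "Result")

# A line that starts with one of the mark macros.
MARK = re.compile(r"^\\(?:%s)\{" % "|".join(KINDS))
# An existing label holding a 7-character hex SHA.
LABEL = re.compile(r"\\label\{([a-f0-9]{7})\}")

def new_sha(taken):
    """Draw random 7-hex-digit strings until one is unused."""
    while True:
        sha = secrets.token_hex(4)[:7]
        if sha not in taken:
            taken.add(sha)
            return sha

def label_marks(lines):
    """Append \\label{<sha>} to every mark line that has no label yet."""
    taken = {m.group(1) for line in lines for m in LABEL.finditer(line)}
    out = []
    for line in lines:
        if MARK.match(line) and not LABEL.search(line):
            line = line.rstrip() + r"\label{%s}" % new_sha(taken)
        out.append(line)
    return out

src = [
    r"\Proposition{The empty event has probability zero}",
    r"\Theorem{Liouville's Theorem}\label{cf6d8a9}",
]
for line in label_marks(src):
    print(line)
```

Marks that already carry a label are left untouched, so under these assumptions running the function twice changes nothing the second time.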

Searching marks

SHAs play nicely with (neo)vim because each one is recognized as a single word (see: cword), so we can search for it with * or pass it into a search function when the cursor is on any part of the SHA. Moreover, a SHA is not a real word, so collisions with other text are minimized.

Also, we can use tools like ripgrep to search the codebase for marks, and then fuzzy search over this catalogue. For example:

rg --type tex '^\\(Theorem|Lemma|Result).*\\label\{[a-f0-9]{7}\}'

This yields every line, across all TeX files, that follows this template.

This leads us into minimath-rg.

Creating a search index with minimath-rg

minimath-rg exists as a standalone C binary because this project is organized consistently enough that ripgrep's full capabilities are overkill. With just under 150 lines of C, we can create an index of all the marks in this project.

minimath-rg will look for lines such as this (ignore the comment, that's just the filename):

% lib/complex_analysis/basics.tex
\Theorem{Liouville's Theorem}\label{cf6d8a9}

and convert it to a plain-text line:

lib/complex_analysis/basics.tex:783:Theorem:Liouville's Theorem:cf6d8a9

It outputs the entire list to stdout, where it can be read by telescope for fuzzy search. This allows us to very quickly jump to a mark of our choice to edit it, or to create a new reference to the mark.
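
The real minimath-rg is written in C and its source isn't reproduced here. As a rough illustration of the transformation it performs, here is a Python sketch that turns labelled mark lines into colon-separated records shaped like the one above. The function name `index_lines` and the list of accepted mark kinds are assumptions made for this example:

```python
import re

# Assumed mark kinds; a labelled mark looks like \Kind{Title}\label{sha}.
MARK = re.compile(
    r"^\\(Theorem|Lemma|Proposition|Result|Algorithm)"
    r"\{(.*)\}\\label\{([a-f0-9]{7})\}"
)

def index_lines(path, text):
    """Yield one 'path:lineno:Kind:Title:SHA' record per labelled mark."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = MARK.match(line)
        if m:
            kind, title, sha = m.groups()
            yield f"{path}:{lineno}:{kind}:{title}:{sha}"

sample = "\n".join([
    r"Some prose that is not a mark.",
    r"\Theorem{Liouville's Theorem}\label{cf6d8a9}",
    r"A bounded \href{d508dc8}{entire} function is constant.",
])
records = list(index_lines("lib/complex_analysis/basics.tex", sample))
print("\n".join(records))
# prints: lib/complex_analysis/basics.tex:2:Theorem:Liouville's Theorem:cf6d8a9
```

Because every record is a single line of plain text, any line-oriented picker can consume the output directly.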

╭────────────────────────────── Results ───────────────────────────────╮
│                                                                      │
│                                                                      │
│  [l/NOU] Proposition: Acceptance of full step-size in globalized Newt│
│  [l/NMA] Theorem: Conditions for invertible R in QR factorization    │
│  [l/NOU] Algorithm: Globalized Newton's method for unconstrained opti│
│  [l/CAL] Theorem: Leibniz integral rule                              │
│  [l/REA] Theorem: Sequence converging absolutely to zero also converg│
│  [l/STC] Proposition: Probability of nothing is zero                 │
│  [l/LNA] Theorem: Uniqueness of basis size                           │
│  [l/REA] Theorem: Bolzano-Weierstrass Theorem                        │
│  [l/LNA] Lemma: Gram-Schmidt on a basis does not produce zeros       │
│  [l/FUN] Lemma: Bilinear forms send zeros to zeros                   │
╰──────────────────────────────────────────────────────────────────────╯
╭────────────────────────────── Theorems ──────────────────────────────╮
│> bz                                                         10 / 1166│
╰──────────────────────────────────────────────────────────────────────╯

Searching "bz" in telescope. Notice the fuzzy matches!

This makes writing more of the book much quicker.
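
The results for "bz" above are consistent with case-insensitive subsequence matching: each title contains a "b" followed somewhere later by a "z". Telescope's actual sorter also ranks the matches and is more sophisticated than this, so treat the following Python sketch as an approximation of the filtering step only:

```python
def is_subsequence(query, text):
    """True if the characters of query appear in text in order, ignoring case."""
    remaining = iter(text.lower())
    # 'ch in remaining' consumes the iterator up to and including the match,
    # so each later character must be found after the earlier ones.
    return all(ch in remaining for ch in query.lower())

titles = [
    "Theorem: Leibniz integral rule",
    "Theorem: Bolzano-Weierstrass Theorem",
    "Theorem: Uniqueness of basis size",
    "Theorem: Liouville's Theorem",
]
matches = [t for t in titles if is_subsequence("bz", t)]
print(matches)
```

Only the Liouville entry is filtered out, since it contains no "b" at all.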

 
