diff --git a/README.md b/README.md index dbd30b6..1d95e97 100644 --- a/README.md +++ b/README.md @@ -14,39 +14,37 @@ stitchfinder ``` Invoking stitchfinder produces a table. Each row corresponds to a stitch, where the first column contains the **stitch word**. -Note that there might be multiple rows with the same stitch word; a stitch word can sometimes be produced multiple times from -the same given word but different found words, e.g. `ali` produces `alice` with `ice` or `lice`. -Also note that the stitch word might be the same as the given or found word; a found word may start/end with the given word -and vice versa. ### Example Output The following is first 10 lines of `stitchfinder popular.txt twink`: ``` -Stitched Found I-sect Rem-given Rem-found Pos Valid X-given X-pos -atwink at t wink a right false -atwinkle at t winkle a right false twinkle left -abandonmentwink abandonment t wink abandonmen right false -abandonmentwinkle abandonment t winkle abandonmen right false twinkle left -abbotwink abbot t wink abbo right false -abbotwinkle abbot t winkle abbo right false twinkle left -abductwink abduct t wink abduc right false -abductwinkle abduct t winkle abduc right false twinkle left -abortwink abort t wink abor right false + Stitched Pos-given Pos-expans Valid I-sect Expansion Found Rem-expans Rem-found + atwink right false t twink at wink a + atwinkle left right false t twinkle at winkle a + abandonmentwink right false t twink abandonment wink abandonmen + abandonmentwinkle left right false t twinkle abandonment winkle abandonmen + abbotwink right false t twink abbot wink abbo + abbotwinkle left right false t twinkle abbot winkle abbo + abductwink right false t twink abduct wink abduc + abductwinkle left right false t twinkle abduct winkle abduc + abortwink right false t twink abort wink abor ``` The meaning of the columns is: -- `Stitched`: stitched (final) words -- `Found`: words from `popular.txt` which stitch into the given word (`twink`) -- `I-sect` 
(intersection): text which the given word and each final word has, which allows the two to stitch together -- `Rem-given` (remaining given): text of the given word without the intersection -- `Rem-found` (remaining found): text of the found word without the intersection -- `Pos` (position): where the given word is relative to the found word -- `Valid`: whether the remaining given and found are valid words (i.e. are they in `popular.txt`) -- `X-given` (expanded given): see [Expansion](#expansion) -- `X-pos` (expanded position): see [Expansion](#expansion) +- `Stitched`: the word which stitches `Expansion` and `Found` together, with the overlap being `I-sect` +- `Pos-given` (position of given): where the given word is relative to `Expansion`. Blank when the expansion is the given word. +- `Pos-expans` (position of expansion): where `Expansion` is relative to `Found` in `Stitched`. +- `Valid`: whether the `Rem-expans` and `Rem-found` are valid words (i.e. are they in `popular.txt`) +- `I-sect` (intersection): the overlapping text between `Expansion` and `Found` +- `Expansion`: what the given word expanded to. May be the same as the given word. +- `Found`: the word from `popular.txt` that `Expansion` stitches with +- `Rem-expans` (remaining expansion): text of `Expansion` without `I-sect` +- `Rem-found` (remaining found): text of `Found` without `I-sect` + +For more information about `Expansion` and `Pos-given`, see [Expansion](#expansion). ### With Nushell @@ -72,53 +70,58 @@ and had no filtering applied (there aren't any symbols to filter). ## Expansion -Expansion is a somewhat-complex feature of stitchfinder. It takes the given word and expands it into -other, valid words, and then uses these expanded words as their own given words. +Expansion is the first step of stitchfinder, and may be disabled with `--disable-expansion`. It takes the given word +and expands it into as many words within the words file as possible (including the given word itself), and then tries +to stitch using those words. 
For example, let `ia` be the given word. `ia` expands into: - `iambic` from the left side, which stitches with - `insignia` from the left side to make `insigniambic` - `bicycle` from the right side to make `iambicycle` -- `maria` from the right side, which stitches with +- `aria` from the right side, which stitches with - `avatar` from the left to make `avataria` - `iambic` from the right to make `ariambic` Thus, the output includes: ``` -Stitched Found I-sect Rem-given Rem-found Pos Valid X-given X-pos -ariambic aria ia mbic ar right false iambic left -ariambic iambic ia ar mbic left false aria right -avataria avatar ar ia avat right false aria right -iambicycle bicycle bic iam ycle left false iambic left -iambicycle cycle c iambi ycle left false iambic left -insigniambic insignia ia mbic insign right false iambic left -insigniambic iambic ia insign mbic left false insignia right + Stitched Pos-given Pos-expans Valid I-sect Expansion Found Rem-expans Rem-found + ariambic left right false ia iambic aria mbic ar + ariambic right left false ia aria iambic ar mbic + avataria right right false ar aria avatar ia avat + iambicycle left left false bic iambic bicycle iam ycle + iambicycle left left false c iambic cycle iambi ycle + insigniambic left right false ia iambic insignia mbic insign + insigniambic right left false ia insignia iambic insign mbic ``` -Notice the two new columns: `X-given` and `X-pos`. +Notice the two columns `Expansion` and `Pos-given`: + +- `Expansion` corresponds to what the given word expanded into. +- `Pos-given` corresponds to where the given word is within the expansion. -- `X-given` corresponds to what the given word was expanded into. For the first two lines, `ia` expanded into `iambic` and `aria`. -- `X-pos` corresponds to which side of the expanded word the given word is on. For the first line, `ia` expanded to `iambic` from -the left side, thus `X-pos` is `left`. 
In the second line, `ia` expanded to `aria` from the right side, so `X-pos` is `right`. +In the first row, `Expansion` says that `ia` expanded into `iambic`. As stated earlier, it expands from the left side, which matches the value in `Pos-given`. +Similarly, the second row has an `Expansion` of `aria`, which comes from the right side. This aligns with the tree from earlier and what `Pos-given` says. -If `X-given` and `X-pos` are blank, it means no expansion occured, i.e. the given word was used as-is. +However, not every word can be expanded. When a row does not use expansion, i.e. it uses the given word as-is, `Pos-given` is blank and `Expansion` is the same as the +given word. ## Duplicate Stitched Words -While the `Stitched` column is sorted, it may have duplicates. For example, the following is the output of `stitchfinder popular.txt ash`: +While the `Stitched` column is sorted, it may have duplicates. For example, the following is included in the output of `stitchfinder popular.txt ash`: ``` -Stitched Found I-sect Rem-given Rem-found Pos Valid X-given X-pos -sodash sod d ash so right true dash right -sodash soda da sh so right true dash right -sodash sodas das h so right false dash right -sodash soda a sh sod right true -sodash sodas as h sod right false + Stitched Pos-given Pos-expans Valid I-sect Expansion Found Rem-expans Rem-found + sodash right right false das dash sodas h so + sodash right right true d dash sod ash so + sodash right right true da dash soda sh so + sodash right false as ash sodas h sod + sodash right true a ash soda sh sod ``` -Even without expansion, `sodash` can be formed from `soda` + `ash` and `sodas` + `ash`, thus creating two entries. +`sodash` can be formed from many different stitches. Even with `--disable-expansion`, the existence of both `soda` + `ash` and `sodas` + `ash` causes there to be two entries +for the same stitched word. 
## TODO diff --git a/src/disp.rs b/src/disp.rs index 9768c87..a531403 100644 --- a/src/disp.rs +++ b/src/disp.rs @@ -5,36 +5,39 @@ use anyhow::Context; use rayon::{prelude::ParallelIterator, slice::ParallelSliceMut}; use tabled::Table; -use crate::{matcher::Combo, Position}; +use crate::{ + matcher::{Combo, Pairing, StitchParts}, + Position, +}; #[derive(tabled::Tabled, Debug, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)] struct Row<'a> { #[tabled(rename = "Stitched")] stitch: crate::matcher::Whole<'a>, - #[tabled(rename = "Found")] - found: &'a str, + #[tabled(rename = "Pos-given")] + pos_given: OptPos, - #[tabled(rename = "I-sect")] - isect: &'a str, + #[tabled(rename = "Pos-expans")] + pos_expans: Position, - #[tabled(rename = "Rem-given")] - rem_given: &'a str, + #[tabled(rename = "Valid")] + valid: bool, - #[tabled(rename = "Rem-found")] - rem_found: &'a str, + #[tabled(rename = "I-sect")] + isect: &'a str, - #[tabled(rename = "Pos")] - pos: Position, + #[tabled(rename = "Expansion")] + expans: &'a str, - #[tabled(rename = "Valid")] - valid: bool, + #[tabled(rename = "Found")] + found: &'a str, - #[tabled(rename = "X-given")] - expand_given: &'a str, + #[tabled(rename = "Rem-expans")] + rem_expans: &'a str, - #[tabled(rename = "X-pos")] - expand_pos: OptPos, + #[tabled(rename = "Rem-found")] + rem_found: &'a str, } #[derive(Debug, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)] @@ -54,17 +57,24 @@ fn rows<'a>(combos: impl ParallelIterator<Item = Combo<'a>>) -> Vec<Row<'a>> { let mut vec: Vec<_> = combos .map(|combo| { let parts = combo.stitch.into_parts(); + let StitchParts { + trans, + isect, + rem_expans, + rem_found, + } = parts; + let Pairing { expans, found } = trans.pair; Row { stitch: combo.stitch.whole(), - found: parts.trans.pair.found, - isect: parts.isect, - rem_given: parts.rem_given, - rem_found: parts.rem_found, - pos: parts.trans.pos, + pos_given: OptPos(combo.pos_given), + pos_expans: trans.pos, valid: combo.valid, - expand_given: combo.expand.map_or("", |(x_given, _)| 
x_given), - expand_pos: OptPos(combo.expand.map(|(_, pos)| pos)), + isect, + expans, + found, + rem_expans, + rem_found, } }) .collect(); diff --git a/src/matcher.rs b/src/matcher.rs index ab09cf2..1582e6e 100644 --- a/src/matcher.rs +++ b/src/matcher.rs @@ -6,13 +6,13 @@ use rayon::prelude::*; #[derive(Debug, Copy, Clone)] pub struct Pairing<'a> { - pub given: &'a str, + pub expans: &'a str, pub found: &'a str, } impl<'a> Pairing<'a> { fn max_isect_len(&self) -> usize { - usize::min(self.given.len(), self.found.len()) + usize::min(self.expans.len(), self.found.len()) } } @@ -25,28 +25,24 @@ pub struct Transform<'a> { impl<'a> Transform<'a> { /// Fractures the transform into its intersection, remaining given, and remaining found (if the intersection exists) fn fracture(&self, isect_len: usize) -> Option<(&'a str, &'a str, &'a str)> { - let Pairing { given, found } = self.pair; + let Pairing { expans, found } = self.pair; - let (igiven, ifound, rgiven, rfound) = match self.pos { + let (iexpans, ifound, rexpans, rfound) = match self.pos { Position::Left => { - let (rgiven, igiven) = given.rsplit_at(isect_len); + let (rexpans, iexpans) = expans.rsplit_at(isect_len); let (ifound, rfound) = found.split_at(isect_len); - (igiven, ifound, rgiven, rfound) + (iexpans, ifound, rexpans, rfound) } Position::Right => { let (rfound, ifound) = found.rsplit_at(isect_len); - let (igiven, rgiven) = given.split_at(isect_len); + let (iexpans, rexpans) = expans.split_at(isect_len); - (igiven, ifound, rgiven, rfound) + (iexpans, ifound, rexpans, rfound) } }; - // if "nacho" == found { - // dbg!((igiven, ifound, rgiven, rfound)); - // } - - (igiven == ifound).then_some((igiven, rgiven, rfound)) + (iexpans == ifound).then_some((iexpans, rexpans, rfound)) } fn stitches(self) -> impl ParallelIterator> { @@ -60,7 +56,7 @@ impl<'a> Transform<'a> { pub struct Stitch<'a> { trans: Transform<'a>, isect: &'a str, - rem_given: &'a str, + rem_expans: &'a str, rem_found: &'a str, } @@ -71,7 +67,7 @@ 
impl<'a> Stitch<'a> { .map(|(isect, rem_given, rem_found)| Self { trans, isect, - rem_given, + rem_expans: rem_given, rem_found, }) } @@ -79,18 +75,18 @@ impl<'a> Stitch<'a> { fn valid(&self, words: &HashSet<&str>) -> bool { let for_word = |word| word == "" || words.contains(word); - for_word(self.rem_given) && for_word(self.rem_found) + for_word(self.rem_expans) && for_word(self.rem_found) } pub fn whole(&self) -> Whole<'a> { match self.trans.pos { Position::Left => Whole { - left: self.trans.pair.given, + left: self.trans.pair.expans, right: self.rem_found, }, Position::Right => Whole { left: self.rem_found, - right: self.trans.pair.given, + right: self.trans.pair.expans, }, } } @@ -104,7 +100,7 @@ impl<'a> Stitch<'a> { pub struct StitchParts<'a> { pub trans: Transform<'a>, pub isect: &'a str, - pub rem_given: &'a str, + pub rem_expans: &'a str, pub rem_found: &'a str, } @@ -113,7 +109,7 @@ impl<'a> From<Stitch<'a>> for StitchParts<'a> { Self { trans: v.trans, isect: v.isect, - rem_given: v.rem_given, + rem_expans: v.rem_expans, rem_found: v.rem_found, } } @@ -138,11 +134,11 @@ impl fmt::Display for Whole<'_> { pub struct Combo<'a> { pub stitch: Stitch<'a>, pub valid: bool, - pub expand: Option<(&'a str, Position)>, + pub pos_given: Option<Position>, } -/// Provides all the extrapolations of a given word. Does not include the given word itself. -fn extrap<'f>(ctx: &'f Ctx<'_>) -> impl ParallelIterator { +/// Provides all the expansions of a given word. Does not include the given word itself. 
+fn expand<'f>(ctx: &'f Ctx<'_>) -> impl ParallelIterator { let lambda: Box Option<(&'f str, Position)> + Send + Sync> = if ctx.disable_exp { Box::new(|_| None) @@ -152,7 +148,7 @@ fn extrap<'f>(ctx: &'f Ctx<'_>) -> impl ParallelIterator starts().then_some((found, Position::Left)), @@ -177,18 +173,15 @@ fn extrap<'f>(ctx: &'f Ctx<'_>) -> impl ParallelIterator(ctx: &'f Ctx<'_>) -> impl ParallelIterator> { - let extrap = extrap(ctx).map(|(found, pos)| (found, Some(pos))); + let extrap = expand(ctx).map(|(found, pos)| (found, Some(pos))); [(ctx.given.as_str(), None)] .into_par_iter() .chain(extrap) - .flat_map(move |(expand_word, expand_pos)| { + .flat_map(move |(expans, pos_given)| { ctx.founds .par_iter() - .map(|&found| Pairing { - given: expand_word, - found, - }) + .map(|&found| Pairing { expans, found }) .flat_map(|pair| { Position::all() .into_par_iter() @@ -199,7 +192,7 @@ pub fn find_all<'f>(ctx: &'f Ctx<'_>) -> impl ParallelIterator> .map(move |stitch| Combo { stitch, valid: stitch.valid(&ctx.founds), - expand: expand_pos.map(|pos| (expand_word, pos)), + pos_given, }) }) .filter(|combo| ctx.valid.map_or(true, |b| b == combo.valid))
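
Review note: the overlap check that `Transform::fracture` and `Stitch::whole` perform after this rename can be sketched in isolation. The block below is a minimal standalone approximation, not the crate's code. The free function `stitch`, its tuple return type, the ASCII-only restriction, and the rule that the overlap must be at least one byte are all assumptions made for illustration; only the `Position` variants and the `expans` / `found` / `rem_expans` / `rem_found` naming come from the diff.

```rust
/// Which side of `found` the expansion sits on in the stitched word.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Position {
    Left,
    Right,
}

/// Hypothetical helper: tries to stitch `expans` and `found` with an overlap
/// of `isect_len` bytes. On success returns
/// (stitched, isect, rem_expans, rem_found).
fn stitch<'a>(
    expans: &'a str,
    found: &'a str,
    pos: Position,
    isect_len: usize,
) -> Option<(String, &'a str, &'a str, &'a str)> {
    // Assumption: ASCII input, so byte offsets are always char boundaries.
    if !expans.is_ascii() || !found.is_ascii() {
        return None;
    }
    // Assumption: the overlap is non-empty and no longer than either word.
    if isect_len == 0 || isect_len > expans.len().min(found.len()) {
        return None;
    }
    match pos {
        Position::Left => {
            // Expansion on the left: its suffix must equal the prefix of `found`.
            let (rem_expans, iexpans) = expans.split_at(expans.len() - isect_len);
            let (ifound, rem_found) = found.split_at(isect_len);
            (iexpans == ifound)
                .then(|| (format!("{expans}{rem_found}"), iexpans, rem_expans, rem_found))
        }
        Position::Right => {
            // Expansion on the right: its prefix must equal the suffix of `found`.
            let (rem_found, ifound) = found.split_at(found.len() - isect_len);
            let (iexpans, rem_expans) = expans.split_at(isect_len);
            (iexpans == ifound)
                .then(|| (format!("{rem_found}{expans}"), iexpans, rem_expans, rem_found))
        }
    }
}

fn main() {
    // Mirrors the README rows for `atwink` and `iambicycle`.
    println!("{:?}", stitch("twink", "at", Position::Right, 1));
    println!("{:?}", stitch("iambic", "bicycle", Position::Left, 3));
}
```

The two calls in `main` correspond to the `atwink` and `iambicycle` rows in the README tables: the returned overlap and remainders line up with the `I-sect`, `Rem-expans`, and `Rem-found` columns.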