Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Command-line tools to split and/or reassemble .fwdata files #403

Merged
merged 10 commits into from
Nov 14, 2024

Conversation

rmunn
Copy link
Contributor

@rmunn rmunn commented Jun 26, 2024

These tools were originally created in sillsdev/languageforge-lexbox#735, but are being moved to the flexbridge repository in order to simplify the process of distributing them. The intent of these tools is to enable automated test scenarios for Send/Receive as per sillsdev/LfMerge#339, and possibly other as-yet-unknown use cases.


This change is Reviewable

@rmunn
Copy link
Contributor Author

rmunn commented Jun 26, 2024

AppVeyor build doesn't support net8.0, it seems. Nevertheless, I believe this is ready to be reviewed: there's no reason, IMHO, to downgrade the targeted .NET version to net6.0 or earlier since all the environments where this tool would be run are easily capable of supporting .NET 8. We'll just have to ignore the failing AppVeyor build.

@rmunn rmunn marked this pull request as ready for review June 26, 2024 09:14
@rmunn rmunn linked an issue Jun 26, 2024 that may be closed by this pull request
@rmunn rmunn requested a review from megahirt June 27, 2024 02:05
@rmunn rmunn self-assigned this Jun 27, 2024
@myieye
Copy link

myieye commented Sep 23, 2024

As just discussed in our team meeting:

@hahn-kev
Copy link
Contributor

Right now we've actually got 2 different cli tools, could we make them into just one exe and control the split vs merge based on the command?

@rmunn
Copy link
Contributor Author

rmunn commented Oct 18, 2024

Right now we've actually got 2 different cli tools, could we make them into just one exe and control the split vs merge based on the command?

Sure. How about fwdatatool.exe split and fwdatatool.exe combine as the commands? I don't want to use "merge" to avoid confusion with the concept of merging a branch.

@myieye
Copy link

myieye commented Oct 18, 2024

Some more options 🙃:

  • fwdata.exe build (build is shorter and simpler than combine and I think it communicates just as much)
  • fwdata.exe split

I feel like this functionality can basically be seen as the mother of the fwdata format, so it's entitled to such a fundamental name (i.e. just fwdata.exe).

@rmunn rmunn force-pushed the feat/lfmerge-mkfwdata-files branch from 3a0e5a3 to 566efb8 Compare October 18, 2024 08:20
@rmunn rmunn force-pushed the feat/lfmerge-mkfwdata-files branch from 566efb8 to 9f276f2 Compare October 18, 2024 08:21
Will make for smaller downloads if we make standalone executables.
@rmunn rmunn force-pushed the feat/lfmerge-mkfwdata-files branch from 9f276f2 to 525996e Compare October 18, 2024 08:23
@rmunn
Copy link
Contributor Author

rmunn commented Oct 18, 2024

Interesting: when I rebased on top of develop, it included a commit that removed the AppVeyor build. But the PR checks still include AppVeyor.

@rmunn
Copy link
Contributor Author

rmunn commented Oct 18, 2024

@myieye wrote:

Some more options 🙃:

* `fwdata.exe build` (build is shorter and simpler than combine and I think it communicates just as much)

* `fwdata.exe split`

I feel like this functionality can basically be seen as the mother of the fwdata format, so it's entitled to such a fundamental name (i.e. just fwdata.exe).

Good suggestions; I've implemented both of them in commits c313bfe and 36603a3.

@rmunn rmunn requested review from hahn-kev and removed request for megahirt October 18, 2024 08:42
@rmunn
Copy link
Contributor Author

rmunn commented Oct 21, 2024

Note: during testing we ran into an issue where the Chorus timeout on HgRunner was too short, hg update tip was killed halfway through checking out a large project with many media files in LinkedFiles, and then the fwdata tool complained about having no Mercurial repo. We'll add a --timeout option to set a timeout in seconds, and maybe default it to 9999 seconds (functionally equivalent to no timeout at all).

Update: Added --timeout option, decided to default to 10 minutes because 9999 seconds, nearly 3 hours, is a little ridiculous. But it can go as high as 9999 if it needs to.

Copy link
Contributor

@hahn-kev hahn-kev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me, I'd like @jasonleenaylor to have a chance to comment before we merge this in.

@jasonleenaylor
Copy link
Contributor

src/MkFwData/.editorconfig line 1 at r2 (raw file):

# Copyright (c) 2016-2021 SIL International

Why does it need its own editor config? It mostly duplicates the one at the root, I'd rather keep just one for the whole project unless there is a good reason.

@jasonleenaylor
Copy link
Contributor

src/MkFwData/.gitattributes line 1 at r2 (raw file):

* text=auto whitespace=space-before-tab,tab-in-indent,blank-at-eol,tabwidth=4

Same with the .gitattributes, why add one for the project if there is a perfectly good one at the root

Copy link
Contributor

@jasonleenaylor jasonleenaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 5 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: 3 of 5 files reviewed, all discussions resolved (waiting on @rmunn)

@rmunn
Copy link
Contributor Author

rmunn commented Oct 24, 2024

True, it doesn't need its own .editorconfig, except for one thing. The flexbridge repo's root .editorconfig has indent_style = tab for .cs files. I originally wrote this code for a different repo where spaces were used for indentation, and so when we ended up bringing it into the flexbridge repo instead, I decided that rather than push a whitespace-change commit, I'd just use .editorconfig to say "Despite what the root .editorconfig says, any project files in this folder should use spaces for indentation."

I really don't want to change indentation to tab characters. I could give a whole presentation on why tab characters for indentation are a bad idea, but the short version is that they make indendation inflexible. Yes, tab characters in theory allow anyone to set the indent level of their code editor to their preferred style (some people like 2-space indents, some like 4, etc.) but in practice, this promise of flexibility is broken, because this kind of thing invariably happens:

incorrect-tabs-in-TsString cpp-alignment

If anyone looks at that file with tab width set to something other than 4, then the function parameters will be misaligned from what they're supposed to be.

In the extreme case, you end up with something like the following code that I found in the FW repo at one point (note that I went through some trouble to anonymize the code in this animation, because I thought I'd be using it in a presentation and I didn't want anyone to feel called out, so I replaced all the words in the code with Lorem Ipsum text of the same character length). Here is an animation of me setting the tab display width between 1 and 8 and how the table, which was clearly meant to align, never aligns properly under any tab width:

tab-animation

I believe what happened is that someone originally wrote that code to align, then later some rename refactorings happened and the lengths of some identifiers changed, and nobody went back and re-aligned that table. But the result perfectly encapsulates why I have always disliked tab characters in indentation.

So I'll trim down that .editorconfig to just the differences from the root (space indents instead of tabs). But I would prefer to keep it in there, so that this file, at least, can have correct indentation.

The .gitattributes file is there for a similar reason: because the root .gitattributes as indent-with-non-tab for all text files, which means "flag a whitespace warning if you find a line indented with a non-tab character". Since this project uses spaces for indentation, I needed to change that to tab-in-indent, meaning "flag a whitespace warning if you find tab characters in the indent".

We only need to keep the .editorconfig settings that differ from the
ones in the root (spaces instead of tabs for indent, add final newline).
@rmunn
Copy link
Contributor Author

rmunn commented Oct 24, 2024

Commit 418f119 gets rid of the duplicate lines from .editorconfig, making it much more obvious that it's there to override just two settings (indent style and final newline).

Copy link
Contributor Author

@rmunn rmunn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 5 files reviewed, all discussions resolved (waiting on @rmunn)


src/MkFwData/.editorconfig line 1 at r2 (raw file):

Previously, jasonleenaylor (Jason Naylor) wrote…

Why does it need its own editor config? It mostly duplicates the one at the root, I'd rather keep just one for the whole project unless there is a good reason.

Replied in #403 (comment)


src/MkFwData/.gitattributes line 1 at r2 (raw file):

Previously, jasonleenaylor (Jason Naylor) wrote…

Same with the .gitattributes, why add one for the project if there is a perfectly good one at the root

Replied in #403 (comment) — summary is that this is overriding the indent-with-non-tab warning settings from the root, replacing them with tab-in-indent since MkFwData uses spaces for indentation. So this is here in order to override the one from the project root.

To match the style of the rest of this repo, we'll just switch to tab
indentation. That makes this nearly a whitespace-only commit, but nobody
else is touching this code so it won't create merge conflicts.
@rmunn
Copy link
Contributor Author

rmunn commented Oct 25, 2024

@jasonleenaylor - Decided to just switch to tab indentation here so there's no need for separate .editorconfig files. Commit cda5364 has that change.

If you're happy with this, I'll go ahead and merge on Monday.

@rmunn
Copy link
Contributor Author

rmunn commented Nov 14, 2024

Having heard nothing, I'm going to assume this is okay. Please contact me if there are other changes you need.

@rmunn rmunn merged commit 669a5e3 into develop Nov 14, 2024
8 of 10 checks passed
@rmunn rmunn deleted the feat/lfmerge-mkfwdata-files branch November 14, 2024 03:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Idea: generate .fwdata file from lexbox?
4 participants