Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract plugin #265

Open
wants to merge 7 commits into
base: v2
Choose a base branch
from
Open

Extract plugin #265

wants to merge 7 commits into from

Conversation

sbalmer
Copy link
Collaborator

@sbalmer sbalmer commented Oct 9, 2017

This patch rewrites extraction to become part of the build process. (Rationale: We've switched our deploy process to use a pristine copy of the repo rather than our working copies. In these copies extracts.msgfmt is not generated and deploys don't get their strings updated.)

There are a few things which need to be addressed before this patch can be merged. At the moment it's a "works for us" proposition. Some things that need fixing:

  • I've adapted but not tested the various handlers. We use only JS and HTML so the rest is untested
  • The readme is not updated
  • The extractor uses process.cwd() to find the root dir which can be wrong (when meteor is run from a subdir)

I've decided to create this pull request to discuss the state of the thing before continuing to invest time in it.

@fermuch
Copy link

fermuch commented Oct 9, 2017

I have a fairly large project which uses JS (ecmascript 5), es6, JSX, TS and TSX -- Is there some way I can test running the extractor and report the status?

@sbalmer
Copy link
Collaborator Author

sbalmer commented Oct 9, 2017

@fermuch sure thing, just link to our fork from your package dir:

cd $SOMEWHERE
git clone -b  extract-plugin  --recursive [email protected]:Openki/meteor-messageformat.git

cd $YOURPROJECT/packages
ln -s $SOMEWHERE/meteor-messageformat/packages/extract msgfmt-extract
ln -s $SOMEWHERE/meteor-messageformat/packages/core msgfmt-core

Then it should just work but of course it won't. If it fails I'd be interested to know the shape of server/extracts.msgfmt~ because that's the cached state of the last extraction run.

@fermuch
Copy link

fermuch commented Oct 10, 2017

For those who also want to try: Follow @sbalmer instructions, but after cloning run: git submodule update --init --recursive in the cloned directory.

I've ran with the fork, restarted the project several times, and server/extracts.msgfmt is created, but server/extracts.msgfmt~ isn't. No error is shown in the logs.

As far as I understand, you're using the build system, so it should be generated after each run. Am I right, or do I need to call it manually?
I also tried switching between different meteor versions (that project is currently running 1.6-rc.3) and deleting .meteor/local to have a cold start, but nothing seems to work.

@fermuch
Copy link

fermuch commented Oct 10, 2017

Welp, I'm an idiot. I was using the v2 branch instead of extract-plugin.

Here's the generated extracts.msgfmt~ (built with Meteor 1.6-rc.3)

@sbalmer
Copy link
Collaborator Author

sbalmer commented Oct 11, 2017

Oups, thanks for complementing my instructions. Amended.

Your extracts.msgfmt~ is encouraging. I see .jade and .jsx had some strings extracted. Thanks for giving it a whirl. Does it actually add the strings too? So that when you add a new mf-string and save, the translation interface shows it?

Do you use meteor build or Galaxy? It should propagate the changes, because the process for dev-server and build is the same now. Haven't tested much though.

@fermuch
Copy link

fermuch commented Oct 11, 2017

Do you use meteor build or Galaxy? It should propagate the changes, because the process for dev-server and build is the same now. Haven't tested much though.

I use meteor build for staging and Galaxy for production. As far as I understand, both use meteor build.

Does it actually add the strings too? So that when you add a new mf-string and save, the translation interface shows it?

It does work if I add new strings!


I haven't tested doing a deploy yet, and staging is recreated each time a new commit is made, so there's no much sense in using staging to try this.

I noticed two things:

  • This also happens with the older version, but I thought it would disappear using Meteor's build system: the crawler is running inside dirs it shouldn't, like private, public and .scripts. According to the Meteor docs, private is a place reserved for assets, so there's no much sense running the crawler there but maybe this behaviour is expected in msgfmt?
  • I'm not sure if this is the place to ask for it, but in Add support for typescript (ts and tsx) #261 I forked the repo to clone JS/JSX parsers to add support for TS/TSX. Is it possible to add support for those here?

@sbalmer
Copy link
Collaborator Author

sbalmer commented Oct 15, 2017

This also happens with the older version, but I thought it would disappear using Meteor's build system:

For extraction of strings, the directory tree is traversed independently of the build system. The Meteor build system calls only one handler per file so the plugin can't make use of it. Maybe the linter stage could be abused for extraction. That might be faster too. For now I just kept the old extracts.msgfmt hack in place.

the crawler is running inside dirs it shouldn't, like private, public and .scripts.

I thought it would ignore dotfiles and dot-dirs. I'll have a look at this. We have some email templates in private, so we need extraction to run in private too. If that causes problems I'm sure we could find a way to adapt our system. But I'd prefer not changing the behaviour now.

I'm not sure if this is the place to ask for it, but in #261 I forked the repo to clone JS/JSX parsers to add support for TS/TSX. Is it possible to add support for those here?

I'd prefer not adding unrelated patches even if they are trivial.

@sbalmer
Copy link
Collaborator Author

sbalmer commented Oct 15, 2017

Turns out contents of dot-dirs were already being ignored. I've added another check to ignore dot-files.

@fermuch
Copy link

fermuch commented Oct 16, 2017

Turns out contents of dot-dirs were already being ignored. I've added another check to ignore dot-files.

Thank you! I'll try to take a look tomorrow to check if it's solved for me.

We have some email templates in private, so we need extraction to run in private too. If that causes problems I'm sure we could find a way to adapt our system.

It's not really a problem, it's just bothersome. But I can imagine some use cases where it might be a bother, like having git's submodules inside the private dir for private packages.

How hard would it be to add an env or meteor setting to append values to filters, to add more control over what's ignored?

@sbalmer
Copy link
Collaborator Author

sbalmer commented Oct 16, 2017

It seems Meteor settings are not available at build-time.

I think the easiest solution would be to allow configuring filter settings in server/extracts.msgfmt file. (And rename that file to server/config.msgfmt or something.)

Another option would be to only run extraction in dirs that have a trigger file like extracts.msgfmt. Though that would be more effort to set up compared to the automated (and overreaching) extraction now.

In any case I'd prefer not meddling with the filter logic on this branch.

@1u
Copy link
Contributor

1u commented Oct 30, 2017

Hei @gadicc, hope you are fine. What are your thoughts on this?
(No hurry, we'll anyway start to use it that way for now, and might update here how it goes)
Cheers

@gadicc
Copy link
Owner

gadicc commented Oct 30, 2017

Hey all. Sorry for my lack of input now. The last few months have been the most busy in my life, but lots of fun stuff is happening too.

Firstly, @sbalmer, thank you yet again for your continued contributions to the project. I'm so happy you've gotten so much use out of it and are always so ready to give back to the community and other users here!

And, @fermuch gets a special mention too :) Thanks for being so involved in an early stage PR, giving sbalmer so much feedback and advice to other potential users too.

I have to be honest, I haven't touched or thought about this part of msgfmt in so long. If I understood correctly, this doesn't break the existing behaviour (in any scenario) but makes the project substantially more useful in others (CI, etc). Can you confirm (to the best of your knowledge)?

I have only two concerns...

  1. I'm a little reluctant to change something so core that's worked so well for so long. This could be easily addressed by a configurable setting in the build file (similar to what you suggested to filtering), if both code branches were maintained. Ideally with the original behaviour being the default, at least initially. However, this may well not be worth all the extra time involved... so with more testing on an alternate branch, I'd be willing to forego this.

  2. that I'm not (currently) actively working on this and have limited time. This can also actually be addressed pretty easily... would you be interested in becoming an official maintainer of the project, including publishing rights? :) It's actually an offer I've wanted to make for a long time now, but had intended to have some early work on v3 ready first - and that's unfortunately looking a bit more distant now.

(on that note, v3 solves this problem via a babel plugin. it makes the project relevant outside of Meteor, of course solves all the edge cases the regex fails at, and automatically handles any language that's compiled to javascript. I had wanted to link to an early example but can't seem to recall which branch of which repo I put it in. also, don't recall 100% where babel fits in Meteor w.r.t. to .html (and other template files) before they're compiled to JS - but either way, a non-regex full XML decode is the long term vision).

Let me know your thoughts and thanks again :)

@sbalmer
Copy link
Collaborator Author

sbalmer commented Nov 1, 2017

@gadicc Thanks for your feedback.

If I understood correctly, this doesn't break the existing behaviour (in any scenario) but makes the project substantially more useful in others (CI, etc). Can you confirm (to the best of your knowledge)?

Yes, this patch should mirror the functionality of the old process. It doesn't update the mfExtractFiles collection. This makes it more robust in my view. It would break the workflow of people that rely on the collection. I don't know of a scenario where this would be useful, so I don't think it likely.

I'm a little reluctant to change something so core that's worked so well for so long. This could be easily addressed by a configurable setting in the build file (similar to what you suggested to filtering), if both code branches were maintained. Ideally with the original behaviour being the default, at least initially. However, this may well not be worth all the extra time involved... so with more testing on an alternate branch, I'd be willing to forego this.

In my view this would complicate matters. Keeping two codepaths active in parallel would increase complexity of both. I'd opt for more testing in any case. We'll want to use it do simplify our deployment process "soon". I haven't looked into other modes of testing though.

that I'm not (currently) actively working on this and have limited time. This can also actually be addressed pretty easily... would you be interested in becoming an official maintainer of the project, including publishing rights? :) It's actually an offer I've wanted to make for a long time now, but had intended to have some early work on v3 ready first - and that's unfortunately looking a bit more distant now.

First of all, thanks for considering the option. It would simplify matters being able to commit directly. At the same time I'm afraid that my interests might be too limited in scope. I'd be mostly interested in amending the v2 branch for our use-case. Hard for me to tell how that would work out for other people :-)

I like the long-term vision of properly parsing the template files. But I don't see this as a priority from our side.

@gadicc
Copy link
Owner

gadicc commented Nov 3, 2017

Hey :)

Ok that sounds good. I'm a bit hazy on all the logic and don't have time to review the current code now. The only thing that comes to mind is we used the mfExtracts collection to figure out what changed, and when, and to mark translations as out of date / fuzzy (I think :)). But if everything works as it used to, no need to support internal functions that it seems aren't actually used.

I'm happy to skip supporting both code paths as indeed it's a lot of work and does not seem necessary.

I'm also happy for you to come on-board as solely responsible for your own work. But that would include responsibility if it affects other users. At this point in time, I can't take responsibility for PRs and related (if albeit unexpected) consequences for them. If that's the case I'd rather you maintained a separate branch until a time when I'd be able to do that (which would be early next year, at the earliest). But otherwise, sure, you'd have no responsibility for general bugs/features/issues that are unrelated. What do you think?

Thanks also for the feedback re template parsing.

@sbalmer
Copy link
Collaborator Author

sbalmer commented Nov 6, 2017

The way I think it works:

  1. msgfmt-core creates the /server/extracts.msgfmt file if it doesn't exist
  2. On build, the existence of /server/extracts.msgfmt causes the registered handler in msgfmt-extract to run
  3. msgfmt-extract scans all project dirs for files with known extensions
  4. File status is loaded from the collection mfExtractFiles, extraction is run on new or changed files
  5. mfPkg.addNative() is called with the freshly extracted strings
  6. The extracted strings are merged with the strings previously in the DB and written as JSON to /server/extracts.msgfmt~
  7. Contents of file /server/extracts.msgfmt~ are wrapped in a call to mfPkg.addNative() and added as server/extracts.msgfmt.js to the build. This file will be loaded on the live systems (process.env.NODE_ENV === "production") and add new translations on startup.

How the process proposed here works:

Steps 1-3 are the same as before.
Step 4: The cache in /server/extracts.msgfmt~ with all extracted strings is loaded, extraction is run on new or changed files
Step 5: Not done. This will be done on startup.
Step 6: Extraction state is written to /server/extracts.msgfmt~
Step 7: Extracted strings are wrapped in a call to mfPkg.addNative() and added as file server/extracts.msgfmt.js to the build. This will run on startup, for dev systems too.

This process does not access any collections on the dev system and is thus oblivious to meteor reset. It also doesn't track removed strings, as the old system did. It will only find what's there.

@sbalmer
Copy link
Collaborator Author

sbalmer commented Nov 6, 2017

I've just added the last piece that was missing to recreate the old functionality: tracking removed strings.

Given the positive reactions I intend to complete the process by

  • trying a deploy on our staging area
  • update the documentation

I'm unsure about the things to test, and how to go about that.

@gadicc
Copy link
Owner

gadicc commented Nov 7, 2017

Ooh, yay, awesome!!

Thanks for the rundown... super helpful for me having not looked at the code in so long and being under time constraints now. Actually, I think it makes more sense to get rid of the mfExtracts collection so I'm very happy we're going this route. Thanks!

To help move things along, as I'm also in a seminar for Wed-Sun + travel:

  1. I've added you as a repo collaborator. We can continue working/testing on your branch for now.
  2. Added you to meteor "msgfmt" group. (Probably) suggest releasing first as a test/pre-release to try get feedback, but it's not easy to communicate with users.

Alas, I don't think I had any built in tests for this (beyond parsing). I guess we can put down in writing now some ideal things for users to report back on. Maybe when one of has time we can write some integration tests, but probably dev efforts might be better put on v3.

  1. Add new string
  2. Change an existing string
  3. Remove a string
  4. Change existing string marks translations as fuzzy

a. meteor deploy
b. meteor build
c. meteor dev

I think that's it roughly, right?

@sbalmer
Copy link
Collaborator Author

sbalmer commented Nov 7, 2017

Thanks @gadicc for the invitation!

Having automated integration tests would be very valuable. I think it's quite complicated though, the build step is not usually repeated during tests so would would have to hack that. Maybe an easier route would be to just replay a sequence of addNative() calls and check whether the DB was updated correctly in-between calls.

I'm less worried about the extraction process because there we can just examine the output. But automating this would still add some confidence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants