Self-adaptive translation mode for Marian (runtime domain adaptation). #887
SwappableSlot: add GPU-to-GPU reset feature
…into dynamic_swap_mvp
This doesn't work though because we're missing a lot of options because we initialize them manually instead of using the config parser.
@snukky I think I've implemented all the requested changes. I've also added a test using a transformer model to marian-nmt/marian-regression-tests#81. CI is failing, but that seems not to be related to my changes.
@rihardsk Thanks, I will take a look again. The GitHub check called "Documentation" is optional and will not cause the CI to fail; the rest should pass (excluding the already disabled "Ubuntu 16.04"). Did you run all regression tests from marian-regression-tests locally?
Co-authored-by: Roman Grundkiewicz <[email protected]>
Wasn't intentional
@snukky I had run the regression tests previously, but I reran them. This time there were some 40 failing tests. I checked some of the logs, and it seems that the outputs are off by some small fraction, but I didn't check all of them. Here's the summary:
I think I've resolved all of your other suggestions.
…atch-1 Change "training-sets" to "train-sets"
During the last Marian meeting, we decided that @emjotde will provide comments on |
@rihardsk Thank you for this work. It is really helpful for what we are trying to achieve. I have a question, though: how can I control the intensity of the adaptation? I need to control globally how much the model "adapts" to the given context.
I found the data-weighting option, but I am not sure how it will behave during runtime domain adaptation. Thanks a lot for your answer. PS: I am not sure if this is the right place to ask. Feel free to point me to a more suitable one.
@stanBienaives you should be able to use the regular training options for that – use, e.g., If you're interested in what other options are available in self-adaptive training, take a look here https://github.com/marian-cef/marian-dev/blob/a274dfbe0f356294ee092315ebd9a9df4dd16c5e/src/common/config_parser.cpp#L423 or just see the help output of the self-adaptive executable. There might still be some less-used options that haven't been tested and might not work as expected, but for the most part those options should work. BTW, I'm no longer working on this pull request and am not concerned with getting it merged, because I've changed employers recently. Hopefully, though, someone will step in to get this over the line, because it seemed that very little remained to be done.
Description
This PR implements self-adaptive translation, a.k.a. runtime domain adaptation, in Marian. It enables training the model on a set of context sentence pairs (source and target) prior to translation to adapt it to a new domain during runtime. The model is reset before the next translation.
This is useful because it enables one to have a single generic NMT model that is fine-tuned on the fly to better suit any number of domains for which context sentences can be provided. Typically these context sentences would be fetched from a translation memory (out of scope of this PR) based on similarity to the to-be-translated sentence at hand. Moreover, the translation quality can be improved over time without retraining and redeploying the model, by adding sentences to the translation memory.
The PR is based on earlier work by @snukky in c63aa8f, but the mechanism for transferring model parameters from the training graph to the translation graph has been revised so that it's based on the swappable infrastructure from kpu#2.
Self-adaptive translation can be run either in server mode, where source sentences and context sentence pairs are supplied via JSON, or in CLI mode, where they're supplied in separate files.
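The adapt, translate, reset cycle described above can be illustrated with a small toy sketch. This is not Marian's implementation (which, per this PR, moves parameters between a training graph and a translation graph); it only shows the control flow. The names ToyModel and translate_adaptively, and the lr and epochs parameters, are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyModel:
    """Hypothetical stand-in for an NMT model: one weight, predicting y = w * x."""
    weight: float

    def predict(self, x: float) -> float:
        return self.weight * x

    def sgd_step(self, x: float, y: float, lr: float) -> None:
        # Gradient of the squared error (w*x - y)^2 with respect to w.
        grad = 2.0 * (self.weight * x - y) * x
        self.weight -= lr * grad


def translate_adaptively(
    model: ToyModel,
    source: float,
    context: List[Tuple[float, float]],
    lr: float = 0.05,
    epochs: int = 2,
) -> float:
    """Fine-tune on the context pairs, translate, then restore the generic model."""
    base_weight = model.weight  # snapshot of the generic parameters
    try:
        for _ in range(epochs):
            for x, y in context:
                model.sgd_step(x, y, lr)
        return model.predict(source)
    finally:
        # Reset so the next request starts from the generic model again.
        model.weight = base_weight


if __name__ == "__main__":
    model = ToyModel(weight=1.0)
    adapted = translate_adaptively(model, 3.0, [(1.0, 2.0)], lr=0.1, epochs=1)
    print(adapted, model.weight)
```

In this toy setting, a larger lr or more epochs moves the parameters further from the generic model, and an empty context leaves the output unchanged. Whether the corresponding Marian training options behave the same way in marian-adaptive is something to verify against its help output.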
List of changes:
- marian-adaptive executable (-DCOMPILE_ADAPTIVE=ON)
- config_parser.cpp, to allow marian-adaptive to accept options for both translation and training
Added dependencies: none
How to test
Run the regression tests located in the tests/_self-adaptive directory in this PR: marian-nmt/marian-regression-tests#81
I've tested things on Ubuntu 18.04. To enable building marian-adaptive, you must use cmake .. -DCOMPILE_ADAPTIVE=ON.
Checklist