
[WIP] Adding Docker version of Prism #68

Closed · wants to merge 1 commit

Conversation

danieldeutsch (Contributor)

Here is what it would look like to add a Dockerized version of a metric.


class Prism(ReferencedMetric):
    def __init__(self, device: int, language: str = "en"):
        self.prism = _Prism(device=device, language=language)
danieldeutsch (Contributor, Author)

The implementation in the GEM metrics library would wrap the one in Repro.


# Example `micro` output for 3 inputs
# [{'prism': -1.1578280925750732}, {'prism': -1.3325390815734863}, {'prism': -2.730839729309082}]
_, micro = self.prism.predict_batch(inputs)
danieldeutsch (Contributor, Author)

The actual implementation would reformat the input data into the format required by the metric, run predict_batch, and then reformat the outputs to fit the GEM framework.
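
To make that concrete, here is a rough sketch of what the full wrapper could look like. The compute method name, the import paths, and the {"candidate": ..., "references": [...]} input format are assumptions for illustration; the real GEM interface and Repro paths may differ.

from typing import Dict, List

from gem_metrics.metric import ReferencedMetric  # assumed import path
from repro.models.thompson2020 import Prism as _Prism  # assumed import path


class Prism(ReferencedMetric):
    """Sketch only; names and signatures are illustrative, not final."""

    def __init__(self, device: int, language: str = "en"):
        self.prism = _Prism(device=device, language=language)

    def compute(self, predictions: List[str], references: List[List[str]]) -> Dict:
        # Reformat the GEM-style inputs into the format the metric expects.
        inputs = [
            {"candidate": pred, "references": refs}
            for pred, refs in zip(predictions, references)
        ]

        # Run the Dockerized metric; `micro` holds one score dict per input,
        # e.g. [{"prism": -1.16}, {"prism": -1.33}, {"prism": -2.73}].
        _, micro = self.prism.predict_batch(inputs)

        # Reformat the outputs to fit the GEM framework: per-instance scores
        # plus a corpus-level average.
        scores = [output["prism"] for output in micro]
        return {
            "prism": sum(scores) / len(scores),
            "prism_per_instance": scores,
        }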

@danieldeutsch (Contributor, Author)

If you have a machine with Docker installed, you should be able to run test_prism.py directly. The first time it runs, the Docker image will automatically be downloaded to your machine, which may take 1-2 minutes. After that, you can run the metric without downloading the Docker image again.
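
For reference, a hypothetical direct invocation (the module path and compute signature are assumed, matching the sketch above):

from gem_metrics.prism import Prism  # hypothetical module path

# The first call pulls the Docker image; later calls reuse the local copy.
metric = Prism(device=0, language="en")
predictions = ["The cat sat on the mat."]
references = [["A cat was sitting on the mat."]]
scores = metric.compute(predictions, references)
print(scores)  # corpus-level score plus per-instance scores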

@danieldeutsch (Contributor, Author)

@sebastianGehrmann @tuetschek @ndaheim Please take a look and let me know what you think.

@sebastianGehrmann (Contributor)

Hey, given the discussions in the chat, should we merge this so we can proceed with adding the other implementations?

@danieldeutsch (Contributor, Author)

I think it should be ok. My only concern is how to deal with the GPU device. It seems like the other GPU-based metrics don't manually control the device, but this is necessary for the Dockerized metrics or else they will just use GPU 0.

When you run something in Docker, the code runs in its own process, so the CUDA_VISIBLE_DEVICES environment variable you might set from the command line is ignored. My library sets that variable specifically for the Docker processes.
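
The general pattern (purely illustrative, not Repro's actual code) is to set the variable on the container itself rather than relying on the host shell:

import subprocess

def run_in_container(image: str, command: str, device: int) -> str:
    # CUDA_VISIBLE_DEVICES must be set inside the container; a value exported
    # in the host shell is not inherited by the containerized process.
    docker_cmd = [
        "docker", "run", "--rm", "--gpus", "all",
        "-e", f"CUDA_VISIBLE_DEVICES={device}",
        image, "/bin/sh", "-c", command,
    ]
    result = subprocess.run(docker_cmd, check=True, capture_output=True, text=True)
    return result.stdout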

I was actually thinking about this last night. Since the classes are instantiated here without any arguments to the constructor, there isn't an obvious way right now to pass the device ID to the Dockerized metrics.

logger.info(f"Computing {metric_class.__name__} for {outs.filename}...")
metric = metric_class()
result = metric.compute_cached(cache, outs)
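
One possible workaround (just a sketch, not part of this PR) would be to let the Dockerized metrics fall back to an environment variable when no device is passed, so the no-argument instantiation above keeps working:

import os
from typing import Optional

class Prism(ReferencedMetric):
    def __init__(self, device: Optional[int] = None, language: str = "en"):
        if device is None:
            # GEM_METRICS_DEVICE is a hypothetical variable name.
            device = int(os.environ.get("GEM_METRICS_DEVICE", 0))
        self.prism = _Prism(device=device, language=language)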

Any thoughts?

@danieldeutsch deleted the docker branch on January 29, 2022.
Linked issues: Add support for docker and add all the metrics from repro · Add PRISM