Issues generating Heavy metrics #103

Open
SimonvdFliert opened this issue Feb 27, 2023 · 0 comments

Hi,

I have been trying to evaluate several models with the GEM-benchmark metrics. I followed the tutorial provided on both GitHub and the official website and was able to generate a submission file with generations and GEM-ID keys. However, when I generate the output scores, several of the scores I expected are missing.

For example, the requirements.txt file includes the rouge-score package, yet the output scores do not contain any ROUGE metric. I also tried several times to generate output scores with the --heavy-metric flag, but the heavy metrics are always skipped: whether I include the flag or leave it out, the same metrics are returned.

I attached an example of my output scores below:
[image]

An example of the generations is shown below:
[image]

More information:

  • I cloned the repo into my Google Drive, cd'd into the directory, and pip installed both the regular requirements file and the heavy requirements file. I did this several times to make sure everything was installed.
  • I also tried selecting the metrics manually with --metric-list, but that did not work either.
  • I attempted to pip install gem_metrics, but this did not resolve my issues.
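
The observation that the same metrics come back with and without the flag can be checked mechanically. Below is a minimal sketch, assuming each run writes its scores to a JSON file whose top level is an object; the file names and the helpers `metric_keys` and `compare_score_files` are hypothetical and are not part of GEM-metrics.

```python
import json
from pathlib import Path


def metric_keys(scores, prefix=""):
    """Collect dotted key paths for every entry in a (possibly nested) dict."""
    keys = set()
    for name, value in scores.items():
        path = f"{prefix}{name}"
        keys.add(path)
        if isinstance(value, dict):
            # Recurse into nested score objects, e.g. per-metric sub-scores.
            keys |= metric_keys(value, prefix=f"{path}.")
    return keys


def compare_score_files(default_path, heavy_path):
    """Report keys found only in the second file and whether any key mentions ROUGE."""
    default_keys = metric_keys(json.loads(Path(default_path).read_text()))
    heavy_keys = metric_keys(json.loads(Path(heavy_path).read_text()))
    return {
        "only_in_heavy": sorted(heavy_keys - default_keys),
        "has_rouge": any("rouge" in key.lower() for key in heavy_keys),
    }


if __name__ == "__main__":
    # Hypothetical file names: one run without the flag, one run with it.
    default_file = Path("scores_default.json")
    heavy_file = Path("scores_heavy.json")
    if default_file.exists() and heavy_file.exists():
        print(compare_score_files(default_file, heavy_file))
```

An empty `only_in_heavy` list would confirm that the flag had no effect on which metrics were computed, which is the behaviour described above.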

Could someone help uncover what I am doing wrong?

Kind regards
