Dataset-specific metrics #21
As just discussed in the larger group, we need the following:
Perhaps we could define a schema for an AllenNLP-style config file:

```json
{
  "input_file": "/path/to/input.txt",
  "output_file": "/path/to/output.json",
  "metrics": [
    {
      "name": "bertscore",
      "model": "mbert",
      "output_key": "bertscore_mbert"
    },
    {
      "name": "bertscore",
      "model": "roberta",
      "output_key": "bertscore_roberta"
    }
  ]
}
```

We could also define task-specific suites, each of which would run a pre-defined set of metrics:
"input_file": "/path/to/input.txt",
"output_file": "/path/to/output.json",
"task_suite": "summarization"
} |
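A minimal sketch of how a runner might consume either config shape. Everything here is hypothetical, not existing GEM code: the metric registry, the `bertscore` placeholder, and the suite definitions are assumptions for illustration.

```python
import json
from collections import Counter
from typing import Callable, Dict, List

# Placeholder metric; a real implementation would call e.g. the bert-score package.
def bertscore(predictions: List[str], references: List[str], model: str) -> float:
    raise NotImplementedError

# Registry mapping metric names in the config to implementations (hypothetical).
METRICS: Dict[str, Callable[..., float]] = {"bertscore": bertscore}

# Pre-defined task suites, which simply expand into ordinary metric configs.
TASK_SUITES: Dict[str, List[dict]] = {
    "summarization": [
        {"name": "bertscore", "model": "mbert", "output_key": "bertscore_mbert"},
        {"name": "bertscore", "model": "roberta", "output_key": "bertscore_roberta"},
    ],
}

def run_config(config_path: str, references: List[str]) -> None:
    with open(config_path) as f:
        config = json.load(f)

    # An explicit "metrics" list wins; otherwise expand the named task suite.
    metric_configs = config.get("metrics") or TASK_SUITES[config["task_suite"]]

    with open(config["input_file"]) as f:
        predictions = [line.strip() for line in f]

    results = {}
    for mc in metric_configs:
        # Anything besides "name"/"output_key" is passed through as a metric parameter.
        extra = {k: v for k, v in mc.items() if k not in ("name", "output_key")}
        results[mc["output_key"]] = METRICS[mc["name"]](predictions, references, **extra)

    with open(config["output_file"], "w") as f:
        json.dump(results, f, indent=2)
```

The `output_key` field lets the same metric run twice with different parameters (as in the bertscore example above) without the results colliding.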
@sebastianGehrmann you're assuming a global config for GEM tasks, right? @danieldeutsch (if it's a global config, then) maybe it would make sense to integrate this in
I have a couple of metrics in mind that are dataset-specific. How should I go about this?

For example, for the global recall metric I could preprocess the training data and, if the references have an identifier, use that identifier to load the relevant data. Is that the best solution?
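A sketch of that preprocess-then-lookup approach. The issue doesn't define global recall or the training-data format, so the record schema, the `dataset_id` key, and the recall definition below (coverage of the training-reference vocabulary) are all assumptions:

```python
import json
from collections import Counter
from typing import Dict, List

def preprocess_training_data(train_path: str, cache_path: str) -> None:
    """One-time step: count tokens in the training references and cache the
    counts keyed by dataset identifier (record format assumed here)."""
    with open(train_path) as f:
        records = json.load(f)  # assumed: [{"dataset_id": ..., "target": ...}, ...]
    counts: Dict[str, Counter] = {}
    for rec in records:
        counts.setdefault(rec["dataset_id"], Counter()).update(rec["target"].split())
    with open(cache_path, "w") as f:
        json.dump({k: dict(v) for k, v in counts.items()}, f)

def global_recall(predictions: List[str], dataset_id: str, cache_path: str) -> float:
    """Fraction of the cached training vocabulary for this dataset that the
    predictions cover (one plausible reading of 'global recall')."""
    with open(cache_path) as f:
        vocab = set(json.load(f)[dataset_id])
    predicted = {tok for pred in predictions for tok in pred.split()}
    return len(vocab & predicted) / len(vocab) if vocab else 0.0
```

If the references carry an identifier, the metric only needs that identifier at evaluation time; the expensive pass over the training data happens once, up front.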