[PATCH v4] test: ml_perf: add ML performance test #2138
Conversation
	}
}

optind = 1; /* reset 'extern optind' from the getopt lib */
Some ODP testers have this and some don't. Does this have any actual effect?
Only if getopt*() is called again, I guess. So at least in this case, no effect. Left it there anyway. Maybe we should clean these up at some point.
};

typedef struct {
	int mode, num_threads, rounds, latency, warmup;
These, as well as num_batch, could be unsigned integers, since a negative value does not make much sense for them.
It doesn't do any harm, either. The negative range is not used now, but it could be used at some point for some special values, etc. I didn't change these now.
Currently, the user can pass e.g. negative rounds and warmup values and the tester happily continues, giving weird results. There should at least be some sanity checks after parsing the command line.
As a typical user of this tool, I don't need such checks. As a developer, I think that if we want to do user input sanitization (and I don't), then we should probably implement it in a library and do it consistently in all of our testing tools.
if (out_size_final != glb->ref_file_size) {
	ODPH_ERR("Output size mismatch: %" PRIu64
		 " differs from reference file size %" PRIu64 "\n",
		 out_size_final, glb->ref_file_size);
	ret = -1;
	goto error;
}

if (memcmp(glb->ref_file_data, output_final, out_size_final)) {
	ODPH_ERR("Output differs from reference\n");
	ret = -1;
	goto error;
}
I don't think there's a need to fail the thread due to these; this could simply be made visible when printing results at the end.
At least I would definitely want the program to exit with an error if the output doesn't match the reference because in that case something went horribly wrong. Also in CI we want an error exit in that case.
mdl = odp_ml_model_create(glb->opt.model_name, &model_param);
if (mdl == ODP_ML_MODEL_INVALID) {
	ODPH_ERR("odp_ml_model_create() failed\n");
	ret = -1;
	goto error;
}

if (odp_ml_model_destroy(mdl)) {
	ODPH_ERR("odp_ml_model_destroy() failed\n");
	ret = -1;
	goto error;
}
For the operations being measured, it's probably better to record failures by incrementing an error counter instead of failing the thread. Same thing for loading and unloading in test_ml_load().
Again, better to just quit. We're measuring speed, not e.g. the probability of success.
Well, in a performance testing setting, speed and probability of success might be related. Anyways, up to you.
Force-pushed from 46968bd to 411a1b5
v2: All comments processed.
Force-pushed from 411a1b5 to d6c372c
Force-pushed from d6c372c to 7737811
Add machine learning (ML) performance test application. The application measures nanoseconds used for various ML operations, such as creating and destroying a model, loading and unloading a model, and inferencing. Signed-off-by: Jere Leppänen <jere.leppanen@nokia.com>
Add ML performance test run for linux-gen. This runs some basic performance tests using the conv.onnx model. Signed-off-by: Jere Leppänen <jere.leppanen@nokia.com>