Add File Size to Parquet Metrics #310

stanbrub · 2024-06-12T01:21:15Z

Currently, we collect read/write rates for the Parquet benchmarks. For the multi-column tests that are meant to allow comparison between codecs (e.g. snappy, gzip), Shivam would like to see the resulting Parquet file size as well. (Stan would like a more generic way to pull extra metrics into adhoc runs, so this is a good fit.)

This has been done manually before, but it makes sense to automate it, since there are other metrics of interest (installation size, memory usage, etc.) that are not currently shown in a meaningful or obvious way.

- Add file size metric collection to the FileTestRunner (this should be relatively straightforward)
- Improve the adhoc snippet to allow selection of a metric by property name
  - Some metrics may be unique to certain benchmarks
  - Ensure proper null handling if the specified metrics are not present
- Pull the file size metric in beside the rate column in the result table for the run sets
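For the first item, a minimal sketch of the measurement itself, written in Python for illustration only. It is not the FileTestRunner code; `path_size_bytes` is a hypothetical helper, and the assumption is that the Parquet output may be either a single file or a directory of part files. A missing path returns `None` so downstream queries can treat the metric as absent.

```python
from pathlib import Path


def path_size_bytes(path):
    """Return the size in bytes of a file, or the total size of all files
    under a directory. Returns None if the path does not exist.

    Hypothetical helper for illustration; not part of the benchmark project.
    """
    p = Path(path)
    if not p.exists():
        # No output was written, so report the metric as missing
        return None
    if p.is_file():
        return p.stat().st_size
    # Assumption: directory output means the size is the sum of its files
    return sum(f.stat().st_size for f in p.rglob("*") if f.is_file())
```

Reporting `None` rather than `0` for a missing path keeps "no file was produced" distinguishable from "an empty file was produced".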

stanbrub · 2024-07-12T18:53:39Z

A metric called 'data.file.size' has been added for the Parquet tests. Queries have been updated to allow metrics that are null for some benchmarks and provided for others. The adhoc snippet also now lets the user specify metric names, which are pulled in as columns beside existing columns like op_rate.
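The null-tolerant behavior described here can be sketched in plain Python. This is not the actual adhoc snippet or its queries; `add_metric_columns`, the row layout, the benchmark names, and the sample values are all assumptions made for illustration. Only the names 'data.file.size' and op_rate come from this issue.

```python
def add_metric_columns(results, metrics, metric_names):
    """Attach one column per requested metric name to each result row.

    results: list of dicts, each with a "benchmark_name" key
    metrics: list of (benchmark_name, metric_name, value) tuples
    Benchmarks that lack a requested metric get None in that column.
    """
    lookup = {(bench, name): value for bench, name, value in metrics}
    rows = []
    for row in results:
        merged = dict(row)  # copy so the input rows are not modified
        for name in metric_names:
            merged[name] = lookup.get((row["benchmark_name"], name))
        rows.append(merged)
    return rows


# Illustrative values only, not real benchmark output
results = [
    {"benchmark_name": "parquet-write-snappy", "op_rate": 1000},
    {"benchmark_name": "some-other-benchmark", "op_rate": 2000},
]
metrics = [("parquet-write-snappy", "data.file.size", 12345)]

table = add_metric_columns(results, metrics, ["data.file.size"])
```

In this sketch every result row is kept, and a benchmark without the requested metric simply shows `None` in that column, similar in spirit to a left join.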

stanbrub added the enhancement New feature or request label Jun 12, 2024

stanbrub self-assigned this Jun 12, 2024

stanbrub linked a pull request Jul 12, 2024 that will close this issue

Design Changes for Dir Struct, Tagged Iterations, Metrics #317

Merged

stanbrub closed this as completed in #317 Jul 12, 2024
