
Iterations for Benchmark #176

Closed
stanbrub opened this issue Oct 3, 2023 · 1 comment · Fixed by #317

stanbrub commented Oct 3, 2023

Currently, determining an average processing Rate for benchmarks comes from multiple nightly runs. However, variability in the benchmarks for an operation can come from multiple sources: source code changes, platform changes, and operational rate fluctuations.

  • Source Code Changes: Any change to the codebase (especially the DH engine) can affect an operation even if the operation is not targeted for work.
  • Platform Changes: Hardware changes, JVM version, Docker version, dependency versions, etc. can all affect Rates without any code changes (currently, the benchmarks are run on the same bare-metal hardware every night)
  • Operational Rate Changes: Individual operation rates can swing wildly even when neither source code nor platform changes have occurred

Possible methods of doing iterations (at least for operations that have the worst variability when run back-to-back); a median-selection sketch follows the list:

  • Provision more servers: 3 for starters, but make it configurable and take the median run with metrics
    • Pros: Iterations for all standard operations
    • Cons: More expensive
  • Iterate on worst offenders: After the nightly run, identify the worst offenders for variability and iterate on those
    • Pros: Cheaper. Keeps some operationally variable benchmarks off the worst-offenders list
    • Cons: Some operations are iterated. Others are not. Yet all are compared.
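Either way, picking the representative run could look something like the sketch below. This is a minimal illustration, not the project's actual code; the RunResult record, its field names, and the example rates are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class MedianRun {

    /** Hypothetical per-iteration result: the operation, where it ran, and its measured rate. */
    record RunResult(String operation, String serverHost, double rowsPerSec) {}

    /** Sorts the iterations by rate and returns the middle one, dropping outliers on both sides. */
    static RunResult medianOf(List<RunResult> results) {
        List<RunResult> sorted = new ArrayList<>(results);
        sorted.sort(Comparator.comparingDouble(RunResult::rowsPerSec));
        return sorted.get(sorted.size() / 2);  // upper-middle element for even counts
    }

    public static void main(String[] args) {
        // Three iterations of the same operation, e.g. one per provisioned server
        List<RunResult> runs = List.of(
                new RunResult("where", "bench-01", 61_200_000),
                new RunResult("where", "bench-02", 78_500_000),
                new RunResult("where", "bench-03", 64_900_000));
        System.out.println("Median run: " + medianOf(runs));
    }
}
```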

Notes:

  • Logging: Docker logs for nightly runs are per-operation
    • Co-locate them with the test results so that, when selecting the median run, the logs come along?
  • Multi-server Failures: If a test run fails on one server but not on the others (see the result-row sketch after this list)
    • Do we fail the whole thing? Or use the successful runs?
    • Change origin column to include server host as well as component?
    • Add result column that includes iteration count?
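To make the multi-server questions concrete, here is a hedged sketch of folding per-server iterations into a single published row. It illustrates the "use the successful runs" option only; the record types, field names, and aggregation choice are assumptions for illustration, not decisions made in this ticket.

```java
import java.util.Comparator;
import java.util.List;

public class MultiServerSummary {

    /** One iteration's outcome on one host; 'failed' marks a run that errored there. */
    record Iteration(String serverHost, double rowsPerSec, boolean failed) {}

    /** The single row published for the operation, with provenance columns. */
    record ResultRow(String operation, String originHost, double rowsPerSec, int iterations) {}

    /** Ignores failed servers, publishes the median of the successes, and records the count. */
    static ResultRow summarize(String operation, List<Iteration> all) {
        List<Iteration> ok = all.stream().filter(i -> !i.failed()).toList();
        if (ok.isEmpty())
            throw new IllegalStateException("All iterations failed for " + operation);
        Iteration median = ok.stream()
                .sorted(Comparator.comparingDouble(Iteration::rowsPerSec))
                .toList()
                .get(ok.size() / 2);
        return new ResultRow(operation, median.serverHost(), median.rowsPerSec(), ok.size());
    }

    public static void main(String[] args) {
        List<Iteration> runs = List.of(
                new Iteration("bench-01", 61_200_000, false),
                new Iteration("bench-02", 0, true),            // failed on this server
                new Iteration("bench-03", 64_900_000, false));
        System.out.println(summarize("where", runs));
    }
}
```

Carrying originHost and iterations through to the results table would cover the last two questions above.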
stanbrub added the enhancement (New feature or request) label Oct 3, 2023
stanbrub linked a pull request Jul 12, 2024 that will close this issue
stanbrub self-assigned this Jul 12, 2024
stanbrub (Collaborator, Author) commented:
Most of this ticket is solved by #317. That approach is not as ambitious but does the job of providing a way to add extra iterations for problematic benchmarks. The rest will be done with Auto-Provisioning Benchmark Hardware.
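For illustration only, opting a noisy benchmark into extra iterations might look like the sketch below; the benchmark.iterations property name, the best-of aggregation, and the runOnce() placeholder are hypothetical and are not the interfaces added by #317.

```java
import java.util.stream.IntStream;

public class ExtraIterations {

    public static void main(String[] args) {
        // e.g. -Dbenchmark.iterations=3 for a known-noisy operation; defaults to a single run
        int iterations = Integer.getInteger("benchmark.iterations", 1);

        double bestRate = IntStream.range(0, iterations)
                .mapToDouble(i -> runOnce())  // placeholder for the real timed operation
                .max()
                .orElseThrow();
        System.out.printf("Best of %d iteration(s): %.0f rows/sec%n", iterations, bestRate);
    }

    /** Placeholder measurement; a real run would time the Deephaven operation under test. */
    static double runOnce() {
        return 60_000_000 + Math.random() * 20_000_000;
    }
}
```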
