Disable memory benchmarking (#589)
Summary:
Our tests have been red for a while due to failing memory benchmarks.

## Issue

When benchmarking Opacus, we run the training script multiple times within one process:

```python
for i in range(args.num_runs):
    run_layer_benchmark(
        ...
    )
```

We use built-in PyTorch tools to check memory stats. Crucially, we verify that `torch.cuda.memory_allocated()` is 0 before each run starts. Normally it should be, since all tensors from the previous run are out of scope and should have been garbage-collected.
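
For illustration, the pre-run check is conceptually along these lines (a minimal sketch; the helper name is hypothetical and not part of the benchmark code):

```python
import torch

def assert_clean_gpu_state() -> None:
    # Before a run starts, no tensor from a previous run should still hold GPU memory.
    allocated = torch.cuda.memory_allocated()
    assert allocated == 0, f"expected 0 bytes allocated before the run, got {allocated}"
    # Reset the peak-memory counter so this run's peak is measured in isolation.
    torch.cuda.reset_peak_memory_stats()
```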

This all worked fine until something changed and some GPU memory started staying allocated between runs. I have no idea why, and explicit cache clearing and object deletion didn't help.
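
For reference, the between-run cleanup that was attempted looks roughly like the sketch below (hypothetical function name; neither step brought `torch.cuda.memory_allocated()` back to 0 in this case):

```python
import gc

import torch

def try_release_gpu_memory() -> int:
    # Best-effort cleanup between runs; returns the bytes still allocated.
    # (Explicit `del` of the previous run's model/optimizer objects was also tried.)
    gc.collect()              # collect any unreachable Python objects holding tensors
    torch.cuda.empty_cache()  # release cached, unused blocks back to the CUDA driver
    return torch.cuda.memory_allocated()
```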

So I gave up and disabled memory benchmarking for now; it appears to have been broken by some PyTorch update, and fixing it doesn't look straightforward.

Pull Request resolved: #589

Reviewed By: JohnlNguyen

Differential Revision: D45691684

Pulled By: karthikprasad

fbshipit-source-id: 82006e503240532840d3fb6dc0314f2202780973
Igor Shilov authored and facebook-github-bot committed Aug 1, 2023
1 parent e8bc932 commit 7b28054
Showing 2 changed files with 1 addition and 3 deletions.
1 change: 0 additions & 1 deletion .circleci/config.yml
```diff
@@ -273,7 +273,6 @@ commands:
             python benchmarks/generate_report.py --path-to-results /tmp/report_layers --save-path benchmarks/results/report-${report_id}.pkl --format pkl
             python benchmarks/check_threshold.py --report-path "./benchmarks/results/report-"$report_id".pkl" --metric runtime --threshold <<parameters.runtime_ratio_threshold>> --column <<parameters.report_column>>
-            python benchmarks/check_threshold.py --report-path "./benchmarks/results/report-"$report_id".pkl" --metric memory --threshold <<parameters.memory_ratio_threshold>> --column <<parameters.report_column>>
           when: always
       - store_artifacts:
           path: benchmarks/results/
```
3 changes: 1 addition & 2 deletions benchmarks/utils.py
```diff
@@ -230,7 +230,7 @@ def generate_report(path_to_results: str, save_path: str, format: str) -> None:
     pivot = results.pivot_table(
         index=["batch_size", "num_runs", "num_repeats", "forward_only", "layer"],
         columns=["gsm_mode"],
-        values=["runtime", "memory"],
+        values=["runtime"],
     )
 
     def add_ratio(df, metric, variant):
@@ -245,7 +245,6 @@ def add_ratio(df, metric, variant):
     if "baseline" in results["gsm_mode"].tolist():
         for m in set(results["gsm_mode"].tolist()) - {"baseline"}:
             add_ratio(pivot, "runtime", m)
-            add_ratio(pivot, "memory", m)
     pivot.columns = pivot.columns.set_names("value", level=1)
 
     output = pivot.sort_index(axis=1).sort_values(
```
