Show the models listed in model_urls
go run main.go list
Download the models listed in model_urls to model_dir using
go run main.go downloadmodels -o ~/data/carml/dlperf
Show downloaded model paths
go run main.go paths --model_path ~/onnx_models
model_path and output_path can be a folder or a file.
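As a rough illustration of what accepting "a folder or a file" can mean in practice, the sketch below expands a path into a list of model files. It is not taken from this tool's source: the function name collectModels and the rule of matching files by the .onnx suffix are assumptions made for the example.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// collectModels is a hypothetical helper (not part of this tool).
// A file path is returned as-is; a directory is walked recursively
// and every file ending in ".onnx" is collected.
func collectModels(path string) ([]string, error) {
	info, err := os.Stat(path)
	if err != nil {
		return nil, err
	}
	if !info.IsDir() {
		return []string{path}, nil
	}
	var models []string
	err = filepath.WalkDir(path, func(p string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if !d.IsDir() && strings.HasSuffix(d.Name(), ".onnx") {
			models = append(models, p)
		}
		return nil
	})
	if err != nil {
		return nil, err
	}
	return models, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: collect <model_path>")
		return
	}
	models, err := collectModels(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	for _, m := range models {
		fmt.Println(m)
	}
}
```

Passing ~/onnx_models would list every model found under that folder, while passing a single model.onnx file would list just that file.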
go run main.go layerstats --model_path ~/onnx_models/emotion_ferplus/model.onnx --format dot
go run main.go layerstats --model_path ~/onnx_models --output_path assets/layer_stats --format json
dot from graphviz is needed. On macOS, install it using brew install graphviz.
Get the flops information for alexnet using
go run main.go flopsinfo --model_path ~/onnx_models/bvlc_alexnet/model.onnx
Get flops information per layer using
go run main.go flopsinfo --model_path ~/onnx_models/bvlc_alexnet/model.onnx --full
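For readers unfamiliar with per-layer flop counts, the following sketch works through the usual estimate for a single convolution layer. It is only an illustration: the Conv2D type, its fields, and the layer shape (an AlexNet-style first convolution) are assumptions for the example, and the counting convention used by flopsinfo itself is not described here.

```go
package main

import "fmt"

// Conv2D describes a square 2-D convolution.
// The type and all of its fields are illustrative, not part of this tool.
type Conv2D struct {
	InChannels, OutChannels int64
	InputSize               int64 // input height and width (assumed equal)
	Kernel, Stride, Pad     int64
}

// OutputSize returns the output height/width using the standard
// floor((in + 2*pad - kernel) / stride) + 1 formula.
func (c Conv2D) OutputSize() int64 {
	return (c.InputSize+2*c.Pad-c.Kernel)/c.Stride + 1
}

// MACs returns the number of multiply-accumulate operations for one
// input image: one MAC per kernel weight per output element.
func (c Conv2D) MACs() int64 {
	out := c.OutputSize()
	return out * out * c.OutChannels * c.InChannels * c.Kernel * c.Kernel
}

func main() {
	// Assumed shape: 3x224x224 input, 96 filters of 11x11, stride 4, no padding.
	conv := Conv2D{InChannels: 3, OutChannels: 96, InputSize: 224, Kernel: 11, Stride: 4}
	fmt.Println("output size:", conv.OutputSize())
	fmt.Println("MACs:", conv.MACs())
	fmt.Println("FLOPs (counting 2 per MAC):", 2*conv.MACs())
}
```

Whether a multiply-accumulate is counted as one or two operations varies between tools, so numbers from different sources may differ by a factor of two.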
Get weights information per layer using
go run main.go weightsinfo --model_path ~/onnx_models/bvlc_alexnet/model.onnx --output_file=out
Store the recalled benchmarks in json using
go run main.go benchinfo --model_path ~/data/carml/ --benchmark_database results/v100/8.json --short=false --batch_size=8 --human=true -o assets/benchinfo/v100 -f json
model_path and output_path can be a folder or a file.
Find the patterns of length 4
go run main.go patterns --model_path ~/onnx_models/ --length 4
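One way to picture a "pattern of length 4" is as a run of four consecutive layer types. The sketch below counts such runs with a sliding window over a flat list of operator names. This is an assumption about what a pattern is, not a description of how the patterns command is implemented; countPatterns and the sample operator list are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// countPatterns counts every contiguous run of n operator types in ops.
// It is an illustrative helper, not part of this tool.
func countPatterns(ops []string, n int) map[string]int {
	counts := make(map[string]int)
	if n <= 0 {
		return counts
	}
	// Slide a window of width n across the operator sequence.
	for i := 0; i+n <= len(ops); i++ {
		key := strings.Join(ops[i:i+n], ">")
		counts[key]++
	}
	return counts
}

func main() {
	// Hypothetical layer sequence, assumed to be already linearized.
	ops := []string{
		"Conv", "BatchNormalization", "Relu",
		"Conv", "BatchNormalization", "Relu",
		"Conv", "Add",
	}
	for pattern, count := range countPatterns(ops, 4) {
		fmt.Printf("%d\t%s\n", count, pattern)
	}
}
```

A real model is a graph rather than a list, so an actual implementation would also have to decide how branches are traversed.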
Generate the benchmark files of a model or across models at model_path. Use --forward and --backward to control whether to generate benchmarks for the forward and backward passes.
go run main.go benchgen --model_path ~/data/carml/dlperf/Emotion-FerPlus/emotion_ferplus/model.onnx --forward=true --backward=false -o test_generated_benchmarks.hpp
or
./scripts/gen_benchmarks.sh
Query the benchmark database at benchmark_database to get information on the model at model_path
go run main.go benchinfo --model_path ~/data/carml/dlperf/BVLC_AlexNet/bvlc_alexnet/model.onnx --benchmark_database results/Tesla_V100-SXM2-16GB/1.json --output_file=testout --short=false --batch_size=1 --human=false --strategy=parallel --total=true --format=json --trim_layer_name=false
Options:
- trim_layer_name: limit the layer name length
- strategy: "parallel" executes the layers on the short path; "serial" serializes the layers
- total: output total number of all layers
You can draw a graph with the runtime data using the following command
go run main.go benchinfo --model_path ~/data/carml/dlperf/ArcFace/resnet100/resnet100.onnx --benchmark_database results/v100/8.json --short=false --batch_size=8 --human=true --strategy=parallel --show --highlight_fast_path
You can query both kernels and metrics that are dumped by cudnn|scope using the following commands
go run main.go benchinfo --model_path ~/data/carml/dlperf/ResNet50-v1/resnet50v1/resnet50v1.onnx --benchmark_database results/v100/profile/8.json.gz --short=false --batch_size=8 --human=true --strategy=parallel --metrics --format=json
go run main.go benchinfo --model_path ~/data/carml/dlperf/ResNet50-v1/resnet50v1/resnet50v1.onnx --benchmark_database results/v100/profile/8.json.gz --short=false --batch_size=8 --human=false --strategy=parallel --metrics --output_file=tmp.tbl --flops_only --flops_aggregate=true
go run main.go benchinfo --model_path ~/data/carml/dlperf/ResNet50-v1/resnet50v1/resnet50v1.onnx --benchmark_database results/v100/profile/8.json.gz --short=false --batch_size=8 --human=false --strategy=parallel --metrics --output_file=tmp.tbl --flops_only --flops_aggregate=true --metric_filter=flop_count_sp --trim_layer_name=false
go run main.go benchinfo --model_path ~/data/carml/dlperf/ResNet50-v1/resnet50v1/resnet50v1.onnx --benchmark_database results/v100/profile/8.json.gz --short=false --batch_size=8 --human=false --strategy=parallel --metrics --output_file=tmp.tbl --trim_layer_name=false --total=false --format=csv --kernels_only=true