Debugging, Profiling and Benchmarking DVC
You can add the -vv flag to any command to increase the verbosity of dvc's log. For example:
$ dvc metrics diff -vv
2021-03-11 09:10:34,136 TRACE: Namespace(cprofile=False, cprofile_dump=None, pdb=False, instrument=False, instrument_open=False, quiet=0, verbose=2, version=None, cd='.', cmd='diff', a_rev=None, b_rev=None, targets=None, recursive=False, all=False, show_json=False, show_md=False, no_path=False, precision=None, func=<class 'dvc.command.metrics.CmdMetricsDiff'>)
2021-03-11 09:10:34,503 DEBUG: Check for update is enabled.
...
If you are using dvc's Python API or dvc.api, you can do the following to increase verbosity. Do this only after all your imports are done, because importing anything from dvc later might change the verbosity again.
import logging

logger = logging.getLogger("dvc")
logger.setLevel(5)  # numeric level below logging.DEBUG (10)
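If you only want the extra verbosity around one call, the same idea can be wrapped in a small helper that restores the previous level afterwards. This is a minimal sketch using only the standard logging module; verbose_logger is a hypothetical name and is not part of dvc.

```python
import logging
from contextlib import contextmanager


@contextmanager
def verbose_logger(name="dvc", level=5):
    """Temporarily lower a logger's level, then restore the previous one.

    Hypothetical helper for illustration; not part of dvc.
    """
    logger = logging.getLogger(name)
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Always restore, even if the wrapped code raises.
        logger.setLevel(previous)


# Finish all dvc imports first, then wrap the calls you want traced.
with verbose_logger():
    pass  # your code here
```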
Any dvc command can be run with the --pdb flag, which drops you into the debugger on any exception. By default, it tries to use ipdb as the debugger and falls back to pdb.
$ dvc metrics diff --pdb
$ dvc stage list data/dvc.yaml --pdb
> /home/user/dvc/dvc/dvcfile.py(133)_load()
132 is_ignored = self.repo.fs.exists(self.path, use_dvcignore=False)
--> 133 raise StageFileDoesNotExistError(self.path, dvc_ignored=is_ignored)
134
ipdb>
Similarly, if you are using the Python API, you can use the dvc._debug.debug context manager to achieve this.
from dvc._debug import debug

with debug():
    pass  # your code here
Alternatively, you can use pdb or breakpoint() directly in your code.
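If you prefer plain pdb, the drop-into-the-debugger-on-exception behavior can be approximated with the standard library. The sketch below is an illustration only, not dvc's implementation; post_mortem_on_error and load_config are hypothetical names, and the debugger argument exists just so the helper can be exercised without an interactive session.

```python
import pdb
import sys
from contextlib import contextmanager


@contextmanager
def post_mortem_on_error(debugger=pdb.post_mortem):
    """Open a post-mortem debugger on any exception, then re-raise it.

    Hypothetical helper for illustration; not dvc's implementation.
    """
    try:
        yield
    except Exception:
        # Hand the traceback of the active exception to the debugger.
        debugger(sys.exc_info()[2])
        raise


def load_config(path):
    """Stand-in for real work that fails."""
    raise FileNotFoundError(path)
```

Wrapping a call as `with post_mortem_on_error(): load_config("missing.yaml")` opens pdb at the frame that raised, and the exception still propagates once you quit the debugger.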
With the --show-stack/--ss flag, you can inspect what dvc is doing at any moment by pressing Ctrl + T on macOS or Ctrl + \ on Linux. This prints the stack of the main thread, which is useful for debugging when dvc freezes or hangs. Not available on Windows.
You can use the dvc._debug.show_stack() context manager in the Python API if you want the same behavior.
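To see roughly how such a stack dump can work, here is a sketch built on the standard signal and traceback modules. It assumes the key presses arrive as SIGINFO (Ctrl + T on macOS) or SIGQUIT (Ctrl + \ on Linux); whether dvc does it this way is an assumption, and install_stack_dump and format_frame_stack are hypothetical names.

```python
import signal
import sys
import traceback


def format_frame_stack(frame):
    """Return the call stack leading to `frame` as a single string."""
    return "".join(traceback.format_stack(frame))


def install_stack_dump(stream=sys.stderr):
    """Print the interrupted stack when the chosen signal arrives.

    Returns the registered signal, or None if neither signal exists
    (for example on Windows). Must be called from the main thread.
    """
    # Prefer SIGINFO where it exists (macOS/BSD), else fall back to SIGQUIT.
    sig = getattr(signal, "SIGINFO", None) or getattr(signal, "SIGQUIT", None)
    if sig is None:
        return None

    def handler(signum, frame):
        # `frame` is where the main thread was when the signal arrived.
        stream.write(format_frame_stack(frame))

    signal.signal(sig, handler)
    return sig
```

Call install_stack_dump() once near the start of a long-running script, then press the key combination whenever the script appears stuck.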
If you are having performance issues, we might ask you for profiling data. DVC supports two kinds of profiling data: deterministic (with cProfile) and statistical (with pyinstrument).
cProfile traces every Python call, which makes the command somewhat slower than running without it. Most of the time, though, it is enough to locate performance issues, so it is what we will usually ask for. One disadvantage of cProfile data is the lack of full-stack records (why those functions are being called). These can be gathered with another profiler, pyinstrument, which is a sampling profiler and has much lower overhead.
The --cprofile-dump <filename> flag generates cProfile data for the given command and writes it to the specified file. You can attach that file to an email, issue, or chat message so we can trace the performance issue. For example:
$ dvc push --cprofile-dump dump.prof
Similarly, if you are using the Python API, you can use dvc._debug.profile to generate the cProfile data.
from dvc._debug import profile

with profile("dump.prof"):  # dumps profiling output to the file
    pass  # your code here

with profile():  # dumps profiling output to the terminal
    pass  # your code here
Alternatively, you can use cProfile.Profile to do the same.
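Using cProfile.Profile directly could look like the sketch below. It relies only on the standard library; profile_to_file is a hypothetical helper name and is not part of dvc.

```python
import cProfile


def profile_to_file(func, filename, *args, **kwargs):
    """Run func(*args, **kwargs) under cProfile and save the raw stats.

    Hypothetical helper for illustration; not part of dvc.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        return func(*args, **kwargs)
    finally:
        profiler.disable()
        # Binary stats file that pstats can load.
        profiler.dump_stats(filename)
```

For example, `profile_to_file(my_function, "dump.prof")` returns whatever my_function returns and leaves the profile in dump.prof, where my_function stands for the code you want to measure.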
snakeviz or tuna can be used to visualize the data. Alternatively, pstats can be used to analyze it.
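For a quick look without a visualizer, a dump file can be summarized with pstats. This is a minimal sketch; print_slowest is a hypothetical helper name, and sorting by cumulative time is just one reasonable choice.

```python
import pstats
import sys


def print_slowest(filename, limit=10, stream=sys.stdout):
    """Print the `limit` entries with the highest cumulative time."""
    stats = pstats.Stats(filename, stream=stream)
    stats.strip_dirs()              # shorten file paths for readability
    stats.sort_stats("cumulative")  # most expensive call trees first
    stats.print_stats(limit)
```

Calling `print_slowest("dump.prof")` on a file produced with --cprofile-dump prints a table with ncalls, tottime and cumtime columns.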
You need to install pyinstrument first, as DVC does not come with it pre-installed. After it is installed, you can add the --instrument-open flag to any dvc command to profile it. Example:
$ dvc status --instrument-open
This will open a webpage with performance results. If you want to print the results to the console instead, use the --instrument flag.
Similarly, when using the Python API, you can use dvc._debug.instrument to achieve this.
from dvc._debug import instrument

with instrument(html_output=True):  # opens a webpage
    pass  # your code here

with instrument():  # prints to the terminal
    pass  # your code here
This requires yappi to be installed. It generates callgrind output, so you may want to install kcachegrind (qcachegrind on macOS/Windows).
After that, you can use the --yappi flag on any DVC command. It generates a callgrind file named callgrind.dvc-XXX.out, which can be viewed using kcachegrind/qcachegrind or any other compatible visualization tool.
Also consider using the --yappi-separate-threads flag, which generates one callgrind file per thread and makes debugging multithreaded code much easier.
Similarly, you can use dvc._debug.yappi_profile from Python code when profiling small APIs.
Also, please check this small guide if you are new to kcachegrind/qcachegrind, to familiarize yourself with its user interface.
This requires viztracer to be installed.
After that, you can use the --viztracer flag on any DVC command. It generates an output file named viztracer.dvc-XXX.json. You can use the --viztracer-depth flag to customize the maximum stack depth, or --viztracer-async to visualize async tasks as separate "threads":
$ dvc status --viztracer --viztracer-depth 8 --viztracer-async
Similarly, you can use dvc._debug.viztracer_profile from Python code when profiling small APIs.
The results can be visualized with vizviewer, which is installed alongside viztracer:
$ vizviewer viztracer.dvc-20220406_091208.json
This requires filprofiler to be installed. After it is installed, you can create a script:
# status.py
from dvc.repo import Repo
Repo().status()
And run it:
$ fil-profile status.py
On Python 3.7 and above, there is a built-in import-time profiler: the -X importtime option. Example:
$ python -X importtime -m dvc --help
You can use tuna to visualize this as well. Example:
$ python -X importtime -m dvc --help 2> startup.log
$ tuna startup.log
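If you just want the slowest imports as text, the log can also be parsed with a few lines of Python. This is a sketch that assumes the "import time: self [us] | cumulative | imported package" line layout that -X importtime writes to stderr; parse_importtime and slowest_imports are hypothetical helper names.

```python
import re
import subprocess
import sys

# Matches e.g. "import time:       147 |        200 |   _io"
LINE = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\|\s+(\S+)\s*$")


def parse_importtime(log_text):
    """Return (cumulative_us, self_us, module) tuples, slowest first."""
    rows = []
    for line in log_text.splitlines():
        match = LINE.match(line)
        if match:  # the header line has no numbers, so it is skipped
            self_us, cumulative_us, module = match.groups()
            rows.append((int(cumulative_us), int(self_us), module))
    return sorted(rows, reverse=True)


def slowest_imports(args, limit=5):
    """Run `python -X importtime <args>` and return the slowest imports."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", *args],
        capture_output=True,
        text=True,
        check=False,
    )
    # The import-time report is written to stderr, not stdout.
    return parse_importtime(proc.stderr)[:limit]
```

For example, `slowest_imports(["-m", "dvc", "--help"])` reports on the same command as above, and `parse_importtime(open("startup.log").read())` works on a log you already saved.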
You can use tools like hyperfine to run benchmarks, as they perform multiple runs and provide statistical analysis.
$ hyperfine "dvc --help" --warmup 3
DVC's performance can be heavily influenced by disk caches, so it is recommended to use warmup runs.
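If hyperfine is not available, the warmup-then-measure idea can be sketched with the standard library. This is only a rough stand-in with far less statistical care than hyperfine; benchmark is a hypothetical helper name.

```python
import statistics
import subprocess
import time


def benchmark(command, runs=10, warmup=3):
    """Time `command` over several runs after discarding warmup runs.

    Returns (mean_seconds, stdev_seconds).
    """
    def run_once():
        start = time.perf_counter()
        subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        return time.perf_counter() - start

    for _ in range(warmup):  # warm the disk caches; timings are discarded
        run_once()
    timings = [run_once() for _ in range(runs)]
    stdev = statistics.stdev(timings) if runs > 1 else 0.0
    return statistics.mean(timings), stdev
```

For example, `benchmark(["dvc", "--help"], warmup=3)` mirrors the hyperfine invocation above.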
debug, profile and instrument can also be used as decorators during development (in addition to context managers). Don't forget to call them as functions, though:
@debug() # ✅
@debug # ❎
For example:
@instrument()
def collect_repo(self, onerror: Callable[[str, Exception], None] = None):
    ...  # method body unchanged
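To see why the parentheses matter, here is a small standalone sketch. It assumes the helpers behave like context managers built with contextlib.contextmanager, which is an assumption about dvc's internals; traced is a hypothetical stand-in, not a dvc function.

```python
from contextlib import AbstractContextManager, contextmanager

events = []


@contextmanager
def traced(label="block"):
    """Hypothetical stand-in for debug/profile/instrument."""
    events.append(f"enter {label}")
    try:
        yield
    finally:
        events.append(f"exit {label}")


@traced()  # called: the resulting context manager wraps the function
def good():
    events.append("body")
    return 42


@traced  # not called: the function itself is passed in as `label`
def bad():
    return 42
```

Here good() runs inside the context and returns 42, while the name bad ends up bound to a context manager object instead of a function, so its body never runs as intended.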
If you want to mix them all, you can also use debugtools.