dali backend device parameter setting question #155

Open
frankxyy opened this issue Aug 26, 2022 · 1 comment
Labels
help wanted Extra attention is needed perf Issues related to DALI or DALI Backend performance

Comments

@frankxyy

frankxyy commented Aug 26, 2022

The content of my dali.py file is as follows:

import nvidia.dali as dali
from nvidia.dali.plugin.triton import autoserialize
import nvidia.dali.types as types

@autoserialize
@dali.pipeline_def(batch_size=1, num_threads=1, device_id=0)
def pipe():
    images = dali.fn.external_source(device="cpu", name="INPUT_0")
    shape_list = dali.fn.external_source(device="cpu", name="INPUT_1")
    images = dali.fn.decoders.image(images, device="mixed", output_type=types.RGB)  # The output of the decoder is in HWC layout.
    images_converted = dali.fn.color_space_conversion(images, device="gpu", image_type=types.RGB, output_type=types.BGR)
    images = dali.fn.resize(images_converted, device="gpu",
                            resize_y=shape_list[0, 2] * shape_list[0, 0],
                            resize_x=shape_list[0, 3] * shape_list[0, 1])
    images = dali.fn.crop_mirror_normalize(images, device="gpu",
                                           dtype=types.FLOAT,
                                           output_layout="CHW",
                                           scale=1.0/255,
                                           mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
                                           std=[0.229, 0.224, 0.225])

    return images, shape_list

One peculiar thing I found is that if I do not set the device parameter for the color_space_conversion, resize, and crop_mirror_normalize operators, the latency rises to 90 ms (compared to 40 ms when I explicitly set the device parameter to 'gpu'). I assumed that when the device parameter is not set, the operators would run on the GPU by default, since the inputs of all three operators are already in GPU memory, but the measured results suggest that my assumption may be wrong. Why does this happen?

@frankxyy frankxyy changed the title dali backend highly lower than simple cpu processing of python backend dali backend highly slower than simple cpu processing of python backend Aug 26, 2022
@frankxyy frankxyy changed the title dali backend highly slower than simple cpu processing of python backend dali backend slightly quicker than simple cpu processing of python backend Aug 27, 2022
@frankxyy frankxyy changed the title dali backend slightly quicker than simple cpu processing of python backend dali backend device parameter setting question Aug 28, 2022
@banasraf
Collaborator

banasraf commented Sep 2, 2022

Hi @frankxyy
As you assumed, adding the device='gpu' argument to those operators shouldn't change anything, because they receive GPU input and their placement is therefore inferred to be on the GPU.
Can you tell me more about how you measured that latency? Did you use perf_analyzer or a custom script? What parameters did you use for the measurements?
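
For reference, if a custom script is used, a minimal sketch of a timing harness could look like the one below. It discards a few warm-up calls, so that one-time initialization costs do not skew the numbers, and then reports the mean, median, and 95th-percentile latency. The names measure_latency and send_request are hypothetical and not part of DALI or Triton; send_request is only a placeholder that would need to be replaced with the actual inference call.

```python
import statistics
import time


def measure_latency(fn, warmup=5, iterations=50):
    """Return latency statistics in milliseconds for repeated calls to fn()."""
    # Warm-up calls are discarded so that one-time costs do not skew the results.
    for _ in range(warmup):
        fn()

    samples_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    samples_ms.sort()
    # Index of the 95th-percentile sample, clamped to the last element.
    p95_index = min(len(samples_ms) - 1, int(0.95 * len(samples_ms)))
    return {
        "mean_ms": statistics.mean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "p95_ms": samples_ms[p95_index],
        "iterations": iterations,
    }


def send_request():
    # Hypothetical placeholder: replace with a real inference request to the server.
    time.sleep(0.002)


if __name__ == "__main__":
    print(measure_latency(send_request, warmup=3, iterations=20))
```

Running both pipeline variants through the same harness, with the same warm-up and iteration counts, would make the 40 ms and 90 ms figures directly comparable.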

@JanuszL JanuszL added the help wanted Extra attention is needed label Sep 2, 2022
@klecki klecki added the perf Issues related to DALI or DALI Backend performance label Apr 19, 2023