
inference speed is slow #6

Open
tlxnulixuexi opened this issue Dec 17, 2023 · 7 comments

Comments

@tlxnulixuexi

Hello @tersekmatija,
I'm sorry to bother you; I have a few questions I'd like to ask.
When I train the eWaSR model on the LaRS dataset, training is very fast, but inference is very slow. Compared with WaSR, the inference speed shows no big advantage, which is far from the roughly tenfold speedup reported in your paper. What could be the reason?

@tersekmatija
Owner

Hey @tlxnulixuexi ,
What script do you use for prediction? Training on a different dataset should not have a noticeable impact on the inference speed, since the only thing that should change are the model's weights.

@tlxnulixuexi
Author

Could you help me check whether there is a problem with the code below? I am using the predict.py file in eWaSR, with code added to calculate FPS, and running inference on the GPU, but the measured result is only 22.7 FPS on average. Thanks.

```python
import time

...

def predict(args):
    if args.dataset == "mods":
        dataset = MODSDataset(args.dataset_config, normalize_t=PytorchHubNormalization())
    else:
        dataset = MaSTr1325Dataset(args.dataset_config, normalize_t=PytorchHubNormalization(), include_original=True)

    dl = DataLoader(dataset, batch_size=args.batch_size, num_workers=1)

    # Prepare model
    model = models.get_model(args.model, num_classes=args.num_classes, pretrained=False,
                             mixer=args.mixer, enricher=args.enricher, project=args.project)
    state_dict = load_weights(args.weights)
    model.load_state_dict(state_dict)
    predictor = Predictor(model, args.fp16)

    output_dir = Path(args.output_dir)
    if not output_dir.exists():
        output_dir.mkdir(parents=True)

    start_time = time.time()
    processed_images = 0

    for features, labels in tqdm(iter(dl), total=len(dl)):
        pred_masks = predictor.predict_batch(features)

        for i, pred_mask in enumerate(pred_masks):
            pred_mask = SEGMENTATION_COLORS[pred_mask]
            orig_img = features["image_original"][i].numpy()
            pred_mask = np.transpose(orig_img, (1, 2, 0)) * 0.7 + pred_mask * 0.3
            pred_mask = pred_mask.astype(np.uint8)
            mask_img = Image.fromarray(pred_mask)

            out_file = output_dir / labels["mask_filename"][i]
            mask_img.save(out_file)

            processed_images += 1

    end_time = time.time()
    elapsed_time = end_time - start_time
    fps = processed_images / elapsed_time
    print(f"Processed {processed_images} images in {elapsed_time:.2f} seconds. FPS: {fps:.2f}")

...
```

@tersekmatija
Owner

Hey @tlxnulixuexi ,

Sorry for the late reply. The way you measure FPS is not correct: you need to call torch.cuda.synchronize(), which waits for all CUDA streams to complete before the time measurement is taken. More insights into why here.
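For illustration, here is a minimal self-contained sketch of that synchronized timing pattern. The Conv2d layer, input size, and iteration counts are placeholders, not the repo's actual benchmark; the same bracketing applies to any model:

```python
import time
import torch

# Placeholder model and input; any module works the same way.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Conv2d(3, 8, 3).to(device).eval()
x = torch.randn(1, 3, 64, 64, device=device)

with torch.no_grad():
    for _ in range(5):          # warm-up, so one-time initialization is excluded
        model(x)

if device == "cuda":
    torch.cuda.synchronize()    # drain queued kernels before starting the clock
start = time.perf_counter()
with torch.no_grad():
    for _ in range(20):
        model(x)
if device == "cuda":
    torch.cuda.synchronize()    # wait for the last kernel before stopping it
fps = 20 / (time.perf_counter() - start)
print(f"{fps:.2f} FPS")
```

Without the second synchronize, the clock stops while kernels are still running and the FPS number measures launch overhead, not inference.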

I pushed the slightly modified benchmarking code used for the paper to tools/benchmark.py. After installing the dev requirements with `pip install -r requirements-dev.txt`, you can run it with `python3 tools/benchmark.py -d GPU -niter 300`.

The benchmark should produce lines with latency and FPS like:

```
eWaSR                          ----- 008.737 [009.06, 001.57] ms latency ----- 114.45 FPS
WaSR                           ----- 089.036 [089.74, 002.47] ms latency ----- 011.23 FPS
```

It will also visualize the density of the latency measurements of each prediction.
[Attached plot: latency density per model (models_full_GPU)]

Let me know if this works for you.

@tlxnulixuexi
Author

Hello @tersekmatija,
Thank you very much for your reply. I will run it again using the method you provided. Thank you again for your advice.

@tersekmatija
Owner

Thanks @tlxnulixuexi ,
Feel free to close the issue if you can replicate the results.

Best,
Matija

@tlxnulixuexi
Author

Hello Matija Teršek,
Sorry to bother you again. I have just entered this field, so my foundation is relatively weak and there is a lot I don't understand well yet. Regarding the benchmark.py file you uploaded to eWaSR, a few things are unclear to me:

1. How can I use weight files trained on other datasets in benchmark.py to run inference and calculate FPS?
2. You use a single image file for inference in benchmark.py. If I modify the code to use an image folder, will it affect other parts?
3. From which point should the FPS measurement start, and why did my way of calculating FPS give such a bad result?

I would be very grateful if you could answer these questions.

@tersekmatija
Owner

> How can I use weight files trained on other datasets in benchmark.py to run inference and calculate FPS?

The weights shouldn't affect the speed. If you want, you can load your weights onto the models here.
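As a minimal sketch of that weight-loading step: the Linear layer and file name below are placeholders standing in for the repo's model and your trained checkpoint, following the torch.save/load_state_dict pattern already used in predict.py above:

```python
import torch

# Placeholder model; in the repo this would be the model built for benchmarking.
model = torch.nn.Linear(4, 2)

# Stand-in for your own trained checkpoint saved with torch.save(state_dict).
torch.save(model.state_dict(), "my_weights.pth")

# Load the checkpoint and swap the weights into the model before benchmarking.
state_dict = torch.load("my_weights.pth", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()  # weights change the outputs, not the latency
```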

> You use a single image file for inference in benchmark.py. If I modify the code to use an image folder, will it affect other parts?

It depends on the loading/pre-processing speed, but it should not.

> From which point should the FPS measurement start, and why did my way of calculating FPS give such a bad result?

A short article that explains it: https://www.speechmatics.com/company/articles-and-news/timing-operations-in-pytorch.
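To summarize the point for this thread: the clock should bracket only the forward pass (with a synchronize on GPU), leaving data loading, colorizing, and mask saving outside the timed region. A sketch with placeholder model and inputs:

```python
import time
import torch

# Placeholder model and pre-made batches; in predict.py these come from the DataLoader.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Conv2d(3, 8, 3).to(device).eval()
batches = [torch.randn(1, 3, 64, 64, device=device) for _ in range(10)]

infer_time = 0.0
for x in batches:
    if device == "cuda":
        torch.cuda.synchronize()        # start the clock on an idle device
    t0 = time.perf_counter()
    with torch.no_grad():
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()        # make sure the forward pass has finished
    infer_time += time.perf_counter() - t0
    # colorizing and saving the mask would happen here, outside the timed span

fps = len(batches) / infer_time
print(f"pure-inference FPS: {fps:.2f}")
```

Timing the whole loop (including Image.fromarray and mask_img.save, as in the snippet earlier in this thread) measures disk I/O as much as the model, which is one reason the measured 22.7 FPS was so low.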
