Refine MASKRCNN examples document (#2411)
LuFinch authored Sep 25, 2023
1 parent 93ec10b commit e3c565c
Showing 2 changed files with 14 additions and 14 deletions.
6 changes: 3 additions & 3 deletions examples/train_horovod/mnist/README.md
@@ -44,7 +44,7 @@ Note: to install oneAPI base toolkit, refer to [Intel GPU Software Installation]
### Check Device Count (Optional)
Run:
```
horovodrun -np 1 -H localhost:1 python tensorflow2_keras_mnist.py
mpirun -np 1 -prepend-rank -ppn 1 python tensorflow2_keras_mnist.py
```

Check how many devices (XPUs) in local machine according output of above command, like:
@@ -62,11 +62,11 @@ In some Intel GPU (like Intel® Data Center GPU Max Series), there are more than
### Running Command
For 2 XPUs:
```
horovodrun -np 2 -H localhost:2 python ./tensorflow2_keras_mnist.py
mpirun -np 2 -prepend-rank -ppn 2 python ./tensorflow2_keras_mnist.py
```
For 4 XPUs:
```
horovodrun -np 4 -H localhost:4 python ./tensorflow2_keras_mnist.py
mpirun -np 4 -prepend-rank -ppn 4 python ./tensorflow2_keras_mnist.py
```
## Output
```
22 changes: 11 additions & 11 deletions examples/train_maskrcnn/README.md
@@ -1,4 +1,4 @@
# Accelerate Mask R-CNN Training w/o horovod on Intel GPU
# Accelerate Mask R-CNN Training on Intel GPU

## Introduction

@@ -46,7 +46,6 @@ source env_itex/bin/activate
```
pip install --upgrade pip
pip install --upgrade intel-extension-for-tensorflow[gpu]
pip install intel-optimization-for-horovod
pip install opencv-python-headless pybind11
pip install pycocotools
pip install -e "git+https://github.com/NVIDIA/dllogger#egg=dllogger"
@@ -69,19 +68,12 @@ cd dataset
bash download_and_preprocess_coco.sh ./data
```

+ Download the pre-trained ResNet-50 weights.

```
python scripts/download_weights.py --save_dir=./weights
```

## Execute the Example

Here we provide single-tile training scripts and multi-tile training scripts with horovod. The datatype can be float32 or bfloat16.

```
DATASET_DIR=./data
PRETRAINED_DIR=./weights
OUTPUT_DIR=/the/path/to/output_dir
```

@@ -107,10 +99,16 @@ python main.py train \
--epochs 1 --steps_per_epoch 20 --log_every=1 --log_warmup_steps=1
```

+ Multi-tile with horovod. Default datatype is fp32. You can use `--amp` flag for bf16.
+ Multi-tile with horovod.

Install `intel-optimization-for-horovod`.
```
mpirun -np 2 -prepend-rank -ppn 1 \
pip install intel-optimization-for-horovod
```
Default datatype is fp32. You can use `--amp` flag for bf16.

```
mpirun -np 2 -prepend-rank -ppn 2 \
python main.py train \
--data_dir $DATASET_DIR \
--model_dir=$OUTPUT_DIR \
@@ -119,6 +117,8 @@ python main.py train \
--epochs 1 --steps_per_epoch 20 --log_every=1 --log_warmup_steps=1
```

**Note:** Only distributed workload needs `intel-optimization-for-horovod`. Please uninstall it if you want to run single tile workload.

## FAQ

1. If you get the following error log, refer to [Enable Running Environment](#Enable-Running-Environment) to Enable oneAPI running environment.
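
Both READMEs touched by this commit launch single-node multi-tile runs as `mpirun -np N -prepend-rank -ppn N ...`, and the Mask R-CNN hunk changes `-ppn 1` to `-ppn 2` so that it matches `-np 2`. A minimal sketch of that pattern follows. The helper name `build_mpirun_cmd` is hypothetical and not part of the repository, and the sketch assumes all ranks run on the local node, so the process count and the processes-per-node count are the same number.

```python
import shlex


def build_mpirun_cmd(num_tiles, script_args):
    """Return an argv list for a single-node launch with one rank per tile.

    Hypothetical helper: it only mirrors the flags shown in the diff
    (-np, -prepend-rank, -ppn) and assumes every rank runs on the local
    node, so -np and -ppn receive the same value.
    """
    if num_tiles < 1:
        raise ValueError("num_tiles must be at least 1")
    n = str(num_tiles)
    return ["mpirun", "-np", n, "-prepend-rank", "-ppn", n, *script_args]


if __name__ == "__main__":
    cmd = build_mpirun_cmd(2, ["python", "./tensorflow2_keras_mnist.py"])
    # Print the command rather than running it, so the sketch has no side effects.
    print(shlex.join(cmd))
```

Keeping the two counts tied to one argument avoids the kind of `-np 2` / `-ppn 1` mismatch that this commit corrects for the single-node case.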