This package is based on the S3FD implementation from face-alignment.
face-alignment is relatively heavy because it also performs facial landmark detection, and I ran into performance issues in the decoding stage of its S3FD detector that stem from the implementation. To make things faster and simpler, this package provides face detection only and fixes the performance problems in the original decoding implementation.
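To illustrate the kind of decoding step involved, here is a minimal sketch of SSD-style box decoding vectorised with NumPy. It is not the code from this package: the function name `decode_boxes`, the prior layout `(cx, cy, w, h)`, and the variance values `(0.1, 0.2)` are assumptions chosen for illustration.

```python
import numpy as np


def decode_boxes(loc, priors, variances=(0.1, 0.2)):
    """Decode SSD-style regression offsets into corner-format boxes.

    loc:    (N, 4) predicted offsets (dx, dy, dw, dh).
    priors: (N, 4) anchor boxes as (cx, cy, w, h).
    Returns an (N, 4) array of (x1, y1, x2, y2).
    All names and the variance values are illustrative assumptions.
    """
    loc = np.asarray(loc, dtype=np.float64)
    priors = np.asarray(priors, dtype=np.float64)
    # Shift the anchor centres and rescale the anchor sizes for all rows at once.
    centres = priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:]
    sizes = priors[:, 2:] * np.exp(loc[:, 2:] * variances[1])
    # Convert (cx, cy, w, h) to (x1, y1, x2, y2).
    return np.concatenate([centres - sizes / 2, centres + sizes / 2], axis=1)


# Zero offsets return the anchor itself in corner format.
boxes = decode_boxes([[0.0, 0.0, 0.0, 0.0]], [[10.0, 10.0, 16.0, 16.0]])
```

Decoding all candidates in one array operation avoids a Python-level loop over individual positions, which is one plausible way a decoding stage can be made faster.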
pip install git+https://github.com/enhuiz/efd
import torch
import matplotlib.pyplot as plt
from PIL import Image
from torchvision import transforms
from efd import s3fd
# 1. Open an image.
img = Image.open("./example.jpg")
# 2. Use torchvision to transform it as tensor.
img = transforms.ToTensor()(img)
if img.shape[0] == 1:
    # Gray => RGB
    img = torch.repeat_interleave(img, 3, 0)
imgs = torch.stack([img])
# 3. Initialize the s3fd model.
model = s3fd(pretrained=True)
model = model.cuda()
# 4. Detect. The imgs fed to the model are scaled by scale_factor.
# A smaller scale_factor makes inference faster but less accurate.
# Note that the patches are still cropped from the original image.
bbox_lists, patch_iters = model.detect(imgs, scale_factor=0.5)
# 5. Print & plot the results.
print(bbox_lists)
for patch_iter in patch_iters:
    for patch in patch_iter:
        plt.imshow(patch.permute(1, 2, 0).cpu().numpy())
        plt.title(str(patch.shape))
        plt.show()
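The relationship between `scale_factor` and the original image can be sketched as follows. This assumes boxes are given as `(x1, y1, x2, y2)` in the coordinates of the scaled image. `to_original_coords` is a hypothetical helper, not part of `efd`, whose actual output format may differ.

```python
def to_original_coords(box, scale_factor, width, height):
    """Map a box from the scaled image back to the original image.

    box: (x1, y1, x2, y2) in scaled-image pixels (assumed format).
    width, height: size of the original image in pixels.
    """
    x1, y1, x2, y2 = (v / scale_factor for v in box)
    # Clamp to the image so a crop never reaches outside it.
    x1, x2 = max(0.0, x1), min(float(width), x2)
    y1, y2 = max(0.0, y1), min(float(height), y2)
    return (x1, y1, x2, y2)


# A box found on a half-size image, mapped back to a 640x480 original.
box = to_original_coords((50, 40, 150, 140), 0.5, 640, 480)
```

Cropping with the mapped coordinates keeps the patches at full resolution even when detection ran on a smaller image.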
Commit | Time (s) |
---|---|
`git checkout 04eac0a` (from face-alignment, PyTorch decoding) | 5.8595 |
`git checkout master` (NumPy-based decoding) | 1.0739 |
This implementation is around 5.5x faster.
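A comparison like this could be reproduced with a small timing helper. The sketch below is not the benchmark script used for the table: `time_call` is a hypothetical helper and the timed workload is a stand-in to be replaced by the detection call.

```python
import time


def time_call(fn, repeats=5):
    """Return the best wall-clock time in seconds over several runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


# Speedup computed from the two timings reported in the table above.
speedup = 5.8595 / 1.0739

# Time a stand-in workload (replace the lambda with the detection call).
elapsed = time_call(lambda: sum(range(100_000)))
```

When timing a model on the GPU, call `torch.cuda.synchronize()` before reading the clock, since CUDA kernels run asynchronously and the measurement would otherwise stop too early.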