Probabilistic intersection-over-union for training and evaluation of oriented object detectors

Federal University of Rio Grande do Sul

Transactions on Image Processing (IEEE TIP) 2024

Comparison with other methods

[Figure panels: Smooth L1, GWD, KLD, ProbIoU (Ours)]

Abstract

Oriented object detection is a challenging and relatively new problem. Most existing approaches are based on deep learning and use Oriented Bounding Boxes (OBBs) to represent the objects. They are typically adaptations of traditional detectors that work with Horizontal Bounding Boxes (HBBs), which rely on IoU-like loss functions to regress the HBBs. However, extending this idea to OBBs is challenging because of complex formulations or the need for customized backpropagation implementations. Furthermore, OBBs have limitations for irregular or roughly circular objects, since defining the ideal OBB is an ambiguous and ill-posed problem. In this work, we jointly tackle the problem of training, representing, and evaluating oriented detectors. We explore Gaussian distributions, called Gaussian Bounding Boxes (GBBs), as fuzzy representations for oriented objects and propose a similarity metric between two GBBs based on the Hellinger distance. We show that this metric leads to a differentiable closed-form expression that can be used directly as a localization loss term to train OBB object detectors. We also show that GBBs have a natural representation as elliptical regions (called EBBs), which inherently mitigates the representation ambiguity for circular objects. Finally, we empirically show that the proposed similarity metric computed between two GBBs strongly correlates with the IoU between the corresponding EBBs, motivating the name Probabilistic Intersection-over-Union (ProbIoU). Our experiments show that results using ProbIoU as a regression loss are competitive with state-of-the-art alternatives without requiring additional hyperparameters or customized implementations, and that ProbIoU is a promising alternative for evaluating oriented object detectors.

Approach

With a fuzzy object representation based on GBBs, the Bhattacharyya distance $ B_D $ between two boxes has a closed-form expression in terms of the GBB parameters.
Considering that $ p \sim \mathcal{N}(\boldsymbol{\mu}_1, \Sigma_1) $ and $ q \sim \mathcal{N}(\boldsymbol{\mu}_2, \Sigma_2) $ are Gaussian distributions with

$$ \boldsymbol{\mu}_1 = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}, \quad \Sigma_1 = \begin{bmatrix} a_1 & c_1 \\ c_1 & b_1 \end{bmatrix}, \quad \boldsymbol{\mu}_2 = \begin{pmatrix} x_2 \\ y_2 \end{pmatrix}, \quad \Sigma_2 = \begin{bmatrix} a_2 & c_2 \\ c_2 & b_2 \end{bmatrix} $$

we obtain:

$$ B_D = \frac{1}{8}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)^T \Sigma^{-1} (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2) + \frac{1}{2} \ln\left( \frac{\det \Sigma} {\sqrt{\det \Sigma_1 \det \Sigma_2}} \right), \quad \Sigma = \frac{1}{2}(\Sigma_1 + \Sigma_2) $$
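Because the covariances are 2×2, the inverse and determinants can be written out explicitly. The following expansion is a sketch derived from the expression above; the split into $ B_1 $ and $ B_2 $ is a labeling used here for readability:

$$ B_1 = \frac{1}{4} \, \frac{(a_1 + a_2)(y_1 - y_2)^2 + (b_1 + b_2)(x_1 - x_2)^2 - 2(c_1 + c_2)(x_1 - x_2)(y_1 - y_2)}{(a_1 + a_2)(b_1 + b_2) - (c_1 + c_2)^2} $$

$$ B_2 = \frac{1}{2} \ln\left( \frac{(a_1 + a_2)(b_1 + b_2) - (c_1 + c_2)^2}{4 \sqrt{(a_1 b_1 - c_1^2)(a_2 b_2 - c_2^2)}} \right), \quad B_D = B_1 + B_2 $$

Here $ B_1 $ follows from $ \Sigma^{-1} = \frac{2}{(a_1 + a_2)(b_1 + b_2) - (c_1 + c_2)^2} \begin{bmatrix} b_1 + b_2 & -(c_1 + c_2) \\ -(c_1 + c_2) & a_1 + a_2 \end{bmatrix} $, and the factor 4 in $ B_2 $ comes from $ \det \Sigma = \frac{1}{4} \det(\Sigma_1 + \Sigma_2) $.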

The Bhattacharyya coefficient $ B_C $, the Hellinger distance $ H_D $, and ProbIoU then follow as:

$$ B_C = e^{-B_D}, \quad H_D(p,q) = \sqrt{1 - B_C(p,q)}, \quad \text{ProbIoU}(p,q) = 1 - H_D(p,q) $$
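The chain from $ B_D $ to ProbIoU can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation: the function name `prob_iou` and the tuple layout `(x, y, a, b, c)` are assumptions chosen to mirror the notation above, and the 2×2 inverse and determinants are expanded by hand.

```python
import math


def prob_iou(gbb1, gbb2):
    """ProbIoU between two Gaussian bounding boxes (illustrative sketch).

    Each GBB is a tuple (x, y, a, b, c): mean (x, y) and covariance
    [[a, c], [c, b]]. The tuple layout is an assumption of this example.
    """
    x1, y1, a1, b1, c1 = gbb1
    x2, y2, a2, b2, c2 = gbb2

    # Entries and determinant of Sigma1 + Sigma2.
    a, b, c = a1 + a2, b1 + b2, c1 + c2
    det_sum = a * b - c * c

    # First term: (1/8) d^T Sigma^{-1} d with Sigma = (Sigma1 + Sigma2) / 2.
    dx, dy = x1 - x2, y1 - y2
    term1 = 0.25 * (a * dy * dy + b * dx * dx - 2.0 * c * dx * dy) / det_sum

    # Second term: (1/2) ln(det Sigma / sqrt(det Sigma1 * det Sigma2)),
    # using det Sigma = det(Sigma1 + Sigma2) / 4.
    det1 = a1 * b1 - c1 * c1
    det2 = a2 * b2 - c2 * c2
    term2 = 0.5 * math.log(det_sum / (4.0 * math.sqrt(det1 * det2)))

    b_d = term1 + term2                   # Bhattacharyya distance
    b_c = math.exp(-b_d)                  # Bhattacharyya coefficient
    h_d = math.sqrt(max(1.0 - b_c, 0.0))  # Hellinger distance (clamped against round-off)
    return 1.0 - h_d


# Example values are arbitrary and only meant to exercise the function.
box = (0.0, 0.0, 4.0, 1.0, 0.5)
shifted = (1.0, 0.5, 4.0, 1.0, 0.5)
print(prob_iou(box, box))      # identical GBBs give the maximum similarity
print(prob_iou(box, shifted))  # a shifted copy scores strictly between 0 and 1
```

The sketch assumes both covariance matrices are positive definite, so the determinants are positive and the logarithm is well defined. Scaling both GBBs by a common factor $ s $ (means by $ s $, covariances by $ s^2 $) leaves the result unchanged, since numerator and denominator of both terms scale by $ s^4 $.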

$ \text{ProbIoU}(p,q) $ ranges from 0 to 1 and equals 1 when the two GBBs coincide. The table below compares its properties with those of other localization losses:

Loss	Implementation	Scale invariance	Metric properties	Hyperparameters
r-IoU	Hard	✓	✓	--
Smooth ℓ₁	Easy	×	×	--
GWD	Easy	×	×	τ, f(·)
KLD	Easy	✓	×	τ, f(·)
ProbIoU	Easy	✓	✓	--

BibTeX

@ARTICLE{10382963,
  author={Murrugarra-Llerena, Jeffri and Kirsten, Lucas N. and Zeni, Luis Felipe and Jung, Claudio R.},
  journal={IEEE Transactions on Image Processing},
  title={Probabilistic Intersection-Over-Union for Training and Evaluation of Oriented Object Detectors},
  year={2024},
  volume={33},
  number={},
  pages={671-681},
  keywords={Detectors;Location awareness;Object detection;Measurement;Training;Gaussian distribution;Annotations;Computer vision;object detection;performance evaluation},
  doi={10.1109/TIP.2023.3348697}
}