How to read out the loss function in YOLO algorithm?


How do I read out the loss function used in YOLO? I somehow need it for a class that I'm attending.

EDIT

Got an answer on Reddit!


There are 1 best solutions below


It's a bit of an unexpected question, but I guess I would read it out by describing one term at a time. (Hopefully you meant a high-level description, not literally a phonetic sequence.) I'd say something like this when "reading it out":

  • Overall, we want to perform simultaneous object detection and classification. The indicator function $(\unicode{x1D7D9}_{ij}^{ \text{obj} })$ denotes when the $j$th box predictor in cell $i$ is responsible for the prediction (i.e. of the boxes predicted in that cell, the $j$th has the highest IOU with the true box). Similarly, the indicator $(\unicode{x1D7D9}_{i}^{ \text{obj} })$ denotes whether there is an object in cell $i$. Hatted quantities (e.g. $\widehat{x}$, $\widehat{C}$, $\widehat{p}_i$) are predictions of their unhatted counterparts. The sums over $i$ run over the grid cells of the image, while the sums over $j$ run over the bounding box predictors in each cell.

  • The first term checks that the predicted object box centers are close to the real ones, based on the squared distance between the centers.

  • The second term checks that the sizes (width $w$ and height $h$) of the predicted and true boxes are close to each other, to maximize overlap between them.

  • The third and fourth terms measure the existence confidence (or objectness), i.e. $C_i$ gives the probability of an object being in cell $i$ at all, so the loss wants the confidence of our learner to match whether or not an object is actually present. The third term covers the boxes that are responsible for an object, and the fourth covers the boxes that are not.

  • The fifth term is the classification loss, so that the network correctly categorizes each object if an object exists there.
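For reference, the five terms described above correspond to the loss as I recall it from the original YOLO (v1) paper, where $S^2$ is the number of grid cells, $B$ the number of boxes per cell, and $\lambda_{\text{coord}}$, $\lambda_{\text{noobj}}$ are weights (5 and 0.5 in the paper). Note that the size term compares square roots of widths and heights:

$$
\begin{aligned}
\mathcal{L} ={} & \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \unicode{x1D7D9}_{ij}^{\text{obj}} \left[ (x_i - \widehat{x}_i)^2 + (y_i - \widehat{y}_i)^2 \right] \\
& + \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \unicode{x1D7D9}_{ij}^{\text{obj}} \left[ \left(\sqrt{w_i} - \sqrt{\widehat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\widehat{h}_i}\right)^2 \right] \\
& + \sum_{i=0}^{S^2} \sum_{j=0}^{B} \unicode{x1D7D9}_{ij}^{\text{obj}} \left( C_i - \widehat{C}_i \right)^2 \\
& + \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \unicode{x1D7D9}_{ij}^{\text{noobj}} \left( C_i - \widehat{C}_i \right)^2 \\
& + \sum_{i=0}^{S^2} \unicode{x1D7D9}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left( p_i(c) - \widehat{p}_i(c) \right)^2
\end{aligned}
$$

If it helps to see the terms separately, here is a minimal NumPy sketch of the same sum for a single image. The function name `yolo_v1_loss`, the dict layout, and the toy numbers are my own assumptions for illustration, not taken from any YOLO implementation; in particular, the toy target confidence is simply set to 1 for the responsible box.

```python
import numpy as np

def yolo_v1_loss(pred, true, obj_ij, obj_i, lambda_coord=5.0, lambda_noobj=0.5):
    """Sum-of-squares YOLO (v1-style) loss for one image (illustrative sketch).

    pred, true: dicts of arrays (hypothetical layout, N = number of grid cells)
        "xy"   shape (N, B, 2)  box centers
        "wh"   shape (N, B, 2)  box widths and heights, non-negative
        "conf" shape (N, B)     objectness confidence C
        "cls"  shape (N, K)     class probabilities p(c)
    obj_ij: (N, B) 0/1 mask, 1 where box j in cell i is responsible
    obj_i:  (N,)   0/1 mask, 1 where cell i contains an object
    """
    noobj_ij = 1.0 - obj_ij

    # Term 1: squared distance between true and predicted box centers.
    center = np.sum(obj_ij[..., None] * (true["xy"] - pred["xy"]) ** 2)

    # Term 2: box sizes, compared through square roots of width and height.
    size = np.sum(
        obj_ij[..., None] * (np.sqrt(true["wh"]) - np.sqrt(pred["wh"])) ** 2
    )

    # Terms 3 and 4: confidence error for responsible and non-responsible boxes.
    conf_err = (true["conf"] - pred["conf"]) ** 2
    conf_obj = np.sum(obj_ij * conf_err)
    conf_noobj = np.sum(noobj_ij * conf_err)

    # Term 5: classification error, only in cells that contain an object.
    cls = np.sum(obj_i[:, None] * (true["cls"] - pred["cls"]) ** 2)

    return (lambda_coord * (center + size)
            + conf_obj + lambda_noobj * conf_noobj + cls)

# Toy example: 2 grid cells, 2 boxes per cell, 3 classes.
obj_ij = np.array([[1.0, 0.0], [0.0, 0.0]])  # box 0 of cell 0 is responsible
obj_i = np.array([1.0, 0.0])                 # only cell 0 contains an object
true = {
    "xy": np.zeros((2, 2, 2)),
    "wh": np.full((2, 2, 2), 0.25),
    "conf": obj_ij.copy(),                   # assumption: target confidence is 1
    "cls": np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]),
}

# Start from a perfect prediction, then introduce two small errors.
pred = {name: values.copy() for name, values in true.items()}
pred["xy"][0, 0] = [0.1, 0.0]  # center of the responsible box is off by 0.1
pred["conf"][1, 1] = 0.2       # spurious confidence in an empty cell

print(yolo_v1_loss(true, true, obj_ij, obj_i))
print(yolo_v1_loss(pred, true, obj_ij, obj_i))
```

The center error is weighted by $\lambda_{\text{coord}}$ while the spurious confidence in the empty cell is down-weighted by $\lambda_{\text{noobj}}$, which is how the loss keeps the many empty cells from dominating.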

Might be helpful to look at other YOLO questions: [1], [2], [3], [4], [5], [6].