Our benchmark evaluates the negative impact of common visual hazards on algorithm performance. The impact is calculated using the following formula:
impact = min(metric_low, metric_high) / max(metric_none, metric_low) - 1.0
The metrics metric_none, metric_low, and metric_high are evaluated on subsets of the benchmark dataset that correspond to the respective severity of the hazard (e.g. the subset Blur_high contains images with a lot of visible blur). Positive impacts are truncated to zero.
An impact of -10% for Blur translates to an expected performance degradation of 10 percent when the algorithm is given an input image with considerable blur, as opposed to a similar image without noticeable blur.
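The impact calculation above can be sketched in a few lines of Python. The helper name `hazard_impact` and the example metric values are illustrative assumptions, not part of the benchmark itself; the metrics are assumed to be scores where higher is better:

```python
def hazard_impact(metric_none, metric_low, metric_high):
    """Relative performance drop caused by one hazard (hypothetical helper).

    Each argument is the algorithm's score on the subset with no, low,
    or high severity of that hazard (higher scores are better).
    """
    impact = min(metric_low, metric_high) / max(metric_none, metric_low) - 1.0
    return min(impact, 0.0)  # positive impacts are truncated to zero


# Example with made-up scores: 0.80 without blur, 0.72 on heavy blur
print(hazard_impact(0.80, 0.78, 0.72))  # ≈ -0.10, i.e. a 10% degradation
print(hazard_impact(0.80, 0.85, 0.90))  # 0.0: a positive impact is truncated
```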
These are the currently evaluated hazards:
Blur: Image is noticeably affected by blur (e.g. motion blur, defocusing, compression artifacts...)
Coverage: Normally visible parts of the road are covered (e.g. unusual lane markings, snow, leaves...)
Distortion: Visible lens distortion
Hood: Ego-vehicle is visible, non-windscreen parts (e.g. car hood, mirrors)
Occl: Objects are partially occluded or cut off by the image border
Overexp.: The scene is overexposed
Particle: Particles in the air obstruct the view (e.g. heavy rain, snow, fog)
Screen: The windscreen is interfering (e.g. interior reflections, wipers, rain on the windscreen, ...)
Underexp.: The image is underexposed
Variation: Intra-class variations within the image (i.e. unusual representations of labels, like unique cars)
More details on the evaluation metrics and negative test cases can be found on the FAQ page.