New: WildDash 2 with 4256 public frames, new labels & panoptic GT!
See also: RailSem19 dataset for rail scene understanding.
For all metrics, higher scores are better. To participate in the benchmark, check our submission instructions.
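Algorithm | Meta AVG: IoU Class | Classic: IoU Class | Classic: iIoU Class | Classic: IoU Cat. | Classic: iIoU Cat. | Negative: IoU Class | Impact (IoU Class): Blur | Impact (IoU Class): Coverage | Impact (IoU Class): Distortion | Impact (IoU Class): Hood | Impact (IoU Class): Occ. | Impact (IoU Class): Overexp. | Impact (IoU Class): Particles | Impact (IoU Class): Screen | Impact (IoU Class): Underexp. | Impact (IoU Class): Var. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|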
MSeg_1080 | 48.3% | 49.8% | 43.1% | 63.3% | 56.0% | 65.0% | -7% | -10% | 0% | -20% | 0% | -20% | -7% | -13% | -16% | -9% |
LDN_BIN_768 | 46.9% | 48.8% | 42.8% | 63.6% | 59.3% | 47.7% | -10% | -10% | -1% | -18% | 0% | -23% | -6% | -8% | -25% | -7% |
MSeg | 43.0% | 42.2% | 31.0% | 59.5% | 51.9% | 51.8% | -5% | -7% | -7% | -11% | -3% | -16% | 0% | -5% | -20% | -3% |
LDN_OE | 42.7% | 43.3% | 31.9% | 60.7% | 50.3% | 52.8% | -11% | -13% | -7% | -10% | -5% | -24% | 0% | -6% | -30% | -7% |
LDN_BIN | 41.8% | 43.8% | 37.3% | 58.6% | 53.3% | 54.3% | -14% | -14% | -22% | -14% | -3% | -35% | -3% | -9% | -25% | -8% |
DN169_CAT_DUAL | 41.0% | 41.7% | 34.4% | 57.7% | 49.7% | 52.6% | -4% | -7% | -11% | -10% | -5% | -24% | -7% | -4% | -26% | -9% |
MSeg_low_res | 40.5% | 42.2% | 34.0% | 55.7% | 42.2% | 43.1% | -1% | -17% | -5% | -20% | 0% | -23% | -14% | -11% | -22% | -13% |
AHiSS_ROB | 39.0% | 41.0% | 32.2% | 53.9% | 39.3% | 43.6% | -11% | -12% | -2% | -24% | 0% | -27% | -13% | -13% | -28% | -16% |
MapillaryAI_ROB | 38.9% | 41.3% | 38.0% | 60.5% | 57.6% | 25.0% | -15% | -5% | -4% | -23% | 0% | -23% | -12% | -21% | -25% | -6% |
PSP-IBN-SA_ROB | 38.5% | 39.4% | 33.6% | 60.6% | 51.0% | 65.3% | -18% | -3% | -5% | -18% | -3% | -27% | -17% | -13% | -27% | -12% |
DN_2_4_CWVI_BIN_SEG | 36.6% | 37.9% | 30.9% | 52.5% | 43.7% | 63.5% | -16% | -7% | 0% | -15% | -2% | -30% | -9% | -10% | -41% | -14% |
IBN-PSP-SA_ROB | 33.6% | 34.7% | 30.8% | 55.1% | 38.9% | 68.5% | -8% | 0% | 0% | -22% | 0% | -27% | -23% | -23% | -36% | -8% |
IBN-PSA-SA_ROB | 32.5% | 33.6% | 30.1% | 53.8% | 39.3% | 69.5% | -9% | -1% | 0% | -25% | 0% | -28% | -25% | -20% | -32% | -11% |
LDN2_ROB | 32.1% | 34.4% | 30.7% | 56.6% | 47.6% | 29.9% | -7% | -0% | -11% | -36% | 0% | -37% | -16% | -24% | -42% | -6% |
LDN_ROB | 32.1% | 34.4% | 30.7% | 56.6% | 47.6% | 29.9% | -7% | -0% | -11% | -36% | 0% | -37% | -16% | -24% | -42% | -6% |
BatMAN_ROB | 31.7% | 31.4% | 17.4% | 51.9% | 37.3% | 36.3% | -9% | -8% | -11% | -20% | -11% | -29% | -5% | -10% | -37% | -6% |
Mapillary_ROB | 31.6% | 32.7% | 27.5% | 55.2% | 51.1% | 22.7% | -12% | -7% | -15% | -23% | -1% | -26% | -12% | -28% | -31% | -3% |
HiSS_ROB | 31.3% | 31.0% | 16.3% | 50.3% | 34.6% | 44.1% | -11% | -10% | -11% | -25% | -10% | -32% | -2% | -10% | -44% | -0% |
DeepLabv3+_CS | 30.6% | 34.2% | 24.6% | 49.0% | 38.6% | 15.7% | -13% | -15% | -15% | -34% | 0% | -55% | -17% | -23% | -53% | -6% |
AdapNet2_ROB | 29.5% | 28.7% | 16.5% | 51.5% | 38.0% | 43.6% | -15% | -10% | -20% | -24% | -14% | -21% | -8% | -7% | -37% | -7% |
AdapNetv2_ROB | 29.5% | 28.7% | 16.5% | 51.5% | 38.0% | 43.6% | -15% | -10% | -20% | -24% | -14% | -21% | -8% | -7% | -37% | -7% |
VlocNet++_ROB | 29.2% | 28.4% | 16.4% | 51.3% | 37.3% | 39.4% | -19% | -8% | -17% | -23% | -14% | -23% | -4% | -9% | -36% | -11% |
M_DN | 29.1% | 29.6% | 22.9% | 55.8% | 48.0% | 16.7% | -15% | -9% | -13% | -23% | -7% | -26% | -16% | -14% | -37% | -6% |
DRN_MPC | 28.3% | 29.1% | 13.9% | 49.2% | 29.2% | 15.9% | -17% | -8% | -15% | -32% | -5% | -47% | -3% | -12% | -34% | -9% |
VENUS_ROB_update | 28.2% | 29.8% | 22.7% | 51.5% | 35.0% | 50.6% | -3% | -0% | 0% | -32% | 0% | -42% | -15% | -31% | -43% | -21% |
DN_2_4_CITY_WD | 27.2% | 28.3% | 18.2% | 50.6% | 38.6% | 17.5% | -5% | -3% | -10% | -40% | 0% | -45% | -15% | -23% | -44% | 0% |
DRN_MPS | 26.3% | 27.4% | 11.9% | 47.5% | 27.1% | 12.9% | -19% | -12% | -14% | -32% | -8% | -51% | -9% | -12% | -45% | -14% |
VENUS_ROB | 25.1% | 26.4% | 19.8% | 46.9% | 29.8% | 54.4% | -2% | -0% | 0% | -37% | 0% | -49% | -17% | -30% | -48% | -16% |
GoogLeNetV1_ROB | 22.9% | 22.4% | 17.3% | 36.7% | 36.6% | 50.7% | -21% | -21% | -43% | -26% | -9% | -29% | -21% | -28% | -46% | -2% |
APMoE_seg_ROB | 22.2% | 22.5% | 12.6% | 48.1% | 35.2% | 22.8% | -11% | -2% | -23% | -23% | -4% | -44% | -12% | -11% | -46% | 0% |
PAG_ROB | 22.1% | 21.7% | 12.5% | 48.8% | 35.6% | 34.1% | -9% | -10% | -20% | -27% | -3% | -35% | -6% | -8% | -41% | -3% |
DRN_CS | 14.8% | 15.4% | 7.1% | 28.9% | 14.2% | 7.2% | -43% | -9% | -29% | -29% | -15% | -27% | -18% | -24% | -74% | -35% |
FCN101_ROB | 12.2% | 11.1% | 2.1% | 29.3% | 8.3% | 38.7% | 0% | -7% | -26% | -27% | -11% | -49% | -17% | -4% | -32% | -10% |
PSPNetv0 | 8.3% | 8.5% | 5.5% | 17.7% | 15.5% | 10.1% | -17% | -33% | -10% | -20% | 0% | -34% | -26% | -52% | -30% | -32% |
Methodology:
Our benchmark evaluates the negative impact of common visual hazards on algorithm output performance. The impact of a hazard is calculated with this formula:
impact = min(metric_low, metric_high) / max(metric_none, metric_low) - 1.0
The metrics metric_none, metric_low, and metric_high are evaluated on subsets of the benchmark dataset that correspond to the identified severity of the hazard (e.g. the subset Blur_high contains images which have a lot of blur visible). Positive impacts are truncated to zero.
An impact of -10% for Blur means that the algorithm's performance is expected to degrade by 10 percent when the input image contains considerable blur, compared with a similar image without noticeable blur.
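The impact formula and its truncation rule can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's evaluation code: the function name `hazard_impact`, the error raised for a non-positive baseline, and the example scores are all assumptions made for this sketch.

```python
def hazard_impact(metric_none: float, metric_low: float, metric_high: float) -> float:
    """Relative impact of one hazard, following the formula above.

    The three arguments are the metric (e.g. IoU Class) measured on the
    subsets with no, low, and high severity of the hazard.
    Returns a value <= 0.0, e.g. -0.10 for a 10 percent degradation.
    """
    baseline = max(metric_none, metric_low)
    if baseline <= 0.0:
        # Assumption of this sketch: a non-positive baseline is treated as an error.
        raise ValueError("baseline metric must be positive")
    degraded = min(metric_low, metric_high)
    impact = degraded / baseline - 1.0
    # Positive impacts are truncated to zero.
    return min(impact, 0.0)


# Hypothetical scores on the Blur subsets (illustrative values, not benchmark results).
print(f"{hazard_impact(0.50, 0.48, 0.45):.0%}")  # prints -10%
```

With these hypothetical values the baseline is 0.50 (the better of the none and low subsets) and the degraded score is 0.45 (the worse of the low and high subsets), which gives the -10% case described above.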
These are all currently evaluated hazards:
Blur: Image is noticeably affected by blur (e.g. motion blur, defocusing, compression artifacts...)
Coverage: Normally visible parts of the road are covered (e.g. unusual lane markings, snow, leaves...)
Distortion: Visible lens distortion
Hood: Ego-vehicle is visible, non-windscreen parts (e.g. car hood, mirrors)
Occ.: Objects are partially occluded or cut off by the image border
Overexp.: The scene is overexposed
Particles: Particles in the air obstruct the view (e.g. heavy rain, snow, fog)
Screen: The windscreen is interfering (e.g. interior reflections, wipers, rain on the windscreen...)
Underexp.: The image is underexposed
Var.: Intra-class variations within the image (i.e. unusual representations of labels like unique cars)
More details on evaluation metrics and negative test cases can also be found on the FAQ page.