New: WildDash 2 with 4256 public frames, new labels & panoptic GT!
See also: RailSem19 dataset for rail scene understanding.
For all metrics, higher scores are better. To participate in the benchmark, check our submission instructions.
Column groups: Meta AVG (IoU Class), Classic (IoU Class, iIoU Class, IoU Cat., iIoU Cat.), Negative (IoU Class), Impact on IoU class (Blur through Var.).
Algorithm | Meta AVG: IoU Class | Classic: IoU Class | Classic: iIoU Class | Classic: IoU Cat. | Classic: iIoU Cat. | Negative: IoU Class | Blur | Coverage | Distortion | Hood | Occ. | Overexp. | Particles | Screen | Underexp. | Var. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ltbgnn_trainwd | 50.2% | 53.2% | 49.2% | 73.2% | 69.5% | 43.6% | -9% | -11% | -6% | -6% | -3% | -3% | -11% | -21% | -6% | -16% |
ltbgnn2_rvc | 50.0% | 52.2% | 47.5% | 72.4% | 68.6% | 44.6% | -8% | -11% | -6% | -8% | -3% | -4% | -8% | -18% | -7% | -14% |
MIX6D_RVC | 48.5% | 51.2% | 46.5% | 72.4% | 66.1% | 40.8% | -7% | -5% | -6% | -7% | -4% | -7% | -7% | -17% | -10% | -11% |
test_RVC_1 | 47.5% | 50.8% | 44.0% | 74.2% | 67.5% | 34.4% | -5% | -4% | -4% | -6% | -5% | -1% | -9% | -17% | -9% | -17% |
FAN_NV_RVC | 47.5% | 50.8% | 44.0% | 74.2% | 67.5% | 34.4% | -5% | -4% | -4% | -6% | -5% | -1% | -9% | -17% | -9% | -17% |
UNIV_CNP_RVC_UE | 46.9% | 51.6% | 45.9% | 72.8% | 67.5% | 29.0% | -7% | -6% | -3% | -7% | -0% | -6% | -5% | -14% | -7% | -8% |
SN_DN161_fat_pyrx8 | 46.8% | 51.0% | 43.9% | 71.4% | 65.5% | 32.6% | -7% | -11% | -5% | -9% | -3% | -2% | -7% | -22% | -8% | -8% |
MIX6D_old | 46.6% | 48.6% | 43.3% | 70.7% | 64.7% | 41.6% | -9% | -9% | -3% | -8% | -2% | -0% | -10% | -17% | -10% | -13% |
segformer-data5+1 | 46.6% | 48.6% | 43.3% | 70.7% | 64.7% | 41.6% | -9% | -9% | -3% | -8% | -2% | -0% | -10% | -17% | -10% | -13% |
UNIV_CNP_RVC | 46.3% | 50.4% | 44.7% | 71.3% | 65.9% | 32.0% | -6% | -9% | -3% | -8% | -1% | -5% | -7% | -15% | -7% | -8% |
SN_DN161s3pyrx8 | 45.6% | 49.8% | 41.6% | 71.3% | 65.3% | 31.0% | -10% | -6% | -6% | -10% | -3% | -3% | -6% | -20% | -9% | -10% |
UDSSEG_RVC | 45.5% | 51.0% | 44.3% | 72.1% | 66.2% | 25.4% | -5% | -6% | -5% | -7% | -4% | -0% | -10% | -19% | -11% | -10% |
Anonymous | 45.5% | 51.0% | 44.3% | 72.1% | 66.2% | 25.4% | -5% | -6% | -5% | -7% | -4% | -0% | -10% | -19% | -11% | -10% |
SN_RN152pyrx8_RVC | 45.4% | 48.9% | 42.7% | 70.1% | 64.8% | 32.5% | -6% | -7% | -5% | -7% | -1% | -2% | -7% | -19% | -11% | -3% |
StudentNetwork | 45.3% | 50.6% | 44.2% | 71.9% | 66.7% | 26.5% | -5% | -5% | -6% | -5% | -5% | -1% | -12% | -21% | -10% | -16% |
mmseg segformer22 | 44.5% | 46.9% | 38.4% | 70.4% | 64.1% | 37.7% | -5% | -8% | -2% | -7% | -3% | -6% | -10% | -19% | -11% | -14% |
ltbgnn2_fixbug | 40.5% | 41.2% | 33.7% | 65.4% | 54.2% | 43.4% | -18% | -16% | -3% | -12% | -1% | -20% | -11% | -25% | -12% | -5% |
UniSeg | 39.4% | 41.7% | 35.3% | 65.8% | 57.4% | 34.8% | -18% | -12% | -4% | -13% | -3% | -11% | -9% | -26% | -13% | -20% |
ltbgnn2 | 38.5% | 42.2% | 35.1% | 65.7% | 56.4% | 27.4% | -16% | -14% | -4% | -12% | -1% | -20% | -10% | -23% | -11% | -3% |
SIW_new | 37.9% | 41.9% | 41.6% | 65.7% | 54.1% | 27.2% | -15% | -10% | -5% | -16% | -2% | -12% | -11% | -22% | -8% | -15% |
seamseg_rvcsubset | 37.9% | 41.2% | 37.2% | 63.1% | 58.1% | 30.5% | -16% | -17% | 0% | -7% | -4% | -14% | -18% | -31% | -14% | -7% |
Tong | 37.2% | 41.0% | 41.2% | 65.2% | 53.5% | 26.0% | -18% | -9% | -5% | -16% | -2% | -13% | -12% | -24% | -10% | -1% |
ltbgnn_new | 37.2% | 40.6% | 31.8% | 65.0% | 54.4% | 26.7% | -15% | -13% | -3% | -12% | -0% | -21% | -7% | -23% | -12% | -3% |
seamseg_mvd_ss | 37.1% | 41.3% | 36.9% | 63.4% | 55.7% | 26.6% | -15% | -14% | 0% | -11% | -4% | -11% | -30% | -36% | -20% | -10% |
U_test | 36.9% | 39.1% | 32.6% | 63.2% | 51.2% | 32.2% | -19% | -12% | -5% | -11% | -4% | -11% | -6% | -22% | -12% | -17% |
ltbgnn2_fix | 36.7% | 40.5% | 35.3% | 62.5% | 57.3% | 25.6% | -17% | -14% | -4% | -13% | -1% | -22% | -11% | -24% | -11% | -2% |
SIW | 36.5% | 41.0% | 38.6% | 65.8% | 53.1% | 24.1% | -16% | -17% | -6% | -14% | -2% | -7% | -19% | -23% | -10% | -6% |
ltbgnn | 36.4% | 38.3% | 31.1% | 64.1% | 52.4% | 30.7% | -11% | -10% | -3% | -12% | -1% | -14% | -4% | -28% | -13% | -10% |
UniSeg Baseline | 36.0% | 39.0% | 33.4% | 63.7% | 53.5% | 27.9% | -23% | -14% | -6% | -15% | -2% | -19% | -8% | -26% | -12% | -16% |
hs1 | 35.7% | 40.0% | 38.0% | 64.8% | 52.3% | 23.0% | -17% | -10% | -8% | -18% | -1% | -15% | -11% | -27% | -9% | -9% |
MSeg1080_RVC | 35.2% | 38.7% | 35.4% | 65.1% | 50.7% | 24.7% | -15% | -11% | -9% | -19% | -3% | -14% | -6% | -25% | -8% | -13% |
w_test | 35.0% | 39.1% | 37.6% | 64.5% | 52.3% | 22.4% | -20% | -9% | -8% | -18% | -0% | -12% | -14% | -30% | -11% | 0% |
BASE-DeepLabV2 | 35.0% | 39.5% | 28.9% | 65.6% | 53.0% | 18.7% | -7% | -8% | -9% | -11% | -6% | -14% | -5% | -19% | -7% | -6% |
DeepLabV2@ResNet50 | 34.9% | 39.4% | 28.7% | 65.6% | 53.7% | 18.7% | -8% | -5% | -10% | -12% | -3% | -10% | -3% | -19% | -8% | -4% |
tong_test | 34.6% | 38.7% | 36.3% | 63.6% | 50.8% | 22.5% | -17% | -7% | -8% | -15% | -1% | -12% | -9% | -24% | -11% | -20% |
hs | 34.4% | 38.4% | 36.2% | 64.2% | 52.1% | 22.3% | -19% | -11% | -8% | -18% | 0% | -13% | -15% | -29% | -11% | -6% |
submit_test | 34.0% | 36.6% | 31.2% | 61.4% | 48.2% | 26.6% | -20% | -13% | -6% | -14% | -3% | -16% | -7% | -21% | -9% | -17% |
test_base | 33.8% | 37.8% | 36.1% | 63.1% | 50.6% | 22.1% | -17% | -11% | -7% | -18% | -1% | -12% | -14% | -28% | -13% | -14% |
EffPS_b1bs4sem_RVC | 32.2% | 35.7% | 24.4% | 63.8% | 56.0% | 20.4% | -10% | -6% | -4% | -7% | -1% | -7% | -10% | -25% | -8% | -6% |
CARB | 16.8% | 19.1% | 13.8% | 45.8% | 35.8% | 10.0% | -24% | -2% | -5% | -25% | -2% | -26% | -15% | -33% | -20% | -6% |
DeepTrain | 16.4% | 17.5% | 1.1% | 32.2% | 30.0% | 11.8% | -15% | -7% | 0% | -5% | -7% | -2% | -14% | -18% | -5% | -1% |
WSSS-CLIP-ES | 13.0% | 14.6% | 7.1% | 40.4% | 25.6% | 8.1% | -18% | 0% | -9% | -28% | -2% | -16% | -13% | -32% | -16% | -2% |
FAN_RVC1 | 7.2% | 2.5% | 6.2% | 7.5% | 23.8% | 28.1% | -36% | -19% | -98% | -32% | -2% | -11% | -25% | -45% | -30% | -2% |
FAN_RVC | 7.0% | 2.5% | 6.2% | 7.6% | 23.8% | 26.9% | -36% | -14% | -98% | -31% | -3% | -10% | -25% | -45% | -29% | 0% |
test_RVC | 6.5% | 2.6% | 6.3% | 10.5% | 34.6% | 24.1% | -37% | -12% | -98% | -31% | -3% | -9% | -31% | -43% | -30% | -1% |
WSSS-CLIMS | 1.2% | 1.3% | 0.0% | 4.5% | 6.6% | 1.0% | -2% | -13% | -80% | -6% | -15% | -28% | -15% | 0% | -26% | -17% |
Methodology:
Our benchmark evaluates the negative Impact of common visual hazards on algorithm output performance. It is calculated by this formula:
impact = min(metric_low, metric_high) / max(metric_none, metric_low) - 1.0
The metrics metric_none, metric_low, and metric_high are evaluated on subsets of the benchmark dataset that correspond to the identified severity of the hazard (e.g. the subset Blur_high contains images with a lot of visible blur). Positive impacts are truncated to zero.
An impact of -10% for Blur means that the algorithm's performance is expected to drop by 10 percent on an image with considerable blur, compared with a similar image without noticeable blur.
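The impact calculation described above can be sketched in a few lines of Python. This is only an illustration of the stated formula, not the benchmark's actual evaluation code: the function name `hazard_impact` and the example scores are hypothetical, and the guard against a non-positive reference score is an added assumption.

```python
def hazard_impact(metric_none: float, metric_low: float, metric_high: float) -> float:
    """Relative impact of one hazard, following the formula stated above.

    Each argument is the metric (e.g. IoU class) measured on the subset of
    frames with no, low, or high severity of the hazard.
    """
    reference = max(metric_none, metric_low)
    if reference <= 0.0:
        # Assumption: the formula is undefined without a positive reference score.
        raise ValueError("reference metric must be positive")
    impact = min(metric_low, metric_high) / reference - 1.0
    # Positive impacts are truncated to zero.
    return min(impact, 0.0)


# Hypothetical scores: 50% without blur, 48% with low blur, 45% with high blur.
blur_impact = hazard_impact(0.50, 0.48, 0.45)
print(f"{blur_impact:.0%}")
```

With these made-up scores the result corresponds to the -10% case discussed above: the score on heavily blurred frames is 10 percent lower, in relative terms, than the score on frames without blur.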
These are all currently evaluated hazards:
Blur: Image is noticeably affected by blur (e.g. motion blur, defocusing, compression artifacts...)
Coverage: Normally visible parts of the road are covered (e.g. unusual lane markings, snow, leaves...)
Distortion: Visible lens distortion
Hood: Ego-vehicle is visible, non-windscreen parts (e.g. car hood, mirrors)
Occ.: Objects are partially occluded or cut off by the image border
Overexp.: The scene is overexposed
Particles: Particles in the air obstruct the view (e.g. heavy rain, snow, fog)
Screen: The windscreen is interfering (e.g. interior reflections, wipers, rain on the windscreen)
Underexp.: The image is underexposed
Var. (Variation): Intra-class variations within the image (i.e. unusual representations of labels like unique cars)
More details on evaluation metrics and negative test cases can also be found on the FAQ page.