BEGIN:VCALENDAR
VERSION:2.0
PRODID:ECMLPKDD-MB
BEGIN:VEVENT
DTSTAMP:20180826T190000Z
UID:_ecmlpkdd_MACH-D-18-00045
DTSTART;TZID=Europe/Dublin:20180911T150000
DTEND;TZID=Europe/Dublin:20180911T152000
LOCATION:Hogan Mezz 1
TRANSP:TRANSPARENT
SEQUENCE:1
DESCRIPTION:Supervised learning has seen numerous theoretical and practical advances over the last few decades. However\, its basic assumption of identical train and test distributions often fails to hold in practice. One important example is when training instances are subject to label noise: that is\, when the observed labels do not accurately reflect the underlying ground truth. While the impact of simple noise models has been studied extensively\, relatively little attention has been paid to the practically relevant setting of instance-dependent label noise. It is thus unclear whether one can learn\, both in theory and in practice\, good models from data subject to such noise\, with no access to clean labels.\n\nWe provide a theoretical analysis of this issue\, with three contributions. First\, we prove that for instance-dependent noise\, any algorithm that is consistent for classification on the noisy distribution is also consistent on the noise-free distribution. Second\, we prove that consistency also holds for the area under the ROC curve\, provided the noise scales (in a precise sense) with the inherent difficulty of an instance. Third\, we show that the Isotron algorithm can efficiently and provably learn from noisy samples when the noise-free distribution is a generalised linear model. We empirically confirm our theoretical findings\, which we hope may stimulate further analysis of this important learning setting.
SUMMARY:Learning from binary labels with instance-dependent noise
CLASS:PUBLIC
END:VEVENT
END:VCALENDAR