Spotlights Session 1
Workshop: S2D-OLAD: From shallow to deep, overcoming limited and adverse data

Min-Entropy Sampling Might Lead to Better Generalization in Deep Text Classification, Nimrah Shakeel


Fri 7 May 6:34 a.m. PDT — 6:38 a.m. PDT


We investigate the effectiveness of maximum-entropy-based uncertainty sampling for active learning with a convolutional neural network (CNN), when the acquired dataset is used to train another CNN. Our analysis shows that maximum-entropy sampling always performs worse than random i.i.d. sampling on the three datasets investigated, for all sample sizes considerably smaller than half of the dataset. We also compare it with a minimum-entropy sampling strategy and propose using a mixture of the two, which is almost always better than i.i.d. sampling and often beats it by a large margin. Our analysis is limited to the text classification setting.
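
The sampling strategies compared in the abstract can be illustrated with a minimal sketch. It assumes the acquisition model outputs per-example class probabilities and ranks the unlabeled pool by Shannon entropy. The function names, the `max_fraction` parameter, and the 50/50 split in the mixture are hypothetical choices made for illustration; the abstract does not specify how the mixture is formed.

```python
import numpy as np


def predictive_entropy(probs):
    """Shannon entropy (in nats) of each row of class probabilities."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=1)


def select_indices(probs, budget, strategy="mixture", max_fraction=0.5, seed=0):
    """Pick `budget` pool indices to label.

    strategy:
      "iid"     - uniform random sample without replacement
      "min"     - lowest-entropy (most confident) examples
      "max"     - highest-entropy (most uncertain) examples
      "mixture" - some of each; `max_fraction` is an assumed split,
                  not a value taken from the abstract
    """
    n = probs.shape[0]
    if budget > n:
        raise ValueError("budget exceeds pool size")

    if strategy == "iid":
        rng = np.random.default_rng(seed)
        return rng.choice(n, size=budget, replace=False)

    # Pool indices sorted by ascending entropy.
    order = np.argsort(predictive_entropy(probs), kind="stable")

    if strategy == "min":
        return order[:budget]
    if strategy == "max":
        return order[::-1][:budget]
    if strategy == "mixture":
        n_max = int(round(budget * max_fraction))
        n_min = budget - n_max
        # Low-entropy head plus high-entropy tail; these cannot overlap
        # because n_min + n_max == budget <= n.
        return np.concatenate([order[:n_min], order[n - n_max:]])

    raise ValueError(f"unknown strategy: {strategy}")
```

Given an array `probs` of shape `(pool_size, num_classes)`, a call such as `select_indices(probs, budget=100, strategy="mixture")` returns the pool indices to label. The selected examples would then be used to train a second model, as in the transfer setting the abstract describes.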