Query By Committee
Query By Committee (QBC) is an active learning algorithm used in machine learning to select the most informative data points for labeling from a large pool of unlabeled data. It works by training a committee of diverse models on the current labeled data and then querying the instances on which the committee members disagree the most. High disagreement indicates that the labeled data does not yet settle the correct prediction, so a label for that instance is likely to be especially valuable for improving model performance. By concentrating annotation on such instances, QBC can reduce the labeling effort and cost of supervised learning tasks.
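The selection step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: it measures disagreement with vote entropy (one common choice; the text above does not prescribe a specific measure), and the names `vote_entropy`, `select_query`, the threshold-based committee, and the pool values are all hypothetical.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the committee's votes for one instance (0 means unanimous)."""
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_query(committee, pool):
    """Return the index of the pool instance with the highest vote entropy."""
    scores = [vote_entropy([member(x) for member in committee]) for x in pool]
    return max(range(len(pool)), key=scores.__getitem__)


# Hypothetical committee: three threshold classifiers standing in for
# models that were trained on the same labeled data but ended up different.
committee = [
    lambda x: int(x > 0.3),
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.7),
]

# Hypothetical unlabeled pool of one-dimensional instances.
pool = [0.1, 0.4, 0.9]

# The committee is unanimous on 0.1 and 0.9 but splits on 0.4,
# so 0.4 is the instance that would be sent for labeling.
query_index = select_query(committee, pool)
```

In a full active learning loop, the selected instance would be labeled, added to the labeled set, and the committee retrained before the next query is chosen.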
Developers should learn and use Query By Committee when working on machine learning projects with limited labeled data, such as in natural language processing, computer vision, or any domain where data annotation is expensive or time-consuming. It is particularly useful when a large pool of unlabeled data is available but only a small fraction can be labeled, because it directs the labeling budget toward the instances expected to improve the model most. It can also be combined with semi-supervised learning, which exploits the remaining unlabeled data directly. Typical applications include medical diagnosis and fraud detection, where labels must come from costly experts.