Image categorisation through Boosting using cost-minimising strategies for data labelling

Research output: Doctoral Thesis

BibTeX

@phdthesis{c14f556bc3844d98b2f50fc18619cf95,
title = "Image categorisation through Boosting using cost-minimising strategies for data labelling",
abstract = "Previous work has shown that image categorisation using AdaBoost is a powerful method. There, AdaBoost was used to select discriminative features for learning a classifier against a background class. As proposed in earlier work, we present recent extensions to that framework by (a) incorporating geometric relations between features into the weak learner and (b) providing a weight optimisation method to combine pairwise classifiers for multiclass classification. We evaluate our framework on the Xerox data set, where we compare our results to the bag-of-keypoints approach. Moreover, we report our results from the PASCAL VOC Challenge 2006. The mass of images available through image databases and the like is huge, but obtaining the class information needed to learn a classifier is usually considered to be costly. One way to deal with the general problem of costly labels is active learning, where points to be labelled are selected with the aim of creating a classifier with better performance than that of a classifier trained on an equal number of randomly sampled points. Previous work showed that active learning can improve performance compared to standard passive learning. However, the basic question of whether new examples should be queried at all is seldom addressed. This work deals with the labelling cost directly, as recently proposed in our earlier work. The learning goal is defined as the minimisation of a cost which is a function of the expected model performance and the total cost of the labels used. This allows the development of general strategies and specific algorithms for (a) optimal stopping, where the expected cost dictates whether label acquisition should be terminated, and (b) empirical evaluation, where the cost is used as a performance metric for a given combination of learning, stopping and sampling methods.
Though the main focus is optimal stopping, we also aim to provide the background for further developments and discussion within the field of active learning. Experimental results illustrate the proposed evaluation methodology and demonstrate the use of the introduced stopping method.",
keywords = "active learning, optimal stopping, image categorisation, Boosting",
author = "Christian Savu-Krohn",
note = "no embargo",
year = "2009",
language = "English",

}

RIS (suitable for import to EndNote)

TY - THES

T1 - Image categorisation through Boosting using cost-minimising strategies for data labelling

AU - Savu-Krohn, Christian

N1 - no embargo

PY - 2009

Y1 - 2009

AB - Previous work has shown that image categorisation using AdaBoost is a powerful method. There, AdaBoost was used to select discriminative features for learning a classifier against a background class. As proposed in earlier work, we present recent extensions to that framework by (a) incorporating geometric relations between features into the weak learner and (b) providing a weight optimisation method to combine pairwise classifiers for multiclass classification. We evaluate our framework on the Xerox data set, where we compare our results to the bag-of-keypoints approach. Moreover, we report our results from the PASCAL VOC Challenge 2006. The mass of images available through image databases and the like is huge, but obtaining the class information needed to learn a classifier is usually considered to be costly. One way to deal with the general problem of costly labels is active learning, where points to be labelled are selected with the aim of creating a classifier with better performance than that of a classifier trained on an equal number of randomly sampled points. Previous work showed that active learning can improve performance compared to standard passive learning. However, the basic question of whether new examples should be queried at all is seldom addressed. This work deals with the labelling cost directly, as recently proposed in our earlier work. The learning goal is defined as the minimisation of a cost which is a function of the expected model performance and the total cost of the labels used. This allows the development of general strategies and specific algorithms for (a) optimal stopping, where the expected cost dictates whether label acquisition should be terminated, and (b) empirical evaluation, where the cost is used as a performance metric for a given combination of learning, stopping and sampling methods.
Though the main focus is optimal stopping, we also aim to provide the background for further developments and discussion within the field of active learning. Experimental results illustrate the proposed evaluation methodology and demonstrate the use of the introduced stopping method.

KW - active learning

KW - optimal stopping

KW - image categorisation

KW - Boosting

M3 - Doctoral Thesis

ER -