An adapted linear discriminant analysis for the classification in high-dimension, and an application to medical data

Abstract

The classification of normally distributed data is considered in the high-dimensional setting where variables outnumber observations. Under the assumption that the inverse covariance matrices (the precision matrices) are identical across all groups, linear discriminant analysis (LDA) is adapted by plugging in a sparse estimate of this common matrix. Furthermore, a variable selection procedure is developed based on the graph associated with the estimated precision matrix: a discriminant capacity is defined for each connected component of the graph, and the variables of the most discriminant components are retained. The adapted LDA and the variable selection procedure are both evaluated on synthetic data and applied to real data from PET brain images for the classification of patients with Alzheimer’s disease.
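
For orientation, the classification rule outlined above can be written with the standard LDA score, replacing the precision matrix by its sparse estimate. This is only a sketch under assumed notation that does not come from the abstract: K groups, estimated group means \hat\mu_k, estimated prior probabilities \hat\pi_k, and a sparse estimate \hat\Omega of the common precision matrix (obtained, for instance, by a graphical-lasso-type estimator, which is an assumption here). The graph is taken to have an edge between variables i and j whenever \hat\Omega_{ij} \neq 0, with connected components C_1, ..., C_m. The paper's own definition of the discriminant capacity is not reproduced.

```latex
% Assumed notation (not from the abstract): \hat\mu_k, \hat\pi_k, \hat\Omega.
\begin{align*}
% Standard LDA score with a common precision matrix, and the resulting rule
\delta_k(x) &= x^\top \hat\Omega\, \hat\mu_k
              - \tfrac{1}{2}\, \hat\mu_k^\top \hat\Omega\, \hat\mu_k
              + \log \hat\pi_k,
\qquad \hat y(x) = \arg\max_{1 \le k \le K} \delta_k(x). \\
% \hat\Omega is block diagonal over the connected components C_1, ..., C_m,
% so the score splits into one term per component
\delta_k(x) &= \sum_{c=1}^{m} \Bigl( x_{C_c}^\top \hat\Omega_{C_c C_c}\, \hat\mu_{k,C_c}
              - \tfrac{1}{2}\, \hat\mu_{k,C_c}^\top \hat\Omega_{C_c C_c}\, \hat\mu_{k,C_c} \Bigr)
              + \log \hat\pi_k .
\end{align*}
```

The second line holds because variables in different connected components share no nonzero entry of \hat\Omega. Each component therefore contributes its own term to the score, which is what makes a per-component measure of discriminant capacity natural.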

Caroline Chaux
CNRS Researcher
Eric Guedj
University Professor and Hospital Practitioner (Professeur des Universités, Praticien Hospitalier)