The aim of dimensionality reduction is to reduce the number of variables under consideration without removing the information needed to perform a given task. In exploratory data analysis, this means preserving the clustering properties of the data, while in a classification setting, only class separation has to be preserved. By far the most popular tools are principal component analysis (PCA) for the former and linear discriminant analysis (LDA) for the latter. Both project the data onto a linear subspace. With PCA, the subspace is chosen so that most of the variance is preserved. However, there is no guarantee that clustering properties or even class separation are preserved as well. With LDA, the data is projected onto a (C − 1)-dimensional subspace, where C denotes the number of classes, so that class separation is maximized. Apart from unnecessarily restricting the number of dimensions, LDA might destroy discriminative information if its implicit assumptions (normally distributed data) are violated. In this technical report, we present a novel approach to linear dimensionality reduction. The approach is formulated as an optimization problem, which is solved using stochastic gradient descent (SGD). Like LDA, it aims to maximize class separability. Like PCA, it lets the user specify the dimensionality of the subspace. As SGD is very sensitive to the initial conditions, we further present a method to determine suitable starting points for the gradient descent.
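The contrast between the two classical methods can be illustrated with a minimal sketch. It is not the approach proposed in this report; it only shows, on synthetic data invented for the illustration, how a variance-preserving projection can discard class separation that a discriminant projection keeps. The helper names (`make_data`, `pca_direction`, `lda_direction`, `separation`) and all parameter values are assumptions, and the LDA shown is the plain two-class Fisher discriminant.

```python
import numpy as np


def make_data(n=500, seed=0):
    """Two synthetic 2-D classes (an assumption for illustration only).

    Both classes have large variance along x and small variance along y,
    and they differ only in their mean along y.
    """
    rng = np.random.default_rng(seed)
    scale = np.array([5.0, 0.3])
    X0 = rng.normal(size=(n, 2)) * scale + np.array([0.0, -1.0])
    X1 = rng.normal(size=(n, 2)) * scale + np.array([0.0, 1.0])
    return np.vstack([X0, X1]), np.repeat([0, 1], n)


def pca_direction(X):
    """Unit vector along the direction of largest variance."""
    cov = np.cov(X - X.mean(axis=0), rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    return eigvecs[:, -1]


def lda_direction(X, y):
    """Two-class Fisher discriminant direction (labels 0 and 1)."""
    X0, X1 = X[y == 0], X[y == 1]
    # Within-class scatter: sum of the per-class scatter matrices.
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1))
    w = np.linalg.solve(Sw, X1.mean(axis=0) - X0.mean(axis=0))
    return w / np.linalg.norm(w)


def separation(z, y):
    """Distance between projected class means in units of pooled std."""
    z0, z1 = z[y == 0], z[y == 1]
    pooled_std = np.sqrt(0.5 * (z0.var() + z1.var()))
    return abs(z1.mean() - z0.mean()) / pooled_std


X, y = make_data()
sep_pca = separation(X @ pca_direction(X), y)
sep_lda = separation(X @ lda_direction(X, y), y)
print(f"separation after PCA projection: {sep_pca:.2f}")
print(f"separation after LDA projection: {sep_lda:.2f}")
```

In this construction the first principal component aligns with the high-variance x axis, along which the classes overlap almost completely, whereas the Fisher direction aligns with the y axis, along which they are well separated. With two classes, LDA yields exactly one direction (C − 1 = 1), which is the restriction on the subspace dimension mentioned above.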