This paper addresses the problem of sparse nonparametric conditional density estimation based on samples and prior knowledge. The prior knowledge may be restricted to parts of the state space and may be given either as generative models in the form of mean-function constraints or as probabilistic models in the form of Gaussian mixtures. The key idea is to introduce additional constraints and a modified kernel function into the conditional density estimation problem. This way of incorporating prior knowledge is generic: it applies to any nonparametric conditional density estimation approach phrased as a constrained optimization problem. Experiments with both support vector machine-based and integral distance-based conditional density estimation demonstrate the quality of the estimates, their sparseness, and the improvements achievable by using prior knowledge.