Developers of image processing routines rely on benchmark data sets to give qualitative comparisons of new image analysis algorithms and pipelines. Such data sets need to include artifacts in order to occlude and distort the required information to be extracted from an image. Robustness, the quality of an algorithm related to the amount of distortion is often important. However, using available benchmark data sets an evaluation of illumination robustness is difficult or even not possible due to missing ground truth data about object margins and classes and missing information about the distortion. We present a new framework for robustness evaluation. The key aspect is an image benchmark containing 9 object classes and the required ground truth for segmentation and classification. Varying levels of shading and background noise are integrated to distort the data set. To quantify the illumination robustness, we provide measures for image quality, segmentation and classification success and robustness. We set a high value on giving users easy access to the new benchmark, therefore, all routines are provided within a software package, but can as well easily be replaced to emphasize other aspects.