We propose an unsupervised method for learning multi-stage hierarchies of sparseconvolutional features. While sparse coding has become an increasingly popularmethod for learning visual features, it is most often trained at the patch level.Applying the resulting filters convolutionally results in highly redundant codesbecause overlapping patches are encoded in isolation. By training convolutionallyover large image windows, our method reduces the redudancy between featurevectors at neighboring locations and improves the efficiency of the overall repre-sentation. In addition to a linear decoder that reconstructs the image from sparsefeatures, our method trains an efficient feed-forward encoder that predicts quasi-sparse features from the input. While patch-based training rarely produces any-thing but oriented edge detectors, we show that convolutional training produceshighly diverse filters, including center-surround filters, corner detectors, cross de-tectors, and oriented grating detectors. We show that using these filters in multi-stage convolutional network architecture improves performance on a number ofvisual recognition and detection tasks