Deep Learning for Retail
Grocery recognition in supermarkets poses several challenges because groceries exhibit small inter-class variance and considerable intra-class variance. Inter-class variance is small because different products share substantial visual similarities. Intra-class variance arises because datasets typically combine real-world images with reference images of the same product. The visual appearance of products changes over time, and the number of products grows continuously as designs are reworked and new products are released. Standard object classification methods do not scale to this setting because models would need to be fine-tuned continuously to keep up with these changing conditions.
In this project, we lift the requirement that all classes be known at training time, using methods derived from face recognition and meta-knowledge derived from additional sensor information. The setting is the recognition of groceries in unknown supermarkets, i.e., without substantial infrastructural changes. The core idea is to extend face-recognition methods and fine-tune known architectures to distinguish the fine-grained visual differences between grocery products. The required training images are generated semi-automatically from sensor data acquired with modern smart glasses, e.g., the user’s trajectory and a model of the environment. Product candidates in real-world images are found with a sliding-window approach that exploits the observation that products are arranged on shelves.
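
The following is a minimal sketch of how such a pipeline could fit together, not the project's actual implementation. It assumes that a fine-tuned network has already mapped each image crop to an embedding vector (represented here by hand-written toy vectors), that candidates are compared to reference embeddings by cosine similarity, and that a fixed threshold rejects unknown products. The names sliding_windows, match_product, and gallery, the window and stride sizes, and the threshold of 0.8 are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def sliding_windows(image_width, image_height, window_width, window_height, stride):
    """Yield (x, y, w, h) candidate boxes, scanned row by row across the image.

    Scanning in horizontal rows is a simple stand-in for the observation that
    products are arranged on shelves; a real system would likely use more
    shelf-specific knowledge.
    """
    for y in range(0, image_height - window_height + 1, stride):
        for x in range(0, image_width - window_width + 1, stride):
            yield (x, y, window_width, window_height)


def match_product(query_embedding, gallery, threshold=0.8):
    """Return (label, score) of the most similar reference embedding.

    Returns (None, score) if the best score is below the threshold, i.e. the
    candidate is treated as an unknown product. The threshold is an assumption.
    """
    best_label, best_score = None, -1.0
    for label, reference in gallery.items():
        score = cosine_similarity(query_embedding, reference)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Hypothetical gallery: one toy reference embedding per known product.
gallery = {
    "muesli_a": [0.9, 0.1, 0.0],
    "muesli_b": [0.1, 0.9, 0.0],
}

# A new product is added by storing its reference embedding; no retraining.
gallery["juice_c"] = [0.0, 0.1, 0.9]

# Candidate regions for a 640x480 image with 320x240 windows and stride 160.
windows = list(sliding_windows(640, 480, 320, 240, 160))

# In a real pipeline each window would be cropped and embedded by the network;
# here a toy embedding stands in for one such crop.
label, score = match_product([0.05, 0.0, 0.95], gallery)
print(len(windows), label)
```

The point of this design is that the set of classes lives in the gallery rather than in the network's output layer, so adding or updating a product only changes stored reference embeddings. This mirrors how face-recognition systems enroll new identities without retraining.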
This topic is the subject of ongoing research. For questions, please contact Marco Filax.