Fine-Grained Open-World Recognition

Computer-aided visual perception is one of the fundamental problems in computer vision research. Most approaches aim to predict the likeliest label from a predefined set, which is typically fixed during a dataset's acquisition.
Decades of research were required to predict the likeliest (predefined) label with sufficient accuracy for everyday use. Although the currently available, often data-driven approaches work reasonably well, their ability to predict object labels is similar to that of a three-year-old child. These labels are coarse, such as distinguishing between mammals (e.g., dogs and cats). Fine-grained objects (e.g., different dog breeds) pose new challenges because only minute differences separate one object label from another. More research is needed on open datasets, that is, without the closed-set requirement of having a complete set of labels available during implementation.
We refer to this category of problems as fine-grained recognition problems. This research investigates the current state of the art in fine-grained open-world recognition (specifically, retail product recognition) and aims to improve its accuracy. We research approaches for overcoming the shortage of finely labeled datasets by exploiting metaknowledge of the environment, and we demonstrate how these approaches can be applied to acquire datasets at a significant scale. We also research different approaches for recognizing the identifier of fine-grained retail products in real-world scenarios. Moreover, we aim to reduce the number of manual annotations required during training.

This topic is under ongoing research. For questions, please contact Marco Filax.

