Efficient and interactive spatial-semantic image retrieval

Ryosuke Furuta, Naoto Inoue, Toshihiko Yamasaki

Research output: Contribution to journal, Article


This paper proposes an efficient image retrieval system. When users wish to retrieve images with semantic and spatial constraints (e.g., a horse is located at the center of the image, and a person is riding on the horse), it is difficult for conventional text-based retrieval systems to retrieve such images exactly. In contrast, the proposed system can consider both semantic and spatial information, because it is based on semantic segmentation using fully convolutional networks (FCN). The proposed system accepts three types of images as queries: a segmentation map sketched by the user, a natural image, or a combination of the two. The distance between the query and each image in the database is calculated based on the output probability maps from the FCN. To make the system efficient in terms of both computational time and memory usage, we employ the product quantization (PQ) technique. The experimental results show that PQ is compatible with the FCN-based image retrieval system, and that the quantization process results in little information loss. The results also show that the proposed method outperforms a conventional text-based search system.
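The retrieval idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random "probability maps" stand in for FCN outputs, and the grid size, class count, number of subspaces `m`, codebook size `k`, the squared Euclidean distance, and all function names are assumptions chosen for illustration. Each database image is represented by its flattened per-pixel class probabilities, compressed with product quantization, and searched with table lookups against an uncompressed query.

```python
import numpy as np

def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centers[j] = members.mean(0)
    return centers

def pq_train(data, m, k):
    """Learn one codebook per subspace (the dimension must be divisible by m)."""
    return [kmeans(s, k, seed=i) for i, s in enumerate(np.split(data, m, axis=1))]

def pq_encode(data, codebooks):
    """Replace each sub-vector with the index of its nearest centroid."""
    subs = np.split(data, len(codebooks), axis=1)
    codes = [((s[:, None, :] - c[None, :, :]) ** 2).sum(-1).argmin(1)
             for s, c in zip(subs, codebooks)]
    return np.stack(codes, axis=1).astype(np.uint8)

def pq_decode(codes, codebooks):
    """Reconstruct approximate vectors from the stored codes."""
    return np.hstack([c[codes[:, i]] for i, c in enumerate(codebooks)])

def pq_search(query, codes, codebooks):
    """Approximate squared distances from one query to all encoded vectors."""
    subs = np.split(query, len(codebooks))
    # One lookup table per subspace: distance from the query part to each centroid.
    tables = [((c - s) ** 2).sum(-1) for s, c in zip(subs, codebooks)]
    return sum(t[codes[:, i]] for i, t in enumerate(tables))

# Hypothetical data: 200 images, a 4x4 grid of pixels, 3 classes per pixel.
rng = np.random.default_rng(0)
n_images, n_pixels, n_classes = 200, 16, 3
database = rng.dirichlet(np.ones(n_classes), size=(n_images, n_pixels))
database = database.reshape(n_images, -1)        # shape (200, 48)
query = rng.dirichlet(np.ones(n_classes), size=n_pixels).reshape(-1)

m, k = 8, 16                                     # 8 subspaces, 16 centroids each
codebooks = pq_train(database, m, k)
codes = pq_encode(database, codebooks)           # 8 bytes per image
approx = pq_search(query, codes, codebooks)
ranking = np.argsort(approx)                     # most similar images first
```

In this sketch each image is stored as 8 one-byte codes instead of 48 floating-point values, and a query needs only `m` small lookup tables regardless of database size, which is the time and memory saving the abstract attributes to PQ.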

Original language: English
Pages (from-to): 18713-18733
Number of pages: 21
Journal: Multimedia Tools and Applications
Issue number: 13
Publication status: Published - 15 Jul 2019



  • Fully convolutional networks
  • Image retrieval
  • Product quantization
  • Semantic segmentation
