Image indexing based on web page segmentation and clustering
Date Issued
2012
Author(s)
Abstract
Thousands of images are nowadays available on the web. These images are accompanied by a wide range of textual descriptors, such as image file names, anchor texts and, of course, surrounding text. Existing systems that attempt to mine information for images using surrounding text suffer from several problems, such as the inability to correctly assign all relevant text to an image and discard the irrelevant text. In this paper, we propose a novel method for indexing web images which is based on textual descriptors. The web document is segmented into visual blocks of text and then each block of text is assigned to the closest image. The text extraction is improved by assigning the text to an image following the intuitive understanding of how close two visual blocks are. The evaluation confirms the validity of the proposed method and demonstrates its possible extensions.
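The assignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the segmentation algorithm or the distance measure, so the sketch assumes the page has already been segmented into axis-aligned rectangles and takes "closeness" to be the Euclidean gap between rectangle edges. The `Block` type, the `gap` and `assign_text_to_images` functions, and all coordinates are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Block:
    """Hypothetical visual block: an axis-aligned rectangle on the rendered page."""
    name: str
    x: float
    y: float
    width: float
    height: float


def gap(a: Block, b: Block) -> float:
    """Euclidean gap between the edges of two rectangles (0 if they touch or overlap).

    Assumption: this stands in for the paper's notion of visual closeness.
    """
    dx = max(b.x - (a.x + a.width), a.x - (b.x + b.width), 0.0)
    dy = max(b.y - (a.y + a.height), a.y - (b.y + b.height), 0.0)
    return hypot(dx, dy)


def assign_text_to_images(text_blocks, images):
    """Map each image name to the names of the text blocks nearest to it."""
    if not images:
        raise ValueError("at least one image block is required")
    index = {img.name: [] for img in images}
    for block in text_blocks:
        nearest = min(images, key=lambda img: gap(block, img))
        index[nearest.name].append(block.name)
    return index


# Made-up layout: two images stacked vertically, with nearby text blocks.
images = [
    Block("img_a", 0, 0, 100, 100),
    Block("img_b", 0, 400, 100, 100),
]
texts = [
    Block("caption_a", 0, 110, 100, 20),
    Block("caption_b", 0, 370, 100, 20),
    Block("sidebar", 120, 420, 50, 50),
]
result = assign_text_to_images(texts, images)
print(result)
```

In this sketch every text block is assigned to exactly one image, so text unrelated to any image would still be indexed under the nearest one; a distance threshold would be one way to discard such text, but that is an extension of the sketch rather than something stated in the abstract.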
File(s)
Name
Tsapatsoulis_2012.pdf
Size
566.76 KB
Format
Adobe PDF
Checksum (MD5)
d3ed4858c44cc26197f87842e416dd69

