Research Statement

 

1. Indexing & Retrieval of Large Chinese Calligraphic Character Databases

The large body of existing Chinese calligraphic scripts is a valuable part of Chinese cultural heritage. As shown in Fig. 1, although these scripts are available through public libraries or the Internet, they can hardly be retrieved by optical character recognition (OCR), which performs well on machine-printed characters against a clean background. No effective technique yet supports the retrieval of Chinese calligraphic characters written in different styles, for the following reasons:

(1) Complexity: It has been estimated that a character is composed of 12.71 strokes on average [Zhang 1992; Wu et al. 1992]. The size of a stroke varies with the total number of strokes composing the character, and the size of a stroke segment depends on the total number of segments composing the stroke.

(2) Deformation: The same writer in different moods can produce different styles of the same character. Sometimes a character is deformed deliberately for a better artistic effect.

(3) Degradation: Many ancient calligraphic works have been degraded by natural causes.

Fig. 1. An example of a Chinese calligraphic character script

Moreover, no efficient techniques have so far been proposed to index and retrieve large Chinese calligraphic character databases. In essence, indexing Chinese calligraphic characters is a high-dimensional data indexing problem, on which considerable research has been done [Bohm et al. 2001]. Unfortunately, the existing high-dimensional indexing methods cannot be applied directly to Chinese calligraphic characters because of their unique characteristics:

• To depict each Chinese calligraphic character, a set of contour points is extracted from its image [Cadal 2005]. Owing to the complexity of Chinese characters, the number of contour points extracted from each image is very large (in general, above 150). As a consequence, if each point occupies an entry in the character's vector representation, the dimensionality of the vector is very high, and conventional multi-dimensional indexing techniques such as the R-tree [Guttman 1998] and the k-d-tree [Bentley 1975] cannot be used because of the “curse of dimensionality”.

• The number of contour points differs from character to character, depending on the complexity of the shape. Thus the dimensionalities of the representative vectors vary across characters. Many existing high-dimensional indexing schemes (e.g., R-tree [Guttman 1998], VA-file [Weber et al. 1998]) are not suitable for indexing Chinese calligraphic characters because they cannot handle high-dimensional search with dynamic dimensionalities well.
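The second characteristic can be illustrated with a minimal sketch. It assumes each contour point is an (x, y) pair occupying one entry of the vector; the function names and the point counts (160 and 215) are made up for illustration and are not taken from the cited papers. A point-wise distance, which fixed-dimension indexes rely on, is simply undefined when two characters have different numbers of contour points.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # one contour point (x, y); an illustrative assumption


def fixed_dim_distance(a: List[Point], b: List[Point]) -> float:
    """Point-wise Euclidean distance, defined only for contours of equal length."""
    if len(a) != len(b):
        raise ValueError(f"dimensionality mismatch: {len(a)} vs {len(b)}")
    return math.sqrt(sum((ax - bx) ** 2 + (ay - by) ** 2
                         for (ax, ay), (bx, by) in zip(a, b)))


def comparable(a: List[Point], b: List[Point]) -> bool:
    """True if the two contours can be compared entry by entry."""
    try:
        fixed_dim_distance(a, b)
        return True
    except ValueError:
        return False


# Made-up contours: a simpler character and a more complex one.
simple_char = [(float(i), 0.0) for i in range(160)]
complex_char = [(float(i), 1.0) for i in range(215)]

print(comparable(simple_char, simple_char))   # same dimensionality
print(comparable(simple_char, complex_char))  # different dimensionality
```

Padding or resampling the contours would force a common dimensionality, but at the cost of distorting the shape information, which is one reason a dedicated index is attractive.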

In this project, we propose a novel high-dimensional indexing scheme for the effective and efficient retrieval of Chinese calligraphic characters (see [Zhuang et al. TALIP 07] and [Zhuang et al. CIKM’06]).

 

Here are three demos: [Demo 1], [Demo 2], [Demo 3]

 

 

2. Database Support for Efficient Large-Scale Cross-Media Retrieval

 

Content-based multimedia retrieval and indexing have been studied extensively for many years. The existing approaches, however, focus mainly on single-modality retrieval, such as content-based image retrieval [1] and content-based video retrieval [2]. Yet media objects of different modalities (e.g., image, audio, and video) in webpages are, to some extent, linked by latent semantic correlations. Cross-media retrieval [5], a new multimedia retrieval method, has received increasing attention in the research community. It returns media objects of multiple modalities in response to a query media object of a single modality. For example, when a user submits a “tiger” image, the retrieval system may return “tiger”-related audio and video clips by using a so-called Cross Reference Graph [5]. Compared with traditional single-modality retrieval [1,2], cross-media retrieval tries to break through the restriction of media modality in multimedia retrieval. So far, research on cross-media retrieval and indexing has barely begun. Its fundamental challenges are twofold:

• How can the correlation among media objects of different modalities be modeled? For example, it is hard to measure the correlation between an image and an audio clip that carry the same semantic information, since they are media objects of two different modalities.

• How can a cross reference graph be generated and indexed via link analysis among webpages downloaded from the Internet, so that cross-media retrieval over different modalities (e.g., image, audio, and video) is both effective and efficient?
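To make the idea of a cross reference graph concrete, the following is a minimal sketch under a strong simplifying assumption: two media objects are linked whenever they appear on the same webpage. This co-occurrence rule, the function names, and the sample pages are all hypothetical; they stand in for the link analysis mentioned above and are not the method used in Octopus [5].

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# (modality, identifier), e.g. ("image", "tiger.jpg"); an illustrative encoding
MediaObject = Tuple[str, str]


def build_cross_reference_graph(
    pages: Dict[str, List[MediaObject]]
) -> Dict[MediaObject, Set[MediaObject]]:
    """Link every pair of distinct media objects that share a webpage."""
    graph: Dict[MediaObject, Set[MediaObject]] = defaultdict(set)
    for objects in pages.values():
        for a in objects:
            for b in objects:
                if a != b:
                    graph[a].add(b)
    return dict(graph)


def cross_media_neighbors(
    graph: Dict[MediaObject, Set[MediaObject]], query: MediaObject
) -> List[MediaObject]:
    """Return linked objects whose modality differs from the query's."""
    return sorted(obj for obj in graph.get(query, set()) if obj[0] != query[0])


# Made-up crawl result: page name -> media objects found on that page.
pages = {
    "page1.html": [("image", "tiger.jpg"), ("audio", "tiger_roar.wav")],
    "page2.html": [("image", "tiger.jpg"), ("video", "tiger_hunt.mpg")],
    "page3.html": [("image", "panda.jpg"), ("audio", "bamboo.wav")],
}

graph = build_cross_reference_graph(pages)
print(cross_media_neighbors(graph, ("image", "tiger.jpg")))
```

A query by image thus reaches audio and video objects through graph edges rather than through a direct comparison of low-level features, which sidesteps the difficulty of measuring similarity across modalities.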

To the best of our knowledge, Octopus [5] is the first prototype system to support multi-modality retrieval. No index, however, has been considered in this system, so its retrieval performance is not satisfactory when the number of media objects becomes large. To speed up large-scale cross-media retrieval, multi-feature indexing methods have received much attention. In [3], Jagadish proposed a multi-feature indexing method. Shen et al. [4] also proposed a multi-scale indexing (MSI) structure for web image retrieval, which combines two kinds of image information: semantic and visual features. However, the usability of MSI is very limited, and it cannot support a unified indexing expression for multi-modality media objects. In this project, based on a cross reference graph [5], we propose a novel integrated indexing scheme called CIndex (see [Zhuang et al. JOS 2008]), which is specifically designed for cross-media retrieval over large multi-modality media databases. With the aid of CIndex, a cross-media retrieval of a query example in high-dimensional spaces is transformed into a range query in a single-dimensional space. Moreover, our technique permits the immediate generation of a few results while additional results are still being searched for.
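The transformation of a high-dimensional query into a one-dimensional range query can be sketched generically. The example below is not CIndex itself, whose details are given in the cited paper; it is a common distance-to-reference-point mapping, assumed here only for illustration. Each point is keyed by its distance to a reference point, the triangle inequality bounds the keys of all possible answers, and the candidates found by a one-dimensional range scan are then verified. The class name, the reference point, and the sample data are hypothetical.

```python
import bisect
import math
from typing import List, Sequence


def dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two points of equal dimensionality."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class OneDimIndex:
    """Keys each point by its distance to a reference point (illustrative only)."""

    def __init__(self, points: List[Sequence[float]], reference: Sequence[float]):
        self.points = points
        self.reference = reference
        # Sorted (key, point_id) pairs play the role of a one-dimensional index.
        self.entries = sorted((dist(p, reference), i) for i, p in enumerate(points))
        self.keys = [key for key, _ in self.entries]

    def range_query(self, query: Sequence[float], radius: float) -> List[int]:
        q_key = dist(query, self.reference)
        # Triangle inequality: any answer has a key in [q_key - r, q_key + r].
        lo = bisect.bisect_left(self.keys, q_key - radius)
        hi = bisect.bisect_right(self.keys, q_key + radius)
        # Verify candidates with the true high-dimensional distance.
        return sorted(i for _, i in self.entries[lo:hi]
                      if dist(self.points[i], query) <= radius)


# Made-up four-dimensional feature vectors.
points = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 3, 0, 0), (5, 5, 5, 5)]
index = OneDimIndex(points, reference=(0, 0, 0, 0))
query = (1, 1, 0, 0)
print(index.range_query(query, 1.5))
```

Because the keys are sorted, the range scan inspects only the candidates whose keys fall inside the interval instead of every stored point; in a real system the sorted list would typically be replaced by a disk-based structure such as a B+-tree.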

 

Here is a demo: [Demo 4]

Fig. 2. An example of cross-media retrieval: (a) by image; (b) by audio; (c) by video