Academic journal article Informing Science: the International Journal of an Emerging Transdiscipline

Multimedia Content Analysis and Indexing for Filtering and Retrieval Applications

Article excerpt

Introduction

Data compression, coupled with the availability of high-bandwidth networks and storage capacity, has led to an overwhelming production of multimedia content. In addition, the introduction of digital video will completely change the landscape of the entire video value chain. For content producers, advertisers, and consumers, there will be increased availability of data and increased challenges in managing it. Users in the consumer and corporate domains will be given an overwhelming and confusing number of traditional and Internet media viewing choices and will look for ways to manage these choices [2,14,15,24,26,34,49,92]. Image and video archives in broadcast studios, corporate archives of multimedia collaborative sessions, video conferencing sessions, and educational content require tools that provide a quick overview and transparent access. From a content production point of view, a broadcast studio archive will produce video at a rate of 19.2 Mb/s, which translates into 207 GB of storage per day, assuming that only new content is broadcast. At that rate, the studio archive will require 75 TB per year, which means that in 14 years digital broadcast studios will have to cope with data in the petabyte range. The sheer size of the stored video data will make it seriously difficult for content owners to find and reuse archived material. Content management for studio archives is just one of the many applications that incorporate tools for content analysis and retrieval of multimedia data.
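The storage figures above can be checked with a short calculation. The following is a minimal sketch, assuming decimal units (1 GB = 10^9 bytes, 1 PB = 10^15 bytes) and continuous 24-hour broadcast of new content; the function and variable names are illustrative and do not come from the article.

```python
# Back-of-the-envelope check of the storage figures quoted in the text.
# Assumptions (not stated in the article): decimal units and 24-hour broadcast.

BITRATE_MBPS = 19.2              # broadcast rate in megabits per second (from the text)
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds
DAYS_PER_YEAR = 365


def bytes_per_day(bitrate_mbps: float) -> float:
    """Bytes produced by one day of continuous broadcast at the given bit rate."""
    bits = bitrate_mbps * 1e6 * SECONDS_PER_DAY
    return bits / 8


def years_to_reach(target_bytes: float, bitrate_mbps: float) -> float:
    """Years of continuous archiving needed to accumulate target_bytes."""
    return target_bytes / (bytes_per_day(bitrate_mbps) * DAYS_PER_YEAR)


daily_gb = bytes_per_day(BITRATE_MBPS) / 1e9
yearly_tb = daily_gb * DAYS_PER_YEAR / 1e3
years_to_pb = years_to_reach(1e15, BITRATE_MBPS)

print(f"{daily_gb:.1f} GB/day, {yearly_tb:.1f} TB/year, {years_to_pb:.1f} years to 1 PB")
```

Under these assumptions the daily and yearly volumes agree with the rounded 207 GB and 75 TB quoted above, and the archive crosses one petabyte during its fourteenth year.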

Content management tools will support applications that facilitate effective access to, interaction with, and browsing and display of complex and inhomogeneous information consisting of images, video and audio. Such tools are important in a range of professional and consumer applications such as education, digital libraries, entertainment, content authoring tools, geographical information systems, bio-medical systems, investigation services, surveillance and many others [64].

In this paper, we survey the techniques for content-based analysis, retrieval and filtering of digital images, audio and video. We focus on basic methods for extracting features that enable indexing and search applications. The deployment of a variety of these methods will give both professionals and consumers powerful tools for coping with multimedia data. Although the goal of these methods is content understanding, which stems from computer vision systems, the methods surveyed here are more similar to database methods for indexing. This is because a gap exists between these two aspects of the retrieval problem: databases do not provide content analysis and segmentation, and vision systems do not provide database query capabilities. Data acquisition in traditional databases relies primarily on the user to type in the data. Similarly, in the past, image and video databases provided keyword descriptions of the visual data. However, annotation-based description is being augmented or replaced by automatic methods for feature extraction, indexing and content understanding [11,17].

This paper is organized as follows. Section 2 provides a survey of the methods for image analysis and retrieval. Section 3 describes techniques for video analysis, retrieval and filtering. Section 4 presents methods for audio-based analysis and retrieval. In Section 5 we provide a high-level survey of systems and standardization efforts in content description and retrieval.

Methods for Image Analysis and Retrieval

Image analysis is concerned with the extraction of features for content representation. Often, low-level features such as color, texture and shape are extracted for this purpose. On the other hand, the intent of most image retrieval systems is to give the user tools to search for images using higher-level semantic descriptions. For example, the user may want to find all the "city" pictures or the "beach" pictures from the last vacation [26,29,54]. …
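As an illustration of what a low-level color feature can look like, the sketch below builds a coarse, normalized RGB histogram and compares two images by histogram intersection. This is a minimal sketch and not a method taken from this excerpt: the choice of histogram intersection as the similarity measure, the four bins per channel, and the toy pixel lists standing in for "beach" and "city" images are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel assumed to be in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Normalized joint RGB histogram with bins_per_channel ** 3 bins."""
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..n-1.
        ri, gi, bi = r * n // 256, g * n // 256, b * n // 256
        hist[(ri * n + gi) * n + bi] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1] for normalized histograms; 1.0 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical toy "images": flat lists of pixels, not real photographs.
beach = [(30, 144, 255)] * 6 + [(237, 201, 175)] * 4    # sky blue and sand
city = [(128, 128, 128)] * 7 + [(30, 144, 255)] * 3     # gray and some sky

h_beach = color_histogram(beach)
h_city = color_histogram(city)
similarity = histogram_intersection(h_beach, h_city)
self_similarity = histogram_intersection(h_beach, h_beach)
print(f"beach vs city: {similarity:.2f}, beach vs beach: {self_similarity:.2f}")
```

A feature like this captures only color distribution; it says nothing about whether an image actually shows a beach or a city, which is the gap between low-level features and the semantic queries users want to pose.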
