Multimedia has long played an important role in the process of informing activities by changing the way we learn, think, work and live (Zeng & Yu, 1999). Not surprisingly, the use of multimedia information is growing at an exponential rate, posing a great challenge to the way information is controlled and organized. This trend is expected to continue in the coming years, thanks to increases in processor speed and advances in network technologies. Today, learning, studying, researching, and communicating are all examples of activities that users can carry out by means of multimedia streams.
The reason for this success is twofold: multimedia information is far richer, more educational, and more entertaining than traditional text-based information, and users now have access to broadband and/or wireless technologies (DSL, fiber optics, Wi-Fi, EDGE/GPRS, CDMA, to name a few) that let them reach multimedia content whenever and wherever they want.
Many multimedia applications are available (on-demand multimedia services, videoconferencing, distance learning, on-line games and pay-per-view, just to name a few), but the emerging multimedia applications are those that enable natural interaction among end-users: for instance, a user can interact with the content provider throughout the application lifetime in order to get a customized stream of multimedia information. There is great interest in these interactive applications, which are becoming more and more attractive and popular over the Internet. On-line games, interactive web TV, and on-demand multimedia services are some examples of such interactive applications.
Unfortunately, despite their popularity over the Internet, these applications achieve a QoS that is far from what is desired. The reason is that interactive multimedia applications impose rigid timing constraints on the traffic they produce, and these constraints are difficult to meet in the Internet scenario, because the Internet provides only a best-effort service to the traffic it carries. In other words, the Internet makes its best effort to move the traffic from sender to receiver as quickly as possible. However, the best-effort service makes no promises about the end-to-end delay of an individual packet or about the variation of packet delay (network jitter) within a packet stream. As a result, the end-to-end delay is unknown a priori and highly variable throughout the application lifetime. This poses serious problems for the support of interactive multimedia applications, which are subject to a very critical timing constraint: the overall end-to-end delay experienced by the application traffic should not be noticeable to the end-users. The critical role played by this end-to-end delay is described in several studies (for instance, Kurita, Iai & Kitawaki, 1995), which highlight how strongly human perception is affected by it. Briefly, these studies point out that the overall end-to-end delay is not perceived by humans as long as it stays within a threshold, but becomes noticeable to end-users once it exceeds that threshold. This threshold, called NIT (Natural Interaction Threshold) in this paper, represents the limit below which interactions are well supported. Hence, an interactive multimedia application is well supported if its traffic has an overall end-to-end delay smaller than the NIT throughout the application lifetime.
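As a rough illustration of the condition just stated, the following Python sketch checks whether a trace of per-packet end-to-end delays stays below a given NIT throughout the application lifetime. The function names, the sample delays, and the NIT value used in the example are hypothetical. Summarizing jitter as the spread between the largest and the smallest observed delay is likewise an assumption made here for simplicity, not a definition taken from the cited studies.

```python
from typing import Sequence


def is_well_supported(delays_ms: Sequence[float], nit_ms: float) -> bool:
    """Return True if every observed end-to-end delay is smaller than the NIT."""
    return all(d < nit_ms for d in delays_ms)


def delay_spread_ms(delays_ms: Sequence[float]) -> float:
    """Crude jitter summary (assumption): largest minus smallest observed delay."""
    if not delays_ms:
        return 0.0
    return max(delays_ms) - min(delays_ms)


# Hypothetical trace of per-packet end-to-end delays, in milliseconds.
trace = [80.0, 95.0, 120.0, 210.0, 90.0]

# Hypothetical NIT of 150 ms: one packet (210 ms) violates it.
print(is_well_supported(trace, nit_ms=150.0))  # False
print(delay_spread_ms(trace))                  # 130.0
```

Under best-effort delivery a single late packet is enough to break the condition, which is why the check looks at every delay in the trace rather than at the average.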
The value of this NIT threshold depends on the characteristics of the application and on the level of interactivity requested by the end-users (i.e., the more interactive operations are involved, the lower the threshold value should be) (Kurita, Iai & Kitawaki, 1995). For instance, if we consider an interactive application where audio is involved, the delay from when a user speaks until the sound is manifested at the receiving hosts should be less than a few hundred milliseconds. In particular, delays smaller than 150 milliseconds are not perceived by a human listener, delays between 150 and 400 milliseconds can be acceptable, and delays exceeding 400 milliseconds result in frustrating, if not completely unintelligible, voice conversations (Sitaram & Dan, 2000). …
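The audio delay bands quoted above can be written down as a small classifier. This is only a sketch: the function name and the band labels are hypothetical, and the treatment of the exact boundary values (150 ms and 400 ms are placed in the "acceptable" band) is an assumption based on the wording "smaller than 150" and "exceeding 400".

```python
def classify_audio_delay(delay_ms: float) -> str:
    """Map a one-way audio delay to the perception bands reported by Sitaram & Dan (2000)."""
    if delay_ms < 0:
        raise ValueError("delay must be non-negative")
    if delay_ms < 150:
        return "not perceived"   # below 150 ms: not perceived by a human listener
    if delay_ms <= 400:
        return "acceptable"      # 150 to 400 ms: can be acceptable
    return "frustrating"         # above 400 ms: frustrating or unintelligible


for sample in (90, 150, 320, 450):
    print(sample, classify_audio_delay(sample))
```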