An Overview of

Interactive Video On Demand System

Miranda Ko and Irene Koo

The University of British Columbia

Date: Dec 13, 1996


Video On Demand (VOD) has become increasingly popular. Giant television providers in the United States have committed to providing VOD services in the near future. Interactive Video On Demand (IVOD) is an extension of VOD in which additional functions such as Fast Forward, Fast Rewind, and Pause are implemented. These functions pose new requirements and challenges for the system implementation. An IVOD system has three components: the client's "set-top box", the network, and servers with archives. The set-top box is the client's interface to the IVOD system. It has a network interface, a decoder, buffers, and synchronization hardware. Clients input their commands using remote controls. The network of an IVOD system must be a high-speed network. Currently available technologies that are suitable for transferring IVOD data include SONET, ATM, ADSL, and HFC. Servers with archives are where user commands are processed and where movies are stored. Issues such as admission control, servicing policies, and the structure of the storage subsystem must be considered when designing an IVOD system. In addition to the technical issues described above, non-technical issues such as standards, property rights, and cost must also be considered.

Table of Contents

Abstract
List of Figures
List of Tables
1. Introduction
2. System Architecture
2.1 Clients
2.2 The Network
2.3 The Server
3. Interactive Functions
4. Quality of Service
5. Client
5.1 Network Interface
5.2 Decoder
5.3 Buffer
5.4 Synchronization Hardware
6. Network
6.1 Network Requirements of IVOD Systems
6.2 Existing Network Technologies
6.2.1 Backbone Network
6.2.2 Signaling Schemes
7. Server
7.1 Admission Control
7.2 Servicing Policies
7.3 Storage Subsystem
7.3.1 Hierarchical Structure
7.3.2 Movie Storage Data Distribution
7.3.3 Data Block Placement
7.3.4 Disk Scheduling
7.4 User Traffic Characterization
8. Ways of Handling Interactive Functions
9. Existing Trials
10. Conclusion
11. References

List of Figures

Figure 1. A Centralized Interactive Video On Demand System
Figure 2. A Centralized IVOD System with Local Buffers
Figure 3. A Distributed Interactive Video On Demand System
Figure 4. Communications Between Clients and Servers
Figure 5. A User's Set-Top Box
Figure 6. An IVOD Storage Hierarchy
Figure 7. The Overall Interactive Video On Demand System Architecture
Figure 8. A Network Model for IVOD Systems
Figure 9. An ADSL Local Distribution Network
Figure 10. A Hybrid Fiber Coax Local Distribution Network
Figure 11. A Typical Structure of an IVOD Server
Figure 12. Data Flow Graph of an IVOD Server
Figure 13. RAID

List of Tables

Table 1. Interactive TV Trials in the United States
Table 2. Interactive TV Trials Outside the US

1. Introduction

With the explosion of the Internet, people have endless hype, opinions, forecasts, and beliefs about it [16]. Interactive television, they feel, is the realization of those beliefs: people will soon be able to purchase products, view movies, play video games, browse the Internet, and participate in video conferences without leaving their houses [4]. Of all the new things that people can do with television, video on demand is strongly supported by Hollywood, since it can open new markets and bring unpredictable profits.

People have been passive recipients of whatever TV service providers offer since television was introduced. Interactive Video On Demand (IVOD), unlike traditional television delivery, gives users flexibility in choosing the kinds of information they receive [5]. An IVOD system can serve a large number of end users who concurrently access a large number of repositories of stored data, often movies [14]. IVOD is basically an extension of Video On Demand (VOD). In addition to the freedom of choosing movies, users can interact with movies and decide the viewing schedule [20]. In other words, an IVOD system supports VCR-like functions such as fast forward, rewind, and pause. The enormous communication bandwidth and disk bandwidth required, together with the Quality of Service (QoS) demanded, necessitate a careful design of the system in order to maximize the number of concurrent users while minimizing the cost.

An IVOD system comprises three major components: the "set-top box" at the client's site, the distribution network, and the server. There are many design issues to consider in building each of these components. In this paper, we first discuss several alternatives for the system structure -- the placement of video servers. We then provide a general description of each of the system components. Sections 3 and 4 present some interactive functions and the QoS expected from an IVOD system. The next three sections discuss each of the system components in detail. The hardware requirements for the "set-top box" at clients' sites are outlined in Section 5. Network requirements and existing network technologies with potential for implementing an IVOD system are studied in Section 6. Section 7 presents various issues to consider in building an IVOD server, including admission control to guarantee QoS, servicing policies employed to serve viewers, and user traffic characterization to improve the system's performance. Design issues in building the storage subsystem of an IVOD server are also discussed. They include the structure of the storage subsystem, the distribution of movies in the subsystem, the placement schemes for data blocks on the storage devices, and the disk scheduling algorithms used to retrieve data blocks for viewers. Several proposed ways of supporting interactive functions are outlined in Section 8. Lastly, Section 9 lists some of the existing VOD trials.

2. System Architecture

As with other networked systems, an IVOD system can be designed as a centralized or a distributed multimedia system. A centralized IVOD system places processing servers and media archives at a single site, the central node. Requests from clients are processed at the central node, and the requested videos are delivered through the network to the client sites. Figure 1 illustrates a centralized system architecture. Centralized IVOD systems are simple to manage, but they usually suffer from poor scalability, long network delay, and low throughput. The performance of centralized IVOD systems can be improved if local servers are added (Figure 2). These local servers have video buffers, but no media archives. Popular movies can be stored in local video buffers so that they can be delivered to clients more quickly. Videos that are not buffered at local sites are delivered to clients from the central archive when they are requested. A distributed IVOD system has local processing servers and local media archives. Clients' requests are handled by local servers (Figure 3). If the requested movie is not in the local archive, the local server can request it from remote servers located across the network.

Figure 1. A Centralized Interactive Video On Demand System

Figure 2. A Centralized IVOD System with Local Buffers

Figure 3. A Distributed Interactive Video On Demand System

A distributed IVOD system can be viewed as many small regional IVOD systems connected together. It spreads users' requests over many sites, thereby moving the processing servers and media archives closer to the clients. Local servers reduce the network delay and congestion experienced with central servers, but distributed systems are more difficult to manage. The choice of the system structure depends on the available storage, communication systems, costs, application demands, and other factors. However, the desired QoS of IVOD systems (described in Section 4) makes the distributed structure preferable.
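The lookup order of a centralized system with local buffers can be illustrated with a minimal Python sketch. The class names `CentralArchive` and `LocalServer`, the `locate` method, and the movie titles are hypothetical and are introduced only for illustration; they do not describe any particular system cited in this paper.

```python
class CentralArchive:
    """Hypothetical central node that stores every title."""

    def __init__(self, titles):
        self.titles = set(titles)

    def has(self, title):
        return title in self.titles


class LocalServer:
    """Hypothetical local server: it has a video buffer but no archive."""

    def __init__(self, archive, buffered=()):
        self.archive = archive
        self.buffered = set(buffered)

    def locate(self, title):
        """Return the place a requested title would be delivered from."""
        if title in self.buffered:
            return "local buffer"       # popular title, short path to the client
        if self.archive.has(title):
            return "central archive"    # must be fetched across the network
        return "unavailable"


central = CentralArchive(["Movie A", "Movie B", "Movie C"])
local = LocalServer(central, buffered=["Movie A"])
for title in ("Movie A", "Movie B", "Movie D"):
    print(title, "->", local.locate(title))
```

In a distributed system the second lookup would go to a remote server's archive rather than to a single central node, but the order of the checks would be the same.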

Each IVOD connection requires bi-directional communication between the client and the local server. Each server has a number of video selections available for users. The server processes the client's requests and tries to respond to them as soon as possible. An IVOD system should be able to handle hundreds or even thousands of clients with different preferences simultaneously [2]. The quality of each service should remain within specified bounds throughout the entire session. An IVOD service usually starts with a client requesting information from a server; the server responds to the client via the network.

The system architecture of an Interactive Video On Demand system basically consists of three major parts: a client, a network, and a server. Each part can be subdivided further into components and interfaces. Figure 4 depicts the communications between clients and servers.

Figure 4. Communications Between Clients and Servers

2.1 Clients

A client subscribing to an Interactive Video On Demand service has a display device (usually a TV) and some audio devices (e.g. speakers) to present the requested movie. The client interacts with the system via an input device such as a remote control, a mouse, or a keyboard. A controller is needed at the client site to take the client's commands and send the corresponding signals to the server through its network interface. The controller also stores the video signals it receives from the server in its buffers, decodes the compressed signals, and sends the decoded signals to the display at the appropriate time. The controller is assembled in a box known as the "set-top box." Figure 5 depicts the components at the client site.

Figure 5. A User's Set-Top Box

2.2 The Network

An IVOD service requires real-time display of the video purchased by the client. A typical video stream consists of frames of pictures, the sounds corresponding to those frames, and captioned text. The large quantity of information that must be transmitted to the client continuously and with minimal delay places high performance requirements on the network. An IVOD network should be a high-speed network with a reasonably low error rate, because retransmission is unacceptable. Since video information is delay sensitive, the delay variation (jitter) should be kept to a minimum.

2.3 The Server

A server of an IVOD system processes commands from users. It accepts or rejects clients' requests based on the current state of the system and the network load (refer to Section 7.1). It also schedules the retrieval of data for all active clients. A multimedia archive is connected to the server. The archive contains the collection of videos available to the users. Depending on the system requirements and the budget available, a range of storage devices can be used. Cache (RAM) is the most expensive but has the lowest access time. Disk arrays provide fault tolerance at a reasonable price and access time (10 msec). Optical discs have a capacity of 650 MB with an access time of 100 msec. The Digital Versatile Disc (DVD) is the state of the art; each disc can store 4.7 gigabytes of information. The content of movies stored on DVDs can be easily configured to suit viewers' preferences with the help of authoring tools. Tape drives are in the lower price range, but have longer access times. A typical IVOD storage system uses a combination of storage devices to optimize the tradeoff between cost and efficiency. Figure 6 depicts a general IVOD storage hierarchy.

Figure 6. An IVOD Storage Hierarchy

Combining all the components above, an IVOD system is constructed. The overall system architecture of an IVOD system is shown in Figure 7.

Figure 7. The Overall Interactive Video On Demand System Architecture

3. Interactive Functions

As mentioned before, IVOD is an extension of VOD with additional interactive functionalities added. Possible interactive functions include [4]:
  1. Play/Resume : Start a presentation from the beginning or resume after a Stop.
  2. Stop : Stop the presentation, without picture and sound.
  3. Pause : Hold the presentation with picture.
  4. Jump Forward : Jump to a target time of the presentation in the forward direction without picture and sound.
  5. Jump Backward : Jump to a target time of the presentation in the backward direction without picture and sound.
  6. Fast Forward (FF) : Browse the presentation in the forward direction with picture and sound.
  7. Slow Down : Present forward with a lower playback rate with picture and sound.
  8. Reverse : Play the presentation in the reversed direction with picture and sound.
  9. Fast Reverse (REW) : Browse the presentation in the backward direction with picture and sound.
  10. Slow Reverse : Present backward with a lower rate with picture and sound.

Other interactive features include the ability to avoid or select advertisements, to investigate additional details about news events (through hypermedia, for example), and to browse, select, and purchase goods [5]. This paper discusses seven main interactive functions: Fast Forward, Fast Reverse, Jump, Play, Stop, Pause, and Slow Down.

4. Quality of Service

The term Quality of Service (QoS) is used in many different contexts. For instance, QoS in networks may include guarantees on throughput, network delay, delay jitter, error rate, etc. In this paper, QoS refers to the standards of IVOD service that users expect. They include:

Fast setup time: Initial service delay (connection setup time) should be within a few minutes.

Independence of the Quality of Service among customers: The quality of service provided to current customers should not be degraded when new customers join the service.

Continuity of media streams: There should be little or no jitter in the presentation.

Prompt response to interactive functions: The loading of extra data from the server due to interactive functions, e.g. fast forward, should be invisible to the customer in the ideal case. It should appear as if the client is operating his/her own VCR.

Transparency of the involvement of multiple media streams: Multiple media streams must be synchronized. For instance, if a video object is to be combined with an audio object at the client's site, this should be done in such a way that lip-synchronization is achieved.

All of the above QoS requirements call for cooperation among all three components of the IVOD system: the server, the network, and the "set-top box".

5. Client

To support interactive services, considerable functionality must be built into the set-top box. Four important components of the set-top box are the network interface, the decoder, the buffer, and the synchronization hardware.

5.1 Network Interface

The network interface allows the client to receive data from the server. It also provides a mechanism for translating user commands received by the sensor into appropriate signals for transmission on the network.

5.2 Decoder

In order to save storage space, disk bandwidth, and network bandwidth, movies are usually encoded before they are stored. Thus, a decoder is needed at the client's site to decode the arriving media streams before they are presented to the viewer.

5.3 Buffer

Due to delay jitter in the network, the arrival time of a video stream cannot be determined exactly. In order to guarantee starvation-free (continuous) playback, the server must ensure that a media unit is available at the client prior to its earliest predicted playback time. Taking into account the maximum network delay (max), the server can transmit a media unit at least max before the unit's earliest predicted playback time. However, if the actual network delay experienced by the media unit is less than max, the media unit will arrive at the client earlier than its scheduled playback time and will have to be buffered [19]. For details on computing the maximum buffering needed at the client's site, refer to [19].
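As a rough illustration of why early arrivals translate into buffer space, the following Python sketch estimates the data a client might hold in the worst case. It is not the computation given in [19]. It assumes a constant stream rate and a known minimum network delay, and the function name `client_buffer_bits` and all numeric values are hypothetical.

```python
def client_buffer_bits(rate_bps, max_delay_s, min_delay_s):
    """Worst-case amount of data waiting at the client, in bits.

    Assumption: the server sends every media unit max_delay_s before its
    playback time, so a unit that experiences only min_delay_s in the
    network waits (max_delay_s - min_delay_s) seconds in the buffer.
    """
    if min_delay_s > max_delay_s:
        raise ValueError("min_delay_s must not exceed max_delay_s")
    return rate_bps * (max_delay_s - min_delay_s)


# Assumed values: a 1.5 Mbps stream, delays between 50 ms and 200 ms.
bits = client_buffer_bits(rate_bps=1.5e6, max_delay_s=0.200, min_delay_s=0.050)
print("buffer needed: about", round(bits / 8 / 1024), "KiB")
```

Under these assumptions the buffer grows with the spread between the largest and smallest delay, not with the delay itself, which is why bounding jitter matters for the cost of the set-top box.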

If the buffer size is large enough and the Jump size is relatively small, Jump Forward or Jump Backward may not require data delivery from the server. In other words, the data required are already in the client's buffer. The response to the jumps will be faster as compared to those requiring data from the server. Buffering also makes the response to Fast Forward and Fast Rewind faster if the initial data required is in the buffer.

As mentioned above, decoding is required at the client's site. Buffering data that may be required for future display allows that data to be decoded while the current data is being displayed. Because decoding no longer has to be done in real time, less powerful decoding hardware or a simpler decoding algorithm can be used. Thus, the cost of the "set-top box" can be reduced.

5.4 Synchronization Hardware

A movie consists of both video and audio streams. They must be synchronized before being presented. Synchronization is also required at the client site to support scalable video [12]. In scalable video, a video stream is decomposed into a base stream and one or more additional streams. The additional streams are combined with the base stream to produce higher quality video. Each of those streams is stored as an individual media file on the server. Depending on the quality demanded and the bandwidth available to the client, the presentation may require zero or more of the additional streams [12]. Therefore, the different media streams must be synchronized before they are presented to the customer.

Lastly, the user interface should be simple and easy to use. Preferably, the same remote control can be used for both the IVOD system and the video cassette recorder. In addition, the cost of the set-top box must fall within reasonable limits (under a few hundred dollars) for the technology to succeed. Open and interoperable systems that let users subscribe to several different services are preferred [5].

6. Network

Unlike traditional computer communications, which are bursty and short-lived, video delivery involves sending enormous amounts of data to customers' homes continuously for a long period of time (the length of the movie presentation). Videos are usually encoded using the MPEG-1 or MPEG-2 standard. Since the MPEG encoding standards exploit the dependency between frames, the resulting encoded video streams usually have variable bit rates. These characteristics of video impose new criteria and challenges on the network.

6.1 Network Requirements of IVOD Systems

High Speed Network. High-speed networks are definitely needed for IVOD systems. For example, videos compressed using the MPEG standards require a bandwidth between 1.5 and 6 Mbps per stream. At the upper end of that range, a system supporting 100 users requires a bandwidth close to 600 Mbps. Ordinary 10 Mbps Ethernet or 28.8 kbps telephone lines cannot support such video transfers.

Connection-Oriented Transfer. IVOD services are real-time multimedia application services. In other words, the time at which packets arrive at the destination is strictly bounded. Any packet that arrives later than its expected time is useless and is discarded. Moreover, retransmissions are not possible. Consequently, connection-oriented services, which reduce the rate of dropped and late packets, are desired.

Latency and Jitter. The delay between a video stream being sent from the server and it being received by the client needs to be minimized. Variations in delay (jitter) must be kept within rigorous bounds to preserve the quality of the presentation [3].

In order to provide IVOD services, the network needs to deliver guaranteed service to delay-sensitive, variable-bit-rate video traffic [4]. The delay-sensitive nature of video services requires the network to support a resource reservation scheme for each video stream.
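The bandwidth estimate given under High Speed Network can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes one dedicated stream per user and ignores protocol overhead, and the function names `required_mbps` and `link_can_carry` are hypothetical.

```python
def required_mbps(users, per_stream_mbps):
    """Aggregate bandwidth if every user receives a dedicated stream."""
    return users * per_stream_mbps


def link_can_carry(link_mbps, users, per_stream_mbps):
    """True if the link rate covers the aggregate demand (overhead ignored)."""
    return required_mbps(users, per_stream_mbps) <= link_mbps


# The two ends of the MPEG range quoted in the text, for 100 users.
for rate in (1.5, 6.0):
    print(rate, "Mbps per stream ->", required_mbps(100, rate), "Mbps in total")

print("10 Mbps Ethernet, 100 users:", link_can_carry(10.0, 100, 1.5))
print("28.8 kbps line, 1 user:", link_can_carry(0.0288, 1, 1.5))
```

Even at the low end of the range, the aggregate demand of 100 users exceeds a 10 Mbps Ethernet by an order of magnitude, and a 28.8 kbps line cannot carry a single stream.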

6.2 Existing Network Technologies

As mentioned above, a typical video stream requires an average bandwidth of 1.5 Mbps.

A typical IVOD network can be divided into two levels: the backbone and the local distribution networks. The backbone connects servers and routers/access nodes together, while a local distribution network links a client site to its nearby router/access node. Each local distribution network link needs a bandwidth of at least 1.5 Mbps - the bandwidth of a single IVOD video stream. The bandwidth of the backbone is usually on the order of hundreds of megabits per second, depending on the number of simultaneous connections the IVOD network has to support. Figure 8 illustrates a network model for IVOD systems.

Figure 8. A Network Model for IVOD Systems

6.2.1 Backbone Network

IVOD services require high-speed and low-jitter networks to support hundreds or even thousands of simultaneous connections. The required bandwidth of the backbone is on the order of hundreds of megabits per second. Two attractive solutions are SONET and ATM.

SONET. SONET is a synchronous fiber optic network. The entire bandwidth of a fiber optic link is devoted to a single channel. Nodes connected to the channel transmit data in different time slots. A basic SONET channel (STS-1) has a bandwidth of 51.84 Mbps. SONET can also multiplex multiple digital channels together [7] to support more viewers. For example, three STS-1 channels are multiplexed to form an STS-3 channel with a bandwidth of 155.52 Mbps. SONET is suitable for delivering IVOD streams because bandwidth is guaranteed and jitter is essentially zero.

ATM. ATM stands for Asynchronous Transfer Mode. It is asynchronous because it allows data to arrive at irregular intervals [7]. ATM is suitable for transferring IVOD data because it is a connection-oriented packet-switching network. ATM transmits data at rates from 1.544 Mbps to 622 Mbps over copper and fiber media.

Note that a backbone with ATM running over SONET can also be adopted.
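The SONET multiplexing figures quoted above follow from simple multiplication, as the Python sketch below shows. The stream count is only an upper bound: it assumes 1.5 Mbps per stream and ignores SONET framing overhead, and the names `sts_rate_mbps` and `streams_supported` are hypothetical.

```python
STS1_MBPS = 51.84  # rate of a basic STS-1 channel, as given in the text


def sts_rate_mbps(n):
    """Line rate of an STS-n channel built from n multiplexed STS-1 channels."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n * STS1_MBPS


def streams_supported(n, per_stream_mbps=1.5):
    """Upper bound on concurrent video streams; framing overhead is ignored."""
    return int(sts_rate_mbps(n) // per_stream_mbps)


for n in (1, 3, 12):
    print("STS-%d:" % n, round(sts_rate_mbps(n), 2), "Mbps,",
          "at most", streams_supported(n), "streams")
```

A backbone sized for thousands of viewers would therefore need a channel well above STS-3, or several such channels.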

6.2.2 Signaling Schemes

A local distribution link requires a bandwidth of 1.5 Mbps. Many signaling schemes can deliver video data at such a data rate over existing communication links [5]. Two such schemes are described below.

Asymmetrical Digital Subscriber Loop (ADSL). ADSL is a signaling scheme used on copper-wire networks (e.g. telephone networks). It can deliver data at high speed with little signal distortion over existing copper wires. An ADSL link consists of a pair of ADSL units: one is installed at the client site, and the other at the central phone office. ADSL uses advanced integrated circuit designs, complex digital signal processing techniques, and software-based algorithms to compensate for distortions in copper wires [9]. ADSL can provide a subscriber with a 1.536 Mbps wide-band down-link, a 16 Kbps up-link, and either a basic-rate ISDN channel or the analog 4 kHz telephone channel on existing twisted copper wire [5]. These characteristics satisfy the bi-directional communication and bandwidth requirements posed by IVOD services. If the client site is within 5.5 kilometers of the access node, no additional equipment is necessary to ensure the strength of the received signal. ADSL is achievable because it divides the signal over a range of carrier frequencies, adjusting dynamically to achieve the most efficient channel allocation [6]. Extensions of ADSL include HDSL, SDSL, S-HDSL, and VDSL. HDSL has a data rate of 6 Mbps, and it can support MPEG-2 video streams up to about 2 km [9]. An ADSL local distribution network is shown in Figure 9.

Figure 9. An ADSL Local Distribution Network [7]

Hybrid Fiber Coax (HFC). HFC is currently being installed by cable TV companies. It migrates the all-coaxial cable design to a network with both fiber and coaxial cable [8]. Variations on HFC exist and have been implemented. They include fiber-to-the-feeder, fiber-to-the-hub, fiber-to-the-zone, fiber-to-the-curb, and fiber-to-the-tap. Fiber-to-the-feeder and fiber-to-the-hub are networks with fiber trunks, coaxial distribution links, and coaxial subscriber drops. The others are networks consisting of fiber trunks and fiber distribution links, but coaxial subscriber drops [10]. The migration to HFC involves replacing the current 300-450 MHz coaxial cables with new 750 MHz coaxial cables [7]. The 750 MHz band is subdivided into 125 6-MHz subchannels. Bi-directional communication is implemented using split-band systems. In North America, the frequency band between 5 and 54 MHz is used as the up-link. The remaining bandwidth is used for a guard band and the down-link. Since clients share the same physical medium, collisions occur when they try to access the medium at the same time. This sharing of the transmission medium limits the number of clients that can be attached to a coax tree structure [10]. The network architecture of HFC is depicted in Figure 10.

Figure 10. A Hybrid Fiber Coax Local Distribution Network [8]

IVOD services are feasible on HFC because the frequency band of the coaxial cable is split so that an up-link is available for sending clients' requests to servers. Currently, Rogers Cables Canada is upgrading its existing cable system to HFC, hoping it will soon be able to deliver Internet access services, if not Video On Demand services.

7. Server

The server is the heart of the IVOD system. It provides high quality services to users while using strategies that minimize cost. The server has a storage subsystem where movies are stored. Many researchers have explored ways to optimize the server capacity and have discussed issues concerning building a multimedia server in general. This section discusses admission control, servicing policies, the storage subsystem, and traffic characterization, all of which are issues related to optimizing server performance. Figure 11 and Figure 12 depict the general structure and details of an IVOD server:

Figure 11. A Typical Structure of an IVOD server [7]

Figure 12. Data Flow Graph of an IVOD Server [13]

7.1 Admission Control

Any incoming client who wants to use the IVOD service has to request and set up a connection with one of the servers. If the required movie, or part of it, is not stored on that server, data must be transferred between servers. Hence, admission control at the remote server is also required.

In order to guarantee that new clients have continuous playback of the video and to ensure that the QoS contracted to existing connections is not jeopardized, the server must have enough resources, such as storage subsystem read bandwidth, memory buffers, processing bandwidth, and network bandwidth, before admitting a new connection request [11]. Before a connection is established, the server checks a set of QoS parameters sent by the client. If the requested QoS cannot be achieved, the client and the server can negotiate, or the connection is denied. An alternative is for the server not to commit the resources requested by the client until the connection has been up for some threshold of time, usually a few minutes [5]. This reduces the probability of committing resources to connections that last for only a few minutes or even a few seconds.

An admission control algorithm must have some knowledge of the capability of the storage subsystem, e.g. the minimum number of blocks that the server can read in a time slot. A time slot is the interval between servicings of a video stream divided by the number of active streams in the server [12]. Neufeld, Makaroff, and Hutchinson [12] suggested that the minimum number of blocks that a server can read in a time slot, called minRead, is the only storage subsystem information required by the admission control algorithm. According to them, minRead can be determined by running a calibration program that reads data blocks spaced uniformly across the disk so as to maximize seek times (assuming a SCAN algorithm is used). The estimate of minRead should be as accurate as possible, because an overly conservative estimate degrades server performance. A simple admission control algorithm can be described as follows: sum the block schedules of all active streams together with that of the requested stream. If the sum is greater than minRead in any time slot, the connection is denied. This simple scheme can be improved by allowing the server to read ahead when it is idle [12].
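The simple admission test described above can be sketched in a few lines of Python. This is an illustration of the rule as stated in this section, not the implementation from [12]. It assumes that each block schedule is a list giving the number of blocks a stream needs in each time slot, with index 0 as the current slot; the function name `admit` and the sample numbers are hypothetical, and the read-ahead improvement is not modeled.

```python
def admit(active_schedules, new_schedule, min_read):
    """Return True if the requested stream can be admitted.

    The per-slot block demands of all active streams and of the requested
    stream are summed; the request is denied if the total exceeds min_read
    in any time slot.
    """
    horizon = max([len(new_schedule)] + [len(s) for s in active_schedules])
    for slot in range(horizon):
        demand = sum(s[slot] for s in active_schedules if slot < len(s))
        if slot < len(new_schedule):
            demand += new_schedule[slot]
        if demand > min_read:
            return False  # the disk could not keep up in this slot
    return True


active = [[2, 3, 2], [1, 1, 4]]          # two streams already admitted
print(admit(active, [1, 2, 1], min_read=7))
print(admit(active, [1, 2, 2], min_read=7))
```

With these sample numbers the first request fits (the busiest slot needs exactly 7 blocks) and the second is denied because the third slot would need 8.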

The server must also ensure that enough network resources are available before accepting a new connection. Unfortunately, variable bit rate video streams make it difficult to determine the amount of resources needed. If the network reserves resources according to the average video stream rate, delay or packet loss may occur when the servers are sending at their peak rates. If the network reserves resources according to the peak video stream rate, the network may be under-utilized most of the time [4]. Traffic policing is performed on admitted connections to ensure that they do not use more network resources than they were granted.
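The tradeoff between reserving at the average rate and reserving at the peak rate can be made concrete with a small Python sketch. The bit-rate trace below is made up for illustration, and the function name `reservation_tradeoff` and its result keys are hypothetical; a real variable bit rate stream would be characterized from measured data.

```python
def reservation_tradeoff(rates_mbps):
    """Summarize a per-interval bit-rate trace of one video stream."""
    if not rates_mbps:
        raise ValueError("the trace must not be empty")
    average = sum(rates_mbps) / len(rates_mbps)
    peak = max(rates_mbps)
    over = sum(1 for r in rates_mbps if r > average)
    return {
        "average_mbps": average,
        "peak_mbps": peak,
        # Share of a peak-rate reservation that is actually used.
        "utilization_if_peak_reserved": average / peak,
        # Share of intervals that exceed an average-rate reservation.
        "fraction_over_average": over / len(rates_mbps),
    }


trace = [1.0, 1.5, 4.0, 1.5, 2.0]   # assumed Mbps per interval
for key, value in reservation_tradeoff(trace).items():
    print(key, "=", value)
```

For this trace, a peak-rate reservation would sit half idle on average, while an average-rate reservation would be exceeded in one interval out of five.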

In addition to checking the storage system and network bandwidth, the server also ensures the availability of memory buffers and processing bandwidth. The server, based on the required QoS, determines the amount of buffer space and the processing bandwidth needed by the requested stream.

7.2 Servicing Policies

Servicing policies determine the design and the implementation of various IVOD components. The server capacity, which is the number of simultaneous viewers that can be supported by the server, depends on the strategies used in allocating video streams to viewers. Obviously, from the service providers' point of view, server capacity should be as large as possible. The simplest way of allocating streams to viewers is to dedicate a single stream to each viewer. However, this scheme is expensive and inefficient since the server capacity is bounded by the maximum number of streams that can be handled by the server. If the network in the IVOD system supports multicast, e.g. ATM, sharing of video streams among several viewers is possible. Sharing of video streams can significantly increase the number of simultaneous viewers and can save network bandwidth [13]. Three service policies have been proposed in [13]:

On-Demand Single Cast (ODSC). ODSC is the simplest of the three schemes. In this service policy, each client has a dedicated video stream that is assigned to the viewer when the connection is granted. Since the client has complete control over his/her own video stream, the implementation of the interactive functions mentioned in Section 3 becomes easy. The drawback of this scheme is that the server capacity is limited. The wait for service can be unexpectedly long if client requests arrive when the server is fully utilized [13].

Phase Multicast (PMC) or Batching. To use this policy, the data network must support multicasting. Each video stream is shared by the viewers of a multicast group. Video streams are started at fixed intervals, or phases. Connections that are admitted between stream starts are grouped and served by the next video stream [13]. The starting intervals can be fixed by the service provider, or they can be determined dynamically by the server. The longer the interval between consecutive streams, the lower the service cost and the greater the number of viewers that each stream can support. However, a long setup delay may dissatisfy clients. Thus, a tradeoff exists between cost and the quality of service provided.

A serious drawback of this scheme is the difficulty of implementing interactive functions. Since each video stream is now shared among several viewers, its delivery cannot deviate from its schedule. A discussion of how to support interactive functions under PMC is beyond the scope of this paper.

On Demand Multicast (ODMC). ODMC is a hybrid of PMC and ODSC. During light load, the server uses ODSC to serve viewers. It switches to PMC during heavy load. Thus, ODMC addresses problems of PMC and ODSC, which include the inability of ODSC to cope with overload situations and the underutilization of the server during light load when PMC is used [13].
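The batching rule of PMC and the load-based switch of ODMC can be sketched in Python as follows. This is an illustration under stated assumptions, not the scheme evaluated in [13]: streams are assumed to start at multiples of a fixed interval, a request arriving exactly at a start time is assumed to join that stream, and the 0.8 load threshold, the function names `batch_requests` and `choose_policy`, and the arrival times are all hypothetical.

```python
import math


def batch_requests(arrival_times_s, interval_s):
    """Group requests by the start time of the next stream (PMC)."""
    batches = {}
    for t in arrival_times_s:
        start = math.ceil(t / interval_s) * interval_s
        batches.setdefault(start, []).append(t)
    return batches


def choose_policy(active_streams, capacity, threshold=0.8):
    """ODMC-style switch: ODSC under light load, PMC under heavy load."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return "PMC" if active_streams / capacity >= threshold else "ODSC"


arrivals = [10, 250, 290, 310, 900]        # request times in seconds
batches = batch_requests(arrivals, interval_s=300)
longest_wait = max(start - t for start, ts in batches.items() for t in ts)
print(len(batches), "streams serve", len(arrivals), "requests")
print("longest wait:", longest_wait, "s")
print("policy at 20% load:", choose_policy(active_streams=20, capacity=100))
```

Lengthening `interval_s` packs more viewers into each stream but raises the longest wait, which is the cost versus setup-time tradeoff noted for PMC.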

Simulations were run on the above three service policies under the condition that no interactions are allowed. Thus, the system simulated was really a VOD system rather than an IVOD system. It was found that the ODMC scheme suits a lightly utilized server because it provides a fast setup time. At very high levels of utilization, the PMC service policy should be used. The ODSC service model should be used only when extensive random access, which is what IVOD needs, must be supported. For details on the simulation and the performance results, refer to [13].

7.3 Storage Subsystem

The storage subsystem is where compressed movies are stored. The compressed video and the corresponding compressed audio streams for a movie can be stored on the same server or on different ones. The storage subsystem is also the place where most of the optimizations that improve the performance of the IVOD system are made. It is an important factor in determining the server's cost. So far, the only standard that has been agreed on for IVOD systems is the use of MPEG-2 for video encoding [7].

7.3.1 Hierarchical Structure

Even with MPEG-2 compression, a movie occupies approximately 4 GB [7]. If fixed magnetic disks are used as the sole storage medium in an IVOD system, the cost of archiving thousands of movies is on the order of millions of dollars. Therefore, large tertiary storage devices such as tape, optical jukeboxes, or jukeboxes based on the new Digital Versatile Disc (DVD) technology should also be used [6].

Tertiary devices are highly cost effective because they provide large storage capacities at low cost. However, their random access time is slow, and their throughput is low. Consequently, servers should be organized as a hierarchy that combines the cost-effectiveness of tertiary storage and high performance of fixed magnetic disks [6]. Refer to Figure 6 for a server storage hierarchy.

A hierarchical storage model can be described as follows: Popular movies are stored in RAM, less popular ones on the hard disks and the least popular ones on tertiary storage. This hierarchical approach reduces operating costs while offering a wide selection of movies to its users [5].
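As an illustration of the hierarchical model, the sketch below ranks movies by request count and assigns each one to a storage level. The slot counts, titles, and request numbers are hypothetical; a real server would size each level by capacity and bandwidth rather than by a fixed number of titles.

```python
def assign_tiers(request_counts, ram_slots, disk_slots):
    """Map each movie title to 'ram', 'disk', or 'tertiary' by descending popularity."""
    ranked = sorted(request_counts, key=lambda title: request_counts[title], reverse=True)
    tiers = {}
    for rank, title in enumerate(ranked):
        if rank < ram_slots:
            tiers[title] = "ram"
        elif rank < ram_slots + disk_slots:
            tiers[title] = "disk"
        else:
            tiers[title] = "tertiary"
    return tiers


# Hypothetical daily request counts.
counts = {"A": 50, "B": 5, "C": 20, "D": 1}
print(assign_tiers(counts, ram_slots=1, disk_slots=2))
```

Rerunning such an assignment with fresh counts is one way to picture a periodic rearrangement of movies across the levels.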

7.3.2 Movie Storage Data Distribution

Several approaches are possible for managing the storage hierarchy. One approach is to store the beginning segments of the multimedia files on magnetic disks. This approach reduces start-up latency and provides smooth transitions in the playback [15]. It also improves system performance if viewers make another selection within the first minutes of the movie [15, 5].

An alternative is to calculate movies' popularity on a daily basis. Based on their popularity, movies are rearranged or replicated as necessary during off-peak hours. Consequently, movies are available for viewing during peak hours based on anticipated demand [5].

7.3.3 Data Block Placement

The way in which the data blocks of media files are placed on disk can significantly affect the data retrieval scheme and the system performance, since the time required to retrieve data depends on the location of the data blocks. The data retrieval scheme can in turn affect how interactive functions, such as Fast Forward (FF) and Fast Backward (FB) searches, are supported. One possible approach to implementing FF and FB is to read more data blocks for that particular client in a round. However, the extra disk bandwidth requires more storage devices to serve the same number of clients, thereby increasing the cost of the storage subsystem. Cheng, Wen, Lee, Wang, and Oyang [17] suggested that if the data blocks are placed by a scheme that effectively utilizes disk bandwidth during normal playback, one can design a data retrieval scheme that requires no extra bandwidth to support interactive functions. However, an extra buffer of several file blocks is needed when a stream is in the FF or FB search mode.

Two general strategies used in data block placement schemes are load-matching and load-balancing [14].

Load-matching. More frequently accessed data blocks are placed in the "best" (outer) tracks, while those accessed only occasionally are placed in inner tracks. This strategy exploits the track-dependent transfer rate, whose ratio from inner to outer tracks can be 1:1.8 [18].

Load-balancing. Data is placed so that a constant streaming capacity is provided, independent of viewers' choices and of the location of the data.

Both of the above strategies try to maximize the server's capacity, which is the number of simultaneous viewers supported. Details on the implementation of load-matching and load-balancing, and descriptions of interdisk and intradisk load-balancing, can be found in [14] and [18].

Two main schemes of data block placement are the disk farm and the disk array (RAID). In a disk farm, each disk holds several entire movies, whereas in RAID each movie is scattered over multiple drives, e.g. block 0 on drive 0, block 1 on drive 1, and so on. This organization is called striping. RAID is preferred to the disk farm because it offers better performance [7].
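The block-to-drive mapping used in striping can be written down directly. The sketch below assumes the simple round-robin layout given in the example above (block 0 on drive 0, block 1 on drive 1, and so on); the function name is hypothetical.

```python
def locate_block(block_index, num_drives):
    """Return (drive, stripe_row) for a logical block under round-robin striping."""
    return block_index % num_drives, block_index // num_drives


# The first eight blocks of a movie striped over four drives.
for block in range(8):
    drive, row = locate_block(block, 4)
    print(f"block {block} -> drive {drive}, row {row}")
```

Consecutive blocks land on different drives, so a sequential read of one movie spreads its load over the whole array.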

Disk Farm

Load-matching is a possible approach under this scheme: popular movies are stored in outer tracks while less popular ones are stored in inner tracks. However, the popularity of movies may change several times in a day, and the capacity of the server decreases when such changes happen. One can dynamically rearrange the movies on the disks, but this can be expensive [7, 15]. Under this scheme, the number of concurrent accesses to a single multimedia file is limited by the throughput of the disk that holds it. However, this scheme is easy to implement.

RAID (Redundant Array of Inexpensive Disks)

Unlike a scheme that stores each movie on a single disk, RAID can achieve both load-balancing and load-matching by "striping" all movies onto all disks [7, 15]. Under this scheme, data is "striped" across an array of disks. The coarser the striping, the larger the buffer required. Figure 13 describes the structure of a RAID.

Figure 13. RAID [15]

Striping allows parallel access to data in the same physical sectors of all disks in the array. Hence, logical sectors and physical sectors have identical access times. The effective throughput can be increased by a factor equal to the number of disks in the array. The increased transfer rate makes RAID a good structure for the storage subsystem, since high bandwidth is required to support IVOD [15]. RAID can have several parity disks to provide fault tolerance. These extra disks contain the block-by-block EXCLUSIVE OR of the other disks, which allows the data of faulty disks to be recovered. The number of drive failures that can be survived is determined by the number of parity disks added. This scheme, however, is ill-suited to interactive functions such as FF and FB, since many of the blocks read in parallel will be discarded [5].
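The recovery step can be illustrated for the simplest case of a single parity disk and a single failed drive. This sketch covers only that case; the block contents are made-up bytes, and arrays with several parity disks need more than the plain XOR shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One block from each of three data drives (hypothetical contents).
data = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]
parity = xor_blocks(data)

# Suppose drive 1 fails: XOR the surviving data blocks with the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```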

7.3.4 Disk Scheduling

Real-time constraints make traditional disk scheduling algorithms, such as first come first served, shortest seek time first, and SCAN, inappropriate for IVOD. Here are two suggested scheduling algorithms [15]:


The best known algorithm for real-time scheduling of tasks with deadlines is the earliest deadline first (EDF) algorithm. The media block with the earliest deadline is fetched first. The disadvantages of this algorithm are excessive seeks and poor utilization of the server's resources [15].

A variant of EDF is a combination of SCAN and EDF called the SCAN-EDF scheduling algorithm [19]. SCAN-EDF, like EDF, services the blocks with the earliest deadline first. However, when several blocks have the same deadline, those blocks are served using the SCAN algorithm (the disk head moves back and forth across the disk and fetches requested blocks as it passes them). Clearly, the effectiveness of SCAN-EDF depends on the number of requests that have the same deadline [15].
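The ordering rule can be sketched as a two-level sort. This is a simplification: requests are assumed to be (deadline, track) pairs, and ties are always swept in ascending track order, whereas a real SCAN pass would also take the current head position and direction into account.

```python
def scan_edf_order(requests):
    """Order (deadline, track) requests: earliest deadline first, ties by ascending track."""
    return sorted(requests, key=lambda r: (r[0], r[1]))


# Hypothetical pending requests as (deadline, track) pairs.
pending = [(20, 70), (10, 90), (10, 15), (20, 5), (10, 40)]
print(scan_edf_order(pending))
```

With only one request per deadline the second sort key never applies and the order is plain EDF, which is why the benefit depends on how many requests share a deadline.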


Under round-based algorithms, a server serves all streams in units of rounds. During each round, the server retrieves a certain number of blocks for each stream. Since MPEG-2 produces variable-bit-rate compressed streams, the number of blocks that must be retrieved for each client in each round varies according to the compression ratio achieved for each block [15].

To ensure continuous playback of media streams, the server must retrieve a sufficient number of blocks for each client in each round to prevent starvation for the round's duration. Thus, the server has to know the maximum duration of a round, since the round length depends on the number of blocks retrieved for each stream. A simple scheme that retrieves the same number of blocks for each stream (generally referred to as a round robin algorithm [19]) is inefficient, since the maximum playback rate among all streams dictates the number of blocks to read. As a result, streams with smaller playback rates retrieve more data blocks than they need in each round. This may overflow some clients' buffers as well as decrease the capacity of the server. Consequently, more clients can be accommodated by reducing the number of data blocks retrieved per service round for streams with lower playback rates [19].

One proposed approach to minimizing the round length is to make the number of blocks retrieved for each stream in each round proportional to its playback rate [19]. This scheme is called the Quality Proportional Multi-subscriber Servicing (QPMS) algorithm, and it is shown to be optimal in [19]. For more information on QPMS, please refer to [19].
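The difference between equal and proportional retrieval can be seen in a small calculation. The sketch below is not the algorithm of [19] itself: it assumes a fixed round duration and block size, and all names and figures are hypothetical.

```python
import math


def proportional_blocks(rates_bps, round_seconds, block_bits):
    """Blocks per stream per round, sized to each stream's own playback rate."""
    return [math.ceil(rate * round_seconds / block_bits) for rate in rates_bps]


def round_robin_blocks(rates_bps, round_seconds, block_bits):
    """Equal blocks per stream, dictated by the fastest stream."""
    most = math.ceil(max(rates_bps) * round_seconds / block_bits)
    return [most] * len(rates_bps)


rates = [1_500_000, 4_000_000, 6_000_000]  # bits per second, hypothetical
print(proportional_blocks(rates, 1, 500_000), sum(proportional_blocks(rates, 1, 500_000)))
print(round_robin_blocks(rates, 1, 500_000), sum(round_robin_blocks(rates, 1, 500_000)))
```

The proportional variant reads fewer blocks in total per round, which is the slack that lets the server admit more clients.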

7.4 User Traffic Characterization

Even though customers access the IVOD system randomly, a priori knowledge of the users' access pattern can lead to a more efficient design of the storage scheme and a more efficient utilization of the storage and network bandwidth [5]. For instance, if the traffic characteristics indicate that a movie is popular at a particular site, the system can replicate the movie to increase its availability. Similarly, knowing the typical user access pattern is beneficial in designing schemes for resource management tasks, such as popularity table updates, data redistribution, and system reconfiguration. Preferably, these tasks should be done during off-peak hours [5].

8. Ways of Handling Interactive Functions

Several ways of handling interactive functions have been proposed. Most of the research has been done on implementing FF and FB. Several proposed methods are outlined below:

Frame Skipping

Chen, Kandlur, and Yu [11] proposed that FF and FB can be supported using MPEG frame skipping. Their approach comprises a storage method, a segment placement scheme, and a playout method, with a segment sampling scheme or a segment placement scheme as alternatives for selecting segments.

Usage of D-frames provided by MPEG-1 [7, 19]

MPEG-1 D-frames contain only the block averages, which can be used for browsing. Thus, only D-frames are presented to the client after decoding when the client is in FF or FB mode. This method is attractive since no additional processing is required, as the D-frames are always generated by MPEG-1. However, the resulting output has poor resolution. Unfortunately, D-frames are not available in MPEG-2.

Block categorizing [19]

Unlike the previous two schemes, in which the client is responsible for supporting the interactive functions, block categorizing supports FF and FB at the server. This scheme works as follows: each block is marked as being relevant or irrelevant to FF/FB. Both types of blocks are retrieved during normal playback, while only those marked relevant are retrieved and transmitted in FF/FB modes. This scheme can be achieved by scalable compression, which combines one or more additional streams with the base stream to produce higher quality [12, 19]. However, scalable compression imposes additional overhead for splitting the blocks and recombining them before presentation. This scheme requires extra bandwidth if differential compression is used, since most frames are then relevant to FF/FB.
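The server-side selection can be sketched as a filter over marked blocks. The (block id, relevance flag) representation and the mode names below are assumptions made for illustration, not the data layout of [19].

```python
def blocks_to_send(blocks, mode):
    """Select block ids to retrieve for a stream.

    blocks: list of (block_id, relevant_to_search) pairs in playback order.
    mode:   'play', 'ff', or 'fb' (hypothetical mode names).
    """
    if mode == "play":
        return [block_id for block_id, _ in blocks]
    if mode == "ff":
        return [block_id for block_id, relevant in blocks if relevant]
    if mode == "fb":
        return [block_id for block_id, relevant in reversed(blocks) if relevant]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical stream in which every third block is marked relevant.
stream = [(i, i % 3 == 0) for i in range(9)]
print(blocks_to_send(stream, "play"))
print(blocks_to_send(stream, "ff"))
print(blocks_to_send(stream, "fb"))
```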

9. Existing Trials

Tables 1 and 2 list some of the VOD trials in the United States and in other parts of the world [10]. VOD services started to appear in the market in 1995. However, the response was poor. Things started to change in the middle of 1996. Major operators such as Americast, Tele-Communications Inc., and Rogers Cablesystems Ltd. have committed to buying millions of set-top boxes [10]. A typical set-top box costs about three hundred US dollars. Tables 1 and 2 show that many companies have just started to offer VOD services. Prices range from US$7.50 to US$20.00 per month plus movie charges. The network technologies used in the trials are mostly HFC and ADSL because they require fewer modifications to existing cable and telephone networks. However, when HDTV at 20 Mbps is in high demand, ATM and satellite networks may even be used for local distribution. Analysts estimated that more than US$1 billion was spent on VOD infrastructure worldwide in 1994 and nearly US$3 billion in 1995 [21].

Table 1. Interactive TV Trials in the United States

| Company | Location | Settop | Server | Services | Technology | Start/End | # Homes: Now/End | Cost/Month |
|---|---|---|---|---|---|---|---|---|
| Bell Atlantic (FutureVision) | Dover, Toms River NJ | Philips, FutureVision, TeleTV | nCUBE | FTTC NVOD, PPV, transaction | Switch DigVid | 2/96-Rollout | 0/38,000 (passed) | Varies |
| Bell Atlantic (Stargazer) | Fairfax Vir | Stellar One | nCUBE | VOD, Inter. | ADSL | 3/95-12/96 | 1,000/1,000 | $7.50/mo |
| Bell South (Inter. Serv) | Atlanta GA | Sci/Atlanta | H/P | VOD, NVOD, transaction | Fiber/Coax | 2/96-mid/97 | 0/12,000 (passed) | N/A |
| Cox Cable (Canceled) | Omaha NB | Zenith | N/A | VOD, NVOD, transaction | Hybrid F/Coax | 6/94-12/95 | 0/2,000 | $20 + movie |
| Pacific Bell | HFC in SD, SJose, Org Cty, California | Sci/Atlanta | N/A | NVOD, VOD, cable | Hybrid F/Coax | 96-1997 | 200/1 million passed (1997-?) | N/A |
| SNET (Canceled) | Hartford CN | Sci/Atlanta | H/P | NVOD, PPV | Hybrid F/Coax | 4/94-Mid 96 | 350/150,000 passed | $20/mo + movies |
| Sprint (VDT Trial) | Wake Forest, NC | N/A | N/A | VOD, Sega, Info | N/A | 10/95-Fall 97 | 650/1,000 | N/A |

Table 2. Interactive TV Trials Outside the US

Primary Source: Electronic Engineering Times, Nov 27, 1995

| Company | Location | Technology | Settop | Start/End | # Subscribers |
|---|---|---|---|---|---|
| British Telecom | UK | ADSL | Apple | 1995 | 2,500 |
| Cambridge Cable | UK | Fiber/Coax | Online Media | 1994 | 250 |
| Bell/Nynex/TeleWest | UK | N/A | N/A | 1996 | 1,000 |
| Deutsche Telekom | Germany | ADSL/ATM/HFC, Satellite | Alcatel, Nokia, IBM, HP | 1996 | 1,500,000 |
| French Telecom | France | ADSL, Fiber to home | Phillips, SEMA | 1996 | 1-2,000 |
| Canal+ | France | Satellite | Sony, Pioneer, Sagem, Phillips, Thompson | 1996 | N/A |
| Swiss Telecom | Switzerland | Coax | Phillips | 1995 | 800 |
| Svenska Kabel TV | Sweden | Fiber/Coax | Digital, Vela Research | 1995 | 500 |
| Telecom Italia | Italy | ADSL | B.Atlantic, OS-9/David | 1995 | 1,000 |
| Telecom Australia | Australia | ADSL | CLI, OS-9/David | 1996 | 2,500 |
| Hong Kong Telecom | Hong Kong | Fiber to building, ATM | NEC, OS-9/David | 1996 | 65,000 |
| Israel Telecom | Two city trial | ADSL | Celerity Server, David Settop | 1996 | N/A |
| Korean Telecom | Korea | ADSL | Celerity Server, Samsung/David Settop | 1995 | 100 |
| Nakano City TV | Japan | HFC/ATM | Fujitsu, OS-9/David | 1995 | 300 |
| NTT | Japan | ADSL/ATM | N/A | 1995 | 400 |
| Singapore Telecom | Singapore | ADSL, ATM | N/A | 1995 | 300 |
| JSAT | Japan | DBS Satellite | N/A | 1996 | N/A |

10. Conclusion

In this paper, we give an introduction to the overall system architecture of an IVOD system. Issues to be considered in designing and building each of the components are discussed in turn. Lastly, some existing trials are presented.

Developing this new information delivery infrastructure requires considerable planning and effort [5]. In addition to meeting technology requirements, such as network and disk bandwidth, making IVOD service a reality also involves other factors. They include the cost of building the system, agreement on international standards, and security for service providers. The economics of providing IVOD service cannot be ignored. A large video server can easily cost more than a mainframe, certainly 10 million dollars. Consider a regional IVOD service with 100,000 subscribers, each of whom rents a 300-dollar "set-top box". If the networking equipment costs 10 million dollars and has a 4-year depreciation period, the service has to generate 10 dollars per home per month. Charging 2 dollars per movie and 3 dollars rental for the "set-top box" requires every subscriber to buy 2 movies per month to break even [7]. Certainly, all the cost figures mentioned above vary with time, but it is clear that a mass market is needed for the technology to be viable [7]. Unfortunately, few of the existing trials have more than 2,000 subscribers. For a mass market to develop, agreement on international standards must be reached. Standards for the "set-top box" and the video-file server, and standards that allow competition among service providers, are important for such systems to grow [5]. Currently, MPEG-2 is the only standard agreed on for video encoding [7]. Mechanisms to protect intellectual property rights must also be established so that service providers can maintain control of their data and thus stay in business. One mechanism to prevent movie duplication is to limit the storage buffer space in the "set-top box" [5].
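The revenue figure quoted above can be approximated with a short calculation. The sketch assumes that the provider pays for the server, the network equipment, and every set-top box, that all three are written off over the same period, and that operating costs and interest are ignored; these assumptions are ours and are not spelled out in [7].

```python
def required_monthly_revenue_per_home(server_cost, network_cost, set_top_cost,
                                      subscribers, years):
    """Capital cost spread evenly over every home and every month of the period."""
    total_capital = server_cost + network_cost + set_top_cost * subscribers
    return total_capital / (subscribers * years * 12)


revenue = required_monthly_revenue_per_home(
    server_cost=10_000_000,
    network_cost=10_000_000,
    set_top_cost=300,
    subscribers=100_000,
    years=4,
)
print(round(revenue, 2))  # dollars per home per month
```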

In fact, the technologies required to make IVOD a reality already exist. If an IVOD system can be built at a reasonable cost, international standards can be developed, and mechanisms for protecting the interests of service providers can be established, large-scale IVOD services can be expected to appear in the near future.

11. References

  1. Son Vuong. "What is a Multimedia System". Course Notes on Communication Protocols, Computer Science Department of University of British Columbia, Vancouver, 1996.
  2. Jörg Liebeherr. "Multimedia networks: Issues and challenges". Computer Magazine, vol. 28, issue 4, April 1995.
  3. Andrew Campbell, Geoff Coulson, Francisco García, David Hutchison, and Helmut Leopold. "Integrated Quality of Service for Multimedia Communications". Infocom '93, 1993.
  4. W. Knightly, D. E. Wrege, J. Liebeherr, H. Zhang. "Fundamental Limits and Tradeoffs of Providing Deterministic Guarantees to VBR Video Traffic". SIGMETRICS' 95, 1995.
  5. T. Little, D. Venkatesh. "Prospects for Interactive Video-On-Demand". IEEE MultiMedia, vol. 1, issue 3, 1994.
  6. K. Cleary. "Video On Demand - Competing Technologies and Services". Broadcasting Convention, 1995 (IEE Conference Publication 413), 1995.
  7. A. S. Tanenbaum. "Computer Networks", 3rd ed. Prentice Hall, New Jersey, 1996, pp. 744-757.
  8. S. D. Dukes. "Next Generation Cable Network Architecture". NCTA Technical Papers, 1993.
  9. "Enhancing the Performance and Application of Copper Cable with HDSL". A Technology Brief of PairGain Technologies, Inc.
  10. "Interactive TV Trials", Sam's Telecom Casbah.
  11. Ming-Syan Chen, Dilip D. Kandlur and Phillip S. Yu, "Support for Fully Interactive Playout in a Disk-Array-Based Video," Proc. ACM Multimedia 94, ACM Press, San Francisco, Oct. 1994.
  12. Gerald Neufeld, Dwight Makaroff and Norman Hutchinson, "Design of a Variable Bit Rate Continuous Media File Server for an ATM Network," Department of Computer Science, The University of British Columbia.
  13. Wong, L. Zhang and K.K. Pang, "Video On Demand Service Policies", IEEE Singapore International Conference on Networks, 1995.
  14. Yitzhak Birk, "Deterministic Load-Balancing Schemes for Disk-Based Video-On-Demand Storage Servers," 14th IEEE Symposium on Mass Storage Systems, 1995.
  15. James Gemmell, Harrick M. Vin, Dilip D. Kandlur, P. Venkat Rangan, Lawrence A. Rowe, "Multimedia Storage Servers: A Tutorial," Computer Magazine, vol. 28, issue 5, May 1995.
  16. "Interactive Television and the Interactive Home,"
  17. Chih-Yuan Cheng, Chun-Hung Wen, Meng-Huang Lee, Fu-Ching Wang and Yen-Jen Oyang, "Effective Utilization of Disk Bandwidth for Supporting Interactive Video-on-Demand," IEEE Transactions on Consumer Electronics, vol. 42, issue 1, Feb. 1996.
  18. Yitzhak Birk, "Disk-Based Video-On-Demand Storage Servers: Requirements, Challenges and (Some) Solutions," Electrical and Electronics Engineers in Israel 1995 Convention, 1995.
  19. Venkat Rangan, Harrick M. Vin, and Srinivas Ramanathan, "Designing an On-Demand Multimedia Service," IEEE Communications Magazine, vol. 30, no. 7, July 1992, pp. 56-65.
  20. V. Li, W. Liao, X. Qiu, and E. Wong, "Performance Model of Interactive Video-on-Demand Systems," IEEE Journal on Selected Areas in Communications, vol. 14, issue 6, August 1996.
  21. "Sequent Selected to Support Interactive TV Trial in the United Kingdom,"