The power of the human visual system to process wide ranges of intensities far exceeds the abilities of current imaging systems. Both cameras and displays are currently limited to a dynamic range (contrast) of between 300:1 and 1,000:1, while the human visual system can process a simultaneous dynamic range of 50,000:1 or more and can adapt to a much larger range. In recent years, there has been a strong push to close this gap by developing high-dynamic-range (HDR) display and camera hardware, as well as the supporting processing algorithms.
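To put these contrast ratios on a common scale, they can be expressed in photographic stops (powers of two). The short sketch below is only an illustration of that arithmetic; the function name `contrast_to_stops` is our own and not part of any library.

```python
import math

def contrast_to_stops(ratio: float) -> float:
    """Convert a contrast ratio (e.g. 1000 for 1,000:1) to stops, i.e. log2 of the ratio."""
    if ratio <= 0:
        raise ValueError("contrast ratio must be positive")
    return math.log2(ratio)

# Ratios quoted in the text above.
for label, ratio in [
    ("camera/display, low end", 300),
    ("camera/display, high end", 1000),
    ("human vision, simultaneous", 50000),
]:
    print(f"{label}: {ratio}:1 is about {contrast_to_stops(ratio):.1f} stops")
```

On this scale the gap between a 1,000:1 device and the 50,000:1 simultaneous range of the eye is a little under six stops.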
A new generation of high dynamic range (HDR) display devices promises to provide life-like picture quality, thanks to an improved luminance/color dynamic range over the existing conventional display technology. HDR displays and cameras will replace traditional display technology in niche markets such as film post-production and medical imaging. This technology has the potential to displace conventional displays also in mass markets such as computer monitors and television.
HDR displays will produce high-fidelity scenes (images) with a luminance range close to the perceptible exposure range of the human visual system. At the same time, digital imaging technology continues to evolve with future digital cameras aiming at capturing high-fidelity content, with significantly higher dynamic ranges than are available with the currently available low dynamic range (LDR) cameras. The evolution of these technologies will require the development of efficient and low-cost HDR video capturing, as well as processing and delivery mechanisms that should allow interoperability with existing LDR displays. These mechanisms should take advantage of the larger amount of information in the HDR content, to produce superior LDR picture quality. Moreover, having HDR content backward compatible with LDR displays should be efficient and scalable, while keeping the overall bandwidth requirements within industry-acceptable limits.
Current Technology. Present HDR capturing and processing solutions are still at an early stage of development and have not kept up with the advances in HDR display technology. The two major problems are the high cost of capturing HDR video and the inefficiency in generating scalable HDR video content that is backward compatible with existing (LDR) technologies. None of the existing HDR video capturing solutions enables high-quality HDR content generation at low cost. Moreover, research on generating better quality LDR content using HDR input is mostly limited to still images. Backward compatibility and the use of scalable video coding techniques for representing the HDR content bitstream are evolving, but current solutions suffer from low coding efficiency and the generation of considerable bandwidth overhead, compared to encoding a single-layer HDR video stream.
Our Vision of HDR Video Technology. Our objective is to conduct high-level research that leads to novel and practical HDR capturing, compression, and delivery solutions. Appropriate algorithms, as well as analysis and design tools will be developed for these systems. The novel aspects of our research include:
Efficient LDR to HDR up-conversion
Optimized Tone Mapping for Lossy Video Compression
Scalable HDR Content Generation
Cost-Effective HDR Video Capturing
Our objectives:
Provide backward compatibility with LDR displays while offering higher than traditional LDR picture quality,
Allow perfect reproduction of the HDR component,
Efficiently code the LDR-HDR-enabled content to allow delivery within the bandwidth allowed by the existing multimedia distribution infrastructure, and
Facilitate early market adoption of HDR technology by offering an inexpensive HDR capturing scheme.
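The first two objectives can be illustrated with a minimal sketch of a two-layer representation: a tone-mapped LDR base layer that legacy displays can show, plus a residual that restores the HDR signal. The global logarithmic curve used here is a generic stand-in chosen for brevity, not the optimized tone mapping this project develops, and all function names are hypothetical.

```python
import math

def tone_map_log(luminance, max_code=255):
    """Map non-negative HDR luminance values to integer LDR codes with a global log curve."""
    l_max = max(luminance)
    if l_max <= 0:
        return [0 for _ in luminance]
    scale = max_code / math.log1p(l_max)
    return [round(scale * math.log1p(l)) for l in luminance]

def inverse_tone_map_log(codes, l_max, max_code=255):
    """Approximate HDR luminance from LDR codes by inverting the log curve."""
    scale = max_code / math.log1p(l_max)
    return [math.expm1(c / scale) for c in codes]

hdr = [0.0, 1.0, 100.0, 10000.0]            # hypothetical luminance samples
ldr = tone_map_log(hdr)                      # base layer for LDR displays
predicted = inverse_tone_map_log(ldr, max(hdr))
residual = [h - p for h, p in zip(hdr, predicted)]       # enhancement layer
restored = [p + r for p, r in zip(predicted, residual)]  # HDR decoder output
print(ldr, restored)
```

An LDR receiver would decode only `ldr`, while an HDR receiver would add the residual to the inverse-tone-mapped prediction. How cheaply that residual can be coded depends on the tone-mapping curve, which is why the curve is worth optimizing jointly with the codec.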
Television systems are migrating to digital technology, and more Digital Television (DTV) systems are being deployed around the world every day. This change is creating a technological revolution in the entertainment industry. Besides delivering superior picture and sound quality, the new DTV technology allows the addition of new services that enhance the TV viewer's experience.
Current Technology: Over the years, the concept of 'interactivity' for TV has been evolving. For example, Video-On-Demand systems, TV systems with VCR-like functionality and Advanced TV were referred to as Interactive TV even though they do not allow the user to interact with the TV program they are watching. In Video-On-Demand systems, viewers select a movie or TV show from a library, which is then simply played back on their TV. In TV systems with VCR-like functionality (e.g., TiVo and Replay), TV viewers can save, pause or rewind a live TV show. The Enhanced TV technology allows users to interact with web content but not with the program they are watching. Along the same line, several competing interactive TV technologies/platforms, e.g., MHEG-5, HTML, OpenTV and MediaHighway, have been recently introduced and used by European broadcasters such as BBC, Channel 4, ITV, RTE, and Sky. To standardize across platforms and be cost effective, the Digital Video Broadcasting (DVB) consortium introduced the Multimedia Home Platform (MHP), which extends the existing successful DVB broadcasting standard to include interactive services. MHP uses Java to support interactive TV applications, providing more capabilities and flexibility compared to the above platforms, but at the expense of increased complexity and cost. In these systems, viewers access additional information about the video using video overlays; however, they cannot alter the contents of the program they are watching, e.g., by accessing a different viewing angle or a differently rated version of a movie. It is thus evident that current interactive systems for digital TV remain limited.
Our Vision of Interactive Television: Our aim is to conduct high-level research that leads to a truly novel, yet practical interactive television system. The objective is to define and design a system that will allow TV viewers to control and individualize parts of the TV program contents they are viewing. A unique feature of our proposed scheme is to add alternate video substreams and audio streams that allow viewers to interact with parts of the content. For example, for some scenes the viewer would have the choice of watching the “main stream” or the alternate “added stream”. Such interactivity has been successfully implemented in the DVD standard. However, DVD is a non-broadcast interactive medium. Therefore, our key challenge is to design a system that offers DVD-like interactivity, but applies to real-time broadcast digital TV systems. We impose the following restrictions on our system. The “added” streams should: 1) be accommodated within the originally allocated transmission bandwidth for that program, 2) not result in any degradation in picture or sound quality of the main TV program content, 3) be compatible with present-day TV systems, and 4) not significantly increase the production complexity and cost.
To design this system, appropriate algorithms, analysis and implementation tools involving innovative ideas in multimedia, video processing and statistical signal processing will be developed. Novel aspects include:
Developing practical encoding, authoring and multiplexing schemes for the head-end (broadcasting side). These include scalable video coding methods for a single TV transmission channel that may carry one or multiple video and audio streams, real-time DVD-like authoring and formatting tools for interactive content, and new multiplexing schemes. We will also develop software design tools for preparing, formatting and testing the main as well as the interactive added streams.
Developing experimentally validated stochastic dynamical traffic models for interactive TV content. New admission, scheduling and traffic characterization techniques that exploit channel conditions and optimize network performance will be developed.
Developing demultiplexing, buffering and playback schemes for the receiver-end. Adaptive buffering techniques and caching will be studied to optimize the quality and level of interactivity offered to the user. Our goal is to modify the existing DVD architecture to handle the proposed interactive services.
Our novel architecture will use existing standards at the transmitter and receiver ends. This is of great importance to the entertainment industry, since it minimizes their implementation costs. Preparing interactive content is a complex and expensive process. Unlike present platforms, our system will not require content authors and post-production houses to learn and use new ways of preparing interactive TV content, since our system offers (the familiar) DVD-based interactivity. For the end-user, it eliminates the need for buying new equipment and for user training (i.e., it uses a DVD player). It also does not require viewers to have an Internet connection, since it does not require bi-directional data channels as is the case with the present interactive TV platforms. This project is supported by NSERC and major international companies such as Sonic Solutions, Deluxe Studios, Technicolor, and Microsoft. Our research will have significant impact on future multimedia and entertainment industries. It will facilitate the transformation of present broadcast services to truly interactive systems.
Converting DVB-MHP Interactive TV content to Blu-ray Format
Our goal is to enable the video and interactive content transmitted by a DVB-MHP broadcasting system to be played in real-time on the Blu-ray platform. The Multimedia Home Platform (MHP) is the latest interactive TV standard introduced by the Digital Video Broadcasting (DVB) group, whose DVB standard is the most widely used digital TV broadcasting standard in the world. Blu-ray, on the other hand, is a new-generation high-definition disc format. Compared to the traditional DVD technology, Blu-ray is designed to offer advanced interactive features and high-definition video quality. Although both DVB-MHP and Blu-ray support Java-based interactivity, there are several differences between the two that make them incompatible. In order to achieve compatibility between the two systems in real-time, an efficient transcoding scheme must be developed that will convert metadata, video, and interactive data from the DVB-MHP format to the Blu-ray system format.
Selected Publications
Zicong Mai, Panos Nasiopoulos, Rabab Ward, “Efficient DVB-MHP to Blu-ray System Information Transcoding”, 20th Canadian Conference on Electrical and Computer Engineering, April 2007.
Sergio Infante, Panos Nasiopoulos, “A DVB-MHP to Blu-ray transcoding scheme for interactive data”, 26th International Conference on Consumer Electronics, January 2008.
Zicong Mai, Panos Nasiopoulos, and Rabab Ward, “DVB-MHP iTV to Blu-ray System Information Transcoding”, the 3rd International Symposium on Communications, Control and Signal Processing (ISCCSP), March 2008.
Zicong Mai, Panos Nasiopoulos, Sergio Infante, and Rabab Ward, “Real-Time DVB-MHP to Blu-ray System Information Transcoding”, IEEE Transactions on Consumer Electronics, submitted in Sep. 2007.
This project focuses on developing algorithms for multi-view video coding (MVC) schemes and for 3D content generation for three-dimensional television (3D TV) applications. 3D TV is believed to bring the next major revolution in TV history. Future 3D TV will not only allow on-screen images to emerge or penetrate into the viewer’s space, but it will also provide viewers with interactivity features: the viewer will be able to adjust 3D depth perception based on his/her preferences and choose a viewing angle within a visual scene (free navigation). This can be achieved by capturing the scene from multiple viewpoints with a setup of N synchronized cameras. Similar to 2D streams, multi-view streams need to be compressed before transmission. However, a full dynamic multi-view signal acquisition yields N video streams instead of one, resulting in an enormous data rate. Thus, highly efficient coding schemes are required for 3D TV applications to overcome resource limitations such as channel capacity and storage. Although the creation and transmission of new 3D video content is important for the successful introduction of 3D TV to the consumer market, equally important is the ability to convert existing 2D material to 3D format. The latter would allow existing popular movies and documentaries to be transformed into 3D video streams, creating a new market for content owners and providers.
One of the objectives of our research is the effective compression of 3D TV signals by considering the strong correlation that exists among the multi-view streams. Moreover, our MVC schemes will support random access in view and time, with minimum decoding effort. Development will be based on extending and modifying the H.264/AVC standard, which has proven to yield the best efficiency among all existing MPEG standards. To take advantage of the correlation between different views, we will use disparity compensated view prediction in addition to motion compensated prediction. Reference frame management is used in order to provide the viewer with interactivity features, i.e., random access to any frame from any viewpoint or at any time instant with minimum delay.
The other objective of our research is developing efficient methods that convert 2D video sequences to 3D ones by considering the relationship between the motion of objects and their distance from the camera. In our approach the relative motion between consecutive frames is derived at quarter pixel accuracy and for several different block sizes that dynamically adjust to video content. The resulting motion vectors are converted to depth map information using a non-linear model that is based on the characteristics of 3D visual perception. In this case, 2D horizontal motion is approximated to be the displacement between the right and left frames of a 3D set up. Finally, 3D video content is rendered from 2D video stream and its corresponding depth map, using a process known as depth image based rendering (DIBR). Our approach can be implemented in real-time at the receiver-end where motion vectors will be readily available for no additional computational cost.
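The conversion pipeline described above (motion vectors, then a depth map, then depth image based rendering) can be sketched in a few lines. The text does not give the non-linear model, so the saturating exponential in `motion_to_depth` is purely an assumption for illustration, as are the function names, the constant `k`, and the simple left-neighbour hole filling.

```python
import math

def motion_to_depth(mv_x, k=4.0, max_depth=255):
    """Hypothetical non-linear model: larger horizontal motion is treated as
    a nearer object and gets a larger depth value, saturating at max_depth."""
    return [[round(max_depth * (1.0 - math.exp(-abs(v) / k))) for v in row]
            for row in mv_x]

def render_row(pixels, depth, max_shift=2, max_depth=255):
    """Tiny DIBR sketch for one image row: shift each pixel left by a disparity
    proportional to its depth, let nearer pixels win, then fill holes."""
    width = len(pixels)
    out = [None] * width
    winner_depth = [-1] * width
    for x in range(width):
        shift = round(max_shift * depth[x] / max_depth)
        tx = x - shift
        if 0 <= tx < width and depth[x] > winner_depth[tx]:
            out[tx] = pixels[x]
            winner_depth[tx] = depth[x]
    last = None
    for x in range(width):
        if out[x] is None:
            # Disocclusion hole: reuse the nearest rendered pixel to the left,
            # falling back to the source pixel at the row start.
            out[x] = last if last is not None else pixels[x]
        else:
            last = out[x]
    return out

depth_map = motion_to_depth([[0.0, 0.0, 100.0, 100.0]])
second_view = render_row([10, 20, 30, 40], depth_map[0], max_shift=1)
print(depth_map, second_view)
```

In a receiver-side implementation the `mv_x` values would come from the decoded bitstream rather than from a separate motion search, which is the source of the "no additional computational cost" argument made above.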
Selected Publications
M. T. Pourazad, P. Nasiopoulos, and R. K. Ward, “An efficient 2D to 3D video conversion approach”, EURASIP European Signal Processing Conference –EUSIPCO 2008 (submitted).
M. T. Pourazad, R. K. Ward, and P. Nasiopoulos, “An adaptive multi-view video coding scheme”, IEEE International Conference on Image Processing –ICIP 2008 (submitted).
M. T. Pourazad, P. Nasiopoulos, and R. K. Ward, “An H.264-based video encoding scheme for 3D TV”, EURASIP European Signal Processing Conference –EUSIPCO, (Florence, Italy), September 2006.
MPEG-2 is presently the video coding standard used in most consumer products ranging from digital TV broadcasting to DVD. However, H.264/AVC, the latest video coding standard of the Joint Video Team (JVT), is gaining ground in many applications. The introduction of several new advanced features in H.264/AVC results in improved compression efficiency. It is, therefore, expected that MPEG-2 and H.264/AVC will co-exist in the foreseeable future. Since MPEG-2 has been in existence for over a decade, much video has been stored using the MPEG-2 standard. Thus, in order to have universal media access, users with H.264 players should be able to access and play the MPEG-2 coded video. Transcoding from MPEG-2 to H.264 can be carried out at the transmitter, the receiver or at a server located somewhere in between. Compared with the straightforward cascaded approach, which fully decodes the MPEG-2 encoded video back to the pixel domain and then re-encodes it using H.264, transcoding that reuses information from the MPEG-2 stream is a more efficient and cost-effective solution.
In this project we focus on developing MPEG-2-to-H.264 transcoding schemes whose objectives are:
to reduce the computational complexity of the system without affecting the bit-rate or video quality,
to use existing MPEG-2 information to reduce complexity but at the same time exploit new advanced H.264 features that may improve the overall bit rate and/or picture quality of the final video.
To this end, we are designing an efficient MPEG-2 to H.264 transcoding scheme that removes the distortions due to re-quantization errors, luminance half-pixel interpolation errors and chrominance quarter/three-quarter pixel interpolation errors.
Another scheme focuses on using neural networks to predict accurate block size partitioning for H.264 and employing a fast motion vector refinement algorithm for block sizes smaller than 16x16. This approach has been shown to drastically reduce complexity and improve the bit rate while maintaining the same video quality.
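The re-quantization error mentioned above is easy to reproduce numerically. The sketch below does not implement the compensation scheme itself; it only shows, with hypothetical values for the coefficient and the two step sizes, that quantizing an already-quantized value can land further from the original than quantizing the original directly.

```python
import math

def quantize(value, step):
    """Uniform quantizer: snap value to the nearest multiple of step (ties round up)."""
    return math.floor(value / step + 0.5) * step

original = 23      # hypothetical transform coefficient
mpeg2_step = 6     # hypothetical step size used by the source encoder
h264_step = 16     # hypothetical coarser step size used by the target encoder

direct = quantize(original, h264_step)                          # encode from the original
cascaded = quantize(quantize(original, mpeg2_step), h264_step)  # encode from decoded MPEG-2

direct_error = abs(original - direct)
cascaded_error = abs(original - cascaded)
print(direct, cascaded, direct_error, cascaded_error)
```

With these numbers the first quantizer moves the coefficient just far enough to cross a decision boundary of the second one, which is the kind of drift a transcoder has to compensate for.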
The advances in digital video coding are pushing the bounds of video delivery to larger and more versatile applications such as digital cinema. The requirements for such applications, however, include high dynamic range content with very high spatial resolutions, which give rise to the need for bit-depth scalability at the codec level and parallel decoding of spatially localized regions of the picture.
In this project, we investigate the options for transcoding a high-resolution H.264/AVC encoded video stream into multiple lower-resolution H.264/AVC streams, each of which produces a spatially localized section of the original video display. The benefit of such transcoding lies in the ability to distribute the projection load in digital cinema over multiple projectors with varying intensity levels so as to generate a high dynamic range projected display. Moreover, we are exploring new compression schemes for high dynamic range video content that falls outside the existing standard specifications (e.g., H.264).
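As a rough illustration of what "spatially localized sections" means in practice, the sketch below partitions a frame into a grid of rectangles whose interior boundaries fall on 16x16 macroblock edges, so that each rectangle could be handed to its own decoder. The grid layout and the function name `tile_regions` are assumptions made for this example; the text does not specify how the picture is divided.

```python
MB = 16  # H.264/AVC macroblock size in luma samples

def tile_regions(width, height, cols, rows):
    """Return (x, y, w, h) rectangles that partition a width x height frame
    into cols x rows tiles with macroblock-aligned interior boundaries."""
    mb_w = -(-width // MB)    # frame width in macroblocks, rounded up
    mb_h = -(-height // MB)   # frame height in macroblocks, rounded up
    regions = []
    for r in range(rows):
        y0 = (r * mb_h // rows) * MB
        y1 = min(((r + 1) * mb_h // rows) * MB, height)
        for c in range(cols):
            x0 = (c * mb_w // cols) * MB
            x1 = min(((c + 1) * mb_w // cols) * MB, width)
            regions.append((x0, y0, x1 - x0, y1 - y0))
    return regions

for region in tile_regions(1920, 1080, cols=2, rows=2):
    print(region)
```

Aligning to macroblock edges keeps every macroblock inside exactly one tile, although motion vectors that point across a tile boundary would still need special handling in a real transcoder.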
Selected Publications
Q. Tang, P. Nasiopoulos and R. Ward, “Efficient High Quality MPEG2 to H.264/AVC Transcoding,” submitted to the IEEE Transactions on Circuits and Systems for Video Technology in September 2006, 3rd rev. March 2007.
Q. Tang, P. Nasiopoulos, Z. Mai and R. Ward, “Block Size Mode Prediction Using Neural Networks and Fast MV Refinement for MPEG-2 to H.264 Transcoding,” submitted to the IEEE International Workshop on Machine Learning for Signal Processing in April 2007.
Q. Tang, P. Nasiopoulos and R. Ward, “Efficient Chrominance Compensation for MPEG2 to H.264 Transcoding,” IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2007, Honolulu, Hawaii, April 2007.
Q. Tang, R. Ward, P. Nasiopoulos, “An Efficient MPEG2 to H.264/AVC Half-Pixel Motion Compensation Transcoding,” ICIP 2006, Atlanta, October 2006.
Q. Tang, P. Nasiopoulos, R. Ward, “An Efficient Re-quantization Error Compensation for MPEG2 to H.264 Transcoding” IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) 2006, pp. 530-535, Vancouver, August 2006.
High speed packet access (HSPA) technologies such as HSDPA/HSUPA in third and fourth generation (3G/4G) wireless networks will have the capability to support heterogeneous compressed video traffic with transmission speeds reaching up to 14.4 Mbps on the downlink channel and 5.8 Mbps on the uplink channel. Wireless video transmission applications can be divided into two main categories: downlink streaming services such as mobile TV, and down/uplink conversational services such as video conferencing, which are characterized by bandwidth-intensity, delay-sensitivity, and loss-tolerance. Our research tackles the power, bandwidth, and protection allocation necessary to overcome the packet loss and limited capacity of wireless channels and the limited battery life of mobile devices.
We address the case of transmitting scalable video data to multiple users over wireless networks that suffer from packet loss, variable packet delivery delay, and limited capacity. These challenges constitute the general difficulties facing wireless video streaming and conversational applications. We are currently treating the problem of channel-aware and delay-aware multi-user scalable video streaming with optimized unequal erasure protection (UXP) over HSDPA networks, taking into account capacity constraints, device heterogeneity, quality of service (QoS) guarantees, and packet delivery delays. We have also developed a loss-distortion model for hierarchically predictive video codecs and successfully used our model in the multi-user rate-allocation framework. The objective is to dynamically allocate unequal error protection and scalable video bit-rate to the multiple users accessing the downlink channel while satisfying the time-varying channel capacity, delay, and QoS guarantees. In order to enable efficient uplink services, the following additional considerations should be met:
Developing accurate channel capacity estimates and introducing the effect of the media access control (MAC) layer scheduling and admission control policies to further improve multi-user bit-rate allocation.
Derivation of real-time rate-distortion models for spatial, temporal, and quality scalable coded video streams. These models enable streaming servers and conversational service controllers to optimize the encoding and rate-allocation strategies based on the underlying network conditions and the expected video characteristics.
Transmission power management of uplink mobile devices to perform wireless resource allocation and user scheduling in HSUPA systems.
We wish to address the above mentioned requirements for video transmission over third and fourth generation mobile HSPA networks by designing and implementing resource allocation and scheduling architectures that can improve the quality performance of wireless multimedia applications.
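The multi-user rate-allocation problem can be made concrete with a deliberately simple stand-in. The sketch below uses a greedy heuristic, not the optimization developed in this project: each user's stream is a list of scalable layers, and the channel capacity is handed out one layer at a time to whichever next layer offers the most quality gain per kbps. The layer rates, gains, and the function name `allocate_layers` are all hypothetical.

```python
def allocate_layers(users, capacity_kbps):
    """Greedy layer allocation.

    users: dict mapping a user name to a list of (rate_kbps, quality_gain)
           layers, base layer first; a layer is only usable if all lower
           layers of the same user were allocated.
    Returns (layers allocated per user, total rate used in kbps).
    """
    allocated = {name: 0 for name in users}
    remaining = capacity_kbps
    while True:
        best = None  # (gain per kbps, user name, layer rate)
        for name, layers in users.items():
            idx = allocated[name]
            if idx < len(layers):
                rate, gain = layers[idx]
                if rate <= remaining:
                    ratio = gain / rate
                    if best is None or ratio > best[0]:
                        best = (ratio, name, rate)
        if best is None:
            return allocated, capacity_kbps - remaining
        _, name, rate = best
        allocated[name] += 1
        remaining -= rate

streams = {
    "user_a": [(200, 30.0), (200, 4.0)],   # hypothetical layers
    "user_b": [(300, 28.0), (300, 3.0)],
}
print(allocate_layers(streams, capacity_kbps=800))
```

A channel-aware version would replace the fixed `quality_gain` values with expected gains from a loss-distortion model and would reserve part of each layer's rate for erasure protection, which is where unequal protection across layers enters.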
Selected Publications
H. Mansour, V. Krishnamurthy, P. Nasiopoulos, “Channel Aware Multi-User Scalable Video Streaming over Lossy Under-Provisioned Channels: Modeling and Analysis,” submitted to IEEE Transactions on Multimedia.
H. Mansour, V. Krishnamurthy, P. Nasiopoulos, “Delay-Aware Rate Control for Multi-User Scalable Video Streaming,” submitted to IEEE ICC 2008.
H. Mansour, P. Nasiopoulos, V. Krishnamurthy, “Joint Media-Channel Aware Unequal Error Protection for Wireless Scalable Video Streaming,” submitted to IEEE ICASSP 2008.
H. Mansour, V. Krishnamurthy, P. Nasiopoulos, “Channel Adaptive Multi-User Scalable Video Streaming with Unequal Erasure Protection”, International Workshop on Image Analysis and Multimedia Interactive Services (WIAMIS), pp 54 – 57, June 2007.
H. Mansour, P. Nasiopoulos, V. Krishnamurthy, “Modelling of Loss Distortion in Hierarchical Prediction Codecs”, International Symposium on Signal Processing and Information Technology (ISSPIT), pp 536 - 540, Vancouver, August 2006.
H. Mansour, P. Nasiopoulos, V. Leung, “An Efficient Multiple Description Coding Scheme for the Scalable Extension of H.264/AVC (SVC)”, ISSPIT, pp 519 - 523, Vancouver, August 2006.
H. Mansour, P. Nasiopoulos, V. Leung, “Low Redundancy Layered Multiple Description Scalable Coding Using the Subband Extension of H.264/AVC”, International Symposium on Circuits And Systems (ISCAS) 2005, pp 4042 - 4045, Kobe, Japan, May 2005.
Multimedia applications that rely on wireless communications are becoming increasingly popular. Traditionally, the communication media for video traffic were limited to cable or satellite broadcast links. New wireless technologies such as WiMAX (802.16), 802.15, and 802.11, and future 60GHz band technologies provide countless opportunities for new applications. Examples of such applications include broadcasting video in wireless metropolitan area networks (WMANs) using 802.16 technology, and in neighborhoods using 802.11 based wireless local area networks (WLANs). Other home environment applications include streaming high definition (HD) video to LCD panels in a home, using 802.11, 802.15 or future 60GHz band networks (for Wireless Personal Area Networks, WPANs). Video surveillance applications, based on WLAN or WMAN technologies, are also of great interest.
Existing wireless technologies are designed with the aim of increasing available throughput for data applications, usually overlooking the requirements and characteristics of applications such as video. In this project, we address the issues in transporting video traffic over wireless networks and focus our efforts on designing schemes that will improve the quality of the video delivered over current and future wireless networks. In particular, we are devising video-aware link adaptation and scheduling schemes for 802.11n and 802.16 networks and will extend this work to WPANs (802.15) as well. The proposed link adaptation scheme adjusts the physical layer configuration in order to achieve maximum quality for the delivered scalable H.264 video, given the network and admission control constraints in the network layer. This mechanism is particularly beneficial for HDTV broadcast applications. In addition to link adaptation schemes, we plan to design scheduling schemes for WiMAX and WPANs, targeting applications such as video surveillance. Our current and future research will especially benefit applications such as home and commercial video streaming and broadcasting.
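The idea behind video-aware link adaptation can be sketched as follows: rather than picking the physical layer mode with the highest raw rate, pick the one under which the most valuable set of scalable video layers actually gets through. This is a simplified illustration under stated assumptions, not the proposed scheme; the mode table, packet error rates, layer rates, and the `airtime_share` parameter (standing in for an admission control constraint) are all hypothetical.

```python
def pick_phy_mode(modes, layers, airtime_share=1.0):
    """Choose the PHY mode that maximizes delivered video quality.

    modes:  list of (name, phy_rate_mbps, packet_error_rate)
    layers: list of (rate_mbps, quality) for the scalable stream, base layer first
    airtime_share: fraction of channel time granted to this stream
    """
    best_name, best_quality = None, -1.0
    for name, phy_rate, per in modes:
        goodput = phy_rate * (1.0 - per) * airtime_share
        used, quality = 0.0, 0.0
        for rate, q in layers:
            if used + rate > goodput:
                break               # higher layers depend on this one
            used += rate
            quality += q
        if quality > best_quality:
            best_name, best_quality = name, quality
    return best_name, best_quality

phy_modes = [("robust", 6.5, 0.01), ("medium", 26.0, 0.05), ("fast", 65.0, 0.70)]
video_layers = [(4.0, 30.0), (8.0, 4.0), (12.0, 2.0)]
print(pick_phy_mode(phy_modes, video_layers))
```

In this toy setting the fastest mode loses so many packets that it carries fewer video layers than the medium one, which is the situation a throughput-only adaptation rule would miss.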
Selected Publications
Y. Pourmohammadi Fallah, S. Khan, H. Mansour, P. Nasiopoulos, H. Alnuweiri, “A Link Adaptation Scheme for Optimized Transmission of Scalable H.264 Video over Multirate Wireless LANs”, accepted for publication in IEEE Trans. on Circuits and Systems for Video Technology, 2007
Y. Pourmohammadi Fallah, P. Nasiopoulos, “A QoS Framework for Supporting Real-Time Multimedia Traffic in IEEE 802.16 Networks”, submitted to IEEE Wireless Communications
Y. Pourmohammadi Fallah, P. Nasiopoulos, H. Alnuweiri, “Controlled Access Schemes for Efficient Delivery of H.264 Video over WLANs”, submitted to EURASIP Journal of Wireless Communications and Networking
Y. Pourmohammadi Fallah, S.Khan, P. Nasiopoulos, H.Alnuweiri, “Hybrid OFDMA/CDMA Based Medium Access Control for Next Generation WLANs”, to appear, Proc. IEEE Int. Conf. on Communications, ICC 2008
Y. Pourmohammadi Fallah, P. Nasiopoulos, H. Alnuweiri, “Scheduled and Contention Access Transmission of Partitioned H.264 Video over WLANs”, Proc. of IEEE Globecom, pp. 2134-2139, 2007.
Y. Pourmohammadi Fallah, H. Mansour, S. Khan, P. Nasiopoulos, H. Alnuweiri, “An Optimized Link Adaptation Scheme for Efficient Delivery of Scalable H.264 Video over IEEE 802.11n”, to appear in the Proc. of IEEE Int. Symp. On Circuits and Systems (ISCAS) 2008.
Y. Pourmohammadi Fallah, P. Nasiopoulos, V. Leung, “Fair Scheduling in Multirate IEEE 802.16 Networks”, to appear in Proc. of IEEE Int. Symp. On Wireless Pervasive Computing (ISWPC) 2008
A. T. Connie, P. Nasiopoulos, V. Leung, Y. Pourmohammadi Fallah, “Video Packetization Techniques for Enhancing H.264 Video Transmission over 3G Networks”, to appear in Proc. 5th IEEE Consumer Communications and Networking Conf. (CCNC) 2008.
Y. Pourmohammadi Fallah, Darrel Koskinen, Faizal Karim, Avideh Shahabi, Panos Nasiopoulos, “A Cross Layer Optimization Mechanism to Improve H.264 Video Transmission over WLANs”, Proc. 4th IEEE Consumer Communications and Networking Conf. (CCNC), January 2007, pp. 875-879.
Y. Pourmohammadi Fallah, H. Alnuweiri, “Hybrid Polling and Contention Access Scheduling in IEEE 802.11e WLANs”, Journal of Parallel and Distributed Computing, Elsevier, Vol. 67, Issue 2, Feb. 2007, pp. 242-256.
Y. Pourmohammadi-Fallah, K. Asrar-Haghighi, H. Alnuweiri, “Internet delivery of MPEG-4 Object-based Multimedia”, IEEE Multimedia, Vol. 10, Issue 3, July-Sept. 2003, pp. 68-78
The ease by which anyone can distribute copies of digital content facilitates piracy, which results in significant losses for the movie industry. Watermarking can be used to either identify the recording device or enable access control. In the latter case, the watermark embedded in the video sequence provides information on whether video players are authorized to display the content or not.
This project focuses on developing a new highly robust watermarking scheme designed specifically for access control applications. The development of such a scheme will solve the problem that content owners constantly face when handheld video cameras are used to record feature films played in poorly supervised movie theaters. The illegally recorded video sequences are compressed and sold as pirated DVDs. These DVDs do not provide any revenues to the creators of the content. Furthermore, potential revenues may be lost since people might choose to watch the pirated copies instead of going to the theaters.
One objective is to design a watermarking scheme that is part of the theater projection devices, with the watermark identifying the theater location. This approach would allow Hollywood to identify the poorly supervised theater and remove its license.
Another objective is to design a watermarking scheme that enables DVD players to detect a watermark that was inserted in the copy of a movie - which was only intended for release in cinemas - and not play the illegal content.
Our research has been based on the Dual-Tree Complex Wavelet Transform, which is inherently robust to geometric attacks such as rotation, scaling, and cropping. Our method, which is also robust to compression attacks, relies on the orientation of edges rather than pixel positions to embed the watermark. Another advantage of our approach is that the decoder is blind, i.e., the watermark can be detected without knowledge of the original content. This feature makes it useful for watermark detection in compliant DVD players.
Selected Publications
Lino E. Coria, Mark Pickering, Panos Nasiopoulos, Rabab K. Ward, "A Video Watermarking Scheme Based on the Dual-Tree Complex Wavelet Transform," IEEE Transactions on Information Forensics and Security. Submitted
Lino E. Coria, Mark Pickering, Panos Nasiopoulos, Rabab K. Ward, "A Complex-Wavelet Based Video Watermarking Scheme for Playback Control," EURASIP Journal on Image and Video Processing. Submitted
Mark Pickering, Lino Coria, Panos Nasiopoulos, "A Novel Blind Video Watermarking Scheme for Access Control Using Complex Wavelets," IEEE International Conference on Consumer Electronics, Las Vegas, NV, USA, January 12-14, 2007.
Lino Coria, Panos Nasiopoulos, Rabab Ward, “A Robust Content-Dependent Algorithm for Video Watermarking,” ACM Workshop on Digital Rights Management, Alexandria, VA, USA, October 30, 2006, pages 97-101.
Lino Coria-Mendoza, Panos Nasiopoulos, Rabab Ward, "A Robust Watermarking Scheme Based on Informed Coding and Informed Embedding," IEEE International Conference on Image Processing ICIP 2005, Genoa, Italy.
Panos Nasiopoulos, Lino Coria-Mendoza, Hassan Mansour, Adarsh Golikeri, "An Improved Error Concealment Algorithm for Intra-Frames in H.264/AVC," IEEE International Symposium on Circuits and Systems ISCAS 2005, Kobe, Japan.
Virtually all consumer digital cameras use a single light sensor for capturing colour images, instead of having three sensors for capturing red, green and blue samples at each pixel location. Consequently, these cameras capture incomplete colour information, and an interpolation process (known as demosaicking) must be performed to generate a full colour image from the data the sensor captures. This project focuses on developing a computationally efficient demosaicking method for single-sensor cameras, which lowers the computing requirements of the processor while maintaining high image quality. We are also working on designing methods for compressing images and video streams captured with single-sensor cameras. Our proposed compression schemes take advantage of the method used for capturing colour in these cameras to achieve better compression efficiency.
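For readers unfamiliar with demosaicking, the sketch below shows the textbook bilinear baseline, which is not the method developed in this project. It assumes an RGGB Bayer layout and an image of at least 2x2 samples; each missing colour value is the average of the nearest samples of that colour. The function name and the sample values are hypothetical.

```python
def demosaic_bilinear(raw):
    """Bilinear demosaicking of a Bayer mosaic.

    raw: 2D list of sensor values in an RGGB layout (assumed), at least 2x2.
    Returns a 2D list of (r, g, b) tuples.
    """
    h, w = len(raw), len(raw[0])

    def channel_at(y, x):
        # RGGB: even rows are R G R G ..., odd rows are G B G B ...
        if y % 2 == 0:
            return "r" if x % 2 == 0 else "g"
        return "g" if x % 2 == 0 else "b"

    def estimate(y, x, channel):
        if channel_at(y, x) == channel:
            return raw[y][x]          # measured sample, keep as is
        total, count = 0, 0
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w and channel_at(yy, xx) == channel:
                    total += raw[yy][xx]
                    count += 1
        return total / count

    return [[tuple(estimate(y, x, c) for c in "rgb") for x in range(w)]
            for y in range(h)]

# A flat patch of colour (200, 100, 50) as the sensor would record it.
mosaic = [[200 if (y % 2 == 0 and x % 2 == 0)
           else 50 if (y % 2 == 1 and x % 2 == 1)
           else 100
           for x in range(4)] for y in range(4)]
image = demosaic_bilinear(mosaic)
print(image[0][0], image[1][2])
```

Bilinear interpolation is cheap but blurs edges and produces colour fringing, which is why more careful methods, and the trade-off between their quality and computing cost, remain an active topic.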
Multi-view video provides an exciting way for a user to observe a dynamic 3D scene. In these systems, a number of cameras capture the same scene from different positions and viewing angles. By interpolating between the captured views, any viewing direction can be rendered. Thus the user can freely choose the way they view the scene, providing a level of interactivity and realism not possible with traditional 2D video.
A major challenge with implementing multi-view video systems is the huge amount of data that must be captured, processed and stored. Current multi-view systems can involve up to 100 cameras, each of which captures a high-resolution video. To transmit and store the data, efficient compression schemes are required.
Traditional single-view video compression involves exploiting the redundancies within individual frames (spatial correlation) and between frames captured at different times (temporal correlation). In multi-view video there is also correlation between views captured with different cameras, which can be exploited to increase compression efficiency. This can be done with simple block-based disparity estimation or with more advanced geometric prediction that takes the camera arrangement into account.
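A minimal sketch of block-based disparity estimation is given below. To keep it short it works on a single row of pixels rather than a 2D block, and it assumes the two cameras are arranged so that disparity is purely horizontal. The names `sad` and `best_disparity` and the choice of the sum of absolute differences as the matching cost are assumptions made for illustration.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length pixel runs."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_disparity(left_row, right_row, start, block, max_disp):
    """Find the shift that best predicts a block of the left view from the right view.

    Searches shifts 0..max_disp (towards the left in the right view) and
    returns (disparity, cost) for the lowest SAD.
    """
    target = left_row[start:start + block]
    best = None
    for d in range(max_disp + 1):
        pos = start - d
        if pos < 0:
            break  # candidate would fall outside the right view
        cost = sad(target, right_row[pos:pos + block])
        if best is None or cost < best[1]:
            best = (d, cost)
    return best


left = [0, 0, 0, 5, 9, 5, 0, 0]
right = [0, 5, 9, 5, 0, 0, 0, 0]
print(best_disparity(left, right, start=3, block=3, max_disp=3))  # (2, 0)
```

An encoder would then code only the disparity and the (ideally small) residual between the block and its prediction.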
There are many new challenges that arise in multi-view coding. Due to inconsistencies between cameras there can be differences in illumination, colour and focus between views. There can also be unintended spatial displacements between cameras and time synchronization problems. These issues affect the correlation between views and hence lower the compression efficiency when views are predicted from other views.
This project focuses on developing methods for correcting inconsistencies between views in multi-view video. There are several benefits to doing this. The compression efficiency will be higher since there will be higher correlation between views. The 3D viewing experience will be improved as there will be less variation in conditions as the user switches between views while watching the video. The quality of interpolated views should also improve, as the views used to generate the interpolated view will be more consistent.
Techniques have been developed for colour matching between pairs or sets of images. However, additional challenges arise in correcting illumination and colour in multi-view video. Applying colour correction to views on a frame-by-frame basis can reduce temporal correlation in the corrected views, since different correction parameters are applied at different times. This negatively affects compression performance. We will investigate new colour correction methods designed to consider both inter-view correlation and temporal correlation within each view to maximize coding efficiency. This may involve smoothing the colour correction parameters applied at different times so that they change gradually.
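One simple way to make correction parameters change gradually is an exponential moving average over time, sketched below for a single per-frame gain. This is only one possible smoothing choice, not necessarily the one the project will adopt. The function name `smooth_parameters` and the smoothing factor `alpha` are assumptions made for the example.

```python
def smooth_parameters(per_frame_gains, alpha=0.2):
    """Exponentially smooth a sequence of per-frame correction gains.

    alpha close to 0 gives slow, gradual changes; alpha = 1 reproduces
    the unsmoothed frame-by-frame parameters.
    """
    smoothed = []
    previous = None
    for gain in per_frame_gains:
        if previous is None:
            previous = gain  # the first frame has nothing to blend with
        else:
            previous = alpha * gain + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed


print(smooth_parameters([1.0, 2.0], alpha=0.5))  # [1.0, 1.5]
```

A smaller `alpha` preserves more temporal correlation but makes the correction slower to follow genuine changes in lighting between views.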
We will investigate using adaptive filtering to correct focus issues between views. There are tradeoffs between preserving detail in the views most in focus and matching the sharpness between views. Combinations of blurring and sharpening different views will be investigated in order to improve the quality of the 3D video experience.
Selected Publications:
C. Doutre, and P. Nasiopoulos, "A Colour Correction Preprocessing Method For Multiview Video Coding," Submitted to European Signal Processing Conference (EUSIPCO-2008).
C. Doutre, and P. Nasiopoulos, “Motion Vector Prediction For Improving One Bit Transform Based Motion Estimation,” Accepted in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2008).
C. Doutre, P. Nasiopoulos, and K.N. Plataniotis, “H.264 Based Compression of Bayer Pattern Video Sequences.” Accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology, Nov. 2007.
C. Doutre, P. Nasiopoulos, and K.N. Plataniotis, “A Fast Demosaicking Method Directly Producing YCbCr 4:2:0 Output.” IEEE Transactions on Consumer Electronics, vol. 53, no. 2, pp.499-505, May 2007.
C. Doutre, and P. Nasiopoulos, “Analysis of the Impact of Demosaicking on JPEG Image Compression,” Proc. IEEE Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2007), Santorini, Greece, pp. 71-74, Jun. 2007.
C. Doutre, and P. Nasiopoulos, “An Efficient Compression Scheme for Colour Filter Array Images Using Estimated Colour Differences,” Proc. IEEE Canadian Conference on Electrical and Computer Engineering (CCECE 2007), Vancouver, Canada, pp. 24-27, Apr. 2007.
C. Doutre, and P. Nasiopoulos, “An Efficient Compression Scheme for Colour Filter Array Video Sequences,” Proc. IEEE International Workshop on Multimedia Signal Processing (MMSP-2006), Victoria, Canada, pp. 166-169, Oct. 2006.
Video coding is the most demanding task for a mobile device to perform, because of the enormous amount of data that must be processed to capture, store, convert and display motion picture content. A typical image sensor with a resolution of 176x144 pixels (QCIF resolution), a sampling rate of 30 frames per second and 24 bits per pixel would generate about 18.25 Mbit/s of raw data (176 x 144 x 30 x 24 bits). This data rate is significantly beyond the storage and processing capabilities of mobile devices.
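The raw data rate is simply the product of frame width, frame height, frame rate and bits per pixel. The short sketch below evaluates it for QCIF at 30 frames per second. The function name `raw_bitrate` is an assumption made for the example. At 24 bits per pixel the product is about 18.25 Mbit/s, while 8 bits per pixel (a single sample per pixel) gives about 6.08 Mbit/s.

```python
def raw_bitrate(width, height, fps, bits_per_pixel):
    """Uncompressed video data rate in bits per second."""
    return width * height * fps * bits_per_pixel


print(raw_bitrate(176, 144, 30, 24) / 1e6)  # 18.24768 (Mbit/s)
print(raw_bitrate(176, 144, 30, 8) / 1e6)   # 6.08256 (Mbit/s)
```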
In addition to being cost-efficient, mobile devices need to be small and lightweight, and should ideally have a long battery life. These restrictions significantly limit the energy available for different applications and consequently call for a well-balanced optimization of the required processing power, the data storage space and other energy-consuming parts such as the display, radio transmitter and image sensor.
In this project we focus on identifying intelligent ways to reduce the computational load of the video encoding process without sacrificing bitrate or quality. One key aspect is the motion estimation process, which consumes between 60% and 90% of the required computational time and therefore offers significant potential for savings in processing power and energy. Additionally, we aim to tailor the latest video coding standards (specifically H.264) to the specific characteristics of mobile devices, such as low bandwidth, a single capturing sensor, and limited screen size, storage capacity, processing power and battery energy. Taking advantage of detailed knowledge about these elements would facilitate real-time video encoding on mobile devices, propelling the development of many new applications.
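To give a feel for why motion estimation dominates the encoder's workload, the sketch below counts the pixel comparisons needed by an exhaustive (full-search) block-matching scheme for one frame. The 16x16 block size and the search range of plus or minus 16 pixels are assumptions chosen for illustration, the function name `full_search_operations` is hypothetical, and the count ignores frame-boundary clipping, sub-pixel refinement and the multiple block sizes that H.264 allows.

```python
def full_search_operations(width, height, block=16, search_range=16):
    """Absolute-difference operations for one frame of full-search motion estimation."""
    blocks = (width // block) * (height // block)   # blocks per frame
    candidates = (2 * search_range + 1) ** 2        # positions tested per block
    return blocks * candidates * block * block      # one comparison per pixel


per_frame = full_search_operations(176, 144)
print(per_frame)       # 27599616 comparisons per QCIF frame
print(per_frame * 30)  # 827988480 comparisons per second at 30 frames per second
```

Fast motion estimation algorithms aim to test far fewer candidate positions per block while finding nearly the same motion vectors.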
Selected Publications:
M. von dem Knesebeck, P. Nasiopoulos, H.-J. Lee and V.C.M. Leung, “A Fast Motion Estimation Algorithm for Mobile Devices”, submitted to IEEE Transactions on Multimedia, October 2007.
M. von dem Knesebeck, P. Nasiopoulos, “An efficient early-termination motion estimation algorithm for H.264”, submitted to IEEE Transactions on Multimedia, February 2008.
P. Nasiopoulos and M. von dem Knesebeck, "A Fast Video Motion Estimation Algorithm for the H.264 Standard," in IEEE International Conference on Multimedia and Expo (ICME), pp. 701-4, 2006.
H.-J. Lee, P. Nasiopoulos, J. Nam, “An improved fast motion estimation algorithm for Mobile Applications”, in EUSIPCO Signal Processing Conference, 2006.
H.-J. Lee, “A Fast Motion Estimation Algorithm for Mobile Devices”, Master’s Thesis, University of British Columbia, 2005.
H.-J. Lee, P. Nasiopoulos and V.C.M. Leung, "Fast Video Motion Estimation Algorithm for Mobile Devices," in IEEE International Conference on Multimedia and Expo (ICME), pp. 370-3, 2005.
Video Packetization Techniques for NAL
With the advent of 3G technologies, video transmission over wireless networks has become inevitable. The wireless channel is very error prone due to fading, shadowing, inter-symbol interference and similar effects, and bandwidth is a very costly resource in the wireless domain. Video transmission in this domain therefore requires high coding efficiency and error resilience features. These requirements make the H.264/Advanced Video Coding (AVC) standard the prime candidate for video transmission, since it achieves almost double the compression efficiency of previous standards and offers novel error resilience features such as slice structuring and flexible macroblock ordering. This project focuses on finding the H.264 video packet size that ensures good picture quality within the delay and bandwidth constraints.
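The packet size trade-off can be sketched with a deliberately simplified model: larger packets waste less bandwidth on headers but are more likely to be hit by an error. The model below assumes independent bit errors and that any bit error loses the whole packet, which is a strong simplification for fading channels. The function names are hypothetical, and the 40-byte header (IPv4 + UDP + RTP without header compression) is an assumption about the protocol stack, not something stated by the project.

```python
def packet_loss_probability(payload_bytes, header_bytes, bit_error_rate):
    """Probability that at least one bit of the packet is corrupted."""
    bits = 8 * (payload_bytes + header_bytes)
    return 1.0 - (1.0 - bit_error_rate) ** bits


def header_overhead(payload_bytes, header_bytes):
    """Fraction of transmitted bytes spent on headers."""
    return header_bytes / (payload_bytes + header_bytes)


HEADER_BYTES = 40  # assumed: 20 (IPv4) + 8 (UDP) + 12 (RTP)
for payload in (100, 400, 1400):
    loss = packet_loss_probability(payload, HEADER_BYTES, 1e-5)
    overhead = header_overhead(payload, HEADER_BYTES)
    print(payload, round(loss, 4), round(overhead, 4))
```

Under this model the loss probability rises and the header overhead falls as the payload grows, which is why an intermediate packet size tends to work best.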
Joint Optimization of PR-SCTP, NAL and VCL
This work will focus on the joint optimization of transport layer and application layer functionality. H.264 is the most recent standard for video compression, achieving not only the highest compression efficiency but also providing network friendliness and error resiliency. Data partitioning is one of the most important error resilience features of H.264. With data partitioning, each video slice is encoded into three different units of data of different importance. The packet containing the most important information should be protected against transmission errors to ensure good quality. By virtue of the multistreaming and partial reliability properties of the Stream Control Transmission Protocol (SCTP), we can set a different priority for each data partition. This work focuses on jointly optimizing the H.264 error resilience properties and the multistreaming and partial reliability properties of SCTP. We will investigate the impact of losing different partitions on picture quality, and we will also provide a comparative study between the cases where data partitioning is and is not used.
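The idea of giving each data partition its own stream and reliability level can be sketched as a small lookup table. In H.264, partition A carries the slice header, macroblock types and motion vectors, partition B the intra residual data and partition C the inter residual data. The stream numbers, the retransmission limits and the names `StreamPolicy`, `POLICY` and `should_retransmit` are all hypothetical values chosen for illustration; the sketch does not call any real SCTP stack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamPolicy:
    stream_id: int            # hypothetical SCTP stream carrying this partition
    max_retransmissions: int  # hypothetical limit; 0 means never retransmit


# Assumed priorities: A is the most important partition, C the least.
POLICY = {
    "A": StreamPolicy(stream_id=0, max_retransmissions=5),
    "B": StreamPolicy(stream_id=1, max_retransmissions=2),
    "C": StreamPolicy(stream_id=2, max_retransmissions=0),
}


def should_retransmit(partition, attempts_so_far):
    """Decide whether a lost packet of the given partition is sent again."""
    return attempts_so_far < POLICY[partition].max_retransmissions


print(should_retransmit("A", 3))  # True
print(should_retransmit("C", 0))  # False
```

Tuning such limits against the measured quality loss for each partition is the kind of joint optimization this work is concerned with.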
Selected Publications:
A. T. Connie, P. Nasiopoulos, V. C. M. Leung and Y. P. Fallah, “Video Packetization Techniques for Enhancing H.264 Video Transmission over 3G Networks,” to appear in the Proc. of IEEE Consumer Communications and Networking Conf. (CCNC), January 2008.
A. T. Connie, P. Nasiopoulos, Y. P. Fallah, V. C. Leung, “Joint Optimization of H.264 Error Resilience Properties and SCTP Features,” submitted to PIMRC 2008, September 2008.
Transcoding is the process of reformatting the content of a compressed video stream to the same or another format. One application of transcoding is inserting a logo in a stream of encoded bits. The commercial justifications for this application are numerous. There are more television channels and programs to watch than ever before, and television networks are struggling for visibility against a mass of hungry cable channels. The logo is extremely effective at identifying the station to the viewer. Throughout the years, we have come to associate the “peacock” logo with NBC, or the “eye” logo with CBS. Placing logos at strategic moments can greatly improve a broadcaster’s chances of viewer recognition.
TV logos usually appear stationary at the bottom right corner of the screen, and they are inserted into the video stream with varying levels of transparency. As future broadcasting will involve pre-compressed movies, it is essential to develop logo insertion schemes that will not affect the overall quality of the content.
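In the pixel domain, inserting a semi-transparent logo amounts to alpha blending, sketched below for a single-channel (for example luma) frame stored as a list of rows. This illustrates only the blending itself, on decoded pixels; it is not a compressed-domain insertion algorithm. The function name `blend_logo` and its parameters are assumptions made for the example, and the logo is assumed to fit entirely inside the frame.

```python
def blend_logo(frame, logo, top, left, alpha):
    """Blend `logo` into a copy of `frame` with its corner at (top, left).

    alpha = 1.0 gives an opaque logo, alpha = 0.0 leaves the frame unchanged.
    """
    out = [row[:] for row in frame]  # leave the input frame untouched
    for r, logo_row in enumerate(logo):
        for c, logo_pixel in enumerate(logo_row):
            background = frame[top + r][left + c]
            out[top + r][left + c] = round(alpha * logo_pixel + (1 - alpha) * background)
    return out


frame = [[100, 100], [100, 100]]
print(blend_logo(frame, [[200]], top=1, left=1, alpha=0.5))  # [[100, 100], [100, 150]]
```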
This project investigates the effectiveness of applying previous MPEG-2 logo insertion algorithms to H.264, identifies the new challenges that arise with H.264, and develops new approaches to overcome them. We will strive for cost-effective and practical solutions in our approaches.
Publications: (Master’s work)
D. Xu and M. D. Adams, "Optimization-based methods for the design of high-performance separable filter banks for image coding", in preparation to be submitted to Signal Processing or IEEE Transactions on Signal Processing in Dec 2007.
D. Xu and M. D. Adams, "An Improved Normal-Mesh-Based Image Coder", Canadian Journal of Electrical and Computer Engineering, Volume 33, Issue 1, 2008 (accepted).
D. Xu and M. D. Adams, "An Improved Multiscale Normal-Mesh-Based Image Coder", Proc. of IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PacRim2007), Victoria, BC, Canada, Aug. 2007, pp. 50-53.
Di Xu and Michael D. Adams, "Design of High-Performance Filter Banks for Image Coding", Proc. of IEEE International Symposium on Signal Processing and Information Technology (ISSPIT 2006), Vancouver, BC, Canada, Aug. 2006, pp. 868-873.