Media Management Research Lab

National University of Singapore


I. Streaming Media

Location-adaptive P2P Audio Streaming

In the past few years the Internet has become an important platform for large-scale interactive applications. One such type of application is the networked virtual environment (NVE), in which users move a representation of themselves (also known as an avatar) through a shared virtual world and interact with each other. One of the best-known examples of an NVE is Second Life. Such virtual worlds are interesting for a number of reasons. Most importantly, some of them are not applications per se but form the foundation for the creation of specific applications. For example, Second Life has been used by large corporations for virtual meetings, training and recruitment. As a consequence, the term metaverse has also been used to describe such generic NVEs. Of course, one of the biggest application areas for NVEs is games, also termed Massively Multiplayer Online Games (MMOGs).

Very large NVEs present a plethora of research challenges. Among them are the scalability of network traffic and server environments, the end-to-end delay of user interactions, human-computer interface issues, and more. For some of these problems fairly good solutions have been found. For example, the visual quality of the 3D environments in these shared worlds is quite good and constantly improving through better hardware and software algorithms. However, one area that has lagged far behind is natural audio communication. Voice interaction between NVE participants is still in its very early stages and leaves much to be desired. The most widely used mode of communication in these shared worlds is text chat. While this is a mature technology that is relatively easy to implement (and efficient in its network bandwidth use), we feel that it lacks much of the natural and immersive character that good-quality voice communication could provide. Some commercial voice communication tools are available now (e.g., TeamSpeak, Ventrilo and others), but they are all based on centralized client-server architectures.

With our project we address the significant challenge of designing a location-adaptive audio streaming architecture for NVEs based on a peer-to-peer distribution topology. Specifically, the architecture supports proximity audio, an audio distribution mode that allows for natural and automatic control of group sizes in online virtual environments such as Massively Multiplayer Online Games (MMOGs).
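The idea behind proximity audio can be illustrated with a minimal sketch: each listener receives audio only from avatars within a hearing range, so group size follows from positions in the virtual world. The `Avatar` class, the `audible_peers` function, the 2D coordinates and the `hearing_range` parameter are all assumptions made for illustration; they are not taken from the project's implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Avatar:
    """Hypothetical avatar with a 2D position in the virtual world."""
    name: str
    x: float
    y: float


def audible_peers(listener, avatars, hearing_range):
    """Return the names of avatars within hearing_range of the listener,
    nearest first. The listener itself is excluded."""
    in_range = []
    for other in avatars:
        if other.name == listener.name:
            continue
        distance = math.hypot(other.x - listener.x, other.y - listener.y)
        if distance <= hearing_range:
            in_range.append((distance, other.name))
    return [name for _, name in sorted(in_range)]


avatars = [Avatar("alice", 0, 0), Avatar("bob", 3, 4), Avatar("carol", 30, 40)]
print(audible_peers(avatars[0], avatars, hearing_range=10.0))  # ['bob']
```

In a peer-to-peer setting, a list like this could determine which peers exchange audio streams directly, but how the actual architecture builds its distribution topology is not described here.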

People Involved
Roger Zimmermann
Beomjoo Seo
Min Min Htoon
Ke Liang
Zhijie Shen

More Information


An increasing number of recorded videos are being tagged with geographic properties of the camera scenes. This metadata is of significant use for storing, indexing and searching large collections of videos. By considering video-related meta-information, more relevant and precisely delimited search results can be returned. Our system implementation demonstrates a prototype of a georeferenced video search engine (GRVS) that utilizes an estimation model of a camera's viewable scene for efficient video search. For video acquisition, our system provides automated annotation software that captures videos and their respective fields of view (FOV). The acquisition software allows community-driven data contributions to the search engine.
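As a rough illustration of searching by viewable scene, the sketch below models a field of view as a circular sector and tests whether a query point falls inside it. This sector model, the planar coordinates, and all names (`FieldOfView`, `covers`, `heading_deg`, `angle_deg`, `max_distance`) are assumptions for the example and may differ from the estimation model used in the prototype.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldOfView:
    """Hypothetical sector-shaped viewable scene in planar coordinates."""
    x: float             # camera position
    y: float
    heading_deg: float   # viewing direction, counterclockwise from the +x axis
    angle_deg: float     # total viewable angle of the sector
    max_distance: float  # farthest distance considered visible


def covers(fov, px, py):
    """Return True if the point (px, py) lies inside the FOV sector."""
    dx, dy = px - fov.x, py - fov.y
    distance = math.hypot(dx, dy)
    if distance > fov.max_distance:
        return False
    if distance == 0:
        return True  # the camera position itself counts as covered
    bearing = math.degrees(math.atan2(dy, dx))
    # Smallest signed difference between bearing and heading, in [-180, 180).
    diff = (bearing - fov.heading_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= fov.angle_deg / 2.0


fov = FieldOfView(x=0.0, y=0.0, heading_deg=0.0, angle_deg=60.0, max_distance=100.0)
print(covers(fov, 50.0, 10.0))  # point ahead of the camera
print(covers(fov, 0.0, 50.0))   # point off to the side
```

A search engine could apply a test like this per video frame to return only videos whose viewable scene actually contains the queried location, rather than every video recorded nearby.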

People Involved
Roger Zimmermann
He Ma
Jia Hao
Haiyang Ma
Lingyan Zhang
Shunkai Fang

More Information

  • The GRVS prototype search engine is here.
  • See project pages.

LBS Support for Social Networks

Social networking has recently emerged as an important driver in the area of technology development and adoption. In an increasingly connected world, social networking can bridge geographical distance to bring people with similar interests together. Often, technology adoption is a "top-down" process as new technology is initially expensive and hence introduced in affluent areas first. In the proposed project we aim to develop technologies and designs that appeal to a larger audience of possibly under-served communities that currently make limited use of technology. To achieve this objective we will develop technologies for location-based services (LBS) that provide the foundation for innovative applications to improve people's lives. The enabled applications will leverage the techniques to provide services which meet the needs of the targeted group of people, such as learning, healthcare, social interaction, tourism, and agriculture.

With the increasing capabilities of mobile devices, there has been a growing interest in location-based services. Such services implicitly take the user's geographical position into account when the user interacts with various applications, allowing for novel application features. We propose to leverage the basic positional-reference capabilities that are increasingly becoming a standard function even on lower-end handsets.

People Involved
Roger Zimmermann
Ying Zhang
Bernard C.Y. Tan

More Information

  • This project is supported by the Ministry of Education (MOE) Academic Research Fund (AcRF) Tier-1 FRC grant no. T1 251RES0918.
  • See project pages.

III. Previous Projects

Peer-to-peer (P2P) architectures have become the focus of academic attention because they enable very scalable application platforms. In the last few years many P2P protocols have been designed and implemented, and research has investigated the adaptation of P2P architectures for media streaming. The objective of this project is to design a platform that can support large-scale interactive streaming using a P2P architecture.

Interactive streaming applications, for example multi-party audio communications, are very demanding in terms of end-to-end delay among users. Previous research has concluded that a round-trip time (RTT) latency of more than about 250 ms makes voice conversations difficult. In P2P systems the delay is more problematic because nodes are often not directly connected to each other but instead communicate through intermediate overlay nodes. Hence one of the challenges is to realistically model the processing delay introduced at each intermediate node and then dynamically optimize the P2P structure accordingly. We have designed a novel peer-to-peer streaming architecture called ACTIVE that aims to address this challenge. One of the main innovations of ACTIVE is that it maintains a multicast overlay tree that dynamically clusters active users (usually only a fraction of the total number of users), who have more critical demands for low latency. Our results show that ACTIVE significantly reduces the end-to-end delay experienced among active users while remaining capable of providing streaming services to very large multicast groups.
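To make the delay argument concrete, the following sketch estimates the round-trip time over an overlay path as the sum of link latencies plus a processing delay at each intermediate node, and compares it with the roughly 250 ms conversational limit mentioned above. The additive model, the symmetric-path assumption, the fixed per-node processing delay and the function names are simplifications chosen for illustration; they are not the delay model used by ACTIVE.

```python
RTT_BUDGET_MS = 250.0  # approximate limit for comfortable voice conversation


def overlay_rtt_ms(link_latencies_ms, processing_delay_ms):
    """Estimate the RTT over an overlay path.

    link_latencies_ms: one-way latency of each overlay hop, in order.
    processing_delay_ms: assumed delay added by every intermediate node.
    The return path is assumed to mirror the forward path.
    """
    intermediate_nodes = max(len(link_latencies_ms) - 1, 0)
    one_way = sum(link_latencies_ms) + intermediate_nodes * processing_delay_ms
    return 2.0 * one_way


def is_conversational(link_latencies_ms, processing_delay_ms):
    """Return True if the estimated RTT stays within the budget."""
    return overlay_rtt_ms(link_latencies_ms, processing_delay_ms) <= RTT_BUDGET_MS


# Three hops with two intermediate nodes: 2 * (90 + 2 * 10) = 220 ms.
print(overlay_rtt_ms([40.0, 30.0, 20.0], processing_delay_ms=10.0))
# One more hop pushes the path over the budget: 2 * (115 + 3 * 10) = 290 ms.
print(is_conversational([40.0, 30.0, 20.0, 25.0], processing_delay_ms=10.0))
```

Under this model every extra overlay hop costs both its link latency and a processing delay, which is why keeping active speakers close together in the tree matters more than doing so for passive listeners.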

Read more.

Presently, most streaming media systems focus on playback only. With the High-performance Data Recording Architecture (HYDRA) project we focus on an integrated approach that includes real-time live streaming and recording in addition to efficient retrieval. More and more sensor devices (e.g., cameras) can directly produce digital data streams, and many of these devices are network-capable. Hence the need arises to capture and store these streams with an efficient data stream recorder that acts as a stream coordinator: it manages the transmission, recording, and playback of many different data streams simultaneously and provides a central repository for all data. HYDRA aims to provide the same services for all media, independent of their bandwidth requirements, resolution or modality. One of the applications that we are exploring for this technology is Distributed Immersive Performance, where musicians and audiences are geographically dispersed across different locations. Extensive experimental data exploring the limits of latency tolerance by musicians have been captured with HYDRA.

The research challenges that we are exploring in this context are optimal memory management with a unified buffer pool and support for streams with different bandwidth and quality-of-service requirements. A novel admission control algorithm was designed based on extensive modeling of a comprehensive set of stochastic parameters such as variable disk transfer rates, variable bitrate compression, and seek and rotational latencies. In addition, the architecture supports a pass-through mode that allows live monitoring of the managed streams. This mode has been used for live, high-definition interactive experiments between USC and multiple other locations (notably Korea).
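The general flavor of statistical admission control can be sketched with a much simpler test than the one described above. The example below admits a new stream only if the mean aggregate bandwidth plus a safety margin of `k` standard deviations stays within the available capacity. This mean-plus-k-sigma rule, the independence assumption, and all names and numbers are illustrative stand-ins; the sketch does not model disk transfer rates or seek and rotational latencies, and it is not the HYDRA algorithm.

```python
import math


def admit(existing, candidate, capacity_mbps, k=3.0):
    """Decide whether a candidate stream can be admitted.

    existing: list of (mean_mbps, std_mbps) tuples for admitted streams.
    candidate: (mean_mbps, std_mbps) tuple for the new stream.
    capacity_mbps: bandwidth the recorder can sustain (assumed constant here).
    k: safety margin in standard deviations.
    Stream bitrates are assumed independent, so their variances add.
    """
    streams = list(existing) + [candidate]
    total_mean = sum(mean for mean, _ in streams)
    total_std = math.sqrt(sum(std * std for _, std in streams))
    return total_mean + k * total_std <= capacity_mbps


admitted = [(4.0, 1.0), (6.0, 2.0)]
# Aggregate mean 15, aggregate std 3, so the demand estimate is 15 + 3 * 3 = 24.
print(admit(admitted, (5.0, 2.0), capacity_mbps=30.0))  # True
print(admit(admitted, (5.0, 2.0), capacity_mbps=20.0))  # False
```

A larger `k` lowers the chance of overcommitting the recorder at the cost of admitting fewer streams.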

Read more.

The objective of the Yima project was to design, implement and evaluate a scalable real-time streaming architecture for large-scale applications such as video-on-demand and distance learning. Yima was a second-generation design that incorporated lessons learned from our first-generation research prototype called Mitra.

Yima formed the basis of the Remote Media Immersion (RMI) project. RMI was a testbed that integrated many of the technologies that resulted from multiple research efforts. The goal of RMI was to reproduce the complete aural and visual ambience of an environment that includes people and other real and virtual elements.

Yima also provided the foundation for our later work on HYDRA (see above).

Read more.

The Geotechnical Information Management and Exchange ITR Project is an NSF-sponsored research collaboration with USC's Civil Engineering department (Jean-Pierre Bardet, PI; Roger Zimmermann, Co-PI) aimed at exploring data management solutions for the exchange and utilization of geotechnical information. The combination of Web and database technologies opens powerful new opportunities for collecting, exchanging, and utilizing geotechnical information. The objectives of the research are to (1) define versatile data structures based on the knowledge of domain experts on selected geotechnical information, (2) have geotechnical domain experts define metadata describing the processes that generate geotechnical information, including the development of automated metadata collection to ease user input, and (3) develop data dissemination tools for geotechnical information that integrate both data and metadata.

We addressed the challenges of this project with a design based on Web services that handle geotechnical data in a portable XML format. Repositories under different administrative control were integrated through a transparent query routing algorithm based on a distributed R-tree index, which efficiently culls irrelevant databases from the search. A middleware layer translated native database data into the public XML format through XSLT wrappers.
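The culling step can be illustrated with a minimal sketch in which each repository advertises a bounding box for its data and a spatial query is forwarded only to repositories whose box intersects the query region. This flat scan over bounding boxes stands in for the distributed R-tree, which would organize the boxes hierarchically; the `BoundingBox` class, the `route_query` function, and the repository names and coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned rectangle, e.g. in longitude/latitude degrees."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def intersects(self, other):
        """Return True if the two rectangles overlap or touch."""
        return (self.min_x <= other.max_x and other.min_x <= self.max_x
                and self.min_y <= other.max_y and other.min_y <= self.max_y)


def route_query(query_box, repositories):
    """Return the names of repositories whose coverage intersects the query.

    repositories: dict mapping a repository name to its BoundingBox.
    """
    return [name for name, box in repositories.items()
            if box.intersects(query_box)]


# Hypothetical repositories with made-up coverage areas.
repositories = {
    "repo_south": BoundingBox(-119.0, 33.0, -117.0, 35.0),
    "repo_north": BoundingBox(-123.0, 37.0, -121.0, 38.5),
}
query = BoundingBox(-118.5, 33.5, -118.0, 34.0)
print(route_query(query, repositories))  # ['repo_south']
```

Only the repositories returned by such a routing step would then need to run the query and translate their results into the shared XML format.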

Read more.