Tutorials


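The following tutorials on topical research will be given at VIE'08:

Statistical Design of Quantitative Vision Systems
     Presented by Dr. Neil Thacker

A Short Guide to 'Do's and Don'ts' in Order to Ensure Acceptance of Your Paper in Optical Engineering
     Presented by Mr. Don Braggins

Sports Video Content Analysis and Applications
     Presented by Dr. Changsheng Xu and Prof. Qingming Huang

Building Semantic Detectors and Semantic Spaces for Concept-based Video Search
     Presented by Dr. Chong-Wah Ngo

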
Statistical Design of Quantitative Vision Systems

Abstract: This tutorial aims to provide a general introduction to the use of probability in the design of vision modules and its use in computational vision systems. Emphasis is placed on basic concepts and quantitative use, including hypothesis testing, likelihood and Bayes' theorem. The importance of understanding data quality and evaluating algorithm performance is addressed through measurement error, error propagation, covariance estimation and Monte Carlo techniques. The limitations of established methods will be explained, where necessary, in order to provide an in-depth understanding of the use of quantitative probability as a scientific tool. Taken together, these methods form a framework for the design of computer vision algorithms, leading to a better understanding of what constitutes an appropriate solution to a vision analysis task. By the end of the tutorial, students should be familiar enough with the basic concepts to begin to recognise genuine novelty in published work in this area, as opposed to re-invention of existing statistical methods. They will also understand the appropriate use of these techniques. This in turn should facilitate not only higher standards in their own work, but also better reviewing of conference and journal publications by the next generation of vision researchers.

Intended Audience: The tutorial is designed for an audience with intermediate-level mathematical skills, while the coverage of probability concepts ranges from introductory to intermediate. The material is therefore suitable for students and researchers wishing to gain a better understanding of the scientific basis for algorithm design.
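
As an informal illustration of two of the techniques named in the abstract (error propagation and Monte Carlo estimation), the sketch below compares a first-order propagated uncertainty with a simulated one. It is not taken from the tutorial material: the measured quantity `x / y`, the function names and the numerical values are all assumptions chosen for the example, and the measurement errors are assumed to be independent and Gaussian.

```python
import math
import random
import statistics


def ratio(x, y):
    """Quantity derived from two noisy measurements (hypothetical example)."""
    return x / y


def propagated_sigma(x, y, sigma_x, sigma_y):
    """First-order error propagation for f = x / y with independent errors."""
    df_dx = 1.0 / y          # partial derivative of x / y with respect to x
    df_dy = -x / (y * y)     # partial derivative of x / y with respect to y
    return math.sqrt((df_dx * sigma_x) ** 2 + (df_dy * sigma_y) ** 2)


def monte_carlo_sigma(x, y, sigma_x, sigma_y, n_samples=20000, seed=0):
    """Estimate the same spread by simulating noisy measurements."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    samples = [
        ratio(rng.gauss(x, sigma_x), rng.gauss(y, sigma_y))
        for _ in range(n_samples)
    ]
    return statistics.stdev(samples)


# Assumed measurements: x = 10.0 +/- 0.1 and y = 5.0 +/- 0.05
analytic = propagated_sigma(10.0, 5.0, 0.1, 0.05)
simulated = monte_carlo_sigma(10.0, 5.0, 0.1, 0.05)
print(f"propagated sigma:  {analytic:.4f}")
print(f"Monte Carlo sigma: {simulated:.4f}")
```

With small relative errors such as these, the two estimates should agree closely; they would be expected to diverge when the errors are large enough for the linear approximation to break down.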

Speaker Biography

Dr. Neil Thacker
University of Manchester, UK

Neil Thacker trained in experimental physics and has since spent 20 years working in computer vision and image analysis. During this time he has worked in areas as diverse as computer vision, neural networks and pattern recognition, medical image analysis, and VLSI hardware design. He has published in excess of 200 journal and conference papers. This work is now distributed from the TINA open source computer vision pages (www.tina-vision.net) in the form of a three-volume thesis on Visual Intelligence.



A Short Guide to 'Do's and Don'ts' in Order to Ensure Acceptance of Your Paper in Optical Engineering

Abstract: In calendar year 2007, 66 papers in the category 'Machine Vision and Pattern Recognition' were submitted to the journal, and of these just half were accepted. Don will describe how the review process is implemented, what makes a paper likely to be accepted (or rejected!), and what steps can be taken to ensure that a minimum of revision is required before acceptance.

Speaker Biography

Mr. Don Braggins
Machine Vision Systems Consultancy & Director, UKIVA

Don Braggins has been an independent consultant in Machine Vision since 1983. He helped to found the UK Industrial Vision Association in 1992 and the European Machine Vision Association in 2003. In 1991 the grade of Fellow of SPIE was conferred on him, and in 2005 he was asked to take on the role of Associate Editor for Machine Vision and Pattern Recognition for SPIE's peer-reviewed journal Optical Engineering.



Sports Video Content Analysis and Applications

Abstract: In recent years, extensive research effort has been devoted to sports video content analysis and applications, owing to the wide viewership and high commercial potential of sports video. Technologies and prototypes have been developed to automatically or semi-automatically analyze sports video content, extract semantic events or highlights, and intelligently adapt, enhance and personalize the content to meet users' preferences and network/device capabilities. Many applications have been developed and used in broadcast video enhancement, such as multi-camera based 3D virtual sports events, virtual ad insertion for sports video, and motion analysis systems for sports training. The aim of this tutorial is to provide a brief overview of general video content analysis techniques and a comprehensive overview of the technical achievements in the research area of sports video analysis and applications. We first cover feature extraction methods, which include low-level feature extraction, mid-level representation creation and high-level semantics detection in sports videos. Next, we present the state of the art in sports video analysis from three aspects: structure analysis, event detection, and content adaptation and enhancement. We also address issues concerning test data preparation and performance evaluation for sports video analysis systems. Finally, based on the current technologies used in sports video analysis and the demands of real-world applications, promising future directions and research challenges are discussed.

Intended Audience: This tutorial is intended for educators, researchers, engineers, students and anyone interested in gaining an overall understanding of video content analysis and of sports video analysis and applications. Attendees are expected to have a basic understanding of image, video, audio and text media, and preliminary knowledge of signal processing and machine learning.


Speaker Biographies

Dr. Changsheng Xu
Institute for Infocomm Research, Singapore

Prof. Qingming Huang
Chinese Academy of Sciences, China

Changsheng Xu received the Ph.D. degree from Tsinghua University, Beijing, China in 1996. From 1996 to 1998, he was with the National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China. He joined the Institute for Infocomm Research (I2R), Singapore, in March 1998. His research interests include multimedia content analysis, indexing and retrieval, digital watermarking, computer vision and pattern recognition. He has published over 150 papers in those areas. He is an IEEE Senior Member and a member of ACM. He is an Associate Editor of the ACM/Springer Multimedia Systems Journal. He will serve as Program Co-Chair of ACM Multimedia 2009, Short Paper Co-Chair of ACM Multimedia 2008, General Co-Chair of the 2008 Pacific-Rim Conference on Multimedia (PCM2008) and the 2007 Asia-Pacific Workshop on Visual Information Processing (VIP2007), Program Co-Chair of VIP2006, and Industry Track Chair and Area Chair of the 2007 International Conference on Multimedia Modeling (MMM2007). He has also served as a Technical Program Committee member of major international multimedia conferences, including the ACM Multimedia Conference, the International Conference on Multimedia & Expo, the Pacific-Rim Conference on Multimedia, and the International Conference on Multimedia Modeling.

Qingming Huang received the Ph.D. degree in computer science from Harbin Institute of Technology, Harbin, China in 1994. He was a Postdoctoral Fellow at the National University of Singapore from 1995 to 1996, and worked at the Institute for Infocomm Research, Singapore as Member Research Staff from 1996 to 2002. Currently, he is a professor in the Graduate School of the Chinese Academy of Sciences. He has published over 80 scientific papers. His current research areas are image processing, video analysis, video coding, and pattern recognition.



Building Semantic Detectors and Semantic Spaces for Concept-based Video Search

Abstract: Enabling semantic-based video retrieval has been one of the long-term goals in multimedia computing. Traditional content-based approaches, which derive semantics purely from low-level multimedia features, have proven limited in bridging the so-called "semantic gap". Modern approaches enable semantic search by pooling a set of concepts and forming a semantic space that links the high-level understanding of user queries with low-level features. This search method is generally referred to as concept-based video search (CBVS).

In this tutorial, I will present the two main components of CBVS: semantic detectors and semantic spaces. The techniques for building large-scale semantic concept detectors and for using the detectors to model semantic spaces will be introduced. In the first part, I will describe the development of VIREO-374, a publicly available detector set developed by exploiting the bag-of-visual-words representation. Under this representation, the detection of local interest points, choices of local features, design of word weighting and vocabulary size, stop word removal, feature selection, and the impact of visual word linguistics will be fully discussed. The major differences between "bag-of-words" video retrieval and text retrieval will be highlighted.

In the second part of the tutorial, I will present how to build effective semantic spaces from the available set of semantic detectors. The development will take into account factors such as the ontological relatedness of detectors, coverage of the semantic space, observability and diversity of concepts, and reliability of detectors in videos. The techniques for using the developed semantic spaces for query disambiguation, multi-modality fusion and video search will also be discussed.

Finally, practical and empirical insights into building semantic detectors and spaces for large-scale video search will be demonstrated based on the TRECVID benchmark evaluations. The tutorial will conclude by outlining the challenges of video search in large-scale multimedia databases.

Intended Audience: Researchers, postgraduate students and engineers in multimedia computing, information retrieval, video/image/audio processing, and machine learning, from both academia and industry. The level will range from intermediate to advanced. Attendees are expected to have basic knowledge of multimedia and information retrieval.
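
For readers unfamiliar with the bag-of-visual-words representation mentioned in the abstract, the following is a minimal sketch of its core step: each local descriptor is assigned to its nearest visual word, and the image is described by the resulting histogram of word counts. This is a generic illustration, not the VIREO-374 implementation; the function names, the two-word vocabulary and the 2-D descriptors are hypothetical, and real systems use high-dimensional descriptors, vocabularies learned by clustering, and the word weighting schemes the tutorial discusses.

```python
import math


def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor (Euclidean distance)."""
    return min(
        range(len(vocabulary)),
        key=lambda i: math.dist(descriptor, vocabulary[i]),
    )


def bow_histogram(descriptors, vocabulary):
    """L1-normalised histogram of visual-word counts for one image or keyframe."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = sum(counts)
    # An image with no descriptors maps to an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(vocabulary)


# Toy data (assumed): a two-word vocabulary and three local descriptors.
vocabulary = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (9.0, 9.0)]
histogram = bow_histogram(descriptors, vocabulary)
print(histogram)
```

Two of the three toy descriptors fall nearest the first word, so the first histogram bin receives twice the weight of the second.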


Speaker Biography

Dr. Chong-Wah Ngo
Department of Computer Science
City University of Hong Kong

Chong-Wah Ngo received his PhD in Computer Science from the Hong Kong University of Science and Technology (HKUST). He received his MSc and BSc, both in computer engineering, from Nanyang Technological University (NTU), Singapore. Before joining City University of Hong Kong in 2002, he was with the Beckman Institute of the University of Illinois at Urbana-Champaign. He was also a visiting researcher at Microsoft Research Asia (MSRA). His recent research interests include large-scale multimedia information retrieval and video computing. He has served as a technical program committee member for various major multimedia-related conferences, including ACM Multimedia (MM), the International Conf. on Image and Video Retrieval (CIVR) and the International Conf. on Multimedia and Expo (ICME). He is the leader of the video retrieval group (VIREO) at CityU: http://vireo.cs.cityu.edu.hk/. He has also recently been serving as the chairman of ACM (Hong Kong Chapter).
