Technologies on Display

B2 - XVIP: The XML-based Video Information Processing System

XVIP (XML-Based Video Information Processing System) provides an end-to-end solution that runs from raw video input, through information extraction, to video searching and streaming delivery. Its key functions are:

  • Automatic understanding of video input by speech recognition (Microsoft SAPI or IBM ViaVoice), on-screen character recognition, and other data extraction and enrichment techniques
  • Indexing of the information extracted by machine understanding, which makes the video content searchable
  • A Video Search Engine front-end for Web and WAP users on various device types, through which users can search for the video segments that interest them
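
To make the indexing and searching step concrete, the following is a minimal sketch of a keyword index over recognized text, assuming each video segment already carries a transcript produced by speech recognition or on-screen character recognition. The segment ids, timings, transcripts and the function names build_index and search are hypothetical and chosen for illustration; they are not taken from XVIP itself.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9]+")


def build_index(segments):
    """Map each lowercased word to the ids of the segments whose text contains it."""
    index = defaultdict(set)
    for seg_id, (_start, _end, text) in segments.items():
        for word in WORD.findall(text.lower()):
            index[word].add(seg_id)
    return index


def search(index, query):
    """Return the sorted ids of segments that contain every word in the query."""
    words = WORD.findall(query.lower())
    if not words:
        return []
    # AND semantics: intersect the posting sets of all query words.
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


# Hypothetical recognized text: id -> (start_sec, end_sec, transcript)
SEGMENTS = {
    "news-001": (0.0, 42.5, "Stock market closes higher today"),
    "news-002": (42.5, 90.0, "Typhoon signal raised as storm approaches"),
    "news-003": (90.0, 130.0, "Market traders react to storm warning"),
}

INDEX = build_index(SEGMENTS)
```

A query such as search(INDEX, "storm market") returns only the segments that mention both words. The stored start and end times are what would let a front-end begin streaming at the matching segment rather than at the start of the programme.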


  • Automatic information extraction, indexing and searching on television news
  • Video searching and access through Internet and smart devices


  • Several video processing techniques, such as speech recognition, video OCR and color histogram analysis, combined to achieve machine understanding of video
  • Heterogeneous client access, including Internet access and smart device access through HTTP or WAP
  • A distributed collaboration approach based on Web Services, which lets different research institutes work together and provides a workflow for the practical production of interoperable digital libraries
  • An XML-based interoperable and scalable standard for different extraction technique modules
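
As an illustration of how an XML format can let independent extraction modules exchange results, the sketch below writes and reads a small record with Python's standard xml.etree.ElementTree. The XVIP schema is not given here, so the element and attribute names (videoInfo, segment, videoId, module, start, end) are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET


def to_xml(video_id, module, results):
    """Serialize one module's output as an XML string.

    results is a list of (start_sec, end_sec, text) tuples.
    Element and attribute names are hypothetical, not the XVIP schema.
    """
    root = ET.Element("videoInfo", {"videoId": video_id, "module": module})
    for start, end, text in results:
        seg = ET.SubElement(
            root, "segment", {"start": f"{start:.1f}", "end": f"{end:.1f}"}
        )
        seg.text = text
    return ET.tostring(root, encoding="unicode")


def from_xml(xml_text):
    """Parse a record produced by to_xml back into Python values."""
    root = ET.fromstring(xml_text)
    segments = [
        (float(s.get("start")), float(s.get("end")), s.text or "")
        for s in root.findall("segment")
    ]
    return root.get("videoId"), root.get("module"), segments


# Hypothetical output from a speech recognition module.
record = to_xml("news-2002-06-01", "speech", [(0.0, 42.5, "Stock market closes higher")])
parsed = from_xml(record)
```

Because every module would emit the same outer structure, a new extraction technique could be added without changing the indexer, which only needs to read segment elements.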


Principal Investigator
Prof. Michael Lyu
Department of Computer Science and Engineering

The backend processor of our system for extracting video information

The web interface for keyword-based video searching and streaming delivery

XVIP system architecture

The different client interfaces that XVIP supports, including web on PC, web on PDA and WAP on mobile phones