T1: Web Search and Browse Log Mining: Challenges, Methods, and Applications

Daxin Jiang


Huge amounts of search log data have been accumulated by search engines. Currently, a commercial search engine receives billions of queries and collects terabytes of log data every day. Besides search log data, browse logs can be collected by client-side browser plug-ins, which record browsing information if users grant permission. Such massive amounts of search and browse log data, on the one hand, provide great opportunities to mine the wisdom of crowds and improve search results as well as online advertising. On the other hand, designing effective and efficient methods to clean, model, and process large-scale log data also presents great challenges.

In this tutorial, I will focus on mining search and browse log data for search engines. I will start with an introduction to search and browse log data and an overview of frequently used data summarizations in log mining. I will then elaborate on how log mining applications enhance the five major components of a search engine, namely, query understanding, document understanding, query-document matching, user understanding, and monitoring and feedback. For each component, I will survey the major tasks, fundamental principles, and state-of-the-art methods. Finally, I will discuss the challenges and future trends of log data mining.

Speaker Biography:

Daxin Jiang's research focuses on information retrieval and log data mining. He received his Ph.D. in computer science from the State University of New York at Buffalo. He has published extensively in prestigious conferences and journals, and has served as a PC member of numerous conferences. He received the Best Application Paper Award of SIGKDD'08 and the Runner-up for Best Application Paper Award of SIGKDD'04. Daxin Jiang has been working on the development of Microsoft search engines, including Live Search and Bing. His publication list can be found at http://research.microsoft.com/en-us/people/djiang/.

T2: Managing Social Image Tags: Methods and Applications

Aixin Sun
Sourav S Bhowmick


With the advances in digital photography (e.g., digital cameras and mobile phones) and social media sharing web sites, a huge amount of multimedia content is now available online. Most of these sites enable users to annotate web objects, including images, with free tags. A key consequence of the availability of such tags as metadata is that it has created a framework that can be effectively exploited to significantly enhance our ability to understand social images. Such understanding paves the way for the creation of novel and superior techniques and applications for searching and browsing social images contributed by common users. The objective of this tutorial is to provide a comprehensive background on state-of-the-art techniques for managing tags associated with social images.

The tutorial is structured as follows. In the first part, we provide a comprehensive understanding of social image tags. We present a brief survey of studies related to the motivation behind tagging and the impact of the various tagging systems that users employ to create tags. We shall use Flickr as an example tagging system to illustrate various concepts. In the second part, we shall describe state-of-the-art techniques for measuring the effectiveness of tags in describing their annotated resources (social images). Specifically, we shall describe techniques that enable us to quantitatively measure a tag's ability to describe the image content of social images. Note that this issue is one of the most fundamental problems in multimedia analysis, search, and retrieval. The third part of the tutorial is devoted to describing state-of-the-art techniques for discovering relationships between tags and how such knowledge is useful for various tag-based social media management applications such as tag recommendation, tag disambiguation, and tag-based browsing systems. We conclude by identifying potential research directions in this area.

Speaker Biographies:

Aixin Sun is an Assistant Professor with the School of Computer Engineering (SCE), Nanyang Technological University (NTU), Singapore. He received his B.A.Sc (First class honors) and Ph.D. in 2001 and 2004 respectively, both in Computer Engineering from NTU. He was a postdoctoral fellow with the School of Computer Science and Engineering (CSE) at The University of New South Wales (UNSW), Sydney, Australia. His current research interests include information retrieval, text/web/data mining, social computing, and digital libraries. He has published more than 60 papers in major international conferences and journals such as ACM SIGIR, ACM WSDM, ACM Multimedia, ACM CIKM, ACM/IEEE JCDL, IEEE ICDM, PAKDD, IEEE TKDE, DSS, JASIST, IP&M and KAIS. Aixin serves as a program committee member of various conferences and as a reviewer for various journals in information retrieval, data mining, and related areas. He is a member of ACM and a member of IEEE.

Sourav S Bhowmick is an Associate Professor in the School of Computer Engineering, Nanyang Technological University, and the Director of the Centre for Advanced Information Systems (CAIS). He is currently Visiting Associate Professor at the Biological Engineering Division, Massachusetts Institute of Technology (MIT), USA. He also holds the position of Singapore-MIT Alliance (SMA) Fellow in the Computation and Systems Biology program (2005-2010). Sourav received his Ph.D. in computer engineering in 2001. His current research interests include tree and graph data management, social media and web data management, data mining, and computation and systems biology. He has published more than 100 papers in major international database and data mining conferences and journals such as VLDB, IEEE ICDE, ACM WWW, ACM SIGMOD, ACM SIGKDD, ACM MM, ACM CIKM, ER, PAKDD, IEEE TKDE, ACM CS, Information Systems, and DKE. Sourav serves as a PC member of various database and data mining conferences and workshops and as a reviewer for various journals. He has served as a program chair/co-chair of several international conferences and workshops. He is a member of the editorial boards of several international journals. Sourav has been a tutorial speaker at several international conferences such as ER 2006, APWeb 2008, WAIM 2008, and PAKDD 2009. He has co-authored a book entitled "Web Data Management: A Warehouse Approach" (Springer-Verlag, October 2003). Sourav received the Best Interdisciplinary Paper Award (along with Q Zhao, M Mohania, Y Kambayashi) at ACM CIKM 2004.

T3: Searching, Analyzing and Exploring Databases

Yi Chen
Wei Wang
Ziyang Liu


Keyword-based search, analysis, and exploration enable users to easily access databases without the need to learn a structured query language or to study possibly complex data schemas. Supporting keyword-based search, analysis, and exploration on databases has become an emerging hot area in database research and development due to its substantial benefits. Researchers from different disciplines are working together to tackle various challenges in this area.

This tutorial aims to outline the problem space of supporting keyword-based search, analysis, and exploration on databases, introduce representative and state-of-the-art techniques that address different aspects of the problem, and discuss further challenges and potential future research directions. The tutorial will provide researchers and developers with a systematic and organized view of the techniques related to this topic.

Speaker Biographies:

Yi Chen is an Assistant Professor in the Department of Computer Science and Engineering at Arizona State University, USA. She received her Ph.D. degree in Computer Science from the University of Pennsylvania in 2005. She is a recipient of an NSF CAREER award and an IBM faculty award. Her current research interests focus on empowering non-expert users to easily access diverse structured data, in particular, searching and optimization in the context of databases, information integration, workflows, and social networks (http://www.public.asu.edu/~ychen127/).

Wei Wang is a Senior Lecturer in the School of Computer Science and Engineering at the University of New South Wales, Australia. He received his Ph.D. degree in Computer Science from Hong Kong University of Science and Technology in 2004. His recent research interests are integration of database and information retrieval technologies, similarity search, and spatial-temporal databases (http://www.cse.unsw.edu.au/~weiw/).

Ziyang Liu is a Ph.D. candidate and an SFAz (Science Foundation Arizona) Graduate Fellowship recipient in the Department of Computer Science and Engineering at Arizona State University. He joined Arizona State University in August 2006 and received his M.S. degree in Computer Science in May 2008. His current research focuses on keyword search on structured and semi-structured data and on workflow management (http://www.public.asu.edu/~zliu41/).