Web search is challenging largely because search queries and Web documents use different language styles and vocabularies. This talk presents a statistical-translation-based approach to Web search ranking. We start with a case study of search query spelling correction. We then describe in detail three types of translation models for Web search ranking: (1) word-based models, (2) phrase-based models, and (3) bilingual topic models. For each type, we go through the underlying theory, how the models are trained, and how they are used for Web search ranking.
Jianfeng Gao is a Senior Researcher in the Natural Language Processing group at Microsoft Research, Redmond. From 2005 to 2006, he was a research lead in the Natural Interactive Services Division at Microsoft. From 2000 to 2005, he was a Researcher in the Natural Language Computing group at Microsoft Research Asia. His research interests include Web search and mining, information retrieval, and statistical natural language processing. Additional information is available at http://research.microsoft.com/en-us/um/people/jfgao/.