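Information Systems Laboratory, Department of Intelligence Science, Peking University, Beijing 100871
Abstract: To address the problem that current search auto-completion techniques cannot distinguish the query intents of different users, this paper proposes a search auto-completion method that supports user personalization. The method first clusters the queries in the query log; by applying singular value decomposition beforehand to the term-frequency matrix of the clicked documents, it improves on existing query clustering methods based on similar clicked documents. It then uses the query clustering results and each user's query log to represent user interests as query-cluster vectors. As the user types a query string, the method selects suitable search terms according to the user's interests and suggests them to the user. Experiments show that the method can effectively improve the quality of search suggestions.
Keywords: query clustering, singular value decomposition, user personalization, search auto-completion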
Personalized Search Auto-completion Based on Query Clustering
LI Chao WANG Wen-Qing
(Key Laboratory of Machine Perception, Peking University, Beijing 100871, China)
Abstract: Traditional auto-completion techniques provide search suggestions without considering user interests or intents. In this paper, we propose a personalized auto-completion approach. First, we cluster the queries in a query log. Unlike traditional query clustering, which is based on similar clicked documents, we first apply Singular Value Decomposition (SVD) to the term-frequency matrix of the clicked documents, which improves the quality of query clustering. Then, using each user's query log, we represent user interests as vectors over query clusters. During online querying, the suggestions that best match the user's interests are provided. We evaluate our algorithm on a query log from the Sogou search engine. The results show that our method can improve the quality of query suggestions.
Keywords: Query Clustering, SVD, Personalized Search, Auto-completion
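The pipeline summarized in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy term-frequency matrix, the hypothetical helper names (`latent_doc_vectors`, `query_vectors`, `cluster_queries`, `user_interest`, `suggest`), the choice to represent a query as the mean of its clicked documents' latent vectors, the greedy cosine-threshold clustering, and the ranking of candidates by the weight of their cluster in the user's interest vector.

```python
import numpy as np

def latent_doc_vectors(tf, k):
    """Project documents (rows of tf) into a k-dimensional space via truncated SVD."""
    u, s, _ = np.linalg.svd(tf, full_matrices=False)
    return u[:, :k] * s[:k]

def query_vectors(doc_vecs, clicks):
    """Assumption: a query is the unit-normalized mean of its clicked documents' vectors."""
    out = {}
    for query, docs in clicks.items():
        v = doc_vecs[docs].mean(axis=0)
        out[query] = v / np.linalg.norm(v)
    return out

def cluster_queries(qvecs, threshold=0.8):
    """Assumption: greedy clustering; join the first cluster whose seed is similar enough."""
    seeds, labels = [], {}
    for query, v in qvecs.items():
        for i, seed in enumerate(seeds):
            if float(v @ seed) >= threshold:  # cosine similarity of unit vectors
                labels[query] = i
                break
        else:
            seeds.append(v)
            labels[query] = len(seeds) - 1
    return labels

def user_interest(history, labels, n_clusters):
    """User interest as a normalized vector of counts over query clusters."""
    counts = np.zeros(n_clusters)
    for query in history:
        counts[labels[query]] += 1
    return counts / counts.sum()

def suggest(prefix, labels, interest):
    """Rank logged queries matching the prefix by the user's interest in their cluster."""
    candidates = [q for q in labels if q.startswith(prefix)]
    return sorted(candidates, key=lambda q: -interest[labels[q]])

# Toy data (hypothetical): rows are clicked documents, columns are term counts
# for the terms [fruit, pie, iphone, mac].
tf = np.array([[2, 1, 0, 0],
               [1, 2, 0, 0],
               [0, 0, 3, 1],
               [0, 0, 1, 3]], dtype=float)
# Query log (hypothetical): query -> indices of clicked documents.
clicks = {"apple pie": [0, 1], "apple iphone": [2, 3],
          "fruit tart": [0], "mac laptop": [3]}

labels = cluster_queries(query_vectors(latent_doc_vectors(tf, k=2), clicks))
n_clusters = max(labels.values()) + 1
tech_user = user_interest(["mac laptop", "mac laptop", "fruit tart"], labels, n_clusters)
food_user = user_interest(["fruit tart"], labels, n_clusters)

print(suggest("apple", labels, tech_user))
print(suggest("apple", labels, food_user))
```

In this toy setting the same prefix "apple" is completed in a different order for the two users, because their interest vectors weight the two query clusters differently. A real system would also need to handle users with empty histories and queries without clicks, which this sketch leaves out.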
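About the author:
LI Chao, male, born in 1987, received a B.S. degree from the Department of Computer Science, Peking University, in 2008, and is a master's student in the Department of Intelligence Science, Peking University. Research interests: digital libraries and text mining.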