Design and Implementation of a MapReduce-Based Algorithm for Evaluating the Influence of Weibo Users
FANG Chao, ZHOU Bin, LI Aiping
(College of Computer, National University of Defense Technology, Changsha, Hunan 410005, China; FANG Chao: 190772662@qq.com)
Abstract: With the rapid development of the Internet and the arrival of the Web 2.0 era, the number of microblog users is growing at a remarkable pace. Sina Weibo currently ranks users by follower count, but zombie followers (fake accounts) and a large number of rarely used accounts make this simple criterion a poor indicator of a user's influence. Taking large-scale Sina Weibo data as the object of analysis, this paper builds an influence evaluation model for Weibo users on a distributed system. The model first computes each user's microblog influence from the retweet network, then computes the user's potential influence (popularity) from the follow relationships, and finally combines the two into a single influence evaluation model. Experiments and analysis show that this method reflects the real influence of Weibo users more accurately than follower count alone and is suitable for measuring that influence.
Key words: Weibo; influence; PageRank; MapReduce
CLC number:    Document code: A    Article ID:
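The pipeline summarized in the abstract (a PageRank-style score on the retweet network, a second score on the follow network, and a final combination) can be illustrated with a minimal Python sketch. It is a single-process stand-in for MapReduce jobs, not the authors' implementation. The function names, the damping factor of 0.85, the weight `alpha`, and the linear combination rule are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict


def pagerank_mapreduce(edges, damping=0.85, iterations=30):
    """PageRank written as repeated map and reduce steps.

    edges: list of (source, target) pairs. Rank flows from source to target,
    e.g. (retweeter, original_author) or (follower, followee).
    damping and iterations are illustrative defaults (assumptions).
    """
    nodes = sorted({user for edge in edges for user in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {user: 1.0 / n for user in nodes}

    for _ in range(iterations):
        # Map step: every user emits an equal share of its rank to each target.
        emitted = []
        dangling_mass = 0.0
        for user in nodes:
            targets = out_links.get(user)
            if targets:
                share = rank[user] / len(targets)
                emitted.extend((target, share) for target in targets)
            else:
                # Users with no outgoing edges spread their rank uniformly.
                dangling_mass += rank[user]

        # Shuffle and reduce step: sum the shares received by each user.
        incoming = defaultdict(float)
        for target, share in emitted:
            incoming[target] += share

        rank = {
            user: (1.0 - damping) / n
            + damping * (incoming[user] + dangling_mass / n)
            for user in nodes
        }
    return rank


def combine_influence(retweet_rank, follow_rank, alpha=0.5):
    """Hypothetical combination: a weighted sum of the two scores.

    alpha is an assumed weight, not a value taken from the paper.
    """
    users = set(retweet_rank) | set(follow_rank)
    return {
        user: alpha * retweet_rank.get(user, 0.0)
        + (1.0 - alpha) * follow_rank.get(user, 0.0)
        for user in users
    }


# Toy data (invented for illustration only).
retweets = [("b", "a"), ("c", "a"), ("d", "a"), ("d", "b")]
follows = [("b", "a"), ("c", "b"), ("d", "b"), ("a", "b")]

retweet_rank = pagerank_mapreduce(retweets)
follow_rank = pagerank_mapreduce(follows)
influence = combine_influence(retweet_rank, follow_rank)
```

In a real Hadoop deployment each iteration would typically be one MapReduce job, with the map step emitting `(target, share)` pairs and the reduce step summing them per user; the in-memory loop above only mirrors that structure.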
Table 4: Final experimental results
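About the authors
FANG Chao, male, born 1988, master's student. Research interest: Web data mining.
ZHOU Bin, male, born 1971, Ph.D., master's supervisor, research fellow. Research interests: Web data mining, social network analysis, distributed computing.
LI Aiping, male, born 1974, Ph.D., associate research fellow. Research interests: network security, social network analysis, distributed computing.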