A Literature Review of Blog Classification
Abstract
Blog classification is a new research topic. Traditional web classification techniques cannot be applied directly to blogs because a blog site's content is updated frequently and its topics vary. The components that make up a blog, such as titles, content and comments, tags (labels), authors, hyperlinks, permalinks, outlinks, and dates and times, are among the objects that need to be involved in the classification process. This paper reviews the blog classification approaches that have appeared since 2009. When blogs first emerged, binary classification was used to distinguish blogs from ordinary web pages. We focus on how to categorize a blog into predefined lists of topics, genres, and opinions (mood and sentiment). For topic and genre classification, kNN, Naive Bayes, CFC, SVM, and other machine learning approaches are widely used. Using topic ontologies and tags can improve classification accuracy. For opinion detection, lexicon-based approaches such as ANEW tend to be used more often. The opinion of a blog site can also be predicted from the opinions surrounding the inlinks that point to that site. This review needs to be broadened and deepened, for example through further involvement of tags, links, and social network analysis.
Keywords: blog classification, sentiment analysis, blog mining
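As a rough illustration of the lexicon-based opinion detection mentioned in the abstract, the sketch below averages per-word valence scores and compares the result with a neutral midpoint. It is a minimal sketch under stated assumptions, not the method of any reviewed paper: the 1-9 scale mirrors the one ANEW uses, but the words and scores in `TOY_LEXICON`, the function names, and the midpoint rule are all hypothetical.

```python
# Toy valence lexicon on a 1-9 scale (the scale ANEW uses).
# The entries below are invented for illustration; they are NOT ANEW values.
TOY_LEXICON = {
    "happy": 8.0,
    "love": 8.5,
    "good": 7.0,
    "sad": 2.0,
    "hate": 1.5,
    "bad": 2.5,
}
NEUTRAL = 5.0  # assumed midpoint of the 1-9 scale


def mean_valence(text, lexicon=TOY_LEXICON):
    """Average the valence of the words found in the lexicon.

    Returns None when no word of the text appears in the lexicon.
    """
    # Naive tokenization: split on whitespace, strip common punctuation.
    words = [w.strip(".,!?;:\"'()").lower() for w in text.split()]
    scores = [lexicon[w] for w in words if w in lexicon]
    return sum(scores) / len(scores) if scores else None


def classify(text, lexicon=TOY_LEXICON, neutral=NEUTRAL):
    """Label a text positive, negative, or neutral by its mean valence."""
    valence = mean_valence(text, lexicon)
    if valence is None or valence == neutral:
        return "neutral"
    return "positive" if valence > neutral else "negative"


if __name__ == "__main__":
    for post in ["I love this good blog!", "Sad, bad post.", "Nothing here"]:
        print(post, "->", classify(post))
```

A real system would replace the toy dictionary with a full lexicon and would usually handle negation and word weighting, which this sketch ignores.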
DOI: https://doi.org/10.21107/simantec.v8i2.7223
Copyright (c) 2020 Husni Husni