SENTIMENT ANALYSIS OF ELECTRONIC PRODUCT TWEETS USING BIG DATA FRAMEWORK

(Received: 2019-01-08, Revised: 2019-02-27, Accepted: 2019-03-13)
Social media has grown more popular with the advancement of Internet technologies and smartphones, and such platforms encourage users to share their opinions. Social media platforms such as Twitter also play an important role for businesses: from customer opinions about a product, companies learn more about customer preferences. Millions of tweets are generated every year, but handling such a large volume of unstructured tweets is not feasible on traditional platforms. Therefore, big data frameworks such as Hadoop and Spark are used to handle data at this scale. In this paper, a variety of sales tweets are used to analyze customer sentiment regarding electronic products. The experimental results of the proposed work can help businesses make decisions that further enhance product sales.