A Brief Survey on Content-based Fake News Detection


  • Shreenivas Choudhary, Student, Department of Computer Engineering, National Institute of Technology, Kurukshetra, Haryana, India
  • Sanjay Kumar Jain, Professor, Department of Computer Engineering, National Institute of Technology, Kurukshetra, Haryana, India


Fake News Detection, Content-based, Machine Learning, Natural Language Processing (NLP), Propagation-based, TF-IDF


The growing spread of fake news undermines democracy and public confidence, which has sharply increased the demand for reliable fake news detection. Recent developments in this area have opened up approaches that identify fake news by examining how it propagates on social media. However, propagation information is not available for early detection: at an early stage a fake news item has only been created, and dissemination happens later. The speed and volume at which news is created and propagated online add to the damage. As a result, there is an urgent need for methods that detect fake news based on news content alone and eliminate the danger before it reaches a mass audience. This study is divided into five sections. The first is an introduction that describes the topic and the motivations behind this survey. The second, Fundamental Theories, discusses the current theories used to detect fake news. The third briefly covers the advantages of content-based detection. The fourth surveys content-based fake news detection and lists the advantages and disadvantages of the available techniques. The fifth closes the study with a discussion of potential topics for further investigation.

