Phishing URL Detection Using Machine Learning Classification Algorithms

Authors

  • Jeremiah Paul Richard
  • Matthias Daniel

Abstract

Phishing attacks are used to steal data such as usernames, passwords, bank account details, and credit card details. Today, phishing is the best-known form of cybercrime. Phishing attacks also affect the online payment sector, financial institutions, file hosting or cloud storage services, and many others. They mostly target websites associated with online payment and webmail. To stop phishing attacks, a variety of methods are employed, including blacklists, heuristics, and visual similarity. The model proposed in this research, however, combines logistic regression (LR) and decision tree (DT) classifiers, with variable parameters governing the design and training of the classifiers, as used in data mining techniques applied to real-world problems. The model was developed in the Python programming language (Spyder IDE) using the Sklearn built-in data source library. The system was successfully tested against the design specification using the logistic regression and decision tree models, which achieved accuracies of 86.33% and 87.62%, respectively, compared with 81.42% for the existing system.
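
The comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is installed. The abstract does not specify the feature set, the dataset, or how the two classifiers are combined, so everything below is an assumption made for illustration: the lexical features in `url_features`, the hand-written toy URLs (all on reserved example domains and documentation IP ranges), and the choice to train the two models separately and compare their accuracy. It is not the authors' implementation and will not reproduce the reported figures.

```python
import re
from urllib.parse import urlparse

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier


def url_features(url):
    """Hypothetical lexical features; NOT the feature set used in the paper."""
    host = urlparse(url).hostname or ""
    is_ip = re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) is not None
    return [
        len(url),                         # total URL length
        host.count("."),                  # number of dots in the host name
        url.count("-"),                   # number of hyphens
        int("@" in url),                  # '@' can hide the real host
        int(is_ip),                       # host given as a raw IPv4 address
        int(url.startswith("https://")),  # HTTPS scheme
    ]


# Toy data for illustration only (label 1 = phishing, 0 = legitimate).
train = [
    ("https://www.example.com/", 0),
    ("https://docs.example.org/guide/start", 0),
    ("https://mail.example.net/inbox", 0),
    ("https://shop.example.com/cart", 0),
    ("https://news.example.org/2023/01/story", 0),
    ("http://192.0.2.10/secure/login.php", 1),
    ("http://example.com@203.0.113.5/verify", 1),
    ("http://secure-login.account-update.example.test/confirm-password", 1),
    ("http://pay.example.com.billing-check.example.test/sign-in", 1),
    ("http://198.51.100.7:8080/webmail/reset", 1),
]
test = [
    ("https://blog.example.org/post", 0),
    ("https://www.example.net/about", 0),
    ("http://203.0.113.77/account/update", 1),
    ("http://login-verify.example.test/secure-bank-login", 1),
]

X_train = [url_features(u) for u, _ in train]
y_train = [label for _, label in train]
X_test = [url_features(u) for u, _ in test]
y_test = [label for _, label in test]

# Train each classifier separately and record its accuracy on the toy test set.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in scores.items():
    print(f"{name}: accuracy = {acc:.2f}")
```

With so few samples the printed accuracies say nothing about real-world performance; the sketch only shows the shape of the workflow (feature extraction, fitting both models, scoring on held-out URLs).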

Published

01/16/2023

How to Cite

Jeremiah Paul Richard, & Daniel, M. (2023). Phishing URL Detection Using Machine Learning Classification Algorithms. JOURNAL OF WEB ENGINEERING & TECHNOLOGY, 9(3), 22–33. Retrieved from https://stmcomputers.stmjournals.com/index.php/JoWET/article/view/407