Malicious-URL Detection using Logistic Regression Technique

  • Vanitha N
  • Vinodhini V
Keywords: URL, Logistic Regression, Machine Learning, Data


Over the last few years, the Web has seen massive growth in the number and kinds of web services. Web facilities such as online banking, gaming, and social networking have evolved rapidly, as has people's reliance on them to perform daily tasks. As a result, a large amount of information is uploaded to the Web every day. While these web services create new opportunities for people to interact, they also create new opportunities for criminals. URLs are the launch pad for many web attacks: a malicious user can steal the identity of a legitimate user simply by sending a malicious URL. Malicious URLs are a cornerstone of illegitimate activity on the Internet. The dangers of these sites have created a need for defences that protect end-users from visiting them. The proposed approach classifies URLs automatically using logistic regression, a machine-learning algorithm for binary classification. The classifier achieves 97% accuracy by learning from phishing URLs.
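
The idea of classifying URLs with logistic regression can be illustrated with a minimal sketch. It is not the authors' implementation: the lexical features in `url_features`, the six toy URLs (built on reserved example domains), the learning rate, and the epoch count are all assumptions made for illustration, and the toy data says nothing about the 97% accuracy reported above.

```python
import math

def url_features(url):
    """Hypothetical lexical features (assumed; the paper's feature set is not given here)."""
    return [
        len(url) / 100.0,                          # scaled URL length
        url.count(".") / 10.0,                     # scaled number of dots
        url.count("-") / 10.0,                     # scaled number of hyphens
        sum(ch.isdigit() for ch in url) / 20.0,    # scaled number of digits
        1.0 if "@" in url else 0.0,                # presence of an '@' character
    ]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(x, w, b):
    """Estimated probability that feature vector x belongs to the malicious class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def log_loss(X, y, w, b):
    """Mean binary cross-entropy; probabilities are clamped to avoid log(0)."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict_proba(xi, w, b), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)

def train(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi     # derivative of loss w.r.t. the logit
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Invented toy data: label 0 = benign, label 1 = malicious.
benign_urls = [
    "https://example.com/",
    "https://example.org/docs/index.html",
    "https://www.example.net/about",
]
malicious_urls = [
    "http://login-verify-account-9482.example.test/secure-update@signin",
    "http://192.0.2.10/bank-confirm-77/index.php",
    "http://free-prize-winner-2024.example.test/claim-now-000111",
]

X = [url_features(u) for u in benign_urls + malicious_urls]
y = [0] * len(benign_urls) + [1] * len(malicious_urls)

w, b = train(X, y)

for u in benign_urls + malicious_urls:
    print(f"{predict_proba(url_features(u), w, b):.3f}  {u}")
```

A URL would be flagged as malicious when its predicted probability exceeds a chosen threshold such as 0.5. In practice a tested library implementation and a much larger labelled dataset would replace this hand-written training loop.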





How to Cite
Vanitha N, & Vinodhini V. (2019). Malicious-URL Detection using Logistic Regression Technique. International Journal of Engineering and Management Research, 9(6), 108-113.