A Machine Learning Approach to Personalized Movie Recommendations Using Content-based Filtering, Web Scraping and NLP Based Sentiment Analysis


  • Yachika S. Yadav Research Scholar, MCA Thakur Institute of Management Studies, Career Development & Research (TIMSCDR), Mumbai, Maharashtra, India
  • Vikas R. Yadav Research Scholar, MCA Thakur Institute of Management Studies, Career Development & Research (TIMSCDR), Mumbai, Maharashtra, India


Content-based filtering, Cosine similarity, API key, Web scraping, Sentiment analysis


This research work presents an improved approach to movie recommendation using content-based filtering and cosine similarity. The proposed system uses each movie's content information and genre when computing item similarity, and recommends a top-10 list of similar movies based on user preferences. Movie data, including title, genre, runtime, rating, and cast, is obtained from the TMDB website using an API key, while reviews are web-scraped from the IMDB website and classified as positive or negative through NLP-based sentiment analysis. The recommendation system is designed to minimize transaction costs and to improve the quality of users' decision-making. The work also highlights the significance of recommendation systems today, given their widespread use in well-known applications.
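
As a rough illustration of the similarity step described above, the following minimal Python sketch ranks movies by the cosine similarity of their bag-of-words tag vectors. It is not the authors' implementation: the toy catalogue, the tag strings, and the names vectorize, cosine and recommend are all assumptions made for this example, and a real system would build the tag text from the TMDB fields (genre, cast and so on) rather than hard-code it.

```python
import math
from collections import Counter

# Hypothetical toy catalogue (an assumption for this sketch); the described
# system would assemble these tags from TMDB metadata instead.
MOVIES = {
    "Alien": "sci-fi horror space sigourney-weaver",
    "Aliens": "sci-fi action space sigourney-weaver",
    "Notting Hill": "romance comedy london hugh-grant",
    "Love Actually": "romance comedy london hugh-grant christmas",
}


def vectorize(text):
    """Turn a whitespace-separated tag string into a term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity of two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(title, movies=MOVIES, top_n=10):
    """Return up to top_n (title, score) pairs most similar to the given title."""
    target = vectorize(movies[title])
    scores = [
        (other, cosine(target, vectorize(tags)))
        for other, tags in movies.items()
        if other != title
    ]
    # Highest similarity first; ties keep catalogue order (sort is stable).
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


if __name__ == "__main__":
    for name, score in recommend("Alien"):
        print(f"{name}: {score:.2f}")
```

Calling recommend("Alien") on this toy data places "Aliens" first, since the two share three of their four tags. With a full catalogue, the default top_n=10 would yield the top-10 list mentioned in the abstract.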

