Smart Crawler: A Two-Stage Crawler for Efficiently Harvesting Deep-Web

Authors

  • Pooja Chandrakant Kathara
  • Nikita Gajanan Kulkarni
  • Aruna Ramesh Vasave
  • Hemalata Arun Gosavi

Abstract

Because the deep web is growing rapidly, there is increasing interest in techniques that efficiently locate deep-web interfaces and sites. However, given the large volume of web resources and the dynamic nature of the deep web, achieving both wide coverage and high efficiency is a challenging problem. We propose a two-stage framework, Smart Crawler, for efficiently harvesting deep-web interfaces. In the first stage, Smart Crawler performs a site-based search for center pages with the help of search engines, which avoids visiting a large number of pages. To obtain more accurate focused-crawling results, Smart Crawler ranks the candidate websites.
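
The site-ranking idea in the first stage can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring rule (fraction of topic terms found in a site description), the function names `score_site` and `rank_sites`, and the example URLs are all assumptions made for illustration only.

```python
import heapq

def score_site(description, topic_terms):
    """Hypothetical relevance score: fraction of topic terms present in the description."""
    words = set(description.lower().split())
    return sum(1 for term in topic_terms if term in words) / len(topic_terms)

def rank_sites(candidates, topic_terms):
    """Return candidate site URLs ordered from most to least relevant."""
    heap = []
    for order, (url, description) in enumerate(candidates):
        # heapq is a min-heap, so the score is negated; `order` breaks ties stably.
        heapq.heappush(heap, (-score_site(description, topic_terms), order, url))
    ranked = []
    while heap:
        _, _, url = heapq.heappop(heap)
        ranked.append(url)
    return ranked

# Made-up candidate sites, standing in for results returned by a search engine.
candidates = [
    ("https://example.com/recipes", "cooking recipes and kitchen tips"),
    ("https://example.org/books", "search used books by author or title"),
    ("https://example.net/library", "library catalog search for books and journals"),
]
topic_terms = ["books", "search", "catalog"]
ranked = rank_sites(candidates, topic_terms)
```

A crawler built along these lines would visit the sites in `ranked` order, so that the most topic-relevant sites are explored before the rest.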

Published

11/01/2021

How to Cite

Chandrakant Kathara, P., Gajanan Kulkarni, N., Ramesh Vasave, A., & Arun Gosavi, H. (2021). Smart Crawler: A Two-Stage Crawler for Efficiently Harvesting Deep-Web. JOURNAL OF WEB ENGINEERING & TECHNOLOGY, 8(2), 24–29. Retrieved from https://stmcomputers.stmjournals.com/index.php/JoWET/article/view/35