CRAWLING AJAX-BASED WEB APPLICATIONS: EVOLUTION AND STATE-OF-THE-ART
Main Article Content
Abstract
The innovation of AJAX resulted in more responsive, interactive and faster web applications due to the clever amalgamation of JavaScript, HTML, and Cascading Style Sheets (CSS). However, from the user’s perspective, this achievement places many challenges before web search engines. One major challenge is due to the complexities in crawling such web applications because multiple states are associated with one uniform resource locator (URL) that cause a mismatch with search model of web search engines, where a web document is uniquely identified by a single unique URL with a single state. Crawling AJAX-based web applications means giving strength and capability to web search engines so that information produced in these highly-interactive web applications is downloaded and indexed. The need here is to investigate the technicalities of AJAX that shatter the metaphor of a web page which the current web search engine utilize during crawling in order to improve the capabilities of web search engines. Although some academic tools have been developed, they produce some false positives which greatly affect the performance of web search engine. We aim to investigate AJAX and AJAX-based web applications as well as the state-of-the-art in crawling these applications along with some prominent issues, challenges and recommendations