A Note On Detection Algorithm

2.1 PAGE CHANGE DETECTION ALGORITHM
2.1.1 Introduction: About 60% of the content on the web is dynamic. It is quite possible that, after a particular web page has been downloaded, the local copy residing in the repository of web pages becomes obsolete compared with the copy on the web. A need therefore arises to update the database of web pages. Once a decision has been taken to update the pages, it should be ensured that minimal resources are used in the process. This can be done by updating only those elements of the database that have actually changed.

The importance of the web pages to be downloaded was discussed in the preceding section. The scheme described there also checks whether a page is already in the database and lowers its priority value if it is referenced rather frequently. In this section, we discuss some algorithms for deriving certain parameters that help determine whether a page has changed. These parameters are calculated at the time of page parsing. When the client encounters the same URL again, it simply recalculates the code by parsing the page, without downloading the page, and compares it with the parameters currently stored. If a change in the parameters is detected, it is concluded that the page has changed and needs to be downloaded again. Otherwise the URL is discarded immediately without further processing. The following changes are of importance when considering changes in a web page:
• Change in page structure.
• Change in text contents.
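
The comparison described above can be illustrated with a minimal Python sketch. It is not the algorithm developed in this chapter: it assumes, purely for illustration, that the "parameters" are two SHA-256 digests, one over the sequence of opening tags (page structure) and one over the visible text (text contents). The names `page_signature` and `has_changed` are hypothetical, and the sketch takes the HTML as a string, leaving aside how the client obtains it.

```python
import hashlib
from html.parser import HTMLParser


class _SignatureParser(HTMLParser):
    """Collects opening tag names and non-empty text fragments."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Only the tag name is recorded; attributes are ignored here.
        self.tags.append(tag)

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text.append(stripped)


def page_signature(html):
    """Return (structure_digest, text_digest) for an HTML string.

    Assumption: two hashes stand in for the parameters of the text.
    """
    parser = _SignatureParser()
    parser.feed(html)
    parser.close()
    structure = hashlib.sha256(" ".join(parser.tags).encode("utf-8")).hexdigest()
    text = hashlib.sha256(" ".join(parser.text).encode("utf-8")).hexdigest()
    return structure, text


def has_changed(stored_signature, html):
    """True if either the structure or the text digest differs."""
    return page_signature(html) != stored_signature
```

In use, the signature would be stored when the page is first parsed, and `has_changed(stored, new_html)` would be called when the same URL is encountered again. Keeping the two digests separate lets the client tell a structural change from a purely textual one.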
