INTRODUCTION
1.1 GENERAL
The wide adoption of the Internet has fundamentally altered the ways in which users communicate, gather information, conduct business and make purchases. As use of the World Wide Web increases, so does the amount of data on the Internet. A few sites consist of millions of pages, while millions of sites contain only a handful of pages. A few sites contain millions of links, but many sites have only one or two. Millions of users flock to a few select sites, giving little attention to millions of others. The expansion of the World Wide Web (Web for short) has produced a large amount of data that is now, in general, freely available for user access. These different types of data must be managed and organized in such a way that different users can access them efficiently through a search engine.
The World Wide Web is a collection of Web sites and their contents. The Web evolves continuously and changes dynamically: new Web sites are born while old ones disappear, and the contents of existing sites are updated at any time. Although the Web contains a vast amount of information and provides access to it from any place at any time, that information remains out of reach without efficient search tools. Efficient searching of Web content has become more important than ever before as the Web evolves and the number of users grows explosively. Portal sites with search engines are at present the most popular and commonly used tools for searching Web content.
The first version of the WWW (what most people call “the Web”) provided a means for people around the world to exchange information, to work together, to communicate, and to share documentation more efficiently. Tim Berners-Lee wrote the first browser (called the WWW browser) and Web server in March 1991, allowing hypertext documents to be stored, fetched, and viewed. The Web can be seen as a tremendous document store in which these documents (Web pages) can be fetched by typing their address into a Web browser. To make this possible, two important techniques were developed. First, a language called the Hypertext Markup Language (HTML) tells computers how to display documents containing text, photos, sounds, video, animation, and interactive content. Second, the Hypertext Transfer Protocol (HTTP) defines how browsers request these documents and how servers deliver them.
There is a complex debate over whether the Internet is making society smarter or dumber. The debate focuses on the Internet and the intellect of individuals, and on whether the Internet hinders or advances society as a whole. Some critics argue that the Internet contributes to the decline of our mental faculties. Others, on the other hand, argue that the Internet promotes and encourages literacy through its ability to provide a limitless amount of information at the stroke of a key.
When searching on the Internet, one may sometimes find it difficult to know where to start. With a seemingly limitless amount of information available, one should use the resource best suited to the searcher's needs and tastes. Comparing factors such as databases, directory types, and the strengths and weaknesses of two search engines, such as Yahoo! and Lycos, can give someone looking for a starting point an advantage.
The data is then sent back through the system to the original user. The information in the returned data could have come from a wide array of sources, such as books, financial markets or embedded chips, or it could even have been made up by someone trying to fool the user.
(King-Lup Liu, 2001) Given the countless search engines on the Internet, it is difficult for a person to determine which ones can serve his or her information needs. A common approach is to build a metasearch engine on top of these search engines. After receiving a user query, the metasearch engine forwards it to those underlying search engines that are likely to return the desired documents for the query. The selection algorithm used by a metasearch engine to decide whether a search engine should receive the query typically bases its decision on the search engine representative, which contains characteristic information about that search engine's database. However, an underlying search engine may not be willing to provide the required information to the metasearch engine. This work demonstrates that the required information can be estimated from an uncooperative search engine with good accuracy. Two pieces of information that permit accurate search engine selection are the number of documents indexed by the search engine and the maximum weight of each term, and techniques are presented for estimating both.
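As an illustration, the selection step described above can be sketched as follows. This is a minimal sketch, not the estimation technique of Liu (2001): the engine names, statistics and the simple size-times-weight scoring formula are all hypothetical placeholders standing in for the two estimated quantities (collection size and maximum term weight).

```python
# Hypothetical sketch of metasearch engine selection: rank component
# search engines by a usefulness score computed from two statistics
# per engine -- estimated collection size and the maximum weight each
# query term can attain in that engine's collection.

def rank_engines(query_terms, engine_stats):
    """engine_stats maps engine name -> (doc_count, {term: max_weight})."""
    scores = {}
    for engine, (doc_count, max_weights) in engine_stats.items():
        # Sum the maximum weight each query term can attain, then scale
        # by collection size: large collections with highly weighted
        # query terms are the most promising targets for the query.
        term_score = sum(max_weights.get(t, 0.0) for t in query_terms)
        scores[engine] = doc_count * term_score
    return sorted(scores, key=scores.get, reverse=True)

# Illustrative (made-up) statistics for two fictitious engines.
stats = {
    "EngineA": (1_000_000, {"health": 0.9, "diet": 0.4}),
    "EngineB": (50_000, {"health": 0.95, "diet": 0.8}),
}
print(rank_engines(["health", "diet"], stats))  # EngineA ranks first
```

In practice the two statistics would themselves be estimated by probing the uncooperative engine, which is the contribution of the cited work; here they are simply given.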
People who require information, and in particular specialized information on topics such as “health”, use dedicated sites on the Web, termed Internet search engines, that assist in retrieving information stored on other sites (Franklin, 2000). Many search engines are available, and some are designed for specific purposes; the two most popular all-purpose search engines are Google and Yahoo. Medical search engines are distinctively designed for retrieving health-related information.
The Internet - The Good, the Bad, and the Ugly. The Internet is a computer-based global information system composed of many interconnected computer networks. Each network may link thousands of computers, enabling them to share information. The Internet has brought a transformation to many aspects of life.
Composed of Web sites interconnected by hyperlinks, the World Wide Web can be seen as an enormous yet chaotic source of information. For decision making, many business applications must rely on the Web in order to aggregate information from various sites. Automatic data extraction plays an essential part in processing the results returned by search engines after a user submits a query. Nowadays, Web sites have become ever more important to our lives; it is hard to manage even one day without them. It has therefore become necessary that Web sites be informative and attractive. However, Web sites often grow haphazardly, whether deliberately or unwittingly.
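The extraction step mentioned above can be sketched in miniature. This is an illustrative sketch only, using the Python standard library; a real extractor would be tailored to the markup of a specific search engine's result pages, and the sample page below is made up.

```python
# Illustrative sketch of automatic data extraction: pull the link
# targets and anchor texts out of an HTML results page, the raw
# material from which result lists are assembled.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []      # collected (href, anchor_text) pairs
        self._href = None    # href of the <a> tag currently open
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

# A made-up fragment of a results page.
page = '<html><body><a href="http://example.com">Example site</a></body></html>'
parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # [('http://example.com', 'Example site')]
```

Structured fields such as titles, snippets and prices are extracted the same way, by matching the page's markup patterns rather than only its anchors.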
Nevertheless, it has attracted considerable attention only in recent years [41-58, 60-64]. Targeted crawlers confine the crawling process to a certain set of topics that characterize a narrow area of the Web. A focused, or topical, Web crawler attempts to download pages relevant to a set of pre-defined topics. Hyperlink context forms an important part of the Web-based information retrieval task. Topical crawlers follow the hyperlinked structure of the Web, using available information to direct themselves toward topically relevant pages. To derive this information, they mine the contents of pages that have already been fetched in order to prioritize the fetching of unvisited pages. Topical crawlers thus depend heavily on contextual information, because they must predict the benefit of downloading unvisited pages from information derived from pages already downloaded. One of the most common predictors is the anchor text of the hyperlinks [59]. Domain-specific search engines use these targeted crawlers to download selected pages relevant to their domains.
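The anchor-text prioritization just described can be sketched as a priority queue over unvisited URLs. This is a minimal sketch under simplifying assumptions: the topic is a flat set of terms, the score is plain term overlap rather than a trained relevance model, and the URLs are illustrative, not real endpoints.

```python
# Minimal sketch of a topical crawler's frontier: unvisited URLs are
# prioritized by how well the anchor text of the hyperlink pointing to
# them matches a set of topic terms, so topically promising pages are
# fetched first.
import heapq

TOPIC = {"solar", "energy", "photovoltaic"}

def anchor_score(anchor_text):
    # Fraction of topic terms appearing in the hyperlink's anchor text.
    words = set(anchor_text.lower().split())
    return len(TOPIC & words) / len(TOPIC)

frontier = []  # min-heap of (-score, url); scores negated so best pops first

def enqueue(url, anchor_text):
    heapq.heappush(frontier, (-anchor_score(anchor_text), url))

enqueue("http://a.example/page1", "cheap solar energy panels")
enqueue("http://b.example/page2", "celebrity gossip news")
enqueue("http://c.example/page3", "photovoltaic solar energy cells")

best = heapq.heappop(frontier)[1]
print(best)  # the link with the most topical anchor text is fetched first
```

A full crawler would fetch `best`, mine the downloaded page for new links, and enqueue those in turn, so that the frontier is continually re-ranked by the contextual evidence gathered so far.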
In the first discussion the author mentioned EDM, FTR and the WWW. Many EDM vendors had to promote “off the peg” systems, meaning that these systems relied on automated indexing for retrieval. FTR, in turn, is limited by its reliance on natural language, because different records creators will use different terms to express the same or closely related meanings. The WWW, meanwhile, is in great demand and can facilitate searching, as can be seen from the “hits” produced after conducting a search.
In today’s world 3 billion people are on the Internet, but 4 billion people are not. At the beginning of my study on the future of the Internet, I asked myself this question: is it possible that everyone could be online and globally connected? I then asked myself how, if everyone were online, the future of the Internet would change the experience of everyday life. Looking back, the Internet is still a relatively new phenomenon: it was first conceived in the 1960s by a computer scientist named J.C.R. Licklider. He envisioned a network of computers, called the galactic network, which would allow humans to share information instantly. Over time this is how the Internet developed, as many of these networks that shared information were interconnected.
Communication--it is a fundamental part of our everyday lives. It characterizes who we are, what we do, and how we relate to others in society. It is a very powerful tool that serves many different uses for our basic needs and survival. At a very simplistic level, it is key to attaining our most basic needs for survival; in that respect, it is key to achieving every need in Maslow's hierarchy. Its uses and possibilities are endless.
Online Web sites are crawled and archived so that they remain available for any special purpose [16].
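The crawl-and-archive idea can be sketched as follows. This is a hedged sketch under stated assumptions: the fetch step is omitted (a real crawler would download each URL over HTTP), the archive is an in-memory dictionary rather than durable storage, and the URLs and timestamps are made up.

```python
# Sketch of crawl-and-archive: each fetched page is stored under a
# (url, timestamp) key so that earlier snapshots remain available for
# later special-purpose use, instead of being overwritten.
import time

archive = {}  # (url, timestamp) -> page content

def archive_page(url, content, now=None):
    # 'now' lets callers inject a timestamp; by default the clock is used.
    ts = int(now if now is not None else time.time())
    archive[(url, ts)] = content
    return ts

def snapshots(url):
    # All archived timestamps for one URL, oldest first.
    return sorted(t for (u, t) in archive if u == url)

# Two crawls of the same (fictitious) page at different times.
archive_page("http://example.com", "<html>v1</html>", now=1000)
archive_page("http://example.com", "<html>v2</html>", now=2000)
print(snapshots("http://example.com"))  # [1000, 2000]
```

Because every snapshot is keyed by its crawl time, a later consumer can reconstruct how a page looked at any archived moment, which is exactly what makes archived crawls useful for special-purpose analysis.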
Athanasios Papagelis et al. [2] worked on the bottom-up approach, which has been characterised as a hybrid bottom-up search engine that produces search results based on user-provided Web-related data and its sharing.
Following the success of Netscape and its Web browser, the Internet became a resource and communication platform idolized by many IT students in universities. What started as hobby-cum-research work [1] by Jerry Yang (now Chief of Yahoo!) and David Filo (co-founder of Yahoo!) for their Ph.D. dissertations evolved into an Internet sensation over time. What they did was compile all their favourite Web links into an online directory for easy navigation of the World Wide Web. The duo’s work immediately garnered a great deal of attention from surfers in the Internet world, and before they realized it, Yahoo! had become one of the most highly visited websites of all time. The duo saw the