RESEARCH ISSUES IN WEB MINING
ABSTRACT
The Web is a collection of interrelated files on one or more web servers, while web mining means extracting valuable information from web databases. Web mining is a data mining domain in which data mining techniques are used to extract information from web servers. Web data includes web pages, web links, objects on the web, and web logs. Web mining is used to understand customer behaviour and to evaluate a particular website based on the information stored in web log files. Web mining is carried out using data mining techniques, namely classification, clustering, and association rules. It has some beneficial areas or applications such as
1. INTRODUCTION
Web mining is the application of data mining techniques to web data, which is unstructured or semi-structured; it automatically discovers and extracts potentially useful and previously unknown information or knowledge from the web. The significant web mining applications are website design, web search, search engines, information retrieval, network management, e-commerce, business and artificial intelligence, web marketplaces, and web communities. Online business breaks the barriers of time and space compared with business conducted from a physical office. Big companies around the world are realizing that e-commerce is not just buying and selling over the Internet; rather, it improves their efficiency so that they can compete with other giants in the market. This application includes temporal issues for the users.
Web mining has three categories, namely web content mining, web structure mining, and web usage mining. Each category has its own algorithms and tools. Web content mining is the discovery of valuable information from web documents, which may contain text, images, hyperlinks, metadata, and structured records. It is used to examine information gathered by search engines or web spiders, e.g. Google and Yahoo. It is the process of retrieving useful information from web content or web documents. Web structure mining is also a process of discovering
This section discusses the common traits and ideas observed in the three research topics. Although each of the three articles discusses a unique idea, all of them aim to use web data to produce better results. Web data mining is a hot research topic in the current realm of big data. These papers discuss using valuable user-generated data from social media or browser cookies to provide the best user experience, either to maintain user interest in a company's product or to help an individual make effective decisions. All three articles propose a solution to the stated problem, compare their results with existing models, and report significant improvement.
Here we discuss the common traits and ideas observed in the three research topics. Although these three papers discuss different ideas, they all fall under the web data mining domain. Web data mining is a hot research topic in the current realm of big data. These papers discuss using valuable user-generated data from social media or browser cookies to provide the best user experience, either to maintain user interest in a company's product or to help an individual make effective decisions.
Web analytics refers to the study of user data collected on websites. Online commerce has been the main application area that has driven the development of web analytics in recent years. Nonetheless, the goal of web analytics is to capture and analyze data on the use made of
The world of technology that we live in today has forced companies in almost every industry to use whatever tools are available to stay competitive. One such tool is web analytics: the collection of raw data on users' browsing habits and the assembly of that raw data into clear, comprehensive results. This type of analysis is very useful for companies, as it helps them learn what users are doing, what their habits are, and how best to target them.
Web Analytics is a relatively new phenomenon in the business world, and while it is not a mandatory requirement for organizations to compete in today’s marketplace it is becoming increasingly important as organizations strive to optimize their web presence. So what is web analytics and how can it help companies achieve a better web presence? Web Analytics is defined by the Web Analytics Association as, “The practice of measuring, collecting, analyzing and reporting on Internet data for the purposes of understanding how a web site is used by its audience and how to optimize its usage.” (ClarkU, p. 1) To simplify this definition we can say that web analytics is the process of determining how your website most effectively turns site visitors into customers.
Web analytics is the measurement, collection, analysis and reporting of web data for purposes of understanding and optimizing web usage. It is used to enable a business to attract more visitors, retain or attract new customers for goods or services, or to increase the dollar volume each customer spends.
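As a rough illustration of what such measurement can look like, the sketch below computes a few common figures (visitor count, page views, conversion rate, and revenue per visitor) from a small, made-up event list. The record layout `(visitor_id, page, amount)` and the function name `summarize` are assumptions for this example, not part of any particular analytics tool.

```python
def summarize(events):
    """Compute basic web analytics figures from page-view events.

    Each event is a hypothetical (visitor_id, page, amount) tuple, where
    amount is the money spent on that page view (0 if nothing was bought).
    """
    visitors = set()   # every distinct visitor seen
    buyers = set()     # visitors who spent something at least once
    page_views = 0
    revenue = 0.0

    for visitor_id, _page, amount in events:
        visitors.add(visitor_id)
        page_views += 1
        if amount > 0:
            buyers.add(visitor_id)
            revenue += amount

    n = len(visitors)
    return {
        "visitors": n,
        "page_views": page_views,
        # Share of visitors who became customers.
        "conversion_rate": len(buyers) / n if n else 0.0,
        # Average amount spent per visitor (buyers and non-buyers alike).
        "revenue_per_visitor": revenue / n if n else 0.0,
    }


# Made-up sample data for illustration only.
sample_events = [
    ("a", "/home", 0.0),
    ("a", "/checkout", 30.0),
    ("b", "/home", 0.0),
    ("c", "/home", 0.0),
    ("c", "/checkout", 10.0),
    ("d", "/home", 0.0),
]

report = summarize(sample_events)
```

Attracting more visitors raises `visitors`, converting them raises `conversion_rate`, and increasing spend per customer raises `revenue_per_visitor`, which mirrors the three goals named above.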
Data mining is the process of analysing data to discover meaningful patterns within it, extracting useful information that may not have been discovered yet. Data mining borrows techniques from a variety of fields such as statistics, machine learning, and artificial intelligence. Because of its usefulness, data mining has been used in a range of industries such as banking, telecommunications, retail, marketing, and insurance.
In this case study, we define a scenario in which the client is Trade Me, a giant in New Zealand, and examine how online auctions and transactions are improved by using web analytics. The usability and experience of the client's site are enhanced by applying the science and methodology of web analytics. It is a science because it uses statistics, data mining systems, and a methodological process. Online marketing involves extensive web investigation for validation and verification purposes, which reduces cybercrime; the Web Analytics Maturity Model therefore makes it more precise, straightforward, and practical to define variables that can be implemented in online transactions. While we focus for the most part on the actual state of Trade Me's business, future work will consider the benefits of business analysis and system optimization outside the area of web marketing.
From the academic perspective, web analysis can be identified as a process of analyzing and measuring visitors' behavior based on data from a website. Specifically, analysts dig into the data, try to find the meaning hidden behind the numbers and statistics, and work out the reason for every click that every visitor makes. Analysts can then come up with solutions to help companies improve web performance in order to attract more visitors and increase the dollar value that visitors spend on the site. In other words, web analysis can also be seen as an integration of diverse analysis activities that use different techniques and tools, from collecting data to turning that data into visual outcomes and insights for accurate business prediction and forecasting. In short, web analysis can help a business understand people's real-time reaction to the messages and information it is trying to deliver on its web pages, and gain a deep understanding of customers' responses to recent marketing campaigns and activities on the web.
Web data mining is the use of unstructured or semi-structured web data sources to extract structured information. Organizations use web data mining as a tool with which data is gathered from different websites. The data is then collated for analysis or to build other websites that provide information. It is an advantage to have
From my understanding, data mining is a series of operations that extracts added value from a body of data, in the form of knowledge that could not be found manually. Knowledge discovery in databases is the term used for data mining in computer science. Data mining is also about finding new information in large amounts of data. Beyond that, data mining searches for patterns or relationships in one or more databases, and it is a way to generate new information. In addition, in secondary use, information collected for one purpose is used for another, and information about customers is a valuable commodity. But do we know how data mining works?
In today's electronic world, nearly everything is online. This has grown considerably in the last few years. If we go back 20 to 25 years, there were no websites. Now, look at any business: there is hardly one without a website. So a need arose for something that would help not only to develop a website but also to maintain it. The field that does both is called web analytics.
Web analytics is the analysis of qualitative and quantitative data from our designed website and from the competition, to drive continual improvement of the online experience of the users and potential visitors who intend to use the proposed website. The expected outcome of this proposal is to develop metric-based web analytics and to improve the website using key metrics.
ISBN 978-952-5726-00-8 (Print), 978-952-5726-01-5 (CD-ROM) Proceedings of the 2009 International Symposium on Web Information Systems and Applications (WISA’09) Nanchang, P. R. China, May 22-24, 2009, pp. 202-205
After a review of the literature, we concluded that association rules with the Apriori algorithm, used in combination with fuzzy c-means clustering, give better accuracy in web page prediction. Using association rules with the Apriori algorithm together with a Markov model in web page prediction may also yield the highest-probability next page, help predict user access behavior, and decrease user access time. Our proposed methodology therefore has the following methods:
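Of the techniques named above, the Markov model is the simplest to illustrate. The following is a minimal sketch only, not the proposed methodology itself: it assumes user sessions are already available as ordered lists of page names, counts page-to-page transitions, and predicts the most frequent next page. The sample sessions and the function names `train_transitions` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict


def train_transitions(sessions):
    """Count how often each page is followed directly by each other page."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair every page with the page visited immediately after it.
        for current_page, next_page in zip(session, session[1:]):
            counts[current_page][next_page] += 1
    return counts


def predict_next(counts, page):
    """Return (most likely next page, estimated probability), or None.

    None is returned when the page was never seen with a successor.
    """
    followers = counts.get(page)
    if not followers:
        return None
    total = sum(followers.values())
    next_page, hits = followers.most_common(1)[0]
    return next_page, hits / total


# Made-up sessions for illustration only.
sample_sessions = [
    ["home", "products", "cart"],
    ["home", "products", "reviews"],
    ["home", "about"],
    ["products", "cart", "checkout"],
]

model = train_transitions(sample_sessions)
prediction = predict_next(model, "home")
```

A first-order model like this looks only at the current page. Combining it with association rules or clustering, as described above, would be a separate step that this sketch does not attempt.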