Chapter 2: Research Methodology
1. Introduction
This chapter describes the research method that was followed. It covers the research design and participants, as well as the techniques and data analysis methods used for searching and detection. It also discusses the techniques used to carry out the experimental work.
2. Research Methods
2.1. Research Design
This research is designed to provide a program that helps the user search for and detect the provenance of news articles from different news websites. It studies how effective two techniques, the Google Search API and Google Custom Search, are at detecting the provenance of news articles on various news websites. Also, detect the
…
in order to achieve the best possible correctness, accuracy, credibility and speed when displaying the results. The work breakdown structure (WBS) is given in Figure 1. The research was organized into multiple work packages (WP) to develop a system that detects the provenance of news articles, as shown in Figure 2.
• WP 1: Literature Review
Data are collected from scientific papers and related publications on provenance in general, and on detecting the provenance of news articles in particular.
• WP 2: Create the architectural design and analyze the system for detecting the provenance of news articles
1. Architectural Design
This work package focuses on the approaches to searching for and detecting the provenance of news articles that are described and presented in the literature. It also covers the design of the architecture of the system for detecting the provenance of news articles.
2. System Analysis
1. Functional requirements
2. Non-functional requirements
• WP 3: Prototype or develop a model to detect the provenance of news articles
This work package explains how to use the Google Custom Search and Google Search API techniques to search for and detect the provenance of news articles. It also verifies the proposed architectural design.
• WP 4: Validation
Experiment with the system using the two techniques, and select the technique that displays the results most efficiently and accurately. Through …
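As a rough illustration of the search step named in WP 3, the sketch below builds a request URL for the Google Custom Search JSON API and extracts candidate source websites from a decoded response. It is a minimal sketch under stated assumptions, not the system developed in this research: the helper names build_search_url and candidate_sources, the placeholder credentials, and the sample response are all hypothetical, and only the Custom Search technique is covered. No network request is made.

```python
from urllib.parse import urlencode, urlparse

# Public endpoint of the Google Custom Search JSON API.
CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query, api_key, engine_id, num=10):
    """Return the request URL for one page of Custom Search results.

    Hypothetical helper: api_key and engine_id stand for the caller's
    own API key and search engine ID (the "key" and "cx" parameters).
    """
    params = {"key": api_key, "cx": engine_id, "q": query, "num": num}
    return f"{CSE_ENDPOINT}?{urlencode(params)}"


def candidate_sources(response):
    """Extract (domain, title, link) tuples from a decoded JSON response.

    An empty list is returned when the response has no "items" key.
    """
    results = []
    for item in response.get("items", []):
        link = item.get("link", "")
        results.append((urlparse(link).netloc, item.get("title", ""), link))
    return results


# Hypothetical response fragment, for illustration only (not real data).
sample = {
    "items": [
        {"title": "Example headline", "link": "https://news.example.com/a/1"},
        {"title": "Example headline (copy)", "link": "https://mirror.example.org/x"},
    ]
}

print(build_search_url("example headline", "API_KEY", "ENGINE_ID", num=5))
for domain, title, link in candidate_sources(sample):
    print(domain, "|", title, "|", link)
```

In a validation step such as WP 4, the list returned by candidate_sources could be compared against known original publishers to judge accuracy, and the time taken per query could be recorded to judge speed.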