Business Intelligence, 2e (Turban/Sharda/Delen/King)
Chapter 5 Text and Web Mining
1) DARPA and MITRE teamed up to develop capabilities to automatically filter text-based information sources to generate actionable information in a timely manner.
Answer: TRUE
Diff: 2 Page Ref: 190
2) A vast majority of business data is captured and stored in text documents that are structured.
Answer: FALSE
Diff: 2 Page Ref: 192
3) Text mining is important to competitive advantage because knowledge is power, and knowledge is derived from text data sources.
Answer: TRUE
Diff: 2 Page Ref: 192
4) The purpose and processes of text mining are different from those of data mining because with text mining the input to the process are data
…show more content…
A) classification
B) natural language processing
C) evidence-based processing
D) symbolic processing
Answer: B
Diff: 2 Page Ref: 195
28) Why will computers probably not be able to understand natural language the same way and with the same accuracy that humans do?
A) A true understanding of meaning requires extensive knowledge of a topic beyond what is in the words, sentences, and paragraphs.
B) The natural human language is too specific.
C) The part of speech depends only on the definition and not on the context within which it is used.
D) All of the above.
Answer: A
Diff: 3 Page Ref: 196
29) At a very high level, the text mining process consists of each of the following tasks except:
A) create log frequencies
B) establish the corpus
C) create the term-document matrix
D) extract the knowledge
Answer: A
Diff: 2 Page Ref: 207
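The three high-level tasks named in question 29 (establish the corpus, create the term-document matrix, extract the knowledge) can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's procedure: the three sample documents, the function name build_term_document_matrix, and the choice of "terms shared by more than one document" as the extracted knowledge are all assumptions made for the example.

```python
from collections import Counter

# Step 1: establish the corpus (made-up documents, for illustration only).
corpus = [
    "text mining finds patterns in text",
    "data mining finds patterns in structured data",
    "clustering groups similar documents",
]

def build_term_document_matrix(docs):
    """Return (terms, matrix), where matrix[i][j] counts terms[j] in docs[i]."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    terms = sorted(set().union(*counts))
    matrix = [[c[t] for t in terms] for c in counts]
    return terms, matrix

# Step 2: create the term-document matrix (rows are documents, columns are terms).
terms, matrix = build_term_document_matrix(corpus)

# Step 3: extract knowledge; here, simply the terms that occur in more than
# one document (a hypothetical stand-in for a real mining step).
shared_terms = [
    t for j, t in enumerate(terms)
    if sum(1 for row in matrix if row[j] > 0) > 1
]
print(shared_terms)
```

Real tools usually add stop-word removal, stemming, and a weighting scheme before the knowledge-extraction step; those are omitted here to keep the sketch short.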
30) In ________, the problem is to group an unlabelled collection of objects, such as documents, customer comments, and Web pages into meaningful groups without any prior knowledge.
A) search recall
B) classification
C) clustering
D) grouping
Answer: C
Diff: 2 Page Ref: 211
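As a rough illustration of clustering as described in question 30, the sketch below groups unlabelled customer comments without any prior categories. It uses a simple greedy "leader" clustering over Jaccard word overlap, which is one of many possible techniques and is not taken from the textbook; the sample comments, the function names, and the 0.3 threshold are assumptions chosen for the example.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def leader_cluster(docs, threshold=0.3):
    """Greedy leader clustering: put each document in the first cluster whose
    leader (first member) is similar enough, otherwise start a new cluster.
    Returns a list of clusters, each a list of document indices."""
    token_sets = [set(d.lower().split()) for d in docs]
    clusters = []
    for i, tokens in enumerate(token_sets):
        for cluster in clusters:
            if jaccard(tokens, token_sets[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing leader was similar enough.
            clusters.append([i])
    return clusters

# Made-up, unlabelled customer comments.
comments = [
    "delivery was late and slow",
    "late delivery and slow shipping",
    "great price and great quality",
    "quality is great for the price",
]
groups = leader_cluster(comments)
print(groups)
```

No labels are supplied at any point, which is what separates this from classification: the groups emerge only from the similarity between the documents themselves.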
31) The two main approaches to text classification are ________ and ________.
A) knowledge engineering; machine learning
B) categorization; clustering
C) association; trend analysis
D) knowledge extraction; association
Answer: A
Diff: 2 Page Ref: 211
32) Commercial software tools include all of the following
Text mining is a process that extracts information and knowledge from large amounts of unstructured data sources, such as PDF files, Word documents, XML files, and text excerpts. It differs from data mining, which extracts information and knowledge from large amounts of structured data sources. Structured data sources are those in which the data are classified by categorical, ordinal, or continuous variables, and the goal of data mining is to transform the collected data into a model or an understandable structure. However they are
Text mining, sometimes known as text data mining, refers to the process of extracting interesting and non-trivial patterns of knowledge from semi-structured or unstructured text documents. It can also serve as an extension of data mining, or of knowledge discovery from structured databases. Text mining has the same purpose as data mining and uses many of the same processes, but it is somewhat more complex because its input is unstructured text (PDF, Word, XML files, etc.) rather than structured data. Because most people store information in the form of text, text mining is believed to have greater potential than data mining; a number of studies in recent years indicate that 80% of business information is stored in text format.
1.1 Text mining: Text mining, also called text data mining (TDM) or knowledge discovery in textual databases (KDT), is roughly equivalent to text analytics. It is used to derive high-quality information from text documents and to reveal hidden meanings. Text mining is a more complicated task than data mining because it deals with text data, which can be unstructured as well as fuzzy, whereas data mining is the procedure of extracting information from huge sets of data.