MIS_510_Module_8_Discussion1.docx

School

Colorado State University, Global Campus

Course

510

Subject

Accounting

Date

Jan 9, 2024

Type

docx

Pages

2

Uploaded by Mazzoeri2323

Module 8: Discussion Forum

Use the industry/domain that you currently work in (or one that you would like to work in) and identify textual information (documents) relevant to it. Apply the process described in sections 20.2 through 20.5 of our text to the document(s) you selected. What particularities do your document(s) have which would require divergence from the textbook process? What additional or different steps would be required to process (mine) your document(s)? Describe at least two benefits of following an established process.

Followup: Discuss similarities and differences in the text mining processes mapped by your peers. What can we learn about the text mining process from these similarities and differences?

Your discussion (initial post) must address the following items:

Description and purpose of the project, the organization and/or person behind the project, and the URL.

List of the steps from sections 20.2 through 20.5 of our text with a description of what this project did at each step.

Analysis of the similarities and differences between the textbook process and the process followed by the project you chose.

Description of at least two benefits of following an established process.

Please do not duplicate the projects that others in this thread have already presented in their posts. Support your post with the information and concepts from the class reading materials from this module. Support your post with at least one scholarly source beyond the materials provided in this module. Use APA-style references & citations wherever necessary to support your discussion.

In the context of government accounting, textual information might include various documents such as financial reports, budget proposals, audit findings, policies and procedures, memos, and correspondence. Let's consider invoices, which often contain textual information alongside numbers.
The textbook process outlined in sections 20.2 through 20.5 (Shmueli et al., 2018) provides a general framework for text processing, but invoices and other financial documents pose specific challenges that require divergence from the standard process:

Structured Data Integration: Invoices typically carry structured data (numbers, dates, amounts) alongside textual information. The challenge is to integrate the structured and unstructured data effectively, so the text processing steps might need to be combined with methods that extract and process the numerical information.

Specialized Vocabulary: Government accounting involves specific jargon, abbreviations, and terms that a standard English stopword list does not account for. Customizing the stopword list or creating a domain-specific dictionary becomes crucial to avoid removing important terms.

Data Validation: The textual information has to be checked for consistency and accuracy against the numerical data on the invoice. Anomalies in the text, such as a mismatch between a textual description and a numeric amount, might indicate errors.

Sensitive Information Handling: Government documents often contain sensitive information. Anonymizing or securing sensitive data within the text becomes a priority before any analysis.

Two benefits of following an established text processing process:

Consistency and Reproducibility: Using a structured process ensures consistent treatment of text data across different documents and analysis sessions. This aids reproducibility of results and facilitates collaboration among analysts.

Reduced Noise and Improved Relevance: Text preprocessing steps such as stopword removal, stemming, and text reduction eliminate irrelevant or noisy terms, leading to a more focused and meaningful analysis, especially in domains with specialized vocabulary.

Applying these specialized steps when processing government accounting documents can enhance the quality of the analysis and of the insights drawn from textual information, while ensuring compliance and accuracy in dealing with financial data.

Shmueli, G., Bruce, P. C., Yahav, I., Patel, N. R., & Lichtendahl, K. C. (2018). Data mining for business analytics: Concepts, techniques, and applications in R. Wiley Publishing. ISBN: 9781118879337
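
To make the invoice-specific steps from the post more concrete, here is a minimal Python sketch of how they might look in practice: redacting sensitive identifiers before analysis, pulling numeric amounts out of the text, protecting domain terms during stopword removal, and validating the description against the amount field. Everything in it is an assumption made for illustration and not part of the textbook process: the function names, the tiny stopword list, the tax ID pattern, and the sample invoice are all hypothetical.

```python
import re

# Hypothetical generic stopword list; a real project would start from a fuller one.
GENERIC_STOPWORDS = {"a", "and", "for", "in", "is", "it", "of", "the", "to"}

# Hypothetical domain terms to protect: lowercasing turns "IT" (information
# technology) into the stopword "it", so it is exempted here.
DOMAIN_KEEP = {"it"}

# Assumed pattern for a tax ID written as two digits, a hyphen, and seven digits.
TAX_ID_PATTERN = re.compile(r"\b\d{2}-\d{7}\b")

# Dollar amounts such as $1,250.00 or $300.
AMOUNT_PATTERN = re.compile(r"\$\s?(\d[\d,]*(?:\.\d{2})?)")


def redact(text):
    """Mask tax-ID-like strings before any further analysis."""
    return TAX_ID_PATTERN.sub("[REDACTED]", text)


def extract_amounts(text):
    """Return every dollar amount mentioned in the text as a float."""
    return [float(m.replace(",", "")) for m in AMOUNT_PATTERN.findall(text)]


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords except domain terms."""
    stopwords = GENERIC_STOPWORDS - DOMAIN_KEEP
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in stopwords]


def amounts_consistent(description, amount_field):
    """Check that the structured amount also appears in the free-text description."""
    return any(abs(a - amount_field) < 0.005 for a in extract_amounts(description))


# Made-up invoice record used only to exercise the functions above.
SAMPLE_INVOICE = {
    "description": "IT support services for the Finance Office, "
                   "total $1,250.00, vendor tax ID 12-3456789",
    "amount": 1250.00,
}

if __name__ == "__main__":
    safe_text = redact(SAMPLE_INVOICE["description"])
    print(safe_text)
    print(tokenize(safe_text))
    print(amounts_consistent(SAMPLE_INVOICE["description"], SAMPLE_INVOICE["amount"]))
```

Redaction runs first so that sensitive identifiers never reach the tokenized output, and the validation step reads the original description so it can compare the stated total against the structured amount field.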