mod-2.1

School

Southern New Hampshire University

Course

511

Subject

Business

Date

Jun 5, 2024

Type

docx

Pages

5

Uploaded by DeaconRose12996

2-1 Report: Improving Data Quality
Cody Manley
Southern New Hampshire University
QSO-511-X4347 Business Analytics 24TW4
Professor Daniel Letort
May 26, 2024

Importance of Quality Data

Financial Charm Bank is a large retail bank operating in the United States. Senior management wants to compile data for three Florida branches to predict the needs of existing customers and their interest in term deposits. The Bank’s Vice President (V.P.) has identified an ideal data set from the Florida branches that contains customer demographics, banking history, and term deposit holdings for the business analysis team to analyze. Unfortunately, the initial analysis of the data set has been deemed questionable. The V.P. has tasked the senior members with creating a comprehensive report highlighting the impact that the data set’s errors, gaps, and anomalies will have on the organization if they are not corrected.

Errors, Gaps, and Anomalies

Several errors, gaps, and anomalies have been found within the data set. The first inconsistency that needs to be addressed is the lack of clarity in the column names. Below is a screenshot of the column names in the data set, color-coded to highlight the issues that could confuse readers and cause errors when interpreting the data. Columns A-D, in orange, are the only columns that do not need to be changed because they are self-explanatory. Columns E-H, in yellow, have unclear meanings. The last section, in blue, marks Columns L-P, which have undefined labels and abbreviated titles that create more questions than answers. The biggest issue in the blue section is the abbreviation ‘P’ used before the label. Since Column O is labeled ‘previous’ and sits between N and P, one can only assume that ‘P’ stands for ‘previous,’ but this needs to be clarified.

The numerical data contains inconsistencies that need to be addressed and adjusted to produce an accurate analysis (Michaloudis, 2024). In column L, ‘duration,’ several numbers have decimal points; it is unclear whether this column should contain decimal values. Further information is needed.

The last inconsistencies that need to be addressed are the blank cells and data anomalies. Overall, only 74 blanks were found in columns A, C, and P, a relatively small number compared to the size of the data set. The identified gaps need to be corrected; however, more information is needed to complete the data, which may require additional analysis. Columns K and B need to be addressed because they contain misspellings and abbreviations, and such anomalies can lead to data quality issues when the data is analyzed. Columns I-J can be seen in the screenshot above and have little meaning. Column I, ‘contact,’ shows ‘telephone,’ ‘Unknown,’ or ‘N/A,’ but it is unclear how that is relevant or could be used.

Data Quality Issues

Analyzing the data revealed that the data set has quality issues, missing values, and gaps. The missing values identified were 74 gaps across three columns. In addition, Column N carries the unclear label ‘pday’ and shows a negative number in several places. Columns P and I use the values ‘unknown’ and ‘N/A,’ which is confusing because both have the same meaning. Further data quality issues are in Columns I-J, which are in the screenshot above and have little meaning. Column I, ‘contact,’ shows either ‘Unknown’ or ‘N/A,’ but it is unclear whether this means that no phone number is on file or that the customer has not been contacted. This column also lists ‘telephone,’ but it is unclear whether this is a landline or a work line.

If unclean data is used for organizational prediction, it will provide the organization with inaccurate analytics, resulting in poor customer relations and bad decisions that will harm overall business performance (Foote, 2023). Using the existing data set with quality issues could result in additional time delays in deciphering what disparities the
information created. Using unrefined data also creates trust and liability risks for the organization (W. Custom, 2018).

Correct the Errors and Issues

The Business Analytics Team manually corrected the data set to remove inconsistencies and improve data quality. We adjusted all rows, columns, and headings to a uniform font and ensured all data was visible, making it easier to read. We removed all of the preexisting formatting by using the ‘clear format’ options listed in the editing section of the home ribbon and added dollar signs where needed. We then searched for blanks: the data set did not have any completely blank rows to remove, but it did have 74 blank cells. We used ‘N/A’ as the placeholder for the missing numbers in the column labeled ‘Age’ and ‘Unknown’ as the placeholder for all other empty cells until the information is obtained. Adding placeholders allows the data to be used in a pivot chart and other data analysis tools. We chose not to remove the affected rows entirely because the rest of their cells were complete and could be useful in other areas.

Summary of Findings

The business analysis team has finished analyzing and correcting the questionable data in the data set. The team has concluded that correcting the gaps, errors, and anomalies, in combination with improvements in the readability of the data set, has greatly improved its quality. The team has ensured that the data is high quality and ready to aid Financial Charm Bank in making optimal business decisions for future growth and development that benefit both the organization and its current customers.
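
The placeholder step described under Correct the Errors and Issues was done manually in Excel. As an illustration only, the same rule (‘N/A’ for blank ‘Age’ cells, ‘Unknown’ for every other blank cell) could be sketched in Python as follows. This is a minimal sketch under stated assumptions: the sample rows are invented, the column names other than ‘age’ are hypothetical stand-ins, and the helper name fill_blanks is not part of the team’s process.

```python
import csv
import io

# Invented sample data for illustration. Only "age" is named in the report;
# the other column names are hypothetical stand-ins.
SAMPLE = """age,job,marital,poutcome
34,admin,married,failure
,technician,,success
51,,single,
"""


def fill_blanks(rows, numeric_columns=("age",)):
    """Fill blank cells with placeholders and count how many were blank.

    Blank cells in numeric_columns become "N/A"; all other blank cells
    become "Unknown". Rows are kept, not dropped, because their other
    cells may still be useful.
    """
    cleaned = []
    blank_count = 0
    for row in rows:
        new_row = {}
        for column, value in row.items():
            if value is None or value.strip() == "":
                blank_count += 1
                new_row[column] = "N/A" if column in numeric_columns else "Unknown"
            else:
                new_row[column] = value.strip()
        cleaned.append(new_row)
    return cleaned, blank_count


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
cleaned_rows, blanks_found = fill_blanks(rows)

print("Blank cells filled:", blanks_found)
for row in cleaned_rows:
    print(row)
```

Counting the blanks while filling them gives a quick check against the number found during the manual review (74 in the report’s data set).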

REFERENCES

Foote, K. D. (2023, March 1). The impact of poor data quality (and how to fix it). DATAVERSITY. https://www.dataversity.net/the-impact-of-poor-data-quality-and-how-to-fix-it/#:~:text=Poor%2Dquality%20data%20can%20lead,of%20errors%20increase%20and%20accumulate

Merriam-Webster. (n.d.). Tertiary. In Merriam-Webster.com dictionary. Retrieved May 26, 2024, from https://www.merriam-webster.com/dictionary/tertiary

Michaloudis, J. (2024, March 22). How to fill blank cells in Pivot Table. MyExcelOnline. https://www.myexcelonline.com/blog/how-to-fix-pivot-table-empty-cells-in-excel/#:~:text=You%20might%20see%20there%20are,with%20the%20text%20%E2%80%9CNA%E2%80%9D

Team, C. (2023, December 8). Documenting Excel models best practices. Corporate Finance Institute. https://corporatefinanceinstitute.com/resources/excel/documenting-excel-models-best-practices/

W. Custom (2018). Business Analytics. A General Introduction to Data Analytics. https://read.wiley.com/books/9781119296263/page/7/section/head-2-23