STAT301 checkpt2

School

Purdue University

Course

301

Subject

Statistics

Date

Jan 9, 2024

Checkpoint #2: Exploratory Data Analysis
Valerie Shao
STAT301 Fall 2023

I. Data Set Exploration

The human resources dataset contains 312 rows and 36 columns. It was created by Dr. Rich and Dr. Carla Patalano, is published on Kaggle, and is used in graduate courses. The dataset describes a fictional company and contains personal and employment information about its employees. Structurally, it has 13 categorical variables and 22 numerical variables. The numerical variables primarily capture employee performance and pay rates, along with numerically coded identifiers such as marital status. The categorical variables group the data by race, gender, and other personal details, which simplifies analysis of the company's workforce.

Several variables can help us assess the company's diversity and detect pay disparities. Personal details such as race, gender, and recruitment through diversity job fairs describe employee characteristics that reveal how diverse the company is. We can compare salaries across gender or race to check whether certain groups are paid unfairly. Such a comparison must account for other factors, however, because an employee's performance also affects salary. We can also examine the recruitment-source data to determine which platforms the company should use for future advertising to attract better candidates.

Regarding data preprocessing, no missing values or duplicate entries were found in this dataset.

II. Data Visualizations

Figure 1. Categorical Visualization: Example #1 [Pie Chart]
Figure 2. Categorical Visualization: Example #2 [Bar Chart]

Figure 3. Numeric Visualization: Example #3 [Scatter Plot]

Categorical Example #1 Insights: The pie chart gives a clear view of the ethnicity distribution within the company. The majority of employees are White, making up the largest slice at 61%. Black employees follow at 26%, and Asian employees make up 9%. Employees of Two or more races account for 3%, and those classified as Other (American Indian, Alaska Native, Hispanic) make up the remaining 1%. The chart shows at a glance that White and Black employees are the two largest groups, and that there is room to strengthen diversity and inclusion efforts, especially for Asian and Other ethnicities.
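
The preprocessing check and the percentage breakdown described in this report could be reproduced along the following lines. This is only a minimal sketch in Python with pandas, not the method used for the report: the helper names `preprocessing_summary` and `category_shares`, the column names `EmpID` and `RaceDesc`, and the ten-row toy table are all assumptions made for illustration, and the toy values are not taken from the HR dataset.

```python
import pandas as pd


def preprocessing_summary(df: pd.DataFrame) -> dict:
    """Count missing cells and fully duplicated rows in a table."""
    return {
        "missing_values": int(df.isna().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
    }


def category_shares(df: pd.DataFrame, column: str) -> pd.Series:
    """Return each category's share of rows as a whole-number percentage."""
    return (df[column].value_counts(normalize=True) * 100).round(0)


# Toy stand-in data (hypothetical, NOT the real HR dataset).
toy = pd.DataFrame(
    {
        "EmpID": range(1, 11),  # hypothetical unique employee identifier
        "RaceDesc": ["White"] * 6 + ["Black"] * 3 + ["Asian"] * 1,
    }
)

summary = preprocessing_summary(toy)
shares = category_shares(toy, "RaceDesc")

print(summary)
print(shares)
```

On the real data, the same two calls would be applied to the loaded table in place of `toy`. The resulting series of percentages is the kind of input a pie chart such as Figure 1 is drawn from, for example through pandas' `Series.plot.pie`.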