Final Project Probability Theory & Introductory Statistics
Ayeshabi W Tigdikar
Master of Science in Project Management, Northeastern University
Professor Tom Breur
June 30th, 2023
TABLE OF CONTENTS
INTRODUCTION
VARIABLES
SUMMARY OF EDA
QUESTIONS
HYPOTHESIS FORMULATION & TESTING
ANALYSIS
RESULTS
REFERENCES
APPENDIX
INTRODUCTION
The data is taken from the COVID-19 State Dashboard for California. The hospitalization counts include every inpatient who had a COVID-19 diagnosis during their stay. This does not necessarily imply that they were admitted because of COVID-19-related problems or that they had COVID-19 symptoms on admission. Note: Because hospitals report the total number of patients present each day (as opposed to new admissions), cumulative totals are not available.
VARIABLES
According to the data dictionary on the website, the variables are:
1. county: The county where the hospital is located. None of the consolidated reporters had hospitals in different counties.
2. todays_date: The date on which the counts were recorded.
3. hospitalized_covid_confirmed_patients: The total number of inpatients with a confirmed COVID diagnosis who are hospitalized and occupy a bed. This value is not cumulative. It includes all inpatients (including those in ICUs and Medical/Surgical units), but excludes patients who are awaiting an inpatient bed at connected clinics, outpatient clinics, emergency rooms, and overflow locations. As of April 21, 2020, COVID ED patients were no longer included in the Hospitalized COVID total and were instead counted separately.
4. hospitalized_suspected_covid_patients: The number of patients hospitalized in an inpatient bed who, in accordance with the CDC's Interim Public Health Guidance for Evaluating Persons Under Investigation (PUIs), have signs and symptoms consistent with COVID (the majority of patients with confirmed COVID have a fever and/or symptoms of an acute respiratory illness, such as cough, shortness of breath, or myalgia/fatigue). This includes all inpatients (including those in ICUs and Medical/Surgical units), but excludes patients waiting for an inpatient bed in overflow facilities, connected clinics, outpatient clinics, and emergency departments.
5. hospitalized_covid_patients: The number of patients currently hospitalized in an inpatient bed who have suspected or confirmed COVID.
6. all_hospital_beds: The facility's total bed count, including all surge beds, inpatient and outpatient post-surgical beds, labor and delivery unit beds, and observation beds. This covers all beds for which the hospital might supply personnel and resources; it does not necessarily reflect the number of beds staffed at the time the facility submits its report. Emergency department (ED) bays are not included in this field.
7. icu_covid_confirmed_patients: The number of laboratory-confirmed COVID-positive patients in the hospital's ICU. All ICU beds (NICU, PICU, and adult) are included.
8. icu_suspected_covid_patients: The number of symptomatic patients in the hospital's ICU whose COVID tests are still awaiting laboratory confirmation. All ICU beds (NICU, PICU, and adult) are included.
9. icu_available_beds: The number of ICU beds that the hospital has available. All ICU beds (NICU, PICU, and adult) are included.
SUMMARY OF EDA
The initial EDA involved loading the dataset, checking its structure, and reviewing the variables present. The dataset was then cleaned by renaming columns and handling missing values, and summary statistics were computed. Next, the number of records per county was tabulated, and the 10 counties with the highest number of hospitalized COVID-19 patients were identified. A time series analysis traced the trend in available ICU beds over time. A linear regression analysis explored the relationship between the number of hospitalized COVID-19 patients and the availability of ICU beds. Finally, a t-test compared the number of ICU beds between two specific counties, Lake and Colusa.
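The county tabulation and top-10 ranking described above can be sketched as follows. This is a minimal illustration in plain Python, not the report's own code (the column notation elsewhere in the report suggests the analysis was done in R). The sample rows, the function names `county_frequency` and `top_counties`, and the choice to rank counties by the sum of their daily counts are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows: (county, todays_date, hospitalized_covid_patients).
# None stands in for a missing value.
rows = [
    ("Los Angeles", "2020-07-01", 2100),
    ("Los Angeles", "2020-07-02", 2150),
    ("San Diego", "2020-07-01", 480),
    ("San Diego", "2020-07-02", 495),
    ("Lake", "2020-07-01", 3),
    ("Lake", "2020-07-02", None),
    ("Colusa", "2020-07-01", 1),
]


def county_frequency(rows):
    """Count how many daily records each county contributes."""
    return Counter(county for county, _, _ in rows)


def top_counties(rows, n=10):
    """Rank counties by the sum of their daily hospitalized-patient counts.

    Because the dashboard reports a daily census rather than new admissions,
    this sum measures patient-days, not the number of distinct patients.
    """
    totals = defaultdict(int)
    for county, _, patients in rows:
        if patients is not None:  # skip missing values
            totals[county] += patients
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


print(county_frequency(rows).most_common())
print(top_counties(rows, n=3))
```

A different ranking criterion, such as the peak or the mean daily count per county, would only require changing how `totals` is accumulated.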
QUESTIONS
The following questions were explored during the analysis:
a) What is the frequency distribution of counties in the dataset, and which counties have the highest number of hospitalized COVID-19 patients?
b) How does the availability of ICU beds change over time?
c) Is there a relationship between the number of hospitalized COVID-19 patients and the availability of ICU beds?
d) Is there a significant difference in the number of ICU beds between Lake and Colusa counties, given that both have relatively few hospitalized patients?
HYPOTHESIS FORMULATION & TESTING
Hypothesis 1:
Null Hypothesis (H0):
There is no significant relationship between the number of hospitalized COVID-19 patients and the availability of ICU beds.
Alternative Hypothesis (Ha):
There is a significant relationship between the number of hospitalized COVID-19 patients and the availability of ICU beds.
The hypothesis was tested using linear regression analysis, where the number of hospitalized COVID-19 patients (Covid_Patients_H) was considered as the dependent variable, and the availability of ICU beds (ICU_Beds) was the independent variable.
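As an illustration of the regression behind Hypothesis 1, the sketch below fits an ordinary least squares line and computes the t statistic for the null hypothesis that the slope is zero. It is a minimal standard-library Python example, not the report's own code; the function name `simple_linear_regression` and the sample values are hypothetical. Turning the t statistic into a p-value requires the t distribution with n - 2 degrees of freedom, which is left to a statistics package.

```python
import math


def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared, t_stat), where t_stat tests
    H0: slope = 0 and has n - 2 degrees of freedom.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - sse / sst if sst > 0 else float("nan")
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se_slope if se_slope > 0 else math.inf
    return slope, intercept, r_squared, t_stat


# Hypothetical values: ICU_Beds as predictor, Covid_Patients_H as response.
icu_beds = [10, 12, 15, 20, 22]
covid_patients = [100, 90, 80, 55, 50]
print(simple_linear_regression(icu_beds, covid_patients))
```

A large absolute t statistic relative to the t distribution with n - 2 degrees of freedom is evidence against the null hypothesis of no linear relationship.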
Hypothesis 2:
Null Hypothesis (H0):
There is no significant difference in the number of ICU beds between Lake and Colusa counties.
Alternative Hypothesis (Ha):
There is a significant difference in the number of ICU beds between Lake and Colusa counties.
The hypothesis was tested using a two-sample t-test, comparing the number of ICU beds in Lake County (Lake_data$ICU_Beds) with that in Colusa County (Colusa_data$ICU_Beds).
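The two-sample comparison can be sketched as below. The report does not state whether equal variances were assumed, so this example uses the Welch (unequal-variance) form as an assumption. It is a minimal standard-library Python illustration, not the report's own code; the function name `welch_t_test` and the sample values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t_test(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least 2 observations")
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Hypothetical daily ICU bed counts for the two counties.
lake_icu_beds = [4, 5, 3, 4, 6, 5]
colusa_icu_beds = [1, 2, 1, 0, 2, 1]
print(welch_t_test(lake_icu_beds, colusa_icu_beds))
```

The p-value is then obtained from the t distribution with the returned degrees of freedom, typically through a statistics package.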
In hypothesis testing, the null hypothesis is assumed to be true until there is enough evidence to refute it. Statistical tests estimate the probability of observing data at least as extreme as the sample if the null hypothesis were true (the p-value). The null hypothesis is rejected in favor of the alternative hypothesis if this probability is lower than a preset significance level, which is often set at 0.05.
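The decision rule can be written as a short sketch. The function name `decide` is hypothetical; the default threshold of 0.05 follows the significance level mentioned above.

```python
def decide(p_value, alpha=0.05):
    """Apply the significance-level decision rule to a p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    # A p-value below alpha is evidence against H0; otherwise H0 is
    # retained, which is not the same as proving it true.
    return "reject H0" if p_value < alpha else "fail to reject H0"


print(decide(0.03))
print(decide(0.20))
```

Note that a result above the threshold means only that the data do not provide enough evidence against the null hypothesis.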