School: The City College of New York, CUNY
Course: 215
Subject: Statistics
Date: Feb 20, 2024

Uploaded by AgentRainWombat19

2nd hw
2024-01-10

R Markdown

1. Load your library/libraries

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

2. Load our data

penguins <- read.csv("https://raw.githubusercontent.com/zeigna/AppliedStatsAnalysis/master/penguins.csv")

3. Verify the dataset loaded correctly by displaying the first 8 rows

head(penguins, n = 8)
##   X species    island bill_length_mm bill_depth_mm flipper_length_mm
## 1 1  Adelie Torgersen           39.1          18.7               181
## 2 2  Adelie Torgersen           39.5          17.4               186
## 3 3  Adelie Torgersen           40.3          18.0               195
## 4 4  Adelie Torgersen           36.7          19.3               193
## 5 5  Adelie Torgersen           39.3          20.6               190
## 6 6  Adelie Torgersen           38.9          17.8               181
## 7 7  Adelie Torgersen           39.2          19.6               195
## 8 8  Adelie Torgersen           41.1          17.6               182
##   body_mass_g    sex year
## 1        3750   male 2007
## 2        3800 female 2007
## 3        3250 female 2007
## 4        3450 female 2007
## 5        3650   male 2007
## 6        3625 female 2007
## 7        4675   male 2007
## 8        3200 female 2007

4. Display the last 5 lines of penguins

tail(penguins, n = 5)
##       X   species island bill_length_mm bill_depth_mm flipper_length_mm
## 329 329 Chinstrap  Dream           55.8          19.8               207
## 330 330 Chinstrap  Dream           43.5          18.1               202
## 331 331 Chinstrap  Dream           49.6          18.2               193
## 332 332 Chinstrap  Dream           50.8          19.0               210
## 333 333 Chinstrap  Dream           50.2          18.7               198
##     body_mass_g    sex year
## 329        4000   male 2009
## 330        3400 female 2009
## 331        3775   male 2009
## 332        4100   male 2009
## 333        3775 female 2009

5. Examine the “structure” of the penguins dataset

str(penguins)
## 'data.frame':    333 obs. of  9 variables:
##  $ X                : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ species          : chr  "Adelie" "Adelie" "Adelie" "Adelie" ...
##  $ island           : chr  "Torgersen" "Torgersen" "Torgersen" "Torgersen" ...
##  $ bill_length_mm   : num  39.1 39.5 40.3 36.7 39.3 38.9 39.2 41.1 38.6 34.6 ...
##  $ bill_depth_mm    : num  18.7 17.4 18 19.3 20.6 17.8 19.6 17.6 21.2 21.1 ...
##  $ flipper_length_mm: int  181 186 195 193 190 181 195 182 191 198 ...
##  $ body_mass_g      : int  3750 3800 3250 3450 3650 3625 4675 3200 3800 4400 ...
##  $ sex              : chr  "male" "female" "female" "female" ...
##  $ year             : int  2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ...

glimpse(penguins)
## Rows: 333
## Columns: 9
## $ X                 <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 1…
## $ species           <chr> "Adelie", "Adelie", "Adelie", "Adelie", "Adelie", "A…
## $ island            <chr> "Torgersen", "Torgersen", "Torgersen", "Torgersen", …
## $ bill_length_mm    <dbl> 39.1, 39.5, 40.3, 36.7, 39.3, 38.9, 39.2, 41.1, 38.6…
## $ bill_depth_mm     <dbl> 18.7, 17.4, 18.0, 19.3, 20.6, 17.8, 19.6, 17.6, 21.2…
## $ flipper_length_mm <int> 181, 186, 195, 193, 190, 181, 195, 182, 191, 198, 18…
## $ body_mass_g       <int> 3750, 3800, 3250, 3450, 3650, 3625, 4675, 3200, 3800…
## $ sex               <chr> "male", "female", "female", "female", "male", "femal…
## $ year              <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…

6. Are there any extra columns? If so, remove it/them

penguins <- subset(penguins, select = -c(1))
head(penguins)
##   species    island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## 1  Adelie Torgersen           39.1          18.7               181        3750
## 2  Adelie Torgersen           39.5          17.4               186        3800
## 3  Adelie Torgersen           40.3          18.0               195        3250
## 4  Adelie Torgersen           36.7          19.3               193        3450
## 5  Adelie Torgersen           39.3          20.6               190        3650
## 6  Adelie Torgersen           38.9          17.8               181        3625
##      sex year
## 1   male 2007
## 2 female 2007
## 3 female 2007
## 4 female 2007
## 5   male 2007
## 6 female 2007
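Step 6 drops the extra column by position, which works here because X is the first column. If the column order ever changed, the same call would silently remove a different variable. A minimal sketch of dropping the column by name instead, using base R only. The data frame `penguins_demo` is a hypothetical stand-in built from the first three rows printed earlier, so the example runs without downloading the CSV.

```r
# Hypothetical stand-in: a few columns from the first three rows shown above.
penguins_demo <- data.frame(
  X = 1:3,
  species = c("Adelie", "Adelie", "Adelie"),
  island = c("Torgersen", "Torgersen", "Torgersen"),
  bill_length_mm = c(39.1, 39.5, 40.3),
  stringsAsFactors = FALSE
)

# Keep every column except the one named "X" (the row index written to the CSV).
keep_cols <- setdiff(names(penguins_demo), "X")
penguins_demo <- penguins_demo[, keep_cols, drop = FALSE]

names(penguins_demo)
```

On the full dataset, `subset(penguins, select = -X)` should have the same effect as the positional call in step 6.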
7. What is the size of this dataframe?

nrow(penguins)
## [1] 333
ncol(penguins)
## [1] 8

The penguins dataset has 333 rows by 8 columns (the extra X column was removed in step 6).

8. What is the datatype of each variable in the penguins dataset?

9. How many unique values do we have in the variable year? Based on this, do you think year is quant or qual? Can year be a factor?

unique(penguins$year)
## [1] 2007 2008 2009

Year takes only three distinct values and has no mathematical qualities in this dataset, so it is qualitative and can be recoded as a factor.

10. Based on your results for 7 and 8, recode the appropriate variables.

penguins$species <- as.factor(penguins$species)

11. Verify your recoding with str() or glimpse()

str(penguins)
## 'data.frame':    333 obs. of  8 variables:
##  $ species          : Factor w/ 3 levels "Adelie","Chinstrap",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ island           : chr  "Torgersen" "Torgersen" "Torgersen" "Torgersen" ...
##  $ bill_length_mm   : num  39.1 39.5 40.3 36.7 39.3 38.9 39.2 41.1 38.6 34.6 ...
##  $ bill_depth_mm    : num  18.7 17.4 18 19.3 20.6 17.8 19.6 17.6 21.2 21.1 ...
##  $ flipper_length_mm: int  181 186 195 193 190 181 195 182 191 198 ...
##  $ body_mass_g      : int  3750 3800 3250 3450 3650 3625 4675 3200 3800 4400 ...
##  $ sex              : chr  "male" "female" "female" "female" ...
##  $ year             : int  2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ...

12. Run summary() on your dataset. Do any of the variables have missing values? If so, which variables?

summary(penguins)
##       species       island          bill_length_mm  bill_depth_mm
##  Adelie   :146   Length:333         Min.   :32.10   Min.   :13.10
##  Chinstrap: 68   Class :character   1st Qu.:39.50   1st Qu.:15.60
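Step 10 recodes only species, and the str() output in step 11 shows island, sex, and year still stored as chr or int, even though step 9 concludes that year should be a factor. A minimal sketch of recoding all four categorical columns in one pass and then counting missing values per column, which is one direct way to answer step 12. The names `penguins_demo` and `categorical_cols` are hypothetical; the three rows are rows 1, 2, and 329 from the output printed earlier, used as a stand-in so the example runs offline.

```r
# Hypothetical stand-in built from rows 1, 2, and 329 of the printed output.
penguins_demo <- data.frame(
  species = c("Adelie", "Adelie", "Chinstrap"),
  island = c("Torgersen", "Torgersen", "Dream"),
  body_mass_g = c(3750L, 3800L, 4000L),
  sex = c("male", "female", "male"),
  year = c(2007L, 2007L, 2009L),
  stringsAsFactors = FALSE
)

# Columns treated as qualitative (an assumption based on steps 9 and 10).
categorical_cols <- c("species", "island", "sex", "year")

# Convert each listed column to a factor; other columns are left untouched.
penguins_demo[categorical_cols] <- lapply(penguins_demo[categorical_cols], as.factor)

# Verify the recoding, then count NA values in every column.
str(penguins_demo)
colSums(is.na(penguins_demo))
```

Applied to the full penguins data frame, the same two assignments would replace the single `as.factor()` call in step 10, and a nonzero entry from `colSums(is.na(...))` would identify a variable with missing values.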