2244B FW23 Assignment 2 - Data

School

Western University

Course

2244

Subject

Biology

Date

Apr 3, 2024

Pages

8

Uploaded by 2004kevinscix

Biol/Stat 2244B FW23 Assignment 2

BIOL/STATS 2244 Assignment 2: Data

Objectives

This Assignment is designed to demonstrate your current mastery of the following Learning Outcomes (these are pulled directly from the course syllabus, page 4):

i. Create and interpret appropriate summaries of data;
   a. Select appropriate summaries based on research question and variables;
ii. Use statistical software to explore, summarize, analyse, interpret, and communicate data in a reproducible manner;
   a. Use R to create and modify graphical and numerical summaries of data;
   b. Use R markdown to create reproducible analyses and reports.
iii. Communicate statistical concepts, analyses, and arguments in an accurate and scholarly manner.
   a. Use conventional and transparent formats for reporting results of statistical analyses in written/graphical form.

To achieve these objectives, students will need to draw on course material primarily from the topics Data structure and Planning analysis, and Summarizing & Exploring Data, as well as the Labs.

How this Assignment ‘works’

This Assignment is the second of three Assignments in the course; it continues our progression through the stages of the PPDAC Framework. The major focus of this Assignment is the Data stage of PPDAC.

In the first Assignment, you were introduced to some ‘Research Background’ related to screen time and physical attributes like growth, strength, and flexibility. We will continue to work with this background (you can refer back to it if necessary for this assignment). You planned a sampling design to gather a sample to address a very specific research question in Assignment 1 as part of the “Problem” and “Plan” phases of the PPDAC Framework.

The next phase of the PPDAC Framework, the “Data” phase, would involve collecting and exploring data from our Plan; unfortunately, we don’t have the time/resources for you to implement YOUR sampling Plan, nor to create and then implement a relevant study design.
So, we will use an openly available dataset that is related to the research objective introduced in Assignment 1. This Assignment 2 file is accompanied by a datafile (named assign.csv) that you will be working with, as well as the research article (named article.pdf) that describes the way the data were collected (i.e. in the Materials and Methods section of the article).

You should, therefore, review the rest of this Assignment instructions file so you understand what you are being asked to do, and then spend some time reviewing the article while exploring the datafile in R. Make sure you understand what the columns represent by cross-referencing with the information in the Materials and Methods.

Note carefully: you will not use every variable and/or every data point in the datafile for this Assignment! Some of them aren’t relevant!
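As a rough illustration of "exploring the datafile in R", the sketch below imports a CSV file and inspects its columns. The toy data frame, its column names (id, group, score), and the temporary file are all hypothetical stand-ins; the real columns of assign.csv are described in the article, not here.

```r
# Hypothetical stand-in for the real datafile: these columns are invented
# for illustration and do NOT come from assign.csv.
toy <- data.frame(
  id    = 1:6,
  group = c("A", "B", "A", "B", "A", "B"),
  score = c(12.5, 14.0, 11.2, 15.3, 13.1, 12.9)
)

# Write the toy data to a temporary CSV so the import step can be shown.
path <- file.path(tempdir(), "toy_assign.csv")
write.csv(toy, path, row.names = FALSE)

# Import the file, then look at what R made of each column.
dat <- read.csv(path)
str(dat)    # one line per column: name, type R assigned, first few values
head(dat)   # first six rows
names(dat)  # column names, to cross-reference with the Materials and Methods
```

With the real file in your working directory, the import line would presumably be something like assign <- read.csv("assign.csv"), followed by str(assign).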
Being successful on this Assignment

Remember, the Assignment is evaluating you on three things:

Your ability to choose appropriate summaries to answer a research question
Your ability to use R and R markdown files
Your ability to present data in a conventional, transparent, and scholarly manner

The knowledge to support these tasks is developed in lecture Topics and Labs; the R aspect of this Assignment is achievable through content from the course Labs. The concepts (i.e. how to select a graph) are based on what we did in lecture (and are reinforced in the structure of Lab 4: Lesson 1). So, go back to your class materials. Seriously.

Your best approach would be to think about the following (these aren’t the Assignment questions, just a guide for getting started):

1. Think about the Research Question (given below, in the Assignment Questions section) and the dataset. Which variables in the dataset(s) will you use to answer the question? Use the article’s Materials & Methods sections to understand what the variables/columns in the data represent and the way the variables were measured (i.e. to understand types of variables).

2. “Analyse” the Research Question alongside the variables from the data, like we did in lecture for Topic 6: Summarizing & Exploring Data. How many variables are you working with? What type of variables are they (again, use the information in the article’s Materials & Methods to help you understand the way the variables were measured, and then determine if they are quantitative or categorical)? What is your summary goal: describing a distribution or looking for a relationship between variables?

3. Select a graph type to answer the Research Question based on the number and type of variables. Look back through the summaries we’ve seen in lecture and lab. Pick the one that makes sense! If there is more than one graph type that is appropriate based on the type and number of variables, choose the one that YOU feel most confident making.

BE VERY CAREFUL: I chose the article and dataset because it’s interesting and freely available to us. What the researchers chose to do with their data (e.g. types of graphs/numerical summaries, methods of analysis, etc. in their Results) may not be appropriate for the data they collected (why? The peer-review process that research articles go through isn’t always adequate, unfortunately, and may not have picked up on errors). Apply what we are learning in 2244 to form conclusions about variable types and appropriate summaries; you are marked based on 2244 course concepts. The article is provided just so you can understand the dataset you are working with.

4. Draw a quick sketch of what you want the graph to look like (what variables go where?). Think about your axes titles, etc. Having a clear image of what you are trying to make can help you create the graph in R.

5. Use examples in Lab 4: Lesson 1 (and possibly from lecture Topic 6) to create your graph; those examples have code to show you how to do it! Also, review the Tips for working in R to create a graph file that came with this Assignment BEFORE you start trying to make your graph.

6. Refer back to Lab 4: Lesson 2 for characteristics of good figures for axes titles, figure captions, and colour/symbolism. Apply them properly, where relevant. That’s scholarly communication.
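To make the steps above more concrete, here is a minimal sketch in base R of checking variable types, recoding a categorical variable that R has read as numeric, and drawing a labelled graph. Everything in it is an assumption for illustration: the toy data (dose_mg, height_cm, variety) are invented and unrelated to assign.csv, and the scatterplot is only one graph type for this particular toy combination of variables. Which graph suits the Research Question is still yours to decide from course concepts, and the Lab 4 examples may use different functions.

```r
# Toy data, invented purely for illustration (not from assign.csv).
toy <- data.frame(
  dose_mg   = c(10, 20, 30, 40, 10, 20, 30, 40),
  height_cm = c(5.1, 6.0, 7.2, 8.1, 4.8, 5.5, 6.1, 6.9),
  variety   = c(1, 1, 1, 1, 2, 2, 2, 2)  # numeric codes standing for categories
)

# R treats the codes as numbers; convert them to a factor with readable labels.
toy$variety <- factor(toy$variety, levels = c(1, 2), labels = c("Dwarf", "Tall"))
str(toy)  # confirm: two numeric columns and one factor

# Scatterplot of the two quantitative variables, with one colour and one
# plotting symbol per level of the categorical variable.
cols <- c("black", "grey50")
syms <- c(16, 17)
plot(
  toy$dose_mg, toy$height_cm,
  col  = cols[as.integer(toy$variety)],
  pch  = syms[as.integer(toy$variety)],
  xlab = "Fertilizer dose (mg)",  # axis titles state the variable and its units
  ylab = "Plant height (cm)"
)
legend("topleft", legend = levels(toy$variety),
       col = cols, pch = syms, title = "Variety")
```

Using both colour and symbol to separate the groups keeps the figure readable when printed in greyscale.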
Assignment Questions

Here’s the R output from str(assign) so you can confirm the data imported correctly into R. I have NOT corrected any mistakes R made at identifying variable type.

Increasing screentime in children may influence their physical health; less time spent moving may result in decreased flexibility, but this has yet to be thoroughly investigated. It has been shown that connective tissue and muscle elasticity differ between men and women (i.e. between the sexes), so sex of the individual may influence a putative relationship between screentime and physical health. This leads us to investigate the question,

RESEARCH QUESTION: Does sex of school-age children influence the relationship between time spent on screens and physical flexibility?

Question 1.

Your entire assignment (all answers, code, and graph/output) must be created in an R markdown file, and knitted to .PDF (ideally), or knitted to a Word (.doc) file and subsequently saved to PDF for submission. Don’t forget you have TWO uploads when submitting: to Gradescope, and to OWL Brightspace Assessments/Assignments.

For the Research Question (in the yellow box above), answer the following questions:

a. What are the explanatory and response variable(s) based on the Research Question?
b. Which variable(s) in the dataset (i.e. give the column name(s)) will you use to answer the Research Question? Briefly connect your choices to the variables identified in part a.
c. What type of variable (i.e. categorical or quantitative) is each variable that you have identified in part b?
d. Only if applicable: If you want/need to do any transformation of the variables and/or generation of new variables before creating your graph for Question 2, provide a brief description of what you will be doing and why. All such transformation should be completed with R (with code