IST 687 HW3
School: Syracuse University
Course: 687
Subject: Computer Science
Date: Dec 6, 2023
11/29/23, 8:10 PM
HW3.knit
file:///C:/Users/empet/OneDrive/Desktop/Intro to Data Science/Elyse_Peterson_HW3.knit.html
1/10
Intro to Data Science - HW 3
Copyright Jeffrey Stanton, Jeffrey Saltz, and Jasmina Tacheva
# Enter your name here: Elyse Peterson
Attribution statement: (choose only one and delete the rest)
# 1. I did this homework by myself, with help from the book and the professor.
Reminders of things to practice from last week:
Make a data frame: data.frame( )
Row index of max/min: which.max( ), which.min( )
Sort values or order rows: sort( ), order( )
Descriptive statistics: mean( ), sum( ), max( )
Conditional statement: if (condition) “true stuff” else “false stuff”
This Week:
Often, when you get a dataset, it is not in the format you want. You can (and should) use code to refine the dataset
to become more useful. As Chapter 6 of Introduction to Data Science mentions, this is called “data munging.” In
this homework, you will read in a dataset from the web and work on it (in a data frame) to improve its usefulness.
Part 1: Use read_csv( ) to read a CSV file from the web into a data frame:
A. Use R code to read directly from a URL on the web. Store the dataset into a new dataframe, called dfComps.
The URL is: “https://intro-datascience.s3.us-east-2.amazonaws.com/companies1.csv”
Hint: use read_csv( ), not read.csv( ). This is from the tidyverse package. Check the help to compare them.
library(readr)
urlToRead <- "https://intro-datascience.s3.us-east-2.amazonaws.com/companies1.csv"
dfcomps <- read_csv(url(urlToRead))
## Rows: 47758 Columns: 18
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (16): permalink, name, homepage_url, category_list, market, funding_tota...
## dbl (2): funding_rounds, founded_year
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(dfcomps)
## # A tibble: 6 × 18
## permalink name homepage_url category_list market funding_total_usd status
## <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 /organizatio… #way… http://www.… |Entertainme… News 1 750 000 acqui…
## 2 /organizatio… &TV … http://enjo… |Games| Games 4 000 000 opera…
## 3 /organizatio… 'Roc… http://www.… |Publishing|… Publi… 40 000 opera…
## 4 /organizatio… (In)… http://www.… |Electronics… Elect… 1 500 000 opera…
## 5 /organizatio… #NAM… http://plus… |Software| Softw… 1 200 000 opera…
## 6 /organizatio… -R- … <NA> |Entertainme… Games 10 000 opera…
## # ℹ 11 more variables: country_code <chr>, state_code <chr>, region <chr>,
## # city <chr>, funding_rounds <dbl>, founded_at <chr>, founded_month <chr>,
## # founded_quarter <chr>, founded_year <dbl>, first_funding_at <chr>,
## # last_funding_at <chr>
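To make the hint about read_csv( ) versus read.csv( ) concrete, here is a minimal sketch that reads the same small file with both functions. The file contents and the names tmp, baseDf, and tidyDf are made up for illustration and are not part of the homework data.

```r
library(readr)

# Hypothetical three-row CSV written to a temporary file (not the companies data)
tmp <- tempfile(fileext = ".csv")
writeLines(c("name,funding_rounds",
             "Alpha,1",
             "Beta,2",
             "Gamma,NA"), tmp)

baseDf <- read.csv(tmp)                          # base R: returns a plain data.frame
tidyDf <- read_csv(tmp, show_col_types = FALSE)  # readr: returns a tibble

class(baseDf)  # a plain data.frame
class(tidyDf)  # includes "tbl_df", and still inherits from data.frame
```

Both calls treat the literal text NA as a missing value here. The visible difference is the return type, which is why head( ) on the tibble prints the compact, truncated table seen above.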
Part 2: Create a new data frame that only contains companies with a homepage URL:
E. Use subsetting to create a new dataframe that contains only the companies with homepage URLs (store that dataframe in urlComps).
urlcomps <- subset(dfcomps,complete.cases(dfcomps$homepage_url))
head(urlcomps)
## # A tibble: 6 × 18
## permalink name homepage_url category_list market funding_total_usd status
## <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 /organizatio… #way… http://www.… |Entertainme… News 1 750 000 acqui…
## 2 /organizatio… &TV … http://enjo… |Games| Games 4 000 000 opera…
## 3 /organizatio… 'Roc… http://www.… |Publishing|… Publi… 40 000 opera…
## 4 /organizatio… (In)… http://www.… |Electronics… Elect… 1 500 000 opera…
## 5 /organizatio… #NAM… http://plus… |Software| Softw… 1 200 000 opera…
## 6 /organizatio… .Clu… http://nic.… |Software| Softw… 7 000 000 <NA>
## # ℹ 11 more variables: country_code <chr>, state_code <chr>, region <chr>,
## # city <chr>, funding_rounds <dbl>, founded_at <chr>, founded_month <chr>,
## # founded_quarter <chr>, founded_year <dbl>, first_funding_at <chr>,
## # last_funding_at <chr>
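The subsetting step can also be written with is.na( ) and bracket indexing. The sketch below uses a made-up four-row data frame (toyComps and its values are hypothetical) to show that both forms keep the same rows.

```r
# Hypothetical toy data: two of the four companies have no homepage URL
toyComps <- data.frame(
  name = c("Alpha", "Beta", "Gamma", "Delta"),
  homepage_url = c("http://alpha.example", NA, "http://gamma.example", NA)
)

# Bracket indexing: keep only rows where homepage_url is not missing
toyUrl <- toyComps[!is.na(toyComps$homepage_url), ]

# subset() with the same condition selects the same rows
toyUrl2 <- subset(toyComps, !is.na(homepage_url))

nrow(toyUrl)
```

For a single column, complete.cases(x) and !is.na(x) give the same logical vector, so the choice between them is mostly a matter of readability.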
D. How many companies are missing a homepage URL?
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ purrr 1.0.2
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.4 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
count(dfcomps)
## # A tibble: 1 × 1
## n
## <int>
## 1 47758
count(urlcomps)
## # A tibble: 1 × 1
## n
## <int>
## 1 44435
count(dfcomps)-count(urlcomps)
## n
## 1 3323
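The difference of row counts above is one way to answer the question. A more direct check is to count the NA values in the column itself. This is a sketch on made-up data (toyComps is hypothetical), not a rerun of the homework.

```r
# Hypothetical toy data: two of the four companies have no homepage URL
toyComps <- data.frame(
  name = c("Alpha", "Beta", "Gamma", "Delta"),
  homepage_url = c("http://alpha.example", NA, "http://gamma.example", NA)
)

# Direct count: is.na() gives TRUE/FALSE, and sum() counts the TRUE values
missingCount <- sum(is.na(toyComps$homepage_url))

# Same answer via the row-count difference used in the homework
rowDiff <- nrow(toyComps) - nrow(toyComps[!is.na(toyComps$homepage_url), ])

missingCount
rowDiff
```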
Part 3: Analyze the numeric variables in the dataframe.
G. How many numeric variables does the dataframe have? You can figure that out by looking at the output of str(urlComps).
H. What is the average number of funding rounds for the companies in urlComps?
str(urlcomps)
## tibble [44,435 × 18] (S3: tbl_df/tbl/data.frame)
## $ permalink : chr [1:44435] "/organization/waywire" "/organization/tv-communications" "/organization/rock-your-paper" "/organization/in-touch-network" ...
## $ name : chr [1:44435] "#waywire" "&TV Communications" "'Rock' Your Paper" "(In)Touch Network" ...
## $ homepage_url : chr [1:44435] "http://www.waywire.com" "http://enjoyandtv.com" "http://www.rockyourpaper.org" "http://www.InTouchNetwork.com" ...
## $ category_list : chr [1:44435] "|Entertainment|Politics|Social Media|News|" "|Games|" "|Publishing|Education|" "|Electronics|Guides|Coffee|Restaurants|Music|iPhone|Apps|Mobile|iOS|E-Commerce|" ...
## $ market : chr [1:44435] "News" "Games" "Publishing" "Electronics" ...
## $ funding_total_usd: chr [1:44435] "1 750 000" "4 000 000" "40 000" "1 500 000" ...
## $ status : chr [1:44435] "acquired" "operating" "operating" "operating" ...
## $ country_code : chr [1:44435] "USA" "USA" "EST" "GBR" ...
## $ state_code : chr [1:44435] "NY" "CA" NA NA ...
## $ region : chr [1:44435] "New York City" "Los Angeles" "Tallinn" "London" ...
## $ city : chr [1:44435] "New York" "Los Angeles" "Tallinn" "London" ...
## $ funding_rounds : num [1:44435] 1 2 1 1 2 1 1 1 1 1 ...
## $ founded_at : chr [1:44435] "1/6/12" NA "26/10/2012" "1/4/11" ...
## $ founded_month : chr [1:44435] "2012-06" NA "2012-10" "2011-04" ...
## $ founded_quarter : chr [1:44435] "2012-Q2" NA "2012-Q4" "2011-Q2" ...
## $ founded_year : num [1:44435] 2012 NA 2012 2011 2012 ...
## $ first_funding_at : chr [1:44435] "30/06/2012" "4/6/10" "9/8/12" "1/4/11" ...
## $ last_funding_at : chr [1:44435] "30/06/2012" "23/09/2010" "9/8/12" "1/4/11" ...
# There are 2 numeric fields: funding_rounds and founded_year.
mean(urlcomps$funding_rounds)
## [1] 1.725194
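Reading the str( ) output by eye works for 18 columns. As a cross-check, the numeric columns can also be counted in code. The sketch below assumes a small made-up data frame (toyComps is hypothetical) with the same two numeric column names.

```r
# Hypothetical toy data with one character and two numeric columns
toyComps <- data.frame(
  name = c("Alpha", "Beta", "Gamma"),
  funding_rounds = c(1, 2, 3),
  founded_year = c(2012, NA, 2011)
)

# is.numeric() is applied to each column; the result is a named logical vector
isNum <- sapply(toyComps, is.numeric)

numericCount <- sum(isNum)   # how many columns are numeric
names(toyComps)[isNum]       # which columns they are

# funding_rounds has no missing values here, so mean() needs no na.rm argument
avgRounds <- mean(toyComps$funding_rounds)
```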
I. What year was the oldest company in the dataframe founded?
Hint: If you get a value of “NA,” most likely there are missing values in this variable which preclude R from
properly calculating the min & max values. You can ignore NAs with basic math calculations. For example,
instead of running mean(urlComps$founded_year), something like this will work for determining the average
(note that this question needs to use a different function than ‘mean’):
#mean(urlComps$founded_year, na.rm=TRUE)
min(na.omit(dfcomps$founded_year))
## [1] 1900
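The hint's na.rm argument works for min( ) as well, as an alternative to wrapping the column in na.omit( ). A minimal sketch on a made-up vector (foundedYear and its values are hypothetical):

```r
# Hypothetical founding years with one missing value
foundedYear <- c(2012, NA, 1999, 2011)

min(foundedYear)                          # NA, because one value is missing

oldest <- min(foundedYear, na.rm = TRUE)  # drop NAs before taking the minimum

# which.min() discards missing values and returns the position of the minimum
oldestRow <- which.min(foundedYear)
```

Using which.min( ) as a row index would then retrieve the full record of the oldest company, which ties back to last week's reminders.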
Part 4: Use string operations to clean the data.
K. The permalink variable in urlComps contains the name of each company but the names are currently preceded by the prefix “/organization/”. We can use str_replace() in tidyverse or gsub() to clean the values of
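As a rough sketch of the prefix removal described above, the example below applies gsub( ) to two permalink values taken from the str( ) output. The variable names permalink and cleaned are hypothetical, and this is one possible approach rather than the assignment's expected answer.

```r
# Two permalink values as they appear in the str() output
permalink <- c("/organization/waywire", "/organization/tv-communications")

# "^" anchors the pattern to the start of the string, so only a leading
# "/organization/" is removed
cleaned <- gsub("^/organization/", "", permalink)

# stringr equivalent (requires library(stringr)):
# str_replace(permalink, "^/organization/", "")

cleaned
```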