Calculating baseball statistics in a file (5 points)

The Lahman Baseball Database is a comprehensive database of Major League Baseball statistics. The journalist Sean Lahman provides all of this data freely to the public. We will make use of some of his data in this assignment. If you would like to learn more about the database, you can visit his website.

We provide you with a CSV file named batting.csv that contains the annual batting performance data for all Major League Baseball players dating back to the year 1871. The first row in the file is a header indicating what data is stored in each column of the file. For example, column 12 is labeled "HR" and contains the number of home runs the player hit that year. Each of the next 99,846 lines contains a comma-separated list of the data for that player and year. For example, the fifth line in the file indicates that a player with the ID allisdo01 hit 2 home runs in 1871.

You should download batting.csv and place it in the same directory as your Python code. Your job will be to write a Python program that finds the player ID of the player with the highest total career RBIs of all time. But be careful: your program should work with any similarly formatted CSV file!

Data Input: Opening the file

First, you will need to read in the data file. You can do this by opening the file using the open function and iterating through each line (remember that you can use either read or readlines; I strongly urge the latter). To parse the data contained in each line, you will need to use the split method. We are interested in two columns, 'playerID' and 'RBI'. Your program should skip the header in the file and completely ignore any lines where the RBI column does not contain a digit.

You should create an accumulator dictionary called career_rbis that maps each player ID string to an integer representing the total number of RBIs for that player. As you iterate through the file, you should update the career_rbis dictionary.
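The reading and accumulating step described above can be sketched roughly as follows. This is only one possible approach, not the assignment's reference solution. To stay self-contained it reads from a small in-memory sample instead of batting.csv; the sample rows, the player IDs aaa01/bbb01/ccc01, the reduced four-column layout, and the helper name accumulate_rbis are all made up for illustration. It also assumes the RBI column can be located by name in the header row, and that "does not contain a digit" can be checked with str.isdigit.

```python
import io

# Invented sample in a reduced layout (assumption: the real batting.csv
# has many more columns, but also has 'playerID' first and an 'RBI' column).
SAMPLE = """playerID,yearID,HR,RBI
aaa01,1871,0,10
bbb01,1871,2,27
bbb01,1872,0,19
ccc01,1871,0,
"""


def accumulate_rbis(lines):
    """Hypothetical helper: map each player ID to total RBIs."""
    career_rbis = {}
    rbi_col = None  # index of the RBI column, learned from the header
    for line in lines:
        values = line.strip().split(',')
        if values[0] == 'playerID':
            rbi_col = values.index('RBI')  # locate the column by name
            continue  # skip the header line
        if rbi_col is None or len(values) <= rbi_col:
            continue  # no header seen yet, or a short/blank line
        batter_id = values[0]
        rbis = values[rbi_col]
        if not rbis.isdigit():
            continue  # ignore rows whose RBI field is empty or non-numeric
        if batter_id not in career_rbis:
            career_rbis[batter_id] = int(rbis)
        else:
            career_rbis[batter_id] = career_rbis[batter_id] + int(rbis)
    return career_rbis


career_rbis = accumulate_rbis(io.StringIO(SAMPLE).readlines())
print(career_rbis)
```

With a real file, the same helper could presumably be fed the result of calling readlines on the object returned by open('batting.csv'). The row for ccc01 is skipped because its RBI field is empty, so that ID never enters the dictionary.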
Duplicate entries should be summed.

Data Processing: Finding the most RBIs

After reading in the data and generating the career_rbis dictionary, you should next iterate through the dictionary to find the player with the most career RBIs (duplicate entries having been summed while reading the file). You will need two accumulator variables to track both the most RBIs you've seen so far AND the player having that many RBIs. Store an integer representing the highest number of RBIs in a variable named max_rbis and the corresponding player ID string in a variable named max_player.

DO NOT try to write your code here. Debugging it here will be very difficult. You should write and test your code on an EWS computer or your own computer.

Using files and directories

When you submit code here, you should use open('batting.csv') with no directory path. On your own machine things may behave differently. Briefly, the best thing to do is to figure out where Python is running and move your file batting.csv there. Otherwise you can try to find out where your file is located and refer to it directly using the code shared in lecture. You can also use the smaller batting_test.csv to test your code.

Files:
• batting.csv
• batting_test.csv

Your submission should include the following variables defined correctly:
• career_rbis
• max_rbis
• max_player

Starter code

    career_rbis = {}

    # Open the file. Call it batting_file.
    ???

    # Read the data from the file using a for loop.
    for line in batting_file.???:
        line = line.strip()  # remove whitespace from the line
        values = ???  # split the line by commas
        if values[0] == 'playerID':
            continue  # skip the header line
        batter_id = ???  # get the batter_id
        rbis = ???  # get the RBIs

        # ignore non-digit RBIs
        ???

        # check if batter_id is in career_rbis
        if batter_id not in career_rbis:
            # add it
            ???
        else:
            # add the RBIs
            career_rbis[batter_id] = ???

    # Find the player with the maximum RBIs. This will probably take
    # several lines of code and some serious thought on your part.
    ???
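The final step, finding the maximum with two accumulator variables, might look something like the sketch below. It is a minimal illustration rather than the expected answer: the dictionary contents are invented totals, the helper name find_max_player is hypothetical, and starting max_rbis at -1 assumes RBI totals are never negative.

```python
def find_max_player(career_rbis):
    """Hypothetical helper: return (max_player, max_rbis) for a totals dict."""
    max_rbis = -1      # assumption: real totals are never negative
    max_player = None  # stays None if the dictionary is empty
    for player, total in career_rbis.items():
        if total > max_rbis:
            # New best so far: update both accumulators together.
            max_rbis = total
            max_player = player
    return max_player, max_rbis


# Invented totals, for illustration only.
career_rbis = {'aaa01': 10, 'bbb01': 46, 'ccc01': 0}
max_player, max_rbis = find_max_player(career_rbis)
print(max_player, max_rbis)
```

Because the comparison is strict, a tie keeps whichever player was encountered first during iteration; whether that is the desired tie-breaking rule is not stated in the assignment.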
