4 & 5 python Project 5 – Data Plots

Question

Project 5 – Data Plots

Objective: To work with a data set and generate graphical charts. To gain experience with the pandas and matplotlib modules/libraries.

Description: You will perform a simple data analysis given a CSV file of Amazon rain forest data. The data set contains the number of fires that have occurred over an 18-year period in the Amazon. Working with raw data is typical in Data Science. You will need to clean the data before visualizing it in the form of a plot or bar chart.

About the data:
• year is the year when the forest fire happened.
• state is the Brazilian state.
• month is the month when the forest fire happened.
• number is the number of forest fires reported. You will notice that the format of the numbers is not US standard. In many countries, thousands are represented with a period, so 2.588 is really 2588 forest fires. Don't fix the data in Excel. You will correct for this on the data import.
• date is the date when the forest fire was reported.

In this project, you will:
• First perform a high-level analysis of the data using pandas functions.
• Clean the data, looking for missing values.
• Create subsets of the data.
• Then plot the data.

Detailed Instruction Steps

It is recommended that you go through this project step by step. Make sure each step works before moving to the next one. Use a lot of print() commands to show results at each intermediate step. These can be commented out once you have the plot working.

1. This project needs to import the pandas, matplotlib.pyplot, and numpy modules/libraries.
2. Open the data file using the pandas read_csv function. Research how to convert decimal thousands to regular numbers. There is an input parameter available to handle this formatting.
3. Perform high-level data summaries:
   a. Use the pandas shape attribute to print the number of rows and columns in the data set.
   b. Use the pandas head() and tail() functions to print the first and last row sets.
   c. Use the pandas describe() function to provide some simple statistics about the data. Research options to show all statistics available. Print this.
4. Clean the data
   a. Check for missing data values. Use the isna() method with sum() to identify missing cells. The result should look like this:
        year      0
        state     0
        month     0
        number    0
        date      0
        dtype: int64
   b. The goal is to generate a bar chart with a count of the number of fires per month. Since there are months with 0 fires, you can eliminate these values from the data set. First, use the replace function to replace 0s with NaN values (Not a Number). Use the np.nan value as the replacement value. Print the head() of the data to see the NaN values.
   c. To remove the lines, use the dropna() function. This function looks for NaN values in a specific column. Research how to specify a column as the input parameter. Use the "number" column.
5. Group the data
   a. The goal in this step is to create a pandas series to be used in the chart. The data must be transformed so that there are totals by month. Research the pandas groupby() function syntax. The goal is to specify the number column as a list key and then use the sum() function to get the totals. Assign the results of the groupby() function to a new variable, which is the data series.
   b. Use the print() command for the variable in (a). This should show you totals for each month, in alphabetical order.
   c. The data needs to be sorted with January being first. Note that the original CSV file is sorted correctly by month. Use the following command to create a list of unique months from the data set: months_unique = list(data.month.unique())
   d. Use the pandas reindex() function on the variable in (a). Use months_unique from (c) as input.
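Steps 1 to 3 can be sketched roughly as follows. This is a minimal illustration, not the assignment's reference solution. The real CSV file is not available here, so the block reads a small in-memory sample whose rows are invented for illustration. Passing decimal="," alongside thousands="." is an assumption made so the two separators do not clash; the assignment only asks for the thousands parameter.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the real CSV file
# (all values are invented for illustration only).
SAMPLE_CSV = """year,state,month,number,date
1998,Acre,January,0,1998-01-01
1998,Acre,February,2.588,1998-01-01
1999,Bahia,January,15,1999-01-01
1999,Bahia,February,1.204,1999-01-01
"""

# thousands="." makes read_csv treat the period as a thousands separator,
# so "2.588" is read as 2588. decimal="," is an added assumption that keeps
# the decimal marker different from the thousands marker.
data = pd.read_csv(io.StringIO(SAMPLE_CSV), thousands=".", decimal=",")

# Step 3a: shape is an attribute (rows, columns), not a method call.
print(data.shape)

# Step 3b: first and last rows.
print(data.head())
print(data.tail())

# Step 3c: include="all" also summarizes the non-numeric columns.
print(data.describe(include="all"))
```

With the real file, the io.StringIO(...) argument would be replaced by the path to the CSV.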

Accepted Answer

Given:
Amazon forest fire dataset.
Problem given:
To clean the data
To find NaN values
To…
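A minimal sketch of the cleaning and grouping described in steps 4 and 5 of the question might look like the block below. The small DataFrame is a hypothetical stand-in for the imported data (its values are invented), and the variable names cleaned and fires_by_month are illustrative choices, not names from the assignment. Restricting replace() to the "number" column is also an assumption; the assignment does not say whether to apply it to the whole data set.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the imported data set
# (all values are invented for illustration only).
data = pd.DataFrame({
    "year": [1998, 1998, 1998, 1999, 1999, 1999],
    "state": ["Acre", "Acre", "Acre", "Bahia", "Bahia", "Bahia"],
    "month": ["January", "February", "March"] * 2,
    "number": [0, 2588, 10, 15, 1204, 0],
})

# Step 4a: count missing cells in each column.
print(data.isna().sum())

# Step 5c, done early: capture the month order before any rows are dropped,
# because dropping rows can change which month appears first.
months_unique = list(data.month.unique())

# Step 4b: replace zeros in the "number" column with NaN.
cleaned = data.copy()
cleaned["number"] = cleaned["number"].replace(0, np.nan)
print(cleaned.head())

# Step 4c: drop rows where "number" is NaN.
cleaned = cleaned.dropna(subset=["number"])

# Step 5a/5b: total fires per month; groupby sorts the months alphabetically.
fires_by_month = cleaned.groupby("month")["number"].sum()
print(fires_by_month)

# Step 5d: restore calendar order using the list from step 5c.
fires_by_month = fires_by_month.reindex(months_unique)
print(fires_by_month)
```

Taking months_unique from the data before dropna() is a deliberate choice here: in this sample the first January row has 0 fires, so building the list after cleaning would put February first.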