Create a dataframe variable 'a' with this dataset. This dataframe should have all the 569 instances, 30 features and the class of 569 instances as 0 (Malignant) or 1 (Benign). The column that contains the classes should be labeled as 'typeofcancer'. Show the output of the following input: In [13]: a.shape

Question

I need help with the third part only. I already tried it, but my code doesn't match the question. This is about my fifth time trying this, and it keeps getting rejected.

Create a dataframe variable 'a' with this dataset. This dataframe should have all the 569 instances, 30 features and the class of 569 instances as 0 (Malignant) or 1 (Benign). The column that contains the classes should be labeled as 'typeofcancer'. Show the output of the following input:
In [13]: a.shape
[Hint: the output should be the same as below.]
Out [13]: (569, 31)

(b) Now create a dataframe variable 'df' by slicing dataframe 'a'. The new dataframe 'df' should have all the instances and their labels, but only the following three features: mean radius, mean perimeter and mean area. [Hint: use the .iloc method to extract the necessary columns from 'a'.]

(i) Show the first two rows.
[Hint: the output should be the same as below.]
Out [16]:
   mean radius  mean perimeter  mean area  typeofcancer
0        17.99           122.8     1001.0             0
1        20.57           132.9     1326.0             0

(ii) Show the rows with indexes 17, 18, 19, 20, 21.

(iii) Suppose we want to explore the possibility of developing a machine learning model that can diagnose a new patient's cancer condition as benign or malignant from the features in df. As a first step, you want to do some graphical analysis. Write the code to generate the following figure (Figure 1). Show a screenshot of the code (input) and the figure (output) from your work. You are free to choose your favorite data marker and color in your figure.

[Figure 1 image: three side-by-side panels. Left panel x-axis: mean radius, legend: c0(Malignant), c1(Benign). Middle panel x-axis: mean perimeter, y-axis: mean radius, legend: c0(M), c1(B). Right panel x-axis: mean area, y-axis: mean radius, legend: c0(M), c1(B).]

Figure 1: From left to right: histogram of 'mean radius' data for each class, scatter plot of 'mean radius' versus 'mean perimeter', scatter plot of 'mean radius' versus 'mean area'.

(iv) Briefly describe what each of the subplots in Figure 1 reveals about the data.
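
Parts (a) and (b) can be sketched as follows. This is a minimal sketch, assuming "this dataset" is scikit-learn's built-in breast cancer dataset (which the attempt below also loads). The names `wanted` and `positions` are introduced here for illustration only; they look up the integer column positions that `.iloc` needs instead of hard-coding them.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

# Part (a): 569 rows, 30 feature columns plus one label column
cancer = load_breast_cancer()
a = pd.DataFrame(cancer.data, columns=cancer.feature_names)
a["typeofcancer"] = cancer.target  # 0 = malignant, 1 = benign

# Part (b): find the integer positions of the requested columns,
# then slice with .iloc as the hint suggests
wanted = ["mean radius", "mean perimeter", "mean area", "typeofcancer"]
positions = a.columns.get_indexer(wanted)
df = a.iloc[:, positions]

print(a.shape)
print(df.head(2))       # part (b)(i)
print(df.iloc[17:22])   # part (b)(ii): rows 17 through 21
```

Looking the positions up by name gives the same columns as the hard-coded list `[0, 2, 3, -1]`, but it keeps working if the column order ever changes.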
In [11]:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
import matplotlib.pyplot as plt
import sklearn
from sklearn.datasets import load_breast_cancer
breast_cancer = load_breast_cancer(return_X_y=True, as_frame=True)
a = breast_cancer[0]
b = breast_cancer[1]
a['classes'] = b

In [12]:
cancer = sklearn.datasets.load_breast_cancer()
a = pd.DataFrame(cancer['data'], columns=cancer.feature_names)
a['typeofcancer'] = cancer['target']

In [13]:
print(a.shape)
# creating a dataframe df
(569, 31)

In [14]:
df = a.iloc[:, [0, 2, 3, -1]]

In [15]:
df.head(2)
Out [15]:
   mean radius  mean perimeter  mean area  typeofcancer
0        17.99           122.8     1001.0             0
1        20.57           132.9     1326.0             0

In [16]:
df.iloc[[17, 18, 19, 20, 21], :]
Out [16]:
    mean radius  mean perimeter  mean area  typeofcancer
17       16.130          108.10      798.8             0
18       19.810          130.00     1260.0             0
19       13.540           87.46      566.3             1
20       13.080           85.63      520.0             1
21        9.504           60.34      273.9             1

In [17]:
asorted = a.sort_values('typeofcancer', ignore_index=True)
asorted;

In [18]:
asorted.typeofcancer.value_counts()
Out [18]:
1    357
0    212
Name: typeofcancer, dtype: int64

In [19]:
f0, f1 = asorted.typeofcancer.value_counts()
f0, f1
Out [19]: (357, 212)

In [20]:
fig, axs = plt.subplots(figsize=(10, 2.5))
axs1 = plt.subplot2grid(shape=(1, 3), loc=(0, 0))
axs2 = plt.subplot2grid(shape=(1, 3), loc=(0, 1))
axs3 = plt.subplot2grid(shape=(1, 3), loc=(0, 2))
plt.tight_layout()
axs1.hist(a.iloc[0:f0, 0], edgecolor='r', fc='none', label='c0')
axs1.hist(a.iloc[f0:f0+f1, 0], edgecolor='b', fc='none', label='c1')
axs1.set_xlabel('mean radius')
axs1.set_ylabel('frequency')
axs1.legend()
axs2.scatter(a.iloc[0:f0, 2], a.iloc[0:f0, 0], label='c0(M)')
axs2.scatter(a.iloc[f0:f0+f1, 2], a.iloc[f0:f0+f1, 0], label='c1(B)')
axs2.set_xlabel('mean perimeter')
axs2.set_ylabel('mean radius')
axs2.legend()
axs3.scatter(a.iloc[0:f0, 3], a.iloc[0:f0, 0], label='c0(M)')
axs3.scatter(a.iloc[f0:f0+f1, 3], a.iloc[f0:f0+f1, 0], label='c1(B)')
axs3.set_xlabel('mean area')
axs3.set_ylabel('mean radius')
axs3.legend()
Out [20]: <matplotlib.legend.Legend at 0x7fa5c12a0f70>

[Figure: output of In [20]. Three panels: histogram of mean radius, scatter plot of mean radius versus mean perimeter, scatter plot of mean radius versus mean area, each with legend entries for c0(M) and c1(B).]
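
Two things in the attempt above may explain why the figure does not match. First, `value_counts()` orders its result by frequency, so `f0, f1` come out as `(357, 212)`: `f0` is the benign (class 1) count, not the malignant count. Second, the plotting cell slices `a`, which is still in its original row order, rather than `asorted`, so `a.iloc[0:f0]` mixes both classes. The sketch below is one possible way to avoid both problems by selecting each class with a boolean mask. The names `c0`, `c1`, `ax1`, `ax2` and `ax3` are my own, and the colors and markers are arbitrary, as the question allows.

```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()
a = pd.DataFrame(cancer.data, columns=cancer.feature_names)
a["typeofcancer"] = cancer.target
df = a.iloc[:, [0, 2, 3, -1]]  # mean radius, mean perimeter, mean area, label

# Boolean masks pick out each class directly; no sorting or row counting needed
c0 = df[df["typeofcancer"] == 0]  # malignant
c1 = df[df["typeofcancer"] == 1]  # benign

# One row of three axes, created in a single call
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(10, 2.5))

# Left panel: outlined histograms of mean radius, one per class
ax1.hist(c0["mean radius"], edgecolor="r", facecolor="none", label="c0(Malignant)")
ax1.hist(c1["mean radius"], edgecolor="b", facecolor="none", label="c1(Benign)")
ax1.set_xlabel("mean radius")
ax1.set_ylabel("frequency")
ax1.legend()

# Middle and right panels: mean radius against the other two features
for ax, feature in ((ax2, "mean perimeter"), (ax3, "mean area")):
    ax.scatter(c0[feature], c0["mean radius"], label="c0(M)")
    ax.scatter(c1[feature], c1["mean radius"], label="c1(B)")
    ax.set_xlabel(feature)
    ax.set_ylabel("mean radius")
    ax.legend()

fig.tight_layout()
```

In a notebook the figure is displayed when the cell finishes; in a plain script, call `plt.show()` afterwards. Using `plt.subplots(1, 3, ...)` also avoids creating an extra empty axes, which the combination of `plt.subplots()` and `plt.subplot2grid()` in the attempt can leave behind.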