Question
Use a decision tree to further explore the dataset, where the dependent variable is ‘smoker’. Please explain the approach taken. [No more than 300 words]

Transcribed Image Text:

| Person | age | sex | bmi | children | smoker | region | charges |
|---|---|---|---|---|---|---|---|
| 1 | 19 | female | 27.9 | 0 | yes | southwest | 16884.92 |
| 2 | 18 | male | 33.77 | 1 | no | southeast | 1725.552 |
| 3 | 28 | male | 33 | 3 | no | southeast | 4449.462 |
| 4 | 33 | male | 22.705 | 0 | no | northwest | 21984.47 |
| 5 | 32 | male | 28.88 | 0 | no | northwest | 3866.855 |
| 6 | 31 | F | 25.74 | 0 | no | southeast | 3756.622 |
| 7 | 46 | F | 33.44 | 1 | no | southeast | 8240.59 |
| 8 | 37 | F | 27.74 | 3 | no | northwest | 7281.506 |
| 9 | 37 | M | 29.83 | 2 | no | northeast | 6406.411 |
| 10 | 60 | F | 25.84 | 0 | no | northwest | 28923.14 |
| 11 | 25 | M | 26.22 | 0 | no | northeast | 2721.321 |
| 12 | 62 | female | 26.29 | 0 | yes | southeast | 27808.73 |
| 13 | 23 | male | 34.4 | 0 | no | southwest | 1826.843 |
| 14 | 56 | female | 39.82 | 0 | no | southeast | 11090.72 |
| 15 | 27 | male | 42.13 | 0 | yes | southeast | 39611.76 |
| 16 | 19 | male | 24.6 | 1 | no | southwest | 1837.237 |
| 17 | 52 | female | 30.78 | 1 | no | northeast | 10797.34 |
| 18 | 23 | male | 23.845 | 0 | no | northeast | 2395.172 |
| 19 | 56 | male | 40.3 | 0 | no | southwest | 10602.39 |
| 20 | 30 | male | 35.3 | 0 | yes | southwest | 36837.47 |
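One way to explain the approach is to show how a decision tree scores a candidate split. The sketch below is only an illustration, not the expert's answer: it computes the Gini impurity of the `smoker` label before and after splitting the 20 transcribed rows on `charges`. The threshold of 14000.0 and the names `gini`, `data`, `left` and `right` are assumptions made for this example.

```python
def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

# (charges, smoker) pairs transcribed from the question's spreadsheet
data = [
    (16884.92, "yes"), (1725.552, "no"), (4449.462, "no"), (21984.47, "no"),
    (3866.855, "no"), (3756.622, "no"), (8240.59, "no"), (7281.506, "no"),
    (6406.411, "no"), (28923.14, "no"), (2721.321, "no"), (27808.73, "yes"),
    (1826.843, "no"), (11090.72, "no"), (39611.76, "yes"), (1837.237, "no"),
    (10797.34, "no"), (2395.172, "no"), (10602.39, "no"), (36837.47, "yes"),
]

threshold = 14000.0  # hypothetical cut-off, chosen only for illustration

# Partition the labels by the candidate split on charges
left = [s for c, s in data if c <= threshold]
right = [s for c, s in data if c > threshold]

# Impurity before the split, and the size-weighted impurity after it
root_impurity = gini([s for _, s in data])
split_impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(data)

print(f"root Gini:  {root_impurity:.3f}")
print(f"split Gini: {split_impurity:.3f}")
```

A tree-growing algorithm would repeat this calculation for every feature and threshold, keep the split with the lowest weighted impurity, and then recurse on each side until the nodes are pure or a stopping rule (such as a maximum depth) is reached.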

Follow-up Questions
Read through expert solutions to related follow-up questions below.
Follow-up Question
Can you generate the Python code for me so that I can generate the decision tree in a Jupyter notebook? An example of a decision tree has been attached as an image. Explain using the variables inside the decision tree.
![Example decision tree, node text mostly unreadable: nodes split on charges and age and report samples, value and a Smoker or Non-Smoker class](https://content.bartleby.com/qna-images/question/67508e39-b920-4772-b081-cb3f5e5efc60/a03a3651-5d59-43ed-a63a-2518a549bba3/bs5jbzo_thumbnail.jpeg)
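A minimal sketch of what such notebook code could look like, assuming scikit-learn is installed and using only the 20 rows transcribed from the spreadsheet. The encoding choices (a single `sex_male` flag, one-hot region columns), the names `encode`, `REGIONS` and `FEATURES`, and the `max_depth=3` setting are assumptions for illustration, not the expert's solution.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# (age, sex, bmi, children, smoker, region, charges), as transcribed
rows = [
    (19, "female", 27.9, 0, "yes", "southwest", 16884.92),
    (18, "male", 33.77, 1, "no", "southeast", 1725.552),
    (28, "male", 33.0, 3, "no", "southeast", 4449.462),
    (33, "male", 22.705, 0, "no", "northwest", 21984.47),
    (32, "male", 28.88, 0, "no", "northwest", 3866.855),
    (31, "F", 25.74, 0, "no", "southeast", 3756.622),
    (46, "F", 33.44, 1, "no", "southeast", 8240.59),
    (37, "F", 27.74, 3, "no", "northwest", 7281.506),
    (37, "M", 29.83, 2, "no", "northeast", 6406.411),
    (60, "F", 25.84, 0, "no", "northwest", 28923.14),
    (25, "M", 26.22, 0, "no", "northeast", 2721.321),
    (62, "female", 26.29, 0, "yes", "southeast", 27808.73),
    (23, "male", 34.4, 0, "no", "southwest", 1826.843),
    (56, "female", 39.82, 0, "no", "southeast", 11090.72),
    (27, "male", 42.13, 0, "yes", "southeast", 39611.76),
    (19, "male", 24.6, 1, "no", "southwest", 1837.237),
    (52, "female", 30.78, 1, "no", "northeast", 10797.34),
    (23, "male", 23.845, 0, "no", "northeast", 2395.172),
    (56, "male", 40.3, 0, "no", "southwest", 10602.39),
    (30, "male", 35.3, 0, "yes", "southwest", 36837.47),
]

REGIONS = ["northeast", "northwest", "southeast", "southwest"]
FEATURES = ["age", "sex_male", "bmi", "children", "charges"] + [
    "region_" + r for r in REGIONS
]

def encode(row):
    """Turn one record into a numeric feature vector (smoker is excluded)."""
    age, sex, bmi, children, _smoker, region, charges = row
    # The sheet mixes "M"/"male" and "F"/"female"; normalise to one 0/1 flag
    is_male = 1 if sex.lower().startswith("m") else 0
    region_flags = [1 if region == r else 0 for r in REGIONS]
    return [age, is_male, bmi, children, charges] + region_flags

X = [encode(r) for r in rows]
y = [r[4] for r in rows]  # target: "yes" / "no"

# A shallow tree keeps the result readable; the depth is an arbitrary choice
clf = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
clf.fit(X, y)

# Text rendering of the fitted tree, one line per node
print(export_text(clf, feature_names=FEATURES))
```

Each printed line is a test such as `charges <= ...`: rows satisfying it follow the first branch, the others follow the second, and each leaf reports the predicted class. For a graphical tree like the attached example, `sklearn.tree.plot_tree(clf, feature_names=FEATURES, class_names=list(clf.classes_), filled=True)` can be called in the notebook, which additionally requires matplotlib. With only 20 rows the fitted tree should be read as exploratory, not as a validated model.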