Module 3_Part2
School: Georgia Institute of Technology
Course: 6414
Subject: Statistics
Date: Apr 3, 2024
It is important to remember that in this model, we do not have an error term!
Slide 9:
What are the model assumptions? The first assumption is linearity of the link function of the probability of success in the predicting variables; that is, we write the g function of the probability of success as a linear combination of the predicting variables. Although I'm going to refer to this as a linearity assumption, it is different from the linearity assumption in the regression model we learned in the previous modules, since the link function g is a non-linear transformation of the probability of success, or of the expectation of the response variable. Similar to the standard regression model, we also assume independence in the response data.
The third assumption is specific to the logistic regression model. The logistic regression model assumes that the link function is the so-called logit function, provided here on the slide. The link function g is the log of the ratio of p over one minus p, where p again is the probability of success. This is an assumption since the logit function is not the only function that yields S-shaped curves.
There are other S-shaped functions that are used in modeling binary responses, under a more general model framework called the binomial model. We'll learn about other S-shaped functions in a different lesson.
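To make the remark about S-shaped curves concrete, here is a minimal Python sketch (Python is used only for illustration; the course slides use R). It evaluates the inverse of the logit link next to one other S-shaped curve, the standard normal CDF. The choice of the normal CDF as the comparison curve, the function names, and the grid of values are assumptions made for this example and are not taken from the slides.

```python
import math

def logit(p):
    """Logit link: log of the odds p / (1 - p), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(eta):
    """Inverse of the logit link: maps any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def normal_cdf(eta):
    """Standard normal CDF, a second S-shaped curve (assumed comparison)."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

# Illustrative grid of linear-predictor values (hypothetical, not from the data).
grid = [-4, -2, -1, 0, 1, 2, 4]
for eta in grid:
    print(f"{eta:>3}  inverse logit = {inverse_logit(eta):.3f}  "
          f"normal CDF = {normal_cdf(eta):.3f}")
```

Both curves rise monotonically from near 0 to near 1 and pass through 0.5 at zero, which is why choosing the logit is a modeling assumption rather than the only option.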
Slide 10:
I will continue illustrating logistic regression with a data example that I will be using throughout the lessons introducing the Basic Concepts of Logistic Regression. In 1972-1974 a survey was taken in Whickham, a mixed urban and rural district near Newcastle, United Kingdom. Twenty years later a follow-up study was conducted. Among the information obtained originally was whether a person was a smoker or not. It was found that twenty years later, 76.12% of the 582 smokers were still alive, while only 68.58% of the 732 nonsmokers were still alive. That is, smokers had a higher survival rate than nonsmokers.
That would make a good story for Philip Morris: smoking leads to a longer life span.
This example was provided by Dr. Jeffrey Simonoff from New York University.
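The arithmetic behind the survival figures quoted above can be checked with a short Python sketch (Python is used only for illustration; the course slides use R). The group sizes and percentages come from the text. The survivor counts are obtained here by rounding, which is an assumption, since only percentages are reported. The variable and function names are hypothetical.

```python
# Group sizes and survival rates as quoted in the lecture text.
smokers_n, nonsmokers_n = 582, 732
smokers_rate, nonsmokers_rate = 0.7612, 0.6858

# Implied survivor counts, recovered by rounding (assumption: the text
# reports percentages only, not the underlying counts).
smokers_alive = round(smokers_rate * smokers_n)
nonsmokers_alive = round(nonsmokers_rate * nonsmokers_n)

def odds(p):
    """Odds of success for a probability p strictly between 0 and 1."""
    return p / (1.0 - p)

# Crude odds ratio of survival, smokers versus nonsmokers.
odds_ratio = odds(smokers_rate) / odds(nonsmokers_rate)

print("implied survivors (smokers, nonsmokers):", smokers_alive, nonsmokers_alive)
print("crude odds ratio of survival:", round(odds_ratio, 3))
```

This is a crude comparison: the calculation uses only the two overall rates and does not account for age.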
Slide 11:
This slide includes the R code to get you started with reading the data. Here is also the code for plotting the age versus the proportion of those that survived. We want to compare the relationship between age and the proportion of survival by smokers and nonsmokers separately.
The plot shows a non-linear relationship between age and survival proportion. In fact, this looks more like an S shape, as I motivated in the previous lesson where I introduced the logistic regression model.
Slide 13:
Next, I transformed the survival proportion using the logit function, which is the log of the ratio of the proportion of survival to 1 minus the proportion of survival; this is called the logit transformation or link function. Here I'm plotting the age versus the logit of the proportion of survival.
I'm contrasting the plot that you saw in a previous slide, on the left, with the plot of the age versus the logit of the survival rate. The relationship between age and the transformed survival rate is closer to linear than with the un-transformed survival proportion. We still see a slight curvature. I will expand more on this when we perform the logistic regression analysis on this example.
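The logit transformation of a proportion described above can be sketched in Python as follows (Python is used only for illustration; the slides use R, and their code is not reproduced here). The function name and the example counts are hypothetical and are not the survey data. Returning infinity when the proportion is exactly 0 or 1 is one possible convention, assumed here because the logit is unbounded at those values.

```python
import math

def logit_of_proportion(alive, total):
    """Return log(p / (1 - p)) for the proportion p = alive / total.

    Returns -inf when p is exactly 0 and +inf when p is exactly 1,
    because the logit is unbounded at those endpoints.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = alive / total
    if p == 0.0:
        return -math.inf
    if p == 1.0:
        return math.inf
    return math.log(p / (1.0 - p))

# Hypothetical counts chosen only to exercise the function.
examples = [(5, 10), (9, 10), (10, 10)]
for alive, total in examples:
    print(alive, total, logit_of_proportion(alive, total))
```

A proportion of one half maps to a logit of zero, and proportions above one half map to positive values.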
Summary:
In this lesson, I introduced the concept of binary response data along with the most used regression approach to model such data, the so-called logistic regression. I also illustrated it with a data example in R.
3.2. Model Description and Estimation
In this lesson, I'll introduce the approach used to estimate the logistic regression model and the interpretation of the regression coefficients. I will also illustrate the implementation of the estimation of the logistic regression model using a data example.
Slide 3:
Logistic regression is the generalization of linear regression when the response variable y is binary or binomial. Assume that Yi takes 0 or 1 values, thus binary, and we want to relate or regress Y onto some predicting variables. The objective of the model is to estimate the probability of a success given the predicting variables.
We model the probability of success using the logit link function as I presented in the previous lesson. That is, the logit function of the probability of success is a linear model in the predicting variables. We can rewrite this as the probability of success equal to the ratio between the exponential of the linear combination of the predicting variables over 1 plus this same exponential. The two formulations are equivalent. We will use them interchangeably throughout this lecture.
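The equivalence of the two formulations can be checked numerically with a small Python sketch (Python is used only for illustration; the course uses R). The coefficient values, the predictor value, and the function names are hypothetical and chosen only for the check. They are not estimates from any data.

```python
import math

def linear_predictor(beta, x):
    """Compute beta[0] + beta[1]*x[0] + ...; beta has one more entry than x."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

def prob_from_linear_predictor(eta):
    """Second formulation: p = exp(eta) / (1 + exp(eta))."""
    return math.exp(eta) / (1.0 + math.exp(eta))

def logit(p):
    """First formulation: logit(p) = log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

beta = [1.5, -0.04]  # hypothetical intercept and slope
x = [30.0]           # hypothetical value of a single predicting variable

eta = linear_predictor(beta, x)
p = prob_from_linear_predictor(eta)

# Applying the logit to p recovers the linear predictor, up to rounding error.
print("linear predictor:", eta)
print("probability of success:", p)
print("logit of that probability:", logit(p))
```

Going from the linear combination to the probability and back returns the same value, which is what it means for the two formulations to be interchangeable.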
Slide 4: