Case Control Studies ERIC Notebook
ERIC at the
UNC CH Department of Epidemiology
Medical Center
Case-Control Studies
ERIC NOTEBOOK SERIES
Case-control studies are used to
determine if there is an association
between an exposure and a
specific health outcome. These
studies proceed from effect (e.g.
health outcome, condition, disease)
to cause (exposure). Case-control
studies assess whether exposure is
disproportionately distributed
between the cases and controls,
which may indicate that the
exposure is a risk factor for the
health outcome under study. Case-
control studies are frequently used
for studying rare health outcomes or
diseases.
Unlike cohort or cross-sectional
studies, subjects in case-control
studies are selected because they
have the health outcome of interest
(cases).
Selection is not based on exposure
status. Controls, persons who are free
of the health outcome under study,
are randomly selected from the
population out of which the cases
arose. The case-control study aims to
achieve the same goals (comparison
of exposed and unexposed) as a
cohort study but does so more
efficiently, by the use of sampling.
ERIC Notebook, Second Edition
Second Edition Authors:
Lorraine K. Alexander, DrPH
Brettania Lopes, MPH
Kristen Ricchetti-Masterson, MSPH
Karin B. Yeatts, PhD, MS
After cases and controls have been identified, the investigator determines the proportion of cases and the proportion of controls that have been exposed to the exposure of interest. Thus, the denominators obtained in a case-control study do not represent the total number of exposed and non-exposed persons in the source population.
[Figure: case-control study design. At baseline, cases and controls are selected based on health outcome or disease status, and exposure status is unknown. Exposure refers to exposure at some specified point before disease onset.]
After the investigator determines the exposure, a table can be formed from the study data.
Measures of incidence in case-control studies
In case-control studies the proportion of cases in the entire population-at-risk is unknown; therefore, one cannot measure the incidence of the health outcome or disease. The controls are representative of the population-at-risk but are only a sample of that population, so the denominator for a risk measure, the population-at-risk, is unknown. We decide on the number of diseased people (cases) and non-diseased people (controls) when we design our study, so the ratio of controls to cases is not biologically or substantively meaningful. However, we can obtain a valid estimate of the risk ratio or rate ratio by using the exposure odds ratio (OR).*
Cohort Study (table columns: Diseased, Person-years)
RR = (a/n1)/(c/n2)
Case-Control Study
OR = (a/c)/(b/d) = (a/b)/(c/d) = (a × d)/(c × b)
If b and d (from the case-control study) are sampled from
the source population, n1 + n2, then b will represent the
n1 component of the cohort and d will represent the n2
component, and (a/n1)/(c/n2) will be estimated by (a/b)/
(c/d).
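As a rough illustration of the algebra above, the following Python sketch computes the exposure odds ratio from the four cells of a case-control table and checks that the three algebraic forms agree. The function name `exposure_odds_ratio` and the counts are hypothetical, chosen only for illustration and not taken from any study.

```python
from fractions import Fraction


def exposure_odds_ratio(a, b, c, d):
    """Exposure odds ratio for a 2x2 case-control table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_cases = Fraction(a, c)     # odds of exposure among cases
    odds_controls = Fraction(b, d)  # odds of exposure among controls
    return odds_cases / odds_controls


# Hypothetical counts (assumption, for illustration only).
a, b, c, d = 30, 10, 70, 90
or_value = exposure_odds_ratio(a, b, c, d)

# The other two forms of the same quantity.
form_ratio_of_ratios = Fraction(a, b) / Fraction(c, d)  # (a/b)/(c/d)
form_cross_product = Fraction(a * d, c * b)             # (a x d)/(c x b)

print(float(or_value))
```

Exact fractions are used here so that the equality of the three forms can be checked without floating-point rounding.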
Interpreting the odds ratio
The odds ratio is interpreted the same way as other ratio
measures (risk ratio, rate ratio, etc.).
For example, investigators conducted a case-control study to determine if there is an association between colon cancer and a high-fat diet. Cases were all confirmed colon cancer cases in North Carolina in 2010. Controls were a sample of North Carolina residents without colon cancer. The odds ratio was 4.0. This odds ratio tells us that individuals who consumed a high-fat diet have four times the odds of colon cancer compared with individuals who do not consume a high-fat diet. In another study of colon cancer and coffee consumption, the OR was 0.60. Thus, the odds of colon cancer among coffee drinkers are only 0.60 times the odds among individuals who do not consume coffee. This OR tells us that coffee consumption seems to be protective against colon cancer.
Case-Control Study

            Cases   Controls
Exposed       a        b
Unexposed     c        d

Odds of exposure among cases = a/c
Odds of exposure among controls = b/d

Cohort Study

            Diseased   Person-years
Exposed        a          n1
Unexposed      c          n2

OR = 1   Odds of disease is the same for exposed and unexposed
OR > 1   Exposure increases odds of disease
OR < 1   Exposure reduces odds of disease

*Note: Under some conditions, the odds ratio approximates a risk ratio or rate ratio. However, this is not always the case, and care should be taken to interpret odds ratios appropriately.

Types of case-control studies
Case-control studies can be categorized into different groups based on when the cases develop the health outcome and based on how controls are sampled. Some case-control studies use prevalent cases while other case-control studies use incident cases. There are also different ways that cases can be identified, such as using population-based cases or hospital-based cases.
Types of cases used in case control studies
Prevalent cases are all persons who were existing cases of the health outcome or disease during the observation period. These studies yield a prevalence odds ratio, which will be influenced by the incidence rate and survival or migration out of the prevalence pool of cases, and thus does not estimate the rate ratio. Case control studies can also use incident cases, which are persons who newly develop the health outcome or disease during the observation period. Recall that prevalence is influenced by both incidence and duration. Researchers who study causes of disease typically prefer incident cases because they are usually interested in factors that lead up to the development of disease rather than factors that affect duration.
Selecting controls
Selection of controls is usually the most difficult part of conducting a case-control study. We will discuss 3 possible ways to select controls:
1. Base or case-base sampling
2. Cumulative density or survivor sampling
3. Incidence density or risk set sampling
Base sampling or case-base sampling
This sampling involves using controls selected from the source population such that every person has the same chance of being included as a control. This type of sampling only works with a previously defined cohort. In these case-control studies, the odds ratio provides a valid estimate of the risk ratio without assuming that the disease is rare in the source population.
Cumulative density sampling or survivor sampling
When controls are sampled from those people who remained free of the health outcome at the end of follow-up, we call the sampling cumulative density sampling or survivor sampling. Controls cannot ever have the outcome (become cases) when using this type of sampling. In these case-control studies, the odds ratio estimates the rate ratio only if the health outcome is rare, i.e. if the proportion of those with the health outcome among each exposure group is less than 10% (requires the rare disease assumption).
Incidence density sampling or risk set sampling
When cases are incident cases and when controls are selected from the at-risk source population at the same time as cases occur (controls must be eligible to become a case if the health outcome develops in the control at a later time during the period of observation), we call this type of sampling incidence density sampling or risk set sampling. The control series provides an estimate of the proportion of the total person-time for exposed and unexposed cohorts in the source population. In these case-control studies, the odds ratio estimates the rate ratio of cohort studies, without assuming that the disease is rare in the source population.
Note that it is possible, albeit rare, that a control selected at a later time point could become a case during the remaining time that the study is running. This differs from case-control studies that use cumulative density sampling or survivor sampling, which select their controls after the conclusion of the study from among those individuals remaining at risk.
Selecting controls in a risk set sampling or incidence density sampling manner provides two advantages:
1. A direct estimate of the rate ratio is possible.
2. The estimates are not biased by differential loss to follow up among the exposed vs. unexposed controls.
For example, if a large number of smokers left the source population after a certain time point, they would not be available for selection at the end of the study, which is when controls would be selected in a study that uses cumulative density sampling or survivor sampling. This would give the investigators biased information regarding the level of exposure among the controls over the course of the study.
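The contrast between case-base sampling and survivor sampling can be sketched numerically. The Python example below is a minimal sketch under stated assumptions: a closed hypothetical cohort with no loss to follow-up, expected (not randomly drawn) counts, and made-up cohort sizes, risks, and sampling fraction. The function names are illustrative. It compares each odds ratio against the risk ratio of the hypothetical cohort; incidence density sampling is not covered, since it would require tracking person-time.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio (a*d)/(c*b)."""
    return (a * d) / (c * b)


def compare_sampling(n_exposed, n_unexposed, risk_exposed, risk_unexposed,
                     sampling_fraction=0.1):
    """Expected odds ratios under two control-sampling schemes.

    Assumes a closed cohort followed to the end of the study with no
    loss to follow-up; all inputs are hypothetical.
    """
    a = n_exposed * risk_exposed      # expected exposed cases
    c = n_unexposed * risk_unexposed  # expected unexposed cases
    risk_ratio = risk_exposed / risk_unexposed

    # Case-base sampling: controls drawn from the whole cohort at baseline.
    b_base = sampling_fraction * n_exposed
    d_base = sampling_fraction * n_unexposed

    # Survivor sampling: controls drawn only from people who are still
    # free of the outcome at the end of follow-up.
    b_surv = sampling_fraction * (n_exposed - a)
    d_surv = sampling_fraction * (n_unexposed - c)

    return {
        "risk_ratio": risk_ratio,
        "or_case_base": odds_ratio(a, b_base, c, d_base),
        "or_survivor": odds_ratio(a, b_surv, c, d_surv),
    }


# Common outcome: risks of 30% and 10% (hypothetical).
common = compare_sampling(1000, 1000, 0.30, 0.10)
# Rare outcome: risks of 0.3% and 0.1% (hypothetical).
rare = compare_sampling(1000, 1000, 0.003, 0.001)

print(common)
print(rare)
```

With the common outcome, the case-base odds ratio matches the risk ratio while the survivor-sampling odds ratio overstates it; with the rare outcome, the two are nearly identical, which is the rare disease assumption at work.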
Source populations for case-control studies
Source populations can be restricted to a population of
particular interest, e.g. postmenopausal women at risk of
breast cancer. This restriction makes it easier to control for
extraneous confounders in the population. Controls should
represent the restricted source population from which cases
arise, not all non-cases in the total population. The cases in
the study do not have to include all cases in the total
population.
Sources of cases
Cases diagnosed in a hospital or clinic
Cases entered into a disease registry, e.g. cancer, birth
defects, deaths
Cases identified through mass screening, e.g.
hypertensives, diabetics
Cases identified through a prior cohort study, e.g. lung
cancers in an occupational asbestos cohort
Sources of controls
Population controls are non-cases sampled from the
source population giving rise to cases. This is the most
desirable method for selecting controls. Sampling randomly from census block groups or from a registry such as the Department of Motor Vehicles (of adults who are able to drive) is one way to find and recruit population-based controls.
Neighborhood or friend controls are appropriate for
selection as controls if these individuals would be
included as cases if they developed the health outcome
of interest. It is not appropriate to select neighbors or
friends as controls if they share the exposure of interest.
Hospital controls - These controls may not be from the same source population from which the cases arose. Hospital controls may not be representative of the exposure prevalence in the source population of cases, e.g. there may be a higher prevalence of smokers in hospitals. Hospital controls also may have diseases resulting from the exposure of interest, e.g. the exposure (smoking) is related to the disease of interest (cancer) and to heart and lung diseases from which the controls may be suffering.
Controls with another disease - If the study is on lung cancer, for example, it is essential to exclude from the control group patients with cancers known or suspected to be related to the study exposure of interest. These controls also share some of the same problems as hospital controls.
Advantages of case-control studies
Case-control studies are the most efficient design for rare
diseases and require a much smaller study sample than
cohort studies. Additionally, investigators can avoid the
logistical challenges of following a large sample over time.
Case-control studies also allow more intensive evaluation of the exposures of cases and controls. Case-control studies that use incidence density sampling or risk
set sampling yield a valid estimate of the rate ratio derived
from a cohort study if incident cases are studied and
controls are sampled from the risk set of the source
population. If properly performed (i.e. appropriate
sampling), case-control studies provide information that
mirrors what could be learned from a cohort study, usually
at considerably less cost and time.
Disadvantages of case-control studies
Case-control studies do not yield an estimate of rate or
risk, as the denominator of these measures is not defined.
Case-control studies may be subject to recall bias if
exposure is measured by interviews and if recall of
exposure differs between cases and controls. However,
investigators may be able to avoid this problem if historical
records are available to assess exposure. Choosing an
appropriate source population is also difficult and may
contribute to selection bias. Case-control studies are not
an efficient means for studying rare exposures (less than
10% of controls are exposed) because very large numbers
of cases and controls are needed to detect the effects of
rare exposures.
Self-Evaluation Questions
1. Suppose that in a case-control study using incident cases of colon cancer you found that 80% of the cases were married. Does this demonstrate that being married increases the risk of developing colon cancer?
2. In the same case-control study above, assume that 90% of the control group is married. If there are 200 cases and 200 controls, estimate the risk ratio of colon cancer for single men. Construct a 2x2 table and determine and interpret the exposure odds ratio.
Self-Evaluation Answers
1. No. In order to assess whether or not the exposure of interest (marriage) increases the risk of having colon cancer, the proportion of controls that are married must also be known and the exposure odds ratio must be computed.
2. With 200 cases (80% married) and 200 controls (90% married), there are 40 single and 160 married cases, and 20 single and 180 married controls.
OR = (40 × 180) / (20 × 160) = 2.25
An odds ratio of 2.25 means that single men have 2.25 times the odds of being a case compared to married men.
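The arithmetic in answer 2 can be reproduced step by step. The short Python sketch below rebuilds the 2x2 table from the percentages given in the questions and computes the exposure odds ratio, treating being single as the exposure; the variable names are illustrative only.

```python
# Inputs stated in the self-evaluation questions.
n_cases, n_controls = 200, 200
pct_married_cases, pct_married_controls = 80, 90

# Rebuild the 2x2 table (integer arithmetic avoids rounding issues).
married_cases = n_cases * pct_married_cases // 100           # 160
married_controls = n_controls * pct_married_controls // 100  # 180
single_cases = n_cases - married_cases                       # 40
single_controls = n_controls - married_controls              # 20

# Exposure odds ratio with "single" as the exposure:
# (single cases x married controls) / (single controls x married cases)
odds_ratio_single = (single_cases * married_controls) / (
    single_controls * married_cases
)

print(odds_ratio_single)  # 2.25
```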
Acknowledgement
The authors of the Second Edition of the ERIC Notebook would like to acknowledge the authors of the ERIC
Notebook, First Edition: Michel Ibrahim, MD, PhD, Lorraine Alexander, DrPH,
Carl Shy, MD, DrPH,
and Sherry Farr, GRA, Department of Epidemiology at the University of North Carolina at Chapel
Hill. The First Edition of the ERIC Notebook was produced by the Educational Arm of the
Epidemiologic Research and Information Center at Durham, NC. The funding for the ERIC Notebook
First Edition was provided by the Department of Veterans Affairs (DVA), Veterans Health
Administration (VHA), Cooperative Studies Program (CSP) to promote the strategic growth of the
epidemiologic capacity of the DVA.
Terminology
Cohort studies: An observational study in which subjects are sampled based on the presence (exposed) or absence (unexposed) of a risk factor of interest. These subjects are followed over time for the development of a health outcome of interest.
Cross-sectional studies: An observational study in which subjects are sampled at one point in time, and then the associations between the concurrent risk factors and health outcomes are investigated.
Exposure odds ratio (OR): The odds of a particular exposure among persons with a specific health outcome divided by the corresponding odds of exposure among persons without the health outcome of interest. Yields a valid estimate of the incidence rate ratio or risk ratio derived from a cohort study, depending on control sampling.
Incident case: A person who is newly diagnosed as a case.
Prevalent case: A person who has a health outcome of interest that was diagnosed in the past.
Risk ratio (RR): The likelihood of a particular health outcome occurrence among persons exposed to a given risk factor divided by the corresponding likelihood among unexposed persons.
Source population: The population out of which the cases arose.
From: Medical Epidemiology, R.S. Greenberg, 1993,
1996.
2x2 table for Self-Evaluation Answer 2:

Marital status   Cases   Controls   Total
Single             40       20        60
Married           160      180       340
Total             200      200       400