Odds are simply another way of describing probability. Odds are calculated by dividing the number of times an event happens by the number of times it does not happen.
If one in every 100 patients suffers a side-effect from a treatment, the odds are:
\[\text{Odds} = 1:99 = \frac{1}{99} = 0.0101\]
Risk, on the other hand, indicates the probability that an event will happen. It is calculated by dividing the number of events by the number of people at risk. In the example above the risk would be:
\[\text{Risk} = \frac{1}{100} = 0.01\]
Key Difference
Odds = Number of events / Number of non-events
Risk = Number of events / Total number at risk
While similar for rare events, odds and risk diverge as events become more common.
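To make this divergence concrete, the short R sketch below converts a handful of risks to odds using odds = risk / (1 - risk). The risk values are illustrative assumptions chosen for this example, not data from any study.

```r
# Illustrative risks (probabilities); assumed values for demonstration only
risk <- c(0.01, 0.05, 0.10, 0.25, 0.50, 0.75)

# Convert risk to odds: events / non-events = p / (1 - p)
odds <- risk / (1 - risk)

# Convert odds back to risk: odds / (1 + odds)
risk_from_odds <- odds / (1 + odds)

# Show how far odds drift from risk as the event becomes more common
comparison <- data.frame(
  risk = risk,
  odds = round(odds, 4),
  ratio_odds_to_risk = round(odds / risk, 2)
)
print(comparison)
```

At a risk of 0.01 the odds (0.0101) are almost identical to the risk, whereas at a risk of 0.50 the odds are 1, twice the risk.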
10.2 Odds Ratios
Odds ratios are calculated by dividing the odds in one group of patients (e.g. cases) by the odds in a comparison group of patients (e.g. controls).
An odds ratio of 1 indicates no difference between the groups, i.e. the odds in each group are the same.
| Odds Ratio | Interpretation |
|---|---|
| = 1 | No difference in odds between groups |
| > 1 | Increased odds of exposure in cases |
| < 1 | Reduced odds of exposure in cases |
Odds ratios are frequently reported with 95% confidence intervals. If the confidence interval for an odds ratio does not include 1 (no difference in odds), the odds ratio is statistically significant at the 5% level.
The 2 × 2 Table for Odds Ratios
In a case-control study, patients are selected on the basis of their disease status. We compare the odds of exposure between cases (those with disease) and controls (those without disease).
Table 10.1: Structure of a 2 × 2 table for calculating odds ratios
| Exposure / Disease Status | Case (Disease) | Control (No Disease) | Total |
|---|---|---|---|
| Exposed | a | b | a + b |
| Unexposed | c | d | c + d |
| **Total** | a + c | b + d | n |
| Odds of exposure | a / c | b / d | |
The odds ratio (OR) compares the odds of exposure in cases to the odds of exposure in controls:
\[OR = \frac{\text{Odds of exposure in cases}}{\text{Odds of exposure in controls}} = \frac{a/c}{b/d} = \frac{ad}{bc}\]
Worked Example: HPV and Oropharyngeal Cancer
A case-control study investigated whether human papillomavirus (HPV) infection was associated with oropharyngeal squamous cell carcinoma. Researchers recruited 250 patients with newly diagnosed oropharyngeal cancer (cases) and 250 age- and sex-matched patients without cancer (controls). HPV status was determined by serology testing.
Table 10.2: Case-control study of HPV infection and oropharyngeal cancer
| HPV Status / Disease Status | Case (Cancer) | Control (No Cancer) | Total |
|---|---|---|---|
| HPV positive | 175 (a) | 50 (b) | 225 |
| HPV negative | 75 (c) | 200 (d) | 275 |
| **Total** | **250** | **250** | **500** |
| Odds of exposure | 175 / 75 | 50 / 200 | |
Calculating the odds ratio:
    # Values from the 2×2 table
    a <- 175  # Exposed cases (HPV positive with cancer)
    b <- 50   # Exposed controls (HPV positive without cancer)
    c <- 75   # Unexposed cases (HPV negative with cancer)
    d <- 200  # Unexposed controls (HPV negative without cancer)

    # Odds of HPV exposure in cases
    odds_cases <- a / c

    # Odds of HPV exposure in controls
    odds_controls <- b / d

    # Odds ratio
    or <- odds_cases / odds_controls
    # Equivalently: or <- (a * d) / (b * c)
Step 1: Calculate the odds of HPV exposure in cases (patients with cancer):
\[\text{Odds in cases} = \frac{a}{c} = \frac{175}{75} = 2.33\]
Step 2: Calculate the odds of HPV exposure in controls (patients without cancer):
\[\text{Odds in controls} = \frac{b}{d} = \frac{50}{200} = 0.25\]
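Step 3: Divide the odds in cases by the odds in controls to obtain the odds ratio:
\[OR = \frac{175/75}{50/200} = \frac{175 \times 200}{50 \times 75} = 9.33\]
Interpretation: The odds of HPV exposure are about 9.3 times as high in patients with oropharyngeal cancer as in controls. This strong positive association suggests that HPV infection is an important risk factor for this malignancy.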
Clinical Significance
An odds ratio of 9.3 indicates a very strong association between HPV infection and oropharyngeal cancer. If the 95% confidence interval excludes 1.0, the association is statistically significant. This finding is consistent with published literature showing that HPV-positive oropharyngeal cancers have distinct biology and generally improved prognosis compared to HPV-negative tumours.
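The text does not say how the confidence interval would be obtained. One common choice is the Woolf (log) method, which builds a normal-approximation interval on the log odds ratio scale and then exponentiates it. The sketch below applies that method to the counts in Table 10.2; the choice of method and the variable names are assumptions made for illustration.

```r
# Counts from Table 10.2
a <- 175  # HPV positive, cancer
b <- 50   # HPV positive, no cancer
c <- 75   # HPV negative, cancer
d <- 200  # HPV negative, no cancer

# Odds ratio and its natural logarithm
or <- (a * d) / (b * c)
log_or <- log(or)

# Standard error of log(OR) under the Woolf method (an assumed choice)
se_log_or <- sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# 95% confidence interval: build on the log scale, then exponentiate
z <- qnorm(0.975)
ci_lower <- exp(log_or - z * se_log_or)
ci_upper <- exp(log_or + z * se_log_or)

print(round(data.frame(OR = or, lower = ci_lower, upper = ci_upper), 2))
```

With these counts the whole interval lies well above 1, which is consistent with a statistically significant association.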
10.3 Logistic Regression
Logistic regression is similar to linear regression but is used when the outcome variable is binary (e.g. having a disease or not) as opposed to continuous.
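The coefficients in a logistic regression are log odds ratios. Exponentiating a coefficient gives the odds ratio: the factor by which the odds of the event are multiplied for a one-unit increase in the explanatory variable, holding the other variables constant. Subtracting 1 from the odds ratio and multiplying by 100 gives the percent change in the odds.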
Practical Application
Logistic regression is commonly used in medical research to:
- Predict disease risk based on multiple factors
- Identify risk factors for binary outcomes (e.g., death vs survival)
- Adjust for confounding variables when examining associations
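As a minimal sketch of these three uses, the R code below fits a logistic regression with base R's `glm()`. The data are simulated, and the variable names (`age`, `smoker`, `disease`) and the "true" coefficients are hypothetical, chosen only so the example runs on its own.

```r
set.seed(42)  # reproducible simulated data; not real patient data

n <- 500
age <- rnorm(n, mean = 60, sd = 10)        # hypothetical continuous predictor
smoker <- rbinom(n, size = 1, prob = 0.3)  # hypothetical binary exposure

# Simulate a binary outcome from an assumed model on the log-odds scale
log_odds <- -4 + 0.05 * age + 0.8 * smoker
disease <- rbinom(n, size = 1, prob = plogis(log_odds))

sim_data <- data.frame(disease, age, smoker)

# Fit the logistic regression (binomial family uses the logit link by default)
fit <- glm(disease ~ age + smoker, data = sim_data, family = binomial)

# Exponentiate coefficients to get odds ratios, each adjusted for the other predictor
odds_ratios <- exp(coef(fit))

# Wald 95% confidence intervals on the odds ratio scale
ci <- exp(confint.default(fit))
print(round(cbind(OR = odds_ratios, ci), 2))

# Predicted probability of disease for a hypothetical 65-year-old smoker
new_patient <- data.frame(age = 65, smoker = 1)
p_new <- predict(fit, newdata = new_patient, type = "response")
print(p_new)
```

Including both predictors in the model formula is what provides the adjustment: each odds ratio describes the association with the outcome while the other predictor is held constant.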
Worked Example: Logistic Regression for Treatment Response
A study investigated factors predicting complete response to chemotherapy in 150 cancer patients. The outcome was binary (complete response: yes/no) and predictors included age, tumour size, and performance status.
    library(tibble)
    library(knitr)
    library(kableExtra)

    # Create example logistic regression output
    logistic_results <- tibble(
      Predictor = c("Intercept", "Age (per year)", "Tumour size (per cm)", "Performance status (1 vs 0)"),
      `Coefficient (log OR)` = c(2.45, -0.03, -0.42, -1.15),
      `Odds Ratio` = c(11.59, 0.97, 0.66, 0.32),
      `95% CI Lower` = c(3.21, 0.95, 0.51, 0.15),
      `95% CI Upper` = c(41.85, 0.99, 0.85, 0.67),
      `P-value` = c("<0.001", "0.041", "0.002", "0.003")
    )

    # Render the results as a styled table
    logistic_results |>
      kable() |>
      kable_styling(bootstrap_options = c("striped", "hover"))
Table 10.3: Logistic regression output for predicting complete response to chemotherapy
| Predictor | Coefficient (log OR) | Odds Ratio | 95% CI Lower | 95% CI Upper | P-value |
|---|---|---|---|---|---|
| Intercept | 2.45 | 11.59 | 3.21 | 41.85 | <0.001 |
| Age (per year) | -0.03 | 0.97 | 0.95 | 0.99 | 0.041 |
| Tumour size (per cm) | -0.42 | 0.66 | 0.51 | 0.85 | 0.002 |
| Performance status (1 vs 0) | -1.15 | 0.32 | 0.15 | 0.67 | 0.003 |
Interpretation:
- Age: OR = 0.97 (95% CI: 0.95-0.99, p = 0.041)
  - For each additional year of age, the odds of complete response decrease by 3% (1 - 0.97 = 0.03)
  - Older patients have slightly lower odds of complete response
- Tumour size: OR = 0.66 (95% CI: 0.51-0.85, p = 0.002)
  - For each additional cm of tumour size, the odds of complete response decrease by 34% (1 - 0.66 = 0.34)
  - Larger tumours have significantly lower odds of complete response
- Performance status: OR = 0.32 (95% CI: 0.15-0.67, p = 0.003)
  - Patients with performance status 1 have 68% lower odds (1 - 0.32 = 0.68) of complete response compared to those with performance status 0
  - Poor performance status is a strong negative predictor
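The percentages above follow directly from the coefficients in Table 10.3. As a quick check, this sketch exponentiates each coefficient to recover the odds ratio and then converts it to a percent change in odds; the vector names are chosen here for illustration.

```r
# Coefficients (log odds ratios) from Table 10.3, excluding the intercept
log_or <- c(
  "Age (per year)" = -0.03,
  "Tumour size (per cm)" = -0.42,
  "Performance status (1 vs 0)" = -1.15
)

# Exponentiate to obtain odds ratios
or <- exp(log_or)

# Percent change in odds per one-unit increase: (OR - 1) * 100
pct_change <- (or - 1) * 100

print(round(data.frame(log_OR = log_or, OR = or, pct_change = pct_change), 2))
```

A negative percent change corresponds to an odds ratio below 1, i.e. lower odds of complete response.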
Statistical Significance
All three predictors are statistically significant (p < 0.05), and none of the 95% confidence intervals include 1.0. This indicates that age, tumour size, and performance status are all independently associated with complete response when adjusting for the other variables.
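The same coefficients can also be used to estimate the probability of complete response for an individual. The sketch below does this for a hypothetical patient (60 years old, 3 cm tumour, performance status 0). It assumes the predictors entered the model uncentred and in the units shown in Table 10.3, which the table itself does not confirm.

```r
# Coefficients (log odds ratios) from Table 10.3
beta <- c(intercept = 2.45, age = -0.03, tumour_size = -0.42, performance_status = -1.15)

# Hypothetical patient: 60 years old, 3 cm tumour, performance status 0
# (the leading 1 multiplies the intercept)
patient <- c(1, 60, 3, 0)

# Linear predictor on the log-odds scale
log_odds <- sum(beta * patient)

# Convert log-odds to odds, then odds to probability
odds <- exp(log_odds)
probability <- odds / (1 + odds)  # equivalent to plogis(log_odds)

print(round(c(log_odds = log_odds, odds = odds, probability = probability), 3))
```

The final step is the odds-to-risk conversion from the start of the chapter: risk = odds / (1 + odds).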
10.4 Summary
| Concept | Formula | Use |
|---|---|---|
| Odds | Events / Non-events | Alternative to probability |
| Risk | Events / Total at risk | Probability of event |
| Odds Ratio | (a/c) / (b/d) = ad/bc | Compare odds of exposure between cases and controls |