A categorical measure with only two categories (for example, alive or dead) is called dichotomous. When the two categories of a dichotomous variable are labelled 0 and 1, it is called a binary variable.
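As a minimal sketch in R, a dichotomous variable can be stored as a factor and then recoded as 0/1. The `status` vector is made-up illustrative data, and coding "dead" as 1 is an arbitrary assumption rather than a convention stated in the text.

```r
# Hypothetical vital status for six patients (illustrative data only)
status <- factor(c("alive", "dead", "alive", "alive", "dead", "alive"),
                 levels = c("alive", "dead"))

# Recode as a binary (0/1) variable: 1 = dead, 0 = alive (an arbitrary choice)
dead <- as.integer(status == "dead")

# With 0/1 coding, the sum counts the deaths and the mean is the proportion dead
n_dead <- sum(dead)
prop_dead <- mean(dead)
```

Coding the category of interest as 1 is convenient because the mean of the binary variable is then the proportion of individuals in that category.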
4.3 Numerical (Quantitative) Data
Numerical data consists of values that are counts or measurements.
Discrete Data
Values arise from a counting process.
Examples: number of tumours, number of metastases, number of hospital admissions
# Create diagram using ggplot
library(ggplot2)

ggplot() +
  # Main categorical box
  annotate("rect", xmin = 0, xmax = 4, ymin = 4, ymax = 5,
           fill = "#3498db", alpha = 0.3, colour = "#2980b9") +
  annotate("text", x = 2, y = 4.7, label = "Categorical (qualitative)",
           fontface = "bold", size = 4) +
  annotate("text", x = 2, y = 4.3,
           label = "tells us which category an\nindividual belongs to", size = 3) +
  # Nominal box
  annotate("rect", xmin = 0, xmax = 1.9, ymin = 2, ymax = 3.5,
           fill = "#3498db", alpha = 0.2, colour = "#2980b9") +
  annotate("text", x = 0.95, y = 3.2, label = "Nominal scale",
           fontface = "bold", size = 3.5) +
  annotate("text", x = 0.95, y = 2.8,
           label = "Categories distinguished\nby name, with no\nintrinsic ordering",
           size = 2.5) +
  annotate("text", x = 0.95, y = 2.2,
           label = "(e.g. sex, histology,\ncancer type)",
           size = 2.5, fontface = "italic") +
  # Ordinal box
  annotate("rect", xmin = 2.1, xmax = 4, ymin = 2, ymax = 3.5,
           fill = "#3498db", alpha = 0.2, colour = "#2980b9") +
  annotate("text", x = 3.05, y = 3.2, label = "Ordinal scale",
           fontface = "bold", size = 3.5) +
  annotate("text", x = 3.05, y = 2.8,
           label = "Categories distinguished\nby name, with\nintrinsic ordering",
           size = 2.5) +
  annotate("text", x = 3.05, y = 2.2,
           label = "(e.g. performance status,\ntoxicity grade)",
           size = 2.5, fontface = "italic") +
  # Main numerical box
  annotate("rect", xmin = 5, xmax = 9, ymin = 4, ymax = 5,
           fill = "#e74c3c", alpha = 0.3, colour = "#c0392b") +
  annotate("text", x = 7, y = 4.7, label = "Numerical (quantitative)",
           fontface = "bold", size = 4) +
  annotate("text", x = 7, y = 4.3,
           label = "values are counts or\nmeasurements", size = 3) +
  # Discrete box
  annotate("rect", xmin = 5, xmax = 6.9, ymin = 2, ymax = 3.5,
           fill = "#e74c3c", alpha = 0.2, colour = "#c0392b") +
  annotate("text", x = 5.95, y = 3.2, label = "Discrete",
           fontface = "bold", size = 3.5) +
  annotate("text", x = 5.95, y = 2.8,
           label = "Values arise from\ncounting process", size = 2.5) +
  annotate("text", x = 5.95, y = 2.2,
           label = "(e.g. number of\ntumours)",
           size = 2.5, fontface = "italic") +
  # Continuous box
  annotate("rect", xmin = 7.1, xmax = 9, ymin = 2, ymax = 3.5,
           fill = "#e74c3c", alpha = 0.2, colour = "#c0392b") +
  annotate("text", x = 8.05, y = 3.2, label = "Continuous",
           fontface = "bold", size = 3.5) +
  annotate("text", x = 8.05, y = 2.8,
           label = "Values arise from\nmeasuring process", size = 2.5) +
  annotate("text", x = 8.05, y = 2.2,
           label = "(e.g. height, tumour size,\nage, survival)",
           size = 2.5, fontface = "italic") +
  # Connecting lines
  annotate("segment", x = 1, y = 4, xend = 1, yend = 3.5) +
  annotate("segment", x = 3, y = 4, xend = 3, yend = 3.5) +
  annotate("segment", x = 1, y = 4, xend = 3, yend = 4) +
  annotate("segment", x = 2, y = 4.0, xend = 2, yend = 4) +
  annotate("segment", x = 6, y = 4, xend = 6, yend = 3.5) +
  annotate("segment", x = 8, y = 4, xend = 8, yend = 3.5) +
  annotate("segment", x = 6, y = 4, xend = 8, yend = 4) +
  annotate("segment", x = 7, y = 4.0, xend = 7, yend = 4) +
  theme_void() +
  coord_cartesian(xlim = c(-0.5, 9.5), ylim = c(1.5, 5.5))
Figure 4.1: Classification of data types
4.4 Paired Data
The majority of statistical analyses compare characteristics measured in two separate groups of individuals. In some circumstances, however, data may consist of pairs of outcome measurements.
When the same variable is measured on two occasions in the same individual, this is called paired data. If measurements are made only once on each individual, they are unpaired.
Examples of Paired Data
Before and after treatment: We might wish, for example, to carry out a study where the assessment of tumour response to radiotherapy is based on comparing tumour size measurements in a group of lung cancer patients, before and after they received treatment. For each person, we therefore have a pair of measures: tumour size after treatment and tumour size before treatment.
Comparing two sites: Two measurements of the same variable are taken on the same patient at different sites (e.g., the left and right eye, or two different anatomical sites).
Why Pairing Matters
It is important to take this pairing in the data into account when assessing how much, on average, the treatment has affected tumour size. Paired analyses compare each individual with themselves, which removes between-person variability from the comparison, and they are therefore typically more powerful than unpaired analyses.
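The effect of pairing can be illustrated with a small sketch in R. The tumour sizes below are invented for illustration, and the t-test is used only as one possible analysis; the text does not prescribe a particular test. The paired analysis works on the within-patient changes, whereas the unpaired analysis treats the two sets of measurements as if they came from separate groups.

```r
# Hypothetical tumour sizes (mm) for eight patients, before and after treatment
# (illustrative values only, not real data)
before <- c(42, 35, 58, 27, 49, 63, 31, 45)
after  <- c(38, 30, 55, 22, 46, 57, 29, 40)

# Within-patient change: one value per pair
change <- after - before
mean_change <- mean(change)

# Paired analysis: tests whether the mean within-patient change is zero
paired <- t.test(after, before, paired = TRUE)

# Unpaired analysis (Welch t-test): ignores which measurements belong together
unpaired <- t.test(after, before)

# Compare the p-values from the two approaches
c(paired = paired$p.value, unpaired = unpaired$p.value)
```

In this made-up example the patients differ widely from one another but each changes by a similar amount, so the paired test detects the change much more readily than the unpaired test.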