Finding Relationships

Correlation, Comparison, and Connections

Understanding how variables relate to each other is often the heart of analysis. Does X affect Y? Do these groups differ? What moves together?

Correlation

What Correlation Measures

Correlation measures the strength and direction of a linear relationship between two variables.

Correlation coefficient (r): Ranges from -1 to +1

  • +1: Perfect positive relationship (as X increases, Y increases)
  • 0: No linear relationship
  • -1: Perfect negative relationship (as X increases, Y decreases)
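
To make the coefficient concrete, here is a minimal sketch that computes Pearson's r from its definition using only the standard library. The function name `pearson_r` and the sample numbers are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How x and y move together around their means
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return co_movement / math.sqrt(ss_x * ss_y)

# Hypothetical illustration data
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 6, 8, 10]      # rises in lockstep with ad_spend
returns = [10, 8, 6, 4, 2]    # falls in lockstep with ad_spend

r_up = pearson_r(ad_spend, sales)      # perfect positive relationship
r_down = pearson_r(ad_spend, returns)  # perfect negative relationship
```

Real data almost never lands exactly on +1 or -1; these two series were built to sit on straight lines so the endpoints of the scale are visible.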

Interpreting Correlation

Correlation     Interpretation
0.9 to 1.0      Very strong positive
0.7 to 0.9      Strong positive
0.5 to 0.7      Moderate positive
0.3 to 0.5      Weak positive
0 to 0.3        Little to none

The same thresholds apply to negative values: -0.7 to -0.9 is a strong negative relationship, and so on.
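
The thresholds can be turned into a small lookup. This is only a sketch: the function name `describe_correlation` is hypothetical, and because the ranges share their endpoints, it assumes a boundary value such as 0.7 belongs to the stronger category.

```python
def describe_correlation(r):
    """Map a correlation coefficient to the rule-of-thumb labels above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and +1")
    strength = abs(r)
    # Assumption: a boundary value (e.g. exactly 0.7) counts as the stronger label
    if strength >= 0.9:
        label = "Very strong"
    elif strength >= 0.7:
        label = "Strong"
    elif strength >= 0.5:
        label = "Moderate"
    elif strength >= 0.3:
        label = "Weak"
    else:
        return "Little to none"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction}"

label_a = describe_correlation(0.95)
label_b = describe_correlation(-0.75)
label_c = describe_correlation(0.1)
```

Treat the labels as a rough guide. What counts as a "strong" correlation varies by field and by how noisy the data is.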

Correlation Warnings

Correlation ≠ Causation: Two things moving together doesn't mean one causes the other.

Linear only: Correlation measures linear relationships. Strong non-linear relationships can have low correlation.

Outliers affect it: A few extreme points can dramatically change correlation.

Spurious correlations: With enough variables, some will correlate by chance.

Comparing Groups

Are Groups Different?

Common question: Does one group perform differently than another?

Examples:

  • New customers vs. existing customers
  • Treatment vs. control
  • Region A vs. Region B
  • Before vs. after

What to Compare

Central tendency: Is the average/median different?

Distribution: Are the shapes different?

Variability: Is one group more variable?

Statistical Significance

A difference in means does not, by itself, show that the groups truly differ. The gap could be the result of random variation.

Statistical tests help determine if differences are likely real or just noise:

  • t-test: Compare two group means
  • Chi-square: Compare categorical distributions
  • ANOVA: Compare multiple group means
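
As an illustration of the first test in the list, here is a minimal sketch of the t statistic for two group means. It uses Welch's version, which does not assume the groups have equal variance. The function name `welch_t` and the sample values are hypothetical, and the sketch stops at the statistic and degrees of freedom rather than computing a p-value.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for two samples."""
    # Squared standard error of each group's mean
    # (the division below fails if both groups have zero variance)
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df

# Hypothetical measurements
control = [10, 12, 11, 13, 12, 11]
treatment = [14, 15, 13, 16, 15, 14]

t_stat, dof = welch_t(treatment, control)
```

For this sample data the statistic comes out near 5 with 10 degrees of freedom, well above the two-sided 5% cutoff of roughly 2.23, so the gap is unlikely to be noise. In practice a statistics library would typically be used to get the p-value directly.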

Practical significance: Even statistically significant differences may be too small to matter practically.

Segmentation Analysis

Splitting Data

Splitting means dividing data into meaningful groups to compare:

  • By customer type
  • By time period
  • By geography
  • By product category

Finding Segments

Look for natural groupings where behavior differs.

Approach:

  1. Hypothesize segments
  2. Split data
  3. Compare metrics
  4. Validate differences
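
The first three steps can be sketched in a few lines. The records, field names, and the function name `compare_segments` are all made up for illustration. Step 4 (validating the difference) would follow with a statistical test.

```python
from collections import defaultdict
from statistics import mean, median

# Step 1: hypothesize that new and existing customers order differently
orders = [
    {"customer_type": "new", "order_value": 40},
    {"customer_type": "new", "order_value": 55},
    {"customer_type": "new", "order_value": 35},
    {"customer_type": "new", "order_value": 50},
    {"customer_type": "existing", "order_value": 80},
    {"customer_type": "existing", "order_value": 95},
    {"customer_type": "existing", "order_value": 70},
    {"customer_type": "existing", "order_value": 75},
]

def compare_segments(rows, segment_key, metric_key):
    """Split rows by segment_key and summarize metric_key per segment."""
    # Step 2: split the data
    groups = defaultdict(list)
    for row in rows:
        groups[row[segment_key]].append(row[metric_key])
    # Step 3: compare metrics across segments
    return {
        segment: {"n": len(values), "mean": mean(values), "median": median(values)}
        for segment, values in groups.items()
    }

summary = compare_segments(orders, "customer_type", "order_value")
```

Reporting the count alongside the mean and median helps flag segments that are too small to support a conclusion.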

RFM Analysis (Example)

Classic customer segmentation:

  • Recency: How recently did they buy?
  • Frequency: How often do they buy?
  • Monetary: How much do they spend?

Score customers on each dimension, then combine the scores into segments.
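
One simple way to do the scoring is by rank. The sketch below is one possible scheme, not a standard: it scores each dimension from 1 (worst) to 3 (best) and joins the three digits into a code. The customers, field names, and the helper `score_by_rank` are hypothetical.

```python
def score_by_rank(values, higher_is_better=True, bins=3):
    """Give each value a score from 1 (worst) to bins (best) based on its rank."""
    # Order positions from worst to best; ties keep their input order
    worst_to_best = sorted(
        range(len(values)),
        key=lambda i: values[i],
        reverse=not higher_is_better,
    )
    scores = [0] * len(values)
    for rank, i in enumerate(worst_to_best):
        scores[i] = 1 + rank * bins // len(values)
    return scores

# Hypothetical customers
customers = {
    "A": {"days_since_last_order": 5, "order_count": 12, "total_spend": 900},
    "B": {"days_since_last_order": 40, "order_count": 4, "total_spend": 700},
    "C": {"days_since_last_order": 200, "order_count": 2, "total_spend": 40},
    "D": {"days_since_last_order": 12, "order_count": 8, "total_spend": 150},
    "E": {"days_since_last_order": 90, "order_count": 3, "total_spend": 120},
    "F": {"days_since_last_order": 365, "order_count": 1, "total_spend": 30},
}

names = list(customers)
# Recency: fewer days since the last order is better
r = score_by_rank([customers[n]["days_since_last_order"] for n in names],
                  higher_is_better=False)
f = score_by_rank([customers[n]["order_count"] for n in names])
m = score_by_rank([customers[n]["total_spend"] for n in names])

# Combine the three scores into a segment code such as "333"
rfm = {n: f"{r[i]}{f[i]}{m[i]}" for i, n in enumerate(names)}
```

Here customer A scores "333" (recent, frequent, high spend), while B scores "223": a middling buyer by recency and frequency who nonetheless spends a lot.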

Cross-Tabulation

What It Is

A table showing the relationship between two categorical variables.

Example

            Product A   Product B   Product C
Region 1    100         50          30
Region 2    80          120         60
Region 3    150         90          45

What to Look For

  • Are patterns consistent across categories?
  • Do certain combinations stand out?
  • Are differences meaningful?
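
Raw counts are hard to compare when row totals differ, so a common first step is to convert each row to percentages. The sketch below does this for the example table, assuming its counts read as 100/50/30, 80/120/60, and 150/90/45 for Regions 1 to 3.

```python
products = ["Product A", "Product B", "Product C"]

# Counts from the example cross-tab (rows: regions, columns: products)
counts = {
    "Region 1": [100, 50, 30],
    "Region 2": [80, 120, 60],
    "Region 3": [150, 90, 45],
}

# Share of each region's total that goes to each product, in percent
row_pct = {}
for region, row in counts.items():
    total = sum(row)
    row_pct[region] = [round(100 * c / total, 1) for c in row]

for region, shares in row_pct.items():
    line = ", ".join(f"{p}: {s}%" for p, s in zip(products, shares))
    print(f"{region}: {line}")
```

Seen this way, Regions 1 and 3 have a similar mix led by Product A (a little over half of each region's total), while Region 2 stands out with Product B as its largest share.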

Time-Based Comparisons

Period-over-Period

Compare the same metric across different time periods:

  • Month over month
  • Year over year
  • This week vs. last week
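
Each of these comparisons comes down to a percentage change between two periods. A minimal sketch, with made-up revenue figures and a hypothetical helper named `pct_change`:

```python
def pct_change(current, previous):
    """Percentage change from previous to current."""
    if previous == 0:
        raise ValueError("percentage change is undefined when the previous value is 0")
    return 100 * (current - previous) / previous

# Hypothetical monthly revenue
monthly_revenue = {
    "2023-01": 100,
    "2023-02": 110,
    "2024-01": 120,
    "2024-02": 121,
}

# Month over month: February 2024 vs. January 2024
mom = pct_change(monthly_revenue["2024-02"], monthly_revenue["2024-01"])

# Year over year: February 2024 vs. February 2023
yoy = pct_change(monthly_revenue["2024-02"], monthly_revenue["2023-02"])
```

With these numbers the month-over-month change is under 1% while the year-over-year change is 10%, which shows how the choice of baseline shapes the story.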

Seasonality

Patterns that repeat regularly:

  • Holiday spikes
  • Summer slowdowns
  • Day-of-week patterns
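
A quick way to check for a day-of-week pattern is to average the metric by weekday. The sketch below uses two weeks of made-up daily sales, with the start date chosen arbitrarily.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

start = date(2024, 1, 1)  # a Monday
# Hypothetical daily sales for 14 consecutive days
daily_sales = [100, 102, 98, 101, 110, 150, 140,
               99, 103, 97, 100, 112, 155, 145]

# Group each day's sales under its weekday name
by_day = defaultdict(list)
for offset, sales in enumerate(daily_sales):
    day = start + timedelta(days=offset)
    by_day[DAY_NAMES[day.weekday()]].append(sales)

avg_by_day = {name: mean(by_day[name]) for name in DAY_NAMES}
```

Two weeks is far too little data to confirm seasonality. The same grouping applied to many weeks gives a more reliable picture.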

Trend Analysis

Long-term direction:

  • Is this metric growing, shrinking, or flat?
  • What's the rate of change?
  • Are there inflection points?
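
The direction and rate of change can be estimated by fitting a straight line through the series. This is a minimal least-squares sketch that assumes evenly spaced periods. The function name `linear_trend` and the user counts are made up.

```python
def linear_trend(values):
    """Least-squares slope and intercept of values against periods 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least 2 points to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
        / sum((x - mean_x) ** 2 for x in range(n))
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical monthly active users
monthly_users = [100, 108, 113, 121, 130, 134]

slope, intercept = linear_trend(monthly_users)
# A positive slope means growth, a negative slope means shrinkage,
# and a slope near zero means the metric is roughly flat.
```

For this series the slope is about 7, meaning roughly 7 additional users per month. A single straight line will not reveal inflection points, so fitting separate lines to earlier and later periods is one rough way to look for them.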

AI Prompt: Relationship Analysis

Help me analyze the relationship between these variables.

Variable 1: [Describe it]
Variable 2: [Describe it]
Sample data: [Paste if available]
Context: [What these represent]

Please help me:
1. Quantify the relationship
2. Visualize it appropriately
3. Interpret what this means
4. Identify any caveats
5. Suggest what this implies for my question

AI Prompt: Group Comparison

Help me compare these groups.

Group A: [Description and data]
Group B: [Description and data]
Metric I'm comparing: [What you're measuring]
Question: [What you want to know]

Please:
1. Calculate appropriate comparison statistics
2. Assess whether differences are meaningful
3. Visualize the comparison
4. Interpret the findings
5. Note any limitations

What's Next

Can we predict what will happen? Let's explore.

Next chapter: Basic predictive analysis.