Machine learning algorithms govern an increasing surface area of our modern lives. They filter our resumes, price our insurance policies, and approve our mortgages. As the stakes have risen, the industry has rightly shifted its focus toward algorithmic fairness. We want our models to be equitable, ensuring they do not systematically disadvantage vulnerable groups.
However, the standard approach to measuring fairness has a massive theoretical blind spot. For years, the industry has relied almost entirely on statistical correlations.
We use metrics like Demographic Parity, which demands that a model predict favorable outcomes at the same rate across different groups. We use Equalized Odds, which insists that true positive and false positive rates remain uniform regardless of the sensitive attribute. While well-intentioned, these observational metrics fail to account for the actual mechanisms that generate the data. They look at the symptoms rather than the disease.
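To make the contrast concrete, here is how these two observational metrics are typically computed from predictions alone, using nothing but pandas on toy data (the numbers are invented for illustration, not drawn from any real dataset):

```python
import pandas as pd

# Toy predictions for two groups (illustrative data only)
df = pd.DataFrame({
    'group':  ['A'] * 5 + ['B'] * 5,
    'y_true': [1, 0, 1, 1, 0,  1, 0, 0, 1, 0],
    'y_pred': [1, 0, 1, 0, 0,  1, 1, 0, 0, 0],
})

# Demographic parity: difference in favorable-prediction rates
rates = df.groupby('group')['y_pred'].mean()
dp_gap = rates['A'] - rates['B']

# Equalized odds (one half of it): difference in true positive rates
tpr = df[df.y_true == 1].groupby('group')['y_pred'].mean()
tpr_gap = tpr['A'] - tpr['B']

print(f"Demographic parity gap: {dp_gap:.2f}")
print(f"TPR gap: {tpr_gap:.2f}")
```

Note that both quantities are pure functions of the joint distribution of group, label, and prediction; nothing in them encodes how that distribution came to be.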
Statistical fairness metrics are highly susceptible to phenomena like Simpson's Paradox, where a trend appears in different groups of data but disappears or reverses when these groups are combined. If we merely force a machine learning model to satisfy a statistical parity constraint, we risk masking the true sources of discrimination. Worse, we might inadvertently harm the very populations we are trying to protect by ignoring the underlying structural inequalities in the real world.
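Simpson's Paradox is easy to reproduce in a few lines of pandas. In the toy numbers below (loosely modeled on the classic admissions example), group B has the higher admission rate inside every department, yet group A has the higher rate in the aggregate:

```python
import pandas as pd

# Admission counts by group and department (illustrative numbers)
data = pd.DataFrame({
    'group':    ['A', 'A', 'B', 'B'],
    'dept':     ['X', 'Y', 'X', 'Y'],
    'applied':  [80, 20, 20, 80],
    'admitted': [60,  2, 16, 16],
})

# Within each department, group B has the HIGHER admission rate
per_dept = data.assign(rate=data.admitted / data.applied)
print(per_dept[['group', 'dept', 'rate']])

# Aggregated over departments, group A looks favored instead
agg = data.groupby('group')[['applied', 'admitted']].sum()
agg['rate'] = agg.admitted / agg.applied
print(agg)
```

The reversal happens because group A applies mostly to the easier department: the department variable is part of the data-generating mechanism, and ignoring it flips the conclusion.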
Enter Causal Inference and the Data-Generating Mechanism
To truly understand bias, we must stop asking "What is the correlation between gender and loan approval?" and start asking "What would the loan approval decision be if this applicant were of a different gender, holding all other relevant factors constant?"
This shift requires moving from observational statistics to causal inference. Causal inference utilizes structural causal models and Directed Acyclic Graphs to represent the directional relationships between variables. By defining how data is generated in the real world, we can mathematically intervene in the system. We can isolate the direct prejudice applied by a decision-maker from the indirect disadvantages caused by systemic, historical factors.
Until recently, applying causal inference to machine learning fairness was a highly academic exercise. It required bespoke implementations, deep statistical expertise, and complex mathematical derivations. That landscape is now changing.
Introducing CausalFairnessInAction
Amazon Science recently released an open-source Python library named CausalFairnessInAction. This library operationalizes years of academic research into an accessible, production-ready package. It allows machine learning practitioners to compute fairness metrics through the lens of causal reasoning, enabling audits at both the aggregate group level and the highly specific individual level.
The library moves beyond simple statistical disparities by offering tools to compute counterfactuals and path-specific causal effects. It helps developers uncover the actual data-generating mechanisms behind disparities.
Before diving into causal fairness, ensure you have a strong understanding of your domain. Causal models are only as good as the assumptions baked into their structural graphs. Collaborating with domain experts is essential for defining accurate data-generating mechanisms.
Setting Up Your Environment
To get started, you will need a standard Python data science environment. The CausalFairnessInAction library integrates seamlessly with standard data manipulation libraries like Pandas and graph libraries like NetworkX.
# Install the necessary packages via pip
pip install causal-fairness-in-action pandas networkx scikit-learn
Let us frame a practical tutorial around a classic machine learning fairness problem. Imagine we are building a credit risk model to determine loan approvals. Our dataset contains applicant information including age, gender, income, historical credit score, and the final loan approval decision.
Defining the Structural Causal Model
The foundation of any causal analysis is the Directed Acyclic Graph. The graph represents our assumptions about how the real world operates. Variables are nodes, and causal relationships are directed edges.
In our loan approval scenario, we need to map out the relationships carefully. Let us assume the following causal structure.
- Gender is our sensitive attribute and acts as a root node.
- Age is another root node, independent of Gender.
- Income is influenced by both Age and Gender due to societal factors like the gender pay gap.
- Credit Score is influenced by Income and Age.
- Loan Approval is the final outcome, directly influenced by Income, Credit Score, and, potentially, by direct bias based on Gender.
We can build this exact structure using the library.
import pandas as pd
import networkx as nx
from causal_fairness import CausalModel
# Define the causal graph using NetworkX
causal_graph = nx.DiGraph()
causal_graph.add_edges_from([
    ('Gender', 'Income'),
    ('Age', 'Income'),
    ('Age', 'Credit_Score'),
    ('Income', 'Credit_Score'),
    ('Income', 'Loan_Approval'),
    ('Credit_Score', 'Loan_Approval'),
    ('Gender', 'Loan_Approval')  # Potential direct discrimination path
])
# Load your historical loan dataset
df = pd.read_csv('historical_loan_data.csv')
# Initialize the Causal Model
model = CausalModel(
    data=df,
    graph=causal_graph,
    sensitive_attribute='Gender',
    outcome='Loan_Approval'
)
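Before fitting anything, it is worth sanity-checking the graph with plain NetworkX: confirm it is acyclic and enumerate every causal path from the sensitive attribute to the outcome. The sketch below rebuilds the same graph and uses only NetworkX, independent of the library:

```python
import networkx as nx

# The same graph as in the tutorial
causal_graph = nx.DiGraph()
causal_graph.add_edges_from([
    ('Gender', 'Income'), ('Age', 'Income'),
    ('Age', 'Credit_Score'), ('Income', 'Credit_Score'),
    ('Income', 'Loan_Approval'), ('Credit_Score', 'Loan_Approval'),
    ('Gender', 'Loan_Approval'),
])

# A causal graph must be acyclic
assert nx.is_directed_acyclic_graph(causal_graph)

# Every simple path from Gender to Loan_Approval is a distinct
# channel through which bias could flow
paths = list(nx.all_simple_paths(causal_graph, 'Gender', 'Loan_Approval'))
for p in paths:
    print(' -> '.join(p))
```

In this graph there are exactly three such channels: the direct edge, the path through Income, and the path through Income and Credit Score.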
The edge from Gender directly to Loan Approval represents direct discrimination. If a loan officer rejects an application solely because of the applicant's gender, that influence travels along this path. The path from Gender through Income to Loan Approval represents indirect, systemic bias.
Computing Group Fairness Metrics
With our structural causal model established, we can begin auditing our data and our machine learning models. Traditional statistical parity simply subtracts the approval rate of women from the approval rate of men. CausalFairnessInAction decomposes this disparity into specific causal effects.
The Total Effect
The Total Effect measures the overall causal impact of changing the sensitive attribute on the outcome. It answers the question of how much the probability of loan approval changes if we intervene and change the applicant's gender, allowing that change to ripple through the entire graph.
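Because Gender is a root node in our assumed graph, it has no confounding parents, so the interventional distribution P(Loan_Approval | do(Gender)) coincides with the conditional one. The total effect can therefore be estimated as a plain difference in approval rates, which the following sketch demonstrates on synthetic data generated to match the article's graph (all coefficients are invented for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 10_000

# Synthetic data following the article's graph (illustrative coefficients)
gender = rng.integers(0, 2, n)              # 1 = Male, 0 = Female
age    = rng.normal(40, 10, n)
income = 30 + 5 * gender + 0.5 * age + rng.normal(0, 5, n)
score  = 0.5 * income + 0.2 * age + rng.normal(0, 5, n)
approve_prob = 1 / (1 + np.exp(-(0.05 * income + 0.05 * score - 4)))
approved = rng.random(n) < approve_prob

df = pd.DataFrame({'Gender': gender, 'Loan_Approval': approved})

# Total effect of Gender on approval: difference in approval rates,
# valid here only because Gender has no parents in the assumed graph
te = (df[df.Gender == 1].Loan_Approval.mean()
      - df[df.Gender == 0].Loan_Approval.mean())
print(f"Estimated Total Effect: {te:.4f}")
```

If Gender did have confounding parents, this naive difference would no longer equal the interventional effect, and an adjustment formula would be needed instead.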
from causal_fairness.metrics import GroupFairnessEvaluator
evaluator = GroupFairnessEvaluator(model)
# Calculate the Total Effect
total_effect = evaluator.compute_total_effect(treatment_value='Male', control_value='Female')
print(f"Total Causal Effect: {total_effect:.4f}")
If the Total Effect is significantly different from zero, it indicates a causal disparity. However, this single number does not tell us whether the disparity is due to direct discrimination by our model or due to systemic differences in upstream variables like income.
The Natural Direct Effect
To isolate direct discrimination, we compute the Natural Direct Effect. This metric measures the change in loan approval probability if we flip the gender from female to male, but artificially hold all intermediate variables exactly where they would have been if the applicant were female.
# Calculate the Natural Direct Effect
direct_effect = evaluator.compute_natural_direct_effect(treatment_value='Male', control_value='Female')
print(f"Natural Direct Effect: {direct_effect:.4f}")
A non-zero Natural Direct Effect is a massive red flag. It implies that your algorithm or historical process is making decisions based directly on the sensitive attribute, regardless of the applicant's actual financial qualifications. From a legal and ethical standpoint, minimizing the direct effect is often the highest priority for compliance teams.
The Natural Indirect Effect
The Natural Indirect Effect measures the disparity that flows through mediator variables. In our graph, this represents the bias cascading from Gender to Income, and subsequently to Credit Score and Loan Approval.
# Calculate the Natural Indirect Effect
indirect_effect = evaluator.compute_natural_indirect_effect(treatment_value='Male', control_value='Female')
print(f"Natural Indirect Effect: {indirect_effect:.4f}")
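The direct/indirect decomposition can be sanity-checked in a fully linear toy SCM, where both effects have closed forms: the Natural Direct Effect equals the A→Y coefficient, and the Natural Indirect Effect equals the product of the A→M and M→Y coefficients. A minimal sketch (not the library's estimator, just standard linear regression on simulated data):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Toy linear SCM: A -> M -> Y plus a direct edge A -> Y
b_direct, c_med, b_med = 0.3, 2.0, 0.5      # true structural coefficients
A = rng.integers(0, 2, n).astype(float)
M = c_med * A + rng.normal(0, 1, n)
Y = b_direct * A + b_med * M + rng.normal(0, 1, n)

# Regress Y on A and M to recover the direct path, M on A for the mediator
X = np.column_stack([np.ones(n), A, M])
beta = np.linalg.lstsq(X, Y, rcond=None)[0]             # [intercept, A->Y, M->Y]
gamma = np.linalg.lstsq(np.column_stack([np.ones(n), A]), M, rcond=None)[0]

nde = beta[1]                # NDE in the linear case: the A->Y coefficient
nie = beta[2] * gamma[1]     # NIE: (M->Y coefficient) * (A->M coefficient)
print(f"NDE = {nde:.3f} (true 0.3), NIE = {nie:.3f} (true 1.0)")
```

In nonlinear models with interactions these closed forms break down, which is exactly when mediation machinery like the library's becomes necessary.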
Handling indirect effects requires nuanced policy decisions. Should a machine learning model correct for historical income disparities? If the Natural Indirect Effect is high but the Direct Effect is zero, the model is technically treating equal financial profiles equally, but the society generating those profiles is unequal. CausalFairnessInAction gives you the precise numbers needed to have these complex policy discussions with stakeholders.
Evaluating Individual Fairness Through Counterfactuals
Group metrics are powerful, but they aggregate the human experience. Fairness is ultimately experienced at the individual level. A model might be fair on average across a population while still acting highly unfairly toward specific edge cases.
Individual fairness in a causal framework is known as Counterfactual Fairness. A model is counterfactually fair if its prediction for a specific individual remains identical in the counterfactual world where that individual belonged to a different demographic group.
Computing counterfactuals is mathematically demanding. It follows Pearl's three-step recipe: abduction, inferring the posterior distribution of the unobserved background variables for the specific person from what was observed; action, intervening to set the sensitive attribute to its counterfactual value; and prediction, simulating the forward process under the altered model with the same background variables.
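Pearl's three-step recipe (abduction of the background noise, action on the sensitive attribute, prediction through the modified model) can be illustrated end to end on a one-equation additive-noise SCM. All numbers below are invented for illustration:

```python
# Counterfactual for one individual in a toy additive-noise SCM:
#   Income = 30 + 5 * Gender + u   (Gender: 1 = Male, 0 = Female)
#   Approve if Income > 52
def income_eq(gender, u):
    return 30 + 5 * gender + u

# Observed: a female applicant with Income = 50 (denied, threshold 52)
obs_gender, obs_income = 0, 50.0

# Step 1, abduction: recover this individual's background noise
u = obs_income - income_eq(obs_gender, 0.0)     # u = 20.0

# Step 2, action: intervene and set Gender = Male
# Step 3, prediction: push the SAME noise through the modified model
cf_income = income_eq(1, u)
cf_approved = cf_income > 52

print(f"Counterfactual income: {cf_income}, approved: {cf_approved}")
```

The key point is that the noise term u, which encodes everything individual about this applicant, is held fixed across the factual and counterfactual worlds.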
The CausalFairnessInAction library handles this complex calculus under the hood.
from causal_fairness.metrics import IndividualFairnessEvaluator
# Initialize the individual evaluator
ind_evaluator = IndividualFairnessEvaluator(model)
# Select a specific applicant from our dataset who was denied a loan
applicant_index = 42
applicant_data = df.iloc[[applicant_index]]
# Compute the counterfactual outcome
# "Would this specific female applicant have been approved if she were male?"
cf_outcome = ind_evaluator.compute_counterfactual(
    individual_data=applicant_data,
    counterfactual_treatment={'Gender': 'Male'}
)
original_outcome = applicant_data['Loan_Approval'].values[0]
print(f"Original Outcome: {original_outcome}")
print(f"Counterfactual Outcome: {cf_outcome['Predicted_Approval_Probability']:.4f}")
If the original outcome was a denial, but the counterfactual probability of approval is high, you have identified a specific instance of individual discrimination. By running this evaluation across your entire test suite, you can calculate the Counterfactual Fairness Violation Rate. This provides an incredibly stringent and robust metric for algorithmic audits.
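Computing the violation rate itself is straightforward once factual and counterfactual predictions are in hand. The helper below is a hypothetical illustration written from scratch, not part of the library:

```python
import numpy as np

def violation_rate(factual_preds, counterfactual_preds, tol=1e-6):
    """Fraction of individuals whose prediction changes when the sensitive
    attribute is counterfactually flipped (hypothetical helper)."""
    factual = np.asarray(factual_preds, dtype=float)
    counterfactual = np.asarray(counterfactual_preds, dtype=float)
    return float(np.mean(np.abs(factual - counterfactual) > tol))

# Example: 5 applicants, two of whom would be scored differently if their
# recorded gender were flipped (illustrative probabilities)
rate = violation_rate([0.2, 0.8, 0.5, 0.9, 0.1],
                      [0.2, 0.6, 0.5, 0.9, 0.4])
print(f"Counterfactual fairness violation rate: {rate:.2f}")
```

For probabilistic scores, a tolerance band like the `tol` parameter above avoids flagging negligible numerical differences as violations.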
Real World Limitations and Best Practices
While the Amazon CausalFairnessInAction library is a profound step forward, causal reasoning is not a magical solution to all fairness problems. As practitioners, we must acknowledge the inherent limitations of this approach.
The Assumption of Unconfoundedness
The mathematics of causal inference rely heavily on the assumption that there are no unmeasured confounding variables. If there is a hidden factor in the real world that influences both gender and loan approval, and you fail to include it in your Directed Acyclic Graph, the metrics output by the library will be incorrect.
Always approach causal fairness metrics with humility. They are mathematical derivations of your graph structure. If your graph is missing critical reality, your fairness metrics will provide a false sense of security.
Graph Specification Disagreements
Defining the causal graph is inherently subjective. Two different sociologists or economists might draw slightly different arrows between Income, Education, and Credit Score. Because the fairness metrics depend entirely on the structure of the graph, different graphs will yield different fairness assessments. It is a best practice to perform sensitivity analysis, slightly altering the edges of your graph to see how robust your fairness metrics are to structural uncertainty.
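One lightweight way to begin such a sensitivity analysis is purely structural: remove each contested edge in turn and inspect how the set of causal paths from the sensitive attribute to the outcome changes, before re-estimating any metrics. A sketch using only NetworkX (the choice of contested edges is illustrative):

```python
import networkx as nx

base_edges = [
    ('Gender', 'Income'), ('Age', 'Income'),
    ('Age', 'Credit_Score'), ('Income', 'Credit_Score'),
    ('Income', 'Loan_Approval'), ('Credit_Score', 'Loan_Approval'),
    ('Gender', 'Loan_Approval'),
]

# Edges that domain experts might reasonably disagree about
contested = [('Age', 'Credit_Score'), ('Gender', 'Income')]

variants = []
for edge in contested:
    g = nx.DiGraph(base_edges)
    g.remove_edge(*edge)
    assert nx.is_directed_acyclic_graph(g)   # each variant must stay a DAG
    variants.append((edge, g))

# Each variant would then be re-fit and its fairness metrics compared
for edge, g in variants:
    n_paths = len(list(nx.all_simple_paths(g, 'Gender', 'Loan_Approval')))
    print(f"Without {edge}: {n_paths} causal path(s) from Gender to outcome")
```

If the fairness conclusions flip under a plausible variant, that fragility is itself an important finding to surface to stakeholders.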
The Future of Responsible Artificial Intelligence
The release of CausalFairnessInAction by Amazon Science marks a significant maturation in the field of Responsible AI. We are finally moving past the era of relying solely on flawed observational statistics. By embracing causal reasoning, we are forcing ourselves to formally document our assumptions about the world.
When developers use this library to audit their models, they are doing much more than just running a compliance script. They are translating systemic social realities into code. They are distinguishing between the direct bias a model introduces and the historical bias the model inherits from society.
As machine learning continues to integrate into high-stakes decision-making, tools that expose the true data-generating mechanisms will transition from academic curiosities to mandatory engineering standards. Integrating causal fairness audits into your machine learning pipelines today will ensure your systems are not just statistically compliant, but genuinely equitable.