[ ]:
import fdfi
print('FDFI version:', fdfi.__version__)
Confidence Intervals and Statistical Inference
This tutorial covers statistical inference with FDFI, including confidence intervals, hypothesis testing, and feature selection.
What You’ll Learn
Computing confidence intervals with conf_int()
One-sided vs two-sided tests
Variance floor for stable inference
Practical significance margins
Statistically-driven feature selection
[ ]:
import numpy as np
import matplotlib.pyplot as plt
from fdfi.explainers import OTExplainer
from fdfi.plots import confidence_interval_plot
np.random.seed(42)
Setup
Create a model where we know the true feature importance:
[ ]:
n_features = 10
n_train = 500
n_test = 100
# True importance: features 0, 1, 2 are important; rest are noise
true_coefs = np.array([2.0, 1.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
def model(X):
    return X @ true_coefs
X_train = np.random.randn(n_train, n_features)
X_test = np.random.randn(n_test, n_features)
# Create explainer
explainer = OTExplainer(model, data=X_train, nsamples=100)
results = explainer(X_test)
print("True coefficients:", true_coefs)
print("Estimated importance:", results["phi_X"].round(3))
Basic Confidence Intervals
The conf_int() method computes pointwise confidence intervals:
[ ]:
# Two-sided 95% confidence intervals
ci = explainer.conf_int(alpha=0.05, alternative="two-sided")
print("Two-sided 95% Confidence Intervals:")
print("-" * 70)
print(f"{'Feature':>8} {'Estimate':>10} {'SE':>10} {'CI Lower':>10} {'CI Upper':>10} {'P-value':>10}")
print("-" * 70)
for i in range(n_features):
    sig = "*" if ci["pvalue"][i] < 0.05 else ""
    print(f"{i:>8} {ci['score'][i]:>10.4f} {ci['se'][i]:>10.4f} "
          f"{ci['ci_lower'][i]:>10.4f} {ci['ci_upper'][i]:>10.4f} {ci['pvalue'][i]:>10.4f} {sig}")
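To make the columns above concrete, here is a minimal sketch of how a two-sided interval and p-value follow from an estimate and its standard error. It assumes a normal (Wald) approximation; whether conf_int() uses exactly this computation internally is an assumption, and the function name `wald_interval` and the input numbers are hypothetical.

```python
from statistics import NormalDist


def wald_interval(estimate, se, alpha=0.05):
    """Two-sided normal-approximation interval and p-value for H0: phi = 0."""
    std_normal = NormalDist()
    # Critical value of the standard normal (about 1.96 for alpha = 0.05)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    lower = estimate - z_crit * se
    upper = estimate + z_crit * se
    # Two-sided p-value: probability of a |z| at least this large under H0
    z = estimate / se
    pvalue = 2 * (1 - std_normal.cdf(abs(z)))
    return lower, upper, pvalue


# Hypothetical numbers for illustration, not FDFI output
lower, upper, pvalue = wald_interval(estimate=0.50, se=0.10)
print(f"CI = [{lower:.3f}, {upper:.3f}], p-value = {pvalue:.2e}")
```

Under this approximation, the interval excludes zero exactly when the two-sided p-value is below alpha.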
Visualize Confidence Intervals
[ ]:
feature_names = [f"X{i}" for i in range(n_features)]
confidence_interval_plot(
    ci,
    feature_names=feature_names,
    show=False,
)
One-Sided Tests
For feature importance, we often care if a feature has positive importance. Use alternative="greater":
[ ]:
# One-sided test: H0: phi <= 0 vs H1: phi > 0
ci_greater = explainer.conf_int(alpha=0.05, alternative="greater")
print("One-sided test (phi > 0):")
print("-" * 60)
print(f"{'Feature':>8} {'Estimate':>10} {'CI Lower':>10} {'P-value':>10} {'Significant':>12}")
print("-" * 60)
for i in range(n_features):
    sig = "Yes" if ci_greater["reject_null"][i] else "No"
    print(f"{i:>8} {ci_greater['score'][i]:>10.4f} "
          f"{ci_greater['ci_lower'][i]:>10.4f} {ci_greater['pvalue'][i]:>10.4f} {sig:>12}")
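The difference between the one-sided and two-sided tests can be sketched with the same normal approximation. This is an illustration under that assumption, not FDFI's implementation; `one_sided_greater` and the input numbers are hypothetical.

```python
from statistics import NormalDist


def one_sided_greater(estimate, se, alpha=0.05):
    """Lower confidence bound and p-value for H0: phi <= 0 vs H1: phi > 0."""
    std_normal = NormalDist()
    # One-sided critical value (about 1.645 for alpha = 0.05)
    z_crit = std_normal.inv_cdf(1 - alpha)
    lower = estimate - z_crit * se
    # Only large positive z-values count as evidence against H0
    pvalue = 1 - std_normal.cdf(estimate / se)
    return lower, pvalue, pvalue < alpha


# Hypothetical borderline feature: z = 1.8
estimate, se = 0.18, 0.10
lower, p_one_sided, reject = one_sided_greater(estimate, se)
p_two_sided = 2 * (1 - NormalDist().cdf(abs(estimate / se)))

print(f"one-sided: lower bound = {lower:.4f}, p = {p_one_sided:.4f}, reject = {reject}")
print(f"two-sided: p = {p_two_sided:.4f}")
```

For a positive estimate, the one-sided p-value is half the two-sided one, so a borderline feature like this can be significant one-sided but not two-sided at the same alpha.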
Variance Floor
When the estimated variance of a feature's importance is very small, its confidence interval can become unrealistically narrow, so even a negligible estimate may appear significant. The variance floor guards against this by imposing a minimum standard error.
Two methods are available:
fixed: Use a constant floor value
mixture: Fit a two-component mixture to estimate the floor
[ ]:
# Without variance floor
ci_no_floor = explainer.conf_int(alpha=0.05, var_floor_c=0)
# With fixed variance floor
ci_fixed = explainer.conf_int(alpha=0.05, var_floor_method="fixed", var_floor_c=0.1)
# With mixture-based floor
ci_mixture = explainer.conf_int(alpha=0.05, var_floor_method="mixture", var_floor_quantile=0.95)
print("Standard errors comparison:")
print("-" * 55)
print(f"{'Feature':>8} {'No Floor':>12} {'Fixed':>12} {'Mixture':>12}")
print("-" * 55)
for i in range(n_features):
    print(f"{i:>8} {ci_no_floor['se'][i]:>12.4f} {ci_fixed['se'][i]:>12.4f} {ci_mixture['se'][i]:>12.4f}")
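Why a floor matters can be seen in a small sketch. It assumes the floor is applied as a simple lower bound on the standard error; how FDFI derives the floor from var_floor_c or from the mixture fit is not shown here, and the helper names and numbers are hypothetical.

```python
from statistics import NormalDist


def pvalue_greater(estimate, se):
    """One-sided p-value for H0: phi <= 0 under a normal approximation."""
    return 1 - NormalDist().cdf(estimate / se)


def apply_floor(se, floor):
    """Assumed floor rule: never let the standard error drop below `floor`."""
    return max(se, floor)


# Hypothetical noise feature: tiny estimate with an even tinier standard error
estimate, raw_se = 0.004, 0.001
floor = 0.01  # hypothetical floor value

p_raw = pvalue_greater(estimate, raw_se)                          # z = 4.0
p_floored = pvalue_greater(estimate, apply_floor(raw_se, floor))  # z = 0.4

print(f"without floor: p = {p_raw:.2e}")
print(f"with floor:    p = {p_floored:.4f}")
```

Without the floor, the negligible estimate is declared significant; with it, the z-statistic shrinks and the feature is no longer flagged. Features whose standard error already exceeds the floor are unaffected.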
Practical Significance Margin
Instead of testing \(H_0: \phi = 0\), you can test against a practical threshold \(\delta\):
\[H_0: \phi \le \delta \quad \text{vs.} \quad H_1: \phi > \delta\]
This identifies features that are not just statistically different from zero, but also practically meaningful.
[ ]:
# Test with practical margin of 0.5
margin = 0.5
ci_margin = explainer.conf_int(
    alpha=0.05,
    alternative="greater",
    margin=margin,
)
print(f"Testing H0: phi <= {margin}")
print("-" * 50)
print(f"{'Feature':>8} {'Estimate':>10} {'P-value':>10} {'Significant':>12}")
print("-" * 50)
for i in range(n_features):
    sig = "Yes" if ci_margin["reject_null"][i] else "No"
    print(f"{i:>8} {ci_margin['score'][i]:>10.4f} {ci_margin['pvalue'][i]:>10.4f} {sig:>12}")
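The margin test amounts to shifting the null value before standardizing. The sketch below assumes a normal approximation with z = (estimate - margin) / se; this is an illustration of the idea rather than FDFI's internal code, and `pvalue_with_margin` and the numbers are hypothetical.

```python
from statistics import NormalDist


def pvalue_with_margin(estimate, se, margin=0.0):
    """One-sided p-value for H0: phi <= margin vs H1: phi > margin."""
    z = (estimate - margin) / se
    return 1 - NormalDist().cdf(z)


se = 0.10  # hypothetical standard error shared by both features

# A feature just above the margin: clearly nonzero, but not clearly above 0.5
p_zero = pvalue_with_margin(0.55, se, margin=0.0)    # z = 5.5
p_margin = pvalue_with_margin(0.55, se, margin=0.5)  # z = 0.5

# A feature far above the margin stays significant
p_strong = pvalue_with_margin(2.0, se, margin=0.5)   # z = 15

print(f"estimate 0.55, margin 0.0: p = {p_zero:.2e}")
print(f"estimate 0.55, margin 0.5: p = {p_margin:.4f}")
print(f"estimate 2.00, margin 0.5: p = {p_strong:.2e}")
```

A larger margin can only increase the p-value, so the set of significant features shrinks (or stays the same) as the margin grows.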
Automatic Margin via Mixture Model
Use margin_method="mixture" to automatically estimate a practical margin:
[ ]:
ci_auto_margin = explainer.conf_int(
    alpha=0.05,
    alternative="greater",
    margin_method="mixture",
    margin_quantile=0.95,
)
print(f"Automatically selected margin: {ci_auto_margin['margin']:.4f}")
print(f"Significant features: {np.where(ci_auto_margin['reject_null'])[0]}")
Feature Selection with Statistical Guarantees
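Use the one-sided tests to select the features whose importance is significantly greater than the margin at level alpha. Each feature is tested individually here; controlling the false discovery rate across features requires a multiple testing correction.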
[ ]:
def statistical_feature_selection(explainer, X_test, alpha=0.05, margin=0.0):
    """Select features with statistical significance."""
    # Compute importance
    results = explainer(X_test)
    # Get confidence intervals
    ci = explainer.conf_int(
        alpha=alpha,
        alternative="greater",
        margin=margin,
        var_floor_method="mixture",
    )
    # Select significant features
    selected = np.where(ci["reject_null"])[0]
    # Sort by importance
    sorted_idx = np.argsort(ci["score"][selected])[::-1]
    return selected[sorted_idx], ci
# Run feature selection
selected_features, ci_result = statistical_feature_selection(
    explainer, X_test, alpha=0.05, margin=0.0
)
print("Selected Features (sorted by importance):")
print("-" * 40)
for i, feat in enumerate(selected_features):
    print(f" {i+1}. Feature {feat} (importance = {ci_result['score'][feat]:.4f})")
print(f"\nTrue important features: 0, 1, 2")
print(f"Correctly identified: {set(selected_features) & {0, 1, 2}}")
The summary() Method
For a quick formatted view, use the built-in summary() method:
[ ]:
# Print formatted summary
explainer.summary(
    alpha=0.05,
    alternative="greater",
    var_floor_method="mixture",
)
Multiple Testing Correction
When explaining models with many features, it is important to control the false discovery rate (FDR). You can specify a multitest_method in both conf_int() and summary():
[ ]:
# Summary with FDR control (Benjamini-Hochberg)
explainer.summary(
    alpha=0.05,
    alternative="greater",
    multitest_method="fdr_bh",
)
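For readers unfamiliar with the Benjamini-Hochberg procedure, the following is a minimal pure-Python sketch of the adjustment. It illustrates the standard procedure only; how FDFI applies multitest_method="fdr_bh" internally is not shown here, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return BH-adjusted p-values and rejection decisions at FDR level alpha."""
    m = len(pvalues)
    # Indices of the p-values in ascending order
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # taking a running minimum so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    reject = [p <= alpha for p in adjusted]
    return adjusted, reject


# Hypothetical p-values for six features
pvals = [0.001, 0.008, 0.039, 0.041, 0.20, 0.60]
adjusted, reject = benjamini_hochberg(pvals, alpha=0.05)

for p, q, r in zip(pvals, adjusted, reject):
    print(f"p = {p:.3f}  adjusted = {q:.4f}  reject = {r}")
```

In this example, four of the raw p-values fall below 0.05, but only the two smallest survive the correction.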
Summary
Key takeaways:
conf_int() provides confidence intervals and p-values for feature importance
Use alternative="greater" for one-sided tests of positive importance
Variance floor (var_floor_method) stabilizes inference for small effects
Practical margin (margin) tests against meaningful thresholds
Use summary() for quick formatted output