Methodology
How we calculate influence correlation scores
Important Disclaimer
Correlation Does Not Imply Causation
The scores presented on this platform represent statistical correlations, not proof of corruption, wrongdoing, or undue influence. A high score does not mean a legislator is corrupt or acting improperly.
Many legitimate reasons can explain correlations between campaign contributions and voting patterns:
- Industries naturally support legislators who share their policy views
- Legislators often represent constituents who work in specific industries
- Committee assignments attract industry interest and expertise
- Ideological alignment precedes and explains both contributions and votes
This tool provides information for further investigation, not accusations or conclusions.
The Common Sense Algorithm
The Common Sense Algorithm analyzes the relationship between campaign contributions and legislative voting behavior using four independent statistical components. Each component measures a different aspect of this relationship, and the final score combines these measurements with weights reflecting their relative importance.
Contribution Concentration (HHI)
25%: How reliant is the legislator on a single industry for funding?
Vote Alignment
35%: How often does the legislator vote in favor of donor industry interests?
Timing Patterns
20%: Do contributions cluster suspiciously close to favorable votes?
Peer Deviation
20%: Does the legislator differ significantly from similar peers?
1. Contribution Concentration (HHI)
We use the Herfindahl-Hirschman Index (HHI), a well-established economic measure of market concentration, to assess how dependent a legislator is on contributions from specific industries. Think of it as measuring how many eggs are in one basket.
The Formula
HHI = Σ_i (industry_share_i)² × 10,000
Where industry_share_i equals the contributions from industry i divided by total contributions, and the sum runs over all contributing industries. The result ranges from near 0 (perfectly diversified) to 10,000 (single-industry monopoly).
In Plain English
If a legislator receives contributions from many different industries in roughly equal amounts, the HHI will be low. If most contributions come from a single industry, the HHI will be high. We use the concentration thresholds from the 2010 Horizontal Merger Guidelines, issued by the Department of Justice and Federal Trade Commission for antitrust analysis.
| HHI Range | Concentration Level | Interpretation |
|---|---|---|
| < 1,500 | Low | Diversified funding from multiple industries |
| 1,500 - 2,499 | Moderate | Some industry concentration |
| ≥ 2,500 | High | Significant reliance on one industry |
Normalization
We normalize the raw HHI to a 0-1 scale for comparison:
Normalized Score = (HHI - min_HHI) / (10,000 - min_HHI)
Where min_HHI = 10,000 / n (n = number of industries contributing).
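The HHI formula, the threshold table, and the normalization step can be sketched in a few lines of Python. This is a minimal illustration, not the platform's actual code: the function names and the dictionary input are hypothetical, and the handling of a single contributing industry (where the normalization formula divides by zero) is an assumption made here, since the text above does not specify it.

```python
from collections.abc import Mapping


def hhi(contributions: Mapping[str, float]) -> float:
    """Raw HHI (0 to 10,000) from per-industry contribution totals."""
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("total contributions must be positive")
    return sum((amount / total) ** 2 for amount in contributions.values()) * 10_000


def concentration_level(score: float) -> str:
    """Map a raw HHI to the Low / Moderate / High bands in the table."""
    if score < 1_500:
        return "Low"
    if score < 2_500:
        return "Moderate"
    return "High"


def normalized_hhi(contributions: Mapping[str, float]) -> float:
    """Rescale the raw HHI to 0-1 using min_HHI = 10,000 / n."""
    n = sum(1 for amount in contributions.values() if amount > 0)
    raw = hhi(contributions)
    if n == 1:
        # Assumption: the formula is 0/0 for a single industry, so this
        # sketch treats that case as maximum concentration.
        return 1.0
    min_hhi = 10_000 / n
    return (raw - min_hhi) / (10_000 - min_hhi)


example = {"energy": 6_000, "health": 3_000, "finance": 1_000}
print(round(hhi(example)), concentration_level(hhi(example)))
print(round(normalized_hhi(example), 2))
```

With shares of 60%, 30%, and 10%, the raw HHI is 0.36 + 0.09 + 0.01 = 0.46, scaled to 4,600, which falls in the High band. The normalized score is (4,600 - 3,333.3) / (10,000 - 3,333.3) = 0.19.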
2. Vote Alignment
This component measures how frequently a legislator votes in ways that favor industries that contribute to their campaigns. We track votes on bills that have been tagged as relevant to specific industries.
The Formula
Alignment Score = aligned_votes / (aligned_votes + opposed_votes)
Determining Favorable Votes
For each bill tagged with an industry relevance:
- Industry-supported bills: A "Yea" vote is considered favorable
- Industry-opposed bills: A "Nay" vote is considered favorable
- Neutral/Unknown position: We default to "Yea" as favorable
In Plain English
If a legislator votes "Yea" on bills that an industry supports and "Nay" on bills that industry opposes, their alignment score will be high. We only count definitive votes ("Yea" or "Nay"), not "Present" or abstentions.
Interpretation
- Score 0.0: Always votes against industry interests
- Score 0.5: Evenly split between aligned and opposed
- Score 1.0: Always votes in industry's favor
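The favorable-vote rules and the alignment formula can be illustrated with a short Python sketch. The position labels ("support", "oppose", and None for neutral or unknown) and the function names are hypothetical choices for this example; returning None when no definitive votes exist is also an assumption, not documented behavior.

```python
from typing import Iterable, Optional, Tuple


def is_favorable(vote: str, industry_position: Optional[str]) -> Optional[bool]:
    """True or False for definitive votes, None for anything else."""
    if vote not in ("Yea", "Nay"):
        return None  # "Present" and abstentions are not counted
    # "Nay" is favorable only on industry-opposed bills; supported and
    # neutral/unknown bills default to "Yea" as favorable.
    favorable_vote = "Nay" if industry_position == "oppose" else "Yea"
    return vote == favorable_vote


def alignment_score(records: Iterable[Tuple[str, Optional[str]]]) -> Optional[float]:
    """aligned_votes / (aligned_votes + opposed_votes), or None if no votes count."""
    aligned = opposed = 0
    for vote, position in records:
        favorable = is_favorable(vote, position)
        if favorable is None:
            continue
        if favorable:
            aligned += 1
        else:
            opposed += 1
    counted = aligned + opposed
    return aligned / counted if counted else None


votes = [
    ("Yea", "support"),      # aligned
    ("Nay", "oppose"),       # aligned
    ("Yea", "oppose"),       # opposed
    ("Present", "support"),  # skipped
    ("Yea", None),           # aligned by the neutral default
]
print(alignment_score(votes))  # 0.75
```

Three of the four definitive votes are favorable, so the score is 0.75.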
3. Timing Patterns
This component analyzes the temporal relationship between when contributions are received and when favorable votes are cast. We look for patterns where contributions cluster suspiciously close to votes.
Analysis Windows
| Window | Days Before Vote | Interpretation |
|---|---|---|
| Suspicious | 0-30 days | Contribution very close to vote |
| Notable | 31-90 days | Contribution moderately close to vote |
| Background | 91-180 days | Normal campaign activity window |
The Formula
Timing Score = (suspicious_ratio × 0.6) + (consistency × 0.4)
Where:
- suspicious_ratio = contributions in 30-day window / total pre-vote contributions
- consistency = measure of regular pre-vote contribution pattern (0-1)
Statistical Significance
We apply a binomial test to determine if contributions cluster before votes more than random chance would predict. If contributions were randomly distributed, we would expect only about 16.7% (30/180 days) to fall in the suspicious window.
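A rough Python sketch of the timing score and the binomial check follows. Several details are assumptions made for illustration: "total pre-vote contributions" is read as contributions within the 180-day lookback, the test is one-sided, and the consistency value is passed in as a number because its calculation is not described above. All function names are hypothetical.

```python
from math import comb
from typing import Sequence

SUSPICIOUS_DAYS = 30
LOOKBACK_DAYS = 180


def suspicious_ratio(days_before_vote: Sequence[int]) -> float:
    """Share of pre-vote contributions that fall in the 0-30 day window."""
    # Assumption: only contributions within the 180-day lookback count.
    pre_vote = [d for d in days_before_vote if 0 <= d <= LOOKBACK_DAYS]
    if not pre_vote:
        return 0.0
    close = sum(1 for d in pre_vote if d <= SUSPICIOUS_DAYS)
    return close / len(pre_vote)


def timing_score(ratio: float, consistency: float) -> float:
    """Timing Score = (suspicious_ratio x 0.6) + (consistency x 0.4)."""
    return ratio * 0.6 + consistency * 0.4


def binomial_p_value(k: int, n: int, p: float = SUSPICIOUS_DAYS / LOOKBACK_DAYS) -> float:
    """One-sided P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


days = [5, 10, 20, 45, 100, 170]  # days between each contribution and the vote
ratio = suspicious_ratio(days)
print(ratio)                     # 0.5
print(timing_score(ratio, 0.5))  # consistency of 0.5 is a placeholder value
print(round(binomial_p_value(3, 6), 3))
```

Here three of six contributions land in the 30-day window, against an expected share of about 16.7%. The one-sided p-value for that outcome is roughly 0.062, so with this few contributions the clustering would not clear a conventional 0.05 threshold.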
4. Peer Deviation
This component compares a legislator to their peers—other members of the same party in the same chamber. If a legislator stands out from their peers in their relationship with a particular industry, it may warrant further examination.
Peer Group Definition
Peers are defined as legislators who share:
- Same political party (Democrat, Republican, Independent)
- Same chamber (House or Senate)
- Currently in office
What We Measure
For the target legislator and all peers, we calculate:
- Industry contribution percentage (% of total from this industry)
- Vote alignment score with the industry
The Formula
We calculate z-scores to measure how many standard deviations the legislator is from the peer-group mean:
z_contribution = (target_contrib - mean_contrib) / std_dev_contrib
z_alignment = (target_alignment - mean_alignment) / std_dev_alignment
combined_z = (|z_contribution| + |z_alignment|) / 2
The combined z-score is converted to a 0-1 deviation score using the cumulative distribution function (CDF) of the normal distribution.
In Plain English
If a legislator receives significantly more from an industry than their peers, and also votes more favorably for that industry than their peers, the deviation score will be high. A score near 0 means they're typical for their group; a score near 1 means they're a statistical outlier.
Minimum Requirements
A minimum of 5 peers is required for meaningful analysis. If the peer group is too small, the confidence level is marked as "insufficient data."
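The z-score calculation can be sketched with Python's standard library. Three choices here are assumptions rather than documented behavior: the peer lists exclude the target legislator, the sample standard deviation is used, and the CDF conversion is written as 2Φ(z) - 1. The plain normal CDF of a non-negative combined z-score would start at 0.5, so the two-sided form is used in this sketch to make a typical legislator score near 0, as described above. Function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev
from typing import Optional, Sequence

MIN_PEERS = 5


def z_score(target: float, peer_values: Sequence[float]) -> float:
    """Standard deviations between the target and the peer mean."""
    spread = stdev(peer_values)  # sample standard deviation (assumption)
    if spread == 0:
        return 0.0  # identical peers give no basis for measuring deviation
    return (target - mean(peer_values)) / spread


def peer_deviation(
    target_contrib: float,
    target_alignment: float,
    peer_contribs: Sequence[float],
    peer_alignments: Sequence[float],
) -> Optional[float]:
    """0-1 deviation score, or None when the peer group is too small."""
    if len(peer_contribs) < MIN_PEERS or len(peer_alignments) < MIN_PEERS:
        return None  # reported as "insufficient data"
    z_contribution = z_score(target_contrib, peer_contribs)
    z_alignment = z_score(target_alignment, peer_alignments)
    combined_z = (abs(z_contribution) + abs(z_alignment)) / 2
    # Assumption: two-sided mapping so that combined_z = 0 gives a score of 0.
    return 2 * NormalDist().cdf(combined_z) - 1


peer_contribs = [0.10, 0.12, 0.08, 0.11, 0.09]    # industry share of total funding
peer_alignments = [0.50, 0.60, 0.40, 0.55, 0.45]  # vote alignment scores
print(peer_deviation(0.10, 0.50, peer_contribs, peer_alignments))
print(peer_deviation(0.50, 1.00, peer_contribs, peer_alignments))
```

Because the formula takes absolute values, a legislator who sits far below the peer mean on either measure is flagged as strongly as one who sits far above it.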
Composite Score Calculation
The final score combines all four components using a weighted average, with weights adjusted based on data quality and confidence levels.
Default Weights
| Component | Weight | Rationale |
|---|---|---|
| Vote Alignment | 35% | Most direct measure of voting behavior |
| Contribution Concentration | 25% | Measures funding dependency |
| Timing Patterns | 20% | Temporal correlation indicator |
| Peer Deviation | 20% | Comparative context measure |
Weight Adjustment
Weights are adjusted based on component confidence levels:
- High confidence: 100% weight
- Medium confidence: 70% weight
- Low confidence: 30% weight
- Insufficient data: 0% weight (excluded)
After adjustment, weights are renormalized to sum to 100%.
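The confidence adjustment and renormalization described above can be sketched as follows. The dictionary keys and function names are hypothetical, and returning None when every component is excluded is an assumption about an edge case the text does not cover.

```python
from typing import Dict, Optional

DEFAULT_WEIGHTS = {
    "vote_alignment": 0.35,
    "contribution_concentration": 0.25,
    "timing_patterns": 0.20,
    "peer_deviation": 0.20,
}

CONFIDENCE_MULTIPLIER = {"high": 1.0, "medium": 0.7, "low": 0.3, "insufficient": 0.0}


def adjusted_weights(confidence: Dict[str, str]) -> Dict[str, float]:
    """Scale each default weight by its confidence, then renormalize to sum to 1."""
    raw = {
        name: weight * CONFIDENCE_MULTIPLIER[confidence[name]]
        for name, weight in DEFAULT_WEIGHTS.items()
    }
    total = sum(raw.values())
    if total == 0:
        return {}  # every component was excluded
    return {name: weight / total for name, weight in raw.items() if weight > 0}


def composite_score(scores: Dict[str, float], confidence: Dict[str, str]) -> Optional[float]:
    """Weighted average of the component scores (each on a 0-1 scale)."""
    weights = adjusted_weights(confidence)
    if not weights:
        return None  # assumption: no score is reported without usable data
    return sum(scores[name] * weight for name, weight in weights.items())


confidence = {
    "vote_alignment": "high",
    "contribution_concentration": "medium",
    "timing_patterns": "low",
    "peer_deviation": "insufficient",
}
for name, weight in adjusted_weights(confidence).items():
    print(f"{name}: {weight:.3f}")
```

In this example the scaled weights are 0.35, 0.175, and 0.06, which sum to 0.585. Dividing each by 0.585 gives roughly 59.8%, 29.9%, and 10.3%, and Peer Deviation drops out entirely.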
Confidence Levels
Each component has its own confidence level based on the amount and quality of available data. These confidence levels affect how much weight each component receives in the final score.
| Component | High | Medium | Low | Insufficient |
|---|---|---|---|---|
| Contribution Concentration | ≥$10,000 total | $1,000-$9,999 | $1-$999 | $0 |
| Vote Alignment | ≥10 votes | 5-9 votes | 1-4 votes | 0 votes |
| Timing Patterns | ≥20 pairs | 10-19 pairs | 1-9 pairs | 0 pairs |
| Peer Deviation | ≥20 peers | 10-19 peers | 5-9 peers | <5 peers |
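The table above amounts to a threshold lookup, which might be written like this. The component keys and the function name are hypothetical, and treating any nonzero amount below the medium threshold as "low" is this sketch's reading of the table.

```python
from typing import Dict, Tuple

# (high minimum, medium minimum, low minimum) per component, from the table.
THRESHOLDS: Dict[str, Tuple[float, float, float]] = {
    "contribution_concentration": (10_000, 1_000, 0),  # dollars
    "vote_alignment": (10, 5, 0),                      # counted votes
    "timing_patterns": (20, 10, 0),                    # contribution-vote pairs
    "peer_deviation": (20, 10, 5),                     # peers
}


def confidence_level(component: str, amount: float) -> str:
    """Return 'high', 'medium', 'low', or 'insufficient' for a data volume."""
    high_min, medium_min, low_min = THRESHOLDS[component]
    if amount >= high_min:
        return "high"
    if amount >= medium_min:
        return "medium"
    # Zero data is always insufficient; peers also need at least 5.
    if amount > 0 and amount >= low_min:
        return "low"
    return "insufficient"


print(confidence_level("vote_alignment", 7))   # medium
print(confidence_level("peer_deviation", 4))   # insufficient
```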
Data Sources
All data used in our analysis comes from publicly available, authoritative sources:
Federal Election Commission (FEC)
Campaign contribution data, updated weekly during election cycles. Provides detailed records of individual and organizational contributions to candidate committees.
fec.gov/data
Congress.gov API
Roll call votes and bill information, updated daily. Official source for legislative actions, bill text, and voting records.
api.congress.gov
Bioguide
Biographical information for current and historical members of Congress. Provides unique identifiers used to link records across data sources.
bioguide.congress.gov
NAICS Codes
North American Industry Classification System codes for categorizing contributions by industry, with custom mappings for political relevance.
census.gov/naics
Update Frequency
- FEC data: Weekly during active election cycles
- Vote data: Daily (when Congress is in session)
- Score recalculation: Nightly batch processing
Limitations
Correlation vs Causation
Statistical correlation does not prove causation. A high score does not mean a legislator is corrupt or that their votes were "bought." Many factors influence both campaign contributions and voting behavior.
Committee Assignments
Legislators on committees relevant to specific industries naturally attract more contributions from those industries. A member of the Agriculture Committee will receive more from agricultural interests, regardless of their independence.
Constituent Industries
Representatives from districts with major employers in specific industries will naturally have higher contributions and votes aligned with those industries. A legislator from a farming district supporting agricultural policy is representing their constituents.
Vote Significance
Not all votes are equal. Procedural votes, amendments, and final passage votes have different significance. Our algorithm treats all recorded votes equally, which may overstate the importance of minor procedural votes.
Data Completeness
While we use comprehensive data sources, some contributions may be misclassified, delayed in reporting, or linked to incorrect industries. Bill-industry tagging involves judgment calls that may not capture all relevant legislation.
Historical Comparison
Score methodology may change over time as we improve our algorithms. Historical scores should not be directly compared across different methodology versions.
Open Source Transparency
Our entire codebase, including all scoring algorithms, is open source. We believe transparency is essential for trust. You can review, audit, and contribute to our methodology: