Methodology

How we calculate influence correlation scores

Important Disclaimer

Correlation Does Not Imply Causation

The scores presented on this platform represent statistical correlations, not proof of corruption, wrongdoing, or undue influence. A high score does not mean a legislator is corrupt or acting improperly.

Many legitimate reasons can explain correlations between campaign contributions and voting patterns:

  • Industries naturally support legislators who share their policy views
  • Legislators often represent constituents who work in specific industries
  • Committee assignments attract industry interest and expertise
  • Ideological alignment precedes and explains both contributions and votes

This tool provides information for further investigation, not accusations or conclusions.

The Common Sense Algorithm

The Common Sense Algorithm analyzes the relationship between campaign contributions and legislative voting behavior using four independent statistical components. Each component measures a different aspect of this relationship, and the final score combines these measurements with weights reflecting their relative importance.

  • Contribution Concentration (HHI), 25%: How reliant is the legislator on a single industry for funding?
  • Vote Alignment, 35%: How often does the legislator vote in favor of donor industry interests?
  • Timing Patterns, 20%: Do contributions cluster suspiciously close to favorable votes?
  • Peer Deviation, 20%: Does the legislator differ significantly from similar peers?

1. Contribution Concentration (HHI)

We use the Herfindahl-Hirschman Index (HHI), a well-established economic measure of market concentration, to assess how dependent a legislator is on contributions from specific industries. Think of it as measuring how many eggs are in one basket.

The Formula

HHI = Σ(industry_share_i)² × 10,000

Where industry_share_i equals the contributions from industry i divided by total contributions. The result ranges from near 0 (perfectly diversified) to 10,000 (single-industry monopoly).

In Plain English

If a legislator receives contributions from many different industries in roughly equal amounts, the HHI will be low. If most contributions come from a single industry, the HHI will be high. We use the concentration thresholds from the 2010 Horizontal Merger Guidelines issued by the U.S. Department of Justice and the Federal Trade Commission for antitrust analysis.

HHI Range | Concentration Level | Interpretation
< 1,500 | Low | Diversified funding from multiple industries
1,500 - 2,499 | Moderate | Some industry concentration
≥ 2,500 | High | Significant reliance on one industry
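As a rough illustration of the formula and the threshold table above, the sketch below computes an HHI from per-industry contribution totals and maps it to a concentration level. The function names `hhi` and `concentration_level` and the sample dollar amounts are hypothetical and are not taken from the platform's codebase.

```python
def hhi(contributions_by_industry):
    """Return the Herfindahl-Hirschman Index on the 0-10,000 scale."""
    total = sum(contributions_by_industry.values())
    if total <= 0:
        raise ValueError("total contributions must be positive")
    # Sum of squared industry shares, scaled by 10,000.
    shares = (amount / total for amount in contributions_by_industry.values())
    return sum(share ** 2 for share in shares) * 10_000


def concentration_level(hhi_value):
    """Map an HHI value to the Low / Moderate / High bands in the table."""
    if hhi_value < 1_500:
        return "Low"
    if hhi_value < 2_500:
        return "Moderate"
    return "High"


# Hypothetical legislator: half of all funding comes from one industry.
sample = {"energy": 50_000, "finance": 30_000, "health": 20_000}
score = hhi(sample)
print(round(score), concentration_level(score))  # prints: 3800 High
```

With shares of 50%, 30%, and 20%, the squared shares sum to 0.38, so the HHI is 3,800 and falls in the High band.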

Normalization

We normalize the raw HHI to a 0-1 scale for comparison:

Normalized Score = (HHI - min_HHI) / (10,000 - min_HHI)

Where min_HHI = 10,000 / n is the lowest HHI achievable when contributions are split evenly across the n contributing industries.
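The normalization can be sketched as follows. The function name `normalized_hhi` is hypothetical. Note that the formula is undefined when only one industry contributes (the denominator becomes zero); returning 1.0 in that case is an assumption of this sketch, not something the methodology states.

```python
def normalized_hhi(hhi_value, n_industries):
    """Rescale a raw HHI to 0-1 given the number of contributing industries."""
    if n_industries < 1:
        raise ValueError("at least one contributing industry is required")
    if n_industries == 1:
        # Assumption: a single-industry donor base is treated as fully
        # concentrated, because the formula would otherwise divide by zero.
        return 1.0
    min_hhi = 10_000 / n_industries  # HHI of a perfectly even split
    return (hhi_value - min_hhi) / (10_000 - min_hhi)


# An HHI of 3,800 spread over three industries.
print(round(normalized_hhi(3_800, 3), 2))  # prints: 0.07
```

The example shows that an HHI in the High band can still normalize to a small value when only a few industries contribute, because the minimum possible HHI for three industries is already about 3,333.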

2. Vote Alignment

This component measures how frequently a legislator votes in ways that favor industries that contribute to their campaigns. We track votes on bills that have been tagged as relevant to specific industries.

The Formula

Alignment Score = aligned_votes / (aligned_votes + opposed_votes)

Determining Favorable Votes

For each bill tagged with an industry relevance:

  • Industry-supported bills: A "Yea" vote is considered favorable
  • Industry-opposed bills: A "Nay" vote is considered favorable
  • Neutral/Unknown position: We default to "Yea" as favorable

In Plain English

If a legislator votes "Yea" on bills that an industry supports and "Nay" on bills that industry opposes, their alignment score will be high. We only count definitive votes ("Yea" or "Nay"), not "Present" or abstentions.
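A minimal sketch of the favorable-vote rules and the alignment formula, assuming each vote is recorded as a pair of the vote cast and the industry's position on the bill. The names `is_favorable` and `alignment_score` and the position labels "support" and "oppose" are hypothetical.

```python
def is_favorable(vote, industry_position):
    """Return True/False for a counted vote, or None if the vote is not counted.

    vote: "Yea", "Nay", or anything else (e.g. "Present").
    industry_position: "support", "oppose", or None for neutral/unknown.
    """
    if vote not in ("Yea", "Nay"):
        return None  # "Present" and abstentions are excluded
    if industry_position == "oppose":
        return vote == "Nay"
    # Industry-supported bills, and neutral/unknown positions by default,
    # treat "Yea" as the favorable vote.
    return vote == "Yea"


def alignment_score(records):
    """aligned_votes / (aligned_votes + opposed_votes), or None with no votes."""
    aligned = opposed = 0
    for vote, position in records:
        favorable = is_favorable(vote, position)
        if favorable is None:
            continue
        if favorable:
            aligned += 1
        else:
            opposed += 1
    counted = aligned + opposed
    return aligned / counted if counted else None


sample_votes = [
    ("Yea", "support"),      # aligned
    ("Nay", "oppose"),       # aligned
    ("Yea", "oppose"),       # opposed
    ("Present", "support"),  # not counted
    ("Nay", None),           # opposed (neutral defaults to "Yea" as favorable)
]
print(alignment_score(sample_votes))  # prints: 0.5
```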

Interpretation

  • Score 0.0: Always votes against industry interests
  • Score 0.5: Evenly split between aligned and opposed
  • Score 1.0: Always votes in industry's favor

3. Timing Patterns

This component analyzes the temporal relationship between when contributions are received and when favorable votes are cast. We look for patterns where contributions cluster suspiciously close to votes.

Analysis Windows

Window | Days Before Vote | Interpretation
Suspicious | 0-30 days | Contribution very close to vote
Notable | 31-90 days | Contribution moderately close to vote
Background | 91-180 days | Normal campaign activity window
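The window boundaries above could be applied to a contribution and a vote roughly like this. The function name `classify_window` is hypothetical, and returning None for contributions made after the vote or more than 180 days before it is an assumption of this sketch.

```python
from datetime import date


def classify_window(contribution_date, vote_date):
    """Return the analysis window for a contribution relative to a vote."""
    days_before = (vote_date - contribution_date).days
    if days_before < 0 or days_before > 180:
        return None  # assumption: outside the analyzed 0-180 day range
    if days_before <= 30:
        return "Suspicious"
    if days_before <= 90:
        return "Notable"
    return "Background"


# A contribution received 14 days before the vote.
print(classify_window(date(2024, 3, 1), date(2024, 3, 15)))  # prints: Suspicious
```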

The Formula

Timing Score = (suspicious_ratio × 0.6) + (consistency × 0.4)

Where:

  • suspicious_ratio = contributions in 30-day window / total pre-vote contributions
  • consistency = measure of regular pre-vote contribution pattern (0-1)
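The timing formula can be sketched as below. Two things are assumptions here: the ratio is computed over contribution counts rather than dollar amounts, and `consistency` is passed in as an already-computed value between 0 and 1, because its exact definition is not given above. The function name `timing_score` is hypothetical.

```python
def timing_score(days_before_vote, consistency):
    """Combine the 30-day clustering ratio with a consistency measure.

    days_before_vote: number of days before the vote for each contribution.
    consistency: a precomputed value in [0, 1].
    """
    if not 0.0 <= consistency <= 1.0:
        raise ValueError("consistency must be between 0 and 1")
    # Keep only contributions inside the 0-180 day pre-vote range.
    pre_vote = [d for d in days_before_vote if 0 <= d <= 180]
    if not pre_vote:
        return None
    suspicious_ratio = sum(1 for d in pre_vote if d <= 30) / len(pre_vote)
    return suspicious_ratio * 0.6 + consistency * 0.4


# Two of five pre-vote contributions fall in the 30-day window.
print(round(timing_score([5, 12, 45, 100, 170], consistency=0.5), 2))  # prints: 0.44
```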

Statistical Significance

We apply a binomial test to determine if contributions cluster before votes more than random chance would predict. If contributions were randomly distributed, we would expect only about 16.7% (30/180 days) to fall in the suspicious window.
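One way to run such a test with only the standard library is to compute the upper tail of the binomial distribution directly. This sketch assumes a one-sided test (clustering greater than chance) with a null probability of 30/180; the function name `binomial_p_value` and the sample counts are hypothetical.

```python
from math import comb


def binomial_p_value(k, n, p=30 / 180):
    """Return P(X >= k) for X ~ Binomial(n, p).

    k: contributions observed in the 30-day window.
    n: total pre-vote contributions in the 180-day range.
    p: chance of landing in the window if timing were random.
    """
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical case: 8 of 20 contributions fall in the 30-day window,
# compared with about 3.3 expected under random timing.
p_value = binomial_p_value(8, 20)
print(f"p-value: {p_value:.4f}")
```

A small p-value indicates that the observed clustering would be unlikely under random timing. It says nothing about why the clustering occurred.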

4. Peer Deviation

This component compares a legislator to their peers, meaning other members of the same party in the same chamber. If a legislator stands out from their peers in their relationship with a particular industry, it may warrant further examination.

Peer Group Definition

Peers are defined as legislators who share:

  • Same political party (Democrat, Republican, Independent)
  • Same chamber (House or Senate)
  • Currently in office

What We Measure

For the target legislator and all peers, we calculate:

  • Industry contribution percentage (% of total from this industry)
  • Vote alignment score with the industry

The Formula

We calculate z-scores to measure how many standard deviations the target legislator lies from the peer-group mean on each measure:

z_contribution = (target_contrib - mean_contrib) / std_dev_contrib
z_alignment = (target_alignment - mean_alignment) / std_dev_alignment
combined_z = (|z_contribution| + |z_alignment|) / 2

The combined z-score is converted to a 0-1 deviation score using the cumulative distribution function (CDF) of the normal distribution.

In Plain English

If a legislator receives significantly more from an industry than their peers, and also votes more favorably for that industry than their peers, the deviation score will be high. Because the formula uses absolute values, a legislator who falls well below the peer average on either measure also raises the score. A score near 0 means they're typical for their group; a score near 1 means they're a statistical outlier.

Minimum Requirements

A minimum of 5 peers is required for meaningful analysis. If the peer group is too small, the confidence level is marked as "insufficient data."
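The peer comparison might look roughly like the sketch below. Several details are assumptions rather than statements of the platform's method: the peer lists exclude the target legislator, the sample standard deviation is used, and the CDF conversion is written as 2Φ(z) − 1. That last choice is made because the combined z-score is never negative, so the plain normal CDF would start at 0.5 rather than at 0 for a typical legislator. The function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev

MIN_PEERS = 5  # minimum peer group size stated above


def z_score(target, peer_values):
    """Standard deviations between the target and the peer mean."""
    spread = stdev(peer_values)  # assumption: sample standard deviation
    if spread == 0:
        return 0.0  # assumption: identical peers give no measurable deviation
    return (target - mean(peer_values)) / spread


def deviation_score(target_contrib, peer_contribs, target_alignment, peer_alignments):
    """Return a 0-1 deviation score, or None for an undersized peer group."""
    if min(len(peer_contribs), len(peer_alignments)) < MIN_PEERS:
        return None  # "insufficient data"
    z_contribution = z_score(target_contrib, peer_contribs)
    z_alignment = z_score(target_alignment, peer_alignments)
    combined_z = (abs(z_contribution) + abs(z_alignment)) / 2
    # Assumption: map combined_z >= 0 onto [0, 1) with 2 * CDF(z) - 1.
    return 2 * NormalDist().cdf(combined_z) - 1


peers_contrib = [0.10, 0.12, 0.08, 0.11, 0.09]
peers_align = [0.50, 0.60, 0.40, 0.55, 0.45]
print(round(deviation_score(0.30, peers_contrib, 0.90, peers_align), 3))
```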

Composite Score Calculation

The final score combines all four components using a weighted average, with weights adjusted based on data quality and confidence levels.

Default Weights

Component | Weight | Rationale
Vote Alignment | 35% | Most direct measure of voting behavior
Contribution Concentration | 25% | Measures funding dependency
Timing Patterns | 20% | Temporal correlation indicator
Peer Deviation | 20% | Comparative context measure

Weight Adjustment

Weights are adjusted based on component confidence levels:

  • High confidence: 100% weight
  • Medium confidence: 70% weight
  • Low confidence: 30% weight
  • Insufficient data: 0% weight (excluded)

After adjustment, weights are renormalized to sum to 100%.
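The adjustment and renormalization steps can be sketched as follows, assuming every component score is already on a 0-1 scale. The dictionary keys, the function name `composite_score`, and the sample scores are hypothetical.

```python
DEFAULT_WEIGHTS = {
    "vote_alignment": 0.35,
    "contribution_concentration": 0.25,
    "timing_patterns": 0.20,
    "peer_deviation": 0.20,
}
CONFIDENCE_MULTIPLIER = {"high": 1.0, "medium": 0.7, "low": 0.3, "insufficient": 0.0}


def composite_score(scores, confidence):
    """Confidence-adjusted weighted average of the component scores."""
    # Scale each default weight by its component's confidence multiplier.
    adjusted = {
        name: weight * CONFIDENCE_MULTIPLIER[confidence[name]]
        for name, weight in DEFAULT_WEIGHTS.items()
    }
    total = sum(adjusted.values())
    if total == 0:
        return None  # every component lacked sufficient data
    # Dividing by the total renormalizes the remaining weights to sum to 1.
    return sum(scores[name] * w / total for name, w in adjusted.items() if w > 0)


scores = {
    "vote_alignment": 0.80,
    "contribution_concentration": 0.07,
    "timing_patterns": 0.44,
    "peer_deviation": None,  # excluded below as insufficient data
}
confidence = {
    "vote_alignment": "high",
    "contribution_concentration": "medium",
    "timing_patterns": "low",
    "peer_deviation": "insufficient",
}
print(round(composite_score(scores, confidence), 3))  # prints: 0.545
```

In this example the adjusted weights are 0.35, 0.175, 0.06, and 0, which renormalize to roughly 59.8%, 29.9%, 10.3%, and 0%.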

Confidence Levels

Each component has its own confidence level based on the amount and quality of available data. These confidence levels affect how much weight each component receives in the final score.

Component | High | Medium | Low | Insufficient
Contribution Concentration | ≥$10,000 total | $1,000-$9,999 | <$1,000 | $0
Vote Alignment | ≥10 votes | 5-9 votes | <5 votes | 0 votes
Timing Patterns | ≥20 pairs | 10-19 pairs | <10 pairs | 0 pairs
Peer Deviation | ≥20 peers | 10-19 peers | 5-9 peers | <5 peers
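The table above can be expressed as a small lookup, as in this sketch. The names `CONFIDENCE_THRESHOLDS` and `confidence_level` are hypothetical, and the sketch assumes that the Insufficient column takes precedence wherever it overlaps the Low column (for example, 0 votes is "insufficient" rather than "low").

```python
# Per component: (minimum for high, minimum for medium, maximum for insufficient)
CONFIDENCE_THRESHOLDS = {
    "contribution_concentration": (10_000, 1_000, 0),  # total dollars
    "vote_alignment": (10, 5, 0),                      # counted votes
    "timing_patterns": (20, 10, 0),                    # contribution-vote pairs
    "peer_deviation": (20, 10, 4),                     # peers in the group
}


def confidence_level(component, amount):
    """Return "high", "medium", "low", or "insufficient" for a component."""
    high_min, medium_min, insufficient_max = CONFIDENCE_THRESHOLDS[component]
    if amount <= insufficient_max:
        return "insufficient"
    if amount >= high_min:
        return "high"
    if amount >= medium_min:
        return "medium"
    return "low"


print(confidence_level("peer_deviation", 7))   # prints: low
print(confidence_level("vote_alignment", 12))  # prints: high
```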

Data Sources

All data used in our analysis comes from publicly available, authoritative sources:

Federal Election Commission (FEC)

Campaign contribution data, updated weekly during election cycles. Provides detailed records of individual and organizational contributions to candidate committees.

fec.gov/data

Congress.gov API

Roll call votes and bill information, updated daily. Official source for legislative actions, bill text, and voting records.

api.congress.gov

Bioguide

Biographical information for current and historical members of Congress. Provides unique identifiers used to link records across data sources.

bioguide.congress.gov

NAICS Codes

North American Industry Classification System codes for categorizing contributions by industry, with custom mappings for political relevance.

census.gov/naics

Update Frequency

  • FEC data: Weekly during active election cycles
  • Vote data: Daily (when Congress is in session)
  • Score recalculation: Nightly batch processing

Limitations

Correlation vs Causation

Statistical correlation does not prove causation. A high score does not mean a legislator is corrupt or that their votes were "bought." Many factors influence both campaign contributions and voting behavior.

Committee Assignments

Legislators on committees relevant to specific industries naturally attract more contributions from those industries. A member of the Agriculture Committee will receive more from agricultural interests, regardless of their independence.

Constituent Industries

Representatives from districts with major employers in specific industries will naturally have higher contributions and votes aligned with those industries. A legislator from a farming district supporting agricultural policy is representing their constituents.

Vote Significance

Not all votes are equal. Procedural votes, amendments, and final passage votes have different significance. Our algorithm treats all recorded votes equally, which may overstate the importance of minor procedural votes.

Data Completeness

While we use comprehensive data sources, some contributions may be misclassified, delayed in reporting, or linked to incorrect industries. Bill-industry tagging involves judgment calls that may not capture all relevant legislation.

Historical Comparison

Score methodology may change over time as we improve our algorithms. Historical scores should not be directly compared across different methodology versions.

Open Source Transparency

Our entire codebase, including all scoring algorithms, is open source. We believe transparency is essential for trust. You can review, audit, and contribute to our methodology: