5500 Analyzer
Methodology

Dataset, scoring, and data linkages

This document specifies the data sources, derivation rules, and known limitations of every number presented in the 5500 Analyzer. All scoring criteria below are evaluated from the public DOL EFAST2 release; no proprietary or licensed data enters at the row level.


§1 Dataset

All data derive from the U.S. Department of Labor EFAST2 public release of Form 5500 and Form 5500-SF for plan years 2022, 2023, and 2024. The corpus is loaded from Parquet files published by the DOL and queried locally with DuckDB.

Plan year coverage
2022, 2023, 2024. Filings with PLAN_YEAR < 2022 are not present in the loaded corpus and are surfaced as “Pre-2022” if encountered.
Filings (form_5500.parquet, form_5500_sf.parquet)
One row per filing. The unique identifier is ACK_ID. The same plan (sponsor EIN + plan number) appears once per plan year.
Service providers (sch_c_item1, sch_c_item2, sch_c_item3, sch_c_part2)
Schedule C disclosure of compensated service providers. Schedule C is required only for Form 5500 filers (not 5500-SF) with at least one provider receiving $5,000 or more in a plan year.
Schedule H (sch_h.parquet)
Audited financial statements for large plans (generally ≥100 participants). Contains assets, liabilities, and the Line 4i schedule-of-assets summary.
Schedule D (sch_d.parquet)
List of Direct Filing Entities (DFEs) — master trusts, common/collective trusts, pooled separate accounts — in which the filer holds an interest.
Schedule of Assets (Line 4i, parsed offline)
Line-item investment holdings. Filed as a free-form PDF attachment on EFAST2; we extract these offline. Coverage is partial and prioritized toward the largest plans and master trusts.

§2 Data linkages

A retirement plan is rarely described by a single filing. The information needed to evaluate a plan is distributed across the Form 5500 itself, its attached schedules, and — when assets are pooled — the filings of separate Direct Filing Entities. The diagram below specifies the relationships.

Plan sponsor (SPONSOR_EIN)
  └─ files one Form 5500 or 5500-SF per plan, per plan year
     │   (uniquely identified by ACK_ID)
     │
     ├─ Schedule C ··················· compensated service providers
     │     └─ Part 2 ················· providers who failed to disclose
     │
     ├─ Schedule H ··················· audited financial statements
     │     │      (large plans only, generally ≥100 participants)
     │     │
     │     └─ Line 4i ················ Schedule of Assets (line items)
     │            ↳ filed as a PDF attachment on EFAST2
     │            ↳ parsed offline; coverage is partial
     │
     └─ Schedule D ··················· interests in pooled investment
            │                          vehicles (Direct Filing Entities)
            │
            ▼
     ┌─────────────────────────────────────────────┐
     │  Direct Filing Entity (master trust / CCT)  │
     │  • Has its own EIN + plan number             │
     │  • Files its OWN Form 5500                    │
     │  • Its Schedule H reports the FULL pooled    │
     │    balance, not the per-plan allocation      │
     │  • Its Line 4i lists the underlying          │
     │    securities (CUSIPs, share classes, etc.)  │
     └─────────────────────────────────────────────┘

Practical consequences for what you see on a plan page

  1. Small filers (Form 5500-SF). No Schedule H, no Schedule of Assets, no Schedule C. Holdings cannot be shown for these plans because they were never disclosed in the first place.
  2. Large filers with no DFE interests. Holdings exist as a PDF attachment to the filing. The 5500 Analyzer shows them when our offline parser has unlocked that filing’s template; otherwise the plan page links to the original PDF on EFAST2.
  3. Large filers with Schedule D interests. The plan’s own Schedule H reports each pooled vehicle as a single allocated-interest balance (e.g. “Vanguard Fiduciary Trust Company Master Trust: $X”). The line-item securities sit on the master trust’s own Form 5500. The plan page surfaces this with a Schedule D table and links the participant to the trust’s filing.
  4. The filing is itself a DFE. Its Schedule H reports the full pooled balance for the trust. Participating plans each report their allocated interest separately. The plan page replaces the Plan Health Score with a DFE summary in this case because the health-score formula does not apply to a pooled investment vehicle (no benefit design, no participants in the usual sense, no IQPA requirement).

A filing’s reported “total plan assets” on its own row in the search results is always the value the plan itself reported on Schedule H. No allocation, deduplication, or inference is applied at the row level.

§3 DFE double-counting in aggregate totals

When totals are computed across the corpus — for example, the homepage hero numbers — the structure described in §2 produces a known double-count: a master trust’s assets appear once on the trust’s own Form 5500 and again on each participating plan’s Schedule H.

How we handle it

  • DFE filings are identified by joining each filing’s (SPONSOR_EIN, PLAN_NUM) against the (DFE_EIN, DFE_PN) columns of raw_sch_d across the corpus. Any filing whose EIN/PN is cited as a DFE by another plan’s Schedule D is flagged as a DFE filing.
  • The homepage total assets aggregate sums only filings for the latest plan year and excludes DFE-flagged filings. The number reported elsewhere (e.g. on a plan’s detail page) is always the single filing’s own Schedule H value and is not adjusted.
  • The homepage active participants aggregate sums the latest plan year only, with no DFE adjustment. A DFE filing typically reports zero or non-applicable participant counts.
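
The flagging and exclusion steps above can be sketched in a few lines of Python. This is a minimal illustration, not the production query: the Filing class, both function names, and the sample EINs and amounts are hypothetical, and only the SPONSOR_EIN, PLAN_NUM, DFE_EIN, and DFE_PN column names come from the text.

```python
from dataclasses import dataclass
from typing import List, Mapping, Set


@dataclass(frozen=True)
class Filing:
    # Hypothetical in-memory stand-in for one row of the filings table.
    ack_id: str
    sponsor_ein: str
    plan_num: str
    plan_year: int
    total_assets: float


def flag_dfe_filings(filings: List[Filing], sch_d_rows: List[Mapping[str, str]]) -> Set[str]:
    """Return the ACK_IDs of filings whose (EIN, PN) is cited as a DFE on any Schedule D row."""
    cited = {(row["DFE_EIN"], row["DFE_PN"]) for row in sch_d_rows}
    return {f.ack_id for f in filings if (f.sponsor_ein, f.plan_num) in cited}


def homepage_total_assets(filings: List[Filing], sch_d_rows: List[Mapping[str, str]]) -> float:
    """Sum assets for the latest plan year only, skipping DFE-flagged filings."""
    latest_year = max(f.plan_year for f in filings)
    dfe_ack_ids = flag_dfe_filings(filings, sch_d_rows)
    return sum(
        f.total_assets
        for f in filings
        if f.plan_year == latest_year and f.ack_id not in dfe_ack_ids
    )


# Made-up sample: two plans invest in one master trust (EIN 990000009, PN 001).
filings = [
    Filing("A-2023", "110000001", "001", 2023, 90.0),
    Filing("A-2024", "110000001", "001", 2024, 100.0),
    Filing("B-2024", "220000002", "001", 2024, 50.0),
    Filing("T-2024", "990000009", "001", 2024, 120.0),
]
sch_d_rows = [
    {"ACK_ID": "A-2024", "DFE_EIN": "990000009", "DFE_PN": "001"},
    {"ACK_ID": "B-2024", "DFE_EIN": "990000009", "DFE_PN": "001"},
]

dfe_ids = flag_dfe_filings(filings, sch_d_rows)
total = homepage_total_assets(filings, sch_d_rows)
```

In this sample the trust filing T-2024 is flagged and left out, so the aggregate counts only the two participating plans for 2024.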

The Investment Company Institute estimates U.S. private retirement assets at approximately $15 trillion (year-end 2024). The Form 5500 universe is a superset because it also covers welfare benefit plans (health, life, severance).

§4 Plan Health Score

4.1 Scoring philosophy

The Plan Health Score is a 0–100 directional indicator derived entirely from the public Form 5500. Two principles govern its construction:

  • Verifiable signals only. Every penalty is tied to a specific, deterministically-read DOL field (e.g. ACCOUNTANT_FIRM_NAME, DATE_RECEIVED, DIRECT_COMP_AMT). The score does not penalize a plan for data we may have failed to extract.
  • Missing signal = neutral. When a pillar’s input is not measurable from the filing (for example, concentration when the Schedule of Assets has not been parsed), the pillar holds at a neutral 75 with an explicit “not scored” note. Missing data never deducts and never credits.

4.2 Grade bands

Score range | Grade | Interpretation
85–100 | A | No material flags identified
70–84 | B | Minor flags or single moderate flag
55–69 | C | Moderate flags across one or more pillars
40–54 | D | Multiple material flags
0–39 | F | Serious compliance, cost, or concentration flags
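
The band lookup can be sketched as a small Python function. The function name is hypothetical, and the treatment of fractional scores (84.5 falls in B because it is below the 85 floor) is an assumption, since the table lists whole-number ranges only.

```python
def grade_for_score(score: float) -> str:
    """Map a 0-100 Plan Health Score to its letter grade using the band floors."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Assumption: a band is selected by its lower bound, so 84.5 grades as B.
    for floor, grade in ((85, "A"), (70, "B"), (55, "C"), (40, "D")):
        if score >= floor:
            return grade
    return "F"
```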

4.3 Pillar weights

Pillar | Weight
Compliance & disclosures | 30%
Total plan cost | 25%
Concentration risk | 20%
Participation health | 15%
Vendor stack | 10%
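
One plausible way to combine the pillars is a weighted sum, sketched below. The combination rule itself is an assumption (the weights are documented, the exact formula is not), and the dictionary keys and function name are hypothetical. A pillar passed as None is treated as "not scored" and contributes the neutral 75 described in 4.1.

```python
from typing import Mapping, Optional

# Weights from the table above; the short keys are hypothetical labels.
PILLAR_WEIGHTS = {
    "compliance": 0.30,
    "cost": 0.25,
    "concentration": 0.20,
    "participation": 0.15,
    "vendor": 0.10,
}
NEUTRAL_SCORE = 75.0


def composite_score(pillars: Mapping[str, Optional[float]]) -> float:
    """Weighted sum of pillar scores; a missing or None pillar counts as neutral 75."""
    total = 0.0
    for name, weight in PILLAR_WEIGHTS.items():
        score = pillars.get(name)
        total += weight * (NEUTRAL_SCORE if score is None else score)
    return round(total, 1)


example = composite_score(
    {"compliance": 60, "cost": 100, "concentration": None, "participation": 80, "vendor": 100}
)
```

Under these assumptions the example works out to 18 + 25 + 15 + 12 + 10 = 80, which falls in the B band.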

4.4 Compliance & disclosures (pillar starts at 100)

Criterion | Adjustment | Note
TOT_ACTIVE_PARTCP_CNT ≥ 120 AND ACCOUNTANT_FIRM_NAME is null | −40 | A plan required to file with an Independent Qualified Public Accountant (IQPA) did not name one.
100 ≤ TOT_ACTIVE_PARTCP_CNT < 120 AND ACCOUNTANT_FIRM_NAME is null | −10 | Filer in the 80–120 audit safe-harbor zone with no IQPA named.
sch_c_part2 contains ≥1 row for the filing | −10 to −35 | Penalty = min(35, 10 + 5·count). Each row is a service provider that failed to disclose compensation.
DATE_RECEIVED > FORM_TAX_PRD + 10.5 months | up to −25 | Penalty scales at 4 points per month past the 10.5-month window. The Form 5558 extended deadline falls 9.5 months after plan year end; we allow a further 1-month grace.
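
The criteria above can be read as a simple deduction routine. The sketch below is illustrative only: the function and parameter names are hypothetical, a blank firm name is treated the same as null, lateness is passed in as fractional months past the 10.5-month window, penalties are assumed to stack, and the pillar is assumed to floor at 0. The Schedule C Part 2 formula is implemented exactly as written, under which a single row deducts 15.

```python
from typing import Optional


def compliance_pillar(
    active_participants: int,
    accountant_firm_name: Optional[str],
    part2_rows: int,
    months_past_window: float,
) -> float:
    """Start at 100 and subtract each compliance penalty that applies."""
    score = 100.0

    # Assumption: an empty or whitespace-only name counts as "no IQPA named".
    has_iqpa = bool(accountant_firm_name and accountant_firm_name.strip())
    if not has_iqpa:
        if active_participants >= 120:
            score -= 40
        elif active_participants >= 100:
            score -= 10

    # Providers listed on Schedule C Part 2: min(35, 10 + 5 * count).
    if part2_rows >= 1:
        score -= min(35, 10 + 5 * part2_rows)

    # Late filing: 4 points per month past the window, capped at 25.
    score -= min(25.0, 4.0 * max(0.0, months_past_window))

    # Assumption: the pillar cannot go below zero.
    return max(0.0, score)
```

For example, a 250-participant plan with no accountant named, two Part 2 rows, and a filing received 1.5 months past the window would score 100 − 40 − 20 − 6 = 34 on this pillar.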

4.5 Total plan cost (pillar starts at 100)

Cost ratio is computed as (Σ DIRECT_COMP_AMT + Σ flagged TOT_INDIRECT_COMP_AMT) / TOT_ASSETS_EOY_AMT expressed in basis points (1 bp = 0.01%). The ratio is compared to the peer median for the plan’s asset band.

Plan size (assets) | Sample n | Peer median (bps)
< $1M | 1,125 | 158
$1M – $10M | 16,359 | 52
$10M – $50M | ~17,000 | 29
$50M – $100M | ~3,000 | 16
$100M – $500M | ~6,000 | 10
$500M – $1B | ~1,000 | 6
$1B+ | 1,290 | 4.4

Criterion | Adjustment | Note
Filing fee bps > 1.5× peer median | −35 | Total Schedule C fees materially above peer band.
1.2× < Filing fee bps ≤ 1.5× peer median | −20 | Modestly above peer band.
Indirect fees ≥ $50,000 AND ≥ 40% of total fees | −12 | Heavily revenue-sharing/12b-1-funded structure.
No Schedule C disclosed (small plan or unbundled) | neutral (75) | Not scored: input not present.
Filing fee bps < 0.7× peer median | positive | Noted as positive.
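
A worked sketch of the cost pillar follows. It is not the production code: the function names are hypothetical, band boundaries are assumed to be lower-inclusive ($10M exactly falls in the $10M – $50M band), the indirect-fee penalty is assumed to stack with the peer-band penalty, a plan with no assets is treated as not scored, and the "positive" flag is assumed to leave the score unchanged.

```python
import math

# (upper bound of asset band, peer median in bps), taken from the table above.
PEER_MEDIAN_BPS = [
    (1_000_000, 158.0),
    (10_000_000, 52.0),
    (50_000_000, 29.0),
    (100_000_000, 16.0),
    (500_000_000, 10.0),
    (1_000_000_000, 6.0),
    (math.inf, 4.4),
]


def peer_median_bps(assets: float) -> float:
    """Look up the peer median for the plan's asset band."""
    for upper, median in PEER_MEDIAN_BPS:
        if assets < upper:
            return median
    raise ValueError("assets must be a finite number")


def cost_ratio_bps(direct: float, flagged_indirect: float, assets: float) -> float:
    """(direct + flagged indirect compensation) / end-of-year assets, in basis points."""
    return (direct + flagged_indirect) * 10_000 / assets


def cost_pillar(direct: float, flagged_indirect: float, assets: float,
                has_schedule_c: bool = True) -> float:
    # Not scored when there is no Schedule C (or, by assumption, no assets).
    if not has_schedule_c or assets <= 0:
        return 75.0
    ratio = cost_ratio_bps(direct, flagged_indirect, assets)
    median = peer_median_bps(assets)
    score = 100.0
    if ratio > 1.5 * median:
        score -= 35
    elif ratio > 1.2 * median:
        score -= 20
    total_fees = direct + flagged_indirect
    if flagged_indirect >= 50_000 and total_fees > 0 and flagged_indirect / total_fees >= 0.40:
        score -= 12
    return score
```

As an example, $40,000 of direct and $60,000 of flagged indirect compensation on $20M of assets is 50 bps against a 29 bps peer median. That is more than 1.5× the median (−35), and indirect fees are 60% of the total (−12), so the pillar scores 53 under these assumptions.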

4.6 Concentration risk (pillar starts at 100)

Requires parsed Schedule of Assets data. If no holdings are available, pillar holds at 75 with an explicit “not scored” note.

Criterion | Adjustment
Party-in-interest holdings ≥ 50% of plan | −35
Party-in-interest holdings ≥ 25% of plan | −20
Largest single holding ≥ 25% of plan | up to −25 (scaled by holding share)
Employer / sponsor stock ≥ 20% of plan | −25
Employer / sponsor stock ≥ 10% of plan | −12
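
The tiered criteria can be sketched as below, with several assumptions stated plainly: the holding field names (current_value, party_in_interest, employer_stock) are hypothetical, "of plan" is taken to mean the sum of parsed holdings, only the higher tier of each pair applies, and the pillar floors at 0. The largest-single-holding penalty is left out because its scaling rule is not specified here.

```python
from typing import Iterable, Mapping


def concentration_pillar(holdings: Iterable[Mapping[str, object]]) -> float:
    """Score party-in-interest and employer-stock concentration from parsed holdings."""
    rows = list(holdings)
    total = sum(float(h["current_value"]) for h in rows)
    # No parsed holdings (or no value): hold at neutral 75, "not scored".
    if not rows or total <= 0:
        return 75.0

    pii_share = sum(float(h["current_value"]) for h in rows if h.get("party_in_interest")) / total
    employer_share = sum(float(h["current_value"]) for h in rows if h.get("employer_stock")) / total

    score = 100.0
    # Assumption: tiers are exclusive, so only the higher matching tier deducts.
    if pii_share >= 0.50:
        score -= 35
    elif pii_share >= 0.25:
        score -= 20
    if employer_share >= 0.20:
        score -= 25
    elif employer_share >= 0.10:
        score -= 12
    return max(0.0, score)


sample_holdings = [
    {"current_value": 600_000, "party_in_interest": True},
    {"current_value": 150_000, "employer_stock": True},
    {"current_value": 250_000},
]
sample_score = concentration_pillar(sample_holdings)
```

In the sample, party-in-interest holdings are 60% of the total (−35) and employer stock is 15% (−12), giving 53.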

4.7 Participation health (pillar starts at 100)

Ratio of PARTCP_ACCOUNT_BAL_CNT (accounts with a balance) to TOT_ACTIVE_PARTCP_CNT, capped at 1.0 because terminated employees with residual balances would otherwise inflate the numerator.

Criterion | Adjustment
Ratio < 0.5 | −35
0.5 ≤ Ratio < 0.7 | −20
TOT_ACTIVE_PARTCP_CNT = 0 AND NET_ASSETS_EOY_AMT > 0 | −10
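
A minimal sketch of the capped ratio and its adjustments follows. The function name is hypothetical, and one case is an assumption: when the active count is zero the ratio is undefined, so only the third criterion is evaluated.

```python
def participation_pillar(accounts_with_balance: int, active_participants: int,
                         net_assets_eoy: float) -> float:
    """Score balance coverage: accounts with a balance / active participants, capped at 1.0."""
    score = 100.0
    if active_participants == 0:
        # Assumption: with no active participants the ratio tiers are skipped.
        if net_assets_eoy > 0:
            score -= 10
        return score

    # Cap at 1.0 so terminated employees with residual balances do not inflate coverage.
    ratio = min(1.0, accounts_with_balance / active_participants)
    if ratio < 0.5:
        score -= 35
    elif ratio < 0.7:
        score -= 20
    return score
```

For instance, 45 accounts with a balance against 100 active participants gives a ratio of 0.45 and a pillar score of 65, while 130 accounts against 100 actives is capped at 1.0 and scores 100.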

4.8 Vendor stack (pillar starts at 100)

Criterion | Adjustment
Form 5500 (not SF) AND TOT_ACTIVE_PARTCP_CNT ≥ 100 AND no recordkeeper disclosed on Schedule C | −20
≥8 disclosed Schedule C providers across ≥5 service categories | −12
TYPE_PENSION_BNFT_CODE contains code 1I (frozen plan, no new accrual) | −25

§5 What the score does not measure

  • Investment menu quality. Per-fund expense ratio, share class selection, and risk-adjusted performance are not on Form 5500 and are not evaluated.
  • Match generosity and eligibility design. Match formulas (e.g. 100% on first 3%, 50% on next 2%), vesting schedules, and eligibility waiting periods are not reported on Form 5500.
  • True deferral rates. Active-participant salary deferral percentages live in recordkeeper data, not the 5500. The participation pillar uses balance-coverage as a directional proxy only.
  • Fiduciary process. Whether the plan committee meets quarterly, maintains a written investment policy statement, or documents fee benchmarking is not observable from filings.
  • Master trust filings. A pooled DFE filing does not receive a Plan Health Score; the plan page shows a DFE-specific summary instead, because the score’s pillars (audit, participation, vendor stack) do not apply to a pooled investment vehicle.

§6 Lead report definitions

Each count on the Lead Reports page is the result of a deterministic SQL predicate against the loaded views. The predicates are:

Report | Predicate
Failed Schedule C disclosures | EXISTS (SELECT 1 FROM sch_c_part2 WHERE ACK_ID = plans.ACK_ID)
Large plans missing auditor | source_form = 'form_5500' AND TOT_ACTIVE_PARTCP_CNT ≥ 100 AND (ACCOUNTANT_FIRM_NAME IS NULL OR trim(ACCOUNTANT_FIRM_NAME) = '')
≥20% party-in-interest exposure | SUM(CURRENT_VALUE) FILTER (WHERE PARTY_IN_INTEREST) / SUM(CURRENT_VALUE) ≥ 0.20
>10% employer-stock exposure | Issuer regex matches (COMMON STOCK|EMPLOYER STOCK|SPONSOR STOCK|COMPANY STOCK) AND value ratio ≥ 0.10.
Total direct comp ≥ $1M | SUM(DIRECT_COMP_AMT) ≥ 1,000,000 across Schedule C providers.
8+ disclosed providers | COUNT(*) ≥ 8 on Schedule C providers.
Self-directed brokerage detected | Any holding row matches a brokerage-window regex on IDENTITY_OF_ISSUE.
Late filings | DATE_RECEIVED > date_of_extended_deadline (10.5 months past plan year end).
YoY: auditor / recordkeeper change, asset drop, first-time filer | Self-join of plans_canonical on (SPONSOR_EIN, PLAN_NUM) between the latest plan year and the prior plan year.
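
To show how two of these predicates behave, the sketch below runs them against a tiny in-memory database. It uses Python's built-in sqlite3 module in place of DuckDB purely so the example is self-contained; the table layouts, the PROVIDER_NAME column, and all sample rows are hypothetical, and ≥ is written as >= in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, trimmed-down stand-ins for the loaded views.
    CREATE TABLE plans (
        ACK_ID TEXT PRIMARY KEY,
        source_form TEXT,
        TOT_ACTIVE_PARTCP_CNT INTEGER,
        ACCOUNTANT_FIRM_NAME TEXT
    );
    CREATE TABLE sch_c_part2 (ACK_ID TEXT, PROVIDER_NAME TEXT);

    INSERT INTO plans VALUES
        ('A1', 'form_5500',    250, NULL),
        ('A2', 'form_5500',    300, '  '),
        ('A3', 'form_5500',    400, 'Example CPA LLP'),
        ('A4', 'form_5500_sf',  40, NULL);
    INSERT INTO sch_c_part2 VALUES ('A3', 'Example Advisors');
    """
)

# "Large plans missing auditor": NULL and blank names are both caught.
missing_auditor = [
    row[0]
    for row in conn.execute(
        """
        SELECT ACK_ID FROM plans
        WHERE source_form = 'form_5500'
          AND TOT_ACTIVE_PARTCP_CNT >= 100
          AND (ACCOUNTANT_FIRM_NAME IS NULL OR trim(ACCOUNTANT_FIRM_NAME) = '')
        ORDER BY ACK_ID
        """
    )
]

# "Failed Schedule C disclosures": any Part 2 row for the filing.
failed_disclosure = [
    row[0]
    for row in conn.execute(
        """
        SELECT ACK_ID FROM plans
        WHERE EXISTS (SELECT 1 FROM sch_c_part2 WHERE ACK_ID = plans.ACK_ID)
        ORDER BY ACK_ID
        """
    )
]
```

With this sample data, A1 and A2 land on the missing-auditor report (the 5500-SF filer A4 is excluded by form type), and A3 is the only filing with a Part 2 row.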