Methodology · Whitepaper v1.0

The Irreplaceable Score Methodology

A 4-dimension framework for career AI readiness. Anchored in primary research from Anthropic, WEF, Goldman Sachs, McKinsey, and Stanford HAI.

Version 1.0 · April 2026 · 30+ peer-reviewed + industry sources

Executive Summary

The Irreplaceable Score is a 0–100 composite index that quantifies how well-positioned an individual is to thrive as AI reshapes knowledge work. It is computed from four weighted dimensions, each scored 0–100 against a transparent rubric anchored in primary research from Anthropic, the World Economic Forum, Goldman Sachs, McKinsey, Stanford HAI, and peer-reviewed labor economics (Eloundou et al., 2023).

The four dimensions are: (1) Personal AI Fluency — the respondent's actual use of AI; (2) Company AI Maturity — their employer's organizational readiness; (3) Industry AI Disruption — sector-level exposure based on observed task-level data; and (4) Role Amplification — how much the respondent's specific job can be leveraged (vs. displaced) by AI.

Each dimension is decomposed into five sub-criteria with explicit weights, five level labels (Spectator → Native, or equivalent), and example behaviors. Composite scoring produces a score, a 95% confidence interval, and one of four readiness categories: Behind (<30), At Risk (30–50), Emerging (50–75), Leveraged (75+). The rubric is designed for repeatable LLM scoring, psychometric defensibility, and enterprise reporting.

The 4 Dimensions

Each dimension is scored 0–100 against a transparent rubric with sub-criteria, five level labels, and example behaviors. The four dimensions combine into a weighted composite described in Section 3.

Dimension 1 · Weight 30%

Personal AI Fluency

Personal AI Fluency measures the respondent's actual hands-on proficiency with AI tools in their own work. It is not self-reported confidence — it is inferred from behavior: what tools they use, how often, what they produce with them, and whether they collaborate, direct, or build. It is the only dimension fully under the individual's control, and therefore the lever the assessment emphasizes most.

Sub-criteria and weights

Sub-criterion | What it measures | Assessment input | Weight
1a. Tenure & Frequency | Length and cadence of AI use | “How long have you used AI tools?” + “How often?” | 15%
1b. Tool Diversity | Breadth of stack (chat, coding, image, agents, automation) | Tool checklist + verticals used | 20%
1c. Collaboration Mode | Where they sit on Anthropic's 5-mode spectrum | Task-description prompts → LLM classifies | 25%
1d. Builder Index | Do they create reusable AI artifacts (prompts, GPTs, agents, scripts)? | “Have you built…” behavior checklist | 20%
1e. Workflow Integration | Is AI embedded in daily work or sporadic? | % of weekly work hours + integration scenarios | 20%
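
The roll-up from sub-criteria to a dimension score can be sketched as a weighted average. This is a minimal illustration, assuming each sub-criterion has already been scored 0–100 and that aggregation is a plain weighted sum; the names `D1_WEIGHTS` and `dimension_score` are hypothetical and not part of the published rubric.

```python
from typing import Dict

# Sub-criterion weights for Dimension 1, copied from the table above.
D1_WEIGHTS: Dict[str, float] = {
    "tenure_frequency": 0.15,      # 1a
    "tool_diversity": 0.20,        # 1b
    "collaboration_mode": 0.25,    # 1c
    "builder_index": 0.20,         # 1d
    "workflow_integration": 0.20,  # 1e
}


def dimension_score(sub_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of 0-100 sub-criterion scores (assumed aggregation rule)."""
    if set(sub_scores) != set(weights):
        raise ValueError("sub_scores and weights must cover the same sub-criteria")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * sub_scores[name] for name in weights)


# Illustrative respondent: strong collaboration mode, weak builder index.
example = {
    "tenure_frequency": 60,
    "tool_diversity": 70,
    "collaboration_mode": 80,
    "builder_index": 40,
    "workflow_integration": 65,
}
d1 = dimension_score(example, D1_WEIGHTS)  # 9 + 14 + 20 + 8 + 13 = 64
```

The same function would apply to Dimensions 2–4 by passing their respective weight tables.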

Levels

Score | Level | Description | Example behaviors
0–30 | Spectator | Passive awareness; little to no hands-on use. | Tried ChatGPT once or twice; trivia/curiosity only; no work tasks; doesn't know what “prompt engineering” means.
30–55 | Dabbler | Occasional use, mostly single-shot prompts for content. | Uses ChatGPT/Claude 1–2×/week for emails or drafts; copies outputs without editing; one tool only; treats AI as autocomplete.
55–75 | Operator | Daily user with multiple tools and iterative workflows. | 2–3 AI tools (chat + coding/image); multi-turn conversations; validation/iteration loops; integrates AI into ≥3 work tasks.
75–90 | Builder | Composes AI into repeatable systems. | Writes reusable prompts/skills; chains tools; uses coding agents; has built an internal GPT or automation; mentors colleagues.
90–100 | Native | AI is the default operating layer for knowledge work. | Ships agentic workflows in production; evaluates models against benchmarks; spends >50% of work hours with AI in the loop.

Research anchoring

  • Anthropic Economic Index — 5-mode collaboration taxonomy (Directive / Feedback Loop / Task Iteration / Learning / Validation)
  • Anthropic Economic Index (March 2026) — tenure / learning-curve findings on compounding sophistication
  • Workera AI Skills Framework — 4-domain proficiency ladder
  • Stanford HAI AI Index 2025 — Functional / Critical / Ethical literacy pillars
  • WEF Future of Jobs 2025 — AI & big data literacy as a top-7 rising skill

Dimension 2 · Weight 20%

Company AI Maturity

Company AI Maturity measures how far the respondent's employer has moved along the AI-adoption S-curve. A highly fluent individual trapped in an “Experimenter” company will under-realize their leverage; a moderately fluent individual in a “Leader” company will be pulled forward by organizational momentum. This dimension captures that environmental lift (or drag).

Sub-criteria and weights

Sub-criterion | What it measures | Assessment input | Weight
2a. Deployment Breadth | How many teams use AI, not just IT | “Which teams at your company use AI?” | 25%
2b. Tooling & Access | Do employees have enterprise AI tools and models? | “Does your company pay for Copilot/Claude/ChatGPT Enterprise?” | 20%
2c. Workflow Redesign | Has AI changed how work gets done, not just layered on top? | Scenarios: meetings, reviews, hiring, coding, support | 25%
2d. Leadership Signal | Is AI a CEO/exec priority with measurable targets? | “Has leadership set AI goals/KPIs?” | 15%
2e. Governance & Safety | Policies, red-teaming, data handling; signals operational maturity | “Does your company have an AI policy? Approved-tools list?” | 15%

Levels

Score | Level | Description | Example behaviors
0–30 | AI-Dark (Pre-Experimenter) | No official tools; AI use is shadow IT or banned. | “We're not allowed to use ChatGPT”; no license; no policy; leadership silent.
30–55 | Experimenter | Pilots in 1–2 teams (usually engineering); no scale. | Engineering has Copilot; marketing sneaks ChatGPT; no enterprise contract; no KPIs.
55–75 | Practitioner | Enterprise licenses, multi-team usage, early workflow changes. | Claude/Copilot deployed org-wide; policies in place; some redesigned workflows; CEO mentions AI in all-hands.
75–90 | Scaler | AI is central to 2+ P&L lines with measured impact. | AI features shipped in product; customer-facing agents live; 20%+ productivity targets tied to AI; internal AI platform team.
90–100 | Leader | AI-native: org structure and economics reshaped by AI. | AI-native org design; AI-first products; EBIT impact attributed; referenced in investor letters; hiring “AI-native” as a criterion.

Research anchoring

  • McKinsey — Superagency in the Workplace (Jan 2025): 3× leader/employee gap; ~1% of companies are “mature”
  • McKinsey — The State of AI 2024/2025: 65% of orgs regularly use GenAI
  • McKinsey 4-archetype model: Experimenters, Practitioners, Scalers, Leaders
  • Gartner AI Maturity Index — 5-stage model
  • Anthropic Economic Index (Sept 2025) — enterprise API usage is 77% automation-dominant

Dimension 3 · Weight 25%

Industry AI Disruption

Industry AI Disruption measures sector-level exposure: how much of the typical work done in the respondent's industry is at risk of automation or augmentation over the next 24–36 months. This is where high individual fluency can't fully compensate — if the whole sector reprices labor, earnings scarring is the base rate. Note: in the composite formula this dimension is flipped (100 − D3) so that a more-disrupted industry lowers the Irreplaceable Score.

Sub-criteria and weights

Sub-criterion | What it measures | Assessment input | Weight
3a. Sector Task Exposure | % of typical tasks in sector that AI can do today | Industry → lookup in the occupation/industry exposure table | 35%
3b. Labor Market Signal | Hiring freezes, layoffs, reorgs citing AI | Recent news index + BLS projections | 25%
3c. Pricing Power Shift | Is AI compressing margins / unit economics? | Sector pricing trend proxies | 15%
3d. Incumbent vs. AI-Native Competition | Are AI-native challengers winning share? | Market-structure cue (e.g., Harvey vs. legacy law; Cursor vs. legacy IDE) | 15%
3e. Regulatory Moat | Does regulation protect human labor (healthcare, legal licensure)? | Regulated-profession flag (inverse) | 10%

Levels

Score | Level | Description | Example behaviors
0–30 | Insulated | Sector largely physical / regulated / localized. | Skilled trades, hands-on healthcare, social work; AI assists admin but not core work.
30–55 | Partial Exposure | Some tasks automatable; core human judgment still load-bearing. | Management, sales, mid-level healthcare; AI compresses admin but not the relationship.
55–75 | High Exposure | Majority of tasks AI-addressable; repricing pressure already visible. | Finance analysis, legal drafting, education content; hiring freezes and role re-scoping evident.
75–90 | Disrupted | Sector in active restructuring; headcount flat or down despite revenue growth. | Customer support, copywriting, graphic design, first-line coding.
90–100 | Existential | AI is the product; human labor per unit of output is collapsing fast. | Translation, basic content mills, SEO content, first-draft code, template design.

Research anchoring

  • Anthropic Economic Index — observed task-level usage by SOC code (gold standard for measured exposure)
  • Eloundou, Manning, Mishkin, Rock (2023) — GPTs are GPTs: O*NET 19,265-task exposure rubric (arXiv:2303.10130)
  • Goldman Sachs — generative AI could expose ~300M FTE jobs globally (2023, updated 2025)
  • WEF Future of Jobs 2025 — 22% of jobs disrupted by 2030; industry-specific cuts
  • BLS Occupational Outlook Handbook (2024–2034 projections)

Occupation / industry exposure reference (Dimension 3)

Derived from Anthropic's observed SOC-level usage (Claude conversations) and Eloundou et al. (2023) task-level exposure, cross-checked against Goldman Sachs industry cuts. Exposure is converted to a 0–100 Disruption Score (higher = more disruption / repricing risk).

Occupation / Industry cluster | SOC major group | Anthropic observed use | Eloundou α>0.5 task share | Disruption Score
Software / IT services | 15-0000 | 37.2% | ~70% | 85–95
Marketing, PR, content, creative writing | 27-0000 / 11-2000 | 10.3% | ~65% | 75–90
Education, instruction, tutoring | 25-0000 | 12.4% | ~55% | 65–80
Finance, banking, insurance analysis | 13-2000 | ~8% | ~60% | 70–85
Legal, paralegal, compliance | 23-0000 | ~5% | ~63% | 70–85
Customer support, call centers | 43-4000 | ~6% | ~55% | 75–90
Management, operations | 11-0000 | 3–5% | ~40% | 45–60
Sales | 41-0000 | ~3% | ~35% | 40–55
Healthcare practitioners (dx/admin) | 29-0000 | <2% | ~30% | 35–55
Healthcare support / aides | 31-0000 | <1% | ~10% | 15–30
Skilled trades, installation, repair | 49-0000 | <1% | <10% | 10–25
Transportation (drivers) | 53-0000 | <1% | ~15% | 20–35
Personal care, cleaning, food prep | 35-0000 / 37-0000 | <0.5% | <8% | 5–20
Construction, farming, fishing, forestry | 47-0000 / 45-0000 | 0.1–0.3% | <5% | 5–15
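
The lookup for sub-criterion 3a could be sketched as a simple table keyed by cluster. This is illustrative only: it includes four rows from the table above, and taking the midpoint of the published range is an assumption, since the methodology does not say how a single value is picked from a range. `DISRUPTION_RANGES` and `sector_task_exposure` are hypothetical names.

```python
from typing import Dict, Tuple

# (low, high) Disruption Score ranges for a few clusters, copied from the table above.
DISRUPTION_RANGES: Dict[str, Tuple[int, int]] = {
    "Software / IT services": (85, 95),
    "Legal, paralegal, compliance": (70, 85),
    "Sales": (40, 55),
    "Skilled trades, installation, repair": (10, 25),
}


def sector_task_exposure(cluster: str) -> float:
    """Return the midpoint of the cluster's range (assumed convention, not from the source)."""
    if cluster not in DISRUPTION_RANGES:
        raise KeyError(f"no exposure entry for cluster {cluster!r}")
    low, high = DISRUPTION_RANGES[cluster]
    return (low + high) / 2


software = sector_task_exposure("Software / IT services")  # (85 + 95) / 2 = 90.0
```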

Dimension 4 · Weight 25%

Role Amplification

Where Industry Disruption asks “is the sector being repriced?”, Role Amplification asks “within this sector, does this specific role get amplified (leveraged) or compressed (automated away) by AI?” A junior lawyer doing doc review sits in the same industry as a senior litigator, but the litigator's role is being amplified by AI while the junior's is being absorbed by it. This dimension captures that split.

Sub-criteria and weights

Sub-criterion | What it measures | Assessment input | Weight
4a. Amplification Ratio | Productivity lift from AI on this role's core tasks (e.g., SWE +55%) | Role-to-lift lookup from research | 30%
4b. Judgment Density | Share of role that's high-context decision-making vs. executable tasks | Task decomposition from job description | 25%
4c. Human-Capital Leverage | Does seniority/network/trust compound in this role? | Seniority + relationship-facing flags | 20%
4d. Creative/Strategic Mix | WEF rising skills (creative thinking, leadership, complex problem solving) as % of role | Scenario-based self-report | 15%
4e. AI-Complement vs. AI-Substitute | Does AI make this person more hireable (complement) or less (substitute)? | Derived from 4a–4d | 10%

Levels

Score | Level | Description | Example behaviors
0–30 | Compressed | Role is largely executable tasks AI already does well. | Junior copywriter, template designer, first-line support, basic data entry, doc review paralegal.
30–55 | Shrinking | Role still needed but team sizes being cut as AI takes the repeatable core. | Mid-level analyst, junior coder, content marketer, entry-level recruiter screening.
55–75 | Stable | Role changes substantially but headcount holds; humans curate AI output. | Experienced consultants, account managers, teachers, mid-career product managers.
75–90 | Amplified | AI makes this person 2–5× more productive; demand rising. | Senior engineers using agents, principal designers, investigative journalists, senior sales, founders, senior clinicians with AI dx assist.
90–100 | Leveraged | Role is an AI-leverage point: one person now does what a team used to. | Founder-engineers shipping multi-agent products; solo operators running AI-native businesses; chief-of-staff humans orchestrating agents.

Research anchoring

  • Anthropic Economic Index — automation vs. augmentation breakdown per SOC code
  • WEF Future of Jobs 2025 — top growing roles (tech, care, education, green)
  • McKinsey Superagency — productivity benchmarks by role (coding +55%, support +14%, writing +40%)
  • Eloundou et al. 2023 — β-exposure: with LLM tooling, 47–56% of tasks accelerated
  • BLS Occupational Outlook projections

The Composite Score

The four dimensions are not weighted equally. Weights reflect (a) the lever the individual controls, and (b) the prognostic power of each dimension in published research.

Dimension | Weight | Rationale
D1: Personal AI Fluency | 30% | Highest individual agency; the actionable lever. Strongest research link to short-term career outcomes (Anthropic tenure data, McKinsey productivity).
D2: Company AI Maturity | 20% | Environmental lift; matters, but can be changed by switching jobs. Lower weight to avoid penalizing great individuals stuck in laggard firms.
D3: Industry AI Disruption | 25% | Strong structural determinant of earnings trajectory per Goldman and Eloundou. Enters the composite as a risk term (flipped in the formula).
D4: Role Amplification | 25% | Complements D3: the same industry can have amplified and compressed roles. Anthropic's within-occupation variance justifies weighting it equally to industry.

Formula

S_raw = 0.30·D1 + 0.20·D2 + 0.25·D3_flipped + 0.25·D4
D3_flipped = 100 − D3
Irreplaceable Score = round(S_raw), clipped to [0, 100]

Industry Disruption (D3) is a risk score — a more-disrupted industry should lower the Irreplaceable Score unless personal fluency or role amplification compensate. Flipping it (100 − D3) makes the composite point in the correct direction.
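
As a worked illustration of the formula, the following sketch computes the composite for one hypothetical respondent. The function name `irreplaceable_score` and the example inputs are assumptions made for illustration; the weights and the flip come from the formula above. Note that Python's built-in `round` uses round-half-to-even, and the methodology does not state which rounding rule it intends.

```python
def irreplaceable_score(d1: float, d2: float, d3: float, d4: float) -> int:
    """Composite score from four 0-100 dimension scores, per the formula above."""
    for name, value in (("d1", d1), ("d2", d2), ("d3", d3), ("d4", d4)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be in [0, 100], got {value}")
    d3_flipped = 100 - d3  # higher disruption lowers the composite
    s_raw = 0.30 * d1 + 0.20 * d2 + 0.25 * d3_flipped + 0.25 * d4
    return int(min(100, max(0, round(s_raw))))


# Operator-level individual, Practitioner company, Disrupted sector, Stable role:
# 0.30*70 + 0.20*60 + 0.25*(100 - 80) + 0.25*65 = 21 + 12 + 5 + 16.25 = 54.25
score = irreplaceable_score(d1=70, d2=60, d3=80, d4=65)
```

Here a high-exposure industry (D3 = 80) contributes only 5 points after flipping, which is what pulls an otherwise solid profile down to 54.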

Confidence interval

Each dimension is estimated from a small number of inputs with known variance. We treat the final score as a weighted sum of four independent estimates:

SE_i = (range of level band) / 4 # ≈ ±6 for a 25-point band
SE_total = sqrt( Σ (w_i · SE_i)² )
CI_95 = Irreplaceable Score ± 1.96 · SE_total

In practice this yields a ±5 to ±9 confidence band for most respondents. The band widens when dimension scores sit on level boundaries, consistency checks flag ambiguous inputs, or free-text responses are short / low-signal.
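
The interval arithmetic can be sketched as follows. This assumes the dimension weights from the composite formula and takes the width of each dimension's level band as input; the function names are hypothetical, and the additional widening for boundary scores or low-signal text is not modeled because the methodology does not quantify it.

```python
import math
from typing import Sequence, Tuple

# Dimension weights from the composite formula (D1, D2, D3, D4).
WEIGHTS = (0.30, 0.20, 0.25, 0.25)


def ci_half_width(band_widths: Sequence[float]) -> float:
    """1.96 * SE_total, where SE_i = band width / 4 for each dimension."""
    if len(band_widths) != len(WEIGHTS):
        raise ValueError("expected one band width per dimension")
    se_total = math.sqrt(sum((w * (width / 4)) ** 2
                             for w, width in zip(WEIGHTS, band_widths)))
    return 1.96 * se_total


def ci_95(score: float, band_widths: Sequence[float]) -> Tuple[float, float]:
    """Lower and upper bounds of the 95% interval around the composite score."""
    half = ci_half_width(band_widths)
    return score - half, score + half


# All four dimensions sitting in 25-point bands: SE_i = 6.25,
# SE_total = 6.25 * sqrt(0.255), so the half-width is roughly 6.2 points.
half = ci_half_width([25, 25, 25, 25])
low, high = ci_95(54, [25, 25, 25, 25])
```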

Peer benchmarking

Percentile is computed against a rolling cohort of prior respondents, segmented by (industry, role seniority, region). We report the segment-level percentile only when ≥50 peers are present in that three-way segment; otherwise we fall back to industry-only, then global. The approach is inspired by Item Response Theory scoring practice: rather than z-scoring raw totals, we percentile-rank within calibrated subgroups so comparisons stay fair across cohorts.
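
The fallback order can be sketched like this. Two points are assumptions rather than stated rules: the same 50-peer minimum is applied to the industry-only fallback, and percentile is defined as the share of peers scoring strictly below the respondent. The `Respondent` type and `peer_percentile` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

MIN_PEERS = 50  # minimum segment size stated in the methodology


@dataclass(frozen=True)
class Respondent:
    score: float
    industry: str
    seniority: str
    region: str


def peer_percentile(me: Respondent, cohort: List[Respondent]) -> Tuple[float, str]:
    """Return (percentile, segment used), trying the narrowest segment first."""
    segments = [
        ("industry+seniority+region",
         lambda p: (p.industry, p.seniority, p.region)
         == (me.industry, me.seniority, me.region)),
        ("industry", lambda p: p.industry == me.industry),
        ("global", lambda p: True),
    ]
    for name, matches in segments:
        peers = [p.score for p in cohort if matches(p)]
        if len(peers) >= MIN_PEERS or (name == "global" and peers):
            below = sum(1 for s in peers if s < me.score)
            return 100.0 * below / len(peers), name
    raise ValueError("cohort is empty; no percentile can be reported")


# 60 legal respondents, only 10 of whom share the respondent's seniority,
# so the three-way segment is too small and the industry segment is used.
cohort = [Respondent(float(i), "legal", "senior" if i < 10 else "junior", "EU")
          for i in range(60)]
pct, segment = peer_percentile(Respondent(30.0, "legal", "senior", "EU"), cohort)
```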

Readiness Categories

The Irreplaceable Score maps onto four readiness bands, each with a characteristic behavioral profile and a research-anchored 12-month outlook. Thresholds are soft: scores of 49 and 51 should be treated similarly, and result surfaces should show the distance to the next band to encourage action.

Band | Label | Population share | Behavioral profile | 12-month outlook (research-anchored)
<30 | Behind | ~25–30% of knowledge workers | Little or no hands-on AI use; employer is AI-dark; role is in a compressed/shrinking band in a disrupted sector. | Highest earnings-scarring risk per Goldman 2023 and WEF (11% of workers unlikely to get reskilling). Anthropic enterprise hiring data shows a roughly 14% lower relative job-finding rate for highly exposed roles without AI fluency. Priority: upskill immediately.
30–50 | At Risk | ~30–35% | Some exposure, single-tool use; company is an Experimenter; role is partially exposed. | Likely to feel “AI anxiety” without measurable productivity gain. Risk of lateral churn as roles re-scope. Priority: tool diversification + workflow integration.
50–75 | Emerging | ~25–30% | Daily operator across multiple tools; company is a Practitioner; role is stable or amplified. | Positioned to capture WEF “rising skill” premium. McKinsey reports 40–55% productivity gains for this cohort. Priority: move from Operator → Builder.
75+ | Leveraged | ~10–15% | Builder or Native; company is a Scaler/Leader or the respondent is a founder; role is Amplified/Leveraged. | Compounding advantage. Anthropic directive-mode data shows this cohort ships 2–5× more output per week. Outcomes: promotion, equity upside, startup optionality. Priority: compound leverage (ship AI-native work publicly, mentor, recruit).
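
Band assignment and the "distance to the next band" value can be sketched as below. Because the published ranges share their endpoints (30, 50, 75), this sketch assumes the lower threshold is inclusive, so a score of exactly 50 counts as Emerging; that convention and the name `readiness` are assumptions.

```python
from typing import Optional, Tuple

# (lower threshold, label) pairs from the table above, in ascending order.
BANDS = [(0, "Behind"), (30, "At Risk"), (50, "Emerging"), (75, "Leveraged")]


def readiness(score: float) -> Tuple[str, Optional[float]]:
    """Return (band label, points needed to reach the next band or None at the top)."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be in [0, 100], got {score}")
    label, points_to_next = BANDS[0][1], None
    for i, (threshold, name) in enumerate(BANDS):
        if score >= threshold:
            label = name
            has_next = i + 1 < len(BANDS)
            points_to_next = BANDS[i + 1][0] - score if has_next else None
    return label, points_to_next


near_miss = readiness(49)   # At Risk, 1 point short of Emerging
just_over = readiness(51)   # Emerging, 24 points short of Leveraged
```

Surfacing the second value alongside the label is one way to treat 49 and 51 similarly, since both read as "close to the At Risk / Emerging boundary".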

Scoring Guardrails

The rubric is transparent, which means respondents can try to game it. We apply three classes of guardrails.

Gamability detection

  • Fluency vs. output mismatch: D1 inputs claim “Native” use but free-text descriptions show generic AI vocabulary → cap D1 at Operator ceiling (75).
  • Tool-list bloat: claiming 10+ tools used weekly without describing one concrete workflow → cap Tool Diversity (1b) at 60.
  • Builder claims without evidence: “I build agents” with no artifact URL / repo / internal link → cap Builder Index (1d) at 60.
  • Company-maturity inflation: Scaler/Leader claim but role inputs don't reflect AI integration → discount D2 by 15%.
  • Keyword stuffing: repetition of buzzwords (“agentic, LLM, RAG, fine-tuned”) without concrete nouns → consistency flag + widen CI.
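
Once a detector has raised a flag, the numeric adjustments above are straightforward to apply. The sketch below covers only the four rules that carry a number and takes the flags as given; how each flag is detected is out of scope. The flag names, the dictionary keys, and the reading of "discount D2 by 15%" as multiplying by 0.85 are all assumptions.

```python
from typing import Dict, Set


def apply_gaming_caps(scores: Dict[str, float], flags: Set[str]) -> Dict[str, float]:
    """Return a copy of scores with the caps and discounts for raised flags applied."""
    adjusted = dict(scores)
    if "fluency_output_mismatch" in flags:
        adjusted["d1"] = min(adjusted["d1"], 75)  # Operator ceiling
    if "tool_list_bloat" in flags:
        adjusted["tool_diversity"] = min(adjusted["tool_diversity"], 60)  # 1b cap
    if "builder_without_evidence" in flags:
        adjusted["builder_index"] = min(adjusted["builder_index"], 60)  # 1d cap
    if "company_maturity_inflation" in flags:
        adjusted["d2"] = adjusted["d2"] * 0.85  # assumed: 15% relative discount
    return adjusted


claimed = {"d1": 92, "d2": 80, "tool_diversity": 85, "builder_index": 90}
checked = apply_gaming_caps(claimed, {"fluency_output_mismatch",
                                      "company_maturity_inflation"})
```

In a fuller pipeline the sub-criterion caps (1b, 1d) would presumably be applied before D1 is aggregated; the sketch keeps them side by side for brevity.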

Consistency checks

  • D1 internal: Tenure × frequency × tools should tell one story. A 3-month user claiming 10 tools and Builder maturity is inconsistent — widen CI and discount the outlier sub-criterion by 20%.
  • D1 × D2: Builder/Native individual at an AI-Dark company → flagged as “misaligned environment” (often a future job-switcher signal; noted, not penalized).
  • D3 × D4: A role Leveraged in an Existential sector is rare (solo AI-native operators). Requires concrete evidence in free text; otherwise D4 is capped at Amplified (90).
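
The two cross-dimension checks can be sketched with the level boundaries from the dimension tables (Builder starts at 75, AI-Dark ends at 30, Existential and Leveraged start at 90). The function name, the flag strings, and the boolean `has_leverage_evidence` input are hypothetical; the D1-internal check is omitted because it depends on raw sub-criterion inputs.

```python
from typing import List, Tuple


def cross_dimension_checks(d1: float, d2: float, d3: float, d4: float,
                           has_leverage_evidence: bool) -> Tuple[float, List[str]]:
    """Return (possibly capped D4, list of consistency flags)."""
    flags: List[str] = []
    # D1 x D2: Builder/Native individual at an AI-Dark company. Noted, not penalized.
    if d1 >= 75 and d2 < 30:
        flags.append("misaligned environment")
    # D3 x D4: Leveraged role in an Existential sector needs concrete evidence.
    if d3 >= 90 and d4 > 90 and not has_leverage_evidence:
        d4 = 90  # cap at the top of the Amplified band
        flags.append("d4 capped at Amplified")
    return d4, flags


d4_checked, raised = cross_dimension_checks(d1=80, d2=20, d3=95, d4=97,
                                            has_leverage_evidence=False)
```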

Floor & ceiling logic

  • Floor of 20: every completed assessment scores ≥20. A raw 0 doesn't reflect reality — even AI-dark workers have some career optionality.
  • Ceiling of 98 (v1): 99–100 is reserved for validated AI-native operators (shipped AI products, public AI work with traction). This keeps the top band credible and prevents self-scored perfection.
  • Hard cap: D3 ≥ 75 and D4 ≤ 30 → Irreplaceable Score is capped at 45 regardless of D1. Industry-role fit dominates when the role is being actively eliminated.
  • Hard floor lift: D1 ≥ 85 → minimum Irreplaceable Score of 50. A Native/Builder individual can always move, even from a bad sector/role.
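
One way to sequence these four rules is sketched below. The methodology does not say what happens when the 45-point cap and the 50-point lift both apply; this sketch assumes the cap wins, based on the phrase "regardless of D1", and that ordering is an assumption. The function name and the `validated_native` flag are hypothetical.

```python
def apply_floor_and_ceiling(score: float, d1: float, d3: float, d4: float,
                            validated_native: bool = False) -> float:
    """Apply the lift, cap, floor, and ceiling rules to a composite score."""
    adjusted = score
    if d1 >= 85:
        adjusted = max(adjusted, 50)   # lift for Native/Builder individuals
    if d3 >= 75 and d4 <= 30:
        adjusted = min(adjusted, 45)   # cap applied after the lift (assumed precedence)
    adjusted = max(adjusted, 20)       # global floor for completed assessments
    if not validated_native:
        adjusted = min(adjusted, 98)   # 99-100 reserved for validated operators
    return adjusted


lifted = apply_floor_and_ceiling(40, d1=90, d3=40, d4=60)   # raised to 50
capped = apply_floor_and_ceiling(60, d1=90, d3=80, d4=25)   # held at 45
```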

Sources & Citations

Document version: v1.0 · Last updated 2026-04-18 · Maintainer: Human in Residence
