I suspect we have this in common: we are flooded with information, from scientific studies to social media claims, news reports to expert opinions. So I've developed a personal approach to evaluating what to believe (and how much to believe it). I wanted to share this framework with you as a window into how I distinguish between what I consider well-established facts, plausible theories, and dubious claims.
I call it my Evidence-Based Truth Claim Scale (ETCS). It helps me structure my thinking about claims based on the strength of supporting evidence rather than relying solely on intuition, authority, or personal preference.
Why I Developed This Approach
I find myself making dozens of judgments daily about what's true, often unconsciously. Without a systematic approach, I risk falling into several traps:
Thinking in binary terms (treating claims as either completely true or completely false)
Relying too heavily on authority ("Trust me, I'm an expert")
Confusing moral values with factual claims
Failing to update beliefs when new evidence emerges
The ETCS helps me address these challenges by creating a nuanced framework for evaluating the strength of evidence supporting any claim.
How It Works
The scale ranges from 0-100, with higher numbers representing stronger evidence.
0-10: What I Consider Debunked/Baseless
Claims that score 0-10 aren't merely unproven—I believe they've been actively disproved or lack any credible support. These claims typically:
Are contradicted by overwhelming evidence
Have no credible supporting studies
Are rejected by relevant experts
Often rely purely on anecdotes or thoroughly superseded theories
Example: "Phlogiston explains combustion." (This pre-modern theory was completely replaced by our understanding of oxidation.)
10-40: What I See as Highly Uncertain/Weak Evidence
Claims I place in this range have some supporting evidence, but I find it limited, flawed, or only applicable in very specific contexts. These claims often:
Have limited studies with significant methodological flaws
Only seem plausible within niche contexts or speculative frameworks
Face significant skepticism from the mainstream scientific community
Example: "Homeopathy cures cancer." (Multiple systematic reviews have found no evidence supporting homeopathy beyond placebo effects for treating cancer.)[1][2][3]
40-60: Equivocal/Unresolved (Tentative Balance of Evidence)
The middle of the scale, centered on 50, represents a genuine state of uncertainty, where the evidence appears truly mixed. Claims I score in this zone:
Have conflicting but methodologically sound peer-reviewed studies with comparable quality
May have mechanistic plausibility but lack direct causal proof in humans
Remain actively debated with no clear consensus among experts or established professional guidelines
Example: "Vitamin D supplements prevent depression in non-deficient adults." (Some observational studies suggest associations,[4] but randomized controlled trials often show mixed or null results, and while the mechanism is biologically plausible, it remains unproven.)[5][6]
60-90: What I Consider Likely/Strong Evidence
Claims I score in this range have substantial evidence supporting them, though I recognize some questions may remain. These claims typically:
Enjoy broad consensus among relevant experts, with dissent primarily focused on specifics rather than core principles
Are supported by multiple high-quality studies using diverse methodologies
Have survived deliberate falsification attempts
Rest on well-established mechanisms, though some details may still be debated
May have minor unresolved questions or context-dependent exceptions
Granularity in this range:
60-70: Strong evidence with some mechanistic gaps or ongoing debates about magnitude/time frames
70-80: Very strong evidence replicated across multiple scientific domains
80-90: Near-certainty, with remaining questions primarily about edge cases or extreme precision
Example: "CO2 emissions drive climate change." (Remaining debates center on climate sensitivity, feedback mechanisms, and regional effects.)
90-100: What I Accept as Factual/Unassailable
At the highest end of my scale are claims that I personally consider essentially factual. These claims typically:
Have been universally validated by evidence
Are foundational to scientific understanding
Face no credible dissent in the peer-reviewed literature
Often represent basic physical realities
Example: "Humans require water to survive." (This biological necessity is established beyond any reasonable doubt.)
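For readers who prefer things operational, the bands above can be expressed as a small lookup. This is only a sketch of my own: the scale leaves the region around 50 loosely defined, so treating 40-60 as the equivocal band and deciding which band owns a shared edge (e.g. a score of exactly 10 or 90) are illustrative choices, not part of the framework itself.

```python
# A minimal sketch of the ETCS bands as a lookup.
# Band edges and endpoint handling are illustrative choices;
# the scale itself only loosely defines the region around 50.

ETCS_BANDS = [
    (0, 10, "Debunked/Baseless"),
    (10, 40, "Highly Uncertain/Weak Evidence"),
    (40, 60, "Equivocal/Unresolved"),
    (60, 90, "Likely/Strong Evidence"),
    (90, 100, "Factual/Unassailable"),
]

def etcs_band(score: float) -> str:
    """Map a 0-100 evidence score to its ETCS band label.

    Lower edges are inclusive; upper edges belong to the next
    band, except that a score of 100 falls in the top band.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for low, high, label in ETCS_BANDS:
        if low <= score < high:
            return label
    return ETCS_BANDS[-1][2]  # score == 100
```

With these choices, `etcs_band(50)` returns "Equivocal/Unresolved" and `etcs_band(85)` returns "Likely/Strong Evidence".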
How I Apply This Framework with AI Systems
This scale has become particularly valuable in my interactions with AI systems. When working with AI models, I now:
Include the scale in my prompts: I explicitly reference this framework when asking for information, using language like: "Using the Evidence-Based Truth Claim Scale (where 0-10 = debunked, 10-40 = weak evidence, 50 = equivocal, 60-90 = strong evidence, 90-100 = factual), how would you score the following claim..."
Request evidence ratings: When asking the AI to generate information on complex topics, I specifically request that it rate its own claims on my scale, which forces more nuanced answers than simple assertions.
Evaluate model outputs: I apply this framework to assess claims made by AI systems, helping me distinguish between what the model presents as well-established versus speculative.
Calibrate confidence levels: I use the scale to request that AI outputs include explicit confidence levels (e.g., "I'd rate this claim as ~75 on your scale because...").
Structure disagreements: When I disagree with an AI's assessment, I can use the scale to pinpoint exactly where our evaluations of evidence differ rather than arguing in circles.
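The first two habits above can be captured in a small helper. This is a hypothetical sketch: `build_etcs_prompt` and the preamble wording are my own, and the function only prepares the text; the actual API call depends on which model you use, so it is left out.

```python
# Hypothetical helper that prepends the ETCS framing to a claim
# before it is sent to an AI model. Only prompt construction is
# shown; sending it to a particular model API is out of scope.

SCALE_PREAMBLE = (
    "Using the Evidence-Based Truth Claim Scale "
    "(0-10 = debunked, 10-40 = weak evidence, 50 = equivocal, "
    "60-90 = strong evidence, 90-100 = factual), score the "
    "following claim and explain the evidence behind your score:\n\n"
)

def build_etcs_prompt(claim: str) -> str:
    """Wrap a claim in the ETCS scoring instructions."""
    return SCALE_PREAMBLE + claim.strip()

prompt = build_etcs_prompt(
    "Vitamin D supplements prevent depression in non-deficient adults."
)
```

The same preamble can be reused verbatim across conversations, which keeps the model's self-ratings comparable from one session to the next.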
This approach has transformed my AI interactions from potentially misleading "black box" exchanges into structured conversations where I can better assess the reliability of the information I receive.
Applying This Framework Generally
This is a personal tool that helps me navigate information. Here's how I try to use it:
Thinking Dynamically: I'm ready to adjust my scoring as new evidence emerges. What I rate as a "50" today might become a "70" after conclusive randomized controlled trials are published.
Considering Context: I recognize that different domains have different standards of evidence. What constitutes "strong evidence" in particle physics might look different from "strong evidence" in psychology.
Watching the Thresholds: For claims I score between 40 and 60, I try to maintain extra intellectual humility. Similarly, claims I score 85-95 represent near-certainty in my mind, but I remain open to refinement.
Separating Facts from Values: My scale measures evidence strength, not moral or ethical worth. A claim can score highly on evidence while still representing something I find objectionable (or vice versa).
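The "Thinking Dynamically" point can be made concrete by keeping a dated log of scores, so that revisions are explicit rather than silent overwrites. This is a toy sketch of my own; the class name and the example scores below are invented purely for illustration.

```python
# Toy sketch: keep a dated history of scores for a claim so that
# belief updates are recorded, not overwritten. The class name and
# the example data are invented for illustration.

from dataclasses import dataclass, field

@dataclass
class ClaimRecord:
    claim: str
    # Each entry: (date, score, reason for the revision)
    history: list = field(default_factory=list)

    def rescore(self, date: str, score: float, reason: str) -> None:
        """Append a new score instead of replacing the old one."""
        self.history.append((date, score, reason))

    @property
    def current_score(self):
        return self.history[-1][1] if self.history else None

record = ClaimRecord(
    "Vitamin D supplements prevent depression in non-deficient adults."
)
record.rescore("2023-01", 50, "Mixed observational and RCT evidence")
record.rescore("2025-06", 70, "Hypothetical: conclusive RCTs published")
```

Reading back the history shows not just the current score but why it moved, which is exactly the discipline the "50 today might become a 70" example calls for.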
Moving Beyond Binary Thinking
The most valuable aspect of this scale for me is how it moves my thinking beyond simplistic true/false dichotomies. By acknowledging degrees of certainty, I can:
Hold provisional beliefs appropriate to the evidence
Update my views proportionally as new information emerges
Communicate more precisely about what I know and don't know
Identify where further research would be most valuable
I'm sharing this not as a prescription for how everyone should think, but as an explanation of how I personally try to evaluate claims in an increasingly complex information landscape. I find it helps me maintain intellectual honesty while navigating a world of competing truth claims—especially when interacting with AI systems that might otherwise obscure their confidence levels.
I'm curious—do you have your own system for evaluating information? Does any aspect of this approach resonate with how you think about evidence and truth?
References
[1] Milazzo, S., Russell, N., & Ernst, E. (2006). Efficacy of homeopathic therapy in cancer treatment. European Journal of Cancer, 42(3), 282–289. https://doi.org/10.1016/j.ejca.2005.09.025
[2] Wagenknecht, A., Dörfler, J., Freuding, M., Josfeld, L., & Huebner, J. (2023). Homeopathy effects in patients during oncological treatment: a systematic review. Journal of Cancer Research and Clinical Oncology, 149(5), 1785–1810. https://doi.org/10.1007/s00432-022-04054-6
[3] Homeopathy: what does the "best" evidence tell us? (2010). Medical Journal of Australia, 192(8). https://www.mja.com.au/journal/2010/192/8/homeopathy-what-does-best-evidence-tell-us
[4] Anglin, R. E., Samaan, Z., Walter, S. D., & McDonald, S. D. (2013). Vitamin D deficiency and depression in adults: systematic review and meta-analysis. The British Journal of Psychiatry, 202, 100–107. https://doi.org/10.1192/bjp.bp.111.106666
[5] Okereke, O. I., Reynolds, C. F., Mischoulon, D., et al. (2020). Effect of long-term vitamin D3 supplementation vs placebo on risk of depression or clinically relevant depressive symptoms and on change in mood scores: A randomized clinical trial. JAMA, 324(5), 471–480. https://doi.org/10.1001/jama.2020.10224
[6] Fu, J., Zhang, Y., Chen, X., Yu, X., Yan, M., Jing, B., ... & Guo, Q. (2024). Efficacy of vitamin D supplementation on depressive symptoms in older patients: a meta-analysis of randomized controlled trials. Frontiers in Medicine, 11, 1467234.
Further Example Claims
"Antibiotic resistance evolves through well-characterized genetic mechanisms, such as mutations in antibiotic targets (e.g., altering penicillin-binding proteins in bacteria) and horizontal gene transfer (HGT) of resistance determinants (e.g., acquiring β-lactamase genes via plasmids). These mechanisms are empirically validated in both laboratory studies and clinical isolates, demonstrating their role in driving resistance across diverse environments."
Martinez, J. L. (2009). Environmental pollution by antibiotics and by antibiotic resistance determinants. Environmental Pollution, 157(11), 2893–2902. https://doi.org/10.1016/j.envpol.2009.05.051
Factors such as poor sanitation, weak governance, and climate change also contribute to antibiotic resistance, highlighting the complexity of its spread.
Collignon, P., Athukorala, P. C., Senanayake, S., & Khan, F. (2015). Antimicrobial resistance: the major contribution of poor governance and corruption to this growing problem. PLoS ONE, 10(3), e0116746. https://doi.org/10.1371/journal.pone.0116746
Recent studies have consistently shown that regular exercise and intense physical activity are associated with lower risks of cardiovascular death, all-cause mortality, and thromboembolic events, while research continues into optimal exercise prescriptions for different age groups, fitness levels, and health conditions.