
SATIR

Precision Patient–Trial Matching

Scalable High-Recall Constraint-Satisfaction-Based
Information Retrieval for Clinical Trials Matching

Stanford NLP Group
Stanford OVAL Lab
Mayo Clinic
Cyrus Zhou1
Yufei Jin2*
Yilin Xu3*
Yu-Chiang Wang4
Chieh-Ju Chao4
Monica S. Lam1

1Stanford   2UCLA   3Emory   4Mayo Clinic

*Equal contribution

+72% more eligible trials found per patient vs. TrialGPT
93.7% recall (vs. 62.4% for TrialGPT)
2.95 s per-patient retrieval over 3,621 trials
59 patients evaluated across 3,621 trials

SatIR in action. A tofersen SOD1-ALS study showing how SatIR turns free-text eligibility criteria into formal, interpretable constraints executable in a database. Terms in blue are ontology-grounded. Clauses not safely translatable to recall-safe predicates are dropped, preserving recall.

Formal Methods Meet Clinical Trial Matching
Clinical trials are central to evidence-based medicine, yet many fail to meet enrollment targets despite the more than half a million trials listed on ClinicalTrials.gov. Existing retrieval techniques, largely based on keyword and embedding-similarity matching, suffer from low recall, low precision, and limited interpretability when faced with complex eligibility constraints.

We propose SatIR, a scalable clinical trial retrieval method based on constraint satisfaction. Our approach uses formal methods — Satisfiability Modulo Theories (SMT) and relational algebra — to efficiently represent and match key constraints from clinical trials and patient records. We use Large Language Models (LLMs) to convert informal reasoning — regarding ambiguity, implicit clinical assumptions, and incomplete patient records — into explicit, precise, controllable, and interpretable formal constraints.

Evaluated on 59 patients and 3,621 trials, SatIR outperforms TrialGPT on all three retrieval objectives. It retrieves 32%–72% more relevant-and-eligible trials per patient, improves recall by 22–38 points, and serves more patients with at least one useful trial — in just 2.95 seconds per patient.
Keywords: Constraint Satisfaction · Clinical Trial Matching · SMT · Formal Methods · High-Recall Retrieval · SNOMED Ontology · LLM + Formal Methods

How SatIR Works
Trials are processed offline into formal constraint representations stored in a database. At query time, patient records are formalized and matched against the database.
Figure 1. SatIR overview. Clinical trial listings and patient records are independently parsed into formal SMT representations. Trial constraints are stored offline; patient constraints are executed as SQL queries for fast, high-recall retrieval.

Ontology-Grounded Representation

An SMT-based representation grounded in SNOMED CT captures Boolean logic, numeric thresholds, temporal windows, and ontology-aware concept identity and entailment.
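As a toy sketch of this idea (not SatIR's actual SMT encoding: plain Python closures stand in for SMT predicates, and a tiny hand-made map stands in for SNOMED CT; all concept names are illustrative), Boolean structure, numeric thresholds, and ontology entailment compose like this:

```python
from dataclasses import dataclass

# Toy ontology map (child -> parent); a stand-in for SNOMED CT.
# All concept names here are illustrative, not real SNOMED codes.
ONTOLOGY = {
    "ruptured_appendicitis": "acute_appendicitis",
    "acute_appendicitis": "appendicitis",
    "sod1_als": "als",
}

def entails(concept, ancestor):
    """True if `concept` equals `ancestor` or is a descendant of it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = ONTOLOGY.get(concept)
    return False

@dataclass
class Patient:
    conditions: set       # documented concepts
    measurements: dict    # e.g. {"age_years": 54}

# Constraints are Patient -> bool closures; the combinators mirror
# the Boolean structure an SMT encoding would express.
def has(concept):
    return lambda p: any(entails(c, concept) for c in p.conditions)

def at_least(field, threshold):
    return lambda p: p.measurements.get(field, float("-inf")) >= threshold

def all_of(*constraints):
    return lambda p: all(c(p) for c in constraints)

def neg(constraint):
    return lambda p: not constraint(p)

# Example trial: adults with SOD1-ALS and no documented appendicitis.
trial = all_of(has("als"), at_least("age_years", 18), neg(has("appendicitis")))

patient = Patient(conditions={"sod1_als"}, measurements={"age_years": 54})
print(trial(patient))  # True: sod1_als entails als, 54 >= 18, no appendicitis
```

The ontology walk is what makes concept identity entailment-aware: a patient documented with the specific `sod1_als` satisfies a trial's broader `als` requirement.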

Accurate Semantic Parsing

LLM-based parsers translate complex clinical text — handling underspecified criteria, negation, temporality, and implicit assumptions — into executable formal constraints.
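To illustrate the target of that parsing, here is a minimal sketch in which a regex stands in for the LLM parser; the `(field, op, value)` output shape and the field names are illustrative assumptions, not SatIR's actual schema. It also shows the recall-safe fallback: clauses with no safe translation are dropped rather than guessed.

```python
import re

# A regex stands in for the LLM parser, only to show the target output
# shape: each criterion becomes a (field, op, value) predicate.
PATTERN = re.compile(
    r"(?P<field>[A-Za-z ]+?)\s*(?P<op>>=|<=|>|<)\s*(?P<value>\d+(?:\.\d+)?)")

def parse_criterion(text):
    m = PATTERN.search(text)
    if m is None:
        # No recall-safe translation: drop the clause, preserving recall.
        return None
    return (m.group("field").strip().lower().replace(" ", "_"),
            m.group("op"), float(m.group("value")))

print(parse_criterion("Age >= 18 years"))        # ('age', '>=', 18.0)
print(parse_criterion("Ejection fraction <= 40%"))
print(parse_criterion("history of smoking"))     # None (dropped, recall-safe)
```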

Salience-Based Missingness

SatIR introduces salience to resolve ambiguity from incomplete records — using LLMs to assess whether missing data is clinically significant enough to affect eligibility, with explicit auditable judgments.

Scalable Database Retrieval

SMT constraints are projected into relational algebra and stored in SQL, enabling fast recall-preserving retrieval across hundreds of thousands of trials in under 3 seconds.
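A minimal sketch of that relational projection, using Python's stdlib `sqlite3`; the schema and the `>=`/`<=` predicate encoding are illustrative assumptions, not SatIR's database design. The query keeps any trial that is not provably excluded, so predicates on fields absent from the record never cost recall:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trial_constraints (trial_id TEXT, field TEXT, op TEXT, value REAL);
CREATE TABLE patient (field TEXT, value REAL);
""")
conn.executemany("INSERT INTO trial_constraints VALUES (?,?,?,?)", [
    ("NCT001", "age_years", ">=", 18),
    ("NCT001", "ef_percent", "<=", 40),
    ("NCT002", "age_years", ">=", 65),
])
conn.executemany("INSERT INTO patient VALUES (?,?)", [
    ("age_years", 54), ("ef_percent", 35),
])

# Recall-preserving retrieval: a trial is rejected only if some stored
# predicate is provably violated by a value actually present in the
# record; constraints on missing fields never exclude a trial.
QUERY = """
SELECT DISTINCT c.trial_id
FROM trial_constraints c
WHERE c.trial_id NOT IN (
    SELECT v.trial_id
    FROM trial_constraints v
    JOIN patient p ON p.field = v.field
    WHERE (v.op = '>=' AND p.value < v.value)
       OR (v.op = '<=' AND p.value > v.value))
ORDER BY c.trial_id
"""
matches = [row[0] for row in conn.execute(QUERY)]
print(matches)  # ['NCT001']  (NCT002 requires age >= 65)
```

Pushing the violation check into a single SQL query is what lets the index, rather than per-trial reasoning, do the work at retrieval time.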



Handling Missingness with Salience
Patient records are routinely incomplete. SatIR introduces salience — a principled, interpretable mechanism for deciding how missing information should affect eligibility.

The core assumption: any clinically salient fact about a patient will be documented in their medical record. If it is absent, that absence is itself informative.

SatIR uses targeted LLM queries to assess the salience of each missing concept — whether it is the kind of thing a physician would always record if present. These judgments are compiled into explicit, auditable policy decisions rather than left as implicit LLM reasoning.

Why this matters: end-to-end LLM matchers make implicit missingness judgments that are hard to inspect or audit. SatIR externalizes these as explicit salience assessments, making it possible to review, override, or standardize how incomplete records are handled — a critical property for clinical deployment.
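As a sketch of what such an externalized policy could look like (the function names, labels, and the lookup-table stub are hypothetical; in SatIR the salience judgment would come from a targeted LLM query):

```python
# Illustrative salience policy, not SatIR's actual interface. The LLM's
# per-concept salience judgment is stubbed as a lookup table so the
# resulting policy is explicit and auditable.
SALIENCE = {
    "ruptured_suppurative_appendicitis": "high",  # would always be charted
    "prior_aspirin_use": "low",                   # often goes unrecorded
}

def decide(required_concept, record):
    """Resolve a trial requirement against a possibly incomplete record.

    Returns 'satisfied', 'violated', or 'unknown-include'. The recall-safe
    default for low-salience missing facts is to keep the trial.
    """
    if required_concept in record:
        return "satisfied"
    if SALIENCE.get(required_concept, "low") == "high":
        # High salience: absence from the record is evidence of absence.
        return "violated"
    # Low salience: the record may simply be silent; remain inclusive.
    return "unknown-include"

record = {"appendicitis"}
print(decide("ruptured_suppurative_appendicitis", record))  # violated
print(decide("prior_aspirin_use", record))                  # unknown-include
```

Because the policy is a plain table plus a small decision rule, a reviewer can inspect, override, or standardize any individual salience judgment without touching the rest of the system.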
Whole-fact missingness. When a trial condition has no support in the record, salience determines whether absence is evidence (high salience) or the record is simply silent (low salience — remain inclusive).
Specificity mismatch. A patient documented with appendicitis may match Acute Appendicitis (low salience), but not Ruptured Suppurative Appendicitis (high salience — would be explicitly recorded).

State-of-the-Art on All Objectives
SatIR outperforms all baselines — including TrialGPT — across three retrieval objectives on 59 patients and 3,621 trials.
Recall (%) by objective:

Objective       | SatIR | TrialGPT | BGE-large | SatIR vs. TrialGPT
Treat-Chief     | 93.66 | 62.44    | 59.51     | +31.2 pts
Treat-Any       | 92.07 | 53.66    | 51.22     | +38.4 pts
Relevant-to-Any | 92.53 | 70.27    | 68.27     | +22.3 pts
Mean Relevant-and-Eligible Trials Retrieved per Patient
Objective       | Retrieved/Patient | BMRetriever | ClinBest | BGE-large | PubMedBERT | TrialGPT | SatIR
Treat-Chief     | 36                | 1.44        | 1.75     | 2.07      | 2.00       | 2.17     | 3.25
Treat-Any       | 64                | 2.00        | 2.54     | 2.85      | 2.68       | 2.98     | 5.12
Relevant-to-Any | 121               | 6.39        | 7.24     | 8.68      | 8.92       | 8.93     | 11.76
Patient-Level Outcomes vs. TrialGPT (59 patients)
Objective       | SatIR wins | Tie | TrialGPT wins | Served by SatIR | Served by TrialGPT
Treat-Chief     | 20         | 33  | 6             | 39              | 35
Treat-Any       | 29         | 28  | 2             | 42              | 39
Relevant-to-Any | 35         | 17  | 7             | 54              | 50

Interpretable, auditable, and permanently correctable. Because every eligibility decision traces back to explicit symbolic constraints — not opaque model weights — failures are diagnosable. When a parsing error or an ontology gap is identified and fixed, that fix propagates immediately and permanently across every patient and every trial that touches the corrected predicate. There is no need to retrain, re-embed, or re-evaluate a model. This is the core advantage of SatIR's formal approach: errors are local, fixes are global.

View failure case analysis →


BibTeX
@article{zhou2026satir,
  title  = {Scalable High-Recall Constraint-Satisfaction-Based
           Information Retrieval for Clinical Trials Matching},
  author = {Cyrus Zhou and Yufei Jin and Yilin Xu and
           Yu-Chiang Wang and Chieh-Ju Chao and Monica S. Lam},
  note   = {Under review},
  year   = {2026},
}

Support & Thanks

We gratefully acknowledge support from the Verdant Foundation, the Hasso Plattner Institute, Microsoft Azure AI credits, Itaú Unibanco, BMO Financial Group, and the Stanford HAI Institute. Cyrus Zhou is partially supported by the Stanford School of Engineering Fellowship.

We thank Yucheng Jiang for the clinician annotation interface, and our clinical collaborators at Stanford and Mayo Clinic for their expert guidance.