Key Idea

Handling Missingness with Salience

Patient records are often incomplete or under-specified. SatIR augments its constraint representation to handle such incompleteness through salience-based reasoning over missing or underspecified evidence, and in some cases by inferring diagnoses from patient notes.

Principle

Salience as a Principle for Handling Missingness

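The fundamental assumption is that any salient information about a patient's condition will be documented in their medical record. Under this assumption, handling missing data reduces to asking whether the absent information would have been salient: if it would, its absence is itself evidence; if it would not, the absence tells us nothing.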

Because salience is not formally defined in the medical literature, SatIR uses targeted LLM queries to assess concept importance. These judgments are recorded explicitly, keeping matching decisions transparent, interpretable, and open to expert review. In contrast, end-to-end LLM matching is much harder to inspect.

Why this matters: end-to-end LLM matchers make implicit missingness judgments that are hidden inside model weights. SatIR externalizes these as explicit salience assessments, making it possible to review, override, or standardize how incomplete records are handled — a critical property for clinical deployment.
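To make "recorded explicitly" concrete, here is a minimal sketch of what a reviewable salience record could look like. The schema, the field names, and the `override` helper are illustrative assumptions, not SatIR's actual data model.

```python
from dataclasses import dataclass, asdict, replace
from typing import Optional
import json


@dataclass(frozen=True)
class SalienceJudgment:
    """One explicit, reviewable salience assessment (hypothetical schema)."""
    concept: str                         # clinical concept that was assessed
    salient: bool                        # verdict, e.g. parsed from an LLM answer
    rationale: str                       # justification kept for expert review
    overridden_by: Optional[str] = None  # reviewer id if a human changed the verdict


def override(judgment: SalienceJudgment, salient: bool, reviewer: str) -> SalienceJudgment:
    """Return a copy with the verdict replaced and the reviewer recorded."""
    return replace(judgment, salient=salient, overridden_by=reviewer)


# Illustrative values only; the rationale text is a placeholder.
original = SalienceJudgment(
    concept="acute appendicitis",
    salient=False,
    rationale="Placeholder rationale: acuity is often left implicit in notes.",
)
reviewed = override(original, salient=True, reviewer="reviewer-01")

# Both the original and the overridden judgment stay in the audit trail.
audit_log = json.dumps([asdict(original), asdict(reviewed)], indent=2)
```

Keeping the record immutable and producing a new copy on override means the original machine judgment is never lost, which is one simple way to support the review and standardization described above.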

Missingness

Whole-Fact Missingness in Records

When a patient record lacks information to support or refute a specific trial constraint, salience determines how that absence should be interpreted.

For each clinical trial condition, SatIR uses the LLM to determine whether potentially missing information is salient, and correspondingly whether it should be interpreted as supporting the condition, refuting it, or remaining inconclusive.
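The three possible outcomes can be sketched as a small decision function. This is a simplified illustration under stated assumptions: the `Verdict` enum, the `requires_presence` flag, and the boolean `salient` input (standing in for the LLM's salience judgment) are hypothetical names, not part of SatIR's published interface.

```python
from enum import Enum


class Verdict(Enum):
    SUPPORTED = "supported"
    REFUTED = "refuted"
    INCONCLUSIVE = "inconclusive"


def interpret_missing(requires_presence: bool, salient: bool) -> Verdict:
    """Interpret a fact that the patient record neither supports nor refutes.

    requires_presence: True if the trial condition needs the fact to hold,
                       False if the condition needs it NOT to hold.
    salient:           assumed outcome of a salience query for this fact.
    """
    if not salient:
        # Low salience: the record may simply have omitted the fact.
        return Verdict.INCONCLUSIVE
    # High salience: the fact would have been documented, so treat it as absent.
    return Verdict.REFUTED if requires_presence else Verdict.SUPPORTED


# A salient fact that is missing refutes a condition that requires it ...
verdict_required = interpret_missing(requires_presence=True, salient=True)
# ... and supports a condition that requires its absence.
verdict_excluded = interpret_missing(requires_presence=False, salient=True)
# A non-salient missing fact decides nothing either way.
verdict_unknown = interpret_missing(requires_presence=True, salient=False)
```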

Fig. — Whole-fact missingness. When a queried trial condition has no direct support in the patient record, salience determines whether the missing fact is tolerable (low salience — remain inconclusive) or decision-critical (high salience — absence is evidence).

Under-Specificity

Salience of Specificity

When a patient's documented diagnosis is less specific than a trial's target, salience determines whether the coarser evidence is sufficient.

A patient is logically eligible for trials targeting conditions that subsume the patient's diagnosis. However, since medical records can be under-specified, it may also be reasonable to match patients when their diagnosis subsumes the trial's targeted condition. Whether we should do so depends on the salience of the targeted condition.

For example, a patient documented only with appendicitis may still match a trial for Acute Appendicitis, since the record may omit that extra specificity. But the same patient should not match a trial for Ruptured Suppurative Appendicitis, because such a salient condition would likely be explicitly recorded.

Augmentation compiles into ORs. Rather than bending the matching logic at runtime, SatIR encodes salience decisions directly into the trial-side formula as disjunctive clauses. When the extra specificity of a condition relative to its ontology parent is low-salience, the condition is expanded to specific_condition OR parent_concept. The augmented formula remains valid, and the solver's guarantees are preserved.
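The OR expansion can be illustrated with a toy example based on the appendicitis case. Everything here is an assumption made for illustration: the two-entry `PARENT` ontology, the `LOW_SALIENCE_SPECIFICITY` set standing in for recorded LLM judgments, and the set-of-disjuncts encoding. The sketch also checks only membership in the disjunction and leaves out the ordinary subsumption reasoning a full matcher would perform.

```python
# Toy ontology: child concept -> parent concept (hypothetical fragment).
PARENT = {
    "acute appendicitis": "appendicitis",
    "ruptured suppurative appendicitis": "appendicitis",
}

# Conditions whose extra specificity over the parent was judged low-salience
# (assumed here; in practice these would come from recorded salience judgments).
LOW_SALIENCE_SPECIFICITY = {"acute appendicitis"}


def augment(condition: str) -> frozenset[str]:
    """Return the disjuncts of the trial-side clause for one condition."""
    disjuncts = {condition}
    parent = PARENT.get(condition)
    if parent is not None and condition in LOW_SALIENCE_SPECIFICITY:
        # specific_condition OR parent_concept
        disjuncts.add(parent)
    return frozenset(disjuncts)


def matches(patient_diagnoses: set[str], condition: str) -> bool:
    """True if any disjunct of the augmented clause is documented."""
    return bool(augment(condition) & patient_diagnoses)


patient = {"appendicitis"}
acute_ok = matches(patient, "acute appendicitis")
ruptured_ok = matches(patient, "ruptured suppurative appendicitis")
```

Because the expansion happens once on the trial side, the matching step itself stays an ordinary satisfiability check over the augmented clause, and `acute_ok` and `ruptured_ok` come out as `True` and `False` respectively for this toy data.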

Fig. — Specificity salience. A patient documented with appendicitis may match Acute Appendicitis (low salience — specificity gap is tolerable), but not Ruptured Suppurative Appendicitis (high salience — would be explicitly recorded).

Inference

Inferring Diagnoses from Patient Notes

Some patient notes describe symptoms or clinical findings without stating a diagnosis explicitly. In these cases, SatIR augments its representation by inferring likely diagnoses using the LLM parsing pipeline. This allows constraints that reference specific diagnoses to be evaluated even when the record documents only the underlying clinical evidence.
