Helix Insight

Phenotype Matching

After variant classification, the geneticist faces a key question: which of these variants are relevant to this patient's clinical presentation? A Pathogenic BRCA1 variant does not explain the presentation of a patient referred for epilepsy. Phenotype matching automates this correlation.

The patient's Human Phenotype Ontology (HPO) terms are compared against the known phenotypic associations of every gene carrying a candidate variant. Each gene receives a phenotype match score (0-100) and is assigned to a clinical tier. This transforms the subjective question "does my patient's phenotype match this disease?" into a reproducible numerical score.

How It Works

1. Select HPO Terms

The geneticist describes the patient’s clinical presentation using HPO terms, a standardized vocabulary of phenotypic abnormalities. Five to 15 terms is optimal for most cases.

2. Compute Similarity

For each gene carrying a candidate variant, the algorithm compares the patient’s HPO terms against the gene’s known phenotype associations using semantic similarity. Related concepts are recognized even without exact matches.

3. Score and Rank

Each gene receives a phenotype match score (0-100) based on the average best-match similarity across all patient HPO terms: for each patient term, the most similar term in the gene’s profile is found, and these best-match values are averaged.

4. Assign Clinical Tiers

Variants are classified into five clinical tiers combining phenotype match strength, ACMG pathogenicity, variant impact, and population frequency.

5. Review Results

Tier 1 (Actionable) and Tier 2 (Potentially Actionable) variants are presented first. Incidental findings are flagged separately.
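The scoring step (step 3) can be sketched in a few lines of Python. This is an illustration of "average best-match similarity" only, not Helix Insight's implementation: the function name `phenotype_match_score`, the gene names `GENE_A` and `GENE_B`, and every value in the pairwise similarity table are made up for the example, and the scaling to 0-100 by multiplying the mean by 100 is an assumption.

```python
from typing import Callable, Iterable


def phenotype_match_score(
    patient_terms: Iterable[str],
    gene_terms: Iterable[str],
    similarity: Callable[[str, str], float],
) -> float:
    """Average best-match similarity, scaled to 0-100 (scaling is assumed)."""
    patient_terms = list(patient_terms)
    gene_terms = list(gene_terms)
    if not patient_terms or not gene_terms:
        return 0.0
    # For each patient term, keep only its best match in the gene profile.
    best = [max(similarity(p, g) for g in gene_terms) for p in patient_terms]
    return 100.0 * sum(best) / len(best)


# Hypothetical pairwise similarities in [0, 1]; the values are invented.
TOY_SIM = {
    ("Infantile spasms", "Epileptic encephalopathy"): 0.8,
    ("Infantile spasms", "Hypotonia"): 0.1,
    ("Developmental delay", "Epileptic encephalopathy"): 0.2,
    ("Developmental delay", "Hypotonia"): 0.3,
}


def toy_similarity(a: str, b: str) -> float:
    """Symmetric lookup in the toy table; identical terms score 1.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get((a, b), TOY_SIM.get((b, a), 0.0))


patient = ["Infantile spasms", "Developmental delay"]
gene_profiles = {
    "GENE_A": ["Epileptic encephalopathy", "Hypotonia"],  # placeholder gene
    "GENE_B": ["Hypotonia"],                              # placeholder gene
}

scores = {
    gene: phenotype_match_score(patient, terms, toy_similarity)
    for gene, terms in gene_profiles.items()
}
# Rank genes from strongest to weakest phenotype match.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

Because the average runs over patient terms rather than gene terms, a gene is not penalized for having a large HPO profile; it only needs one good match per patient term.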

Semantic, Not Exact Matching

Phenotype matching uses semantic similarity, not exact string matching. A patient with "Infantile spasms" will match genes associated with "Epileptic encephalopathy" even if the exact term is absent from the gene's HPO profile. The ontological relationship between these terms is captured through their common ancestors in the HPO hierarchy.
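To make the role of common ancestors concrete, here is a minimal sketch. The text above does not say which similarity measure is used, so this example swaps in one simple option, the Jaccard overlap of ancestor sets; information-content measures are another common family. The hierarchy `TOY_PARENTS` is a simplified stand-in and does not reproduce the real HPO structure, and the function names are hypothetical.

```python
# Toy parent links (child -> parents). NOT the real HPO hierarchy.
TOY_PARENTS = {
    "Infantile spasms": ["Seizure"],
    "Epileptic encephalopathy": ["Seizure"],
    "Seizure": ["Abnormality of the nervous system"],
    "Hypotonia": ["Abnormality of the musculature"],
    "Abnormality of the nervous system": ["Phenotypic abnormality"],
    "Abnormality of the musculature": ["Phenotypic abnormality"],
    "Phenotypic abnormality": [],
}


def ancestors(term, parents=TOY_PARENTS):
    """Return the term itself plus every ancestor reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def ancestor_jaccard(a, b, parents=TOY_PARENTS):
    """Shared ancestors divided by all ancestors of either term (0 to 1)."""
    anc_a, anc_b = ancestors(a, parents), ancestors(b, parents)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


# The two epilepsy-related terms never match as strings, yet they share
# most of their ancestors, so they score higher than an unrelated pair.
related = ancestor_jaccard("Infantile spasms", "Epileptic encephalopathy")
unrelated = ancestor_jaccard("Infantile spasms", "Hypotonia")
print(related, unrelated)
```

An exact string comparison would score both pairs as zero; the ancestor-based measure separates them because the related pair meets low in the hierarchy while the unrelated pair meets only at the root.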

Performance

Analysis type    Typical run time
Gene Panels      Seconds
WES              5-15 seconds (~4,600 unique gene HPO sets)
WGS              15-30 seconds

All variants from the same gene share identical HPO annotations. The system deduplicates at the gene level before computing similarity, reducing the number of calculations by approximately 130x for a typical WGS case.
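Gene-level deduplication might look like the following sketch: group variants by gene, score each gene once, then copy the score back onto every variant. The variant dictionaries, the `gene` and `phenotype_score` keys, the placeholder gene names, and the `score_gene` callback are assumptions made for illustration.

```python
from collections import defaultdict


def score_variants_by_gene(variants, score_gene):
    """Compute one phenotype score per gene and attach it to each variant."""
    by_gene = defaultdict(list)
    for variant in variants:
        by_gene[variant["gene"]].append(variant)

    # One similarity computation per unique gene, however many variants it has.
    gene_scores = {gene: score_gene(gene) for gene in by_gene}

    scored = [
        {**variant, "phenotype_score": gene_scores[variant["gene"]]}
        for variant in variants
    ]
    return scored, len(gene_scores)


# Count how often the (expensive) scoring function is actually invoked.
calls = []


def fake_score_gene(gene):
    calls.append(gene)
    return 42.0  # placeholder score, not a real result


variants = [
    {"id": "v1", "gene": "GENE_A"},
    {"id": "v2", "gene": "GENE_A"},
    {"id": "v3", "gene": "GENE_B"},
    {"id": "v4", "gene": "GENE_A"},
    {"id": "v5", "gene": "GENE_B"},
]

scored, unique_genes = score_variants_by_gene(variants, fake_score_gene)
print(f"{len(variants)} variants, {len(calls)} similarity computations")
```

The saving grows with the ratio of variants to unique genes, which is why the effect is largest for WGS cases.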
