Interpreting Scores

Phenotype match scores range from 0 to 100 and reflect how closely a gene's known disease phenotype resembles the patient's clinical presentation. This page explains what the scores mean in practice and what factors can affect their accuracy.

Score Ranges

80-100: Excellent Match

The gene’s known disease phenotype closely resembles the patient’s presentation. Most patient HPO terms have strong matches in the gene’s profile. This indicates high confidence in the phenotype-genotype correlation.

60-79: Good Match

Significant phenotypic overlap. Several patient terms match well. The gene should be considered a strong candidate even if not all features are represented.

40-59: Moderate Match

Some shared features between the patient’s presentation and the gene’s profile. May represent partial phenotypic overlap or an atypical presentation of the associated disease.

20-39: Weak Match

Few shared phenotypic features. The patient’s presentation has limited overlap with the gene’s known disease spectrum.

0-19: Poor Match

Little to no phenotypic similarity. The gene’s associated diseases do not match the patient’s clinical features.
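
The ranges above can be expressed as a simple lookup. The following is a minimal sketch, not part of Helix Insight: the names SCORE_BANDS and score_band are hypothetical, and the handling of fractional scores (for example, 79.5 falling into "Good Match") is an assumption, since this page lists integer boundaries only.

```python
# Score bands as listed on this page: (lower bound, label), highest first.
SCORE_BANDS = [
    (80, "Excellent Match"),
    (60, "Good Match"),
    (40, "Moderate Match"),
    (20, "Weak Match"),
    (0, "Poor Match"),
]


def score_band(score: float) -> str:
    """Return the band label for a phenotype match score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    # Assumption: a band starts at its lower bound, so 79.5 is still "Good Match".
    for lower_bound, label in SCORE_BANDS:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: the last band starts at 0")


if __name__ == "__main__":
    for example in (92, 60, 45.5, 19):
        print(example, score_band(example))
```

A band label is only a reading aid. Two genes in the same band can still differ in which patient terms they match.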

Individual Term Matches

Beyond the overall score, Helix Insight reports which specific patient HPO terms matched and how well. For each patient term, the system identifies the best-matching gene HPO term and its similarity score. A match is considered significant when the individual similarity score exceeds 0.5 (on the 0-1 Lin similarity scale).

Reviewing individual matches helps the geneticist understand why a gene scored the way it did. A gene with 2 of 3 patient terms matched at high similarity is a stronger candidate than one with 3 of 3 matched at low similarity, even when the two overall scores are similar.
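
The comparison in the paragraph above can be illustrated with a short sketch. It assumes the per-term best-match similarities are available as a mapping from patient term to similarity. The similarity values, the function name summarize_matches, and the two example genes are invented for illustration. Only the 0.5 threshold comes from this page.

```python
from statistics import mean

# From this page: an individual match is significant when similarity exceeds 0.5.
SIGNIFICANCE_THRESHOLD = 0.5


def summarize_matches(best_matches: dict[str, float]) -> tuple[int, float]:
    """Return (number of significant matches, mean similarity of those matches)."""
    significant = [s for s in best_matches.values() if s > SIGNIFICANCE_THRESHOLD]
    strength = mean(significant) if significant else 0.0
    return len(significant), strength


# Hypothetical gene A: 2 of 3 patient terms matched, both at high similarity.
gene_a = {
    "Focal clonic seizure": 0.92,
    "Global developmental delay": 0.88,
    "Ataxia": 0.10,
}

# Hypothetical gene B: 3 of 3 patient terms matched, all only just above threshold.
gene_b = {
    "Focal clonic seizure": 0.62,
    "Global developmental delay": 0.64,
    "Ataxia": 0.64,
}

if __name__ == "__main__":
    for name, matches in (("gene A", gene_a), ("gene B", gene_b)):
        count, strength = summarize_matches(matches)
        overall = mean(matches.values())
        print(f"{name}: {count} significant, strength {strength:.2f}, mean {overall:.2f}")
```

Both example genes have the same mean similarity across all three terms, yet gene A's two matches are much stronger than any of gene B's three. This is the kind of difference that only shows up when individual matches are reviewed.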

Factors Affecting Accuracy

HPO term specificity

More specific terms (e.g., "Focal clonic seizure" instead of "Seizure") produce more discriminating scores. Broad terms match many genes equally, reducing the ability to differentiate between candidates.
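
A toy calculation shows why specific terms discriminate better. Lin similarity between two terms is twice the information content (IC) of their most informative common ancestor, divided by the sum of the two terms' IC. The sketch below uses an invented three-term hierarchy and made-up IC values. The real HPO has intermediate levels and allows multiple parents, and how Helix Insight computes IC is not described on this page.

```python
# Toy hierarchy (assumption): both specific terms are direct children of "Seizure".
PARENT = {
    "Seizure": None,
    "Focal clonic seizure": "Seizure",
    "Generalized tonic seizure": "Seizure",
}

# Made-up information content values: specific terms carry more information.
IC = {
    "Seizure": 2.0,
    "Focal clonic seizure": 6.0,
    "Generalized tonic seizure": 6.0,
}


def ancestors(term: str) -> set[str]:
    """Return the term itself plus all of its ancestors in the toy hierarchy."""
    found = set()
    current = term
    while current is not None:
        found.add(current)
        current = PARENT[current]
    return found


def lin_similarity(a: str, b: str) -> float:
    """Lin similarity: 2 * IC(most informative common ancestor) / (IC(a) + IC(b))."""
    common = ancestors(a) & ancestors(b)
    if not common:
        return 0.0
    ic_mica = max(IC[t] for t in common)
    return 2 * ic_mica / (IC[a] + IC[b])


if __name__ == "__main__":
    for patient_term in ("Seizure", "Focal clonic seizure"):
        for gene_term in ("Focal clonic seizure", "Generalized tonic seizure"):
            sim = lin_similarity(patient_term, gene_term)
            print(f"{patient_term!r} vs {gene_term!r}: {sim:.2f}")
```

With the broad patient term "Seizure", both gene terms receive the same similarity, so the two genes cannot be told apart. With "Focal clonic seizure", the gene annotated with that exact term scores far higher than the gene annotated with a different seizure type.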

Number of patient terms

Between 5 and 15 terms is optimal. Too few terms may miss relevant matches. Too many can dilute the average score if some are not well characterized in the HPO database.
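
The dilution effect can be sketched numerically. The example assumes, for illustration only, that the overall score is 100 times the plain mean of the per-term best-match similarities. This page mentions an average score but does not specify the exact aggregation, so the function overall_score and all similarity values are hypothetical.

```python
from statistics import mean

# Recommended range from this page.
MIN_TERMS, MAX_TERMS = 5, 15


def overall_score(best_similarities: list[float]) -> float:
    """Assumed aggregation: 100 * mean of per-term best-match similarities."""
    return 100 * mean(best_similarities)


def term_count_note(n_terms: int) -> str:
    """Compare a term count against the 5-15 range recommended on this page."""
    if n_terms < MIN_TERMS:
        return "too few terms: relevant matches may be missed"
    if n_terms > MAX_TERMS:
        return "too many terms: the average score may be diluted"
    return "within the recommended range"


# Five well-characterized terms with strong matches (invented values).
core_terms = [0.90, 0.85, 0.80, 0.90, 0.80]

# The same five terms plus twelve poorly characterized terms that match weakly.
padded_terms = core_terms + [0.10] * 12

if __name__ == "__main__":
    for label, sims in (("core", core_terms), ("padded", padded_terms)):
        print(f"{label}: {len(sims)} terms, score {overall_score(sims):.0f}, "
              f"{term_count_note(len(sims))}")
```

Under this assumed aggregation, the score falls from about 85 to about 32 even though the five informative terms match exactly as well as before.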

Gene annotation completeness

Well-characterized genes (e.g., BRCA1, SCN1A) have extensive HPO profiles and are scored more accurately. Recently discovered disease genes or genes with limited clinical descriptions may score lower than expected.

Atypical presentations

If the patient has an unusual phenotype for the underlying disease, the standard HPO profile for the gene may not capture the presentation well. The score reflects what is known in the database, not all possible presentations.

Ontology structure

The HPO hierarchy sometimes groups terms in ways that do not perfectly reflect clinical similarity. Scores should be interpreted as prioritization guides, not definitive clinical judgments.

When Scores May Be Misleading

Under-annotated genes

Genes with few HPO annotations will score low even if truly relevant. If the gene has recently been associated with disease and the HPO database has not yet been updated, the phenotype match will underestimate the gene's relevance.

Broad HPO terms

Using non-specific terms like "Abnormality of the nervous system" instead of specific seizure types dilutes the score. The algorithm cannot distinguish between genes when the input terms match many HPO profiles equally.

Multi-system diseases

If a patient has features in multiple organ systems and only some are entered as HPO terms, a gene whose profile matches the omitted features will receive a score that underestimates its relevance.

Recommendation

Always review Tier 1 and Tier 2 genes regardless of exact score. Use the phenotype match score as a prioritization guide, not a definitive answer. The score helps rank hundreds of candidate genes, but the final clinical interpretation remains with the geneticist.

For guidance on selecting HPO terms that maximize score accuracy, see the HPO Term Selection Guide.