Interpreting Scores
Phenotype match scores range from 0 to 100 and reflect how closely a gene's known disease phenotype resembles the patient's clinical presentation. This page explains what the scores mean in practice and what factors can affect their accuracy.
Score Ranges
The gene’s known disease phenotype closely resembles the patient’s presentation. Most patient HPO terms have strong matches in the gene’s profile. High confidence in phenotype-genotype correlation.
Significant phenotypic overlap. Several patient terms match well. The gene should be considered a strong candidate even if not all features are represented.
Some shared features between the patient’s presentation and the gene’s profile. May represent partial phenotypic overlap or an atypical presentation of the associated disease.
Few shared phenotypic features. The patient’s presentation has limited overlap with the gene’s known disease spectrum.
Little to no phenotypic similarity. The gene’s associated diseases do not match the patient’s clinical features.
Individual Term Matches
Beyond the overall score, Helix Insight reports which specific patient HPO terms matched and how well. For each patient term, the system identifies the best-matching gene HPO term and its similarity score. A match is considered significant when the individual similarity score exceeds 0.5 (on the 0-1 Lin similarity scale).
Reviewing individual matches helps the geneticist understand why a gene scored the way it did. A gene with 2 of 3 patient terms matched at high similarity is a stronger candidate than one with all 3 matched at low similarity, even when the two overall scores are similar.
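The per-term matching described above can be sketched in a few lines of Python. This is a minimal illustration, not Helix Insight's implementation: the function names, the input layout (a nested dictionary of patient term to gene term to similarity), and all similarity values are assumptions made up for the example. Only the 0.5 significance threshold comes from the text.

```python
from typing import Dict, List, Tuple

# From the text: a match is significant when similarity exceeds 0.5.
SIGNIFICANCE_THRESHOLD = 0.5

def best_matches(
    similarity: Dict[str, Dict[str, float]]
) -> List[Tuple[str, str, float, bool]]:
    """For each patient term, pick the gene term with the highest similarity.

    Assumes every patient term has at least one candidate gene term.
    Returns (patient_term, best_gene_term, score, is_significant) tuples.
    """
    results = []
    for patient_term, gene_scores in similarity.items():
        gene_term, score = max(gene_scores.items(), key=lambda kv: kv[1])
        results.append(
            (patient_term, gene_term, score, score > SIGNIFICANCE_THRESHOLD)
        )
    return results

def significant_count(matches: List[Tuple[str, str, float, bool]]) -> int:
    """Count how many patient terms have a significant best match."""
    return sum(1 for match in matches if match[3])

# Hypothetical similarity values for one candidate gene (illustrative only).
gene_a = {
    "Focal clonic seizure": {"Focal-onset seizure": 0.92, "Ataxia": 0.10},
    "Global developmental delay": {"Intellectual disability": 0.85, "Ataxia": 0.20},
    "Microcephaly": {"Focal-onset seizure": 0.05, "Ataxia": 0.15},
}

matches_a = best_matches(gene_a)
for patient_term, gene_term, score, significant in matches_a:
    flag = "significant" if significant else "not significant"
    print(f"{patient_term} -> {gene_term}: {score:.2f} ({flag})")
```

In this made-up case two of the three patient terms have a significant best match, which is the kind of pattern the individual match view is meant to expose.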
Factors Affecting Accuracy
HPO term specificity
More specific terms (e.g., "Focal clonic seizure" instead of "Seizure") produce more discriminating scores. Broad terms match many genes equally, reducing the ability to differentiate between candidates.
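To see why broad terms discriminate poorly, it helps to work through Lin similarity on a toy example. The sketch below uses the standard Lin formula, 2 x IC(most informative common ancestor) / (IC(a) + IC(b)). The hierarchy is a simplified single-parent tree rather than the real HPO graph, and the information content (IC) values are invented for illustration.

```python
# Toy single-parent hierarchy (simplified; NOT the real HPO structure).
PARENT = {
    "Abnormality of the nervous system": None,
    "Seizure": "Abnormality of the nervous system",
    "Focal clonic seizure": "Seizure",
    "Generalized tonic-clonic seizure": "Seizure",
    "Ataxia": "Abnormality of the nervous system",
}

# Made-up information content values: more specific terms carry more IC.
IC = {
    "Abnormality of the nervous system": 0.5,
    "Seizure": 2.0,
    "Focal clonic seizure": 6.0,
    "Generalized tonic-clonic seizure": 5.0,
    "Ataxia": 4.0,
}

def ancestors(term):
    """Return the term itself plus all of its ancestors up to the root."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain

def lin_similarity(a, b):
    """Lin similarity: 2 * IC(MICA) / (IC(a) + IC(b))."""
    common = set(ancestors(a)) & set(ancestors(b))
    if not common:
        return 0.0
    mica_ic = max(IC[t] for t in common)  # most informative common ancestor
    denominator = IC[a] + IC[b]
    return 2 * mica_ic / denominator if denominator else 0.0

# Two hypothetical genes, each annotated with one seizure subtype.
gene_x_term = "Focal clonic seizure"
gene_y_term = "Generalized tonic-clonic seizure"

def gap(patient_term):
    """Absolute difference between the patient term's similarity to each gene."""
    return abs(
        lin_similarity(patient_term, gene_x_term)
        - lin_similarity(patient_term, gene_y_term)
    )

specific_gap = gap("Focal clonic seizure")
broad_gap = gap("Seizure")
print(f"gap with specific term: {specific_gap:.2f}")
print(f"gap with broad term:    {broad_gap:.2f}")
```

With these assumed values, the specific patient term separates the two genes far more clearly than the broad term does, because the broad term is roughly equally similar to every seizure subtype beneath it.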
Number of patient terms
Between 5 and 15 terms is optimal. Too few terms may miss relevant matches. Too many terms can dilute the average score if some of them are not well characterized in the HPO database.
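The dilution effect is simple arithmetic. The sketch below assumes the overall score is the mean of the per-term best-match similarities scaled to 0-100. This page does not specify the actual aggregation, so treat the `overall_score` function and all values as hypothetical.

```python
from statistics import mean

def overall_score(best_match_similarities):
    """Assumed aggregation: mean best-match similarity, scaled to 0-100."""
    return round(100 * mean(best_match_similarities), 1)

# Five well-characterized patient terms with strong matches (made-up values).
focused = [0.9, 0.8, 0.85, 0.75, 0.8]

# The same five terms plus five poorly characterized terms that match weakly.
padded = focused + [0.1, 0.15, 0.05, 0.1, 0.2]

print("focused term list:", overall_score(focused))
print("padded term list: ", overall_score(padded))
```

Under this assumption the gene's true overlap has not changed between the two lists, yet the padded list scores much lower because the weak matches pull the mean down.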
Gene annotation completeness
Well-characterized genes (e.g., BRCA1, SCN1A) have extensive HPO profiles and score more accurately. Recently discovered disease genes or genes with limited clinical descriptions may score lower than expected.
Atypical presentations
If the patient has an unusual phenotype for the underlying disease, the standard HPO profile for the gene may not capture the presentation well. The score reflects what is known in the database, not all possible presentations.
Ontology structure
The HPO hierarchy sometimes groups terms in ways that may not perfectly reflect clinical similarity. Scores should be interpreted as prioritization guides, not definitive clinical judgments.
When Scores May Be Misleading
Under-annotated genes
Genes with few HPO annotations will score low even if truly relevant. If the gene has recently been associated with disease and the HPO database has not yet been updated, the phenotype match will underestimate the gene's relevance.
Broad HPO terms
Using non-specific terms like "Abnormality of the nervous system" instead of specific seizure types dilutes the score. The algorithm cannot distinguish between genes when the input terms match many HPO profiles equally.
Multi-system diseases
If a patient has features from multiple organ systems and only some are entered as HPO terms, a gene whose profile matches the omitted features will score lower than it should.
Recommendation
Always review Tier 1 and Tier 2 genes regardless of exact score. Use the phenotype match score as a prioritization guide, not a definitive answer. The score helps rank hundreds of candidate genes, but the final clinical interpretation remains with the geneticist.
For guidance on selecting HPO terms that maximize score accuracy, see the HPO Term Selection Guide.