Data Protection Impact Assessment
Last updated: February 2026
This document provides a summary of the Data Protection Impact Assessment (DPIA) conducted by Helena Bioinformatics for the Helix Insight platform, pursuant to GDPR Article 35. A DPIA is mandatory when processing genetic data on a large scale, as it constitutes high-risk processing of special category data.
1. Processing Description
Nature of processing: automated analysis of genetic variant data (VCF files) uploaded by clinical genetics laboratories. Processing includes variant annotation against population and clinical databases, automated ACMG/AMP classification, phenotype-genotype correlation, biomedical literature mining, and generation of clinical interpretation reports.
Scope: the platform processes whole-exome and whole-genome sequencing data containing thousands to millions of genetic variants per patient sample. Associated phenotype data (HPO terms) and clinical context are also processed.
Context: the platform serves as a clinical decision support tool for qualified geneticists. It does not make autonomous clinical decisions. All outputs require professional review and validation.
Purpose: to reduce variant interpretation time from days to minutes, improving laboratory throughput and consistency while maintaining clinical accuracy.
2. Necessity and Proportionality
Necessity: processing genetic variant data is essential to the core function of the platform. The service cannot be provided without processing VCF files. Phenotype data is necessary for clinical correlation and prioritization of variants.
Proportionality: we process only the minimum data necessary. VCF files are received in pseudonymized form (sample IDs only, no patient names). We do not request or store directly identifying patient information. Phenotype data is limited to standardized HPO codes relevant to the clinical question. Data retention is time-limited with automatic deletion.
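As an illustration of the pseudonymization expectation above, an upload check could read the sample columns from the VCF header line and flag any that do not look like opaque sample IDs. This is a minimal sketch, not the platform's actual validation: the `SAMPLE_ID_PATTERN` format and both helper functions are assumptions made for the example.

```python
import re

# Hypothetical convention: opaque IDs such as "S-000123" or "LAB42_0007".
# The real ID format is agreed with each laboratory and is not specified here.
SAMPLE_ID_PATTERN = re.compile(r"^[A-Z0-9]+[-_][0-9]+$")

# The eight mandatory VCF columns plus FORMAT precede the per-sample columns.
FIXED_COLUMNS = 9


def sample_ids_from_header(vcf_lines):
    """Return the sample column names from the '#CHROM' header line."""
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            return line.rstrip("\n").split("\t")[FIXED_COLUMNS:]
        if not line.startswith("#"):
            break  # data records started without a header line
    raise ValueError("no #CHROM header line found")


def find_suspicious_sample_ids(vcf_lines):
    """Return sample IDs that do not match the assumed opaque-ID pattern."""
    return [s for s in sample_ids_from_header(vcf_lines)
            if not SAMPLE_ID_PATTERN.match(s)]


header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS-000123\tJane_Doe"
print(find_suspicious_sample_ids(["##fileformat=VCFv4.2", header]))
# ['Jane_Doe']
```

A pattern check of this kind only catches obvious cases such as a name used as a column label. It does not show that an ID cannot be linked to a person, which remains the laboratory's responsibility as Data Controller.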
Legal basis: for genomic data, the Data Controller (laboratory) relies on explicit consent (Article 9(2)(a)) or the healthcare provision exemption (Article 9(2)(h)). Helena Bioinformatics processes this data as a Data Processor under contractual obligation (Article 6(1)(b)) and the Data Processing Agreement (DPA) concluded with each laboratory.
3. Risk Assessment
Risk 1: Unauthorized access to genetic data
Severity: High. Genetic data is immutable and uniquely identifying. Likelihood: Low. Mitigated by dedicated servers (not multi-tenant cloud), TLS 1.3 encryption in transit and AES-256 encryption at rest, role-based access control, network firewall rules, and comprehensive audit logging. Residual risk: Low.
Risk 2: Data breach during transit
Severity: High. Likelihood: Very Low. All data transmission uses TLS 1.3 encryption. VCF files are uploaded directly to our EU servers. No data transits through non-EU jurisdictions. Residual risk: Very Low.
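For laboratories that script their uploads, the TLS 1.3 requirement can also be enforced on the client side. The sketch below uses Python's standard `ssl` module. The function name is hypothetical and the example does not describe the platform's server configuration.

```python
import ssl


def make_tls13_client_context():
    """Build a client context that refuses anything older than TLS 1.3."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context


context = make_tls13_client_context()
print(context.minimum_version.name)
```

A context built this way can be passed to an HTTPS client that accepts an `ssl.SSLContext`, so a connection that negotiates an older protocol version fails instead of silently downgrading.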
Risk 3: Re-identification of pseudonymized data
Severity: High. Genetic data is inherently identifying. Likelihood: Very Low. We receive only sample IDs, not patient identifiers. Our staff cannot link sample IDs to individuals. Access is restricted to automated processing pipelines. Residual risk: Very Low.
Risk 4: Incorrect variant classification leading to clinical harm
Severity: High. Incorrect classification could affect patient care. Likelihood: Low. The platform is explicitly positioned as a decision support tool, not a diagnostic device. All outputs must be reviewed by qualified geneticists. The platform follows established ACMG/AMP guidelines, references validated databases (ClinVar, gnomAD), and is regularly validated against clinical benchmarks. Residual risk: Low (mitigated by mandatory human review).
Risk 5: Excessive data retention
Severity: Medium. Likelihood: Very Low. Data is deleted automatically after a configurable retention period (default 90 days). Data Controllers can request immediate deletion. Deletion processes are logged and auditable. Residual risk: Very Low.
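The retention rule can be pictured as a simple cutoff comparison. The following is a minimal sketch under stated assumptions: the `expired_samples` helper, the in-memory `uploads` mapping, and the sample IDs are hypothetical, and only the 90-day default comes from this assessment.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION_DAYS = 90  # default stated in this assessment


def expired_samples(uploads, now, retention_days=DEFAULT_RETENTION_DAYS):
    """Return sample IDs whose upload time is older than the retention period.

    `uploads` maps a sample ID to its timezone-aware upload timestamp.
    """
    cutoff = now - timedelta(days=retention_days)
    return sorted(sid for sid, uploaded_at in uploads.items()
                  if uploaded_at < cutoff)


now = datetime(2026, 2, 1, tzinfo=timezone.utc)
uploads = {
    "S-000001": now - timedelta(days=120),
    "S-000002": now - timedelta(days=30),
}
print(expired_samples(uploads, now))                     # ['S-000001']
print(expired_samples(uploads, now, retention_days=14))  # ['S-000001', 'S-000002']
```

Passing `now` explicitly keeps the selection deterministic, which makes it straightforward to record in an audit log which samples were due for deletion at a given moment.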
Risk 6: Sub-processor non-compliance
Severity: Medium. Likelihood: Very Low. Hetzner (infrastructure provider) maintains ISO 27001 certification and provides physical hosting only, without logical access to data. A DPA is in place with Hetzner. No sub-processors outside the EU process genomic data. Residual risk: Very Low.
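The six risks above can be summarized as a small risk register, which makes it easy to check that no residual rating exceeds a chosen threshold. The ratings are taken from this section. The ordering of the rating scale, the `Risk` class, the `exceeds` helper, and the use of "Low" as the acceptance threshold are assumptions made for illustration.

```python
from dataclasses import dataclass

# Ordered scale used to compare ratings; the ordering is an assumption.
LEVELS = ["Very Low", "Low", "Medium", "High"]


@dataclass(frozen=True)
class Risk:
    name: str
    severity: str
    likelihood: str
    residual: str


RISKS = [
    Risk("Unauthorized access to genetic data", "High", "Low", "Low"),
    Risk("Data breach during transit", "High", "Very Low", "Very Low"),
    Risk("Re-identification of pseudonymized data", "High", "Very Low", "Very Low"),
    Risk("Incorrect variant classification leading to clinical harm", "High", "Low", "Low"),
    Risk("Excessive data retention", "Medium", "Very Low", "Very Low"),
    Risk("Sub-processor non-compliance", "Medium", "Very Low", "Very Low"),
]


def exceeds(risks, threshold="Low"):
    """Return names of risks whose residual rating is above the threshold."""
    limit = LEVELS.index(threshold)
    return [r.name for r in risks if LEVELS.index(r.residual) > limit]


print(exceeds(RISKS))  # []
```

Keeping the register as structured data means that a future review which raises a rating can be rechecked against the same threshold without rereading the prose.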
4. Measures to Address Risks
The following technical and organizational measures are implemented:
Technical measures: dedicated EU-based servers (Helsinki, Finland) with no multi-tenant sharing; TLS 1.3 encryption in transit and AES-256 at rest; role-based access control with principle of least privilege; JWT authentication with automatic session expiration; comprehensive audit trails for all data operations; automated data deletion based on configurable retention policies; network isolation with firewall restricting non-essential traffic; bcrypt password hashing; and regular vulnerability assessments.
Organizational measures: Data Processing Agreements with all laboratory partners; confidentiality obligations for all personnel; documented security incident response procedures with 24-hour notification to Data Controllers; regular review and updating of security measures; sub-processor management with Data Controller notification; staff training on data protection obligations; and an appointed Data Protection Officer.
5. Conclusion
This DPIA concludes that the Helix Insight platform can process genetic data with an acceptable level of residual risk, provided all identified measures are maintained and regularly reviewed. The key factors supporting this conclusion are: data is received only in pseudonymized form; all processing occurs within the EU on dedicated infrastructure; encryption is applied both in transit and at rest; the platform functions as decision support requiring mandatory human review; and data retention is time-limited with automatic deletion.
This DPIA will be reviewed annually or when significant changes are made to the processing activities, infrastructure, or regulatory landscape.
For questions regarding this assessment, contact our Data Protection Officer at privacy@helena.bio.