The BHF Data Science Centre is an advocate of open science and encourages all developers and users of computable phenotypes to make them freely and openly available, following the FAIR guiding principles, by:
- Sharing their phenotype definitions publicly and freely, ideally via an open access repository (e.g., the HDR UK Phenotype Library)
- Annotating phenotype definitions with rich data and metadata to support reuse (see our recommended list here)
- Where it is not possible to submit phenotype definitions to a repository (e.g., because the coding terminology is not supported), sharing the full code list and rich data/metadata as detailed above in a publicly accessible location such as GitHub, for example using the YAML and CSV file formats of the Phenotype Library
The recommendations above are a requirement for all research supported by the BHF Data Science Centre (see our Publication and Dissemination Policy).
You can read our full recommendations here.
Submitting to the Phenotype Library
The HDR UK Phenotype Library is an open access, searchable repository of computable phenotypes that have been developed using electronic health records. It contains structured data and metadata describing each definition, enabling researchers to find, interpret, and re-use them easily.
To support submission to the Phenotype Library, we have developed a video demonstrating the process, together with guidance and scripts to help prepare and upload phenotypes via the Phenotype Library API.
Submission instructions, Codelist formatting, Creating YAML files, Batch uploading phenotypes
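Before uploading, it can help to check a codelist for blank or duplicated codes. The following is a minimal sketch only, not one of the Centre's scripts: it assumes a CSV codelist with a hypothetical `code` column, and the function name `check_codelist` is invented for illustration. The column names the Phenotype Library actually requires may differ, so check the codelist formatting guidance.

```python
import csv
import io


def check_codelist(csv_text, code_column="code"):
    """Return a list of problems found in a codelist held as CSV text.

    The column name is an assumption for illustration, not the
    Phenotype Library's required layout.
    """
    problems = []
    seen = set()
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or code_column not in reader.fieldnames:
        return [f"missing column: {code_column}"]
    for row in reader:
        # reader.line_num is the source line of the row just read.
        line_no = reader.line_num
        code = (row[code_column] or "").strip()
        if not code:
            problems.append(f"line {line_no}: blank code")
        elif code in seen:
            problems.append(f"line {line_no}: duplicate code {code}")
        else:
            seen.add(code)
    return problems
```

An empty list means no problems were found by these two checks; it does not confirm that the codes are valid terms in their coding system.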
Additional support
Documentation on the Phenotype Library can be found both
Any questions or problems that occur during the submission process should be directed to the Phenotype Library, via their contact page.
Sharing via GitHub
All computable phenotypes should include a link to the computational code required for implementation. The computational code may be stored on GitHub, with a link to the relevant GitHub repository included in the ‘implementation’ metadata field/YAML template of the entry in the Phenotype Library.
Where it is not possible to submit phenotype definitions to a repository (e.g., due to the coding terminology not being supported), we recommend sharing the computable phenotype in a publicly accessible location, e.g., on GitHub. At minimum this should include:
- the full code list
- rich data/metadata (e.g., using the YAML and CSV file formats of the Phenotype Library)
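As an illustration of the minimum content listed above, the sketch below writes a code list to a CSV file and a small metadata file in YAML, ready to commit to a public repository. It is an assumption-laden example rather than the Phenotype Library's template: the function name `write_phenotype_files`, the file names, the CSV columns and the metadata keys are all hypothetical, and the real YAML template should be taken from the Phenotype Library guidance. It uses only the Python standard library and writes scalars as JSON strings, which are also valid YAML double-quoted strings.

```python
import csv
import json
from pathlib import Path


def write_phenotype_files(out_dir, name, metadata, codes):
    """Write <name>_codelist.csv and <name>.yaml into out_dir.

    metadata: flat dict whose values are strings or lists of strings
              (keys are hypothetical, not the Library's template).
    codes:    iterable of (code, description) pairs.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Code list as CSV with a header row.
    csv_path = out_dir / f"{name}_codelist.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["code", "description"])
        writer.writerows(codes)

    # Metadata as simple YAML: one key per line, lists as "- item" entries.
    lines = []
    for key, value in metadata.items():
        if isinstance(value, list):
            lines.append(f"{key}:")
            lines.extend(f"  - {json.dumps(item)}" for item in value)
        else:
            lines.append(f"{key}: {json.dumps(value)}")
    yaml_path = out_dir / f"{name}.yaml"
    yaml_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return csv_path, yaml_path
```

For example, `write_phenotype_files("phenotypes", "example", {"name": "Example phenotype", "coding_system": ["ICD-10"]}, [("I21", "Acute myocardial infarction")])` would create `phenotypes/example_codelist.csv` and `phenotypes/example.yaml`.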