Releasing the Healthcare Text Annotation Guidelines

The Healthcare Text Annotation Guidelines are blueprints for capturing a structured representation of the medical knowledge stored in digital text. Annotations produced by following these guidelines serve as training data for a machine learning algorithm, which learns to systematically extract the medical knowledge in the text and map it to structured knowledge. We’re pleased to release the Healthcare Text Annotation Guidelines to the public as a standard.
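To make the training step concrete, here is a minimal Python sketch of one common way to turn span annotations into per-token labels (BIO tagging) that a sequence model can learn from. This is an illustration only: the function name `spans_to_bio`, the whitespace tokenization, and the label names `MEDICATION` and `CONDITION` are assumptions, not part of the guidelines or of any particular Google pipeline.

```python
import re
from typing import List, Tuple

# (start offset, end offset, entity type); a hypothetical span format.
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Label each whitespace-delimited token with a B-/I-/O tag."""
    labeled = []
    for match in re.finditer(r"\S+", text):
        tok_start, tok_end = match.span()
        label = "O"  # default: token is outside every annotated span
        for start, end, entity_type in spans:
            # A token belongs to a span if their character ranges overlap.
            if tok_start < end and tok_end > start:
                # The token that covers the span's first character gets "B".
                prefix = "B" if tok_start <= start else "I"
                label = f"{prefix}-{entity_type}"
                break
        labeled.append((match.group(), label))
    return labeled


text = "Started lisinopril for high blood pressure."
med_start = text.index("lisinopril")
cond_start = text.index("high blood pressure")
spans = [
    (med_start, med_start + len("lisinopril"), "MEDICATION"),
    (cond_start, cond_start + len("high blood pressure"), "CONDITION"),
]
for token, label in spans_to_bio(text, spans):
    print(f"{token}\t{label}")
```

Real pipelines typically use a model-specific tokenizer rather than whitespace splitting, so the alignment logic would need to be adapted accordingly.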
The guidelines provide a reference for training annotators in addition to explicit blueprints for several healthcare annotation tasks. The annotation guidelines cover the following:
- The task of medical entity extraction with examples from medical entity types like medications, procedures, and body vitals.
- Additional tasks with defined examples, such as entity relation annotation and entity attribute annotation. For instance, the guidelines specify how to relate a medical procedure entity to the source medical condition entity, or how to capture the attributes of a medication entity like dosage, frequency, and route of administration.
- Guidance for annotating an entity’s contextual information like temporal assessment (e.g., current, family history, clinical history), certainty assessment (e.g., unlikely, somewhat likely, likely), and subject (e.g., patient, family member, other).
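The annotation layers listed above (entities, attributes, relations, and contextual information) can be pictured as a small data model. The following Python sketch is hypothetical: the class names, field names, the relation type `TREATS`, and the example sentence are assumptions chosen for illustration and are not the schema defined in the guidelines.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Entity:
    entity_id: str
    text: str
    start: int                 # character offset where the mention begins
    end: int                   # character offset just past the mention
    entity_type: str           # e.g. "MEDICATION", "PROCEDURE", "CONDITION"
    attributes: Dict[str, str] = field(default_factory=dict)
    temporal: str = "current"  # e.g. current, family history, clinical history
    certainty: str = "likely"  # e.g. unlikely, somewhat likely, likely
    subject: str = "patient"   # e.g. patient, family member, other


@dataclass
class Relation:
    relation_type: str
    source_id: str
    target_id: str


def make_entity(note: str, entity_id: str, surface: str,
                entity_type: str, **context) -> Entity:
    """Locate the first occurrence of `surface` in `note` and build an Entity."""
    start = note.index(surface)
    return Entity(entity_id, surface, start, start + len(surface),
                  entity_type, **context)


def annotate_example() -> Tuple[str, List[Entity], List[Relation]]:
    note = ("Patient takes metformin 500 mg orally twice daily for diabetes. "
            "Mother had breast cancer.")
    medication = make_entity(
        note, "e1", "metformin", "MEDICATION",
        attributes={"dosage": "500 mg", "route": "orally",
                    "frequency": "twice daily"})
    condition = make_entity(note, "e2", "diabetes", "CONDITION")
    family = make_entity(
        note, "e3", "breast cancer", "CONDITION",
        temporal="family history", subject="family member")
    # Hypothetical relation linking the medication to the condition it treats.
    relations = [Relation("TREATS", source_id="e1", target_id="e2")]
    return note, [medication, condition, family], relations


note, entities, relations = annotate_example()
for entity in entities:
    print(entity.entity_type, repr(entity.text), entity.temporal, entity.subject)
```

Keeping contextual fields on the entity itself, and relations in a separate list that refers to entities by ID, is one simple design; the published guidelines may organize these layers differently.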
Google consulted with industry experts and academic institutions in the process of assembling the Healthcare Text Annotation Guidelines. We took inspiration from other open source and research projects like i2b2 and added context to the guidelines to support the information extraction needs of industry applications such as Healthcare Effectiveness Data and Information Set (HEDIS) quality reporting. The data types contained in the Healthcare Text Annotation Guidelines are a common denominator across information extraction applications. Individual industry applications can have additional information extraction needs that the current version of the guidelines does not capture. We chose to open source the guidelines so the community can tailor them to its needs.
We’re thrilled to open source this project. We hope the community will contribute to the refinement and expansion of the Healthcare Text Annotation Guidelines, so they mirror the ever-evolving nature of healthcare.