Language and knowledge technology for interpreting medical data sources



For several years, electronic medical records (EMRs) have been introduced to replace paper-based medical records. Today, however, only the data needed for billing are systematically structured and coded in hospital medical records. Most data are still unstructured, written as free text by physicians, even though they contain information that is crucial to biomedical research. With that in mind, a 3-year collaborative project, SYNODOS, was launched in October 2012 with the aim of developing a generic solution for extracting meaning from medical data and organizing it to support epidemiological studies and medical decisions. This extraction requires several steps based on natural language processing and the analysis of medical terminologies. To ensure that the solution is generic, SYNODOS has focused on two different types of disease: hospital-acquired infections and cancer.

Schema PatientMiner

SYNODOS consists of four main components:

  • The multi-terminology server, which provides all processing modules with relevant lexical-semantic information in the medical domain.
  • The linguistic server, which analyses textual medical input in order to provide a semantically enriched document to the next component.
  • The knowledge server, which extracts high-level knowledge using the outputs of both the terminology server and the linguistic server.
  • The general planning component, which calls the different modules and provides results, either directly through the general user interface, or to each component that might need input from another component.
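The way these four components hand data to one another can be illustrated with a minimal Python sketch. It is not the SYNODOS implementation: the class names, the method names, the toy terminology and its concept codes are all hypothetical, and the "linguistic analysis" is reduced to a plain dictionary lookup on word tokens.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Toy terminology (hypothetical entries): surface term -> (concept code, preferred label).
TOY_TERMINOLOGY: Dict[str, Tuple[str, str]] = {
    "infection": ("C001", "Infection"),
    "tumor": ("C002", "Neoplasm"),
}


@dataclass
class EnrichedDocument:
    """Input text plus the semantic annotations added by the linguistic server."""
    text: str
    annotations: List[Dict[str, object]] = field(default_factory=list)


class TerminologyServer:
    """Stand-in for the multi-terminology server: lexical-semantic lookups."""

    def lookup(self, term: str) -> Optional[str]:
        entry = TOY_TERMINOLOGY.get(term.lower())
        return entry[0] if entry else None

    def label(self, code: str) -> Optional[str]:
        for entry_code, entry_label in TOY_TERMINOLOGY.values():
            if entry_code == code:
                return entry_label
        return None


class LinguisticServer:
    """Stand-in for the linguistic server: annotates tokens with concept codes."""

    def __init__(self, terminology: TerminologyServer) -> None:
        self.terminology = terminology

    def analyse(self, text: str) -> EnrichedDocument:
        doc = EnrichedDocument(text=text)
        for match in re.finditer(r"\w+", text):
            code = self.terminology.lookup(match.group())
            if code is not None:
                doc.annotations.append(
                    {"start": match.start(), "end": match.end(), "code": code}
                )
        return doc


class KnowledgeServer:
    """Stand-in for the knowledge server: combines annotations and terminology."""

    def __init__(self, terminology: TerminologyServer) -> None:
        self.terminology = terminology

    def extract(self, doc: EnrichedDocument) -> List[Tuple[str, str]]:
        codes = sorted({str(a["code"]) for a in doc.annotations})
        return [(code, self.terminology.label(code) or "") for code in codes]


class Planner:
    """Stand-in for the general planning component: calls the modules in order."""

    def __init__(self) -> None:
        self.terminology = TerminologyServer()
        self.linguistic = LinguisticServer(self.terminology)
        self.knowledge = KnowledgeServer(self.terminology)

    def run(self, text: str) -> List[Tuple[str, str]]:
        enriched = self.linguistic.analyse(text)
        return self.knowledge.extract(enriched)
```

In this sketch, calling Planner().run(...) on a sentence returns the list of (code, label) pairs found in it. A real system would replace each stand-in with a service call; the point here is only the direction of the data flow.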

In addition to these components, and in order to respect privacy policies, SYNODOS provides a rule-based de-identifier, which masks any information in the document that could help to identify either the patient or the medical staff. In some cases, for medical decision making, it is important that authorized medical staff have access to the patient's identity at the end of the semantic processing, so that they can make the most appropriate decision for that patient. The de-identification tool therefore also provides a re-identification facility.
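A rule-based de-identifier with a re-identification facility can be sketched as follows. This is an illustration only, not the SYNODOS tool: the two regular-expression rules, the placeholder format and the function names are assumptions, and real clinical text would need far more rules as well as secure, access-controlled storage of the mapping.

```python
import re
from typing import Dict, List, Pattern, Tuple

# Hypothetical masking rules: (category, pattern). Real rule sets are much larger.
RULES: List[Tuple[str, Pattern[str]]] = [
    ("NAME", re.compile(r"\b(?:Dr|Mr|Mrs|Ms)\.? [A-Z][a-z]+")),
    ("DATE", re.compile(r"\b\d{2}/\d{2}/\d{4}\b")),
]


def deidentify(text: str) -> Tuple[str, Dict[str, str]]:
    """Mask identifying spans; return the masked text and a placeholder mapping."""
    mapping: Dict[str, str] = {}   # placeholder -> original value
    counters: Dict[str, int] = {}  # category -> number of placeholders issued

    for label, pattern in RULES:
        def mask(match: "re.Match[str]", label: str = label) -> str:
            original = match.group()
            # Reuse the same placeholder when the same value appears again.
            for placeholder, value in mapping.items():
                if value == original:
                    return placeholder
            counters[label] = counters.get(label, 0) + 1
            placeholder = f"[{label}_{counters[label]}]"
            mapping[placeholder] = original
            return placeholder

        text = pattern.sub(mask, text)
    return text, mapping


def reidentify(text: str, mapping: Dict[str, str]) -> str:
    """Restore the original values; meant for authorized staff only."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


report = "Dr. Martin saw Mr. Dupont on 12/03/2014."
masked, key = deidentify(report)
restored = reidentify(masked, key)
```

Here masked no longer contains the names or the date, and restored equals the original report. Keeping the mapping separate from the masked document is what makes the masking reversible for authorized users while the downstream processing sees only placeholders.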