Information extraction from medical records
Tomáš Houfek

Overview
̶ General approaches to information extraction (IE)
̶ IE of medical records
̶ What tools are used for IE in medical records
̶ IE of Czech medical records
̶ Problems of information extraction in medical records
̶ How to approach IE of Czech medical records
̶ Examples

General approaches
̶ Pre-processing
̶ Part-of-speech tagging
̶ Named entity recognition
̶ Syntactic analysis
̶ Semantic analysis
̶ Machine learning (ML)
̶ Regular expressions (regex)

IE in medical records
̶ Results of 67 studies on IE of medical records
̶ Majority conducted in the US, on US medical records (61%)
̶ Majority used IE for detecting cases in medical records (87%)
̶ Only a minority applied to hospital EMRs (13%)

Tools used for medical records
̶ Majority used rule-based NLP algorithms (67%)
̶ Keyword search (24%)
̶ Only 9% used ML, Bayesian, or hybrid approaches
̶ Several recurring IE systems: MedLEE (9 studies), cTAKES (5 studies), HITEx (4 studies)

Algorithm accuracy

IE of Czech medical records
̶ No cTAKES or any other existing tool is available
̶ Every task needs its own specific solution
̶ Little to no research

Problem of Czech medical records
̶ The text of Czech medical records is not typical Czech text
̶ Linguistic analysis cannot be used successfully
̶ Partial solution: three-phase pre-processing method (Třífázová metoda předběžného zpracování, 3PP)
  ̶ Tokenization
  ̶ Normalization
  ̶ Semantic annotation
Source: Extrakce informací z lékařských textů, Ing. Karel Zvára, PhD.
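The three phase names listed for 3PP (tokenization, normalization, semantic annotation) can be illustrated with a minimal Python sketch. This is not the method from the cited thesis; it only shows what each phase could look like on a short, made-up note. The abbreviation table, the label names, the patterns, and the sample text are all assumptions introduced for illustration.

```python
import re
import unicodedata

# Hypothetical abbreviation table; a real system would use a curated clinical lexicon.
ABBREVIATIONS = {"dg": "diagnosis", "op": "operation", "chemo": "chemotherapy"}

# Illustrative patterns only; real records contain many more variants.
PATTERNS = {
    "ICD10": re.compile(r"^[A-Z]\d{2}(\.\d{1,2})?$"),
    "TNM": re.compile(r"^[cp]?T(?:is|[0-4Xx])N[0-3Xx]M[01Xx]$"),
    "DATE": re.compile(r"^\d{1,2}\.\d{1,2}\.\d{4}$"),
}

# Dates and dotted codes are matched first so they stay single tokens.
TOKEN_RE = re.compile(r"\d{1,2}\.\d{1,2}\.\d{4}|[A-Z]\d{2}\.\d{1,2}|\w+|[^\w\s]")


def tokenize(text):
    """Phase 1: split raw text into tokens, keeping dates and codes intact."""
    return TOKEN_RE.findall(text)


def normalize(token):
    """Phase 2: unify the Unicode form and expand known abbreviations."""
    token = unicodedata.normalize("NFC", token)
    return ABBREVIATIONS.get(token.lower(), token)


def annotate(tokens):
    """Phase 3: attach a semantic label to each token ("O" means no label)."""
    annotated = []
    for token in tokens:
        label = "O"
        for name, pattern in PATTERNS.items():
            if pattern.match(token):
                label = name
                break
        annotated.append((token, label))
    return annotated


def preprocess(text):
    """Run the three phases in order."""
    return annotate([normalize(t) for t in tokenize(text)])


if __name__ == "__main__":
    # Made-up sample note, not taken from any real record.
    for token, label in preprocess("Dg: C50.9, pT2N1M0, op. 12.3.2019"):
        print(f"{token}\t{label}")
```

Keeping the phases as separate functions means the abbreviation table and the patterns can be replaced independently, which matters when every extraction task needs its own solution.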
Example of Czech medical record

IE of Czech medical records
̶ Available data
  ̶ 122 patients
  ̶ 17,216 records
  ̶ 16 types of records
  ̶ 192 subtypes of records

What data I want to extract
̶ Diagnosis date
̶ Diagnosis (ICD-10; MKN-10 in Czech)
̶ What the diagnosis was determined by
̶ Morphology
̶ Laterality
̶ Treatment over time
  ̶ Operations
  ̶ Chemotherapy
̶ TNM classification
̶ pTNM classification
̶ Localization of metastases
̶ Clinical stage
̶ Progression of disease
̶ State of the tumour over time
  ̶ Stage
  ̶ Size
  ̶ Recurrence

Example of tagged records

Example of a tagged record

Sources
̶ Extrakce informací z lékařských textů, Ing. Karel Zvára, PhD.
  http://hdl.handle.net/20.500.11956/94214
̶ Extracting information from the text of electronic medical records to improve case detection: a systematic review
  https://pubmed.ncbi.nlm.nih.gov/26911811/
̶ Data Mining from Free-Text Health Records: State of the Art, New Polish Corpus
  https://nlp.fi.muni.cz/raslan/2020/paper5.pdf