MARC View
LDR 00000nmm u2200205 4500
001 000000330265
005 20241029094315
008 181129s2018 ||| | | | eng d
020 ▼a 9780438255500
035 ▼a (MiAaPQ)AAI10842594
035 ▼a (MiAaPQ)sunyalb:12438
040 ▼a MiAaPQ ▼c MiAaPQ ▼d 248032
0491 ▼f DP
0820 ▼a 020
1001 ▼a Tao, Mingzhe.
24510 ▼a Clinical Information Extraction from Unstructured Free-Texts.
260 ▼a [S.l.] : ▼b State University of New York at Albany, ▼c 2018
260 1 ▼a Ann Arbor : ▼b ProQuest Dissertations & Theses, ▼c 2018
300 ▼a 142 p.
500 ▼a Source: Dissertation Abstracts International, Volume: 79-12(E), Section: A.
500 ▼a Adviser: Ozlem Uzuner.
5021 ▼a Thesis (Ph.D.)--State University of New York at Albany, 2018.
520 ▼a Information extraction (IE) is a fundamental component of natural language processing (NLP) that provides a deeper understanding of the texts. In the clinical domain, documents prepared by medical experts (e.g., discharge summaries, drug labels,
520 ▼a In the past decade, there have been many efforts focused on extraction of clinical information, i.e., clinical IE. In this dissertation, we present novel extensions to IE methods for automatically identifying clinically-relevant information from
520 ▼a (1) Knowledge representations that utilize real-valued word embeddings outperform their categorical counterparts. Categorical embeddings eliminate word-to-word distances in the high-dimensional space when converting words into discrete labels. R
520 ▼a (2) Introducing pseudo-sequences from unannotated data can improve extraction of entity categories that are sparsely represented in the training data. We use a supervised model trained on annotated data to predict pseudo-sequences from unannotat
520 ▼a (3) We can address lack of available annotated data through pseudo-data generation. We experiment with three different methods of pseudo-data generation. The first method is based on professional gazetteers. It replaces entities in the annotated
520 ▼a (4) A sequence labeling approach can benefit relation extraction. Sequence labeling can identify textual excerpts that contain entities, which enables subsequent extraction of sequences of related entities from these excerpts.
520 ▼a Cross-validated results across multiple clinical IE tasks show overall significant performance improvement from the knowledge representations, pseudo-sequences, pseudo-data, and relation extraction models we proposed in our study. The generalize
590 ▼a School code: 0668.
650 4 ▼a Information science.
650 4 ▼a Computer science.
650 4 ▼a Bioinformatics.
690 ▼a 0723
690 ▼a 0984
690 ▼a 0715
71020 ▼a State University of New York at Albany. ▼b Information Science.
7730 ▼t Dissertation Abstracts International ▼g 79-12A(E).
773 ▼t Dissertation Abstracts International
790 ▼a 0668
791 ▼a Ph.D.
792 ▼a 2018
793 ▼a English
85640 ▼u http://www.riss.kr/pdu/ddodLink.do?id=T14999857 ▼n KERIS
980 ▼a 201812 ▼f 2019
990 ▼a Administrator ▼b Administrator