Institutional Repository [SANDBOX]
Technical University of Crete

Semantic similarity computation and word sense induction using hidden sets multidimensional scaling

Athanasopoulou Georgia

URI: http://purl.tuc.gr/dl/dias/17E8583A-E9FD-47A3-9217-D6D3CE6A642D
Identifier: https://doi.org/10.26233/heallink.tuc.66075
Language: en
Extent: 1.2 megabytes
Title: Semantic similarity computation and word sense induction using hidden sets multidimensional scaling
Creator: Athanasopoulou Georgia (Αθανασοπουλου Γεωργια)
Contributor [Co-Supervisor]: Potamianos Alexandros (Ποταμιανος Αλεξανδρος)
Contributor [Thesis Supervisor]: Koutsakis Polychronis (Κουτσακης Πολυχρονης)
Contributor [Committee Member]: Liavas Athanasios (Λιαβας Αθανασιος)
Publisher: Technical University of Crete
Academic Unit: Technical University of Crete::School of Electrical and Computer Engineering
Content Summary: In this thesis, motivated by evidence from psycholinguistics and cognition, we propose an unsupervised, language-agnostic Distributional Semantic Model (DSM) that utilizes web-harvested data for the problem of semantic similarity estimation. Semantic similarity can be applied to numerous Natural Language Processing (NLP) tasks, such as affective text analysis and paraphrasing. In the first part of the thesis, we present the construction of typical DSMs following the well-established Vector Space Model. More specifically, we describe the creation of corpora by harvesting web documents with a query-based approach, as well as the state-of-the-art DSMs used to compute semantic similarity from these corpora. Next, we propose a novel hierarchical DSM that is inspired by evidence from psycholinguistics and cognition and consists of low-dimensional manifolds built on semantic neighborhoods. Each manifold is sparsely encoded and mapped into a low-dimensional space. Global operations are decomposed into local operations in multiple sub-spaces, and the results of these local operations are fused to produce semantic relatedness estimates. Manifold DSMs are constructed starting from a pairwise word-level semantic similarity matrix. The proposed model is evaluated against state-of-the-art and baseline DSMs on the semantic similarity estimation task, where the similarity metrics are compared with human similarity ratings. The proposed model significantly improves performance compared to the baseline approaches on the task of semantic similarity estimation between words. Furthermore, the proposed model is evaluated on a taxonomy task, achieving state-of-the-art results. Finally, motivated by evidence that concepts are cognitively organized according to their degree of concreteness, we present the performance of the proposed DSM for abstract and concrete nouns.
Type of Item: Master Thesis
License: http://creativecommons.org/licenses/by/4.0/
Date of Item: 2016-07-20
Date of Publication: 2016
Subject: Natural language processing
Subject: NLP
Subject: Semantic similarity
Bibliographic Citation: Georgia Athanasopoulou, "Semantic similarity computation and word sense induction using hidden sets multidimensional scaling", Master Thesis, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2016
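
The summary above describes building local low-dimensional spaces on semantic neighborhoods, starting from a pairwise word similarity matrix, and fusing the local results into one estimate. The following is only a rough sketch of that general idea, not the method of the thesis: it substitutes plain classical multidimensional scaling for the thesis's own projection, omits the sparse encoding step, and fuses by simple averaging. The function names, the neighborhood size, the target dimension, and the toy similarity values are all assumptions made for illustration.

```python
import numpy as np

def classical_mds(dissim, dim):
    """Embed points in `dim` dimensions from a symmetric dissimilarity matrix."""
    n = dissim.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dissim ** 2) @ centering  # double centering
    eigvals, eigvecs = np.linalg.eigh(gram)              # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:dim]                # largest `dim` of them
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))    # drop tiny negatives
    return eigvecs[:, top] * scale

def neighborhood(sim, word, size):
    """Indices of `word` followed by its `size` most similar other words."""
    order = np.argsort(-sim[word], kind="stable")
    order = order[order != word][:size]
    return np.concatenate(([word], order))

def fused_similarity(sim, i, j, size=2, dim=2):
    """Average the local-space similarity of words i and j over every
    neighborhood that contains both; fall back to the global value if none does."""
    scores = []
    for center in range(sim.shape[0]):
        members = neighborhood(sim, center, size)
        if i in members and j in members:
            local = sim[np.ix_(members, members)]
            coords = classical_mds(1.0 - local, dim)     # dissimilarity = 1 - similarity
            a = coords[np.where(members == i)[0][0]]
            b = coords[np.where(members == j)[0][0]]
            scores.append(1.0 - np.linalg.norm(a - b))   # back to a similarity score
    return float(np.mean(scores)) if scores else float(sim[i, j])

# Toy, hand-made similarity matrix (hypothetical values, not thesis data).
words = ["car", "automobile", "truck", "banana", "apple"]
sim = np.array([
    [1.0, 0.9, 0.7, 0.1, 0.1],
    [0.9, 1.0, 0.6, 0.1, 0.1],
    [0.7, 0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.1, 0.8, 1.0],
])

print(fused_similarity(sim, 0, 1))  # car vs automobile
print(fused_similarity(sim, 0, 3))  # car vs banana
```

In a setting like the one the summary describes, such fused scores would then be compared with human similarity ratings, for example through a correlation coefficient.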