Advisor(s)
Abstract(s)
Tom Gruber (1993) defines an ontology as "an explicit specification of a conceptualization."
Due to the enormous quantity of information available, there is a growing number
of applications that perform tasks where lexical-semantic resources are needed,
like Information Retrieval, intelligent search or machine translation. This shows that
Natural Language Processing is becoming more dependent on semantic information.
One of the main motivations in ontology building is the possibility of knowledge
sharing and reuse across different applications. The starting point is to fix a particular
domain (like medicine), which is expected to be the base of domain knowledge for
a variety of applications. This is a difficult task as the domain knowledge strongly
depends on the particular task at hand.
This work presents an approach to ontology learning. We selected the medical
domain so that we would have a baseline against which to compare and evaluate
the resulting ontology.
In our approach, we use several techniques, such as asymmetric association measures,
clustering algorithms and the TextRank algorithm, to obtain relations between a set
of terms. The terms are first grouped by degree of generality: a clustering algorithm
is applied to the set of terms, with the confidence measure providing the values of
the similarity matrix, and the resulting groups are the generality clusters. Those
clusters are then submitted to a clustering algorithm again, this time with Symmetric
Conditional Probability values in the similarity matrix, to obtain domain clusters
within the generality clusters.
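As an illustration of the two kinds of similarity values mentioned above, the following is a minimal sketch rather than the method used in this work. It assumes that probabilities are estimated from document-level co-occurrence counts, that confidence is the conditional probability P(y | x) as in association-rule mining, and that Symmetric Conditional Probability is P(x, y)^2 / (P(x) P(y)). The function name `association_matrices` and the toy corpus are hypothetical.

```python
from itertools import combinations


def association_matrices(docs, terms):
    """Return (confidence, scp) dictionaries keyed by ordered (x, y) term pairs.

    Assumption: probabilities are relative document frequencies.
    """
    n = len(docs)
    doc_sets = [set(d) for d in docs]
    # Number of documents containing each term.
    df = {t: sum(t in d for d in doc_sets) for t in terms}
    confidence, scp = {}, {}
    for x, y in combinations(terms, 2):
        # Number of documents containing both terms.
        joint = sum(x in d and y in d for d in doc_sets)
        p_xy = joint / n
        for a, b in ((x, y), (y, x)):
            p_a, p_b = df[a] / n, df[b] / n
            # Asymmetric: confidence(a -> b) = P(b | a).
            confidence[(a, b)] = p_xy / p_a if p_a else 0.0
            # Symmetric: SCP(a, b) = P(a, b)^2 / (P(a) * P(b)).
            scp[(a, b)] = p_xy ** 2 / (p_a * p_b) if p_a and p_b else 0.0
    return confidence, scp


# Toy corpus (hypothetical): each document is a list of terms.
docs = [
    ["disease", "infection", "pneumonia"],
    ["disease", "infection"],
    ["disease", "fracture"],
    ["disease"],
]
terms = ["disease", "infection", "pneumonia"]
confidence, scp = association_matrices(docs, terms)
```

In this toy corpus, the confidence from "infection" to "disease" is higher than in the opposite direction, which is the kind of asymmetry that can signal that "disease" is the more general term, while the SCP value is the same in both directions.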
In the future, this ontology may be used in the acquisition of Lexical Chains for Text
Summarization, as well as in other Natural Language Processing applications.
Description
Keywords
Computer science - Information sciences - Ontology Semantic web - Ontology UMLS (Unified Medical Language System)