Conference papers

Contextual ontological concepts extraction

Abstract : Ontologies provide a common layer that plays a major role in supporting information exchange and sharing. In this paper, we focus on extracting ontological concepts from HTML documents. We propose an unsupervised hierarchical clustering algorithm, named “Contextual Ontological Concept Extraction” (COCE), which applies a partitioning algorithm incrementally and is guided by a structural context. This context exploits the HTML structure and the location of words to select the semantically closest co-occurrents of each word and to improve word weighting. Guided by this context definition, we perform an incremental clustering that refines the word context of each cluster in order to extract semantic concepts. The COCE algorithm can be executed either automatically or interactively. We evaluate the COCE algorithm on French documents related to tourism. Our results show that our context-based algorithm improves the conceptual quality of the clusters.
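
The general idea in the abstract (weighting co-occurrences by HTML structure, then applying a partitioning step incrementally) can be illustrated with a minimal Python sketch. This is not the COCE algorithm itself, whose details are not given in this record: the tag weights in `TAG_WEIGHTS`, the choice of a deterministic 2-means split with cosine similarity, the `max_size` stopping rule, and all function names are assumptions made for illustration only.

```python
from collections import defaultdict
import math

# Hypothetical weights: words sharing a heading are assumed to be more
# closely related than words sharing a plain paragraph.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "p": 1.0}

def context_vectors(blocks):
    """blocks: list of (tag, [words]). Returns {word: {cooccurrent: weight}}.
    Each word is also counted in its own context (an assumption)."""
    vectors = defaultdict(lambda: defaultdict(float))
    for tag, words in blocks:
        weight = TAG_WEIGHTS.get(tag, 1.0)
        unique = set(words)
        for a in unique:
            for b in unique:
                vectors[a][b] += weight
    return {word: dict(ctx) for word, ctx in vectors.items()}

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def centroid(words, vectors):
    total = defaultdict(float)
    for w in words:
        for k, x in vectors[w].items():
            total[k] += x / len(words)
    return dict(total)

def two_means(words, vectors, iterations=10):
    """Partition words into two groups with a small deterministic 2-means."""
    words = sorted(words)
    seed_a = words[0]
    # Second seed: the word least similar to the first one.
    seed_b = min(words[1:], key=lambda w: cosine(vectors[seed_a], vectors[w]))
    centers = [vectors[seed_a], vectors[seed_b]]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for w in words:
            sims = [cosine(vectors[w], c) for c in centers]
            groups[0 if sims[0] >= sims[1] else 1].append(w)
        if not groups[0] or not groups[1]:
            break
        centers = [centroid(g, vectors) for g in groups]
    return groups

def incremental_clustering(words, vectors, max_size=3):
    """Re-apply the partitioning step until every cluster is small enough."""
    if len(words) <= max(max_size, 1):
        return [sorted(words)]
    left, right = two_means(words, vectors)
    if not left or not right:
        return [sorted(words)]
    return (incremental_clustering(left, vectors, max_size)
            + incremental_clustering(right, vectors, max_size))

# Toy input standing in for parsed HTML blocks (invented for this sketch).
blocks = [("h1", ["hotel", "room", "bed"]), ("p", ["beach", "sea", "sand"])]
vectors = context_vectors(blocks)
clusters = incremental_clustering(list(vectors), vectors, max_size=3)
print(clusters)
```

In an interactive variant, the size-based stopping rule could be replaced by asking a user whether a given cluster should be split further; that is one possible reading of the automatic versus interactive modes, not a description of the authors' implementation.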

https://hal-supelec.archives-ouvertes.fr/hal-00259905
Contributor : Evelyne Faivre
Submitted on : Friday, February 29, 2008 - 4:27:43 PM
Last modification on : Tuesday, June 30, 2020 - 4:04:07 PM

Identifiers

  • HAL Id : hal-00259905, version 1

Citation

Lobna Karoui, Nacéra Bennacer Seghouani, Marie-Aude Aufaure. Contextual ontological concepts extraction. Ninth International Conference on Discovery Science (DS-2006) and in Lecture Notes in Artificial Intelligence, Oct 2006, Barcelona, Spain. pp.306-310. ⟨hal-00259905⟩
