Conference papers

A New Extraction Concepts based on Contextual Clustering

Abstract : Ontologies provide a common layer that plays a major role in information exchange and supports sharing. The proliferation of ontologies relies strongly on automating their construction, integration and deployment. In this paper, we present an integrated framework that combines complementary dimensions to drive the (semi-)automatic acquisition of conceptual knowledge from HTML Web pages. Our approach takes advantage of the structural features of HTML documents and of word location to identify the appropriate context for each term. This context definition improves word weighting, the selection of semantically closer co-occurrents, and the relevance of the extracted ontological concepts. We use an unsupervised clustering method to generate groups of terms; the chosen method relies on an incremental, user-driven quality evaluation process. After a theoretical presentation of our structural context definition, we summarize the most significant results obtained by applying our method to a corpus dedicated to the tourism domain. These first results show how defining an appropriate context improves the relevance of the extracted concepts.
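
To make the idea of structure-aware word weighting concrete, the following is a minimal illustrative sketch, not the authors' method. It assumes that a word's weight depends on the most prominent HTML element enclosing it. The tag weights, the class name `StructuralWeigher` and the function `weigh_terms` are hypothetical choices made for this example; the abstract does not specify any of them.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re

# Hypothetical weights per enclosing tag (assumption: not taken from the paper).
TAG_WEIGHTS = {"title": 3.0, "h1": 2.5, "h2": 2.0, "b": 1.5, "p": 1.0}


class StructuralWeigher(HTMLParser):
    """Accumulates a weight per word based on the tags that enclose it."""

    def __init__(self):
        super().__init__()
        self.stack = []                     # currently open tags
        self.weights = defaultdict(float)   # word -> accumulated weight

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        # Use the highest weight among all enclosing tags.
        weight = max((TAG_WEIGHTS.get(t, 0.0) for t in self.stack), default=0.0)
        if weight == 0.0:
            return  # text outside any weighted element is ignored
        for word in re.findall(r"[a-z]+", data.lower()):
            self.weights[word] += weight


def weigh_terms(html):
    """Return a dict mapping each word to its structure-aware weight."""
    parser = StructuralWeigher()
    parser.feed(html)
    parser.close()
    return dict(parser.weights)


page = (
    "<html><head><title>Hotel guide</title></head>"
    "<body><h1>Hotel</h1><p>A hotel near the beach</p></body></html>"
)
weights = weigh_terms(page)
```

In this toy page, "hotel" appears in the title, a heading and a paragraph, so it accumulates more weight than "beach", which appears only in the paragraph. Weighted terms of this kind could then be passed to any unsupervised clustering step.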

https://hal-supelec.archives-ouvertes.fr/hal-00259881
Contributor : Evelyne Faivre
Submitted on : Friday, February 29, 2008 - 4:03:08 PM
Last modification on : Tuesday, June 30, 2020 - 4:04:07 PM

Identifiers

  • HAL Id : hal-00259881, version 1

Citation

Lobna Karoui, Marie-Aude Aufaure, Nacéra Bennacer Seghouani. A New Extraction Concepts based on Contextual Clustering. IEEE International Conference on Computational Intelligence for Modelling, Control and Automation CIMCA 2006 Jointly with International Conference on Intelligent Agents, Web Technologies and Internet Commerce IAWTIC 2006, Nov 2006, Sydney, Australia. pp.91-96. ⟨hal-00259881⟩
