Evaluating Tree Pattern Similarity for Content-based Routing Systems - INRIA - Institut National de Recherche en Informatique et en Automatique
Report (Research Report) Year: 2006

Evaluating Tree Pattern Similarity for Content-based Routing Systems

Abstract

With the advent of XML as the de facto language for data interchange, scalable distribution of data to large populations of consumers remains an important challenge. Content-based publish/subscribe systems offer a convenient abstraction for data producers and consumers, as most of the complexity related to addressing and routing is encapsulated within the network infrastructure. Data consumers typically specify their subscriptions using some XML pattern specification language (e.g., XPath), while producers publish content without prior knowledge of the recipients, if any. A novel approach to content-based routing consists of organizing consumers with similar interests into peer-to-peer semantic communities inside which XML documents are propagated. In order to build semantic communities and connect peers that share common interests, one needs to evaluate the similarity between their subscriptions. In this paper, we specifically address this problem and propose novel algorithms to compute the similarity of seemingly unrelated tree patterns by taking advantage of information derived from the XML document types, such as valid combinations of elements, or conjunctions and disjunctions on their occurrence. These results are of interest in their own right and can prove useful in other domains, such as approximate XML queries involving tree patterns. Results from a prototype implementation validate the effectiveness of our approach.

Domains

Other [cs.OH]
Main file
RR-5891.pdf (317.51 KB)

Dates and versions

inria-00071377, version 1 (23-05-2006)

Identifiers

  • HAL Id: inria-00071377, version 1

Cite

Raphaël Chand, Pascal Felber. Evaluating Tree Pattern Similarity for Content-based Routing Systems. [Research Report] RR-5891, INRIA. 2006. ⟨inria-00071377⟩
98 views
1348 downloads
