International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 60 - Number 5
Year of Publication: 2012 |
Authors: Alsayed Algergawy |
10.5120/9685-4127 |
Alsayed Algergawy. Feature-based Clustering of Web Data Sources. International Journal of Computer Applications. 60, 5 (December 2012), 1-4. DOI=10.5120/9685-4127
The proliferation of web data sources creates a growing need to integrate them. To facilitate integration, a pre-analysis step is required that classifies and groups data sources into their correct domains. In this paper, we propose a feature-based approach that clusters web data sources without any human intervention, using only features extracted from the source schemas. In particular, we make use of both linguistic and structural schema features. We experimentally demonstrate the effectiveness of the proposed approach in terms of both clustering quality and runtime.
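The abstract does not specify which features or which clustering algorithm the paper uses, so the following is only an illustrative sketch of the general idea, not the author's method. It assumes a hypothetical schema representation (a dict mapping each element name to its parent name), takes name tokens as the linguistic features and parent-child pairs as the structural features, compares schemas with a weighted Jaccard similarity, and groups them by single-link clustering at an arbitrary threshold. All function names, the weight `w`, and the threshold are assumptions made for illustration.

```python
import re
from itertools import combinations


def name_tokens(name):
    """Split an element name such as 'bookTitle' or 'book_title' into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", name)
    return {p.lower() for p in parts}


def features(schema):
    """Return (linguistic, structural) feature sets for one schema.

    Assumed input: {element_name: parent_name or None}.
    Linguistic features: tokens of all element names.
    Structural features: lowercase (parent, child) pairs.
    """
    tokens, edges = set(), set()
    for child, parent in schema.items():
        tokens |= name_tokens(child)
        if parent is not None:
            edges.add((parent.lower(), child.lower()))
    return tokens, edges


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(f1, f2, w=0.5):
    """Weighted mix of linguistic and structural similarity (w is an assumed weight)."""
    return w * jaccard(f1[0], f2[0]) + (1 - w) * jaccard(f1[1], f2[1])


def cluster(schemas, threshold=0.3):
    """Single-link clustering: merge any two schemas whose similarity reaches the threshold."""
    names = list(schemas)
    feats = {n: features(schemas[n]) for n in names}
    root = {n: n for n in names}  # union-find forest

    def find(n):
        while root[n] != n:
            root[n] = root[root[n]]  # path halving
            n = root[n]
        return n

    for a, b in combinations(names, 2):
        if similarity(feats[a], feats[b]) >= threshold:
            root[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Toy, made-up schemas used only to exercise the sketch.
schemas = {
    "books_a": {"book": None, "bookTitle": "book", "author": "book", "price": "book"},
    "books_b": {"book": None, "title": "book", "authorName": "book", "price": "book"},
    "flights": {"flight": None, "departure": "flight", "arrival": "flight", "airline": "flight"},
}
print(cluster(schemas))
```

On this toy input the two book schemas share most name tokens and one parent-child pair, so they fall into one group, while the flight schema shares nothing with them and stays on its own. A real system would likely need richer linguistic matching (synonyms, abbreviations) and a more careful choice of threshold than this sketch assumes.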