田锋

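Email:
Affiliation: School of Computer Science and Technology
Position: Deputy Director, Faculty of Electronic and Information Engineering
Education: Doctoral graduate
Office location:
Gender: Male
Contact:
Degree: Doctorate
Title: Professor
Main appointment: Executive Deputy Director, National Engineering Research Center for Visual Information and Applications
Other appointments: Shaanxi Provincial Key Laboratory of Big Data Knowledge Engineering
Doctoral supervisor: Yes
Master's supervisor: Yes
Discipline: Computer Science and Technology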
Publications
Approximate Top-K Answering under Uncertain Schema Mappings
Posted: 2025-04-30

Paper title: Approximate Top-K Answering under Uncertain Schema Mappings

Journal: Data & Knowledge Engineering

Abstract: Data integration techniques provide a communication bridge between isolated sources and offer a platform for information exchange. When the schemas of heterogeneous data sources map to the centralized schema in a mediated data integration system, or a source schema maps to a target schema in a peer-to-peer system, multiple schema mappings may exist because of ambiguities in attribute matching. These ambiguous schema mappings lead to uncertainty in query answering, and users are frequently interested only in retrieving the k best answers (top-k), those with the highest probabilities. Retrieving the top-k answers efficiently has become a research issue. For uncertain queries, two semantics, by-table and by-tuple, have been developed to capture top-k answers based on the schema mapping probabilities. However, although the existing algorithms support certain features to capture the accurate top-k answers and avoid accessing all data from the sources, they cannot effectively reduce the number of processed tuples in most cases. In this paper, new algorithms based on histogram approximation and heuristics are proposed to efficiently identify the top-k answers for data integration systems under uncertain schema mappings. In the experiments, the Histogram algorithm in the by-table semantics and the expected approach in the by-tuple semantics are shown to significantly reduce the number of processed tuples while maintaining high accuracy with the estimated probabilistic confidence.
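To make the by-table semantics mentioned in the abstract concrete, the following is a minimal sketch of a naive exact baseline, not the paper's Histogram algorithm or its heuristic. It assumes a single-attribute projection query, and that under by-table semantics the probability of an answer is the sum of the probabilities of the mappings that produce it. The function name `top_k_by_table`, the input layout, and the sample attribute names are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
import heapq


def top_k_by_table(rows, mappings, k):
    """Naive exact top-k for a one-attribute projection, by-table semantics.

    rows:     list of dicts, one per source tuple (hypothetical layout).
    mappings: list of (source_attr, probability) pairs; each pair says the
              queried mediated attribute maps to `source_attr` with that
              probability. Probabilities are assumed to sum to 1.
    k:        number of answers to return.
    """
    scores = defaultdict(float)
    for source_attr, prob in mappings:
        # By-table: one mapping applies to the whole table, so collect the
        # distinct values this mapping would return as answers.
        answers = {row[source_attr] for row in rows if source_attr in row}
        for value in answers:
            # An answer's probability accumulates over every mapping
            # under which it appears.
            scores[value] += prob
    # This baseline scans every tuple under every mapping; avoiding that
    # full scan is exactly what the approximate algorithms aim for.
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


rows = [
    {"office_phone": "111", "home_phone": "222"},
    {"office_phone": "333", "home_phone": "111"},
]
mappings = [("office_phone", 0.6), ("home_phone", 0.4)]
result = top_k_by_table(rows, mappings, k=2)
```

Here the value "111" is an answer under both candidate mappings, so it ranks first, followed by "333", which appears only under the more probable mapping. The sketch touches every tuple once per mapping, which is the cost the abstract says the proposed algorithms reduce.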

Co-authors: 李隆庄, 田锋, et al.

Volume: 118

Pages: 71-91

Translated work:

Publication date: 2018-12-09