Select Lab Publications


Using Probabilistic Models for Data Management in Acquisitional Environments (2005)

By: Amol Deshpande, Carlos Guestrin, and Sam Madden

Abstract: Traditional database systems, particularly those focused on capturing and managing data from the real world, are poorly equipped to deal with the noise, loss, and uncertainty in data. We discuss a suite of techniques based on probabilistic models that are designed to allow databases to tolerate noise and loss. These techniques are based on exploiting correlations to predict missing values and identify outliers. Interestingly, correlations also provide a way to give approximate answers to users at a significantly lower cost and enable a range of new types of queries over the correlation structure itself. We illustrate a host of applications for our new techniques and queries, ranging from sensor networks to network monitoring to data stream management. We also present a unified architecture for integrating such models into database systems, focusing in particular on acquisitional systems where the cost of capturing data (e.g., from sensors) is itself a significant part of the query processing cost.

Download Information
Amol Deshpande, Carlos Guestrin, and Sam Madden (2005). "Using Probabilistic Models for Data Management in Acquisitional Environments." 2nd Biennial Conference on Innovative Data Systems Research (CIDR).
BibTeX citation

@inproceedings{Deshpande+al:cidr2005bbq,
author = {Amol Deshpande and Carlos Guestrin and Sam Madden},
title = {Using Probabilistic Models for Data Management in Acquisitional Environments},
booktitle = {2nd Biennial Conference on Innovative Data Systems Research (CIDR)},
year = {2005},
address = {Asilomar},
month = {January},
wwwfilebase = {cidr2005-deshpande-guestrin-madden},
wwwtopic = {Sensor Networks}
}


