Collective Knowledge Bases

The production and use of knowledge is a collective enterprise, and communication between its participants is the bottleneck. The costs of this bottleneck include duplicated work, misdirected work, slower progress, and suboptimal decisions made for lack of knowledge that is in fact available. The Internet has greatly reduced the physical barriers to communication and coordination; our focus is on overcoming the intellectual ones. In particular, we are developing methods to improve the composability of knowledge by semi-automatically learning to translate between the vocabularies of different sources. This can potentially lead to an exponential increase in the number of questions answerable by a collective knowledge base. We are also developing methods to automatically learn the quality of knowledge sources and elements, to take proper advantage of sources of widely variable quality, to automatically resolve inconsistencies between sources, and to automatically give feedback, credit and guidance to contributors, so that a collective knowledge base can grow and improve harmoniously without centralized control. We are beginning to implement these ideas in BibServ, a collective bibliography repository.
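As a rough illustration of what learning source quality and resolving inconsistencies could look like, the sketch below alternates between two steps: picking, for each disputed item, the value backed by the most total source quality, and then re-estimating each source's quality from how often it agrees with the picked values. This is a generic quality-weighted voting scheme, not the method used in the project or in BibServ. The function name `resolve_conflicts`, the `(source, item, value)` claim format, the smoothing constants, and the sample bibliography data are all assumptions made for the example.

```python
from collections import defaultdict


def resolve_conflicts(claims, rounds=10):
    """Hypothetical sketch: claims is a list of (source, item, value) triples.

    Returns (chosen, quality), where chosen maps each item to the value
    selected for it and quality maps each source to a score in (0, 1).
    """
    sources = {source for source, _, _ in claims}
    quality = {source: 0.5 for source in sources}  # uninformative start
    chosen = {}

    for _ in range(rounds):
        # Step 1: each source votes for its value, weighted by its quality.
        votes = defaultdict(lambda: defaultdict(float))
        for source, item, value in claims:
            votes[item][value] += quality[source]
        # Iterating over sorted values makes tie-breaking deterministic.
        chosen = {
            item: max(sorted(values), key=values.get)
            for item, values in votes.items()
        }

        # Step 2: quality is the smoothed fraction of a source's claims
        # that agree with the currently chosen values.
        agree = defaultdict(int)
        total = defaultdict(int)
        for source, item, value in claims:
            total[source] += 1
            agree[source] += int(chosen[item] == value)
        quality = {s: (agree[s] + 1) / (total[s] + 2) for s in sources}

    return chosen, quality


# Made-up bibliography claims from three hypothetical contributors.
claims = [
    ("a", "paper1.year", "2003"),
    ("b", "paper1.year", "2003"),
    ("c", "paper1.year", "2004"),
    ("a", "paper2.venue", "ICML"),
    ("b", "paper2.venue", "ICML"),
    ("c", "paper2.venue", "KDD"),
    ("a", "paper3.year", "2002"),
    ("c", "paper3.year", "2001"),
]

chosen, quality = resolve_conflicts(claims)
for item in sorted(chosen):
    print(item, chosen[item])
for source in sorted(quality):
    print(source, round(quality[source], 2))
```

In this toy data, sources `a` and `c` disagree about `paper3.year` with no third vote to break the tie. Because `c` is outvoted on the other two items, its estimated quality drops after the first round, and the tie is then settled in favor of `a`. The quality scores could also serve as a basis for the feedback and credit given to contributors.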

Publications

Object identification with attribute-mediated dependencies
Parag Singla and Pedro Domingos
European Conference on Principles and Practice of Knowledge Discovery in Databases, 2005.
Best Paper Award

iMAP: Discovering Complex Semantic Matches between Database Schemas
Robin Dhamankar, Yoonkyong Lee, AnHai Doan, Alon Halevy and Pedro Domingos
ACM SIGMOD International Conference on Management of Data, 2004.

Trust management for the Semantic Web
Matt Richardson, Rakesh Agrawal and Pedro Domingos
International Semantic Web Conference, 2003.

Building large knowledge bases by mass collaboration
Matt Richardson and Pedro Domingos
International Conference on Knowledge Capture, 2003.

Learning with knowledge from multiple experts
Matt Richardson and Pedro Domingos
International Conference on Machine Learning, 2003.

Learning to Map between Ontologies on the Semantic Web
AnHai Doan, Jayant Madhavan, Pedro Domingos and Alon Halevy
International World Wide Web Conference, 2002.

Reconciling Schemas of Disparate Data Sources: A Machine-Learning Approach
AnHai Doan, Pedro Domingos and Alon Halevy
ACM SIGMOD International Conference on Management of Data, 2001.