Clustering: Science or Art? Towards Principled Approaches
A NIPS 2009 Workshop, December 11, 2009
The Hilton Whistler Resort & Spa
Shai Ben-David | Ulrike von Luxburg | Avrim Blum | Isabelle Guyon | Robert C. Williamson | Reza Bosagh Zadeh | Margareta Ackerman
Clustering is one of the most widely used techniques for exploratory data analysis. In the past five decades, many clustering algorithms have been developed and applied to a wide range of practical problems. There has also been very exciting theoretical work, proving guarantees for algorithms and developing new frameworks for analysis.
Yet in many ways we are only beginning to understand some of the most basic issues in clustering. While there have been some remarkable successes, we believe more is possible. In particular, work addressing issues that are independent of any specific clustering algorithm, objective function, or data generative model is still in its infancy.
In his famous Turing Award lecture, Donald Knuth said of computer programming: "It is clearly an art, but many feel that a science is possible and desirable." In the case of clustering, we believe that a better and deeper science than what we currently offer is possible and highly desirable.
Goals of the Workshop
This workshop aims to initiate a dialog between theoreticians and practitioners and to bridge the theory-practice gap in this area. The workshop will be built around three main questions:
The workshop will also serve as a follow-up meeting to the NIPS 2005 “Theoretical Foundations of Clustering” workshop, a venue for the different research groups working on these issues to take stock, exchange viewpoints, and discuss the next challenges in this ambitious quest for theoretical foundations of clustering.
8:15 - 9:15 Non-standard Approaches
9:15 - 9:30 Coffee Break
9:30 - 10:30 Evaluating clustering: the human factor and particular applications
10:30 - 11:00 Deepayan Chakrabarti (invited talk) - Clustering applications at Yahoo!
4:30 - 4:45 Coffee Break
4:45 - 5:45 Information Theoretic Approaches
5:45 - 6:30 Panel discussion
What is a Cluster? Perspectives from Game Theory
Clustering with Prior Information
Finding a Better k: A psychophysical investigation of clustering
Single Data, Multiple Clusterings
An Empirical Study of Cluster Evaluation Metrics using Flow Cytometry Data
Some ideas for formalizing clustering schemes
A Characterization of Linkage-Based Clustering: An Extended Abstract
Information theoretic model selection in clustering
A PAC-Bayesian Approach to Formulation of Clustering Objectives
These papers will form a basis for discussion sessions during the workshop.
Organizers
Shai Ben-David is a CS professor at the University of Waterloo, Canada.
Avrim Blum is a professor of CS at Carnegie Mellon University.
Ulrike von Luxburg is a Senior Research Scientist at the Max Planck Institute in Tübingen, Germany.
Isabelle Guyon is an independent engineering consultant, working from California.
Reza Bosagh Zadeh is a graduate student at Carnegie Mellon University.
Margareta Ackerman is a graduate student at the University of Waterloo.
Robert C. Williamson is the Scientific Director of NICTA and a Professor in the Research School of Information Sciences and Engineering at the Australian National University.
This workshop is supported by the PASCAL Network of Excellence.