Reading List 2
EE 380L - Data Mining, Spring 2006
Notices
Paper Selection Policy
- Students must select and present one of the following papers.
- Papers are allocated on a first-come, first-served basis.
- To select a paper, email your selection to Suju Rajan.
- Scheduling of your talk depends on the topic you have chosen. Ideally it will take place in the "Student Paper Presentations" slot right after that topic has been covered in class.
A. Exploratory Data Analysis
- "FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets," C. Faloutsos and K.-I. Lin, SIGMOD, 1995
Selected: Mark Offutt and Kent Knox
- "A global geometric framework for nonlinear dimensionality reduction," J. B. Tenenbaum, V. de Silva and J. C. Langford, Science, 2000
Selected: Maytal Dahan and Eric Roberts
B. Clustering/Segmentation
- "A Framework for Clustering Evolving Data Streams," C. Aggarwal, J. Han, J. Wang, and P. S. Yu, VLDB, 2003
Selected: Louis Orenstein and Vincci Carel
- "High-Performance Clustering of Streams and Large Data Sets," L. O'Callaghan, N. Mishra, A. Meyerson, and S. Guha, ICDM, 2002
Selected: Monica Moncrief and Veena Tiwari
- "A Probabilistic Framework for Semi-Supervised Clustering," Sugato Basu, Mikhail Bilenko, and Raymond J. Mooney, KDD, 2004
- "Segmenting Customer Transactions Using a Pattern-Based Clustering Approach," Y. Yang and B. Padmanabhan, ICDM, 2003
Selected: Cesar Santiago and Sandy Kao
- "Stability-Based Validation of Clustering Solutions," Tilman Lange, Volker Roth, Mikio L. Braun and Joachim M. Buhmann, Neural Computation, 2003
- "Combining partitions by probabilistic label aggregation," Tilman Lange and Joachim M. Buhmann, KDD, 2003
C. Association Rules
- "Mining Significant Associations in Large Scale Text Corpora," P. Raghavan and P. Tsaparas, ICDM, 2002
Selected: Kashif Siddiqui and Irfan Shahdad 3/10/06
- "CloseGraph: Mining Closed Frequent Graph Patterns," X. Yan and J. Han, KDD, 2003
D. Classification and Prediction
- "Boosting the margin: A new explanation for the effectiveness of voting methods," Robert E. Schapire, Yoav Freund, Peter Bartlett and Wee Sun Lee, Annals of Statistics, 1998
Selected: Bradley Harrington 3/11/06
- "Error-Correcting Output Codes: A General Method for Improving Multiclass Inductive Learning Programs," Thomas G. Dietterich and Ghulum Bakiri, Proceedings of the Ninth AAAI National Conference on Artificial Intelligence
Selected: Onome Ufomata and Zulfiqar Haider 3/11/06
- "An Iterative Method for Multi-class Cost-sensitive Learning," N. Abe, B. Zadrozny and J. Langford, KDD, 2004
- "Segmented Regression Estimators for Massive Data Sets," Ramesh Natarajan and Edwin Pednault, SDM, 2002
- "Cross-Training: Learning Probabilistic Mappings Between Topics," Sunita Sarawagi, Soumen Chakrabarti, and Shantanu Godbole, ACM SIGKDD, 2003
- "Learning Probabilistic Models of Relational Structure," L. Getoor, N. Friedman, D. Koller and B. Taskar, Journal of Machine Learning Research (JMLR), 2002
E. Web Mining and Information Retrieval
- "Variable Latent Semantic Indexing," A. Dasgupta, R. Kumar, P. Raghavan and A. Tomkins, KDD, 2005
Selected: Matt Ray and Yang Yu 5/11/06
- "Deriving marketing intelligence from online discussion," Natalie Glance, Matthew Hurst, Kamal Nigam, Matthew Siegler, Robert Stockton, Takashi Tomokiyo, KDD, 2005
Selected: Chip Killmar and Russell Glasser 5/11/06
F. Miscellaneous
- "Random Data Perturbation Techniques and Privacy Preserving Data Mining," Hillol Kargupta, Souptik Datta, Qi Wang, and Krishnamoorthy Sivakumar, ICDM, 2003
- "Customer Lifetime Value Modeling and Its Use for Customer Retention Planning," Saharon Rosset, Einat Neumann, Uri Eick, Nurit Vatnik, and Yizhak Idan, KDD, 2002
Selected: Hoi Heather Chan and Joseph John 4/8/06
- "Dynamic Weighted Majority: A New Ensemble Method For Tracking Concept Drift," Jeremy Z. Kolter and Marcus A. Maloof, ICDM, 2003, pp. 123-130
- "Online Portfolio Selection using Multiplicative Updates," David Helmbold, Robert Schapire, Yoram Singer and Manfred Warmuth, SIAM, 2005
Selected: Arjun Khanna 4/8/06
- "A Random Walks Perspective on Maximizing Satisfaction and Profit," M. Brand, SIAM, 2005
- "Data mining solves tough semiconductor manufacturing problems," Mike Gardner and Jack Bieker, KDD, 2000
- "Data Mining for Improving A Cleaning Process in the Semiconductor Industry," KDD, 2000
Selected: Dale Blackwell and Simon Yeung 4/7/06
- "Segmental Semi-Markov Models for Change-Point Detection with Applications to Semiconductor Manufacturing," Xianping Ge and Padhraic Smyth, Tech Report UCI-ICS, 2000
Please report any broken or incorrect hyperlinks to rsuju[AT]lans[DOT]ece[DOT]utexas[DOT]edu
Last updated 01/2006