| Reading List 2 (Online) 1 (Course Reader) |
Exploratory Data Analysis |
|
|
Comprehensible Knowledge Discovery: Gaining Insight from Data
Michael J. Pazzani, First Federal Data Mining Conference and Exposition, pp 73-82, Washington, DC. |
|
|
Discovery-driven Exploration of OLAP Data Cubes
Nimrod Megiddo, Sunita Sarawagi, Rakesh Agrawal In Proc. Sixth International Conference on Extending Database Technology (EDBT), Mar 1998 |
Clustering/Segmentation |
|
|
Clustering Large Datasets in Arbitrary Metric Spaces
Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison L. Powell, James C. French Proceedings of the 15th International Conference on Data Engineering, 23-26 March 1999, Sydney, Australia, IEEE CS Press, 1999, pp. 502-511 |
|
|
ROCK: A Robust Clustering Algorithm for Categorical Attributes
Sudipto Guha, Rajeev Rastogi, Kyuseok Shim Serendip Data Mining Project (Bell Labs) Proceedings of the 15th International Conference on Data Engineering, 23-26 March 1999, Sydney, Australia, IEEE CS Press, 1999, pp. 512-521 |
|
|
CACTUS-Clustering Categorical Data Using Summaries Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke DEMON project Proc. 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), Aug 1999, pp 73-83 |
Association Rules |
|
|
User Profiling in Personalization Applications through Rule Discovery and Validation
Gediminas Adomavicius, Alexander Tuzhilin Proc. 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), Aug 1999, pp 377-381 |
|
|
Using Association Rules for Product Assortment Decisions: A Case Study
Tom Brijs, Gilbert Swinnen, Koen Vanhoof, Geert Wets Proc. 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), Aug 1999, pp 254-260 |
|
|
A Statistical Theory for Quantitative Association Rules
Yonatan Aumann, Yehuda Lindell Proc. 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), Aug 1999, pp 261-270 |
|
|
Constraint-Based Rule Mining in Large, Dense Databases
Roberto J. Bayardo Jr., Rakesh Agrawal and Dimitrios Gunopulos Proceedings of the 15th International Conference on Data Engineering, 1999 Mar, Sydney, Australia, IEEE CS Press, 1999 |
|
|
Beyond Market Baskets: Generalizing Association Rules to Correlations (Dependence Rules)
Craig Silverstein, Sergey Brin, Rajeev Motwani Data Mining and Knowledge Discovery, 2, 1998, pp. 39-68 |
|
|
Pruning and Grouping Discovered Association Rules
H. Toivonen, M. Klemettinen, P. Ronkainen, K. Hätönen, and H. Mannila In MLnet Workshop on Statistics, Machine Learning and Discovery in Databases, pp 47-52, Heraklion, Crete, Greece, Apr 1995 |
Classification |
|
|
SLIQ: A Fast Scalable Classifier for Data Mining
Manish Mehta, Rakesh Agrawal and Jorma Rissanen Proc. Fifth Int'l Conference on Extending Database Technology, Avignon, France, Mar 1996 |
|
|
An Interval Classifier for Database Mining Applications
Rakesh Agrawal, S. Ghosh, T. Imielinski, B. Iyer, and A. Swami Proc. 18th Int'l Conference on Very Large Databases, pp 560-573, Vancouver, Aug 1992 |
|
|
Support vector classifiers: a first look David M.J. Tax, D. de Ridder, Robert P.W. Duin Proceedings of the Third Annual Conference of the Advanced School for Computing and Imaging, ASCI, Delft, June 1997 |
|
|
A tutorial on Support Vector Machines for Pattern Recognition
Christopher J. C. Burges Bell Labs SVM page Data Mining and Knowledge Discovery, Vol. 2, Number 2, p. 121-167, 1998 |
|
|
On Support Vector Decision Trees for Database Marketing Kristin P. Bennett, D. H. Wu, L. Auslender R.P.I. Math Report No. 98-100, Rensselaer Polytechnic Institute, Troy, NY, 1998 |
Sampling |
|
|
Is Sampling Useful in Data Mining? A Case in the Maintenance of Discovered Association Rules S.D. Lee, David W. Cheung, Ben Kao Data Mining and Knowledge Discovery, An International Journal, Vol. 2, pp. 233-262, Kluwer Academic Publishers, 1998 |
|
|
Sampling Large Databases for Association Rules Hannu Toivonen In 22nd International Conference on Very Large Databases (VLDB'96), 134-145, Mumbai, India, September 1996. Morgan Kaufmann |
|
|
Density Biased Sampling: An Improved Method for Data Mining and Clustering Christopher R. Palmer and Christos Faloutsos CMU Technical Report CMU-CS-99-113 |
Scalability Issues |
|
|
Scaling EM (Expectation-Maximization) Clustering to Large Databases Paul Bradley, Usama Fayyad, and Cory Reina Microsoft Research Technical Report MSR-TR-98-35, Revised Feb 1999 |
|
|
Scalable Parallel Data Mining for Association Rules Eui-Hong (Sam) Han, George Karypis, and Vipin Kumar IEEE Transactions on Knowledge and Data Engineering (To appear) |
|
|
The Effects of Training Set Size on Decision Tree Complexity Tim Oates, David Jensen Proceedings of the 14th International Conference on Machine Learning. 1997 |
|
|
A Survey of Methods for Scaling Up Inductive Learning Algorithms
Foster J. Provost, Venkat Kolluri Data Mining and Knowledge Discovery Journal 3(2), pp 131-169 |
Web Mining & Information Retrieval |
|
|
Searching the World Wide Web
Steve Lawrence and C. Lee Giles Science 280, p. 98, April 3, 1998 |
|
|
Empirical Analysis of Predictive Algorithms for Collaborative Filtering
John S. Breese, David Heckerman, Carl Kadie Microsoft Research Technical Report MSR-TR-98-12, May 1998 |
Mining of Sequential Patterns and Time Series |
|
|
Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases
Rakesh Agrawal, King-Ip Lin, Harpreet S. Sawhney, and Kyuseok Shim Proc. 21st International Conference on Very Large Databases, Zurich, Switzerland, Sep 1995 |
Miscellaneous |
|
|
What Makes Patterns Interesting in Knowledge Discovery Systems
Avi Silberschatz, Alexander Tuzhilin IEEE Transactions on Knowledge and Data Engineering Vol. 8, No. 6, Dec 1996, pp 970-974 |
|
Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem Mauricio A. Hernández, Salvatore J. Stolfo Data Mining and Knowledge Discovery, 1997, pp. 9-37 |
|
Discovering Robust Knowledge from Databases that Change Chun-Nan Hsu and Craig A. Knoblock Data Mining and Knowledge Discovery, 2(1), 1998, pp. 69-95 |
|
|
Statistics and Data Mining Techniques for Lifetime Value Modeling
D.R. Mani, James Drew, Andrew Betz, Piew Datta Proc. 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), Aug 1999, pp 94-103 |
|
|
Detecting Changes in Categorical Data: Mining Contrast Sets
Stephen D. Bay, Michael J. Pazzani Proc. 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), Aug 1999, pp 302-306 |
|
|
The Impact of Changing Populations on Classifier Performance
Mark G. Kelly, David J. Hand, Niall M. Adams Proc. 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), Aug 1999, pp 367-371 |
|
|
MetaCost: A General Method for Making Classifiers Cost-Sensitive
Pedro Domingos Proc. 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), Aug 1999, pp 155-164 |
|
|
Interactive Data Analysis: The Control Project
Joseph M. Hellerstein, et al. The CONTROL project IEEE Computer, 32(8), Aug, 1999, pp. 51-59 |