Reading List 2
EE 380L - Data Mining (ESE)
Spring 2003
Notices
Paper Selection Policy
- Students must select and present one of the following papers.
- Papers are allocated on a first-come, first-served basis.
- To select a paper, email your selection to Srujana Merugu.
- Scheduling of your talk depends on the topic you have chosen. Ideally it
will take place in the "Student Paper Presentations" slot right
after that topic has been covered in class.
A. Exploratory Data Analysis
- "Mining Frequent Patterns by Pattern-Growth: Methodology and Implications," J. Han and J. Pei, SIGKDD Explorations, vol. 2(2), Dec. 2000
- "On Interactive Visualization of High-Dimensional Data using the Hyperbolic Plane," Joerg Walter, Helge Ritter, KDD 2002
B. Clustering/Segmentation
- "Scalability for Clustering Algorithms Revisited," Fredrik Farnstrom, James Lewis, Charles Elkan, SIGKDD Explorations 2(1), 2000
- "Hierarchical Model-Based Clustering of Large Datasets Through Fractionation and Refractionation," Jeremy Tantrum, Alejandro Murua, Werner Stuetzle, KDD 2002
- "Learning to Match and Cluster Large High-Dimensional Data Sets For Data Integration," William W. Cohen, Jacob Richman, KDD 2002
- "High-Performance Clustering of Streams and Large Data Sets," L. O'Callaghan, N. Mishra, A. Meyerson, and S. Guha, ICDM 2002
- "Mining the Stock Market: Cluster Discovery," M. Gavrilov, D. Angelov, and P. Indyk, KDD 2000
C. Association Rules
- "Discovering Frequent Substructures from Hierarchical Semi-structured Data," Gao Cong, Lan Yi, Bing Liu, Ke Wang, SDM 2002
- "Learning Simple Relations: Theory and Applications," Pavel Berkhin and Jonathan D. Becher
- "Selecting the Right Interestingness Measure for Association Patterns," Pang-Ning Tan, Vipin Kumar, Jaideep Srivastava, KDD 2002
- "Small is Beautiful: Discovering the Minimal Set of Unexpected Patterns," Padmanabhan, B. and Tuzhilin, A., KDD-2000, pp. 54-63
- "Empirical Bayes Screening for Multi-item Associations," William DuMouchel and Daryl Pregibon, KDD-2001, pp. 67-76
D. Classification and Prediction
- "Segmented Regression Estimators for Massive Data Sets," Ramesh Natarajan and Edwin Pednault, SDM 2002
- "Why the Information Explosion Can Be Bad for Data Mining, and How Data Fusion Provides a Way Out," Peter van der Putten, Joost N. Kok, and Amar Gupta, SDM 2002
- "What's the Code? Automatic Classification of Source Code Archives," Secil Ugurel, Robert Krovetz, Lee Giles, David M. Pennock, Eric J. Glover, Hongyuan Zha, KDD 2002
- "Transforming Classifier Scores into Accurate Multiclass Probability Estimates," Bianca Zadrozny, Charles Elkan, KDD 2002
- "BOAT -- Optimistic Decision Tree Construction," J. E. Gehrke, Venkatesh Ganti, Raghu Ramakrishnan, and Wei-Yin Loh, Proceedings of the 1999 SIGMOD Conference, Philadelphia, Pennsylvania, 1999
- "Probabilistic Classification and Clustering in Relational Data," B. Taskar, E. Segal & D. Koller, IJCAI-01
E. Web Mining and Information Retrieval
- "Mining Knowledge-Sharing Sites for Viral Marketing," Matthew Richardson, Pedro Domingos, KDD 2002
- "Efficiently Mining Frequent Trees in a Forest," Mohammed Zaki, KDD 2002
- "ANF: A Fast and Scalable Tool for Data Mining in Massive Graphs," Christopher R. Palmer, Phillip B. Gibbons, Christos Faloutsos, KDD 2002
- "Web Site Mining: A New Way to Spot Competitors, Customers and Suppliers in the World Wide Web," Martin Ester, Hans-Peter Kriegel, Matthias Schubert, KDD 2002
- "Agglomerative Clustering of a Search Engine Query Log," Beeferman, D. and Berger, A., KDD-2000, pp. 407-416
- "Intermediaries: An Approach to Manipulating Information Streams," Barrett, R. and Maglio, P.P., IBM Systems Journal, 38, 1999
- "Personalization from Incomplete Data: What You Don't Know Can Hurt," Padmanabhan, B., Zheng, Z., and Kimbrough, S., KDD-2001
F. Miscellaneous
- "Transforming Data to Satisfy Privacy Constraints," Vijay S. Iyengar, KDD 2002
- "From Run-time Behavior to Usage Scenarios: An Interaction-pattern Mining Approach," Mohammad El-Ramly, Eleni Stroulia, Paul Sorenson, KDD 2002
- "Customer Lifetime Value Modeling and Its Use for Customer Retention Planning," Saharon Rosset, Einat Neumann, Uri Eick, Nurit Vatnik, Yizhak Idan, KDD 2002
- "Learning Nonstationary Models of Normal Network Traffic for Detecting Novel Attacks," Matthew V. Mahoney, Philip K. Chan, KDD 2002
Last updated 01/2003