Reading List 1 (Course Reader) 2 (Online)

    Database Marketing

  1. Understanding Consumer Database Marketing
    Denise D. Schoenbachler, Geoffrey L. Gordon, Dawn Foley and Linda Spelman
    Journal of Consumer Marketing, V14, No. 1, 1997
  2. Alternative Approaches to Cluster-based Market Segmentation
    Paul E. Green, Abba M. Krieger
    Journal of the Market Research Society

    Feature Selection, Stepwise Regression

  3. Introduction to the Logistic Regression Model
    Ch. 1 from "Applied Logistic Regression" by Hosmer and Lemeshow, Wiley, 1989.
  4. The Problem of Underestimating the Residual Error Variance in Forward Stepwise Regression
    L.S. Freedman, D. Pee, D. N. Midthune
    The Statistician 41(4), 1992, pp. 405-412

    Clustering & Segmentation

  5. CACTUS-Clustering Categorical Data Using Summaries
    Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke
    DEMON project
    Proc. 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), 1999 Aug, pp. 73-83
  6. Chameleon: Hierarchical Clustering Using Dynamic Modeling
    George Karypis, Eui-Hong (Sam) Han, Vipin Kumar
    IEEE Computer, 32(8), 1999 Aug, pp. 68-75

    Combining Multiple Models

  7. Combining Predictors
    Leo Breiman
    in COMBINING ARTIFICIAL NEURAL NETS: Ensemble and Modular Multi-Net Systems
    Edited by Amanda Sharkey, Springer-Verlag London Ltd, 1999

    Beyond Association Rules

  8. Mining the Most Interesting Rules
    Roberto J. Bayardo Jr., Rakesh Agrawal
    See 1999 ACM SIGMOD Workshop on Research Issues in DMKD
    Proc. 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), 1999 Aug, pp. 145-154
  9. Constraint-Based Rule Mining in Large, Dense Databases
    Roberto J. Bayardo Jr., Rakesh Agrawal and Dimitrios Gunopulos
    Proceedings of the 15th International Conference on Data Engineering, 1999 Mar, Sydney, Australia, IEEE CS Press, 1999
  10. Beyond Market Baskets: Generalizing Association Rules to Correlations (Dependence Rules)
    Craig Silverstein, Sergey Brin, Rajeev Motwani
    Data Mining and Knowledge Discovery, 2, 1998, pp. 39-68

    Support Vector Machines

  11. Support vector classifiers: a first look
    David M.J. Tax, D. de Ridder, Robert P.W. Duin
    Proceedings of the Third Annual Conference of the Advanced School for Computing and Imaging, ASCI, Delft, June 1997
  12. On Support Vector Decision Trees for Database Marketing
    Kristin P. Bennett, D. H. Wu, L. Auslender
    R.P.I. Math Report No. 98-100, Rensselaer Polytechnic Institute, Troy, NY, 1998

    Web Mining & Information Retrieval

  13. A Framework for Collaborative, Content-Based and Demographic Filtering
    Michael J. Pazzani
    Artificial Intelligence (in press)
  14. Mining the Web's Link Structure
    Soumen Chakrabarti, Byron E. Dom, S. Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins, David Gibson, and Jon Kleinberg
    IEEE Computer, 32(8), Aug, 1999, pp. 60-67

    Sampling

  15. Is Sampling Useful in Data Mining? A Case in the Maintenance of Discovered Association Rules
    S.D. Lee, David W. Cheung, Ben Kao
    Data Mining and Knowledge Discovery, An International Journal, Vol. 2, pp. 233-262, Kluwer Academic Publishers, 1998
  16. Sampling: Design and Analysis
    Sharon Lohr
    Duxbury Press, 1999, pp. 272-280
  17. Sampling Large Databases for Association Rules
    Hannu Toivonen
    In 22nd International Conference on Very Large Databases (VLDB'96), pp. 134-145, Mumbai, India, September 1996. Morgan Kaufmann

    Scalability Issues

  18. Mining Very Large Databases
    Venkatesh Ganti, Johannes Gehrke, Raghu Ramakrishnan
    IEEE Computer, 32(8), Aug, 1999, pp. 38-45
  19. The Effects of Training Set Size on Decision Tree Complexity
    Tim Oates, David Jensen
    Proceedings of the 14th International Conference on Machine Learning, 1997
  20. An Exploratory Technique for Investigating Large Quantities of Categorical Data
    G. Kass
    Applied Statistics, 29, 1980

    Miscellaneous

  21. MetaCost: A General Method for Making Classifiers Cost-Sensitive
    Pedro Domingos
    Proc. 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), 1999 Aug, pp. 155-164