Reading List
EE 380L - A Practicum in
Data Mining
Spring 2003
Notices
Paper Selection Policy
- Each student must select and present any two of the following papers (except the general reading papers). The two papers must not be on the same sub-topic.
- Papers are allocated on a first-come, first-served basis.
- To select a paper, email your selection to the TA.
- Scheduling of your talk depends on the topic you have chosen. Ideally it will take place in the "Student Paper Presentations" slot (in the course schedule) right after that topic has been covered in class.
- "Selected" means the paper has already been chosen for presentation and cannot be selected again. Also check for pending requests on the following page.
- Please use this link to request a paper for presentation and edit the required fields of the subject line. This queues your request on this page instantly.
I. General Reading (not for class presentations): a supplement to the 7 papers in your course packet.
- Information retrieval on the web
Mei Kobayashi and Koichi Takeda
ACM Computing Surveys, vol. 32, no. 2, 144-173, 2000
- Data mining for hypertext: A tutorial survey
S. Chakrabarti
ACM SIGKDD Explorations, 1(2), 1-11, 2000
- Impact of Similarity Measures on Web-page Clustering
A. Strehl, J. Ghosh and R. Mooney
Proc. AAAI Workshop on AI for Web Search, K. Bollacker (Ed.), TR WS-00-01, AAAI Press, July 2000, pp. 58-64
- Data Preparation for Mining World Wide Web Browsing Patterns
Robert Cooley, Bamshad Mobasher, and Jaideep Srivastava
Knowledge and Information Systems, 1(1), 1999
- An Internet-enabled Knowledge Discovery Process
Alex Buchner et al., MINEit Software Ltd., 1999
II. Hyperlinks
- Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery
Soumen Chakrabarti, Martin van den Berg, Byron Dom, WWW8
Selected: Tabassum
- Random Walks with "Back Buttons"
Ronald Fagin et al., Proc. 2000 ACM Symposium on Theory of Computing
- Stochastic models for the Web graph
R. Kumar, P. Raghavan, S. Rajagopalan, D. Sivakumar, A. Tomkins, and Eli Upfal
Proc. of the 41st IEEE Symp. on Foundations of Computer Science, 2000
Selected: Rajalingam Alagesan : 3/4/03
- "Winners Don't Take All: Characterizing the Competition for Links on the Web,"
D. Pennock, G.W. Flake, S. Lawrence, E.J. Glover, C.L. Giles
Proceedings of the National Academy of Sciences, 99(8), 5207-5211, April 2002.
Selected: Kunal
- Learning Probabilistic Models of Link Structure
Lise Getoor, Nir Friedman, Daphne Koller, et al.
JMLR, 3(Dec), 2002
III. Information Retrieval
- Hierarchical Bayesian models for applications in information retrieval
David Blei, Michael Jordan and Andrew Y. Ng
Bayesian Statistics 7 (Oxford University Press), 2003
Selected: Andromache Howe : 3/4/03
- A Probabilistic Framework for the Hierarchic Organisation and Classification of Document Collections
Alexei Vinokourov and Mark Girolami
BUBL journals: Information Processing and Management, 2002
Selected: Matt MacMahon : 3/6/03
- Latent Semantic Kernels
Nello Cristianini, Huma Lodhi and John Shawe-Taylor
Journal of Intelligent Information Systems, 18:2/3, 127-152, 2002
Selected: Hyuk Cho : 3/6/03
- Learning Approaches for Detecting and Tracking News Events
Y. Yang et al., IEEE Intelligent Systems, 14(4):32-43, 1999
Selected: Kunal : 3/18/03
- Text Classification in a Hierarchical Mixture Model for Small Training Sets
Kristina Toutanova, Francine Chen, Kris Popat, and Thomas Hofmann
Proceedings of the Tenth International ACM Conference on Information and Knowledge Management, CIKM 2001
Selected: Vidya Narayan : 3/18/03
- Topic Distillation at TrecWeb2002
Tsinghua, City Univ. London, IBM Haifa.
Selected: Abhinav Sharma : 4/17/03
- Name Page Finding at TrecWeb2002: Top 3 (Tsinghua, CMU, Glasgow)
IV. Contents + Links
- The missing link - a probabilistic model of document content and hypertext connectivity
David Cohn and Thomas Hofmann, NIPS-13, 2001
Selected: Hyuk Cho : 3/27/03
- Categorization of web pages and user clustering with mixture of hidden Markov models
A. Ypma and T. Heskes, WEBKDD'02, pp. 31-43
Selected: Matt MacMahon : 3/27/03
- The impact of site structure and user environment on session reconstruction in Web usage analysis
B. Masand, M. Spiliopoulou, J. Srivastava et al.
Working Notes of the Fourth WebKDD Web Mining for Usage Profiles, Workshop at KDD, pp. 115-129, 2002
Selected: Alexander Y. Liu : 4/17/03
- Stumme, G., Hotho, A., & Berendt, B. (2002). Usage Mining for and on the Semantic Web. In Proceedings of the National Science Foundation Workshop on Next Generation Data Mining, Baltimore, Nov. 1-3, 2002.
Selected: Tabassum : 4/22/03
- Shaping the Web: Why the politics of search engines matters
Lucas D. Introna and Helen Nissenbaum.
Selected: Vidya Venkat : 3/20/03
V. Personalization
- LumberJack: Intelligent Discovery and Analysis of Web User Traffic Composition (2002)
Ed H. Chi, Adam Rosien, Jeffrey Heer, WebKDD02
Selected: Rajalingam Alagesan : 4/22/03
- Efficient and Anonymous Web-Usage Mining for Web Personalization
Cyrus Shahabi, Farnoush Banaei-Kashani
Selected: Abhinav Sharma : 4/24/03
- A Critical View of Recommender Systems
Andreas Mild and Martin Natter, Working Paper No. 82, July 2001
Selected: Alexander Y. Liu : 4/24/03
- Generative Models for Cold-Start Recommendations
Andrew I. Schein, Alexandrin Popescul, Lyle H. Ungar and David M. Pennock
SIGIR-01 Workshop on Recommender Systems
- PVA: A Self-Adaptive Personal View Agent
Chien Chin Chen, Meng Chang Chen and Yeali Sun
Journal of Intelligent Information Systems, 18:2/3, 173-194, 2002