International Workshop on Data Mining in
Web 2.0 Environments

held in conjunction with the IEEE International
Conference on Data Mining (ICDM 2007)
on October 28, 2007 in Omaha, United States.

Topics of interest

Users are strongly attracted to the currently emerging Web 2.0 environments, which allow them to provide content in a simple, unrestricted, and ad hoc way. Annotations (such as tags) can be provided in a Web 2.0-like way for a wide range of resources and data types, such as web pages, images, and multimedia. There is, however, a disadvantage: the freedom to provide arbitrary (personal) content and tags in ubiquitous, uncoordinated ways results in very large amounts of poorly structured information. Beyond the current hype around Web 2.0 applications, this raises several important challenges for future data and web mining methods.

The workshop aims to bring together researchers and professionals in the areas of data and web mining, information systems, and collaborative systems to discuss the challenges and solutions of applying data mining to highly unstructured, user-created data. Such challenges include the analysis of loosely coupled snippets of information, such as overlapping tag structures, homonym or synonym tags, and blog networks. Other challenges arise from scalability issues or new forms of fraud and spam. These demand, for instance, innovative methods for tag clustering, filtering, aggregation, personalization, and visualization.

As an outcome of the workshop, we expect a better understanding of the methods that can be successfully applied to Web 2.0 applications and of the open challenges yet to be solved.

Topics of interest include but are not limited to:

  • analysis of blogs
  • tag clustering and visualization
  • synonym and homonym resolution in tags
  • visual and textual information extraction
  • temporal analysis
  • data streams, trend detection, and concept drift
  • application of web and text mining to wiki content
  • discovering social structures and communities
  • evolution of online social networks
  • predicting user behavior
  • analysis of dynamic networks
  • discovering misuse and fraud
  • combining the web with data from other sources, mining with mashups
  • deriving profiles from usage
  • personalized delivery of information
  • applications, case studies

Important Dates

  • Submission deadline: 22nd June 2007
  • Notification of acceptance: 8th August 2007
  • Camera-ready copies due: 17th August 2007
  • Workshop day: 28th October 2007

Submission


Paper submissions should be limited to a maximum of 6 pages in the IEEE 2-column format. All papers will be reviewed by at least two program committee members for their technical merit, originality, significance, and relevance to the workshop. Accepted papers will be published in the proceedings by the IEEE Computer Society Press.

Please prepare and submit the camera-ready version of your paper according to the instructions found on the IEEE ICDM Workshop Page.

If you have any questions concerning the submission procedure, please contact Michael Wurst.

Program


  • 13:30-14:00 Introduction
  • 14:00-15:30 Session I: Web Sentiment, Blog Analysis and Social Networks
    • Aspect Summarization from Blogsphere for Social Study
      Chia-Hui Chang, Fred Tsai
    • HSN-PAM: Finding Hierarchical Probabilistic Groups From Large-Scale Networks
      Haizheng Zhang, Wei Li, Xuerui Wang, C. Lee Giles, Henry C. Foley, John Yen
    • SOPS: Stock Prediction using Web Sentiment
      Vivek Sehgal, Charles Song
  • 15:30-16:00 Coffee Break
  • 16:00-17:00 Lyle Ungar: Information Extraction from Informal Texts (Invited Talk)
  • 17:00-18:30 Session II: Extracting Information from the Web 2.0
    • Ask the Crowd to Find Out What's Important
      Sisay Fissaha Adafre, Maarten de Rijke
    • Extracting Author Meta-Data from Web
      Shuyi Zheng, Ding Zhou, Jia Li, Lee Giles
    • FiVaTech: Page-level Web Data Extraction from Template Pages
      Chia-Hui Chang, Mohammed Kayed, Khaled Shaalan, Ramzy Girgis
  • 18:30 Wrap-up

The workshop will take place in room B.


Program Committee

  • Lada Adamic, University of Michigan
  • Bettina Berendt, Humboldt-University Berlin
  • Fabio Ciravegna, University of Sheffield
  • Martin Ester, Simon Fraser University Vancouver
  • Ronen Feldman, ClearForest Corp.
  • Dimitrios Gunopulos, University of California Riverside
  • Marko Grobelnik, J. Stefan Institute Ljubljana
  • Siggi Handschuh, DERI Galway
  • Thomas Hofmann, Google
  • Hillol Kargupta, UMBC Maryland
  • Nick Koudas, University of Toronto
  • Ernestina Menasalvas, Polytechnical University of Madrid
  • Srujana Merugu, Yahoo!
  • Dunja Mladenic, J. Stefan Institute Ljubljana
  • Katharina Morik, University of Dortmund
  • Srinivasan Parthasarathy, Ohio State University
  • Maarten van Someren, University of Amsterdam
  • D. Sivakumar, Google
  • Gerd Stumme, University of Kassel
  • Panayiotis Tsaparas, Microsoft