A list of all my publications can also be found at BibSonomy.
- Künstliche Intelligenz braucht menschliche Intelligenz - was KI und Individualsoftware gemeinsam haben. Doerfel, Stephan (2020). 1 5-7.
- Wie spät ist es eigentlich? Ein Praxisblick auf die Teamuhr. Doerfel, Stephan; Krackrügge, Simon (2020). 3 39-43.
- Clones in Graphs. Doerfel, Stephan; Hanika, Tom; Stumme, Gerd in M. Ceci, Japkowicz, N., Liu, J., Papadopoulos, G. A., Raś, Z. W. (eds.) (2018). (Vol. 11177) 56-66. Finding structural similarities in graph data, like social networks, is a far-ranging task in data mining and knowledge discovery. A (conceptually) simple reduction would be to compute the automorphism group of a graph. However, this approach is ineffective in data mining since real-world data does not exhibit enough structural regularity. Here we step in with a novel approach based on mappings that preserve the maximal cliques. For this we exploit the well-known correspondence between bipartite graphs and the data structure formal context (G, M, I) from Formal Concept Analysis. From there we utilize the notion of clone items. The investigation of these is still an open problem to which we add new insights with this work. Furthermore, we produce a substantial experimental investigation of real-world data. We conclude by demonstrating the generalization of clone items to permutations.
- Leveraging User-Interactions for Time-Aware Tag Recommendations. Zoller, Daniel; Doerfel, Stephan; Pölitz, Christian; Hotho, Andreas in CEUR Workshop Proceedings, M. Bielikova, Bogina, V., Kuflik, T., Sasson, R. (eds.) (2017). For the popular task of tag recommendation, various (complex) approaches have been proposed. Recently, however, research has focused on heuristics with low computational effort, and in particular a time-aware heuristic called BLL has been shown to compare well to various state-of-the-art methods. Here, we follow up on these results by presenting another time-aware approach leveraging user interaction data in an easily interpretable, on-the-fly computable approach that can successfully be combined with BLL. We investigate the influence of time as a parameter in that approach, and we demonstrate the effectiveness of the proposed method using two datasets from the popular public social tagging system BibSonomy.
- Supporting Researchers: Analyzing the Scholarly Publication Life Cycle and Social Bookmarking Systems. Technical Report (PhD dissertation), Doerfel, Stephan (2017). Researchers must face the exponential growth of the body of available scholarly literature, which makes it ever harder to keep track of one’s own community, especially for newcomers. In this thesis, we explore different means of supporting researchers with that task. For this purpose, we follow two approaches: We provide analyses of research communities and of researchers’ interactions through data that can be obtained from the phases in the life cycle of scholarly publications (creation, dissemination, usage, and citation in other publications). The resulting statistics and visualizations allow researchers to better understand their own communities, to identify the most important players and publications, and to find valuable conversational partners at conferences. For the analysis of publication usage and connections to citations, we turn to social bookmarking systems and investigate the actions of users in BibSonomy. The provided insights can help operators of such systems improve them. Our second approach is more proactive, focusing on supporting researchers by pointing them directly to important publications – through automatically computed personalized recommendations and through social peer review. The analysis of research and researchers has often relied on studying scholarly publications and their metadata. Such studies can reveal insights into how scientific work is conducted, they can shed light on communities and research topics, and they allow the measurement of certain forms of impact that a publication, an individual researcher, or a venue has had. The exploited data – publication metadata – is generated when publications are created. 
The life cycle of a scholarly publication, however, just begins with a publication’s creation: Publications are disseminated (e.g., presented at conferences), they are used (e.g., acquired, stored, collected, marked as to-read, and, of course, read), and they are cited. With the advent of Web 2.0, traces of the activities in these phases have become observable. In this thesis, we collect and analyze datasets from all four stages of the publication life cycle. We thus go beyond traditional means of scientometrics, touching such fields as altmetrics, web log analysis, and role discovery. We not only present new insights into communities that have not been investigated before, but we also demonstrate new means of analysis that are generalizable to other communities as well. Among them are formal concept analysis to visualize influences between groups of authors and social network analyses of interaction networks. Our datasets comprise – next to a traditional publication corpus containing metadata and references – a face-to-face contact network, gathered from real-life interactions of researchers during a conference, and datasets from the scholarly social bookmarking system BibSonomy. Social bookmarking services allow their users to publicly store and annotate resources, like web links, photos, videos, or publications. As representatives of Web 2.0, social bookmarking systems have attracted the interest of the research community. Through the central feature, tagging of resources, users of such systems create a data structure called folksonomy, in which users, resources, and tags are connected. The resulting network allows users to navigate between these folksonomic entities. In scholarly bookmarking systems, users store and manage publications. Thus, such systems are an ideal candidate for the investigation of publication usage. 
In this thesis, we study data of the popular system BibSonomy to address various aspects of the use of social bookmarking systems and the resources stored therein. Moreover, we analyze the usefulness for altmetrics by studying correlations between the usage of a publication and its citations, as well as the predictive power of usage features over future citations. Scholarly bookmarking tools support researchers in their daily work with publications and their metadata. Still, the sheer number of available publications and its ever faster growth make it difficult to keep track of the relevant developments in one’s field of research – an instance of the information overload problem. Therefore, recommendation systems can be employed to point users to particular publications using personalized ranking algorithms. Usually, such algorithms exploit information in user profiles, for instance, previously stored resources and the corresponding tags, as well as information about similarity between entities or about their positions within the network of entities (the folksonomy) to recommend new items that the active user might find interesting. Similarly, a recommender can also assist the process of tagging by recommending suitable tags to users while they create a new post for some resource. We use the scenario of tag recommendation to thoroughly analyze the typical evaluation setup of folksonomic recommender systems using so-called graph-cores. We improve the setup by introducing a new, more flexible type of core to circumvent a structural drawback of the graph-core approach. We also point to several pitfalls of using cores for benchmarking recommendation algorithms. Moreover, we employ the scenario of resource recommendation – specifically the recommendation of scholarly publications – to investigate different ways of integrating publication metadata into the popular and versatile folksonomic recommendation algorithm FolkRank. 
Finally, any tool that is offered on the web must comply with the law and its use must be socially compatible. Particularly difficult is the case of publicly visible ratings, where products are judged by users. For instance, in the case where resources are scholarly publications and thus the products of researchers (the authors), improper criticism may have consequences for researchers’ careers or for decisions about funding allocation. Based on requirements that have been derived from German law, we describe and discuss opportunities and risks of social web systems in which users share, debate, and rate scholarly publications. Altogether, this thesis relies on data from the scholarly publication life cycle to gain insights into research communities and the interaction of researchers with literature. We focus on social bookmarking systems, which reveal traces of their users’ behavior and which provide a suitable tool to support researchers in their work with literature. Our contributions aim at supporting researchers in their work, as members of their respective communities and as producers and consumers of scholarly literature.
- FolkTrails: Interpreting Navigation Behavior in a Social Tagging System. Niebler, Thomas; Becker, Martin; Zoller, Daniel; Doerfel, Stephan; Hotho, Andreas in CIKM '16 (2016). Social tagging systems have established themselves as a quick and easy way to organize information by annotating resources with tags. In recent work, user behavior in social tagging systems was studied, that is, how users assign tags and consume content. However, it is still unclear how users make use of the navigation options they are given. Understanding their behavior and differences in behavior of different user groups is an important step towards assessing the effectiveness of a navigational concept and of improving it to better suit the users’ needs. In this work, we investigate navigation trails in the popular scholarly social tagging system BibSonomy from six years of log data. We discuss dynamic browsing behavior of the general user population and show that different navigational subgroups exhibit different navigational traits. Furthermore, we provide strong evidence that the semantic nature of the underlying folksonomy is an essential factor for explaining navigation.
- Posted, visited, exported: Altmetrics in the social tagging system BibSonomy. Zoller, Daniel; Doerfel, Stephan; Jäschke, Robert; Stumme, Gerd; Hotho, Andreas (2016). 10(3) 732-749. In social tagging systems, like Mendeley, CiteULike, and BibSonomy, users can post, tag, visit, or export scholarly publications. In this paper, we compare citations with metrics derived from users’ activities (altmetrics) in the popular social bookmarking system BibSonomy. Our analysis, using a corpus of more than 250,000 publications published before 2010, reveals that overall, citations and altmetrics in BibSonomy are mildly correlated. Furthermore, grouping publications by user-generated tags results in topic-homogeneous subsets that exhibit higher correlations with citations than the full corpus. We find that posts, exports, and visits of publications are correlated with citations and even bear predictive power over future impact. Machine learning classifiers predict whether the number of citations that a publication receives in a year exceeds the median number of citations in that year, based on the usage counts of the preceding year. In that setup, a Random Forest predictor outperforms the baseline on average by seven percentage points.
- What Users Actually do in a Social Tagging System: A Study of User Behavior in BibSonomy. Doerfel, Stephan; Zoller, Daniel; Singer, Philipp; Niebler, Thomas; Hotho, Andreas; Strohmaier, Markus (2016). 10(2) 14:1-14:32. Social tagging systems have established themselves as an important part of today’s web and have attracted the interest of our research community in a variety of investigations. Consequently, several aspects of social tagging systems have been discussed and assumptions have emerged on which our community builds its work. Yet, testing such assumptions has been difficult due to the absence of suitable usage data in the past. In this work, we thoroughly investigate and evaluate four aspects of tagging systems, covering social interaction, retrieval of posted resources, the importance of the three different types of entities (users, resources, and tags), as well as connections between these entities’ popularity in posted and in requested content. For that purpose, we examine live server log data gathered from the real-world, public social tagging system BibSonomy. Our empirical results paint a mixed picture of the four aspects. While for some, typical assumptions hold to a certain extent, other aspects need to be reflected in a very critical light. Our observations have implications for the understanding of social tagging systems, and the way they are used on the web. We make the dataset used in this work available to other researchers.
- The Role of Cores in Recommender Benchmarking for Social Bookmarking Systems. Doerfel, Stephan; Jäschke, Robert; Stumme, Gerd (2016). 7(3) 40:1-40:33. Social bookmarking systems have established themselves as an important part of today’s web. In such systems, tag recommender systems support users during the posting of a resource by suggesting suitable tags. Tag recommender algorithms have often been evaluated in offline benchmarking experiments. Yet, the particular setup of such experiments has rarely been analyzed. In particular, since the recommendation quality usually suffers from difficulties like the sparsity of the data or the cold start problem for new resources or users, datasets have often been pruned to so-called cores (specific subsets of the original datasets) – however without much consideration of the implications on the benchmarking results. In this paper, we generalize the notion of a core by introducing the new notion of a set-core – which is independent of any graph structure – to overcome a structural drawback in the previous constructions of cores on tagging data. We show that problems caused by some types of cores can be eliminated using set-cores. Further, we present a thorough analysis of tag recommender benchmarking setups using cores. To that end, we conduct a large-scale experiment on four real-world datasets in which we analyze the influence of different cores on the evaluation of recommendation algorithms. We can show that the results of the comparison of different recommendation approaches depend on the selection of core type and level. For the benchmarking of tag recommender algorithms, our results suggest that the evaluation must be set up more carefully and should not be based on one arbitrarily chosen core type and level.
- Description-Oriented Community Detection using Exhaustive Subgroup Discovery. Atzmueller, Martin; Doerfel, Stephan; Mitzlaff, Folke (2016). 329 965-984. Communities can intuitively be defined as subsets of nodes of a graph with a dense structure in the corresponding subgraph. However, for mining such communities usually only structural aspects are taken into account. Typically, no concise or easily interpretable community description is provided. For tackling this issue, this paper focuses on description-oriented community detection using subgroup discovery. In order to provide both structurally valid and interpretable communities we utilize the graph structure as well as additional descriptive features of the graph’s nodes. A descriptive community pattern built upon these features then describes and identifies a community, i.e., a set of nodes, and vice versa. Essentially, we mine patterns in the “description space” characterizing interesting sets of nodes (i.e., subgroups) in the “graph space”; the interestingness of a community is evaluated by a selectable quality measure. We aim at identifying communities according to standard community quality measures, while providing characteristic descriptions of these communities at the same time. For this task, we propose several optimistic estimates of standard community quality functions to be used for efficient pruning of the search space in an exhaustive branch-and-bound algorithm. We demonstrate our approach in an evaluation using five real-world data sets, obtained from three different social media applications.
- Fast Description-Oriented Community Detection using Subgroup Discovery (Extended Abstract, Resubmission). Atzmueller, Martin; Doerfel, Stephan; Mitzlaff, Folke (2015). Communities can intuitively be defined as subsets of nodes of a graph with a dense structure. However, for mining such communities usually only structural aspects are taken into account. Typically, no concise and easily interpretable community description is provided. For tackling this issue, we focus on fast description-oriented community detection using subgroup discovery, cf. [1, 2]. In order to provide both structurally valid and interpretable communities we utilize the graph structure as well as additional descriptive features of the contained nodes. A descriptive community pattern built upon these features then describes and identifies a community given by a set of nodes, and vice versa. Essentially, we mine for patterns in the “description space” characterizing interesting sets of nodes in the “graph/community space”; the interestingness of a community is then evaluated by a selectable quality measure. We aim at identifying communities according to standard community quality measures, while providing characteristic descriptions of the respective communities at the same time. In order to implement an efficient approach, we propose several optimistic estimates of standard community quality functions. Together with the proposed exhaustive branch-and-bound algorithm, these estimates enable fast description-oriented community detection. This is demonstrated in an evaluation using five real-world data sets, obtained from three different social media applications.
- On Publication Usage in a Social Bookmarking System. Zoller, Daniel; Doerfel, Stephan; Jäschke, Robert; Stumme, Gerd; Hotho, Andreas in WebSci '15 (2015). 67:1-67:2. Scholarly success is traditionally measured in terms of citations to publications. With the advent of publication management and digital libraries on the web, scholarly usage data has become a target of investigation and new impact metrics computed on such usage data have been proposed – so-called altmetrics. In scholarly social bookmarking systems, scientists collect and manage publication metadata and thus reveal their interest in these publications. In this work, we investigate connections between usage metrics and citations, and find posts, exports, and page views of publications to be correlated with citations.
- Ubicon and its applications for ubiquitous social computing. Atzmueller, Martin; Becker, Martin; Kibanov, Mark; Scholz, Christoph; Doerfel, Stephan; Hotho, Andreas; Macek, Bjoern-Elmar; Mitzlaff, Folke; Mueller, Juergen; Stumme, Gerd (2014). 20(1) 53-77. The combination of ubiquitous and social computing is an emerging research area which integrates different but complementary methods, techniques and tools. In this paper, we focus on the Ubicon platform, its applications, and a large spectrum of analysis results. Ubicon provides an extensible framework for building and hosting applications targeting both ubiquitous and social environments. We summarize the architecture and exemplify its implementation using four real-world applications built on top of Ubicon. In addition, we discuss several scientific experiments in the context of these applications in order to give a better picture of the potential of the framework, and discuss analysis results using several real-world data sets collected utilizing Ubicon.
- Of course we Share! Testing Assumptions about Social Tagging Systems. Doerfel, Stephan; Zoller, Daniel; Singer, Philipp; Niebler, Thomas; Hotho, Andreas; Strohmaier, Markus (2014). Social tagging systems have established themselves as an important part of today's web and have attracted the interest of our research community in a variety of investigations. The overall vision of our community is that simply through interactions with the system, i.e., through tagging and sharing of resources, users would contribute to building useful semantic structures as well as resource indexes using uncontrolled vocabulary, not only due to the easy-to-use mechanics. Consequently, a variety of assumptions about social tagging systems have emerged, yet testing them has been difficult due to the absence of suitable data. In this work we thoroughly investigate three such assumptions - e.g., is a tagging system really social? - by examining live log data gathered from the real-world public social tagging system BibSonomy. Our empirical results indicate that while some of these assumptions hold to a certain extent, other assumptions need to be reflected and viewed in a very critical light. Our observations have implications for the design of future search and other algorithms to better reflect the actual user behavior.
- ECML PKDD Discovery Challenge - Recommending Given Names. Doerfel, Stephan; Hotho, Andreas; Jäschke, Robert; Mitzlaff, Folke; Mueller, Juergen (2014). (Vol. 1120) CEUR-WS. All over the world, future parents are facing the task of finding a suitable given name for their children. Their choice is usually influenced by a variety of factors, such as the social context, language, cultural background and especially personal taste. Although this task is omnipresent, little research has been conducted on the analysis and application of interrelations among given names from a data mining perspective. Since 1999, ECML PKDD has embraced the tradition of organizing a Discovery Challenge, allowing researchers to develop and test algorithms for novel and real-world datasets. The Discovery Challenge 2013 tackled the task of recommending given names in the context of the name search engine Nameling. It consisted of an offline and an online phase. In both phases, participants were asked to create a name recommendation algorithm that could provide suitable suggestions of given names to users of Nameling. More than 40 participants/teams registered for the challenge, of which 17 handed in predictions of the offline challenge. After the end of the offline phase 6 teams submitted a paper. All papers have been peer reviewed and can be found in these proceedings. The different approaches to the challenge are presented at the ECML PKDD workshop on September 27th, 2013, in Prague, Czech Republic. The online challenge ran until the day before the workshop and four teams successfully participated with implementations meeting all required criteria. Details of the two challenge tasks, winners of both phases and an overview of the main findings are presented in the first paper of these proceedings.
- Summary of the 15th Discovery Challenge: Recommending Given Names. Mitzlaff, Folke; Doerfel, Stephan; Hotho, Andreas; Jäschke, Robert; Mueller, Juergen in S. Doerfel, Hotho, A., Jäschke, R., Mitzlaff, F., Mueller, J. (eds.) (2014). (Vol. 1120) 7-24. The 15th ECML PKDD Discovery Challenge centered around the recommendation of given names. Participants of the challenge implemented algorithms that were tested both offline - on data collected by the name search engine Nameling - and online within Nameling. Here, we describe both tasks in detail and discuss the publicly available datasets. We motivate and explain the chosen evaluation of the challenge, and we summarize the different approaches applied to the name recommendation tasks. Finally, we present the rankings and winners of the offline and the online phase.
- Mining Social Links for Ubiquitous Knowledge Engineering. Scholz, Christoph; Macek, Bjoern-Elmar; Atzmueller, Martin; Doerfel, Stephan; Stumme, Gerd in K. David, Geihs, K., Leimeister, J. M., Roßnagel, A., Schmidt, L., Stumme, G., Wacker, A. (eds.) (2014). 109-129. Exploiting social links is an important issue for enhancing ubiquitous knowledge engineering because they are a substitute for a wide range of properties depending on which relation spans the link: in the case of human face-to-face contacts, similar locations or potential knowledge transfer for the people in contact can be derived. This information can be used to improve the quality of ubiquitous services such as localization or recommendation systems. We capture this information by deploying active RFID setups in a variety of contexts. In this chapter, we focus especially on working groups and conferences and discuss and evaluate the achieved improvements using the gathered data.
- Evaluating Assumptions about Social Tagging - A Study of User Behavior in BibSonomy. Doerfel, Stephan; Zoller, Daniel; Singer, Philipp; Niebler, Thomas; Hotho, Andreas; Strohmaier, Markus in T. Seidl, Hassani, M., Beecks, C. (eds.) (2014). 18-19. Social tagging systems have established themselves as an important part of today’s web and have attracted the interest of our research community in a variety of investigations. Consequently, several assumptions about social tagging systems have emerged on which our community also builds its work. Yet, testing such assumptions has been difficult due to the absence of suitable usage data in the past. In this work, we investigate and evaluate four assumptions about tagging systems by examining live server log data gathered from the public social tagging system BibSonomy. Our empirical results indicate that while some of these assumptions hold to a certain extent, other assumptions need to be reflected in a very critical light.
- How Social is Social Tagging? Doerfel, Stephan; Zoller, Daniel; Singer, Philipp; Niebler, Thomas; Hotho, Andreas; Strohmaier, Markus (2014). 251-252. Social tagging systems have established themselves as an important part of today's web and have attracted the interest of our research community in a variety of investigations. This has led to several assumptions about tagging, such as that tagging systems exhibit a social component. In this work we overcome the previous absence of data for testing such an assumption. We thoroughly study social interaction, leveraging for the first time live log data gathered from the real-world public social tagging system BibSonomy. Our results indicate that sharing of resources constitutes an important and indeed social aspect of tagging.
- Deeper Into the Folksonomy Graph: FolkRank Adaptations and Extensions for Improved Tag Recommendations. Landia, Nikolas; Doerfel, Stephan; Jäschke, Robert; Anand, Sarabjot Singh; Hotho, Andreas; Griffiths, Nathan (2013). arXiv:1310.1498. The information contained in social tagging systems is often modelled as a graph of connections between users, items and tags. Recommendation algorithms such as FolkRank have the potential to leverage complex relationships in the data, corresponding to multiple hops in the graph. We present an in-depth analysis and evaluation of graph models for social tagging data and propose novel adaptations and extensions of FolkRank to improve tag recommendations. We highlight implicit assumptions made by the widely used folksonomy model, and propose an alternative and more accurate graph-representation of the data. Our extensions of FolkRank address the new item problem by incorporating content data into the algorithm, and significantly improve prediction results on unpruned datasets. Our adaptations address issues in the iterative weight spreading calculation that potentially hinder FolkRank's ability to leverage the deep graph as an information source. Moreover, we evaluate the benefit of considering each deeper level of the graph, and present important insights regarding the characteristics of social tagging data in general. Our results suggest that the base assumption made by conventional weight propagation methods, that closeness in the graph always implies a positive relationship, does not hold for the social tagging domain.
- Tag Recommendations for SensorFolkSonomies. Mueller, Juergen; Doerfel, Stephan; Becker, Martin; Hotho, Andreas; Stumme, Gerd in B. Mobasher, Jannach, D., Geyer, W., Freyne, J., Hotho, A., Anand, S. S., Guy, I. (eds.) (2013). (Vol. 1066) With the rising popularity of smart mobile devices, sensor data-based applications have become more and more popular. Their users record data during their daily routine or specifically for certain events. The application WideNoise Plus allows users to record sound samples and to annotate them with perceptions and tags. The app is being used to document and map the soundscape all over the world. The procedure of recording, including the assignment of tags, has to be as easy to use as possible. We therefore discuss the application of tag recommender algorithms in this particular scenario. We show that this task is fundamentally different from the well-known tag recommendation problem in folksonomies, as users no longer tag fixed resources but rather sensory data and impressions. The scenario requires efficient recommender algorithms that are able to run on the mobile device, since Internet connectivity cannot be assumed to be available. Therefore, we evaluate the performance of several tag recommendation algorithms and discuss their applicability in the mobile sensing use case.
- Informationelle Selbstbestimmung Im Web 2.0 - Chancen Und Risiken Sozialer Verschlagwortungssysteme. Doerfel, Stephan; Hotho, Andreas; Kartal-Aydemir, Aliye; Roßnagel, Alexander; Stumme, Gerd in Xpert.press, (S. Doerfel; Hotho, A.; Kartal-Aydemir, A.; Roßnagel, A.; Stumme, G., eds.) (2013). Vieweg + Teubner Verlag. The new generation of the internet (“Web 2.0” or “Social Web”) is characterized by its users providing information very freely. Against this background, computer scientists and legal scholars have explored and shaped the opportunities and risks of the new Web 2.0 technologies in close cooperation. After taking stock, the technical and legal opportunities and risks are analyzed with respect to typified tasks. Generic concepts for designing an application in compliance with data protection law, such as identity management, avoidance of personal references, profiling, and responsibilities, are developed. In parallel, algorithms and methods for these concepts are presented: recommender systems for collaborative tagging systems as well as spam detection methods for such systems. They are evaluated on real data. All results are illustrated using the social bookmarking system BibSonomy. Finally, it is discussed to what extent the doctrine and interpretation of data protection law must change in light of the new problems posed by Web 2.0, and whether legislative action may be necessary or advisable.
- An Analysis of Tag-Recommender Evaluation Procedures. Doerfel, Stephan; Jäschke, Robert in RecSys '13 (2013). 343-346. Since the rise of collaborative tagging systems on the web, the tag recommendation task (suggesting suitable tags to users of such systems while they add resources to their collection) has been tackled. However, the (offline) evaluation of tag recommendation algorithms usually suffers from difficulties like the sparseness of the data or the cold start problem for new resources or users. Previous studies therefore often used so-called post-cores (specific subsets of the original datasets) for their experiments. In this paper, we conduct a large-scale experiment in which we analyze different tag recommendation algorithms on different cores of three real-world datasets. We show that a recommender's performance depends on the particular core and explore correlations between performances on different cores.
- Publication Analysis of the Formal Concept Analysis Community. Doerfel, Stephan; Jäschke, Robert; Stumme, Gerd in Lecture Notes in Computer Science, F. Domenach, Ignatov, D. I., Poelmans, J. (eds.) (2012). (Vol. 7278) 77-95. We present an analysis of the publication and citation networks of all previous editions of the three conferences most relevant to the FCA community: ICFCA, ICCS and CLA. Using data mining methods from FCA and graph analysis, we investigate patterns and communities among authors, we identify and visualize influential publications and authors, and we give a statistical summary of the conferences’ history.
- Leveraging Publication Metadata and Social Data into FolkRank for Scientific Publication Recommendation. Doerfel, Stephan; Jäschke, Robert; Hotho, Andreas; Stumme, Gerd in RSWeb '12 (2012). 9-16. The ever-growing flood of new scientific articles requires novel retrieval mechanisms. One means of mitigating this instance of the information overload phenomenon is collaborative tagging systems, which allow users to select, share and annotate references to publications. These systems employ recommendation algorithms to present their users with personalized lists of interesting and relevant publications. In this paper we analyze different ways to incorporate social data and metadata from collaborative tagging systems into the graph-based ranking algorithm FolkRank in order to recommend scientific articles to users of the social bookmarking system BibSonomy. We compare the results to those of Collaborative Filtering, which has previously been applied for resource recommendation.
- Extending FolkRank with content data. Landia, Nikolas; Anand, Sarabjot Singh; Hotho, Andreas; Jäschke, Robert; Doerfel, Stephan; Mitzlaff, Folke in RSWeb '12 (2012). 1-8. Real-world tagging datasets have a large proportion of new or untagged documents. Few approaches for recommending tags to a user for a document address this new item problem, concentrating instead on artificially created post-core datasets where it is guaranteed that the user as well as the document of each test post is known to the system and already has some tags assigned to it. In order to recommend tags for new documents, approaches are required which model documents not only based on the tags assigned to them in the past (if any), but also on their content. In this paper we present a novel adaptation of the widely recognised FolkRank tag recommendation algorithm that includes content data. We adapt the FolkRank graph to use word nodes instead of document nodes, enabling it to recommend tags for new documents based on their textual content. Our adaptations make FolkRank applicable to post-core 1, i.e., the full real-world tagging datasets, and address the new item problem in tag recommendation. For comparison, we also apply and evaluate the same methodology of including content on a simpler tag recommendation algorithm. This results in a less expensive recommender which suggests a combination of user-related and document-content-related tags.
Including content data in FolkRank shows an improvement over plain FolkRank on full tagging datasets. However, we also observe that our simpler content-aware tag recommender outperforms FolkRank with content data. Our results suggest that an optimisation of the weighting method of FolkRank is required to achieve better results.
- Face-to-Face Contacts at a Conference: Dynamics of Communities and Roles. Atzmueller, Martin; Doerfel, Stephan; Hotho, Andreas; Mitzlaff, Folke; Stumme, Gerd M. Atzmueller, Chin, A., Helic, D., Hotho, A. (eds.) (2012). (Vol. 7472) 21-39. This paper focuses on the community analysis of conference participants using their face-to-face contacts, visited talks, and tracks in a social and ubiquitous conferencing scenario. We consider human face-to-face contacts and perform a dynamic analysis of the number of contacts and their lengths. On these dimensions, we specifically investigate user interaction and community structure according to different special interest groups during a conference. Additionally, using the community information, we examine different roles and their characteristic elements. The analysis is grounded using real-world conference data capturing community information about participants and their face-to-face contacts. The results indicate that the face-to-face contacts show inherent community structure grounded using the special interest groups. Furthermore, we provide individual and community-level properties, traces of different behavioral patterns, and characteristic (role) profiles.
- Ubicon: Observing Social and Physical Activities. Atzmueller, Martin; Becker, Martin; Doerfel, Stephan; Kibanov, Mark; Hotho, Andreas; Macek, Björn-Elmar; Mitzlaff, Folke; Mueller, Juergen; Scholz, Christoph; Stumme, Gerd J. Bourgeois, Zomaya, A. (eds.) (2012). 317-324. The connection of ubiquitous and social computing is an emerging research area that combines two prominent areas of computer science. In this paper, we tackle this topic from different angles: We describe data mining methods for ubiquitous and social data, specifically focusing on physical and social activities, and provide exemplary analysis results. Furthermore, we give an overview of the Ubicon platform, which provides a framework for the creation and hosting of ubiquitous and social applications for diverse tasks and projects. Ubicon features the collection and analysis of both physical and social activities of users, enabling inter-connected applications in ubiquitous and social contexts. We summarize three real-world systems built on top of Ubicon and discuss, by way of example, the corresponding mining and analysis aspects.
- Face-to-Face Contacts during LWA 2010 - Communities, Roles, and Key Players. Atzmueller, Martin; Doerfel, Stephan; Hotho, Andreas; Mitzlaff, Folke; Stumme, Gerd (2011).
- Face-to-Face Contacts during a Conference: Communities, Roles, and Key Players. Atzmueller, Martin; Doerfel, Stephan; Hotho, Andreas; Mitzlaff, Folke; Stumme, Gerd (2011).
- Resource-Aware On-line RFID Localization Using Proximity Data. Scholz, Christoph; Doerfel, Stephan; Atzmueller, Martin; Hotho, Andreas; Stumme, Gerd (2011). 129-144.
- Privatsphären- und Datenschutz in Community-Plattformen: Gestaltung von Online-Bewertungsportalen. Kartal, Aliye; Doerfel, Stephan; Roßnagel, Alexander; Stumme, Gerd in Lecture Notes in Informatics, H. -U. Heiß, Pepper, P., Schlingloff, H., Schneider, J. (eds.) (2011). (Vol. 192) 412. Owing to the by now vast variety of possible Web 2.0 applications, a suitable online community can be found for almost every area of life. The number of rating portals is also rising steadily; they have long ceased to cover only the rating of goods and now extend to assessments of the services and characteristics of persons belonging to particular professional groups. This development carries the risk that the personal data obtained in this way may well be suited to constructing, contrary to the truth, an excessively positive or excessively negative image of the person concerned and thereby influencing their reputation. With regard to questions of personality rights and data protection, this contribution sets out standards for a technical design of online rating portals that conforms to constitutional and data protection law.
- Enhancing Social Interactions at Conferences. Atzmueller, Martin; Benz, Dominik; Doerfel, Stephan; Hotho, Andreas; Jäschke, Robert; Macek, Bjoern Elmar; Mitzlaff, Folke; Scholz, Christoph; Stumme, Gerd (2011). 53(3) 101--107.
- A Context-Based Description of the Doubly Founded Concept Lattices in the Variety Generated by M_3. Doerfel, Stephan in Lecture Notes in Computer Science, P. Valtchev, Jäschke, R. (eds.) (2011). (Vol. 6628) 93-106. In universal algebra and in lattice theory the notion of varieties is very prominent, since varieties describe the classes of all algebras (or of all lattices) modeling a given set of equations. While a comprehensive translation of that notion to a similar notion of varieties of complete lattices – and thus to Formal Concept Analysis – has not yet been accomplished, some characterizations of the doubly founded complete lattices of some special varieties (e.g. the variety of modular or that of distributive lattices) have been discovered. In this paper we use the well-known arrow relations to give a characterization of the formal contexts of doubly founded concept lattices in the variety that is generated by M_3 – the smallest modular, non-distributive lattice variety.
- The Scaffolding of a Formal Context. Doerfel, Stephan M. Kryszkiewicz, Obiedkov, S. (eds.) (2010). (Vol. 672) 283-293. The scaffolding of a complete lattice L of finite length was introduced by Rudolf Wille in 1976 as a relative subsemilattice of L that can be constructed using subdirect decomposition. The lattice is uniquely defined by its scaffolding and can be reconstructed from it. Using bonds, we demonstrate how the scaffolding can be constructed from a given formal context and thereby extend the notion of the scaffolding to doubly founded lattices. Further, we explain the creation of a suitable graphical representation of the scaffolding from the context.
- Gerüste Formaler Kontexte. Technical Report (Master thesis), Doerfel, Stephan PhD thesis, TU Dresden. (2009). In 1976, Rudolf Wille introduced the scaffolding of a complete lattice V of finite length. It consists of an (ordered) subset of V from which the entire lattice can be reconstructed. In Formal Concept Analysis, complete lattices are described by contexts. To be able to work with these, various lattice-theoretic concepts and results for complete lattices have been adapted and formulated in concept-analytic terms, i.e., by means of contexts. One aim of this diploma thesis is to describe the construction of the scaffolding of a concept lattice directly from a context. In doing so, the applicability of the scaffolding is extended from complete lattices of finite length to doubly founded complete lattices (or their reduced contexts). The second part of the thesis considers the variety generated by M_3. The doubly founded complete lattices it contains are characterized by means of their reduced contexts.