Knowledge & Data Engineering Group (KDE), EECS, University of Kassel
Our latest publications
- Draude, C., Engert, S., Hess, T., Hirth, J., Horn, V., Kropf, J., Lamla, J., Stumme, G., Uhlmann, M., Zwingmann, N.: Verrechnung – Design – Kultivierung: Instrumentenkasten für die Gestaltung fairer Geschäftsmodelle durch Ko-Valuation, https://plattform-privatheit.de/p-prv-wAssets/Assets/Veroeffentlichungen_WhitePaper_PolicyPaper/whitepaper/WP_2024_FAIRDIENSTE_1.0.pdf, (2024). https://doi.org/10.24406/publica-2497.
@misc{claude2024verrechnung,
address = {Karlsruhe},
author = {Draude, Claude and Engert, Simon and Hess, Thomas and Hirth, Johannes and Horn, Viktoria and Kropf, Jonathan and Lamla, Jörn and Stumme, Gerd and Uhlmann, Markus and Zwingmann, Nina},
edition = 1,
editor = {Friedewald, Michael and Roßnagel, Alexander and Geminn, Christian and Karaboga, Murat},
howpublished = {White Paper},
keywords = {itegpub},
month = {03},
publisher = {Fraunhofer-Institut für System- und Innovationsforschung ISI},
series = {Plattform Privatheit},
title = {Verrechnung – Design – Kultivierung: Instrumentenkasten für die Gestaltung fairer Geschäftsmodelle durch Ko-Valuation},
year = 2024
}
%0 Generic
%1 claude2024verrechnung
%A Draude, Claude
%A Engert, Simon
%A Hess, Thomas
%A Hirth, Johannes
%A Horn, Viktoria
%A Kropf, Jonathan
%A Lamla, Jörn
%A Stumme, Gerd
%A Uhlmann, Markus
%A Zwingmann, Nina
%B Plattform Privatheit
%C Karlsruhe
%D 2024
%E Friedewald, Michael
%E Roßnagel, Alexander
%E Geminn, Christian
%E Karaboga, Murat
%I Fraunhofer-Institut für System- und Innovationsforschung ISI
%R 10.24406/publica-2497
%T Verrechnung – Design – Kultivierung: Instrumentenkasten für die Gestaltung fairer Geschäftsmodelle durch Ko-Valuation
%U https://plattform-privatheit.de/p-prv-wAssets/Assets/Veroeffentlichungen_WhitePaper_PolicyPaper/whitepaper/WP_2024_FAIRDIENSTE_1.0.pdf
%7 1
- Hille, T., Stubbemann, M., Hanika, T.: Reproducibility and Geometric Intrinsic Dimensionality: An Investigation on Graph Neural Network Research. Transactions on Machine Learning Research. (2024).
@article{hille2024reproducibility,
author = {Hille, Tobias and Stubbemann, Maximilian and Hanika, Tom},
journal = {Transactions on Machine Learning Research},
keywords = {itegpub},
note = {Reproducibility Certification},
title = {Reproducibility and Geometric Intrinsic Dimensionality: An Investigation on Graph Neural Network Research.},
year = 2024
}
%0 Journal Article
%1 hille2024reproducibility
%A Hille, Tobias
%A Stubbemann, Maximilian
%A Hanika, Tom
%D 2024
%J Transactions on Machine Learning Research
%T Reproducibility and Geometric Intrinsic Dimensionality: An Investigation on Graph Neural Network Research.
%U https://openreview.net/forum?id=CtEGxIqtud
- Hirth, J., Horn, V., Stumme, G., Hanika, T.: Ordinal motifs in lattices. Information Sciences. 659, 120009 (2024). https://doi.org/10.1016/j.ins.2023.120009.
@article{HIRTH2024120009,
author = {Hirth, Johannes and Horn, Viktoria and Stumme, Gerd and Hanika, Tom},
journal = {Information Sciences},
keywords = {itegpub},
pages = 120009,
title = {Ordinal motifs in lattices},
volume = 659,
year = 2024
}
%0 Journal Article
%1 HIRTH2024120009
%A Hirth, Johannes
%A Horn, Viktoria
%A Stumme, Gerd
%A Hanika, Tom
%D 2024
%J Information Sciences
%P 120009
%R 10.1016/j.ins.2023.120009
%T Ordinal motifs in lattices
%U https://www.sciencedirect.com/science/article/pii/S0020025523015943
%V 659
- Hanika, T., Jäschke, R.: A Repository for Formal Contexts. In: Proceedings of the 1st International Joint Conference on Conceptual Knowledge Structures (2024).
  Data is always at the center of the theoretical development and investigation of the applicability of formal concept analysis. It is therefore not surprising that a large number of data sets are repeatedly used in scholarly articles and software tools, acting as de facto standard data sets. However, the distribution of the data sets poses a problem for the sustainable development of the research field. There is a lack of a central location that provides and describes FCA data sets and links them to already known analysis results. This article analyses the current state of the dissemination of FCA data sets, presents the requirements for a central FCA repository, and highlights the challenges for this.
@inproceedings{hanika2024repository,
abstract = {Data is always at the center of the theoretical development and investigation of the applicability of formal concept analysis. It is therefore not surprising that a large number of data sets are repeatedly used in scholarly articles and software tools, acting as de facto standard data sets. However, the distribution of the data sets poses a problem for the sustainable development of the research field. There is a lack of a central location that provides and describes FCA data sets and links them to already known analysis results. This article analyses the current state of the dissemination of FCA data sets, presents the requirements for a central FCA repository, and highlights the challenges for this.},
author = {Hanika, Tom and Jäschke, Robert},
booktitle = {Proceedings of the 1st International Joint Conference on Conceptual Knowledge Structures},
keywords = {repository},
title = {A Repository for Formal Contexts},
year = 2024
}
%0 Conference Paper
%1 hanika2024repository
%A Hanika, Tom
%A Jäschke, Robert
%B Proceedings of the 1st International Joint Conference on Conceptual Knowledge Structures
%D 2024
%T A Repository for Formal Contexts
%U https://arxiv.org/abs/2404.04344
%X Data is always at the center of the theoretical development and investigation of the applicability of formal concept analysis. It is therefore not surprising that a large number of data sets are repeatedly used in scholarly articles and software tools, acting as de facto standard data sets. However, the distribution of the data sets poses a problem for the sustainable development of the research field. There is a lack of a central location that provides and describes FCA data sets and links them to already known analysis results. This article analyses the current state of the dissemination of FCA data sets, presents the requirements for a central FCA repository, and highlights the challenges for this.
- Hanika, T., Hille, T.: What is the Intrinsic Dimension of Your Binary Data? - and How to Compute it Quickly. In: Cabrera, I.P., Ferré, S., and Obiedkov, S.A. (eds.) Conceptual Knowledge Structures - First International Joint Conference, CONCEPTS 2024, Cádiz, Spain, September 9-13, 2024, Proceedings. pp. 97–112. Springer (2024). https://doi.org/10.1007/978-3-031-67868-4_7.
@inproceedings{DBLP:conf/concepts/HanikaH24,
author = {Hanika, Tom and Hille, Tobias},
booktitle = {Conceptual Knowledge Structures - First International Joint Conference, {CONCEPTS} 2024, C{{á}}diz, Spain, September 9-13, 2024, Proceedings},
editor = {Cabrera, Inma P. and Ferr{{é}}, S{{é}}bastien and Obiedkov, Sergei A.},
keywords = {itegpub},
pages = {97--112},
publisher = {Springer},
series = {Lecture Notes in Computer Science},
title = {What is the Intrinsic Dimension of Your Binary Data? - and How to Compute it Quickly},
volume = 14914,
year = 2024
}
%0 Conference Paper
%1 DBLP:conf/concepts/HanikaH24
%A Hanika, Tom
%A Hille, Tobias
%B Conceptual Knowledge Structures - First International Joint Conference, CONCEPTS 2024, Cádiz, Spain, September 9-13, 2024, Proceedings
%D 2024
%E Cabrera, Inma P.
%E Ferré, Sébastien
%E Obiedkov, Sergei A.
%I Springer
%P 97--112
%R 10.1007/978-3-031-67868-4_7
%T What is the Intrinsic Dimension of Your Binary Data? - and How to Compute it Quickly
%U https://doi.org/10.1007/978-3-031-67868-4_7
%V 14914
- Draude, C., Dürrschnabel, D., Hirth, J., Horn, V., Kropf, J., Lamla, J., Stumme, G., Uhlmann, M.: Conceptual Mapping of Controversies. In: Cabrera, I.P., Ferré, S., and Obiedkov, S. (eds.) Conceptual Knowledge Structures. pp. 201–216. Springer Nature Switzerland, Cham (2024).
  With our work, we contribute towards a qualitative analysis of the discourse on controversies in online news media. For this, we employ Formal Concept Analysis and the economics of conventions to derive conceptual controversy maps. In our experiments, we analyze two maps from different news journals with methods from ordinal data science. We show how these methods can be used to assess the diversity, complexity and potential bias of controversies. In addition to that, we discuss how the diagrams of concept lattices can be used to navigate between news articles.
@inproceedings{10.1007/978-3-031-67868-4_14,
abstract = {With our work, we contribute towards a qualitative analysis of the discourse on controversies in online news media. For this, we employ Formal Concept Analysis and the economics of conventions to derive conceptual controversy maps. In our experiments, we analyze two maps from different news journals with methods from ordinal data science. We show how these methods can be used to assess the diversity, complexity and potential bias of controversies. In addition to that, we discuss how the diagrams of concept lattices can be used to navigate between news articles.},
address = {Cham},
author = {Draude, Claude and Dürrschnabel, Dominik and Hirth, Johannes and Horn, Viktoria and Kropf, Jonathan and Lamla, J{ö}rn and Stumme, Gerd and Uhlmann, Markus},
booktitle = {Conceptual Knowledge Structures},
editor = {Cabrera, Inma P. and Ferré, Sébastien and Obiedkov, Sergei},
keywords = {itegpub},
pages = {201--216},
publisher = {Springer Nature Switzerland},
title = {Conceptual Mapping of Controversies},
year = 2024
}
%0 Conference Paper
%1 10.1007/978-3-031-67868-4_14
%A Draude, Claude
%A Dürrschnabel, Dominik
%A Hirth, Johannes
%A Horn, Viktoria
%A Kropf, Jonathan
%A Lamla, Jörn
%A Stumme, Gerd
%A Uhlmann, Markus
%B Conceptual Knowledge Structures
%C Cham
%D 2024
%E Cabrera, Inma P.
%E Ferré, Sébastien
%E Obiedkov, Sergei
%I Springer Nature Switzerland
%P 201--216
%T Conceptual Mapping of Controversies
%X With our work, we contribute towards a qualitative analysis of the discourse on controversies in online news media. For this, we employ Formal Concept Analysis and the economics of conventions to derive conceptual controversy maps. In our experiments, we analyze two maps from different news journals with methods from ordinal data science. We show how these methods can be used to assess the diversity, complexity and potential bias of controversies. In addition to that, we discuss how the diagrams of concept lattices can be used to navigate between news articles.
%@ 978-3-031-67868-4
- Abdulla, M., Hirth, J., Stumme, G.: The Birkhoff Completion of Finite Lattices. In: Cabrera, I.P., Ferré, S., and Obiedkov, S. (eds.) Conceptual Knowledge Structures. pp. 20–35. Springer Nature Switzerland, Cham (2024).
  We introduce the Birkhoff completion as the smallest distributive lattice in which a given finite lattice can be embedded as semi-lattice. We discuss its relationship to implicational theories, in particular to R. Wille's simply-implicational theories. By an example, we show how the Birkhoff completion can be used as a tool for ordinal data science.
@inproceedings{10.1007/978-3-031-67868-4_2,
abstract = {We introduce the Birkhoff completion as the smallest distributive lattice in which a given finite lattice can be embedded as semi-lattice. We discuss its relationship to implicational theories, in particular to R. Wille's simply-implicational theories. By an example, we show how the Birkhoff completion can be used as a tool for ordinal data science.},
address = {Cham},
author = {Abdulla, Mohammad and Hirth, Johannes and Stumme, Gerd},
booktitle = {Conceptual Knowledge Structures},
editor = {Cabrera, Inma P. and Ferré, Sébastien and Obiedkov, Sergei},
keywords = {itegpub},
pages = {20--35},
publisher = {Springer Nature Switzerland},
title = {The Birkhoff Completion of Finite Lattices},
year = 2024
}
%0 Conference Paper
%1 10.1007/978-3-031-67868-4_2
%A Abdulla, Mohammad
%A Hirth, Johannes
%A Stumme, Gerd
%B Conceptual Knowledge Structures
%C Cham
%D 2024
%E Cabrera, Inma P.
%E Ferré, Sébastien
%E Obiedkov, Sergei
%I Springer Nature Switzerland
%P 20--35
%T The Birkhoff Completion of Finite Lattices
%X We introduce the Birkhoff completion as the smallest distributive lattice in which a given finite lattice can be embedded as semi-lattice. We discuss its relationship to implicational theories, in particular to R. Wille's simply-implicational theories. By an example, we show how the Birkhoff completion can be used as a tool for ordinal data science.
%@ 978-3-031-67868-4
- Hirth, J., Hanika, T.: The Geometric Structure of Topic Models, (2024).
  Topic models are a popular tool for clustering and analyzing textual data. They allow texts to be classified on the basis of their affiliation to the previously calculated topics. Despite their widespread use in research and application, an in-depth analysis of topic models is still an open research topic. State-of-the-art methods for interpreting topic models are based on simple visualizations, such as similarity matrices, top-term lists or embeddings, which are limited to a maximum of three dimensions. In this paper, we propose an incidence-geometric method for deriving an ordinal structure from flat topic models, such as non-negative matrix factorization. These enable the analysis of the topic model in a higher (order) dimension and the possibility of extracting conceptual relationships between several topics at once. Due to the use of conceptual scaling, our approach does not introduce any artificial topical relationships, such as artifacts of feature compression. Based on our findings, we present a new visualization paradigm for concept hierarchies based on ordinal motifs. These allow for a top-down view on topic spaces. We introduce and demonstrate the applicability of our approach based on a topic model derived from a corpus of scientific papers taken from 32 top machine learning venues.
@preprint{hirth2024geometric,
abstract = {Topic models are a popular tool for clustering and analyzing textual data. They allow texts to be classified on the basis of their affiliation to the previously calculated topics. Despite their widespread use in research and application, an in-depth analysis of topic models is still an open research topic. State-of-the-art methods for interpreting topic models are based on simple visualizations, such as similarity matrices, top-term lists or embeddings, which are limited to a maximum of three dimensions. In this paper, we propose an incidence-geometric method for deriving an ordinal structure from flat topic models, such as non-negative matrix factorization. These enable the analysis of the topic model in a higher (order) dimension and the possibility of extracting conceptual relationships between several topics at once. Due to the use of conceptual scaling, our approach does not introduce any artificial topical relationships, such as artifacts of feature compression. Based on our findings, we present a new visualization paradigm for concept hierarchies based on ordinal motifs. These allow for a top-down view on topic spaces. We introduce and demonstrate the applicability of our approach based on a topic model derived from a corpus of scientific papers taken from 32 top machine learning venues.},
author = {Hirth, Johannes and Hanika, Tom},
keywords = {kde},
title = {The Geometric Structure of Topic Models},
year = 2024
}
%0 Generic
%1 hirth2024geometric
%A Hirth, Johannes
%A Hanika, Tom
%D 2024
%T The Geometric Structure of Topic Models
%X Topic models are a popular tool for clustering and analyzing textual data. They allow texts to be classified on the basis of their affiliation to the previously calculated topics. Despite their widespread use in research and application, an in-depth analysis of topic models is still an open research topic. State-of-the-art methods for interpreting topic models are based on simple visualizations, such as similarity matrices, top-term lists or embeddings, which are limited to a maximum of three dimensions. In this paper, we propose an incidence-geometric method for deriving an ordinal structure from flat topic models, such as non-negative matrix factorization. These enable the analysis of the topic model in a higher (order) dimension and the possibility of extracting conceptual relationships between several topics at once. Due to the use of conceptual scaling, our approach does not introduce any artificial topical relationships, such as artifacts of feature compression. Based on our findings, we present a new visualization paradigm for concept hierarchies based on ordinal motifs. These allow for a top-down view on topic spaces. We introduce and demonstrate the applicability of our approach based on a topic model derived from a corpus of scientific papers taken from 32 top machine learning venues.
- Horn, V., Hirth, J., Holfeld, J., Behmenburg, J.H., Draude, C., Stumme, G.: Disclosing Diverse Perspectives of News Articles for Navigating between Online Journalism Content. In: Nordic Conference on Human-Computer Interaction. Association for Computing Machinery, Uppsala, Sweden (2024). https://doi.org/10.1145/3679318.3685414.
  Today, exposure to journalistic online content is predominantly controlled by news recommender systems, which often suggest content that matches user’s interests or is selected according to non-transparent recommendation criteria. To circumvent resulting trade-offs like polarisation or fragmentation whilst ensuring user’s autonomy, we explore how different perspectives within online news can be disclosed instead for guiding navigation. To do so, we developed an interactive prototype that displays article titles in correspondence to their argumentative orientation. In order to investigate how the usage of our novel navigation structure impacts the choice of news articles and user experience, we conducted an exploratory user study assessing the impact of the design parameters chosen. Implications are drawn from the study results and the development of the interactive prototype for the exposure to diversity in the context of navigating news content online.
@inproceedings{hci-lattice,
abstract = {Today, exposure to journalistic online content is predominantly controlled by news recommender systems, which often suggest content that matches user’s interests or is selected according to non-transparent recommendation criteria. To circumvent resulting trade-offs like polarisation or fragmentation whilst ensuring user’s autonomy, we explore how different perspectives within online news can be disclosed instead for guiding navigation. To do so, we developed an interactive prototype that displays article titles in correspondence to their argumentative orientation. In order to investigate how the usage of our novel navigation structure impacts the choice of news articles and user experience, we conducted an exploratory user study assessing the impact of the design parameters chosen. Implications are drawn from the study results and the development of the interactive prototype for the exposure to diversity in the context of navigating news content online.},
address = {New York, NY, USA},
author = {Horn, Viktoria and Hirth, Johannes and Holfeld, Julian and Behmenburg, Jens Hendrik and Draude, Claude and Stumme, Gerd},
booktitle = {Nordic Conference on Human-Computer Interaction},
keywords = {itegpub},
publisher = {Association for Computing Machinery},
series = {NordiCHI 2024},
title = {Disclosing Diverse Perspectives of News Articles for Navigating between Online Journalism Content},
year = 2024
}
%0 Conference Paper
%1 hci-lattice
%A Horn, Viktoria
%A Hirth, Johannes
%A Holfeld, Julian
%A Behmenburg, Jens Hendrik
%A Draude, Claude
%A Stumme, Gerd
%B Nordic Conference on Human-Computer Interaction
%C New York, NY, USA
%D 2024
%I Association for Computing Machinery
%R 10.1145/3679318.3685414
%T Disclosing Diverse Perspectives of News Articles for Navigating between Online Journalism Content
%U https://doi.org/10.1145/3679318.3685414
%X Today, exposure to journalistic online content is predominantly controlled by news recommender systems, which often suggest content that matches user’s interests or is selected according to non-transparent recommendation criteria. To circumvent resulting trade-offs like polarisation or fragmentation whilst ensuring user’s autonomy, we explore how different perspectives within online news can be disclosed instead for guiding navigation. To do so, we developed an interactive prototype that displays article titles in correspondence to their argumentative orientation. In order to investigate how the usage of our novel navigation structure impacts the choice of news articles and user experience, we conducted an exploratory user study assessing the impact of the design parameters chosen. Implications are drawn from the study results and the development of the interactive prototype for the exposure to diversity in the context of navigating news content online.
%@ 9798400709661
- Hirth, J.: Conceptual Data Scaling in Machine Learning, (2024). https://doi.org/10.17170/kobra-2024100910940.
  Information that is intended for human interpretation is frequently represented in a structured manner. This allows for a navigation between individual pieces to find, connect or combine information to gain new insights. Within a structure, we derive knowledge from inference of hierarchical or logical relations between data objects. For unstructured data there are numerous methods to define a data schema based on user interpretations. Afterward, data objects can be aggregated to derive (hierarchical) structures based on common properties. There are four main challenges with respect to the explainability of the derived structures. First, formal procedures are needed to infer knowledge about the data set, or parts of it, from hierarchical structures. Second, what does knowledge inferred from a structure imply for the data set it was derived from? Third, structures may be incomprehensibly large for human interpretation. Methods are needed to reduce structures to smaller representations in a consistent, comprehensible manner that provides control over possibly introduced error. Fourth, the original data set does not need to have interpretable features and thus only allow for the inference of structural properties. In order to extract information based on real world properties, we need methods that are able to add such properties. With the presented work, we address these challenges using and extending the rich tool-set of Formal Concept Analysis. Here, data objects are aggregated to closed sets called formal concepts based on (unary) symbolic attributes that they have in common. The process of deriving symbolic attributes is called conceptual scaling and depends on the interpretation of the data by the analyst. The resulting hierarchical structure of concepts is called concept lattice. To infer knowledge from the concept lattice structures we introduce new methods based on sub-structures that are of standardized shape, called ordinal motifs. This novel method allows us to explain the structure of a concept lattice based on geometric aspects. Throughout our work, we focus on data representations from multiple state-of-the-art machine learning algorithms. In all cases, we elaborate extensively on how to interpret these models through derived concept lattices and develop scaling procedures specific to each algorithm. Some of the considered models are black-box models whose internal data representations are numeric with no clear real world semantics. For these, we present a method to link background knowledge to the concept lattice structure. To reduce the complexity of concept lattices we provide a new theoretical framework that allows us to generate (small) views on a concept lattice. These enable more selective and comprehensibly sized explanations for data parts that are of interest. In addition to that, we introduce methods to combine and subtract views from each other, and to identify missing or incorrect parts.
@phdthesis{doi:10.17170/kobra-2024100910940,
abstract = {Information that is intended for human interpretation is frequently represented in a structured manner. This allows for a navigation between individual pieces to find, connect or combine information to gain new insights. Within a structure, we derive knowledge from inference of hierarchical or logical relations between data objects. For unstructured data there are numerous methods to define a data schema based on user interpretations. Afterward, data objects can be aggregated to derive (hierarchical) structures based on common properties. There are four main challenges with respect to the explainability of the derived structures. First, formal procedures are needed to infer knowledge about the data set, or parts of it, from hierarchical structures. Second, what does knowledge inferred from a structure imply for the data set it was derived from? Third, structures may be incomprehensibly large for human interpretation. Methods are needed to reduce structures to smaller representations in a consistent, comprehensible manner that provides control over possibly introduced error. Fourth, the original data set does not need to have interpretable features and thus only allow for the inference of structural properties. In order to extract information based on real world properties, we need methods that are able to add such properties. With the presented work, we address these challenges using and extending the rich tool-set of Formal Concept Analysis. Here, data objects are aggregated to closed sets called formal concepts based on (unary) symbolic attributes that they have in common. The process of deriving symbolic attributes is called conceptual scaling and depends on the interpretation of the data by the analyst. The resulting hierarchical structure of concepts is called concept lattice. To infer knowledge from the concept lattice structures we introduce new methods based on sub-structures that are of standardized shape, called ordinal motifs. This novel method allows us to explain the structure of a concept lattice based on geometric aspects. Throughout our work, we focus on data representations from multiple state-of-the-art machine learning algorithms. In all cases, we elaborate extensively on how to interpret these models through derived concept lattices and develop scaling procedures specific to each algorithm. Some of the considered models are black-box models whose internal data representations are numeric with no clear real world semantics. For these, we present a method to link background knowledge to the concept lattice structure. To reduce the complexity of concept lattices we provide a new theoretical framework that allows us to generate (small) views on a concept lattice. These enable more selective and comprehensibly sized explanations for data parts that are of interest. In addition to that, we introduce methods to combine and subtract views from each other, and to identify missing or incorrect parts.},
author = {Hirth, Johannes},
keywords = {Knowledge~Representation},
school = {Kassel, Universität Kassel, Fachbereich Elektrotechnik/Informatik},
title = {Conceptual Data Scaling in Machine Learning},
year = 2024
}
%0 Thesis
%1 doi:10.17170/kobra-2024100910940
%A Hirth, Johannes
%D 2024
%R 10.17170/kobra-2024100910940
%T Conceptual Data Scaling in Machine Learning
%X Information that is intended for human interpretation is frequently represented in a structured manner. This allows for a navigation between individual pieces to find, connect or combine information to gain new insights. Within a structure, we derive knowledge from inference of hierarchical or logical relations between data objects. For unstructured data there are numerous methods to define a data schema based on user interpretations. Afterward, data objects can be aggregated to derive (hierarchical) structures based on common properties. There are four main challenges with respect to the explainability of the derived structures. First, formal procedures are needed to infer knowledge about the data set, or parts of it, from hierarchical structures. Second, what does knowledge inferred from a structure imply for the data set it was derived from? Third, structures may be incomprehensibly large for human interpretation. Methods are needed to reduce structures to smaller representations in a consistent, comprehensible manner that provides control over possibly introduced error. Fourth, the original data set does not need to have interpretable features and thus only allow for the inference of structural properties. In order to extract information based on real world properties, we need methods that are able to add such properties. With the presented work, we address these challenges using and extending the rich tool-set of Formal Concept Analysis. Here, data objects are aggregated to closed sets called formal concepts based on (unary) symbolic attributes that they have in common. The process of deriving symbolic attributes is called conceptual scaling and depends on the interpretation of the data by the analyst. The resulting hierarchical structure of concepts is called concept lattice. To infer knowledge from the concept lattice structures we introduce new methods based on sub-structures that are of standardized shape, called ordinal motifs. This novel method allows us to explain the structure of a concept lattice based on geometric aspects. Throughout our work, we focus on data representations from multiple state-of-the-art machine learning algorithms. In all cases, we elaborate extensively on how to interpret these models through derived concept lattices and develop scaling procedures specific to each algorithm. Some of the considered models are black-box models whose internal data representations are numeric with no clear real world semantics. For these, we present a method to link background knowledge to the concept lattice structure. To reduce the complexity of concept lattices we provide a new theoretical framework that allows us to generate (small) views on a concept lattice. These enable more selective and comprehensibly sized explanations for data parts that are of interest. In addition to that, we introduce methods to combine and subtract views from each other, and to identify missing or incorrect parts.
- Dürrschnabel, D., Priss, U.: Realizability of Rectangular Euler Diagrams, (2024).
@misc{dürrschnabel2024realizability,
author = {Dürrschnabel, Dominik and Priss, Uta},
keywords = {itegpub},
title = {Realizability of Rectangular Euler Diagrams},
year = 2024
}
%0 Generic
%1 dürrschnabel2024realizability
%A Dürrschnabel, Dominik
%A Priss, Uta
%D 2024
%T Realizability of Rectangular Euler Diagrams
- Schäfermeier, B., Hirth, J., Hanika, T.: Research Topic Flows in Co-Authorship Networks. Scientometrics. 128, 5051–5078 (2023). https://doi.org/10.1007/s11192-022-04529-w.
  In scientometrics, scientific collaboration is often analyzed by means of co-authorships. An aspect which is often overlooked and more difficult to quantify is the flow of expertise between authors from different research topics, which is an important part of scientific progress. With the Topic Flow Network (TFN) we propose a graph structure for the analysis of research topic flows between scientific authors and their respective research fields. Based on a multi-graph and a topic model, our proposed network structure accounts for intratopic as well as intertopic flows. Our method requires for the construction of a TFN solely a corpus of publications (i.e., author and abstract information). From this, research topics are discovered automatically through non-negative matrix factorization. The thereof derived TFN allows for the application of social network analysis techniques, such as common metrics and community detection. Most importantly, it allows for the analysis of intertopic flows on a large, macroscopic scale, i.e., between research topics, as well as on a microscopic scale, i.e., between certain sets of authors. We demonstrate the utility of TFNs by applying our method to two comprehensive corpora of altogether 20 million publications spanning more than 60 years of research in the fields of computer science and mathematics. Our results give evidence that TFNs are suitable, e.g., for the analysis of topical communities, the discovery of important authors in different fields, and, most notably, the analysis of intertopic flows, i.e., the transfer of topical expertise. Besides that, our method opens new directions for future research, such as the investigation of influence relationships between research fields.
@article{schafermeier2022research,
abstract = {In scientometrics, scientific collaboration is often analyzed by means of co-authorships. An aspect which is often overlooked and more difficult to quantify is the flow of expertise between authors from different research topics, which is an important part of scientific progress. With the Topic Flow Network (TFN) we propose a graph structure for the analysis of research topic flows between scientific authors and their respective research fields. Based on a multi-graph and a topic model, our proposed network structure accounts for intratopic as well as intertopic flows. Our method requires for the construction of a TFN solely a corpus of publications (i.e., author and abstract information). From this, research topics are discovered automatically through non-negative matrix factorization. The thereof derived TFN allows for the application of social network analysis techniques, such as common metrics and community detection. Most importantly, it allows for the analysis of intertopic flows on a large, macroscopic scale, i.e., between research topic, as well as on a microscopic scale, i.e., between certain sets of authors. We demonstrate the utility of TFNs by applying our method to two comprehensive corpora of altogether 20 Mio. publications spanning more than 60 years of research in the fields computer science and mathematics. Our results give evidence that TFNs are suitable, e.g., for the analysis of topical communities, the discovery of important authors in different fields, and, most notably, the analysis of intertopic flows, i.e., the transfer of topical expertise. Besides that, our method opens new directions for future research, such as the investigation of influence relationships between research fields.},
author = {Schäfermeier, Bastian and Hirth, Johannes and Hanika, Tom},
journal = {Scientometrics},
keywords = {co-authorships},
month = {09},
number = 9,
pages = {5051--5078},
title = {Research Topic Flows in Co-Authorship Networks},
volume = 128,
year = 2023
}
%0 Journal Article
%1 schafermeier2022research
%A Schäfermeier, Bastian
%A Hirth, Johannes
%A Hanika, Tom
%D 2023
%J Scientometrics
%N 9
%P 5051--5078
%R 10.1007/s11192-022-04529-w
%T Research Topic Flows in Co-Authorship Networks
%V 128
%X In scientometrics, scientific collaboration is often analyzed by means of co-authorships. An aspect which is often overlooked and more difficult to quantify is the flow of expertise between authors from different research topics, which is an important part of scientific progress. With the Topic Flow Network (TFN) we propose a graph structure for the analysis of research topic flows between scientific authors and their respective research fields. Based on a multi-graph and a topic model, our proposed network structure accounts for intratopic as well as intertopic flows. Our method requires for the construction of a TFN solely a corpus of publications (i.e., author and abstract information). From this, research topics are discovered automatically through non-negative matrix factorization. The thereof derived TFN allows for the application of social network analysis techniques, such as common metrics and community detection. Most importantly, it allows for the analysis of intertopic flows on a large, macroscopic scale, i.e., between research topic, as well as on a microscopic scale, i.e., between certain sets of authors. We demonstrate the utility of TFNs by applying our method to two comprehensive corpora of altogether 20 Mio. publications spanning more than 60 years of research in the fields computer science and mathematics. Our results give evidence that TFNs are suitable, e.g., for the analysis of topical communities, the discovery of important authors in different fields, and, most notably, the analysis of intertopic flows, i.e., the transfer of topical expertise. Besides that, our method opens new directions for future research, such as the investigation of influence relationships between research fields.
- 1.Felde, M., Stumme, G.: Interactive collaborative exploration using incomplete contexts. Data & Knowledge Engineering. 143, 102104 (2023). https://doi.org/10.1016/j.datak.2022.102104.
@article{Felde_2023,
author = {Felde, Maximilian and Stumme, Gerd},
journal = {Data \& Knowledge Engineering},
keywords = {itegpub},
month = {01},
pages = 102104,
publisher = {Elsevier {BV}},
title = {Interactive collaborative exploration using incomplete contexts},
volume = 143,
year = 2023
}
%0 Journal Article
%1 Felde_2023
%A Felde, Maximilian
%A Stumme, Gerd
%D 2023
%I Elsevier BV
%J Data & Knowledge Engineering
%P 102104
%R 10.1016/j.datak.2022.102104
%T Interactive collaborative exploration using incomplete contexts
%U https://doi.org/10.1016/j.datak.2022.102104
%V 143
- 1.Hirth, J., Horn, V., Stumme, G., Hanika, T.: Ordinal Motifs in Lattices, https://arxiv.org/abs/2304.04827, (2023).
Lattices are a commonly used structure for the representation and analysis of relational and ontological knowledge. In particular, the analysis of these requires a decomposition of a large and high-dimensional lattice into a set of understandably large parts. With the present work we propose /ordinal motifs/ as analytical units of meaning. We study these ordinal substructures (or standard scales) through (full) scale-measures of formal contexts from the field of formal concept analysis. We show that the underlying decision problems are NP-complete and provide results on how one can incrementally identify ordinal motifs to save computational effort. Accompanying our theoretical results, we demonstrate how ordinal motifs can be leveraged to retrieve basic meaning from a medium sized ordinal data set.
@misc{hirth2023ordinal,
abstract = {Lattices are a commonly used structure for the representation and analysis of relational and ontological knowledge. In particular, the analysis of these requires a decomposition of a large and high-dimensional lattice into a set of understandably large parts. With the present work we propose /ordinal motifs/ as analytical units of meaning. We study these ordinal substructures (or standard scales) through (full) scale-measures of formal contexts from the field of formal concept analysis. We show that the underlying decision problems are NP-complete and provide results on how one can incrementally identify ordinal motifs to save computational effort. Accompanying our theoretical results, we demonstrate how ordinal motifs can be leveraged to retrieve basic meaning from a medium sized ordinal data set.},
author = {Hirth, Johannes and Horn, Viktoria and Stumme, Gerd and Hanika, Tom},
keywords = {itegpub},
title = {Ordinal Motifs in Lattices},
year = 2023
}
%0 Generic
%1 hirth2023ordinal
%A Hirth, Johannes
%A Horn, Viktoria
%A Stumme, Gerd
%A Hanika, Tom
%D 2023
%T Ordinal Motifs in Lattices
%U https://arxiv.org/abs/2304.04827
%X Lattices are a commonly used structure for the representation and analysis of relational and ontological knowledge. In particular, the analysis of these requires a decomposition of a large and high-dimensional lattice into a set of understandably large parts. With the present work we propose /ordinal motifs/ as analytical units of meaning. We study these ordinal substructures (or standard scales) through (full) scale-measures of formal contexts from the field of formal concept analysis. We show that the underlying decision problems are NP-complete and provide results on how one can incrementally identify ordinal motifs to save computational effort. Accompanying our theoretical results, we demonstrate how ordinal motifs can be leveraged to retrieve basic meaning from a medium sized ordinal data set.
- 1.Felde, M., Koyda, M.: Interval-dismantling for lattices. International Journal of Approximate Reasoning. 159, 108931 (2023). https://doi.org/10.1016/j.ijar.2023.108931.
Dismantling allows for the removal of elements from a poset, or in our case lattice, without disturbing the remaining structure. In this paper we have extended the notion of dismantling by single elements to the dismantling by intervals in a lattice. We utilize theory from Formal Concept Analysis (FCA) to show that lattices dismantled by intervals correspond to closed subrelations in the respective formal context, and that there exists a unique core with respect to dismantling by intervals. Furthermore, we show that dismantling intervals can be identified directly in the formal context utilizing a characterization via arrow relations and provide an algorithm to compute all dismantling intervals.
@article{FELDE2023108931,
abstract = {Dismantling allows for the removal of elements from a poset, or in our case lattice, without disturbing the remaining structure. In this paper we have extended the notion of dismantling by single elements to the dismantling by intervals in a lattice. We utilize theory from Formal Concept Analysis (FCA) to show that lattices dismantled by intervals correspond to closed subrelations in the respective formal context, and that there exists a unique core with respect to dismantling by intervals. Furthermore, we show that dismantling intervals can be identified directly in the formal context utilizing a characterization via arrow relations and provide an algorithm to compute all dismantling intervals.},
author = {Felde, Maximilian and Koyda, Maren},
journal = {International Journal of Approximate Reasoning},
keywords = {Concept_lattice},
pages = 108931,
title = {Interval-dismantling for lattices},
volume = 159,
year = 2023
}
%0 Journal Article
%1 FELDE2023108931
%A Felde, Maximilian
%A Koyda, Maren
%D 2023
%J International Journal of Approximate Reasoning
%P 108931
%R 10.1016/j.ijar.2023.108931
%T Interval-dismantling for lattices
%U https://www.sciencedirect.com/science/article/pii/S0888613X23000622
%V 159
%X Dismantling allows for the removal of elements from a poset, or in our case lattice, without disturbing the remaining structure. In this paper we have extended the notion of dismantling by single elements to the dismantling by intervals in a lattice. We utilize theory from Formal Concept Analysis (FCA) to show that lattices dismantled by intervals correspond to closed subrelations in the respective formal context, and that there exists a unique core with respect to dismantling by intervals. Furthermore, we show that dismantling intervals can be identified directly in the formal context utilizing a characterization via arrow relations and provide an algorithm to compute all dismantling intervals.
- 1.Stubbemann, M., Hanika, T., Schneider, F.M.: Intrinsic Dimension for Large-Scale Geometric Learning. Transactions on Machine Learning Research. (2023).
@article{stubbemann2022intrinsic,
author = {Stubbemann, Maximilian and Hanika, Tom and Schneider, Friedrich Martin},
journal = {Transactions on Machine Learning Research},
keywords = {itegpub},
title = {Intrinsic Dimension for Large-Scale Geometric Learning},
year = 2023
}
%0 Journal Article
%1 stubbemann2022intrinsic
%A Stubbemann, Maximilian
%A Hanika, Tom
%A Schneider, Friedrich Martin
%D 2023
%J Transactions on Machine Learning Research
%T Intrinsic Dimension for Large-Scale Geometric Learning
%U https://openreview.net/forum?id=85BfDdYMBY