KDE – FB 16 – University of Kassel

Knowledge & Data Engineering Group (KDE), EECS, University of Kassel

The research unit Knowledge & Data Engineering at the Department of Electrical Engineering/Computer Science is developing methods for knowledge discovery and representation (approximation and exploration of knowledge, order structures in knowledge, ontology learning) and for the analysis of (social) networks and related knowledge processes (metrics in networks, anomaly detection, characterization of social networks). Our focus is on the exact algebraic modelling of structures in knowledge and networks. Our research on foundations in order and lattice theory, description logics, graph theory and ontologies is complemented by applications in social media and scientometrics. The research unit Knowledge & Data Engineering is a member of the Interdisciplinary Research Center for Information Systems Design (ITeG) and the International Centre for Higher Education Research (INCHER Kassel) at the University of Kassel, and of the Hessian Center for Artificial Intelligence (hessian.AI).

Our latest publications

1.
Draude, C., Engert, S., Hess, T., Hirth, J., Horn, V., Kropf, J., Lamla, J., Stumme, G., Uhlmann, M., Zwingmann, N.: Verrechnung – Design – Kultivierung: Instrumentenkasten für die Gestaltung fairer Geschäftsmodelle durch Ko-Valuation, https://plattform-privatheit.de/p-prv-wAssets/Assets/Veroeffentlichungen_WhitePaper_PolicyPaper/whitepaper/WP_2024_FAIRDIENSTE_1.0.pdf, (2024). https://doi.org/10.24406/publica-2497.

@misc{claude2024verrechnung,
address = {Karlsruhe},
author = {Draude, Claude and Engert, Simon and Hess, Thomas and Hirth, Johannes and Horn, Viktoria and Kropf, Jonathan and Lamla, Jörn and Stumme, Gerd and Uhlmann, Markus and Zwingmann, Nina},
edition = 1,
editor = {Friedewald, Michael and Roßnagel, Alexander and Geminn, Christian and Karaboga, Murat},
howpublished = {White Paper},
keywords = {itegpub},
month = {03},
publisher = {Fraunhofer-Institut für System- und Innovationsforschung ISI},
series = {Plattform Privatheit},
title = {Verrechnung – Design – Kultivierung: Instrumentenkasten für die Gestaltung fairer Geschäftsmodelle durch Ko-Valuation},
year = 2024
}
%0 Generic
%1 claude2024verrechnung
%A Draude, Claude
%A Engert, Simon
%A Hess, Thomas
%A Hirth, Johannes
%A Horn, Viktoria
%A Kropf, Jonathan
%A Lamla, Jörn
%A Stumme, Gerd
%A Uhlmann, Markus
%A Zwingmann, Nina
%B Plattform Privatheit
%C Karlsruhe
%D 2024
%E Friedewald, Michael
%E Roßnagel, Alexander
%E Geminn, Christian
%E Karaboga, Murat
%I Fraunhofer-Institut für System- und Innovationsforschung ISI
%R 10.24406/publica-2497
%T Verrechnung – Design – Kultivierung: Instrumentenkasten für die Gestaltung fairer Geschäftsmodelle durch Ko-Valuation
%U https://plattform-privatheit.de/p-prv-wAssets/Assets/Veroeffentlichungen_WhitePaper_PolicyPaper/whitepaper/WP_2024_FAIRDIENSTE_1.0.pdf
%7 1
1.
Horn, V., Hirth, J., Holfeld, J., Behmenburg, J.H., Draude, C., Stumme, G.: Disclosing Diverse Perspectives of News Articles for Navigating between Online Journalism Content. In: Nordic Conference on Human-Computer Interaction. Association for Computing Machinery, Uppsala, Sweden (2024). https://doi.org/10.1145/3679318.3685414.

Today, exposure to journalistic online content is predominantly controlled by news recommender systems, which often suggest content that matches user’s interests or is selected according to non-transparent recommendation criteria. To circumvent resulting trade-offs like polarisation or fragmentation whilst ensuring user’s autonomy, we explore how different perspectives within online news can be disclosed instead for guiding navigation. To do so, we developed an interactive prototype that displays article titles in correspondence to their argumentative orientation. In order to investigate how the usage of our novel navigation structure impacts the choice of news articles and user experience, we conducted an exploratory user study assessing the impact of the design parameters chosen. Implications are drawn from the study results and the development of the interactive prototype for the exposure to diversity in the context of navigating news content online.
@inproceedings{hci-lattice,
abstract = {Today, exposure to journalistic online content is predominantly controlled by news recommender systems, which often suggest content that matches user’s interests or is selected according to non-transparent recommendation criteria. To circumvent resulting trade-offs like polarisation or fragmentation whilst ensuring user’s autonomy, we explore how different perspectives within online news can be disclosed instead for guiding navigation. To do so, we developed an interactive prototype that displays article titles in correspondence to their argumentative orientation. In order to investigate how the usage of our novel navigation structure impacts the choice of news articles and user experience, we conducted an exploratory user study assessing the impact of the design parameters chosen. Implications are drawn from the study results and the development of the interactive prototype for the exposure to diversity in the context of navigating news content online.},
address = {New York, NY, USA},
author = {Horn, Viktoria and Hirth, Johannes and Holfeld, Julian and Behmenburg, Jens Hendrik and Draude, Claude and Stumme, Gerd},
booktitle = {Nordic Conference on Human-Computer Interaction},
keywords = {itegpub},
publisher = {Association for Computing Machinery},
series = {NordiCHI 2024},
title = {Disclosing Diverse Perspectives of News Articles for Navigating between Online Journalism Content},
year = 2024
}
%0 Conference Paper
%1 hci-lattice
%A Horn, Viktoria
%A Hirth, Johannes
%A Holfeld, Julian
%A Behmenburg, Jens Hendrik
%A Draude, Claude
%A Stumme, Gerd
%B Nordic Conference on Human-Computer Interaction
%C New York, NY, USA
%D 2024
%I Association for Computing Machinery
%R 10.1145/3679318.3685414
%T Disclosing Diverse Perspectives of News Articles for Navigating between Online Journalism Content
%U https://doi.org/10.1145/3679318.3685414
%X Today, exposure to journalistic online content is predominantly controlled by news recommender systems, which often suggest content that matches user’s interests or is selected according to non-transparent recommendation criteria. To circumvent resulting trade-offs like polarisation or fragmentation whilst ensuring user’s autonomy, we explore how different perspectives within online news can be disclosed instead for guiding navigation. To do so, we developed an interactive prototype that displays article titles in correspondence to their argumentative orientation. In order to investigate how the usage of our novel navigation structure impacts the choice of news articles and user experience, we conducted an exploratory user study assessing the impact of the design parameters chosen. Implications are drawn from the study results and the development of the interactive prototype for the exposure to diversity in the context of navigating news content online.
%@ 9798400709661
1.
Priss, U., Dürrschnabel, D.: Rectangular Euler Diagrams and Order Theory. In: Lemanski, J., Johansen, M.W., Manalo, E., Viana, P., Bhattacharjee, R., and Burns, R. (eds.) Diagrammatic Representation and Inference. pp. 165–181. Springer Nature Switzerland, Cham (2024).

This paper discusses the relevance of order-theoretical properties, such as order dimension, for determining properties of Euler diagrams, such as whether a given poset can be represented with or without shading. The focus is on linear, tabular and rectangular Euler diagrams with shading and without split attributes and constructions with subdiagrams and embeddings. Euler diagrams are distinguished from geometric containment orders. Basic layout strategies are suggested.
@inproceedings{10.1007/978-3-031-71291-3_14,
abstract = {This paper discusses the relevance of order-theoretical properties, such as order dimension, for determining properties of Euler diagrams, such as whether a given poset can be represented with or without shading. The focus is on linear, tabular and rectangular Euler diagrams with shading and without split attributes and constructions with subdiagrams and embeddings. Euler diagrams are distinguished from geometric containment orders. Basic layout strategies are suggested.},
address = {Cham},
author = {Priss, Uta and Dürrschnabel, Dominik},
booktitle = {Diagrammatic Representation and Inference},
editor = {Lemanski, Jens and Johansen, Mikkel Willum and Manalo, Emmanuel and Viana, Petrucio and Bhattacharjee, Reetu and Burns, Richard},
keywords = {itegpub},
pages = {165--181},
publisher = {Springer Nature Switzerland},
title = {Rectangular Euler Diagrams and Order Theory},
year = 2024
}
%0 Conference Paper
%1 10.1007/978-3-031-71291-3_14
%A Priss, Uta
%A Dürrschnabel, Dominik
%B Diagrammatic Representation and Inference
%C Cham
%D 2024
%E Lemanski, Jens
%E Johansen, Mikkel Willum
%E Manalo, Emmanuel
%E Viana, Petrucio
%E Bhattacharjee, Reetu
%E Burns, Richard
%I Springer Nature Switzerland
%P 165--181
%T Rectangular Euler Diagrams and Order Theory
%X This paper discusses the relevance of order-theoretical properties, such as order dimension, for determining properties of Euler diagrams, such as whether a given poset can be represented with or without shading. The focus is on linear, tabular and rectangular Euler diagrams with shading and without split attributes and constructions with subdiagrams and embeddings. Euler diagrams are distinguished from geometric containment orders. Basic layout strategies are suggested.
%@ 978-3-031-71291-3
1.
Budde, K.B., Rellstab, C., Heuertz, M., Gugerli, F., Hanika, T., Verdú, M., Pausas, J.G., González-Martínez, S.C.: Divergent selection in a Mediterranean pine on local spatial scales. Journal of Ecology. 112, (2024). https://doi.org/10.1111/1365-2745.14231.

The effects of selection on an organism's genome are hard to detect on small spatial scales, as gene flow can swamp signatures of local adaptation. Therefore, most genome scans to detect signatures of environmental selection are performed on large spatial scales; however, divergent selection on the local scale (e.g. between contrasting soil conditions) has also been demonstrated, in particular for herbaceous plants. Here, we hypothesised that in topographically complex landscapes, microenvironment variability is strong enough to leave a selective footprint in the genomes of long-lived organisms. To test this, we investigated paired south- versus north-facing Pinus pinaster stands on the local scale, with trees growing in close vicinity (≤820 m distance between paired south- and north-facing stands), in a Mediterranean mountain area. While trees on north-facing slopes experience less radiation, trees on south-facing slopes suffer from especially harsh conditions, particularly during the dry summer season. Two outlier analyses consistently revealed five putatively adaptive loci (out of 4034), in candidate genes two of which encoded non-synonymous substitutions. Additionally, one locus showed consistent allele frequency differences in all three stand pairs indicating divergent selection despite high gene flow on the local scale. Permutation tests demonstrated that our findings were robust. Functional annotation of these candidate genes revealed biological functions related to abiotic stress response, such as water availability, in other plant species. Synthesis. Our study highlights how divergent selection in heterogeneous microenvironments shapes and maintains the functional genetic variation within populations of long-lived forest tree species, being the first to focus on adaptive genetic divergence between south- and north-facing slopes within continuous forest stands. 
This is especially relevant in the current context of climate change, as this variation is at the base of plant population responses to future climate.
@article{https://doi.org/10.1111/1365-2745.14231,
abstract = {The effects of selection on an organism's genome are hard to detect on small spatial scales, as gene flow can swamp signatures of local adaptation. Therefore, most genome scans to detect signatures of environmental selection are performed on large spatial scales; however, divergent selection on the local scale (e.g. between contrasting soil conditions) has also been demonstrated, in particular for herbaceous plants. Here, we hypothesised that in topographically complex landscapes, microenvironment variability is strong enough to leave a selective footprint in the genomes of long-lived organisms. To test this, we investigated paired south- versus north-facing Pinus pinaster stands on the local scale, with trees growing in close vicinity (≤820 m distance between paired south- and north-facing stands), in a Mediterranean mountain area. While trees on north-facing slopes experience less radiation, trees on south-facing slopes suffer from especially harsh conditions, particularly during the dry summer season. Two outlier analyses consistently revealed five putatively adaptive loci (out of 4034), in candidate genes two of which encoded non-synonymous substitutions. Additionally, one locus showed consistent allele frequency differences in all three stand pairs indicating divergent selection despite high gene flow on the local scale. Permutation tests demonstrated that our findings were robust. Functional annotation of these candidate genes revealed biological functions related to abiotic stress response, such as water availability, in other plant species. Synthesis. Our study highlights how divergent selection in heterogeneous microenvironments shapes and maintains the functional genetic variation within populations of long-lived forest tree species, being the first to focus on adaptive genetic divergence between south- and north-facing slopes within continuous forest stands. 
This is especially relevant in the current context of climate change, as this variation is at the base of plant population responses to future climate.},
author = {Budde, Katharina B. and Rellstab, Christian and Heuertz, Myriam and Gugerli, Felix and Hanika, Tom and Verdú, Miguel and Pausas, Juli G. and González-Martínez, Santiago C.},
journal = {Journal of Ecology},
keywords = {itegpub},
number = 2,
title = {Divergent selection in a Mediterranean pine on local spatial scales},
volume = 112,
year = 2024
}
%0 Journal Article
%1 https://doi.org/10.1111/1365-2745.14231
%A Budde, Katharina B.
%A Rellstab, Christian
%A Heuertz, Myriam
%A Gugerli, Felix
%A Hanika, Tom
%A Verdú, Miguel
%A Pausas, Juli G.
%A González-Martínez, Santiago C.
%D 2024
%J Journal of Ecology
%N 2
%R 10.1111/1365-2745.14231
%T Divergent selection in a Mediterranean pine on local spatial scales
%U https://besjournals.onlinelibrary.wiley.com/doi/abs/10.1111/1365-2745.14231
%V 112
%X The effects of selection on an organism's genome are hard to detect on small spatial scales, as gene flow can swamp signatures of local adaptation. Therefore, most genome scans to detect signatures of environmental selection are performed on large spatial scales; however, divergent selection on the local scale (e.g. between contrasting soil conditions) has also been demonstrated, in particular for herbaceous plants. Here, we hypothesised that in topographically complex landscapes, microenvironment variability is strong enough to leave a selective footprint in the genomes of long-lived organisms. To test this, we investigated paired south- versus north-facing Pinus pinaster stands on the local scale, with trees growing in close vicinity (≤820 m distance between paired south- and north-facing stands), in a Mediterranean mountain area. While trees on north-facing slopes experience less radiation, trees on south-facing slopes suffer from especially harsh conditions, particularly during the dry summer season. Two outlier analyses consistently revealed five putatively adaptive loci (out of 4034), in candidate genes two of which encoded non-synonymous substitutions. Additionally, one locus showed consistent allele frequency differences in all three stand pairs indicating divergent selection despite high gene flow on the local scale. Permutation tests demonstrated that our findings were robust. Functional annotation of these candidate genes revealed biological functions related to abiotic stress response, such as water availability, in other plant species. Synthesis. Our study highlights how divergent selection in heterogeneous microenvironments shapes and maintains the functional genetic variation within populations of long-lived forest tree species, being the first to focus on adaptive genetic divergence between south- and north-facing slopes within continuous forest stands. 
This is especially relevant in the current context of climate change, as this variation is at the base of plant population responses to future climate.
1.
Ganter, B., Hanika, T., Hirth, J., Obiedkov, S.: Collaborative Hybrid Human AI Learning through Conceptual Exploration. In: Ericson, P., Khairova, N., and De Vos, M. (eds.) Proceedings of the Workshops at the Third International Conference on Hybrid Human-Artificial Intelligence co-located with (HHAI 2024), Malmö, Sweden, June 10-11, 2024. pp. 1–8. CEUR-WS.org (2024).

@inproceedings{DBLP:conf/hhai/GanterHHO24,
author = {Ganter, Bernhard and Hanika, Tom and Hirth, Johannes and Obiedkov, Sergei},
booktitle = {Proceedings of the Workshops at the Third International Conference on Hybrid Human-Artificial Intelligence co-located with (HHAI 2024), Malmö, Sweden, June 10-11, 2024},
editor = {Ericson, Petter and Khairova, Nina and De Vos, Marina},
keywords = {itegpub},
pages = {1--8},
publisher = {CEUR-WS.org},
series = {{CEUR} Workshop Proceedings},
title = {Collaborative Hybrid Human {AI} Learning through Conceptual Exploration},
volume = 3825,
year = 2024
}
%0 Conference Paper
%1 DBLP:conf/hhai/GanterHHO24
%A Ganter, Bernhard
%A Hanika, Tom
%A Hirth, Johannes
%A Obiedkov, Sergei
%B Proceedings of the Workshops at the Third International Conference on Hybrid Human-Artificial Intelligence co-located with (HHAI 2024), Malmö, Sweden, June 10-11, 2024
%D 2024
%E Ericson, Petter
%E Khairova, Nina
%E De Vos, Marina
%I CEUR-WS.org
%P 1--8
%T Collaborative Hybrid Human AI Learning through Conceptual Exploration
%U https://ceur-ws.org/Vol-3825/tutorial.pdf
%V 3825
1.
Dürrschnabel, D., Priss, U.: Realizability of Rectangular Euler Diagrams, (2024).

@misc{dürrschnabel2024realizability,
author = {Dürrschnabel, Dominik and Priss, Uta},
keywords = {itegpub},
title = {Realizability of Rectangular Euler Diagrams},
year = 2024
}
%0 Generic
%1 dürrschnabel2024realizability
%A Dürrschnabel, Dominik
%A Priss, Uta
%D 2024
%T Realizability of Rectangular Euler Diagrams
1.
Hanika, T., Hille, T.: What is the intrinsic dimension of your binary data? -- and how to compute it quickly. In: CONCEPTS. pp. 97–112. Springer (2024).

@inproceedings{hanika2024intrinsic,
author = {Hanika, Tom and Hille, Tobias},
booktitle = {CONCEPTS},
keywords = {itegpub},
pages = {97--112},
publisher = {Springer},
series = {Lecture Notes in Computer Science},
title = {What is the intrinsic dimension of your binary data? -- and how to compute it quickly},
volume = 14914,
year = 2024
}
%0 Conference Paper
%1 hanika2024intrinsic
%A Hanika, Tom
%A Hille, Tobias
%B CONCEPTS
%D 2024
%I Springer
%P 97--112
%T What is the intrinsic dimension of your binary data? -- and how to compute it quickly
%U http://dblp.uni-trier.de/db/conf/concepts/concepts2024.html#HanikaH24
%V 14914
%@ 978-3-031-67868-4
1.
Hirth, J., Hanika, T.: The Geometric Structure of Topic Models, (2024). https://doi.org/10.48550/arxiv.2403.03607.

@misc{hirth2024geometric,
author = {Hirth, Johannes and Hanika, Tom},
keywords = {selected},
publisher = {arXiv},
title = {The Geometric Structure of Topic Models},
year = 2024
}
%0 Generic
%1 hirth2024geometric
%A Hirth, Johannes
%A Hanika, Tom
%D 2024
%I arXiv
%R 10.48550/arxiv.2403.03607
%T The Geometric Structure of Topic Models
1.
Draude, C., Dürrschnabel, D., Hirth, J., Horn, V., Kropf, J., Lamla, J., Stumme, G., Uhlmann, M.: Conceptual Mapping of Controversies. In: Cabrera, I.P., Ferré, S., and Obiedkov, S. (eds.) Conceptual Knowledge Structures. pp. 201–216. Springer Nature Switzerland, Cham (2024).

With our work, we contribute towards a qualitative analysis of the discourse on controversies in online news media. For this, we employ Formal Concept Analysis and the economics of conventions to derive conceptual controversy maps. In our experiments, we analyze two maps from different news journals with methods from ordinal data science. We show how these methods can be used to assess the diversity, complexity and potential bias of controversies. In addition to that, we discuss how the diagrams of concept lattices can be used to navigate between news articles.
@inproceedings{10.1007/978-3-031-67868-4_14,
abstract = {With our work, we contribute towards a qualitative analysis of the discourse on controversies in online news media. For this, we employ Formal Concept Analysis and the economics of conventions to derive conceptual controversy maps. In our experiments, we analyze two maps from different news journals with methods from ordinal data science. We show how these methods can be used to assess the diversity, complexity and potential bias of controversies. In addition to that, we discuss how the diagrams of concept lattices can be used to navigate between news articles.},
address = {Cham},
author = {Draude, Claude and D{ü}rrschnabel, Dominik and Hirth, Johannes and Horn, Viktoria and Kropf, Jonathan and Lamla, J{ö}rn and Stumme, Gerd and Uhlmann, Markus},
booktitle = {Conceptual Knowledge Structures},
editor = {Cabrera, Inma P. and Ferr{é}, S{é}bastien and Obiedkov, Sergei},
keywords = {formal_concept_analysis},
pages = {201--216},
publisher = {Springer Nature Switzerland},
title = {Conceptual Mapping of Controversies},
year = 2024
}
%0 Conference Paper
%1 10.1007/978-3-031-67868-4_14
%A Draude, Claude
%A Dürrschnabel, Dominik
%A Hirth, Johannes
%A Horn, Viktoria
%A Kropf, Jonathan
%A Lamla, Jörn
%A Stumme, Gerd
%A Uhlmann, Markus
%B Conceptual Knowledge Structures
%C Cham
%D 2024
%E Cabrera, Inma P.
%E Ferré, Sébastien
%E Obiedkov, Sergei
%I Springer Nature Switzerland
%P 201--216
%T Conceptual Mapping of Controversies
%X With our work, we contribute towards a qualitative analysis of the discourse on controversies in online news media. For this, we employ Formal Concept Analysis and the economics of conventions to derive conceptual controversy maps. In our experiments, we analyze two maps from different news journals with methods from ordinal data science. We show how these methods can be used to assess the diversity, complexity and potential bias of controversies. In addition to that, we discuss how the diagrams of concept lattices can be used to navigate between news articles.
%@ 978-3-031-67868-4
1.
Hirth, J.: Conceptual Data Scaling in Machine Learning, (2024). https://doi.org/10.17170/kobra-2024100910940.

Information that is intended for human interpretation is frequently represented in a structured manner. This allows for a navigation between individual pieces to find, connect or combine information to gain new insights. Within a structure, we derive knowledge from inference of hierarchical or logical relations between data objects. For unstructured data there are numerous methods to define a data schema based on user interpretations. Afterward, data objects can be aggregated to derive (hierarchical) structures based on common properties. There are four main challenges with respect to the explainability of the derived structures. First, formal procedures are needed to infer knowledge about the data set, or parts of it, from hierarchical structures. Second, what does knowledge inferred from a structure imply for the data set it was derived from? Third, structures may be incomprehensibly large for human interpretation. Methods are needed to reduce structures to smaller representations in a consistent, comprehensible manner that provides control over possibly introduced error. Fourth, the original data set does not need to have interpretable features and thus only allow for the inference of structural properties. In order to extract information based on real world properties, we need methods that are able to add such properties. With the presented work, we address these challenges using and extending the rich tool-set of Formal Concept Analysis. Here, data objects are aggregated to closed sets called formal concepts based on (unary) symbolic attributes that they have in common. The process of deriving symbolic attributes is called conceptual scaling and depends on the interpretation of the data by the analyst. The resulting hierarchical structure of concepts is called concept lattice. To infer knowledge from the concept lattice structures we introduce new methods based on sub-structures that are of standardized shape, called ordinal motifs. 
This novel method allows us to explain the structure of a concept lattice based on geometric aspects. Throughout our work, we focus on data representations from multiple state-of-the-art machine learning algorithms. In all cases, we elaborate extensively on how to interpret these models through derived concept lattices and develop scaling procedures specific to each algorithm. Some of the considered models are black-box models whose internal data representations are numeric with no clear real world semantics. For these, we present a method to link background knowledge to the concept lattice structure. To reduce the complexity of concept lattices we provide a new theoretical framework that allows us to generate (small) views on a concept lattice. These enable more selective and comprehensibly sized explanations for data parts that are of interest. In addition to that, we introduce methods to combine and subtract views from each other, and to identify missing or incorrect parts.
@phdthesis{doi:10.17170/kobra-2024100910940,
abstract = {Information that is intended for human interpretation is frequently represented in a structured manner. This allows for a navigation between individual pieces to find, connect or combine information to gain new insights. Within a structure, we derive knowledge from inference of hierarchical or logical relations between data objects. For unstructured data there are numerous methods to define a data schema based on user interpretations. Afterward, data objects can be aggregated to derive (hierarchical) structures based on common properties. There are four main challenges with respect to the explainability of the derived structures. First, formal procedures are needed to infer knowledge about the data set, or parts of it, from hierarchical structures. Second, what does knowledge inferred from a structure imply for the data set it was derived from? Third, structures may be incomprehensibly large for human interpretation. Methods are needed to reduce structures to smaller representations in a consistent, comprehensible manner that provides control over possibly introduced error. Fourth, the original data set does not need to have interpretable features and thus only allow for the inference of structural properties. In order to extract information based on real world properties, we need methods that are able to add such properties. With the presented work, we address these challenges using and extending the rich tool-set of Formal Concept Analysis. Here, data objects are aggregated to closed sets called formal concepts based on (unary) symbolic attributes that they have in common. The process of deriving symbolic attributes is called conceptual scaling and depends on the interpretation of the data by the analyst. The resulting hierarchical structure of concepts is called concept lattice. To infer knowledge from the concept lattice structures we introduce new methods based on sub-structures that are of standardized shape, called ordinal motifs. 
This novel method allows us to explain the structure of a concept lattice based on geometric aspects. Throughout our work, we focus on data representations from multiple state-of-the-art machine learning algorithms. In all cases, we elaborate extensively on how to interpret these models through derived concept lattices and develop scaling procedures specific to each algorithm. Some of the considered models are black-box models whose internal data representations are numeric with no clear real world semantics. For these, we present a method to link background knowledge to the concept lattice structure. To reduce the complexity of concept lattices we provide a new theoretical framework that allows us to generate (small) views on a concept lattice. These enable more selective and comprehensibly sized explanations for data parts that are of interest. In addition to that, we introduce methods to combine and subtract views from each other, and to identify missing or incorrect parts.},
author = {Hirth, Johannes},
keywords = {Knowledge~Representation},
school = {Kassel, Universität Kassel, Fachbereich Elektrotechnik/Informatik},
title = {Conceptual Data Scaling in Machine Learning},
year = 2024
}
%0 Thesis
%1 doi:10.17170/kobra-2024100910940
%A Hirth, Johannes
%D 2024
%R 10.17170/kobra-2024100910940
%T Conceptual Data Scaling in Machine Learning
%X Information that is intended for human interpretation is frequently represented in a structured manner. This allows for a navigation between individual pieces to find, connect or combine information to gain new insights. Within a structure, we derive knowledge from inference of hierarchical or logical relations between data objects. For unstructured data there are numerous methods to define a data schema based on user interpretations. Afterward, data objects can be aggregated to derive (hierarchical) structures based on common properties. There are four main challenges with respect to the explainability of the derived structures. First, formal procedures are needed to infer knowledge about the data set, or parts of it, from hierarchical structures. Second, what does knowledge inferred from a structure imply for the data set it was derived from? Third, structures may be incomprehensibly large for human interpretation. Methods are needed to reduce structures to smaller representations in a consistent, comprehensible manner that provides control over possibly introduced error. Fourth, the original data set does not need to have interpretable features and thus only allow for the inference of structural properties. In order to extract information based on real world properties, we need methods that are able to add such properties. With the presented work, we address these challenges using and extending the rich tool-set of Formal Concept Analysis. Here, data objects are aggregated to closed sets called formal concepts based on (unary) symbolic attributes that they have in common. The process of deriving symbolic attributes is called conceptual scaling and depends on the interpretation of the data by the analyst. The resulting hierarchical structure of concepts is called concept lattice. To infer knowledge from the concept lattice structures we introduce new methods based on sub-structures that are of standardized shape, called ordinal motifs. 
This novel method allows us to explain the structure of a concept lattice based on geometric aspects. Throughout our work, we focus on data representations from multiple state-of-the-art machine learning algorithms. In all cases, we elaborate extensively on how to interpret these models through derived concept lattices and develop scaling procedures specific to each algorithm. Some of the considered models are black-box models whose internal data representations are numeric with no clear real world semantics. For these, we present a method to link background knowledge to the concept lattice structure. To reduce the complexity of concept lattices we provide a new theoretical framework that allows us to generate (small) views on a concept lattice. These enable more selective and comprehensibly sized explanations for data parts that are of interest. In addition to that, we introduce methods to combine and subtract views from each other, and to identify missing or incorrect parts.
1.
Hanika, T., Jäschke, R.: A Repository for Formal Contexts. In: Proceedings of the 1st International Joint Conference on Conceptual Knowledge Structures (2024).

Data is always at the center of the theoretical development and investigation of the applicability of formal concept analysis. It is therefore not surprising that a large number of data sets are repeatedly used in scholarly articles and software tools, acting as de facto standard data sets. However, the distribution of the data sets poses a problem for the sustainable development of the research field. There is a lack of a central location that provides and describes FCA data sets and links them to already known analysis results. This article analyses the current state of the dissemination of FCA data sets, presents the requirements for a central FCA repository, and highlights the challenges for this.
@inproceedings{hanika2024repository,
abstract = {Data is always at the center of the theoretical development and investigation of the applicability of formal concept analysis. It is therefore not surprising that a large number of data sets are repeatedly used in scholarly articles and software tools, acting as de facto standard data sets. However, the distribution of the data sets poses a problem for the sustainable development of the research field. There is a lack of a central location that provides and describes FCA data sets and links them to already known analysis results. This article analyses the current state of the dissemination of FCA data sets, presents the requirements for a central FCA repository, and highlights the challenges for this.},
author = {Hanika, Tom and Jäschke, Robert},
booktitle = {Proceedings of the 1st International Joint Conference on Conceptual Knowledge Structures},
keywords = {repository},
title = {A Repository for Formal Contexts},
year = 2024
}
%0 Conference Paper
%1 hanika2024repository
%A Hanika, Tom
%A Jäschke, Robert
%B Proceedings of the 1st International Joint Conference on Conceptual Knowledge Structures
%D 2024
%T A Repository for Formal Contexts
%U https://arxiv.org/abs/2404.04344
%X Data is always at the center of the theoretical development and investigation of the applicability of formal concept analysis. It is therefore not surprising that a large number of data sets are repeatedly used in scholarly articles and software tools, acting as de facto standard data sets. However, the distribution of the data sets poses a problem for the sustainable development of the research field. There is a lack of a central location that provides and describes FCA data sets and links them to already known analysis results. This article analyses the current state of the dissemination of FCA data sets, presents the requirements for a central FCA repository, and highlights the challenges for this.
1.
Hille, T., Stubbemann, M., Hanika, T.: Reproducibility and Geometric Intrinsic Dimensionality: An Investigation on Graph Neural Network Research, (2024).

@preprint{hille2024reproducibility,
author = {Hille, Tobias and Stubbemann, Maximilian and Hanika, Tom},
keywords = {intrinsic},
title = {Reproducibility and Geometric Intrinsic Dimensionality: An Investigation on Graph Neural Network Research},
year = 2024
}
%0 Generic
%1 hille2024reproducibility
%A Hille, Tobias
%A Stubbemann, Maximilian
%A Hanika, Tom
%D 2024
%T Reproducibility and Geometric Intrinsic Dimensionality: An Investigation on Graph Neural Network Research
1.
Abdulla, M., Hirth, J., Stumme, G.: The Birkhoff Completion of Finite Lattices. In: Cabrera, I.P., Ferré, S., and Obiedkov, S. (eds.) Conceptual Knowledge Structures. pp. 20–35. Springer Nature Switzerland, Cham (2024).

We introduce the Birkhoff completion as the smallest distributive lattice in which a given finite lattice can be embedded as semi-lattice. We discuss its relationship to implicational theories, in particular to R. Wille's simply-implicational theories. By an example, we show how the Birkhoff completion can be used as a tool for ordinal data science.
@inproceedings{10.1007/978-3-031-67868-4_2,
abstract = {We introduce the Birkhoff completion as the smallest distributive lattice in which a given finite lattice can be embedded as semi-lattice. We discuss its relationship to implicational theories, in particular to R. Wille's simply-implicational theories. By an example, we show how the Birkhoff completion can be used as a tool for ordinal data science.},
address = {Cham},
author = {Abdulla, Mohammad and Hirth, Johannes and Stumme, Gerd},
booktitle = {Conceptual Knowledge Structures},
editor = {Cabrera, Inma P. and Ferré, Sébastien and Obiedkov, Sergei},
keywords = {itegpub},
pages = {20--35},
publisher = {Springer Nature Switzerland},
title = {The Birkhoff Completion of Finite Lattices},
year = 2024
}
%0 Conference Paper
%1 10.1007/978-3-031-67868-4_2
%A Abdulla, Mohammad
%A Hirth, Johannes
%A Stumme, Gerd
%B Conceptual Knowledge Structures
%C Cham
%D 2024
%E Cabrera, Inma P.
%E Ferré, Sébastien
%E Obiedkov, Sergei
%I Springer Nature Switzerland
%P 20-35
%T The Birkhoff Completion of Finite Lattices
%X We introduce the Birkhoff completion as the smallest distributive lattice in which a given finite lattice can be embedded as semi-lattice. We discuss its relationship to implicational theories, in particular to R. Wille's simply-implicational theories. By an example, we show how the Birkhoff completion can be used as a tool for ordinal data science.
%@ 978-3-031-67868-4
1.
Draude, C., Dürrschnabel, D., Hirth, J., Horn, V., Kropf, J., Lamla, J., Stumme, G., Uhlmann, M.: Conceptual Mapping of Controversies. In: Cabrera, I.P., Ferré, S., and Obiedkov, S. (eds.) Conceptual Knowledge Structures. pp. 201–216. Springer Nature Switzerland, Cham (2024).

With our work, we contribute towards a qualitative analysis of the discourse on controversies in online news media. For this, we employ Formal Concept Analysis and the economics of conventions to derive conceptual controversy maps. In our experiments, we analyze two maps from different news journals with methods from ordinal data science. We show how these methods can be used to assess the diversity, complexity and potential bias of controversies. In addition to that, we discuss how the diagrams of concept lattices can be used to navigate between news articles.
@inproceedings{10.1007/978-3-031-67868-4_14,
abstract = {With our work, we contribute towards a qualitative analysis of the discourse on controversies in online news media. For this, we employ Formal Concept Analysis and the economics of conventions to derive conceptual controversy maps. In our experiments, we analyze two maps from different news journals with methods from ordinal data science. We show how these methods can be used to assess the diversity, complexity and potential bias of controversies. In addition to that, we discuss how the diagrams of concept lattices can be used to navigate between news articles.},
address = {Cham},
author = {Draude, Claude and Dürrschnabel, Dominik and Hirth, Johannes and Horn, Viktoria and Kropf, Jonathan and Lamla, J{ö}rn and Stumme, Gerd and Uhlmann, Markus},
booktitle = {Conceptual Knowledge Structures},
editor = {Cabrera, Inma P. and Ferré, Sébastien and Obiedkov, Sergei},
keywords = {itegpub},
pages = {201--216},
publisher = {Springer Nature Switzerland},
title = {Conceptual Mapping of Controversies},
year = 2024
}
%0 Conference Paper
%1 10.1007/978-3-031-67868-4_14
%A Draude, Claude
%A Dürrschnabel, Dominik
%A Hirth, Johannes
%A Horn, Viktoria
%A Kropf, Jonathan
%A Lamla, Jörn
%A Stumme, Gerd
%A Uhlmann, Markus
%B Conceptual Knowledge Structures
%C Cham
%D 2024
%E Cabrera, Inma P.
%E Ferré, Sébastien
%E Obiedkov, Sergei
%I Springer Nature Switzerland
%P 201-216
%T Conceptual Mapping of Controversies
%X With our work, we contribute towards a qualitative analysis of the discourse on controversies in online news media. For this, we employ Formal Concept Analysis and the economics of conventions to derive conceptual controversy maps. In our experiments, we analyze two maps from different news journals with methods from ordinal data science. We show how these methods can be used to assess the diversity, complexity and potential bias of controversies. In addition to that, we discuss how the diagrams of concept lattices can be used to navigate between news articles.
%@ 978-3-031-67868-4
1.
Hirth, J., Horn, V., Stumme, G., Hanika, T.: Ordinal motifs in lattices. Information Sciences. 659, 120009 (2024). https://doi.org/10.1016/j.ins.2023.120009.

@article{HIRTH2024120009,
author = {Hirth, Johannes and Horn, Viktoria and Stumme, Gerd and Hanika, Tom},
journal = {Information Sciences},
keywords = {itegpub},
pages = 120009,
title = {Ordinal motifs in lattices},
volume = 659,
year = 2024
}
%0 Journal Article
%1 HIRTH2024120009
%A Hirth, Johannes
%A Horn, Viktoria
%A Stumme, Gerd
%A Hanika, Tom
%D 2024
%J Information Sciences
%P 120009
%R 10.1016/j.ins.2023.120009
%T Ordinal motifs in lattices
%U https://www.sciencedirect.com/science/article/pii/S0020025523015943
%V 659
1.
Schäfermeier, B., Hirth, J., Hanika, T.: Research Topic Flows in Co-Authorship Networks. Scientometrics. 128, 5051–5078 (2023). https://doi.org/10.1007/s11192-022-04529-w.

In scientometrics, scientific collaboration is often analyzed by means of co-authorships. An aspect which is often overlooked and more difficult to quantify is the flow of expertise between authors from different research topics, which is an important part of scientific progress. With the Topic Flow Network (TFN) we propose a graph structure for the analysis of research topic flows between scientific authors and their respective research fields. Based on a multi-graph and a topic model, our proposed network structure accounts for intratopic as well as intertopic flows. To construct a TFN, our method requires solely a corpus of publications (i.e., author and abstract information). From this, research topics are discovered automatically through non-negative matrix factorization. The TFN derived from them allows for the application of social network analysis techniques, such as common metrics and community detection. Most importantly, it allows for the analysis of intertopic flows on a large, macroscopic scale, i.e., between research topics, as well as on a microscopic scale, i.e., between certain sets of authors. We demonstrate the utility of TFNs by applying our method to two comprehensive corpora of altogether 20 million publications spanning more than 60 years of research in the fields of computer science and mathematics. Our results give evidence that TFNs are suitable, e.g., for the analysis of topical communities, the discovery of important authors in different fields, and, most notably, the analysis of intertopic flows, i.e., the transfer of topical expertise. Besides that, our method opens new directions for future research, such as the investigation of influence relationships between research fields.
@article{schafermeier2022research,
abstract = {In scientometrics, scientific collaboration is often analyzed by means of co-authorships. An aspect which is often overlooked and more difficult to quantify is the flow of expertise between authors from different research topics, which is an important part of scientific progress. With the Topic Flow Network (TFN) we propose a graph structure for the analysis of research topic flows between scientific authors and their respective research fields. Based on a multi-graph and a topic model, our proposed network structure accounts for intratopic as well as intertopic flows. To construct a TFN, our method requires solely a corpus of publications (i.e., author and abstract information). From this, research topics are discovered automatically through non-negative matrix factorization. The TFN derived from them allows for the application of social network analysis techniques, such as common metrics and community detection. Most importantly, it allows for the analysis of intertopic flows on a large, macroscopic scale, i.e., between research topics, as well as on a microscopic scale, i.e., between certain sets of authors. We demonstrate the utility of TFNs by applying our method to two comprehensive corpora of altogether 20 million publications spanning more than 60 years of research in the fields of computer science and mathematics. Our results give evidence that TFNs are suitable, e.g., for the analysis of topical communities, the discovery of important authors in different fields, and, most notably, the analysis of intertopic flows, i.e., the transfer of topical expertise. Besides that, our method opens new directions for future research, such as the investigation of influence relationships between research fields.},
author = {Schäfermeier, Bastian and Hirth, Johannes and Hanika, Tom},
journal = {Scientometrics},
keywords = {co-authorships},
month = {09},
number = 9,
pages = {5051--5078},
title = {Research Topic Flows in Co-Authorship Networks},
volume = 128,
year = 2023
}
%0 Journal Article
%1 schafermeier2022research
%A Schäfermeier, Bastian
%A Hirth, Johannes
%A Hanika, Tom
%D 2023
%J Scientometrics
%N 9
%P 5051-5078
%R 10.1007/s11192-022-04529-w
%T Research Topic Flows in Co-Authorship Networks
%V 128
%X In scientometrics, scientific collaboration is often analyzed by means of co-authorships. An aspect which is often overlooked and more difficult to quantify is the flow of expertise between authors from different research topics, which is an important part of scientific progress. With the Topic Flow Network (TFN) we propose a graph structure for the analysis of research topic flows between scientific authors and their respective research fields. Based on a multi-graph and a topic model, our proposed network structure accounts for intratopic as well as intertopic flows. To construct a TFN, our method requires solely a corpus of publications (i.e., author and abstract information). From this, research topics are discovered automatically through non-negative matrix factorization. The TFN derived from them allows for the application of social network analysis techniques, such as common metrics and community detection. Most importantly, it allows for the analysis of intertopic flows on a large, macroscopic scale, i.e., between research topics, as well as on a microscopic scale, i.e., between certain sets of authors. We demonstrate the utility of TFNs by applying our method to two comprehensive corpora of altogether 20 million publications spanning more than 60 years of research in the fields of computer science and mathematics. Our results give evidence that TFNs are suitable, e.g., for the analysis of topical communities, the discovery of important authors in different fields, and, most notably, the analysis of intertopic flows, i.e., the transfer of topical expertise. Besides that, our method opens new directions for future research, such as the investigation of influence relationships between research fields.
1.
Felde, M., Stumme, G.: Interactive collaborative exploration using incomplete contexts. Data & Knowledge Engineering. 143, 102104 (2023). https://doi.org/10.1016/j.datak.2022.102104.

@article{Felde_2023,
author = {Felde, Maximilian and Stumme, Gerd},
journal = {Data \& Knowledge Engineering},
keywords = {attribute-exploration},
month = {01},
pages = 102104,
publisher = {Elsevier {BV}},
title = {Interactive collaborative exploration using incomplete contexts},
volume = 143,
year = 2023
}
%0 Journal Article
%1 Felde_2023
%A Felde, Maximilian
%A Stumme, Gerd
%D 2023
%I Elsevier BV
%J Data & Knowledge Engineering
%P 102104
%R 10.1016/j.datak.2022.102104
%T Interactive collaborative exploration using incomplete contexts
%U https://doi.org/10.1016/j.datak.2022.102104
%V 143