I am a German researcher interested in explainable artificial intelligence. My research focuses on symbolic methods for generating conceptual views. These are hierarchical structures that encode which abstract concepts are contained in data and how they depend on each other. The resulting structures are well suited for structuring data, concept-based navigation, and rule inference. I contribute both theoretical insights and applications in the realm of topic models, latent representations of deep learning models, and knowledge representations in general.
In my spare time, I often find myself optimizing my Emacs workflow.
@misc{https://doi.org/10.48550/arxiv.2302.09101, author = {Ganter, Bernhard and Hanika, Tom and Hirth, Johannes}, keywords = {sai}, publisher = {arXiv}, title = {Scaling Dimension}, year = 2023 }
%0 Generic %1 https://doi.org/10.48550/arxiv.2302.09101 %A Ganter, Bernhard %A Hanika, Tom %A Hirth, Johannes %D 2023 %I arXiv %R 10.48550/ARXIV.2302.09101 %T Scaling Dimension %U https://arxiv.org/abs/2302.09101
Hanika, T., Hirth, J.: Conceptual Views on Tree Ensemble Classifiers, https://arxiv.org/abs/2302.05270, (2023).
@misc{https://doi.org/10.48550/arxiv.2302.05270, author = {Hanika, Tom and Hirth, Johannes}, keywords = {sai}, publisher = {arXiv}, title = {Conceptual Views on Tree Ensemble Classifiers}, year = 2023 }
%0 Generic %1 https://doi.org/10.48550/arxiv.2302.05270 %A Hanika, Tom %A Hirth, Johannes %D 2023 %I arXiv %R 10.48550/ARXIV.2302.05270 %T Conceptual Views on Tree Ensemble Classifiers %U https://arxiv.org/abs/2302.05270
Schäfermeier, B., Hirth, J., Hanika, T.: Research Topic Flows in Co-Authorship Networks. Scientometrics. (2022).
In scientometrics, scientific collaboration is often analyzed by means of co-authorships. An aspect which is often overlooked and more difficult to quantify is the flow of expertise between authors from different research topics, which is an important part of scientific progress. With the Topic Flow Network (TFN) we propose a graph structure for the analysis of research topic flows between scientific authors and their respective research fields. Based on a multi-graph and a topic model, our proposed network structure accounts for intratopic as well as intertopic flows. Our method requires for the construction of a TFN solely a corpus of publications (i.e., author and abstract information). From this, research topics are discovered automatically through non-negative matrix factorization. The thereof derived TFN allows for the application of social network analysis techniques, such as common metrics and community detection. Most importantly, it allows for the analysis of intertopic flows on a large, macroscopic scale, i.e., between research topic, as well as on a microscopic scale, i.e., between certain sets of authors. We demonstrate the utility of TFNs by applying our method to two comprehensive corpora of altogether 20 Mio. publications spanning more than 60 years of research in the fields computer science and mathematics. Our results give evidence that TFNs are suitable, e.g., for the analysis of topical communities, the discovery of important authors in different fields, and, most notably, the analysis of intertopic flows, i.e., the transfer of topical expertise. Besides that, our method opens new directions for future research, such as the investigation of influence relationships between research fields.
@article{schafermeier2022research, abstract = {In scientometrics, scientific collaboration is often analyzed by means of co-authorships. An aspect which is often overlooked and more difficult to quantify is the flow of expertise between authors from different research topics, which is an important part of scientific progress. With the Topic Flow Network (TFN) we propose a graph structure for the analysis of research topic flows between scientific authors and their respective research fields. Based on a multi-graph and a topic model, our proposed network structure accounts for intratopic as well as intertopic flows. Our method requires for the construction of a TFN solely a corpus of publications (i.e., author and abstract information). From this, research topics are discovered automatically through non-negative matrix factorization. The thereof derived TFN allows for the application of social network analysis techniques, such as common metrics and community detection. Most importantly, it allows for the analysis of intertopic flows on a large, macroscopic scale, i.e., between research topic, as well as on a microscopic scale, i.e., between certain sets of authors. We demonstrate the utility of TFNs by applying our method to two comprehensive corpora of altogether 20 Mio. publications spanning more than 60 years of research in the fields computer science and mathematics. Our results give evidence that TFNs are suitable, e.g., for the analysis of topical communities, the discovery of important authors in different fields, and, most notably, the analysis of intertopic flows, i.e., the transfer of topical expertise. Besides that, our method opens new directions for future research, such as the investigation of influence relationships between research fields.}, author = {Schäfermeier, Bastian and Hirth, Johannes and Hanika, Tom}, journal = {Scientometrics}, keywords = {topic-models}, month = 10, title = {Research Topic Flows in Co-Authorship Networks}, year = 2022 }
%0 Journal Article %1 schafermeier2022research %A Schäfermeier, Bastian %A Hirth, Johannes %A Hanika, Tom %D 2022 %J Scientometrics %R 10.1007/s11192-022-04529-w %T Research Topic Flows in Co-Authorship Networks %X In scientometrics, scientific collaboration is often analyzed by means of co-authorships. An aspect which is often overlooked and more difficult to quantify is the flow of expertise between authors from different research topics, which is an important part of scientific progress. With the Topic Flow Network (TFN) we propose a graph structure for the analysis of research topic flows between scientific authors and their respective research fields. Based on a multi-graph and a topic model, our proposed network structure accounts for intratopic as well as intertopic flows. Our method requires for the construction of a TFN solely a corpus of publications (i.e., author and abstract information). From this, research topics are discovered automatically through non-negative matrix factorization. The thereof derived TFN allows for the application of social network analysis techniques, such as common metrics and community detection. Most importantly, it allows for the analysis of intertopic flows on a large, macroscopic scale, i.e., between research topic, as well as on a microscopic scale, i.e., between certain sets of authors. We demonstrate the utility of TFNs by applying our method to two comprehensive corpora of altogether 20 Mio. publications spanning more than 60 years of research in the fields computer science and mathematics. Our results give evidence that TFNs are suitable, e.g., for the analysis of topical communities, the discovery of important authors in different fields, and, most notably, the analysis of intertopic flows, i.e., the transfer of topical expertise. Besides that, our method opens new directions for future research, such as the investigation of influence relationships between research fields.
Hirth, J., Hanika, T.: Formal Conceptual Views in Neural Networks, https://arxiv.org/abs/2209.13517, (2022).
Knowledge computation tasks, such as computing a base of valid implications, are often infeasible for large data sets. This is in particular true when deriving canonical bases in formal concept analysis (FCA). Therefore, it is necessary to find techniques that on the one hand reduce the data set size, but on the other hand preserve enough structure to extract useful knowledge. Many successful methods are based on random processes to reduce the size of the investigated data set. This, however, makes them hardly interpretable with respect to the discovered knowledge. Other approaches restrict themselves to highly supported subsets and omit rare and (maybe) interesting patterns. An essentially different approach is used in network science, called k-cores. These cores are able to reflect rare patterns, as long as they are well connected within the data set. In this work, we study k-cores in the realm of FCA by exploiting the natural correspondence of bi-partite graphs and formal contexts. This structurally motivated approach leads to a comprehensible extraction of knowledge cores from large formal contexts.
@article{Hanika2022, abstract = {Knowledge computation tasks, such as computing a base of valid implications, are often infeasible for large data sets. This is in particular true when deriving canonical bases in formal concept analysis (FCA). Therefore, it is necessary to find techniques that on the one hand reduce the data set size, but on the other hand preserve enough structure to extract useful knowledge. Many successful methods are based on random processes to reduce the size of the investigated data set. This, however, makes them hardly interpretable with respect to the discovered knowledge. Other approaches restrict themselves to highly supported subsets and omit rare and (maybe) interesting patterns. An essentially different approach is used in network science, called k-cores. These cores are able to reflect rare patterns, as long as they are well connected within the data set. In this work, we study k-cores in the realm of FCA by exploiting the natural correspondence of bi-partite graphs and formal contexts. This structurally motivated approach leads to a comprehensible extraction of knowledge cores from large formal contexts.}, author = {Hanika, Tom and Hirth, Johannes}, journal = {Annals of Mathematics and Artificial Intelligence}, keywords = {sai}, month = {apr}, title = {Knowledge cores in large formal contexts}, year = 2022 }
%0 Journal Article %1 Hanika2022 %A Hanika, Tom %A Hirth, Johannes %D 2022 %J Annals of Mathematics and Artificial Intelligence %R 10.1007/s10472-022-09790-6 %T Knowledge cores in large formal contexts %U https://doi.org/10.1007/s10472-022-09790-6 %X Knowledge computation tasks, such as computing a base of valid implications, are often infeasible for large data sets. This is in particular true when deriving canonical bases in formal concept analysis (FCA). Therefore, it is necessary to find techniques that on the one hand reduce the data set size, but on the other hand preserve enough structure to extract useful knowledge. Many successful methods are based on random processes to reduce the size of the investigated data set. This, however, makes them hardly interpretable with respect to the discovered knowledge. Other approaches restrict themselves to highly supported subsets and omit rare and (maybe) interesting patterns. An essentially different approach is used in network science, called k-cores. These cores are able to reflect rare patterns, as long as they are well connected within the data set. In this work, we study k-cores in the realm of FCA by exploiting the natural correspondence of bi-partite graphs and formal contexts. This structurally motivated approach leads to a comprehensible extraction of knowledge cores from large formal contexts.
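The core idea described in the abstract above, reading a formal context as a bipartite graph and keeping only its well-connected part, can be illustrated with a small sketch. It assumes a single threshold `k` applied to both objects and attributes and a plain dictionary encoding of the context; the function name `k_core` and the toy data are illustrative and not taken from the paper, whose exact definitions may differ.

```python
def k_core(context, k):
    """Return the k-core of a formal context seen as a bipartite graph.

    `context` maps each object to the set of attributes it has.
    Objects with fewer than k attributes and attributes shared by fewer
    than k objects are pruned repeatedly until nothing changes.
    """
    ctx = {g: set(attrs) for g, attrs in context.items()}
    changed = True
    while changed:
        changed = False
        # Remove objects whose degree (number of attributes) is below k.
        for g in [g for g, attrs in ctx.items() if len(attrs) < k]:
            del ctx[g]
            changed = True
        # Count how many remaining objects carry each attribute.
        counts = {}
        for attrs in ctx.values():
            for m in attrs:
                counts[m] = counts.get(m, 0) + 1
        # Remove attributes whose degree (number of objects) is below k.
        weak = {m for m, c in counts.items() if c < k}
        if weak:
            changed = True
            for attrs in ctx.values():
                attrs -= weak
    return ctx


# Toy context (assumed data): four objects, four attributes.
toy = {
    "g1": {"a", "b"},
    "g2": {"a", "b"},
    "g3": {"a", "c"},
    "g4": {"d"},
}
core = k_core(toy, 2)
```

In this toy example, `g4` is dropped first, then attribute `c` becomes too rare, which in turn removes `g3`. The pruning has to be iterated because each removal can lower the degree of the remaining nodes.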
Hanika, T., Hirth, J.: On the lattice of conceptual measurements. Information Sciences. 613, 453–468 (2022).
We present a novel approach for data set scaling based on scale-measures from formal concept analysis, i.e., continuous maps between closure systems, for which we derive a canonical representation. Moreover, we prove that scale-measures can be lattice ordered using the canonical representation. This enables exploring the set of scale-measures by the use of meet and join operations. Furthermore we show that the lattice of scale-measures is isomorphic to the lattice of sub-closure systems that arises from the original data. Finally, we provide another representation of scale-measures using propositional logic in terms of data set features. Our theoretical findings are discussed by means of examples.
@article{hanika2020lattice, abstract = {We present a novel approach for data set scaling based on scale-measures from formal concept analysis, i.e., continuous maps between closure systems, for which we derive a canonical representation. Moreover, we prove that scale-measures can be lattice ordered using the canonical representation. This enables exploring the set of scale-measures by the use of meet and join operations. Furthermore we show that the lattice of scale-measures is isomorphic to the lattice of sub-closure systems that arises from the original data. Finally, we provide another representation of scale-measures using propositional logic in terms of data set features. Our theoretical findings are discussed by means of examples.}, author = {Hanika, Tom and Hirth, Johannes}, journal = {Information Sciences}, keywords = {scale-measure}, pages = {453-468}, title = {On the lattice of conceptual measurements}, volume = 613, year = 2022 }
%0 Journal Article %1 hanika2020lattice %A Hanika, Tom %A Hirth, Johannes %D 2022 %J Information Sciences %P 453-468 %R https://doi.org/10.1016/j.ins.2022.09.005 %T On the lattice of conceptual measurements %U https://www.sciencedirect.com/science/article/pii/S0020025522010489 %V 613 %X We present a novel approach for data set scaling based on scale-measures from formal concept analysis, i.e., continuous maps between closure systems, for which we derive a canonical representation. Moreover, we prove that scale-measures can be lattice ordered using the canonical representation. This enables exploring the set of scale-measures by the use of meet and join operations. Furthermore we show that the lattice of scale-measures is isomorphic to the lattice of sub-closure systems that arises from the original data. Finally, we provide another representation of scale-measures using propositional logic in terms of data set features. Our theoretical findings are discussed by means of examples.
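To make the notion of a scale-measure from the abstract above more tangible, here is a minimal sketch of the standard condition from formal concept analysis: a map from the objects of a context into the objects of a scale context is a scale-measure if the preimage of every extent of the scale is an extent of the original context. The helper names `extents` and `is_scale_measure`, the triple encoding of contexts, and the toy data are assumptions for illustration. The sketch does not cover the canonical representation or the lattice order studied in the paper.

```python
def extents(objects, attributes, incidence):
    """All extents of a finite context (objects, attributes, incidence).

    Extents are exactly the intersections of attribute extents,
    together with the full object set.
    """
    result = {frozenset(objects)}
    for m in attributes:
        attr_extent = frozenset(g for g in objects if (g, m) in incidence)
        # Close the current family under intersection with this attribute extent.
        result |= {attr_extent & x for x in result}
    return result


def is_scale_measure(sigma, context, scale):
    """Check that the preimage under sigma of every scale extent is a context extent."""
    objects = context[0]
    context_extents = extents(*context)
    for scale_extent in extents(*scale):
        preimage = frozenset(g for g in objects if sigma[g] in scale_extent)
        if preimage not in context_extents:
            return False
    return True


# Toy data (assumed): three objects described as small or large.
K = ({1, 2, 3}, {"small", "large"}, {(1, "small"), (2, "small"), (3, "large")})
# A coarser scale with two objects and a single attribute.
S = ({"lo", "hi"}, {"is_lo"}, {("lo", "is_lo")})

good = {1: "lo", 2: "lo", 3: "hi"}  # respects the grouping {1, 2} versus {3}
bad = {1: "lo", 2: "hi", 3: "hi"}   # preimage of {"lo"} is {1}, not an extent of K
```

Here `is_scale_measure(good, K, S)` holds while `is_scale_measure(bad, K, S)` does not, since `{1}` cannot be described by the attributes of `K`. The brute-force enumeration of extents is only feasible for small contexts.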
Hanika, T., Hirth, J.: Quantifying the Conceptual Error in Dimensionality Reduction. In: Braun, T., Gehrke, M., Hanika, T., and Hernandez, N. (eds.) Graph-Based Representation and Reasoning - 26th International Conference on Conceptual Structures, ICCS 2021, Virtual Event, September 20-22, 2021, Proceedings. pp. 105–118. Springer (2021).
@inproceedings{DBLP:conf/iccs/HanikaH21, author = {Hanika, Tom and Hirth, Johannes}, booktitle = {Graph-Based Representation and Reasoning - 26th International Conference on Conceptual Structures, {ICCS} 2021, Virtual Event, September 20-22, 2021, Proceedings}, editor = {Braun, Tanya and Gehrke, Marcel and Hanika, Tom and Hernandez, Nathalie}, keywords = {scaling}, pages = {105--118}, publisher = {Springer}, series = {Lecture Notes in Computer Science}, title = {Quantifying the Conceptual Error in Dimensionality Reduction}, volume = 12879, year = 2021 }
%0 Conference Paper %1 DBLP:conf/iccs/HanikaH21 %A Hanika, Tom %A Hirth, Johannes %B Graph-Based Representation and Reasoning - 26th International Conference on Conceptual Structures, ICCS 2021, Virtual Event, September 20-22, 2021, Proceedings %D 2021 %E Braun, Tanya %E Gehrke, Marcel %E Hanika, Tom %E Hernandez, Nathalie %I Springer %P 105--118 %R 10.1007/978-3-030-86982-3_8 %T Quantifying the Conceptual Error in Dimensionality Reduction %U https://doi.org/10.1007/978-3-030-86982-3_8 %V 12879
Hanika, T., Hirth, J.: Exploring Scale-Measures of Data Sets. In: Braud, A., Buzmakov, A., Hanika, T., and Ber, F.L. (eds.) Formal Concept Analysis - 16th International Conference, ICFCA 2021, Strasbourg, France, June 29 - July 2, 2021, Proceedings. pp. 261–269. Springer (2021).
@inproceedings{DBLP:conf/icfca/HanikaH21, author = {Hanika, Tom and Hirth, Johannes}, booktitle = {Formal Concept Analysis - 16th International Conference, {ICFCA} 2021, Strasbourg, France, June 29 - July 2, 2021, Proceedings}, editor = {Braud, Agnès and Buzmakov, Aleksey and Hanika, Tom and Ber, Florence Le}, keywords = {scaling}, pages = {261--269}, publisher = {Springer}, series = {Lecture Notes in Computer Science}, title = {Exploring Scale-Measures of Data Sets}, volume = 12733, year = 2021 }
%0 Conference Paper %1 DBLP:conf/icfca/HanikaH21 %A Hanika, Tom %A Hirth, Johannes %B Formal Concept Analysis - 16th International Conference, ICFCA 2021, Strasbourg, France, June 29 - July 2, 2021, Proceedings %D 2021 %E Braud, Agnès %E Buzmakov, Aleksey %E Hanika, Tom %E Ber, Florence Le %I Springer %P 261--269 %R 10.1007/978-3-030-77867-5_17 %T Exploring Scale-Measures of Data Sets %U https://doi.org/10.1007/978-3-030-77867-5_17 %V 12733
Hanika, T., Hirth, J.: Conexp-Clj - A Research Tool for FCA. In: Cristea, D., Ber, F.L., Missaoui, R., Kwuida, L., and Sertkaya, B. (eds.) ICFCA (Supplements). pp. 70–75. CEUR-WS.org (2019).
@inproceedings{conf/icfca/HanikaH19, author = {Hanika, Tom and Hirth, Johannes}, booktitle = {ICFCA (Supplements)}, crossref = {conf/icfca/2019suppl}, editor = {Cristea, Diana and Ber, Florence Le and Missaoui, Rokia and Kwuida, Léonard and Sertkaya, Baris}, keywords = {sai}, pages = {70-75}, publisher = {CEUR-WS.org}, series = {CEUR Workshop Proceedings}, title = {Conexp-Clj - A Research Tool for FCA.}, volume = 2378, year = 2019 }
%0 Conference Paper %1 conf/icfca/HanikaH19 %A Hanika, Tom %A Hirth, Johannes %B ICFCA (Supplements) %D 2019 %E Cristea, Diana %E Ber, Florence Le %E Missaoui, Rokia %E Kwuida, Léonard %E Sertkaya, Baris %I CEUR-WS.org %P 70-75 %T Conexp-Clj - A Research Tool for FCA. %U http://dblp.uni-trier.de/db/conf/icfca/icfca2019suppl.html#HanikaH19 %V 2378
July 2021: 'Exploring Scale-Measures of Data Sets', ICFCA (2021), Université de Strasbourg, Strasbourg, France
March 2021: 'Discovery of Conceptual Measurements/Entdecken Begrifflicher Messungen', Explainable Artificial Intelligence, Schloss Dagstuhl Computer Science Center, Wadern, Germany
Alongside my theoretical research, I mainly work on two projects:
Conexp-Clj — A Research Tool for FCA
The research unit Knowledge & Data Engineering continues the development of the research tool conexp-clj, originally created by Dr. Daniel Borchmann. The continuous enhancement of the software package is supervised by Dr. Tom Hanika. With this tool at hand, the research group can test and analyze its theoretical research in the realm of formal concept analysis and related fields. The most recent pre-compiled release candidate can be downloaded here. A presentation of the tool can be found in "Conexp-Clj - A Research Tool for FCA".
BibSonomy is a scholarly social bookmarking system where researchers manage their collections of publications and web pages. BibSonomy is an open-source project, continuously developed by researchers in Kassel, Würzburg, and Hanover. As a test bed for recommendation and ranking algorithms, and through its publicly available datasets containing traces of user behavior on the Web, BibSonomy has been the subject of various scientific studies.
The project on fair digital services, "Ko-Valuation in der Gestaltung datenökonomischer Geschäftsmodelle (FAIRDIENSTE)" (co-valuation in the design of data-economy business models), pursues an interdisciplinary approach that combines sociological and (business) informatics aspects. It investigates fair business models that aim at cooperation and the mediation of values.
One goal of the work is to advance computer science methods for qualitative data analysis that make the landscape of conflicts arising at the customer interfaces of digital services transparent and enable consumers to critically assess different value perspectives.
We provide an interactive web app accompanying our work on the Topic Flow Network, which enables the study of topic-specific flows of expertise between scientific authors.
Associated Papers:
Schäfermeier, B., Hirth, J., & Hanika, T. (2022). Research topic flows in co-authorship networks. Scientometrics, 1-28.
During documenta 2022, the Schülerforschungszentrum Nordhessen, a research centre for students, hosted a "100 days of STEM" event. We participated with a workshop titled "Which is better? Scale, organise and understand data properly."
Supervision of Student Internships
I supervise an annual two-week student internship at the University of Kassel, on simulating traffic using cellular automata (2016) and on analyzing the Twitter social network (2018 to today).
Teaching Programming in Clojure
I teach an annual Clojure programming course (2019 to today).