Fake Context Challenge – FCC
Organizers: Maximilian Felde, Tom Hanika, and Johannes Hirth
Data science, and machine learning in particular, depends heavily on the quality of the underlying data. A pressing problem in this regard is the manipulation of data through modification or synthesis. With the proposed Fake Context Challenge (FCC), we want to address this problem and raise awareness of it, especially in the FCA community. In particular, the FCC is meant to promote the development of adequate FCA methods for detecting data tampering and data synthesis.
Tasks and Submission
The challenge consists of three different tasks in which fake formal contexts have to be recognized. All tasks use anonymized real-world data sets (i.e., data scaled to formal contexts) of different sizes and shapes.
To participate in the challenge, submit your source code or send us a link to a repository. You may use any programming language. Your submission should consist of either one program per task or a single program for all tasks. We will compile and execute your code according to your instructions on our reference computer system.
We have prepared training data sets for all of the following tasks. All files related to task {1,2,3} are named task-{1,2,3}-{fake,original}. The suffix fake/original is provided for your preparation only. The program you submit should just read the files task-1-{1,..,10}, task-2-{1a,1b,2a,2b,…,10a,10b}, or task-3-{1,..,20}.
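The naming scheme above can be expanded into concrete file names as in the following minimal Python sketch. The function name expected_input_names is hypothetical, and the sketch produces base names only, since the text does not state a file extension or directory layout.

```python
def expected_input_names():
    """Expand the brace patterns from the text into concrete base names.

    Assumption: the names are exactly the patterns given above, without
    any file extension (none is stated in the task description).
    """
    # task-1-{1,..,10}: ten contexts
    task1 = [f"task-1-{i}" for i in range(1, 11)]
    # task-2-{1a,1b,...,10a,10b}: ten pairs, members marked a and b
    task2 = [f"task-2-{i}{side}" for i in range(1, 11) for side in ("a", "b")]
    # task-3-{1,..,20}: twenty contexts
    task3 = [f"task-3-{i}" for i in range(1, 21)]
    return {1: task1, 2: task2, 3: task3}


if __name__ == "__main__":
    for task, names in expected_input_names().items():
        print(task, names)
```

A submission could iterate over these names and fail early with a clear message if one of the files is missing.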
Task 1
INPUT: A set of ten formal contexts, of which exactly one is an original real-world formal context K and nine are either artificially generated or manipulated versions of K.
OUTPUT: The formal context K.
Task 2
INPUT: A set of ten pairs of formal contexts. In each pair only one formal context is an original real-world data set.
OUTPUT: For each pair the formal context that is original.
Task 3
INPUT: A set of twenty formal contexts, comprising both original real-world and fake/manipulated formal contexts.
OUTPUT: All formal contexts that are non-fake/non-manipulated.
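The three tasks differ mainly in the shape of the expected output: a single context, one context per pair, and a subset of contexts. The following Python sketch illustrates this under strong assumptions: a formal context is held in memory as a set of (object, attribute) pairs, and the submission assigns each context a numeric score where higher means "more likely original". The challenge prescribes neither. All names here (Context, solve_task1, solve_task2, solve_task3, incidence_count) are hypothetical, and incidence_count is only a stand-in for a real detection method, not a meaningful one.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

# Assumed representation: the incidence relation as (object, attribute) pairs.
Context = FrozenSet[Tuple[str, str]]
Score = Callable[[Context], float]


def incidence_count(context: Context) -> float:
    """Placeholder score (number of incidences); NOT a real fake detector."""
    return float(len(context))


def solve_task1(contexts: Dict[str, Context], score: Score) -> str:
    """Task 1: return the name of the single context judged to be original."""
    return max(contexts, key=lambda name: score(contexts[name]))


def solve_task2(pairs: List[Tuple[str, str]],
                contexts: Dict[str, Context],
                score: Score) -> List[str]:
    """Task 2: for each pair, return the member judged to be original."""
    return [a if score(contexts[a]) >= score(contexts[b]) else b
            for a, b in pairs]


def solve_task3(contexts: Dict[str, Context],
                score: Score,
                threshold: float) -> List[str]:
    """Task 3: return all contexts whose score reaches the threshold."""
    return sorted(name for name, ctx in contexts.items()
                  if score(ctx) >= threshold)
```

Only Task 3 needs a threshold, because the number of original contexts in its input is not stated; Tasks 1 and 2 can rely on a relative comparison.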
Evaluation Criteria
The submitted code will be evaluated on
- correctness and
- running time.
To evaluate the latter criterion efficiently, we will enforce a strict limit of 120 minutes per task.
Dates and Format
- on April 19th we published the tasks and the training data set
- on June 1st we expect participants to submit their code as well as a four-page short paper that describes the applied method. The paper should be prepared using the CEUR Art Style. Submit your files via email to icfca-challenge-2023@lists.cs.uni-kassel.de
- Notification of acceptance of your paper and code to the challenge: June 8th
- during the workshop day of ICFCA all short papers will be presented in the FCC-Session
- during the ICFCA conference week we will announce the winners of the FCC in a special session