This dataset contains the metadata (title and authors) of all papers published at the ICCS (1993-2011), ICFCA (2003-2011), and CLA (2004-2011) conferences, including their citations (title, authors, year). The data was retrieved from the conference/publisher websites and extensively cleaned and normalized, both automatically and manually. See the corresponding paper

Doerfel, S.; Jäschke, R. & Stumme, G. (2012), Publication Analysis of the Formal Concept Analysis Community. In F. Domenach; D.I. Ignatov & J. Poelmans, ed., ‘ICFCA 2012’, Springer, Berlin/Heidelberg, pp. 77-95.

for further details. Please use this citation for referencing the dataset.

After extracting the archive file, you will find the following directory structure:

doerfel2012publication--+--iccs--+--1993--+--01.csv
                        |        |        +--02.csv
                        |        |        +--03.csv
                        |        ...      +-- ...
                        |        |        +--24.csv
                        |        +--1994--+--01.csv
                        |        |        +-- ...
                        |        ...      +-- ...
                        |        +--2011--+--01.csv
                        |        ...      +-- ...
                        +--icfca--+--2003--+--01.csv
                        |         |        +-- ...
                        |         ...      +-- ...
                        |         +--2011--+--01.csv
                        |         ...      +-- ...
                        +--cla--+--2004--+--02.csv
                        |       |        +-- ...
                        |       ...      +-- ...
                        |       +--2011--+--01.csv
                        |       ...      +-- ...

There is one directory for each of the three conferences, and each of them contains one subdirectory per year. Each paper is represented by a file with the extension csv, and the file name indicates the paper's position in the proceedings. The format of the CSV files is as follows:

# John F. Sowa  Relating diagrams to logic
Chen, P. P.     The entity-relationship model-toward a unified view of data     1976
Chomsky, N.      Syntactic Structures   1957
Genesereth, M. R. and Fikes, R. E.      Knowledge Interchange Format    1992
...

The first line always starts with # and gives the paper’s author(s) and title, separated by a TAB. Each of the following lines represents one publication referenced by the paper and contains its authors, title, and year, separated by TABs.
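
The format described above can be read with a few lines of Python. The following is only a minimal sketch, not the code used for the paper: the names parse_paper and Reference are made up for this illustration, missing fields are assumed to be possible and are padded with empty strings, and the year is kept as a string rather than converted to a number.

```python
from typing import List, NamedTuple, Tuple


class Reference(NamedTuple):
    """One cited publication (hypothetical helper type for this sketch)."""
    authors: str
    title: str
    year: str  # kept as text; assumption: the year may be missing


def parse_paper(text: str) -> Tuple[str, str, List[Reference]]:
    """Parse the content of one paper file into (authors, title, references)."""
    # Ignore blank lines so a trailing newline does not produce an empty record.
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith("#"):
        raise ValueError("expected a first line starting with '#'")

    # Header: '#', then the paper's author(s) and title separated by a TAB.
    authors, _, title = lines[0][1:].partition("\t")

    references = []
    for line in lines[1:]:
        fields = [field.strip() for field in line.split("\t")]
        # Assumption: pad short lines so that every record has three fields.
        fields += [""] * (3 - len(fields))
        references.append(Reference(*fields[:3]))

    return authors.strip(), title.strip(), references
```

A file could then be read with, for example, parse_paper(open("doerfel2012publication/iccs/1993/01.csv", encoding="utf-8").read()). The UTF-8 encoding is an assumption and should be checked against the actual files.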

For our analysis we used the pre-processed data and further normalized it for matching and duplicate detection, as described in the paper. The code to parse and normalize the author names and titles is available in BibSonomy’s Maven repository.
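
To load the complete pre-processed dataset for an analysis of this kind, the directory layout described above can be traversed as sketched below. This is a hedged illustration only: the function name list_paper_files is hypothetical, the sketch assumes that every year directory name and every file name (without the csv extension) is a number, and it does not perform any of the normalization described in the paper.

```python
from pathlib import Path
from typing import Iterator, Tuple


def list_paper_files(root: str) -> Iterator[Tuple[str, int, int, Path]]:
    """Yield (conference, year, position, path) for each paper file below root.

    Assumes the layout <root>/<conference>/<year>/<position>.csv.
    """
    for path in sorted(Path(root).glob("*/*/*.csv")):
        conference = path.parts[-3]   # e.g. "iccs"
        year = int(path.parts[-2])    # e.g. 1993
        position = int(path.stem)     # e.g. 1 for "01.csv"
        yield conference, year, position, path
```

For example, sum(1 for _ in list_paper_files("doerfel2012publication")) counts the paper files in the extracted archive.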

Additional Material

The map of the co-author graph (Figure 3 in the paper). The map was created using GMap from Graphviz.