Downloads

Data privacy is an important issue, especially in the context of searching for a baby name, and we want to prevent commercial use of nameling’s search profiles. We therefore kindly ask you to consult the administrative details for information on how to obtain the challenge data sets. Requests will be processed within one working day.

Offline Challenge

This section lists the most important files for the offline challenge. You can find more details on these data sets in the offline challenge’s task description.

Public Challenge Data
Tab-separated file containing one line per user activity.
Download
Test User Ids
Ids of all users contained in the secret test data set.
Download
Secret Test User Data
This file contains the secret user data that was used for evaluation in the offline challenge. The first column contains the user’s id; the remaining two columns contain that user’s two hidden test names.
Download
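As a minimal sketch, the secret test user data described above could be loaded into a dictionary as follows. The tab separator is an assumption (the description only states the column order), and the function name load_secret_test_data and the sample rows are made up for illustration.

```python
import csv
import io


def load_secret_test_data(lines):
    """Map each user id to its two hidden test names.

    Assumes three tab-separated columns per line:
    user id, first hidden name, second hidden name.
    """
    hidden = {}
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"expected 3 columns, got {len(row)}: {row!r}")
        user_id, first_name, second_name = row
        hidden[user_id] = (first_name, second_name)
    return hidden


# Made-up sample rows, not taken from the real data set.
sample = io.StringIO("1001\temma\tnoah\n1002\tmia\tfelix\n")
hidden_names = load_secret_test_data(sample)
print(hidden_names["1001"])
```

To read the real file, pass an open file handle instead of the in-memory sample.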

Online Challenge

This section lists the most important files for the online challenge. You can find more details on these data sets in the offline task description and the online task description.

Public Online Challenge Data
Tab-separated file containing one line per user activity. Please note that, for technical reasons, user names are anonymized differently than in the offline challenge (see the user mapping below).
Download
Mapping between user names in the offline and online challenge data
Tab-separated file containing one line per user (online_user_id, offline_user_id).
Download
Approximate Geo Locations
Based on the web server’s access log, we used MaxMind’s GeoLite City database to obtain a proxy for each user’s geographic location. This file is a tab-separated list in which each line consists of: user id, number of queries, country code, province, city, latitude, longitude. Please note that approximating geographic locations from IP addresses yields sparse and rather noisy results.
Download
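Because the geo location data is described as sparse, a reader for it has to tolerate missing values. The following sketch parses one line using the seven columns listed above. How missing values are encoded is not stated, so treating empty fields as missing is an assumption, as are the names GeoRecord and parse_geo_line and the sample lines.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeoRecord:
    user_id: str
    num_queries: int
    country_code: Optional[str]
    province: Optional[str]
    city: Optional[str]
    latitude: Optional[float]
    longitude: Optional[float]


def _text_or_none(value):
    value = value.strip()
    return value or None


def _float_or_none(value):
    value = value.strip()
    return float(value) if value else None


def parse_geo_line(line):
    """Parse one tab-separated line; empty fields are assumed to mean 'unknown'."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 7:
        raise ValueError(f"expected 7 columns, got {len(fields)}")
    user_id, num_queries, country, province, city, lat, lon = fields
    return GeoRecord(
        user_id=user_id,
        num_queries=int(num_queries),
        country_code=_text_or_none(country),
        province=_text_or_none(province),
        city=_text_or_none(city),
        latitude=_float_or_none(lat),
        longitude=_float_or_none(lon),
    )


# Made-up sample lines, not taken from the real data set.
complete = parse_geo_line("42\t17\tDE\tHessen\tKassel\t51.3167\t9.5\n")
sparse = parse_geo_line("43\t3\tUS\t\t\t\t\n")
print(complete.city, sparse.city)
```

Keeping unknown values as None rather than empty strings or zeros makes it harder to mistake a missing coordinate for a real one.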

To integrate your recommender, you can use either our Java or our Python binding. All necessary downloads are listed below.

Java Binding

nameling-model.jar (source)
This JAR file contains all basic model classes (e.g. the user object).
nameling-recommender.jar (source)
This JAR file contains the implementation of Nameling’s recommender framework.
nameling-recommender-servlet.war (source)
This WAR file contains the implementation of Nameling’s Java REST servlet for integrating remotely running recommender systems into the running system.

Python Binding

nameling-python-servlet.zip
This ZIP archive contains the implementation of Nameling’s Python-based REST servlet for integrating remotely running recommender systems written in Python into the running system.
nameling-test-client.jar
This JAR archive contains a simple test program for checking your running recommender servlet.

Supplementary Data Sets

In addition to the training data, we provide some supplementary data sets for your convenience. You don’t have to use them, and we don’t claim that they will be beneficial (in fact, we don’t know whether they are). At least you won’t have to crawl nameling to get them. As soon as you have registered for the challenge, you will have access to these data sets.

Name List
This list comprises all names currently known to nameling (one name per line). Recommendation test data is restricted to this list of names.
Download
Top 100 Similar Names
A tab-separated file containing, for each name, the 100 most similar names according to nameling’s similarity metric. Each line contains four columns: source name, other name, similarity score, rank.
Download (de) Download (en) Download (fr)
Approximate Geo Locations
Based on the web server’s access log, we used MaxMind’s GeoLite City database to obtain a proxy for each user’s geographic location. This file is a tab-separated list in which each line consists of: user id, number of queries, country code, province, city, latitude, longitude. Please note that approximating geographic locations from IP addresses yields sparse and rather noisy results.
Download
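The Top 100 Similar Names files lend themselves to a lookup from a name to its ranked neighbours. The sketch below assumes the listed column order (source name, other name, similarity score, rank), a numeric score, and an integer rank. The function name load_similar_names and the sample lines, including their scores, are invented for illustration.

```python
from collections import defaultdict


def load_similar_names(lines):
    """Return {source name: [(rank, other name, score), ...]} sorted by rank."""
    similar = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        source, other, score, rank = line.split("\t")
        similar[source].append((int(rank), other, float(score)))
    for neighbours in similar.values():
        neighbours.sort()  # tuples compare by rank first
    return dict(similar)


# Made-up sample lines, deliberately out of rank order.
sample_lines = [
    "emma\temily\t0.81\t2\n",
    "emma\tella\t0.93\t1\n",
    "noah\tnoel\t0.77\t1\n",
]
similar_names = load_similar_names(sample_lines)
top_for_emma = [other for _, other, _ in similar_names["emma"]]
print(top_for_emma)
```

Sorting on load means the files do not need to be ordered by rank for the lookup to return neighbours from most to least similar.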

Scripts

We provide all scripts needed to reproduce the data preprocessing (including the generation of separate training and test data) and the evaluation.

process_activitylog.pl
We used this script to split the data into training and test sets according to the evaluation protocol.
Download
evaluate_recommender.pl
This script implements the evaluation metric used in the challenge.
Download

Quick Access

You can download all of the files listed above by clicking the following download button.
Download