Offline Challenge

The challenge’s objective is easily described: Create a recommender system for first names! Given a set of names in which a user has shown interest, the recommender should suggest further names for that user. The recommender’s quality will be assessed on an evaluation data set. Thus, the task can be considered a standard item recommendation task.

There are, however, some details which are addressed in the following.

Task Description

We provide data from the name search engine nameling. Given a user’s (partial) search history in nameling, your recommender system should provide an ordered list of names in which the user might be interested. For the offline challenge, your recommender system’s quality will be evaluated with respect to the set of names users entered directly into nameling’s search field (i.e., ENTER_SEARCH activities). We restricted the evaluation to such activities because all other user activities are biased towards the lists of names which were displayed to nameling users (see our corresponding analysis of the ranking performance).

The setup of this challenge is motivated by the actual use case, namely providing name recommendations in an online application. While the online challenge will cover exactly this use case, the offline challenge is kept as closely related to the online task as possible. Thus, models which are built for the offline challenge may also be applied to the online challenge.

Data Set

The offline challenge is based on usage data from nameling. The data set is derived from nameling’s query logs, ranging from March 6th, 2012 to February 12th, 2013, containing profile data for 60,922 users with 515,848 activities.

Nameling’s usage data is a tab-separated text file, containing one activity per line:

userId  activity        name      timestamp
29429   ENTER_SEARCH    monique   13397796050
29429   LINK_SEARCH     monique   13397796180
29429   ENTER_SEARCH    brunner   13397796290
29429   ENTER_SEARCH    benedikt  13397796470

The first column is the anonymized userId (an integer), the second column denotes the activity (see below for details), the third column contains the corresponding name or category name, and the last column contains a POSIX timestamp. Within the data, the following activities can occur (a small parsing sketch is given after the list):

ENTER_SEARCH
The user entered a name directly into nameling’s search field.
LINK_SEARCH
The user followed a link on some result page. Please note that this also comprises pagination of name lists.
LINK_CATEGORY_SEARCH
Wherever available, names are categorized according to the corresponding Wikipedia articles. Users may click on such a category link for obtaining all accordingly categorized names.
NAME_DETAILS
Users can get some detailed information for a name (e.g., for Folke).
ADD_FAVORITE
Users can maintain a list of favorite names.
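
To illustrate the format, here is a minimal Python sketch for reading the usage log. It is an illustration only, not part of the challenge material; it assumes the tab-separated layout shown above and simply skips lines whose activity column is not one of the values listed.

ACTIVITIES = {"ENTER_SEARCH", "LINK_SEARCH", "LINK_CATEGORY_SEARCH",
              "NAME_DETAILS", "ADD_FAVORITE"}

def read_usage_log(path):
    # Yield one (userId, activity, name, timestamp) tuple per line of the log.
    with open(path, encoding="utf-8") as log:
        for line in log:
            user_id, activity, name, timestamp = line.rstrip("\n").split("\t")
            if activity not in ACTIVITIES:
                continue  # skip a possible header line or malformed rows
            yield int(user_id), activity, name, int(timestamp)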

For the challenge we selected a subset of users for the evaluation (in the following called test users). For each such test user, we withheld some of their most recent activities for testing according to the following rules:

  • For each user, we selected for evaluation the chronologically last two names which had been entered directly into nameling’s search field (i.e., ENTER_SEARCH activity) and which are also contained in the list of known names. For this ordering, we considered the time stamp of each name’s first occurrence within the user’s activities.
  • We considered only those names for evaluation which had not previously been added as a favorite name by the user.
  • All remaining user activity after the (chronologically) first evaluation name was discarded.
  • We required at least three activities per user to remain in the data set.
  • For previous publications, we had already published part of nameling’s usage data. Only users not contained in this previously published data set have been selected as test users.

The above procedure yields two data sets.

  • The secret evaluation data set contains, for each test user, the two left-out names. It will be made publicly available only after the challenge.
  • The public challenge data set contains the remaining user activities of the test users and the full lists of activities from all other users. In particular, the public challenge data set comprises all previously published data sets.

You can find an example of how we derived the two data sets from the full usage log on the FAQ page. The script which was used to split the data set as described above can be downloaded from the download page, so that you can obtain comparable training and evaluation scenarios from the public challenge data set.
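
As an illustration of the rules above, the following simplified Python sketch reproduces the split for a single user. It is not the official split script from the download page, and boundary cases may be handled differently there; the names split_user and known_names are our own, and activities is assumed to be the user’s chronologically sorted list of log tuples.

def split_user(activities, known_names):
    # activities: chronologically sorted (userId, activity, name, timestamp)
    # tuples of one user; known_names: nameling's list of known names.
    favorites = {name for _, act, name, _ in activities if act == "ADD_FAVORITE"}

    # Time stamp of each name's first occurrence within the user's activities.
    first_seen = {}
    for _, _, name, ts in activities:
        first_seen.setdefault(name, ts)

    # Candidate evaluation names: entered directly into the search field,
    # contained in the list of known names, never added as a favorite.
    candidates = {name for _, act, name, _ in activities
                  if act == "ENTER_SEARCH"
                  and name in known_names
                  and name not in favorites}

    # The chronologically last two candidates, ordered by first occurrence.
    ordered = sorted(candidates, key=lambda n: first_seen[n])
    if len(ordered) < 2:
        return None
    evaluation = ordered[-2:]

    # Discard the user's activities from the first evaluation name onwards.
    cutoff = first_seen[evaluation[0]]
    training = [a for a in activities if a[3] < cutoff]

    # Require at least three remaining activities.
    if len(training) < 3:
        return None
    return training, evaluation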

Please also have a look at the list of supplementary data sets available on the download page. We would also like to encourage you to use any data source you can think of, such as family trees, national name statistics, phone books, or any combination thereof.

Results

It is your recommender system’s task to recommend an ordered list of 1,000 names per test user. The result is a tab-separated text file, containing one line for each test user. Each line contains the id of the corresponding user, followed by the ordered list of up to 1,000 tab-separated names, with the most relevant name at the first position. Of course, recommending fewer than 1,000 names is admissible too, while longer lists of recommended names will be discarded. You can find an example for the preparation of the result files on the FAQ page.
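
As a sketch only, the following Python function writes a result file in this format; recommendations (a mapping from test user ids to ordered lists of names, most relevant first) is a hypothetical variable from your own pipeline.

def write_results(recommendations, path, max_names=1000):
    # One line per test user: the user id followed by up to 1,000
    # tab-separated names, most relevant first.
    with open(path, "w", encoding="utf-8") as out:
        for user_id, names in recommendations.items():
            out.write("\t".join([str(user_id)] + list(names)[:max_names]) + "\n")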

The following properties of the evaluation data should be considered for the computation of the recommendations:

  • All names that occur in the evaluation data set are contained in nameling’s list of known names (available on the download page). It is therefore reasonable to remove all names which are not contained in that list from your recommendations. Please note that the evaluation data nevertheless also contains (rare) names which do not occur in the public challenge data set.
  • A name from a user’s test set may not occur as an ENTER_SEARCH or ADD_FAVORITE activity within the user’s training data, but may occur as a LINK_SEARCH or NAME_DETAILS activity.
  • No preprocessing was applied to the public data. However, when testing names for equivalence, we consider their lower-case variants. Accordingly, the case of a name is also ignored during evaluation.

See our FAQ page for some thoughts on why we chose this particular number of 1,000 recommended names.

Evaluation

The assessment metric for the recommendations is Mean Average Precision (MAP@1000). MAP means: for each test user, look up the left-out names and take the precision at their respective positions in the ordered list of recommended names. These precision values are first averaged per test user and then over all test users to yield the final score. MAP@1000 means that only the first 1,000 positions of a list are considered. Thus, it may happen that for some test users one or both of the left-out names do not occur in the list of recommendations. These cases are handled as if the missing names were ranked at positions 1001 and 1002, respectively.
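
To make the computation concrete, here is a small Python sketch of MAP@1000 as described above (it assumes all names are already lower-cased); the Perl script mentioned below remains the authoritative implementation.

def average_precision(recommended, left_out, cutoff=1000):
    # Rank of each left-out name within the top `cutoff` recommendations;
    # names not found there are placed at positions 1001, 1002, ...
    top = list(recommended)[:cutoff]
    next_missing = cutoff + 1
    ranks = []
    for name in left_out:
        if name in top:
            ranks.append(top.index(name) + 1)
        else:
            ranks.append(next_missing)
            next_missing += 1
    ranks.sort()
    # Precision at the rank of the i-th found name is i / rank.
    return sum((i + 1) / rank for i, rank in enumerate(ranks)) / len(ranks)

def mean_average_precision(recommendations, evaluation):
    # First average per test user (average_precision), then over all users.
    return sum(average_precision(recommendations[u], left_out)
               for u, left_out in evaluation.items()) / len(evaluation)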

You can download a Perl script on the download page which computes the MAP score and use it for your own testing. It is the same script that will be used to evaluate the final recommendations in the challenge.

See our FAQ page for some thoughts on the use of MAP as evaluation metric.

Administrative Details

Here is the quick version of the instructions for taking part in the challenge. More detailed instructions follow below.

  1. Register to this challenge using the registration form (registration closed)
  2. Sign the non-disclosure agreement and send it to us via email to challenge@ecmlpkdd.org.
  3. Receive download instructions and get the public challenge data set.
  4. Experiment on the public challenge data and train your own recommender.
  5. Create the ordered lists of recommendations for each test user and send them to us via email no later than the 1st of July 2013.
  6. Submit a full workshop paper via EasyChair explaining your approach and your experiments no later than the 8th of July 2013.
  7. Receive reviews for your submission.
  8. Submit a camera-ready version of your paper (Deadline August 23rd, 2013).

All participants are encouraged to take part in the workshop!

Registration

The challenge welcomes teams of any size and background (researchers, students, etc.). Registration is fairly easy: just access the registration form (registration closed) and follow the instructions on the registration process. During the registration process, the following information is required:

  • An arbitrarily chosen user name
  • An email address for correspondence and submission of results.
  • The real name of one of the team’s members. You can make your team as large as you like and you may include (or exclude) team members during the challenge without telling us.
  • A scan or digital copy of the signed non-disclosure agreement for the data sets. Please understand that data privacy is an important issue, especially in the context of searching for a baby name, and that we want to prevent commercial use of nameling’s search profiles.
  • The team’s nickname. The team name is used to identify the contributions and to present intermediary scores on the leader board. It will be used as a publicly visible pseudonym during the challenge.

Please note that the registration process is complete only after we have received and reviewed your signed non-disclosure agreement. The user name will be used to identify your team. Requests will be processed within one working day.

In reply to your registration you will receive the necessary credentials to access the data sets from the download page.

Submission of Results

The submission of the recommended lists of names has to be in the format described in the Results section. You can send us your ordered lists of recommendations using the upload form. To successfully take part in the challenge, your team has to submit the file containing the (final) ordered lists of recommendations no later than the 1st of July 2013.

You may send us intermediary results whenever you like. The challenge will be decided based on the last submission of each team. We will post a leader board on the website on April 1st; it will subsequently be updated every Friday until the end of the challenge. Intermediary submissions can be valuable for gaining insight into your current score and your ranking compared to other teams. The winning team will be announced on this website on the 15th of July 2013. Please note that the submission of a workshop contribution is mandatory to win the challenge.

Submission of Workshop Contributions

All teams must submit a paper about their recommendation approach and take part in our workshop session during ECML PKDD. The paper should describe the recommendation approach as well as the experimental setup and results. For more details on the submission of papers, see the workshop page.

Finally

The task of recommending names is new, and accordingly everything is a contribution: even small pre-processing steps or applications of standard recommender systems can be successful. We are very much looking forward to your contribution and to seeing you at the 20DC13 workshop!

Further Reading

We wrote a few papers on the analysis and recommendation of names. You will find some baseline results in the paper on given name recommendations.