Code:Trias

Overview

Trias is an algorithm for computing triadic concepts that fulfill minimal support constraints. It was first described in this paper.

There is a GForge project where the source code is hosted:


Please cite the algorithm using this reference (in BibSonomy, in the ACM digital library):

Robert Jäschke, Andreas Hotho, Christoph Schmitz, Bernhard Ganter, and Gerd Stumme. Trias - An Algorithm for Mining Iceberg Tri-Lattices. In ICDM ’06: Proceedings of the Sixth International Conference on Data Mining, pages 907–911, Washington, DC, USA, 2006. IEEE Computer Society.

Other papers related to Trias can be found on myBibSonomy tagged with trias.

Users

Currently, there are several ways to run/configure Trias:

Input file format

By default, Trias expects as input an ASCII text file containing one triple per line. Each triple consists of three numbers, separated by blanks. The first number represents the item in the first dimension, the second number the item in the second dimension and the third number the item in the third dimension. An example could be this file:

1 3 2
1 1 1
1 2 3

Here, the first line is a hyperedge connecting item 1 of the first dimension, item 3 of the second dimension, and item 2 of the third dimension.

It is important to note that the items in each dimension must be numbered consecutively, i.e., without holes, and beginning with 1. If that is not the case, you have to activate the HOLES option.
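Whether a file needs the HOLES option can be checked before running Trias. The following is a minimal sketch, not part of Trias itself: the class HoleCheck and its methods are hypothetical names, and the check simply tests that each dimension uses exactly the numbers 1..k.

```java
import java.util.List;
import java.util.TreeSet;

/** Sketch (not part of Trias): decide whether a triple file needs HOLES. */
class HoleCheck {

    /** Parses blank-separated triples, one per line, into an int[n][3] array. */
    static int[][] parseTriples(List<String> lines) {
        return lines.stream()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .map(line -> {
                    String[] parts = line.split("\\s+");
                    if (parts.length != 3) {
                        throw new IllegalArgumentException("not a triple: " + line);
                    }
                    return new int[] {
                        Integer.parseInt(parts[0]),
                        Integer.parseInt(parts[1]),
                        Integer.parseInt(parts[2])
                    };
                })
                .toArray(int[][]::new);
    }

    /** Returns true if some dimension is not numbered 1..k without gaps. */
    static boolean hasHoles(int[][] triples) {
        for (int dim = 0; dim < 3; dim++) {
            TreeSet<Integer> items = new TreeSet<>();
            for (int[] triple : triples) {
                items.add(triple[dim]);
            }
            // Consecutive numbering from 1 means: the smallest item is 1 and
            // the largest item equals the number of distinct items.
            if (!items.isEmpty()
                    && (items.first() != 1 || items.last() != items.size())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[][] triples = parseTriples(List.of("1 3 2", "1 1 1", "1 2 3"));
        System.out.println(hasHoles(triples) ? "HOLES" : "NOHOLES"); // prints NOHOLES
    }
}
```

For the example file above, every dimension is numbered from 1 without gaps, so the sketch reports NOHOLES.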

If you need other input formats, you can write a Java class which implements the de.unikassel.cs.kde.trias.io.TriasReader interface. For more information on this, have a look at the developers section.

Output file format

By default, Trias writes results into an ASCII file containing one tri-concept per line. The tri-concept (A,B,C) = ({1}, {3}, {2}) is written in the first line of the following example file:

1 1 1 1 0 0 0 0 0 0 0   A = {1, },  B = {3, },  C = {2, }
1 1 1 1 0 0 0 0 0 0 0   A = {1, },  B = {2, },  C = {3, }
1 1 1 1 0 0 0 0 0 0 0   A = {1, },  B = {1, },  C = {1, }

The numbers preceding each concept are scores describing its size (volume); they can be used to grep or sort for concepts of certain sizes.
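If the default output has to be processed further, the three sets can be recovered from each line with a small parser. This is a sketch based only on the example lines above; ConceptLineParser is a hypothetical helper, and it deliberately ignores the leading score columns because their individual meaning is not documented here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch (not part of Trias): read A, B, C from one default output line. */
class ConceptLineParser {

    // Matches the content between one pair of curly braces, e.g. "1, ".
    private static final Pattern SET = Pattern.compile("\\{([^}]*)\\}");

    /** Returns the three item sets of a line in the order A, B, C. */
    static List<List<Integer>> parse(String line) {
        List<List<Integer>> sets = new ArrayList<>();
        Matcher matcher = SET.matcher(line);
        while (matcher.find()) {
            List<Integer> items = new ArrayList<>();
            for (String token : matcher.group(1).split(",")) {
                // The writer leaves a trailing ", " in each set; skip blanks.
                if (!token.trim().isEmpty()) {
                    items.add(Integer.parseInt(token.trim()));
                }
            }
            sets.add(items);
        }
        if (sets.size() != 3) {
            throw new IllegalArgumentException("expected three sets: " + line);
        }
        return sets;
    }

    public static void main(String[] args) {
        String line = "1 1 1 1 0 0 0 0 0 0 0   A = {1, },  B = {3, },  C = {2, }";
        System.out.println(parse(line)); // prints [[1], [3], [2]]
    }
}
```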

If you need a custom output format, you can implement the de.unikassel.cs.kde.trias.io.TriasWriter interface. For more information on this, have a look at the developers section.

Configuration

Command line arguments

TRIAS reads facts from STDIN and writes tri-concepts to STDOUT
usage:
java de.unikassel.cs.kde.trias.Trias X A B C dim0minsup dim1minsup dim2minsup HOLES|NOHOLES
 necessary parameters are:
   X ... number of triples
   A ... number of items in dim0
   B ... number of items in dim1
   C ... number of items in dim2
   dim0minsup, dim1minsup, dim2minsup ... minimal support (absolute values!) for dim0, dim1, dim2
   HOLES|NOHOLES ...  HOLES = columns contain holes (i.e., numbers are missing), NOHOLES = opposite
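The values for X, A, B, and C can be derived from the input data instead of being counted by hand. The sketch below is not part of Trias; TriasArgs is a hypothetical helper, and it assumes that "number of items" means the number of distinct items occurring in a dimension.

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch (not part of Trias): derive the command line from the triples. */
class TriasArgs {

    /** Returns {X, A, B, C}: triple count and distinct items per dimension. */
    static int[] countParameters(int[][] triples) {
        int[] result = new int[4];
        result[0] = triples.length;
        for (int dim = 0; dim < 3; dim++) {
            Set<Integer> items = new HashSet<>();
            for (int[] triple : triples) {
                items.add(triple[dim]);
            }
            result[dim + 1] = items.size();
        }
        return result;
    }

    /** Builds the invocation in the argument order given by the usage text. */
    static String commandLine(int[][] triples, int minSup0, int minSup1,
                              int minSup2, boolean holes) {
        int[] p = countParameters(triples);
        return String.format(
                "java de.unikassel.cs.kde.trias.Trias %d %d %d %d %d %d %d %s",
                p[0], p[1], p[2], p[3], minSup0, minSup1, minSup2,
                holes ? "HOLES" : "NOHOLES");
    }

    public static void main(String[] args) {
        int[][] triples = {{1, 3, 2}, {1, 1, 1}, {1, 2, 3}};
        // prints: java de.unikassel.cs.kde.trias.Trias 3 1 3 3 1 1 1 NOHOLES
        System.out.println(commandLine(triples, 1, 1, 1, false));
    }
}
```

Since Trias reads from STDIN and writes to STDOUT in this mode, the input file and the result file are attached through shell redirection.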

trias.properties

You can use Java properties files to configure Trias. A simple example looks like this:

trias.input = /tmp/trias/test
trias.holes = false
trias.numberOfTriples = 3
trias.numberOfItemsPerDimension.0 = 1
trias.numberOfItemsPerDimension.1 = 3
trias.numberOfItemsPerDimension.2 = 3
trias.minSupportPerDimension.0 = 1
trias.minSupportPerDimension.1 = 1
trias.minSupportPerDimension.2 = 1
trias.output = /tmp/trias/test.tri

The syntax is pretty self-explanatory. Nevertheless, here is the help output of the TriasPropertiesConfigurator:

The following properties are used for configuration:

trias.numberOfTriples ... number of triples
trias.numberOfItemsPerDimension.0 ... number of items in dim0
trias.numberOfItemsPerDimension.1 ... number of items in dim1
trias.numberOfItemsPerDimension.2 ... number of items in dim2
trias.input ... path to input file (default: STDIN)
trias.output ... path to output file (default: STDOUT)
trias.outputScores ... set to 'true', if scores for concepts should be printed (default: false)
trias.holes ... set to 'true', if the items are not numbered consecutively (default: false)
trias.delimiter ... a Java regular expression depicting, how the items of a triple are separated
trias.minSupportPerDimension.0 ... minimal number of items of dim0 to be in each tri-concept
trias.minSupportPerDimension.1 ... minimal number of items of dim1 to be in each tri-concept
trias.minSupportPerDimension.2 ... minimal number of items of dim2 to be in each tri-concept

As you can see, using properties files, you can also change the item delimiter.
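One detail worth illustrating is how a regular expression for trias.delimiter has to be written. The sketch below uses only java.util.Properties and assumes that Trias loads the file with that class; TriasPropertiesSketch is a hypothetical name. In the standard properties format the backslash is an escape character, so the regular expression \s+ must be written as \\s+ in the file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Sketch (not part of Trias): how the example properties are read by Java. */
class TriasPropertiesSketch {

    // File content as it would appear on disk. In this Java string literal
    // "\\\\s+" stands for the four characters \\s+ in the file.
    static final String EXAMPLE = String.join("\n",
            "trias.input = /tmp/trias/test",
            "trias.numberOfTriples = 3",
            "trias.numberOfItemsPerDimension.0 = 1",
            "trias.numberOfItemsPerDimension.1 = 3",
            "trias.numberOfItemsPerDimension.2 = 3",
            "trias.minSupportPerDimension.0 = 1",
            "trias.minSupportPerDimension.1 = 1",
            "trias.minSupportPerDimension.2 = 1",
            "trias.delimiter = \\\\s+",
            "trias.output = /tmp/trias/test.tri");

    static Properties load(String text) throws IOException {
        Properties properties = new Properties();
        properties.load(new StringReader(text));
        return properties;
    }

    /** Reads the three values stored under prefix + "0", "1", and "2". */
    static int[] perDimension(Properties properties, String prefix) {
        int[] values = new int[3];
        for (int dim = 0; dim < 3; dim++) {
            values[dim] = Integer.parseInt(properties.getProperty(prefix + dim).trim());
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        Properties p = load(EXAMPLE);
        // Keys that are absent fall back to the documented defaults.
        System.out.println("holes = " + p.getProperty("trias.holes", "false"));
        // After loading, the value is the regular expression \s+.
        String delimiter = p.getProperty("trias.delimiter");
        System.out.println("1 3 2".split(delimiter).length); // prints 3
    }
}
```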

SOAP webapp

The module trias-webapp contains a webapp which can be deployed on a servlet container like Apache Tomcat and accessed over HTTP via SOAP.

Developers

For writing custom input, output, or configuration mechanisms, there exist three interfaces you can implement.

TriasReader

The basic interface for delivering input to Trias. The method getItemlist returns a two-dimensional array of integers: the first dimension indexes the triples, and the second dimension holds each triple as an array of length four whose fourth entry is left empty.

public interface TriasReader {
	public int[][] getItemlist () throws NumberFormatException, IOException;
}
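A custom reader only has to produce this array. The following sketch shows a hypothetical CsvTriasReader for comma-separated triples. It repeats the interface locally so that it compiles on its own, and it assumes that the three items of a triple are stored in the first three entries of each row, in dimension order.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Local copy of the interface so the sketch is self-contained; a real
// implementation would import de.unikassel.cs.kde.trias.io.TriasReader.
interface TriasReader {
    int[][] getItemlist() throws NumberFormatException, IOException;
}

/** Hypothetical reader for comma-separated triples such as "1,3,2". */
class CsvTriasReader implements TriasReader {
    private final BufferedReader in;

    CsvTriasReader(BufferedReader in) {
        this.in = in;
    }

    @Override
    public int[][] getItemlist() throws NumberFormatException, IOException {
        List<int[]> rows = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // tolerate blank lines
            }
            String[] parts = line.trim().split("\\s*,\\s*");
            if (parts.length != 3) {
                throw new IOException("expected three items: " + line);
            }
            int[] row = new int[4]; // fourth entry is left at 0 (unused here)
            for (int i = 0; i < 3; i++) {
                row[i] = Integer.parseInt(parts[i]);
            }
            rows.add(row);
        }
        return rows.toArray(new int[0][]);
    }

    public static void main(String[] args) throws IOException {
        BufferedReader input = new BufferedReader(new StringReader("1,3,2\n1,1,1\n1,2,3\n"));
        System.out.println(new CsvTriasReader(input).getItemlist().length); // prints 3
    }
}
```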

Implementations

TriasStandardReader

Default reader which expects one triple per line, with the items of each triple separated by the delimiter given in the constructor. It is necessary to provide the number of triples beforehand. Each triple consists of three items, represented by numbers. The items in each dimension must be numbered consecutively, beginning with 1.

public TriasStandardReader(int numberOfTriples, final String delimiter)

TriasHoleReader

Similar to the TriasStandardReader, but this TriasReader allows the items to be numbered non-consecutively. Output must be written with the TriasHoleWriter using the map retrieved by public HashMap<Integer, Integer>[] getInverseMapping().
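The idea behind such a reader/writer pair can be sketched as follows. This is not the actual TriasHoleReader code: HoleMapping and its methods are hypothetical, and the sketch assumes that the inverse mapping holds one map per dimension from the consecutive number back to the original item.

```java
import java.util.HashMap;

/** Sketch of ID compaction; NOT the actual TriasHoleReader implementation. */
class HoleMapping {
    private final HashMap<Integer, Integer>[] forward; // original -> consecutive
    private final HashMap<Integer, Integer>[] inverse; // consecutive -> original

    @SuppressWarnings("unchecked")
    HoleMapping() {
        forward = new HashMap[3];
        inverse = new HashMap[3];
        for (int dim = 0; dim < 3; dim++) {
            forward[dim] = new HashMap<>();
            inverse[dim] = new HashMap<>();
        }
    }

    /** Returns a copy of the triples renumbered 1..k in each dimension. */
    int[][] compact(int[][] triples) {
        int[][] result = new int[triples.length][3];
        for (int t = 0; t < triples.length; t++) {
            for (int dim = 0; dim < 3; dim++) {
                int original = triples[t][dim];
                Integer mapped = forward[dim].get(original);
                if (mapped == null) {
                    mapped = forward[dim].size() + 1; // next free number
                    forward[dim].put(original, mapped);
                    inverse[dim].put(mapped, original);
                }
                result[t][dim] = mapped;
            }
        }
        return result;
    }

    /** Translates a consecutive number back to the original item. */
    int restore(int dim, int consecutive) {
        return inverse[dim].get(consecutive);
    }

    public static void main(String[] args) {
        HoleMapping mapping = new HoleMapping();
        int[][] compacted = mapping.compact(new int[][] {{10, 5, 7}, {20, 5, 9}});
        System.out.println(compacted[1][0] + " -> " + mapping.restore(0, compacted[1][0]));
        // prints 2 -> 20
    }
}
```

A writer that receives concepts in the consecutive numbering would apply restore to every item before printing, which is why reader and writer have to share the same inverse mapping.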

TriasWriter

The TriasWriter is called by Trias to write the resulting concepts during computation. This is done by a call to the write method:

public interface TriasWriter {
	public void write (final int[][] concept) throws IOException;
	public void close() throws IOException;
}
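A custom output format then amounts to one small class. The sketch below is a hypothetical TabTriasWriter that prints one tab-separated line per concept. It repeats the interface locally so that it compiles on its own, and it assumes that concept[0], concept[1], and concept[2] hold the items of A, B, and C; the leading number is the product of the three set sizes, which is this sketch's own notion of volume.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;

// Local copy of the interface so the sketch is self-contained; a real
// implementation would import de.unikassel.cs.kde.trias.io.TriasWriter.
interface TriasWriter {
    void write(final int[][] concept) throws IOException;
    void close() throws IOException;
}

/** Hypothetical writer: volume and the three sets, separated by tabs. */
class TabTriasWriter implements TriasWriter {
    private final Writer out;

    TabTriasWriter(Writer out) {
        this.out = out;
    }

    @Override
    public void write(final int[][] concept) throws IOException {
        // Assumption: concept[0..2] are the item arrays of A, B, and C.
        long volume = (long) concept[0].length * concept[1].length * concept[2].length;
        out.write(volume + "\t"
                + Arrays.toString(concept[0]) + "\t"
                + Arrays.toString(concept[1]) + "\t"
                + Arrays.toString(concept[2]) + "\n");
    }

    @Override
    public void close() throws IOException {
        out.close();
    }

    public static void main(String[] args) throws IOException {
        StringWriter buffer = new StringWriter();
        TriasWriter writer = new TabTriasWriter(buffer);
        writer.write(new int[][] {{1}, {3}, {2}});
        writer.close();
        System.out.print(buffer); // prints 1, [1], [3], [2] separated by tabs
    }
}
```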

Implementations

TriasStandardWriter
TriasHoleWriter
TriasStringWriter

TriasConfigurator

Implementations

TriasCommandLineArgumentConfigurator
TriasPropertiesConfigurator
TriasJavaConfigurator

License

Trias is licensed under the GNU General Public License version 2.

Copyright 2006-2008 Robert Jäschke

Trias is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

Trias is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with Trias; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA