Myrrix, the REST-ified Mahout for real-time recommendations

Myrrix is a complete, real-time, scalable clustering and recommender system, evolved from Apache Mahout. The full Myrrix system uses two components: a Computation Layer and one or many Serving Layers.

The Computation Layer computes the large machine learning models needed by the Serving Layer. The Serving Layer is a Java HTTP server application that serves user requests in real time, making recommendations and receiving new input via a REST API. Many instances of the Serving Layer may be run for horizontal scalability.

In stand-alone mode, one Serving Layer instance operates as the complete system on one server. It performs all computation locally, and does not access a distributed Computation Layer. All input and intermediate data is stored locally in a directory on the local file system. Here, the Serving Layer is not clustered either, and only a single instance can be run at once.

In distributed mode, one or more Serving Layer instances cooperate with a Computation Layer, sending it data and reading its computed recommender models. In this mode, the Computation Layer handles the whole model-building process, and storage and computation are distributed across one or many machines so that much more data can be processed, faster. It is implemented on top of a distributed storage system and uses Apache Hadoop, an implementation of the MapReduce distributed computing paradigm.
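To make the REST API mentioned above a bit more concrete, here is a minimal sketch of the kind of URLs a client ends up calling. The MyrrixUrls class is a hypothetical helper written for this post, and the /pref and /recommend paths are assumptions based on the Serving Layer API as I remember it, so check them against the documentation of your Myrrix version before relying on them.

```java
import java.net.URI;

// Hypothetical helper, not part of Myrrix: builds the URLs a REST client would call.
final class MyrrixUrls {

    private MyrrixUrls() {
    }

    // Assumed endpoint: POST /pref/{userID}/{itemID} records a user-item association.
    static URI setPreference(String host, int port, long userID, long itemID) {
        return URI.create("http://" + host + ":" + port + "/pref/" + userID + "/" + itemID);
    }

    // Assumed endpoint: GET /recommend/{userID}?howMany={n} returns the top n recommendations.
    static URI recommend(String host, int port, long userID, int howMany) {
        return URI.create("http://" + host + ":" + port + "/recommend/" + userID + "?howMany=" + howMany);
    }
}
```

In practice you rarely build these URLs yourself: the Java client used in the rest of this post wraps the HTTP calls behind the MyrrixRecommender interface.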

First, you need these Maven dependencies:

<dependency>
    <groupId>net.myrrix</groupId>
    <artifactId>myrrix-online-local</artifactId>
    <version>1.0.1</version>
</dependency>
<dependency>
    <groupId>net.myrrix</groupId>
    <artifactId>myrrix-common</artifactId>
    <version>1.0.1</version>
</dependency>
<dependency>
    <groupId>net.myrrix</groupId>
    <artifactId>myrrix-client</artifactId>
    <version>1.0.1</version>
</dependency>


Let's write some lines of code:



import java.io.File;
import java.io.IOException;

import net.myrrix.client.ClientRecommender;
import net.myrrix.client.MyrrixClientConfiguration;
import net.myrrix.common.MyrrixRecommender;
import net.myrrix.online.ServerRecommender;

import org.apache.commons.io.FileUtils;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;

@Configuration
public class SuggestConfig {

    // "prod" profile: a client that talks to a running Serving Layer over HTTP.
    @Configuration
    @Profile({ "prod" })
    public static class Default {

        @Bean
        public MyrrixRecommender recommander() throws IOException {
            MyrrixClientConfiguration clientConfig = new MyrrixClientConfiguration();
            clientConfig.setHost("localhost");
            // clientConfig.setPort(443);
            // clientConfig.setSecure(true);
            return new ClientRecommender(clientConfig);
        }
    }

    // "test" profile: an embedded Serving Layer working from a local directory.
    @Configuration
    @Profile({ "test" })
    public static class Test {

        public static final File suggesterDataDir = new File("suggester-tests");

        @Bean
        public MyrrixRecommender recommander() throws IOException {
            // Start from an empty data directory on every run.
            FileUtils.deleteQuietly(suggesterDataDir);
            FileUtils.forceMkdir(suggesterDataDir);
            return new ServerRecommender(suggesterDataDir);
        }
    }
}

This configuration class defines a MyrrixRecommender bean that depends on the active profile ("test" or "prod"):
- With the "test" profile, you run an embedded instance of the Myrrix Serving Layer.
- With the "prod" profile, we simply use the dedicated client, which means a Serving Layer instance has to be deployed. For this tutorial, the simplest way is to install a stand-alone Serving Layer (for heavy use or medium to huge datasets, you may be interested in the distributed version, with one or more Serving Layer instances and a Computation Layer). Just download the jar for the stand-alone Serving Layer here: http://myrrix.com/download-interstitial/ and run it with:

sudo java -Dmodel.features=100 -Dmodel.als.lambda=2 -Xmx512m -jar myrrix-serving-1.0.1.jar

Once it is running, the Serving Layer console should be accessible at http://localhost/ in a browser.
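The test class further down refers to the two profiles through a Profiles constants class (com.streaming.config.Profiles) that is not listed in this post. A minimal sketch of what it could look like, assuming it only holds the two profile names used in the @Profile annotations above:

```java
// Assumed to live in the com.streaming.config package, next to SuggestConfig.
// This class is a guess at the missing helper, not code taken from Myrrix or Spring.
public final class Profiles {

    // Must match the names used in @Profile({ "test" }) and @Profile({ "prod" }).
    public static final String TEST = "test";
    public static final String PROD = "prod";

    private Profiles() {
        // Constants holder, never instantiated.
    }
}
```

Because these are compile-time constants, they could also be used directly in the annotations, for example @Profile({ Profiles.TEST }), which keeps the names in one place.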

Let's play with it now:



import java.io.File;
import java.util.ArrayList;
import java.util.Collection;
import java.util.concurrent.TimeUnit;

import net.myrrix.common.MyrrixRecommender;

import org.apache.commons.io.FileUtils;
import org.apache.mahout.cf.taste.common.TasteException;
import org.junit.Test;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;

import com.streaming.config.Profiles;
import com.streaming.config.SuggestConfig;

public class Recommand {

    // Runs against the embedded Serving Layer ("test" profile).
    @Test
    public void localRecommander() throws Exception {

        AnnotationConfigApplicationContext context = new AnnotationConfigApplicationContext();
        context.register(SuggestConfig.class);
        context.getEnvironment().addActiveProfile(Profiles.TEST);
        context.refresh();

        // Trick: add some data in order to avoid a "net.myrrix.common.NotReadyException"
        // Each line is: user ID, item ID, preference strength.
        File csv = new File(SuggestConfig.Test.suggesterDataDir, "initial.csv");
        Collection<String> lines = new ArrayList<String>();
        lines.add("1, 3, 1");
        lines.add("1, 4, 1");
        lines.add("2, 3, 1");
        lines.add("2, 4, 1");
        lines.add("2, 5, 1");
        lines.add("2, 6, 1");
        lines.add("1, 5, 1");
        lines.add("3, 3, 1");
        lines.add("3, 5, 1");
        FileUtils.writeLines(csv, lines);

        goForARide(context);
    }

    // Runs against a separately started Serving Layer ("prod" profile).
    @Test
    public void remoteRecommander() throws Exception {

        AnnotationConfigApplicationContext context = new AnnotationConfigApplicationContext();
        context.register(SuggestConfig.class);
        context.getEnvironment().addActiveProfile(Profiles.PROD);
        context.refresh();

        goForARide(context);
    }

    private void goForARide(AnnotationConfigApplicationContext context)
            throws TasteException, InterruptedException {
        MyrrixRecommender recommander = context.getBean(MyrrixRecommender.class);

        // Record user-item associations: setPreference(userID, itemID).
        recommander.setPreference(1, 3);
        recommander.setPreference(1, 4);
        recommander.setPreference(2, 3);
        recommander.setPreference(2, 4);
        recommander.setPreference(2, 5);
        recommander.setPreference(2, 6);
        recommander.setPreference(1, 5);
        recommander.setPreference(3, 3);
        recommander.setPreference(3, 5);

        // Give the Serving Layer a moment to take the new input into account.
        TimeUnit.SECONDS.sleep(1);

        // Top 2 recommendations for user 1, then the 3 most popular items.
        System.err.println(recommander.recommend(1, 2));
        System.err.println(recommander.mostPopularItems(3));

        context.close();
    }
}
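The one-second sleep in goForARide is only a guess at how long the Serving Layer needs before it can answer. On a slower machine or with more data it may not be enough. A more robust option is to poll until the recommender reports that it is ready. Here is a minimal sketch, where ReadyPoller is a hypothetical helper written for this post (it is not part of Myrrix) and the readiness check is supplied by the caller:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: repeatedly runs a readiness check until it passes or a timeout expires.
final class ReadyPoller {

    private ReadyPoller() {
    }

    static void awaitReady(Callable<Boolean> readyCheck, long timeoutMillis, long pollMillis)
            throws Exception {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!readyCheck.call()) {
            // Give up once the deadline has passed.
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("not ready after " + timeoutMillis + " ms");
            }
            TimeUnit.MILLISECONDS.sleep(pollMillis);
        }
    }
}
```

In the test, the check could wrap a readiness call on the recommender, something like recommander.isReady() if your Myrrix version exposes such a method (that name is an assumption here), or simply try recommend(...) and return false while a NotReadyException is thrown.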

Enjoy!





Sources:
http://myrrix.com/myrrix-is/
http://myrrix.com/quick-start/
https://aws.amazon.com/amis/myrrix-serving-layer-distributed
