<%@ page import="java.io.*" %>
<%@ page import="java.util.Collections" %>
<%@ page import="java.util.Comparator" %>
<%@ page import="java.util.ArrayList" %>
<%@ page import="AlphanumComparator.*" %>
MRNT - map reduced newsticker
Welcome to map/reduced Newsticker!
What are the most used words today:
<%
String file = "/tmp/output/part-00000";
BufferedReader br = new BufferedReader(new FileReader(file));
String zeile; // current line of the WordCount output
ArrayList<String> Stuff = new ArrayList<String>();
AlphanumComparator ac = new AlphanumComparator();
while ((zeile = br.readLine()) != null) {
    // Each line is "word<TAB>count"; display it as "count : word".
    String[] splitupText = zeile.split("\t");
    String a1 = splitupText[1] + " : " + splitupText[0];
    Stuff.add(a1);
}
br.close();
// Sort in natural (alphanumeric) order, then reverse so the highest counts come first.
Collections.sort(Stuff, ac);
Collections.reverse(Stuff);
for (int j = 0; j < Stuff.size(); j++) {
    out.println(Stuff.get(j));
    out.println("<br>");
}
%>
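The ranking step in the scriptlet above can also be tried outside a servlet container. The following is a minimal sketch, not the page's actual code: it assumes the same "word<TAB>count" line format, and it swaps AlphanumComparator for a plain numeric comparison on the parsed count. The class name `TopWords`, the nested `Entry` type, and the sample data are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper class; not part of the original page.
public class TopWords {

    /** One "word<TAB>count" record, printed as "count : word" like the JSP does. */
    public static final class Entry {
        public final String word;
        public final long count;

        public Entry(String word, long count) {
            this.word = word;
            this.count = count;
        }

        @Override
        public String toString() {
            return count + " : " + word;
        }
    }

    /** Parses the given lines and returns the entries sorted by count, highest first. */
    public static List<Entry> rank(Iterable<String> lines) {
        List<Entry> entries = new ArrayList<Entry>();
        for (String line : lines) {
            String[] parts = line.split("\t");
            if (parts.length != 2) {
                continue; // skip lines that are not "word<TAB>count"
            }
            try {
                entries.add(new Entry(parts[0], Long.parseLong(parts[1].trim())));
            } catch (NumberFormatException e) {
                // skip lines whose count is not a number
            }
        }
        // Numeric comparison on the count (assumption: replaces AlphanumComparator + reverse).
        Collections.sort(entries, new Comparator<Entry>() {
            public int compare(Entry a, Entry b) {
                return Long.compare(b.count, a.count);
            }
        });
        return entries;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the contents of part-00000.
        List<String> sample = Arrays.asList("hadoop\t12", "the\t140", "tomcat\t7");
        for (Entry e : rank(sample)) {
            System.out.println(e);
        }
    }
}
```

In the page itself, the lines would come from reading part-00000 with a BufferedReader. Comparing parsed counts avoids depending on an extra comparator class, and skipping malformed lines avoids the ArrayIndexOutOfBoundsException that `splitupText[1]` would throw on a line without a tab.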
How it works:
- We need the Java JRE and JDK for Java development from http://java.sun.com/javase/downloads/index.jsp
- We need Apache Tomcat from http://tomcat.apache.org/
- Unzip the software and start Apache Tomcat
/usr/local/apache-tomcat-6.0.26/bin/startup.sh
(Tomcat listens on port 8080 by default.) Set JAVA_HOME in bin/catalina.sh if the system has no default:
JAVA_HOME=/usr/local/jre1.6.0_18
- Additional software is needed: GNU JavaMail, GNU JAF, and GNU inetlib
- The first Java app is an NNTP client (see http://blog.jservlet.com/post/2007/06/29/first or download it). Configure the news account and newsgroup name, then compile the Java source to a class:
/usr/local/jdk1.6.0_18/bin/javac -cp gnumail.jar:gnumail-providers.jar NNTP.java
- Download and install Hadoop from http://hadoop.apache.org
- Compile the WordCount.java example from the Hadoop sources:
/usr/local/jdk1.6.0_18/bin/javac -cp hadoop-0.20.2-core.jar -d wordcount_classes WordCount.java
/usr/local/jdk1.6.0_18/bin/jar -cvf wordcount.jar -C wordcount_classes/ .
- Fetch all article bodies from the selected newsgroup:
cd /usr/local/cloudcomputing/mail-1.1.2/
/usr/local/jre1.6.0_18/bin/java NNTP gnu.mail.providers.nntp.NNTPStore > ../hadoop-0.20.2/input/001
- Start Hadoop in single-node mode or on a server, or use AWS:
cd /usr/local/cloudcomputing/hadoop-0.20.2/
bin/hadoop jar wordcount.jar org.myorg.WordCount input output
cp /usr/local/cloudcomputing/hadoop-0.20.2/output/part-00000 /usr/local/apache-tomcat-6.0.26/webapps/cloud/
- Download AlphanumComparator from here and install it in the class directory of your webapp
- Write