Monday 13 October 2014

How to Train My own Model using Mahout

Mahout Training A Model

       Apache Mahout is a scalable machine learning library covering collaborative filtering, clustering, and classification. Follow the steps below to train a model on your own data. This example trains on the 20news-all data folder.
 
1. Create a working directory and copy all the data into it. Work on this copy so that your original data is not lost.

2.    Create sequence files from the training data using the following command:
a.    Command:
./bin/mahout seqdirectory -i 20news-all -o 20news-seq
3.    Convert the sequence files to vectors. Mahout's algorithms take vector files as input, so the data must first be converted to sequence files and then from sequence files to vector files. The -wt (--weight) option selects the term weighting and accepts tf or tfidf.
a.    Command:
./bin/mahout seq2sparse -i 20news-seq -o 20news-vectors -lnorm -nv -wt tfidf
4.    In practice, the labeled examples are typically divided into two parts. The first part, the training data, consists of 80 to 90 percent of the available data and is used to produce the model. The second part, the test data, is given to the model without the desired answers, although they are known, so that the model's output can be compared with the desired output. Here we split the data set randomly into 80% training data and 20% test data. The value 20 after the --randomSelectionPct parameter indicates that 20% of the data is selected randomly for the test set.
a.    Command:
./bin/mahout split -i 20news-vectors/tfidf-vectors --trainingOutput 20news-train-vectors --testOutput 20news-test-vectors --randomSelectionPct 20 --overwrite --sequenceFiles -xm sequential
5.    Training Naive Bayes model
a.    Command:
./bin/mahout trainnb -i 20news-train-vectors -el -o model -li labelindex -ow


6.    Self-testing on the training set: your model is now ready, but its accuracy still needs to be checked. First test it against the training set itself.
a.    Command:
./bin/mahout testnb -i 20news-train-vectors -m model -l labelindex -ow -o 20news-testing
b.    Output:
i.    Kappa                               0.4955
ii.   Accuracy                            99.103%
iii.  Reliability                         99.0858%
iv.   Reliability (standard deviation)    0.0952
7.    Testing on the holdout set: here we use the holdout (test) set to measure the model's accuracy on data it was not trained on.
a.    Command:
./bin/mahout testnb -i 20news-test-vectors -m model -l labelindex -ow -o 20news-testing
b.    Output:
i.    Accuracy                            90.093%
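
The random 80/20 split from step 4 can be illustrated in plain Java. This is only a minimal sketch of the idea, not Mahout's implementation: the class name HoldoutSplit, the split method and the fixed seed are assumptions made for this example, and the "documents" are just integers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class HoldoutSplit {

    /** Holds the two partitions produced by a split. */
    public static final class Split<T> {
        public final List<T> training;
        public final List<T> test;

        Split(List<T> training, List<T> test) {
            this.training = training;
            this.test = test;
        }
    }

    /**
     * Shuffles a copy of the examples and puts testPct percent of them
     * into the test partition; the rest becomes the training partition.
     */
    public static <T> Split<T> split(List<T> examples, int testPct, long seed) {
        if (testPct < 0 || testPct > 100) {
            throw new IllegalArgumentException("testPct must be between 0 and 100");
        }
        List<T> shuffled = new ArrayList<>(examples);
        Collections.shuffle(shuffled, new Random(seed));
        int testSize = shuffled.size() * testPct / 100;
        List<T> test = new ArrayList<>(shuffled.subList(0, testSize));
        List<T> training = new ArrayList<>(shuffled.subList(testSize, shuffled.size()));
        return new Split<>(training, test);
    }

    public static void main(String[] args) {
        // 100 stand-in documents, identified by number only.
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            docs.add(i);
        }
        Split<Integer> s = split(docs, 20, 42L);
        System.out.println("training=" + s.training.size() + ", test=" + s.test.size());
        // prints "training=80, test=20"
    }
}
```

Because the test partition never takes part in training, accuracy measured on it (step 7) is a fairer estimate than accuracy measured on the training set (step 6).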

This is how to train a model in Apache Mahout. The next post will show how to use this model to get the sentiment of a comment.

Wednesday 8 October 2014

Serialize and Deserialize a Java Object

To Serialize and Deserialize a Java Object

Serialization means converting an object into a sequence of bytes that can be persisted in a database or saved to a file on disk. Here the bytes are additionally encoded as text with BASE64Encoder; note that Base64 is an encoding, not encryption. Creating a Java object from a saved sequence of bytes is called deserialization.

Your class must implement the Serializable interface in order to be serialized or deserialized.
Serializable is a marker interface: it declares no methods and only signals that instances of the implementing class may be serialized.


Here we serialize with sun.misc.BASE64Encoder so that the bytes can be stored as a plain string. Base64 does not secure or encrypt the data. Create the encoder and decoder objects in your main class:

    static private BASE64Encoder encode = new BASE64Encoder();
    static private BASE64Decoder decode = new BASE64Decoder();




Call the objectToSerialize method to serialize any object whose class implements the Serializable interface.

    static public String objectToSerialize(Object obj) {
        String out = null;
        if (obj != null) {
            try {
                ByteArrayOutputStream baos = new ByteArrayOutputStream();
                ObjectOutputStream oos = new ObjectOutputStream(baos);
                oos.writeObject(obj);
                // Flush so that every buffered byte reaches baos before it is read.
                oos.flush();
                // Encode the raw bytes as Base64 text.
                out = encode.encode(baos.toByteArray());
            } catch (IOException e) {
                e.printStackTrace();
                return null;
            }
        }
        return out;
    }
                     


Call the SToO method to deserialize the string back into a Java object.

    static public Object SToO(String str) {
        Object out = null;
        if (str != null) {
            try {

                ByteArrayInputStream bios = new ByteArrayInputStream(decode.decodeBuffer(str));
                ObjectInputStream ois = new ObjectInputStream(bios);
                out = ois.readObject();
            } catch (IOException e) {
                e.printStackTrace();
                return null;
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
                return null;
            }
        }      
        return out;
    }
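
The sun.misc classes are internal JDK APIs, so on Java 8 or later the same round trip can be sketched with the public java.util.Base64 class instead. This is a minimal sketch rather than a drop-in replacement for the methods above: the class Base64RoundTrip, the example type Comment and the method names are assumptions made for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;

public class Base64RoundTrip {

    /** Hypothetical example type; any class implementing Serializable works. */
    static class Comment implements Serializable {
        private static final long serialVersionUID = 1L;
        final String author;
        final int rating;

        Comment(String author, int rating) {
            this.author = author;
            this.rating = rating;
        }
    }

    /** Serializes the object and returns the bytes as Base64 text. */
    static String serialize(Serializable obj) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // try-with-resources closes (and therefore flushes) the stream.
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(baos.toByteArray());
    }

    /** Decodes the Base64 text and rebuilds the object from its bytes. */
    static Object deserialize(String text) {
        byte[] bytes = Base64.getDecoder().decode(text);
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class of serialized object not found", e);
        }
    }

    public static void main(String[] args) {
        String text = serialize(new Comment("alice", 5));
        Comment copy = (Comment) deserialize(text);
        System.out.println(copy.author + " " + copy.rating); // prints "alice 5"
    }
}
```

Only deserialize strings that come from a source you trust, since reading a serialized object can run code from the classes involved.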


That's it.

Wednesday 1 October 2014

Struts2 File upload using Ajax & Jquery uploadify

Struts2 File upload  using Ajax & Jquery uploadify

         The Struts2 framework provides built-in support for file uploads using "Form-based File Upload". An uploaded file is stored in a temporary directory, and your Action class should move it to a permanent directory to ensure the file is not lost.

   You can download the jquery.uploadify.min.js file from uploadify.org. Include both the JS and CSS files in the page that contains your form.

         File uploading in Struts is handled by a predefined interceptor, the FileUpload interceptor (org.apache.struts2.interceptor.FileUploadInterceptor).

Add the code snippet below to your JSP file (the multiple attribute is optional):

<script src="js/jquery.uploadify.min.js" type="text/javascript"></script>
<link rel="stylesheet" type="text/css" href="css/uploadify.css" />

 ....
<form ....action="uploadfiles.do">
 <input id="file_upload" name="myFile" type="file" multiple="multiple" />
 <div id="queue"></div>

.....
</form>

<script type="text/javascript">
    $('#file_upload').uploadify({
        'formData': {hoardingid: $("#hoardingid").val()},
        'swf'     : 'js/uploadify.swf',
        'uploader': 'hrdimgs.do',
        buttonText:'Add Photos',
        buttonClass:'uploadimg',
        height:22,
        width:100,
        fileSizeLimit:"2MB",
        fileTypeDesc:'Images',
        fileTypeExts:'*.gif;*.png;*.jpg',
        removeCompleted:false,
        fileObjName:'myFile'
    });

</script>

Add the following code in struts.xml file:

    <action name="uploadfiles" class="com.struts2.uploadFileAction">
        <interceptor-ref name="basicStack"/>
        <interceptor-ref name="fileUpload">
            <param name="allowedTypes">image/jpeg,image/gif</param>
        </interceptor-ref>
    </action>
 
 
Now we are ready to implement the action class. Add the following code to your action to save the file to destPath before the temporary copy is lost.
   Properties in your action, each with setter and getter methods:

    private File myFile;
    private String myFileContentType;
    private String myFileFileName;
    private String destPath = Constants.IMGSAVEPATH;

.................

Next, write the upload method. The requests sent by uploadify.swf do not share the session of your normal action, so put the saved file path in the servlet context or save it to your database immediately.

public String uploadImages() {
    try {
        File destFile = new File(destPath, myFileFileName);
        FileUtils.copyFile(myFile, destFile);

        // Keep the saved files in the servlet context so they can be used after the form is submitted.
        HttpServletRequest request = ServletActionContext.getRequest();
        List<File> files = (List<File>) request.getServletContext().getAttribute("uploadedfiles");
        if (files == null) {
            files = new ArrayList<File>();
            request.getServletContext().setAttribute("uploadedfiles", files);
        }
        files.add(destFile);
    } catch (Exception e) {
        e.printStackTrace();
        return ERROR;
    }
    return ActionSupport.NONE;
}
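
The file name used above is supplied by the client, so it may be worth reducing it to its last path segment before building the destination path. The following is a minimal standalone sketch of that copy step using java.nio.file instead of FileUtils. The class UploadStore and the method store are hypothetical names for this illustration and are not part of Struts2 or Uploadify.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class UploadStore {

    /**
     * Copies an uploaded temporary file into destDir, keeping only the
     * last path segment of the client-supplied file name.
     */
    public static Path store(Path tempFile, Path destDir, String clientFileName) {
        // Treat both separator styles alike, then drop any directory part.
        String normalised = clientFileName.replace('\\', '/');
        String baseName = normalised.substring(normalised.lastIndexOf('/') + 1);
        if (baseName.isEmpty() || baseName.equals(".") || baseName.equals("..")) {
            throw new IllegalArgumentException("Unusable file name: " + clientFileName);
        }
        try {
            Files.createDirectories(destDir);
            Path target = destDir.resolve(baseName);
            // Overwrites an existing file with the same name.
            return Files.copy(tempFile, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins for the temporary upload and the permanent directory.
        Path temp = Files.createTempFile("upload", ".tmp");
        Files.write(temp, "demo".getBytes(StandardCharsets.UTF_8));
        Path dir = Files.createTempDirectory("images");

        Path saved = store(temp, dir, "../../photo.png");
        System.out.println(saved.getFileName()); // prints "photo.png"
    }
}
```

In the action above, the same idea would apply to myFileFileName before it is passed to new File(destPath, ...).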


You're done. That's it!