
Apache Mahout

What is it?
How does it work?
Machine Learning
Algorithms
Install

www.semtech-solutions.co.nz

info@semtech-solutions.co.nz

Mahout - What is it?

Machine learning
For large data
Based on Hadoop
But can work on a non-Hadoop cluster
Scalable
Licensed under the Apache License


Mahout - How does it work?

Uses Hadoop MapReduce
Has many supplied algorithms
Supports four use cases
  Recommendation mining
  Clustering
  Classification
  Frequent itemset mining
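
To make the MapReduce idea concrete, here is a minimal single-process sketch of the pattern applied to a frequent-itemset style count. It is not Mahout code and does not use Hadoop; the function names, the sample baskets and the min_support parameter are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(basket):
    """Map step: emit (item_pair, 1) for every pair of items in one basket."""
    for pair in combinations(sorted(set(basket)), 2):
        yield pair, 1


def shuffle(mapped):
    """Group emitted values by key, as the framework does between the phases."""
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: add up the counts collected for one item pair."""
    return key, sum(values)


def frequent_pairs(baskets, min_support):
    """Return the item pairs that occur together in at least min_support baskets."""
    mapped = (kv for basket in baskets for kv in map_phase(basket))
    reduced = (reduce_phase(k, v) for k, v in shuffle(mapped).items())
    return {pair: count for pair, count in reduced if count >= min_support}


# Hypothetical sample data: each inner list is one shopping basket.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_pairs(baskets, min_support=2)
print(result)
```

On a real cluster the map and reduce steps would run in parallel on different nodes; here they run in sequence only to show how the data flows.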


Mahout - Machine Learning


Machine learning - what does it mean?

A branch of artificial intelligence
Systems that learn from data
Classify data after learning
Learn on training data sets
Generalisation - the ability to classify unseen data sets after learning
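
The learn-then-classify cycle can be sketched with a very small nearest-centroid classifier. This is an illustration only, not one of Mahout's algorithms as shipped; the function names, labels and data points are assumptions.

```python
def train(samples):
    """Learn one centroid (mean point) per label from labelled training data."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(model, features):
    """Assign the label whose learned centroid is closest to the new point."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


# Hypothetical training set: (features, label) pairs.
training_data = [
    ([1.0, 1.0], "small"),
    ([1.5, 2.0], "small"),
    ([8.0, 9.0], "large"),
    ([9.0, 8.5], "large"),
]
model = train(training_data)

# Generalisation: neither of these points appears in the training data.
print(classify(model, [2.0, 1.0]))
print(classify(model, [7.5, 8.0]))
```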


Mahout - Algorithms

Some of the available algorithms (among many others)

Collaborative filtering
  Narrow sense - make predictions about user interests by collecting preferences
  General sense - multi-agent collaboration for information filtering

Mean shift clustering
  Mode seeking, used for visual tracking

Parallel frequent pattern mining
  Find unique features
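
As an illustration of collaborative filtering in the narrow sense, the sketch below predicts a rating for an item a user has not seen, weighting other users' ratings by how similar their tastes are. It is a simplified user-based approach, not Mahout's recommender API; the ratings table, the user names and the choice of cosine similarity over shared items are assumptions.

```python
from math import sqrt

# Hypothetical preference data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 5, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
}


def cosine(u, v):
    """Cosine similarity computed over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def predict(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, prefs in ratings.items():
        if other == user or item not in prefs:
            continue
        weight = cosine(ratings[user], prefs)
        num += weight * prefs[item]
        den += weight
    return num / den if den else None


# Alice has not rated item D; the estimate leans towards Bob's rating
# because his ratings are more similar to hers than Carol's are.
print(predict("alice", "D"))
```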


Mahout - Install

So how do we install Mahout and test it?

Install Maven
  sudo apt-get install maven3

Install Apache Mahout
  You will need subversion installed
  svn co http://svn.apache.org/repos/asf/mahout/trunk
  Go to the directory containing the pom.xml file (./trunk)
  mvn install

Full details available in the Mahout install guide on our web site shop


Mahout - Test Install

So let us run a test

  cd $MAHOUT_HOME/examples/bin
  ./build-reuters.sh
  Choose option 1, kmeans clustering
  Should finish with the output shown on the next slide
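
Option 1 in the script selects k-means clustering. For readers who have not met the technique, here is a minimal sketch of the idea on two-dimensional points. It is not Mahout's implementation, which works on document vectors across a cluster; the function names, the sample points and the starting centroids are assumptions.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(cluster):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(coords) for coords in zip(*cluster))


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and moving centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    # clusters holds the memberships from the last assignment step.
    return centroids, clusters


# Hypothetical data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centroids, clusters = kmeans(points, centroids=[(1.0, 1.0), (9.0, 9.0)])
print(centroids)
print(clusters)
```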


Mahout - Test Install

cd $MAHOUT_HOME/examples/bin ; ./build-reuters.sh
Please call cluster-reuters.sh directly next time. This file is going away.
Please select a number to choose the corresponding clustering algorithm
1. kmeans clustering
2. fuzzykmeans clustering
3. lda clustering
Enter your choice : 1
ok. You chose 1 and we'll use kmeans Clustering
.................................
Inter-Cluster Density: NaN
Intra-Cluster Density: 0.0
CDbw Inter-Cluster Density: NaN
CDbw Intra-Cluster Density: NaN
CDbw Separation: NaN

Contact Us

Feel free to contact us at


www.semtech-solutions.co.nz
info@semtech-solutions.co.nz

We offer IT project consultancy
We are happy to hear about your problems
You can just pay for those hours that you need to solve your problems
