Sie sind auf Seite 1von 17

GuruPrevails Machine Learning Course

Basic Math
Examples given in Octave code

Victor K Miclovich

vicmiclovich@{ugandasoft, gmail}.com

December 5, 2009
2

Preface
I would like to welcome you to this tutorial courtesy of GuruPrevails. We
want to make Machine learning a really wonderful experience and I hope
these tutorials shall give you great satisfaction.
Good luck and have fun!

Victor K Miclovich
December, 2009
Uganda
Contents

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1 Introduction 5
1.1 Starting Octave . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.1 Using Octave like a calculator . . . . . . . . . . . . . . 5
1.1.2 Inbuilt functions . . . . . . . . . . . . . . . . . . . . . 6
1.1.3 Some constants to take note of. . . . . . . . . . . . . . 6

2 Linear Algebra basics 7


2.1 Matrix(-ces) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1.1 Representing a matrix . . . . . . . . . . . . . . . . . . 7
2.1.2 Matrix operations... at least some of them . . . . . . . 8
2.2 Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3.1 What is a vector? . . . . . . . . . . . . . . . . . . . . 10
2.3.2 What is a vector space? . . . . . . . . . . . . . . . . . 10
2.3.3 The real definition! . . . . . . . . . . . . . . . . . . . . 10

3 Probability Theory 13
3.1 Simple terminology . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1.1 Events . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1.2 Probability space or Sample space . . . . . . . . . . . 13
3.1.3 Probability . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1.4 Review of basic Probability concepts . . . . . . . . . . 13
3.1.5 Law of Total Probability . . . . . . . . . . . . . . . . . 14
3.1.6 Baye’s Theorem . . . . . . . . . . . . . . . . . . . . . 14
3.2 Where Probability occurs in Octave . . . . . . . . . . . . . . 14
3.3 Functions that are probabilistic in nature . . . . . . . . . . . 15

4 What next? 17
4.1 Alternatives that we are going to use . . . . . . . . . . . . . . 17
4.1.1 What makes Python different from Octave/MatLab? . 17

3
4 CONTENTS
Chapter 1

Introduction

In this section I briefly demonstrate the power of Octave in solving mathe-


matical problems.
Octave is supported by many operating systems: commonly in Microsoft
Windows, Linux and Mac OS... In this tutorial, I will target an audience
with a strong UNIX1 . So get ready...

1.1 Starting Octave


1. Start your terminal2
2. Type Octave3
3. Start hacking; you should see a prompt something like this
octave:1>

1.1.1 Using Octave like a calculator


Octave can be used like a calculator; this just means you could type math-
ematical expressions and have a result come back to you straight away.

Addition
octave:1> 3 + 4
=>7
This is a simple example... and almost any kind of operand4 will work.
1
Don’t worry if you don’t have strong UNIX background... similar things can get done
in Windows
2
Windows users should look for Octave under the Programs menu of their machine;
install Octave if you don’t have it.
3
Windows users will probably click under the programs menu... and a terminal should
pop out.
4
In the above example, the operands are 3 and 4

5
6 CHAPTER 1. INTRODUCTION

Subtraction
octave:1> 5 - 2
=>3

Multiplication
octave:1> 5 * 2
=>10

1.1.2 Inbuilt functions


There are many inbuilt functions in Octave. Examples include sin, cos, tan,
sqrt, etc.
octave:1>sqrt(4)
=>2
octave:2>sin(pi / 2)
=>1
octave:3>sin(90)
=>0.89405

1.1.3 Some constants to take note of.


octave:1>pi
=>3.1416
octave:2>e
2.7183
There are many other constants in science and math that even I may not
know about because that’s not my specialty... physicists for example will
have their known set of constants, chemists have theirs... Mathematicians
like I have our own special set of constants that we use...
The next sections of this tutorial are going to be on more specific topics.
Enjoy reading :)

5
trigonometric functions usually take radians an input... e.g. 90◦ is π
2
Chapter 2

Linear Algebra basics

2.1 Matrix(-ces)
2.1.1 Representing a matrix
A matrix is just an array of numbers. These numbers could be real numbers,
integers or some other system of numbers or numerics.
Mathematical representation of an m × n matrix

 
a1,1 a1,2 ··· a1,n
 a2,1 a2,2 ··· a2,n 
Am,n = .
 
.. .. .. 
 .. . . . 
am,1 am,2 · · · am,n
In Octave I show several types of matrices
Row matrix

octave:1>A = [3,4,5]
=>
3 4 5

Column matrix

octave:2>A = [3; 4; 5]
=>
3
4
5
A 2 × 3 matrix
octave:3>A = [1, 2, 3; 4, 5, 6;]
=>

7
8 CHAPTER 2. LINEAR ALGEBRA BASICS

1 2 3
4 5 6
or octave:3>A = [ [1, 2, 3]; [4, 5, 6]]
=>
1 2 3
4 5 6

A 3 × 3 matrix

octave:4>A = [ [1, 2, 3]; [4, 5, 6]; [7, 8, 9] ]


=>
1 2 3
4 5 6
7 8 9

2.1.2 Matrix operations... at least some of them


I am going to use the 3 × 3 matrix A used in the previous sub-section.

Adding two matrices

octave:1># given two matrices A and some B


octave:2>A = [ [1,2,3]; [4, 5, 6]; [7, 8, 9] ];
octave:3># a semi-colon at the end of the statement above prevents
Octave from print A octave:4>B = [ [10, 11, 12]; [13,14,15]; [16,17,18]
];
octave:5>A + B
=>
11 13 15
17 19 21
23 25 27

Please, don’t get confused by my use of #; this is put inside your program
to help you read and understand the logic of your program.

It is good practice to keep commenting your programs... any program-


ming language’s decoder will understand that it does not have to execute any
instructions in the line that has the ’#’. One can use a ’%’ symbol to com-
ment. I’ve just decided to use ’#’ symbol at the start of that line.
Other programming languages have other ways in which comments are
placed in their code.

Scalar products
2.1. MATRIX(-CES) 9

octave:1> newA = 2 * A;
octave:2># remember to always give A some values. You can test
other values...
octave:3># A is a variable; a place in computer memory where we
stor the values octave:4># in this case those values are entries/elements
of an array
octave:5> newA
=>
2 4 6
8 10 12
14 16 18

Take note of my use of newA; this is called variable... and the line
newA = 2 * A computes a value and puts it under that name... so I can
use newA anywhere in my program. Please read the Octave manual to get
other concepts in programming using the Octave language. Don’t forget to
PRACTICE

Finding a transpose of matrix/vector


A definition of transpose is in this footnote1

octave:1> A = [3,2,3; 1,3,6];


octave:2> A transpose = A’;
octave:3> A transpose
=>
3 1
2 3
3 6

Determinants
Determinants are usually calculated for square matrices n × n; you might
turn up with some nasty errors if you work with singular or m × n matrices
where m 6= n

octave:1>A = [2,3,4];[3,15,20];[7,8,1];
octave:1>det(A)
=> -203.00

1
Mathematical definition of a Transpose of a matrix: Given an m × n matrix, it’s
transpose is just that matrix with rows interchanged for columns... an n × m;

[A]Tij = [A]ji
10 CHAPTER 2. LINEAR ALGEBRA BASICS

2.2 Assignment
You are going to test the following functions; study the math and octave
code too!

1. Given ~v = (3, 2, 3), ~u = (4, 6, 7), Find ~v .~u


v = [3 2 3];
u=[4 6 7];
dot(v,u)

2. Using the above vectors, get the cross product, ~v ×~u [Hint: cross(x,y)
gives ~x × ~y ]

3. Think of other vectors... try out 2 dimensional, 3-D, 4-D and n-


dimensional vectors. Also, take note of my exclusion of a comma
i.e. v = [3 2 3] is the same as v = [3, 2, 3]

2.3 Vector Spaces


2.3.1 What is a vector?
A vector is a line segment whose length is its magnitude and whose orien-
tation in space is its direction.

2.3.2 What is a vector space?


A vector space is a mathematical structure formed by a collection of vectors.

2.3.3 The real definition!


Let V be a set on which addition and scalar multiplications are defined2 . If
that is the case, then for all objects3 ~u, ~v , and w
~ in V and all scalers c and
k then V is called a vector spaces and the objects in V are called vectors
Below are some axioms we look at:

1. ~u + ~v is in V 4

2. c~u is in V 5

3. ~u + ~v = ~v + ~u

4. u + v = v + u
2
if ~
u and ~v are objects in V and c is a scalar, then we can in some way, define ~v + ~
u
and c~ u or c~v .
3
we use objects here because vector has 2 things its describing: orientation and
length/size... which makes the term sound more ”important”.
4
closed under addition
5
closed under scalar multiplication
2.3. VECTOR SPACES 11

5. ~u + (~v + w)
~ = (~u + ~v ) + w
~

6. zero vector is a special vector in V space.

7. For every ~u in V there’s another object in V , denoted −~u and called


the negative of ~u such that ~u + (−~u) = 0 or ~u − ~u = 0

8. c(~u + ~v ) = c~u + c~v

9. (c + k)~u = c~u + k~v

10. c(k~u) = (ck)~u

11. 1 × ~u = ~u

From your knowledge of the previous sections, implement these axioms for
the following vectors... if this isn’t enough you can set your questions :)

(a)
~v = (2, 7)
octave:1> v = [-13; -25]
Use the constants c = 3, 5, 6, k = 3, 5, 6 And test using the Octave
interpreter6

6
another name for the terminal that you see on running Octave
12 CHAPTER 2. LINEAR ALGEBRA BASICS
Chapter 3

Probability Theory

3.1 Simple terminology


3.1.1 Events
An event is an outcome or occurrence that has a probability assigned to it.

3.1.2 Probability space or Sample space


These are all the possible outcomes of an experiment or series of events. We
can represent this mathematically using an S

3.1.3 Probability
To find the probability of an event A
n(A)
P (A) =
n(S)
At times we may need to find the complement of some probability A0 , that
event that does not occur.
P (A0 ) = 1 − P (A)

3.1.4 Review of basic Probability concepts


To find the probability of getting event A or B,
P (A ∪ B) = P (A) + P (B) − P (A ∩ B)
For mutually exclusive events P (A ∩ B) is zero! So that means you just add
up the probabilities of events A and B occurring! Conditional probabil-
ities are measure the probability of one event occurring relative to another
occurring.
P (A ∩ B)
P (A|B) =
P (B)

13
14 CHAPTER 3. PROBABILITY THEORY

This can easily be written as:

P (A ∩ B) = P (A|B) × P (B)

For some extra credit and work... please re-familiarize yourselves with the
probability tree... I won’t do that here...
Probability trees make calculations of conditional probabilities easy... con-
ditional probabilities occur a lot with decision making algorithms, recursive
depth search and many other algorithms you are gonna encounter in your
Machine learning careerer.

3.1.5 Law of Total Probability


If you have two events A and B, then

P (B) = P (B ∩ A) + P (B ∩ A0 )
= P (A)P (B|A) + P (A0 )P (B|A0 )

This has an interesting attribute1 .

3.1.6 Baye’s Theorem


If you have n mutually exclusive and exhaustive events A1 , A2 , through to
An , and B is another event, we have then:

P (A)P (B|A)
P (A|B) =
P (A)P (B|A) + P (A0 )P (B|A0 )

The Baye’s Theorem is used in computing as a way of filtering emails and


detecting spam... most training sets use Baye’s theorem; and this teaches a
”program”/machine to easily detect commonly used words in Junk mail or
SPAM; and so we can easily classify and cluster emails as correctly addressed
and due to you and the ones that head off to your SPAM box.

3.2 Where Probability occurs in Octave


Probability in computers is in many causes referred to as ”psuedo-random”...
calculations that are made by a bunch of transistors and crystals2 , and other
complicated electronics stuff that is set to compute something that is seem-
ingly ”random” and that is what we use to mimic an un-planned-for event
– an accident of chance!
So... these are just a few cases where probability is used...

1
The equation above is the denominator of Baye’s Theorem
2
timers
3.3. FUNCTIONS THAT ARE PROBABILISTIC IN NATURE 15

3.3 Functions that are probabilistic in nature


octave:1>rand()
The rand() function returns a matrix with random elements uniformly dis-
tributed on the interval (0, 1).
16 CHAPTER 3. PROBABILITY THEORY
Chapter 4

What next?

During the next few weeks we shall be looking at probability distributions


and have them implement in Octave... this is a rather intensive part of this
course... that of being able to convert normal mathematics into understand-
ing!
I shall however post as many resources as I can... please try to look through
both the code and references that I leave.
The next tutorial shall be about generation of vectors! Watch this space!

4.1 Alternatives that we are going to use


I have written this section to encourage you all to get in times with Python.
Python some fabulous support for many algorithms and scientific computing.

4.1.1 What makes Python different from Octave/MatLab?


• It is an object oriented interpreted programming language

• It has extensive support

• It is relatively fast; but if more speed is need extra programming can


be done in C and/or C++

• There’s a lot more fun stuff... like libraries that let you play with the
web; you can implement features like web scrappers, SPAM filters,
signal analysis, training algorithms, etc.

During the course, we shall look at python and more math in detail.

17

Das könnte Ihnen auch gefallen