
Developers of Automated Space Robotics

Devon Ash, Founder of DASpR Inc.

A Robot's Software

Table of Contents
1 Introduction
2 Prerequisites
   2.1 Knowledge
      2.1.1 Math
      2.1.2 Computer Science
   2.2 Skills
      2.2.1 Researching
      2.2.2 Reading Mathematical Equations
      2.2.3 Understanding The Symbols
      2.2.4 Implementing Algorithms via Mathematical Notation
3 Getting Ready
   3.1 Gather Your Tools
   3.2 Linux
   3.3 Source Control
   3.4 The Agena Book Discipline
4 Third-Party Robotics Platforms
5 Hardware
6 Software Hierarchy
7 Vision Solutions
8 Speech Solutions
9 Navigation Solutions
10 Software Tools for the Job
11 Abbreviations and Symbols
12 Definitions
13 Bibliography



1 Introduction
The purpose of this paper is to show how one can implement the software in a robot. When the team Thunderbots@Home at the University of British Columbia, of which I have been a member since October 2012, started researching how to write software for a robot that would live in our homes, we found no good compiled, general resources to study from. As such, we had to spend the first year building a foundation of knowledge to start the project. I hope, by writing this paper, I can cut away the fat of writing software for a robot by providing the resources we looked at and describing them in ways a beginner can update their prior knowledge distribution with.

Firstly, to initialize the weights in the reader's hidden units, I will provide a quick overview of how I will discuss the material in this paper. I feel strongly that the brain learns like the algorithms used in Machine Learning, and so I will structure the book as if you were each a Deep Neural Network. This means that each section will first have a small intro to initialize your weights, an analogy so that your brain can learn through transfer learning, and questions at the end of each section to test how good your models were at generating the data. Moreover, each section will also include additional resources at the end to further update your priors. If you get the questions wrong, read one of the alternative sources and attempt the questions again, AFTER a break. Your brain takes time to update via back-propagation, or whatever method it may use. Using a different set of data to update your models should give you more insight in generating the proper answers and increase your accuracy, because some articles weigh their concepts differently.

2 PREREQUISITES
For you to do well with the material in this paper, it would help if your models have their prior distributions trained up; that is, you should have the mathematical and computer science knowledge to understand what is being said in the paper. In addition, you should have some skills that I realized were required in the pursuit of building a robot in software. If you have these skills, or recognize what is being said here, then skip this section. I am assuming you have the knowledge explained in the section below (2.1), and thus I will not explain where to find the knowledge on those topics. The skills section is more in-depth and is more specific to the goals of this paper. These are skills I wish had been explained to me on the journey of developing the software for a robot.

2.1 Knowledge

2.1.1 Math

Since this book is about a robot's software, and the core algorithms are in the machine learning discipline, we will need some mathematical knowledge. The mathematical theory that machine learning is built on requires knowledge from statistics, probability, and optimization. For understanding the theory behind algorithms like the Support Vector Machine, linear programming is preferred. However, if you know how to read mathematical equations for their concepts, then you don't need in-depth knowledge of each of the fields; you can derive the concept and purpose each time you read the symbols. This removes the need to memorize and conceptualize. This abstraction then makes the interesting equations you see just like another word in your mathematical vocabulary. This is covered in Section 2.2.2.

2.1.2 Computer Science

For more information on a good idea of a complete software stack to use, and which one I used, see Chapter 10.

2.1.2.1 Operating Systems

For implementing the software, I have used, and you should also use, a Linux distribution. I prefer a stable distribution such as Ubuntu 12.04 because it has long-term support. You should also understand how threading works, as it is crucial for optimizing performance in practice. Caching and low-level computer workings such as instructions, memory reads, and memory hierarchies are good to know for debugging and improving speed. You'll need to know why passing a matrix by value is eating up your CPU cycles. A working knowledge of how a GPU works is extremely useful when learning how to use tools like OpenMP, CUDA, and other GPU/optimization control technologies.
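As a small illustration of the matrix-by-value point, here is a minimal sketch of the same computation done with a matrix passed by value and passed by const reference. The Matrix struct and the function names are stand-ins invented for the example, not types from any particular library.

// Minimal sketch: why passing a large matrix by value costs CPU cycles.
// The Matrix type is a placeholder, not a specific library class.
#include <vector>

struct Matrix {
    std::vector<double> data;   // e.g. a 640x480 image stored as doubles
};

// Pass by value: the entire data buffer is copied on every call.
double sumByValue(Matrix m) {
    double s = 0.0;
    for (double v : m.data) s += v;
    return s;
}

// Pass by const reference: no copy, the caller's buffer is read in place.
double sumByRef(const Matrix& m) {
    double s = 0.0;
    for (double v : m.data) s += v;
    return s;
}

int main() {
    Matrix img;
    img.data.assign(640 * 480, 1.0);
    double a = sumByValue(img);   // copies roughly 2.4 MB of doubles first
    double b = sumByRef(img);     // touches the original buffer only
    return (a == b) ? 0 : 1;
}

In a vision pipeline that call happens for every frame, so the const-reference version is the habit to build early.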

2.1.2.2 Languages

The languages you will need to know, for speed and implementation, are C++ and C. For connectivity and modularity within your system, Java and Python. Python also has key libraries such as Theano and scikit-learn, and I will discuss those in Chapter 10. Being able to use object-oriented programming is definitely helpful, as it classifies each concept and can help you modularize and think clearly when designing a software hierarchy.

2.1.2.3 Distributed Computing

Distributed computing is not necessary to get your robot working. However, it is necessary for optimizing algorithm training times and real-time application speed. You should know how to structure your program so that it can take advantage of multi-processor and multi-computer networks and minimize the time required to find solutions. Networking is obviously a prerequisite to distributed computing; a basic example is understanding whether to use UDP or TCP for sending a video stream to another PC for preprocessing, as sketched below.
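To make the UDP versus TCP choice concrete, below is a minimal sketch of the UDP side using standard POSIX sockets on Linux. It is not code from the Thunderbots project; the receiver address 192.168.1.10, port 5000, and the 1400-byte chunk size are placeholder assumptions.

// Minimal sketch: sending one chunk of a raw video frame over UDP.
// UDP favours low latency over reliability, which is usually the right
// trade-off for live video; TCP (SOCK_STREAM) would retransmit lost data
// at the cost of added delay.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <vector>

int main() {
    int sock = socket(AF_INET, SOCK_DGRAM, 0);           // SOCK_DGRAM = UDP
    if (sock < 0) return 1;

    sockaddr_in dest{};
    dest.sin_family = AF_INET;
    dest.sin_port = htons(5000);                          // placeholder port
    inet_pton(AF_INET, "192.168.1.10", &dest.sin_addr);   // placeholder receiver

    std::vector<unsigned char> frame(1400, 0);            // one MTU-sized chunk of a frame
    sendto(sock, frame.data(), frame.size(), 0,
           reinterpret_cast<sockaddr*>(&dest), sizeof(dest));

    close(sock);
    return 0;
}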

2.2 Skills

Some skills are described below, with their motivations, that will be beneficial on your journey of learning how to write a robot's software. They consist of learning how to do the research necessary to implement the most powerful, cutting-edge technologies; reading mathematical notation to understand the dense concepts and novel methods presented in papers; and implementing algorithms from mathematical symbols.

2.2.1 Researching

When I first started getting into researching robotics, it was not obvious that there were techniques and efficient methods of absorbing the dense knowledge bases in each field. Furthermore, there was no one to point out (maybe obvious?) facts like: research takes a lot of time! Even the professionals take a week to a month to understand the novel technologies presented in papers. There is a reason why people specialize in one discipline initially: when you read a cutting-edge paper, you also have to read its references to understand a method it may be using, and you may continue to do that until you reach something you already understand. You will have to consult Wikipedia, Google, and reference textbooks for all the words you are unsure of, and prove the concepts to yourself to finally understand them. The best way, in my opinion, to understand the novel methods presented is to implement the algorithms in software and see them working yourself. However, to do that, you must be able to read the mathematical symbols and interpret them in terms of software engineering concepts.

2.2.2 Reading Mathematical Symbols

Reading and understanding the symbols in papers, and in general mathematical presentations, I find, is more important than learning the actual concepts in each novel method. If you learn to understand how symbols interact, and the basis functions of each, you can easily derive how one came to the conclusion of that method and how to visualize it in your head. Once you become good at this, you will not even have to read the paper for a textual understanding (except for looking for explanations of symbols). You could skip right to the mathematical formulations and visualize them just by reading the definitions and symbols, the most efficient use of your time! This section will use the following mathematical formula, the overall cost function for the back-propagation algorithm, cited from the secondary paper about Sparse Autoencoders (Secondary, [1]), to demonstrate how to read and understand the symbols:

J(W, b) = \frac{1}{m} \sum_{i=1}^{m} J(W, b; x^{(i)}, y^{(i)}) + \frac{\lambda}{2} \sum_{l=1}^{n_l - 1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}} \left( W_{ji}^{(l)} \right)^2
        = \frac{1}{m} \sum_{i=1}^{m} \left[ \frac{1}{2} \left\| h_{W,b}(x^{(i)}) - y^{(i)} \right\|^2 \right] + \frac{\lambda}{2} \sum_{l=1}^{n_l - 1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}} \left( W_{ji}^{(l)} \right)^2        (1)

2.2.2.1 Reading Symbols

A key part for me, when reading the functions, was to first integrate the definitions into my understanding. Here, I will show an algorithm I use to read the symbols.

1. Read the basic definitions. Assign the definitions to variables in writing by copying down the defined term and relating it to its symbol in the equation. Firstly, in the paper about Sparse Autoencoders (Secondary, [1]), I would look at the overall definition of the back-propagation cost, which is (1). We can see that the equation for the cost function with respect to a single training example is:

J(W, b; x, y) = \frac{1}{2} \left\| h_{W,b}(x) - y \right\|^2        (2)

So we first look at that, the building block of the overall cost function. What do the symbols W, b, x, and y mean? You should know from linear algebra that W represents a matrix. It is the weight matrix we're taking in for our system. To find b, we have to look for the definition in the reading. It is found in the footnote of that page and is defined as the bias term b_i^{(l)}. The subscript i represents the bias of the i'th example, and l represents the l'th weight-matrix iteration in our system (or layer, inferred from the fact that it is an l). Most definitions, if it is a good paper, can be read from the paragraphs surrounding the introduction of that variable. If there is no description of it, it is most likely common notation in that discipline and you are assumed to understand it; try looking at a referenced paper in that case. Next, we continue our reading exercise and look for the definitions of the variables x and y. At the top of the page, we see them defined as

(x^{(i)}, y^{(i)}). x represents a data point in machine learning, and adding the superscript (i) to it just means it's the i'th data point. Moreover, they tell us that y^{(i)} is a real-valued vector; we also know that y is a multi-dimensional output, as per the reading. Hence, we now know all the inputs to the function J and know what each symbol represents in the system.

2. Abstract the basic definitions. When you were younger and first learning to read, you first learned the characters of words. In the paragraphs just prior to this one, you defined the meaning of those characters. We can now compose the characters to form mathematical words; in analogous terms, the functions of math. We see (1) defined as J(W, b), the overall cost function, and the first part of the addition,

\frac{1}{m} \sum_{i=1}^{m} J(W, b; x^{(i)}, y^{(i)}),

appears to be summing over J(W, b; x, y), which is defined earlier as (2). If you know what the formula for an arithmetic mean is from statistics, you can see the similarity to \frac{1}{n} \sum_{i=1}^{n} a_i. We can infer that this term is only computing the arithmetic mean of the sum-of-squares error, also defined in the paragraph below (8) in (Secondary, [1]). There is an averaging over something, we can assume, and in this case that something is (2), the squared-error cost function. Again, if the paper is worth its weight, it should then define the bigger mathematical terms in whole after defining the symbols. Secondly, we can look at the second term in (1),

\frac{\lambda}{2} \sum_{l=1}^{n_l - 1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}} \left( W_{ji}^{(l)} \right)^2,

and either deduce from our own knowledge that this is a regularization term or read the definition given in the paper. Now that we know what these things are representing, it is time to understand what the symbols are trying to say conceptually. Analogously, we can know what words mean, but their meanings change depending on context in a sentence, or when paired with other words.

2.2.3 Understanding The Symbols

Understanding, in terms of machine learning, means being able to generate new things from a distribution of data. Our brains do something similar. In order for us to understand something, and know we understand it, we must be able to generalize and generate new data that is a combination of the old data, but not necessarily the old data, and then prove it works by testing the generalization accuracy and soundness. For us to understand the example provided in the prior section, (1), we should be able to do something similar. We should be able to explain why term one is an averaging term, what makes term two a regularization term, and why a weight decay is necessary. Once we are able to understand the effects of an averaging term and a regularization term, we can abstract to further our understanding of J(W, b). Before we can abstract, we must understand. That is how, in the previous step involving the definition of symbols, we were able to abstract without intentionally understanding. When I say without intention, it is because we already understood the meanings of words in text and we transferred our knowledge from the concept into the symbol, thereby inheriting the properties of the knowledge with the symbol as the child. The next step is to understand how each symbol interacts with each other symbol. After we have looked at how the symbols in each term interact with each other, we can look at how the terms interact with each other in order to come to the final conclusion of how the overall cost function J(W, b) works conceptually.

2.2.2.2 Abstraction of The Basis Function

The equation in (2) was learned to be a squared-error cost function. Either you knew what the structure of a squared-error function looked like, or you read the definition from the reading. That is, we understood what concept was being conveyed from the way the symbols were interacting and the way they were placed. If you did not pick it up from the text, nor did you have prior knowledge, that is still okay, and you are still able to derive it with a bit more thought. Firstly, I'd look at the left and right double lines. These are the norm specifiers. Since we know what this does, we can see that it's taking the norm of the difference between the function h_{W,b}(x) and the output vector y. You can then visualize in your head the difference between some function's value and some function's approximated value. That is to say, we're looking at the error between an approximation and the actual value. Then we square it and divide it by two. We now have a basic idea of how the symbols interact, and we can begin using them in the bigger picture.
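As a quick translation of that reading into code, here is a minimal sketch of the basis function (2); the function name and the plain-vector representation of h and y are assumptions made for the example, not notation from the paper.

// Minimal sketch of the per-example cost in (2): 0.5 * ||h - y||^2.
// h is the network's output for one example, y is the target value.
#include <vector>

double squaredErrorCost(const std::vector<double>& h, const std::vector<double>& y) {
    double norm2 = 0.0;
    for (std::size_t k = 0; k < h.size(); ++k) {
        double diff = h[k] - y[k];   // element-wise error
        norm2 += diff * diff;        // squared Euclidean norm
    }
    return 0.5 * norm2;              // divide by two, as in (2)
}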

2.2.2.3 Abstraction of Term One

Abstracting term one involves combining the concept of the basis function and our concept of the arithmetic mean. The arithmetic mean in general is \frac{1}{m} \sum_{i=1}^{m} a_i, and in this case a_i = J(W, b; x^{(i)}, y^{(i)}). Combining the two in a textual representation gives us: take the arithmetic mean of the squared-error cost function over all training examples, as stated in (Secondary, [1]). Mathematically, it's also quite simple to see what is going on, and so I'll leave that as an exercise for the reader to visualize.
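A minimal sketch of term one in code, assuming the per-example costs J(W, b; x^{(i)}, y^{(i)}) have already been evaluated into a plain vector (an assumption made purely for the illustration):

// Minimal sketch of term one: (1/m) * sum over i of J(W, b; x(i), y(i)).
// perExampleCost holds the already-evaluated cost of each training example.
#include <vector>

double termOne(const std::vector<double>& perExampleCost) {
    double sum = 0.0;
    for (double c : perExampleCost) sum += c;                   // the summation over i = 1..m
    return sum / static_cast<double>(perExampleCost.size());    // divide by m (assumes m > 0)
}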

2.2.2.4 Abstraction of Term Two

To start with term two, we could either cheat and say we know it's a regularization term from the reading or prior knowledge; instead, we will deduce it from the positions and meanings of the basis symbols. In math, I find it's almost always better to read from inner to outer when putting things together. Starting with (W_{ji}^{(l)})^2, we can see that the summations are looking at specific elements inside the weight matrices. The l represents which weight matrix we're looking at, and each combination of j and i is an element inside that weight matrix. The summations basically say "for all", just like a for loop in programming. With that being said, we're looking at all the layers, and all the elements in the matrix W for that layer, taking the square of each element, and applying the weight decay term \frac{\lambda}{2} to it. What could this possibly mean? Imagine you have some graph of a function over its data points, and it goes up and down wildly near high-input cases. When we square the output and apply \frac{\lambda}{2}, we're penalizing the values that are really large, because larger values get even larger when squared relative to smaller squared numbers. This is regularization. Now that we know the underlying concept of term two, we can call it our regularization function and ignore the smaller symbols and how they interact, because we know how regularization works, and that is all that will matter in understanding J(W, b).
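A minimal sketch of term two, with each layer's weight matrix represented as a plain two-dimensional vector; the representation and the function name are illustrative assumptions, not the paper's notation:

// Minimal sketch of term two: (lambda/2) * sum over every entry W(l)_ji squared,
// i.e. a weight-decay penalty over all layers.
#include <vector>

double termTwo(const std::vector<std::vector<std::vector<double>>>& W, double lambda) {
    double sumSquares = 0.0;
    for (const auto& layer : W)          // "for all" layers l, one weight matrix each
        for (const auto& row : layer)    // "for all" rows of that matrix
            for (double w : row)         // "for all" entries in the row
                sumSquares += w * w;     // square each element, like the nested summations
    return 0.5 * lambda * sumSquares;    // apply the lambda/2 weight decay factor
}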
2.2.2.5 Combining the Concepts from Term 1 and Term 2

Term 1, our average squared-error function, and Term 2, our regularization function, compose together to define the meaning of J(W, b). We are simply adding them together to get an overall cost for a given weight matrix W and some bias b. One could then assume that the function is normalizing and regularizing the data before taking the partial derivatives with respect to the weights.

2.2.4 Implementing Mathematical Notation

The next step after learning the basic characters and words when you were younger was to understand how to put them together. Now that you understand how to put mathematical symbols and functions together and abstract as you read, you should be able to generate and transfer the concepts you learned in the theoretical (math) realm into the practical (programming) realm. To do so, we will work through what each symbol means in programming terms. The reason this section is here is because when we at Thunderbots first started researching how to make a robot, and the algorithms that would get us where we needed to be, we found, as undergrads, that this was a difficult task. As such, we will go over how to transfer the functions found in (1) and (2) into C++ code, in the hope that it will allow you to transfer the concepts of translation onto other functions later in the paper.

Turning the Basis Function into a C++ Function

Firstly, we will follow the same algorithm presented earlier when we were reading what the function was. We will define our variables with the definitions they're represented with. But before that, we must define the function's inputs and outputs and declare our function. Clearly the output is either an integer or a real, but to be safe we should use double precision. To get that, we look at J(W, b; x, y) and note that our inputs will be a matrix W, a bias term set b, an input feature set x, and the relative output set y.

// Basis squared-error cost function
double J(const Mat& W, const double* b, const double* x, const double* y)
{
    ...
}

To fill in the function body, we have to look at the body of our mathematical definition. When we assign something to a function, we can equate it in programming terms with the code between the braces { and }. The first thing we should notice is that there is a hypothesis evaluation function h_{W,b}(x), and so we

should code that one in too. The hypothesis function is just the output value of the network given the current data, and since we don't have its definition we won't be able to code it exactly, except for the structure of the function. The output should be an array of doubles, which should be of the same dimensionality as the vector y.

// Hypothesis function
double* hypothesis(const Mat& W, const double* bias, const double* x)
{
    ...
}
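Putting the two terms together, the sketch below assembles the overall cost (1) in the same illustrative style. Because the paper does not reproduce the hypothesis function's definition, it is passed in as a callable here (with the bias folded into it); the container aliases and parameter names are assumptions made for the example, not the paper's interface.

// Minimal sketch of the overall cost J(W, b) from (1): the mean of the
// per-example squared errors plus the lambda/2 weight-decay penalty.
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<std::vector<double>>;

double overallCost(const std::vector<Mat>& W,                       // one weight matrix per layer
                   const std::function<Vec(const Vec&)>& hyp,       // stands in for h_{W,b}(x)
                   const std::vector<Vec>& X, const std::vector<Vec>& Y,
                   double lambda) {
    // Term one: average of the per-example squared-error costs.
    double dataTerm = 0.0;
    for (std::size_t i = 0; i < X.size(); ++i) {
        Vec h = hyp(X[i]);
        double norm2 = 0.0;
        for (std::size_t k = 0; k < h.size(); ++k) {
            double diff = h[k] - Y[i][k];
            norm2 += diff * diff;
        }
        dataTerm += 0.5 * norm2;
    }
    dataTerm /= static_cast<double>(X.size());

    // Term two: the lambda/2 weight-decay penalty over every weight entry.
    double decayTerm = 0.0;
    for (const Mat& layer : W)
        for (const Vec& row : layer)
            for (double w : row)
                decayTerm += w * w;
    decayTerm *= 0.5 * lambda;

    return dataTerm + decayTerm;   // J(W, b) as in equation (1)
}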

7 Vision Solutions
Introduction

Vision encapsulates the domain of problems wherein we infer from information taken from HD cameras, LIDARs, infrared, stereo cameras, and other sensors that use light as a source of information. In the past, researchers worked on hand-crafting feature detectors and extractors to feed into complex systems of classifiers, as well as hand-crafted pre-processing techniques that would clean and normalize our data. The problems in vision are similar to the problems primates solve with their vision systems. For example, how do we detect a face in the jungle? How do we recognize whether the face is known or unknown? These tools would be useful for a robot to have in order to communicate with humans and other beings. Then, on the inanimate spectrum of problems, we have things like background extraction, scene recognition, and action recognition. Of course, there comes up the side discussion of what the difference is between detection and recognition. Detection is noticing when an event has occurred, and recognition is determining what event it is. This section will discuss the vision systems robots will use to take control of their world, methods of optimizing these solutions, the overall flow, the tried-and-true and state-of-the-art solutions, and how deep learning is going to revolutionize vision (and, not to side-track, but this applies to audio, actions, and control systems as well).

7.1 Flow of Vision Data
7.2 Hierarchy of Vision Problems
7.3 Vision Problems
7.4 The Computerized Vision System


7.5 The Biological Vision System
7.6 Tried-and-true Vision Solutions
7.7 State-of-the-Art Vision Solutions
7.8 Optimizing your System
7.9 Deep Learning and its Role in Vision



Bibliography

Secondary sources


[1] Ng, Andrew: CS294A Lecture Notes, Stanford, URL: www.stanford.edu/class/cs294a/sparseAutoencoder.pdf [Accessed 07/11/2013].

