
Software Carpentry

Sunil Mohan Adapa


sunil at medhas dot org

Some content derived from Software Carpentry Lecture Material http://software-carpentry.org/license/


This work and the original are under the Creative Commons Attribution 3.0 License
http://creativecommons.org/licenses/by/3.0/

About the Tutorial

Introductory

Hands on

Interactive

Software Carpentry for academics and research in any discipline

Makes software work easier

Enables new kinds of work

Gets work done faster

Summary

The Unix Shell

Regular Expressions

Make

Version Control

Python

Unix Shell

About Shell

Why use the command line when we have a GUI?

Typical shell: bash

Terminal programs: gnome-terminal, Konsole, xterm, PuTTY

Example Use Cases

Mr. A wishes to retrieve all files modified last week and replace the phrase "this week" with "next week" in those files.

Every day, Mr. B wishes to automatically retrieve all files modified on that day and back them up to a different location.

Mr. C likes to rename files so that their extensions are removed.

Mr. D likes to merge five sets of user lists into a single one.

File system

ls to list the files in the directory

ls -l to list files with extra information

pwd to show the current directory

cd <path> to switch to a directory

cd to switch to home directory

/ is the top most directory. It is also the path separator

. is the current directory

.. is the parent directory. /home/user/work/.. is the same as /home/user
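
The path rules above can be tried out without touching any real files. The following is a minimal Python 3 sketch using the standard posixpath module; the paths and the file name notes.txt are made up for illustration.

```python
import posixpath

# normpath resolves "." and ".." textually; it never looks at the disk.
parent = posixpath.normpath("/home/user/work/..")
same_dir = posixpath.normpath("/home/user/./work")

# join glues components together with "/" as the separator.
joined = posixpath.join("/home/user", "work", "notes.txt")

print(parent)    # /home/user
print(same_dir)  # /home/user/work
print(joined)    # /home/user/work/notes.txt
```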

File System Structure

/root and /home store user data

/bin, /usr/bin, /sbin and /usr/sbin store executable commands

/usr stores files related to user applications

/usr/local contains applications compiled by the user

/var contains (variable) files that usually grow over time

/lib contains libraries

/tmp contains temporary files

/proc is a virtual file system containing kernel information

/mnt and /media contain file system mounts

Manipulating Files

cp copies one file to another file or directory


mv renames a file or moves it to another directory, overwriting any existing file of the same name

rm deletes files

rm -rf deletes files and directories recursively, without asking for confirmation

mkdir creates a directory

rmdir removes an empty directory

File Permissions

ls -l shows ownership and permissions of a file

chmod changes the permissions

chown changes the ownership

su switches the current user by launching a new shell

Redirection

ls > out stores the output of ls in the file out

cat concatenates files and input given to it

cat < out reads the contents of the file out and provides it as input to cat

sort < out > sorted sorts the contents of the file out and stores the result in the file sorted

| (a pipe) redirects the output of one command to another: ls | sort > sorted
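
To make the three stages of ls | sort > sorted concrete, here is a rough Python 3 stand-in. It is only a sketch: the function name list_sorted_to_file and the sample file names are hypothetical, and Python sorts by character code while the shell's sort follows the current locale, so the order can differ for mixed-case names.

```python
import os
import tempfile

def list_sorted_to_file(directory, out_path):
    """Rough stand-in for: ls directory | sort > out_path"""
    # "ls" stage: names in the directory, skipping hidden files as plain ls does.
    names = [name for name in os.listdir(directory) if not name.startswith(".")]
    # "| sort" stage: the names go straight to the next step, no temporary file.
    ordered = sorted(names)
    # "> out_path" stage: create or truncate the file, then write one name per line.
    with open(out_path, "w") as out:
        out.writelines(name + "\n" for name in ordered)
    return ordered

with tempfile.TemporaryDirectory() as data_dir, tempfile.TemporaryDirectory() as out_dir:
    for name in ("pear.txt", "apple.txt", "mango.txt"):
        open(os.path.join(data_dir, name), "w").close()
    sorted_path = os.path.join(out_dir, "sorted")
    result = list_sorted_to_file(data_dir, sorted_path)
    with open(sorted_path) as handle:
        written = handle.read()

print(result)
```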

Some More Commands

du to find the disk space occupied by a file

less and more for paginated display

find to recursively find files matching complex criteria

xargs to convert input into arguments

grep to match a pattern/regular expression in a file

head and tail to see part of a file

sort to sort data in a file

uniq to find unique items after sorting

wc counts the number of characters, words and lines in a file

Jobs

Control-C terminates a program

Backgrounding a program

Control-Z and bg

& at the end of the command

jobs list current jobs

fg foregrounds a program

ps lists processes

kill kills a process

References

Bash Manual Page: man bash
http://linux.die.net/man/1/bash

GNU/Linux Man Pages: man

Learning the Shell:
http://linuxcommand.org/learning_the_shell.php

Regular Expressions

What are Regular Expressions?

A concise and flexible means for matching strings of text

Similar in spirit to the shell pattern *.txt, which means all files with the .txt extension

Parts of matches can be extracted

Matched text can be replaced

Example Use Cases

Mr. A has a list of 1000 phrases in a text file. He would like to add a full stop at the end of each line.

Mr. B has a list of percentages of various categories in Wikipedia and their growth in X (Y) format. He would like to convert it to X/Y format.

Mr. C would like to find all words in a file containing 3 to 5 letters.

Example Use Cases (contd.)

Mr. D would like to list all hexadecimal numbers in a file.

Mr. E would like to convert all American formatted dates in a file to ISO date format.

Mr. F would like to retrieve all the sentences starting with 'Which' from a file.

Mr. G would like to retrieve all words in a document containing two Hindi consonants joined by a halant.
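
Two of these use cases can be sketched with Python 3's re module. This is one possible approach, not the tutorial's own solution. It assumes American dates are written MM/DD/YYYY and hexadecimal numbers carry a 0x prefix; the names US_DATE, HEX_NUMBER and to_iso are hypothetical.

```python
import re

# Assumption: American dates look like M/D/YYYY or MM/DD/YYYY.
US_DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")
# Assumption: hexadecimal numbers are written with a 0x or 0X prefix.
HEX_NUMBER = re.compile(r"\b0[xX][0-9a-fA-F]+\b")

def to_iso(text):
    """Rewrite every MM/DD/YYYY date in text as YYYY-MM-DD."""
    def rewrite(match):
        month, day, year = match.groups()
        return f"{year}-{int(month):02d}-{int(day):02d}"
    return US_DATE.sub(rewrite, text)

iso_text = to_iso("Due 7/4/2011 and 12/25/2011.")
hex_found = HEX_NUMBER.findall("mask 0x1F, size 255, tag 0xdead")

print(iso_text)   # Due 2011-07-04 and 2011-12-25.
print(hex_found)  # ['0x1F', '0xdead']
```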

Where are RegExps Used?

Editors: Vim, Emacs, Eclipse, Notepad++ etc.

Programming Languages
Built in: Perl, Ruby, JavaScript etc.
As library: C, C++, Java, PHP, Python etc.

Unix command line: rename, grep, sed, perl etc.

A lot more:
Configuring Apache Web Server
Syntax Highlighting in editors
Even Google Search (well... not really. Just code search)

Basics

A normal alphanumeric character in a regex matches that character in the target string
hello matches the text hello

. matches any character

* repeats the previous expression zero or more times

Example Applications

Unix command line: grep

Editor: Vim

Programming: Perl

Metacharacters

. matches any character


a. matches as, ab etc.

^ matches the beginning of a line

$ matches the end of a line

| alternation
H|h matches h or H

() grouping
H|hello matches H or hello
(H|h)ello matches Hello or hello

\ escapes any metacharacter


Mr. matches Mr. and Mrs
Mr\. matches Mr. and not Mrs
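
The metacharacters above behave the same way in Python 3's re module, so they can be checked interactively. The sample strings below are made up for illustration.

```python
import re

titles = ["Mr. Rao", "Mrs Rao", "Mr Rao"]

# Unescaped ".": any single character may follow "Mr" (a dot, an "s", even a space).
loose = [t for t in titles if re.search(r"Mr.", t)]
# Escaped "\.": only a literal full stop may follow "Mr".
strict = [t for t in titles if re.search(r"Mr\.", t)]

# Without grouping, "|" splits the whole pattern: "H" or "hello".
ungrouped = re.fullmatch(r"H|hello", "Hello") is not None
# With grouping, only the first letter alternates.
grouped = re.fullmatch(r"(H|h)ello", "Hello") is not None

# "^" and "$" anchor the match to the start and end of the line.
starts_with_ello = re.search(r"^ello", "hello") is not None
ends_with_ello = re.search(r"ello$", "hello") is not None

print(loose, strict)
print(ungrouped, grouped, starts_with_ello, ends_with_ello)
```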

Character Classes

[Hh] means (h|H)

[0-9] means (0|1|2|3|4|5|6|7|8|9)

[0-9a-z] means ([0-9]|[a-z])

[^ab] matches any character except a and b

\x{0915} matches the Devanagari letter क (U+0915)

\n matches a new line

\r matches a return

\t matches a tab

Character Classes (Perl)

\w matches a word character (letter, digit or underscore)

\W matches a non-word character

\s matches a whitespace

\S matches a non-whitespace

\d matches a digit
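
A short Python 3 sketch of the character classes from the last two slides. Note one notational difference: Perl writes a code point as \x{0915}, while Python's re module writes it as \u0915. The sample strings are invented for illustration.

```python
import re

digits = re.findall(r"\d+", "room 42, floor 7")        # runs of digits
words = re.findall(r"\w+", "hello, world_1!")          # runs of word characters
non_words = re.findall(r"\W", "a-b c")                 # single non-word characters
fields = re.split(r"\s+", "a  b\tc\nd")                # split on any whitespace
not_ab = re.findall(r"[^ab]", "abcab")                 # anything except a and b
lower_or_digit = re.findall(r"[0-9a-z]+", "Room 42b")  # the capital R is excluded
ka = re.findall(r"\u0915", "कमल")                      # Devanagari letter KA

print(digits, words, non_words)
print(fields, not_ab, lower_or_digit, ka)
```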

Quantifiers

* matches 0 or more times

+ matches 1 or more times

? matches 0 or 1 time

{7} matches 7 times

{5,} matches at least 5 times

{2,5} matches at least 2 times but no more than 5 times

Greedy vs. Stingy

In text "XYZ" to "PQR"

".*" will match "XYZ" to "PQR"

".*?" will match "XYZ"

? after any other quantifier makes it stingy too, e.g. +?, ?? and {2,5}?
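
The greedy and stingy behaviour can be reproduced with Python 3's re module, using the same sample text as the slide.

```python
import re

text = '"XYZ" to "PQR"'

# Greedy: .* grabs as much as possible, so the match runs to the last quote.
greedy = re.search(r'".*"', text).group()
# Stingy: .*? grabs as little as possible, so the match stops at the next quote.
stingy = re.search(r'".*?"', text).group()
# Repeating the stingy pattern finds each quoted string separately.
each_quoted = re.findall(r'".*?"', text)

# The same suffix works on other quantifiers, for example + and {2,5}.
one_a = re.search(r"a+?", "aaaa").group()
two_a = re.search(r"a{2,5}?", "aaaa").group()

print(greedy)       # "XYZ" to "PQR"
print(stingy)       # "XYZ"
print(each_quoted)  # ['"XYZ"', '"PQR"']
```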

Substitutions

s/hello/Hello/ will substitute hello with Hello

s/(H|h)ello/Hi/ will substitute Hello or hello with Hi

() will extract a match

\1, \2 etc. hold the value of the match

s/([0-9])([0-9])/\2\1/ matches two digits and reverses them
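
The s/pattern/replacement/ notation above is Perl and sed syntax. As a sketch of the same ideas in Python 3, re.sub takes the pattern and replacement as separate arguments. One difference to keep in mind: re.sub replaces every match unless count is given, whereas s/// without g replaces only the first. The X (Y) example string is made up.

```python
import re

capitalised = re.sub(r"hello", "Hello", "hello world")
greeting = re.sub(r"(H|h)ello", "Hi", "Hello and hello")

# \1 and \2 refer back to the first and second parenthesised groups.
swapped = re.sub(r"([0-9])([0-9])", r"\2\1", "12 34")

# count=1 mimics s/// without the g modifier: only the first match changes.
first_only = re.sub(r"hello", "Hello", "hello hello", count=1)

# One of the earlier use cases: turn "X (Y)" into "X/Y".
ratio = re.sub(r"(\S+) \((\S+)\)", r"\1/\2", "12% (3%)")

print(capitalised, "|", greeting, "|", swapped, "|", first_only, "|", ratio)
```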

Modifiers

i means case-insensitive match


/Hello/i will match hello, Hello or HELLO

g means global matching: every match is processed, not just the first

m means multi-line string: ^ and $ match at the start and end of every line
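
Python 3 expresses these modifiers as flags or as a choice of function rather than as letters after the pattern. The sketch below shows rough equivalents; the sample strings are invented.

```python
import re

candidates = ["hello", "Hello", "HELLO", "help"]

# i modifier: re.IGNORECASE
case_blind = [s for s in candidates if re.search(r"Hello", s, re.IGNORECASE)]

# g modifier: search returns the first match only, findall returns all of them.
first_o = re.search(r"o", "foo boo").group()
every_o = re.findall(r"o", "foo boo")

# m modifier: re.MULTILINE lets ^ match at the start of every line.
lines = "one two\nthree four"
first_word_only = re.findall(r"^\w+", lines)
first_word_per_line = re.findall(r"^\w+", lines, re.MULTILINE)

print(case_blind)
print(first_o, every_o)
print(first_word_only, first_word_per_line)
```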

References

Perl Regular Expressions: man perlre


http://perldoc.perl.org/perlre.html

Build Tools

Building a Project
[Figure: dependency graph of a build. Nodes: file1.c to file4.c, file1.o to file4.o, library1.so, library2.so, main.c, main.o and the final program.]

Make

Needs a dependency graph

Operates on files and time stamps

Executes shell commands

Other uses
Any set of tasks with dependency graphs
Automated testing
Building documentation
Even booting an operating system!
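
The core decision Make takes, rebuild a target when it is missing or older than a prerequisite, can be sketched in a few lines of Python 3. This is a simplified illustration and not how GNU Make is implemented: the RULES table and the functions mtime, plan and touch are hypothetical, and the sketch only collects the commands instead of running them.

```python
import os
import tempfile

# Hypothetical rule table mirroring a makefile: target -> (prerequisites, command).
RULES = {
    "hello": (["hello.o"], "gcc hello.o -o hello"),
    "hello.o": (["hello.c"], "gcc hello.c -c -o hello.o"),
}

def mtime(path):
    """Modification time of path, or None if the file does not exist."""
    return os.path.getmtime(path) if os.path.exists(path) else None

def plan(target, root, commands):
    """Append the commands needed to update target; return True if it is rebuilt."""
    if target not in RULES:  # a plain source file: nothing to build
        return False
    prerequisites, command = RULES[target]
    # Bring every prerequisite up to date first (depth-first walk of the graph).
    rebuilt = [plan(p, root, commands) for p in prerequisites]
    target_time = mtime(os.path.join(root, target))
    newer = [
        p for p in prerequisites
        if target_time is not None
        and (mtime(os.path.join(root, p)) or 0) > target_time
    ]
    if target_time is None or any(rebuilt) or newer:
        commands.append(command)
        return True
    return False

def touch(path, when):
    """Create path if needed and set its time stamp to the given epoch seconds."""
    open(path, "a").close()
    os.utime(path, (when, when))

with tempfile.TemporaryDirectory() as root:
    touch(os.path.join(root, "hello.c"), 1_000_000_100)
    touch(os.path.join(root, "hello.o"), 1_000_000_200)
    touch(os.path.join(root, "hello"), 1_000_000_300)
    up_to_date = []
    plan("hello", root, up_to_date)    # everything is newer than its sources

    touch(os.path.join(root, "hello.c"), 1_000_000_400)  # simulate an edit
    after_edit = []
    plan("hello", root, after_edit)

print(up_to_date)
print(after_edit)
```

After the simulated edit, both the compile and the link command are scheduled, in that order, because the rebuilt hello.o in turn makes hello out of date.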

Writing Makefiles
hello: hello.o
	gcc hello.o -o hello
hello.o: hello.c
	gcc hello.c -c -o hello.o
clean:
	rm -f hello.o hello

Using Make

$ make

$ make clean

Basics
Each rule consists of a target, its prerequisites, and the commands that build the target:

hello: hello.o
	gcc hello.o -o hello
hello.o: hello.c
	gcc hello.c -c -o hello.o
clean:
	rm -f hello.o hello

Bigger Project
hello: main.o file1.o file2.o
	gcc main.o file1.o file2.o -o hello
main.o: main.c file1.h file2.h
	gcc main.c -c -o main.o
file1.o: file1.c file1.h
	gcc file1.c -c -o file1.o
file2.o: file2.c file2.h
	gcc file2.c -c -o file2.o
clean:
	rm -f hello main.o file1.o file2.o

Improving: Step 1
hello: main.o file1.o file2.o
	gcc $^ -o $@
main.o: file1.h file2.h
main.o: main.c
	gcc $^ -c -o $@
file1.o: file1.h
file1.o: file1.c
	gcc $^ -c -o $@
file2.o: file2.h
file2.o: file2.c
	gcc $^ -c -o $@
clean:
	rm -f hello main.o file1.o file2.o

Improving: Step 2
hello: main.o file1.o file2.o
	gcc $^ -o $@
main.o: file1.h file2.h
file1.o: file1.h
file2.o: file2.h
%.o: %.c
	gcc $^ -c -o $@
clean:
	rm -f hello main.o file1.o file2.o

Improving: Step 3
TARGET = hello
OBJECTS = main.o file1.o file2.o
main.o: file1.h file2.h
file1.o: file1.h
file2.o: file2.h
$(TARGET): $(OBJECTS)
	gcc $^ -o $@
%.o: %.c
	gcc $< -c -o $@
clean:
	rm -f $(TARGET) $(OBJECTS)

Improving: Step 4
TARGET = hello
OBJECTS = main.o file1.o file2.o
main.o: file1.h file2.h
file1.o: file1.h
file2.o: file2.h
$(TARGET): $(OBJECTS)
	gcc $^ -o $@
$(OBJECTS): %.o: %.c
	gcc $< -c -o $@
clean:
	rm -f $(TARGET) $(OBJECTS)

Phony Targets

Try this:

$ touch clean

$ make clean

What happened and why?

Declaring a target as phony addresses the problem

.PHONY: clean

Improving: Step 5
TARGET = hello
OBJECTS = main.o file1.o file2.o
main.o: file1.h file2.h
file1.o: file1.h
file2.o: file2.h
$(TARGET): $(OBJECTS)
	gcc $^ -o $@
$(OBJECTS): %.o: %.c
	gcc $< -c -o $@
.PHONY: clean
clean:
	rm -f $(TARGET) $(OBJECTS)

Even Better Build System

Autoconf
Detect system environment and build accordingly

M4
Write macros for Autoconf

Automake
Automatically generate makefiles

Libtool
Automatically handle different library formats in different OSes

References

GNU Make Manual: info make
http://www.gnu.org/software/make/manual/make.html

GNU Automake Manual: info automake
http://www.gnu.org/software/automake/manual

GNU Autoconf Manual: info autoconf
http://www.gnu.org/software/autoconf/manual

Version Control

Why?

Keep track of changes

Release management

Work as a group

Identify regressions easily

Maintain personal changes to code elsewhere

Revisions
Initial Version

Added feature 1

Added feature 2

Fixed bug 1

Latest version

Release Management
Initial Version

Added feature 1

Fixed bug 1

Added feature 2

Version 1.1

Fixed bug 1

Version 2.0

Work as a Group
Initial Version

Added feature 1

B's Feature

A's Feature

Merge

Latest Version

Identify Regressions
Bug free version

Bug introduced

Latest version contains a bug

Personal Changes
Free Software Project on the Internet: Version 1.0, Version 2.0, Version 3.0, Version 4.0

My research work: Idea 1, Idea 2, Idea 3

Getting Started with Git

Basic configuration:
$ git config --global user.name "Your Name Comes Here"
$ git config --global user.email you@yourdomain.example.com

Creating a repository:
$ git init

Adding files to the repository:


$ git add file1.c

Committing the changes


$ git commit

Editing

Edit your file


$ nano file1.c

Mark for commit


$ git add file1.c

Commit the changes


$ git commit

Reviewing Changes

Edit and review changes


$ nano file1.c
$ git diff

Current status
$ git status

Reviewing Changes (contd.)

Changes between two revisions


$ git diff r1..r2

History of changes
$ git log

Exchanging Patches

The diff format

Patch file

Producing a patch file


$ git diff r1..r2 > my_feature.patch

Applying a patch
$ patch -p1 < my_feature.patch

Better ways
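
To show what the diff format looks like without needing a repository, here is a small Python 3 sketch using the standard difflib module. The file contents are invented. The a/ and b/ prefixes imitate the ones git puts on file names, and they are the leading path component that patch -p1 strips.

```python
import difflib

old = ["int main() {\n", "    return 0;\n", "}\n"]
new = ["int main() {\n", '    puts("hi");\n', "    return 0;\n", "}\n"]

# unified_diff yields the header lines, a hunk marker, then the changed region:
# lines starting with "+" were added, "-" removed, and a space means unchanged.
patch_lines = list(
    difflib.unified_diff(old, new, fromfile="a/file1.c", tofile="b/file1.c")
)

print("".join(patch_lines), end="")
```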

Tagging

What are tags?

Creating a tag
$ git tag VERSION_1

Deleting a tag
$ git tag -d VERSION_1

Retrieving older versions


$ git checkout VERSION_1

More Topics of Interest

Branching and Merging

Pushing and Pulling from repositories

Rebasing

Bisecting

Stashing changes

Graphical Tools

References

Git: http://git-scm.com

Official Git Tutorial:
http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html

ProGit Book: http://progit.org

Git Manual Pages: man git

Python

Content derived from Official Python Tutorial http://docs.python.org/tutorial/

Why Python?

Easy for beginners

Yet powerful

Rapid development

Scalable for large and complex projects

Object oriented

Cross platform

Large set of libraries for performing various tasks

First Python Program


$ python
>>> 2 + 3
5
>>>

Hello, World!
$ python
>>> print "Hello, World!"
Hello, World!
>>>

Hello, World! in a File


#!/usr/bin/python
print "Hello, World!"

Python as Calculator
>>> 2+2
4
>>> (50-5*6)/4
5

Variables
>>> a = 2
>>> b = 3
>>> print a * b
6

Strings
>>> hello = "Hello"
>>> world = "World"
>>> print hello
Hello
>>> print world
World
>>> print hello + world
HelloWorld
>>> print hello + ", " + world + "!"
Hello, World!

Lists
>>> a = ['spam', 'eggs', 100, 1234]
>>> a
['spam', 'eggs', 100, 1234]
>>> a[0]
'spam'
>>> a[3]
1234
>>> a[-2]
100
>>> a[1:-1]
['eggs', 100]
>>> a[2] = a[2] + 23
>>> a
['spam', 'eggs', 123, 1234]

More on Lists
>>> a = [66.25, 333, 333, 1, 1234.5]
>>> print a.count(333), a.count(66.25), a.count('x')
2 1 0
>>> a.insert(2, -1)
>>> a.append(333)
>>> a
[66.25, 333, -1, 333, 1, 1234.5, 333]
>>> a.index(333)
1
>>> a.remove(333)
>>> a
[66.25, -1, 333, 1, 1234.5, 333]
>>> a.reverse()
>>> a
[333, 1234.5, 1, 333, -1, 66.25]
>>> a.sort()
>>> a
[-1, 1, 66.25, 333, 333, 1234.5]

More on Lists (contd.)
>>> mat = [
...     [1, 2, 3],
...     [4, 5, 6],
...     [7, 8, 9],
... ]

Tuples
>>> t = 12345, 54321, 'hello!'
>>> t[0]
12345
>>> t
(12345, 54321, 'hello!')
>>> # Tuples may be nested:
... u = t, (1, 2, 3, 4, 5)
>>> u
((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
>>> t = 12345, 54321, 'hello!'
>>> x, y, z = t

Dictionaries
>>> tel = {'jack': 4098, 'sape': 4139}
>>> tel['guido'] = 4127
>>> tel
{'sape': 4139, 'guido': 4127, 'jack': 4098}
>>> tel['jack']
4098
>>> del tel['sape']
>>> tel['irv'] = 4127
>>> tel
{'guido': 4127, 'irv': 4127, 'jack': 4098}
>>> tel.keys()
['guido', 'irv', 'jack']
>>> 'guido' in tel
True

If .. else
>>> x = int(raw_input("Please enter an integer: "))
Please enter an integer: 42
>>> if x < 0:
...     x = 0
...     print 'Negative changed to zero'
... elif x == 0:
...     print 'Zero'
... elif x == 1:
...     print 'Single'
... else:
...     print 'More'

For
>>> # Measure some strings:
... a = ['cat', 'window', 'defenestrate']
>>> for x in a:
...     print x, len(x)
...
cat 3
window 6
defenestrate 12

For (contd.)
>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a = ['Mary', 'had', 'a', 'little', 'lamb']
>>> for i in range(len(a)):
...     print i, a[i]

Break
>>> for i in range(10):
...     if i > 5:
...         break
...     print i
...
0
1
2
3
4
5

Continue
>>> for i in range(10):
...     if i == 5:
...         continue
...     print i
...
0
1
2
3
4
6
7
8
9

Comments
>>> # This is a single line comment
>>> """ This is a
... multiline
... comment"""

Functions
>>> def fib(n):
...     # print Fibonacci series
...     """Print a Fibonacci series up to n."""
...     a, b = 0, 1
...     while a < n:
...         print a,
...         a, b = b, a+b
...
>>> # Now call the function we just defined:
... fib(1000)
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987

References

Python Programming Language Official Website: http://python.org

The Python Tutorial: http://docs.python.org/tutorial

The Python Standard Library: http://docs.python.org/library

The Python Language Reference: http://docs.python.org/reference

Feedback & Further Assistance: sunil at medhas dot org

Thank you
