
Introduction to Shell Scripting

Mar 2011
A shell script is a script written for the shell, or command line interpreter, of an operating
system. It is often considered a simple domain-specific programming language. Typical
operations performed by shell scripts include file manipulation, program execution, and printing
text.
Types of shell scripts: there are three main types of shell script, named after the most prominent
shells widely used in industry.
1. Bourne Shell :
The Bourne shell, or sh, was the default Unix shell of Unix Version 7, and replaced the
Thompson shell, whose executable file had the same name, sh. It was developed by Stephen
Bourne, of AT&T Bell Laboratories, and was released in 1977 in the Version 7 Unix release
distributed to colleges and universities. It remains a popular default shell for Unix accounts. The
binary program of the Bourne shell or a compatible program is located at /bin/sh on most Unix
systems, and is still the default shell for the root superuser on many current Unix systems.
Features of the Bourne shell include:

Scripts can be invoked as commands by using their filename.
May be used interactively or non-interactively.
Allows both synchronous and asynchronous execution of commands.
Supports input and output redirection and pipelines.
Provides a set of built-in commands.
Provides flow control constructs, quotation facilities, and functions.
Typeless variables.
Provides local and global variable scope.
Scripts do not require compilation before execution.
Does not have a goto facility, so code restructuring may be necessary.
Command substitution using back quotes: `command`.
Here documents using << to embed a block of input text within a script.
"for ~ do ~ done" loops, in particular the use of $* to loop over arguments.
"case ~ in ~ esac" selection mechanism, primarily intended to assist argument parsing.
Strong provisions for controlling input and output, and expression-matching facilities.
Support for environment variables using keyword parameters and exportable variables.
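Several of the features listed above can be seen together in a short script. The following is only a minimal sketch: the function names classify and report and the sample arguments are made up for illustration, and "$@" is used in place of $* so that arguments containing spaces stay intact.

```shell
#!/bin/sh
# Illustrative only: classify, report and the sample arguments are hypothetical.

# Function using the "case ~ in ~ esac" selection mechanism.
classify() {
    case "$1" in
        -*)    echo "option" ;;
        *.txt) echo "text file" ;;
        *)     echo "other" ;;
    esac
}

# Here document: a block of input text embedded in the script.
report() {
    cat <<EOF
user: $1
kinds: $2
EOF
}

# "for ~ do ~ done" loop over the positional parameters.
set -- -v notes.txt data
kinds=""
for arg in "$@"
do
    kind=`classify "$arg"`      # command substitution with back quotes
    kinds="$kinds$kind;"
done

report "guest" "$kinds"
```

The variables are typeless: kinds is built up purely by string concatenation.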

2. C Shell :
The C shell (csh or the improved version, tcsh, on most machines) is a Unix shell that was
created by Bill Joy while a graduate student at University of California, Berkeley in the late 1970s.
The C shell is a command processor that's typically run in a text window, allowing the user to
type commands which cause actions. The C shell can also read commands from a file, called a
script. Like all Unix shells, it supports filename wildcarding, piping, here documents, command
substitution, variables and control structures for condition-testing and iteration. What
differentiated the C shell, especially in the 1980s, were its interactive features and overall style.
Its new features made it easier and faster to use. The overall style of the language looked more
like C and was seen as more readable.
Features of C Shell :
Some of the features of the C shell are listed here:

Customizable environment.
Abbreviate commands. (Aliases.)

History. (Remembers commands typed before.)

Job control. (Run programs in the background or foreground.)

Shell scripting. (One can write programs using the shell.)

Keyboard shortcuts.

3. Korn Shell :
The Korn shell (ksh) is a Unix shell which was developed by David Korn (AT&T Bell
Laboratories) in the early 1980s and announced at Toronto USENIX on July 14, 1983.
ksh is backwards-compatible with the Bourne shell and includes many features of the C shell as
well, such as a command history, which was inspired by the requests of Bell Labs users.

The Korn shell's major new features include:

Command-line editing, allowing you to use vi- or emacs-style editing commands on
your command lines.

Integrated programming features: the functionality of several external UNIX
commands, including test, expr, getopt, and echo, has been integrated into the shell
itself, enabling common programming tasks to be done more cleanly and without creating
extra processes.

Control structures, especially the select construct, which enables easy menu generation.

Debugging primitives that make it possible to write tools that help programmers debug
their shell code.

Regular expressions, well known to users of UNIX utilities like grep and awk, have
been added to the standard set of filename wildcards and to the shell variable facility.

Advanced I/O features, including the ability to do two-way communication with
concurrent processes (coroutines).

New options and variables that give you more ways to customize your environment.

Increased speed of shell code execution.

Security features that help protect against "Trojan horses" and other types of break-in
attacks.

Why do we Use Shell Scripting
Mar 2011
We use shell scripts for the following purposes:
1. Customizing your work environment.
2. Automating your daily tasks.
3. Automating repetitive tasks.
4. Executing important procedures like shutting down the system, formatting a disk, creating a file
system on it, mounting the file system, letting the users use the floppy and finally unmounting it.
5. Performing the same operation on many files.
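The last purpose, performing the same operation on many files, is usually a loop over a wildcard. The sketch below assumes a scratch directory created with mktemp and a hypothetical helper called backup_logs; neither comes from the original notes.

```shell
#!/bin/sh
# Sketch: apply one operation (a backup copy) to every .log file in a directory.
# The scratch directory and the helper name backup_logs are illustrative.

workdir=`mktemp -d`
for name in a b c
do
    echo "data $name" > "$workdir/$name.log"
done

backup_logs() {
    # $1 = directory to scan; prints the number of files copied
    count=0
    for f in "$1"/*.log
    do
        cp "$f" "$f.bak"
        count=$((count + 1))
    done
    echo "$count"
}

copied=`backup_logs "$workdir"`
echo "backed up $copied files"
```

The same loop shape works for renaming, compressing or checking files; only the command inside the loop changes.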

Usage of Shell Scripts within Informatica:
1. To run a batch via pmcmd.
2. To have a header and footer in case we are going to write to a flat file.
3. To run the command task.
4. To update the parameter file with session start time and end time.

When Not to use Shell Scripting:
1. When the task is too complex, such as writing an entire billing system.
2. When the task requires a high degree of efficiency.
3. When the task requires a variety of software tools.

Basic Commands Part1
Naveen 24 Mar 2011

Command / Option / Description

Uname
    uname -r            To find the version of the Unix OS.
    uname -n            To find the machine name.
Man
    man grep            Print the user manual for any command, e.g. the manual for the grep command.
    man -e grep         One-line definition of grep usage.
    man -k copy         Helpful when we do not know which command will solve the purpose; lists the different options available for copying.
Cal                     Print the calendar for a year or month.
    cal 2009            Print the calendar for the year 2009.
    cal dec             Print the calendar for December of the current year.
Date
    date                Print the current date and time, e.g. Tue Mar 09 07:10:10 IST 2009.
    date +%m            Month as a number.
    date +%h            Abbreviated month name.
    date +'%h%m'        Month name followed by month number.
    date +%d            Day of the month.
    date +%y            Last two digits of the year.
    %H, %M and %S       Hours, minutes and seconds.
Who
    who                 Login details of all the users in the system.
    who am I            Login details for only myself.
    who -H              Login details for all users with header info.
    who -u              Login details for only active users.
Passwd                  To change the password.
Lock
    lock -45            Lock the terminal for 45 minutes.
Bc                      To use the calculator.
ls
    ls                  Print the contents of the current directory.
    ls -x               Print multi-column output of files with their names.
    ls -F               Mark directories and executable files: <*> indicates files containing executable code and </> indicates directories.
    ls -a               Print all files including hidden files. <a> stands for all.
    ls -r               Reverse the sort order of all the files with their names.
    ls -x abc           Print the content of the abc directory.
    ls -R               Print all files and subdirectories in the directory.
    ls -ld              Print the directory entry itself rather than its contents.
    ls -il testfile     Print the inode number of the file.
    ls -lt              Print all the files with their last modification time, keeping the latest modified at the top.
    ls -ltr             Print all the files with their last modification time in reverse order, keeping the last modified at the bottom.
    ls -lut             Print the last access time of files.
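The effect of the ls options and date format strings is easiest to see on a small scratch directory. This is a minimal sketch under assumptions: the directory is created with mktemp, and the file names .hidden, run.sh and sub are invented for the demonstration.

```shell
#!/bin/sh
# Sketch: compare plain ls, ls -a and ls -F on a scratch directory.
# All file and directory names here are illustrative.

demo=`mktemp -d`
mkdir "$demo/sub"
touch "$demo/.hidden" "$demo/run.sh"
chmod u+x "$demo/run.sh"

visible=`ls "$demo" | wc -l | tr -d ' '`     # hidden files are skipped
all=`ls -a "$demo" | wc -l | tr -d ' '`      # includes . .. and .hidden
marked=`ls -F "$demo"`                       # run.sh* and sub/

month=`date +%m`                             # two-digit month number
year=`date +%y`                              # two-digit year

echo "visible=$visible all=$all"
echo "$marked"
echo "month=$month year=$year"
```

Plain ls reports two entries here, while ls -a reports five because it adds ".", ".." and the hidden file.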

Cat
    cat testfile        Print the content of any file.
    cat > testfile      Create a new file named testfile.
CP                      To copy files and directories.
    cp -i testfile abc  Copy testfile to the abc directory interactively.
    cp -r abc newabc    Copy the entire directory structure abc to newabc recursively.
rm                      All the options of the cp command are applicable to the rm command.
    rm -f newabc        Remove newabc forcefully, without prompting.
    rm -r newabc        Remove the directory newabc recursively. Recursive deletion will not remove write-protected files.
lp                      Used for printing files.
Split                   To split big files into small files of 1000 lines each <default>.
    split testfile          Split the file into small files.
    split -72 testfile      Change the default split size to 72 lines.
    split testfile newfile  Name the small files using <newfile> as the prefix.
Cmp, comm, Diff         These three commands are used to find the differences between two files.
wc
    wc -w testfile      Print the number of words in the file.
    wc -c testfile      Print the number of characters in the file.
    wc -l testfile      Print the number of lines in the file.
    wc testfile         Print the number of lines, words and characters in the file.
stty
    stty -a             Print the current settings of the terminal.
    stty erase <char>   To set the character that deletes during backspacing.
    stty intr \^c       To change the interrupt key to Ctrl c instead of the 'DELETE' default.
    stty eof \^a        To change the termination control of input during file creation using the cat command from Ctrl d to Ctrl a.
Path
    PATH=$PATH:/home/gagan:/home/deep
                        Set the PATH variable so that the shell also uses these directories to locate executable commands.
Alias                   To create shorthand names for commands.
    alias               Print all the aliases set in the system.
    alias l='ls -ltr'
    alias showdir='cd $1;ls -l'
                        Passing the positional parameter ETL to the showdir alias will take us to that directory.
    unalias             To redefine and unset the alias.
History                 To find the previous commands that have been used. Every command has an event number.
    history -5          Print the last 5 commands that have been used.
    history 10 15       Display the events numbered between 10 and 15.
    HISTSIZE=1200       To change the default setting of the command history saving feature.
    r                   Repeat the previous command in Korn.
    !!                  Repeat the previous command in Bash.
    r -2                Repeat the command before the previous one.
    r 20                Repeat the command with event number 20.
cd -                    To switch between the current working directory and the most recently used directory.
Tilde (~)
    ~                   Changes control to the home directory. The value of the home directory can be seen in the $HOME environment variable.
    ~a_cmdb             Refers to the absolute path of the home directory of user a_cmdb.
chmod                   Changing the permission of a file for user <u>, group <g>, others <o> or all <a>, with + to assign, - to remove and = for absolute permission. r is read <4>, w is write <2> and x is execute <1>.
    chmod u+rwx newfile
    chmod g+x newfile
    chmod 757 newfile
    chmod 457 newfile
    chmod -R a+x abc    Recursively changes the permission to execute for all directories and subdirectories in the abc directory for all users.
    chmod -R 001 .      The dot indicates the current directory.

Basic Commands Part2
Naveen 24 Mar 2011

Command / Options / Description

Head
    head newfile        Display the top of the file; the top 10 lines when used without an argument.
    head -5 newfile     Display the top 5 lines of the file <newfile>.
    head -1 newfile | wc -c
                        To count the number of characters present in the first line.
    vi `ls -t | head -1`
                        To open the last modified file for editing.
    head -c 512 newfile To pick a specific number of characters from the file: print the first 512 bytes.
    head -c 1b newfile  Print the first block.
    head -c 2m newfile  Print the first 2 blocks (1 MB each).
Tail
    tail newfile        Display the end of the file.
    tail -5             Display the last 5 lines of the file.
    tail +10            Start displaying from the 10th line onward till the end.
    tail -f 6_load.log  Print the growth of the file.
$$                      Prints the process id of the current shell. $$ stores the PID of the current login shell.
ps
    ps                  Prints the processes associated with the current user.
    ps -f               Prints the processes associated with the current user with their hierarchy. -f stands for full.
    ps -f -u a_cmdb     Prints all the processes associated with the a_cmdb user.
    ps -a               Prints all the users' processes.
    ps -e               Prints system processes.
Background job          Running a job in the background, e.g. run the search job in the background, printing its process id. The shell becomes the parent of all background processes.
    jobs                Prints all the jobs running in the background with their sequence number of execution.
    fg                  Running fg brings the most recently started background process to the foreground <LAST IN FIRST OUT>.
    fg %1               The first job comes into the foreground.
nohup                   The shell is the parent of all background jobs; it dies when the user logs out, and ultimately all the child processes also die. To keep a background process alive even when the user logs out, run it using the nohup command.
    nohup ps -eaf | grep 'ksh' &
                        Prints all the ksh processes running in the system by first searching through all the processes.
                        The system process init, having PID 1, is the parent of all shell processes; when the user logs in, this PID becomes the PPID of the shell process. After running a process in nohup mode, the kernel reassigns the PPID of the ps process to the PID of the system process, which will not die even after the shell dies through logging out.
Kill                    To kill a command.
    kill 520            Kill the process with process id 520.
    kill 640            Kill the parent process id, which in turn kills all the child processes.
    kill 1              Killing the system process init is not possible.
    kill $!             No need to remember the process id of the last background process; it is in $!.
    kill 0              Kill all processes in the system except the login shell process.
Nice                    Running a command with low priority. Nice values range from 1 to 39 and the default nice value is 20; the higher the nice value, the lower the priority.
    nice who | wc -l &
    nice -n 10 who | wc -l &
                        The nice value becomes 30.
At                      To run a job at a specified time of the day.
    at 2:10 load.ksh    Indicates that load.ksh will execute today at 2:10.
Batch
    batch load.ksh      The batch command executes the process whenever the CPU is free.
Cron                    cron executes processes at regular intervals; it keeps checking the control file at /usr/spool/cron/crontabs for jobs.
Finger
    finger              Produces a list of all logged-in users when run without any argument. The 2nd and last fields of the output are taken from the passwd file in the /etc directory.
    finger a_cmdb       Prints details about the a_cmdb user.
Set
    set 10 20 30        Set the positional parameters. This will set $1, $2, $3, $* and $#.
    set -x              When this statement is used at the start of a ksh script, it echoes each statement at the terminal.
Shift                   The shift command transfers the contents of each positional parameter to the next lower number.
    shift               Shift one position.
    shift 2             Shift 2 places.
Chown                   To change the owner of a file; can only be done by the owner or the sys admin.
    chown jaspreet testfile
                        Change the owner from gagan to jaspreet for testfile. Now gagan cannot change the ownership of testfile; what he can do is create a similar copy of the file, and then he becomes the owner of that copy.
    chown -R jaspreet . Recursively changes the owner of all files under the current directory.
Chgrp                   To change the group of a file; can only be done by the owner or the sys admin.
    chgrp GTI testfile  Group changed to GTI from IB. As the file's user is still the same, he can again change the group of the file to IB.
    chgrp -R GTI .      Recursively changes the group of all files under the current directory.
Touch                   Changing the time stamps of a file.
    touch 03161430 testfile
                        touch without any option changes both the modified time and the access time.
    touch -m 04161430 testfile
                        Changes only the modification time.
    touch -a 05161430 testfile
                        Changes only the access time.
Linking (ln)
    ln 6_load.ksh 7_load.ksh
                        Linking two files. Doing this, both files will have the same inode number and changes in one will be reflected in the other.
    ls -li 6_load.ksh 7_load.ksh
                        Will give us the same inode number.
    rm 6_load.ksh       Drop the link between the two files.
df
    df                  Prints the amount of free space available on the disk.
    df -t /home/oracle  Prints the free and total space under the oracle file system.
du                      Prints the disk usage of a specific directory tree. By default du prints disk usage for each subdirectory.
    du -s               Prints a summary of the whole directory tree.
    du -s /home/*       Prints disk usage for every user.
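Two of the entries above, set/shift and the octal form of chmod, are easier to follow with concrete values. The sketch below is illustrative only; the variable names and the temporary file are assumptions, not part of the original notes.

```shell
#!/bin/sh
# Sketch: positional parameters with set and shift, then an octal chmod.
# Variable names and the temporary file are illustrative.

set -- 10 20 30 40      # $1=10 $2=20 $3=30 $4=40, $#=4
total=$#
first=$1

shift                   # parameters move down one place: $1 is now 20
after_shift=$1

shift 2                 # drop two more: only 40 is left
left=$#
last=$1

# Octal permissions: r=4, w=2, x=1, so 640 means rw- for the user,
# r-- for the group and --- for others.
permfile=`mktemp`
chmod 640 "$permfile"
mode=`ls -l "$permfile" | cut -c1-10`

echo "total=$total first=$first after_shift=$after_shift left=$left last=$last"
echo "mode=$mode"
```

The first ten characters of ls -l output show the file type followed by the three permission triplets, which is why cut -c1-10 is enough to check the result of chmod.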

Zipping Files
Naveen 24 Mar 2011

Command / Options / Description

gzip                    Zip a file.
    gzip etl_code       eg: produces etl_code.gz
    gzip file*sql       eg: zips every file matching file*sql
    gunzip *.gz         Unzip files.
tar                     Create a backup of files recursively.
    -cvf                eg: tar -cvf /home/gagan/sqlbackup ./*.sql
    -x                  Files are restored using the -x option. eg: tar -xvf /home/gagan/sqlbackup
Cut                     C stands for column cut (by specifying position). F stands for field cut (default delimiter tab).
    -c <column start and end no>
                        eg: cut -c -5,6-12 testfile
    -f <field start and end no>
                        eg: cut -f 1,5 testfile > newfile
                        Cut fields 1 and 5 and send the output to a new file.
    -d                  Delimiter to distinguish between the start and end of a field, overriding the default.
                        eg: cut -d "|" -f 1,5 testfile
sort
    sort testfile       By default the sorting starts with the first character of each line. Priority: 1. tabs 2. numerals 3. uppercase letters 4. lowercase letters.
    -t                  eg: sort -t "|" +2 testfile
                        Sorting starts from the 3rd field, skipping the first two fields. (On current systems the +N form is written with -k, e.g. sort -t "|" -k 3 testfile.)
    -r                  eg: sort -t \| -r +2 testfile
                        Reverse sort starting with the 3rd field.
                        eg: sort -t \| +2r testfile
                        The above command written in another way.
                        eg: sort -t "|" +1 -2 +3 <>
                        Sorting based on different fields, as in the case of an order by clause. Sorting starts with the 2nd field, then the 4th; -2 indicates to stop the sorting after the 2nd field and resume it with the next key.
    -n                  eg: sort -n <>   Numeric sort.
    -u                  eg: sort -u <>   Unique sort.
    -o                  eg: sort -o abc.txt abc_sort.txt   Save the sorted data in a file.
tr
    eg: tr '|\' '~-' < test.txt
                        Translate all | to ~ and all \ to -.
    eg: tr '[a-z]' '[A-Z]' < test.txt
                        Translate to upper case.
    -d                  Delete all occurrences of |. eg: tr -d '|' < test.txt
paste
    -d                  Delimiter. eg: paste -d "|" <> <>
uniq                    Unique requires a sorted file as input.
    -u                  Remove duplicates. eg: cut -d "|" -f3 <> | sort | uniq -u
    -d                  Select only duplicate records. eg: cut -d "|" -f3 <> | sort | uniq -d
    -c                  Duplicate count. eg: cut -d "|" -f3 <> | sort | uniq -c
Change Date
    date 09181754
wall
    wall -g dba "hello" To selectively send a message to the dba group.
shutdown
    shutdown -g2        Power down after 2 mins.
    shutdown -y -g0     Immediate shutdown.
    shutdown -y -g0 -i6 Shutdown and reboot (init level 6).
    shutdown -r now     Shutdown immediately and reboot.
    shutdown 17:30      Shutdown at 17:30.
Changing time stamp
    touch mon date hrs mins <file>
    touch -m 01290430 <file>
                        ls -lt shows the time of last modification.
    touch -a 01290415 <file>
                        ls -lu shows the time of last access.
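The cut, sort, uniq and tr entries above are most often combined in one pipeline. The following is a minimal sketch on invented data: the pipe-delimited records and the variable names are assumptions, and the modern -k key syntax is used in place of the older +N form.

```shell
#!/bin/sh
# Sketch: count records per department in a pipe-delimited file.
# The sample records and variable names are illustrative.

datafile=`mktemp`
cat > "$datafile" <<'EOF'
101|gagan|etl
102|deep|dba
103|naveen|etl
104|jaspreet|etl
105|prasad|dba
106|kevin|ops
EOF

# Field 3 only, sorted so that uniq sees duplicates next to each other.
dept_counts=`cut -d "|" -f3 "$datafile" | sort | uniq -c | sort -rn | awk '{print $2 "=" $1}'`

# Departments that occur exactly once, and those that occur more than once.
only_once=`cut -d "|" -f3 "$datafile" | sort | uniq -u`
repeated=`cut -d "|" -f3 "$datafile" | sort | uniq -d`

# Sort on the 2nd field (the name) and take the first name.
first_name=`sort -t "|" -k 2,2 "$datafile" | head -1 | cut -d "|" -f2`

# Translate lower case to upper case.
upper=`echo "gagan" | tr '[a-z]' '[A-Z]'`

echo "$dept_counts"
echo "once: $only_once"
echo "first: $first_name upper: $upper"
```

uniq only compares adjacent lines, which is why every uniq call here is preceded by sort.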

du (disk usage)
    du /home/expimp/create_db       Tree output for each directory inside.
    du -s /home/expimp/create_db    Summary.

find
    find <loc> <option> <pattern> <action>

    -name               Find files by name, e.g. find the emp.lst file starting in the root dir.
    -a (and) -o (OR)    eg: find . \( -name "*.sh" -o -name "*.lst" \) -print
                        Double quotes are necessary.
    ! -newer            eg: find / -name "*.pl" ! -newer last_backup -print
                        Files modified before last_backup.
    -type               eg: find / -name log -type f -print
                        f for file and d for directory.
    -prune              Don't descend into a directory, e.g. the exe directory.
    -mtime              Modification time. eg: find . -mtime -2 -print
                        Find files modified in less than 2 days.
    -atime              Access time. eg: find . -atime +365 -print
                        Find the files not accessed in the last 1 year.
    -size               eg: find . -size +2048 -print
                        Files greater than 2048 blocks.
    -exec               eg: find . -atime +180 -exec rm -f {} \;
                        Remove the files without prompting.
    -ok                 eg: find . -atime +180 -ok rm -f {} \;
                        Prompt for confirmation before removing.
    -print              Print the matching file names.
    xargs               eg: find . -mtime +20 | xargs rm -f
                        Remove all the files which have not been modified for the last 20 days; rm will be executed only once.
    xargs -n -p -t      eg: find . -mtime +20 | xargs -n20 -p -t rm -f
                        Remove at most 20 files per batch, in interactive mode.
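The find options for modification time and name patterns, and the hand-off to xargs, can be tried safely on a scratch tree. This is a sketch under assumptions: the directory layout and file names are invented, and one file is back-dated with touch -t so that -mtime +20 has something to match.

```shell
#!/bin/sh
# Sketch: find files older than 20 days and remove them with xargs.
# The scratch tree and file names are illustrative.

base=`mktemp -d`
mkdir "$base/logs"
touch "$base/logs/new.log" "$base/run.sh" "$base/report.lst"
touch -t 200001010000 "$base/logs/old.log"   # back-date to 1 Jan 2000

# Regular files not modified in the last 20 days.
old_files=`find "$base" -type f -mtime +20 -print`

# Name patterns combined with -o (OR); the quotes stop the shell
# from expanding the wildcards before find sees them.
scripts=`find "$base" \( -name "*.sh" -o -name "*.lst" \) -print | sort`

# xargs collects the names and runs rm once for the whole batch.
# (Plain xargs assumes the file names contain no spaces.)
find "$base" -type f -mtime +20 -print | xargs rm -f

echo "old: $old_files"
echo "$scripts"
```

Replacing xargs rm -f with -ok rm -f {} \; gives the interactive variant described above, which prompts before each removal.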

Pattern Searching
Naveen 24 Mar 2011

Commands / Options / Description

Searching within a file (vi):
    /       /Unix       Forward search for the Unix keyword in the file.
    ?       ?Unix       Backward search for the Unix keyword in the file.
    n                   Repeat the last search.
    .                   Repeat the previous command.
    :1,$s/<search string>/<replace string>/g
                        Pattern search and replacement. 1,$ represents all lines in the file. g stands for globally.
    :3,10s/gagan/deep/g Search between lines 3 and 10.
    :.s/gagan/deep/g    Only the current line.
    :$s/gagan/deep/g    Only the last line.
    :1,$s/gagan/deep/gc c asks for confirmation before each replacement.

grep
    -c                  Count occurrences.
    -n                  Display the line number for each record.
    -l                  Display the files containing the record.
    -i                  Ignore case.
    -v                  Skip records that contain the pattern.
    [PQR]               Match any single character P, Q or R.
    [c1-c2]             Match a character within the ASCII range.
    [^PQR]              Match a single character which is not P, Q or R.
    [a-zA-Z0-9]         Match any single alphanumeric character.
    ^<pat>              Line beginning with the pattern.
    <pat>$              Line ending with the pattern.
    ls -l | grep "^d"   Prints only directories.

egrep
    +                   Matches one or more occurrences of the previous character.
                        eg: egrep '[aA]g+[ar][ar]wal' test1.txt   (matches ag and agg)
    ?                   Matches zero or one occurrence of the previous character.
                        eg: egrep '[aA]gg?[ar][ar]wal' test1.txt
    exp1|exp2           Match exp1 or exp2.
                        eg: egrep 'prashant|director' test1.txt   (finds lines with prashant or director)
    (x1|x2)x3           Match x1x3 or x2x3.
                        eg: egrep '(das|sen)gupta' test1.txt   (like dasgupta and sengupta)
    -f                  A huge list of patterns can be passed in the form of a file, with the patterns stored in the file, eg (prashant|admin|director).
                        eg: egrep -f <pattern_file_name> test1.txt

Fgrep
    fgrep and egrep accept multiple patterns both from the command line and from a file but, unlike grep and egrep, fgrep does not accept regular expressions.
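The grep options and extended patterns described above can be checked against a small sample file. The sketch below uses grep -E and grep -F, the current spellings of egrep and fgrep; the sample records and file names are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: grep options and extended patterns on a sample file.
# The records and the pattern file are illustrative.

empfile=`mktemp`
cat > "$empfile" <<'EOF'
prashant|manager
dasgupta|director
sengupta|admin
Agarwal|clerk
aggarwal|clerk
EOF

clerks=`grep -c "clerk" "$empfile"`                          # count matching lines
line_no=`grep -n "director" "$empfile" | cut -d: -f1`        # line number of a match
non_clerks=`grep -v "clerk" "$empfile" | wc -l | tr -d ' '`  # lines NOT matching

# (das|sen)gupta matches dasgupta and sengupta.
gupta=`grep -E '(das|sen)gupta' "$empfile" | cut -d "|" -f1`

# g? makes the second g optional; -i ignores case.
agarwals=`grep -E -i '^agg?arwal' "$empfile" | wc -l | tr -d ' '`

# Fixed-string patterns read from a file, one per line.
patfile=`mktemp`
printf 'admin\ndirector\n' > "$patfile"
from_file=`grep -F -f "$patfile" "$empfile" | wc -l | tr -d ' '`

echo "clerks=$clerks line_no=$line_no non_clerks=$non_clerks"
echo "agarwals=$agarwals from_file=$from_file"
```

With -F the patterns are taken literally, so characters such as | or ? in the pattern file would not act as operators.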

    fgrep -f pattern_file emp_file
                        Faster than the grep and egrep family.

Pattern Matching
Naveen 25 Mar 2011

Command / Options / Description

*                       Matches any number of characters, including none.
    ls -l chap*         Matches all the files whose names start with chap.
    ls -x chap*         Matches all the files whose names start with chap and prints them in multi-column form.
    ls -l *             * does not match files beginning with a dot <.>.
    ls -l .???*         The above problem can be solved by specifying the first few characters explicitly using the meta character <?>.
    ls *.ist            Print all the files whose names end with ist.
    mv * ../bin         Moves all the files to the bin directory.
?                       Matches a single character.
    ls -l chap?         Matches all the files with 5-character names that start with chap.
    cp chap?? abc       Copy all 6-character files starting with chap to the abc directory.
[ijk]                   Matches a single character, either i or j or k.
    cmp chap[12]        Compares the files chap1 and chap2.
[!ijk]                  Matches a single character that is not i or j or k.
[x-z]                   Matches a single character that is within the ASCII range of characters x and z.
    ls -l chap0[1-4]    Range specification is also available.
    ls -l [a-zA-Z]*     Matches all file names beginning with a letter, irrespective of case.
[!x-z]                  Matches a single character that is not within the ASCII range of characters x and z.
    ls -l chap[!0-9]    Matches all file names beginning with chap and not ending with a digit.
    cat chap[!0-9]      Concatenates all the files beginning with chap and not ending with a digit.

Escaping: Backslash ( \ )
                        For working with files that use meta characters in their file names.
    ls -l chap*         Prints all files whose names start with chap, not just the one whose name is chap*.
    ls -l chap\*        The above problem can be solved by escaping the special character.
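The wildcard rules above can be verified in a scratch directory. This is a minimal sketch: the file names and the helper function match are made up for the demonstration, and LC_ALL=C is set only to make the order of the expanded names predictable.

```shell
#!/bin/sh
# Sketch: how the shell expands *, ?, [..] and [!..] into file names.
# The file names and the helper function match are illustrative.

LC_ALL=C; export LC_ALL

globdir=`mktemp -d`
( cd "$globdir" && touch chap chap1 chap2 chap01 chapx .profile )

match() {
    # $1 = wildcard pattern (quote it when calling); the unquoted $1
    # below is expanded by the shell inside the scratch directory.
    ( cd "$globdir" && echo $1 )
}

match 'chap?'       # exactly one character after chap
match 'chap[12]'    # chap1 or chap2
match 'chap[!0-9]'  # chap followed by a single non-digit
match '.???*'       # dot files with at least three more characters
match '*'           # everything except the dot files
```

Quoting the pattern at the call site passes it through unexpanded, which is the same idea as escaping a single meta character with a backslash.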

. b=cd. who | tee users list Tee saves the output of who command in user list as well as display it also. a=ab. $1 is part of positional parameter. eg: $10 echo "$10" eg: 0 Control Structures Naveen 25 Mar 2011 Shell is looking for $1 variable which is undefined so passes null for this. echo $z shell concatenate two variable. All words starts with $ are considered as variable unless single quoted or escaped.standard input to another. Shell Variable Shell variables are initialized to null value < by default > so it returns null. z=$a$b. echo '$10' All words starts with $ are considered as variable unless single quoted or escaped. Tee who | wc -l Output of who command <three users> passed as input to wc which count the number of lines present.


Sample Unix shell Scripts - Part 1
Naveen 10 Apr 2011

#ksh: JobNum.
#Author: Prasad Degela
#Date: 03/13/2006
#Reviewed By:
#Date:
#Project: Appeals & Grievances Tracking
####################################################################################
#This script will call .env file to get common variables.
# These parameters are provided for all interfaces.
# $1 = environment (DEV)
####################################################################################
#set -x
####################################################################################
# SET PARAMETERS
. /etldata/aagt/common/scripts/.env
inboxdir=$PROJECTDIR/qnxt/inbox
srcfiledir=$PROJECTDIR/qnxt/estage
errorfiledir=$PROJECTDIR/qnxt/error
tempfiledir=$PROJECTDIR/nice/tstage
tempfiledirqnxt=$PROJECTDIR/qnxt/tstage
ERROR_EXIT_STATUS=99
SUCCESS_EXIT_STATUS=0
echo "Control Moved to the Inbox Directory = "$inboxdir
cd $inboxdir
fileconv=`echo "PDP*.txt"`
echo "fileconv = "$fileconv
files=`ls -tr $fileconv`
if [[ $? -ne 0 ]]
then
    echo "No Source File in the InBox "$inboxdir
    exit $SUCCESS_EXIT_STATUS
fi
echo "List Of Files in the Inbox Directory "$inboxdir
echo "{ $files }"
for pfilename in $files
do
    # Stage the file, then archive a compressed copy
    cd $inboxdir
    cp $pfilename $srcfiledir
    gzip $pfilename
    cp $pfilename.gz $ARCHFILEDIR
    rm -f $pfilename.gz
    rm -f TRAIL*.txt
    echo "Source File Name Is "$pfilename
    pDSjobname="sjAG01"
    pINVOKE_ID=$pfilename
    pJobStatusFile=`echo $pfilename".stat"`
    pJobLogFile=`echo $pfilename".log"`
    ################################################################################
    # RESET AND RUN job sequencer
    ################################################################################
    $DSBINDIR/dsjob -jobinfo $DSPROJECT $pDSjobname.$pINVOKE_ID > $LOGFILEDIR/$pJobStatusFile
    JOBSTATUS=`grep "Job Status" $LOGFILEDIR/$pJobStatusFile | cut -f2 -d '(' | cut -f1 -d ')'`
    echo "$pDSjobname Job status code is "$JOBSTATUS
    if [[ $JOBSTATUS = 3 ]]
    then
        echo "Job Status = "$JOBSTATUS
        $DSBINDIR/dsjob -run -mode RESET -wait -jobstatus $DSPROJECT $pDSjobname.$pINVOKE_ID
        echo "************************* Reset the Job Sequencer *******************"
    fi
    echo AGDB: $AGDB
    echo AGDBUserID: $AGDBUserID
    echo ORADB: $ORADB
    echo ORADBUSRID: $ORADBUSRID
    echo NICEDB: $NICEDB
    echo NICEDBUserID: $NICEDBUserID
    echo SrcFile: $pfilename
    echo SrcFileDir: $SRCFILEDIR
    echo TempFileDirectory: $TEMPFILEDIR
    echo TempFileDirectoryqnxt: $TEMPFILEDIRQNXT
    echo ErrorFileDirectory: $ERRORFILEDIR
    echo ScriptFileDirectory: $SCRIPTFILEDIR
    echo "calling $Jobname Sequence"
    $DSBINDIR/dsjob -run \
        -param AGDB=$AGDB \
        -param AGDBUserID=$AGDBUserID \
        -param AGDBPswd=$AGDBPswd \
        -param ORADB=$ORADB \
        -param ORADBUSRID=$ORADBUSRID \
        -param ORADBPswd=$ORADBPswd \
        -param NICEDB=$NICEDB \
        -param NICEDBUserID=$NICEDBUserID \
        -param NICEDBPswd=$NICEDBPswd \
        -param SrcFile=$pfilename \
        -param SrcFileDir=$SRCFILEDIR \
        -param TempFileDirectory=$TEMPFILEDIR \
        -param TempFileDirectoryqnxt=$TEMPFILEDIRQNXT \
        -param ErrorFileDirectory=$ERRORFILEDIR \
        -param ScriptFileDirectory=$SCRIPTFILEDIR \
        -wait -jobstatus $DSPROJECT $pDSjobname.$pINVOKE_ID
    # Capture the final status and the log summary of this invocation
    $DSBINDIR/dsjob -jobinfo $DSPROJECT $pDSjobname.$pINVOKE_ID > $LOGFILEDIR/$pJobStatusFile
    $DSBINDIR/dsjob -logsum $DSPROJECT $pDSjobname.$pINVOKE_ID > $LOGFILEDIR/$pJobLogFile
    JOBSTATUS=`grep "Job Status" $LOGFILEDIR/$pJobStatusFile | cut -f2 -d '(' | cut -f1 -d ')'`
    echo "$pDSjobname Job status code is "$JOBSTATUS
    if [[ $JOBSTATUS = 1 ]]
    then
        echo "Job Status = "$JOBSTATUS
        echo "Removing File from $srcfiledir on successful processing of file $pfilename"
        cd $srcfiledir
        rm -f $pfilename
    else
        exit $ERROR_EXIT_STATUS
    fi
done
echo 99 END OF THE PROG
exit $SUCCESS_EXIT_STATUS
####################################################################################
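The script reads the job status by pulling the number out of the parentheses on the "Job Status" line with two cut commands. The sketch below isolates that step. The sample lines are an assumed format for the status output, and the helper names extract_status and describe_status are hypothetical; only the grep and cut pipeline is taken from the script.

```shell
#!/bin/sh
# Sketch: extract the numeric status code from a "Job Status" line.
# The sample lines and helper names are assumptions for illustration.

extract_status() {
    # $1 = text containing a line such as "Job Status : RUN OK (1)"
    echo "$1" | grep "Job Status" | cut -f2 -d '(' | cut -f1 -d ')'
}

describe_status() {
    # Mirrors the two codes the script tests for.
    case "$1" in
        1) echo "finished, source file can be removed" ;;
        3) echo "needs RESET before the next run" ;;
        *) echo "other status" ;;
    esac
}

ok_code=`extract_status "Job Status : RUN OK (1)"`
failed_code=`extract_status "Job Status : RUN FAILED (3)"`

echo "$ok_code: `describe_status "$ok_code"`"
echo "$failed_code: `describe_status "$failed_code"`"
```

The first cut keeps everything after the opening parenthesis and the second keeps everything before the closing one, leaving only the code.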

Standards for ETL UNIX Shell Scripts for use with PowerCenter 7.1.3
Naveen 10 Apr 2011

Scripting Standards (for PowerCenter version 7.1.3):
Scripting standards include the use of a UNIX shell script, which the scheduling tool uses to start the PowerCenter job, and a separate file which contains the username and password for the user called in the script. The following is a template that should be used to create scripts for new scheduled jobs. This script has been provided as an example and is named etl_unix_shell_script.sh. Following is the script and explanation.

# Program: etl_unix_shell_script.sh
# Author: Kevin Gillenwater
# Date: 7/24/2003
# Purpose: Sample UNIX shell script to load environment variables
#          needed to run PowerMart jobs, pass username and password variables
#          and start the job using the pmcmd command line.
#
# $1 = Project Id Parameter (ie. ud, dss, etc.)
#
# Example usage: etl_unix_shell_script.sh dss
#
# NOTE: Enter the Project ID parameter that is designated in the
#       directory structure for your team
#       (ie. dss would be used for the DWDAS team as the
#       directory is /usr/local/autopca/dss/)
#-----------------------------------------------------------------
# Call the script to set up the Informatica Environment Variables:
#-----------------------------------------------------------------
. /usr/local/bin/set_pm_var.sh
#-----------------------------------------------------------------
# Read ETL configuration parameters from a separate file:
#-----------------------------------------------------------------
ETL_CONFIG_FILE=$JOBBASE/$1/remote_accts/test_etl.config
ETL_USER=`grep ETL_USER $ETL_CONFIG_FILE | awk -F: '{print $2}'`
ETL_PWD=`grep ETL_PWD $ETL_CONFIG_FILE | awk -F: '{print $2}'`
#-----------------------------------------------------------------
# Start the job
#-----------------------------------------------------------------
$PM_HOME/pmcmd startworkflow -u $ETL_USER -p $ETL_PWD -s $MACHINE:4001 -f DWDAS_LOAD_dssqa -wait s_m_CENTER_INSTITUTE
#-----------------------------------------------------------------
# Trap the return code
#-----------------------------------------------------------------
rc=$?
if [[ $rc -ne 0 ]]
then
    exit $rc
fi

Notes Regarding the Script/Standards:
1. The beginning of each script should call and execute the script set_pm_var.sh to set up the PowerCenter variables used in the session (. /usr/local/bin/set_pm_var.sh). The set_pm_var.sh script, located in /usr/local/bin/ on each machine (Leisure for Production and Glance for Development), will provide ease of maintenance if changes need to be made to PowerCenter and will provide one source for the scripts on a machine. The following is the code in the script. You will not need to do anything with this script; it is for informational purposes only:

# Program: set_pm_var.sh
# Author: Kevin Gillenwater
# Date: 7/3/2003
# Purpose: UNIX script which sets the variables for running PowerMart 6.2
#          when called from a shell script (ie. script run by Autosys)
#-------------------------------------------
#Set up the Informatica Variables
#-------------------------------------------
export MACHINE=`hostname`
export SHLIB_PATH=/usr/local/pmserver/informatica/pm/infoserver
export PM_HOME=/usr/local/pmserver/informatica/pm/infoserver
#---------------------------------------------------------------------
# Set the environment variables needed for scheduling jobs.
# The value of JOBBASE differs based on the account. For AUTOSYS
# and AUTODBA, the variable should evaluate to /usr/local/autopca.
# For all other accounts, it should evaluate to their $HOME variable.
#---------------------------------------------------------------------
case $HOME in
/home/autopca/autopca) JOBBASE=/usr/local/autopca ;;
/home/autopca/autodba) JOBBASE=/usr/local/autopca ;;
*) JOBBASE=$HOME ;;
esac
export JOBBASE

2. The second section of etl_unix_shell_script.sh sets up ETL parameters for username and password usage by PowerCenter when running the workflow. The username and password are no longer stored as part of the main shell script. In the script example, the filename test_etl_config contains the username and password to be used when running the workflow. Following is the contents of test_etl_config:
ETL_USER:etlguy
ETL_PWD:ou812
This file (your password file) must be located in your team directory under the remote_accts folder (i.e. /usr/local/autopca/dss/remote_accts/). The permissions should be 6-4-0 (rw-r-----) on this file. The folder permissions on the Production machine will only permit PCA's access to this folder.
3. The third section of etl_unix_shell_script.sh contains the command to run the workflow in PowerCenter. Please follow the script exactly. The only thing that will need to be changed in this section is the folder name and the workflow within the folder that is being executed (i.e. DWDAS_LOAD_dssqa wflw_m_CENTER_INSTITUTE).
4. The final section of etl_unix_shell_script.sh contains code to trap the return code of the script that indicates success or failure.

*8/27/03 - Added '-wait' parameter to the pmcmd command line to start the scheduled job.
*10/6/04 - Updated to generalize the scheduling tool used to run the shell scripts, as UC4 has been chosen to replace Autosys as the scheduling tool used to run PowerMart workflows.

Regular Expressions
Naveen 23 Apr 2011

What Are Regular Expressions?
A regular expression is a pattern template you define that a Linux utility uses to filter text. A Linux utility (such as the sed editor or the gawk program) matches the regular expression pattern against data as that data flows into the utility. If the data matches the pattern, it's accepted for processing. If the data doesn't match the pattern, it's rejected. The regular expression pattern makes use of wildcard characters to represent one or more characters in the data stream.

Types of regular expressions:

There are two popular regular expression engines:
  The POSIX Basic Regular Expression (BRE) engine
  The POSIX Extended Regular Expression (ERE) engine

Defining BRE Patterns:
The most basic BRE pattern is matching text characters in a data stream.

Eg 1: Plain text
$ echo "This is a test" | sed -n '/test/p'
This is a test
$ echo "This is a test" | sed -n '/trial/p'
$
$ echo "This is a test" | gawk '/test/{print $0}'
This is a test
$ echo "This is a test" | gawk '/trial/{print $0}'
$

Eg 2: Special characters
The special characters recognized by regular expressions are:
.*[]^${}\+?|()
For example, if you want to search for a dollar sign in your text, just precede it with a backslash character:
$ cat data2
The cost is $4.00
$ sed -n '/\$/p' data2
The cost is $4.00
$
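The two examples above can be combined into one small script. This is only a sketch: bre_filter is a hypothetical helper name that is not part of the original text, and it assumes the pattern contains no slash, because the slash is used as the sed delimiter.

```shell
#!/bin/sh
# bre_filter (hypothetical helper): print only the lines of standard
# input that match the basic regular expression given as $1.
# Assumption: the pattern contains no "/" character.
bre_filter() {
    # -n turns off automatic printing; the p command prints matching lines.
    sed -n "/$1/p"
}

# Plain text that matches: the line is printed.
echo "This is a test" | bre_filter 'test'

# Plain text that does not match: nothing is printed.
echo "This is a test" | bre_filter 'trial'

# Escaped special character: \$ stands for a literal dollar sign,
# so only the first of the two lines is printed.
printf '%s\n' 'The cost is $4.00' 'No price here' | bre_filter '\$'
```

The pattern is passed in single quotes so that the shell leaves the backslash and dollar sign alone and sed receives them unchanged.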

Eg 3: Looking for the ending
The dollar sign ($) special character defines the end anchor.
$ echo "This is a good book" | sed -n '/book$/p'
This is a good book
$ echo "This book is good" | sed -n '/book$/p'
$

Eg 4: Using ranges
You can use a range of characters within a character class by using the dash symbol. Now you can simplify the zip code example by specifying a range of digits:
$ sed -n '/^[0-9][0-9][0-9][0-9][0-9]$/p' data8
60633
46201
45902
$

Extended Regular Expressions:
The POSIX ERE patterns include a few additional symbols that are used by some Linux applications and utilities. The gawk program recognizes the ERE patterns, but the sed editor doesn't by default.

Eg 1: The question mark
The question mark indicates that the preceding character can appear zero or one time, but that's all. It doesn't match repeating occurrences of the character:
$ echo "bt" | gawk '/be?t/{print $0}'
bt
$ echo "bet" | gawk '/be?t/{print $0}'
bet
$ echo "beet" | gawk '/be?t/{print $0}'
$
$ echo "beeet" | gawk '/be?t/{print $0}'
$

Eg 2: The plus sign
The plus sign indicates that the preceding character can appear one or more times, but must be present at least once. The pattern doesn't match if the character is not present:
$ echo "beeet" | gawk '/be+t/{print $0}'
beeet
$ echo "beet" | gawk '/be+t/{print $0}'
beet
$ echo "bet" | gawk '/be+t/{print $0}'
bet
$ echo "bt" | gawk '/be+t/{print $0}'
$

Eg 3: The pipe symbol
The pipe symbol allows you to specify two or more patterns that the regular expression engine uses in a logical OR formula when examining the data stream. If any of the patterns match the data stream text, the text passes. If none of the patterns match, the data stream text fails. The format for using the pipe symbol is:
expr1|expr2|...
Here's an example of this:
$ echo "The cat is asleep" | gawk '/cat|dog/{print $0}'
The cat is asleep
$ echo "The dog is asleep" | gawk '/cat|dog/{print $0}'
The dog is asleep
$ echo "The sheep is asleep" | gawk '/cat|dog/{print $0}'
$

Eg 4: Grouping expressions
When you group a regular expression pattern by enclosing it in parentheses, the group is treated like a standard character. You can apply a special character to the group just as you would to a regular character. For example:
$ echo "Sat" | gawk '/Sat(urday)?/{print $0}'
Sat
$ echo "Saturday" | gawk '/Sat(urday)?/{print $0}'
Saturday
$

Automated FTP File Transfer
Naveen 22 Apr 2011

Automated FTP File Transfer: You can use a here document to script an FTP file transfer. The basic idea is shown here.

ftp -i -v -n wilma <<END_FTP
user randy mypassword
binary
lcd /scripts/download
cd /scripts
get auto_ftp_xfer.ksh
bye
END_FTP
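The here document mechanism itself does not depend on ftp, so it can be tried without a server. The sketch below uses cat as a stand-in for ftp and simply prints the command stream that would be sent. The function names and the two variables are assumptions made for this illustration, not part of the original script.

```shell
#!/bin/sh
# Hypothetical values, for illustration only.
REMOTE_DIR=/scripts
FILE=auto_ftp_xfer.ksh

# build_ftp_commands (hypothetical name): print the lines a here document
# would feed to ftp. With an unquoted delimiter, the shell expands
# variables inside the here document before the command reads it.
build_ftp_commands() {
    cat <<END_FTP
binary
cd $REMOTE_DIR
get $FILE
bye
END_FTP
}

# show_literal (hypothetical name): quoting the delimiter turns expansion
# off, so the text is passed through exactly as written.
show_literal() {
    cat <<'END_TEXT'
cd $REMOTE_DIR
END_TEXT
}

build_ftp_commands
show_literal
```

Replacing cat with the ftp command line from the example above would send the same lines to the FTP client instead of the screen.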

Process a File line by line
Naveen 22 Apr 2011

There are numerous ways to process a file line by line; one such method is discussed here.

Method 1: Let's start with the most common method that I see, which is catting a file and piping the file output to a while read loop. On each loop iteration a single line of text is read into a variable named LINE. This continuous loop will run until all of the lines in the file have been processed one at a time.

The pipe is the key to the popularity of this method. It is intuitively obvious that the output from the previous command in the pipe is used as input to the next command in the pipe. As an example, if I execute the df command to list file system statistics and it scrolls across the screen out of view, I can use a pipe to send the output to the more command, as in the following command:

df | more

When the df command is executed, the pipe passes its output to the more command as input, allowing me to view the df command output one page/line at a time. Our use of piping output to a while loop works the same way: the output of the cat command is used as input to the while loop and is read into the LINE variable on each loop iteration. Look at the complete function in Listing 2.2.

function while_read_LINE
{
cat $FILENAME | while read LINE
do
    echo "$LINE"
    :
done
}

Each of these test loops is created as a function so that we can time each method using the shell script. You could also use the () C-type function definition if you wanted, as shown in Listing 2.3.

while_read_LINE ()
{
cat $FILENAME | while read LINE
do
    echo "$LINE"
    :
done
}

Whether you use the function or () technique, you get the same result. I tend to use the function method more often so that when someone edits the script they will know the block of code is a function. For beginners, the word "function" helps a lot in understanding the whole shell script. The $FILENAME variable is set in the main body of the shell script.

Within the while loop notice that I added the no-op (:) after the echo statement. A no-op (:) does nothing, but it always has a 0 (zero) return code. I use the no-op only as a placeholder so that you can cut the function code out and paste it in one of your scripts. If you should remove the echo statement and leave the no-op, the while loop will not fail; however, the loop will not do anything either.
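To try the loop end to end, the function needs a main body that sets FILENAME. The sketch below is one way to do that, under stated assumptions: the temporary file and its three lines are made up for the illustration, and the second function, which redirects the file into the loop instead of using cat, is a common variant that does not come from the text above.

```shell
#!/bin/sh
# Main body: FILENAME is set here, as described in the text.
# The temporary file and its contents are invented for this sketch.
FILENAME=$(mktemp)
trap 'rm -f "$FILENAME"' EXIT
printf '%s\n' 'first line' 'second line' 'third line' > "$FILENAME"

# Method from the text: cat the file and pipe it into a while read loop.
while_read_LINE() {
    cat "$FILENAME" | while read LINE
    do
        echo "$LINE"
        :    # no-op placeholder; does nothing and returns 0
    done
}

# Variant (an assumption, not from the text): redirect the file into
# the loop, which avoids starting a separate cat process.
while_read_LINE_redirect() {
    while read LINE
    do
        echo "$LINE"
    done < "$FILENAME"
}

while_read_LINE
while_read_LINE_redirect
```

Both functions print the same three lines. If the input may contain backslashes or leading spaces that must be kept, reading with IFS= read -r LINE is the usual adjustment.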