Introduction to Shell Scripting

Mar 2011
A shell script is a script written for the shell, or command-line interpreter, of an operating
system. It is often considered a simple domain-specific programming language. Typical
operations performed by shell scripts include file manipulation, program execution, and printing
text.
Types of shells: There are three prominent shells that are widely used in industry.
1. Bourne Shell :
The Bourne shell, or sh, was the default Unix shell of Unix Version 7, and replaced the
Thompson shell, whose executable file had the same name, sh. It was developed by Stephen
Bourne, of AT&T Bell Laboratories, and was released in 1977 in the Version 7 Unix release
distributed to colleges and universities. It remains a popular default shell for Unix accounts. The
binary program of the Bourne shell or a compatible program is located at /bin/sh on most Unix
systems, and is still the default shell for the root superuser on many current Unix
implementations.
Features of the Bourne shell include:

Scripts can be invoked as commands by using their filename.

May be used interactively or non-interactively.

Allows both synchronous and asynchronous execution of commands.

Supports input and output redirection and pipelines.

Provides a set of built-in commands.

Provides flow control constructs, quotation facilities, and functions.

Typeless variables.

Provides local and global variable scope.

Scripts do not require compilation before execution.

Does not have a goto facility, so code restructuring may be necessary.

Command substitution using back quotes: `command`.

Here documents using << to embed a block of input text within a script.

"for ~ do ~ done" loops, in particular the use of $* to loop over arguments.

"case ~ in ~ esac" selection mechanism, primarily intended to assist argument
parsing.

Strong provisions for controlling input and output, and strong expression
matching facilities.

Support for environment variables using keyword parameters and exportable
variables.
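The following is a minimal sketch of four of the features listed above (a for loop over the arguments, case selection, back-quote command substitution, and a here document). The function name classify_args and the sample arguments are hypothetical, and the loop uses "$@" rather than $* so that arguments containing spaces stay intact.

```shell
#!/bin/sh
# Hypothetical helper: classify each argument with a case statement.
classify_args() {
    for arg in "$@"; do
        case "$arg" in
            -*)    echo "option: $arg" ;;
            *.txt) echo "text file: $arg" ;;
            *)     echo "other: $arg" ;;
        esac
    done
}

# Command substitution with back quotes captures the output of a pipeline.
# tr strips the padding that some wc implementations add.
arg_count=`classify_args -v notes.txt data | wc -l | tr -d ' '`
echo "classified $arg_count arguments"

# A here document feeds an embedded block of text to a command.
cat <<EOF
first line of embedded input
second line of embedded input
EOF
```

Because the script needs no compilation, it can be saved to a file, made executable with chmod, and invoked by its filename.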

2. C Shell :
The C shell (csh or the improved version, tcsh, on most machines) is a Unix shell that was
created by Bill Joy while a graduate student at the University of California, Berkeley, in the late
1970s.
The C shell is a command processor that's typically run in a text window, allowing the user to
type commands which cause actions. The C shell can also read commands from a file, called a
script. Like all Unix shells, it supports filename wildcarding, piping, here documents, command
substitution, variables and control structures for condition-testing and iteration. What
differentiated the C shell, especially in the 1980s, were its interactive features and overall style.
Its new features made it easier and faster to use. The overall style of the language looked more
like C and was seen as more readable.
Features of C Shell :
Some of the features of the C shell are listed here:

Customizable environment.
Abbreviate commands. (Aliases.)

History. (Remembers commands typed before.)

Job control. (Run programs in the background or foreground.)

Shell scripting. (One can write programs using the shell.)

Keyboard shortcuts.

3. Korn Shell :
The Korn shell (ksh) is a Unix shell which was developed by David Korn (AT&T Bell
Laboratories) in the early 1980s and announced at the Toronto USENIX on July 14, 1983.
ksh is backwards-compatible with the Bourne shell and includes many features of the C shell as
well, such as a command history, which was inspired by the requests of Bell Labs users.

The Korn shell's major new features include:

Command-line editing, allowing you to use vi- or emacs-style editing commands on
your command lines.

Integrated programming features: the functionality of several external UNIX
commands, including test, expr, getopt, and echo, has been integrated into the shell
itself, enabling common programming tasks to be done more cleanly and without creating
extra processes.

Control structures, especially the select construct, which enables easy menu
generation.

Debugging primitives that make it possible to write tools that help programmers debug
their shell code.

Regular expressions, well known to users of UNIX utilities like grep and awk, have
been added to the standard set of filename wildcards and to the shell variable facility.

Advanced I/O features, including the ability to do two-way communication with
concurrent processes (coroutines).

New options and variables that give you more ways to customize your environment.

Increased speed of shell code execution.

Security features that help protect against "Trojan horses" and other types of break-in
schemes.
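As a rough illustration of the integrated programming features mentioned above, the sketch below parses options with the getopts built-in and does arithmetic inside the shell instead of calling the external expr command. The function name parse_opts and the -v and -n options are hypothetical, and the code sticks to syntax that ksh shares with other POSIX shells.

```shell
#!/bin/sh
# Hypothetical function: parse -v and -n <count> with the getopts built-in.
parse_opts() {
    verbose=0
    count=1
    OPTIND=1                      # restart option parsing on every call
    while getopts "vn:" opt; do
        case "$opt" in
            v) verbose=1 ;;
            n) count=$OPTARG ;;
            *) echo "usage: parse_opts [-v] [-n count]" >&2; return 2 ;;
        esac
    done
    # Built-in arithmetic; the external equivalent would be: expr $count \* 2
    doubled=$(( $count * 2 ))
    echo "verbose=$verbose count=$count doubled=$doubled"
}

parse_opts -v -n 4
```

Nothing in the function starts an extra process, which is the point the feature list makes about speed and cleanliness.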

Why do we Use Shell Scripting
Mar 2011
We use shell scripts for the following purposes:
1. Customizing your work environment.
2. Automating your daily tasks.
3. Automating repetitive tasks.
4. Executing important procedures such as shutting down the system, formatting a disk, creating a file
system on it, mounting the file system, letting the users use the floppy, and finally unmounting the
disk.
5. Performing the same operation on many files.

When Not to Use Shell Scripting:
1. The task is too complex, such as writing an entire billing system.
2. The task requires a high degree of efficiency.
3. The task requires a variety of software tools.

Usage of Shell Scripts within Informatica:
1. To run a batch via pmcmd.
2. To add a header and footer when writing to a flat file.
3. To run the command task.
4. To update the parameter file with the session start time and end time.

Basic Commands Part1
Naveen 24 Mar 2011

Command / Option              Description

uname
uname -r                      Find the version of the Unix OS.
uname -n                      Find the machine name.

man                           Print the user manual for any command.
man grep                      Find the user manual for the grep command.
man -e grep                   One-line definition of grep usage.
man -k copy                   List the different options available for copying. Helpful when we do
                              not know which command will solve the purpose.

cal                           Print the calendar for a year or a month.
cal 2009                      Print the calendar for the year 2009.
cal dec                       Print the calendar for December of the current year.

date                          Print the current date and time, e.g. Tue Mar 09 07:10:10 IST 2009.
date +%m                      Month number.
date +%h                      Month name.
date +'%h%m'                  Month name and month number together.
date +%d                      Day of the month.
date +%y                      Two-digit year.
                              %H, %M and %S print the hour, minute and second.

who                           Login details of all the users in the system.
who am I                      Login details for only myself.
who -H                        Login details for all users, with header information.
who -u                        Login details for only active users.

passwd                        Change the password.

lock -45                      Lock the terminal for 45 minutes.

bc                            Use the calculator.

ls                            Print the contents of the current directory.
ls -x                         Print multi-column output of files with their names.
ls -F                         Mark executables and directories: * indicates files containing
                              executable code and / indicates directories.
ls -a                         Print all files including hidden files. a stands for all.
ls -r                         Reverse the sort order of the file names.
ls -x abc                     Print the content of the abc directory.
ls -R                         Print all files and subdirectories in the directory tree.
ls -ld                        Print directory entries themselves rather than their contents.
ls -il testfile               Print the inode number of the file.
ls -lt                        Print all files with their last modification time, latest modified at the top.
ls -ltr                       Print all files with their last modification time in reverse order,
                              latest modified at the bottom.
ls -lut                       Print the last access time of files.

cat testfile                  Print the content of any file.
cat > testfile                Create a new file with the name testfile.

cp                            Copy files and directories.
cp -i testfile abc            Copy testfile to the abc directory interactively.
cp -r abc newabc              Copy the entire directory structure abc to newabc recursively.

rm                            All the options of the cp command are applicable to the rm command.
rm -f newabc                  Remove newabc forcefully.
rm -r newabc                  Recursive deletion. It will not remove write-protected files.

mv testfile newfile           Rename testfile to the new name newfile.

lp                            Used for printing files.

split                         Split a big file into small files of 1000 lines each (the default).
split testfile                Split the file into small files.
split -72 testfile            Change the default split size to 72 lines.
split testfile newfile        Split testfile into small files whose names start with newfile.

cmp, comm, diff               These three commands are used to find the difference between two files.

wc -w testfile                Print the number of words.
wc -c testfile                Print the number of characters.
wc -l testfile                Print the number of lines.
wc testfile                   Print the number of lines, words and characters in the file.

PATH=$PATH:/home/gagan:/home/deep
                              Set the PATH variable so that the shell uses these directories to
                              locate executable commands.

stty -a                       Print the current settings of the terminal. The settings also
                              control which key deletes a character when backspacing.
stty intr \^c                 Change the interrupt key to Ctrl c instead of the default DELETE.
stty eof \^a                  Change the end-of-input key, as used when creating a file with cat,
                              from Ctrl d to Ctrl a.

alias                         Print all the aliases set in the system.
alias l='ls -ltr'             Create a shorthand name for a command.
alias showdir='cd $1; ls -l'  Passing the positional parameter ETL to showdir is meant to take us
                              to that directory. (In most shells an alias does not receive its
                              own arguments; a function is the more reliable way to do this.)
unalias                       Unset an alias so that it can be redefined.

history                       Find the previous commands that have been used. Every command has
                              an event number.
history -5                    Print the last 5 commands that have been used.
history 10 15                 Display the events numbered between 10 and 15.
HISTSIZE=1200                 Change the default setting of command history saving.
r                             Repeat the previous command in Korn.
!!                            Repeat the previous command in Bash.
r -2                          Repeat the command before the previous one.
r 20                          Repeat the command with event number 20.

cd -                          Switch between the current working directory and the most recently
                              used directory.

Tilde (~)
cd ~                          Change to the home directory. The value of the home directory can be
                              seen in the $HOME environment variable.
cd ~a_cmdb                    Change to the home directory of user a_cmdb.

chmod                         Change the permission of a file for user <u>, group <g>, others <o>
                              or all <a>, with + to assign, - to remove and = to set an absolute
                              permission. r is read <4>, w is write <2> and x is execute <1>.
chmod u+rwx newfile
chmod g+x newfile
chmod 757 newfile
chmod 457 newfile
chmod -R a+x abc              Recursively give execute permission to all users on every directory
                              and subdirectory in the abc directory.
chmod -R 001 .                The dot indicates the current directory.

Basic Commands Part2
Naveen 24 Mar 2011

Command / Option              Description

head newfile                  Display the top of the file (the top 10 lines when used without an option).
head -5 newfile               Display the top 5 lines of newfile.
head -1 newfile | wc -c       Count the number of characters present in the first line.
vi `ls -t | head -1`          Open the last modified file for editing.
head -c 512 newfile           Pick a specific number of characters from the file.
head -c 1b newfile            Print the first 512-byte block.
head -c 2m newfile            Print the first 2 blocks (1 MB each).

tail                          Display the end of the file.
tail -5                       Display the last 5 lines of the file.
tail +10                      Start displaying from line 10 onward till the end.
tail -f 6_load.log            Print the growth of the file.

$$                            Prints the process id of the current shell.

ps                            Print the processes associated with the current user.
ps -f                         Print the processes associated with the current user with their
                              hierarchy. f stands for full.
ps -f -u a_cmdb               Print all the processes associated with the a_cmdb user.
ps -a                         Print all the users' processes.
ps -e                         Print system processes.

Background jobs               Ending a command with & runs the job in the background and prints its
                              process id. The shell becomes the parent of all background processes.
jobs                          Print all the jobs running in the background with their sequence
                              number of execution.
fg                            Bring the most recently started background process to the
                              foreground (last in, first out).
fg %1                         Bring the first job into the foreground.

nohup                         The shell is the parent of all background jobs. It dies when the user
                              logs out, and ultimately all the child processes also die. To keep
                              them alive after the user logs out, run the background process
                              using the nohup command.
nohup ps -eaf | grep 'ksh' &  Print all the ksh processes running in the system by first listing
                              all the processes. The system process init, with PID 1, is the
                              parent of all shell processes: when the user logs in, this PID
                              becomes the PPID of the shell process. After a process is run in
                              nohup mode, the kernel reassigns the PPID of the ps process to the
                              PID of the system process, which will not die even after the shell
                              dies through logging out.

kill                          Kill a command.
kill 520                      Kill the process with process id 520.
kill 640                      Kill the parent process id, which in turn kills all the child processes.
kill 1                        Killing the system process init is not possible.
kill $!                       No need to remember the process id of the last background process;
                              it is in $!.
kill 0                        Kill all processes in the system except the login shell process.
                              $$ stores the PID of the current login shell.

nice who | wc -l &            Run the command with low priority. Nice values range from 1 to 39;
                              the higher the nice value, the lower the priority. The default
                              nice value is 20.
nice -n 10 who | wc -l &      The nice value becomes 30.

at 2:10 load.ksh              Run the job at a specified time of the day. load.ksh will execute
                              today at 2:10.

batch load.ksh                The batch command executes the process whenever the CPU is free.

cron                          cron executes processes at regular intervals. It keeps checking the
                              control file at /usr/spool/cron/crontabs for jobs.

finger                        Produces a list of all logged-in users when run without any
                              argument. The 2nd and last fields of the output are taken from the
                              passwd file in the /etc directory.
finger a_cmdb                 Print details about the a_cmdb user.

set 10 20 30                  Set the positional parameters. This will set $1, $2, $3, $* and $#.
set -x                        When used at the start of a ksh script, it echoes each statement at
                              the terminal.

shift                         Transfer the contents of each positional parameter to the next lower
                              number (shift one position).
shift 2                       Shift 2 places.

chown                         Changing the owner of a file can only be done by the owner or the
                              sys admin.
chown jaspreet testfile       Change the owner of testfile from gagan to jaspreet. Now gagan
                              cannot change the ownership of testfile; what he can do is create
                              a copy of the file, and he then becomes the owner of the copy.
chown -R jaspreet .           Recursively change the owner of all files under the current directory.

chgrp                         Changing the group of a file can only be done by the owner or the
                              sys admin.
chgrp GTI testfile            The group is changed to GTI from IB. As the file's user is still the
                              same, he can change the group of the file back to IB.
chgrp -R GTI .                Recursively change the group of all files under the current directory.

touch                         Change the time stamps of a file. Without -m or -a, touch changes
                              both the modification time and the access time. The time is given
                              as month, date, hours and minutes; current systems expect it
                              after the -t option.
touch 03161430 testfile       Change both the modification time and the access time.
touch -m 04161430 testfile    Change only the modification time.
touch -a 05161430 testfile    Change only the access time.

ln 6_load.ksh 7_load.ksh      Link two files. Both files will have the same inode number, and
                              changes in one will be reflected in the other.
ls -li 6_load.ksh 7_load.ksh  Shows the same inode number for both names.
rm 6_load.ksh                 Drop the link between the two files.

df                            Print the amount of free space available on the disk.
df -t /home/oracle            Print the free and total space under the oracle file system.

du                            Print the disk usage of a specific directory tree. By default du
                              prints the disk usage for each subdirectory.
du -s                         Print a summary of the whole directory tree.
du -s /home/*                 Print the disk usage for every user.
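The set and shift entries are easier to follow when run. The sketch below reuses the values 10 20 30 from the set entry; the function name show_shift is hypothetical and only exists so that the positional parameters of the calling shell are left alone.

```shell
#!/bin/sh
# Hypothetical demo function for set, shift, $#, $1 and $*.
show_shift() {
    set 10 20 30            # $1=10 $2=20 $3=30
    echo "before: count=$# first=$1 all=$*"
    shift                   # drop $1; the old $2 becomes $1
    echo "after shift: count=$# first=$1 all=$*"
    shift 2                 # drop two more parameters
    echo "after shift 2: count=$#"
}

show_shift
```

Each shift lowers $# by the number of positions shifted, which is why loops that consume arguments one at a time usually test $# before shifting again.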

Zipping Files
Naveen 24 Mar 2011

Command / Option              Description

gzip etl_code                 Produces etl_code.gz.
gunzip etl_code.gz            Restores etl_code.
zip file.zip file*sql         Produces file.zip from the files matching file*sql.
unzip file.zip                Restores the files matching file*sql.

tar -cvf                      Create a backup of files recursively.
                              eg: tar -cvf /home/gagan/sqlbackup ./*.sql
tar -xvf                      Files are restored using the -x option.
                              eg: tar -xvf /home/gagan/sqlbackup

cut
-c <column start and end no>  c stands for column cut (by specifying position).
                              eg: cut -c -5,6-12 testfile
-f <field start and end no>   f stands for field cut (the default delimiter is tab).
                              eg: cut -f 1,5 testfile
-d <delimiter> -f <fields>    eg: cut -d "|" -f 1,5 testfile > newfile
                              Cut fields 1 and 5 and write the output to a new file.

sort testfile                 By default the sorting starts with the first character of each line.
                              Priority: 1. white space 2. numerals 3. uppercase letters
                              4. lowercase letters.
-t                            Delimiter to distinguish between the start and end of a field,
                              overriding the default (tabs).
                              eg: sort -t "|" +2 testfile
                              Sorting starts from the 3rd field, skipping the first two fields.
-r                            eg: sort -t \| -r +2 testfile
                              Reverse sort starting with the 3rd field.
                              eg: sort -t \| +2r testfile
                              The above command written in another way.
                              eg: sort -t "|" +1 -2 +3 <>
                              Sorting based on different fields, as in an order by clause.
                              Sorting starts with the 2nd field and then continues with the
                              4th field: -2 stops the first key after the 2nd field and +3
                              resumes with the 4th.
-n                            eg: sort -n <>   Numeric sort.
-u                            eg: sort -u <>   Unique sort.
-o                            eg: sort -o abc.txt abc_sort.txt   Save the sorted data in a file.
                              The +N and -N field positions are the older notation; current
                              versions of sort express the same keys with -k (for example -k3
                              for "start at the 3rd field").

paste
-d                            Delimiter, overriding the default (tabs).
                              eg: paste -d "|" <> <>

tr                            eg: tr '|\\' '~-' < test.txt
                              Translate every | to ~ and every \ to -.
                              eg: tr '[a-z]' '[A-Z]' < test.txt
                              Translate to upper case.
-d                            Delete all occurrences of |.
                              eg: tr -d '|' < test.txt

uniq                          uniq requires a sorted file as input.
-u                            Select only the records that are not duplicated.
                              eg: cut -d "|" -f3 <> | sort | uniq -u
-d                            Select only the duplicated records.
                              eg: cut -d "|" -f3 <> | sort | uniq -d
-c                            Duplicate count.
                              eg: cut -d "|" -f3 <> | sort | uniq -c

Change date
date 09181754

wall
wall -g dba "hello"           Selectively send a message to the dba group.

shutdown
shutdown -g2                  Power down after 2 minutes.
shutdown -y -g0               Immediate shutdown.
shutdown -y -g0 -i6           Shut down and reboot (init level 6).
shutdown -r now               Shut down immediately and reboot.
shutdown 17:30                Shut down at 17:30.

Changing time stamp
touch mon date hrs mins <file>
touch -m 01290430 <file>      Sets the time of last modification (shown by ls -lt).
touch -a 01290415 <file>      Sets the time of last access (shown by ls -lu).

du                            Disk usage.
du /home/expimp/create_db     Tree output for each directory inside.
du -s /home/expimp/create_db  Summary.

find                          find <loc> <option> <pattern> <action>
                              For example, search from the root directory for the file emp.lst.
-mtime                        Modification time.
                              eg: find . -mtime -2 -print
                              Find files modified less than 2 days ago.
-atime                        Access time.
                              eg: find . -atime +365 -print
                              Find files not accessed in the last year.
-type                         eg: find / -name log -type f -print
                              f for file and d for directory.
-a (and), -o (or)             eg: find . \( -name "*.sh" -o -name "*.lst" \) -print
                              The double quotes are necessary.
! -newer                      eg: find / -name "*.pl" ! -newer last_backup -print
                              Files modified before last_backup.
-size                         eg: find . -size +2048 -print
                              Files greater than the given number of blocks.
-prune                        Do not descend into the matched directory.
                              eg: find . -name "*.log" -prune
-exec                         eg: find . -atime +180 -exec rm -f {} \;
                              Remove the matching files.
-ok                           eg: find . -atime +180 -ok rm -f {} \;
                              Prompt for confirmation before removing.

xargs                         eg: find . -mtime +20 | xargs rm -f
                              Remove the files which have not been modified for the last 20 days.
                              rm will be executed only once, for all the files.
xargs -n -p -t                eg: find . -mtime +20 | xargs -n20 -p -t rm -f
                              Remove at most 20 files per batch, in interactive mode, printing
                              each command.
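The cut, sort and uniq entries above are usually chained. The sketch below shows one such pipeline on made-up data; the rows, the function names and the name|department|city layout are all assumptions for illustration. The second function uses the -k key notation in place of the older +N form.

```shell
#!/bin/sh
# Hypothetical sample rows in the form name|department|city.
sample_rows() {
    cat <<'EOF'
gagan|sales|delhi
deep|etl|pune
jaspreet|etl|delhi
naveen|etl|pune
EOF
}

# Count rows per department: cut field 2, sort it, then count duplicates.
# awk only tidies the "count value" layout that uniq -c prints.
dept_counts() {
    sample_rows | cut -d "|" -f2 | sort | uniq -c | awk '{print $2 "=" $1}'
}

# Sort by city (field 3), then by name (field 1), like an order by clause.
rows_by_city() {
    sample_rows | sort -t "|" -k3,3 -k1,1
}

dept_counts
rows_by_city
```

The sort before uniq matters: uniq only collapses adjacent duplicate lines, so unsorted input would be counted incorrectly.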

Pattern Searching
Naveen 24 Mar 2011

Command / Option              Description

Searching in vi
/                             eg: /Unix   Forward search of the keyword Unix in the file.
?                             eg: ?Unix   Backward search of the keyword Unix in the file.
n                             Repeat the last search.
.                             Repeat the previous command.
:1,$s/<search string>/<replace string>/g
                              Pattern search and replacement. 1,$ represents all lines in the
                              file. g stands for globally.
:3,10s/gagan/deep/g           Search between lines 3 and 10.
:.s/gagan/deep/g              Only the current line.
:$s/gagan/deep/g              Only the last line.
:$s/gagan/deep/gc             c asks for confirmation before each replacement.

grep
-c                            Count occurrences.
-n                            Display the line number of each record.
-l                            Display the files containing the record.
-v                            Skip records that contain the pattern.
-i                            Ignore case.
[PQR]                         Match any single character P, Q or R.
[c1-c2]                       Match a character within the ASCII range.
[^PQR]                        Match a single character which is not P, Q or R.
[a-zA-Z0-9]                   Match any single alphanumeric character.
^<pat>                        Beginning with the pattern.
<pat>$                        Ending with the pattern.
ls -l | grep "^d"             Print only directories.

egrep
+                             Matches one or more occurrences of the previous character.
                              eg: egrep '[aA]g+[ar][ar]wal' test1.txt
?                             Matches zero or one occurrence of the previous character.
                              eg: egrep '[aA]gg?[ar][ar]wal' test1.txt   (matches ag and agg)
exp1|exp2                     Match exp1 or exp2.
                              eg: egrep 'prashant|director' test1.txt
                              (finds lines with prashant or with director)
(x1|x2)x3                     Match x1x3 or x2x3.
                              eg: egrep '(das|sen)gupta' test1.txt   (like dasgupta and sengupta)
egrep -f <pattern_file_name> test1.txt
                              A huge list of patterns can be passed in the form of a file name,
                              with the pattern stored in the file, eg (prashant|admin|director).

fgrep                         fgrep and egrep accept multiple patterns, both from the command line
                              and from a file, but unlike grep and egrep, fgrep does not accept
                              regular expressions.
fgrep -f patternfile empfile  Faster than the grep and egrep family.

Pattern Matching
Naveen 25 Mar 2011

Command / Option              Description

*                             Matches any number of characters, including none.
ls -l chap*                   Matches all the files whose names start with chap.
ls -x chap*                   Matches all the files whose names start with chap and prints them
                              in multiple columns.
ls -l *                       * does not match files beginning with a dot <.>.
ls -l .???*                   The above problem can be solved by specifying the first few
                              characters explicitly using the meta character <?>.
?                             Matches a single character.
ls -l chap?                   Matches all the files with a 5-character name that starts with chap.
[ijk]                         Matches a single character, either i or j or k.
[!ijk]                        Matches a single character that is not i or j or k.
[x-z]                         Matches a single character within the ASCII range x to z.
[!x-z]                        Matches a single character that is not within the ASCII range x to z.
ls *.ist                      Print all the files which end with the ist extension.
mv * ../bin                   Move all the files to bin.
cp chap?? abc                 Copy all files whose names are chap followed by two characters to
                              the abc directory.
ls -l chap0[1-4]              Range specification is also available.
cmp chap[12]                  Compare the files chap1 and chap2.
ls -l [a-zA-Z]*               Matches all file names beginning with a letter, irrespective of case.
ls -l chap[!0-9]              Matches all file names beginning with chap and not ending with a number.
cat chap[!0-9]                Concatenate all the files beginning with chap and not ending with a number.

Escaping: Backslash (\)       For working with file names which contain meta characters.
ls -l chap*                   Prints all files whose names start with chap, not just the one
                              whose name is chap*.
ls -l chap\*                  The above problem can be solved by escaping the special character.

Pipe |                        Pass the standard output of one command as the standard input to another.
who | wc -l                   The output of the who command <three users> is passed as input to
                              wc, which counts the number of lines present.

Tee
who | tee userslist           tee saves the output of the who command in userslist and also
                              displays it.

Shell Variable                Shell variables are initialized to a null value by default, so an
                              unset variable returns null.
a=ab; b=cd; z=$a$b; echo $z   The shell concatenates the two variables.
echo '$10'                    Prints $10. All words starting with $ are considered variables
                              unless single quoted or escaped.
echo "$10"                    Prints 0. The shell looks for the $1 variable, which is undefined,
                              so it passes null for it. $1 is one of the positional parameters.
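The quoting rules for shell variables can be checked with a short sketch. The function name quoting_demo and the variable never_assigned are hypothetical; the function gives $1 a value (first) so that the "$10" case prints something visible, whereas with $1 undefined it prints just 0.

```shell
#!/bin/sh
# Hypothetical demo of concatenation, quoting and unset variables.
quoting_demo() {
    a=ab
    b=cd
    z=$a$b                    # adjacent expansions are concatenated
    echo "$z"
    set -- first              # give $1 a value for the demonstration
    echo '$10'                # single quotes keep the text literal
    echo "\$10"               # a backslash escapes the dollar sign
    echo "$10"                # read as $1 followed by the character 0
    echo "[$never_assigned]"  # an unset variable expands to nothing
}

quoting_demo
```

Writing ${10} with braces is how a script refers to the tenth positional parameter rather than to $1 followed by 0.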


Control Structures
Naveen 25 Mar 2011

Sample Unix shell Scripts
Naveen 10 Apr 2011

#ksh: JobNum.
####################################################################################
#This script will call .env file to get common
#Author: Prasad Degela
#Date: 03/13/2006
#Reviewed by By:
#Date:
#Project: Appeals & Grivances Tracking
####################################################################################
# SET PARAMETERS - Part 1
# These parameters are provided for all interfaces.
# $1 = environment (DEV)
####################################################################################
#set -x
####################################################################################
. /etldata/aagt/common/scripts/aagt.env
inboxdir=$PROJECTDIR/qnxt/inbox
srcfiledir=$PROJECTDIR/qnxt/estage
errorfiledir=$PROJECTDIR/qnxt/error
tempfiledir=$PROJECTDIR/nice/tstage
tempfiledirqnxt=$PROJECTDIR/qnxt/tstage
ERROR_EXIT_STATUS=99
SUCCESS_EXIT_STATUS=0
echo "Control Moved to the Inbox Directory = "$inboxdir
cd $inboxdir
fileconv=`echo "PDP*.txt"`
echo "fileconv = "$fileconv
files=`ls -tr $fileconv`
# ls fails when no file matches the pattern, so there is nothing to process
if [[ $? -ne 0 ]]
then
    echo "No Source File in the InBox "$inboxdir
    exit $SUCCESS_EXIT_STATUS
fi
echo "List Of Files in the Inbox Directory "$inboxdir
echo "{ $files }"
for pfilename in $files
do
    # stage the file, then archive a compressed copy
    cd $inboxdir
    cp $pfilename $srcfiledir
    gzip $pfilename
    cp $pfilename.gz $ARCHFILEDIR
    rm -f $pfilename.gz
    rm -f TRAIL*.txt
    echo "Source File Name Is "$pfilename
    pDSjobname="sjAG01"
    pINVOKE_ID=$pfilename
    pJobStatusFile=`echo $pfilename".stat"`
    pJobLogFile=`echo $pfilename".log"`
    ################################################################################
    # RESET AND RUN job sequencer
    ################################################################################
    $DSBINDIR/dsjob -jobinfo $DSPROJECT $pDSjobname.$pINVOKE_ID > $LOGFILEDIR/$pJobStatusFile
    # the status code is the number between parentheses on the "Job Status" line
    JOBSTATUS=`grep "Job Status" $LOGFILEDIR/$pJobStatusFile | cut -f2 -d '(' | cut -f1 -d ')'`
    echo "$pDSjobname Job status code is "$JOBSTATUS
    if [[ $JOBSTATUS = 3 ]]
    then
        echo "Job Status = "$JOBSTATUS
        $DSBINDIR/dsjob -run -mode RESET -wait -jobstatus $DSPROJECT $pDSjobname.$pINVOKE_ID
        echo "************************* Reset the Job Sequencer *******************"
    fi
    echo AGDB: $AGDB
    echo AGDBUserID: $AGDBUserID
    echo ORADB: $ORADB
    echo ORADBUSRID: $ORADBUSRID
    echo NICEDB: $NICEDB
    echo NICEDBUserID: $NICEDBUserID
    echo SrcFile: $pfilename
    echo SrcFileDir: $SRCFILEDIR
    echo TempFileDirectory: $TEMPFILEDIR
    echo TempFileDirectoryqnxt:$TEMPFILEDIRQNXT
    echo ErrorFileDirectory: $ERRORFILEDIR
    echo ScriptFileDirectory: $SCRIPTFILEDIR
    echo "calling $Jobname Sequence"
    $DSBINDIR/dsjob -run \
        -param AGDB=$AGDB \
        -param AGDBUserID=$AGDBUserID \
        -param AGDBPswd=$AGDBPswd \
        -param ORADB=$ORADB \
        -param ORADBUSRID=$ORADBUSRID \
        -param ORADBPswd=$ORADBPswd \
        -param NICEDB=$NICEDB \
        -param NICEDBUserID=$NICEDBUserID \
        -param NICEDBPswd=$NICEDBPswd \
        -param SrcFile=$pfilename \
        -param SrcFileDir=$SRCFILEDIR \
        -param TempFileDirectory=$TEMPFILEDIR \
        -param TempFileDirectoryqnxt=$TEMPFILEDIRQNXT \
        -param ErrorFileDirectory=$ERRORFILEDIR \
        -param ScriptFileDirectory=$SCRIPTFILEDIR \
        -wait -jobstatus $DSPROJECT $pDSjobname.$pINVOKE_ID
    $DSBINDIR/dsjob -jobinfo $DSPROJECT $pDSjobname.$pINVOKE_ID > $LOGFILEDIR/$pJobStatusFile
    $DSBINDIR/dsjob -logsum $DSPROJECT $pDSjobname.$pINVOKE_ID > $LOGFILEDIR/$pJobLogFile
    JOBSTATUS=`grep "Job Status" $LOGFILEDIR/$pJobStatusFile | cut -f2 -d '(' | cut -f1 -d ')'`
    echo "$pDSjobname Job status code is "$JOBSTATUS
    if [[ $JOBSTATUS = 1 ]]
    then
        echo "Job Status = "$JOBSTATUS
        echo "Removing File from $srcfiledir on successful processing of file $pfilename"
        cd $srcfiledir
        rm -f $pfilename
    else
        exit $ERROR_EXIT_STATUS
    fi
done
echo 99 END OF THE PROG
exit $SUCCESS_EXIT_STATUS
# ##################################################################
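The status check in the script relies on one small pipeline: find the "Job Status" line and keep the number between the parentheses. The sketch below isolates that step so it can be tried without DataStage. The function name extract_job_status and the sample status lines are assumptions; the script itself only shows the grep and cut commands, not the text they run against.

```shell
#!/bin/sh
# Hypothetical helper: read job information on standard input and print
# the code found between parentheses on the "Job Status" line.
extract_job_status() {
    grep "Job Status" | cut -f2 -d '(' | cut -f1 -d ')'
}

# Assumed sample of the job information text.
status=`printf 'Job Name\t: sjAG01\nJob Status\t: RUN OK (1)\n' | extract_job_status`

# The sample script treats 1 as success and resets the job when it sees 3.
case "$status" in
    1) echo "job finished OK" ;;
    3) echo "job needs a reset" ;;
    *) echo "unexpected status: $status" ;;
esac
```

Keeping the parsing in one function means the status is extracted the same way before and after the run, which the script currently does by repeating the pipeline.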

Standards for ETL UNIX Shell Scripts for use with PowerCenter 7.1.3
Naveen 10 Apr 2011

Scripting Standards (for PowerCenter version 7.1.3):
Scripting standards include the use of a UNIX shell script, which the scheduling tool uses to start
the PowerCenter job, and a separate file which contains the username and password for the user
called in the script. The following is a template that should be used to create scripts for new
scheduled jobs. Following is the script and explanation. This script has been provided as an
example and is named etl_unix_shell_script.sh.

# Program: etl_unix_shell_script.sh
# Author: Kevin Gillenwater
# Date: 7/24/2003
# Purpose: Sample UNIX shell script to load environment variables
#          needed to run PowerMart jobs, pass username and password variables
#          and start the job using the pmcmd command line.
#
# $1 = Project Id Parameter (ie. dss, siq, hr, ud, etc.)
#
# Example usage: etl_unix_shell_script.sh dss
#
# NOTE: Enter the Project ID parameter that is designated in the
#       directory structure for your team
#       (ie. dss would be used for the DWDAS team as the
#       directory is /usr/local/autopca/dss/)
#-----------------------------------------------------------------
# Call the script to set up the Informatica Environment Variables:
#-----------------------------------------------------------------
. /usr/local/bin/set_pm_var.sh
#-----------------------------------------------------------------
# Read ETL configuration parameters from a separate file:
#-----------------------------------------------------------------
ETL_CONFIG_FILE=$JOBBASE/$1/remote_accts/test_etl.config
ETL_USER=`grep ETL_USER $ETL_CONFIG_FILE | awk -F: '{print $2}'`
ETL_PWD=`grep ETL_PWD $ETL_CONFIG_FILE | awk -F: '{print $2}'`
#-----------------------------------------------------------------
# Start the job
#-----------------------------------------------------------------
$PM_HOME/pmcmd startworkflow -u $ETL_USER -p $ETL_PWD -s $MACHINE:4001 -f DWDAS_LOAD_dssqa -wait s_m_CENTER_INSTITUTE
#-----------------------------------------------------------------
# Trap the return code
#-----------------------------------------------------------------
rc=$?
if [[ $rc -ne 0 ]]
then
    exit $rc
fi

Notes Regarding the Script/Standards:
1. The beginning of each script should call and execute the script set_pm_var.sh to set up the
PowerCenter variables used in the session (. /usr/local/bin/set_pm_var.sh). The set_pm_var.sh
script, located on each machine (Leisure for Production and Glance for Development), will provide
ease of maintenance if changes need to be made to PowerCenter variables, and will provide one
source for the scripts on a machine. The following is the code in the script. You will not need to do
anything with this script; it is for informational purposes only:

# Program: set_pm_var.sh
# Author: Kevin Gillenwater
# Date: 7/3/2003
# Purpose: UNIX script which sets the variables for running PowerMart 6.2
#          when called from a shell script (ie. script run by Autosys).
#-------------------------------------------
#Set up the Informatica Variables
#-------------------------------------------
export MACHINE=`hostname`
export SHLIB_PATH=/usr/local/pmserver/informatica/pm/infoserver
export PM_HOME=/usr/local/pmserver/informatica/pm/infoserver
#---------------------------------------------------------------------
# Set the environment variables needed for scheduling jobs.
# The value of JOBBASE differs based on the account. For AUTOSYS
# and AUTODBA, the variable should evaluate to /usr/local/autopca.
# For all other accounts, it should evaluate to their $HOME variable.
#---------------------------------------------------------------------
case $HOME in
    /home/autopca/autopca) JOBBASE=/usr/local/autopca ;;
    /home/autopca/autodba) JOBBASE=/usr/local/autopca ;;
    *) JOBBASE=$HOME ;;
esac
export JOBBASE

2. The second section of etl_unix_shell_script.sh sets up ETL parameters for username and
password usage by PowerCenter when running the workflow. The username and password are no
longer stored as part of the main shell script. In the script example, the filename test_etl_config
contains the username and password to be used when running the workflow. Following is the
contents of test_etl_config:
ETL_USER:etlguy
ETL_PWD:ou812
This file (your password file) must be located in your team directory under the remote_accts folder
(i.e. /usr/local/autopca/dss/remote_accts/). The folder permissions on the Production machine will
only permit PCA's access to this folder. The permissions should be 6-4-0 (rw-r-----) on this file.
3. The third section of etl_unix_shell_script.sh contains the command to run the workflow in
PowerCenter. Please follow the script exactly. The only thing that will need to be changed in this
section is the folder name and the workflow within the folder that is being executed
(i.e. DWDAS_LOAD_dssqa wflw_m_CENTER_INSTITUTE).
4. The final section of etl_unix_shell_script.sh contains code to trap the return code of the script
that indicated success or failure.
*8/27/03 - Added '-wait' parameter to the pmcmd command line to start the scheduled job.
*10/6/04 - Updated to generalize the scheduling tool used to run the shell scripts as UC4 has been
chosen to replace Autosys as the scheduling tool used to run PowerMart workflows.

Regular Expressions
Naveen 23 Apr 2011

What Are Regular Expressions?
A regular expression is a pattern template you define that a Linux utility uses to filter text. A
Linux utility (such as the sed editor or the gawk program) matches the regular expression pattern
against data as that data flows into the utility. If the data matches the pattern, it's accepted for
processing. If the data doesn't match the pattern, it's rejected. The regular expression pattern
makes use of wildcard characters to represent one or more characters in the data stream.
Types of regular expressions:

There are two popular regular expression engines:
The POSIX Basic Regular Expression (BRE) engine
The POSIX Extended Regular Expression (ERE) engine

Defining BRE Patterns:
The most basic BRE pattern is matching text characters in a data stream.

Eg 1: Plain text
$ echo "This is a test" | sed -n '/test/p'
This is a test
$ echo "This is a test" | sed -n '/trial/p'
$
$ echo "This is a test" | gawk '/test/{print $0}'
This is a test
$ echo "This is a test" | gawk '/trial/{print $0}'
$

Eg 2: Special characters
The special characters recognized by regular expressions are:
.*[]^${}\+?|()
For example, if you want to search for a dollar sign in your text, just precede it with a backslash character:
$ cat data2
The cost is $4.00
$ sed -n '/\$/p' data2
The cost is $4.00
$
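The escaping rule above can be tried out with a small script. This is only a sketch: the temporary file and its three sample lines are made up for illustration, and the second search shows the same rule applied to the dot, which otherwise matches any single character.

```shell
#!/bin/sh
# Hypothetical sample data; the file and its contents are illustrative only.
tmpfile=$(mktemp)
printf '%s\n' 'The cost is $4.00' 'The cost is 4 dollars' 'Version 4x00' > "$tmpfile"

# An escaped dollar sign matches a literal "$" instead of the end of the line.
dollar_lines=$(sed -n '/\$/p' "$tmpfile")

# An escaped dot matches a literal "."; an unescaped dot matches any character.
literal_dot=$(sed -n '/4\.00/p' "$tmpfile")
any_char=$(sed -n '/4.00/p' "$tmpfile")

echo "Lines with a literal dollar sign:"
echo "$dollar_lines"
echo "Lines matching 4\\.00 (literal dot):"
echo "$literal_dot"
echo "Lines matching 4.00 (dot as wildcard):"
echo "$any_char"

rm -f "$tmpfile"
```

With the unescaped dot, the line containing 4x00 is selected as well, which is usually not what is wanted when searching for a price.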

Eg 3: Looking for the ending
The dollar sign ($) special character defines the end anchor.
$ echo "This is a good book" | sed -n '/book$/p'
This is a good book
$ echo "This book is good" | sed -n '/book$/p'
$

Eg 4: Using ranges
You can use a range of characters within a character class by using the dash symbol. For example, a five-digit zip code can be matched by specifying a range of digits for each position:
$ sed -n '/^[0-9][0-9][0-9][0-9][0-9]$/p' data8
60633
46201
45902
$

Extended Regular Expressions:
The POSIX ERE patterns include a few additional symbols that are used by some Linux applications and utilities. The gawk program recognizes the ERE patterns, but the sed editor doesn't by default (GNU sed accepts them when run with the -E option).

Eg 1: The question mark
The question mark indicates that the preceding character can appear zero or one time, but that's all. It doesn't match repeating occurrences of the character:
$ echo "bt" | gawk '/be?t/{print $0}'
bt
$ echo "bet" | gawk '/be?t/{print $0}'
bet

$ echo "beet" | gawk '/be?t/{print $0}'
$
$ echo "beeet" | gawk '/be?t/{print $0}'
$

Eg 2: The plus sign
The plus sign indicates that the preceding character can appear one or more times, but must be present at least once. The pattern doesn't match if the character is not present:
$ echo "beeet" | gawk '/be+t/{print $0}'
beeet
$ echo "beet" | gawk '/be+t/{print $0}'
beet
$ echo "bet" | gawk '/be+t/{print $0}'
bet
$ echo "bt" | gawk '/be+t/{print $0}'
$

Eg 3: The pipe symbol
The pipe symbol allows you to specify two or more patterns that the regular expression engine uses in a logical OR formula when examining the data stream. If any of the patterns match the data stream text, the text passes. If none of the patterns match, the data stream text fails.
The format for using the pipe symbol is:
expr1|expr2|...
Here's an example of this:
$ echo "The cat is asleep" | gawk '/cat|dog/{print $0}'
The cat is asleep
$ echo "The dog is asleep" | gawk '/cat|dog/{print $0}'

The dog is asleep
$ echo "The sheep is asleep" | gawk '/cat|dog/{print $0}'
$

Eg 4: Grouping expressions
When you group a regular expression pattern, the group is treated like a standard character. You can apply a special character to the group just as you would to a regular character. For example:
$ echo "Sat" | gawk '/Sat(urday)?/{print $0}'
Sat
$ echo "Saturday" | gawk '/Sat(urday)?/{print $0}'
Saturday
$

Automated FTP File Transfer

Naveen 22 Apr 2011
Automated FTP File Transfer : You can use a here document to script an FTP file transfer. The basic idea is shown here.

ftp -i -v -n wilma <<END_FTP
user randy mypassword
binary
lcd /scripts/download
cd /scripts
get auto_ftp_xfer.ksh
bye
END_FTP
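One way to make the here document transfer reusable is to put the host, login, and paths into variables. The sketch below is an assumption-laden illustration, not part of the original script: the variable names and the ftp_commands function are hypothetical, and the values are simply the ones from the example. It prints the command list as a dry run, so nothing is sent to a server until the last line is uncommented.

```shell
#!/bin/sh
# Hypothetical variable names; the values are taken from the example above.
FTP_HOST="wilma"
FTP_USER="randy"
FTP_PASS="mypassword"
LOCAL_DIR="/scripts/download"
REMOTE_DIR="/scripts"
REMOTE_FILE="auto_ftp_xfer.ksh"

# Print the FTP command list. Because the END_FTP delimiter is unquoted,
# the shell expands the variables inside the here document.
ftp_commands ()
{
    cat <<END_FTP
user $FTP_USER $FTP_PASS
binary
lcd $LOCAL_DIR
cd $REMOTE_DIR
get $REMOTE_FILE
bye
END_FTP
}

# Dry run: show the commands without opening a connection.
ftp_commands

# To perform the transfer, feed the same commands to ftp:
# ftp_commands | ftp -i -v -n "$FTP_HOST"
```

Keeping a password in a script is only reasonable for a quick test; file permissions on the script should be restricted accordingly.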

Process a File line by line

Naveen 22 Apr 2011
Process a File line by line : There are numerous methods to achieve this; one such method is discussed here:

Method 1: Let's start with the most common method that I see, which is catting a file and piping the file output to a while read loop. On each loop iteration a single line of text is read into a variable named LINE. This continuous loop will run until all of the lines in the file have been processed one at a time.

The pipe is the key to the popularity of this method. It is intuitively obvious that the output from the previous command in the pipe is used as input to the next command in the pipe. As an example, if I execute the df command to list file system statistics and it scrolls across the screen out of view, I can use a pipe to send the output to the more command, as in the following command:

df | more

When the df command is executed, the pipe passes its output through a buffer managed by the system, and the more command reads that buffer as its input, allowing me to view the df command output one page/line at a time. Our use of piping output to a while loop works the same way; the output of the cat command is used as input to the while loop and is read into the LINE variable on each loop iteration. Look at the complete function in Listing 2.2.

function while_read_LINE
{
    cat "$FILENAME" | while read LINE
    do
        echo "$LINE"
        :
    done
}

Each of these test loops is created as a function so that we can time each method using the shell script. You could also use () C-type function definition if you wanted, as shown in Listing 2.3.

while_read_LINE ()
{
    cat "$FILENAME" | while read LINE
    do
        echo "$LINE"
        :
    done
}

Whether you use the function or () technique, you get the same result. I tend to use the function method more often so that when someone edits the script they will know the block of code is a function. For beginners, the word "function" makes the whole shell script much easier to understand. The $FILENAME variable is set in the main body of the shell script. Within the while loop notice that I added the no-op (:) after the echo statement. A no-op (:) does nothing, but it always has a 0, zero, return code. I use the no-op only as a placeholder so that you can cut the function code out and paste it in one of your scripts. If you should remove the echo statement and leave the no-op, the while loop will not fail; however, the loop will not do anything either.
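To see the function working end to end, here is a minimal, self-contained sketch. It is a variation rather than the listing itself: the demo file is hypothetical, and the loop uses IFS= read -r with printf instead of plain read with echo, on the assumption that leading spaces and backslashes in the input should be passed through unchanged.

```shell
#!/bin/sh
# Hypothetical demo file; in a real script FILENAME would be set in the
# main body to the file that needs processing.
FILENAME=$(mktemp)
printf '%s\n' 'first line' '  second line with leading spaces' 'third\tline' > "$FILENAME"

while_read_LINE ()
{
    # IFS= keeps leading and trailing whitespace; -r keeps backslashes literal.
    cat "$FILENAME" | while IFS= read -r LINE
    do
        # printf is used instead of echo so backslashes are never interpreted.
        printf '%s\n' "$LINE"
    done
}

# Capture what the loop produces, then display it.
output=$(while_read_LINE)
printf '%s\n' "$output"

rm -f "$FILENAME"
```

With plain read LINE, the leading spaces on the second line would be stripped and the backslash on the third line would be consumed, so the output would no longer match the file.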