Introduction to Shell Scripting

Mar 2011
A shell script is a script written for the shell, or command line interpreter, of an operating
system. It is often considered a simple domain-specific programming language. Typical
operations performed by shell scripts include file manipulation, program execution, and printing
text.
Types of shells: Three shells are the most prominent and the most widely used in industry.
1. Bourne Shell :
The Bourne shell, or sh, was the default Unix shell of Unix Version 7, and replaced the
Thompson shell, whose executable file had the same name, sh. It was developed by Stephen
Bourne, of AT&T Bell Laboratories, and was released in 1977 in the Version 7 Unix release
distributed to colleges and universities. It remains a popular default shell for Unix accounts. The
binary program of the Bourne shell or a compatible program is located at /bin/sh on most Unix
systems, and is still the default shell for the root superuser on many current Unix
implementations.
Features of the Bourne shell include:

Scripts can be invoked as commands by using their filename.

May be used interactively or non-interactively.

Allows both synchronous and asynchronous execution of commands.

Supports input and output redirection and pipelines.

Provides a set of built-in commands.

Provides flow control constructs, quotation facilities, and functions.

Typeless variables.

Provides local and global variable scope.

Scripts do not require compilation before execution.

Does not have a goto facility, so code restructuring may be necessary.

Command substitution using back quotes: `command`.

Here documents using << to embed a block of input text within a script.

"for ~ do ~ done" loops, in particular the use of $* to loop over arguments.

"case ~ in ~ esac" selection mechanism, primarily intended to assist argument parsing.

Strong provisions for controlling input and output, and expression matching facilities.

Support for environment variables using keyword parameters and exportable variables.
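Several of the features listed above can be seen together in a small script. The following is only a minimal sketch: the function name classify and the sample arguments are hypothetical, and "$@" is used in place of $* so that arguments containing spaces stay intact.

```shell
#!/bin/sh
# Sketch of several Bourne shell features; every name here is hypothetical.

# Command substitution with back quotes captures a command's output.
today=`date +%Y`

# A here document feeds a block of text to a command's standard input.
line_count=`wc -l <<EOF
first line
second line
EOF
`

# A function combining "for ~ do ~ done" with "case ~ in ~ esac".
classify() {
    result=""
    for arg in "$@"
    do
        case "$arg" in
            -*) result="$result option" ;;
            *)  result="$result operand" ;;
        esac
    done
    # Unquoted on purpose so the leading space is dropped.
    echo $result
}

kinds=`classify -v file.txt -x`
echo "year=$today lines=$line_count kinds=$kinds"
```

No compilation step is needed: saving the text to a file and running it with sh is enough.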

2. C Shell :
The C shell (csh or the improved version, tcsh, on most machines) is a Unix shell that was
created by Bill Joy while a graduate student at University of California, Berkeley in the late
1970s.
The C shell is a command processor that's typically run in a text window, allowing the user to
type commands which cause actions. The C shell can also read commands from a file, called a
script. Like all Unix shells, it supports filename wildcarding, piping, here documents, command
substitution, variables and control structures for condition-testing and iteration. What
differentiated the C shell, especially in the 1980s, were its interactive features and overall style.
Its new features made it easier and faster to use. The overall style of the language looked more
like C and was seen as more readable.
Features of C Shell :
Some of the features of the C shell are listed here:

Customizable environment.
Abbreviate commands. (Aliases.)

History. (Remembers commands typed before.)

Job control. (Run programs in the background or foreground.)

Shell scripting. (One can write programs using the shell.)

Keyboard shortcuts.

3. Korn Shell :
The Korn shell (ksh) is a Unix shell which was developed by David Korn (AT&T Bell
Laboratories) in the early 1980s and announced at USENIX in Toronto on July 14, 1983.
ksh is backwards-compatible with the Bourne shell and includes many features of the C shell as
well, such as a command history, which was inspired by the requests of Bell Labs users.

The Korn shell's major new features include:

Command-line editing, allowing you to use vi- or emacs-style editing commands on
your command lines.

Integrated programming features: the functionality of several external UNIX
commands, including test, expr, getopt, and echo, has been integrated into the shell
itself, enabling common programming tasks to be done more cleanly and without creating
extra processes.

Control structures, especially the select construct, which enables easy menu generation.

Debugging primitives that make it possible to write tools that help programmers debug
their shell code.

Regular expressions, well known to users of UNIX utilities like grep and awk, have
been added to the standard set of filename wildcards and to the shell variable facility.

Advanced I/O features, including the ability to do two-way communication with
concurrent processes (coroutines).

New options and variables that give you more ways to customize your environment.

Increased speed of shell code execution.

Security features that help protect against "Trojan horses" and other types of break-in
schemes.
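The "integrated programming features" item is easiest to appreciate by comparing the external expr command with arithmetic done inside the shell. This is a small sketch under the assumption that the shell supports $(( )) arithmetic expansion, as ksh and other POSIX-style shells do; the variable names are illustrative only.

```shell
#!/bin/sh
# Older style: expr is an external command, so each call starts a new process.
total=$(expr 6 \* 7)

# Built-in style: arithmetic expansion is evaluated by the shell itself.
builtin_total=$((6 * 7))

# The [ ] test is likewise handled inside the shell.
if [ "$total" -eq "$builtin_total" ]
then
    verdict="same"
else
    verdict="different"
fi
echo "expr=$total builtin=$builtin_total verdict=$verdict"
```

Both forms give the same answer; the built-in form simply avoids creating an extra process for every calculation.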

Why do we Use Shell Scripting
Mar 2011
We use shell scripts for the following purposes:
1. Customizing your work environment.
2. Automating your daily tasks.
3. Automating repetitive tasks.
4. Executing important procedures such as shutting down the system, formatting a disk, creating a file
system on it, mounting the file system, letting the users use the floppy, and finally unmounting the
disk.
5. Performing the same operation on many files.

Usage of Shell Scripts within Informatica:
1. To run a batch via pmcmd.
2. To add a header and footer when writing to a flat file.
3. To run the command task.
4. To update the parameter file with the session start time and end time.

When Not to Use Shell Scripting:
Shell scripting is a poor fit when the task:
1. Is too complex, such as writing an entire billing system.
2. Requires a high degree of efficiency.
3. Requires a variety of software tools.

Basic Commands Part1
Naveen 24 Mar 2011

Command / Option            Description

1. uname
uname -r                    Find the version of the Unix OS.
uname -n                    Find the machine name.

2. man
man grep                    Print the user manual for a command, here grep.
man -e grep                 One-line definition of grep usage (the flag varies by system; man -f grep is a common equivalent).
man -k copy                 List the commands related to copying. Helpful when we do not know which command will serve the purpose.

3. cal
cal                         Print the calendar for a year or a month.
cal 2009                    Print the calendar for the year 2009.
cal dec                     Print the calendar for December of the current year.

4. date
date                        Print the current date and time, e.g. Tue Mar 09 07:10:10 IST 2009.
date +%m                    Month as a number (03 for March).
date +%h                    Abbreviated month name (Mar).
date +'%h%m'                Month name followed by month number (Mar03).
date +%d                    Day of the month.
date +%y                    Last two digits of the year.
date +%H, +%M, +%S          Hour, minute and second.

5. who
who                         Login details of all the users in the system.
who am i                    Login details for only myself.
who -H                      Login details for all users with header information.
who -u                      Login details for only active users.

6. passwd
passwd                      Change the password.

7. lock
lock -45                    Lock the terminal for 45 minutes.

8. bc
bc                          Use the calculator.

9. ls
ls                          Print the contents of the current directory.
ls -x                       Print file names in multi-column output.
ls -F                       Mark directories and executables: * indicates a file containing executable code and / indicates a directory.
ls -a                       Print all files including hidden files; a stands for all.
ls -r                       Reverse the sort order of the file names.
ls -x abc                   Print the contents of the abc directory.
ls -R                       Print all files and subdirectories in the directory, recursively.
ls -ld                      Print the entry for a directory itself rather than its contents.
ls -il testfile             Print the inode number of the file.
ls -lt                      Print all files with their last modification time, latest modified at the top.
ls -ltr                     Print all files with their last modification time in reverse order, latest modified at the bottom.
ls -lut                     Print the last access time of files.

10. cat
cat testfile                Print the content of a file.
cat > testfile              Create a new file named testfile.

11. cp                      Copy files and directories.
cp -i testfile abc          Copy testfile to the abc directory interactively.
cp -r abc newabc            Copy the entire directory structure abc to newabc recursively.
mv testfile newfile         Rename testfile to newfile.

12. rm
rm -f newabc                Remove newabc forcefully, without prompting (a directory also needs -r).
rm -r newabc                Remove the directory newabc recursively. Recursive deletion prompts before removing write-protected files unless -f is added.
                            The cp options shown above (-i, -r) also apply to rm.

13. lp                      Used for printing files.

14. split                   Split a big file into smaller files.
split testfile              Split into files of 1000 lines each (the default).
split -72 testfile          Change the default split size to 72 lines.
split testfile newfile      Split into small files whose names start with newfile.

15. cmp, comm, diff         These three commands are used to find the differences between two files.

16. wc
wc -w testfile              Print the number of words.
wc -c testfile              Print the number of characters.
wc -l testfile              Print the number of lines.
wc testfile                 Print the number of lines, words and characters in the file.

17. PATH
PATH=$PATH:/home/gagan:/home/deep
                            Set the PATH variable so the shell also searches these directories to locate executable commands.

18. stty
stty -a                     Print the current settings of the terminal.
stty intr \^c               Change the interrupt key to Ctrl-c instead of the default DELETE.
stty eof \^a                Change the end-of-input key used when creating a file with cat from Ctrl-d to Ctrl-a.
                            A further stty setting controls deleting the character during backspacing.

19. alias                   Create shorthand names for commands.
alias                       Print all the aliases set in the system.
alias l='ls -ltr'           Define l as shorthand for ls -ltr.
alias showdir='cd $1; ls -l'
                            Intended to take a directory such as ETL as a positional parameter, change to it and list it. Aliases in ksh and bash do not receive arguments, so a shell function is the dependable way to do this.
unalias                     Unset an alias so that it can be redefined.

20. history                 Find the previous commands that have been used. Every command has an event number.
history -5                  Print the last 5 commands that have been used.
history 10 15               Display the events numbered between 10 and 15.
HISTSIZE=1200               Change the default number of commands saved in the history.
r                           Repeat the previous command in Korn.
!!                          Repeat the previous command in Bash.
r -2                        Repeat the command before the previous one.
r 20                        Repeat the command with event number 20.

21. Tilde (~)
cd ~                        Change to the home directory.
cd ~a_cmdb                  Change to the home directory of user a_cmdb.

22. $HOME                   The value of the home directory can be seen in the $HOME environment variable.

23. cd -                    Switch between the current working directory and the most recently used directory.

24. chmod                   Change the permissions of a file for user <u>, group <g>, others <o> or all <a>, with + to assign, - to remove and = to set absolute permissions. r is read <4>, w is write <2> and x is execute <1>.
chmod u+rwx newfile         Give the user read, write and execute permission on newfile.
chmod g+x newfile           Give the group execute permission.
chmod 757 newfile           Set rwx for user, r-x for group and rwx for others.
chmod 457 newfile           Set r-- for user, r-x for group and rwx for others.
chmod -R a+x abc            Recursively give execute permission to all users on every directory and subdirectory in abc.
chmod -R 001 .              Recursively set execute-only permission for others; . indicates the current directory.
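The chmod notation and the alias limitation described above can be tried safely in a scratch directory. This is a sketch only: the directory comes from mktemp -d (assumed to be available), newfile is a throwaway file, and showdir is written as a function because a function, unlike an alias, receives positional parameters.

```shell
#!/bin/sh
# Hypothetical scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)
touch "$workdir/newfile"

# Symbolic form: clear everything, then grant rwx to the user only.
chmod a-rwx "$workdir/newfile"
chmod u+rwx "$workdir/newfile"
symbolic=$(ls -l "$workdir/newfile" | cut -c1-10)

# Octal form: 7 = r(4)+w(2)+x(1) for user, 5 = r+x for group, 4 = r for others.
chmod 754 "$workdir/newfile"
octal=$(ls -l "$workdir/newfile" | cut -c1-10)

# A function receives its argument as $1, which an alias cannot do.
showdir() {
    cd "$1" && ls
}
# Run inside $( ) so the cd does not change the caller's directory.
listing=$(showdir "$workdir")

echo "$symbolic $octal $listing"
rm -r "$workdir"
```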

Basic Commands Part2
Naveen 24 Mar 2011

Command / Option            Description

25. head                    Display the top of a file.
head newfile                Display the top 10 lines of the file when used without an option.
head -5 newfile             Display the top 5 lines of newfile.
head -1 newfile | wc -c     Count the number of characters present in the first line.
vi `ls -t | head -1`        Open the last modified file for editing.
head -c 512 newfile         Pick a specific number of characters from the file: print the first 512 bytes.
head -c 1b newfile          Print the first block (512 bytes).
head -c 2m newfile          Print the first 2 blocks (1 MB each).

26. tail                    Display the end of a file.
tail -5                     Display the last 5 lines of a file.
tail +10                    Start displaying from line 10 onward till the end.
tail -f 6_load.log          Print the growth of the file.

27. $$                      Prints the process id of the current shell. $$ stores the PID of the current login shell.

28. ps
ps                          Prints the processes associated with the current user.
ps -f                       Prints the processes associated with the current user with their hierarchy; -f stands for full.
ps -f -u a_cmdb             Prints all the processes associated with the a_cmdb user.
ps -a                       Prints all the users' processes.
ps -e                       Prints system processes.
ps -eaf | grep 'ksh'        Prints all the ksh processes running in the system by first listing all the processes and then searching them.

The system process init, with PID 1, is the parent of all shell processes. When a user logs in, this PID becomes the PPID of the shell process.

29. Background
command &                   Run a job in the background; the shell prints its process id.
jobs                        Print all the jobs running in the background with their sequence number of execution.
fg                          Bring the most recently started background job into the foreground <LAST IN FIRST OUT>.
                            The shell becomes the parent of all background processes.

30. nohup                   The shell is the parent of all background jobs. It dies when the user logs out, and ultimately all its child processes die too. To keep a background process alive after the user logs out, run it using the nohup command.
nohup ps -eaf | grep 'ksh' &
                            Run the search job in the background, printing its process id.
ps -f                       After the process is run in nohup mode, the kernel reassigns the PPID of the process to the PID of the system process, which does not die when the shell dies through logging out.

31. kill                    Kill a command.
kill 520                    Kill the process with process id 520.
kill 640                    Kill the parent process id, which in turn kills all its child processes.
kill 1                      Killing the system process init is not possible.
kill $!                     Kill the last background process. There is no need to remember its process id; it is in $!.
kill 0                      Kill all processes started from the current shell, leaving the login shell itself.

32. nice                    Run a command with low priority.
nice who | wc -l &          The nice value ranges from 1 to 39 and the default is 20; the higher the nice value, the lower the priority. (Linux uses a scale from -20 to 19 with a default of 0.)
nice -n 10 who | wc -l &    The nice value becomes 30.

33. at                      Run a job at a specified time of the day.
at 2:10 load.ksh            load.ksh will execute today at 2:10.

34. batch
batch load.ksh              The batch command executes the process whenever the CPU is free.

35. cron                    cron executes processes at regular intervals. It keeps checking the control file at /usr/spool/cron/crontabs for jobs.

36. finger
finger                      Produces a list of all logged-in users when run without any argument. The 2nd and last fields of the output are taken from the passwd file in the /etc directory.
finger a_cmdb               Prints details about the a_cmdb user.

37. set
set 10 20 30                Set the positional parameters. This sets $1, $2, $3, $* and $#.
set -x                      When used at the start of a ksh script, echoes each statement at the terminal.

38. shift                   The shift command transfers the contents of each positional parameter to the next lower number.
shift                       Shift one position.
shift 2                     Shift 2 places.

39. chown                   Changing the owner of a file can only be done by the owner or the sys admin.
chown jaspreet testfile     Change the owner of testfile from gagan to jaspreet. Now gagan cannot change the ownership of testfile; what he can do is create a copy of the file, of which he then becomes the owner.
chown -R jaspreet .         Recursively change the owner of all files under the current directory.

40. chgrp                   Changing the group of a file can only be done by the owner or the sys admin.
chgrp GTI testfile          Group changed from IB to GTI. As the file's owner is still the same, he can change the group of the file back to IB.
chgrp -R GTI .              Recursively change the group of all files under the current directory.

41. touch                   Change the time stamps of a file.
touch 03161430 testfile     Without any option, touch changes both the modification time and the access time.
touch -m 04161430 testfile  Changes only the modification time.
touch -a 05161430 testfile  Changes only the access time.

42. Linking (ln)
ln 6_load.ksh 7_load.ksh    Link two files. Both files then have the same inode number, and changes in one are reflected in the other.
ls -li 6_load.ksh 7_load.ksh
                            Shows the same inode number for both names.
rm 6_load.ksh               Drop the link between the two files.

43. df
df                          Prints the amount of free space available on the disk.
df -t /home/oracle          Prints the free and total space under the oracle file system.

44. du                      By default du prints the disk usage for each subdirectory.
du <directory>              Prints the disk usage of a specific directory tree.
du -s                       Prints a summary of the whole directory tree.
du -s /home/*               Prints the disk usage for every user.
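The behaviour of set, shift and hard links described in this part can be checked with a few lines. The sketch below assumes mktemp -d is available and reuses the file names 6_load.ksh and 7_load.ksh purely as placeholders inside a scratch directory.

```shell
#!/bin/sh
# set assigns the positional parameters $1, $2, $3 and updates $#.
set -- 10 20 30
before="$# $1"

# shift moves every parameter one position lower and drops the old $1.
shift
after_one="$# $1"

# shift 2 drops two more parameters at once.
shift 2
after_two="$#"

# Hard links: two names that share one inode.
workdir=$(mktemp -d)
echo "load step" > "$workdir/6_load.ksh"
ln "$workdir/6_load.ksh" "$workdir/7_load.ksh"
inode_a=$(ls -i "$workdir/6_load.ksh" | awk '{print $1}')
inode_b=$(ls -i "$workdir/7_load.ksh" | awk '{print $1}')

# Removing one name only drops that link; the data stays reachable.
rm "$workdir/6_load.ksh"
survivor=$(cat "$workdir/7_load.ksh")
rm -r "$workdir"

echo "before=$before after_one=$after_one after_two=$after_two survivor=$survivor"
```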

Zipping Files
Naveen 24 Mar 2011

Command / Option            Description

1. gzip
gzip etl_code               Compress the file, e.g. to etl_code.gz.
gzip file*sql               Compress every matching file, each to its own .gz file.
gunzip etl_code.gz          Unzip the file.

2. tar                      Create a backup of files recursively.
-cvf                        eg: tar -cvf /home/gagan/sqlbackup ./*.sql
-x                          Files are restored using -x. eg: tar -xvf /home/gagan/sqlbackup ./*.sql

3. cut
-c <column start and end no>
                            c stands for column cut (by specifying position). eg: cut -c -5,6-12 testfile
-f <field start and end no>
                            f stands for field cut (the default delimiter is tab). eg: cut -f 1,5 testfile
-d with -f                  eg: cut -d "|" -f 1,5 testfile > newfile
                            Cut fields 1 and 5 using | as the delimiter and redirect the output to newfile.

4. sort
sort testfile               By default the sorting starts with the first character of each line. Priority: 1. white space (tabs), 2. numerals, 3. uppercase letters, 4. lowercase letters.
-t                          Delimiter to distinguish between the start and end of a field, overriding the default.
                            eg: sort -t "|" +2 testfile     Sorting starts from the 3rd field, skipping the first 2.
-r                          eg: sort -t \| -r +2 testfile   Reverse sort starting with the 3rd field.
                            eg: sort -t \| +2r testfile     The above command written in another way.
                            eg: sort -t "|" +1 -2 +3 <>     Sorting based on different fields, as with an order by clause. Sorting starts with the 2nd field and then the 4th field; -2 indicates to stop the sorting after the 2nd field and resume it with the 4th field.
-n                          eg: sort -n <>                  Numeric sort.
-u                          eg: sort -u <>                  Unique sort.
-o                          eg: sort -o abc_sort.txt abc.txt   Save the sorted data in a file.
                            Current versions of sort express the +N field positions with -k, for example sort -t "|" -k 3 testfile.

5. tr
tr '|\' '~-' < test.txt     Translate every | to ~ and every \ to -.
tr '[a-z]' '[A-Z]' < test.txt
                            Translate to upper case.
-d                          Delete all occurrences of a character. eg: tr -d '|' < test.txt

6. paste
-d                          Delimiter. eg: paste -d "|" <> <>

7. uniq                     uniq requires a sorted file as input.
-u                          Select only the records that are not duplicated. eg: cut -d "|" -f3 <> | sort | uniq -u
-d                          Select only the duplicated records. eg: cut -d "|" -f3 <> | sort | uniq -d
-c                          Count of occurrences. eg: cut -d "|" -f3 <> | sort | uniq -c

8. du                       Disk usage.
du /home/expimp/create_db   Tree output for each directory inside.
du -s /home/expimp/create_db
                            Summary.

9. Change date
date 09181754               Set the system date and time.

10. wall
wall -g dba "hello"         Selectively send a message to the dba group.

11. shutdown
shutdown -g2                Power down after 2 minutes.
shutdown -y -g0             Immediate shutdown.
shutdown -y -g0 -i6         Shut down and reboot (init level 6).
shutdown 17:30              Shut down at 17:30.
shutdown -r now             Shut down immediately and reboot.

12. Changing time stamp
touch mon date hrs mins <file>
touch -m 01290430 <file>    ls -lt shows the time of last modification.
touch -a 01290415 <file>    ls -lu shows the time of last access.

13. find
find <loc> <option> <pattern> <action>
                            eg: find the file emp.lst starting from the root directory.
-mtime                      Modification time. eg: find . -mtime -2 -print
                            Find files modified less than 2 days ago.
-atime                      Access time. eg: find . -atime +365 -print
                            Find files not accessed in the last year.
-type                       eg: find / -name log -type f -print
                            f for file and d for directory.
-prune                      Do not descend into a directory, e.g. one named exe.
-size                       eg: find . -size +2048 -print
                            Files larger than 2048 blocks.
-a (and), -o (or)           eg: find . \( -name "*.x" -o -name "*.lst" \) -print
                            The double quotes are necessary.
! -newer                    eg: find / -name "*.pl" ! -newer last_backup -print
                            Files modified before last_backup.
-exec                       eg: find . -atime +180 -exec rm -f {} \;
                            Remove the files found.
-ok                         eg: find . -atime +180 -ok rm -f {} \;
                            Prompt for confirmation before removing.
xargs                       eg: find . -mtime +20 | xargs rm -f
                            Remove the files not modified for the last 20 days; rm is executed only once.
xargs -n -p -t              eg: find . -mtime +20 | xargs -n20 -p -t rm -f
                            Remove at most 20 files per batch, in interactive mode.
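The cut, sort and uniq options above are usually chained into one pipeline. The following sketch builds a small pipe-delimited file in a scratch directory (the file name emp.lst and its four records are invented for illustration) and runs the combinations the table describes, using the current -k form of sort.

```shell
#!/bin/sh
# Hypothetical sample data: id|name|title
workdir=$(mktemp -d)
cat > "$workdir/emp.lst" <<EOF
101|gagan|director
102|deep|manager
103|naveen|director
104|prasad|analyst
EOF

# uniq needs sorted input, so sort always comes before it.
dup_titles=$(cut -d "|" -f3 "$workdir/emp.lst" | sort | uniq -d)
unique_titles=$(cut -d "|" -f3 "$workdir/emp.lst" | sort | uniq -u)
counts=$(cut -d "|" -f3 "$workdir/emp.lst" | sort | uniq -c | awk '{print $2 "=" $1}')

# Sort on the 2nd field, then keep only that field.
by_name=$(sort -t "|" -k 2 "$workdir/emp.lst" | cut -d "|" -f2)

# find with -name, -type and -mtime: files changed less than 2 days ago.
recent=$(find "$workdir" -type f -name "*.lst" -mtime -2 -print)
recent_name=$(basename "$recent")

rm -r "$workdir"
echo "duplicated: $dup_titles"
echo "unique:" $unique_titles
echo "counts:" $counts
echo "names:" $by_name
echo "recent: $recent_name"
```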

Pattern Searching
Naveen 24 Mar 2011

Commands / Options          Description

Searching in vi
/       /Unix               Forward search for the keyword Unix in the file.
?       ?Unix               Backward search for the keyword Unix in the file.
n                           Repeat the last search.
.                           Repeat the previous command.
:1,$s/<search string>/<replace string>/g
                            Pattern search and replacement. 1,$ represents all lines in the file; g stands for globally.
:3,10s/gagan/deep/g         Search between lines 3 and 10.
:.s/gagan/deep/g            Only the current line.
:$s/gagan/deep/g            Only the last line.
:$s/gagan/deep/gc           c asks for confirmation before each replacement.

grep
-c                          Count occurrences.
-n                          Display the line number for each record.
-v                          Skip records that contain the pattern.
-l                          Display the files containing the record.
-i                          Ignore case.
[PQR]                       Match any single character P, Q or R.
[c1-c2]                     Match a character within the ASCII range c1 to c2.
[^PQR]                      Match a single character which is not P, Q or R.
[a-zA-Z0-9]                 Match any single alphanumeric character.
^<pat>                      Lines beginning with the pattern.
<pat>$                      Lines ending with the pattern.
ls -l | grep "^d"           Prints only directories.

egrep
+                           Matches one or more occurrences of the previous character.
                            eg: egrep '[aA]g+[ar][ar]wal' test1.txt   Matches ag and agg.
?                           Matches zero or one occurrence of the previous character.
                            eg: egrep '[aA]gg?[ar][ar]wal' test1.txt
exp1|exp2                   Match exp1 or exp2.
                            eg: egrep 'prashant|director' test1.txt   Finds lines with prashant or director.
(x1|x2)x3                   Match x1x3 or x2x3.
                            eg: egrep '(das|sen)gupta' test1.txt   Matches dasgupta and sengupta.
egrep -f <pattern_file_name> test1.txt
                            A huge list of patterns can be passed in the form of a file name, with the pattern stored in the file, e.g. (prashant|admin|director).

fgrep                       fgrep and egrep accept multiple patterns both from the command line and from a file, but unlike grep and egrep, fgrep does not accept regular expressions.
fgrep -f pattern_file emp_file
                            Faster than the rest of the grep and egrep family.

Pattern Matching
Naveen 25 Mar 2011

Command / Option            Description

*                           Matches any number of characters, including none. * does not match files beginning with a dot <.>.
ls -l chap*                 Matches all the files whose names start with chap.
ls -x chap*                 Matches all the files whose names start with chap and prints them in multiple columns.
?                           Matches a single character.
ls -l chap?                 Matches all the files with 5-character names that start with chap.
[ijk]                       Matches a single character: i, j or k.
[!ijk]                      Matches a single character that is not i, j or k.
[x-z]                       Matches a single character within the ASCII range of characters x to z.
[!x-z]                      Matches a single character that is not within the ASCII range of characters x to z.
ls -l chap0[1-4]            Range specification is also available.
ls -l [a-zA-Z]*             Matches all file names beginning with a letter, irrespective of case.
ls -l chap[!0-9]            Matches all file names beginning with chap and not ending with a digit.
ls *.ist                    Print all the files which end with the ist extension.
cp chap?? abc               Copy all files named chap plus two more characters to the abc directory.
cmp chap[12]                Compares the files chap1 and chap2.
mv * ../bin                 Moves all the files to the bin directory.
cat chap[!0-9]              Concatenates all the files beginning with chap and not ending with a digit.
ls -l *                     * does not match files beginning with a dot <.>.
ls -l .???*                 The above problem can be solved by specifying the first few characters explicitly with the metacharacter <?>.

Escaping: backslash ( \ )   For working with file names that contain metacharacters.
ls -l chap*                 Prints every file whose name starts with chap, not just the one whose name is literally chap*.
ls -l chap\*                The above problem can be solved by escaping the special character.

Pipe |                      Passes the standard output of one command as the standard input to another.
who | wc -l                 The output of the who command <three users> is passed as input to wc, which counts the number of lines present.

Tee
who | tee users.list        tee saves the output of the who command in users.list as well as displaying it.

Control Structures
Naveen 25 Mar 2011

Shell Variable              Shell variables are initialized to a null value <by default>, so an unset variable returns null. All words starting with $ are considered variables unless single quoted or escaped.
a=ab; b=cd; z=$a$b; echo $z
                            The shell concatenates the two variables.
echo '$10'                  eg: $10. The single quotes stop the variable from being expanded.
echo "$10"                  eg: 0. The shell looks for the $1 variable, which is undefined, so it passes null for it and prints the trailing 0. $1 is one of the positional parameters.
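The quoting and wildcard rules above are easy to confirm by experiment. This sketch uses invented variable names and throwaway files (chap1, chap2, chapA and .hidden) in a scratch directory created with mktemp -d, which is assumed to be available.

```shell
#!/bin/sh
# Variable names and file names below are hypothetical.
a=ab
b=cd
z=$a$b                  # concatenation needs no operator
single='$z'             # single quotes keep the text literally
double="$z"             # double quotes still expand the variable
escaped=\$z             # a backslash escapes the dollar sign
unset never_set
empty="[$never_set]"    # an unset variable expands to nothing

workdir=$(mktemp -d)
touch "$workdir/chap1" "$workdir/chap2" "$workdir/chapA" "$workdir/.hidden"

# Each pattern is expanded inside a subshell that has changed directory.
digits=$(cd "$workdir" && echo chap[0-9])
non_digits=$(cd "$workdir" && echo chap[!0-9])
star=$(cd "$workdir" && echo *)           # the dot file is not matched
literal=$(cd "$workdir" && echo chap\*)   # escaped, so no expansion happens

# A pipe passes the listing to wc, which counts the lines.
file_count=$(cd "$workdir" && ls | wc -l)

rm -r "$workdir"
echo "$z $single $double $escaped $empty"
echo "$digits / $non_digits / $star / $literal /" $file_count
```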


Sample Unix shell Scripts
Naveen 10 Apr 2011

#ksh: JobNum.sh
#Author: Prasad Degela
#Date: 03/13/2006
#Reviewed by By:
#Date:
#Project: Appeals & Grivances Tracking
####################################################################################
#This script will call .env file to get common variables.
# $1 = environment (DEV)
####################################################################################
#set -x
####################################################################################
# SET PARAMETERS - Part 1
# These parameters are provided for all interfaces.
. /etldata/aagt/common/scripts/aagt.env
inboxdir=$PROJECTDIR/qnxt/inbox
srcfiledir=$PROJECTDIR/qnxt/estage
errorfiledir=$PROJECTDIR/qnxt/error
tempfiledir=$PROJECTDIR/nice/tstage
tempfiledirqnxt=$PROJECTDIR/qnxt/tstage
ERROR_EXIT_STATUS=99
SUCCESS_EXIT_STATUS=0
echo "Control Moved to the Inbox Directory = "$inboxdir
cd $inboxdir
fileconv=`echo "PDP*.txt"`
echo "fileconv = "$fileconv
# List the source files, oldest first; ls fails when nothing matches
files=`ls -tr $fileconv`
if [[ $? -ne 0 ]]
then
    echo "No Source File in the InBox "$inboxdir
    exit $SUCCESS_EXIT_STATUS
fi
echo "List Of Files in the Inbox Directory "$inboxdir
echo "{ $files }"
for pfilename in $files
do
    # Stage a copy for processing, then archive the compressed original
    cd $inboxdir
    cp $pfilename $srcfiledir
    gzip $pfilename
    cp $pfilename.gz $ARCHFILEDIR
    rm -f $pfilename.gz
    rm -f TRAIL*.txt
    echo "Source File Name Is "$pfilename
    pDSjobname="sjAG01"
    pINVOKE_ID=$pfilename
    pJobStatusFile=`echo $pfilename".stat"`
    pJobLogFile=`echo $pfilename".log"`
    ####################################################################################
    # RESET AND RUN job sequencer
    ####################################################################################
    $DSBINDIR/dsjob -jobinfo $DSPROJECT $pDSjobname.$pINVOKE_ID > $LOGFILEDIR/$pJobStatusFile
    # The status code is the number in parentheses on the "Job Status" line
    JOBSTATUS=`grep "Job Status" $LOGFILEDIR/$pJobStatusFile | cut -f2 -d '('|cut -f1 -d ')'`
    echo "$pDSjobname Job status code is "$JOBSTATUS
    if [[ $JOBSTATUS = 3 ]]
    then
        echo "Job Status = "$JOBSTATUS
        $DSBINDIR/dsjob -run -mode RESET -wait -jobstatus $DSPROJECT $pDSjobname.$pINVOKE_ID
        echo "************************* Reset the Job Sequencer *******************"
    fi
    echo AGDB: $AGDB
    echo AGDBUserID: $AGDBUserID
    echo ORADB: $ORADB
    echo ORADBUSRID: $ORADBUSRID
    echo NICEDB: $NICEDB
    echo NICEDBUserID: $NICEDBUserID
    echo SrcFile: $pfilename
    echo SrcFileDir: $SRCFILEDIR
    echo TempFileDirectory: $TEMPFILEDIR
    echo TempFileDirectoryqnxt:$TEMPFILEDIRQNXT
    echo ErrorFileDirectory: $ERRORFILEDIR
    echo ScriptFileDirectory: $SCRIPTFILEDIR
    # The original printed the undefined $Jobname here
    echo "calling $pDSjobname Sequence"
    $DSBINDIR/dsjob -run \
        -param AGDB=$AGDB \
        -param AGDBUserID=$AGDBUserID \
        -param AGDBPswd=$AGDBPswd \
        -param ORADB=$ORADB \
        -param ORADBUSRID=$ORADBUSRID \
        -param ORADBPswd=$ORADBPswd \
        -param NICEDB=$NICEDB \
        -param NICEDBUserID=$NICEDBUserID \
        -param NICEDBPswd=$NICEDBPswd \
        -param SrcFile=$pfilename \
        -param SrcFileDir=$SRCFILEDIR \
        -param TempFileDirectory=$TEMPFILEDIR \
        -param TempFileDirectoryqnxt=$TEMPFILEDIRQNXT \
        -param ErrorFileDirectory=$ERRORFILEDIR \
        -param ScriptFileDirectory=$SCRIPTFILEDIR \
        -wait -jobstatus $DSPROJECT $pDSjobname.$pINVOKE_ID
    $DSBINDIR/dsjob -jobinfo $DSPROJECT $pDSjobname.$pINVOKE_ID > $LOGFILEDIR/$pJobStatusFile
    # The original used $INVOKE_ID here, which is never set
    $DSBINDIR/dsjob -logsum $DSPROJECT $pDSjobname.$pINVOKE_ID > $LOGFILEDIR/$pJobLogFile
    JOBSTATUS=`grep "Job Status" $LOGFILEDIR/$pJobStatusFile | cut -f2 -d '('|cut -f1 -d ')'`
    echo "$pDSjobname Job status code is "$JOBSTATUS
    if [[ $JOBSTATUS = 1 ]]
    then
        echo "Job Status = "$JOBSTATUS
        echo "Removing File from $srcfiledir on successful processing of file $pfilename"
        cd $srcfiledir
        rm -f $pfilename
    else
        exit $ERROR_EXIT_STATUS
    fi
done
echo 99 END OF THE PROG
exit $SUCCESS_EXIT_STATUS
####################################################################################
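The status check in the script above relies on one small pipeline: grep picks the "Job Status" line and two cut calls keep whatever sits between the parentheses. The sketch below tries that pipeline on its own. The sample status line is an assumption standing in for the file that dsjob -jobinfo writes, so the real layout should be checked against actual output.

```shell
#!/bin/sh
# Hypothetical status text; the real content comes from dsjob -jobinfo.
status_text="Job Status : RUN OK (1)
Job Controller : not available"

# Same pipeline as the script: keep the text between "(" and ")".
JOBSTATUS=$(echo "$status_text" | grep "Job Status" | cut -f2 -d '(' | cut -f1 -d ')')

# Portable [ ] test in place of the ksh-only [[ ]].
if [ "$JOBSTATUS" = 1 ]
then
    outcome="finished"
else
    outcome="needs attention"
fi
echo "status code: $JOBSTATUS ($outcome)"
```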

Standards for ETL UNIX Shell Scripts for use with PowerCenter 7.1.3
Naveen 10 Apr 2011

Scripting Standards (for PowerCenter version 7.1.3):
Scripting standards include the use of a UNIX shell script, which the scheduling tool uses to start the PowerCenter job, and a separate file which contains the username and password for the user called in the script. The following is a template that should be used to create scripts for new scheduled jobs. Following is the script and explanation. This script has been provided as an example and is named etl_unix_shell_script.sh.

# Program: etl_unix_shell_script.sh
# Author: Kevin Gillenwater
# Date: 7/24/2003
# Purpose: Sample UNIX shell script to load environment variables
#          needed to run PowerMart jobs, pass username and password variables
#          and start the job using the pmcmd command line.
#
# $1 = Project Id Parameter (ie. ud, dss, siq, etc.)
#
# Example usage: etl_unix_shell_script.sh dss
#
# NOTE: Enter the Project ID parameter that is designated in the
#       directory structure for your team
#       (ie. dss would be used for the DWDAS team as the
#       directory is /usr/local/autopca/dss/)
#-----------------------------------------------------------------
# Call the script to set up the Informatica Environment Variables:
#-----------------------------------------------------------------
. /usr/local/bin/set_pm_var.sh
#-----------------------------------------------------------------
# Read ETL configuration parameters from a separate file:
#-----------------------------------------------------------------
ETL_CONFIG_FILE=$JOBBASE/$1/remote_accts/test_etl.config
ETL_USER=`grep ETL_USER $ETL_CONFIG_FILE | awk -F: '{print $2}'`
ETL_PWD=`grep ETL_PWD $ETL_CONFIG_FILE | awk -F: '{print $2}'`
#-----------------------------------------------------------------
# Start the job
#-----------------------------------------------------------------
$PM_HOME/pmcmd startworkflow -u $ETL_USER -p $ETL_PWD -s $MACHINE:4001 -f DWDAS_LOAD_dssqa -wait s_m_CENTER_INSTITUTE
#-----------------------------------------------------------------
# Trap the return code
#-----------------------------------------------------------------
rc=$?
if [[ $rc -ne 0 ]]
then
    exit $rc
fi

Notes Regarding the Script/Standards:

1. The beginning of each script should call and execute the script set_pm_var.sh to set up the PowerCenter variables used in the session (. /usr/local/bin/set_pm_var.sh). The set_pm_var.sh script, located on each machine (Leisure for Production and Glance for Development), will provide ease of maintenance if changes need to be made to PowerCenter variables, and will provide one source for the scripts on a machine. The following is the code in the script. You will not need to do anything with this script; it is for informational purposes only:

# Program: set_pm_var.sh
# Author: Kevin Gillenwater
# Date: 7/3/2003
# Purpose: UNIX script which sets the variables for running PowerMart 6.2
#          when called from a shell script (ie. script run by Autosys)
#-------------------------------------------
#Set up the Informatica Variables
#-------------------------------------------
export MACHINE=`hostname`
export SHLIB_PATH=/usr/local/pmserver/informatica/pm/infoserver
export PM_HOME=/usr/local/pmserver/informatica/pm/infoserver
#---------------------------------------------------------------------
# Set the environment variables needed for scheduling jobs.
# The value of JOBBASE differs based on the account. For AUTOSYS
# and AUTODBA, the variable should evaluate to /usr/local/autopca.
# For all other accounts, it should evaluate to their $HOME variable.
#---------------------------------------------------------------------
case $HOME in
    /home/autopca/autopca) JOBBASE=/usr/local/autopca ;;
    /home/autopca/autodba) JOBBASE=/usr/local/autopca ;;
    *) JOBBASE=$HOME ;;
esac
export JOBBASE

2. The second section of etl_unix_shell_script.sh sets up ETL parameters for username and password usage by PowerCenter when running the workflow. The username and password are no longer stored as part of the main shell script. In the script example, the filename test_etl_config contains the username and password to be used when running the workflow. Following is the contents of test_etl_config:

ETL_USER:etlguy
ETL_PWD:ou812

This file (your password file) must be located in your team directory under the remote_accts folder (i.e. /usr/local/autopca/dss/remote_accts/). The folder permissions on the Production machine will only permit PCA's access to this folder. The permissions should be 6-4-0 (rw-r-----) on this file.

3. The third section of etl_unix_shell_script.sh contains the command to run the workflow in PowerCenter. Please follow the script exactly. The only thing that will need to be changed in this section is the folder name and the workflow within the folder that is being executed (i.e. DWDAS_LOAD_dssqa wflw_m_CENTER_INSTITUTE).
*8/27/03 - Added '-wait' parameter to the pmcmd command line to start the scheduled job.
*10/6/04 - Updated to generalize the scheduling tool used to run the shell scripts, as UC4 has been chosen to replace Autosys as the scheduling tool used to run PowerMart workflows.

4. The final section of etl_unix_shell_script.sh contains code to trap the return code of the script, which indicates success or failure.

Regular Expressions
Naveen 23 Apr 2011

What Are Regular Expressions?
A regular expression is a pattern template you define that a Linux utility uses to filter text. A Linux utility (such as the sed editor or the gawk program) matches the regular expression pattern against data as that data flows into the utility. If the data matches the pattern, it's accepted for processing. If the data doesn't match the pattern, it's rejected. The regular expression pattern makes use of wildcard characters to represent one or more characters in the data stream.

Types of regular expressions:

There are two popular regular expression engines:

The POSIX Basic Regular Expression (BRE) engine

The POSIX Extended Regular Expression (ERE) engine

Defining BRE Patterns: The most basic BRE pattern is matching text characters in a data stream.

Eg 1: Plain text

    $ echo "This is a test" | sed -n '/test/p'
    This is a test
    $ echo "This is a test" | sed -n '/trial/p'
    $
    $ echo "This is a test" | gawk '/test/{print $0}'
    This is a test
    $ echo "This is a test" | gawk '/trial/{print $0}'
    $

Eg 2: Special characters
The special characters recognized by regular expressions are:

    .*[]^${}\+?|()

For example, if you want to search for a dollar sign in your text, just precede it with a backslash character:

    $ cat data2
    The cost is $4.00
    $ sed -n '/\$/p' data2
    The cost is $4.00
    $
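The same escaping rule applies to the other special characters. The following is a minimal sketch, assuming any POSIX sed; the helper names lines_with_dollar and lines_with_cents are made up for illustration and are not part of the original examples.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical.

# Print the lines of standard input that contain a literal dollar sign.
# The backslash removes the special (end anchor) meaning of "$".
lines_with_dollar() {
    sed -n '/\$/p'
}

# Print the lines that contain a literal period followed by two digits.
# Without the backslash, "." would match any single character.
lines_with_cents() {
    sed -n '/\.[0-9][0-9]/p'
}

printf '%s\n' 'The cost is $4.00' 'The cost is unknown' | lines_with_dollar
printf '%s\n' 'Total 4.00' 'Total 4x00' | lines_with_cents
```

In the second call only the line with a real period is printed, because the escaped period no longer acts as a wildcard.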

Eg 3: Looking for the ending
The dollar sign ($) special character defines the end anchor.

    $ echo "This is a good book" | sed -n '/book$/p'
    This is a good book
    $ echo "This book is good" | sed -n '/book$/p'
    $

Eg 4: Using ranges
You can use a range of characters within a character class by using the dash symbol. Now you can simplify the zip code example by specifying a range of digits:

    $ sed -n '/^[0-9][0-9][0-9][0-9][0-9]$/p' data8
    60633
    46201
    45902
    $

Extended Regular Expressions: The POSIX ERE patterns include a few additional symbols that are used by some Linux applications and utilities. The gawk program recognizes the ERE patterns, but the sed editor doesn't by default (GNU sed accepts them only when run with the -E or -r option).

Eg 1: The question mark
The question mark indicates that the preceding character can appear zero or one time, but that's all. It doesn't match repeating occurrences of the character:

    $ echo "bt" | gawk '/be?t/{print $0}'
    bt
    $ echo "bet" | gawk '/be?t/{print $0}'
    bet

    $ echo "beet" | gawk '/be?t/{print $0}'
    $
    $ echo "beeet" | gawk '/be?t/{print $0}'
    $

Eg 2: The plus sign
The plus sign indicates that the preceding character can appear one or more times, but must be present at least once. The pattern doesn't match if the character is not present:

    $ echo "beeet" | gawk '/be+t/{print $0}'
    beeet
    $ echo "beet" | gawk '/be+t/{print $0}'
    beet
    $ echo "bet" | gawk '/be+t/{print $0}'
    bet
    $ echo "bt" | gawk '/be+t/{print $0}'
    $

Eg 3: The pipe symbol
The pipe symbol allows you to specify two or more patterns that the regular expression engine uses in a logical OR formula when examining the data stream. If any of the patterns match the data stream text, the text passes. If none of the patterns match, the data stream text fails. The format for using the pipe symbol is:

    expr1|expr2|...

Here's an example of this:

    $ echo "The cat is asleep" | gawk '/cat|dog/{print $0}'
    The cat is asleep
    $ echo "The dog is asleep" | gawk '/cat|dog/{print $0}'

    The dog is asleep
    $ echo "The sheep is asleep" | gawk '/cat|dog/{print $0}'
    $

Eg 4: Grouping expressions
When you group a regular expression pattern with parentheses, the group is treated like a standard character. You can apply a special character to the group just as you would to a regular character. For example:

    $ echo "Sat" | gawk '/Sat(urday)?/{print $0}'
    Sat
    $ echo "Saturday" | gawk '/Sat(urday)?/{print $0}'
    Saturday
    $

Automated FTP File Transfer

Naveen 22 Apr 2011
Automated FTP File Transfer : You can use a here document to script an FTP file transfer. The basic idea is shown here.

    ftp -i -v -n wilma <<END_FTP
    user randy mypassword
    binary
    lcd /scripts/download
    cd /scripts
    get auto_ftp_xfer.ksh
    bye
    END_FTP

Process a File line by line

Naveen 22 Apr 2011
Process a File line by line : There are numerous methods to achieve this; one such method is discussed here.

Method 1: Let's start with the most common method that I see, which is catting a file and piping the file output to a while read loop. On each loop iteration a single line of text is read into a variable named LINE. This continuous loop will run until all of the lines in the file have been processed one at a time.

The pipe is the key to the popularity of this method. It is intuitively obvious that the output from the previous command in the pipe is used as input to the next command in the pipe. As an example, if I execute the df command to list file system statistics and it scrolls across the screen out of view, I can use a pipe to send the output to the more command, as in the following command:

    df | more

When the df command is executed, the pipe holds its output in a temporary buffer. This buffered output is then used as input to the more command, allowing me to view the df command output one page/line at a time. Our use of piping output to a while loop works the same way: the output of the cat command is used as input to the while loop and is read into the LINE variable on each loop iteration. Look at the complete function in Listing 2.2.

    function while_read_LINE
    {
    cat $FILENAME | while read LINE
    do
        echo "$LINE"
        :
    done
    }

Each of these test loops is created as a function so that we can time each method using the shell script. You could also use the () C-type function definition if you wanted, as shown in Listing 2.3.

    while_read_LINE ()
    {
    cat $FILENAME | while read LINE
    do
        echo "$LINE"
        :
    done
    }

Whether you use the function or () technique, you get the same result. I tend to use the function method more often so that when someone edits the script they will know the block of code is a function. For beginners, the word "function" helps a lot in understanding the whole shell script. The $FILENAME variable is set in the main body of the shell script.

Within the while loop notice that I added the no-op (:) after the echo statement. A no-op (:) does nothing, but it always has a 0, zero, return code. I use the no-op only as a placeholder so that you can cut the function code out and paste it in one of your scripts. If you should remove the echo statement and leave the no-op, the while loop will not fail; however, the loop will not do anything either.
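One practical consequence of piping into the loop is that in some shells (bash, for example) the while loop runs in a subshell, so variables set inside it are gone once the loop ends. The following is a minimal sketch of a variation that redirects the file into the loop instead; it is not one of the book listings, and the function name count_lines and the sample file are assumptions made only for illustration.

```shell
#!/bin/sh
# Sketch only: count_lines is a hypothetical helper, not part of the original script.

# Count the lines of the file named in $1 by redirecting the file into the loop.
# No pipe is involved, so the loop runs in the current shell and the
# counter keeps its value after "done".
count_lines() {
    count=0
    while IFS= read -r LINE
    do
        count=$((count + 1))
        :   # no-op placeholder, as in the listings above
    done < "$1"
    echo "$count"
}

# Usage: build a small sample file, count its lines, then clean up.
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"
count_lines "$tmpfile"
rm -f "$tmpfile"
```

The IFS= and -r settings keep read from trimming leading whitespace or interpreting backslashes, which matters when the lines must be reproduced exactly.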