Introduction to Shell Scripting

Naveen
24 Mar 2011
A shell script is a script written for the shell, or command line interpreter, of an operating
system. It is often considered a simple domain-specific programming language. Typical
operations performed by shell scripts include file manipulation, program execution, and printing
text.
Types of shell scripts: Broadly, there are three types of shell scripts [the most prominent ones, widely used in industry].
1. Bourne Shell:
The Bourne shell, or sh, was the default Unix shell of Unix Version 7, and replaced the
Thompson shell, whose executable file had the same name, sh. It was developed by Stephen
Bourne, of AT&T Bell Laboratories, and was released in 1977 in the Version 7 Unix release
distributed to colleges and universities. It remains a popular default shell for Unix accounts. The
binary program of the Bourne shell or a compatible program is located at /bin/sh on most Unix
systems, and is still the default shell for the root superuser on many current Unix
implementations.
Features of the Bourne shell include (several are illustrated in the sketch after this list):

- Scripts can be invoked as commands by using their filename
- May be used interactively or non-interactively
- Allows both synchronous and asynchronous execution of commands
- Supports input and output redirection and pipelines
- Provides a set of built-in commands
- Provides flow control constructs, quotation facilities, and functions
- Typeless variables
- Provides local and global variable scope
- Scripts do not require compilation before execution
- Does not have a goto facility, so code restructuring may be necessary
- Command substitution using back quotes: `command`
- Here documents using << to embed a block of input text within a script
- "for ~ do ~ done" loops, in particular the use of $* to loop over arguments
- "case ~ in ~ esac" selection mechanism, primarily intended to assist argument parsing
- Strong provisions for controlling input and output and in its expression matching facilities
- Support for environment variables using keyword parameters and exportable variables
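A minimal sketch (not part of the original notes) pulling several of these features together: command substitution with back quotes, a here document, a "for ~ do ~ done" loop over the arguments, and "case ~ in ~ esac" for argument parsing.

#!/bin/sh
# Illustrative only: exercises several Bourne shell features.
today=`date`                     # command substitution with back quotes

cat << EOF
Report generated on: $today
EOF

for arg in "$@"                  # for ~ do ~ done over the arguments
do
    case $arg in                 # case ~ in ~ esac argument parsing
        -v) echo "verbose mode on" ;;
        -*) echo "unknown option: $arg" ;;
        *)  echo "processing file: $arg" ;;
    esac
done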

2. C Shell:
The C shell (csh or the improved version, tcsh, on most machines) is a Unix shell that was
created by Bill Joy while a graduate student at University of California, Berkeley in the late
1970s.
The C shell is a command processor that's typically run in a text window, allowing the user to
type commands which cause actions. The C shell can also read commands from a file, called a
script. Like all Unix shells, it supports filename wildcarding, piping, here documents, command
substitution, variables and control structures for condition-testing and iteration. What
differentiated the C shell, especially in the 1980s, were its interactive features and overall style.
Its new features made it easier and faster to use. The overall style of the language looked more
like C and was seen as more readable.
Features of C Shell:
Some of the features of the C shell are listed here:

- Customizable environment
- Abbreviated commands (aliases)
- History (remembers commands typed before)
- Job control (run programs in the background or foreground)
- Shell scripting (one can write programs using the shell)
- Keyboard shortcuts

3. Korn Shell:
The Korn shell (ksh) is a Unix shell which was developed by David Korn (AT&T Bell
Laboratories) in the early 1980s and announced at the Toronto USENIX conference on July 14, 1983.
ksh is backward-compatible with the Bourne shell and includes many features of the C shell as
well, such as command history, which was inspired by the requests of Bell Labs users.

The Korn shell's major new features include:

- Command-line editing, allowing you to use vi- or emacs-style editing commands on your command lines.
- Integrated programming features: the functionality of several external UNIX commands, including test, expr, getopt, and echo, has been integrated into the shell itself, enabling common programming tasks to be done more cleanly and without creating extra processes.
- Control structures, especially the select construct, which enables easy menu generation (see the sketch after this list).
- Debugging primitives that make it possible to write tools that help programmers debug their shell code.
- Regular expressions, well known to users of UNIX utilities like grep and awk, have been added to the standard set of filename wildcards and to the shell variable facility.
- Advanced I/O features, including the ability to do two-way communication with concurrent processes (coroutines).
- New options and variables that give you more ways to customize your environment.
- Increased speed of shell code execution.
- Security features that help protect against "Trojan horses" and other types of break-in schemes.
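A minimal sketch (illustrative, not from the original notes) of the select construct mentioned above; it prints a numbered menu and reads the user's choice into the loop variable.

#!/bin/ksh
# Illustrative only: menu generation with the ksh select construct.
PS3='Choose an environment: '    # prompt that select prints

select env in DEV QA PROD quit
do
    case $env in
        DEV|QA|PROD) print "You picked $env" ;;
        quit)        break ;;
        *)           print "Invalid choice" ;;
    esac
done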

Why do we Use Shell Scripting
Naveen
24 Mar 2011
We use shell scripts for the following purposes:
1. Customizing your work environment.
2. Automating your daily tasks.
3. Automating repetitive tasks.
4. Executing important procedures, like shutting down the system, formatting a disk, creating a file system on it, mounting the file system, letting the users use the floppy and finally unmounting the disk.
5. Performing the same operation on many files (a small sketch follows this list).
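A minimal sketch of point 5 (illustrative; the directory and file pattern are assumptions, not from the notes):

#!/bin/sh
# Illustrative only: apply the same operation to many files.
for f in /var/tmp/app/*.log
do
    gzip "$f"                    # compress each log file in turn
    echo "compressed: $f"
done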

When not to use shell scripting:
1. When the task is too complex, such as writing an entire billing system.
2. When the task requires a high degree of efficiency.
3. When the task requires a variety of software tools.

Usage of shell scripts within Informatica:
1. To run a batch via pmcmd.
2. To run the command task.
3. To add a header and footer when writing to a flat file.
4. To update a parameter file with session start time and end time.

Basic Commands Part1
Naveen
24 Mar 2011

Command / Option / Description

1. uname
   uname -r        To find the version of the Unix OS.
   uname -n        To find the machine name.

2. man             Print the user manual for any command.
   man grep        To find the user manual for the grep command.
   man -e grep     One-line definition of grep usage.
   man -k copy     Helpful when we do not know which command will serve the purpose: gives the different commands available for copying.

3. cal             Print the calendar year or month.
   cal 2009        Print the calendar for the year 2009.
   cal dec         Print the calendar for the current December month.

4. date            Print the current date and time, e.g. Tue Mar 09 07:10:10 IST 2009.
   date +%m        Print the month number.
   date +%h        Print the month name.
   date +'%h%m'    Print the month name and number together.
   date +%d        Print the day of the month.
   date +%y        Print the last two digits of the year.
   (%H, %M and %S print the hours, minutes and seconds.)

5. ls                 To print the contents of the current directory.
   ls -x              Prints multi-column output of files with their names.
   ls -F              To print only directories and executable files; <*> indicates files containing executable code and </> indicates directories.
   ls -a              Print all files including hidden files; <a> stands for all.
   ls -r              Reverse the sort order of all the files with their names.
   ls -x abc          Print the contents of the abc directory.
   ls -R              Print all files and subdirectories in the directory TREE.
   ls -ld             Print all the directories.
   ls -il testfile    Prints the inode number of the file.
   ls -lt             Prints all files with their last modification time, keeping the latest modified at top.
   ls -ltr            Prints all files with their last modification time in reverse order, keeping the last modified at bottom.
   ls -lut            Prints the last access time of files.

6. who                Login details of all the users in the system.
   who am I           Login details for only myself.
   who -H             Login details for all users, with header info.
   who -u             Login details for only active users.

7. passwd             To change the password.

8. lock
   lock -45           Lock the terminal for 45 minutes.

9. bc                 To use the calculator.

10. cat
    cat testfile         Print the content of any file.
    cat > testfile       Create a new file with the name testfile.

11. cp                   To copy files and directories.
    cp -i testfile abc   Copy testfile to the abc directory interactively.
    cp -r abc newabc     Copy the entire directory structure abc to newabc recursively.

12. rm                   All the options of the cp command are applicable to the rm command.
    rm -f newabc         Remove the directory newabc forcefully.
    rm -r newabc         Remove the directory newabc recursively; recursive deletion will not remove write-protected files.

13. mv                   Rename testfile to a new name, <newfile>.

14. PATH                 Set the PATH variable for the SHELL to use these directories to locate executable commands:
    PATH=$PATH:/home/gagan:/home/deep

15. split                To split big files into different small files of 1000 lines each (the default).
    split testfile            Split the file into small files.
    split -72 testfile        Change the default split line size to 72.
    split testfile newfile    Split testfile into small files named after newfile.

16. lp                   Used for printing files.

17. cmp, comm, diff      These three commands are used to find the differences between two files.

18. wc
    wc testfile          Print the number of words, lines and characters in the file.
    wc -w testfile       Print the number of words.
    wc -c testfile       Print the number of characters.
    wc -l testfile       Print the number of lines.

19. stty                 Print or change the terminal settings.
    stty -a              Print the current settings of the terminal.
    stty intr \^c        Change the interrupt key to Ctrl-c instead of the default DELETE.
    stty eof \^a         Change the termination control of input during file creation (using the cat command) from Ctrl-d to Ctrl-a.
    (stty also controls which key deletes a character during backspacing.)

20. alias                To create shorthand names for commands.
    alias                          Print all the aliases set in the system.
    alias l='ls -ltr'              Define l as shorthand for ls -ltr.
    alias showdir='cd $1; ls -l'   Passing the positional parameter ETL to the showdir alias will take us to that directory and list it.
    unalias                        To redefine and unset an alias.

21. history              To find the previous commands that have been used; every command has an event number.
    history -5           Print the last 5 commands that have been used.
    history 10 15        Display event numbers between 10 and 15.
    HISTSIZE=1200        Change the default setting of the command-history saving feature.
    r                    Repeat the previous command in Korn.
    !!                   Repeat the previous command in Bash.
    r -2                 Repeat the previous-to-previous command.
    r 20                 Repeat the command with event number 20.

22. tilde (~)            Changes control to the home directory; ~a_cmdb changes to the absolute path of the home directory of user a_cmdb. The value of the home directory can be seen in the $HOME environment variable.

23. cd -                 To switch between the current working directory and the most recently used directory.

24. chmod                Changing the permission of a file for user <u>, group <g>, others <o> or all <a>, with + to assign, - to remove and = for absolute permission; r is read (4), w is write (2) and x is execute (1). A worked example follows this table.
    chmod u+rwx newfile       Give the user read, write and execute permission on newfile.
    chmod g+x newfile         Give the group execute permission.
    chmod 757 newfile         rwx for user, r-x for group and rwx for others.
    chmod 457 newfile         r-- for user, r-x for group and rwx for others.
    chmod -R a+x abc          Recursively change the permission to execute for all directories and subdirectories in the abc directory, for all users.
    chmod -R 001 .            Recursively grant execute-only permission to others from the current directory (. indicates the current directory).
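As a quick worked example of the octal notation (illustrative; newfile is just a sample name): each digit is the sum of r=4, w=2 and x=1 for user, group and others respectively.

chmod 754 newfile    # user rwx (4+2+1=7), group r-x (4+1=5), others r-- (4)
ls -l newfile        # the long listing now shows -rwxr-xr-- for newfile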

Basic Commands Part2
Naveen
24 Mar 2011

Command / Option / Description

25. head                 Display the top content of a file.
    head newfile              Display the top 10 lines of the file when used without an option.
    head -5 newfile           Display the top 5 lines of the file <newfile>.
    head -1 newfile | wc -c   Count the number of characters present in the first line.
    vi `ls -t | head -1`      Open the last modified file for editing.
    head -c 512 newfile       Pick a specific number of characters from the file: print the first 512 characters.
    head -c 1b newfile        Print the first block (512 bytes).
    head -c 2m newfile        Print the first 2 blocks (1 MB each).

26. tail                 Display the end of a file.
    tail -5                   Display the last 5 lines of the file.
    tail +10                  Start displaying from line 10 onward, till the end.
    tail -f 6_load.log        Print the growth of the file (keep following it as it is written).

27. ps                   Prints the processes associated with the current user.
    ps -f                     Prints the current user's processes with their hierarchy (-f stands for full).
    ps -f -u a_cmdb           Prints all the processes associated with the a_cmdb user.
    ps -a                     Prints all the users' processes.
    ps -e                     Prints system processes.

28. $$                   Prints the process id of the current shell.

29. Background jobs (&)  Run a job in the background; the shell becomes the parent of all background processes. Running fg will bring the most recently started background process to the foreground (LAST IN, FIRST OUT).

30. nohup                The shell is the parent of all background jobs, and it dies when the user logs out; ultimately all its child processes die too. To avoid the death of a job even when the user logs out, run the background process using the nohup command.
    nohup ps -eaf | grep 'ksh' &   Run the search job in the background, printing its process id: it prints all the ksh processes running in the system by first listing all processes. After running the process in nohup mode, the kernel reassigns the PPID of the process to that of a system process, which does not die even after the shell dies on logging out.

31. kill                 To kill a process. The system process init, having PID 1, is the parent of all SHELL processes; when a user logs in, this PID becomes the PPID of the SHELL process.
    kill 520                  Kill the process with process id 520.
    kill 520 640              Kill the processes with ids 520 and 640.
    kill 1                    Killing the system process init is not possible.
    kill $!                   Kill the last background process; there is no need to remember the process id of the last background process, because it is in $!.
    kill 0                    Kill all processes in the system except the login shell process; killing a parent process id in turn kills all its child processes.
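A minimal sketch tying nohup, $! and kill together (illustrative; long_job.ksh and its output file are hypothetical names):

nohup ./long_job.ksh > long_job.out 2>&1 &   # the job survives logout
echo "background PID: $!"                    # $! holds the PID of the last background job
kill $!                                      # would terminate that job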

32. nice                 Run a command with low priority; the default nice value is 20.
    nice who | wc -l &        Run the command with low priority.
    nice -n 10 who | wc -l &  The nice value becomes 30; nice values range from 1 to 39, and the higher the nice value, the lower the priority.

33. at                   To run a job at a specified time of the day.
    at 2:10 load.ksh          load.ksh will execute today at 2:10.

34. batch
    batch load.ksh            The batch command executes the process whenever the CPU is free (depending on the load).

35. cron                 cron executes processes at regular intervals; it keeps checking the control file at /usr/spool/cron/crontabs for jobs.

36. finger               Produces a list of all logged-in users when run without any argument.
    finger a_cmdb             Prints details about the a_cmdb user; the 2nd and last fields of the output are taken from the passwd file in the /etc directory.

37. set                  Set the positional parameters.
    set 10 20 30              This will set $1, $2 and $3 ($* holds all the positional parameters and $# their count).
    set -x                    When this statement is used at the start of a ksh script, it echoes each statement at the terminal.

38. shift                The shift command transfers the contents of the positional parameters to the next lower numbers.
    shift                     Shift one position.
    shift 2                   Shift 2 places.

39. chown                To change the owner of a file; this can only be done by the owner or the sys admin.
    chown jaspreet testfile   Change the owner from gagan to jaspreet for testfile. Now gagan cannot change the ownership of testfile back; what he can do is create a similar copy of the file, and he then becomes the owner of that copy.
    chown -R                  Recursively change the owner of all files under the current directory.

40. chgrp                To change the group of a file; this can only be done by the owner or the sys admin.
    chgrp GTI testfile        The group is changed to GTI from IB; as the file's user is still the same, he can change the group of the file back to IB.
    chgrp -R GTI .            Recursively change the group of all files under the current directory.

41. touch                Change the time stamps of a file; touch without any option changes both the modification time and the access time.
    touch 03161430 testfile       Set both time stamps (format MMDDhhmm).
    touch -m 04161430 testfile    Change only the modification time.
    touch -a 05161430 testfile    Change only the access time.

42. ln (linking)
    ln 6_load.ksh 7_load.ksh      Link the two files; after doing this, both files have the same inode number, and changes in one are reflected in the other as well.
    ls -li 6_load.ksh 7_load.ksh  Will give us the same inode number for both.
    rm 6_load.ksh                 Drop the link between the two files.

43. df                   Prints the amount of free space available on the disc.
    df -t /home/oracle        Prints the free and total space under the oracle file system.

44. du                   Prints the disk usage of a specific directory tree; by default du prints the disk usage for each subdirectory.
    du -s                     Prints a summary of the whole directory tree.
    du -s /home/*             Prints the disk usage for every user.
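A short illustrative session showing the hard-link behaviour in entry 42 (the file names follow the table; the output is indicative):

echo "load step" > 6_load.ksh
ln 6_load.ksh 7_load.ksh        # create a hard link
ls -li 6_load.ksh 7_load.ksh    # both names show the same inode number
rm 6_load.ksh                   # drops one link; 7_load.ksh keeps the data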

Zipping Files
Naveen
24 Mar 2011

Command / Option / Description

1. gzip
   gzip etl_code             Compresses the file to etl_code.gz.
   gunzip etl_code.gz        Restores etl_code.

2. zip
   zip file *sql             Zips all the sql files into file.zip.
   unzip file.zip            Restores the zipped sql files.

3. tar                       tar creates a backup of files recursively.
   tar -cvf /home/gagan/sqlbackup ./*.sql    Create a backup of the sql files.
   tar -xvf /home/gagan/sqlbackup            Files are restored using the -x option.

4. cut                       Cut columns or fields from a file.
   -c <column numbers>: column cut (c stands for column).
   cut -c -5,6-12 testfile                   Cut the characters in columns 1 to 5 and 6 to 12.
   -d -f <field numbers>: field cut (f stands for field; -d gives the delimiter, default tab).
   cut -f 1,5 testfile                       Cut fields 1 and 5 (default tab delimiter).
   cut -d "|" -f 1-5 testfile > newfile      Cut the fields between 1 and 5 and send the output to newfile.

5. sort                      Sorts a file; by default the sorting starts with the first character of each line, but it can also be based on a field or column by specifying its position.

   sort -t "|" +2 testfile       Sorting starts from the 3rd field, skipping the first two fields (-t gives the delimiter that distinguishes the start and end of a field).
   sort -t \| -r +2 testfile     Reverse sort starting with the 3rd field.
   sort -t \| +2r testfile       The above command written in another way.
   sort -t "|" +1 -2 +3 <>       Sorting based on different fields, as with an ORDER BY clause: sorting starts with the 2nd field and then the 4th field; -2 indicates stopping the sort after the 2nd field and resuming it with the next key, overriding the default.
   sort -n <>                    Numeric sort.
   sort -u <>                    Unique sort.
   sort -o abc_sort.txt abc.txt  Save the sorted data in a file.

6. paste                     Paste files side by side.
   paste -d "|" <> <>            Join the lines using | as the delimiter.

7. tr                        Translate characters.
   tr '|\' '~-' < test.txt           Translate all | to ~ and \ to -.
   tr '[a-z]' '[A-Z]' < test.txt     Translate lower-case letters to upper case.
   tr -d '|' < test.txt              Delete all occurrences of |.

8. uniq                      uniq requires a sorted file as input.
   cut -d "|" -f3 <> | sort | uniq -c       Occurrence count of each value.
   cut -d "|" -f3 <> | sort | uniq -d -c    Duplicate count.
   -d                                       Select only the duplicated records.
   -u                                       Remove duplicates (select only the non-repeated records).

9. Changing time stamps      touch mon date hrs mins <file>.
   ls -lu                        Time of last access.
   touch -a 01290415 <file>      Change the access time.
   ls -lt                        Time of last modification.
   touch -m 01290430 <file>      Change the modification time.

10. Change date
    date 09181754                Set the system date (format MMDDhhmm).

11. wall
    wall -g dba "hello"          To selectively send a message to the dba group.

12. shutdown
    shutdown 17:30               Shut down at 17:30.
    shutdown -g2                 Power down after 2 minutes.
    shutdown -y -g0              Immediate shutdown.
    shutdown -y -g0 -i6          Shut down and reboot (init level 6).
    shutdown -r now              Shut down immediately and reboot.

13. du                       Disk usage.
    du /home/expimp/create_db       Tree output for each directory inside.
    du -s /home/expimp/create_db    Summary only.
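A small sketch combining cut, sort and uniq from the tables above (illustrative; emp.txt and its "|"-delimited layout are assumptions):

cut -d "|" -f3 emp.txt | sort | uniq -c | sort -rn   # frequency of each value in field 3, highest first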

lst" \) -print double quotes necessary.sh" -o -name "*. xargs -n -p –t eg:find . remove the files which are not modi for last 20 days -ok eg: find . -mtime +20 | xargs rm -f will be executed only once. find <loc> <option> <pattern> <action> find in root dir abc in emp. -type eg: find / -name log -type f -print f for file and d for directory. -size +2048 -print files greater then x blocks. du -s //home/expimp/create_db summary. -size. 'atime = access time eg: find . ! –newer eg: find / -name "*. -mtime +20 | xargs -n20 -p -t rm -f remove at max 20 files in batch and in interactive mode. -atime +180 -ok rm -f {} \. -atime +180 -ok rm -f {} \. -exec eg: find . -a (and) -o (OR) eg: find . -mtime -2 -print find file modi in less then 2 days. eg: find . xargs remove all file rm eg: find .log -prune exe – directory.13. print. \( -name "*.pl" ! -newer last_backup -print file modi before last_backup. before removing prompt for confirmation. -atime +365 -print find the file not accessed in last 1 year. -name *.find du /home/expimp/create_db tree output for each directory inside. -prune don't descend exe eg: find . Pattern Searching .lst file '-mtime = mod time eg: find .

[PQR] match any single character. -v skip records that contain directory.Naveen 24 Mar 2011 Commands Options Description / /Unix Forward search of Unix keyword in file ? ?Unix Backward search of Unix keyword in file n Repeat the last search . Repeat the previous command :1. : $s/gagan/deep/gc c Ask for confirmation for replacement grep -c counting occurrence. -l display files containing record. .s/gagan/deep/g Only the current line. 1. $s/<search string>/<Replace String >/g Pattern search and replacement. : $s/gagan/deep/g Only the last line. g Stands for globally :3. -i ignore case.10s/gagan/deep/g search between lines 3 and 10 : .$ Represent all lines in the file. -n display line number for record.

4. Regular expression character classes
   [PQR]        Match any single character P, Q or R.
   [c1-c2]      Match any single character within the ASCII range c1 to c2.
   [^PQR]       Match any single character which is not P, Q or R.
   [a-zA-Z0-9]  Match any single alphanumeric character.
   ^<pat>       Match the pattern at the beginning of the line.
   <pat>$       Match the pattern at the end of the line.
   ls -l | grep "^d"     Prints only directories.

5. egrep        Extends grep with additional pattern operators.
   +            Matches one or more occurrences of the previous character.
   egrep '[aA]g+[ar][ar]wal' test1.txt    Matches one or more occurrences: matches ag and agg (agarwal, aggarwal).
   ?            Matches zero or one occurrence of the previous character.
   egrep '[aA]gg?[ar][ar]wal' test1.txt   Matches zero or one occurrence of the second g.
   exp1|exp2    Matches exp1 or exp2.
   egrep 'prashant|director' test1.txt    Finds prashant or lines with director.
   (x1|x2)x3    Matches x1x3 or x2x3.
   egrep '(das|sen)gupta' test1.txt       Matches dasgupta and sengupta.
   egrep -f <pattern_file_name> test1.txt A huge list of patterns can be passed in the form of a file, with the patterns stored in the file (e.g. prashant|admin|director).

6. fgrep
   fgrep -f patternfile empfile    Faster than the grep and egrep family; fgrep, like egrep, accepts multiple patterns both from the command line and from a file, but unlike grep and egrep it does not accept regular expressions.
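An illustrative run of the egrep alternation above (the contents of test1.txt are invented for the demo):

$ cat test1.txt
prashant  admin
anil      director
dasgupta  clerk
$ egrep 'prashant|director' test1.txt
prashant  admin
anil      director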

Pattern Matching
Naveen
25 Mar 2011

Command / Option / Description

*              Matches any number of characters, including NONE. Note that * does not match files beginning with a dot <.>.
?              Matches a single character.
[ijk]          Matches a single character, either i, j or k.
[!ijk]         Matches a single character that is not i, j or k.
[x-z]          Matches a single character within the ASCII range of the characters x and z.
[!x-z]         Matches a single character that is not within the ASCII range of the characters x and z.

ls -l chap*    Matches all the files whose names start with chap.
ls -x chap*    Matches all the files which start with chap and prints them in a multi-column way.
ls -l chap?    Matches all the files with only 5-character names which start with chap.

???*                Matches file names of at least 3 characters: the first few characters can be pinned down explicitly with the meta character <?>.
ls -l chap0[1-4]    Range specification is also available.
ls -l [a-zA-Z]*     Matches all file names beginning with an alphabet, irrespective of case.
ls -l chap[!0-9]    Matches all file names beginning with chap and not ending with any number.
cat chap[!0-9]      Concatenates all the files beginning with chap and not ending with a number.
mv * ../bin         Moves all the files to the bin directory.
ls *.ist            Prints all the files which end with the .ist extension.
cmp chap[12]        Compares the files chap1 and chap2.
cp chap?? abc       Copies all the 6-character files whose names start with chap to the abc directory.

Escaping: backslash ( \ )
Playing with file names which use a meta character in their name, e.g. a file literally named chap*:
ls -l chap*         Prints all files whose names start with chap, not just the one whose name is chap*.
ls -l chap\*        The above problem can be solved by escaping the special character: only the file named chap* is listed.

Pipe ( | )
To pass the standard output of one command as the standard input to another.

who | wc -l            The output of the who command (three users) is passed as input to wc, which counts the number of lines present.

tee
who | tee users.lst    tee saves the output of the who command in users.lst and displays it as well.

Shell variables
Shell variables are initialized to a null value by default, so an unset variable returns null.
a=ab; b=cd; z=$a$b; echo $z     The shell concatenates the two variables.
echo '$10'     Prints $10 literally: all words starting with $ are treated as variables unless single-quoted or escaped.
echo "$10"     Prints 0: the shell looks for the variable $1, which is undefined, so it substitutes null and the literal 0 remains ($1 is part of the positional parameters).
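A short demonstration of the "$10" behaviour (illustrative; note that in the Korn shell ${10} is the way to reach the tenth positional parameter):

set alpha beta        # $1=alpha, $2=beta
echo "$10"            # prints "alpha0": $1 expands, then the literal 0 follows
echo '$10'            # prints $10 literally because of the single quotes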

Control Structures
Naveen
25 Mar 2011

Sample Unix Shell Scripts
Naveen
10 Apr 2011

#ksh: JobNum.sh
#Author: Prasad Degela
#Date: 03/13/2006
#Reviewed By:
#Date:
#Project: Appeals & Grivances Tracking
##############################################################################
#This script will call the .env file to get common variables.
# $1 = environment (DEV)
##############################################################################
#set -x
##############################################################################
# SET PARAMETERS - Part 1
# These parameters are provided for all interfaces.
. /etldata/aagt/common/scripts/aagt.env

inboxdir=$PROJECTDIR/qnxt/inbox
srcfiledir=$PROJECTDIR/qnxt/estage
errorfiledir=$PROJECTDIR/qnxt/error
tempfiledir=$PROJECTDIR/nice/tstage
tempfiledirqnxt=$PROJECTDIR/qnxt/tstage

ERROR_EXIT_STATUS=99
SUCCESS_EXIT_STATUS=0

echo "Control Moved to the Inbox Directory = "$inboxdir
cd $inboxdir

fileconv=`echo "PDP*.txt"`
echo "fileconv = "$fileconv

files=`ls -tr $fileconv`
if [[ $? -ne 0 ]]
then
    echo "No Source File in the InBox "$inboxdir
    exit $SUCCESS_EXIT_STATUS
fi

echo "List Of Files in the Inbox Directory "$inboxdir
echo "{ $files }"

for pfilename in $files
do
    cd $inboxdir
    cp $pfilename $srcfiledir
    gzip $pfilename
    cp $pfilename.gz $ARCHFILEDIR
    rm -f $pfilename.gz
    rm -f TRAIL*.txt

    echo "Source File Name Is "$pfilename
    pDSjobname="sjAG01"
    pINVOKE_ID=$pfilename
    pJobStatusFile=`echo $pfilename".stat"`
    pJobLogFile=`echo $pfilename".log"`

    ##########################################################################
    # RESET AND RUN job sequencer
    ##########################################################################
    $DSBINDIR/dsjob -jobinfo $DSPROJECT $pDSjobname.$pINVOKE_ID > $LOGFILEDIR/$pJobStatusFile
    JOBSTATUS=`grep "Job Status" $LOGFILEDIR/$pJobStatusFile | cut -f2 -d '(' | cut -f1 -d ')'`
    echo "$pDSjobname Job status code is "$JOBSTATUS

    if [[ $JOBSTATUS = 3 ]]
    then
        echo "Job Status = "$JOBSTATUS
        $DSBINDIR/dsjob -run -mode RESET -wait -jobstatus $DSPROJECT $pDSjobname.$pINVOKE_ID
        echo "************************* Reset the Job Sequencer *******************"
    fi

    echo AGDB: $AGDB
    echo AGDBUserID: $AGDBUserID
    echo ORADB: $ORADB
    echo ORADBUSRID: $ORADBUSRID
    echo NICEDB: $NICEDB
    echo NICEDBUserID: $NICEDBUserID
    echo SrcFile: $pfilename
    echo SrcFileDir: $SRCFILEDIR
    echo TempFileDirectory: $TEMPFILEDIR
    echo TempFileDirectoryqnxt: $TEMPFILEDIRQNXT
    echo ErrorFileDirectory: $ERRORFILEDIR
    echo ScriptFileDirectory: $SCRIPTFILEDIR

    echo "calling $Jobname Sequence"
    $DSBINDIR/dsjob -run \
    -param AGDB=$AGDB \
    -param AGDBUserID=$AGDBUserID \
    -param AGDBPswd=$AGDBPswd \
    -param ORADB=$ORADB \
    -param ORADBUSRID=$ORADBUSRID \
    -param ORADBPswd=$ORADBPswd \
    -param NICEDB=$NICEDB \
    -param NICEDBUserID=$NICEDBUserID \
    -param NICEDBPswd=$NICEDBPswd \
    -param SrcFile=$pfilename \
    -param SrcFileDir=$SRCFILEDIR \
    -param TempFileDirectory=$TEMPFILEDIR \
    -param TempFileDirectoryqnxt=$TEMPFILEDIRQNXT \
    -param ErrorFileDirectory=$ERRORFILEDIR \
    -param ScriptFileDirectory=$SCRIPTFILEDIR \
    -wait -jobstatus $DSPROJECT $pDSjobname.$pINVOKE_ID > $LOGFILEDIR/$pJobStatusFile

    $DSBINDIR/dsjob -logsum $DSPROJECT $pDSjobname.$pINVOKE_ID > $LOGFILEDIR/$pJobLogFile
    $DSBINDIR/dsjob -jobinfo $DSPROJECT $pDSjobname.$pINVOKE_ID > $LOGFILEDIR/$pJobStatusFile

    JOBSTATUS=`grep "Job Status" $LOGFILEDIR/$pJobStatusFile | cut -f2 -d '(' | cut -f1 -d ')'`
    echo "$pDSjobname Job status code is "$JOBSTATUS

    if [[ $JOBSTATUS = 1 ]]
    then
        echo "Job Status = "$JOBSTATUS
        echo "Removing File from $srcfiledir on successful processing of file $pfilename"
        cd $srcfiledir
        rm -f $pfilename
    else
        exit $ERROR_EXIT_STATUS
    fi
done

echo "END OF THE PROG"
exit $SUCCESS_EXIT_STATUS

Standards for ETL UNIX Shell Scripts for use with PowerCenter 7.1.3
Naveen
10 Apr 2011

Scripting Standards (for PowerCenter version 7.1.3): Scripting standards include the use of a UNIX shell script, which the scheduling tool uses to start the PowerCenter job, and a separate file which contains the username and password for the user called in the script. The following is a template that should be used to create scripts for new scheduled jobs. This script has been provided as an example and is named etl_unix_shell_script.sh. Following is the script and explanation.

# Program: etl_unix_shell_script.sh
# Author: Kevin Gillenwater
# Date: 7/24/2003
# Purpose: Sample UNIX shell script to load environment variables
# needed to run PowerMart jobs, pass username and password variables
# and start the job using the pmcmd command line.
#
# $1 = Project Id Parameter (ie. dss, ud, hr, siq, etc.)
#
# Example usage: etl_unix_shell_script.sh dss
#
# NOTE: Enter the Project ID parameter that is designated in the
# directory structure for your team
# (ie. dss would be used for the DWDAS team as the
# directory is /usr/local/autopca/dss/)
#-----------------------------------------------------------------

# Call the script to set up the Informatica Environment Variables:
#-----------------------------------------------------------------
. /usr/local/bin/set_pm_var.sh

#-----------------------------------------------------------------
# Read ETL configuration parameters from a separate file:
#-----------------------------------------------------------------
ETL_CONFIG_FILE=$JOBBASE/$1/remote_accts/test_etl.config
ETL_USER=`grep ETL_USER $ETL_CONFIG_FILE | awk -F: '{print $2}'`
ETL_PWD=`grep ETL_PWD $ETL_CONFIG_FILE | awk -F: '{print $2}'`

#-----------------------------------------------------------------
# Start the job
#-----------------------------------------------------------------
$PM_HOME/pmcmd startworkflow -u $ETL_USER -p $ETL_PWD -s $MACHINE:4001 -f DWDAS_LOAD_dssqa -wait s_m_CENTER_INSTITUTE

#-----------------------------------------------------------------
# Trap the return code
#-----------------------------------------------------------------
rc=$?
if [[ $rc -ne 0 ]]
then
    exit $rc
fi

Notes Regarding the Script/Standards:

1. The beginning of each script should call and execute the script set_pm_var.sh to set up the PowerCenter variables used in the session (. /usr/local/bin/set_pm_var.sh). The set_pm_var.sh script, located on each machine (Leisure for Production and Glance for Development), will provide ease of maintenance if changes need to be made to PowerCenter variables, and will provide one

source for the scripts on a machine. You will not need to do anything with this script; the following is the code in the script, and it is for informational purposes only:

# Program: set_pm_var.sh
# Author: Kevin Gillenwater
# Date: 7/3/2003
# Purpose: UNIX script which sets the variables for running PowerMart 6.2
# when called from a shell script (ie. script run by Autosys).
#-------------------------------------------
# Set up the Informatica Variables
#-------------------------------------------
export MACHINE=`hostname`
export SHLIB_PATH=/usr/local/pmserver/informatica/pm/infoserver
export PM_HOME=/usr/local/pmserver/informatica/pm/infoserver

#---------------------------------------------------------------------
# Set the environment variables needed for scheduling jobs.
# The value of JOBBASE differs based on the account. For AUTOSYS
# and AUTODBA, the variable should evaluate to /usr/local/autopca.
# For all other accounts, it should evaluate to their $HOME variable
# (ie. /home/autopca/autodba).
#---------------------------------------------------------------------
case $HOME in
    /home/autopca/autopca) JOBBASE=/usr/local/autopca ;;
    *) JOBBASE=$HOME ;;
esac

export JOBBASE

2. In the script example, the filename test_etl_config contains the username and password to be used when running the workflow; the username and password are no longer stored as part of the main shell script. The second section of etl_unix_shell_script.sh sets up these ETL parameters for username and password usage by PowerCenter when running the workflow. Following is the contents of test_etl_config:

ETL_USER:etlguy
ETL_PWD:ou812

This file (your password file) must be located in your team directory under the remote_accts folder (ie. /usr/local/autopca/dss/remote_accts/). The folder permissions on the Production machine will only permit PCAs access to this folder. The permissions on this file should be 6-4-0 (rw-r-----).

3. The third section of etl_unix_shell_script.sh contains the command to run the workflow in PowerCenter. The only thing that will need to be changed in this section is the folder name and the workflow within the folder that is being executed (ie. DWDAS_LOAD_dssqa wflw_m_CENTER_INSTITUTE). Please follow the script exactly.

4. The final section of etl_unix_shell_script.sh contains code to trap the return code of the script that indicates success or failure.

*8/27/03 - Added the '-wait' parameter to the pmcmd command line to start the scheduled job.
*10/6/04 - Updated to generalize the scheduling tool used to run the shell scripts, as UC4 has been chosen to replace Autosys as the scheduling tool used to run PowerMart workflows.

Regular Expressions
Naveen
23 Apr 2011

What Are Regular Expressions?
A regular expression is a pattern template you define that a Linux utility uses to filter text. A Linux utility (such as the sed editor or the gawk program) matches the regular expression pattern against data as that data flows into the utility. If the data matches the pattern, it's accepted for processing. If the data doesn't match the pattern, it's rejected. The regular expression pattern makes use of wildcard characters to represent one or more characters in the data stream.

Types of regular expressions:

There are two popular regular expression engines:
- The POSIX Basic Regular Expression (BRE) engine
- The POSIX Extended Regular Expression (ERE) engine

Defining BRE Patterns
The most basic BRE pattern is matching text characters in a data stream.

Eg 1: Plain text
$ echo "This is a test" | sed -n '/test/p'
This is a test
$ echo "This is a test" | sed -n '/trial/p'
$
$ echo "This is a test" | gawk '/test/{print $0}'
This is a test
$ echo "This is a test" | gawk '/trial/{print $0}'
$

Eg 2: Special characters
The special characters recognized by regular expressions are:
.*[]^${}\+?|()
For example, if you want to search for a dollar sign in your text, just precede it with a backslash character:
$ cat data2
The cost is $4.00
$ sed -n '/\$/p' data2
The cost is $4.00
$

Eg 3: Looking for the ending
The dollar sign ($) special character defines the end anchor.
$ echo "This is a good book" | sed -n '/book$/p'
This is a good book
$ echo "This book is good" | sed -n '/book$/p'
$

Eg 4: Using ranges
You can use a range of characters within a character class by using the dash symbol. Now you can simplify the zip code example by specifying a range of digits:
$ sed -n '/^[0-9][0-9][0-9][0-9][0-9]$/p' data8
60633
46201
45902
$

Extended Regular Expressions
The POSIX ERE patterns include a few additional symbols that are used by some Linux applications and utilities. The gawk program recognizes the ERE patterns, but the sed editor doesn't.

Eg 1: The question mark
The question mark indicates that the preceding character can appear zero or one time, but that's all. It doesn't match repeating occurrences of the character:
$ echo "bt" | gawk '/be?t/{print $0}'
bt
$ echo "bet" | gawk '/be?t/{print $0}'
bet
$ echo "beet" | gawk '/be?t/{print $0}'
$
$ echo "beeet" | gawk '/be?t/{print $0}'
$

Eg 2: The plus sign
The plus sign indicates that the preceding character can appear one or more times, but must be present at least once.

The pattern doesn't match if the character is not present:
$ echo "beeet" | gawk '/be+t/{print $0}'
beeet
$ echo "beet" | gawk '/be+t/{print $0}'
beet
$ echo "bet" | gawk '/be+t/{print $0}'
bet
$ echo "bt" | gawk '/be+t/{print $0}'
$

Eg 3: The pipe symbol
The pipe symbol allows you to specify two or more patterns that the regular expression engine uses in a logical OR formula when examining the data stream. If any of the patterns match the data stream text, the text passes. If none of the patterns match, the data stream text fails. The format for using the pipe symbol is:
expr1|expr2|...
Here's an example of this:
$ echo "The cat is asleep" | gawk '/cat|dog/{print $0}'
The cat is asleep
$ echo "The dog is asleep" | gawk '/cat|dog/{print $0}'

The dog is asleep
$ echo "The sheep is asleep" | gawk '/cat|dog/{print $0}'
$

Eg 4: Grouping expressions
When you group a regular expression pattern, the group is treated like a standard character. You can apply a special character to the group just as you would to a regular character. For example:
$ echo "Sat" | gawk '/Sat(urday)?/{print $0}'
Sat
$ echo "Saturday" | gawk '/Sat(urday)?/{print $0}'
Saturday
$

Automated FTP File Transfer
Naveen
22 Apr 2011

Automated FTP File Transfer:
You can use a here document to script an FTP file transfer. The basic idea is shown here.

ftp -i -v -n wilma <<END_FTP
user randy mypassword
binary
lcd /scripts/download
cd /scripts
get auto_ftp_xfer.ksh
bye
END_FTP

Process a File Line by Line
Naveen
22 Apr 2011

Process a file line by line:
There are numerous methods to achieve this; one such method is discussed here.

Method 1:
Let's start with the most common method that I see, which is catting a file and piping the file output to a while read loop. On each loop iteration a single line of text is read into a variable named LINE. This continuous loop will run until all of the lines in the file have been processed one at a time.

The pipe is the key to the popularity of this method. It is intuitively obvious that the output from the previous command in the pipe is used as input to the next command in the pipe. As an example, if I execute the df command to list file system statistics and it scrolls across the screen out of view, I can use a pipe to send the output to the more command, allowing me to view the df command output one page/line at a time, as in the following command:

df | more

When the df command is executed, the pipe stores the output in a temporary system file. Then this temporary system file is used as input to the more command. Our use of piping output to a while loop works the same way: the output of the cat command is used as input to the while loop and is read into the LINE variable on each loop iteration. Look at the complete function in Listing 2.2.

function while_read_LINE
{
cat $FILENAME | while read LINE
do
      echo "$LINE"
      :
done
}

Each of these test loops is created as a function so that we can time each method using the shell script. You could also use the () C-type function definition if you wanted, as shown in Listing 2.3:

while_read_LINE ()
{
cat $FILENAME | while read LINE
do
      echo "$LINE"
      :
done
}

Whether you use the function or () technique, you get the same result. I tend to use the function method more often so that when someone edits the script they will know the block of code is a function. For beginners, the word "function" helps in understanding the whole shell script a lot.

Within the while loop, notice that I added the no-op (:) after the echo statement. A no-op does nothing, but it always has a 0 (zero) return code. I use the no-op only as a placeholder so that you can cut the function code out and paste it in one of your scripts. If you should remove the echo statement and leave the no-op, the while loop will not fail; however, the loop will not do anything either. The $FILENAME variable is set in the main body of the shell script.
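To make the excerpt self-contained, here is an illustrative driver that sets $FILENAME in the main body and calls the function, as the text describes (the input file is an assumption):

#!/bin/ksh
FILENAME=/etc/hosts        # set in the main body, as the text notes

function while_read_LINE
{
cat $FILENAME | while read LINE
do
      echo "$LINE"
      :
done
}

while_read_LINE            # process the file line by line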
