
Parallel Concurrent
Processing
Mike Swing
TruTek
mswing@trutek.com
RMOUG 2009
1
Conclusions
You don't need RAC to use Parallel Concurrent
Processing (PCP)!
If you have PCP enabled, secondary nodes
must be defined during the upgrade to R12.
Tuning of TCP, SQL*Net and PMON
parameters can minimize PCP failover time.
Implement failover-sensitive work shifts.
Concurrent Processing Server

Allows scheduling of batch jobs, or Requests in
Oracle terms.
Processes concurrent programs as a Request.
Requests can be grouped together into Request Sets.
Different types of concurrent managers handle different
types of requests.
A concurrent program can be assigned to a responsibility,
and that responsibility can be assigned to users, allowing
them the permission to run the concurrent program.
Concurrent managers may have limits on the concurrent
programs that can be run, and the times that they can be
started. Requests have priorities, status and log and out
files in the above directory
3
Definitions
CP => Concurrent Processing

DCD => Dead Connection Detection
ICM => Internal Concurrent Manager
IM => Internal Monitor
CRM => Conflict Resolution Manager
PCP => Parallel Concurrent Processing
PMON => Process Monitor for ICM
4
Concurrent Request
Phase and Status of Concurrent Requests

Phase      Status      Description - Action
Pending    Normal      The request is waiting to be picked up by the next available manager.
Pending    Standby     Waiting for the CRM to resolve a conflict. The CRM could be slow, or an incompatible program is running.
Running    Normal      The request is running normally.
Completed  Normal      The request has finished successfully.
Completed  Error       The request has finished with an error. Check the logs.
Completed  Warning     The request has finished with a warning. Check the logs.
Inactive   No Manager  The request won't run without a manager. Specialization rules aren't configured properly.
6
PCP Failover
[Figure: PCP nodes RH7, RH8 and RH9 each connect as SQL*Net clients to the database listener on DB node RH8 (sqlnet.ora).]
TCP_KEEPALIVE takes 240 seconds before issuing DCD
Concurrent Managers
Manager Type                         Service Instance                 Program
Internal Concurrent Manager          Internal Manager                 FNDLIBR
Conflict Resolution Manager                                           FNDCRM
Internal Monitor                     Internal Monitor:Node            FNDIMON
Service Manager: Node                                                 FNDSM
Concurrent Manager                   Standard Manager                 FNDLIBR
Concurrent Manager                   Inventory Manager                INVLIBR
Concurrent Manager                   Session History Cleanup          FNDLIBR
Concurrent Manager                   PA Streamline Manager            PALIBR
Transaction Manager                  CRP Inquiry Manager              CYQLIB
Transaction Manager                  FastFormula Transaction Manager  FFTM
Transaction Manager                  PO Document Approval Manager     POXCON
Transaction Manager                  Transaction Manager              FNDTMTST
Scheduler/Prerelease Manager                                          FNDSVC
OAM Generic Collection Service:Node                                   FNDSVC
9
Concurrent Processing
1. The Concurrent Processing server communicates with the database using Oracle SQL*Net.
2. The concurrent program log or output file from a request is passed back as a report to the Report Review Agent.
3. The Report Review Agent passes a file containing the entire report to the forms server.
4. The Forms Services component passes the report back to the user's browser one page at a time. Profile options can be used to control the size of the files and pages passed, to suit report volume and available network capacity.
[Figure: Web browser (HTML, Java interface, JInitiator), Web Server, Forms Server and Reports Server in front of the Concurrent Processing server, which runs the ICM (FNDLIBR), Service Manager (FNDSM), Internal Monitor (FNDIMON), Standard Manager (FNDLIBR), FNDCRM and the Report Review Agent; requests, log and out files (.rdx) are exchanged over SQL*Net.]
10

The Internal Concurrent Manager (ICM) starts, sets the
number of active processes for, monitors, and terminates all
other concurrent processes through requests made to
the Service Manager, including restarting any failed
processes.
The ICM also starts, stops, and restarts the Service
Manager for each node.
The ICM will perform process migration during an
instance or node failure.
The ICM will be active on a single node.
This is also true in a PCP environment, where the ICM
will be active on at least one node at all times.
11

The ICM really does not have any scheduling
responsibilities. It has NOTHING to do with scheduling
requests, or deciding which manager will run a particular
request. The function of the ICM is to run 'queue control'
requests; requests to startup or shutdown other
managers.
The ICM is responsible for startup and shutdown of the
whole concurrent processing facility, and it monitors the
other managers periodically, and restarts them if they
should go down. It can also take over the Conflict
Resolution manager's job, and resolve incompatibilities.
If the ICM itself should go down, requests will continue to
run normally, except for 'queue control' requests. Restart
the ICM with 'startmgr'; no need to kill the other
managers first.
12
13
Service Manager
FNDSM process - Communicates with the Internal Concurrent
Manager, Concurrent Manager, and non-Manager Service
processes.
The Service Manager (SM) spawns and terminates manager and
service processes (these could be Forms or Apache Listeners,
Metrics or Reports Servers, and any other process controlled through
Generic Service Management).
When the ICM terminates, the SM that resides on the same node
as the ICM will also terminate.
The SM is chained to the ICM. The SM will only reinitialize after
termination when there is a function it needs to perform (start or
stop a process), so there may be periods of time when the SM is not
active, and this would be normal.
14
Service Manager
All processes initialized by the SM inherit the
same environment as the SM.
The SM's environment is set by the APPSORA.env
file and the gsmstart.sh script.
The apps_<sid> listener must be active on each
CP node to support the SM connection to the
local instance.
There should be a Service Manager active on
each node where a Concurrent or non-Manager
service process will reside.
15
FNDSM Failure
FNDSM failover as noted in the concurrent manager log:
Could not contact Service Manager FNDSM_RH8_VIS. The TNS
alias could not be located, the listener process on RH8 could not
be contacted, or the listener failed to spawn the Service
Manager process.
Found dead process: spid=(962754), cpid=(2259578), Service
Instance=(1045)
CONC-SM TNS FAIL
Call to PingProcess failed for WFMAILER
CONC-SM TNS FAIL
Call to StopProcess failed for WFMAILER
CONC-SM TNS FAIL
Call to PingProcess failed for FNDCPGSC
16
FNDSM Failover
Instance=(2009)
Instance=(2010)
Starting WFMGSMD Concurrent Manager : 15-AUG-2008 13:28:56
Starting WFMGSMDB Concurrent Manager : 15-AUG-2008 13:28:56
Starting WFALSNRSVCB Concurrent Manager : 15-AUG-2008 13:28:57
Starting STANDARD Concurrent Manager : 15-AUG-2008 13:30:31
Starting Internal Concurrent Manager Concurrent Manager : 15-AUG-2008 13:30:32
17
Internal Monitor
(FNDIMON process) - Communicates with the Internal Concurrent
Manager.
This manager/service is used to implement Parallel Concurrent
Processing.
You do not need to run this manager/service unless you are using
Parallel Concurrent Processing.
The Internal Monitor (IM) monitors the Internal Concurrent Manager,
and restarts any failed ICM on the local node. It monitors whether
the ICM is still running, and if the ICM crashes, it will restart it on
another node.
During a node failure in a PCP environment, the IM will restart the
ICM on a surviving node (multiple ICMs may be started on multiple
nodes, but only the first ICM started will eventually remain active; all
others will gracefully terminate).
There should be an Internal Monitor defined on each node where
the ICM may migrate.
18
Standard Manager
(FNDLIBR process) - Communicates with
the Service Manager and any client
application process.
The Standard Manager is a worker
process that initiates, and executes client
requests on behalf of Applications batch,
and OLTP clients.
19
Standard Manager
20
Standard Manager - OAM
The Standard Manager is active on RH9, even though no primary
node is defined.
Since no secondary node is defined, the Standard Manager
will not fail over.
Failover Processes in the Work Shifts definition
is the number of processes that will run (3)
when the Standard Manager fails over to the
secondary node.
21
Transaction Manager
A Transaction Manager communicates with the Service
Manager, and any user process initiated on behalf of
Forms, or a Standard Manager request.
A Transaction Manager:
Supports synchronous processing of requests from a
client program.
Gets a request from a client program to run a server-side
program synchronously.
Returns a status/results to the client program.
At runtime, it starts a number of these managers as
defined.
Doesn't poll the concurrent request table for a new request.
Only 1 transaction manager is needed per database, not 1
per instance.
22
Transaction Managers
Some of the Transaction Managers in R12
23
Configuring Transaction Managers for RAC
R11i Transaction Managers use DBMS_PIPE
This does not work across RAC instances
RAC users must perform additional configuration
Requires complicated configuration or additional hardware
R12 Transaction Managers use AQ
Works across RAC Instances

Simplifies configuration
Reduces complexity
Profile Option can switch between mechanisms
DBMS_PIPE can be used for non-RAC users if performance
becomes an issue
24

Configuring Transaction Managers for RAC
Edit $ORACLE_HOME/dbs/<context_name>_ifile.ora and add these parameters:
_lm_global_posts=TRUE
_immediate_commit_propagation=TRUE
Change the profile option 'Concurrent: TM Transport Type' to
'QUEUE', and verify that the transaction manager works across
the RAC instances. ATG RUP3 (4334965) or higher provides an
option to use AQs in place of Pipes.
Profile Concurrent:TM Transport Type
Set to QUEUE
Pipes are more efficient but require a Transaction Manager to be
running on each DB Instance.
Navigate to Concurrent > Manager > Define screen, and set up
the primary and secondary node names for transaction managers.
25

Configuring Transaction Managers for RAC
Transaction Managers allow a client to make a request for a
program to be run on the server immediately. The client then waits
for the program to complete and can receive program results from
the server. As the client and server are two separate database
sessions, the communication between them has been handled using the
DBMS_PIPE package.
Unfortunately the DBMS_PIPE package does not extend to
communications between sessions on different RAC instances. On
an Applications instance using RAC, the client and server are very
likely to be on different instances, causing transactions to time out
for long periods or fail completely. The current workaround is to
manually set up Transaction managers to connect to all RAC
instances, which not only takes up additional resources, it may
require additional middle-tier hardware or a complicated
configuration that is difficult to maintain.
26
R12 Transaction Managers

In R12, the Transaction Managers use the AQ
mechanism, so they work on RAC when
connected to either instance.
This greatly simplifies the configuration and
reduces the complexity for RAC administrators.
A Profile Option has been introduced to allow
users to switch between the two transports
DBMS_PIPE or AQ.
27
Concurrent:PCP Instance Check

Concurrent processing provides database instance-sensitive
failover capabilities. When an instance is down,
all managers connecting to it switch to a secondary
middle-tier node.
However, if you prefer to handle instance failover
separately from such middle-tier failover (for example,
using TNS connection-time failover mechanism instead),
use the profile option Concurrent:PCP Instance Check.
When this profile option is set to OFF, Parallel
Concurrent Processing will not provide database
instance failover support; however, it will continue to
provide middle-tier node failover support when a node
goes down.
28
Concurrent managers read requests to start concurrent programs.

The Conflict Resolution Manager checks concurrent program
definitions for incompatibility rules.
If a program is identified as Run Alone, then the Conflict Resolution
Manager prevents the concurrent managers from starting other
programs in the same conflict domain.
When a program lists other programs as being incompatible with it, the
Conflict Resolution Manager prevents the program from starting until
any incompatible programs in the same domain have completed
running.
To enable/disable the Conflict Resolution Manager, use the system
profile option 'Concurrent: Use ICM'. Setting this to 'No' (the default) allows
the CRM to be started.
Setting it to 'Yes' causes the CRM to be shutdown and the Internal
Manager (ICM) will take over the conflict resolution duties.
If the CRM will not start (it is started automatically by the ICM), check
this profile option.
29

Use the system profile option 'Concurrent:
Use ICM'. 'No' allows the CRM to be started.
Setting it to 'Yes' causes the CRM to shut down.
The Internal Manager (ICM) will take over the
conflict resolution duties.
Using the ICM to resolve conflicts is not
recommended.
The CRM's sole purpose is to resolve conflicts,
while the ICM has other functions to perform as
well.
Setting this option to 'YES' is not recommended.
30
Generic Service Management
An E-Business Suite system depends on a variety of services, such
as Forms Listeners, HTTP Servers, Concurrent Managers, and
Workflow Mailers. These services are composed of one or more
processes. In the past, many of these processes had to be
individually started and monitored by system administrators.
Management of these processes is complicated, since these
services can be distributed across multiple host machines.
The introduction of Generic Service Management in Release 11i
helped simplify the management of these processes by providing a
fault tolerant service framework and a central management console
built into Oracle Applications Manager.
Service Management is an extension of Concurrent Processing, and
provides a framework for managing processes on multiple host
machines. With Service Management, virtually any application tier
service can be integrated into this framework.
Patch 2221688 introduces GSM.
31
GSM
32
Generic Services
33
GSM and Multiple Nodes

GSM enables users to manage Applications
services across multiple middle-tier nodes.
This includes services on Web/Forms nodes that
previously have had no concurrent processing
footprint.
Users configuring GSM in a multiple-node
system should be sure to have followed the
instructions for Parallel Concurrent Processing.
This includes setting the environment variable
APPLDCP=ON and assigning a primary node for
all defined managers and services (if not already
defined.)
34
Seeded GSM Services

When configuring GSM the following GSM
Services are seeded automatically:
Forms Listener
Metrics Server
Metrics Client
Reports Server
Apache Listener
Linux users should not activate the Reports
Server under GSM.
35
Starting GSM
Apps Listener (listener.ora) -> gsmstart.sh -> exec FNDSM
36
adcmctl.sh
adcmctl.sh calls:
startmgr.sh
batchmgr.sh
CONCSUB
FNDSVCRG
37
FNDSVCRG Service Controller Utility
FNDSVCRG is an executable introduced as
part of the Seeded GSM Services. It provides
improved coordination between the GSM
monitoring of these services and their
command-line control scripts.
The $FND_TOP/bin/FNDSVCRG executable is
called from adcmctl.sh control script before and
after the script starts or stops the service.
FNDSVCRG connects to the database using
JDBC and validates the configuration of the
Seeded GSM Service.
38
Verify GSM
To verify GSM is working, start the concurrent
managers.
Once GSM is enabled, the ICM uses Service
Managers to start all concurrent managers and
activated services.
If the ICM is successfully starting the managers,
then GSM has been configured properly.
If managers and/or services fail to start, errors
should appear in the ICM log file.
39
Service Manager Log

Each Service Manager maintains its own
log file named FNDSMxxxx.mgr, located in
the same directory as concurrent manager
log files.
If you cannot locate the Service Manager
log file, it is likely that the Service
Managers are not starting properly and
there is a configuration issue that needs
troubleshooting.
40
Test: Kill services and see if GSM restarts them
Kill FNDSM
applvis 9007 1 0 11:53 ? 00:00:00 FNDSM
applvis 9159 9155 0 11:55 ? 00:00:00 FNDLIBR
applvis 9161 5683 0 11:55 pts/3 00:00:00 grep FND
[applvis@rh9 scripts]$ kill -9 9007
[applvis@rh9 scripts]$ ps -ef |grep FND
applvis 9159 9155 0 11:55 ? 00:00:00 FNDLIBR
applvis 9169 1 0 11:55 ? 00:00:00 FNDSM
applvis 9249 5683 0 11:57 pts/3 00:00:00 grep FND
Kill FNDCRM
[applvis@rh9 scripts]$ ps -ef |grep FNDCRM
applvis 8886 1 0 11:52 ? 00:00:00 FNDCRM APPS/ZGA13053E1E1B7BA773417089054DA88F194EAC0D687728CC2551870E6B78C4B439EADB287342795115A88DBC85788CCB4 FND FNDCRM N 10 c LOCK Y RH9 1302318
[applvis@rh9 scripts]$ kill -9 8886
[applvis@rh9 scripts]$ ps -ef |grep FNDCRM
applvis 9457 9392 0 12:09 ? 00:00:00 FNDCRM APPS/ZG26430816FA3570354BC57DE47FF105D145F8DE226EFE58CE04B416633DCB901267BFECFA7585114F7090060EFE1147BE FND FNDCRM N 10 c LOCK Y RH9 1302343
Both of these services were restarted before I could enter the grep command to find the corresponding
process.
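The restart check above can be automated by comparing two `ps -ef` snapshots. The following is a minimal sketch, not part of the original deck: the function names `pids_by_command` and `was_restarted` are hypothetical, and the sample text is abbreviated from the FNDSM listing on this slide.

```python
def pids_by_command(ps_output):
    """Map the command name (8th field of `ps -ef`) to the set of PIDs running it."""
    result = {}
    for line in ps_output.splitlines():
        fields = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # skip shell prompts and malformed lines
        command = fields[7].split()[0]
        result.setdefault(command, set()).add(int(fields[1]))
    return result


def was_restarted(before, after, command):
    """True if `command` ran before and now runs only under new PIDs."""
    old = pids_by_command(before).get(command, set())
    new = pids_by_command(after).get(command, set())
    return bool(old) and bool(new) and new.isdisjoint(old)


# Snapshots taken before and after `kill -9 9007` (from the slide).
BEFORE = """applvis 9007 1 0 11:53 ? 00:00:00 FNDSM
applvis 9159 9155 0 11:55 ? 00:00:00 FNDLIBR"""
AFTER = """applvis 9159 9155 0 11:55 ? 00:00:00 FNDLIBR
applvis 9169 1 0 11:55 ? 00:00:00 FNDSM"""

if __name__ == "__main__":
    print("FNDSM restarted:", was_restarted(BEFORE, AFTER, "FNDSM"))
```

A process that kept its PID, such as FNDLIBR here, is reported as not restarted.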
41
11i - Defining PCP Details
In Release 11i, the Secondary Node doesn't
need to be filled in for failover to occur.
42
R12 PCP Details
In Release 12, failover won't occur if there is
no Secondary Node defined.
43
R12 PCP Setup

The only manager set up to fail over
is the Standard Manager.
44
R12 Manager Failover
45
Parallel Concurrent Processing

Parallel concurrent processing allows distribution of
concurrent managers across multiple nodes.
Benefits are better performance, availability and
scalability (load balancing).
Parallel Concurrent Processing (PCP) is activated along
with Generic Service Management (GSM); it cannot be
activated independently of GSM.
With parallel concurrent processing implemented with
GSM, the Internal Concurrent Manager (ICM) tries to
assign valid nodes for concurrent managers and other
service instances.
47

There should be only one ICM and CRM,
at any given time, although the ICM and
CRM could be configured to run on
several of the nodes.
Concurrent Managers migrate to the
surviving node when one of the concurrent
nodes goes down.
48

[Figure: Web browser (HTML, Java interface, JInitiator), Web Server, Forms Server and Reports Server in front of two concurrent processing nodes. Each node is drawn with an ICM (FNDLIBR), Standard Manager (FNDLIBR), Internal Monitor (FNDIMON), FNDCRM, Service Manager (FNDSM) and Report Review Agent, with its own requests, logs and out files (.rdx), connected to the database over SQL*Net.]
What's wrong with this picture?

49
APPLDCP Profile Option

Starting with Release 11.5.10, FND.H, the APPLDCP environment
variable is ignored. R12 GSM requires the value of APPLDCP to be
set to ON. The value is hard-coded in afpcsq.lpc version 115.35,
thereby ignoring the value of APPLDCP.
As per ATG Development:
As of file "afpcsq.lpc" version 115.35 or higher, APPLDCP is internally
hard-coded to "ON" when the Generic Service Management (GSM) is
enabled--"keeping in mind, use of the GSM is required".
In short, at "afpcsq.lpc" version 115.35 or higher with the GSM enabled,
the setting of the APPLDCP environment variable is ignored--this is the
"default behavior on all R12 releases."
NOTE: As per ARU, "Patch 11i.FND.H" (3262159) and "Oracle
Applications Release 11.5.10" (3140000) contains "afpcsq.lpc" version
115.37.
From Note: 753678.1
50
PCP Failover Mechanisms
TCP keepalive
PMON (ICM Process Monitor)
Dead Connection Detection
Connection Failure Recovery (R12)
10g Timeout Parameters (untested):
  sqlnet.inbound_connect_timeout (server)
  sqlnet.send_timeout (client and/or server)
  sqlnet.recv_timeout (client and/or server)
51
11i PCP Failure

TCP Failure
ICM Lock is released, FNDIMON pings
ICM node, if ping fails, check PMON
PMON detects a dead process, crashed
ICM
reviver.sh
DCD
52
R12 PCP Failure

TCP Failure
PMON detects a dead process
ICM Shutdown
Look for error messages ORA-3113, ORA-3114 or ORA-1041
reviver.sh
DCD
53
Reviver
[Flowchart with two lanes. ICM: Start, Receive Shutdown?, Lost DB Connection?, Spawn Reviver, Starts to Shutdown, Exit. Reviver: Sleep, Attempt to Get DB Connection, Kill Previous DB Session, ICM Started?, Start ICM, Exit.]
From the CM log file:
The ICM has lost its database connection and is shutting down.
Spawning reviver process to restart the ICM when the database becomes available again.
Spawned reviver process 10910.
54
reviver.log
The ICM has lost its database connection
and is shutting down.
Spawning reviver process to restart the ICM
when the database becomes available
again.
Spawned reviver process 10910.
55
TCP
TCP/IP is a connection-oriented protocol; TCP
implements packet timeout and retransmission
in an effort to guarantee the safe and sequenced
order of data packets.
If a timely acknowledgement is not received in
response to the probe packet, the TCP/IP stack
will retransmit the packet some number of times
before timing out.
After TCP/IP gives up, SQL*Net receives
notification that the probe failed.
56
TCP Keepalive
At this time, client-side SQL*Net connections do not enable
keepalive for TCP connections by default.
However, it is possible to enable this by adding the
ENABLE=BROKEN parameter to the SQL*Net connect
string (the DESCRIPTION clause of the TNS alias).
**WARNING** Keepalive intervals are typically set to 2
hours or more (i.e., it can take more than 2 hours to
notice a dead server even if keepalive is enabled). To
make keepalive useful for PCP and TAF, the keepalive
interval needs to be reduced to a smaller value (such as
2 minutes).
If there are a lot of idle connections on your network, then
reducing the keepalive interval can increase network traffic
significantly.
57
ENABLE=BROKEN
Sample TNS alias to enable keepalive (notice the
ENABLE=BROKEN clause)
VIS_BALANCE =
  (DESCRIPTION =
    (ENABLE = BROKEN)
    (ADDRESS_LIST =
      (LOAD_BALANCE = ON)
      (FAILOVER = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = rh8)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rh6)(PORT = 1521))
    )
  )
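Hand-edited TNS aliases are easy to break with a missing or extra parenthesis. As a rough sanity check, and not a substitute for `tnsping` or a real parser, the sketch below counts parentheses in a descriptor. The function name `paren_balance` and the single-line `ALIAS` string are illustrative assumptions based on the sample alias above.

```python
def paren_balance(descriptor):
    """Return 0 if parentheses balance, a positive count of unclosed '(',
    or -1 as soon as a ')' appears without a matching '('."""
    depth = 0
    for ch in descriptor:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return -1
    return depth


# The sample alias from this slide, written on one line.
ALIAS = (
    "(DESCRIPTION=(ENABLE=BROKEN)(ADDRESS_LIST=(LOAD_BALANCE=ON)(FAILOVER=ON)"
    "(ADDRESS=(PROTOCOL=TCP)(HOST=rh8)(PORT=1521))"
    "(ADDRESS=(PROTOCOL=TCP)(HOST=rh6)(PORT=1521))))"
)

if __name__ == "__main__":
    print("balance:", paren_balance(ALIAS))
```

A result other than 0 means the descriptor should be reviewed before it is copied into tnsnames.ora.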
58
TCP Keepalive
**WARNING** Keepalive intervals are
typically set to 2 hours or more (i.e., it can
take more than 2 hours to notice a dead
server even if keepalive is enabled).
To make keepalive useful for TAF, the
keepalive interval would need to be
reduced to a smaller value (such as 2
minutes). Note: 249213.1
59
TCP KeepAlive Parameters for Linux
tcp_keepalive_time: the time between the last data packet sent and the first keepalive probe
tcp_keepalive_intvl: the time between keepalive probes
tcp_keepalive_probes: the number of probes to be sent before declaring the connection dead
Default Settings
tcp_keepalive_time = 7200 seconds
tcp_keepalive_intvl = 75 seconds
tcp_keepalive_probes = 9
A total of 7875 seconds, or 2 hours 11 minutes and 15 seconds.

60
TCP Keepalive
Initial Settings
tcp_keepalive_time = 200 secs
tcp_keepalive_intvl = 20
tcp_keepalive_probes = 2
After 200 seconds of no response, TCP sends
the first of 2 probes, 20 seconds apart.
TCP notifies SQL*Net of the failure, and
SQL*Net removes the offending connection.
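The detection time for a dead peer follows directly from the three keepalive parameters: the idle time before the first probe plus one interval per probe. The sketch below reproduces the arithmetic for the default settings (7875 seconds) and the initial test settings (240 seconds). The function names are hypothetical.

```python
def keepalive_detection_seconds(time_s, intvl_s, probes):
    """Idle time before the first probe plus one interval for each probe sent."""
    return time_s + intvl_s * probes


def format_hms(seconds):
    """Render a number of seconds as 'Xh Ym Zs'."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes}m {secs}s"


DEFAULTS = keepalive_detection_seconds(7200, 75, 9)  # Linux defaults from the slide
TUNED = keepalive_detection_seconds(200, 20, 2)      # initial test settings

if __name__ == "__main__":
    print("default:", DEFAULTS, "seconds =", format_hms(DEFAULTS))
    print("tuned:  ", TUNED, "seconds =", format_hms(TUNED))
```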
61
TCP Retries
tcp_retries1 (default: 3) The number of times TCP will
attempt to retransmit a packet on an established
connection normally, without the extra effort of getting
the network layers involved.
tcp_retries2 (default: 15) The maximum number of times
a TCP packet is retransmitted in established state before
giving up
tcp_syn_retries (default: 5) The maximum number of
times initial SYNs for an active TCP connection attempt
will be retransmitted. The default value of 5 corresponds
to approximately 180 seconds.
62
TCP Retries
Now let's consider changing the following
TCP parameters from their default values:
tcp_retries1 = 2
tcp_retries2 = 2
tcp_syn_retries = 2
In this example, the time to initiate the PCP
failover was an average of 8 seconds after
changing these TCP parameters.
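Before and after changing these parameters, it helps to record what the kernel is actually using. On Linux the values are exposed as files under /proc/sys/net/ipv4. The sketch below reads them; the function name `read_tcp_settings` and its `base` argument (useful for testing against a copy of the files) are assumptions made for this illustration.

```python
from pathlib import Path

# Parameter names discussed on the keepalive and retry slides.
TCP_SETTINGS = (
    "tcp_keepalive_time",
    "tcp_keepalive_intvl",
    "tcp_keepalive_probes",
    "tcp_retries1",
    "tcp_retries2",
    "tcp_syn_retries",
)


def read_tcp_settings(base="/proc/sys/net/ipv4"):
    """Return {name: int value} for each setting file found under `base`."""
    values = {}
    for name in TCP_SETTINGS:
        path = Path(base) / name
        if path.is_file():
            values[name] = int(path.read_text().strip())
    return values


if __name__ == "__main__":
    for name, value in read_tcp_settings().items():
        print(f"{name} = {value}")
```

On a non-Linux host the directory does not exist and the function returns an empty dictionary.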
63
Disconnect TCP Connection from RH9
From the ICM log:
The Internal Concurrent Manager has encountered an error.
Review concurrent manager log file for more detailed information. : 12-JAN-2009 15:22:55
Shutting down Internal Concurrent Manager : 12-JAN-2009 15:22:55
12-JAN-2009 15:22:55
The ICM has lost its database connection and is shutting down.
Spawning reviver process to restart the ICM when the database
becomes available again.
The VIS_0112@VIS internal concurrent manager has terminated with
status 1 - giving up.
Found dead process: spid=(17963), cpid=(1302176), ORA pid=(26),
manager=(0/1)
64
PMON & fnd_concurrent_queues

PMON updates the work_start column in the
fnd_concurrent_queues table every 4 PMON cycles
fdpsrp() (running_processes correction):
ICM cannot obtain exclusive lock on
FND_CONCURRENT_QUEUES
Oracle error code returned: 1
This message is information and does not indicate a
problem with CP functionality.
remote call function (FNDIMON)
15-AUG-2008 10:06:02 - Function to call: PingProcess
65
PMON ICM Lock 11i

If the ICM lock is not available, FNDIMON will
now ping the node of the ICM.
If the ping succeeds, we conclude that the ICM is
fine.
What????
If the ping fails, we further check if it has been over
quesiz pmon cycles since the ICM updated the
work_start column in fnd_concurrent_queues.
If it has been more than four pmon cycles we
conclude that the ICM is dead.
66
PMON found dead process

On RH9 the PMON found a dead process. The
PMON takes about 1 second to run, then sleeps for
2 minutes:
Process monitor session started : 18-JAN-2009 21:46:05
Instance=(36543)
Process monitor session ended : 18-JAN-2009 21:46:06
The Internal Concurrent Manager has encountered an error.
Review concurrent manager log file for more detailed
information. : 18-JAN-2009 22:02:01
67
PMON node RH9 is down

From the ICM log:
Process monitor session started : 12-JAN-2009
15:18:27
Internal Concurrent Manager found node RH9 to
be down. Adding it to the list of unavailable
nodes.
CONC-SM TNS FAIL
Call to PingProcess failed for XDPCTRLS
68
PMON
Process monitor session started : 18-JAN-2009 22:38:57
CONC-SM TNS FAIL
Call to PingProcess failed for OAMGCS
18-JAN-2009 22:38:58 - Node:(RH7), Service Manager:(FNDSM_RH7_VIS) currently unreachable by TNS
Found dead process: spid=(11234), cpid=(1321563), ORA pid=(167), manager=(0/4)
Process monitor session ended : 18-JAN-2009 22:38:58
69
PMON
Shutting down Internal Concurrent Manager : 18-JAN-2009 22:02:01
18-JAN-2009 22:02:01
The ICM has lost its database connection and is
shutting down.
Spawning reviver process to restart the ICM when
the database becomes available again.
70
PMON runs every 2 minutes

Process monitor session ended : 18-JAN-2009 21:49:05
Process monitor session started : 18-JAN-2009 21:51:05
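The PMON sleep interval can be measured from the log itself by pairing each "session ended" line with the next "session started" line. The sketch below does this for the two lines above. The function names and the regular expression are illustrative assumptions, and the timestamps are written with the hyphens restored (18-JAN-2009).

```python
import re
from datetime import datetime

# Month abbreviations as they appear in the concurrent manager logs.
MONTHS = {m: i for i, m in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}

SESSION_RE = re.compile(
    r"Process monitor session (started|ended)\s*:\s*"
    r"(\d{2})-([A-Z]{3})-(\d{4}) (\d{2}):(\d{2}):(\d{2})"
)


def pmon_events(log_text):
    """Return (event, datetime) tuples for each PMON start/end line, in log order."""
    events = []
    for event, day, mon, year, hh, mm, ss in SESSION_RE.findall(log_text):
        when = datetime(int(year), MONTHS[mon], int(day), int(hh), int(mm), int(ss))
        events.append((event, when))
    return events


def sleep_gaps(events):
    """Seconds between each 'ended' event and the 'started' event that follows it."""
    gaps = []
    for (ev1, t1), (ev2, t2) in zip(events, events[1:]):
        if ev1 == "ended" and ev2 == "started":
            gaps.append((t2 - t1).total_seconds())
    return gaps


SAMPLE = """\
Process monitor session ended : 18-JAN-2009 21:49:05
Process monitor session started : 18-JAN-2009 21:51:05
"""

if __name__ == "__main__":
    print("PMON sleep gaps (seconds):", sleep_gaps(pmon_events(SAMPLE)))
```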
71
Edit ICM Runtime Parameters
72
Edit PMON Parameters
73
Edit PMON Parameters
ICM parameters are read from batchmgr.sh when
adcmctl.sh runs. Changing these parameters here does
not change batchmgr.sh!
74
$FND_TOP/bin/batchmgr.sh
Make sure the PMON changes are made in the $FND_TOP/bin/batchmgr.sh file.
# FILENAME
#   batchmgr
# DESCRIPTION
#   fire up Internal Concurrent Manager process
# USAGE
#   batchmgr arg1=val1 arg2=val2 ...
#
#   Parameters may be sent via the environment.
#
# ARGUMENTS                                   DEFAULT
#   [appmgr|sysmgr]=username/password
#   [sleep=sleep_seconds]                     15
#   [mgrname=manager_name]                    icm
#   [logfile=log_filename]                    $FND_TOP/$APPLLOG/$mgrname.mgr
#   [restart=N|mim minutes between restarts]  N
#   [mailto="user1 user2..."]                 current user
#   [PRINTER=printer_name]
#   [pmon=iterations]                         4
#   [quesiz=pmon_iterations]                  1
#   [diag=Y|N]                                N
75
reviver.log
reviver.sh starting up...
[ Mon Jan 12 20:02:15 MST 2009 ] - Read APPS username/password.
[ Mon Jan 12 20:02:45 MST 2009 ] - Attempting database connection...
[ Mon Jan 12 20:02:45 MST 2009 ] - Successful database connection.
[ Mon Jan 12 20:02:45 MST 2009 ] - Killing previous ICM session...
1 row updated.
Commit complete.
[ Mon Jan 12 20:02:45 MST 2009 ] - Looking for a running ICM process...
[ Mon Jan 12 20:02:45 MST 2009 ] - ICM now running, reviver.sh complete.
77
reviver.sh
reviver.sh code summary
Sleep 30
Test_connection
Kill_old_icm
  Get session
  Alter system kill session
Check_running_icm
  Fnd_conc.ecm_alive
start_icm
  startmgr.sh
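The control flow summarized above can be sketched as a small loop. This is an illustration of the sequence only, not the contents of reviver.sh: the four callables are hypothetical stand-ins for the script's connection test, old-session kill, running-ICM check and startmgr.sh call, and the `sleep` argument exists so the loop can be exercised without waiting.

```python
import time


def reviver(test_connection, kill_old_icm, icm_running, start_icm,
            sleep_seconds=30, sleep=time.sleep):
    """Wait until the database is reachable, clean up the old ICM session,
    and start a new ICM only if none is already running."""
    while True:
        sleep(sleep_seconds)       # "Sleep 30"
        if test_connection():      # "Test_connection"
            break
    kill_old_icm()                 # "Kill_old_icm"
    if icm_running():              # "Check_running_icm"
        return "already running"
    start_icm()                    # "start_icm"
    return "started"


if __name__ == "__main__":
    attempts = iter([False, True])  # database comes back on the second try
    outcome = reviver(
        test_connection=lambda: next(attempts),
        kill_old_icm=lambda: print("killing previous ICM session"),
        icm_running=lambda: False,
        start_icm=lambda: print("starting ICM"),
        sleep=lambda seconds: None,  # skip the real delay in this demo
    )
    print("reviver finished:", outcome)
```

Passing the steps in as callables keeps the sketch independent of any real database or script path.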
78
Dead Connection Detection

Dead Connection Detection (DCD) is a
feature of SQL*Net 2.1 and later, including
Oracle Net8. DCD detects when a partner
in a SQL*Net V2 client/server or
server/server connection has terminated
unexpectedly, and releases the resources
associated with it.
79
Implement DCD
Implement by:
adding SQLNET.EXPIRE_TIME = 1 (Minutes)
to the sqlnet.ora file
If the connection is idle for the time interval
specified in minutes by the
SQLNET.EXPIRE_TIME parameter, the server-side
process sends a small 10-byte packet to the
client. The packet is sent using TCP/IP.
80
DCD ICM Lock

ICM and IM can use the DCD functionality
of the Network (TCP sqlnet).
ICM is a client process connected to a
DCD enabled DB dedicated server
process.
ICM holds the named PL/SQL Lock, the
ICM lock.
IM is continuously trying to check whether
it can get the same named PL/SQL Lock.
81
DCD ICM Lock

As soon as the ICM lock is released by the DB / DCD,
FNDIMON pings the ICM node, and the IM deduces that
the ICM has crashed.
If the ping succeeds, we conclude that the ICM is fine.
Obviously, the ICM can be down even if TCP is working, so this is bad
logic.
If the ping fails, FNDIMON determines whether it has been over four
pmon cycles since the ICM updated the work_start column in
fnd_concurrent_queues.
If it has been more than four pmon cycles, FNDIMON concludes
the ICM is dead.
The DCD comes into picture here after ICM has crashed
and DB needs to identify that the ICM is gone.
The DB needs to clean up the dedicated server process
resource corresponding to the ICM client process
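The decision described on this slide can be written as a short predicate, which also makes the criticized shortcut easy to see. This is a sketch of the logic as the slide states it, not FNDIMON's actual code. The function name and arguments are hypothetical, and the slide does not say what happens at four cycles or fewer, so the sketch assumes the ICM is not yet declared dead in that case.

```python
def icm_presumed_dead(lock_released, ping_ok, pmon_cycles_since_update, threshold=4):
    """Apply the slide's checks in order and return True if the ICM is judged dead."""
    if not lock_released:
        return False  # the ICM still holds its named lock
    if ping_ok:
        # The shortcut the slide calls bad logic: a reachable node is
        # taken to mean the ICM itself is fine.
        return False
    # Ping failed: dead only if work_start is stale by more than `threshold` cycles.
    return pmon_cycles_since_update > threshold


if __name__ == "__main__":
    print(icm_presumed_dead(lock_released=True, ping_ok=False, pmon_cycles_since_update=5))
    print(icm_presumed_dead(lock_released=True, ping_ok=True, pmon_cycles_since_update=10))
```

The second call shows the weakness: the node answers the ping, so the ICM is treated as alive no matter how stale work_start is.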
82
FNDIMON has the ICM Lock

Check if the ICM updated the work_start column in fnd_concurrent_queues.
Be aware that if a TCP failure is not detected, failover will not occur.
The following excerpt from a concurrent manager log shows:
fdpsrp() (running_processes correction):
ICM cannot obtain exclusive lock on FND_CONCURRENT_QUEUES
Oracle error code returned: 1
This message is information and does not indicate a problem with CP
functionality.
remote call function (FNDIMON)
15-AUG-2008 10:06:02 - Function to call: PingProcess
The PingProcess continues until the CP processes resume, or a TCP
failure is detected and failover begins.
83
Test PCP Failover Parameters

Test to explore effect of DCD, PMON and TCP
failover methods.
Variables: sqlnet.expire_time, pmon sleep and
number of cycles, and the following TCP
Keepalive parameters:
tcp_keepalive_time,
tcp_keepalive_intvl,
tcp_keepalive_probes
tcp_retries1 (default: 3, new value 2)
tcp_retries2 (default: 15, new value 2)
tcp_syn_retries (default: 5, new value 2)
86
Failover Test Results

Columns: Failover time / Failback time | Expire_time | PMON Sleep | PMON Cycles | tcp KA time | tcp KA intvl | tcp KA probes | tcp retries | tcp retries2 | tcp syn retries
Rows (only the cells populated on the slide, in slide order):
241 secs /         | 1 minute   | 30 secs | 200  | 20 | 15
250 secs / 50 secs | 5 minutes  | 30 secs | 200  | 20 | 15
262 secs / 100 secs | 10 minutes | 30 secs | 200  | 20 | 15
300 secs / 75 secs | 1 minute   | 15 secs | 200  | 20 | 15
285 secs / 35 min  | 10 minutes | 30 secs | 1000 | 60 | 10 | 15
8 secs / 105 secs  | 1 minute   | 30 secs | 1000 | 60 | 10
10 secs / 42 secs  | 1 minute   | 30 secs | 200  | 20
7 secs / 40 secs   | 10 minutes | 30 secs | 200  | 20
6 secs / 34 secs   | 1 minute   | 15 secs | 200  | 20
87
All Services are UP
88
Concurrent Managers
Processes - Actual = 1 and Target = 1, manager is running

Processes - Actual = 0 and Target = 1, manager is running
89
Actual Processes = 0
Example of Actual Processes = 0;
in this example the CRM is not running
90
PCP Setup
PCP setup: this screen is continued on the next slide

91
Primary and Secondary Nodes

Any concurrent programs not assigned to the Standard Manager will not fail over.
The CRM, ICM and Standard Manager will fail over.
92
TCP Failure
TCP disconnected at 2:57:25

10 seconds after the TCP connection was pulled, OAM reported the status above.
It took 10 seconds for OAM to register a failure of services on RH9.
93
CRM is DOWN
If any of the subordinate services fails, the failure rolls up to the
Dashboard
94
CRM Failure
CRM has failed, Actual Processes = 0
95
PCP Failover from RH9 to RH7
Adding Node:(RH9), to unavailable list

Found dead process: spid=(9696), cpid=(1321449), ORA pid=(80), manager=(0/0)
Found running request 4413565 attached to dead manager process.
Attempting to restart request.
Internal Concurrent Manager found node RH9 to be down. Adding it to the list of
unavailable nodes.
96
GSM tries to restart the services

TCP and TNS are unavailable:
: 18-JAN-2009 21:43:42
CONC-SM TNS FAIL
Routine AFPEIM encountered an error while starting concurrent manager STANDARD
with library /d01/oracle/VIS/apps/apps_st/appl/fnd/12.0.0/bin/FNDLIBR.
Check that your system has enough resources to start a concurrent manager process.
Contac : 18-JAN-2009 21:43:42
: 18-JAN-2009 21:43:42
CONC-SM TNS FAIL
Check that your system has enough resources to start a concurrent manager process.
Contac : 18-JAN-2009 21:43:42
: 18-JAN-2009 21:43:42
CONC-SM TNS FAIL
97
ICM and CRM are DOWN
98
RH9 is DOWN
Not really down, just not on the network
99
PCP is DOWN
This is momentary as GSM figures out what to do
100
Failover to Secondary Node
The ICM and CRM failed over to RH7 in about 1 minute and 30 seconds
101
Failover from RH9 to RH7

Starting Internal Concurrent Manager Concurrent
Manager : 18-JAN-2009 21:51:23
: Started ICM on Target RH7.
Process monitor session ended : 18-JAN-2009 21:52:53
: Migration of ICM has completed.
The VIS_0118@VIS internal concurrent manager
has terminated successfully - exiting.
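The two timestamps in this log excerpt can be used to check the migration time. The sketch below parses them and computes the difference; the format string is an assumption about how these log timestamps are laid out, and month-name parsing assumes an English locale.

```python
from datetime import datetime

# Assumed layout of the timestamps in the excerpt, e.g. "18-JAN-2009 21:51:23".
LOG_TIME_FORMAT = "%d-%b-%Y %H:%M:%S"

started = datetime.strptime("18-JAN-2009 21:51:23", LOG_TIME_FORMAT)
ended = datetime.strptime("18-JAN-2009 21:52:53", LOG_TIME_FORMAT)

# Seconds between "Started ICM on Target RH7" and the end of the
# process monitor session.
elapsed_seconds = (ended - started).total_seconds()
print(elapsed_seconds)  # 90.0
```

Ninety seconds agrees with the failover time of about 1 minute and 30 seconds reported for the ICM and CRM.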
102
ICM Failover to RH7

Manager : 18-JAN-2009 21:51:23
103
RH9 not available
104
Request Failover
105
Standard Manager Failover

Configuration
Note that the Inventory Manager, MRP Manager and OAM
Metrics Collection Manager are not set up to fail over.
106
Managers with a Secondary Node
Note that the Inventory Manager, MRP Manager and OAM
Metrics Collection Manager are not set up to fail over.
107
Failback
Failback: TCP connected at 31:40
The host, RH9, becomes available in OAM about 2
minutes later.
108
RH9 available
109
ICM Failback
110
Concurrent Manager Log

Manager : 18-JAN-2009 22:53:33
111
112
Failback Complete
Total Failback Time 3 minutes and 45 seconds

113
Standard Manager before Failover
The Standard Manager has 3 Actual and Target
processes.
114
Standard Manager is DOWN
115
Standard Manager has 2 Processes on Failover
After 3 minutes and 30 seconds, the Standard Manager started on RH7

116
Shutdown of CP
117
Concurrent Processing Load Balancing
Two types of Load Balancing
Load Balancing with both nodes running (no failover)
Load Balancing during failover
118
PCP Load Balancing

Benefits that Parallel Concurrent Processing provides:
failover in case of node failure
maintained throughput, keeping the business running during
node failures.
When a node fails, the processes that were
running on the failed node are restarted on
secondary nodes.
However, a resource-intensive node may
overload the secondary node when it fails over.
119
PCP Load Balancing

If too many processes are running on the secondary
node when the primary node fails over, the secondary
node may not have the capacity to process the requests
from additional concurrent managers.
R12 introduces Failover Sensitive Workshifts. This
enhancement allows the System Administrator to
configure how many processes fail over for each
workshift. With this added control, System Administrators
can enjoy the benefits of PCP failover without risking
performance problems from overloaded resources.
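The idea behind a Failover Sensitive Workshift can be sketched as a workshift that carries two process counts, one for normal operation and one used only when the manager is running on its secondary node. This is a conceptual model, not the R12 implementation; the class, field names and sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workshift:
    """Hypothetical model of a workshift assigned to a concurrent manager."""
    name: str
    processes: int           # target processes in normal operation
    failover_processes: int  # target processes after failing over


def target_processes(shift: Workshift, running_on_secondary: bool) -> int:
    """Pick the process count that applies in the current situation."""
    if running_on_secondary:
        return shift.failover_processes
    return shift.processes


# Illustrative numbers only: a manager that normally runs 3 processes
# but is limited to 2 when it has failed over.
standard = Workshift(name="Standard", processes=3, failover_processes=2)
print(target_processes(standard, running_on_secondary=False))
print(target_processes(standard, running_on_secondary=True))
```

Keeping the failover count separate lets the administrator size it to the spare capacity of the secondary node rather than to the primary's normal workload.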
120
R12 Failover Sensitive Workshifts
121
Failover Sensitive Workshifts
122
Conversely, if a failover occurs from node 1 to
node 2, we may want to reduce the failover
processes; however, this doesn't work.
The failover processes take effect
only if the node fails.
123
Failover Processes
The PO Document Approval Manager and the Standard Manager will reduce the number of
processes when RH7 fails. When RH9 fails, the number of failover processes for managers
that run on RH7 is not reduced.
124

It's clear: to run an R11i or R12 system during
a failover, there are two choices:
Run the servers at 35% or less utilization
Reduce the number of processes that are
allowed during failover
For most businesses, the second option is
the most practical.
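The arithmetic behind the two choices can be sketched with a simple linear model: the surviving node carries its own load plus whatever share of the failed node's load is allowed to move over. The model, the function name and the sample numbers are assumptions for illustration, and it treats both nodes as having equal capacity.

```python
def survivor_utilization(own: float, failed: float,
                         failover_fraction: float = 1.0) -> float:
    """Estimated utilization of the surviving node after a failover.

    own               -> utilization of the surviving node before failover (0..1)
    failed            -> utilization of the failed node before failover (0..1)
    failover_fraction -> share of the failed node's work allowed to move (0..1)
    """
    if not 0.0 <= failover_fraction <= 1.0:
        raise ValueError("failover_fraction must be between 0 and 1")
    return own + failed * failover_fraction


# Choice 1: both nodes at 35%, everything fails over.
print(round(survivor_utilization(0.35, 0.35), 2))
# Choice 2: both nodes at 60%, but only half of the work fails over.
print(round(survivor_utilization(0.60, 0.60, failover_fraction=0.5), 2))
```

Under this model, two nodes at 35% leave the survivor at about 70%, while two nodes at 60% would exceed full capacity unless the failover work is reduced.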
125
References
249213.1 - Performance problems with Failover when TCP Network goes down
364171.1 - TAF Session Hangs, Select Fails To Complete W/ Loss Of NIC: Tune TCP Keepalive
211362.1 - Process Monitor Session Cycle Repeats Too Frequently
291201.1 - How To Remove a Dead Connection to the Target Database
362135.1 - Configuring Oracle Applications Release 11i with Oracle10g Release 2 Real Application Clusters and Automatic Storage Management
Optimizing the E-Business Suite with Real Application Clusters (RAC) - Ahmed Alomari
240818.1 - Concurrent Processing: Transaction Manager Setup and Configuration Requirement in an 11i RAC Environment
R12 ATG - Concurrent Processing Functional Overview - Aaron Weisberg
210062.1 - Generic Service Management (GSM) in Oracle Applications 11i
271090.1 - Parallel Concurrent Processing Failover/Failback Expectations
241370.1 - Concurrent Manager Setup and Configuration Requirements in an 11i RAC Environment
602899.1 - Some More Facts On How to Activate Parallel Concurrent Processing
126
