
DBAFAQ

RAC
Date: 17th August 2010

If my OCR and Voting Disks are in ASM, can I shutdown the ASM instance?
No. You will have to stop the clusterware on that node, using either crsctl stop cluster or crsctl stop crs.

Can I run Oracle 9i RAC and Oracle RAC 10g in the same cluster?
Yes. However, Oracle Clusterware (CRS) will not support an Oracle 9i RAC database, so you will have to leave the current configuration in place. You can install Oracle Clusterware and Oracle RAC 10g into the same cluster. On Windows and Linux, you must run the 9i Cluster Manager for the 9i database and Oracle Clusterware for the 10g database. When you install Oracle Clusterware, your 9i srvconfig file will be converted to the OCR. Both Oracle 9i RAC and Oracle RAC 10g will use the OCR. Do not restart the 9i gsd after you have installed Oracle Clusterware. With Oracle Clusterware 11g Release 2, the GSD resource is disabled by default. You only need to enable this resource if you are running Oracle 9i RAC in the cluster. Remember to check Certify for details of which vendor clusterware can be run with Oracle Clusterware.

I want to use rconfig to convert a single instance to Oracle RAC, but I am using raw devices in Oracle RAC. Does rconfig support raw devices?
No. rconfig supports ASM and shared file systems only.

How many NICs do I need to implement Oracle RAC?
At minimum you need 2: external (public) and interconnect (private). When storage for Oracle RAC is provided by Ethernet-based networks (e.g. NAS/NFS or iSCSI), you will need a third interface for I/O, so a minimum of 3. Anything less will cause performance and stability problems under load. From an HA perspective, you want these to be redundant, thus needing a total of 6.

Can I run more than one clustered database on a single Oracle RAC cluster?
You can run multiple databases in an Oracle RAC cluster, either one instance per node (with different databases having different subsets of nodes in the cluster), or multiple instances per node (all databases running across all nodes), or some combination in between. Running multiple instances per node does cause memory and resource fragmentation, but this is no different from running multiple instances on a single node in a single-instance environment, which is quite common. It does provide the flexibility of being able to share CPU on the node, but the Oracle Resource Manager will not currently limit resources between multiple instances on one node. You will need to use an OS-level resource manager to do this.

Is it supported to install Oracle Clusterware and Oracle RAC as different users?
Yes, Oracle Clusterware and Oracle RAC can be installed as different users. The Oracle Clusterware user and the Oracle RAC user must both have OINSTALL as their primary group. Every database home can have a different OSDBA group with a different username.

Does changing uid or gid of the Oracle User affect Oracle Clusterware?
There are a lot of files in the Oracle Clusterware home and outside of the Oracle Clusterware home that are chgrp'ed to the appropriate groups for security and appropriate access. The filesystem records the numeric ID (not the name), so if you exchange the names, the files are now owned by the wrong user or group.

ORAFACT

Can we output the backupset onto a regular file system directly (not onto the flash recovery area) using an RMAN command when we use SE RAC?
Yes. Customers might want to back up their database to offline storage, so this is also supported.

I am receiving an ORA-29740 error. What should I do?
This error can occur when problems are detected on the cluster:
Error: ORA-29740
Text: evicted by member %s, group incarnation %s
Cause: This member was evicted from the group by another member of the cluster database for one of several reasons, which may include a communications error in the cluster, failure to issue a heartbeat to the control file, etc.
Action: Check the trace files of other active instances in the cluster group for indications of errors that caused a reconfiguration.

Why do we have a Virtual IP (VIP) in Oracle RAC 10g or 11g? Why does it just return a dead connection when its primary node fails?
The goal is application availability. When a node fails, the VIP associated with it is automatically failed over to some other node. When this occurs, the following things happen:
(1) The VIP detects the public network failure, which generates a FAN event.
(2) The new node re-ARPs the world, indicating a new MAC address for the IP.
(3) Connected clients subscribing to FAN immediately receive an ORA-3113 error or equivalent. Those not subscribing to FAN will eventually time out.
(4) New connection requests rapidly traverse the tnsnames.ora address list, skipping over the dead nodes, instead of having to wait on TCP/IP timeouts.
Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP timeout period (which can be up to 10 minutes) before getting an error. As a result, you don't really have a good HA solution without using VIPs and FAN.
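
The address-list behaviour in step (4) can be illustrated with a minimal Python sketch: a client walks an ordered list of addresses and takes the first one that accepts a TCP connection. This is only a model of the idea, not Oracle Net code. The function names, the host names, and the one-second timeout are assumptions made for illustration; a real client relies on tnsnames.ora and the Oracle Net layer. The point is that an address which fails fast (as a failed-over VIP does, by answering with a reset) lets the client move on immediately.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def tcp_probe(address: Address, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to address succeeds within timeout."""
    try:
        with socket.create_connection(address, timeout=timeout):
            return True
    except OSError:
        # Refused, reset, unreachable, unresolvable, or timed out.
        return False


def first_reachable(addresses: Iterable[Address],
                    probe: Callable[[Address], bool] = tcp_probe) -> Optional[Address]:
    """Walk the address list in order and return the first address that answers."""
    for address in addresses:
        if probe(address):
            return address
    return None


if __name__ == "__main__":
    # Simulated run: node1 is treated as down. Host names are hypothetical.
    down = {("node1-vip", 1521)}
    vips = [("node1-vip", 1521), ("node2-vip", 1521)]
    print(first_reachable(vips, probe=lambda a: a not in down))
```

The probe is passed in as a parameter so the walk can be exercised without a network; with the default tcp_probe, each dead entry costs at most the chosen timeout.
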
The easiest way to use FAN is to use an integrated client with Fast Connection Failover (FCF) such as JDBC, OCI, or ODP.NET.

What do the VIP resources do once they detect a node has failed/gone down? Are the VIPs automatically acquired, and published, or is manual intervention required? Are VIPs mandatory?
With Oracle RAC 10g or higher, each node requires a VIP. With Oracle RAC 11g Release 2, 3 additional SCAN VIPs are required for the cluster. When a node fails, the VIP associated with the failed node is automatically failed over to one of the other nodes in the cluster. When this occurs, two things happen:
1. The new node re-ARPs the world, indicating a new MAC address for this IP address. For directly connected clients, this usually causes them to see errors on their connections to the old address.
2. Subsequent packets sent to the VIP go to the new node, which will send error RST packets back to the clients. This results in the clients getting errors immediately.

In the case of existing SQL connections, errors will typically be in the form of ORA-3113 errors, while a new connection using an address list will select the next entry in the list. Without using VIPs, clients connected to a node that died will often wait for a TCP/IP timeout period before getting an error. This can be as long as 10 minutes or more. As a result, you don't really have a good HA solution without using VIPs. With Oracle RAC 11g Release 2, you can delegate the management of the VIPs to the cluster. If you do this, the Grid Naming Service (part of Oracle Clusterware) will automatically allocate and manage all VIPs in the cluster. This requires a DHCP service on the public network.

If I use Services with Oracle RAC, do I still need to set up Load Balancing?
Yes. Services allow granular definition of workload, and the DBA can dynamically define which instances provide the service. Connection Load Balancing (provided by Oracle Net Services) still needs to be set up to allow user connections to be balanced across all instances providing a service. With Oracle RAC 10g Release 2 or higher, set CLB_GOAL on the service to define the type of load balancing you want: SHORT for short-lived connections (e.g. a connection pool) or LONG (the default) for applications that have connections active for long periods (e.g. an Oracle Forms application).

Will adding a new instance to my Oracle RAC database (new node to the cluster) allow me to scale the workload?
Yes. Oracle RAC allows you to dynamically scale out your workload by adding another node to the cluster. You must remember that adding more work to the database means that, in addition to the CPU and memory that the new node brings, you will have to ensure that your I/O subsystem can support the additional I/O requirements. In an Oracle RAC environment, you need to look at the total I/O across all instances in the cluster.

What is the Cluster Verification Utility (cluvfy)?
The Cluster Verification Utility (CVU) is a validation tool that you can use to check all the important components that need to be verified at different stages of deployment in a RAC environment. Its scope ranges from initial hardware setup through a fully operational cluster for RAC deployment, and it covers all the intermediate stages of installation and configuration of the various components. Cluvfy does not take any corrective action following the failure of a verification task, does not enter into areas of performance tuning or monitoring, does not perform any cluster or RAC operation, and does not attempt to verify the internals of the cluster database or cluster elements.

How do you backup voting disk?
A: # dd if=voting_disk_name of=backup_file_name

How do I identify the voting disk location?
A: # crsctl query css votedisk

How do I identify the OCR file location?
A: Check /var/opt/oracle/ocr.loc or /etc/ocr.loc (depending on the platform), or run:
# ocrcheck

Is ssh required for normal Oracle RAC operation?
A: ssh is not required for normal Oracle RAC operation. However, ssh should be enabled for Oracle RAC and patchset installation.

What is SCAN?
A: Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g Release 2 feature that provides a single name for clients to access an Oracle Database running in a cluster. The benefit is that clients using SCAN do not need to change if you add or remove nodes in the cluster.

What is the purpose of Private Interconnect?
A: Clusterware uses the private interconnect for cluster synchronization (network heartbeat) and daemon communication between the clustered nodes. This communication is based on the TCP protocol. RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP). Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches of participating nodes in the cluster.

Why do we have a Virtual IP (VIP) in Oracle RAC?
A: Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP timeout period (which can be up to 10 minutes) before getting an error. As a result, you don't really have a good HA solution without using VIPs. When a node fails, the VIP associated with it is automatically failed over to some other node, and the new node re-ARPs the world, indicating a new MAC address for the IP. Subsequent packets sent to the VIP go to the new node, which will send error RST packets back to the clients. This results in the clients getting errors immediately.

What do you do if you see GC CR BLOCK LOST in top 5 Timed Events in AWR Report?
A: This is most likely due to a fault in the interconnect network. Check netstat -s for "fragments dropped" or "packet reassemblies failed", and work with your system administrator to find the network fault.

How many nodes are supported in a RAC Database?
A: Oracle RAC 10g Release 2 supports 100 nodes in a cluster using Oracle Clusterware, and 100 instances in a RAC database.

Srvctl cannot start instance, I get the following error PRKP-1001 CRS-0215, however sqlplus can start it on both nodes? How do you identify the problem?
A: Set the environment variable SRVM_TRACE to true and start the instance with srvctl. You will now get a detailed error stack.

What are Oracle Clusterware Components?
A: Voting Disk: Oracle RAC uses the voting disk to manage cluster membership by way of a health check, and it arbitrates cluster ownership among the instances in case of network failures. The voting disk must reside on shared disk.
Oracle Cluster Registry (OCR): Maintains cluster configuration information as well as configuration information about any cluster database within the cluster. The OCR must reside on shared disk that is accessible by all of the nodes in your cluster.

How do you backup the OCR?
A: There is an automatic backup mechanism for OCR. The default location is:
$ORA_CRS_HOME/cdata/<clustername>/
To display backups:
# ocrconfig -showbackup
To restore a backup:
# ocrconfig -restore <file_name>
With Oracle RAC 10g Release 2 or later, you can also use the export command:
# ocrconfig -export <file_name> -s online
and use the -import option to restore the contents back.
With Oracle RAC 11g Release 1, you can do a manual backup of the OCR with the command:
# ocrconfig -manualbackup

What are Oracle database background processes specific to RAC?
LMS - Global Cache Service Process
LMD - Global Enqueue Service Daemon
LMON - Global Enqueue Service Monitor
LCK0 - Instance Enqueue Process
To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy a query or transaction, Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD). The GRD contents are distributed across all of the active instances.

How do we verify an existing current backup of OCR?
A: We can verify the current backup of OCR using the following command:
ocrconfig -showbackup

What are the performance views in an Oracle RAC environment?
A: We have V$ views that are instance specific. In addition we have GV$ views, called global views, that have an INST_ID column of numeric data type. GV$ views obtain information from the individual V$ views.

What are the types of connection load-balancing?
A: There are two types of connection load balancing: server-side load balancing and client-side load balancing.

What is the difference between server-side and client-side connection load balancing?

A: Client-side load balancing happens on the client, which spreads connection requests across the listeners in its address list. In the case of server-side load balancing, the listener uses a load-balancing advisory to redirect connections to the instance providing the best service.

Give the usage of srvctl:
srvctl start instance -d db_name -i "inst_name_list" [-o start_options]
srvctl stop instance -d name -i "inst_name_list" [-o stop_options]
srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate
srvctl start database -d name [-o start_options]
srvctl stop database -d name [-o stop_options]
srvctl start database -d orcl -o mount

How do we remove ASM from an Oracle RAC environment?
A: We need to stop and delete the instance on the node first, in interactive or silent mode. After that, ASM can be removed using the srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify that ASM has been removed by issuing the following command:
srvctl config asm -n node_name

How do we verify that an instance has been removed from OCR after deleting an instance?
A: Issue the following srvctl command:
srvctl config database -d database_name
cd CRS_HOME/bin
./crs_stat

What enables the load balancing of applications in RAC?
A: Oracle Net Services enable the load balancing of application connections across all of the instances in an Oracle RAC database.

WHAT IS CACHE FUSION AND HOW DOES THIS AFFECT APPLICATIONS?
* Cache Fusion is a new parallel database architecture for exploiting clustered computers to achieve scalability of all types of applications.
* Cache Fusion is a shared cache architecture that uses the high-speed, low-latency interconnects available on clustered systems to maintain database cache coherency. Database blocks are shipped across the interconnect to the node where access to the data is needed.

WHAT ARE THE DEPENDENCIES BETWEEN OCFS AND ASM IN ORACLE DATABASE 10G?
In an Oracle RAC 10g environment, there is no dependency between Automatic Storage Management (ASM) and Oracle Cluster File System (OCFS). OCFS is not required if you use ASM for database files. You can use OCFS on Windows (Version 2 on Linux) for files that ASM does not handle. If you do not want to use ASM for your database files, you can still use OCFS for database files in Oracle Database 10g.

HOW DO I DETERMINE WHICH NODE IN THE CLUSTER IS THE "MASTER" NODE?
* For the cluster synchronization service (CSS), the master can be found by searching ORACLE_HOME/log/cssd/ocssd.log, where ORACLE_HOME is the Oracle home for Oracle Clusterware (this is the Grid Infrastructure home in Oracle Database 11g Release 2).
* For the master of an enqueue resource with Oracle RAC, you can select from v$ges_resource. There should be a master_node column.

CAN I RUN ORACLE RAC 10G WITH ORACLE RAC 11G?
Yes. Oracle Clusterware should always run at the highest level. With Oracle Clusterware 11g, you can run both Oracle RAC 10g and Oracle RAC 11g databases. If you are using ASM for storage, you can use either Oracle Database 10g ASM or Oracle Database 11g ASM; however, to get the 11g features, you must be running Oracle Database 11g ASM. It is recommended to use Oracle Database 11g ASM.

CAN RMAN BACKUP ORACLE REAL APPLICATION CLUSTER DATABASES?
* Absolutely. RMAN can be configured to connect to all nodes within the cluster to parallelize the backup of the database files and archive logs.
* If files need to be restored, using SET AUTOLOCATE ON tells RMAN to search for backed-up files and archive logs on all nodes.

WHAT IS CRS?
Cluster Ready Services (CRS) is the primary program that manages high availability operations in a RAC environment.
The crs process manages designated cluster resources, such as databases, services, and listeners.

WHAT ARE ALL THE RAC BACKGROUND PROCESSES?
DIAG - Diagnosability Daemon.
LCKx - This process manages the global enqueue requests and the cross-instance broadcast. Workload is automatically shared.
LMON - The Global Enqueue Service Monitor (LMON) monitors the entire cluster to manage the global enqueues and the resources.
LMDx - The Global Enqueue Service Daemon (LMD) also handles deadlock detection and remote enqueue requests. Remote resource requests are the requests originating from another instance.
LMSx - The Global Cache Service Processes (LMSx) are the processes that handle remote Global Cache Service (GCS) messages.

CRSCTL COMMANDS
Enable Oracle Clusterware
# crsctl enable crs
Start Oracle Clusterware
# crsctl start crs
Stop Oracle Clusterware
# crsctl stop crs
Disable Oracle Clusterware
# crsctl disable crs

CHECKING VOTING DISK LOCATION
$ crsctl query css votedisk
0. 0 /dev/sda3
1. 0 /dev/sda5
2. 0 /dev/sda6
Located 3 voting disk(s).

Note: Any command which just needs to query information can be run as the oracle user, but anything which alters Oracle Clusterware requires root privileges.

Add Voting disk
# crsctl add css votedisk path
Remove Voting disk
# crsctl delete css votedisk path

Check CRS Status
HOW TO SEE THE PARTICULAR DAEMON STATUS?
$ crsctl check cssd
Cluster Synchronization Services appears healthy.
$ crsctl check crsd
Cluster Ready Services appears healthy.
$ crsctl check evmd
Event Manager appears healthy.

You can also check Clusterware status on both the nodes using:
$ crsctl check cluster
prod01 ONLINE
prod02 ONLINE

CHECKING ORACLE CLUSTERWARE VERSION:
To determine the software version (binary version of the software on a particular cluster node) use
$ crsctl query crs softwareversion
Oracle Clusterware version on node [prod01] is [11.1.0.6.0]
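
When the software version has to be compared across nodes in a script, the line printed by crsctl query crs softwareversion can be parsed into a node name and a comparable tuple. The following is a minimal Python sketch, assuming the output has exactly the shape of the sample line quoted above; the function name is hypothetical, and other Clusterware releases may word the message differently.

```python
import re
from typing import Optional, Tuple

# Matches lines shaped like the sample above:
#   Oracle Clusterware version on node [prod01] is [11.1.0.6.0]
_SOFTWARE_VERSION = re.compile(
    r"Oracle Clusterware version on node \[(?P<node>[^\]]+)\] "
    r"is \[(?P<version>\d+(?:\.\d+)*)\]"
)


def parse_software_version(line: str) -> Optional[Tuple[str, Tuple[int, ...]]]:
    """Return (node, version tuple) parsed from one output line, or None."""
    match = _SOFTWARE_VERSION.search(line)
    if match is None:
        return None
    # Split "11.1.0.6.0" into integers so versions compare numerically.
    version = tuple(int(part) for part in match.group("version").split("."))
    return match.group("node"), version
```

Because Python compares tuples element by element, two parsed versions can be compared directly, which avoids the string-comparison trap where "11.10" sorts before "11.2".
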

FOR CHECKING ACTIVE VERSION ON CLUSTER, USE
$ crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [11.1.0.6.0]

VIEWING OCR DISK INFORMATION:
[root@node1-pub ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 262120
Used space (kbytes) : 3848
Available space (kbytes) : 258272
ID : 744414276
Device/File Name : /u02/ocfs2/ocr/OCRfile_0
Device/File integrity check succeeded
Device/File Name : /u02/ocfs2/ocr/OCRfile_1
Device/File integrity check succeeded
Cluster registry integrity check succeeded

RESTORING VOTEDISKS
crsctl stop crs
crsctl query css votedisk
dd if=<backup of Votedisk> of=<Votedisk file> (do this for all the votedisks)
crsctl start crs

ENABLE THE NODEAPPS, ASM, DATABASE INSTANCES FOR ALL THE NODES:
srvctl enable instance -d test -i test1,test2
srvctl enable asm -n node1-pub
srvctl enable asm -n node2-pub
srvctl enable nodeapps -n node1-pub
srvctl enable nodeapps -n node2-pub

VIEWING NO. OF NODES CONFIGURED IN CLUSTER:
[root@node1-pub ~]# olsnodes -n -p -i
node1-pub 1 node1-prv node1-vip
node2-pub 2 node2-prv node2-vip

WHAT ARE THE MAJOR RAC WAIT EVENTS?
In a RAC environment the buffer cache is global across all instances in the cluster, and hence the processing differs. The most common wait events related to this are:
GC CR REQUEST: The time it takes to retrieve the data from the remote cache. Reason: RAC traffic using a slow connection, or inefficient queries. Poorly tuned queries increase the number of data blocks requested by an Oracle session, and the more blocks requested, the more often a block will need to be read from a remote instance via the interconnect.
GC BUFFER BUSY: The time the remote instance locally spends accessing the requested data block.

What is RAC and how is it different from non RAC databases?
RAC stands for Real Application Clusters. It allows multiple nodes in a clustered system to mount and open a single database that resides on shared disk storage. Should a single system (node) fail, the database service will still be available on the remaining nodes. A non-RAC database is only available on a single system. If that system fails, the database service will be down (single point of failure).

Can any application be deployed on RAC?
Most applications can be deployed on RAC without any modifications and still scale linearly (well, almost). Applications with 'hot' blocks (the same data blocks continuously accessed by processes on different nodes) may not work well. This is because data blocks will constantly be moved from one Oracle instance to another. In such cases the application may need to be partitioned based on function or data to eliminate the contention.

Do you need special hardware to run RAC?
RAC requires the following hardware components: a dedicated network interconnect (this might be as simple as a fast network connection between nodes) and a shared disk subsystem.

How many OCR and voting disks should one have?
For redundancy, one should have at least two OCR disks and three voting disks (raw disk partitions). These disk partitions should be spread across different physical disks.

How does one convert a single instance database to RAC?
Oracle 10gR2 introduces a utility called rconfig (located in $ORACLE_HOME/bin) that will convert a single instance database to a RAC database.
$ cp $ORACLE_HOME/assistants/rconfig/sampleXMLs/ConvertToRAC.xml racconv.xml
$ vi racconv.xml
$ rconfig racconv.xml

One can also use dbca and Enterprise Manager to convert the database to RAC mode.

For prior releases, follow these steps:

Shut down your database:
SQL> CONNECT SYS AS SYSDBA
SQL> SHUTDOWN NORMAL

Enable RAC. On Unix this is done by relinking the Oracle software.

Make the software available on all computer systems that will run RAC. This can be done by copying the software to all systems or to a shared clustered file system.

Each instance requires its own set of redo log files (called a thread). Create additional log files:
SQL> CONNECT SYS AS SYSDBA
SQL> STARTUP EXCLUSIVE
SQL> ALTER DATABASE ADD LOGFILE THREAD 2
  2  GROUP 4 ('RAW_FILE1') SIZE 500k,
  3  GROUP 5 ('RAW_FILE2') SIZE 500k,
  4  GROUP 6 ('RAW_FILE3') SIZE 500k;
SQL> ALTER DATABASE ENABLE PUBLIC THREAD 2;

Each instance requires its own set of undo segments (rollback segments). To add undo segments for new nodes:
UNDO_MANAGEMENT = auto
UNDO_TABLESPACE = undots2

Edit the SPFILE/INIT.ORA files and number the instances 1, 2,...:
CLUSTER_DATABASE = TRUE (PARALLEL_SERVER = TRUE prior to Oracle9i)
INSTANCE_NUMBER = 1
THREAD = 1
UNDO_TABLESPACE = undots1 (or ROLLBACK_SEGMENTS if you use UNDO_MANAGEMENT=manual)
# Include %T for the thread in the LOG_ARCHIVE_FORMAT string.
# Set LM_PROCS to the number of nodes * PROCESSES
# etc....

Create the dictionary views needed for RAC by running catclust.sql (previously called catparr.sql):
SQL> START ?/rdbms/admin/catclust.sql

On all the computer systems, start up the instances:
SQL> CONNECT / AS SYSDBA
SQL> STARTUP;

How does one stop and start RAC instances?
There is no difference between the way you start a normal database and a RAC database, except that a RAC database needs to be started from multiple nodes. The CLUSTER_DATABASE=TRUE (PARALLEL_SERVER=TRUE) parameter needs to be set before a database can be started in cluster mode.
In Oracle 10g one can use the srvctl utility to start instances and listeners across the cluster from a single node. Here are some examples:
$ srvctl status database -d RACDB
$ srvctl start database -d RACDB
$ srvctl start instance -d RACDB -i RACDB1
$ srvctl start instance -d RACDB -i RACDB2
$ srvctl stop database -d RACDB
$ srvctl start asm -n node2

How can I test if a database is running in RAC mode?

Use the DBMS_UTILITY package to determine if a database is running in RAC mode or not. Example:
BEGIN
  IF dbms_utility.is_cluster_database THEN
    dbms_output.put_line('Running in SHARED/RAC mode.');
  ELSE
    dbms_output.put_line('Running in EXCLUSIVE mode.');
  END IF;
END;
/
Another method is to look at the database parameters. For example, from SQL*Plus:
SQL> show parameter CLUSTER_DATABASE
If the value of CLUSTER_DATABASE is FALSE, then the database is not running in RAC mode.

How can I keep track of active instances?
You can keep track of active RAC instances by executing one of the following queries:
SELECT * FROM SYS.V_$ACTIVE_INSTANCES;
SELECT * FROM SYS.V_$THREAD;

Can one see how connections are distributed across the nodes?
Select from gv$session. Some examples:
SELECT inst_id, count(*) "DB Sessions"
FROM gv$session
WHERE type = 'USER'
GROUP BY inst_id;
With login time (hour):
SELECT inst_id, TO_CHAR(logon_time, 'DD-MON-YYYY HH24') "Hour when connected", count(*) "DB Sessions"
FROM gv$session
WHERE type = 'USER'
GROUP BY inst_id, TO_CHAR(logon_time, 'DD-MON-YYYY HH24')
ORDER BY inst_id, TO_CHAR(logon_time, 'DD-MON-YYYY HH24');

What are Oracle Clusterware processes for 10g on Unix and Linux?
Cluster Synchronization Services (ocssd): Manages cluster node membership and runs as the oracle user; failure of this process results in cluster restart.
Cluster Ready Services (crsd): The crs process manages cluster resources (which could be a database, an instance, a service, a listener, a virtual IP (VIP) address, an application process, and so on) based on the resource's configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. This process runs as the root user.
Event manager daemon (evmd): A background process that publishes events that crs creates.
Process Monitor Daemon (OPROCD): This process monitors the cluster and provides I/O fencing. OPROCD performs its check, stops running, and if the wake up is beyond the expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.
RACG (racgmain, racgimon): Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.
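
A quick health check can compare the daemons named in the list above against the commands currently running on a node. The sketch below is a hedged illustration in Python: it takes command strings (for example, the command column collected from a process listing) and reports which daemons have no match. The function name is hypothetical, and matching on the start of the executable's base name is an assumption about how the binaries are named on a given platform (for example, a ".bin" suffix); verify the actual names on your system.

```python
import os
from typing import Iterable, List

# Daemon names as given in the list above, lower-cased.
EXPECTED_DAEMONS = ("ocssd", "crsd", "evmd", "oprocd")


def missing_daemons(commands: Iterable[str],
                    expected: Iterable[str] = EXPECTED_DAEMONS) -> List[str]:
    """Return the expected daemons that match none of the given commands.

    Each command is an executable path or name, optionally followed by
    arguments. A daemon counts as present when the executable's base name
    starts with the daemon name (an assumed naming convention).
    """
    basenames = [
        os.path.basename(cmd.split()[0]).lower()
        for cmd in commands
        if cmd.strip()
    ]
    return [
        name for name in expected
        if not any(base.startswith(name) for base in basenames)
    ]
```

Since OPROCD may legitimately be absent where the hangcheck timer is used instead, a caller can pass a shorter expected list, such as ("ocssd", "crsd", "evmd").
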

How do you troubleshoot node reboot?
Please check Metalink ...
Note 265769.1 Troubleshooting CRS Reboots
Note 559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle Clusterware Node evictions.

What do you do if you see GC CR BLOCK LOST in top 5 Timed Events in AWR Report?
This is most likely due to a fault in the interconnect network. Check netstat -s for "fragments dropped" or "packet reassemblies failed", and work with your system administrator to find the network fault.
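
The netstat -s check above can be scripted so that the two counters are easy to compare over time. The following is a minimal Python sketch. It assumes each counter appears on its own line as a number followed by the text, which is how common Linux builds print it; other platforms may format the statistics differently, and the function name is hypothetical.

```python
import re
from typing import Dict

# Counter phrases named in the answer above. Matching is case-insensitive and
# accepts both "reassembles" and "reassemblies", since wording varies by system.
_PATTERNS = {
    "fragments dropped": re.compile(
        r"^\s*(\d+)\s+fragments dropped", re.IGNORECASE),
    "packet reassemblies failed": re.compile(
        r"^\s*(\d+)\s+packet reassembl\w* failed", re.IGNORECASE),
}


def interconnect_loss_counters(netstat_output: str) -> Dict[str, int]:
    """Sum the loss-related counters found in the text of `netstat -s`."""
    counters = {name: 0 for name in _PATTERNS}
    for line in netstat_output.splitlines():
        for name, pattern in _PATTERNS.items():
            match = pattern.match(line)
            if match:
                counters[name] += int(match.group(1))
    return counters
```

These counters are cumulative since the network stack started, so a single reading says little; taking two readings a few minutes apart and comparing them shows whether packets are being lost now.
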
How many nodes are supported in a RAC database?
Oracle 10g Release 2 supports 100 nodes in a cluster using Oracle Clusterware, and 100 instances in a RAC database.

srvctl cannot start an instance and I get the errors PRKP-1001 and CRS-0215, but sqlplus can start it on both nodes. How do you identify the problem?
Set the environment variable SRVM_TRACE to true and start the instance with srvctl again. You will now get a detailed error stack.
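As a rough sketch of the tracing step above, the snippet below sets SRVM_TRACE only for the srvctl child process, so the calling shell stays untouched. The helper names are hypothetical, and the sketch assumes srvctl is on the PATH of the user running it.

```python
import os
import subprocess


def srvctl_start_instance_command(db_name: str, instance_name: str) -> list:
    """Build the argument list for `srvctl start instance -d <db> -i <inst>`."""
    return ["srvctl", "start", "instance", "-d", db_name, "-i", instance_name]


def traced_environment(base_env=None) -> dict:
    """Return a copy of the environment with SRVM_TRACE switched on."""
    env = dict(os.environ if base_env is None else base_env)
    env["SRVM_TRACE"] = "true"
    return env


def start_instance_with_trace(db_name: str, instance_name: str) -> int:
    """Run srvctl with tracing enabled; trace output goes to the terminal."""
    result = subprocess.run(
        srvctl_start_instance_command(db_name, instance_name),
        env=traced_environment(),
        check=False,
    )
    return result.returncode
```

For example, start_instance_with_trace("orcl", "orcl1") would run the start command for a database named orcl and an instance named orcl1. Both names are placeholders.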

What is the purpose of the ONS daemon?
The Oracle Notification Service (ONS) daemon is a daemon started by the CRS clusterware as part of the nodeapps. One ONS daemon is started per clustered node. The ONS daemon receives a subset of published clusterware events via the local evmd and racgimon clusterware daemons and forwards those events to application subscribers and to the local listeners. This facilitates:
a. the FAN (Fast Application Notification) feature, which allows applications to respond to database state changes.
b. the 10gR2 Load Balancing Advisory, the feature that permits load balancing across different RAC nodes depending on the load on those nodes. The RDBMS MMON process creates an advisory for the distribution of work every 30 seconds and forwards it via racgimon and ONS to listeners and applications.

What is Global Cache Service?
Global Cache Service (GCS) is the main component of Oracle Cache Fusion technology. It is represented by the background process LMSn. There can be a maximum of 10 LMS processes per instance. The main function of GCS is to track the status and location of data blocks. The status of a data block means its mode and role (both are explained further below). GCS is the main mechanism by which cache coherency among multiple caches is maintained. GCS is also responsible for block transfer between the instances.

What is Global Enqueue Service?
Global Enqueue Service (GES) tracks the status of all Oracle enqueuing mechanisms. This involves all non-cache-fusion inter-instance operations. GES performs concurrency control on dictionary cache locks, library cache locks and transactions. It performs this operation for resources that are accessed by more than one instance.

Enqueue services are also present in a single-instance database. These are responsible for locking the rows of a table using different locking modes.

What is Global Resource Directory?
GES and GCS together maintain the Global Resource Directory (GRD). GRD is like an in-memory database that contains details about all the blocks present in the cache. GRD knows the location of the latest version of a block, the mode of the block, the role of the block (mode and role are discussed shortly), and so on. Whenever a user asks for a data block, GCS gets all the information from GRD. GRD is a distributed resource, meaning that each instance maintains some part of the GRD. This distributed nature of GRD is key to the fault tolerance of RAC. GRD is stored in the SGA. Typically GRD contains the following information, among other things:
Data Block Address: the address of the data block being modified
Location of the most current version of the data block
Modes of the data block
Roles of the data block
SCN number of the data block
Image of the data block: could be the current image or a past image

What are GCS resource modes and roles?
The mode of a data block is decided based on whether a resource holder intends to modify the data or read the data. The modes are as follows:
1 Null (N) Mode: Null mode is the least restrictive mode. It indicates no access rights and acts as a placeholder.
2 Shared (S) Mode: Shared mode indicates that the data block is being read and not modified. Another session can still read the data block.
3 Exclusive (X) Mode: Exclusive mode indicates exclusive access to the block. Other resources cannot write to this data block; however, they can have a consistent read on it.
GCS resources also have roles. The different roles are as follows:
1 Local: When a data block is first read into an instance from disk, it has a local role, meaning that only one copy of the data block exists in the cache. No other instance cache has a copy of this block.
2 Global: Global role indicates that multiple copies of the data block exist in the clustered instances. For example, a user connected to one of the instances requests a data block. This data block is read from disk into that instance, and the role granted is local. If another instance requests the same block, the block is copied to the requesting instance and the role becomes global.
This role and mode information is maintained in the GRD (Global Resource Directory) by GCS (Global Cache Service).

What is RAC?
RAC stands for Real Application Clusters. It is a clustering solution from Oracle Corporation that ensures high availability of databases by providing instance failover and media failover features.

What is RAC and how is it different from non-RAC databases?
RAC stands for Real Application Clusters. You have n instances, each running on its own separate node, all based on shared storage. The cluster is the key component and is a collection of servers operating as one unit. RAC is the best solution for high performance and high availability. Non-RAC databases have a single point of failure in case of hardware failure or server crash.

Give the usage of srvctl:
srvctl start instance -d db_name -i "inst_name_list" [-o start_options]
srvctl stop instance -d name -i "inst_name_list" [-o stop_options]
srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate
srvctl start database -d name [-o start_options]
srvctl stop database -d name [-o stop_options]
srvctl start database -d orcl -o mount

Mention the Oracle RAC software components:
Oracle RAC is composed of two or more database instances. Each is composed of memory structures and background processes, the same as a single-instance database. Oracle RAC instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that enable cache fusion. Oracle RAC instances are composed of the following background processes:
ACMS: Atomic Controlfile to Memory Service
GTX0-j: Global Transaction Process
LMON: Global Enqueue Service Monitor
LMD: Global Enqueue Service Daemon
LMS: Global Cache Service Process
LCK0: Instance Enqueue Process
RMSn: Oracle RAC Management Processes
RSMN: Remote Slave Monitor

What is GRD?
GRD stands for Global Resource Directory. The GES and GCS maintain records of the status of each datafile and each cached block using the Global Resource Directory. This process is referred to as cache fusion and helps maintain data integrity.

What are the different network components in 10g RAC?
Public, private, and VIP components. Private interfaces are for inter-node communication. VIP is all about the availability of the application. When a node fails, its VIP component fails over to some other node. This is the reason all applications should be based on VIP components, meaning that TNS entries should have the VIP entries in the host list.
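To illustrate what "VIP entries in the host list" can look like, here is a small sketch that renders a tnsnames.ora entry with one ADDRESS per node VIP. The function name, alias, service name and VIP host names are hypothetical, and the LOAD_BALANCE and FAILOVER settings are one reasonable choice rather than a required configuration.

```python
def build_tns_entry(alias: str, service_name: str, vip_hosts: list, port: int = 1521) -> str:
    """Render a tnsnames.ora entry that lists one ADDRESS per node VIP."""
    address_lines = [
        f"      (ADDRESS = (PROTOCOL = TCP)(HOST = {host})(PORT = {port}))"
        for host in vip_hosts
    ]
    lines = [
        f"{alias} =",
        "  (DESCRIPTION =",
        "    (ADDRESS_LIST =",
        "      (LOAD_BALANCE = yes)",  # spread new connections across the VIPs
        "      (FAILOVER = yes)",      # try the next address if one fails
        *address_lines,
        "    )",
        "    (CONNECT_DATA =",
        f"      (SERVICE_NAME = {service_name})",
        "    )",
        "  )",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder VIP host names for a two-node cluster.
    print(build_tns_entry("ORCL", "orcl", ["node1-vip", "node2-vip"]))
```

Listing the VIP host names rather than the physical host names is what lets a client receive a quick error and move on to the next address when a node goes down.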
Give details on ACMS.
ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment, ACMS is an agent that ensures distributed SGA memory updates, that is, SGA updates are globally committed on success or globally aborted in the event of a failure.

What is Cache Fusion?
Cache fusion is the mechanism that transfers a data block from the memory of one node to the memory of another. If two nodes require the same block for query or update, the block must be transferred from the cache of one node to the other. A RAC system must be equipped with a low-latency, high-speed interconnect to make this happen.

What are the major RAC wait events?
In a RAC environment the buffer cache is global across all instances in the cluster, and hence the processing differs. The most common wait events related to this are gc cr request and gc buffer busy.

gc cr request: the time it takes to retrieve the data from the remote cache.
Reason: RAC traffic using a slow connection, or inefficient queries. Poorly tuned queries increase the number of data blocks requested by an Oracle session, and the more blocks requested, the more often a block needs to be read from a remote instance via the interconnect.

What components in RAC must reside in shared storage?
All datafiles, control files, SPFILEs and redo log files must reside on cluster-aware shared storage.

What is the significance of using cluster-aware shared storage in an Oracle RAC environment?
All instances of an Oracle RAC database can access all the datafiles, control files, SPFILEs and redo log files when these files are hosted on cluster-aware shared storage, which is a group of shared disks.

Give a few examples of solutions that support cluster storage:
ASM (Automatic Storage Management), raw disk devices, network file system (NFS), OCFS2 and OCFS (Oracle Cluster File System).

What is an interconnect network?
An interconnect network is a private network that connects all of the servers in a cluster. The interconnect network uses one or more switches that only the nodes in the cluster can access.

How can we configure the cluster interconnect?
Configure User Datagram Protocol (UDP) on Gigabit Ethernet for the cluster interconnect. On UNIX and Linux systems, the UDP and RDS (Reliable Datagram Sockets) protocols are used by Oracle Clusterware. Windows clusters use the TCP protocol.

Can we use crossover cables with Oracle Clusterware interconnects?
No, crossover cables are not supported with Oracle Clusterware interconnects.

What is the use of the cluster interconnect?
The cluster interconnect is used by Cache Fusion for inter-instance communication.

How do users connect to the database in an Oracle RAC environment?
Users can access a RAC database using a client/server configuration or through one or more middle tiers, with or without connection pooling. Users can use the Oracle services feature to connect to the database.

What is the use of a service in an Oracle RAC environment?
Applications should use the services feature to connect to the Oracle database. Services enable us to define rules and characteristics that control how users and applications connect to database instances.

What are the characteristics controlled by the Oracle services feature?
The characteristics include a unique name, workload balancing and failover options, and high availability characteristics.


What enables the load balancing of applications in RAC?
Oracle Net Services enables the load balancing of application connections across all of the instances in an Oracle RAC database.

What is a virtual IP address or VIP?
A virtual IP address, or VIP, is an alternate IP address that client connections use instead of the standard public IP address. To configure a VIP address, we need to reserve a spare IP address for each node, and the IP addresses must use the same subnet as the public network.

What is the use of VIP?
If a node fails, the node's VIP address fails over to another node, on which the VIP address can accept TCP connections but cannot accept Oracle connections.

Give situations under which VIP address failover happens:
VIP address failover happens when the node on which the VIP address runs fails, when all interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected from the network.

What is the significance of VIP address failover?
When a VIP address failover happens, clients that attempt to connect to the VIP address receive a rapid connection refused error. They don't have to wait for TCP connection timeout messages.
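The effect of the rapid "connection refused" error can be shown with a plain TCP sketch. This is not how Oracle Net is implemented. It is only an illustration, using ordinary sockets, of why an immediate refusal lets a client move on to the next address without waiting for a timeout. The function name and the injectable connect parameter are assumptions made for the example.

```python
import socket


def connect_first_available(addresses, connect=socket.create_connection, timeout=5.0):
    """Try each (host, port) in order and return (address, connection).

    A failed-over VIP answers with an immediate refusal, so the loop moves
    on to the next address quickly instead of waiting for a TCP timeout.
    """
    errors = []
    for address in addresses:
        try:
            return address, connect(address, timeout)
        except ConnectionRefusedError as exc:
            # Fast failure: the VIP is up on a surviving node, but nothing
            # is listening for Oracle connections there.
            errors.append((address, exc))
        except OSError as exc:
            # Slow failure (for example a timeout) is still treated as
            # "try the next address".
            errors.append((address, exc))
    raise ConnectionError(f"no address accepted a connection: {errors}")
```

Without a VIP, the first attempt would typically end in the slower OSError branch after the full timeout. With a failed-over VIP it ends in the ConnectionRefusedError branch almost immediately.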
