| 02/07/2003 |
mappingDBCA.cfg
Disk Layout
dbca pic 1
|
|
Created the Cluster!!
------------------------
a) Preliminary Steps:
---------------------
0 ) => Ensure SRVM_SHARED_CONFIG is set to the correct raw device on both nodes (/dev/raw/raw2 in this case)
1 ) => Start the Watchdog Timer and the Cluster Manager Software on both nodes
2 ) => Start the listeners on both nodes
3 ) => Start the Global Services Daemon
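Step 0 can be checked with a few lines of shell before going any further. This is only a sketch: the function name check_srvm_device is made up for illustration, and it assumes nothing beyond the SRVM_SHARED_CONFIG variable named above and the fact that raw devices are character devices.

```shell
# check_srvm_device (hypothetical helper, not part of any Oracle tool)
# Confirms SRVM_SHARED_CONFIG is set and points at a character device.
check_srvm_device() {
    dev="${SRVM_SHARED_CONFIG:-}"
    if [ -z "$dev" ]; then
        echo "SRVM_SHARED_CONFIG is not set"
        return 1
    fi
    if [ ! -c "$dev" ]; then
        echo "SRVM_SHARED_CONFIG=$dev is not a character device"
        return 1
    fi
    echo "SRVM_SHARED_CONFIG=$dev looks ok"
}
```

Run it as the oracle user on each node; on this setup it should report /dev/raw/raw2.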
b) DBCA Oracle Installer:
-------------------------
4 ) => Ensure DBCA_RAW_CONFIG points to your mappingDBCA.cfg file
5 ) => Start the Database Configuration Assistant
6 ) => Choose Create a Database
7 ) => Cluster Configuration
8 ) => Select both nodes
9 ) => Most of the file specifications were correctly picked up from the DBCA_RAW_CONFIG file
The ones that did not get correctly picked up were the tablespaces that I had added
such as USER_DATA and USER_NDX.
10) => Go through the installation process checking that the raw devices are set correctly.
11) => Save the scripts, Save the Configuration as a template, Do NOT check Create Database as yet!
For some reason the installer had grief with the redo logs and gave an error saying that they
were set to zero. The next step (I do not know why...) bypassed the problem.
12) => At this point the dbca exits since 'Create Database' was not checked
13) => Restart the dbca
14) => Choose Create Database, and use your newly created template, double check the settings
15) => At the very end, check only 'Create Database'
16) => And go!
17) => About half an hour later my RAC was built and running!!
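Since the hand-added tablespaces were the ones that were not picked up, it may help to sanity-check the mapping file before starting dbca. The sketch below assumes the file holds one name=device pair per line, and the function name check_mapping is hypothetical. It flags entries that do not point under /dev/raw/ and raw devices that are listed twice.

```shell
# check_mapping FILE (hypothetical helper)
# Assumes one "name=device" pair per line in the DBCA_RAW_CONFIG file.
check_mapping() {
    awk -F= '
        /^[ \t]*#/ || /^[ \t]*$/ { next }     # skip comments and blank lines
        {
            name = $1; dev = $2
            if (dev !~ /^\/dev\/raw\//)
                printf "%s: %s is not under /dev/raw/\n", name, dev
            if (dev in seen)
                printf "%s: %s already used by %s\n", name, dev, seen[dev]
            else
                seen[dev] = name
        }
    ' "$1"
}
```

Usage: check_mapping "$DBCA_RAW_CONFIG". No output means nothing suspicious was found.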
|
| 02/06/2003 |
a) LVM Gotchas to look out for in a shared storage configuration:
-----------------------------------------------------------------
Problem:
--------
In my configuration BOX2 already had two Volume Groups, VGSYSTEM and VGINTERNAL,
before VGRAC was added.
BOX1 had no Volume Groups on it.
I created the additional Logical Volumes needed for the cluster database with lvcreate commands on BOX1.
This created a problem, as lvcreate on BOX1 handed out the next available minor device numbers on BOX1,
and those numbers were already in use on BOX2.
When BOX2 attempted a vgscan it hit the conflict: the VGRAC minor numbers stepped on the ones used by VGSYSTEM.
As a result BOX2 could not mount its root file systems!!
Solution:
---------
I dropped the VGRAC volume group and rebuilt it from scratch, from BOX2!
This is important, since BOX2's minor device number sequence will be ahead of BOX1's. BOX1
would have no problems with this when a vgscan was done.
The rule then is to build the Shared Volume Group (VGRAC) from the machine with the most logical volumes.
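The clash described above can be spotted before it bites by comparing the minor numbers in use on each box. This is a sketch only: find_minor_clashes is a made-up name, and it assumes you have already collected a list of "minor-number lv-path" pairs on each machine by whatever means your LVM version offers.

```shell
# find_minor_clashes FILE_A FILE_B (hypothetical helper)
# Each file holds "minor-number lv-path" pairs collected on one box.
# Prints every minor number that the two boxes assign to different LVs.
find_minor_clashes() {
    awk '
        NR == FNR { owner[$1] = $2; next }     # first file: remember owners
        ($1 in owner) && owner[$1] != $2 {
            printf "minor %s: %s vs %s\n", $1, owner[$1], $2
        }
    ' "$1" "$2"
}
```

Any line of output means the shared volume group should be rebuilt from the box whose minor numbers run higher.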
|
| 02/05/2003 |
mappingDBCA.cfg
listener.ora
tnsnames.ora
|
|
a) Listener Pre Configuration
-----------------------------
- The listeners on box1 and box2 were both configured, have a look at the
sample configuration in the listener.ora file for box2 above.
- The tnsnames.ora file also shows the configuration for that file.
|
| 02/04/2003 |
createdb_lvols.sh
removedb_lvols.sh
rac-mkraw
|
|
a) Additional Logical Volumes Scripts
-------------------------------------
The two scripts createdb_lvols.sh and removedb_lvols.sh can each be run once, to add or to remove
the needed logical volumes respectively.
b) Raw devices
--------------
Also look at the revised rac-mkraw script which has had devices added to support the
added logical volumes.
Under SuSE 8.1 the additional raw devices, from 16 upwards, had to be created by hand. The following commands were used:
cd /dev/raw
mknod raw16 c 162 16
162 is the major device number for a raw device character file; the minor number in this case is 16. Increment
the minor number as needed.
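For more than a couple of devices the mknod lines can be generated rather than typed. A minimal sketch, assuming the same major number 162 and the /dev/raw/rawN naming used above; gen_raw_nodes is a hypothetical name, and it only prints the commands so they can be reviewed first.

```shell
# gen_raw_nodes FIRST LAST (hypothetical helper)
# Prints, but does not run, one mknod command per raw device node.
gen_raw_nodes() {
    i=$1
    while [ "$i" -le "$2" ]; do
        echo "mknod /dev/raw/raw$i c 162 $i"   # minor number matches the name
        i=$((i + 1))
    done
}
```

Once the output looks right, pipe it to sh as root, for example: gen_raw_nodes 16 20 | sh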
|
| 02/03/2003 |
Disk Layout
vgdisplay -v vgrac
|
|
a) Proposed Disk Layout
-----------------------
I created logical volumes on the vgrac volume group to fulfill the needs of the various aspects of the database
such as:
- Data Tablespaces
- Index Tablespaces
- Undo Tablespaces
- Redo Logs
- System Tablespace
- spfile
Look at the "Disk Layout" link above for details.
b) vgdisplay -v vgrac
---------------------
Look at the link above for a listing of "vgdisplay -v vgrac"
|
| 02/02/2003 |
rac-core
rdevtest.conf
|
|
Couple of things done:
----------------------
a) rac-core script
------------------
Added /etc/init.d/rac-core
This is a script that allows me to start,stop,restart and check the status of:
- Watchdog Daemon Timer
- Oracle Cluster Manager
- Global Services Daemon
b) Files, Permissions, Symbolic Links, etc
------------------------------------------
- Make sure the raw files /dev/raw/raw? are owned by oracle, as the Global Services Daemon needs read/write access to them
- Created a symbolic link from /var/opt/oracle/srvConfig.loc to /u01/oracle/OraHome1/srvm/config/srvConfig.loc on all nodes
c) Start the Global Services Daemon:
------------------------------------
- A preliminary step done from box1 for first-time use was: srvconfig -init
- gsdctl start
d) To verify that the node raw devices are working a test file as follows was created:
--------------------------------------------------------------------------------------
cat $ORACLE_HOME/bin/rdevtest.conf
intbox1:/dev/raw/raw1
intbox2:/dev/raw/raw1
intbox1:/dev/raw/raw2
intbox2:/dev/raw/raw2
intbox1:/dev/raw/raw3
intbox2:/dev/raw/raw3
Then I ran $ORACLE_HOME/oracm/bin/rdevtest rdevtest.conf: (on both nodes after starting the GSD Daemon)
oracle@box1:/u01/oracle/OraHome1/oracm/bin> ./rdevtest rdevtest.conf
Ok
oracle@box2:/u01/oracle/OraHome1/oracm/bin> ./rdevtest rdevtest.conf
Ok
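As more raw devices get added, the test file can be generated instead of edited by hand. This sketch simply reproduces the node:device layout shown above; the function name gen_rdevtest_conf is hypothetical.

```shell
# gen_rdevtest_conf FIRST LAST NODE... (hypothetical helper)
# Prints one "node:/dev/raw/rawN" line per node for each raw device number.
gen_rdevtest_conf() {
    first=$1; last=$2; shift 2
    n=$first
    while [ "$n" -le "$last" ]; do
        for node in "$@"; do
            echo "$node:/dev/raw/raw$n"
        done
        n=$((n + 1))
    done
}
```

For the file above: gen_rdevtest_conf 1 3 intbox1 intbox2 > rdevtest.conf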
e) You can use the rac-core script as follows to test:
------------------------------------------------------
box1:/etc/init.d/rac-core status
Watchdog Daemon is running
Cluster Manager is running
Global Services Daemon is running
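For readers who want to write their own status check, here is one way it could be done. This is not the rac-core script itself, just a sketch: it reads a process listing on standard input and looks for a pattern per component. The patterns, in particular "gsd" for the Global Services Daemon, are assumptions and may need adjusting to what ps shows on your system.

```shell
# component_status LABEL PATTERN LISTING (hypothetical helper)
component_status() {
    if printf '%s\n' "$3" | grep -q "$2"; then
        echo "$1 is running"
    else
        echo "$1 is NOT running"
    fi
}

# rac_status: reads a process listing on stdin, reports the three components.
# The grep patterns below are assumptions about the process names.
rac_status() {
    listing=$(cat)
    component_status "Watchdog Daemon" "watchdogd" "$listing"
    component_status "Cluster Manager" "oracm" "$listing"
    component_status "Global Services Daemon" "gsd" "$listing"
}
```

Usage: ps -e -o args | rac_status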
|
| 02/01/2003 |
Back to work, I apologize for the delay, we have a new baby boy!
I will get started on the cluster again.
|
| 11/27/2002 |
Backup Scripts
Link
|
|
I have been very busy with some personal activities and probably will be for another two to three weeks.
For those who are following the log I apologise for the delay.
I added a script, pretty simple stuff, to back up various directories and files of importance. Just grab them above or visit the link.
|
| 11/21/2002 |
ocmargs.ora
Oracle Architecture Notes
|
|
Starting Oracle Cluster Manager:
--------------------------------
i) Edit $ORACLE_HOME/oracm/admin/ocmargs.ora as follows:
watchdogd -g dba -l 0 -d /dev/null
oranm
oracm /a:0
norestart 1800
The watchdog arguments "-l 0 -d /dev/null" are for testing only. When done testing,
remove them so the watchdog line is as follows:
watchdogd -g dba
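The switch from the testing line to the final line can be done with a small sed filter. A sketch, assuming the file looks exactly like the listing above; strip_watchdog_test_args is a hypothetical name.

```shell
# strip_watchdog_test_args (hypothetical helper)
# Filter: removes the testing-only "-l 0 -d /dev/null" from the watchdogd
# line and passes every other line through unchanged.
strip_watchdog_test_args() {
    sed '/^watchdogd /s| -l 0 -d /dev/null||'
}
```

Usage: strip_watchdog_test_args < ocmargs.ora > ocmargs.ora.new, then compare the two files before replacing the original.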
ii) Start Oracle Cluster Manager and watchdogd (as root):
su - oracle
cd $ORACLE_HOME/oracm/bin
su root
./ocmstart.sh
Here's what the processes were after I executed the script:
==================================================================================
#ps -ef|egrep "watchdog|oracm"
root 5052 1 0 14:06 ? 00:00:00 watchdogd -g dba -l 0 -d /dev/null
root 5053 1 0 14:06 pts/1 00:00:00 oracm /a:0
root 5055 5053 0 14:06 pts/1 00:00:00 oracm /a:0
root 5056 5055 0 14:06 pts/1 00:00:00 oracm /a:0
root 5057 5055 0 14:06 pts/1 00:00:00 oracm /a:0
root 5058 5055 0 14:06 pts/1 00:00:00 oracm /a:0
root 5059 5055 0 14:06 pts/1 00:00:00 oracm /a:0
root 5060 5055 0 14:06 pts/1 00:00:00 oracm /a:0
root 5061 5055 0 14:06 pts/1 00:00:00 oracm /a:0
root 5062 5055 0 14:06 pts/1 00:00:00 oracm /a:0
root 5070 4868 0 14:09 pts/1 00:00:00 egrep watchdog|oracm
==================================================================================
Installing Oracle 9iR2 RAC option
---------------------------------
After starting OCM I started to look for the srvctl and gsd commands, to no avail...
I then decided to re-run the installer and two things happened:
(bear in mind that the OCM was started as laid out above)
a) I was presented with a dialog for the cluster node selection and both box1 and box2 were
available within it!
b) From the Available Products screen I selected "Oracle 9i Database 9.2.0.1" [Next]
then selected "custom" from the Installation Types screen [Next]
Two options were automatically highlighted under the Available Product Components screen:
1) Oracle 9i Real Application Clusters 9.2.0.1 (New Install)
2) Legato Networker Single Server 6.1.0.0.0 (New Install)
I checked the 9i RAC option for install only
The reason the RAC software was unavailable the first time I did the install was
because the Cluster Manager was not started, so a cluster was not detected.
When installing the software on box2 I disabled the OCM software and the RAC option was unavailable.
After starting the OCM software ($ORACLE_HOME/oracm/bin/ocmstart.sh) the RAC option was available and the
cluster nodes were presented.
The software prompts to run $ORACLE_HOME/root.sh
As root issue the following commands before running the script:
mkdir -p /var/opt/oracle
touch /var/opt/oracle/srvConfig.loc
then run $ORACLE_HOME/root.sh
this places the following entry in the file
srvconfig_loc=/dev/raw/raw2
which corresponds to the Server Config logical volume /dev/vgrac/svrcfglv
srvctl and gsd (RAC specific commands) were now available for use.
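After root.sh has run it is worth confirming that the entry really landed in the file. A minimal sketch that pulls the device out of a srvConfig.loc file; the function name srvconfig_device is hypothetical, and it assumes the srvconfig_loc=... line format shown above.

```shell
# srvconfig_device FILE (hypothetical helper)
# Prints the device named by the srvconfig_loc entry; returns non-zero
# if the file has no such entry.
srvconfig_device() {
    dev=$(sed -n 's/^srvconfig_loc=//p' "$1" | head -n 1)
    [ -n "$dev" ] && echo "$dev"
}
```

Usage: srvconfig_device /var/opt/oracle/srvConfig.loc, which should print /dev/raw/raw2 on this setup and match SRVM_SHARED_CONFIG.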
Oracle Architecture Notes
-------------------------
Follow the Oracle Architecture Notes above. Several of the OCM parameters have been made obsolete
from 9iR1 to 9iR2. Also the oranm (Node Monitor) process is obsolete, and nmcfg.ora is not needed.
rcp rlogin rsh
--------------
The IBM paper recommends that the above be configured between box1 and box2, as some of the Oracle configuration scripts
use these commands to copy config files between the machines. I am leery of this as it is a security hole. A colleague
has suggested that I symlink rcp to scp... a good suggestion that I will look into later. For now I'm following the
recommendation.
More tomorrow!!!
|
| 11/20/2002 |
Changes to /etc/lilo.conf
------------------------
i) Edit your lilo.conf and include an entry for:
append = "CONFIG_WATCHDOG_NOWAYOUT=Y"
ii) Run /sbin/lilo -v
iii) reboot
|
| 11/18/2002 |
Oracle Environment |
|
Installed Oracle's Cluster Manager Software for Linux. Under 9iR2 this is a separate option you choose at install time.
Recall that /etc/init.d/rac-mkraw maps the logical volumes in vgrac to raw devices as follows:
raw /dev/raw/raw1 /dev/vgrac/cmlv
raw /dev/raw/raw2 /dev/vgrac/svrcfglv
raw /dev/raw/raw3 /dev/vgrac/quorumlv
At least one shared raw device has to be created as an information repository for the database server configuration.
In this case /dev/vgrac/svrcfglv is used which maps to /dev/raw/raw2
This is known as the Server Management (SRVM) Configuration device
Note the environment variable (see the link above) SRVM_SHARED_CONFIG is set to this raw device
Installation Steps:
-------------------
1) Start installer and select "Oracle Cluster Manager 9.2.0.1.0" [ Next ]
2) The first dialog prompts for "Public Node Information", specify:
Public Node 1: box1
Public Node 2: box2 [ Next ]
3) The next dialog prompts for "Private Node Information", specify:
Private Node 1: intbox1
Private Node 2: intbox2 [ Next ]
4) The next dialog prompts for "WatchDog Parameter Information"
Accept the default of 60000 milliseconds (60 seconds) [ Next ]
5) You are now prompted for the quorum disk information.
Specify /dev/raw/raw3 which maps to /dev/vgrac/quorumlv [ Next ]
6) [ INSTALL ]
Wait for the install to complete and repeat the process on box2... tomorrow I'll try to test this...
|
| 11/17/2002 |
rac-synclvm
rac-mkraw
rac-kernel
|
|
Several scripts were added, please customize them to suit your environment...
Added files to allow:
i) Synchronization of the VGRAC volume group for shared storage on boot
The commands are:
vgchange -a n vgrac
vgscan
vgchange -a y vgrac
vgdisplay -v vgrac
These:
- deactivate the vgrac group
- rebuild the /etc/lvmtab
- reactivate the vgrac group
- display information on the vgrac group
This must be done at each boot so I created a file /etc/init.d/rac-synclvm and sym linked it to run level 5 as follows:
ln -sf /etc/init.d/rac-synclvm /etc/init.d/rc5.d/S25rac-synclvm
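The body of such a boot script could look roughly like the following. This is not the actual rac-synclvm file, only a sketch of the four commands above wrapped in a function. The run wrapper and the DRY_RUN variable are additions made for illustration so the sequence can be printed without touching the volume group.

```shell
# run CMD ARGS... (hypothetical wrapper)
# With DRY_RUN=1 the command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# sync_vg VGNAME (hypothetical helper): the four steps listed above.
sync_vg() {
    run vgchange -a n "$1"     # deactivate the volume group
    run vgscan                 # rebuild /etc/lvmtab
    run vgchange -a y "$1"     # reactivate the volume group
    run vgdisplay -v "$1"      # display the result
}
```

An init script would call sync_vg vgrac from its start branch. Setting DRY_RUN=1 first shows what would be run.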
ii) Creation of the raw devices - initialize the raw devices on boot
The commands (at this point) are:
raw /dev/raw/raw1 /dev/vgrac/cmlv
raw /dev/raw/raw2 /dev/vgrac/svrcfglv
raw /dev/raw/raw3 /dev/vgrac/quorumlv
The IBM pdf in the main page of the site lists the devices as /dev/raw1 /dev/raw2 /dev/raw3, but from SuSE 8.0 up raw devices
are grouped under /dev/raw/
Note that this script will be extended when the database is built later... so check it often for changes. I will note
in the log when it changes.
This must be done at each boot so I created a file /etc/init.d/rac-mkraw and sym linked it to run level 5 as follows:
ln -sf /etc/init.d/rac-mkraw /etc/init.d/rc5.d/S26rac-mkraw
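Because the list of bindings is going to grow, keeping it in a small table makes the script easier to extend. The following is a sketch of that idea and not the real rac-mkraw: the RAW_BINDINGS variable and the print_raw_bindings function are hypothetical names, and only the three bindings listed above are included.

```shell
# Table of raw device name -> logical volume name (the three listed above).
RAW_BINDINGS="
raw1 cmlv
raw2 svrcfglv
raw3 quorumlv
"

# print_raw_bindings VG (hypothetical helper)
# Prints, but does not run, one raw command per table entry.
print_raw_bindings() {
    echo "$RAW_BINDINGS" | while read -r rawdev lv; do
        [ -n "$rawdev" ] || continue           # skip blank lines in the table
        echo "raw /dev/raw/$rawdev /dev/$1/$lv"
    done
}
```

To apply the bindings, pipe the output to sh as root: print_raw_bindings vgrac | sh. New logical volumes then only need a new line in the table.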
iii) Setting kernel parameters for semaphores and shared memory segments
To set Semaphore Parameters the file /proc/sys/kernel/sem is modified as follows:
echo "Setting SEMMSL SEMMNS SEMOPM SEMMNI in /proc/sys/kernel/sem"
echo 250 256000 100 1024 > /proc/sys/kernel/sem
This must be done at each boot so I created a file /etc/init.d/rac-kernel and sym linked it to run level 5 as follows:
ln -sf /etc/init.d/rac-kernel /etc/init.d/rc5.d/S27rac-kernel
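The four numbers are easy to mix up, so a tiny helper that labels them can be handy when checking the result. A sketch only; describe_sem is a hypothetical name. The field order SEMMSL SEMMNS SEMOPM SEMMNI is the order of the values in /proc/sys/kernel/sem.

```shell
# describe_sem "SEMMSL SEMMNS SEMOPM SEMMNI" (hypothetical helper)
# Labels the four whitespace-separated fields of /proc/sys/kernel/sem.
describe_sem() {
    set -- $1                  # split the single argument into four fields
    echo "SEMMSL=$1 SEMMNS=$2 SEMOPM=$3 SEMMNI=$4"
}
```

Usage after the echo above has run: describe_sem "$(cat /proc/sys/kernel/sem)"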
I wish the day had more hours in it..... Still reading on RAC setup..
|
| 11/17/2002 |
i) Added a FAQ section - getting emailed the same questions
ii) Still reading on RAC setup..
|
| 11/16/2002 |
Volume Group Log
Box1 Adaptec Utility Screen
Box2 Adaptec Utility Screen
Equipment pic 1
Equipment pic 2
Equipment pic 3
|
|
It has been a very busy night, but it was all worth it. All the hardware configuration is done, tested and works!!
1) The Shared SCSI disk sub-system:
a) I actually ended up using different SCSI IDs than in my diagram...
Box1 SCSI ID Layout:
--------------------
0 - The SCSI Adapter itself
2 - An internal SCSI Quantum Atlas IV, this is box1's only drive with the OS on it
Box2 SCSI ID Layout:
--------------------
15 - The SCSI Adapter itself
Shared SCSI Storage Array
-------------------------
4 - QUANTUM XP34550W
7 - SEAGATE ST34371W
I opened up the Adaptec Utility on both boxes. Notice that they see all the drives along the chain, except
the SCSI cards in the opposite machines... see the links at the bottom for actual screen shots
The SCSI Storage Array had to be powered up first. I tested with first box1 and shared storage on alone,
then box2 and shared storage alone, then finally with box1, box2 and shared storage, it worked great.
b) LVM Setup: Use box1 or box2 to do this, I used box2 then informed box1 of the changes using vgscan.
On box2 the disks are seen as /dev/sda and /dev/sdb , use whatever your environment has.
i) Using fdisk partition the disks to use the whole disk with a single partition of type 8e (LVM).
ii) Initialise the disks for use:
pvcreate /dev/sda1
pvcreate /dev/sdb1
iii) Create the volume group VGRAC for use:
vgcreate vgrac /dev/sda1 /dev/sdb1
iv) Create two logical volumes: (May need more later as I read on, can always remove these, read the LVM pdf)
lvcreate -i1 -L 100M -n cmlv vgrac
lvcreate -i1 -L 100M -n svrcfglv vgrac
v) Verify:
vgdisplay -v vgrac
vi) Notify box1 of the new volume group:
1) Inform box1 of the new volume group and activate it
vgscan
vgchange -ay vgrac
2) Verify on box1 by issuing:
vgdisplay -v vgrac
Look at the vglog.txt link for the actual output...
Notice on box1 the drives are /dev/sdb and /dev/sdc, but this does not affect things as the VG information
is stored on the disk, so the vgscan builds the vgroup in /etc/lvmtab correctly... nice.
2) The Interconnect: NetGear FA310TX and LinkSys 10/100 NICs, plus a 10/100 crossover patch cable
a) I found a NetGear FA310 NIC and a LinkSys NIC for the interconnect. I used YaST2 to configure these.
I then used a crossover cable between the two. This is fast... later if this is not fast enough I
may consider bumping the interconnect up to Gigabit Cards.. (have to price it first)
b) Setup was simple....
i) Used YaST2 to add and configure the cards with IP addresses 1.1.1.1 & 1.1.1.2 on box1 and box2 respectively,
subnet mask 255.255.255.0 I didn't have to do anything special.. that was it.
ii) Edited /etc/hosts on both machines to include the following:
1.1.1.1 intbox1
1.1.1.2 intbox2
iii) I was able to ping and ssh both ways between the boxes
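The /etc/hosts edit can be scripted so that running it twice does no harm. A minimal sketch: add_host_entry is a hypothetical name, and it takes the file as an argument so it can be tried on a copy first.

```shell
# add_host_entry FILE IP NAME (hypothetical helper)
# Appends "IP NAME" to FILE unless NAME already appears as a hostname there.
add_host_entry() {
    if awk -v n="$3" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                      END { exit !found }' "$1"; then
        return 0                               # already present, nothing to do
    fi
    echo "$2 $3" >> "$1"
}
```

Usage on both machines, as root: add_host_entry /etc/hosts 1.1.1.1 intbox1 and add_host_entry /etc/hosts 1.1.1.2 intbox2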
Tomorrow it's onto RAC!! Lots of reading in store for me....
|
| 11/14/2002 |
Adaptec 2940UW Flash BIOS Version 2.2 |
|
1) Built the shared SCSI tower tonight...
2) The Adaptec Cards I have had older versions of the BIOS on them...
The newer BIOS for Adaptec's AHA-2940UW may be obtained from the link above
3) I ran a test with the SCSI disk sub system connected to Box2 only, LVM works great
I also liked that SuSE does a vgscan on boot so the lvmtab is rebuilt automatically
This allows the drives to be moved around without worry about their SCSI id's changing.. wonderful!
4) Tomorrow it's onto the dual SCSI connect...
|
| 11/12/2002 |
Starting to build hardware setup....
During Setup of SuSE 8.1 ... problems with Adaptec 2940 Cards and aic7xxx.o driver
SuSE 8.1 has a problematic Adaptec driver included
i) Edited /etc/lilo.conf and added: append="acpi=off"
ii) Ran /sbin/lilo -v
|