Oracle DB failed to start: ORA-27102: out of memory

Oracle is not exactly my turf, but I ran into this last week. After some patching and a reboot of a RHEL 5.9 server, the Oracle DB failed to start. So I took a peek inside /opt/oracle/product/11.2.0/db_1/startup.log and found this:

ORA-27102: out of memory
Linux-x86_64 Error: 28: No space left on device
SYS@PRD001 > Disconnected

Lovely. Unfortunately, no DBA was around, so I had to do some searching. Thankfully, I came across this page, which pretty much fit my case.

[root@db0001 db_1]# getconf PAGE_SIZE
4096
[root@db0001 db_1]# sysctl -a | grep shmall
kernel.shmall = 5243392

Going by the suggestion on that page, shmall (which is counted in pages, not bytes) for my server with 32GB of memory should be 1024 * 1024 * 1024 * 32 / 4096, which comes to 8388608. The 5243392 pages that sysctl reported cover only about 20GB, which was clearly at odds with that. So…

[root@db0001 db_1]# sysctl kernel.shmall=8388608

After that, database startup was successful. Of course, I also corrected the parameter in /etc/sysctl.conf so that it persists across reboots.
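
The page-count arithmetic and the sysctl.conf update can be sketched as two small shell functions. This is only a sketch under the same assumption the post makes (shmall sized to cover all of physical RAM); the function names shmall_pages and set_sysctl_conf are made up for illustration and are not part of any Oracle or RedHat tooling.

```shell
#!/bin/sh
# Hypothetical helpers; names and behaviour are illustrative assumptions.

# Print the number of pages needed to cover MEM_GIB gibibytes of RAM.
# The page size defaults to what getconf reports on this machine.
shmall_pages() {
    mem_gib=$1
    page_size=${2:-$(getconf PAGE_SIZE)}
    echo $(( mem_gib * 1024 * 1024 * 1024 / page_size ))
}

# Replace (or add) "KEY = VALUE" in a sysctl.conf-style file.
# The dots in KEY are treated as regex wildcards by grep, which is
# good enough for a sketch but not strictly exact.
set_sysctl_conf() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    # Keep every line except existing assignments of KEY.
    grep -v "^[[:space:]]*${key}[[:space:]]*=" "$file" > "$tmp"
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Write back through the original file to keep its owner and mode.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# 32 GiB of RAM with 4096-byte pages, as in the post
shmall_pages 32 4096
```

On the server itself this would be followed, as root, by set_sysctl_conf kernel.shmall "$(shmall_pages 32)" and then sysctl -p to load the file.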

Later I found out whose handiwork the incorrect value was… No, it was not me…

Posted on July 27, 2013 at 08:57 by somedude
In: centos, linux, oracle, redhat

vShield Manager CLI password change

This just seems nonsensical to me. Apparently, you cannot change user passwords via the CLI in vShield Manager 5.1.2; instead, you have to go through the rigmarole of removing and recreating accounts. Specifically, I needed to change the password for the admin account.

Moreover, the CLI admin account is a separate entity from the admin account used with the web interface!

So, first create a temporary admin account and log out:

manager# config t
manager(config)# user tempadmin password plaintext pass1
manager(config)# exit
manager# write mem
Building Configuration...
Configuration saved.
[OK]
manager# exit

Then log back in using the tempadmin account, delete the admin account and re-create it with the desired password:

manager# config t
manager(config)# no user admin
manager(config)# user admin password plaintext pass2
manager(config)# exit
manager# write mem
Building Configuration...
Configuration saved.
[OK]
manager#

…and finally, log out as tempadmin, log back in as admin and remove the tempadmin account:

manager# config t
manager(config)# no user tempadmin
manager(config)# exit
manager# write mem
Building Configuration...
Configuration saved.
[OK]
manager#

More on this is here… And yes, the article recommends removing admin completely.

Posted on June 30, 2013 at 10:31 by somedude
In: virtualization, vmware, vshield

Copying directories with tar

This is one of those “can’t seem to memorize something” posts… Sometimes copying files with cp is just painful. Instead:

[root@sparky ~]# mkdir /tmp/newopt && mount /dev/mapper/system-opt /tmp/newopt
[root@sparky ~]# ( cd /opt && tar --xattrs -cf - . ) | ( cd /tmp/newopt && tar xvfp - )

This will copy files and directories from /opt to /tmp/newopt, including extended attributes such as ACLs and SELinux contexts. And it’s faster than using cp as well…
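
The same pipeline can be wrapped in a small function for reuse. This is a minimal sketch: copy_tree is a hypothetical name, and --xattrs is left out so that the sketch also runs with tar builds that lack the option; add it back where your tar supports it.

```shell
#!/bin/sh
# Hypothetical wrapper around the tar pipeline; the name is an assumption.
# Copies the contents of SRC_DIR into DST_DIR, preserving permissions (-p).
copy_tree() {
    src=$1
    dst=$2
    # Both directories must already exist, as in the mkdir/mount step.
    [ -d "$src" ] && [ -d "$dst" ] || {
        echo "usage: copy_tree SRC_DIR DST_DIR" >&2
        return 1
    }
    # Subshells keep the cd from leaking into the calling shell.
    # The exit status is that of the extracting tar.
    ( cd "$src" && tar -cf - . ) | ( cd "$dst" && tar -xpf - )
}
```

For the case above, the call would be copy_tree /opt /tmp/newopt.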

Posted on May 27, 2013 at 09:48 by somedude
In: linux, linux tips, shell, solaris, solaris tips

Dynamically discovering disk volume in RedHat

This is one of those “note to myself” posts. Discovering a new volume on a Linux server is a little cruder than it is on Solaris.

Obviously, you first have to provision a LUN on the storage. I have never tried it, but I suppose you could use the /usr/bin/rescan-scsi-bus.sh script from the sg3_utils package.

Anyway, you then need to figure out which HBA the new disk should be accessible through and, if that HBA has multiple channels, which channel.

To take a concrete example, this server had one single-channel HBA with an existing disk, and a new disk zoned for that HBA. This was a RedHat 5.9 server.

[root@db0001 root]# echo "0 1 0" > /sys/class/scsi_host/host0/scan
[root@db0001 root]#

The above command initiated a scan of the first HBA (host0), channel 0, target 1, LUN 0 for a new disk. Since the scan was successful, the following appeared in syslog:

[root@db0001 root]# tail /var/log/messages
Apr 16 11:46:23 db0001 kernel: sdb: Write Protect is off
Apr 16 11:46:23 db0001 kernel: sdb: cache data unavailable
Apr 16 11:46:23 db0001 kernel: sdb: assuming drive cache: write through
Apr 16 11:46:23 db0001 kernel: SCSI device sdb: 10485760 512-byte hdwr sectors (5369 MB)
Apr 16 11:46:23 db0001 kernel: sdb: Write Protect is off
Apr 16 11:46:23 db0001 kernel: sdb: cache data unavailable
Apr 16 11:46:23 db0001 kernel: sdb: assuming drive cache: write through
Apr 16 11:46:23 db0001 kernel:  sdb: unknown partition table
Apr 16 11:46:23 db0001 kernel: sd 0:0:1:0: Attached scsi disk sdb
Apr 16 11:46:23 db0001 kernel: sd 0:0:1:0: Attached scsi generic sg1 type 0

In the above example, if you had a dual-channel HBA (dual-port FC HBA) and no existing disk attached to the second channel, you would do the following to see a new disk on the second channel:

[root@db0001 root]# echo "1 0 0" > /sys/class/scsi_host/host0/scan
[root@db0001 root]#

Similarly, if you had two single-channel HBAs, with the new disk zoned for the second HBA, which already had one disk (disk 0) attached to it, you would do:

[root@db0001 root]# echo "0 1 0" > /sys/class/scsi_host/host1/scan
[root@db0001 root]#

The whole x:x:x:x nomenclature is a little confusing, with some people using bus to refer to a channel, etc. You can think of it as adapter:port:target:lun.
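
All three scans above follow the same pattern: the adapter picks the hostN directory, and the remaining three numbers are written to its scan file. A minimal sketch of that as a function follows; scsi_scan is a hypothetical name, and the optional fifth argument (an alternative sysfs root) is an assumption added only so the function can be tried against a scratch directory instead of real hardware.

```shell
#!/bin/sh
# Hypothetical helper; the name and the sysfs-root argument are assumptions.
# Usage: scsi_scan ADAPTER CHANNEL TARGET LUN [SYSFS_ROOT]
# A "-" in place of CHANNEL, TARGET or LUN is a wildcard, so
# "scsi_scan 0 - - -" asks the kernel to scan everything on host0.
scsi_scan() {
    host=$1
    channel=$2
    target=$3
    lun=$4
    sysfs=${5:-/sys}
    scan_file="$sysfs/class/scsi_host/host$host/scan"
    # Refuse to continue if the adapter does not exist or is not writable.
    [ -w "$scan_file" ] || {
        echo "cannot write $scan_file" >&2
        return 1
    }
    printf '%s %s %s\n' "$channel" "$target" "$lun" > "$scan_file"
}
```

The first example in this post would then be scsi_scan 0 0 1 0, run as root.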

Posted on April 28, 2013 at 16:18 by somedude
In: centos, fibre channel, linux, linux tips, redhat, san, storage

ext3 detected aborted journal

This was an interesting exercise in the sense that I had never had a similar filesystem issue before. For a reason that is still unknown, one of the RHEL 5.9 ESXi guests decided to remount /var read-only. A look at the output from dmesg confirmed that ext3 was having some sort of ordeal:

EXT3-fs error (device dm-3): ext3_free_blocks_sb: bit already cleared for block 364592
Aborting journal on device dm-3.
ext3_abort called.
EXT3-fs error (device dm-3): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
EXT3-fs error (device dm-3) in ext3_free_blocks_sb: Journal has aborted
EXT3-fs error (device dm-3) in ext3_free_blocks_sb: Journal has aborted
EXT3-fs error (device dm-3) in ext3_free_blocks_sb: Journal has aborted
EXT3-fs error (device dm-3) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device dm-3) in ext3_truncate: Journal has aborted
EXT3-fs error (device dm-3) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device dm-3) in ext3_orphan_del: Journal has aborted
EXT3-fs error (device dm-3) in ext3_reserve_inode_write: Journal has aborted
__journal_remove_journal_head: freeing b_committed_data
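
A quick way to see which filesystems the kernel has flipped to read-only is to check the mount options in /proc/mounts. A minimal sketch follows; ro_mounts is a hypothetical name, and the optional file argument is an assumption that exists only so the function can be run against a saved copy of the mount table.

```shell
#!/bin/sh
# Hypothetical helper; the name and the optional argument are assumptions.
# Prints the mount point of every filesystem mounted with the "ro" option.
# Fields in /proc/mounts: device, mount point, type, options, dump, pass.
ro_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            # Match the exact option only, not e.g. errors=remount-ro.
            if (opts[i] == "ro") { print $2; break }
    }' "${1:-/proc/mounts}"
}
```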

I figured mount -o remount /var would do the job, as usual. But no:

[root@vm-prd-039 ~]# mount -o remount /var
mount: block device /dev/system/var is write-protected, mounting read-only

Interesting. So I decided to take a look at the volume groups and logical volumes:

[root@vm-prd-039 ~]# vgs
  File-based locking initialisation failed.
[root@vm-prd-039 ~]# lvs
  File-based locking initialisation failed.

Fine, /var was read-only and LVM keeps its lock files there, so LVM could not write them. One more shot:

[root@vm-prd-039 ~]# vgs --ignorelockingfailures
  VG     #PV #LV #SN Attr   VSize   VFree
  system   1   5   0 wz--n-  24.47G     0

That looked OK. I did a quick Google search, and it looked like I was going to have to drop the filesystem journal, check the filesystem and then create a new journal, as follows:

umount /dev/mapper/system-var
fsck -y /dev/mapper/system-var
tune2fs -O ^has_journal /dev/mapper/system-var
fsck -f -y /dev/mapper/system-var
tune2fs -j /dev/mapper/system-var

So I dropped the server into single-user mode, killed the processes that had open files on /var, unmounted the filesystem and just ran fsck to fix it up. That’s it.

Posted on March 30, 2013 at 09:41 by somedude
In: centos, ext3, linux, linux tips, lvm2, redhat, storage