
Using CVS with SMF

Most services in Solaris 10 are controlled by SMF, which uses XML files (manifests) to define the services it manages. I needed to quickly create a service manifest for CVS. The inetconv command takes an input file in inetd.conf format, converts it into a basic SMF manifest, and imports it into the SMF repository. For CVS, I created a cvs_inetd file with the following content:

cvspserver stream tcp nowait root /export/apps/cvs -f --allow-root=/export/cvs_repos/primary --allow-root=/export/cvs_repos/secondary pserver

Then I converted and imported the file using the following:

root@ultra# inetconv -f -i ./cvs_inetd
cvspserver -> /var/svc/manifest/network/cvspserver-tcp.xml
Importing cvspserver-tcp.xml ...Done
root@ultra#

The resulting CVS manifest was saved in /var/svc/manifest/network. Later, if needed, you can view and modify the service properties using the svccfg and svcprop commands. The -f switch above causes an existing CVS manifest in /var/svc/manifest/network to be overwritten.

Also, make sure cvspserver is defined in /etc/services.
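
The /etc/services check is easy to script. Here is a minimal sketch, assuming a POSIX shell and an awk that understands -v (on Solaris 10 that would be nawk or /usr/xpg4/bin/awk). The helper name has_tcp_service is made up for this example and is not a Solaris command.

```shell
#!/bin/sh
# has_tcp_service NAME [FILE]
# Hypothetical helper: succeeds if NAME has a "port/tcp" entry in FILE
# (default /etc/services). Commented-out entries do not count, because
# their first field starts with "#".
has_tcp_service() {
    name=$1
    file=${2:-/etc/services}
    awk -v n="$name" '
        $1 == n && $2 ~ /^[0-9]+\/tcp$/ { found = 1 }
        END { exit !found }
    ' "$file"
}
```

For example, has_tcp_service cvspserver || echo "cvspserver is missing from /etc/services" prints a warning when the entry is absent. The conventional pserver entry is cvspserver 2401/tcp.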

Increasing number of NFS servers on Sun Cluster

By default, Solaris 10 starts 16 NFS servers to handle NFS requests. You can tune this by editing the /etc/default/nfs file.
<—————–SNIP—————->
# Maximum number of concurrent NFS requests.
# Equivalent to last numeric argument on nfsd command line.
NFSD_SERVERS=16

<—————–SNIP—————->
Changing the variable above did not seem to have any effect on how many NFS servers Sun Cluster started. Poking around, I found the nfs_start_daemons script, which is part of the SUNWscnfs package; in my case it was in the /opt/SUNWscnfs/bin directory. It turns out that this script looks at the pre-Solaris 10 nfs.server init script to determine whether more than 16 NFS servers are supposed to be started. In Solaris 10 the NFS server, like most services, is handled by SMF. The /etc/init.d/nfs.server script is still present, probably for legacy reasons, but it simply calls the svcadm command to start NFS, so presumably there is no nfsd command line for the script to find and it falls back to its default. Here is the relevant section of the nfs_start_daemons script:
<—————–SNIP—————->
DEFAULT_NFSDCMD="/usr/lib/nfs/nfsd -a 16"
if [ -f /etc/init.d/nfs.server ]; then
NFSDCMD="`egrep '^[^#]*/usr/lib/nfs/nfsd' \
/etc/init.d/nfs.server \
2>/dev/null | head -1`"
fi

<—————–SNIP—————->
To increase the number of NFS servers that Sun Cluster starts, change the number 16 in DEFAULT_NFSDCMD above to something higher, such as 1024.
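
If you would rather not hand-edit the script, the change can be sketched as a small shell function. This is only an illustration, assuming the DEFAULT_NFSDCMD line looks exactly like the one quoted above. The function name set_nfsd_default is made up, and it keeps a copy of the original file with an .orig suffix.

```shell
#!/bin/sh
# set_nfsd_default FILE COUNT
# Hypothetical helper: rewrites the number after "-a" on the
# DEFAULT_NFSDCMD line of FILE, keeping the original as FILE.orig.
set_nfsd_default() {
    file=$1
    count=$2
    # Back up first; cp -p preserves mode and timestamps.
    cp -p "$file" "$file.orig" || return 1
    # Only the DEFAULT_NFSDCMD line is touched; other lines pass through.
    sed "s|^\(DEFAULT_NFSDCMD=\"/usr/lib/nfs/nfsd -a \)[0-9][0-9]*\"|\1${count}\"|" \
        "$file.orig" > "$file"
}
```

In my case the call would be set_nfsd_default /opt/SUNWscnfs/bin/nfs_start_daemons 1024. Keep in mind that a package patch or upgrade may overwrite the script and undo the change.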

Moving LUNs between hosts using metarecover

Sometimes you might need to move a LUN between hosts X and Y on a SAN. Solaris has a handy command, metarecover, that lets you do just that. First, stop all processes on X that might be using the LUN. Then unmount the LUN and present it to host Y. Now you can use metarecover on Y to recover the soft partition metadevice information from the LUN. A couple of things to remember:

The -n switch causes a dry run: metarecover prints what it would do without actually performing the recovery. Also, be careful about clashes between the metadevice names already on host Y and the ones you are trying to recover. I have not tried it, so I am not sure what would happen; it would probably bomb out. You could use metarename to rename Y's existing metadevices and eliminate possible conflicts before recovery.

root@ultra# metarecover -v c3t600508B400011C370000C00002970000d0s2 -p -d
Verifying on-disk structures on
c3t600508B400011C370000C00002970000d0s2.
The following extent headers were found on
c3t600508B400011C370000C00002970000d0s2.
Name    Seq#  Type   Offset     Length
d100    0     ALLOC  205635593  25165825
d110    0     ALLOC  16384      32769
d120    0     ALLOC  49153      32769
d130    0     ALLOC  81922      32769
d140    0     ALLOC  114691     41943041
d150    0     ALLOC  42057732   146800641
d160    0     ALLOC  188858373  2097153
d170    0     ALLOC  190955526  6291457
d180    0     ALLOC  197246983  6291457
d190    0     ALLOC  203538440  2097153
NONE    0     END    251609087  1
NONE    0     FREE   230801418  20807669
Found 10 soft partition(s) on c3t600508B400011C370000C00002970000d0s2.
Checking sequence numbers.
mp->c.un_type: 5
mp->c.un_size: 136
mp->c.un_self_id: 100
mp->un_status: 5
mp->un_numexts: 1
mp->un_length: 25165824
mp->un_dev: 30933010
mp->un_key: 43
Ext# voff poff Len
0 0 205635594 25165824
mp->c.un_type: 5
mp->c.un_size: 136
mp->c.un_self_id: 110
mp->un_status: 5
mp->un_numexts: 1
mp->un_length: 32768
mp->un_dev: 30933010
mp->un_key: 43
Ext# voff poff Len
0 0 16385 32768
mp->c.un_type: 5
mp->c.un_size: 136
mp->c.un_self_id: 120
mp->un_status: 5
mp->un_numexts: 1
mp->un_length: 32768
mp->un_dev: 30933010
mp->un_key: 43
Ext# voff poff Len
0 0 49154 32768
mp->c.un_type: 5
mp->c.un_size: 136
mp->c.un_self_id: 130
mp->un_status: 5
mp->un_numexts: 1
mp->un_length: 32768
mp->un_dev: 30933010
mp->un_key: 43
Ext# voff poff Len
0 0 81923 32768
mp->c.un_type: 5
mp->c.un_size: 136
mp->c.un_self_id: 140
mp->un_status: 5
mp->un_numexts: 1
mp->un_length: 41943040
mp->un_dev: 30933010
mp->un_key: 43
Ext# voff poff Len
0 0 114692 41943040
mp->c.un_type: 5
mp->c.un_size: 136
mp->c.un_self_id: 150
mp->un_status: 5
mp->un_numexts: 1
mp->un_length: 146800640
mp->un_dev: 30933010
mp->un_key: 43
Ext# voff poff Len
0 0 42057733 146800640
mp->c.un_type: 5
mp->c.un_size: 136
mp->c.un_self_id: 160
mp->un_status: 5
mp->un_numexts: 1
mp->un_length: 2097152
mp->un_dev: 30933010
mp->un_key: 43
Ext# voff poff Len
0 0 188858374 2097152
mp->c.un_type: 5
mp->c.un_size: 136
mp->c.un_self_id: 170
mp->un_status: 5
mp->un_numexts: 1
mp->un_length: 6291456
mp->un_dev: 30933010
mp->un_key: 43
Ext# voff poff Len
0 0 190955527 6291456
mp->c.un_type: 5
mp->c.un_size: 136
mp->c.un_self_id: 180
mp->un_status: 5
mp->un_numexts: 1
mp->un_length: 6291456
mp->un_dev: 30933010
mp->un_key: 43
Ext# voff poff Len
0 0 197246984 6291456
mp->c.un_type: 5
mp->c.un_size: 136
mp->c.un_self_id: 190
mp->un_status: 5
mp->un_numexts: 1
mp->un_length: 2097152
mp->un_dev: 30933010
mp->un_key: 43
Ext# voff poff Len
0 0 203538441 2097152
The following soft partitions were found and will be added to
your metadevice configuration.
Name Size No. of Extents
d100 25165824 1
d110 32768 1
d120 32768 1
d130 32768 1
d140 41943040 1
d150 146800640 1
d160 2097152 1
d170 6291456 1
d180 6291456 1
d190 2097152 1
WARNING: You are about to add one or more soft partition
metadevices to your metadevice configuration. If there
appears to be an error in the soft partition(s) displayed
above, do NOT proceed with this recovery operation.
Are you sure you want to do this (yes/no)? yes
c3t600508B400011C370000C00002970000d0s2: Soft Partitions recovered
from device.
root@ultra#
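
The name clash worry mentioned earlier can be checked before running the real recovery. Below is a rough sketch, assuming you have two plain text files with one metadevice name per line: the names already defined on host Y and the names found on the LUN. The function name md_name_conflicts and the temporary file names are made up for this example.

```shell
#!/bin/sh
# md_name_conflicts EXISTING_FILE INCOMING_FILE
# Hypothetical helper: prints every metadevice name that appears in
# both files (one name per line in each file).
md_name_conflicts() {
    existing="/tmp/md_existing.$$"
    incoming="/tmp/md_incoming.$$"
    # comm needs sorted input, so sort both lists the same way first.
    sort -u "$1" > "$existing"
    sort -u "$2" > "$incoming"
    # -12 suppresses lines unique to either file, leaving the overlap.
    comm -12 "$existing" "$incoming"
    rm -f "$existing" "$incoming"
}
```

One way to build the first list on host Y is metastat -p | awk '{print $1}'. The second list can be copied by hand from the dry run (-n) output. Any name the function prints would be a candidate for metarename before recovery.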

Replacing disk controlled by SVM

The following scenario assumes two mirrored disks, with two state database replicas on slice 7 of each disk. The high-level steps are as follows:

  1. determine failed disk
  2. detach failed submirrors
  3. clear failed submirror metadevices and database replicas from failed disk
  4. unconfigure the failed disk and replace it
  5. configure the new disk and recreate VTOC
  6. add new database replicas
  7. recreate the submirrors and reattach them to the respective mirrors

This is the current /etc/vfstab:

bash-3.00# more /etc/vfstab
#device device mount FS fsck mount mount
#to mount to fsck point type pass at boot options
#
fd - /dev/fd fd - no -
/proc - /proc proc - no -
/dev/md/dsk/d0 - - swap - no -
/dev/md/dsk/d10 /dev/md/rdsk/d10 / ufs 1 no logging
/dev/md/dsk/d30 /dev/md/rdsk/d30 /export/home ufs 2 yes logging
/devices - /devices devfs - no -
ctfs - /system/contract ctfs - no -
objfs - /system/object objfs - no -
swap - /tmp tmpfs - yes -

From here on I will use d0 and its submirrors as an example. d0 consists of d1 and d2. d2 is on the failed disk.

d0: Mirror
Submirror 0: d1
State: Okay
Submirror 1: d2
State: Needs maintenance
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 4100928 blocks (2.0 GB)
d1: Submirror of d0
State: Okay
Size: 4100928 blocks (2.0 GB)
Stripe 0:
Device Start Block Dbase State Reloc Hot Spare
c1t0d0s0 0 No Okay Yes
d2: Submirror of d0
State: Needs maintenance
Invoke: metareplace d0 c1t1d0s0 <new device>
Size: 4100928 blocks (2.0 GB)
Stripe 0:
Device Start Block Dbase State Reloc Hot Spare
c1t1d0s0 0 No Maintenance Yes

First we detach d2. The same has to be repeated for d32 and d12:

bash-3.00# metadetach -f d0 d2
d0: submirror d2 is detached

We need to clear d2. Again, the same is repeated for d32 and d12:

bash-3.00# metaclear d2
d2: Concat/Stripe is cleared
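
Since the same two commands are repeated for every mirror, here is a small sketch that just prints them for review before anything is run. It assumes d12 is a submirror of d10 and d32 is a submirror of d30, which I am inferring from the vfstab above; the function name print_detach_cmds is made up.

```shell
#!/bin/sh
# print_detach_cmds
# Hypothetical helper: prints (does not run) the detach and clear
# commands for each "mirror submirror" pair. The d10/d12 and d30/d32
# pairings are an assumption based on the vfstab shown earlier.
print_detach_cmds() {
    for pair in "d0 d2" "d10 d12" "d30 d32"; do
        # Split the pair into $1 (mirror) and $2 (failed submirror).
        set -- $pair
        echo "metadetach -f $1 $2"
        echo "metaclear $2"
    done
}
```

Printing first makes it easy to eyeball the device names; once they look right, the output can be piped to sh.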

Now we delete the state database replicas from the failed disk. It is very important to make sure that at least half of the state database replicas will still be available before we start removing them from the failed disk. Sun's Solaris Volume Manager documentation explains the majority consensus algorithm that SVM uses. You can determine the number and location of the replicas using the metadb -i command.

bash-3.00# metadb -d c1t1d0s7
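
The "at least half" rule can be sanity-checked with a little awk. This is a sketch only: it assumes metadb prints one line per replica with the /dev/dsk/... path as the last field, and an awk that supports -v (nawk on Solaris 10). The function name replicas_ok_without is made up.

```shell
#!/bin/sh
# replicas_ok_without DISK
# Hypothetical helper: reads metadb output on stdin, counts replicas
# that are NOT on DISK (for example c1t1d0), and succeeds only if at
# least half of all replicas would remain.
replicas_ok_without() {
    awk -v d="$1" '
        # Replica lines end with a /dev/dsk/ path; headers do not.
        $NF ~ /^\/dev\/dsk\// {
            total++
            if (index($NF, "/dev/dsk/" d "s") != 1) keep++
        }
        END {
            printf "%d of %d replicas remain\n", keep, total
            exit !(total > 0 && keep * 2 >= total)
        }
    '
}
```

Run it as metadb | replicas_ok_without c1t1d0 and check the exit status before deleting anything.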

Now we can unconfigure the failed disk using cfgadm, replace it and configure the new disk:

bash-3.00# cfgadm -al
Ap_Id Type Receptacle Occupant Condition
c0 scsi-bus connected configured unknown
c0::dsk/c0t0d0 CD-ROM connected configured unknown
c1 scsi-bus connected configured unknown
c1::dsk/c1t0d0 disk connected configured unknown
c1::dsk/c1t1d0 disk connected configured unknown
c2 scsi-bus connected unconfigured unknown
usb0/1 unknown empty unconfigured ok
usb0/2 unknown empty unconfigured ok
bash-3.00# cfgadm -c unconfigure c1::dsk/c1t1d0
bash-3.00# cfgadm -c configure c1::dsk/c1t1d0

Now we replicate VTOC from the good disk:

bash-3.00# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2

Add database replicas to the new disk:

bash-3.00# metadb -a -c2 c1t1d0s7

Finally, we can recreate the failed submirrors, attach them to their respective mirrors, and let them sync up. Again, the same applies to d32 and d12:

bash-3.00# metainit d2 1 1 c1t1d0s0
d2: Concat/Stripe is setup
bash-3.00# metattach d0 d2
d0: submirror d2 is attached

A few notes: this setup contains a total of 4 state database replicas. During a disk failure, half of the replicas will be gone. The system keeps running with half of the replicas, but if the server gets rebooted for whatever reason, it will not come up in multiuser mode. If fewer than half of the replicas are available, the system will panic. For more information, check out docs.sun.com.

When using cfgadm to unconfigure a disk, no resources may be using that disk; otherwise the unconfigure will fail. Quite possibly the swap metadevice is set as the dedicated dump device. To view or change the dump device settings, use the dumpadm command.

NFS4 Invalid inbound domain name

It seems that starting with Solaris 06/07, nfs4_domain is required in the sysidcfg file; otherwise JumpStart will go interactive. You can force a value, for example nfs4_domain=example.net, or you can set it to dynamic (nfs4_domain=dynamic), in which case the value is derived from the name service in use. Solaris 10 has the nfsmapid daemon, which maps numeric UIDs/GIDs to strings in the format user@example.net.

If there is a domain mismatch between the NFS4 client and server, the client will see files on the server as owned by nobody. On the server, syslog might log something like this:

Mar 3 15:13:14 ultra /usr/lib/nfs/nfsmapid[275]: [ID 300081 daemon.error] valid_domain: Invalid inbound domain name example.net..

In my case there was a typo in the /etc/resolv.conf file: the domain entry ended with a trailing dot. Sun's documentation on nfsmapid has useful information for troubleshooting similar problems.
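
A typo like this is easy to miss by eye, so here is a small sketch that flags it. It assumes a POSIX shell and awk. The function name trailing_dot_entries is made up, and checking search entries as well as domain entries is my own extension of what went wrong in my case.

```shell
#!/bin/sh
# trailing_dot_entries [FILE]
# Hypothetical helper: prints every domain or search value in FILE
# (default /etc/resolv.conf) that ends with a dot.
trailing_dot_entries() {
    file=${1:-/etc/resolv.conf}
    awk '
        $1 == "domain" || $1 == "search" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /\.$/) print $1 ": " $i
        }
    ' "$file"
}
```

No output means no trailing dots were found. After fixing the file, nfsmapid may need a restart before it picks up the corrected domain.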
