Linux multipathing

I use MPxIO in Solaris quite often and it works very well for me. This time I needed to test out I/O multipathing in RedHat. What I really needed to do: have a server with two HBA’s manage a mirror whose submirrors sit on separate SAN’s, so that the server has multiple paths to each submirror. That way, if an HBA fails, the server still has a connection to both submirrors through the remaining HBA.

Gear used in this “experiment”:

  • Dell Poweredge server.
  • Two Qlogic QLA2310 HBA’s.
  • RHEL Server 5.3 x86.
  • Two SAN’s presenting one LUN each.

Rough steps I took to get this working:

  1. Make sure device mapper package is installed.
  2. Present two LUN’s from two SAN’s.
  3. Probe HBA’s for presented LUN’s.
  4. Configure multipathing.

First and foremost, make sure the qla2xxx driver is loaded. You also have to make sure you have device-mapper-multipath installed (I used device-mapper-multipath-0.4.7-23.el5). Next, configure the multipathing daemon so that it starts on boot:

[root@carbon ~]# chkconfig multipathd on
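
The two prerequisites mentioned above can be checked quickly from the shell. This is only a minimal sketch: module_loaded is a helper name I made up, and its optional second argument exists purely so it can be tried against a sample file.

```shell
# Hypothetical helper: succeed if the named kernel module shows up in /proc/modules.
# The optional second argument (a modules-style file) is only there for trying it out.
module_loaded() {
    grep "^$1 " "${2:-/proc/modules}" > /dev/null
}

# Example use, as root on the RHEL box:
#   module_loaded qla2xxx || modprobe qla2xxx
#   rpm -q device-mapper-multipath
```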

When that’s done you need to make the system aware of the presented LUN’s. One way to do so is to reboot the server. Another option is to force HBA scan:

[root@carbon ~]# echo "- - -" > /sys/class/scsi_host/host1/scan
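
The command above only rescans host1. With two HBA’s there is normally more than one SCSI host, so a small loop saves some typing. A rough sketch, assuming the usual sysfs layout; the function name rescan_scsi_hosts and its optional directory argument are my own invention.

```shell
# Hypothetical helper: ask every SCSI host to rescan all channels, targets and LUNs.
# The optional argument (a sysfs-style directory) only exists to make it easy to try out.
rescan_scsi_hosts() {
    base=${1:-/sys/class/scsi_host}
    for scan in "$base"/host*/scan; do
        # Skip hosts without a writable scan file (this also covers an unmatched glob).
        [ -w "$scan" ] || continue
        echo "- - -" > "$scan"
        echo "rescanned ${scan%/scan}"
    done
}
```

Run rescan_scsi_hosts as root with no arguments to hit every host on the box.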

During this you should watch /var/log/messages to see if your LUN’s are detected. When done, make multipathd aware of the LUN’s:

[root@carbon ~]# multipath -v2 -d

The above command is a “dry run”. There will be no device map changes committed. You will only be shown device mapper changes that will be made. To commit device map changes run:

[root@carbon ~]# multipath -v2

Once this is done you can see what multipathd is seeing:

[root@carbon ~]# multipath -ll
mpath2 (3600508b400011c300000f00001a90000) dm-3 COMPAQ,HSV111 (C)COMPAQ
[size=15G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=100][enabled]
 \_ 1:0:3:1 sde 8:64 [active][ready]
 \_ 2:0:3:1 sdh 8:112 [active][ready]
\_ round-robin 0 [prio=20][enabled]
 \_ 1:0:2:1 sdd 8:48 [active][ready]
 \_ 2:0:2:1 sdg 8:96 [active][ready]
mpath1 (3600508b4001031250000900001490000) dm-2 COMPAQ,HSV111 (C)COMPAQ
[size=15G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=100][enabled]
 \_ 1:0:0:1 sdb 8:16 [active][ready]
 \_ 2:0:4:1 sdi 8:128 [active][ready]
\_ round-robin 0 [prio=20][enabled]
 \_ 1:0:1:1 sdc 8:32 [active][ready]
 \_ 2:0:1:1 sdf 8:80 [active][ready]

If everything looks good, you can create a configuration file for multipathd. You will need to edit /etc/multipath.conf and, depending on your environment, add or modify some parameters. The configuration file contains enough comments and examples to figure out what the different parameters mean. When in doubt, consult the man pages.

First, add a blacklist section, which exempts certain devices from multipathing. I have my internal drives listed in the blacklist section:

blacklist {
        devnode "^sd[a-b].*"
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z]"
}

Next, you are going to need a device section, which goes inside the devices section of the file. This is going to be specific to your SAN. The one below is for EVA5000. I got the parameters from HP’s device mapper package:

device {
        vendor                  "HP|COMPAQ"
        product                 "HSV1[01]1 \(C\)COMPAQ|HSV[2][01]0|HSV300"
        path_grouping_policy    group_by_prio
        getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
        path_checker            tur
        path_selector           "round-robin 0"
        prio_callout            "/sbin/mpath_prio_alua /dev/%n"
        rr_weight               uniform
        failback                immediate
        hardware_handler        "0"
        no_path_retry           12
        rr_min_io               100
}

You should also look at the defaults section to make sure it is configured for your setup. Again, the parameters in mine are specific to EVA5000:

defaults {
        udev_dir                /dev
        polling_interval        10
        selector                "round-robin 0"
        path_grouping_policy    failover
        getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
        prio_callout            "/bin/true"
        path_checker            tur
        rr_min_io               100
        rr_weight               uniform
        failback                immediate
        no_path_retry           12
        user_friendly_names     yes
        bindings_file           "/var/lib/multipath/bindings"
}

Finally, you will need to specify configuration for the presented LUN’s. Each multipath block below goes inside the multipaths section of the multipath.conf file:

multipath {
        wwid                    3600508b4001031250000900001490000
        alias                   san1data
}
multipath {
        wwid                    3600508b400011c300000f00001a90000
        alias                   san2data
}
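
An alias only takes effect when its wwid matches the value multipath -ll prints in parentheses, so it is worth cross-checking the two before restarting the daemon. A rough sketch, assuming the multipath -ll output has been saved to a file; conf_wwids and check_wwids are names I made up.

```shell
# Hypothetical helper: print every wwid configured in a multipath.conf-style file.
# Commented-out example lines are ignored because their first field is not "wwid".
conf_wwids() {
    awk '$1 == "wwid" { print $2 }' "$1"
}

# Hypothetical helper: report configured wwids that do not appear in saved
# "multipath -ll" output (second argument). Returns non-zero if any is missing.
check_wwids() {
    missing=0
    for wwid in $(conf_wwids "$1"); do
        if ! grep -F "($wwid)" "$2" > /dev/null; then
            echo "not seen by multipath: $wwid"
            missing=1
        fi
    done
    return $missing
}
```

Typical use would be multipath -ll > /tmp/mp.out followed by check_wwids /etc/multipath.conf /tmp/mp.out. A wwid listed in the blacklist section would also be reported, which is expected.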

After you are done, restart multipathd and check the output of the multipath -ll command:

[root@carbon ~]# multipath -ll
san2data (3600508b400011c300000f00001a90000) dm-3 COMPAQ,HSV111 (C)COMPAQ
[size=15G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=100][active]
 \_ 1:0:3:1 sde 8:64 [active][ready]
 \_ 2:0:3:1 sdh 8:112 [active][ready]
\_ round-robin 0 [prio=20][enabled]
 \_ 1:0:2:1 sdd 8:48 [active][ready]
 \_ 2:0:2:1 sdg 8:96 [active][ready]
san1data (3600508b4001031250000900001490000) dm-2 COMPAQ,HSV111 (C)COMPAQ
[size=15G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][enabled]
 \_ 1:0:0:1 sdb 8:16 [active][ready]
 \_ 2:0:4:1 sdi 8:128 [active][ready]
\_ round-robin 0 [prio=20][enabled]
 \_ 1:0:1:1 sdc 8:32 [active][ready]
 \_ 2:0:1:1 sdf 8:80 [active][ready]

That should be it. You should test the setup by disabling paths to see if your LUN’s stay up.
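
One way to disable a single path without touching cables is to take its SCSI device offline through sysfs. This is a sketch only, not the one true way to test failover: set_path_state is a made-up helper, and the optional third argument is there just so it can be tried against a scratch directory.

```shell
# Hypothetical helper: set the SCSI state of one path device ("offline" or "running").
# The optional third argument (a sysfs-style root) only exists to try it out safely.
set_path_state() {
    dev=$1
    state=$2
    sysfs=${3:-/sys}
    case $state in
        offline|running) ;;
        *) echo "state must be offline or running" >&2; return 1 ;;
    esac
    echo "$state" > "$sysfs/block/$dev/device/state"
}
```

For example, set_path_state sde offline, then watch multipath -ll until the path shows as failed, then bring it back with set_path_state sde running. With polling_interval at 10 it may take a few seconds for multipathd to notice either change.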

Mounting Linux NFS share: Not owner

I was trying to mount a RHEL 4 NFS share in Solaris 10. But for whatever reason I just could not seem to get it mounted. It would always come back with “Not owner” error:

bash-3.00# mount -F nfs carbon:/media/cdrecorder /mnt/carbon
nfs mount: mount: /mnt/carbon: Not owner

So, I checked and rechecked my settings with no success. Then I remembered reading somewhere that NFS v4 in Linux was not so great at one time. Since the Linux box was running RHEL 4, I thought this might be my problem. So I decided to force the mount to use NFS v3, since Solaris 10 tries NFS v4 first when mounting the Linux share.

bash-3.00# mount -F nfs -o vers=3 carbon:/media/cdrecorder /mnt/carbon
bash-3.00# cd /mnt/carbon
bash-3.00#

That worked well. Since this was a one-time mounting job, I did not bother any further. If I were doing this on a regular basis I would probably edit /etc/default/nfs on the Solaris box and cap the maximum NFS client version at v3 (the NFS_CLIENT_VERSMAX parameter).
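
For the permanent variant, the setting in question is NFS_CLIENT_VERSMAX in /etc/default/nfs. Below is a small sketch of how the edit could be scripted; I did not run this on the box in question, and set_nfs_client_versmax is a helper name I made up.

```shell
# Hypothetical helper: cap the NFS client version in a Solaris-style /etc/default/nfs.
# Usage: set_nfs_client_versmax FILE VERSION
set_nfs_client_versmax() {
    conf=$1
    vers=$2
    tmp="$conf.new.$$"
    # Rewrite an existing (possibly commented-out) setting; other lines pass through untouched.
    sed "s/^#*NFS_CLIENT_VERSMAX=.*/NFS_CLIENT_VERSMAX=$vers/" "$conf" > "$tmp" || return 1
    # Append the setting if the file never mentioned it.
    grep "^NFS_CLIENT_VERSMAX=" "$tmp" > /dev/null || echo "NFS_CLIENT_VERSMAX=$vers" >> "$tmp"
    # Copy the content back instead of moving the file, so ownership and mode stay as they were.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

Calling set_nfs_client_versmax /etc/default/nfs 3 as root should leave a single NFS_CLIENT_VERSMAX=3 line in the file; mounts made afterwards would then start at v3.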

Network interface bonding in Linux

Bonding Ethernet interfaces in Linux is pretty straightforward. There are a bunch of articles out there on it already, but since this is where I keep some of my notes, I decided to write a post on it. Plus I do not have to bother with Google and I can come straight here for instructions.
This was done on a Poweredge running CentOS 5.2. Here are the things that need to be done to make this happen:

  • tell OS to load bonding.ko module on boot
  • set up configuration files for members of the bonded interface and the bonded interface itself
  • restart networking services or reboot

The following is the /etc/modprobe.conf file. To get the OS to load the bonding module on boot, you will need to add the alias bond0 bonding line. You can also pass some options to the bonding module. In this case I wanted the driver to check for link loss every 100ms (miimon=100). I also wanted the bond0 interface to perform adaptive load balancing, hence mode=6 (balance-alb). Adaptive load balancing does not require any configuration on the switch side. If you choose a different mode, you might have to do additional configuration on the switch.

[root@bigfoot etc]# cat /etc/modprobe.conf
alias eth0 e1000
alias eth1 e1000
alias bond0 bonding
alias scsi_hostadapter qla1280
alias scsi_hostadapter1 megaraid_mbox
alias scsi_hostadapter2 ata_piix
options bond0 miimon=100 mode=6

Next, you need to set up configuration files for physical interfaces to be included in the bond0 interface. In my case bond0 consists of eth0 and eth1. Configuration files for both interfaces are identical except for DEVICE= lines.

[root@bigfoot etc]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
# Intel Corporation 82541GI Gigabit Ethernet Controller
DEVICE=eth0
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
ONBOOT=yes
USERCTL=no
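
Since the slave files differ only in the DEVICE= line, they can be generated in a loop. A minimal sketch; write_slave_configs is a name I made up, and it overwrites whatever ifcfg file is already there, so anything extra in the old file (an HWADDR line, for instance) would be lost.

```shell
# Hypothetical helper: write an ifcfg file for each slave of a bond.
# Usage: write_slave_configs DIR BOND IFACE...
write_slave_configs() {
    dir=$1
    bond=$2
    shift 2
    for dev in "$@"; do
        # Same settings as the ifcfg-eth0 listing above; only DEVICE changes per interface.
        cat > "$dir/ifcfg-$dev" <<EOF
DEVICE=$dev
BOOTPROTO=none
MASTER=$bond
SLAVE=yes
ONBOOT=yes
USERCTL=no
EOF
    done
}
```

For the setup in this post that would be write_slave_configs /etc/sysconfig/network-scripts bond0 eth0 eth1.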

The last step is to configure bond0 interface itself:

[root@bigfoot etc]# cat /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
BOOTPROTO=none
IPADDR=192.168.11.200
NETMASK=255.255.255.0
NETWORK=192.168.11.0
ONBOOT=yes
USERCTL=no

That is all. You can now either run /etc/init.d/network restart or reboot the box.
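
Once the bond is up, the bonding driver reports its state in /proc/net/bonding/bond0. The sketch below pulls out each slave and its link status; bond_slave_status is a made-up name, and the parsing assumes the usual "Slave Interface:" and "MII Status:" lines in that file.

```shell
# Hypothetical helper: print each slave of a bond together with its MII link status.
# Reads the bonding driver's status file; the path argument defaults to bond0.
bond_slave_status() {
    awk -F': ' '
        $1 == "Slave Interface" { slave = $2 }
        # The bond-level "MII Status" line comes before any slave and is skipped.
        $1 == "MII Status" && slave != "" { print slave, $2; slave = "" }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

With both links healthy, bond_slave_status should print one line per slave, each ending in up.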

This time I actually ran into a problem, where the physical interfaces were not being “enslaved” properly:

May 4 11:17:40 bigfoot kernel: ADDRCONF(NETDEV_UP): bond0: link is not ready
May 4 11:17:40 bigfoot kernel: bonding: bond0: Adding slave eth0.
May 4 11:17:40 bigfoot kernel: bonding: bond0: enslaving eth0 as an active interface with a down link.
May 4 11:17:40 bigfoot kernel: bonding: bond0: link status definitely up for interface eth0.
May 4 11:17:40 bigfoot kernel: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
May 4 11:17:40 bigfoot kernel: bonding: bond0: Adding slave eth1.
May 4 11:17:40 bigfoot kernel: bonding: bond0: enslaving eth1 as an active interface with a down link.
May 4 11:17:45 bigfoot kernel: bonding: bond0: Removing slave eth0
May 4 11:17:45 bigfoot kernel: bonding: bond0: Warning: the permanent HWaddr of eth0 - 00:11:43:D8:AF:63 - is still in use by bond0. Set the HWaddr of eth0 to a different address to avoid conflicts.
May 4 11:17:45 bigfoot kernel: bonding: bond0: releasing active interface eth0
May 4 11:17:47 bigfoot kernel: bonding: bond0: Adding slave eth0.
May 4 11:17:48 bigfoot kernel: bonding: bond0: Warning: failed to get speed and duplex from eth0, assumed to be 100Mb/sec and Full.
May 4 11:17:48 bigfoot kernel: bonding: bond0: enslaving eth0 as an active interface with an up link.

I have never had this problem before and quick googlage revealed that I am not alone. I came across this guy who had the same problem. He also links to the solution. Basically it seems Xen is causing the issue and to fix it you will need to edit /etc/xen/xend-config.sxp and force the network device to be used for network bridge in Xen:

(network-script 'network-bridge netdev=bond0')

Once I had that in place everything worked as advertised. Oh, and for thorough documentation check out the documentation included with the kernel source. The file is called bonding.txt. Here is an online version of it.

Growing mirrored LUN in RedHat

I was putting a RedHat server onto a SAN and I could not find any clear instructions on how to grow a single mirrored LUN on the fly. Anyway, here are some notes on the process. First the setup: two LUN’s mirrored across two SAN’s with an LVM volume on top. I could have easily just presented another set of mirrored LUN’s, added them to the VG and gone from there. I wanted to avoid that, as that kind of setup can quickly get out of hand as the number of presented LUN’s grows. If there is a more “sensible” and flexible setup, I would most definitely want to know about it.

For the sake of completeness, here are the steps to recreate the initial setup I had:

  1. Create a mirror from two LUN’s
  2. Use the mirror as PV
  3. Create a VG using the PV
  4. Create LV on the top of the VG
  5. Make ext3 filesystem on the top of LV and mount it

Here are the actual steps with some output:

[root@ultra /]# mdadm --create /dev/md10 --level=1 --raid-devices=2 /dev/mapper/mpath4 /dev/mapper/mpath5
mdadm: array /dev/md10 started.
[root@ultra /]# pvcreate /dev/md10
Physical volume "/dev/md10" successfully created
[root@ultra /]# vgcreate testvg /dev/md10
Volume group "testvg" successfully created
[root@ultra /]# lvcreate -l+100%FREE -n testlv testvg
Logical volume "testlv" created
[root@ultra /]# mkfs -t ext3 /dev/testvg/testlv
[root@ultra /]# mount /dev/testvg/testlv /tmp/test

Now the resizing part. There might be a few steps but the upshot is that the filesystem can stay mounted and in use. High level overview of steps to take:

  1. Grow the two LUN’s using SAN management software
  2. Fail and remove one of the submirrors
  3. Force the kernel to see the size increase of the submirror
  4. Flush and recreate the multipath device map so multipathing sees the new size
  5. Re-add the submirror to the mirror and let it sync
  6. Repeat 2-5 for the second submirror
  7. Grow the mirror device
  8. Resize the PV
  9. Resize the LV
  10. Resize the filesystem

First, you fail and remove the submirror:

[root@ultra /]# mdadm /dev/md10 -f /dev/mapper/mpath4 -r /dev/mapper/mpath4
mdadm: set /dev/mapper/mpath4 faulty in /dev/md10
mdadm: hot removed /dev/mapper/mpath4

Now, note all paths to the LUN. The kernel sees a separate device at the end of each path to a LUN. In this case they are sdj, sdt, sdg and sdq.

[root@ultra /]# multipath -ll mpath4
mpath4 (3600508b400011c300000f000008d0000)
[size=12 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [prio=100][active]
 \_ 1:0:3:1 sdj 8:144 [active][ready]
 \_ 2:0:3:1 sdt 65:48 [active][ready]
\_ round-robin 0 [prio=20][enabled]
 \_ 1:0:2:1 sdg 8:96 [active][ready]
 \_ 2:0:2:1 sdq 65:0 [active][ready]

At this point the problem is to get the kernel to recognize the new size without a reboot. After a lot of trying and sifting through man pages I found that the blockdev command does the magic. Then I googled “blockdev resize” and I found this confirming my finding. So, the next step is to probe all logical paths to the LUN:

[root@ultra /]# blockdev --rereadpt /dev/sdj
[root@ultra /]# blockdev --rereadpt /dev/sdt
[root@ultra /]# blockdev --rereadpt /dev/sdg
[root@ultra /]# blockdev --rereadpt /dev/sdq
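
With more than a handful of paths, typing each device by hand gets old. The sketch below pulls the sd names out of the multipath -ll output and loops over them. Both function names are made up, and the parsing assumes each path line carries an H:C:T:L address followed by the sd device, as in the listing above.

```shell
# Hypothetical helper: list the sd devices behind one multipath map, reading
# "multipath -ll <map>" output on standard input.
path_devices() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/ && $(i + 1) ~ /^sd[a-z]+$/)
                print $(i + 1)
    }'
}

# Hypothetical helper: make the kernel re-read every path of the named map.
reread_paths() {
    for dev in $(multipath -ll "$1" | path_devices); do
        blockdev --rereadpt "/dev/$dev"
    done
}
```

Here reread_paths mpath4 would stand in for the four blockdev commands.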

You should see messages in /var/log/messages about the kernel seeing the new size on all paths. If you were to issue multipath -ll right now you would see that multipathing is still reporting the old size. To fix that, flush the device map of the LUN and then recreate it:

[root@ultra /]# multipath -f mpath4
[root@ultra /]# multipath -v2
create: mpath4 (3600508b400011c300000f000008d0000)
[size=13 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=100]
 \_ 1:0:3:1 sdj 8:144 [ready]
 \_ 2:0:3:1 sdt 65:48 [ready]
\_ round-robin 0 [prio=20]
 \_ 1:0:2:1 sdg 8:96 [ready]
 \_ 2:0:2:1 sdq 65:0 [ready]

Multipathing should be reporting the new size. Now you are ready to put back the grown submirror and let the whole mirror sync:

[root@ultra /]# mdadm /dev/md10 -a /dev/mapper/mpath4
mdadm: hot added /dev/mapper/mpath4
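
The second submirror must not be pulled before the first one has finished syncing, so it helps to have something that blocks until /proc/mdstat goes quiet. A rough sketch with made-up function names; it looks at every md array on the box, not just md10, and the 10 second interval is arbitrary.

```shell
# Hypothetical helper: succeed while any md array still reports a resync or recovery.
# The optional argument (an mdstat-style file) defaults to /proc/mdstat.
md_syncing() {
    grep -E 'resync|recovery' "${1:-/proc/mdstat}" > /dev/null
}

# Hypothetical helper: block until no array reports a resync or recovery.
wait_for_md_sync() {
    while md_syncing "$@"; do
        sleep 10
    done
}
```

Running wait_for_md_sync right after the hot add returns once the mirror is whole again.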

When the mirror has synced up, repeat the above process for the second submirror and wait for the sync to finish. Time to grow the mirror device itself:

[root@ultra /]# mdadm --grow /dev/md10 --size=max

Once that completes, /proc/mdstat should report the increased size of /dev/md10. Moving on, you need to grow the PV that resides on /dev/md10:

[root@ultra /]# pvresize /dev/md10
Physical volume "/dev/md10" changed
1 physical volume(s) resized / 0 physical volume(s) not resized

And finally, you need to resize the LV:

[root@ultra /]# lvresize -l+100%FREE testvg/testlv
Extending logical volume testlv to 13.00 GB
Logical volume testlv successfully resized

Of course, don’t forget to grow the filesystem itself:

[root@ultra /]# ext2online /dev/testvg/testlv
ext2online v1.1.18 - 2001/03/18 for EXT2FS 0.5b
[root@ultra /]# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-rootlv
                     132304280   5104976 120478588   5% /
/dev/md0                132134     32791     92521  27% /boot
none                   8202920         0   8202920   0% /dev/shm
/dev/mapper/testvg-testlv
                      13413488     63516  12668820   1% /tmp/test

That should be it. The sync time for huge volumes is going to be something to keep in mind. The whole setup is clean and neat, without clutter. I could have opted to mirror using LVM, but there seems to be a strange requirement for a third volume to hold the log. It is possible to keep the log in memory, but that supposedly causes a resync on boot.

Getting started with SOL on Sun Fire V20z and V40z

The service processor (SP) on the Sun Fire V40z can be configured so that you can access the system console over the network, as you would on an UltraSPARC machine with a Net Management port. Here is a quick way to get started using a V40z and RedHat. Before starting, connect the SP network interface to the network.

Now, edit /etc/inittab and add the following line:

co:12345:respawn:/sbin/agetty -t 60 ttyS0 9600 vt100

This will spawn agetty in runlevels 12345 on serial port 1 with 9600 baud rate and vt100 emulation.

Next you need to edit /etc/grub.conf and comment out the splashimage line so the boot menu gets rendered properly. Then add the following two lines:

serial --unit=0 --speed=9600
terminal --timeout=10 console serial

This will initialize serial port 1 after GRUB startup. If you want to use serial port 2 you would set --unit=1. The terminals we will be using are console and serial, in that order, with a timeout of 10 seconds. The terminal gets selected depending on where a keystroke is detected first, before the timeout runs out. If the timeout expires, the first terminal specified is used.

Finally, append the following to the kernel line: console=tty0 console=ttyS0,9600n8. It will end up looking something like this:

kernel /vmlinuz-2.6.9-67.ELsmp ro root=/dev/VolGroup00/LogVol00 rhgb quiet console=tty0 console=ttyS0,9600n8

Now, you need to add the serial port device name to /etc/securetty. This file specifies the devices where root can log in. Just append ttyS0 (serial port 1) to the end of the file.
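
If you end up doing this on several boxes, the append is easy to script so that it does not add the entry twice. A tiny sketch; allow_root_tty is a name I made up and the file argument defaults to /etc/securetty.

```shell
# Hypothetical helper: append a tty name to a securetty-style file unless it is already listed.
allow_root_tty() {
    tty=$1
    file=${2:-/etc/securetty}
    # -F treats the name literally, -x requires the whole line to match.
    grep -Fx "$tty" "$file" > /dev/null 2>&1 || echo "$tty" >> "$file"
}
```

Run allow_root_tty ttyS0 as root; running it a second time changes nothing.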

Time to reboot and go to BIOS’s Advanced Settings. Select Console Redirection to Serial Port A and verify you have correct baud rate selected. Reboot the server for all changes to take effect.

At this point the SP might not have an IP address assigned, so assign it one using the V40z’s front panel. Once you have configured the SP with an IP address, subnet mask and default gateway, ssh to that IP address from the local subnet using the following:

bash-3.00# ssh setup@IP

You will be asked to set up SP usernames and passwords. When you are done, ssh back to the SP using one of the usernames you have set up, then disable and re-enable Serial Over LAN:

localhost $ platform set console -s platform
localhost $ platform set console -s sp -e -S 9600

After the SP has been re-enabled, it might be a good idea to set up command prompt so you know which server you are logged into:

localhost $ sp set hostname ultra-sp

Now you can access the console using:

ultra-sp $ platform console

After you connect to the console you can get help by pressing CTRL+E followed by c and ?. Here is sample output:

----
ultra-sp $ platform console
[Enter `^Ec?' for help]
Red Hat Enterprise Linux release 4
Kernel 2.4.21-3.EL on an i686

ultra login:
[help]
.    disconnect                        ;    move to another console
a    attach read/write                 b    send broadcast message
c    toggle flow control               d    down a console
e    change escape sequence            f    force attach read/write
g    group info                        i    information dump
L    toggle logging on/off             l?   break sequence list
l0   send break per config file        l1-9 send specific break sequence
m    display the message of the day    o    (re)open the tty and log file
p    replay the last 60 lines          r    replay the last 20 lines
s    spy read only                     u    show host status
v    show version info                 w    who is on this console
x    show console baud info            z    suspend the connection
|    attach local command              ?    print this message
<cr> ignore/abort command              ^R   replay the last line
\ooo send character by octal code
----

At this point you should have a usable network console. You might want to make additional setup changes to the SP to fit your environment.

The first time I issued the platform console command on the SP I got this error:

console: connect: 59372@console: Connection refused

Rebooting the SP using sp reboot fixed the issue for me.