Solaris Containers and ZFS

I had to create some containers for developers to do their work. Developers always seem to want root access to a machine. Containers work very nicely in this scenario: if a developer messes up his container, I can just clone a new one off a “gold” container. ZFS can be very handy here as well: by installing a container on a ZFS filesystem and assigning a ZFS quota, you can limit how big the container can grow.

So, first I created a ZFS pool out of two slices on two disks. This is not really the recommended way to create a ZFS pool; you should really be using two whole disks. Also, ignore the fact that both disks reside on the same controller. Right after that I created the dev1 filesystem within zonepool:

bash-3.00# zpool create -m /export/home/zones zonepool mirror c0t0d0s3 c0t1d0s3
bash-3.00# zfs create zonepool/dev1
bash-3.00# zfs list
NAME               USED  AVAIL  REFER  MOUNTPOINT
zonepool           122K  55.6G  25.5K  /export/home/zones
zonepool/dev1     24.5K  8.00G  24.5K  /export/home/zones/dev1

Next I set the ZFS quota on the filesystem to 8GB:

bash-3.00# zfs set quota=8G zonepool/dev1
bash-3.00# zfs get all zonepool/dev1
NAME              PROPERTY       VALUE                       SOURCE
zonepool/dev1     type           filesystem                  -
zonepool/dev1     creation       Fri Jun  4  9:17 2010       -
zonepool/dev1     used           24.5K                       -
zonepool/dev1     available      8.00G                       -
zonepool/dev1     referenced     24.5K                       -
zonepool/dev1     compressratio  1.00x                       -
zonepool/dev1     mounted        yes                         -
zonepool/dev1     quota          8G                          local
zonepool/dev1     reservation    none                        default
zonepool/dev1     recordsize     128K                        default
zonepool/dev1     mountpoint     /export/home/zones/dev1     inherited from zonepool
zonepool/dev1     sharenfs       off                         default
zonepool/dev1     checksum       on                          default
zonepool/dev1     compression    off                         default
zonepool/dev1     atime          on                          default
zonepool/dev1     devices        on                          default
zonepool/dev1     exec           on                          default
zonepool/dev1     setuid         on                          default
zonepool/dev1     readonly       off                         default
zonepool/dev1     zoned          off                         default
zonepool/dev1     snapdir        hidden                      default
zonepool/dev1     aclmode        groupmask                   default
zonepool/dev1     aclinherit     secure                      default
zonepool/dev1     canmount       on                          default
zonepool/dev1     shareiscsi     off                         default
zonepool/dev1     xattr          on                          default
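
The full property listing is more than you need if you only want to confirm the quota. Here is a minimal sketch that pulls a single value out of that output. It assumes the NAME / PROPERTY / VALUE / SOURCE layout shown above, and prop_value is a hypothetical helper name of my own, not something ZFS ships:

```shell
# prop_value PROPERTY
# Reads `zfs get` output on stdin and prints the VALUE column for PROPERTY.
# Hypothetical helper: assumes the four-column layout shown above and a
# value without embedded spaces (so it will not work for "creation").
# On Solaris the stock /usr/bin/awk may not accept -v; nawk should.
prop_value() {
    awk -v prop="$1" '$2 == prop { print $3 }'
}

# Usage in the global zone:
#   zfs get all zonepool/dev1 | prop_value quota
```

Parsing the text output keeps the helper independent of which zfs options a given Solaris release supports.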

Now, I should mention that prior to configuring /export/home/zones to reside on ZFS, I uninstalled the dev1 container that was there previously. So the container itself was gone, but the system still had knowledge of the container’s configuration. I wrote a post on configuring containers here.

bash-3.00# zoneadm list -cv
ID NAME             STATUS     PATH                           BRAND    IP
0 global           running    /                              native   shared
- dev1             configured /export/home/zones/dev1        native   shared

Since the container was already configured, I went ahead and started installing it:

bash-3.00# zoneadm -z dev1 install
/export/home/zones/dev1 must not be group readable.
/export/home/zones/dev1 must not be group executable.
/export/home/zones/dev1 must not be world readable.
/export/home/zones/dev1 must not be world executable.
could not verify zonepath /export/home/zones/dev1 because of the above errors.
zoneadm: zone dev1 failed to verify
bash-3.00#

Whoops, looks like the container directory permissions need some fixing:

bash-3.00# cd /export/home/zones/
bash-3.00# ls -l
total 3
drwxr-xr-x   2 root     sys            2 Jun  3 09:48 dev1
bash-3.00# chmod 700 dev1
bash-3.00# chown root:root dev1
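
To catch this before running the install, the mode can be checked up front. This is a rough sketch only: zonepath_mode_ok is a hypothetical helper, and it assumes that mode 700 (what the chmod above sets) is exactly what zoneadm wants to see:

```shell
# zonepath_mode_ok MODE
# MODE is the first column of `ls -ld`, for example drwx------ .
# Succeeds only for a directory that is rwx for the owner and closed to
# group and world, which is what `chmod 700` produces. The trailing *
# tolerates an ACL marker such as "+" after the mode bits.
zonepath_mode_ok() {
    case "$1" in
        drwx------*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Usage:
#   mode=$(ls -ld /export/home/zones/dev1 | awk '{ print $1 }')
#   zonepath_mode_ok "$mode" || chmod 700 /export/home/zones/dev1
```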

One more try to install the container:

bash-3.00# zoneadm -z dev1 install
Preparing to install zone <dev1>.
Creating list of files to copy from the global zone.
Copying <2561> files to the zone.
Initializing zone product registry.
Determining zone package initialization order.
Preparing to initialize <1086> packages on the zone.
Initialized <1086> packages on zone.
Zone <dev1> is initialized.
The file  contains a log of the zone installation.

That’s it. After the container install completed, and before booting dev1, I put the following sysidcfg file into the /etc directory of the dev1 container:

bash-3.00# more sysidcfg
system_locale=en_US
timezone=US/Central
terminal=vt100
security_policy=NONE
network_interface=primary {
hostname=dev1
}
nfs4_domain=dynamic
name_service=NIS {
domain_name=example.com
name_server=nis1(10.1.1.1)
}

That way I would not be asked any container configuration questions during the first container boot, except for the root password, of course.
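
If you build several developer containers, the same file can be stamped out per zone. A minimal sketch, assuming only the hostname changes between zones: make_sysidcfg and the dev2 name are hypothetical, and every other value is simply copied from the dev1 file above.

```shell
# make_sysidcfg HOSTNAME
# Prints a sysidcfg like the one above with only the hostname changed.
# Hypothetical helper; all other values come from the dev1 example and
# would need adjusting for a different site.
make_sysidcfg() {
    cat <<EOF
system_locale=en_US
timezone=US/Central
terminal=vt100
security_policy=NONE
network_interface=primary {
hostname=$1
}
nfs4_domain=dynamic
name_service=NIS {
domain_name=example.com
name_server=nis1(10.1.1.1)
}
EOF
}

# Usage, assuming the zone's root filesystem sits under <zonepath>/root:
#   make_sysidcfg dev2 > /export/home/zones/dev2/root/etc/sysidcfg
```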

Installing and patching Data Protector client on standalone Solaris machine

I needed to install the HP Data Protector 5.5 client on an old machine running Solaris 2.6. I also needed to apply some Data Protector patches to that machine. The problem was that I had no Data Protector Install Server for Data Protector version 5.5.

Normally, Data Protector clients are installed from a Data Protector Install Server. When you need to deploy patches, you first patch the Install Server and then push the patches out to your Data Protector clients.

Installing the client itself on a standalone machine is not a big deal. All you need is an appropriate Data Protector software depot and off you go. The problem was installing patches without the Install Server.

Here are the files I used to install and patch this machine:

B6960-15041_DP55_HPUX_PA_IS_CD.tar – HPUX Install Server – contains Solaris 2.6 client
DPSOL_00168.zip – Patch
DPSOL_00180.zip – Patch

First, I extracted the Data Protector software depot that contains the Solaris 2.6 client and installed the Disk Agent on the machine:

root@client # tar xf B6960-15041_DP55_HPUX_PA_IS_CD.tar
root@client # cd LOCAL_INSTALL
root@client # ./omnisetup.sh -install da

After that was done, it was time to patch the Data Protector installation. First, I unzipped the patch archive. Then I located and uncompressed the proper packet.Z file for Solaris 2.6:

root@client # unzip DPSOL_00168.zip
root@client # cd OB2-SOLUX/root/opt/omni/databases/vendor/da/sun/sparc/solaris-26/A.05.50/
root@client # uncompress packet.Z

Finally, I installed the packet, which is really just a Solaris package:

root@client # pkgadd -d ./packet all

The whole patch process has to be repeated for every patch you are trying to install. Note that when you unzip the patch, you have to know what you are looking for, so pay attention to the path to the packet.Z file. In the above case I was installing a Disk Agent patch on a Sun SPARC box running Solaris 2.6 and Data Protector version 5.5.
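
Picking the right packet.Z out of an unzipped patch can be narrowed down with a filter. This is a sketch under one big assumption: that other patches follow the same .../vendor/<agent>/.../<platform>/... layout seen in DPSOL_00168. pick_packets is a hypothetical helper name.

```shell
# pick_packets AGENT PLATFORM
# Filters a list of paths on stdin down to the packet.Z files for one
# agent (e.g. da) and one platform directory (e.g. solaris-26).
# Hypothetical helper; assumes the directory layout seen in DPSOL_00168.
pick_packets() {
    grep "/vendor/$1/" | grep "/$2/" | grep '/packet\.Z$'
}

# Usage from the directory where the patch was unzipped:
#   find . -name packet.Z -print | pick_packets da solaris-26
```

Each path it prints is a candidate for the uncompress and pkgadd steps shown above.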

Trussing processes on Sun Cluster?

One of the apps running on Sun Cluster was randomly crashing, so I decided to take a look at what was happening. Yeah, there is DTrace in Solaris 10. But since I am pretty comfortable with truss, I decided to give that a shot first:

root@node1 # truss -p 27462
truss: process is traced: 27462
root@node1 #

That’s it. No truss output, nothing. That was weird. truss will not work if there is a debugger attached to the process to be traced, which was not the case. So, I figured it might have something to do with the fact that the process is handled by the cluster software.

Finally, the NOTES section of the pmfadm manpage gave me the answer:

To avoid collisions with other controlling processes, truss(1) does not allow tracing a process that it detects as being controlled by another process by way of the /proc interface. Since rpc.pmfd(1M) uses the /proc interface to monitor processes and their descendents, those processes that are submitted to rpc.pmfd by way of pmfadm cannot be traced or debugged.

So, DTrace it was. Thankfully, Brendan Gregg had already done the hard work for me by creating a DTrace version of truss. The more you know…
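
For a quick first look, a plain system call count is often enough. The sketch below only prints a dtrace command for a given PID rather than running it. The helper name is hypothetical, and the one-liner is a common DTrace idiom, not the truss-like script mentioned above:

```shell
# dtrace_syscall_cmd PID
# Prints (does not run) a DTrace one-liner that counts system calls made
# by process PID, grouped by call name. Hypothetical helper; run the
# printed line as root on the node and press Ctrl-C to see the counts.
dtrace_syscall_cmd() {
    printf "dtrace -n 'syscall:::entry /pid == \$target/ { @[probefunc] = count(); }' -p %s\n" "$1"
}

# Usage:
#   dtrace_syscall_cmd 27462
```

Printing the command first makes it easy to review before pasting it into a root shell on a production cluster node.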

Jetadmin and non-HP network printers

I was trying to get a Dell 3130cn printer working with Jetadmin in Solaris. But I was not able to create the print queue, even though the printer was reachable, SNMP was working and so was telnet. Fortunately, there is a workaround I found somewhere on the Internet. Of course, I failed to keep the link to it. Anyway, the non-HP printer that you are trying to configure has to have a Jetdirect card. The workaround:

  1. Take an HP Jetdirect printer that you already have configured.
  2. Edit the /etc/hosts file on the print server and add an entry with the Dell printer's name. The IP address for that entry has to be the one belonging to the working HP printer.
  3. Add a print queue as you normally would.
  4. Remove the /etc/hosts file entry.
  5. Test printing.

Of course, this assumes that you have some sort of name resolution mechanism in place, such as NIS, so that the printer names get resolved properly. Also, your /etc/nsswitch.conf file has to specify that the /etc/hosts file is the first place the server looks when resolving names.

The /etc/hosts file entry temporarily overrides your global name resolution mechanism. This way you create the print queue under the Dell printer's name, but you are actually talking to the HP printer while creating it. You might also want to use the generic network printer driver in Jetadmin. Anyway, I got that Dell printer working.
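
Steps 2 and 4 can be sketched as a pair of small helpers. Both function names, the IP address and the printer name in the usage lines are hypothetical. Also note that remove_override drops every non-comment line that mentions the name, so do not use it if the name already had a permanent entry in the file:

```shell
# add_override FILE IP NAME
# Appends a temporary "IP NAME" entry to FILE (hypothetical helper).
add_override() {
    printf '%s\t%s\n' "$2" "$3" >> "$1"
}

# remove_override FILE NAME
# Drops every non-comment line in FILE that lists NAME as a host name or
# alias. The result is written back in place so that a symlinked hosts
# file stays a symlink. On Solaris, nawk may be needed for the -v option.
remove_override() {
    tmp="$1.tmp.$$"
    awk -v name="$2" '
        $1 ~ /^#/ { print; next }
        {
            for (i = 2; i <= NF; i++) if ($i == name) next
            print
        }' "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Usage (10.1.1.50 stands in for the working HP printer's address):
#   add_override /etc/hosts 10.1.1.50 dellprinter
#   ... create the print queue in Jetadmin ...
#   remove_override /etc/hosts dellprinter
```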

Failed Repository Integrity Check

Last week I was presented with the following error on one of the Solaris 10 boxes:

svc.configd: smf(5) database integrity check of:

/etc/svc/repository.db

failed. The database might be damaged or a media error might have
prevented it from being verified. Additional information useful to
your service provider is in:

/etc/svc/volatile/db_errors

The system will not be able to boot until you have restored a working
database. svc.startd(1M) will provide a sulogin(1M) prompt for recovery
purposes. The command:

/lib/svc/bin/restore_repository

can be run to restore a backup version of your repository. See
http://sun.com/msg/SMF-8000-MY for more information.

Having never seen this error, I was thinking: “this is gonna be interesting…”. Thankfully the error was pretty verbose, so I started to dissect it section by section. Yeah, the service repository got hosed somehow, and I could potentially find some useful info in /etc/svc/volatile/db_errors. Unfortunately, there was nothing of use in there.

The restore_repository script mentioned in the message gave me a little more hope. I also went and checked out the URL from the message. After reading the page I decided to go ahead and try to restore the service repository.

I logged in to the box in single user mode and took a look at the restore script to get an idea of what it might do. Then I ran it. Fortunately, the script was pretty good at doing checks and told me that I could not proceed any further because the / filesystem was mounted read-only. To fix this I was asked to run:

bash-3.00# /lib/svc/method/fs-root
bash-3.00# /lib/svc/method/fs-usr

Once the filesystems were fixed up, I ran the restore_repository script. I was asked which backup copy I wanted to restore, and that was it. The system rebooted and came back up fine. This turned out to be a pretty good learning experience, and http://www.sun.com/msg/SMF-8000-MY is well worth reading.
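
The read-only check the script did for me can also be done by hand before starting. A minimal sketch, with fs_writable as a hypothetical helper that simply tries to create a scratch file:

```shell
# fs_writable DIR
# Succeeds if a scratch file can be created in DIR, i.e. the filesystem
# holding DIR is mounted read-write for the current user.
# Hypothetical helper; the probe file is removed again right away.
fs_writable() {
    probe="$1/.rw_probe.$$"
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Usage from the sulogin shell, with the commands from the steps above:
#   fs_writable / || { /lib/svc/method/fs-root; /lib/svc/method/fs-usr; }
#   /lib/svc/bin/restore_repository
```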
