Flashing firmware on SunFire T2000

Not so long ago I had to flash firmware on a SunFire T2000. It’s not a complicated thing.

I needed to do something with the System Controller, and in the process I found out that there is no scadm utility on the T2000. Why? I am not completely sure. Maybe someone can chime in. It’s probably because the T2000 runs ALOM CMT, which is not the same as ALOM.

Anyway, there are two methods of doing all this. You can use the SC’s flashupdate command to grab the firmware off an FTP server. Unfortunately I did not have the Net management port hooked up, only serial. Long story…

So I had to use the second method: transfer the firmware archive to the server and then, using the included utility, upload it to the System Controller and flash the firmware from there.

Before starting, note your current firmware version by logging into the System Controller:

sc> showhost
Sun-Fire-T2000 System Firmware 6.3.9  2007/08/16 23:53
Host flash versions:
   Hypervisor 1.3.4 2007/03/28 06:03
   OBP 4.25.8 2007/08/16 10:59
   POST 4.25.8 2007/08/16 11:26

Start by transferring the firmware archive to the server and unzipping it:

bash-3.00# ls
139434-09
bash-3.00# cd 139434-09
bash-3.00# ls
139434-09.html
Install.info
LEGAL_LICENSE.TXT
Legal
README.139434-09
Sun_Fire_T2000_metadata.xml
Sun_System_Firmware-6_7_12-SPARC_Enterprise_T2000.bin
Sun_System_Firmware-6_7_12-Sun_Fire_T2000.bin
copyright
sysfwdownload
sysfwdownload.README

Using the included sysfwdownload utility, upload the firmware to the System Controller. This takes roughly 15-20 minutes.

bash-3.00# ./sysfwdownload Sun_System_Firmware-6_7_12-Sun_Fire_T2000.bin
 
.......... (9%).......... (18%).......... (27%).......... (37%).......... (46%).......... (55%).......... (64%).......... (74%).......... (83%).......... (92%)......... (100%)
 
Download completed successfully.
bash-3.00#

At this point the firmware is on the System Controller. Now you need to shut down the system.

bash-3.00# shutdown -g0 -y -i0

Power down the system from the SC console and proceed with firmware flashing.

{8} ok sc> poweroff
Are you sure you want to power off the system [y/n]?  y
sc>
SC Alert: SC Request to Power Off Host.
 
SC Alert: Host system has shut down.

Next, make sure the keyswitch is set to NORMAL. If it is set to LOCKED you will not be able to flash the firmware or send STOP-A to the system. If the keyswitch is set to NORMAL, start the actual flashing process using the flashupdate command:

sc> showkeyswitch
Keyswitch is in the NORMAL position.
sc> flashupdate -s 127.0.0.1
 
SC Alert: System poweron is disabled.
.............................................................
.............................................................
............................................................
 
Update complete. Reset device to use new software.
 
SC Alert: SC firmware was reloaded

At this point the firmware is flashed. You still have to reset the SC so it loads the new firmware:

sc> resetsc
Are you sure you want to reset the SC [y/n]?  y
User Requested SC Shutdown
 
ALOM BOOTMON v1.7.11
ALOM Build Release: 001
Reset register: 00000000
 
ALOM POST 1.0
 
 
Dual Port Memory Test, PASSED.
 
TTY External - Internal Loopback Test
TTY External - Internal Loopback Test, PASSED.
 
TTYC - Internal Loopback Test
TTYC - Internal Loopback Test, PASSED.
 
TTYD - Internal Loopback Test
TTYD - Internal Loopback Test, PASSED.
 
Memory Data Lines Test
Memory Data Lines Test, PASSED.
 
Memory Address Lines Test
  Slide address bits to test open address lines
  Test for shorted address lines
Memory Address Lines Test, PASSED.
 
Boot Sector FLASH CRC Test
Boot Sector FLASH CRC Test, PASSED.
 
 
 
Return to Boot Monitor for Handshake
ALOM POST 1.0
   Status = 00007fff
 
Returned from Boot Monitor and Handshake
 
 
 
Loading the runtime image... VxWorks running.
 
Starting Advanced Lights Out Manager CMT v1.7.11
 
Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.
Current mode: NORMAL
Attaching network interface lo0... done.
Attaching network interface motfec0.... done.
Booting from Segment 0
 
 
Oracle Advanced Lights Out Manager CMT v1.7.11
 
 
SC Alert: SC System booted.
 
 
Full VxDiag Tests
 
BASIC TOD TEST
  Read the TOD Clock:        TUE NOV 22 22:04:15 2011
  Wait, 1 - 3 seconds
  Read the TOD Clock:        TUE NOV 22 22:04:17 2011
BASIC TOD TEST, PASSED
 
ETHERNET CPU LOOPBACK TEST
  50 BYTE PACKET   - a 0 in field of 1's.
  50 BYTE PACKET   - a 1 in field of 0's.
  900 BYTE PACKET  - pseudo-random data.
ETHERNET CPU LOOPBACK TEST, PASSED
 
Full VxDiag Tests - PASSED
 
 
 
    Status summary  -  Status = 7FFF
 
       VxDiag    -          -  PASSED
       POST      -          -  PASSED
       LOOPBACK  -          -  PASSED
 
       I2C       -          -  PASSED
       EPROM     -          -  PASSED
       FRU PROM  -          -  PASSED
 
       ETHERNET  -          -  PASSED
       MAIN CRC  -          -  PASSED
       BOOT CRC  -          -  PASSED
 
       TTYD      -          -  PASSED
       TTYC      -          -  PASSED
       MEMORY    -          -  PASSED
       MPC885    -          -  PASSED
 
 
Please login:

After logging in, make sure the SC is running the new version of the firmware:

sc> showhost
Sun-Fire-T2000 System Firmware 6.7.12  2011/07/06 20:03
 
Host flash versions:
   OBP 4.30.4.d 2011/07/06 14:29
   Hypervisor 1.7.3.c 2010/07/09 15:14
   POST 4.30.4.b 2010/07/09 14:24
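
If you save the showhost output before and after the flash, a small shell helper can confirm that the version really changed. This is only a sketch: sysfw_version and sysfw_changed are names I made up, and it assumes you captured the showhost output into files from your terminal session, since the SC itself does not run shell scripts.

```shell
# Hypothetical helper: print the System Firmware version found in
# saved `showhost` output read from stdin.
sysfw_version() {
    awk '/System Firmware/ {
        for (i = 1; i < NF; i++)
            if ($i == "Firmware") { print $(i + 1); exit }
    }'
}

# Hypothetical helper: succeed only if both saved outputs contain a
# version line and the two versions differ.
sysfw_changed() {
    sv_before=$(sysfw_version < "$1")
    sv_after=$(sysfw_version < "$2")
    [ -n "$sv_before" ] && [ -n "$sv_after" ] && [ "$sv_before" != "$sv_after" ]
}
```

With the output saved in two files of your choosing, something like sysfw_changed before.txt after.txt && echo flashed only prints when the versions differ.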

That’s it, now you can power the system back on with poweron.

Posted on January 20, 2012 at 16:14 by somedude
In: alom, openboot, solaris, solaris tips, sun hardware

Unable to access console login prompt from ALOM

Not so long ago, I logged into the System Controller of a SunFire T2000 running Solaris 10 and tried to access the server’s console. For some reason, the usual console command did not work. It simply did not return anything.

I figured maybe something was stuck, so I reset the System Controller and tried again. Again, I got nothing, which was strange, because usually a couple of ‘Enters’ bring up the console login prompt.

This had worked before, so I knew it was not a configuration issue elsewhere. I vaguely remembered that I had a similar issue in Solaris 9. If you look at /etc/inittab on Solaris 9 and prior you will see something like this:

co:234:respawn:/usr/lib/saf/ttymon -g -h -p "`uname -n` console login: " -T sun -d /dev/console -l console -m ldterm,ttcompat

Killing the ttymon process would cause init to respawn it in the specified run-levels. This fixed my problem then. On Solaris 10 it’s different. The console login prompt is handled by SMF:

bash-3.00# svcs -a | grep console-login
online         14:41:04 svc:/system/console-login:default

So, restarting svc:/system/console-login:default should fix the issue:

bash-3.00# svcadm restart svc:/system/console-login:default

Now you should be able to get console login prompt from ALOM.
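
The check and restart can be rolled into one small function. This is a sketch rather than anything official: fix_console_login is a made-up name, and the maintenance branch is my own addition, based on the fact that a service in the maintenance state needs svcadm clear rather than a restart.

```shell
# Hypothetical helper: report the state of console-login, then restart
# it (or clear it, if SMF has put it into maintenance).
fix_console_login() {
    fcl_fmri=svc:/system/console-login:default
    # -H drops the header line, -o state prints only the state column.
    fcl_state=$(svcs -H -o state "$fcl_fmri") || return 1
    echo "console-login state before: $fcl_state"
    case "$fcl_state" in
        maintenance) svcadm clear "$fcl_fmri" ;;
        *)           svcadm restart "$fcl_fmri" ;;
    esac
}
```

Run it as root on the host and then try the console command from ALOM again.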

Posted on December 13, 2011 at 10:19 by somedude
In: alom, openboot, smf, solaris, solaris tips, sun hardware

ZFS and swapping

ZFS is great. Obviously. Generally when I build a server I install the system on UFS and then mirror the disks with SVM. I keep data and apps on ZFS. That’s nice, especially if ZFS is on a SAN. If something happens to the server I can move the ZFS pools to a different server, if necessary.

The server build process is just a preference. You can easily install Solaris 10 on ZFS and be done with it.

Not so long ago I ran into a situation where I needed to add some swap. For various reasons I did not want to shuffle slicing on the system disks while the system was running. Nor could I add a swap file on one of the UFS filesystems. So the last option I had was to use one of the zpools:

bash-3.00# zpool status
  pool: dudespool
 state: ONLINE
 scrub: none requested
config:
 
        NAME        STATE     READ WRITE CKSUM
        dudespool   ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
            c2t2d0  ONLINE       0     0     0
 
errors: No known data errors

I was going to create a file and use it as a swap area:

bash-3.00# cd /dudespool
bash-3.00# mkfile 50M swap
bash-3.00# swap -a /dudespool/swap
"/dudespool/swap" may contain holes - can't swap on it.

So that was that. As it turns out, a file on a ZFS filesystem cannot be used for swap. You have to create a zvol and swap on that.

bash-3.00# zfs create -V 50M dudespool/swapvol
bash-3.00# swap -a /dev/zvol/dsk/dudespool/swapvol
bash-3.00# swap -l
swapfile             dev  swaplo blocks   free
/dev/md/dsk/d10     85,10      8 1220928 1220928
/dev/zvol/dsk/dudespool/swapvol 181,1       8 102392 102392
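
Two things are easy to miss here: swap -l reports sizes in 512-byte blocks, and swap -a alone does not survive a reboot. The sketch below covers both. The function names are made up for illustration, and the vfstab line follows the usual Solaris layout for a swap device, so check it against your own /etc/vfstab before appending anything.

```shell
# Hypothetical helper: convert a block count from `swap -l`
# (512-byte blocks) to megabytes, with one decimal place.
swap_blocks_to_mb() {
    awk -v blocks="$1" 'BEGIN { printf "%.1f\n", blocks * 512 / 1048576 }'
}

# Hypothetical helper: print the /etc/vfstab line that would add the
# given zvol (pool/volume) as swap at boot. It only prints the line,
# it does not edit the file.
zvol_vfstab_line() {
    printf '/dev/zvol/dsk/%s\t-\t-\tswap\t-\tno\t-\n' "$1"
}
```

For the listing above, swap_blocks_to_mb 102392 gives 50.0, which matches the 50M zvol once the 8 blocks of swaplo are accounted for.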

There is more info on the net, just google it…

Posted on November 15, 2011 at 11:19 by somedude
In: solaris, solaris tips, zfs

Very basic SELinux troubleshooting

SELinux has been around for a while in RedHat. SELinux is a Mandatory Access Control mechanism. Starting with RedHat 6, the installer automatically sets SELinux to enforcing mode.

When troubleshooting something, SELinux is one more thing to keep in mind.

If you are fixing something and you booted with SELinux disabled, none of the files created while it was disabled will have an SELinux context (fcontext). This triggers a filesystem relabel when you turn SELinux back on. The relabel can take a long time, and any fcontexts you set by hand will be lost unless you have added them to the policy database.

If you do decide to go this route, you can disable SELinux by passing selinux=0 on the kernel command line at boot, or by editing the /etc/sysconfig/selinux config file and then rebooting.

When troubleshooting, it’s probably better to use getenforce and setenforce commands:

[root@ultra opt]# getenforce
Enforcing

To change SELinux status use setenforce:

[root@ultra opt]# setenforce 0
[root@ultra opt]# getenforce
Permissive
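
When I only want permissive mode for the duration of one test, a wrapper saves me from forgetting to switch back. This is a minimal sketch: with_permissive is a made-up name, it has to run as root, and it assumes nobody else changes the mode while the command is running.

```shell
# Hypothetical helper: run a command with SELinux in permissive mode,
# then restore enforcing mode if that is what it was before.
with_permissive() {
    wp_prev=$(getenforce) || return 1
    if [ "$wp_prev" = "Enforcing" ]; then
        setenforce 0 || return 1
    fi
    wp_rc=0
    "$@" || wp_rc=$?            # remember the command's exit status
    if [ "$wp_prev" = "Enforcing" ]; then
        setenforce 1
    fi
    return "$wp_rc"
}
```

Call it with whatever you are testing as the arguments, for example with_permissive service httpd restart. If the problem goes away only inside the wrapper, SELinux is the likely culprit.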

One thing that I forget sometimes is the difference between the cp and mv commands with respect to SELinux. Moving a file preserves its fcontext, whereas copying does not, unless you use the -a option.

Then there are different booleans that can be read and set using getsebool and setsebool. The key thing to remember is that unless you supply the -P option to setsebool, the change will not survive a reboot.

If you suspect problems with fcontexts, you can use the chcon and semanage tools. chcon changes the context on a file or directory, but the context is not added to the policy database, so it will not survive a filesystem relabel or a restorecon run.

[root@ultra opt]# chcon --reference /var/www/html /www

This is handy if you want to quickly test out fcontext. The command applies the same fcontext from /var/www/html to /www.

To make the fcontext stick across relabels you have to do something like:

[root@ultra opt]# semanage fcontext -a -t public_content_t '/www(/.*)?'

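You will need to substitute the desired fcontext in place of public_content_t. The new rule only reaches files that already exist once you relabel them, for example with restorecon -R /www.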
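
Recording the rule and relabelling the existing files always go together, so they fit in one function. A sketch only: persist_fcontext is a made-up name, and it assumes the directory is not / itself.

```shell
# Hypothetical helper: record an fcontext rule for a directory tree in
# the policy database, then relabel the files that already exist there.
persist_fcontext() {
    pf_type=$1
    pf_dir=${2%/}               # drop a trailing slash, if any
    semanage fcontext -a -t "$pf_type" "${pf_dir}(/.*)?" || return 1
    restorecon -R "$pf_dir"
}
```

For the example above that would be persist_fcontext public_content_t /www.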

Then there is setroubleshootd, which along with sealert can help you figure out what’s happening. While setroubleshootd is running, /var/log/messages will contain the SELinux messages it intercepts, each giving you the sealert command to run to see in detail what SELinux violation occurred.

That would be it in a very basic nutshell.

One more thing to remember, if you did something like:

[root@ultra opt]# semanage fcontext -a -t private_content_t '/www/stuff(/.*)?'
[root@ultra opt]# semanage fcontext -a -t public_content_t '/www(/.*)?'

Then during next filesystem relabel /www/stuff will have public_content_t fcontext!
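
Going by the behaviour described above, where the rule added last wins, one way to avoid the surprise is to add the broad rules first and the more specific ones after them. The sketch below only prints the commands in that order and runs nothing. plan_fcontext_rules is a made-up name, the type names are the ones from the example above, and sorting by pattern length is just a rough stand-in for "more specific".

```shell
# Hypothetical helper: read "type pattern" pairs on stdin and print the
# matching semanage commands, shortest (broadest) pattern first, so
# that longer, more specific patterns are added last.
plan_fcontext_rules() {
    awk '{ print length($2), $0 }' | sort -n |
    while read -r _len type pattern; do
        printf "semanage fcontext -a -t %s '%s'\n" "$type" "$pattern"
    done
}
```

Feed it the two rules from the example in either order and the /www rule comes out before the /www/stuff rule. Review the printed commands before running them.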

Posted on October 31, 2011 at 21:01 by somedude
In: centos, linux, linux tips, redhat, security

OpenBoot: All CPUs failed or disabled

I guess every day you learn something new. This incident happened on a Sun Fire V480R. The server had been running for ages, but for one reason or another it had to be rebooted.

So after the diags ran, it came back with this:

Sun Fire 480R, No Keyboard
Copyright 1998-2003 Sun Microsystems, Inc.  All rights reserved.
OpenBoot 4.10.7, 4096 MB memory installed, Serial #55408554.
Ethernet address 0:3:ba:4d:77:aa, Host ID: 834d77aa.
 
 
 
 
                                                                       
FATAL: All CPUs failed or disabled.

Hmm, interesting. This was the first time I had seen this error. Admittedly, it was a little bit stressful, because the system had to come back up, never mind the fact that the hardware is at a completely different location.

So, pretty much the only viable option was to make the server boot somehow. Let’s see what had been ASR-disabled:

{3} ok .asr
ASR Disablement Status
Component:     Status
 
CPU/Memory:    Enabled
IO-Bridge5:    Enabled
IO-Bridge8:    Enabled
IO-Bridge9:    Enabled
GPTwo Slots:   Enabled
Onboard FCAL:  Enabled
Onboard Net1:  Enabled
Onboard Net0:  Enabled
Onboard IDE:   Enabled
PCI Slots:     Enabled

All seems OK to me. Let’s see if disabling and re-enabling the CPUs will do the trick:

{3} ok asr-disable cpu0
{3} ok asr-disable cpu1
{3} ok asr-disable cpu2
{3} ok asr-disable cpu3
{3} ok .asr
ASR Disablement Status
Component:     Status
 
CPU0:          Disabled
Memory Bank0:  Enabled
Memory Bank1:  Enabled
Memory Bank2:  Enabled
Memory Bank3:  Enabled
CPU1/Memory:   Disabled
Memory Bank0:  Enabled
Memory Bank1:  Enabled
Memory Bank2:  Enabled
Memory Bank3:  Enabled
CPU2:          Disabled
Memory Bank0:  Enabled
Memory Bank1:  Enabled
Memory Bank2:  Enabled
Memory Bank3:  Enabled
CPU3:          Disabled
Memory Bank0:  Enabled
Memory Bank1:  Enabled
Memory Bank2:  Enabled
Memory Bank3:  Enabled
IO-Bridge5:    Enabled
IO-Bridge8:    Enabled
IO-Bridge9:    Enabled
GPTwo Slots:   Enabled
Onboard FCAL:  Enabled
Onboard Net1:  Enabled
Onboard Net0:  Enabled
Onboard IDE:   Enabled
PCI Slots:     Enabled
 
{3} ok asr-enable cpu0
{3} ok asr-enable cpu1
{3} ok asr-enable cpu2
{3} ok asr-enable cpu3
{3} ok reset-all

After the reset, the system came back with the same error, saying all CPUs were failed or disabled. So the whole disable/enable procedure was repeated. Except this time, the system was powered off and then back on.

This time it booted happily. Maybe a simple poweroff and poweron would have sufficed. Anyways, there is some good information right here.

Posted on September 19, 2011 at 21:32 by somedude
In: openboot, sun hardware