Tips and tricks for PowerVC 1.2.3 (PVID, ghostdev, clouddev, rest API, growing volumes, deleting boot volume) | PowerVC 1.2.3 Redbook

Writing a Redbook was one of my main goals. After working days and nights for more than 6 years on Power Systems, IBM gave me the opportunity to write a Redbook. I had been looking at the Redbook residencies page for a very long time to find the right one. As there was nothing new on AIX and PowerVM (which are my favorite topics) I decided to give the latest PowerVC Redbook a try (this Redbook is an update, but a huge one; PowerVC is moving fast). I have been a Redbook reader since I started working on AIX. Almost all Redbooks are good; most of them are the best source of information for AIX and Power administrators. I'm sure that, like me, you have seen that part about becoming an author every time you read a Redbook. I can now say THAT IT IS POSSIBLE (for everyone). I'm now one of those guys and you can become one too. Just find the Redbook that fits you and apply on the Redbook webpage (http://www.redbooks.ibm.com/residents.nsf/ResIndex). I want to say a BIG thank you to all the people who gave me this opportunity, especially Philippe Hermes, Jay Kruemcke, Eddie Shvartsman, Scott Vetter, Thomas R Bosthworth. In addition to these people I also want to thank my teammates on this Redbook: Guillermo Corti, Marco Barboni and Liang Xu; they are all true professionals, very skilled and open ... this was a great team! One more time, thank you guys. Last, I take the opportunity here to thank the people who believed in me since the very beginning of my AIX career: Julien Gabel, Christophe Rousseau, and JL Guyot. Thank you guys! You deserve it, stay as you are. I'm not an anonymous guy anymore.

redbook

You can download the Redbook at this address: http://www.redbooks.ibm.com/redpieces/pdfs/sg248199.pdf. I learned something while writing the Redbook and talking to the members of the team: Redbooks are not there to explain what's "behind the scenes". A Redbook cannot be too long and needs to be written in about 3 weeks; there is no room for everything. Some topics fit better in a blog post than in a Redbook, and Scott told me that a couple of times during the writing session. I totally agree with him. So here is this long-awaited blog post. These are advanced topics about PowerVC: read the Redbook before reading this post.

One last thanks to IBM (and just IBM) for believing in me :-). THANK YOU SO MUCH.

ghostdev, clouddev and cloud-init (ODM wipe if using inactive live partition mobility or remote restart)

Everybody who is using cloud-init should be aware of this. Cloud-init is only supported with AIX versions that have the clouddev attribute available on sys0. To be totally clear, at the time of writing this blog post you will be supported by IBM only if you use AIX 7.1 TL3 SP5 or AIX 6.1 TL9 SP5. All other versions are not supported by IBM. Let me explain why, and how you can still use cloud-init on older versions with a little trick. But let's first explain what the problem is:

Let's say you have different machines, some running AIX 7100-03-05 and some running 7100-03-04, both using cloud-init for the activation. By looking at the cloud-init code at this address here we can say that:

  • After the cloud-init installation, cloud-init is:
  • Changing clouddev to 1 if sys0 has a clouddev attribute:
  • # oslevel -s
    7100-03-05-1524
    # lsattr -El sys0 -a ghostdev
    ghostdev 0 Recreate ODM devices on system change / modify PVID True
    # lsattr -El sys0 -a clouddev
    clouddev 1 N/A True
    
  • Changing ghostdev to 1 if sys0 doesn't have a clouddev attribute:
  • # oslevel -s
    7100-03-04-1441
    # lsattr -El sys0 -a ghostdev
    ghostdev 1 Recreate ODM devices on system change / modify PVID True
    # lsattr -El sys0 -a clouddev
    lsattr: 0514-528 The "clouddev" attribute does not exist in the predefined
            device configuration database.
    

This behavior can directly be observed in the cloud-init code:

ghostdev_clouddev_cloudinit
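For those who do not read Python, the logic boils down to something like this (my own shell transcription of the behavior described above, not the actual cloud-init source):

if lsattr -El sys0 -a clouddev >/dev/null 2>&1
then
  # AIX level with clouddev support (7100-03-05 / 6100-09-05 and later)
  chdev -l sys0 -a clouddev=1
else
  # older AIX levels: fall back to ghostdev
  chdev -l sys0 -a ghostdev=1
fi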

Now that we are aware of that, let's run a remote restart test between two P8 boxes. I take the opportunity here to present one of the coolest features of PowerVC 1.2.3: you can now remote restart your virtual machines directly from the PowerVC GUI if one of your hosts is in a failure state. I highly encourage you to check my latest post about this subject if you don't know how to set up remote restartable partitions: http://chmod666.org/index.php/using-the-simplified-remote-restart-capability-on-power8-scale-out-servers/:

  • Only simplified remote restart can be managed by PowerVC 1.2.3; the "normal" version of remote restart is not handled by PowerVC 1.2.3.
  • In the compute template configuration there is now a checkbox allowing you to create remote restartable partitions. Be careful: you can't move back to a P7 box without rebooting the machine, so be sure your Virtual Machines will stay on P8 boxes if you check this option.
  • remote_restart_compute_template

  • When the machine is shut down or there is a problem on it you can click the "Remotely Restart Virtual Machines" button:
  • rr1

  • Select the machines you want to remote restart:
  • rr2
    rr3

  • While the Virtual Machines are remote restarting, you can check the state of the VMs and the state of the host:
  • rr4
    rr5

  • After the evacuation the host is in “Remote Restart Evacuated State”:

rr6

Let’s now check the state of our two Virtual Machines:

  • The ghostdev one (the sys0 message in the errpt indicates that the partition ID has changed AND DEVICES ARE RECREATED (ODM wipe)) (no more IP address set on en0):
  • # errpt | more
    IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
    A6DF45AA   0803171115 I O RMCdaemon      The daemon is started.
    1BA7DF4E   0803171015 P S SRC            SOFTWARE PROGRAM ERROR
    CB4A951F   0803171015 I S SRC            SOFTWARE PROGRAM ERROR
    CB4A951F   0803171015 I S SRC            SOFTWARE PROGRAM ERROR
    D872C399   0803171015 I O sys0           Partition ID changed and devices recreat
    # ifconfig -a
    lo0: flags=e08084b,c0
            inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
            inet6 ::1%1/0
             tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
    
  • The clouddev one (the sys0 message in the errpt indicates that the partition ID has changed) (note that the errpt message does not indicate that the devices are recreated):
  • # errpt |more
    60AFC9E5   0803232015 I O sys0           Partition ID changed since last boot.
    # ifconfig -a
    en0: flags=1e084863,480
            inet 10.10.10.20 netmask 0xffffff00 broadcast 10.244.248.63
             tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
    lo0: flags=e08084b,c0
            inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
            inet6 ::1%1/0
             tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
    

VSAE is designed to manage ghostdev-only OSes; cloud-init, on the other hand, is designed to manage clouddev OSes. To be perfectly clear, here is how ghostdev and clouddev work. But first we need to answer a question: why do we need to set clouddev or ghostdev to 1? The answer is pretty obvious: one of these attributes needs to be set to 1 before capturing the Virtual Machine. When you deploy a new Virtual Machine this flag is needed to wipe the ODM before reconfiguring the virtual machine with the parameters set in the PowerVC GUI (IP, hostname). In both the clouddev and the ghostdev case it is obvious that we need to wipe the ODM at build/deploy time. Then VSAE or cloud-init (using the config drive datasource) sets the hostname and IP address that the clouddev and ghostdev attributes wiped. This works well for a new deployment because we need to wipe the ODM in all cases, but what about an inactive live partition mobility or a remote restart operation? The Virtual Machine has moved (not on the same host, and not with the same lpar ID) and we need to keep the ODM as it is. Here is how it works:

  • If you are using VSAE, it manages the ghostdev attribute for you. At capture time ghostdev is set to 1 by VSAE (when you run the pre-capture script). When deploying a new VM, at activation time, VSAE sets ghostdev back to 0. Inactive live partition mobility and remote restart operations will work fine with ghostdev set to 0.
  • If you are using cloud-init on a supported system, clouddev is set to 1 at cloud-init installation time. As cloud-init does nothing with either attribute at activation time, IBM needed a way to avoid wiping the ODM after a remote restart operation. That is why the clouddev device was introduced: it writes a flag in the NVRAM. When a new VM is built there is no flag in the NVRAM for it, so the ODM is wiped. When an already existing VM is remote restarted, the flag exists in the NVRAM, so the ODM is not wiped. By using clouddev there is no post-deploy action needed.
  • If you are using cloud-init on an unsupported system, ghostdev is set to 1 at cloud-init installation time. As cloud-init does nothing at post-deploy time, ghostdev remains set to 1 in all cases and the ODM will always be wiped.

cloudghost

There is a way to use cloud-init on unsupported systems. Keep in mind that in this case you will not be supported by IBM, so do this at your own risk. To be totally honest, I'm using this method in production to use the same activation engine for all my AIX versions:

  1. Pre-capture, set ghostdev to 1. Whatever happens, THIS IS MANDATORY.
  2. Post-capture, reboot the captured VM and set ghostdev back to 0 (see the commands right after this list).
  3. Post-deploy, set ghostdev to 0 on every Virtual Machine. You can put this in the activation input to do the job:
  4. #cloud-config
    runcmd:
     - chdev -l sys0 -a ghostdev=0
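For steps 1 and 2 the commands are nothing fancy. Pre-capture, on the VM you are about to capture:

# chdev -l sys0 -a ghostdev=1

Post-capture, on the rebooted captured VM:

# chdev -l sys0 -a ghostdev=0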
    

The PVID problem

I realized I had this problem after using PowerVC for a while. As PowerVC images for the rootvg and the other volume groups are created using Storage Volume Controller FlashCopy (in the case of an SVC configuration, but there are similar mechanisms for the other storage providers), the PVIDs for both the rootvg and the additional volume groups will always be the same for each new virtual machine (all new virtual machines will have the same PVID for their rootvg, and the same PVID for each captured volume group). I contacted IBM about this and the PowerVC team told me that this behavior is totally normal and has been observed since the release of VMcontrol. They have not had any issues related to this, so if you don't care about it, just do nothing and keep this behavior as it is. I recommend doing nothing about this!

It's a shame, but most AIX administrators like to keep things as they are and don't want any changes. (In my humble opinion this is one of the reasons AIX is so outdated compared to Linux; we need a community, not narrow-minded people keeping their knowledge to themselves just to stay in their daily job routine without having anything to learn.) If you are in this situation, facing angry colleagues about this particular point, you can use the solution proposed below to calm the passions of the few who do not want to change! :-). This is my rant: CHANGE!

By default, if you build two virtual machines and check the PVID of each one, you will notice that the PVIDs are the same:

  • Machine A:
  • root@machinea:/root# lspv
    hdisk0          00c7102d2534adac                    rootvg          active
    hdisk1          00c7102d00d14660                    appsvg          active
    
  • Machine B:
  • root@machineb:/root# lspv
    hdisk0          00c7102d2534adac                    rootvg          active
    hdisk1          00c7102d00d14660                    appsvg          active
    

For the rootvg the PVID is always set to 00c7102d2534adac and for the appsvg the PVID is always set to 00c7102d00d14660.

For the rootvg the solution is to change ghostdev (and only ghostdev) to 2, and to reboot the machine. Setting ghostdev to 2 will change the PVID of the rootvg at reboot time (after the PVID is changed, ghostdev is automatically set back to 0):

# lsattr -El sys0 -a ghostdev
ghostdev 2 Recreate ODM devices on system change / modify PVID True
# lsattr -l sys0 -R -a ghostdev
0...3 (+1)
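Putting it together, renewing the rootvg PVID is just these two commands, the reboot being what triggers the change:

# chdev -l sys0 -a ghostdev=2
# shutdown -Fr

After the reboot the rootvg has a new PVID and ghostdev is automatically back to 0.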

For the non-rootvg volume groups this is a little bit tricky but still possible: the solution is to use the recreatevg command (with its -d option) to change the PVID of all the physical volumes of your volume group. The procedure is the following:

  • Unmount all the filesystems of the volume group whose PVIDs you want to change.
  • Vary off the volume group.
  • Get the names of the physical volumes composing the volume group.
  • Export the volume group.
  • Recreate the volume group (this action changes the PVIDs).
  • Re-import the volume group.

Here are the shell commands doing the trick:

# vg=appsvg
# lsvg -l $vg | awk '$6 == "open/syncd" && $7 != "N/A" { print "fuser -k " $NF }' | sh
# lsvg -l $vg | awk '$6 == "open/syncd" && $7 != "N/A" { print "umount " $NF }' | sh
# varyoffvg $vg
# pvs=$(lspv | awk -v my_vg=$vg '$3 == my_vg {print $1}')
# exportvg $vg
# recreatevg -y $vg -d $pvs
# importvg -y $vg $(echo ${pvs} | awk '{print $1}')

We now agree that you want to do this, but as you are a smart person you want to do it automatically using cloud-init and the activation input. There are two ways to do it: the silly way (using shell) and the noble way (using the cloud-init syntax):

PowerVC activation engine (shell way)

Use a short ksh script in the activation input. This is not my recommendation, but you can do it for simplicity:

activation_input_shell
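I can't paste the screenshot's text here, but the idea is a #cloud-config user-data whose runcmd chains the commands we have just seen. Here is a sketch of mine (not necessarily the exact script in the image; it assumes, as in the lspv outputs above, that appsvg sits on hdisk1 and that its filesystems are not mounted yet when runcmd fires, otherwise add the fuser/umount steps first):

#cloud-config
runcmd:
 - chdev -l sys0 -a ghostdev=2
 - varyoffvg appsvg
 - exportvg appsvg
 - recreatevg -y appsvg -d hdisk1
 - shutdown -Fr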

PowerVC activation engine (cloudinit way)

Here is the cloud-init way. Important note: use the latest version of cloud-init; the first one I used had a problem with cc_power_state_change.py not using the right parameters for AIX:

activation_input_ci
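Again a sketch of what the image shows: the same chdev, but with the reboot delegated to cloud-init's own power_state module (which is why a working cc_power_state_change.py matters):

#cloud-config
runcmd:
 - chdev -l sys0 -a ghostdev=2
power_state:
 mode: reboot
 message: rebooting to renew the rootvg PVID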

Working with the REST API

I will not show you here how to work with the PowerVC RESTful API; I prefer to share a couple of scripts on my github account. Nice examples are often better than how-to tutorials, so check the scripts on github if you want a detailed how-to ... the scripts are well commented. Just a couple of things to say before closing this topic: the best way to work with a RESTful API is to code in Python; there are a lot of existing Python libs for RESTful APIs (httplib2, pycurl, requests). For my own understanding I prefer to use the simple httplib in my scripts. I will put all my command line tools in a github repository called pvcmd (for PowerVC command line). You can download the scripts at this address, or just use git to clone the repo. One more time, it is a community project; feel free to change and share anything: https://github.com/chmod666org/pvcmd:
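If you want to poke at the API before diving into the Python, a raw session with curl looks like this (a sketch based on the standard OpenStack Keystone v3 flow that PowerVC uses; adapt host, user, password and tenant ID to your installation):

# get a token from Keystone; it comes back in the X-Subject-Token response header
curl -k -si -H "Content-Type: application/json" \
  -d '{"auth": {"identity": {"methods": ["password"], "password": {"user": {"domain": {"name": "Default"}, "name": "root", "password": "mysecretpassword"}}}, "scope": {"project": {"domain": {"name": "Default"}, "name": "ibm-default"}}}}' \
  https://powervc.lab.chmod666.org:5000/v3/auth/tokens | awk '/X-Subject-Token/ {print $2}' > /tmp/token
# then use the token against the nova endpoint (port 8774) to list the virtual machines
curl -k -s -H "X-Auth-Token: $(cat /tmp/token)" \
  https://powervc.lab.chmod666.org:8774/v2/1471acf124a0479c8d525aa79b2582d0/servers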

Growing a data lun

To be totally honest, here is what I do when I'm creating a new machine with PowerVC. My customers always need one additional volume group for applications (we will call it appsvg). I've created a multi-volume image with this volume group already in place (with a bunch of filesystems in it). As most customers ask for a 100 GB volume group, the capture was made with this size. Unfortunately for me, we often get requests to create bigger volume groups, let's say 500 or 600 GB. Instead of creating a new lun and extending the volume group, PowerVC allows you to grow the lun to the desired size. For volume groups other than the boot one you must use the RESTful API to extend the volume. To do this I've created a Python script called pvcgrowlun (feel free to check the code on github): https://github.com/chmod666org/pvcmd/blob/master/pvcgrowlun. At each virtual machine creation I check whether the customer needs a larger volume group and extend it using the command provided below.

While coding this script I had a problem using the os-extend parameter in my HTTP request. PowerVC does not use exactly the same parameters as OpenStack; if you want to code this yourself be aware of it and check in the PowerVC online documentation whether you are using "extended attributes" (thanks to Christine L Wang for this one). The raw call is sketched after this walkthrough:

  • In the Openstack documentation the attribute is “os-extend” link here:
  • os-extend

  • In the PowerVC documentation the attribute is “ibm-extend” link here:
  • ibm-extend

  • Identify the lun you want to grow (the script takes the name of the volume as a parameter) (I have an unpublished one that lists all the volumes, tell me if you want it). In my case the volume name is multi-vol-bf697dfa-0000003a-828641A_XXXXXX-data-1, and I want to change its size from 60 to 80. This is not stated in the official PowerVC documentation but it works for both boot and data luns.
  • Check that the size of the lun is less than the desired size:
  • before_grow

  • Run the script:
  • # pvcgrowlun -v multi-vol-bf697dfa-0000003a-828641A_XXXXX-data-1 -s 80 -p localhost -u root -P mysecretpassword
    [info] growing volume multi-vol-bf697dfa-0000003a-828641A_XXXXX-data-1 with id 840d4a60-2117-4807-a2d8-d9d9f6c7d0bf
    JSON Body: {"ibm-extend": {"new_size": 80}}
    [OK] Call successful
    None
    
  • Check that the size has changed after the command execution:
  • aftergrow_grow

  • Don't forget to do the job in the operating system by running a "chvg -g" (check the TOTAL PPs here):
  • # lsvg appsvg
    VOLUME GROUP:       appsvg                   VG IDENTIFIER:  00f9aff800004c000000014e6ee97071
    VG STATE:           active                   PP SIZE:        256 megabyte(s)
    VG PERMISSION:      read/write               TOTAL PPs:      239 (61184 megabytes)
    MAX LVs:            256                      FREE PPs:       239 (61184 megabytes)
    LVs:                0                        USED PPs:       0 (0 megabytes)
    OPEN LVs:           0                        QUORUM:         2 (Enabled)
    TOTAL PVs:          1                        VG DESCRIPTORS: 2
    STALE PVs:          0                        STALE PPs:      0
    ACTIVE PVs:         1                        AUTO ON:        yes
    MAX PPs per VG:     32768                    MAX PVs:        1024
    LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
    HOT SPARE:          no                       BB POLICY:      relocatable
    MIRROR POOL STRICT: off
    PV RESTRICTION:     none                     INFINITE RETRY: no
    DISK BLOCK SIZE:    512                      CRITICAL VG:    no
    # chvg -g appsvg
    # lsvg appsvg
    VOLUME GROUP:       appsvg                   VG IDENTIFIER:  00f9aff800004c000000014e6ee97071
    VG STATE:           active                   PP SIZE:        256 megabyte(s)
    VG PERMISSION:      read/write               TOTAL PPs:      319 (81664 megabytes)
    MAX LVs:            256                      FREE PPs:       319 (81664 megabytes)
    LVs:                0                        USED PPs:       0 (0 megabytes)
    OPEN LVs:           0                        QUORUM:         2 (Enabled)
    TOTAL PVs:          1                        VG DESCRIPTORS: 2
    STALE PVs:          0                        STALE PPs:      0
    ACTIVE PVs:         1                        AUTO ON:        yes
    MAX PPs per VG:     32768                    MAX PVs:        1024
    LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
    HOT SPARE:          no                       BB POLICY:      relocatable
    MIRROR POOL STRICT: off
    PV RESTRICTION:     none                     INFINITE RETRY: no
    DISK BLOCK SIZE:    512                      CRITICAL VG:    no
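For reference, the REST call pvcgrowlun wraps is a standard cinder volume action carrying the ibm-extend body shown above. By hand, reusing the Keystone token from the REST API section, it would look something like this (a sketch: take the volume endpoint and tenant ID from your own Keystone catalog, 8776 being the usual cinder port):

curl -k -s -X POST -H "X-Auth-Token: $(cat /tmp/token)" -H "Content-Type: application/json" \
  -d '{"ibm-extend": {"new_size": 80}}' \
  https://powervc.lab.chmod666.org:8776/v2/1471acf124a0479c8d525aa79b2582d0/volumes/840d4a60-2117-4807-a2d8-d9d9f6c7d0bf/action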
    

My own script to create VMs

I create Virtual Machines every week, sometimes just a couple and sometimes 10 Virtual Machines in a row. We are using different storage connectivity groups, and different storage templates depending on whether the machine is in production, in development, and so on. We also have to choose the primary copy on the SVC side if the machine is in production (I am using a stretched cluster between two distant sites, so I have to choose different storage templates depending on the site where the Virtual Machine is hosted). I make mistakes almost every time I use the PowerVC GUI (sometimes I forget to put the machine name, sometimes the connectivity group). I'm a lazy guy, so I decided to code a script using the PowerVC REST API to create new machines based on a template file. We are planning to give the script to our outsourced teams to allow them to create machines without knowing what PowerVC is \o/. The script takes a file as a parameter and creates the virtual machine:

  • Create a file like the one below with all the information needed for your new virtual machine creation (name, ip address, vlan, host, image, storage connectivity group, ….):
  • # cat test.vm
    name:test
    ip_address:10.16.66.20
    vlan:vlan666
    target_host:Default Group
    image:multi-vol
    storage_connectivity_group:npiv
    virtual_processor:1
    entitled_capacity:0.1
    memory:1024
    storage_template:storage1
    
  • Launch the script and the Virtual Machine will be created:
  • pvcmkvm -f test.vm -p localhost -u root -P mysecretpassword
    name: test
    ip_address: 10.16.66.20
    vlan: vlan666
    target_host: Default Group
    image: multi-vol
    storage_connectivity_group: npiv
    virtual_processor: 1
    entitled_capacity: 0.1
    memory: 1024
    storage_template: storage1
    [info] found image multi-vol with id 041d830c-8edf-448b-9892-560056c450d8
    [info] found network vlan666 with id 5fae84a7-b463-4a1a-b4dd-9ab24cdb66b5
    [info] found host aggregation Default Group with id 1
    [info] found storage template storage1 with id bfb4f8cc-cd68-46a2-b3a2-c715867de706
    [info] found image multi-vol with id 041d830c-8edf-448b-9892-560056c450d8
    [info] found a volume with id b3783a95-822c-4179-8c29-c7db9d060b94
    [info] found a volume with id 9f2fc777-eed3-4c1f-8a02-00c9b7c91176
    JSON Body: {"os:scheduler_hints": {"host_aggregate_id": 1}, "server": {"name": "test", "imageRef": "041d830c-8edf-448b-9892-560056c450d8", "networkRef": "5fae84a7-b463-4a1a-b4dd-9ab24cdb66b5", "max_count": 1, "flavor": {"OS-FLV-EXT-DATA:ephemeral": 10, "disk": 60, "extra_specs": {"powervm:max_proc_units": 32, "powervm:min_mem": 1024, "powervm:proc_units": 0.1, "powervm:max_vcpu": 32, "powervm:image_volume_type_b3783a95-822c-4179-8c29-c7db9d060b94": "bfb4f8cc-cd68-46a2-b3a2-c715867de706", "powervm:image_volume_type_9f2fc777-eed3-4c1f-8a02-00c9b7c91176": "bfb4f8cc-cd68-46a2-b3a2-c715867de706", "powervm:min_proc_units": 0.1, "powervm:storage_connectivity_group": "npiv", "powervm:min_vcpu": 1, "powervm:max_mem": 66560}, "ram": 1024, "vcpus": 1}, "networks": [{"fixed_ip": "10.244.248.53", "uuid": "5fae84a7-b463-4a1a-b4dd-9ab24cdb66b5"}]}}
    {u'server': {u'links': [{u'href': u'https://powervc.lab.chmod666.org:8774/v2/1471acf124a0479c8d525aa79b2582d0/servers/fc3ab837-f610-45ad-8c36-f50c04c8a7b3', u'rel': u'self'}, {u'href': u'https://powervc.lab.chmod666.org:8774/1471acf124a0479c8d525aa79b2582d0/servers/fc3ab837-f610-45ad-8c36-f50c04c8a7b3', u'rel': u'bookmark'}], u'OS-DCF:diskConfig': u'MANUAL', u'id': u'fc3ab837-f610-45ad-8c36-f50c04c8a7b3', u'security_groups': [{u'name': u'default'}], u'adminPass': u'u7rgHXKJXoLz'}}
    

One of the major advantages of using this is batching Virtual Machine creation. By using the script you can create one hundred Virtual Machines in a couple of minutes. Awesome!
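Batching is then just a matter of looping over the description files:

# one .vm description file per machine to create
for f in *.vm; do
  pvcmkvm -f "$f" -p localhost -u root -P mysecretpassword
done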

Working with OpenStack commands

PowerVC is based on OpenStack, so why not use the OpenStack commands to work with PowerVC? It is possible, but I repeat one more time that this is not supported by IBM at all; use this trick at your own risk. I was working with IBM Cloud Manager with OpenStack (ICMO), and a script defining shell variables is provided to "talk" to the ICMO OpenStack. Based on that file I created the same one for PowerVC. Before using any OpenStack commands, create a powervcrc file that matches your PowerVC environment:

# cat powervcrc
export OS_USERNAME=root
export OS_PASSWORD=mypasswd
export OS_TENANT_NAME=ibm-default
export OS_AUTH_URL=https://powervc.lab.chmod666.org:5000/v3/
export OS_IDENTITY_API_VERSION=3
export OS_CACERT=/etc/pki/tls/certs/powervc.crt
export OS_REGION_NAME=RegionOne
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_DOMAIN_NAME=Default

Then source the powervcrc file, and you are ready to play with all Openstack commands:

# source powervcrc

You can then play with OpenStack commands; here are a few nice examples:

  • List virtual machines:
  • # nova list
    +--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
    | ID                                   | Name                  | Status | Task State | Power State | Networks               |
    +--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
    | dc5c9fce-c839-43af-8af7-e69f823e57ca | ghostdev0clouddev1    | ACTIVE | -          | Running     | vlan666=10.16.66.56    |
    | d7d0fd7e-a580-41c8-b3d8-d7aab180d861 | ghostdevto1cloudevto1 | ACTIVE | -          | Running     | vlan666=10.16.66.57    |
    | bf697dfa-f69a-476c-8d0f-abb2fdcb44a7 | multi-vol             | ACTIVE | -          | Running     | vlan666=10.16.66.59    |
    | 394ab4d4-729e-44c7-a4d0-57bf2c121902 | deckard               | ACTIVE | -          | Running     | vlan666=10.16.66.60    |
    | cd53fb69-0530-451b-88de-557e86a2e238 | priss                 | ACTIVE | -          | Running     | vlan666=10.16.66.61    |
    | 64a3b1f8-8120-4388-9d64-6243d237aa44 | rachael               | ACTIVE | -          | Running     |                        |
    | 2679e3bd-a2fb-4a43-b817-b56ead26852d | batty                 | ACTIVE | -          | Running     |                        |
    | 5fdfff7c-fea0-431a-b99b-fe20c49e6cfd | tyrel                 | ACTIVE | -          | Running     |                        |
    +--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
    
  • Reboot a machine:
  • # nova reboot multi-vol
    
  • List the hosts:
  • # nova hypervisor-list
    +----+---------------------+-------+---------+
    | ID | Hypervisor hostname | State | Status  |
    +----+---------------------+-------+---------+
    | 21 | 828641A_XXXXXXX     | up    | enabled |
    | 23 | 828641A_YYYYYYY     | up    | enabled |
    +----+---------------------+-------+---------+
    
  • Migrate a virtual machine (run a live partition mobility operation):
  • # nova live-migration ghostdevto1cloudevto1 828641A_YYYYYYY
    
  • Put a server in maintenance mode and evacuate all its partitions to another host:
  • # nova maintenance-enable --migrate active-only --target-host 828641A_XXXXXX 828641A_YYYYYYY
    
  • Virtual Machine creation (output truncated):
  • # nova boot --image 7100-03-04-cic2-chef --flavor powervm.tiny --nic net-id=5fae84a7-b463-4a1a-b4dd-9ab24cdb66b5,v4-fixed-ip=10.16.66.51 novacreated
    +-------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
    | Property                            | Value                                                                                                                                            |
    +-------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
    | OS-DCF:diskConfig                   | MANUAL                                                                                                                                           |
    | OS-EXT-AZ:availability_zone         | nova                                                                                                                                             |
    | OS-EXT-SRV-ATTR:host                | -                                                                                                                                                |
    | OS-EXT-SRV-ATTR:hypervisor_hostname | -                                                                                                                                                |
    | OS-EXT-SRV-ATTR:instance_name       | novacreated-bf704dc6-00000040                                                                                                                    |
    | OS-EXT-STS:power_state              | 0                                                                                                                                                |
    | OS-EXT-STS:task_state               | scheduling                                                                                                                                       |
    | OS-EXT-STS:vm_state                 | building                                                                                                                                         |
    | accessIPv4                          |                                                                                                                                                  |
    | accessIPv6                          |                                                                                                                                                  |
    | adminPass                           | PDWuY2iwwqQZ                                                                                                                                     |
    | avail_priority                      | -                                                                                                                                                |
    | compliance_status                   | [{"status": "compliant", "category": "resource.allocation"}]                                                                                     |
    | cpu_utilization                     | -                                                                                                                                                |
    | cpus                                | 1                                                                                                                                                |
    | created                             | 2015-08-05T15:56:01Z                                                                                                                             |
    | current_compatibility_mode          | -                                                                                                                                                |
    | dedicated_sharing_mode              | -                                                                                                                                                |
    | desired_compatibility_mode          | -                                                                                                                                                |
    | endianness                          | big-endian                                                                                                                                       |
    | ephemeral_gb                        | 0                                                                                                                                                |
    | flavor                              | powervm.tiny (ac01ba9b-1576-450e-a093-92d53d4f5c33)                                                                                              |
    | health_status                       | {"health_value": "PENDING", "id": "bf704dc6-f255-46a6-b81b-d95bed00301e", "value_reason": "PENDING", "updated_at": "2015-08-05T15:56:02.307259"} |
    | hostId                              |                                                                                                                                                  |
    | id                                  | bf704dc6-f255-46a6-b81b-d95bed00301e                                                                                                             |
    | image                               | 7100-03-04-cic2-chef (96f86941-8480-4222-ba51-3f0c1a3b072b)                                                                                      |
    | metadata                            | {}                                                                                                                                               |
    | name                                | novacreated                                                                                                                                      |
    | operating_system                    | -                                                                                                                                                |
    | os_distro                           | aix                                                                                                                                              |
    | progress                            | 0                                                                                                                                                |
    | root_gb                             | 60                                                                                                                                               |
    | security_groups                     | default                                                                                                                                          |
    | status                              | BUILD                                                                                                                                            |
    | storage_connectivity_group_id       | -                                                                                                                                                |
    | tenant_id                           | 1471acf124a0479c8d525aa79b2582d0                                                                                                                 |
    | uncapped                            | -                                                                                                                                                |
    | updated                             | 2015-08-05T15:56:02Z                                                                                                                             |
    | user_id                             | 0688b01e6439ca32d698d20789d52169126fb41fb1a4ddafcebb97d854e836c9                                                                                 |
    +-------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
    
    

LUN order, remove a boot lun

If you are moving to PowerVC you will probably need to migrate existing machines to your PowerVC environment. One of my customers asked to move machines from old boxes using vscsi to new PowerVC-managed boxes using NPIV. I am doing it with the help of an SVC for the storage side. Instead of creating the Virtual Machine profile on the HMC, and then doing the zoning and masking on the Storage Volume Controller and on the SAN switches, I decided to let PowerVC do the job for me. Unfortunately, PowerVC can't just "carve" a Virtual Machine; if you want it to do the job you have to build a full Virtual Machine (rootvg included). This is what I am doing. During the migration process I have to replace the PowerVC-created lun with the lun used for the migration .... and finally delete the PowerVC-created boot lun. There is a trick to know if you want to do this:

  • Let's say the lun created by PowerVC is the one named "volume-clouddev-test...." and the original rootvg is named "good_rootvg". The Virtual Machine is booted on the "good_rootvg" lun and I want to remove the "volume-clouddev-test...." one:
  • root1

  • You first have to click the “Edit Details” button:
  • root2

  • Then toggle the boot set to "YES" for the "good_rootvg" lun and click move up (the "good_rootvg" lun must be moved to order 1; this is mandatory because the lun in first position can't be detached):
  • root3

  • Toggle the boot set to "NO" for the PowerVC-created rootvg:
  • root4

  • If you try to detach the volume in first position you will get an error:
  • root5

  • When the order is OK, you can detach and delete the lun created by PowerVC:
  • root6
    root7

Conclusion

There are always good things to learn about PowerVC and related AIX topics. Tell me if these tricks are useful for you and I will continue to write posts like this one. You don't need to understand all these details to work with PowerVC; most customers don't. But I'm sure you prefer to understand what is going on "behind the scenes" instead of just clicking a nice GUI. I hope this helps you better understand what PowerVC is made of. And don't be shy: share your tricks with me. Next: more to come about Chef! Up the irons!

Configuration of a Remote Restart Capable partition

How can we move a partition to another machine if the machine or the data center on which the partition is hosted is totally unavailable? This question is often asked by managers and technical people. Live Partition Mobility can't answer this question because the source machine needs to be running to initiate the mobility. I'm sure most of you have implemented a manual solution based on a bunch of scripts recreating the partition profile by hand, but this is hard to maintain, not fully automated, and not supported by IBM. A solution to this problem is to set up your partitions as Remote Restart Capable partitions. This PowerVM feature has been available since the release of VMcontrol (the IBM Systems Director plugin). Unfortunately this powerful feature is not well documented, but it will probably become, in the coming months or year, a must-have on your newly deployed AIX machines. One last word: with the new Power8 machines things are going to change about remote restart; the functionality will be easier to use and a lot of prerequisites are going to disappear. Just to be clear, this post has been written using Power7+ 9117-MMD machines; the only thing you can't do with these machines (compared to Power8 ones) is change an existing partition to be remote restart capable without having to delete and recreate its profile.

Prerequisites

To create and use a remote restart partition on Power7+/Power8 machines you'll need the following prerequisites:

  • A PowerVM Enterprise license (capability "PowerVM remote restart capable" set to true; be careful, there is another capability named "Remote restart capable" which was used by VMcontrol only, so double check you have the right one).
  • A firmware level of 780 or later (all Power8 firmware levels are OK; for Power7, 780 or later is required).
  • Your source and destination machines connected to the same Hardware Management Console; you can't remote restart between two HMCs at the moment.
  • A minimum HMC version of 8r8.0.0. Check that you have the rrstartlpar command (not the rrlpar command, which was used by VMcontrol only).
  • Better than a long post, check this video (don't laugh at me, I'm trying to do my best but this is one of my first videos .... I hope it is good):
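To check the first prerequisite from the HMC command line, something like this does the job (a sketch; the exact capability strings vary between HMC levels, so grep loosely):

$ lssyscfg -r sys -m source-machine -F capabilities | tr ',' '\n' | grep -i remote_restart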

What is a remote restart capable virtual machine?

Better than a long text explaining what it is, check the picture below and follow each number from 1 to 4 to understand what a remote restart partition is:

remote_restart_explanation

Create the profile of your remote restart capable partition: Power7 vs Power8

A good reason to move to Power8 faster than you planned is that you can change a virtual machine to be remote restart capable without having to recreate the whole profile. I don't know why, but at the time of writing this post changing a non remote restart capable lpar into a remote restart capable one is only possible on Power8 systems. If you are using a Power7 machine (like me in all the examples below) be careful to check this option while creating the machine. Keep in mind that if you forget to check the option you will not be able to enable the remote restart capability afterwards, and you will unfortunately have to remove your profile and recreate it; sad but true ...:

  • Don't forget to check the checkbox that allows the partition to be remote restart capable:
  • remote_restart_capable_enabled1

  • After the partition is created you can notice in the I/O tab that remote restart capable partitions are not able to own any physical I/O adapters:
  • rr2_nophys

  • You can check in the properties that the remote restart capable feature is activated:
  • remote_restart_capable_activated

  • If you try to modify an existing profile on a Power7 machine you'll get this error message; on a Power8 machine there will be no problem:
  • # chsyscfg -r lpar -m XXXX-9117-MMD-658B2AD -p test_lpar -i remote_restart_capable=1
    An error occurred while changing the partition named test_lpar.
    The managed system does not support changing the remote restart capability of a partition. You must delete the partition and recreate it with the desired remote restart capability.
    
  • You can verify that some of your lpars are remote restart capable:
  • lssyscfg -r lpar -m source-machine -F name,remote_restart_capable
    [..]
    lpar1,1
    lpar2,1
    lpar3,1
    remote-restart,1
    [..]
    
  • On a Power7 machine the best way to enable remote restart on an already created machine is to delete the profile and recreate it by hand, adding the remote restart attribute:
  • Get the current partition profile:
  • $ lssyscfg -r prof -m s00ka9927558-9117-MMD-658B2AD --filter "lpar_names=temp3-b642c120-00000133"
    name=default_profile,lpar_name=temp3-b642c120-00000133,lpar_id=11,lpar_env=aixlinux,all_resources=0,min_mem=8192,desired_mem=8192,max_mem=8192,min_num_huge_pages=0,desired_num_huge_pages=0,max_num_huge_pages=0,mem_mode=ded,mem_expansion=0.0,hpt_ratio=1:128,proc_mode=shared,min_proc_units=2.0,desired_proc_units=2.0,max_proc_units=2.0,min_procs=4,desired_procs=4,max_procs=4,sharing_mode=uncap,uncap_weight=128,shared_proc_pool_id=0,shared_proc_pool_name=DefaultPool,affinity_group_id=none,io_slots=none,lpar_io_pool_ids=none,max_virtual_slots=64,"virtual_serial_adapters=0/server/1/any//any/1,1/server/1/any//any/1",virtual_scsi_adapters=3/client/2/s00ia9927560/32/0,virtual_eth_adapters=32/0/1659//0/0/vdct/facc157c3e20/all/0,virtual_eth_vsi_profiles=none,"virtual_fc_adapters=""2/client/1/s00ia9927559/32/c050760727c5007a,c050760727c5007b/0"",""4/client/1/s00ia9927559/35/c050760727c5007c,c050760727c5007d/0"",""5/client/2/s00ia9927560/34/c050760727c5007e,c050760727c5007f/0"",""6/client/2/s00ia9927560/35/c050760727c50080,c050760727c50081/0""",vtpm_adapters=none,hca_adapters=none,boot_mode=norm,conn_monitoring=1,auto_start=0,power_ctrl_lpar_ids=none,work_group_id=none,redundant_err_path_reporting=0,bsr_arrays=0,lpar_proc_compat_mode=default,electronic_err_reporting=null,sriov_eth_logical_ports=none
    
  • Remove the partition:
  • $ chsysstate -r lpar -o shutdown --immed -m source-server -n temp3-b642c120-00000133
    $ rmsyscfg -r lpar -m source-server -n temp3-b642c120-00000133
    
  • Recreate the partition with the remote restart attribute enabled:
  • mksyscfg -r lpar -m s00ka9927558-9117-MMD-658B2AD -i 'name=temp3-b642c120-00000133,profile_name=default_profile,remote_restart_capable=1,lpar_id=11,lpar_env=aixlinux,all_resources=0,min_mem=8192,desired_mem=8192,max_mem=8192,min_num_huge_pages=0,desired_num_huge_pages=0,max_num_huge_pages=0,mem_mode=ded,mem_expansion=0.0,hpt_ratio=1:128,proc_mode=shared,min_proc_units=2.0,desired_proc_units=2.0,max_proc_units=2.0,min_procs=4,desired_procs=4,max_procs=4,sharing_mode=uncap,uncap_weight=128,shared_proc_pool_name=DefaultPool,affinity_group_id=none,io_slots=none,lpar_io_pool_ids=none,max_virtual_slots=64,"virtual_serial_adapters=0/server/1/any//any/1,1/server/1/any//any/1",virtual_scsi_adapters=3/client/2/s00ia9927560/32/0,virtual_eth_adapters=32/0/1659//0/0/vdct/facc157c3e20/all/0,virtual_eth_vsi_profiles=none,"virtual_fc_adapters=""2/client/1/s00ia9927559/32/c050760727c5007a,c050760727c5007b/0"",""4/client/1/s00ia9927559/35/c050760727c5007c,c050760727c5007d/0"",""5/client/2/s00ia9927560/34/c050760727c5007e,c050760727c5007f/0"",""6/client/2/s00ia9927560/35/c050760727c50080,c050760727c50081/0""",vtpm_adapters=none,hca_adapters=none,boot_mode=norm,conn_monitoring=1,auto_start=0,power_ctrl_lpar_ids=none,work_group_id=none,redundant_err_path_reporting=0,bsr_arrays=0,lpar_proc_compat_mode=default,sriov_eth_logical_ports=none'
    

Creating a reserved storage device pool

The reserved storage device pool is used to store the configuration data of the remote restart partitions. At the time of writing this post these devices are mandatory and, as far as I know, they are used just to store the configuration and not the state (memory state) of the virtual machine itself (maybe in the future, who knows?). You can't create or boot any remote restart partition if you do not have a reserved storage device pool, so create it before doing anything else:

  • You first have to find a bunch of devices seen by the Virtual I/O Servers of both machines (the source and destination machines used for the remote restart operation). These devices have to be the same on all the Virtual I/O Servers used for the remote restart operation. The lsmemdev command is used to find those devices:
  • vios1$ lspv | grep -iE "hdisk988|hdisk989|hdisk990"
    hdisk988         00ced82ce999d6f3                     None
    hdisk989         00ced82ce999d960                     None
    hdisk990         00ced82ce999dbec                     None
    vios2$ lspv | grep -iE "hdisk988|hdisk989|hdisk990"
    hdisk988         00ced82ce999d6f3                     None
    hdisk989         00ced82ce999d960                     None
    hdisk990         00ced82ce999dbec                     None
    vios3$ lspv | grep -iE "hdisk988|hdisk989|hdisk990"
    hdisk988         00ced82ce999d6f3                     None
    hdisk989         00ced82ce999d960                     None
    hdisk990         00ced82ce999dbec                     None
    vios4$ lspv | grep -iE "hdisk988|hdisk989|hdisk990"
    hdisk988         00ced82ce999d6f3                     None
    hdisk989         00ced82ce999d960                     None
    hdisk990         00ced82ce999dbec                     None
    
    $ lsmemdev -r avail -m source-machine -p vios1,vios2
    [..]
    device_name=hdisk988,redundant_device_name=hdisk988,size=61440,type=phys,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E5000000000000,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E5000000000000,redundant_capable=1
    device_name=hdisk989,redundant_device_name=hdisk989,size=61440,type=phys,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E6000000000000,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E6000000000000,redundant_capable=1
    device_name=hdisk990,redundant_device_name=hdisk990,size=61440,type=phys,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E7000000000000,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E7000000000000,redundant_capable=1
    [..]
    $ lsmemdev -r avail -m dest-machine -p vios3,vios4
    [..]
    device_name=hdisk988,redundant_device_name=hdisk988,size=61440,type=phys,phys_loc=U2C4E.001.DBJN914-P2-C2-T1-W500507680140F32C-L3E5000000000000,redundant_phys_loc=U2C4E.001.DBJN914-P2-C1-T1-W500507680140F32C-L3E5000000000000,redundant_capable=1
    device_name=hdisk989,redundant_device_name=hdisk989,size=61440,type=phys,phys_loc=U2C4E.001.DBJN914-P2-C2-T1-W500507680140F32C-L3E6000000000000,redundant_phys_loc=U2C4E.001.DBJN914-P2-C1-T1-W500507680140F32C-L3E6000000000000,redundant_capable=1
    device_name=hdisk990,redundant_device_name=hdisk990,size=61440,type=phys,phys_loc=U2C4E.001.DBJN914-P2-C2-T1-W500507680140F32C-L3E7000000000000,redundant_phys_loc=U2C4E.001.DBJN914-P2-C1-T1-W500507680140F32C-L3E7000000000000,redundant_capable=1
    [..]
    
  • Create the reserved storage device pool using the chhwres command on the Hardware Management Console (create it on all machines used in the remote restart operation):
  • $ chhwres -r rspool -m source-machine -o a -a vios_names=\"vios1,vios2\"
    $ chhwres -r rspool -m source-machine -o a -p vios1 --rsubtype rsdev --device hdisk988 --manual
    $ chhwres -r rspool -m source-machine -o a -p vios1 --rsubtype rsdev --device hdisk989 --manual
    $ chhwres -r rspool -m source-machine -o a -p vios1 --rsubtype rsdev --device hdisk990 --manual
    $ lshwres -r rspool -m source-machine --rsubtype rsdev
    device_name=hdisk988,vios_name=vios1,vios_id=1,size=61440,type=phys,state=Inactive,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E5000000000000,is_redundant=1,redundant_device_name=hdisk988,redundant_vios_name=vios2,redundant_vios_id=2,redundant_state=Inactive,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E5000000000000,lpar_id=none,device_selection_type=manual
    device_name=hdisk989,vios_name=vios1,vios_id=1,size=61440,type=phys,state=Inactive,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E6000000000000,is_redundant=1,redundant_device_name=hdisk989,redundant_vios_name=vios2,redundant_vios_id=2,redundant_state=Inactive,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E6000000000000,lpar_id=none,device_selection_type=manual
    device_name=hdisk990,vios_name=vios1,vios_id=1,size=61440,type=phys,state=Inactive,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E7000000000000,is_redundant=1,redundant_device_name=hdisk990,redundant_vios_name=vios2,redundant_vios_id=2,redundant_state=Inactive,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E7000000000000,lpar_id=none,device_selection_type=manual
    $ lshwres -r rspool -m source-machine
    "vios_names=vios1,vios2","vios_ids=1,2"
    
  • You can also create the reserved storage device pool from the Hardware Management Console GUI:
  • After selecting the Virtual I/O Servers, click select devices:
  • rr_rsd_pool_p

  • Choose the maximum and minimum sizes to filter the devices you can select for the creation of the reserved storage device pool:
  • rr_rsd_pool2_p

  • Choose the disks you want to put in your reserved storage device pool (put all the devices used by remote restart partitions in manual mode; automatic devices are used by suspend/resume operations or AMS pools, and one device cannot be shared by two remote restart partitions):
  • rr_rsd_pool_waiting_3_p
    rr_pool_create_7_p

  • You can check afterwards that your reserved storage device pool is created and is composed of three devices:
  • rr_pool_create_9
    rr_pool_create_8_p

Select a storage device for each remote restart partition before starting it

After creating the reserved storage device pool you have to select a device from the pool for every partition. This device will be used to store the configuration data of the partition:

  • Note that you cannot start the partition if no device was selected!
  • To select the correct device size, you first have to calculate the needed space for every partition using the lsrsdevsize command. This size is around the max memory value set in the partition profile (don't ask me why):
  • $ lsrsdevsize -m source-machine -p temp3-b642c120-00000133
    size=8498
    
  • Select the device you want to assign to your machine (in my case there was already a device selected for this machine):
  • rr_rsd_pool_assign_p

  • Then select the machine you want to assign the device to:
  • rr_rsd_pool_assign2_p

  • Or do this on the command line:
  • $ chsyscfg -r lpar -m source-machine -i "name=temp3-b642c120-00000133,primary_rs_vios_name=vios1,secondary_rs_vios_name=vios2,rs_device_name=hdisk988"
    $ lssyscfg -r lpar -m source-machine --filter "lpar_names=temp3-b642c120-00000133" -F primary_rs_vios_name,secondary_rs_vios_name,curr_rs_vios_name
    vios1,vios2,vios1
    $ lshwres -r rspool -m source-machine --rsubtype rsdev
    device_name=hdisk988,vios_name=vios1,vios_id=1,size=61440,type=phys,state=Active,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E5000000000000,is_redundant=1,redundant_device_name=hdisk988,redundant_vios_name=vios2,redundant_vios_id=2,redundant_state=Active,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E5000000000000,lpar_name=temp3-b642c120-00000133,lpar_id=11,device_selection_type=manual
    

Launch the remote restart operation

All the remote restart operations are launched from the Hardware Management Console with the rrstartlpar command. At the time of writing this post there is no GUI function to remote restart a machine; you can only do it with the command line:

Validation

As with a Live Partition Mobility move, you can validate a remote restart operation before running it. You can only perform the remote restart operation if the machine hosting the remote restart partition is shut down or in an error state, so the validation is very useful and mandatory to check that your remote restart machines are well configured, without having to stop the source machine:

$ rrstartlpar -o validate -m source-machine -t dest-machine -p rrlpar
$ rrstartlpar -o validate -m source-machine -t dest-machine -p rrlpar -d 5
$ rrstartlpar -o validate -m source-machine -t dest-machine -p rrlpar --redundantvios 2 -d 5 -v

Execution

As I said before, the remote restart operation can only be performed if the source machine is in a particular state. The states that allow a remote restart operation are:

  • Power Off.
  • Error.
  • Error – Dump in progress state.

So the only way to test a remote restart operation today is to shut down your source machine:

  • Shut down the source machine:
  • step1

    $ chsysstate -m source-machine -r sys  -o off --immed
    

    rr_step2_mod

  • You can next check on the Hardware Management Console that the Virtual I/O Servers and the remote restart lpar are in the "Not available" state. You're now ready to remote restart the lpar (if the partition ID is already in use on the destination machine, the next available one will be used) (you have to wait a little before remote restarting the partition, check below):
  • $ rrstartlpar -o restart -m source-machine -t dest-machine -p rrlpar -d 5 -v
    HSCLA9CE The managed system is not in a valid state to support partition remote restart operations.
    $ rrstartlpar -o restart -m source-machine -t dest-machine -p rrlpar -d 5 -v
    Warnings:
    HSCLA32F The specified partition ID is no longer valid. The next available partition ID will be used.
    

    step3
    rr_step4_mod
    step5

Cleanup

When the source machine is ready to be brought up (after an outage for instance), just boot the machine and its Virtual I/O Servers. After the machine is up you can notice that the rrlpar profile is still there, and this can be a huge problem if somebody tries to boot this partition, because it is already running on the other machine after the remote restart operation. To prevent such an error you have to clean up your remote restart partition by using the rrstartlpar command again. Be careful not to check the option that boots the partitions when the machine is started:

  • Restart the source machine and its Virtual I/O Servers:
  • $ chsysstate -m source-machine -r sys -o on
    $ chsysstate -r lpar -m source-machine -n vios1 -o on -f default_profile
    $ chsysstate -r lpar -m source-machine -n vios2 -o on -f default_profile
    

    rr_step6_mod

  • Perform the cleanup operation to remove the profile of the remote restart partition (if you want to LPM your machine back later, you have to keep the device of the reserved storage device pool in the pool; if you do not use the --retaindev option the device will be automatically removed from the pool):
  • $ rrstartlpar -o cleanup -m source-machine -p rrlpar --retaindev -d 5 -v --force
    

    rr_step7_mod

Refresh the partition and profile data

During my tests I encountered a problem: the configuration was not correctly synced between the device used in the reserved storage device pool and the current partition profile. I had to use a command named refdev (for refresh device) to synchronize the partition and profile data to the storage device:

$ refdev -m source-machine -p temp3-b642c120-00000133 -v

What’s in the reserved storage device ?

I'm a curious guy. After playing with remote restart I asked myself a question: what is really stored in the reserved storage device assigned to the remote restart partition ? Looking at the documentation on the internet did not answer my question, so I had to look at it on my own. By "dd'ing" the reserved storage device assigned to a partition I realized that the profile is stored in XML format. Maybe this format is the same as the one used by the HMC 8 templates library. For the moment, during my tests on Power7+ machines, the state of the memory of the partition was not transferred to the destination machine, maybe because I had to shut down the whole source machine to test. Maybe the memory state of the machine is transferred to the destination machine if the source is in error state or is dumping; I had no chance to test this :

root@vios1:/home/padmin# dd if=/dev/hdisk17 of=/tmp/hdisk17.out bs=1024 count=10
10+0 records in
10+0 records out
root@vios1:/home/padmin# more hdisk17.out
[..]
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
BwEAAAAAAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACgDIAZAAAQAEAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" Profile="H4sIAAAAA
98VjxbxEAhNaZEqpEptPS/iMJO4cTJBdHVj38zcYvu619fTGQlQVmxY0AUICSH4A5XYorJgA1I3sGMBCx5Vs4RNd2zgXI89tpNMxslIiRzPufec853zfefk/t/osMfRBYPZRbpuF9ueUTQsShxR1NSl9dvEEPPMMgnfvPnVk
a2ixplLuOiVCHaUKn/yYMv/PY/ydTRuv016TbgOzdVv4w6+KM0vyheMX62jgq0L7hsCXtxBH6J814WoZqRh/96+4a+ff3Br8+o3uTE0pqJZA7vYoKKnOgYnNoSsoiPECp7KzHfELTQV/lnBAgt0/Fbfs4Wd1sV+ble7Lup/c
be0LQj01FJpoVpecaNP15MhHxpcJP8al6b7fg8hxCnPY68t8LpFjn83/eKFhcffjqF8DRUshs0almioaFK0OfHaUKCue/1GcN0ndyfg9/fwsyzQ6SblellXK6RDDaIIwem6L4iXCiCfCuBZxltFz6G4eHed2EWD2sVVx6Mth
eEOtnzSjQoVwLbo2+uEf3T/s2emPv3z4xA16eD0AC6oRN3FXNnYoA6U7y3OfFc1g5hOIiTQsVUHSusSc43QVluEX2wKdKJZq4q2YmJXEF7hhuqYJA0+inNx3YTDab2m6T7vEGpBlAaJnU0qjWofTkj+uT2Tv3Rl69prZx/9s
thQTBMK42WK7XSzrizqFhPL5E6FeHGVhnSJQLlKKreab1l6z9MwF0C/jTi3OfmKCsoczcJGwITgy+f74Z4Lu2OU70SDyIdXg1+JAApBWZoAbLaEj4InyonZIDbjvZGwv3H5+tb7C5tPThQA9oUdsCN0HsnWoLxWLjPHAdJSp
Ja45pBarVb3JDyUJOn3aemXcIqtUfgPi3wCuiw76tMh6mVtNVDHOB+BxqEUDWZGtPgPrFc9oBgBhhJzEdsEVI9zC1gr0JTexhwgThzIwYEG7lLbt3dcPyHQLKQqfGzVsSNzVSvenkDJU/lUoiXGRNrdxLy2soyhtcNX47INZ
nHKOCjYfsoeR3kpm58GdYDVxipIZXDgSmhfCDCPlKZm4dZoVFORzEX0J6CLvK4py6N7Pz94yiXlPBAArd3zqIEtjXFZ4izJzQ44sCv7hh3bTnY5TbKdnOtHGtatTjrEynTuWFNXV3ouaUKIIKfDgE5XrrpWb/SHWyWCbXMM5
DkaHNzXVJws6csK57jnpToLopiQLZdgHJJh9wm+M+wbof7GzSRJBYvAAaV0RvE8ZlA5yxSob4fAiJiNNwwQAwu2y5/O881fvvz3HxgK70ZDwc1FS8JezBgKR0e/S4XR3ta8OwmdS56akXJITAmYBpElF5lZOdlXuO+8N0opU
m0HeJTw76oiD8PS9QfRECUYqk0B1KGkZ+pRGQPUhPFEb12XIoe7u4WXuwdVqTAnZT8gyYrvAPlL/sYG4RkDmAx5HFZpFIVnAz9Lrlyh9tFIc4nZAColOLNGdFRKmE8GJd5zZx++zMiAoTOWNrJvBjODNo1UOGuXngzcHWjrn
LgmkxjBXLj+6Fjy1DHFF0zV6lVH/p+VYO6pbZzYD9/ORFLouy6MwvlGuRz8Qz10ugawprAdtJ4GxWAOtmQjZXJ+Lg58T/fDy4K74bYWr9CyLIVdQiplHPLbjinZRu4BZuAENE6jxTP2zNkBVgfiWiFcv7f3xYjFqxs/7vb0P
 lpar_name="rrlpar" lpar_uuid="0D80582A44F64B43B2981D632743A6C8" lpar_uuid_gen_method="0"><SourceLparConfig additional_mac_addr_bases="" ame_capability="0" auto_start_e
rmal" conn_monitoring="0" desired_proc_compat_mode="default" effective_proc_compat_mode="POWER7" hardware_mem_encryption="10" hardware_mem_expansion="5" keylock="normal
"4" lpar_placement="0" lpar_power_mgmt="0" lpar_rr_dev_desc="	<cpage>		<P>1</P>
		<S>51</S>
		<VIOS_descri
00010E0000000000003FB04214503IBMfcp</VIOS_descriptor>
	</cpage>
" lpar_rr_status="6" lpar_tcc_slot_id="65535" lpar_vtpm_status="65535" mac_addres
x_virtual_slots="10" partition_type="rpa" processor_compatibility_mode="default" processor_mode="shared" shared_pool_util_authority="0" sharing_mode="uncapped" slb_mig_
ofile="1" time_reference="0" uncapped_weight="128"><VirtualScsiAdapter is_required="false" remote_lpar_id="2" src_vios_slot_number="4" virtual_slot_number="4"/><Virtual
"false" remote_lpar_id="1" src_vios_slot_number="3" virtual_slot_number="3"/><Processors desired="4" max="8" min="1"/><VirtualFibreChannelAdapter/><VirtualEthernetAdapt
" filter_mac_address="" is_ieee="0" is_required="false" mac_address="82776CE63602" mac_address_flags="0" qos_priority="0" qos_priority_control="false" virtual_slot_numb
witch_id="1" vswitch_name="vdct"/><Memory desired="8192" hpt_ratio="7" max="16384" memory_mode="ded" min="256" mode="ded" psp_usage="3"><IoEntitledMem usage="auto"/></M
 desired="200" max="400" min="10"/></SourceLparConfig></SourceLparInfo></SourceInfo><FileInfo modification="0" version="1"/><SriovEthMappings><SriovEthVFInfo/></SriovEt
VirtualFibreChannelAdapterInfo/></VfcMappings><ProcPools capacity="0"/><TargetInfo concurr_mig_in_prog="-1" max_msp_concur_mig_limit_dynamic="-1" max_msp_concur_mig_lim
concur_mig_limit="-1" mpio_override="1" state="nonexitent" uuid_override="1" vlan_override="1" vsi_override="1"><ManagerInfo/><TargetMspInfo port_number="-1"/><TargetLp
ar_name="rrlpar" processor_pool_id="-1" target_profile_name="mig3_9117_MMD_10C94CC141109224549"><SharedMemoryConfig pool_id="-1" primary_paging_vios_id="0"/></TargetLpa
argetInfo><VlanMappings><VlanInfo description="VkVSU0lPTj0xClZJT19UWVBFPVZFVEgKVkxBTl9JRD0zMzMxClZTV0lUQ0g9dmRjdApCUklER0VEPXllcwo=" vlan_id="3331" vswitch_mode="VEB" v
ibleTargetVios/></VlanInfo></VlanMappings><MspMappings><MspInfo/></MspMappings><VscsiMappings><VirtualScsiAdapterInfo description="PHYtc2NzaS1ob3N0PgoJPGdlbmVyYWxJbmZvP
mVyc2lvbj4KCQk8bWF4VHJhbmZlcj4yNjIxNDQ8L21heFRyYW5mZXI+CgkJPGNsdXN0ZXJJRD4wPC9jbHVzdGVySUQ+CgkJPHNyY0RyY05hbWU+VTkxMTcuTU1ELjEwQzk0Q0MtVjItQzQ8L3NyY0RyY05hbWU+CgkJPG1pb
U9TcGF0Y2g+CgkJPG1pblZJT1Njb21wYXRhYmlsaXR5PjE8L21pblZJT1Njb21wYXRhYmlsaXR5PgoJCTxlZmZlY3RpdmVWSU9TY29tcGF0YWJpbGl0eT4xPC9lZmZlY3RpdmVWSU9TY29tcGF0YWJpbGl0eT4KCTwvZ2VuZ
TxwYXJ0aXRpb25JRD4yPC9wYXJ0aXRpb25JRD4KCTwvcmFzPgoJPHZpcnREZXY+CgkJPHZEZXZOYW1lPnJybHBhcl9yb290dmc8L3ZEZXZOYW1lPgoJCTx2TFVOPgoJCQk8TFVBPjB4ODEwMDAwMDAwMDAwMDAwMDwvTFVBP
FVOU3RhdGU+CgkJCTxjbGllbnRSZXNlcnZlPm5vPC9jbGllbnRSZXNlcnZlPgoJCQk8QUlYPgoJCQkJPHR5cGU+dmRhc2Q8L3R5cGU+CgkJCQk8Y29ubldoZXJlPjE8L2Nvbm5XaGVyZT4KCQkJPC9BSVg+CgkJPC92TFVOP
gkJCTxyZXNlcnZlVHlwZT5OT19SRVNFUlZFPC9yZXNlcnZlVHlwZT4KCQkJPGJkZXZUeXBlPjE8L2JkZXZUeXBlPgoJCQk8cmVzdG9yZTUyMD50cnVlPC9yZXN0b3JlNTIwPgoJCQk8QUlYPgoJCQkJPHVkaWQ+MzMyMTM2M
DAwMDAwMDAwMDNGQTA0MjE0NTAzSUJNZmNwPC91ZGlkPgoJCQkJPHR5cGU+VURJRDwvdHlwZT4KCQkJPC9BSVg+CgkJPC9ibG9ja1N0b3JhZ2U+Cgk8L3ZpcnREZXY+Cjwvdi1zY3NpLWhvc3Q+" slot_number="4" sou
_slot_number="4"><PossibleTargetVios/></VirtualScsiAdapterInfo><VirtualScsiAdapterInfo description="PHYtc2NzaS1ob3N0PgoJPGdlbmVyYWxJbmZvPgoJCTx2ZXJzaW9uPjIuNDwvdmVyc2lv
NjIxNDQ8L21heFRyYW5mZXI+CgkJPGNsdXN0ZXJJRD4wPC9jbHVzdGVySUQ+CgkJPHNyY0RyY05hbWU+VTkxMTcuTU1ELjEwQzk0Q0MtVjEtQzM8L3NyY0RyY05hbWU+CgkJPG1pblZJT1NwYXRjaD4wPC9taW5WSU9TcGF0
YXRhYmlsaXR5PjE8L21pblZJT1Njb21wYXRhYmlsaXR5PgoJCTxlZmZlY3RpdmVWSU9TY29tcGF0YWJpbGl0eT4xPC9lZmZlY3RpdmVWSU9TY29tcGF0YWJpbGl0eT4KCTwvZ2VuZXJhbEluZm8+Cgk8cmFzPgoJCTxwYXJ0
b25JRD4KCTwvcmFzPgoJPHZpcnREZXY+CgkJPHZEZXZOYW1lPnJybHBhcl9yb290dmc8L3ZEZXZOYW1lPgoJCTx2TFVOPgoJCQk8TFVBPjB4ODEwMDAwMDAwMDAwMDAwMDwvTFVBPgoJCQk8TFVOU3RhdGU+MDwvTFVOU3Rh
cnZlPm5vPC9jbGllbnRSZXNlcnZlPgoJCQk8QUlYPgoJCQkJPHR5cGU+dmRhc2Q8L3R5cGU+CgkJCQk8Y29ubldoZXJlPjE8L2Nvbm5XaGVyZT4KCQkJPC9BSVg+CgkJPC92TFVOPgoJCTxibG9ja1N0b3JhZ2U+CgkJCTxy
UlZFPC9yZXNlcnZlVHlwZT4KCQkJPGJkZXZUeXBlPjE8L2JkZXZUeXBlPgoJCQk8cmVzdG9yZTUyMD50cnVlPC9yZXN0b3JlNTIwPgoJCQk8QUlYPgoJCQkJPHVkaWQ+MzMyMTM2MDA1MDc2ODBDODAwMDEwRTAwMDAwMDAw
ZmNwPC91ZGlkPgoJCQkJPHR5cGU+VURJRDwvdHlwZT4KCQkJPC9BSVg+CgkJPC9ibG9ja1N0b3JhZ2U+Cgk8L3ZpcnREZXY+Cjwvdi1zY3NpLWhvc3Q+" slot_number="3" source_vios_id="1" src_vios_slot_n
tVios/></VirtualScsiAdapterInfo></VscsiMappings><SharedMemPools find_devices="false" max_mem="16384"><SharedMemPool/></SharedMemPools><MigrationSession optional_capabil
les" recover="na" required_capabilities="veth_switch,hmc_compatibilty,proc_compat_modes,remote_restart_capability,lpar_uuid" stream_id="9988047026654530562" stream_id_p
on>
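
Note that the description attributes embedded in this XML (for instance in the VlanInfo and VirtualScsiAdapterInfo elements) are just base64 encoded text. If you are as curious as I am you can decode them, for instance with openssl (a quick sketch, assuming openssl is available; here with the VlanInfo description found above) :

root@vios1:/home/padmin# printf '%s' "VkVSU0lPTj0xClZJT19UWVBFPVZFVEgKVkxBTl9JRD0zMzMxClZTV0lUQ0g9dmRjdApCUklER0VEPXllcwo=" | openssl enc -d -base64 -A
VERSION=1
VIO_TYPE=VETH
VLAN_ID=3331
VSWITCH=vdct
BRIDGED=yes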

About the state of the source machine ?

You have to know this before using remote restart: at the time of writing this post the remote restart feature is still young and has to evolve before being usable in real life. I'm saying this because the FSP of the source machine has to be up to perform a remote restart operation. To be clear, the remote restart feature does not address the total loss of one of your sites. It's just useful to restart the partitions of a system with a problem that is not an FSP problem (a problem with memory DIMMs or with CPUs, for instance). It can be used in your DRP exercises, but not if your whole site is totally down, which is -in my humble opinion- one of the key needs that remote restart has to address. Don't be afraid, read the conclusion ….

Conclusion

This post has been written using Power7+ machines; my goal was to give you an example of remote restart operations: a summary of what it is, how it works, and where and when to use it. I'm pretty sure that a lot of things are going to change about remote restart. First, on Power8 machines you don't have to recreate the partitions to make them remote restart aware. Second, I know that changes are on the way for remote restart on Power8 machines, especially about reserved storage devices and about the state of the source machine. I'm sure this feature has a bright future, and used with PowerVC it can be a killer feature. I hope to see all these changes in the near future ;-). Once again I hope this post helps you.

Building my AIX/Power Home Lab : HMC/iSCSI/Power Saving and other tricks

I'm living in fear ! A few months ago I decided to quit my job for some reasons I can't tell you on this blog. As many of you already know, you know exactly what you quit when you are leaving a place, but you do not exactly know what you will get in the new one. Unfortunately this is the way France works today about jobs in general (just ask yourself why we do not have any Champion for Power Systems in France ?). Anyway, this blog isn't about politics but about technical stuff. So I finally admitted that my professional path will not be a straightforward line, and with this came the feeling that I needed to work and experiment with some things on my own, outside of work. Here are some of the benefits of owning my own Power server at home :

  • I'm the only developer of nmon2graphite. I really need a test platform to add new features and maintain nmon2graphite.
  • I can experiment with things on my own without the fear of breaking some production hosts and finally being fired. Example: I deliberately created a Shared Ethernet Adapter with wrong parameters to create a network loop (with some wireshark traces) to really understand why so many people make this mistake and how it really works. (I'm planning to post about this.)
  • I do not have to wait to experiment with things; I'm the master/keeper in the house, no more firewall or dns problems, I can do what I want !
  • I can break things and do it again and again. Need an IVM ? No problem, less than one hour to build an IVM. Need an HMC … same thing.
  • It's cool when some geek friends come home and I can say "Hey buddy, wanna play on my Power server ?"

I know this is a lost cause but I'll give it a try here. If some IBMers are reading this blog: my P520 Express is running a PowerVM Standard edition, and if someone can provide me a CoD code for an Enterprise edition it would be very cool (am I dreaming ? Is the world really ruled by money ?). Remember that this blog is here to help the Power community, there is no business behind this blog. It's free. I'm blogging in my free time, it's a passion.


It's a long way to the promised land.

Finding a Power server at a cheap price is not easy. My first thought was to ask people who had already bought one for their own use how they did it. I decided to ask Brian Smith, who owns a few Power Systems: how did you get your servers ? He kindly answered my question and told me to check on eBay or in governmental auctions. So I started to check eBay at the beginning of May 2013. In Europe prices were too expensive for my budget (no server cheaper than 5000€), so I decided to check whether prices were cheaper in the US. Price-wise, my advice is to look for a Power 520 Express, or a Power 710 Express if you can afford to spend more money. Before buying anything always :

  • Ask for the Serial Number to check activated options, go on the IBM POD for this http://www-912.ibm.com/pod/pod.
  • Always ask for the PowerVM Edition. In my case my need is PowerVM Standard or Enterprise. Go on this page to check how to identify your PowerVM edition : http://www-01.ibm.com/support/docview.wss?uid=isg3T1010860
  • Always ask for adapters and disks shipped with the server. A lot of servers are sold without any adapter.
  • Always ask for the power supplies shipped with the server. A lot of servers are sold without any power supplies.
  • If you are planning to install dual Virtual I/O Servers check if a split backplane is installed in the system (in my case I have no split backplane, and I'll tell you how to set up dual Virtual I/O Servers in this case.)
  • Check the price ! Don’t get ripped off !

After three months of asking questions and checking eBay every day, I finally found something interesting. The NeuComp company (an IBM Business Partner) sells refurbished Power servers on eBay. I want to thank Katie K. (if you are reading this, Katie, tell me if I can give your real name :-)) for her kindness and for answering all my questions. If you are buying from a foreign country don't make the same mistake as me and be aware of the customs fees (in my case around 15% of the total price); seriously, the DHL guys are ruthless …


What about the Hardware Management Console ?

If you are planning to have a PowerVM Standard/Enterprise edition server it's easier to manage it through a Hardware Management Console. There are two solutions :

  • Find an old Hardware Management Console; on eBay it's not too expensive and quite affordable.
  • Like Scott Vetter points out in this post : The Power Systems Hardware Management Console as a virtual appliance ? it is possible to virtualize a Hardware Management Console or an SDMC. Like him I cannot tell you here how to do it, but it's doable, and it's not my case :-) (don't ask me for this one). Anyway, if you want an SDMC you are totally within legality if you download the SDMC ova file and deploy it on a KVM or VMware infrastructure.

Protect your ears, check your electricity bill : enable Power Saver mode.

Do not forget to enable Power Saver mode if you own a Power6 or Power7 server. First: fan speeds will be decreased and the server will produce less noise. Second: you'll notice a huge difference at the end of the month on your electricity bill if you run the server 24 hours a day ! Note that when Power Saver mode is enabled on a managed system, the processor voltage and clock frequency are lowered to reduce the power consumption of the processors in the managed system.

  • Enabling Power Saver mode (and checking with the pmcycles command on a partition) :
  • # chpwrmgmt -m power520e-8203-EA4 -r sys -o enable
    root@priss.lab.chmod666.org [/root] # pmcycles -Mm
    CPU 0 runs at 3618 MHz
    CPU 1 runs at 3617 MHz
    CPU 2 runs at 3619 MHz
    CPU 3 runs at 3619 MHz
    
  • Disabling Power Saver mode :
  • # chpwrmgmt -m power520e-8203-EA4 -r sys -o disable
    root@priss.lab.chmod666.org [/root] # pmcycles -Mm
    CPU 0 runs at 4201 MHz
    CPU 1 runs at 4201 MHz
    CPU 2 runs at 4202 MHz
    CPU 3 runs at 4202 MHz
    

As you can see in the output above, the processor speed was decreased from 4.2GHz to 3.6GHz. I do not have any recording but I can swear that the fans decreased their speed the moment the command was typed on the Hardware Management Console. Power Saver mode is not a joke, it really works.

Updating/Installation

Firmware update

The server was not shipped with the latest firmware level. If you are working in a big enterprise like me this kind of operation is performed by IBM engineers, so the last time I updated a firmware myself was a couple of months ago, and I did it through IBM Systems Director. This is not the kind of operation I'm used to doing, so here is a quick reminder of how to do it :

  • First of all, copy all the files you downloaded with the new firmware (xml and sdd files included) to the Hardware Management Console :
  • hscroot@gaff:~> ls /home/hscroot/fw
    01EL350_149_038.dd.xml	01EL350_149_038.pd.sdd	    01EL350_149_038.rpm
    01EL350_149_038.html	01EL350_149_038.readme.txt  01EL350_149_038.xml
    
  • You can check with the lslic command whether there is "flashable" firmware in a specific directory :
  • hscroot@gaff:~> lslic -m p520express-8203-E4A-SN0666000 -r mountpoint -d /home/hscroot/fw/ -F retrievable_release concurrent_retrievable_level disruptive_retrievable_level
    None 149 149
    
  • Flash the firmware with the updlic command (note that a copy of the old microcode is backed up in the /var filesystem) :
  • hscroot@gaff:~> updlic -o a -t sys -l latest -m p520express-8203-E4A-SN0666000  -r mountpoint -d /home/hscroot/fw -v
    Current profile data backup files have been copied: 
    8203-E4A*0666000: /var/hsc/profiles/0666000/backupFile_FirmwareUpdate01EL350, /var/hsc/profiles/0666000/directory/backupFile_FirmwareUpdate01EL350.dir
    
    8203-E4A*0666000: Retrieving updates.
    8203-E4A*0666000 Managed System Primary: Retrieving updates.
    8203-E4A*0666000: Installing updates.
    8203-E4A*0666000 Managed System Primary: Preparing for install.
    8203-E4A*0666000: Installing updates.
    8203-E4A*0666000 Managed System Primary: Installing updates.
    8203-E4A*0666000 Managed System Primary: Writing update files.
    8203-E4A*0666000 Managed System Primary: Writing file 80a00020.  0 bytes written.
    8203-E4A*0666000 Managed System Primary: Writing file 80a00701.  5959118 bytes written.
    8203-E4A*0666000 Managed System Primary: Writing file 80a00701.  16043962 bytes written.
    8203-E4A*0666000 Managed System Primary: Writing file 80a00711.  7858212 bytes written.
    8203-E4A*0666000 Managed System Primary: Writing file 81cf0689.  0 bytes written.
    8203-E4A*0666000 Managed System Primary: Writing file 81e00100.  2684818 bytes written.
    8203-E4A*0666000 Managed System Primary: Writing file 81e00101.  2684818 bytes written.
    8203-E4A*0666000 Managed System Primary: Writing file 81e00200.  65378 bytes written.
    8203-E4A*0666000 Managed System Primary: Writing file 81e00200.  5304258 bytes written.
    8203-E4A*0666000 Managed System Primary: Writing file 81e00400.  0 bytes written.
    8203-E4A*0666000 Managed System Primary: Writing file 81e00704.  0 bytes written.
    8203-E4A*0666000 Managed System Primary: Writing file a0e00c21.  0 bytes written.
    8203-E4A*0666000 Managed System Primary: Finished writing update files.
    8203-E4A*0666000: Activating updates.
    8203-E4A*0666000 Managed System Primary: Activating updates - Restarting Flexible Service Processor.
    8203-E4A*0666000 Managed System Primary: Flexible Service Processor Restart completed successfully.
    8203-E4A*0666000 Managed System Primary: Activation completed.
    8203-E4A*0666000: Completed All Updates.
    

Virtual I/O Server install through the Hardware Management Console.

To install my Virtual I/O Servers I decided to use the new integrated Hardware Management Console feature. This feature allows you to install the Virtual I/O Server through the Hardware Management Console the first time you boot it. Unfortunately you need to have all the NIM ports opened, because it's just a graphical interface on top of the old well known installios command, based on the old nimol product and running on the HMC (see the command line sketch after the list below). What a shame, I was dreaming of a fully integrated installation without the need for bootp/NFS and so on. So I'm very disappointed by this new feature :-(.

  • You first have to download the Virtual I/O Server media to the Hardware Management Console; check that you have enough space left on the HMC before importing images.
  • Wait until the images are correctly uploaded to the Hardware Management Console.
  • If the images were correctly uploaded to the HMC, you'll see them in the Virtual I/O Server image repository.
  • When you boot your Virtual I/O Server the HMC will ask you if you want to install a Virtual I/O Server on it.
  • The Virtual I/O Server will boot and you'll be asked for the IP address to use and the image you want to install; in this case set the HMC IP as the NIM server (the HMC is a nimol server).
  • After this step the installation will begin and the output will be visible in the HMC window.
  • When the installation is finished you can check the log output in the HMC window.
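
If you prefer the command line, the same thing can be done with installios directly from the HMC restricted shell. As far as I remember, launched without any argument it simply prompts you for everything it needs (managed system, partition and profile, VIOS image, client IP, netmask and gateway); this is only a sketch, the wizard does the rest :

hscroot@gaff:~> installios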

Need a cheap SAN : Use iSCSI

This is the trickiest part of this post. My Power server was not shipped with a split backplane, so my only solution was to create an iSCSI target and to boot my Virtual I/O Server on iSCSI. The iSCSI target partition was created on an AIX lpar and I decided to use file backed luns for backup convenience (I just have to copy the file to save my Virtual I/O Server rootvg :-)).
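
For instance, backing up the Virtual I/O Server rootvg is just a file copy on the target lpar (a minimal sketch; the /backup destination is an example, and the Virtual I/O Server should be shut down to get a consistent copy) :

# cp /luns/lun1 /backup/lun1.$(date +%Y%m%d)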

Creating target

Filesets for the iSCSI target can be found on the AIX Expansion Pack DVD. You have to install these filesets on the iSCSI target lpar before trying to configure it.

# installp -aXYgd . devices.tmiscsw.rte 
[..]
# lslpp -Lc | grep devices.tmiscsw.rte 
devices.tmiscsw:devices.tmiscsw.rte:7.1.2.0: : :C: :iSCSI Target Software Device Driver : : : : : : :0:0:/:1241
  • To create the target follow these steps :
  • # mkdev -c driver -s node -t tmsw 
    tmsw0 Available
    # mkdev -c tmiscsi -s tmtarget -t target -p tmiscsi0 -a owner='tmsw0' -a iscsi_name="inq.priss.lab.chmod666.org"  
    target0 Available
    
  • To create a lun follow these steps. In my case I chose to create a file backed iSCSI lun: I used the dd command to create the lun itself, and then added it to the iSCSI configuration :
  • # dd if=/dev/zero of=/luns/lun1  bs=1m count=10k
    10240+0 records in.
    10240+0 records out.
    # ls -l /luns/lun1
    -rw-r--r--    1 root     system   10737418240 Dec 14 10:22 /luns/lun1
    # mkdev -c tmiscsi -s tmdev -t lu -p 'target1' -l 'lun1' -a back_dev_type='file' -a back_dev='lun1' -a back_file='/luns/lun1'
    lun1 Available
    # lsattr -El lun1
    back_dev        lun1       Backing Device Name                           True
    back_dev_option            N/A                                           True
    back_dev_type   file       Backing Device Type                           True
    back_file       /luns/lun1 Backing File Full Path Name                   True
    dev_size                   N/A                                           True
    lun_id          0x0        Logical Unit Number ID                        False
    queue_depth     3          Maximum Number of Commands to Queue to Device True
    

Initiator configuration

  • Give a name to your initiator :
  • # chdev  -l 'iscsi0' -a initiator_name='inq.deckard'
    
  • Modify the /etc/iscsi/targets file to add the address and port of the iSCSI target, then run cfgmgr :
  • # echo "192.168.0.21 3260 inq.priss.lab.chmod666.org" > /etc/iscsi/targets
    # tail -1 /etc/iscsi/targets
    192.168.0.21 3260 inq.priss.lab.chmod666.org
    # cfgmgr -vl iscsi0
    # lspv
    hdisk0          00ce7e539a9ee2fb                    rootvg          active      
    hdisk1          00ce7e53d67b71ca                    None 
    # lsdev -Cc disk | grep hdisk1 
    hdisk1 Available  Other iSCSI Disk Drive
    

Boot on iSCSI

If like me you decided to boot on iSCSI, follow the same steps as above to configure the target. Then boot your partition to the SMS menu and begin the installation. The configuration of the iSCSI initiator is performed in the BOS installation menu; follow the steps below to configure it, and then boot on iSCSI :

  • When the menu comes up, choose Configure Network Disks (iSCSI) :
  •                            Welcome to Base Operating System
                          Installation and Maintenance
    
    Type the number of your choice and press Enter. Choice is indicated by >>>.
    
    >>> 1 Start Install Now with Default Settings
        2 Change/Show Installation Settings and Install
        3 Start Maintenance Mode for System Recovery
        4 Configure Network Disks (iSCSI)
        5 Select Storage Adapters
    
        88  Help ?
        99  Previous Menu
    
    >>> Choice [1]: 4  
    
  • Then choose Configure iSCSI :
  •                                 Configure iSCSI
    
    Move cursor to desired item and press Enter.
    
      Configure iSCSI
      Network Utilities
    
  • Set your iSCSI target name, IP, port and lun number; if all is ok you'll see the iSCSI block devices presented to your server as hdisk devices :
  •                       iSCSI Configuration -- SW Initiator
    
    Type or select values in entry fields.
    Press Enter AFTER making all desired changes.
    
                                                            [Entry Fields]
    * Ethernet Interface                                  en0
    * IP Address of Ethernet Interface                    192.168.0.6
    * Network Mask of Ethernet Interface                  255.255.255.0
    * Gateway to iSCSI Target                             192.168.0.254
    * iSCSI Target Name                                  [inq.priss]
    * IP Address of iSCSI Target                         [192.168.0.21]
    * Port Number of iSCSI Target                        [3260]
    * iSCSI Initiator Name                               [inq.deckard]
    
                                     COMMAND STATUS
    
    Command: OK            stdout: yes           stderr: no
    
    Before command completion, additional instructions may appear below.
    
    en0 changed
    inet0 changed
    iscsi0 changed
    hdisk0 Available  N/A
    hdisk1 Available  N/A
    hdisk2 Available  N/A
    hdisk3 Available  N/A
    hdisk4 Available  N/A
    
  • After this step you can choose your disks. Do it the same way as a normal installation.
  • If you are checking the bootlist on a server booting on iSCSI, use the -v option to get a detailed output :
  • # bootlist -o -m normal
    hdisk0 
    # bootlist -o -m normal -v
    'ibm,max-boot-devices' = 0x5
    NVRAM variable: (boot-device=/lhea@23c00200/ethernet@23e00200:iscsi,ciaddr=192.168.0.6,subnet-mask=255.255.255.0,itname=inq.deckard,iport=3260,ilun=0,iname=inq.priss,siaddr=192.168.0.21,2)
    Path name: (/lhea@23c00200/ethernet@23e00200:iscsi,ciaddr=192.168.0.6,subnet-mask=255.255.255.0,itname=inq.deckard,iport=3260,ilun=0,iname=inq.priss,siaddr=192.168.0.21,2)
    match_specific_info: ut=disk/iscsi/osdisk
    hdisk0 
    
  • I had to change my iSCSI target IP address for a dumb reason, and after this change my lpars booting on iSCSI were not able to find their boot disk anymore. The only solution I found was to change the boot-device NVRAM variable at the Open Firmware prompt :
  • 0 > printenv 
    [..]
    boot-device              /lhea@23c00200/ethernet@23e00200:iscsi,ciaddr=8.0.0.2,subnet-mask=255.255.255.0,itname=iqn.localhost.hostid.00000000,iport=3260,ilun=0,iname=inq.priss,siaddr=8.0.0.1,2 
    [..]
    0 > setenv boot-device /lhea@23c00200/ethernet@23e00200:iscsi,ciaddr=192.168.0.6,subnet-mask=255.255.255.0,itname=inq.deckard,,iport=3260,ilun=0,iname=inq.priss,siaddr=192.168.0.21,2  ok
    0 > printenv 
    [..]
    boot-device              /lhea@23c00200/ethernet@23e00200:iscsi,ciaddr=192.168.0.6,subnet-mask=255.255.255.0,itname=inq.deckard,,iport=3260,ilun=0,iname=inq.priss,siaddr=192.168.0.21,2 
    [..]
    0 > boot -v
    iSCSI BOOT ---------------------------------------------------
    Server IP.....................192.168.0.21
    Client IP.....................192.168.0.6
    Subnet Mask...................255.255.255.0
    iSCSI Initiator...............inq.deckard
    iSCSI Target..................inq.priss
    Target Port...................3260 
    Target LUN....................0 
    
    \
    Elapsed time since release of system processors: 45 mins 24 secs
    

If you are planning to buy your own Power server, don't hesitate to contact me if you have any questions. It really gives me the opportunity to test things I was not able to do at work: PowerHA migration from 6 to 7, PowerSC, some HMC features, some NIM features (suma); the possibilities are infinite. If you want to contribute to the rise of chmod666.org all over the AIX world, I'm now looking for :

  • A Cisco manageable switch with ios and 16 ports.
  • A Power7 based server (any model).
  • An old DS* array.
  • A split backplane.
  • Any old hardware.

Don't hesitate to contact me if you are trashing old machines or switches; I'm interested if you want to make me a gift :-), and I can buy it if you do not want to give it away :-). Once again I hope this post will help you or will make you dream. I'm living in fear, it's hard to cope with :-(.

NIM Less known features : HANIM, nimsh over ssl, DSM

The Network Installation Manager server is one of the most important hosts in an environment. New machine installations, machine backups, backup restorations, software (filesets) and third party product installations, and in some cases volume group backups are made from the NIM server. Some best practices have to be respected, and I'll give you in this post a few tricks for NIM. First of all, a NIM server has to be in your disaster recovery plan because it is the first server needed when you have to rebuild a crashed machine: my solution is HANIM. It has to be secured (nimsh, and nimsh authentication over SSL), and it has to be flexible and automated (DSM).

NIM High Availability : HANIM

Finding documentation and information about NIM high availability is not so easy. I recommend you check the NIM from A to Z Redbook, it's one of the only viable sources for HANIM. HANIM is simple to set up and simple to use, but there are a few things to know and understand about it :

HANIM Overview

  • The alternate NIM master is a backup NIM server built from the NIM master.
  • Takeover operations from master to alternate are manual. PowerHA can be used to run these takeover operations but my advice is not to use it. A takeover can be performed even if the NIM master is down; HANIM does not perform any heartbeat.
  • HANIM only provides a method for replicating the NIM database and resources. Resources can be replicated from master to alternate: the NIM database AND the resource data can be replicated (replicate=yes option).
  • My advice is to run every NIM operation from the master (even if it is possible to run a NIM operation from the alternate).
  • Disks are not shared between the master and the alternate; when a sync operation is done, missing resources are copied over NFS from the master to the alternate, or from the alternate to the master. HANIM does not provide a filesystem takeover.
  • A takeover operation modifies all the nim clients' /etc/niminfo files. NIM_MASTER_HOSTNAME_LIST is modified by the takeover operation and the alternate NIM master is moved to the first position, and NIM_MASTER_HOSTNAME is set to the alternate NIM master hostname.


Initial setup

On the NIM master and on the alternate NIM master some filesets have to be installed; check for the presence of bos.sysmgt.nim.master, bos.sysmgt.nim.spot and bos.sysmgt.nim.client. The NIM master and the alternate NIM master must be on the same AIX version :

# lslpp -l | grep -i nim
  bos.sysmgt.nim.client     7.1.2.15  COMMITTED  Network Install Manager -
  bos.sysmgt.nim.master     7.1.2.15  COMMITTED  Network Install Manager -
  bos.sysmgt.nim.spot       7.1.2.15  COMMITTED  Network Install Manager - SPOT
  bos.sysmgt.nim.client     7.1.2.15  COMMITTED  Network Install Manager -
# oslevel -s
7100-02-02-1316

Configure the NIM master

Initialize the NIM master with the nimconfig command; you'll need to name the first network used by NIM. The nimesis daemons are started at this step.

# nimconfig -a pif_name=en0 -a netname=10-10-20-0-s24-net -a master_port=1058 -a verbose=3 -a cable_type=N/A
[..]
Checking input attributes.
attr_ass:
        'cpuid' => '00F359164D00'
        'pif_name' => 'en0'
        'netname' => '10-10-20-0-s24-net'
        'master_port' => '1058'
        'cable_type' => 'N/A'
        'net_addr' => '10.10.20.1'
        'snm' => '255.255.255.0'
        'adpt_addr' => '667C70F7A904'
        'adpt_name' => 'ent0'
Making sure the NIM Master package is OK.
      set_state: id=1361463886; name=; state_attr=85; new_state=5;
   checking the object definition of ;
   checking interface info for master;
Built NIM infomation file.
      10.10.20.1 is known as nim_master
Adding default route 10.10.20.254 to network object
         0 - /usr/lpp/bos.sysmgt/nim/methods/m_mknet
         1 - -anet_addr=10.10.20.1
         2 - -asnm=255.255.255.0
         3 - -tent
         4 - -arouting1=default 10.10.20.254
         5 - 10-10-20-0-s24-net
Connecting NIM master to master network.
         0 - /usr/lpp/bos.sysmgt/nim/methods/m_chmaster
         1 - -aif1=10-10-20-0-s24-net nim_master 667C70F7A904
         2 - -amaster_port=1058
         3 - -aregistration_port=1059
         4 - -acable_type1=N/A
         5 - master
Adding NIM deamons to SRC and starting....
0513-071 The nimesis Subsystem has been added.
0513-071 The nimd Subsystem has been added.
0513-059 The nimesis Subsystem has been started. Subsystem PID is 9568296.
[..]

NIM resources such as the spot, lpp_source and so on can be created right now; please refer to the NIM cheat sheet by chmod666.org ;-). For the purpose of this post some resources (spot, lpp_source, mksysb, network) were created; these are the ones that will be replicated later.
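
For reference, here is a minimal sketch of how such resources can be defined (the locations and names are examples from my own environment) :

# nim -o define -t lpp_source -a server=master -a source=/dev/cd0 -a location=/export/nim/lpp_source/7100-02-02-1316-lpp_source 7100-02-02-1316-lpp_source
# nim -o define -t spot -a server=master -a source=7100-02-02-1316-lpp_source -a location=/export/nim/spot 7100-02-02-1316-spot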

Configure the alternate NIM master

The NIM alternate master is configured with the niminit command. If you check the NIM from A to Z Redbook, page 124, a note warns you about the synchronization: "At the time of writing, only rsh/rshd communication is supported for NIM synchronization.". THIS STATEMENT IS FALSE: I'm using nimsh for the synchronization, and I recommend using it. We are in 2013, do not use rsh anymore.

# niminit -a is_alternate=yes -a master=nim_master -a pif_name=en0 -a cable_type1=N/A -a connect=nimsh -a name=nim_alternate
0513-071 The nimesis Subsystem has been added.
0513-071 The nimd Subsystem has been added.
0513-059 The nimesis Subsystem has been started. Subsystem PID is 10944522.
nimsh:2:wait:/usr/bin/startsrc -g nimclient >/dev/console 2>&1
0513-044 The nimsh Subsystem was requested to stop.
0513-059 The nimsh Subsystem has been started. Subsystem PID is 5963998.

Verification

You're done with the configuration; you can now start to synchronize, replicate and take over … pretty easy. Here are some points you can verify :

  • On the NIM master, the attribute is_alternate is set to yes :
  • # lsnim -l master
    [..]
       is_alternate        = yes
    [..]
    
  • On the NIM master, a new machine object of type alternate_master is created :
  • # lsnim -t alternate_master
    nim_alternate     machines       alternate_master
    
  • After the first database synchronization, a new machine object of type alternate_master is also created on the alternate NIM master; this one is the NIM master :
  • # lsnim -t alternate_master
    nim_master     machines       alternate_master
    
  • On the alternate NIM master, the attribute is_alternate does not exist :
  • # lsnim -l master | grep alternate
    

Synchronization and replication

The NIM master and the alternate NIM master can now communicate with each other; some resources are created on the master, and it's now time to synchronize. Remember: HANIM only provides a method for replicating the NIM database and resources. You can -if you want- synchronize the NIM database only, or the NIM database and its resources (data included). Remember: never perform a NIM synchronization from the alternate NIM master.

Database synchronization only

The database synchronization is useful when objects are modified, for example when you are modifying the subnet mask of a network object. It can also be useful when objects "without files" are created, for instance a machine. On the other hand, if you are trying to synchronize the database when an object "with a file" exists, such as an lpp_source, a spot, or an fb_script, this one will not be created; you have to copy the file before synchronizing, or use the replicate attribute :

  • On NIM master two objects are created, an fb_script and a machine:
  • # nim -o define -t fb_script -a server=master -a location=/export/nim/others/postinstall/fb_script.ksh fb_script01
    # ls -l /export/nim/others/postinstall/fb_script.ksh
    -rw-r--r--    1 root     system           35 Mar  8 18:01 /export/nim/others/postinstall/fb_script.ksh
    # lsnim ruby
    ruby     machines       standalone
    
  • A database synchronization is performed :
  • # nim -o sync -a force nim_alternate
    [..]
    The level of the NIM master fileset on this machine is: 7.1.2.15
    The level of the NIM database backup is: 7.1.2.15
    [..]
    Checking NIM resources
      Removing fb_script01
        0518-307 odmdelete: 1 objects deleted. from nim_attr (serves attr)
        0518-307 odmdelete: 0 objects deleted. from nim_attr (group memberships)
        0518-307 odmdelete: 5 objects deleted. from nim_attr (resource attributes)
        0518-307 odmdelete: 1 objects deleted. from nim_object (resource object)
      Finished removing fb_script01
    
  • On the alternate NIM master, the machine object is here but the fb_script was not replicated because the file was not present on the alternate NIM master :
  • # lsnim ruby
    ruby     machines       standalone
    # lsnim fb_script01
    0042-053 lsnim: there is no NIM object named "fb_script01"
    
  • If you copy the file before synchronizing, the resource will be created :
  • master# scp fb_script.ksh nim_alternate:/export/nim/others/postinstall
    fb_script.ksh                      100%   35     0.0KB/s   00:00
    
    master# nim -o sync -a force nim_alternate
    [..]
    Restoring the NIM database from /tmp/_nim_dir_13041674/mnt0
    x ./etc/NIM.level, 9 bytes, 1 tape blocks
    [..]
      Keeping fb_script01
    
    alternate# lsnim fb_script01
    fb_script01     resources       fb_script
    

Synchronization with replication

I encourage you not to use the database synchronization alone, but to use it with replication; it does the same job but copies the files for you. Much, much easier: just add the replicate=yes attribute to the nim command, it works like a charm :

    # lsnim -q sync alternate_master
    
    the following attributes are optional:
            -a verbose=
            -a replicate=
            -a reset_clients=
    # nim -o sync -a force=yes -a replicate=yes alternate_master
    

Takeover

If the NIM master is down, a takeover operation allows the alternate NIM master to become the NIM master for the clients. On the clients the /etc/niminfo file is modified (the NIM_MASTER_HOSTNAME and NIM_MASTER_HOSTNAME_LIST attributes are modified).

    • /etc/niminfo and lsnim output before a takeover operation :
    • client# grep -E "NIM_MASTER_HOSTNAME_LIST|NIM_MASTER_HOSTNAME" /etc/niminfo
      export NIM_MASTER_HOSTNAME=nim_master
      export NIM_MASTER_HOSTNAME_LIST="nim_master nim_alternate"
      master# lsnim -l client | grep current_master
         current_master = nim_master
      
    • Takeover operation is initiated from the alternate NIM master :
    • alternate# nim -o takeover -a show_progress=yes nim_master
      +-----------------------------------------------------------------------------+
                            Performing "reset" Operation
      +-----------------------------------------------------------------------------+
      +-----------------------------------------------------------------------------+
                            "reset" Operation Summary
      +-----------------------------------------------------------------------------+
       Target                  Result
       ------                  ------
       client                   RESET
       client1                  RESET
       [..]
      +-----------------------------------------------------------------------------+
                            Initiating "takeover" Operation
      +-----------------------------------------------------------------------------+
       Initiating the takeover operation on machine 1 of 240: client ...
      
       Initiating the takeover operation on machine 2 of 240: client1...
      [..]
      +-----------------------------------------------------------------------------+
                            "takeover" Operation Summary
      +-----------------------------------------------------------------------------+
       Target                  Result
       ------                  ------
       client                  SUCCESS
       client1                 SUCCESS
      [..]
      alternate# lsnim -l client | grep current_master
         current_master = nim_alternate
      client# grep -E "NIM_MASTER_HOSTNAME_LIST|NIM_MASTER_HOSTNAME" /etc/niminfo
      export NIM_MASTER_HOSTNAME=nim_alternate
      export NIM_MASTER_HOSTNAME_LIST="nim_alternate nim_master"
      
    • When the NIM master is up, initiate the takeover for the master :
    • # nim -o takeover -a show_progress=yes nim_alternate
      

Synchronization automation and other files ?

I recommend running a NIM synchronization every day; I personally have a cronjob doing it every day at eleven PM. Most of the time a NIM synchronization is not enough and you'll need to synchronize other files: in my case my root .profile and my /etc/hosts file, in your case whatever you want. For this need I'm using a little script based on rsync which synchronizes my master to my alternate every day :

    # crontab -l
    [..]
    0 23 * * * /export/nim/others/tools/do_sync.ksh >/dev/null 2>&1
    [..]
    # cat /export/nim/others/tools/do_sync.ksh
    [..]
        nim -o sync -a force=yes -a replicate=yes -a reset_clients=yes ${alternate}
        /export/nim/others/tools/sync_to_alternate.ksh
    [..]
    # cat /export/nim/others/tools/sync_to_alternate.ksh
    [..]
      /usr/bin/rsync -ave ssh ${a_filesystem} ${alternate_nim_master}:${a_filesystem}
    [..]
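
Both scripts are heavily stripped above; here is a minimal sketch of what the whole thing can look like in a single script (the hostname and the synchronized files are examples) :

#!/usr/bin/ksh
# daily HANIM synchronization (sketch)
alternate=nim_alternate
# replicate the NIM database and all the resource files to the alternate master
nim -o sync -a force=yes -a replicate=yes -a reset_clients=yes ${alternate}
# then push the few files NIM does not manage (.profile, /etc/hosts, ...)
for a_filesystem in /.profile /etc/hosts /export/nim/others; do
    /usr/bin/rsync -ave ssh ${a_filesystem} ${alternate}:${a_filesystem}
done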
    

NIM security : use nimsh and use it over SSL

nimsh over SSL

NIM master configuration for nimsh over SSL

From the NIM master enable SSL support through the nimconfig command; certificates will be generated in /ssl_nimsh/keys. The OpenSSL fileset has to be installed :

    • Check OpenSSL filesets :
    • # lslpp -l | grep openssl
        openssl.base            0.9.8.2400  COMMITTED  Open Secure Socket Layer
        openssl.license         0.9.8.2400  COMMITTED  Open Secure Socket License
        openssl.man.en_US       0.9.8.2400  COMMITTED  Open Secure Socket Layer
        openssl.base            0.9.8.2400  COMMITTED  Open Secure Socket Layer
      
    • Use nimconfig to enable SSL support :
    • # nimconfig -c
      0513-029 The tftpd Subsystem is already active.
      Multiple instances are not supported.
      NIM_MASTER_HOSTNAME=nim_master
      x - /usr/lib/libssl.so.0.9.8
      x - /usr/lib/libcrypto.so.0.9.8
      Target "all" is up to date.
      Generating a 1024 bit RSA private key
      ......++++++
      .++++++
      writing new private key to '/ssl_nimsh/keys/rootkey.pem'
      -----
      Signature ok
      subject=/C=US/ST=Texas/L=Austin/O=ibm.com/CN=Root CA
      Getting Private key
      Generating a 1024 bit RSA private key
      ...............++++++
      .......++++++
      writing new private key to '/ssl_nimsh/keys/clientkey.pem'
      -----
      Signature ok
      subject=/C=US/ST=Texas/L=Austin/O=ibm.com
      Getting CA Private Key
      Generating a 1024 bit RSA private key
      ......++++++
      .............++++++
      writing new private key to '/ssl_nimsh/keys/serverkey.pem'
      -----
      Signature ok
      subject=/C=US/ST=Texas/L=Austin/O=ibm.com
      Getting CA Private Key
      
    • Check the NIM master : attribute ssl_support is now set to yes :
    • # lsnim -l master | grep ssl_support
         ssl_support         = yes
      

NIM alternate master for nimsh over SSL

If you're using an alternate NIM master repeat the same operation on it (install OpenSSL and run nimconfig -c). The alternate NIM master is also a client of the NIM master, so its client side has to be configured too :

    # nimclient -c
    x - /usr/lib/libssl.so.0.9.8
    x - /usr/lib/libcrypto.so.0.9.8
    Received 2763 Bytes in 0.0 Seconds
    0513-044 The nimsh Subsystem was requested to stop.
    0513-077 Subsystem has been changed.
    0513-059 The nimsh Subsystem has been started. Subsystem PID is 9502954.
    

Client configuration

Configure all the nimclients to use SSL encrypted authentication; if you are using an alternate NIM master do not forget to download the alternate's certificates on the clients :

    # rmitab nimsh 2>/dev/null 
    # rm -rf /etc/niminfo
    # niminit -aname=$(hostname) -a master=nim_master -a master_port=1058 -a registration_port=1059 -a connect=nimsh
    # nimclient -c
    # nimclient -o get_cert -a master_name=nim_alternate
    # stopsrc -s nimsh
    # startsrc -s nimsh
    

On the NIM server itself the client's connect attribute is now set to "nimsh (secure)" :

    # lsnim -l ruby | grep connect
       connect        = nimsh (secure)
    

Are the data encrypted ?

Check this statement in the NIM from A to Z Redbook at page 434 :

"Any communication initiated from the NIM client (pull operation) reaches the NIM master on the request for services and registration ports (1058 and 1059, respectively). This communication is not encrypted. For any communication initiated from the NIM master (push operations), the NIM master communicates with the NIM client using the NIMSH daemon. This allows an encrypted handshake dialog during authentication. However, data packets are not encrypted."

To sum up :

    • Only push operations can use secure nimsh.
    • Data packets are not encrypted.
    • Secure nimsh just adds an encrypted handshake between the NIM master and its clients.

Have a look at these two screenshots; the first one is the tcp stream of a non-secure push operation, the second one is a secured one :

    • Non secure tcp stream of a push operation :
    • Secure tcp stream of a push operation :
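
If you want to reproduce this check yourself, capture the traffic of a push operation on the client. This is only a sketch, assuming nimsh listens on its default ports as defined in /etc/services (3901/tcp for nimsh, 3902/tcp for nimaux); with a non-secure configuration you will see the commands in clear text in the capture, while the secure configuration shows an SSL handshake at the beginning of the session :

# tcpdump -i en0 -A port 3901 or port 3902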

Distributed Systems Management

Distributed Systems Management (we'll call it DSM from now on) is a set of tools and programs used to enhance NIM capabilities. I personally use DSM for two main purposes: opening and monitoring consoles through the dconsole utility, and automating my installations. DSM adds new objects to the NIM environment, and new attributes to the NIM objects. You can also gain more control over your lpars and directly restart or maint_boot an lpar through NIM by using DSM. Hardware Management Consoles (HMC objects) and Pseries frames (CEC objects) can be added to NIM, and profile management is added to standalone objects in order to take advantage of DSM with NIM.

There are two main sources of information for DSM :

    • The dsm.core fileset comes with a pdf file named dsm_tech_note.pdf (see chapter 5, page 161) :
    • # lslpp -f dsm.core | grep dsm_tech_note.pdf
                              /opt/ibm/sysmgt/dsm/doc/dsm_tech_note.pdf
      
    • There are fully detailed examples in the IBM AIX Version 7.1 Differences Guide.

Filesets prerequisites

Starting with AIX 6.1 TL3, the base installation media are shipped with the DSM packages (dsm.core). expect, tcl, tk, and xterm are needed by these DSM packages :

    # lslpp -l | grep -E "dsm|tcl|tk|expect|xterm"
      X11.apps.aixterm           7.1.2.0  COMMITTED  AIXwindows aixterm Application
      X11.apps.xterm            7.1.2.15  COMMITTED  AIXwindows xterm Application
      X11.msg.en_US.apps.aixterm
                                 7.1.2.0  COMMITTED  AIXwindows aixterm Messages -
      dsm.core                  7.1.2.15  COMMITTED  Distributed Systems Management
      dsm.dsh                   7.1.2.15  COMMITTED  Distributed Systems Management
      expect.base               5.42.1.0  COMMITTED  Binary executable files of
      expect.man.en_US          5.42.1.0  COMMITTED  Expect man page documentation
      tcl.base                   8.4.7.0  COMMITTED  Binary executable files of Tcl
      tcl.man.en_US              8.4.7.0  COMMITTED  Tcl man page documentation
      tk.base                    8.4.7.0  COMMITTED  Binary executable files of Tk
      tk.man.en_US               8.4.7.0  COMMITTED  Tk man page documentation
    

Defining HMC objects

DSM uses the HMC to start (power on) lpars, stop (power off) lpars and open consoles on lpars, so HMCs can be defined in NIM; an HMC object is a management object. To avoid being prompted for a password each time a NIM operation is performed, or each time dconsole is called, DSM provides a mechanism to manage SSH key sharing between the NIM server and the HMC. Before adding an HMC object use the dpasswd and dkeyexch commands to enable SSH key authentication :

    • Create the authentication file with the dpasswd command. The file is by default stored in /etc/ibm/sysmgt/dsm/config :
    • # dpasswd -f hmc1_passwd -U hscroot
      Password:
      Re-enter password:
      Password file created
      # ls -l  /etc/ibm/sysmgt/dsm/config/
      total 24
      -r--r--r--    1 root     system           16 Mar 11 13:25 .key
      -r--r--r--    1 root     system           24 Mar 11 13:25 hmc1_passwd
      
    • Share the key between NIM master and HMC using dkeyexch command :
    • # dkeyexch -f /etc/ibm/sysmgt/dsm/config/hmc1_passwd -I hmc -H hmc1
      OpenSSH_6.0p1, OpenSSL 0.9.8x 10 May 2012
      
    • At this step you should be able to connect to the HMC without password prompting :
    • # ssh hscroot@hmc1
      Last login: Mon Mar 11 13:51:35 2013 from 10.10.20.21
      
    • Define the new HMC object with the nim command; the network on which the HMC is located must be defined as a NIM network :
    • # nim -o define -t ent -a net_addr=10.10.30.0 -a snm=255.255.254.0 -a routing1="default 10.10.31.254" 10-10-30-0-s23-net
      # nim -o define -t hmc -a if1="find_net hmc1 0" -a passwd_file=/etc/ibm/sysmgt/dsm/config/hmc1_passwd hmc1
      # lsnim -t hmc
      hmc1     management       hmc
      # lsnim -lF hmc1
      hmc1:
         id          = 1363005068
         class       = management
         type        = hmc
         if1         = 10-10-30-0-s23-net hmc1 0
         Cstate      = ready for a NIM operation
         prev_state  =
         Mstate      = not running
         passwd_file = /etc/ibm/sysmgt/dsm/config/hmc1_passwd
      

Defining CEC objects

Defining an HMC object allows you to define CEC objects. NIM CEC objects require four mandatory attributes: the hardware type (hw_type), the hardware model (hw_model), the hardware serial (hw_serial), and the HMC used to control this CEC object (mgmt_source). Query the HMC to get the attributes with the lssyscfg command, and define the new CEC object with the nim command :

    • Querying the HMC to get hw_model, hw_serial, and hw_type :
    • # ssh hscroot@hmc1 "lssyscfg -r sys -F name,type_model,serial_num"
      # CEC1,8203-E4A,060CE99
      
    • The lssyscfg output tells you that hw_type=8203, hw_model=E4A and hw_serial=060CE99.
    • Create the CEC object :
    • # nim -o define -t cec -a hw_type=8203 -a hw_model=E4A -a hw_serial=060CE99 -a mgmt_source=hmc1 cec1
      # lsnim -l cec1
      cec1:
         class      = management
         type       = cec
         Cstate     = ready for a NIM operation
         prev_state =
         hmc        = hmc1
         serial     = 8203-E4A*060CE99
      

Adding profile management to a standalone object

To define a standalone object with a management profile, or to add a management profile to an existing standalone object, the MAC address and the lpar id are needed. The lpar id can easily be learned from the HMC; for the MAC address use the dgetmacs command :

    • Get the lpar id through the HMC :
    • ssh hscroot@infmc102 "lssyscfg -r lpar -m CEC1 -F name,lpar_id"
      lpar1,5
      lpar2,4
      vios1,3
      vios2,2
      lpar3,1
      
    • Define the machine with the MAC address set to 0 for now :
    • # nim -o define -t standalone -a if1="10-10-20-0-s24-net lpar2 0" -a net_settings1="auto auto" -a mgmt_profile1="hmc1 4 CEC1" lpar2
      
    • Retrieve the machine's MAC address by using the dgetmacs command; the host will be booted into Open Firmware. If the host is already installed, get the MAC address with the entstat command directly on the machine :
    • #  dgetmacs -n lpar2 -C NIM
      Using an adapter type of "ent".
      Could not dsh to node lpar2.
      Attempting to use openfirmware method to collect MAC addresses.
      Acquiring adapter information from Open Firmware for node lpar2.
      
      # Node::adapter_type::interface_name::MAC_address::location::media_speed::adapter_duplex::UNUSED::install_gateway::ping_status::machine_type::netaddr::subnet_mask
      
      lpar1::ent_v::::2643EEBC6C04::U8203.E4A.060CE99-V4-C4-T1::auto::auto::::::n/a::secondary::::
      
    • Modify the NIM object to add the MAC address :
    • # nim -o change -a if1="10-10-20-0-s24-net lpar2 2643EEBC6C04" lpar2
      

Using dconsole to open and monitor machine consoles

If the machine is already installed, or after an installation with a bos_inst operation, you can manage its console with the dconsole command. A few cool things come with dconsole, such as opening a console in read-only mode, opening a console in text mode or through an xterm, and logging all console output into /var/ibm/sysmgt/dsm/log/console. Here are a few examples :

    • Opening a text console in read-write mode and logging the output in /var/ibm/sysmgt/dsm/log/console :
    • # dconsole -C NIM -n lpar2 -t -l
      Starting console daemon
      [read-write session]
      
       Open in progress
      
       Open Completed.
      AIX Version 7
      Copyright IBM Corporation, 1982, 2013.
      Console login: root
      # echo test
      test
      # tail -10 /var/ibm/sysmgt/dsm/log/console/lpar2.0
      # echo test
      test
      # exit
      
    • Opening an xterm console in read-write mode and logging the output in /var/ibm/sysmgt/dsm/log/console, for greenclient1 :
    • # export DISPLAY=10.10.20.35:0
      # dconsole -C NIM -n greenclient1  -l
      Starting console daemon
      

    • Opening a text console in read-only mode :
    • # dconsole -C NIM -n lpar2  -l -t -r
      Starting console daemon
      [read only session, user input discarded]
      
       Open in progress
      
       Open Completed.
      AIX Version 7
      Copyright IBM Corporation, 1982, 2013.
      Console login: [read only session, user input discarded]
      [read only session, user input discarded]
      

bos_inst operation through NIM with DSM

Machine installations and bos_inst operations can be automated with DSM. If a machine has a management profile and a bos_inst operation is performed, the machine is rebooted and automatically installed. I do install machines with this method and it works like a charm :

    • Install the machine lpar2 with AIX 7100-02-02; a bosinst_data resource with a no-prompt stanza was created for this installation :
    • # nim -o bos_inst -a bosinst_data=hdisk0_noprompt-bosinst_data -a source=rte -a installp_flags=agX -a accept_licenses=yes -a spot=7100-02-02-1316-spot -a lpp_source=7100-02-02-1316-lpp_source lpar2
      dnetboot Status: Invoking /opt/ibm/sysmgt/dsm/dsmbin/lpar_netboot lpar2
      dnetboot Status: Was successful network booting node lpar2.
      
    • DSM uses the HMC lpar_netboot command to install machines; the output of this command can be found in the /tmp filesystem :
    • # cat /tmp/lpar_netboot.12124286.exec.log
      lpar_netboot Status: process id is 12124286
      lpar_netboot Status: lpar_netboot -i -t ent -D -S 10.10.20.140 -G 10.10.20.254 -C 10.10.20.202 -m 2643EEBC6C04 -s auto -d auto -F /etc/ibm/sysmgt/dsm/config/hmc1_passwd -j hmc -J 10.10.30.1 4 060C
      E74 8203-E4A
      [..]
      IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
      IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
      
                1 = SMS Menu                          5 = Default Boot List
                8 = Open Firmware Prompt              6 = Stored Boot List
      [..]
      10.10.20.202:    24  bytes from 10.10.20.140:  icmp_seq=7  ttl=? time=21  ms
      
      10.10.20.202:    24  bytes from 10.10.20.140:  icmp_seq=8  ttl=? time=21  ms
      PING SUCCESS.
      [..]
      38300 ^MPACKET COUNT = 38400 ^MPACKET COUNT = 38500 ^MPACKET COUNT = 38600 ^MPACKET COUNT = 38700 ^MPACKET COUNT = 38800 ^MPACKET COUNT = 38900 ^MFINAL PACKET COUNT = 38913
      FINAL FILE SIZE = 19922944  BYTES
      
    • The installation progress can be monitored from the NIM server itself (see the loop sketched after this list) :
    • # lsnim -l lpar2 |grep info
         info           = BOS install 39% complete : Installing additional software.
      
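    If you do not want to run lsnim by hand every minute, a tiny loop does the job. This is just a sketch; the object name and the polling interval are up to you :

      # poll the BOS install progress every 60 seconds (Ctrl-C to stop)
      # while true ; do lsnim -l lpar2 | grep info ; sleep 60 ; done
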

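    For reference, here is what the no-prompt bosinst_data used above could look like. This is a minimal sketch, not the exact resource I used: the stanza values and the resource location are assumptions to adapt to your environment :

      # content of the bosinst_data file (minimal no-prompt overwrite install)
      control_flow:
          CONSOLE = Default
          INSTALL_METHOD = overwrite
          PROMPT = no
          EXISTING_SYSTEM_OVERWRITE = yes
          ACCEPT_LICENSES = yes
          RECOVER_DEVICES = no

      target_disk_data:
          HDISKNAME = hdisk0

      # define it as a NIM resource (the location path is an assumption)
      # nim -o define -t bosinst_data -a server=master -a location=/export/nim/bosinst_data/hdisk0_noprompt-bosinst_data hdisk0_noprompt-bosinst_data
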
    Is it free ?

    Unlike CSM, DSM is free: you do not need any license to use it. As you can see, these tools can be very powerful for automating the installation of standalone clients. VMControl itself uses DSM and NIM to automate installations. DSM is the right tool to industrialize your NIM installations.

    Cheatsheet

    I love cheat sheets ! NIM commands are complex and hard to remember, so I searched the internet for an existing NIM cheat sheet, but I found nothing accurate or fitting my needs. I'm sure that a lot of my readers already know William Favorite's Quicksheets. I'm a huge fan of these Quicksheets and I was inspired by William when creating my own for NIM. Feel free to contact me if you want to add or correct something in my cheat sheet; you'll (of course) be credited if you add some useful information. Click here to download my NIM cheat sheet : chmod666 NIM Cheat Sheet

    No future ?

    I do love NIM, but in my opinion it is a little bit outdated. Everyone is calling for an update of the Redbook (click here to call for an update ;-)) and of the product itself, me included. This part of the post was inspired by one of my AIX gurus; thanks to him, I'm sure he'll recognize himself. If IBMers are reading this part of the post, please tell IBM to update NIM. Readers, please react in the comments if you agree with me on this point. Here are a few points I want to see in a future NIM release :

    • Network package repository of software : publish lpp_source over http or https. IBM could publish an official repository, and customers could create their own on the NIM server (synchronized with the official IBM one).
    • Create a client (an updated nimclient) with search and download options (yes, like yum).
    • Get rid of bootp and tftp: download the kernel (created in /tftpboot when a new SPOT is created) and the ramdisk image through http or https.
    • Replace nfs exports with http or https (or force nfsv4) for NIM resource sharing (SPOT, lpp_source, install_script, bosinst_data…); easier for security and firewall rules.
    • Allow the IPL menu to be set up through dhcp.
    • Automatic dependency checking and resolution when installing software.
    • Simplify postinstall (script) and firstboot (fb_script) handling. My current solution is to create a firstboot script that downloads another script and adds an entry to /etc/inittab; the downloaded script does the job and removes the /etc/inittab entry at the end of its execution (see the sketch after this list).
    • Automatic multibos creation when updating a system through NIM (or at least as an option).
    • Keep mksysb the way it is; this is the best bare-metal backup I have ever known.
    • Get rid of rsh and force users to use nimsh (for nimadm too).
    • Better design for high availability (automatic HANIM synchronization, for example).
    • NIM database flexibility : let users rename a resource object (please do this !!!). Who has never experienced this problem after creating a SPOT or an lpp_source with an erroneous name ?
    • Allow allocating multiple lpp_source resources to different installp_bundle resources for a single installation.
    • Allow nimadm migrations to be performed without requiring the exact same level of the bos.alt_disk_install.rte fileset.
    • Allow nimsh to be configured over http or https (no more multiple ports for nimsh; easier for security and firewall rules).
    • Automatically enable cryptographic authentication for the NIM service handler (nimsh can use SSL certificates; this can already be done manually, see the sketch below).
    • Easier NIM backup and restore; get rid of m_backup_db and m_restore_db.

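    To illustrate the firstboot workaround mentioned in the list above, here is a minimal sketch of what I mean. The server name, the paths and the inittab label are hypothetical, and I use an NFS mount to fetch the script (http would be nicer, hence the wish) :

      #!/usr/bin/ksh
      # fb_script executed once at first boot : fetch the real postinstall
      # script from the NIM server and register it in /etc/inittab
      mount nimserver:/export/scripts /mnt
      cp /mnt/postinstall.ksh /usr/local/sbin/postinstall.ksh
      chmod +x /usr/local/sbin/postinstall.ksh
      umount /mnt
      mkitab "postinst:2:once:/usr/local/sbin/postinstall.ksh >/dev/console 2>&1"

      # and the last line of postinstall.ksh removes its own inittab entry :
      # rmitab postinst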

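    About the nimsh cryptographic authentication point : it can already be enabled manually today. A quick sketch, assuming OpenSSL is installed on the master and the clients :

      # on the NIM master : create the certificates and enable SSL for nimsh
      # nimconfig -c
      # on each NIM client : switch the client to SSL-enabled nimsh
      # nimclient -c
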
    Please comment and react, I do need your support ;-). I hope this helps.