Tips and tricks for PowerVC 1.2.3 (PVID, ghostdev, clouddev, rest API, growing volumes, deleting boot volume) | PowerVC 1.2.3 Redbook

Writing a Redbook was one of my main goals. After working days and nights for more than 6 years on Power Systems, IBM gave me the opportunity to write a Redbook. I had been watching the Redbook residencies page for a very long time to find the right one. As there was nothing new on AIX and PowerVM (which are my favorite topics) I decided to give the latest PowerVC Redbook a try (this Redbook is an update, but a huge one: PowerVC is moving fast). I have been a Redbook reader since I started working on AIX. Almost all Redbooks are good; most of them are the best source of information for AIX and Power administrators. I’m sure that, like me, you have seen the part about becoming an author every time you read a Redbook. I can now say THAT IT IS POSSIBLE (for everyone). I’m now one of those guys and you can become one too. Just find the Redbook that fits you and apply on the Redbook webpage (http://www.redbooks.ibm.com/residents.nsf/ResIndex). I want to say a BIG thank you to all the people who gave me this opportunity, especially Philippe Hermes, Jay Kruemcke, Eddie Shvartsman, Scott Vetter, and Thomas R Bosthworth. I also want to thank my teammates on this Redbook: Guillermo Corti, Marco Barboni and Liang Xu; they are all true professionals, very skilled and open … this was a great team! One more time, thank you guys. Last, I take the opportunity here to thank the people who believed in me since the very beginning of my AIX career: Julien Gabel, Christophe Rousseau, and JL Guyot. Thank you guys! You deserve it, stay as you are. I’m not an anonymous guy anymore.

redbook

You can download the Redbook at this address: http://www.redbooks.ibm.com/redpieces/pdfs/sg248199.pdf. I learned something during the writing of the Redbook and by talking to the members of the team: Redbooks are not there to explain what’s “behind the scenes”. A Redbook cannot be too long and needs to be written in about 3 weeks; there is no room for everything. Some topics are better suited to a blog post than to a Redbook, and Scott told me that a couple of times during the writing session. I totally agree with him. So here is this long-awaited blog post. These are advanced topics about PowerVC: read the Redbook before reading this post.

One last thanks to IBM (and just IBM) for believing in me :-). THANK YOU SO MUCH.

ghostdev, clouddev and cloud-init (ODM wipe if using inactive live partition mobility or remote restart)

Everybody who is using cloud-init should be aware of this. cloud-init is only supported with AIX versions that have the clouddev attribute available on sys0. To be totally clear, at the time of writing this blog post you will be supported by IBM only if you use AIX 7.1 TL3 SP5 or AIX 6.1 TL9 SP5. All other versions are not supported by IBM. Let me explain why, and how you can still use cloud-init on older versions with a little trick. But let’s first explain what the problem is:

Let’s say you have different machines, some of them using AIX 7100-03-05 and some of them using 7100-03-04, both using cloud-init for the activation. By looking at the cloud-init code at this address here, we can say that:

  • After the cloud-init installation cloud-init is:
  • Changing clouddev to 1 if sys0 has a clouddev attribute:
  • # oslevel -s
    7100-03-05-1524
    # lsattr -El sys0 -a ghostdev
    ghostdev 0 Recreate ODM devices on system change / modify PVID True
    # lsattr -El sys0 -a clouddev
    clouddev 1 N/A True
    
  • Changing ghostdev to 1 if sys0 doesn’t have a clouddev attribute:
  • # oslevel -s
    7100-03-04-1441
    # lsattr -El sys0 -a ghostdev
    ghostdev 1 Recreate ODM devices on system change / modify PVID True
    # lsattr -El sys0 -a clouddev
    lsattr: 0514-528 The "clouddev" attribute does not exist in the predefined
            device configuration database.
    

This behavior can directly be observed in the cloud-init code:

ghostdev_clouddev_cloudinit
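Based on the behavior described above, the decision cloud-init makes at installation time can be sketched like this (my own Python illustration, not cloud-init’s actual code; the function name is mine):

```python
# Sketch of the decision cloud-init makes at installation time, as
# described above: prefer clouddev when sys0 supports it, otherwise
# fall back to ghostdev. Function and argument names are assumptions.
def pick_activation_attribute(sys0_has_clouddev):
    """Return the (attribute, value) pair to set on sys0."""
    if sys0_has_clouddev:
        # AIX 7.1 TL3 SP5 / 6.1 TL9 SP5 and later: use clouddev
        return ("clouddev", "1")
    # Older levels: no clouddev attribute on sys0, fall back to ghostdev
    return ("ghostdev", "1")

# On AIX, the chosen pair would be applied with something like:
#   chdev -l sys0 -a <attribute>=<value>
print(pick_activation_attribute(True))   # supported levels
print(pick_activation_attribute(False))  # older levels
```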

Now that we are aware of that, let’s make a remote restart test between two P8 boxes. I take the opportunity here to present one of the coolest features of PowerVC 1.2.3: you can now remote restart your virtual machines directly from the PowerVC GUI if one of your hosts is in a failure state. I highly encourage you to check my latest post about this subject if you don’t know how to set up remote restartable partitions: http://chmod666.org/index.php/using-the-simplified-remote-restart-capability-on-power8-scale-out-servers/

  • Only simplified remote restart can be managed by PowerVC 1.2.3; the “normal” version of remote restart is not handled by PowerVC 1.2.3.
  • In the compute template configuration there is now a checkbox allowing you to create remote restartable partitions. Be careful: you can’t go back to a P7 box without rebooting the machine. So be sure your Virtual Machines will stay on a P8 box if you check this option.
  • remote_restart_compute_template

  • When the machine is shut down or there is a problem with it, you can click the “Remotely Restart Virtual Machines” button:
  • rr1

  • Select the machines you want to remote restart:
  • rr2
    rr3

  • While the Virtual Machines are remote restarting, you can check the states of the VM and the state of the host:
  • rr4
    rr5

  • After the evacuation the host is in “Remote Restart Evacuated State”:

rr6

Let’s now check the state of our two Virtual Machines:

  • The ghostdev one (the sys0 message in the errpt indicates that the partition ID has changed AND THAT THE DEVICES WERE RECREATED (ODM wipe)) (no ip address is set on en0 anymore):
  • # errpt | more
    IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
    A6DF45AA   0803171115 I O RMCdaemon      The daemon is started.
    1BA7DF4E   0803171015 P S SRC            SOFTWARE PROGRAM ERROR
    CB4A951F   0803171015 I S SRC            SOFTWARE PROGRAM ERROR
    CB4A951F   0803171015 I S SRC            SOFTWARE PROGRAM ERROR
    D872C399   0803171015 I O sys0           Partition ID changed and devices recreat
    # ifconfig -a
    lo0: flags=e08084b,c0
            inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
            inet6 ::1%1/0
             tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
    
  • The clouddev one (the sys0 message in the errpt indicates that the partition ID has changed; note that the errpt message does not say that the devices were recreated):
  • # errpt |more
    60AFC9E5   0803232015 I O sys0           Partition ID changed since last boot.
    # ifconfig -a
    en0: flags=1e084863,480
            inet 10.10.10.20 netmask 0xffffff00 broadcast 10.244.248.63
             tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
    lo0: flags=e08084b,c0
            inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
            inet6 ::1%1/0
             tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
    

VSAE is designed to manage ghostdev-only OSes; cloud-init, on the other hand, is designed to manage clouddev OSes. To be perfectly clear, here is how ghostdev and clouddev work. But we first need to answer a question: why do we need to set clouddev or ghostdev to 1? The answer is pretty obvious: one of these attributes needs to be set to 1 before capturing the Virtual Machine. When you deploy a new Virtual Machine this flag is needed to wipe the ODM before reconfiguring the virtual machine with the parameters set in the PowerVC GUI (ip, hostname). In both the clouddev and ghostdev cases it is obvious that we need to wipe the ODM at machine build/deploy time. Then VSAE or cloud-init (using the config drive datasource) sets the hostname and ip address that were wiped through the clouddev and ghostdev attributes. This works well for a new deploy because we need to wipe the ODM in all cases, but what about an inactive live partition mobility or a remote restart operation? The Virtual Machine has moved (not on the same host, and not with the same lpar ID) and we need to keep the ODM as it is. How does it work:

  • If you are using VSAE, it manages the ghostdev attribute for you. At capture time ghostdev is set to 1 by VSAE (when you run the pre-capture script). When deploying a new VM, at activation time, VSAE sets ghostdev back to 0. Inactive live partition mobility and remote restart operations will work fine with ghostdev set to 0.
  • If you are using cloud-init on a supported system, clouddev is set to 1 at the installation of cloud-init. As cloud-init does nothing with either attribute at activation time, IBM needed a way to avoid wiping the ODM after a remote restart operation. That is why the clouddev device was introduced: it writes a flag in the NVRAM. When a new VM is built there is no flag in the NVRAM for it, so the ODM is wiped. When an already existing VM is remote restarted, the flag exists in the NVRAM and the ODM is not wiped. By using clouddev there is no post-deploy action needed.
  • If you are using cloud-init on an unsupported system, ghostdev is set to 1 at the installation of cloud-init. As cloud-init does nothing at post-deploy time, ghostdev will remain 1 in all cases and the ODM will always be wiped.
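The clouddev NVRAM logic described in the second bullet can be illustrated with a tiny sketch (my own model of the behavior, with a dict standing in for the NVRAM; not IBM’s implementation):

```python
# Illustration of the clouddev NVRAM-flag logic described above:
# a freshly built VM has no flag, so its ODM is wiped once and the
# flag is written; a remote-restarted or moved VM already has the
# flag, so its ODM is preserved. The dict stands in for the NVRAM.
def boot_with_clouddev(nvram):
    if not nvram.get("activation_flag"):
        nvram["activation_flag"] = True
        return "ODM wiped (first boot after deploy)"
    return "ODM preserved (flag found in NVRAM)"

nvram = {}                        # new VM: empty NVRAM
print(boot_with_clouddev(nvram))  # first boot wipes the ODM
print(boot_with_clouddev(nvram))  # remote restart keeps the ODM
```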

cloudghost

There is a way to use cloud-init on unsupported systems. Keep in mind that in this case you will not be supported by IBM, so do this at your own risk. To be totally honest, I’m using this method in production to use the same activation engine for all my AIX versions:

  1. Pre-capture, set ghostdev to 1. Whatever happens, THIS IS MANDATORY.
  2. Post-capture, reboot the captured VM and set ghostdev to 0.
  3. Post-deploy, on every Virtual Machine set ghostdev to 0. You can put this in the activation input to do the job:

     #cloud-config
     runcmd:
      - chdev -l sys0 -a ghostdev=0


The PVID problem

I realized I had this problem after using PowerVC for a while. As PowerVC images for the rootvg and other volume groups are created using Storage Volume Controller FlashCopy (in the case of an SVC configuration, but there are similar mechanisms for other storage providers), the PVIDs for both the rootvg and additional volume groups will always be the same for each new virtual machine (all new virtual machines will have the same PVID for their rootvg, and the same PVID for each captured volume group). I contacted IBM about this and the PowerVC team told me that this behavior is totally normal and has been observed since the release of VMControl. They have not had any issues related to it, so if you don’t care about it, just do nothing and keep this behavior as it is. I recommend doing nothing about this!

It’s a shame, but most AIX administrators like to keep things as they are and don’t want any changes. (In my humble opinion this is one of the reasons AIX is so outdated compared to Linux; we need a community, not narrow-minded people keeping their knowledge to themselves just to stay in their daily job routine without having anything to learn.) If you are in this situation, facing angry colleagues about this particular point, you can use the solution proposed below to calm the passions of the few who do not want to change! :-) This is my rant: CHANGE!

By default, if you build two virtual machines and check the PVID of each one, you will notice that the PVIDs are the same:

  • Machine A:
  • root@machinea:/root# lspv
    hdisk0          00c7102d2534adac                    rootvg          active
    hdisk1          00c7102d00d14660                    appsvg          active
    
  • Machine B:
  • root@machineb:root# lspv
    hdisk0          00c7102d2534adac                    rootvg          active
    hdisk1          00c7102d00d14660                    appsvg         active
    

For the rootvg the PVID is always set to 00c7102d2534adac and for the appsvg the PVID is always set to 00c7102d00d14660.

For the rootvg the solution is to change the ghostdev (and only the ghostdev) to 2, and to reboot the machine. Setting ghostdev to 2 will change the PVID of the rootvg at reboot time (after the PVID is changed, ghostdev is automatically set back to 0):

# lsattr -El sys0 -a ghostdev
ghostdev 2 Recreate ODM devices on system change / modify PVID True
# lsattr -l sys0 -R -a ghostdev
0...3 (+1)
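In other words, the ghostdev=2 sequence behaves like this sketch (my own illustration of the documented behavior, not AIX code; "new-pvid" is a placeholder):

```python
# Sketch of what happens at reboot with ghostdev=2, as described
# above: the rootvg PVID is regenerated once, then ghostdev falls
# back to 0 so later reboots leave the PVID alone.
def reboot(ghostdev, pvid):
    if ghostdev == 2:
        return 0, "new-pvid"   # PVID changed, ghostdev reset to 0
    return ghostdev, pvid      # nothing touched

ghostdev, pvid = reboot(2, "00c7102d2534adac")
print(ghostdev, pvid)          # 0 new-pvid
ghostdev, pvid = reboot(ghostdev, pvid)
print(ghostdev, pvid)          # unchanged on subsequent reboots
```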

For non-rootvg volume groups this is a little bit trickier but still possible: the solution is to use the recreatevg command (-d option) to change the PVID of all the physical volumes of your volume group. Before rebooting the server:

  • Unmount all the filesystems in the volume group whose PVID you want to change.
  • Varyoff the volume group.
  • Get the names of the physical volumes composing the volume group.
  • Export the volume group.
  • Recreate the volume group (this action will change the PVIDs).
  • Re-import the volume group.

Here is the shell commands doing the trick:

# vg=appsvg
# lsvg -l $vg | awk '$6 == "open/syncd" && $7 != "N/A" { print "fuser -k " $NF }' | sh
# lsvg -l $vg | awk '$6 == "open/syncd" && $7 != "N/A" { print "umount " $NF }' | sh
# varyoffvg $vg
# pvs=$(lspv | awk -v my_vg=$vg '$3 == my_vg {print $1}')
# recreatevg -y $vg -d $pvs
# importvg -y $vg $(echo ${pvs} | awk '{print $1}')

We now agree that you want to do this; but as you are a smart person you want to do it automatically using cloud-init and the activation input. There are two ways to do it: the silly way (using shell) and the noble way (using the cloud-init syntax):

PowerVC activation engine (shell way)

Use this short ksh script in the activation input. This is not my recommendation, but you can do it for simplicity:

activation_input_shell
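The screenshot above is not reproduced here; a minimal activation input in that spirit might look like the following (my own sketch, assuming the script simply sets ghostdev to 2 and reboots; the exact script in the image may differ):

```shell
#!/usr/bin/ksh
# Sketch of a shell-style activation input: regenerate the rootvg
# PVID by setting ghostdev to 2, then reboot to trigger the change.
chdev -l sys0 -a ghostdev=2
shutdown -Fr
```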

PowerVC activation engine (cloudinit way)

Here is the cloud-init way. Important note: use the latest version of cloud-init; the first one I used had a problem with cc_power_state_change.py not using the right parameters for AIX:

activation_input_ci
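Again, the exact input is in the screenshot, but the idea can be sketched in cloud-config syntax (an assumption on my part, relying on the runcmd and power_state modules):

```yaml
#cloud-config
# Sketch of a cloud-init-style activation input: same chdev, but
# the reboot is delegated to cloud-init's power_state module
# (handled by cc_power_state_change.py).
runcmd:
 - chdev -l sys0 -a ghostdev=2
power_state:
 mode: reboot
```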

Working with REST Api

I will not show you here how to work with the PowerVC RESTful API. I prefer to share a couple of scripts on my github account; nice examples are often better than how-to tutorials. So check the scripts on github if you want a detailed how-to … the scripts are well commented. Just a couple of things to say before closing this topic: the best way to work with a RESTful API is to code in Python, as there are a lot of existing Python libs for RESTful APIs (httplib2, pycurl, requests). For my own understanding I prefer using the simple httplib in my scripts. I will put all my command line tools in a github repository called pvcmd (for PowerVC command line). You can download the scripts at this address, or just use git to clone the repo. One more time, it is a community project; feel free to change and share anything: https://github.com/chmod666org/pvcmd
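As a taste of what those scripts do, here is a minimal sketch of building the Keystone v3 authentication body that PowerVC’s identity service (port 5000) expects. This follows the standard OpenStack Keystone v3 API; the helper name is mine, and the domain/project values are the usual PowerVC defaults:

```python
import json

# Build the Keystone v3 token request body for PowerVC's identity
# service (POST https://<powervc>:5000/v3/auth/tokens). Sketch only;
# helper name and defaults are assumptions.
def build_auth_body(user, password, project="ibm-default"):
    return {"auth": {
        "identity": {
            "methods": ["password"],
            "password": {"user": {
                "name": user,
                "domain": {"name": "Default"},
                "password": password}}},
        "scope": {"project": {
            "name": project,
            "domain": {"name": "Default"}}}}}

body = json.dumps(build_auth_body("root", "mysecretpassword"))
# POST this body with Content-Type: application/json; the token comes
# back in the X-Subject-Token response header.
```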

Growing data lun

To be totally honest, here is what I do when I’m creating a new machine with PowerVC. My customers always need one additional volume group for applications (we will call it appsvg). I’ve created a multi-volume image with this volume group already created (with a bunch of filesystems in it). As most customers were asking for the volume group to be 100 GB large, the capture was made with this size. Unfortunately for me, we often get requests to create bigger volume groups, let’s say 500 or 600 GB. Instead of creating a new lun and extending the volume group, PowerVC allows you to grow the lun to the desired size. For volume groups other than the boot one you must use the RESTful API to extend the volume. To do this I’ve created a python script called pvcgrowlun (feel free to check the code on github): https://github.com/chmod666org/pvcmd/blob/master/pvcgrowlun. At each virtual machine creation I check if the customer needs a larger volume group and extend it using the command provided below.

While coding this script I had a problem using the os-extend parameter in my http request. PowerVC does not use exactly the same parameters as OpenStack; if you want to code this yourself, be aware of this and check in the PowerVC online documentation whether you are using “extended attributes” (thanks to Christine L Wang for this one):

  • In the Openstack documentation the attribute is “os-extend” link here:
  • os-extend

  • In the PowerVC documentation the attribute is “ibm-extend” link here:
  • ibm-extend

  • Identify the lun you want to grow (the script takes the name of the volume as a parameter) (I have one, not published, to list all the volumes; tell me if you want it). In my case the volume name is multi-vol-bf697dfa-0000003a-828641A_XXXXXX-data-1, and I want to change its size from 60 to 80 GB. This is not stated in the official PowerVC documentation, but this works for both boot and data luns.
  • Check that the current size of the lun is less than the desired size:
  • before_grow

  • Run the script:
  • # pvcgrowlun -v multi-vol-bf697dfa-0000003a-828641A_XXXXX-data-1 -s 80 -p localhost -u root -P mysecretpassword
    [info] growing volume multi-vol-bf697dfa-0000003a-828641A_XXXXX-data-1 with id 840d4a60-2117-4807-a2d8-d9d9f6c7d0bf
    JSON Body: {"ibm-extend": {"new_size": 80}}
    [OK] Call successful
    None
    
  • Check the size is changed after the command execution:
  • aftergrow_grow

  • Don’t forget to do the job in the operating system by running a “chvg -g” (check the total PPs here):
  • # lsvg vg_apps
    VOLUME GROUP:       vg_apps                  VG IDENTIFIER:  00f9aff800004c000000014e6ee97071
    VG STATE:           active                   PP SIZE:        256 megabyte(s)
    VG PERMISSION:      read/write               TOTAL PPs:      239 (61184 megabytes)
    MAX LVs:            256                      FREE PPs:       239 (61184 megabytes)
    LVs:                0                        USED PPs:       0 (0 megabytes)
    OPEN LVs:           0                        QUORUM:         2 (Enabled)
    TOTAL PVs:          1                        VG DESCRIPTORS: 2
    STALE PVs:          0                        STALE PPs:      0
    ACTIVE PVs:         1                        AUTO ON:        yes
    MAX PPs per VG:     32768                    MAX PVs:        1024
    LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
    HOT SPARE:          no                       BB POLICY:      relocatable
    MIRROR POOL STRICT: off
    PV RESTRICTION:     none                     INFINITE RETRY: no
    DISK BLOCK SIZE:    512                      CRITICAL VG:    no
    # chvg -g appsvg
    # lsvg appsvg
    VOLUME GROUP:       appsvg                  VG IDENTIFIER:  00f9aff800004c000000014e6ee97071
    VG STATE:           active                   PP SIZE:        256 megabyte(s)
    VG PERMISSION:      read/write               TOTAL PPs:      319 (81664 megabytes)
    MAX LVs:            256                      FREE PPs:       319 (81664 megabytes)
    LVs:                0                        USED PPs:       0 (0 megabytes)
    OPEN LVs:           0                        QUORUM:         2 (Enabled)
    TOTAL PVs:          1                        VG DESCRIPTORS: 2
    STALE PVs:          0                        STALE PPs:      0
    ACTIVE PVs:         1                        AUTO ON:        yes
    MAX PPs per VG:     32768                    MAX PVs:        1024
    LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
    HOT SPARE:          no                       BB POLICY:      relocatable
    MIRROR POOL STRICT: off
    PV RESTRICTION:     none                     INFINITE RETRY: no
    DISK BLOCK SIZE:    512                      CRITICAL VG:    no
    

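The os-extend/ibm-extend difference above boils down to a single key in the request body, as the JSON Body printed by the script shows. A small sketch (the helper name is mine, not part of pvcgrowlun):

```python
import json

# PowerVC expects "ibm-extend" where stock OpenStack Cinder uses
# "os-extend"; the request body is otherwise the same. Sketch only.
def build_extend_body(new_size_gb, powervc=True):
    action = "ibm-extend" if powervc else "os-extend"
    return {action: {"new_size": new_size_gb}}

print(json.dumps(build_extend_body(80)))
# {"ibm-extend": {"new_size": 80}}
```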
My own script to create VMs

I create Virtual Machines every week, sometimes just a couple and sometimes 10 Virtual Machines in a row. We are using different storage connectivity groups, and different storage templates depending on whether the machine is in production, in development, and so on. We also have to choose the primary copy on the SVC side if the machine is in production (I am using a stretched cluster between two distant sites, so I have to choose different storage templates depending on the site where the Virtual Machine is hosted). I make mistakes almost every time I use the PowerVC GUI (sometimes I forget to put the machine name, sometimes the connectivity group). I’m a lazy guy, so I decided to code a script using the PowerVC REST API to create new machines based on a template file. We are planning to give the script to our outsourced teams to allow them to create machines without knowing what PowerVC is \o/. The script takes a file as a parameter and creates the virtual machine:

  • Create a file like the one below with all the information needed for your new virtual machine creation (name, ip address, vlan, host, image, storage connectivity group, ….):
  • # cat test.vm
    name:test
    ip_address:10.16.66.20
    vlan:vlan6666
    target_host:Default Group
    image:multi-vol
    storage_connectivity_group:npiv
    virtual_processor:1
    entitled_capacity:0.1
    memory:1024
    storage_template:storage1
    
  • Launch the script, the Virtual Machine will be created:
  • pvcmkvm -f test.vm -p localhost -u root -P mysecretpassword
    name: test
    ip_address: 10.16.66.20
    vlan: vlan666
    target_host: Default Group
    image: multi-vol
    storage_connectivity_group: npiv
    virtual_processor: 1
    entitled_capacity: 0.1
    memory: 1024
    storage_template: storage1
    [info] found image multi-vol with id 041d830c-8edf-448b-9892-560056c450d8
    [info] found network vlan666 with id 5fae84a7-b463-4a1a-b4dd-9ab24cdb66b5
    [info] found host aggregation Default Group with id 1
    [info] found storage template storage1 with id bfb4f8cc-cd68-46a2-b3a2-c715867de706
    [info] found image multi-vol with id 041d830c-8edf-448b-9892-560056c450d8
    [info] found a volume with id b3783a95-822c-4179-8c29-c7db9d060b94
    [info] found a volume with id 9f2fc777-eed3-4c1f-8a02-00c9b7c91176
    JSON Body: {"os:scheduler_hints": {"host_aggregate_id": 1}, "server": {"name": "test", "imageRef": "041d830c-8edf-448b-9892-560056c450d8", "networkRef": "5fae84a7-b463-4a1a-b4dd-9ab24cdb66b5", "max_count": 1, "flavor": {"OS-FLV-EXT-DATA:ephemeral": 10, "disk": 60, "extra_specs": {"powervm:max_proc_units": 32, "powervm:min_mem": 1024, "powervm:proc_units": 0.1, "powervm:max_vcpu": 32, "powervm:image_volume_type_b3783a95-822c-4179-8c29-c7db9d060b94": "bfb4f8cc-cd68-46a2-b3a2-c715867de706", "powervm:image_volume_type_9f2fc777-eed3-4c1f-8a02-00c9b7c91176": "bfb4f8cc-cd68-46a2-b3a2-c715867de706", "powervm:min_proc_units": 0.1, "powervm:storage_connectivity_group": "npiv", "powervm:min_vcpu": 1, "powervm:max_mem": 66560}, "ram": 1024, "vcpus": 1}, "networks": [{"fixed_ip": "10.244.248.53", "uuid": "5fae84a7-b463-4a1a-b4dd-9ab24cdb66b5"}]}}
    {u'server': {u'links': [{u'href': u'https://powervc.lab.chmod666.org:8774/v2/1471acf124a0479c8d525aa79b2582d0/servers/fc3ab837-f610-45ad-8c36-f50c04c8a7b3', u'rel': u'self'}, {u'href': u'https://powervc.lab.chmod666.org:8774/1471acf124a0479c8d525aa79b2582d0/servers/fc3ab837-f610-45ad-8c36-f50c04c8a7b3', u'rel': u'bookmark'}], u'OS-DCF:diskConfig': u'MANUAL', u'id': u'fc3ab837-f610-45ad-8c36-f50c04c8a7b3', u'security_groups': [{u'name': u'default'}], u'adminPass': u'u7rgHXKJXoLz'}}
    

One of the major advantages of using this is batching Virtual Machine creation. By using the script you can create one hundred Virtual Machines in a couple of minutes. Awesome!
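As an aside, the test.vm template format above is trivial to parse; here is a minimal Python sketch (the real pvcmkvm parser may differ):

```python
# Parse a simple "key:value" VM description file like test.vm into a
# dict (sketch only; the real pvcmkvm parser may differ).
def parse_vm_file(lines):
    spec = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        spec[key.strip()] = value.strip()
    return spec

sample = ["name:test", "vlan:vlan666", "memory:1024"]
print(parse_vm_file(sample))
# {'name': 'test', 'vlan': 'vlan666', 'memory': '1024'}
```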

Working with Openstack commands

PowerVC is based on OpenStack, so why not use the OpenStack commands to work with PowerVC? It is possible, but I repeat one more time that this is not supported by IBM at all; use this trick at your own risk. I was working with IBM Cloud Manager with OpenStack (ICMO), and a script including shell variables is provided to “talk” to the ICMO OpenStack. Based on that file I created the same one for PowerVC. Before using any OpenStack commands, create a powervcrc file that matches your PowerVC environment:

# cat powervcrc
export OS_USERNAME=root
export OS_PASSWORD=mypasswd
export OS_TENANT_NAME=ibm-default
export OS_AUTH_URL=https://powervc.lab.chmod666.org:5000/v3/
export OS_IDENTITY_API_VERSION=3
export OS_CACERT=/etc/pki/tls/certs/powervc.crt
export OS_REGION_NAME=RegionOne
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_DOMAIN_NAME=Default

Then source the powervcrc file, and you are ready to play with all Openstack commands:

# source powervcrc

You can then play with OpenStack commands; here are a few nice examples:

  • List virtual machines:
  • # nova list
    +--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
    | ID                                   | Name                  | Status | Task State | Power State | Networks               |
    +--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
    | dc5c9fce-c839-43af-8af7-e69f823e57ca | ghostdev0clouddev1    | ACTIVE | -          | Running     | vlan666=10.16.66.56    |
    | d7d0fd7e-a580-41c8-b3d8-d7aab180d861 | ghostdevto1cloudevto1 | ACTIVE | -          | Running     | vlan666=10.16.66.57    |
    | bf697dfa-f69a-476c-8d0f-abb2fdcb44a7 | multi-vol             | ACTIVE | -          | Running     | vlan666=10.16.66.59    |
    | 394ab4d4-729e-44c7-a4d0-57bf2c121902 | deckard               | ACTIVE | -          | Running     | vlan666=10.16.66.60    |
    | cd53fb69-0530-451b-88de-557e86a2e238 | priss                 | ACTIVE | -          | Running     | vlan666=10.16.66.61    |
    | 64a3b1f8-8120-4388-9d64-6243d237aa44 | rachael               | ACTIVE | -          | Running     |                        |
    | 2679e3bd-a2fb-4a43-b817-b56ead26852d | batty                 | ACTIVE | -          | Running     |                        |
    | 5fdfff7c-fea0-431a-b99b-fe20c49e6cfd | tyrel                 | ACTIVE | -          | Running     |                        |
    +--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
    
  • Reboot a machine:
  • # nova reboot multi-vol
    
  • List the hosts:
  • # nova hypervisor-list
    +----+---------------------+-------+---------+
    | ID | Hypervisor hostname | State | Status  |
    +----+---------------------+-------+---------+
    | 21 | 828641A_XXXXXXX     | up    | enabled |
    | 23 | 828641A_YYYYYYY     | up    | enabled |
    +----+---------------------+-------+---------+
    
  • Migrate a virtual machine (run a live partition mobility operation):
  • # nova live-migration ghostdevto1cloudevto1 828641A_YYYYYYY
    
  • Evacuate and set a server in maintenance mode and move all the partitions to another host:
  • # nova maintenance-enable --migrate active-only --target-host 828641A_XXXXXX 828641A_YYYYYYY
    
  • Virtual Machine creation (output truncated):
  • # nova boot --image 7100-03-04-cic2-chef --flavor powervm.tiny --nic net-id=5fae84a7-b463-4a1a-b4dd-9ab24cdb66b5,v4-fixed-ip=10.16.66.51 novacreated
    +-------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
    | Property                            | Value                                                                                                                                            |
    +-------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
    | OS-DCF:diskConfig                   | MANUAL                                                                                                                                           |
    | OS-EXT-AZ:availability_zone         | nova                                                                                                                                             |
    | OS-EXT-SRV-ATTR:host                | -                                                                                                                                                |
    | OS-EXT-SRV-ATTR:hypervisor_hostname | -                                                                                                                                                |
    | OS-EXT-SRV-ATTR:instance_name       | novacreated-bf704dc6-00000040                                                                                                                    |
    | OS-EXT-STS:power_state              | 0                                                                                                                                                |
    | OS-EXT-STS:task_state               | scheduling                                                                                                                                       |
    | OS-EXT-STS:vm_state                 | building                                                                                                                                         |
    | accessIPv4                          |                                                                                                                                                  |
    | accessIPv6                          |                                                                                                                                                  |
    | adminPass                           | PDWuY2iwwqQZ                                                                                                                                     |
    | avail_priority                      | -                                                                                                                                                |
    | compliance_status                   | [{"status": "compliant", "category": "resource.allocation"}]                                                                                     |
    | cpu_utilization                     | -                                                                                                                                                |
    | cpus                                | 1                                                                                                                                                |
    | created                             | 2015-08-05T15:56:01Z                                                                                                                             |
    | current_compatibility_mode          | -                                                                                                                                                |
    | dedicated_sharing_mode              | -                                                                                                                                                |
    | desired_compatibility_mode          | -                                                                                                                                                |
    | endianness                          | big-endian                                                                                                                                       |
    | ephemeral_gb                        | 0                                                                                                                                                |
    | flavor                              | powervm.tiny (ac01ba9b-1576-450e-a093-92d53d4f5c33)                                                                                              |
    | health_status                       | {"health_value": "PENDING", "id": "bf704dc6-f255-46a6-b81b-d95bed00301e", "value_reason": "PENDING", "updated_at": "2015-08-05T15:56:02.307259"} |
    | hostId                              |                                                                                                                                                  |
    | id                                  | bf704dc6-f255-46a6-b81b-d95bed00301e                                                                                                             |
    | image                               | 7100-03-04-cic2-chef (96f86941-8480-4222-ba51-3f0c1a3b072b)                                                                                      |
    | metadata                            | {}                                                                                                                                               |
    | name                                | novacreated                                                                                                                                      |
    | operating_system                    | -                                                                                                                                                |
    | os_distro                           | aix                                                                                                                                              |
    | progress                            | 0                                                                                                                                                |
    | root_gb                             | 60                                                                                                                                               |
    | security_groups                     | default                                                                                                                                          |
    | status                              | BUILD                                                                                                                                            |
    | storage_connectivity_group_id       | -                                                                                                                                                |
    | tenant_id                           | 1471acf124a0479c8d525aa79b2582d0                                                                                                                 |
    | uncapped                            | -                                                                                                                                                |
    | updated                             | 2015-08-05T15:56:02Z                                                                                                                             |
    | user_id                             | 0688b01e6439ca32d698d20789d52169126fb41fb1a4ddafcebb97d854e836c9                                                                                 |
    +-------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
    
    

LUN order, remove a boot lun

If you are moving to PowerVC you will probably need to migrate existing machines to your PowerVC environment. One of my customers asked to move its machines from old boxes using vscsi to new PowerVC-managed boxes using NPIV. I am doing it with the help of an SVC on the storage side. Instead of creating the Virtual Machine profile on the HMC and then doing the zoning and masking on the Storage Volume Controller and on the SAN switches, I decided to let PowerVC do the job for me. Unfortunately, PowerVC can't just "carve" an empty Virtual Machine: if you want to create one you have to build a full Virtual Machine (rootvg included). This is what I am doing. During the migration process I have to replace the PowerVC-created lun by the lun used for the migration … and finally delete the PowerVC-created boot lun. There is a trick to know if you want to do this:

  • Let’s say the lun created by PowerVC is the one named “volume-clouddev-test….” and the original rootvg is named “good_rootvg”. The Virtual Machine is booted on the “good_rootvg” lun and I want to remove the “volume-clouddev-test….” one:
  • root1

  • You first have to click the “Edit Details” button:
  • root2

  • Then toggle the boot set to “YES” for the “good_rootvg” lun and click move up (the rootvg must be moved to order 1; this is mandatory because the lun at order 1 can’t be deleted):
  • root3

  • Toggle the boot set to “NO” for the PowerVC-created rootvg:
  • root4

  • If you try to detach the volume in first position you will get an error:
  • root5

  • When the order is OK, you can detach and delete the lun created by PowerVC:
  • root6
    root7

Conclusion

There are always good things to learn about PowerVC and related AIX topics. Tell me if these tricks are useful for you and I will continue to write posts like this one. You don’t need to understand all these details to work with PowerVC, most customers don’t. But I’m sure you’d prefer to understand what is going on “behind the scene” instead of just clicking a nice GUI. I hope this helps you to better understand what PowerVC is made of. And don’t be shy, share your tricks with me. Next: more to come about Chef! Up the irons!

Exploit the full potential of PowerVC by using Shared Storage Pools & Linked Clones | PowerVC secrets about Linked Clones (pooladm,mksnap,mkclone,vioservice)

My journey into PowerVC continues :-). The blog was not updated for two months, but I’ve been busy these days, got sick … and so on; I have another post in the pipe, but that one has to be approved by IBM before posting. Since the latest version (at the time of writing this post, 1.2.1.2) PowerVC is capable of managing Shared Storage Pools (SSP). It’s a huge announcement because a lot of customers do not have a Storage Volume Controller and supported fibre channel switches. By using PowerVC in conjunction with SSP you will reveal the true and full potential of the product. There are two major enhancements brought by SSP. The first is the deployment time of new virtual machines: by using an SSP you’ll move from minutes to … seconds. Second huge enhancement: by using SSP you’ll automatically -without knowing it- be using a feature called “Linked Clones”. For those who have been following my blog since the very beginning, you’re probably aware that Linked Clones have been usable and available since SSPs were managed by the IBM Systems Director VMcontrol module. You can still refer to my blog posts about it … even if ISD VMcontrol is now a little bit outdated by PowerVC: here. Using PowerVC with Shared Storage Pools is easy, but how does it work behind the scene? After analysing the deployment process I’ve found some cool features: PowerVC is using secret undocumented commands, pooladm, vioservice, mkdev secret arguments … :

Discovering Shared Storage Pool on your PowerVC environment

The first step before beginning is to discover the Shared Storage Pool in PowerVC. I’m taking the time to explain this because it’s so easy that people (like me) can think there is much to do about it … but no, PowerVC is simple. You have nothing to do. I’m not going to explain here how to create a Shared Storage Pool, please refer to my previous posts about this: here and here. After the Shared Storage Pool is created, it will be automatically added into PowerVC … nothing to do. Keep in mind that you will need the latest version of the Hardware Management Console (v8r8.1.0). If you are having trouble discovering the Shared Storage Pool, check that the Virtual I/O Servers’ RMC connections are OK. In general, if you can query and perform any action on the Shared Storage Pool from the HMC, there will be no problem from the PowerVC side.

  • You don’t have to reload PowerVC after creating the Shared Storage Pool, just check that you can see it from the storage tab:
  • pvc_ssp1

  • You will get more details by clicking on the Shared Storage Pool ….
  • pvc_ssp2

  • such as the images captured on the Shared Storage Pool ….
  • pvc_ssp3

  • volumes created on it …
  • pvc_ssp4

What is a linked clone ?

Think before you start: you have to understand what a Linked Clone is before reading the rest of this post. Linked Clones are not well described in documentation and Redbooks. Linked Clones are based on Shared Storage Pool snapshots. No Shared Storage Pool = no Linked Clones. Here is what is going on behind the scene when you are deploying a Linked Clone:

  1. The captured rootvg underlying disk is a Shared Storage Pool Logical Unit.
  2. When the image is captured, the rootvg Logical Unit is copied and becomes known as a “Logical (Client Image) Unit”.
  3. When deploying a new machine, a snapshot is created from the Logical (Client Image) Unit.
  4. A “special Logical Unit” is created from the snapshot. This Logical Unit seems to be a pointer to the snapshot. We call it a clone.
  5. The machine is booted and the activation engine runs and reconfigures the network.
  6. When a block is modified on the new machine, it is duplicated and the modification is written to a new block on the Shared Storage Pool.
  7. This said, if no blocks are modified, all the machines created from this capture share the same blocks on the Shared Storage Pool.
  8. Only modified blocks are not shared between Linked Clones. The more you change on your rootvg, the more space you use on the Shared Storage Pool.
  9. That’s why these machines are called Linked Clones: they are all connected to the same source Logical Unit.
  10. You will save TIME (just a snapshot creation on the storage side) and SPACE (the rootvg blocks are shared by all the deployed machines) by using Linked Clones.
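To get an idea of what this block sharing means in terms of space, here is a quick back-of-the-envelope sketch in plain shell. The numbers are only an illustration: a 55296 MB image and 78144 modified (“owned”) 4 KB blocks per clone, which is the figure pooladm file stat reports for one of my clones further down this post.

```shell
# Rough space accounting: N linked clones vs N full copies (illustrative numbers).
image_mb=55296      # size of the captured rootvg image
owned_blocks=78144  # 4 KB blocks unique to each clone (modified since deployment)
block_kb=4
clones=10

# Each clone only "owns" its modified blocks; everything else stays shared.
per_clone_mb=$(( owned_blocks * block_kb / 1024 ))
full_copies_mb=$(( clones * image_mb ))
linked_clones_mb=$(( image_mb + clones * per_clone_mb ))

echo "10 full copies:   ${full_copies_mb} MB"
echo "10 linked clones: ${linked_clones_mb} MB"
```

With these example numbers, ten full copies would burn 552960 MB of pool space, while ten linked clones stay around 58346 MB.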

An image is sometimes better than long text, so here is a schema explaining all about Linked Clones :

LinkedClones

You have to capture an SSP-based VM to deploy on the SSP

Be aware of one thing: you can’t deploy a virtual machine on the SSP if you don’t have a captured image on the SSP, and you can’t use your Storwize images for this. You first have to create on your own a machine which has its rootvg running on the SSP:

  • Create an image based on an SSP virtual machine :
  • pvc_ssp_capture1

  • Shared Storage Pool Logical Units are stored in /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1.
  • Shared Storage Pool Logical (Client Image) Units are stored in /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM.
  • The Logical Unit of the captured virtual machine is copied with the dd command from the VOL1 directory (/var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1) to the IM directory (/var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM) (so from volumes to images).
  • If you do this yourself with the dd command, you can see that the captured image is not shown in the output of the snapshot command (with Linked Clones the snapshot command output is split into two categories: the actual “real” Logical Units, and the Logical (Client Image) Units which are the PowerVC images) …
  • A secret API driven by a secret command called vioservice adds your newly created image to the Shared Storage Pool soliddb database.
  • After this “registration” the Client Image is visible with the snapshot command.

Deployment

After the image is captured and stored in the Shared Storage Pool images directory, you can deploy virtual machines based on this image. Keep in mind that blocks are shared between linked clones: you’ll be surprised to see that deploying machines does not eat up the free space on the shared storage pool. But be aware that you can’t deploy any machine if there is no “blank” space left in the PowerVC space bar (check the image below):

deploy

Step by step deployment by example

  • A snapshot of the image is created through the pooladm command. Check the output of the snapshot command after this step: you’ll see a new snapshot derived from the Logical (Client Image) Unit.
  • This snapshot is cloned (my understanding is that a clone is a normal logical unit sharing blocks with an image). After the snapshot is cloned, a new volume is created in the shared storage pool volume directory, but at this step it is not visible with the lu command because creating a clone does not create meta-data on the shared storage pool.
  • A dummy logical unit is created. The clone is then moved over the dummy logical unit to replace it.
  • The clone logical unit is mapped to the client.

dummy

You can do it yourself without PowerVC (not supported)

Just for my own understanding of what PowerVC is doing behind the scene, I decided to try to do all the steps on my own. These steps work but are not supported at all by IBM.

  • Before starting to read this you need to know that $ prompts are for padmin commands and # prompts are for root commands. You’ll need the cluster id and the pool id to build some xml files:
  • $ cluster -list -field CLUSTER_ID
    CLUSTER_ID:      c50a291c18ab11e489f46cae8b692f30
    $ lssp -clustername powervc_cluster -field POOL_ID
    POOL_ID:         000000000AFFF80C0000000053DA327C
    
  • So the cluster id will be c50a291c18ab11e489f46cae8b692f30 and the pool id will be c50a291c18ab11e489f46cae8b692f30000000000AFFF80C0000000053DA327C (the cluster id followed by the pool id). These ids are prefixed by two characters in the xml files (I don’t know the purpose of these prefixes, but it works in all cases …)
  • Image files are stored in /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM.
  • Logical units files are stored in /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1.
  • Create the “envelope” of the Logical (Client Image) Unit by creating an xml file (the udids are built from the cluster udid and the pool udid) used as the standard input of the vioservice command:
  • # cat create_client_image.xml
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <VIO xmlns="http://ausgsa.austin.ibm.com/projects/v/vios/schema/1.20" version="1.20">
        <Request action="1">
            <Cluster udid="22c50a291c18ab11e489f46cae8b692f30">
                <Pool udid="24c50a291c18ab11e489f46cae8b692f30000000000AFFF80C0000000053DA327C">
                    <Tier>
                        <LU capacity="55296" type="8">
                            <Image label="chmod666-homemade-image"/>
                        </LU>
                    </Tier>
                </Pool>
            </Cluster>
        </Request>
    </VIO>
    # /usr/ios/sbin/vioservice lib/libvio/lu < create_client_image.xml
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <VIO xmlns="http://ausgsa.austin.ibm.com/projects/v/vios/schema/1.20" version="1.20">
    
    <Response><Cluster udid="22c50a291c18ab11e489f46cae8b692f30" name="powervc_cluster"><Pool udid="24c50a291c18ab11e489f46cae8b692f30000000000AFFF80C0000000053DA327C" name="powervc_sp" raidLevel="0" overCommitSpace="0"><Tier udid="25c50a291c18ab11e489f46cae8b692f3019f95b3ea4c4dee1" name="SYSTEM" overCommitSpace="0"><LU udid="29c50a291c18ab11e489f46cae8b692f30d87113d5be9004791d28d44208150874" capacity="55296" physicalUsage="0" unusedCommitment="0" type="8" derived="" thick="0" tmoveState="0"><Image label="chmod666-homemade-image" relPath=""/></LU></Tier></Pool></Cluster></Response></VIO>
    # ls -l /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM
    total 339759720
    -rwx------    1 root     staff    57982058496 Sep  8 19:00 chmod666-homemade-image.d87113d5be9004791d28d44208150874
    -rwx------    1 root     system   57982058496 Aug 12 17:53 volume-Image_7100-03-03-1415-SSP3e2066b2a7a9437194f48860affd56c0.ac671df86edaf07e96e399e3a2dbd425
    -rwx------    1 root     system   57982058496 Aug 18 19:15 volume-Image_7100-03-03-1415-c--5bd3991bdac84c48b519e19bfb1be381.e525b8eb474f54e1d34d9d02cb0b49b4
    
  • You can now see with the snapshot command that a new Logical (Client Image) Unit is here :
  • $ snapshot -clustername powervc_cluster -list -spname powervc_sp
    Lu(Client Image)Name     Size(mb)       ProvisionType     %Used Unused(mb)     Lu Udid
    volume-Image_7100-03-03-1415-SSP3e2066b2a7a9437194f48860affd56c055296          THIN               100% 0              ac671df86edaf07e96e399e3a2dbd425
    chmod666-homemade-image  55296          THIN                 0% 55299          d87113d5be9004791d28d44208150874
    volume-Image_7100-03-03-1415-c--5bd3991bdac84c48b519e19bfb1be38155296          THIN               100% 55299          e525b8eb474f54e1d34d9d02cb0b49b4
                    Snapshot
                    2631012f1a558e51d1af7608f3779a1bIMSnap
                    09a6c90817d24784ece38f71051e419aIMSnap
                    e400827d363bb86db7984b1a7de08495IMSnap
                    5fcef388618c9a512c0c5848177bc134IMSnap
    
  • Copy the source image (the stopped virtual machine with the activation engine enabled) to this newly created image (this one will be the new reference for all the virtual machines created with this image as source). Use the dd command to do it (and don’t forget the block size). You can check while the dd is running that the used percentage is increasing:
  • # dd if=/var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/volume-boot-9117MMD_658B2AD-aaaa95f8317c666549c4809264281db536dd.a2b7ed754030ca97668b30ab6cff5c45 of=/var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM/chmod666-homemade-image.d87113d5be9004791d28d44208150874 bs=1M
    $ snapshot -clustername powervc_cluster -list -spname powervc_sp
    Lu(Client Image)Name     Size(mb)       ProvisionType     %Used Unused(mb)     Lu Udid
    chmod666-homemade-image  55296          THIN                23% 0              d87113d5be9004791d28d44208150874
    [..]
    $ snapshot -clustername powervc_cluster -list -spname powervc_sp
    Lu(Client Image)Name     Size(mb)       ProvisionType     %Used Unused(mb)     Lu Udid
    [..]
    chmod666-homemade-image  55296          THIN                40% 0              d87113d5be9004791d28d44208150874
    $ snapshot -clustername powervc_cluster -list -spname powervc_sp
    Lu(Client Image)Name     Size(mb)       ProvisionType     %Used Unused(mb)     Lu Udid
    [..]
    chmod666-homemade-image  55296          THIN               100% 0              d87113d5be9004791d28d44208150874
    
  • You now have a new reference image. It will be used as the reference for all your linked-clone deployed virtual machines. A linked clone is created from a snapshot, so you first have to create a snapshot of the newly created image using the pooladm command (keep in mind that you can’t use the snapshot command to work on a Logical (Client Image) Unit). The snapshot is identified by the logical unit name followed by “@” and the snapshot name. Use mksnap to create the snap, and lssnap to show it. The snapshot will then be visible in the output of the snapshot command:
  • # pooladm file mksnap /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM/chmod666-homemade-image.d87113d5be9004791d28d44208150874@chmod666IMSnap
    # pooladm file lssnap /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM/chmod666-homemade-image.d87113d5be9004791d28d44208150874
    Primary Path         File Snapshot name
    ---------------------------------------
    /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM/chmod666-homemade-image.d87113d5be9004791d28d44208150874 chmod666IMSnap
    $ snapshot -clustername powervc_cluster -list -spname powervc_sp
    Lu(Client Image)Name     Size(mb)       ProvisionType     %Used Unused(mb)     Lu Udid
    chmod666-homemade-image  55296          THIN               100% 55299  d87113d5be9004791d28d44208150874
                    Snapshot
                    chmod666IMSnap
    [..]
    
  • You can now create the clone from the snap (snaps are identified by a ‘@’ character prefixed by the image name). Name the clone any way you want, because it will be renamed and moved to replace a normal logical unit; I’m using here the PowerVC convention (IMtmp). The creation of the clone creates a new file in the VOL1 directory with no shared storage pool meta-data, so this clone will not be visible in the output of the lu command:
  • $ pooladm file mkclone /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM/chmod666-homemade-image.d87113d5be9004791d28d44208150874@chmod666IMSnap /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/volume-boot-9117MMD_658B2AD-chmod666-IMtmp
    $ ls -l  /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/*chmod666-IM*
    -rwx------    1 root     system   57982058496 Sep  9 16:27 /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/volume-boot-9117MMD_658B2AD-chmod666-IMtmp
    
  • Using vioservice, create a logical unit on the shared storage pool. This will create a new logical unit with a newly generated udid. If you check in the volume directory you can notice that the clone does not have the meta-data file needed by the shared storage pool (this file is prefixed by a dot (.)). After creating this logical unit, replace it with your clone with a simple move:
  • $ cat create_client_lu.xml
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <VIO xmlns="http://ausgsa.austin.ibm.com/projects/v/vios/schema/1.20" version="1.20">
        <Request action="1">
            <Cluster udid="22c50a291c18ab11e489f46cae8b692f30">
                <Pool udid="24c50a291c18ab11e489f46cae8b692f30000000000AFFF80C0000000053DA327C">
                    <Tier>
                        <LU capacity="55296" type="1">
                        <Disk label="volume-boot-9117MMD_658B2AD-chmod666"/>
                        </LU>
                    </Tier>
                </Pool>
            </Cluster>
        </Request>
    </VIO>
    $ /usr/ios/sbin/vioservice lib/libvio/lu < create_client_lu.xml
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <VIO xmlns="http://ausgsa.austin.ibm.com/projects/v/vios/schema/1.20" version="1.20">
    
    <Response><Cluster udid="22c50a291c18ab11e489f46cae8b692f30" name="powervc_cluster"><Pool udid="24c50a291c18ab11e489f46cae8b692f30000000000AFFF80C0000000053DA327C" name="powervc_sp" raidLevel="0" overCommitSpace="0"><Tier udid="25c50a291c18ab11e489f46cae8b692f3019f95b3ea4c4dee1" name="SYSTEM" overCommitSpace="0"><LU udid="27c50a291c18ab11e489f46cae8b692f30e4d360832b29be950824d3e5bf57d777" capacity="55296" physicalUsage="0" unusedCommitment="0" type="1" derived="" thick="0" tmoveState="0"><Disk label="volume-boot-9117MMD_658B2AD-chmod666"/></LU></Tier></Pool></Cluster></Response></VIO>
    $ mv /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/volume-boot-9117MMD_658B2AD-chmod666-IMtmp /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/volume-boot-9117MMD_658B2AD-chmod666.e4d360832b29be950824d3e5bf57d777
    
  • You are ready to use your linked clone: you have a source image, a snap of it, and a clone of this snap:
  • # pooladm file lssnap /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM/chmod666-homemade-image.d87113d5be9004791d28d44208150874
    Primary Path         File Snapshot name
    ---------------------------------------
    /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM/chmod666-homemade-image.d87113d5be9004791d28d44208150874 chmod666IMSnap
    # pooladm file lsclone /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/volume-boot-9117MMD_658B2AD-chmod666.e4d360832b29be950824d3e5bf57d777
    Snapshot             Clone name
    ----------------------------------
    chmod666IMSnap /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/volume-boot-9117MMD_658B2AD-chmod666.e4d360832b29be950824d3e5bf57d777
    
  • Then, using vioservice or the mkdev command, map the clone to your virtual scsi adapter (identified by its physloc name) (do this on both Virtual I/O Servers):
  • $ cat map_clone.xml
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <VIO xmlns="http://ausgsa.austin.ibm.com/projects/v/vios/schema/1.20" version="1.20">
        <Request action="5">
            <Cluster udid="22c50a291c18ab11e489f46cae8b692f30">
                <Map label="" udid="27c50a291c18ab11e489f46cae8b692f30e4d360832b29be950824d3e5bf57d777" drcname="U9117.MMD.658B2AD-V2-C99"/>
            </Cluster>
        </Request>
    </VIO>
    $ /usr/ios/sbin/vioservice lib/libvio/lu < map_clone.xml
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <VIO xmlns="http://ausgsa.austin.ibm.com/projects/v/vios/schema/1.20" version="1.20">
    
    <Response><Cluster udid="22c50a291c18ab11e489f46cae8b692f30" name="powervc_cluster"/></Response></VIO>
    

    or

    # mkdev -t ngdisk -s vtdev -c virtual_target -aaix_tdev=volume-boot-9117MMD_658B2AD-chmod666.e4d360832b29be950824d3e5bf57d777 -audid_info=4d360832b29be950824d3e5bf57d77 -apath_name=/var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1 -p vhost5 -acluster_id=c50a291c18ab11e489f46cae8b692f30
    
  • Boot the machine ... this one is a linked clone created by yourself, without PowerVC.
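As a recap, the udids used in the xml files of this walkthrough are simple concatenations of the ids returned by the cluster and lssp commands. A minimal sketch (the "22"/"24" prefixes are simply the ones observed in PowerVC-generated requests; their exact meaning is undocumented):

```shell
# Rebuild the udids used in the vioservice xml files from the raw ids.
cluster_id="c50a291c18ab11e489f46cae8b692f30"  # from: cluster -list -field CLUSTER_ID
pool_id="000000000AFFF80C0000000053DA327C"     # from: lssp -clustername ... -field POOL_ID

cluster_udid="22${cluster_id}"                 # "22" prefix observed for clusters
pool_udid="24${cluster_id}${pool_id}"          # "24" prefix, cluster id then pool id

echo "Cluster udid: ${cluster_udid}"
echo "Pool udid:    ${pool_udid}"
```

These match the udid attributes of the Cluster and Pool elements in the xml files above.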

About the activation engine ?

Your captured image has the activation engine enabled. To reconfigure the network and the hostname, PowerVC copies an iso from the PowerVC server to the Virtual I/O Server. This iso contains an ovf file needed by the activation engine to customize your virtual machine. To customize the linked clone virtual machine I created on my own, I decided to re-use an old iso file created by PowerVC for another deployment:

  • Mount the iso image located in /var/vio/VMLibrary (the ovf file it contains will be copied and modified to fit your needs):
  • # ls -l /var/vio/VMLibrary
    total 840
    drwxr-xr-x    2 root     system          256 Jul 31 20:17 lost+found
    -r--r-----    1 root     system       428032 Sep  9 18:11 vopt_c07e6e0bab6048dfb23586aa90e514e6
    # loopmount -i vopt_c07e6e0bab6048dfb23586aa90e514e6 -o "-V cdrfs -o ro" -m /mnt
    
  • Copy the content of the cd to a directory :
  • # mkdir /tmp/mycd
    # cp -r /mnt/* /tmp/mycd
    
  • Edit the ovf file to fit your needs (in my case, for instance, I'm changing the hostname of the machine and its ip address):
  • # cat /tmp/mycd/ovf-env.xml
    <Environment xmlns="http://schemas.dmtf.org/ovf/environment/1" xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1" xmlns:ovfenv="http://schemas.dmtf.org/ovf/environment/1" xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData" xmlns:vssd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_VirtualSystemSettingData" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ovfenv:id="vs0">
        <PlatformSection>
        <Locale>en</Locale>
      </PlatformSection>
      <PropertySection>
      <Property ovfenv:key="com.ibm.ovf.vmcontrol.system.networking.ipv4defaultgateway" ovfenv:value="10.218.238.1"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.system.networking.hostname" ovfenv:value="homemadelinkedclone"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.adapter.networking.slotnumber.1" ovfenv:value="32"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.system.networking.dnsIPaddresses" ovfenv:value=""/><Property ovfenv:key="com.ibm.ovf.vmcontrol.adapter.networking.usedhcpv4.1" ovfenv:value="false"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.adapter.networking.ipv4addresses.1" ovfenv:value="10.218.238.140"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.adapter.networking.ipv4netmasks.1" ovfenv:value="255.255.255.0"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.system.networking.domainname" ovfenv:value="localdomain"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.system.timezone" ovfenv:value=""/></PropertySection>
    </Environment>
    
  • Recreate the cd using the mkdvd command and put it in the /var/vio/VMLibrary directory :
  • # mkdvd -r /tmp/mycd -S
    Initializing mkdvd log: /var/adm/ras/mkcd.log...
    Verifying command parameters...
    Creating temporary file system: /mkcd/cd_images...
    Creating Rock Ridge format image: /mkcd/cd_images/cd_image_19267708
    Running mkisofs ...
    
    mkrr_fs was successful.
    # mv /mkcd/cd_images/cd_image_19267708 /var/vio/VMLibrary
    $ lsrep
    Size(mb) Free(mb) Parent Pool         Parent Size      Parent Free
        1017     1015 rootvg                   279552           171776
    
    Name                                                  File Size Optical         Access
    cd_image_19267708                                             1 None            rw
    vopt_c07e6e0bab6048dfb23586aa90e514e6                         1 vtopt1          ro
    
  • Load the cdrom and map it to the linked clone :
  • $ mkvdev -fbo -vadapter vhost11
    $ loadopt -vtd vtopt0 -disk cd_image_19267708
    
  • When the linked clone virtual machine boots, the cd will be mounted and the activation engine will take the ovf file as parameter and reconfigure the network. For instance, you can check that the hostname has changed:
  • # hostname
    homemadelinkedclone.localdomain
    
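If you do this often, the hand edit of the ovf file can be scripted. Here is a minimal sketch with sed, working on a small sample file containing the two PowerVC property keys we care about (the /tmp/mycd path, the sample content, and the new hostname/ip values are assumptions for illustration):

```shell
# Create a sample ovf-env.xml fragment with the two PowerVC property keys
# (in real life this is the file copied from the mounted iso, as shown above).
mkdir -p /tmp/mycd
cat > /tmp/mycd/ovf-env.xml <<'EOF'
<Property ovfenv:key="com.ibm.ovf.vmcontrol.system.networking.hostname" ovfenv:value="homemadelinkedclone"/>
<Property ovfenv:key="com.ibm.ovf.vmcontrol.adapter.networking.ipv4addresses.1" ovfenv:value="10.218.238.140"/>
EOF

# Swap the hostname and ip values in place of hand editing.
sed -e 's/\(hostname" ovfenv:value="\)[^"]*/\1newclone/' \
    -e 's/\(ipv4addresses.1" ovfenv:value="\)[^"]*/\110.218.238.141/' \
    /tmp/mycd/ovf-env.xml > /tmp/mycd/ovf-env.xml.new
mv /tmp/mycd/ovf-env.xml.new /tmp/mycd/ovf-env.xml

grep hostname /tmp/mycd/ovf-env.xml
```

You can then rebuild the iso with mkdvd exactly as shown above.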

A view on the layout ?

I asked myself a question about Linked Clones: how can we check that Shared Storage Pool blocks (or PPs?) are shared between the captured machine (the captured LU) and one linked clone? To answer this question I had to play with the pooladm command (which is unsupported for customer use) to check the logical unit layout of the captured virtual machine and of the deployed linked clone, and then compare them. Please note that this is my understanding of linked clones; it is not validated by any IBM support. Do this at your own risk, and feel free to correct my interpretation of what I'm seeing here :-) :

  • Get the layout of the captured VM by getting the layout of the logical unit (the captured image is in my case located in /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM/volume-Image_7100-03-03-1415-c--5bd3991bdac84c48b519e19bfb1be381.e525b8eb474f54e1d34d9d02cb0b49b4) :
  • root@vios:/home/padmin# ls -l /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM
    total 339759720
    -rwx------    1 root     system   57982058496 Aug 12 17:53 volume-Image_7100-03-03-1415-SSP3e2066b2a7a9437194f48860affd56c0.ac671df86edaf07e96e399e3a2dbd425
    -rwx------    1 root     system   57982058496 Aug 18 19:15 volume-Image_7100-03-03-1415-c--5bd3991bdac84c48b519e19bfb1be381.e525b8eb474f54e1d34d9d02cb0b49b4
    # pooladm file layout /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/IM/volume-Image_7100-03-03-1415-c--5bd3991bdac84c48b519e19bfb1be381.e525b8eb474f54e1d34d9d02cb0b49b4 | tee /tmp/captured_vm.layout
    0x0-0x100000 shared
        LP 0xFE:0xF41000
        PP /dev/hdisk968 0x2E8:0xF41000
    0x100000-0x200000 shared
        LP 0x48:0x387F000
        PP /dev/hdisk866 0x1:0x387F000
    [..]
    
  • Get the layout of the linked clone (the linked clone is in my case located in /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/volume-boot-9117MMD_658B2AD-layo583b3eb5e98b495b992fdc3accc39bc3.54c172062957d73ec92e90d203d23fba)
  • # ls /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/volume-boot-9117MMD_658B2AD-layo583b3eb5e98b495b992fdc3accc39bc3.54c172062957d73ec92e90d203d23fba
    /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/volume-boot-9117MMD_658B2AD-layo583b3eb5e98b495b992fdc3accc39bc3.54c172062957d73ec92e90d203d23fba
    # pooladm file layout /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/volume-boot-9117MMD_658B2AD-layo583b3eb5e98b495b992fdc3accc39bc3.54c172062957d73ec92e90d203d23fba | tee /tmp/linked_clone.layout
    0x0-0x100000 shared
        LP 0xFE:0xF41000
        PP /dev/hdisk968 0x2E8:0xF41000
    0x100000-0x200000 shared
        LP 0x48:0x387F000
        PP /dev/hdisk866 0x1:0x387F000
    [..]
    
  • At this step you can compare the two files and already see some useful information, but do not misunderstand this output: you first have to sort it before drawing conclusions. You can be sure of one thing though: some PPs have been modified on the linked clone and cannot be shared anymore, while others are still shared between the linked clone and the captured image:
  • sdiff_layout1_modifed_1

  • You can get a better view of shared and non-shared PPs by sorting the output of these files. Here are the commands I used to do it :
  • #grep PP linked_clone.layout | tr -s " " | sort -k1 > /tmp/pp_linked_clone.layout
    #grep PP captured_vm.layout | tr -s " " | sort -k1 > /tmp/pp_captured_vm.layout
    
  • By sdiffing these two files I can now check which PPs are shared and which are not:
  • sdiff_layout2_modifed_1
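If sdiff output is hard to eyeball, the same comparison can be automated with comm(1). This is only a sketch: the sample layout lines below stand in for the real sorted /tmp/pp_captured_vm.layout and /tmp/pp_linked_clone.layout files generated above.

```shell
# Sketch: count shared vs clone-only PPs with comm(1) instead of sdiff.
# The sample files below stand in for the sorted PP lists produced above.
cat > /tmp/pp_captured_vm.layout <<'EOF'
PP /dev/hdisk866 0x1:0x387F000
PP /dev/hdisk968 0x2E8:0xF41000
PP /dev/hdisk968 0x400:0x100000
EOF
cat > /tmp/pp_linked_clone.layout <<'EOF'
PP /dev/hdisk866 0x1:0x387F000
PP /dev/hdisk968 0x2E8:0xF41000
PP /dev/hdisk970 0x10:0x200000
EOF
# comm -12 keeps lines common to both files (PPs still shared with the image),
# comm -13 keeps lines found only in the clone (PPs that were modified).
shared=$(comm -12 /tmp/pp_captured_vm.layout /tmp/pp_linked_clone.layout | wc -l | tr -d ' ')
clone_only=$(comm -13 /tmp/pp_captured_vm.layout /tmp/pp_linked_clone.layout | wc -l | tr -d ' ')
echo "shared=$shared clone_only=$clone_only"
```

On a real system, feed comm the two sorted files produced by the grep commands above instead of these samples (comm requires sorted input).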

  • The pooladm command can also give you statistics about a linked clone. My understanding of the owned block count is that 78144 SSP blocks (not PPs, so 4k blocks) are unique to this linked clone and not shared with the captured image :
  • vios1#pooladm file stat /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/volume-boot-9117MMD_658B2AD-layo583b3eb5e98b495b992fdc3accc39bc3.54c172062957d73ec92e90d203d23fba
    Path: /var/vio/SSP/powervc_cluster/D_E_F_A_U_L_T_061310/VOL1/volume-boot-9117MMD_658B2AD-layo583b3eb5e98b495b992fdc3accc39bc3.54c172062957d73ec92e90d203d23fba
    Size            57982058496
    Number Blocks   14156655
    Data Blocks     14155776
    Pool Block Size 4096
    
    Tier: SYSTEM
    Owned Blocks    78144
    Max Blocks      14156655
    Block Debt      14078511
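My reading of these counters (my own interpretation, not an official formula) is that Owned Blocks plus Block Debt equals Max Blocks, and that multiplying the owned count by the 4096-byte pool block size gives the space truly unique to the clone. Checking that against the numbers above:

```shell
# Sanity-check the pooladm counters shown above (my interpretation of them).
owned=78144        # Owned Blocks: blocks unique to the linked clone
max=14156655       # Max Blocks
debt=14078511      # Block Debt
# Owned + Debt should add up to Max:
[ $((owned + debt)) -eq "$max" ] && echo "counters consistent"
# Space unique to the clone, with 4096-byte pool blocks (in MB):
awk -v o="$owned" 'BEGIN { printf "owned_mb=%.0f\n", o * 4096 / 1048576 }'
```

So only about 305 MB of this 54 GB logical unit is actually owned by the clone; everything else is still shared with the captured image.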
    

Mixed NPIV & SSP deployment

For some machines with an I/O-intensive workload, it can be useful to put your data luns on an NPIV adapter. I'm currently working on a project involving PowerVC and the question was asked: why not mix SSP luns for rootvg and NPIV-based luns for the data volume group? One more time it's very simple with PowerVC: just attach a volume, this time choosing your Storage Volume Controller provider ... easy :

mixed1_masked

This will create the NPIV adapters and configure the new zoning and masking on the fibre channel switches. One more time, easy ...

Debugging ?

I'll not lie: I had a lot of problems with Shared Storage Pools and PowerVC, but these problems were related to my configuration changing a lot during the tests. Always remember that you'll learn from these errors; in my case they helped me a lot to debug PowerVC :

  • From the Virtual I/O Server side, check that you have no core file in the /home/ios/logs directory. A core file in this directory indicates that one of the commands run by PowerVC just dumped core :
  • root@vios1:/home/ios/logs# ls core*
    core.9371682.18085943
    
  • From the Virtual I/O Server side, check /home/ios/logs/viosvc.log. You can inspect all the xml files and all the outputs used by the vioservice command. Most PowerVC actions are performed through the vioservice command :
  • root@vios1:/home/ios/logs# ls -l viosvc.log
    -rw-r--r--    1 root     system     10240000 Sep 11 00:28 viosvc.log
    
  • Step by step, check that all PowerVC actions are ok. For instance, verify with the lsrep command that the iso has been copied from PowerVC to the Virtual I/O Server media library, and check that there is space left in the Shared Storage Pool.
  • Sometimes the undocumented vioservice API is stuck and not responding. In some cases it can be useful to rebuild the SolidDB. I'm using this script to do it (run it as root) :
  • # cat rebuilddb.sh
    #!/usr/bin/ksh
    set -x
    stopsrc -s vio_daemon
    sleep 30
    rm -rf /var/vio/CM
    startsrc -s vio_daemon
    
  • EDIT: I got further information from IBM about the method to rebuild the SolidDB. Using my script alone won't bring the SolidDB back up properly and could leave you in a bad state. Just add this at the end of the script :
  • pid=$(lssrc -s vio_daemon | awk 'NR==2 {print $2}')
    kill -1 $pid  
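The SIGHUP step relies on extracting the daemon PID from the lssrc output. Since parsing column output is easy to get wrong, here is the awk extraction exercised against a canned, illustrative lssrc-style listing (the real input is `lssrc -s vio_daemon` run as root):

```shell
# Sketch: extract the vio_daemon PID from 'lssrc -s vio_daemon'-style output
# so it can be sent SIGHUP. The sample listing below is illustrative only.
sample='Subsystem         Group            PID          Status
 vio_daemon                        9371682      active'
# Line 2 carries the data; with an empty Group column, the PID is field 2.
pid=$(printf '%s\n' "$sample" | awk 'NR==2 {print $2}')
echo "pid=$pid"
# On a real Virtual I/O Server you would then run: kill -1 "$pid"
```

Note that the Group column is empty for vio_daemon; if it were populated, the PID would shift to field 3 and the awk expression would need adjusting.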
    
  • On the PowerVC side, when you have a problem it is always good to increase the verbosity of the logs (located in /var/log; in this case, nova). Restart PowerVC after setting the verbosity level :
  • # openstack-config --set /etc/nova/nova-9117MMD_658B2AD.conf DEFAULT default_log_levels powervc_nova=DEBUG,powervc_k2=DEBUG,nova=DEBUG
    

Conclusion

It took me more than two months to write this post. Why? Simply because the PowerVC design is not documented. It works like a charm, but nobody will explain to you HOW. I hope this post will help you to understand how PowerVC works. I'm a huge fan of PowerVC and SSP; try them by yourself and you'll see that they are a pleasure to use. It's simple, efficient, and powerful. Can anybody give me access to a PowerKVM host so I can write about (and prove) that PowerVC is also simple and efficient with PowerKVM ... ?

Virtual I/O Server 2.2.3.1 Part 1 : Shared Storage Pool enhancements

Everybody knows that I'm a huge fan of Shared Storage Pools; you can check my previous posts on this subject on the blog. With version 2.2.3.1 of the Virtual I/O Server, Shared Storage Pools have been enhanced with some cool features: simplified management, pool mirroring, pool shrinking and the long awaited unicast mode to get rid of multicast. This post will show you that Shared Storage Pools are now powerful and ready to be used on production servers (this is my own opinion). I'll not talk about it later in this post, but be aware that the maximum pool size is now 16TB and the pool can now serve 250 Virtual I/O clients. Here we go :

Rolling updates

Rolling updates have been available since Virtual I/O Server 2.2.2.0 but this is the first time I am using them. This feature is not a new enhancement brought by version 2.2.3.1, but it is still good to write about it :-). Rolling updates allow you to update the Virtual I/O Servers (members of a Shared Storage Pool) one by one without causing an outage of the entire cluster. Each Virtual I/O Server has to leave the cluster before starting the update and can rejoin the cluster after its update and reboot. A special node called the database primary node (DBN) checks every ten minutes whether Virtual I/O Servers are ON_LEVEL or UP_LEVEL. Before trying the new Shared Storage Pool features I had to update my Virtual I/O Servers, so here is a little reminder on how to do it with rolling updates :

  • Who is the current database primary node in the cluster? (This is the last one I'll update) :
  • # cluster -clustername vio-cluster -status -verbose -fmt : -field pool_name pool_state node_name node_mtm node_partition_num node_upgrade_status node_roles
    vio-ssp:OK:vios1.domain.test:8202-E4C02064099R:2:ON_LEVEL:
    vio-ssp:OK:vios2.domain.test:8202-E4C02064099R:1:ON_LEVEL:
    vio-ssp:OK:vios3.domain.test:8202-E4C02064011R:2:ON_LEVEL:DBN
    vio-ssp:OK:vios4.domain.test:8202-E4C02064011R:1:ON_LEVEL:
    
  • The DBN is going to move from one Virtual I/O Server to another at the moment the current DBN leaves the cluster :
  • vio-ssp:OK:vios1.domain.test:8202-E4C02064099R:2:UP_LEVEL:
    vio-ssp:OK:vios2.domain.test:8202-E4C02064099R:1:UP_LEVEL:
     : :vios3.domain.test:8202-E4C02064011R:2:ON_LEVEL:
    vio-ssp:OK:vios4.domain.test:8202-E4C02064011R:1:UP_LEVEL:DBN
    
  • On the Virtual I/O Server you want to update, leave the cluster before running the update :
  • # clstartstop -stop -n vio-cluster -m $(hostname)
    # mount nim_server:/export/nim/lpp_source/vios2231-fp27-lpp_source /mnt
    # updateios -dev /mnt -accept -install
    
  • When the update is finished, rejoin the cluster :
  • # ioslevel
    2.2.3.1
    # clstartstop -start -n vio-cluster -m $(hostname)
    
  • On any Virtual I/O Server you can list the cluster and check whether the Virtual I/O Servers are ON_LEVEL or UP_LEVEL :
  • # cluster -clustername vio-cluster -status -verbose -fmt : -field pool_name pool_state node_name node_mtm node_partition_num node_upgrade_status node_roles
    vio-ssp:OK:vios1.domain.test:8202-E4C02064099R:2:ON_LEVEL:
    vio-ssp:OK:vios2.domain.test:8202-E4C02064099R:1:UP_LEVEL:
    vio-ssp:OK:vios3.domain.test:8202-E4C02064011R:2:ON_LEVEL:DBN
    vio-ssp:OK:vios4.domain.test:8202-E4C02064011R:1:UP_LEVEL:
    
  • All backing devices served by the Shared Storage Pool remain available on each node, whether it is ON_LEVEL or UP_LEVEL.
  • When all the Virtual I/O Servers are updated (including the last one, in my case the DBN), every node_upgrade_status will be ON_LEVEL. Remember that you may have to wait up to ten minutes before all Virtual I/O Servers show ON_LEVEL :
  • $ cluster -clustername vio-cluster -status -verbose -fmt : -field pool_name pool_state node_name node_mtm node_partition_num node_upgrade_status node_roles
    vio-ssp:OK:vios1.domain.test:8202-E4C02064099R:2:2.2.3.1 ON_LEVEL:
    vio-ssp:OK:vios2.domain.test:8202-E4C02064099R:1:2.2.3.1 ON_LEVEL:DBN
    vio-ssp:OK:vios3.domain.test:8202-E4C02064011R:2:2.2.3.1 ON_LEVEL:
    vio-ssp:OK:vios4.domain.test:8202-E4C02064011R:1:2.2.3.1 ON_LEVEL:
    
  • The Shared Storage Pool upgrade is performed by the root user's crontab. You can run the sspupgrade command by hand to check whether any Shared Storage Pool upgrade is running :
  • # crontab -l | grep sspupgrade
    0,10,20,30,40,50 * * * * /usr/sbin/sspupgrade -upgrade
    # sspupgrade -status
    No Upgrade in progress
    
  • If all nodes are not at the same ioslevel you'll not be able to use the new commands (check the output below) :
  • # failgrp -list
    The requested operation can not be performed since the software capability is currently not enabled.
    Please upgrade all nodes within the cluster and retry the request once the upgrade has completed successfully.
    
  • If you are moving from any earlier version to 2.2.3.1, the communication mode of the cluster will change from multicast to unicast as part of the rolling upgrade operation. You have nothing to do.
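Since the cluster command can emit colon-separated output, checking whether the rolling update has converged is easy to script. Here is a sketch against a canned sample of the output shown above (field 6 is node_upgrade_status; on a real system the input would come from the cluster -status -fmt : command):

```shell
# Sketch: count cluster nodes whose node_upgrade_status is not yet ON_LEVEL.
# The canned sample stands in for the output of:
#   cluster -clustername vio-cluster -status -verbose -fmt : \
#     -field pool_name pool_state node_name node_mtm node_partition_num \
#     node_upgrade_status node_roles
status='vio-ssp:OK:vios1.domain.test:8202-E4C02064099R:2:ON_LEVEL:
vio-ssp:OK:vios2.domain.test:8202-E4C02064099R:1:UP_LEVEL:
vio-ssp:OK:vios3.domain.test:8202-E4C02064011R:2:ON_LEVEL:DBN
vio-ssp:OK:vios4.domain.test:8202-E4C02064011R:1:UP_LEVEL:'
# Matching on /ON_LEVEL/ also accepts "2.2.3.1 ON_LEVEL" after the commit.
pending=$(printf '%s\n' "$status" | awk -F: '$6 !~ /ON_LEVEL/ { n++ } END { print n+0 }')
echo "nodes_pending=$pending"
```

When nodes_pending reaches 0 the whole cluster has committed the new level.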

Mirroring the Shared Storage Pool with failgrp command

One of the major drawbacks of Shared Storage Pools was resilience: the Shared Storage Pool was not able to mirror luns from one site to another. The failgrp command introduced in this new version is used to mirror the Shared Storage Pool across different SAN arrays and easily answers these resiliency questions. In my opinion this was the missing feature needed to deploy Shared Storage Pools in a production environment. One of the coolest things about it is that you have NOTHING to do at the lpar level. All the mirroring is performed by the Virtual I/O Servers and the Shared Storage Pool themselves; there is no need at all to verify volume group mirroring on each lpar. The single point of management for mirroring is now the failure group. :-)

  • By default, after upgrading all nodes to 2.2.3.1, all luns are assigned to a failure group called Default; you can rename it if you want to. Notice that the Shared Storage Pool is not mirrored at this stage :
  • $ failgrp -list -fmt : -header
    POOL_NAME:TIER_NAME:FG_NAME:FG_SIZE:FG_STATE
    vio-ssp:SYSTEM:Default:55744:ONLINE
    $ failgrp -modify -clustername vio-cluster -sp vio-ssp -fg Default -attr FG_NAME=failgrp_site1
    Given attribute(s) modified successfully.
    $  failgrp -list -fmt : -header
    POOL_NAME:TIER_NAME:FG_NAME:FG_SIZE:FG_STATE
    vio-ssp:SYSTEM:failgrp_site1:55744:ONLINE
    $ cluster -clustername vio-cluster -status -verbose 
    [..]
        Pool Name:            vio-ssp
        Pool Id:              000000000AFD672900000000529641F4
        Pool Mirror State:    NOT_MIRRORED
    [..]
    
  • Create the second failure group on the second site (be careful with the command syntax) :
  • $ failgrp -create -clustername vio-cluster -sp vio-ssp -fg failgrp_site2: hdiskpower141
    failgrp_site2 FailureGroup has been created successfully.
    $ failgrp -list -fmt : -header
    POOL_NAME:TIER_NAME:FG_NAME:FG_SIZE:FG_STATE
    vio-ssp:SYSTEM:failgrp_site1:223040:ONLINE
    vio-ssp:SYSTEM:failgrp_site2:223040:ONLINE
    
  • While the failgrp command is mirroring the pool (and once it has finished) you can check that your pool is in SYNCED state :
  • $ cluster -status -clustername vio-cluster -verbose
        Pool Name:            vio-ssp
        Pool Id:              000000000AFD672900000000529641F4
        Pool Mirror State:    SYNCED
    
  • To identify luns in the Shared Storage Pool, a new command called pv is introduced. You can quickly identify the luns used by the Shared Storage Pool together with their respective failure groups :
  • $ pv -list
    
    POOL_NAME: vio-ssp
    TIER_NAME: SYSTEM
    FG_NAME: failgrp_site1
    PV_NAME          SIZE(MB)    STATE            UDID
    hdiskpower156    55770       ONLINE           1D06020F6709SYMMETRIX03EMCfcp
    hdiskpower1      111540      ONLINE           1D06020F6909SYMMETRIX03EMCfcp
    
    POOL_NAME: vio-ssp
    TIER_NAME: SYSTEM
    FG_NAME: failgrp_site2
    PV_NAME          SIZE(MB)    STATE            UDID
    hdiskpower2      223080      ONLINE           1D06020F6B09SYMMETRIX03EMCfcp
    

Mapping made easier with the lu command

If, like me, you have ever used a Shared Storage Pool, you probably already know that mapping a device was not so easy with the mkbdsp command. Once again the new version of the Virtual I/O Server tries to simplify logical unit and backing device management, with the addition of a new command called lu. Its purpose is to manage logical unit creation, listing, mapping and removal in a single easy-to-use command. I'll not detail the whole command here, but here are a few nice examples :

  • Create a thin logical unit called lu01 and map it to vhost4 as backing device bd_lu01 :
  • $ lu -create -clustername vio-cluster -sp vio-ssp -lu lu01 -vadapter vhost4 -vtd bd_lu01 -size 10G
    Lu Name:lu0011
    Lu Udid:e982bf85313bcafe0af7653e8e39c3d9
    
    Assigning logical unit 'lu01' as a backing device.
    
    VTD:bd_lu01
    
  • Do not forget to map it on the second Virtual I/O Server :
  • $ lu -map -clustername vio-cluster -sp vio-ssp -lu lu01 -vadapter vhost4 -vtd bd_lu01
    Assigning logical unit 'lu01' as a backing device.
    
    VTD:bd_lu01
    
  • List all logical units in the Shared Storage Pool :
  • $ lu -list
    POOL_NAME: vio-ssp
    TIER_NAME: SYSTEM
    LU_NAME                 SIZE(MB)    UNUSED(MB)  UDID
    lu01                  10240       10240       e982bf85313bcafe0af7653e8e39c3d9
    [..]
    

Physical volume management made easier with the pv command

Before this new release, the Shared Storage Pool could replace a lun using the chsp command, but you were not able to remove a lun from the Shared Storage Pool. The new release aims to simplify Shared Storage Pool management, and a new command called pv is introduced to add, replace, remove and list luns in the Shared Storage Pool with a single, easy command.

  • Adding a disk to the Shared Storage Pool :
  • $ pv -add -clustername vio-cluster -sp vio-ssp -fg failgrp_site1: hdiskpower1
    Given physical volume(s) have been added successfully.
    
  • Replacing a disk in the Shared Storage Pool (note that you can't replace a disk with a smaller one, even if the larger one is not fully used) :
  • $ pv -replace -clustername vio-cluster -sp vio-ssp -oldpv hdiskpower2 -newpv hdiskpower1
    Current request action progress: % 5
    Current request action progress: % 100
    The capacity of the new device(s) is less than the capacity
    of the old device(s). The old device(s) cannot be replaced.
    $ pv -replace -clustername vio-cluster -sp vio-ssp -oldpv hdiskpower156 -newpv hdiskpower1
    Current request action progress: % 5
    Current request action progress: % 6
    Current request action progress: % 100
    Given physical volume(s) have been replaced successfully.
    
  • Unfortunately the REPDISK cannot be replaced by this command; you have to use the chrepos command :
  • $ lspv | grep hdiskpower0
    hdiskpower0      00f7407858d6a19d                     caavg_private    active
    $ pv -replace -oldpv hdiskpower0 -newpv hdiskpower156
    The specified PV is not part of the storage pool
    
  • Used PVs can easily be listed with this command :
  • $ pv -list -verbose -fmt : -header
    POOL_NAME:TIER_NAME:FG_NAME:PV_NAME:PV_SIZE:PV_STATE:PV_UDID:PV_DESC
    vio-ssp:SYSTEM:failgrp_site1:hdiskpower1:111540:ONLINE:1D06020F6909SYMMETRIX03EMCfcp:PowerPath Device
    vio-ssp:SYSTEM:failgrp_site2:hdiskpower2:223080:ONLINE:1D06020F6B09SYMMETRIX03EMCfcp:PowerPath Device
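The colon-separated pv -list output also lends itself to quick reporting, for example totalling the capacity of each failure group. Here is a sketch using canned lines matching the outputs shown above (on a real system the input would come from `pv -list -verbose -fmt :`):

```shell
# Sketch: total the size of each failure group from colon-separated
# 'pv -list -verbose -fmt :' output (pool:tier:fg:pv:size_mb:state:udid:desc).
# Canned sample lines stand in for the real command output.
pvlist='vio-ssp:SYSTEM:failgrp_site1:hdiskpower1:111540:ONLINE:1D06020F6909SYMMETRIX03EMCfcp:PowerPath Device
vio-ssp:SYSTEM:failgrp_site1:hdiskpower156:55770:ONLINE:1D06020F6709SYMMETRIX03EMCfcp:PowerPath Device
vio-ssp:SYSTEM:failgrp_site2:hdiskpower2:223080:ONLINE:1D06020F6B09SYMMETRIX03EMCfcp:PowerPath Device'
sums=$(printf '%s\n' "$pvlist" \
  | awk -F: '{ mb[$3] += $5 } END { for (fg in mb) printf "%s=%dMB\n", fg, mb[fg] }' \
  | sort)
echo "$sums"
```

A mismatch between the two failure-group totals is a quick hint that one mirror side has less raw capacity than the other.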
    

    Removing a disk from the Shared Storage Pool

    It is now possible to remove a disk from the Shared Storage Pool; this new feature is known as pool shrinking.

  • Removing a disk from the Shared Storage Pool :
  • $ pv -remove -clustername vio-cluster -sp vio-ssp -pv hdiskpower1
    Given physical volume(s) have been removed successfully.
    

    Repository disk failure and replacement

    The Shared Storage Pool is now able to stay up without its repository disk. Having an error on the repository disk is no longer a problem at all: the disk can be replaced with the chrepos command as soon as you find an error on it, and it's pretty easy to do. Here is a little reminder :

    • To identify the repository disk you can use the CAA clras command as root :
    • # /usr/lib/cluster/clras lsrepos
      [..]
      hdisk558 has a cluster repository signature.
      hdisk559 has a cluster repository signature.
      hdisk560 has a cluster repository signature.
      Cycled 680 disks.
      hdisk561 has a cluster repository signature.
      hdiskpower139 has a cluster repository signature.
      Cycled 690 disks.
      Found 5 cluster repository disks.
      [..]
      
    • For this example we are going to write some zeros to the repository disk, showing that it is no longer readable by the Shared Storage Pool :
    • # lsvg -p caavg_private
      caavg_private:
      PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
      hdiskpower0       active            15          8           02..00..00..03..03
      # lsvg -l caavg_private
      caavg_private:
      LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
      caalv_private1      boot       1       1       1    closed/syncd  N/A
      caalv_private2      boot       1       1       1    closed/syncd  N/A
      caalv_private3                 4       4       1    open/syncd    N/A
      powerha_crlv        boot       1       1       1    closed/syncd  N/A
      #  dd if=/dev/zero of=/dev/hdiskpower0 bs=1024 count=1024
      1024+0 records in
      1024+0 records out
      # lsvg -l caavg_private
      0516-066 : Physical volume is not a volume group member.
              Check the physical volume name specified.
      
    • The REPDISK cannot be totally removed; you have to replace it with another lun in case of emergency :
    • $ chrepos -n vio-cluster -r -hdiskpower0
      chrepos: The removal of repository disks is not currently supported.
      
    • If the repository disk has a problem you can move it to another disk by using the chrepos command :
    • $ chrepos -r -hdiskpower0,+hdiskpower141
      chrepos: Successfully modified repository disk or disks.
      

    Cluster unicast communication by default

    Using unicast in a Cluster Aware AIX (CAA) cluster is a long awaited feature asked for by IBM customers. As everybody knows, Shared Storage Pools are based on a Cluster Aware AIX cluster, which used multicast before the 2.2.3.1 release. After updating all the Virtual I/O Servers that are members of a Shared Storage Pool, the communication_mode attribute used by CAA will be changed to unicast. This attribute does not exist at all in previous versions, so do not try to modify it there. With this new feature a new command called clo becomes available to padmin; it lets you check and change CAA tunables, communication_mode included :

    • Before updating the Virtual I/O Server to 2.2.3.1 the communication_mode tunable does not exist; you can check the CAA tunables with the clctrl command (you have to be root) :
    • # clctrl -tune -a
         vio-cluster(791d19c8-5796-11e3-ac2c-5cf3fcea5648).config_timeout = 480
           vio-cluster(791d19c8-5796-11e3-ac2c-5cf3fcea5648).deadman_mode = a
           vio-cluster(791d19c8-5796-11e3-ac2c-5cf3fcea5648).link_timeout = 0
        vio-cluster(791d19c8-5796-11e3-ac2c-5cf3fcea5648).node_down_delay = 10000
           vio-cluster(791d19c8-5796-11e3-ac2c-5cf3fcea5648).node_timeout = 20000
             vio-cluster(791d19c8-5796-11e3-ac2c-5cf3fcea5648).packet_ttl = 32
       vio-cluster(791d19c8-5796-11e3-ac2c-5cf3fcea5648).remote_hb_factor = 10
             vio-cluster(791d19c8-5796-11e3-ac2c-5cf3fcea5648).repos_mode = e
      vio-cluster(791d19c8-5796-11e3-ac2c-5cf3fcea5648).site_merge_policy = p
      
    • As you can see in the output below, running the lscluster command on a Shared Storage Pool before version 2.2.3.1 gives you the multicast address used by the CAA cluster :
    • $ lscluster -i | grep MULTICAST
                      IPv4 MULTICAST ADDRESS: 228.253.103.41
                      IPv4 MULTICAST ADDRESS: 228.253.103.41
                      IPv4 MULTICAST ADDRESS: 228.253.103.41
                      IPv4 MULTICAST ADDRESS: 228.253.103.41
      
    • After updating a node to version 2.2.3.1, a new command named clo will be available in the /usr/ios/utils directory, along with a new tunable called communication_mode :
    • $ oem_setup_env
      # ls -l /usr/ios/utils/clo
      lrwxrwxrwx    1 root     system           20 Dec  2 10:20 /usr/ios/utils/clo -> /usr/lib/cluster/clo
      # exit
      $ clo -L communication_mode
      NAME                      DEF    MIN    MAX    UNIT           SCOPE
           ENTITY_NAME(UUID)                                                CUR
      --------------------------------------------------------------------------------
      communication_mode        m                                   c
           vio-cluster(791d19c8-5796-11e3-ac2c-5cf3fcea5648)                m
      --------------------------------------------------------------------------------
      
    • If all nodes in the cluster are not at the same level (i.e. your update is not finished) you'll not be able to use this tunable :
    • # clo -o communication_mode
      clo: 1485-506 The current cluster level does not support tunable communication_mode.
      
    • After updating all nodes to 2.2.3.1 (all nodes ON_LEVEL), the communication_mode will automatically be configured to unicast :
    • $  clo -o communication_mode
      vio-cluster(791d19c8-5796-11e3-ac2c-5cf3fcea5648).communication_mode = u
      

    To sum up: in my opinion Shared Storage Pools are now production ready. I've never seen them in a production nor even in a development environment, but Shared Storage Pools really deserve to be used. Things are drastically simplified, even more so with this new version. Please try the new Shared Storage Pools and give me your feedback in the comments. Once again, I hope it helps.

    IBM Oct 8 2013 Announcement : A technical point of view after Enterprise 2013

    PowerVM and Power Systems lovers like me have probably heard a lot of things since Oct 8 2013. Unfortunately, with this big mass of information, some of you, like me, may be lost among all these new things coming. I personally feel the need to clarify things from a technical point of view and to take a deeper look at these announcements. Finding technical detail in the announcement was not easy, but Enterprise 2013 gave us a lot of information about the shape of things to come. Is this a new era for Power Systems? Are our jobs going to change with the rise of the cloud? Will PowerVM be replaced by KVM in a few years? Will AIX be replaced by Linux on Power? I can easily give you an answer: NO! Have a look below and you will be as excited as I am. Power Systems are awesome, and they are going to be even better with all these new products :-).

    PowerVC

    PowerVC stands for Power Virtualization Center. PowerVC lets you manage your virtualized infrastructure by capturing, deploying, creating and moving virtual machines. It's based on OpenStack and has to be installed on a Linux machine (Red Hat Enterprise Linux 6, running on x86 or Power). PowerVC runs on top of a Power Systems infrastructure (Power6 or Power7 hardware, Hardware Management Console or IVM). It comes in two editions: standard (allowing Power6 and Power7 management with an HMC) and express (allowing only Power7 management with an IVM). At the time of writing this post, PowerVC manages storage only for the IBM V-series and IBM SVC, and Brocade SAN switches are the only ones supported. In a few words, here is what you have to remember :

    • PowerVC runs on top of Hardware and HMC/IVM, like VMcontrol.
    • PowerVC allows you to manage and create (capture, deploy, …) virtual machines.
    • PowerVC allows you to move and automatically place virtual machines (resource pooling, dynamic virtual machines placement).
    • PowerVC is based on OpenStack API. IBM modifications and enhancements to OpenStack are committed to the community.
    • PowerVC only runs on RHEL 6.4 x86 or Power, RHEL support is not included.
    • PowerVC express manages Power7, Power7+ hardware.
    • PowerVC standard manages Power6, Power7 and Power7+ hardware.
    • Storage Systems SVC-family is mandatory (SVC/V7000/V3700/V3500)
    • PowerVC does not install the Virtual I/O Servers nor the IVM. The virtual infrastructure has to be installed beforehand; it's a mandatory prerequisite.
    • PowerVC express edition needs at least Virtual I/O Server 2.2.1.5, storage can be pre-zoned (I don’t know if it can be pre-masked), limit of five managed hosts with a total of 100 maximum managed LPARs.
    • PowerVC standard edition needs HMC 7.7.8 and Virtual I/O Server 2.2.3.0, storage can’t be pre-zoned and only supported SAN Switch is Brocade. It’s limited to ten managed hosts with a total of 400 LPARs.
    • PowerVC1

      In my opinion PowerVC will replace VMcontrol in the near future. It's a shame for me because I spent so much time on VMcontrol, but I think it's a good thing for Power Systems. PowerVC seems to be easy to deploy, easy to manage and seems to have a nice look and feel; my only regret: it only runs on Linux :-(. So keep an eye on this product because it's the future of Power Systems management. You can see it as a VMware vCenter for Power Systems ;-)

      PowerVC2

    PowerVP

    PowerVP was first known as Sleuth and started as an internal IBM tool. It's a performance analysis tool looking at everything from the whole machine down to the lpar. No joke, you can compare it to the lssrad command; it seems to be a graphical version of it. Nice visuals tell you how the hardware resources are assigned and consumed by the lpars. Three views are available :

    • The System Topology view shows the hardware topology, how busy the chips are, and how busy the traffic between the chips is :
    • PowerVP2
      PowerVP1

    • The Node view shows you how the cores of each chip are consumed and gives you information on the memory controllers, busses, and traffic from/to remote I/O :
    • PowerVP3

    • The Partition view seems more classical and gives you information on CPU, memory, disk, Ethernet, cache and memory affinity. You can drill down on each of these statistics :-) :
    • PowerVP4

    Some agents are needed by PowerVP. For partition drill-down an agent seems to be mandatory (oh no, not again); for whole-system monitoring a kind of "super-agent" is also needed (to be installed on one partition?)

    PowerVPcollectors

    I don't know why, but the PowerVP graphical interface is a Java/Swing application. In 2013 everybody wants a web interface; I really do not understand this choice. All data can be monitored on the fly and recorded (for analysis and comparison). The product is included in PowerVM Enterprise Edition and needs at least firmware level 770 (780 for high-end machines).

    PowerVM Virtual I/O Server 2.2.3.0

    Shared Storage Pool 4 (SSP4)

    As a big fan of Shared Storage Pools, I had long awaited SSPv4. It enables a few cool things; the one I was waiting for the most is SSP mirroring. You can now mask luns from two different SAN arrays and the mirroring is performed by the Virtual I/O Server, with nothing to do on the Virtual I/O client :-). This new feature comes with a new command: failgrp. By default, when the SSP is created, the failover group is named Default. You can also check which pv belongs to which failover group with the new pv command; this command lets you check the SAN array id by looking at the pv's udid. It also lets you check whether pvs are capable (for example, pvs coming from iscsi are not SSP capable). Here are a few cool command examples (sorry, without the output; these are deduced from recent Nigel G. tests :-)).

    # failgrp -list
    # pv -list -capable
    # failgrp -create -fg SANARRAY2: hdisk12 hdisk13 hdisk14
    # failgrp -modify -fg Default -attr fg_name=SANARRAY1
    # pv -add -fg SANARRAY1: hdisk15 hdisk16 SANARRAY2: hdisk17 hdisk18
    # pv -list
    

    SSP4 also simplifies SSP management with new commands. The lu command allows you to create a "lu" in the SSP and to allocate it to a vhost in one command.

    • lu creation and mapping in one command :
    # lu -create -lu lpar1rootvg -size 32G -vadapter vhost3
    
    • lu creation and mapping in two commands :
    # lu -create -lu lpar1rootvg
    # lu -map -lu lpar1rootvg -vadapter vhost3
    

    The last thing to say about SSPv4: you can now remove a lun from the storage pool! With all these new features SSP can finally be used on production servers.

    Throwing away Control Channel from Shared Ethernet Adapter

    To simplify Virtual I/O Server management and SEA creation, it is now possible to create an SEA failover configuration without creating or specifying any control channel adapter. Both Virtual I/O Servers can now detect that an SEA failover pair has been created and use their default virtual adapter for the control channel. You can find more details on this subject on Scott Vetter's blog here. A funny thing to notice is that an APAR was created this year on this subject; this APAR leaks the name of Project K2 (the IBM internal name for PowerVM simplification): IV37193. Ok, to sum up, here are the technical things to remember about this Shared Ethernet Adapter simplification :

    • The creation of the SEA requires HMC V7.7.8, Virtual I/O Server 2.2.3.0, and firmware 780 (it seems to be supported only on Power7 and Power7+).
    • The end user can't see it, but the control channel function is using the SEA's default adapter.
    • The end user can't see it, but the control channel function is using a special VLAN id: 4095.
    • A standard SEA with a classic control channel adapter can still be created.
    • My understanding of this feature is that it has a limitation: you can create only one Shared Ethernet Adapter per virtual switch. It seems that the HMC checks that there is only one adapter with priority 1 and one adapter with priority 2 per virtual switch.
    • The command to create the SEA is still the same; if you omit the ctl_chan attribute, the SEA will automatically use the management VLAN 4095 for the control channel.
    • A special discovery protocol is implemented on the Virtual I/O Server to automatically enable the "ghost" control channel.

    … other few things

    The VIOS advisor has been updated: it can now monitor Shared Storage Pools and NPIV adapters, and the part command is updated accordingly. There are also some additional statistics and listings for the SEA.

    Virtual network adapters are updated with a ping feature to check whether the adapter is up or down. A virtual adapter can now detect a physical link loss/failure and be considered down. Is this feature the end of the netmon.cf file in PowerHA?

    Conclusion, OpenPower, Power8, KVM

    I can't finish this post without writing a few lines about the huge global things to come. A new consortium called OpenPower has been created to work on Power8. It means that some third-party manufacturers will be able to build their own implementations of Power8 processors. Speaking of Power8, it is on the way and seems to be a real performance monster; the first entry-class systems will probably be available in Q3/Q4 2014. But Power8 is not going to be released alone: KVM is going to be ported to Power8 hardware. Keep in mind that this will not provide nested virtualization (the purpose is not to run KVM on AIX, or to run AIX on KVM). A Power8 machine will be able to run a special firmware letting you run a Linux KVM host and build and run Linux PPC virtual machines. At the time of writing this post, users will be able to switch between this firmware and the classic one running PowerVM. It is pretty exciting!

    I hope this post clarifies the current situation around all these awesome new announcements.