Putting NovaLink in Production & more PowerVC (1.3.1.2) tips and tricks

I’ve been quite busy lately and finding time to write is getting harder and harder, but I try to stick to it: writing these blog posts is almost the only thing I feel I do properly. So why stop? My current place is one of the craziest I have ever worked in, for the good and for the bad (I won’t talk here about how things are organized or how your work is recognized, but be sure it will probably be one of the main reasons I leave this place one day or another). The PowerSystems growth here is crazy, the number of AIX partitions we manage with PowerVC never stops increasing, and I think we are one of the biggest PowerVC customers in the world (I don’t know if that is a good thing or not). Just to give you a couple of examples: we run the biggest Power Enterprise Pool I have ever seen (384 Power8 mobile cores), PowerVC manages around 2600 partitions, and a single PowerVC instance manages almost 30 hosts. You read that right: these numbers are huge. It may seem funny, but it’s not; this growth is a problem, a technical problem, and we are facing issues most of you will never hit. I’m talking about density and scalability. Luckily for us, the “vertical” design of PowerVC can now be replaced by what I call a “horizontal” design: instead of putting all the nova instances on one single machine, we can spread the load across the hosts by using NovaLink. As we needed to solve these density and scalability problems, we decided to move all our P8 hosts to NovaLink (the process is still ongoing, but most of the engineering work is already done). And as we are not deploying a host every year but generally a couple per month, we needed a way to automate all of this.
So this blog post covers the things and best practices I have learned using and implementing NovaLink in a huge production environment (automated installation, tips and tricks, post-install, migration and so on). But we will not stop there: I’ll also talk about the new things I have learned about PowerVC (1.3.1.2 and 1.3.0.1) and give more tips and tricks to use the product at its best. Before going any further, a big thank you to the whole PowerVC team for their kindness and the precious time they spent advising and educating the OpenStack noob I am (a special thanks to Drew Thorstensen for the long discussions we had about OpenStack and PowerVC; he is probably one of the most passionate guys I have ever met at IBM).

NovaLink automated installation

No big introduction this time; let’s get to work and start with NovaLink and how to automate its installation. Copy the content of the installation cdrom to a directory that can be served by an http server on your NIM server (I’m using my NIM server for the bootp and tftp parts too). Note that I’m doing this with a tar command because the iso contains symbolic links and a simple cp would end up filling the filesystem.

# loopmount -i ESD_-_PowerVM_NovaLink_V1.0.0.3_062016.iso -o "-V cdrfs -o ro" -m /mnt
# tar cvf iso.tar /mnt/*
# tar xvf iso.tar -C /export/nim/lpp_source/powervc/novalink/1.0.0.3/iso
# ls -l /export/nim/lpp_source/powervc/novalink/1.0.0.3/iso
total 320
dr-xr-xr-x    2 root     system          256 Jul 28 17:54 .disk
-r--r--r--    1 root     system          243 Apr 20 21:27 README.diskdefines
-r--r--r--    1 root     system         3053 May 25 22:25 TRANS.TBL
dr-xr-xr-x    3 root     system          256 Apr 20 11:59 boot
dr-xr-xr-x    3 root     system          256 Apr 20 21:27 dists
dr-xr-xr-x    3 root     system          256 Apr 20 21:27 doc
dr-xr-xr-x    2 root     system         4096 Aug 09 15:59 install
-r--r--r--    1 root     system       145981 Apr 20 21:34 md5sum.txt
dr-xr-xr-x    2 root     system         4096 Apr 20 21:27 pics
dr-xr-xr-x    3 root     system          256 Apr 20 21:27 pool
dr-xr-xr-x    3 root     system          256 Apr 20 11:59 ppc
dr-xr-xr-x    2 root     system          256 Apr 20 21:27 preseed
dr-xr-xr-x    4 root     system          256 May 25 22:25 pvm
lrwxrwxrwx    1 root     system            1 Aug 29 14:55 ubuntu -> .
dr-xr-xr-x    3 root     system          256 May 25 22:25 vios
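Since the iso tree was copied around with tar, it is worth checking that nothing was lost on the way. The md5sum.txt shipped at the root of the image can drive that check; here is a small sketch (it assumes a GNU md5sum is available on the NIM server, e.g. from the AIX Toolbox):

```shell
# Verify a copied NovaLink iso tree against the md5sum.txt at its root.
# Prints "tree OK" only if every listed checksum matches.
verify_iso_tree() {
  ( cd "$1" && md5sum -c --quiet md5sum.txt ) && echo "tree OK"
}
```

For example: `verify_iso_tree /export/nim/lpp_source/powervc/novalink/1.0.0.3/iso`.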

Prepare the PowerVM NovaLink repository. The content of the repository can be found in the NovaLink iso image in pvm/repo/pvmrepo.tgz:

# ls -l /export/nim/lpp_source/powervc/novalink/1.0.0.3/iso/pvm/repo/
total 720192
-r--r--r--    1 root     system          223 May 25 22:25 TRANS.TBL
-rw-r--r--    1 root     system         2106 Sep 05 15:56 pvm-install.cfg
-r--r--r--    1 root     system    368722592 May 25 22:25 pvmrepo.tgz

Extract the content of this tgz file in a directory that can be served by the http server:

# mkdir /export/nim/lpp_source/powervc/novalink/1.0.0.3/pvmrepo
# cp /export/nim/lpp_source/powervc/novalink/1.0.0.3/iso/pvm/repo/pvmrepo.tgz /export/nim/lpp_source/powervc/novalink/1.0.0.3/pvmrepo
# cd /export/nim/lpp_source/powervc/novalink/1.0.0.3/pvmrepo
# gunzip pvmrepo.tgz
# tar xvf pvmrepo.tar
[..]
x ./pool/non-free/p/pvm-core/pvm-core-dbg_1.0.0.3-160525-2192_ppc64el.deb, 54686380 bytes, 106810 media blocks.
x ./pool/non-free/p/pvm-core/pvm-core_1.0.0.3-160525-2192_ppc64el.deb, 2244784 bytes, 4385 media blocks.
x ./pool/non-free/p/pvm-core/pvm-core-dev_1.0.0.3-160525-2192_ppc64el.deb, 618378 bytes, 1208 media blocks.
x ./pool/non-free/p/pvm-pkg-tools/pvm-pkg-tools_1.0.0.3-160525-492_ppc64el.deb, 170700 bytes, 334 media blocks.
x ./pool/non-free/p/pvm-rest-server/pvm-rest-server_1.0.0.3-160524-2229_ppc64el.deb, 263084432 bytes, 513837 media blocks.
# rm pvmrepo.tar 
# ls -l 
total 16
drwxr-xr-x    2 root     system          256 Sep 11 13:26 conf
drwxr-xr-x    2 root     system          256 Sep 11 13:26 db
-rw-r--r--    1 root     system          203 May 26 02:19 distributions
drwxr-xr-x    3 root     system          256 Sep 11 13:26 dists
-rw-r--r--    1 root     system         3132 May 24 20:25 novalink-gpg-pub.key
drwxr-xr-x    4 root     system          256 Sep 11 13:26 pool

Copy the NovaLink boot files in a directory that can be served by your tftp server (I’m using /var/lib/tftpboot):

# mkdir /var/lib/tftpboot
# cp -r /export/nim/lpp_source/powervc/novalink/1.0.0.3/iso/pvm /var/lib/tftpboot
# ls -l /var/lib/tftpboot
total 1016
-r--r--r--    1 root     system         1120 Jul 26 20:53 TRANS.TBL
-r--r--r--    1 root     system       494072 Jul 26 20:53 core.elf
-r--r--r--    1 root     system          856 Jul 26 21:18 grub.cfg
-r--r--r--    1 root     system        12147 Jul 26 20:53 pvm-install-config.template
dr-xr-xr-x    2 root     system          256 Jul 26 20:53 repo
dr-xr-xr-x    2 root     system          256 Jul 26 20:53 rootfs
-r--r--r--    1 root     system         2040 Jul 26 20:53 sample_grub.cfg

I still don’t know why, but on AIX the tftp server searches for grub.cfg in the root directory of the system. This was not the case for my RedHat Enterprise Linux installations, but it is for the NovaLink/Ubuntu one. Copy sample_grub.cfg to /grub.cfg and modify the content of the file:

  • As the gateway, netmask and nameserver will be provided by the pvm-install.cfg file (the configuration file of the NovaLink installer; we will talk about it later), comment those three lines.
  • The hostname is still needed.
  • Modify the linux line to point to the vmlinux file provided in the NovaLink iso image.
  • Modify the live-installer line to point to the filesystem.squashfs provided in the NovaLink iso image.
  • Modify the pvm-repo line to point to the pvm repository directory we created before.
  • Modify the pvm-installer-config line to point to the NovaLink install configuration file (we will modify this one afterwards).
  • Don’t touch the pvm-vios line: we are installing NovaLink on a system that already has Virtual I/O Servers installed (I’m not installing Scale Out systems, only high end models).
  • I’ll talk later about the pvmdisk line (it is not present by default in the pvm-install-config.template provided in the NovaLink iso image).
# cp /var/lib/tftpboot/sample_grub.cfg /grub.cfg
# cat /grub.cfg
# Sample GRUB configuration for NovaLink network installation
set default=0
set timeout=10

menuentry 'PowerVM NovaLink Install/Repair' {
 insmod http
 insmod tftp
 regexp -s 1:mac_pos1 -s 2:mac_pos2 -s 3:mac_pos3 -s 4:mac_pos4 -s 5:mac_pos5 -s 6:mac_pos6 '(..):(..):(..):(..):(..):(..)' ${net_default_mac}
 set bootif=01-${mac_pos1}-${mac_pos2}-${mac_pos3}-${mac_pos4}-${mac_pos5}-${mac_pos6}
 regexp -s 1:prefix '(.*)\.(\.*)' ${net_default_ip}
# Setup variables with values from Grub's default variables
 set ip=${net_default_ip}
 set serveraddress=${net_default_server}
 set domain=${net_ofnet_network_domain}
# If tftp is desired, replace http with tftp in the line below
 set root=http,${serveraddress}
# Remove comment after providing the values below for
# GATEWAY_ADDRESS, NETWORK_MASK, NAME_SERVER_IP_ADDRESS
# set gateway=10.10.10.1
# set netmask=255.255.255.0
# set nameserver=10.20.2.22
  set hostname=nova0696010
# In this sample file, the directory novalink is assumed to exist on the
# BOOTP server and has the NovaLink ISO content
 linux /export/nim/lpp_source/powervc/novalink/1.0.0.3/iso/install/vmlinux \
 live-installer/net-image=http://${serveraddress}/export/nim/lpp_source/powervc/novalink/1.0.0.3/iso/install/filesystem.squashfs \
 pkgsel/language-pack-patterns= \
 pkgsel/install-language-support=false \
 netcfg/disable_dhcp=true \
 netcfg/choose_interface=auto \
 netcfg/get_ipaddress=${ip} \
 netcfg/get_netmask=${netmask} \
 netcfg/get_gateway=${gateway} \
 netcfg/get_nameservers=${nameserver} \
 netcfg/get_hostname=${hostname} \
 netcfg/get_domain=${domain} \
 debian-installer/locale=en_US.UTF-8 \
 debian-installer/country=US \
# The directory novalink-repo on the BOOTP server contains the content
# of the pvmrepo.tgz file obtained from the pvm/repo directory on the
# NovaLink ISO file.
# The directory novalink-vios on the BOOTP server contains the files
# needed to perform a NIM install of VIOS server(s)
#  pvmdebug=1
 pvm-repo=http://${serveraddress}/export/nim/lpp_source/powervc/novalink/1.0.0.3/novalink-repo/ \
 pvm-installer-config=http://${serveraddress}/export/nim/lpp_source/powervc/novalink/1.0.0.3/mnt/pvm/repo/pvm-install.cfg \
 pvm-viosdir=http://${serveraddress}/novalink-vios \
 pvmdisk=/dev/mapper/mpatha
 initrd /export/nim/lpp_source/powervc/novalink/1.0.0.3/mnt/install/netboot_initrd.gz
}
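As an aside, the regexp gymnastics at the top of the menuentry only rebuild the MAC address into the `01-aa-bb-cc-dd-ee-ff` BOOTIF form that the installer expects. For reference, the same transformation in shell (a sketch, not part of the installer):

```shell
# Convert a colon-separated MAC into the BOOTIF form grub builds
# (01-<mac with dashes>), mirroring the regexp lines in the menuentry.
mac_to_bootif() {
  echo "01-$(echo "$1" | tr ':' '-')"
}
```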

Modify pvm-install.cfg, the NovaLink installer configuration file. We only need to touch the [SystemConfig], [NovaLinkGeneralSettings], [NovaLinkNetworkSettings], [NovaLinkAPTRepoConfig] and [NovaLinkAdminCredentials] stanzas. My advice is to configure one NovaLink by hand (by doing an installation directly from the iso image); after that installation your configuration file, filled with the answers you gave during the install, is saved in /var/log/pvm-install/novalink-install.cfg. Copy it to your installation server and use it as a template.

# more /export/nim/lpp_source/powervc/novalink/1.0.0.3/mnt/pvm/repo/pvm-install.cfg
[SystemConfig]
serialnumber = XXXXXXXX
lmbsize = 256

[NovaLinkGeneralSettings]
ntpenabled = True
ntpserver = timeserver1
timezone = Europe/Paris

[NovaLinkNetworkSettings]
dhcpip = DISABLED
ipaddress = YYYYYYYY
gateway = ZZZZZZZZ
netmask = 255.255.255.0
dns1 = 8.8.8.8
dns2 = 8.8.9.9
hostname = WWWWWWWW
domain = lab.chmod666.org

[NovaLinkAPTRepoConfig]
downloadprotocol = http
mirrorhostname = nimserver
mirrordirectory = /export/nim/lpp_source/powervc/novalink/1.0.0.3/mnt/
mirrorproxy =

[VIOSNIMServerConfig]
novalink_private_ip = 192.168.128.1
vios1_private_ip = 192.168.128.2
vios2_private_ip = 192.168.128.3
novalink_netmask = 255.255.128.0
viosinstallprompt = False

[NovaLinkAdminCredentials]
username = padmin
password = $6$N1hP6cJ32p17VMpQ$sdThvaGaR8Rj12SRtJsTSRyEUEhwPaVtCTvbdocW8cRzSQDglSbpS.jgKJpmz9L5SAv8qptgzUrHDCz5ureCS.
userdescription = NovaLink System Administrator
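The password field is a standard SHA-512 crypt hash (the `$6$…` form). One way to generate one for your template, assuming an OpenSSL recent enough to understand `-6` (1.1.1 or later); the salt argument is only there to make the output reproducible, omit `-salt` for a random one:

```shell
# Generate a SHA-512 crypt hash suitable for the password field of
# the [NovaLinkAdminCredentials] stanza.
mkhash() {
  openssl passwd -6 -salt "$1" "$2"
}
```

For example: `mkhash N1hP6cJ32p17VMpQ 'mysecretpassword'`.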

Finally modify the /etc/bootptab file and add a line matching your installation:

# tail -1 /etc/bootptab
nova0696010:bf=/var/lib/tftpboot/core.elf:ip=10.20.65.16:ht=ethernet:sa=10.255.228.37:gw=10.20.65.1:sm=255.255.255.0:

Don’t forget to set up an http server serving all the needed files. I know this configuration is super insecure, but honestly I don’t care: my NIM server sits in a locked-down network only reachable from the VIOS and NovaLink partitions. So I’m good :-) :

# cd /opt/freeware/etc/httpd/ 
# grep -Ei "^Listen|^DocumentRoot" conf/httpd.conf
Listen 80
DocumentRoot "/"
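Before booting, I find it handy to probe from another box that everything the grub menuentry will fetch is actually served. A small sketch (the server name and paths are examples from my setup):

```shell
# Probe each path on the http server; fail fast on the first missing file.
check_served() {
  server=$1; shift
  for path in "$@"; do
    curl -sf -o /dev/null "http://${server}${path}" || { echo "MISSING ${path}"; return 1; }
  done
  echo OK
}
```

For example: `check_served nimserver /export/nim/lpp_source/powervc/novalink/1.0.0.3/iso/install/vmlinux /export/nim/lpp_source/powervc/novalink/1.0.0.3/iso/install/filesystem.squashfs`.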


Instead of doing this over and over at every NovaLink installation, I have written a custom script that prepares my NovaLink installation files. What the script does:

  • Preparing the pvm-install.cfg file.
  • Modifying the grub.cfg file.
  • Adding a line to the /etc/bootptab file.
#  ./custnovainstall.ksh nova0696010 10.20.65.16 10.20.65.1 255.255.255.0
#!/usr/bin/ksh

novalinkname=$1
novalinkip=$2
novalinkgw=$3
novalinknm=$4
cfgfile=/export/nim/lpp_source/powervc/novalink/novalink-install.cfg
desfile=/export/nim/lpp_source/powervc/novalink/1.0.0.3/mnt/pvm/repo/pvm-install.cfg
grubcfg=/export/nim/lpp_source/powervc/novalink/grub.cfg
grubdes=/grub.cfg

echo "+--------------------------------------+"
echo "NovaLink name: ${novalinkname}"
echo "NovaLink IP: ${novalinkip}"
echo "NovaLink GW: ${novalinkgw}"
echo "NovaLink NM: ${novalinknm}"
echo "+--------------------------------------+"
echo "Cfg ref: ${cfgfile}"
echo "Cfg file: ${cfgfile}.${novalinkname}"
echo "+--------------------------------------+"

typeset -u serialnumber
serialnumber=$(echo ${novalinkname} | sed 's/nova//g')

echo "SerNum: ${serialnumber}"

cat ${cfgfile} | sed "s/serialnumber = XXXXXXXX/serialnumber = ${serialnumber}/g" | sed "s/ipaddress = YYYYYYYY/ipaddress = ${novalinkip}/g" | sed "s/gateway = ZZZZZZZZ/gateway = ${novalinkgw}/g" | sed "s/netmask = 255.255.255.0/netmask = ${novalinknm}/g" | sed "s/hostname = WWWWWWWW/hostname = ${novalinkname}/g" > ${cfgfile}.${novalinkname}
cp ${cfgfile}.${novalinkname} ${desfile}
cat ${grubcfg} | sed "s/  set hostname=WWWWWWWW/  set hostname=${novalinkname}/g" > ${grubcfg}.${novalinkname}
cp ${grubcfg}.${novalinkname} ${grubdes}
# nova1009425:bf=/var/lib/tftpboot/core.elf:ip=10.20.65.15:ht=ethernet:sa=10.255.248.37:gw=10.20.65.1:sm=255.255.255.0:
echo "${novalinkname}:bf=/var/lib/tftpboot/core.elf:ip=${novalinkip}:ht=ethernet:sa=10.255.248.37:gw=${novalinkgw}:sm=${novalinknm}:" >> /etc/bootptab
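The hostname-to-serial derivation in the script relies on ksh’s `typeset -u` to uppercase the result; here is a portable sketch of the same logic, handy if you run the tooling from a plain POSIX shell:

```shell
# Derive the machine serial from a NovaLink hostname of the form
# nova<serial>, uppercased as the pvm-install.cfg file expects.
derive_serial() {
  echo "$1" | sed 's/^nova//' | tr '[:lower:]' '[:upper:]'
}
```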

NovaLink installation: vSCSI or NPIV?

NovaLink is not designed to be installed on top of NPIV; that’s a fact. As it is designed to be installed on a brand new system without any Virtual I/O Servers configured, the NovaLink installation by default creates the Virtual I/O Servers and, using those VIOS, creates backing devices on top of logical volumes carved from the default VIOS storage pool. The NovaLink partition is then installed on top of these two logical volumes and mirrored at the end. This is how NovaLink proceeds on Scale Out systems.

For High End systems NovaLink assumes you are going to install the NovaLink partition on top of vSCSI (I have personally tried both hdisk-backed and SSP Logical Unit-backed devices and both work fine). For those who, like me, want to install NovaLink on top of NPIV (I know this is not a good choice, but once again I was forced to do it), there is still a way to do so. (In my humble opinion the NPIV design is meant for high performance, and the NovaLink partition is never going to be an I/O intensive partition. Even worse, our whole new design is based on NPIV for LPARs; it’s a shame, as NPIV is not a solution designed for high density and high scalability. Every PowerVM system administrator should remember this: NPIV IS NOT A GOOD CHOICE FOR DENSITY AND SCALABILITY, USE IT FOR PERFORMANCE ONLY!!! The story behind this is funny. I’m 100% sure that SSP is ten times a better choice to achieve density and scalability. I opened a poll on Twitter asking “Will you choose SSP or NPIV to design a scalable AIX cloud based on PowerVC?”. I was so sure SSP would win that I made a bet with a friend (I owe him beers now). What a surprise when I saw the results: 90% of people voted for NPIV. I’m sorry to say it, guys, but there are two possibilities: 1/ you don’t really know what scalability and density mean because you never faced them, so you made the wrong choice; 2/ you know, and you’re just wrong :-). This little story is one more proof that IBM is not responsible for the slow death of AIX and PowerVM; unfortunately we are, by not understanding that the only way to survive is to embrace highly scalable solutions the way Linux does with OpenStack and Ceph. It’s a fact. Period.)

That said, if you try to install NovaLink on top of NPIV you’ll get an error. The workaround is to add the following line to the grub.cfg file:

 pvmdisk=/dev/mapper/mpatha \

With that line you’ll be able to install NovaLink on your NPIV disk, but the first time you’ll still hit an error at the “grub-install” step. Just re-run the installation a second time and the grub-install command will succeed :-) (I’ll explain later how to avoid this second issue).

A workaround for this second issue is to recreate the initrd after adding a line to the debian-installer config file.

Fully automated installation by example

  • The core.elf file is downloaded by tftp; you can see that the grub.cfg file is searched for in /.
  • The installer starts.
  • The vmlinux is downloaded (http).
  • The root.squashfs is downloaded (http).
  • The pvm-install.cfg configuration file is downloaded (http).
  • The pvm services are started. At this point, if you are running in co-management mode, you’ll see the red lock in the HMC server status.
  • The Linux and NovaLink installation is ongoing.
  • The system is ready.

Novalink code auto update

When you add a NovaLink host to PowerVC, the powervc packages coming from the PowerVC management host are installed on the NovaLink partition. You can check this while the host is being added; here is what is going on:


# cat /opt/ibm/powervc/log/powervc_install_2016-09-11-164205.log
################################################################################
Starting the IBM PowerVC Novalink Installation on:
2016-09-11T16:42:05+02:00
################################################################################

LOG file is /opt/ibm/powervc/log/powervc_install_2016-09-11-164205.log

2016-09-11T16:42:05.18+02:00 Installation directory is /opt/ibm/powervc
2016-09-11T16:42:05.18+02:00 Installation source location is /tmp/powervc_img_temp_1473611916_1627713/powervc-1.3.1.2
[..]
Setting up python-neutron (10:8.0.0-201608161728.ibm.ubuntu1.375) ...
Setting up neutron-common (10:8.0.0-201608161728.ibm.ubuntu1.375) ...
Setting up neutron-plugin-ml2 (10:8.0.0-201608161728.ibm.ubuntu1.375) ...
Setting up ibmpowervc-powervm-network (1.3.1.2) ...
Setting up ibmpowervc-powervm-oslo (1.3.1.2) ...
Setting up ibmpowervc-powervm-ras (1.3.1.2) ...
Setting up ibmpowervc-powervm (1.3.1.2) ...
W: --force-yes is deprecated, use one of the options starting with --allow instead.

***************************************************************************
IBM PowerVC Novalink installation
 successfully completed at 2016-09-11T17:02:30+02:00.
 Refer to
 /opt/ibm/powervc/log/powervc_install_2016-09-11-165617.log
 for more details.
***************************************************************************


Installing the missing deb packages if the NovaLink host was added before the PowerVC upgrade

If the NovaLink host was added under PowerVC 1.3.1.1 and you then updated to PowerVC 1.3.1.2, you have to update the packages by hand because of a little bug during the upgrade of some of them:

  • From the PowerVC management host copy the latest packages to the NovaLink host:
  • # scp /opt/ibm/powervc/images/powervm/powervc-powervm-compute-1.3.1.2.tgz padmin@nova0696010:~
    padmin@nova0696010's password:
    powervc-powervm-compute-1.3.1.2.tgz
    
  • Update the packages on the NovaLink host
  • # tar xvzf powervc-powervm-compute-1.3.1.2.tgz
    # cd powervc-1.3.1.2/packages/powervm
    # dpkg -i nova-powervm_2.0.3-160816-48_all.deb
    # dpkg -i networking-powervm_2.0.1-160816-6_all.deb
    # dpkg -i ceilometer-powervm_2.0.1-160816-17_all.deb
    # /opt/ibm/powervc/bin/powervc-services restart
    
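Before pushing the debs by hand, you may want to check whether the installed level is actually older than the one shipped with your PowerVC. On the NovaLink partition `dpkg --compare-versions` is the authoritative tool; here is a rough portable sketch using GNU `sort -V` instead:

```shell
# ver_lt A B: true if version A sorts strictly before version B.
# A rough stand-in for `dpkg --compare-versions A lt B`.
ver_lt() {
  [ "$1" = "$2" ] && return 1
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -1)" = "$1" ]
}
```

For example: `ver_lt "$(dpkg-query -W -f='${Version}' nova-powervm)" 2.0.3-160816-48 && echo "update needed"`.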

rsct and pvm deb update

Never forget to install the latest rsct and pvm packages after the installation. You can clone the official IBM repositories for the pvm and rsct files (check my previous NovaLink post for more details about cloning them). Then create two files in /etc/apt/sources.list.d, one for pvm, the other for rsct:

# vi /etc/apt/sources.list.d/pvm.list
deb http://nimserver/export/nim/lpp_source/powervc/novalink/nova/debian novalink_1.0.0 non-free
# vi /etc/apt/sources.list.d/rsct.list
deb http://nimserver/export/nim/lpp_source/powervc/novalink/rsct/ubuntu xenial main
# dpkg -l | grep -i rsct
ii  rsct.basic                                3.2.1.0-15300                           ppc64el      Reliable Scalable Cluster Technology - Basic
ii  rsct.core                                 3.2.1.3-16106-1ubuntu1                  ppc64el      Reliable Scalable Cluster Technology - Core
ii  rsct.core.utils                           3.2.1.3-16106-1ubuntu1                  ppc64el      Reliable Scalable Cluster Technology - Utilities
# dpkg -l | grep -i pvm
ii  pvm-cli                                   1.0.0.3-160516-1488                     all          Power VM Command Line Interface
ii  pvm-core                                  1.0.0.3-160525-2192                     ppc64el      PVM core runtime package
ii  pvm-novalink                              1.0.0.3-160525-1000                     ppc64el      Meta package for all PowerVM Novalink packages
ii  pvm-rest-app                              1.0.0.3-160524-2229                     ppc64el      The PowerVM NovaLink REST API Application
ii  pvm-rest-server                           1.0.0.3-160524-2229                     ppc64el      Holds the basic installation of the REST WebServer (Websphere Liberty Profile) for PowerVM NovaLink 
# apt-get install rsct.core rsct.basic
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  docutils-common libpaper-utils libpaper1 python-docutils python-roman
Use 'apt autoremove' to remove them.
The following additional packages will be installed:
  rsct.core.utils src
The following packages will be upgraded:
  rsct.core rsct.core.utils src
3 upgraded, 0 newly installed, 0 to remove and 6 not upgraded.
Need to get 9,356 kB of archives.
After this operation, 548 kB disk space will be freed.
[..]
# apt-get install pvm-novalink
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  docutils-common libpaper-utils libpaper1 python-docutils python-roman
Use 'apt autoremove' to remove them.
The following additional packages will be installed:
  pvm-core pvm-rest-app pvm-rest-server pypowervm
The following packages will be upgraded:
  pvm-core pvm-novalink pvm-rest-app pvm-rest-server pypowervm
5 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.
Need to get 287 MB of archives.
After this operation, 203 kB of additional disk space will be used.
Do you want to continue? [Y/n] Y
[..]

After the installation, here is what you should have if everything was updated properly:

# dpkg -l | grep rsct
ii  rsct.basic                                3.2.1.4-16154-1ubuntu1                  ppc64el      Reliable Scalable Cluster Technology - Basic
ii  rsct.core                                 3.2.1.4-16154-1ubuntu1                  ppc64el      Reliable Scalable Cluster Technology - Core
ii  rsct.core.utils                           3.2.1.4-16154-1ubuntu1                  ppc64el      Reliable Scalable Cluster Technology - Utilities
# dpkg -l | grep pvm
ii  pvm-cli                                   1.0.0.3-160516-1488                     all          Power VM Command Line Interface
ii  pvm-core                                  1.0.0.3.1-160713-2441                   ppc64el      PVM core runtime package
ii  pvm-novalink                              1.0.0.3.1-160714-1152                   ppc64el      Meta package for all PowerVM Novalink packages
ii  pvm-rest-app                              1.0.0.3.1-160713-2417                   ppc64el      The PowerVM NovaLink REST API Application
ii  pvm-rest-server                           1.0.0.3.1-160713-2417                   ppc64el      Holds the basic installation of the REST WebServer (Websphere Liberty Profile) for PowerVM NovaLink

NovaLink post-installation (my ansible way to do it)

You all know by now that I’m not very fond of doing the same things over and over again; that’s why I have created an ansible post-install playbook especially for NovaLink. You can download it here: nova_ansible. Then install ansible on a host that has ssh access to all your NovaLink partitions and run the playbook:

  • Untar the ansible playbook:
  • # mkdir /srv/ansible
    # cd /srv/ansible
    # tar xvf novalink_ansible.tar 
    
  • Modify the group_vars/novalink.yml to fit your environment:
  • # cat group_vars/novalink.yml
    ntpservers:
      - ntpserver1
      - ntpserver2
    dnsservers:
      - 8.8.8.8
      - 8.8.9.9
    dnssearch:
      - lab.chmod666.org
    vepa_iface: ibmveth6
    repo: nimserver
    
  • Share the root ssh key with the NovaLink hosts (be careful: by default NovaLink does not allow root login, you have to modify the sshd configuration file).
  • Put all your Novalink hosts into the inventory file:
  • #cat inventories/hosts.novalink
    [novalink]
    nova65a0cab
    nova65ff4cd
    nova10094ef
    nova06960ab
    
  • Run ansible-playbook and you’re done:
  • # ansible-playbook -i inventories/hosts.novalink site.yml
    


More details about NovaLink

MGMTSWITCH vswitch automatic creation

Do not try to create the MGMTSWITCH yourself; the NovaLink installer does it for you. As my Virtual I/O Servers are installed using the IBM Provisioning Toolkit for PowerVM, I was creating the MGMTSWITCH at that stage, but I was wrong. You can see this in the file /var/log/pvm-install/pvminstall.log on the NovaLink partition:

# cat /var/log/pvm-install/pvminstall.log
Fri Aug 12 17:26:07 UTC 2016: PVMDebug = 0
Fri Aug 12 17:26:07 UTC 2016: Running initEnv
[..]
Fri Aug 12 17:27:08 UTC 2016: Using user provided pvm-install configuration file
Fri Aug 12 17:27:08 UTC 2016: Auto Install set
[..]
Fri Aug 12 17:27:44 UTC 2016: Auto Install = 1
Fri Aug 12 17:27:44 UTC 2016: Validating configuration file
Fri Aug 12 17:27:44 UTC 2016: Initializing private network configuration
Fri Aug 12 17:27:45 UTC 2016: Running /opt/ibm/pvm-install/bin/switchnetworkcfg -o c
Fri Aug 12 17:27:46 UTC 2016: Running /opt/ibm/pvm-install/bin/switchnetworkcfg -o n -i 3 -n MGMTSWITCH -p 4094 -t 1
Fri Aug 12 17:27:49 UTC 2016: Start setupinstalldisk operation for /dev/mapper/mpatha
Fri Aug 12 17:27:49 UTC 2016: Running updatedebconf
Fri Aug 12 17:56:06 UTC 2016: Pre-seeding disk recipe

NPIV lpar creation problem!

As you know, my environment is crazy. Every lpar we create has 4 virtual fibre channel adapters, obviously two on fabric A and two on fabric B, and obviously again each fabric must be served by each Virtual I/O Server. To sum up: an lpar must have access to fabrics A and B through VIOS1 and to fabrics A and B through VIOS2. Unfortunately there was a little bug in the current NovaLink (1.0.0.3) code and all the lpars were created with only two adapters. The PowerVC team gave me a patch for this particular issue, patching the npiv.py file. The patch needs to be installed on the NovaLink partition itself:

# cd /usr/lib/python2.7/dist-packages/powervc_nova/virt/ibmpowervm/pvm/volume
# sdiff npiv.py.back npiv.bck


I’m intentionally not giving you the solution here (by just copying/pasting code) because the issue has been addressed: an APAR (IT16534) has been opened for it and it is resolved in version 1.3.1.2.

From NovaLink to HMC …. and the opposite

One of the challenges for me was to make sure everything worked regarding LPM and NovaLink. So I decided to test different cases:

  • From NovaLink host to NovaLink host (no trouble at all) :-)
  • From NovaLink host to HMC host (no trouble at all) :-)
  • From HMC host to NovaLink host (trouble!) :-(

Once again, the issue preventing HMC to NovaLink LPM from working correctly is storage related. A patch is on the way, but let me explain the issue a little (this only matters if you absolutely have to move an LPAR from an HMC host to a NovaLink host and you are in the same situation as I am):

PowerVC does not build the mapping to the destination Virtual I/O Servers correctly: it tries to map fabric A twice on VIOS1 and fabric B twice on VIOS2. Luckily for us, you can do the migration by hand:

  • Start the LPM operation from PowerVC and check on the HMC side how PowerVC builds the mapping (log on to the HMC to check this):
  • #  lssvcevents -t console -d 0 | grep powervc_admin | grep migrlpar
    time=08/31/2016 18:53:27,"text=HSCE2124 User name powervc_admin: migrlpar -m 9119-MME-656C38A -t 9119-MME-65A0C31 --id 18 --ip 10.22.33.198 -u wlp -i ""virtual_fc_mappings=6/vios1/2//fcs2,3/vios2/1//fcs2,4/vios2/1//fcs1,5/vios1/2//fcs1"",shared_proc_pool_id=0 -o m command failed."
    
  • One interesting point you can see here: the NovaLink user used for LPM is not padmin but wlp. Have a look at the NovaLink machine if you are a little bit curious.

  • If you double-check the mappings you’ll see that PowerVC mixed up the VIOS. Just rerun the command with the mappings in the right order and you’ll be able to do HMC to NovaLink LPM (by the way, PowerVC automatically detects that the host has changed for this lpar, i.e. that it was moved outside of PowerVC):
  • # migrlpar -m 9119-MME-656C38A -t 9119-MME-65A0C31 --id 18 --ip 10.22.33.198 -u wlp -i '"virtual_fc_mappings=6/vios2/1//fcs2,3/vios1/2//fcs2,4/vios2/1//fcs1,5/vios1/2//fcs1"',shared_proc_pool_id=0 -o m
    # lssvcevents -t console -d 0 | grep powervc_admin | grep migrlpar
    time=08/31/2016 19:13:00,"text=HSCE2123 User name powervc_admin: migrlpar -m 9119-MME-656C38A -t 9119-MME-65A0C31 --id 18 --ip 10.22.33.198 -u wlp -i ""virtual_fc_mappings=6/vios2/1//fcs2,3/vios1/2//fcs2,4/vios2/1//fcs1,5/vios1/2//fcs1"",shared_proc_pool_id=0 -o m command was executed successfully."
    

One more time, don't worry about this issue: a patch is on the way. But I thought it was interesting to talk about it, just to show you how PowerVC handles this (user, key sharing, checks on the HMC).
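The manual fix is really just a rewrite of the virtual_fc_mappings string before re-running migrlpar. Here is a sketch that flips the VIOS (name and partition id) for the client slots you designate; it assumes, as in the trace above, that vios1 is partition id 2 and vios2 is partition id 1, which you must adapt to your own frame:

```shell
# swap_slots '<mappings>' slot [slot ...]
# Flip vios1<->vios2 (and their partition ids, 2 and 1 here) for the
# mapping entries whose client slot number is listed.
swap_slots() {
  mappings=$1; shift
  echo "$mappings" | tr ',' '\n' | awk -F/ -v OFS=/ -v slots=",$*," '
    BEGIN { gsub(/ /, ",", slots) }
    index(slots, "," $1 ",") {
      if ($2 == "vios1") { $2 = "vios2"; $3 = 1 } else { $2 = "vios1"; $3 = 2 }
    }
    { print }' | paste -sd, -
}
```

With the failing mapping from the HMC trace, flipping the two fcs2 entries (client slots 6 and 3) produces the string that worked: `swap_slots '6/vios1/2//fcs2,3/vios2/1//fcs2,4/vios2/1//fcs1,5/vios1/2//fcs1' 6 3`.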

Deep dive into the initrd

I am curious and there is no way to change that. As I wanted to know how the NovaLink installer works, I had to dig into the netboot_initrd.gz file. There is a lot of interesting stuff in this initrd. Run the commands below on a Linux partition if you also want to have a look:

# scp nimdy:/export/nim/lpp_source/powervc/novalink/1.0.0.3/iso/install/netboot_initrd.gz .
# gunzip netboot_initrd
# cpio -i < netboot_initrd
185892 blocks

The installer is located in opt/ibm/pvm-install:

# ls opt/ibm/pvm-install/data/
40mirror.pvm  debpkgs.txt  license.txt  nimclient.info  pvm-install-config.template  pvm-install-preseed.cfg  rsct-gpg-pub.key  vios_diagram.txt
# ls opt/ibm/pvm-install/bin
assignio.py        envsetup        installpvm                    monitor        postProcessing    pvmwizardmain.py  restore.py        switchnetworkcfg  vios
cfgviosnetwork.py  functions       installPVMPartitionWizard.py  network        procmem           recovery          setupinstalldisk  updatedebconf     vioscfg
chviospasswd       getnetworkinfo  ioadapter                     networkbridge  pvmconfigdata.py  removemem         setupviosinstall  updatenimsetup    welcome.py
editpvmconfig      initEnv         mirror                        nimscript      pvmtime           resetsystem       summary.py        user              wizpkg

You can for instance check what the installer is exactly doing. Let's take again the example of the MGMTSWITCH creation; you can see in the output below that I was right about it:

initrd1

Remember that I told you before that I had problems with the installation on NPIV. You can avoid installing NovaLink twice by modifying the Debian installer preseed file opt/ibm/pvm-install/data/pvm-install-preseed.cfg directly in the initrd (you have to rebuild the initrd after doing this):

# grep bootdev opt/ibm/pvm-install/data/pvm-install-preseed.cfg
d-i grub-installer/bootdev string /dev/mapper/mpatha
# find | cpio -H newc -o > ../new_initrd_file
# gzip -9 ../new_initrd_file
# scp ../new_initrd_file.gz nimdy:/export/nim/lpp_source/powervc/novalink/1.0.0.3/iso/install/netboot_initrd.gz

You can also find good examples of pvmctl commands here:

# grep -R pvmctl *
pvmctl lv create --size $LV_SIZE --name $LV_NAME -p id=$vid
pvmctl scsi create --type lv --vg name=rootvg --lpar id=1 -p id=$vid --stor-id name=$LV_NAME

Troubleshooting

NovaLink is not PowerVC, so here is a little reminder of what I do to troubleshoot NovaLink:

  • Installation troubleshooting:
  • # cat /var/log/pvm-install/pvminstall.log
    
  • Neutron Agent log (always double check this one):
  • # cat /var/log/neutron/neutron-powervc-pvm-sea-agent.log
    
  • Nova logs for this host are no longer accessible on the PowerVC management host, so check them on the NovaLink partition if needed:
  • # cat /var/log/nova/nova-compute.log
    
  • pvmctl logs:
  • # cat /var/log/pvm/pvmctl.log
    
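To avoid jumping between these four files, a tiny helper can dump the tail of whichever of them exist. This is just a convenience sketch (the function name is mine; the paths are the ones listed above):

```shell
# Sketch: print the last lines of each NovaLink log that exists.
tail_pvm_logs() {
  for f in "$@"; do
    if [ -f "$f" ]; then
      echo "==> $f <=="
      tail -n 20 "$f"      # swap "tail -n 20" for "tail -F" to follow live
    fi
  done
  return 0
}

# Usage, with the log files from this section:
# tail_pvm_logs /var/log/pvm-install/pvminstall.log \
#   /var/log/neutron/neutron-powervc-pvm-sea-agent.log \
#   /var/log/nova/nova-compute.log /var/log/pvm/pvmctl.log
```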

One last thing to add about NovaLink. Something I like a lot is that NovaLink makes hourly and daily backups of the system and the VIOSes. These backups are stored in /var/backups/pvm:

# crontab -l
# VIOS hourly backups - at 15 past every hour except for midnight
15 1-23 * * * /usr/sbin/pvm-backup --type vios --frequency hourly
# Hypervisor hourly backups - at 15 past every hour except for midnight
15 1-23 * * * /usr/sbin/pvm-backup --type system --frequency hourly
# VIOS daily backups - at 15 past midnight
15 0    * * * /usr/sbin/pvm-backup --type vios --frequency daily
# Hypervisor daily backups - at 15 past midnight
15 0    * * * /usr/sbin/pvm-backup --type system --frequency daily
# ls -l /var/backups/pvm
total 4
drwxr-xr-x 2 root pvm_admin 4096 Sep  9 00:15 9119-MME*0265FF47B

More PowerVC tips and tricks

Let's finish this blog post with more PowerVC tips and tricks. Before giving you the tricks I have to warn you: none of these tricks are supported by PowerVC, so use them at your own risk OR contact your support before doing anything else. You may break and destroy everything if you are not aware of what you are doing, so please be very careful when using any of these tricks. YOU HAVE BEEN WARNED !!!!!!

Accessing and querying the database

This first trick is funny and will allow you to query and modify the PowerVC database. Once again, do this at your own risk. One of the issues I had was strange: I do not remember exactly how it happened, but some of my LUNs that were not attached to any host were still showing an attachment count of 1, and I had no way to remove them. Even worse, someone had deleted these LUNs on the SVC side. These LUNs were what I call "ghost LUNs": non-existing but non-deletable. (I also had to remove the storage provider related to these LUNs.) The only way to fix this was to set the state to detached directly in the cinder database. Be careful: this trick only works with MariaDB.

First get the database password: grab the encrypted password from the /opt/ibm/powervc/data/powervc-db.conf file and decode it to obtain the clear-text password:

# grep ^db_password /opt/ibm/powervc/data/powervc-db.conf
db_password = aes-ctr:NjM2ODM5MjM0NTAzMTg4MzQzNzrQZWi+mrUC+HYj9Mxi5fQp1XyCXA==
# python -c "from powervc_keystone.encrypthandler import EncryptHandler; print EncryptHandler().decode('aes-ctr:NjM2ODM5MjM0NTAzMTg4MzQzNzrQZWi+mrUC+HYj9Mxi5fQp1XyCXA==')"
OhnhBBS_gvbCcqHVfx2N
# mysql -u root -p cinder
Enter password:
MariaDB [cinder]> show tables;
+----------------------------+
| Tables_in_cinder           |
+----------------------------+
| backups                    |
| cgsnapshots                |
| consistencygroups          |
| driver_initiator_data      |
| encryption                 |
[..]

Then get the uuid of the LUN you want to change from the PowerVC GUI, and run the commands below:

dummy

MariaDB [cinder]> select * from volume_attachment where volume_id='9cf6d85a-3edd-4ab7-b797-577ff6566f78' \G
*************************** 1. row ***************************
   created_at: 2016-05-26 08:52:51
   updated_at: 2016-05-26 08:54:23
   deleted_at: 2016-05-26 08:54:23
      deleted: 1
           id: ce4238b5-ea39-4ce1-9ae7-6e305dd506b1
    volume_id: 9cf6d85a-3edd-4ab7-b797-577ff6566f78
attached_host: NULL
instance_uuid: 44c7a72c-610c-4af1-a3ed-9476746841ab
   mountpoint: /dev/sdb
  attach_time: 2016-05-26 08:52:51
  detach_time: 2016-05-26 08:54:23
  attach_mode: rw
attach_status: attached
1 row in set (0.01 sec)
MariaDB [cinder]> select * from volumes where id='9cf6d85a-3edd-4ab7-b797-577ff6566f78' \G
*************************** 1. row ***************************
                 created_at: 2016-05-26 08:51:57
                 updated_at: 2016-05-26 08:54:23
                 deleted_at: NULL
                    deleted: 0
                         id: 9cf6d85a-3edd-4ab7-b797-577ff6566f78
                     ec2_id: NULL
                    user_id: 0688b01e6439ca32d698d20789d52169126fb41fb1a4ddafcebb97d854e836c9
                 project_id: 1471acf124a0479c8d525aa79b2582d0
                       host: pb01_mn_svc_qual
                       size: 1
          availability_zone: nova
                     status: available
              attach_status: attached
               scheduled_at: 2016-05-26 08:51:57
                launched_at: 2016-05-26 08:51:59
              terminated_at: NULL
               display_name: dummy
        display_description: NULL
          provider_location: NULL
              provider_auth: NULL
                snapshot_id: NULL
             volume_type_id: e49e9cc3-efc3-4e7e-bcb9-0291ad28df42
               source_volid: NULL
                   bootable: 0
          provider_geometry: NULL
                   _name_id: NULL
          encryption_key_id: NULL
           migration_status: NULL
         replication_status: disabled
replication_extended_status: NULL
    replication_driver_data: NULL
        consistencygroup_id: NULL
                provider_id: NULL
                multiattach: 0
            previous_status: NULL
1 row in set (0.00 sec)
MariaDB [cinder]> update volume_attachment set attach_status='detached' where volume_id='9cf6d85a-3edd-4ab7-b797-577ff6566f78';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
MariaDB [cinder]> update volumes set attach_status='detached' where id='9cf6d85a-3edd-4ab7-b797-577ff6566f78';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
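If you have more than one ghost LUN to clean, you can generate the two UPDATE statements from the volume uuid instead of typing them by hand. A sketch (the helper name is mine; the SQL is exactly what was run above); pipe its output into mysql:

```shell
# Sketch: emit the two cinder UPDATE statements for a given ghost LUN uuid.
# The table and column names are the ones used in the session above.
detach_sql() {
  vol_id=$1
  cat <<SQL
UPDATE volume_attachment SET attach_status='detached' WHERE volume_id='${vol_id}';
UPDATE volumes SET attach_status='detached' WHERE id='${vol_id}';
SQL
}

# detach_sql 9cf6d85a-3edd-4ab7-b797-577ff6566f78 | mysql -u root -p cinder
```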

The second issue I had was some machines stuck in a deleted state: in reality the HMC had just rebooted, and for an unknown reason these machines were seen as 'deleted' even though they were not. Using this trick I was able to force a re-evaluation of each machine in this state:

#  mysql -u root -p nova
Enter password:
MariaDB [nova]> select * from instance_health_status where health_state='WARNING';
+---------------------+---------------------+------------+---------+--------------------------------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------+
| created_at          | updated_at          | deleted_at | deleted | id                                   | health_state | reason                                                                                                                                                                                                                | unknown_reason_details |
+---------------------+---------------------+------------+---------+--------------------------------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------+
| 2016-07-11 08:58:37 | NULL                | NULL       |       0 | 1af1805c-bb59-4bc9-8b6d-adeaeb4250f3 | WARNING      | [{"resource_local": "server", "display_name": "p00ww6754398", "resource_property_key": "rmc_state", "resource_property_value": "initializing", "resource_id": "1af1805c-bb59-4bc9-8b6d-adeaeb4250f3"}]                |                        |
| 2015-07-31 16:53:50 | 2015-07-31 18:49:50 | NULL       |       0 | 2668e808-10a1-425f-a272-6b052584557d | WARNING      | [{"resource_local": "server", "display_name": "multi-vol", "resource_property_key": "vm_state", "resource_property_value": "deleted", "resource_id": "2668e808-10a1-425f-a272-6b052584557d"}]                         |                        |
| 2015-08-03 11:22:38 | 2015-08-03 15:47:41 | NULL       |       0 | 2934fb36-5d91-48cd-96de-8c16459c50f3 | WARNING      | [{"resource_local": "server", "display_name": "clouddev-test-754df319-00000038", "resource_property_key": "rmc_state", "resource_property_value": "inactive", "resource_id": "2934fb36-5d91-48cd-96de-8c16459c50f3"}] |                        |
| 2016-07-11 09:03:59 | NULL                | NULL       |       0 | 3fc42502-856b-46a5-9c36-3d0864d6aa4c | WARNING      | [{"resource_local": "server", "display_name": "p00ww3254401", "resource_property_key": "rmc_state", "resource_property_value": "initializing", "resource_id": "3fc42502-856b-46a5-9c36-3d0864d6aa4c"}]                |                        |
| 2015-07-08 20:11:48 | 2015-07-08 20:14:09 | NULL       |       0 | 54d02c60-bd0e-4f34-9cb6-9c0a0b366873 | WARNING      | [{"resource_local": "server", "display_name": "p00wb3740870", "resource_property_key": "rmc_state", "resource_property_value": "inactive", "resource_id": "54d02c60-bd0e-4f34-9cb6-9c0a0b366873"}]                    |                        |
| 2015-07-31 17:44:16 | 2015-07-31 18:49:50 | NULL       |       0 | d5ec2a9c-221b-44c0-8573-d8e3695a8dd7 | WARNING      | [{"resource_local": "server", "display_name": "multi-vol-sp5", "resource_property_key": "vm_state", "resource_property_value": "deleted", "resource_id": "d5ec2a9c-221b-44c0-8573-d8e3695a8dd7"}]                     |                        |
+---------------------+---------------------+------------+---------+--------------------------------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------+
6 rows in set (0.00 sec)
MariaDB [nova]> update instance_health_status set health_state='PENDING',reason='' where health_state='WARNING';
Query OK, 6 rows affected (0.00 sec)
Rows matched: 6  Changed: 6  Warnings: 0

pending

The ceilometer issue

When updating from PowerVC 1.3.0.1 to 1.3.1.1, PowerVC changes the database backend from DB2 to MariaDB. This is a good thing, but the update works by exporting all the data to flat files and then re-inserting it into the MariaDB database record by record. I had a huge problem because of this: my ceilometer database was enormous, given the number of machines I have and the number of operations we have run on PowerVC since it went into production. The DB insert took more than 3 days and never finished. If you don't need the ceilometer data, my advice is to change the retention from the default of 270 days to 2 hours:

# powervc-config metering event_ttl --set 2 --unit hr 
# ceilometer-expirer --config-file /etc/ceilometer/ceilometer.conf

If this is not enough and you are still experiencing problems with the update, the best way is to flush the entire database before the update:

# /opt/ibm/powervc/bin/powervc-services stop
# /opt/ibm/powervc/bin/powervc-services db2 start
# /bin/su - pwrvcdb -c "db2 drop database ceilodb2"
# /bin/su - pwrvcdb -c "db2 CREATE DATABASE ceilodb2 AUTOMATIC STORAGE YES ON /home/pwrvcdb DBPATH ON /home/pwrvcdb USING CODESET UTF-8 TERRITORY US COLLATE USING SYSTEM PAGESIZE 16384 RESTRICTIVE"
# /bin/su - pwrvcdb -c "db2 connect to ceilodb2 ; db2 grant dbadm on database to user ceilometer"
# /opt/ibm/powervc/bin/powervc-dbsync ceilometer
# /bin/su - pwrvcdb -c "db2 connect TO ceilodb2; db2 CALL GET_DBSIZE_INFO '(?, ?, ?, 0)' > /tmp/ceilodb2_db_size.out; db2 terminate" > /dev/null

Multi-tenancy ... how to deal with a huge environment

As my environment grows bigger and bigger, I have faced a couple of people trying to force me to multiply the number of PowerVC machines we have. As OpenStack is a solution designed to handle both density and scalability, I said that doing this is just nonsense. Seriously, people who still believe in this have not understood anything about the cloud, OpenStack and PowerVC. Fortunately we found a solution acceptable to everybody. As we are creating what we call "building blocks", we had to find a way to isolate one "block" from another. The solution for host isolation is called multi-tenancy isolation. On the storage side we are just going to play with quotas. By doing this, a user will be able to manage a couple of hosts and the associated storage (storage templates) without having the right to do anything on the others:

multitenancyisolation

Before doing anything create the tenant (or project) and a user associated with it:

# cat /opt/ibm/powervc/version.properties | grep cloud_enabled
cloud_enabled = yes
# cat ~/powervcrc
export OS_USERNAME=root
export OS_PASSWORD=root
export OS_TENANT_NAME=ibm-default
export OS_AUTH_URL=https://powervc.lab.chmod666.org:5000/v3/
export OS_IDENTITY_API_VERSION=3
export OS_CACERT=/etc/pki/tls/certs/powervc.crt
export OS_REGION_NAME=RegionOne
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_DOMAIN_NAME=Default
export OS_COMPUTE_API_VERSION=2.25
export OS_NETWORK_API_VERSION=2.0
export OS_IMAGE_API_VERSION=2
export OS_VOLUME_API_VERSION=2
# source powervcrc
# openstack project create hb01
+-------------+----------------------------------+
| Field       | Value                            |
+-------------+----------------------------------+
| description |                                  |
| domain_id   | default                          |
| enabled     | True                             |
| id          | 90d064b4abea4339acd32a8b6a8b1fdf |
| is_domain   | False                            |
| name        | hb01                             |
| parent_id   | default                          |
+-------------+----------------------------------+
# openstack role list
+----------------------------------+---------------------+
| ID                               | Name                |
+----------------------------------+---------------------+
| 1a76014f12594214a50c36e6a8e3722c | deployer            |
| 54616a8b136742098dd81eede8fd5aa8 | vm_manager          |
| 7bd6de32c14d46f2bd5300530492d4a4 | storage_manager     |
| 8260b7c3a4c24a38ba6bee8e13ced040 | deployer_restricted |
| 9b69a55c6b9346e2b317d0806a225621 | image_manager       |
| bc455ed006154d56ad53cca3a50fa7bd | admin               |
| c19a43973db148608eb71eb3d86d4735 | service             |
| cb130e4fa4dc4f41b7bb4f1fdcf79fc2 | self_service        |
| f1a0c1f9041d4962838ec10671befe33 | vm_user             |
| f8cf9127468045e891d5867ce8825d30 | viewer              |
+----------------------------------+---------------------+
# useradd hb01_admin
# openstack role add --project hb01 --user hb01_admin admin

Then associate each host group (aggregate, in OpenStack terms) allowed for this tenant using the filter_tenant_id metadata (you have to put your allowed hosts in a host group to enable this feature). For each allowed host group, add this field to the metadata of the aggregate (first find the tenant id):

# openstack project list
+----------------------------------+-------------+
| ID                               | Name        |
+----------------------------------+-------------+
| 1471acf124a0479c8d525aa79b2582d0 | ibm-default |
| 90d064b4abea4339acd32a8b6a8b1fdf | hb01        |
| b79b694c70734a80bc561e84a95b313d | powervm     |
| c8c42d45ef9e4a97b3b55d7451d72591 | service     |
| f371d1f29c774f2a97f4043932b94080 | project1    |
+----------------------------------+-------------+
# openstack aggregate list
+----+---------------+-------------------+
| ID | Name          | Availability Zone |
+----+---------------+-------------------+
|  1 | Default Group | None              |
| 21 | aggregate2    | None              |
| 41 | hg2           | None              |
| 43 | hb01_mn       | None              |
| 44 | hb01_me       | None              |
+----+---------------+-------------------+
# nova aggregate-set-metadata hb01_mn filter_tenant_id=90d064b4abea4339acd32a8b6a8b1fdf 
Metadata has been successfully updated for aggregate 43.
| Id | Name    | Availability Zone | Hosts             | Metadata                                                                                                                                   
| 43 | hb01_mn | -                 | '9119MME_1009425' | 'dro_enabled=False', 'filter_tenant_id=90d064b4abea4339acd32a8b6a8b1fdf', 'hapolicy-id=1', 'hapolicy-run_interval=1', 'hapolicy-stabilization=1', 'initialpolicy-id=4', 'runtimepolicy-action=migrate_vm_advise_only', 'runtimepolicy-id=5', 'runtimepolicy-max_parallel=10', 'runtimepolicy-run_interval=5', 'runtimepolicy-stabilization=2', 'runtimepolicy-threshold=70' |
# nova aggregate-set-metadata hb01_me filter_tenant_id=90d064b4abea4339acd32a8b6a8b1fdf 
Metadata has been successfully updated for aggregate 44.
| Id | Name    | Availability Zone | Hosts             | Metadata                                                                                                                                   
| 44 | hb01_me | -                 | '9119MME_0696010' | 'dro_enabled=False', 'filter_tenant_id=90d064b4abea4339acd32a8b6a8b1fdf', 'hapolicy-id=1', 'hapolicy-run_interval=1', 'hapolicy-stabilization=1', 'initialpolicy-id=2', 'runtimepolicy-action=migrate_vm_advise_only', 'runtimepolicy-id=5', 'runtimepolicy-max_parallel=10', 'runtimepolicy-run_interval=5', 'runtimepolicy-stabilization=2', 'runtimepolicy-threshold=70' |

To make this work, add AggregateMultiTenancyIsolation to the scheduler_default_filters entry in the nova.conf file and restart the nova services:

# grep scheduler_default_filter /etc/nova/nova.conf
scheduler_default_filters = RamFilter,CoreFilter,ComputeFilter,RetryFilter,AvailabilityZoneFilter,ImagePropertiesFilter,ComputeCapabilitiesFilter,MaintenanceFilter,PowerVCServerGroupAffinityFilter,PowerVCServerGroupAntiAffinityFilter,PowerVCHostAggregateFilter,PowerVMNetworkFilter,PowerVMProcCompatModeFilter,PowerLMBSizeFilter,PowerMigrationLicenseFilter,PowerVMMigrationCountFilter,PowerVMStorageFilter,PowerVMIBMiMobilityFilter,PowerVMRemoteRestartFilter,PowerVMRemoteRestartSameHMCFilter,PowerVMEndianFilter,PowerVMGuestCapableFilter,PowerVMSharedProcPoolFilter,PowerVCResizeSameHostFilter,PowerVCDROFilter,PowerVMActiveMemoryExpansionFilter,PowerVMNovaLinkMobilityFilter,AggregateMultiTenancyIsolation
# powervc-services restart
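If you are rolling this change out to several PowerVC servers, you can make the edit idempotent. A minimal sketch (the helper name is mine; it assumes scheduler_default_filters sits on a single line, as shown above):

```shell
# Sketch: append AggregateMultiTenancyIsolation to scheduler_default_filters
# only if it is not already present, so running it twice is harmless.
enable_mt_isolation() {
  conf=$1
  grep -q 'AggregateMultiTenancyIsolation' "$conf" || \
    sed -i '/^scheduler_default_filters/ s/$/,AggregateMultiTenancyIsolation/' "$conf"
}

# enable_mt_isolation /etc/nova/nova.conf && powervc-services restart
```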

We are done regarding the hosts.

Enabling quotas

To allow a user/tenant to create volumes on only one storage provider, we first need quotas enabled; check that the quota rules are present in the cinder policy file:

# grep quota /opt/ibm/powervc/policy/cinder/policy.json
    "volume_extension:quotas:show": "",
    "volume_extension:quotas:update": "rule:admin_only",
    "volume_extension:quotas:delete": "rule:admin_only",
    "volume_extension:quota_classes": "rule:admin_only",
    "volume_extension:quota_classes:validate_setup_for_nested_quota_use": "rule:admin_only",

Then set the quota to 0 for all the non-allowed storage templates for this tenant, and leave the only one you want at 10000. Easy:

# cinder --service-type volume type-list
+--------------------------------------+---------------------------------------------+-------------+-----------+
|                  ID                  |                     Name                    | Description | Is_Public |
+--------------------------------------+---------------------------------------------+-------------+-----------+
| 53434872-a0d2-49ea-9683-15c7940b30e5 |               svc2 base template            |      -      |    True   |
| e49e9cc3-efc3-4e7e-bcb9-0291ad28df42 |               svc1 base template            |      -      |    True   |
| f45469d5-df66-44cf-8b60-b226425eee4f |                     svc3                    |      -      |    True   |
+--------------------------------------+---------------------------------------------+-------------+-----------+
# cinder --service-type volume quota-update --volumes 0 --volume-type "svc2" 90d064b4abea4339acd32a8b6a8b1fdf
# cinder --service-type volume quota-update --volumes 0 --volume-type "svc3" 90d064b4abea4339acd32a8b6a8b1fdf
+-------------------------------------------------------+----------+
|                        Property                       |  Value   |
+-------------------------------------------------------+----------+
|                    backup_gigabytes                   |   1000   |
|                        backups                        |    10    |
|                       gigabytes                       | 1000000  |
|              gigabytes_svc2 base template             | 10000000 |
|              gigabytes_svc1 base template             | 10000000 |
|                     gigabytes_svc3                    |    -1    |
|                  per_volume_gigabytes                 |    -1    |
|                       snapshots                       |  100000  |
|             snapshots_svc2 base template              |  100000  |
|             snapshots_svc1 base template              |  100000  |
|                     snapshots_svc3                    |    -1    |
|                        volumes                        |  100000  |
|            volumes_svc2 base template                 |  100000  |
|            volumes_svc1 base template                 |    0     |
|                      volumes_svc3                     |    0     |
+-------------------------------------------------------+----------+
# powervc-services stop
# powervc-services start

By doing this you have enabled the isolation between the two tenants. Then use the appropriate user to do the appropriate task.

PowerVC cinder above the Petabyte

Now that quotas are enabled, use this command if you want to be able to have more than one petabyte of data managed by PowerVC:

# cinder --service-type volume quota-class-update --gigabytes -1 default
# powervc-services stop
# powervc-services start

PowerVC cinder above 10000 luns

Change the osapi_max_limit in cinder.conf if you want to go above the 10000 LUN limit (check every cinder configuration file; the one in cinder.conf is for the global number of volumes):

# grep ^osapi_max_limit cinder.conf
osapi_max_limit = 15000
# powervc-services stop
# powervc-services start
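As this may need to be done in several cinder configuration files, here is a minimal sketch of a helper that adds or updates the line (the function name is mine; it assumes the key = value layout shown above):

```shell
# Sketch: set osapi_max_limit in a cinder config file, adding the line
# if it is missing and rewriting it if it is already there.
set_osapi_max_limit() {
  conf=$1; limit=$2
  if grep -q '^osapi_max_limit' "$conf"; then
    sed -i "s/^osapi_max_limit *=.*/osapi_max_limit = ${limit}/" "$conf"
  else
    echo "osapi_max_limit = ${limit}" >> "$conf"
  fi
}

# for f in /etc/cinder/*.conf; do set_osapi_max_limit "$f" 15000; done
```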

Snapshots and consistency groups

There is a cool new feature available in the latest version of PowerVC (1.3.1.2). This feature allows you to create snapshots of volumes (only on SVC and Storwize for the moment). You can now create consistency groups (groups of volumes) and take snapshots of these consistency groups (allowing you, for instance, to back up a whole volume group directly from OpenStack). I'm doing the example below on the command line because I think it is easier to understand with these commands than by showing you the same thing with the REST API:

First create a consistency group:

# cinder --service-type volume type-list
+--------------------------------------+---------------------------------------------+-------------+-----------+
|                  ID                  |                     Name                    | Description | Is_Public |
+--------------------------------------+---------------------------------------------+-------------+-----------+
| 53434872-a0d2-49ea-9683-15c7940b30e5 |              svc2 base template             |      -      |    True   |
| 862b0a8e-cab4-400c-afeb-99247838f889 |             p8_ssp base template            |      -      |    True   |
| e49e9cc3-efc3-4e7e-bcb9-0291ad28df42 |               svc1 base template            |      -      |    True   |
| f45469d5-df66-44cf-8b60-b226425eee4f |                     svc3                    |      -      |    True   |
+--------------------------------------+---------------------------------------------+-------------+-----------+
# cinder --service-type volume consisgroup-create --name foovg_cg "svc1 base template"
+-------------------+-------------------------------------------+
|      Property     |                   Value                   |
+-------------------+-------------------------------------------+
| availability_zone |                    nova                   |
|     created_at    |         2016-09-11T21:10:58.000000        |
|    description    |                    None                   |
|         id        |    950a5193-827b-49ab-9511-41ba120c9ebd   |
|        name       |                  foovg_cg                 |
|       status      |                  creating                 |
|    volume_types   | [u'e49e9cc3-efc3-4e7e-bcb9-0291ad28df42'] |
+-------------------+-------------------------------------------+
# cinder --service-type volume consisgroup-list
+--------------------------------------+-----------+----------+
|                  ID                  |   Status  |   Name   |
+--------------------------------------+-----------+----------+
| 950a5193-827b-49ab-9511-41ba120c9ebd | available | foovg_cg |
+--------------------------------------+-----------+----------+

Create volumes in this consistency group:

# cinder --service-type volume create --volume-type "svc1 base template" --name foovg_vol1 --consisgroup-id 950a5193-827b-49ab-9511-41ba120c9ebd 200
# cinder --service-type volume create --volume-type "svc1 base template" --name foovg_vol2 --consisgroup-id 950a5193-827b-49ab-9511-41ba120c9ebd 200
+------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
|           Property           |                                                                          Value                                                                           |
+------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
|         attachments          |                                                                            []                                                                            |
|      availability_zone       |                                                                           nova                                                                           |
|           bootable           |                                                                          false                                                                           |
|     consistencygroup_id      |                                                           950a5193-827b-49ab-9511-41ba120c9ebd                                                           |
|          created_at          |                                                                2016-09-11T21:23:02.000000                                                                |
|         description          |                                                                           None                                                                           |
|          encrypted           |                                                                          False                                                                           |
|        health_status         | {u'health_value': u'PENDING', u'id': u'8d078772-00b5-45fc-89c8-82c63e2c48ed', u'value_reason': u'PENDING', u'updated_at': u'2016-09-11T21:23:02.669372'} |
|              id              |                                                           8d078772-00b5-45fc-89c8-82c63e2c48ed                                                           |
|           metadata           |                                                                            {}                                                                            |
|       migration_status       |                                                                           None                                                                           |
|         multiattach          |                                                                          False                                                                           |
|             name             |                                                                        foovg_vol2                                                                        |
|    os-vol-host-attr:host     |                                                                           None                                                                           |
| os-vol-tenant-attr:tenant_id |                                                             1471acf124a0479c8d525aa79b2582d0                                                             |
|      replication_status      |                                                                         disabled                                                                         |
|             size             |                                                                           200                                                                            |
|         snapshot_id          |                                                                           None                                                                           |
|         source_volid         |                                                                           None                                                                           |
|            status            |                                                                         creating                                                                         |
|          updated_at          |                                                                           None                                                                           |
|           user_id            |                                             0688b01e6439ca32d698d20789d52169126fb41fb1a4ddafcebb97d854e836c9                                             |
|         volume_type          |                                                                   svc1 base template                                                                     |
+------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+

You're now able to attach these two volumes to a machine from the PowerVC GUI:

consist

# lsmpio -q
Device           Vendor Id  Product Id       Size    Volume Name
------------------------------------------------------------------------------
hdisk0           IBM        2145                 64G volume-aix72-44c7a72c-000000e0-
hdisk1           IBM        2145                100G volume-snap1-dab0e2d1-130a
hdisk2           IBM        2145                100G volume-snap2-5e863fdb-ab8c
hdisk3           IBM        2145                200G volume-foovg_vol1-3ba0ff59-acd8
hdisk4           IBM        2145                200G volume-foovg_vol2-8d078772-00b5
# cfgmgr
# lspv
hdisk0          00c8b2add70d7db0                    rootvg          active
hdisk1          00f9c9f51afe960e                    None
hdisk2          00f9c9f51afe9698                    None
hdisk3          none                                None
hdisk4          none                                None
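
If you want to double-check which hdisk maps to which PowerVC volume, the volume name column of lsmpio can be parsed directly. A minimal sketch (map_disks is a hypothetical helper, not an AIX command, and assumes the layout shown above: two header lines, then one device per line with the volume name last):

```shell
# Hypothetical helper: turn `lsmpio -q` output into "hdisk -> volume" pairs.
# Skips the two header lines and keeps only hdisk rows.
map_disks() {
  awk 'NR > 2 && $1 ~ /^hdisk/ {print $1, "->", $NF}'
}
# usage on the lpar: lsmpio -q | map_disks
```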

Then you can create a snapshot of these two volumes. It's that easy :-) :

# cinder --service-type volume cgsnapshot-create 950a5193-827b-49ab-9511-41ba120c9ebd
+---------------------+--------------------------------------+
|       Property      |                Value                 |
+---------------------+--------------------------------------+
| consistencygroup_id | 950a5193-827b-49ab-9511-41ba120c9ebd |
|      created_at     |      2016-09-11T21:31:12.000000      |
|     description     |                 None                 |
|          id         | 20e2ce6b-9c4a-4eea-b05d-f0b0b6e4768f |
|         name        |                 None                 |
|        status       |               creating               |
+---------------------+--------------------------------------+
# cinder --service-type volume cgsnapshot-list
+--------------------------------------+-----------+------+
|                  ID                  |   Status  | Name |
+--------------------------------------+-----------+------+
| 20e2ce6b-9c4a-4eea-b05d-f0b0b6e4768f | available |  -   |
+--------------------------------------+-----------+------+
# cinder --service-type volume cgsnapshot-show 20e2ce6b-9c4a-4eea-b05d-f0b0b6e4768f
+---------------------+--------------------------------------+
|       Property      |                Value                 |
+---------------------+--------------------------------------+
| consistencygroup_id | 950a5193-827b-49ab-9511-41ba120c9ebd |
|      created_at     |      2016-09-11T21:31:12.000000      |
|     description     |                 None                 |
|          id         | 20e2ce6b-9c4a-4eea-b05d-f0b0b6e4768f |
|         name        |                 None                 |
|        status       |              available               |
+---------------------+--------------------------------------+

cgsnap
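
The snapshot starts in the "creating" state and flips to "available" asynchronously, so any script driving this should poll cgsnapshot-show. A hedged sketch (wait_for_cgsnapshot is a hypothetical helper built on the cinder CLI used above, parsing its table output):

```shell
# Hypothetical helper: poll a cgsnapshot until it leaves the "creating" state.
wait_for_cgsnapshot() {
  snap_id=$1
  while :; do
    # in the table output "|  status  |  available  |", $2 is the key, $4 the value
    status=$(cinder --service-type volume cgsnapshot-show "$snap_id" \
             | awk '$2 == "status" {print $4}')
    case "$status" in
      available) echo "cgsnapshot $snap_id is available"; return 0 ;;
      error)     echo "cgsnapshot $snap_id failed" >&2;   return 1 ;;
      *)         sleep 5 ;;
    esac
  done
}
# usage: wait_for_cgsnapshot 20e2ce6b-9c4a-4eea-b05d-f0b0b6e4768f
```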

Conclusion

Please keep in mind that the content of this blog post comes from real life and production examples. I hope you now better understand that scalability, density, fast deployment, snapshots and multi-tenancy are features that are absolutely needed in the AIX world. As you can see the PowerVC team is moving fast, probably faster than any customer I have ever seen, and I must admit they are right: moving this fast is the only way to face the Linux x86 offering. And I must confess this is damn fun to work on these things. I'm so happy to have the best of two worlds, AIX/PowerSystems and OpenStack. This is the only direction we can take if we want AIX to survive. So please stop being scared of or unconvinced by these solutions; they are damn good and production ready. Face and embrace the future and stop looking at the past. As always I hope it helps.

NovaLink ‘HMC Co-Management’ and PowerVC 1.3.0.1 Dynamic Resource Optimizer

Everybody now knows that I’m using PowerVC a lot in my current company. My environment is growing bigger and bigger and we are now managing more than 600 virtual machines with PowerVC (the goal is to reach ~3000 this year). Some of them were built by PowerVC itself and some of them were migrated through a homemade python script calling the PowerVC REST API and moving our old vSCSI machines to the new full NPIV/Live Partition Mobility/PowerVC environment (still struggling with the “old men” to move to SSP, but I’m alone versus everybody on this one). I’m happy with that, but (there is always a but) I’m facing a lot of problems. The first one is that we are doing more and more with PowerVC (virtual machine creation, virtual machine resizing, adding additional disks, moving machines with LPM, and finally using this python script to migrate the old machines to the new environment). I realized that the machine hosting PowerVC was getting slower and slower: the more actions we performed, the more “unresponsive” PowerVC became. By this I mean that the GUI was slow and creating objects took longer and longer. By looking at the CPU graphs in lpar2rrd we noticed that the CPU consumption was growing as fast as our activity on PowerVC (check the graph below). The second problem was my teams: unfortunately for me, we have different teams doing different sorts of things here and everybody is using the Hardware Management Consoles their own way; some people are renaming machines, making them unusable with PowerVC, some people were changing the profiles, disabling the synchronization, and even worse we have some third party tools used for capacity planning making the Hardware Management Console unusable by PowerVC. The solution to all these problems is to use NovaLink and especially NovaLink co-management.
By doing this the Hardware Management Consoles will be restricted to a read-only view, and PowerVC will stop querying the HMCs and will instead directly query the NovaLink partition on each host.

cpu_powervc

What is NovaLink ?

If you are using PowerVC you know that it is based on OpenStack. Until now all the OpenStack services were running on the PowerVC host. If you check a PowerVC host today you can see that there is one nova-compute process per managed host. In the example below I’m managing ten hosts so I have ten different nova-compute processes running:

# ps -ef | grep [n]ova-compute
nova       627     1 14 Jan16 ?        06:24:30 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_10D6666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_10D6666.log
nova       649     1 14 Jan16 ?        06:30:25 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_65E6666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_65E6666.log
nova       664     1 17 Jan16 ?        07:49:27 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_1086666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_1086666.log
nova       675     1 19 Jan16 ?        08:40:27 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_06D6666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_06D6666.log
nova       687     1 18 Jan16 ?        08:15:57 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_6576666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_6576666.log
nova       697     1 21 Jan16 ?        09:35:40 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_6556666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_6556666.log
nova       712     1 13 Jan16 ?        06:02:23 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_10A6666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_10A6666.log
nova       728     1 17 Jan16 ?        07:49:02 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_1016666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_1016666.log
nova       752     1 17 Jan16 ?        07:34:45 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_1036666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9119MHE_1036666.log
nova       779     1 13 Jan16 ?        05:54:52 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_6596666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9119MHE_6596666.log
# ps -ef | grep [n]ova-compute | wc -l
10

The goal of NovaLink is to move these processes to a dedicated partition running on each managed host (each PowerSystem). This partition is called the NovaLink partition. It runs an Ubuntu 15.10 Linux OS (little endian) (so it is only available on Power8 hosts) and is in charge of running the OpenStack nova processes. By doing that you distribute the load across all the NovaLink partitions instead of loading a single PowerVC host. Even better, my understanding is that the NovaLink partition is able to communicate directly with the FSP. By using NovaLink you can stop using the Hardware Management Consoles and avoid their slowness. As the NovaLink partition is hosted on the host itself, the RMC connections can now use a direct link (IPv6) through the PowerHypervisor. No more RMC connection problems at all ;-), it’s just awesome. NovaLink allows you to choose between two modes of management:

  • Full NovaLink management: you install your new host directly with NovaLink on it and you will not need a Hardware Management Console anymore (in this case the NovaLink installation is in charge of deploying the Virtual I/O Servers and the SEAs).
  • NovaLink co-management: your host is already installed and you give write access (setmaster) to the NovaLink partition; the Hardware Management Console is limited in this mode (you will not be able to create partitions or modify profiles anymore; it’s not a pure “read only” mode as you will still be able to start and stop the partitions and do a few things from the HMC, but you will be very limited).
  • You can still mix NovaLink and non-NovaLink managed hosts: P7/P6 managed by HMCs, P8 managed by HMCs, P8 NovaLink co-managed and P8 fully NovaLink managed ;-).
  • Nova1

Prerequisites

As always, upgrade your systems to the latest code level if you want to use NovaLink and NovaLink co-management:

  • Power 8 only with firmware version 840. (or later)
  • Virtual I/O Server 2.2.4.10 or later
  • For NovaLink co-management HMC V8R8.4.0
  • Obviously install NovaLink on each NovaLink managed system (install the latest patch version of NovaLink)
  • PowerVC 1.3.0.1 or later

NovaLink installation on an existing system

I’ll show you here how to install a NovaLink partition on an existing deployed system. Installing a new system from scratch is also possible. My advice is to look at this address to start, and to check this youtube video showing how a system is installed from scratch:

The goal of this post is to show you how to set up a co-managed system on an already existing system with Virtual I/O Servers already deployed on the host. My advice is to be very careful. The first thing you’ll need to do is to create a partition (2 VP, 0.5 EC and 5GB of memory) (I’m calling it nova in the example below) and use the virtual optical device to load the NovaLink system on it. In the example below the machine is “SSP” backed. Be very careful when you do that: set up the profile name and all the configuration details before moving to co-managed mode … after that it will be harder for you to change things as the new pvmctl command will be very new to you:

# mkvdev -fbo -vadapter vhost0
vtopt0 Available
# lsrep
Size(mb) Free(mb) Parent Pool         Parent Size      Parent Free
    3059     1579 rootvg                   102272            73216

Name                                                  File Size Optical         Access
PowerVM_NovaLink_V1.1_122015.iso                           1479 None            rw
vopt_a19a8fbb57184aad8103e2c9ddefe7e7                         1 None            ro
# loadopt -disk PowerVM_NovaLink_V1.1_122015.iso -vtd vtopt0
# lsmap -vadapter vhost0 -fmt :
vhost0:U8286.41A.21AFF8V-V2-C40:0x00000003:nova_b1:Available:0x8100000000000000:nova_b1.7f863bacb45e3b32258864e499433b52: :N/A:vtopt0:Available:0x8200000000000000:/var/vio/VMLibrary/PowerVM_NovaLink_V1.1_122015.iso: :N/A
  • At the grub page select the first entry:
  • install1

  • Wait for the machine to boot:
  • install2

  • Choose to perform an installation:
  • install3

  • Accept the licenses
  • install4

  • Set up the padmin user:
    install5

  • Put in your network configuration:
  • install6

  • Accept to install the Ubuntu system:
  • install8

  • You can then modify anything you want in the configuration file (in my case the timezone):
  • install9

    By default NovaLink is (I think, not 100% sure) designed to be installed on SAS disks, so without multipathing. If like me you decide to install the NovaLink partition in a “boot-on-san” lpar, my advice is to launch the installation without any multipathing enabled (only one vscsi adapter or one virtual fibre channel adapter). After the installation is complete, install the Ubuntu multipathd service and configure the second vscsi or virtual fibre channel adapter. If you don’t do that you may experience problems at installation time (RAID error). Please remember that you have to do this before enabling co-management. Last thing about the installation: it may take a lot of time to finish, so be patient (especially the preseed step).

install10

Updating to the latest code level

The iso file provided in the Entitled Software Support is not updated to the latest available NovaLink code. Make a copy of the official repository available at this address: ftp://public.dhe.ibm.com/systems/virtualization/Novalink/debian and serve its content on your own http server (use the command below to copy it):

# wget --mirror ftp://public.dhe.ibm.com/systems/virtualization/Novalink/debian
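
Any web server will do for serving the mirror; one minimal option is the Python stdlib server (the directory below is simply what `wget --mirror` creates, so adjust it to your own layout and virtual host name):

```shell
# Serve the mirrored repo over http with the Python stdlib server.
# The path is the default wget --mirror layout; adjust if yours differs.
cd public.dhe.ibm.com/systems/virtualization/Novalink/debian &&
  python3 -m http.server 8080
```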

Modify /etc/apt/sources.list (and sources.list.d) and comment out all the available deb repositories to keep only your copy:

root@nova:~# grep -v ^# /etc/apt/sources.list
deb http://deckard.lab.chmod666.org/nova/Novalink/debian novalink_1.0.0 non-free
root@nova:/etc/apt/sources.list.d# apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
The following packages will be upgraded:
  pvm-cli pvm-core pvm-novalink pvm-rest-app pvm-rest-server pypowervm
6 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 165 MB of archives.
After this operation, 53.2 kB of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://deckard.lab.chmod666.org/nova/Novalink/debian/ novalink_1.0.0/non-free pypowervm all 1.0.0.1-151203-1553 [363 kB]
Get:2 http://deckard.lab.chmod666.org/nova/Novalink/debian/ novalink_1.0.0/non-free pvm-cli all 1.0.0.1-151202-864 [63.4 kB]
Get:3 http://deckard.lab.chmod666.org/nova/Novalink/debian/ novalink_1.0.0/non-free pvm-core ppc64el 1.0.0.1-151202-1495 [2,080 kB]
Get:4 http://deckard.lab.chmod666.org/nova/Novalink/debian/ novalink_1.0.0/non-free pvm-rest-server ppc64el 1.0.0.1-151203-1563 [142 MB]
Get:5 http://deckard.lab.chmod666.org/nova/Novalink/debian/ novalink_1.0.0/non-free pvm-rest-app ppc64el 1.0.0.1-151203-1563 [21.1 MB]
Get:6 http://deckard.lab.chmod666.org/nova/Novalink/debian/ novalink_1.0.0/non-free pvm-novalink ppc64el 1.0.0.1-151203-408 [1,738 B]
Fetched 165 MB in 7s (20.8 MB/s)
(Reading database ... 72094 files and directories currently installed.)
Preparing to unpack .../pypowervm_1.0.0.1-151203-1553_all.deb ...
Unpacking pypowervm (1.0.0.1-151203-1553) over (1.0.0.0-151110-1481) ...
Preparing to unpack .../pvm-cli_1.0.0.1-151202-864_all.deb ...
Unpacking pvm-cli (1.0.0.1-151202-864) over (1.0.0.0-151110-761) ...
Preparing to unpack .../pvm-core_1.0.0.1-151202-1495_ppc64el.deb ...
Removed symlink /etc/systemd/system/multi-user.target.wants/pvm-core.service.
Unpacking pvm-core (1.0.0.1-151202-1495) over (1.0.0.0-151111-1375) ...
Preparing to unpack .../pvm-rest-server_1.0.0.1-151203-1563_ppc64el.deb ...
Unpacking pvm-rest-server (1.0.0.1-151203-1563) over (1.0.0.0-151110-1480) ...
Preparing to unpack .../pvm-rest-app_1.0.0.1-151203-1563_ppc64el.deb ...
Unpacking pvm-rest-app (1.0.0.1-151203-1563) over (1.0.0.0-151110-1480) ...
Preparing to unpack .../pvm-novalink_1.0.0.1-151203-408_ppc64el.deb ...
Unpacking pvm-novalink (1.0.0.1-151203-408) over (1.0.0.0-151112-304) ...
Processing triggers for ureadahead (0.100.0-19) ...
ureadahead will be reprofiled on next reboot
Setting up pypowervm (1.0.0.1-151203-1553) ...
Setting up pvm-cli (1.0.0.1-151202-864) ...
Installing bash completion script /etc/bash_completion.d/python-argcomplete.sh
Setting up pvm-core (1.0.0.1-151202-1495) ...
addgroup: The group `pvm_admin' already exists.
Created symlink from /etc/systemd/system/multi-user.target.wants/pvm-core.service to /usr/lib/systemd/system/pvm-core.service.
0513-071 The ctrmc Subsystem has been added.
Adding /usr/lib/systemd/system/ctrmc.service for systemctl ...
0513-059 The ctrmc Subsystem has been started. Subsystem PID is 3096.
Setting up pvm-rest-server (1.0.0.1-151203-1563) ...
The user `wlp' is already a member of `pvm_admin'.
Setting up pvm-rest-app (1.0.0.1-151203-1563) ...
Setting up pvm-novalink (1.0.0.1-151203-408) ...

NovaLink and HMC Co-Management configuration

Before adding the hosts to PowerVC you still need to do the most important thing. After the installation is finished, enable the co-management mode to have a system managed by NovaLink while still connected to a Hardware Management Console:

  • Enable the powervm_mgmt_capable attribute on the nova partition:
  • # chsyscfg -r lpar -m br-8286-41A-2166666 -i "name=nova,powervm_mgmt_capable=1"
    # lssyscfg -r lpar -m br-8286-41A-2166666 -F name,powervm_mgmt_capable --filter "lpar_names=nova"
    nova,1
    
  • Enable co-management (please note here that you have to setmaster first (you’ll see that curr_master_name is the HMC) and then relmaster (you’ll see that curr_master_name is the NovaLink partition; this is the state we want to be in)):
  • # lscomgmt -m br-8286-41A-2166666
    is_master=null
    # chcomgmt -m br-8286-41A-2166666 -o setmaster -t norm --terms agree
    # lscomgmt -m br-8286-41A-2166666
    is_master=1,curr_master_name=myhmc1,curr_master_mtms=7042-CR8*2166666,curr_master_type=norm,pend_master_mtms=none
    # chcomgmt -m br-8286-41A-2166666 -o relmaster
    # lscomgmt -m br-8286-41A-2166666
    is_master=0,curr_master_name=nova,curr_master_mtms=3*8286-41A*2166666,curr_master_type=norm,pend_master_mtms=none
    

Going back to HMC managed system

You can go back to a Hardware Management Console managed system whenever you want (set the master to the HMC, delete the nova partition and release the master from the HMC):

# chcomgmt -m br-8286-41A-2166666 -o setmaster -t norm --terms agree
# lscomgmt -m br-8286-41A-2166666
is_master=1,curr_master_name=myhmc1,curr_master_mtms=7042-CR8*2166666,curr_master_type=norm,pend_master_mtms=none
# chlparstate -o shutdown -m br-8286-41A-2166666 --id 9 --immed
# rmsyscfg -r lpar -m br-8286-41A-2166666 --id 9
# chcomgmt -o relmaster -m br-8286-41A-2166666
# lscomgmt -m br-8286-41A-2166666
is_master=0,curr_master_mtms=none,curr_master_type=none,pend_master_mtms=none
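
Since we deploy a couple of hosts per month, the handover is worth scripting for a whole list of managed systems. A hedged sketch run from the HMC shell (handover_to_novalink is a hypothetical wrapper around the chcomgmt/lscomgmt commands shown above; the system name is the placeholder from this post):

```shell
# Hypothetical wrapper: hand co-management over to NovaLink on each system,
# then show the resulting master so you can verify curr_master_name.
handover_to_novalink() {
  for sys in "$@"; do
    chcomgmt -m "$sys" -o setmaster -t norm --terms agree || return 1
    chcomgmt -m "$sys" -o relmaster || return 1
    lscomgmt -m "$sys"
  done
}
# usage: handover_to_novalink br-8286-41A-2166666
```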

Using NovaLink

After the installation you are now able to log in on the NovaLink partition (you can gain root access with the “sudo su -” command). A new command called pvmctl is available on the NovaLink partition allowing you to perform any action (stop and start virtual machines, list Virtual I/O Servers, …). Before trying to add the host, double check that the pvmctl command is working ok.

padmin@nova:~$ pvmctl lpar list
Logical Partitions
+------+----+---------+-----------+---------------+------+-----+-----+
| Name | ID |  State  |    Env    |    Ref Code   | Mem  | CPU | Ent |
+------+----+---------+-----------+---------------+------+-----+-----+
| nova | 3  | running | AIX/Linux | Linux ppc64le | 8192 |  2  | 0.5 |
+------+----+---------+-----------+---------------+------+-----+-----+

Adding hosts

On the PowerVC side add the NovaLink host by choosing the NovaLink option:

addhostnovalink

Some deb packages (ibmpowervc-powervm) will be installed and configured on the NovaLink machine:

addhostnovalink3
addhostnovalink4

By doing this, you can check on each NovaLink machine that a nova-compute process is present (by adding the host, the deb was installed and configured on the NovaLink host):

# ps -ef | grep nova
nova      4392     1  1 10:28 ?        00:00:07 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /var/log/nova/nova-compute.log
root      5218  5197  0 10:39 pts/1    00:00:00 grep --color=auto nova
# grep host_display_name /etc/nova/nova.conf
host_display_name = XXXX-8286-41A-XXXX
# tail -1 /var/log/apt/history.log
Start-Date: 2016-01-18  10:27:54
Commandline: /usr/bin/apt-get -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confold -y install --force-yes --allow-unauthenticated ibmpowervc-powervm
Install: python-keystoneclient:ppc64el (1.6.0-2.ibm.ubuntu1, automatic), python-oslo.reports:ppc64el (0.1.0-1.ibm.ubuntu1, automatic), ibmpowervc-powervm:ppc64el (1.3.0.1), python-ceilometer:ppc64el (5.0.0-201511171217.ibm.ubuntu1.199, automatic), ibmpowervc-powervm-compute:ppc64el (1.3.0.1, automatic), nova-common:ppc64el (12.0.0-201511171221.ibm.ubuntu1.213, automatic), python-oslo.service:ppc64el (0.11.0-2.ibm.ubuntu1, automatic), python-oslo.rootwrap:ppc64el (2.0.0-1.ibm.ubuntu1, automatic), python-pycadf:ppc64el (1.1.0-1.ibm.ubuntu1, automatic), python-nova:ppc64el (12.0.0-201511171221.ibm.ubuntu1.213, automatic), python-keystonemiddleware:ppc64el (2.4.1-2.ibm.ubuntu1, automatic), python-kafka:ppc64el (0.9.3-1.ibm.ubuntu1, automatic), ibmpowervc-powervm-monitor:ppc64el (1.3.0.1, automatic), ibmpowervc-powervm-oslo:ppc64el (1.3.0.1, automatic), neutron-common:ppc64el (7.0.0-201511171221.ibm.ubuntu1.280, automatic), python-os-brick:ppc64el (0.4.0-1.ibm.ubuntu1, automatic), python-tooz:ppc64el (1.22.0-1.ibm.ubuntu1, automatic), ibmpowervc-powervm-ras:ppc64el (1.3.0.1, automatic), networking-powervm:ppc64el (1.0.0.0-151109-25, automatic), neutron-plugin-ml2:ppc64el (7.0.0-201511171221.ibm.ubuntu1.280, automatic), python-ceilometerclient:ppc64el (1.5.0-1.ibm.ubuntu1, automatic), python-neutronclient:ppc64el (2.6.0-1.ibm.ubuntu1, automatic), python-oslo.middleware:ppc64el (2.8.0-1.ibm.ubuntu1, automatic), python-cinderclient:ppc64el (1.3.1-1.ibm.ubuntu1, automatic), python-novaclient:ppc64el (2.30.1-1.ibm.ubuntu1, automatic), python-nova-ibm-ego-resource-optimization:ppc64el (2015.1-201511110358, automatic), python-neutron:ppc64el (7.0.0-201511171221.ibm.ubuntu1.280, automatic), nova-compute:ppc64el (12.0.0-201511171221.ibm.ubuntu1.213, automatic), nova-powervm:ppc64el (1.0.0.1-151203-215, automatic), openstack-utils:ppc64el (2015.2.0-201511171223.ibm.ubuntu1.18, automatic), ibmpowervc-powervm-network:ppc64el (1.3.0.1, automatic), python-oslo.policy:ppc64el 
(0.5.0-1.ibm.ubuntu1, automatic), python-oslo.db:ppc64el (2.4.1-1.ibm.ubuntu1, automatic), python-oslo.versionedobjects:ppc64el (0.9.0-1.ibm.ubuntu1, automatic), python-glanceclient:ppc64el (1.1.0-1.ibm.ubuntu1, automatic), ceilometer-common:ppc64el (5.0.0-201511171217.ibm.ubuntu1.199, automatic), openstack-i18n:ppc64el (2015.2-3.ibm.ubuntu1, automatic), python-oslo.messaging:ppc64el (2.1.0-2.ibm.ubuntu1, automatic), python-swiftclient:ppc64el (2.4.0-1.ibm.ubuntu1, automatic), ceilometer-powervm:ppc64el (1.0.0.0-151119-44, automatic)
End-Date: 2016-01-18  10:28:00

The command line interface

You can do ALL the things you were doing on the HMC using the pvmctl command. The syntax is pretty simple: pvmctl |OBJECT| |ACTION| where OBJECT can be vios, vm, vea (virtual ethernet adapter), vswitch, lu (logical unit), or anything you want, and ACTION can be list, delete, create, update. Here are a few examples:

  • List the Virtual I/O Servers:
  • # pvmctl vios list
    Virtual I/O Servers
    +--------------+----+---------+----------+------+-----+-----+
    |     Name     | ID |  State  | Ref Code | Mem  | CPU | Ent |
    +--------------+----+---------+----------+------+-----+-----+
    | s00ia9940825 | 1  | running |          | 8192 |  2  | 0.2 |
    | s00ia9940826 | 2  | running |          | 8192 |  2  | 0.2 |
    +--------------+----+---------+----------+------+-----+-----+
    
  • List the partitions (note the -d shortcut for --display-fields, allowing me to print some attributes):
  • # pvmctl vm list
    Logical Partitions
    +----------+----+----------+----------+----------+-------+-----+-----+
    |   Name   | ID |  State   |   Env    | Ref Code |  Mem  | CPU | Ent |
    +----------+----+----------+----------+----------+-------+-----+-----+
    | aix72ca> | 3  | not act> | AIX/Lin> | 00000000 |  2048 |  1  | 0.1 |
    |   nova   | 4  | running  | AIX/Lin> | Linux p> |  8192 |  2  | 0.5 |
    | s00vl99> | 5  | running  | AIX/Lin> | Linux p> | 10240 |  2  | 0.2 |
    | test-59> | 6  | not act> | AIX/Lin> | 00000000 |  2048 |  1  | 0.1 |
    +----------+----+----------+----------+----------+-------+-----+-----+
    # pvmctl list vm -d name id 
    [..]
    # pvmctl vm list -i id=4 --display-fields LogicalPartition.name
    name=aix72-1-d3707953-00000090
    # pvmctl vm list  --display-fields LogicalPartition.name LogicalPartition.id LogicalPartition.srr_enabled SharedProcessorConfiguration.desired_virtual SharedProcessorConfiguration.uncapped_weight
    name=aix72capture,id=3,srr_enabled=False,desired_virtual=1,uncapped_weight=64
    name=nova,id=4,srr_enabled=False,desired_virtual=2,uncapped_weight=128
    name=s00vl9940243,id=5,srr_enabled=False,desired_virtual=2,uncapped_weight=128
    name=test-5925058d-0000008d,id=6,srr_enabled=False,desired_virtual=1,uncapped_weight=128
    
  • Delete the virtual adapter on the partition named nova (note the --parent-id to select the partition) with a certain uuid which was found with pvmctl vea list:
  • # pvmctl vea delete --parent-id name=nova --object-id uuid=fe7389a8-667f-38ca-b61e-84c94e5a3c97
    
  • Power off the lpar named aix72-2:
  • # pvmctl vm power-off -i name=aix72-2-536bf0f8-00000091
    Powering off partition aix72-2-536bf0f8-00000091, this may take a few minutes.
    Partition aix72-2-536bf0f8-00000091 power-off successful.
    
  • Delete the lpar named aix72-2:
  • # pvmctl vm delete -i name=aix72-2-536bf0f8-00000091
    
  • Delete the vswitch named MGMTVSWITCH:
  • # pvmctl vswitch delete -i name=MGMTVSWITCH
    
  • Open a console:
  • #  mkvterm --id 4
    vterm for partition 4 is active.  Press Control+] to exit.
    |
    Elapsed time since release of system processors: 57014 mins 10 secs
    [..]
    
  • Power on an lpar:
  • # pvmctl vm power-on -i name=aix72capture
    Powering on partition aix72capture, this may take a few minutes.
    Partition aix72capture power-on successful.
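
The key=value output of --display-fields is easy to post-process when you manage many lpars. A small sketch (report is not a pvmctl feature, just a hypothetical awk helper that flattens each lpar line into tab-separated columns):

```shell
# Hypothetical helper: turn "name=nova,id=4,..." lines into tab-separated values.
report() {
  awk -F, '{
    for (i = 1; i <= NF; i++) { split($i, kv, "="); printf "%s\t", kv[2] }
    print ""
  }'
}
# usage: pvmctl vm list --display-fields LogicalPartition.name LogicalPartition.id | report
```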
    

Is this a dream? No more RMC connectivity problems

I’m 100% sure that you have always had problems with RMC connectivity due to firewall issues, ports not opened, or IDS blocking incoming or outgoing RMC traffic. NovaLink is THE solution that will solve all the RMC problems forever. I’m not joking, it’s a major improvement for PowerVM. As the NovaLink partition is installed on each host, it can communicate through a dedicated IPv6 link with all the partitions hosted on that host. A dedicated virtual switch called MGMTSWITCH is used to allow the RMC flow to transit between all the lpars and the NovaLink partition. Of course this virtual switch must be created and one Virtual Ethernet Adapter must also be created on the NovaLink partition. These are the first two actions to do if you want to implement this solution. Before starting here are a few things you need to know:

  • For security reasons the MGMTSWITCH must be created in VEPA mode. If you are not aware of what the VEPA and VEB modes are, here is a reminder:
  • In VEB mode all the partitions connected to the same vlan can communicate together. We do not want that as it is a security issue.
  • The VEPA mode gives us the ability to isolate lpars that are on the same subnet: lpar to lpar traffic is forced out of the machine. This is what we want.
  • The PVID for this VEPA network is 4094
  • The adapter in the NovaLink partition must be a trunk adapter.
  • It is mandatory to name the VEPA vswitch MGMTSWITCH.
  • At the lpar creation if the MGMTSWITCH exists a new Virtual Ethernet Adapter will be automatically created on the deployed lpar.
  • To be correctly configured the deployed lpar needs the latest level of rsct code (3.2.1.0 for now).
  • The latest cloud-init version must be deployed on the captured lpar used to make the image.
  • You don’t need to configure any address on this adapter; on the deployed lpars the adapter is configured with a link-local address (the IPv6 equivalent of the 169.254.0.0/16 addresses used in IPv4) (please note that any IPv6 adapter must “by design” have a link-local address).

mgmtswitch2

  • Create the virtual switch called MGMTSWITCH in Vepa mode:
  • # pvmctl vswitch create --name MGMTSWITCH --mode=Vepa
    # pvmctl vswitch list  --display-fields VirtualSwitch.name VirtualSwitch.mode 
    name=ETHERNET0,mode=Veb
    name=vdct,mode=Veb
    name=vdcb,mode=Veb
    name=vdca,mode=Veb
    name=MGMTSWITCH,mode=Vepa
    
  • Create a virtual ethernet adapter on the NovaLink partition with the PVID 4094 and a trunk priority set to 1 (it’s a trunk adapter). Note that we now have two adapters on the NovaLink partition (one in IPv4 (routable) and the other one in IPv6 (non-routable)):
  • # pvmctl vea create --pvid 4094 --vswitch MGMTSWITCH --trunk-pri 1 --parent-id name=nova
    # pvmctl vea list --parent-id name=nova
    --------------------------
    | VirtualEthernetAdapter |
    --------------------------
      is_tagged_vlan_supported=False
      is_trunk=False
      loc_code=U8286.41A.216666-V3-C2
      mac=EE3B84FD1402
      pvid=666
      slot=2
      uuid=05a91ab4-9784-3551-bb4b-9d22c98934e6
      vswitch_id=1
    --------------------------
    | VirtualEthernetAdapter |
    --------------------------
      is_tagged_vlan_supported=True
      is_trunk=True
      loc_code=U8286.41A.216666-V3-C34
      mac=B6F837192E63
      pvid=4094
      slot=34
      trunk_pri=1
      uuid=fe7389a8-667f-38ca-b61e-84c94e5a3c97
      vswitch_id=4
    

    Configure the link-local IPv6 address in the NovaLink partition:

    # more /etc/network/interfaces
    [..]
    auto eth1
    iface eth1 inet manual
     up /sbin/ifconfig eth1 0.0.0.0
    # ifup eth1
    # ifconfig eth1
    eth1      Link encap:Ethernet  HWaddr b6:f8:37:19:2e:63
              inet6 addr: fe80::b4f8:37ff:fe19:2e63/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:17 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:0 (0.0 B)  TX bytes:1454 (1.4 KB)
              Interrupt:34
    

Capture an AIX host with the latest version of rsct (3.2.1.0 or later) and the latest version of cloud-init installed. This version of RMC/rsct handles this new feature, so having it installed on the captured host is mandatory. When PowerVC deploys a Virtual Machine on a NovaLink-managed host with this version of rsct installed, a new adapter with PVID 4094 in the virtual switch MGMTSWITCH is created, and all the RMC traffic then uses this adapter instead of your public IP address:

# lslpp -L rsct*
  Fileset                      Level  State  Type  Description (Uninstaller)
  ----------------------------------------------------------------------------
  rsct.core.auditrm          3.2.1.0    C     F    RSCT Audit Log Resource
                                                   Manager
  rsct.core.errm             3.2.1.0    C     F    RSCT Event Response Resource
                                                   Manager
  rsct.core.fsrm             3.2.1.0    C     F    RSCT File System Resource
                                                   Manager
  rsct.core.gui              3.2.1.0    C     F    RSCT Graphical User Interface
  rsct.core.hostrm           3.2.1.0    C     F    RSCT Host Resource Manager
  rsct.core.lprm             3.2.1.0    C     F    RSCT Least Privilege Resource
                                                   Manager
  rsct.core.microsensor      3.2.1.0    C     F    RSCT MicroSensor Resource
                                                   Manager
  rsct.core.rmc              3.2.1.1    C     F    RSCT Resource Monitoring and
                                                   Control
  rsct.core.sec              3.2.1.0    C     F    RSCT Security
  rsct.core.sensorrm         3.2.1.0    C     F    RSCT Sensor Resource Manager
  rsct.core.sr               3.2.1.0    C     F    RSCT Registry
  rsct.core.utils            3.2.1.1    C     F    RSCT Utilities
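Since the 3.2.1.0 level is a hard prerequisite, it is worth checking it before capturing. Here is a minimal, hedged sketch of such a check; on a real AIX host the level string would come from `lslpp -Lqc rsct.core.rmc | cut -d: -f3`, and the function name is mine:

```shell
# Hedged sketch: verify the installed rsct level meets the 3.2.1.0 minimum
# required for RMC over the MGMTSWITCH adapter. The function only compares
# dotted version strings, so it runs anywhere; feed it the real level from
# lslpp on AIX.
rsct_level_ok() {
  # succeeds if $1 >= 3.2.1.0 in numeric dotted-version order
  min="3.2.1.0"
  lowest=$(printf '%s\n%s\n' "$min" "$1" | sort -t. -k1,1n -k2,2n -k3,3n -k4,4n | head -1)
  [ "$lowest" = "$min" ]
}
rsct_level_ok "3.2.1.1" && echo "rsct level OK for MGMTSWITCH RMC" || echo "rsct too old"
```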

When this image is deployed, a new adapter is created in the MGMTSWITCH virtual switch and an IPv6 link-local address is configured on it. You can check the cloud-init activation log to see that the IPv6 address is configured at activation time:

# pvmctl vea list --parent-id name=aix72-2-0a0de5c5-00000095
--------------------------
| VirtualEthernetAdapter |
--------------------------
  is_tagged_vlan_supported=True
  is_trunk=False
  loc_code=U8286.41A.216666-V5-C32
  mac=FA620F66FF20
  pvid=3331
  slot=32
  uuid=7f1ec0ab-230c-38af-9325-eb16999061e2
  vswitch_id=1
--------------------------
| VirtualEthernetAdapter |
--------------------------
  is_tagged_vlan_supported=True
  is_trunk=False
  loc_code=U8286.41A.216666-V5-C33
  mac=46A066611B09
  pvid=4094
  slot=33
  uuid=560c67cd-733b-3394-80f3-3f2a02d1cb9d
  vswitch_id=4
# ifconfig -a
en0: flags=1e084863,14c0
        inet 10.10.66.66 netmask 0xffffff00 broadcast 10.14.33.255
         tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
en1: flags=1e084863,14c0
        inet6 fe80::c032:52ff:fe34:6e4f/64
         tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
sit0: flags=8100041
        inet6 ::10.10.66.66/96
[..]

Note that the link-local address (starting with fe80) is configured at activation time:

# more /var/log/cloud-init-output.log
[..]
auto eth1

iface eth1 inet6 static
    address fe80::c032:52ff:fe34:6e4f
    hwaddress ether c2:32:52:34:6e:4f
    netmask 64
    pre-up [ $(ifconfig eth1 | grep -o -E '([[:xdigit:]]{1,2}:){5}[[:xdigit:]]{1,2}') = "c2:32:52:34:6e:4f" ]
        dns-search fr.net.intra
# entstat -d ent1 | grep -iE "switch|vlan"
Invalid VLAN ID Packets: 0
Port VLAN ID:  4094
VLAN Tag IDs:  None
Switch ID: MGMTSWITCH

To be sure everything is working correctly, here is a proof test. I take down the en0 interface on which the IPv4 public address is configured, then launch a tcpdump on en1 (the MGMTSWITCH adapter), and finally resize the Virtual Machine with PowerVC. AND EVERYTHING IS WORKING GREAT !!!! AWESOME !!! :-) (note the fe80 to fe80 communication):

# ifconfig en0 down detach ; tcpdump -i en1 port 657
tcpdump: WARNING: en1: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en1, link-type 1, capture size 96 bytes
22:00:43.224964 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: S 4049792650:4049792650(0) win 65535 
22:00:43.225022 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: S 2055569200:2055569200(0) ack 4049792651 win 28560 
22:00:43.225051 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: . ack 1 win 32844 
22:00:43.225547 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: P 1:209(208) ack 1 win 32844 
22:00:43.225593 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: . ack 209 win 232 
22:00:43.225638 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: P 1:97(96) ack 209 win 232 
22:00:43.225721 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: P 209:377(168) ack 97 win 32844 
22:00:43.225835 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: P 97:193(96) ack 377 win 240 
22:00:43.225910 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: P 377:457(80) ack 193 win 32844 
22:00:43.226076 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: P 193:289(96) ack 457 win 240 
22:00:43.226154 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: P 457:529(72) ack 289 win 32844 
22:00:43.226210 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: P 289:385(96) ack 529 win 240 
22:00:43.226276 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: P 529:681(152) ack 385 win 32844 
22:00:43.226335 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: P 385:481(96) ack 681 win 249 
22:00:43.424049 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: . ack 481 win 32844 
22:00:44.725800 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.rmc: UDP, length 88
22:00:44.726111 IP6 fe80::9850:f6ff:fe9c:5739.rmc > fe80::d09e:aff:fecf:a868.rmc: UDP, length 88
22:00:50.137605 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.rmc: UDP, length 632
22:00:50.137900 IP6 fe80::9850:f6ff:fe9c:5739.rmc > fe80::d09e:aff:fecf:a868.rmc: UDP, length 88
22:00:50.183108 IP6 fe80::9850:f6ff:fe9c:5739.rmc > fe80::d09e:aff:fecf:a868.rmc: UDP, length 408
22:00:51.683382 IP6 fe80::9850:f6ff:fe9c:5739.rmc > fe80::d09e:aff:fecf:a868.rmc: UDP, length 408
22:00:51.683661 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.rmc: UDP, length 88

To verify the security requirements are met, from the lpar I ping the NovaLink host (the first ping), which answers, and then ping the second lpar (the second ping), which does not work. (And this is exactly what we want !!!)

# ping fe80::d09e:aff:fecf:a868
PING fe80::d09e:aff:fecf:a868 (fe80::d09e:aff:fecf:a868): 56 data bytes
64 bytes from fe80::d09e:aff:fecf:a868: icmp_seq=0 ttl=64 time=0.203 ms
64 bytes from fe80::d09e:aff:fecf:a868: icmp_seq=1 ttl=64 time=0.206 ms
64 bytes from fe80::d09e:aff:fecf:a868: icmp_seq=2 ttl=64 time=0.216 ms
^C
--- fe80::d09e:aff:fecf:a868 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0/0/0 ms
# ping fe80::44a0:66ff:fe61:1b09
PING fe80::44a0:66ff:fe61:1b09 (fe80::44a0:66ff:fe61:1b09): 56 data bytes
^C
--- fe80::44a0:66ff:fe61:1b09 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

PowerVC 1.3.0.1 Dynamic Resource Optimizer

In addition to the NovaLink part of this blog post I also wanted to talk about the killer app of 2016: Dynamic Resource Optimizer. This feature can be used on any PowerVC 1.3.0.1 managed hosts (you obviously need at least two hosts). DRO is in charge of re-balancing your Virtual Machines across all the available hosts (in the host group). To sum up, if a host is experiencing a heavy load and reaches a certain amount of CPU consumption over a period of time, DRO will move your virtual machines to re-balance the load across all the available hosts (this is done at a host level). Here are a few details about DRO:

  • The DRO configuration is done at a host level.
  • You set up a threshold (in the capture below) which, once reached, triggers the Live Partition Mobility or Mobile Core (Power Enterprise Pool) movements.
  • droo6
    droo3

  • To be triggered, this threshold must be reached a certain number of times (stabilization) over a period you define (run interval).
  • You can choose to move virtual machines using Live Partition Mobility, or to move “cores” using Power Enterprise Pool (you can do both; moving CPU will always be preferred over moving partitions).
  • DRO can be run in advise mode (nothing is done, a warning is raised in the new DRO events tab) or in active mode (which does the job and moves things).
    droo2
    droo1

  • Your most critical virtual machines can be excluded from DRO:
  • droo5
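The threshold/stabilization/run-interval mechanics described above can be sketched as follows. This is my own reconstruction for illustration, not PowerVC code, and it assumes the stabilization count requires consecutive over-threshold samples (the exact semantics may differ):

```shell
# Hedged sketch of the DRO trigger logic: the utilization threshold must be
# exceeded for "stabilization" consecutive samples before a migration (or a
# mobile-core move) is triggered. Numbers are illustrative only.
dro_would_trigger() {
  threshold=$1; stabilization=$2; shift 2
  count=0
  for s in "$@"; do                       # one CPU sample per run-interval tick
    if [ "$s" -gt "$threshold" ]; then
      count=$((count + 1))                # sustained load: keep counting
    else
      count=0                             # load dipped back: reset the counter
    fi
    if [ "$count" -ge "$stabilization" ]; then
      echo "trigger"
      return 0
    fi
  done
  echo "no-trigger"
}
dro_would_trigger 70 3 65 71 72 73 60    # sustained spike
dro_would_trigger 70 3 71 72 65 73 74    # short spike, dip resets the counter
```

In other words, a short CPU spike that dips back under the threshold resets the counter and never triggers a move.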

How DRO chooses which machines are moved

I have been running DRO in production for a month now and I have had time to check what is going on behind the scenes. How does DRO choose which machines are moved when a Live Partition Mobility operation must be run to relieve a heavy load on a host? To find out, I launched 3 different cpuhog processes (16 forks, 4VP, SMT4), which eat CPU resources, on three different lpars with 4VP each. On PowerVC I can check that before launching these processes the CPU consumption is fine on this host (the three lpars are running on the same host):

droo4

# cat cpuhog.pl
#!/usr/bin/perl

print "eating the CPUs\n";

foreach $i (1..16) {
      $pid = fork();
      last if $pid == 0;
      print "created PID $pid\n";
}

while (1) {
      $x++;
}
# perl cpuhog.pl
eating the CPUs
created PID 47514604
created PID 22675712
created PID 3015584
created PID 21496152
created PID 25166098
created PID 26018068
created PID 11796892
created PID 33424106
created PID 55444462
created PID 65077976
created PID 13369620
created PID 10813734
created PID 56623850
created PID 19333542
created PID 58393312
created PID 3211988

After waiting a couple of minutes I realized that the virtual machines on which the cpuhog processes were launched are the ones which were migrated. So we can say that PowerVC moves the machines that are eating CPU (another strategy could have been to move the non-CPU-eating machines instead, to let the working ones do their job without subjecting them to a mobility operation).

# errpt | head -3
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
A5E6DB96   0118225116 I S pmig           Client Partition Migration Completed
08917DC6   0118225116 I S pmig           Client Partition Migration Started

After the moves are complete I can see that the load is now fine on the host. DRO has done the job for me and moved the lpars to meet the configured threshold ;-)

droo7dro_effect

The images below show a good example of the “power” of PowerVC and DRO. To update my Virtual I/O Servers to the latest version, the PowerVC maintenance mode was used to free up the Virtual I/O Servers. After leaving maintenance mode, DRO did the job of re-balancing the Virtual Machines across all the hosts (the red arrows symbolize the maintenance mode actions and the purple ones the DRO actions). You can also see that some lpars were moved across 4 different hosts during this process. All these pictures are taken from real life experience on my production systems. This is not a lab environment, this is part of my production. So yes, DRO and PowerVC 1.3.0.1 are production ready. Hell yes!

real1
real2
real3
real4
real5

Conclusion

As my environment grows bigger, the next step for me will be to move to NovaLink on my P8 hosts. Please note that the NovaLink co-management feature is today a “TechPreview” but should go GA very soon. Talking about DRO, I had been waiting for this for years and it finally happened. I can assure you that it is production ready; to prove it I'll just give you this number: to upgrade my Virtual I/O Servers to the 2.2.4.10 release using PowerVC maintenance mode and DRO, more than 1000 Live Partition Mobility moves were performed without any outage on production servers and during working hours. Nobody in my company was aware of this during the operations. It was a seamless experience for everybody.

Tips and tricks for PowerVC 1.2.3 (PVID, ghostdev, clouddev, rest API, growing volumes, deleting boot volume) | PowerVC 1.2.3 Redbook

Writing a Redbook was one of my main goals. After working days and nights for more than 6 years on PowerSystems, IBM gave me the opportunity to write one. I had been looking at the Redbook residencies page for a very long time to find the right one. As there was nothing new on AIX and PowerVM (which are my favorite topics) I decided to give the latest PowerVC Redbook a try (this Redbook is an update, but a huge one; PowerVC is moving fast). I have been a Redbook reader since I started working on AIX. Almost all Redbooks are good; most of them are the best source of information for AIX and Power administrators. I'm sure that, like me, you have seen the part about becoming an author every time you read a Redbook. I can now say THAT IT IS POSSIBLE (for everyone). I'm now one of these guys and you can become one too. Just find the Redbook that fits you and apply on the Redbook webpage (http://www.redbooks.ibm.com/residents.nsf/ResIndex). I want to say a BIG thank you to all the people who gave me this opportunity, especially Philippe Hermes, Jay Kruemcke, Eddie Shvartsman, Scott Vetter, Thomas R Bosthworth. In addition to these people I also want to thank my teammates on this Redbook: Guillermo Corti, Marco Barboni and Liang Xu; they are all true professionals, very skilled and open … this was a great team! One more time, thank you guys. Last, I take the opportunity here to thank the people who believed in me since the very beginning of my AIX career: Julien Gabel, Christophe Rousseau, and JL Guyot. Thank you guys! You deserve it, stay as you are. I'm not an anonymous guy anymore.

redbook

You can download the Redbook at this address: http://www.redbooks.ibm.com/redpieces/pdfs/sg248199.pdf. I learned something during the writing of the Redbook by talking to the members of the team: Redbooks are not there to explain what's “behind the scenes”. A Redbook cannot be too long, and needs to be written in almost 3 weeks; there is no place for everything. Some topics fit better in a blog post than in a Redbook, and Scott told me that a couple of times during the writing session. I totally agree with him. So here is this long awaited blog post. These are advanced topics about PowerVC: read the Redbook before reading this post.

Last, thanks to IBM (and just IBM) for believing in me :-). THANK YOU SO MUCH.

ghostdev, clouddev and cloud-init (ODM wipe if using inactive live partition mobility or remote restart)

Everybody who is using cloud-init should be aware of this. Cloud-init is only supported with AIX versions that have the clouddev attribute available on sys0. To be totally clear, at the time of writing this blog post you will be supported by IBM only if you use AIX 7.1 TL3 SP5 or AIX 6.1 TL9 SP5. All other versions are not supported by IBM. Let me explain why, and how you can still use cloud-init on older versions with a little trick. But let's first explain what the problem is:

Let’s say you have different machines, some using AIX 7100-03-05 and some using 7100-03-04, both using cloud-init for the activation. By looking at the cloud-init code here, we can say that:

  • After the cloud-init installation, cloud-init is:
  • Changing clouddev to 1 if sys0 has a clouddev attribute:
  • # oslevel -s
    7100-03-05-1524
    # lsattr -El sys0 -a ghostdev
    ghostdev 0 Recreate ODM devices on system change / modify PVID True
    # lsattr -El sys0 -a clouddev
    clouddev 1 N/A True
    
  • Changing ghostdev to 1 if sys0 doesn’t have a clouddev attribute:
  • # oslevel -s
    7100-03-04-1441
    # lsattr -El sys0 -a ghostdev
    ghostdev 1 Recreate ODM devices on system change / modify PVID True
    # lsattr -El sys0 -a clouddev
    lsattr: 0514-528 The "clouddev" attribute does not exist in the predefined
            device configuration database.
    

This behavior can directly be observed in the cloud-init code:

ghostdev_clouddev_cloudinit
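The same decision can be sketched in shell. The `pick_capture_attribute` helper is hypothetical and only simulates the `lsattr -El sys0 -a clouddev` test so the sketch runs anywhere; on a real host cloud-init applies the equivalent chdev itself at installation time:

```shell
# Hedged reconstruction of the cloud-init behavior shown above: prefer
# clouddev when sys0 exposes it, fall back to ghostdev otherwise. The chdev
# commands are echoed rather than executed so this sketch is portable.
pick_capture_attribute() {
  # $1 simulates whether `lsattr -El sys0 -a clouddev` succeeds on this level
  if [ "$1" = "has-clouddev" ]; then
    echo "chdev -l sys0 -a clouddev=1"
  else
    echo "chdev -l sys0 -a ghostdev=1"
  fi
}
pick_capture_attribute has-clouddev   # e.g. 7100-03-05
pick_capture_attribute no-clouddev    # e.g. 7100-03-04
```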

Now that we are aware of that, let’s run a remote restart test between two P8 boxes. I take the opportunity here to present one of the coolest features of PowerVC 1.2.3: you can now remote restart your virtual machines directly from the PowerVC GUI if one of your hosts is in a failure state. I highly encourage you to check my latest post about this subject if you don’t know how to set up remote restartable partitions: http://chmod666.org/index.php/using-the-simplified-remote-restart-capability-on-power8-scale-out-servers/:

  • Only simplified remote restart can be managed by PowerVC 1.2.3; the “normal” version of remote restart is not handled by PowerVC 1.2.3.
  • In the compute template configuration there is now a checkbox allowing you to create remote restartable partitions. Be careful: you can’t move back to a P7 box without rebooting the machine, so be sure your Virtual Machines will stay on P8 boxes if you check this option.
  • remote_restart_compute_template

  • When the machine is shut down or there is a problem on it you can click the “Remotely Restart Virtual Machines” button:
  • rr1

  • Select the machines you want to remote restart:
  • rr2
    rr3

  • While the Virtual Machines are remote restarting, you can check the states of the VMs and the state of the host:
  • rr4
    rr5

  • After the evacuation the host is in “Remote Restart Evacuated State”:

rr6

Let’s now check the state of our two Virtual Machines:

  • The ghostdev one (the sys0 message in the errpt indicates that the partition ID has changed AND DEVICES ARE RECREATED (ODM wipe)) (no more ip address set on en0):
  • # errpt | more
    IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
    A6DF45AA   0803171115 I O RMCdaemon      The daemon is started.
    1BA7DF4E   0803171015 P S SRC            SOFTWARE PROGRAM ERROR
    CB4A951F   0803171015 I S SRC            SOFTWARE PROGRAM ERROR
    CB4A951F   0803171015 I S SRC            SOFTWARE PROGRAM ERROR
    D872C399   0803171015 I O sys0           Partition ID changed and devices recreat
    # ifconfig -a
    lo0: flags=e08084b,c0
            inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
            inet6 ::1%1/0
             tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
    
  • The clouddev one (the sys0 message in the errpt indicates that the partition ID has changed; note that the message does not say the devices are recreated):
  • # errpt |more
    60AFC9E5   0803232015 I O sys0           Partition ID changed since last boot.
    # ifconfig -a
    en0: flags=1e084863,480
            inet 10.10.10.20 netmask 0xffffff00 broadcast 10.244.248.63
             tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
    lo0: flags=e08084b,c0
            inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
            inet6 ::1%1/0
             tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
    

VSAE is designed to manage ghostdev-only OSes; cloud-init, on the other hand, is designed to manage clouddev OSes. To be perfectly clear, here is how ghostdev and clouddev work. But first we need to answer a question: why do we need to set clouddev or ghostdev to 1 at all? The answer is pretty obvious: one of these attributes needs to be set to 1 before capturing the Virtual Machine. When you deploy a new Virtual Machine, this flag is needed to wipe the ODM before reconfiguring the virtual machine with the parameters set in the PowerVC GUI (ip, hostname). In both the clouddev and ghostdev cases it is obvious that we need to wipe the ODM at machine build/deploy time. Then VSAE or cloud-init (using the config drive datasource) sets the hostname and ip address on the freshly wiped configuration. This works well for a new deployment because we need to wipe the ODM in all cases, but what about an inactive live partition mobility or a remote restart operation? The Virtual Machine has moved (not on the same host, and not with the same lpar ID) and we need to keep the ODM as it is. How does this work:

  • If you are using VSAE, it manages the ghostdev attribute for you. At capture time ghostdev is set to 1 by VSAE (when you run the pre-capture script). When deploying a new VM, at activation time, VSAE sets ghostdev back to 0. Inactive live partition mobility and remote restart operations will work fine with ghostdev set to 0.
  • If you are using cloud-init on a supported system, clouddev is set to 1 at the installation of cloud-init. As cloud-init does nothing with either attribute at activation time, IBM needed a way to avoid wiping the ODM after a remote restart operation. The clouddev device was introduced for this: it writes a flag in the NVRAM. When a new VM is built there is no flag in the NVRAM for it, so the ODM is wiped. When an already existing VM is remote restarted, the flag exists in the NVRAM, so the ODM is not wiped. By using clouddev there is no post-deploy action needed.
  • If you are using cloud-init on an unsupported system, ghostdev is set to 1 at the installation of cloud-init. As cloud-init does nothing at post-deploy time, ghostdev will remain 1 in all cases and the ODM will always be wiped.

cloudghost

There is a way to use cloud-init on unsupported systems. Keep in mind that in this case you will not be supported by IBM, so do this at your own risk. To be totally honest, I'm using this method in production to use the same activation engine for all my AIX versions:

  1. Pre-capture: set ghostdev to 1. Whatever happens, THIS IS MANDATORY.
  2. Post-capture: reboot the captured VM and set ghostdev to 0.
  3. Post-deploy: on every Virtual Machine set ghostdev to 0. You can put this in the activation input to do the job:
  4. #cloud-config
    runcmd:
     - chdev -l sys0 -a ghostdev=0
    

The PVID problem

I realized I had this problem after using PowerVC for a while. As PowerVC images for rootvg and other volume groups are created using Storage Volume Controller flashcopy (in the case of an SVC configuration, but there are similar mechanisms for other storage providers), the PVID for both rootvg and additional volume groups will always be the same for each new virtual machine (all new virtual machines will have the same PVID for their rootvg, and the same PVID for each captured volume group). I contacted IBM about this and the PowerVC team told me that this behavior is totally normal and has been observed since the release of VMControl. They have had no issues related to it, so if you don't care about it, just do nothing and keep this behavior as it is. I recommend doing nothing about this!

It’s a shame, but most AIX administrators like to keep things as they are and don’t want any changes. (In my humble opinion this is one of the reasons AIX is so outdated compared to Linux: we need a community, not narrow-minded people keeping their knowledge to themselves just to stay in their daily job routine without having anything to learn.) If you are in this situation, facing angry colleagues about this particular point, you can use the solution proposed below to calm the passions of the few who do not want to change! :-). This is my rant: CHANGE!

By default, if you build two virtual machines and check the PVIDs of each one, you will notice that the PVIDs are the same:

  • Machine A:
  • root@machinea:/root# lspv
    hdisk0          00c7102d2534adac                    rootvg          active
    hdisk1          00c7102d00d14660                    appsvg          active
    
  • Machine B:
  • root@machineb:root# lspv
    hdisk0          00c7102d2534adac                    rootvg          active
    hdisk1          00c7102d00d14660                    appsvg          active
    

For the rootvg the PVID is always set to 00c7102d2534adac and for the appsvg the PVID is always set to 00c7102d00d14660.

For the rootvg the solution is to change ghostdev (only ghostdev) to 2 and reboot the machine. Setting ghostdev to 2 changes the PVID of the rootvg at reboot time (after the PVID is changed, ghostdev is automatically set back to 0):

# lsattr -El sys0 -a ghostdev
ghostdev 2 Recreate ODM devices on system change / modify PVID True
# lsattr -l sys0 -R -a ghostdev
0...3 (+1)

For non-rootvg volume groups this is a little bit tricky but still possible: the solution is to use the recreatevg command (-d option) to change the PVID of all the physical volumes of your volume group. Before rebooting the server:

  • Unmount all the filesystems in the volume group whose PVID you want to change.
  • Varyoff the volume group.
  • Get the names of the physical volumes composing the volume group.
  • Export the volume group.
  • Recreate the volume group with recreatevg (this action changes the PVIDs).
  • Re-import the volume group.

Here are the shell commands doing the trick:

# vg=appsvg
# lsvg -l $vg | awk '$6 == "open/syncd" && $7 != "N/A" { print "fuser -k " $NF }' | sh
# lsvg -l $vg | awk '$6 == "open/syncd" && $7 != "N/A" { print "umount " $NF }' | sh
# varyoffvg $vg
# pvs=$(lspv | awk -v my_vg=$vg '$3 == my_vg {print $1}')
# exportvg $vg
# recreatevg -y $vg -d $pvs
# importvg -y $vg $(echo ${pvs} | awk '{print $1}')

We now agree that you want to do this, but as you are a smart person you want to do it automatically using cloud-init and the activation input. There are two ways to do it: the silly way (using shell) and the noble way (using the cloud-init syntax):

PowerVC activation engine (shell way)

Use this short ksh script in the activation input. This is not my recommendation, but you can do it for simplicity:

activation_input_shell

PowerVC activation engine (cloudinit way)

Here is the cloud-init way. Important note: use the latest version of cloud-init; the first one I used had a problem with cc_power_state_change.py not using the right parameters for AIX:

activation_input_ci

Working with REST Api

I will not show you here how to work with the PowerVC RESTful API; I prefer to share a couple of scripts on my github account. Good examples are often better than how-to tutorials, so check the scripts on github if you want a detailed how-to … the scripts are well commented. Just a couple of things to say before closing this topic: the best way to work with a RESTful API is to code in python; there are a lot of existing python libs for RESTful APIs (httplib2, pycurl, requests). For my own understanding I prefer using the simple httplib in my scripts. I will put all my command line tools in a github repository called pvcmd (for PowerVC command line). You can download the scripts at this address, or just use git to clone the repo. One more time, it is a community project, feel free to change and share anything: https://github.com/chmod666org/pvcmd
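If you just want to poke at the API before diving into the scripts, authentication is standard OpenStack Keystone v3 password auth. The sketch below only builds and prints the request body; the curl line is commented out, and the host name, port, and project name (“ibm-default”) are assumptions from my own setup that you should check against your PowerVC:

```shell
# Hedged sketch: build the Keystone v3 password-auth body used to get a token
# from the PowerVC REST API. Only the body is produced here; the actual call
# (commented) would return the token in the X-Subject-Token response header.
user=root; pass=mysecretpassword; project=ibm-default
body=$(printf '{"auth": {"identity": {"methods": ["password"], "password": {"user": {"name": "%s", "domain": {"name": "Default"}, "password": "%s"}}}, "scope": {"project": {"name": "%s", "domain": {"name": "Default"}}}}}' "$user" "$pass" "$project")
echo "$body"
# On a real PowerVC host (host/port are placeholders):
# curl -k -s -i -X POST https://powervc:5000/v3/auth/tokens \
#      -H "Content-Type: application/json" -d "$body" | grep -i x-subject-token
```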

Growing data lun

To be totally honest, here is what I do when I'm creating a new machine with PowerVC. My customers always need one additional volume group for applications (we will call it appsvg). I've created a multi-volume image with this volume group already created (with a bunch of filesystems in it). As most customers ask for the volume group to be 100g large, the capture was made with this size. Unfortunately for me, we often get requests for bigger volume groups, let's say 500 or 600 GB. Instead of creating a new lun and extending the volume group, PowerVC allows you to grow the lun to the desired size. For volume groups other than the boot one you must use the RESTful API to extend the volume. To do this I've created a python script called pvcgrowlun (feel free to check the code on github): https://github.com/chmod666org/pvcmd/blob/master/pvcgrowlun. At each virtual machine creation I check if the customer needs a larger volume group and extend it using the command provided below.

While coding this script I hit a problem using the os-extend parameter in my http request. PowerVC does not use exactly the same parameters as OpenStack, so if you want to code this yourself be aware of this and check the PowerVC online documentation when you are using “extended attributes” (thanks to Christine L Wang for this one):

  • In the Openstack documentation the attribute is “os-extend” link here:
  • os-extend

  • In the PowerVC documentation the attribute is “ibm-extend” link here:
  • ibm-extend

  • Identify the lun you want to grow (the script takes the name of the volume as a parameter) (I have another script, not published, to list all the volumes; tell me if you want it). In my case the volume name is multi-vol-bf697dfa-0000003a-828641A_XXXXXX-data-1, and I want to change its size from 60 to 80. This is not stated in the official PowerVC documentation but it works for both boot and data luns.
  • Check that the current size of the lun is less than the desired size:
  • before_grow

  • Run the script:
  • # pvcgrowlun -v multi-vol-bf697dfa-0000003a-828641A_XXXXX-data-1 -s 80 -p localhost -u root -P mysecretpassword
    [info] growing volume multi-vol-bf697dfa-0000003a-828641A_XXXXX-data-1 with id 840d4a60-2117-4807-a2d8-d9d9f6c7d0bf
    JSON Body: {"ibm-extend": {"new_size": 80}}
    [OK] Call successful
    None
    
  • Check the size has changed after the command execution:
  • aftergrow_grow

  • Don’t forget to do the job in the operating system by running a “chvg -g” (check the TOTAL PPs here):
  • # lsvg appsvg
    VOLUME GROUP:       appsvg                   VG IDENTIFIER:  00f9aff800004c000000014e6ee97071
    VG STATE:           active                   PP SIZE:        256 megabyte(s)
    VG PERMISSION:      read/write               TOTAL PPs:      239 (61184 megabytes)
    MAX LVs:            256                      FREE PPs:       239 (61184 megabytes)
    LVs:                0                        USED PPs:       0 (0 megabytes)
    OPEN LVs:           0                        QUORUM:         2 (Enabled)
    TOTAL PVs:          1                        VG DESCRIPTORS: 2
    STALE PVs:          0                        STALE PPs:      0
    ACTIVE PVs:         1                        AUTO ON:        yes
    MAX PPs per VG:     32768                    MAX PVs:        1024
    LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
    HOT SPARE:          no                       BB POLICY:      relocatable
    MIRROR POOL STRICT: off
    PV RESTRICTION:     none                     INFINITE RETRY: no
    DISK BLOCK SIZE:    512                      CRITICAL VG:    no
    # chvg -g appsvg
    # lsvg appsvg
    VOLUME GROUP:       appsvg                  VG IDENTIFIER:  00f9aff800004c000000014e6ee97071
    VG STATE:           active                   PP SIZE:        256 megabyte(s)
    VG PERMISSION:      read/write               TOTAL PPs:      319 (81664 megabytes)
    MAX LVs:            256                      FREE PPs:       319 (81664 megabytes)
    LVs:                0                        USED PPs:       0 (0 megabytes)
    OPEN LVs:           0                        QUORUM:         2 (Enabled)
    TOTAL PVs:          1                        VG DESCRIPTORS: 2
    STALE PVs:          0                        STALE PPs:      0
    ACTIVE PVs:         1                        AUTO ON:        yes
    MAX PPs per VG:     32768                    MAX PVs:        1024
    LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
    HOT SPARE:          no                       BB POLICY:      relocatable
    MIRROR POOL STRICT: off
    PV RESTRICTION:     none                     INFINITE RETRY: no
    DISK BLOCK SIZE:    512                      CRITICAL VG:    no
    

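If you script this check across many virtual machines, the value to compare before and after the `chvg -g` is TOTAL PPs. Here is a tiny, hypothetical Python helper (the function name is mine) that parses lsvg output captured, for example, over ssh:

```python
import re

def total_pps(lsvg_output):
    """Extract the TOTAL PPs value from `lsvg <vg>` output."""
    m = re.search(r"TOTAL PPs:\s+(\d+)", lsvg_output)
    if m is None:
        raise ValueError("TOTAL PPs not found in lsvg output")
    return int(m.group(1))

# Sample lines taken from the lsvg output above
before = "VG PERMISSION:      read/write               TOTAL PPs:      239 (61184 megabytes)"
grown  = "VG PERMISSION:      read/write               TOTAL PPs:      319 (81664 megabytes)"
print(total_pps(before), total_pps(grown))  # 239 319
```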
My own script to create VMs

I’m creating Virtual Machines every week, sometimes just a couple and sometimes ten in a row. We are using different storage connectivity groups and different storage templates depending on whether the machine is in production, in development, and so on. We also have to choose the primary copy on the SVC side if the machine is in production (I am using a stretched cluster between two distant sites, so I have to choose different storage templates depending on the site where the Virtual Machine is hosted). I make mistakes almost every time I use the PowerVC GUI (sometimes I forget to put the machine name, sometimes the connectivity group). I’m a lazy guy, so I decided to code a script using the PowerVC REST API to create new machines based on a template file. We are planning to give the script to our outsourced teams to allow them to create machines without knowing what PowerVC is \o/. The script takes a file as a parameter and creates the virtual machine:

  • Create a file like the one below with all the information needed for your new virtual machine creation (name, ip address, vlan, host, image, storage connectivity group, ….):
  • # cat test.vm
    name:test
    ip_address:10.16.66.20
    vlan:vlan666
    target_host:Default Group
    image:multi-vol
    storage_connectivity_group:npiv
    virtual_processor:1
    entitled_capacity:0.1
    memory:1024
    storage_template:storage1
    
  • Launch the script, the Virtual Machine will be created:
  • pvcmkvm -f test.vm -p localhost -u root -P mysecretpassword
    name: test
    ip_address: 10.16.66.20
    vlan: vlan666
    target_host: Default Group
    image: multi-vol
    storage_connectivity_group: npiv
    virtual_processor: 1
    entitled_capacity: 0.1
    memory: 1024
    storage_template: storage1
    [info] found image multi-vol with id 041d830c-8edf-448b-9892-560056c450d8
    [info] found network vlan666 with id 5fae84a7-b463-4a1a-b4dd-9ab24cdb66b5
    [info] found host aggregation Default Group with id 1
    [info] found storage template storage1 with id bfb4f8cc-cd68-46a2-b3a2-c715867de706
    [info] found image multi-vol with id 041d830c-8edf-448b-9892-560056c450d8
    [info] found a volume with id b3783a95-822c-4179-8c29-c7db9d060b94
    [info] found a volume with id 9f2fc777-eed3-4c1f-8a02-00c9b7c91176
    JSON Body: {"os:scheduler_hints": {"host_aggregate_id": 1}, "server": {"name": "test", "imageRef": "041d830c-8edf-448b-9892-560056c450d8", "networkRef": "5fae84a7-b463-4a1a-b4dd-9ab24cdb66b5", "max_count": 1, "flavor": {"OS-FLV-EXT-DATA:ephemeral": 10, "disk": 60, "extra_specs": {"powervm:max_proc_units": 32, "powervm:min_mem": 1024, "powervm:proc_units": 0.1, "powervm:max_vcpu": 32, "powervm:image_volume_type_b3783a95-822c-4179-8c29-c7db9d060b94": "bfb4f8cc-cd68-46a2-b3a2-c715867de706", "powervm:image_volume_type_9f2fc777-eed3-4c1f-8a02-00c9b7c91176": "bfb4f8cc-cd68-46a2-b3a2-c715867de706", "powervm:min_proc_units": 0.1, "powervm:storage_connectivity_group": "npiv", "powervm:min_vcpu": 1, "powervm:max_mem": 66560}, "ram": 1024, "vcpus": 1}, "networks": [{"fixed_ip": "10.16.66.20", "uuid": "5fae84a7-b463-4a1a-b4dd-9ab24cdb66b5"}]}}
    {u'server': {u'links': [{u'href': u'https://powervc.lab.chmod666.org:8774/v2/1471acf124a0479c8d525aa79b2582d0/servers/fc3ab837-f610-45ad-8c36-f50c04c8a7b3', u'rel': u'self'}, {u'href': u'https://powervc.lab.chmod666.org:8774/1471acf124a0479c8d525aa79b2582d0/servers/fc3ab837-f610-45ad-8c36-f50c04c8a7b3', u'rel': u'bookmark'}], u'OS-DCF:diskConfig': u'MANUAL', u'id': u'fc3ab837-f610-45ad-8c36-f50c04c8a7b3', u'security_groups': [{u'name': u'default'}], u'adminPass': u'u7rgHXKJXoLz'}}
    

One of the major advantages of using this is batching Virtual Machine creation: with the script you can create one hundred Virtual Machines in a couple of minutes. Awesome!
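I can’t paste the whole script here, but its core is easy to sketch. Below is a hypothetical Python outline (the function names are mine, not from the actual script) showing how the template file maps to the JSON body shown in the output above; the real script first resolves the image, network, host aggregate and storage template IDs through the PowerVC REST API, then POSTs this body to the nova /servers endpoint:

```python
import json

def parse_template(text):
    """Parse the simple key:value template format shown above."""
    vm = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        vm[key.strip()] = value.strip()
    return vm

def build_boot_body(vm, image_id, network_id, aggregate_id):
    """Build a (simplified) nova boot body like the 'JSON Body' above.
    The IDs are resolved beforehand via the PowerVC REST API."""
    return {
        "os:scheduler_hints": {"host_aggregate_id": aggregate_id},
        "server": {
            "name": vm["name"],
            "imageRef": image_id,
            "networkRef": network_id,
            "max_count": 1,
            "flavor": {
                "ram": int(vm["memory"]),
                "vcpus": int(vm["virtual_processor"]),
                "extra_specs": {
                    "powervm:proc_units": float(vm["entitled_capacity"]),
                    "powervm:storage_connectivity_group": vm["storage_connectivity_group"],
                },
            },
            "networks": [{"fixed_ip": vm["ip_address"], "uuid": network_id}],
        },
    }

template = """name:test
ip_address:10.16.66.20
vlan:vlan666
virtual_processor:1
entitled_capacity:0.1
memory:1024
storage_connectivity_group:npiv"""

vm = parse_template(template)
body = build_boot_body(vm, "041d830c-8edf-448b-9892-560056c450d8",
                       "5fae84a7-b463-4a1a-b4dd-9ab24cdb66b5", 1)
print(json.dumps(body, indent=2))
```

The real boot request also carries the powervm:min/max attributes and one powervm:image_volume_type_* entry per image volume, as you can see in the JSON body above.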

Working with Openstack commands

PowerVC is based on OpenStack, so why not use the OpenStack commands to work with PowerVC? It is possible, but I repeat one more time that this is not supported by IBM at all. Use this trick at your own risk. I was working with IBM Cloud Manager with OpenStack (ICMO), and a script including shell variables is provided to “talk” to the ICMO OpenStack. Based on that file I created the same one for PowerVC. Before using any OpenStack commands, create a powervcrc file that matches your PowerVC environment:

# cat powervcrc
export OS_USERNAME=root
export OS_PASSWORD=mypasswd
export OS_TENANT_NAME=ibm-default
export OS_AUTH_URL=https://powervc.lab.chmod666.org:5000/v3/
export OS_IDENTITY_API_VERSION=3
export OS_CACERT=/etc/pki/tls/certs/powervc.crt
export OS_REGION_NAME=RegionOne
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_DOMAIN_NAME=Default

Then source the powervcrc file, and you are ready to play with all Openstack commands:

# source powervcrc

You can then play with OpenStack commands; here are a few nice examples:

  • List virtual machines:
  • # nova list
    +--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
    | ID                                   | Name                  | Status | Task State | Power State | Networks               |
    +--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
    | dc5c9fce-c839-43af-8af7-e69f823e57ca | ghostdev0clouddev1    | ACTIVE | -          | Running     | vlan666=10.16.66.56    |
    | d7d0fd7e-a580-41c8-b3d8-d7aab180d861 | ghostdevto1cloudevto1 | ACTIVE | -          | Running     | vlan666=10.16.66.57    |
    | bf697dfa-f69a-476c-8d0f-abb2fdcb44a7 | multi-vol             | ACTIVE | -          | Running     | vlan666=10.16.66.59    |
    | 394ab4d4-729e-44c7-a4d0-57bf2c121902 | deckard               | ACTIVE | -          | Running     | vlan666=10.16.66.60    |
    | cd53fb69-0530-451b-88de-557e86a2e238 | priss                 | ACTIVE | -          | Running     | vlan666=10.16.66.61    |
    | 64a3b1f8-8120-4388-9d64-6243d237aa44 | rachael               | ACTIVE | -          | Running     |                        |
    | 2679e3bd-a2fb-4a43-b817-b56ead26852d | batty                 | ACTIVE | -          | Running     |                        |
    | 5fdfff7c-fea0-431a-b99b-fe20c49e6cfd | tyrel                 | ACTIVE | -          | Running     |                        |
    +--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
    
  • Reboot a machine:
  • # nova reboot multi-vol
    
  • List the hosts:
  • # nova hypervisor-list
    +----+---------------------+-------+---------+
    | ID | Hypervisor hostname | State | Status  |
    +----+---------------------+-------+---------+
    | 21 | 828641A_XXXXXXX     | up    | enabled |
    | 23 | 828641A_YYYYYYY     | up    | enabled |
    +----+---------------------+-------+---------+
    
  • Migrate a virtual machine (run a live partition mobility operation):
  • # nova live-migration ghostdevto1cloudevto1 828641A_YYYYYYY
    
  • Evacuate a server: set it in maintenance mode and move all its partitions to another host:
  • # nova maintenance-enable --migrate active-only --target-host 828641A_XXXXXX 828641A_YYYYYYY
    
  • Virtual Machine creation (output truncated):
  • # nova boot --image 7100-03-04-cic2-chef --flavor powervm.tiny --nic net-id=5fae84a7-b463-4a1a-b4dd-9ab24cdb66b5,v4-fixed-ip=10.16.66.51 novacreated
    +-------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
    | Property                            | Value                                                                                                                                            |
    +-------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
    | OS-DCF:diskConfig                   | MANUAL                                                                                                                                           |
    | OS-EXT-AZ:availability_zone         | nova                                                                                                                                             |
    | OS-EXT-SRV-ATTR:host                | -                                                                                                                                                |
    | OS-EXT-SRV-ATTR:hypervisor_hostname | -                                                                                                                                                |
    | OS-EXT-SRV-ATTR:instance_name       | novacreated-bf704dc6-00000040                                                                                                                    |
    | OS-EXT-STS:power_state              | 0                                                                                                                                                |
    | OS-EXT-STS:task_state               | scheduling                                                                                                                                       |
    | OS-EXT-STS:vm_state                 | building                                                                                                                                         |
    | accessIPv4                          |                                                                                                                                                  |
    | accessIPv6                          |                                                                                                                                                  |
    | adminPass                           | PDWuY2iwwqQZ                                                                                                                                     |
    | avail_priority                      | -                                                                                                                                                |
    | compliance_status                   | [{"status": "compliant", "category": "resource.allocation"}]                                                                                     |
    | cpu_utilization                     | -                                                                                                                                                |
    | cpus                                | 1                                                                                                                                                |
    | created                             | 2015-08-05T15:56:01Z                                                                                                                             |
    | current_compatibility_mode          | -                                                                                                                                                |
    | dedicated_sharing_mode              | -                                                                                                                                                |
    | desired_compatibility_mode          | -                                                                                                                                                |
    | endianness                          | big-endian                                                                                                                                       |
    | ephemeral_gb                        | 0                                                                                                                                                |
    | flavor                              | powervm.tiny (ac01ba9b-1576-450e-a093-92d53d4f5c33)                                                                                              |
    | health_status                       | {"health_value": "PENDING", "id": "bf704dc6-f255-46a6-b81b-d95bed00301e", "value_reason": "PENDING", "updated_at": "2015-08-05T15:56:02.307259"} |
    | hostId                              |                                                                                                                                                  |
    | id                                  | bf704dc6-f255-46a6-b81b-d95bed00301e                                                                                                             |
    | image                               | 7100-03-04-cic2-chef (96f86941-8480-4222-ba51-3f0c1a3b072b)                                                                                      |
    | metadata                            | {}                                                                                                                                               |
    | name                                | novacreated                                                                                                                                      |
    | operating_system                    | -                                                                                                                                                |
    | os_distro                           | aix                                                                                                                                              |
    | progress                            | 0                                                                                                                                                |
    | root_gb                             | 60                                                                                                                                               |
    | security_groups                     | default                                                                                                                                          |
    | status                              | BUILD                                                                                                                                            |
    | storage_connectivity_group_id       | -                                                                                                                                                |
    | tenant_id                           | 1471acf124a0479c8d525aa79b2582d0                                                                                                                 |
    | uncapped                            | -                                                                                                                                                |
    | updated                             | 2015-08-05T15:56:02Z                                                                                                                             |
    | user_id                             | 0688b01e6439ca32d698d20789d52169126fb41fb1a4ddafcebb97d854e836c9                                                                                 |
    +-------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
    
    

LUN order, remove a boot lun

If you are moving to PowerVC you will probably need to migrate existing machines to your PowerVC environment. One of my customers asked us to move machines from old boxes using vscsi to new PowerVC-managed boxes using NPIV. I am doing it with the help of an SVC on the storage side. Instead of creating the Virtual Machine profile on the HMC and then doing the zoning and masking on the Storage Volume Controller and on the SAN switches, I decided to let PowerVC do the job for me. Unfortunately, PowerVC can’t just “carve” an empty Virtual Machine; if you want one you have to build a full Virtual Machine (rootvg included). This is what I am doing. During the migration process I have to replace the PowerVC-created lun by the lun used for the migration … and finally delete the PowerVC-created boot lun. There is a trick to know if you want to do this:

  • Let’s say the lun created by PowerVC is the one named “volume-clouddev-test….” and the original rootvg is named “good_rootvg”. The Virtual Machine is booted on the “good_rootvg” lun and I want to remove the “volume-clouddev-test….”:
  • root1

  • You first have to click the “Edit Details” button:
  • root2

  • Then toggle the boot set to “YES” for the “good_rootvg” lun and click move up (the rootvg must be moved to order 1; this is mandatory, as the lun at order 1 can’t be detached):
  • root3

  • Toggle the boot set to “NO” for the PowerVC created rootvg:
  • root4

  • If you try to detach the volume in first position you will get an error:
  • root5

  • When the order is correct, you can detach and delete the lun created by PowerVC:
  • root6
    root7

Conclusion

There are always good things to learn about PowerVC and related AIX topics. Tell me if these tricks are useful for you and I will continue to write posts like this one. You don’t need to understand all these details to work with PowerVC; most customers don’t. But I’m sure you prefer to understand what is going on “behind the scenes” instead of just clicking a nice GUI. I hope this helps you to better understand what PowerVC is made of. And don’t be shy, share your tricks with me. Next: more to come about Chef! Up the irons!

Using the Simplified Remote Restart capability on Power8 Scale Out Servers

A few weeks ago I had to work on simplified remote restart. I’m not lucky enough yet -because of some political decisions in my company- to have access to any E880 or E870; we just have a few scale-out machines to play with (S814). For some critical applications we need to be able, in the future, to restart the virtual machines if the system hosting them fails (hardware problem). We decided a couple of months ago not to use remote restart because it required a reserved storage pool device, which made it too hard to manage. We now have enough P8 boxes to try and understand the new version of remote restart, called simplified remote restart, which does not need any reserved storage pool device. For those who want to understand what remote restart is, I strongly recommend checking my previous blog post about remote restart on two P7 boxes: Configuration of a remote restart partition. For the others, here is what I learned about the simplified version of this awesome feature.

Please keep in mind that the FSP of the machine must be up to perform a simplified remote restart operation. It means that if, for instance, you lose one of your datacenters or the link between your two datacenters, you cannot use simplified remote restart to restart your partitions on the main/backup site. Simplified Remote Restart only protects you against a hardware failure of your machine. Maybe this will change in the near future, but for the moment it is the most important thing to understand about simplified remote restart.

Updating to the latest version of firmware

I was very surprised when I got my Power8 machines. After deploying these boxes I decided to give simplified remote restart a try, but it was just not possible: since the Power8 Scale Out servers were released, they were NOT simplified remote restart capable. The release of the SV830 firmware now enables Simplified Remote Restart on Power8 Scale Out machines. Please note that there is nothing about it in the patch note, so chmod666.org is the only place where you can get this information :-). Here is the patch note: here. One last word: you will read on the internet that you need a Power8 to use simplified remote restart. It’s true, but only partially true. YOU NEED A P8 MACHINE WITH AT LEAST AN 820 FIRMWARE.

The first thing to do is to update your firmware to the SV830 version (on both systems participating in the simplified remote restart operation):

# updlic -o u -t sys -l latest -m p814-1 -r mountpoint -d /home/hscroot/SV830_048 -v
[..]
# lslic -m p814-1 -F activated_spname,installed_level,ecnumber
FW830.00,48,01SV830
# lslic -m p814-2 -F activated_spname,installed_level,ecnumber
FW830.00,48,01SV830

You can check the firmware version directly from the Hardware Management Console or in the ASMI:

fw1
fw3

After the firmware upgrade, verify that the Simplified Remote Restart capability is now set to true.

fw2

# lssyscfg -r sys -F name,powervm_lpar_simplified_remote_restart_capable
p720-1,0
p814-1,1
p720-2,0
p814-2,1

Prerequisites

These prerequisites are true ONLY for Scale out systems:

  • To update to the firmware SV830_048 you need the latest Hardware Management Console release which is v8r8.3.0 plus MH01514 PTF.
  • Obviously on Scale out system SV830_048 is the minimum firmware requirement.
  • Minimum level of Virtual I/O Servers is 2.2.3.4 (for both source and destination systems).
  • PowerVM Enterprise Edition (to be confirmed).

Enabling simplified remote restart of an existing partition

You probably want to enable simplified remote restart after an LPM migration/evacuation. After migrating your virtual machine(s) to a Power8 with the Simplified Remote Restart capability, you have to enable this capability on each virtual machine. This can only be done when the machine is shut down, so you first have to stop the virtual machines (after the live partition mobility move) if you want to enable SRR; it can’t be done without rebooting the virtual machine:

  • List current partition running on the system and check which one are “simplified remote restart capable” (here only one is simplified remote restart capable):
  • # lssyscfg -r lpar -m p814-1 -F name,simplified_remote_restart_capable
    vios1,0
    vios2,0
    lpar1,1
    lpar2,0
    lpar3,0
    lpar4,0
    lpar5,0
    lpar6,0
    lpar7,0
    
  • For each lpar that is not simplified remote restart capable, change the simplified_remote_restart_capable attribute using the chsyscfg command. Please note that you can’t do this using the Hardware Management Console GUI: in the latest v8r8.3.0, when enabling it through the GUI, the console tells you that you need a reserved device storage, which is needed by the Remote Restart capability but not by the simplified version. You have to use the command line! (check the screenshots below)
  • You can’t change this attribute while the machine is running:
  • gui_change_to_srr

  • You can’t do it with the GUI after the machine is shutdown:
  • gui_change_to_srr2
    gui_change_to_srr3

  • The only way to enable this attribute is to do it by using the Hardware Management Console command line (please note in the output below that running lpar cannot be changed):
  • # for i in lpar2 lpar3 lpar4 lpar5 lpar6 lpar7 ; do chsyscfg -r lpar -m p814-1 -i "name=$i,simplified_remote_restart_capable=1" ; done
    An error occurred while changing the partition named lpar6.
    HSCLA9F8 The remote restart capability of the partition can only be changed when the partition is shutdown.
    An error occurred while changing the partition named lpar7.
    HSCLA9F8 The remote restart capability of the partition can only be changed when the partition is shutdown.
    # lssyscfg -r lpar -m p814-1 -F name,simplified_remote_restart_capable,lpar_env | grep -v vioserver
    lpar1,1,aixlinux
    lpar2,1,aixlinux
    lpar3,1,aixlinux
    lpar4,1,aixlinux
    lpar5,1,aixlinux
    lpar6,0,aixlinux
    lpar7,0,aixlinux
    
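Since this has to be done on every migrated lpar on every new host, the command list can be generated rather than typed. Here is a minimal Python sketch under my own assumptions (the function name and the naive name-based VIOS filter are mine) that turns the lssyscfg output into the chsyscfg commands to run:

```python
def srr_enable_commands(lssyscfg_output, managed_system):
    """Build a chsyscfg command for every lpar that is still not SRR capable.
    Input is `lssyscfg -r lpar -m <sys> -F name,simplified_remote_restart_capable`."""
    cmds = []
    for line in lssyscfg_output.strip().splitlines():
        name, capable = line.split(",")
        if capable == "0" and not name.startswith("vios"):  # naive VIOS filter
            cmds.append(
                'chsyscfg -r lpar -m %s -i "name=%s,simplified_remote_restart_capable=1"'
                % (managed_system, name)
            )
    return cmds

out = "vios1,0\nvios2,0\nlpar1,1\nlpar2,0\nlpar3,0"
cmds = srr_enable_commands(out, "p814-1")
for c in cmds:
    print(c)
```

Remember that each lpar must be shut down for its chsyscfg command to succeed, as shown by the HSCLA9F8 errors above.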

Remote restarting

If you try to do a live partition mobility operation back to a P7 or a P8 box without the simplified remote restart capability, it will not be possible: enabling simplified remote restart forces the virtual machine to stay on P8 boxes with the simplified remote restart capability. This is one of the reasons why most customers are not doing it:

# migrlpar -o v -m p814-1 -t p720-1 -p lpar2
Errors:
HSCLB909 This operation is not allowed because managed system p720-1 does not support PowerVM Simplified Partition Remote Restart.

lpm_not_capable_anymore

On the Hardware Management Console you can see that the virtual machine is simplified remote restart capable by checking its properties:

gui_change_to_srr4

You can now try to remote restart your virtual machines to another server. The status of the source server has to be different from Operating (Power Off, Error, Error – Dump in progress, Initializing). As always, my advice is to validate before restarting:

# rrstartlpar -o validate -m p814-1 -t p814-2 -p lpar1
# echo $?
0
# rrstartlpar -o restart -m p814-1 -t p814-2 -p lpar1
HSCLA9CE The managed system is not in a valid state to support partition remote restart operations.
# lssyscfg -r sys -F name,state
p814-2,Operating
p814-1,Power Off
# rrstartlpar -o restart -m p814-1 -t p814-2 -p lpar1

After a remote restart operation the machine boots automatically. You can check in the errpt that in most cases the partition ID has changed (proving that you are on another machine):

# errpt | more
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
A6DF45AA   0618170615 I O RMCdaemon      The daemon is started.
1BA7DF4E   0618170615 P S SRC            SOFTWARE PROGRAM ERROR
CB4A951F   0618170615 I S SRC            SOFTWARE PROGRAM ERROR
CB4A951F   0618170615 I S SRC            SOFTWARE PROGRAM ERROR
D872C399   0618170615 I O sys0           Partition ID changed and devices recreat

Be very careful with the ghostdev sys0 attribute. Every remote restarted VM needs to have ghostdev set to 0 to avoid an ODM wipe (if you remote restart an lpar with ghostdev set to 1 you will lose all your ODM customizations):

# lsattr -El sys0 -a ghostdev
ghostdev 0 Recreate ODM devices on system change / modify PVID True

When the source machine is up and running again, you have to clean the old definition of the remote restarted lpar by launching a cleanup operation. This will wipe the old lpar definition:

# rrstartlpar -o cleanup -m p814-1 -p lpar1

The RRmonitor (modified version)

There is a script delivered by IBM called rrMonitor. It watches the Power System’s state and, if the system is in a particular state, restarts one specific virtual machine. This script is just not usable as-is because it has to be executed directly on the HMC (you need a pesh password to put the script on the HMC) and it only checks one particular virtual machine. I modified this script to ssh to the HMC and to check every lpar on the machine, not just one in particular. You can download my modified version here: rrMonitor. Here is what the script does:

  • Check the state of the source machine.
  • If this state is not “Operating”, search for every remote-restartable lpar on the machine.
  • Launch remote restart operations to restart all these partitions on the target machine.
  • Tell the user the command to run to clean up the old lpar definitions once the source machine is running again.
# ./rrMonitor p814-1 p814-2 all 60 myhmc
Getting remote restartable lpars
lpar1 is rr simplified capable
lpar1 rr status is Remote Restartable
lpar2 is rr simplified capable
lpar2 rr status is Remote Restartable
lpar3 is rr simplified capable
lpar3 rr status is Remote Restartable
lpar4 is rr simplified capable
lpar4 rr status is Remote Restartable
Checking for source server state....
Source server state is Operating
Checking for source server state....
Source server state is Operating
Checking for source server state....
Source server state is Power Off In Progress
Checking for source server state....
Source server state is Power Off
It's time to remote restart
Remote restarting lpar1
Remote restarting lpar2
Remote restarting lpar3
Remote restarting lpar4
Thu Jun 18 20:20:40 CEST 2015
Source server p814-1 state is Power Off
Source server has crashed and hence attempting a remote restart of the partition lpar1 in the destination server p814-2
Thu Jun 18 20:23:12 CEST 2015
The remote restart operation was successful
The cleanup operation has to be executed on the source server once the server is back to operating state
The following command can be used to execute the cleanup operation,
rrstartlpar -m p814-1 -p lpar1 -o cleanup
Thu Jun 18 20:23:12 CEST 2015
Source server p814-1 state is Power Off
Source server has crashed and hence attempting a remote restart of the partition lpar2 in the destination server p814-2
Thu Jun 18 20:25:42 CEST 2015
The remote restart operation was successful
The cleanup operation has to be executed on the source server once the server is back to operating state
The following command can be used to execute the cleanup operation,
rrstartlpar -m p814-1 -p lpar2 -o cleanup
Thu Jun 18 20:25:42 CEST 2015
[..]
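For the curious, the core logic of the modified rrMonitor can be sketched in a few lines. This is a hypothetical Python outline, not the actual script (the `hmc()` helper and the ssh target are my assumptions): it polls the source server state through the HMC and, once the server is powered off, remote restarts every SRR-capable lpar:

```python
import subprocess
import time

HMC = "hscroot@myhmc"  # hypothetical ssh target for the HMC

def hmc(cmd):
    """Run an HMC command over ssh and return its stdout as text."""
    return subprocess.check_output(["ssh", HMC, cmd]).decode()

def restartable(lssyscfg_output):
    """Return the lpars flagged capable=1 from
    `lssyscfg -r lpar -m <sys> -F name,simplified_remote_restart_capable` output."""
    names = []
    for line in lssyscfg_output.strip().splitlines():
        name, capable = line.split(",")
        if capable == "1":
            names.append(name)
    return names

def monitor(source, target, interval=60):
    """Poll the source server; when it is Power Off, remote restart its lpars."""
    while True:
        state = hmc("lssyscfg -r sys -m %s -F state" % source).strip()
        print("Source server state is %s" % state)
        if state == "Power Off":
            lpars = hmc("lssyscfg -r lpar -m %s -F name,simplified_remote_restart_capable" % source)
            for name in restartable(lpars):
                print("Remote restarting %s" % name)
                hmc("rrstartlpar -o restart -m %s -t %s -p %s" % (source, target, name))
                print("cleanup later with: rrstartlpar -m %s -p %s -o cleanup" % (source, name))
            return
        time.sleep(interval)

# The parsing part can be checked without an HMC:
sample = "vios1,0\nvios2,0\nlpar1,1\nlpar2,0"
print(restartable(sample))  # ['lpar1']
```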

Conclusion

As you can see, the simplified version of the remote restart feature is simpler than the normal one. My advice is to create all your lpars with the simplified remote restart attribute. It’s that easy :). If you plan to LPM back to a P6 or P7 box, don’t use simplified remote restart. I think this feature will become more popular when all the old P7 and P6 boxes are replaced by P8s. As always, I hope it helps.

Here are a couple of link with great documentations about Simplified Remote Restart:

  • Simplified Remote Restart Whitepaper: here
  • Original rrMonitor: here
  • Materials about the latest HMC release and a couple of videos related to Simplified Remote Restart: here