Using Chef and cloud-init with PowerVC 1.2.2.2 | What’s new in version 1.2.2.2

I’ve been busy; very busy, and I apologize for that … almost two months since the last update on the blog, but I’m still alive and I love AIX more than ever ;-). There is no blog post about it but I’ve developed a tool called “lsseas” which can be useful to all PowerVM administrators (you can find the script on github at this address https://github.com/chmod666org/lsseas). I’ll not talk too much about it but I thought sharing the information with all my readers who are not following me on twitter was the best way to promote the tool. Have a look at it, submit your own changes on github, code and share!

This said, we can talk about this new blog post. PowerVC 1.2.2.2 was released a few months ago and there are a few things I wanted to talk about. The new version includes new features making the product more powerful than ever (export/import images, activation input, vscsi lun management). PowerVC only builds “empty” machines; it’s a good start but we can do better. The activation engine can customize the virtual machines but it is limited and in my humble opinion not really usable for post-installation tasks. With the recent release of cloud-init and Chef for AIX, PowerVC can be used to build your machines from nothing … and finally get your applications running in minutes. Using cloud-init and Chef can help you make your infrastructure repeatable, “versionable” and testable; this is what we call infrastructure as code and it is damn powerful.

A big thank you to Jay Kruemcke (@chromeaix), Philippe Hermes (@phhermes) and S.Tran (https://github.com/transt); they gave me very useful help about the cloud-init support on AIX. Follow them on twitter!

PowerVC 1.2.2.1 mandatory fixes

Before starting, please note that I strongly recommend having the latest ifixes installed on your Virtual I/O Servers. They are mandatory for PowerVC; install these ifixes no matter what:

  • On Virtual I/O Servers install IV66758m4c, rsctvios2:
  • # emgr -X -e /mnt/VIOS_2.2.3.4_IV66758m4c.150112.epkg.Z
    # emgr -l
    [..]
    ID  STATE LABEL      INSTALL TIME      UPDATED BY ABSTRACT
    === ===== ========== ================= ========== ======================================
    1    S    rsctvios2  03/03/15 12:13:42            RSCT fixes for VIOS
    2    S    IV66758m4c 03/03/15 12:16:04            Multiple PowerVC fixes VIOS 2.2.3.4
    3    S    IV67568s4a 03/03/15 14:12:45            man fails in VIOS shell
    [..]
    
  • Check that you have the latest version of the Hardware Management Console (I strongly recommend V8R8.2.0 Service Pack 1):
  • hscroot@myhmc:~> lshmc -V
    "version= Version: 8
     Release: 8.2.0
     Service Pack: 1
    HMC Build level 20150216.1
    ","base_version=V8R8.2.0
    "
    

Exporting and importing images from another PowerVC

The latest PowerVC version allows you to export and import images. It’s a good thing! Let’s say that, like me, you have a few PowerVC hosts on different SAN networks with different storage arrays; you probably do not want to create your images on each one, and you prefer to be sure to use the same image on each PowerVC. Just create one image and use the export/import feature to copy/move this image to a different storage array or PowerVC host:

  • To do so, map your current image disk to the PowerVC host itself (in my case by using the SVC). You can’t attach a volume used by an image directly from PowerVC, so you have to do it on the storage side by hand:
  • maptohost
    maptohost2

  • On the PowerVC host, rescan the volume and copy the whole newly discovered lun with dd:
  • powervc_source# rescan-scsi-bus.sh
    [..]
    powervc_source# multipath -ll
    mpathe (3600507680c810010f800000000000097) dm-10 IBM,2145
    [..]
    powervc_source# dd if=/dev/mapper/mpathe of=/data/download/aix7100-03-04-cloudinit-chef-ohai bs=4M
    16384+0 records in
    16384+0 records out
    68719476736 bytes (69 GB) copied, 314.429 s, 219 MB/s                                         
    
  • Map a new volume to the new PowerVC server, upload the newly created file to the new PowerVC server, then dd the file back to the new volume:
  • mapnewlun

    powervc_dest# scp /data/download/aix7100-03-04-cloudinit-chef-ohai new_powervc:/data/download
    aix7100-03-04-cloudinit-chef-ohai          100%   64GB  25.7MB/s   42:28.
    powervc_dest# dd if=/data/download/aix7100-03-04-cloudinit-chef-ohai of=/dev/mapper/mpathc bs=4M
    16384+0 records in
    16384+0 records out
    68719476736 bytes (69 GB) copied, 159.028 s, 432 MB/s
    
  • Unmap the volume from the new PowerVC after the dd operation, and import it with the PowerVC graphical interface.
  • Manage the existing volume you just created (note that the current PowerVC code does not allow you to choose cloud-init as an activation engine even though it works great):
  • manage_ex1
    manage_ex2

  • Import the image:
  • import1
    import2
    import3
    import4

You can also use the command powervc-volume-image-import to import the new volume by using the command line instead of the graphical user interface. Here is an example with a Red Hat Enterprise Linux 6.4 image:

powervc_source# dd if=/dev/hdisk4 of=/apps/images/rhel-6.4.raw bs=4M
15360+0 records in
15360+0 records out
powervc_dest# scp 10.255.248.38:/apps/images/rhel-6.4.raw .
powervc_dest# dd if=/home/rhel-6.4.raw of=/dev/mapper/mpathe
30720+0 records in
30720+0 records out
64424509440 bytes (64 GB) copied, 124.799 s, 516 MB/s
powervc_dest# powervc-volume-image-import --name rhel64 --os rhel --volume volume_capture2 --activation-type ae
Password:
Image creation complete for image id: e3a4ece1-c0cd-4d44-b197-4bbbc2984a34
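
Since the image travels through a dd copy, an scp transfer and another dd, it can be worth verifying that nothing was corrupted along the way before importing it. A minimal sketch using the file names from the example above (md5sum is available on most Linux-based PowerVC hosts; cksum can be used as an alternative):

powervc_source# md5sum /data/download/aix7100-03-04-cloudinit-chef-ohai
powervc_dest#   md5sum /data/download/aix7100-03-04-cloudinit-chef-ohai
# both checksums must match before dd'ing the file to the new volume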

Activation input (cloud-init and ae)

Instead of doing post-installation tasks by hand after the deployment of a machine, you can now use the activation input recently added to PowerVC. The activation input can be used to run any script you want, or even better things (such as cloud-config syntax) if you are using cloud-init instead of the old activation engine. Keep in mind that cloud-init is not yet officially supported by PowerVC; for this reason I think most customers will still use the old activation engine. The latest activation engine version also works with the activation input. In the examples below I am of course using cloud-init :-). Don’t worry, I’ll detail later in this post how to install and use cloud-init on AIX:

  • If you are using the activation engine please be sure to use the latest version. The current version of the activation engine in PowerVC 1.2.2.* is vmc-vsae-ext-2.4.5-1; the only way to be sure you are using this version is to check the size of /opt/ibm/ae/AS/vmc-sys-net/activate.py. The size of this file is 21127 bytes for the latest version. Check this before trying to do anything with the activation input. More information can be found here: Activation input documentation.
  • A simple shebang script can be used; in the example below it just writes a file, but it can be anything you want:
  • ai1

    # cat /tmp/activation_input
    Activation input was used on this server
    
  • If you are using cloud-init you can directly put a cloud-config “script” in the activation input. The first line is always mandatory to tell cloud-init the format of the activation input. If you forget to put this first line the format cannot be determined and the script will not be executed (a minimal cloud-config sketch is shown right after this list). Check the next point for more information about activation input formats:
  • ai2

    # cat /tmp/activation_input
    cloud-config activation input
    
  • There are additional fields called “server meta data key/value pairs”; just do not use them. They are used by images provided by IBM with customizations of the activation engine. Forget about them, they are useless; use this field only if IBM tells you to do so.
  • Valid cloud-init activation input formats can be found here: http://cloudinit.readthedocs.org/en/latest/topics/format.html. As you can see in the two examples above, shell scripts and the cloud-config format can be used, but you can also upload a gzip archive or use the part-handler format. Go to the url above for more information.
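
To make the mandatory first line more concrete, here is a minimal cloud-config activation input sketch (the file path and message are illustrative only); the "#cloud-config" header on the first line is what tells cloud-init which format it is dealing with:

#cloud-config
write_files:
  - path: /tmp/activation_input
    content: |
      cloud-config activation input
    permissions: '0644'
runcmd:
  - echo "any post-deployment command can be run here"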

vscsi and mixed NPIV/vscsi machine creation

This is one of the major enhancements: PowerVC is now able to create and map vscsi disks, and even better you can create mixed NPIV/vscsi machines. To do so, create storage connectivity groups for each technology you want to use. You can choose a different way to create disks for boot volumes and for data volumes. Here are three examples: full NPIV, full vscsi, and a mix of vscsi (boot) and NPIV (data):

connectivitygroup1
connectivitygroup2
connectivitygroup3

What is really cool about this new feature is that PowerVC can use luns already mapped to the Virtual I/O Server. Please note that PowerVC will only use SAN-backed devices and cannot use iSCSI or local disks (local disks can be used in the Express version). You obviously have to do the zoning of your Virtual I/O Server by yourself. Here is an example where I have 69 devices mapped to my Virtual I/O Server; you can see that PowerVC is using one of the existing devices for its deployment. This can be very useful if you have different teams working on the SAN and the system side: the storage guys will not have to change their habits and can still map you a bunch of luns on the Virtual I/O Server. This can be used as a transition if you did not succeed in convincing the guys from your storage team:

$ lspv | wc -l
      69

connectivitygroup_deploy1

$ lspv | wc -l
      69
$ lsmap -all -fmt :
vhost1:U8202.E4D.845B2DV-V2-C28:0x00000009:vtopt0:Available:0x8100000000000000:/var/vio/VMLibrary/vopt_c1309be1ed244a5c91829e1a5dfd281c: :N/A:vtscsi1:Available:0x8200000000000000:hdisk66:U78AA.001.WZSKM6P-P1-C3-T1-W500507680C11021F-L41000000000000:false

Please note that you still need to add fabrics and storage on PowerVC even if you have pre-mapped luns on your Virtual I/O Servers. This is mandatory for PowerVC image management and creation.

Maintenance Mode

This last feature is probably the one I like the most. You can now put a host in maintenance mode; this means that all the virtual machines hosted on it are migrated with live partition mobility (remember the migrlpar --all option, I’m pretty sure this option is used for the PowerVC maintenance mode). By putting a host in maintenance mode it is no longer available for new machine deployments or for mobility operations. The host can then be shut down, for instance for a firmware upgrade.

  • Select a host and click the “Enter maintenance mode button”:
  • maintenance1

  • Choose where you want to move the virtual machines, or let PowerVC decide for you (packing or striping placement policy):
  • maintenance2

  • The host is entering maintenance mode:
  • maintenance3

  • Once the host is in maintenance mode it is ready to be shut down:
  • maintenance4

  • Leave the maintenance mode when you are ready:
  • maintenance5

An overview of Chef and cloud-init

With PowerVC you are now able to deploy new AIX virtual machines in a few minutes, but there is still some work to do. What about post-installation tasks? I’m sure that most of you are using NIM post-install scripts for these. PowerVC does not use NIM, and even if you can run your own shell scripts after a PowerVC deployment, the goal of this tool is to automate a full installation … post-install included.

While the activation engine does the job of changing the hostname and ip address of the machine, it is pretty hard to customize it to do other tasks. Documentation is hard to find and I can assure you that it is not easy at all to customize and maintain. PowerVC Linux users are probably already aware of cloud-init. cloud-init is a tool (like the activation engine) in charge of the reconfiguration of your machine after its deployment; like the activation engine does today, cloud-init changes the hostname and the ip address of the machine, but it can do way more than that (create users, add ssh keys, mount filesystems, …). The good news is that cloud-init has been available on AIX for a few days now, and you can use it with PowerVC. Awesome \o/.

Even if cloud-init can do one part of this job, it can’t do it all and is not designed for that! It is not a configuration management tool: configurations are not centralized on a server, there is no way to create cookbooks or runbooks (or whatever you call them), you can’t pull product sources from a git server; there are a lot of things missing. cloud-init is a light tool designed for a simple job. I recently (at work and in my spare time) played a lot with configuration management tools. I’m a huge fan of Saltstack but unfortunately salt-minion (the Saltstack client) is not available on AIX … I had to find another tool. A few months ago Chef (by Opscode) announced support for AIX and a release of chef-client for AIX, so I decided to give it a try and I can assure you that it is damn powerful; let me explain this further.

Instead of creating shell scripts for your post-installation, Chef lets you create cookbooks. Cookbooks are composed of recipes and each recipe does one task, for instance install an Oracle client, create the home directory for the root user and its profile file, or enable and disable services on the system. Recipes are written in the Chef language, and you can directly put Ruby code inside a recipe. Chef recipes are idempotent: if something has already been done, it will not be done again. The advantage of using a solution like this is that you don’t have to maintain shell code and shell scripts which are difficult to change or rewrite. Your infrastructure is repeatable and changeable in minutes (once Chef is installed you can for instance tell it to change /etc/resolv.conf on all your Websphere servers). This is called “infrastructure as code”. Give it a try and you’ll see that the first thing you’ll think will be “waaaaaaaaaaaaaooooooooooo”.

Trying to explain how PowerVC, cloud-init and Chef can work together is not really easy, a nice diagram is probably better than a long text:

chef

  1. You have built an AIX virtual machine. On this machine cloud-init is installed and Chef client 12 is installed. cloud-init is configured to register the chef-client on the chef-server and to run a cookbook for a specific role. This server has been captured with PowerVC and is now ready to be deployed.
  2. Virtual machines are created with PowerVC.
  3. When a machine is built, cloud-init runs on first boot. The ip address and the hostname of the machine are changed to the values provided by PowerVC. cloud-init creates the chef-client configuration (client.rb, validation.pem). Finally chef-client is called.
  4. chef-client registers on the chef-server. The machine is now known by the chef-server.
  5. chef-client resolves and downloads the cookbooks for a specific role. Cookbooks and recipes are executed on the machine. After cookbook execution the machine is ready and configured.
  6. The administrator creates and uploads cookbooks and recipes from his knife workstation (knife is the tool used to interact with the chef-server; it can be hosted anywhere you want: your laptop, a server …).

In a few steps, here is what you need to do to use PowerVC, cloud-init, and Chef together:

  1. Create a virtual machine with PowerVC.
  2. Download cloud-init, and install cloud-init in this virtual machine.
  3. Download chef-client, and install chef-client in this virtual machine.
  4. Configure cloud-init: modify /opt/freeware/etc/cloud/cloud.cfg. In this file put the Chef configuration of the cc_chef cloud-init module.
  5. Create the mandatory files: the /etc/chef directory, and put your ohai plugins in the /etc/chef/ohai_plugins directory.
  6. Stop the virtual machine.
  7. Capture the virtual machine with PowerVC.
  8. Obviously, as a prerequisite, a chef-server is up and running, and the cookbooks, recipes, roles and environments are ready on this chef-server.

cloud-init installation

cloud-init is now available on AIX, but you have to build the rpm by yourself. Sources can be found on github at this address: https://github.com/transt/cloud-init-0.7.5. There are a lot of prerequisites; most of them can be found on the github page, some of them on the famous perzl site. Download and install these prerequisites, it is mandatory (links to the prerequisites are on the github page; the zip file containing cloud-init can be downloaded here: https://github.com/transt/cloud-init-0.7.5/archive/master.zip):

# rpm -ivh --nodeps gettext-0.17-8.aix6.1.ppc.rpm
[..]
gettext                     ##################################################
# for rpm in bzip2-1.0.6-2.aix6.1.ppc.rpm db-4.8.24-4.aix6.1.ppc.rpm expat-2.1.0-1.aix6.1.ppc.rpm gmp-5.1.3-1.aix6.1.ppc.rpm libffi-3.0.11-1.aix6.1.ppc.rpm openssl-1.0.1g-1.aix6.1.ppc.rpm zlib-1.2.5-6.aix6.1.ppc.rpm gdbm-1.10-1.aix6.1.ppc.rpm libiconv-1.14-1.aix6.1.ppc.rpm bash-4.2-9.aix6.1.ppc.rpm info-5.0-2.aix6.1.ppc.rpm readline-6.2-3.aix6.1.ppc.rpm ncurses-5.9-3.aix6.1.ppc.rpm sqlite-3.7.15.2-2.aix6.1.ppc.rpm python-2.7.6-1.aix6.1.ppc.rpm python-2.7.6-1.aix6.1.ppc.rpm python-devel-2.7.6-1.aix6.1.ppc.rpm python-xml-0.8.4-1.aix6.1.ppc.rpm python-boto-2.34.0-1.aix6.1.noarch.rpm python-argparse-1.2.1-1.aix6.1.noarch.rpm python-cheetah-2.4.4-2.aix6.1.ppc.rpm python-configobj-5.0.5-1.aix6.1.noarch.rpm python-jsonpointer-1.0.c1ec3df-1.aix6.1.noarch.rpm python-jsonpatch-1.8-1.aix6.1.noarch.rpm python-oauth-1.0.1-1.aix6.1.noarch.rpm python-pyserial-2.7-1.aix6.1.ppc.rpm python-prettytable-0.7.2-1.aix6.1.noarch.rpm python-requests-2.4.3-1.aix6.1.noarch.rpm libyaml-0.1.4-1.aix6.1.ppc.rpm python-setuptools-0.9.8-2.aix6.1.noarch.rpm fdupes-1.51-1.aix5.1.ppc.rpm ; do rpm -ivh $rpm ;done
[..]
python-oauth                ##################################################
python-pyserial             ##################################################
python-prettytable          ##################################################
python-requests             ##################################################
libyaml                     ##################################################

Build the rpm by following the commands below. You can reuse this rpm on every AIX system on which you want to install the cloud-init package:

# jar -xvf cloud-init-0.7.5-master.zip
inflated: cloud-init-0.7.5-master/upstart/cloud-log-shutdown.conf
# mv cloud-init-0.7.5-master  cloud-init-0.7.5
# chmod -Rf +x cloud-init-0.7.5/bin
# chmod -Rf +x cloud-init-0.7.5/tools
# cp cloud-init-0.7.5/packages/aix/cloud-init.spec.in /opt/freeware/src/packages/SPECS/cloud-init.spec
# tar -cvf cloud-init-0.7.5.tar cloud-init-0.7.5
[..]
a cloud-init-0.7.5/upstart/cloud-init.conf 1 blocks
a cloud-init-0.7.5/upstart/cloud-log-shutdown.conf 2 blocks
# gzip cloud-init-0.7.5.tar
# cp cloud-init-0.7.5.tar.gz /opt/freeware/src/packages/SOURCES/cloud-init-0.7.5.tar.gz
# rpm -v -bb /opt/freeware/src/packages/SPECS/cloud-init.spec
[..]
Requires: cloud-init = 0.7.5
Wrote: /opt/freeware/src/packages/RPMS/ppc/cloud-init-0.7.5-4.1.aix7.1.ppc.rpm
Wrote: /opt/freeware/src/packages/RPMS/ppc/cloud-init-doc-0.7.5-4.1.aix7.1.ppc.rpm
Wrote: /opt/freeware/src/packages/RPMS/ppc/cloud-init-test-0.7.5-4.1.aix7.1.ppc.rpm

Finally install the rpm:

# rpm -ivh /opt/freeware/src/packages/RPMS/ppc/cloud-init-0.7.5-4.1.aix7.1.ppc.rpm
cloud-init                  ##################################################
# rpm -qa | grep cloud-init
cloud-init-0.7.5-4.1

cloud-init configuration

Installing the cloud-init package on AIX adds some entries to /etc/rc.d/rc2.d:

ls -l /etc/rc.d/rc2.d | grep cloud
lrwxrwxrwx    1 root     system           33 Apr 26 15:13 S01cloud-init-local -> /etc/rc.d/init.d/cloud-init-local
lrwxrwxrwx    1 root     system           27 Apr 26 15:13 S02cloud-init -> /etc/rc.d/init.d/cloud-init
lrwxrwxrwx    1 root     system           29 Apr 26 15:13 S03cloud-config -> /etc/rc.d/init.d/cloud-config
lrwxrwxrwx    1 root     system           28 Apr 26 15:13 S04cloud-final -> /etc/rc.d/init.d/cloud-final

The default configuration file is located in /opt/freeware/etc/cloud/cloud.cfg; this configuration file is split into three parts. The first one, called cloud_init_modules, tells cloud-init to run specific modules when the cloud-init script is started at boot time, for instance setting the hostname of the machine (set_hostname), resetting RMC (reset_rmc) and so on. In our case this part will automatically change the hostname and the ip address of the machine to the values provided by PowerVC at deployment time. This cloud_init_modules part is split in two, the local one and the normal one. The local one uses information provided by the cdrom built by PowerVC at deployment time; this cdrom provides the ip and hostname of the machine, the activation input script, and nameserver information. The datasource_list stanza tells cloud-init to use the “ConfigDrive” (in our case the virtual cdrom) to get the ip and hostname needed by some cloud_init_modules. The second part, called cloud_config_modules, tells cloud-init to run specific modules when the cloud-config script is called; at this stage the minimal requirements have already been configured by the previous cloud_init_modules stage (dns, ip address and hostname are ok). We will configure and set up the chef-client in this stage. The last part, called cloud_final_modules, tells cloud-init to run specific modules when the cloud-final script is called. At this step you can print a final message, reboot the host and so on (in my case a host reboot is needed by my install_sddpcm Chef recipe). Here is an overview of the cloud.cfg configuration file:

cloud-init

  • The datasource_list stanza tells cloud-init to use the virtual cdrom as a source of information:
  • datasource_list: ['ConfigDrive']
    
  • cloud_init_module:
  • cloud_init_modules:
    [..]
     - set-multipath-hcheck-interval
     - update-bootlist
     - reset-rmc
     - set_hostname
     - update_hostname
     - update_etc_host
    
  • cloud_config_module:
  • cloud_config_modules:
    [..]
      - mounts
      - chef
      - runcmd
    
  • cloud_final_module:
  • cloud_final_modules:
      [..]
      - final-message
    

If you do not want to use Chef at all you can modify the cloud.cfg file to fit your needs (running homemade scripts, mounting filesystems …), but my goal here is to do the job with Chef. We will try to do the minimal job with cloud-init, so the goal is to configure cloud-init to configure the chef-client. Anyway, I also wanted to play with cloud-init and see its capabilities. The full documentation of cloud-init can be found here: https://cloudinit.readthedocs.org/en/latest/. Here are a few things I added (the Chef part will be detailed later), but keep in mind you can just use cloud-init without Chef if you want (set up your ssh keys, mount or create filesystems, create files and so on):

write_files:
  - path: /tmp/cloud-init-started
    content: |
      cloud-init was started on this server
    permissions: '0755'
  - path: /var/log/cloud-init-sub.log
    content: |
      starting chef logging
    permissions: '0755'

final_message: "The system is up, cloud-init is finished"

EDIT: The IBM developer of cloud-init for AIX sent me a mail yesterday about the new support of cc_power_state. As I need to reboot my host at the end of the build, with the latest version of cloud-init for AIX I can use the power_state stanza. I use poweroff here as an example; use reboot … for a reboot:

power_state:
 delay: "+5"
 mode: poweroff
 message: cloud-init mandatory reboot for sddpcm
 timeout: 5

power_state1

Rerun cloud-init for testing purposes

You probably want to test your cloud-init configuration before or after capturing the machine. When cloud-init is launched by the startup script, a check is performed to be sure that cloud-init has not already been run. Some “semaphore” files are created in /opt/freeware/var/lib/cloud/instance/sem to record that modules have already been executed. If you want to re-run cloud-init by hand without having to rebuild a machine, just remove the files in this directory:

# rm -rf /opt/freeware/var/lib/cloud/instance/sem

Let’s say we just want to re-run the Chef part:

# rm /opt/freeware/var/lib/cloud/instance/sem/config_chef
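
Once the relevant semaphore file has been removed you can re-run the corresponding stage by hand instead of rebooting. A possible way to do it, assuming the AIX rc scripts shipped with the package (the ones listed in /etc/rc.d/rc2.d above) accept the usual start argument, is to call them directly:

# /etc/rc.d/init.d/cloud-config start
# /etc/rc.d/init.d/cloud-final start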

To sum up here is what I want to do with cloud-init:

  1. Use the cdrom as datasource.
  2. Set the hostname and ip.
  3. Setup my chef-client.
  4. Print a final message.
  5. Do a mandatory reboot at the end of the installation.

chef-client installation and configuration

Before modifying the cloud.cfg file to tell cloud-init to set up the Chef client, we first have to download and install the chef-client on the AIX host we will capture later. Download the Chef client bff file at this address: https://opscode-omnibus-packages.s3.amazonaws.com/aix/6.1/powerpc/chef-12.1.2-1.powerpc.bff and install it:

# installp -aXYgd . chef
[..]
+-----------------------------------------------------------------------------+
                         Installing Software...
+-----------------------------------------------------------------------------+

installp: APPLYING software for:
        chef 12.1.2.1
[..]
Installation Summary
--------------------
Name                        Level           Part        Event       Result
-------------------------------------------------------------------------------
chef                        12.1.2.1        USR         APPLY       SUCCESS
chef                        12.1.2.1        ROOT        APPLY       SUCCESS
# lslpp -l | grep -i chef
  chef                      12.1.2.1    C     F    The full stack of chef
# which chef-client
/usr/bin/chef-client

The chef-client configuration files created by cloud-init will be placed in the /etc/chef directory. By default the /etc/chef directory does not exist, so you'll have to create it:

# mkdir -p /etc/chef
# mkdir -p /etc/chef/ohai_plugins

If - like me - you are using custom ohai plugins, you have two things to do. cloud-init uses template files to build the configuration files needed by Chef. These template files are located in /opt/freeware/etc/cloud/templates. Modify the chef_client.rb.tmpl file to add a configuration line for the ohai plugin_path, and copy your ohai plugins into /etc/chef/ohai_plugins:

# tail -1 /opt/freeware/etc/cloud/templates/chef_client.rb.tmpl
Ohai::Config[:plugin_path] << '/etc/chef/ohai_plugins'
# ls /etc/chef/ohai_plugins
aixcustom.rb

Add the chef stanza to /opt/freeware/etc/cloud/cloud.cfg. After this step the image is ready to be captured (check the ohai plugin configuration if you need one); the chef-client is already installed. Set the force_install stanza to false, set the server_url, the validation_name of your Chef server and the organization, and finally put the validation RSA private key provided by your Chef server (in the example below the key has been truncated on purpose; server_url and validation_name have also been replaced). As you can see below, I tell Chef to run the aix7 role; we'll see later how to create cookbooks, recipes and roles:

chef:
  force_install: false
  server_url: "https://chefserver.lab.chmod666.org/organizations/chmod666"
  validation_name: "chmod666-validator"
  validation_key: |
    -----BEGIN RSA PRIVATE KEY-----
    MIIEpQIBAAKCAQEApj/Qqb+zppWZP+G3e/OA/2FXukNXskV8Z7ygEI9027XC3Jg8
    [..]
    XCEHzpaBXQbQyLshS4wAIVGxnPtyqXkdDIN5bJwIgLaMTLRSTtjH/WY=
    -----END RSA PRIVATE KEY-----
  run_list:
    - "role[aix7]"

runcmd:
  - /usr/bin/chef-client

EDIT: With the latest build of cloud-init for AIX there is no need to run chef-client with the runcmd stanza. Just add exec: 1 in the chef stanza.
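
In that case the chef stanza would look something like this (a minimal sketch reusing the same server_url, validation_name and run_list as above; the runcmd stanza is then no longer needed):

chef:
  force_install: false
  exec: 1
  server_url: "https://chefserver.lab.chmod666.org/organizations/chmod666"
  validation_name: "chmod666-validator"
  validation_key: |
    -----BEGIN RSA PRIVATE KEY-----
    [..]
    -----END RSA PRIVATE KEY-----
  run_list:
    - "role[aix7]"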

To sum up: cloud-init is installed and configured to run a few actions at boot time, but mainly to configure the chef-client and run it with a specific role. The chef-client is installed. The machine can now be shut down and is ready to be deployed. At deployment time cloud-init will do the job of changing the ip address and hostname and configuring Chef. Chef will retrieve the cookbooks and recipes and run them on the machine.

If you want to use custom ohai plugins read the ohai part before capturing your machine.

capture
capture2

Use chef-solo for testing

You will have to create your own recipes. My advice is to use chef-solo to debug them. The chef-solo binary is provided with the chef-client package and can be used without a Chef server to run and execute Chef recipes:

  • Create a test recipe:
  • # mkdir -p ~/chef/cookbooks/testing/recipes
    # cat  ~/chef/cookbooks/testing/recipes/test.rb
    file "/tmp/helloworld.txt" do
      owner "root"
      group "system"
      mode "0755"
      action :create
      content "Hello world !"
    end
    
  • Create a run_list with your test recipe:
  • # cat ~/chef/node.json
    {
      "run_list": [ "recipe[testing::test]" ]
    }
    
  • Create attribute file for chef-solo execution:
  • # cat  ~/chef/solo.rb
    file_cache_path "/root/chef"
    cookbook_path "/root/chef/cookbooks"
    json_attribs "/root/chef/node.json"
    
  • Run chef-solo:
  • # chef-solo -c /root/chef/solo.rb
    

chef-solo

cookbooks and recipes example on AIX

Let's say you have written all your recipes using chef-solo on a test server. On the Chef server you now want to put all these recipes in a cookbook. From the workstation, create a cookbook:

# knife cookbook create aix7
** Creating cookbook aix7 in /home/kadmin/.chef/cookbooks
** Creating README for cookbook: aix7
** Creating CHANGELOG for cookbook: aix7
** Creating metadata for cookbook: aix7

In the .chef directory you can now find a directory for the aix7 cookbook. In it you will find a directory for each type of Chef object: recipes, templates, files, and so on. This place is called the chef-repo. I strongly recommend turning this place into a git repository (by doing this you keep track of every modification of any object in the cookbook; see the example after the listing below).

# ls /home/kadmin/.chef/cookbooks/aix7/recipes
create_fs_rootvg.rb  create_profile_root.rb  create_user_group.rb  delete_group.rb  delete_user.rb  dns.rb  install_sddpcm.rb  install_ssh.rb  ntp.rb  ohai_custom.rb  test_ohai.rb
# ls /home/kadmin/.chef/cookbooks/aix7/templates/default
aixcustom.rb.erb  ntp.conf.erb  ohai_test.erb  resolv.conf.erb
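
For example, turning the chef-repo into a git repository is a one-time operation (the path is the one used on my workstation, adapt it to yours):

# cd /home/kadmin/.chef
# git init
# git add cookbooks
# git commit -m "initial import of the aix7 cookbook"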

Recipes

Here are a few examples of my own recipes:

  • install_ssh: the recipe mounts an nfs filesystem (from the nim server). The nim_server is an attribute coming from the role default attributes (we will check that later), and the oslevel is an ohai attribute coming from a custom ohai plugin (we will check that later too). The openssh.license and openssh.base filesets are installed, the filesystem is unmounted, and finally the ssh service is started:
  • # creating temporary directory
    directory "/var/mnttmp" do
      action :create
    end
    # mounting nim server
    mount "/var/mnttmp" do
      device "#{node[:nim_server]}:/export/nim/lppsource/#{node['aixcustom']['oslevel']}"
      fstype "nfs"
      action :mount
    end
    # installing ssh packages (openssh.license, openssh.base)
    bff_package "openssh.license" do
      source "/var/mnttmp"
      action :install
    end
    bff_package "openssh.base" do
      source "/var/mnttmp"
      action :install
    end
    # umount the /var/mnttmp directory
    mount "/var/mnttmp" do
      fstype "nfs"
      action :umount
    end
    # deleting temporary directory
    directory "/var/mnttmp" do
      action :delete
    end
    # start and enable ssh service
    service "sshd" do
      action :start
    end
    
  • install_sddpcm: the recipe mounts an nfs filesystem (from the nim server). The nim_server is an attribute coming from the role default attributes (we will check that later), and the platform_version comes from ohai. The devices.fcp.disk.ibm.mpio and devices.sddpcm.71.rte filesets are installed, then the filesystem is unmounted:
  • # creating temporary directory
    directory "/var/mnttmp" do
      action :create
    end
    # mounting nim server
    mount "/var/mnttmp" do
      device "#{node[:nim_server]}:/export/nim/lpp_source/#{node['platform_version']}/sddpcm-71-2660"
      fstype "nfs"
      action :mount
    end
    # installing sddpcm packages (devices.fcp.disk.ibm.mpio, devices.sddpcm.71.rte)
    bff_package "devices.fcp.disk.ibm.mpio" do
      source "/var/mnttmp"
      action :install
    end
    bff_package "devices.sddpcm.71.rte" do
      source "/var/mnttmp"
      action :install
    end
    # umount the /var/mnttmp directory
    mount "/var/mnttmp" do
      fstype "nfs"
      action :umount
    end
    # deleting temporary directory
    directory "/var/mnttmp" do
      action :delete
    end
    
  • create_fs_rootvg: some filesystems are extended, and an /apps filesystem is created and mounted. Please note that there is no cookbook for the AIX lvm for the moment, so you have to use the execute resource here, which is the only one that is not idempotent:
  • execute "hd3" do
      command "chfs -a size=1024M /tmp"
    end
    execute "hd9var" do
      command "chfs -a size=512M /var"
    end
    execute "/apps" do
      command "crfs -v jfs2 -g rootvg -m /apps -Ay -a size=1M ; chlv -n appslv fslv00"
      not_if { ::File.exists?("/dev/appslv")}
    end
    mount "/apps" do
      device "/dev/appslv"
      fstype "jfs2"
    end
    
  • ntp, ntp.conf.erb located in the template directory is copied to /etc/ntp.conf:
  • template "/etc/ntp.conf" do
      source "ntp.conf.erb"
    end
    
  • dns, resolv.conf.erb located in the template directory is copied to /etc/resolv.conf:
  • template "/etc/resolv.conf" do
      source "resolv.conf.erb"
    end
    
  • create_user_group: a user for TADDM is created:
  • user "taddmux" do
      gid 'sys'
      uid 421
      home '/home/taddmux'
      comment 'user TADDM connect SSH'
    end
    

Templates

In the recipes above, templates are used for the ntp and dns configuration. Template files are files in which some strings are replaced by Chef attributes found in roles, environments, ohai, or even directly in recipes. Here are the two files I used for dns and ntp:

  • ntp.conf.erb: the ntpserver1,2,3 attributes are found in environments (let's say I have siteA and siteB and the ntp servers are different for each site; I can define an environment for siteA and siteB):
  • [..]
    server <%= node['ntpserver1'] %>
    server <%= node['ntpserver2'] %>
    server <%= node['ntpserver3'] %>
    driftfile /etc/ntp.drift
    tracefile /etc/ntp.trace
    
  • resolv.conf.erb: the nameserver1,2,3 and namesearch attributes are found in environments:
  • search  <%= node['namesearch'] %>
    nameserver      <%= node['nameserver1'] %>
    nameserver      <%= node['nameserver2'] %>
    nameserver      <%= node['nameserver3'] %>
    

role assignment

Chef roles can be used to run different Chef recipes depending on the type of server you want to post-install. You can for instance create a role for web servers in which the Websphere recipe will be executed, and a role for database servers in which the Oracle recipe will be executed. In my case, and for the simplicity of this example, I just created one role called aix7:

# knife role create aix7
Created role[aix7]
# knife role edit aix7
{
  "name": "aix7",
  "description": "",
  "json_class": "Chef::Role",
  "default_attributes": {
    "nim_server": "nimsrv01"
  },
  "override_attributes": {

  },
  "chef_type": "role",
  "run_list": [
    "recipe[aix7::ohai_custom]",
    "recipe[aix7::create_fs_rootvg]",
    "recipe[aix7::create_profile_root]",
    "recipe[aix7::test_ohai]",
    "recipe[aix7::install_ssh]",
    "recipe[aix7::install_sddpcm]",
    "recipe[aix7::ntp]",
    "recipe[aix7::dns]"
  ],
  "env_run_lists": {

  }
}

We can see two important things here. We created an attribute specific to this role called nim_server: in all recipes and templates, "node['nim_server']" will be replaced by nimsrv01 (remember the recipes above, and remember we told chef-client to run the aix7 role). We also created a run_list stating that the recipes coming from the aix7 cookbook (install_ssh, install_sddpcm, ...) should be executed on a server calling chef-client with the aix7 role.

environments

Chef environments can be used to separate your environments, for instance production, development, backup, or in my example sites. In my company, depending on the site on which you are building a machine, nameservers and ntp servers will differ. Remember that we are using template files for the resolv.conf and ntp.conf files:

knife environment show siteA
chef_type:           environment
cookbook_versions:
default_attributes:
  namesearch:  lab.chmod666.org chmod666.org
  nameserver1: 10.10.10.10
  nameserver2: 10.10.10.11
  nameserver3: 10.10.10.12
  ntpserver1:  11.10.10.10
  ntpserver2:  11.10.10.11
  ntpserver3:  11.10.10.12
description:         production site
json_class:          Chef::Environment
name:                siteA
override_attributes:

When chef-client is called with the -E siteA option it will replace node['namesearch'] by "lab.chmod666.org chmod666.org" in all recipes and template files.
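
For example, running the client by hand against a given environment is just a matter of passing the -E (--environment) option, which is a standard chef-client option:

# chef-client -E siteA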

A Chef run

When you are happy with your cookbook, upload it to the Chef server:

# knife cookbook upload aix7
Uploading aix7           [0.1.0]
Uploaded 1 cookbook.

When chef-client is not executed by cloud-init you can run it by hand. I thought it was interesting to put an output of a chef-client run here; you can see that files are modified, packages installed and so on ;-) :

chef-clientrun1
chef-clientrun2

Ohai

ohai is a command delivered with chef-client. Its purpose is to gather information about the machine on which chef-client is executed. Each time chef-client runs, a call to ohai is made. By default ohai gathers a lot of information such as the ip address of the machine, the lpar id, the lpar name, and so on. A call to ohai returns a json tree, and each element of this json tree can be accessed in Chef recipes or in Chef templates; for instance, to get the lpar name, 'node['virtualization']['lpar_name']' can be used. Here is an example of a single call to ohai:

# ohai | more
  "ipaddress": "10.244.248.56",
  "macaddress": "FA:A3:6A:5C:82:20",
  "os": "aix",
  "os_version": "1",
  "platform": "aix",
  "platform_version": "7.1",
  "platform_family": "aix",
  "uptime_seconds": 14165,
  "uptime": "3 hours 56 minutes 05 seconds",
  "virtualization": {
    "lpar_no": "7",
    "lpar_name": "s00va9940866-ada56a6e-0000004d"
  },

At the time of writing this blog post there are - in my humble opinion - some attributes missing in ohai. For instance, if you want to install a specific package from an lpp_source you first need to know your current oslevel (I mean the output of oslevel -s). Fortunately ohai can be extended with custom plugins and you can add whatever attributes you want.

  • In ohai 7 (the one shipped with chef-client 12) an attribute needs to be added to the Chef client.rb configuration to tell where the ohai plugins are located. Remember that the chef-client is configured by cloud-init; to do so you need to modify the template used by cloud-init to build the client.rb file. This one is located in /opt/freeware/etc/cloud/templates:
  • # tail -1 /opt/freeware/etc/cloud/templates/chef_client.rb.tmpl
    Ohai::Config[:plugin_path] << '/etc/chef/ohai_plugins'
    # mkdir -p /etc/chef/ohai_plugins
    
  • After this modification the machine is ready to be captured.
  • You want your custom ohai plugins to be uploaded to the chef-client machine at chef-client execution time; here is an example of a custom ohai plugin used as a template. This one gathers the oslevel (oslevel -s), the node name, the partition name and the memory mode of the machine. These attributes are gathered with the lparstat command:
  • Ohai.plugin(:Aixcustom) do
      provides "aixcustom"
    
      collect_data(:aix) do
        aixcustom Mash.new
    
        oslevel = shell_out("oslevel -s").stdout.split($/)[0]
        nodename = shell_out("lparstat -i | awk -F ':' '$1 ~ \"Node Name\" {print $2}'").stdout.split($/)[0]
        partitionname = shell_out("lparstat -i | awk -F ':' '$1 ~ \"Partition Name\" {print $2}'").stdout.split($/)[0]
        memorymode = shell_out("lparstat -i | awk -F ':' '$1 ~ \"Memory Mode\" {print $2}'").stdout.split($/)[0]
    
        aixcustom[:oslevel] = oslevel
        aixcustom[:nodename] = nodename
        aixcustom[:partitionname] = partitionname
        aixcustom[:memorymode] = memorymode
      end
    end
    
  • The custom ohai plugin is now written. Remember that you want it to be uploaded to the machine at chef-client execution time, and that the new attributes created by this plugin need to be added to ohai. Here is a recipe uploading the custom ohai plugin; when the plugin is uploaded ohai is reloaded and the new attributes can be used in any further templates (for recipes you have no other choice than putting the custom ohai plugin in the directory before the capture):
  • cat ~/.chef/cookbooks/aix7/recipes/ohai_custom.rb
    ohai "reload" do
      action :reload
    end
    
    template "/etc/chef/ohai_plugins/aixcustom.rb" do
      notifies :reload, "ohai[reload]", :immediately
    end
    

chef-server, chef workstation, knife

I'll not detail here how to set up a Chef server, nor how to configure your Chef workstation (knife). There are plenty of good tutorials about that on the internet. Please just note that you need to use Chef server 12 if you are using Chef client 12. Here are some good links to start with.

I had some difficulties during the configuration, so here are a few tricks to know (a minimal knife.rb sketch follows this list):

  • The cacert can be found here: /opt/opscode/embedded/ssl/cert/cacert.pem
  • The Chef validation key can be found in /etc/chef/chef-validator.pem
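
For reference, here is a minimal knife.rb sketch for the workstation (the user name, key paths and URL are hypothetical placeholders; the exact content depends on how your Chef server 12 organization and user were created):

# ~/.chef/knife.rb
node_name                "kadmin"
client_key               "/home/kadmin/.chef/kadmin.pem"
validation_client_name   "chmod666-validator"
validation_key           "/home/kadmin/.chef/chmod666-validator.pem"
chef_server_url          "https://chefserver.lab.chmod666.org/organizations/chmod666"
cookbook_path            [ "/home/kadmin/.chef/cookbooks" ]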

Building the machine, checking the logs

  • The write_files part was executed; the file is present in the /tmp filesystem:
  • # cat /tmp/cloud-init-started
    cloud-init was started on this server
    
  • The chef-client was configured: the files are present in the /etc/chef directory, and looking at the log file you can see these files were created by cloud-init:
  • # ls -l /etc/chef
    total 32
    -rw-------    1 root     system         1679 Apr 26 23:46 client.pem
    -rw-r--r--    1 root     system          646 Apr 26 23:46 client.rb
    -rw-r--r--    1 root     system           38 Apr 26 23:46 firstboot.json
    -rw-r--r--    1 root     system         1679 Apr 26 23:46 validation.pem
    
    # grep chef /var/log/cloud-init-output.log
    2015-04-26 23:46:22,463 - importer.py[DEBUG]: Found cc_chef with attributes ['handle'] in ['cloudinit.config.cc_chef']
    2015-04-26 23:46:22,879 - util.py[DEBUG]: Writing to /opt/freeware/var/lib/cloud/instances/a8b8fe0d-34c1-4bdb-821c-777fca1c391f/sem/config_chef - wb: [420] 23 bytes
    2015-04-26 23:46:22,882 - helpers.py[DEBUG]: Running config-chef using lock ()
    2015-04-26 23:46:22,884 - util.py[DEBUG]: Writing to /etc/chef/validation.pem - wb: [420] 1679 bytes
    2015-04-26 23:46:22,887 - util.py[DEBUG]: Reading from /opt/freeware/etc/cloud/templates/chef_client.rb.tmpl (quiet=False)
    2015-04-26 23:46:22,889 - util.py[DEBUG]: Read 892 bytes from /opt/freeware/etc/cloud/templates/chef_client.rb.tmpl
    2015-04-26 23:46:22,954 - util.py[DEBUG]: Writing to /etc/chef/client.rb - wb: [420] 646 bytes
    2015-04-26 23:46:22,958 - util.py[DEBUG]: Writing to /etc/chef/firstboot.json - wb: [420] 38 bytes
    
  • The runcmd part was executed:
  • # cat /opt/freeware/var/lib/cloud/instance/scripts/runcmd
    #!/bin/sh
    /usr/bin/chef-client
    
    2015-04-26 23:46:22,488 - importer.py[DEBUG]: Found cc_runcmd with attributes ['handle'] in ['cloudinit.config.cc_runcmd']
    2015-04-26 23:46:22,983 - util.py[DEBUG]: Writing to /opt/freeware/var/lib/cloud/instances/a8b8fe0d-34c1-4bdb-821c-777fca1c391f/sem/config_runcmd - wb: [420] 23 bytes
    2015-04-26 23:46:22,986 - helpers.py[DEBUG]: Running config-runcmd using lock ()
    2015-04-26 23:46:22,987 - util.py[DEBUG]: Writing to /opt/freeware/var/lib/cloud/instances/a8b8fe0d-34c1-4bdb-821c-777fca1c391f/scripts/runcmd - wb: [448] 31 bytes
    2015-04-26 23:46:25,868 - util.py[DEBUG]: Running command ['/opt/freeware/var/lib/cloud/instance/scripts/runcmd'] with allowed return codes [0] (shell=False, capture=False)
    
  • The final message was printed in the output of the cloud-init log file
  • 2015-04-26 23:06:01,203 - helpers.py[DEBUG]: Running config-final-message using lock ()
    The system is up, cloud-init is finished
    2015-04-26 23:06:01,240 - util.py[DEBUG]: The system is up, cloud-init is finished
    2015-04-26 23:06:01,242 - util.py[DEBUG]: Writing to /opt/freeware/var/lib/cloud/instance/boot-finished - wb: [420] 57 bytes
    

On the Chef server you can check that the client registered itself and get details about it.

# knife node list | grep a8b8fe0d-34c1-4bdb-821c-777fca1c391f
a8b8fe0d-34c1-4bdb-821c-777fca1c391f
# knife node show a8b8fe0d-34c1-4bdb-821c-777fca1c391f
Node Name:   a8b8fe0d-34c1-4bdb-821c-777fca1c391f
Environment: _default
FQDN:
IP:          10.10.208.61
Run List:    role[aix7]
Roles:       france_testing
Recipes:     aix7::create_fs_rootvg, aix7::create_profile_root
Platform:    aix 7.1
Tags:

What's next ?

If you have a look at the Chef supermarket (the place where you can download Chef cookbooks written by the community and validated by Opscode) you'll see that there are not a lot of cookbooks for AIX. I'm currently writing my own cookbook for AIX logical volume manager and filesystem creation, but there is still a lot of work to do on cookbook creation for AIX. Here is a list of cookbooks that need to be written by the community: chdev, multibos, mksysb, nim client, wpar, update_all, ldap_client .... I could go on but I'm sure that you have a lot of ideas. Last word: learn Ruby and write cookbooks; they will be used by the community and we can finally have a good configuration management tool on AIX. With PowerVC, cloud-init and Chef support, AIX will have a full "DevOps" stack and can finally fight against Linux. As always, I hope this blog post helps you to understand PowerVC, cloud-init and Chef!

Automating systems deployment & other new features : HMC8, IBM Provisioning Toolkit for PowerVM and LPM Automation Tool

I am involved in a project where we are going to deploy dozens of Power Systems (still Power7 for the moment, and Power8 in the near future). All the systems will be the same: same models, with the same slot placement and the same Virtual I/O Server configuration. To be sure that all my machines are identical, and to allow other people (who are not aware of the design or are not skilled enough to do it by themselves) to deploy them, I had to find a solution to automate the deployment of the new machines. For the virtual machines the solution is now to use PowerVC, but what about the Virtual I/O Servers, what about the configuration of the Shared Ethernet Adapters? In other words, what about the infrastructure deployment? I spent a week with an IBM US STG Lab Services consultant (Bonnie Lebarron) for a PowerCare (a PowerCare is now included with every high-end machine you buy) about the IBM Provisioning Toolkit for PowerVM (which is a very powerful tool that allows you to deploy your Virtual I/O Servers and your virtual machines automatically) and the Live Partition Mobility Automation tool. With the new Hardware Management Console (8R8.2.0) you now have the possibility to create templates not just for new virtual machine creation, but also to deploy, create and configure your Virtual I/O Servers. The goal of this post is to show that there are different ways to do that, but also to show you the new features embedded in the new Hardware Management Console and to spread the word about those two wonderful STG Lab Services tools that are well known in the US but not so much in Europe. So it’s a HUGE post; just take what is useful for you in it. Here we go:

Hardware Management Console 8 : System templates

The goal of the system templates is to deploy a new server in minutes without having to log on to different servers to do some tasks; you now just have to connect to the HMC to do all the work. The system templates will deploy the Virtual I/O Server image by using your NIM server or by using the images stored in the Hardware Management Console media repository. Please note a few points:

  • You CAN’T deploy a “gold” mksysb of your Virtual I/O Server using the Hardware Management Console repository. I’ve tried this myself and it is for the moment impossible (if someone has a solution …). I’ve tried two different ways: creating a backupios image without the mksysb flag (it produces a tar file that is impossible to upload to the image repository, but usable by the installios command), and creating a backupios image with the mksysb flag and using the mkcd/mkdvd command to create iso images. Both methods failed at the installation process.
  • The current Virtual I/O Server images provided by Electronic Software Delivery (2.2.3.4 at the moment) are provided in the .udf format and not the .iso format. This is not a huge problem; just rename both files to .iso before uploading them to the Hardware Management Console.
  • If you want to deploy your own mksysb you can still choose to use your NIM server, but you will have to manually create the NIM objects and manually configure a bosinst installation (in my humble opinion what we are trying to do is to reduce manual intervention, but you can still do that for the moment; that’s what I do because I don’t have the choice). You’ll have to give the IP address of the NIM server and the HMC will boot the Virtual I/O Servers with the network settings already configured.
  • The Hardware Management Console installation with the media repository is based on the old well-known installios command. You still need to have the NIM port opened between your HMC and the Virtual I/O Server management network (the one you will choose to install both Virtual I/O Servers), as installios is based on NIMOL. You may experience some problems if you have already installed your Virtual I/O Servers this way and you may have to reset a few things. My advice is to always run these three commands before deploying a system template:
  • # installios -F -e -R default1
    # installios -u 
    # installios -q
    

Uploading an iso file on the Hardware Management Console

Upload the images to the Hardware Management Console; I’ll not explain this in detail …:

hmc_add_virtual_io_server_to_repo
hmc_add_virtual_io_server_to_repo2

Creating a system template

To create a system template you first have to copy an existing predefined template provided by the Hardware Management Console (1) and then edit this template to fit your own needs (2):

create_template_1

  • You can’t edit the physical I/O part when creating a new template; you first have to deploy a system with this template, choose the physical I/O for each Virtual I/O Server, and then capture this deployed system as an HMC template. Change the properties of your Virtual I/O Servers:
  • create_template_2

  • Create your Shared Ethernet Adapters : let’s say we want to create one Shared Ethernet Adapter in sharing mode with four virtual adapters :
  • Adapter 1 : PVID10, vlans=1024;1025
  • Adapter 2 : PVID11, vlans=1028;1029
  • Adapter 3 : PVID12, vlans=1032;1033
  • Adapter 4 : PVID13, vlans=1036;1037
  • In the new HMC8 the terminology is changing: Virtual Network Bridge = Shared Ethernet Adapter; Load (Balance) Group = a pair of virtual adapters with the same PVID on both Virtual I/O Servers.
  • Create the Shared Ethernet Adapter with the first (with PVID10) and the second (with PVID11) adapter and the first vlan (vlan 1024 has to be added on adapter with PVID 10) :
  • create_sea1
    create_sea2
    create_sea3

  • Add the second vlan (the vlan 1028) in our Shared Ethernet Adapter (Virtual Network Bridge) and choose to put it on the adapter with PVID 11 (Load Balance Group 11) :
  • create_sea4
    create_sea5
    create_sea6

  • Repeat this operation for the next vlan (1032), but this time we have to create new virtual adapters with PVID 12 (Load Balance Group 12) :
  • create_sea7

  • Repeat this operation for the next vlan (1036), but this time we have to create new virtual adapters with PVID 13 (Load Balance Group 13).
  • You can check on this picture our 4 virtual adapters with two vlans each:
  • create_sea8
    create_sea9

  • I’ll not detail the other parts, which are very simple to understand. At the end you can check that our template is created with 2 Virtual I/O Servers and 8 virtual networks.

The Shared Ethernet Adapter problem : Are you deploying a Power8/Power7 with a 780 firmware or a Power6/7 server ?

When creating a system template you will probably notice that when you are defining your Shared Ethernet Adapters … sorry, your Virtual Network Bridges, there is no possibility to create any control channel adapter or to assign a vlan id for this control channel. If you create the system template by hand with the HMC, the template will be usable by all Power8 systems and by all Power7 systems with a firmware that allows you to create a Shared Ethernet Adapter without any control channel (780 firmwares). I’ve tried this myself and we will check that later. If you are deploying a system template on an older Power7 system the deployment will fail for this reason. You have two solutions to this problem: create your first system “by hand”, create your Shared Ethernet Adapters with control channels on your own and then capture the system to redeploy it on other machines, or edit the XML of your current template to add the control channel adapter in it … no comment.

failed_sea_ctl_chan

If you choose to edit the template to add the control channel on your own, export your template as an xml file, edit it by hand (here is an example in the picture below), and then re-import the modified xml file:

sea_control_channel_template

Capture an already deployed system

As you can see, creating a system template from scratch can be hard and cannot match all your needs, especially with this Shared Ethernet Adapter problem. My advice is to deploy your first system by hand or by using the toolkit, and then capture the system to create a Hardware Management Console template based on it. By doing this all the Shared Ethernet Adapters will be captured as configured, the ones with control channels and the ones without. It can match all cases without having to edit the xml file by hand.

  • Click “Capture configuration as template with physical I/O” :
  • capture_template_with_physical_io

  • The whole system will be captured, and if you put your physical I/O in the same slots (as we do in my team) you will not have to choose which physical I/O belongs to which Virtual I/O Server each time you deploy a new server:
  • capture_template_with_physical_io_capturing

  • In the system template library you can check that the physical I/O is captured and that we do not have to define our Shared Ethernet Adapters (the screenshot below shows you 49 vlans ready to be deployed):
  • capture_template_library_with_physical_io_and_vlan

  • To do this, don’t forget to edit the template and check the box “Use captured I/O information”:
  • use_captured_io_informations

Deploying a system template

BE VERY CAREFUL BEFORE DEPLOYING A SYSTEM TEMPLATE: ALL THE ALREADY EXISTING VIRTUAL I/O SERVERS AND PARTITIONS WILL BE REMOVED BY DOING THIS. THE HMC WILL PROMPT YOU WITH A WARNING MESSAGE. Go to the template library and right click on the template you want to deploy, then click deploy:

reset_before_deploy1
reset_before_deploy2

  • If you are deploying a “non captured template” choose the physical I/O for each Virtual I/O Server:
  • choose_io1

  • If you are deploying a “captured template” the physical I/O will be automatically chosen for each Virtual I/O Server:
  • choose_io2

  • The Virtual I/O Server profiles are carved here:
  • craving_virtual_io_servers

  • You next have the choice of using a NIM server or the HMC image repository to deploy the Virtual I/O Servers; in both cases you have to choose the adapter used to deploy the image:
  • The NIM way:
  • nim_way

  • The HMC way (check the tip at the beginning of the post about installios if you are choosing this method):
  • hmc_way

  • Click start when you are ready. The start button will invoke the lpar_netboot command with the settings you put in the previous screen:
  • start_dep

    • You can monitor the installation process by clicking "monitor vterm" (on the images below you can check that the ping is successful, the bootp is ok, the tftp is downloading, and the mksysb is being restored) :
    • monitor1
      monitor2
      monitor3

    • The RMC connection has to be up on both Virtual I/O Servers to build the Shared Ethernet Adapters, and the Virtual I/O Server license must be accepted. Check that both are ok.
    • RMCok
      licenseok

    • Choose where the Shared Ethernet Adapters will be created and create the link aggregation devices here (in other words, choose on which network adapters and network ports your Shared Ethernet Adapters will be created) :
    • choose_adapter

    • Click start on the next screen to create the Shared Ethernet Adapter automatically :
    • sea_creation_ok

    • After a successful deployment of a system template a summary will be displayed on the screen (a quick check to run from the Virtual I/O Servers is sketched below) :
    • template_ok
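    Once the deployment is finished I like to double check the result from the Virtual I/O Servers themselves. Here is a minimal sanity check sketch (the SEA device name ent21 is just an example taken from my own logs further down in this post, use the one reported by lsmap) :

    # run as padmin on each Virtual I/O Server after the deployment
    $ lsmap -all -net                        # every Shared Ethernet Adapter and its trunk adapters should be listed
    $ entstat -all ent21 | grep -i state     # the SEA state should match what you expect (failover or sharing)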

    IBM Provisioning Toolkit for PowerVM : A tool created by the Admins for the Admins

    As you now know the HMC templates are ok, but there are some drawbacks to using this method. In my humble opinion the HMC templates are good for a beginner: the user is guided step by step and it is much simpler for someone who doesn’t know anything about PowerVM to build a server from scratch, without knowing and understanding all the features of PowerVM (Virtual I/O Server, Shared Ethernet Adapter). But the deployment is not fully automated: the HMC will not mirror your rootvg, will not set any attributes on your fibre channel adapters, and will never run a custom script after the installation to fit your needs. Last point, I’m sure that as a system administrator you probably prefer command line tools to a “crappy” GUI, and a template can neither be created nor deployed from the command line (change this please). There is another way to build your servers and it’s called the IBM PowerVM Provisioning Toolkit. This tool is developed by STG Lab Services US and is not well known in Europe, but I can assure you that a lot of US customers are using it (raise your voice in the comments, US guys). This tool can help you in many ways :

    • Carving Virtual I/O Server profiles.
    • Building and deploying Virtual I/O Servers with a NIM server without having to create anything by hand.
    • Creating your SEAs with or without control channel, failover/sharing, tagged/non-tagged.
    • Setting attributes on your fibre channel adapters.
    • Building and deploying Virtual I/O clients in NPIV and vSCSI.
    • Mirroring your rootvg.
    • Capturing a whole frame and redeploying it on another server.
    • A lot of other things.

    Just to let you understand the approach of the tool, let’s begin with an example. I want to deploy a new machine with two Virtual I/O Servers :

    • 1 (white) – I’m writing a profile file : in this one I put all the information that is the same across all the machines (virtual switches, shared processor pools, Virtual I/O Server profiles, Shared Ethernet Adapter definitions, the image chosen to deploy the Virtual I/O Servers, the physical I/O adapters for each Virtual I/O Server).
    • 2 (white) – I’m writing a config file : in this one I put all the information that is unique to each machine (name, ip, HMC used to deploy, CEC serial number, and so on).
    • 3 (yellow) – I’m launching the provisioning toolkit to build my machine; the NIM objects are created (networks, standalone machines) and the bosinst operation is launched from the NIM server.
    • 4 (red) – The Virtual I/O Server profiles are created and the lpar_netboot command is launched (an ssh key has to be shared between the NIM server and the Hardware Management Console, see the sketch after the diagram below).
    • 5 (blue) – The Shared Ethernet Adapters are created and the post configuration is launched on the Virtual I/O Servers (mirror creation, vfc attributes …).

    toolkit
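    About the ssh key mentioned in step 4 (red), here is a minimal sketch of the way I share a key between the NIM server and the HMC so the toolkit can run HMC commands non-interactively (the HMC name myhmc and the hscroot user are the ones from my environment, adapt them) :

    # on the NIM server: generate a key pair if you do not already have one
    $ ssh-keygen -t rsa -f ~/.ssh/id_rsa -N ""
    # register the public key for the hscroot user on the HMC (mkauthkeys is the HMC command doing this)
    $ KEY=$(cat ~/.ssh/id_rsa.pub); ssh hscroot@myhmc "mkauthkeys --add \"$KEY\""
    # this last one should not prompt for a password anymore
    $ ssh hscroot@myhmc lshmc -V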

    Let me show you a detailed example of a new machine deployment :

    • On the NIM server the toolkit is located in /export/nim/provision. You can see the main script called buildframe.ksh.v3.24.2 and two directories: one for the profiles (build_profiles) and one for the configuration files (config_files). The work_area directory is the log directory :
    • # cd /export/nim/provision
      # ls
      build_profiles          buildframe.ksh.v3.24.2  config_files       lost+found              work_area
      
    • Let’s check the profile file for a new Power720 deployment :
    • # vi build_profiles/p720.conf
      
    • Some variables will be set in the configuration file; put an NA value for those in the profile :
    • VARIABLES      (SERVERNAME)=NA
      VARIABLES      (BUILDHMC)=NA
      [..]
      VARIABLES      (BUILDUSER)=hscroot
      VARIABLES      (VIO1_LPARNAME)=NA
      VARIABLES      (vio1_hostname)=(VIO1_LPARNAME)
      VARIABLES      (VIO1_PROFILE)=default_profile
      
      VARIABLES      (VIO2_LPARNAME)=NA
      VARIABLES      (vio2_hostname)=(VIO2_LPARNAME)
      VARIABLES      (VIO2_PROFILE)=default_profile
      
      VARIABLES      (VIO1_IP)=NA
      VARIABLES      (VIO2_IP)=NA
      
    • Choose the ports that will be used to restore the Virtual I/O Server mksysb :
    • VARIABLES      (NIMPORT_VIO1)=(CEC1)-P1-C6-T1
      VARIABLES      (NIMPORT_VIO2)=(CEC1)-P1-C7-T1
      
    • In this example I’m building the Virtual I/O Servers with 3 Shared Ethernet Adapters, and I’m not creating any LACP aggregation :
    • # SEA1
      VARIABLES      (SEA1VLAN1)=401
      VARIABLES      (SEA1VLAN2)=402
      VARIABLES      (SEA1VLAN3)=403
      VARIABLES      (SEA1VLAN4)=404
      VARIABLES      (SEA1VLANS)=(SEA1VLAN1),(SEA1VLAN2),(SEA1VLAN3),(SEA1VLAN4)
      # SEA2
      VARIABLES      (SEA2VLAN1)=100,101,102
      VARIABLES      (SEA2VLAN2)=103,104,105
      VARIABLES      (SEA2VLAN3)=106,107,108
      VARIABLES      (SEA2VLAN4)=109,110
      VARIABLES      (SEA2VLANS)=(SEA2VLAN1),(SEA2VLAN2),(SEA2VLAN3),(SEA2VLAN4)
      # SEA3
      VARIABLES      (SEA3VLAN1)=200,201,202,203,204,309
      VARIABLES      (SEA3VLAN2)=205,206,207,208,209,310
      VARIABLES      (SEA3VLAN3)=210,300,301,302,303
      VARIABLES      (SEA3VLAN4)=304,305,306,307,308
      VARIABLES      (SEA3VLANS)=(SEA3VLAN1),(SEA3VLAN2),(SEA3VLAN3),(SEA3VLAN4)
      # SEA DEF (I'm putting adapter ID and PVID here)
      SEADEF         seadefid=SEA1,networkpriority=S,vswitch=vdct,seavirtid=10,10,(SEA1VLAN1):11,11,(SEA1VLAN2):12,12,(SEA1VLAN3):13,13,(SEA1VLAN4),seactlchnlid=14,99,vlans=(SEA1VLANS),netmask=(SEA1NETMASK),gateway=(SEA1GATEWAY),etherchannel=NO,lacp8023ad=NO,vlan8021q=YES,seaat
      trid=nojumbo
      SEADEF         seadefid=SEA2,networkpriority=S,vswitch=vdca,seavirtid=15,15,(SEA2VLAN1):16,16,(SEA2VLAN2):17,17,(SEA2VLAN3):18,18,(SEA2VLAN4),seactlchnlid=19,98,vlans=(SEA2VLANS),netmask=(SEA2NETMASK),gateway=(SEA2GATEWAY),etherchannel=NO,lacp8023ad=NO,vlan8021q=YES,seaat
      trid=nojumbo
      SEADEF         seadefid=SEA3,networkpriority=S,vswitch=vdcb,seavirtid=20,20,(SEA3VLAN1):21,21,(SEA3VLAN2):22,22,(SEA3VLAN3):23,23,(SEA3VLAN4),seactlchnlid=24,97,vlans=(SEA3VLANS),netmask=(SEA3NETMASK),gateway=(SEA3GATEWAY),etherchannel=NO,lacp8023ad=NO,vlan8021q=YES,seaat
      trid=nojumbo
      # SEA PHYSICAL PORTS 
      VARIABLES      (SEA1AGGPORTS_VIO1)=(CEC1)-P1-C6-T2
      VARIABLES      (SEA1AGGPORTS_VIO2)=(CEC1)-P1-C7-T2
      VARIABLES      (SEA2AGGPORTS_VIO1)=(CEC1)-P1-C1-C3-T1
      VARIABLES      (SEA2AGGPORTS_VIO2)=(CEC1)-P1-C1-C4-T1
      VARIABLES      (SEA3AGGPORTS_VIO1)=(CEC1)-P1-C4-T1
      VARIABLES      (SEA3AGGPORTS_VIO2)=(CEC1)-P1-C5-T1
      # SEA ATTR 
      SEAATTR        seaattrid=nojumbo,ha_mode=sharing,largesend=1,large_receive=yes
      
    • I’m defining each physical I/O adapter for each Virtual I/O Server :
    • VARIABLES      (HBASLOTS_VIO1)=(CEC1)-P1-C1-C1,(CEC1)-P1-C2
      VARIABLES      (HBASLOTS_VIO2)=(CEC1)-P1-C1-C2,(CEC1)-P1-C3
      VARIABLES      (ETHSLOTS_VIO1)=(CEC1)-P1-C6,(CEC1)-P1-C1-C3,(CEC1)-P1-C4
      VARIABLES      (ETHSLOTS_VIO2)=(CEC1)-P1-C7,(CEC1)-P1-C1-C4,(CEC1)-P1-C5
      VARIABLES      (SASSLOTS_VIO1)=(CEC1)-P1-T9
      VARIABLES      (SASSLOTS_VIO2)=(CEC1)-P1-C19-T1
      VARIABLES      (NPIVFCPORTS_VIO1)=(CEC1)-P1-C1-C1-T1,(CEC1)-P1-C1-C1-T2,(CEC1)-P1-C1-C1-T3,(CEC1)-P1-C1-C1-T4,(CEC1)-P1-C2-T1,(CEC1)-P1-C2-T2,(CEC1)-P1-C2-T3,(CEC1)-P1-C2-T4
      VARIABLES      (NPIVFCPORTS_VIO2)=(CEC1)-P1-C1-C2-T1,(CEC1)-P1-C1-C2-T2,(CEC1)-P1-C1-C2-T3,(CEC1)-P1-C1-C2-T4,(CEC1)-P1-C3-T1,(CEC1)-P1-C3-T2,(CEC1)-P1-C3-T3,(CEC1)-P1-C3-T4
      
    • I’m defining the mksysb image to use and the Virtual I/O Server profiles :
    • BOSINST        bosinstid=viogold,source=mksysb,mksysb=golden-vios-2234-29122014-mksysb,spot=golden-vios-2234-29122014-spot,bosinst_data=no_prompt_hdisk0-bosinst_data,accept_licenses=yes,boot_client=no
      
      PARTITIONDEF   partitiondefid=vioPartition,bosinstid=viogold,lpar_env=vioserver,proc_mode=shared,min_proc_units=0.4,desired_proc_units=1,max_proc_units=16,min_procs=1,desired_procs=4,max_procs=16,sharing_mode=uncap,uncap_weight=255,min_mem=1024,desired_mem=8192,max_mem=12
      288,mem_mode=ded,max_virtual_slots=500,all_resources=0,msp=1,allow_perf_collection=1
      PARTITION      name=(VIO1_LPARNAME),profile_name=(VIO1_PROFILE),partitiondefid=vioPartition,lpar_netboot=(NIM_IP),(vio1_hostname),(VIO1_IP),(NIMNETMASK),(NIMGATEWAY),(NIMPORT_VIO1),(NIM_SPEED),(NIM_DUPLEX),NA,YES,NO,NA,NA
      PARTITION      name=(VIO2_LPARNAME),profile_name=(VIO2_PROFILE),partitiondefid=vioPartition,lpar_netboot=(NIM_IP),(vio2_hostname),(VIO2_IP),(NIMNETMASK),(NIMGATEWAY),(NIMPORT_VIO2),(NIM_SPEED),(NIM_DUPLEX),NA,YES,NO,NA,NA
      
    • Let’s now check a configuration file for a specific machine (as you can see I’m putting the Virtual I/O Server names here, the ip addresses and everything that is specific to the new machine (CEC serial number and so on)) :
    • # cat P720-8202-E4D-1.conf
      (BUILDHMC)=myhmc
      (SERVERNAME)=P720-8202-E4D-1
      (CEC1)=WZSKM8U
      (VIO1_LPARNAME)=labvios1
      (VIO2_LPARNAME)=labvios2
      (VIO1_IP)=10.14.14.1
      (VIO2_IP)=10.14.14.2
      (NIMGATEWAY)=10.14.14.254
      (VIODNS)=10.10.10.1,10.10.10.2
      (VIOSEARCH)=lab.chmod66.org,prod.chmod666.org
      (VIODOMAIN)=chmod666.org
      
    • We are now ready to build the new machine. The first thing to do is to create the vswitches on the machine (you have to confirm all operations; a quick way to verify the virtual switches afterwards is sketched after the output below) :
    • ./buildframe.ksh.v3.24.2 -p p720 -c P720-8202-E4D-1.conf -f vswitch
      150121162625 Start of buildframe DATE: (150121162625) VERSION: v3.24.2
      150121162625        profile: p720.conf
      150121162625      operation: FRAMEvswitch
      150121162625 partition list:
      150121162625   program name: buildframe.ksh.v3.24.2
      150121162625    install dir: /export/nim/provision
      150121162625    post script:
      150121162625          DEBUG: 0
      150121162625         run ID: 150121162625
      150121162625       log file: work_area/150121162625_p720.conf.log
      150121162625 loading configuration file: config_files/P720-8202-E4D-1.conf
      [..]
      Do you want to continue?
      Please enter Y or N Y
      150121162917 buildframe is done with return code 0
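    Here is the quick verification mentioned above, run from the HMC (a sketch only; the managed system name is an assumption, use the one shown by lssyscfg -r sys -F name) :

    # list the virtual switches created on the new frame
    $ lshwres -m s00ka9936774-8202-E4D-845B2CV -r virtualio --rsubtype vswitch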
      
    • Let’s now build the Virtual I/O Servers, create the Shared Ethernet Adapters and have a coffee ;-) (a quick RMC check to run afterwards is sketched after the output below)
    • # ./buildframe.ksh.v3.24.2 -p p720 -c P720-8202-E4D-1.conf -f build
      [..]
      150121172320 Creating partitions
      150121172320                 --> labvios1
      150121172322                 --> labvios2
      150121172325 Updating partition profiles
      150121172325   updating VETH adapters in partition: labvios1 profile: default_profile
      150121172329   updating VETH adapters in partition: labvios1 profile: default_profile
      150121172331   updating VETH adapters in partition: labvios1 profile: default_profile
      150121172342   updating VETH adapters in partition: labvios2 profile: default_profile
      150121172343   updating VETH adapters in partition: labvios2 profile: default_profile
      150121172344   updating VETH adapters in partition: labvios2 profile: default_profile
      150121172345   updating IOSLOTS in partition: labvios1 profile: default_profile
      150121172347   updating IOSLOTS in partition: labvios2 profile: default_profile
      150121172403 Configuring NIM for partitions
      150121172459 Executing--> lpar_netboot   -K 255.255.255.0 -f -t ent -l U78AA.001.WZSKM8U-P1-C6-T1 -T off -D -s auto -d auto -S 10.20.20.1 -G 10.14.14.254 -C 10.14.14.1 labvios1 default_profile s00ka9936774-8202-E4D-845B2CV
      150121173247 Executing--> lpar_netboot   -K 255.255.255.0 -f -t ent -l U78AA.001.WZSKM8U-P1-C7-T1 -T off -D -s auto -d auto -S 10.20.20.1 -G 10.14.14.254 -C 10.14.14.2 labvios2 default_profile s00ka9936774-8202-E4D-845B2CV
      150121174028 buildframe is done with return code 0
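    And here is the RMC check mentioned above (a sketch; the managed system name is an assumption, it is the one appearing in the lpar_netboot lines above) :

    # the Virtual I/O Servers must be Running with an active RMC connection before the SEA creation step
    $ lssyscfg -r lpar -m s00ka9936774-8202-E4D-845B2CV -F name,state,rmc_state,rmc_ipaddr
    # expect rmc_state=active for labvios1 and labvios2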
      
    • After the mksysb is deployed you can tail the logs on each Virtual I/O Server to check what is going on (a couple of post-install checks are sketched after this log) :
    • [..]
      150121180520 creating SEA for virtID: ent4,ent5,ent6,ent7
      ent21 Available
      en21
      et21
      150121180521 Success: running /usr/ios/cli/ioscli mkvdev -sea ent1 -vadapter ent4,ent5,ent6,ent7 -default ent4 -defaultid 10 -attr ctl_chan=ent8  ha_mode=sharing largesend=1 large_receive=yes, rc=0
      150121180521 found SEA ent device: ent21
      150121180521 creating SEA for virtID: ent9,ent10,ent11,ent12
      [..]
      ent22 Available
      en22
      et22
      150121180523 Success: running /usr/ios/cli/ioscli mkvdev -sea ent20 -vadapter ent9,ent10,ent11,ent12 -default ent9 -defaultid 15 -attr ctl_chan=ent13  ha_mode=sharing largesend=1 large_receive=yes, rc=0
      150121180523 found SEA ent device: ent22
      150121180523 creating SEA for virtID: ent14,ent15,ent16,ent17
      [..]
      ent23 Available
      en23
      et23
      [..]
      150121180540 Success: /usr/ios/cli/ioscli cfgnamesrv -add -ipaddr 10.10.10.1, rc=0
      150121180540 adding DNS: 10.10.10.1
      150121180540 Success: /usr/ios/cli/ioscli cfgnamesrv -add -ipaddr 10.10.10.2, rc=0
      150121180540 adding DNS: 159.50.203.10
      150121180540 adding DOMAIN: lab.chmod666.org
      150121180541 Success: /usr/ios/cli/ioscli cfgnamesrv -add -dname fr.net.intra, rc=0
      150121180541 adding SEARCH: lab.chmod666.org prod.chmod666.org
      150121180541 Success: /usr/ios/cli/ioscli cfgnamesrv -add -slist lab.chmod666.org prod.chmod666.org, rc=0
      [..]
      150121180542 Success: found fcs device for physical location WZSKM8U-P1-C2-T4: fcs3
      150121180542 Processed the following FCS attributes: fcsdevice=fcs4,fcs5,fcs6,fcs7,fcs0,fcs1,fcs2,fcs3,fcsattrid=fcsAttributes,port=WZSKM8U-P1-C1-C1-T1,WZSKM8U-P1-C1-C1-T2,WZSKM8U-P1-C1-C1-T3,WZSKM8U-P1-C1-C1-T4,WZSKM8U-P1-C2-T1,WZSKM8U-P1-C2-T2,WZSKM8U-P1-C2-T3,WZSKM8U-P
      1-C2-T4,max_xfer_size=0x100000,num_cmd_elems=2048
      150121180544 Processed the following FSCSI attributes: fcsdevice=fcs4,fcs5,fcs6,fcs7,fcs0,fcs1,fcs2,fcs3,fscsiattrid=fscsiAttributes,port=WZSKM8U-P1-C1-C1-T1,WZSKM8U-P1-C1-C1-T2,WZSKM8U-P1-C1-C1-T3,WZSKM8U-P1-C1-C1-T4,WZSKM8U-P1-C2-T1,WZSKM8U-P1-C2-T2,WZSKM8U-P1-C2-T3,WZS
      KM8U-P1-C2-T4,fc_err_recov=fast_fail,dyntrk=yes
      [..]
      150121180546 Success: found device U78AA.001.WZSKM8U-P2-D4: hdisk0
      150121180546 Success: found device U78AA.001.WZSKM8U-P2-D5: hdisk1
      150121180546 Mirror hdisk0 -->  hdisk1
      150121180547 Success: extendvg -f rootvg hdisk1, rc=0
      150121181638 Success: mirrorvg rootvg hdisk1, rc=0
      150121181655 Success: bosboot -ad hdisk0, rc=0
      150121181709 Success: bosboot -ad hdisk1, rc=0
      150121181709 Success: bootlist -m normal hdisk0 hdisk1, rc=0
      150121181709 VIOmirror <- rc=0
      150121181709 VIObuild <- rc=0
      150121181709 Preparing to reboot in 10 seconds, press control-C to abort
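    As announced above, once the Virtual I/O Servers have rebooted I run a couple of quick checks to confirm the post-configuration really did what the logs say (a sketch; fcs0 and rootvg are just examples) :

    # run as padmin on each freshly built Virtual I/O Server
    $ lsdev -dev fcs0 -attr | grep -E "max_xfer_size|num_cmd_elems"   # should match the values pushed by the toolkit
    $ lsvg -lv rootvg                                                 # every logical volume should show 2 copies (mirrored rootvg)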
      

    The new server was deployed in one command and you avoid any manual mistake by using the toolkit. The example above is just one of the many ways to use the toolkit. This is a very powerful and simple tool and I really want to see other European customers using it, so ask your IBM pre-sales, ask for PowerCare and take control of your deployments by using the toolkit. The toolkit is also used to capture and redeploy a whole frame for disaster recovery plans.

    Live Partition Mobility Automation Tool

    Because understanding the provisioning toolkit didn't take me the full week, we still had plenty of time with Bonnie from STG Lab Services and we decided to give a try to another tool called the Live Partition Mobility Automation Tool. I'll not talk about it in detail but this tool allows you to automate your Live Partition Mobility moves. It's a web interface coming with a tomcat server that you can run on a Linux box or directly on your laptop. This web application takes control of your Hardware Management Console and allows you to do a lot of LPM related things :

    • You can run a validation on every partition of a system.
    • You can move your partitions by spreading or packing them on the destination server.
    • You can "record" a move to replay it later (very useful for my previous customer for instance: we were making our moves client by client, all clients being hosted on two big P795s).
    • You can run a dynamic platform optimizer after the moves.
    • You have an option to move the partitions back to their original location and this is (in my humble opinion) what makes this tool so powerful.

    lpm_toolkit

    Since I have this tool I'm running a validation of all my partitions on a weekly basis to check if there are any errors. I'm also using it to move and move back the partitions when I have to. So I really recommend the Live Partition Mobility Automation Tool.

    Hardware Management Console 8 : Other new features

    Adding a VLAN to an already existing Shared Ethernet Adapter

    With the new Hardware Management Console you can easily add a new vlan to an already existing Shared Ethernet Adapter (failover and sharing, with and without control channel : no restriction) without having to perform a dlpar operation on each Virtual I/O Server and then modify your profiles (if you do not have profile synchronization enabled). Even better, by using this method to add your new vlans you will avoid any misconfiguration, for instance forgetting to add the vlan on one of the Virtual I/O Servers or not choosing the same adapter on both sides. (The manual DLPAR way the GUI replaces is sketched after the screenshots below.)

    • Open the Virtual Network page in the HMC and click "Add a Virtual Network". You have to remember that a Virtual Network Bridge is a Shared Ethernet Adapter, and a Load Balance Group is a pair of virtual adapters with the same PVID, one on each Virtual I/O Server :
    • add_vlan5

    • Choose the name of your vlan (in my case VLAN3331), then choose bridged network (bridged network is the new name for Shared Ethernet Adapters ...), choose "yes" for vlan tagging, and put the vlan id (in my case 3331). By choosing the virtual switch, the HMC will only let you choose a Shared Ethernet Adapter configured in the virtual switch (no mistake possible). DO NOT forget to check the box "Add new virtual network to all Virtual I/O servers" to add the vlan on both sides :
    • add_vlan

    • On the next page you have to choose the Shared Ethernet Adapter on which the vlan will be added (in my case this is super easy, I ALWAYS create one Shared Ethernet Adapter per virtual switch to avoid misconfigurations and the network loops you can create by adding the same vlan id on two different Shared Ethernet Adapters) :
    • add_vlan2

    • At last, choose or create a new "Load Sharing Group". A load sharing group is one of the virtual adapters of your Shared Ethernet Adapter. In my case my Shared Ethernet Adapter was created with two virtual adapters with ids 10 and 11. On this screenshot I'm telling the HMC to add the new vlan on the adapter with id 10 on both Virtual I/O Servers. You can also create a new virtual adapter to be included in the Shared Ethernet Adapter by choosing "Create a new load sharing group" :
    • add_vlan3

    • Before applying the configuration a summary is displayed so the user can check the changes :
    • add_vlan4
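    For the record, here is a minimal sketch of the manual DLPAR way the GUI now spares you (the managed system name, the VIOS names and the trunk adapter slot 10 are assumptions matching my example; if profile synchronization is disabled do not forget to save the running configuration to the profiles afterwards) :

    # add vlan 3331 to the existing trunk adapter in slot 10, on both Virtual I/O Servers
    $ chhwres -r virtualio --rsubtype eth -m myserver -o s -p vios1 -s 10 -a "addl_vlan_ids+=3331"
    $ chhwres -r virtualio --rsubtype eth -m myserver -o s -p vios2 -s 10 -a "addl_vlan_ids+=3331"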

    Partition Templates

    You can also use templates to capture and create partitions, not just systems. I'll not give you all the details because the HMC is well documented for this part and there are no tricky things to do, just follow the GUI. One more time, HMC8 is for the noobs \o/. Here are a few screenshots of partition templates (capture and deploy) :

    create_part2
    create_part6

    A new and nice look and feel for the Hardware Management Console

    Everybody says that the HMC GUI is not very nice, even if it works great. One of the major new things of HMC 8r8.2.0 is the new GUI. In my opinion the new GUI is awesome, the design is nice and I love it. Look at the pictures below :

    hmc8
    virtual_network_diagram

    Conclusion

    The Hardware Management Console 8 is still young but offers a lot of new cool features like system and partition templates, the performance dashboard and a new GUI. The new GUI is still slow and there are a lot of bugs for the moment; my advice is to use it when you have the time, not in a rush. Learn the new HMC on your own by trying to do all the common tasks with the new GUI (there are still some impossible things to do ;-)). I can assure you that you will need more than a few hours to get familiar with all those new features. And don't forget to call your pre-sales to get a demonstration of the STG Lab Services toolkits; both the provisioning one and the LPM one are awesome. Use them !

    What is going on in this world

    This blog is not and will never be the place for political things, but with the darkest days we had in France two weeks ago with these insane and inhuman terrorist attacks I had to say a few words about it (because even if my whole life is about AIX for the moment, I'm also a human being .... if you doubt it). Since the tragic death of 17 men and women in France everybody is raising their voice to tell us (me ?) what is right and what is wrong without thinking seriously about it. Things like this terrorist attack should never happen again. I just wanted to say that I'm for liberty, not only the "liberty of expression", but just liberty. By defending this liberty we have to be very careful because, in the name of this defense, things done by our government may take away what we call liberty forever. Are the phones and the internet going to be tapped and logged in the name of liberty ? Is this liberty ? Think about it and resist.

    Configuration of a Remote Restart Capable partition

    How can we move a partition to another machine if the machine or the data-center on which the partition is hosted is totally unavailable ? This question is often asked by managers and technical people. Live Partition Mobility can’t answer this question because the source machine needs to be running to initiate the mobility. I’m sure that most of you are implementing a manual solution based on a bunch of scripts recreating the partition profile by hand, but this is hard to maintain, not fully automated and not supported by IBM. A solution to this problem is to set up your partitions as Remote Restart Capable partitions. This PowerVM feature has been available since the release of VMcontrol (the IBM Systems Director plugin). Unfortunately this powerful feature is not well documented, but it will probably become, in the next months or the next year, a must-have on your newly deployed AIX machines. One last word : with the new Power8 machines things are going to change about remote restart, the functionality will be easier to use and a lot of prerequisites are going to disappear. Just to be clear, this post has been written using Power7+ 9117-MMD machines; the only thing you can’t do with these machines (compared to Power8 ones) is changing a current partition to be remote restart capable without having to delete and recreate its profile.

    Prerequisites

    To create and use a remote restart partition on Power7+/Power8 machines you’ll need these prerequisites :

    • A PowerVM Enterprise license (the capability “PowerVM remote restart capable” must be true; be careful, there is another capability named “Remote restart capable” which was used by VMcontrol only, so double check you are looking at the right one; a quick check is sketched after this list).
    • A 780 firmware or later (all Power8 firmwares are ok, all Power7 firmwares >= 780 are ok).
    • Your source and destination machines are connected to the same Hardware Management Console; you can’t remote restart between two HMCs at the moment.
    • The minimum version of the HMC is 8r8.0.0. Check that you have the rrstartlpar command (not the rrlpar command, which was used by VMcontrol only).
    • Better than a long post, check this video (don’t laugh at me, I’m doing my best but this is one of my first videos …. hope it is good) :
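    Here is the quick capability check mentioned in the first bullet (a sketch; the exact capability strings can differ slightly depending on the HMC level, which is why I grep loosely) :

    # on the HMC: look for the PowerVM remote restart capability, not the VMcontrol-only one
    $ lssyscfg -r sys -m source-machine -F name,capabilities | grep -i remote_restart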

    What is a remote restart capable virtual machine ?

    Better than a long text to explain what it is, check the picture below and follow each number from 1 to 4 to understand what a remote restart partition is :

    remote_restart_explanation

    Create the profile of your remote restart capable partition : Power7 vs Power8

    A good reason to move to Power8 faster than you planned is that you can change a virtual machine to be remote restart capable without having to recreate the whole profile. I don’t know why, but at the time of writing this post changing a non remote restart capable lpar into a remote restart capable lpar is only available on Power8 systems. If you are using a Power7 machine (like me in all the examples below) be careful to check this option while creating the machine. Keep in mind that if you forget to check the option you will not be able to enable the remote restart capability afterwards and you will unfortunately have to remove your profile and recreate it, sad but true … :

    • Don’t forget to check the check box to allow the partition to be remote restart capable :
    • remote_restart_capable_enabled1

    • After the partition is created you can notice in the I/O tab that remote restart capable partitions are not able to own any physical I/O adapters :
    • rr2_nophys

    • You can check in the properties that the remote restart capable feature is activated :
    • remote_restart_capable_activated

    • If you try to modify an existing profile on a Power7 machine you’ll get this error message. On a Power8 machine there will be no problem :
    • # chsyscfg -r lpar -m XXXX-9117-MMD-658B2AD -p test_lpar -i remote_restart_capable=1
      An error occurred while changing the partition named test_lpar.
      The managed system does not support changing the remote restart capability of a partition. You must delete the partition and recreate it with the desired remote restart capability.
      
    • You can verify that some of your lpars are remote restart capable :
    • lssyscfg -r lpar -m source-machine -F name,remote_restart_capable
      [..]
      lpar1,1
      lpar2,1
      lpar3,1
      remote-restart,1
      [..]
      
    • On a Power7 machine the best way to enable remote restart on an already created machine is to delete the profile and recreate it by hand, adding the remote restart attribute to it :
    • Get the current partition profile :
    • $ lssyscfg -r prof -m s00ka9927558-9117-MMD-658B2AD --filter "lpar_names=temp3-b642c120-00000133"
      name=default_profile,lpar_name=temp3-b642c120-00000133,lpar_id=11,lpar_env=aixlinux,all_resources=0,min_mem=8192,desired_mem=8192,max_mem=8192,min_num_huge_pages=0,desired_num_huge_pages=0,max_num_huge_pages=0,mem_mode=ded,mem_expansion=0.0,hpt_ratio=1:128,proc_mode=shared,min_proc_units=2.0,desired_proc_units=2.0,max_proc_units=2.0,min_procs=4,desired_procs=4,max_procs=4,sharing_mode=uncap,uncap_weight=128,shared_proc_pool_id=0,shared_proc_pool_name=DefaultPool,affinity_group_id=none,io_slots=none,lpar_io_pool_ids=none,max_virtual_slots=64,"virtual_serial_adapters=0/server/1/any//any/1,1/server/1/any//any/1",virtual_scsi_adapters=3/client/2/s00ia9927560/32/0,virtual_eth_adapters=32/0/1659//0/0/vdct/facc157c3e20/all/0,virtual_eth_vsi_profiles=none,"virtual_fc_adapters=""2/client/1/s00ia9927559/32/c050760727c5007a,c050760727c5007b/0"",""4/client/1/s00ia9927559/35/c050760727c5007c,c050760727c5007d/0"",""5/client/2/s00ia9927560/34/c050760727c5007e,c050760727c5007f/0"",""6/client/2/s00ia9927560/35/c050760727c50080,c050760727c50081/0""",vtpm_adapters=none,hca_adapters=none,boot_mode=norm,conn_monitoring=1,auto_start=0,power_ctrl_lpar_ids=none,work_group_id=none,redundant_err_path_reporting=0,bsr_arrays=0,lpar_proc_compat_mode=default,electronic_err_reporting=null,sriov_eth_logical_ports=none
      
    • Remove the partition :
    • $ chsysstate -r lpar -o shutdown --immed -m source-server -n temp3-b642c120-00000133
      $ rmsyscfg -r lpar -m source-server -n temp3-b642c120-00000133
      
    • Recreate the partition with the remote restart attribute enabled :
    • mksyscfg -r lpar -m s00ka9927558-9117-MMD-658B2AD -i 'name=temp3-b642c120-00000133,profile_name=default_profile,remote_restart_capable=1,lpar_id=11,lpar_env=aixlinux,all_resources=0,min_mem=8192,desired_mem=8192,max_mem=8192,min_num_huge_pages=0,desired_num_huge_pages=0,max_num_huge_pages=0,mem_mode=ded,mem_expansion=0.0,hpt_ratio=1:128,proc_mode=shared,min_proc_units=2.0,desired_proc_units=2.0,max_proc_units=2.0,min_procs=4,desired_procs=4,max_procs=4,sharing_mode=uncap,uncap_weight=128,shared_proc_pool_name=DefaultPool,affinity_group_id=none,io_slots=none,lpar_io_pool_ids=none,max_virtual_slots=64,"virtual_serial_adapters=0/server/1/any//any/1,1/server/1/any//any/1",virtual_scsi_adapters=3/client/2/s00ia9927560/32/0,virtual_eth_adapters=32/0/1659//0/0/vdct/facc157c3e20/all/0,virtual_eth_vsi_profiles=none,"virtual_fc_adapters=""2/client/1/s00ia9927559/32/c050760727c5007a,c050760727c5007b/0"",""4/client/1/s00ia9927559/35/c050760727c5007c,c050760727c5007d/0"",""5/client/2/s00ia9927560/34/c050760727c5007e,c050760727c5007f/0"",""6/client/2/s00ia9927560/35/c050760727c50080,c050760727c50081/0""",vtpm_adapters=none,hca_adapters=none,boot_mode=norm,conn_monitoring=1,auto_start=0,power_ctrl_lpar_ids=none,work_group_id=none,redundant_err_path_reporting=0,bsr_arrays=0,lpar_proc_compat_mode=default,sriov_eth_logical_ports=none'
      

    Creating a reserved storage device

    The reserved storage device pool is used to store the configuration data of the remote restart partitions. At the time of writing this post those devices are mandatory and as far as I know they are used just to store the configuration and not the state (memory state) of the virtual machine itself (maybe in the future, who knows ?). You can’t create or boot any remote restart partition if you do not have a reserved storage device pool created, so do this before doing anything else :

    • You first have to find a bunch of devices visible from the Virtual I/O Servers of both machines (the source and destination machines used for the remote restart operation). These devices have to be the same on all the Virtual I/O Servers used for the remote restart operation. The lsmemdev command is used to find those devices :
    • vios1$ lspv | grep -iE "hdisk988|hdisk989|hdisk990"
      hdisk988         00ced82ce999d6f3                     None
      hdisk989         00ced82ce999d960                     None
      hdisk990         00ced82ce999dbec                     None
      vios2$ lspv | grep -iE "hdisk988|hdisk989|hdisk990"
      hdisk988         00ced82ce999d6f3                     None
      hdisk989         00ced82ce999d960                     None
      hdisk990         00ced82ce999dbec                     None
      vios3$ lspv | grep -iE "hdisk988|hdisk989|hdisk990"
      hdisk988         00ced82ce999d6f3                     None
      hdisk989         00ced82ce999d960                     None
      hdisk990         00ced82ce999dbec                     None
      vios4$ lspv | grep -iE "hdisk988|hdisk989|hdisk990"
      hdisk988         00ced82ce999d6f3                     None
      hdisk989         00ced82ce999d960                     None
      hdisk990         00ced82ce999dbec                     None
      
      $ lsmemdev -r avail -m source-machine -p vios1,vios2
      [..]
      device_name=hdisk988,redundant_device_name=hdisk988,size=61440,type=phys,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E5000000000000,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E5000000000000,redundant_capable=1
      device_name=hdisk989,redundant_device_name=hdisk989,size=61440,type=phys,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E6000000000000,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E6000000000000,redundant_capable=1
      device_name=hdisk990,redundant_device_name=hdisk990,size=61440,type=phys,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E7000000000000,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E7000000000000,redundant_capable=1
      [..]
      $ lsmemdev -r avail -m dest-machine -p vios3,vios4
      [..]
      device_name=hdisk988,redundant_device_name=hdisk988,size=61440,type=phys,phys_loc=U2C4E.001.DBJN914-P2-C2-T1-W500507680140F32C-L3E5000000000000,redundant_phys_loc=U2C4E.001.DBJN914-P2-C1-T1-W500507680140F32C-L3E5000000000000,redundant_capable=1
      device_name=hdisk989,redundant_device_name=hdisk989,size=61440,type=phys,phys_loc=U2C4E.001.DBJN914-P2-C2-T1-W500507680140F32C-L3E6000000000000,redundant_phys_loc=U2C4E.001.DBJN914-P2-C1-T1-W500507680140F32C-L3E6000000000000,redundant_capable=1
      device_name=hdisk990,redundant_device_name=hdisk990,size=61440,type=phys,phys_loc=U2C4E.001.DBJN914-P2-C2-T1-W500507680140F32C-L3E7000000000000,redundant_phys_loc=U2C4E.001.DBJN914-P2-C1-T1-W500507680140F32C-L3E7000000000000,redundant_capable=1
      [..]
      
    • Create the reserved storage device pool using the chhwres command on the Hardware Management Console (create it on all the machines used by the remote restart operation) :
    • $ chhwres -r rspool -m source-machine -o a -a vios_names=\"vios1,vios2\"
      $ chhwres -r rspool -m source-machine -o a -p vios1 --rsubtype rsdev --device hdisk988 --manual
      $ chhwres -r rspool -m source-machine -o a -p vios1 --rsubtype rsdev --device hdisk989 --manual
      $ chhwres -r rspool -m source-machine -o a -p vios1 --rsubtype rsdev --device hdisk990 --manual
      $ lshwres -r rspool -m source-machine --rsubtype rsdev
      device_name=hdisk988,vios_name=vios1,vios_id=1,size=61440,type=phys,state=Inactive,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E5000000000000,is_redundant=1,redundant_device_name=hdisk988,redundant_vios_name=vios2,redundant_vios_id=2,redundant_state=Inactive,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E5000000000000,lpar_id=none,device_selection_type=manual
      device_name=hdisk989,vios_name=vios1,vios_id=1,size=61440,type=phys,state=Inactive,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E6000000000000,is_redundant=1,redundant_device_name=hdisk989,redundant_vios_name=vios2,redundant_vios_id=2,redundant_state=Inactive,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E6000000000000,lpar_id=none,device_selection_type=manual
      device_name=hdisk990,vios_name=vios1,vios_id=1,size=61440,type=phys,state=Inactive,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E7000000000000,is_redundant=1,redundant_device_name=hdisk990,redundant_vios_name=vios2,redundant_vios_id=2,redundant_state=Inactive,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E7000000000000,lpar_id=none,device_selection_type=manual
      $ lshwres -r rspool -m source-machine
      "vios_names=vios1,vios2","vios_ids=1,2"
      
    • You can also create the reserved storage device pool from the Hardware Management Console GUI :
    • After selecting the Virtual I/O Server, click select devices :
    • rr_rsd_pool_p

    • Choose the maximum and minimum size to filter the devices you can select for the creation of the reserved storage device :
    • rr_rsd_pool2_p

    • Choose the disks you want to put in your reserved storage device pool (put all the devices used by remote restart partitions in manual mode; automatic devices are used by suspend/resume operations or the AMS pool. One device cannot be shared by two remote restart partitions) :
    • rr_rsd_pool_waiting_3_p
      rr_pool_create_7_p

    • You can check afterwards that your reserved storage device pool is created and is composed of three devices :
    • rr_pool_create_9
      rr_pool_create_8_p

    Select a storage device for each remote restart partition before starting it :

    After creating the reserved storage device pool you have to select, for every partition, a device from the pool. This device will be used to store the configuration data of the partition :

    • Note that you cannot start the partition if no device has been selected !
    • To select a device of the correct size you first have to calculate the needed space for every partition using the lsrsdevsize command. This size is roughly the max memory value set in the partition profile (don’t ask me why); a small helper loop to size all your partitions in one shot is sketched after the output below :
    • $ lsrsdevsize -m source-machine -p temp3-b642c120-00000133
      size=8498
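    And here is the small helper loop mentioned above (a hypothetical ksh sketch run from the NIM server or your workstation, not on the HMC; hscroot@myhmc and source-machine are assumptions) :

    #!/usr/bin/ksh
    # print the reserved storage device size needed by every remote restart capable partition
    HMC=hscroot@myhmc
    MACHINE=source-machine
    ssh $HMC "lssyscfg -r lpar -m $MACHINE -F name,remote_restart_capable" | while IFS=, read name rr; do
      [ "$rr" = "1" ] || continue
      printf "%-40s " "$name"
      ssh $HMC "lsrsdevsize -m $MACHINE -p $name"
    done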
      
    • Select the device you want to assign to your machine (in my case there was already a device selected for this machine) :
    • rr_rsd_pool_assign_p

    • Then select the machine to which you want to assign the device :
    • rr_rsd_pool_assign2_p

    • Or do this in command line :
    • $ chsyscfg -r lpar -m source-machine -i "name=temp3-b642c120-00000133,primary_rs_vios_name=vios1,secondary_rs_vios_name=vios2,rs_device_name=hdisk988"
      $ lssyscfg -r lpar -m source-machine --filter "lpar_names=temp3-b642c120-00000133" -F primary_rs_vios_name,secondary_rs_vios_name,curr_rs_vios_name
      vios1,vios2,vios1
      $ lshwres -r rspool -m source-machine --rsubtype rsdev
      device_name=hdisk988,vios_name=vios1,vios_id=1,size=61440,type=phys,state=Active,phys_loc=U2C4E.001.DBJN916-P2-C1-T1-W500507680140F32C-L3E5000000000000,is_redundant=1,redundant_device_name=hdisk988,redundant_vios_name=vios2,redundant_vios_id=2,redundant_state=Active,redundant_phys_loc=U2C4E.001.DBJN916-P2-C2-T1-W500507680140F32C-L3E5000000000000,lpar_name=temp3-b642c120-00000133,lpar_id=11,device_selection_type=manual
      

    Launch the remote restart operation

    All the remote restart operations are launched from the Hardware Management Console with the rrstartlpar command. At the time of writing this post there is no GUI function to remote restart a machine and you can only do it from the command line :

    Validation

    Just like with a Live Partition Mobility move, you can validate a remote restart operation before running it. You can only perform the remote restart operation if the machine hosting the remote restart partition is shut down or in error, so the validation is very useful and mandatory to check that your remote restart machines are well configured without having to stop the source machine :

    $ rrstartlpar -o validate -m source-machine -t dest-machine -p rrlpar
    $ rrstartlpar -o validate -m source-machine -t dest-machine -p rrlpar -d 5
    $ rrstartlpar -o validate -m source-machine -t dest-machine -p rrlpar --redundantvios 2 -d 5 -v
    

    Execution

    As I said before, the remote restart operation can only be performed if the source machine is in a particular state. The states that allow a remote restart operation are :

    • Power Off.
    • Error.
    • Error – Dump in progress state.

    So the only way to test a remote restart operation today is to shut down your source machine :

    • Shutdown the source machine :
    • step1

      $ chsysstate -m source-machine -r sys  -o off --immed
      

      rr_step2_mod

    • You can next check on the Hardware Management Console that the Virtual I/O Servers and the remote restart lpar are in the “Not available” state. You’re now ready to remote restart the lpar (if the partition id is already used on the destination machine the next available one will be used). You have to wait a little before remote restarting the partition, check the output below (a quick check of the restarted partition is sketched after the screenshots) :
    • $ rrstartlpar -o restart -m source-machine -t dest-machine -p rrlpar -d 5 -v
      HSCLA9CE The managed system is not in a valid state to support partition remote restart operations.
      $ rrstartlpar -o restart -m source-machine -t dest-machine -p rrlpar -d 5 -v
      Warnings:
      HSCLA32F The specified partition ID is no longer valid. The next available partition ID will be used.
      

      step3
      rr_step4_mod
      step5
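    As said above, here is the quick check of the restarted partition on the destination machine (a sketch; note the partition may come up with a new lpar_id if the original one was already taken) :

    $ lssyscfg -r lpar -m dest-machine --filter "lpar_names=rrlpar" -F name,state,lpar_id
    # the partition should be in the Running state on dest-machine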

    Cleanup

    When the source machine is ready to be up (after an outage for instance) just boot the machine and its Virtual I/O Servers. After the machine is up you can notice that the rrlpar profile is still there, and it can be a huge problem if somebody tries to boot this partition because it is already running on the other machine after the remote restart operation. To prevent such an error you have to clean up your remote restart partition by using the rrstartlpar command again. Be careful not to check the option booting the partitions when the machine is started :

    • Restart the source machine and its Virtual I/O Servers :
    • $ chsysstate -m source-machine -r sys -o on
      $ chsysstate -r lpar -m source-machine -n vios1 -o on -f default_profile
      $ chsysstate -r lpar -m source-machine -n vios2 -o on -f default_profile
      

      rr_step6_mod

    • Perform the cleanup operation to remove the profile of the remote restart partition (if you want to LPM your machine back later you have to keep the device in the reserved storage device pool; if you do not use the --retaindev option the device will be automatically removed from the pool) :
    • $ rrstartlpar -o cleanup -m source-machine -p rrlpar --retaindev -d 5 -v --force
      

      rr_step7_mod

    Refresh the partition and profile data

    During my tests I encountered a problem: the configuration was not correctly synced between the device used in the reserved storage device pool and the current partition profile. I had to use a command named refdev (for refresh device) to synchronize the partition and profile data to the storage device.

    $ refdev -m source-machine -p temp3-b642c120-00000133 -v
    

    What’s in the reserved storage device ?

    I’m a curious guy. After playing with remote restart I asked myself a question: what is really stored in the reserved storage device assigned to the remote restart partition ? Looking at the documentation on the internet did not answer my question so I had to look at it on my own. By “dding” the reserved storage device assigned to a partition I realized that the profile is stored in xml format (a small trick to decode the base64 blobs of this xml is sketched after the dump below). Maybe this format is the same as the one used by the HMC 8 template library. For the moment, and during my tests on Power7+ machines, the state of the memory of the partition is not transferred to the destination machine, maybe because I had to shut down the whole source machine to test. Maybe the memory state of the machine is transferred to the destination machine if this one is in error state or is dumping. I had no chance to test this :

    root@vios1:/home/padmin# dd if=/dev/hdisk17 of=/tmp/hdisk17.out bs=1024 count=10
    10+0 records in
    10+0 records out
    root@vios1:/home/padmin# more hdisk17.out
    [..]
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    BwEAAAAAAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACgDIAZAAAQAEAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" Profile="H4sIAAAAA
    98VjxbxEAhNaZEqpEptPS/iMJO4cTJBdHVj38zcYvu619fTGQlQVmxY0AUICSH4A5XYorJgA1I3sGMBCx5Vs4RNd2zgXI89tpNMxslIiRzPufec853zfefk/t/osMfRBYPZRbpuF9ueUTQsShxR1NSl9dvEEPPMMgnfvPnVk
    a2ixplLuOiVCHaUKn/yYMv/PY/ydTRuv016TbgOzdVv4w6+KM0vyheMX62jgq0L7hsCXtxBH6J814WoZqRh/96+4a+ff3Br8+o3uTE0pqJZA7vYoKKnOgYnNoSsoiPECp7KzHfELTQV/lnBAgt0/Fbfs4Wd1sV+ble7Lup/c
    be0LQj01FJpoVpecaNP15MhHxpcJP8al6b7fg8hxCnPY68t8LpFjn83/eKFhcffjqF8DRUshs0almioaFK0OfHaUKCue/1GcN0ndyfg9/fwsyzQ6SblellXK6RDDaIIwem6L4iXCiCfCuBZxltFz6G4eHed2EWD2sVVx6Mth
    eEOtnzSjQoVwLbo2+uEf3T/s2emPv3z4xA16eD0AC6oRN3FXNnYoA6U7y3OfFc1g5hOIiTQsVUHSusSc43QVluEX2wKdKJZq4q2YmJXEF7hhuqYJA0+inNx3YTDab2m6T7vEGpBlAaJnU0qjWofTkj+uT2Tv3Rl69prZx/9s
    thQTBMK42WK7XSzrizqFhPL5E6FeHGVhnSJQLlKKreab1l6z9MwF0C/jTi3OfmKCsoczcJGwITgy+f74Z4Lu2OU70SDyIdXg1+JAApBWZoAbLaEj4InyonZIDbjvZGwv3H5+tb7C5tPThQA9oUdsCN0HsnWoLxWLjPHAdJSp
    Ja45pBarVb3JDyUJOn3aemXcIqtUfgPi3wCuiw76tMh6mVtNVDHOB+BxqEUDWZGtPgPrFc9oBgBhhJzEdsEVI9zC1gr0JTexhwgThzIwYEG7lLbt3dcPyHQLKQqfGzVsSNzVSvenkDJU/lUoiXGRNrdxLy2soyhtcNX47INZ
    nHKOCjYfsoeR3kpm58GdYDVxipIZXDgSmhfCDCPlKZm4dZoVFORzEX0J6CLvK4py6N7Pz94yiXlPBAArd3zqIEtjXFZ4izJzQ44sCv7hh3bTnY5TbKdnOtHGtatTjrEynTuWFNXV3ouaUKIIKfDgE5XrrpWb/SHWyWCbXMM5
    DkaHNzXVJws6csK57jnpToLopiQLZdgHJJh9wm+M+wbof7GzSRJBYvAAaV0RvE8ZlA5yxSob4fAiJiNNwwQAwu2y5/O881fvvz3HxgK70ZDwc1FS8JezBgKR0e/S4XR3ta8OwmdS56akXJITAmYBpElF5lZOdlXuO+8N0opU
    m0HeJTw76oiD8PS9QfRECUYqk0B1KGkZ+pRGQPUhPFEb12XIoe7u4WXuwdVqTAnZT8gyYrvAPlL/sYG4RkDmAx5HFZpFIVnAz9Lrlyh9tFIc4nZAColOLNGdFRKmE8GJd5zZx++zMiAoTOWNrJvBjODNo1UOGuXngzcHWjrn
    LgmkxjBXLj+6Fjy1DHFF0zV6lVH/p+VYO6pbZzYD9/ORFLouy6MwvlGuRz8Qz10ugawprAdtJ4GxWAOtmQjZXJ+Lg58T/fDy4K74bYWr9CyLIVdQiplHPLbjinZRu4BZuAENE6jxTP2zNkBVgfiWiFcv7f3xYjFqxs/7vb0P
     lpar_name="rrlpar" lpar_uuid="0D80582A44F64B43B2981D632743A6C8" lpar_uuid_gen_method="0"><SourceLparConfig additional_mac_addr_bases="" ame_capability="0" auto_start_e
    rmal" conn_monitoring="0" desired_proc_compat_mode="default" effective_proc_compat_mode="POWER7" hardware_mem_encryption="10" hardware_mem_expansion="5" keylock="normal
    "4" lpar_placement="0" lpar_power_mgmt="0" lpar_rr_dev_desc="	<cpage>		<P>1</P>
    		<S>51</S>
    		<VIOS_descri
    00010E0000000000003FB04214503IBMfcp</VIOS_descriptor>
    	</cpage>
    " lpar_rr_status="6" lpar_tcc_slot_id="65535" lpar_vtpm_status="65535" mac_addres
    x_virtual_slots="10" partition_type="rpa" processor_compatibility_mode="default" processor_mode="shared" shared_pool_util_authority="0" sharing_mode="uncapped" slb_mig_
    ofile="1" time_reference="0" uncapped_weight="128"><VirtualScsiAdapter is_required="false" remote_lpar_id="2" src_vios_slot_number="4" virtual_slot_number="4"/><Virtual
    "false" remote_lpar_id="1" src_vios_slot_number="3" virtual_slot_number="3"/><Processors desired="4" max="8" min="1"/><VirtualFibreChannelAdapter/><VirtualEthernetAdapt
    " filter_mac_address="" is_ieee="0" is_required="false" mac_address="82776CE63602" mac_address_flags="0" qos_priority="0" qos_priority_control="false" virtual_slot_numb
    witch_id="1" vswitch_name="vdct"/><Memory desired="8192" hpt_ratio="7" max="16384" memory_mode="ded" min="256" mode="ded" psp_usage="3"><IoEntitledMem usage="auto"/></M
     desired="200" max="400" min="10"/></SourceLparConfig></SourceLparInfo></SourceInfo><FileInfo modification="0" version="1"/><SriovEthMappings><SriovEthVFInfo/></SriovEt
    VirtualFibreChannelAdapterInfo/></VfcMappings><ProcPools capacity="0"/><TargetInfo concurr_mig_in_prog="-1" max_msp_concur_mig_limit_dynamic="-1" max_msp_concur_mig_lim
    concur_mig_limit="-1" mpio_override="1" state="nonexitent" uuid_override="1" vlan_override="1" vsi_override="1"><ManagerInfo/><TargetMspInfo port_number="-1"/><TargetLp
    ar_name="rrlpar" processor_pool_id="-1" target_profile_name="mig3_9117_MMD_10C94CC141109224549"><SharedMemoryConfig pool_id="-1" primary_paging_vios_id="0"/></TargetLpa
    argetInfo><VlanMappings><VlanInfo description="VkVSU0lPTj0xClZJT19UWVBFPVZFVEgKVkxBTl9JRD0zMzMxClZTV0lUQ0g9dmRjdApCUklER0VEPXllcwo=" vlan_id="3331" vswitch_mode="VEB" v
    ibleTargetVios/></VlanInfo></VlanMappings><MspMappings><MspInfo/></MspMappings><VscsiMappings><VirtualScsiAdapterInfo description="PHYtc2NzaS1ob3N0PgoJPGdlbmVyYWxJbmZvP
    mVyc2lvbj4KCQk8bWF4VHJhbmZlcj4yNjIxNDQ8L21heFRyYW5mZXI+CgkJPGNsdXN0ZXJJRD4wPC9jbHVzdGVySUQ+CgkJPHNyY0RyY05hbWU+VTkxMTcuTU1ELjEwQzk0Q0MtVjItQzQ8L3NyY0RyY05hbWU+CgkJPG1pb
    U9TcGF0Y2g+CgkJPG1pblZJT1Njb21wYXRhYmlsaXR5PjE8L21pblZJT1Njb21wYXRhYmlsaXR5PgoJCTxlZmZlY3RpdmVWSU9TY29tcGF0YWJpbGl0eT4xPC9lZmZlY3RpdmVWSU9TY29tcGF0YWJpbGl0eT4KCTwvZ2VuZ
    TxwYXJ0aXRpb25JRD4yPC9wYXJ0aXRpb25JRD4KCTwvcmFzPgoJPHZpcnREZXY+CgkJPHZEZXZOYW1lPnJybHBhcl9yb290dmc8L3ZEZXZOYW1lPgoJCTx2TFVOPgoJCQk8TFVBPjB4ODEwMDAwMDAwMDAwMDAwMDwvTFVBP
    FVOU3RhdGU+CgkJCTxjbGllbnRSZXNlcnZlPm5vPC9jbGllbnRSZXNlcnZlPgoJCQk8QUlYPgoJCQkJPHR5cGU+dmRhc2Q8L3R5cGU+CgkJCQk8Y29ubldoZXJlPjE8L2Nvbm5XaGVyZT4KCQkJPC9BSVg+CgkJPC92TFVOP
    gkJCTxyZXNlcnZlVHlwZT5OT19SRVNFUlZFPC9yZXNlcnZlVHlwZT4KCQkJPGJkZXZUeXBlPjE8L2JkZXZUeXBlPgoJCQk8cmVzdG9yZTUyMD50cnVlPC9yZXN0b3JlNTIwPgoJCQk8QUlYPgoJCQkJPHVkaWQ+MzMyMTM2M
    DAwMDAwMDAwMDNGQTA0MjE0NTAzSUJNZmNwPC91ZGlkPgoJCQkJPHR5cGU+VURJRDwvdHlwZT4KCQkJPC9BSVg+CgkJPC9ibG9ja1N0b3JhZ2U+Cgk8L3ZpcnREZXY+Cjwvdi1zY3NpLWhvc3Q+" slot_number="4" sou
    _slot_number="4"><PossibleTargetVios/></VirtualScsiAdapterInfo><VirtualScsiAdapterInfo description="PHYtc2NzaS1ob3N0PgoJPGdlbmVyYWxJbmZvPgoJCTx2ZXJzaW9uPjIuNDwvdmVyc2lv
    NjIxNDQ8L21heFRyYW5mZXI+CgkJPGNsdXN0ZXJJRD4wPC9jbHVzdGVySUQ+CgkJPHNyY0RyY05hbWU+VTkxMTcuTU1ELjEwQzk0Q0MtVjEtQzM8L3NyY0RyY05hbWU+CgkJPG1pblZJT1NwYXRjaD4wPC9taW5WSU9TcGF0
    YXRhYmlsaXR5PjE8L21pblZJT1Njb21wYXRhYmlsaXR5PgoJCTxlZmZlY3RpdmVWSU9TY29tcGF0YWJpbGl0eT4xPC9lZmZlY3RpdmVWSU9TY29tcGF0YWJpbGl0eT4KCTwvZ2VuZXJhbEluZm8+Cgk8cmFzPgoJCTxwYXJ0
    b25JRD4KCTwvcmFzPgoJPHZpcnREZXY+CgkJPHZEZXZOYW1lPnJybHBhcl9yb290dmc8L3ZEZXZOYW1lPgoJCTx2TFVOPgoJCQk8TFVBPjB4ODEwMDAwMDAwMDAwMDAwMDwvTFVBPgoJCQk8TFVOU3RhdGU+MDwvTFVOU3Rh
    cnZlPm5vPC9jbGllbnRSZXNlcnZlPgoJCQk8QUlYPgoJCQkJPHR5cGU+dmRhc2Q8L3R5cGU+CgkJCQk8Y29ubldoZXJlPjE8L2Nvbm5XaGVyZT4KCQkJPC9BSVg+CgkJPC92TFVOPgoJCTxibG9ja1N0b3JhZ2U+CgkJCTxy
    UlZFPC9yZXNlcnZlVHlwZT4KCQkJPGJkZXZUeXBlPjE8L2JkZXZUeXBlPgoJCQk8cmVzdG9yZTUyMD50cnVlPC9yZXN0b3JlNTIwPgoJCQk8QUlYPgoJCQkJPHVkaWQ+MzMyMTM2MDA1MDc2ODBDODAwMDEwRTAwMDAwMDAw
    ZmNwPC91ZGlkPgoJCQkJPHR5cGU+VURJRDwvdHlwZT4KCQkJPC9BSVg+CgkJPC9ibG9ja1N0b3JhZ2U+Cgk8L3ZpcnREZXY+Cjwvdi1zY3NpLWhvc3Q+" slot_number="3" source_vios_id="1" src_vios_slot_n
    tVios/></VirtualScsiAdapterInfo></VscsiMappings><SharedMemPools find_devices="false" max_mem="16384"><SharedMemPool/></SharedMemPools><MigrationSession optional_capabil
    les" recover="na" required_capabilities="veth_switch,hmc_compatibilty,proc_compat_modes,remote_restart_capability,lpar_uuid" stream_id="9988047026654530562" stream_id_p
    on>
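    A last little trick mentioned above: the description attributes stored in this xml are base64 encoded, and decoding one of them (the VlanInfo blob of the dump above) shows that it is just plain key=value text describing the virtual adapter. A sketch, assuming openssl is available on the Virtual I/O Server (it normally is) :

    root@vios1:/home/padmin# echo "VkVSU0lPTj0xClZJT19UWVBFPVZFVEgKVkxBTl9JRD0zMzMxClZTV0lUQ0g9dmRjdApCUklER0VEPXllcwo=" | openssl enc -d -base64 -A
    VERSION=1
    VIO_TYPE=VETH
    VLAN_ID=3331
    VSWITCH=vdct
    BRIDGED=yes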
    

    About the state of the source machine ?

    You have to know this before using remote restart : at the time of writing this post the remote restart feature is still young and has to evolve before being usable in real life. I’m saying this because the FSP of the source machine has to be up to perform a remote restart operation. To be clear, the remote restart feature does not answer the total loss of one of your sites. It’s just useful to restart the partitions of a system with a problem that is not an FSP problem (a problem with memory DIMMs or with CPUs for instance). It can be used in your DRP exercises but not if your whole site is totally down, which is (in my humble opinion) one of the key needs that remote restart has to answer. Don’t be afraid, read the conclusion ….

    Conclusion

    This post has been written using Power7+ machines; my goal was to give you an example of remote restart operations : a summary of what it is, how it works, and where and when to use it. I’m pretty sure that a lot of things are going to change about remote restart. First, on Power8 machines you don’t have to recreate the partitions to make them remote restart aware. Second, I know that changes are on the way for remote restart on Power8 machines, especially about reserved storage devices and about the state of the source machine. I’m sure this feature has a bright future and, used with PowerVC, it can be a killer feature. Hope to see all these changes in a near future ;-). Once again I hope this post helps you.

    An overview of the IBM Technical Collaboration Council for PowerSystems 2014

    Eight to ten months ago I decided to change my job, for better or for worse. Talking about the better, I had the chance to be enrolled in the Technical Collaboration Council for Power Systems (I’ll not talk about the worse … this could take me hours to explain). The Technical Collaboration Council is not well known in Europe, and not well known for Power Systems, and I think writing this blog post may give a better worldwide visibility to the Technical Collaboration Council. It deserves a blog post :-).

    To be clear, and to avoid any problem with participating in the meeting, you first have to sign a Non Disclosure Agreement. A lot of presentations are still IBM confidential. This said, I have signed this NDA, so I cannot talk about the content of the meetings. Sure, there are a lot of things to say but I have to keep them for myself … :-)

    3
    But what exactly is the Technical Collaboration Council ? This annual meeting takes place in Austin, Texas, at the home of Power Systems :-). It lasts one week, from Monday to Friday. The Technical Collaboration Council invites the biggest IBM customers from all over the world. For a guy like me, so involved in this community, coming here was a great opportunity and a way to spread the word about my blog and my participation in the Power community. In fact we were just a few people coming from Europe and a lot of US guys. The TCC looks like an IBM Technical University, but better … because you can participate during the meetings and answer a lot of surveys about the shape of things to come for Power Systems :-) :-) :-) .

    Here is what you can see and what you can do when you come to the Power TCC. And for me it’s exciting !!! :

    • Meetings about trends and directions for Power Systems (overview of new products (hardware and software), new functionalities and new releases coming in the next year).
    • Direct access to the IBM lab. You can go and ask the lab about a particular feature you need, or about something you didn’t understand. For instance I had a quick meeting with the PowerVC guys (not only guys, sorry Christine) about my needs for the next few months. Another one : I had the chance to talk to the head manager of AIX and ask him about a few things I’d like to see in the next version of AIX (Who said an installation over http ?).
    • Big “names” of Power are here, they share and talk : Doug Ballog, Satya Sharma. Seeing them is always impressive !
    • Interaction and sharing with other customers : like me, a lot of customers were here at the TCC, sharing about how they do things and how they use their Power Systems; it is ALWAYS useful. I had a few interesting conversations with guys from another big bank with the same constraints as mine.
    • You can say what you think. IBM is waiting for your feedback .. positive or negative.
    • Demos and hands-on with new products and new functionalities (remember the IBM Provisioning Toolkit for PowerVM & the cool LPM scheduler presented by the STG Lab Services guys).
    • The possibility to enroll in beta programs … (in my case the HMC one).
    • You can finally meet in real life guys you’ve had on the phone or by mail for a couple of years. It’s always useful !
    • And of course a lot of fun :-)

    I had the chance to talk about my experience with PowerVC in front of all the TCC members. It was very stressful for a French guy like me … and I just had a few minutes to prepare … Hope it was good; anyway it was a great experience. You can do things like this at the TCC … you think PowerVC is good ? Just go on stage and give a 15 minute talk about it … :-)

    4

    The Technical Collaboration Council is not just about technical stuff and work. You can also have a lot of fun talking to IBM guys and customers. There are a lot of moments when people can eat and drink together and the possibility to share about everything is always there. And if I had to remember only one thing about the Technical Collaboration Council it would be that it is a great moment of sharing with others, and not just about work and Power Systems. This said, I wanted to thank IBM and a lot of people for their kindness, their availability and all the fun they gave us during this week. So thanks to : Philippe H., Patrice P., Satya S., Jay K., Carl B., Eddy S., Natalie M, Christine W, François L, Rosa D … and sorry for those I’ve forgotten :-). And never forget that Power is performance redefined.

    Ok; one last word. Maybe some of the customers who were here this year are going to read this post, and I encourage you to react to it and to post comments. Red Hat’s motto is “We grow when we share”, but at such events I am (and we are) growing when IBM is sharing. People may think that IBM does not share … I disagree :-). They are doing it and they are doing it well ! And never forget that the Power community is still alive and ready to rock ! So please raise your voice about it. In such times, times of media and social networks, we have to prove to IBM and to the world that this community is growing, is great, and is ready to share.
    One last thing: the way of working in the US seems to be very different from the way we do it in Europe … it could be cool to move to the US.