Continuous integration for your Chef AIX cookbooks (using PowerVC, Jenkins, test-kitchen and gitlab)

My journey to integrate Chef on AIX is still going on, and I’m working more than ever on these topics. I know that using such tools is not something widely adopted by AIX customers. But what I also know is that, whatever happens, you will use an automation tool in the near or distant future. These tools are so widely used in the Linux world that you just can’t ignore them. The way you were managing your AIX ten years ago is not the same as what you are doing today, and what you do today will not be what you’ll do in the future. The AIX world needs a facelift to survive. A huge step has already been taken (and is still ongoing) with PowerVC, thanks to a fantastic team of very smart people at IBM (@amarteyp, @drewthorst, @jwcroppe, and everyone else on this team!). The AIX world is now compatible with OpenStack, and with this other things are coming … such as automation. When all of these things are ready, we will be able to offer something on AIX comparable to what exists on Linux. OpenStack and automation are the first bricks of what we call today “devops” (to be more specific, it’s the ops part of the devops word).

Today I will focus on how to manage your AIX machines using Chef. By “how” I mean the best practices and the infrastructure you need to build to start using Chef on AIX. If you remember my session about Chef on AIX at the IBM Technical University in Cannes, I was saying that by using Chef your infrastructure becomes testable, repeatable and versionable. This blog post focuses on how to achieve that. To test your AIX Chef cookbooks you will need to understand what the test kitchen is (we will use the test kitchen to drive PowerVC to build virtual machines on the fly and run the Chef recipes on them). To repeat these tests over and over without having to do anything by hand (code review, checking that your cookbook converges), we will use Jenkins. Finally, to version your cookbook development we will use gitlab.

To better understand why I’m doing such a thing there is nothing better than a concrete example. My goal is to do all my AIX post-installation tasks using Chef (motd configuration, dns, device attributes, fileset installation, enabling services … everything that you are doing today with Korn shell scripts). Who has never experienced someone changing one of these scripts (most of the time without warning the other members of the team), resulting in a syntax error and then in an outage for all your new builds? You can live with this in a small team creating one machine per month, but it is inconceivable in an environment driven by PowerVC where sysadmins are not doing anything “by hand”. In such an environment, if someone makes this kind of error all the new builds fail … even worse, you’ll probably not be aware of it until someone connecting to the machine says that there is an error (most of the time the final customer). With continuous integration your AIX build is tested at every change, all the changes are stored in a git repository and, even better, you cannot put a change in production without passing all these tests. Working this way is almost mandatory for people using PowerVC today, but people who are not can still do the same thing. You’ll end up with a clean and proper AIX build (post-install), and this kind of error will be caught before it reaches production, so I highly encourage you to do this even if you are not adopting the OpenStack way or even if you don’t see the benefits today. In the future this effort will pay off. Trust me.

The test-kitchen

What is the kitchen

The test-kitchen is a tool that allows you to run your AIX Chef cookbooks and recipes quickly, without any manual task. If you don’t use the test kitchen while developing your recipes, you’ll have many things to do by hand: build a virtual machine, install the chef client, copy the cookbook and the recipes, run them, and check that everything is in the state you want. Imagine doing that on different AIX versions (6.1, 7.1, 7.2) every time you change something in your post-installation recipes (I was doing that before, and I can assure you that creating and destroying machines over and over is just a waste of time). The test kitchen is here to do the job for you. It will build the machine (using the PowerVC kitchen driver), install the chef-client (using an omnibus server), copy the content of your cookbook (the files), run a bunch of recipes (described in what we call suites) and then test the result (using bats or serverspec). You can configure your kitchen to test different kinds of images (6.1, 7.1, 7.2) and different suites (cookbooks, recipes) depending on the environment you want to test. By default the test kitchen uses a tool called Vagrant to build your VM. Obviously Vagrant is not able to build an AIX machine, which is why we will use a modified version of the kitchen-openstack driver (modified by myself) called kitchen-powervc to build the virtual machines.

Installing the kitchen and the PowerVC driver

If you have access to an enterprise proxy you can download and install the gem files directly from your host (in my case a Linux on Power machine … so Linux on Power is working great for this).

  • Install the test kitchen :
  • # gem install --http-proxy http://bcreau:mypasswd@proxy:8080 test-kitchen
    Successfully installed test-kitchen-1.7.2
    Parsing documentation for test-kitchen-1.7.2
    1 gem installed
    
  • Install kitchen-powervc :
  • # gem install --http-proxy http://bcreau:mypasswd@proxy:8080 kitchen-powervc
    Successfully installed kitchen-powervc-0.1.0
    Parsing documentation for kitchen-powervc-0.1.0
    1 gem installed
    
  • Install kitchen-openstack :
  • # gem install --http-proxy http://bcreau:mypasswd@proxy:8080 kitchen-openstack
    Successfully installed kitchen-openstack-3.0.0
    Fetching: fog-core-1.38.0.gem (100%)
    Successfully installed fog-core-1.38.0
    Fetching: fuzzyurl-0.8.0.gem (100%)
    Successfully installed fuzzyurl-0.8.0
    Parsing documentation for kitchen-openstack-3.0.0
    Installing ri documentation for kitchen-openstack-3.0.0
    Parsing documentation for fog-core-1.38.0
    Installing ri documentation for fog-core-1.38.0
    Parsing documentation for fuzzyurl-0.8.0
    Installing ri documentation for fuzzyurl-0.8.0
    3 gems installed
    

If you don’t have access to an enterprise proxy you can still download the gems from home and install them on your work machine:

# gem install test-kitchen kitchen-powervc kitchen-openstack -i repo --no-ri --no-rdoc
# # copy the files (repo directory) on your destination machine
# gem install *.gem

Setup the kitchen (.kitchen.yml file)

The kitchen configuration file is .kitchen.yml; this is the file the kitchen looks at when you run the kitchen command. You have to put it at the root of the chef-repo, where the cookbook directory is, because the kitchen copies the files from the cookbook to the test machine. This file is divided into several sections:

  • The driver section. In this section you configure how the virtual machines are created, in our case how to connect to PowerVC (credentials, region). You also specify which image to use (PowerVC images), which flavor (PowerVC template) and which network will be used at the VM creation. Please note that you can put some driver_config in the platform section to tell which image or which ip you want to use for each specific platform:
    • name: the name of the driver (here powervc).
    • openstack*: the PowerVC url, user, password, region, domain.
    • image_ref: the name of the image (we will put this in driver_config in the platform section).
    • flavor_ref: the name of the PowerVC template used at the VM creation.
    • fixed_ip: the ip_address used for the virtual machine creation.
    • server_name_prefix: each vm created by the kitchen will be prefixed by this parameter.
    • network_ref: the name of the PowerVC vlan to be used at the machine creation.
    • public_key_path: The kitchen needs to connect to the machine with ssh, you need to provide the public key used.
    • private_key_path: Same but for the private key.
    • username: The ssh username (we will use root, but you can use another user and then tell the kitchen to use sudo)
    • user_data: the activation input used by cloud-init (it’s the PowerVC activation input). We put the public key in it to be sure we can access the machine without a password.
    • driver:
        name: powervc
        server_wait: 100
        openstack_username: "root"
        openstack_api_key: "root"
        openstack_auth_url: "https://mypowervc:5000/v3/auth/tokens"
        openstack_region: "RegionOne"
        openstack_project_domain: "Default"
        openstack_user_domain: "Default"
        openstack_project_name: "ibm-default"
        flavor_ref: "mytemplate"
        server_name_prefix: "chefkitchen"
        network_ref: "vlan666"
        public_key_path: "/home/chef/.ssh/id_dsa.pub"
        private_key_path: "/home/chef/.ssh/id_dsa"
        username: "root"
        user_data: userdata.txt
      
      # cat userdata.txt
      #cloud-config
      ssh_authorized_keys:
        - ssh-dss AAAAB3NzaC1kc3MAAACBAIVZx6Pic+FyUisoNrm6Znxd48DQ/YGNRgsed+fc+yL1BVESyTU5kqnupS8GXG2I0VPMWN7ZiPnbT1Fe2D[..]
      
  • The provisioner section: this section is used to specify whether you want to use chef-zero or chef-solo as a provisioner. You can also specify an omnibus url (used to download and install the chef-client at the machine creation time). In my case the omnibus url is a link to an http server “serving” a script (install.sh) that installs the chef client fileset for AIX (more details later in the blog post). I’m also setting “sudo” to false as I’ll connect with the root user:
  • provisioner:
      name: chef_solo
      chef_omnibus_url: "http://myomnibusserver:8080/chefclient/install.sh"
      sudo: false
    
  • The platform section: this section describes each platform that the test-kitchen can create. I’m putting here the image_ref and the fixed_ip for each platform (AIX 6.1, AIX 7.1, AIX 7.2):
  • platforms:
      - name: aix72
        driver_config:
          image_ref: "kitchen-aix72"
          fixed_ip: "10.66.33.234"
      - name: aix71
        driver_config:
          image_ref: "kitchen-aix71"
          fixed_ip: "10.66.33.235"
      - name: aix61
        driver_config:
          image_ref: "kitchen-aix61"
          fixed_ip: "10.66.33.236"
    
  • The suite section: this section describes which cookbook and which recipes you want to run on the machines created by the test-kitchen. To keep this example simple I’m just running two recipes: the first one, called root_authorized_keys, creates the /root directory, changes the home directory of root and puts a public key in the .ssh directory; the second one is called gem_source (we will see later in the post why I’m also calling this recipe):
  • suites:
      - name: aixcookbook
        run_list:
        - recipe[aix::root_authorized_keys]
        - recipe[aix::gem_source]
        attributes: { gem_source: { add_urls: [ "http://10.14.66.100:8808" ], delete_urls: [ "https://rubygems.org/" ] } }
    
  • The busser section: this section describes how to run your tests (more details later in the post ;-) ):
  • busser:
      sudo: false
    

After configuring the kitchen you can check the yml file is ok by listing what’s configured on the kitchen:

# kitchen list
Instance           Driver   Provisioner  Verifier  Transport  Last Action
aixcookbook-aix72  Powervc  ChefSolo     Busser    Ssh        
aixcookbook-aix71  Powervc  ChefSolo     Busser    Ssh        
aixcookbook-aix61  Powervc  ChefSolo     Busser    Ssh        
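
To see where these three instance names come from, here is a minimal Ruby sketch (it is not part of test-kitchen) that crosses the suites and the platforms of a kitchen configuration the way the “kitchen list” output names them: suite name, a dash, platform name. The inline YAML is a trimmed copy of the sections shown earlier, and the instance_names helper is a hypothetical name used for illustration only.

```ruby
require 'yaml'

# Trimmed copy of the platforms and suites sections of the .kitchen.yml above.
kitchen_yaml = <<~YAML
  platforms:
    - name: aix72
    - name: aix71
    - name: aix61
  suites:
    - name: aixcookbook
      run_list:
        - recipe[aix::root_authorized_keys]
        - recipe[aix::gem_source]
YAML

# Hypothetical helper: one instance per suite/platform pair,
# named "<suite>-<platform>".
def instance_names(config)
  config['suites'].flat_map do |suite|
    config['platforms'].map { |platform| "#{suite['name']}-#{platform['name']}" }
  end
end

puts instance_names(YAML.safe_load(kitchen_yaml))
```

Adding a second suite to the file would double the number of instances, which is worth keeping in mind when each instance is a real virtual machine on PowerVC.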

Anatomy of a kitchen run

A kitchen run is divided into five steps. First we create a virtual machine (create), then we install the chef-client (using an omnibus url) and run some recipes (converge), then we install the testing tools on the virtual machine, serverspec in my case (setup), and we run the tests (verify). Finally, if everything went well, we delete the virtual machine (destroy). Instead of running all these steps one by one you can use the “test” action, which does destroy, create, converge, setup, verify and destroy in one single “pass”. Let’s look at each step in detail:

  • Create: this creates the virtual machine using PowerVC. If you use the “fixed_ip” option in the .kitchen.yml file, this ip is chosen at the machine creation time. If you prefer to pick an ip from the network (in the pool), don’t set “fixed_ip”. You’ll see the details in the picture below. At the end you can test the connectivity (the ssh transport) to the machine using “kitchen login”. The ssh public key was automatically added by cloud-init at the machine creation time, using the userdata.txt file. After the machine is created you can use the “kitchen list” command to check that it was successfully created:
# kitchen create

[Screenshots: “kitchen create” output and the resulting “kitchen list”]

  • Converge: this converges the kitchen (once more, converge = chef-client installation plus a chef-solo run with the suite configuration describing which recipes are launched). The converge action downloads the chef client, installs it on the machine (using the omnibus url) and runs the recipes specified in the suite stanza of the .kitchen.yml file. Here is the script I use for the omnibus installation; this script is “served” by an http server:
  • # cat install.sh
    #!/usr/bin/ksh
    echo "[omnibus] [start] starting omnibus install"
    echo "[omnibus] downloading chef client http://chefomnibus:8080/chefclient/latest"
    perl -le 'use LWP::Simple;getstore("http://chefomnibus:8080/chefclient/latest", "/tmp/chef.bff")'
    echo "[omnibus] installing chef client"
    installp -aXYgd /tmp/ chef
    echo "[omnibus] [end] ending omnibus install"
    
  • The http server is serving this install.sh file. Here are the content of the served directory and the httpd.conf configuration used for the omnibus installation on AIX:
  • # ls -l /apps/chef/chefclient
    total 647896
    -rw-r--r--    1 apache   apache     87033856 Dec 16 17:15 chef-12.1.2-1.powerpc.bff
    -rwxr-xr-x    1 apache   apache     91922944 Nov 25 00:24 chef-12.5.1-1.powerpc.bff
    -rw-------    2 apache   apache     76375040 Jan  6 11:23 chef-12.6.0-1.powerpc.bff
    -rwxr-xr-x    1 apache   apache          364 Apr 15 10:23 install.sh
    -rw-------    2 apache   apache     76375040 Jan  6 11:23 latest
    # cat httpd.conf
    [..]
         Alias /chefclient/ "/apps/chef/chefclient/"
         <Directory "/apps/chef/chefclient/">
           Options Indexes FollowSymlinks MultiViews
           AllowOverride None
           Require all granted
         </Directory>
    
# kitchen converge

[Screenshots: “kitchen converge” output and the resulting “kitchen list”]

  • Setup and verify: these actions run a bunch of tests to verify that the machine is in the state you want. The tests I am writing check that the root home directory was created and that the key was successfully put in the .ssh directory. In a few words, you need to write tests checking that your recipes are working well (in chef words: “check that the machine is in the correct state”). In my case I’m using serverspec to describe my tests (there are different testing tools; you can also use bats). To describe the test suite, just create serverspec files in the chef-repo directory, in ~/test/integration/<suite name>/serverspec (in my case ~/test/integration/aixcookbook/serverspec). All the serverspec test file names end with _spec:
  • # ls test/integration/aixcookbook/serverspec/
    root_authorized_keys_spec.rb  spec_helper.rb
    
  • The “_spec” file describes the tests that will be run by the kitchen. In my very simple tests here I’m just checking that my files exist and that the content of the public key is the same as my public key (the key created by cloud-init on AIX is located in ~/.ssh, and my test recipe here is changing the root home directory and putting the key in the right place). By looking at the file you can see that the serverspec language is very simple to understand:
    # cat spec_helper.rb
    require 'serverspec'
    set :backend, :exec
    # cat root_authorized_keys_spec.rb
    require 'spec_helper'
    
    describe file('/root/.ssh') do
      it { should exist }
      it { should be_directory }
      it { should be_owned_by 'root' }
    end
    
    describe file('/root/.ssh/authorized_keys') do
      it { should exist }
      it { should be_owned_by 'root' }
      it { should contain 'from="1[..]" ssh-rsa AAAAB3NzaC1[..]' }
    end
    
  • The kitchen will try to install the ruby gems needed by serverspec (serverspec needs to be installed on the server to run the automated tests). As my server has no connectivity to the internet I need to run my own gem server. Luckily all the needed gems are installed on my chef workstation (if you have no internet access from the workstation use the tip described at the beginning of this blog post). I just need to run a local gem server by running “gem server” on the chef workstation. The server listens on port 8808 and serves all the needed gems:
  • # gem list | grep -E "busser|serverspec"
    busser (0.7.1)
    busser-bats (0.3.0)
    busser-serverspec (0.5.9)
    serverspec (2.31.1)
    # gem server
    Server started at http://0.0.0.0:8808
    
  • If you look at the converge output above you can see that the recipe gem_source was executed. This recipe changes the gem source on the virtual machine (from https://rubygems.org to my own local server). In the .kitchen.yml file the urls to add to and remove from the gem sources are specified in the suite attributes:
  • # cat gem_source.rb
    ruby_block 'Changing gem source' do
      block do
        node['gem_source']['add_urls'].each do |url|
          current_sources = Mixlib::ShellOut.new('/opt/chef/embedded/bin/gem source')
          current_sources.run_command
          next if current_sources.stdout.include?(url)
          add = Mixlib::ShellOut.new("/opt/chef/embedded/bin/gem source --add #{url}")
          add.run_command
          Chef::Application.fatal!("Adding gem source #{url} failed #{add.status}") unless add.status == 0
          Chef::Log.info("Add gem source #{url}")
        end
    
        node['gem_source']['delete_urls'].each do |url|
          current_sources = Mixlib::ShellOut.new('/opt/chef/embedded/bin/gem source')
          current_sources.run_command
          next unless current_sources.stdout.include?(url)
          del = Mixlib::ShellOut.new("/opt/chef/embedded/bin/gem source --remove #{url}")
          del.run_command
          Chef::Application.fatal!("Removing gem source #{url} failed #{del.status}") unless del.status == 0
          Chef::Log.info("Remove gem source #{url}")
        end
      end
      action :run
    end
    
# kitchen setup
# kitchen verify

[Screenshots: “kitchen setup” and “kitchen verify” output and the resulting “kitchen list”]

  • Destroy: This will destroy the virtual machine on PowerVC.
# kitchen destroy

[Screenshots: “kitchen destroy” output and the resulting “kitchen list”]

Now that you understand how the kitchen works and are able to run it to create and test AIX machines, you are ready to use it to develop the chef cookbook that will fit your infrastructure. To run all the steps (create, converge, setup, verify, destroy) in one go, just use the “kitchen test” command:

# kitchen test
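
Coming back to the gem_source recipe used in the suite: its logic boils down to “add a url only if it is missing, remove a url only if it is present”, which is what allows the recipe to be run at every converge without piling up changes. The following is a minimal plain-Ruby sketch of that decision, with no Chef involved. The gem_source_plan name and the sample list of current sources are assumptions made for illustration, and unlike the recipe (which searches for the url in the command output) the sketch compares whole urls.

```ruby
# Hypothetical helper: decide which gem sources to add and which to remove,
# given the sources currently configured on the machine.
def gem_source_plan(current, add_urls, delete_urls)
  {
    add: add_urls.reject { |url| current.include?(url) },
    remove: delete_urls.select { |url| current.include?(url) }
  }
end

# Assumed starting state: only rubygems.org is configured.
current = ['https://rubygems.org/']
plan = gem_source_plan(current, ['http://10.14.66.100:8808'], ['https://rubygems.org/'])
puts "to add:    #{plan[:add].inspect}"
puts "to remove: #{plan[:remove].inspect}"

# Once the plan has been applied, a second run has nothing left to do.
after = current - plan[:remove] + plan[:add]
second = gem_source_plan(after, ['http://10.14.66.100:8808'], ['https://rubygems.org/'])
puts "second run: #{second[:add].size} to add, #{second[:remove].size} to remove"
```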

As you are going to change a lot of things in your cookbook you’ll need to version the code you are creating. For this we will use a gitlab server.

Gitlab: version your AIX cookbook

Unfortunately for you and for me, I didn’t have the time to run gitlab on a Linux on Power machine. I’m sure it is possible (if you find a way to do this please mail me). Anyway, my version of gitlab is running on an x86 box. The goal here is to allow the chef workstation user (in my environment this user is “chef”) to push all the new developments (providers, recipes) to the git development branch. For this we will:

  • Allow the chef user to push its sources to the git server through ssh (we are creating a chefworkstation user and adding the ssh key that authorizes this user to push changes to the git repository).

  • Create a new repository called aix-cookbook.
  • Push your current work to the master branch. The master branch will be the production branch.
  • # git config --global user.name "chefworkstation"
    # git config --global user.email "chef@myworkstation.chmod666.org"
    # git init
    # git add -A .
    # git commit -m "first commit"
    # git remote add origin git@gitlabserver:chefworkstation/aix-cookbook.git
    # git push origin master
    

  • Create a development branch (you’ll need to push all your new development to this branch, and you’ll never have to do anything else on the master branch, as Jenkins is going to do the job for us).
  • # git checkout -b dev
    # git commit -a
    # git push origin dev
    

The git server is ready: we have a repository accessible by the chef user, with two branches. The dev branch is the one we work on and is used for all our development. The master branch is used for production; we never touch it ourselves, and it is only updated (by Jenkins) if all the tests (foodcritic, rubocop and the test-kitchen) are ok.

Automating the continuous integration with Jenkins

What is Jenkins

The goal of Jenkins is to automate all the tests and run them again every time a change is applied to the cookbook you are developing. By using Jenkins you make sure that every change is tested, and that nothing that is broken or fails the tests you have defined gets pushed to your production environment. To be sure the cookbook is working as desired we will use three different tools. foodcritic checks your chef cookbook for common problems against rules defined within the tool (these rules check that everything is ok for the chef execution, so you will be sure that there is no syntax error and that all the coding conventions are respected), rubocop checks the Ruby code style, and then we run a kitchen test to be sure that the development branch works with the kitchen and that all our serverspec tests pass. Jenkins will automate the following steps:

  1. Pull the dev branch from git server (gitlab) if anything has changed on this branch.
  2. Run foodcritic on the code.
  3. If foodcritic tests are ok this will trigger the next step.
  4. Pull the dev branch again
  5. Run rubocop on the code.
  6. If rubocop tests are ok this will trigger the next step.
  7. Run the test-kitchen
  8. This will build a new machine on PowerVC and test the cookbook against it (kitchen test).
  9. If the test kitchen is ok push the dev branch to the master branch.
  10. You are ready for production :-)
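
The gating logic of this list can be summarised in a few lines of Ruby. This is only a sketch of the idea, not the way Jenkins is configured (the jobs are set up in the Jenkins web interface): each stage runs only if the previous one succeeded, and the promotion of dev to master happens only at the very end. The run_pipeline and pipeline_stages names are hypothetical, and the final git command is an assumption about how the promotion could be done. The commands are handed to a runner block, so the sketch can be tried without foodcritic, rubocop or a PowerVC at hand.

```ruby
# Hypothetical stage list, in the order the Jenkins jobs are chained.
def pipeline_stages
  [
    ['foodcritic',   'foodcritic -f correctness ./cookbooks/'],
    ['rubocop',      'rubocop .'],
    ['test-kitchen', 'kitchen test'],
    # Assumption: promoting dev to master is done with a plain git push.
    ['promote',      'git push origin dev:master']
  ]
end

# Hand each command to the runner block and stop at the first failure.
# The block receives the command line and returns true on success.
# Returns the names of the stages that passed.
def run_pipeline(stages)
  passed = []
  stages.each do |name, command|
    break unless yield(command)

    passed << name
  end
  passed
end

# Dry run with a fake runner in which rubocop fails:
# the kitchen test and the promotion are never attempted.
passed = run_pipeline(pipeline_stages) { |command| !command.start_with?('rubocop') }
puts "stages passed: #{passed.join(', ')}"
```

On a real workstation the runner block could simply be { |command| system(command) }.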

First: Foodcritic

The first test we are running is foodcritic. Rather than trying to explain it myself with my weird English, I prefer to quote the chef website:

Foodcritic is a static linting tool that analyzes all of the Ruby code that is authored in a cookbook against a number of rules, and then returns a list of violations. Because Foodcritic is a static linting tool, using it is fast. The code in a cookbook is read, broken down, and then compared to Foodcritic rules. The code is not run (a chef-client run does not occur). Foodcritic does not validate the intention of a recipe, rather it evaluates the structure of the code, and helps enforce specific behavior, detect portability of recipes, identify potential run-time failures, and spot common anti-patterns.

# foodcritic -f correctness ./cookbooks/
FC014: Consider extracting long ruby_block to library: ./cookbooks/aix/recipes/gem_source.rb:1
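
The warning line above has a regular layout: rule code, message, file and line number. That regularity is what makes it possible for a warnings parser to track the violations from one build to the next. As an illustration only, here is a minimal Ruby sketch that splits such a line into its parts. The regular expression and the parse_foodcritic_line name are assumptions based on the single output line shown here, not something taken from foodcritic or Jenkins.

```ruby
# Assumed layout, taken from the sample line: "<rule>: <message>: <file>:<line>"
def parse_foodcritic_line(text)
  match = text.strip.match(/\A(?<rule>FC\d+): (?<message>.+): (?<file>[^:]+):(?<line>\d+)\z/)
  return nil unless match

  {
    rule: match[:rule],
    message: match[:message],
    file: match[:file],
    line: match[:line].to_i
  }
end

sample = 'FC014: Consider extracting long ruby_block to library: ./cookbooks/aix/recipes/gem_source.rb:1'
warning = parse_foodcritic_line(sample)
puts "#{warning[:rule]} in #{warning[:file]} at line #{warning[:line]}"
```

Lines that do not look like a warning return nil, so the helper can be applied to the whole output without filtering it first.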

In Jenkins here are the steps to create a foodcritic test:

  • Pull the dev branch from gitlab.

  • Check for changes (the Jenkins test is triggered only if there was a change in the git repository).

  • Run foodcritic.

  • After the build, parse the output (to archive and record the evolution of the foodcritic errors) and run the rubocop project if the build is stable (passed without any errors).

  • To configure the parser, go to the Jenkins configuration and add the foodcritic compiler warnings parser.

Second: Rubocop

The second test we are running is rubocop, a Ruby static code analyzer based on the community Ruby style guide. Here is an example:

# rubocop .
Inspecting 71 files
..CCCCWWCWC.WC..CC........C.....CC.........C.C.....C..................C

Offenses:

cookbooks/aix/providers/fixes.rb:31:1: C: Assignment Branch Condition size for load_current_resource is too high. [20.15/15]
def load_current_resource
^^^
cookbooks/aix/providers/fixes.rb:31:1: C: Method has too many lines. [19/10]
def load_current_resource ...
^^^^^^^^^^^^^^^^^^^^^^^^^
cookbooks/aix/providers/sysdump.rb:11:1: C: Assignment Branch Condition size for load_current_resource is too high. [25.16/15]
def load_current_resource
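
In the progress line of this output each character stands for one inspected file: a dot means no offense, and a letter is the highest severity found in that file (C for convention, W for warning; rubocop also uses R, E and F for refactor, error and fatal). A small Ruby sketch, given here as an illustration with a hypothetical tally_progress helper, counts them so you can watch the numbers go down from one build to the next:

```ruby
# Count the files per severity marker in a rubocop progress line.
def tally_progress(progress)
  progress.strip.each_char.each_with_object(Hash.new(0)) do |char, counts|
    counts[char] += 1
  end
end

progress = '..CCCCWWCWC.WC..CC........C.....CC.........C.C.....C..................C'
counts = tally_progress(progress)
puts "files inspected: #{counts.values.sum}"
puts "clean: #{counts['.']}, convention: #{counts['C']}, warning: #{counts['W']}"
```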

In Jenkins here are the steps to create a rubocop test:

  • Do the same thing as for foodcritic, except for the build and post-build action steps.
  • Run rubocop.

  • After the build, parse the output and run the test-kitchen project even if the build fails (rubocop will generate tons of things to correct … once you are ok with rubocop, change this to “trigger only if the build is stable”).

Third: test-kitchen

I don’t have to explain again what the test-kitchen is ;-) . It is the third test we are creating with Jenkins, and if this one is ok we push the changes to production:

  • Do the same thing as for foodcritic, except for the build and post-build action steps.
  • Run the test-kitchen.

  • If the test kitchen is ok, push the dev branch to the master branch (dev to production).

More about Jenkins

The three tests are now linked together. On the Jenkins home page you can check the current state of your tests. Here are a couple of screenshots:

[Screenshots: Jenkins weather view and build timeline]

Conclusion

I know that for most of you working this way is something totally new. As AIX sysadmins we are used to our ksh and bash scripts and we like the way things are today. But as the world is changing, and as you are going to manage more and more machines with fewer and fewer admins, you will understand how powerful it is to use automation and to work in a “continuous integration” way. Even if you don’t like this “concept” or this new work habit … give it a try and you’ll see that working this way is worth the effort. First for you … you’ll discover a lot of new interesting things; second for your boss, who will discover that working this way is safer and more productive. Trust me, AIX needs to face Linux today, and we are not going anywhere without having a proper fight with the Linux guys :-) (yep, it’s a joke).

What’s new in VIOS 2.2.4.10 and PowerVM : Part 2 Shared Processor Pool weighting

First of all, before beginning this blog post, I owe you an explanation about these two months without new posts. These two months were very busy. On the personal side I was forced to move out of my apartment and had to find another one that suited me (and I can assure you that this is not easy in Paris). As I was visiting apartments almost 3 days a week, the time kept for writing blog posts (please remember that I’m doing this “after hours”) was taken by something else :-(. At work things were crazy too: we had to build twelve new E870 boxes (with the provisioning toolkit and SRIOV adapters) and make them work with our current implementation of PowerVC. Then I had to do a huge vscsi to NPIV migration: more than 500 AIX machines to migrate from vscsi to NPIV and then move to P8 boxes in less than three weeks … yes, more than 500 machines in less than 3 weeks (4000 zones created …). Thanks to the help of an STG Lab Services consultant (Bonnie LeBarron) this was achieved using a modified version of her script (adapted to our needs: zoning and mapping part, latest hmc releases). I’m back in business now and I have planned a couple of blog posts this month. The first of this series is about Shared Processor Pool weighting on the latest Power8 firmware versions. You’ll see that it changes a lot of things compared to P7 boxes.

A short history of Shared Processor Pool weighting

This long story began a few years ago for me (I’d say at least 4 years ago). I was planning to do a blog post about it back then but decided not to, because I thought this topic was considered “sensitive”; now that we have documentation and an official statement on this there is no reason to hide it anymore. I was working for a bank using two P795 with a lot of cores activated. We were using Multiple Shared Processor Pools in an unconventional way (as far as I remember two pools per customer, one for Oracle and one for WAS, and we had more than 5 or 6 customers, so each box had at least 10 MSPP). As you may already know, I only believe what I can see, so I decided to run tests on my own. By reading the Redbook I realized that there was not enough information about pool and partition weighting. Like a lot of today’s customers, we had different weights for development (32), qualification (64), pre-production (128), production (192) and finally the Virtual I/O Servers (255). As we were using Shared Processor Pools, I was expecting that when a Shared Processor Pool was full (contention) the weights would kick in and prioritize the partitions with the higher weight. Imagine my surprise when I realized the weighting was not working inside a Shared Processor Pool but only in the DefaultPool (Pool 0). Remember this statement forever: on Power7, partition weighting only works when the default pool is full. There is no “intelligence” in a Shared Processor Pool, and you have to be very careful with the size of the pool because of that. On Power7, pools are used ONLY for licensing purposes. I then decided to contact my preferred IBM pre-sales in France to tell him about this incredible discovery. I had no answer for one month, then (as always) he came back with the answer of someone who already knew the truth about this.
He introduced me to a performance expert (she was a performance expert at that time and is now specialized in security) who told me that I was absolutely right about my discovery but that only a few people were aware of it. I decided to say nothing about it … but I was sure that IBM realized there was something to clarify. Then last year, at the IBM Technical Collaboration Council, I saw a PowerPoint slide saying that the latest IBM Power8 firmware would add this long-awaited feature: partition weighting will work inside a Shared Processor Pool. Finally, after waiting for more than four years, I have what I want. As I was working on a new project in my current job I had to create a lot of Shared Processor Pools in a mixed Power7 (P770) and Power8 (E870) environment. It was time to check whether this new feature was really working and to compare a Power8 machine (with the latest firmware) and a Power7 machine (with the latest firmware). The way we implement and monitor Shared Processor Pools on a Power8 will now be very different from what it was on a Power7 box. I think this is really important and that everybody now needs to understand the differences for their future implementations. But let’s first have a look at the Redbooks to check the official statements:

The Redbook talking about this is “IBM PowerVM Virtualization Introduction and Configuration”, here is the key paragraph to understand (page 113 and 114):

redbook_statement

It was super hard to find, but there is a place where IBM talks about this. I’m quoting this link below: https://www.ibm.com/support/knowledgecenter/9119-MME/p8hat/p8hat_sharedproc.htm

When the firmware is at level 8.3.0, or earlier, uncapped weight is used only when more virtual processors consume unused resources than the available physical processors in the shared processor pool. If no contention exists for processor resources, the virtual processors are immediately distributed across the physical processors, independent of their uncapped weights. This can result in situations where the uncapped weights of the logical partitions do not exactly reflect the amount of unused capacity.

For example, logical partition 2 has one virtual processor and an uncapped weight of 100. Logical partition 3 also has one virtual processor, but an uncapped weight of 200. If logical partitions 2 and 3 both require more processing capacity, and there is not enough physical processor capacity to run both logical partitions, logical partition 3 receives two more processing units for every additional processing unit that logical partition 2 receives. If logical partitions 2 and 3 both require more processing capacity, and there is enough physical processor capacity to run both logical partitions, logical partition 2 and 3 receive an equal amount of unused capacity. In this situation, their uncapped weights are ignored.

When the firmware is at level 8.4.0, or later, if multiple partitions are assigned to a shared processor pool, the uncapped weight is used as an indicator of how the processor resources must be distributed among the partitions in the shared processor pool with respect to the maximum amount of capacity that can be used by the shared processor pool. For example, logical partition 2 has one virtual processor and an uncapped weight of 100. Logical partition 3 also has one virtual processor, but an uncapped weight of 200. If logical partitions 2 and 3 both require more processing capacity, logical partition 3 receives two additional processing units for every additional processing unit that logical partition 2 receives.

The server distributes unused capacity among all of the uncapped shared processor partitions that are configured on the server, regardless of the shared processor pools to which they are assigned. For example, if you configure logical partition 1 to the default shared processor pool and you configure logical partitions 2 and 3 to a different shared processor pool, all three logical partitions compete for the same unused physical processor capacity in the server, even though they belong to different shared processor pools.

Testing methodology

We now need to demonstrate that the behavior of the weighting is different between a Power7 and a Power8 machine, here is how we are going to proceed :

  • On a Power8 machine (E870 SC840_056) we create a Shared Processor Pool with a “Maximum Processing unit” set to 1.
  • On a Power7 machine we create a Shared Processor Pool with a “Maximum Processing unit” set to 1.
  • We create two partitions in the P8 pool (1VP, 0.1EC) called mspp1 and mspp2.
  • We create two partitions in the P7 pool (1VP, 0.1EC) called mspp3 and mspp4.
  • Using ncpu, provided with the nstress tools (http://public.dhe.ibm.com/systems/power/community/wikifiles/PerfTools/nstress_AIX6_April_2014.tar), we create a heavy load on each partition. Obviously this load can’t be higher than 1 processing unit in total (sum of each physc).
  • We then use these testing scenarios (each test has a duration of 15 minutes, we are recording cpu and pool stats with nmon and lpar2rrd)
    1. First partition with a weight of 128, the second partition with a weight of 128 (test with the same weight).
    2. First partition with a weight of 64, the second partition with a weight of 128 (weights in a 1:2 ratio).
    3. First partition with a weight of 32, the second partition with a weight of 128 (weights in a 1:4 ratio).
    4. First partition with a weight of 1, the second partition with a weight of 2 (we try here to prove that the ratio between two values is more important than the values themselves. Values of 1 and 2 should give us the same result as 64 and 128).
    5. First partition with a weight of 1, the second partition with a weight of 255 (a ratio of 1:255) (you’ll see here that the result is pretty interesting :-) ).
  • You’ll see that it will not be necessary to do all these tests on the P7 box …. :-)

The Power8 case

Prerequisites

A P8 firmware level of SC840* or SV840* is mandatory to enable the weighting in a Shared Processor Pool on a machine without contention for processor resources (no contention in the DefaultPool). This means that all P6, P7 and P8 (with a firmware < 840) machines do not have this feature coded in the firmware. My advice is to update all your P8 machines to the latest level to enable this new behavior.

Tests

For each test, we check the weight of each partition using the lparstat command, then we record nmon data every 30 seconds and launch ncpu for a duration of 15 minutes with four CPUs (we are in SMT4) on both the P8 and P7 boxes. We will show you here that weights are taken into account in a Power8 MSPP, but are not taken into account in a Power7 MSPP.

#lparstat -i | grep -iE "Variable Capacity Weight|^Partition"
Partition Name                             : mspp1-23bad3d7-00000898
Partition Number                           : 3
Partition Group-ID                         : 32771
Variable Capacity Weight                   : 255
Desired Variable Capacity Weight           : 255
# /usr/bin/nmon -F /admin/nmon/$(hostname)_weight255.nmon -s30 -c30 -t ; ./ncpu -p 4 -s 900
# lparstat 1 10
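If you repeat this check on many partitions, reading the weight by eye gets tedious. Here is a minimal Python sketch that parses the lparstat -i lines shown above; the sample text is copied from that output, and the helper names (parse_lparstat_i, weight_is_applied) are mine, not part of any AIX tool.

```python
# Sample text copied from the lparstat -i output shown above.
SAMPLE = """\
Partition Name                             : mspp1-23bad3d7-00000898
Partition Number                           : 3
Partition Group-ID                         : 32771
Variable Capacity Weight                   : 255
Desired Variable Capacity Weight           : 255
"""


def parse_lparstat_i(text):
    """Turn 'Key : value' lines into a dict (keys and values stripped)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def weight_is_applied(fields):
    """True when the running weight equals the desired weight."""
    return (fields["Variable Capacity Weight"]
            == fields["Desired Variable Capacity Weight"])


if __name__ == "__main__":
    info = parse_lparstat_i(SAMPLE)
    print(info["Partition Name"], int(info["Variable Capacity Weight"]))
```

In practice you would feed the function the real command output instead of SAMPLE, for instance collected over ssh from each partition.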
  • Both weights at 128: you can check in the picture below that the “physc” values are strictly equal (0.5 for both lpars) (the ratio of 1 between the two weights is respected) :
  • weight128

  • One partition to 64 and one partition to 128: you can check in the pictures below (lparstat output and nmon analyser graph) that the physc values now differ (0.36 for the mspp2 lpar and 0.64 for the mspp1 lpar). With a weight ratio of 2, mspp1 physc is roughly twice mspp2 physc (the weights are respected in the Shared Processor Pool):
  • weight64_128

nmonx2

This lpar2rrd graph shows you the weighting behavior on a Power8 machine (test one: both weights equal to 128, and test two: two different weights of 128 and 64).

graph_p8_128128_12864

  • One partition to 32 and one partition to 128: you can check in the picture below that the 1:4 weight ratio (32:128) is taken into account (physc values of 0.26 and 0.74, a physc ratio close to 3).
  • weight32_128

  • One partition to 1 and one partition to 2. The results here are exactly the same as in the second test (128 and 64 weights), which proves that the important thing to configure is the ratio between the weights and not the values themselves (using weights of 1, 2, 3 will give you the exact same results as 2, 4, 6):
  • weight1_2

  • Finally one partition to 1 and one partition to 255. Be careful here: the ratio is big enough to leave an lpar unresponsive when both partitions are loaded. I do not recommend such high ratios because of this:
  • weight1_255

graph_p8_12832_12_1255
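One way to read the physc values measured above (a simplified sketch, not the hypervisor’s actual algorithm): each partition keeps its entitlement (0.1 here), and only the capacity left in the pool above the entitlements is shared in proportion to the uncapped weights. The function name expected_physc and this model are assumptions of mine; the pool size, entitlements and weights come from the test setup described earlier.

```python
def expected_physc(pool_max, entitlements, weights):
    """Simplified model: every LPAR gets its entitlement, and the pool
    capacity left over is split in proportion to the uncapped weights."""
    spare = pool_max - sum(entitlements)
    total_weight = sum(weights)
    return [ent + spare * w / total_weight
            for ent, w in zip(entitlements, weights)]


# The five weight pairs used in the test scenarios.
SCENARIOS = [(128, 128), (64, 128), (32, 128), (1, 2), (1, 255)]

if __name__ == "__main__":
    for first_w, second_w in SCENARIOS:
        first, second = expected_physc(1.0, [0.1, 0.1], [first_w, second_w])
        print(f"weights {first_w}:{second_w} -> "
              f"physc {first:.2f} / {second:.2f}")
```

Under this model the 32:128 case gives 0.26 and 0.74, which matches the measurement, and it also suggests why a weight ratio of 2 shows up as 0.36 and 0.64 rather than an exact one third and two thirds: the entitlement is not weighted.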

The Power7 case

Let’s do one test on a Power7 machine with one lpar with a weight of 1 and the other one with a weight of 255 … you’ll see a huge difference here … and I think it is clear enough to avoid doing all the test scenarios on the Power7 machine.

Tests

You can see here that I’m doing the exact same test, weights of 1 and 255, but now both partitions have an equal physc value (0.5 for both partitions). On a Power7 box the weights are taken into account only if the DefaultPool (pool0) is full (contention). The pictures below show you the reality of Multiple Shared Processor Pools running on a Power7 box. On Power7, MSPP must be used only for licensing purposes and nothing else.

weight1_255_power7
graph_p7_1255

Conclusion

I hope you better understand the Multiple Shared Processor Pools differences between Power8 and Power7. Now that you are aware of this, my advice is to have different strategies when you are implementing MSPP on Power7 and Power8. On Power7, double check and monitor your MSPP to be sure the pools are never full and that you can get enough capacity to run your load. On a Power8 box, set up your weights wisely on your different environments (backup, production, development). You can then be sure that production will be prioritized whatever happens, even if you reduce your MSPP sizes; by doing this you’ll optimize licensing costs. As always I hope it helps.

NovaLink ‘HMC Co-Management’ and PowerVC 1.3.0.1 Dynamic Resource Optimizer

Everybody now knows that I’m using PowerVC a lot in my current company. My environment is growing bigger and bigger and we are now managing more than 600 virtual machines with PowerVC (the goal is to reach ~ 3000 this year). Some of them were built by PowerVC itself and some of them were migrated through a homemade python script calling the PowerVC REST API and moving our old vSCSI machines to the new full NPIV/Live Partition Mobility/PowerVC environment (still struggling with the “old men” to move to SSP, but I’m alone versus everybody on this one). I’m happy with that but (there is always a but) I’m facing a lot of problems. The first one is that we are doing more and more things with PowerVC (virtual machine creation, virtual machine resizing, adding additional disks, moving machines with LPM, and finally using this python script to migrate the old machines to the new environment). I realized that the machine hosting PowerVC was getting slower and slower, and the more actions we did the more “unresponsive” PowerVC became. By this I mean that the GUI was slow and creating objects took longer and longer. By looking at the CPU graphs in lpar2rrd we noticed that the CPU consumption was growing as fast as our activity on PowerVC (check the graph below). The second problem was my teams (unfortunately for me, we have different teams doing different sorts of things here and everybody is using the Hardware Management Consoles their own way: some people are renaming the machines, making them unusable with PowerVC, some people were changing the profiles, disabling the synchronization, and even worse we have some third party tools used for capacity planning making the Hardware Management Console unusable by PowerVC). The solution to all these problems is to use NovaLink and especially the NovaLink Co-Management.
By doing this the Hardware Management Consoles will be restricted to a mostly read-only view, and PowerVC will stop querying the HMCs and will directly query the NovaLink partition on each host instead.

cpu_powervc

What is NovaLink ?

If you are using PowerVC you know that it is based on OpenStack. Until now all the OpenStack services were running on the PowerVC host. If you check on the PowerVC host today you can see that there is one nova-compute process per managed host. In the example below I’m managing ten hosts so I have ten different nova-compute processes running :

# ps -ef | grep [n]ova-compute
nova       627     1 14 Jan16 ?        06:24:30 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_10D6666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_10D6666.log
nova       649     1 14 Jan16 ?        06:30:25 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_65E6666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_65E6666.log
nova       664     1 17 Jan16 ?        07:49:27 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_1086666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_1086666.log
nova       675     1 19 Jan16 ?        08:40:27 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_06D6666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_06D6666.log
nova       687     1 18 Jan16 ?        08:15:57 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_6576666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_6576666.log
nova       697     1 21 Jan16 ?        09:35:40 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_6556666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_6556666.log
nova       712     1 13 Jan16 ?        06:02:23 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_10A6666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_10A6666.log
nova       728     1 17 Jan16 ?        07:49:02 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_1016666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_1016666.log
nova       752     1 17 Jan16 ?        07:34:45 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_1036666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9119MHE_1036666.log
nova       779     1 13 Jan16 ?        05:54:52 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_6596666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9119MHE_6596666.log
# ps -ef | grep [n]ova-compute | wc -l
10
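To see which managed host each nova-compute process belongs to, you can extract the host identifier from the --config-file argument. This is a small sketch working on captured ps output; the two sample lines are copied from the listing above, and the function name hosts_from_ps is hypothetical.

```python
import re

# The host identifier sits between "nova-" and ".conf" in the config path.
CONFIG_RE = re.compile(r"--config-file /etc/nova/nova-(\S+)\.conf")

# Two lines copied from the ps -ef output shown above.
SAMPLE = """\
nova       627     1 14 Jan16 ?        06:24:30 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_10D6666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_10D6666.log
nova       649     1 14 Jan16 ?        06:30:25 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova-9117MMD_65E6666.conf --log-file /var/log/nova/nova-compute.log --log-file /var/log/nova/nova-compute-9117MMD_65E6666.log
"""


def hosts_from_ps(ps_output):
    """Return the host identifiers found in nova-compute command lines."""
    return [match.group(1) for match in CONFIG_RE.finditer(ps_output)]


if __name__ == "__main__":
    hosts = hosts_from_ps(SAMPLE)
    print(len(hosts), "nova-compute processes:", ", ".join(hosts))
```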

The goal of NovaLink is to move these processes to a dedicated partition running on each managed host (each Power System). This partition is called the NovaLink partition. It runs Ubuntu 15.10 (little endian, so only available on Power8 hosts) and is in charge of running the OpenStack nova processes. By doing that you will distribute the load across all the NovaLink partitions instead of loading one PowerVC host. Even better, my understanding is that the NovaLink partition is able to communicate directly with the FSP. By using NovaLink you will be able to stop using the Hardware Management Consoles and avoid their slowness. As the NovaLink partition is hosted on the host itself the RMC connections can now use a direct link (IPv6) through the PowerHypervisor. No more RMC connection problems at all ;-), it’s just awesome. NovaLink allows you to choose between two modes of management:

  • Full Nova Management: You install your new host directly with NovaLink on it and you will not need a Hardware Management Console anymore (in this case the NovaLink installation is in charge of deploying the Virtual I/O Servers and the SEAs).
  • Nova Co-Management: Your host is already installed and you give the write access (setmaster) to the NovaLink partition. The Hardware Management Console will be limited in this mode (you will not be able to create partitions or modify profiles anymore; it’s not a “read only” mode as you will still be able to start and stop the partitions and do a few other things with the HMC, but you will be very limited).
  • You can still mix NovaLink and non-NovaLink managed hosts, and still have P7/P6 managed by HMCs, P8 managed by HMCs, P8 Nova Co-Managed and P8 full Nova Managed ;-).
  • Nova1

Prerequisites

As always, upgrade your systems to the latest code level if you want to use NovaLink and NovaLink Co-Management:

  • Power8 only, with firmware version 840 or later
  • Virtual I/O Server 2.2.4.10 or later
  • For NovaLink co-management HMC V8R8.4.0
  • Obviously install NovaLink on each NovaLink managed system (install the latest patch version of NovaLink)
  • PowerVC 1.3.0.1 or later
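Checking “x or later” by comparing version strings as text is a classic trap (as text, "2.2.4.9" sorts after "2.2.4.10"). Here is a minimal sketch that compares dotted levels numerically; the minimum levels come from the list above, while the function names and the dictionary layout are assumptions for illustration. Firmware and HMC levels use a different naming scheme and are left out.

```python
def version_tuple(version):
    """'2.2.4.10' -> (2, 2, 4, 10) so that levels compare numerically."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum):
    """True when the installed level is at or above the minimum level."""
    return version_tuple(installed) >= version_tuple(minimum)


# Minimum levels taken from the prerequisites list above.
MINIMUMS = {"vios": "2.2.4.10", "powervc": "1.3.0.1"}


def check_levels(installed):
    """Return the names of the components below their minimum level."""
    return [name for name, minimum in MINIMUMS.items()
            if not meets_minimum(installed[name], minimum)]


if __name__ == "__main__":
    # Hypothetical inventory: the VIOS is too old, PowerVC is fine.
    print(check_levels({"vios": "2.2.3.50", "powervc": "1.3.0.1"}))
```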

NovaLink installation on an existing system

I’ll show you here how to install a NovaLink partition on an existing deployed system. Installing a new system from scratch is also possible. My advice is that you look at this address to start: , and check this youtube video showing you how a system is installed from scratch :

The goal of this post is to show you how to set up a co-managed system on an already existing system with Virtual I/O Servers already deployed on the host. My advice is to be very careful. The first thing you’ll need to do is to create a partition (2VP, 0.5EC and 5GB of memory) (I’m calling it nova in the example below) and use the Virtual Optical device to load the NovaLink system on it. In the example below the machine is “SSP” backed. Be very careful when you do that: set up the profile name and all the rest of the configuration before moving to co-managed mode … after that it will be harder for you to change things, as the new pvmctl command will be very new to you:

# mkvdev -fbo -vadapter vhost0
vtopt0 Available
# lsrep
Size(mb) Free(mb) Parent Pool         Parent Size      Parent Free
    3059     1579 rootvg                   102272            73216

Name                                                  File Size Optical         Access
PowerVM_NovaLink_V1.1_122015.iso                           1479 None            rw
vopt_a19a8fbb57184aad8103e2c9ddefe7e7                         1 None            ro
# loadopt -disk PowerVM_NovaLink_V1.1_122015.iso -vtd vtopt0
# lsmap -vadapter vhost0 -fmt :
vhost0:U8286.41A.21AFF8V-V2-C40:0x00000003:nova_b1:Available:0x8100000000000000:nova_b1.7f863bacb45e3b32258864e499433b52: :N/A:vtopt0:Available:0x8200000000000000:/var/vio/VMLibrary/PowerVM_NovaLink_V1.1_122015.iso: :N/A
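The colon-delimited lsmap output is handy for scripting. Below is a small sketch that splits the line shown above into the adapter and its mapped devices. The field names are an assumption of mine, inferred from the column headers of the default lsmap output (three adapter fields followed by groups of six fields per virtual target device); check them against your own VIOS level before relying on them.

```python
# Line copied from the lsmap -vadapter vhost0 -fmt : output shown above.
SAMPLE = ("vhost0:U8286.41A.21AFF8V-V2-C40:0x00000003:nova_b1:Available:"
          "0x8100000000000000:nova_b1.7f863bacb45e3b32258864e499433b52: :N/A:"
          "vtopt0:Available:0x8200000000000000:"
          "/var/vio/VMLibrary/PowerVM_NovaLink_V1.1_122015.iso: :N/A")

# Assumed names for the six fields that repeat for every mapped device.
DEVICE_FIELDS = ("vtd", "status", "lun", "backing_device", "physloc", "mirrored")


def parse_lsmap_fmt(line, sep=":"):
    """Split one 'lsmap -fmt' line into adapter info and mapped devices."""
    fields = line.rstrip("\n").split(sep)
    adapter = {"svsa": fields[0], "physloc": fields[1],
               "client_id": fields[2], "devices": []}
    rest = fields[3:]
    # Walk the remaining fields six at a time, one group per device.
    for start in range(0, len(rest) - len(DEVICE_FIELDS) + 1, len(DEVICE_FIELDS)):
        group = rest[start:start + len(DEVICE_FIELDS)]
        adapter["devices"].append(dict(zip(DEVICE_FIELDS, group)))
    return adapter


if __name__ == "__main__":
    mapping = parse_lsmap_fmt(SAMPLE)
    for device in mapping["devices"]:
        print(device["vtd"], "->", device["backing_device"])
```

With the sample above you can confirm that the NovaLink ISO is really loaded in vtopt0 before booting the partition.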
  • At the GRUB page select the first entry:
  • install1

  • Wait for the machine to boot:
  • install2

  • Choose to perform an installation:
  • install3

  • Accept the licenses
  • install4

  • padmin user:
  • install5

  • Enter your network configuration:
  • install6

  • Accept to install the Ubuntu system:
  • install8

  • You can then modify anything you want in the configuration file (in my case the timezone):
  • install9

    By default NovaLink (I think, I’m not 100% sure) is designed to be installed on a SAS disk, so without multipathing. If like me you decide to install the NovaLink partition in a “boot-on-san” lpar, my advice is to launch the installation without any multipathing enabled (only one vscsi adapter or one virtual fibre channel adapter). After the installation is completed, install the Ubuntu multipathd service and configure the second vscsi or virtual fibre channel adapter. If you don’t do that you may experience problems at installation time (RAID error). Please remember that you have to do that before enabling the co-management. Last thing about the installation: it may take a lot of time to finish, so be patient (especially during the preseed step).

install10

Updating to the latest code level

The ISO file provided by Entitled Software Support is not updated to the latest available NovaLink code. Make a copy of the official repository available at this address: ftp://public.dhe.ibm.com/systems/virtualization/Novalink/debian. Serve the content of this ftp server on your own http server (use the command below to copy it):

# wget --mirror ftp://public.dhe.ibm.com/systems/virtualization/Novalink/debian

Modify /etc/apt/sources.list (and sources.list.d) and comment out all the existing deb repositories to keep only your copy:

root@nova:~# grep -v ^# /etc/apt/sources.list
deb http://deckard.lab.chmod666.org/nova/Novalink/debian novalink_1.0.0 non-free
root@nova:/etc/apt/sources.list.d# apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
The following packages will be upgraded:
  pvm-cli pvm-core pvm-novalink pvm-rest-app pvm-rest-server pypowervm
6 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 165 MB of archives.
After this operation, 53.2 kB of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://deckard.lab.chmod666.org/nova/Novalink/debian/ novalink_1.0.0/non-free pypowervm all 1.0.0.1-151203-1553 [363 kB]
Get:2 http://deckard.lab.chmod666.org/nova/Novalink/debian/ novalink_1.0.0/non-free pvm-cli all 1.0.0.1-151202-864 [63.4 kB]
Get:3 http://deckard.lab.chmod666.org/nova/Novalink/debian/ novalink_1.0.0/non-free pvm-core ppc64el 1.0.0.1-151202-1495 [2,080 kB]
Get:4 http://deckard.lab.chmod666.org/nova/Novalink/debian/ novalink_1.0.0/non-free pvm-rest-server ppc64el 1.0.0.1-151203-1563 [142 MB]
Get:5 http://deckard.lab.chmod666.org/nova/Novalink/debian/ novalink_1.0.0/non-free pvm-rest-app ppc64el 1.0.0.1-151203-1563 [21.1 MB]
Get:6 http://deckard.lab.chmod666.org/nova/Novalink/debian/ novalink_1.0.0/non-free pvm-novalink ppc64el 1.0.0.1-151203-408 [1,738 B]
Fetched 165 MB in 7s (20.8 MB/s)
(Reading database ... 72094 files and directories currently installed.)
Preparing to unpack .../pypowervm_1.0.0.1-151203-1553_all.deb ...
Unpacking pypowervm (1.0.0.1-151203-1553) over (1.0.0.0-151110-1481) ...
Preparing to unpack .../pvm-cli_1.0.0.1-151202-864_all.deb ...
Unpacking pvm-cli (1.0.0.1-151202-864) over (1.0.0.0-151110-761) ...
Preparing to unpack .../pvm-core_1.0.0.1-151202-1495_ppc64el.deb ...
Removed symlink /etc/systemd/system/multi-user.target.wants/pvm-core.service.
Unpacking pvm-core (1.0.0.1-151202-1495) over (1.0.0.0-151111-1375) ...
Preparing to unpack .../pvm-rest-server_1.0.0.1-151203-1563_ppc64el.deb ...
Unpacking pvm-rest-server (1.0.0.1-151203-1563) over (1.0.0.0-151110-1480) ...
Preparing to unpack .../pvm-rest-app_1.0.0.1-151203-1563_ppc64el.deb ...
Unpacking pvm-rest-app (1.0.0.1-151203-1563) over (1.0.0.0-151110-1480) ...
Preparing to unpack .../pvm-novalink_1.0.0.1-151203-408_ppc64el.deb ...
Unpacking pvm-novalink (1.0.0.1-151203-408) over (1.0.0.0-151112-304) ...
Processing triggers for ureadahead (0.100.0-19) ...
ureadahead will be reprofiled on next reboot
Setting up pypowervm (1.0.0.1-151203-1553) ...
Setting up pvm-cli (1.0.0.1-151202-864) ...
Installing bash completion script /etc/bash_completion.d/python-argcomplete.sh
Setting up pvm-core (1.0.0.1-151202-1495) ...
addgroup: The group `pvm_admin' already exists.
Created symlink from /etc/systemd/system/multi-user.target.wants/pvm-core.service to /usr/lib/systemd/system/pvm-core.service.
0513-071 The ctrmc Subsystem has been added.
Adding /usr/lib/systemd/system/ctrmc.service for systemctl ...
0513-059 The ctrmc Subsystem has been started. Subsystem PID is 3096.
Setting up pvm-rest-server (1.0.0.1-151203-1563) ...
The user `wlp' is already a member of `pvm_admin'.
Setting up pvm-rest-app (1.0.0.1-151203-1563) ...
Setting up pvm-novalink (1.0.0.1-151203-408) ...
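If you upgrade many NovaLink partitions, it helps to record which package moved from which level to which level. This sketch extracts that from the “Unpacking … over …” lines of a saved apt-get log; the sample lines are copied from the output above and the function name upgraded_packages is hypothetical.

```python
import re

# Matches e.g. "Unpacking pypowervm (NEW) over (OLD) ..."
UNPACK_RE = re.compile(r"^Unpacking (\S+) \((\S+)\) over \((\S+)\)",
                       re.MULTILINE)

# Lines copied from the apt-get upgrade output shown above.
SAMPLE = """\
Preparing to unpack .../pypowervm_1.0.0.1-151203-1553_all.deb ...
Unpacking pypowervm (1.0.0.1-151203-1553) over (1.0.0.0-151110-1481) ...
Preparing to unpack .../pvm-cli_1.0.0.1-151202-864_all.deb ...
Unpacking pvm-cli (1.0.0.1-151202-864) over (1.0.0.0-151110-761) ...
"""


def upgraded_packages(apt_log):
    """Map each package name to an (old_version, new_version) pair."""
    return {name: (old, new)
            for name, new, old in UNPACK_RE.findall(apt_log)}


if __name__ == "__main__":
    for name, (old, new) in upgraded_packages(SAMPLE).items():
        print(f"{name}: {old} -> {new}")
```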

NovaLink and HMC Co-Management configuration

Before adding the hosts to PowerVC you still need to do the most important thing. After the installation is finished, enable the co-management mode to have a system managed by NovaLink and still connected to a Hardware Management Console:

  • Enable the powervm_mgmt_capable attribute on the Nova partition:
  • # chsyscfg -r lpar -m br-8286-41A-2166666 -i "name=nova,powervm_mgmt_capable=1"
    # lssyscfg -r lpar -m br-8286-41A-2166666 -F name,powervm_mgmt_capable --filter "lpar_names=nova"
    nova,1
    
  • Enable co-management (please note here that you have to setmaster first (you’ll see that the curr_master_name is the HMC) and then relmaster (you’ll see that the curr_master_name is the NovaLink partition; this is the state we want to be in)):
  • # lscomgmt -m br-8286-41A-2166666
    is_master=null
    # chcomgmt -m br-8286-41A-2166666 -o setmaster -t norm --terms agree
    # lscomgmt -m br-8286-41A-2166666
    is_master=1,curr_master_name=myhmc1,curr_master_mtms=7042-CR8*2166666,curr_master_type=norm,pend_master_mtms=none
    # chcomgmt -m br-8286-41A-2166666 -o relmaster
    # lscomgmt -m br-8286-41A-2166666
    is_master=0,curr_master_name=nova,curr_master_mtms=3*8286-41A*2166666,curr_master_type=norm,pend_master_mtms=none
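When you convert several hosts, you may want to verify the final state automatically. This is a minimal sketch that parses the lscomgmt output shown above and checks that the NovaLink partition is the current master. It assumes plain comma-separated name=value pairs without quoted values (true for the outputs above); the function names are hypothetical.

```python
def parse_attrs(line):
    """'a=1,b=2' -> {'a': '1', 'b': '2'} (no quoted values expected)."""
    return dict(item.split("=", 1) for item in line.strip().split(","))


def novalink_is_master(line, novalink_name):
    """True when the HMC released the master role to the named partition."""
    attrs = parse_attrs(line)
    return (attrs.get("is_master") == "0"
            and attrs.get("curr_master_name") == novalink_name)


# The two lscomgmt outputs shown above, after setmaster and after relmaster.
AFTER_SETMASTER = ("is_master=1,curr_master_name=myhmc1,"
                   "curr_master_mtms=7042-CR8*2166666,"
                   "curr_master_type=norm,pend_master_mtms=none")
AFTER_RELMASTER = ("is_master=0,curr_master_name=nova,"
                   "curr_master_mtms=3*8286-41A*2166666,"
                   "curr_master_type=norm,pend_master_mtms=none")

if __name__ == "__main__":
    print(novalink_is_master(AFTER_SETMASTER, "nova"))
    print(novalink_is_master(AFTER_RELMASTER, "nova"))
```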
    

Going back to HMC managed system

You can go back to a Hardware Management Console managed system whenever you want (set the master to the HMC, delete the nova partition and release the master from the HMC).

# chcomgmt -m br-8286-41A-2166666 -o setmaster -t norm --terms agree
# lscomgmt -m br-8286-41A-2166666
is_master=1,curr_master_name=myhmc1,curr_master_mtms=7042-CR8*2166666,curr_master_type=norm,pend_master_mtms=none
# chlparstate -o shutdown -m br-8286-41A-2166666 --id 9 --immed
# rmsyscfg -r lpar -m br-8286-41A-2166666 --id 9
# chcomgmt -o relmaster -m br-8286-41A-2166666
# lscomgmt -m br-8286-41A-2166666
is_master=0,curr_master_mtms=none,curr_master_type=none,pend_master_mtms=none

Using NovaLink

After the installation you can log in to the NovaLink partition (you can gain root access with the “sudo su -” command). A new command called pvmctl is available on the NovaLink partition, allowing you to perform any action (stop or start a virtual machine, list the Virtual I/O Servers, ….). Before trying to add the host, double check that the pvmctl command works.

padmin@nova:~$ pvmctl lpar list
Logical Partitions
+------+----+---------+-----------+---------------+------+-----+-----+
| Name | ID |  State  |    Env    |    Ref Code   | Mem  | CPU | Ent |
+------+----+---------+-----------+---------------+------+-----+-----+
| nova | 3  | running | AIX/Linux | Linux ppc64le | 8192 |  2  | 0.5 |
+------+----+---------+-----------+---------------+------+-----+-----+
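The pvmctl tables are easy to turn into structured data if you want to script this sanity check. Here is a small sketch that parses the ASCII table shown above into a list of dictionaries; the function name parse_table is hypothetical, and note that wider tables may truncate cells (values ending with “>”), which this sketch does not try to recover.

```python
# Table copied from the pvmctl lpar list output shown above.
SAMPLE = """\
Logical Partitions
+------+----+---------+-----------+---------------+------+-----+-----+
| Name | ID |  State  |    Env    |    Ref Code   | Mem  | CPU | Ent |
+------+----+---------+-----------+---------------+------+-----+-----+
| nova | 3  | running | AIX/Linux | Linux ppc64le | 8192 |  2  | 0.5 |
+------+----+---------+-----------+---------------+------+-----+-----+
"""


def parse_table(text):
    """Return one dict per data row, keyed by the header cells."""
    rows = [line.strip() for line in text.splitlines()
            if line.strip().startswith("|")]
    cells = [[cell.strip() for cell in row.strip("|").split("|")]
             for row in rows]
    if not cells:
        return []
    header, data = cells[0], cells[1:]
    return [dict(zip(header, row)) for row in data]


if __name__ == "__main__":
    for lpar in parse_table(SAMPLE):
        print(lpar["Name"], lpar["State"], lpar["Ent"])
```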

Adding hosts

On the PowerVC side add the NovaLink host by choosing the NovaLink option:

addhostnovalink

Some deb packages (ibmpowervc-powervm) will be installed and configured on the NovaLink machine:

addhostnovalink3
addhostnovalink4

After doing this you can check that a nova-compute process is running on each NovaLink machine (adding the host installed and configured the deb packages on the NovaLink host):

# ps -ef | grep nova
nova      4392     1  1 10:28 ?        00:00:07 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /var/log/nova/nova-compute.log
root      5218  5197  0 10:39 pts/1    00:00:00 grep --color=auto nova
# grep host_display_name /etc/nova/nova.conf
host_display_name = XXXX-8286-41A-XXXX
# tail -1 /var/log/apt/history.log
Start-Date: 2016-01-18  10:27:54
Commandline: /usr/bin/apt-get -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confold -y install --force-yes --allow-unauthenticated ibmpowervc-powervm
Install: python-keystoneclient:ppc64el (1.6.0-2.ibm.ubuntu1, automatic), python-oslo.reports:ppc64el (0.1.0-1.ibm.ubuntu1, automatic), ibmpowervc-powervm:ppc64el (1.3.0.1), python-ceilometer:ppc64el (5.0.0-201511171217.ibm.ubuntu1.199, automatic), ibmpowervc-powervm-compute:ppc64el (1.3.0.1, automatic), nova-common:ppc64el (12.0.0-201511171221.ibm.ubuntu1.213, automatic), python-oslo.service:ppc64el (0.11.0-2.ibm.ubuntu1, automatic), python-oslo.rootwrap:ppc64el (2.0.0-1.ibm.ubuntu1, automatic), python-pycadf:ppc64el (1.1.0-1.ibm.ubuntu1, automatic), python-nova:ppc64el (12.0.0-201511171221.ibm.ubuntu1.213, automatic), python-keystonemiddleware:ppc64el (2.4.1-2.ibm.ubuntu1, automatic), python-kafka:ppc64el (0.9.3-1.ibm.ubuntu1, automatic), ibmpowervc-powervm-monitor:ppc64el (1.3.0.1, automatic), ibmpowervc-powervm-oslo:ppc64el (1.3.0.1, automatic), neutron-common:ppc64el (7.0.0-201511171221.ibm.ubuntu1.280, automatic), python-os-brick:ppc64el (0.4.0-1.ibm.ubuntu1, automatic), python-tooz:ppc64el (1.22.0-1.ibm.ubuntu1, automatic), ibmpowervc-powervm-ras:ppc64el (1.3.0.1, automatic), networking-powervm:ppc64el (1.0.0.0-151109-25, automatic), neutron-plugin-ml2:ppc64el (7.0.0-201511171221.ibm.ubuntu1.280, automatic), python-ceilometerclient:ppc64el (1.5.0-1.ibm.ubuntu1, automatic), python-neutronclient:ppc64el (2.6.0-1.ibm.ubuntu1, automatic), python-oslo.middleware:ppc64el (2.8.0-1.ibm.ubuntu1, automatic), python-cinderclient:ppc64el (1.3.1-1.ibm.ubuntu1, automatic), python-novaclient:ppc64el (2.30.1-1.ibm.ubuntu1, automatic), python-nova-ibm-ego-resource-optimization:ppc64el (2015.1-201511110358, automatic), python-neutron:ppc64el (7.0.0-201511171221.ibm.ubuntu1.280, automatic), nova-compute:ppc64el (12.0.0-201511171221.ibm.ubuntu1.213, automatic), nova-powervm:ppc64el (1.0.0.1-151203-215, automatic), openstack-utils:ppc64el (2015.2.0-201511171223.ibm.ubuntu1.18, automatic), ibmpowervc-powervm-network:ppc64el (1.3.0.1, automatic), python-oslo.policy:ppc64el 
(0.5.0-1.ibm.ubuntu1, automatic), python-oslo.db:ppc64el (2.4.1-1.ibm.ubuntu1, automatic), python-oslo.versionedobjects:ppc64el (0.9.0-1.ibm.ubuntu1, automatic), python-glanceclient:ppc64el (1.1.0-1.ibm.ubuntu1, automatic), ceilometer-common:ppc64el (5.0.0-201511171217.ibm.ubuntu1.199, automatic), openstack-i18n:ppc64el (2015.2-3.ibm.ubuntu1, automatic), python-oslo.messaging:ppc64el (2.1.0-2.ibm.ubuntu1, automatic), python-swiftclient:ppc64el (2.4.0-1.ibm.ubuntu1, automatic), ceilometer-powervm:ppc64el (1.0.0.0-151119-44, automatic)
End-Date: 2016-01-18  10:28:00

The command line interface

You can do ALL the things you were doing on the HMC using the pvmctl command. The syntax is pretty simple: pvmctl |OBJECT| |ACTION| where the OBJECT can be vios, vm, vea (virtual ethernet adapter), vswitch, lu (logical unit), and so on, and the ACTION can be list, delete, create or update. Here are a few examples :

  • List the Virtual I/O Servers:
  • # pvmctl vios list
    Virtual I/O Servers
    +--------------+----+---------+----------+------+-----+-----+
    |     Name     | ID |  State  | Ref Code | Mem  | CPU | Ent |
    +--------------+----+---------+----------+------+-----+-----+
    | s00ia9940825 | 1  | running |          | 8192 |  2  | 0.2 |
    | s00ia9940826 | 2  | running |          | 8192 |  2  | 0.2 |
    +--------------+----+---------+----------+------+-----+-----+
    
  • List the partitions (note the -d for display-fields allowing me to print some attributes):
  • # pvmctl vm list
    Logical Partitions
    +----------+----+----------+----------+----------+-------+-----+-----+
    |   Name   | ID |  State   |   Env    | Ref Code |  Mem  | CPU | Ent |
    +----------+----+----------+----------+----------+-------+-----+-----+
    | aix72ca> | 3  | not act> | AIX/Lin> | 00000000 |  2048 |  1  | 0.1 |
    |   nova   | 4  | running  | AIX/Lin> | Linux p> |  8192 |  2  | 0.5 |
    | s00vl99> | 5  | running  | AIX/Lin> | Linux p> | 10240 |  2  | 0.2 |
    | test-59> | 6  | not act> | AIX/Lin> | 00000000 |  2048 |  1  | 0.1 |
    +----------+----+----------+----------+----------+-------+-----+-----+
    # pvmctl vm list -d name id
    [..]
    # pvmctl vm list -i id=4 --display-fields LogicalPartition.name
    name=aix72-1-d3707953-00000090
    # pvmctl vm list  --display-fields LogicalPartition.name LogicalPartition.id LogicalPartition.srr_enabled SharedProcessorConfiguration.desired_virtual SharedProcessorConfiguration.uncapped_weight
    name=aix72capture,id=3,srr_enabled=False,desired_virtual=1,uncapped_weight=64
    name=nova,id=4,srr_enabled=False,desired_virtual=2,uncapped_weight=128
    name=s00vl9940243,id=5,srr_enabled=False,desired_virtual=2,uncapped_weight=128
    name=test-5925058d-0000008d,id=6,srr_enabled=False,desired_virtual=1,uncapped_weight=128
    
  • Delete the virtual adapter on the partition named nova (note the --parent-id to select the partition) with a certain uuid which was found with (pvmctl vea list):
  • # pvmctl vea delete --parent-id name=nova --object-id uuid=fe7389a8-667f-38ca-b61e-84c94e5a3c97
    
  • Power off the lpar named aix72-2:
  • # pvmctl vm power-off -i name=aix72-2-536bf0f8-00000091
    Powering off partition aix72-2-536bf0f8-00000091, this may take a few minutes.
    Partition aix72-2-536bf0f8-00000091 power-off successful.
    
  • Delete the lpar named aix72-2:
  • # pvmctl vm delete -i name=aix72-2-536bf0f8-00000091
    
  • Delete the vswitch named MGMTVSWITCH:
  • # pvmctl vswitch delete -i name=MGMTVSWITCH
    
  • Open a console:
  • #  mkvterm --id 4
    vterm for partition 4 is active.  Press Control+] to exit.
    |
    Elapsed time since release of system processors: 57014 mins 10 secs
    [..]
    
  • Power on an lpar:
  • # pvmctl vm power-on -i name=aix72capture
    Powering on partition aix72capture, this may take a few minutes.
    Partition aix72capture power-on successful.
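
The comma-separated output of --display-fields is easy to post-process in a script. Here is a minimal Python sketch, assuming each output line is a flat list of key=value pairs with no comma inside a value (true for the sample shown earlier, not something I have checked for every field). The function name parse_display_fields is made up for this example.

```python
def parse_display_fields(output):
    """Parse `pvmctl ... --display-fields` style output into a list of dicts.

    Assumption: one object per line, fields separated by commas,
    each field written as key=value, and no comma inside a value.
    """
    rows = []
    for line in output.splitlines():
        line = line.strip()
        if "=" not in line:
            continue  # skip blank lines and anything that is not key=value
        rows.append(dict(field.split("=", 1) for field in line.split(",")))
    return rows


sample = """\
name=aix72capture,id=3,srr_enabled=False,desired_virtual=1,uncapped_weight=64
name=nova,id=4,srr_enabled=False,desired_virtual=2,uncapped_weight=128
"""

for vm in parse_display_fields(sample):
    print(vm["name"], vm["id"], vm["uncapped_weight"])
```

All values come back as strings; convert them yourself if you need numbers or booleans.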
    

Is this a dream? No more RMC connectivity problems

I’m 100% sure that you regularly have problems with RMC connectivity due to firewall issues, ports not opened, and IDS blocking incoming or outgoing RMC traffic. NovaLink is THE solution that will solve all the RMC problems forever. I’m not joking, it’s a major improvement for PowerVM. As the NovaLink partition is installed on each host, it can communicate through a dedicated IPv6 link with all the partitions hosted on that host. A dedicated virtual switch called MGMTSWITCH is used to let the RMC flow transit between all the lpars and the NovaLink partition. Of course this virtual switch must be created, and one Virtual Ethernet Adapter must also be created on the NovaLink partition. These are the first two actions to do if you want to implement this solution. Before starting, here are a few things you need to know:

  • For security reasons the MGMTSWITCH must be created in VEPA mode. If you are not familiar with the VEPA and VEB modes, here is a reminder:
  • In VEB mode all the partitions connected to the same VLAN can communicate with each other. We do not want that, as it is a security issue.
  • The VEPA mode gives us the ability to isolate lpars that are on the same subnet: lpar to lpar traffic is forced out of the machine. This is what we want.
  • The PVID for this VEPA network is 4094.
  • The adapter in the NovaLink partition must be a trunk adapter.
  • It is mandatory to name the VEPA vswitch MGMTSWITCH.
  • At lpar creation, if the MGMTSWITCH exists, a new Virtual Ethernet Adapter is automatically created on the deployed lpar.
  • To be correctly configured the deployed lpar needs the latest level of rsct code (3.2.1.0 for now).
  • The latest cloud-init version must be deployed on the captured lpar used to make the image.
  • You don’t need to configure any address on this adapter: on the deployed lpars the adapter is configured with a link-local address (the IPv6 equivalent of the 169.254.0.0/16 addresses used in IPv4). Please note that any IPv6 adapter must “by design” have a link-local address.

mgmtswitch2

  • Create the virtual switch called MGMTSWITCH in Vepa mode:
  • # pvmctl vswitch create --name MGMTSWITCH --mode=Vepa
    # pvmctl vswitch list  --display-fields VirtualSwitch.name VirtualSwitch.mode 
    name=ETHERNET0,mode=Veb
    name=vdct,mode=Veb
    name=vdcb,mode=Veb
    name=vdca,mode=Veb
    name=MGMTSWITCH,mode=Vepa
    
  • Create a virtual ethernet adapter on the NovaLink partition with the PVID 4094 and a trunk priority set to 1 (it’s a trunk adapter). Note that we now have two adapters on the NovaLink partition, one in IPv4 (routable) and the other one in IPv6 (non-routable):
  • # pvmctl vea create --pvid 4094 --vswitch MGMTSWITCH --trunk-pri 1 --parent-id name=nova
    # pvmctl vea list --parent-id name=nova
    --------------------------
    | VirtualEthernetAdapter |
    --------------------------
      is_tagged_vlan_supported=False
      is_trunk=False
      loc_code=U8286.41A.216666-V3-C2
      mac=EE3B84FD1402
      pvid=666
      slot=2
      uuid=05a91ab4-9784-3551-bb4b-9d22c98934e6
      vswitch_id=1
    --------------------------
    | VirtualEthernetAdapter |
    --------------------------
      is_tagged_vlan_supported=True
      is_trunk=True
      loc_code=U8286.41A.216666-V3-C34
      mac=B6F837192E63
      pvid=4094
      slot=34
      trunk_pri=1
      uuid=fe7389a8-667f-38ca-b61e-84c94e5a3c97
      vswitch_id=4
    

    Configure the link-local IPv6 address in the NovaLink partition:

    # more /etc/network/interfaces
    [..]
    auto eth1
    iface eth1 inet manual
     up /sbin/ifconfig eth1 0.0.0.0
    # ifup eth1
    # ifconfig eth1
    eth1      Link encap:Ethernet  HWaddr b6:f8:37:19:2e:63
              inet6 addr: fe80::b4f8:37ff:fe19:2e63/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:17 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:0 (0.0 B)  TX bytes:1454 (1.4 KB)
              Interrupt:34
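
The fe80 address in the ifconfig output is not random: it lines up with the adapter MAC address using the standard modified EUI-64 scheme (flip the universal/local bit of the first byte, insert ff:fe in the middle). Here is a small Python sketch of that derivation, handy to predict the address an adapter will get from the mac value printed by pvmctl vea list. I am assuming the operating system uses EUI-64 and not privacy or stable-random identifiers; link_local_from_mac is a name made up for this example.

```python
import ipaddress


def link_local_from_mac(mac):
    """Build the fe80::/64 link-local address derived from a MAC (modified EUI-64)."""
    digits = mac.replace(":", "").replace("-", "")
    if len(digits) != 12:
        raise ValueError("expected a 48-bit MAC address, got %r" % mac)
    octets = bytearray.fromhex(digits)
    octets[0] ^= 0x02  # flip the universal/local bit
    # Interface identifier: first 3 bytes, ff:fe, last 3 bytes
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    # fe80 followed by 48 zero bits, then the 64-bit interface identifier
    return ipaddress.IPv6Address(b"\xfe\x80" + b"\x00" * 6 + interface_id)


# MAC of the trunk adapter created on the NovaLink partition
print(link_local_from_mac("B6F837192E63"))
```

With the MAC of the NovaLink trunk adapter this gives back the address seen on eth1.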
    

Capture an AIX host with rsct 3.2.1.0 or later and the latest version of cloud-init installed. This version of RMC/rsct handles this new feature, so it is mandatory to have it installed on the captured host. When PowerVC deploys a Virtual Machine on a NovaLink managed host with this version of rsct installed, a new adapter with the PVID 4094 in the virtual switch MGMTSWITCH is created, and all the RMC traffic then uses this adapter instead of your public IP address:

# lslpp -L rsct*
  Fileset                      Level  State  Type  Description (Uninstaller)
  ----------------------------------------------------------------------------
  rsct.core.auditrm          3.2.1.0    C     F    RSCT Audit Log Resource
                                                   Manager
  rsct.core.errm             3.2.1.0    C     F    RSCT Event Response Resource
                                                   Manager
  rsct.core.fsrm             3.2.1.0    C     F    RSCT File System Resource
                                                   Manager
  rsct.core.gui              3.2.1.0    C     F    RSCT Graphical User Interface
  rsct.core.hostrm           3.2.1.0    C     F    RSCT Host Resource Manager
  rsct.core.lprm             3.2.1.0    C     F    RSCT Least Privilege Resource
                                                   Manager
  rsct.core.microsensor      3.2.1.0    C     F    RSCT MicroSensor Resource
                                                   Manager
  rsct.core.rmc              3.2.1.1    C     F    RSCT Resource Monitoring and
                                                   Control
  rsct.core.sec              3.2.1.0    C     F    RSCT Security
  rsct.core.sensorrm         3.2.1.0    C     F    RSCT Sensor Resource Manager
  rsct.core.sr               3.2.1.0    C     F    RSCT Registry
  rsct.core.utils            3.2.1.1    C     F    RSCT Utilities

When this image is deployed, a new adapter is created in the MGMTSWITCH virtual switch and an IPv6 link-local address is configured on it. You can check the cloud-init activation to see that the IPv6 address is configured at activation time:

# pvmctl vea list --parent-id name=aix72-2-0a0de5c5-00000095
--------------------------
| VirtualEthernetAdapter |
--------------------------
  is_tagged_vlan_supported=True
  is_trunk=False
  loc_code=U8286.41A.216666-V5-C32
  mac=FA620F66FF20
  pvid=3331
  slot=32
  uuid=7f1ec0ab-230c-38af-9325-eb16999061e2
  vswitch_id=1
--------------------------
| VirtualEthernetAdapter |
--------------------------
  is_tagged_vlan_supported=True
  is_trunk=False
  loc_code=U8286.41A.216666-V5-C33
  mac=46A066611B09
  pvid=4094
  slot=33
  uuid=560c67cd-733b-3394-80f3-3f2a02d1cb9d
  vswitch_id=4
# ifconfig -a
en0: flags=1e084863,14c0
        inet 10.10.66.66 netmask 0xffffff00 broadcast 10.14.33.255
         tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
en1: flags=1e084863,14c0
        inet6 fe80::c032:52ff:fe34:6e4f/64
         tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
sit0: flags=8100041
        inet6 ::10.10.66.66/96
[..]

Note that the link-local address (the addresses starting with fe80) is configured at activation time:

# more /var/log/cloud-init-output.log
[..]
auto eth1

iface eth1 inet6 static
    address fe80::c032:52ff:fe34:6e4f
    hwaddress ether c2:32:52:34:6e:4f
    netmask 64
    pre-up [ $(ifconfig eth1 | grep -o -E '([[:xdigit:]]{1,2}:){5}[[:xdigit:]]{1,2}') = "c2:32:52:34:6e:4f" ]
        dns-search fr.net.intra
# entstat -d ent1 | grep -iE "switch|vlan"
Invalid VLAN ID Packets: 0
Port VLAN ID:  4094
VLAN Tag IDs:  None
Switch ID: MGMTSWITCH
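
If you want to automate this check after each deployment, the two interesting lines of the entstat output can be tested in a few lines of Python. This is only a sketch: it assumes the "Port VLAN ID" and "Switch ID" labels appear exactly as in the output above, and on_mgmt_switch is a name invented for the example.

```python
import re


def on_mgmt_switch(entstat_output, pvid=4094, switch="MGMTSWITCH"):
    """Return True if the entstat text shows the expected PVID and virtual switch."""
    pvid_match = re.search(r"^[ \t]*Port VLAN ID:\s*(\d+)", entstat_output, re.MULTILINE)
    switch_match = re.search(r"^[ \t]*Switch ID:\s*(\S+)", entstat_output, re.MULTILINE)
    if pvid_match is None or switch_match is None:
        return False  # labels not found: not a virtual adapter, or a different format
    return int(pvid_match.group(1)) == pvid and switch_match.group(1) == switch


sample = """\
Invalid VLAN ID Packets: 0
Port VLAN ID:  4094
VLAN Tag IDs:  None
Switch ID: MGMTSWITCH
"""
print(on_mgmt_switch(sample))
```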

To be sure everything is working correctly, here is a proof test. I’m taking down the en0 interface on which the IPv4 public address is configured, then I’m launching a tcpdump on en1 (the MGMTSWITCH adapter). Finally I’m resizing the Virtual Machine with PowerVC. AND EVERYTHING IS WORKING GREAT !!!! AWESOME !!! :-) (note the fe80 to fe80 communication):

# ifconfig en0 down detach ; tcpdump -i en1 port 657
tcpdump: WARNING: en1: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en1, link-type 1, capture size 96 bytes
22:00:43.224964 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: S 4049792650:4049792650(0) win 65535 
22:00:43.225022 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: S 2055569200:2055569200(0) ack 4049792651 win 28560 
22:00:43.225051 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: . ack 1 win 32844 
22:00:43.225547 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: P 1:209(208) ack 1 win 32844 
22:00:43.225593 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: . ack 209 win 232 
22:00:43.225638 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: P 1:97(96) ack 209 win 232 
22:00:43.225721 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: P 209:377(168) ack 97 win 32844 
22:00:43.225835 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: P 97:193(96) ack 377 win 240 
22:00:43.225910 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: P 377:457(80) ack 193 win 32844 
22:00:43.226076 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: P 193:289(96) ack 457 win 240 
22:00:43.226154 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: P 457:529(72) ack 289 win 32844 
22:00:43.226210 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: P 289:385(96) ack 529 win 240 
22:00:43.226276 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: P 529:681(152) ack 385 win 32844 
22:00:43.226335 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.32819: P 385:481(96) ack 681 win 249 
22:00:43.424049 IP6 fe80::9850:f6ff:fe9c:5739.32819 > fe80::d09e:aff:fecf:a868.rmc: . ack 481 win 32844 
22:00:44.725800 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.rmc: UDP, length 88
22:00:44.726111 IP6 fe80::9850:f6ff:fe9c:5739.rmc > fe80::d09e:aff:fecf:a868.rmc: UDP, length 88
22:00:50.137605 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.rmc: UDP, length 632
22:00:50.137900 IP6 fe80::9850:f6ff:fe9c:5739.rmc > fe80::d09e:aff:fecf:a868.rmc: UDP, length 88
22:00:50.183108 IP6 fe80::9850:f6ff:fe9c:5739.rmc > fe80::d09e:aff:fecf:a868.rmc: UDP, length 408
22:00:51.683382 IP6 fe80::9850:f6ff:fe9c:5739.rmc > fe80::d09e:aff:fecf:a868.rmc: UDP, length 408
22:00:51.683661 IP6 fe80::d09e:aff:fecf:a868.rmc > fe80::9850:f6ff:fe9c:5739.rmc: UDP, length 88

To be sure the security requirements are met, from the lpar I’m first pinging the NovaLink host, which is answering, and then I’m pinging a second lpar, which is not answering (and this is what we want !!!).

# ping fe80::d09e:aff:fecf:a868
PING fe80::d09e:aff:fecf:a868 (fe80::d09e:aff:fecf:a868): 56 data bytes
64 bytes from fe80::d09e:aff:fecf:a868: icmp_seq=0 ttl=64 time=0.203 ms
64 bytes from fe80::d09e:aff:fecf:a868: icmp_seq=1 ttl=64 time=0.206 ms
64 bytes from fe80::d09e:aff:fecf:a868: icmp_seq=2 ttl=64 time=0.216 ms
^C
--- fe80::d09e:aff:fecf:a868 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0/0/0 ms
# ping fe80::44a0:66ff:fe61:1b09
PING fe80::44a0:66ff:fe61:1b09 (fe80::44a0:66ff:fe61:1b09): 56 data bytes
^C
--- fe80::44a0:66ff:fe61:1b09 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

PowerVC 1.3.0.1 Dynamic Resource Optimizer

In addition to the NovaLink part of this blog post I also wanted to talk about the killer app of 2016: Dynamic Resource Optimizer. This feature can be used on any PowerVC 1.3.0.1 managed hosts (you obviously need at least two hosts). DRO is in charge of re-balancing your Virtual Machines across all the available hosts (in the host-group). To sum up, if a host is experiencing a heavy load and reaching a certain amount of CPU consumption over a period of time, DRO will move your virtual machines to re-balance the load across all the available hosts (this is done at a host level). Here are a few details about DRO:

  • The DRO configuration is done at a host level.
  • You set up a threshold (in the capture below) that must be reached to trigger the Live Partition Mobility or Mobile Cores movements (Power Enterprise Pool).
  • droo6
    droo3

  • To trigger an action this threshold must be reached a certain number of times (stabilization) over a period you define (run interval).
  • You can choose to move virtual machines using Live Partition Mobility, or to move “cores” using Power Enterprise Pool (you can do both; moving CPU will always be preferred over moving partitions).
  • DRO can be run in advise mode (nothing is done, a warning is thrown in the new DRO events tab) or in active mode (which is doing the job and moving things).
    droo2
    droo1

  • Your most critical virtual machines can be excluded from DRO:
  • droo5
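
To make the threshold / stabilization / run interval combination more concrete, here is a tiny Python model of the trigger. It is a sketch of how I understand the settings, not PowerVC code: I am assuming one CPU sample per run interval and that the threshold must be exceeded on consecutive samples, and dro_should_act is a made-up name.

```python
def dro_should_act(cpu_samples, threshold, stabilization):
    """Return True once the threshold is reached `stabilization` times in a row.

    cpu_samples: host CPU utilization (percent), one value per run interval.
    Assumption: the count restarts as soon as one sample is under the threshold.
    """
    consecutive = 0
    for cpu in cpu_samples:
        if cpu >= threshold:
            consecutive += 1
            if consecutive >= stabilization:
                return True
        else:
            consecutive = 0
    return False


# 70% threshold, stabilization of 3: a single spike is ignored, a sustained load is not
print(dro_should_act([60, 95, 50, 55], threshold=70, stabilization=3))
print(dro_should_act([60, 75, 80, 72], threshold=70, stabilization=3))
```

The point of the stabilization value is visible here: a short spike does not trigger any mobility operation, only a load that lasts does.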

How is DRO choosing which machines are moved

I have been running DRO in production for a month now, and I had the time to check what is going on behind the scenes. How is DRO choosing which machines are moved when a Live Partition Mobility operation must be run to face a heavy load on a host? To find out I decided to launch 3 different cpuhog processes (16 forks, 4VP, SMT4), which are eating CPU resources, on three different lpars with 4VP each. In PowerVC I can check that before launching these processes the CPU consumption is ok on this host (the three lpars are running on the same host):

droo4

# cat cpuhog.pl
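#!/usr/bin/perl
# Burn CPU: the parent forks 16 children, then every process
# (parent included) spins in an endless loop.
use strict;
use warnings;

print "eating the CPUs\n";

foreach my $i (1..16) {
      my $pid = fork();
      die "fork failed: $!\n" unless defined $pid;
      last if $pid == 0;    # child: stop forking and go spin
      print "created PID $pid\n";
}

my $x = 0;
while (1) {
      $x++;                 # busy loop, never exits
}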
# perl cpuhog.pl
eating the CPUs
created PID 47514604
created PID 22675712
created PID 3015584
created PID 21496152
created PID 25166098
created PID 26018068
created PID 11796892
created PID 33424106
created PID 55444462
created PID 65077976
created PID 13369620
created PID 10813734
created PID 56623850
created PID 19333542
created PID 58393312
created PID 3211988

I’m waiting a couple of minutes and I realize that the virtual machines on which the cpuhog processes were launched are the ones which are migrated. So we can say that PowerVC is moving the machines that are eating CPU (another strategy could be to move all the machines that are not eating CPU, to let the working ones do their job without going through a mobility operation).

# errpt | head -3
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
A5E6DB96   0118225116 I S pmig           Client Partition Migration Completed
08917DC6   0118225116 I S pmig           Client Partition Migration Started
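
The behaviour I observed can be summed up with a toy model: take the busiest virtual machines first and move them until the host is back under the threshold. To be clear, this is only an illustration of what I saw, not the algorithm used by PowerVC (which I don’t know); the function name, the capacity and the CPU figures are invented for the example.

```python
def pick_vms_to_move(vm_cpu, host_capacity, threshold_pct):
    """Pick the busiest VMs first until the host is back under the threshold.

    vm_cpu: dict of VM name -> cores consumed (hypothetical figures).
    Returns the names of the VMs to move, busiest first.
    """
    used = sum(vm_cpu.values())
    limit = host_capacity * threshold_pct / 100.0
    moved = []
    for name, cpu in sorted(vm_cpu.items(), key=lambda item: item[1], reverse=True):
        if used <= limit:
            break  # host is back under the threshold, stop moving
        moved.append(name)
        used -= cpu
    return moved


vms = {"hog1": 4.0, "hog2": 4.0, "idle1": 0.2, "idle2": 0.1}
print(pick_vms_to_move(vms, host_capacity=10, threshold_pct=70))
```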

After the moves are done I can see that the load is now ok on the host. DRO has done the job for me and moved the lpars to meet the configured threshold ;-)

droo7dro_effect

The images below will show you a good example of the “power” of PowerVC and DRO. To update my Virtual I/O Servers to the latest version, the PowerVC maintenance mode was used to free up the Virtual I/O Servers. After leaving the maintenance mode DRO did the job of re-balancing the Virtual Machines across all the hosts (the red arrows symbolize the maintenance mode actions and the purple ones the DRO actions). You can also see that some lpars were moved across 4 different hosts during this process. All these pictures are taken from real life experience on my production systems. This is not a lab environment, this is one part of my production. So yes, DRO and PowerVC 1.3.0.1 are production ready. Hell yes!

real1
real2
real3
real4
real5

Conclusion

As my environment is growing bigger, the next step for me will be to move to NovaLink on my P8 hosts. Please note that the NovaLink Co-Management feature is today a “TechPreview” but should be released GA very soon. Talking about DRO, I was waiting for that for years and it finally happened. I can assure you that it is production ready; to prove this I’ll just give you this number: to upgrade my Virtual I/O Servers to the 2.2.4.10 release using PowerVC maintenance mode and DRO, more than 1000 Live Partition Mobility moves were performed without any outage, on production servers and during working hours. Nobody in my company was aware of this during the operations. It was a seamless experience for everybody.

What’s new in VIOS 2.2.4.10 and PowerVM : Part 1 Virtual I/O Server Rules

I will post a series of mini blog posts about the new features of PowerVM and Virtual I/O Server that are released this month. By this I mean Hardware Management Console 840 + Power firmware 840 + Virtual I/O Server 2.2.4.10. As writing blog posts is not a part of my job and I’m doing that in my spare time, some of the topics I will talk about have already been covered by other AIX bloggers, but I think the more material we have the better it is. Other ones, like this first one, will be new to you. So please accept my apologies if the topics are not what I’m calling “0 day” (the day of release). Anyway, writing things down helps me to understand them better, and I add little details I have not seen in other blog posts or in the official documentation. Last point: in these mini posts I will always try to give you something new, at least my point of view as an IBM customer. I hope it will be useful for you.

The first topic I want to talk about is Virtual I/O Server Rules. With the latest version three new commands called “rules”, “rulescfgset” and “rulesdeploy” are now available on the Virtual I/O Servers. They help you configure your device attributes by creating, deploying, or checking rules (against the current configuration). I’m 100% sure that every time you are installing a Virtual I/O Server you are doing the same thing over and over again: you check your buffer attributes, you check the attributes of the fibre channel adapters, and so on. The rules are a way to be sure everything is the same on all your Virtual I/O Servers (you can create a rule file (xml format) that can be deployed on every Virtual I/O Server you install). Even better, if you are a PowerVC user like me you want to be sure that any new device created by PowerVC is created with the attributes you want (for instance the buffers of Virtual Ethernet Adapters). In the “old days” you had to use the chdef command; you can now do this by using the rules. Rather than giving you a list of commands I’ll show you here what I’m now doing on my Virtual I/O Servers in 2.2.4.10.

Creating and modifying existing default rules

Before starting, here is a (non-exhaustive) list of the attributes I’m changing on all my Virtual I/O Servers at deploy time. I now want to do that using the rules (these are just examples, you can do much more using the rules):

  • On fcs Adapters I’m changing the max_xfer_size attribute to 0x200000.
  • On fcs Adapters I’m changing the num_cmd_elems attribute to 2048.
  • On fscsi Devices I’m changing the dyntrk attribute to yes.
  • On fscsi Devices I’m changing the fc_err_recov to fast_fail.
  • On Virtual Ethernet Adapters I’m changing the max_buf_tiny attribute to 4096.
  • On Virtual Ethernet Adapters I’m changing the min_buf_tiny attribute to 4096.
  • On Virtual Ethernet Adapters I’m changing the max_buf_small attribute to 4096.
  • On Virtual Ethernet Adapters I’m changing the min_buf_small attribute to 4096.
  • On Virtual Ethernet Adapters I’m changing the max_buf_medium attribute to 512.
  • On Virtual Ethernet Adapters I’m changing the min_buf_medium attribute to 512.
  • On Virtual Ethernet Adapters I’m changing the max_buf_large attribute to 128.
  • On Virtual Ethernet Adapters I’m changing the min_buf_large attribute to 128.
  • On Virtual Ethernet Adapters I’m changing the max_buf_huge attribute to 128.
  • On Virtual Ethernet Adapters I’m changing the min_buf_huge attribute to 128.

Modify existing attributes using rules

By default a “factory” default rule file now exists on the Virtual I/O Server. It is located in /home/padmin/rules/vios_current_rules.xml. You can check the content of the file (it’s an xml file) and list the rules contained in it:

# ls -l /home/padmin/rules
total 40
-r--r-----    1 root     system        17810 Dec 08 18:40 vios_current_rules.xml
$ oem_setup_env
# head -10 /home/padmin/rules/vios_current_rules.xml
<?xml version="1.0" encoding="UTF-8"?>
<Profile origin="get" version="3.0.0" date="2015-12-08T17:40:37Z">
 <Catalog id="devParam.disk.fcp.mpioosdisk" version="3.0">
  <Parameter name="reserve_policy" value="no_reserve" applyType="nextboot" reboot="true">
   <Target class="device" instance="disk/fcp/mpioosdisk"/>
  </Parameter>
 </Catalog>
 <Catalog id="devParam.disk.fcp.mpioapdisk" version="3.0">
  <Parameter name="reserve_policy" value="no_reserve" applyType="nextboot" reboot="true">
   <Target class="device" instance="disk/fcp/mpioapdisk"/>
[..]
$ rules -o list -d
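
As the rule file is plain XML you can also read it from a script, for instance to compare the rule files of several Virtual I/O Servers. Here is a minimal Python sketch using only the standard library. The element and attribute names (Catalog, Parameter, Target, instance, reboot) are the ones visible in the head output above; I am assuming the rest of the file follows the same layout, and list_profile_rules is a name invented for the example.

```python
import xml.etree.ElementTree as ET


def list_profile_rules(xml_text):
    """Return one dict per <Parameter> found in a rules profile."""
    root = ET.fromstring(xml_text)
    rules = []
    for catalog in root.iter("Catalog"):
        for param in catalog.iter("Parameter"):
            target = param.find("Target")
            rules.append({
                "device": target.get("instance") if target is not None else None,
                "attribute": param.get("name"),
                "value": param.get("value"),
                "reboot": param.get("reboot") == "true",
            })
    return rules


# Shortened copy of the file shown above, closed by hand so that it is valid XML
sample = """\
<Profile origin="get" version="3.0.0" date="2015-12-08T17:40:37Z">
 <Catalog id="devParam.disk.fcp.mpioosdisk" version="3.0">
  <Parameter name="reserve_policy" value="no_reserve" applyType="nextboot" reboot="true">
   <Target class="device" instance="disk/fcp/mpioosdisk"/>
  </Parameter>
 </Catalog>
</Profile>
"""

for rule in list_profile_rules(sample):
    print(rule["device"], rule["attribute"], rule["value"], rule["reboot"])
```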

Let’s now say you have an existing Virtual I/O Server with an existing SEA configured on it. You want two things by using the rules:

  • Apply the rules to modify the existing devices.
  • Be sure that new devices will be created using the rules.

For the purpose of this example we will work on the buffer attributes of a Virtual Network Adapter (the same concepts apply to other device types). So we have an SEA with Virtual Network Adapters and we want to change the buffer attributes. Let’s first check the current values on the Virtual Adapters:

$ lsdev -type adapter | grep -i Shared
ent13            Available   Shared Ethernet Adapter
$ lsdev -dev ent13 -attr virt_adapters
value

ent8,ent9,ent10,ent11
$ lsdev -dev ent8 -attr max_buf_huge,max_buf_large,max_buf_medium,max_buf_small,max_buf_tiny,min_buf_huge,min_buf_large,min_buf_medium,min_buf_small,min_buf_tiny
value

64
64
256
2048
2048
24
24
128
512
512
$ lsdev -dev ent9 -attr max_buf_huge,max_buf_large,max_buf_medium,max_buf_small,max_buf_tiny,min_buf_huge,min_buf_large,min_buf_medium,min_buf_small,min_buf_tiny
value

64
64
256
2048
2048
24
24
128
512
512
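
The lsdev output above is just a column of values, which is not very readable when you ask for ten attributes at once. A small Python helper can put the names back next to the values. This is a sketch assuming that lsdev prints one value per requested attribute, in the order of the -attr list (this is what the output above looks like); parse_lsdev_attrs is a made-up name.

```python
def parse_lsdev_attrs(attr_names, output):
    """Map the attribute names passed to `lsdev -attr` to the printed values."""
    values = [line.strip() for line in output.splitlines() if line.strip()]
    if values and values[0] == "value":
        values = values[1:]  # drop the header line
    if len(values) != len(attr_names):
        raise ValueError("got %d values for %d attributes" % (len(values), len(attr_names)))
    return dict(zip(attr_names, values))


attrs = ("max_buf_huge,max_buf_large,max_buf_medium,max_buf_small,max_buf_tiny,"
         "min_buf_huge,min_buf_large,min_buf_medium,min_buf_small,min_buf_tiny").split(",")
output = "value\n\n64\n64\n256\n2048\n2048\n24\n24\n128\n512\n512\n"

for name, value in parse_lsdev_attrs(attrs, output).items():
    print(name, value)
```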

Let’s now check the values in the current Virtual I/O Server rules:

$ rules -o list | grep buf
adapter/vdevice/IBM,l-lan      max_buf_tiny         2048
adapter/vdevice/IBM,l-lan      min_buf_tiny         512
adapter/vdevice/IBM,l-lan      max_buf_small        2048
adapter/vdevice/IBM,l-lan      min_buf_small        512

For the tiny and small buffers I can change the rules easily using the rules command (using the modify operation):

$ rules -o modify -t adapter/vdevice/IBM,l-lan -a max_buf_tiny=4096
$ rules -o modify -t adapter/vdevice/IBM,l-lan -a min_buf_tiny=4096
$ rules -o modify -t adapter/vdevice/IBM,l-lan -a max_buf_small=4096
$ rules -o modify -t adapter/vdevice/IBM,l-lan -a min_buf_small=4096

I’m re-running the rules command to check that the rules are now modified:

$ rules -o list | grep buf
adapter/vdevice/IBM,l-lan      max_buf_tiny         4096
adapter/vdevice/IBM,l-lan      min_buf_tiny         4096
adapter/vdevice/IBM,l-lan      max_buf_small        4096
adapter/vdevice/IBM,l-lan      min_buf_small        4096

I can check the current values of my system against the current defined rules by using the diff operation:

# rules -o diff -s
devParam.adapter.vdevice.IBM,l-lan:max_buf_tiny device=adapter/vdevice/IBM,l-lan    2048 | 4096
devParam.adapter.vdevice.IBM,l-lan:min_buf_tiny device=adapter/vdevice/IBM,l-lan     512 | 4096
devParam.adapter.vdevice.IBM,l-lan:max_buf_small device=adapter/vdevice/IBM,l-lan   2048 | 4096
devParam.adapter.vdevice.IBM,l-lan:min_buf_small device=adapter/vdevice/IBM,l-lan    512 | 4096
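
The diff output is regular enough to be parsed, which is useful if you want to report configuration drift on many Virtual I/O Servers. Here is a minimal Python sketch; it assumes every line has the catalog:attribute, device=..., current value | rule value layout seen above (values without spaces), and parse_rules_diff is a name invented for the example.

```python
import re

# catalog:attribute  device=<type>  <current value> | <rule value>
DIFF_LINE = re.compile(
    r"^(?P<catalog>\S+):(?P<attribute>\S+)\s+device=(?P<device>\S+)\s+"
    r"(?P<current>\S+)\s+\|\s+(?P<rule>\S+)\s*$"
)


def parse_rules_diff(output):
    """Parse `rules -o diff -s` output into a list of dicts (unknown lines are skipped)."""
    diffs = []
    for line in output.splitlines():
        match = DIFF_LINE.match(line.strip())
        if match:
            diffs.append(match.groupdict())
    return diffs


sample = """\
devParam.adapter.vdevice.IBM,l-lan:max_buf_tiny device=adapter/vdevice/IBM,l-lan    2048 | 4096
devParam.adapter.vdevice.IBM,l-lan:min_buf_tiny device=adapter/vdevice/IBM,l-lan     512 | 4096
"""

for diff in parse_rules_diff(sample):
    print(diff["device"], diff["attribute"], diff["current"], "->", diff["rule"])
```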

Creating new attributes using rules

In the Virtual I/O Server rules shipped with the current Virtual I/O Server release there are no existing rules for the medium, large and huge buffers. Unfortunately for me I’m modifying these attributes by default and I want a rule capable of doing that. The goal is now to create a new set of rules for the buffers not already present in the default file … Let’s try to do that using the add operation:

# rules -o add -t adapter/vdevice/IBM,l-lan -a max_buf_medium=512
The rule is not supported or does not exist.

Annoying, I can’t add a rule for the medium buffer (same for the large and huge ones). The available attributes for each device are based on the current AIX ARTEX catalog. You can go through the files present in the catalog to see which attributes are available for each device type; you can see in the output below that there is nothing in the current ARTEX catalog for the medium buffer.

$ oem_setup_env
# cd /etc/security/artex/catalogs
# ls -ltr | grep l-lan
-r--r-----    1 root     security       1261 Nov 10 00:30 devParam.adapter.vdevice.IBM,l-lan.xml
# grep medium devParam.adapter.vdevice.IBM,l-lan.xml
# 
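
Instead of grepping, the same check can be scripted: a rule can only be added if a ParameterDef with the attribute name exists in the catalog file of the device type. Here is a small Python sketch. It only looks at the file itself and ignores anything inherited through inherit="devCommon" (an assumption on my side that this is enough for device attributes); the function names are made up.

```python
import xml.etree.ElementTree as ET


def catalog_parameters(xml_text):
    """Return the sorted names of the <ParameterDef> entries of an ARTEX catalog."""
    root = ET.fromstring(xml_text)
    return sorted(param.get("name") for param in root.iter("ParameterDef"))


def can_add_rule(xml_text, attribute):
    """True if the catalog defines this attribute, so a rule on it should be accepted."""
    return attribute in catalog_parameters(xml_text)


# Shortened catalog, limited to two of the definitions shipped by default
sample = """\
<Catalog id="devParam.adapter.vdevice.IBM,l-lan" version="3.0" inherit="devCommon">
  <ParameterDef name="max_buf_tiny" type="integer" targetClass="device" cfgmethod="attr" reboot="true"/>
  <ParameterDef name="min_buf_tiny" type="integer" targetClass="device" cfgmethod="attr" reboot="true"/>
</Catalog>
"""

print(catalog_parameters(sample))
print(can_add_rule(sample, "max_buf_medium"))
```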

To show that it is possible to add new rules, here is a simple example adding the ‘src_lun_val’ and ‘dest_lun_val’ attributes on the vioslpm0 device. First I check that I can add these rules by looking in the ARTEX catalog:

$ oem_setup_env
# cd /etc/security/artex/catalogs
# ls -ltr | grep lpm
-r--r-----    1 root     security       2645 Nov 10 00:30 devParam.pseudo.vios.lpm.xml
# grep -iE "src_lun_val|dest_lun_val" devParam.pseudo.vios.lpm.xml
  <ParameterDef name="dest_lun_val" type="string" targetClass="device" cfgmethod="attr" reboot="true">
  <ParameterDef name="src_lun_val" type="string" targetClass="device" cfgmethod="attr" reboot="true">

Then I’m checking the ‘range’ of authorized values for both attributes:

# lsattr -l vioslpm0 -a src_lun_val -R
on
off
# lsattr -l vioslpm0 -a dest_lun_val -R
on
off
restart_off
lpm_off

I’m searching the type using the lsdev command (here pseudo/vios/lpm):

# lsdev -P | grep lpm
pseudo         lpm             vios           VIOS LPM Adapter

I’m finally adding the rules and checking the differences:

$ rules -o add -t pseudo/vios/lpm -a src_lun_val=on
$ rules -o add -t pseudo/vios/lpm -a dest_lun_val=on
$ rules -o diff -s
devParam.adapter.vdevice.IBM,l-lan:max_buf_tiny device=adapter/vdevice/IBM,l-lan    2048 | 4096
devParam.adapter.vdevice.IBM,l-lan:min_buf_tiny device=adapter/vdevice/IBM,l-lan     512 | 4096
devParam.adapter.vdevice.IBM,l-lan:max_buf_small device=adapter/vdevice/IBM,l-lan   2048 | 4096
devParam.adapter.vdevice.IBM,l-lan:min_buf_small device=adapter/vdevice/IBM,l-lan    512 | 4096
devParam.pseudo.vios.lpm:src_lun_val device=pseudo/vios/lpm                          off | on
devParam.pseudo.vios.lpm:dest_lun_val device=pseudo/vios/lpm                 restart_off | on

But what about my buffers? Is there any possibility to add these attributes to the current ARTEX catalog? The answer is yes. By looking in the catalog used for Virtual Ethernet Adapters (file named devParam.adapter.vdevice.IBM,l-lan.xml) you will see that a message catalog file named ‘vioent.cat’ is used by this xml file. Check the content of this catalog file by using the dspcat command and find out if there is anything related to the medium, large and huge buffers (all the catalog files are located in /usr/lib/methods):

$ oem_setup_env
# cd /usr/lib/methods
# dspcat vioent.cat |grep -iE "medium|large|huge"
1 : 10 Minimum Huge Buffers
1 : 11 Maximum Huge Buffers
1 : 12 Minimum Large Buffers
1 : 13 Maximum Large Buffers
1 : 14 Minimum Medium Buffers
1 : 15 Maximum Medium Buffers

Modify the xml file located in the ARTEX catalog and add the necessary information for these three new buffer types:

$ oem_setup_env
# vi /etc/security/artex/catalogs/devParam.adapter.vdevice.IBM,l-lan.xml
<?xml version="1.0" encoding="UTF-8"?>

<Catalog id="devParam.adapter.vdevice.IBM,l-lan" version="3.0" inherit="devCommon">

  <ShortDescription><NLSCatalog catalog="vioent.cat" setNum="1" msgNum="1">Virtual I/O Ethernet Adapter (l-lan)</NLSCatalog></ShortDescription>

  <ParameterDef name="min_buf_huge" type="integer" targetClass="device" cfgmethod="attr" reboot="true">
    <Description><NLSCatalog catalog="vioent.cat" setNum="1" msgNum="10">Minimum Huge Buffers</NLSCatalog></Description>
  </ParameterDef>

  <ParameterDef name="max_buf_huge" type="integer" targetClass="device" cfgmethod="attr" reboot="true">
    <Description><NLSCatalog catalog="vioent.cat" setNum="1" msgNum="11">Maximum Huge Buffers</NLSCatalog></Description>
  </ParameterDef>

  <ParameterDef name="min_buf_large" type="integer" targetClass="device" cfgmethod="attr" reboot="true">
    <Description><NLSCatalog catalog="vioent.cat" setNum="1" msgNum="12">Minimum Large Buffers</NLSCatalog></Description>
  </ParameterDef>

  <ParameterDef name="max_buf_large" type="integer" targetClass="device" cfgmethod="attr" reboot="true">
    <Description><NLSCatalog catalog="vioent.cat" setNum="1" msgNum="13">Maximum Large Buffers</NLSCatalog></Description>
  </ParameterDef>

  <ParameterDef name="min_buf_medium" type="integer" targetClass="device" cfgmethod="attr" reboot="true">
    <Description><NLSCatalog catalog="vioent.cat" setNum="1" msgNum="14">Minimum Medium Buffers</NLSCatalog></Description>
  </ParameterDef>

  <ParameterDef name="max_buf_medium" type="integer" targetClass="device" cfgmethod="attr" reboot="true">
    <Description><NLSCatalog catalog="vioent.cat" setNum="1" msgNum="15">Maximum Medium Buffers</NLSCatalog></Description>
  </ParameterDef>

[..]
  <ParameterDef name="max_buf_tiny" type="integer" targetClass="device" cfgmethod="attr" reboot="true">
    <Description><NLSCatalog catalog="vioent.cat" setNum="1" msgNum="19">Maximum Tiny Buffers</NLSCatalog></Description>
  </ParameterDef>


Then I’m retrying to add the rules for the medium, large and huge buffers … and it’s working great:

# rules -o add -t adapter/vdevice/IBM,l-lan -a max_buf_medium=512
# rules -o add -t adapter/vdevice/IBM,l-lan -a min_buf_medium=512
# rules -o add -t adapter/vdevice/IBM,l-lan -a max_buf_huge=128
# rules -o add -t adapter/vdevice/IBM,l-lan -a min_buf_huge=128
# rules -o add -t adapter/vdevice/IBM,l-lan -a max_buf_large=128
# rules -o add -t adapter/vdevice/IBM,l-lan -a min_buf_large=128
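
Typing ten rules commands by hand on every Virtual I/O Server is error prone, so here is a small Python sketch that generates them from a dictionary of wanted values. It only builds the command strings (nothing is executed) and picks modify when a rule already exists and add otherwise, as done by hand above. The function name is made up, and the set of existing rules is something you have to provide yourself, for instance from the output of rules -o list.

```python
def rules_commands(device_type, wanted, existing):
    """Build the `rules` commands for one device type.

    wanted:   dict of attribute -> value you want in the rules.
    existing: names of the attributes that already have a rule (-> modify).
    """
    commands = []
    for attribute, value in wanted.items():
        operation = "modify" if attribute in existing else "add"
        commands.append("rules -o %s -t %s -a %s=%s" % (operation, device_type, attribute, value))
    return commands


wanted = {
    "max_buf_tiny": 4096, "min_buf_tiny": 4096,
    "max_buf_small": 4096, "min_buf_small": 4096,
    "max_buf_medium": 512, "min_buf_medium": 512,
    "max_buf_large": 128, "min_buf_large": 128,
    "max_buf_huge": 128, "min_buf_huge": 128,
}
existing = {"max_buf_tiny", "min_buf_tiny", "max_buf_small", "min_buf_small"}

for command in rules_commands("adapter/vdevice/IBM,l-lan", wanted, existing):
    print(command)
```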

Deploying the rules

Now that a couple of rules are defined let’s apply them on the Virtual I/O Server. First check the differences you will get after applying the rules by using the diff operation of the rules command:

$ rules -o diff -s
devParam.adapter.vdevice.IBM,l-lan:max_buf_tiny device=adapter/vdevice/IBM,l-lan    2048 | 4096
devParam.adapter.vdevice.IBM,l-lan:min_buf_tiny device=adapter/vdevice/IBM,l-lan     512 | 4096
devParam.adapter.vdevice.IBM,l-lan:max_buf_small device=adapter/vdevice/IBM,l-lan   2048 | 4096
devParam.adapter.vdevice.IBM,l-lan:min_buf_small device=adapter/vdevice/IBM,l-lan    512 | 4096
devParam.adapter.vdevice.IBM,l-lan:max_buf_medium device=adapter/vdevice/IBM,l-lan   256 | 512
devParam.adapter.vdevice.IBM,l-lan:min_buf_medium device=adapter/vdevice/IBM,l-lan   128 | 512
devParam.adapter.vdevice.IBM,l-lan:max_buf_huge device=adapter/vdevice/IBM,l-lan      64 | 128
devParam.adapter.vdevice.IBM,l-lan:min_buf_huge device=adapter/vdevice/IBM,l-lan      24 | 128
devParam.adapter.vdevice.IBM,l-lan:max_buf_large device=adapter/vdevice/IBM,l-lan     64 | 128
devParam.adapter.vdevice.IBM,l-lan:min_buf_large device=adapter/vdevice/IBM,l-lan     24 | 128
devParam.pseudo.vios.lpm:src_lun_val device=pseudo/vios/lpm                          off | on
devParam.pseudo.vios.lpm:dest_lun_val device=pseudo/vios/lpm                 restart_off | on
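
The diff output is easy to post-process if you want a shorter summary (for a change ticket for instance). Here is a minimal sketch with awk. The function name summarize_diff is hypothetical, and I'm assuming, judging from the output above, that the left column is the current value on the system and the right column is the value coming from the rules file.

```shell
#!/bin/sh
# Sketch: turn "rules -o diff -s" output (read on stdin) into "attr: old -> new".
# Assumption: fields are "<catalog>:<attr> device=<type> <current> | <rule value>".
summarize_diff() {
    awk '$4 == "|" {
        split($1, parts, ":")          # parts[2] is the attribute name
        printf "%s: %s -> %s\n", parts[2], $3, $5
    }'
}

# Example with two lines taken from the output above.
summarize_diff <<'EOF'
devParam.adapter.vdevice.IBM,l-lan:max_buf_tiny device=adapter/vdevice/IBM,l-lan    2048 | 4096
devParam.pseudo.vios.lpm:dest_lun_val device=pseudo/vios/lpm                 restart_off | on
EOF
```

On a real Virtual I/O Server you would feed the function with the output of rules -o diff -s saved to a file instead of the here-document.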

Let’s now deploy the rules using the deploy operation of the rules command. Notice that for some rules a reboot is mandatory to change the existing devices: this is the case for the buffers, but not for the vioslpm0 attributes. After the deploy we can check again that there are no differences left (some attributes are applied using the -P flag of the chdev command, so the new value is recorded but only becomes effective after the reboot):

$ rules -o deploy 
A manual post-operation is required for the changes to take effect, please reboot the system.
$ lsdev -dev ent8 -attr min_buf_small
value

4096
$ lsdev -dev vioslpm0 -attr src_lun_val
value

on
$ rules -o diff -s

Don’t forget to reboot the Virtual I/O Server and check that everything is OK after the reboot (check the kernel values by using entstat):

$ shutdown -force -restart
[..]
$ for i in ent8 ent9 ent10 ent11 ; do lsdev -dev $i -attr max_buf_huge,max_buf_large,max_buf_medium,max_buf_small,max_buf_tiny,min_buf_huge,min_buf_large,min_buf_medium,min_buf_small,min_buf_tiny ; done
[..]
128
128
512
4096
4096
128
128
512
4096
4096
$ entstat -all ent13 | grep -i buf
[..]
No mbuf Errors: 0
  Transmit Buffers
    Buffer Size             65536
    Buffers                    32
      No Buffers                0
  Receive Buffers
    Buffer Type              Tiny    Small   Medium    Large     Huge
    Min Buffers              4096     4096      512      128      128
    Max Buffers              4096     4096      512      128      128
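
Reading these tables by eye on every adapter gets boring quickly, so the check can be scripted. The sketch below is an assumption of mine, not something shipped with the Virtual I/O Server: buffers_match is a hypothetical helper that reads entstat output on stdin and compares every "Min Buffers" and "Max Buffers" line with the values expected from the rules (Tiny, Small, Medium, Large, Huge order).

```shell
#!/bin/sh
# Sketch: succeed only if every Min/Max Buffers line equals the expected values.
# $1: expected values, in Tiny Small Medium Large Huge order.
buffers_match() {
    awk -v want="$1" '
        ($1 == "Min" || $1 == "Max") && $2 == "Buffers" {
            seen++
            got = $3 " " $4 " " $5 " " $6 " " $7
            if (got != want) bad++
        }
        # Fail when no buffer line was found or when at least one line differs.
        END { if (seen > 0 && bad == 0) exit 0; exit 1 }'
}

# Example using the receive buffers table shown above.
if buffers_match "4096 4096 512 128 128" <<'EOF'
  Receive Buffers
    Buffer Type              Tiny    Small   Medium    Large     Huge
    Min Buffers              4096     4096      512      128      128
    Max Buffers              4096     4096      512      128      128
EOF
then
    echo "buffers match the rules"
else
    echo "buffers differ from the rules"
fi
```

The comparison on a single expected list only works here because my rules set min and max to the same values; with different bounds you would need one expected list for each line.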

For the Fibre Channel adapters I’m using these rules:

$ rules -o modify -t driver/iocb/efscsi -a dyntrk=yes
$ rules -o modify -t driver/qliocb/qlfscsi -a dyntrk=yes
$ rules -o modify -t driver/qiocb/qfscsi -a dyntrk=yes
$ rules -o modify -t driver/iocb/efscsi -a fc_err_recov=fast_fail
$ rules -o modify -t driver/qliocb/qlfscsi -a fc_err_recov=fast_fail
$ rules -o modify -t driver/qiocb/qfscsi -a fc_err_recov=fast_fail

What about new devices?

Let’s now create a new SEA by adding new Virtual Ethernet Adapters using DLPAR and check that the devices are created with the right values. (I’m not showing here how to create the VEAs; I’m doing it in the GUI for simplicity. ent14, ent15, ent16 and ent17 are the new ones):

$ lsdev | grep ent
ent12            Available   EtherChannel / IEEE 802.3ad Link Aggregation
ent13            Available   Shared Ethernet Adapter
ent14            Available   Virtual I/O Ethernet Adapter (l-lan)
ent15            Available   Virtual I/O Ethernet Adapter (l-lan)
ent16            Available   Virtual I/O Ethernet Adapter (l-lan)
ent17            Available   Virtual I/O Ethernet Adapter (l-lan)
$ lsdev -dev ent14 -attr
buf_mode        min            Receive Buffer Mode                        True
copy_buffs      32             Transmit Copy Buffers                      True
max_buf_control 64             Maximum Control Buffers                    True
max_buf_huge    128            Maximum Huge Buffers                       True
max_buf_large   128            Maximum Large Buffers                      True
max_buf_medium  512            Maximum Medium Buffers                     True
max_buf_small   4096           Maximum Small Buffers                      True
max_buf_tiny    4096           Maximum Tiny Buffers                       True
min_buf_control 24             Minimum Control Buffers                    True
min_buf_huge    128            Minimum Huge Buffers                       True
min_buf_large   128            Minimum Large Buffers                      True
min_buf_medium  512            Minimum Medium Buffers                     True
min_buf_small   4096           Minimum Small Buffers                      True
min_buf_tiny    4096           Minimum Tiny Buffers                       True
$ mkvdev -sea ent0 -vadapter ent14 ent15 ent16 ent17 -default ent14 -defaultid 14 -attr ha_mode=sharing largesend=1 large_receive=yes
ent18 Available
$ entstat -all ent18 | grep -i buf
No mbuf Errors: 0
  Transmit Buffers
    Buffer Size             65536
    Buffers                    32
      No Buffers                0
  Receive Buffers
    Buffer Type              Tiny    Small   Medium    Large     Huge
    Min Buffers              4096     4096      512      128      128
    Max Buffers              4096     4096      512      128      128
  Buffer Mode: Min
[..]

Deploying these rules to another Virtual I/O Server

The goal is now to take this rule file and deploy it on all my Virtual I/O Servers, to be sure the attributes are the same everywhere.

I’m copying my rule file (and the catalog I modified) to another Virtual I/O Server:

$ oem_setup_env
# cd /home/padmin/rules
# scp /home/padmin/rules/custom_rules.xml anothervios:/home/padmin/rules
custom_rules.xml                   100%   19KB  18.6KB/s   00:00
# scp /etc/security/artex/catalogs/devParam.adapter.vdevice.IBM,l-lan.xml anothervios:/etc/security/artex/catalogs/
devParam.adapter.vdevice.IBM,l-lan.xml
devParam.adapter.vdevice.IBM,l-lan.xml    100% 2737     2.7KB/s   00:00
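
With more than a couple of Virtual I/O Servers you will probably want to loop over them. Here is a dry-run sketch that prints the two scp commands for each target. The host names vios2 and vios3 and the function name gen_copy_commands are hypothetical; the paths are the ones used above. Nothing is copied: review the printed commands, then run them as root like I did.

```shell
#!/bin/sh
# Dry-run sketch: print the scp commands needed for each target Virtual I/O Server.
rules_file="/home/padmin/rules/custom_rules.xml"
catalog="/etc/security/artex/catalogs/devParam.adapter.vdevice.IBM,l-lan.xml"

gen_copy_commands() {
    # Arguments: the host names of the target Virtual I/O Servers.
    for host in "$@"; do
        echo "scp $rules_file $host:/home/padmin/rules"
        echo "scp $catalog $host:/etc/security/artex/catalogs/"
    done
}

# vios2 and vios3 are placeholder host names.
gen_copy_commands vios2 vios3
```

The import and deploy steps still have to be run on each target afterwards.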

I’m now connecting to the new Virtual I/O Server and applying the rules:

$ rules -o import -f /home/padmin/rules/custom_rules.xml
$ rules -o diff -s
devParam.adapter.vdevice.IBM,l-lan:max_buf_tiny device=adapter/vdevice/IBM,l-lan    2048 | 4096
devParam.adapter.vdevice.IBM,l-lan:min_buf_tiny device=adapter/vdevice/IBM,l-lan     512 | 4096
devParam.adapter.vdevice.IBM,l-lan:max_buf_small device=adapter/vdevice/IBM,l-lan   2048 | 4096
devParam.adapter.vdevice.IBM,l-lan:min_buf_small device=adapter/vdevice/IBM,l-lan    512 | 4096
devParam.adapter.vdevice.IBM,l-lan:max_buf_medium device=adapter/vdevice/IBM,l-lan   256 | 512
devParam.adapter.vdevice.IBM,l-lan:min_buf_medium device=adapter/vdevice/IBM,l-lan   128 | 512
devParam.adapter.vdevice.IBM,l-lan:max_buf_huge device=adapter/vdevice/IBM,l-lan      64 | 128
devParam.adapter.vdevice.IBM,l-lan:min_buf_huge device=adapter/vdevice/IBM,l-lan      24 | 128
devParam.adapter.vdevice.IBM,l-lan:max_buf_large device=adapter/vdevice/IBM,l-lan     64 | 128
devParam.adapter.vdevice.IBM,l-lan:min_buf_large device=adapter/vdevice/IBM,l-lan     24 | 128
devParam.pseudo.vios.lpm:src_lun_val device=pseudo/vios/lpm                          off | on
devParam.pseudo.vios.lpm:dest_lun_val device=pseudo/vios/lpm                 restart_off | on
$ rules -o deploy
A manual post-operation is required for the changes to take effect, please reboot the system.
$ entstat -all ent18 | grep -i buf
[..]
    Buffer Type              Tiny    Small   Medium    Large     Huge
    Min Buffers               512      512      128       24       24
    Max Buffers              2048     2048      256       64       64
[..]
$ shutdown -force -restart
$ entstat -all ent18 | grep -i buf
[..]
    Buffer Type              Tiny    Small   Medium    Large     Huge
    Min Buffers              4096     4096      512      128      128
    Max Buffers              4096     4096      512      128      128
[..]

rulescfgset

If you don’t care at all about creating your own rules you can just use the rulescfgset command as padmin to apply the default Virtual I/O Server rules. My advice for newbies is to do that just after the Virtual I/O Server is installed: by doing so you will be sure to have the default IBM rules. It is a good practice to do it every time you deploy a new Virtual I/O Server.

$ rulescfgset

Conclusion

Use rules! It is a good way to be sure your Virtual I/O Server device attributes are the same everywhere. I hope my examples are good enough to convince you to use them. For PowerVC users like me, using rules is a must: as PowerVC is creating devices for you, you want to be sure all your devices are created with the exact same attributes. My example about the Virtual Ethernet Adapter buffers is now simply mandatory for PowerVC users. As always I hope it helps.