Deep dive into PowerVC Standard 1.2.1.0 using SAN Volume Controller and Brocade 8510-4 FC switches in a multifabric environment

Before reading this post I highly encourage you to read my first post about PowerVC, because this one focuses on what is specific to the standard edition. I had the chance to work on PowerVC express with IVM and local storage, and now on PowerVC standard with an IBM SAN Volume Controller (SVC) and Brocade fibre channel switches. A few things differ between the two editions, particularly storage management. Virtual machines created by PowerVC standard use NPIV (virtual fibre channel adapters) instead of vSCSI adapters. Using local storage and using an SVC in a multi-fabric environment are two different things, and the ways PowerVC captures, deploys and manages virtual machines are totally different. The PowerVC configuration is more complex: you have to manage the fibre channel port configuration, the storage connectivity groups and the storage templates. Last but not least, the PowerVC standard edition is Live Partition Mobility aware. Let’s have a look at everything that is specific to the standard version. Before you start reading, I have to warn you that this post is very long (it’s always hard for me to write short posts :-)). One last thing: this post is the result of one month of work on PowerVC, mostly on my own, but I have to thank the IBM guys who helped me with a few problems (Paul, Eddy, Jay, Phil, …). Cheers guys!

Prerequisites

PowerVC standard needs to connect to the Hardware Management Console, to the Storage Provider, and to the Fibre Channel Switches. Be sure ports are open between PowerVC, the HMC, the Storage Array, and the Fibre Channel Switches :

  • Port TCP 12443 between PowerVC and the HMC (PowerVC uses the HMC K2 REST API to communicate with the HMC)
  • Port TCP 22 (ssh) between PowerVC and the Storage Array.
  • Port TCP 22 (ssh) between PowerVC and the Fibre Channel Switches.
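Before adding anything in PowerVC, these three flows can be checked from the PowerVC host. The snippet below is only a minimal sketch: it tries a plain TCP connection and says nothing about credentials or API health. The function name is mine, and the host names in the usage note are placeholders, not values from my environment.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the name and opens the socket in one call.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `port_is_open("hmc1.example.com", 12443)` would cover the HMC flow, and the same call with port 22 covers the storage array and each fibre channel switch.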


Check that your storage array is compatible with PowerVC standard (for the moment only IBM Storwize storage and IBM SAN Volume Controller are supported). All Brocade switches running firmware version 7 are supported. Be careful, the PowerVC Redbook is not up to date on this point: all Brocade switches are supported (an APAR and a PMR have been opened about this mistake).

This post was written with this PowerVC configuration :

  • PowerVC 1.2.0.0 x86 version & PowerVC 1.2.1.0 PPC version.
  • SAN Volume Controller with an EMC VNX2 storage array.
  • Brocade DCX 8510-4.
  • Two Power 770+ with the latest AM780 firmware.

PowerVC standard storage specifics and configuration

PowerVC needs to control the storage to create or delete luns and to create hosts, and it also needs to control the fibre channel switches to create and delete zones for the virtual machines. If you are working with several fibre channel adapters with many ports, you also have to configure the storage connectivity groups and the fibre channel ports to tell PowerVC which ports to use and in which case (you may want to create development virtual machines on two virtual fibre channel adapters only, and production ones on four). Let’s see how to do this :

Adding storage and fabric

  • Add the storage provider (in my case a SAN Volume Controller, but it can be any IBM Storwize family storage array) :
  • blog_add_storage

  • PowerVC will ask you a few questions while adding the storage provider (for instance which pool will be the default pool for the deployment of the virtual machines). You can then check in this view the actual size and the remaining size of the pool in use :
  • blog_storage_added

  • Add each fibre channel switch (in my case two switches, one for fabric A and one for fabric B). Be very careful with the fabric designation (A or B): it will be used later when creating storage templates and storage connectivity groups :
  • blog_add_frabric

  • Each fabric can be viewed and modified afterwards :
  • blog_fabric_added

Fibre Channel Port Configuration

If you are working in a multi-fabric environment you have to configure the fibre channel ports. For each port, the first step is to tell PowerVC which fabric the port is connected to. In my case here is the configuration (refer to the colours on the image below and to the explanations that follow) :

pb connectivty_zenburn

  • Each Virtual I/O Server has two fibre channel adapters with four ports each.
  • For the first adapter : the first port is connected to Fabric A, and the last port is connected to Fabric B.
  • For the second adapter : the first port is connected to Fabric B, and the last port is connected to Fabric A.
  • Two ports (ports 1 and 2) remain free for future use (future growth).
  • For each port I have to tell PowerVC how the port is connected (with PowerVC 1.2.0.0 you have to do this manually, checking on the fibre channel switch where the ports are connected; with PowerVC 1.2.1.0 it is detected automatically :-)) :
  • 17_choose_fabric_for_each_port

    • Connected on Fabric A ? (check the image below; use the nodefind switch command to find out whether the port is logged in on the fibre channel switch)
    • blog_connected_fabric_A

      switch_fabric_a:FID1:powervcadm> nodefind 10:00:00:90:FA:3E:C6:D1
      No device found
      switch_fabric_a:FID1:powervcadm> nodefind 10:00:00:90:FA:3E:C6:CE
      Local:
       Type Pid    COS     PortName                NodeName                 SCR
       N    01fb40;    2,3;10:00:00:90:fa:3e:c6:ce;20:00:01:20:fa:3e:c6:ce; 0x00000003
          Fabric Port Name: 20:12:00:27:f8:79:ce:01
          Permanent Port Name: 10:00:00:90:fa:3e:c6:ce
          Device type: Physical Unknown(initiator/target)
          Port Index: 18
          Share Area: Yes
          Device Shared in Other AD: No
          Redirect: No
          Partial: No
          Aliases: XXXXX59_3ec6ce
      
    • Connected on Fabric B ? (check the image below; use the nodefind switch command to find out whether the port is logged in on the fibre channel switch)
    • blog_connected_fabric_B

      switch_fabric_b:FID1:powervcadm> nodefind 10:00:00:90:FA:3E:C6:D1
      Local:
       Type Pid    COS     PortName                NodeName                 SCR
       N    02fb40;    2,3;10:00:00:90:fa:3e:c6:d1;20:00:01:20:fa:3e:c6:d1; 0x00000003
          Fabric Port Name: 20:12:00:27:f8:79:d0:01
          Permanent Port Name: 10:00:00:90:fa:3e:c6:d1
          Device type: Physical Unknown(initiator/target)
          Port Index: 18
          Share Area: Yes
          Device Shared in Other AD: No
          Redirect: No
          Partial: No
          Aliases: XXXXX59_3ec6d1
      switch_fabric_b:FID1:powervcadm> nodefind 10:00:00:90:FA:3E:C6:CE
      No device found
      
    • Free, not connected ? (check the image below)
    • blog_not_connected

  • In the end each fibre channel port has to be configured with one of these three choices (connected on Fabric A, connected on Fabric B, free/not connected).
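When there are many ports to check by hand (the PowerVC 1.2.0.0 case), the nodefind answers can be classified with a few lines of Python. This is a rough sketch that only looks for the "No device found" message and for the WWPN in the text, exactly as they appear in the transcripts above. The function names are mine, and collecting the output from the switches (over ssh for instance) is left out.

```python
def wwpn_is_logged_in(nodefind_output, wwpn):
    """Return True if the nodefind output lists the WWPN as a logged-in device."""
    text = nodefind_output.lower()
    if "no device found" in text:
        return False
    return wwpn.lower() in text


def fabrics_seeing(wwpn, outputs_by_fabric):
    """Return the fabrics whose switch sees the WWPN; an empty list means free/not connected."""
    return [fabric for fabric, output in outputs_by_fabric.items()
            if wwpn_is_logged_in(output, wwpn)]


# Shortened nodefind output copied from the fabric A transcript.
FOUND_ON_A = """Local:
 Type Pid    COS     PortName                NodeName                 SCR
 N    01fb40;    2,3;10:00:00:90:fa:3e:c6:ce;20:00:01:20:fa:3e:c6:ce; 0x00000003
"""
NOT_FOUND = "No device found"

print(fabrics_seeing("10:00:00:90:FA:3E:C6:CE", {"A": FOUND_ON_A, "B": NOT_FOUND}))
```

A port that comes back with an empty list on both fabrics is a candidate for the "free, not connected" choice.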

Port Tagging and Storage Connectivity Group

Fibre channel ports are now configured, but we have to be sure that when deploying a new virtual machine :

  • Each virtual machine will be deployed with four fibre channel adapters (I am in a CHARM configuration).
  • Each virtual machine is connected through the first Virtual I/O Server to Fabric A and Fabric B on different adapters (each adapter on a different CEC).
  • Each virtual machine is connected through the second Virtual I/O Server to Fabric A and Fabric B on different adapters.
  • I can choose to deploy the virtual machine using fcs0 (Fabric A) and fcs7 (Fabric B) on each Virtual I/O Server, or using fcs3 (Fabric B) and fcs4 (Fabric A). Ideally half of the machines will be created with the first configuration and the other half with the second configuration.

To do this you have to tag each port with a name of your choice, and then create storage connectivity groups. A storage connectivity group is a constraint that is applied when deploying a virtual machine :

pb_port_tag_zenburn

  • Two tags are created and set on the ports, fcs0(A)_fcs7(B) and fcs3(B)_fcs4(A) :
  • blog_port_tag

  • Two connectivity groups are created to force the usage of tagged fibre channel ports when deploying a virtual machine.
    • When creating a connectivity group you have to choose the Virtual I/O Server(s) used when deploying a virtual machine with this connectivity group. It can be useful to tell PowerVC to deploy development machines on a single Virtual I/O Server, and production ones on dual Virtual I/O Servers :
    • blog_vios_connectivity_group

    • In my case connectivity groups are created to restrict the usage of fibre channel adapters. I want to deploy on fibre channel ports fcs0/fcs7 or fibre channel ports fcs3/fcs4. Here are my connectivity groups :
    • blog_connectivity_1
      blog_connectivity_2

    • You can check a summary of your connectivity group. I wanted to add this image because I think the two images (provided in PowerVC) are better than text to explain what a connectivity group is :-) :
    • 22_create_connectivity_group_3

Storage Template

If you are using different pools or different storage arrays (for example, in my case I can have different storage arrays behind my SAN Volume Controller) you may want to tell PowerVC to deploy virtual machines on a specific pool or with a specific type of lun (for instance I want my machines to be created on compressed luns, on thin provisioned luns, or on thick provisioned luns). In my case I’ve created two different templates to create machines on thin or compressed luns. Easy !

  • When creating a storage template you first have to choose the storage pool :
  • blog_storage_template_select_storage_pool

  • Then choose the type of lun for this storage template :
  • blog_storage_template_create

  • Here are examples with my two storage templates :
  • blog_storage_list

A deeper look at VM capture

If you read my last article about the PowerVC express version you know that capturing an image can take some time when using local storage: “dding” a whole disk is long, and copying a file to the PowerVC host is long. But don’t worry, PowerVC standard solves this problem easily by using all the potential of IBM storage (in my case a SAN Volume Controller). The solution is FlashCopy, more specifically what we call a FlashCopy-Copy. To be clear, a FlashCopy-Copy is a full copy of a lun: once the copy is finished there is no relationship left between the source lun and the FlashCopy target lun (the FlashCopy mapping is created with the autodelete argument). Let me explain how PowerVC standard manages the virtual machine capture :

  • The activation engine has been run and the virtual machine to be captured is stopped.
  • The user launches the capture from PowerVC.
  • A FlashCopy-Copy is created on the storage side; we can check it from the GUI :
  • blog_flash_copy_pixelate_1

  • Checking with the SVC command line we can see the following (use the catauditlog command to check this) :
    • A new volume called volume-Image-[name_of_the_image] is created (all captured images will be called volume-Image-[name]), following the storage template (diskgroup/pool, grainsize, rsize …)
    • # mkvdisk -name volume-Image_7100-03-03-1415 -iogrp 0 -mdiskgrp VNX_XXXXX_SAS_POOL_1 -size 64424509440 -unit b -autoexpand -grainsize 256 -rsize 2% -warning 0% -easytier on
      
    • A FlashCopy-Copy is created with the id of the boot volume of the virtual machine to capture as source, and the id of the image’s lun as target :
    • # mkfcmap -source 865 -target 880 -autodelete
      
    • We can check the vdisk 865 is the boot volume of the captured machine and has a FlashCopy running:
    • # lsvdisk -delim :
      id:name:IO_group_id:IO_group_name:status:mdisk_grp_id:mdisk_grp_name:capacity:type:FC_id:FC_name:RC_id:RC_name:vdisk_UID:fc_map_count:copy_count:fast_write_state:se_copy_count:RC_change:compressed_copy_count
      865:_BOOT:0:io_grp0:online:0:VNX_00086_SAS_POOL_1:60.00GB:striped:0:fcmap0:::600507680184879C2800000000000431:1:1:empty:1:no:0
      
    • The FlashCopy-Copy is prepared and started (at this step we can already use our captured image, the copy is running in background) :
    • # prestartfcmap 0
      # startfcmap 0
      
    • While the FlashCopy is running we can check its progress (we can also check it by logging on to the GUI) :
    • IBM_2145:SVC:powervcadmin>lsfcmap
      id name   source_vdisk_id source_vdisk_name target_vdisk_id target_vdisk_name            group_id group_name status  progress copy_rate clean_progress incremental partner_FC_id partner_FC_name restoring start_time   rc_controlled
      0  fcmap0 865             XXXXXXXXX7_BOOT 880             volume-Image_7100-03-03-1415                     copying 54       50        100            off                                       no        140620002138 no
      
      IBM_2145:SVC:powervcadmin>lsfcmapprogress fcmap0
      id progress
      0  54
      
    • After the FlashCopy-Copy is finished, there is no longer any relationship between the source volume and the finished FlashCopy. The captured image is a vdisk :
    • IBM_2145:SVC:powervcadmin>lsvdisk 880
      id 880
      name volume-Image_7100-03-03-1415
      IO_group_id 0
      IO_group_name io_grp0
      status online
      mdisk_grp_id 0
      mdisk_grp_name VNX_XXXXX_SAS_POOL_1
      capacity 60.00GB
      type striped
      [..]
      vdisk_UID 600507680184879C280000000000044C
      [..]
      fc_map_count 0
      [..]
      
    • There is no longer any fcmap for the source volume :
    • IBM_2145:SVC:powervcadmin>lsvdisk 865
      [..]
      fc_map_count 0
      [..]
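The fc_map_count checks above can be scripted once the output of lsvdisk -delim : has been saved as text. Here is a small sketch; the helper names are mine and the sample is the header and row shown earlier in this section.

```python
def parse_delimited(output, delim=":"):
    """Parse 'ls... -delim :' style output (one header line, then rows) into dicts."""
    lines = [line for line in output.splitlines() if line.strip()]
    header = lines[0].split(delim)
    return [dict(zip(header, line.split(delim))) for line in lines[1:]]


def volumes_with_flashcopy(rows):
    """Return the ids of the vdisks that still have at least one FlashCopy mapping."""
    return [row["id"] for row in rows if int(row["fc_map_count"]) > 0]


# Header and row copied from the lsvdisk -delim : output shown above.
SAMPLE = (
    "id:name:IO_group_id:IO_group_name:status:mdisk_grp_id:mdisk_grp_name:capacity:"
    "type:FC_id:FC_name:RC_id:RC_name:vdisk_UID:fc_map_count:copy_count:"
    "fast_write_state:se_copy_count:RC_change:compressed_copy_count\n"
    "865:_BOOT:0:io_grp0:online:0:VNX_00086_SAS_POOL_1:60.00GB:striped:0:fcmap0:::"
    "600507680184879C2800000000000431:1:1:empty:1:no:0\n"
)

rows = parse_delimited(SAMPLE)
print(volumes_with_flashcopy(rows))
```

While the copy is running, vdisk 865 is returned; once the mapping has been deleted automatically the list should be empty.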
      

Deployment mechanism

blog_deploy3_pixelate

Deploying a virtual machine with the standard version is very similar to deploying a machine with the express version. The only difference is the possibility to choose the storage template (within the constraints of the storage connectivity group).

View from the Hardware Management Console

PowerVC uses the new Hardware Management Console K2 REST API to create the virtual machine. If you want to go further and check the commands run on the HMC, you can list them with the lssvcevents command :

time=06/21/2014 17:49:12,text=HSCE2123 User name powervc: chsysstate -m XXXX58-9117-MMD-658B2AD -r lpar -o on -n deckard-e9879213-00000018 command was executed successfully.
time=06/21/2014 17:47:29,text=HSCE2123 User name powervc: chled -r sa -t virtuallpar -m 9117-MMD*658B2AD --id 1 -o off command was executed successfully.
time=06/21/2014 17:46:51,"text=HSCE2123 User name powervc: chhwres -r virtualio --rsubtype fc -o a -m 9117-MMD*658B2AD -s 29 --id 1 -a remote_slot_num=6,remote_lpar_id=8,adapter_type=server command was executed successfully."
time=06/21/2014 17:46:40,"text=HSCE2123 User name powervc: chsyscfg -m 9117-MMD*658B2AD -r prof -i lpar_name=deckard-e9879213-00000018,""virtual_fc_adapters+=""""6/CLIENT/1//29//0"""""",name=last*valid*configuration -o apply --override command was executed successfully."
time=06/21/2014 17:46:32,text=HSCE2245 User name powervc: Activating the partition 8 succeeded on managed system 9117-MMD*658B2AD.
time=06/21/2014 17:46:17,"text=HSCE2123 User name powervc: chhwres -r virtualio --rsubtype fc -o a -m 9117-MMD*658B2AD -s 28 --id 1 -a remote_slot_num=5,remote_lpar_id=8,adapter_type=server command was executed successfully."
time=06/21/2014 17:46:06,"text=HSCE2123 User name powervc: chsyscfg -m 9117-MMD*658B2AD -r prof -i lpar_name=deckard-e9879213-00000018,""virtual_fc_adapters+=""""5/CLIENT/1//28//0"""""",name=last*valid*configuration -o apply --override command was executed successfully."
time=06/21/2014 17:45:57,text=HSCE2245 User name powervc: Activating the partition 8 succeeded on managed system 9117-MMD*658B2AD.
time=06/21/2014 17:45:46,"text=HSCE2123 User name powervc: chhwres -r virtualio --rsubtype fc -o a -m 9117-MMD*658B2AD -s 30 --id 2 -a remote_slot_num=4,remote_lpar_id=8,adapter_type=server command was executed successfully."
time=06/21/2014 17:45:36,text=HSCE2123 User name powervc: chhwres -r virtualio --rsubtype fc -o r -m 9117-MMD*658B2AD -s 29 --id 1 command was executed successfully.
time=06/21/2014 17:45:27,"text=HSCE2123 User name powervc: chsyscfg -m 9117-MMD*658B2AD -r prof -i lpar_name=deckard-e9879213-00000018,""virtual_fc_adapters+=""""4/CLIENT/2//30//0"""""",name=last*valid*configuration -o apply --override command was executed successfully."
time=06/21/2014 17:45:18,text=HSCE2245 User name powervc: Activating the partition 8 succeeded on managed system 9117-MMD*658B2AD.
time=06/21/2014 17:45:08,text=HSCE2123 User name powervc: chhwres -r virtualio --rsubtype fc -o r -m 9117-MMD*658B2AD -s 28 --id 1 command was executed successfully.
time=06/21/2014 17:45:07,text=User powervc has logged off from session id 42151 for the reason:  The user ran the Disconnect task.
time=06/21/2014 17:45:07,text=User powervc has disconnected from session id 42151 for the reason:  The user ran the Disconnect task.
time=06/21/2014 17:44:50,"text=HSCE2123 User name powervc: chhwres -r virtualio --rsubtype scsi -o a -m 9117-MMD*658B2AD -s 23 --id 1 -a adapter_type=server,remote_lpar_id=8,remote_slot_num=3 command was executed successfully."
time=06/21/2014 17:44:40,"text=HSCE2123 User name powervc: chsyscfg -m 9117-MMD*658B2AD -r prof -i lpar_name=deckard-e9879213-00000018,virtual_scsi_adapters+=3/CLIENT/1//23/0,name=last*valid*configuration -o apply --override command was executed successfully."
time=06/21/2014 17:44:32,text=HSCE2245 User name powervc: Activating the partition 8 succeeded on managed system 9117-MMD*658B2AD.
time=06/21/2014 17:44:22,"text=HSCE2123 User name powervc: chhwres -r virtualio --rsubtype fc -o a -m 9117-MMD*658B2AD -s 25 --id 2 -a remote_slot_num=2,remote_lpar_id=8,adapter_type=server command was executed successfully."
time=06/21/2014 17:44:11,"text=HSCE2123 User name powervc: chsyscfg -m 9117-MMD*658B2AD -r prof -i lpar_name=deckard-e9879213-00000018,""virtual_fc_adapters+=""""2/CLIENT/2//25//0"""""",name=last*valid*configuration -o apply --override command was executed successfully."
time=06/21/2014 17:44:02,text=HSCE2245 User name powervc: Activating the partition 8 succeeded on managed system 9117-MMD*658B2AD.
time=06/21/2014 17:43:50,text=HSCE2123 User name powervc: chhwres -r virtualio --rsubtype scsi -o r -m 9117-MMD*658B2AD -s 23 --id 1 command was executed successfully.
time=06/21/2014 17:43:31,text=HSCE2123 User name powervc: chled -r sa -t virtuallpar -m 9117-MMD*658B2AD --id 1 -o off command was executed successfully.
time=06/21/2014 17:43:31,text=HSCE2123 User name powervc: chled -r sa -t virtuallpar -m 9117-MMD*658B2AD --id 2 -o off command was executed successfully.
time=06/21/2014 17:42:57,"text=HSCE2123 User name powervc: chsyscfg -m 9117-MMD*658B2AD -r prof -i lpar_name=deckard-e9879213-00000018,""virtual_eth_adapters+=""""32/0/1665//0/0/zvdc4/fabbb99de420/all/"""""",name=last*valid*configuration -o apply --override command was executed successfully."
time=06/21/2014 17:42:49,text=HSCE2245 User name powervc: Activating the partition 8 succeeded on managed system 9117-MMD*658B2AD.
time=06/21/2014 17:41:53,text=HSCE2123 User name powervc: chsyscfg -m 9117-MMD*658B2AD -r lpar -p deckard-e9879213-00000018 -n default_profile -o apply command was executed successfully.
time=06/21/2014 17:41:42,text=HSCE2245 User name powervc: Activating the partition 8 succeeded on managed system 9117-MMD*658B2AD.
time=06/21/2014 17:41:36,"text=HSCE2123 User name powervc: mksyscfg -m 9117-MMD*658B2AD -r lpar -i name=deckard-e9879213-00000018,lpar_env=aixlinux,min_mem=8192,desired_mem=8192,max_mem=8192,profile_name=default_profile,max_virtual_slots=64,lpar_proc_compat_mode=default,proc_mode=shared,min_procs=4,desired_procs=4,max_procs=4,min_proc_units=2,desired_proc_units=2,max_proc_units=2,sharing_mode=uncap,uncap_weight=128,lpar_avail_priority=127,sync_curr_profile=1 command was executed successfully."
time=06/21/2014 17:41:01,"text=HSCE2123 User name powervc: mksyscfg -m 9117-MMD*658B2AD -r lpar -i name=FAKE_1403368861661,profile_name=default,lpar_env=aixlinux,min_mem=8192,desired_mem=8192,max_mem=8192,max_virtual_slots=4,virtual_eth_adapters=5/0/1//0/1/,virtual_scsi_adapters=2/client/1//2/0,""virtual_serial_adapters=0/server/1/0//0/0,1/server/1/0//0/0"",""virtual_fc_adapters=3/client/1//2//0,4/client/1//2//0"" -o query command was executed successfully."
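The log is listed newest first, so reading the deployment in order means reading it bottom-up. The sketch below pulls the HSCE2123 entries out of a saved lssvcevents output and sorts them oldest first. It assumes each event sits on a single line (wrapped lines have to be re-joined first) and that dates are in the month/day/year form seen above. The regular expression and function name are mine.

```python
import re
from datetime import datetime

# One HSCE2123 event per line; the text field may or may not be wrapped in quotes.
ENTRY = re.compile(
    r'^time=(?P<time>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}),"?text=HSCE2123 '
    r'User name (?P<user>\S+): (?P<command>.*?) command was executed successfully\."?$'
)


def executed_commands(log_text, user="powervc"):
    """Return (datetime, command) tuples run by the given user, oldest first."""
    found = []
    for line in log_text.splitlines():
        match = ENTRY.match(line.strip())
        if match and match.group("user") == user:
            when = datetime.strptime(match.group("time"), "%m/%d/%Y %H:%M:%S")
            found.append((when, match.group("command")))
    return sorted(found)


# Three events copied from the log above (the HSCE2245 one is skipped by the parser).
SAMPLE = """time=06/21/2014 17:49:12,text=HSCE2123 User name powervc: chsysstate -m XXXX58-9117-MMD-658B2AD -r lpar -o on -n deckard-e9879213-00000018 command was executed successfully.
time=06/21/2014 17:46:32,text=HSCE2245 User name powervc: Activating the partition 8 succeeded on managed system 9117-MMD*658B2AD.
time=06/21/2014 17:45:36,text=HSCE2123 User name powervc: chhwres -r virtualio --rsubtype fc -o r -m 9117-MMD*658B2AD -s 29 --id 1 command was executed successfully."""

for when, command in executed_commands(SAMPLE):
    print(when.time(), command)
```

Run on the full log, this gives the real sequence: the query with the FAKE partition name first, then mksyscfg, the adapter creations, and finally chsysstate to power on the partition.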

blog_deploy3_hmc1

As you can see on the picture below, four virtual fibre channel adapters are created, respecting the constraints of the storage connectivity groups created earlier (looking on the Virtual I/O Server, the vfcmaps are ok …) :

blog_deploy3_hmc2_pixelate

padmin@XXXXX60:/home/padmin$ lsmap -vadapter vfchost14 -npiv
Name          Physloc                            ClntID ClntName       ClntOS
------------- ---------------------------------- ------ -------------- -------
vfchost14     U9117.MMD.658B2AD-V1-C28                8 deckard-e98792 AIX

Status:LOGGED_IN
FC name:fcs3                    FC loc code:U2C4E.001.DBJN916-P2-C1-T4
Ports logged in:2
Flags:a
VFC client name:fcs2            VFC client DRC:U9117.MMD.658B2AD-V8-C5

padmin@XXXXX60:/home/padmin$ lsmap -vadapter vfchost15 -npiv
Name          Physloc                            ClntID ClntName       ClntOS
------------- ---------------------------------- ------ -------------- -------
vfchost15     U9117.MMD.658B2AD-V1-C29                8 deckard-e98792 AIX

Status:LOGGED_IN
FC name:fcs4                    FC loc code:U2C4E.001.DBJO029-P2-C1-T1
Ports logged in:2
Flags:a
VFC client name:fcs3            VFC client DRC:U9117.MMD.658B2AD-V8-C6

View from the SAN Volume Controller

The SVC side is pretty simple, with two steps: a FlashCopy-Copy of the volume-Image-[name] lun (the one created at the capture step, which is the source of the FlashCopy) and a host creation for the new virtual machine :

  • Creation of a FlashCopy-Copy with the volume used for the capture as source :
  • blog_deploy3_flashcopy1

    # mkvdisk -name volume-boot-9117MMD_658B2AD-deckard-e9879213-00000018 -iogrp 0 -mdiskgrp VNX_00086_SAS_POOL_1 -size 64424509440 -unit b -autoexpand -grainsize 256 -rsize 2% -warning 0% -easytier on
    # mkfcmap -source 880 -target 881 -autodelete
    # prestartfcmap 0
    # startfcmap 0
    
  • The host is created using the eight wwpns of the newly created virtual machine (I paste the lssyscfg command again here to check that the wwpns are the same :-)) :
  • hscroot@hmc1:~> lssyscfg -r prof -m XXXXX58-9117-MMD-658B2AD --filter "lpar_names=deckard-e9879213-00000018"
    name=default_profile,lpar_name=deckard-e9879213-00000018,lpar_id=8,lpar_env=aixlinux,all_resources=0,min_mem=8192,desired_mem=8192,max_mem=8192,min_num_huge_pages=0,desired_num_huge_pages=0,max_num_huge_pages=0,mem_mode=ded,mem_expansion=0.0,hpt_ratio=1:128,proc_mode=shared,min_proc_units=2.0,desired_proc_units=2.0,max_proc_units=2.0,min_procs=4,desired_procs=4,max_procs=4,sharing_mode=uncap,uncap_weight=128,shared_proc_pool_id=0,shared_proc_pool_name=DefaultPool,affinity_group_id=none,io_slots=none,lpar_io_pool_ids=none,max_virtual_slots=64,"virtual_serial_adapters=0/server/1/any//any/1,1/server/1/any//any/1",virtual_scsi_adapters=3/client/1/XXXXX60/29/0,virtual_eth_adapters=32/0/1665//0/0/zvdc4/fabbb99de420/all/0,virtual_eth_vsi_profiles=none,"virtual_fc_adapters=""2/client/2/XXXXX59/30/c050760727c5004a,c050760727c5004b/0"",""4/client/2/XXXXX59/25/c050760727c5004c,c050760727c5004d/0"",""5/client/1/XXXXX60/28/c050760727c5004e,c050760727c5004f/0"",""6/client/1/XXXXX60/23/c050760727c50050,c050760727c50051/0""",vtpm_adapters=none,hca_adapters=none,boot_mode=norm,conn_monitoring=1,auto_start=0,power_ctrl_lpar_ids=none,work_group_id=none,redundant_err_path_reporting=0,bsr_arrays=0,lpar_proc_compat_mode=default,electronic_err_reporting=null,sriov_eth_logical_ports=none
    
    # mkhost -name deckard-e9879213-00000018-06976900 -hbawwpn C050760727C5004A -force
    # addhostport -hbawwpn C050760727C5004B -force 11
    # addhostport -hbawwpn C050760727C5004C -force 11
    # addhostport -hbawwpn C050760727C5004D -force 11
    # addhostport -hbawwpn C050760727C5004E -force 11
    # addhostport -hbawwpn C050760727C5004F -force 11
    # addhostport -hbawwpn C050760727C50050 -force 11
    # addhostport -hbawwpn C050760727C50051 -force 11
    # mkvdiskhostmap -host deckard-e9879213-00000018-06976900 -scsi 0 881
    

    blog_deploy3_svc_host1
    blog_deploy3_svc_host2
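Comparing eight wwpns by eye is error-prone, so the check between the lssyscfg profile and the SVC host can be scripted. This sketch assumes the wwpns always appear as comma-separated pairs of 16 hexadecimal digits between slashes, as in the output above. The function names are mine, and the host wwpn list is simply retyped from the mkhost and addhostport commands.

```python
import re

# A client adapter entry carries its two wwpns as ".../wwpn1,wwpn2/...".
WWPN_PAIR = re.compile(r"/([0-9a-fA-F]{16}),([0-9a-fA-F]{16})/")


def client_wwpns(profile_line):
    """Return every virtual FC client wwpn found in an lssyscfg profile line, upper-cased."""
    wwpns = []
    for first, second in WWPN_PAIR.findall(profile_line):
        wwpns.extend([first.upper(), second.upper()])
    return wwpns


def missing_from_host(profile_line, host_wwpns):
    """Return the profile wwpns that are not defined on the storage host object."""
    defined = {wwpn.upper() for wwpn in host_wwpns}
    return [wwpn for wwpn in client_wwpns(profile_line) if wwpn not in defined]


# virtual_fc_adapters field copied from the lssyscfg output above.
PROFILE = ('virtual_fc_adapters=""2/client/2/XXXXX59/30/c050760727c5004a,c050760727c5004b/0"",'
           '""4/client/2/XXXXX59/25/c050760727c5004c,c050760727c5004d/0"",'
           '""5/client/1/XXXXX60/28/c050760727c5004e,c050760727c5004f/0"",'
           '""6/client/1/XXXXX60/23/c050760727c50050,c050760727c50051/0""')

# wwpns passed to mkhost and addhostport on the SVC.
HOST_WWPNS = ["C050760727C5004A", "C050760727C5004B", "C050760727C5004C", "C050760727C5004D",
              "C050760727C5004E", "C050760727C5004F", "C050760727C50050", "C050760727C50051"]

print(missing_from_host(PROFILE, HOST_WWPNS))
```

An empty list means every wwpn of the partition profile is defined on the SVC host.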

View from fibre channel switches

On each of the two fibre channel switches four zones are created (do not forget the zones used for Live Partition Mobility). These zones can be easily identified by their names. All PowerVC zones are prefixed by “powervc” (unfortunately names are truncated) :

  • Four zones are created on the fibre channel switch of the fabric A :
  • switch_fabric_a:FID1:powervcadmin> zoneshow powervc_eckard_e9879213_00000018c050760727c50051500507680110f32c
     zone:  powervc_eckard_e9879213_00000018c050760727c50051500507680110f32c
                    c0:50:76:07:27:c5:00:51; 50:05:07:68:01:10:f3:2c
    
    switch_fabric_a:FID1:powervcadmin> zoneshow powervc_eckard_e9879213_00000018c050760727c5004c500507680110f385
     zone:  powervc_eckard_e9879213_00000018c050760727c5004c500507680110f385
                    c0:50:76:07:27:c5:00:4c; 50:05:07:68:01:10:f3:85
    
    switch_fabric_a:FID1:powervcadmin> zoneshow powervc_eckard_e9879213_00000018c050760727c5004d500507680110f385
     zone:  powervc_eckard_e9879213_00000018c050760727c5004d500507680110f385
                    c0:50:76:07:27:c5:00:4d; 50:05:07:68:01:10:f3:85
    
    switch_fabric_a:FID1:powervcadmin> zoneshow powervc_eckard_e9879213_00000018c050760727c50050500507680110f32c
     zone:  powervc_eckard_e9879213_00000018c050760727c50050500507680110f32c
                    c0:50:76:07:27:c5:00:50; 50:05:07:68:01:10:f3:2c
    
  • Four zones are created on the fibre channel switch of the fabric B :
  • switch_fabric_b:FID1:powervcadmin> zoneshow powervc_eckard_e9879213_00000018c050760727c5004e500507680120f385
     zone:  powervc_eckard_e9879213_00000018c050760727c5004e500507680120f385
                    c0:50:76:07:27:c5:00:4e; 50:05:07:68:01:20:f3:85
    
    switch_fabric_b:FID1:powervcadmin> zoneshow powervc_eckard_e9879213_00000018c050760727c5004a500507680120f32c
     zone:  powervc_eckard_e9879213_00000018c050760727c5004a500507680120f32c
                    c0:50:76:07:27:c5:00:4a; 50:05:07:68:01:20:f3:2c
    
    switch_fabric_b:FID1:powervcadmin> zoneshow powervc_eckard_e9879213_00000018c050760727c5004b500507680120f32c
     zone:  powervc_eckard_e9879213_00000018c050760727c5004b500507680120f32c
                    c0:50:76:07:27:c5:00:4b; 50:05:07:68:01:20:f3:2c
    
    switch_fabric_b:FID1:powervcadmin> zoneshow powervc_eckard_e9879213_00000018c050760727c5004f500507680120f385
     zone:  powervc_eckard_e9879213_00000018c050760727c5004f500507680120f385
                    c0:50:76:07:27:c5:00:4f; 50:05:07:68:01:20:f3:85
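Looking at the zone names above, the last 32 characters are the initiator wwpn followed by the target wwpn, without colons. That is only what I observe in this output, not a documented naming rule, but under that assumption a zone name can be decoded like this (function names are mine):

```python
def colonize(wwpn):
    """Turn 'c050760727c50051' into 'c0:50:76:07:27:c5:00:51'."""
    return ":".join(wwpn[i:i + 2] for i in range(0, len(wwpn), 2))


def split_zone_name(zone_name):
    """Split a zone name into (prefix, initiator wwpn, target wwpn).

    Assumes the name ends with two 16-digit hexadecimal wwpns, as seen above.
    """
    tail = zone_name[-32:]
    int(tail, 16)  # raises ValueError if the tail is not hexadecimal
    return zone_name[:-32], colonize(tail[:16]), colonize(tail[16:])


print(split_zone_name("powervc_eckard_e9879213_00000018c050760727c50051500507680110f32c"))
```

The two wwpns returned should match the zone members printed by zoneshow, which is a quick way to spot a zone that belongs to a given virtual machine.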
    

Activation Engine and Virtual Optical Device

All my deployed virtual machines are connected to one of the Virtual I/O Servers by a vSCSI adapter. This vSCSI adapter is used to connect the virtual machine to a virtual optical device (a virtual cdrom) needed by the activation engine to reconfigure the virtual machine. Looking in the Virtual I/O Server, the virtual media repository is filled with the customized iso files needed to activate the virtual machines :

  • Here is the output of the lsrep command on one of my Virtual I/O Servers managed by PowerVC :
  • padmin@XXX60:/home/padmin$ lsrep 
    Size(mb) Free(mb) Parent Pool         Parent Size      Parent Free 
        1017     1014 rootvg                   279552           110592 
    
    Name                                                  File Size Optical         Access 
    vopt_1c967c7b27a94464bebb6d043e6c7a6e                         1 None            ro 
    vopt_b21849cc4a32410f914a0f6372a8f679                         1 None            ro 
    vopt_e9879213dc90484bb3c5a50161456e35                         1 None            ro
    
  • At the time of writing this post the vSCSI adapter is not deleted after the virtual machine activation, although it is only used at the first boot of the machine :
  • blog_adapter_for_ae_pixelate

  • Even better, you can mount this iso and check that it is used by the activation engine. The network configuration to be applied at reboot is written in an xml file. For those -like me- who have already played with VMcontrol, it may remind you of the deploy command used in VMcontrol :
  • root@XXXX60:# cd /var/vio/VMLibrary
    root@XXXX60:/var/vio/VMLibrary# loopmount -i vopt_1c967c7b27a94464bebb6d043e6c7a6e -o "-V cdrfs -o ro" -m /mnt
    root@XXXX60:/var/vio/VMLibrary# cd /mnt
    root@XXXX60:/mnt# ls
    ec2          openstack    ovf-env.xml
    root@XXXX60:/mnt# cat ovf-env.xml
    <Environment xmlns="http://schemas.dmtf.org/ovf/environment/1" xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1" xmlns:ovfenv="http://schemas.dmtf.org/ovf/environment/1" xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData" xmlns:vssd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_VirtualSystemSettingData" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ovfenv:id="vs0">
        <PlatformSection>
        <Locale>en</Locale>
      </PlatformSection>
      <PropertySection>
      <Property ovfenv:key="com.ibm.ovf.vmcontrol.system.networking.ipv4defaultgateway" ovfenv:value="10.244.17.1"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.system.networking.hostname" ovfenv:value="deckard"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.adapter.networking.slotnumber.1" ovfenv:value="32"/>;<Property ovfenv:key="com.ibm.ovf.vmcontrol.system.networking.dnsIPaddresses" ovfenv:value="10.10.20.10 10.10.20.11"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.adapter.networking.usedhcpv4.1" ovfenv:value="false"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.adapter.networking.ipv4addresses.1" ovfenv:value="10.244.17.35"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.adapter.networking.ipv4netmasks.1" ovfenv:value="255.255.255.0"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.system.networking.domainname" ovfenv:value="localdomain"/><Property ovfenv:key="com.ibm.ovf.vmcontrol.system.timezone" ovfenv:value=""/></PropertySection>
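The properties are easier to read once extracted from the xml. Here is a minimal sketch with Python's standard xml.etree module. The sample is a shortened, hand-completed copy of the file above: I kept two properties and added the closing Environment tag, which is cut off in my paste, so treat it as an illustration rather than the exact file.

```python
import xml.etree.ElementTree as ET

# Namespace declared both as the default and as the ovfenv prefix in the file.
OVFENV_NS = "http://schemas.dmtf.org/ovf/environment/1"


def ovf_properties(xml_text):
    """Return the ovfenv key/value pairs of every Property element as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter(f"{{{OVFENV_NS}}}Property"):
        props[prop.get(f"{{{OVFENV_NS}}}key")] = prop.get(f"{{{OVFENV_NS}}}value")
    return props


# Shortened sample: two properties only, closing tags added by hand.
SAMPLE = """<Environment xmlns="http://schemas.dmtf.org/ovf/environment/1" xmlns:ovfenv="http://schemas.dmtf.org/ovf/environment/1" ovfenv:id="vs0">
  <PropertySection>
    <Property ovfenv:key="com.ibm.ovf.vmcontrol.system.networking.hostname" ovfenv:value="deckard"/>
    <Property ovfenv:key="com.ibm.ovf.vmcontrol.adapter.networking.ipv4addresses.1" ovfenv:value="10.244.17.35"/>
  </PropertySection>
</Environment>"""

for key, value in ovf_properties(SAMPLE).items():
    print(key, "=", value)
```

Against a real iso you would read /mnt/ovf-env.xml instead of the sample string.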
    

Shared Ethernet Adapters auto management

This part is not specific to the standard version of PowerVC but I wanted to talk about it here. You probably already know that PowerVC is built on top of OpenStack, and OpenStack is clever. The product doesn’t want to keep unnecessary objects in your configuration. I was very impressed by the management of the networks and of the vlans: PowerVC manages and takes care of your Shared Ethernet Adapters for you. You don’t have to remove unused vlans or add new vlans by hand (just add the network in PowerVC). Here are a few examples :

  • If you are adding a vlan in PowerVC you have the choice to select the Shared Ethernet Adapter for this vlan. For instance you can choose not to deploy this vlan on a particular host :
  • blog_network_do_not_use_pixelate

  • If you deploy a virtual machine on this vlan, the vlan will be automatically added to the Shared Ethernet Adapter if this is the first machine using it :
  • # chhwres -r virtualio --rsubtype vnetwork -o a -m 9117-MMD*658B2AD --vnetwork 1503-zvdc4 -a vlan_id=1503,vswitch=zvdc4,is_tagged=1
    
  • If you are moving a machine from one host to another and this machine is the last one using this vlan, the vlan will be automatically cleaned up and removed from the Shared Ethernet Adapter.
  • I have in my configuration two Shared Ethernet Adapters each one on a different virtual switch. Good news : PowerVC is vswitch aware :-)
  • This link explains it in detail (not the Redbook): Click here

Mobility

PowerVC standard is able to manage the mobility of your virtual machines. Machines can be relocated on any host of the PowerVC pool. You no longer have to remember the long and complicated migrlpar command; PowerVC takes care of this for you when you click the migrate button :

blog_migrate_1_pixelate

  • Looking at lssvcevents on the Hardware Management Console, you can check that the migrlpar command takes the storage connectivity group created earlier into account and maps the lpar on adapters fcs3 and fcs4 :
  • # migrlpar -m XXX58-9117-MMD-658B2AD -t XXX55-9117-MMD-65ED82C --id 8 -i ""virtual_fc_mappings=2//1//fcs3,4//1//fcs4,5//2//fcs3,6//2//fcs4""
    
  • On the Storage Volume Controller, the host created with the Live Partition Mobility wwpns are correctly activated while the machine is moving to the other host :
  • blog_migrate_svc_lpm_wwpns_greened
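
If you want to audit what PowerVC asked the HMC to do, the virtual_fc_mappings value from the migrlpar command above can be split into its fields. Here is a minimal Python sketch (not something PowerVC provides). The field order (client slot / VIOS name / VIOS id / VIOS slot / fibre channel port) is my reading of the migrlpar syntax, and the names `FcMapping` and `parse_virtual_fc_mappings` are made up for the illustration :

```python
from typing import List, NamedTuple


class FcMapping(NamedTuple):
    client_slot: int   # virtual slot number of the client adapter
    vios_name: str     # Virtual I/O Server name (empty in the example)
    vios_id: str       # Virtual I/O Server partition id
    vios_slot: str     # server-side virtual slot (empty in the example)
    vios_fc_port: str  # physical fibre channel port, e.g. fcs3


def parse_virtual_fc_mappings(value: str) -> List[FcMapping]:
    """Split a virtual_fc_mappings value into one FcMapping per adapter."""
    mappings = []
    for entry in value.split(","):
        fields = entry.split("/")
        fields += [""] * (5 - len(fields))  # pad optional trailing fields
        mappings.append(FcMapping(int(fields[0]), *fields[1:5]))
    return mappings


# Value taken from the migrlpar command logged by the HMC
mappings = parse_virtual_fc_mappings("2//1//fcs3,4//1//fcs4,5//2//fcs3,6//2//fcs4")
for m in mappings:
    print(f"client slot {m.client_slot} -> VIOS id {m.vios_id}, port {m.vios_fc_port}")
```

With the value above, each of the four client adapters ends up on fcs3 or fcs4 of Virtual I/O Server 1 or 2, which is what the storage connectivity group is supposed to enforce.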

About supported fibre channel switches : all FOS >= 6.4 are ok !

At the time of writing this post things are not very clear about this. According to the Redbook the only supported models of fibre channel switches are IBM SAN24B-5 and IBM SAN48B-5. I’m using Brocade 8510-4 fibre channel switches and they are working well with PowerVC. After a couple of calls and mails with the PowerVC development team it seems that all Fabric OS versions greater than or equal to 6.4 are ok. Don’t worry if the PowerVC validator is failing, it may happen; just open a call to get the validator working with your switch model (I had problems in version 1.2.0.1 but no more problems with the latest 1.2.1.0 :-))

Conclusion

PowerVC is impressive. In my opinion PowerVC is already production ready. Building a machine with four virtual NPIV fibre channel adapters in five minutes is something every AIX system administrator has dreamed of. Tell your boss this is the right way to build machines, and invest for the future by deploying PowerVC : it’s a must have :-) :-) :-) :-)! Need advice about it, need someone to deploy it ? Hire me !

sitckers_resized

PowerVM Shared Ethernet Adapter simplification : Get rid of Control Channel Adapter

Since I started working on Virtual I/O Servers and PowerVM I’ve created many Shared Ethernet Adapters in all modes (standard, failover, or sharing). I’ve learned one important lesson : “be careful when creating a Shared Ethernet Adapter”. A single mistake can cause a network outage, and I’m sure that you’ve already seen someone in your team creating an ARP storm by mismatching control channel adapters or by adding a vlan that is already added on a Virtual Ethernet Adapter. Because of this kind of error I know some customers who avoid configuring Shared Ethernet Adapters in failover or sharing mode altogether. With the new versions of Virtual I/O Server (starting from 2.2.2.2) network loops and ARP storms are, in most cases, detected and stopped at the Virtual I/O Server level or at the firmware level. I always check my configuration two or three times before creating a Shared Ethernet Adapter. These errors come, most of the time, from a lack of rigor and are in almost all cases due to the system administrator. With the new version of PowerVM you can now create Shared Ethernet Adapters without specifying any control channel adapter (the Hardware Management Console and the Virtual I/O Server will do it for you). A new discovery protocol implemented on the Virtual I/O Server matches Shared Ethernet Adapters with each other and takes care of creating the Control Channel vlan for you (this one will not be visible on the Virtual I/O Server). Much simpler = fewer errors. Here is a practical how-to :

How does it work ?

A new discovery protocol called SEA HA matches partners with each other by using a dedicated vlan (not configurable by the user). Here are a few things to know :

  • Multiple Shared Ethernet Adapters can share the vlan 4095 for their Control Channel link.
  • The vlan 4095 is created per Virtual Switch for this Control Channel link.
  • As always only two Shared Ethernet Adapters can be partners; the Hardware Management Console ensures that priorities 1 and 2 are used (I’ve seen some customers using priority 3 and 4, don’t do this).
  • Both failover and sharing mode can be used.
  • Shared Ethernet Adapters with a dedicated Control Channel Adapter can be migrated to this configuration, at the cost of a network outage : the SEA has to be put in defined state first.

Here is an example of this configuration on a Shared Ethernet Adapter in Sharing Mode :

sea_no_ctl_chan_fig1

On the image below you can follow the steps of this new discovery protocol :

  • 1/No dedicated Control Channel Adapter is given at Shared Ethernet Adapter creation. The discovery protocol is used if you create a SEA in failover or sharing mode without specifying the ctl_chan attribute.
  • 2/Partners are identified by their PVID; both partners must have the same PVID.
  • 3/This PVID has to be unique per SEA pair.
  • 4/Additional vlan IDs are compared : partners whose additional vlan IDs do not match are still considered partners if their PVIDs match.
  • 5/Shared Ethernet Adapters with matching additional vlan IDs but different PVIDs are not considered partners.
  • 6/If partners do not have matching additional vlan IDs they are still considered partners, but an error is logged in the errlog.

sea_no_ctl_chan_fig2
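
To make the matching rules above concrete, here is a small Python model of the decision. It is only a sketch of the rules as described in this post, not the actual Virtual I/O Server implementation; the `Sea` class and the `match_partners` function are hypothetical names :

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Sea:
    name: str
    pvid: int
    extra_vlans: FrozenSet[int] = frozenset()


def match_partners(a: Sea, b: Sea) -> Tuple[bool, FrozenSet[int]]:
    """Return (are_partners, vlans present on only one side).

    Rule 2 and 5: partners are decided by the PVID alone.
    Rule 4 and 6: mismatched additional vlans do not break the
    partnership, they are only reported (errlog entry on the VIOS).
    """
    if a.pvid != b.pvid:
        return False, frozenset()
    return True, a.extra_vlans ^ b.extra_vlans


vios1 = Sea("vios1/ent18", pvid=15, extra_vlans=frozenset({1659, 1682}))
vios2 = Sea("vios2/ent18", pvid=15, extra_vlans=frozenset({1001, 1659, 1682}))
partners, mismatch = match_partners(vios1, vios2)
print(partners, sorted(mismatch))
```

The two adapters are reported as partners because their PVIDs match, and the vlan that exists on one side only is returned separately so that it can be logged.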

Prerequisites

A Shared Ethernet Adapter without a Control Channel Adapter can’t be created on all systems. At the time of writing this post only a few models of POWER7 machines (maybe POWER8) have the firmware implementing the feature. You have to check that the firmware of your machine is at least a XX780_XXX release. Be careful to check the release notes of the firmware : some of the 780 firmwares do not permit the creation of a SEA without a Control Channel Adapter (especially on 9117-MMB). Here is an example on this page : link here, the release note says : “Support was added to the Management Console command line to allow configuring a shared control channel for multiple pairs of Shared Ethernet Adapters (SEAs). This simplifies the control channel configuration to reduce network errors when the SEAs are in fail-over mode. This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.” Because the Hardware Management Console uses the vlan 4095 to create the Control Channel link between Shared Ethernet Adapters, it has to be aware of this feature and must ensure that the vlan 4095 is not usable or configurable by the administrator. The HMC V7R7.8.0 is aware of this, which is why the HMC must be updated to at least this level.
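
If you have many machines to check, the two level requirements can be tested from saved lslic and lshmc -V output. The following Python sketch is only an illustration : it assumes that the last three digits of the ecnumber field give the firmware release and that base_version has the VxRy.z.w form, and the function names are mine. It does not know about the models excluded in the release notes (9117-MMB, 9179-MHB), so a positive result is necessary but not sufficient :

```python
import re
from typing import Tuple

MIN_FIRMWARE_RELEASE = 780        # XX780_XXX
MIN_HMC_VERSION = (7, 7, 8, 0)    # V7R7.8.0


def firmware_release(lslic_line: str) -> int:
    """Extract the firmware release (e.g. 780) from one line of lslic output."""
    fields = dict(item.split("=", 1) for item in lslic_line.strip().split(","))
    match = re.search(r"(\d{3})$", fields["ecnumber"])
    if match is None:
        raise ValueError("unexpected ecnumber: " + fields["ecnumber"])
    return int(match.group(1))


def hmc_version(lshmc_output: str) -> Tuple[int, ...]:
    """Extract (version, release digits...) from lshmc -V output."""
    match = re.search(r"base_version=V(\d+)R(\d+)\.(\d+)\.(\d+)", lshmc_output)
    if match is None:
        raise ValueError("base_version not found")
    return tuple(int(group) for group in match.groups())


def prerequisites_met(lslic_line: str, lshmc_output: str) -> bool:
    return (firmware_release(lslic_line) >= MIN_FIRMWARE_RELEASE
            and hmc_version(lshmc_output) >= MIN_HMC_VERSION)


# Shortened samples of the real command output
LSLIC = "lic_type=Managed System,activated_level=56,activated_spname=FW780.10,ecnumber=01AM780,mtms=9117-MMD*658B2AD"
LSHMC = '"version= Version: 7\n Release: 7.9.0\n Service Pack: 0\n","base_version=V7R7.9.0\n"'
print(firmware_release(LSLIC), hmc_version(LSHMC), prerequisites_met(LSLIC, LSHMC))
```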

  • Check your machine firmware. In my case I’m working on a 9117-MMD (P7+770) with the latest firmware available (at the time of writing this post) :
# lsattr -El sys0 -a modelname
modelname IBM,9117-MMD Machine name False
# lsmcode -A
sys0!system:AM780_056 (t) AM780_056 (p) AM780_056 (t)
  • These prerequisites can be checked directly from the Hardware Management Console :
hscroot@myhmc:~> lslic -t sys -m 9117-MMD-65XXXX
lic_type=Managed System,management_status=Enabled,disabled_reason=,activated_level=56,activated_spname=FW780.10,installed_level=56,installed_spname=FW780.10,accepted_level=56,accepted_spname=FW780.10,ecnumber=01AM780,mtms=9117-MMD*658B2AD,deferred_level=None,deferred_spname=FW780.10,platform_ipl_level=56,platform_ipl_spname=FW780.10,curr_level_primary=56,curr_spname_primary=FW780.10,curr_ecnumber_primary=01AM780,curr_power_on_side_primary=temp,pend_power_on_side_primary=temp,temp_level_primary=56,temp_spname_primary=FW780.10,temp_ecnumber_primary=01AM780,perm_level_primary=56,perm_spname_primary=FW780.10,perm_ecnumber_primary=01AM780,update_control_primary=HMC,curr_level_secondary=56,curr_spname_secondary=FW780.10,curr_ecnumber_secondary=01AM780,curr_power_on_side_secondary=temp,pend_power_on_side_secondary=temp,temp_level_secondary=56,temp_spname_secondary=FW780.10,temp_ecnumber_secondary=01AM780,perm_level_secondary=56,perm_spname_secondary=FW780.10,perm_ecnumber_secondary=01AM780,update_control_secondary=HMC
  • Check your Hardware Management Console release is at least V7R7.8.0 (in my case my HMC is at the latest level available at the time of writing this post) :
hscroot@myhmc:~> lshmc -V
"version= Version: 7
 Release: 7.9.0
 Service Pack: 0
HMC Build level 20140409.1
MH01406: Required fix for HMC V7R7.9.0 (04-16-2014)
","base_version=V7R7.9.0
"

Shared Ethernet Adapter creation in sharing mode without control channel

The creation is simple : just identify your Real Adapter and your Virtual Adapter(s). Check on both Virtual I/O Servers that the PVIDs used on the Virtual Adapters are the same and that the priorities are ok (use priority 1 on the PRIMARY Virtual I/O Server and priority 2 on the BACKUP Virtual I/O Server). In this post I’m creating a Shared Ethernet Adapter in sharing mode; the steps are the same if you are creating a Shared Ethernet Adapter in auto mode.

  • Identify the Real Adapter (in my case an LACP 802.3ad adapter) :
  • padmin@vios1$ lsdev -dev ent17
    name             status      description
    ent17            Available   EtherChannel / IEEE 802.3ad Link Aggregation
    padmin@vios2$ lsdev -dev ent17
    name             status      description
    ent17            Available   EtherChannel / IEEE 802.3ad Link Aggregation
    
  • Identify the Virtual Adapters : priority 1 on PRIMARY Virtual I/O Server and priority 2 on BACKUP Virtual I/O Server (my advice is to check that additional vlan IDs are ok too) :
  • padmin@vios1$ entstat -all ent13 | grep -iE "Priority|Port VLAN ID"
      Priority: 1  Active: False
    Port VLAN ID:    15
    padmin@vios1$ entstat -all ent14 | grep -iE "Priority|Port VLAN ID"
      Priority: 1  Active: False
    Port VLAN ID:    16
    padmin@vios2$ entstat -all ent13 | grep -iE "Priority|Port VLAN ID"
      Priority: 2  Active: True
    Port VLAN ID:    15
    padmin@vios2$ entstat -all ent14 | grep -iE "Priority|Port VLAN ID"
      Priority: 2  Active: True
    Port VLAN ID:    16
    
  • Create the Shared Ethernet Adapter without specifying the ctl_chan attribute :
  • padmin@vios1$ mkvdev -sea ent17 -vadapter ent13 ent14 -default ent13 -defaultid 15 -attr ha_mode=sharing largesend=1 large_receive=yes
    ent18 Available
    padmin@vios2$ mkvdev -sea ent17 -vadapter ent13 ent14 -default ent13 -defaultid 15 -attr ha_mode=sharing largesend=1 large_receive=yes
    ent18 Available
    
  • The Shared Ethernet Adapters are created! You can check that the ctl_chan attribute is empty when checking the device :
  • padmin@svios1$ lsdev -dev ent18 -attr
    attribute     value       description                                                        user_settable
    
    accounting    disabled    Enable per-client accounting of network statistics                 True
    adapter_reset yes         Reset real adapter on HA takeover                                  True
    ctl_chan                  Control Channel adapter for SEA failover                           True
    gvrp          no          Enable GARP VLAN Registration Protocol (GVRP)                      True
    ha_mode       sharing     High Availability Mode                                             True
    [..]
    pvid          15          PVID to use for the SEA device                                     True
    pvid_adapter  ent13       Default virtual adapter to use for non-VLAN-tagged packets         True
    qos_mode      disabled    N/A                                                                True
    queue_size    8192        Queue size for a SEA thread                                        True
    real_adapter  ent17       Physical adapter associated with the SEA                           True
    send_RARP     yes         Transmit Reverse ARP after HA takeover                             True
    thread        1           Thread mode enabled (1) or disabled (0)                            True
    virt_adapters ent13,ent14 List of virtual adapters associated with the SEA (comma separated) True
    
  • By using the entstat command you can check that the Control Channel exists and is using the PVID 4095 (same result on second Virtual I/O Server) :
  • padmin@vios1$ entstat -all ent18 | grep -i "Control Channel PVID"
        Control Channel PVID: 4095
    
  • Looking at the entstat output, the SEAs are partners (one PRIMARY_SH and one BACKUP_SH) :
padmin@vios1$ entstat -all ent18 | grep -i state
    State: PRIMARY_SH
padmin@vios2$  entstat -all ent18 | grep -i state
    State: BACKUP_SH
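
The PVID and priority checks done by eye in the steps above can also be scripted against saved entstat output before running mkvdev. This is a minimal Python sketch under the assumption that the "Priority:" and "Port VLAN ID:" lines look like the ones shown in this post; `priority_and_pvid` and `check_pair` are hypothetical helper names :

```python
import re
from typing import List, Tuple


def priority_and_pvid(entstat_text: str) -> Tuple[int, int]:
    """Read the trunk priority and the PVID of one virtual adapter."""
    priority = re.search(r"Priority:\s*(\d+)", entstat_text)
    pvid = re.search(r"Port VLAN ID:\s*(\d+)", entstat_text)
    if priority is None or pvid is None:
        raise ValueError("Priority or Port VLAN ID not found")
    return int(priority.group(1)), int(pvid.group(1))


def check_pair(primary_text: str, backup_text: str) -> List[str]:
    """Return a list of problems; an empty list means the pair looks consistent."""
    problems = []
    p_prio, p_pvid = priority_and_pvid(primary_text)
    b_prio, b_pvid = priority_and_pvid(backup_text)
    if p_pvid != b_pvid:
        problems.append(f"PVID mismatch: {p_pvid} vs {b_pvid}")
    if (p_prio, b_prio) != (1, 2):
        problems.append(f"unexpected priorities: {p_prio} and {b_prio}")
    return problems


# entstat -all ent13 | grep -iE "Priority|Port VLAN ID" on each Virtual I/O Server
VIOS1_ENT13 = "  Priority: 1  Active: False\nPort VLAN ID:    15\n"
VIOS2_ENT13 = "  Priority: 2  Active: True\nPort VLAN ID:    15\n"
print(check_pair(VIOS1_ENT13, VIOS2_ENT13) or "pair looks consistent")
```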

Verbose and intelligent errlog

While configuring a Shared Ethernet Adapter in this mode the errlog can give you a lot of information about your configuration. For instance if additional vlan IDs do not match between the Virtual Adapters of partner Shared Ethernet Adapters you’ll be warned by an error in the errlog. Here are a few examples :

  • Additional vlan IDs do not match between Virtual Adapters :
padmin@vios1$ errlog | more
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
A759776F   0506205214 I H ent18          SEA HA PARTNERS VLANS MISMATCH
  • Looking at the detailed output you can get the missing vlan id :
padmin@vios1$ errlog -ls | more
---------------------------------------------------------------------------
LABEL:          VIOS_SEAHA_DSCV_VLA
IDENTIFIER:     A759776F
Date/Time:       Tue May  6 20:52:59 2014
Sequence Number: 704
Machine Id:      00XXXXXXXX00
Node Id:         vios1
Class:           H
Type:            INFO
WPAR:            Global
Resource Name:   ent18
Resource Class:  adapter
Resource Type:   sea
Location:

Description
SEA HA PARTNERS VLANS MISMATCH

Probable Causes
VLAN MISCONFIGURATION

Failure Causes
VLAN MISCONFIGURATION

        Recommended Actions
        NONE

Detail Data
ERNUM
0000 001A
ABSTRACT
Discovered HA partner with unmatched VLANs
AREA
VLAN misconfiguration
BUILD INFO
BLD: 1309 30-10:08:58 y2013_40A0
LOCATION
Filename:sea_ha.c Function:seaha_process_dscv_init Line:6156
DATA
VLAN = 0x03E9
  • The last line is the value of the missing vlan in hexadecimal (0x03E9, which is 1001 in decimal). We can manually check that this vlan is missing on vios1 :
# echo "ibase=16; 03E9" | bc
1001
padmin@vios1$ entstat -all ent18 | grep -i "VLAN Tag IDs:"
VLAN Tag IDs:  1659
VLAN Tag IDs:  1682
VLAN Tag IDs:  1682
padmin@vios2$ entstat -all ent18 | grep -i "VLAN Tag IDs:"
VLAN Tag IDs:  1659
VLAN Tag IDs:  1001  1682
VLAN Tag IDs:  1001  1682
  • A loss of communication between SEAs will also be logged in the errlog :
padmin@vios1$ errlog | more
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
B8C78C08   0502231214 I H ent18          SEA HA PARTNER LOST
padmin@vios1$ errlog -ls | more
Location:

Description
SEA HA PARTNER LOST

Probable Causes
SEA HA PARTNER DOWN

Failure Causes
SEA HA PARTNER DOWN

        Recommended Actions
        INITIATE PARTNER DISCOVERY

Detail Data
ERNUM
0000 0019
ABSTRACT
Initiating partner discovery due to lost partner
AREA
SEA HA discovery partner lost
BUILD INFO
BLD: 1309 30-10:08:58 y2013_40A0
LOCATION
Filename:sea_ha.c Function:seaha_dscv_ka_rcv_timeout Line:2977
DATA
Partner MAC: 0x1A:0xC4:0xFD:0x72:0x9B:0x0F
  • Be careful when reading the errlog : a SEA in sharing mode will “become primary” even if it is the “backup” SEA (you have to look at the details with the errlog -ls command) :
padmin@vios1$ errlog | grep BECOME
E48A73A4   0506205214 I H ent18          BECOME PRIMARY
padmin@vios2$ errlog | grep BECOME
1FE2DD91   0506205314 I H ent18          BECOME PRIMARY
padmin@vios1$ errlog -ls | more
LABEL:          VIOS_SEAHA_PRIMARY
IDENTIFIER:     E48A73A4
[..]
Description
BECOME PRIMARY
[..]
padmin@vios2$ errlog -ls | more
LABEL:          VIOS_SEAHA_BACKUP
IDENTIFIER:     1FE2DD91
[..]
Description
BECOME PRIMARY
[..]
ABSTRACT
Transition from INIT to BACKUP
[..]
seahap->state= 0x00000003
Become the Backup SEA
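
The vlan mismatch check shown above (hexadecimal value from the errlog, then a comparison of the "VLAN Tag IDs" lines) is easy to automate. Here is a short Python sketch working on saved command output; the helper names `errlog_vlan` and `vlan_tag_ids` are mine, and the line formats are assumed to be the ones shown in this post :

```python
import re
from typing import Set


def errlog_vlan(data_line: str) -> int:
    """Convert the 'VLAN = 0x03E9' detail line of the errlog to a decimal vlan id."""
    return int(data_line.split("=", 1)[1].strip(), 16)


def vlan_tag_ids(entstat_text: str) -> Set[int]:
    """Collect every vlan listed on the 'VLAN Tag IDs:' lines of entstat -all."""
    vlans: Set[int] = set()
    for line in entstat_text.splitlines():
        match = re.match(r"\s*VLAN Tag IDs:\s*(.*)", line)
        if match:
            vlans.update(int(v) for v in match.group(1).split())
    return vlans


VIOS1 = "VLAN Tag IDs:  1659\nVLAN Tag IDs:  1682\nVLAN Tag IDs:  1682\n"
VIOS2 = "VLAN Tag IDs:  1659\nVLAN Tag IDs:  1001  1682\nVLAN Tag IDs:  1001  1682\n"

missing_on_vios1 = vlan_tag_ids(VIOS2) - vlan_tag_ids(VIOS1)
print(errlog_vlan("VLAN = 0x03E9"), sorted(missing_on_vios1))
```

The set difference gives the vlans configured on vios2 but missing on vios1, which should be the same vlan the errlog reports.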

Removing the control channel adapter from an existing Shared Ethernet Adapter

A “classic” Shared Ethernet Adapter can be modified to work without a dedicated Control Channel Adapter. This modification requires a network outage and the Shared Ethernet Adapter needs to be in defined state. I DO NOT LIKE to do administration as root on Virtual I/O Servers but I’ll do it here because the mkdev command is needed :

  • On both Virtual I/O Servers put the Shared Ethernet Adapter in defined state :
padmin@vios1$ oem_setup_env
root@vios1# rmdev -l ent18
ent18 Defined
padmin@vios2$ oem_setup_env
root@vios2# rmdev -l ent18
ent18 Defined
  • On both Virtual I/O Servers remove the dedicated Control Channel Adapter for both Shared Ethernet Adapters :
root@vios1# lsattr -El ent18 -a ctl_chan
ctl_chan ent12 Control Channel adapter for SEA failover True
root@vios1# chdev -l ent18 -a ctl_chan=""
ent18 changed
root@vios1# lsattr -El ent18 -a ctl_chan
ctl_chan  Control Channel adapter for SEA failover True
root@vios2# lsattr -El ent18 -a ctl_chan
ctl_chan ent12 Control Channel adapter for SEA failover True
root@vios2# chdev -l ent18 -a ctl_chan=""
ent18 changed
root@vios2# lsattr -El ent18 -a ctl_chan
ctl_chan  Control Channel adapter for SEA failover True
  • Put each Shared Ethernet Adapter in available state by using the mkdev command :
root@vios1# mkdev -l ent18
ent18 Available
root@vios2# mkdev -l ent18
ent18 Available
  • Verify that the Shared Ethernet Adapter is now using vlan 4095 as Control Channel PVID :
padmin@vios1$ entstat -all ent18 | grep -i "Control Channel PVID"
    Control Channel PVID: 4095
padmin@vios2$ entstat -all ent18 | grep -i "Control Channel PVID"
    Control Channel PVID: 4095

The first step to a global PowerVM simplification

Be aware that this simplification is one of the first steps of a much larger project. With the latest version of the HMC (v8R8.1.0) a lot of new features will be available (June 2014). I can’t wait to test the “single point of management” for Virtual I/O Servers. Anyway, creating a Shared Ethernet Adapter is easier than before. Use this method to avoid human errors and misconfiguration of your Shared Ethernet Adapters. As always I hope this post will help you to understand this simplification. :-)

Hardware Management Console : Be autonomous with upgrade, update, and use the Integrated Management Module

I’m sure that, like me, you do not have physical access to your Hardware Management Consoles, or even if you have this access, some of your HMCs are so far away from your working site (even in a foreign country) that you can’t afford to physically travel there to update them. Even worse, if like me you are working in a big place (who says too big ?) this job is often performed by IBM Inspectors : you do not have to worry about your Hardware Management Consoles and just have to ask the IBM guys for anything about the HMC. For some reasons I had to update an old Hardware Management Console from v7r7.3.0 to v7r7.7.0 SP3. Everybody is confused about the differences between updating an HMC and upgrading an HMC. I know really good bloggers, Anthony English and Rob McNelly, have already posted about this particular subject, but I have to write this post as a reminder and to clarify some points which are not covered in Anthony’s and Rob’s posts.

To finish this post I’ll talk about a feature nobody is using : the HMC comes with an Integrated Management Module, which gives you more control and lets you be autonomous with your HMC.

The difference between updating, upgrading and migrating

There is a lot of confusion when people are trying to “update” their HMC. When do I have to update using the updhmc command, when do I have to upgrade using the saveupgdata, getupgfiles and chhmc commands, and finally when do I have to migrate using the HMC Recovery CD/DVD ? These three operations are not well described by IBM. Here is what I’m doing in each case, and this is the result of my own experience (don’t take this as an official document). Here is a little reminder that can be useful for a lot of people : an HMC version number looks like this : v7r7.7.0 SP3. v7 is the VERSION, 7.7.0 is the RELEASE and SP3 is the SERVICE PACK.

  • Updating : You have to update your HMC if you are applying a service pack or a corrective fix on the HMC; this operation can only be performed with the updhmc command. Use this method if fix central gives you an iso named “HMC_Update_*.iso” or a zip file named “MHxxxxx.zip”. These fixes can be applied to a minor version of the HMC.
  • Upgrading : You have to upgrade your HMC if you are moving from one minor version to another (from one release to another), for instance if you are moving from v7r7.7.0 to v7r7.8.0. This operation can be done by using the HMC Recovery DVD (fix central will give you two isos named “HMC_Recovery_*.iso”), or by using the images provided for a network upgrade (I’ll explain this in this post).
  • Migrating : You have to migrate your HMC by using the HMC Recovery DVD when you are moving from one major version to another, for example when you are moving from an HMC v6 to an HMC v7 (for instance from any v6 version to v7r7.8.0). In this case you have no other choice than burning DVDs and standing in front of the HMC to perform the operation yourself.
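
The three cases above boil down to comparing the VERSION, the RELEASE and the SERVICE PACK of the current and target levels. The following Python sketch encodes that rule of thumb (again my own experience, not an official IBM decision tree). It assumes version strings written like “v7r7.7.0 SP3”, and `required_operations` is a hypothetical helper name :

```python
import re
from typing import List, Tuple


def parse_hmc_version(text: str) -> Tuple[int, Tuple[int, int, int], int]:
    """'v7r7.7.0 SP3' -> (7, (7, 7, 0), 3); a missing SP counts as 0."""
    match = re.fullmatch(r"[vV](\d+)[rR](\d+)\.(\d+)\.(\d+)(?:\s+SP(\d+))?", text.strip())
    if match is None:
        raise ValueError("unrecognised HMC version: " + text)
    version = int(match.group(1))
    release = (int(match.group(2)), int(match.group(3)), int(match.group(4)))
    service_pack = int(match.group(5) or 0)
    return version, release, service_pack


def required_operations(current: str, target: str) -> List[str]:
    """List the operations needed to go from current to target."""
    cur_version, cur_release, cur_sp = parse_hmc_version(current)
    tgt_version, tgt_release, tgt_sp = parse_hmc_version(target)
    if tgt_version != cur_version:
        steps = ["migrate"]            # HMC Recovery DVD, on site
    elif tgt_release != cur_release:
        steps = ["upgrade"]            # Recovery DVD or network images
    else:
        return ["update"] if tgt_sp != cur_sp else []
    if tgt_sp > 0:
        steps.append("update")         # service packs are applied with updhmc
    return steps


print(required_operations("v7r7.3.0", "v7r7.7.0 SP3"))
```

For the case described in this post (v7r7.3.0 to v7r7.7.0 SP3) this gives an upgrade followed by an update, which is exactly the order of the next two sections.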

Upgrading

You can upgrade your HMC from its local storage by using the network images provided by IBM on a public FTP server. Once connected to the public FTP server, go to the version you want to upgrade to and download all the files, bzImage and initrd.gz included :

# ftp://ftp.software.ibm.com/software/server/hmc/network/v7770/
# ls
FTP Listing of /software/server/hmc/network/v7770/ at ftp.software.ibm.com
[..]
Feb 26 2013 00:00      2708320 bzImage
Feb 26 2013 00:00    808497152 disk1.img
Feb 26 2013 00:00   1142493184 disk2.img
Feb 26 2013 00:00   1205121024 disk3.img
Feb 26 2013 00:00           78 hmcnetworkfiles.sum
Feb 26 2013 00:00     34160044 initrd.gz
# mget *

Put all the files on a server where you have an FTP server running (the HMC getupgfiles command uses FTP to get the files) and download all the files with the getupgfiles command directly from the HMC (if your HMC has direct access to the internet you can specify the IBM FTP server directly in the command) :

hscroot@gaff:~> getupgfiles -h 192.168.0.99 -u root -d /export/HMC/network_ugrade/v7770
Enter the current password for user root:

While the images are downloading the HMC mounts a temporary filesystem called /hmcdump and puts the images in it. Once the images are downloaded the /hmcdump filesystem is unmounted. You can check the download progress with a loop looking at the /hmcdump filesystem :

hscroot@gaff:~>  while true ; do date; ls -la /hmcdump; sleep 60; done
[..]
drwxr-xr-x  3 root root      4096 2013-12-24 16:26 .
drwxr-xr-x 30 root root      4096 2013-12-19 14:52 ..
-rw-r--r--  1 root hmc  824223312 2013-12-24 16:32 disk3.img
-rw-r--r--  1 root hmc         78 2013-12-24 16:26 hmcnetworkfiles.sum
drwx------  2 root root     16384 2007-12-19 03:24 lost+found
Tue Apr  1 08:10:30 CEST 2014
total 3121248
drwxr-xr-x  3 root root       4096 2013-12-24 16:52 .
drwxr-xr-x 30 root root       4096 2013-12-19 14:52 ..
-rw-r--r--  1 root hmc     2708320 2013-12-24 16:52 bzImage
-rw-r--r--  1 root hmc   808497152 2013-12-24 16:52 disk1.img
-rw-r--r--  1 root hmc  1142493184 2013-12-24 16:45 disk2.img
-rw-r--r--  1 root hmc  1205121024 2013-12-24 16:36 disk3.img
-rw-r--r--  1 root hmc          78 2013-12-24 16:26 hmcnetworkfiles.sum
-rw-r--r--  1 root hmc    34160044 2013-12-24 16:52 initrd.gz
drwx------  2 root root      16384 2007-12-19 03:24 lost+found
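
If you prefer a percentage to a raw listing, the sizes printed by ls -la can be compared with the sizes from the FTP listing. Here is a small Python sketch that works on a saved copy of the listing (it does not run on the HMC restricted shell); the `EXPECTED_SIZES` table is copied from the v7770 FTP listing above and `download_progress` is a made-up helper name :

```python
# File sizes in bytes, taken from the FTP listing of the v7770 network images
EXPECTED_SIZES = {
    "bzImage": 2708320,
    "disk1.img": 808497152,
    "disk2.img": 1142493184,
    "disk3.img": 1205121024,
    "hmcnetworkfiles.sum": 78,
    "initrd.gz": 34160044,
}


def download_progress(ls_output: str) -> float:
    """Return the fraction (0.0 to 1.0) of the expected bytes already present."""
    done = 0
    for line in ls_output.splitlines():
        parts = line.split()
        # ls -la line: mode, links, owner, group, size, date, time, name
        if len(parts) >= 8 and parts[-1] in EXPECTED_SIZES:
            done += min(int(parts[4]), EXPECTED_SIZES[parts[-1]])
    return done / sum(EXPECTED_SIZES.values())


# First listing shown above: disk3.img is still being transferred
PARTIAL = """\
-rw-r--r--  1 root hmc  824223312 2013-12-24 16:32 disk3.img
-rw-r--r--  1 root hmc         78 2013-12-24 16:26 hmcnetworkfiles.sum
drwx------  2 root root     16384 2007-12-19 03:24 lost+found
"""
print(f"{download_progress(PARTIAL):.0%} downloaded")
```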

Please note that this filesystem is only mounted while the getupgfiles command is running and can’t be mounted after the command execution … :

hscroot@gaff:~> mount /hmcdump
mount: only root can mount /dev/sda6 on /hmcdump

Before launching the upgrade save all the data needed for the upgrade to disk, close all HMC events and clear all the filesystems :

  • Save all HMC upgrade data to disk. This command is MANDATORY : it saves all the partition profile data, the user data and the whole HMC configuration. If you forget this command you will have to reconfigure the HMC by hand, so be careful with this one :-) :
  • hscroot@gaff:~> saveupgdata -r disk
    
  • Close all HMC events :
  • hscroot@gaff:~> chsvcevent -o closeall
    
  • Remove all temporary HMC files from all filesystems :
  • hscroot@gaff:~> chhmcfs -o f -d 0
    

The images are now downloaded to the HMC. To upgrade the HMC you just have to tell it to boot on its alternate disk and to use the files you’ve just downloaded for the upgrade :

  • To set the alternate disk partition on the HMC as a startup device on the next HMC boot and enable the upgrade on the alternate disk use the chhmc command :
  • hscroot@gaff:~> chhmc -c altdiskboot -s enable --mode upgrade
    
  • Before rebooting check the altdiskboot attribute is set to enable :
  • hscroot@gaff:~> lshmc -r
    ssh=enable,sshprotocol=,remotewebui=enable,xntp=enable,"xntpserver=127.127.1.0,kronosnet1.fr.net.intra,kronosnet2.fr.net.intra,kronosnet3.fr.net.intra",syslogserver=,netboot=disable,altdiskboot=enable,ldap=enable,kerberos=disable,kerberos_default_realm=,kerberos_realm_kdc=,kerberos_clockskew=,kerberos_ticket_lifetime=,kerberos_keyfile_present=,"sol=disabled
    "
    
  • Reboot the HMC and wait :-) :
  • hscroot@gaff:~> hmcshutdown -t now -r
    

Depending on the HMC model and on the version of the HMC the upgrade can take 10 to 40 minutes; you’ll have to be patient, cross your fingers and pray everything goes well. But don’t worry, I never had an issue with this method. Once the HMC is rebooted and upgraded, you can check that the altdiskboot attribute is now set to disable :

hscroot@gaff:~> lshmc -r
ssh=enable,sshprotocol=,remotewebui=enable,xntp=enable,"xntpserver=127.127.1.0,kronosnet1.fr.net.intra,kronosnet2.fr.net.intra,kronosnet3.fr.net.intra",syslogserver=,syslogtcpserver=,syslogtlsserver=,netboot=disable,altdiskboot=disable,ldap=enable,kerberos=disable,kerberos_default_realm=,kerberos_realm_kdc=,kerberos_clockskew=,kerberos_ticket_lifetime=,kpasswd_admin=,trace=,kerberos_keyfile_present=,legacyhmccomm=enable,sol=disabled
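
The lshmc -r output is a single comma-separated line in which values containing commas (the xntpserver list) are quoted, so a naive split on commas gives wrong results. If you collect this output from several HMCs, here is a minimal Python sketch that reads it with the csv module. `parse_lshmc` is a hypothetical helper name and the sample line is a shortened version of the output above :

```python
import csv
from typing import Dict


def parse_lshmc(line: str) -> Dict[str, str]:
    """Turn one line of 'lshmc -r' output into a {attribute: value} dict."""
    fields = next(csv.reader([line.strip()]))
    # Each field is 'name=value'; the value itself may contain commas
    return dict(field.split("=", 1) for field in fields if "=" in field)


SAMPLE = ('ssh=enable,sshprotocol=,xntp=enable,'
          '"xntpserver=127.127.1.0,kronosnet1.fr.net.intra",'
          'netboot=disable,altdiskboot=enable,ldap=enable')

settings = parse_lshmc(SAMPLE)
print("altdiskboot is", settings["altdiskboot"])
```

Checking `settings["altdiskboot"]` before the reboot (enable) and after the upgrade (disable) gives the same verification as reading the raw line.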

Updating

Once the HMC is upgraded you have to update it. Unfortunately update files (often ISO files) are only available on fix central and not on the public FTP. Get the ISO update files from fix central and put them (once again) on your FTP server, then use the updhmc command to update the HMC. Repeat the operation for each update, and then reboot the HMC (in the example below I’m using sftp) :

hscroot@gaff:~> updhmc -t s -i -h 192.168.0.99 -u root -f /export/HMC/v7r770/HMC_Update_V7R770_SP1.iso
Password:
iptables: Chain already exists.
ip6tables: Chain already exists.
[..]
The corrective service file was successfully applied. A mandatory reboot is required but was not specified on the command syntax.
hscroot@gaff:~> updhmc -t s -i -h 192.168.0.99 -u root -f /export/HMC/v7r770/HMC_Update_V7R770_SP2.iso
Password:

ip6tables: Chain already exists.
ACCEPT  tcp opt -- in eth3 out *  0.0.0.0/0  -> 0.0.0.0/0  tcp dpt:5989
ACCEPT  udp opt -- in eth3 out *  0.0.0.0/0  -> 0.0.0.0/0  udp dpt:657
[..]
The corrective service file was successfully applied. A mandatory reboot is required but was not specified on the command syntax.
hscroot@gaff:~> hmcshutdown -t now -r

After upgrading and updating the HMC check the version is ok with the lshmc command :

hscroot@gaff:~> lshmc -V
"version= Version: 7
 Release: 7.7.0
 Service Pack: 3
HMC Build level 20131113.1
","base_version=V7R7.7.0
"

Using and configuring the Integrated Management Module

I like to be autonomous and do things on my own. Who has never been stuck on a problem with an HMC and been forced to call an IBM inspector to reboot the HMC or even to insert a CD in the CD-ROM drive ? Few people know this, but the HMC is based on an IBM xSeries server (Who said Lenovo ?) and is shipped with an Integrated Management Module allowing you to boot, start, and stop the HMC without needing someone in the data-center. Unfortunately this method seems not to be supported by IBM, so do it at your own risk.

Use the dedicated port for the Integrated Management Module (the red port)

hmc_imm_port_good

From the HMC command line using the chhmc command configure the Integrated Management Module IP address :

hscroot@gaff:~> chhmc -c imm -s modify -a 10.10.20.4 -nm 255.255.255.0 -g 10.10.20.254

Restart the Integrated Management Module to commit the changes. The IMM will not be pingable before restart :

hscroot@gaff:~> chhmc -c imm -s restart

The Integrated Management Module is now pingable and you can check its configuration :

hscroot@gaff:~> lshmc -i
ipv4addr=10.10.20.4,networkmask=255.255.255.0,gateway=10.10.20.254,username=USERID,mode=Dedicated

By default the username is USERID and the password is PASSW0RD (with a zero), you can change it to fit your needs :

hscroot@gaff:~> chhmc -c imm -s modify -u immusername --passwd "abc123"

The Integrated Management Module is now configured and can be accessed from the web interface or from SSH :

hmc_imm_login

I will not detail all the actions you can do with the Integrated Management Module but here is a screen showing the Hardware Health of the HMC :

hmc_imm_hardware

One thing you can do for free (without an IMM license) is to control the power of the HMC, choosing to stop, start, restart or reboot it. This feature can be very useful when the HMC is stuck :

hmc_imm_actions

If you choose to restart the HMC the Integrated Management Module will warn you before restarting :

hmc_imm_restart

You can access the HMC Integrated Management Module by using the SSH command line :

  • Use the power command to control the power of the HMC :
  • system> help power
    usage:
       power on    [-options]   - power on server
       power off   [-options]   - power off server
       power cycle [-options]   - power off, then on
       power state              - display power state
       power -rp [alwayson|alwaysoff|restore]   - host power restore policy
    options:
       -s                       - shut down OS first
       -every day               - daily or weekly on,off or cycle commands
    [Sun|Mon|Tue|Wed|Thu|Fri|Sat|Day|clear]
       -t   time                - time (hh:mm)
    additional options for on.
       -d  date                 - date (mm/dd/yyyy)
       -clear                   - clear on date
    
  • Here is an example to restart the HMC :
  • system> power cycle -s
    ok
    
  • Checking the power state :
  • system> power state
    power on
    State:Booting OS or in unsupported OS
    

The Integrated Management Module is a licensed product and unfortunately IBM does not support the Integrated Management Module on the HMC. It seems that the IMM license can’t be acquired for the HMC. I have checked on the trial licenses page and the HMC hardware does not even exist when you have to choose the hardware model for the trial license. This is a shame because the licensed IMM allows you to remote control the HMC and to manage a virtual CD-ROM ... useful for migration. So if an IBMer is reading this and has an explanation about this, feel free to tell me what I’ve missed in the comments :

hmc_imm_remote_control

I hope this post will let you manage your HMC on your own and be autonomous :-)