DHCP Option-82 injection and ISC-DHCP

DHCP Option 82 and CumulusLinux

It has been far too long since I have written something, but I have been working with Cumulus Linux for the last year and it is time I blogged about it.

I am working on building a spine-leaf architecture based on Cumulus Linux. The policy is: if we do it more than once, automate it. So I was tasked with making the installation, upgrade and replacement of switches possible with minimal user intervention.

CumulusLinux

If you haven’t heard about Cumulus Linux, a quick introduction. Cumulus Linux is a Debian-based distribution which can be installed on so-called whitelabel switches. These switches can be obtained from companies like Dell, HP, Supermicro and Mellanox. These whitelabel switches come without software and this is what Cumulus provides.

The installation process

When a switch boots for the first time it starts the Open Network Install Environment (ONIE). ONIE is PXE on steroids for network devices. It can install the software from a USB drive but also from the network via TFTP/HTTP/FTP. As we want it to be as flexible as possible we are going to use the network method. The network method relies on some DHCP-provided options.

When we add new switches to the environment I don’t want to wait until our floor managers have installed the switches, cabled them up and provided me with the MAC address of the management port before I can make a DHCP reservation. So I needed to find a way to link a DHCP request to a particular switch without relying on the MAC address. Enter DHCP Option-82.

DHCP Option-82

DHCP Option-82 is an option which can be inserted into DHCP request packets when they are relayed/forwarded by intermediate devices. The DHCP relay on Cumulus is capable of injecting the incoming source port as Option-82 in the DHCP packet.
For this demo I have created the following topology within VirtualBox and CumulusVX.

swp1 and swp2 are part of vlan100, which has an IP segment of 10.0.0.0/24. The DHCP server is placed in vlan200 with IP segment 10.0.1.0/24. The switch oob01 has a DHCP relay configured. The configuration file is below.

SERVERS="10.0.1.1"
INTF_CMD="-i vlan100 -i vlan200"
OPTIONS="-a --use-pif-circuit-id"

This config listens for DHCP requests and replies on interfaces vlan100 and vlan200 and adds the incoming circuit ID to the request. Below is the DHCP request from tor01 as it is received by the DHCP server.

04:27:14.749446 IP (tos 0x0, ttl 64, id 34211, offset 0, flags [DF], proto UDP (17), length 337)
    10.0.1.254.67 > 10.0.1.1.67: [udp sum ok] BOOTP/DHCP, Request from 08:00:27:bc:ff:cf, length 309, hops 1, xid 0x1da4587b, Flags [none] (0x0000)
	  Gateway-IP 10.0.0.254
	  Client-Ethernet-Address 08:00:27:bc:ff:cf
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: Request
	    Lease-Time Option 51, length 4: 7200
	    Requested-IP Option 50, length 4: 10.0.0.1
	    Hostname Option 12, length 7: "cumulus"
	    Parameter-Request Option 55, length 14: 
	      Subnet-Mask, BR, Time-Zone, Default-Gateway
	      Domain-Name, Domain-Name-Server, Option 119, Hostname
	      Netbios-Name-Server, Netbios-Scope, MTU, Classless-Static-Route
	      NTP, Option 239
	    Agent-Information Option 82, length 26: 
	      Circuit-ID SubOption 1, length 4: swp1
	      Remote-ID SubOption 2, length 18: 08:00:27:9b:53:b0^J
	    END Option 255, length 0

On line 17 the Circuit-ID Suboption 1 can be seen. This option can be used by ISC-DHCP to assign the correct IP address.
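To make the layout of Option 82 a bit more concrete, here is a minimal Python 3 sketch that splits the option payload into its sub-options. It is only an illustration and not part of the setup described here; the function name `parse_option82` and the sample bytes (rebuilt by hand from the capture above) are my own assumptions.

```python
def parse_option82(payload):
    """Split a raw Option 82 payload into {sub-option code: value bytes}.

    Each sub-option is encoded as one byte code, one byte length, then the value.
    """
    suboptions = {}
    offset = 0
    while offset + 2 <= len(payload):
        code = payload[offset]
        length = payload[offset + 1]
        value = payload[offset + 2:offset + 2 + length]
        if len(value) != length:
            raise ValueError("truncated sub-option %d" % code)
        suboptions[code] = value
        offset += 2 + length
    return suboptions


# Sample payload rebuilt from the capture: Circuit-ID "swp1" (sub-option 1)
# and the Remote-ID string including its trailing newline (sub-option 2).
circuit_id = b"swp1"
remote_id = b"08:00:27:9b:53:b0\n"
payload = (bytes([1, len(circuit_id)]) + circuit_id +
           bytes([2, len(remote_id)]) + remote_id)

parsed = parse_option82(payload)
print(len(payload))        # 26, the Option 82 length shown in the capture
print(parsed[1].decode())  # swp1
```

The two sub-option headers (2 bytes each) plus the 4-byte Circuit-ID and the 18-byte Remote-ID add up to the 26 bytes tcpdump reports for the option.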

ISC-DHCP

To make the DHCP server act upon the Circuit-ID, classes have to be used. DHCP requests can be matched on various fields and placed in classes. The class definitions are quite simple.


class "swp1" {
  match if (option agent.circuit-id = "swp1") and
    not (substring(option vendor-class-identifier, 0, 4) = "onie");
}
class "swp2" {
  match if (option agent.circuit-id = "swp2") and
    not (substring(option vendor-class-identifier, 0, 4) = "onie");
}
class "onie" {
  match if substring(option vendor-class-identifier, 0, 4) = "onie";
}

The classes swp1 and swp2 are used when the device is not in ONIE mode and the circuit identifier matches a switch port. The class “onie” is only used when the device is in ONIE mode. Now that the DHCP requests are placed in classes they can be assigned IP addresses. This is done by creating pools for each interface class within the subnet definition. Each pool has only one IP address available. There is also an “onie” pool which is used for the installation phase of a switch. This is required because ISC-DHCP stores a client identifier for each lease. When a device is in ONIE mode the client identifier is different from when it is running Cumulus Linux. As a result, devices which had been installed via ONIE and then rebooted were unable to get their intended address, as the lease handed out in ONIE mode was still valid. A short lease time could have prevented this, but then devices would lose their lease very quickly whenever there are problems with the DHCP server.

The DHCP subnet and pool configuration is as follows:


subnet 10.0.0.0 netmask 255.255.255.0 {
option default-url = "http://10.0.1.1/cumulus-linux-3.6.0-vx-amd64.bin";
option routers 10.0.0.254;
  pool {
    range 10.0.0.100 10.0.0.200;
    allow members of "onie";
    option default-url = "http://10.0.1.1/cumulus-linux-3.6.0-vx-amd64.bin";
  }
  pool {
    range 10.0.0.1;
    option host-name "tor01";
    allow members of "swp1";
    option routers 10.0.0.254;
    option cumulus-provision-url "http://10.0.1.1/ztp.sh";
  }
  pool {
    range 10.0.0.2;
    option host-name "tor02";
    allow members of "swp2";
    option routers 10.0.0.254;
    option cumulus-provision-url "http://10.0.1.1/ztp.sh";
  }
}
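
With more than a handful of ports, writing a class and a pool per port by hand gets tedious. The Python 3 sketch below generates both from a port-to-host mapping. The `PORTS` table and the function names are hypothetical, and the output only mirrors the stanzas shown above; it is not an official template and leaves out the per-pool options.

```python
# Hypothetical cable plan: relay port -> (hostname, fixed IP address).
PORTS = {
    "swp1": ("tor01", "10.0.0.1"),
    "swp2": ("tor02", "10.0.0.2"),
}


def render_class(port):
    """Return the class stanza matching a circuit ID outside ONIE mode."""
    return (
        'class "{0}" {{\n'
        '  match if (option agent.circuit-id = "{0}") and\n'
        '    not (substring(option vendor-class-identifier, 0, 4) = "onie");\n'
        '}}\n'
    ).format(port)


def render_pool(port, hostname, address):
    """Return a single-address pool reserved for members of the port class."""
    return (
        '  pool {{\n'
        '    range {2};\n'
        '    option host-name "{1}";\n'
        '    allow members of "{0}";\n'
        '  }}\n'
    ).format(port, hostname, address)


# Print all classes first, then the pools that belong inside the subnet block.
for port in sorted(PORTS):
    print(render_class(port))
for port in sorted(PORTS):
    hostname, address = PORTS[port]
    print(render_pool(port, hostname, address))
```

The generated pools still have to be pasted (or included) inside the subnet declaration, next to the “onie” pool.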

This setup makes it possible to replace a switch in the middle of the night without anybody changing MAC addresses in a DHCP server or using a console cable to configure an IP address on a box.
Option-82 insertion is also used to assign fixed IP addresses to the DRAC/ILO ports on our servers. Replacing a server no longer requires any action to assign the old IP to the new DRAC. The only requirement is to stick to the cable plan: if cabling errors are made, this whole setup fails.
My next post will be about the next phase in the life of a Cumulus device: ZTP.

Building a REST API for ExaBGP

Over the last couple of years there has been a trend to extend layer 3 to the top-of-rack (TOR) switch. This gives a more stable and scalable design compared to the classic layer 2 network design. One major disadvantage of layer 3 to the TOR switch is IP mobility. In the classic L2 design a VM could simply be live-migrated to a different compute host in a different rack. When L3 is extended to the TOR, IP mobility isn’t that simple anymore. A solution for this might be to let the VM host advertise a unique service IP for a particular VM when it becomes active on that VM host. A great tool for this use case is ExaBGP.

ExaBGP does not modify the route table on the host itself; it only announces routes to its neighbours. After ExaBGP starts, the routes it advertises can be changed by sending it text commands, which it reads from the output of a helper process it starts.
Below is the config used by the ExaBGP daemon.

group ebgp {
router-id 172.16.2.11;
neighbor 172.16.2.252 {
local-address 172.16.2.11;
local-as 65001;
peer-as 65000;
group-updates;
}
process add-routes {
run /etc/exabgp/exabgp_rest3.py;
}
}

Most of this is pretty self-explanatory; the important stuff happens on lines 9-11. These lines start a script and all output of this script is parsed by ExaBGP.

The script exabgp_rest3.py provides a REST API which outputs the announce and withdraw commands for ExaBGP on STDOUT.

For testing purposes I created a simple setup within KVM with two hosts: docker-1, which runs ExaBGP, and firewall-1, which runs the BIRD BGP daemon. There is an L2 segment between those hosts over which the BGP peering is established.

The Python script is only about 75 lines long.

#!/usr/bin/env python
import web
from sys import stdout
from netaddr import IPNetwork

# Map URL patterns to the handler classes below
urls = (
    '/announce/(.*)', 'announce',
    '/withdraw/(.*)', 'withdraw',
)


class MyOutputStream(object):
    def write(self, data):
        pass   # Ignore output


# Discard everything written to stderr
web.httpserver.sys.stderr = MyOutputStream()


class bgpPrefix:
    def __init__(self, prefix, action="announce", next_hop="self", attributes=None):
        self.prefix = prefix
        self.action = action
        self.next_hop = next_hop
        # Avoid a shared mutable default argument
        self.attributes = attributes if attributes is not None else {}

    def get_exabgp_message(self):
        """Build the command line that ExaBGP parses."""
        if self.action == 'withdraw':
            return "{0} route {1} next-hop {2}".format(self.action, self.prefix, self.next_hop)
        attribute_string = ""
        for attribute in self.attributes:
            if attribute == "local-preference":
                attribute_string += " local-preference {0}".format(self.attributes[attribute])
            elif attribute == "med":
                attribute_string += " med {0}".format(self.attributes[attribute])
            elif attribute == "community":
                if len(self.attributes[attribute]) > 0:
                    attribute_string += " community [ "
                    for comm in self.attributes[attribute]:
                        attribute_string += " {0} ".format(comm)
                    attribute_string += " ]"
        return "{0} route {1} next-hop {2}{3}".format(self.action, self.prefix, self.next_hop, attribute_string)


def verifyIp(ip):
    # A bare address is treated as a host route
    if '/' not in ip:
        ip = "{0}/32".format(ip)
    try:
        ip_object = IPNetwork(ip)
    except Exception:
        raise web.badrequest("invalid IP")
    return ip_object


class announce:
    def GET(self, prefix):
        ip_object = verifyIp(prefix)
        # community=[] makes web.input return the community parameter as a list
        bgp_prefix = bgpPrefix(str(ip_object), action="announce", attributes=web.input(community=[]))
        # STDOUT is parsed by ExaBGP, so only commands are written to it
        stdout.write(bgp_prefix.get_exabgp_message() + '\n')
        stdout.flush()
        return "OK"


class withdraw:
    def GET(self, prefix):
        ip_object = verifyIp(prefix)
        bgp_prefix = bgpPrefix(str(ip_object), action="withdraw")
        stdout.write(bgp_prefix.get_exabgp_message() + '\n')
        stdout.flush()
        return "OK"


app = web.application(urls, globals())

if __name__ == "__main__":
    app.run()

The heavy lifting of the web service is handled by web.py, a powerful library for creating a web server in Python. I am a network engineer with very limited experience with Python, but creating the script only took me a couple of hours.

The script in action

We start by starting the ExaBGP daemon.

.
.
.
Mon, 29 Aug 2016 21:01:14 | INFO     | 15213  | reactor       | New peer setup: neighbor 172.16.2.252 local-ip 172.16.2.11 local-as 65001 peer-as 65000 router-id 172.16.2.11 family-allowed in-open
Mon, 29 Aug 2016 21:01:14 | WARNING  | 15213  | configuration | Loaded new configuration successfully
Mon, 29 Aug 2016 21:01:14 | INFO     | 15213  | processes     | Forked process add-routes


Mon, 29 Aug 2016 21:01:16 | INFO     | 15213  | network       | Connected to peer neighbor 172.16.2.252 local-ip 172.16.2.11 local-as 65001 peer-as 65000 router-id 172.16.2.11 family-allowed in-open (out)

By default the service is started at port 8080

root@docker-1:/home/eelcon# netstat -anp | grep 8080
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      15183/python
root@docker-1:/home/eelcon#

The BGP neighbor is also shown as established by bird

bird> show protocols all bgp3
name     proto    table    state  since       info
bgp3     BGP      master   up     15:01:33    Established
  Preference:     100
  Input filter:   ACCEPT
  Output filter:  REJECT
  Routes:         0 imported, 0 exported, 0 preferred
  Route change stats:     received   rejected   filtered    ignored   accepted
    Import updates:              0          0          0          0          0
    Import withdraws:            0          0        ---          0          0
    Export updates:              0          0          0        ---          0
    Export withdraws:            0        ---        ---        ---          0
  BGP state:          Established
    Neighbor address: 172.16.2.11
    Neighbor AS:      65001
    Neighbor ID:      172.16.2.11
    Neighbor caps:    AS4
    Session:          external AS4
    Source address:   172.16.2.252
    Hold timer:       155/180
    Keepalive timer:  51/60

bird>

Adding a route is as simple as running curl on the host on which ExaBGP is running

nettinkerer@docker-1:~$ curl http://127.0.0.1:8080/announce/1.2.3.0/25
OK
nettinkerer@docker-1:~$

ExaBGP gets the announce message

Mon, 29 Aug 2016 21:08:18 | INFO     | 15231  | processes     | Command from process add-routes : announce route 1.2.3.0/25 next-hop self
Mon, 29 Aug 2016 21:08:18 | INFO     | 15231  | reactor       | Route added to neighbor 172.16.2.252 local-ip 172.16.2.11 local-as 65001 peer-as 65000 router-id 172.16.2.11 family-allowed in-open : 1.2.3.0/25 next-hop 172.16.2.11
Mon, 29 Aug 2016 21:08:18 | INFO     | 15231  | reactor       | Performing dynamic route update
Mon, 29 Aug 2016 21:08:19 | INFO     | 15231  | reactor       | Updated peers dynamic routes successfully

The BGP daemon on the firewall also knows the route

bird> show route 1.2.3.0/25 all
1.2.3.0/25         via 172.16.2.11 on ens9 [bgp3 15:08:36] * (100) [AS65001i]
        Type: BGP unicast univ
        BGP.origin: IGP
        BGP.as_path: 65001
        BGP.next_hop: 172.16.2.11
        BGP.local_pref: 100
bird>

The REST API also accepts communities and MEDs

curl "http://127.0.0.1:8080/announce/1.2.3.0/25?med=200&community=100:400&community=300:600"

which is shown by the bird daemon as well

bird> show route 1.2.3.0/25 all
1.2.3.0/25         via 172.16.2.11 on ens9 [bgp3 15:14:01] * (100) [AS65001i]
        Type: BGP unicast univ
        BGP.origin: IGP
        BGP.as_path: 65001
        BGP.next_hop: 172.16.2.11
        BGP.med: 200
        BGP.local_pref: 100
        BGP.community: (100,400) (300,600)
bird>
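
The same request can be built from a script instead of typing the query string by hand. The Python 3 sketch below only constructs the URL, using the standard library. The helper name `build_announce_url` is my own, and the endpoint layout is taken from the curl example above.

```python
from urllib.parse import urlencode


def build_announce_url(base, prefix, med=None, communities=()):
    """Build the announce URL; each community becomes its own 'community' key."""
    params = []
    if med is not None:
        params.append(("med", med))
    for community in communities:
        params.append(("community", community))
    url = "{0}/announce/{1}".format(base.rstrip("/"), prefix)
    if params:
        # safe=":" keeps communities readable (100:400 instead of 100%3A400).
        url += "?" + urlencode(params, safe=":")
    return url


print(build_announce_url("http://127.0.0.1:8080", "1.2.3.0/25",
                         med=200, communities=["100:400", "300:600"]))
```

Repeating the `community` key is what lets the API receive several communities in one request.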

Withdrawing routes can also be done easily with a curl statement

 curl "http://127.0.0.1:8080/withdraw/1.2.3.0/25"

And the route is gone

bird> show route 1.2.3.0/25 all
Network not in table
bird>

At the moment there is only limited input validation. The REST API does check whether the IP address entered is valid, but no other checks are implemented at this moment. I might add them if the need arises.
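
As an idea of what extra validation could look like, here is a small Python 3 sketch that checks MED and community values before they are turned into an ExaBGP command. It is not part of the script above; the function names are hypothetical, and it only covers standard communities written as two 16-bit numbers.

```python
import re


def valid_med(value):
    """Return True if value is a decimal string that fits in 32 bits (the MED range)."""
    return re.fullmatch(r"[0-9]+", value) is not None and int(value) <= 0xFFFFFFFF


def valid_community(value):
    """Return True for a standard community written as two 16-bit numbers, e.g. 100:400."""
    match = re.fullmatch(r"([0-9]+):([0-9]+)", value)
    if match is None:
        return False
    return all(int(part) <= 0xFFFF for part in match.groups())


print(valid_med("200"), valid_community("100:400"), valid_community("70000:1"))
# True True False
```

A handler could call these checks and answer with a bad request error, in the same way verifyIp does for invalid prefixes.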

The script and configs used in this blog can be found on my Github

 

Using the Python UCS library

Recently some VCE vBlocks have been taken into production at my current job. Although VCE installs everything for you, they didn’t configure all the required production VLANs. The VLANs need to be added to various components in the vBlock:

  • Nexus 9000
  • Nexus 1000V
  • UCS-FI

Configuring them on the Nexus devices is pretty straightforward, but configuring them on the FI is a chore for the operations team: first add the VLAN to the system and then add the VLAN to every vNIC template.

As I am still trying to improve my Python skills, I wrote a script that does this for me and adds a VLAN from the CLI.

It starts with downloading the Python SDK from Cisco and installing it on your management system. After installation you are good to go and you can start writing your own scripts. The documentation provided is not very elaborate but sufficient for a script like this.

First some modules need to be loaded. Besides the ones required for the UCS-related stuff I add a few to make the script “nice”: argparse is a library to support command line options, and getpass allows entering passwords without showing them on screen.

from UcsSdk.MoMeta.FabricLanCloud import FabricLanCloud
from UcsSdk.MoMeta.FabricVlan import FabricVlan
from UcsSdk.MoMeta.VnicLanConnTempl import VnicLanConnTempl
from UcsSdk.MoMeta.VnicEtherIf import VnicEtherIf
from UcsSdk import *

import argparse
import os
import getpass

The argument parser is created.

parser=argparse.ArgumentParser(description="Command adds or removes a vlan to a FI and all VniC profiles present")
parser.add_argument("--fi", type=str, required=True, help="IP/hostname of FI")
group=parser.add_mutually_exclusive_group(required=True)
group.add_argument("--add",action='store_true', help="vlan will be added")
group.add_argument("--del",action='store_true', help="vlan will be removed")
parser.add_argument("--id", type=int, required=True, help="vlan ID")
parser.add_argument("--name", type=str,  required=True, help="vlan Name")
args=parser.parse_args()
userName=raw_input("Username: ")
passWord=getpass.getpass()
print "Modify vlan %s with name %s on %s with user %s and pw ***" % (args.id,args.name,args.fi,userName)

This argument parser adds a number of command line options:

  • --fi the IP or hostname of the fabric interconnect
  • --add to add a VLAN
  • --del to remove a VLAN
  • --id the VLAN ID (the number)
  • --name the VLAN name

When one of the options is missing an error is raised and some help text is provided. argparse also prevents you from providing both add and del together.
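
The behaviour of the mutually exclusive group can be tried without a UCS at hand by feeding the parser a list of arguments. The sketch below is written for Python 3 and is a trimmed-down variant of the parser above; `build_parser` and `dest="delete"` are my own additions, the latter because `del` is a Python keyword and `args.del` cannot be written as a plain attribute access.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Add or remove a VLAN")
    parser.add_argument("--fi", required=True, help="IP/hostname of FI")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--add", action="store_true", help="vlan will be added")
    # 'del' is a Python keyword, so args.del is a syntax error; an explicit
    # dest gives the flag an attribute name that can be read normally.
    group.add_argument("--del", dest="delete", action="store_true",
                       help="vlan will be removed")
    parser.add_argument("--id", type=int, required=True, help="vlan ID")
    parser.add_argument("--name", required=True, help="vlan name")
    return parser


args = build_parser().parse_args(
    ["--fi", "192.168.56.107", "--add", "--id", "123", "--name", "test123"])
print(args.add, args.delete, args.id, args.name)
# True False 123 test123
```

Passing both `--add` and `--del` makes argparse print a usage error and exit, which is the protection described above.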

Lines 9 and 10 prompt for the username and password. getpass prevents the password from being echoed on screen.

vlanId=str(args.id)
vlanName=args.name
try:
#login to the UCS FI 
  handle = UcsHandle()
  handle.Login(args.fi,userName,passWord)
  #get the MO for every vnic
  try:
    vnics=handle.GetManagedObject(None,VnicLanConnTempl.ClassId())
    #get the MO for the LANCLOUD
    LanCloud= handle.GetManagedObject(None, FabricLanCloud.ClassId())
    vlanExist=handle.GetManagedObject(LanCloud,FabricVlan.ClassId(), {FabricVlan.NAME:vlanName})

Lines 1 and 2 store the entered values for the VLAN ID and VLAN name in more recognizable variable names. A try/except structure is started and a handle to the UCS is created. All actions on the UCS will be done via this handle. The first thing to do now is log in with the supplied credentials and the IP address or hostname of the FI.

Line 9 retrieves every vNIC template on the system. This is simply done by retrieving all objects of the class “vnicLanConnTempl”; this string is the output of VnicLanConnTempl.ClassId(). The hardest part of writing scripts for UCS is determining the required ClassId. The easiest way to do this, in my opinion, is to dump the XML from the UCSM GUI and find the required classes. Open the UCS GUI and select the object you want some info about. Right-click it and select Copy XML.

Copy XML

The XML for this object is placed on the clipboard.

<vnicLanConnTempl childAction="deleteNonPresent" descr="" dn="org-root/lan-conn-templ-ESX_001_Prod2" identPoolName="" intId="118443" mtu="1500" name="ESX_001_Prod2" nwCtrlPolicyName="" operIdentPoolName="" operNwCtrlPolicyName="org-root/nwctrl-default" operQosPolicyName="" operStatsPolicyName="org-root/thr-policy-default" pinToGroupName="" policyLevel="0" policyOwner="local" qosPolicyName="" statsPolicyName="default" switchId="A" target="adaptor" templType="updating-template"> 
<vnicEtherIf addr="derived" childAction="deleteNonPresent" configQualifier="" defaultNet="no" fltAggr="0" name="Vlan1246" operState="indeterminate" operVnetDn="" operVnetName="" owner="logical" rn="if-Vlan1246" switchId="A" type="ether" vnet="1"/>
<vnicEtherIf addr="derived" childAction="deleteNonPresent" configQualifier="" defaultNet="no" fltAggr="0" name="Vlan3002" operState="indeterminate" operVnetDn="" operVnetName="" owner="logical" rn="if-Vlan3002" switchId="A" type="ether" vnet="1"/> 
<vnicEtherIf addr="derived" childAction="deleteNonPresent" configQualifier="" defaultNet="no" fltAggr="0" name="Vlan1300" operState="indeterminate" operVnetDn="" operVnetName="" owner="logical" rn="if-Vlan1300" switchId="A" type="ether" vnet="1"/>
<vnicEtherIf addr="derived" childAction="deleteNonPresent" configQualifier="" defaultNet="no" fltAggr="0" name="Vlan3000" operState="indeterminate" operVnetDn="" operVnetName="" owner="logical" rn="if-Vlan3000" switchId="A" type="ether" vnet="1"/> 
<vnicEtherIf addr="derived" childAction="deleteNonPresent" configQualifier="" defaultNet="no" fltAggr="0" name="Vlan124" operState="indeterminate" operVnetDn="" operVnetName="" owner="logical" rn="if-Vlan124" switchId="A" type="ether" vnet="1"/>
 <vnicEtherIf addr="derived" childAction="deleteNonPresent" configQualifier="" defaultNet="no" fltAggr="0" name="Vlan123" operState="indeterminate" operVnetDn="" operVnetName="" owner="logical" rn="if-Vlan123" switchId="A" type="ether" vnet="1"/> 
</vnicLanConnTempl>

This is a lot of information, but the most important part is vnicLanConnTempl, the ClassId of this object. It is also obvious that the children of vnicLanConnTempl are the VLANs which are allowed on this template. So we already know that objects of ClassId vnicEtherIf need to be added if we want to modify the allowed VLANs.
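
Reading the class names out of the copied XML can also be scripted. The Python 3 sketch below uses only the standard library and a shortened, hand-made sample in the shape of the XML above. The helper name `summarize` is hypothetical, and joining the parent dn with the child rn is an assumption based on the Dn format used later in the script.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the shape of the "Copy XML" output.
SAMPLE = """
<vnicLanConnTempl dn="org-root/lan-conn-templ-ESX_001_Prod2" name="ESX_001_Prod2">
  <vnicEtherIf name="Vlan1246" rn="if-Vlan1246"/>
  <vnicEtherIf name="Vlan3002" rn="if-Vlan3002"/>
</vnicLanConnTempl>
"""


def summarize(xml_text):
    """Return (root class ID, root dn, [(child class ID, child dn), ...])."""
    root = ET.fromstring(xml_text.strip())
    parent_dn = root.get("dn")
    # Assumption: a child's dn is the parent dn followed by the child's rn.
    children = [(child.tag, "{0}/{1}".format(parent_dn, child.get("rn")))
                for child in root]
    return root.tag, parent_dn, children


class_id, dn, children = summarize(SAMPLE)
print(class_id, dn)
for child_class, child_dn in children:
    print(" ", child_class, child_dn)
```

The tag names are the ClassIds, so one pass over the XML lists every class needed to rebuild the object from a script.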

Line 11 retrieves the LanCloud. Under the LanCloud all objects related to L2 are stored. In line 12 the LanCloud is used as a starting point for a search for the VLAN with the given name. The result decides later in the script whether the VLAN still has to be added or can be deleted.

    if (args.add):
    #add vlan
      print "add vlan %s to lanCloud and vNics" % (vlanName)
      if vlanExist:
        print vlanName + ": Already defined"
      else:
        #add the vlan to the LANCLOUD
        try:
          handle.AddManagedObject(LanCloud,FabricVlan.ClassId(), {FabricVlan.NAME:vlanName,FabricVlan.ID:vlanId})
          #add the vlan to all nics
          try:
            for vnic in vnics:
              vlanDn="%s/if-%s" % (vnic.Dn,vlanName)
              handle.AddManagedObject(vnic,VnicEtherIf.ClassId(), {VnicEtherIf.DN:vlanDn,VnicEtherIf.NAME:vlanName,VnicEtherIf.DEFAULT_NET:"no"},True,YesOrNo.FALSE)
          except Exception, err:
              print "Exception:", str (err)
        except Exception, err:
          print "Exception:", str (err)

This part of the script handles adding a VLAN to the UCS. Lines 4 and 5 check if the VLAN already exists. When this is true the script logs a message and continues with a logout. If the VLAN is not found another try/except structure is created. On line 9 the second UCS API command in the script, AddManagedObject, is used. This command adds an object below another object. In this case we are adding a VLAN below the LanCloud. The parameters used to create the VLAN are the name and the ID.

When the addition of the VLAN is successful another try/except is started. This one is to add the VLAN to the vNICs obtained earlier. For some reason the Dn of the new VnicEtherIf needs to be supplied as one of the parameters. I have not been able to find a list of required parameters for the various ClassIds.

The format of the Dn was again obtained from the XML retrieved from the GUI. One important thing to notice is the True value in the AddManagedObject call. This prevents the API from raising an error if the VLAN is already part of the allowed VLANs on the vNIC.

The last lines close the various try statements.

    else:
      print "del vlan %s from lanCloud and vNics" % (vlanName)
      #remove vlan from vnics
      vnicEtherIfMOS=handle.GetManagedObject(vnics,VnicEtherIf.ClassId(),{VnicEtherIf.NAME:vlanName})
      if vnicEtherIfMOS:
        handle.RemoveManagedObject(vnicEtherIfMOS)
      #remove vlan from LanCloud
      if vlanExist:
        handle.RemoveManagedObject(vlanExist)
   
   #delete vlan
  #logout from the UCS FI
  except Exception, err:
    print "Exception:", str (err)
  handle.Logout()

except Exception, err:
  print "Exception:", str (err)

The final section of the script handles the removal of the VLAN from the vNICs and from the LanCloud. Line 4 searches for all VnicEtherIf objects with the name of the VLAN which needs to be removed. The base for this search is the list of vNICs obtained earlier. Lines 5 and 6 remove all these VnicEtherIfs in one operation, but only if at least one was found. Lines 8 and 9 do the same for the VLAN.

The last lines close the try/except structures and log out from the UCS.

Seeing the script in action

root@python-dev:~/ucs# python addVlan.py --fi 192.168.56.107 --add --id 123 --name test123
Username: admin
Password:
Modify vlan 123 with name test123 on 192.168.56.107 with user admin and pw ***
add vlan test123 to lanCloud and vNics
root@python-dev:~/ucs#

The best way is to keep the UCS GUI open while executing the script, so you can see the VLANs appear magically.

Finding the smallest subnet for two hosts

@netmanchris asked for a method to determine the smallest common subnet for two hosts. Below is my solution based on the python netaddr library

from netaddr import * 
lowIP=IPNetwork('1.1.1.1/32') 
highIP=IPNetwork('1.1.1.254/32') 
superNets=lowIP.supernet() 
superNets.reverse() 
for net in superNets: 
  if highIP in net: 
     print net.network
     break

On lines 2 and 3 each IP is used to create an IPNetwork. On line 4 a list is created containing all supernets of the first IPNetwork. As the list is ordered from large to small it needs to be reversed. By looping over each of the supernets and checking whether the second IP is part of it, the smallest common subnet is determined.
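
For comparison, the same idea can be sketched with the `ipaddress` module from the Python 3 standard library instead of netaddr. This is an alternative, not the solution above: it widens the host network one bit at a time until the second address fits. The function name `smallest_common_subnet` is my own.

```python
import ipaddress


def smallest_common_subnet(first, second):
    """Return the smallest network that contains both host addresses."""
    net = ipaddress.ip_network(first)       # a bare address becomes a host network
    target = ipaddress.ip_address(second)
    if net.version != target.version:
        raise ValueError("cannot mix IPv4 and IPv6 addresses")
    while target not in net:
        net = net.supernet()                # widen the prefix by one bit
    return net


print(smallest_common_subnet("1.1.1.1", "1.1.1.254"))
# 1.1.1.0/24
```

Unlike the netaddr version, this prints the network together with its prefix length.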

The Python netaddr library is very versatile and can help you with various tedious IP operations.

POAP and Ansible integration part 4

In the last part of the series I will look at the boot process of a POAP installation.
The first thing to do is run the playbook to populate the tftpboot folder and create all the files.

root@debian-8:/home/poap/poap# ansible-playbook site.yml

PLAY [Generate access switch files] *******************************************

GATHERING FACTS ***************************************************************
ok: [localhost]

TASK: [dhcpd | Generate dhcpd main config files] ******************************
ok: [localhost]

TASK: [dhcpd | include_vars globals/poap_clients.yml] *************************
ok: [localhost]

TASK: [dhcpd | create client dhcpd config files] ******************************
ok: [localhost]

TASK: [tftpd | Generate poap files] *******************************************
changed: [localhost] => (item={'name': 'switch1', 'ip': '192.168.4.201', 'mask': '255.255.255.0', 'mac': '0:0:0:50:60:50', 'bootfile': 'switch1.py', 'tftp_server': '192.168.3.254', 'serial': 'JAC0001', 'type': 'n3k', 'gateway': '192.168.4.254', 'software': '6.0.2.U3.5'})
changed: [localhost] => (item={'name': 'switch2', 'ip': '192.168.4.202', 'mask': '255.255.255.0', 'mac': '0:0:0:50:60:51', 'bootfile': 'switch2.py', 'tftp_server': '192.168.3.254', 'serial': 'JAC0002', 'type': 'n3k', 'gateway': '192.168.4.254', 'software': '6.0.2.U3.5'})

TASK: [tftpd | Generate tftpd files] ******************************************
changed: [localhost] => (item={'name': 'switch1', 'ip': '192.168.4.201', 'mask': '255.255.255.0', 'mac': '0:0:0:50:60:50', 'bootfile': 'switch1.py', 'tftp_server': '192.168.3.254', 'serial': 'JAC0001', 'type': 'n3k', 'gateway': '192.168.4.254', 'software': '6.0.2.U3.5'})
changed: [localhost] => (item={'name': 'switch2', 'ip': '192.168.4.202', 'mask': '255.255.255.0', 'mac': '0:0:0:50:60:51', 'bootfile': 'switch2.py', 'tftp_server': '192.168.3.254', 'serial': 'JAC0002', 'type': 'n3k', 'gateway': '192.168.4.254', 'software': '6.0.2.U3.5'})

TASK: [tftpd | Copy NXOS Files] ***********************************************
changed: [localhost] => (item={'name': 'n3000-uk9-system.6.0.2.U3.5.bin'})
changed: [localhost] => (item={'name': 'n3000-uk9-kickstart.6.0.2.U3.5.bin'})

NOTIFIED: [tftpd | generate md5] **********************************************
changed: [localhost] => (item={u'src': u'/root/.ansible/tmp/ansible-tmp-1438835621.38-10613105394176/source', u'md5sum': u'719eba727a731782d9bf91501f8f0754', u'group': u'root', u'uid': 0, u'dest': u'/srv/tftp/conf_JAC0001.cfg', u'changed': True, 'item': {'name': 'switch1', 'ip': '192.168.4.201', 'mask': '255.255.255.0', 'mac': '0:0:0:50:60:50', 'bootfile': 'switch1.py', 'tftp_server': '192.168.3.254', 'serial': 'JAC0001', 'type': 'n3k', 'gateway': '192.168.4.254', 'software': '6.0.2.U3.5'}, u'state': u'file', u'gid': 0, u'mode': u'0644', 'invocation': {'module_name': u'template', 'module_args': u'src=n3k.j2 dest=/srv/tftp/conf_JAC0001.cfg'}, u'owner': u'root', u'size': 329})
changed: [localhost] => (item={u'src': u'/root/.ansible/tmp/ansible-tmp-1438835621.43-76402414020639/source', u'md5sum': u'a2e2a37799030c77364fa41755ef270a', u'group': u'root', u'uid': 0, u'dest': u'/srv/tftp/conf_JAC0002.cfg', u'changed': True, 'item': {'name': 'switch2', 'ip': '192.168.4.202', 'mask': '255.255.255.0', 'mac': '0:0:0:50:60:51', 'bootfile': 'switch2.py', 'tftp_server': '192.168.3.254', 'serial': 'JAC0002', 'type': 'n3k', 'gateway': '192.168.4.254', 'software': '6.0.2.U3.5'}, u'state': u'file', u'gid': 0, u'mode': u'0644', 'invocation': {'module_name': u'template', 'module_args': u'src=n3k.j2 dest=/srv/tftp/conf_JAC0002.cfg'}, u'owner': u'root', u'size': 329})

NOTIFIED: [tftpd | generate md5 nxos] *****************************************
changed: [localhost] => (item={u'src': u'/root/.ansible/tmp/ansible-tmp-1438835621.5-27092330543238/source', u'md5sum': u'f4e9e6a8d205febcaae5a413a0edd82b', u'group': u'root', u'uid': 0, u'dest': u'/srv/tftp/n3000-uk9-system.6.0.2.U3.5.bin', u'changed': True, 'item': {'name': 'n3000-uk9-system.6.0.2.U3.5.bin'}, u'state': u'file', u'gid': 0, u'mode': u'0666', 'invocation': {'module_name': u'copy', 'module_args': u'src=n3000-uk9-system.6.0.2.U3.5.bin dest=/srv/tftp/n3000-uk9-system.6.0.2.U3.5.bin mode=0666'}, u'owner': u'root', u'size': 32})
changed: [localhost] => (item={u'src': u'/root/.ansible/tmp/ansible-tmp-1438835621.57-161210011896998/source', u'md5sum': u'7d9265513b44bf23a07912f21167af58', u'group': u'root', u'uid': 0, u'dest': u'/srv/tftp/n3000-uk9-kickstart.6.0.2.U3.5.bin', u'changed': True, 'item': {'name': 'n3000-uk9-kickstart.6.0.2.U3.5.bin'}, u'state': u'file', u'gid': 0, u'mode': u'0666', 'invocation': {'module_name': u'copy', 'module_args': u'src=n3000-uk9-kickstart.6.0.2.U3.5.bin dest=/srv/tftp/n3000-uk9-kickstart.6.0.2.U3.5.bin mode=0666'}, u'owner': u'root', u'size': 35})

NOTIFIED: [tftpd | add md5] ***************************************************
changed: [localhost] => (item={u'src': u'/root/.ansible/tmp/ansible-tmp-1438835621.2-187701580008017/source', u'md5sum': u'22256ae23becfa917ac20c03c0f3c17b', u'group': u'root', u'uid': 0, u'dest': u'/srv/tftp/switch1.py.md5', u'changed': True, 'item': {'name': 'switch1', 'ip': '192.168.4.201', 'mask': '255.255.255.0', 'mac': '0:0:0:50:60:50', 'bootfile': 'switch1.py', 'tftp_server': '192.168.3.254', 'serial': 'JAC0001', 'type': 'n3k', 'gateway': '192.168.4.254', 'software': '6.0.2.U3.5'}, u'state': u'file', u'gid': 0, u'mode': u'0644', 'invocation': {'module_name': u'template', 'module_args': u'src=poap_n3k.j2 dest=/srv/tftp/switch1.py.md5'}, u'owner': u'root', u'size': 38617})
changed: [localhost] => (item={u'src': u'/root/.ansible/tmp/ansible-tmp-1438835621.29-188072396475654/source', u'md5sum': u'22256ae23becfa917ac20c03c0f3c17b', u'group': u'root', u'uid': 0, u'dest': u'/srv/tftp/switch2.py.md5', u'changed': True, 'item': {'name': 'switch2', 'ip': '192.168.4.202', 'mask': '255.255.255.0', 'mac': '0:0:0:50:60:51', 'bootfile': 'switch2.py', 'tftp_server': '192.168.3.254', 'serial': 'JAC0002', 'type': 'n3k', 'gateway': '192.168.4.254', 'software': '6.0.2.U3.5'}, u'state': u'file', u'gid': 0, u'mode': u'0644', 'invocation': {'module_name': u'template', 'module_args': u'src=poap_n3k.j2 dest=/srv/tftp/switch2.py.md5'}, u'owner': u'root', u'size': 38617})

PLAY RECAP ********************************************************************
localhost                  : ok=10   changed=6    unreachable=0    failed=0

root@debian-8:/home/poap/poap#

No changes were required for the DHCP server, but because I had removed all files from the TFTP root, all files were created again (or copied, in the case of the NXOS files).

root@debian-8:/home/poap/poap# ls -l /srv/tftp/
total 192
-rw-r--r-- 1 root root   329 Aug  6 06:24 conf_JAC0001.cfg
-rw-r--r-- 1 root root    40 Aug  6 06:24 conf_JAC0001.cfg.md5
-rw-r--r-- 1 root root   329 Aug  6 06:24 conf_JAC0002.cfg
-rw-r--r-- 1 root root    40 Aug  6 06:24 conf_JAC0002.cfg.md5
-rw-rw-rw- 1 root root    35 Aug  6 06:24 n3000-uk9-kickstart.6.0.2.U3.5.bin
-rw-r--r-- 1 root root    40 Aug  6 06:24 n3000-uk9-kickstart.6.0.2.U3.5.bin.md5
-rw-rw-rw- 1 root root    32 Aug  6 06:24 n3000-uk9-system.6.0.2.U3.5.bin
-rw-r--r-- 1 root root    40 Aug  6 06:24 n3000-uk9-system.6.0.2.U3.5.bin.md5
-rw-r--r-- 1 root root 38653 Aug  6 06:24 switch1.py
-rw-r--r-- 1 root root 38612 Aug  6 06:24 switch1.py.md5
-rw-r--r-- 1 root root 38651 Aug  6 06:24 switch2.py
-rw-r--r-- 1 root root 38610 Aug  6 06:24 switch2.py.md5
root@debian-8:/home/poap/poap#

On my dev system at home I didn’t have the NXOS files available, so I just created bogus files for demonstration purposes. The boot process below did use the correct software images.
Now that all files are in place and the DHCP server is ready, it is time to start the POAP process.
To get a switch that has already been configured back into POAP mode, a special boot option needs to be configured. Then save the config and reboot the switch.

switch(config)# boot poap enable 

switch# copy running-config startup-config 

[########################################] 100%
Copy complete, now saving to disk (please wait)...


switch# reload

WARNING: This command will reboot the system

Do you want to continue? (y/n) [n] y

The system boots with software 6.0.2.U2.2 (see the "Booting kickstart image" line) and POAP is enabled (see the "Bootstrapping via POAP overriding existing startup-config" line).

11:12:40 switch %PFMA-2-PFM_SYSTEM_RESET: Manual system restart from Command Line Interface

[ 1182.288358]  writing reset reason 9, 


(c) Copyright 2011, Cisco Systems.

N3000 BIOS v.1.2.0, Thu 08/25/2011, 03:37 PM 

989D9CB4B4B4999299A0A2A3A0A2A3B2                                                                           B2Version 2.00.1201. Copyright (C) 2009 American Megatrends, Inc.                 Press <DEL> or <F2> to enter setup.                                             
Loader Version pr-1.07
GRUB Loading stage2
Booting kickstart image: bootflash:/n3000-uk9-kickstart.6.0.2.U2.2.bin....
...............................................................................
............................Image verification OK

[    0.000000] Fastboot Memory at 0c100000 of size 201326592
ÿUsage: init 0123POST INIT Starts at Mon Aug 10 11:13:19 UTC 2015
Starting Nexus 3000 Platform POST.....
  Executing Mod 1 1 SEEPROM Test:...done (0 seconds)
  Executing Mod 1 1 GigE Port Test:.done (8 seconds)
  Executing Mod 1 1 PCIE Test:.................done (0 seconds)
  Mod 1 1 Post Completed Successfully
POST is completed
.r.r.r. done.
Bootstrapping via POAP overriding existing startup-config
Loading System Software Mon Aug 10 11:13:43 UTC 2015

System Software(/bootflash/n3000-uk9.6.0.2.U2.2.bin) Loaded Mon Aug 10 11:14:00 UTC 2015
ethernet switching mode

INIT: Entering runlevel: 3

Mounting other filesystems:  [  
Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
Added VLAN with VID == 4042 to IF -:muxif:-
11:14:07 switch %USER-0-SYSTEM_MSG: FAST REBOOT DISABLED - bcm_usd
11:14:07 switch %USER-2-SYSTEM_MSG: CLIS: loading cmd files begin  - clis
11:14:19 switch %USER-2-SYSTEM_MSG: CLIS: loading cmd files end  - clis
11:14:19 switch %USER-2-SYSTEM_MSG: CLIS: init begin  - clis
11:14:49 switch %USER-0-SYSTEM_MSG: Starting bcm_attach - bcm_usd
11:14:53 switch %USER-0-SYSTEM_MSG: Finished bcm_attach... - bcm_usd
11:15:05 switch %VDC_MGR-2-VDC_ONLINE: vdc 1 has come online 
Starting Power On Auto Provisioning...Done

Obviously we do not want to abort POAP, so we wait until the switch sends a DHCP request on its management port. A valid DHCP offer is received at 11:45:17, and about 25 seconds later (11:45:43) the switch decides to use this offer and continues the process.

The bootfile is downloaded and execution starts. I am not sure why the POAP_SCRIPT_STARTED_MD5_NOT_VALIDATED message states that the MD5SUM is not validated, because an incorrect MD5 in the file results in a failed boot process. All other messages are self-explanatory.

Abort Power On Auto Provisioning and continue with normal setup ?(yes/no)[n]: 

11:15:09 switch %POAP-2-POAP_INITED: S/N[JAC0001]-MAC[...] - POAP process initialized
11:30:18 switch %POAP-2-POAP_INFO:   - Abort Power On Auto Provisioning and continue with normal setup ?(yes/no)[n]: 
11:30:52 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - USB disk not detected
11:30:52 switch %POAP-2-POAP_DHCP_DISCOVER_START: S/N[JAC0001]-MAC[...] - POAP DHCP Discover phase started
11:30:52 switch %POAP-2-POAP_INFO:   - Abort Power On Auto Provisioning and continue with normal setup ?(yes/no)[n]: 
11:45:16 switch %POAP-2-POAP_INFO:   - Abort Power On Auto Provisioning and continue with normal setup ?(yes/no)[n]: 
11:45:17 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - Valid DHCP OFFER received from 192.168.3.254
11:45:43 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - Using DHCP, information received over mgmt0 from 192.168.3.254
11:45:43 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - Assigned Host Name: switch1
11:45:43 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - Assigned IP address: 192.168.4.1
11:45:43 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - Netmask: 255.255.255.0
11:45:43 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - DNS Server: 192.168.3.254
11:45:43 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - Default Gateway: 192.168.4.254
11:45:43 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - Script Server: 192.168.3.254
11:45:43 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - Script Name: switch1.py
11:45:54 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - The POAP Script download has started
11:45:54 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - The POAP Script is being downloaded from [copy tftp://192.168.3.254/switch1.py bootflash:script.sh vrf management ]
11:45:55 switch %POAP-2-POAP_SCRIPT_DOWNLOADED: S/N[JAC0001]-MAC[...] - Successfully downloaded POAP script file
11:45:55 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - Default script timeout value:900 in script file
11:45:55 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - Script file size 38968, MD5 checksum f512efd4bf9e962d22aed20305cb60cc
11:45:55 switch %POAP-2-POAP_SCRIPT_STARTED_MD5_NOT_VALIDATED: S/N[JAC0001]-MAC[...] - POAP script execution started(MD5 not validated)
11:45:55 switch %POAP-2-POAP_INFO: S/N[JAC0001]-MAC[...] - script timeout value:900 sec
11:45:56 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: Selected config filename (serial-nb) : conf_JAC0001.cfg - script.sh
11:45:56 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : dir bootflash: - script.sh
11:45:56 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: free space is 1026448 kB - script.sh
11:45:56 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO:#Starting Copy of Config File - script.sh
11:46:00 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: Completed Copy of Config File - script.sh
11:46:00 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO:#Check md5 of Configuration File - script.sh
11:46:04 switch %AUTHPRIV-1-SYSTEM_MSG:     root : can't get hostname - sudo
11:46:04 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : show file volatile:conf_JAC0001.cfg.md5.poap_md5 - script.sh
11:46:04 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: md5sum c006ba91100eff0407c4b7bba7b7ec88 (.md5 file) - script.sh
11:46:04 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : show file poap_replay.cfg md5sum - script.sh
11:46:04 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: md5sum c006ba91100eff0407c4b7bba7b7ec88 (recalculated) - script.sh
11:46:04 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: Split config invoked.... - script.sh
11:46:04 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : delete poap_replay.cfg - script.sh
11:46:04 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : delete poap_1.cfg - script.sh
11:46:05 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO:#Starting Copy of Kickstart Image - script.sh
11:46:12 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: Completed Copy of Kickstart Image - script.sh
11:46:12 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO:#Check md5 of kickstart image - script.sh
11:46:17 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : show file volatile:n3000-uk9-kickstart.6.0.2.U3.5.bin.md5.poap_md5 - script.sh
11:46:17 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: md5sum fea16185a6104abba5179a73e438ef29 (.md5 file) - script.sh
11:46:17 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : show file bootflash:/kickstart.img.new md5sum - script.sh
11:46:19 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: md5sum fea16185a6104abba5179a73e438ef29 (recalculated) - script.sh
11:46:19 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : move bootflash:/kickstart.img.new bootflash:/n3000-uk9-kickstart.6.0.2.U3.5.bin - script.sh
11:46:19 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO:#Starting Copy of System Image - script.sh
11:46:44 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: Completed Copy of System Image - script.sh
11:46:45 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO:#Check md5 of system image - script.sh
11:46:51 switch %AUTHPRIV-1-SYSTEM_MSG:     root : can't get hostname - sudo
11:46:51 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : show file volatile:n3000-uk9.6.0.2.U3.5.bin.md5.poap_md5 - script.sh
11:46:51 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: md5sum 68e7c10e308864a5c8656c2c459b6897 (.md5 file) - script.sh
11:46:51 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : show file bootflash:/system.img.new md5sum - script.sh
11:46:58 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: md5sum 68e7c10e308864a5c8656c2c459b6897 (recalculated) - script.sh
11:46:58 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : move bootflash:/system.img.new bootflash:/n3000-uk9.6.0.2.U3.5.bin - script.sh
11:46:59 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: Entered get_cable_mgmt_file - script.sh
11:46:59 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] -  found cable file: 0  - script.sh
11:46:59 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: No cable file specified - script.sh
11:46:59 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : show version image bootflash:/n3000-uk9-kickstart.6.0.2.U3.5.bin - script.sh
11:47:09 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : show version image bootflash:/n3000-uk9.6.0.2.U3.5.bin - script.sh
11:48:09 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: Setting the boot variables - script.sh
11:48:09 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : config terminal ; boot kickstart bootflash:/n3000-uk9-kickstart.6.0.2.U3.5.bin - script.sh
11:48:11 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : config terminal ; boot system bootflash:/n3000-uk9.6.0.2.U3.5.bin - script.sh
11:48:14 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : copy running-config startup-config - script.sh
11:48:19 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: successful - script.sh
11:48:19 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : copy bootflash:poap_2.cfg scheduled-config - script.sh
11:48:20 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - ######### Copying the second scheduled cfg done ########## - script.sh
11:48:20 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - INFO: Configuration successful - script.sh
11:48:20 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - FINISH: Clean up files. - script.sh
11:48:20 switch %USER-1-SYSTEM_MSG: S/N[JAC0001]-MAC[...] - CLI : delete poap_2.cfg - script.sh
11:48:21 switch %POAP-2-POAP_SCRIPT_EXEC_SUCCESS: S/N[JAC0001]-MAC[...] - POAP script execution success
11:48:23 switch %PFMA-2-PFM_SYSTEM_RESET: Manual system restart from Command Line Interface

[ 2111.564321]  writing reset reason 9, 

After the successful POAP process the switch reboots with the specified software version, and we are able to log in with the username specified in the configuration file.


(c) Copyright 2011, Cisco Systems.

N3000 BIOS v.1.2.0, Thu 08/25/2011, 03:37 PM 

989D9CB4B4B4999299A0A2A3A0A2A3B2                                                                B2Version 2.00.1201. Copyright (C) 2009 American Megatrends, Inc.                 Press <DEL> or <F2> to enter setup.                                             
Loader Version pr-1.07
GRUB Loading stage2
Booting kickstart image: bootflash:/n3000-uk9-kickstart.6.0.2.U3.5.bin....
...............................................................................
...........................Image verification OK

[    0.000000] Fastboot Memory at 0c100000 of size 201326592
ÿUsage: init 0123POST INIT Starts at Mon Aug 10 11:49:02 UTC 2015
Starting Nexus 3000 Platform POST.....
  Executing Mod 1 1 SEEPROM Test:...done (0 seconds)
  Executing Mod 1 1 GigE Port Test:.done (8 seconds)
  Executing Mod 1 1 PCIE Test:.................done (0 seconds)
  Mod 1 1 Post Completed Successfully
POST is completed
.r.r.r. done.
Bootstrapping via POAP overriding existing startup-config
Loading System Software Mon Aug 10 11:49:22 UTC 2015

System Software(/bootflash/n3000-uk9.6.0.2.U3.5.bin) Loaded Mon Aug 10 11:49:43 UTC 2015
ethernet switching mode
cp: cannot stat `/isan/etc/capability.cap': No such file or directory

INIT: Entering runlevel: 3

Mounting other filesystems:  [  OK  ]


Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
Added VLAN with VID == 4042 to IF -:muxif:-
11:49:53  %USER-0-SYSTEM_MSG: FAST REBOOT DISABLED - bcm_usd
11:49:53  %USER-2-SYSTEM_MSG: CLIS: loading cmd files begin  - clis
11:50:04  %USER-2-SYSTEM_MSG: CLIS: loading cmd files end  - clis
11:50:04  %USER-2-SYSTEM_MSG: CLIS: init begin  - clis
11:50:27  %USER-0-SYSTEM_MSG: Starting bcm_attach - bcm_usd
11:50:31  %USER-0-SYSTEM_MSG: Finished bcm_attach... - bcm_usd
11:50:44  %VDC_MGR-2-VDC_ONLINE: vdc 1 has come online 
POAP - Waiting for Box Online...
POAP - Box is Online...
POAP - Applying scheduled configuration...
Copy complete, now saving to disk (please wait)...


[########################################] 100%
Copy complete, now saving to disk (please wait)...
Done

11:51:18 switch1 %SYSLOG-2-SYSTEM_MSG: POAP Completed - LOGIN

Nexus 3000 Switch
switch1 login: nettinkerer
Password: 
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2014, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained in this software are
owned by other third parties and used and distributed under
license. Certain components of this software are licensed under
the GNU General Public License (GPL) version 2.0 or the GNU
Lesser General Public License (LGPL) Version 2.1. A copy of each
such license is available at
http://www.opensource.org/licenses/gpl-2.0.php and
http://www.opensource.org/licenses/lgpl-2.1.php

switch1#

 

As you can see, POAP is a very powerful way to quickly upgrade and configure a large number of new switches. It would also be possible to modify the playbook to reuse the configuration of a failed switch. Imagine sending a replacement switch to the datacenter, where the field engineer replaces the failed switch. You only need to change one line in a YAML file and run the playbook; the POAP files are then prepared and the DHCP server is reconfigured and restarted.

POAP and Ansible integration part 3

The third part of the series will be about all the files required for the boot process. The boot process follows the diagram below.

poap_process

All configuration files required for the boot process are generated by the TFTPD role in the playbook. The tasks associated with this role are defined in a YAML file.

---
- name: Generate poap files
  template: src=poap_n3k.j2 dest=/srv/tftp/{{item.name}}.py.md5
  register: poap_created
  with_items: clients
  notify: add md5
- name: Generate tftpd files
  template: src=n3k.j2 dest=/srv/tftp/conf_{{item.serial}}.cfg
  register: config_created
  with_items: clients
  notify: generate md5
- name: Copy NXOS Files
  copy: src={{item.name}} dest=/srv/tftp/{{item.name}} mode=0666
  register: ios_copied
  with_items: nxosfiles
  notify: generate md5 nxos

The bootfile is a Python script based on an example which can be downloaded from CCO when you have the correct entitlement. I have modified the Python script a bit and fixed one bug which prevented the script from recognizing the switch as a Nexus 3048. All the details for the POAP process are specified in the script:

  • software version
  • configuration file
  • download credentials
  • transfer method
  • download server

As I wanted to be flexible in the software version, I used the templating system of Ansible to generate custom py files for booting.

The handler that is called when the py file changes creates the actual py file which is offered via DHCP. In the actual file an extra line is added containing the md5sum of the file without this extra line. When the script is executed by the switch, the line with the md5sum is removed, the md5sum is recalculated and the script is verified against it. The handlers are specified in a separate YAML file.

---
- name: generate md5
  shell: md5sum {{ item.dest }} | awk {'print "md5sum=" $1'} > {{item.dest}}.md5
  when: item.changed
  with_items: config_created.results

- name: generate md5 nxos
  shell: md5sum {{ item.dest }} | awk {'print "md5sum=" $1'} > {{item.dest}}.md5
  when: item.changed
  with_items: ios_copied.results

- name: add md5
  command: /bin/addmd5 {{ item.dest }}
  when: item.changed
  with_items: poap_created.results

The handler for the py file is add md5. This handler executes a bash script which calculates the md5, adds it to the file and stores the result as a new py file without the .md5 suffix. This is the file downloaded by the switch during the POAP process, and it needs to be supplied by the DHCP server as the boot file.

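#!/bin/bash
# Script to add the md5sum as the second line of a Python file used for POAP.
# Usage: addmd5 <name>.py.md5   (the result is written to <name>.py)
# Digest of the file as generated by the template, formatted as a Python comment.
md5=$(md5sum "$1" | awk '{print "#md5sum=" $1}')
# The output file has the same name without the trailing .md5 suffix.
poap_file=$(echo "$1" | sed 's/\.md5$//')
# Insert the md5 line before line 2 (GNU sed one-line "i" form).
sed "2i$md5" "$1" > "$poap_file"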
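
To make the embed-and-verify idea concrete, here is a minimal Python sketch of both halves. It is not the code Cisco runs on the switch. The function names add_md5_line and verify_md5_line are hypothetical, and the "#md5sum=<hex>" line format is simply the one produced by the bash helper above. The sketch assumes the script has at least two lines.

```python
import hashlib

MD5_PREFIX = "#md5sum="


def add_md5_line(template_text: str) -> str:
    """Return the script with '#md5sum=<hex>' inserted as its second line.

    The digest covers the text without the inserted line, which is what
    md5sum sees when the bash helper runs on the .py.md5 file.
    """
    digest = hashlib.md5(template_text.encode("utf-8")).hexdigest()
    lines = template_text.splitlines(keepends=True)
    # Same position as sed '2i...': directly after the first line.
    lines.insert(1, MD5_PREFIX + digest + "\n")
    return "".join(lines)


def verify_md5_line(script_text: str) -> bool:
    """Strip the first '#md5sum=' line, recompute the digest and compare."""
    embedded = None
    remaining = []
    for line in script_text.splitlines(keepends=True):
        if embedded is None and line.startswith(MD5_PREFIX):
            embedded = line[len(MD5_PREFIX):].strip()
        else:
            remaining.append(line)
    if embedded is None:
        return False  # no md5 line present, nothing to verify against
    actual = hashlib.md5("".join(remaining).encode("utf-8")).hexdigest()
    return actual == embedded


if __name__ == "__main__":
    source = "#!/bin/env python\nprint('poap')\n"
    bootfile = add_md5_line(source)
    print(bootfile)
    print(verify_md5_line(bootfile))
```

The inserted line is always 41 bytes long (8 for the prefix, 32 hex digits and a newline), which matches the size difference between switch1.py and switch1.py.md5 in the directory listing of part 4.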

When the md5 of the bootfile matches the md5 included in the file, the complete script is executed. The transfer method is specified in the script. If a method other than TFTP is used to transfer the configuration and software, credentials need to be specified. Please be aware that these credentials will be sent unencrypted to the switch when the py file is transferred. The files to be transferred are also specified in the script. In this example the name of the configuration file to be downloaded is derived from the serial number. This can be seen in line 8 of the roles/tftpd/tasks/main.yml file.

The next task is to create the actual configuration files. This is pretty straightforward; more about this can be found in a previous blog on this site. The only special thing is the handler generate md5, which calculates the md5sum of the configuration file and places this value in a text file. This text file has the same name as the configuration file with an .md5 suffix. The format of the string is md5sum=12345abcdef. The Python script executed by the POAP process will download these files automatically and verify the MD5SUM.
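
The sidecar file and the check against it can be sketched in a few lines of Python. This is only an illustration of the md5sum=<hex> format described above, not the code from the Cisco script. The function names and the file name conf_TEST.cfg are made up for the example.

```python
import hashlib
import os
import tempfile


def write_md5_sidecar(path: str) -> str:
    """Write '<path>.md5' containing 'md5sum=<hex>' and return its path."""
    with open(path, "rb") as handle:
        digest = hashlib.md5(handle.read()).hexdigest()
    sidecar = path + ".md5"
    with open(sidecar, "w") as handle:
        handle.write("md5sum=" + digest + "\n")
    return sidecar


def sidecar_matches(path: str) -> bool:
    """Recompute the digest of 'path' and compare it with its sidecar file."""
    with open(path + ".md5") as handle:
        key, _, expected = handle.read().strip().partition("=")
    with open(path, "rb") as handle:
        actual = hashlib.md5(handle.read()).hexdigest()
    return key == "md5sum" and actual == expected


if __name__ == "__main__":
    # Hypothetical config file in a scratch directory.
    config = os.path.join(tempfile.mkdtemp(), "conf_TEST.cfg")
    with open(config, "w") as handle:
        handle.write("hostname switch1\n")
    write_md5_sidecar(config)
    print(sidecar_matches(config))
```

A sidecar written this way is 40 bytes long (7 for "md5sum=", 32 hex digits and a newline), the same size as the .md5 files in the TFTP directory listing.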

The last task in the playbook copies all the NXOS images to the TFTP server. Again a handler is called to create the md5 files, just as with the configuration files.

It is important to realize that Ansible is idempotent. It will always strive to keep everything in a consistent state, regardless of how many times a playbook is run. This also means that files generated by Ansible must not be changed by hand. The next time the playbook is run, the changes made by hand will be lost.
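
The behaviour can be illustrated with a small Python sketch. It is a conceptual model only, not how Ansible implements its template module, and the function name ensure_file is hypothetical. The file is written only when its content differs from the desired content, and the return value plays the role of the "changed" flag that decides whether a handler is notified.

```python
import os
import tempfile


def ensure_file(path: str, content: str) -> bool:
    """Write 'content' to 'path' only when it differs; return True if changed."""
    try:
        with open(path) as handle:
            if handle.read() == content:
                return False  # already in the desired state, nothing to do
    except FileNotFoundError:
        pass  # the file is missing, so it has to be created
    with open(path, "w") as handle:
        handle.write(content)
    return True


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "dhcpd.conf")
    desired = "default-lease-time 7200;\n"
    print(ensure_file(target, desired))  # first run creates the file
    print(ensure_file(target, desired))  # second run finds nothing to do
```

A manual edit of the file makes the next run report a change again and overwrite the edit, which is exactly why generated files should not be touched by hand.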

In the last blog in the series I will show how everything works together and the switch will do a POAP.


POAP and Ansible integration part 2

In this part of the series I will discuss the isc-dhcpd server configuration. isc-dhcpd is a DHCP server which is available on most Linux distributions. It has many options, but for this setup only a minimal configuration is required.

The directory layout of the Ansible playbook for the DHCPD role:

/home/poap/poap/
|-- globals
|   `-- poap_clients.yml
|-- roles
|   |-- dhcpd
|   |   |-- handlers
|   |   |   `-- main.yml
|   |   |-- tasks
|   |   |   `-- main.yml
|   |   |-- templates
|   |   |   |-- dhcpd.conf.j2
|   |   |   `-- static_clients.j2
|   |   `-- vars
|   |       `-- main.yml
|   `-- includes
`-- site.yml

The tasks for the DHCPD role are defined in roles/dhcpd/tasks/main.yml.

---
- name: Generate dhcpd main config files
  template: src=dhcpd.conf.j2 dest=/etc/dhcp/dhcpd.conf
  notify: restart dhcpd
- include_vars: globals/poap_clients.yml
- name: create client dhcpd config files
  template: src=static_clients.j2 dest=/etc/dhcp/static_clients
  notify: restart dhcpd

In roles/dhcpd/vars/main.yml the basic settings for the DHCP server are configured.

---
domain_name: home.local
domain_name_servers: 192.168.3.254
default_lease_time: 7200
max_lease_time: 14400
scopes:
 - subnet: 192.168.3.0
   netmask: 255.255.255.0
 - subnet: 192.168.4.0
   netmask: 255.255.255.0
   ranges:
   - range_start: 192.168.4.1
     range_end: 192.168.4.200
     routers: 192.168.4.254

In my lab I used two scopes and one range to allocate addresses from. These settings are used in the dhcpd.conf.j2 template to create the main dhcpd.conf.

#jinja2: lstrip_blocks: True
ddns-update-style none;

# option definitions common to all supported networks...

option domain-name "{{domain_name}}";
option domain-name-servers {{domain_name_servers}};

default-lease-time {{default_lease_time}};
max-lease-time {{max_lease_time}};

{% for scope in scopes %}
subnet {{scope.subnet}} netmask {{scope.netmask}} {
  {% if scope.ranges is defined %}
  {% for range in scope.ranges %}
  range {{range.range_start}} {{range.range_end}};
  {% if range.routers is defined %}
  option routers {{range.routers}};
  {% endif %}
  {% endfor %}
  {% endif %}
}
{% endfor %}
include "/etc/dhcp/static_clients";

At the end of the configuration an additional configuration file called static_clients is included, in which the reservations for the static (POAP) clients are defined. I have placed these in a separate file for a reason. In a normal environment there would be at least two DHCP servers. Either each server would be responsible for a part of the subnet to allocate addresses from, or there would be a master/slave relation between the two servers, which requires different configurations on both. The reservations, however, must be the same on both servers.

This template is used by the task Generate dhcpd main config files. The handler instructs Ansible to restart the DHCPD service, but only when the configuration has changed.
The next task includes an additional YAML file, globals/poap_clients.yml, with data about the various POAP clients. The file is placed in a different directory than the normal vars directory belonging to the role because it will also be used by the TFTPD role.

---
clients:
  - name: switch1
    mac: 0:0:0:50:60:50
    tftp_server: 192.168.3.254
    ip: 192.168.4.201
    mask: 255.255.255.0
    gateway: 192.168.4.254
    serial: JAC0001
    type: n3k
    software: 6.1.4
  - name: switch2
    mac: 0:0:0:50:60:51
    tftp_server: 192.168.3.254
    ip: 192.168.4.202
    mask: 255.255.255.0
    gateway: 192.168.4.254
    serial: JAC0002
    type: n3k
    software: 4.2

This file specifies two Nexus devices. The data is used in the task create client dhcpd config files and fed to the template for the POAP clients.

#static clients
{% if clients is defined %}
{% for client in clients %}
host {{client.name}} {
  option host-name "{{client.name}}";
  option dhcp-client-identifier "\000{{client.serial}}";
  option bootfile-name "{{client.name}}.py";
  option tftp-server-name "{{client.tftp_server}}";
}
{% endfor %}
{% endif %}

This configuration will provide for each poap client:

  • Hostname
  • Bootfile
  • Bootserver

Settings like the IP address/mask/gateway/DNS are provided via the global scope. The IP details specified in the YAML file will be used for the generation of the actual switch configuration files.

Normally reservations are made based on the MAC address. In this setup I have chosen to make the reservation based on the serial number of the switch. This is possible because the serial is used as the client-identifier in the DHCP request. The serial of a new switch is often more easily obtained than the MAC address, and I hate entering MAC addresses as each vendor/tool requires a different format.

It took a Wireshark capture to get it working, because Cisco prepends the client-identifier with an ASCII NULL. That is why the \000 in front of {{client.serial}} is required in the dhcp-client-identifier option.
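
To show what this looks like on the wire, here is a small Python sketch that builds the client-identifier value and renders it the way it is written in dhcpd.conf. The function names are hypothetical. The option code 61 and the code/length/value layout come from RFC 2132, which also defines a leading type byte of 0 for identifiers that are not hardware addresses; that is consistent with the NULL seen in the capture, but the sketch only assumes what the capture showed.

```python
def client_identifier(serial: str) -> bytes:
    """Value of DHCP option 61 as described above: a NUL byte, then the serial."""
    return b"\x00" + serial.encode("ascii")


def dhcpd_string(value: bytes) -> str:
    """Render bytes as a quoted dhcpd.conf string.

    Printable ASCII is kept as-is; everything else (and the quote and
    backslash characters) is written as a three-digit octal escape.
    """
    parts = []
    for byte in value:
        char = chr(byte)
        if 32 <= byte < 127 and char not in '"\\':
            parts.append(char)
        else:
            parts.append("\\%03o" % byte)
    return '"' + "".join(parts) + '"'


def option_61(serial: str) -> bytes:
    """Encode the option as it travels in the packet: code, length, value."""
    value = client_identifier(serial)
    return bytes([61, len(value)]) + value


if __name__ == "__main__":
    print(dhcpd_string(client_identifier("JAC0001")))
    print(option_61("JAC0001").hex())
```

For the serial JAC0001 the rendered string is "\000JAC0001", the same value the template writes into the host declaration.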

Again, when DHCP settings have changed, for example because a POAP client was added, the DHCPD service will be restarted by Ansible.

After running the playbook the configuration for the DHCP server is generated.

ddns-update-style none;

# option definitions common to all supported networks...
option domain-name "home.local";
option domain-name-servers 192.168.3.254;

default-lease-time 7200;
max-lease-time 14400;

subnet 192.168.3.0 netmask 255.255.255.0 {
}
subnet 192.168.4.0 netmask 255.255.255.0 {
  range 192.168.4.1 192.168.4.200;
  option routers 192.168.4.254;
}
include "/etc/dhcp/static_clients";
#static clients
host switch1 {
  option host-name "switch1";
  option dhcp-client-identifier "\000JAC0001";
  option bootfile-name "switch1.py";
  option tftp-server-name "192.168.3.254";
}
host switch2 {
  option host-name "switch2";
  option dhcp-client-identifier "\000JAC0002";
  option bootfile-name "switch2.py";
  option tftp-server-name "192.168.3.254";
}

Overall the DHCP server configuration is pretty simple. In my lab the DHCP server is running on the same host as the Ansible scripts. In a real-world deployment these will most likely be different, remote servers. How to configure Ansible to connect to remote DHCP servers is beyond the scope of this series, but can easily be found on the internet.

This was part 2 of the series. In part 3 I will discuss all the various files which need to be generated to make POAP work.


POAP and Ansible integration part 1

Everyone who has ever installed a Nexus switch is familiar with the following message.

%POAP-2-POAP_INFO: Abort Power On Auto Provisioning and continue with normal setup ?(yes/no)[n]:

I always pressed y and was done with it. Since I have been using Ansible to create config files and to deploy Linux clients, I have been wondering if I could do it all with Ansible. In a number of blogs I will describe how to set everything up so that you never have to touch your console cable again. Please follow me on Twitter for the other blogs on this subject.

The flowchart for the setup is below

Poap_Ansible1

Everything is specified in a number of YAML files. These files contain the details of the POAP clients, such as serial number, desired software version, hardware platform and IP details. The basic DHCP server configuration parameters are also specified in a YAML file.

The YAML files are used to create the following files via the templating system.

  • isc-dhcpd configuration files
  • bootfiles for the Nexus devices
  • configuration files for the Nexus devices

The creation of all these files has been split into two roles: DHCPD and TFTPD.

In the next blog post I will describe how the DHCPD role takes care of the isc-dhcpd service.

 

Using Ansible to create config files

Recently I had to roll out a number of access switches. In the past I created the config files with either Excel/Word via a mail merge or with custom Perl scripts. Neither method was ideal. Mail merge is inflexible, and although I know my way around Perl, my colleagues often do not. After reading the excellent Ansible blog by Kirk Byers I gave it a try.

Ansible is primarily a tool for server management, like Chef and Puppet. To make Ansible do something it has a concept named playbooks. A playbook defines which roles a specific host has. Each role has its specific tasks which need to be executed on that host. For example, a host has a role as DNS server. Tasks associated with this role could be to make sure the latest version of Bind is installed and all the zone files are up to date, but also to create the zone files by means of a template system. This template system will be used to create the configuration files in this example.

Almost all files used by Ansible are written in the YAML format.
Below is the playbook used in this example.

---
- name: Generate access switch files
  hosts: localhost

  roles:
  - switch

Normally the tasks indicated by the roles would be executed on a remote host (remember the DNS server from above). For this example the files are generated on the same host the Ansible script is run on, but this could also be a remote TFTP server, for example.
The tasks belonging to the switch role of localhost are defined in a separate YAML file.

---
- name: Generate config files
  template: src=switch.j2 dest=/home/ansible/nettinkerer/config/{{item.hostname}}.txt
  with_items: access_switches

The task executed on the local host creates files based on the Jinja2 template. The variables being used are also defined in a YAML file. The template is rendered once for each item in the list access_switches.

---
access_switches:
  - hostname: switch_1
    vpc_domain: 20
    core_node: core_1
    core_uplink: 1/0/10
    core_portchannel: 11
    vpc_roleprio: 4096
    vpc_peer_dest: 10.1.1.2
    vpc_peer_source: 10.1.1.1
    vpc_peernode: switch_2
    ip_vlan: 192.168.1.1
    snmp_location: Rack1
  - hostname: switch_2
    vpc_domain: 20
    core_node: core_1
    core_uplink: 2/0/10
    core_portchannel: 11
    vpc_roleprio: 8192
    vpc_peer_dest: 10.1.1.1
    vpc_peer_source: 10.1.1.2
    vpc_peernode: switch_1
    ip_vlan: 192.168.1.2
    snmp_location: Rack1

The contents of the Jinja2 file.

!template for access switch
feature lacp
feature udld
feature interface-vlan
feature vpc
no password strength-check
username nettinkerer password nettinkerer
username nettinkerer role network-admin
!
{% include "files/vlan" %}
!
{% include "/roles/includes/aaa" %}
!
{% include "files/stp" %}
!
hostname {{item.hostname}}
!
vpc domain {{item.vpc_domain}}
 peer-switch
 role priority {{item.vpc_roleprio}}
 system-priority 200
 peer-keepalive destination {{item.vpc_peer_dest}} source {{item.vpc_peer_source}}
 delay restore 300
 auto-recovery reload 900
!
interface ethernet1/49
 description {{item.vpc_peernode}}_E1/49
 channel-group 1 mode active
!
interface ethernet1/50
 description {{item.vpc_peernode}}_E1/50
 channel-group 1 mode active
!
interface Port-channel1
 description {{item.vpc_peernode}}_Po1
 switchport mode trunk
 spanning-tree port type network
 vpc peer-link
!
interface ethernet1/51
  description {{item.core_node}}_E{{item.core_uplink}}
  channel-group 2 mode active
!
interface Port-channel2
  description {{item.core_node}}_Po{{item.core_portchannel}}
  switchport mode trunk
  vpc 2
!
interface Vlan2
 description Management Interface
 ip address {{item.ip_vlan}} 255.255.255.0
 no shut
!
interface mgmt0
 ip address {{item.vpc_peer_source}} 255.255.255.252
!
vrf context management
 ip route 0.0.0.0 0.0.0.0 10.1.1.254
!
!
snmp-server location {{item.snmp_location}}
{% include "files/snmp" %}
!
end

This is a fairly simple Jinja2 file and is easy to read even without knowledge of the Jinja2 language. Everything between double curly brackets is a variable which is replaced with its actual value. Everything enclosed by a curly bracket and a percent sign is a statement of the Jinja2 templating system. In this case it is a simple include for very static things like VLANs and SNMP settings.
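To make the substitution and the per-switch loop concrete, here is a minimal sketch in plain Python. It is not how Ansible or Jinja2 work internally: a small regular expression stands in for the Jinja2 engine, and the template and variable list are trimmed-down, hypothetical copies of the files shown above.

```python
import re

# Trimmed-down stand-in for the access_switches list in the vars file.
access_switches = [
    {"hostname": "switch_1", "vpc_domain": 20, "vpc_roleprio": 4096},
    {"hostname": "switch_2", "vpc_domain": 20, "vpc_roleprio": 8192},
]

# Trimmed-down stand-in for switch.j2 (no includes, only variables).
TEMPLATE = """hostname {{item.hostname}}
!
vpc domain {{item.vpc_domain}}
 role priority {{item.vpc_roleprio}}
"""

# Matches {{item.<key>}}, with optional spaces inside the brackets.
PLACEHOLDER = re.compile(r"\{\{\s*item\.(\w+)\s*\}\}")


def render(template, item):
    """Replace every {{item.<key>}} with the matching value from item."""
    return PLACEHOLDER.sub(lambda m: str(item[m.group(1)]), template)


def generate_configs(template, items):
    """Return a {filename: rendered text} mapping, one entry per switch."""
    return {f"{item['hostname']}.txt": render(template, item) for item in items}


configs = generate_configs(TEMPLATE, access_switches)
print(configs["switch_1.txt"])
```

The real playbook does the same thing with far less code: the template task is run once per list item and writes one file per hostname.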
The directory layout for an Ansible playbook is very important. All files are expected to be found in specific directories. Below is the layout for this tutorial. Don’t worry about the router subdirectory for the moment.

.
|-- config
|-- roles
|   |-- includes
|   |   `-- aaa
|   |-- router
|   |   |-- tasks
|   |   |   `-- main.yml
|   |   |-- templates
|   |   |   |-- files
|   |   |   |   |-- snmp
|   |   |   |   |-- stp
|   |   |   |   `-- vlan
|   |   |   `-- router.j2
|   |   `-- vars
|   |       `-- main.yml
|   `-- switch
|       |-- tasks
|       |   `-- main.yml
|       |-- templates
|       |   |-- files
|       |   |   |-- snmp
|       |   |   |-- stp
|       |   |   `-- vlan
|       |   `-- switch.j2
|       `-- vars
|           `-- main.yml
`-- site.yml

The magic happens by running the playbook.

ansible@python-dev:~/nettinkerer$ ansible-playbook site.yml

PLAY [Generate access switch files] *******************************************

GATHERING FACTS ***************************************************************
ok: [localhost]

TASK: [switch | Generate config files] ****************************************
changed: [localhost] => (item={'core_uplink': '1/0/10', 'ip_vlan': '192.168.1.1', 'core_portchannel': 11, 'hostname': 'switch_1', 'vpc_domain': 20, 'snmp_location': 'Rack1', 'vpc_peer_dest': '10.1.1.2', 'vpc_peer_source': '10.1.1.1', 'vpc_peernode': 'switch_2', 'core_node': 'core_1', 'vpc_roleprio': 4096})
changed: [localhost] => (item={'core_uplink': '2/0/10', 'ip_vlan': '192.168.1.2', 'core_portchannel': 11, 'hostname': 'switch_2', 'vpc_domain': 20, 'snmp_location': 'Rack1', 'vpc_peer_dest': '10.1.1.1', 'vpc_peer_source': '10.1.1.2', 'vpc_peernode': 'switch_2', 'core_node': 'core_1', 'vpc_roleprio': 8192})

PLAY RECAP ********************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=0

ansible@python-dev:~/nettinkerer$

The configuration files can be found in the config directory.

ansible@python-dev:~/nettinkerer$ ls -l1 config/
total 8
-rw-r--r-- 1 ansible ansible 2352 Jul 28 21:31 switch_1.txt
-rw-r--r-- 1 ansible ansible 2352 Jul 28 21:31 switch_2.txt
ansible@python-dev:~/nettinkerer$

Although it might seem like a lot of work to create all these YAML and Jinja2 files just to generate a couple of configuration files, it can save a lot of work later on. Imagine that you have generated 40 configurations and all of a sudden an additional VLAN needs to be included in all of them. Now it is just a case of modifying one single file and regenerating all the configuration files by running the playbook again.

ERSPAN on the Nexus7000

To troubleshoot some performance issues, a SPAN port was required on a Nexus 7000. Of course, the port to span was not located on the same switch as the SPAN destination.

On the Nexus 7000 it is not possible to use an RSPAN VLAN as a SPAN destination. It can only be used as a SPAN source. So this was not an option.

ERSPAN can be used as a SPAN destination, but the N7K where the ERSPAN traffic needed to be decapsulated and sent to the monitoring tool didn’t have the correct software to do this. So again, not a feasible solution.

However, it is possible to give the monitoring tool the IP address of the ERSPAN destination and place it in a segment reachable by the N7K generating the ERSPAN traffic.

The basic configuration looks like this:

monitor session 10 type erspan-source
erspan-id 10
vrf span
destination ip 10.1.11.40
source interface port-channel2 both
no shut

In the admin VDC the source IP for the ERSPAN traffic needs to be specified:

monitor erspan origin ip-address 1.1.1.1 global

Not sure why this is needed in the admin VDC.
Give a simple Linux VM the IP address 10.1.11.40 and capture the data with tcpdump.

tcpdump -i eth3 -s 300 -c 10000 -w GRE.CAP proto gre

ERSPAN uses the GRE protocol to encapsulate the packets and send them to the collector, so we filter on GRE.
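To show what sits inside those GRE packets, here is a minimal Python sketch that peels off the GRE header and reads the ERSPAN header behind it. It assumes ERSPAN Type II framing as it is commonly documented (GRE protocol type 0x88BE, followed by an 8-byte ERSPAN header); I have not verified which type the N7K in this setup emits. The function name and the sample packet are made up for illustration, and the input is the GRE packet itself, so the Ethernet and IP headers of a captured frame have to be stripped first.

```python
import struct

GRE_PROTO_ERSPAN_II = 0x88BE   # GRE protocol type carrying ERSPAN Type II
GRE_FLAG_CHECKSUM = 0x8000     # each set flag adds 4 bytes to the GRE header
GRE_FLAG_KEY = 0x2000
GRE_FLAG_SEQUENCE = 0x1000


def parse_erspan(gre_packet):
    """Return (version, session_id, vlan, inner_frame) from a GRE packet."""
    flags, proto = struct.unpack("!HH", gre_packet[:4])
    if proto != GRE_PROTO_ERSPAN_II:
        raise ValueError(f"not ERSPAN Type II, GRE protocol 0x{proto:04x}")

    # Skip the optional GRE fields announced by the flag bits.
    offset = 4
    for flag in (GRE_FLAG_CHECKSUM, GRE_FLAG_KEY, GRE_FLAG_SEQUENCE):
        if flags & flag:
            offset += 4

    # ERSPAN Type II header: ver(4) vlan(12) cos(3) en(2) t(1) session(10),
    # followed by reserved(12) index(20).
    word1, _word2 = struct.unpack("!II", gre_packet[offset:offset + 8])
    version = word1 >> 28
    vlan = (word1 >> 16) & 0x0FFF
    session_id = word1 & 0x03FF
    inner_frame = gre_packet[offset + 8:]
    return version, session_id, vlan, inner_frame


# Hand-built sample: GRE with sequence number, ERSPAN session 10, VLAN 100.
inner = b"\xaa" * 14
sample = (
    struct.pack("!HHI", GRE_FLAG_SEQUENCE, GRE_PROTO_ERSPAN_II, 1)
    + struct.pack("!II", (1 << 28) | (100 << 16) | 10, 0)
    + inner
)
version, session_id, vlan, frame = parse_erspan(sample)
print(version, session_id, vlan, len(frame))
```

As far as I understand it, the session ID in the header should correspond to the erspan-id configured on the monitor session, which makes it a handy field for telling sessions apart on the collector.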
Opening the file in Wireshark shows the data received. The red box shows the ERSPAN traffic and the blue box shows the actual encapsulated packets.
ERSPAN