2024-12-11

Emulated/Virtual Test Network

Today I finally managed to get the foundations for my test network working at L2 within a virtual environment.

The goal is to be able to simulate various aspects of my home network and hyper-converged homelab within the homelab itself!

Over on LinkedIn, I posted that I got L2 port-channelling/bonding working, but as you could see in the snip I shared there, Po2 didn't show LACP as its protocol. This is because I cheated with the config and used `channel-group 2 mode on` instead of `channel-group 2 mode active`, which brought the Port-channel interface up on the switch, but the bond on the Debian GNU/Linux host would still not form.

This post serves as a correction to that article/post.


The cause of the behaviour I was experiencing was that the libvirt VirtIO-based network adapters don't seem to report a link speed to the guest. However, I believe they operate at 10Gbps by default, which would make the bond member interfaces incompatible with the IOS-based peer's port-channel interfaces, which are limited to 1Gbps (and with LACP in general).

Changing the speed and duplex with nmcli solved this for me [1].

for i in 3 4 5 6; do sudo nmcli conn mod ens$i 802-3-ethernet.speed 1000 802-3-ethernet.duplex full; done

As soon as the speed and duplex were applied, the port-channel came up straight away. Marvellous.

Switch#show etherchan 2 summ | beg Port-

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
2      Po2(SU)         LACP      Gi1/0(P)    Gi1/1(P)    Gi1/2(P)
                                 Gi1/3(P)
Switch#
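
On the Linux side, the negotiated 802.3ad state can be sanity-checked via the bonding driver's procfs entry (assuming the bond interface is named bond0):

cat /proc/net/bonding/bond0    # shows the 802.3ad mode, MII status and LACP partner details per slave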

Now I can proceed to building further network-related components, similar to my 'production' network.

[1]https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/8/html/configuring_and_managing_networking/configuring-802-3-link-settings_configuring-and-managing-networking#proc_configuring-802-3-link-settings-using-the-nmcli-utility_configuring-802-3-link-settings


 

2024-11-29

Migrating away from BGP default-information originate

Background

I recently had yet another unplanned nbn outage. I have a GL-MT300N-V2 with a basic config, plus a floating static route on my central/downstream multilayer switch acting as a backup route with a worse administrative distance than BGP, so that I can share my mobile phone's Mobile Broadband whenever my Fortigate (FGT) can't forward default route traffic. For some reason, though, it was not working as expected/intended.

Problem #1 - IPTABLES default reject on the FORWARD chain

I did not capture the issue in detail, but it turned out that the GL-MT300N-V2 was rejecting traffic in the FORWARD chain; changing this setting is what allowed forwarded traffic to pass to the MBB tether.
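
I didn't note the exact knob I flipped, but on plain OpenWrt the global forward policy lives in the firewall defaults section, so the equivalent change from the shell would look something like this (illustrative only):

uci set firewall.@defaults[0].forward='ACCEPT'   # default policy for forwarded traffic is normally REJECT
uci commit firewall
/etc/init.d/firewall restart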



Problem #2 - default-information originate

The upstream BGP default route from my FGT persisted even during an outage, when it should have disappeared so that the floating static route could take over internet forwarding. (The Fortinet article linked herein explains this; it is normal BGP behaviour, but it was overlooked at the time of implementation. Whoops!) This was because I was using the Fortinet option `set capability default-information-originate` in the BGP configuration, so I ended up tuning the BGP configuration and making the default route more dynamic, as follows:

The solution

  1. Created a DEFAULT route prefix list
  2. Created a Route-map that uses the prefix list
  3. Redistributed static routes into the BGP table using the route-map
It now looks something like this:

config router prefix-list
    edit "PL_DEFAULT"
        config rule
            edit 1
                set prefix 0.0.0.0 0.0.0.0
                unset ge
                unset le
            next
        end
    next
end
config router route-map
    edit "RM_DEFAULT"
        config rule
            edit 1
                set match-ip-address "PL_DEFAULT"
            next
        end
    next
end
config router bgp
    config redistribute "static"
        set status enable
        set route-map "RM_DEFAULT"
    end
end
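
To double-check from the FGT side, the routing table and what is actually advertised to the peer can be inspected with something like the following (the peer address is a placeholder, and exact syntax varies a little between FortiOS versions):

get router info routing-table all | grep 0.0.0.0
get router info bgp neighbors <peer-ip> advertised-routes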

I then disconnected the nbn and enabled `debug ip routing` on my switch to test the solution.

During testing and while the nbn was offline, the floating static was in place, exactly as expected:

SWITCH#show ip route | incl 0\.0\.0\.0\/0
S*    0.0.0.0/0 [254/0] via 192.168.81.1
SWITCH#

Once the nbn service was back and the upstream FGT re-inserted its static default, it wasn't long before I saw the resulting debug messages:

1w0d: RT: updating bgp 0.0.0.0/0 (0x0):
    via 10.8.18.1

1w0d: RT: closer admin distance for 0.0.0.0, flushing 1 routes
1w0d: RT: add 0.0.0.0/0 via 10.8.18.1, bgp metric [20/0]
1w0d: RT: default path is now 0.0.0.0 via 10.8.18.1

I followed this up with a check on the routing table, and here is the dynamic default route learned from the upstream PPPoE link in all its glory.

SWITCH#show ip route bgp | incl 0\.0\.0\.0\/0
B*    0.0.0.0/0 [20/0] via 10.8.18.1, 00:27:41
SWITCH#

Conclusion


This method is a more elegant solution: the backup internet path can now be leveraged with almost no touch.

In case you're wondering why I use a floating static route: the GL-MT300N-V2 is extremely limited in flash storage, making it difficult to install and operate Quagga/FRR, and I am tired of resetting the device, as it has a tendency to fall over after a while, which I suspect is due to the lack of space.

The only possible improvement I could make right now is improving security through policy by putting the GL-MT300N-V2 behind the firewall itself, but that is a project for another day (not to mention it runs OpenWrt under the hood and has its own IPTABLES firewall anyway). I also plan to swap out the FGT for a dedicated OPNsense appliance hosted on an SBC.


I hope this has been informative and I'd like to thank you for reading!

Stay tuned for more...


2024-11-23

Reflections on Cisco ENARSI Study

One LAB to rule them all

While studying for the Cisco Certified Network Professional (CCNP) Enterprise Advanced Routing and Services (ENARSI) concentration exam, I eventually figured out a strategy to help me focus more on learning and less on constantly creating LABs.

I decided to build a reusable (flexible) lab by simply specifying the L2 VLAN at the router with a sub-interface, so that I could potentially attach any router to any other router in a P2MP broadcast setup.

It is impractical for a production network, as each router shares the bandwidth of a single link for all VLANs on the trunk, but it is a very simple, elegant and flexible design that lets me spend less time building labs and more time hands-on in a variety of scenarios.
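
To illustrate, attaching a router to any given broadcast segment is just a matter of a dot1q sub-interface per VLAN; a minimal sketch (the VLAN ID and addressing here are made up):

interface GigabitEthernet0/0
 no shutdown
!
interface GigabitEthernet0/0.100
 encapsulation dot1Q 100
 ip address 10.100.0.1 255.255.255.0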



Interestingly, both IOSvL2 switches refuse to provision certain VLAN ranges in the device config I built, but that's probably a bug with either CML or the image itself.

EXAM

So, the exam to date has been nothing short of frustrating. Just when I think I've nailed a lot of the concepts in EIGRP, OSPF, redistribution and many other L3 topics and begin feeling more confident, I find that the exam's pool of questions is completely disjointed. For example, in my last exam - just before the time of writing this - less than 50% of the questions were actual advanced routing questions, with more around services, device access and extremely low-level, corner-case things like really nuanced MPLS. It's not helping me to overcome imposter syndrome, and to me it makes it feel like I'm just part of Cisco's additional revenue stream and business strategy rather than part of a valuable learning and certification process.

It also frustrates me that I seem to have to retain and recall an insane and almost inhuman amount of low-level information on EVERYTHING, no matter how relevant or related to advanced routing it may seem.

I may just pivot across to other vendors and technology because it seems as though my brain is incompatible with rote learning.

2024-11-06

Dealing with old Cisco gear and SSH

I've spent enough time dealing with old Cisco gear to know that the old, outdated ciphers and key exchanges can be tricky to deal with. Unfortunately, we can't just run the latest and greatest in a LAB, and a LAB is generally considered isolated, so we have to live with this to a certain degree, even if it is insecure.

I'm documenting the process of forcing an SSH client (on Linux) to use the right KEX, cipher, etc., so people (including myself) don't have to piece the solution together from different sources every single time.

First off, the answers to which command-line parameters are required lie in the debug output of the client application itself.

ssh -vvv $host

This spits out a lot of information, which I could not seem to filter through egrep; nonetheless, the key items are listed here for reference:

debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: KEX algorithms: diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1
debug2: host key algorithms: ssh-rsa
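
In hindsight, the reason egrep appeared to do nothing is that the SSH debug output is written to stderr, so it needs redirecting before it can be filtered; something like this (the pattern is just an example):

ssh -vvv $host 2>&1 | egrep -i 'ciphers|kex|host key'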

With that information gleaned, I was able to construct the parameters required to successfully connect to an SSH session in a lab.

ssh -oStrictHostKeyChecking=no -oKexAlgorithms=+diffie-hellman-group1-sha1,diffie-hellman-group14-sha1 -oCiphers=aes128-ctr -oHostkeyAlgorithms=+ssh-rsa $host
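
To avoid retyping all of that, the same options can live in ~/.ssh/config against a host pattern (the pattern below is purely an example):

Host lab-rtr-*
    KexAlgorithms +diffie-hellman-group1-sha1,diffie-hellman-group14-sha1
    Ciphers aes128-ctr
    HostKeyAlgorithms +ssh-rsa
    StrictHostKeyChecking no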


2024-10-22

CML2 - Some Thoughts and Comparison to EVE-ng

Since I'm back in study mode, I thought I'd get hold of a Cisco Modeling Labs (CML) 2 licence so that I can gain some efficiency, and therefore more focus on the LABs at hand, rather than troubleshooting and working around the various kinks and nuances of the LAB environment, which I found I was doing a lot of in EVE-ng (prior to 6.0.x).

Installation

Once the purchase over at the Cisco Learning Network Store was complete, installation was quite straightforward, except for one extra step, which I will explain later.

Here are the high-level steps I took to accomplish the task:

  1. Downloaded the OVA and the refplat bundle from the Cisco software center
  2. Copied the OVA to the hypervisor
  3. Converted the vmdk from the OVA to qcow2
  4. Imported the qcow2 image into the hypervisor
Step 3 was only required for me since I'm using QEMU/KVM+libvirt as my hypervisor, but a quick search online guided me to the solution, which allowed me to almost seamlessly import it.
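
For reference, the conversion boiled down to extracting the vmdk from the OVA (an OVA is just a tar archive) and converting it; roughly like this, with illustrative filenames:

tar -xvf cml2_controller.ova
qemu-img convert -f vmdk -O qcow2 cml2_controller-disk1.vmdk cml2_controller.qcow2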

I imported the qcow2 image and started the VM, but it would not boot properly; it seems as though UEFI is required. Easy fix.

Initial Setup

Initial setup required access to the VM console, but it was very straightforward: it offered to use DHCP, expand the disk to its biggest possible extent and set the passwords. All quite lacklustre, painless and somewhat anticlimactic.

Once booted, the console informs you that you can log into the CML application and also the Cockpit web interface for sysadmin tasks (cue the sysadmin credentials).

Licence and Registration

At first login to CML, the application setup continues a bit more, and you can then register the instance by inputting the licence key (provided you're not distracted by the setup option to navigate away from the wizard).

LAB Time

Once CML was configured, I seem to remember it immediately creating a new lab and leaving me in the driving seat at that point. I found it trivial to navigate, add nodes, access consoles, create links, etc.

So far so good. It took less than a couple of hours to set up, as opposed to EVE-ng, which took me over a few hours to install from scratch (ISO) in a VM, and that's not including the time it took to copy and set up each image (converting the few Cisco qcow2 images that were actually raw), test, and then start labbing and figure out all the strange and weird behaviours, like interface state configuration not being saved in exported configs.

I have no idea yet how to add custom (non-Cisco) nodes to CML. I know that qcow2 images (only) can be added, but they demand a lot of hypervisor options for node definitions, which I don't want to have to worry myself over. And then there's the quirk that default node definitions are read-only; I want to edit them, but CML scares you out of doing so, warning that it could break LABs, and there doesn't seem to be an option to revert them back to default.

Another quirk I found is that CML (out of the box) doesn't give you the same sort of naming construct as EVE-ng does with bulk node numbering. While you can prefix nodes, it appends a '-' and then a number which starts from 0, not 1. So I end up with R-0, R-1, R-2 and so on. Not a big deal, as renaming is fairly straightforward, but renaming 5-10 devices isn't something I want to have to spend time doing.

The last thing I'd like to mention is that I'm noticing a lot of EVE-ng-like similarities with regards to LAB IDs and exported configs (or, as CML puts it, 'Fetch config', which I discovered needs to be done individually on each node in order to include the config in the exported LAB YAML).

Final Opinion

CML is very polished. It's a breath of fresh air for Cisco-centric stuff out-of-the-box. Where it is lacking, though, is that the system performance readout in the bottom bar is quite distracting. When starting a lab, while it's settling, or whenever a router reloads or just decides it needs more CPU, you the user see it. This is a distraction, and if there isn't an option to toggle it off or hide it completely, there should be. I want to focus on the LAB at hand, not sysadmin tasks.

I've never been a fan of EVE-ng's UNL (UNetLab) file format, but CML's ability to easily export and import labs in a standard file format (YAML) is fantastic.

Licensing is something that I don't like. While CML does come with an eval licence, you still have to purchase it just to be able to download it. It would be better if CML Personal were still free/accessible for personal/eval use, either with a perpetual licence included or simply requiring registration to obtain a free licence. Cisco are definitely bringing in a revenue stream across the entire CML product line, but it goes without saying that they have probably pumped a lot of resources into developing the KVM+Cockpit-based hypervisor and web GUI into quite a polished product, which also has a rich API for automation that can be leveraged for things like CI/CD.

The disk capacity of the OVA seems rather small, so I'm going to consider using libvirt's guestfs tools to expand the qcow2 image and then figure out how to expand the PV/LV within the OS/Cockpit.
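
If I go down that path, the rough shape would be to grow the image while the VM is off and then grow the partition, PV and LV inside the guest; a sketch along these lines, where the paths, partition and volume group names are all assumptions on my part:

sudo qemu-img resize /var/lib/libvirt/images/cml2.qcow2 +100G    # on the hypervisor, VM shut down
sudo growpart /dev/sda 3                                         # inside the guest: grow the partition...
sudo pvresize /dev/sda3                                          # ...then the LVM physical volume...
sudo lvextend -r -l +100%FREE /dev/vg0/root                      # ...then the logical volume and filesystem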

CML is now my go-to Network Modelling LAB tool for my next CCNP ENARSI exam attempt, since it has fewer quirks and more polish, allowing me to more easily create, manage and operate my labs and focus on what matters: learning.

2024-08-24

HomeLab Mk.3 - Project Closeout

From a project methodology standpoint, I'm missing some updates since the last post, but this is because I had since entered a redundancy, had immediate funding as a result and, not to mention, limited time to kick off, execute and deploy before securing new employment.

The whole project is now complete, with a 4RU AMD Ryzen-based custom-built server running Debian GNU/Linux.

Some of the improvements that have been made so far are as follows (in no particular order):

  1. Employed cryptsetup on top of software RAID (sketched below)
  2. Purchased and installed the 4RU system into an 18RU rack
  3. Installed Cockpit for easier host/hypervisor management
  4. Migrated the VMs from the previous HomeLab hypervisor to the new one
  5. Built a functioning eve-ng instance as a VM using nested virtualisation for network modelling
One key compromise was that I decided to reduce costs on memory, so the hypervisor host is outfitted with 64GB instead of the maximum 192GB of RAM. This was due to the higher-than-expected motherboard cost, not to mention that my requirements are fairly low at this stage, so that sort of outlay isn't justified.
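
The storage stack from item 1 is conceptually just LUKS layered on top of an md array; a minimal sketch, where the device names and filesystem are purely illustrative:

sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1   # software RAID1
sudo cryptsetup luksFormat /dev/md0                                           # encrypt the md device
sudo cryptsetup open /dev/md0 crypt-data
sudo mkfs.ext4 /dev/mapper/crypt-data                                         # filesystem on the mapped device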

In addition to the above, I've also embarked on a more secure and virtualised infrastructure by using OPNsense for the PROD, DEV, (NET)LAB and DMZ networks. It pretty much just stitches together and firewalls multiple isolated virtual networks inside libvirt, peers with the multilayer switch over a P2P L3 interface via a dot1q trunk, advertises a summary route and accepts only a default route from upstream.

I think it's a fairly elegant design given my constraints and requirements, but more importantly, it is a much more manageable setup now, which reduces some technical debt for me. There are now very few improvements to make, even in the next iteration of the HomeLab, which will mostly be a hardware refresh - that, and re-racking everything, since the rack's mounting rails need adjusting to accommodate the 4RU server depth, which was unfortunately not able to be done in time.

While I would love to share the overall design itself, it unfortunately contains far too much information that is now considered somewhat confidential. Those I trust and those who know me are always welcome to take a read (preferably on-screen), as I'm not in a position to re-write it for public consumption.

Debugging Cisco Access Lists

I want to share something specific I learned, that seems to be outside the official CCNP curriculum.

Despite the fact that I've done some (L2) traffic separation for untrusted devices, there are still, unfortunately, some that need to be on my internal L3 network for now (Google-based devices like a Google TV-based TV and an old Google Home - Nest products don't interest me), so I decided to do something about this to restrict lateral traffic and potential attacks from old, unsupported or not-so-trusted hosts.

While I could separate the traffic at L2 and forward it to a virtual firewall or my FGT internet firewall appliance, that, in my opinion, causes sub-optimal traffic flows due to network limitations/design, since the budget won't allow better gear for my needs (like a VXLAN/VPNv4 overlay with route leaking etc.).

So, all I have to work with is an old unsupported Cisco IOS v15 (Classic) Multilayer central switch in my home network.

I thought this would be pretty easy: just allow host services like DHCP/netboot, intra-VLAN traffic etc., block RFC1918 and allow everything else. Ez Pz. Except netboot to my netboot.xyz server didn't work initially, and I couldn't easily figure out why.

ip access-list extended RESTRICTED_ACCESS
 remark NETWORK_SERVICES
 permit udp any eq bootpc any eq bootps
 permit udp any any eq domain
 remark ALLOW_PING
 permit icmp any any echo
 permit icmp any any echo-reply
 remark ALLOW_PXE_SERVER
 permit udp any host 192.168.56.3 eq tftp
 permit tcp any host 192.168.56.3 eq www
 remark PERMIT_INTRA-VLAN
 permit ip 192.168.0.0 0.0.0.255 192.168.0.0 0.0.0.255 log
 remark DENY_RFC1918
 deny   ip any 10.0.0.0 0.255.255.255
 deny   ip any 172.16.0.0 0.15.255.255
 deny   ip any 192.168.0.0 0.0.255.255
 remark ALLOW_EVERYTHING_ELSE
 permit ip any any log

I needed some visibility on the ports and protocols like a firewall log... Cisco conditional debugging to the rescue!

The specific Cisco debug I used was `debug ip packet detail`

Unfortunately, the detail was overwhelming and showed far too much information for any human to interpret, and it nearly brought down the switch, so I had to constrain or filter the output with a debug condition similar to the following:

`debug condition ip 192.168.0.4`

This produced the information I required and allowed me to pinpoint the missing port and protocol required!

21w3d: IP: s=192.168.0.1 (local), d=192.168.0.4 (Vlan666), len 56, sending

21w3d:     ICMP type=3, code=13
21w3d: IP: s=192.168.0.1 (local), d=192.168.0.4 (Vlan666), len 56, output feature
21w3d:     ICMP type=3, code=13, Check hwidb(88), rtype 1, forus FALSE, sendself FALSE, mtu 0, fwdchk FALSE
21w3d: IP: s=192.168.0.1 (local), d=192.168.0.4 (Vlan666), len 56, sending full packet
21w3d:     ICMP type=3, code=13pak 599DB6C consumed in input feature , packet consumed, Access List(31), rtype 0, forus FALSE, sendself FALSE, mtu 0, fwdchk FALSE
21w3d: IP: s=192.168.0.4 (Vlan666), d=192.168.56.3, len 32, access denied
21w3d:     UDP src=62557, dst=30002
21w3d: FIBipv4-packet-proc: route packet from Vlan666 src 192.168.0.4 dst 192.168.56.3
21w3d: FIBfwd-proc: packet routed by adj to Vlan56 192.168.56.3
21w3d: FIBipv4-packet-proc: packet routing succeeded
21w3d: IP: s=192.168.0.1 (local), d=192.168.0.4, len 56, local feature
21w3d:     ICMP type=3, code=13, CASA(4), rtype 0, forus FALSE, sendself FALSE, mtu 0, fwdchk FALSE
21w3d: IP: s=192.168.0.1 (local), d=192.168.0.4, len 56, local feature

As you can see in the above output, UDP port 30002 was blocked (caught by the DENY_RFC1918 entry, since 192.168.56.3 sits within 192.168.0.0/16), so adding a permit for it before the RFC1918 denies resolved this for me. Happy days.

So here's the final ACL that worked a treat.

ip access-list extended RESTRICTED_ACCESS
 remark NETWORK_SERVICES
 permit udp any eq bootpc any eq bootps
 permit udp any any eq domain
 remark ALLOW_PING
 permit icmp any any echo
 permit icmp any any echo-reply
 remark ALLOW_PXE_SERVER
 permit udp any host 192.168.56.3 eq tftp
 permit udp any host 192.168.56.3 eq 30002
 permit tcp any host 192.168.56.3 eq www
 remark PERMIT_INTRA-VLAN
 permit ip 192.168.0.0 0.0.0.255 192.168.0.0 0.0.0.255 log
 remark DENY_RFC1918
 deny   ip any 10.0.0.0 0.255.255.255
 deny   ip any 172.16.0.0 0.15.255.255
 deny   ip any 192.168.0.0 0.0.255.255
 remark ALLOW_EVERYTHING_ELSE
 permit ip any any log

Yes, I know I can (and probably will) tighten it some more and make DNS more specific (or remove it entirely to enforce Quad9 DNS and prevent poisoning), but I wanted an ACL that is as simple as possible so I can easily model it and apply it to other interfaces and SVIs, which, I might add, is being done and so far is working well.
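
For example, pinning DNS to Quad9 would just mean swapping the open domain rule for entries scoped to its resolvers (9.9.9.9 and 149.112.112.112); a possible tweak, untested on my ACL:

 permit udp any host 9.9.9.9 eq domain
 permit tcp any host 9.9.9.9 eq domain
 permit udp any host 149.112.112.112 eq domain
 permit tcp any host 149.112.112.112 eq domain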

 