Showing posts with label linux. Show all posts

2024-12-11

Emulated/Virtual Test Network

Today I finally managed to get the foundations for my test network working at L2 within a virtual environment.

The goal is to let me simulate various aspects of my home network and hyper-converged homelab within the homelab itself!

Over on LinkedIn, I posted that I got L2 port-channelling/bonding working, but as you can see in the snip below, Po2 doesn't show LACP as its protocol. This is because I cheated with the config and used `channel-group 2 mode on` instead of `channel-group 2 mode active`, which brought the port-channel interface up on the switch, but the bond on the Debian GNU/Linux host would still not form.

This post serves as a correction to that article/post.


The cause of the behaviour I was experiencing is that the libvirt VirtIO-based network adapters don't seem to report a link speed to the guest. I believe they operate at 10Gbps by default, which would make the bond interface incompatible with the IOS-based peer's port-channel interfaces, which are limited to 1Gbps (and with LACP in general).

Changing the speed and duplex with nmcli solved this for me [1].

for i in 3 4 5 6; do sudo nmcli conn mod ens$i 802-3-ethernet.speed 1000 802-3-ethernet.duplex full; done

As soon as the speed and duplex were applied, the port-channel came up straight away. Marvellous.

Switch#show etherchan 2 summ | beg Port-

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
2      Po2(SU)         LACP      Gi1/0(P)    Gi1/1(P)    Gi1/2(P)
                                 Gi1/3(P)
Switch#

Now, I can proceed to further network-related components similar to my 'production' network.

[1]https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/8/html/configuring_and_managing_networking/configuring-802-3-link-settings_configuring-and-managing-networking#proc_configuring-802-3-link-settings-using-the-nmcli-utility_configuring-802-3-link-settings


 

2024-11-06

Dealing with old Cisco gear and SSH

I've spent enough time dealing with old Cisco gear to know that outdated ciphers and key exchanges can be tricky to deal with. Unfortunately, we can't just run the latest and greatest in a lab, and since a lab is generally considered isolated, we have to live with this to a certain degree, even if it's insecure.

I'm documenting the process of how to force an SSH client (Linux) to use the right KEX, cipher etc., so people (including myself) don't have to piece the solution together from different sources every single time.

First off, the answers to the command-line parameters required lie in debugging in the client application itself.

ssh -vvv $host

This spits out a lot of information, which I couldn't easily filter through egrep; nonetheless, the key items are listed here for reference:

debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: KEX algorithms: diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1
debug2: host key algorithms: ssh-rsa

With that information gleaned, I was able to construct the parameters required to successfully connect to an SSH session in a lab.

ssh -oStrictHostKeyChecking=no -oKexAlgorithms=+diffie-hellman-group1-sha1,diffie-hellman-group14-sha1 -oCiphers=aes128-ctr -oHostkeyAlgorithms=+ssh-rsa $host
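Rather than typing those flags every time, the same overrides can be persisted in the client config. This is a sketch only - the host pattern is a made-up placeholder, and the options simply mirror the command line above:

```
# ~/.ssh/config - scope the weakened settings to lab devices only,
# never loosen these globally (the host pattern below is illustrative)
Host lab-switch* 192.0.2.*
    KexAlgorithms +diffie-hellman-group1-sha1,diffie-hellman-group14-sha1
    Ciphers aes128-ctr
    HostkeyAlgorithms +ssh-rsa
    StrictHostKeyChecking no
```

With that in place, a plain `ssh lab-switch1` picks up the legacy algorithms automatically.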


2024-10-22

CML2 - Some Thoughts and Comparison to EVE-ng

Since I'm back in study mode, I thought I'd get hold of a Cisco Modeling Labs (CML) 2 licence so that I can gain some efficiency, and therefore more focus on the labs at hand, rather than troubleshooting and working around the various kinks and nuances of the lab environment, which I found I was doing a lot of in EVE-NG (prior to 6.0.x).

Installation

Once the purchase over at the Learning Network Store was complete, installation was quite straightforward, except I had an extra step, which I will explain later.

Here are the high-level steps I took to accomplish the task;

  1. Downloaded the OVA and the refplat bundle from the Cisco software center
  2. Copied the OVA to the hypervisor
  3. Converted the vmdk from the OVA to qcow2
  4. Imported the qcow2 image into the hypervisor

Step 3 was only required for me since I'm using QEMU/KVM+libvirt as my hypervisor, but a quick search online guided me to a solution that allowed me to import it almost seamlessly.
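For anyone repeating step 3, the conversion is essentially a one-liner with qemu-img; this is a sketch assuming illustrative file names (an OVA is just a tar archive containing the VMDK):

```shell
# Unpack the OVA - it's a plain tar archive (file names are illustrative)
tar -xf cml2_p.ova

# Convert the extracted VMDK disk to qcow2 for libvirt/KVM
qemu-img convert -f vmdk -O qcow2 cml2_p-disk001.vmdk cml2.qcow2
```

The resulting qcow2 can then be imported as the VM's disk in the usual way.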

I imported the qcow2 image and started the VM, but it would not boot properly; it seems UEFI is required. Easy fix.

Initial Setup

Initial setup required access to the VM console, but it was very straightforward: it offered to use DHCP, expand the disk to its fullest possible extent, and set the passwords. All quite lacklustre, painless and somewhat anticlimactic.

Once booted, the console informs you that you can log into the CML application and also the Cockpit web interface for sysadmin tasks (cue the sysadmin credentials).

Licence and Registration

At first login to CML, the application setup continues a bit more; you can then register the instance by entering the licence key (provided you're not distracted by the setup option into navigating away from the wizard).

LAB Time

Once CML was configured, I seem to remember it immediately creating a new lab and leaving me in the driving seat. I found it trivial to navigate, add nodes, access consoles, create links etc.

So far so good. Less than a couple of hours to set up, as opposed to EVE-NG, which took me over a few hours to install from scratch (ISO) in a VM - and that's not including the time it took to copy and set up each image (converting the few Cisco qcow2 images that were actually raw), test, and then start labbing and figure out all the strange and weird behaviours, like interface state configuration not being saved in exported configs.

I have no idea yet how to add custom (non-Cisco) nodes to CML. I know that qcow2 images (only) can be added, but they demand a lot of hypervisor options in their node definitions, which I don't want to have to worry myself over. And then there's the quirk of default node definitions being read-only. I want to edit them, but CML scares you out of doing so, warning that it could break labs, and there doesn't seem to be an option to revert them back to default.

Another quirk I found is that CML (out of the box) doesn't give you the same sort of naming construct as EVE-NG does with bulk node numbering. While you can prefix nodes, it appends a '-' and then a number which starts from 0, not 1. So I end up with R-0, R-1, R-2 and so on. Not a big deal, as renaming is fairly straightforward, but renaming 5-10 devices isn't something I want to have to spend time doing.

The last thing I'd like to mention is that I'm noticing a lot of EVE-NG-like similarities with regard to lab IDs and exported configs (or, as CML puts it, "Fetch config", which I discovered needs to be done individually on each node in order to include the config in the exported lab YAML).

Final Opinion

CML is very polished. It's a breath of fresh air for Cisco-centric stuff out of the box. Where it is lacking, though, is that the system performance readout in the bottom bar is quite distracting. When starting a lab, while it's settling, or whenever a router reloads or just decides it needs more CPU, you the user see it. This is distracting, and if there isn't an option to toggle it off or hide it completely, there should be. I want to focus on the lab at hand, not sysadmin tasks.

I've never been a fan of EVE-NG's UNL (UNetLab) file format, but the ability to easily export and import labs in a standard file format (YAML) is fantastic.

Licensing is something that I don't like. While CML does come with an eval licence, you still have to purchase it to get access to download it. CML Personal could still be free/accessible for personal/eval use, and could include a perpetual licence, or just require registration to obtain a free licence. Cisco are definitely bringing in a revenue stream across the entire CML product line, but then again, they have probably pumped a lot of resources into developing the KVM+Cockpit-based hypervisor and WebGUI into quite a polished product, which also has a rich API for automation that can be leveraged for things like CI/CD.

The disk capacity of the OVA seems rather small, so I'm going to consider using libvirt's guestfs tools to expand the qcow2 image and then figure out how to expand the PV/LV within the OS/Cockpit.

CML is now my go-to network modelling lab tool for my next CCNP ENARSI exam, since it offers fewer quirks and more polish, allowing me to more easily create, manage and operate my labs and focus on what matters: learning.

2024-08-24

HomeLab Mk.3 - Project Closeout

From a project methodology standpoint, I'm missing some updates since the last post, but this is because I had since entered a redundancy, had immediate funding as a result and, not to mention, limited time to kick off, execute and deploy before securing new employment.

The whole project is now complete, with a 4RU AMD Ryzen-based custom-built server running Debian GNU/Linux.

Some of the improvements that have been made so far are as follows (in no particular order);

  1. Employed cryptsetup on top of software RAID
  2. Purchased and installed the 4RU system into an 18RU rack
  3. Installed Cockpit for easier host/hypervisor management
  4. Migrated the VMs from the previous HomeLab hypervisor to the new one
  5. Built a functioning EVE-NG instance as a VM using nested virtualisation for network modelling

One key compromise was that I decided to reduce memory costs, so the hypervisor host is outfitted with 64GB instead of the maximum 192GB of RAM. This was due to the higher-than-expected motherboard cost, not to mention that my requirements are fairly low at this stage, so that sort of outlay isn't justified.

In addition to the above, I've also embarked on a more secure and virtualised infrastructure by using OPNsense for the PROD, DEV, (NET)LAB and DMZ networks. It pretty much just stitches together and firewalls multiple isolated virtual networks inside of libvirt, and peers with the multi-layer switch over a P2P L3 interface via a dot1q trunk, advertising a summary route while accepting only a default route from upstream.

I think it's a fairly elegant design given my constraints and requirements but, more importantly, it is a much more manageable setup now, which reduces some technical debt for me. There are now very few improvements to make, even for the next iteration of the HomeLab, which will mostly be a hardware refresh - that, and re-racking everything, since the rack's mounting rails need adjusting to accommodate the 4RU server depth, which was unfortunately not able to be done in time.

While I would love to share the overall design itself, it unfortunately has far too much information that is now considered somewhat confidential, but those who I trust and those who know me are always welcome to take a read (preferably onscreen) as I'm not in a position to re-write it for public consumption.

2023-11-03

Git

I'm not sure why git is called 'the stupid content tracker' (according to the man page that is), but I've discovered that - despite many tutorials overcomplicating the setup by adding the creation of a git user account and SSH key-based authentication - it is stupidly trivial to set up a remote repository.


By stupid I mean that git does not reference any of the object files in a way that you would expect or as you are used to working with them in your locally checked-out repository or IDE.

This method of file storage threw me off and caught me off guard, but I eventually managed to get the initial commit added to the remote.

I also learned that git appears to work locally, meaning you can clone on the same system that's hosting the repository using directory paths without a transport protocol!
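A minimal sketch of that discovery (the paths are made up): a bare repository plus a path-based clone is all a "remote" needs on a single host.

```shell
# Create a bare repository to act as the "remote" (no working tree)
git init --bare /srv/git/project.git

# Clone it on the same machine using a plain filesystem path -
# no SSH, no HTTP, no dedicated git user required
git clone /srv/git/project.git ~/work/project
```

Pushes and pulls from the working clone then go over the local filesystem transport.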

I'm now armed with information on how private git repo hosting works, which is especially useful for interim SCM or when private hosting is required for whatever reason.

2023-10-29

Libvirt virtio Networking

Delving deeper into libvirt has me trying to find ways to improve the previous build through lab testing.

The latest testing is virtio networking with an isolated network, in order to work around libvirt's inability to snapshot guests unless all the volumes they use are qcow2.

With this limitation in mind, I employed NFS to a common datastore for guests that require access to it; however, the path taken in the current configuration is suboptimal, traversing the host's management interface.

The virtio model provides much better throughput while at the same time allowing guests to communicate with the host, but not outside the host.
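For reference, an isolated libvirt network of this kind can be defined with a small XML snippet (the name, bridge and addressing below are illustrative assumptions); with no <forward> element, traffic stays between the guests and the host:

```xml
<network>
  <!-- no <forward> element: guests can reach the host and each other,
       but nothing beyond the host -->
  <name>isolated-san</name>
  <bridge name="virbr9"/>
  <ip address="10.99.99.1" netmask="255.255.255.0"/>
</network>
```

Defined with `virsh net-define` and started with `virsh net-start`, guest NICs attached to it with the virtio model then get host-local throughput like the numbers below.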

In my testing with a virtio model I was able to achieve over 10Gbps with no tuning whatsoever as follows;

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  16.3 GBytes  14.0 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  16.3 GBytes  14.0 Gbits/sec                  receiver

The current suboptimal path is not only limited by the hardware NIC/switch, but we can also observe quite a lot of retries, indicating TCP retransmits are likely occurring, which would introduce latency with NFS.

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.10 GBytes   942 Mbits/sec  315             sender
[  5]   0.00-10.00  sec  1.09 GBytes   939 Mbits/sec                  receiver

I now have yet another defined improvement concept ready for implementation on the new server build.

2023-10-26

Libvirt pool storage management

I was really looking forward to improving on my previous homelab by building a new server and defining succinct, well-thought-out pools that leverage and manage LVM, mounts etc. in order to abstract away some of the sysadmin tasks.


In my limited testing, I've found that libvirt storage management is flexible yet limited. I could potentially have done away with the complexities of mdadm, the manual definition of PVs, VGs and LVs, formatting, creating mountpoints, and then adding the mounted filesystem(s) to libvirt (or letting libvirt mount them for me). But since I'm using crypto to mitigate potential data breaches during hard drive disposal, I can't leverage the RAID functionality within LVM itself, as I require simple encryption with a single key on a single volume - or, in my case, an md array.

If I didn't require crypto, I may have been able to skip the manual mdadm RAID configuration and carved out nicer storage management, however this is unfortunately not the case.

It seems as though you can't easily carve up an LV as if it were a PV from libvirt's perspective when defining a pool (that is, without the headaches that come with partitioning LVs, or overcomplicating the solution with pools defined from libvirt volumes). Libvirt pools also seem flat in nature, and I can't figure out how to define a volume under a directory without defining separate volumes (such as dir-based) to overcome this.

So for now my solution is to handle most of the storage manually with one single mount point based on a single md and crypto device along with a single LVM PV, VG and LV with dir-based pools defined to manage volumes.
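A hedged sketch of that stack, from md array up to the dir-based pool (device names, sizes and mount points are all illustrative; every command here is destructive and needs root, so treat it as a reference, not a script to run):

```shell
# RAID first, then LUKS on the single md device, then LVM on top
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdc /dev/sdd /dev/sde
cryptsetup luksFormat /dev/md0
cryptsetup open /dev/md0 crypt0
pvcreate /dev/mapper/crypt0
vgcreate vg0 /dev/mapper/crypt0
lvcreate -L 2T -n hdimg vg0
mkfs.ext4 /dev/vg0/hdimg
mount /dev/vg0/hdimg /srv/vmstore

# Finally, a dir-based libvirt pool over the mounted filesystem
virsh pool-define-as vmstore dir --target /srv/vmstore
virsh pool-start vmstore
virsh pool-autostart vmstore
```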

It doesn't seem ideal nor efficient, but right now I need a solution to move the project forward to completion.

I will further test and refine (and possibly even automate) the solution on the new hypervisor host at some point. Who knows, there may be better tools or newly discovered ways of doing this in the future.

The next step in the overall solution is to test a virtiofs share and/or a virtio high-speed (10Gbps) isolated SAN solution.

2023-09-28

Regular Expressions - Examples and Use Cases

Background

This post should serve as a repository of selected use-case regular expressions, sorted by utility/name. It is predominantly centred around Linux and user-space utilities (with a certain amount of Cisco IOS-based examples as well, under their own heading and subheadings). It will hopefully be continually updated, as I intend to keep adding to it as I build more regular expression use cases.

MDADM

The following was useful for gathering mdadm information when I had an issue with a missing block device in a RAID array (which turned out to be SATA cables that were accidentally swapped when performing maintenance/cleaning, causing unexpected device renaming, which ultimately bumped a device off the array - sdb in my case). The examples here use simple patterns to show the Linux block devices in an array and to look for log entries.

user@host:~$ sudo mdadm --detail /dev/md0 | egrep '\/dev\/sd?'
       3       8       64        0      active sync   /dev/sde
       1       8       32        1      active sync   /dev/sdc
       4       8       48        2      active sync   /dev/sdd

user@host:~$ cat /etc/mdadm/mdadm.conf | egrep '\/dev\/sd?'
DEVICE /dev/sdb /dev/sdc /dev/sdd /dev/sde
user@host:~$
user@host:~$ sudo dmesg | grep md0
[    2.701684] md/raid:md0: device sdc operational as raid disk 1
[    2.701686] md/raid:md0: device sdd operational as raid disk 2
[    2.701687] md/raid:md0: device sde operational as raid disk 0
[    2.702549] md/raid:md0: raid level 5 active with 3 out of 3 devices, algorithm 2
[    2.702574] md0: detected capacity change from 0 to 8001304920064
user@host:~$ 

HDPARM

For similar reasons to the mdadm example above, I initially suspected that a disk was faulty and wanted to extract the serial number of each for warranty lookup. This is how I achieved that outcome (sans actual serial numbers).

user@host:~$ sudo hdparm -I /dev/sd? | egrep '(\/dev\/sd?|Serial\ Number)'
/dev/sda:
        Serial Number:      *** REDACTED ***
/dev/sdb:
        Serial Number:      *** REDACTED ***
/dev/sdc:
        Serial Number:      *** REDACTED ***
/dev/sdd:
        Serial Number:      *** REDACTED ***
/dev/sde:
        Serial Number:      *** REDACTED ***
user@host:~$

SCREEN

So, sometimes a screen is killed or exited (often accidentally), and rather than opening up the local user's screenrc file, looking for the screen entry/command and then executing the screen command manually to restore it, with the help of grep I simply execute it directly with bash command substitution. Here are a couple of examples:

$(grep virsh ~/.screenrc)
$(grep /var/log/messages ~/.screenrc)
$(grep virt_snapshot ~/.screenrc)
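For context, this trick relies on matching ~/.screenrc lines themselves being valid shell commands once expanded; hypothetical entries that the greps above would match might look like this (the window commands and the virt_snapshot script are made-up examples):

```
# ~/.screenrc (illustrative entries) - each "screen" line is also a
# valid shell command, which is what makes the $(grep ...) trick work
screen -t virsh         1 sudo virsh list --all
screen -t messages      2 sudo tail -F /var/log/messages
screen -t virt_snapshot 3 /home/user/bin/virt_snapshot.sh
```

Running `$(grep virsh ~/.screenrc)` from inside an attached session re-executes that line, recreating the window.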

LVM

At some point, we might need to review LVM volumes to see where we can scale and resize etc. The following allowed me to quickly see everything at a glance in order to formulate a plan for resizing.

user@host:~$ sudo lvdisplay | egrep "LV (Name|Size)"

[sudo] password for user:
  LV Name                video
  LV Size                <4.02 TiB
  LV Name                audio
  LV Size                750.00 GiB
  LV Name                hdimg
  LV Size                <2.51 TiB
  LV Name                swap
  LV Size                16.00 GiB
  LV Name                var-tmp
  LV Size                8.00 GiB
user@host:~$

Cisco IOS

A collection of various Cisco IOS commands and the very limited IOS regular expression engine on an IOS device (or IOS-XE's IOSD).

show version

Show a consolidated view of uptime, firmware and software version & reason for reload (minus all the Cisco copyright and releng information):

SWITCH#show ver | incl Cisco IOS Software|(ROM|BOOTLDR)|uptime|System (returned|restarted|image)
Cisco IOS Software, C3750 Software (C3750-IPSERVICESK9-M), Version 15.0(2)SE11, RELEASE SOFTWARE (fc3)
ROM: Bootstrap program is C3750 boot loader
BOOTLDR: C3750 Boot Loader (C3750-HBOOT-M) Version 12.2(44)SE5, RELEASE SOFTWARE (fc1)
SWITCH uptime is 1 week, 3 days, 22 hours, 29 minutes
System returned to ROM by power-on
System restarted at 12:28:16 WST Sun Sep 17 2023
System image file is "flash:/c3750-ipservicesk9-mz.150-2.SE11.bin"
SWITCH#

show etherchannel

Show portchannel member state times - This is particularly useful in correlating events for possible cause without having to rely on syslog:

SWITCH#show etherchannel 1 detail | incl ^(Port: |Age of the port)
Port: Gi1/0/15
Age of the port in the current state: 10d:22h:41m:32s
Port: Gi1/0/16
Age of the port in the current state: 10d:22h:41m:31s
Port: Gi1/0/17
Age of the port in the current state: 10d:22h:41m:30s
Port: Gi1/0/18
Age of the port in the current state: 10d:22h:41m:30s
SWITCH#

2023-09-20

HomeLab Mk.3 - Planning Phase

Background

I kicked off my homelab refresh project not long ago, dubbed "HomeLab Mk.3" as it's the third iteration since circa 2010. I'm now well into the planning phase, but I've found that I'm also overlapping into the procurement phase (as described herein).

So far, I've decided to replace my pre-Ryzen AMD-based full-tower hyperconverged system with another hyperconverged system, but this time it will be housed in an 18RU rack, providing a small amount of noise management but also neatening up the office a little, which will have the added benefit of assisting in home improvement (flooring) later.

Key requirements;

  1. Costs must be kept as low as possible
  2. Software RAID (due to #1)
  3. Hyperconverged system (due to item #1 and power constraints)
  4. Nested virtualisation for EVE-NG/network modelling

Therefore, based on requirements, the system (excluding the rack) will comprise the following;

  • One SSD for the hypervisor stack/host OS
  • Up to six (6) 8TB CMR disks for the storage of guests etc.
  • 4RU ATX rackmount case (including rails of course) ✅
  • As much memory as the mainboard allows which relates to key requirement #4

Challenges

The current challenges surrounding the build are;

  1. Choice of Hypervisor (oVirt, libvirt, OpenStack, EVE-NG)
  2. Choice of CPU architecture (due to key requirement #4 and likely challenge #1)
  3. Possible Network re-architecture required to support the system including possible infrastructure Re-IP addressing.

Choice of Hypervisor

For item #1 the choices don't look that great, and I will probably stick with libvirt and the various virt toolsets, only because;

  • oVirt appears to no longer be supported downstream by RedHat, which means contributions to the upstream project (oVirt) will likely dry up and eventually kill the project
  • OpenStack is a pain to set up, even the all-in-one "packstack", which could also impact scalability in future if required
  • EVE-NG appears to be an inappropriate choice. While it supports KVM/QEMU/qcow2 images, I'm not sure I want this as the underlying HomeLab hypervisor (unless certain constraints can be overcome - these are considered not in scope of this post).

Choice of CPU architecture

For item #2, the CPU architecture is important only because network vendor (QEMU/qcow2) images highlight strict CPU architecture requirements as being Intel-based, and AFAIK nested virtualisation requires that the guest architecture matches that of the host.

Possible Network re-architecture

Item #3 is not insurmountable, but it is still a challenge nonetheless as I'm not sure about whether I will change the Hypervisor guest networks (dev, prod, lab etc) to connect back upstream at L2 or L3.

Procurement

As I mentioned already, the project planning phase somewhat overlaps with the procurement phase. The reason for this is so that I can not only procure certain less tech-depreciating items over time to allow project budget flexibility, but also allow a certain level of reduced risk in operating the system:

Case in point: HDDs - I never risk buying them from the same batch, in case of multiple catastrophic failures.

I've already purchased three HDDs, the 4RU rackmount case and rails, and an 18RU rack to house the new gear along with the existing kit (switch, router and UPS).

I'll continue to procure HDDs until I have enough to build the system; then all that's left is to purchase the key parts for the rackmount case/system (CPU, mainboard, memory & PSU) once CPU architecture/hypervisor testing (see Hardware selection below) and the design are complete.

Hardware selection (CPU architecture)

Whether the new system will be Intel or AMD will depend on the testing performed on my AMD Ryzen-based desktop. If EVE-NG and the required images work in nested virtualisation (and/or bare-metal) with said CPU architecture, then I will be in a good position to stick with AMD for this iteration (and likely future iterations) of the HomeLab. After all, AMD-based systems appear to have a good price point, which relates back to key requirement #1.

2020-07-25

Adventures in Open Source (Part 3)

The (somewhat) popular Adventures in Open Source series is back and even better than before.

SYSLINUX
A long time ago, I was able to force a Toshiba Satellite A10 to boot a Ghost Boot Wizard-created disc (ISO) thanks to syslinux.
This seemed kind of tricky to begin with but looking back on it, it is pretty trivial.

First of all, I mounted the ghost ISO file as a loopback device and copied the file contents to a temporary directory ($pathspec can be any empty temporary directory such as /tmp/ghost).

sudo mount -o loop /usr/temp/ghost.iso /mnt/loop0
cp -R /mnt/loop0/* $pathspec

Next up, I copied the necessary syslinux files to the temporary directory:
cp /usr/lib/syslinux/isolinux.bin $pathspec
cp /usr/lib/syslinux/memdisk $pathspec

I then had to create an isolinux configuration file called isolinux.cfg in the temporary ($pathspec) directory with the following contents (entered via cat and terminated with ^D):
cat > $pathspec/isolinux.cfg
default ghost
timeout 150
prompt 1
label ghost
kernel memdisk
append initrd=osboot.img
^D

Lastly, I moved up one directory and created the iso with mkisofs/genisoimage
cd ..
mkisofs -v -J -V $volid -N -A '' -sysid '' -o $filename -b isolinux.bin -c boot.cat \
-no-emul-boot -boot-load-size 4 -boot-info-table $pathspec

That's all!

NOTE: Due to the varying nature of Linux distributions, I have purposefully used variables (named in accordance with mkisofs/genisoimage documentation) so as to aid in making this procedure as dynamic as possible.


ntfsclone(8)
Since I still help people with Windows (only close friends and relatives now), I decided to give this tool another try (the last time I used it, I used the "special" image format, which cannot be loopback mounted).

Below is the output (proof) of a successful ntfsclone (and ntfs-3g loopback mount).
localhost ~ # ntfsclone -o /u1/S3A1378D001-ntfsclone.img /dev/sde1
ntfsclone v2.0.0 (libntfs 10:0:0)
NTFS volume version: 3.1
Cluster size       : 4096 bytes
Current volume size: 39999500288 bytes (40000 MB)
Current device size: 39999504384 bytes (40000 MB)
Scanning volume ...
100.00 percent completed
Accounting clusters ...
Space in use       : 13676 MB (34.2%)
Cloning NTFS ...
100.00 percent completed
Syncing ...

This is me loopback mounting a standard ntfsclone (not special image format) image:
ntfs-3g -o loop /u1/S3A1378D001a-ntfsclone2.img /mnt/loop0
mount | grep fuse
/u1/S3A1378D001a-ntfsclone2.img on /mnt/loop0 type fuse (rw,noatime,allow_other)


Adventures in docker and portainer

Around 2007 I was gifted some old hardware, which entailed an ASUS motherboard, 8GB of RAM and an AMD CPU.

It wasn't until a few years later that I decided to build it into a home server.

There was no hardware virtualisation, and I either didn't know how to or didn't want to do software virtualisation and software RAID; instead, all my services - DHCP, DNS, Samba, FTP etc., and I think even Plex as well (or maybe that came later) - ran co-resident on a bare-metal JBOD server.

Since it was a simple design, it was relatively simple to operate and maintain. I even managed to successfully P2V the server when I did a hardware refresh, and it continued operating for the most part.

One day I upgraded the system and a Python-based application catastrophically broke; I abandoned picking it back up until recently, when I discovered Docker containers.

Fast-forward to today: now I have a big proportion of my services and apps hosted in a dedicated docker-engine VM, and maintenance has never been easier.

I've even learned how to share the same network namespace between containers, such as stacking a container with a VPN container, and all it took was the following in the compose file under the container's service definition:


network_mode: "container:<container>"

I learned and adapted the above from the following YouTube video:

How to route any docker container through a VPN container
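Putting that line in context, a minimal compose sketch of the stacking pattern might look like this (service and image names are entirely illustrative; only the VPN container publishes ports):

```yaml
services:
  vpn:
    image: example/vpn-client:latest        # illustrative image name
    container_name: vpn-container
    cap_add:
      - NET_ADMIN
    ports:
      - "8080:8080"   # the stacked app's port is published here, on the VPN container
  app:
    image: example/app:latest               # illustrative image name
    network_mode: "container:vpn-container" # join the VPN container's network namespace
```

With this, all of the app container's traffic enters and leaves via the VPN container's network stack.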

Further to this, my Docker engine VM is now exclusively managed using portainer.io, where I can create and delete (and in the process upgrade) containers with ease, which means everything stays fresh and all I have to be concerned about is backing up the persistent storage!

Armed with knowledge of how docker works, I've written up a slide deck on docker to help demystify containers and hopefully improve overall understanding for the potentially emerging DevOps capability.

2019-05-15

Adventures in Automation: Part 1

So, ever since I heard about automation (and then orchestration), I've wanted to dig in, and I have finally taken some measures to not only learn it but implement some of my own.

I've realised that I have a reasonable amount of unwanted technical debt building up while maintaining my 'homelab' in its current state, and some of the daily hassles can be automated away.

Two such tasks I have just undertaken and completed are:

  1. Ansible automation to update, upgrade and clean orphaned/unused packages
  2. Script to cleanup snapshots taken prior to ansible playbooks being run


I have a very long way to go before I can get other things automated, but it's a start.

I'm also in the process of defining a docker environment where most of my services will operate from within one or more virtual machines.

I will also have to look into a GitHub repository for my code, and a complete cloud backup solution with onsite encryption (encrypted using my own privately owned and stored keys).

2019-05-04

Free Range Routing


Since I discovered Docker, I have been busy designing my homelab to be as cloud native as possible, but in doing so I realised that the default docker network (aka bridge), and the other bridge-type networks defined by containers via docker-compose, aren't known by the upstream collapsed-core/access-layer network.

In the past, I had been adding static routes upstream and a default route on the docker host, but this was not ideal (read: not scalable) given the dynamic nature of docker networks created with docker-compose.

I quickly realised that, since I've developed significant experience with BGP (in service provider environments), I could just peer the docker host with the upstream access layer, but until now I didn't know how to do this on Linux.

I have always known about Quagga, but was a bit concerned about the learning curve required to get it working; then I remembered its fork, Free Range Routing (FRR). So I decided to take a leap of faith, and I have no regrets whatsoever.

I was surprised at how easy it was to install, configure and get working, which comprised the following;

On the Docker host;

  1. Setup the REPO
  2. Installed FRR and configred services for vtysh as per the official documentation
  3. Connected to the vtysh interface and
  4. Configured BGP and redistributed only conneted routes using route-map/prefix-list
On the upstream switch/access layer;
  1. Configured peering to FRR and squelched all but a summary route prefix-list/route-map.
Not only did I define the policies perfectly (woot!), but the neighbor came up without any issues at all!!!
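The two sides can be sketched roughly as follows (a minimal sketch only: the AS numbers, peer addresses and the Docker prefix range are placeholders, not my actual policy):

```
! On the Docker host, inside vtysh (FRR side); values are illustrative only
router bgp 65001
 neighbor 192.0.2.1 remote-as 65000
 !
 address-family ipv4 unicast
  redistribute connected route-map DOCKER-OUT
 exit-address-family
!
ip prefix-list DOCKER-NETS seq 5 permit 172.16.0.0/12 le 24
!
route-map DOCKER-OUT permit 10
 match ip address prefix-list DOCKER-NETS
!
! On the upstream IOS switch, accepting only the wanted prefixes inbound
router bgp 65000
 neighbor 192.0.2.2 remote-as 65001
 neighbor 192.0.2.2 prefix-list FROM-DOCKER in
!
ip prefix-list FROM-DOCKER seq 5 permit 172.16.0.0/12 le 24
```

The route-map on the FRR side keeps the host from leaking its management or other connected subnets into BGP; only the prefixes explicitly permitted in the prefix-list are redistributed.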

My switch can now route traffic to any docker/docker-compose containers/services that are created, with far less effort spent defining routes to the containers, or deleting them when they are no longer in use.

I am very pleasantly surprised to learn that not only is it extremely easy to get FRR up and running, it is also very easy to configure its routing daemons, especially if you have a grounding in Cisco, as vtysh very closely matches the Cisco CLI. There are some slight and obvious differences to Cisco's, not to mention the fact that you need to connect to the vtysh interface, kind of like the root user does on a Juniper platform (cli).

Next up I need to figure out how to leverage Linux VRF namespaces using FRR vtysh, then I can migrate my infrastructure to MPLS!


2015-09-03

BIND (named) server remediation [part 2]

Following up from my previous post (BIND (named) server remediation), I spent a good couple of hours further developing and testing the configuration, but failed to get a bind9 reverse lookup zone to load, only to find out that I had a slight typo in the reverse lookup zone definition.

named-checkzone was returning OK, but named itself was failing to load the zone file with the error:

zone X.X.X.in.addr.arpa/IN: has 0 SOA records
zone X.X.X.in.addr.arpa/IN: has no NS records
zone X.X.X.in.addr.arpa/IN: not loaded due to errors.

It wasn't until I had a friend take a closer look that the problem became clear:

I defined the zone as .in.addr.arpa instead of .in-addr.arpa in the named.conf include file which references the zone file.
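For reference, a correctly named reverse zone stanza looks like this (the subnet and file path here are hypothetical stand-ins, not my actual zone):

```
// note "in-addr.arpa" (hyphen), not "in.addr.arpa"
zone "69.168.192.in-addr.arpa" {
    type master;
    file "/etc/bind/db.192.168.69";
};
```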

Some things I have learned:

  • Check the logs (in my case, on a default debian/bind9 install this was /var/log/syslog) when things don't work.
  • Always check your config with the bind DNS tools before reloading
  • Always check your zone files with the bind DNS tools before reloading
  • Keep zone files neat and group together similar resource record types.

Now that I have the dev domain DNS working, I just need to look at setting up DHCP and testing dynamic DNS.

I also considered moving different resource records for each zone into a separate file, but this is not necessary, due to the (current) size of the network.

Once this is all done, tested and implemented in 'production', I will also consider keeping a similar configuration in dev, either as a slave for all zones from the primary DNS, or just as it is, purely for testing.

2015-08-26

BIND (named) server remediation

Since I virtualised my old failing physical server into a VM, I have found it less and less easy to administer and maintain (read: configuration files).

So, I am looking at spinning up new Debian servers for more specific tasks: network services, games servers, file services etc.

The first, and most important, thing I need to migrate is DNS. That way I can have it simply running in parallel with the old server, ready to, essentially, stop the old service (after making sure DHCP serves out this DNS IP address as well, of course!).

Now, here comes the "clever" part or the goals of this approach (or so I thought):

  1. Install named.
  2. Configure it to be a slave for the existing zones
  3. re-configure it to be a master (complete with zone files)
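Step 2 above, sketched as a named.conf stanza (the zone name, master IP and file path are placeholders):

```
// slave the zone from the old server; named writes the transferred
// zone out to the file below
zone "home.example" {
    type slave;
    masters { 192.168.1.10; };
    file "/var/cache/bind/db.home.example";
};
```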

Pretty simple, right? Not so much. Well, thanks be to the 'Debian' way of doing things, it was very quick and easy to have the zones slaved, but when I went to look at the files I was expecting to be populated, they were still empty, since I had created empty zone files to begin with.

Some poking around later, I discovered that it was transferring the zones fine, but there was an issue with permissions for the zone files, or more specifically, the directory where they lived. A quick chmod -R 0777 /zone/file/directory later and a restart of the service, voila! Except... something was not right...

The zone files seemed to be in a binary format as file would have me believe they were of type: data

I could have converted them back to plain text using the BIND tool named-compilezone(8), but I couldn't commit the time to learning the correct syntax for one small job. Besides, I learned that this crazy default exists for a performance increase, however minuscule that would be given such a small DNS server implementation (for now).

So as per the article "Bind 9.9 – Binary DNS Slave file format" (linked above) or more authoratively as per the Chapter 6. BIND 9 Configuration Reference section of the BIND 9.9 Administrator Reference Manual (ARM) which states (incorrectly):

masterfile-format
Specifies the file format of zone files (see the section called “Additional File Formats”). The default value is text, which is the standard textual representation, except for slave zones, in which the default value is raw. Files in other formats than text are typically expected to be generated by the named-compilezone tool, or dumped by named.

So, knowing this I edited /etc/bind/named.conf.options to include the following:

masterfile-format text;

Perfect. (Just like me ;-) I now have a duplicate of the zones served on the master server, which can, and soon will be, decommissioned. Not to mention the new server's zones are getting a makeover, with many, many more zones as well as a dynamic-update zone; more to come on this soon.

2015-08-05

Great Success

Finally, after weeks, no, months of agonising failure through trial and error, I managed to get the outcome I desired with my Raspberry Pi 2!



History



A few years back I acquired a Cisco 3560 and quickly realised the potential of vlans and separate subnets for the purposes of testing among other valid reasons, and came to find that the nodes on most of the vlans could not communicate with the outside world (read: internet). It was then that I realised that something was wrong...

Long story short: the Netgear DGND4000 that I own does not route/NAT anything other than its resident subnet and I sure as heck was not going to implement double NAT!




Thanks be to libvirt's NAT networking, which gave me an interim workaround and helped confirm this.


Getting the necessary bits


NAT issues aside, I began by purchasing a second-hand Netgear DM111P v2 from some random guy on Gumtree. The ADSL modem in itself wasn't enough, because it too seemed to suffer from the same issue as the DGND4000, although admittedly I didn't put much effort into testing that theory, as I wanted a solution, not more testing.

I then purchased a Raspberry Pi 2 along with a bunch of accessories. In the meantime (while I waited out the excessively long shipping time), I did some research on the distributions capable of running on the bcm2709-based board and decided on OpenWRT. Yes, I know that I could have used Raspbian, but OpenWRT seemed the most logical choice given that it is essentially an internet router anyway, just without the wireless and the ADSL modem.

Turns out I made the right choice, despite the fact that OpenWRT is still in trunk (RC3 at the time of writing).

Lastly (after destroying the extremely cheap Rpi2 case), I managed to get an image booted (it helps when you use the bcm2709 image, not the bcm2708 Barrier Breaker version; that's for the Raspberry Pi Model B!).





Configuration


First of all, this would have gone a lot smoother had I just tested with the USB network adapter I bought along with the Pi, but it didn't arrive in time due to partial shipping.

I configured the switch with a trunk port with two vlans, one for the LAN side of things (internal link) and another for the WAN or pppoe (public/external/internets) and set the mode appropriately.

NOTE: VLANS and IP addresses have been altered so as to protect the actual configuration used in my network infrastructure. Call me paranoid.


Cisco 3560 partial configuration


!
vlan 20
vlan 69
!
interface Vlan69
 description DMZ/LAN
 ip address 192.168.69.1 255.255.255.248
 no shutdown
!
! no interface defined for WAN because we do not want any L3 traffic
!

interface GigabitEthernet0/2
 description Trunk port for Rpi2 VLAN's: 20, 69
 switchport access vlan 69
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 20,69
 switchport mode trunk
 no shutdown
!
interface GigabitEthernet0/1
 description Link to DM111Pv2 modem (bridged) for PPPoE/L2 traffic
 switchport access vlan 20
 no shutdown
exit
!
ip route 0.0.0.0 0.0.0.0 192.168.69.66
!
end


OpenWRT network configuration


root@OpenWRT# vi /etc/config/network

config interface 'loopback'
        option ifname 'lo'
        option proto 'static'
        option ipaddr '127.0.0.1'
        option netmask '255.0.0.0'

config interface 'lan'
        option proto 'static'
        option delegate '0'
        option _orig_ifname 'eth0'
        option _orig_bridge 'false'
        option ifname 'eth0.69'
        option ipaddr '192.168.69.66'
        option netmask '255.255.255.248'

config route
        option interface 'lan'
        option target '192.168.0.0'
        option netmask '255.255.0.0'
        option gateway '192.168.69.1'

config interface 'WAN'
        option proto 'pppoe'
        option ifname 'eth0.20'
        option delegate '0'
        option username 'myusername'
        option password 'mY$eCr3tP4sSw0rD'


Caveats/Addenda/Extra information


By now you may be wondering: "Why are there no IP addresses or switch virtual interface for VLAN 20?" There is no need for them! That, and the fact that one might only want traffic to go via one VLAN and then the other (remember, this is essentially a router-on-a-stick implementation, and we want to separate the VLANs: L3 traffic on one and L2 on the other, per requirements).

If you were thinking: "The netmask and destination network IP for the LAN route is wrong!", you would be incorrect. This is a perfectly legitimate summary route. It allows for much easier (read: slack) administration so one does not have to manage multiple static routes for subnets added or removed from the network (short of running a routing protocol) and it has the added benefit of consuming less memory and is a much more flexible approach for this design. Neat huh? I thought so too :-)


Conclusion


Let it be said that although this configuration is very simple, there were many hurdles, accompanied by many choice words, along the way. The single most important thing that I kept getting wrong was routing. I had to remember to change the 'gateway of last resort' (Cisco's way of saying default route) on the switch so that all the subnets would route to the internet, and the static (summarised) route for traffic to get back into the network from whence it came. That, and trying to test this when the internet is depended upon so much by the two people in this household, was frustrating, as my change windows were often short and had to be rolled back constantly.

Lastly, I must say that the "out-of-the-box" pppoe/nat/routing on OpenWRT worked like a charm with minimal configuration; however, I will need to develop the scenario a little further so I can secure the connection by way of its firewall (read: iptables), but that itself is a beast I have yet to conquer.


2015-08-04

Raspberry Pi Internet Connectivity Lab

I have successfully built a lab for testing internet connectivity to the Raspberry Pi 2, by using my phone in a USB tethering configuration.

I followed the majority of the configuration listed in the OpenWRT wiki, except I used the LuCI web interface instead of the final manual step of using uci to set usb0 as the WAN connection.
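For the record, the /etc/config/network stanza that this ends up producing is roughly the following (a sketch; the interface name and DHCP protocol reflect the wiki's USB-tethering setup, not configuration I have captured verbatim):

```
config interface 'wan'
        option ifname 'usb0'
        option proto 'dhcp'
```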

This will now allow me to test various scenarios including multiple default routes with different metrics as well as testing firewall configurations using OpenWRT running on the Rpi2.

The one gotcha is that I forgot to set the route back to the internal network, which was previously misconfigured.

I am getting one step closer to having much more control of my internet, as well as being able to NAT/route all of my subnets!

2015-07-25

failure to focus

I have confirmed that I can get the Raspberry Pi to connect to the ISP using PPPoE through a VLAN; however, I cannot (or rather, my brain cannot) get OpenWrt to accept traffic other than ICMP to/from the device itself (I probably need to understand iptables, or I am overlooking something very simple).
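In hindsight, the missing piece is likely OpenWrt's firewall configuration rather than raw iptables: something along the lines of the stock zone/forwarding stanzas in /etc/config/firewall (a sketch only; the zone and network names are assumptions based on my interface naming):

```
config zone
        option name 'wan'
        option network 'WAN'
        option input 'REJECT'
        option output 'ACCEPT'
        option forward 'REJECT'
        option masq '1'
        option mtu_fix '1'

config forwarding
        option src 'lan'
        option dest 'wan'
```

The masq option enables NAT on the WAN zone, and the forwarding stanza is what actually permits LAN traffic out to the internet rather than just to the device itself.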

I'm finding it extremely hard to focus and get the networking part of this lab working right now, when I don't actually have a lab to do it on, and when others in the household (myself included) rely on the internet so much, especially when I need to refer to something while trying to troubleshoot and find a solution to this 'router-on-a-stick' model of networking to overcome the shortfall of the existing router.

I've also lost my 4GB micro SD card, which I was planning to use for building a Bluetooth (A2DP) audio receiver from the Raspberry Pi. This makes me a little less than happy, considering they are not as easy to come by in that size, and I will have to spend another $10 (effectively $20 now) in order to get one.

For now, I'm going to go watch something and try again later (including looking for the SD card).

2011-03-10

migrating to libata

Since IDE/MFM/RLL is now deprecated, I thought I'd share my experiences of migrating to the newer libata (SATA/PATA) drivers in the 2.6 kernel.

Since I only have 2 devices on IDE ports (a WD 320GB HDD and a CD-RW), there was very little for me to do, as I had just about everything spread across both the old ATA and the new libata stacks already. So I removed all instances of the old ATA drivers, set built-in ATA driver support under libata (since the system boots from IDE, for now) and enabled what I needed as modules for my SATA JBODs.

The whole thing almost went perfectly as planned (and as documented), except for the following minor irritations:

  1. Forgot to change the real_root option in grub.conf from /dev/hda3 to /dev/sda3 :-P
  2. udev was naming my cdrom to cdrw1/cdrom1

Admittedly, it took me a while to figure out that I had forgotten to change the bootloader for the change in device names, but I quickly worked out how to change the cdrom device name back to the default by editing "/etc/udev/rules.d/70-persistent-cd.rules".
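For anyone hitting the same thing, the bootloader fix was a one-line change in grub.conf (grub legacy; the kernel line below is illustrative of my setup, not copied verbatim):

```
# before the libata switch:
kernel /boot/kernel root=/dev/ram0 real_root=/dev/hda3
# after: libata presents the IDE disk as a SCSI-style sd device
kernel /boot/kernel root=/dev/ram0 real_root=/dev/sda3
```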

2009-08-21

vim + gnupg = password manager

After finding that there are very few native password managers for Linux, I decided to see if I could find a way to open my encrypted password file using a console-based editor without putting any plain text onto the disk at all (i.e. transparent editing of gnupg-encrypted files).

I stumbled onto the vim website (by way of a Google search) and found a nice little script (plugin) that does all this for me!

Initially, I had some issues with getting it working but that was mainly due to exporting $GPG_TTY incorrectly :-P

However, as I use screen to manage everything I do from the one terminal window/ssh session (vim included), the plugin works fine but fails to decrypt files when vim is invoked in a new screen window.

I suspect that it's attributed to the $GPG_TTY variable, but my knowledge of screen and some other aspects of Linux are limited.

I now use vim + gnupg for my encrypted password file.


UPDATE 21/08/2009 @ 13:15
There seems to be an issue where the GPG_TTY variable needs to be re-exported every time you change to another screen/pts. I have made myself a workaround, whereby I run a simple script that first exports the variable and then opens vim with the encrypted password file, but then vim removes the standard up/down/left/right keyboard controls and falls back to classic vi mode. *sigh*
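The workaround amounts to something like this, here sketched as a shell function rather than a standalone script (the function name and password-file path are my own choices; adjust to taste):

```shell
# refresh GPG_TTY for the current pty so gpg's pinentry attaches to the
# right terminal, then open the encrypted file with vim
pwvim() {
    GPG_TTY=$(tty)
    export GPG_TTY
    vim "$HOME/.passwords.gpg"
}
```

Defining it as a function in the shell's rc file (rather than invoking a separate script) should also sidestep the vi-compatibility fallback, since vim is still launched from the interactive shell.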

 