Code Faff

2025-04-26

Detecting your home network with SSH

Suppose you have a server in your private home network. It's called myserver.home, and you SSH into it as user me. When you're at home, all you need to get in is:

ssh me@myserver.home

But now you want to SSH from outside. You've set up a dynamic DNS service under the name mydyndns.example.org, and configured forwarding in your router on external port 9000 to 22 on myserver.home. Your command becomes:

ssh me@mydyndns.example.org -p 9000

You can set up aliases for these commands in ~/.ssh/config:

Host srv-ext srv-int
User me

Host srv-int
Hostname myserver.home

Host srv-ext
Port 9000
Hostname mydyndns.example.org

Cool, except you have to think about which one to use, depending on where you are. Well, it might not be that important, as the srv-ext alias will likely also work at home. Nevertheless, it would be cooler to use just one alias srv, and have it automatically detect whether a local connection is possible. It might also be noticeably faster in some cases.

Solution

SSH configuration provides a Match directive that can detect several conditions, including the result of a generic command. You can use that to override the general case of external access with optimizations for internal access.

Detecting the local network

First, you need a way to detect your local network. Perhaps the most robust way is to detect your router's MAC address xx:xx:xx:xx:xx:xx, which you can get with:

$ arp -a _gateway
_gateway (192.168.1.254) at xx:xx:xx:xx:xx:xx [ether] on wlp4s0

You could wrap up testing for it in a script:

#!/bin/bash
# -*- sh-basic-offset: 2; indent-tabs-mode: nil -*-

declare -A mac_arg=()

while [ $# -gt 0 ] ; do
  arg="$1" ; shift
  case "$arg" in
    (-h|--host)
      host_arg="$1" ; shift
      ;;

    (--host=*)
      host_arg="${arg#--host=}"
      ;;

    (-m|--mac)
      arg="$1" ; shift
      arg="${arg,,}"
      mac_arg["$arg"]=yes
      ;;

    (--mac=*)
      arg="${arg#--mac=}"
      arg="${arg,,}"
      mac_arg["$arg"]=yes
      ;;

    (-*|+*)
      printf >&2 '%s: unknown switch: %s\n' "$0" "$arg"
      exit 1
      ;;

    (*)
      printf >&2 '%s: unknown arg: %s\n' "$0" "$arg"
      exit 1
      ;;
  esac
done

if [ -z "$host_arg" -o "${#mac_arg[@]}" -eq 0 ] ; then
  printf >&2 'usage: %s -h host -m mac\n' "$0"
  exit 1
fi

read canon ipbr at got_mac rest < <(arp -a "$host_arg" 2> /dev/null)
test -n "$got_mac" && test -n "${mac_arg["$got_mac"]}"

(You can surely get away with something a lot simpler.) Drop the script in (say) /usr/local/bin/host-is-mac and make it executable (chmod 755 /usr/local/bin/host-is-mac). Now you can see whether you're in your own network:

$ host-is-mac -h _gateway -m xx:xx:xx:xx:xx:xx && echo internal
internal
$

If you test a hostname which doesn't resolve, the command quietly fails:

$ host-is-mac -h made-up -m xx:xx:xx:xx:xx:xx && echo internal
$
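
As the aside above suggests, the test can be much simpler. Here is a minimal Python sketch of the same check, assuming arp -a prints lines in the layout shown earlier (name, bracketed IP, the word at, then the MAC). The function names mac_from_arp_line and host_is_mac are made up for illustration, and this is not a drop-in replacement for the Bash script:

```python
import subprocess


def mac_from_arp_line(line):
    """Return the lower-cased MAC from one line of `arp -a` output, or None.

    Assumes the layout shown earlier, e.g.
    _gateway (192.168.1.254) at xx:xx:xx:xx:xx:xx [ether] on wlp4s0
    """
    fields = line.split()
    if len(fields) >= 4 and fields[2] == "at":
        return fields[3].lower()
    return None


def host_is_mac(host, macs):
    """True if `arp -a HOST` reports one of the given MAC addresses."""
    try:
        result = subprocess.run(
            ["arp", "-a", host],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            check=False,
        )
    except OSError:
        # arp could not be run, so we cannot claim to be at home.
        return False
    wanted = {m.lower() for m in macs}
    for line in result.stdout.splitlines():
        # A non-matching line yields None, which is never in the set.
        if mac_from_arp_line(line) in wanted:
            return True
    return False
```

To use it from SSH's Match exec, a wrapper script would exit with status 0 when host_is_mac returns True and with a non-zero status otherwise.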

A special case for your home network

Now you can configure SSH to test whether the local network is your home network, to decide whether to connect using internal or external parameters. Conflicting options are resolved by choosing the first instance, so you should define the parameters for external access as the general case, and then precede them with the specific case of being in the same network:

## specific (internal) case
Match originalhost="srv" exec "host-is-mac -h _gateway -m xx:xx:xx:xx:xx:xx"
Port 22
Hostname myserver.home

## general (external) case
Host srv
User me
HostName mydyndns.example.org
Port 9000

The Match directive enables its following directives only if srv is given as the SSH destination, and then only if the host-is-mac command succeeds. They are treated as a logical AND with short-circuiting, so the external command is only invoked when necessary.

Test the configuration by using ssh -v srv echo yes, and look for lines with Connecting to in them. If you're connecting locally, you'll see:

debug1: Connecting to myserver.home [192.168.1.100] port 22.

If you're connecting from outside:

debug1: Connecting to mydyndns.example.org [10.10.10.10] port 9000.

Proxying

Suppose you use myserver.home as a proxy to another server otherserver.home, and you want the alias alt-srv to conditionally go direct when local. Its specialization must disable proxying:

Match originalhost="alt-srv" exec "host-is-mac -h _gateway -m xx:xx:xx:xx:xx:xx"
ProxyJump none

Host alt-srv
User me
Hostname otherserver.home
ProxyJump srv

Handling multiple aliases

If you have several aliases for a single server, you can list them in the Host clause, separating them with spaces. However, to list them in the originalhost condition, separate them with commas:

Match originalhost="srv1,srv2" exec "host-is-mac -h _gateway -m xx:xx:xx:xx:xx:xx"
Port 22
Hostname myserver.home

Host srv1 srv2
User me
HostName mydyndns.example.org
Port 9000

You could, of course, just use Match for both clauses.

Things that didn't work

Trying to make transclusion of configuration files conditional doesn't work:

Match originalhost="srv,alt-srv" exec "host-is-mac -h _gateway -m xx:xx:xx:xx:xx:xx"
Include site1-specializations.conf

The Include directive applies unconditionally, and the Match directive applies only to the initial directives of the transcluded file, up until the next Match or Host.

2024-05-27

Reading pressure from a QMP6988

I got hold of an envIII sensor for the indoor humidity, temperature and pressure readings, and bunged it on an RPi via a Grove HAT. This device incorporates an SHT30 for humidity and temperature, and a QMP6988 for pressure (but it also measures temperature for performing some compensation on the pressure). I had no trouble interpreting the SHT30's datasheet, and got the readings out with two I²C calls. The procedure for the QMP6988 is a bit more involved, and its datasheet required some guesswork, so I'm documenting the steps I took in case someone else is having trouble.

Reading the raw coefficients

To perform compensation, you need to read 12 raw integer coefficients, then scale and translate them as real numbers, before combining them with the raw pressure/temperature readings. The raw coefficients are expressed as 25 1-byte constant read-only registers within the device, so you only need to fetch them once, even if you're going to take multiple readings. I used the I2C_RDWR ioctl to write the register being requested, read the value, cancel the request, and confirm the cancellation, in sequence for each register. Each call (re-)used a single buffer:

uint8_t buf;
struct i2c_msg msg = {
  .addr = addr,
  .len = 1,
  .buf = &buf,
};
struct i2c_rdwr_ioctl_data pyld = {
  .msgs = &msg,
  .nmsgs = 1,
};

With fd open on the I²C device, I could request register reg_idx like this:

buf = reg_idx;
msg.flags = 0; // write
if (ioctl(fd, I2C_RDWR, &pyld) < 0)
  throw std::system_error(errno, std::generic_category());

To read, set msg.flags = I2C_M_RD, and call ioctl again. I kept reading as long as ioctl returned negative with errno == EIO.

My understanding of the datasheet is that one should then request register 0xff (as if to cancel the prior request), and keep reading until one gets 0. In fact, my code stopped if it got EIO or a zero, though I don't think I've seen the latter:

buf = 0;
msg.flags = I2C_M_RD;
do {
  if (ioctl(fd, I2C_RDWR, &pyld) < 0) {
    if (errno == EIO) break;
    throw std::system_error(errno, std::generic_category());
  }
  if (buf != 0x00) continue;
  break;
} while (true);

Coefficients' signedness

Ten of the coefficients are 16-bit integers, and the other two are 20-bit. I couldn't find anything in the datasheet about their signedness, but I only get reasonable readings if they are treated as signed. I used a wider unsigned type to compose the value from bytes, reinterpreted it as the corresponding signed type, then subtracted if the ‘top’ bit was set:

uint_fast32_t val = low_byte;
val |= high_byte << 8;
int_fast32_t ival = val;
if (val & 0x8000)
  ival -= 0x10000;
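
The same sign handling generalizes to both widths. This is a small Python sketch of it; the helper names to_signed and coeff16 are mine, not from the datasheet, and which register holds the high byte is something to check against the datasheet:

```python
def to_signed(value, bits):
    """Interpret an unsigned `bits`-wide integer as two's complement."""
    value &= (1 << bits) - 1          # keep only the low `bits` bits
    if value & (1 << (bits - 1)):     # 'top' bit set, so it's negative
        value -= 1 << bits
    return value


def coeff16(high_byte, low_byte):
    """Compose a 16-bit coefficient from two register bytes, then sign it."""
    return to_signed((high_byte << 8) | low_byte, 16)


print(to_signed(0x8000, 16))    # most negative 16-bit value
print(to_signed(0xFFFFF, 20))   # all ones in 20 bits
```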

Scaling and translating the coefficients

Each of the 16-bit integers must be divided by an integer constant, then multiplied by a real constant, and then offset by another real. In the datasheet, these real constants are provided under Conversion factor in a table, and a general equation shows how to use them. However, the information for the 20-bit coefficients looks potentially contradictory. In the corresponding table, the Conversion factor column says Offset value (20Q16), while the equation simply says to divide by 16 (so no offset?). I haven't found any definition of this notation, but I think it implies that the original value is 20 bits, with the unit being 1/16. In other words, all you have to do is divide the signed integer by 16, as the equation states.
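
As a sketch of the two cases in Python: the parameters divisor, factor and offset stand in for the datasheet's per-coefficient constants (no real values are given here), and scale20 reflects my reading of 20Q16 rather than anything the datasheet spells out:

```python
def scale16(raw, divisor, factor, offset):
    """Scale and translate a signed 16-bit coefficient.

    divisor, factor and offset are placeholders for the constants
    listed in the datasheet for each coefficient.
    """
    return raw / divisor * factor + offset


def scale20(raw):
    """A signed 20-bit coefficient, taken to be in units of 1/16."""
    return raw / 16


# Made-up numbers, only to show the order of operations.
print(scale16(-4, 2, 0.5, 1.0))
print(scale20(-16))
```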

Taking the raw readings

I used a one-off write to one of the registers to initialize the device (a 2-byte <register, value> message), and then sent another 2-byte message to force each reading. After waiting a moment, I read each of the 6 bytes separately, in the same way as reading the coefficients (request, read, cancel, confirm).

The datasheet states that each 24-bit reading should have 2²³ subtracted from it, but at 24bits[sic] output mode. I thought maybe this meant that the result should be masked with 0xffffff, but that would create a considerable discontinuity, and indeed it does not yield correct results. Simply treat the raw 24-bit value as unsigned, convert it to a signed value (with no sign extension), and do the subtraction.
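
In Python, where integers don't overflow, that amounts to the following sketch. The function name raw_reading and the argument order (most significant byte first) are my assumptions; check the register order in the datasheet:

```python
def raw_reading(msb, mid, lsb):
    """Turn three register bytes into a signed raw reading.

    The 24-bit value is treated as unsigned (0 .. 2**24 - 1), with no
    masking or sign extension, and then 2**23 is subtracted.
    """
    value = (msb << 16) | (mid << 8) | lsb
    return value - (1 << 23)


print(raw_reading(0x80, 0x00, 0x00))   # mid-scale
print(raw_reading(0x00, 0x00, 0x00))   # lowest possible
print(raw_reading(0xFF, 0xFF, 0xFF))   # highest possible
```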

Units

After applying compensation, the pressure is expressed in Pa, which is stated in the datasheet. Divide by 100 to get hPa or mbar.

The datasheet mentions 256 degreeC as the unit for the compensated temperature. I got meaningful readings by dividing by 256, so I guess it means that the unit is one 256th of a degree C. When you use the compensated temperature to compensate the pressure, just use the value as is; don't divide.
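
A tiny Python sketch of those final conversions, with made-up input values. The function names are mine, and the temperature scaling is my guess at what the datasheet means, as described above:

```python
def pressure_hpa(compensated_pressure):
    """Compensated pressure is in Pa; 100 Pa = 1 hPa = 1 mbar."""
    return compensated_pressure / 100


def temperature_c(compensated_temperature):
    """Compensated temperature, assumed to be in 1/256ths of a degree C.

    Use this only for display; feed the undivided value into the
    pressure compensation.
    """
    return compensated_temperature / 256


print(pressure_hpa(101325))   # example input, not a real reading
print(temperature_c(6400))    # example input, not a real reading
```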

WS3085 wind speed codes

I've been examining the raw signals from several Aercus Instruments weather stations, mainly the WS3085 and similar. Two bytes of the long (80-bit) messages appear to carry wind speed, one for the average, and one for gust.

By recording the signals and simultaneously observing the console, I could get a mapping between the signal and reported wind speed. Here are some plain speeds:

byte 1 (wind speed, bits 32-39)	console speed (km/hr)
00000000	0.0
00000001	1.1	(corrected signal after possible misreading)
00000010	2.5
00000011	3.6
00000100	5.0
00000101	6.1
00000110	7.2
00000111	8.6

Here are some gust speeds (on a windier day):

byte 2 (gust speed; bits 40-47)	console gust speed (km/hr)
00000110	7.2
00001000	9.7
00001001	11.2
00001110	17.3
00001111	18.4
00010001	20.9
00010010	22.0
00011101	35.6
00100000	39.2

Where they overlap, gust speeds and plain wind speeds appear to use the same representation, and larger numbers correspond to greater speeds, so I'm going to assume that they indeed use the same representation. However, the recordings above show no consistent ratio of speed to code, although it's always (so far) between 1.1 and 1.25. The mean is ~1.218, which works closely for codes 5 and 8, but over-reports for 1, 3 and 6, and under-reports for 2, 4, 7, 9, 14, 15, 17, 18, 29 and 32. Perhaps using different units would have yielded a more consistent ratio, e.g., the code is first multiplied and rounded to get the speed in another unit, then multiplied again and rounded again to get the speed in km/hr. Other units are m/s (÷3.6), mi/hr (÷1.609) and knots (÷1.852), and none of these are going to yield a nicer ratio.
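
For anyone wanting to check those figures, this Python snippet computes the ratios from the two tables above. Code 0 is left out because its ratio is undefined:

```python
# (code, console km/hr) pairs copied from the two tables above
OBSERVED = [
    (1, 1.1), (2, 2.5), (3, 3.6), (4, 5.0), (5, 6.1), (6, 7.2),
    (7, 8.6), (8, 9.7), (9, 11.2), (14, 17.3), (15, 18.4),
    (17, 20.9), (18, 22.0), (29, 35.6), (32, 39.2),
]

# km/hr per code unit, for each observed code
ratios = {code: speed / code for code, speed in OBSERVED}

lowest = min(ratios.values())
highest = max(ratios.values())
mean = sum(ratios.values()) / len(ratios)

for code, ratio in ratios.items():
    print('%2d: %.4f' % (code, ratio))
print('min %.3f  max %.3f  mean %.3f' % (lowest, highest, mean))
```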

To get a more intuitive understanding, here's a plot of speeds against raw values, but with a couple of anticipated scales subtracted:

Those drops are all by the same amount. The increments aren't, but some are similar. What's going on?

Here's the Gnuplot script:

set title 'Wind ratio'
set datafile sep ','
set xlabel 'signal'
set ylabel 'speed (km/hr)'
set term pdf monochrome linewidth 0.1
set output 'windratio.pdf'
set key left bottom
set grid xtics
set xtics 1
show grid
plot 'windratio.csv' using 1:($2-$1*1.25) with linespoints title 'observed - 1.25x', \
  'windratio.csv' using 1:($2-$1*1.225) with linespoints title 'observed - 1.225x'

And here's windratio.csv:

0,0
1,1.1
2,2.5
3,3.6
4,5.0
5,6.1
6,7.2
7,8.6
8,9.7
9,11.2
14,17.3
15,18.4
17,20.9
18,22
29,35.6
32,39.2

Looks like you can reproduce that table with something like this:

def conv(i):
    return i * 1.1 + \
        ((i + 3) // 5 + (i + 1) // 5) * 0.3 + \
        ((i + 16) // 25) * 0.1

for i in range(0, 33):
    print('%2d: %4.2f' % (i, conv(i)))
    continue

In other words, add 1.1 per unit, then add 0.3 every 5 units from positions 1 and 4, and add a further 0.1 at 9 (and I'm guessing that's every 25 units, but it must be at least 24).

According to Kevin, just multiply by 0.34, and round to the nearest tenth, to get metres per second. Converting to km/h and rounding again gives all the reported values. Try the following, and you'll see all the reported values matching:

def conv(i):
    return i * 1.1 + \
        ((i + 3) // 5 + (i + 1) // 5) * 0.3 + \
        ((i + 15) // 24) * 0.1

def conv2(i):
    return int(i * 3.4 + 0.5) / 10 * 3600 / 1000

for i in range(0, 33):
    print('%2d: %4.1f %4.1f' % (i, conv(i), conv2(i)))
    continue
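
To check the two-stage rounding against the recorded values without relying on floating-point formatting, here is a sketch in integer arithmetic, working in tenths. The name kmh_tenths is mine, and rounding halves upwards is an assumption (no exact halves can arise here, so it doesn't affect the result):

```python
# code -> console speed in tenths of a km/hr, from the tables above
OBSERVED_TENTHS = {
    0: 0, 1: 11, 2: 25, 3: 36, 4: 50, 5: 61, 6: 72, 7: 86, 8: 97,
    9: 112, 14: 173, 15: 184, 17: 209, 18: 220, 29: 356, 32: 392,
}


def kmh_tenths(code):
    """Speed in tenths of a km/hr, rounding at both stages."""
    ms_tenths = (34 * code + 5) // 10    # 0.34 m/s per unit, to 0.1 m/s
    return (36 * ms_tenths + 5) // 10    # times 3.6, to 0.1 km/hr


# Any code whose computed speed differs from the console's
mismatches = {
    code: (kmh_tenths(code), want)
    for code, want in OBSERVED_TENTHS.items()
    if kmh_tenths(code) != want
}
print(mismatches or 'all observed values match')
```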

[2024-06-10 Minor corrections to table; inferred expression]
[2024-06-12 Linked to Kevin's post with "the answer"; corrected bit positions]

2023-08-19

Two logical interfaces on one physical, with Netplan

In my home network, I have a server which I want to appear under two hostnames, mainly so I can later move the functionality associated with one of them around to other hosts. I'm just using my ISP-supplied broadband router/modem to manage the network, but it doesn't exactly bristle with configuration options to make this directly possible with, say, a DNS alias. Nevertheless, I want to stick with it, as other solutions might involve duplicating a lot of its functionality, or splitting it across multiple hosts, both of which introduce their own risks.

The router provides local DNS resolution (in the .home domain), and it honours the hostnames specified by DHCP requests. By presenting two interfaces to it, a single host can get two IP addresses and so two distinct names. Yes, it's ugly and hacky, but it's a solution within the constraints.

Approach

In this specific example, enp3s0 is the physical interface, and the second hostname is media-centre. The approach is to create two virtual interface pairs (faux0-faux0br and faux1-faux1br), connect one end of each (faux0br and faux1br) to a virtual bridge (br0), and connect this to the physical interface enp3s0. The other two ends of the pairs (faux0 and faux1) are now on the same Ethernet network, and running DHCP on them causes them to acquire distinct IP addresses, and registers them under distinct DNS names.

IPv4 ARP

For IPv4, it's essential to prevent the two interfaces stepping on each other's toes regarding ARPs, and a Server Fault answer shows how. Put this in your /etc/sysctl.d/local.conf (or create a numbered file for it, say 99-dualiface.conf):

net.ipv4.conf.all.arp_ignore=1
net.ipv4.conf.all.arp_announce=2
net.ipv4.conf.all.rp_filter=2

That will apply on boot, but you can apply it immediately with sudo sysctl -p/etc/sysctl.d/local.conf.

Creating virtual interface pairs

At the time of writing, and as far as I can tell, Netplan can set up bridges, but not the veth pairs used in the previous solution. This Ask Ubuntu answer explains how to do it another way. For our case specifically, create /etc/systemd/network/25-faux0.netdev:

[NetDev]
Name=faux0
Kind=veth
[Peer]
Name=faux0br

Create /etc/systemd/network/25-faux1.netdev similarly:

[NetDev]
Name=faux1
Kind=veth
[Peer]
Name=faux1br

Connecting with a bridge

We create and define the bridge in the /network/bridges section of a YAML file in /etc/netplan/. I've called this one 99-bridgehack.yaml:

network:
  ethernets:
    enp3s0:
      dhcp4: false
    faux0:
      dhcp4: true
    faux0br: {}
    faux1:
      dhcp4: true
      dhcp4-overrides:
        hostname: media-centre
    faux1br: {}
  bridges:
    br0:
      link-local: []
      interfaces:
        - faux0br
        - faux1br
        - enp3s0

We enable DHCP on faux0 and faux1. The former announces itself using the server's own name by default, but we set the name explicitly for the latter. Note that we also disable DHCP on our original interface enp3s0, overriding the setting in /etc/netplan/00-installer-config.yaml:

# This is the network config written by 'subiquity'
network:
  ethernets:
    enp3s0:
      dhcp4: true
  version: 2

The section /network/bridges/br0/interfaces binds the backends of the veth pairs together with the physical interface. faux0br and faux1br must have some presence in /network/ethernets in order to reference them here, so they are set empty.

[Edit 2024-12-07] /network/bridges/br0/link-local is set to an empty list to prevent IPv6 addresses being assigned to the bridge. This isn't vital, but it might save you some head scratching about strange entries in your router's network device list.

Deployment

With /etc/netplan/99-bridgehack.yaml in place, you just need to tell Netplan about it. Any remote network reconfiguration risks you losing the very connection you're using to do it over, so this is best done on the server's console:

sudo netplan generate
sudo netplan apply

Maybe I did something wrong, but I would often find that Netplan would create new entities as requested, but not tear down old ones. A reboot ensures you're starting from a clean slate. If you make a mistake, you can always rename 99-bridgehack.yaml to disable it.

Déjà vu

I did this before, but without Netplan. I turned it off, and enabled legacy ifupdown functionality still available in Ubuntu 18.04. However, it's less clear how to do that on 22.04, so I had to find a way with Netplan. There was no need to mess with /etc/dhcp/dhclient.conf this time, which is good, as it didn't seem to make any difference. (Is dhclient being used any more?) The IPv4/ARP advice remains largely the same.

2022-08-25

The Brexit Song

To the tune of “Thank you for the music” by ABBA:

Thank you for the Brexit that keeps on giving
To the EU. You've lost your living.
Thank you for the workforce,
The jobs and all your money,
For sovereignty,
And for some bad trade deals you will see
Aren't worth the loss of all your farming,
Fishing and industry.

Feel free to develop.

2021-04-06

BT email rules not working

So, I just spent the evening rejigging my parents' email rules with BT. They seemed to stop working sometime in January 2021, and I've just worked out why.

BT have changed how comparisons like is and ends with work on the From: field (and possibly others). Previously, the email address was extracted from the field, so it didn't matter whether the whole text of the field read any of these ways:

From: j.bloggs@example.com
From: Joe Bloggs <j.bloggs@example.com>
From: "Joe Bloggs" j.bloggs@example.com

Can't be certain that I've remembered that third form correctly; it's in an RFC somewhere anyway. However, I don't think I've seen it for a long time, so I'm going to assume it's fallen out of favour, and focus on the other two.

Under the new mechanism, From: is j.bloggs@example.com will only match the first form. You'll now also need a From: contains <j.bloggs@example.com> to guarantee a match. You can't use multiple operators like is and contains on the same field in the same rule, so you must duplicate the rule, and maintain it. You could, of course, match both j.bloggs@example.com and <j.bloggs@example.com> in the same rule with contains, and you'll probably get away with it, but you'll be left scratching your head when bob.j.bloggs@example.computing.invalid ends up in the same place. Also, if they change it back without notice, your is rule will continue to work.

From: ends with @example.com will also fail to match the second form. You need From: ends with @example.com> too now. Fortunately, you can do that with an extra entry in the same rule; you don't need a duplicate rule. However, bear in mind that you can only have 15 From: entries in a single rule.
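
To make the difference concrete, here is a Python model of the behaviour as I've described it. It is only an illustration, not BT's implementation, and the function names old_is and new_is are invented. The old behaviour is imitated with the standard library's email.utils.parseaddr:

```python
from email.utils import parseaddr

ADDRESS = 'j.bloggs@example.com'
PLAIN = 'j.bloggs@example.com'
NAMED = 'Joe Bloggs <j.bloggs@example.com>'
STRANGER = 'bob.j.bloggs@example.computing.invalid'


def old_is(field, address):
    """Old behaviour: compare against the address extracted from the field."""
    return parseaddr(field)[1] == address


def new_is(field, address):
    """New behaviour: compare against the whole text of the field."""
    return field == address


print(old_is(PLAIN, ADDRESS), old_is(NAMED, ADDRESS))   # both forms match
print(new_is(PLAIN, ADDRESS), new_is(NAMED, ADDRESS))   # only the first does
print('<' + ADDRESS + '>' in NAMED)      # the extra 'contains' entry
print(ADDRESS in STRANGER)               # why a bare 'contains' is risky
print(NAMED.endswith('@example.com'), NAMED.endswith('@example.com>'))
```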

To: and CC: can have multiple addresses. Some experimentation is required to determine whether they are automatically split and tested separately.

While I'm in gripe mode, BT rules could do with a few other features:

Match on List-Id: to pick out mailing-list posts unambiguously.
Filter out those damn subject-line tags like [zarquon users] that needn't pollute mailing lists when they've already been sorted into the right folder.
Mark messages as read.

2021-01-07

“Wrong __data_start/_end pair” work-around

I was getting Wrong __data_start/_end pair from my Mokvino Web scripts when converting ODG to SVG, since upgrading to Ubuntu 20.04 (though I've used Mokvino Web so little lately, I can't be sure that that's the start of the problem). It was an inkscape command that was failing. When I ran the command manually, I got no error. I found few differences in environment variables between running directly and running via make, and when I forced them to be the same in the script as in the console, it still failed within make and worked in the console.

A StackExchange question pointed towards a work-around. I checked the resource limit for the stack size (ulimit -s), and it was unlimited when run from make, but 8192 in the console. I bunged in a ulimit -s 8192 before the command, and it worked!

$ ulimit -s unlimited 
$ inkscape -z --query-all "example.svg" | head -2
Wrong __data_start/_end pair
$ ulimit -s 8192
$ inkscape -z --query-all "example.svg" | head -2
svg805,5848,8815,14472,4305.111
rect2,0,0,29700,21000
$

Can't say I understand what's happening here; just hope it helps.