Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Local node routing table separation #1568

Closed
wants to merge 3 commits into from

Conversation

T-X
Copy link
Contributor

@T-X T-X commented Nov 15, 2018

This pull request includes one minor fix for the source MAC for IPv6 packets from the local-node ULA. And two changes similar to the previous attempt from commit b3762fc ("gluon-client-bridge: move IPv4 local subnet route to br-client (#1312)"). But without the drawbacks @ecsv had discovered with it (3ef28a4).

Conceptually, the new approach is to use a separate routing table for the local-node interface. Which we will only jump into if a node tries to use its IPv4 or IPv6 local-node address. This should ensure, that we never use the local-node interface's MAC address for any purpose other than direct client-to-node communication (for instance, routed packets will always use the unique MAC address from br-client).

This is also in preparation for the gluon-alt-esc packages (#1094).

Before, IPv6 packets from the local-node interface using the IPv6 ULA
were wrongly using the MAC address from br-client (64:70:02:ae:72:e4):

client$ tcpdump -i enp0s31f6 -e -n "ether src 16:41:95:40:f7:dc or ether src 64:70:02:ae:72:e4 or ether src 8c:16:45:66:ba:11"
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp0s31f6, link-type EN10MB (Ethernet), capture size 262144 bytes
00:02:51.202570 8c:16:45:66:ba:11 > 16:41:95:40:f7:dc, ethertype IPv6 (0x86dd), length 118: fdef:ffc0:3dd7:0:8e16:45ff:fe66:ba11 > fdef:ffc0:3dd7::1: ICMP6, echo request, seq 3, length 64
00:02:51.203040 64:70:02:ae:72:e4 > 8c:16:45:66:ba:11, ethertype IPv6 (0x86dd), length 118: fdef:ffc0:3dd7::1 > fdef:ffc0:3dd7:0:8e16:45ff:fe66:ba11: ICMP6, echo reply, seq 3, length 64
00:02:52.226566 8c:16:45:66:ba:11 > 16:41:95:40:f7:dc, ethertype IPv6 (0x86dd), length 118: fdef:ffc0:3dd7:0:8e16:45ff:fe66:ba11 > fdef:ffc0:3dd7::1: ICMP6, echo request, seq 4, length 64
00:02:52.227255 64:70:02:ae:72:e4 > 8c:16:45:66:ba:11, ethertype IPv6 (0x86dd), length 118: fdef:ffc0:3dd7::1 > fdef:ffc0:3dd7:0:8e16:45ff:fe66:ba11: ICMP6, echo reply, seq 4, length 64

(br-client: 64:70:02:ae:72:e4, local-node: 16:41:95:40:f7:dc, client device: 8c:16:45:66:ba:11)

$ ip a s dev local-node
10: local-node@local-port: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
qdisc noqueue state UP qlen 1000
    link/ether 16:41:95:40:f7:dc brd ff:ff:ff:ff:ff:ff
    inet 10.130.0.1/20 brd 10.130.15.255 scope global local-node
       valid_lft forever preferred_lft forever
    inet6 fdef:ffc0:3dd7::1/128 scope global deprecated
       valid_lft forever preferred_lft 0sec
    inet6 fe80::1441:95ff:fe40:f7dc/64 scope link
       valid_lft forever preferred_lft forever

With this patch applied ICMPv6 uses the correct source MAC
address from the local-node interface (16:41:95:40:f7:dc):

$ tcpdump -i enp0s31f6 -e -n "ether src 16:41:95:40:f7:dc or ether src 64:70:02:ae:72:e4 or ether src 8c:16:45:66:ba:11"
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp0s31f6, link-type EN10MB (Ethernet), capture size 262144 bytes
00:15:19.757790 8c:16:45:66:ba:11 > 16:41:95:40:f7:dc, ethertype IPv6 (0x86dd), length 118: fdef:ffc0:3dd7:0:8e16:45ff:fe66:ba11 > fdef:ffc0:3dd7::1: ICMP6, echo request, seq 1, length 64
00:15:19.758567 16:41:95:40:f7:dc > 33:33:ff:66:ba:11, ethertype IPv6 (0x86dd), length 86: fdef:ffc0:3dd7::1 > ff02::1:ff66:ba11: ICMP6, neighbor solicitation, who has fdef:ffc0:3dd7:0:8e16:45ff:fe66:ba11, length 32
00:15:19.758617 8c:16:45:66:ba:11 > 16:41:95:40:f7:dc, ethertype IPv6 (0x86dd), length 86: fdef:ffc0:3dd7:0:8e16:45ff:fe66:ba11 > fdef:ffc0:3dd7::1: ICMP6, neighbor advertisement, tgt is fdef:ffc0:3dd7:0:8e16:45ff:fe66:ba11, length 32
00:15:19.759220 16:41:95:40:f7:dc > 8c:16:45:66:ba:11, ethertype IPv6 (0x86dd), length 118: fdef:ffc0:3dd7::1 > fdef:ffc0:3dd7:0:8e16:45ff:fe66:ba11: ICMP6, echo reply, seq 1, length 64
00:15:20.759425 8c:16:45:66:ba:11 > 16:41:95:40:f7:dc, ethertype IPv6 (0x86dd), length 118: fdef:ffc0:3dd7:0:8e16:45ff:fe66:ba11 > fdef:ffc0:3dd7::1: ICMP6, echo request, seq 2, length 64
00:15:20.759966 16:41:95:40:f7:dc > 8c:16:45:66:ba:11, ethertype IPv6 (0x86dd), length 118: fdef:ffc0:3dd7::1 > fdef:ffc0:3dd7:0:8e16:45ff:fe66:ba11: ICMP6, echo reply, seq 2, length 64

In practice, this wrong MAC address does not seem to cause any
issues so far, though, as ebtables rules are filtering to bat0 for both
local-node MAC, IPv4 and IPv6 addresses and the IPv6 stack (at least in
Linux) does not seem to update its neighbor table from IPv6 data packets
(interestingly, contrary to the Linux IPv4 stack, as was observed
with 3ef28a4
("gluon-client-bridge: Revert "move IPv4 local subnet route to br-client (freifunk-gluon#1312)").

Signed-off-by: Linus Lüssing <[email protected]>
Just as was done for IPv6 previously, use the new, separate table for IPv4
local-node, too.

This ensures that we will always only use the source MAC of the
local-node interface if the packet originated from us (locally
generated, not routed) and matches our local-node IPv4 or IPv6
address.

Signed-off-by: Linus Lüssing <[email protected]>
This allows us to route and NAT between br-client and br-wan in the future.
Which is necessary for the gluon-alt-esc-provider package, for instance.

With the local-node routes separated in their own table, the issue
described in:

3ef28a4 ("gluon-client-bridge: Revert "move IPv4 local subnet route to br-client (freifunk-gluon#1312)"

and initially introduced in:

b3762fc ("gluon-client-bridge: move IPv4 local subnet route to br-client (freifunk-gluon#1312)")

does not apply anymore, which makes this change safe again.

Signed-off-by: Linus Lüssing <[email protected]>
@rotanid rotanid added 0. type: enhancement The changeset is an enhancement 5. needs: testing Testing of the changes is necessary labels Nov 15, 2018
@christf
Copy link
Member

christf commented Nov 16, 2018

I am uncertain whether we should we diverge routing for babeld and batman.
I guess it comes down to whether we want to support keeping the configuration when switching mesh protocols.

@T-X
Copy link
Contributor Author

T-X commented Nov 16, 2018

Hi @christf. When making this PR I was wondering and uncertain why those routes were in the batman-adv package upgrade scripts, too. But I did not change that because I was uncertain whether that would break something for babel.

Do you mean you see some conflict with the babel integration with this concept? Then please elaborate, then I'd love to hear and learn what you are doing and need differently. Or are you just suggesting to do what I had been wondering about initially, to move these routes and tables from gluon-mesh-batman-adv (package/gluon-mesh-batman-adv/luasrc/lib/gluon/upgrade/320-gluon-mesh-batman-adv-client-bridge) to gluon-client-bridge (package/gluon-client-bridge/luasrc/lib/gluon/upgrade/310-gluon-client-bridge-local-node)?

@christf
Copy link
Member

christf commented Nov 26, 2018

actually I was referring to the usage of table 2 and the fact that now batman adjusts routing rules for the local-node feature while the babel stack is untouched.

I feel local-node should work the same on both stacks to avoid confusion.

@T-X
Copy link
Contributor Author

T-X commented Nov 27, 2018

Ok, great.

"local_node_route6" was already set up via 320-gluon-mesh-batman-adv-client-bridge. @christf, can I just move my changes and additions for "local_node_route", "local_node_route6", "local_node_rule", "local_node_rule6" and "br_client_route" from 320-gluon-mesh-batman-adv-client-bridge to 310-gluon-client-bridge-local-node? Or would that break something for Babel?

@christf
Copy link
Member

christf commented Nov 29, 2018

I do not think it should break anything but it would be interesting to see a test.

I am not sure how to best do this in the future. I could be difficult to run many test-scenarios for changes like this.

@rotanid rotanid added this to the 2019.1 milestone Jan 20, 2019
@rotanid
Copy link
Member

rotanid commented Feb 12, 2019

how's the testing of this coming along? it would be nicer to get good work not only done, but also tested and afterwards maybe merged because of good results :)

@CodeFetch
Copy link
Contributor

@christf This should not interfere with the Babel setup as the local node route table lookup is /128 /32. If it does it should be easy to fix.
@rotanid Please merge it so that #1094 can be merged.

@rotanid
Copy link
Member

rotanid commented Mar 2, 2019

@CodeFetch i won't merge if there's not even basic testing with a babel-build.

@ghost
Copy link

ghost commented Mar 21, 2019

I try to do a test build of this PR with the help of the FFFFM babel knowledge, next week(-end)
https://wiki.ffm.freifunk.net/firmware:babel
https://github.com/freifunk-ffm/site-ffffm/tree/christf_next

@rotanid what comes to your mind when basic testing has to be done?

And what should be tested when doing tests with this PR especially?

@CodeFetch
Copy link
Contributor

CodeFetch commented Mar 22, 2019

On layer 3 this patch should work this way:

  1. (local_node_rule) IPv4 packet from our local node IP address (indirectly means client is local)? -> put in routing table 2
  2. (local_node_rule6) IPv6 packet from our local node IP address (indirectly means client is local)? -> put in routing table 2
  3. (babel) packet destined to a client on another node (shorter cidr)? -> output on interface where other node is / forward to other node
  4. (br_client_route) IPv4 packet destined to a client in the site-prefix? -> output on client bridge
    (this is needed as T_X does not use the site-ipv4-prefix for the local-node interface anymore, but the local-node IPv4 address. IPv6 routes use the link local address for their routes which is based on the MAC address of the clients together with the interface where the MAC address can be reached. Thus this is not needed for IPv6 here)

Routing table 2:
(local_node_route) IPv4 packet destined to site-IPv4-prefix? -> output on local-node interface
(local_node_route) IPv6 packet destined to site-IPv4-prefix? -> output on local-node interface

local-node interface is a veth and the packet outputs on local-port which is in the client bridge together with the wifi client interfaces and the bat0 client interface. This patch only has the purpose that packets from the local node IP address use the MAC address of the local-port interface.

@bobcanthelpyou That means you have to check everything ordinary like does the clients still have gateway reachability, do they receive router announcements - say check if the clients have internet connection. Then as a client check if you can reach the status page at the local node's IPv4/IPv6 address and as a client on another node check if you can reach the status page of this node with its global IPv6 address.

Edit: I'm not really sure at the moment if this can't cause troubles with babel as it uses the local-node interface for forwarding traffic between clients, too as far as I remember. Thus you also have to check if clients on two different nodes running this patch, can still reach each other.

@rotanid rotanid removed this from the 2019.1 milestone Jun 6, 2019
@rotanid
Copy link
Member

rotanid commented Jun 20, 2019

@bobcanthelpyou @T-X @christf has someone now tested this with a babel build?

@ghost
Copy link

ghost commented Jun 20, 2019

It is still on my todo-list but i forgot about it, and can't start at the moment (cause i am afk for the next weeks)

@rotanid
Copy link
Member

rotanid commented Sep 14, 2019

@bobcanthelpyou @T-X @christf has someone now tested this with a babel build?

status?

@@ -23,8 +23,7 @@ uci:section('network', 'device', 'local_node_dev', {
local ip4, ip6

if next_node.ip4 then
local plen = site.prefix4():match('/%d+$')
ip4 = next_node.ip4 .. plen
ip4 = next_node.ip4 .. '/32'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christf Please test this change with the babeld setup.

src = next_node.ip4 .. '/32',
lookup = 2,
})
uci:set('network', 'local_node_rule', 'in', 'loopback')
Copy link
Member

@neocturne neocturne Nov 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please merge this set into the section above (same for local_node_rule6 below).

I assume this was not done because in is a keyword in Lua. The correct syntax for this (and arbitrary Lua values as table keys) is

{
  ...
  ['in'] = 'loopback',
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and is the in setting even necessary at all?

uci:section('network', 'route', 'br_client_route', {
interface = 'client',
target = site.prefix4(),
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I wonder if such a change will lead to incorrect source address selection in some cases. I'm not sure if source address selection for hosts with multiple addresses is as well-specified for IPv4 as it is for IPv6 (at least, I don't know the relevant RFCs), but I'm thinking about the following:

When the Gluon node wants to contact a local client, it will now select this route at first (before source address selection), with interface client - which doesn't have any IPv4 address. Now, how is the source address selected now? What tells Linux to use the address of local-node and not, for example, the br-wan address?

The issue might be solved by adding a "source" attribute with the next-node address to this route definition, but I haven't fully thought this through yet.

For IPv6, a similar issue cannot occur, as br-client should always have an address, which would be selected (at least when gluon-radvd is included, which is the default now).

Copy link
Member

@neocturne neocturne May 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@T-X

I also wonder if this route could lead to packets with the local-node source IPv4 address leaving through br-client, with the br-client MAC address, or if packets would go through another round of routing after choosing the source address, and then end up in routing table 2. If the former is the case, I don't think we should merge the IPv4 part of these changes...

Would adding a source address attribute with the local-node IPv4 address to this route be a problem for alt-esc?

@mweinelt mweinelt added the 2. status: waiting-on-author Waiting on some action from the author label Feb 3, 2020
@rotanid
Copy link
Member

rotanid commented Aug 30, 2020

@T-X are you still interested in this? you have not commented on the feedback which was given 9 months ago...

@AiyionPrime AiyionPrime added the 2. status: merge conflict The merge has a conflict and needs rebasing label Jan 21, 2023
@T-X
Copy link
Contributor Author

T-X commented Nov 19, 2024

Closing this PR as I'm currently not working on it. And instead have an alternative approach in mind (https://www.open-mesh.org/projects/open-mesh/wiki/OpenHarbors).

@T-X T-X closed this Nov 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0. type: enhancement The changeset is an enhancement 2. status: merge conflict The merge has a conflict and needs rebasing 2. status: waiting-on-author Waiting on some action from the author 5. needs: testing Testing of the changes is necessary
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants