Testing CLAT for IPv6-only mobile networks

Oh okay.
It just goes off completely.
So far it has only happened to me when establishing a connection to the 10 III hotspot. It was like that before the patch, and with the current patch it happened again during testing.

If the connection works, I have no internet. But then it is the same for other devices.
So far it has worked once, after installing or updating, and I then also had internet. With the current patch that no longer works and I get the same error as before.

With this build the clat interface died 2 or 3 times for me. Restarting connman.service brought it back to life.

If either of you could take some logs using the instructions above, we’d stand a better chance of seeing what’s going on.

Ouch. I’d also like to know whether simply toggling mobile data off and on would suffice, or whether the restart was absolutely necessary to get the CLAT interface back?

I’m not sure. If it dies again, I will test whether disabling mobile data or enabling flight mode helps.
I will also collect some logs.

UPDATE: Disabling mobile data or turning off flight mode didn’t bring the CLAT interface back up again.
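The workaround reported above (restarting connman.service when the clat interface has disappeared) can be sketched as a small shell helper. This is only a sketch: the interface name "clat" and the unit name come from the outputs in this thread, while the function names are hypothetical. It is only meaningful while the phone is on IPv6-only mobile data, since no clat interface is expected on WLAN.

```shell
#!/bin/sh
# Sketch only. "clat" and "connman.service" are taken from this thread;
# the function names are made up for illustration.

# Succeeds if a network interface called "clat" exists.
# $1: sysfs network directory (defaults to /sys/class/net).
clat_present() {
    [ -e "${1:-/sys/class/net}/clat" ]
}

# Restart ConnMan only when the clat interface is missing. Needs root.
restart_connman_if_clat_missing() {
    if clat_present; then
        echo "clat is up, nothing to do"
    else
        echo "clat is missing, restarting connman.service"
        systemctl restart connman.service
    fi
}
```

Sourcing the file and calling restart_connman_if_clat_missing as root performs the check once. Collecting logs before restarting is still more useful to the developers than the restart itself.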


First of all, thanks to @jlaakkonen for implementing CLAT in Sailfish and to @abranson for testing and this thread.
This seems to be a very good alternative to “real” IPv4.
While many users (including me) and service providers are still struggling with the implementation of IPv6, it can no longer be ignored.
It is therefore important and sensible to provide compatibility with the old IPv4 world for the transition period (which has already lasted about 20 years and will probably last another 20).

I installed the package a few days ago (and all the updates that followed).
In normal everyday use, it worked very well at first glance.

The IPv4 only world is accessible again and as @miau already mentioned, the Jolla email program now has access to the mobile data connection again.

However, I have noticed the following:
After every restart of the phone or every time you toggle airplane mode on and off, CLAT stops working.
I have to turn mobile data off and back on once for CLAT to work again.

Also it seems like the switching and routing mechanism doesn’t work in all situations.
So I already had the situation that WLAN and CLAT were active at the same time:

[root@Xperia10III defaultuser]# ifconfig
clat      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          inet addr:192.0.0.1  P-t-P:192.0.0.1  Mask:255.255.255.248
          inet6 addr: fe80::1922:b17f:8f67:d39/64 Scope:Link
          UP POINTOPOINT RUNNING NOARP MULTICAST DYNAMIC  MTU:1500  Metric:1
          RX packets:357 errors:0 dropped:0 overruns:0 frame:0
          TX packets:361 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:210193 (205.2 KiB)  TX bytes:211213 (206.2 KiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:13713 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13713 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1722772 (1.6 MiB)  TX bytes:1722772 (1.6 MiB)

rmnet_data0 Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          inet6 addr: fe80::532d:da72:65f3:4be9/64 Scope:Link
          UP RUNNING  MTU:1500  Metric:1
          RX packets:71 errors:0 dropped:0 overruns:0 frame:0
          TX packets:73 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:4684 (4.5 KiB)  TX bytes:4832 (4.7 KiB)

rmnet_data2 Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          inet6 addr: 2a01:598:d840:17ea:7624:b722:f5c4:a7c8/64 Scope:Global
          inet6 addr: 2a01:598:d840:17ea:a993:e42d:2ca4:e4b5/64 Scope:Global
          inet6 addr: fe80::7624:b722:f5c4:a7c8/64 Scope:Link
          UP RUNNING  MTU:1500  Metric:1
          RX packets:50565 errors:0 dropped:0 overruns:0 frame:0
          TX packets:48009 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:29199670 (27.8 MiB)  TX bytes:8641784 (8.2 MiB)

rmnet_ipa0 Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          inet6 addr: fe80::73ef:94fd:505f:191a/64 Scope:Link
          UP RUNNING  MTU:9216  Metric:1
          RX packets:17913 errors:0 dropped:0 overruns:0 frame:0
          TX packets:21422 errors:0 dropped:38 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:11215665 (10.6 MiB)  TX bytes:9117480 (8.6 MiB)

rndis0    Link encap:Ethernet  HWaddr E2:01:4D:4E:FC:77  
          inet addr:192.168.10.1  Bcast:192.168.10.255  Mask:255.255.255.0
          inet6 addr: fe80::e001:4dff:fe4e:fc77/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST DYNAMIC  MTU:1500  Metric:1
          RX packets:168 errors:0 dropped:0 overruns:0 frame:0
          TX packets:90 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:18061 (17.6 KiB)  TX bytes:22571 (22.0 KiB)

wlan0     Link encap:Ethernet  HWaddr 3C:01:EF:F1:22:1C  
          inet addr:192.168.2.59  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::3e01:efff:fef1:221c/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST DYNAMIC  MTU:1500  Metric:1
          RX packets:28335 errors:0 dropped:16816 overruns:0 frame:0
          TX packets:5150 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:3000 
          RX bytes:8061912 (7.6 MiB)  TX bytes:497370 (485.7 KiB)

Meanwhile, the IPv4 default route continued to run via CLAT:

[root@Xperia10III defaultuser]# ip route
default dev clat  scope link  metric 2048 
192.0.0.0/29 dev clat  proto kernel  scope link  src 192.0.0.1 
192.168.2.0/24 dev wlan0  proto kernel  scope link  src 192.168.2.59 
192.168.2.1 dev wlan0  scope link 
192.168.10.0/24 dev rndis0  proto kernel  scope link  src 192.168.10.1

And the IPv6 default route via mobile data was also still active:

[root@Xperia10III defaultuser]# ip -6 route
2a01:598:7ff:0:10:74:210:221 via fe80::f82f:ad22:58e8:ddfd dev rmnet_data2  metric 1 onlink 
2a01:598:7ff:0:10:74:210:222 via fe80::f82f:ad22:58e8:ddfd dev rmnet_data2  metric 1 onlink 
2a01:598:d840:17ea::c1a7 dev clat  metric 1024 
2a01:598:d840:17ea::/64 dev rmnet_data2  proto kernel  metric 256 
fe80::f82f:ad22:58e8:ddfd dev rmnet_data2  metric 1 onlink 
fe80::/64 dev rmnet_ipa0  proto kernel  metric 256 
fe80::/64 dev rmnet_data0  proto kernel  metric 256 
fe80::/64 dev rmnet_data2  proto kernel  metric 256 
fe80::/64 dev clat  proto kernel  metric 256 
fe80::/64 dev wlan0  proto kernel  metric 256 
fe80::/64 dev rndis0  proto kernel  metric 256 
default via fe80::f82f:ad22:58e8:ddfd dev rmnet_data2  proto ra  metric 1024  expires 62438sec hoplimit 255
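The state shown in the outputs above (IPv4 default route on clat while wlan0 is also up) can be spotted by parsing the routing table. The following is a minimal sketch that reads `ip route` style text on standard input; the function names are hypothetical, and the interface names clat and wlan0 are assumed to match the device shown here.

```shell
#!/bin/sh
# Sketch only: parses text in the format printed by "ip route" / "ip -6 route".
# Function names are illustrative, not part of any existing tool.

# Print the device used by each default route found on stdin.
default_route_dev() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++)
            if ($i == "dev") { print $(i + 1); break }
    }'
}

# Succeeds if the default route goes via clat while wlan0 also has routes.
clat_default_while_wlan() {
    awk '
        $1 == "default" && / dev clat( |$)/ { clat = 1 }
        / dev wlan0( |$)/                   { wlan = 1 }
        END { exit !(clat && wlan) }
    '
}
```

On the device this would be used as `ip route | default_route_dev` and `ip -6 route | default_route_dev`, or `ip route | clat_default_while_wlan && echo "IPv4 still defaults to clat"`.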

In the data counter I was able to determine that when calling up an internet page via browser, the main flow of data took place via WLAN, but some kB also ran over the mobile network.

This is probably not due to CLAT itself, but to a problem with the basic switching process between IPv6 only mobile data and IPv4 only WLAN networks:
If you have an IPv6 only mobile data connection, it will not be switched off if you connect to an IPv4 only WLAN.
The mobile data icon in the top menu still shows the provider name. In the status bar, the WLAN and mobile data icon alternates.

If you have a mobile data connection with IPv4 only or with IPv6 and IPv4, it will be switched off completely when you connect to an IPv4 only WiFi network.
The mobile data icon in the top menu no longer shows the provider name

See also my Bug Report: Use of mobile data (IPv6) although a WLAN connection (IPv4) is active
Unfortunately, this problem was not solved once and for all by switching off the mobile data connection when the WiFi connection (whether IPv6, IPv4 or both) is active. Only the removal of the IPv6 route was implemented.
While this works in most cases, as I pointed out at the time, this solution isn’t elegant and doesn’t solve the real problem.

Please don’t get me wrong, the implementation of CLAT in Sailfish is very important and needs to be done.
But CLAT cannot repair an incorrect connection establishment (IPv6 only although the MNO provides IPv4) or an incorrect switching process between IPv6 only mobile data and IPv4 only WLAN.


Jussi’s done all the development for this, not me! I’ve just been helping test it because I’m on an affected network.

Thank you for the report. Logs would have been nice, but I think I noticed at least a few problems in the code. I cannot test this properly myself, as there is no CLAT enabled mobile network here, but I assume that having WLAN up with CLAT still running was caused by ConnMan internals: in some scenarios the default service goes through 1. cellular is default, 2. NULL is default and 3. WLAN is default. That NULL state was not handled properly.

But the flight mode does have a nasty issue. It removes the IP configurations too fast for the plugin to process them, as the plugin has to wait for the tayga process to end first. And when it should do a proper cleanup there is no data left to work with. Sorry about the technical issues but this is what we’re dealing with here.

But I hope the next version will fix both of these. At least when the device is tricked into believing that CLAT is enabled, the state transitions do seem to work.


My issue of the dying CLAT interface happens sometimes when the phone gets out of wifi range and uses mobile data instead. I will make the needed changes to sysctl this evening to gather a log when this happens again.

1.32+git193.28 now built


Repo still shows 193.27 builds.

Oops sorry it’s still waiting. OBS is busy today.


Sorry @jlaakkonen. Honor to whom honor is due. I corrected this in my post


It built during the night. Huge backlog of builds on OBS after the latest SFOS target was updated.

Installed the new build around 12 hours ago. After leaving the range of the connected wifi, the phone switched to mobile data, but there was no CLAT interface. Restarting connman brought it back.

With 193.25 this unwanted behaviour did not occur.

Thanks for testing the different versions. Does this happen every time you go out of WLAN range or is it random? If it is a consistent result there is a bug. If it is random there is a race situation happening.

And to be clear, switching mobile data on and off did not help? It would also be good to check whether ConnMan’s PID changed during that WLAN drop, which would indicate an unfortunate segfault :/. For example, with journalctl -b -u connman|grep -e ".*version 1.32.*"
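As a complement to the grep above, the PID check can be sketched by extracting the process IDs from the journal lines directly. This assumes the default journalctl line format, where each entry carries an identifier such as connmand[1234]: the identifier name connmand is an assumption about how the daemon tags its log entries, and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: reads journalctl text output on stdin.
# Assumes entries are tagged "connmand[PID]:"; adjust the name if the
# journal on the device shows a different identifier.

# Print every distinct ConnMan PID seen in the log, one per line.
connman_pids() {
    sed -n 's/.*connmand\[\([0-9][0-9]*\)]:.*/\1/p' | sort -u
}

# Succeeds if more than one distinct PID appears, i.e. the daemon was
# restarted (or crashed and was respawned) within the logged period.
connman_restarted() {
    [ "$(connman_pids | grep -c .)" -gt 1 ]
}
```

For the current boot this would be run as `journalctl -b -u connman | connman_pids`. More than one PID in the output means ConnMan did not stay up for the whole boot, which would include any manual restarts done as a workaround.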

This is quite common: trying to fix CLAT staying on while WLAN is active may break some other case. That is why the logs would be really appreciated here.

EDIT: personally, with the device tricked into believing it has CLAT in Finland, which it doesn’t, I failed to see this behavior yesterday with the same version.

No, it does not happen every time. I have set up everything to collect logs and will also check whether connman’s PIDs change the next time this happens.


Logs have been sent.


1.32+git193.29 built. Changelog is always collected here for those interested:

https://build.merproject.org/package/view_file/home:abranson:clat/connman/_service:tar_git:connman.changes?expand=1


Thank you. Unfortunately those do not contain the necessary information, and only because the journal’s default limits are too strict. As it is configured for release devices, it does not retain enough content and throttles entries that arrive too frequently.

Some other component may have been logging so much that systemd-journald rotated the logs when the reserved space ran out. This is rather annoying, and for developer mode we may have to consider more lenient logging limits to avoid this kind of issue.

If it is not too much work, could you run the following to create /etc/systemd/journald.conf.d/debug.conf:

mkdir -p /etc/systemd/journald.conf.d/
cat <<EOF >/etc/systemd/journald.conf.d/debug.conf
[Journal]
Storage=persistent
RateLimitIntervalSec=0s
RateLimitBurst=0
SystemMaxUse=150M
RuntimeMaxUse=4M
EOF

and afterwards:
systemctl restart systemd-journald

When done with log collecting you can simply remove the /etc/systemd/journald.conf.d/debug.conf file and do the restart for systemd-journald again (or reboot).
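The cleanup step described above can be sketched as a small helper. The file path is the one given in this thread; the function name is hypothetical, and the optional argument exists only so the helper can be tried on a different path first.

```shell
#!/bin/sh
# Sketch only: removes the temporary journald debug drop-in.
# $1: path of the drop-in (defaults to the path used in this thread).
remove_journald_debug_conf() {
    conf="${1:-/etc/systemd/journald.conf.d/debug.conf}"
    if [ -e "$conf" ]; then
        rm -f "$conf" && echo "removed $conf"
    else
        echo "not found: $conf"
    fi
}
```

Run as root, `remove_journald_debug_conf && systemctl restart systemd-journald` restores the default limits without a reboot.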
