The Android support layer has some serious issues with network connectivity that makes e.g. applications dependent on push notification based services a gambling game. Here is an example usecase and the hope that Jolla devs can investigate into this issue any further as many users might be affected by this.
In my case I started using SchildiChat
(a Matrix-based chat client) together with ntfy.sh
(a unified push provider) as a replacement for the legacy XMPP protocol on my daily driver (Xperia 10 Mark 2).
Both applications will run as background processes once SailfishOS and aliendalvik container has started to receive push notifications.
Multiple times since I used this setup I noticed that notifications get lost because of missing network connectivity. So I started digging around as I’ve lost network connection once again after the phone was suspended but connected to wifi network, woken up to check that SchildiChat
reports no connection issues, suspended again and when I returned 30 minutes later to check for new messages SchildiChat
told me about a broken connection.
First the issue is on the Android-side of things so we have to enter console as root and switch into the lxc container.
,---
| Sailfish OS 4.4.0.64 (Vanha Rauma)
'---
[defaultuser@Xperia10II-DualSIM ~]$ devel-su
Password:
[root@Xperia10II-DualSIM defaultuser]# lxc-attach -n aliendalvik
lxc-attach: aliendalvik: terminal.c: lxc_terminal_create_native: 927 Not a directory - Failed to open terminal multiplexer device
#
So now I’m in Android world.
I checked if network connection is up and running by pinging my home server.
# ping 45.142.179.56
PING 45.142.179.56 (45.142.179.56) 56(84) bytes of data.
64 bytes from 45.142.179.56: icmp_seq=1 ttl=63 time=29.9 ms
64 bytes from 45.142.179.56: icmp_seq=2 ttl=63 time=49.4 ms
64 bytes from 45.142.179.56: icmp_seq=3 ttl=63 time=50.2 ms
Network connection on the physical layer seems to work as my server received the ICMP packets.
Next I tried name resolution.
# ping noordsee.de
ping: unknown host noordsee.de
Okay there is the culprit. Network is fine but DNS is broken.
Let’s check the Android communication layer using the dumpsys
command.
# dumpsys connectivity
NetworkFactories for: PhoneSwitcherNetworkRequstListener WifiNetworkFactory TelephonyNetworkFactory[1] TelephonyNetworkFactory[0] UntrustedWifiNetworkFactory
Active default network: none
Current Networks:
NetworkAgentInfo{ ni{[type: WIFI[], state: CONNECTED/CONNECTED, reason: (unspecified), extra: (none), failover: false, available: true, roaming: false]} network{0} nethandle{0} lp{{InterfaceName: wlan0 LinkAddresses: [ ] DnsAddresses:
[ /192.168.178.1 ] Domains: null MTU: 0 Routes: [ ]}} nc{[ Transports: WIFI Capabilities: NOT_METERED&INTERNET&NOT_RESTRICTED&TRUSTED&NOT_VPN&NOT_ROAMING&FOREGROUND&NOT_CONGESTED&NOT_SUSPENDED LinkUpBandwidth>=1048576Kbps LinkDnBandwidth>=
1048576Kbps Specifier: <WifiNetworkAgentSpecifier [WifiConfiguration=, SSID="Klaus", BSSID=1c:11:1f:91:e1:1a, mOriginalRequestorUid=-1, mOriginalRequestorPackageName=]> SignalStrength: -67 SSID: "Klaus"]} Score{20} everValid
ated{false} lastValidated{false} created{false} lingering{false} explicitlySelected{true} acceptUnvalidated{false} everCaptivePortalDetected{false} lastCaptivePortalDetected{false} captivePortalValidationPending{false} partialConnectivity
{false} acceptPartialConnectivity{false} clat{mBaseIface: null, mIface: null, mState: IDLE} }
Requests: REQUEST:0 LISTEN:0 BACKGROUND_REQUEST:0 total:0
Lingered:
Restrict background: false
Well this is interesting. The wifi access point is connected and DNS configured but the active default network is gone (none).
dumpsys connectivity
told me also a few lines later that…
NetTransition WakeLock activity (most recent first):
total acquisitions: 2
total releases: 2
cumulative duration: 187s
longest duration: 185s
2022-06-25T10:03:16.743 - RELEASE (EVENT_EXPIRE_NET_TRANSITION_WAKELOCK)
2022-06-25T10:00:11.186 - ACQUIRE for NetworkAgentInfo [MOBILE (EDGE) - 0]
2022-06-24T23:30:20.984 - RELEASE (EVENT_CLEAR_NET_TRANSITION_WAKELOCK)
2022-06-24T23:30:18.848 - ACQUIRE for NetworkAgentInfo [WIFI () - 0]
… at 10am the phone somehow tried to acquire a mobile network agent info lock which expired. This must be the time within my 30 minutes between working network and failing network where the actual issue occured.
Now all DNS and network stuff is handled by netd
service that receives IPCs via binder or ncd
dispatcher.
# dumpsys netd
ppid:1 -> pid:38 -> tid:111
total running time: 0d3h32m51s614ms (12771614ms)
user: 0s191ms sys: 2s967ms
maxrss: 7924kB
NetworkController
Default network: 0
Networks:
51 DUMMY dummy0
99 LOCAL
...
...
...
Log:
06-24 23:17:13.651 netd 1.0 starting
06-24 23:17:14.102 Registering NetdNativeService: 0.3ms
06-24 23:17:14.103 Netd started in 451ms
06-24 23:17:49.774 registerUnsolicitedEventListener() <0.03ms>
06-24 23:18:12.304 firewallSetFirewallType(1) <22.32ms>
06-24 23:18:12.335 isAlive() -> {"true"} <0.01ms>
06-24 23:18:12.404 bandwidthSetGlobalAlert(2097152) <0.04ms>
06-24 23:18:12.415 networkSetPermissionForUser(1, []) <0.02ms>
06-24 23:18:12.416 networkSetPermissionForUser(2, [2000,10034,10035,1027,10037,10006,10038,1000,1001,10030]) <0.02ms>
06-24 23:18:12.417 trafficSetNetPermForUids(12, [1001,1013,10006]) <0.02ms>
06-24 23:18:12.417 trafficSetNetPermForUids(4, [1000,1027,2000,10001,10002,10012,10014,10027,10030,10036,10037,10038,10044,10045]) <0.02ms>
06-24 23:18:12.418 trafficSetNetPermForUids(8, [1041,1047]) <0.02ms>
06-24 23:18:12.419 trafficSetNetPermForUids(0, [1003,1063,1066,1067,1068,1073,10000,10003,10004,10005,10007,10008,10009,10010,10013,10015,10016,10017,10018,10019,10021,10022,10023,10024,10025,10026,10028,10032,10033,10034,10035]) <0.02m
s>
06-24 23:18:12.552 firewallReplaceUidChain("fw_standby", "false", []) -> {"true"} <0.02ms>
06-24 23:18:30.851 registerUnsolicitedEventListener() <0.02ms>
06-24 23:18:43.981 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-24 23:18:45.151 bandwidthAddNiceApp(10006) <0.02ms>
06-24 23:18:46.384 NetdNativeService::interfaceGetCfg(wlan0) -> ({wlan0, 3c:01:ef:95:ec:7a, 192.168.178.61, 24, up, broadcast, running, multicast}) (4ms)
06-24 23:18:46.384 interfaceGetCfg("wlan0") <3.94ms>
06-24 23:18:46.386 interfaceClearAddrs("wlan0") <0.02ms>
06-24 23:18:46.386 interfaceSetIPv6PrivacyExtensions("wlan0", "true") <0.01ms>
06-24 23:18:46.386 interfaceSetEnableIPv6("wlan0", "false") <0.01ms>
06-24 23:18:46.400 NetdNativeService::interfaceGetCfg(wlan0) -> ({wlan0, 3c:01:ef:95:ec:7a, 192.168.178.61, 24, up, broadcast, running, multicast}) (0.2ms)
06-24 23:18:46.400 interfaceGetCfg("wlan0") <0.33ms>
06-24 23:18:46.417 interfaceSetEnableIPv6("wlan0", "false") <0.02ms>
06-24 23:18:46.417 interfaceClearAddrs("wlan0") <0.01ms>
06-24 23:18:46.475 bandwidthSetGlobalAlert(2097152) <0.01ms>
06-24 23:18:46.526 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-24 23:18:46.596 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-24 23:18:46.610 networkAddInterface(0, "wlan0") <0.02ms>
06-24 23:18:46.613 DnsResolverService::setResolverConfiguration(0, [192.168.178.1], [], 1800, 25, 8, 64, 0, 0, [], []) -> (-64) (0.2ms)
06-24 23:18:46.613 setResolverConfiguration() -> ServiceSpecificException(64, "Machine is not on the network") <0.49ms>
06-24 23:18:46.639 networkCreatePhysical(0, 0) <0.02ms>
06-24 23:18:46.640 createNetworkCache(0) <0.09ms>
06-24 23:18:46.646 DnsResolverService::setResolverConfiguration(0, [192.168.178.1], [], 1800, 25, 8, 64, 0, 0, [192.168.178.1], []) -> (0) (0.8ms)
06-24 23:18:46.646 setResolverConfiguration() <1.00ms>
06-24 23:18:46.648 networkAddInterface(0, "wlan0") <0.02ms>
06-24 23:18:46.650 DnsResolverService::setResolverConfiguration(0, [192.168.178.1], [], 1800, 25, 8, 64, 0, 0, [192.168.178.1], []) -> (0) (0.1ms)
06-24 23:18:46.650 setResolverConfiguration() <0.23ms>
06-24 23:18:46.663 idletimerAddInterface("wlan0", 15, "1") <0.03ms>
06-24 23:18:46.665 networkSetDefault(0) <0.05ms>
06-24 23:18:46.665 setTcpRWmemorySize("4096 87380 110208", "4096 16384 110208") <0.01ms>
06-24 23:18:46.678 bandwidthSetGlobalAlert(2097152) <0.01ms>
06-24 23:18:46.679 bandwidthSetGlobalAlert(2097152) <0.01ms>
06-24 23:18:46.979 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-24 23:18:46.988 DnsResolverService::setResolverConfiguration(0, [192.168.178.1], [], 1800, 25, 8, 64, 0, 0, [192.168.178.1], []) -> (0) (0.3ms)
06-24 23:18:46.989 setResolverConfiguration() <0.52ms>
06-24 23:18:47.020 bandwidthSetGlobalAlert(2097152) <0.01ms>
06-24 23:18:47.043 bandwidthSetGlobalAlert(2097152) <0.01ms>
06-24 23:18:47.067 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-24 23:18:47.092 bandwidthSetGlobalAlert(2097152) <0.01ms>
06-24 23:18:47.135 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-24 23:18:47.227 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-24 23:18:47.271 bandwidthSetGlobalAlert(2097152) <0.01ms>
06-24 23:18:47.646 bandwidthSetGlobalAlert(2097152) <0.01ms>
06-24 23:18:47.696 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-24 23:18:47.764 bandwidthSetGlobalAlert(2097152) <0.01ms>
06-24 23:19:47.177 bandwidthAddNiceApp(10045) <0.02ms>
06-24 23:30:18.841 idletimerRemoveInterface("wlan0", 15, "1") <0.01ms>
06-24 23:30:18.850 networkDestroy(0) <0.02ms>
06-24 23:30:18.851 destroyNetworkCache(0) <0.05ms>
06-24 23:30:18.906 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-24 23:30:19.915 networkCreatePhysical(0, 0) <0.02ms>
06-24 23:30:19.915 createNetworkCache(0) <0.06ms>
06-24 23:30:19.918 DnsResolverService::setResolverConfiguration(0, [10.74.210.210, 10.74.210.211], [], 1800, 25, 8, 64, 0, 0, [10.74.210.210, 10.74.210.211], []) -> (0) (0.6ms)
06-24 23:30:19.918 setResolverConfiguration() <0.71ms>
06-24 23:30:19.919 networkAddInterface(0, "rmnet_data0") <0.01ms>
06-24 23:30:19.920 interfaceSetMtu("rmnet_data0", 1500) <0.01ms>
06-24 23:30:19.921 networkAddRoute(0, "rmnet_data0", "10.75.166.0/26", "") <0.01ms>
06-24 23:30:19.921 networkAddRoute(0, "rmnet_data0", "0.0.0.0/0", "10.75.166.32") <0.01ms>
06-24 23:30:19.923 DnsResolverService::setResolverConfiguration(0, [10.74.210.210, 10.74.210.211], [], 1800, 25, 8, 64, 0, 0, [10.74.210.210, 10.74.210.211], []) -> (0) (0.2ms)
06-24 23:30:19.923 setResolverConfiguration() <0.36ms>
06-24 23:30:19.931 idletimerAddInterface("rmnet_data0", 10, "0") <0.02ms>
06-24 23:30:19.933 networkSetDefault(0) <0.01ms>
06-24 23:30:19.933 setTcpRWmemorySize("4093 26280 70800", "4096 16384 70800") <0.01ms>
06-24 23:30:19.949 bandwidthSetInterfaceQuota("rmnet_data0", 9223372036854775807) <0.02ms>
06-24 23:30:19.954 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-24 23:30:19.970 bandwidthRemoveInterfaceQuota("rmnet_data0") <0.02ms>
06-24 23:30:19.971 bandwidthSetInterfaceQuota("rmnet_data0", 9223372036854775807) <0.01ms>
06-24 23:30:19.972 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-24 23:30:22.430 DnsResolverService::setResolverConfiguration(0, [10.74.210.210, 10.74.210.211], [], 1800, 25, 8, 64, 0, 0, [10.74.210.210, 10.74.210.211], []) -> (0) (0.8ms)
06-24 23:30:22.430 setResolverConfiguration() <1.00ms>
06-24 23:48:43.988 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 00:00:08.058 NetdNativeService::interfaceGetCfg(wlan0) -> ({wlan0, 3c:01:ef:95:ec:7a, 192.168.178.61, 24, up, broadcast, running, multicast}) (0.3ms)
06-25 00:00:08.058 interfaceGetCfg("wlan0") <0.45ms>
06-25 00:00:08.059 interfaceClearAddrs("wlan0") <0.03ms>
06-25 00:00:08.060 interfaceSetIPv6PrivacyExtensions("wlan0", "true") <0.01ms>
06-25 00:00:08.061 interfaceSetEnableIPv6("wlan0", "false") <0.01ms>
06-25 00:00:08.075 NetdNativeService::interfaceGetCfg(wlan0) -> ({wlan0, 3c:01:ef:95:ec:7a, 192.168.178.61, 24, up, broadcast, running, multicast}) (0.2ms)
06-25 00:00:08.075 interfaceGetCfg("wlan0") <0.28ms>
06-25 00:00:08.080 interfaceSetEnableIPv6("wlan0", "false") <0.02ms>
06-25 00:00:08.080 interfaceClearAddrs("wlan0") <0.02ms>
06-25 00:00:08.295 networkAddInterface(0, "wlan0") <0.03ms>
06-25 00:00:08.298 DnsResolverService::setResolverConfiguration(0, [192.168.178.1], [], 1800, 25, 8, 64, 0, 0, [192.168.178.1], []) -> (0) (0.4ms)
06-25 00:00:08.298 setResolverConfiguration() <0.57ms>
06-25 00:00:08.324 networkCreatePhysical(0, 0) <0.02ms>
06-25 00:00:08.325 createNetworkCache(0) -> ServiceSpecificException(17, "File exists") <0.28ms>
06-25 00:26:13.866 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 00:48:13.891 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 01:18:13.839 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 01:48:43.866 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 02:18:12.786 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 02:48:44.086 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 03:18:12.735 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 03:48:12.680 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 04:18:13.665 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 04:48:13.644 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 05:18:13.646 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 05:48:12.683 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 06:26:13.598 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 06:48:12.435 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 07:18:13.559 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 07:48:13.567 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 08:18:13.524 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 08:48:13.513 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 09:26:14.485 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 09:48:43.936 bandwidthSetGlobalAlert(2097152) <0.01ms>
06-25 10:00:11.179 idletimerRemoveInterface("rmnet_data0", 10, "0") <0.02ms>
06-25 10:00:11.187 networkDestroy(0) <0.01ms>
06-25 10:00:11.188 destroyNetworkCache(0) <0.08ms>
06-25 10:00:11.225 bandwidthRemoveInterfaceQuota("rmnet_data0") <0.08ms>
06-25 10:00:11.228 bandwidthSetGlobalAlert(2097152) <0.01ms>
06-25 10:14:24.929 isAlive() -> {"true"} <0.01ms>
06-25 10:14:28.082 isAlive() -> {"true"} <0.01ms>
06-25 10:18:13.461 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 10:48:13.431 bandwidthSetGlobalAlert(2097152) <0.02ms>
06-25 11:18:13.427 bandwidthSetGlobalAlert(2097152) <0.02ms>
These are many log entries but to sum it up the following things happend to Android network connection:
- 23:17: netd was restarted after rebooting the phone
- 23:18: “wlan0” was added as network interface 0 and set as default network
- 23:30: wlan was turned off and the new mobile network interface “rmnet_data0” was setup including routing and new name servers
- 00:00: wlan was turned on again and DNS entries switched to access points DNS IP.
netd
DNS cache was not cleared and creation of a new cache failed. The routing through the mobile network was also not changed back to the wifi network. - 10:00: Wake lock for mobile connection timed out and removed default interface 0 and destroyed DNS cache.
As you can see there are many problems here. When doing the WLAN → mobile network → WLAN handover mobile routing is not removed. Also when switching back to WLAN the network connection is not destroyed and initialized properly so all packets are still send via mobile network instead of WLAN.
And last but not least the wakelock destroys the default network and clears netd
DNS cache but keeps the mobile network routing so I was able to ping my server using the IP address.
So I hope this helps Jolla devs to investigate into this matter.