Android Connectivity Issues Analysis

The Android support layer has some serious issues with network connectivity that makes e.g. applications dependent on push notification based services a gambling game. Here is an example usecase and the hope that Jolla devs can investigate into this issue any further as many users might be affected by this.

In my case I started using SchildiChat (a Matrix-based chat client) together with ntfy.sh (a unified push provider) as a replacement for the legacy XMPP protocol on my daily driver (Xperia 10 Mark 2).

Both applications will run as background processes once SailfishOS and aliendalvik container has started to receive push notifications.

Multiple times since I used this setup I noticed that notifications get lost because of missing network connectivity. So I started digging around as I’ve lost network connection once again after the phone was suspended but connected to wifi network, woken up to check that SchildiChat reports no connection issues, suspended again and when I returned 30 minutes later to check for new messages SchildiChat told me about a broken connection.

First the issue is on the Android-side of things so we have to enter console as root and switch into the lxc container.

,---
| Sailfish OS 4.4.0.64 (Vanha Rauma)
'---
[defaultuser@Xperia10II-DualSIM ~]$ devel-su
Password:
[root@Xperia10II-DualSIM defaultuser]# lxc-attach -n aliendalvik
lxc-attach: aliendalvik: terminal.c: lxc_terminal_create_native: 927 Not a directory - Failed to open terminal multiplexer device
#

So now I’m in Android world.

I checked if network connection is up and running by pinging my home server.

# ping 45.142.179.56
PING 45.142.179.56 (45.142.179.56) 56(84) bytes of data.
64 bytes from 45.142.179.56: icmp_seq=1 ttl=63 time=29.9 ms
64 bytes from 45.142.179.56: icmp_seq=2 ttl=63 time=49.4 ms
64 bytes from 45.142.179.56: icmp_seq=3 ttl=63 time=50.2 ms

Network connection on the physical layer seems to work as my server received the ICMP packets.

Next I tried name resolution.

# ping noordsee.de
ping: unknown host noordsee.de

Okay there is the culprit. Network is fine but DNS is broken.

Let’s check the Android communication layer using the dumpsys command.

# dumpsys connectivity
NetworkFactories for: PhoneSwitcherNetworkRequstListener WifiNetworkFactory TelephonyNetworkFactory[1] TelephonyNetworkFactory[0] UntrustedWifiNetworkFactory

Active default network: none

Current Networks:
  NetworkAgentInfo{ ni{[type: WIFI[], state: CONNECTED/CONNECTED, reason: (unspecified), extra: (none), failover: false, available: true, roaming: false]}  network{0}  nethandle{0}  lp{{InterfaceName: wlan0 LinkAddresses: [ ] DnsAddresses:
[ /192.168.178.1 ] Domains: null MTU: 0 Routes: [ ]}}  nc{[ Transports: WIFI Capabilities: NOT_METERED&INTERNET&NOT_RESTRICTED&TRUSTED&NOT_VPN&NOT_ROAMING&FOREGROUND&NOT_CONGESTED&NOT_SUSPENDED LinkUpBandwidth>=1048576Kbps LinkDnBandwidth>=
1048576Kbps Specifier: <WifiNetworkAgentSpecifier [WifiConfiguration=, SSID="Klaus", BSSID=1c:11:1f:91:e1:1a, mOriginalRequestorUid=-1, mOriginalRequestorPackageName=]> SignalStrength: -67 SSID: "Klaus"]}  Score{20}  everValid
ated{false}  lastValidated{false}  created{false} lingering{false} explicitlySelected{true} acceptUnvalidated{false} everCaptivePortalDetected{false} lastCaptivePortalDetected{false} captivePortalValidationPending{false} partialConnectivity
{false} acceptPartialConnectivity{false} clat{mBaseIface: null, mIface: null, mState: IDLE} }
    Requests: REQUEST:0 LISTEN:0 BACKGROUND_REQUEST:0 total:0
    Lingered:

Restrict background: false

Well this is interesting. The wifi access point is connected and DNS configured but the active default network is gone (none).

dumpsys connectivity told me also a few lines later that…

NetTransition WakeLock activity (most recent first):
  total acquisitions: 2
  total releases: 2
  cumulative duration: 187s
  longest duration: 185s
  2022-06-25T10:03:16.743 - RELEASE (EVENT_EXPIRE_NET_TRANSITION_WAKELOCK)
  2022-06-25T10:00:11.186 - ACQUIRE for NetworkAgentInfo [MOBILE (EDGE) - 0]
  2022-06-24T23:30:20.984 - RELEASE (EVENT_CLEAR_NET_TRANSITION_WAKELOCK)
  2022-06-24T23:30:18.848 - ACQUIRE for NetworkAgentInfo [WIFI () - 0]

… at 10am the phone somehow tried to acquire a mobile network agent info lock which expired. This must be the time within my 30 minutes between working network and failing network where the actual issue occured.

Now all DNS and network stuff is handled by netd service that receives IPCs via binder or ncd dispatcher.

# dumpsys netd
ppid:1 -> pid:38 -> tid:111
total running time: 0d3h32m51s614ms (12771614ms)
user: 0s191ms sys: 2s967ms
maxrss: 7924kB


  NetworkController
    Default network: 0

    Networks:
      51 DUMMY dummy0

      99 LOCAL
...
...
...
  Log:
    06-24 23:17:13.651 netd 1.0 starting
    06-24 23:17:14.102 Registering NetdNativeService: 0.3ms
    06-24 23:17:14.103 Netd started in 451ms
    06-24 23:17:49.774 registerUnsolicitedEventListener() <0.03ms>
    06-24 23:18:12.304 firewallSetFirewallType(1) <22.32ms>
    06-24 23:18:12.335 isAlive() -> {"true"} <0.01ms>
    06-24 23:18:12.404 bandwidthSetGlobalAlert(2097152) <0.04ms>
    06-24 23:18:12.415 networkSetPermissionForUser(1, []) <0.02ms>
    06-24 23:18:12.416 networkSetPermissionForUser(2, [2000,10034,10035,1027,10037,10006,10038,1000,1001,10030]) <0.02ms>
    06-24 23:18:12.417 trafficSetNetPermForUids(12, [1001,1013,10006]) <0.02ms>
    06-24 23:18:12.417 trafficSetNetPermForUids(4, [1000,1027,2000,10001,10002,10012,10014,10027,10030,10036,10037,10038,10044,10045]) <0.02ms>
    06-24 23:18:12.418 trafficSetNetPermForUids(8, [1041,1047]) <0.02ms>
    06-24 23:18:12.419 trafficSetNetPermForUids(0, [1003,1063,1066,1067,1068,1073,10000,10003,10004,10005,10007,10008,10009,10010,10013,10015,10016,10017,10018,10019,10021,10022,10023,10024,10025,10026,10028,10032,10033,10034,10035]) <0.02m
s>
    06-24 23:18:12.552 firewallReplaceUidChain("fw_standby", "false", []) -> {"true"} <0.02ms>
    06-24 23:18:30.851 registerUnsolicitedEventListener() <0.02ms>
    06-24 23:18:43.981 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-24 23:18:45.151 bandwidthAddNiceApp(10006) <0.02ms>
    06-24 23:18:46.384 NetdNativeService::interfaceGetCfg(wlan0) -> ({wlan0, 3c:01:ef:95:ec:7a, 192.168.178.61, 24, up, broadcast, running, multicast}) (4ms)
    06-24 23:18:46.384 interfaceGetCfg("wlan0") <3.94ms>
    06-24 23:18:46.386 interfaceClearAddrs("wlan0") <0.02ms>
    06-24 23:18:46.386 interfaceSetIPv6PrivacyExtensions("wlan0", "true") <0.01ms>
    06-24 23:18:46.386 interfaceSetEnableIPv6("wlan0", "false") <0.01ms>
    06-24 23:18:46.400 NetdNativeService::interfaceGetCfg(wlan0) -> ({wlan0, 3c:01:ef:95:ec:7a, 192.168.178.61, 24, up, broadcast, running, multicast}) (0.2ms)
    06-24 23:18:46.400 interfaceGetCfg("wlan0") <0.33ms>
    06-24 23:18:46.417 interfaceSetEnableIPv6("wlan0", "false") <0.02ms>
    06-24 23:18:46.417 interfaceClearAddrs("wlan0") <0.01ms>
    06-24 23:18:46.475 bandwidthSetGlobalAlert(2097152) <0.01ms>
    06-24 23:18:46.526 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-24 23:18:46.596 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-24 23:18:46.610 networkAddInterface(0, "wlan0") <0.02ms>
    06-24 23:18:46.613 DnsResolverService::setResolverConfiguration(0, [192.168.178.1], [], 1800, 25, 8, 64, 0, 0, [], []) -> (-64) (0.2ms)
    06-24 23:18:46.613 setResolverConfiguration() -> ServiceSpecificException(64, "Machine is not on the network") <0.49ms>
    06-24 23:18:46.639 networkCreatePhysical(0, 0) <0.02ms>
    06-24 23:18:46.640 createNetworkCache(0) <0.09ms>
    06-24 23:18:46.646 DnsResolverService::setResolverConfiguration(0, [192.168.178.1], [], 1800, 25, 8, 64, 0, 0, [192.168.178.1], []) -> (0) (0.8ms)
    06-24 23:18:46.646 setResolverConfiguration() <1.00ms>
    06-24 23:18:46.648 networkAddInterface(0, "wlan0") <0.02ms>
    06-24 23:18:46.650 DnsResolverService::setResolverConfiguration(0, [192.168.178.1], [], 1800, 25, 8, 64, 0, 0, [192.168.178.1], []) -> (0) (0.1ms)
    06-24 23:18:46.650 setResolverConfiguration() <0.23ms>
    06-24 23:18:46.663 idletimerAddInterface("wlan0", 15, "1") <0.03ms>
    06-24 23:18:46.665 networkSetDefault(0) <0.05ms>
    06-24 23:18:46.665 setTcpRWmemorySize("4096 87380 110208", "4096 16384 110208") <0.01ms>
    06-24 23:18:46.678 bandwidthSetGlobalAlert(2097152) <0.01ms>
    06-24 23:18:46.679 bandwidthSetGlobalAlert(2097152) <0.01ms>
    06-24 23:18:46.979 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-24 23:18:46.988 DnsResolverService::setResolverConfiguration(0, [192.168.178.1], [], 1800, 25, 8, 64, 0, 0, [192.168.178.1], []) -> (0) (0.3ms)
    06-24 23:18:46.989 setResolverConfiguration() <0.52ms>
    06-24 23:18:47.020 bandwidthSetGlobalAlert(2097152) <0.01ms>
    06-24 23:18:47.043 bandwidthSetGlobalAlert(2097152) <0.01ms>
    06-24 23:18:47.067 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-24 23:18:47.092 bandwidthSetGlobalAlert(2097152) <0.01ms>
    06-24 23:18:47.135 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-24 23:18:47.227 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-24 23:18:47.271 bandwidthSetGlobalAlert(2097152) <0.01ms>
    06-24 23:18:47.646 bandwidthSetGlobalAlert(2097152) <0.01ms>
    06-24 23:18:47.696 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-24 23:18:47.764 bandwidthSetGlobalAlert(2097152) <0.01ms>
    06-24 23:19:47.177 bandwidthAddNiceApp(10045) <0.02ms>
    06-24 23:30:18.841 idletimerRemoveInterface("wlan0", 15, "1") <0.01ms>
    06-24 23:30:18.850 networkDestroy(0) <0.02ms>
    06-24 23:30:18.851 destroyNetworkCache(0) <0.05ms>
    06-24 23:30:18.906 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-24 23:30:19.915 networkCreatePhysical(0, 0) <0.02ms>
    06-24 23:30:19.915 createNetworkCache(0) <0.06ms>
    06-24 23:30:19.918 DnsResolverService::setResolverConfiguration(0, [10.74.210.210, 10.74.210.211], [], 1800, 25, 8, 64, 0, 0, [10.74.210.210, 10.74.210.211], []) -> (0) (0.6ms)
    06-24 23:30:19.918 setResolverConfiguration() <0.71ms>
    06-24 23:30:19.919 networkAddInterface(0, "rmnet_data0") <0.01ms>
    06-24 23:30:19.920 interfaceSetMtu("rmnet_data0", 1500) <0.01ms>
    06-24 23:30:19.921 networkAddRoute(0, "rmnet_data0", "10.75.166.0/26", "") <0.01ms>
    06-24 23:30:19.921 networkAddRoute(0, "rmnet_data0", "0.0.0.0/0", "10.75.166.32") <0.01ms>
    06-24 23:30:19.923 DnsResolverService::setResolverConfiguration(0, [10.74.210.210, 10.74.210.211], [], 1800, 25, 8, 64, 0, 0, [10.74.210.210, 10.74.210.211], []) -> (0) (0.2ms)
    06-24 23:30:19.923 setResolverConfiguration() <0.36ms>
    06-24 23:30:19.931 idletimerAddInterface("rmnet_data0", 10, "0") <0.02ms>
    06-24 23:30:19.933 networkSetDefault(0) <0.01ms>
    06-24 23:30:19.933 setTcpRWmemorySize("4093 26280 70800", "4096 16384 70800") <0.01ms>
    06-24 23:30:19.949 bandwidthSetInterfaceQuota("rmnet_data0", 9223372036854775807) <0.02ms>
    06-24 23:30:19.954 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-24 23:30:19.970 bandwidthRemoveInterfaceQuota("rmnet_data0") <0.02ms>
    06-24 23:30:19.971 bandwidthSetInterfaceQuota("rmnet_data0", 9223372036854775807) <0.01ms>
    06-24 23:30:19.972 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-24 23:30:22.430 DnsResolverService::setResolverConfiguration(0, [10.74.210.210, 10.74.210.211], [], 1800, 25, 8, 64, 0, 0, [10.74.210.210, 10.74.210.211], []) -> (0) (0.8ms)
    06-24 23:30:22.430 setResolverConfiguration() <1.00ms>
    06-24 23:48:43.988 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 00:00:08.058 NetdNativeService::interfaceGetCfg(wlan0) -> ({wlan0, 3c:01:ef:95:ec:7a, 192.168.178.61, 24, up, broadcast, running, multicast}) (0.3ms)
    06-25 00:00:08.058 interfaceGetCfg("wlan0") <0.45ms>
    06-25 00:00:08.059 interfaceClearAddrs("wlan0") <0.03ms>
    06-25 00:00:08.060 interfaceSetIPv6PrivacyExtensions("wlan0", "true") <0.01ms>
    06-25 00:00:08.061 interfaceSetEnableIPv6("wlan0", "false") <0.01ms>
    06-25 00:00:08.075 NetdNativeService::interfaceGetCfg(wlan0) -> ({wlan0, 3c:01:ef:95:ec:7a, 192.168.178.61, 24, up, broadcast, running, multicast}) (0.2ms)
    06-25 00:00:08.075 interfaceGetCfg("wlan0") <0.28ms>
    06-25 00:00:08.080 interfaceSetEnableIPv6("wlan0", "false") <0.02ms>
    06-25 00:00:08.080 interfaceClearAddrs("wlan0") <0.02ms>
    06-25 00:00:08.295 networkAddInterface(0, "wlan0") <0.03ms>
    06-25 00:00:08.298 DnsResolverService::setResolverConfiguration(0, [192.168.178.1], [], 1800, 25, 8, 64, 0, 0, [192.168.178.1], []) -> (0) (0.4ms)
    06-25 00:00:08.298 setResolverConfiguration() <0.57ms>
    06-25 00:00:08.324 networkCreatePhysical(0, 0) <0.02ms>
    06-25 00:00:08.325 createNetworkCache(0) -> ServiceSpecificException(17, "File exists") <0.28ms>
    06-25 00:26:13.866 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 00:48:13.891 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 01:18:13.839 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 01:48:43.866 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 02:18:12.786 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 02:48:44.086 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 03:18:12.735 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 03:48:12.680 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 04:18:13.665 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 04:48:13.644 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 05:18:13.646 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 05:48:12.683 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 06:26:13.598 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 06:48:12.435 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 07:18:13.559 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 07:48:13.567 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 08:18:13.524 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 08:48:13.513 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 09:26:14.485 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 09:48:43.936 bandwidthSetGlobalAlert(2097152) <0.01ms>
    06-25 10:00:11.179 idletimerRemoveInterface("rmnet_data0", 10, "0") <0.02ms>
    06-25 10:00:11.187 networkDestroy(0) <0.01ms>
    06-25 10:00:11.188 destroyNetworkCache(0) <0.08ms>
    06-25 10:00:11.225 bandwidthRemoveInterfaceQuota("rmnet_data0") <0.08ms>
    06-25 10:00:11.228 bandwidthSetGlobalAlert(2097152) <0.01ms>
    06-25 10:14:24.929 isAlive() -> {"true"} <0.01ms>
    06-25 10:14:28.082 isAlive() -> {"true"} <0.01ms>
    06-25 10:18:13.461 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 10:48:13.431 bandwidthSetGlobalAlert(2097152) <0.02ms>
    06-25 11:18:13.427 bandwidthSetGlobalAlert(2097152) <0.02ms>

These are many log entries but to sum it up the following things happend to Android network connection:

  • 23:17: netd was restarted after rebooting the phone
  • 23:18: “wlan0” was added as network interface 0 and set as default network
  • 23:30: wlan was turned off and the new mobile network interface “rmnet_data0” was setup including routing and new name servers
  • 00:00: wlan was turned on again and DNS entries switched to access points DNS IP. netd DNS cache was not cleared and creation of a new cache failed. The routing through the mobile network was also not changed back to the wifi network.
  • 10:00: Wake lock for mobile connection timed out and removed default interface 0 and destroyed DNS cache.

As you can see there are many problems here. When doing the WLAN → mobile network → WLAN handover mobile routing is not removed. Also when switching back to WLAN the network connection is not destroyed and initialized properly so all packets are still send via mobile network instead of WLAN.
And last but not least the wakelock destroys the default network and clears netd DNS cache but keeps the mobile network routing so I was able to ping my server using the IP address.

So I hope this helps Jolla devs to investigate into this matter.

23 Likes

Hello!
Also I am not an expert to help, my experience is the same. Loosing connectivity for whatsapp multiple times a day is the annoying behavior since latest releases.
Maybe better to put it as a bug report in the correct category?

Thanks for your investigation!

There had been multiple bug reports related:

1 Like

Thanks for your investigation. Personally, on Xperia 10 iii I do have that exact same issue even in some native sailfish apps. For instance, the weather auto update from the event view doesn’t work in such situation. And my email are not gotten as expected, meaning always up to date (but searching manually for new emails work)
That’s why, for now, I’m keeping wifi off full time …

Thanks for this analysis. We were already aware of a mixing of network details being sent through to Android, and this helps to clarify that.

3 Likes

Just made an experiment

Yes, same problem occurs for the 2 accounts.

I’ve switched from mobile data connection to WLAN. Then, I can still use the native sailfish browser, but not Aurora Store for instance. My exchange account was not able to sync either.
Disabling WLAN, I had to switch mobile data OFF then ON to make the connection to be fully stable and then, so that Aurora store and my email work again !

I’ve collected eas-sailfish.log and qmf.log and sent them out to Jolla via zendesk.

Thanks for the lxc-attach -n aliendalvik hint.

Situation here: switched from mobile to Wifi - Android apps have no connectivity

  • Sailfish: Native browser works, ping ipv4 and ipv6 hosts work, name resolution via ping also working
  • Android: ping ipv4 and ipv6 is working, name resolution does NOT work

When going to Tools → restart network → Android apps work again.

I did not check routing/dns server info for android - there is to much output for me and I don’t know where to look…

1 Like

Although the other thread has been closed, I still have this problem in 4.5 (XA2). I’ve only noticed it affecting Opera but it might affect Aurora Store as well.

Sometimes, it happens also to me with the last SFOS 4.5.0.19 on Xperia 10 II using a VPN based on UDP with the default OpenVPN client.

I confirm that restarting network is a workaround that works, in my case.

IMHO this¹ has been closed too fast not considering all the possible cases or we are seen a regression.

¹ Android subsystem loses connectivity when switching networks

1 Like

4.5.0.19 X10 II: I just noticed after network switches I couldn’t get Android mobile data to work, until I disabled empty SIM2 slot, which I had enabled for easy «SIM1-only airplane mode» for recent travels.

Utilities » Restart network wouldn’t fix it.

2 Likes

I just noticed that I had the same issue on X10 III, checked SIM2 slot and found it was indeed enabled.
I will update if the issue persists (or if it does not happen for a while).

1 Like

In my case a smartphone reboot solved the problem. Others people can follow the Quick Start Guide up to the point in which the dnsmaq or Alternative DNS and safe DNS settings are configured.

I have a sim in 1 and 2 (x10 III), disabled 2 and had very less connection issues with android apps. Mobile connection was via slot 1 all the time.

Today, I wrote this Quick First Setup Guide which give instruction about dealing with dual-sim smartphones in order to avoid - as much as possible - connectivity problems.

UPDATE

Moreover, having such guide allows everyone involved in supporting the hardware to have a standard starting point in order to reduce - as much as possible - the interference caused to a different setup procedure.

This is about the GPS availability for SFOS native apps and does not belong specifically to the reply to @jollausar.