Ofonod cpu guzzling

When you experience those drains open a terminal and type top. I bet you will see offonod on top. :wink:

I too had ofono taking 100% of a cpu core this morning. Mobile was in flight mode, with wifi enabled. Don’t know when it got triggered. Restarting ofono fixed it. However, as another data point or noise: I noticed sensorfwd taking a few percent CPU time, more than normal (and more than fingerterm). That I’ve seen before. Restarting sensorfwd fixed that as well, but I noticed it took a while. journalctl showed that the restart command failed, and that the process was SIGKILLed after a 15 seconds.

I had had the impression that it was related to that as well, so as a test I’ve stuck to 4g for a while and kept wifi completely turned off.

This morning I saw ofonod was at it again, eating battery like a hungry hippo. I think it must have happened when I switched off flight mode in the morning. It was fixed by restarting ofonod.service as usual.

Lately i feel like i’ve been “fighting” with ofono more than i use the phone itself. :neutral_face:

I am this :pinching_hand: close to going on a rant. :frowning_face:

Never had the problem myself, but wouldn’t it be better to first collect some evidence?

If you have gdb installed,

          # gdb -p `pidof ofonod`
          (gdb) generate-core-file

will dump a core file containing a lot of possibly useful information. Another option is to attach strace, e.g.:

         # strace -f -p `pidof ofonod`

This will show either almost nothing (i.e. ofono or at least one thread hangs in some computation) or the deamon’s interaction with the kernel.

2 Likes

Well, that was a good idea. It seems ofonod starts its CPU munching when a file descriptor is closed and it just keeps trying to read from it:

[pid 30013] poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=11, events=POLLIN}, {fd=22, events=0}], 6, 2240) = 1 ([{fd=22, revents=POLLNVAL}])
[pid 30013] clock_gettime(CLOCK_MONOTONIC, {tv_sec=193881, tv_nsec=760733526}) = 0
[pid 30013] clock_gettime(CLOCK_MONOTONIC, {tv_sec=193881, tv_nsec=760828622}) = 0

These lines repeat over and over very, very quickly. epoll tells us that the file descriptor is invalid and indeed, when one checks the file descriptors for the running process:

nemo@Sailfish ~ [1]> sudo ls -l /proc/(pidof ofonod)/fd
total 0
lr-x------    1 root     root            64 May  5 22:00 0 -> /dev/null
lrwx------    1 root     root            64 May  5 22:00 1 -> socket:[11282575]
l-wx------    1 root     root            64 May  5 22:00 10 -> pipe:[11281456]
lrwx------    1 root     root            64 May  5 22:00 11 -> socket:[11283459]
lrwx------    1 root     root            64 May  5 22:00 12 -> anon_inode:[eventfd]
lrwx------    1 root     root            64 May  5 22:00 13 -> socket:[11261401]
lrwx------    1 root     root            64 May  5 22:00 14 -> anon_inode:[eventfd]
lrwx------    1 root     root            64 May  5 22:00 15 -> /dev/hwbinder
lr-x------    1 root     root            64 May  5 22:00 16 -> pipe:[11281467]
l-wx------    1 root     root            64 May  5 22:00 17 -> pipe:[11281467]
l-wx------    1 root     root            64 May  5 22:00 2 -> /dev/null
lrwx------    1 root     root            64 May  5 22:00 3 -> anon_inode:[eventfd]
lrwx------    1 root     root            64 May  5 22:00 4 -> anon_inode:[signalfd]
lrwx------    1 root     root            64 May  5 22:00 5 -> socket:[11261388]
lrwx------    1 root     root            64 May  5 22:00 6 -> socket:[11261389]
lr-x------    1 root     root            64 May  5 22:00 7 -> anon_inode:inotify
lrwx------    1 root     root            64 May  5 22:00 8 -> /dev/hwpuddlejumper
lr-x------    1 root     root            64 May  5 22:00 9 -> pipe:[11281456]

After restarting ofonod, the open file descriptors look quite similar. The socket and pipe ids are different, of course, but the number of open sockets is the same, and they are of the same type. I don’t see any socket that wasn’t there before restarting ofonod.

1 Like

@flypig (or anyone else from jolla) can you tell us if there is a bug in the internal bugtracker for this issue??

1 Like

Thanks for asking @ApB. I just checked now and I don’t see any open bugs related to ofono power consumption, and I don’t see any internal reports that directly reference this thread.

It doesn’t follow there aren’t bugs about this issue I’m afraid since we only have symptoms here; you’d need someone with a better understanding of ofono (e.g. @slava) to check.

Maybe it’s also worth going through the upstream changes to see if there’s anything related there. I skimmed through them, as well as our Sailfish-specific changes, but didn’t see anything obvious. Again, with my level of understanding it’s quite possible I missed something though.

We now have an internal bug report covering this, so I’ve tagged it as “tracked”.

4 Likes

Can someone test the below in case it triggers it for them also.

Use bluetooth earphones and listen to music through the default media player.
Close the media player.
The BT icon (on top) will disappear and reappear again
Check top for cpu usage.

Yes, it does same for me (10iii). After stopping media player ‘top’ shows ofonod using ~12% cpu. Tried couple of times with media player, always happened. Native browser+youtube or ‘nettiradio’ did not cause any problems.

During last night (6 hours) battery dropped from 70% to 9%, And ofonod was using that ~12% cpu, Phone was noticeably warm compared to XA2 beside it, so it has been clearly doing something.

Deadbeef-Player seems not to be affected by this fault. Since i use it instead of Mediaplayer, the ofonod-cpu-eating seems to be gone.

1 Like

Thanks for the test people. It might not be the only way of making it guzzle the CPU but at least we know of one way and its reproducible.

@piggz put it in the internal bug tracker if you please. :slightly_smiling_face:

I found another way to reliably reproduce this on X10III (edit: I did these steps three times with ofono restart in between, happened every time):

  • Turn on Bluetooth headset
  • The headset auto-connects to the phone
  • Listen to some music (I used Nettiradio)
  • Stop the music (I used the app cover)
  • Go to Bluetooth settings
  • Click the device name to disconnect it
  • Ofono starts to consume one CPU core

No wonder I ran into this daily with my new headphones which I use daily: At work I disconnect if from the phone and use the PC to establish the connection - battery drainage every day…

One thing worth noting is that restarting telephony using Sailfish Utilities does not help, only restarting ofono systemd service.

PS. I use ToeTerm, and I created a quick command for restarting ofono (sans password), it’s been quite handy actually.

2 Likes

All we have to do now is wait for a fix. We can reliably reproduce it. :slightly_smiling_face:

Hope it comes soon.

Interestingly it happens only when actively disconnecting the headset.
I used my Bose Quiet Comfort earbuds many times without any issues, but I rarely disconnected them. There’s no problem when I just shut them down by putting back in their case. Also when I just disable BT completely.
Thanks for finding out.

1 Like

As a side note, it does not happen if I just turn Bluetooth off. So it seems to be the Bluetooth disconnection procedure that makes this happen.

1 Like

For me it sometimes happens when my PineTime watch crashes and reboot (which seems to happen on average once every two days). After that it refuses to reconnect until I remove the pairing on the phone. At this time the phone warms up and ofonod goes into an endless loop polling an invalid file descriptor.

2 Likes

Hi everyone,

not quite sure if it’s ofono related, but I will try to explain what I have observed on my XA2.

Some time ago i realised, that the battery drainage vary during night time. So I installed SysMon to follow the CPU utilisation in a visual way. What I have seen is days and night times with high utilisation followed by days and night times with low utilisation. Strange think for me: This behaviour survived reboots and i have no clue what was causing this.

Yesterday I have seen this: I was on a low average utilisation the last days. After a missed call, the utilisation rose and stayed over night on a high position. This morning I got a call and answered it. After that the utilisation decreased again.

Unfortunately I don’t have logs for this. What would be a good way to investigate it further?

That’s likely the voicecall-ui bug. In some cases after a phonecall it keeps using CPU doing nothing. See [3.4.0.22] Voicecall-ui uses CPU on the background - #13 by KuroNeko

1 Like