Harbour-defender blacklisted in SFOS 3.4.0.24

DrYak · 27 October 2020 13:57

TL;DR: could a Jolla Dev jump in and explain the bugs that cause harbour-defender <=0.4.0 to get blacklisted, so the community can fix it?

REPRODUCIBILITY (% or how often): Always
BUILD ID = OS VERSION (Settings > About product): 3.4.0.24
HARDWARE (Jolla1, Tablet, XA2,…): XA2 (but is not hardware-dependent)
UI LANGUAGE:
REGRESSION: (compared to previous public release: Yes, No, ?): Yes, it’s new in Pallas.

DESCRIPTION:

Lots of people hate having ads taking up their (limited) mobile bandwidth and overall degrading the browsing experience.
A possible way to handle this is through the hosts file.

Defender by nodevel is one such application: it offers a GUI to select well-known blacklists and automatically update the hosts file from them.
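
As a rough illustration of the hosts-file approach (this is not Defender's actual code): each blocked domain is mapped to a non-routable address, so a lookup for it never reaches the ad server. The function name and the sample domains are made up for the example.

```python
def render_hosts_block(domains, sink="0.0.0.0"):
    """Return hosts-file lines that point every blocked domain at `sink`."""
    # Deduplicate and sort so the generated block is stable between runs.
    unique = sorted({d.strip().lower() for d in domains if d.strip()})
    return "\n".join(f"{sink} {d}" for d in unique)


# Hypothetical blocklist entries, for illustration only.
SAMPLE = ["ads.example.com", "Tracker.example.net", "ads.example.com"]
print(render_hosts_block(SAMPLE))
```

Appending such a block to /etc/hosts is what makes the blocking system-wide, which is also why the tool needs root-owned helpers in the first place.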

For some reason, since SFOS 3.4.0.24 Pallas, harbour-defender has been added to patterns-sailfish-app-blacklist. It would be great if some Jolla Dev could jump in and explain which bug inside harbour-defender makes it incompatible and worthy of a blacklist, so we can fix it.

PRECONDITIONS:

SFOS 3.4.0.24

STEPS TO REPRODUCE:

Try installing harbour-defender from openrepos.net

EXPECTED RESULT:

Harbour-defender gets installed and we can use it to block ads.

ACTUAL RESULT:

patterns-sailfish-app-blacklist blocks installation of harbour-defender and ads are still inflicted upon us and our bandwidth.

ADDITIONAL INFORMATION:

harbour-defender’s source code is available on Github, so it should be possible to fix whatever trouble caused it to be put on the blacklist.

patoll · 27 October 2020 15:38

I believe the app was preventing a proper start of the OS from 3.2 onwards, at least on Xperia 10. See for instance https://together.jolla.com/question/217403/defender-app-stops-xperia-10-from-booting/
I was using it before and have not yet found a replacement, so if anyone can adapt it to recent SFOS releases, that would be great.

nephros · 27 October 2020 16:02

As patoll said, there’s a good reason for that “blacklisting” (actually it’s just warnings in both the Release Notes and the pre-upgrade checker that the user is free to ignore and then enjoy their broken system).

The issue has been known for a while, plenty of users were affected in the past.

However, I’m not sure the actual underlying issue has been identified. (My suspicion is that it’s a simple matter of systemd dependencies, and relying on things existing in /home/nemo which might not be accessible while the device boots.)

And to my knowledge nothing has happened from the developer’s side after the issue had been reported (after all, the last release of the app is from 2017, and the github history also reflects that), and problems have been seen since SFOS 3.2 or at least 3.3 which came out half a year ago.

So, the community should

a) identify and document the actual issue
b) probably fork the app and fix it, as upstream appears dead.

DrYak · 27 October 2020 18:50

Thank you for the pointers! That’s very valuable!

Does anyone have a spare (non daily driver) SFOS device with full encryption turned on, in order to test which part of Defender causes the hang?

(My XA2 is my daily driver, my X currently needs Android so I can use the local covid tracking app in my country).

@peterleinchen has already pointed out the steps in this comment, basically:

turn off everything that is installed by defender (3 systemd units: path, service, timer).
try turning them on.

My hunch is that the .path unit (which requires the nemo user’s home path to be available) is somehow creating an unresolvable circular dependency loop on fully-encrypted devices.

It would be better to:

move the requirement into the .service which is started by the timer (or even implement a check in the python code).
rewrite the dependency of the .service as an “After=” pointing to the service in charge of unlocking.
rewrite other parts of the code to handle the new situation where the main user isn’t necessarily called ‘nemo’ anymore (i.e.: there are a few hard-coded “/home/nemo” instances which should be changed to environment variables, and the installer script should install the units as user session units).
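
The "check in the python code" idea and the switch from a hard-coded path to an environment variable could look roughly like the sketch below. It assumes the updater may simply skip a run when the home directory is not there yet. The directory name `harbour-defender` under `.config` and both function names are assumptions, not taken from Defender's source.

```python
import os
from pathlib import Path


def config_dir(env=os.environ):
    """Derive the config directory from $HOME instead of a hard-coded /home/nemo."""
    home = env.get("HOME")
    if not home:
        return None  # no usable $HOME, e.g. when started outside a user session
    return Path(home) / ".config" / "harbour-defender"  # assumed directory name


def ready_to_update(env=os.environ):
    """True only when the config directory is actually reachable right now."""
    cfg = config_dir(env)
    return cfg is not None and cfg.is_dir()
```

A timer-started updater could call `ready_to_update()` first and exit quietly when it returns False, instead of having a unit wait on the path.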

peterleinchen · 27 October 2020 20:13

@DrYak
Yes. That was my thinking also.
And the problem only came up for devices fully encrypted (either from the first boot or later).

And that could be solved with some after statements in service files as you said.
But I just read a bit into systemd lately and how should a simple service hinder the device on booting?
There is nothing that is dependent on one of these services. Neither would anything in the python code block.

So, even if some circular dependency would happen on one of the start services would this block the device booting? I could not find such problem and also would not believe that systemd is designed like that?

–
and I guess for the nemo/defaultuser it is sufficient to check for existence of directory /home/nemo and then use either nemo or defaultuser, right? (or using $HOME?)

–
should moving the unit files from /etc/systemd/system to $HOME/.config/systemd/user then not be sufficient as these services will definitely be started after decryption?
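
The existence check asked about above might be sketched as follows. The candidate order (defaultuser first, then nemo) is an assumption, and the `is_dir` parameter only exists so the logic can be tried without a device.

```python
import os


def pick_home(candidates=("/home/defaultuser", "/home/nemo"), is_dir=os.path.isdir):
    """Return the first candidate home directory that exists, or None."""
    for path in candidates:
        if is_dir(path):
            return path
    return None
```

On an encrypted device this returns None until /home has been unlocked and mounted, so a caller still has to handle that case.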

DrYak · 27 October 2020 22:49

(typing this on my smartphone . it’s comic that Jolla’s own forum doesn’t work properly on the native browser)

I think the way defender is currently set up it shows as a hard requirement for multi-user target . (that would be the login screen on a desktop ) But at that point in time, the user hasn’t logged in yet and thus isn’t decrypted yet.

Somehow the installation of systemd on Sailfish doesn’t detect this circularity.

The normal behaviour for a path is to wait until it pops up (e.g.: in case it is on some plug’n’play device that will enevenrually show up on the bus and automount that directory . e.g.: on a desktop that would be an old school spinning rust hdd that takes some rime to spin up). except here, it won’t happen because (as you need a login screen to ublock ho.e).

after a while this unit times out and thus the multi-,user target fails its hard requirements , and i suspect that on Sailfish the default behaviour is left as-is (which, for backward compatibility reasons is exactly the same as SysInitV: put a rpompt on the terminal asking the user to pupush Ctrl-D to continue and iignore, or type the rroot password to get a shell and fix/ivestigate) (this isn’t going to work on a smartphone that lacks both a terminal and a hardware keyboard ).

I am not sure, but there might be a timeout at this prompt and so the phone might boot after a ridiculous amount of time.

DrYak · 28 October 2020 08:44

[ Back on the desktop, got a few free minutes before the rush of work begins (> 500 SARS-CoV-2 genome sequencing to process). But I think I’ll leave the above post with all the typos in ]

So I had time to check: yes, indeed the .path unit is a hard requirement of multi-user. The problems seem to stem from here.

Anyway, the way things are organized will not work in modern Sailfish OS 3.4.0.24 anymore (defender expects /home/nemo/.config/ to be always available, which is definitely not true anymore, in light of both multi-user support and encryption).

The way it should be done:

configuration data should be moved outside of home so that it can work even with a still-locked home or a differently named main user. /var/lib/harbour-defender seems the most logical to me (could somebody confirm?).
installer should make that directory either writeable to members of the users group or (given how users seem to be envisioned according to the blog post about multi-users) writeable exclusively to the user-specific group with gid “100000”, no matter which group name (“nemo” or “defaultuser”).
GUI should be changed to check whether /var/lib/harbour-defender is writeable (guest users should not have admin rights) and to write inside that common directory.
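
The GUI-side check from the last point could be as small as the sketch below. The directory is the one proposed in this post (not something Defender uses today), and the function name is hypothetical.

```python
import os

DATA_DIR = "/var/lib/harbour-defender"  # location proposed in this post, unconfirmed


def can_manage_blocklists(data_dir=DATA_DIR):
    """True if the current user may create and replace files in the shared data directory."""
    return os.path.isdir(data_dir) and os.access(data_dir, os.W_OK | os.X_OK)
```

The GUI could grey out its update and edit controls when this returns False, which would cover guest users without any extra bookkeeping.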

bonus point:

somebody with more skills in python than C/C++ (thus preferably not me) should check if all the exceptions are gracefully handled in the defender_updater.py script, so it simply skips broken sources (and leaves a warning to be picked up by the GUI).
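
What "skip broken sources and leave a warning" could mean in practice is sketched below. This is not the structure of defender_updater.py, which has not been checked here. The function names and the `fetcher` parameter are assumptions for illustration.

```python
import urllib.request


def fetch(url, timeout=30):
    """Download one blocklist and return it as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def update_sources(urls, fetcher=fetch):
    """Fetch every source; collect failures as warnings instead of aborting."""
    lists, warnings = {}, []
    for url in urls:
        try:
            lists[url] = fetcher(url)
        except (OSError, ValueError) as exc:
            # URLError and socket timeouts derive from OSError;
            # a malformed URL raises ValueError.
            warnings.append(f"{url}: {exc}")
    return lists, warnings
```

The returned `warnings` list is what a GUI could display, while the lists that did download are still applied.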

[Sorry, I need to get back to work, there’s a pandemic waiting on me. I’ll see if I have time to help around this later]

peterleinchen · 28 October 2020 22:35

Definitely not a systemd specialist here.
But as far as I understand:

Often, it is a better choice to use Wants= instead of Requires= in order to achieve a system that is more robust when dealing with failing services.

from
https://www.freedesktop.org/software/systemd/man/systemd.unit.html
or

Wants/WantedBy is the soft dependency,
which may fail.
And Requires/RequiredBy is the hard dependency,
which lets the higher service fail if not all dependencies could get started.

So again does this sound like a failing service should block booting?
But I do not have any better idea.

peterleinchen · 23 November 2020 00:07

So. I invested some time and solved this issue. See here:

I would have liked to get some pointers earlier from Jolla instead of a blacklisting (which is not needed on unencrypted devices including all Jolla 1, C, Tablets). As I think someone more experienced in systemd would have seen this immediately.

Many thanks to DrYak for this bug report here and confirming my thoughts and needed code changes.

DrYak · 24 November 2020 12:05

First: BIG THANK YOU for investigating and fixing the bug!

I had indeed confused the hard/soft requirement, but the final result is the same.

Just for the tiny details:

Wants/WantedBy is the soft dependency,
which may fail.

Yes, it tolerates a failure (if the unit fails, it will not break, but continue starting).
But it will always initiate a unit start before, no matter what.
And it will always wait for unit completion:
- One-shot type service units will wait for script’s exit
- (modern, sysd-aware) daemons service units will wait for successful backgrounding (there’s a systemd dbus signal for that)
- old-school (SysVInit) daemons service units will wait for the typical double-fork
- service units for things which are not technically daemons but scripts (a.k.a. the “I wrote a Perl 1-liner and don’t bother with all the daemony stuff” case) will wait for a timeout (i.e.: the script needs to survive without crashing for some time for it to be considered a successful start)
- and path-triggers units require the path to be available (seems logical).

So even if it is a “WantedBy” (so if an actual script did crash, booting would have continued), we still end up in a waiting loop: multi-user waits for the (path trigger) unit to be ready, but the path trigger depends on a mount path (/home) that isn’t mounted yet and thus waits. That mount path will never become available before the encryption has been unlocked, which is set to happen much later in the boot order and depends on the multi-user target having already been reached (whereas it’s currently still stuck waiting).

circular dependency hell.
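
The wait loop described above can be written down as a tiny "waits for" graph. This is only a toy model of the explanation in this post, not of how systemd actually orders its jobs, and the node names are placeholders.

```python
def find_cycle(waits_for, start):
    """Follow 'waits for' edges from `start`; return the loop if one is found."""
    path, node = [], start
    while node is not None:
        if node in path:
            # We came back to a node already on the path: that closes the loop.
            return path[path.index(node):] + [node]
        path.append(node)
        node = waits_for.get(node)
    return None


# Placeholder names mirroring the chain described in the post.
WAITS_FOR = {
    "multi-user.target": "defender.path",
    "defender.path": "home mounted",
    "home mounted": "encryption unlocked",
    "encryption unlocked": "multi-user.target",
}

print(find_cycle(WAITS_FOR, "multi-user.target"))
```

Removing the first edge (the path unit no longer being pulled in by multi-user) is what breaks the loop in this model.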

The way you solved it (making sure that the path trigger unit is loaded ONLY AFTER the point at which the GUI is ready and the encryption has been unlocked and mounted) is nice.

There’s a tiny catch in your solution, but not dramatic:

if the main (admin) user is called “defaultuser” but a secondary (non-admin) user has been created which happens to be called “nemo”, that user gets to do the updates too, even though, according to Jolla’s design, apparently only UID 100000 should be able to.

(but that’s a problem that is shared with any other app which predates the username switch from nemo to defaultuser)
if the user has renamed the admin user to something which is neither defaultuser nor nemo (i.e.: non-standard), this line will fail. Best would be to use whatever is the pythonic equivalent of the bash getent passwd 100000 and get the correct path from column 6.

(but if a user goes to play around with usernames and paths, they should be advanced enough power users to fix this too).

peterleinchen · 24 November 2020 15:51

Thanks to you, DrYak, again!

Maybe this is what I could not read out of any information I read about systemd:

This makes it more or less clear. And I would have started earlier…
Any pointer to that? (please, even if it is now obsolete) As I would have expected that unit to dangle around but not stop multi-user.target from finishing.

About the tiny catches:
I guess I did it to my best knowledge:

This is handled, as the existence of /home/defaultuser is checked first. Only if this does not exist does it fall back to old (beloved) nemo.

This is something that cannot happen, as the user names “defaultuser” and “nemo” are hard-wired and cannot be changed, afaik. I read that on zendesk; only the clear (UI) name displayed may be changed, like on other OSes.
Pythonic could look something like:

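import subprocess
# Field 6 of the passwd entry for UID 100000 is that user's home directory
HOME_DIR = subprocess.check_output("getent passwd 100000 | cut -f 6 -d ':'", shell=True).decode().strip()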

But as I am lazy and thought that this is not needed, I kept my approach
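
A variant that avoids the shell altogether is Python's standard `pwd` module, which reads the same passwd database as getent. UID 100000 comes from the discussion above. The fallback value and the `lookup` parameter are assumptions added for illustration and testing.

```python
import pwd


def home_for_uid(uid=100000, fallback="/home/nemo", lookup=pwd.getpwuid):
    """Return the home directory recorded in the passwd database for `uid`."""
    try:
        return lookup(uid).pw_dir
    except KeyError:  # pwd.getpwuid raises KeyError when the UID does not exist
        return fallback
```

This keeps working if the admin account is renamed, because it keys on the UID rather than on the user name.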

nephros · 24 November 2020 17:15

well someone could do it but I don’t think it’s worth dealing with that use case…

# add title to avoid confusing a proper badass with some pathetic little missing fish

usermod -m -d /home/nautilus -l captain_nemo nemo

peterleinchen · 24 November 2020 19:20

Uh oh!

Yes, that would be possible of course. But I am pretty sure then something strange will happen on next boot!

Until someone shows me captain_nemo on his nautilus booting, I do see it as safe.

DrYak · 25 November 2020 19:15

Well, I think it’s safe to assume that a user who manages to change their home directory and username would also be able to handle this type of failure mode.

(On the other hand, note that my initial suggestion of using a path inside /var/lib/harbour-defender would also not be affected by that).

In the end, the most important thing is that you fixed defender. And that’s super cool.

4carlos · 26 November 2020 08:31

What about this?

https://together.jolla.com/question/215859/adblock-via-etchosts-on-xa2aliendalvik-v8/?answer=215903#post-id-2159038

nephros · 26 November 2020 09:26

What about it?

AFAIK/AFAICS, Defender manages the Android Support hosts file as well.

https://github.com/peterleinchen/harbour-defender/blob/master/qml/python/defender_updater.py#L48-L51


android1_dir="/system/etc"
android1_hosts="/system/etc/hosts"
android2_dir="/opt/alien/system/etc"
android2_hosts="/opt/alien/system/etc/hosts"

Some answers in the thread you linked say that should work, others say AD doesn’t use these hosts entries at all. What is true for current SFOS, current Defender and modern (Xperia10 et al) AD?

To me it reads as if the Defender scripts would have to do the lxc-mount and override as in the linked answer to be effective, correct?

DrYak · 26 November 2020 10:07

This works for AlienDalvik for Android 4.x (devices from the original Jolla 1 up to Xperia X), as they share the same filesystem as the main Linux (the isolation is just inside the Java-like VM that old Android uses).

This code will see the files and update them

This does not work with AlienDalvik for Android 8.1 (devices starting from Xperia XA2), as they:

use an LXC container (and thus their filesystem is isolated from the main Linux)
store that filesystem in a compressed read-only squashfs system.img file (as opposed to bind-mount some subdirectory that is writeable by Linux).

The code linked above will not work because the files don’t exist in the namespace of the main Linux.

The solution to modify the configuration (the include /var/lib/lxc/aliendalvik/extra_config is handy for that) works, because it bind mounts the Linux file to inside the Android LXC.

DrYak · 26 November 2020 10:15

A completely different way to handle this would be to use a DNS forwarder daemon on the Linux side
(i.e.: its own DNS server, e.g.: dnsmasq, systemd-resolved, etc.) (note: on other laptops/smartphones NetworkManager can even automate that, I don’t know if connman on Sailfish can automate it).

And then configure Android to always use that as a DNS server in the default connection.

The main advantage:

defender wouldn’t need to modify actual hosts files. It can modify some include configuration of dnsmasq, and normal DNS name resolution is handled by the latter.

The main disadvantage:

much more complex solution involving multiple components (installing dnsmasq, having connman update dnsmasq configuration instead of /etc/resolv.conf, etc.)
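
For the dnsmasq variant, the include file would hold `address=/domain/ip` lines rather than hosts entries. A minimal sketch of generating them follows. The function name and the sample domains are made up, and where the include file would live on Sailfish is left open.

```python
def render_dnsmasq_block(domains, sink="0.0.0.0"):
    """Return dnsmasq `address=/domain/ip` lines for the given domains."""
    unique = sorted({d.strip().lower() for d in domains if d.strip()})
    # dnsmasq applies an address= entry to the domain and its subdomains.
    return "".join(f"address=/{d}/{sink}\n" for d in unique)


# Hypothetical blocklist entries, for illustration only.
print(render_dnsmasq_block(["ads.example.com", "tracker.example.net"]), end="")
```

Because one entry also covers subdomains, such a list can end up shorter than the equivalent hosts file.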

4carlos · 26 November 2020 13:10

Only legacy devices (+ Xperia X I guess)

Thank you for your answer. Then I don’t have to answer @nephros .

peterleinchen · 26 November 2020 23:14

Yes.
I changed the lxc mount a bit, but in the end it does not really matter, I guess.

Simply edit the file /var/lib/lxc/aliendalvik/extra_config as root and add the following line:

lxc.mount.entry = /system/etc/hosts system/etc/hosts none bind,ro,create=file,optional 0 0

I am using the Android /system/etc/hosts (which gets updated by Defender as well) for the lxc mount.
Furthermore I added the option ‘ro’ so Android side cannot mess with SFOS file(s).
And I added the option ‘optional’ because, when I also tried to add /system/etc/hosts.editable, it caused aliendalvik to fail to start (ro file system, could not add that file, alien service failing). This is not needed for system/etc/hosts as this file is present. Just for the record.
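
If that edit were to be automated (for instance by an installer script), an idempotent append along the following lines would avoid duplicating the entry on every run. The helper name is hypothetical, the path and mount entry are the ones quoted above, and the real call would have to run as root.

```python
from pathlib import Path

LXC_EXTRA_CONFIG = "/var/lib/lxc/aliendalvik/extra_config"
MOUNT_ENTRY = ("lxc.mount.entry = /system/etc/hosts system/etc/hosts "
               "none bind,ro,create=file,optional 0 0")


def ensure_line(path, line):
    """Append `line` to `path` unless already present; return True if the file changed."""
    cfg = Path(path)
    text = cfg.read_text() if cfg.exists() else ""
    if line in text.splitlines():
        return False
    # Start on a fresh line if the existing file does not end with a newline.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    cfg.write_text(text + separator + line + "\n")
    return True
```

On a device this would be called as `ensure_line(LXC_EXTRA_CONFIG, MOUNT_ENTRY)`. The new mount presumably only takes effect once the Android container has been restarted.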