MCE is using wrong metric to properly measure RAM usage

Thaodan · 18 August 2022 05:40

I think this is a duplicate; we had this question earlier.

Previous answer

We had this question in the community meeting earlier this year.
Our answer can be found in the minutes of this meeting:
https://irclogs.sailfishos.org/meetings/sailfishos-meeting/2022/sailfishos-meeting.2022-01-20-08.00.html

Memory Pressure and Memory

Free memory doesn’t necessarily match the memory usage in bytes, since you have to subtract the RAM that’s still available from the used RAM to get the actual memory usage.
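To illustrate the difference, here is a minimal sketch in Python. It assumes the kernel exposes `MemAvailable` in `/proc/meminfo` (older kernels may not), and it is only one way to read "used minus available"; it is not how MCE itself computes anything. The function names and the sample numbers are made up for the example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def naive_used_kb(meminfo):
    """Used memory if you only look at MemFree (counts cache as used)."""
    return meminfo["MemTotal"] - meminfo["MemFree"]


def used_kb(meminfo):
    """Used memory after subtracting what the kernel says is still available."""
    return meminfo["MemTotal"] - meminfo["MemAvailable"]


# Hypothetical sample values, not taken from a real device.
SAMPLE = """\
MemTotal:        3800000 kB
MemFree:          200000 kB
MemAvailable:    1500000 kB
Cached:          1200000 kB
"""

info = parse_meminfo(SAMPLE)
print("naive used:", naive_used_kb(info), "kB")
print("used:      ", used_kb(info), "kB")
```

On a device you could feed it the real file with `parse_meminfo(open("/proc/meminfo").read())`. With the sample above, the naive number is much higher because reclaimable cache is counted as used.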

Also the kernel doesn’t show the exact value for performance reasons:
https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/memory.html#usage-in-bytes

5.5 usage_in_bytes
For efficiency, as other kernel components, memory cgroup uses some optimization to avoid unnecessary cacheline false sharing. usage_in_bytes is affected by the method and doesn’t show ‘exact’ value of memory (and swap) usage, it’s a fuzz value for efficient access. (Of course, when necessary, it’s synchronized.) If you want to know more exact memory usage, you should use RSS+CACHE(+SWAP) value in memory.stat(see 5.2).
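As a rough sketch of what the quoted section suggests, the following Python adds up RSS+CACHE(+SWAP) from a cgroup v1 `memory.stat` file. The `rss`, `cache` and `swap` keys are the ones named in the kernel documentation; the helper names and the sample values are assumptions for illustration, and `swap` is treated as optional because it only shows up when swap accounting is enabled.

```python
def parse_memory_stat(text):
    """Parse cgroup v1 memory.stat text ("key value" per line) into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def usage_bytes(stats, include_swap=False):
    """Return RSS + CACHE, plus SWAP if requested and present."""
    total = stats["rss"] + stats["cache"]
    if include_swap:
        total += stats.get("swap", 0)  # absent without swap accounting
    return total


# Hypothetical sample values, not taken from a real device.
SAMPLE_STAT = """\
cache 104857600
rss 52428800
swap 10485760
"""

stats = parse_memory_stat(SAMPLE_STAT)
print("rss+cache:     ", usage_bytes(stats))
print("rss+cache+swap:", usage_bytes(stats, include_swap=True))
```

The path of the real file depends on where the memory cgroup is mounted, so it is left out here on purpose.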

MCE vs lmk/oom-killer

They do different things. MCE reports memory pressure from the system to apps that use its API.
The oom-killer uses the kernel memory pressure to determine whether it needs to kill apps, and then kills the app with the lowest priority.
Both use similar data but they don’t use the same calculations; they are independent of each other.
As Karry described in his report, LMK uses the wrong values: it takes cache into account incorrectly.

We are looking into replacing the lmk kernel module as said in the meeting earlier this year.

So the TLDR is that the title is misleading: it is lmk, not MCE, that sometimes kills apps wrongfully. MCE just forwards the memory pressure signal to apps that listen to its API.