As I have 128 GB of RAM, I wanted to minimize swap usage by setting vm.swappiness to 10.

I ran a batch (snakemake -j 1) of memory-heavy operations in Python: subtracting two arrays of up to 15 GB each, then calculating norms of the difference. Surprisingly, my system started to misbehave: Thunderbird crashed, then so did my graphical environment (XFCE with lightdm), effectively killing my screen session with the batch running. Now I wonder: why?

Moreover, my scripts tend to fail with segmentation faults when processing the biggest arrays. Also, after the graphical environment respawned, it swapped my monitors (pun intended) and did not allow me to re-swap them in Display settings. Running service lightdm restart after sysctl vm.swappiness=60 was necessary.

I had plenty (932 GB) of swap available, so it is not that my system suddenly ran out of memory. The RAM also seems to work (17 passes of Memtest86+ revealed no errors).
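
For anyone who wants to check the same thing on their own machine, here is a minimal sketch that reads swap totals from /proc/meminfo. The helper names (parse_meminfo, swap_summary) and the SAMPLE text are hypothetical and only for illustration; the sample numbers are not taken from my system. It assumes a Linux host and falls back to the sample text elsewhere.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integers.

    Most values are in kB; a few entries (e.g. HugePages counters) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def swap_summary(info):
    """Return total, free and used swap in kB from a parsed meminfo dict."""
    total = info.get("SwapTotal", 0)
    free = info.get("SwapFree", 0)
    return {"total_kb": total, "free_kb": free, "used_kb": total - free}


# Illustrative sample only (hypothetical values, not measured output).
SAMPLE = """MemTotal:       131072000 kB
MemAvailable:   100000000 kB
SwapTotal:      977272828 kB
SwapFree:       977272828 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # not on Linux: fall back to the illustrative sample
    print(swap_summary(parse_meminfo(text)))
```

Note that plenty of free swap only shows that the system did not run out of memory; it says nothing about whether the swap device itself is healthy.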

I am asking about the reason behind the crashes of other programs (Thunderbird, the screen session, the graphical environment). Even if my programs were poorly written, I would expect their impact to be limited to extensive swapping. A total XFCE session restart is something that definitely should not happen. And by restart I mean a restart, not freezing or a slowdown due to swapping.

abukaj
  • 465
  • Try completely turning the swap off (comment out the swap entry in /etc/fstab). It is really not necessary with 128 GB of RAM. – Archisman Panigrahi Sep 18 '23 at 15:09
  • @ArchismanPanigrahi I am not looking for a workaround. I want to understand what happened. – abukaj Sep 18 '23 at 15:28
  • Silly question, but have you confirmed the memory is good? I’ve had faulty sticks (both ECC and non-ECC) contribute to behaviours like this in the past. – matigo Sep 18 '23 at 15:30
  • I don't know what happened. But from my experience (I use Ubuntu on desktops and workstations with large RAM), beyond 8 GB memory or so, swap does not help in most cases, but rather slows the computer down. – Archisman Panigrahi Sep 18 '23 at 15:30
  • @ArchismanPanigrahi It helps when I am working with arrays closer to 120 GB. – abukaj Sep 18 '23 at 15:35
  • Try adjusting/increasing min_free_kbytes – Doug Smythies Sep 18 '23 at 17:54
  • @matigo I have run 17 passes of Memtest86+; the memory seems to be good. – abukaj Sep 28 '23 at 11:06
  • @karel No. Why do you think it explains the crashes I mentioned? – abukaj Sep 28 '23 at 11:43
  • @karel I ask about the reason behind crashes specifically. I want to understand why other programs crash when I expect them to be just slowed down by (possibly suboptimal) swapping. – abukaj Sep 28 '23 at 12:34
  • @karel What is the connection between the number of inotify-monitored files and system crashes when running memory-heavy processes? Just in case, I tried fs.inotify.max_user_watches=524288 and fs.inotify.max_user_instances=8192; after 12 successful jobs snakemake crashed (Segmentation fault). So did my graphical session, though the screen session luckily survived. Interestingly, a bash process has gone wild and now uses 100% CPU time. – abukaj Sep 29 '23 at 07:23
  • @matigo You were pretty close. It was the swap partition that was bad. – abukaj Sep 29 '23 at 15:22

1 Answer

TL;DR: It does not seem to.

The problem was bad blocks on the swap partition, which may or may not be related to the reduced swappiness. Someone more experienced may be able to tell whether swappiness affects disk wear.
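
As a rough sketch of how one could find which devices to test, the snippet below parses /proc/swaps and prints (but does not run) a read-only badblocks command for each swap partition. The function name parse_swaps, the device /dev/sdX2 and the sample text are hypothetical; that badblocks from e2fsprogs is the checking tool is an assumption here, and the swap area should be disabled with swapoff before testing it.

```python
def parse_swaps(text):
    """Parse /proc/swaps-style text into a list of dicts (sizes in kB)."""
    entries = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        entries.append({
            "device": fields[0],
            "type": fields[1],
            "size_kb": int(fields[2]),
            "used_kb": int(fields[3]),
            "priority": int(fields[4]),
        })
    return entries


# Illustrative sample only; /dev/sdX2 is a placeholder device name.
SAMPLE_SWAPS = """Filename                Type        Size        Used    Priority
/dev/sdX2               partition   977272828   0       -2
"""

if __name__ == "__main__":
    try:
        with open("/proc/swaps") as f:
            text = f.read()
    except OSError:
        text = SAMPLE_SWAPS  # not on Linux: fall back to the sample
    for entry in parse_swaps(text):
        if entry["type"] == "partition":
            # Only print the suggested commands; nothing is executed here.
            print(f"swapoff {entry['device']}")
            print(f"badblocks -sv {entry['device']}  # read-only test by default")
```

The commands are printed rather than executed on purpose, since both need root and scanning a large partition can take many hours.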

I have successfully run my batch with vm.swappiness=0. 4 hours ago I started another batch with vm.swappiness=10, and so far everything is working fine. If the system starts to misbehave (and no bad blocks are found), I will update the answer.

abukaj
  • 465