I'm using Ubuntu 20.04, with i3 as the window manager but the gnome subsystem installed.
I have a problem that at some random interval (hours or days) after a reboot, any attempt to open thunar or nautilus, or any attempt by an application (such as Firefox) to open a file dialog either takes a minute or more to open, or just times out forever and requires me to kill the application.
How can I find out which application, file, extension or driver is behind this delay?
I've tried the following:
- Running as root - the applications are fast as normal.
- Disable tracker (as per here)
- libgoa seems up to date (as per this)
- Tried the pulseaudio hack here
- Running nautilus with strace; there's a 30-second gap with the below lines, but I'm not sure how to interpret it (fd=24 is some file descriptor?)
journalctl
shows the below when (e.g.) nautilus finally works
I have several machines with very similar setups, only one has this behaviour. I can't figure out what application or library is the culprit.
Output of strace -t -o /tmp/nautilus nautilus
:
20:15:32 write(24, "\1\0\0\0\0\0\0\0", 8) = 8
20:15:32 poll([{fd=24, events=POLLIN}], 1, 0) = 1 ([{fd=24, revents=POLLIN}])
20:15:32 read(24, "\1\0\0\0\0\0\0\0", 16) = 8
20:15:32 write(24, "\1\0\0\0\0\0\0\0", 8) = 8
20:15:32 futex(0x56241fa9c880, FUTEX_WAKE_PRIVATE, 2147483647) = 0
20:15:32 close(24) = 0
20:15:32 eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK) = 24
20:15:32 write(24, "\1\0\0\0\0\0\0\0", 8) = 8
20:15:32 write(6, "\1\0\0\0\0\0\0\0", 8) = 8
20:15:32 futex(0x56241ef26dc0, FUTEX_WAKE_PRIVATE, 1) = 1
20:15:32 futex(0x56241ef26910, FUTEX_WAKE_PRIVATE, 1) = 1
20:15:32 futex(0x56241ef1c038, FUTEX_WAKE_PRIVATE, 1) = 1
20:15:32 poll([{fd=24, events=POLLIN}], 1, 25000) = 1 ([{fd=24, revents=POLLIN}])
20:15:32 read(24, "\1\0\0\0\0\0\0\0", 16) = 8
20:15:32 poll([{fd=24, events=POLLIN}], 1, 25000) = 0 (Timeout)
20:15:57 write(24, "\1\0\0\0\0\0\0\0", 8) = 8
20:15:57 futex(0x56241fa9c880, FUTEX_WAKE_PRIVATE, 2147483647) = 0
20:15:57 close(24) = 0
From the journal:
Apr 20 20:32:35 aubrey dbus-daemon[1180]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.134190' (uid=1000 pid=1450204 comm="nautilus " label="unconfined")
Apr 20 20:32:35 aubrey systemd[1]: Starting Hostname Service...
Apr 20 20:32:35 aubrey dbus-daemon[1180]: [system] Successfully activated service 'org.freedesktop.hostname1'
Apr 20 20:32:35 aubrey systemd[1]: Started Hostname Service.
Apr 20 20:32:35 aubrey nautilus[1450204]: Called "net usershare info" but it failed: Failed to execute child process “net” (No such file or directory)
Apr 20 20:32:45 aubrey ddclient[1450366]: WARNING: file /var/cache/ddclient/ddclient.cache, line 3: Invalid Value for keyword 'ip' =
Edit:
The above trace is just the output of strace -t -o /tmp/nautilus nautilus
- a very verbose log, but with tail -f
I'm able to inspect it at the point where it hangs, which is where I see an entry such as 20:15:32 poll([{fd=7, events=POLLIN}], 1, 25000) = 0 (Timeout)
followed by 30+ seconds of nothing (then it starts and is very verbose again).
The fd reported by lsof, if I'm reading it right, looks like:
nautilus 1719429 coljac 7u a_inode 0,13 0 17491 [eventfd]
The top entries from strace -c nautilus
are as follows (yes, poll is number one):
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
22.70 0.011333 2 3988 poll
16.55 0.008265 1 5018 4490 stat
9.31 0.004647 2 1966 131 futex
7.73 0.003859 1 2119 1385 recvmsg
7.33 0.003660 1 3612 1441 openat
4.76 0.002375 2 1038 436 access
4.29 0.002140 1 1424 read
3.60 0.001799 1 1356 105 readlink
3.50 0.001748 0 1868 mmap
3.30 0.001646 8 188 getdents64
3.14 0.001568 0 2232 close
2.38 0.001188 0 1758 fstat
2.14 0.001067 0 1150 munmap
1.99 0.000993 2 337 writev
1.67 0.000833 92 9 clone
1.57 0.000782 2 367 write
1.47 0.000733 3 190 mprotect
Output from lsof | wc -l
: 1340334
Edit 2:
Currently, thunar won't open at all - it fails with the message Failed to register: Timeout was reached
. The output from strace -c thunar
is as follows:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
50.53 0.002814 3 882 792 openat
28.21 0.001571 4 354 mmap
4.79 0.000267 2 96 read
3.75 0.000209 2 90 fstat
3.54 0.000197 2 96 close
3.48 0.000194 1 109 mprotect
1.87 0.000104 5 18 poll
1.45 0.000081 0 165 1 futex
1.31 0.000073 2 28 write
...
If I run strace -t thunar
I also see this in the output before it hangs:
futex(0x564445b9ff70, FUTEX_WAKE_PRIVATE, 1) = 1
poll([{fd=7, events=POLLIN}], 1, 25000) = 1 ([{fd=7, revents=POLLIN}])
read(7, "\1\0\0\0\0\0\0\0", 16) = 8
poll([{fd=7, events=POLLIN}], 1, 25000
lsof -p <pid>
returns the following:
thunar 2036483 coljac 7u a_inode 0,13 0 17491 [eventfd]
Some more outout from strace thunar
, which fails to open, is:
20:29:32 read(7, "\1\0\0\0\0\0\0\0", 16) = 8
20:29:32 poll([{fd=7, events=POLLIN}], 1, 25000) = 0 (Timeout)
20:29:58 write(7, "\1\0\0\0\0\0\0\0", 8) = 8
20:29:58 futex(0x5566ad906ba0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
20:29:58 close(7) = 0
20:29:58 write(6, "\1\0\0\0\0\0\0\0", 8) = 8
20:29:58 futex(0x5566ad8e1f70, FUTEX_WAKE_PRIVATE, 1) = 1
20:29:58 futex(0x5566ad8e1b90, FUTEX_WAKE_PRIVATE, 1) = 1
20:29:58 futex(0x5566ad8d9058, FUTEX_WAKE_PRIVATE, 1) = 1
20:29:58 openat(AT_FDCWD, "/usr/lib/x86_64-linux-gnu/charset.alias", O_RDONLY) = -1 ENOENT (No such file or directory)
20:29:58 write(2, "Failed to register: Timeout was "..., 40) = 40
20:29:58 poll([{fd=4, events=POLLIN}], 1, 0) = 1 ([{fd=4, revents=POLLIN}])
20:29:58 read(4, "\1\0\0\0\0\0\0\0", 16) = 8
20:29:58 futex(0x7f0dccfb72a0, FUTEX_WAKE_PRIVATE, 1) = 1
20:29:58 poll([{fd=4, events=POLLIN}], 1, 0) = 0 (Timeout)
Further strace outputs are here on pastebin.
strace
output log, I could find above only strace c resume. – user.dz Jun 08 '21 at 11:39dbus-run-sessions
seems to fix it. But that does not fix the underlying cause it only prevents the file open dialog from speaking with the slow program it seems. And you will run out of D-Bus sessions after a month or so. I had this problem for years and your question motivated me to take another look at google for this and found this. The proposedpkill gvfsd-trash
works for me. Thank you!!! – mxmlnkn Jun 27 '21 at 07:54pkill gvfsd-trash
worked. If you would like to add an answer I can mark this question as answered. – Coljac Jul 08 '21 at 03:31