
Suppose I have an NFS client with an NFS share mounted at /nfs_share.

Suppose I unplug the NFS server, causing an ungraceful shutdown.

Now my client's /nfs_share is stale.

If on the client I go into a bash prompt and type ls /nfs_<TAB> then my bash shell freezes. It appears to be stuck in a syscall.

Merely reading / to find entries that match "/nfs_" should not block on an unresponsive filesystem: we are reading the entries of the / filesystem, not those of the NFS filesystem mounted at /nfs_share.

The freeze should only happen if tab completion causes bash (or something else) to read the properties of the NFS filesystem.

So I wonder if anyone knows what is hitting the NFS filesystem?* And can I configure the system to not try to read the properties of the mounted filesystem when just doing a string match to do tab completion?

I remember in the past (years ago), perhaps on a different distro, hitting a stale NFS mount would generate an error message and a non-zero exit code, and promptly at that.

*) This behaviour to me has the feeling of some part of systemd trying to "optimize" things by caching filesystem info. If so, OK - but how can I turn it off?

  • "This behaviour to me has the feeling of some part of systemd trying to "optimize" things by caching filesystem info" Where did this unfounded connection between bash tab completion and systemd come from? – muru Apr 23 '20 at 03:20
  • @muru - it is completely subjective, possibly irrational, and based on how in my experience the systemd suite is altogether too smart for its own good, and generally makes my life as a partially reconstructed Slackware/init guy more complex. This behaviour just felt like something systemd would do. –  Apr 24 '20 at 02:25
  • I can reasonably assure that this was the behaviour back in 12.04 as well – muru Apr 24 '20 at 04:04
  • @muru thanks. I came to Ubuntu at version 14. Prior to that Redhat and SuSE. Possibly I am remembering behaviour from then. –  Apr 25 '20 at 17:17

2 Answers


Completion in Bash is a two-stage process - some parts are done by bash, and some are done by readline. In the case of filename completion, bash gets the list of directory entries, and then passes the filenames to readline, where we have:

mark-directories
If set to ‘on’, completed directory names have a slash appended. The default is ‘on’.

Readline then stats the filenames to decide whether or not to append a slash. On some systems, with some filesystems, this information is already available when bash got the directory entries, but this may not always be the case.

In any case, a quick check of strace -o log bash without and with set mark-directories off in .inputrc shows this is likely the main reason.

Without set mark-directories off:

read(0, "\t", 1)                        = 1
openat(AT_FDCWD, "/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_CLOEXEC|O_DIRECTORY) = 3
fstat64(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getdents64(3, /* 20 entries */, 32768)  = 488
getdents64(3, /* 0 entries */, 32768)   = 0
close(3)                                = 0
write(2, "\n", 1)                       = 1
stat64("/bin", {st_mode=S_IFDIR|0755, st_size=53248, ...}) = 0
stat64("/bin", {st_mode=S_IFDIR|0755, st_size=53248, ...}) = 0
stat64("/boot", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/boot", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
write(2, "bin/  boot/ \n", 13)          = 13
write(2, "bash-5.0$ ls /b", 15)         = 15

With:

read(0, "\t", 1)                        = 1
openat(AT_FDCWD, "/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_CLOEXEC|O_DIRECTORY) = 3
fstat64(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getdents64(3, /* 20 entries */, 32768)  = 488
getdents64(3, /* 0 entries */, 32768)   = 0
close(3)                                = 0
write(2, "\n", 1)                       = 1
write(2, "bin   boot  \n", 13)          = 13
write(2, "bash-5.0$ ls /b", 15)         = 15
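
For reference, the setting tested above could go into the readline init file roughly like this. It is only a minimal sketch, assuming ~/.inputrc is the init file your bash actually reads (the usual default unless the INPUTRC variable points elsewhere), and it only takes effect in shells started afterwards.

```
# ~/.inputrc
# Do not append a slash to completed directory names. In the traces above,
# the stat64 calls on the candidate names disappear with this setting.
set mark-directories off
```

To try it in an already running interactive session without editing the file, bind 'set mark-directories off' should apply the same setting to that shell only.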
muru

This is due to the way the client handles NFS mounts. They are treated as remote connections: the client sends the ls request to the remote NFS host and waits for a response. If no response arrives within the timeout period (the value of the timeo option if it is set, otherwise the default of 60 seconds), the client retries the NFS request, which keeps the shell prompt busy waiting for a response.

This behavior is expected, as the entry for the NFS share exists in your fstab. That is where the completion comes from when you press Tab (since the NFS share was already mounted); it has no specific connection to systemd or bash.

That is also why the soft and bg options exist. Try using them when mounting the NFS share on the client. You might also want to add the timeo=30 option: timeo is given in tenths of a second, so the default of 600 means 60 seconds, and timeo=30 shortens it to 3 seconds.

soft / hard

Determines the recovery behavior of the NFS client after an NFS request times out. If neither option is specified (or if the hard option is specified), NFS requests are retried indefinitely. If the soft option is specified, then the NFS client fails an NFS request after retrans retransmissions have been sent, causing the NFS client to return an error to the calling application.

bg / fg

Determines how the mount(8) command behaves if an attempt to mount an export fails. The fg option causes mount(8) to exit with an error status if any part of the mount request times out or fails outright. This is called a "foreground" mount, and is the default behavior if neither the fg nor bg mount option is specified.

See man nfs for more.
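
As an illustration of the options above, an /etc/fstab entry might look something like the following. This is only a sketch: the server name nfs-server.example.com and the export path /export are hypothetical placeholders, and retrans=2 is an illustrative value, not one taken from your setup.

```
# /etc/fstab
# <server>:<export>              <mount point>  <type>  <options>                   <dump>  <pass>
nfs-server.example.com:/export   /nfs_share     nfs     soft,bg,timeo=30,retrans=2  0       0
```

With soft, a request that still gets no reply after retrans retransmissions fails with an error returned to the calling application, instead of being retried indefinitely as with hard.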

Raffa