2

This self-answered question is more of a novelty or curiosity issue.

I know this can be done with a bash script that recursively traverses all directories. However, that would probably take many hours to run.

How can I quickly discover the deepest directory level?

3 Answers

3

locate command is the fastest

The locate command is your friend in this case:

$ time locate "/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*"
/mnt/clone/home/rick/.gradle/wrapper/dists/gradle-4.6-all/bcst21l2brirad8k2ben1letg/gradle-4.6/samples/userguide/multiproject/dependencies/java/services/personService/src/main/java/org/gradle/sample/services/PersonService.java
/mnt/clone/home/rick/.gradle/wrapper/dists/gradle-4.6-all/bcst21l2brirad8k2ben1letg/gradle-4.6/samples/userguide/multiproject/dependencies/java/services/personService/src/test/java/org/gradle/sample/services/PersonServiceTest.java
/mnt/clone/home/rick/.gradle/wrapper/dists/gradle-4.6-all/bcst21l2brirad8k2ben1letg/gradle-4.6/samples/userguide/multiproject/dependencies/javaWithCustomConf/services/personService/src/main/java/org/gradle/sample/services/PersonService.java
/mnt/clone/home/rick/.gradle/wrapper/dists/gradle-4.6-all/bcst21l2brirad8k2ben1letg/gradle-4.6/samples/userguide/multiproject/dependencies/javaWithCustomConf/services/personService/src/test/java/org/gradle/sample/services/PersonServiceTest.java

real    0m1.731s
user    0m1.653s
sys     0m0.072s

Keep adding /* to the pattern until no results are displayed, then remove one /* to get the deepest subdirectory level. The files at the deepest level are displayed as well.

Note: On this machine there are four different paths returned. Each path contains one file.
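The manual trial and error can also be automated. Here is a minimal sketch, assuming locate is installed and its database is current. The names glob_for_depth and deepest_level, and the SEARCH_CMD hook that lets you swap in another search command, are made up for illustration and are not part of locate:

```shell
# Hypothetical hook: the command that receives the pattern (defaults to locate).
SEARCH_CMD=${SEARCH_CMD:-locate}

# Print a pattern with one "/*" per level, e.g. 3 -> /*/*/*
glob_for_depth() {
    local n=$1 pattern=
    while [ "$n" -gt 0 ]; do
        pattern="$pattern/*"
        n=$(( n - 1 ))
    done
    printf '%s\n' "$pattern"
}

# Grow the pattern one level at a time and stop at the first level
# for which the search returns nothing.
deepest_level() {
    local level=0 hit
    while :; do
        hit=$("$SEARCH_CMD" "$(glob_for_depth $(( level + 1 )))" | head -n 1)
        [ -n "$hit" ] || break
        level=$(( level + 1 ))
    done
    echo "$level"
}
```

Calling deepest_level prints the number of /* needed to reach the deepest entry, and locate "$(glob_for_depth N)" then lists the entries at that level. Because it starts at one level and grows, there is no starting point to guess, though I have not timed it.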


Some details about locate

The database used by locate is updated daily by cron. If you installed an application or created new directories today, you need to update the database using:

sudo updatedb

In Ubuntu 19.10 the locate command is no longer installed by default. Hopefully it returns in 20.04 but in the meantime you need to install it with:

sudo apt install mlocate

To gain an appreciation of locate speed look at what it has indexed for instant retrieval:

$ locate -S

Database /var/lib/mlocate/mlocate.db:
    381,154 directories
    2,548,775 files
    213,049,136 bytes in file names
    92,287,412 bytes used to store database


Using a script

Comments pointed out that people won't know the starting point. I wrote a script that starts at 50 levels by default and works backwards from there. You can override the starting point with any value from 6 to 126 subdirectory levels.

Script output:

$ time deepdir

Search point 50 levels deep: /*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*
Common path followed by unique sub-paths (deepest subdir 25 levels):
+- /mnt/clone/home/rick/.gradle/wrapper/dists/gradle-4.6-all/bcst21l2brirad8k2ben1letg/gradle-4.6/samples/userguide/multiproject/dependencies/
|--- /java/services/personService/src/main/java/org/gradle/sample/services/PersonService.java
|--- /java/services/personService/src/test/java/org/gradle/sample/services/PersonServiceTest.java
|--- /javaWithCustomConf/services/personService/src/main/java/org/gradle/sample/services/PersonService.java
|--- /javaWithCustomConf/services/personService/src/test/java/org/gradle/sample/services/PersonServiceTest.java

real    0m45.141s
user    0m44.552s
sys     0m0.588s

$ time deepdir 26

Search point 26 levels deep: /*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*
Common path followed by unique sub-paths (deepest subdir 25 levels):
(... SNIP repeated parts ...)

real    0m6.123s
user    0m6.041s
sys     0m0.080s

  • The first time you run the script you don't know how deep the subdirectories go, so it starts at the default of 50 levels and takes about 45 seconds to run.
  • The second time you run the script, pass the known count + 1 and it only takes 6 seconds to run.
  • After the second time, take the output line of /*/*.../* and copy it (less 1 set) to the clipboard as a parameter for calling locate or another command.
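To get the "known count + 1" without counting slashes by eye, something like this could work. It is only a small sketch: path_depth and next_start_level are hypothetical helper names, and the depth is simply the number of / characters in a path that locate returned:

```shell
# Count the "/" characters in a path, e.g. /a/b/c.txt -> 3
path_depth() {
    local slashes
    slashes=$(printf '%s' "$1" | tr -cd '/')
    echo "${#slashes}"
}

# The value to pass to deepdir on the second run: known count + 1
next_start_level() {
    echo $(( $(path_depth "$1") + 1 ))
}
```

For the PersonService.java paths shown in the output, path_depth gives 25, so next_start_level gives 26, which is the value passed to deepdir in the second run.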

The bash script

#!/bin/bash

# NAME: deepdir
# PATH: $HOME/askubuntu/
# DESC: Answer for: https://askubuntu.com/questions/1187624/how-to-quickly-find-the-deepest-subdirectory/1187625?noredirect=1#comment1985731_1187625
# DATE: November 11, 2019.

StartLevel=50
[[ $1 != "" ]] && StartLevel="$1"
[[ $StartLevel -gt 126 ]] && { echo Max levels 126 ; exit 1 ; }
[[ $StartLevel -lt 6 ]] && { echo Min levels 6 ; exit 2 ; }

Big="/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*" # 33
Big="$Big/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*" # 31
Big="$Big/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*" # 31
Big="$Big/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*" # 31
# Total supported: 126

# If starting level populated it is too small.
Search="${Big:0:StartLevel*2}"
echo "Search point $StartLevel levels deep: $Search"
Count=$(locate "$Search" | wc -l)
[[ $Count -gt 0 ]] && { echo "Levels too small. $Count files found" ; exit 3 ; }

# Loop backwards to find first populated level, always more than 5
for (( l=StartLevel; l>5; l-- )) ; do
    Search="${Big:0:l*2}"
    Count=$(locate "$Search" | wc -l)
    [[ $Count -gt 0 ]] && break
done

Arr=( $(locate "$Search") )

# Enhancement using Q&A: Longest common prefix of two strings in bash
# https://stackoverflow.com/a/17475354/6929343
Common="$(IFS=$'\n'; sed -e '$!{N;s/^\(.*\).*\n\1.*$/\1\n\1/;D;}' <<<"${Arr[*]}")"
Common="${Common%/*}/"
echo "Common path followed by unique sub-paths (deepest subdir $l levels):"
echo "+- $Common"
Len="${#Common}"

for p in "${Arr[@]}" ; do
    # echo "DEBUG: $p"
    Curr="$(dirname "$p")"
    [[ $Curr != "$Last" ]] && echo "|--- /${p:$Len}"
    Last="$Curr"
done

exit 0
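
The sed one-liner borrowed from Stack Overflow is compact but hard to read. Below is a rough sketch of the same longest-common-prefix idea in plain shell, assuming one path per line on standard input. The function names common_prefix and common_dir are illustrative and not part of the script above:

```shell
# Print the longest string that every input line starts with.
common_prefix() {
    local prefix line
    IFS= read -r prefix || return 1
    while IFS= read -r line; do
        # Drop one trailing character at a time until the candidate
        # is a prefix of the current line (an empty prefix always is).
        while :; do
            case $line in
                "$prefix"*) break ;;
                *) prefix=${prefix%?} ;;
            esac
        done
    done
    printf '%s\n' "$prefix"
}

# Cut the prefix back to the last "/" so it ends on a directory
# boundary, as the script does with ${Common%/*}/
common_dir() {
    local prefix
    prefix=$(common_prefix) || return 1
    printf '%s/\n' "${prefix%/*}"
}
```

Piping the locate results through common_dir should give the same "+-" line as the script. The cut back to the last / matters because the raw prefix of .../dependencies/java/... and .../dependencies/javaWithCustomConf/... ends in the partial name "java".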

2
find -type d \
  -wholename "$(find -type d -print0 |
                tr -d --complement '/\0' |
                sort -zur |
                sed 's:/:*/*:g' |
                head -z -n1 |
                tr -d '\0' )"

We use find to list all directories, then select the one(s) with the most slashes in the path. Because the inner pipeline is NUL-delimited, this allows any possible names, including odd characters such as newlines, spaces or whatnot.
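
To see the slash-counting step on its own, here is a small sketch that only reports the largest number of slashes instead of building the -wholename pattern. It assumes GNU find and tr, and max_slashes is a made-up name for illustration:

```shell
# Print the largest number of "/" in any directory path below $1.
max_slashes() {
    (
        cd "$1" || exit 1
        find . -type d -print0 |
            tr -cd '/\0' |      # keep only slashes and the NUL separators
            tr '\0' '\n' |      # now safe: one line of slashes per directory
            awk 'length($0) > max { max = length($0) } END { print max + 0 }'
    )
}
```

Converting NUL to newline is only safe here because the first tr has already deleted everything except slashes and NULs, so no newline from a file name can survive. Running it from inside the start directory makes the count relative to that directory.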

FelixJN
  • Starting from root (/) you get permission denied errors which is a common complaint with find command versus locate. It processes virtual file systems such as /proc. On my system it takes 3 minutes versus 1 second for locate but it does eventually return the correct result so +1. – WinEunuuchs2Unix Nov 11 '19 at 15:47
  • Ah well, I assumed you knew about a start directory and were not searching the whole system. Of course you could e.g. exclude some system dirs like /proc and /sys, redirect failures so they are not printed (2> /dev/null) and even define a -mindepth (if you already have an idea about the result). Your approach is quicker, exactly because you added predefined minimal path lengths. – FelixJN Nov 11 '19 at 15:50
  • Thanks, but this doesn't work OOTB for me with older GNU utils due to many problems. I gave up after working around the first one - old head version missing -z switch. IMO This approach is overcomplicated even though it is valid. – pinkeen Feb 19 '21 at 16:34
2

There is a much simpler way: find has a -printf action whose %d format prints each entry's depth in the directory tree, with 0 being the starting point.

The oneliner

In order to get the 10 deepest dirs (starting in cwd) you just need to use:

find . -type d -printf '%03d %p\n' | sort -n -k 1 | tail -n 10

Alternatively, you can print the 10 largest unique depths by adding the -u switch to the sort command:

find . -type d -printf '%03d %p\n' | sort -n -k 1 -u | tail -n 10
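
If only the depth number matters and not the paths, the sort can be skipped entirely. This is a sketch under the same GNU find assumption. The name max_depth is just illustrative, and 2>/dev/null hides permission errors, as suggested in the comments on the other answer:

```shell
# Print the depth of the deepest directory below $1 (0 = $1 itself).
max_depth() {
    find "$1" -type d -printf '%d\n' 2>/dev/null |
        awk '$1 > max { max = $1 } END { print max + 0 }'
}
```

Since %d is relative to the starting point, max_depth . and max_depth "$PWD" report the same number. awk keeps only a running maximum, so nothing has to be held in memory or sorted.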

Examples

The primary form of the command outputs a list like this:

019 ./vendor/magento/magento2-base/setup/src/Magento/Setup/Test/Unit/Module/Di/_files/app/code/Magento/SomeModule/etc/source/PhpExt.php
020 ./vendor/magento/magento2-base/dev/tests/integration/testsuite/Magento/Framework/View/_files/fallback/design/frontend/Vendor/default/ViewTest_Module/web/i18n/ru_RU
020 ./vendor/magento/magento2-base/dev/tests/integration/testsuite/Magento/Setup/Console/Command/_files/root/lib/internal/Magento/Framework/Test/Unit/View
020 ./vendor/magento/magento2-base/dev/tests/integration/testsuite/Magento/Setup/Module/I18n/Dictionary/_files/source/app/code/Magento/FirstModule/view/frontend
021 ./vendor/magento/magento2-base/dev/tests/integration/testsuite/Magento/Setup/Console/Command/_files/root/lib/internal/Magento/Framework/Test/Unit/View/Element

Performance

After dropping all VM caches with echo 3 > /proc/sys/vm/drop_caches, the command was tested on a path with 41454 subdirs. Both variants took roughly the same amount of time:

real    0m6.310s
user    0m0.446s
sys     0m0.708s

This was performed on a relatively underpowered server with the disk hard-limited to 3000 IOPS.

Notes

The command was tested under CentOS 7, not Ubuntu. However, I think it should be pretty portable across shells as long as GNU utilities are available (-printf is a GNU find extension).

pinkeen