35

The commands find and locate both search for files on disk.

I know that find recursively walks all the relevant subdirectories to search for files and is therefore slow but up to date, whereas locate uses a database that gets updated every now and then (when exactly?) to show results quickly, although they might be outdated.

Are there any other differences? In which situations would one prefer one over the other? And when does the locate database usually get updated?

Byte Commander
  • 107,489
  • 7
    Reference: http://unix.stackexchange.com/questions/60205/locate-vs-find-usage-pros-and-cons-of-each-other – Rinzwind Sep 08 '15 at 13:54
  • 1
    http://manpages.ubuntu.com/manpages/trusty/man8/updatedb.8.html " updatedb is usually run daily by cron(8) to update the default database." – Rinzwind Sep 08 '15 at 13:56
  • @Rinzwind The linked U&L answer is awesome, it's a shame we can't make cross-site duplicates. But do you know more about the cronjob, when exactly will it run? After startup? At a specific time (I think I've read 1-2AM or something like that) only? What happens if it's shut down at that time? Does it start when the computer is on idle? How can I see the database's age? – Byte Commander Sep 08 '15 at 14:24
  • 2
    @ByteCommander - That's what anacron is for. I don't know if it's installed by default on desktop systems/servers, but it is on notebooks. It runs upon boot and sees if any cron jobs should have run while the system was off and runs them. It's really helpful, but can cause some issues if you have jobs scheduled far away from midnight. That can cause the job to be run upon boot and then again when the time comes up - possibly a lot less than 24 hours later (for a daily job.) – Joe Sep 10 '15 at 08:34
  • @Joe So will it run during boot and slow it down, or will it run some time after boot, or does it usually run with such a low priority that it just runs when the system is almost on idle? – Byte Commander Sep 10 '15 at 12:24
  • @ByteCommander - It's a settable parameter. It usually runs around 10 or 15 minutes after boot to avoid such problems. I don't know what, if anything, it does with priorities. I suspect nothing. It has its own table similar to crontab which allows very fine control of how it works, but is usually fine with just the defaults. – Joe Sep 13 '15 at 01:54
  • @ByteCommander I added an answer with time it takes for updatedb to run which is only 3 to 4 seconds on my machine with 1 SSD (4 partitions), 1 HDD (1 partition) – WinEunuuchs2Unix Mar 31 '18 at 23:35

2 Answers

29

locate is really only good for finding files and displaying them to humans. You can do a few things with it, but I wouldn't trust its output enough to parse it and, as you say, it's impossible to guarantee the state of the internal database, more so because the update is only scheduled to run from /etc/cron.daily/mlocate, once a day!

find is live. It filters, excludes, executes. It's suitable for parsing. It can output relative paths. It can output full paths. It can do things based on attributes, not just names.
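As a rough illustration of the filtering, excluding and executing mentioned above, here is a small sketch that runs against a throwaway directory. The directory layout and file names are made up for the example, and -empty is a GNU/BSD find extension rather than part of POSIX.

```shell
#!/bin/sh
# Build a small throwaway tree so the commands are safe to run anywhere.
# All names below (src, cache, notes.txt, empty.log) are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/cache"
printf 'hello\n' > "$demo/src/notes.txt"
printf 'tmp\n'   > "$demo/cache/notes.txt"
: > "$demo/src/empty.log"

# Filter by type and name while excluding the cache directory entirely.
pruned=$(find "$demo" -path "$demo/cache" -prune -o -type f -name '*.txt' -print)
printf 'full paths, cache pruned:\n%s\n' "$pruned"

# Relative paths: start the search from inside the directory.
relative=$(cd "$demo" && find . -type f -name '*.txt' | sort)
printf 'relative paths:\n%s\n' "$relative"

# Act on an attribute (zero-length files) and run a command on each match.
find "$demo" -type f -empty -exec rm -- {} +
remaining=$(find "$demo" -type f | wc -l)
printf 'files left after deleting empty ones: %s\n' "$remaining"

# Clean up the throwaway tree.
rm -rf "$demo"
```

None of this has a direct locate equivalent, since the locate database stores path names only.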

locate certainly has a place in my toolbox but it's usually right at the bottom as a last-ditch effort to find something. It's easier than find too.

terdon
  • 100,812
Oli
  • 293,335
  • 2
    I find locate to be much faster if I want to search my entire filesystem. And you can manually update the database using updatedb before using it. – hytromo Sep 08 '15 at 14:21
  • You know how that cronjob is exactly configured? Does it run at a specific time or when the system is on idle or n minutes after startup? Because I think I have read somewhere that it is scheduled at 1-2AM, when my machine is usually turned off. Will it never get updated then, except manually (sudo updatedb)? And is there a chance to see how old the database is? – Byte Commander Sep 08 '15 at 14:21
  • grep run-parts /etc/crontab You'll see that these are being managed through anacron (which you'll see through man anacron is more resilient to systems that aren't on all the time). From what I can see it should run it on boot instead if you miss the original cron time. – Oli Sep 08 '15 at 14:28
  • 2
    I find that locate doesn't index my removable/unmounted partitions, so if I want to find something on them, I have to use find. Of course, locate doesn't have all the amazing options that find does - like -exec command {} \; to run a command on every file found. I do like to use locate -b which restricts locate to finding files which match on the final component of the name - without the rest of the path. I often try that first because it's so fast. Also, you can run sudo updatedb any time you want to to refresh the locate database. – Joe Sep 10 '15 at 08:47
  • if you need real time search that is also somewhat easy, you can use something like ls -R | grep 'file_name.txt' – jena Aug 17 '16 at 12:31
11

As much as I like Oli (which is a lot!) I disagree with him on the find command. I don't like it.

find command takes over three minutes

Take for example this simple command:

$ time find / -type f -name "mail-transport-agent.target"
find: ‘/lost+found’: Permission denied
find: ‘/etc/ssmtp’: Permission denied
find: ‘/etc/ssl/private’: Permission denied
    (... SNIP ...)
find: ‘/run/user/997’: Permission denied
find: ‘/run/sudo’: Permission denied
find: ‘/run/systemd/inaccessible’: Permission denied

real    3m40.589s
user    0m4.156s
sys     0m8.874s

It takes over three minutes for find to search everything starting from /. By default, reams of error messages appear and you must search through them to find what you are looking for. Still, it is better than using grep to search the whole drive for a string, which takes 53 hours: see "`grep`ing all files for a string takes a long time".

I know I can fiddle with the find command's parameters to make it work better but the point here is the amount of time it takes to run.
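For reference, the kind of fiddling meant here might look like the sketch below. The function name quiet_find is made up for this example. It discards the "Permission denied" noise and uses the standard -xdev option to stay on the starting filesystem, which also means other mounted partitions are not searched. It does not make a cold scan of / fast.

```shell
#!/bin/sh
# quiet_find START NAME (hypothetical helper): search START for regular
# files called NAME, staying on one filesystem (-xdev) and discarding
# error messages such as "Permission denied".
quiet_find() {
    find "$1" -xdev -type f -name "$2" 2>/dev/null
}

# Example call, left commented out because a full scan of / can take minutes:
# quiet_find / mail-transport-agent.target
```

find still exits with a non-zero status when it could not read some directories, so scripts that check the exit code need to allow for that.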

locate command takes less than a second

Now let's use locate:

$ time locate mail-transport-agent.target
/lib/systemd/system/mail-transport-agent.target

real    0m0.816s
user    0m0.792s
sys     0m0.024s

The locate command takes less than a second!

updatedb only run once a day by default

It is true that the updatedb command, which updates the locate database, is only run once a day by default. To find files that were just added, you can run it manually before searching:

$ time sudo updatedb

real    0m3.460s
user    0m0.503s
sys     0m1.167s

Although this takes about 3 seconds, that is small compared to the find command's 3+ minutes.

I've edited root's crontab (sudo crontab -e) to include the line at the bottom:

# m h  dom mon dow   command
  0 0  1   *   *     /bin/journalctl --vacuum-size=200M
*/5 *  *   *   *     /usr/bin/updatedb

Now updatedb is run every five minutes and the locate command's database is almost always up to date.
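A possible alternative to a fixed schedule, sketched under assumptions: check how old the database file is and refresh it only when it is stale. The path /var/lib/mlocate/mlocate.db is assumed to be the mlocate default and may differ on your system. The function name db_is_stale, the LOCATE_DB variable and the 60-minute threshold are made up for this example, and -mmin is a GNU/BSD find extension.

```shell
#!/bin/sh
# Assumed database location; override by setting LOCATE_DB (hypothetical variable).
DB="${LOCATE_DB:-/var/lib/mlocate/mlocate.db}"

# db_is_stale FILE MINUTES (hypothetical helper): succeed if FILE is missing
# or was last modified more than MINUTES minutes ago.
db_is_stale() {
    if [ ! -e "$1" ]; then
        return 0
    fi
    # find prints the path only when the file is older than the threshold.
    [ -n "$(find "$1" -mmin +"$2" 2>/dev/null)" ]
}

# Usage, left commented out because it needs root and a real database:
# if db_is_stale "$DB" 60; then sudo updatedb; fi
# locate mail-transport-agent.target
#
# To simply look at the database's age (GNU stat syntax; may need sudo):
# stat -c '%y' "$DB"
```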

But there are no attributes?

You can pipe locate output to other commands. If for example you want the file attributes you can use:

$ locate mail-transport-agent.target | xargs stat
  File: '/lib/systemd/system/mail-transport-agent.target'
  Size: 473         Blocks: 8          IO Block: 4096   regular file
Device: 10305h/66309d   Inode: 667460      Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-03-31 18:11:55.091173104 -0600
Modify: 2017-10-27 04:11:45.000000000 -0600
Change: 2017-10-28 07:18:24.860065653 -0600
 Birth: -
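Because the database can lag behind the disk, a pipeline like the one above may hand stat a path that no longer exists. One way to guard against that is sketched below. The function name existing_only is made up for this example, and it assumes no file name contains a newline. mlocate's locate also offers an -e (--existing) option that performs this check itself.

```shell
#!/bin/sh
# existing_only (hypothetical helper): read one path per line on stdin and
# print only the paths that still exist (dangling symlinks are kept too).
existing_only() {
    while IFS= read -r path; do
        if [ -e "$path" ] || [ -L "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}

# Example, left commented out because it needs a locate database.
# xargs -r (a GNU extension) skips running stat when nothing is left.
# locate mail-transport-agent.target | existing_only | xargs -r stat
```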

Summary

I posted this answer to show the speed and ease of use of locate. I tried to address some of the command's shortcomings pointed out by others.

The find command needs to traverse the entire directory structure to find files. The locate command has its own database, which gives it lightning speed in comparison.

  • @EliahKagan But the find command was scrolling through and listing all the directories and files on all the drives and partitions. It appeared to be working and I was expecting a printout at the end... Either way, it wasn't about "fixing" the find command's search; it was about getting the time. Running locate / display-auto-brightness takes 17 seconds and also displays every directory and file on all disks. – WinEunuuchs2Unix Mar 31 '18 at 23:49
  • @EliahKagan I understand. --regex was necessary because there were too many results returned with my search string. I'll find two new examples for find and locate and update my answer in a few minutes. – WinEunuuchs2Unix Mar 31 '18 at 23:54
  • 1
    To clarify Eliah's point, that find command means "print the filenames of all the files in the directories / and display-auto-brightness." I think you meant to use find / -name display-auto-brightness, but even that prints a lot of junk "Permission denied" errors. – wjandrea Mar 31 '18 at 23:59
  • @wjandrea Yes as I said the point wasn't to find the file, it was to time the find command. I'm rerunning tests now with valid parameters after flushing caches. Then I'll update the answer. – WinEunuuchs2Unix Apr 01 '18 at 00:07
  • @EliahKagan I've revised the answer with a file name everyone has on their system so they can test the results for themselves. I've included the correct syntax for find to search based on -name starting at /. I feel the locate command can be very valuable to most people and deserved an answer on Byte's Question. – WinEunuuchs2Unix Apr 01 '18 at 00:37
  • FWIW, some of us are still using 14.04, which doesn't use systemd. – wjandrea Apr 01 '18 at 00:45
  • @WinEunuuchs2Unix Thanks for the edit--that clarifies what you're talking about. (It occurs to me that a possible interpretation of that other question is as a partial XY problem; in some cases, it is reasonable to use locate instead of find, and then one automatically avoids "Permission denied" messages. I don't know if you want to post an answer there too, and I can't be sure if it would be well received by people other than me... but I'm mentioning it in case you want to.) – Eliah Kagan Apr 01 '18 at 00:49
  • @EliahKagan That question has great answers and it focuses on find problems only. Besides time find /home -iname *.pdf 2>/dev/null returns two files in .069 seconds but time locate /home/*.pdf returns the same output in 0.463 seconds 7 times slower! Granted .4 second overhead to search entire database of a million file names for locate is minimal I guess. – WinEunuuchs2Unix Apr 01 '18 at 01:08
  • @wjandrea groans Should I change my search file name a third time? – WinEunuuchs2Unix Apr 01 '18 at 17:04
  • 1
    @Win No, your example is still valid, and I don't think the processing time is changed much whether the file is found or not. – wjandrea Apr 01 '18 at 22:50
  • @wjandrea The time changes when you use time locate * | wc (15 seconds longer) vs. time find / -name "*" 2>/dev/null | wc. (can't remember time difference and it was last night) – WinEunuuchs2Unix Apr 01 '18 at 22:54