How can I count all the python and shell scripts in my whole system?
2 Answers
Quick overview
Here is a guideline on how to do it.
$ for f in * ; do file "$f" ; done
aptfielout: ASCII text, with very long lines
aptfilein: ASCII text, with very long lines
aptfileout: ASCII text
aptfileparse.sh: Bourne-Again shell script, ASCII text executable, with very long lines
aptfileparse.sh~: ASCII text, with very long lines
calc.py: Python script, UTF-8 Unicode text executable
catall.sh: Bourne-Again shell script, ASCII text executable
Strip out all the files that don't say "Bourne-Again shell script," or "Python script,". Also add POSIX shell scripts to the list:
$ file /bin/zgrep
/bin/zgrep: POSIX shell script, ASCII text executable
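As a rough illustration of that filtering step, here is a minimal sketch that runs grep over the sample file output shown above (with the zgrep line appended). The function name filter_scripts is made up for this example and is not part of the original commands.

```shell
# Hypothetical helper: keep only lines that `file` labelled as shell or Python scripts.
filter_scripts() {
    grep -E 'Bourne-Again shell script,|POSIX shell script,|Python script,'
}

# Sample `file` output copied from the listing above, plus the zgrep line.
sample='aptfielout: ASCII text, with very long lines
aptfilein: ASCII text, with very long lines
aptfileout: ASCII text
aptfileparse.sh: Bourne-Again shell script, ASCII text executable, with very long lines
aptfileparse.sh~: ASCII text, with very long lines
calc.py: Python script, UTF-8 Unicode text executable
catall.sh: Bourne-Again shell script, ASCII text executable
/bin/zgrep: POSIX shell script, ASCII text executable'

# Only the four script lines should survive the filter.
printf '%s\n' "$sample" | filter_scripts
```

On real files you would pipe the loop into the filter instead: for f in * ; do file "$f" ; done | filter_scripts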
A complete answer
/$ time find * -type f -print0 2>/dev/null | xargs -0 -P 8 file | \
sed 's/.*: //g' | sed 's/^ *//g' | \
grep -Eio 'shell script,|Python script,' | sort | uniq -c
19151 Python script,
127 python script,
18420 shell script,
real 16m14.939s
user 54m7.355s
sys 2m33.238s
Starting from the root (/), find lists all files and pipes them to the xargs command as zero-byte-terminated names.
The xargs command runs in parallel, using all 8 CPUs for faster processing. Each parallel process calls the file command, which gets a description of the file as shown in the previous section.
The two sed commands strip the leading file name and any leading spaces from each description.
The grep command selects shell scripts and Python scripts.
The sort command sorts shell scripts together and Python scripts together.
The uniq command counts the occurrences of each group.
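To see the later stages of the pipeline in isolation, here is a small sketch that feeds a few made-up lines in file's "name: description" format through sed, grep, sort and uniq. The helper name tally_script_types and the sample lines are invented for illustration. The tr step is an addition that is not in the command above: grep -o prints the match as it appears in the input, which is why the real output has separate "Python script," and "python script," rows, and folding case merges them.

```shell
# Hypothetical helper: tally script types from "name: description" lines on stdin.
tally_script_types() {
    sed 's/.*: //' |                                # drop the "name: " prefix
        grep -Eio 'shell script,|python script,' |  # keep only the matching phrase
        tr '[:upper:]' '[:lower:]' |                # assumption: fold case so variants share one group
        sort |                                      # bring identical phrases together
        uniq -c                                     # count each group
}

# Made-up sample input; real input would come from find | xargs file.
printf '%s\n' \
    'a.sh: Bourne-Again shell script, ASCII text executable' \
    'b.sh: POSIX shell script, ASCII text executable' \
    'c.py: Python script, UTF-8 Unicode text executable' \
    'd.py: a python script, ASCII text executable' \
    'e.txt: ASCII text' |
    tally_script_types
```

Drop the tr line if you want to keep the case variants as separate groups, as in the original output.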
Fun facts
You can really tax your system running all 8 CPUs (in my case) at once.
The beauty of Linux shines through because other jobs such as the screen recorder making the .gif and a video running on the third monitor (big screen TV) continue to function normally. Linux doesn't let the xargs file command bog down the system.

- Note that, as written, this technique is not recursive. (That's not necessarily a problem since this is presented as a general guideline, but it's good to be aware of.) – Eliah Kagan Nov 22 '19 at 04:17
- @EliahKagan I had quickly posted an answer before leaving for work because question was about to be closed. If I waited too long no answer could be posted. – WinEunuuchs2Unix Nov 22 '19 at 11:37
- Yes, I did not intend it as a criticism--just a note for readers using the answer. – Eliah Kagan Nov 22 '19 at 11:43
- @EliahKagan Thanks :) I've finalized my answer now. BTW time and counts are for three Ubuntu distributions mounted and three NTFS partitions mounted... – WinEunuuchs2Unix Nov 22 '19 at 11:49
In the absence of a more specific goal, this will be approximate no matter how you do it, because of ambiguities about what constitutes a shell script and what constitutes a Python script. That doesn't make the problem too ill-defined, so long as an approximation is what you want. And you can get a good approximation.
Given that, I suggest this command to list shell and Python scripts:
find . -type f -executable -exec file {} + | grep -Ei '(python|shell) script,'
If the output looks reasonable for your needs, you can run it again, modified to count the number of results:
find . -type f -executable -exec file {} + | grep -Ei '(python|shell) script,' | wc -l
You may get some "Permission denied" errors. That's okay. I don't recommend attempting to suppress those error messages, because you should read or at least scan through them to see if it looks like you were unable to access any files or locations that were of interest to you. You can run the find command as root with sudo if you really want to.
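If the error messages scroll by too fast to read, one option is to send them to a file and look at them afterwards, rather than discarding them. Here is a minimal sketch of that idea, assuming GNU find (for -executable). The helper name list_executables, the log file and the demo directory are all made up for the example.

```shell
# Hypothetical helper: list executable regular files under a directory,
# saving find's error messages to a log file instead of discarding them.
list_executables() {
    # $1 = directory to search, $2 = file that receives find's stderr
    find "$1" -type f -executable 2>"$2"
}

# Demo on a throwaway directory (all names here are invented).
demo=$(mktemp -d)
log=$(mktemp)
printf '#!/bin/sh\necho hi\n' >"$demo/hello.sh"
printf 'plain text\n' >"$demo/notes.txt"
chmod +x "$demo/hello.sh"

list_executables "$demo" "$log"

# Afterwards, review what find could not access (prints the number of such messages).
grep -c 'Permission denied' "$log" || true
```

The output of list_executables can be piped on to file and grep in the same way as the commands above.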
- -type f makes it find only regular files. Usually it's better to use -xtype f to include symbolic links that resolve to regular files, but in this case that would result in overcounting.
- -executable makes it find only files that are executable by the user who runs find. Looking at non-executable files to see if they appear to be shell or Python scripts would make the command take considerably longer. You may also get more false positives that way, in that files that aren't executable may be "libraries" rather than scripts, i.e., they may consist of shell commands and be intended for sourcing with . or source into shell scripts, or they may be Python modules that one would import with import or from into Python programs. (You might think this would not happen, since such files generally do not have a shebang, but file looks for more than a shebang.) However, you can omit -executable if you like--and if you are willing to wait as your command attempts to open and read the beginning of every regular file on your system.
- -exec ... + runs a command (shown here as ...) with the found files as its command-line arguments. It runs the command as many times as necessary to process all the files. Often this is just once; for all the executable files on your whole system, it will likely be more than once, but many fewer times than if you ran it once per file (as -exec ... \; would do). Even on the same number of files, running a command fewer times tends to be notably faster than running it more times, because there is lower associated overhead.
- The file command looks at the beginning of a file and guesses, usually pretty well, what kind of file it is. It outputs in a two-column format, with the path or filename on the left and a summary of what kind of file it appears to be on the right.
- The grep command filters its input and outputs only lines that case-insensitively (-i) match the extended regular expression (-E) (python|shell) script,. Those are the lines that contain the text "python script," or "shell script," or any case variant thereof. Files that file identifies as those types of scripts will show this.
- wc -l, which appears in the second of two commands shown above, counts lines.
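The difference between -exec ... + and -exec ... \; can be seen on a small scale. In this sketch the command being run is a tiny inline shell script that prints one line per invocation, so counting lines counts how many times find ran it. The demo directory and variable names are invented for the example.

```shell
# Demo directory with five empty files (names are arbitrary).
demo=$(mktemp -d)
touch "$demo/f1" "$demo/f2" "$demo/f3" "$demo/f4" "$demo/f5"

# Each invocation of the inner shell prints one line, so counting lines counts invocations.
batched=$(find "$demo" -type f -exec sh -c 'echo "$# file(s)"' sh {} + | wc -l)
per_file=$(find "$demo" -type f -exec sh -c 'echo "$# file(s)"' sh {} \; | wc -l)

printf 'with +:  %s invocation(s)\n' "$batched"
printf 'with \\;: %s invocation(s)\n' "$per_file"
```

With only five files, the + form should need a single invocation while the \; form needs five. On a whole system the + form runs more than once, but still far fewer times than once per file.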
As shown, this technique is wholly unsuitable for many tasks that involve discerning what type of files one has. The reason is that a file can have text like "python script," in its name, as well as newline characters in its name that would cause the output of file not to be one-per-line. It is usually important, and often even vital, to account for such things, and it can be done. In this case, however, you're just going for an estimate (due to the fuzzy nature of the problem itself) and it appears you're not renaming, modifying, deleting, or even creating anything based directly on the result, so I don't think it's worthwhile to worry about that. If you end up iterating on this and defining the problem more strictly, then it could be worthwhile to address that.
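If you do want to guard against odd file names, one possible approach (a sketch, not the only way) is to ask file for brief output with -b, so that names never appear in what grep sees. This assumes GNU find and that file prints one description line per file in brief mode. The helper name count_scripts_brief and the demo files are made up.

```shell
# Hypothetical helper: count shell and Python scripts without parsing file names.
# file -b (brief mode) prints only the description, so text in a file's name
# cannot match the pattern and a newline in a name cannot add extra lines.
count_scripts_brief() {
    find "$1" -type f -executable -exec file -b {} + |
        grep -Eic '(python|shell) script,'
}

# Demo: one real script and one decoy whose *name* contains "python script,".
demo=$(mktemp -d)
printf '#!/bin/sh\necho hi\n' >"$demo/real.sh"
printf 'plain text\n' >"$demo/python script, not really.txt"
chmod +x "$demo/real.sh" "$demo/python script, not really.txt"

# Only run the demo if the file utility is installed.
if command -v file >/dev/null 2>&1; then
    count_scripts_brief "$demo"
fi
```

The pipeline without -b would count the decoy too, because its name matches the pattern.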
Note that there is one major case where you might wish to consider non-executable files to be scripts: if you have many Python scripts brought over from a system like Windows where they are not marked executable. In that case, you can search for .py files, though be aware that many of them are likely to be Python modules rather than Python scripts. If the good Python practice of putting a hashbang at the top of the script has been followed (this is useful even in Windows, because py.exe and pyw.exe recognize them, though unfortunately it's not always done), then a technique that looks just for hashbangs but ignores if a file is executable may be more suited to your needs.
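A hashbang-only check along those lines might look like the following sketch. It counts .py files whose first line starts with #! and mentions python, whether or not they are marked executable. The helper name count_py_hashbang and the pattern are my own assumptions about what counts as a Python hashbang, so adjust them to taste.

```shell
# Hypothetical helper: count .py files whose first line is a hashbang naming python,
# regardless of whether the file is marked executable.
count_py_hashbang() {
    find "$1" -type f -name '*.py' -exec sh -c '
        for f do
            first=
            [ -r "$f" ] || continue          # skip files that cannot be read
            IFS= read -r first <"$f" || :    # first line only; tolerate a missing final newline
            case $first in
                "#!"*python*) echo match ;;  # one fixed line per match keeps odd names out of the count
            esac
        done' sh {} + | wc -l
}

# Demo: a script with a hashbang and a module without one, neither executable.
demo=$(mktemp -d)
printf '#!/usr/bin/env python3\nprint("hi")\n' >"$demo/tool.py"
printf 'def f():\n    return 1\n' >"$demo/module.py"

count_py_hashbang "$demo"
```

Because the check only looks for "python" somewhere after #!, it also accepts lines with Windows line endings or launcher-style hashbangs such as "#! python3".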
There is also a minor but significant case where you might wish to consider non-executable files to be scripts of any kind--or, more precisely, where you might wish to test for executability differently. If you have a drive mounted noexec, then no file on it will pass find's -executable test. Note that this is a different problem from running find as a user who doesn't have permissions to execute some files--like the problem of running it as a user who doesn't have permissions to look in some directories, this can be solved by running it as a sufficiently privileged user.
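One way to test for executability differently, assuming GNU find, is to look at the permission bits with -perm /111 instead of using -executable. This is only a sketch: the helper name list_by_exec_bits is invented, and whether mode bits are a good criterion depends on the filesystem (some, such as NTFS mounted with default options, may report execute bits on every file).

```shell
# Hypothetical helper: list regular files with any execute bit set in their mode.
# -perm /111 (GNU find syntax) inspects the permission bits themselves, whereas
# -executable asks whether the current user could actually execute the file.
list_by_exec_bits() {
    find "$1" -type f -perm /111
}

# Demo on a throwaway directory (names are invented).
demo=$(mktemp -d)
printf '#!/bin/sh\necho hi\n' >"$demo/run.sh"
printf 'plain text\n' >"$demo/readme.txt"
chmod 755 "$demo/run.sh"
chmod 644 "$demo/readme.txt"

list_by_exec_bits "$demo"
```

The result can be fed to file and grep in the same way as before, for example by appending -exec file {} + to the find command inside the function.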
This problem, as you've posed it, is unusual--ordinarily one would want to find scripts of a specific language or small family of closely related languages. But for the benefit of future readers, note that finding all the (for example) shell scripts in a single, perhaps large, directory can also be accomplished with a slight modification of the above commands. (The same holds for the technique presented in WinEunuuchs2Unix's answer--it is useful for that, too.)
For example, to find all the shell scripts in the current directory:
find . -type f -executable -exec file {} + | grep -Fi 'shell script,'

POSIX shell script, UTF-8 Unicode text executable" or other details that file command returns. A good use case is someone needs to search for specific file types during a system conversion project. It is possible some people voted to close as "unclear" because they didn't know how to do it. But that isn't a valid reason. – WinEunuuchs2Unix Nov 22 '19 at 12:02