
Edit: Please note, as written below, I'm aware of answers on this site explaining that apt-get uses regex to interpret package names. In fact, the question is directly about a way in which its actual behavior differs from the documented one. Please read the question before suggesting a duplicate.


Some answers on this site warn against using apt-get with wildcards (i.e., asterisks: *), because apt-get supposedly interprets them as regular expressions, which might give unexpected (and undesired) results, especially with apt-get remove. Indeed, the Ubuntu man page for apt-get reads:

If no package matches the given expression and the expression contains one of '.', '?' or '*' then it is assumed to be a POSIX regular expression, and it is applied to all package names in the database. Any matches are then installed (or removed). Note that matching is done by substring so 'lo.*' matches 'how-lo' and 'lowest'. If this is undesired, anchor the regular expression with a '^' or '$' character, or create a more specific regular expression.
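
The substring matching described in that excerpt can be checked without touching apt. The following is a rough sketch using `grep -E` (POSIX extended regular expressions) on a made-up list of names. The `names` list and the `match_names` helper are hypothetical, and apt's own matcher may differ in details such as case sensitivity.

```shell
# Hypothetical package names standing in for the package database.
names="how-lo lowest meld"

# Print every name that contains a match for the regex given in $1.
match_names() {
    printf '%s\n' $names | grep -E -- "$1"
}

match_names 'lo.*'     # prints how-lo and lowest (substring match)
match_names '^lo.*'    # prints only lowest (anchored at the start)
```

Anchoring with `^` is what keeps `how-lo` out of the second result.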

In fact, this answer claims:

apt-get accepts a regular expression and not a glob pattern as the shell.

I believe this is wrong (at least as of Xenial). For example, I can reproduce the following behavior:

$ sudo apt-get install -s 'meld*'
[...]
Note, selecting 'meld' for glob 'meld*'
[...]

$ sudo apt-get install -s 'meldt*'
[...]
Note, selecting 'python-meld3' for regex 'meldt*'
Note, selecting 'python3-meld3' for regex 'meldt*'
Note, selecting 'meld' for regex 'meldt*'
[...]

(I didn't remove any matches, only irrelevant parts of apt-get's response.)

Based on this behavior, it seems to me that apt-get first tries to match each given expression as a glob, and only if that fails does it retry the expression as a regular expression.
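
The order suggested here can be written down as a toy model. This is only a sketch of the behavior observed in the two simulated runs, not apt's actual code: the package list and the `select_packages` function are hypothetical, shell `case` patterns stand in for glob matching, and `grep -E` stands in for the regex pass.

```shell
# Hypothetical package list standing in for the apt cache.
packages="meld python-meld3 python3-meld3 lowest"

# Toy model: try the pattern as a glob first, fall back to regex
# only when the glob pass selected nothing.
select_packages() {
    pattern=$1
    found=0
    # Pass 1: glob. A glob has to match the whole package name.
    for name in $packages; do
        case $name in
            $pattern) echo "glob $name"; found=1 ;;
        esac
    done
    if [ "$found" -eq 1 ]; then
        return 0
    fi
    # Pass 2: regex. An unanchored regex matches any substring.
    for name in $packages; do
        if printf '%s\n' "$name" | grep -Eq -- "$pattern"; then
            echo "regex $name"
        fi
    done
}

select_packages 'meld*'     # prints: glob meld
select_packages 'meldt*'    # no glob match, so three regex matches
select_packages 'meldt*$'   # prints: regex meld
```

In this model `meld*` never reaches the regex pass, which is why `python-meld3` and `python3-meld3` only show up for `meldt*`.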

Do I have that right? Have I misunderstood the man page, or is this behavior badly documented?

Jonathan Y.
  • How does t* match 3 as a glob? as a regex, it matches as "zero or more instances of t" (followed by anything, since the expression isn't anchored - compare to meldt*$) – steeldriver Jun 14 '17 at 15:44
  • I don't get that output for sudo apt-get install -s 'meld*' (I get 100s of regex matches). What version of apt do you have? – Joe P Jun 14 '17 at 15:44
  • @steeldriver The t* matches in the question are all regex matches, not glob matches – Joe P Jun 14 '17 at 15:47
  • @steeldriver that's precisely my point. Since the glob meld* matches some packages (namely, meld), apt-get never notices that the regex meld* will also match (a substring of) python-meld3. But meldt* matches nothing as a glob, which is why apt-get interprets it as regex, finding the other two packages. – Jonathan Y. Jun 14 '17 at 15:47
  • @heemayl I don't believe it's a duplicate, because I'm asking specifically about the mechanism, and a way in which it might be wrongly documented. – Jonathan Y. Jun 14 '17 at 15:48
  • @JoeP I have apt 1.2.20 (amd64). But I suspect the difference might be caused by my having enabled repositories which you haven't. meld 3.14.2-1 is in universe. – Jonathan Y. Jun 14 '17 at 15:52
  • OK that is interesting. On a machine with the same version, I do get glob results as in your question. And it's not in the documentation. This doesn't happen on apt 1.0.1ubuntu2 for amd64. ... So it is an undocumented change in behaviour. – Joe P Jun 14 '17 at 15:56
  • Just to make sure everyone is well confused, Bash (the default shell on Ubuntu) interprets globbing characters before passing them to any program. If your current directory contains files named filea, fileb, and filec, then Bash expands apt-get install file* to apt-get install filea fileb filec. You probably never want that, so it would be advisable to always quote-protect your globs and regular expressions from Bash expansions, such as apt-get install 'file*'. The behavior of apt-get you have found is strange and interesting though ... – takatakatek Jun 14 '17 at 16:29
  • Yet two people have now voted to close this question as a duplicate. To the best of my understanding, they simply pointed at another question where the effects of regex-matching were discussed, with no reference to this globbing behavior. @heemayl -- I would love to discuss this if you still believe this is a duplicate. – Jonathan Y. Jun 14 '17 at 16:33
  • Interestingly, the message about selecting ... glob is only printed when STDOUT is the terminal. If you pipe the output to any other program that message is not printed. For example apt-get install -s 'meld*' | cat doesn't print the message about selecting for glob. – takatakatek Jun 14 '17 at 16:51
  • man 5 apt_preferences | grep -1n glob reveals that globbing is known to Apt in general, but I still think this is a documentation bug. It isn't for Ubuntu though, it is for the maintainers of Apt, Debian. – takatakatek Jun 14 '17 at 17:07
  • @takatakatek It doesn't print the message, but it still expands 'meld*' and 'meldt*' in the same fashion. – Jonathan Y. Jun 14 '17 at 18:23
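
The shell expansion mentioned in the comments can be reproduced safely with `echo` in place of a real apt-get call. This is a minimal sketch: the temporary directory and the file names filea, fileb, and filec are only there for the demonstration, and nothing is installed.

```shell
# Scratch directory containing three files that match file*.
demo_dir=$(mktemp -d)
touch "$demo_dir/filea" "$demo_dir/fileb" "$demo_dir/filec"

# Unquoted: the shell expands file* before the command ever sees it.
unquoted=$(cd "$demo_dir" && echo apt-get install file*)
# Quoted: the pattern is passed through literally.
quoted=$(cd "$demo_dir" && echo apt-get install 'file*')

echo "$unquoted"   # apt-get install filea fileb filec
echo "$quoted"     # apt-get install file*

rm -r "$demo_dir"
```

With the quotes in place, apt-get itself receives `file*` and decides how to interpret it.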

1 Answer


This is explained in the apt(8) manpage:

install, remove, purge (apt-get(8))
   Performs the requested action on one or more packages specified via
   regex(7), glob(7) or exact match. The requested action can be
   overridden for specific packages by append a plus (+) to the
   package name to install this package or a minus (-) to remove it.

This paragraph does not exist in the 15.10 manpage, so it might have been added in 16.04.

This does not seem to have been mentioned in apt's changelog: the commit which added this in 2013 doesn't show any changes to the manpages. The feature was briefly disabled and re-enabled later on (see commits between February and May 2014), and the disabling is mentioned in the changelog.

So this may have been added four years ago but only documented in 2015. And apt-get's manpage remains neglected.

muru
  • I see. As far as I can tell, the operation order is the reverse of what's implied by the man page; that is, apt (and apt-get) first looks for glob matches, and only looks for regex matches if that failed. I wouldn't know how and where to verify that (although the links you supplied suggest fileutl.{cc,h}, apt-pkg/policy.cc and/or apt-pkg/versionmatch.cc); any suggestions? – Jonathan Y. Jun 15 '17 at 07:53
  • @JonathanY. maybe I was mistaken about when it was added. See apt-pkg/cacheset.cc – muru Jun 15 '17 at 08:03
  • That's a little over my head, I'm afraid. Does that function return a list of matches in global variables, beyond the declared Boolean function type? Anyway, am I correct in assuming the path forward is to open a GitHub issue about the documentation? (Also, as you can probably tell, any suggestion you might have on where that issue goes and what it should specify -- or dare I ask, help opening it -- will be very much appreciated.) – Jonathan Y. Jun 15 '17 at 08:11
  • Since this is Debian, issues typically go to https://bugs.debian.org. I filed it for you: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=864811. You can create a PR on GitHub if you want, and mention this bug number. – muru Jun 15 '17 at 08:49