==========================================
GLOB patterns (wildcard pathname matching)
==========================================
-Ian! D. Allen - idallen@idallen.ca

The shell will treat as pathnames (files, directories, etc.) any words
or tokens on the command line that contain GLOB metacharacters
("wildcard" characters).  The shell will try to match the GLOB patterns
against existing pathname components to produce a list of existing
pathnames.  (A pathname component lies between slashes in a pathname -
a GLOB pattern can never match or cross a slash.)

These are the GLOB metacharacters processed by the shells:

    *      - matches zero or more of any character
    ?      - matches any one character
    [list] - matches any one character in the list of characters;
             the list can contain a range of characters such as [a-zA-Z]
             and can be negated/complemented by using ^ or ! at the
             start, e.g. [^a-zA-Z] = not a letter; [!0-9] = not a digit
             (the ! form is the portable one; ^ is accepted by bash but
             not by every shell)

The shell always processes GLOB characters that it finds on the command
line, even for commands that do not take pathnames.  (The shell doesn't
know which commands do or do not take pathnames.)  If you do not want
GLOB processing to happen by accident, hide the GLOB characters from the
shell by using quoting, for example:

    $ grep '*' /etc/passwd
    $ echo "*** Warning: assuming the worst"
    $ mail -s "Coming to dinner?" idallen
    $ tr '[:lower:]' '[:upper:]' <lowercase >uppercase

GLOB patterns match any existing pathname component - the pathname might
be a file, a directory, or some other Unix pathname type (e.g. socket,
fifo).  GLOB patterns do not match or cross the slashes that separate
pathname components.
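
The metacharacters and the quoting rule described above can be tried
safely in a scratch directory.  The following is a minimal sketch, not
part of the original notes: the directory comes from mktemp, and the
file names (a1, b2, c3, notes) are made up for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
# (The file names below are illustrative assumptions.)
demo=$(mktemp -d) && cd "$demo" || exit 1
touch a1 b2 c3 notes

unquoted=$(echo [a-c]?)     # the shell expands the pattern
quoted=$(echo '[a-c]?')     # quoting hides the pattern from the shell
negated=$(echo [!a-c]*)     # names NOT starting with a, b, or c

echo "unquoted: $unquoted"  # unquoted: a1 b2 c3
echo "quoted:   $quoted"    # quoted:   [a-c]?
echo "negated:  $negated"   # negated:  notes
```

The command (echo) is the same in all three cases; only the shell's
GLOB processing of the argument changes what echo receives.
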
To match multiple levels of pathnames, you must use a separate GLOB
pattern between each slash:

    $ ls /usr/bin/*     # GLOB applies only in /usr/bin/
    $ ls /usr/*/*       # GLOB applies in /usr and everything in /usr/*
    $ ls /usr/*/date    # GLOB applies in /usr only
    $ ls /*/*/*         # GLOB applies in / and everything in /* and /*/*

GLOB characters can match any character in a pathname component,
including spaces, newlines, and unprintables; but, they do not match the
slashes that separate pathname components:

    $ echo /*           # matches names under the ROOT; but, no deeper
    $ echo /bin/*       # matches names under /bin; but, no deeper
    $ echo /usr/bin/*   # matches names under /usr/bin; but, no deeper
    $ echo /*/*/*       # matches names under /*/*; but, no deeper

The pathnames generated by /* and /*/* have no names in common.  The "*"
does not match the slash that separates pathname components.

Each separate word/token in the command line is examined for GLOB
characters, and has those characters expanded separately.  The only way
for the same pathname to appear twice is for the shell to find two
separate tokens/words containing GLOB characters and expand them
separately.

The command lines below produce identical output, since only one token
is found by the shell and "*" means the same thing as "**" or "***":

    $ echo *
    $ echo **
    $ echo ***

The command lines below produce duplicate or triplicate output, since
each blank-separated token is separately GLOB-expanded by the shell:

    $ echo * *
    $ echo * * *

If a GLOB pattern expansion produces a file name with a space or other
shell metacharacter in it, that space or shell metacharacter is not seen
or processed specially by the shell.  (The space in the name does not
produce extra command line arguments.)

Examples of GLOB use in Bourne shells:

    $ touch a ab abc abcde abcdef
    $ ls
    a  ab  abc  abcde  abcdef
    $ echo *
    a ab abc abcde abcdef
    $ echo **
    a ab abc abcde abcdef
    $ echo ****
    a ab abc abcde abcdef
    $ echo * *
    a ab abc abcde abcdef a ab abc abcde abcdef
    $ echo ?
    a
    $ echo ??
    ab
    $ echo ? ?
    a a
    $ echo ?? ??
    ab ab
    $ echo ???
    abc
    $ echo ???*
    abc abcde abcdef
    $ echo *???
    abc abcde abcdef
    $ echo ?*?
    ab abc abcde abcdef
    $ echo *[cf]
    abc abcdef
    $ echo *c*
    abc abcde abcdef

In Bourne-style shells (e.g. bash on Linux), if the shell cannot match a
GLOB pattern against any existing pathname, the pattern is left unchanged
by the shell and passed to the command being used.  No error message is
generated by the shell; the command runs with the unchanged argument.
(C-Shells do produce error messages and refuse to run the command.)
The command itself will probably produce an error message for the
nonexistent pathname passed to it by the shell:

    $ echo *z*
    *z*
    $ rm *z*
    rm: cannot remove `*z*': No such file or directory
    $ echo ???????
    ???????
    $ rm ???????
    rm: cannot remove `???????': No such file or directory
    $ echo foobar*
    foobar*
    $ rm foobar*
    rm: cannot remove `foobar*': No such file or directory

GLOB patterns themselves never match a leading period in a pathname
component.  To match a pathname component that starts with a period, you
must explicitly put the real period at the start of the pattern.  (This
is why GLOB patterns never show the directory entries . or .. unless you
start the pattern with a period.)  This rule even applies to GLOB
character lists that contain a period ("[.]") - they do not match
leading periods in pathnames!

    $ rm *
    $ touch .a .ab .abc .abcde .abcdef
    $ echo *
    *
    $ echo ?
    ?
    $ echo ??
    ??
    $ echo .?
    .. .a
    $ echo .??
    .ab
    $ echo .*
    . .. .a .ab .abc .abcde .abcdef
    $ echo .??*
    .ab .abc .abcde .abcdef
    $ echo [.]*
    [.]*

This refusal to match hidden files is there so that people who type
things like "rm -rf *" don't accidentally match and remove the contents
of ".." from their accounts.  Beware that GLOB pattern ".*" matches the
pathname ".." and therefore "rm -rf .*" recursively removes your parent
directory (if the rm command doesn't check first).
A safer pattern is ".??*" (as in "rm -rf .??*"), which matches only
names at least three characters long and thus does not match "." or ".."
(it also misses any two-character names starting with a period, such as
".a").  (bash 5.2 and later enable the globskipdots option by default,
so in those versions GLOB patterns no longer match "." or ".."; older
versions behave as shown here.)

GLOB patterns apply to the directory indicated by the pathname in which
they appear.  For example:

    $ echo *        # match names in current directory
    $ echo ./*      # match names in current directory
    $ echo ../*     # match names in parent directory
    $ echo /*       # match names in ROOT directory
    $ echo /bin/*   # match names in /bin/ directory
    $ echo /*/ls    # match directories in ROOT directory that contain "ls"
    $ echo */ls     # match directories in current directory that contain "ls"

A single GLOB pattern never crosses a directory boundary (it never
matches a slash in a pathname); but, patterns may be expanded in multiple
directories using multiple GLOB patterns separated by slashes:

    $ echo /*/ls
    /bin/ls
    $ echo /*/*x
    /bin/ex /dev/psaux /dev/ptmx /etc/mgetty+sendfax /etc/postfix
    /mnt/knoppix /sbin/fsck.minix /sbin/mkfs.minix /sbin/partx
    $ echo /[bs]*/*x
    /bin/ex /sbin/fsck.minix /sbin/mkfs.minix /sbin/partx
    $ echo /* | wc -w
    26
    $ echo /*/* | wc -w
    2034

The lists of pathnames produced by /* and /*/* have no pathnames in
common.

A GLOB pattern that appears on the left of any slash must match one or
more directory names (or symbolic links to directory names); because,
only directory names can appear to the left of slashes in valid
pathnames.
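
The rule that a pattern to the left of a slash can only match
directories can be checked with a small scratch tree.  This is a sketch
under stated assumptions: the directory names (bin, sbin, etc) and the
file names are invented for the demonstration, and they live under a
temporary directory rather than the real ROOT.

```shell
# Build a tiny tree in a throwaway directory (all names are hypothetical).
demo=$(mktemp -d) && cd "$demo" || exit 1
mkdir bin sbin etc
touch bin/ls bin/ex sbin/partx etc/fax
touch ls                    # a plain FILE named "ls" in the top directory

with_ls=$(echo */ls)        # "*" must match a directory that contains "ls"
ends_in_x=$(echo */*x)      # one pattern per level, separated by a slash
b_or_s=$(echo [bs]*/*x)     # only directories starting with b or s

echo "$with_ls"             # bin/ls
echo "$ends_in_x"           # bin/ex etc/fax sbin/partx
echo "$b_or_s"              # bin/ex sbin/partx
```

The plain file "ls" never shows up as "ls/ls": a file name cannot appear
to the left of a slash, so the pattern */ls cannot use it.
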

These shell patterns match all non-hidden names in the /tmp/idallen
directory:

    $ pwd
    /tmp/idallen
    $ echo *
    $ echo ./*
    $ echo ../idallen/*
    $ echo ../../tmp/idallen/*
    $ echo /tmp/idallen/*

These patterns match all non-hidden names in the /tmp directory (parent):

    $ pwd
    /tmp/idallen
    $ echo ../*
    $ echo ./../*
    $ echo .././*
    $ echo /tmp/*
    $ echo /tmp/./*
    $ echo /././././tmp/./././*

These patterns match all non-hidden names in the root directory (the
parent of the parent directory):

    $ pwd
    /tmp/idallen
    $ echo ../../*
    $ echo ../../../../../../*
    $ echo /*
    $ echo /tmp/../*
    $ echo /tmp/idallen/../../*

--------------------------------
Useful trick with GLOB patterns:
--------------------------------

This is a valid directory pathname:

    $ ls dir1/.

This is not a valid directory pathname:

    $ ls file1/.

(A filename cannot be used as a directory; only directory names can
appear to the left of slashes in valid Unix pathnames.)

Question: What valid pathnames does "echo */." match and output?
          How does it differ from the pathnames matched by "echo *"?

Hint: "*" matches every non-hidden name in the current directory.
      "*/." only matches names where "*/." is valid, and "*/." is only
      valid for pathnames where "*" matches a directory name (because
      only directory names can appear to the left of slashes in valid
      Unix pathnames).

Question: What is the difference between "echo ./*" and "echo */." ?
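
One way to explore the two questions is to run the patterns in a scratch
directory and compare the results.  The sketch below is an illustration
only: it borrows the names dir1 and file1 from the examples above and
adds dir2 and file2, which are assumptions, all inside a temporary
directory created with mktemp.

```shell
# Two directories and two plain files in a throwaway directory.
# (dir2 and file2 are hypothetical names added for the demonstration.)
demo=$(mktemp -d) && cd "$demo" || exit 1
mkdir dir1 dir2
touch file1 file2

all_names=$(echo *)         # every non-hidden name
dot_slash=$(echo ./*)       # the same names, each prefixed with "./"
dirs_only=$(echo */.)       # only names that are valid left of a slash

echo "$all_names"           # dir1 dir2 file1 file2
echo "$dot_slash"           # ./dir1 ./dir2 ./file1 ./file2
echo "$dirs_only"           # dir1/. dir2/.
```

Comparing the last two lines of output shows which names survive when
the GLOB pattern is placed to the left of a slash.
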