Searching a disorganized workspace with deeply nested folders for a specific folder, file or code snippet is a constant drain on productivity. Rummaging through every folder (double-clicking each one) to find a single item is repetitive and pulls attention away from your work. Folders left open after exploring accumulate over time and clutter the screen, obstructing later searches. Additionally, a graphical file explorer, such as macOS's Finder or Ubuntu's Nautilus, can slow down when loading and displaying the contents of large external hard drives, thumb drives or SD cards that are filled (or nearly filled) to capacity.

Unix-like operating systems provide the find and grep command-line utilities, which search for files/folders and for text within files, respectively, via pattern matching. With a single-line command, you avoid interacting with the computer's file explorer altogether. Instead, the command prints the search results to standard output (stdout), displayed within the terminal. Both find and grep are among the most essential building blocks in Bash scripting! Knowing how to use them lets you integrate them into a continuous integration (CI) pipeline to automate search tasks.

Below, I'm going to show you:

  • How to locate folders and files via the find command within a folder hierarchy.

  • How to locate a text string via the grep command within a file.

  • How to interpret and construct your own glob patterns.

To demonstrate the find and grep commands, we will search for directories and files within a downloaded copy of one of GitHub's most popular repositories, facebook/react.

The find Command

The find command, as its name implies, recursively finds directories and files located within a specified list of directory paths (the starting points). When a file or directory matches the search criteria (based on the options provided to the find command), the find command prints its path, which begins with the starting point exactly as you typed it.

To recursively list all of the directories and files (including those hidden) within the current directory:
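A minimal sketch of that command. The sandbox directory and the files in it are hypothetical stand-ins for the repository root so that the snippet can run anywhere; inside a real project, only the `find .` call matters.

```shell
# Hypothetical sandbox standing in for the project root (not the real react repo).
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
mkdir -p packages/react
touch package.json .gitignore packages/react/index.js

# With no expression, find prints the starting point and everything beneath it,
# hidden entries included. The order is not guaranteed, so sort it for display.
listing="$(find . | LC_ALL=C sort)"
printf '%s\n' "$listing"
```

Note that the starting point itself (`.`) is part of the listing.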

Listing of All Directories and Files

To narrow our list down to a specific directory or file, we must provide an expression to the find command:

Note: Angle brackets indicate required arguments, whereas square brackets indicate optional arguments.
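The general shape of an invocation can be sketched as follows. The placeholder names in the comment are my own, written with the bracket notation from the note above, and the sandbox files are hypothetical. GNU find also lets you omit the starting point (it defaults to the current directory), whereas the BSD/macOS version requires one.

```shell
# Assumed general form (placeholder names are illustrative):
#   find <starting-point>... [expression]

# Hypothetical sandbox so the example is reproducible.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
mkdir -p src docs
touch src/index.js docs/README.md

# Two starting points, followed by an expression made of a single test.
found="$(find src docs -name "README.md")"
printf '%s\n' "$found"   # prints: docs/README.md
```

The printed path begins with `docs` rather than `./docs` because find echoes the starting point as it was typed.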

An expression describes how to identify a match via tests, which use certain properties of directories/files to determine which directory/file satisfies the defined conditions:

  • Name (-name) - Match if the base name of a directory/file (its path with the leading directories removed) matches the glob pattern passed to -name. This test is case-sensitive. Use -iname for the case-insensitive variant.

  • Empty (-empty) - Match if a directory is empty or a file has no content.

  • Path (-path) - Match if the whole path of a directory/file matches the glob pattern passed to -path. This test is case-sensitive. Use -ipath for the case-insensitive variant.

  • Permissions (-perm) - Match if the permissions (read/write/execute for owner/group/everyone) of a directory/file exactly equal the octal mode passed to -perm. 000 indicates no permissions, and 777 indicates full permissions.

  • Regular Expression (-regex) - Match if the entire path of a directory/file (not just its name) matches the regular expression passed to -regex. This test is case-sensitive. Use -iregex for the case-insensitive variant.

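  • Size (-size) - Match if the size of a directory/file is greater than (+ prefix), less than (- prefix) or equal (no prefix) to the argument specified for -size. By default, the size is counted in 512-byte blocks. Append a suffix such as c (bytes), k (kibibytes) or M (mebibytes) to choose another unit.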

  • Type (-type) - Match for a specific type. There are many supported types, but for this blog post, we are concerned with only directories (d) and files (f).

For more information on other tests, check out the find command's documentation in the Linux manual pages (man find).
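The -perm, -size and -regex tests are not revisited later in this post, so here is a rough sketch of each. The file names and modes are hypothetical and exist only to make the three searches reproducible.

```shell
# Hypothetical sandbox with three small files.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
touch empty.txt
printf 'hello\n' > notes.txt
printf '#!/bin/sh\n' > run.sh
chmod 644 empty.txt notes.txt
chmod 755 run.sh

# -perm with a bare octal mode matches the permission bits exactly.
executables="$(find . -type f -perm 755)"

# -size with a "c" suffix counts bytes; the "+" prefix means "greater than".
non_empty="$(find . -type f -size +0c | LC_ALL=C sort)"

# -regex has to match the whole path, hence the leading ".*".
text_files="$(find . -regex '.*\.txt' | LC_ALL=C sort)"

printf 'executables:\n%s\n' "$executables"
printf 'non-empty:\n%s\n' "$non_empty"
printf 'text files:\n%s\n' "$text_files"
```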

Name (-name Test)

To get started, let's search for all files named package.json.
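A sketch of that search, run against a hypothetical miniature of the repository layout rather than the real checkout:

```shell
# Hypothetical sandbox mimicking a root package plus one nested package.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
mkdir -p packages/react
touch package.json packages/react/package.json packages/react/index.js

# -name compares only the base name, so matches at any depth are reported.
matches="$(find . -name "package.json" | LC_ALL=C sort)"
printf '%s\n' "$matches"
# prints:
#   ./package.json
#   ./packages/react/package.json
```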

Note: This command also searches for all directories named package.json. It is highly unlikely for directory names to contain extensions. To limit the search to files only, add the -type f test.
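To see the difference the -type f test makes, the sketch below deliberately creates a directory called package.json. The `odd` directory is a contrived, hypothetical example.

```shell
# Hypothetical sandbox: one regular file and one directory share the name.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
mkdir -p odd/package.json
touch package.json

# Without -type, both the file and the directory match.
any_kind="$(find . -name "package.json" | LC_ALL=C sort)"

# With -type f, only the regular file remains.
files_only="$(find . -name "package.json" -type f)"

printf 'any kind:\n%s\n' "$any_kind"
printf 'files only:\n%s\n' "$files_only"
```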

If we try to search for directories or files that do not exist, then find returns an empty list with an exit code of zero.
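This matters in scripts, because the exit code alone cannot tell you whether anything was found. A small sketch, using a hypothetical file name that does not exist in the sandbox:

```shell
# Hypothetical sandbox containing a single file.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
touch package.json

# Search for a name that is not present anywhere below the starting point.
matches="$(find . -name "does-not-exist.json")"
find_status=$?

# Check the output itself, not the exit code, to detect "no matches".
if [ -z "$matches" ]; then
  echo "no matches (exit code: $find_status)"
fi
```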

Now let's search for all JSON files.
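A sketch of the quoted form, again against a hypothetical layout (the tsconfig.json file is invented for the example):

```shell
# Hypothetical sandbox with JSON files at two depths plus a non-JSON file.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
mkdir -p packages/react
touch package.json tsconfig.json packages/react/package.json packages/react/index.js

# The quotes hand the pattern to find untouched, so find does the matching.
json_files="$(find . -name "*.json" | LC_ALL=C sort)"
printf '%s\n' "$json_files"
```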

If you execute this command without the quotation marks around the glob pattern, then you may expect the terminal to also print a list of JSON files. However, notice that the terminal only prints a list of package.json files.

Suppose we rename the package.json file in the root of the current directory to package-x.json.

If we execute the previous find command, then notice that the terminal only prints this package-x.json file.

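Without the quotation marks, Bash expands the glob pattern before find ever runs and replaces it with every file name in the current directory that matches, which here is only package-x.json. The find command therefore searches for that literal name. (If several files in the current directory matched, find would receive multiple names and typically exit with an error. If none matched, Bash would by default pass the pattern through unchanged.) To make sure the result does not depend on the contents of the current directory, wrap the glob pattern in quotation marks so that find receives it intact. Now, rename package-x.json back to package.json.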
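The expansion can be made visible with echo. The sketch below assumes a hypothetical sandbox in which package-x.json is the only JSON file in the current directory, while other JSON files live deeper in the tree.

```shell
# Hypothetical sandbox: exactly one *.json file at the top level.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
mkdir -p packages/react
touch package-x.json packages/react/package.json packages/react/tsconfig.json

# What the shell turns the unquoted pattern into before find starts.
expanded="$(echo *.json)"

# Unquoted: find is really asked for -name package-x.json.
unquoted="$(find . -name *.json | LC_ALL=C sort)"

# Quoted: find receives the pattern and matches at every depth.
quoted="$(find . -name "*.json" | LC_ALL=C sort)"

printf 'expanded to: %s\n' "$expanded"
printf 'unquoted:\n%s\n' "$unquoted"
printf 'quoted:\n%s\n' "$quoted"
```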

Empty (-empty Test)

Currently, this project has no empty directories:

Let's create an empty directory:

When we search for empty directories within this project, the terminal will print the empty-dir directory.
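The three steps above can be sketched in one go. The sandbox layout is hypothetical, and I am assuming the search is restricted to directories with -type d.

```shell
# Hypothetical sandbox in which every directory has content.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
mkdir -p packages/react
touch packages/react/index.js

# Step 1: no empty directories yet, so nothing is printed.
before="$(find . -type d -empty)"

# Step 2: create an empty directory.
mkdir empty-dir

# Step 3: the same search now reports it.
after="$(find . -type d -empty)"
printf '%s\n' "$after"   # prints: ./empty-dir
```

Dropping -type d would also report zero-byte files, since -empty applies to both.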

Path (-path Test)

Since find prints every path with the starting point as its prefix (./ when the search starts from .), the argument passed to the -path test must begin with either * or ./ so that the glob pattern can match the leading segment of the path.

If we tried to locate package.json files with -path and forgot to add these prefixes to the glob pattern, then the find command returns an empty list.

Prepending * or ./ to the glob pattern allows the find command to correctly match for package.json files via their relative paths.
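A sketch comparing the three spellings against a hypothetical two-level layout:

```shell
# Hypothetical sandbox with a root and a nested package.json.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
mkdir -p packages/react
touch package.json packages/react/package.json

# No prefix: no path is exactly "package.json", so nothing matches.
no_prefix="$(find . -path "package.json")"

# "*" prefix: in -path, "*" also matches slashes, so every depth matches.
star_prefix="$(find . -path "*package.json" | LC_ALL=C sort)"

# "./" prefix: only the file directly under the starting point matches.
dot_prefix="$(find . -path "./package.json")"

printf 'star prefix:\n%s\n' "$star_prefix"
printf 'dot prefix:\n%s\n' "$dot_prefix"
```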

Let's find all package.json files within the packages sub-directory:
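One way to write that search is shown below. The exact pattern is my own choice, and the `scheduler` and `fixtures/demo` directories are hypothetical names used to show what gets excluded.

```shell
# Hypothetical sandbox: package.json files inside and outside packages/.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
mkdir -p packages/react packages/scheduler fixtures/demo
touch package.json fixtures/demo/package.json
touch packages/react/package.json packages/scheduler/package.json

# Anchor the pattern at ./packages/ and let "*" cover the directories between.
in_packages="$(find . -path "./packages/*/package.json" | LC_ALL=C sort)"
printf '%s\n' "$in_packages"
```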

Type (-type Test)

The -type test restricts matches to the kind of entry the user is looking for. If the user only wants the find command to limit its search to files, then the -type test must be passed f for "file."

The following command prints all of the files within the current directory.
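A sketch of the files-only search in a hypothetical sandbox:

```shell
# Hypothetical sandbox with two files spread over two directories.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
mkdir -p packages/react
touch package.json packages/react/index.js

# -type f keeps regular files and drops the directories that contain them.
files="$(find . -type f | LC_ALL=C sort)"
printf '%s\n' "$files"
```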

If the user only wants the find command to limit its search to directories, then the -type test must be passed d for "directory."

The following command prints all of the sub-directories within the current directory.
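A sketch of the directories-only search in the same kind of hypothetical sandbox:

```shell
# Hypothetical sandbox with two nested directories and two files.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
mkdir -p packages/react
touch package.json packages/react/index.js

# -type d keeps directories only. The starting point "." is itself a
# directory, so it appears in the output alongside the sub-directories.
dirs="$(find . -type d | LC_ALL=C sort)"
printf '%s\n' "$dirs"
```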

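To search for multiple types with GNU find, join the types together and separate each type with a comma. This comma-separated list is a GNU extension, so the find command that ships with macOS may reject it.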
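Because the comma-separated form is a GNU extension, the sketch below pairs it with an equivalent built from -o ("or") that should also work with other find implementations. The file and link names are hypothetical.

```shell
# Hypothetical sandbox with a directory, a regular file and a symbolic link.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
mkdir -p packages
touch package.json
ln -s package.json link.json

# GNU find only: regular files and symbolic links in one -type test.
# Errors are silenced so the sketch still runs where this form is unsupported.
gnu_out="$(find . -type f,l 2>/dev/null | LC_ALL=C sort)"

# Equivalent without the extension: group two -type tests with -o.
portable_out="$(find . \( -type f -o -type l \) | LC_ALL=C sort)"
printf '%s\n' "$portable_out"
```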

The -type test supports additional types:

  • b for Block Special Files. Examples:

    • Hard-Drive Partitions

    • Memory Devices (SD Cards, Flash Drives)

  • c for Character Special Files. Examples:

    • Terminal File

    • NULL File

    • System Console File

    • File Descriptor File

  • p for FIFO Special Files (Named Pipes). Example:

    • Shared File Systems

  • l for Symbolic Link Files

  • s for UNIX Domain Socket Address Files

  • D for Solaris Door Files
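Two of these types are easy to try out without special privileges. The sketch below creates a symbolic link and a named pipe (the names `shortcut` and `queue` are hypothetical) and then searches for each.

```shell
# Hypothetical sandbox with one regular file, one symlink and one named pipe.
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1
touch target.txt
ln -s target.txt shortcut
mkfifo queue

# -type l matches the link itself, not the file it points to.
links="$(find . -type l)"

# -type p matches FIFOs created with mkfifo.
pipes="$(find . -type p)"

printf 'links: %s\n' "$links"   # prints: links: ./shortcut
printf 'pipes: %s\n' "$pipes"   # prints: pipes: ./queue
```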

Next Steps

Proceed to the second part of this blog post, which dives into the grep command.

Sources