PH8124

Lecture 3: Linux file system. Positional parameters. Your first Linux/Bash command. Command precedence

Last update: 20260501-5

Table of Contents

  1. Linux file system
  2. Positional parameters
  3. Your first Linux/Bash command: Bash functions
  4. Command precedence

1. Linux file system

We have already seen how you can make your own files (e.g. with touch, cat or nano), and your own directories (with mkdir). The organization of files and directories in Linux is not arbitrary, and it follows the common and widely accepted structure named Filesystem Hierarchy Standard (FHS). The top directory is the so-called root directory and is denoted by / (slash). You can enter it and see its content by executing the following code snippet in the terminal:

cd /
ls

The output could look like:

bin  boot  dev  etc  home  lib  media  opt  proc  root  run  sbin  sys  tmp  usr  var

All files and directories on your computer are in one of these subdirectories. Depending on which Linux distribution you are using, the details might differ — you can programmatically inspect which Linux distribution is installed on your computer with the following command:

$ cat /etc/os-release 
NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.4 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

Schematically, the Linux file system structure can be represented with the following diagram:

The Bash built-in command cd (‘change directory’) is used to move from the current working directory to some other directory. It accepts only one argument, which is interpreted either as an absolute path to the new directory (if the argument starts with /), or as a relative path to the new directory (relative to your current working directory). If you use cd without any argument, the argument is defaulted to the home directory. Due to their special meanings, the only characters that cannot be part of a directory name are / and the null byte \0.

If you get confused where you are at the moment in the Linux file system (i.e. where is your current working directory in the overall file system hierarchy), you can always get that information either from Bash built-in command pwd (‘print working directory’):

pwd

or by referencing the content of environment variable PWD, which is always set to the absolute path of your current working directory:

echo $PWD

Both versions return the same answer in all cases of practical interest. However, as a general rule of thumb, it is more efficient to read the information directly from an environment variable like PWD than to run a command and capture its output in a variable via the so-called command substitution operator $( ... ) (more on this later), because command substitution has to start a subshell first.
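As a minimal sketch of the two approaches, the same path can be obtained without and with command substitution. The variable names ViaVariable and ViaCommand are chosen here only for illustration:

```shell
# Read the path directly from the environment variable (no extra process):
ViaVariable="${PWD}"

# Run the built-in command and capture its output via command substitution:
ViaCommand="$(pwd)"

if [ "${ViaVariable}" = "${ViaCommand}" ]; then
 echo "Both approaches give: ${ViaVariable}"
fi
```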

The most important directories in the Linux file system structure are:

We have already used the Linux commands date and touch. But to which physical executables (binaries), stored somewhere in the file system, do these two commands correspond? For all cases of practical interest, you can figure that out simply by using the command which:

$ which date
/bin/date
$ which touch
/usr/bin/touch

It is completely equivalent to execute in the terminal the command name, e.g. date, or the full absolute path to the corresponding executable. Therefore,

$ date
Mon Apr 27 16:12:06 CEST 2020

is the same as:

$ /bin/date
Mon Apr 27 16:12:06 CEST 2020

It would be very tedious and impractical if, each time we wanted to use some command, we had to type in the terminal the absolute path to its executable sitting somewhere in the Linux file system, both in terms of typing and in terms of memorizing the exact locations. This is precisely where Bash (or any other shell) is extremely helpful: the shell finds the correct executable in the file system for us, after we have typed only the short command name in the terminal, and executes it. Clearly, something is happening here behind the scenes: how does the shell know which physical executable in the file system is linked with the short command name you have typed in the terminal? Hypothetically, we could also have another version of the date command sitting somewhere else in the file system, e.g. at /usr/bin/date. Then there is an ambiguity, since after we have typed date in the terminal, it is not clear whether we want /bin/date or /usr/bin/date to be executed.

This is resolved with a very important environment variable PATH. To see its current content, simply type:

echo $PATH

The output could look like this:

/home/abilandz/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

This output looks messy, but in fact it has a well-defined structure which is easy to decipher. In the above output, we can recognize absolute paths to a few directories, which are separated in this context with the field separator : (colon). The directories specified in the environment variable PATH are extremely important, because Bash searches for the corresponding executable only inside them after you have typed the short command name in the terminal. Literally, the command date works because the directory /bin, where its corresponding executable /bin/date sits, was added to the content of the PATH variable. The order of directories in PATH matters: as soon as Bash finds your executable in some directory specified in PATH, it stops searching in the remaining directories. The priority of the search is from left to right. Therefore, if you had two executables in the file system for the same command name, e.g. /bin/date and /usr/bin/date, and if the content of PATH is as in the example above, then after you have typed date in the terminal, Bash would execute /usr/bin/date and not /bin/date, because /usr/bin is specified before /bin in PATH. In our case, however, there is no date executable in /usr/bin, so Bash continues the search in /bin, finds it there, and executes /bin/date.
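The left-to-right search can be imitated by hand. The following is only a sketch of the idea, not the way Bash implements the lookup internally; the variable names Name, Dirs, Candidate and Found are chosen for illustration:

```shell
Name="date" # the command name to look up (any other name works as well)
Found=""

# Split the content of PATH at each colon into the array Dirs:
IFS=':' read -r -a Dirs <<< "${PATH}"

# Go through the directories from left to right and stop at the first match:
for Dir in "${Dirs[@]}"; do
 Candidate="${Dir:-.}/${Name}" # an empty entry in PATH means the current directory
 if [[ -f "${Candidate}" && -x "${Candidate}" ]]; then
  Found="${Candidate}"
  break
 fi
done

echo "The first match for ${Name} is: ${Found}"
```

The break statement is the essential part: once the first executable with the requested name is found, all directories further to the right in PATH are ignored.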

By manipulating the ordering of directories in PATH variable, you can also have your own version of any Linux command — just place the directory with your own executables at the beginning of PATH variable, and then those directories will be searched first by Bash. For instance, you can have your own executable for date in your local directory for binaries (e.g. in /home/abilandz/bin). Then, you need to redefine PATH in such a way that it has your personal directory with higher priority, when compared to standard system-wide directories for command executables (like /bin, /usr/bin, etc.). This is achieved with the following standard code snippet:

PATH="/home/abilandz/bin:${PATH}"

With this syntax, directory with your personal executables /home/abilandz/bin is prepended to the current content of PATH, and therefore your executables will have a higher priority in the Bash search.

For the lower priority of your executables, use an alternative standard code snippet:

PATH="${PATH}:/home/abilandz/bin"

In this example, you have appended the directory with your executables to what is already set in PATH. This way you indicate that your own version of a command should be used only if no executable with the same name is found in the directories listed earlier in PATH. As always, if you want to make such definitions permanent in any new terminal you open, add the above redefinitions of PATH to the ~/.bashrc file. If you want the redefinition of PATH to be visible in all new processes you start from a terminal, additionally use the command export at the first redefinition of the PATH variable.
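The effect of prepending versus appending can be tried out safely with a throwaway directory. In this sketch, the temporary directory created with mktemp plays the role of /home/abilandz/bin, and the one-line replacement for date is a made-up example:

```shell
# Make a throwaway directory for our own executables:
MyBin="$(mktemp -d)"

# Create our own tiny executable named date in that directory:
printf '%s\n' '#!/bin/bash' 'echo "This is my own date"' > "${MyBin}/date"
chmod u+x "${MyBin}/date"

OldPath="${PATH}" # remember the original setting

PATH="${MyBin}:${OldPath}" # prepend: our directory is searched first
hash -r                    # forget the command locations Bash has remembered so far
Prepended="$(date)"
echo "With prepending: ${Prepended}"

PATH="${OldPath}:${MyBin}" # append: the system-wide directories are searched first
hash -r
Appended="$(date)"
echo "With appending: ${Appended}"

# Restore the original setting and clean up:
PATH="${OldPath}"
hash -r
rm -r "${MyBin}"
```

With the directory prepended, the made-up version of date is executed. With the directory appended, the standard system-wide date wins.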

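From the above explanation, it is clear that if you unset the PATH variable, all commands that correspond to executables in the file system (e.g. date or touch) will stop working when you type only their short names in the terminal, because Bash no longer knows where to search for the corresponding executables. Bash built-in commands (e.g. cd or pwd) are not affected, since they are not looked up via PATH.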

We finalize the explanation of PATH variable with the following concluding remarks:

Some frequently used Linux commands to work within the file system are:

As you can see from the above output of stat, the example file Lecture_2.md is characterized by three timestamps: Access, Modify and Change. These three timestamps are an important part of file metadata, which we cover next.

File metadata

File metadata is any file-related information besides its content. From the user’s perspective, the most important file metadata are timestamps, ownership and permissions.

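The meaning of the three timestamps is as follows:

  1. Access: the last time the content of the file was read;
  2. Modify: the last time the content of the file was changed;
  3. Change: the last time the metadata of the file (e.g. its ownership or permissions) was changed.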

These three timestamps are not overkill. In fact, they enable a lot of very powerful features when searching for specific files or directories in the file system. For instance, by using them, it is possible to list the names of all files modified within the last day, to delete all files which were not accessed for more than 1 year, etc.
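One possible way to run such timestamp-based searches is the command find with its tests -mtime and -atime. This is a minimal sketch which assumes GNU find and GNU touch; the directory and file names are made up for illustration:

```shell
# Prepare a throwaway directory with one fresh and one older file:
Dir="$(mktemp -d)"
touch "${Dir}/fresh.txt"
touch -d "3 days ago" "${Dir}/stale.txt" # set the timestamps 3 days into the past

# Files whose content was modified less than 1 day ago:
Recent="$(find "${Dir}" -type f -mtime -1)"
echo "Recently modified: ${Recent}"

# Files whose content was modified more than 1 day ago:
Old="$(find "${Dir}" -type f -mtime +1)"
echo "Modified earlier: ${Old}"

rm -r "${Dir}" # clean up
```

In the same spirit, the test -atime +365 selects the files which were not accessed for more than 365 days. Before combining such a test with a deleting action, it is safer to print the list of selected files first and inspect it.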

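Next, each file or directory in Linux has three distinct levels of ownership:

  1. user (u): the owner of the file, by default the account which created it;
  2. group (g): the group of user accounts associated with the file;
  3. others (o): all other users on the system.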

File ownership becomes extremely handy in combination with file permissions, because it then becomes very simple to set common access rights for any group of other users.

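Finally, each file in Linux has three distinct levels of permissions (or access rights):

  1. read (r): the content of the file can be read;
  2. write (w): the content of the file can be modified;
  3. execute (x): the file can be executed as a program.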

For instance, when you execute

ls -al someFile

you can get the following example output:

-rw-rw-rw- 1 abilandz alice 97805 Apr 28 12:23 someFile

It is very important to understand all entries in this output, and how to modify or set some of them. Reading from left to right:

The meaning of the remaining columns is trivial.
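If only some of these entries are needed in a script, parsing the output of ls can be avoided. The sketch below assumes the GNU version of stat with its -c (‘format’) option; the variable names are chosen for illustration:

```shell
# mktemp creates a new empty file which only its owner can read and write:
File="$(mktemp)"

Permissions="$(stat -c '%A' "${File}")" # permission pattern, as printed by ls -al
Owner="$(stat -c '%U' "${File}")"       # name of the user who owns the file
Group="$(stat -c '%G' "${File}")"       # name of the group of the file
Size="$(stat -c '%s' "${File}")"        # size in bytes

echo "${Permissions} ${Owner} ${Group} ${Size}"

rm "${File}" # clean up
```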

File permissions are changed with the Linux command chmod (‘change mode’). This is best illustrated with a few concrete examples:

chmod o+r someFile.txt

After the above command was executed, others (o) can (+) read (r) your file someFile.txt. Whatever was set for w and x flags for others, it remains intact. A slightly different notation:

chmod o=r someFile.txt

would ensure that for others, only r is set, while w and x flags are forced to -. In this example:

chmod go-w someFile.txt

the members of the file's group (g) and all others (o) cannot (-) modify or write (w) to your file someFile.txt. Therefore, after this simple command execution, only you can edit this file (provided that you, as the owner, have the w permission)!

With this syntax:

chmod u+x someFile.txt

the file someFile.txt is declared to be an executable, which you as a user (u) can (+) execute (x). The x flags for the group and for others remain as they were.

Remember that only the files which are executables are taken into account by Bash when searching through the content of directories in PATH variable. Therefore, when making your own Linux command, two formal aspects must be always met:

  1. the directory containing your executable must be included in the content of PATH variable;
  2. your executable must have x permission.

Next example:

chmod ugo+rwx someFile.txt

Now everybody (you as a user (u), group members (g) and others (o)), can read (r), modify or write (w) to, or execute your file (x). For directories, you can change permissions in one go for all files in all subdirectories, by specifying the flag -R (‘recursive’), i.e. by using schematically:

chmod -R some-options-to-change-permissions someDirectory

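Note that the x permission is meaningful also for directories, although with a different meaning: for a directory, x is the permission to enter it (e.g. with cd) and to access the files inside it. Keep this in mind when using the flag -R, because removing x recursively removes it also from the subdirectories themselves, which makes them inaccessible.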

Finally, we clarify that the setting for each permission can be represented alternatively by a numerical value. The rule is established with the following simple table:

  Permission     Value
  read (r)       4
  write (w)      2
  execute (x)    1
  none (-)       0

When these values are added together, the sum is used to set specific permissions.

For example, if you want to set only ‘read’ and ‘write’ permissions, you need to use a value 6, because from the above table, it follows immediately: 4 (‘read’) + 2 (‘write’) = 6. If you want to remove all of ‘read’, ‘write’ and ‘execute’ permissions, you need to specify 0.

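For convenience, all possibilities are documented in the table:

  Value   Pattern   Permissions
  0       ---       none
  1       --x       execute
  2       -w-       write
  3       -wx       write and execute
  4       r--       read
  5       r-x       read and execute
  6       rw-       read and write
  7       rwx       read, write and execute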

Example: Make a new file with default permissions, then remove all permissions, and set the permission pattern to -rwx--xr--, by using both syntaxes described above. With the first syntax, we would have:

touch file.log # make a new file
# the default permission pattern depends on umask, e.g.: -rw-rw-r--
chmod ugo-rwx file.log # strip off all permissions
# pattern is now: ----------
chmod u+rwx,g+x,o+r file.log # set new requested permissions
# the final pattern is: -rwx--xr--

With the alternative syntax, we proceed as follows:

touch file.log # make a new file
# the default permission pattern depends on umask, e.g.: -rw-rw-r--
chmod 000 file.log # strip off all permissions
# pattern is now: ----------
chmod 714 file.log
# the final pattern is: -rwx--xr--

In practice, it is not necessary to remove the old permissions first and only then set the new ones. It was done that way here only for the sake of this exercise; the old permissions can be directly overwritten.
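The correspondence between the numerical and the symbolic notation can be checked programmatically. This sketch assumes GNU stat, where the format %a prints the numerical value and %A the symbolic pattern; the variable names are chosen for illustration:

```shell
File="$(mktemp)" # a throwaway file

# Numerical notation: 7 = rwx for user, 1 = --x for group, 4 = r-- for others
chmod 714 "${File}"
OctalFirst="$(stat -c '%a' "${File}")"
SymbolicFirst="$(stat -c '%A' "${File}")"
echo "${OctalFirst} ${SymbolicFirst}" # 714 -rwx--xr--

# Symbolic notation with =, which directly overwrites the old permissions:
chmod u=rw,g=r,o= "${File}"
OctalSecond="$(stat -c '%a' "${File}")"
SymbolicSecond="$(stat -c '%A' "${File}")"
echo "${OctalSecond} ${SymbolicSecond}" # 640 -rw-r-----

rm "${File}" # clean up
```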

Example: Does the command cp also copy the permissions of the original file to the new file?

# make a new file with default permissions:
$ touch file1.txt 

# check the permissions:
$ ls -la file1.txt
-rw-rw-r-- 1 abilandz abilandz 0 Mai 14 14:19 file1.txt

# change the permissions:
$ chmod o+w file1.txt

# copy the original file into new file:
$ cp file1.txt file2.txt

# check the permissions of both files:
$ ls -al file*
-rw-rw-rw- 1 abilandz abilandz 0 Mai 14 14:19 file1.txt
-rw-rw-r-- 1 abilandz abilandz 0 Mai 14 14:22 file2.txt

# copy the original file into new file using option '-a':
$ cp -a file1.txt file3.txt

# check the permissions of all files:
$ ls -al file*
-rw-rw-rw- 1 abilandz abilandz 0 Mai 14 14:19 file1.txt
-rw-rw-r-- 1 abilandz abilandz 0 Mai 14 14:22 file2.txt
-rw-rw-rw- 1 abilandz abilandz 0 Mai 14 14:19 file3.txt

As we can see above, when cp was used without any options, the extra w permission for others was not carried over to the new file file2.txt: cp does not preserve the permissions exactly, but applies the current umask setting to the new file. The permissions were copied over exactly to the new file file3.txt only when cp -a was used (the flag -a stands for ‘archive’, which, among other things, preserves all file attributes). The same thing happens when on a shared computer we copy a file from the home directory of another user into our home directory. As a side remark, we mention that the default permissions for files and directories can be modified with the shell’s built-in command umask.
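The effect of umask on newly created files can be seen in a small experiment. This is only a sketch, assuming GNU stat; the two umask values 022 and 077 are arbitrary examples, and each of them is set inside $( ... ), which runs in a subshell, so the setting of the current terminal session is left untouched:

```shell
Dir="$(mktemp -d)" # a throwaway directory

# With umask 022, the w permission is masked out for group and others:
ModeA="$( umask 022; touch "${Dir}/a.txt"; stat -c '%A' "${Dir}/a.txt" )"

# With umask 077, all permissions are masked out for group and others:
ModeB="$( umask 077; touch "${Dir}/b.txt"; stat -c '%A' "${Dir}/b.txt" )"

echo "umask 022 gives: ${ModeA}" # -rw-r--r--
echo "umask 077 gives: ${ModeB}" # -rw-------

rm -r "${Dir}" # clean up
```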

Before we start developing the new commands from scratch in Linux, we need to introduce one very important and fairly generic concept: positional parameters (or script arguments).

2. Positional parameters

In this section we discuss how some arguments can be supplied to your script at execution. This clearly will allow you much more freedom and power in the code development, because nothing needs to be hardcoded in the script’s body. The very same mechanism can be used also in the implementation of Bash functions, as we will see later. We introduce now the so-called positional parameters (or script arguments).

Example: We want to develop a script named favorite.sh which takes two arguments: the first one is the name of the collider, the second the name of the experiment. This script then just prints something like:

My favorite collider is <some-collider>
My favorite experiment at <some-collider> is <some-experiment>

The solution goes as follows — edit the file favorite.sh with the following content:

#!/bin/bash

echo "My favorite collider is ${1}" 
echo "My favorite experiment at ${1} is ${2}"

return 0

If you now execute this script as:

source favorite.sh LHC ALICE

the printout looks as follows:

My favorite collider is LHC
My favorite experiment at LHC is ALICE

So how does this work? It is very simple and straightforward, there is no black magic happening here! Whatever you have typed first after source favorite.sh, and before the next whitespace (space or tab) is encountered in the command input, is declared as the 1st positional parameter (or the 1st script argument). The value of the 1st positional parameter is stored in the internal variable ${1} (‘LHC’ in the above example). Whatever you have typed next, and before the next whitespace is encountered, is declared as the 2nd positional parameter, and its value is stored in the internal variable ${2} (‘ALICE’ in the above example). And so on: in this way you can pass to your script as many arguments as you wish!

Once you fetch programmatically in the body of your script the supplied arguments via variables ${1}, ${2}, etc., you can do all sorts of manipulations on them, which can completely modify the behavior of your script.

A few additional remarks on positional parameters:

In combination with looping, you can programmatically parse over all supplied arguments to your script (i.e. there is no need to hardwire in the script that you expect exactly a certain number of arguments, etc.).
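Positional parameters can also be experimented with directly in the terminal, without writing a script first. The sketch below relies on the built-in commands set -- (which sets the positional parameters by hand) and shift (which drops the first one), and on the ${4:-none} syntax for a default value. The argument values and variable names are made up for illustration:

```shell
# Set the positional parameters by hand, as if they were passed to a script:
set -- LHC ALICE CMS

Total="$#"           # number of arguments: 3
Collider="${1}"      # the 1st argument: LHC
Missing="${4:-none}" # there is no 4th argument, so the default 'none' is used

shift                # drop the 1st argument, the remaining ones move down by one
AfterShift="${1}"    # what used to be the 2nd argument: ALICE
Remaining="$#"       # number of arguments left: 2

echo "${Total} ${Collider} ${Missing} ${AfterShift} ${Remaining}"
```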

Example: Below is the script arguments.sh, which uses the for loop in Bash (loops are covered in detail later!), and just counts and prints all arguments supplied to the script:

#!/bin/bash

echo "Total number of arguments is: $#"
echo "The second argument is: ${2}"
echo "The very last argument is: ${!#}"

for Arg in "$@"; do
 echo "${Arg}"
done

return 0

If you execute this script for instance as:

source arguments.sh a bbb cccc

you will get as a printout:

Total number of arguments is: 3
The second argument is: bbb
The very last argument is: cccc
a
bbb
cccc

By using this functionality, you can instruct a script to behave differently if certain options or arguments are supplied to it. Since this is clearly a frequently used feature, a specialized Bash built-in command exists to ease the parsing and interpretation of positional parameters: see the documentation of the more advanced command getopts (‘get options’), but do not confuse it with the similarly named Linux utility getopt, which has flaws in its design.
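A minimal sketch of option parsing with getopts could look as follows. The function name Greet and its two options, -n <name> and -l, are invented for this illustration; the parsing is wrapped in a function (functions are covered in the next section) only to keep the sketch self-contained, and the same loop works in the body of a script:

```shell
function Greet
{
 # Usage: Greet [-n <name>] [-l]

 local Name="World" # default value, used if -n is not supplied
 local Loud=0       # set to 1 if the flag -l is supplied
 local Opt
 local OPTIND=1     # start the parsing from the 1st argument on each call

 # "n:l" means: option -n takes a value (stored in OPTARG), option -l does not
 while getopts "n:l" Opt; do
  case "${Opt}" in
   n) Name="${OPTARG}" ;;
   l) Loud=1 ;;
   *) echo "Usage: Greet [-n <name>] [-l]" >&2; return 1 ;;
  esac
 done
 shift $((OPTIND - 1)) # remove the options which were already parsed

 if [[ ${Loud} -eq 1 ]]; then
  echo "HELLO ${Name}!"
 else
  echo "Hello ${Name}"
 fi

 return 0
}

Greet -n Alice  # prints: Hello Alice
Greet -l -n Bob # prints: HELLO Bob!
```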

3. Your first Linux/Bash command: Bash functions

As the very first respectable version of your own command in Linux/Bash, which can take and interpret arguments, provide an exit status, have its own environment, etc., we can consider a Bash function.

Functions in Bash are very similar to scripts; however, the details of their implementation differ. In addition, functions are safer to use than scripts, since they have a well-defined notion of a local environment. This basically means that if you have a variable with the same name in your current terminal session, as well as in the script or in the function you are executing, it is much easier to prevent the clash of these variables if you use functions. Moreover, the usage of functions to a great extent resembles the usage of Linux commands, and it is in this sense that your first function developed in Bash can also be treated as your first Linux command!

An example implementation of Bash function could look like:

#!/bin/bash

function Hello
{
 # This function prints the welcome message 
 # Usage: Hello <some-name>

 echo "Hello"
 local Name="${1}"
 echo "Your name is: ${Name}"

 return 0

}

Save the above code snippet in the file functions.sh. Then, in order to execute your function Hello, first you have to source that file:

source functions.sh

From this point onward, the definitions of all functions in the file functions.sh are loaded in the computer’s memory, and can be used in the current terminal session like any other Linux or Bash built-in command. To check this, try to execute:

Hello Alice

The output is:

Hello
Your name is: Alice
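The role of the keyword local, used in the function Hello for the variable Name, can be seen in the following sketch. The two function names WithLocal and WithoutLocal are made up for this illustration:

```shell
Name="terminal value" # a variable defined in the current terminal session

function WithLocal
{
 local Name="function value" # visible only inside this function
 return 0
}

function WithoutLocal
{
 Name="function value" # overwrites the variable of the terminal session
 return 0
}

WithLocal
AfterWithLocal="${Name}"
echo "After WithLocal: ${AfterWithLocal}"       # terminal value

WithoutLocal
AfterWithoutLocal="${Name}"
echo "After WithoutLocal: ${AfterWithoutLocal}" # function value
```

Only the variable declared with local is protected from the clash with the variable of the same name in the terminal session.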

When compared to the script implementation, there are a few differences:

The rest is the same as for the scripts:

Finally, we remark that functions are superior to aliases: anything that can be done with an alias can be done better with a function. For instance, the classical alias definition:

alias ll='ls -alF'

can be reimplemented as a Bash function in the following way:

function ll 
{ 
 ls -alF "$@"; 
}

Note that only the function implementation can easily be generalized: within the function body we can programmatically manipulate the arguments and, for instance, use different formatting options for the printout depending upon which directory we are in, etc.

4. Command precedence

We have seen that your very first input in the terminal, before the first whitespace is encountered, will be interpreted by Bash as the command name, where the command name can stand for an alias, a Bash built-in command (e.g. echo), a Linux command (e.g. date), a Bash function (e.g. Hello from the previous example), etc. But what happens if we have, for instance, an alias and a Linux command with the same name, like in this example:

alias date='echo "Hi"'

If after this definition we type in the terminal date, we get:

$ date
Hi

What now? Have we just accidentally overwritten and permanently lost the command date? Not quite. What happened here is that the alias got precedence over the Linux command with the same name. Both the alias date and the command date now exist simultaneously on your computer.

The command precedence rules in Bash are well defined and strictly enforced with the following ordering:

  1. aliases
  2. Bash keywords (if, for, etc.)
  3. Bash functions
  4. Bash built-in commands (cd, type, etc.)
  5. scripts with execute permission and Linux commands (at this level, the precedence is determined based on the ordering in PATH variable, as we already discussed)

Given the above ordering of command precedence, some care is definitely needed when introducing new aliases or developing new functions in Bash, to avoid the name clashes with the existing Linux commands.

Finer control over command precedence can be achieved in Bash with the built-in commands builtin, command, and enable (check their ‘help’ pages in Bash). For instance, we can force the Bash built-in command echo to be executed, even if an alias or a function named echo exists, with the following syntax:

builtin echo someText
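The same idea applies to a function which shadows a Linux command. In this sketch, a made-up function named date takes precedence over the executable found via PATH, and the built-in command named command is used to bypass it; type -t prints only a single word describing the kind of the command:

```shell
# A made-up function with the same name as the Linux command date:
function date
{
 echo "Hi from the function"
}

KindBefore="$(type -t date)"     # function
FromFunction="$(date)"           # the function has precedence
FromExecutable="$(command date)" # skips the function, runs the executable from PATH

unset -f date                    # remove the function again
KindAfter="$(type -t date)"      # file

echo "${KindBefore}: ${FromFunction}"
echo "${KindAfter}: ${FromExecutable}"
```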

Reminder: If you have accidentally shadowed a Linux command with some alias definition (like in the above example for date), use the command unalias to remove the alias permanently:

unalias someAliasName

or bypass the alias temporarily, for a single execution, by prefixing its name with a backslash:

\someAliasName

If you are not sure to which one of the five cases above the command you intend to use corresponds, use the Bash built-in command type:

$ type date
date is /bin/date

The above line tells us that date is a Linux command whose executable is /bin/date.

Two other examples in this context:

$ type echo
echo is a shell builtin
$ type ll
ll is aliased to `ls -alF'

For some commands, multiple independent implementations and executables can exist simultaneously on your computer. You can retrieve them all with type -a, for instance:

$ type -a printf
printf is a shell builtin
printf is /usr/bin/printf

For the Bash functions, the command type also prints the source code of that function. For instance, for the function Hello discussed previously you would get:

$ type Hello
Hello is a function
Hello ()
{
    echo "Hello";
    local Name="${1}";
    echo "Your name is: ${Name}";
    return 0
}

This is quite handy, because if you have forgotten the details of the implementation of this particular function, you do not need to dig into the file functions.sh, where a lot of your additional functions might have been implemented in the meantime.

Note also that this way you can see immediately the implementation of some Bash functions which were not developed by you (therefore, you have no idea where in the file system is the file with their source code), but are nevertheless available in your terminal session:

$ type quote
quote is a function
quote ()
{
    local quoted=${1//\'/\'\\\'\'};
    printf "'%s'" "$quoted"
}

Finally, it can happen that accidentally you delete the file functions.sh. If this file was sourced before you deleted it accidentally, you can still retrieve the implementations of your functions from the computer’s memory with type, and then just redirect the output to some file.
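A possible way to carry out such a rescue is sketched below. Since the output of type starts with the line "Hello is a function", which is not valid Bash code, this sketch uses instead the built-in command declare -f, which prints only the definition of the function; the backup file is a temporary file created just for this illustration:

```shell
# The function as it was defined in the (now lost) file functions.sh:
function Hello
{
 echo "Hello"
 local Name="${1}"
 echo "Your name is: ${Name}"
 return 0
}

Backup="$(mktemp)"             # a new file to hold the rescued source code
declare -f Hello > "${Backup}" # print the definition kept in memory into that file

unset -f Hello                 # pretend that the function definition is gone
source "${Backup}"             # load the definition back from the rescued file

Restored="$(Hello Alice)"
echo "${Restored}"

rm "${Backup}" # clean up
```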