Everything in Linux is considered to be either a file or a process:

  • A process is an executing program identified by a unique process identifier, called a PID. Processes may be short in duration, such as a process that prints a file to the screen, or they may run indefinitely, such as a monitor program.
  • A file is a collection of data, with a location in the file system called a path. Paths will typically be a series of words (directory names) separated by forward slashes, /. Files are generally created by users via text editors, compilers, or other means.
  • A directory is a special type of file. Linux uses a directory to hold information about other files. You can think of a directory as a container that holds other files or directories; it is equivalent to a folder in Windows or macOS.
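The three ideas above can be seen from inside a running program. The following is a minimal Python sketch, not part of the original material; the file name notes.txt is hypothetical, and the scratch directory is created only so that the example cleans up after itself.

```python
import os
import tempfile

# Every running program is a process with its own PID.
pid = os.getpid()
print(f"This Python interpreter is process {pid}")

# Work in a temporary directory that is deleted when the block ends.
with tempfile.TemporaryDirectory() as scratch:
    # "notes.txt" is a hypothetical file name chosen for illustration.
    path = os.path.join(scratch, "notes.txt")
    with open(path, "w") as f:
        f.write("hello\n")

    # A path is a series of directory names separated by forward slashes.
    parts = path.strip("/").split("/")
    print(f"{path} has components {parts}")

    # A directory is a file that holds other files.
    is_dir = os.path.isdir(scratch)
    is_file = os.path.isfile(path)
    listing = os.listdir(scratch)
    print(f"directory? {is_dir}, regular file? {is_file}, contents: {listing}")
```

The PID and the scratch path will differ on every run, since the kernel assigns the PID and the tempfile module picks a fresh directory name.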

A file is typically stored on physical storage media such as a disk (hard drive, flash disk, etc.). Every file must have a name because the operating system identifies files by their name. File names may contain almost any character (only the forward slash, /, and the null byte are not allowed), although some special characters (such as spaces, quotes, and parentheses) can make it difficult to access the file from the command line, so you should avoid them in filenames. On most common Linux file systems, file names can be up to 255 bytes long (255 characters for plain ASCII names), which leaves plenty of room for descriptive names.
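To illustrate why special characters are awkward, here is a small Python sketch under stated assumptions: both file names are hypothetical, and shlex.quote is used only to show what would have to be typed at a shell prompt. The name length limit is queried from the file system rather than hard-coded.

```python
import os
import shlex
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    # Ask the file system for its maximum file name length
    # (commonly 255 on Linux file systems).
    name_max = os.pathconf(scratch, "PC_NAME_MAX")
    print(f"Longest allowed file name here: {name_max}")

    # Two hypothetical names: one with spaces and parentheses, one without.
    awkward = "my results (final).txt"
    friendly = "my_results_final.txt"
    for name in (awkward, friendly):
        open(os.path.join(scratch, name), "w").close()

    # Both names are legal, so both files exist.
    created = set(os.listdir(scratch))

# shlex.quote returns the form a POSIX shell needs to see.
quoted_awkward = shlex.quote(awkward)
quoted_friendly = shlex.quote(friendly)
print(f"awkward name must be typed as:  {quoted_awkward}")
print(f"friendly name can be typed as:  {quoted_friendly}")
```

The awkward name needs quoting before a shell will treat it as a single file name, while the name built from letters and underscores can be typed as is.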

Files can hold any sequence of bytes; it is up to the user to choose the appropriate application to correctly interpret the file contents. Files can be human-readable text organized line by line, a structured sequence only readable by a specific application, or a machine-readable byte sequence. Many programs interpret the contents of a file as having some special structure, such as a PDF or PostScript file. In scientific computing, binary files are often used for efficient storage and data access. Other examples include scientific data formats like NetCDF or HDF, which have specific formats and provide application programming interfaces (APIs) for reading and writing.
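The point that a file is only bytes until some program interprets it can be sketched in a few lines of Python. This is an illustrative example, not part of the original material: the file name sample.bin is hypothetical, and the four bytes are read back once and then interpreted three different ways.

```python
import os
import struct
import tempfile

data = b"ABCD"  # four arbitrary bytes

with tempfile.TemporaryDirectory() as scratch:
    # "sample.bin" is a hypothetical file name.
    path = os.path.join(scratch, "sample.bin")
    with open(path, "wb") as f:
        f.write(data)
    # Binary mode returns the bytes exactly as stored.
    with open(path, "rb") as f:
        raw = f.read()

# Interpretation 1: ASCII text.
as_text = raw.decode("ascii")
# Interpretation 2: one little-endian unsigned 32-bit integer.
as_uint32 = struct.unpack("<I", raw)[0]
# Interpretation 3: four separate byte values.
as_bytes = list(raw)

print(as_text, as_uint32, as_bytes)
```

None of the three readings is more correct than the others; the file itself carries no record of which one was intended, which is why self-describing formats such as NetCDF and HDF store that information alongside the data.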

The Linux kernel is responsible for organizing processes and interacting with files; it allocates time and memory to each process and handles the file system and communications in response to system calls. The Linux system uses files to represent everything in the system: devices, internals to the kernel, configurations, etc.
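As a rough illustration of devices and kernel internals appearing as files, the Python sketch below opens two of them with the ordinary file API. It assumes a typical Linux system where /dev/null exists and the /proc file system is mounted; the /proc part is skipped if that path is missing.

```python
import os
import stat

# /dev/null is a device, yet it is opened and written like any file.
with open("/dev/null", "w") as sink:
    written = sink.write("discarded\n")

# The file type recorded by the kernel marks it as a character device.
mode = os.stat("/dev/null").st_mode
is_char_device = stat.S_ISCHR(mode)
print(f"wrote {written} characters to a character device: {is_char_device}")

# /proc/self/status is the kernel's description of the current process,
# exposed as a text file (Linux-specific, so check that it exists first).
pid_from_proc = None
if os.path.exists("/proc/self/status"):
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("Pid:"):
                pid_from_proc = int(line.split()[1])
                break
print(f"PID reported by /proc: {pid_from_proc}")
```

Each open, read, and write here becomes a system call that the kernel services, whether the target is a disk file, a device, or kernel state.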

 
© Cornell University | Center for Advanced Computing