Linux & Storage: Filesystems and Formatting

Intro

Some time back, I set up Immich, a self-hosted photo and video manager similar to Google Photos, on my Raspberry Pi home server. The process was relatively easy, but was not without its complications, particularly due to the storage devices I was using on my system.

Immich locally stores a variety of data related to the app, which can take up a good amount of space. My Pi boots from a 32GB micro SD card; while I don’t think I’d ever hit capacity on there through Immich data alone, who knows what other applications I’d run/install on there in the future, so I wanted to avoid using it for all that.

I then looked at the 2TB portable SSD I had all my videos and photos on. When I pointed Immich’s Docker container at it, I was greeted by errors saying that chown, which changes file ownership, was not a permitted operation. I later found out why: the SSD uses exFAT, which doesn’t store Linux ownership or permissions. I needed a drive with an ext4 filesystem, which Linux natively supports.

The revelation led me to further research filesystem formats and possible solutions to these problems of mine.

In this post, I cover the differences between common filesystem formats for Windows and Linux (NTFS, exFAT, ext4), how to work with storage devices on Linux (including understanding mounting and file access), and how to format a storage device to ext4 (to make the most out of Linux’s features).

Hopefully my explanations are simple enough for a non-Linux computer user to understand, as it feels like basic explanations of this kind of stuff aren’t really common around the web.

What is a filesystem?

A filesystem is the underlying structure for how files are stored, organized, and managed on storage devices, from SSDs to hard drives to USB drives. Essentially, a drive’s filesystem serves as the “library” of its files.

Different filesystems handle organization, access, and storage management very differently. As an example, on systems with multiple storage drives attached, the folder hierarchy seen from the top level looks different on Windows vs. Linux. Additionally, file permissions work differently across filesystems: some offer granular control over access to individual files, while others try to keep it simple.

NTFS vs. exFAT vs. ext4

These are three of the most common filesystems found on storage devices used with Windows and Linux. Of course, there are many more out there, including FAT32 (a much older Windows filesystem dating back to Windows 95) and APFS (the default filesystem on modern macOS), but as I’m not a macOS user, nor do I use filesystems that old, those are outside the scope of this post.

NTFS (New Technology File System)

NTFS was developed by Microsoft in the 1990s. Today, this filesystem is most commonly used on Windows internal drives; if you are on a Windows laptop or desktop, your main C drive will almost certainly use NTFS.

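Drives using NTFS feature journaling (which improves recoverability after crashes), file permissions, and encryption. Support on other systems is more limited: macOS can read NTFS drives but cannot write to them out of the box, and Linux reads and writes them through the ntfs-3g driver or, on newer kernels, the built-in ntfs3 driver. NTFS drives perform best on Windows.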

exFAT (Extended File Allocation Table)

exFAT was also developed by Microsoft; it is a more recent technology, being introduced in 2006. It is seen mostly on portable drives and SD cards.

Of note, both Windows and macOS natively support it, so it works well for easy transfers between the two systems. Linux can read and write exFAT drives too (recent kernels include a driver for it), but exFAT has no concept of Unix ownership or permissions, so operations like chown fail on it. That makes it fine for plain media storage, but not for applications that need to manage file ownership. It also does not have journaling, which poses a corruption risk.

ext4 (Fourth Extended File System)

ext4 is the focus of this post, and is the default filesystem for most Linux distros. Windows cannot read it natively; accessing an ext4 drive there requires WSL 2 or third-party tools. ext4 supports Linux’s implementation of permissions (discussed later) and other filesystem features (such as symbolic links).

It is also blazingly fast. In another post, I mentioned how surprised I was at how fast Linux handles filesystem and terminal operations compared to Windows; it turns out ext4 is just that much quicker than NTFS.[1]

Understanding Filesystems on Linux (vs. Windows)

Having used Windows most of my life, I noticed many differences in how Linux handles filesystems compared to Windows. That said, they are not very extreme, and since I was already familiar with file manipulation in Windows, they were pretty easy to get used to.

The Hierarchy

On Windows, each storage device has its own root directory (the highest folder accessible within the drive), represented by a drive letter. The most common one of course is C, which represents the boot drive. Its root includes the Users, Program Files, and Windows folders. Any other drives are assigned other letters, starting from D onwards. The This PC “folder” in File Explorer is not actually its own folder in the filesystem; it is just a place to easily access devices and specific folders.

A very basic doodle showing the file hierarchy in Windows. Each storage device has its own independent root folder, and thus 'section' in the Windows system.

On Linux, everything falls under a single root directory (/), which lives on the boot drive. Every other drive’s “root” sits somewhere under this directory, but its contents only take up space on the drive they are stored on, not on the boot drive. Also under the root directory are essential binaries in bin, system config files and services in etc, and the user folders in home, among other system files.

This model was a little confusing to me at first, but it can be thought of like this: the folders which represent an external storage device serve as “portals” to the device they link to.

A very basic doodle showing the file hierarchy in Linux. Everything lies under '/' in the boot drive, but within the '/media/Simon' folder are two external drives, which lead to their storage.

Mounting

For Windows, “mounting” is not a commonly heard term, as Windows automatically assigns new drives to a drive letter when attached, so they can be immediately accessed and used.

This is not always the case with Linux, though. Some desktop environments (like GNOME) do mount devices automatically like Windows does, typically under a subdirectory of /media/, but others do not. For the latter, mounting must be handled in the terminal by specifying the newly-attached device’s name (usually of the form /dev/sdXN, such as /dev/sda1) and a mount point (an existing directory under the root where the drive’s contents will appear; /mnt/ is the conventional place for manual mounts):

sudo mount /dev/sda1 /mnt/ext

Unmounting, the equivalent of safely ejecting a drive in Windows, is handled in a similar manner (note that the command is spelled umount, not unmount):

sudo umount /mnt/ext
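Before mounting or unmounting, it can help to check whether a directory currently has a drive attached to it. The following is a minimal sketch, assuming the mountpoint utility from util-linux is available (it is on most distros); the is_mounted function name is my own invention, and /mnt/ext is just the example mount point from above.

```shell
# Report whether a directory currently has a filesystem mounted on it.
# `mountpoint -q` prints nothing and exits with status 0 only if the
# given path is a mount point.
is_mounted() {
    if mountpoint -q "$1"; then
        echo "$1 is a mount point"
    else
        echo "$1 is not a mount point"
        return 1
    fi
}

# The root directory is normally a mount point on a running system.
is_mounted /

# Example mount point from this post; "|| true" keeps a script going
# even when nothing is mounted there.
is_mounted /mnt/ext || true
```

No sudo is needed here, since the check only reads information and changes nothing.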

Permissions

Windows uses an access control list (ACL) model for permissions, allowing for very granular control; administrators can set individual permissions for users and groups, including “modify”, “read & execute”, and “list folder contents.” They are handled in the File Properties > Security GUI. Windows also has a bit of an edge in this area because of inheritance, meaning permissions can be inherited from parent directories.

Linux’s permission system is a bit simpler, having only read (r), write (w) and execute (x) permissions. Handled through the terminal, permissions are displayed as nine-character strings (in the output of ls -l, they follow a leading file-type character), structured as follows:

  • The first three characters represent owner permissions
  • The next three represent permissions of all users who belong to the group associated with the file/directory
  • The last three represent permissions all other users have

Take rwxr-xr-- for example. The owner can read, write, and execute the file; group users can read and execute, but not write; all other users can only read.
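To see this on a real file, here is a small sketch that applies exactly those permissions and prints them back. It assumes GNU coreutils (the stat -c syntax) and a temporary directory on a Linux-native filesystem; the file name example.sh is arbitrary. Each group of three can also be written as one octal digit (r=4, w=2, x=1), so rwxr-xr-- becomes 754.

```shell
# Create a scratch file in a temporary directory.
tmpdir=$(mktemp -d)
file="$tmpdir/example.sh"
touch "$file"

# 754 = rwx (4+2+1) for the owner, r-x (4+1) for the group, r-- (4) for others.
chmod 754 "$file"

# Print the symbolic and octal forms of the file's permissions.
# The leading "-" in the symbolic form means "regular file".
perms=$(stat -c '%A %a' "$file")
echo "$perms"    # prints: -rwxr-xr-- 754

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

Running the same chmod on a file stored on an exFAT drive would not stick, which is the limitation described in the Intro.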

How to Format a Drive for Linux Use

As mentioned in the Intro, I wanted to use an ext4 filesystem to store my media manager’s data, while keeping it separate from the Linux boot drive. I ended up finding another micro SD card that was formatted as exFAT, so I had to reformat it to ext4. This section describes how I did that; the process was quick and not hard, but requires terminal usage. This assumes your storage device is already mounted to the system.

First, identify the device you want to format. The df -T command lists every mounted filesystem along with its type.[2]

sbrug@windragon:~ $ df -T
Filesystem     Type      1K-blocks     Used  Available Use% Mounted on
[truncated]
/dev/sda1      exfat    1953450496 57710592 1895739904   3% /media/sbrug/T71
/dev/sdb2      ext4       30073720        8   28520716   1% /media/sbrug/31GBSD

My /dev/sdb2 device is already formatted as ext4 here, but for the sake of the guide, let’s say it’s exFAT instead and you want to reformat it to ext4.
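On a system with many mounts, the df -T listing can get long. As a rough sketch, the output can be filtered by filesystem type with awk; the list_by_fstype function below is a hypothetical helper of my own, and it assumes mount points without spaces in their names. The sample rows are copied from the output above so the snippet runs without the drives attached.

```shell
# Print the device and mount point of every filesystem of a given type,
# reading `df -T` style output on standard input.
# Column 1 is the device, column 2 the type, column 7 the mount point.
list_by_fstype() {
    awk -v fstype="$1" 'NR > 1 && $2 == fstype { print $1, $7 }'
}

# Sample rows taken from the df -T output shown above.
sample='Filesystem     Type      1K-blocks     Used  Available Use% Mounted on
/dev/sda1      exfat    1953450496 57710592 1895739904   3% /media/sbrug/T71
/dev/sdb2      ext4       30073720        8   28520716   1% /media/sbrug/31GBSD'

result=$(printf '%s\n' "$sample" | list_by_fstype exfat)
echo "$result"    # prints: /dev/sda1 /media/sbrug/T71
```

Against a live system, the same function would be used as df -T | list_by_fstype exfat.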

Next, the drive must be unmounted:

sudo umount /dev/sdb2

Formatting the device can be done in one command. This must be prefaced with the obvious warning:

THIS WILL WIPE EVERYTHING ON THE DRIVE. MAKE SURE ALL FILES ARE BACKED UP ELSEWHERE BEFORE DOING THIS.

sudo mkfs.ext4 /dev/sdb2

If the partition already contains a filesystem, mkfs.ext4 will warn you about it and ask for confirmation before overwriting. Double-check that the device name is the one you intend to wipe, then enter y to proceed.

Finally, re-mount the newly formatted drive. The mount point (/media/user/LinuxSD here is just an example) must be an existing directory:

sudo mount /dev/sdb2 /media/user/LinuxSD

And the process is done! You can verify that the reformat was successful by re-running df -T.
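If you would rather rehearse the formatting step before touching a real drive, one option is to format an ordinary file instead. This is only a sketch under a few assumptions: the e2fsprogs and util-linux packages are installed (they provide mkfs.ext4 and blkid), and the temporary file stands in for a device like /dev/sdb2. No sudo is needed because nothing outside the file is modified.

```shell
# mkfs.ext4 and blkid often live in /sbin, which may not be on a
# regular user's PATH.
PATH="$PATH:/sbin:/usr/sbin"

# Create an empty 64 MiB file to stand in for a drive.
img=$(mktemp)
truncate -s 64M "$img"

# Format the file as ext4.
# -F allows the target to be a regular file instead of a block device,
# -q keeps the output quiet.
mkfs.ext4 -q -F "$img"

# Probe the file directly (-p) and print only the detected filesystem type.
fstype=$(blkid -p -o value -s TYPE "$img")
echo "$fstype"

# Remove the practice image.
rm -f "$img"
```

On a real device, leave out -F so that mkfs.ext4 keeps its safety prompt.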


  1. https://superuser.com/questions/256368/why-is-an-ext4-disk-check-so-much-faster-than-ntfs

  2. Other tools like fastfetch, which can be installed through a package manager such as apt-get, can display this information in a more streamlined manner.