Multi-grain Timestamps

Multi-grain timestamps are the ability to give inodes higher-resolution timestamps without hurting performance. Low-res timestamps normally have a granularity on the order of milliseconds.

The timestamps in question are:

ctime:

The inode change time. This is stamped with the current time whenever the inode’s metadata is changed. Note that this value is not settable from userland.

mtime:

The inode modification time. This is stamped with the current time any time a file’s contents change.

atime:

The inode access time. This is stamped whenever an inode’s contents are read. Widely considered to be a terrible mistake. Usually avoided with options like noatime or relatime.
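As a small illustration, the three timestamps can be read from userland with Python's os.stat. This is only a sketch: the file name example.txt and the helper inode_times_ns are made up for the example, and it assumes a Linux-like system where st_ctime is the inode change time. It also shows that atime and mtime can be set with os.utime, while ctime cannot and is simply bumped to the current time by that call.

```python
import os
import tempfile

def inode_times_ns(path):
    """Return (atime, mtime, ctime) of path in nanoseconds since the epoch."""
    st = os.stat(path)
    return st.st_atime_ns, st.st_mtime_ns, st.st_ctime_ns

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("hello")

    # atime and mtime are settable from userland; pick an arbitrary old date.
    OLD_NS = 1_000_000_000 * 10**9  # 2001-09-09 in nanoseconds
    os.utime(path, ns=(OLD_NS, OLD_NS))

    atime, mtime, ctime = inode_times_ns(path)
    print("atime:", atime)
    print("mtime:", mtime)
    print("ctime:", ctime)  # not settable: reflects the utime() call itself
```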

The reason to care about more precise timestamps is a system like NFS, where the client compares the local and remote inode mtime to decide whether its cache needs to be cleared.
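To make the problem concrete, here is a toy sketch of that kind of cache check. Everything in it is an assumption for illustration: the 4 ms tick, the helper names coarse and cache_is_stale, and the timestamps themselves. It is not how NFS is implemented, only the comparison idea.

```python
TICK_NS = 4_000_000  # assumed low-res granularity: 4 ms

def coarse(now_ns):
    """Low-res clock: the time of the most recent tick."""
    return now_ns - now_ns % TICK_NS

def cache_is_stale(cached_mtime_ns, server_mtime_ns):
    """The cache is considered stale when the mtime differs."""
    return server_mtime_ns != cached_mtime_ns

# Two writes that land inside the same 4 ms tick.
first_write_ns = 10_000_123
second_write_ns = 10_900_456

# Low-res: both writes get the same mtime, so the second one is missed.
missed = not cache_is_stale(coarse(first_write_ns), coarse(second_write_ns))

# High-res: the mtimes differ, so the change is detected.
detected = cache_is_stale(first_write_ns, second_write_ns)

print("low-res misses the change:", missed)
print("high-res detects the change:", detected)
```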

The solution is to use high-res timestamps only when someone is looking. When someone queries the mtime, an unused bit in the timestamp is set to show that it has been looked at. The next time the inode's mtime changes, it is saved in high-res.
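The idea can be sketched as a toy model. The names Inode, QUERIED, write and query_mtime, the choice of bit 31, and the 1 ms tick are all assumptions made for the example, not the kernel's actual code. The sketch relies on the fact that a nanoseconds field only needs values below 1,000,000,000, which leaves the top bits of a 32-bit field free for a flag.

```python
NSEC_PER_SEC = 1_000_000_000
QUERIED = 1 << 31      # nanoseconds fit in 30 bits, so bit 31 is unused
TICK_NS = 1_000_000    # assumed low-res granularity: 1 ms

class Inode:
    def __init__(self):
        self.sec = 0
        self.nsec = 0  # low bits: nanoseconds; bit 31: "queried" flag

def query_mtime(inode):
    """Return the mtime in ns and remember that someone looked."""
    inode.nsec |= QUERIED
    return inode.sec * NSEC_PER_SEC + (inode.nsec & ~QUERIED)

def write(inode, now_ns):
    """Stamp the mtime: high-res only if it was queried since the last write."""
    if not (inode.nsec & QUERIED):
        now_ns -= now_ns % TICK_NS  # nobody is looking: low-res is enough
    # Storing the new value also clears the flag.
    inode.sec, inode.nsec = divmod(now_ns, NSEC_PER_SEC)

ino = Inode()
write(ino, 5_007_123_456)   # never queried: stored low-res
first = query_mtime(ino)    # sets the flag
write(ino, 5_009_654_321)   # queried: stored high-res, flag cleared
write(ino, 5_015_222_222)   # not queried since last write: low-res again
last = query_mtime(ino)
print(first, last)
```

Only the writes that follow a query pay for the high-res clock; all the others keep using the cheap low-res value.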

This was attempted in Linux 6.6 and didn't work. Consider this scenario:

  1. file1 is written to.
  2. The modification time for file2 is queried.
  3. file2 is written to.
  4. The modification time for file2 is queried (again).
  5. file1 is written again.
  6. The modification time for file1 is queried.

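If you were to compare the mtimes of file1 and file2, they would tell you that file2 was last written after file1, which is not what happened. The mix of resolutions "tricks" the system: file2 is stamped with the high-res clock, while file1 is stamped with the low-res clock, which can lag behind it. Suppose the last writes actually happened at these times:

file1: 2:01:01:001:999

file2: 2:01:01:001:023

file1 is recorded with the low-res value 2:01:01:001, which sorts before file2's high-res 2:01:01:001:023 even though file1 was written later.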
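The six steps of the scenario can be replayed in a toy model to see the inversion. This is a sketch under assumptions: times are plain microsecond counters, the low-res tick is 1 ms, and File, write and query are hypothetical helpers, not kernel code.

```python
TICK_US = 1_000  # assumed low-res granularity: 1 ms, in microseconds

class File:
    def __init__(self):
        self.mtime_us = 0
        self.queried = False

def write(f, now_us):
    """High-res stamp if the mtime was queried since the last write."""
    if f.queried:
        f.mtime_us = now_us
    else:
        f.mtime_us = now_us - now_us % TICK_US
    f.queried = False

def query(f):
    f.queried = True
    return f.mtime_us

file1, file2 = File(), File()
write(file1, 1_010)        # 1. file1 is written (low-res: 1_000)
query(file2)               # 2. file2's mtime is queried
write(file2, 1_023)        # 3. file2 is written (high-res: 1_023)
mtime2 = query(file2)      # 4. file2's mtime is queried again
write(file1, 1_999)        # 5. file1 is written again (low-res: 1_000)
mtime1 = query(file1)      # 6. file1's mtime is queried

print("file1:", mtime1, "file2:", mtime2)
print("file1 looks older than file2:", mtime1 < mtime2)
```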

The solution comes in Linux 6.13, which implements a floor value. When a new high-res timestamp is created, the floor value is updated to make sure that any later low-res timestamp is equal to or later than the high-res one. So, in our example:

file1: 2:01:01:002

This forces the order to be respected, fixing the issue [1] [2].
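A minimal sketch of the floor idea, again as a toy model with assumed names (Clock, File, write, query), microsecond counters and a 1 ms tick. In this simplified version the low-res clock simply never returns a value below the floor, so file1 ends up with a timestamp equal to file2's; the exact value the kernel hands out may differ, but the ordering is no longer inverted.

```python
TICK_US = 1_000  # assumed low-res granularity: 1 ms, in microseconds

class Clock:
    def __init__(self):
        self.floor = 0  # latest high-res timestamp handed out so far

    def fine(self, now_us):
        self.floor = max(self.floor, now_us)
        return now_us

    def coarse(self, now_us):
        # Never go below the floor, even if the last tick is older.
        return max(now_us - now_us % TICK_US, self.floor)

class File:
    def __init__(self):
        self.mtime_us = 0
        self.queried = False

def write(clock, f, now_us):
    f.mtime_us = clock.fine(now_us) if f.queried else clock.coarse(now_us)
    f.queried = False

def query(f):
    f.queried = True
    return f.mtime_us

clock = Clock()
file1, file2 = File(), File()
write(clock, file1, 1_010)   # 1. low-res
query(file2)                 # 2.
write(clock, file2, 1_023)   # 3. high-res, floor becomes 1_023
mtime2 = query(file2)        # 4.
write(clock, file1, 1_999)   # 5. low-res, but clamped to the floor
mtime1 = query(file1)        # 6.

print("file1:", mtime1, "file2:", mtime2)
print("order respected:", mtime1 >= mtime2)
```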

Sources: