Data-in-inode
GPFS uses 4 KiB inodes by default, and will try to store a file's data within those 4 KiB if there is enough free space. Such files need no data blocks or data sub-blocks at all. As a rule of thumb, I say that files smaller than 3.5 KiB will be stored as data-in-inode, but lots of xattrs or other file metadata can lower that limit.
Since metadata is not encrypted by the file system encryption-at-rest feature, data-in-inode is disabled when encryption is in use. That means any file larger than 0 bytes on an encrypted file system will use at minimum a 4 KiB inode plus 1 sub-block (likely 8 KiB or 16 KiB, depending on the file system block size).
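To get a feel for what that costs, here is a back-of-the-envelope calculation for one million tiny files, assuming 4 KiB inodes and 8 KiB sub-blocks (both are assumptions for illustration; check your own file system's inode and block sizes):

```shell
# Rough on-disk footprint of one million tiny files, with and without
# data-in-inode. 4 KiB inodes and 8 KiB sub-blocks are assumed values.
files=1000000
inode=4096      # bytes per inode
subblock=8192   # bytes per data sub-block
plain=$((files * inode))                   # data-in-inode: inode only
encrypted=$((files * (inode + subblock)))  # encrypted: inode + 1 sub-block
echo "data-in-inode: $((plain / 1024 / 1024)) MiB"
echo "encrypted:     $((encrypted / 1024 / 1024)) MiB"
```

With these assumed sizes the encrypted layout roughly triples the footprint of a small-file workload.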
To figure out how many files are stored as data-in-inode, we can use the policy engine to look for files with KB_ALLOCATED=0:
# cat data-in-inode.policy
RULE LIST 'data-in-inode' WHERE KB_ALLOCATED=0
# mmapplypolicy archive -P data-in-inode.policy -I defer -f /tmp/policy
<snip>
[I] Summary of Rule Applicability and File Choices:
Rule# Hit_Cnt KB_Hit Chosen KB_Chosen KB_Ill Rule
0 799 0 799 0 0 RULE '' LIST 'data-in-inode' WHERE(.)
and if we check /tmp/policy.list.data-in-inode we get a listing of these files. Sorted by size, I see that my largest data-in-inode file is 3789 bytes:
# awk -F ' -- ' '{printf "ls -l \"%s\"\n", $2}' /tmp/policy.list.data-in-inode|bash|sort -n -k5|tail
-rw-------. 1 root root 814 Aug 8 11:41 /gpfs/scalemgmt/.ltfsee/meta/0000VLIBRARY_LL0/cart_repos/VTAP48L5.cartstat
-rw-r--r--. 1 root root 814 Jun 10 10:54 /gpfs/scalemgmt/.ltfsee/statesave/completed/229/1229/summary.json
-rw-------. 1 root root 820 Aug 8 11:41 /gpfs/scalemgmt/.ltfsee/meta/0000VLIBRARY_LL0/cart_repos/VTAP49L5.cartstat
-rwx------. 1 root root 851 Jul 30 12:47 /gpfs/scalemgmt/cesSharedRoot/ces/s3-config/config.json
-rw-rw-rw-. 1 root root 995 Jun 16 09:43 /gpfs/scalemgmt/.ltfsee/meta/0000VLIBRARY_LL0/volume_cache/VTAP40L5.schema
-rw-rw-rw-. 1 root root 995 Jun 16 09:52 /gpfs/scalemgmt/.ltfsee/meta/0000VLIBRARY_LL0/volume_cache/VTAP48L5.schema
-rw-rw-rw-. 1 root root 995 Jun 16 09:58 /gpfs/scalemgmt/.ltfsee/meta/0000VLIBRARY_LL0/volume_cache/VTAP31L5.schema
-rw-rw-rw-. 1 root root 995 Jun 16 09:58 /gpfs/scalemgmt/.ltfsee/meta/0000VLIBRARY_LL0/volume_cache/VTAP32L5.schema
-rw-rw-rw-. 1 root root 995 Jun 4 12:48 /gpfs/scalemgmt/.ltfsee/meta/0000VLIBRARY_LL0/volume_cache/VTAP44L5.schema
-rw-r--r--. 1 root root 3789 Jun 16 09:48 /gpfs/scalemgmt/.ltfsee/statesave/completed/280/1280/subtask.1281/subtask.1282/subtask.1283/msg.txt
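A note on the awk|bash pipeline above: the policy list file separates the file attributes from the path name with " -- ", which is why awk splits on that delimiter and takes the second field. A runnable illustration with a made-up list line (the attribute fields before " -- " are placeholders, not real output):

```shell
# A made-up policy list line; the fields before " -- " are placeholders.
line='45665 1077956550 0   -- /gpfs/scalemgmt/testfile'

# Same extraction as the pipeline above: split on " -- ", take the path,
# and emit an ls -l command for it.
echo "$line" | awk -F ' -- ' '{printf "ls -l \"%s\"\n", $2}'
```

Piping the generated commands into bash, as above, works for these paths but will misbehave on file names containing double quotes, so treat it as a quick hack rather than a robust tool.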
and if we poke into the file system using tsdbfs, we can see that this file has indirectionLevel=INODE, which means that it's stored as data-in-inode:
# ls -i /gpfs/scalemgmt/.ltfsee/statesave/completed/280/1280/subtask.1281/subtask.1282/subtask.1283/msg.txt
45665 /gpfs/scalemgmt/.ltfsee/statesave/completed/280/1280/subtask.1281/subtask.1282/subtask.1283/msg.txt
# echo inode 45665 | tsdbfs scalemgmt | grep indirectionLevel
indirectionLevel=INODE status=USERFILE
Then we can check how much data fits within the inode:
## Create a test file and find its inode number
# touch /gpfs/scalemgmt/testfile
# ls -i /gpfs/scalemgmt/testfile
59139 /gpfs/scalemgmt/testfile
# for i in $(seq 1 4096|tac) ; do dd if=/dev/zero of=/gpfs/scalemgmt/testfile bs=1 count=$i ; sync -d /gpfs/scalemgmt/testfile ; echo inode 59139 | tsdbfs scalemgmt | grep indirectionLevel=INODE && break ; done
<snip>
3878 bytes (3.9 kB, 3.8 KiB) copied, 0.0239968 s, 162 kB/s
3877+0 records in
3877+0 records out
3877 bytes (3.9 kB, 3.8 KiB) copied, 0.0241104 s, 161 kB/s
3876+0 records in
3876+0 records out
3876 bytes (3.9 kB, 3.8 KiB) copied, 0.0230618 s, 168 kB/s
indirectionLevel=INODE status=USERFILE
So the largest file fitting as data-in-inode in this file system was 3876 bytes. If I add a single xattr, it moves out of data-in-inode:
# mmchattr --set-attr user.test=1 /gpfs/scalemgmt/testfile
# echo inode 59139 | tsdbfs scalemgmt | grep indirectionLevel
indirectionLevel=DIRECT status=USERFILE
and then there's only room for 3864 bytes, 12 bytes less than before:
# for i in $(seq 1 4096|tac) ; do dd if=/dev/zero of=/gpfs/scalemgmt/testfile bs=1 count=$i ; sync -d /gpfs/scalemgmt/testfile ; echo inode 59139 | tsdbfs scalemgmt | grep indirectionLevel=INODE && break ; done
<snip>
3866 bytes (3.9 kB, 3.8 KiB) copied, 0.0270999 s, 143 kB/s
3865+0 records in
3865+0 records out
3865 bytes (3.9 kB, 3.8 KiB) copied, 0.0268111 s, 144 kB/s
3864+0 records in
3864+0 records out
3864 bytes (3.9 kB, 3.8 KiB) copied, 0.0262779 s, 147 kB/s
indirectionLevel=INODE status=USERFILE
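The countdown loop runs dd up to 4096 times; a binary search over the file size finds the same threshold in about twelve probes. Below is a sketch where fits_in_inode is a stub, here hard-coded against the 3876-byte limit measured above so the search logic runs anywhere; on a real GPFS file system you would replace its body with the dd/sync/tsdbfs check from the loop above:

```shell
#!/bin/bash
# Stub predicate: does a file of $1 bytes fit as data-in-inode?
# On GPFS, replace the body with something like:
#   dd if=/dev/zero of=/gpfs/scalemgmt/testfile bs=1 count=$1 2>/dev/null
#   sync -d /gpfs/scalemgmt/testfile
#   echo inode 59139 | tsdbfs scalemgmt | grep -q indirectionLevel=INODE
fits_in_inode() { [ "$1" -le 3876 ]; }

lo=0 hi=4096
while [ "$lo" -lt "$hi" ]; do
    mid=$(( (lo + hi + 1) / 2 ))
    if fits_in_inode "$mid"; then
        lo=$mid          # mid bytes still fit; search upward
    else
        hi=$((mid - 1))  # mid bytes spill to a data block; search downward
    fi
done
echo "largest data-in-inode size: $lo bytes"
```

With the stub above the search converges on 3876 bytes, matching the brute-force result.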