42

As the subject says: I want to know why every directory has a size of 4K even when it contains files larger than 4K.

Please have a look at the following:

$ ls -lh
total 2.0M
drwxr-xr-x 4 ankit ankit 4.0K Sep 11 07:28 Desktop

$ ls -lrh Desktop/
-rw-rw-r-- 1 ankit ankit 9.1M Aug 4 11:15 sophosthreatsaurusaz.pdf
-rw------- 1 ankit ankit 107K Dec 27 2010 KP 3 0.pdf
drwxrwsr-x 9 ankit ankit 4.0K Sep 10 19:26 eclipse

PS: I am aware of du -sh command line utility.

Edit: I am assuming that a directory is a container for files.

galoget
Ankit

3 Answers

39
  • Without getting too technical, think of a directory entry as simply a "link" to a list of the files the directory "contains."
  • As with any other file, ls shows you the size of that link itself, not the total space occupied by the contents of the directory.
  • The minimum size a file or directory entry/link must occupy is one block, which is 4096 bytes (4K) on most ext3/ext4 filesystems.
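The difference between the size of the directory itself and the size of what it contains can be checked with a small Python sketch. The function name dir_entry_size_vs_contents and the names demo and big.bin are made up for this illustration, and the number reported for the directory depends on the filesystem (4096 is typical for ext4, other filesystems report other values).

```python
import os
import tempfile


def dir_entry_size_vs_contents(payload_bytes: int = 1024 * 1024):
    """Return (size reported for the directory itself, total size of its files)."""
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "demo")  # hypothetical directory name
        os.mkdir(demo)
        with open(os.path.join(demo, "big.bin"), "wb") as fh:
            fh.write(b"\0" * payload_bytes)
        # This is the number `ls -l` prints for the directory: its own st_size.
        own_size = os.stat(demo).st_size
        # This adds up the sizes of the regular files stored inside it.
        contents = sum(e.stat().st_size for e in os.scandir(demo) if e.is_file())
        return own_size, contents


if __name__ == "__main__":
    own, contents = dir_entry_size_vs_contents()
    print(f"directory itself: {own} bytes, files inside: {contents} bytes")
```

The first number stays small no matter how large the file inside is, which matches what ls -lh shows for Desktop in the question.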
ish
  • You say that "The minimum size a file or directory entry/link must occupy is one block" but I am sure that I have seen file sizes less than 4K. – lakshayg Aug 13 '16 at 15:18
  • @LakshayGarg although the file can be smaller than 4K, it then causes what is called "internal fragmentation", where just a few bytes of the block have been used to store the small file. – campescassiano Apr 27 '18 at 06:11
  • @phyloflash some filesystems (e.g. NTFS) store small files in the file entries themselves (for NTFS it's in the MFT entry). This way their contents occupy zero allocation blocks, and internal fragmentation is reduced. – Ruslan Nov 02 '19 at 09:03
  • @Ruslan : EXT4 has an option for that too (it's called "inline data"), but it's disabled by default, because not yet stabilized (see https://lkml.kernel.org/linux-ext4/20190914232801.GD19710@mit.edu/T/) – ChennyStar Jan 09 '24 at 15:12
33

To understand this, it helps to have some basic knowledge of the following file system concepts:

  • inode (contains file attributes, metadata of file, pointer structure)
  • file (can be thought of as a filename paired with its inode; the inode points to the raw data blocks on the block device)
  • directory (just a special file, container for other filenames. It contains an array of filenames and inode numbers for each filename. Also it describes the relationship between parent and children.)
  • symbolic link VS hard link
  • dentry (directory entries)
  • ...

On a typical ext4 file system (what most people use), the default inode size is 256 bytes and the block size is 4096 bytes.

A directory is just a special file which contains an array of filenames and inode numbers. When the directory was created, the file system allocated 1 inode to the directory with a "filename" (dir name in fact). The inode points to a single data block (minimum overhead), which is 4096 bytes. That's why you see 4096 / 4.0K when using ls.
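The "array of filenames and inode numbers" can be made visible with a short Python sketch. The helper name list_directory_entries and the file names a.txt and b.txt are assumptions made for this example; os.scandir and DirEntry.inode() are standard library calls.

```python
import os
import tempfile


def list_directory_entries(path: str):
    """Return sorted (filename, inode number) pairs for the entries in a directory."""
    with os.scandir(path) as it:
        return sorted((entry.name, entry.inode()) for entry in it)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Create two small files so the directory has something to list.
        for name in ("a.txt", "b.txt"):
            with open(os.path.join(tmp, name), "w") as fh:
                fh.write("x")
        for name, inode in list_directory_entries(tmp):
            print(f"{inode:>12}  {name}")
```

Each printed line is one directory entry: a name and the inode number it refers to. The file contents are not part of the directory, which is why its size does not grow with them.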

You can get the details by using tune2fs and dumpe2fs.

Example

root@ubuntu:~# tune2fs -l /dev/ubuntu/root 
tune2fs 1.42 (29-Nov-2011)
Filesystem volume name:   <none>
Last mounted on:          /
Filesystem UUID:          2fca4cbb-22f1-4328-ab13-cacedb360930
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash 
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              967680
Block count:              3931136
Reserved block count:     0
Free blocks:              2537341
Free inodes:              517736
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      416
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8064
Inode blocks per group:   504
RAID stride:              35637
Flex block group size:    16
Filesystem created:       Thu Mar 15 14:31:04 2012
Last mount time:          Sat Oct 20 20:28:04 2012
Last write time:          Sat Oct 20 20:23:32 2012
Mount count:              1
Maximum mount count:      -1
Last checked:             Sat Oct 20 20:22:57 2012
Check interval:           0 (<none>)
Lifetime writes:          54 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
First orphan inode:       272350
Default directory hash:   half_md4
Directory Hash Seed:      d582ad79-75a0-4964-9a48-33ddba04df5c
Journal backup:           inode blocks
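If root access for tune2fs is not available, the block size alone can also be read without privileges. The following is a minimal Python sketch for Unix-like systems; the function name filesystem_block_size is made up here, and it assumes that f_frsize from os.statvfs corresponds to the "Block size" line above, which is normally the case on ext4.

```python
import os


def filesystem_block_size(path: str = "/") -> int:
    """Return the fundamental block size of the filesystem that holds `path`."""
    # f_frsize is the unit in which statvfs reports block counts.
    return os.statvfs(path).f_frsize


if __name__ == "__main__":
    print(filesystem_block_size("/"))
```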
Terry Wang
  • Nice and thorough explanation. Note that that's one feature where NTFS gets the better of EXT4 : NTFS has actually 2 block sizes, one for files ("cluster size") and one for directories ("index block size"). In EXT4, if you decide to use for example 64K blocks, then every directory will also take 64K, which is a waste of space. NTFS handles that aspect better. – ChennyStar Jan 09 '24 at 15:02
9

If a file contains any data at all (even a single byte), it will occupy at least one block on the disk (typically 4K these days). A block cannot be shared between files. This means that the rest of that block is not available to other files, so the whole block is considered "used".

Source
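As a rough illustration of the rounding described in this answer, the space taken by a file can be modelled as its size rounded up to a whole number of blocks. This is a simplified sketch, not how any particular filesystem does its accounting: the function name allocated_bytes is made up, the 4096-byte default is an assumption taken from the answers above, and features such as inline data or sparse files are ignored.

```python
def allocated_bytes(file_size: int, block_size: int = 4096) -> int:
    """Space taken by a file when storage is handed out in whole blocks."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


if __name__ == "__main__":
    for size in (0, 1, 4096, 4097):
        print(size, "->", allocated_bytes(size))
```

Under this model a 1-byte file takes 4096 bytes and a 4097-byte file takes 8192 bytes, which is the "internal fragmentation" mentioned in the comments on the first answer.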

Anwar
ThiagoPonte