Recitation 2
What is the problem this paper tried to solve? (Hint: please state the problem using data)
The combination of a small block size (512 bytes), limited read-ahead in the system, and many seeks (blocks are allocated arbitrarily across the disk) severely limits file system throughput to about 2% of the maximum disk bandwidth.
The original 512 byte UNIX file system is incapable of providing the data throughput rates that many applications require.
Hence there is a need for a faster file system.
What is the root cause of the problem? In one sentence.
Advances in both software and hardware have outpaced the file system, so it has to be improved to catch up with them (in other words, the "worse is better" philosophy of the original design is the cause).
For each optimization, how does the author get the idea? (Hint: just imagine you're the author. Given the problem, what experiment would you do to get more info?)
- Bigger block size (4096 bytes): The author was inspired by the earlier improvement work on the UNIX file system at Berkeley, which changed the block size from 512 bytes to 1024 bytes and showed that increasing the block size improves throughput.
The traditional file system never transfers more than 512 bytes per disk transaction and often finds that the next sequential data block is not on the same cylinder, forcing seeks between 512-byte transfers.
- Optimize storage utilization: Measurements of wasted space on active user file systems at different block sizes showed that 4096-byte blocks waste too much space, because many files in daily use are small. To keep the waste close to that of the old file system, the author split a single block into smaller fragments.
- File system parameterization: The old file system ignores the parameters of the underlying hardware. If a file system is configurable, blocks can be allocated in an optimum configuration-dependent way, adapted to the characteristics of the disk.
- Layout policies: Once the underlying hardware is taken into account, the higher-level layout policies have to be adjusted to improve performance. Drawing on two methods, increasing the locality of reference and improving the layout of data, the author uses global layout policies that try to improve performance by clustering related information.
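The storage-utilization argument above can be checked with a little arithmetic. The following is a minimal sketch, not the paper's measurement method: the file sizes are made up for illustration, the 1024-byte fragment size is just one possible choice, and the model ignores inodes and indirect blocks.

```python
def allocated_bytes(file_size, block_size, fragment_size=None):
    """Disk space consumed by the data of one file (simplified model).

    Without fragments, the file is rounded up to whole blocks. With
    fragments, full blocks hold the bulk of the file and only the tail
    is rounded up to whole fragments.
    """
    if fragment_size is None:
        fragment_size = block_size
    full_blocks, tail = divmod(file_size, block_size)
    tail_fragments = -(-tail // fragment_size)  # ceiling division
    return full_blocks * block_size + tail_fragments * fragment_size


def waste_fraction(file_sizes, block_size, fragment_size=None):
    """Fraction of the allocated space that holds no file data."""
    used = sum(file_sizes)
    allocated = sum(
        allocated_bytes(size, block_size, fragment_size) for size in file_sizes
    )
    return (allocated - used) / allocated


# Hypothetical workload dominated by small files (sizes in bytes).
sample_sizes = [100, 700, 1500, 3000, 5000, 20000]

for label, block, frag in [
    ("512-byte blocks", 512, None),
    ("4096-byte blocks, no fragments", 4096, None),
    ("4096-byte blocks, 1024-byte fragments", 4096, 1024),
]:
    print(f"{label}: {waste_fraction(sample_sizes, block, frag):.1%} wasted")
```

On a workload like this, large blocks without fragments waste far more space than small blocks, and fragments recover most of the difference while keeping the large transfer unit.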
For each optimization, does it always work? In which cases does it not work?
- Bigger block size: The performance of the new file system is currently limited by the memory-to-memory copy operations required to move data from disk buffers in the system’s address space to data buffers in the user’s address space.
When the buffers are not well aligned, transferring data becomes even more expensive.
- Layout policies: In order for the layout policies to be effective, a file system cannot be kept completely full.
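One way to see why a nearly full file system defeats the layout policies is a toy model, which is an assumption of these notes and not taken from the paper: if every block is independently in use with probability f, the number of candidate blocks examined before a free one turns up follows a geometric distribution with mean 1/(1 - f). The function name below is hypothetical.

```python
def expected_probes(fill_fraction):
    """Expected number of blocks examined before a free one is found.

    Toy model: each block is independently in use with probability
    fill_fraction, so the count is geometric with mean 1 / (1 - f).
    """
    if not 0 <= fill_fraction < 1:
        raise ValueError("fill_fraction must be in [0, 1)")
    return 1.0 / (1.0 - fill_fraction)


for fill in (0.5, 0.9, 0.99):
    print(f"{fill:.0%} full: about {expected_probes(fill):.0f} probes")
```

The more candidates that have to be skipped, the farther the allocated block is likely to land from the preferred position, so related data can no longer be clustered.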
Do you have some better ideas to implement a faster FS?
- The paper targets rotational hard disks. On storage devices that have no cylinders, the file system may be even slower. A file system that works well across all kinds of storage devices is needed.
- The default policy is to allocate one inode for each 2048 bytes of space in the cylinder group, expecting this to be far more than will ever be needed. 2048 bytes seems too small. It is worth doing some research on it.
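A starting point for that research is to quantify what the inode density costs. The sketch below is illustrative only: the 128-byte inode size and the 4 MiB cylinder group are assumptions made here, not figures from these notes, and `inode_budget` is a hypothetical helper.

```python
def inode_budget(group_bytes, bytes_per_inode=2048, inode_size=128):
    """Return (inode count, fraction of the group spent on the inode table).

    bytes_per_inode is the allocation policy under discussion;
    inode_size is an assumed on-disk inode size.
    """
    inodes = group_bytes // bytes_per_inode
    table_bytes = inodes * inode_size
    return inodes, table_bytes / group_bytes


group = 4 * 1024 * 1024  # assumed 4 MiB cylinder group

for density in (2048, 4096, 8192):
    count, overhead = inode_budget(group, bytes_per_inode=density)
    print(f"1 inode per {density} bytes: {count} inodes, {overhead:.2%} of the group")
```

Comparing the chosen density with the mean file size measured on real file systems would show how many of these inodes are ever used, which is the trade-off between wasted inode-table space and the risk of running out of inodes.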