Storage System

Storage hierarchy

Cache, memory -> hard disks, SSD, Tape, Optical Disk
(ordered by read/write speed and cost)

Access time

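Time taken before the drive is ready to transfer data
(the time a physical device, such as a hard disk or memory, needs to locate the target position before it can transfer data)
Typically:
Memory: nanoseconds
SSD: microseconds
HDD: milliseconds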

Access times.png

Storage device information

  • Characteristics of a storage device:

    • Capacity (bytes)
    • Cost (price per byte of storage)
    • Bandwidth (number of bytes that can be transferred per second; read and write bandwidth can differ)
    • Latency (waiting time before the response/delivery of data)
  • Basic operations: CRUD (create, read, update, delete)

  • The time to complete an operation depends on both bandwidth and latency:
    CompletionTime = Latency + Size/Bandwidth
    Influencing factors:
    technology (HDD or SSD); operation type (read or write); number of operations in the workload; access pattern (sequential or random)

  • Access pattern:

    1. Sequential: data to be accessed are located next to each other or sequentially on the device
    2. Random: data located randomly on the storage device
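The completion-time formula above can be sketched in a few lines of Python. The latency and bandwidth figures for the two devices are illustrative assumptions, not measurements from these notes; they only show how latency dominates small transfers.

```python
def completion_time_s(size_bytes: float, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """CompletionTime = Latency + Size / Bandwidth (seconds, bytes, bytes per second)."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


# ASSUMPTION: hypothetical device parameters, chosen only for illustration.
hdd = completion_time_s(4_000, latency_s=10e-3, bandwidth_bytes_per_s=100e6)
ssd = completion_time_s(4_000, latency_s=100e-6, bandwidth_bytes_per_s=500e6)

print(f"HDD: {hdd * 1e3:.3f} ms, SSD: {ssd * 1e3:.3f} ms")
```

For a small 4 KB request the Size/Bandwidth term is tiny, so the result is almost entirely the latency term.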

Hard Disk Drive

HDD structure.png
  • One or more spinning magnetic platters
    • Typically two surfaces per platter
  • The disk arm positions the head over the radial position (track) where the data are stored
    • It swings across tracks (but does not extend or shrink)
  • Data are read/written by a read/write head as the platter spins

Hard disk head movement while copying files between two folders: https://www.youtube.com/watch?v=BlB49F6ExkQ

  • Physical characteristics:
    form factor: 2.5" in laptops; 3.5" common in desktops
    rotational speed: 4,800 / 5,400 / 7,200 / 10,000 RPM (rotations per minute)
    number of platters: 5 to 7
    current capacity: 10 TB (Western Digital)

  • Disk organization: platter -> tracks -> sectors
    Each platter consists of a number of tracks;
    Each track is divided into N fixed size sectors (sector size: 4KB)

CHS (cylinder-head-sector)

An early way to address a sector (Logical Block Addressing, LBA, is more common now)


CHS structure.png
Example:
  number of cylinders: 256
  number of heads: 16 (i.e., 8 platters, 2 heads per platter)
  sectors per track: 64
  sector size: 4 KB

Capacity of the drive:
capacity = C * H * S * sector size
         = 2^8 * 2^4 * 2^6 * 2^12 bytes = 2^30 bytes = 1 GB
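As a quick check of the capacity formula, here is a minimal Python sketch using the numbers from the example. The function name `chs_capacity_bytes` is made up for illustration.

```python
def chs_capacity_bytes(cylinders: int, heads: int, sectors_per_track: int, sector_size: int) -> int:
    """capacity = C * H * S * sector size, in bytes."""
    return cylinders * heads * sectors_per_track * sector_size


# Values from the example: 256 cylinders, 16 heads, 64 sectors per track, 4 KB sectors.
capacity = chs_capacity_bytes(cylinders=256, heads=16, sectors_per_track=64, sector_size=4 * 1024)

print(capacity)            # total bytes
print(capacity == 2**30)   # True: exactly 1 GB (2^30 bytes)
```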

The CHS address locates the data on the disk; the head must reach that location before the transfer can start. The total access time is therefore:

T = Tseek + Trotation + Ttransfer
Tseek: time to move the disk head to the right track
Trotation: time to wait for the right sector to rotate under the head
Ttransfer: time to actually transfer the data

  1. Rotational latency: waiting for the right sector to rotate under the head
    On average: about 1/2 of the time of a full rotation


    rotation.png
Example:
Assume 10,000 RPM (rotations per minute)
60,000 ms / 10,000 rotations = 6 ms per rotation, so the average rotational latency is about 3 ms
  2. Seek time (when the head must cross tracks): waiting for the head to reach the right track
    On average, seek time is about 1/3 of the maximum seek time


    seek the track.png

  3. Transfer time (determined by the transfer bandwidth)

Assume 512 KB of data are transferred at a bandwidth of 128 MB/s
Transfer time: 512 KB / (128 MB/s) * 1000 ms/s = 4 ms (taking 1 MB = 1000 KB)
  4. Actual bandwidth
    Actual bandwidth = amount of data / actual time
    Actual time = Tseek + Trotation + Ttransfer
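The pieces above can be combined in a small Python sketch. The rotation speed, transfer size and transfer bandwidth come from the examples in this section; the 7 ms average seek time is an assumption added here, since no seek time is given at this point. The helper names are illustrative.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: half of one full rotation, in ms."""
    return 60_000 / rpm / 2


def transfer_time_ms(size_kb: float, bandwidth_mb_s: float) -> float:
    """Transfer time in ms, taking 1 MB = 1000 KB."""
    return size_kb / (bandwidth_mb_s * 1000) * 1000


def access_time_ms(seek_ms: float, rpm: float, size_kb: float, bandwidth_mb_s: float) -> float:
    """T = Tseek + Trotation + Ttransfer, in ms."""
    return seek_ms + avg_rotational_latency_ms(rpm) + transfer_time_ms(size_kb, bandwidth_mb_s)


def actual_bandwidth_mb_s(size_kb: float, total_ms: float) -> float:
    """Actual bandwidth = amount of data / actual time, in MB/s."""
    return (size_kb / 1000) / (total_ms / 1000)


# ASSUMPTION: 7 ms average seek time (not given in this part of the notes).
total = access_time_ms(seek_ms=7, rpm=10_000, size_kb=512, bandwidth_mb_s=128)
bandwidth = actual_bandwidth_mb_s(512, total)

print(f"rotation: {avg_rotational_latency_ms(10_000):.1f} ms")
print(f"transfer: {transfer_time_ms(512, 128):.1f} ms")
print(f"total: {total:.1f} ms, actual bandwidth: {bandwidth:.2f} MB/s")
```

Under these assumptions the actual bandwidth comes out far below the 128 MB/s transfer bandwidth, because seek and rotation take more time than the transfer itself.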

Sector vs. Block

  • A block is the smallest unit of the file system
  • A sector is the smallest unit of the hard disk
  • A block consists of one or more sectors

Sequential vs. Random

Sequential operation:

  • May assume all sectors involved are on the same track
    -- Need to seek to the right track and rotate to the first sector once
    -- But no further seeking or rotation is needed afterward

Random operation: may assume every block is on a different track and sector, so each access pays the seek and rotational delay again

Example: 7 ms average seek, 10,000 RPM, 50 MB/s transfer rate, 4 KB per block
Sequential access of 10 MB:
– Completion time = 7 ms + (60 * 1000 / 10000 / 2) ms + (10 / 50 * 1000) ms = 7 + 3 + 200 = 210 ms
– Actual bandwidth = 10 MB / 210 ms = 47.62 MB/s

Random access of 10 MB:
– Number of blocks: 10 * 1000 / 4 = 2500 (assume 1 block = 1 sector)
– Completion time = 2500 * (7 + 3 + 4/50) ms = 25,200 ms = 25.2 s
– Actual bandwidth = 10 MB / 25.2 s = 0.397 MB/s
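The sequential and random cases can be reproduced with a short Python sketch. It uses the parameters from the example and the same simplifications (1 MB = 1000 KB, one block per sector); the constant and function names are illustrative.

```python
SEEK_MS = 7.0            # average seek time
RPM = 10_000             # rotation speed
BANDWIDTH_MB_S = 50.0    # transfer rate
BLOCK_KB = 4.0           # block size (assume 1 block = 1 sector)

ROTATION_MS = 60_000 / RPM / 2  # average rotational latency: half a rotation


def sequential_ms(size_mb: float) -> float:
    """One seek and one rotational delay, then a continuous transfer."""
    return SEEK_MS + ROTATION_MS + size_mb / BANDWIDTH_MB_S * 1000


def random_ms(size_mb: float) -> float:
    """Every block pays seek + rotation + its own small transfer."""
    blocks = size_mb * 1000 / BLOCK_KB
    per_block_ms = SEEK_MS + ROTATION_MS + BLOCK_KB / (BANDWIDTH_MB_S * 1000) * 1000
    return blocks * per_block_ms


def bandwidth_mb_s(size_mb: float, total_ms: float) -> float:
    """Actual bandwidth = amount of data / actual time."""
    return size_mb / (total_ms / 1000)


seq = sequential_ms(10)
rnd = random_ms(10)
print(f"sequential: {seq:.0f} ms, {bandwidth_mb_s(10, seq):.2f} MB/s")
print(f"random:     {rnd:.0f} ms, {bandwidth_mb_s(10, rnd):.3f} MB/s")
```

Changing `SEEK_MS` or `RPM` shows how strongly random access depends on mechanical delays, while sequential access is dominated by the transfer rate.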

Solid State Drive

SSD.png
  • All electronic, made from flash memory
  • Limited lifetime: each cell can be written only a limited number of times
  • Significantly better latency: no seek or rotational delay
  • Much better performance on random access (however, writes have much higher latency than reads)
Speed comparison between read and write.png

Structure of an SSD

  • An SSD contains a number of flash memory chips
    chip -> dies -> planes -> blocks -> pages (rows) -> cells
    • Typically, a chip has 1, 2, or 4 dies
    • A die may have 1 or 2 planes
    • A plane has a number of blocks
    • A block has a number of pages
    • A page has a number of cells
Die Layout.png
  • A page is the smallest unit of data transfer between the SSD and main memory

How data is stored in SSD

  • Cells are made of floating-gate transistors: applying a high positive or negative voltage to the control gate attracts electrons to, or repels them from, the floating gate
    • State = 1 if there are no electrons in the floating gate
    • State = 0 if there are electrons (negative charges)
      – Electrons stay trapped there even when the power is off
      – So the state is retained
  • Data in an SSD are stored as bit patterns ('101010...'), each bit given by the electron state of one cell
floating-gate transistor.png

Read Operations

  • Electrons on the floating gate affect the threshold voltage for the floating gate transistor to conduct
  • Higher voltage needed when gate has electrons


    Read operation.png
Steps:
  • Apply Vint (an intermediate voltage)
  • If current is detected, the gate has no electrons => bit = 1
  • If no current is detected, the gate must have electrons => bit = 0
  • A page is the smallest unit that can be read (I will not go into further detail here.)
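The read steps can be modelled with a toy Python sketch. The three voltage values are made-up assumptions chosen only so that the intermediate voltage lies between the two thresholds; real devices use different values, and the function names are illustrative.

```python
# ASSUMPTION: hypothetical voltages, for illustration only.
V_TH_ERASED = 1.0      # threshold when the floating gate holds no electrons
V_TH_PROGRAMMED = 5.0  # higher threshold when the floating gate holds electrons
V_INT = 3.0            # intermediate read voltage, between the two thresholds


def read_cell(has_electrons: bool) -> int:
    """Apply V_INT: the cell conducts only if V_INT exceeds its threshold voltage."""
    threshold = V_TH_PROGRAMMED if has_electrons else V_TH_ERASED
    current_detected = V_INT > threshold
    return 1 if current_detected else 0


def read_page(page: list) -> list:
    """A page is the smallest readable unit, so all of its cells are read together."""
    return [read_cell(has_electrons) for has_electrons in page]


# Each entry says whether that cell's floating gate currently holds electrons.
page = [False, True, False, True]
print(read_page(page))  # [1, 0, 1, 0]
```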

Write and erase

  • Write: 1 => 0
    – Apply high positive voltage (>> voltage for read) to the control gate
    – Attract electrons from channel to floating gate (through quantum tunneling)
    – Page is the smallest unit for write

  • Erase: 0 => 1 (empty the floating gate of electrons)
    – Need to apply a much higher negative voltage to the control gate
    – This removes the electrons from the floating gate
    – May stress surrounding cells (dangerous to do on individual pages)
    – A block is the smallest unit for erase
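The different granularities of write and erase can be illustrated with a toy Python model. The class `FlashBlock` and its sizes are hypothetical and ignore all controller logic. It only captures three rules: a write works on one page, a write can only turn 1 into 0, and an erase resets a whole block to 1.

```python
class FlashBlock:
    """Toy model of one flash block (hypothetical sizes, no controller logic)."""

    def __init__(self, pages_per_block: int = 4, bits_per_page: int = 8):
        # An erased block holds all 1s.
        self.pages = [[1] * bits_per_page for _ in range(pages_per_block)]

    def program_page(self, page_no: int, data: list) -> None:
        """Write one page; only 1 -> 0 transitions are possible."""
        page = self.pages[page_no]
        if len(data) != len(page):
            raise ValueError("data must fill exactly one page")
        for old, new in zip(page, data):
            if old == 0 and new == 1:
                raise ValueError("0 -> 1 needs an erase of the whole block")
        self.pages[page_no] = list(data)

    def erase(self) -> None:
        """Erase works only on the whole block: every bit goes back to 1."""
        for page in self.pages:
            page[:] = [1] * len(page)


block = FlashBlock()
block.program_page(0, [1, 0, 1, 0, 1, 0, 1, 0])

try:
    block.program_page(0, [1, 1, 1, 1, 1, 1, 1, 1])  # would need 0 -> 1
    rewrite_ok = True
except ValueError:
    rewrite_ok = False

block.erase()
print(rewrite_ok, block.pages[0])
```

In this model, updating a page in place is impossible once any of its bits is 0; the whole block has to be erased first, which is why erase is the expensive operation.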

P/E cycle (1 -> 0 -> 1 -> 0 ...)

P: program (write)
E: erase

  • What is a P/E cycle?
    Data are written to cells (P): the cell value goes from 1 -> 0
    Then the cells are erased (E): 0 -> 1
  • Why does the P/E cycle matter?
    Every write and erase damages the oxide layer surrounding the floating gate to some extent


    P/E cycle.png

latency: read < write < erase

latency.png

MLC (Multi-level cell)

  • The floating gate can hold different amounts of electrons to represent different states

  • SLC (single-level cell) compared with MLC:
    – Less complex
    – Faster
    – More reliable
    – Less storage per cell
    – More costly


    MLC example.png
2 bits, 3 intermediate voltages.png
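To illustrate how three intermediate voltages can distinguish the four states of a 2-bit cell, here is a minimal Python sketch. The reference voltages and the mapping from charge level to bit pair are assumptions made for illustration; the figures above may use a different encoding.

```python
from bisect import bisect_right

# ASSUMPTION: hypothetical reference voltages separating the four charge levels.
REFERENCE_VOLTAGES = [1.0, 2.0, 3.0]

# ASSUMPTION: hypothetical mapping; level 0 (no electrons) reads as "11",
# in line with the single-bit case where an empty gate means 1.
LEVEL_TO_BITS = ["11", "10", "01", "00"]


def read_mlc_cell(cell_threshold_v: float) -> str:
    """Count how many reference voltages the cell's threshold exceeds."""
    level = bisect_right(REFERENCE_VOLTAGES, cell_threshold_v)
    return LEVEL_TO_BITS[level]


for v in (0.5, 1.5, 2.5, 3.5):
    print(v, "->", read_mlc_cell(v))
```

Because the four levels sit closer together than the two levels of an SLC cell, a read has to tell apart smaller voltage differences, which fits the notes' point that SLC is simpler, faster and more reliable.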

An example of writing a page on an SSD

P/E/P.png