w1
MapReduce
Lab 1
Part 1: Map/Reduce input and output
1. Complete the sequential implementation
1.1 The function that divides up the output of a map task (doMap() in common_map.go)
1.2 The function that gathers all the inputs for a reduce task (doReduce() in common_reduce.go)
2. How to test
2.1 Set debugEnabled = true in common.go
2.2 cd 6.824/src/mapreduce; go test -v -run Sequential
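The two functions in step 1 come down to partitioning key/value pairs by key on the map side and grouping values by key on the reduce side. The following is a minimal in-memory sketch of that idea, not the lab's actual code: KeyValue is a local stand-in for mapreduce.KeyValue, partition and groupByKey are hypothetical helper names, and the FNV-based ihash is an assumption about how keys are hashed. File reading and writing are left out entirely.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// KeyValue is a local stand-in for the lab's mapreduce.KeyValue type.
type KeyValue struct {
	Key   string
	Value string
}

// ihash maps a key to a non-negative integer using FNV-1a (assumed hash).
func ihash(s string) int {
	h := fnv.New32a()
	h.Write([]byte(s))
	return int(h.Sum32() & 0x7fffffff)
}

// partition (hypothetical helper) splits map output into nReduce buckets,
// so that every pair with the same key lands in the same bucket.
func partition(kvs []KeyValue, nReduce int) [][]KeyValue {
	buckets := make([][]KeyValue, nReduce)
	for _, kv := range kvs {
		r := ihash(kv.Key) % nReduce
		buckets[r] = append(buckets[r], kv)
	}
	return buckets
}

// groupByKey (hypothetical helper) collects the values for each key and
// returns the keys in sorted order, as a reduce task might do before
// calling the user's reduce function once per key.
func groupByKey(kvs []KeyValue) (keys []string, grouped map[string][]string) {
	grouped = make(map[string][]string)
	for _, kv := range kvs {
		grouped[kv.Key] = append(grouped[kv.Key], kv.Value)
	}
	for k := range grouped {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, grouped
}

func main() {
	kvs := []KeyValue{{"a", "1"}, {"b", "1"}, {"a", "1"}}
	for r, bucket := range partition(kvs, 2) {
		keys, grouped := groupByKey(bucket)
		for _, k := range keys {
			fmt.Println(r, k, grouped[k])
		}
	}
}
```

In the real lab each bucket would be written to an intermediate file for its reduce task, and the reduce task would read all buckets with its own number back in before grouping.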
Part 2: Single-worker word count
Input Files: pg-*.txt
Map/Reduce Files: mrtmp.wcseq-<MapTaskNumber>-<ReduceTaskNumber>
Merge Files: mrtmp.wcseq-res-<ReduceTaskNumber>
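1. Complete the mapF() and reduceF() functions in main/wc.go
1.1 mapF: the input is the file name (filename) and that file's contents (contents); the output is a slice of key/value pairs in which each key is a word and each value is "1"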
func mapF(filename string, contents string) (kvResult []mapreduce.KeyValue)
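1.2 reduceF: the input is a key (a word) and a slice with one entry per occurrence of that word, e.g. ["1", "1", ...]; the output is the total number of times the word occurs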
func reduceF(key string, values []string) string
2. How to test
cd 6.824/src/main
go run wc.go master sequential pg-*.txt
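A word-count pair matching the two signatures above could look roughly like this. It is a sketch under assumptions rather than the reference solution: KeyValue is a local stand-in for mapreduce.KeyValue so the example compiles on its own, and treating every maximal run of letters as a word is one possible definition of "word".

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode"
)

// KeyValue is a local stand-in for the lab's mapreduce.KeyValue type.
type KeyValue struct {
	Key   string
	Value string
}

// mapF emits one {word, "1"} pair for every word found in contents.
// A word is assumed here to be a maximal run of letters.
func mapF(filename string, contents string) (kvResult []KeyValue) {
	words := strings.FieldsFunc(contents, func(r rune) bool {
		return !unicode.IsLetter(r)
	})
	for _, w := range words {
		kvResult = append(kvResult, KeyValue{Key: w, Value: "1"})
	}
	return kvResult
}

// reduceF sums the counts for one word and returns the total as a string.
func reduceF(key string, values []string) string {
	total := 0
	for _, v := range values {
		n, err := strconv.Atoi(v)
		if err != nil {
			continue // this sketch simply skips malformed counts
		}
		total += n
	}
	return strconv.Itoa(total)
}

func main() {
	kvs := mapF("example.txt", "the cat, the hat")
	fmt.Println(len(kvs))                           // prints 4
	fmt.Println(reduceF("the", []string{"1", "1"})) // prints 2
}
```

Summing the parsed values instead of returning len(values) keeps reduceF correct even if counts were pre-aggregated before reaching it.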
Part 3: Distributing MapReduce tasks
1. Complete a version of MapReduce that splits the work over a set of worker threads that run in parallel on multiple cores.
1.1 implement schedule() in mapreduce/schedule.go
2. How to test
go test -run TestParallel
go test -race -run TestParallel