
COMP226 Assignment 1: Reconstruct a Limit Order Book

Continuous Assessment Number: 1 (of 2)
Weighting: 10%
Assignment Circulated: 09:00 Tuesday 18 February 2020 (updated 2020-02-20)
Deadline: 17:00 Friday 6 March 2020
Submission Mode: Electronic only. Submit a single file MWS-username.R, where MWS-username should be replaced with your MWS username.
Learning Outcomes Assessed: Have an understanding of market microstructure and its impact on trading.
Goal of Assignment: Reconstruct a limit order book from order messages
Marking Criteria: Code correctness (85%); Code readability (15%)
Submission necessary in order to satisfy module requirements: No
Late Submission Penalty: Standard UoL policy; resubmissions after the deadline will NOT be considered.
Expected time taken: Roughly 8-12 hours

Warning

Your code will be put through the department's automatic plagiarism and collusion detection system. Students found to have plagiarized or colluded will likely receive a mark of zero. Do not discuss or show your work to others. In previous years, two students had their studies terminated and left without a degree because of plagiarism.

Rscript from RStudio

In this assignment, we use Rscript (which is provided by R) to run our code, e.g.,

    Rscript skeleton.R input/book_1.csv input/empty.txt

In RStudio, you can call Rscript from the terminal tab (as opposed to the console).

On Windows, use Rscript.exe not Rscript:

    Rscript.exe skeleton.R input/book_1.csv input/empty.txt

Distributed code and sample input and output data

As a first step, please download comp226_a1_v3.zip from:

https://student.csc.liv.ac.uk/internal/modules/comp226/_downloads/comp226_a1_v3.zip

Then unzip it, which will yield the following contents in the directory comp226_a1:

    comp226_a1
    ├── input
    │   ├── book_1.csv
    │   ├── book_2.csv
    │   ├── book_3.csv
    │   ├── empty.txt
    │   ├── message_a.txt
    │   ├── message_ar.txt
    │   ├── message_arc.txt
    │   ├── message_ex_add.txt
    │   ├── message_ex_cross.txt
    │   ├── message_ex_reduce.txt
    │   └── message_ex_same_price.txt
    ├── output
    │   ├── book_1-message_a.out
    │   ├── book_1-message_ar.out
    │   ├── book_1-message_arc.out
    │   ├── book_2-message_a.out
    │   ├── book_2-message_ar.out
    │   ├── book_2-message_arc.out
    │   ├── book_3-message_a.out
    │   ├── book_3-message_ar.out
    │   └── book_3-message_arc.out
    └── skeleton.R

    2 directories, 21 files

Brief summary

The starting point for the assignment is a code skeleton, provided in a file called skeleton.R. This file runs without error, but does not produce the desired output because it contains 6 empty functions. To complete the assignment you will need to correctly complete these 6 functions.

You should submit a single R file that contains your implementation of some or ideally all of these 6 functions. Your submission will be marked via a combination of:

• automated tests (for code correctness, 85%, breakdown by function given below); and
• human visual inspection (for code readability, 15%, in particular, for appropriate naming of variables and functions (5%), good use of comments (5%), and sensible, consistent code formatting (5%)).

Correct sample output is provided so that you can check whether your code implementation produces the correct output.

skeleton.R versus solution.R

You are given skeleton.R, which you should extend by implementing 6 functions. Throughout this handout, we also generate example output using a file solution.R that contains a correct implementation of all 6 of these functions. Obviously, you are not given the file solution.R; however, the example output will be helpful for checking that your function implementations work correctly.

Two sets of functions to implement

As described in detail in the rest of this document, you are required to implement the following 6 functions. The percentages in square brackets correspond to the breakdown of the correctness marks by function.

Limit order book stats:

1. book.total_volume
2. book.best_prices
3. book.midprice
4. book.spread

Updating the limit order book:

5. book.reduce
6. book.add

Warning

Do not make changes to the rest of the code in skeleton.R, only implement these 6 functions. Penalties may be applied if other changes are present in your submission.

Running skeleton.R

An example of calling skeleton.R follows.

    Rscript skeleton.R input/book_1.csv input/empty.txt

As seen in this example, skeleton.R takes as arguments the paths to two input files:

1. initial order book (input/book_1.csv in the example)
2. order messages to be processed (input/empty.txt in the example)

Note: the order of the arguments matters.

Let's see part of the source code and the output that it produces.

    if (!interactive()) {
        options(warn=-1)
        args <- commandArgs(trailingOnly = TRUE)
        if (length(args) != 2) {
            stop("Must provide two arguments: ")
        }
        book_path <- args[1]
        data_path <- args[2]
        if (!file.exists(data_path) || !file.exists(book_path)) {
            stop("File does not exist at path provided.")
        }
        book <- book.load(book_path)
        book <- book.reconstruct(data.load(data_path), init=book)
        book.summarise(book)
    }

So in short, this part of the code:

• checks that there are two command line arguments
• assigns them to the appropriate variables (the first to the initial book file path, the second to the message file path)
• loads the initial book
• reconstructs the book according to the messages
• prints out the book
• prints out the book stats

Let's see the output for the example above:

    $ Rscript skeleton.R input/book_1.csv input/empty.txt
    $ask
      oid price size
    1   a   105  100

    $bid
      oid price size
    1   b    95  100

    Total volume:
    Best prices:
    Mid-price:
    Spread:

Now let's see what the output would look like for a correct implementation:

    $ Rscript solution.R input/book_1.csv input/empty.txt
    $ask
      oid price size
    1   a   105  100

    $bid
      oid price size
    1   b    95  100

    Total volume: 100 100
    Best prices: 95 105
    Mid-price: 100
    Spread: 10

You will see that now the order book stats have been included in the output, because the four related functions that are empty in skeleton.R have been implemented in solution.R.

The initial order book

Here are the contents of input/book_1.csv, which is one of the 3 provided examples of an initial book:

    oid,side,price,size
    a,S,105,100
    b,B,95,100

Let's justify the columns to help parse this input:

    oid side price size
    a   S    105   100
    b   B    95    100

The first row is a header row. Every subsequent row contains a limit order, which is described by the following fields:

• oid (order id) is stored in the book and used to process (partial) cancellations of orders that arise in reduce messages, described below;
• side identifies whether this is a bid (B for buy) or an ask (S for sell);
• price and size are self-explanatory.

Existing code in skeleton.R will read in a file like input/book_1.csv and create the corresponding two (possibly empty) order books as two data frames that will be stored in the list book, a version of which will be passed to all of the six functions that you are required to implement.

Note that if we now change the message file to a non-empty one, skeleton.R will produce the same output (since it doesn't parse the messages; you need to write the code, functions 5 and 6, to do that):

    $ Rscript skeleton.R input/book_1.csv input/message_a.txt
    $ask
      oid price size
    1   a   105  100

    $bid
      oid price size
    1   b    95  100

    Total volume:
    Best prices:
    Mid-price:
    Spread:

If correct message parsing and book updating is implemented, book would be updated according to input/message_a.txt to give the following output:

    $ Rscript solution.R input/book_1.csv input/message_a.txt
    $ask
    ...

Before we go into details on the message format and reconstructing the order book, let's discuss the first four functions that compute the book stats, which we also see correctly computed in this example.

Computing limit order book stats

The first four of the functions that you need to implement compute limit order book stats, and can be developed and tested without parsing the order messages at all.
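As a rough illustration of what these stats mean, here is a minimal sketch that reproduces the numbers printed for input/book_1.csv above. It is not the solution.R implementation: the helper names (total_volumes, best_prices, midprice, spread) and the hand-built example_book are hypothetical, and the sketch assumes only that the book is a list holding an ask and a bid data frame with price and size columns, as the printed output suggests.

```r
# Hypothetical example book mirroring input/book_1.csv (built by hand here,
# not loaded via skeleton.R).
example_book <- list(
    ask = data.frame(oid = "a", price = 105, size = 100, stringsAsFactors = FALSE),
    bid = data.frame(oid = "b", price = 95, size = 100, stringsAsFactors = FALSE)
)

# Total size resting on each side; sum() of an empty column is 0.
total_volumes <- function(book) {
    list(bid = sum(book$bid$size), ask = sum(book$ask$size))
}

# Best bid is the highest bid price, best ask the lowest ask price.
# An empty side is reported as NA.
best_prices <- function(book) {
    list(
        bid = if (nrow(book$bid) > 0) max(book$bid$price) else NA,
        ask = if (nrow(book$ask) > 0) min(book$ask$price) else NA
    )
}

# Both of these reuse best_prices; NA propagates if a side is empty.
midprice <- function(book) {
    best <- best_prices(book)
    (best$bid + best$ask) / 2
}

spread <- function(book) {
    best <- best_prices(book)
    best$ask - best$bid
}

print(unlist(total_volumes(example_book)))  # bid 100, ask 100
print(unlist(best_prices(example_book)))    # bid 95, ask 105
print(midprice(example_book))               # 100
print(spread(example_book))                 # 10
```

Deriving the mid-price and spread from the best prices keeps the NA handling in one place: when one side of the book is empty, both stats come out as NA without any extra checks.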
In particular, you can develop and test the first four functions using an empty message file, input/empty.txt, as in the first example above.

The return values of the four functions should be as follows (where, as usual in R, single numbers are actually numeric vectors of length 1):

• book.total_volumes should return a list with two named elements, bid, which should contain the total volume in the bid book, and ask, which should contain the total volume in the ask book;
• book.best_prices should return a list with two named elements, bid, which should contain the best bid price, and ask, which should contain the best ask price;
• book.midprice should return the midprice of the book;
• book.spread should return the spread of the book.

You should check that the outputs of these functions in the example above that uses solution.R are what you expect them to be.

We now move on to reconstructing the order book from the messages in the input message file.

Reconstructing the order book from messages

You do not need to look into the details of the (fully implemented) functions book.reconstruct or book.handle that manage the reconstruction of the book from the initial book according to the messages.

In the next section, we describe that there are two types of message, Add messages and Reduce messages. All you need to know to complete the assignment is that messages in the input file are processed in order, i.e., line by line, with Add messages passed to book.add and Reduce messages passed to book.reduce, along with the current book in both cases.

Message Format

The market data log contains one message per line (terminated by a single linefeed character, \n), and each message is a series of fields separated by spaces.

There are two types of messages: Add and Reduce messages.
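To make the line format concrete, here is a minimal sketch of splitting one log line into its space-separated fields. skeleton.R already takes care of reading and dispatching messages, so this is purely illustrative; parse_message is a hypothetical helper, not a function from the distributed code. The two sample lines are the ones used in this handout's own example.

```r
# Hypothetical helper: split one log line into its space-separated fields.
# The first field is the message type; the rest are kept as character strings.
parse_message <- function(line) {
    fields <- strsplit(line, " ", fixed = TRUE)[[1]]
    list(type = fields[1], fields = fields[-1])
}

log_lines <- c("A c S 97 36", "R a 50")
for (line in log_lines) {
    msg <- parse_message(line)
    cat(msg$type, "message with", length(msg$fields), "further fields\n")
}
```

Numeric fields such as price and size would still need converting with as.numeric before any arithmetic, since strsplit returns character strings.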
Here's an example, which contains an Add message followed by a Reduce message:

    A c S 97 36
    R a 50

An Add message looks like this:

    A oid side price size

• A: fixed string identifying this as an Add message;
• oid: order id used by subsequent Reduce messages;
• side: B for a buy order (a bid), and S for a sell order (an ask);
• price: limit price of this order;
• size: size of this order.

A Reduce message looks like this:

    R oid size

• R: fixed string identifying this as a Reduce message;
• oid: order id that identifies the order to be reduced;
• size: amount by which to reduce the size of the order (not the new size of the order); if size is equal to or greater than the existing size of the order, the order is removed from the book.

Processing messages

Reduce messages will affect at most one existing limit order in the book.

Add messages will either:

• not cross the spread and then add a single row to the book (orders at the same price are stored separately to preserve their distinct oids);
• cross the spread and in that case can affect any number of orders on the other side of the book (and may or may not result in a remaining limit order for residual volume).

The provided example message files are split into cases that include crosses and those that don't, to help you develop your code incrementally and test it on inputs of differing difficulty.

We do an example of each case, one by one.
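The Reduce rule described above (subtract the given amount, and remove the order once nothing remains) can be sketched as follows. This is an illustration under stated assumptions rather than the book.reduce from solution.R: reduce_order and example_book are hypothetical names, the book is assumed to be a list with ask and bid data frames containing oid, price and size columns, and each oid is assumed to appear at most once in the book.

```r
# Hypothetical example book mirroring input/book_1.csv.
example_book <- list(
    ask = data.frame(oid = "a", price = 105, size = 100, stringsAsFactors = FALSE),
    bid = data.frame(oid = "b", price = 95, size = 100, stringsAsFactors = FALSE)
)

# Hypothetical stand-in for the reduce logic. Assumes oids are unique.
reduce_order <- function(book, oid, amount) {
    for (side in c("ask", "bid")) {
        orders <- book[[side]]
        hit <- which(orders$oid == oid)
        if (length(hit) == 0) next  # order is not on this side
        if (amount >= orders$size[hit]) {
            # The reduction covers the whole order: drop its row.
            orders <- orders[-hit, , drop = FALSE]
        } else {
            orders$size[hit] <- orders$size[hit] - amount
        }
        book[[side]] <- orders
    }
    book
}

# Message "R a 50": order a shrinks from 100 to 50.
print(reduce_order(example_book, "a", 50)$ask)
# A reduction of 100 or more removes order a entirely.
print(nrow(reduce_order(example_book, "a", 100)$ask))
```

An unknown oid leaves the book unchanged in this sketch, which matches the statement that a Reduce message affects at most one existing order.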
In each example we start from input/book_1.csv; we only show this initial book in the first case.

Example of processing a reduce message

    $ Rscript solution.R input/book_1.csv input/empty.txt
    $ask
      oid price size
    1   a   105  100

    $bid
      oid price size
    1   b    95  100

    Total volume: 100 100
    Best prices: 95 105
    Mid-price: 100
    Spread: 10

    $ cat input/message_ex_reduce.txt
    R a 50

    $ Rscript solution.R input/book_1.csv input/message_ex_reduce.txt
    $ask
      oid price size
    1   a   105   50

    $bid
      oid price size
    1   b    95  100

    Total volume: 100 50
    Best prices: 95 105
    Mid-price: 100
    Spread: 10

Example of processing an add (non-crossing) message

    $ cat input/message_ex_add.txt
    A c S 97 36

    $ Rscript solution.R input/book_1.csv input/message_ex_add.txt
    $ask
      oid price size
    2   a   105  100
    1   c    97   36

    $bid
      oid price size
    1   b    95  100

    Total volume: 100 136
    Best prices: 95 97
    Mid-price: 96
    Spread: 2

Example of processing a crossing add message

    $ cat input/message_ex_cross.txt
    A c B 106 101

    $ Rscript solution.R input/book_1.csv input/message_ex_cross.txt
    $ask
    [1] oid   price size
    <0 rows> (or 0-length row.names)

    $bid
      oid price size
    1   c   106    1
    2   b    95  100

    Total volume: 101 0
    Best prices: 106 NA
    Mid-price: NA
    Spread: NA

Sample output

We provide sample output for 9 cases, namely all combinations of the following 3 initial books and 3 message files.

The 3 initial books are found in the input subdirectory and are called:

• book_1.csv
• book_2.csv
• book_3.csv

The 3 message files are also found in the input subdirectory and are called:

    file               description
    message_a.txt      add messages only, i.e., requires book.add but not book.reduce; for all three initial books, none of the messages cross the spread
    message_ar.txt     add and reduce messages, but for the initial book book_3.csv, no add message crosses the spread
    message_arc.txt    add and reduce messages, with some adds that cross the spread for all three initial books

The 9 output files can be found in the output subdirectory of the comp226_a1 directory.

    output
    ├── book_1-message_a.out
    ├── book_1-message_ar.out
    ├── book_1-message_arc.out
    ├── book_2-message_a.out
    ├── book_2-message_ar.out
    ├── book_2-message_arc.out
    ├── book_3-message_a.out
    ├── book_3-message_ar.out
    └── book_3-message_arc.out

    0 directories, 9 files

Hints for order book stats

For book.spread and book.midprice a nice implementation would use book.best_prices, which you should then implement first.

Hints for book.add and book.reduce

A possible way to implement book.add and book.reduce that makes use of the different example message files is the following:

• First, do a partial implementation of book.add, namely implement add messages that do not cross. Check your implementation with message_a.txt.
• Next, implement book.reduce fully. Check your combined (partial) implementation of book.add and book.reduce with message_ar.txt and book_3.csv (only this combination with message_ar.txt has no crosses).
• Finally, complete the implementation of book.add to deal with crosses. Check your implementation with message_arc.txt and any initial book, or with message_ar.txt and book_1.csv or book_2.csv.

Hint on book.sort

In comp226_a1_v3 there is a book.sort method, with sort code as follows:

    book.sort <- function(book, sort_bid=T, sort_ask=T) {
        if (sort_ask && nrow(book$ask) >= 1) {
            book$ask <- book$ask[order(book$ask$price,
                                       nchar(book$ask$oid),
                                       book$ask$oid,
                                       decreasing=F),]
            row.names(book$ask) <- 1:nrow(book$ask)
        }
        if (sort_bid && nrow(book$bid) >= 1) {
            book$bid <- book$bid[order(-book$bid$price,
                                       nchar(book$bid$oid),
                                       book$bid$oid,
                                       decreasing=F),]
            row.names(book$bid) <- 1:nrow(book$bid)
        }
        book
    }

This method will ensure that limit orders are sorted first by price and second by time of arrival (so that for two orders at the same price, the older one is nearer the top of the book).

You are welcome (and encouraged) to use book.sort in your own implementations.
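For instance, a non-crossing add can simply append the new order and re-sort the affected side. The sketch below is a self-contained illustration, not the book.add from solution.R: add_non_crossing and example_book are hypothetical names, and a plain order() call stands in for book.sort so that the block runs on its own. It also assumes each side is stored best price first, which is an assumption about the internal layout rather than something this handout specifies.

```r
# Hypothetical example book mirroring input/book_1.csv.
example_book <- list(
    ask = data.frame(oid = "a", price = 105, size = 100, stringsAsFactors = FALSE),
    bid = data.frame(oid = "b", price = 95, size = 100, stringsAsFactors = FALSE)
)

# Hypothetical sketch of a non-crossing add: append a row, then re-sort.
add_non_crossing <- function(book, oid, side, price, size) {
    # stringsAsFactors=FALSE keeps the oid column as character.
    new_row <- data.frame(oid = oid, price = price, size = size,
                          stringsAsFactors = FALSE)
    key <- if (side == "B") "bid" else "ask"
    orders <- rbind(book[[key]], new_row)
    # Assumed layout: best price first (highest bid, lowest ask).
    # order() leaves ties in their existing order, so among orders at the
    # same price the older one stays ahead of the newly appended one.
    ord <- if (key == "bid") order(-orders$price) else order(orders$price)
    orders <- orders[ord, , drop = FALSE]
    row.names(orders) <- NULL
    book[[key]] <- orders
    book
}

# Message "A c S 97 36": the ask side now holds c (97) ahead of a (105).
updated <- add_non_crossing(example_book, "c", "S", 97, 36)
print(updated$ask)
print(sum(updated$ask$size))  # 136
```

Appending and re-sorting avoids working out the insertion position by hand, at the cost of sorting the whole side on every add.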
In particular, by using book.sort you can avoid having to find exactly where to place an order in the book.

Hint on using logging in book.reconstruct

In comp226_a1_v3 a logging option has been added to book.reconstruct:

    book.reconstruct <- function(data, init=NULL, log=F) {
        if (nrow(data) == 0) return(book)
        if (is.null(init)) init <- book.init()

        book <- Reduce(
            function(b, i) {
                new_book <- book.handle(b, data[i,])
                if (log) {
                    cat("Step", i, "\n\n")
                    book.summarise(new_book, with_stats=F)
                    cat("====================\n\n")
                }
                new_book
            },
            1:nrow(data), init
        )
        book.sort(book)
    }

You can turn on logging by changing log=F to log=T. Then book.summarise will be used to give output after each message is processed by book.reconstruct.

Hint on stringsAsFactors=FALSE

Notice the use of stringsAsFactors=FALSE in the book.load function (similarly in data.load) from skeleton.R.

    book.load <- function(path) {
        df <- read.table(
            path, fill=NA, stringsAsFactors=FALSE, header=TRUE, sep=","
        )
        book.sort(list(
            ask=df[df$side == "S", c("oid", "price", "size")],
            bid=df[df$side == "B", c("oid", "price", "size")]
        ))
    }

Its use here is not optional: it is necessary, and it is what ensures that the oid columns of book$bid and book$ask have type character.

It is also crucial that you ensure that the oid columns in your books remain of type character rather than factor. The following examples will explain the use of stringsAsFactors and help you to achieve this.

First we introduce a function that will check the type of this column on different data frames that we will construct:

    check <- function(df) {
        checks <- c("is.character(df$oid)",
                    "is.factor(df$oid)")
        for (check in checks)
            cat(sprintf("%20s: %5s", check, eval(parse(text=check))), "\n")
    }

Now let's use this function to explore different cases.
First we look at the case of reading a csv.

    > check(read.csv("input/book_1.csv"))
    is.character(df$oid): FALSE
       is.factor(df$oid):  TRUE
    > check(read.csv("input/book_1.csv", stringsAsFactors=FALSE))
    is.character(df$oid):  TRUE
       is.factor(df$oid): FALSE

What about creating a data.frame?

    > check(data.frame(oid="a", price=1))
    is.character(df$oid): FALSE
       is.factor(df$oid):  TRUE
    > check(data.frame(oid="a", price=1, stringsAsFactors=FALSE))
    is.character(df$oid):  TRUE
       is.factor(df$oid): FALSE

What about using rbind?

    > empty_df <- data.frame(oid=character(), price=numeric(), stringsAsFactors=FALSE)
    > non_empty_df <- data.frame(oid="a", price=1, stringsAsFactors=FALSE)
    > check(rbind(empty_df, data.frame(oid="a", price=1)))
    is.character(df$oid): FALSE
       is.factor(df$oid):  TRUE
    > check(rbind(empty_df, non_empty_df))
    is.character(df$oid):  TRUE
       is.factor(df$oid): FALSE
    > check(rbind(non_empty_df, data.frame(oid="a", price=1)))
    is.character(df$oid):  TRUE
       is.factor(df$oid): FALSE

Note that with a non-empty data frame, the existing type persists! However, when the data.frame is empty the type of the oid column is malleable and it is crucial to use stringsAsFactors=FALSE. We see the same behaviour when we rbind a list with a data.frame.

    > check(rbind(empty_df, list(oid="a", price=1)))
    is.character(df$oid): FALSE
       is.factor(df$oid):  TRUE
    > check(rbind(empty_df, list(oid="a", price=1), stringsAsFactors=FALSE))
    is.character(df$oid):  TRUE
       is.factor(df$oid): FALSE
    > check(rbind(non_empty_df, list(oid="a", price=1)))
    is.character(df$oid):  TRUE
       is.factor(df$oid): FALSE

Again, it is crucial to use stringsAsFactors=FALSE when the data.frame is empty. I suggest using it in every case.

Submission

Remember to submit a single MWS-username.R file, where MWS-username should be replaced with your MWS username.

Reposted from: http://www.6daixie.com/contents/18/4954.html
