Getting and Cleaning Data: Week 4

Editing text variables

  • Fixing character vectors
    • tolower()
    • toupper()
    • strsplit()
    • sub()
    • gsub(): global version of sub(); replaces every match instead of only the first
  • Finding values
    • grep()
    • grepl(): returns TRUE for each element where the pattern is found,
      FALSE otherwise
  • More useful string functions (base R and stringr)
    • substr()
    • paste()
    • paste0()
    • str_trim() (from the stringr package)
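
A minimal sketch of the functions listed above, in base R. The column names and strings are hypothetical, invented only for illustration. str_trim() lives in the stringr package, so this sketch swaps in base R's trimws(), which strips leading and trailing whitespace in the same way for this case.

```r
# Hypothetical column names, used only for illustration
cols <- c("Location.1", "Street_Name", "street_type_code")

lower <- tolower(cols)      # all lower case
upper <- toupper(cols[1])   # all upper case

# strsplit() takes a regular expression, so the literal dot must be escaped
parts <- strsplit(cols[1], "\\.")
first_part <- parts[[1]][1]

# sub() replaces only the first match; gsub() replaces every match
sub_once <- sub("_", "", cols[3])
sub_all  <- gsub("_", "", cols[3])

# grep() returns the indices of matching elements (or the values themselves
# with value = TRUE); grepl() returns one logical per element
idx   <- grep("street", lower)
vals  <- grep("street", lower, value = TRUE)
found <- grepl("street", lower)

# substr(), paste() and paste0() are base R
sub_str <- substr("Jeffrey Leek", 1, 7)
spaced  <- paste("Jeffrey", "Leek")    # joined with a space by default
joined  <- paste0("Jeffrey", "Leek")   # joined with no separator

# Base R stand-in for stringr::str_trim()
trimmed <- trimws("Jeff      ")
```

grepl() is convenient for subsetting, for example lower[found], while grep() with value = TRUE returns the matching strings directly.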

Regular expressions

Metacharacters

  • ^: represents the start of a line
  • $: represents the end of a line
  • .: matches any single character
  • |: OR; matches either of the alternatives
  • (): subexpressions are often contained in parentheses to constrain the alternatives
  • ?: indicates that the preceding expression is optional
  • *: means the preceding item may repeat any number of times, including none
  • +: means "at least one" of the preceding item
  • {}: interval quantifiers; they let us specify the minimum and maximum number of matches of an expression

Character classes with []

  • []: lists a set of characters we will accept at a given point in the match
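
The metacharacters above can be tried out with grepl(). This is a small sketch; the sample strings in x are hypothetical and exist only to give each pattern something to match.

```r
# Hypothetical sample lines, invented for illustration
x <- c("i think we all rule", "good morning", "9-11 rules",
       "Bush won", "bush", "aaa")

starts   <- which(grepl("^i think", x))             # ^  start of line
ends     <- which(grepl("morning$", x))             # $  end of line
any_char <- which(grepl("9.11", x))                 # .  any single character
either   <- which(grepl("good|bush", x))            # |  alternatives
grouped  <- which(grepl("^(good|bad) morning", x))  # () constrain the alternatives
optional <- which(grepl("^[Bb]ush( won)?$", x))     # [] character class, ? optional
star     <- which(grepl("^9.*rules$", x))           # *  any number of repetitions
plus     <- which(grepl("^a+$", x))                 # +  at least one
interval <- which(grepl("^a{2,3}$", x))             # {} between 2 and 3 repetitions
too_many <- which(grepl("^a{4,}$", x))              # {4,} needs at least four
```

Wrapping grepl() in which() returns the positions of the matching elements, which makes it easy to see which strings each pattern accepts.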

Working with dates

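  • date(): returns the current date and time as a character string
  • Sys.Date(): returns the current date as a Date object
  • format(): formats a date as a character string
  • as.Date(): converts a character string to a Date object
  • julian(): returns the number of days from 1970-01-01 to the given date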
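
A short base R sketch of the date functions above. The date string "01/02/2014" is a made-up example, and purely numeric format codes are used so that the results do not depend on the system locale.

```r
d_chr  <- date()       # character string with the current date and time
d_date <- Sys.Date()   # Date object holding today's date

# as.Date() parses a string using the given format codes
z <- as.Date("01/02/2014", format = "%d/%m/%Y")   # 1 February 2014

# format() turns a Date back into a string
z_txt <- format(z, "%Y/%m/%d")

# julian() counts days since the origin 1970-01-01
days_since_origin <- as.numeric(julian(z))

# Subtracting two Date objects gives a difference in days
gap <- as.numeric(z - as.Date("2014-01-01"))
```

Codes such as %a, %b and %B (weekday and month names) also work with format() and as.Date(), but their output depends on the locale.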

lubridate

  • mdy()
  • dmy()
  • ymd_hms()
  • wday()
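
A minimal sketch of the lubridate parsers, assuming the lubridate package is installed. The input strings are illustrative only. Note that lubridate's weekday helper is named wday(); with default settings it numbers the days from Sunday = 1.

```r
library(lubridate)

d_ymd <- ymd("20140108")      # year, month, day
d_mdy <- mdy("08/04/2013")    # month, day, year
d_dmy <- dmy("03-04-2013")    # day, month, year

# ymd_hms() also parses a time and returns a date-time (UTC by default)
t_utc <- ymd_hms("2011-08-03 10:15:03")

# Day of the week as a number; 8 January 2014 was a Wednesday
dow <- wday(d_ymd)
```

Passing label = TRUE to wday() returns the weekday name instead of a number, and the tz argument of ymd_hms() sets a time zone other than UTC.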

Data resources

...
