2018-01-09-strsplit

strsplit(x, split, fixed=FALSE)

Split a character string or vector of character strings using a regular expression or a literal (fixed) string. The strsplit function outputs a list, where each list item corresponds to an element of x that has been split. In the simplest case, x is a single character string, and strsplit outputs a one-item list.
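Because the result is always a list, even for a single input string, the pieces usually have to be pulled out with `[[ ]]`. A minimal sketch of that shape; the names `sentence`, `parts`, and `many` are illustrative only and do not come from the examples in this post:

```r
# One input string gives a list of length one.
sentence <- "alpha beta gamma"
parts <- strsplit(sentence, " ")

is.list(parts)   # TRUE
length(parts)    # 1
parts[[1]]       # character vector of pieces: "alpha" "beta" "gamma"

# A vector of strings gives one list element per input string.
many <- strsplit(c("a b", "c d e"), " ")
lengths(many)    # 2 3
```

Indexing with `[[1]]` is the usual way to get at the character vector when x holds a single string.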

  • x – A character string or vector of character strings to split.
  • split – The character string (or regular expression) at which to split x. If split is an empty string (""), then x is split between every character.
  • fixed – Whether split should be treated as a fixed (literal) string. The default is FALSE, which means that split is treated as a regular expression.
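The difference between a literal and a regular-expression split matters most when the separator is a regex metacharacter such as a period. The following is a small sketch under that assumption; the input strings and variable names are made up for illustration:

```r
x <- "a.b.c"

# Literal match: split at each period.
fixed_parts <- strsplit(x, ".", fixed = TRUE)[[1]]   # "a" "b" "c"

# Regular-expression match, with the period made literal by a character class.
class_parts <- strsplit(x, "[.]")[[1]]               # "a" "b" "c"

# A regular expression can also describe a pattern,
# e.g. one or more whitespace characters in a row.
ws_parts <- strsplit("one  two\tthree", "[[:space:]]+")[[1]]   # "one" "two" "three"
```

Using fixed=TRUE is the simpler choice when the separator is a known literal string; a regular expression is only needed when the separator varies.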

Example. Several starter examples are shown below (note that in a regular expression a period stands in for "any character"), followed by a couple of more practical scenarios: dates are split into year, month, and day, and names in the form Last, First are split at the comma.

> x <- "Split the words in a sentence."
> strsplit(x, " ")
[[1]]
[1] "Split"     "the"       "words"     "in"       
[5] "a"         "sentence."

> 
> x <- "Split at every character."
> strsplit(x, "")
[[1]]
 [1] "S" "p" "l" "i" "t" " " "a" "t" " " "e" "v" "e" "r" "y"
[15] " " "c" "h" "a" "r" "a" "c" "t" "e" "r" "."

> 
> x <- " Split at each space with a preceding character."
> strsplit(x, ". ")
[[1]]
[1] " Spli"      "a"          "eac"        "spac"      
[5] "wit"        ""           "precedin"   "character."

> 
> x <- "Do you wish you were Mr. Jones?"
> strsplit(x, ". ")
[[1]]
[1] "D"      "yo"     "wis"    "yo"     "wer"    "Mr"    
[7] "Jones?"

> strsplit(x, ". ", fixed=TRUE)
[[1]]
[1] "Do you wish you were Mr" "Jones?"                 

> 
> #=====> Splitting Dates <=====#
> dates <- c("1999-05-23", "2001-12-30", "2004-12-17")
> temp  <- strsplit(dates, "-")
> temp
[[1]]
[1] "1999" "05"   "23"  

[[2]]
[1] "2001" "12"   "30"  

[[3]]
[1] "2004" "12"   "17"  

> matrix(unlist(temp), ncol=3, byrow=TRUE)
     [,1]   [,2] [,3]
[1,] "1999" "05" "23"
[2,] "2001" "12" "30"
[3,] "2004" "12" "17"
> 
> #=====> Cofounders of Google and Twitter <=====#
> Names <- c("Brin, Sergey", "Page, Larry",
+            "Dorsey, Jack", "Glass, Noah",
+            "Williams, Evan", "Stone, Biz")
> Cofounded <- rep(c("Google", "Twitter"), c(2,4))
> temp <- strsplit(Names, ", ")
> temp
[[1]]
[1] "Brin"   "Sergey"

[[2]]
[1] "Page"  "Larry"

[[3]]
[1] "Dorsey" "Jack"  

[[4]]
[1] "Glass" "Noah" 

[[5]]
[1] "Williams" "Evan"    

[[6]]
[1] "Stone" "Biz"  

> mat  <- matrix(unlist(temp), ncol=2, byrow=TRUE)
> df   <- as.data.frame(mat)
> df   <- cbind(df, Cofounded)
> colnames(df) <- c("Last", "First", "Cofounded")
> df
      Last  First Cofounded
1     Brin Sergey    Google
2     Page  Larry    Google
3   Dorsey   Jack   Twitter
4    Glass   Noah   Twitter
5 Williams   Evan   Twitter
6    Stone    Biz   Twitter
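
As a variation on the date example, the split pieces can be stacked with do.call(rbind, ...) and converted to integers, and sapply can pull a single component out of every list element. This is only a sketch of one possible approach: the names `pieces`, `date_df`, and `years` are hypothetical, and it assumes every date splits into exactly three pieces.

```r
dates  <- c("1999-05-23", "2001-12-30", "2004-12-17")
pieces <- strsplit(dates, "-")

# Stack the pieces row by row into a 3 x 3 character matrix.
mat <- do.call(rbind, pieces)

# Convert each column to integer so the values can be used numerically.
date_df <- data.frame(
  year  = as.integer(mat[, 1]),
  month = as.integer(mat[, 2]),
  day   = as.integer(mat[, 3])
)

# Pull out one component (here the year) across all inputs.
years <- sapply(pieces, `[`, 1)
```

If the inputs could split into different numbers of pieces, the rbind step would recycle values and give misleading results, so checking lengths(pieces) first is a reasonable precaution.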