Data Science with R in 4 Weeks - Week 1 - Day 3

Day 3: Summaries of data - two-dimensional summaries


Example 1: multiple boxplots. How does the win rate differ between leagues?

> temp <- read.csv("basketball_teams.csv")

> teamdata <- as.data.frame(temp)

> teamdata$new_column <- ifelse(teamdata$games == 0, NA, teamdata$won / teamdata$games)

> stats <- teamdata[, c("name","lgID", "year","new_column")]

> boxplot(new_column ~ lgID, data = stats, col = "red")

The result is a set of side-by-side boxplots, one per league.
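
To read off the numbers behind the boxes, the per-league summaries can also be computed directly. The following is a minimal sketch, assuming a small made-up data frame (`toy`, with a hypothetical `win_rate` column) in place of basketball_teams.csv so that it runs on its own:

```r
# Toy stand-in for the team data; all values are made up for illustration.
toy <- data.frame(
  lgID  = c("ABA", "ABA", "ABA", "NBA", "NBA", "NBA"),
  won   = c(30, 45, 0, 50, 20, 41),
  games = c(60, 90, 0, 100, 80, 82)
)

# Win rate, guarding against division by zero as in the code above.
toy$win_rate <- ifelse(toy$games == 0, NA, toy$won / toy$games)

# Summary statistics for each league (summary() reports NAs separately).
by_league <- tapply(toy$win_rate, toy$lgID, summary)
print(by_league)

# boxplot() returns the statistics it draws; plot = FALSE skips the drawing.
bp <- boxplot(win_rate ~ lgID, data = toy, plot = FALSE)
bp$names  # group labels, in the order the boxes would be drawn
bp$n      # number of non-missing observations per group
```

`bp$stats` holds the whisker, hinge and median values for each group, which is handy when the plot alone is hard to read.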


We can also use histograms:

> par(mfrow = c(2,1), mar = c(4,4,2,1))

> hist(subset(stats$new_column, stats$lgID == "ABA"), col = "green", main = "ABA", xlab = "Win rate")

> hist(subset(stats$new_column, stats$lgID == "NBA"), col = "green", main = "NBA", xlab = "Win rate")
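
Stacked histograms are easier to compare when both panels share the same bins. Here is a sketch of that idea, assuming two made-up vectors of win rates (`aba` and `nba` are illustrative values, not taken from the data set):

```r
# Made-up win rates for two leagues (illustrative values only).
aba <- c(0.35, 0.42, 0.50, 0.58, 0.61, 0.70)
nba <- c(0.20, 0.33, 0.47, 0.52, 0.66, 0.81)

# One set of bin edges for both panels so the bars are directly comparable.
bins <- seq(0, 1, by = 0.1)

# par() returns the previous settings, which lets us restore them afterwards.
old_par <- par(mfrow = c(2, 1), mar = c(4, 4, 2, 1))
h_aba <- hist(aba, breaks = bins, col = "green", main = "ABA", xlab = "Win rate")
h_nba <- hist(nba, breaks = bins, col = "green", main = "NBA", xlab = "Win rate")
par(old_par)  # restore the previous layout
```

Fixing `breaks` also fixes the x-axis range, so the two panels line up vertically.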


Scatterplot

> with(stats, plot(year, new_column))

> abline(h = 0.7, lwd = 2, lty = 2)
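
The dashed line marks a win rate of 0.7. A natural follow-up is to list the points above it. This is only a sketch, using a made-up data frame (`toy`) and a hypothetical `threshold` variable:

```r
# Made-up seasons; one win rate is missing on purpose.
toy <- data.frame(
  year     = c(1970, 1971, 1972, 1973, 1974),
  win_rate = c(0.45, 0.72, NA, 0.68, 0.80)
)
threshold <- 0.7

# Same kind of plot as above: points plus a dashed horizontal reference line.
plot(toy$year, toy$win_rate, xlab = "Year", ylab = "Win rate")
abline(h = threshold, lwd = 2, lty = 2)

# which() drops NA comparisons, so rows with a missing win rate are skipped.
above <- toy[which(toy$win_rate > threshold), ]
print(above)
```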


Add color to the scatterplot

> with(stats, plot(year, new_column, col = as.factor(lgID)))

From this plot we can see how the win rates of the teams in each league (ABA, NBA) are distributed.
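
Colors are only informative if the reader knows which league each one stands for. The sketch below adds a legend; it assumes a made-up data frame (`toy`) and relies on the fact that a factor's integer codes index into R's current color palette:

```r
# Made-up data for two leagues (illustrative values only).
toy <- data.frame(
  year     = c(1968, 1969, 1970, 1968, 1969, 1970),
  lgID     = c("ABA", "ABA", "ABA", "NBA", "NBA", "NBA"),
  win_rate = c(0.55, 0.48, 0.62, 0.40, 0.71, 0.66)
)

# Convert the league labels to a factor; its integer codes serve as colors.
league <- factor(toy$lgID)

plot(toy$year, toy$win_rate,
     col = as.integer(league), pch = 19,
     xlab = "Year", ylab = "Win rate")

# The legend uses the same level-to-color mapping as the points.
legend("topleft", legend = levels(league),
       col = seq_along(levels(league)), pch = 19)
```

Converting explicitly matters because, in R 4.0 and later, `read.csv` keeps text columns as character vectors, and a character vector such as "ABA" is not a valid color name.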

Alternatively, we can draw several scatterplots

and look at the NBA and NBL win rates separately:

> with(subset(stats, lgID == "NBA"), plot(year, new_column, main = "NBA"))

> with(subset(stats, lgID == "NBL"), plot(year, new_column, main = "NBL"))
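
With more than two leagues, repeating the call by hand gets tedious. One possible approach is a loop over the league names, sketched here with a made-up data frame (`toy`) and a hypothetical `leagues` vector:

```r
# Made-up data for two leagues (illustrative values only).
toy <- data.frame(
  year     = c(1946, 1947, 1948, 1946, 1947),
  lgID     = c("NBA", "NBA", "NBA", "NBL", "NBL"),
  win_rate = c(0.61, 0.45, 0.70, 0.52, 0.38)
)

leagues <- c("NBA", "NBL")

# One panel per league, side by side; remember the old layout to restore it.
old_par <- par(mfrow = c(1, length(leagues)))
for (lg in leagues) {
  one <- subset(toy, lgID == lg)
  # A fixed ylim keeps the vertical scale identical across panels.
  plot(one$year, one$win_rate, main = lg,
       xlab = "Year", ylab = "Win rate", ylim = c(0, 1))
}
par(old_par)
```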

