SQL Series Part V (Data Cleaning)

LEFT & RIGHT (for column)

-- Pick the string column you want to split
SELECT LEFT(col, number) AS new_col -- col => string column, number => how many characters to take from the left
FROM table_name;

SELECT RIGHT(col, number) AS new_col -- same, but counted from the right
FROM table_name;
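As a minimal sketch of LEFT and RIGHT on real values, the query below splits a phone number into its first three and last four characters. The column name `phone` and the sample rows are hypothetical, and PostgreSQL syntax is assumed for the inline VALUES table.

```sql
-- Hypothetical sample data; PostgreSQL syntax assumed.
SELECT phone,
       LEFT(phone, 3)  AS area_code,   -- first 3 characters, e.g. '415'
       RIGHT(phone, 4) AS line_number  -- last 4 characters, e.g. '0132'
FROM (VALUES ('415-555-0132'), ('212-555-0198')) AS t(phone);
```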

-- Sample:
--    count the rows whose `name` starts with the character 'a'
SELECT SUM(new_name) AS n_name
FROM (SELECT name, CASE WHEN LEFT(name, 1) = 'a'
                   THEN 1 ELSE 0 END AS new_name
                   FROM table_name) AS t1;

POSITION, STRPOS & SUBSTR

-- POSITION, STRPOS: return the 1-based position of the first occurrence of a substring, counting from the left (0 if it is not found)
-- NOTE: both are case sensitive
POSITION('target_string' IN col)
STRPOS(col, 'target_string')

-- To split a string, combine LEFT or RIGHT with POSITION or STRPOS
...
LEFT(col, POSITION('target_string' IN col) - 1) AS new_col -- the -1 drops the target_string itself, keeping only the text before it
...
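The splitting idea can be sketched end to end as follows. The column `full_name` and the sample rows are hypothetical, and PostgreSQL is assumed. SUBSTR, which the heading mentions, is used here in its two-argument form (string, start position) to take everything after the space.

```sql
-- Hypothetical sample data; PostgreSQL syntax assumed.
SELECT full_name,
       -- everything before the first space
       LEFT(full_name, STRPOS(full_name, ' ') - 1)   AS first_name,
       -- everything after the first space
       SUBSTR(full_name, STRPOS(full_name, ' ') + 1) AS last_name
FROM (VALUES ('Ada Lovelace'), ('Alan Turing')) AS t(full_name);
```

For 'Ada Lovelace' this yields 'Ada' and 'Lovelace'. Rows without a space would need separate handling, since STRPOS returns 0 for them.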

LOWER, UPPER

-- Convert every character in a string to lowercase (or uppercase)
LOWER(col)
UPPER(col)
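One common use of LOWER is to make a comparison case insensitive. The sketch below revisits the earlier count of names starting with 'a', this time matching 'A' as well. The sample rows are hypothetical and PostgreSQL syntax is assumed.

```sql
-- Hypothetical sample data; PostgreSQL syntax assumed.
SELECT COUNT(*) AS n_name
FROM (VALUES ('alice'), ('Andrew'), ('bob')) AS t(name)
WHERE LOWER(LEFT(name, 1)) = 'a';  -- matches 'alice' and 'Andrew', so n_name is 2
```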

CONCAT & ||

-- CONCAT & ||: combine values from several columns into one column
...
CONCAT(a, 'separator (or nothing)', b) AS new_col
a || 'separator (or nothing)' || b AS new_col
...
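The two forms are not fully interchangeable when NULLs are involved. In PostgreSQL, CONCAT skips NULL arguments, while || returns NULL if any operand is NULL. A small sketch with hypothetical columns `first_name` and `last_name`:

```sql
-- Hypothetical sample data; PostgreSQL behaviour assumed.
SELECT CONCAT(first_name, ' ', last_name) AS via_concat, -- NULL arguments are ignored
       first_name || ' ' || last_name     AS via_pipes   -- NULL if any part is NULL
FROM (VALUES ('Ada', 'Lovelace'), ('Cher', NULL)) AS t(first_name, last_name);
```

The first row gives 'Ada Lovelace' in both columns. The second row gives 'Cher ' (with a trailing space) from CONCAT and NULL from ||.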

CAST

-- Converts a value or column from one data type to another
-- Change float to int (PostgreSQL rounds to the nearest integer; some other databases truncate to 25 instead):
CAST(25.6 AS int) => 26
-- Change string to date:
CAST(year || '-' || month || '-' || day AS date) => 2018-08-21
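Both conversions can be tried in one query. This is a sketch assuming PostgreSQL; the inline row with `year`, `month` and `day` stored as text is hypothetical.

```sql
-- Hypothetical sample data; PostgreSQL syntax assumed.
SELECT CAST(25.6 AS int) AS as_int,  -- PostgreSQL rounds a numeric value when casting to integer
       CAST(year || '-' || month || '-' || day AS date) AS full_date
FROM (VALUES ('2018', '08', '21')) AS t(year, month, day);
```

The date cast only succeeds when the concatenated text forms a valid date, so malformed parts would raise an error rather than return NULL.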

COALESCE

-- Returns the first non-null argument, evaluated for each row
COALESCE(col, 'Nothing here') AS non_null_col => if col is NULL, 'Nothing here' is shown instead
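A minimal sketch of COALESCE filling in missing values, assuming PostgreSQL. The columns `name` and `email` and the sample rows are hypothetical.

```sql
-- Hypothetical sample data; PostgreSQL syntax assumed.
SELECT name,
       COALESCE(email, 'Nothing here') AS email_or_default
FROM (VALUES ('Ada', 'ada@example.com'), ('Bob', NULL)) AS t(name, email);
```

Ada keeps her address, while Bob's NULL is replaced by 'Nothing here'.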