SQL

Tables and Schema

Tables

Data is organized into named tables, or relations
Table is used to store the data
Every column has its data type:

  • INTEGER, SMALLINT, TINYINT, BIGINT
  • DECIMAL(n,m) : float type
  • VARCHAR: string type
  • BOOLEAN: logic

Schemas

The names and types of the data in a table's columns are called the table's schema.

Log in shell

mysql -u root -p psw

SQL

The most common features of SQL are:

  • Projection - selecting some columns from table
  • Function application - applying a function to a value
  • Filtering - selecting rows based on some criterion
  • Grouping - aggregating rows based on the values in a column
  • Joins - combining two tables
    (回想在big data中学习的实现各类功能的完备数学运算集)

Query syntax

SQL queries have the form:

SELECT expression[, expression, ...]
FROM table
[... additional clauses ...]

Projection

Projection is when you ask for some(all) of the columns of a table.
也就是抽取相应的列,keyword 负责对抽取的列中的行进行过滤操作,
比如DISTINCT 就是抽取出某一列的不重复的所有元素(类似于数学里面的集合)。

SELECT col1, col2 
FROM table1

SELECT has many keywords like:

  • DISTINCT - return unique results
  • BETWEEN a AND b - limit the range, the values can be numbers, text, or dates
  • LIKE - pattern search within the column text
  • IN (a, b, c) - check if the value is contained among given.
SELECT DISTINCT Facility
FROM Admissions;
SELECT * 
FROM table1

Function application

在列上施加函数,可以构造新的列名

SELECT Facility,
       Lengthofstay * 24
FROM Admissions;
SELECT UPPER(Facility),
       Lengthofstay * IllnessCode
FROM Admissions;
SELECT Facility,
        Lengthofstay * 24 AS stayinhours
 FROM Admissions;

Filtering

相当于对行施加函数,抽取出部分的行, 关键词为WHERE.多个条件语句之间用逻辑符OR AND 隔开

SELECT *
FROM Admissions
WHERE Lengthofstay > 4;
SELECT Facility, IllnessCode
FROM Admissions
WHERE IllnessCode = 141 OR
      IllnessCode = 94;
SELECT Facility, Lengthofstay
FROM Admissions
WHERE Lengthofstay > 2 AND
              Lengthofstay < 6;

Aggregation function

Aggregate the rows according to a column, and perform operarions on the aggregated rows. 使用keyword :GROUP BY
Grouping 操作也就是汇总通常与聚合函数(共有40多种聚合函数)一起使用,常见的聚合函数:

  • COUNT - return the number of rows
  • SUM - cumulate the values
  • AVG - return the average for the group
  • MIN / MAX - smallest / largest value
SELECT Facility,
       MAX(Lengthofstay)
FROM Admissions
GROUP BY Facility;
SELECT Facility,
       MIN(Lengthofstay),
       MAX(Lengthofstay),
       AVG(Lengthofstay)
FROM Admissions
GROUP BY Facility;
SELECT COUNT(*) FROM Admissions;

可以将过滤操作和GROUP BY 操作放到一起同时执行,但是注意过滤操作要放到前面。

Sort the result

对结果排序,关键词 :ORDER BY

SELECT Facility, Lengthofstay
FROM Admissions
ORDER BY Lengthofstay;
SELECT Facility, Lengthofstay
FROM Admissions
ORDER BY Facility, Lengthofstay

Joins

链接两个或者多个表

SELECT * FROM Admissions JOIN Illnesses
ON Admissions.IllnessCode = Illnesses.Code;

Join 操作可以写的非常的复杂,同时join分为 Left Outer Join, Inner join and right Outer join

UPDATE t1 SET a = 1
FROM table1 t1 JOIN table2 t2 ON t1.id = t2.t1_id WHERE t1.col1 = 0 AND t2.col2 IS NULL;

LIMIT clause

从返回的结果中显示有限的元素

SELECT Facility, IllnessCode
FROM Admissions
LIMIT 2

Database and tables

Basic operations

CREATE DATABASE database name;
DROP DATABASE database name;
SHOW DATABASES;
USE database name;
SHOW TABLES;
DESCRIBE table-name;

tables

DROP TABLE IF EXISTS plate_types;

CREATE TABLE plate_types(
  code CHAR(3) NOT NULL,
  description VARCHAR(27),
  PRIMARY KEY(code)
);

You may notice that the return of a SELECT statement is actually another table. So you may use it as a subqueries anywhere you need a table

-- create a table from another select-statement
CREATE TABLE longstays
AS SELECT facility, MAX(lengthofstay)
   FROM hospital
   GROUP BY facility;

Deleting rows and tables

DROP TABLE table-name;
# just delete the data but leave the table(as a schema)
TRUNCATE TABLE table-name;
# Delete rows
DELETE FROM table-name
WHERE conditions;

add or update tables

-- update speci c data with the WHERE clause 
UPDATE table1 SET col1 = 1 
WHERE col2 = 2

-- insert values manually   
INSERT INTO table1 (ID, FIRST_NAME, LAST_NAME) 
VALUES (1, ‘Rebel’, ‘Labs’);

-- or by using the results of a query  
INSERT INTO table1 (ID, FIRST_NAME, LAST_NAME) 
SELECT id, last_name,  rst_name FROM table2

-- load data
LOAD DATA LOCAL INFILE '/Users/xiaodiu/Google/Big Data/assignment2/data/plate-types.csv'
INTO TABLE plate_types
COLUMNS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '"'
LINES TERMINATED BY '\n';

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。

推荐阅读更多精彩内容