1. Warm-up: querying data from the Chinook database
QUESTION: Which songs in this dataset were written by AC/DC?
QUERY = "SELECT Name FROM Track WHERE Composer='AC/DC';"
QUESTION: How long would it take to listen to all the songs in our
dataset that were written by Johann Sebastian Bach?
QUERY = "SELECT Composer, sum(Milliseconds) FROM Track WHERE Composer='Johann Sebastian Bach';"
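The second query returns a total in milliseconds, which still has to be converted to answer "how long". Below is a minimal Python sketch of that step, assuming a made-up three-row Track table (the rows and durations are hypothetical, not Chinook data) and Python's built-in sqlite3 module.

```python
import sqlite3

# Hypothetical miniature Track table; these rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Track (Name TEXT, Composer TEXT, Milliseconds INTEGER)")
conn.executemany(
    "INSERT INTO Track VALUES (?, ?, ?)",
    [
        ("Track A", "Johann Sebastian Bach", 120000),
        ("Track B", "Johann Sebastian Bach", 180000),
        ("Track C", "AC/DC", 200000),
    ],
)

# Same filter as the warm-up query, with the composer passed as a parameter.
QUERY = "SELECT sum(Milliseconds) FROM Track WHERE Composer = ?;"
(total_ms,) = conn.execute(QUERY, ("Johann Sebastian Bach",)).fetchone()

# 60,000 milliseconds per minute.
total_minutes = total_ms / 60000
print(total_ms, total_minutes)
```

Against the real database, the same conversion applies to whatever total the query returns.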
2. Setting up a local environment
1. Download "sqlite_windows"
http://video.udacity-data.com.s3.amazonaws.com/topher/2017/May/59265674_sqlite-windows/sqlite-windows.zip
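2. Extract it to your C: drive
You should see a folder named "sqlite_windows" containing two files, "Chinook_Sqlite.sqlite" and "sqlite.exe".
Open the "Run" window from the Start menu, type cmd, and press Enter.
At the command prompt, enter the following: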
C:\> cd \sqlite_windows
C:\sqlite_windows>sqlite3 Chinook_Sqlite.sqlite
sqlite>
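You have now successfully downloaded sqlite3 and opened the chinook database that we will use in this course. You can also access the database through the "chinook_db file" file. chinook.db and Chinook_Sqlite.sqlite have identical contents, so you can also open chinook.db by running sqlite3 chinook.db.
Everything above uses the download from our course resources, a small bundle containing what you need for this course. You can also get the original files from the official websites.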
SQLite - https://www.sqlite.org/download.html
You need to download:
- Windows: sqlite-shell-win32-x86-3090200.zip
- Mac: sqlite-shell-osx-x86-3090200.zip
Chinook database - https://chinookdatabase.codeplex.com/releases/view/55681
Once you have downloaded these files, you can follow the same steps described above.
3. Welcome to SQLite
At any time, you can enter the .help command to get a list of the available command-line operations.
sqlite> .help
Enter .tables to see all the tables in the database.
sqlite> .tables
To see the schema of a table, enter .schema followed by the table name.
sqlite> .schema Album
CREATE TABLE [Album]
(
[AlbumId] INTEGER NOT NULL,
[Title] NVARCHAR(160) NOT NULL,
[ArtistId] INTEGER NOT NULL,
CONSTRAINT [PK_Album] PRIMARY KEY ([AlbumId]),
FOREIGN KEY ([ArtistId]) REFERENCES [Artist] ([ArtistId])
ON DELETE NO ACTION ON UPDATE NO ACTION
);
CREATE UNIQUE INDEX [IPK_Album] ON [Album]([AlbumId]);
CREATE INDEX [IFK_AlbumArtistId] ON [Album] ([ArtistId]);
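The .schema command returns a lot of useful information. From the output we can see the table's name, the name of each column, each column's data type, and the table's primary and foreign keys.
CREATE TABLE [Album] <-- this is the table's name, Album
[AlbumId] <--- these are the column names
[Title] this table has three columns:
[ArtistId] AlbumId, Title, ArtistId
INTEGER, <--- these are the data types in the table
NVARCHAR(160), each column is declared with a data type
INTEGER,
The rest of the output describes the constraints on the table: a primary key constraint, a foreign key constraint, and NOT NULL constraints.
Note: if you do not name a table in the .schema command, it returns the information for every table in the database.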
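The same schema details can be read from a program instead of the sqlite3 shell. The sketch below is one possible approach, assuming Python's built-in sqlite3 module and an in-memory database that holds only the Album table shown above. It uses SQLite's `PRAGMA table_info`, which reports each column's name, declared type, NOT NULL flag, and primary key flag.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Recreate the Album table from the .schema output above (indexes omitted).
conn.execute("""
CREATE TABLE [Album]
(
    [AlbumId] INTEGER NOT NULL,
    [Title] NVARCHAR(160) NOT NULL,
    [ArtistId] INTEGER NOT NULL,
    CONSTRAINT [PK_Album] PRIMARY KEY ([AlbumId]),
    FOREIGN KEY ([ArtistId]) REFERENCES [Artist] ([ArtistId])
        ON DELETE NO ACTION ON UPDATE NO ACTION
)
""")

# Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
columns = [
    (name, col_type, bool(notnull), bool(pk))
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(Album)")
]

for column in columns:
    print(column)
```

To inspect the real database, connect to Chinook_Sqlite.sqlite instead of ":memory:" and skip the CREATE TABLE step.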
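Start querying!
The Chinook database is built from an iTunes library and holds the information of a digital music store. The questions in this problem set will help you get familiar with its contents. If you want to learn more about the Chinook database, you can visit its official website.
Now that you have looked at the names of the tables in the database, let's try querying the contents of the Invoice table.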
sqlite> SELECT * FROM Invoice;
When you are done, enter .exit to leave the database.
sqlite> .exit
4. Elements of SQL
4.1 The select clause
Write a query that returns all the species in the zoo, and how many animals of each species there are, sorted with the most populous species at the top.
The result should have two columns: species and number.
The animals table has columns (name, species, birthdate) for each individual.
QUERY = "SELECT species, count(*) AS num FROM animals GROUP BY species ORDER BY num DESC;"
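To see how GROUP BY and ORDER BY combine here, the query can be run against a tiny in-memory table. This is only a sketch: the animals rows below are invented, and Python's built-in sqlite3 module stands in for the course's own environment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name TEXT, species TEXT, birthdate TEXT)")

# Hypothetical sample rows: three gorillas, two llamas, one iguana.
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        ("Max", "gorilla", "2001-04-23"),
        ("Dave", "gorilla", "1988-09-29"),
        ("Becky", "gorilla", "1979-07-04"),
        ("Liz", "llama", "2012-06-20"),
        ("George", "llama", "2010-11-09"),
        ("Sam", "iguana", "2014-02-01"),
    ],
)

# One output row per species, largest group first.
QUERY = "SELECT species, count(*) AS num FROM animals GROUP BY species ORDER BY num DESC;"
rows = conn.execute(QUERY).fetchall()
print(rows)
```

GROUP BY collapses the six rows into one row per species, and ORDER BY num DESC sorts those groups by their counts.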
4.2 insert
Insert a newborn baby opossum into the animals table and verify that it's been added.
To do this, fill in the rest of SELECT_QUERY and INSERT_QUERY.
SELECT_QUERY should find the names and birthdates of all opossums.
INSERT_QUERY should add a new opossum to the table.
The animals table has columns (name, species, birthdate) for each individual.
SELECT_QUERY = "SELECT name, birthdate FROM animals WHERE species='opossum';"
INSERT_QUERY = "INSERT INTO animals VALUES ('lilpump', 'opossum', '2010-08-17');"
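The insert-then-verify flow can be sketched in Python as follows. The table contents are made up, and naming the columns in the INSERT and passing the values as parameters is a variation on the answer above (an assumption about style, not something the exercise requires).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (name TEXT, species TEXT, birthdate TEXT)")
# Hypothetical pre-existing opossum so the "before" result is not empty.
conn.execute("INSERT INTO animals VALUES ('Patches', 'opossum', '2008-03-11')")

SELECT_QUERY = "SELECT name, birthdate FROM animals WHERE species = 'opossum';"
# Listing the columns makes the statement independent of column order.
INSERT_QUERY = "INSERT INTO animals (name, species, birthdate) VALUES (?, ?, ?);"

before = conn.execute(SELECT_QUERY).fetchall()
conn.execute(INSERT_QUERY, ("lilpump", "opossum", "2010-08-17"))
conn.commit()
after = conn.execute(SELECT_QUERY).fetchall()

print(before)
print(after)
```

Running the SELECT before and after the INSERT confirms that the new row was added.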
4.3 join
Find the names of the individual animals that eat fish.
The animals table has columns (name, species, birthdate) for each individual.
The diet table has columns (species, food) for each food that a species eats.
QUERY = "SELECT animals.name FROM animals JOIN diet ON animals.species=diet.species WHERE food='fish';"
QUERY = "SELECT name FROM animals, diet WHERE animals.species=diet.species AND diet.food='fish';"
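The two queries above are the explicit JOIN ... ON form and the older comma-join form of the same inner join. A small check, assuming invented animals and diet rows and Python's built-in sqlite3 module, shows that they return the same names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE animals (name TEXT, species TEXT, birthdate TEXT);
CREATE TABLE diet (species TEXT, food TEXT);
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        ("Max", "gorilla", "2001-04-23"),
        ("Sam", "iguana", "2014-02-01"),
        ("Ollie", "otter", "2011-05-02"),
        ("Pip", "penguin", "2013-08-15"),
    ],
)
conn.executemany(
    "INSERT INTO diet VALUES (?, ?)",
    [("gorilla", "plants"), ("otter", "fish"), ("otter", "crabs"), ("penguin", "fish")],
)

JOIN_QUERY = (
    "SELECT animals.name FROM animals JOIN diet "
    "ON animals.species = diet.species WHERE food = 'fish';"
)
COMMA_QUERY = (
    "SELECT name FROM animals, diet "
    "WHERE animals.species = diet.species AND diet.food = 'fish';"
)

# Row order is not guaranteed without ORDER BY, so sort before comparing.
join_names = sorted(row[0] for row in conn.execute(JOIN_QUERY))
comma_names = sorted(row[0] for row in conn.execute(COMMA_QUERY))
print(join_names, comma_names)
```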
4.4 Filtering after aggregation (use having; where cannot be used)
QUERY = "SELECT food, count(*) AS num FROM animals, diet WHERE animals.species=diet.species GROUP BY food HAVING num=1;"
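The reason for HAVING is that WHERE filters individual rows before they are grouped, so it cannot test an aggregate such as count(*). The sketch below, with invented rows and Python's built-in sqlite3 module, runs the HAVING query and then shows SQLite rejecting the same condition placed in WHERE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE animals (name TEXT, species TEXT, birthdate TEXT);
CREATE TABLE diet (species TEXT, food TEXT);
""")

# Hypothetical sample data: fish is eaten by two animals, plants and crabs by one each.
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        ("Max", "gorilla", "2001-04-23"),
        ("Ollie", "otter", "2011-05-02"),
        ("Pip", "penguin", "2013-08-15"),
    ],
)
conn.executemany(
    "INSERT INTO diet VALUES (?, ?)",
    [("gorilla", "plants"), ("otter", "fish"), ("otter", "crabs"), ("penguin", "fish")],
)

HAVING_QUERY = (
    "SELECT food, count(*) AS num FROM animals, diet "
    "WHERE animals.species = diet.species GROUP BY food HAVING num = 1;"
)
rows = sorted(conn.execute(HAVING_QUERY).fetchall())
print(rows)

# The same test inside WHERE is an error: aggregates are not available per row.
BAD_QUERY = (
    "SELECT food, count(*) AS num FROM animals, diet "
    "WHERE animals.species = diet.species AND count(*) = 1 GROUP BY food;"
)
try:
    conn.execute(BAD_QUERY)
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)
print(where_error)
```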
4.5 Problem set
4.5.1 The three countries with the most invoices
Write a query that returns the 3 countries with the highest number of invoices, along with the number of invoices for these countries.
QUERY =
SELECT BillingCountry,count(*) as num
FROM Invoice
GROUP BY BillingCountry
ORDER BY num DESC
LIMIT 3
4.5.2 Best customer email
Build a query that returns the person who has the highest sum of all invoices, along with their email, first name, and last name.
QUERY =
SELECT Customer.Email,Customer.FirstName,Customer.LastName,sum(Invoice.Total) as num
FROM Customer JOIN Invoice
ON Customer.CustomerId = Invoice.CustomerId
GROUP BY Customer.CustomerId
ORDER BY num DESC
LIMIT 1
4.5.3 Best customer email
Use your query to return the email, first name, last name, and Genre of all Rock Music listeners!
Return your list ordered alphabetically by email address starting with A.
Can you find a way to deal with duplicate email addresses so no one receives multiple emails?
QUERY =
SELECT Customer.Email,Customer.FirstName,Customer.LastName,Genre.Name
FROM Customer
JOIN Invoice ON Customer.CustomerId=Invoice.CustomerId
JOIN InvoiceLine ON Invoice.InvoiceId=InvoiceLine.InvoiceId
JOIN Track ON InvoiceLine.TrackId = Track.TrackId
JOIN Genre ON Track.GenreId=Genre.GenreId
WHERE Genre.Name='Rock'
GROUP BY Customer.Email -- ensures there are no duplicate email addresses
ORDER BY Customer.Email
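The query above removes duplicates with GROUP BY Customer.Email. SELECT DISTINCT is another common way to do this when the selected columns are identical across the duplicate rows. The sketch below compares the two ideas on cut-down, hypothetical versions of the Chinook tables (only the columns the query touches, with invented rows), using Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-ins for the Chinook tables; all rows are made up.
conn.executescript("""
CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, Email TEXT, FirstName TEXT, LastName TEXT);
CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY, CustomerId INTEGER);
CREATE TABLE InvoiceLine (InvoiceLineId INTEGER PRIMARY KEY, InvoiceId INTEGER, TrackId INTEGER);
CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, GenreId INTEGER);
CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, Name TEXT);

INSERT INTO Genre VALUES (1, 'Rock'), (2, 'Jazz');
INSERT INTO Track VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO Customer VALUES
    (1, 'bob@example.com', 'Bob', 'Stone'),
    (2, 'amy@example.com', 'Amy', 'Reed'),
    (3, 'cat@example.com', 'Cat', 'Lee');
INSERT INTO Invoice VALUES (1, 1), (2, 2), (3, 3);
-- Bob buys two Rock tracks, Amy one Rock track, Cat one Jazz track.
INSERT INTO InvoiceLine VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 3, 3);
""")

COLUMNS = "Customer.Email, Customer.FirstName, Customer.LastName, Genre.Name"
JOINS = """
FROM Customer
JOIN Invoice ON Customer.CustomerId = Invoice.CustomerId
JOIN InvoiceLine ON Invoice.InvoiceId = InvoiceLine.InvoiceId
JOIN Track ON InvoiceLine.TrackId = Track.TrackId
JOIN Genre ON Track.GenreId = Genre.GenreId
WHERE Genre.Name = 'Rock'
"""

with_duplicates = conn.execute(
    f"SELECT {COLUMNS} {JOINS} ORDER BY Customer.Email"
).fetchall()
distinct_rows = conn.execute(
    f"SELECT DISTINCT {COLUMNS} {JOINS} ORDER BY Customer.Email"
).fetchall()

print(len(with_duplicates))
print(distinct_rows)
```

In this sample data Bob appears twice without deduplication because he bought two Rock tracks; DISTINCT collapses those rows into one.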
4.5.4 Music promotion campaign
Write a query that returns the 1 city that has the highest sum of invoice totals.
Return both the city name and the sum of all invoice totals.
QUERY =
SELECT BillingCity,SUM(Total) as Total
FROM Invoice
GROUP BY BillingCity
ORDER BY Total DESC
LIMIT 1
4.5.5 Top music genres by city
Write a query that returns the BillingCity and the total number of invoices.
Return the top 3 most popular music genres for Prague, the city with the highest
invoice total.
QUERY =
SELECT Invoice.BillingCity,COUNT(Invoice.InvoiceId) as num,Genre.Name
FROM Invoice
JOIN InvoiceLine ON Invoice.InvoiceId=InvoiceLine.InvoiceId
JOIN Track ON InvoiceLine.TrackId=Track.TrackId
JOIN Genre ON Track.GenreId=Genre.GenreId
WHERE BillingCity='Prague'
GROUP BY Genre.Name
ORDER BY num DESC
LIMIT 3
4.5.6 Finding musicians
Write a query that returns the Artist name and total track count of the top 10 rock bands.
QUERY =
SELECT Artist.Name,COUNT(Track.TrackId) as num
FROM Genre
JOIN Track ON Genre.GenreId=Track.GenreId
JOIN Album ON Track.AlbumId=Album.AlbumId
JOIN Artist ON Album.ArtistId=Artist.ArtistId
WHERE Genre.Name='Rock'
GROUP BY Artist.ArtistId
ORDER BY num DESC
LIMIT 10
4.5.7 Straight to France
Return the BillingCities in France, followed by the total number of tracks purchased for Alternative & Punk music.
Order your output so that the city with the highest total number of tracks purchased is on top.
QUERY =
SELECT Invoice.BillingCity,COUNT(Track.TrackId) as num
FROM Invoice
JOIN InvoiceLine ON Invoice.InvoiceId=InvoiceLine.InvoiceId
JOIN Track ON InvoiceLine.TrackId=Track.TrackId
JOIN Genre ON Track.GenreId=Genre.GenreId
WHERE Invoice.BillingCountry='France' AND Genre.Name='Alternative & Punk'
GROUP BY Invoice.BillingCity
ORDER BY num DESC