SQL
A relational database is a database that organizes information into one or more tables.
1. Basic SQL Commands
Create Table
Data types include INTEGER, TEXT, DATE and REAL (decimal value).
SELECT * FROM table_name;
CREATE TABLE table_name (
column_1 data_type,
column_2 data_type,
column_3 data_type
);
CREATE TABLE celebs (
id INTEGER,
name TEXT,
age INTEGER
);
INSERT INTO celebs (id, name, age)
VALUES (1, 'Justin Bieber', 22);
Clauses are written in capital letters - perform specific tasks (They are commands)
parameters are in parenthesis, in this case, a list of column names and data
CREATE TABLE
- creates from a scratch
Inserting Rows of Data
INSERT INTO table_name
- adds specified row
(id,name, age)
- parameter identifying columns the data added to
VALUES
- data inserted
2. Queries
SELECT
is the clause we use every time we want to query information from a database.AS
renames a column or table.DISTINCT
return unique values.WHERE
is a popular command that lets you filter the results of the query based on conditions that you specify.LIKE
andBETWEEN
are special operators.AND
andOR
combines multiple conditions.ORDER BY
sorts the result.
LIMIT
specifies the maximum number of rows that the query will return.CASE
creates different outputs.
### COMPREHENSIVE REVIEW CHEATSHEET
SELECT * FROM movies WHERE year BETWEEN 1990 AND 1999;
SELECT name FROM movies WHERE name BETWEEN 'A' AND 'J';
SELECT *
FROM movies
WHERE year BETWEEN 1990 AND 1999
AND genre = 'romance';
SELECT *
FROM movies
WHERE year BETWEEN 1970 AND 1979
AND imdb_rating > 8;
SELECT *
FROM movies
WHERE year > 2014
OR genre = 'action';
### SORTING by alphabet or numerical
SELECT *
FROM movies
ORDER BY name; #auto sort by A-Z
SELECT *
FROM movies
WHERE imdb_rating > 8
ORDER BY year DESC; # DESC is a key word for descending order
# ORDER BY year ASC; #ascending for ASC
# ORDER BY always goes after WHERE (if WHERE is present).
## ORDER BY LAST 3 CHARACTERS IN ASC ID
SELECT NAME FROM STUDENTS WHERE Marks > 75 ORDER BY RIGHT(NAME, 3), ID ASC;
### QUESTION: Write a new query that retrieves the name, year, and
# imdb_rating columns of all the movies, ordered highest to lowest by their ratings.
SELECT name, year, imdb_rating FROM movies ORDER BY imdb_rating DESC;
### LIMIT - specify max no. of rows in result set
SELECT * FROM movies where name BETWEEN 'A' and 'J' BY ORDER DESC LIMIT 20
# selecting top 3 imdbratings by limit 3
SELECT * FROM movies
ORDER BY imdb_rating DESC
LIMIT 3;
### CASE (if - then logic)
#Task: split movies by imdbrating, above 8 is fantastic, above 6, poorly recieved, else, avoid at all costs
SELECT name,
CASE
WHEN imdb_rating > 8 THEN 'Fantastic' # WHEN tests condition, THEN returns if true
WHEN imdb_rating > 6 THEN 'Poorly Received'
ELSE 'Avoid at All Costs'
END # you can rename column name to 'Review' by 'END AS 'Review'
FROM movies;

Select to Fetch Data
1. Select from
SELECT name FROM celebs;
SELECT * FROM celebs;
name specifies column to select from, the * specifies data from all columns in the table. This command always return a new table, called result set.
" * " is a wildcard, select every column
SELECT name , genre , year FROM celebs;
//also works
2. Select as from
Allows you to set an alias for your column name in the results
SELECT name AS 'Title - Alias' FROM celebs;
# allows you to set name as an alias
# this does not rename in the table, but only show in results
3. distinct (add before column name)
used to return unique values, filters out duplicate values in the column
SELECT DISTINCT tools
FROM inventory;
4. where
WHERE
clause filters the result set to only include rows where the following condition is true.
SELECT * FROM movies WHERE imdb_rating>8;
5. Like
SELECT * FROM movies WHERE name LIKE 'se_en';
LIKE
is a special operator used with theWHERE
clause to search for a specific pattern in a column.name LIKE 'Se_en'
is a condition evaluating thename
column for a specific pattern.Se_en
represents a pattern with a wildcard character.
The _
means you can substitute any individual character here without breaking the pattern. The names Seven
and Se7en
both match this pattern.
The percentage sign %
is another wildcard character that can be used with LIKE
.
This statement below filters the result set to only include movies with names that begin with the letter ‘A’:
SELECT *
FROM movies
WHERE name LIKE 'A%';
SELECT * FROM movies WHERE name Like '%man%' # any name with man inside
6. Between
SELECT * FROM movies WHERE year BETWEEN 1990 AND 1999;
SELECT name FROM movies WHERE name BETWEEN 'A' AND 'J';
SELECT *
FROM movies
WHERE year BETWEEN 1990 AND 1999
AND genre = 'romance';
SELECT *
FROM movies
WHERE year BETWEEN 1970 AND 1979
AND imdb_rating > 8;
SELECT *
FROM movies
WHERE year > 2014
OR genre = 'action';
When the values are text, BETWEEN
filters the result set for within the alphabetical range.
Alter Table to add new column
ALTER TABLE celebs;
ADD COLUMN twitter_handle
TEXT;
NULL is a special value in SQL that represents missing or unknown data. Here, the rows that existed before the column was added have NULL
(∅) values for twitter_handle
.
UPDATE FROM table to edit existing rows or columns
UPDATE celebs
SET twitter_handle #indicates column to edit
= '@taylorswift13' # new value
WHERE id = 4; # indicates which row to apply to, here the row
# with 4 nin id column

DELETE FROM to delete records
The statement below deletes all records in the celeb
table with no twitter_handle
DELETE FROM celebs
WHERE twitter_handle IS NULL;
#where lets you know which rows
#IS NULL - condition to delete
Constraints Set for table
Constraints that add information about how a column can be used are invoked after specifying the data type for a column. They can be used to tell the database to reject inserted data that does not adhere to a certain restriction. The statement below sets constraints on the celebs
table.
CREATE TABLE celebs (
id INTEGER PRIMARY KEY,
name TEXT UNIQUE,
date_of_birth TEXT NOT NULL ,
date_of_death TEXT DEFAULT 'Not Applicable'
);
1. PRIMARY KEY
columns can be used to uniquely identify the row. Attempts to insert a row with an identical value to a row already in the table will result in a constraint violation which will not allow you to insert the new row.
2. UNIQUE
columns have a different value for every row. This is similar to PRIMARY KEY
except a table can have many different UNIQUE
columns.
3. NOT NULL
columns must have a value. Attempts to insert a row without a value for a NOT NULL
column will result in a constraint violation and the new row will not be inserted.
4. DEFAULT
columns take an additional argument that will be the assumed value for an inserted row if the new row does not specify a value for that column.
3. Aggregate Functions
Calculations performed on multiple rows of a table are called aggregates.
COUNT()
: count the number of rowsSUM()
: the sum of the values in a columnMAX()
/MIN()
: the largest/smallest valueAVG()
: the average of the values in a columnROUND()
: round the values in the column
GROUP BY
: used withSELECT
to arrange identical data to groups, TheGROUP BY
statement comes after anyWHERE
statements, but beforeORDER BY
orLIMIT
.HAVING : similar to WHERE (which filters rows) but it filters GROUPs,
HAVING
statement always comes afterGROUP BY
, but beforeORDER BY
andLIMIT
.When we want to limit the results of a query based on values of the individual rows, use
WHERE
.When we want to limit the results of a query based on an aggregate property, use
HAVING
.
### 1. COUNT rows
SELECT COUNT(*) FROM table_name;
# with condition
SELECT COUNT(*)
FROM fake_apps
WHERE price = 0;
### 2. SUM values in column_name
SELECT SUM(column_name)
FROM table_name;
### 3. MAX MIN value in a column_name
SELECT MAX(column_name)
FROM table_name;
### 4. AVG average
SELECT AVG(column_name) FROM table_name;
### 5. ROUND (parameter1 = column_name, parameter2 = decimal places)
SELECT name, ROUND(price, 0)
FROM fake_apps;
# Combining aggregators
SELECT ROUND(AVG(price), 2)
FROM fake_apps;
### 6. GROUP BY along with SELECT
SELECT year,
AVG(imdb_rating) #avg of column
FROM movies #table_name
GROUP BY year
ORDER BY year;
SELECT price, COUNT(*)
FROM fake_apps
WHERE downloads > 20000
GROUP BY price;
### 7. HAVING for filtering groups
SELECT year,
genre,
COUNT(name)
FROM movies
GROUP BY 1, 2
HAVING COUNT(name) > 10;
SELECT name, category FROM startups GROUP BY 2 HAVING COUNT(*)>3

Group By 2
SELECT ROUND(imdb_rating),
COUNT(name)
FROM movies
GROUP BY ROUND(imdb_rating)
ORDER BY ROUND(imdb_rating);
### The above and below is the same query
SELECT ROUND(imdb_rating),
COUNT(name)
FROM movies
GROUP BY 1 # this number references the first column u mentioned, ROUND(imdb_rating)
ORDER BY 1;

Count example 1:
SELECT COUNT(CITY) - COUNT(DISTINCT CITY) FROM STATION;
#total minus distinct results
Example project : Forum moderating to catch rickroller users
SELECT user,
COUNT(*)
FROM hacker_news
WHERE url LIKE '%watch?v=dQw4w9WgXcQ%' # this url is the subdoamin of youtube rickroll vid
GROUP BY 1
ORDER BY 2 DESC;
#or similarly
SELECT user,
COUNT(*)
FROM hacker_news
WHERE url LIKE '%watch?v=dQw4w9WgXcQ%'
GROUP BY user
ORDER BY COUNT(*) DESC;
# To find source of news from forum hacker_news
SELECT CASE
WHEN url LIKE "%github.com%" THEN 'Github'
WHEN url LIKE '%medium.com%' THEN 'Medium'
WHEN url LIKE '%newyorktimes.com%' THEN 'New York Times'
ELSE 'Other'
END AS 'Source'
FROM hacker_news;
### to count the occurences
SELECT CASE
WHEN url LIKE "%github.com%" THEN 'Github'
WHEN url LIKE '%medium.com%' THEN 'Medium'
WHEN url LIKE '%newyorktimes.com%' THEN 'New York Times'
ELSE 'Other'
END AS 'Source',
COUNT(*)
FROM hacker_news
GROUP BY 1;

4. SQLite
SQLite is a database engine. It is software that allows users to interact with a relational database. In SQLite, a database is stored in a single file — a trait that distinguishes it from other database engines. This fact allows for a great deal of accessibility: copying a database is no more complicated than copying the file that stores the data, sharing a database can mean sending an email attachment.
does not validate data types
only one user can write to the file at a time
used where it makes sense for the database to be on the same disk as the application code.
SQLite creates schemas, which constrain the type of data in each column, but it does not enforce them. (will not reject wrong type)
#Just downloaded sqlite, access via following commands
sqlite <name of database.sqlite>
#use .exit
#use .open FILENAME
strftime() - format date
takes 2 arguments : strftime(format, column)
For strftime(__, timestamp)
:
%Y
returns the year (YYYY)%m
returns the month (01-12)%d
returns the day of the month (1-31)%H
returns 24-hour clock (00-23)%M
returns the minute (00-59)%S
returns the seconds (00-59)
if timestamp
format is YYYY-MM-DD HH:MM:SS
.
SELECT timestamp,
strftime('%H', timestamp)
FROM hacker_news
GROUP BY 1
LIMIT 20;



5. Multiple Tables
Sometimes there's too much information to put into a single table, here we learn some SQL commands to help us work with multiple tables.
Example case : A Magazine company's subscription services database

JOIN tables
Combining tables manually is time-consuming. Luckily, SQL gives us an easy sequence for this: it’s called a JOIN
.
If we want to combine orders
and customers
, we would type:
SELECT *
FROM orders
JOIN customers
ON orders.customer_id = customers.customer_id; # table_name.column_name
# given 3 tables, orders, subscription, customers
Let’s break down this command:
The first line selects all columns from our combined table. If we only want to select certain columns, we can specify which ones we want.
The second line specifies the first table that we want to look in,
orders
The third line uses
JOIN
to say that we want to combine information fromorders
withcustomers
.The fourth line tells us how to combine the two tables. We want to match
orders
table’scustomer_id
column withcustomers
table’scustomer_id
column.
Because column names are often repeated across multiple tables, we use the syntax table_name.column_name
to be sure that our requests for columns are unambiguous. In our example, we use this syntax in the ON
statement, but we will also use it in the SELECT
or any other statement where we refer to column names.
For example: Instead of selecting all the columns using *
, if we only wanted to select orders
table’s order_id
column and customers
table’s customer_name
column, we could use the following query:
SELECT orders.order_id,
customers.customer_name
FROM orders
JOIN customers
ON orders.customer_id = customers.customer_id;
### EXAMPLE - filter (WHERE is after JOIN ON)
SELECT * FROM orders
JOIN subscriptions
ON orders.subscription_id = subscriptions.subscription_id
WHERE subscriptions.description = 'Fashion Magazine';
Inner Joins
Example 1 - multiple queries
SELECT COUNT(*) FROM newspaper;
SELECT COUNT(*) FROM online;
SELECT COUNT(*)
FROM newspaper
JOIN online
ON newspaper.id = online.id;
Left Joins
What if we want to combine two tables and keep some of the un-matched rows?
SQL lets us do this through a command called LEFT JOIN
. A left join will keep all rows from the first table, regardless of whether there is a matching row in the second table.
SELECT *
FROM table1
LEFT JOIN table2 # joins regardless of presence of matching rows
ON table1.c2 = table2.c2;
The first line selects all columns from both tables.
The second line selects
table1
(the “left” table).The third line performs a
LEFT JOIN
ontable2
(the “right” table).The fourth line tells SQL how to perform the join (by looking for matching values in column
c2
).
SELECT *
FROM newspaper
LEFT JOIN online
ON newspaper.id = online.id;
### Want to find those that subscribe to newspaper but NOT ONLINE
SELECT *
FROM newspaper
LEFT JOIN online
ON newspaper.id = online.id # the column that u r joining is ON
WHERE online.id IS NULL; # WHERE is empty in the online id row
CROSS JOIN
Join all rows of one table to another regardless of matching information
SELECT shirts.shirt_color,
pants.pants_color
FROM shirts
CROSS JOIN pants; # dont need ON as u r not joingin on any column BUT EVERYTHING
The first two lines select the columns
shirt_color
andpants_color
.The third line pulls data from the table
shirts
.The fourth line performs a
CROSS JOIN
withpants
.
UNION - stacking tables
Sometimes we just want to stack one dataset on top of the other.
SELECT *
FROM table1
UNION
SELECT *
FROM table2;
SQL has strict rules for appending data:
Tables must have the same number of columns.
The columns must have the same data types in the same order as the first table.
WITH
Often times, we want to combine two tables, but one of the tables is the result of another calculation.
WITH previous_results AS (
SELECT ...
...
...
...
)
SELECT *
FROM previous_results
JOIN customers
ON _____ = _____;
The
WITH
statement allows us to perform a separate query (such as aggregating customer’s subscriptions)previous_results
is the alias that we will use to reference any columns from the query inside of theWITH
clauseWe can then go on to do whatever we want with this temporary table (such as join the temporary table with another table)
Essentially, we are putting a whole first query inside the parentheses ()
and giving it a name. After that, we can use this name as if it’s a table and write a new query using the first query.
EXAMPLE : Let’s return to our magazine order example. Our marketing department might want to know a bit more about our customers. For instance, they might want to know how many magazines each customer subscribes to. We can easily calculate this using our orders
table, but a customer_id
isn’t terribly useful for our marketing department, they probably want to know the customer’s name.
We want to be able to join the results of this query with our customers
table, which will tell us the name of each customer. We can do this by using a WITH
clause.
WITH previous_query AS (
SELECT customer_id,
COUNT(subscription_id) AS 'subscriptions'
FROM orders
GROUP BY customer_id
)
SELECT customers.customer_name, previous_query.subscriptions
FROM previous_query
JOIN customers
ON customers.customer_id = previous_query.customer_id;

HackerRank Qns
Query for even id

2. Basic Join with filter
Given the CITY and COUNTRY tables, query the sum of the populations of all cities where the CONTINENT is 'Asia'.
Note: CITY.CountryCode and COUNTRY.Code are matching key columns.
SELECT SUM(CITY.POPULATION)
FROM CITY
LEFT JOIN COUNTRY
ON CITY.CountryCode = COUNTRY.Code
WHERE COUNTRY.CONTINENT = 'Asia';
3. ORDER BY last 3 characters in ascending ID
Query the Name of any student in STUDENTS who scored higher than Marks. Order your output by the last three characters of each name. If two or more students both have names ending in the same last three characters (i.e.: Bobby, Robby, etc.), secondary sort them by ascending ID.
SELECT Name FROM STUDENTS WHERE Marks >75
ORDER BY RIGHT(Name,3), ID ASC
4. HackerRank Basic SQL Certification
# QUESTION 1. STUDENT ADVISOR
SELECT student_information.roll_number, student_information.name
FROM student_information
JOIN faculty_information
ON student_information.advisor = faculty_information.employee_ID
WHERE (faculty_information.gender = 'M' AND faculty_information.salary >15000)
OR (faculty_information.gender = 'F'AND faculty_information.salary >20000);
# QUESTION 2: PROFITABLE STOCKS
# condition is that given list of stocks
SELECT price_today.stock_code
FROM price_today JOIN price_tomorrow
ON price_today.stock_code = price_tomorrow.stock_code
WHERE price_tomorrow.price > price_today.price # condition
ORDER BY stock_code ASC;
Last updated