Page cover

SQL

A relational database is a database that organizes information into one or more tables.

1. Basic SQL Commands

Basic Commands

  • CREATE TABLE creates a new table.

  • INSERT INTO adds a new row to a table.

  • SELECT queries data from a table.

  • ALTER TABLE changes an existing table.

  • UPDATE edits a row in a table.

  • DELETE FROM deletes rows from a table.

Create Table

Data types include INTEGER, TEXT, DATE and REAL (decimal value).

SELECT * FROM table_name;

CREATE TABLE table_name (
   column_1 data_type, 
   column_2 data_type, 
   column_3 data_type
);

CREATE TABLE celebs (
   id INTEGER, 
   name TEXT, 
   age INTEGER
); 

INSERT INTO celebs (id, name, age) 
VALUES (1, 'Justin Bieber', 22);         

  • Clauses are written in capital letters - perform specific tasks (They are commands)

  • parameters are in parenthesis, in this case, a list of column names and data

  • CREATE TABLE - creates from a scratch

Inserting Rows of Data

INSERT INTO table_name - adds specified row

(id,name, age) - parameter identifying columns the data added to

VALUES - data inserted

2. Queries

  • SELECT is the clause we use every time we want to query information from a database.

  • AS renames a column or table.

  • DISTINCT return unique values.

  • WHERE is a popular command that lets you filter the results of the query based on conditions that you specify.

  • LIKE and BETWEEN are special operators.

  • AND and OR combines multiple conditions.

  • ORDER BY sorts the result.

ORDER BY

  • can ORDER BY RIGHT (columnname, numberofchar) : from the right, pick 3 characters then order by these

  • ORDER BY 1stpriority, second priority : etc ORDER BY RIGHT(row1, 3), row2 ASC

  • LIMIT specifies the maximum number of rows that the query will return.

  • CASE creates different outputs.

Select to Fetch Data

1. Select from

name specifies column to select from, the * specifies data from all columns in the table. This command always return a new table, called result set.

  • " * " is a wildcard, select every column

2. Select as from

Allows you to set an alias for your column name in the results

3. distinct (add before column name)

used to return unique values, filters out duplicate values in the column

4. where

WHERE clause filters the result set to only include rows where the following condition is true.

5. Like

  • LIKE is a special operator used with the WHERE clause to search for a specific pattern in a column.

  • name LIKE 'Se_en' is a condition evaluating the name column for a specific pattern.

  • Se_en represents a pattern with a wildcard character.

The _ means you can substitute any individual character here without breaking the pattern. The names Seven and Se7en both match this pattern.

The percentage sign % is another wildcard character that can be used with LIKE.

This statement below filters the result set to only include movies with names that begin with the letter ‘A’:

use WHERE <column> IS NULL or IS NOT NULL : for missing data or non empty data

6. Between

When the values are text, BETWEEN filters the result set for within the alphabetical range.

Alter Table to add new column

NULL is a special value in SQL that represents missing or unknown data. Here, the rows that existed before the column was added have NULL (∅) values for twitter_handle.

UPDATE FROM table to edit existing rows or columns

DELETE FROM to delete records

The statement below deletes all records in the celeb table with no twitter_handle

Constraints Set for table

Constraints that add information about how a column can be used are invoked after specifying the data type for a column. They can be used to tell the database to reject inserted data that does not adhere to a certain restriction. The statement below sets constraints on the celebs table.

1. PRIMARY KEY columns can be used to uniquely identify the row. Attempts to insert a row with an identical value to a row already in the table will result in a constraint violation which will not allow you to insert the new row.

2. UNIQUE columns have a different value for every row. This is similar to PRIMARY KEY except a table can have many different UNIQUE columns.

3. NOT NULL columns must have a value. Attempts to insert a row without a value for a NOT NULL column will result in a constraint violation and the new row will not be inserted.

4. DEFAULT columns take an additional argument that will be the assumed value for an inserted row if the new row does not specify a value for that column.

3. Aggregate Functions

Calculations performed on multiple rows of a table are called aggregates.

  • COUNT(): count the number of rows

  • SUM(): the sum of the values in a column

  • MAX()/MIN(): the largest/smallest value

  • AVG(): the average of the values in a column

  • ROUND(): round the values in the column

By default, SQL tries to be as precise as possible without rounding. We can make the result table easier to read using the ROUND() function.

ROUND() function takes two arguments inside the parenthesis:

  1. a column name

  2. an integer

It rounds the values in the column to the number of decimal places specified by the integer.

  • GROUP BY : used with SELECT to arrange identical data to groups, The GROUP BY statement comes after any WHERE statements, but before ORDER BY or LIMIT.

  • HAVING : similar to WHERE (which filters rows) but it filters GROUPs, HAVING statement always comes after GROUP BY, but before ORDER BY and LIMIT.

    1. When we want to limit the results of a query based on values of the individual rows, use WHERE.

    2. When we want to limit the results of a query based on an aggregate property, use HAVING.

GROUP BY I

Group By 2

GROUP BY II References

Count example 1:

Example project : Forum moderating to catch rickroller users

Hacker_news count results

4. SQLite

SQLite is a database engine. It is software that allows users to interact with a relational database. In SQLite, a database is stored in a single file — a trait that distinguishes it from other database engines. This fact allows for a great deal of accessibility: copying a database is no more complicated than copying the file that stores the data, sharing a database can mean sending an email attachment.

  • does not validate data types

  • only one user can write to the file at a time

  • used where it makes sense for the database to be on the same disk as the application code.

SQLite creates schemas, which constrain the type of data in each column, but it does not enforce them. (will not reject wrong type)

strftime() - format date

takes 2 arguments : strftime(format, column)

For strftime(__, timestamp):

  • %Y returns the year (YYYY)

  • %m returns the month (01-12)

  • %d returns the day of the month (1-31)

  • %H returns 24-hour clock (00-23)

  • %M returns the minute (00-59)

  • %S returns the seconds (00-59)

if timestamp format is YYYY-MM-DD HH:MM:SS.

Results of the above
Grouping by hour, we find the most popular hour to post to get the highest scores on the forum
Next we use round and renaming columns using 'AS' (forgot to add WHERE hour IS NOT NULL)

5. Multiple Tables

Sometimes there's too much information to put into a single table, here we learn some SQL commands to help us work with multiple tables.

COMMANDS:

  • JOIN will combine rows from different tables if the join condition is true.

  • LEFT JOIN will return every row in the left table, and if the join condition is not met, NULL values are used to fill in the columns from the right table.

  • Primary key is a column that serves a unique identifier for the rows in the table.

  • Foreign key is a column that contains the primary key to another table.

  • CROSS JOIN lets us combine all rows of one table with all rows of another table.

  • UNION stacks one dataset on top of another.

  • WITH allows us to define one or more temporary tables that can be used in the final query.

Example case : A Magazine company's subscription services database

3 total tables

JOIN tables

Combining tables manually is time-consuming. Luckily, SQL gives us an easy sequence for this: it’s called a JOIN.

If we want to combine orders and customers, we would type:

Let’s break down this command:

  1. The first line selects all columns from our combined table. If we only want to select certain columns, we can specify which ones we want.

  2. The second line specifies the first table that we want to look in, orders

  3. The third line uses JOIN to say that we want to combine information from orders with customers.

  4. The fourth line tells us how to combine the two tables. We want to match orders table’s customer_id column with customers table’s customer_id column.

Because column names are often repeated across multiple tables, we use the syntax table_name.column_name to be sure that our requests for columns are unambiguous. In our example, we use this syntax in the ON statement, but we will also use it in the SELECT or any other statement where we refer to column names.

For example: Instead of selecting all the columns using *, if we only wanted to select orders table’s order_id column and customers table’s customer_name column, we could use the following query:

Inner Joins

Example 1 - multiple queries

Left Joins

What if we want to combine two tables and keep some of the un-matched rows?

SQL lets us do this through a command called LEFT JOIN. A left join will keep all rows from the first table, regardless of whether there is a matching row in the second table.

  1. The first line selects all columns from both tables.

  2. The second line selects table1 (the “left” table).

  3. The third line performs a LEFT JOIN on table2 (the “right” table).

  4. The fourth line tells SQL how to perform the join (by looking for matching values in column c2).

CROSS JOIN

Join all rows of one table to another regardless of matching information

  • The first two lines select the columns shirt_color and pants_color.

  • The third line pulls data from the table shirts.

  • The fourth line performs a CROSS JOIN with pants.

UNION - stacking tables

Sometimes we just want to stack one dataset on top of the other.

SQL has strict rules for appending data:

  • Tables must have the same number of columns.

  • The columns must have the same data types in the same order as the first table.

WITH

Often times, we want to combine two tables, but one of the tables is the result of another calculation.

  • The WITH statement allows us to perform a separate query (such as aggregating customer’s subscriptions)

  • previous_results is the alias that we will use to reference any columns from the query inside of the WITH clause

  • We can then go on to do whatever we want with this temporary table (such as join the temporary table with another table)

Essentially, we are putting a whole first query inside the parentheses () and giving it a name. After that, we can use this name as if it’s a table and write a new query using the first query.

EXAMPLE : Let’s return to our magazine order example. Our marketing department might want to know a bit more about our customers. For instance, they might want to know how many magazines each customer subscribes to. We can easily calculate this using our orders table, but a customer_id isn’t terribly useful for our marketing department, they probably want to know the customer’s name.

We want to be able to join the results of this query with our customers table, which will tell us the name of each customer. We can do this by using a WITH clause.

Results of above

HackerRank Qns

  1. Query for even id

HackerRank Basic SQL

2. Basic Join with filter

Given the CITY and COUNTRY tables, query the sum of the populations of all cities where the CONTINENT is 'Asia'.

Note: CITY.CountryCode and COUNTRY.Code are matching key columns.

3. ORDER BY last 3 characters in ascending ID

Query the Name of any student in STUDENTS who scored higher than Marks. Order your output by the last three characters of each name. If two or more students both have names ending in the same last three characters (i.e.: Bobby, Robby, etc.), secondary sort them by ascending ID.

4. HackerRank Basic SQL Certification

Last updated