6.2 SQL (1)

SQL, short of Structured Query Language, has established itself as the standard relational database language. Generally speaking, although SQL is only a small part of database systems, studying SQL would cost most time for undergraduates[1]. In this textbook, we mainly focus two parts of SQL:

  • Data-definition language (DDL): defining relation schemas, deleting relations, and modifying relation schemas.
  • Data-manipulation language (DML): querying information from, inserting tuples into, deleting tuples from, and modifying tuples in the database.

The term CRUD, short for create, read, update, and delete, is often used to describe the database business logics.

[!NOTE] SQL is case-insensitive. SQL command ends with a semicolon (;).

DDL: data types

[!TIP] TL;DR. We only use INTEGER, REAL, and TEXT for simplicity, and you can think of it as int, double, and String in Java, respectively.

When defining relation schemas, we must specify the attributes' data types[2]. The most commonly used data types are listed below:

  • char(n): A fixed length character string with user-specified length n.
  • varchar(n): A variable-length character string with user-specified maximum length n.
  • int: An integer.
  • numeric(p, d): A fixed-point number with user-specified precision. For example, numeric(3, 1) allows 42.1 to be stored.
  • real, double precision: Floating-point and double precision floating point numbers.
  • float(n): A floating-point number with precision of at least n digits.

But each value stored in an SQLite database (or manipulated by the database engine) has one of the following storage classes:

  • INTEGER
  • REAL
  • TEXT
  • BLOB
  • NULL

So, is SQL written in standard syntaxes not compatible with SQLite? The answer is yes or no. 1) No two SQL database engines work exactly alike, so incompatibility sometimes is avoidable. 2) SQLite supports the concept of type affinity on columns. For example, both char(n) and varchar(n) will convert to affinity type TEXT[3].

SQLite strives to be flexible regarding the datatype of the content that it stores. For example, if a table column has a type of "INTEGER", then SQLite tries to convert anything inserted into that column into an integer. So an attempt to insert the string '123' results in an integer 123 being inserted. But if the content cannot be losslessly converted into an integer, for example if the input is 'xyz', then the original string is inserted instead.

For the sake of simplicity, in what follows, we only use three data types: INTEGER, REAL, and TEXT. By the way, SQLite does not have a storage class set aside for storing dates and/or times. In this textbook, we use TEXT to store date as ISO8601 strings (YYYY-MM-DD HH:MM:SS.SSS)[4].

We define an SQL relation by using CREATE TABLE command. The following command crates a relation book in the database:

CREATE TABLE book(
    ISBN TEXT PRIMARY KEY,
    title TEXT,
    author_name TEXT,
    price REAL
);

Since primary key for book consists of a single column (ISBN), the keywords PRIMARY KEY is added to a column definition. Alternatively, it can also be specified as a table constraint:

CREATE TABLE book(
    ISBN TEXT,
    title TEXT,
    author_name TEXT,
    price REAL,
    PRIMARY KEY(ISBN)
);

To remove a relation from an SQL database, we use DROP TABLE command:

DROP TABLE book;

DDL: constraints

Besides PRIMARY KEY, we can specify more constraints for relations. For example,

  • NOT NULL: By default, a column can be a null value. We can add NOT NULL attached to a column definition.
  • DEFAULT: The DEFAULT clause specifies a default value to use for the column if no value is explicitly provided by the user when doing an INSERT[5].
  • UNIQUE: It is similar to PRIMARY KEY[6].

Another important constraint on the contents of relations are called foreign-key constraints. Consider the attribute author_name of the book relation. It would not make sense for a tuple in book have a value for author that does not correspond to a name in the author relation. In this case, the attribute author_name in book is a foreign key, referencing author.

CREATE TABLE author(
    name TEXT PRIMARY KEY,
    birth TEXT,
    country TEXT
);

Suppose every author has a unique name.

CREATE TABLE book(
    ISBN TEXT,
    title TEXT NOT NULL,
    author_name TEXT,
    price REAL DEFAULT 9.9,
    PRIMARY KEY(ISBN),
    FOREIGN KEY(author_name) REFERENCES author(name)
);

Basic DML

To insert data into a relation, we either specify a tuple to be inserted or write a query whose result is a set of tuples to be inserted. In this book, we only use the simplest form[7]:

INSERT INTO author(name, birth, country) VALUES ('Lu Xun', '1881-09-25', 'China');

INSERT INTO book(ISBN, title, author_name, price)
VALUES ('9781500946654', 'A Madman''s Diary', 'Lu Xun', 10.5);

SELECT command is to issue a query. Let us consider a simple query: "find the names of all in books",

SELECT title
FROM book;

The WHERE clause allows us to select only those rows in the result relations of the FROM clause that satisfy a specified predicate. For example, "find the names of books written by Lu Xun":

SELECT title
FROM book
WHERE author_name = 'Lu Xun';

SQL allows the use of logical connectives AND, OR and NOT in the WHERE clause. The operands of the logical connectives can be expressions involving the comparison operations, including <, <=, >, >=, =, <>[8]. For example,

SELECT title
FROM book
WHERE author_name = 'Lu Xun' AND price > 20;

Sometimes, we may want to retrieve all attributes. The asterisk symbol "*" can be used in the SELECT clause to denote "all attributes"[9]:

SELECT *
FROM book;

UPDATE statement can be used if we wish to change a value in a tuple. Suppose we would like to increase the price of the book above by 5 percent,

UPDATE book
SET price = price * 1.05
WHERE ISBN = '9781500946654';

DELETE statement is used to delete tuples. For example,

DELETE FROM book
WHERE ISNB = '9781500946654';

If the predicate is not given, it means deleting all tuples while keeping the relation itself.


[1] This is because most undergraduate database courses only cover the application aspect of the database systems, while the theory aspect is rarely involved.

[2] Different from standard SQL, data types in SQLite are advisory rather than mandatory.

[3] Flexible type is a feature of SQLite, and you don't have to memorize the detailed rules.

[4] https://en.wikipedia.org/wiki/ISO_8601

[5] If there is no explicit DEFAULT clause attached to a column definition, then the default value of the column is NULL.

[6] PRIMARY KEY cannot be NULL in SQL standard, but it can be NULL in SQLite.

[7] String should be enclosed in single quotes in SQL standard, but double-quoted String literals are accepted in SQLite.

[8] Inequality in SQL standard is <>, but most database implementations, including SQLite, also support !=.

[9] SELECT * is considered a bad style in production code.