SQL Tutorial Part II

ArticleCategory: [Choose a category, translators: do not translate this, see list below for available categories]

Software Development

AuthorImage:[Here we need a little image from you]

[Photo of the Author]

TranslationInfo:[Author + translation history. mailto: or http://homepage]

original in fr Manuel Soriano

fr to en Dallas L. Gale

AboutTheAuthor:[A small biography about the author]

Manuel is a software consultant and his passion is Linux. He did already worked with computers when one single computer filled a whole room and had 32K Ram.

Abstract:[Here you write a little summary]

In part II of the tutorial we review several basic statements of the SQL language.

ArticleIllustration:[One image that will end up at the top of the article]

[Illustration]

ArticleBody:[The main part of the article]

Introduction

This is the second part of the SQL course. In this part, we will focus on various SQL commands from the creation of a table, its modification, and/or deletion.

We will focus, above all, on the SELECT command, which is, in my judgment, the most important of all.

I hope that this second part will be enjoyable and instructive for you.

Creating a table

Like we have seen in the first installment, to create a table, the CREATE command with the TABLE qualifier is used. In effect, the CREATE command serves to create:

The CREATE command indicates to the manager that something is going to be created. Later we will cover what and how.

What we are interested in now is the creation of a table:

Syntax

CREATE TABLE name ( column type [DEFAULT value] [NOT NULL], ...
      [INHERITS (inherits, ...)]
      [CONSTRAINT constraints CHECK (test), CHECK (test)] );

Where:
name: Is the name that is given to the table and how it is referenced by any command
Column: Is the name of the column
Type: Is the type of data(varchar, char, int, date, time, timestamp), Postgres has other types of data, but they aren't compatible with ANSI SQL
Value: The value that it will have by default
Inherits: This is Postgres's own. It defines an inheritance from another table. This will create an entity that contains the columns of the table that we are creating and the ones inherited
Nom_cons: This defines a rule of integrity to meet each time that a row is modified
Test: Conditions to check
Example:
CREATE TABLE countries (
         cod_country integer NOT NULL,
         name varchar(30))
   CONSTRAINT cod_raro CHECK (cod_country > 0 AND cod_country < 154);

With this example we have created a table of countries. Each time that we insert a new row, these conditions will be meet:

Note

What does NULL mean? In SQL there exist two states, data and no data. In effect, we can be interested that a field of rows doesn't have data, like zero (0) as well as the spaces are data. SQL introduces the concept of NULL and works with it. A practical example:

I have a table of bills, with the following fields: customer, value, date_issued, date_paid

When I create the row, I will insert the data: customer, value, date_issued

I will leave the date paid null; in this way I will know which bills need to be paid with the following command:

SELECT * FROM bills WHERE date_paid IS NULL;

One can claim that a zero (0) in the field date_paid would do the same thing. It's true, except that zero (0) is not a date, and it prevents creating date_paid of date type and applying appropriate date functions.

Examples of creating with NULL:

insert into countries values (15, NULL);
Or:
insert into countries (cod_country) values (27);

The absence of the "name" field implies that this will receive a NULL value.

Modifying a table

In PostgreSQL, modification only deals with adding new columns.

ALTER TABLE table ADD name type;

Where:

Table Name of the table to modify
Name Name of the column to add
Type Type of data (see CREATE TABLE)

Inserting data into a table

Now we will insert data into our table:

SYNTAX:
INSERT INTO table [(column, column, ...)] VALUES (value-1, value-2,
...)
or else:
INSERT INTO table [(column, column, ...)] SELECT ....

As we have seen, there are two forms of inserting data into a table, either line by line or the result of a sub-select that can return one or more rows.

When we insert lines into a table, we will ALWAYS put data in all the columns including those that we don't mention, these are created with NULL values.

If, in the command, we don't specify which columns we are going to fill, it is understood that we are going to give data to all, example:

INSERT INTO countries VALUES (34, 'Spain');
This would be incorrect:
INSERT INTO countries VALUES (34);
But, this would be correct:
INSERT INTO countries (cod_country) VALUES (34);

I recommend that a command embedded in "C" programs or in database functions ALWAYS specify the columns that we are going to touch, in effect, if we add a new column to the table (ALTER TABLE), the next insert an error will occur, Example:


INSERT INTO countries VALUES (34, 'Spain');

INSERT 18301 1

ALTER TABLE countries add population integer

INSERT INTO countries VALUES (34, 'Spain');

This will give a parser error, given that the population data is missing

Note

PostgreSQL, doesn't generate an error. It creates the line with the "population" field NULL. This is only a particularity of PostgreSQL, any other SQL manager would give an error.

We still have another type of INSERT, which is fed from a sub-select.

This type of insert is used very often to create temporary tables or tables to carry out a concrete task of speculative calculations.

The part replaced is that which touches the data itself, this comes from the SELECT instructions that were carried out previously and the inserting of the data. The instruction SELECT can return one or more rows, this instruction SELECT has the same restrictions of the same SELECT.

Selecting data

I wanted to arrive at this point! :-))

We have covered required SQL commands, the SQL language without SELECT would be like beans without sausage.

The SELECT command allows us to access data, but with the reservation that searches, unions of tables, functions with the data, and with the search rules can be carried out.

An example:

select * from countries;
Another example:
SELECT a.name, SUM(population)
    FROM countries a, states b, counties c
    WHERE b.cod_country = a.cod_country
        AND (c.cod_country = b.cod_country
        AND c.state_code = b.state_code)
        AND population IS NOT NULL
        GROUP BY a.name
        ORDER BY sum ASC;

Let me explain, I requested the populations of all of the countries ordered by the population in an ascending order.

For this I have added a new column (population) to the counties table.

This would be like this:
create table counties (cod_country int, 
                        state_code int, 
                        county_code int,
                        county_name varchar(60),
                        population int);
insert into counties values (1, 1, 1, 'Country 1, State 1, County 1',
5435);
insert into counties values (2, 1, 1, 'Country 2, State 1, County 1',
7832);
insert into counties values (3, 1, 1, 'Country 3, State 1, County 1',
4129);
insert into counties values (1, 2, 1, 'Country 1, State 2, County 1',
76529);
insert into counties values (2, 2, 1, 'Country 2, State 2, County 1',
9782);
insert into counties values (3, 2, 1, 'Country 3, State 2, County 1',
852);
insert into counties values (1, 3, 1, 'Country 1, State 3, County 1',
3433);
insert into counties values (2, 3, 1, 'Country 2, State 3, County 1',
7622);
insert into counties values (3, 3, 1, 'Country 3, State 3, County 1',
2798);
insert into counties values (1, 1, 2, 'Country 1, State 1, County 2',
7789);
insert into counties values (2, 1, 2, 'Country 2, State 1, County 2',
76511);
insert into counties values (3, 1, 2, 'Country 3, State 1, County 2',
98);
insert into counties values (1, 2, 2, 'Country 1, State 2, County 2',
123865);
insert into counties values (2, 2, 2, 'Country 2, State 2, County 2',
886633);
insert into counties values (3, 2, 2, 'Country 3, State 2, County 2',
982345);
insert into counties values (1, 3, 2, 'Country 1, State 3, County 2',
22344);
insert into counties values (2, 3, 2, 'Country 2, State 3, County 2',
179);
insert into counties values (3, 3, 2, 'Country 3, State 3, County 2',
196813);
insert into counties values (1, 1, 3, 'Country 1, State 1, County 3',
491301);
insert into counties values (2, 1, 3, 'Country 2, State 1, County 3',
166540);
insert into counties values (3, 1, 3, 'Country 3, State 1, County 3',
165132);
insert into counties values (1, 2, 3, 'Country 1, State 2, County 3',
0640);
insert into counties values (2, 2, 3, 'Country 2, State 2, County 3',
65120);
insert into counties values (3, 2, 3, 'Country 3, State 2, County 3',
1651462);
insert into counties values (1, 3, 3, 'Country 1, State 3, County 3',
60650);
insert into counties values (2, 3, 3, 'Country 2, State 3, County 3',
651986);
insert into counties values (3, 3, 3, 'Country 3, State 3, County 3',
NULL);
commit work;

Now, we can't do it with ALTER TABLE, but should use UPDATE and I haven't explained it, so use the "cut & paste" and everyone will be happy :-))

Now we can execute the QUERY, and the results should be:

name     |    sum
- ---------+-------
country 1| 705559
country 2|1212418
country 3|2804018
(3 rows)
Now we will verify this:

select sum(population) from counties where cod_country = 1;

Results:
   sum
- ------
791986
(1 row)

!!!!!! One Difference !!!!!!

Let's look at the states table, it is missing the state 3, we do:

INSERT INTO states VALUES (3, 1, 'State 3, Country 1');
INSERT INTO states VALUES (3, 2, 'State 3, Country 2');
INSERT INTO states VALUES (3, 3, 'State 3, Country 3');
and repeat the command, resulting in:
name     |    sum
- ---------+-------
country 1| 791986
country 2|1872205
country 3|3003629 

We are missing the state 3 from each country.

Now for those that have been lost, remember that the joins between tables were EXACT, it only extracts data if the condition are exact.

Let's look at the first part of the WHERE: b.cod_country = a.cod_country

This means that I join the country table with the states where the country code is equal, now remember the country data that we inserted:

Don't execute this, is only used as an illustration.

create table countries (cod_country integer, name varchar(30));
insert into countries values (1, 'country 1');
insert into countries values (2, 'country 2');
insert into countries values (3, 'country 3');
commit work;
Now the states data:
create table states (state_code int, 
                        cod_country int, 
                        state_name varchar(30));
insert into states values (1, 1, 'State 1, Country 1');
insert into states values (2, 1, 'State 2, Country 1');
insert into states values (1, 2, 'State 1, Country 2');
insert into states values (2, 2, 'State 2, Country 2');
insert into states values (1, 3, 'State 1, Country 3');
insert into states values (2, 3, 'State 2, Country 3');
commit work;
All of the states 3 from every country are missing, but in the country table the corresponding data of the states 3, so it is normal that we don't add the country data with the states with code 3 to be discarded in the second part where:
        AND (c.cod_country = b.cod_country
        AND c.state_code = b.state_code)

The state exists in the counties table but NOT in the states table.

For those that haven't understood, take an aspirin, go walk your dog (if you don't have a dog, go walk without a dog), breath a little fresh air and comeback to begin from the first exercise.

Is is very important to understand how the joining of data is carried out, without this the development that we do can have unpredictable results.

Let's shift gears and start with the SELECT command syntax.

SELECT [DISTINCT] expression1 [AS nom-attribute] {, expression-i [as
nom-attribute-i]} 
       [INTO TABLE classname] 
       [FROM from-list] 
       [WHERE where-clause] 
       [GROUP BY attr_name1 {, attr_name-i....}] 
       [ORDER BY attr_name1 [ASC | DESC ] [USING op1 ] {,
nom-attribute-i...}] 
       [UNION {ALL} SELECT ...] 

Step by step:
DISTINCT: This is to eliminate duplicate row returned
expression1: what we want returned, normally a column for a table from the list FROM
AS nom-attribute: an alias for the name of a column, example:
manu=> select cod_country from countries;
cod_country
- -----------
          1
          2
          3
(3 rows)          
Now with the alias:
manu=> select cod_country as countr from countries;
countr
- ------
     1
     2
     3
(3 rows)
INTO TABLE: allow the resulting row to be inserted directly into another table (see INSERT ... SELECT...)
FROM: list of input tables
WHERE: selection statement (union and selection criteria).
GROUP BY: group criteria, certain functions that are used in expressions might need grouping
ORDER BY: Order criteria of the returned rows, ASC ascending order, DESC descending order, USING if the column that defines the order is not in the list (expression)
UNION ALL SELECT: This says to add to the results of the first SELECT this second SELECT that can be different tables, but return the same number of columns.

We have seen that the commands SELECT not only returns items from the DB but it can also modify them:


SELECT SUM(salary * 1.1) - SUM(salary) AS increment FROM employees;

This will return the pay increment of a pay increase more then 10%.

Let's look at the functions that we have available:

COUNT(): returns the number of rows that aren't NULL
SUM(): returns the total sum of a column of numbers
AVG(): returns the average of a column of numbers
MIN(): returns the minimum value in a column
MAX(): returns the maximum value in a column
FLOAT(int): returns a FLOAT8, FLOAT(12345)
FLOAT4(int): returns a FLOAT4, FLOAT4(12345)
INT(float): returns an INT from a FLOAT/4, INT(123.456)
LOWER(text): returns lower-case text
UPPER(text): returns upper-case text
LPAD(text, long, char): fill to the left with char the length of long the column text
RPAD(text, long, char): fill to the right with char the length of long the column text
LTRIM(text, char): remove from the left of text all of the characters char
RTRIM(text, char): remove from the right of text all of the characters char
POSITION(string IN text): extract from text the position of string, but IT DOESN'T WORK
SUBSTR(text,from[,to]): extract the substring of text, from the position from to the position of to or the end of the string
DATETIME(date, hour): converts a datetime format to date (YYYY-MM-DD) and an hour (HH:MM)

These were a few of the functions that exist in SQL, these are those that are defined in ANSI SQL and are also present in Postgres95.

Details of WHERE

Until now, we have seen that the section of WHERE from the SELECT we put things like:

AND column = value

This is a small example of what we put or combine:

AND, OR, NOT, IN, IN ALL, =, !=, >, <, (SELECT....), LIKE also the parenthesis are relevant, example:

WHERE 
   column IN (SELECT DISTINCT column FROM table WHERE ....)
   column IN ('value1','value2','value3',...)
   (column = 'value' and column = 'other_value' OR column != 'value')
!= the same as saying NOT EQUAL

LIKE permits searching for a string within a column with wildcards:
WHERE column LIKE '%Pepito%'
The % is a wildcard, in the example, it's true if "Pepito" is in the string
WHERE column LIKE 'Pepito%'
will be true if "Pepito" is at the beginning of the string
WHERE column LIKE '%Pepito'
will be true if "Pepito" is at the end of the string

There's not enough time here to list all of the possible options used with WHERE, the limit is in the imagination of the programmer or the limits of the individual process parser.

Now we can leave the SELECT command and concentrate on the last two.

UPDATE command

The UPDATE command allows modification of one or more rows, depending on the defined condition in WHERE

SYNTAX:
UPDATE table SET column-1 = expression-1 
                 [, column-i = expression-i] 
       [WHERE condition] 
Where:
table: the table to modify, only one table can be modified at a time
column: the column that will be modified
expression: the value that the column will receive. This value can be static or the results of a function
condition: the condition that define the modification boundaries, here the defined rules are applied for the SELECT

DELETE command

The DELETE command allows modification of one or more row of a table.

SYNTAX
DELETE FROM table
       [WHERE condition] 
Where:
table: the table where rows will be deleted, only one table can be deleted from at a time
condition: the condition that define the modification boundaries, here the defined rules are applied for the SELECT
NOTE: without the WHERE ALL of the rows will be deleted from the table

References

SQL: Introduction to SQL. Installation of PostgreSQL, first article of the series.