SQL Sequel

A Sequel to SQL

An Introduction to SQL should give you a primer to the “Q” part SQL; or how to query. This post is a SQL sequel; how to add and alter data.

 

Top level review of basic SQL structure

SELECT 
 The columns you are SELECTing
FROM 
 The name of the table FROM which the data is from
JOIN ON 
 JOINing a second table based ON a shared column {optional}
WHERE 
 Condition the selection should follow {optional}
GROUP BY 
 Grouping non aggregated data {optional}
HAVING 
 The where equivalent for aggregated selections {optional}
ORDER BY 
 You will never guess what this does {optional}
LIMIT 
 LIMITs the number of rows {optional}

 

One could go years as an analyst only reading data, and never writing to the database. I know personally, I was anxious about first getting write access. Hearing stories of people accidentally deleting tables (most are hoaxes), or a wrong keystroke costing millions. Eventually, you will find work to be easier when you can create and alter your own tables. A common example being aggregated tables.

 

Let’s Begin our SQL Sequel

For this tutorial I am using PostgreSQL 9.4.7, to check your version of SQL you can use the following command.

 

select version()

 

Creating New Tables

To creating a table from existing data use the “CREATE TABLE” command.

 

CREATE TABLE KPI_agg AS 
(
 SELECT
 date_trunc('month', date_field) as Month
 ,User_class
 ,COUNT(myid) as Count_field
 ,AVG(KPI) AS AVG_KPI
 FROM Your_Table
 GROUP by 1,2
);

 

Or if the data does not already exist in your database create a Table from a CSV

create table KPI_agg
(
  Month timestamp,
  User_class varchar,
  Count_field bigint,
  AVG_KPI numeric,
);

 

Some different datatypes one can use Postgres datatypes. Varchars should be avoided when possible, but sometimes they are your only option. Altering a column can be done using the ALTER command.

 

ALTER TABLE KPI_agg ALTER COLUMN User_class TYPE text USING (User_class::text);

 

After creating an empty table you can upload the CSV.

\copy KPI_agg from 'filelocation/datatoupload.csv' DELIMITER ',' CSV HEADER;

The above line copies the data from your local computer uses the delimiter of a comma, stored as a .csv and has a header. Some SQL client’s don’t allow you to do a \copy command, so you may need to use psql / a SQL shell.

 

New Columns

To add a new column use the ALTER  and ADD commands

ALTER TABLE KPI_agg
ADD COLUMN NEW_Class text;

Then you can SET this column’s information based on existing data.

 

UPDATE KPI_agg
SET NEW_Class = 'New Name'
WHERE User_class = 'Old Name'

 

Later on, you can update the table by only inserting data that meets a certain condition

 

insert into KPI_agg(Month,User_class,Count_field,AVG_KPI) 
(
SELECT
   date_trunc('month', date_field) as Month
  ,User_class
  ,COUNT(myid) as Count_field
  ,AVG(KPI) AS AVG_KPI
FROM Your_Table
WHERE date_field::date >= '2017-03-01'
  and date_field::date <= '2017-03-31'
GROUP by 1,2
);

 

Removing Data

 

Drop the entire table

DROP TABLE KPI_agg;

 

Delete rows that match a condition

DELETE FROM
KPI_agg
where User_class = 'Undesired_class';

 

Don’t forget to create indexes on your new tables

CREATE INDEX myidx ON Your_Table (myid);

A guide on Indexes is outside the scope of this post, see Efficient Use of PostgreSQL Indexes for more information.

 

 

0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published. Required fields are marked *