A Sequel to SQL
An Introduction to SQL should give you a primer to the “Q” part SQL; or how to query. This post is a SQL sequel; how to add and alter data.
Top level review of basic SQL structure
SELECT The columns you are SELECTing FROM The name of the table FROM which the data is from JOIN ON JOINing a second table based ON a shared column {optional} WHERE Condition the selection should follow {optional} GROUP BY Grouping non aggregated data {optional} HAVING The where equivalent for aggregated selections {optional} ORDER BY You will never guess what this does {optional} LIMIT LIMITs the number of rows {optional}
One could go years as an analyst only reading data, and never writing to the database. I know personally, I was anxious about first getting write access. Hearing stories of people accidentally deleting tables (most are hoaxes), or a wrong keystroke costing millions. Eventually, you will find work to be easier when you can create and alter your own tables. A common example being aggregated tables.
Let’s Begin our SQL Sequel
For this tutorial I am using PostgreSQL 9.4.7, to check your version of SQL you can use the following command.
select version()
Creating New Tables
To creating a table from existing data use the “CREATE TABLE” command.
CREATE TABLE KPI_agg AS ( SELECT date_trunc('month', date_field) as Month ,User_class ,COUNT(myid) as Count_field ,AVG(KPI) AS AVG_KPI FROM Your_Table GROUP by 1,2 );
Or if the data does not already exist in your database create a Table from a CSV
create table KPI_agg ( Month timestamp, User_class varchar, Count_field bigint, AVG_KPI numeric, );
Some different datatypes one can use Postgres datatypes. Varchars should be avoided when possible, but sometimes they are your only option. Altering a column can be done using the ALTER command.
ALTER TABLE KPI_agg ALTER COLUMN User_class TYPE text USING (User_class::text);
After creating an empty table you can upload the CSV.
\copy KPI_agg from 'filelocation/datatoupload.csv' DELIMITER ',' CSV HEADER;
The above line copies the data from your local computer uses the delimiter of a comma, stored as a .csv and has a header. Some SQL client’s don’t allow you to do a \copy command, so you may need to use psql / a SQL shell.
New Columns
To add a new column use the ALTER and ADD commands
ALTER TABLE KPI_agg ADD COLUMN NEW_Class text;
Then you can SET this column’s information based on existing data.
UPDATE KPI_agg SET NEW_Class = 'New Name' WHERE User_class = 'Old Name'
Later on, you can update the table by only inserting data that meets a certain condition
insert into KPI_agg(Month,User_class,Count_field,AVG_KPI) ( SELECT date_trunc('month', date_field) as Month ,User_class ,COUNT(myid) as Count_field ,AVG(KPI) AS AVG_KPI FROM Your_Table WHERE date_field::date >= '2017-03-01' and date_field::date <= '2017-03-31' GROUP by 1,2 );
Removing Data
Drop the entire table
DROP TABLE KPI_agg;
Delete rows that match a condition
DELETE FROM KPI_agg where User_class = 'Undesired_class';
Don’t forget to create indexes on your new tables
CREATE INDEX myidx ON Your_Table (myid);
A guide on Indexes is outside the scope of this post, see Efficient Use of PostgreSQL Indexes for more information.
Leave a Reply
Want to join the discussion?Feel free to contribute!