Showing posts with label stevechol Database. Show all posts
Showing posts with label stevechol Database. Show all posts

Friday, 6 February 2009

PostgreSQL - First Things First

After installing PostgreSQL the installation is available to access only from the local machine. If however you have installed it on a remote server or VM you will initially be unable to connect. This is because (sensibly) PostgreSQL comes with a highly restrictive table of allowed connections. Therefore the first thing to do is give yourself access to your PostgreSQL database from over the local network.

These are the steps to follow to set this up:

From the machine which contains the PostgreSQL installation, launch the pGAdmin III application. If you don't have this you should run Application Stack Builder included with the PG installation and download it.

Once pGAdmin III is launched it will look something like this:



From the File menu select the option 'Open pg_hba.conf...' This will open a dialog asking you the location of this file. Navigate to your data directory and select the pg_hba.conf file.

You will then see this:



Double Click the third line down in the list and populate the window as follows:



Press OK to close the input window, close the list window, answering Yes to the prompt to Save changes and close pgAdmin III.

You now need to Restart the PostgreSQL server process using the shortcut installed by the PG installer.

Your PG database will now be accessible from your Local Area Network (assuming you use the IP Range 192.168.0.X, if you use a different IP Range then substitute it when editing the pg_hba.conf file.)

www.artenscience.co.uk
Quality Commercial Software and Custom Software Development


Arten Science RSS Feed

Thursday, 5 February 2009

The Most Appropriate Open Source Database ?

An upcoming development of mine ideally needs to store data in a powerful, multi user relational database system. Not a problem - I'm a big fan of Oracle and use it daily. The fly in the ointment this time is that due to the nature of the software the database system needs to be inexpensive, ideally free.

There are several 'free' database systems to choose from. the two most well known and highly rated are MySQL and PostgreSQL. Of the two MySQL seems to be great for 'quick and dirty' data storage while from the research I have done PostgreSQL seems to be far more advanced and with features nearer to what I would expect from Oracle.

I have experience of using MySQL, both locally on my Mac and also as part of the Plesk installation on my dedicated web server. It seems straight forward enough and would probably suit my needs, but the more I read the more drawn towards PostgreSQL I am becoming.

A big benefit of both of these database systems is the ability to install the server software on my Macbook Pro, a big plus when traveling, as Oracle doesn't install on a Mac and requires a seperate VM on which to run.

A very fast download and equally fast installation on my Mac left me very surprised. PostgreSQL really feels very professional and even comes with some nice admin tools. I tried the same exercise on one of my virtual Windows 2003 servers with the same result.

I have yet to do any extensive testing with PostgreSQL but I have already ruled out MySQL. Far from being free, it cannot be used with my commercial application without I first purchase a license. Apparently MySQL is released under the very restrictive GPL license and this effectively means that if I use MySQL as my data store then my commercial application falls under the same licensing - in other words I have to make it Open Source. Which I am not prepared to do.

PostgreSQL on the other hand is released under the far simpler BSD license. Which you can read below: (Click to enlarge the image)



This is a much better fit for my purpose. I'm not sure why MySQL appears to be so much more popular given the restrictions of the licensing ? Maybe it is just easier to use ? Maybe people do not realise how restrictive the licensing is ?

Anyway, it's full steam ahead with evaluating PostgreSQL then. The elephant takes the lead :-)

Note: Another big plus point in favour of PostgreSQL from my point of view is that native drivers are available for my current development platform of choice, REALbasic. No middleware needed ... If PostgreSQL turns out to be as good as I suspect it is, it may well replace Oracle in some of my future higher value commercial applications.



www.artenscience.co.uk
Quality Commercial Software and Custom Software Development


Arten Science RSS Feed

Wednesday, 9 April 2008

Oracle: Temporary Tables

Temporary Tables can be extremely useful when manipulating large amounts of data within Oracle, for storing totals or aggregate values. To setup a temporary table:

create global temporary table temp_table
(description VARCHAR2(30), count NUMBER)
on commit delete rows

or

create global temporary table temp_table
(description VARCHAR2(30), count NUMBER)
on commit preserve rows

Temporary tables are visible only to the session that inserted data into it. The data will be deleted at the end of the transaction (on commit delete rows) or when the session is terminated (on commit preserve rows). While they exist and are populated they can be used in the same way as any standard Oracle table.

Beware so long as you live, of judging people by appearances." - La Fontaine

Oracle: Locking 101

Locking data is essential to prevent more than one user altering data in the database at the same time. Oracle can lock one row, multiple rows, or the whole table. Oracle attempts to lock at the lowest level possible to ensure minimal impact on the database. It is possible using LOCK TABLE to lock manually lock a complete table. I have personally never needed to use this statement and it is certainly not recommended for most applications with more than one concurrent user.

Querying the database will *never* produce a lock using Oracle, and unlike some databases locking will not stop Oracle producing consistent queries, as the query works from the UNDO tablespace. This is the technology that allows a long running query started at 1200 to finish at 1400 and show the results of that query with all the data as it was when the query was started at 1200, regardless of changes to the data during the time the query was running.

The first user to request a lock, gets the lock and other users are queued on a FIFO basis. If the design of your application requires that control is returned to the user immediately in the event of the requested data being unavailable for locking then the NOWAIT parameter can be used.

It is important to realise that all locks are released when the transaction is terminated via either a COMMIT or ROLLBACK, whether issued explicitly or implicitly.

The table below shows the different kinds of table lock modes available:

Row Share
Allows connect access to the whole table, stops users locking the entire table for EXCLUSIVE.

Row Exclusive
As Row Share, but stops other users locking in SHARE mode
(used by DML such as INSERT, UPDATE and DELETE)

Share
Permits concurrent queries but prevents any updates to the table.
(used by CREATE INDEX)

Share Row Exclusive
To query the whole table, and allow others to do so, but prevent them locking the table in SHARE mode or doing any updates.

Exclusive
Permits queries, prevents any DML.
(used by DROP TABLE)

To monitor the current locks in Oracle you can use the Enterprise Manager or alternatively query the following views:

V$SESSION
V$TRANSACTION
V$LOCK
V$LOCKED_OBJECT

You can obtain explicit locks on individual rows using the select ... for update statement:

select * from ords_data where ords_ref = ‘SJC001’ for update;

If somebody else attempts to get a lock on that row their session will either wait or if using the NOWAIT parameter control will be returned to them immediately with an Oracle Error: ORA-00054. You can trap for this within your application and bring up a message explaining that the record is locked and to try again later.

"A book is a version of the world. If you do not like it, ignore it; or offer your own version in return." - Salman Rushdie

Tuesday, 1 April 2008

Roll Your Own :-)

I’ve just finished working on a System Migration project that was one of the more complex I’ve done. The original application was running on Oracle 9i on Unix. So far so good. However the documentation for the (over 1000) tables was poor to say the least and in addition the system had been heavily modified in the 20 or so years it had been in existence. The application was never initially designed for Oracle and used rule based indexing.

On to the point of the article . It can be useful in circumstances such as these to explore the data from code rather than relying purely on SQL query tools. I created a couple of tools to help with this job, one being ODBCDataTool which I wrote in an old, but effective language called Omnis7. Despite it’s age Omnis is still a fast and flexible cross platform development tool and works well even under the latest Operating Systems such as Vista.



What I required from this tool was a nice simple interface to the tables and columns within the database and then to create schemas based on these tables within Omnis. I could then switch to code and select, combine, loop and manipulate data to my hearts content. It’s been a few months since I have used Omnis and I enjoyed using it again. The application took an evening to write and worked perfectly. As I designed it to work against any database which has an ODBC driver, it will probably surface again in the future as the need arises.

"When life seems chaotic, you don't need people giving you easy answers or cheap promises. There might not be any answers to your problems. What you need is a safe place where you can bounce with people who have taken some bad hops of their own." - Unknown

Wednesday, 5 March 2008

Oracle Top 'N' Queries

If we wanted to query all our Customers from an Oracle database, show the Account Code, Name and Balance and order by the Balance in descending order we would use something like this:

select accm1_cus_sup as Code, accmname as Name, accmledg_balance as Balance
from accm_data
where accm1_co_site = '01'
and accm1_rec_type = 1
order by accmledg_balance desc

What if we wanted to retrieve only the Top 10 Customers who owe us the most money ? The best way of doing that is as follow:

select * from
(
select accm1_cus_sup as Code, accmname as Name, accmledg_balance as Balance
from accm_data
where accm1_co_site = '01'
and accm1_rec_type = 1
order by accmledg_balance desc
)
where rownum <= 10

What we have done here is made our initial query into a subquery and used the ROWNUM pseudocolumn to limit the rows returned by the query. This method uses a table scan initially and the rows outside of the rownum count are discarded, giving you your Top 10 Customers by account balance.
Following on but using a slightly different example, what if we wished to display all our Customers in Account Code order, retrieving only ten at a time, maybe within a thin client or web page scenario ? This is how that can be accomplished:

select *
from ( select X.*, ROWNUM rownumber
from
(
select accm1_cus_sup as Code, accmname as Name
from accm_data
where accm1_co_site = '01'
and accm1_rec_type = 1
order by accm1_cus_sup
)
X
where ROWNUM <= 10)
where rownumber >= 0

Obviously when the user indicates they require the next 10 records you would alter the parameters within the final two WHERE statements accordingly. An important point here is that if the column you are using to order the data is not unique, you must append a unique identifier to the ORDER BY statement. This is usually done by adding TABLENAME.ROWID to the end of the ORDER BY statement.

"When you make a mistake, there are only three things you should ever do about it: admit it, learn from it, and don't repeat it." - Paul ‘Bear’ Bryant

Sunday, 23 December 2007

Oracle Views

An Oracle database consists of a vast amount of underlying metadata that stores information about the database itself. This metadata is extremely cryptic and as such Oracle have provided ‘user friendly’ views to enable the user or DBA to view this data in a meaningful format.

There are two types of Oracle Views:

Data Dictionary Views
Dynamic Performance Views

Data Dictionary Views begin with DBA_ , USER_ or ALL_ . The difference is as follows: DBA_ Views show information on all the tables in the database, ALL_ shows information on all the tables that you own or have been granted access to. USER_ shows information on all the tables that you own.

There are some differences between Data Dictionary Views and Dynamic Performance Views:

Data Dictionary Views usually have plural names (DBA_TABLES), Dynamic Performance Views, singular (V$DATAFILE)
Data Dictionary Views are only available whilst the database is open, some V$ Views are available when the DB is not fully open
The data in Data Dictionary Views is usually UPPERCASE whilst V$ data is usually lowercase
The data in Data Dictionary Views is static and not cleared when the DB is restarted, V$ data is cleared after a restart

Examples of Data Dictionary Views

DBA_TABLES
DBA_TABLESPACES
DBA_USERS
DBA_VIEWS

Examples of Dynamic Performance Views

V$VERSION
V$OPTION
V$DATABASE
V$SQL

"The truth is not simply what you think it is; it is also the circumstances in which it is said, and to whom, why and how it is said." - Vaclav Havel

Oracle: COMMENT ON ...

A little known / little used feature of Oracle is the COMMENT ON TABLE / COMMENT ON COLUMN facility. Basically this allows each table and column to have a descriptive text applied to it. It is common to have descriptive text for both tables and columns contained within your application schema, or in a separate data dictionary type document, but it would be nice for that information to be available via external tools that access and / or modify the database without referring to separate documentation.

The feature is used in the following way:

COMMENT ON TABLE mytable IS ‘This is my Master Table’

COMMENT ON COLUMN mytable.primarykey IS ‘Unique ID from Sequence SEQ_MASTER’

The comments can be displayed by querying the following views:

DBA_TAB_COMMENTS
DBA_COL_COMMENTS
ALL_TAB_COMMENTS
ALL_COL_COMMENTS
USER_TAB_COMMENTS
USER_COL_COMMENTS

If you have a schema within your application that contains comments on both the table and column levels it should be very straightforward to write a small program to loop through your schema data and write those comments into Oracle. It could be a couple of hours well spent.

"The excellence of a gift lies in its appropriateness rather than in its value." - Charles Dudley Warner

Friday, 7 September 2007

Types of SQL

There are three basic types of SQL:

DDL, DCL and DML.

DDL:        Data Definition Language
                Used By:                DBA
                Examples:        CREATE, ALTER, DROP

DCL:        Data Control Language
                Used By:                DBA
                Examples:        GRANT, REVOKE

DML:        Data Manipulation Language
                Used By:                USERS, DBA
                Examples:        SELECT, INSERT, UPDATE, DELETE

"Creativity can solve almost any problem. The creative act, the defeat of habit by originality, overcomes everything." - George Lois

Wednesday, 11 July 2007

Stored Procedures: Advantages

When accessing a SQL database such as Oracle there are generally two ways of implementing your data access code, either by creating the SQL statements as a string and passing them to the database from your application, or by using stored procedures. A stored procedure on Oracle is created using the language PL/SQL. A stored procedure is an executable piece of code thats sits within the database and is called by your application, it is generally created and written outside of your application using tools supplied by the database vendor.

I am convinced that stored procedures offer valuable benefits over creating the SQL within your application. For future projects I will be spending far more time with stored procedures than I do currently. Below I have listed the main reasons that I believe stored procedures (in many circumstances) are the more beneficial and efficient way of working:

PL/SQL Code is Stored within the Database
This means that the code once written and installed within the database is available to all users of that database, in addition performance is likely to be considerably better than code sent from the application.

PL/SQL Makes it Easy to Process Huge Amounts of Data
Using the ‘cursor for’ loop you can query large amounts of database rows and cycle through them iteratively.

PL/SQL Allows you to Abstract the Logic away from the Application
By putting logic in the database we can ensure that even people who access the database without your client application will still have the data processed or presented according to the system requirements. This feature often requires the use of Triggers.

PL/SQL Supports Packages
A Package is a logical group of procedures or functions that work together. As they are all loaded into memory together performance is enhanced. Standard procedures are only loaded into memory when called.

and my personal favourite:

PL/SQL Code is Portable
The PL/SQL code can be deployed on any platform on which Oracle runs, so a procedure developed on a Unix or Linux based system can be moved over to run on a Win2K3 based system with no alterations. Try doing that if the logic is written in VB.NET instead !

Other DBA’s and Developers may have additional reasons that they use PL/SQL, the above list is my opinion only of the features that I consider the most important to me when considering PL/SQL.

Live Long and Prosper - Dr Spock

Tuesday, 7 November 2006

SQL 101 7: SQL Background

This is the seventh and final post in my series designed to teach people the basics of SQL and it’s use in interrogating databases.

SQL Background Information
SQL is a declarative language, not a procedural language.  This effectively means that you tell the database what you want doing, not specifically how to actually do it. This is a similarity shared by several general fourth generation languages. Examples of procedural languages are Pascal, C++ and Basic, as well as old timers such as COBOL and FORTRAN.

The SQL Language consists of three parts:
                DDL:                Data Definition Language
                (Used to Design the tables and columns)
                DML:                Data Manipulation Language
                (Insert, Delete, Update & Query the database)
                DCL:                Data Control Language
                (Controls database security and permissions)

The DBA (Data Base Administrator) is concerned with the DDL & DCL, whereas the programmer or reasonably sophisticated end-user is responsible for the DML.

SQL differs from other procedural languages in that other languages handle data in FILES. A FILE is a container of data in which data is stored sequentially. There is a concept of last record and first record, this fits in well with the way files work in the real world with paper forms and filing cabinets and the like.

SQL however handles data in TABLES. A TABLE is a set, a set is a mathematical abstraction, sets are ordered and all of their members are of the same type. When you perform an operation on a set the operation happens all at once and to the entire membership. Tables can be permanent (base views) or virtual (views). An operation on an SQL table always returns a RESULTS TABLE.

Cardinality of a table
(The number of rows in a table, Zero or Greater)
Degree of a Table
(Number of columns in a table, One or Greater)

A foreign key is the column which links to another (primary) key in another table.

OR (When using an OR within the search string, the OR'd items must be bracketed for a successful search.)
NOT
AND
LIKE 'xxx%' %=wildcard of one or more characters, the like command is however CASE SENSITIVE.
IN
BETWEEN (= can be used for an exact match type search)

GROUP BY can be more than one field separated by a comma, both columns must also be in the actual columns displayed. DO NOT include a primary key in a GROUP BY command, as the primary key is unique. Fields other than the grouping fields cannot be in the display columns, as to do so would make zero sense.

NORMALISATION is the process where by a database designer attempts to minimise the amount of duplicate data stored within his database.

"If you have made mistakes, even serious ones, there is always another chance for you. What we call failure is not the falling down but the staying down." - Mary Pickford

SQL 101 6: DISTINCT & Functions

This is the sixth in my series of posts designed to teach people the basics of SQL and it’s use in interrogating databases.

The DISTINCT Option
This option removes duplicate rows from the Results Table. The result of the query will only include each distinct selected set of Columns, even though there may be many more such occurrences in the Table.   NOTE:  Only rows where ALL columns are the same will be removed.

One example of use is where you wish to query the stock file to see if you have ever sold a particular product, you only wish a row to be returned as confirmation, you do not need a row for every occurrence of the sold stock record. This query would use the DISTINCT option to return only the required data.

Calculated Columns
The value of a Column can have a calculation applied to it. This calculation can be in the form of a numeric calculation or a calculation based on the value of another Column.

Functions
It is possible that instead of displaying the contents of a Column in the Results Table, you can display the result of performing certain operations on the Column contents. The functions available are as follows:

AVG(COLUMN)
This function calculates the average value of the specified Column. The name of the Column to be averaged must be enclosed in parentheses.

SUM(COLUMN)
This function operates exactly as the average function, but produces the SUM of the selected Columns.

MIN(COLUMN)
This function operates exactly as the average function, but produces the minimum of the selected Columns.

MAX(COLUMN)
This function operates exactly as the average function, but produces the maximum of the selected Columns.

COUNT(*)
This function operates exactly as the average function, but produces a COUNT of the number of Columns that have met the selected criteria.

Next: SQL Background Information

"If you really do put a small value upon yourself, rest assured that the world will not raise your price." - Anonymous

Sunday, 5 November 2006

SQL 101 5: ORDER and GROUP

This is the fifth in my series of posts designed to teach people the basics of SQL and it’s use in interrogating databases.

The Optional ORDER Clause
The ORDER clause provides an optional method of sorting the rows that have been returned to the Results Table. Please Note that using a sort can sometimes take longer for the query to execute, and require more memory on the Server. An example of the ORDER clause is as follows:

ORDER BY CUS_NAME, CUS_PRIMARYKEY

The order of sorting is determined by the order of the Columns within the ORDER section of the statement. The sort is carried out in ascending order of field value.

Below are some examples of queries with the optional ORDER clause in use.

SELECT SOH_NUMBER, SOH_DATE, SOH_NET, SOH_OWNER
FROM GUSER.F_SOHDR
WHERE (SOH_OWNER = 'JOHN CLARK' OR SOH_OWNER = 'PETER OAKDEN') AND SOH_NET > 3000
ORDER BY SOH_NET

SELECT SOH_NUMBER, SOH_DATE, SOH_NET, SOH_OWNER
FROM GUSER.F_SOHDR
WHERE (SOH_OWNER = 'JOHN CLARK' OR SOH_OWNER = 'PETER OAKDEN') AND SOH_NET > 3000
ORDER BY SOH_NET DESC


The Optional GROUP Clause
The GROUP clause is used to sort the output not as individual records, but as groups of records that have the same value in a specified field. This optional clause is only used when the query is applying one of the functions (AVERAGE, SUM, MIN, MAX or COUNT) to the groups.

The SELECT clause is limited to the Column that appears in the GROUP clause, with one or more functions of other Columns. For Example:

SELECT HIS_USER, COUNT(*)
FROM GUSER.F_SYSHISTORY
WHERE HIS_USER LIKE ‘%’
GROUP BY HIS_USER


This query returns a row containing the amount of system history records against each user.

Next: DISTINCT Option, Calculated Columns and Functions

"Don't join the book burners. Don't think you're going to conceal faults by concealing evidence that they ever existed. Don't be afraid to go in your library and read every book..." - Dwight D Eisenhower

Saturday, 4 November 2006

SQL 101 4: The WHERE Clause

This is the fourth in my series of posts designed to teach people the basics of SQL and it’s use in interrogating databases.

The WHERE Clause
The WHERE clause is used to specify the conditions that must be met for the record to be selected. In the query, each record in the Table named in the FROM clause is read and tested for the condition specified in the WHERE clause. If the condition evaluates to True the Columns specified in the SELECT clause are included in the Results Table. Otherwise the record is not returned to the Results Table.

SELECT CUS_PRIMARYKEY,CUS_CODE
FROM GUSER.F_CUSTOMER
WHERE CUS_CODE LIKE '%'


This returns a Results Table containing all your Customers. (Of course the same could have been achieved in this case by eliminating the WHERE clause entirely.)

In its simplest form the WHERE clause consists of four parts: The word WHERE, a Columns name, an operator, and a value or another Column name. Several examples of a full SQL statement containing a simple WHERE clause are shown below:

SELECT SOH_NUMBER, SOH_DATE, SOH_NET, SOH_CUCODE
FROM GUSER.F_SOHDR
WHERE SOH_DATE BETWEEN ‘2004 MAR 01’ AND ‘2004 MAR 31’

SELECT SOH_NUMBER, SOH_DATE, SOH_NET, SOH_CUCODE
FROM GUSER.F_SOHDR
WHERE SOH_NET > 1000.00

SELECT CUS_CODE, CUS_NAME, CUS_BALANCE
FROM GUSER.F_CUSTOMER
WHERE CUS_NAME LIKE ‘Bathroom%’

SELECT CUS_CODE, CUS_NAME, CUS_BALANCE
FROM GUSER.F_CUSTOMER
WHERE CUS_NAME NOT LIKE ‘Bathroom%


SELECT SOH_NUMBER, SOH_DATE, SOH_NET, SOH_OWNER
FROM GUSER.F_SOHDR
WHERE (SOH_OWNER = ‘JOHN CLARK’ OR SOH_OWNER = ‘PETER OAKDEN’) AND SOH_NET > 3000


Parentheses are used to alter the sequence in which conditions are evaluated. Normally conditions are evaluated multiplication/division, left to right, however parentheses cause the condition inside the parentheses to be evaluated first and to pass the results of that evaluation to the outer part of the condition (or the next outer set of parentheses if using nested parentheses)

An example of the use of parentheses, consider the following two expressions:

7 + 5 * 4        =        27
(7 + 5) * 4        =        48

The final part of the WHERE clause is the element that follows the operator. This can either be a data value, or the name of another column. If the data is alphanumeric, the data must be enclosed in quotation marks. Numeric data must not be enclosed in quotation marks. Dates are treated as alphanumeric fields and the form ‘01 JAN 06’ is commonly used.

Note: The ‘%’ symbol is a wildcard. As the query executes this is evaluated to mean ‘one or more characters’. Effectively allowing a simple ‘contains’ search. Another wildcard than can be used is the underscore character (_) , this represents one character only.

If using the wildcards then you cant use the equality operator (=), instead you must use the LIKE statement. See the examples earlier.

Tomorrow: The ORDER BY and GROUP BY Clauses.

"Good communication is as stimulating as black coffee and just as hard to sleep after." - Anne Morrow Lindbergh

Friday, 3 November 2006

SQL 101 3: SELECT and FROM

This is the third in my series of posts designed to teach people the basics of SQL and it’s use in interrogating databases.

The SELECT Clause
The SELECT clause is used to define the columns that are to be displayed in the Results Table for each record that satisfies the query conditions. If more than one Column is specified the field names must be separated with commas. If only one Column is specified, no separator is required. For example:

SELECT CUS_PRIMARYKEY,CUS_CODE
or
SELECT CUS_CODE


The FROM Clause
The FROM clause is used to define the Table from which the records are to be selected. The FROM clause normally follows the SELECT clause, and in order to keep the structure of the query clear, it can often be found entered on a separate line. For Example:

SELECT CUS_PRIMARYKEY,CUS_CODE
FROM GUSER.F_CUSTOMER


This means that the Columns Primary Key and Customer Code are to be retrieved from the table F_CUSTOMER. If more than one Table is to be used in a query (using a SQL ‘Join’ which is a technique for linking related tables),  the Table names must be separated by a comma. For Example:

SELECT CUS_PRIMARYKEY,CUS_CODE,SOH_NET
FROM GUSER.F_CUSTOMER, GUSER.F_SOHDR
WHERE


Tomorrow: The WHERE Clause.

"Most folks are about as happy as they make up their minds to be." - Abraham Lincoln

Thursday, 2 November 2006

SQL 101 2: Relational Joins

This is the second in my series of posts designed to teach people the basics of SQL and it’s use in interrogating databases.

Relational Joins
Data can be linked to other data. A sales order is linked to it’s lines, it is also linked to a customer. These links are known as relational joins. When attempting to retrieve linked data you have to tell the computer what are the key fields that you want to use to establish the link.

The link is established through Primary and Foreign Keys. In my own databases I generally provide a second and arguably more intuitive link. More on this later.

Consider the following Example, which shows a simplified Sales Order Header Table, and a simplified Customer Table:

Table: GUSER.F_SOHDR
SOH_PRIMARYKEY                        Sales Order Primary Key
SOH_NUMBER                        Sales Order Number
SOH_DATE                                Date
SOH_DELIVDATE                        Delivery Date
SOH_CUCODE                        Customer Code
SOH_FK_CUSTOMER                Foreign Key to Customer Table

Table: GUSER.F_CUSTOMER
CUS_PRIMARYKEY                        Customer Primary Key
CUS_CODE                                Customer Code
CUS_NAME                                Customer Name

As you can see the customer name is not stored against the sales order header record. (There is a very valid reason for this and it concerns normalisation, see elsewhere in this document for an explanation of normalisation.)

Now if you wish to produce a SQL Query that shows all sales orders for the first week in March, that shows the order number, order date, customer code, customer name and net value, then you need to pull in the customer name from the customer table. This means that you have to apply a join in your query, and for this first example we are going to do the join via the sales order foreign key to customer (SOH_FK_CUSTOMER) and the customer primary key (CUS_PRIMARYKEY). The query looks like this:

SELECT SOH_NUMBER,SOH_DATE,SOH_CUCODE,CUS_NAME,SOH_NET
FROM GUSER.F_SOHDR,GUSER.F_CUSTOMER
WHERE SOH_FK_CUSTOMER = CUS_PRIMARYKEY
AND
SOH_DATE BETWEEN '2004 MAR 01' AND '2004 MAR 05'

The Important things to notice here are that we have had to specify in the FROM section the fact that we wish to pull in data from more than one table. The other thing we have had to do is specify that the way we want to link the sales order to the customer is through the Primary Key / Foreign Key mechanism, you can see this as the first part of the WHERE section.

As I mentioned earlier, within my own databases I generally provide a second way of linking the data between tables, through the use of a Common or Convenient code. In this example we could have used the Customer Code stored against the order (SOH_CUCODE) and the Customer Code stored in the customer table (CUS_CODE), to link the records. This is demonstrated in the example shown below, it returns exactly the same data as the first example.

SELECT SOH_NUMBER,SOH_DATE,SOH_CUCODE,CUS_NAME,SOH_NET
FROM GUSER.F_SOHDR,GUSER.F_CUSTOMER
WHERE SOH_CUCODE = CUS_CODE
AND
SOH_DATE BETWEEN '2004 MAR 01' AND '2004 MAR 05'


The only difference between the two examples is the in the first part of the WHERE section, where I have specified the column names used to join the two tables.

NOTE: If you specify more than one table in the FROM section of your query and do not specify how to link the tables in your WHERE section, then you end up with what is know as a Cartesian Product. (THIS IS A BAD THING). A Cartesian Product is mathematically a binary operation in which two objects are combined in an ‘everything in combination with everything’ fashion. This can cause a serious system slowdown, and will return nonsensical data.

An understanding of joins is essential for you to progress to more advanced SQL Queries, and to allow you to collate and interrogate data successfully.

"Temptation rarely comes in working hours. It is in their leisure time that men are made or marred." - W N Taylor

SQL 101 1: Intro / Overview

This is the first in a series of posts designed to teach somebody the basics of SQL. It is adapted from a training course I originally wrote around 10 years ago, revamped again in 2000 and then again in 2006. The syntax has been tested against Oracle but in the vast majority of cases should work fine against any version of SQL. Have Fun !

Introduction
Information is data organised in a way that is useful. What we give you within database software is Information, what we actually store is Data.

Obviously the data is not stored in a totally random fashion. The data is stored in a structured fashion, this structure is represented by Tables and Columns.

A Table is a collection of related data. For example the table F_CUSTOMER contains all the Customer data. The table F_SUPPLIER contains all the Supplier data. A column represents a particular portion of the data, for example a column within the customer table might be called CUS_NAME. This column will contain the customers name. You can liken this structure to a spreadsheet worksheet, where each row represents a single customer and each column represents a particular type of data stored. See this diagram for clarification.

For this tutorial we are going to be concentrating on using a SQL Query Tool to retrieve data from the database and present it to you on your screen.

Overview
‘SQL’ is an abbreviation for ‘Structured Query Language’. It provides a programming language for accessing selected records from a relational database. SQL often (and possibly incorrectly) pronounced ‘Sequel’ is actually pronounced ‘ess que ell’. This confusion stems from the fact that ‘SQL’ is the successor of a language called ‘Sequel’ developed by IBM in the late 1960’s.
For even the most simple of database inquiries there are three elements that you must communicate to the system to perform the inquiry:
You must specify the Table or Tables that you wish to search.
You must specify the Columns of data you wish to retrieve.
You must specify the search conditions for the data you require.
In SQL this is achieved using the SELECT ..  FROM .. WHERE ..  clauses.
The SELECT clause is to tell the system which Columns you want to select for the inquiry. The FROM clause is used to tell the system which Table contains the records which are to be selected. The WHERE clause is used to specify the conditions that must be met for a record to be selected.
These three clauses are all that is a necessary to perform a simple query from the SQLQuery Engine. The system reads the specified Tables(s) and checks each row against the condition, if the condition is met the selected records are returned into the Results Table.
Another feature of SQL is the ability to request not the contents of the selected fields, but a function of those contents, such as the average value, sum total, maximum or minimum value, or just a count of the number of records that meet the selection criteria.

"Aim at the sun, and you may not reach it; but your arrow will fly far higher than if aimed at an object on a level with yourself." - Joel Hawes