Category: Database Technology

FIT9019: Week 11

Week 11 gave us a quick introduction to PL/SQL.

The primary tools of PL SQL are:

Anonymous Blocks
Named Blocks
Subprograms
Triggers

Of these subprograms and triggers would be the most commonly used.

The structure of PL/SQL:

DECLARE
/*Declarative section is where variables and other objects such as cursors, types, exceptions, etc. are declared. Local
procedures and functions can also be declared and will be available for this block only. This section is optional */
BEGIN
/* Executable section can consist of both SQL statement and procedural statements. This section is where the work the
block has to perform is done. This section is mandatory */
EXCEPTION
/* Exception handling section is where errors are handled. The code in this section gets executed only when an error
is encountered. This section is optional. */
END;
/

A couple of examples of PL/SQL blocks:

-- Whenever emp_salary is updated or an employee deleted, this trigger enters a corresponding record into
--'EMPLOYEE_SALARY_CHANGE' table.
CREATE OR REPLACE TRIGGER trg_trackSalaryChange
AFTER UPDATE OR DELETE OF e_salary ON employee
FOR EACH ROW
DECLARE
oper VARCHAR2(10);
new_sal NUMBER(6,2);
BEGIN
IF updating THEN
oper := 'UPDATE';
new_sal := :new.e_salary;
IF :new.e_salary <= 0 THEN
RAISE_APPLICATION_ERROR (-20000, 'SALARY CANNOT BE ZERO OR NEGATIVE.');
END IF;
END IF;
IF deleting THEN
oper := 'DELETE';
new_sal := NULL;
END IF;

INSERT INTO employee_salary_change VALUES (:old.e_no, :old.e_name, :old.e_salary,
new_sal, user, sysdate, oper);
END;
/

For some basics on PL SQL such as syntax and constructs visit: http://plsql-tutorial.com/index.htm

Database Technology

FIT9019: Week 10

Week 10 saw us move away from the more technical SQL and Indexing topics to ‘Database administration, Transaction management and Concurrency Control’.

Responsibilities of DBA:

Control of redundancy
- Use of standards for modelling
- Maintaining data dictionary
- Ensuring sufficient normalization
- Managing sharing of data between multiple applications
Control of Integrity Constraints
- Defining appropriate integrity constraints
- Ensuring applications meet with the impose constraints
- Ensuring applications cannot subvert constraints
Control of central data security
- Provision relevant user views
- Issuing passwords
- reviewing physical security
- design and monitoring of back up procedures

Also important duties are:

Query Optimization
Clusters, Indexes

Manoj then continued into the nature transactions, crashes and concurrency control. A quick summary below:

Transactions
- A single or number of processes involving database accesses that constitutes one single logical action (i.e. bank transfer). Atomic (indivisible, can’t separate processes), Durable (upon completion, stored permanently).
- Transactions can be committed or aborted (rollback, utilizing before-image log)
- Checkpoints are periodic (usually 15-20mins) batch writes of transactions to disk.
Soft Crashes
- I.e. Power failure, where system resets but no physical damage.
- After soft crash, secondary memory data is loaded, then transaction log is referred to and transaction missing from secondary memory are either:
  - Redone (REDO), if the transaction was completed and can be re-completed this will be done
  - Undone (UNDO) if the transaction was not completed (committed) in the log then it must be undone (transactions are atomic).

rollforward.rollback — source: week 10 lecture notes

Continue reading “FIT9019: Week 10”

Database Technology

FIT9019: Week 9

A very busy lecture this week with Manoj introducing some index structures including:

ISAM – Indexed Sequential Access Mechanism
Binary Tree – Suseptable to ‘lopsidedness’

binaryTreeTable — source week 9 lecture notes

B – Tree – Normalized but inconsistent lookup times.
B+ Tree – Normalized, all keys in leaf nodes. Used by Oracle.

bplusTreeTable — source: week 9 lecture slides

Database Technology

FIT9019 – Week 8

Was quite a hectic week for Database Technologies with the movement of lecture and tutorials and the major assignment due. Thank to Manoj for putting in so much consultation time for everyone outside of class hours. Unfortunately I could not make the lecture due to work commitments but I will review the lecture notes with some assistance from the text book.

Physical Database Design 1:

The process of developing a physical implementation strategy for a logical design:

Database Technology

FIT9019 – Week 7

Manoj presented lecute number 7 this week, titled “SQL III”. True to the title, we churned through 2 hours of SQL definitions and examples. Interestingly there are a number of differences between Oracle and MySQL. To really get the hang of SQL I have just set up a MySQL server on my home PC and running over the commands show in the lectures/tutorials and in the Text book.

Interestingly in some I stumbled upon the following in a google search: ‘On its earnings call last night, Oracle president Charles Phillips crowed about IDC data showing Oracle’s database share at 44.3% vs. 21% for IBM vs. 18.5% for Microsoft.’ Source: http://itknowledgeexchange.techtarget.com/channel-marker/database-market-share-war-resumes/

This makes me wonder why Oracle is a such a superior option over other (is the actual product better?). On some further investigation (of Oracle’s wikipedia page: http://en.wikipedia.org/wiki/Oracle_Database)there may be a couple of major reasons:

Oracle Corporation claims to have provided:

the first commercially-available SQL-based database (1979)^[38]
the first database to support symmetric multiprocessing (SMP) (1983)
the first distributed database (1986)
the first database product tested to comply with the ANSI SQL standard (1993)^[38]
the first 64-bit database (1995)
the first database to incorporate a native JRE (1998)
the first proprietary RDBMS to become available on Linux (1998)^[39]
the first database to support XML (1999)

The name Oracle comes from the code-name of a CIA-funded project… <- hah

Database Technology

FIT9019: Week 6

Manoj presented a lecture focused on the Structured Query Language [SQL] upon our return from the easter break. The tutorial was closely tied to the lecture enabling us to work through the code examples shown. With that in mind, shown below are the types of queries covered in the lecture and tutorial with an example and description:

CREATE – create tables (or Databases), specifying the table name and column values. Constraints can be applied to the attributes such as Primary Key, Foreign Key, Default value and Not Null however it is probably easier to add constraints using the ‘ALTER TABLE’ query. CREATE is part of DDL.

CREATE TABLE DEPARTMENT
(DEPT_NUM INTEGER, DEPT_NAME varchar(20), CREATION_DATE DATE);

Continue reading “FIT9019: Week 6”

Database Technology

FIT9019: Week 5

Database Design III – Boyce Codd Normal Form, presented by Manoj Kathpalia, alerted me to the value of revising the text book. It seems I do not have a clear grasp of some of the characteristics of each Normal form. Must ensure to remember that for a relation to be in BCNF every functional dependency must be a candidate key, this is best illustrated graphically:

So, although the iPhone pic of the text book is a bit blurry, it can be seen that a sneaky dependency from a non-candidate key attribute to another attribute will cause update anomalies and refrain the table from achieving BCNF.

Most of the lecture was focused on ensuring that everyone understood the process of 1NF -> 2NF -> 3NF -> BCNF. I have already made a detailed post on this process which can be viewed here.

Our tutorial was 2 hours of toying with MicroSoft Access. I came into it with an open attitude, but after about 90 minutes I was lost in the world of GUI fluff that is Access. I am sure that with time it is something that one could learn to use with more efficiency, but with superior performing Open Source alternatives (ie: LAMP bundle) I cannot ever see myself working on an Access database by choice.

Pearl of the Week: Avoid Microsoft Access.

I plan to review the text book over the next week to ensure that my current perception of each aspect of the subject is in fact valid. Particularly important is to rote learn definitions for 1NF, 2NF, 3NF, BCNF, Functional Dependency, Transitive Key Dependency(3NF), Partial Dependency (2NF) and Non-Key Partial Dependencies (BCNF).

Database Technology

FIT9019: Week 4

Unfortunately I missed Manoj’s 4th lecture, the notes however are online and I will run through a summary of them.

Part 1: Normalization

Data normalization is required to ensure that a relational database will perform UPDATE, CREATE and DELETE queries without anomalies (loss of data integrity).

The process of normalization has numerous levels (1st Normal Form [1NF], 2NF, 3NF, BCNF, 4NF, 5NF) which will be shown below. Prior to running through each level, the terminology for dependencies must be clarified.

Functional Dependence: A relationship between two attributes each value of A is associated with exactly 1 value of B, ie; ID is a determinant of Name. Partial Key dependencies are determinant relationships between two or more non-primary key attributes(leading to update anomalies). Transitive Key dependencies occur when an attribute is more immediately identified be a non-Primary Key attribute/s (leading to update anomalies).

1NF: Ensuring atmoic values in the relation by removing unnecessary attributes, separating relations if repeating values occur.

CUSTOMER (cust-id, surname, initials,{order-no, order-date})
//{} signify repeating values.

Now in 1NF :

CUSTOMER(cust-id, surname, initials)

ORDER(cust_id, order-no, order-date)

2NF: Remove partial key dependencies (ensure primary key is in its minimal form and partially dependent attributes are removed to an alternate relation). Ie:

ORDER-LINE(order_number, product_no, description, number-ordered)
// As above, the description attribute is dependent only on product_no
making it a partial key dependency.

Now in 2NF:

ORDER-LINE(order_number, product_no, number-ordered)

ORDER-LINE(product_no, description)

3NF: Remove transitive keys dependencies.

ORDER(order-no, order-date, cus-id, name, street, suburb)
//name, street and suburb transitively dependent on cus-id

Now in 3NF (Transitive dependency removed by creating new relation):

ORDER(order-no, order-date, cus-id)

CUSTOMER(cus-id, name, street, suburb)
// new relation added to eliminate transitive dependency.

Boyce-Codd Normal Form [BCNF]: 3NF + only a single candidate key for each relation.

4NF: No non-trivial multi-valued dependencies. *

5NF: No join dependencies. *

* It seems we will not be covering 4NF and 5NF in this subject although other lecturers have mentioned they are the most prevalent in the business world.

That brought to a close the section on Normalization, next up was ER/EER -> Relational Model. Again as I did not attend this lecture I will just simply run through a summary of Manoj’s slides.

Inputs: ER/EER diagrams -> Outputs: relational schemas

Properties of Relations:

Each relation has a unique name..
Each tuple is unique
Attributes names are meaningful
Order of attributes is immaterial
Order of tuples is immaterial
entries are atomic (single valued)

Types of entities -> Regular(Strong), Weak, Associative(Composite).

In preparation for the transition from ER model to relation schema, the ER model must be expanded from conceptual to logical level. This involves removal of M:N relationships (Unary, Binary or Ternary) and recognition of Primary Key.

There was much more detail in this lecture but I will await the next presentation to elaborate further.

Database Technology

FIT9019: Week 3

Week 3’s lecture by Manoj Kathpalia got into database design. We continued our referencing to the ANSI/SPARC architecture which it appears will be an ongoing theme.

The lecture ran through the Database Design Cycle introducing Requirements Definition, Conceptual Design, Logical Design and Physical Design. With detailed analysis of ER modeling which applies to Conceptual and and Logical Design (which are both at the conceptual level of the ANSI/SPARC architecture).

I have some confusion over Cardinality, I assume the Cardinality of Relation and the Cardinality of a Relationship are two different things. The cardinality of a relation being the number of tuples and the cardinality of a relationship being either 1:1, 1:M or M:N. I will need to clarify this with Manoj.

The ER model is quite a simple graphical modeling tool which we review in the tutorial, it does however seem that there are a number of acceptable notations in ER modeling some of which can be seen here.

Pearl of the week:

Although ER modeling can be used for both Conceptual and Logical Design, the Conceptual Level does not include Foreign Keys and allows Many-to-Many relationships.

After working through the ER model, we touched on integrity constraints, specifically Referential Integrity constraints:

Foreign Key references must be valid or null
Update Rules either restricted, cascaded or null (ie: when the customer_ID is updated in the customer table, the foreign key in the order table can either stay the same, update or become null).
Deletion Rules either restricted, cascaded or null (ie: when the customer_ID is attempted to be deleted in the customer table, it can either be restricted (forbidden delete), delete the associated orders :O or set the customer_ID filed of the order to null).

Finally we covered an introduction to Database Design Language [DBDL]. We did some more work on DBDL in the tutorial although there were some discrepancies between the lecture example and the tut examples (ie including CK and PK). From the final question of the tutorial, my derived relations in DBDL were as follows:

book(ISBN, call_num, title, pub_date, publisher_ID)
SK title
AK call_num

author(author_ID, surname, first_name, DOB, initials)
SK surname
 SK initials

author_book(ISBN, author_ID)
FK ISBN -> book - Delete restrict, Update Restrict
FK author_ID -> author Delete restrict, Update Restrict

subject (sub_name)

subject_book (ISBN, sub_name)
FK ISBN -> book - Delete restrict, Update restrict
FK sub_name -> subject - Delete restrict, Update Restrict

publisher (publisher_ID, pub_name, street_add, city, postcode, country)
AK pub_name
SK pub_name

I did not include CK and PK as they were not included in the lecture’s example.

Database Technology

FIT9019: Week 2

We got into the Relational Data Model today in Manoj Kathpalia’s second week lecture. I found this lecture much more interesting and challenging than last weeks which was great. Until now the terminology I had used for Relational Databases was: Table, Column, Row. Today I discovered that the RDBMS industry is engaging in mass duplication of data by having to remember the following alternate names:

Table – > Relation, Entity
Column -> Attribute / field
Row -> Record set / tuple / entry

Why everyone doesn’t just use Table, Column, Row is beyond me… We covered some properties of Relations, Primary, Alternate and Foreign Keys. I was a little confused by the Relation Languages segment where Data Definition Language [DDL] and Data Manipulation Language [DML] were outlined. Does SQL encompass both DDL and DML? If so why separate them? Further investigation reveals DDL, ‘Initially it referred to a subset of SQL, but is now used in a generic sense to refer to any formal language for describing data or information structures, like XML schemas.‘ source: http://en.wikipedia.org/wiki/Data_Definition_Language … which reminds me that I need to do some learning about XML. DDL -> Create and Delete (DBs, Tables, views, integrity contraints, indexes) DML -> Relational Calculus, Relational Algebra, Transform Oriented Languages, Graphical Languages (MS Access), 4GL, 5GL (All exhibit ‘closure ‘ property – whatever you do is not stored permanently and gone on closure of the instance.) For DML we are focusing on Relational Algebra as it is apparently more simplistic. Relational algebra has 8 basic operators:

Selection
Projection
Join
Union
Intersection
Difference
Cartesian Product
Division

We went through definitions of these with theoretical examples which I think failed to pull the theory into any relevant practical sense. I hope the tut will give me some more practical connections for this theory otherwise (without a much larger mathematical understanding) this is rote learning without the ability for application. Will update this entry after the tutorial. In the tutorial we went through some clarification of terms and then moved onto some prac questions on Relational Algebra for which I made some notepad notations:

Q4, D

[1] -> (ROOM) |x| ROOM.Hotel-No = HOTEL.Hotel-No (HOTEL)

[2] -> π type,price([1]) PROJECTION

[3] -> Θ [2].Name = Grosvenor ([2])

Q4, e

[1] -> (GUEST) |x| GUEST.Guest-No = BOOKING.Guest-No (BOOKING)

[2] -> ([1]) |x| [1].Hotel-No = HOTEL.Hotel.No (HOTEL)

[3] -> π [2].NAME, [2].ADDRESS ([2])

|x| – JOIN

π – PROJECTION

Θ – SELECTION

note that SELECTION and PROJECTION should be done first if efficiency is a concern.

Working through the Relational Algebra was good as it tied the link between theory and possible practical applications. I know if I had known this theory previously I could have saved time on several projects using SQL databases. Pearl of the Week: Relational Algebra Basic Operators:

Selection (Slices Tuples)
Projection (Dices Attributes)
Join (Natural Join, Left/Right/Full Outer Join, θ-join[uses binary operators {<, ≤, =, >, ≥}]
Union
Intersect
Difference
Cartesian Product
Division
Rename? [(Actually, Codd omitted the rename, but the compelling case for its inclusion was shown by the inventors of ISBL.)source