Was quite a hectic week for Database Technologies with the movement of lecture and tutorials and the major assignment due. Thank to Manoj for putting in so much consultation time for everyone outside of class hours. Unfortunately I could not make the lecture due to work commitments but I will review the lecture notes with some assistance from the text book.

Physical Database Design 1:

 

The process of developing a physical implementation strategy for a logical design:

  • File Organisation
    • Tables are stored as files
      • Serial – records are placed in next physical position from previous
      • Random – record placement determined by content of key field
      • Indexed – Index table is used for record lookup, record placement is not an issue
  • File Access methods
    • Serial Access – Start to finish, if you want to access the last record all previous must be accessed first
    • Random Access – Records can be accessed regardless of what is in front of them. Requires Random/Indexed organisation and DASD
    • Dynamic Access – Uses both Random and serial in attempt to gain efficiency, also requires DASD
  • Serial or Sequential Files – Entry in chronological order, only access method is serial (would not be efficient for large tables), Sequential can be based on a numeric PK but in essence the same.
    • Pros: Best speed for batching, can be stored on tape 😀
    • Cons: Bad for interactive access to individual records, record modification may have issues, fragmentation will occur if deletion is allowed
    • Use if very high hit ratio (if Queries need to access most records)
  • Random Files – Each record has an address based on records Key, addressing algorithms used for locating
    • Pros – Good for interactive low hit ratio queries. Updating an inserting record as always ok
    • Cons – As records are not stored contiguously, batching can be slow
    • Utilize hashing to reduce address wastage, results in collisions (use loading factor to control this)
    • Solve collisions with:
      • Linear Probing – On collision search sequentially through file space until a home is found (not often used)
      • Double Hashing – On collision apply another hashing algorithm to find a home (can be optimized)
      • Separate Overflow – On collision record i moved to another files space and a link added from the collision site. (good in that this resolution does not result in increased likelihood of further collisions)
  • Index Files – Data File + Index File
    • Index file contains only address and record Key and can be Random or Sequential
    • Secondary index files can be used for secondary keys
    • Correct or Incorrect Index use can speed up or slow down the databases

Pearl of the week – Notes of interest: