CPSC 461: Copyright © 2002 Katrin Becker 1998-2002 Last Modified October 21, 2001 02:54 PM
INDEXING REVIEW QUESTIONS
SHORT ANSWER QUESTIONS
[ 4 marks ] What is the best placement strategy to use for re-using entries in the free-list of a fixed-length record file? Why?
[ 4 marks ] What is the advantage of rearranging the records of an indexed file after a keysort?
[ 2 marks ] Explain how compaction affects internal and external fragmentation within a file.
[ 4 marks ] How can one effect a binary search on a varying length record file?
[ 2 marks ] What do we normally count when we are comparing sorting algorithms?
[ 3 marks ] What is the usual measure of efficiency for large file sorts? Explain.
[ 4 marks ] What are the 4 characteristics of an ideal primary key in an indexed file?
[ 4 marks ] Suppose we had a dynamic, indexed, fixed-length record file and we wanted to avoid having to maintain a free list. How could we make sure that there would be no wasted space within the file? Hint: it's indexed, so locality of records is not an issue.
[ 5 marks ] If one has an indexed record file, why might one want to sort the records themselves (at least 2 reasons)?
[ 5 marks ] One of the fields in the record structure of a large, varying length record file contains text values that vary in length from 2 to 256 bytes. Their average length is 30 bytes, but the shortest possible abbreviation is 32 bytes (for all conceivable values). Describe how you would design this field of the record and why.
LONG QUESTIONS
(worth 10 total)
Outline the circumstances that might make the two placement strategies (First Fit & Best Fit) reasonable choices (outline each separately).
(worth 10 total) In class, we discussed 3 placement strategies for assigning records to free space in a file. 'Worst Fit' works well when the file consists of records such that one 'hole' (free space) can be made to accommodate more than one record. There are essentially two ways this can come about. What are they?
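As a reminder of how the three placement strategies differ mechanically, here is a minimal C++ sketch. It assumes the free list is held in memory as a simple vector of holes; the names Hole, Strategy and chooseHole are hypothetical and are not taken from the course code. It only shows which hole each strategy selects and says nothing about when each one is a good choice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one hole (free space) in the file.
struct Hole {
    long offset;       // byte offset of the hole within the file
    std::size_t size;  // number of free bytes at that offset
};

enum class Strategy { FirstFit, BestFit, WorstFit };

// Returns the position in `holes` of the hole chosen for a record of
// `need` bytes, or -1 if no hole is large enough.
int chooseHole(const std::vector<Hole>& holes, std::size_t need, Strategy s) {
    assert(need > 0);  // a record must occupy at least one byte
    int chosen = -1;
    for (std::size_t i = 0; i < holes.size(); ++i) {
        if (holes[i].size < need) continue;  // hole is too small
        if (chosen == -1) {
            chosen = static_cast<int>(i);    // first hole that fits
            if (s == Strategy::FirstFit) break;
            continue;
        }
        std::size_t current = holes[chosen].size;
        // Best Fit keeps the smallest adequate hole.
        if (s == Strategy::BestFit && holes[i].size < current)
            chosen = static_cast<int>(i);
        // Worst Fit keeps the largest adequate hole.
        if (s == Strategy::WorstFit && holes[i].size > current)
            chosen = static_cast<int>(i);
    }
    return chosen;
}
```

With holes of 40, 20 and 90 bytes (in that order) and a 15-byte record, First Fit picks the 40-byte hole, Best Fit the 20-byte hole, and Worst Fit the 90-byte hole.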
(worth 10 total)
Discuss the relative advantages and disadvantages of using the same free list for both the primary and secondary indices of an indexed record file when the keys in both indices are different sizes.
(worth 10 total)
Suppose we have an indexed varying length record file stored in binary format. The records are blocked so that each read to fill one buffer retrieves a number of records and no record crosses a block boundary. What would be the relative advantages and disadvantages of keeping the block number of the record in the index as opposed to the actual byte offset?
(worth 10 total)
Suppose we have a free list for a file containing varying-length records. We want to use a worst-fit strategy for using free space. What do we need to do to make this work? What information does the free list need? Should this be part of the regular record structure or should the free list have its own structures? How must we sort the free list? We want to be able to coalesce the holes efficiently without losing efficiency in adding records to and removing records from the free list. How do we accomplish this?
(worth 15 total)
What would eventually happen to a dynamic, indexed, varying-length record file if we did not manage the free space ourselves with some sort of free list? (include how this would affect placement of new records, efficiency of searches, sequential file updates, file size, overall processing speed)
(worth 10 total)
You are given an indexed record file with a single secondary index. When the file is in use, both indices are stored in memory as binary trees. The Main Index contains a unique ID as its key and the key for the secondary index is the Name field from the record. Each secondary index entry also contains a list of pointers to the primary index (stored as an array). The crucial parts of each structure are outlined in the classes that follow. You may assume the existence of working constructors and destructors. Your task is to outline the algorithms (pseudo-code or C++) needed to INSERT and DELETE a record from the file. Don’t worry about fragmentation. There are separate routines for reading the indices into memory and for writing them out before close. You need not outline these.
Program is at: Code/InsertDelete.txt
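For orientation, the index layout described in the question could be pictured roughly as follows. This is only an illustrative sketch under stated assumptions, not the classes in Code/InsertDelete.txt: it uses std::map (a balanced binary search tree in common implementations) in place of hand-written binary trees, stores primary keys rather than raw pointers in each secondary entry, and assumes each primary entry holds a byte offset. All names here (PrimaryEntry, PrimaryIndex, SecondaryIndex, findOffsetsByName) are hypothetical. It shows a lookup by Name only; INSERT and DELETE are left as the exercise.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical primary index entry: where the record lives in the file.
struct PrimaryEntry {
    long offset;  // assumed: byte offset of the record
};

// Assumed stand-ins for the two in-memory trees.
using PrimaryIndex = std::map<int, PrimaryEntry>;                 // ID -> entry
using SecondaryIndex = std::map<std::string, std::vector<int>>;   // Name -> IDs

// Follows the secondary index to the primary index and collects the
// file offsets of every record whose Name field equals `name`.
std::vector<long> findOffsetsByName(const PrimaryIndex& primary,
                                    const SecondaryIndex& secondary,
                                    const std::string& name) {
    std::vector<long> offsets;
    auto s = secondary.find(name);
    if (s == secondary.end()) return offsets;  // no record has this Name
    for (int id : s->second) {
        auto p = primary.find(id);
        // Skip IDs that no longer appear in the primary index.
        if (p != primary.end()) offsets.push_back(p->second.offset);
    }
    return offsets;
}
```

The point of the two-level lookup is that a record's location is stored in one place only, the primary index, so moving a record never requires touching the secondary index.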