DEPARTMENT OF COMPUTER SCIENCE

THE UNIVERSITY OF CALGARY

CPSC 461 Spring 2003

FINAL EXAMINATION

June 27 2003

8:00-10:00 MFH 164

Time: 120 Minutes

Family Name	Given Name(s)

NOTE: Many questions have > 1 correct answer and in some cases full marks will be given for answers that while not perfect, are still 'good enough'.
DEPARTMENT OF COMPUTER SCIENCE

THE UNIVERSITY OF CALGARY

COMPUTER SCIENCE 461

FINAL EXAMINATION

Lecture Section L20

June 27 2003 Time: 120 Minutes

Name:

Student ID:

INSTRUCTIONS

1. Please answer all questions on the exam paper. IF your ANSWER IS NOT RIGHT WITH THE QUESTION, MAKE SURE YOU tell me WHERE THE ANSWERS CAN BE FOUND.

2. Write your name and ID on this page.

3. Answer all questions to the best of your ability.

4. Keep answers short and to the point.

THIS IS AN OPEN BOOK EXAM.

YOU MAY USE ANY BOOKS OR OTHER NOTES YOU FEEL ARE APPROPRIATE. You may also use: cell-phones, or any other wireless or other electronic devices.

YOU MAY NOT USE FRIENDS,

	total possible	your score	Sub - Total
Question 1	15
Question 2	10
Question 3	10
Question 4	10
Question 5	15
TOTAL SCORE	60

Question 1: (worth 15 total) short (!) answer

a) [ 2 marks -bonus ] What is the answer?

b) [ 2 marks ] How does a Storage Area Network differ from a regular file system managed by a typical file server?

c) [ 2 marks ] Why do we need to be concerned about managing space inside a file (why can't we just let the operating system take care of it?)

d) [ 2 marks ] The NTSC film standard calls for 24 frames per second while the digital video standard MPEG calls for 30 frames per second. How is NTSC converted to meet the digital video standard frame rate?

e) [ 2 marks ] Explain why 'traditional' pointers (such as char* in C/C++) cannot be used in files.

---see next page for last part of question 1---

--- see previous page for first parts of question 1---

f) [ 5 marks ] One of the common threads throughout this course was that the organization of data touched on similar issues at various levels of abstraction. Fill in at least five more boxes in the following table:

Level of Abstraction	Highest Level	Index Equivalent	One Element	Subdivision of one Element
Within a Record:	One Record
Within a File:	The Whole File	Index of Keys & Pointers	Record	Field	Data Item (such as a single char of a string)
Within a File System (such as in a UNIX partition):	File System
Within a Storage Network:	Network FS			File
Within an Object Repository:	The Entire Collection		Object

Question 2: (worth 10 total) Cosequential Processing

Estimate (calculate) the cost (in seeks) of creating the initial runs of a major sort by implementing the first phase of a radix sort, followed by a merge to produce a single sorted file. Assume it is possible to break up the original file (using the keys) into 17 roughly equal-sized sets of records. Compare the cost of this kind or sort-merge against the calculations below for a standard polyphase sort-merge of 17 sorted runs. Be sure to explain your calculations (especially important if your numbers are wrong or confusing). It can be assumed that one uninterrupted, sequential pass through an entire file (or run) will cost one seek. Compare the two totals: including the cost of creating the runs and the cost of merging them). Think carefully. You may or may not require all the information below.

For a POLYPHASE Merge of 17 runs:
	D1	D2	D3	D4	Seeks:
0	7[1]	4[1]	6[1]	-	Phase 0: we don't count seeks here, they are counted when we create the runs
1	3[1]	-	2[1]	4[3]	Phase 1: 4 sets of 3 runs: : 1/3 of RAM for each = 333,333 Bytes = 333 Records : 1/3 of each run = 3 seeks/run * 3 runs = 9 seeks : 4 sets to do = 9 * 4 = 36 seeks [36 SEEKS]
2	1[1]	2[5]	-	2[3]	Phase 2: 2 sets of 3 runs (1 run in each set has 3,000 records in it) : 1/3 RAM = 333 records : 3 seeks/run * 3 runs = 9 seeks, but last file has 2,000 records left, so 6 more seeks for it = 15 seeks per set : 2 sets = 2 * 15 = 30 seeks [30 SEEKS]
3	-	1[5]	1[9]	1[3]	Phase 3: 1 set of 3 runs (1 has 1000 rec., 1 has 3000 rec., 1 has 5000 rec.) : 3 seeks/run * 3 runs = 9 seeks, but 2 runs have at least 2000 records left, so 6 more seeks for each; 6 * 2 = 12 MORE seeks, but the last run still has 2000 records left so 6 MORE for it : 9 seeks + 12 seeks + 6 seeks = 27 seeks [27 SEEKS]
4	1[17]	-	-	-	Phase 4: 1 set of 3 runs (lengths 9000, 5000, 3000 records) : smallest size is 3000 rec. so 1/3 RAM = 9 seeks/run so 9 seeks/run * 3 runs = 27 seeks : now 2 runs still have at least 2000 records left so 6 seeks for each * 2 of them = 12 seeks : now last run has 4000 records left so 3*4 = 12 seeks : 27 seeks + 12 seeks + 12 seeks = 53 seeks [53 SEEKS]
36 + 30 + 27 + 53 = 146 SEEKS TOTAL + 17 seeks to break the file into the initial runs = 163

Radix Sort-Merge Total:____________ Polyphase Sort-Merge Total: 163 seeks

Question 4: (worth 10 total) Assignment Related

The following are 5 problems encountered by various groups while working on their projects. For each one, suggest a solution or means of addressing this problem. Please suggest concrete, implement-able solutions (“should ‘a ‘been smarter” or “start sooner” are not acceptable). Assume that acquiring more time, or physical resources (i.e. faster/better machines, etc.) are not options. Any solutions must be applicable within the same time frame. An example follows:

Problem: The group acquired code for reading images/sounds (as opposed to writing it themselves), and assumed they would be able to make it work with their code on their systems. When they were ready to start trying out their algorithms, they discovered that they couldn’t get the acquired code to work.

Solution: - prototyping (Get a working program up and running as fast as possible. In this case it would have included writing & testing the “wrapper” code to load images – even before the algorithms for manipulating those images had been created.)

1. The group couldn’t settle on a project (task) to pursue. What constraints could they have devised to help define their problem space?

2. Two different algorithms for image matching were tried – neither produced any acceptable results. There was insufficient time to try others. How could they have used what they were discovering to get farther than “tried this – didn’t work”?

3. Two algorithms were tried and seemed promising but the group did not have enough sample data. Images were created by converting existing images to .bmp using “MSPaint” (by hand). There was insufficient time to convert more in this fashion. How could they have automated this process?

4. The group was unable to find reference information on the ‘Net, even though they knew the information existed. What if references were found, but the resources referred to were not on-line?

5. Group wanted to examine MP3 files but could not acquire file format info. Could they attempt to extract information from the files themselves? How?

This page left blank. You can use it if you like.
Question 5: (worth 15 total) Comprehensive

DEPARTMENT OF COMPUTER SCIENCE

THE UNIVERSITY OF CALGARY

CPSC 461 Spring 2003

FINAL EXAMINATION

June 27 2003

8:00-10:00 MFH 164

Time: 120 Minutes

NOTE: Many questions have > 1 correct answer and in some cases full marks will be given for answers that while not perfect, are still 'good enough'. DEPARTMENT OF COMPUTER SCIENCE

THE UNIVERSITY OF CALGARY

COMPUTER SCIENCE 461

FINAL EXAMINATION

INSTRUCTIONS

Question 1: (worth 15 total) short (!) answer

This page left blank. You can use it if you like. Question 3: (worth 10 total, 5 each) Digital Libraries

This page left blank. You can use it if you like. Question 5: (worth 15 total) Comprehensive

NOTE: Many questions have > 1 correct answer and in some cases full marks will be given for answers that while not perfect, are still 'good enough'.
DEPARTMENT OF COMPUTER SCIENCE

This page left blank. You can use it if you like.
Question 3: (worth 10 total, 5 each) Digital Libraries

This page left blank. You can use it if you like.
Question 5: (worth 15 total) Comprehensive