This assignments has two parts, which you can complete in either order.
In the first part, you’ll go back to a previous PA of your choice, measure the performance of your implementation, and improve it.
In the second part, you’ll get some practice with C, and implement and test a min-heap in C, paying careful attention to memory allocation.
You’ve learned a lot in this course. With the benefit of hindsight, we can often go back and better understand programs we’ve written in the past. This part of the PA gives you an opportunity to do that.
In this part of the PA, you will choose one of PA2, PA3, PA5, PA6, or PA7 (in PA1 and PA4 there’s less that’s interesting to measure), perform some measurements on its runtime performance, then make some changes and measure those.
Your process should be as follows:
Choose one of the PAs above, and choose which part you are going to measure.
For example, you might choose to measure the transform
method on
ArrayLists
, the solve
method for mazes, the generateDatabase
method for
n-grams, the isValidPartitionResult
method for partitioning, and so on. The
method you choose should have an interesting relationship (e.g. not constant)
with the size of the input. If you’re not sure if a particular method/input
pair is appropriate, feel free to ask.
You will submit this in two places:
Programming Assignment 8 (PA8) - Part1 written
assignment.
Programming Assignment 8 (PA8) - Part1 measure code
below).Programming Assignment 8 (PA8) - Part1 measure code
code assignment. You should
make sure the code you used to measure, the original code for your PA, and the
updated version are included. We may look at this to confirm/support the
descriptions you give above. However, the descriptions you give should be
complete enough for us to understand your original and updated implementation.For this PA, you should use the lab machines. If you want to set up C on your own laptop, you’re welcome to, but we probably won’t be able to help, and we recommend against it.
This document has instructions on finding your account and getting logged in, and working remotely if you’d like:
https://docs.google.com/document/d/1zxXlY36LOfpI4Z3zr70u3AImMirsUUBaomX9Vf7T0xg/edit?usp=sharing
Note that it also has some background information on running C from the command
line. We’ve provided the commands you need in a Makefile
and detailed them
below, but the background in the document explains the underlying commands.
Also note that the PA is not set up with Eclipse, so directly importing the code into Eclipse will not create a project. You can use any text editor to edit.
You can get the starter code here:
https://www.dropbox.com/s/ne3kednj8hrxuim/pa8-starter-master.zip?dl=0
First, you’re going to do two warm-up exercises.
insert.c
: This file contains an implementation of inserting into an array
that appears to work, but has some memory errors. You will use valgrind
to
detect and fix the memory errors. You can run this example with make
insert
, which will run the necessary valgrind command. You’ll hand in the
corresponding fixed C file.match_struct.c
: In this file, you will write initialization code to
construct data that matches a series of boolean checks in the main
method.
You’ll hand in a C file that makes the main
method print both "You got
ans1!"
and "You got ans2!"
Run with make match_struct
. You’ll do
this by editing only setupAns1
and setupAns2
. Hint – draw a picture of
what main
is looking for!Unlike Java, C doesn’t have language support for a feature called an “interface.” That said, the concept of a collection of functions that implement a particular feature independent of the underlying representation is still a reasonable one to consider. For this heap implementation, we’ll take a collection of function headers as our interface:
Heap* makeHeap(int capacity);
void add(Heap* heap, int priority, char* value);
char* removeMin(Heap* heap);
char* peek(Heap* heap);
int size(Heap* heap);
void cleanupHeap(Heap* heap);
In addition, we’ve defined a few structs for you:
struct Entry {
int key;
char* value;
};
typedef struct Entry Entry;
struct Heap {
int capacity;
int size;
Entry** elements;
};
typedef struct Heap Heap;
The heap.h
file holds all these definitions, and is called a header file. C
programs are often organized with definitions in one file and declarations in
another. This is done (at a high level) to help C’s compiler, which doesn’t
have the same features Java’s compiler has for traversing the filesystem to
find all the relevant source files.
In heap.c
, you will write implementations of these functions. Your
implementation will be a min heap (the mapping with the lowest key will be
on top). Some notes on the required algorithms are in lectures from week 9.
Descriptions of each function are:
makeHeap
: Should return a pointer to a newly allocated Heap
with the
given capacity
, a size
of 0
, and an elements
array allocated with the
given capacity. The elements
array should contain pointers (references) to
Entry
objects.add
: Should add a new pair mapping the given priority
as a key to the
given value value
. This should work by inserting add
at the end of the
storage array (elements
) and bubbling up. Should run in O(lg(size))
time,
except when resizing. If the size is equal to the capacity, this should take
O(size)
time, and expand the capacity by allocating a new array of twice
the current capacity and copying old elements over before performing the
insertion.removeMin
: Should return the value stored at the top of the heap (the one
with the lowest priority). That element should then be removed from the heap,
and the heap should be restored to the correct shape by moving the last
element to the top and bubbling down. When performing bubble down, prefer
swapping with the left child if both children have the same priority.
If the heap is empty, return NULL
. Should run in O(lg(size))
time.peek
: Should return the value stored at the top of the heap (the one with
the lowest priority) and make no changes to the heap. If the heap is empty,
should return NULL
. Should run in constant time.size
: Should return the number of elements in the heap (not the capacity),
and run in constant time.cleanupHeap
: Should use free
to reclaim the memory used by the given Heap
pointer, including its elements
array.Note that the reference implementation, in bubbleUp/Down, does not perform swaps between equal elements. If you write tests with many duplicates, you might run into this behavior. We don’t test in detail for either direction here, so your implementation can go either way. However, if you write tests that explicitly check for not bubbling past equal elements, they won’t succeed on our implementation.
You can write tests in test_heap.c
, where there are some examples given of
using arithmetic and checking numeric equality, and checking equality between
strings. You should write tests that create heaps, add and remove items, and
check that the operations work as expected. The two relevant functions for
writing assertions are CuAssertIntEquals
and CuAssertStrEquals
. In this
testing framework we need to use the equality method that matches the type we
are testing.
You can run the tests with
make test
You’ll see output like this when the tests run:
[cs12w@ieng6-203]:pa8-implementation:642$ make test
gcc -g -Wall -o test_heap.run heap.c test_heap.c cutest/CuTest.c
valgrind --leak-check=full --undef-value-errors=no ./test_heap.run
==32451== Memcheck, a memory error detector
==32451== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==32451== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==32451== Command: ./test_heap.run
==32451==
.
OK (1 test)
==32451==
==32451== HEAP SUMMARY:
==32451== in use at exit: 176 bytes in 2 blocks
==32451== total heap usage: 8 allocs, 6 frees, 16,913 bytes allocated
==32451==
==32451== 176 (16 direct, 160 indirect) bytes in 1 blocks are definitely lost in loss record 2 of 2
==32451== at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
==32451== by 0x400FCD: makeHeap (heap.c:101)
==32451== by 0x401017: TestHeap (test_heap.c:7)
==32451== by 0x40167E: CuTestRun (CuTest.c:144)
==32451== by 0x401CE4: CuSuiteRun (CuTest.c:292)
==32451== by 0x4010F1: RunAllTests (test_heap.c:29)
==32451== by 0x401156: main (test_heap.c:40)
==32451==
==32451== LEAK SUMMARY:
==32451== definitely lost: 16 bytes in 1 blocks
==32451== indirectly lost: 160 bytes in 1 blocks
==32451== possibly lost: 0 bytes in 0 blocks
==32451== still reachable: 0 bytes in 0 blocks
==32451== suppressed: 0 bytes in 0 blocks
==32451==
==32451== For counts of detected and suppressed errors, rerun with: -v
==32451== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
The part that says “OK (1 test)” will reflect the number of tests you’ve written, and is like the JUnit output for your tests.
The rest is the output of a tool called valgrind
, which checks the memory
usage of programs. In this case, it’s saying that there are 176 bytes of memory
left in use at the end of the program. You’ll probably see something like this
early on, until you implement cleanupHeap
. It means that memory that was
earlier malloc
ed was never freed. You should make sure to free all memory
that is allocated in each test by using cleanupHeap
on any heaps you create.
When there are no memory errors and no memory leaks, you’ll see a message like this:
[cs12w@ieng6-203]:pa8-implementation:649$ make test
gcc -g -Wall -o test_heap.run heap.c test_heap.c cutest/CuTest.c
valgrind --leak-check=full --undef-value-errors=no ./test_heap.run
==5758== Memcheck, a memory error detector
==5758== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==5758== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==5758== Command: ./test_heap.run
==5758==
.
OK (1 test)
==5758==
==5758== HEAP SUMMARY:
==5758== in use at exit: 0 bytes in 0 blocks
==5758== total heap usage: 8 allocs, 8 frees, 16,913 bytes allocated
==5758==
==5758== All heap blocks were freed -- no leaks are possible
==5758==
==5758== For counts of detected and suppressed errors, rerun with: -v
==5758== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
This is the kind of output you should see when things are in a happy state – no memory errors, all heap blocks freed, and all tests passing!
It’s important to point out that valgrind
can catch even more than unfreed
memory. Consider this broken test:
void SillyTest(CuTest *tc) {
Heap* h = NULL;
CuAssertIntEquals(tc, 0, h->capacity);
}
If we run make test
on this, we’ll see this output:
[cs12w@ieng6-203]:pa8-implementation:664$ make test
gcc -g -Wall -o test_heap.run heap.c test_heap.c cutest/CuTest.c
valgrind --leak-check=full --undef-value-errors=no ./test_heap.run
==22849== Memcheck, a memory error detector
==22849== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==22849== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==22849== Command: ./test_heap.run
==22849==
==22849== Invalid read of size 4
==22849== at 0x40101A: SillyTest (test_heap.c:8)
==22849== by 0x4016EB: CuTestRun (CuTest.c:144)
==22849== by 0x401D51: CuSuiteRun (CuTest.c:292)
==22849== by 0x40115E: RunAllTests (test_heap.c:34)
==22849== by 0x4011C3: main (test_heap.c:45)
==22849== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==22849==
==22849==
==22849== Process terminating with default action of signal 11 (SIGSEGV)
==22849== Access not within mapped region at address 0x0
==22849== at 0x40101A: SillyTest (test_heap.c:8)
==22849== by 0x4016EB: CuTestRun (CuTest.c:144)
==22849== by 0x401D51: CuSuiteRun (CuTest.c:292)
==22849== by 0x40115E: RunAllTests (test_heap.c:34)
==22849== by 0x4011C3: main (test_heap.c:45)
==22849== If you believe this happened as a result of a stack
==22849== overflow in your program's main thread (unlikely but
==22849== possible), you can try to increase the size of the
==22849== main thread stack using the --main-stacksize= flag.
==22849== The main thread stack size used in this run was 8388608.
... more output down here ...
This message will show up in some valgrind output. This output is giving the
same kind of meaning as NullPointerException
in Java. the Invalid read of
size 4
pinpoints the line on which it happens, and means that we tried reading
an invalid pointer (at address 0x0
, which is address 0
, which is the same
as NULL
).
You will submit this part to the Programming Assignment 8 (PA8) - Part2 code
assignment on Gradescope.
insert
and match_struct
) [automatic]