I'm not sure what the "I" and "-I" gates do, and I can't seem to apply them correctly. When I apply the Hadamard I get |00> (tensor) Hadamard. If I then take the tensor product to apply the 'i' gate on the last 2 bits, I get a 4x1 matrix, but the 'i' gates are 2x2, so I'm obviously missing something.
Does this make sense or should I take a picture of my work?
Basically, one just had to take the basis states |000>, |001>, ..., |111> and push each of them through the circuit. The trick is to make sure you implement the controlled-P gates correctly; everything else should be straightforward.
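In case it helps with the dimension mismatch, here is a rough NumPy sketch of that approach. It is not the actual circuit from the problem: I'm assuming 3 qubits, assuming "P" means a phase gate with angle pi/2, and the gate order (a Hadamard on the first qubit followed by one controlled-P) is made up for illustration. The helper names `basis_state`, `lift` and `controlled_phase` are my own. The main point is that a 2x2 gate has to be tensored with identities until it is 8x8 before it can act on a 3-qubit state.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def basis_state(bits):
    """State vector for a bit string such as '010' (leftmost bit is qubit 0)."""
    vec = np.zeros(2 ** len(bits), dtype=complex)
    vec[int(bits, 2)] = 1.0
    return vec

def lift(gate, target, n):
    """Extend a 2x2 gate to n qubits by tensoring it with identities."""
    ops = [gate if q == target else I2 for q in range(n)]
    full = ops[0]
    for op in ops[1:]:
        full = np.kron(full, op)
    return full

def controlled_phase(control, target, n, phi):
    """Diagonal n-qubit gate: phase exp(i*phi) when control and target are both 1."""
    dim = 2 ** n
    diag = np.ones(dim, dtype=complex)
    for idx in range(dim):
        bits = format(idx, f"0{n}b")
        if bits[control] == "1" and bits[target] == "1":
            diag[idx] = np.exp(1j * phi)
    return np.diag(diag)

n = 3
H0 = lift(H, 0, n)                          # 8x8: Hadamard on qubit 0 (assumed placement)
CP = controlled_phase(0, 1, n, np.pi / 2)   # 8x8: assumed angle and qubit pair

# Push every basis state |000>, ..., |111> through the (illustrative) circuit.
for idx in range(2 ** n):
    bits = format(idx, f"0{n}b")
    out = CP @ H0 @ basis_state(bits)
    print(bits, np.round(out, 3))
```

Swap in the real gates and their order from your circuit; the bookkeeping (8-component vectors, 8x8 matrices) stays the same.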