My understanding is that for problems where there are an exponential number of possible solutions, a non-deterministic turing machine (NTM) is able to solve it in polynomial time because an NTM can generate and test all possibilities in parallel.
The examples I've seen of non-deterministic polynomial time algorithms are generally of the form:
1) $Guess$ a possible solution
2) Verify that the possible solution is a solution
Does step 1 really mean "generate all possible branches" as opposed to "guess a possible solution"?
Not all such problems: you still have to be able to generate a possible solution and test it in polynomial time. Depending on what constitutes a possible solution and what you have to do to test it, that may or may not be possible.
In order to be sure of finding a solution, you may have to generate all possibilities, or most of them. But the guess is talking about a different scenario, where you're satisfied with a procedure that sometimes finds a solution (and sometimes fails, but never returns a non-solution).