In the prisoner's dilemma, the reward matrix is $$\begin{matrix}1,1 & 3,0 \\ 0,3 & 2,2\end{matrix}$$ There is a dominant strategy for both players, which is to betray. So in this case they should both choose to betray if the game is played only once.
What if the rewards are modified as below so that there is no dominant strategy? $$\begin{matrix}1,1 & 3,0 \\ 0,3 & 4,4\end{matrix}$$ What should we do to maximize the expected reward?
You used very suggestive language there, which, without more information, you don't have the 'right' to use yet! Your question hits on one of the most unsatisfying aspects of Nash equilibrium (NE), which is that it often doesn't have much content as a predictive theory. NE says that if, by the grace of lady luck, we end up playing a pair of strategies that is indeed an NE, neither of us has an incentive to unilaterally deviate. Critically, it relies on the idea that a strategy is only a 'good' thing to play conditional on the other player using some fixed strategy.
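To make the "no incentive to unilaterally deviate" condition concrete, here is a minimal sketch that checks every pair of pure strategies in both matrices from the question. It assumes the first row/column means "betray" and the second means "cooperate", which is my reading of the matrices rather than something stated in the question, and the function name `pure_nash_equilibria` is just illustrative. It only looks at pure strategies, so it ignores mixed equilibria.

```python
from itertools import product

# Payoffs as (row player, column player).
# Assumption: index 0 = betray, index 1 = cooperate.
PRISONERS_DILEMMA = [[(1, 1), (3, 0)], [(0, 3), (2, 2)]]
MODIFIED_GAME = [[(1, 1), (3, 0)], [(0, 3), (4, 4)]]

def pure_nash_equilibria(game):
    """Return all (row, col) pairs where neither player gains by deviating alone."""
    n_rows, n_cols = len(game), len(game[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        # Row player compares against other rows, holding the column fixed.
        row_ok = all(game[r][c][0] >= game[r2][c][0] for r2 in range(n_rows))
        # Column player compares against other columns, holding the row fixed.
        col_ok = all(game[r][c][1] >= game[r][c2][1] for c2 in range(n_cols))
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

print(pure_nash_equilibria(PRISONERS_DILEMMA))  # [(0, 0)]
print(pure_nash_equilibria(MODIFIED_GAME))      # [(0, 0), (1, 1)]
```

Under that labelling, the original game has only mutual betrayal as a stable pair, while the modified game has two: mutual betrayal and mutual cooperation. NE by itself does not say which of the two you will end up in.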
In your question, you talked about maximizing expected reward. Now, expectations of any sort aren't well defined without some prior distribution over your opponent's strategies: the relative likelihoods you ascribe to them doing a given thing. If you were given such a set of beliefs, this would be a straightforward calculation. But this is a different problem from the one NE seeks to solve. NE just asks which pairs of strategies are stable, absent any sort of prior beliefs.
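As a rough illustration of that "straightforward calculation", suppose you believe your opponent cooperates with probability `q_cooperate`. That belief is an assumption you have to bring yourself; the game does not supply it. The sketch below uses the row player's payoffs from the modified matrix, again assuming index 0 is "betray" and index 1 is "cooperate"; the helper names are hypothetical.

```python
# Row player's payoffs in the modified game.
# Assumption: index 0 = betray, index 1 = cooperate.
ROW_PAYOFF = [[1, 3], [0, 4]]

def expected_payoffs(q_cooperate):
    """Expected payoff of each of my actions, given my belief about the opponent."""
    q = q_cooperate
    return {
        "betray": (1 - q) * ROW_PAYOFF[0][0] + q * ROW_PAYOFF[0][1],
        "cooperate": (1 - q) * ROW_PAYOFF[1][0] + q * ROW_PAYOFF[1][1],
    }

def best_response(q_cooperate):
    """Action with the highest expected payoff (ties go to the first listed action)."""
    payoffs = expected_payoffs(q_cooperate)
    return max(payoffs, key=payoffs.get)

for q in (0.25, 0.75):
    print(q, expected_payoffs(q), best_response(q))
```

With these payoffs, betraying yields $1 + 2q$ and cooperating yields $4q$, so cooperating is better exactly when you believe $q > 1/2$, and you are indifferent at $q = 1/2$. The answer depends entirely on the belief you plug in, which is why it is a different question from the one NE answers.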