Let's say we have tabular (numerical) data where each column is a dimension, so each row is a data point in $R^n$ when there are $n$ columns. A decision tree can then be seen as a partition of the data, where the partition is made by hyperplanes parallel to the coordinate axes. When the data lives in $R^2$, it is partitioned by lines parallel to the $x$- or $y$-axis. So at each node the tree makes a binary, axis-parallel split.
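To make the "tree = axis-parallel partition" view concrete, here is a minimal sketch (the `Node` representation and the example tree are my own, not from the paper): each internal node splits one coordinate at a threshold, and walking the tree collects the axis-parallel box occupied by each leaf.

```python
# Minimal sketch of a decision tree on R^2 as an axis-parallel partition.
# Internal nodes split one coordinate at a threshold; leaves carry a label.
class Node:
    def __init__(self, dim=None, threshold=None, left=None, right=None, label=None):
        self.dim, self.threshold = dim, threshold
        self.left, self.right, self.label = left, right, label

def leaf_boxes(node, box):
    # box: one [lo, hi] interval per dimension; yields (box, label) per leaf
    if node.label is not None:
        yield ([list(b) for b in box], node.label)
        return
    lo, hi = box[node.dim]
    left_box = [list(b) for b in box]
    left_box[node.dim] = [lo, node.threshold]      # x_dim <= threshold side
    right_box = [list(b) for b in box]
    right_box[node.dim] = [node.threshold, hi]     # x_dim >  threshold side
    yield from leaf_boxes(node.left, left_box)
    yield from leaf_boxes(node.right, right_box)

# Example tree: x <= 1 ? (y <= 2 ? A : B) : C, over the region [0,3] x [0,3]
tree = Node(dim=0, threshold=1,
            left=Node(dim=1, threshold=2, left=Node(label="A"), right=Node(label="B")),
            right=Node(label="C"))
boxes = list(leaf_boxes(tree, [[0, 3], [0, 3]]))
# three boxes: [0,1]x[0,2] -> A, [0,1]x[2,3] -> B, [1,3]x[0,3] -> C
```

Each recursive split only ever narrows one coordinate interval, which is exactly why every leaf region is an axis-parallel box.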
In the following paper I came across the definition of a *decision space*, which is a generalisation of a decision tree:

https://arxiv.org/abs/1501.05141

The definition is given in section 4 of the paper. Roughly, it says that we can build a partition out of boxes (polyhedra), each specified by a minimum and maximum value per column, and each box plays the role of a leaf in a decision tree.
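As I understand it, a decision space can be sketched as follows (the representation and the example boxes are my own illustration, not the paper's notation): a set of labelled boxes, each given by one (min, max) interval per column, with prediction by membership lookup.

```python
# Hedged sketch of a "decision space": a list of axis-parallel boxes,
# each described by one (min, max) interval per column, plus a label.
def predict(decision_space, point):
    # Return the label of the first box containing the point
    # (boxes sharing a boundary resolve to whichever is listed first).
    for intervals, label in decision_space:
        if all(lo <= x <= hi for (lo, hi), x in zip(intervals, point)):
            return label
    return None  # point lies outside every box

# Two boxes in R^2, written down directly rather than read off a tree.
space = [([(0, 1), (0, 2)], "A"),
         ([(1, 3), (0, 3)], "B")]
# predict(space, (0.5, 1.0)) -> "A"; predict(space, (2.0, 2.5)) -> "B"
```

The point of the generalisation is that the boxes are arbitrary: nothing forces them to arise from a sequence of nested splits the way a tree's leaves do.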
The following example makes it clearer:

Hence, given a decision tree we can always construct a decision space. The converse is not true: given a decision space, we cannot always construct a decision tree. A counter-example is given by the following rule set.
My question is: what condition must we impose on a decision space so that it corresponds to a decision tree? The paper addresses this in the following paragraph:
> A decision tree cannot generate all axis-parallel rectangles. Indeed, a decision tree belongs to the data mining family of divide-and-conquer algorithms that imposes constraints on the search. Intuitively, a cut in the space along the border of an element, either vertical or horizontal, should not cut any element.
This statement is not clear to me. Any explanation is welcome.
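For what it's worth, here is my own attempt to formalise the condition (the code and the box coordinates are my illustration, not the paper's figure): a decision space would come from a tree exactly when it admits a recursive "guillotine" partition, i.e. at every stage some full axis-parallel cut through the current region passes along box borders without cutting the interior of any box.

```python
# Hedged sketch: check whether a set of axis-parallel boxes in R^2 can be
# produced by a decision tree, by searching for a recursive guillotine cut.
# Each box is ((xmin, xmax), (ymin, ymax)).
def can_be_tree(boxes):
    if len(boxes) <= 1:
        return True
    for dim in (0, 1):
        lo = min(b[dim][0] for b in boxes)
        hi = max(b[dim][1] for b in boxes)
        # candidate cut positions are the box borders in this dimension
        for c in {b[dim][0] for b in boxes} | {b[dim][1] for b in boxes}:
            if not (lo < c < hi):
                continue
            # the quoted condition: the full cut must not cut any box's interior
            if any(b[dim][0] < c < b[dim][1] for b in boxes):
                continue
            left = [b for b in boxes if b[dim][1] <= c]
            right = [b for b in boxes if b[dim][0] >= c]
            if can_be_tree(left) and can_be_tree(right):
                return True
    return False

# Boxes read off an actual tree (split x=1, then y-splits on each side): OK.
tree_boxes = [((0, 1), (0, 2)), ((0, 1), (2, 3)),
              ((1, 3), (0, 1)), ((1, 3), (1, 3))]

# A "pinwheel" of five boxes tiling [0,3]x[0,3]: every full vertical or
# horizontal cut slices some box's interior, so no decision tree exists.
pinwheel = [((0, 2), (0, 1)), ((2, 3), (0, 2)),
            ((1, 3), (2, 3)), ((0, 1), (1, 3)),
            ((1, 2), (1, 2))]
```

On this reading, the pinwheel arrangement is the kind of axis-parallel rectangle set the quoted paragraph says a tree cannot generate, since the first split of a tree is necessarily a full cut through the whole space.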
