What mathematical structure best describe a Microsoft Excel spreadsheet?

223 Views Asked by At

Here is a basic question: how would you model an excel spreadsheet in terms of linear algebra, set theory, matrices, vector spaces?

A excel spreadsheet consists of cells, whereby cells can contain (for the sake of simplicity) two types of numerical value or symbols (such as alphabets/characters).

For example, it could appear as,

\begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}

or

\begin{bmatrix} 1 & 2 & \cdots \\ 4 & \ddots & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix}

or \begin{bmatrix} 1 & a & c \\ b & 2 & d \\ e & 3 & 4 \end{bmatrix}

or

\begin{bmatrix} ! & a & c \\ b & ? & d \\ e & $ & @ \end{bmatrix}

or perhaps other types (imagine an empty spreadsheet or cells with functions $f(x)$).

What mathematical structure would best correspond to each of these examples?

2

There are 2 best solutions below

6
On BEST ANSWER

Linear algebra is not a good model for Excel spreadsheets. What would it mean to add or multiply two spread sheets? The best model comes from graph theory, which is a general way of describing relationships between objects, called "vertices" in the theory. Two vertices can be by an "edge" to indicate some relation between them. In some cases, the edges are directed, meaning that they point from one vertex to the other. A cycle in a graph is a path that goes from vertex to vertex along the edges, and eventually come back to where it started.

The structure of an Excel worksheet is best described as a directed acyclic graph (DAG). The vertices of the graph are the cells. There is an arc (directed edge) from cell A to cell B if the value of cell A enters into the computation of the value of cell B. The graph must be acyclic, because if there were a cycle, the value of a cell would depend on itself, and the software doesn't handle recursion.

Spreadsheets work by topologically sorting the vertices of this graph, and then evaluating them in that order. So first the cells that have constant data entered in them are evaluated, then the cells who values depend on those constants, and so on. Each time a cell comes up for evaluation, the sorting guarantees that all its dependencies have already been evaluated.

I should say, perhaps, that I am not privy to the Excel code. This discussion is based on my understanding of the general structure of spreadsheet programs, as they would appear in a programming project in a computer science class.

2
On

Excel sheets are a function from the set of cells to the value in each cell so $f:\mathbb{N}^2 \rightarrow A$ where $A$ is the set of symbols which is typically called an alphabet. The value is typically null on a new sheet and you edit the values such as $f(B3) := "data"$ by editing the cell.