I am writing a tutorial paper about the use of Markov chains / random walks on graphs in machine learning. Since I want to demonstrate that any Markov chain can be realized by a random walk on some graph $G$, I consider the most general case, where graphs can be weighted and/or directed and/or have self-loops (this last requirement is necessary since Markov chains are allowed to have self-transitions, i.e. $P_{ii}\neq 0$).
Let $G$ be such an undirected graph and let $w_{ij}=w_{ji}$ denote the weight of an edge between vertex $v_i$ and $v_j$ (when there is no edge, $w_{ij}=w_{ji}=0$). Then, $\mathbf{W}$ is the $N\times N$ matrix containing all values $w_{ij}$. If $G$ has no self-loops, a random walk on $G$ is described by a transition matrix $\mathbf{P}$ whose entries are $p_{ij}=\frac{w_{ij}}{d_i}$, where $d_i=\sum_j w_{ij}$ is the weighted degree of vertex $v_i$ and is simply the sum over the $i$-th row of $\mathbf{W}$. However, if $G$ has self-loops this is somewhat problematic, since for undirected graphs self-loops contribute twice as much as regular edges to the degree of the respective vertex, i.e. $d_i=2w_{ii}+\sum_{j\neq i}w_{ij}$ (this fact is a consequence of the handshaking lemma, or degree sum formula - https://en.wikipedia.org/wiki/Handshaking_lemma). Therefore, in the case of an undirected graph with self-loops, the degrees are not necessarily the row sums of $\mathbf{W}$.
Unfortunately, having the degrees equal to the row sums of $\mathbf{W}$ is an interpretation that I require when dealing with random walks throughout my tutorial. I thought about fixing this by instead assuming that for undirected graphs the diagonal entries of $\mathbf{W}$ are automatically twice the value of the actual loop weight. This would mean that the factors of 2 are taken care of when calculating the degrees by summing over the rows of $\mathbf{W}$. The downside is that there is then a mismatch between the weights shown in the visual depiction of a graph and the entries of its corresponding weight matrix. For example, the following would be an example of a graph and its weight matrix, where $w_{11}$ and $w_{22}$ are twice as big as the values shown in the diagram:

Has anyone seen this convention used anywhere? So far I haven't been able to find any examples of it online, which makes me think it is very atypical, and maybe a bad choice. However, I don't know how else to solve this issue. Also, can anyone think of any problems with using this convention (i.e. do any famous results or theorems about graphs or random walks no longer apply)?
You have basically total freedom to decide what to do, because as far as I can tell after checking a few sources, nobody considers random walks on undirected graphs with loops. Your other option, which is not as bad as it sounds, is to ditch the degree sum formula.
Consider the unweighted graph
with two vertices $v,w$, a self-loop at $v$, and an edge $vw$. I think the more reasonable way to define its random walk is that when you're at $v$, you have a $\frac12$ chance of staying at $v$, and a $\frac12$ chance of moving to $w$. The limiting distribution $\pi$ has $\pi_v = \frac23$ and $\pi_w = \frac13$.
This is consistent with defining $\deg(v) = 2$ and $\deg(w) = 1$, so that the limiting distribution is proportional to the degrees. However, the sum of degrees is now only twice the number of edges if we treat the loop as half an edge.
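To make this concrete, here is a small sketch (using NumPy) of the two-vertex example, with the loop weight entered once on the diagonal so that degrees are row sums. It checks that $\pi = (\frac23, \frac13)$ is indeed stationary:

```python
import numpy as np

# Weight matrix for the unweighted example: a self-loop at v
# (its weight counted once on the diagonal) and an edge vw.
W = np.array([[1.0, 1.0],
              [1.0, 0.0]])

d = W.sum(axis=1)        # degrees as row sums: deg(v) = 2, deg(w) = 1
P = W / d[:, None]       # transition probabilities p_ij = w_ij / d_i

pi = d / d.sum()         # candidate limiting distribution (2/3, 1/3)
assert np.allclose(pi @ P, pi)   # pi is stationary: pi P = pi
```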
When we make the graphs weighted, we continue to have $d_i = \sum_j w_{ij}$ with transition probabilities $p_{ij} = \frac{w_{ij}}{d_i}$. So the sum of rows of $\mathbf W$ is happy.
The stationary distribution is given by normalizing $(d_1, \dots, d_n)$, dividing it by $d = \sum_i d_i = \sum_{i,j} w_{ij}$. This is the sum of all entries of $\mathbf W$, and I've made it look innocent so far, but we do also have $d = \sum_i w_{ii} + 2 \sum_{i < j} w_{ij}$. In other words, to find $d$ from the diagram, we count the weight of each loop once, and the weight of every other edge twice.
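The weighted case can be sketched the same way. The matrix below is a made-up symmetric example with loops on the diagonal (each loop weight counted once, per the convention above); the code verifies both the identity $d = \sum_i w_{ii} + 2 \sum_{i < j} w_{ij}$ and that normalizing the row sums gives the stationary distribution:

```python
import numpy as np

# A hypothetical symmetric weight matrix with loop weights
# (3 and 5) entered once on the diagonal.
W = np.array([[3.0, 1.0, 0.0],
              [1.0, 0.0, 2.0],
              [0.0, 2.0, 5.0]])

d = W.sum(axis=1)        # d_i = sum_j w_ij (row sums)
total = d.sum()          # d = sum of all entries of W

# From the diagram, d counts each loop weight once and every
# other edge weight twice:
loops = np.trace(W)
off = W[np.triu_indices_from(W, k=1)].sum()
assert total == loops + 2 * off

P = W / d[:, None]       # p_ij = w_ij / d_i
pi = d / total           # stationary distribution
assert np.allclose(pi @ P, pi)
```

The second assertion follows from detailed balance: $\pi_i p_{ij} = \frac{w_{ij}}{d}$ is symmetric in $i,j$ whenever $\mathbf W$ is.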
When we want to model the undirected graph as a directed one, we replace each standard edge by a pair of directed edges, one in each direction, but replace each loop by a single (directed) loop.