Tonight a game of Scrabble ended in what I consider a very unusual graph structure,
unlike this generic web image, which seems more typical:

Let us call the Scrabble graph the one whose nodes are tiles, with two tiles connected
by an arc in the graph if they share a side.
The unusual graph that occurred contained a long snaking chain that consumed most of the board.
This made me wonder if anyone has performed a statistical analysis of "average" Scrabble
graphs; for example, their diameter, or their cycle structure?
No doubt these properties depend on the skill of the players, but
it seems that, within some reasonable parameters, there may be answers of sorts. I would be curious to
hear of any data in this direction. Thanks!
Afterthought. Maybe equally interesting: What is the largest diameter of a legal, achievable Scrabble graph?
Methods
Starting from the base url http://www.cross-tables.com/annotated.php?a=1 I used a combination of Python's
urllib, multiprocessing and BeautifulSoup to extract the first 10000 games. The games were parsed and turned into numpy 15x15 Boolean matrices. The matrices were then turned into graphs by adding an edge whenever two adjacent cells in the matrix were both active. Graph properties were then analysed with networkx. Of the 10000 games, only 9966 were usable. Some games did not start on the center tile, while others ended so quickly and strangely that they did not behave properly. Fortunately these games were rare enough that the sample should give a robust estimate of the true distributions.
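To make the matrix-to-graph step concrete, here is a minimal sketch of how a Boolean board could be turned into a networkx graph. The function name `board_to_graph` and the toy board are my own illustration, not the code used for the analysis; only side-sharing cells are joined, matching the definition in the question.

```python
import numpy as np
import networkx as nx


def board_to_graph(board):
    """Build a graph whose nodes are occupied cells; edges join side-sharing cells."""
    board = np.asarray(board, dtype=bool)
    graph = nx.Graph()
    rows, cols = board.shape
    for r, c in zip(*np.nonzero(board)):
        r, c = int(r), int(c)
        graph.add_node((r, c))
        # Looking only right and down visits each adjacent pair exactly once.
        if c + 1 < cols and board[r, c + 1]:
            graph.add_edge((r, c), (r, c + 1))
        if r + 1 < rows and board[r + 1, c]:
            graph.add_edge((r, c), (r + 1, c))
    return graph


# Hypothetical toy board: three tiles across the center row, one tile below the middle one.
board = np.zeros((15, 15), dtype=bool)
board[7, 6:9] = True
board[8, 7] = True

graph = board_to_graph(board)
print(graph.number_of_nodes(), graph.number_of_edges())
print(nx.diameter(graph), nx.radius(graph))
```

`nx.diameter` and `nx.radius` raise an error on a disconnected graph, which a legal Scrabble board should never produce, so an exception here is itself a useful sign of a badly parsed game.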
Methods (Update)
There was a bit more data cleanup needed. I had not taken challenged moves into consideration, leading to games that appeared to use more than 100 tiles. In the process, I noted false moves and fake games. We may have to live with a bit of uncertainty in the data; such is the cost of true empirical data.
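A sanity filter along these lines might look like the sketch below. It is a hypothetical check on the final board only (the name `is_plausible_board` and its exact rules are assumptions of mine), and it does not reproduce the handling of challenged moves, which has to happen while parsing the move list.

```python
import numpy as np

TOTAL_TILES = 100  # a standard English-language Scrabble set has 100 tiles
CENTER = (7, 7)    # zero-based index of the center square on a 15x15 board


def is_plausible_board(board):
    """Return True if the final board passes two basic sanity checks."""
    board = np.asarray(board, dtype=bool)
    if board.shape != (15, 15):
        return False
    # The first word must cover the center square.
    if not board[CENTER]:
        return False
    # More occupied cells than tiles in the bag means the parse went wrong.
    return 0 < int(board.sum()) <= TOTAL_TILES


# Hypothetical examples: one sensible board, one that misses the center, one overfull.
good = np.zeros((15, 15), dtype=bool)
good[7, 5:10] = True
off_center = np.zeros((15, 15), dtype=bool)
off_center[0, 0:3] = True
overfull = np.ones((15, 15), dtype=bool)

print(is_plausible_board(good), is_plausible_board(off_center), is_plausible_board(overfull))
```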
Results
The first interesting piece of information is the board frequency - this gives a nice spatial connection to the graphs we are about to study. Notice that, due to how the game is played (and how we read left to right and top to bottom) the board is asymmetric.
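One way to compute such a board frequency, assuming the games are already available as 15x15 Boolean matrices, is to average them cell by cell. The helper name `occupancy_frequency` and the two toy boards are illustrative only.

```python
import numpy as np


def occupancy_frequency(boards):
    """Fraction of games in which each of the 15x15 squares ended up covered."""
    stack = np.stack([np.asarray(b, dtype=bool) for b in boards])
    return stack.mean(axis=0)


# Two hypothetical final boards.
first = np.zeros((15, 15), dtype=bool)
first[7, 7:10] = True
second = np.zeros((15, 15), dtype=bool)
second[7, 7] = True
second[8, 7] = True

freq = occupancy_frequency([first, second])
# A rough asymmetry measure: how far the map is from its own transpose.
asymmetry = float(np.abs(freq - freq.T).max())
print(freq[7, 7], freq[7, 8], asymmetry)
```

Comparing the frequency map with its transpose (or with its left-right mirror) gives a quick numerical handle on the asymmetry visible in the plot.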
From here we can answer the question,
A scatter plot versus the graph size reveals a bit more information for the smaller $N$ values:
Results (Update)
Based on a comment, I plotted the radius vs the diameter, giving a mostly linear relationship in which the diameter is twice the radius, except for a range of games with some variance. Feel free to make some observations on the significance of this in the comments.
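For context, here is the standard bound relating the two quantities; the notation ($d$, $\epsilon$, $r$, $D$) is mine. For a connected graph with distance $d(u,v)$, define the eccentricity $\epsilon(v) = \max_u d(u,v)$, the radius $r = \min_v \epsilon(v)$ and the diameter $D = \max_v \epsilon(v)$. Then

$$ r \le D \le 2r. $$

The left inequality is immediate from the definitions. For the right one, pick a central vertex $c$ with $\epsilon(c) = r$; the triangle inequality gives, for any $u, v$,

$$ d(u,v) \le d(u,c) + d(c,v) \le 2r. $$

So a 1 to 2 line in the radius vs diameter plot is the upper extreme of what is possible. For trees, $D$ is always either $2r$ or $2r - 1$, so graphs that are mostly long chains with few cycles would be expected to sit on or just below that line.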
Quick Conclusion (TLDR)
From the data studied, most games were full games of ~100 tiles, with an average graph radius of 18 and a diameter of 36. Further work is needed to compare these results to random graphs with the same size and edge counts but a different edge distribution.
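As a starting point for that comparison, one possible baseline is the G(n, m) random graph, which fixes the node and edge counts and places the edges uniformly at random. The sketch below is only an assumption about what such a comparison could look like; the function name `random_baseline` and the edge count of 110 are hypothetical, not measured values.

```python
import networkx as nx


def random_baseline(n_nodes, n_edges, seed=None):
    """Diameter and radius of the largest component of a G(n, m) random graph."""
    graph = nx.gnm_random_graph(n_nodes, n_edges, seed=seed)
    # A G(n, m) graph this sparse is usually disconnected, and the diameter is only
    # defined on a connected graph, so restrict to the largest component.
    largest = max(nx.connected_components(graph), key=len)
    component = graph.subgraph(largest)
    return nx.diameter(component), nx.radius(component)


# Hypothetical sizes, roughly a full game: 100 tiles and an assumed 110 adjacencies.
diameter, radius = random_baseline(100, 110, seed=0)
print(diameter, radius)
```

Note that this baseline ignores the grid geometry entirely, so it answers "how unusual are Scrabble graphs among all graphs of this size", not "among all planar grid subgraphs"; a grid-aware null model would be a separate exercise.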