![[gamblersruinthumbnail.png]]
The Gambler’s Ruin is a statistical concept that is used to prove the security of Bitcoin and other blockchain-based technologies. This concept lacks explanation and proof in the Bitcoin whitepaper; This the the motivation for this project.
# The Motivation
Whilst reading the [Bitcoin Whitepaper](https://bitcoin.org/bitcoin.pdf), there was a section that involved proving the security of the blockchain. The chain is maintained by nodes (CPUs, computers, etc) and the author of the paper claims that so long as more than 50% of the nodes are honest, the blockchain is secure. The paper compares this claim to the Gambler’s Ruin and provides the equation $(q/p)^z$ without explanation.
### The Gambler’s Ruin Claim:
> [!important] If 60% of the Miners are honest and 40% dishonest, there is a 67% chance that the dishonest nodes will catch up to the honest nodes from a 1 block deficit…
# The Blockchain Problem
One the blockchain the longest chain is the accepted chain, the true chain. Previous transactions can be altered by attackers, but the catch is that they have to shorten their chain to make the alteration. Since the altered chain is shorted than the unaltered chain, the altered chain is considered invalid and none of the dishonest transactions are recognized. If somehow the attackers are able to get their chain longer than the unaltered chain, then their chain becomes the accepted chain. Their altered transactions always steal bitcoin from others.
The blockchain is lengthened by nodes guessing the right answer to a complex problem. If $p$ is the fraction of honest nodes and $q$ is the fraction of dishonest nodes, then:
$1=p+q$
We can also define $z$ as the number of blocks behind the altered chain is from the unaltered chain.
# The Gambler’s Ruin Problem
The Gambler’s Ruin is a concept that states that a gambler start with 1 penny and on each betting round he either wins or losses a penny. The gambler is betting on a weighted coin flip and bets heads every time. There is an $r$ chance it lands on heads and a $s$ chance it lands on tails. Therefore:
$1=r+s$
What are the odds that the gambler attains a fortune of $N$ pennies before getting ruined (i.e. going broke)?
# Bitcoin x Gambler’s Ruin
These two statistical events appear related but the differences are important. In the blockchain example, the attacker is at a deficit of blocks relative to the honest nodes and is trying to get to a zero block deficit. The Gambler’s ruin is the opposite, where the gambler starts with a surplus and is trying _not_ to reach 0 pennies. With each example there is also the other perspective. In the blockchain, the honest nodes are trying to be at an infinite number of blocks ahead of the attackers before the attackers catch up. This perspective is more like the gambler.
OK, so the gambler and the honest nodes are the same, and for now let’s assume that the honest nodes only need to get $N$ blocks ahead of the dishonest nodes.
The blockchain problem can now be thought of as a weighted coin-flip where again with $p$ and $q$ defined above. The honest nodes will always bet on $p$ and the dishonest on $q$. The honest node start $z$ blocks ahead of the dishonest nodes and are trying to reach $N$ nodes ahead before the attackers catch up.
More generally:
What is the probability that the honest nodes will get $N$ blocks ahead of the dishonest nodes using a weighted coin with weight $p$ for the honest nodes and $q$ for the dishonest nodes? The honest nodes start $z$ blocks ahead where $0 <z<N$ because either the attackers have caught where were $z<1$ or the honest nodes have gotten far enough ahead where $z=N$.
$P_z$ is the probability that the honest nodes will reach $N$ blocks ahead before the attackers catch up. For $P_0=0$, the attackers have caught up. For $P_N=1$, the honest nodes are far enough away. What is the $P_z$ for values between 0 and $N$? i.e. What is the probability that the honest nodes will get sufficiently far away from the dishonest nodes before the dishonest nodes catch up from a deficit of 1 or 2 … $N-1$?
# Recursion Derivation
Each added block changes the probability that the honest nodes will win (reach $N$ before the dishonest nodes catch up). If they start with a probability of $P_z$ and create a block, their probability of winning changes to $P_{z+1}$ and if the dishonest makes a node, it becomes $P_{z-1}$. This changing probability based on the total outcomes of all the previous blocks is called the Markov property. Therefore:
$P_z=pP_{z+1}+qP_{z-1}$
There is a $p$ chance of the honest nodes making a block and thus the probability of winning turns to $P_{z+1}$ and a $q$ of the dishonest nodes winning a block and thus the probability of winning turns to $P_{z-1}$. This equation is recursive and measn that it needs to call on itself to be solved. In an attempts to remove the recursion:
$(p+q)P_z=pP_{z+1}+qP_{z-1}$
$P_{z+1}-P_z=\frac{q}{p}(P_z-P_{z-1})$
Solving for $z=1$:
$P_2-P_1=\frac{q}{p}(P_1-P_0)$
Solving for $z=2$:
$P_3-P_2=\frac{q}{p}(P_2-P_1)$
We see a $P_2-P_1$ term so it can be replaced:
$P_3-P_2=\Big(\frac{q}{p}\Big)^2P_1$
If we keeping solving for increasing $z$ we can get the general equation:
$P_{z+1}-P_z=\Big(\frac{q}{p}\Big)^zP_1$
This is a common statistical pattern that goes:
$P_{z+1}-P_1=\sum_{n=1}^{z}P_{n+1}-P_n$
We can substitute in our equation and thus:
$P_{z+1}-P_1=\sum_{n=1}^{z}\Big(\frac{q}{p}\Big)^nP_1$
$P_{z+1}=P_1\Bigg(1+\sum_{n=1}^{z}\Big(\frac{q}{p}\Big)^n\Bigg)$
We can simplify further by changing the lower bound from $n=1$ to $n=0$:
$P_{z+1}=P_1\sum_{n=0}^{z}\Big(\frac{q}{p}\Big)^n$
# Into the Weeds
The geometric series for any $a$ and integer $i\ge0$:
$\sum_{n=0}^ia^i=\frac{1-a^{i+1}}{1-a}$
Therefore (equ 7):
$P_{z+1}=\begin{cases} P_1\frac{1-(\frac{q}{p})^{z+1}}{1-\frac{q}{p}} & p \neq q \\P_1(z+1)&p=q \end{cases}$
This equation will always solve for the probability of winning at $z+1$ and since $P_N =1$, we will set $z+1=N$:
$P_{N}=1=\begin{cases} P_1\frac{1-(\frac{q}{p})^{N}}{1-\frac{q}{p}} & p \neq q \\P_1N&p=q \end{cases}$
$P_1=\begin{cases} \frac{1-\frac{q}{p}}{1-(\frac{q}{p})^{N}} & p \neq q \\\frac{1}{N}&p=q \end{cases}$
We can put the above back into equ. 7:
$P_{z+1}=\begin{cases} \frac{1-\frac{q}{p}}{1-(\frac{q}{p})^{N}}\frac{1-(\frac{q}{p})^{z+1}}{1-\frac{q}{p}} & p \neq q \\\frac{(z+1)}{N}&p=q \end{cases}$
Simplifying further:
$P_{z}=\begin{cases} \frac{1-(\frac{q}{p})^z}{1-(\frac{q}{p})^N} & p \neq q \\\frac{z}{N}&p=q \end{cases}$
Therefore, the above is the probability that honest nodes will successfully get $N$ blocks ahead of dishonest nodes whilst starting $z$ blocks ahead before the dishonest nodes. NB: The Bitcoin whitepaper only claims security for the case where $p>q$.
# Infinite Blocks Ahead
Recall that the Bitcoin problem actually requires the honest nodes to get an infinite number of blocks ahead of the dishonest nodes. To add this requirement, we can take the limit as $N$ approaches $\infty$. Starting with the $p=q$ case:
$\lim_{N\rightarrow \infty} P_z=0$
This also holds for any $p<q$. Therefore in the cases above, the honest nodes will never get far enough away from the dishonet nodes.
For $p>q$, the denominator approaches 0 as $\frac{q}{p}$ gets raised to a very large power. Therefore:
$\lim_{N\rightarrow \infty} P_z=1-\Big(\frac{q}{p}\Big)^z$
Finally, to match the equation presented in the whitepaper, we will change the perspective to that of the attackers:
$P_{z \text{, dishonest}} = 1-P_{z \text{, honest}}$
Now, the probability that the dishonest nodes will catch the honest nodes is:
$P_z=\begin{cases}\Big(\frac{q}{p}\Big)^z&p>q\\1&p\leq q\end{cases}$
This is identical to the claim in the paper:
![[4 - Attachments/image 23.png|image 23.png]]
We have now shown that it does make sense that if the dishonest nodes have a 1 block deficit and control 40% of the nodes, they have a 67% of catching up.
![[4 - Attachments/image 1 11.png|image 1 11.png]]
This graph shows our answer along with some other combinations of probabilities. Their chances drop exponentially as the block difference increases.
# Another Way
My intuition for this problem was that the dishonest nodes would have a 40% chance of catching up, not a 67% chance. Even with the stats and math infront of me it didn’t click. The way that finally convinced me was a different method. The problem is stated as: What is the chance the dishonest nodes will _ever_ catch up? This means that if they start at a 1 block deficit and it turns into a 2 block deficit, they can still catch up, they just need to get two blocks in a row.
What are the odds the dishonest nodes will get the first block and thus catch up immediately? _40%_
But what are the odds the dishonest nodes will lose the first block but get two in a row? _40% $\cdot$ 40% = 16%_
Here is a table for block deficits 1 through 10:
![[4 - Attachments/image 2 6.png|image 2 6.png]]
If we sum up all these percentages we get … 67%!
# Final Thoughts
We are now certian that so long as honest nodes control more than 50% of the chain, the system is safe and secure. However people cannot wait forever to know that their transaction was secure, but the whitepaper goes into detail about this and shows their steps to show how many blocks a person would have to wait for there to be less than a 0.1& chance of the dishonest nodes catching up. Case closed… but wait. Honest and dishonest nodes are not actually competing against each other for blocks, but rather each mining their own chain. This means that they don’t need to discard their work if the other type of chain finishes block. How does this change the probability and does it mean that the chain is more or _less_ secure than the calculations show?
> [!important] For further reading, check out "An Introduction to Probability Theory and its Applications," by W. Feller.
![[Connect#Connect]]