Mining Crypto, What is Bitcoin

For this project a raspberry pi will be converted to a miner, working with a pool of others to solve cryptographic puzzles. To fully understand this process we must start with the whitepaper. - [[#The Bitcoin Whitepaper]] - [[#The Introduction]] - [[#The Double-Spending Example Without a Third Party]] - [[#The Double-Spending Example With a Third Party]] - [[#The Transactions]] - [[#The Timestamp Server]] - [[#Proof of Work]] - [[#The Complex Problem]] - [[#The Incentive]] - [[#The Network]] - [[#The Disk Space]] - [[#The Simplified Payment Verification]] - [[#The Privacy]] - [[#The Math]] - [[#The Conclusion]] - [[#The Mining]] - [[#The Pi Setup]] - [[#The Mining Software]] - [[#The Mine]] - [[#Connections | Questions:]] # The Bitcoin Whitepaper A whitepaper is a document that is produced to define and explain an idea or project at "high-level". The paper was written by a person (or group) named Satoshi Nakamoto. The main problem that this paper attempts to solve is how to remove the third party from financial transactions. When people exchange electronic money, they need the banks to record it on a ledger. The banks verify the purchase and maintain the ledger. Through the use of digital signatures people can verify their own transactions, but cannot maintain a common ledger. This common ledger problem is solved using what is known as a blockchain. The blockchain is a list of time-stamped transactions that are verified in a peer-to-peer network using Proof of Work (PoW). Computers compete to verify the block the quickest to add it to the chain. The longest chain is considered valid which means so as long as less than 51% of the computers are bad actors, only valid transactions will be recorded. ## The Introduction Using cash as a medium for trade can be secure, anonymous, trustless, and irreversible, however it only works for in-person or local transactions. This system worked great for awhile, but globalization took over and now anything can be bought from anywhere for the lowest price. The solution for internet transactions involves banks serving as third parties in the verification and maintenance of the ledger. The ledger is a history of all transactions made using the bank. We can think of all the banks of the world as just one bank, the third party bank. ## The Double-Spending Example Without a Third Party _Abby finds a $10 bill on the ground_ _Abby calls Matthew and promises to give him $10 if he gives her a clue for the Daily Wordle_ _Matthew agrees and gives Abby a clue and will get the $10 the next time they meet up_ _Abby still cannot figure out the Wordle so she calls Simon and makes the same deal_ _Simon agrees to give Abby a clue because he is not aware that Abby already promised the $10 to Matthew_ Even if Matthew and Simon figure out that Abby double-spent her money, it is too late, the boys cannot take back the clues they gave her. A solution to this problem is to involve a third party, the bank. ## The Double-Spending Example With a Third Party _Abby finds a $10 bill on the ground_ _Abby calls Matthew and promises to give him $10 if he gives her a clue for the Daily Wordle_ _Matthew agrees and tells Abby to send him the money first_ _Abby goes to the bank and deposits the $10_ _The bank then gives Matthew the $10_ _After Matthew sees the money he gives Abby the clue_ _Abby still cannot figure out the Wordle so she calls Simon and makes the same deal_ _Simon agrees and tells Abby to send him the money first_ _Abby tells the bank to give Simon $10, but since the bank knows Abby already spent it, they refuse the transaction_ _Simon doesn't receive the money so he doesn't give Abby the clue_ In this scenario, the boys only accept money that has come from the bank. This third party system requires both the buyer and seller to trust the bank to maintain the ledger. The bank knew that Abby already spent the money because they can look back in their records. An obvious question arises: What if Matthew accepts the money but doesn't give Abby a clue? This forces the bank to take on another role, mediator. Abby can dispute the transaction and if she can prove Matthew didn't give her a clue then the bank can reverse the transaction. Overall the third party must be trusted, know the identities of all parties, and have the ability to reverse transactions. The third party gets a lot of trust and personal information from the participates just for acting as a middleman of a transaction. In this system some fraud is unavoidable because of the third party mediator. Two people can physically swap goods but the same is not true on the internet. The paper proposes a solution that replaces trust with cryptographic proof. In this context, "cryptographic" means irreversible calculations. Moreover it means that the only way to get proof of the transaction is by actually doing the transaction. People cannot fake transactions that didn't happen because they do not have the cryptographic proof. # The Transactions A "Bitcoin" is defined as a series of transactions. This means that a Bitcoin is not a physical coin or even a digital image of a coin, rather it _is_ the combination of all of the previous transactions/transfers. Each transfer adds and updates the coins data. An obvious question is what is a transaction? The white paper includes this image: ![[4 - Attachments/image 25.png|image 25.png]] It is best to go through each term and fully define it. Every account (or person, but a person can have many accounts) has a Public/Private Key pair. The Public Key (PK) is public knowledge but this doesn't mean that there is an identity attached to it. Rather it means that a certain transaction used this PK. The Private(Secret) Key (SK) is secret and should never be shared with anyone, ever. An example of a transaction is: _Abby gives Matthew $10_ This example transaction can be combined with Abbys PK to make a hash, let's label it called TR. Abby must sign this transaction, meaning that she in fact does want to transfer $10 to Matthew. The signature is created using Abby's SK (this is why the SK must remain secret, or anyone could fake your signature). An example of Abby's signature is: _TR + SK = Signature_ More formally TR and SK are inputs to a cryptographic function. Recall that a cryptographic function means it can be computed one way but not the other. Imagine a machine that takes two inputs (TR and SK) and outputs a Signature: ![[4 - Attachments/image 1 13.png|image 1 13.png]] The machine is called SHA and only works one way which means the SK cannot be reverse calculated. Even though the TR, SHA function, and the Signature are all known, the SK cannot be calculated. This means that if there is a signed transaction, it is extremely unlikely for it to be fraudulent. More formally the signature equation looks like this: _SHA(TR, SK) = Signature_ Another question, how is the signature verified if the SK is unknown? That is where the PK comes in to create a second equation. The PK and the signature goes into a verification function, VER, that returns either true or false. Since all variables are known to the public there is no need for it to be too complicated: _VER(Signature, PK) = True or False_ These two equations are great for verifying transactions but they cannot verify that Abby hasn't sent the same $10 to two different people. Since a coin is just a list of previous transactions, it can fork into two different chains of transactions and the receivers may never know. This again can be solved with the bank. If all transactions must go through the bank, the bank can check its ledger to verify no coin was double-spent. The only way for everyone to know about all transactions without the use of a third party is to publicly announce each transaction and agree on a public ledger. There needs to be a way to know which transaction was first and also that all (valid) transactions get included on the ledger. This could mean that everyone puts each transaction on a website, but that involves a third party, the website's webmaster. ## The Timestamp Server The solution to the double-spend problem is to add a timestamp so the order in which transactions happen are known. This solves the issues of transactions happening in quick succession. The paper suggests taking many items, forming a block, and taking the hash of the block. The hash is then announced publicly. Recall that hashes are one-way functions, so the only way to produce the hash would be for the transactions to actually happen. A block in the case of Bitcoin contains 2,400 transactions and the hash of the previous block. The inclusion of the previous hash means that all the transactions in the previous hash must have happened before the transactions in the current hash. Since the only way to get a hash is for the previous transactions to have actually happened, we know that the previous block happened before the current block. This solves the problem of the double-spend because the history of the transactions are now publicly known. Each of the blocks are connected to the previous because they contain the hash of the previous block. This mains a chain of blocks or a _blockchain_. There is still one major problem with this solution, how do we know which chain is the real chain? Let's say Abby transferred Simon $1,000 dollars. Since Abby is sending the money, she needs to sign the transaction with her private key, pretty straightforward. After that transaction 15 more blocks of transactions have been added to the chain. Abby nows tries to steal Simons money by going back 15 blocks and changing the transaction to say $1 instead of $1,000. She then adds the same 15 blocks to the altered transaction. We now have two chains of equal length and the only difference is that single transaction 15 blocks ago. The solution to this problem is to always assume that the longest blockchain is correct AND make it hard for blocks to be added to the chain. If we go back to the example, Abby has to add 15 blocks to her altered transaction whilst the unaltered chain is 15 blocks ahead. This means that everyone will ignore Abby's shorter, altered chain and opt for the longer, unaltered chain. If blocks are hard to add then Abby will have a tricky time catching up and overtaking the unaltered chain. Now, how do we make it hard to add blocks to the chain? ## Proof of Work Remember Bitcoin is a peer-to-peer network which means that the transactions, blocks, are added by people like you or me. More formally CPUs maintain the blockchain that we control. We can make it hard to add blocks if we force the CPU to solve a complex problem before being allowed to add a block to the chain. This complex problem is so hard that the computers simply need to guess the answer randomly until they get it. This complex problem can be modified so that on average all the CPUs contributing to the blockchain take about 10 minutes to solve it and thus add a block to the chain. Only a single CPU needs to find the right answer because then they tell everyone else because they are proving that they did the work to add the block. Let's go back to the stealing example. If there are 10 CPUs maintaining the blockchain, including Abby's, then on average the complex problem will take 10 CPUs about 10 minutes to solve. When Abby goes back 15 blocks to alter the transaction, she needs to create 15 new blocks to catch up with the unaltered chain. Since Abby only has 1 CPU it will take her ~10x longer to solve the problem and add a block to her altered chain. There is no way that Abby will be able to catch up with only 1 CPU and thus the fraud cannot occur. ## The Complex Problem What problem is so hard that a computer just has to guess? A cryptographic equation! If we give the CPU the hash and the input message for the first equation example, then the only way to find the private key is to guess and check. In this case the message is the block and the hash is a 256-bit number will x leading zeros where x is a varying number between 0 and 256. The higher the x, the harder the problem. The more nodes trying to solve the problem, the higher x goes. Running CPUs take power and effort so how are these CPUs (other names: nodes/miners) rewarded for their honest efforts? ## The Incentive The lucky CPU to actually solve the complex problem gets to place a block on the chain. The reason why this is a good thing is because the first transaction of a given block is a reward transaction. The transaction is simply that the node gives themselves a set amount of bitcoin. This means that nodes earn bitcoin for maintaining the blockchain. People transfering bitcoin to each other can also "tip" the miners. This tip is another reward for their efforts. The set amount of bitcoin that a node receives for adding a block goes down as the total supply of bitcoin goes up until a limit and then there is no reward for adding a block. The miners then have to fully rely on tips. This bitcoin limit is estimated to be reached in the 2100s (21 million bitcoin). The paper also theorizes that this incentive will encourage nodes to refrain from trying to steal. This is because they can either direct their CPU toward stealing or mining. If they steal, they make the average bitcoin user weary of the asset. But if they simply mine then they are rewarded with bitcoin. Their stealing may undermine and devalue the exact asset that they are trying to get more of whilst mining strengthens the asset. ## The Network The paper explains the bitcoin protocol in this way: 1. New transactions are broadcast to all nodes 2. Each node collects 2,400 transactions to make a block 3. Each node then trys to solve the complex problem of their block 4. Once a node solves the problem they broadcast the block and the answer to all the other nodes 5. All other nodes verify the block and start working on the next one Nodes always work on the longest chain. If two chains are tied then nodes choose whichever chain to mine on and eventually one chain will become longer than the other. So long as 51% of the nodes are honest, then the true chain will be longer and will always be considered correct. The more and more blocks added after a transactions, the more sure the participants can be that the transaction is valid. Because of the 51% rule it is okay if some of the nodes miss the most recent block because once they do finally get a new block they will realize they are missing and request it from other nodes. They will know they are missing a block because the hash of the previous block is stored within the current block ## The Disk Space As of 2022 the entire bitcoin blockchain is about ~400 GB. To save nodes from having to have the entire chain stored locally, the paper suggests using a Merkle Tree of Transaction hashes to save space. Each transaction gets a hash and is entered into a "March Madness" style bracket where instead of the two hashes playing basketball, they simply get hashed into a single hash. If there are 2,400 transactions in a block, then they get paired into 1,200 hashes, and then 600, 300, 150 ,75, 36, 13, 7, 4, 2, and finally 1. Whenever there is an odd number of hashes, the odd one out hashes with itself. This root hash is stored in the header of a given block. This type of storing means that 2,400 transactions are accounted for in a single hash. Specific transactions can then be queried if desired on the full blockchain and the Merkle Tree can be traced. This typically happens with transactions that happened many blocks ago and thus are not relevant in current transactions. For example if a store clerk gave me change, I don't really care who gave him the change in the first place. ## The Simplified Payment Verification A faster and more simple way of verifying payments is to store the block headers on the local computer. If the transaction is linked to a Merkle root on the longest chain, then the verifier can be sure that the other nodes have accepted the transaction. The paper includes this image: Note: In this image each block only contains 4 transactions, but the Merkle shape is obvious. ![[4 - Attachments/image 2 8.png|image 2 8.png]] Transactions can contain multiple inputs and outputs to simply processes such as over-payments and tipping. ## The Privacy Privacy and security are some of the major selling factors about cryptocurrency. A major issue with having banks as a third party this that they know an excessive amount of personal information about both parties in a transaction. A bank knows exactly who you are, where you live, and what you are buying and selling. The third parties typically do a good job of controlling that information, but they are in control, not you. On the blockchain the transactions are public, but there are no names attached to the public addresses of the buyer or seller. The public can see all the transactions but unless the addresses are disclosed by the owners the public will have nary a clue who is behind each address. People can create as many accounts (PK, SK) as they would like to prevent people from tracking the same account throughout the blockchain. The paper has a great visual on where the train of information stops in each model: ![[4 - Attachments/image 3 6.png|image 3 6.png]] ## The Math As mentioned before, the only way for someone to steal is to change one of their previous transactions. This is because the buyer must always sign the transaction to make it valid. Since signatures cannot be created without the SK, the only way to steal is to reduce the amount of money the attacker sent to another address in a previous transaction. We made a claim that an attacker would have to change a previous transaction and then add more blocks to the modified chain to be longer than the unaltered chain. But what if the attacker gets lucky? How lucky does the attacker really have to be? The paper suggests that the race for the altered chain to catch up to the previous chain can be described as a Binomial Random Walk. Adding a block to the unaltered chain is a +1 and a block to the altered chain is a -1. A deep-dive into the Gambler's Ruin problem can be found [[The Gambler’s Ruin]]. Going through the calculations of the probability that the altered chain will catch the unaltered chain is as follows: $p$ = probability an honest node solves the complex problem and adds a block $q$ = probability the stealer solves the complex problem and adds a block $qz$ = probability the stealer will catch up from z blocks behind $qz=1 \text{ if }p\leq q$ $qz=( \frac{q}{p})^z \text{ if } p>q$ The first equation is trivial because it is how we defined the problem. If the altered chain has more CPU then it will catch up to the unaltered chain. This is why it is best to wait until the transaction has been buried under a couple of blocks to be sure that it has been accepted as the longest chain. The plot for the altered chain catching up is given below, it is assumed that there are more honest nodes than attacking nodes ( _p > q_). ![[4 - Attachments/image 4 6.png|image 4 6.png]] We can see that the chances are very slim for attackers even with a large minority of the CPU power. After about 10 blocks there is a 0.00000003%, 0.0001%, 0.02%, and 1.7% chance an attacker will be able to catch up, respectively. 10 blocks seems like a very safe amount of time to wait until the receiver can be sure that the sender will not try to alter the transaction. How many z blocks should the receiver waits until they are sure the transaction is secure? If they waited 10 it would be very good but that would take 10 x 10 minutes or just under 2 hours. That is far too long for today's society. The attacker will make progress with a Poisson distribution: _λ = z * q/p_ Where z is the amount of blocks needed for a transaction to be considered "safe". To get the probability that the altered chain can catch up, the Poisson density is multiplied by each amount of progress the attacker may have made. The reason for the summation is because if the attacker shortens the difference in length by 1, the previous equation changes to _λ = (z-1) * q/p_ The simplified version of the summation equation is: ![[4 - Attachments/image 5 5.png|image 5 5.png]] The paper suggests waiting until there is less than a 0.1% chance of the attacker catching up. This means if the attacker has 10% of the total computing power (q) then the receiver should wait 5 blocks (z). More values are given: |q|z| |---|---| |15%|8| |20%|11| |25%|15| |30%|24| |35%|41| |40%|89| |45%|340| ||| ## The Conclusion Bitcoin is created to remove the bank from online transactions and all other forms of trust. Each transaction has a signature with a secret key that only the sender knows which makes it nearly impossible for an attacker to forge a transaction. The double-spending problem is resolved via a peer-to-peer network that requires nodes to prove that the block of transactions are valid by using up CPU. The nodes are rewarded with bitcoin for their efforts. Nodes do not need to be identified and can leave and rejoin the network whenever they please. So long as honest nodes control the majority of the CPU, the chain is safe and secure. Now that we fully understand the bitcoin blockchain we can start mining it! # The Mining Bitcoin is the token of the Bitcoin Blockchain which is the most popular cryptocurrency on the market. Because of this popularity, the barriers to entry for mining have become quite steep. Recall how the complex problem gets harder the more nodes there are. A consumer CPU is no match for purpose built machines. The goal is to be able to mine some crypto on a raspberry pi so we need to choose a different coin. Monero, the third most popular coin is the token for the job. Monero has a slightly different protocol than Bitcoin but the important difference is that Monero was designed to be mined on consumer-grade CPUs. This experiment can be generalized for many different tokens. ## The Pi Setup The only purpose of this CPU will be to mine crypto so when we are choosing a linux distro, we should get something lightweight. This means for RaspOS we want a lite version which has no desktop. The headless install is straightforward with the raspberry pi imager. ![[4 - Attachments/image 6 4.png|image 6 4.png]] With this OS installed and the SSH and WiFi enabled during the imaging, the pi can be powered on and ready to go. The IP was found by pinging the hostname of the pi. The next step in the set up is to install git and the other files needed for the CPU to be turned into a mining machine. ![[4 - Attachments/image 7 4.png|image 7 4.png]] After that the Pi is set up and can now be installed with the mining software of choice. ## The Mining Software The Monero ticker is XMR and we will be using a XMR rig on github. The XMRIG is installed like any other git repository. ![[4 - Attachments/image 8 4.png|image 8 4.png]] With the repo all set up the only thing left to do is make the project. ![[4 - Attachments/image 9 4.png|image 9 4.png]] The rig took awhile to build but now the Pi is ready to mine Monero. ## The Mine The Raspberry Pi CPU is quite small and will not be able to make many guesses for the complex problem per second. This means that it would be almost impossible to ever put a block on the chain. Small CPUs work best when they join a mining pool. A mining pool is a collection of other CPUs that all guess on the small block and split the reward if they guess the right answer. The split of the pot will be proportional to the amount of work put into guessing the solution. The pool that the pi will join is called Moneroocean. The final item needed is wallet address to send the earned Monero to. The app Cake Wallet offers Monero addresses so that will be used. The rig can be started as seen below. ![[4 - Attachments/image 10 4.png|image 10 4.png]] The Raspberry Pi is now mining for Monero and the rewards are automatically sent to the address provided. How does the CPU stack up? We are getting around 18 hashes a second. This means that the Pi is making 18 guesses toward the complex problem per second. For reference purpose built machines can mine Bitcoin at a trillion hashes per second. So the mine isn't good... at all. But this process can be implemented on just about any CPU so if we find a more powerful CPU, we can easily set it up to mine Monero. ![[Connect#Connect]]