Hashing’s Essential Role in the Bitcoin Protocol
Get an inside look at hashing and how it helps to secure the Bitcoin protocol.
What Is Hashing?
In computer science, "hashing" refers to a very specific procedure: converting a given string of text or numerical information into another, completely different string of text or numbers.
To transform a given sequence into a new one—to hash it—you input your sequence into an algorithm called a hash function.
You run your first string of text (your input) through the hash function, which then spits out a new, transformed string of text (your output, also known as your hash).
Hashing algorithms produce a hash of a fixed size, no matter how long the input is. For example, Bitcoin uses SHA-256, which always produces a hash that is 256 bits long, or 64 characters when written in hexadecimal form.
In a very basic way, it works like this: Imagine you are working with the sentence "The quick brown fox jumps over the lazy dog." You could hash this input to produce an output such as "6f4cbf26a5fc3452a9fa1694997d46e0" (an illustrative value, shortened for readability; a real SHA-256 hash has 64 hexadecimal characters).
Now, if someone were to change just one word in the input string (say, they replaced "The" with "A"), the hash of that new string would be completely different, for example "03ac674216f3e15c761ee1a5e255f06795".
In this way, hashing is hyper-specific. Changing even one character in the input string produces a hash that looks completely different from the hash of the original input.
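To see this behavior for yourself, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex is our own, chosen for illustration. It hashes the two sentences from the example above and counts how many hexadecimal characters differ between the results.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of `text` as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

original = "The quick brown fox jumps over the lazy dog."
altered = "A quick brown fox jumps over the lazy dog."

hash_original = sha256_hex(original)
hash_altered = sha256_hex(altered)

# Both digests are 64 hexadecimal characters (256 bits), whatever the input length.
print(hash_original)
print(hash_altered)

# Count the positions at which the two digests differ.
differing = sum(a != b for a, b in zip(hash_original, hash_altered))
print(f"{differing} of {len(hash_original)} hex characters differ")
```

Running it shows two 64-character strings with no visible relationship to each other, even though the inputs differ by a single word.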
What's the Point of Hashing?
On its surface, hashing seems like a somewhat arbitrary process. But hash functions are essential for cybersecurity and data integrity. Cryptographic hash functions are designed to be one-way, meaning it is computationally infeasible to convert the output back into the input. This is vital for all things digital because it helps detect tampering with data and helps ensure that messages are transmitted accurately and safely.
How Does Hashing Protect Data?
In cybersecurity, hash functions are used in several ways to protect information. For example, when you log into your online banking account, the password you enter is run through a hash function.
The hash of your password is then compared to the hash stored in the bank's database. If the two match, you are granted access. This is why getting even one character of your password wrong results in an "invalid login" error. Remember: changing one character in your input will result in a completely different hash.
Let's take an example. Say your password is "ilovesatoshi21". When this password is run through a hash function, it produces an output such as "9a9e7ea22f0b79efcd6141" (shortened here for illustration). The bank stores this hash in its database.
Now, when you try to log in, let's say you accidentally type in "ilovesatohsi20" as your password. When this string is run through the hash function, it produces a completely different output, such as "ec32dfbbe1ef2f0cc7a845".
The bank sees that the hash of your input doesn't match the hash of your password on file, so it denies you access. This is how companies can say they "don't store your password." They store only the hash, which, remember, cannot feasibly be reverse-engineered to figure out your password.
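The login check described above can be sketched in a few lines of Python. Note one deliberate difference from the plain hashing in the text: this sketch uses a salted, deliberately slow function (PBKDF2 from the standard library), which is a common way to store password hashes. The function names, the iteration count, and the in-memory "database" are assumptions made for illustration, not a description of any particular bank's system.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch

def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte hash from a password and a per-user random salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def verify_password(attempt: str, salt: bytes, stored_hash: bytes) -> bool:
    """Hash the login attempt and compare it to the stored hash in constant time."""
    return hmac.compare_digest(hash_password(attempt, salt), stored_hash)

# At sign-up, the service stores only the salt and the hash, never the password.
salt = os.urandom(16)
stored_hash = hash_password("ilovesatoshi21", salt)

print(verify_password("ilovesatoshi21", salt, stored_hash))  # True
print(verify_password("ilovesatohsi20", salt, stored_hash))  # False
```

The salt is a random value stored next to the hash so that two users with the same password do not end up with the same stored hash.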
How Does Hashing Work in Bitcoin?
Before we start, understand that this is a simplified explanation. We could write chapters on each step of the hashing process, but we'll spare you the mathematics lecture. Let's break it down.
Step 1—The transaction is announced to the Bitcoin network.
When you want to send some bitcoin to your friend, you announce the details of that transaction—how much bitcoin you're sending and to what address—to the Bitcoin network.
Step 2—The transaction is verified by all nodes on the network.
Once your transaction is announced, all of the nodes on the network verify that it is valid. Does this person have the bitcoin they're trying to send? Or have they already spent it elsewhere?
Step 3—The transaction is collected into a block.
Once your transaction has been verified, it is combined with other verified transactions and collected into a new block. A block represents a collection of all those transactions, bundled into one entity. Each transaction is hashed, those hashes are combined in pairs and hashed again, and the process repeats until there is only one hash remaining. Think of it as a family tree: you start at the bottom, combining transactions into "parents" and those parents into "grandparents" until you're left with a single "ancestor" at the top.
The hash at the top of the tree is called the Merkle root.
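The pairing process can be sketched as follows. This is a simplified illustration: the transactions are made-up byte strings, and the function names are our own. It does follow two conventions Bitcoin uses, namely applying SHA-256 twice at each step and duplicating the last hash when a level has an odd number of entries, but it ignores the serialization and byte-order details of real transactions.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(transactions: list[bytes]) -> bytes:
    """Hash transactions pairwise, level by level, until one hash remains."""
    if not transactions:
        raise ValueError("a block needs at least one transaction")
    level = [double_sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]

# Hypothetical transactions, for illustration only.
transactions = [b"alice pays bob 1 BTC", b"bob pays carol 0.5 BTC", b"carol pays dan 0.2 BTC"]
root = merkle_root(transactions)
print(root.hex())

# Changing any single transaction changes the root.
tampered = [b"alice pays bob 9 BTC"] + transactions[1:]
print(merkle_root(tampered).hex())
```

Because every level depends on the one beneath it, the single root hash acts as a fingerprint of every transaction in the block.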
Step 4—The creation of the block header.
The Merkle root is combined with all other necessary data (the hash of the previous block, a timestamp, and more) to create the block header. The block header is important because it's what will be hashed in the next step.
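As a rough sketch of what "combining" means here, a Bitcoin block header is an 80-byte structure made of six fields: a version number, the previous block's hash, the Merkle root, a timestamp, a compact encoding of the target (often called "bits"), and a nonce. The code below packs placeholder values into that layout. The function names and all field values are assumptions chosen for illustration, and the byte-order conventions for the two hashes are glossed over.

```python
import hashlib
import struct

def build_header(version: int, prev_block_hash: bytes, merkle_root: bytes,
                 timestamp: int, bits: int, nonce: int) -> bytes:
    """Pack the six header fields into an 80-byte block header."""
    if len(prev_block_hash) != 32 or len(merkle_root) != 32:
        raise ValueError("hashes must be 32 bytes each")
    return (
        struct.pack("<I", version)                       # 4 bytes, little-endian
        + prev_block_hash                                # 32 bytes
        + merkle_root                                    # 32 bytes
        + struct.pack("<III", timestamp, bits, nonce)    # 3 x 4 bytes
    )

def header_hash(header: bytes) -> bytes:
    """Hash the header with SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()

# Placeholder values, not taken from any real block.
header = build_header(
    version=1,
    prev_block_hash=bytes(32),
    merkle_root=hashlib.sha256(b"example merkle root").digest(),
    timestamp=1_700_000_000,
    bits=0x1D00FFFF,
    nonce=0,
)
print(len(header))
print(header_hash(header).hex())
```

Since the previous block's hash is part of the header, each block's hash depends on the block before it, which is what links the blocks into a chain.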
Step 5—The block header is hashed.
This is where things get interesting. The Bitcoin network presents miners with a "target" number. Miners race to put the block header's value through the hashing algorithm, and the hash that results is compared to the target number.
If the hash of the block header is lower than the target, then the block is "accepted" by the network. The miner who finds such a hash first gets the block reward.
But if miners are putting the same information through the same hashing algorithm, won't they always find the same hash?
That's where the "nonce" comes in. A nonce (short for "number used once") is an arbitrary number that miners include at the end of their block header before hashing it. Changing the nonce changes the input, so the hash that is produced will be completely different each time.
So miners take the block header, add a nonce, hash it, and if the number is less than the target, they've solved the block. If not, they try again with a different nonce until they find the hash that works. This happens billions, sometimes trillions of times before this cryptographic puzzle is solved.
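The try-again loop can be sketched like this. It is a toy version under stated assumptions: the header is the short placeholder string used in this article rather than a real 80-byte header, the target is far easier than anything on the real network, and the hash is read as a big-endian number for readability (Bitcoin's own byte-order conventions differ). The function name mine and its parameters are our own.

```python
import hashlib
from typing import Optional

def mine(header_without_nonce: bytes, target: int,
         max_nonce: int = 2**32) -> Optional[tuple[int, str]]:
    """Try nonces in order until the double SHA-256 hash falls below the target."""
    for nonce in range(max_nonce):
        candidate = header_without_nonce + nonce.to_bytes(4, "little")
        digest = hashlib.sha256(hashlib.sha256(candidate).digest()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None  # no nonce in the range produced a low enough hash

toy_header = b"0x1y2z3"
toy_target = 1 << 244  # hash must start with three hex zeros: about 4,096 tries on average

result = mine(toy_header, toy_target)
if result is not None:
    nonce, digest_hex = result
    print(f"nonce {nonce} gives hash {digest_hex}")
```

Lowering the target makes a qualifying hash rarer, so more attempts are needed on average. That is the lever the network uses to control how hard the puzzle is.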
Confused? Let's Go Through a Simplified Example
Let's say there's a target hash of "0000000000000000000011111111111111111111."
And for simplicity's sake, let's say the block header is "0x1y2z3."
First, a miner will take the block header and add a nonce. They might start by adding a zero to the block header, giving us: "00x1y2z3."
They then put this through the hashing algorithm and compare the resulting hash to the target.
- This hash is not lower than the target hash, so it is not accepted by the network.
The miner then tries again with a different nonce. This time they add another zero, giving "000x1y2z3." They hash it again and compare the result to the target.
- This hash is lower than the target hash, so it is accepted by the network.
As previously mentioned, a miner might do this trillions of times before they find the hash that works.
Hashing is essential to the Bitcoin protocol because it secures the blockchain and helps prevent fraud, mistaken data entry, and other malicious activity.
Miners take the block header, which includes a hash that represents all transactions in a given block (the Merkle root), add a number (the nonce), hash the whole lot, and if the hash is lower than the target, they've solved the block.
Whoever is first to find an accepted hash gets a reward in bitcoin. And that's how new bitcoin are created and transactions are confirmed.
Hashing creates strong security because if any piece of data is changed at any point throughout the process, the hash changes and no longer validates. Since every node checks the same data, any block whose hash isn't lower than the target is immediately rejected by the network. There's an incentive to honestly hash the correct data because if a miner's block is accepted by the network, they are rewarded with bitcoin. Trying to falsify the data isn't worth it because everyone else will be playing by the rules and hashing the valid data.
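Two points from this summary can be illustrated with a short sketch: checking a miner's claimed solution takes a single hash, and changing the data after the fact produces a different hash, so the old solution almost certainly no longer meets the target. The data, target, and function names below are hypothetical, and the same simplifications apply as elsewhere in this article (toy data instead of a real header, hash read as a big-endian number).

```python
import hashlib

def block_hash(data: bytes, nonce: int) -> int:
    """Double SHA-256 of data plus nonce, read as a big-endian integer."""
    payload = data + nonce.to_bytes(4, "little")
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return int.from_bytes(digest, "big")

def is_valid(data: bytes, nonce: int, target: int) -> bool:
    """Verification costs one hash, no matter how long the search took."""
    return block_hash(data, nonce) < target

target = 1 << 244  # toy target: roughly 1 in 4,096 hashes qualifies
data = b"alice pays bob 1 BTC"

# The miner searches for a nonce that works for the honest data.
nonce = next(n for n in range(2**32) if is_valid(data, n, target))
print(is_valid(data, nonce, target))  # True

# Someone alters the data but reuses the old nonce.
tampered = b"alice pays bob 9 BTC"
print(block_hash(tampered, nonce) != block_hash(data, nonce))  # True: the hash changed
print(is_valid(tampered, nonce, target))  # almost certainly False with this toy target
```

With a real network target the chance of an altered block still qualifying is vanishingly small, so a cheater would have to redo the entire search.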