A Merkle Tree is a binary tree data structure in which every leaf is a cryptographic hash of a data block and every branch is a hash of its children.
Two Merkle trees can be compared by only comparing the root nodes.
1. Data integrity: Merkle trees help nodes/validators ensure that data blocks in P2P environments are complete and not tampered with.
2. Faster verification: They scale the consensus process by rolling up many transactions into a single hash to be compared; that's much more efficient to verify.
3. Smaller proof sizes: Rolling up transactions into a single hash also greatly reduces the size of the proof. Using less data each time adds up to big savings.
4. Time complexity: When searching for a transaction in a Merkle tree compared to searching in a different data structure such as a linked list, your time complexity is reduced from O(N) to O(logN)!
Let's say I want to verify eight data points with another node, Alice. There are a few ways to do this.
Alice and I could exchange all eight data points. This would get the job done, but I certainly can't send sensitive data in plaintext. Also, what if someone intercepts the message and changes something? This won't suffice in a trustless P2P environment.
We could hash the data points first and then exchange them. Then we only have to compare the hashes and anyone listening in will learn nothing of the content. This is an improvement! While the integrity of the data is protected, this still won't scale very well.
More hashing. We can hash our sets of hashes until we only have one value. Then, we can exchange that singular value. We've just drastically reduced the size of our message and the time to verify! If any one value is changed, there is a butterfly effect that changes the hash of that initial value and thereby changes the hash of the hashed value, and so on up the chain until ultimately the root hash is also changed.
Here's what a Merkle tree might look like:
Merkle trees are used by blockchains for scalability (and sometimes privacy).
Bitcoin uses Merkle trees to reduce an entire block's worth of transactions into a single hash, reducing the complexity of validating that block; only the root hash needs to be checked to confirm that every single transaction happened in a specific order in that block. A bit more work upfront reduces the complexity for all future interactions with that block.