We've been seeing issues of ledger corruption when nodes are syncing the chain with increased frequency and have finally identified the fix. In the way we use rocksdb we split data in the ledger and the blockchain across column families (eg token balances are distinct from hotspot information). The ledger corrruption issues were when part of a ledger update would commit but the other parts would not. The fix was to enable the atomic flush option in the rocksdb database options which ensures that all updates in a batch update are committed atomically, even across column families.
Additionally we added a small optimization to block validation for bypassing validation for a block the node is a signatory of. This helps the consensus group move to a new round faster, especially when the block is large or slow to validate. Since the signer has already validated the block before it signed it, validating it again is redundant.
- Add atomic flush option PRs: relcast#38, libp2p#241, blockchain-core#336
- Bypass double validation of self signed blocks PR: blockchain-core#337
We plan to beta
2020.01.08.0 for a couple of hours starting around 7:00 PM PDT and confirm that
the GA release OTA downgrades successfully before making it available to all hotspots.