On Monday 2022-08-08 at approximately 19:34 PM UTC Validators stopped producing blocks at block
1,469,951
and a chain halt occurred. Normal chain operations resumed at approximately 08:00 AM UTC
just over 7 hours later.
Throughout the disruption, devices continued to transfer data and operate normally. There were no risks to Hotspot owners or HNT holders, but they did experience a halt in transaction processing, for example, adding and transferring Hotspots, asserting location, and completing payment and burn transactions.
Cause
The latest chain variable activation revealed a previously unknown chain variable update hook bug which caused the chain to halt. This chain variable was intended to disable legacy Hotspot indexing operation in accordance with HIP 54, h3dex poc targeting.
Recovery
The team identified the issue and requested operators of Validators in the Consensus Group to restart.
However, although a restart resolved the issue during testing, on mainnet a restart did not resolve
the issue and the core team issued an emergency Validator release, v1.13.5
. This release disabled
the variable update hook and included a new blessed snapshot, generated on a chain follower with the
hook disabled.
Validators were requested to upgrade to the emergency release, and those in the Consensus Group were instructed to manually load the new snapshot.
Once the Consensus Group’s ledgers were in agreement, the block skip process allowed the chain to begin moving again, but with a lower number of transactions compared to normal operations
Resolution
Following Validator release v1.13.5
, which addressed the variable update hook issue, release
v1.1.158
for ETL maintainers and v1.1.70
for Blockchain Node operators were issued on 2022-08-09
at 5:38 AM and 5:49 AM UTC, respectively, along with a snapshot to synchronize users past the halt.
Certain Validators who updated after the chain resumed experienced ledger drift, which caused
Validators to go offline and remain halted. The team decided to issue release v1.13.6
as a
non-mandatory release for halted Validators. This enabled the chain to continue to progress more
normally as Validators updated, aligned with the latest blocks based on the latest snapshot.
The average PoC rate before the halt was ~270 receipts per block. The post-halt average initially moved up to 210 receipts per block because of grpc blocking by certain Validators. Validators have unblocked grpc connections and PoC receipts have returned to normal operations.
The team provided an updated Validator release v1.13.7
which included a number of performance
updates, but also included a fix to prevent the variable update hook issue from recurring.
Power of Community
We want to extend our deep appreciation to our community members that host Validators. Thank you for your attentiveness and willingness to troubleshoot issues that affect the entire network. Your efforts and support move the network forward.
Timeline of Key Announcements
Throughout the disruption, the team worked to keep users up to date on the status and required actions.
2022-08-08 19:34 PM UTC Discovery and investigation of chain halt that occurred at block
height 1,469,951
as a result of activating the latest chain variable.
20:26 PM UTC Core team requests Validators in the Consensus Group to restart their Validators in order to resume block production.
22:45 PM UTC Chain halt again at block 1,469,988
, continued exploration by the core team into
causes and solutions.
2022-08-09 03:04 AM UTC Emergency release v1.13.5
sent to Validators which included a new
snapshot that all Validators were required to sync to in order to realign to block 1,469,990
.
05:09 AM UTC Block moving at slow rates (approx 10% of normal transaction rate). No reward block yet, team monitoring. Testing for ETL and Node releases.
05:49 AM UTC Release v1.1.158
for ETL maintainers and v1.1.70
for Blockchain Node operators
issued along with corresponding snapshot to synchronize users past the chain halt.
08:02 AM UTC Chain progressing as expected, PoC rate to continue to drift back to normal rates (approx 270 receipts per block) over time.
18:31 PM UTC Ledger drift identified among Validators that updated after chain resumed. Core
team issues release v1.13.6
for general availability containing a newer blessed snapshot for any
Validators that are halted because of ledger drift.
2022-08-10 22:34 PM UTC Core team delivers beta release v1.13.7
for Validators. This
release included a number of performance updates, but also included a fix to prevent the varhook
issue from recurring.