1. Incident Summary
Findora Mainnet stopped producing blocks on 02/08/2023 at 11:30 AM PST. A specific type of transaction generated by Gate.io triggered a bug in the UTXO code. The team at Discreet Labs coordinated a rollback with validators and brought Mainnet back online.
2. Incident Impact
Mainnet consensus on Findora paused for 2 hours and 5 minutes. No transactions could be processed during this time.
3. Incident Detection
The chain stopped at 11:30 AM PST, and the DevOps team received an alarm at 11:40 AM PST. The proactive monitoring solution measures the block interval over a 10-minute window, which accounts for the delay. Shorter windows can produce false alarms.
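The detection rule described above can be sketched as a simple stall detector. This is an illustrative sketch only, not the monitoring code the team runs: the `StallDetector` class, its field names, and the idea of polling a block height are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class StallDetector:
    """Hypothetical stall detector: alarm when no new block is seen for a full window."""

    window_seconds: float = 600.0   # 10-minute window, matching the report
    last_height: int = -1           # highest block height observed so far
    last_progress_at: float = 0.0   # timestamp of the last height increase

    def observe(self, height: int, now: float) -> bool:
        """Record a polled block height; return True if an alarm should fire."""
        if height > self.last_height:
            # The chain advanced, so reset the stall timer.
            self.last_height = height
            self.last_progress_at = now
            return False
        # No progress: alarm only once the whole window has elapsed.
        return now - self.last_progress_at >= self.window_seconds


detector = StallDetector()
detector.observe(100, now=0.0)               # chain is advancing
stalled_early = detector.observe(100, now=300.0)   # 5 minutes without a block
stalled_late = detector.observe(100, now=600.0)    # 10 minutes without a block
```

With a 10-minute window, a halt at 11:30 AM raises the alarm at about 11:40 AM. A shorter window would alert sooner, at the cost of firing on ordinary slow blocks.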
4. Response Time & Recovery
The team responded to the incident at 11:40 AM PST and immediately began contacting community leaders and validators to coordinate a workaround. Mainnet was rolled back three blocks to remove the Gate.io transaction. No transactions other than those from Gate.io were affected by the rollback.
5. Timeline of Events
- 11:30 AM PST
Mainnet consensus halted.
- 11:40 AM PST
Proactive alarms were received. The recovery process began.
- 12:45 PM PST
The network was staged for rollback. Validators were contacted to perform the rollback.
- 1:35 PM PST
Mainnet resumed producing blocks.
6. Root Cause
This incident was caused by two factors:
1. Findora's original design uses JSON as the transaction data format. JSON does not guarantee that the order of fields stays the same across serialization and deserialization, which makes it a poor choice for a transaction data format. When the data is very long, or when some fields are very long (for example, more than one signature; other cases may exist), the order of the fields becomes unstable.
2. The consensus apphash calculation process is flawed. The hash of the transaction Merkle tree should be calculated before deserialization, or this hash calculation should be skipped.
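The interaction of these two factors can be illustrated with a small, self-contained sketch. It is not Findora's code: the transaction layout, the field names, and the helper functions below are hypothetical, and SHA-256 over a single transaction stands in for the Merkle tree calculation. The point is only that hashing re-serialized JSON depends on field order, while hashing the received bytes (or a canonical encoding) does not.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Two encodings of the same logical transaction; only the field order differs.
# (Hypothetical layout, chosen for the example.)
raw_a = b'{"body":{"amount":10},"signatures":["s1","s2"]}'
raw_b = b'{"signatures":["s1","s2"],"body":{"amount":10}}'


def hash_raw(raw: bytes) -> str:
    """Hash the bytes exactly as received, before any deserialization."""
    return sha256_hex(raw)


def hash_reserialized(raw: bytes) -> str:
    """Deserialize, then serialize again without fixing the field order."""
    return sha256_hex(json.dumps(json.loads(raw)).encode())


def hash_canonical(raw: bytes) -> str:
    """Deserialize, then serialize with sorted keys and fixed separators."""
    obj = json.loads(raw)
    return sha256_hex(json.dumps(obj, sort_keys=True, separators=(",", ":")).encode())


same_content = json.loads(raw_a) == json.loads(raw_b)
reserialized_diverges = hash_reserialized(raw_a) != hash_reserialized(raw_b)
canonical_agrees = hash_canonical(raw_a) == hash_canonical(raw_b)
```

Here `same_content`, `reserialized_diverges`, and `canonical_agrees` are all true: the two encodings carry identical data, yet a hash taken after an order-sensitive round trip differs between them. If nodes disagree on field order after deserialization, they compute different hashes and consensus cannot proceed. Hashing the raw bytes every node received, or a canonical encoding, removes that dependency.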
7. Lessons Learned
Improvements to the UTXO-side code can prevent this problem in the future. Multiple options are available, including a Findora Improvement Proposal (FIP) to constrain the Merkle tree calculation rule.
8. Corrective Action
Create a filter in the RPC service (endpoint) to block transactions that cause crashes.
A code refactor for UTXO-related logic is being composed, reviewed, and proposed via a Findora Improvement Proposal (FIP).