On September 2, 2025, shortly after upgrading to version 0.14.0 (Grinta), Starknet experienced an outage during which block production was halted and two chain reorganizations (reorgs) were required to restore normal operation.

We recognize the impact this had on our users and partners, and we are committed to providing full transparency into what happened, how it was resolved, and what steps we are taking to prevent recurrence.

This incident occurred in the immediate aftermath of a historic milestone: Starknet became the first Validity (ZK) rollup to decentralize its sequencer architecture, moving from one to three sequencers operating the network. That shift was the core change that exposed this new challenge – one that is inherent in pioneering the path toward decentralization, a path that only Starknet has taken so far.

Since the incident ended on September 2 at 13:41 UTC, Starknet has been fully operational. We are continuing to push the boundaries of proving and advancing toward becoming the first decentralized Validity (ZK) Rollup, while also applying the key insights from this incident to further strengthen Starknet’s resilience going forward.

Summary

Between 02:20 and 13:40 UTC, Starknet was intermittently unavailable. During that time, two reorgs were required:

  • The first reverted roughly one hour of transactions.
  • The second reverted roughly 20 minutes of transactions.

The downtime was caused by a sequence of three interconnected incidents, beginning with failures in the Ethereum RPC providers that the sequencers' node logic relies on.

A deeper investigation identified the following contributing factors:

1. Ethereum node failures – The three Starknet sequencers, which operate as part of the network's decentralized architecture, observed different states of Ethereum. This divergence disrupted the block-proposing process: one sequencer proposed transactions triggered by Ethereum messages that the others could not validate. As a result, network progression slowed significantly.
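To illustrate why divergent L1 views stall block production, here is a minimal sketch. All names and logic are invented for illustration and are not Starknet's actual validation code; the assumption is simply that a validator rejects any proposed block referencing an L1→L2 message it has not itself observed on Ethereum.

```python
def validate_proposal(proposed_l1_messages: set[str], locally_seen: set[str]) -> bool:
    """Accept a proposed block only if every L1 message it references
    is already known to this sequencer's own view of Ethereum."""
    return proposed_l1_messages <= locally_seen

# The proposer's Ethereum node is ahead of (or diverged from) the others:
proposer_view = {"msg_a", "msg_b", "msg_c"}
validator_view = {"msg_a", "msg_b"}  # lagging L1 view

# The proposer considers its own block valid...
assert validate_proposal(proposer_view, proposer_view)
# ...but a validator with a different L1 view rejects it,
# so consensus cannot approve the block and progress stalls.
assert not validate_proposal(proposer_view, validator_view)
```

Under this model, no sequencer is "wrong" in isolation; the stall comes purely from their L1 views disagreeing, which is why hardening against flaky external Ethereum nodes is one of the prevention measures below.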

2. Manual intervention gap – Addressing the sequencer failures discussed above required a manual intervention. During this manual process, certain validations that are normally performed automatically were skipped, which resulted in different Starknet sequencers creating two conflicting L2 blocks. To restore correctness, a reorg was performed.

3. Blockifier bug in v0.14.0 – After the first reorg, transactions were discarded but L1→L2 messages were reprocessed. Some of these messages assumed that earlier Starknet transactions had already been executed, and when that assumption failed, they reverted. This triggered a bug in the blockifier – the component responsible for updating Starknet’s state based on transaction execution. The bug affected how reverted transactions initiated by L1→L2 messages were handled. This compounded the earlier disruptions and required both a hotfix and a second reorg to fully resolve.
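The failure class described above can be sketched as follows. This is an illustrative toy only: the function names, state representation, and the specific buggy behavior are invented to show the general shape of the problem (reverted L1→L2-initiated transactions producing a state update that diverges from the fixed behavior), not Starknet's actual blockifier code.

```python
def apply_tx_buggy(state: dict, tx: dict) -> dict:
    """Hypothetical buggy handling: a reverted L1-handler transaction
    is skipped entirely, so its L1->L2 message is never marked consumed."""
    state = dict(state)
    if tx["reverted"] and tx["source"] == "l1_handler":
        return state  # bug: no record of the consumed message
    state[f"consumed:{tx['msg_id']}"] = True
    return state

def apply_tx_fixed(state: dict, tx: dict) -> dict:
    """Fixed handling: message consumption is recorded whether or not
    the transaction it triggered reverted."""
    state = dict(state)
    state[f"consumed:{tx['msg_id']}"] = True
    return state

reverted_msg_tx = {"source": "l1_handler", "msg_id": "m1", "reverted": True}
assert "consumed:m1" not in apply_tx_buggy({}, reverted_msg_tx)
assert "consumed:m1" in apply_tx_fixed({}, reverted_msg_tx)
```

The point of the sketch is that any divergence in how reverted message-initiated transactions affect state yields blocks the proving layer cannot prove, which is exactly what surfaced after the first reorg replayed those messages.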

When analyzing the failures, we identified issues in the consensus mechanism and blockifier logic, which will be detailed further below.

It is important to note that throughout the incident, Starknet's proving layer acted as the safeguard that prevented these inconsistencies from compromising the integrity of the blockchain. This design reflects a core principle of Starknet: to preserve correctness and security regardless of any undesired sequencer behavior, whether caused by software bugs or by malicious activity.

Impact

  • Downtime: Approximately 9 hours of degraded or halted service.
  • Reorgs: Transactions in the reverted blocks, reflecting roughly 1.5 hours of activity in total, were not processed and had to be resubmitted.

Timeline

At 02:20 UTC, initial issues were detected when sequencers failed to sync L1 events, preventing new blocks from being approved. An investigation was initiated immediately. By 03:57, after several manual attempts to reset individual sequencers, L1 synchronization succeeded and Starknet resumed.

At 04:10 UTC, an alert indicated that the proving layer had detected an inconsistency, marking the beginning of the second incident. Investigation showed that, as a result of the earlier manual intervention, different sequencers were building blocks on top of conflicting versions of the same block. Starknet, as a validity rollup, first sequences blocks and later proves them; when the proving layer detects an inconsistency (or any other invalid logic), it cannot generate a valid proof. By 04:37 UTC, Starknet was manually halted, since it had become clear that a reorg might be required to fix the issue. This approach follows the principle that, when a reorg is anticipated, halting the system minimizes potential transaction loss.
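The halt-on-anticipated-reorg principle can be made concrete with a toy model (the numbers and function are illustrative, not measurements from the incident): every block produced after the inconsistent block will be reverted by the eventual reorg, so halting earlier strictly reduces the number of transactions that must be resubmitted.

```python
def txs_lost(bad_block: int, halt_block: int, txs_per_block: int) -> int:
    """Transactions reverted if block production halts at halt_block
    and the chain later reorgs back to just before bad_block.
    Assumes (for illustration) a constant number of txs per block."""
    return (halt_block - bad_block + 1) * txs_per_block

# Halting 30 blocks after the bad block vs. 5 blocks after it:
assert txs_lost(100, 130, 50) == 1550
assert txs_lost(100, 105, 50) == 300
# Halting sooner always loses fewer transactions.
assert txs_lost(100, 105, 50) < txs_lost(100, 130, 50)
```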

At 06:05, we identified the block where the inconsistency occurred, and decided to reorg back to the preceding block (1,960,612), reverting approximately one hour of activity. Block production resumed at 07:42, and by 09:08 all full node clients were synced and Starknet operations were fully restored.

The third incident began at 10:28, when the proving layer identified another inconsistent batch. At 10:43, Starknet was halted again as the probability of a second reorg became high. By 12:05, the root cause was identified, and at 12:37 a bug fix was implemented, tested, and deployed. At 13:29, a second reorg was executed from block 1,962,681, reverting roughly 20 minutes of activity. Finally, at 13:41, block production and full network activity were restored.

Key Learnings and Prevention Measures

In addition to detecting, analyzing, and rapidly fixing the bugs that caused this failure, we have added safeguards that will automatically prevent such failures from recurring. We are also implementing a set of measures designed to further reduce the likelihood of similar issues in the future:

  • Increase the number of nodes participating in the internal consensus protocol to improve fault tolerance and resilience.
  • Introduce additional safety mechanisms to protect against disruptions when external dependencies, such as Ethereum nodes, experience issues.
  • Minimize the need for manual interventions and ensure that in the rare cases they are required, the process is reliable, well-documented, and resistant to human error.

Closing

Starknet is back online and fully operational. While the incident was serious, the fixes applied have already increased network resilience, and additional long-term improvements and safeguards are being implemented.

By identifying and addressing these bugs immediately upon detection, and with the proving layer serving as a backstop, Starknet emerges more robust. The fixes applied in response, together with additional safety precautions now being implemented, directly strengthen its reliability going forward. It is also worth noting that the Starknet dapp teams were well-prepared to handle reorgs, which ensured users were minimally affected at the application level throughout the incident.

We are committed to full transparency with our users and partners, and will continue to share follow-up updates as improvements are rolled out.

Thank you, Starknet community, for your understanding and support as we worked through this issue.