Accounts, Sealevel and the SPL

So far, we've written about Proof of History (the unbreakable chain of tick hashes that serves as Solana's on-chain clock), Tower BFT (Solana's PoS consensus mechanism) and the full lifecycle of a transaction, from the moment a user submits it to the finish. Now that we understand how blocks are assembled, we just need the final puzzle pieces to understand the core concepts behind Solana's architecture.

In this article, we will build on everything we've learned so far by explaining how accounts work on Solana (yes, Solana uses the account model, not UTXOs) and how the Sealevel engine executes smart contracts in parallel. And we cannot mention smart contracts - they are called programs on Solana - without mentioning the Solana Program Library, or SPL.

Let's get started!

Blockstore, AccountsDB and Cloudbreak

All consensus-critical data on the Solana blockchain lives in only two places on every validator and full RPC node: the raw ledger history is stored in what Solana calls the blockstore, and the current state of the blockchain is recorded in AccountsDB, which, as the name suggests, is a database of all accounts on the Solana network - you could say a "live snapshot". Cloudbreak is a custom database solution, written specifically for the Solana blockchain, that handles all of AccountsDB's under-the-hood operations.

Blockstore

Blockstore is a permanent database of received shreds (both data shreds and coding shreds - so all executed transactions and PoH tick hashes) and records of validator votes for each slot, with some slot metadata sprinkled in.

Once a slot is finalized, most validator nodes delete all non-root forks (a process called pruning) to minimize the size of their blockstore, but specialized nodes (such as infra nodes run by block explorer services or dedicated Solana archival nodes) keep the complete ledger history, including historical forks that weren't added to the blockchain in the end. Apart from serving as the actual Solana ledger, blockstore is primarily used by the TVU to replay incoming shreds.

AccountsDB

AccountsDB is a database of all accounts on Solana. But in our case, "accounts" doesn't mean just user accounts and their balances, because pretty much everything is an account on Solana: actual accounts and balances, the state of programs, validator stakes, vote records, system configuration, etc. all use Solana accounts.

Each account on Solana has four key fields - pubkey, lamports, owner and data.

  • Pubkey is the account address, a unique 32-byte identifier

  • Lamports is the balance of the account, denominated in lamports, the smallest unit of SOL (1 SOL = 1,000,000,000 lamports)

  • Owner is the ID of the program that is allowed to change the account data and deduct lamports (any program can read the account data and credit lamports, but only the owner can write data and deduct lamports)

  • Data is a field whose purpose is determined by a boolean executable flag - if executable=false, the account data is state, if executable=true, the account is a program (a smart contract) and the data field contains the executable code

A practical example is a stake account - its executable value is false, because a stake account isn't a program. In this case, the data field holds information relevant to staking: who the staker and the withdrawer are, the pubkey of the associated vote account, the epoch the stake was activated in, etc.
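To make the field list above more concrete, here is a minimal Python sketch of an account and of the owner rule. It is an illustration only, not Solana's actual implementation: the class, the function names and the short string "addresses" are our own assumptions.

```python
from dataclasses import dataclass

LAMPORTS_PER_SOL = 1_000_000_000  # 1 SOL = 10^9 lamports


@dataclass
class Account:
    pubkey: str               # a 32-byte address on Solana; a short string here
    lamports: int             # balance in lamports
    owner: str                # ID of the program allowed to modify this account
    data: bytes = b""         # state, or program code if `executable` is True
    executable: bool = False  # True only for program accounts


def credit(account: Account, amount: int) -> None:
    """Any program may add lamports to an account."""
    account.lamports += amount


def debit(account: Account, program_id: str, amount: int) -> None:
    """Only the owner program may deduct lamports."""
    if program_id != account.owner:
        raise PermissionError("only the owner program can deduct lamports")
    if amount > account.lamports:
        raise ValueError("insufficient lamports")
    account.lamports -= amount


def write_data(account: Account, program_id: str, new_data: bytes) -> None:
    """Only the owner program may change the data field."""
    if program_id != account.owner:
        raise PermissionError("only the owner program can write account data")
    account.data = new_data
```

With these hypothetical helpers, a stake account would be created as Account(pubkey="Stake1", lamports=2 * LAMPORTS_PER_SOL, owner="StakeProgram"), and any call to debit or write_data with a different program ID would be rejected.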

Rent

Every account on Solana has to pay rent in SOL in order to stay on the network - this discourages users from creating lots of accounts that would then go unused and helps prevent spam from bloating the blockchain state. Rent is a small number of lamports periodically deducted from an account, but an account that holds at least 2 years' worth of rent is exempt from paying it and can stay on the network without being purged. In practice, accounts are now expected to be funded to this rent-exempt minimum when they are created. Rent costs 3,480 lamports (0.00000348 SOL) per byte-year - take the size of the account in bytes (the length of its data field plus 128 bytes of fixed per-account overhead), multiply it by 3,480, multiply the result by 2 (2 years) and you will get the minimum number of lamports the account must hold to be rent-exempt.
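The rent-exempt minimum is simple arithmetic, so it can be sketched in a few lines of Python. The constants below are the default rent parameters as we understand them (3,480 lamports, i.e. 0.00000348 SOL, per byte-year, a 2-year exemption threshold and 128 bytes of per-account overhead); treat them as assumptions and ask a node (for example via the getMinimumBalanceForRentExemption RPC method) for the live value.

```python
LAMPORTS_PER_SOL = 1_000_000_000
LAMPORTS_PER_BYTE_YEAR = 3_480    # assumed default rent rate
EXEMPTION_THRESHOLD_YEARS = 2     # assumed default exemption threshold
ACCOUNT_STORAGE_OVERHEAD = 128    # assumed bytes of metadata charged per account


def rent_exempt_minimum(data_len: int) -> int:
    """Minimum lamports an account with `data_len` bytes of data must hold."""
    billable_bytes = ACCOUNT_STORAGE_OVERHEAD + data_len
    return billable_bytes * LAMPORTS_PER_BYTE_YEAR * EXEMPTION_THRESHOLD_YEARS


def lamports_to_sol(lamports: int) -> float:
    """Convert lamports to SOL for display purposes."""
    return lamports / LAMPORTS_PER_SOL
```

Under these assumptions, an account with an empty data field needs 890,880 lamports (about 0.00089 SOL), and a 165-byte account needs 2,039,280 lamports (about 0.00204 SOL).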

Snapshots

The AccountsDB database is periodically snapshotted (every 512 slots by default) to allow new validators to quickly sync with the network. A snapshot file represents the state of the blockchain at a certain slot: a new validator loads a snapshot received from another full node into its database and replays only the slots that come after it, instead of replaying the whole history from scratch, which would take days.
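The idea of "load a snapshot, then replay only what came after it" can be sketched as follows. This is a toy model under our own assumptions: the state is a plain dictionary of balances, the ledger is a dictionary of per-slot balance updates, and the function names are hypothetical.

```python
def take_snapshot(state: dict, slot: int) -> dict:
    """Freeze a copy of the account state as of `slot`."""
    return {"slot": slot, "accounts": dict(state)}


def restore_and_replay(snapshot: dict, ledger: dict) -> dict:
    """Start from a snapshot and apply only the slots that came after it.

    `ledger` maps slot number -> list of (pubkey, new_lamports) updates.
    """
    state = dict(snapshot["accounts"])
    for slot in sorted(ledger):
        if slot <= snapshot["slot"]:
            continue  # already reflected in the snapshot
        for pubkey, lamports in ledger[slot]:
            state[pubkey] = lamports
    return state
```

Replaying from an empty snapshot at slot 0 and replaying from a snapshot taken at a later slot end in the same state; the second path simply skips most of the work.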

AppendVecs

Solana stores all account information in a set of memory-mapped files called AppendVecs. Every time an account's state changes, the new state is appended to the end of the file without changing the previous records. An in-memory index that lives in RAM on each full node maps every account's pubkey (address) to the bytes that represent its latest state in the AppendVec file, which allows Solana to answer state queries and execute transactions without having to scan the AppendVec file itself, contributing to speed. Because the latest state of the blockchain can still fork, the index keeps all versions until a slot reaches finality and then deletes all unnecessary records. AppendVecs are handled by Cloudbreak, Solana's sharded storage engine.
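The append-only file plus in-memory index can be sketched like this. It is a deliberately simplified model: a bytearray stands in for the memory-mapped file, the class and method names are our own, and fork tracking and cleanup of old records are left out.

```python
class AppendStore:
    """Toy append-only store with an in-memory index of latest records."""

    def __init__(self) -> None:
        self.buf = bytearray()  # stand-in for a memory-mapped AppendVec file
        self.index = {}         # pubkey -> (offset, length) of the latest state

    def store(self, pubkey: str, state: bytes) -> int:
        """Append a new state record; earlier records are never modified."""
        offset = len(self.buf)
        self.buf += state
        self.index[pubkey] = (offset, len(state))
        return offset

    def load(self, pubkey: str) -> bytes:
        """Read the latest state via the index, without scanning the file."""
        offset, length = self.index[pubkey]
        return bytes(self.buf[offset:offset + length])
```

Storing the same pubkey twice leaves both records in the buffer, but load only ever returns the one the index points to.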

Cloudbreak

AccountsDB is the public API layer the Solana runtime talks to, but most of the logic in the database is actually implemented by Cloudbreak under the hood. Cloudbreak handles both the AppendVecs and the in-memory index described above (among other things).

Cloudbreak breaks the full in-memory index down into 32-64 shards called buckets. Each account (identified by its unique pubkey) on Solana is assigned to one bucket, has its own read/write lock and can be processed by a standalone CPU/GPU thread* - this implementation reduces contention (when two or more threads try to use the same resource at the same time and only one can "win") and makes Solana the first production blockchain that has successfully implemented parallel execution on a single layer.

*Imagine books in a library - each individual book has its own dedicated spot in a bookcase (this is the pubkey lock), but there are different sections with different bookcases (these are the buckets). This helps with efficient resource usage and processing - individual pubkey locks prevent multiple threads from modifying an account at the same time (which could otherwise be used to double-spend), while buckets help with managing RAM usage, cache bloat and other things.
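Here is a rough Python sketch of the two levels described above: buckets that shard the index, and one lock per account. How Cloudbreak actually assigns pubkeys to buckets is not covered in this article, so hashing the pubkey is purely our assumption, as are the class and method names and the bucket count of 64.

```python
import hashlib
import threading

NUM_BUCKETS = 64  # hypothetical value within the 32-64 range


def bucket_for(pubkey: bytes, num_buckets: int = NUM_BUCKETS) -> int:
    """Assign a pubkey to a bucket (hash-based assignment is an assumption)."""
    digest = hashlib.sha256(pubkey).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


class ShardedIndex:
    def __init__(self, num_buckets: int = NUM_BUCKETS) -> None:
        self.num_buckets = num_buckets
        self.buckets = [{} for _ in range(num_buckets)]  # pubkey -> lamports
        self.account_locks = {}                          # pubkey -> Lock
        self._registry_lock = threading.Lock()

    def _lock_for(self, pubkey: bytes) -> threading.Lock:
        # Create the per-account lock on first use.
        with self._registry_lock:
            if pubkey not in self.account_locks:
                self.account_locks[pubkey] = threading.Lock()
            return self.account_locks[pubkey]

    def add_lamports(self, pubkey: bytes, amount: int) -> None:
        bucket = self.buckets[bucket_for(pubkey, self.num_buckets)]
        # Threads touching different accounts do not wait on each other here.
        with self._lock_for(pubkey):
            bucket[pubkey] = bucket.get(pubkey, 0) + amount

    def balance(self, pubkey: bytes) -> int:
        bucket = self.buckets[bucket_for(pubkey, self.num_buckets)]
        return bucket.get(pubkey, 0)
```

Several threads calling add_lamports on the same pubkey are serialized by that account's lock, so no update is lost, while updates to other accounts proceed independently.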

Most blockchains have a single global mutex, which means that transactions can only be executed sequentially, one-by-one, because each execution locks the global state of the blockchain until it's processed. Solana is built around efficiency and parallel execution is one of the key aspects of this, but the whole picture is not complete yet. Now that we know that each individual account has its own lock, let's have a look at how the Sealevel engine actually leverages this in the next section.

Sealevel

Sealevel is Solana's parallel execution engine - this section ties together everything we've written about so far. When it's submitted, every Solana transaction must contain a list of every single account it will interact with (read or write). Declaring all accounts a transaction will touch up front like this allows the Sealevel engine to sort through the submitted transactions and execute non-conflicting ones at the same time (with the help of Cloudbreak) without locks on global state.

Everything a Solana transaction does NOT declare up front is simply inaccessible, which guarantees determinism and prevents malicious attacks. Overlapping transactions simply wait until the previous "batch" of transactions is processed and the state of the blockchain can be altered again. In practice, this happens in milliseconds inside the Sealevel scheduler.
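The batching idea can be illustrated with a small Python sketch. It is not the real Sealevel scheduler, just a greedy model under our own assumptions: two transactions conflict if one writes an account the other reads or writes, and a transaction is placed in the first batch after every earlier transaction it conflicts with.

```python
from dataclasses import dataclass


@dataclass
class Tx:
    name: str
    reads: frozenset = frozenset()   # accounts declared read-only
    writes: frozenset = frozenset()  # accounts declared writable


def conflicts(a: Tx, b: Tx) -> bool:
    """Two transactions conflict if either writes an account the other touches."""
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & a.reads)


def schedule(txs: list) -> list:
    """Group transactions into batches whose members can run in parallel."""
    batches = []
    for tx in txs:
        target = 0
        for i, batch in enumerate(batches):
            if any(conflicts(tx, other) for other in batch):
                target = i + 1  # must run after the conflicting batch
        if target == len(batches):
            batches.append([])
        batches[target].append(tx)
    return batches
```

For example, a transaction writing accounts A and B and another writing C and D land in the same batch, while a third one that writes B has to wait for the next batch. Two transactions that only read the same account never conflict.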

Executing transactions and code on Solana is measured in Compute Units - CUs. Each action performed on the Solana network costs a fixed CU amount, each transaction has a hard CU cap and every slot has a CU ceiling. If you remember stake-weighted QoS, the mechanism that prefers incoming traffic from staked validators over transactions from non-staked nodes and prevents network buffer overload, hard CU caps perform a similar role when it comes to the CPU budget. Measuring execution cost in CUs prevents DoS attacks and also gives developers a predictable performance cost for their code.

If a Solana transaction goes over its CU budget, it is simply aborted and has to be submitted again. It doesn't stay waiting anywhere, as is the case with Bitcoin and Ethereum mainnet mempools, where transactions wait until they are picked up.
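A CU budget behaves like a meter that is drained as the transaction runs. The Python sketch below shows the principle only; the class, the function and the numbers used with it are hypothetical and do not reflect real CU prices.

```python
class ComputeBudgetExceeded(Exception):
    """Raised when a transaction tries to use more CUs than it has left."""


class ComputeMeter:
    def __init__(self, limit: int) -> None:
        self.remaining = limit

    def consume(self, units: int) -> None:
        if units > self.remaining:
            raise ComputeBudgetExceeded("transaction exceeded its CU cap")
        self.remaining -= units


def run_transaction(step_costs: list, cu_limit: int) -> bool:
    """Return True if all steps fit in the budget, False if the tx is aborted."""
    meter = ComputeMeter(cu_limit)
    try:
        for cost in step_costs:
            meter.consume(cost)  # charge each step before "executing" it
    except ComputeBudgetExceeded:
        return False  # aborted; the sender has to submit a new transaction
    return True
```

With a hypothetical cap of 200,000 CUs, steps costing 50,000 and 80,000 CUs succeed, while steps costing 150,000 and 80,000 CUs are aborted at the second step.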

Solana Program Library (SPL)

The Solana Program Library is a collection of audited on-chain programs (Solana's smart contracts) that allow developers to build on Solana without having to reinvent low-level token/metadata logic and serve as the building blocks for most dApps on Solana.

The most commonly used one is the SPL Token program, which provides minting, transferring, freezing, and burning of fungible tokens and NFTs on Solana - the equivalent of Ethereum's ERC-20, ERC-721 and other standards combined into one. The newer Token-2022 program builds on top of the original SPL Token by adding transfer hooks, clawback, confidential balances and other functions.

Holding Solana tokens (other than SOL) works differently than it does on Ethereum - Solana uses ATAs, or Associated Token Accounts (managed by the ATA program). For every token it holds, a user wallet is assigned a separate address that's different from the wallet's pubkey and whose Owner is the SPL Token program (or another managing program), and this address then holds the actual tokens. This is why you see two addresses on Solscan when you look at the holders of a token - one is the account (the user wallet) and the other is the token account (controlled by the Owner program).

Other commonly used programs include SPL Memo, which lets anyone attach a short custom text note to a transaction (used for human-readable notes/logs), and the Address Lookup Tables (ALTs) program, which stores batches of addresses on-chain so that a transaction can reference them by a compact index instead of listing every full 32-byte address. This is used for complex DeFi operations that would otherwise go over the 1232-byte size limit for a single data packet on Solana.
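A back-of-the-envelope calculation shows why lookup tables matter. The Python sketch below compares only the bytes spent on addresses, assuming 32 bytes per inline address versus a 1-byte index per address plus the 32-byte address of each referenced table. Signatures, instruction data and other transaction overhead are ignored, so the figures are illustrative rather than exact.

```python
PACKET_LIMIT = 1232  # bytes available for a single transaction packet
PUBKEY_BYTES = 32    # size of one full account address
INDEX_BYTES = 1      # assumed size of one index into a lookup table


def inline_address_bytes(num_accounts: int) -> int:
    """Bytes needed when every account address is listed in full."""
    return num_accounts * PUBKEY_BYTES


def lookup_address_bytes(num_accounts: int, num_tables: int = 1) -> int:
    """Bytes needed when accounts are referenced through lookup tables."""
    return num_tables * PUBKEY_BYTES + num_accounts * INDEX_BYTES


def fits_in_packet(address_bytes: int) -> bool:
    """Whether the address bytes alone stay within the packet limit."""
    return address_bytes <= PACKET_LIMIT
```

Under these assumptions, 40 inline addresses already take 1,280 bytes, more than the whole packet, while the same 40 accounts referenced through one lookup table take 72 bytes.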
