ACID vs BASE: Why Your Payment System and Your Activity Feed Need Different Databases

When Eric Brewer presented the CAP conjecture in 2000 (Gilbert and Lynch proved it formally in 2002), he wasn't offering advice. He was describing physics. In a distributed system, when network partitions occur (and they will occur), you cannot simultaneously guarantee consistency and availability. You have to choose. This choice ripples through every architectural decision you'll make, determining which database you use, how you model your data, and which 3 AM incidents you'll face.

This article focuses on production systems handling 100K+ users and 10K+ requests/second. Below that threshold, a well-tuned PostgreSQL instance handles nearly everything, and the complexity of distributed systems isn't worth the operational overhead.

The stakes are concrete. Get it wrong with payment processing, and you're explaining to customers why they were charged twice. Get it wrong with an activity feed, and... well, nothing really. Some users see a post fifty milliseconds before others. Nobody notices. Nobody cares. This difference in consequences is why serious production systems don't pick ACID or BASE. They use both.

Figure: timeline of system evolution from a single PostgreSQL instance to a hybrid PostgreSQL + Cassandra architecture as traffic grows from 10K to 10M users.

The Promise ACID Databases Make

PostgreSQL and its relational database relatives make a promise that feels almost archaic in the microservices era: your data will be correct, always, without exception. This promise comes from four properties working in concert.

The ACID Properties:

| Property | What It Guarantees | Real-World Impact |
| --- | --- | --- |
| Atomicity | All operations in a transaction succeed together or fail together | Transfer $100: debit account A AND credit account B both happen, or neither happens |
| Consistency | Database enforces all constraints and rules | Account balance cannot go negative if you defined that constraint |
| Isolation | Concurrent transactions don't see each other's partial state | Two people buying the last concert ticket: one succeeds, one fails. Never both. |
| Durability | Committed data survives crashes and power failures | "Payment confirmed" means the payment is permanently recorded |

Atomicity means transactions are all-or-nothing. When you begin a transaction and issue five SQL statements, either all five succeed and commit together, or any failure rolls back the entire transaction as if nothing happened. There's no middle ground where three statements succeeded and two didn't.

The implementation of this guarantee (and of durability) is surprisingly sophisticated. Before PostgreSQL reports a transaction as committed, it writes the changes to a write-ahead log on disk. Only after that log record is safely on durable storage does the transaction return success. If the server loses power one microsecond after you get a success response, PostgreSQL replays the write-ahead log on restart and recovers every committed transaction, including yours.

Consider a payment flow where you need to check the user's balance, deduct the amount, record the transaction, update inventory, and mark the order as paid. In PostgreSQL, you wrap these operations in a transaction. If the inventory update fails because the product is out of stock, the balance deduction automatically rolls back. You didn't write error handling code for every possible failure point. The database handles it.

src/payments/process-payment.ts
async function processPayment(userId: string, amount: number) {
  await db.transaction(async (trx) => {
    // Check balance
    const user = await trx('users').where({ id: userId }).forUpdate().first();
    if (user.balance < amount) throw new InsufficientFundsError();
    
    // Deduct balance
    await trx('users').where({ id: userId }).decrement('balance', amount);
    
    // Record transaction
    await trx('transactions').insert({
      user_id: userId, amount, type: 'debit', status: 'completed'
    });
    
    // If ANY step fails, ALL steps rollback automatically
  });
}

You cannot implement this payment flow correctly in Cassandra without building your own distributed transaction coordinator. Even then, you're fighting the database instead of using it as designed.

The cost appears when you try to scale. ACID databases typically funnel all writes through a single primary node. Read replicas can serve read traffic, but writes must go to one authoritative server. This creates a ceiling.

PostgreSQL Write Capacity (single node):

  • Standard setup: ~10K writes/second
  • Well-tuned setup: ~50K writes/second
  • Hero-tier setup: ~100K writes/second (rare, expensive)
  • Beyond 100K: You need a different architecture

You can throw money at the problem, buying increasingly powerful hardware, but there's a limit. No single computer exists that can handle millions of writes per second while maintaining ACID guarantees. The mathematics of coordination don't allow it. Every transaction requires PostgreSQL to check that no other transaction is modifying conflicting data, to enforce consistency constraints, and to write changes to disk before reporting success. These operations require synchronization that fundamentally limits parallelism.
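
The standard workaround stretches read capacity rather than write capacity: route every write to the primary and spread reads across replicas. A minimal sketch in the knex style used elsewhere in this article; the connection strings and function names are placeholders:

src/db/read-write-split.ts
import knex from 'knex';

// One connection pool for the primary (writes), one for a read replica (reads).
// Both connection strings are hypothetical placeholders.
const primary = knex({ client: 'pg', connection: 'postgres://primary.internal:5432/app' });
const replica = knex({ client: 'pg', connection: 'postgres://replica.internal:5432/app' });

// Writes must go to the primary; it is the only node that accepts them.
export async function updateEmail(userId: string, email: string) {
  await primary('users').where({ id: userId }).update({ email });
}

// Reads can be served by a replica, at the cost of a small replication lag.
export async function getProfile(userId: string) {
  return replica('users').where({ id: userId }).first();
}

This only scales reads. Every write still funnels through one box, which is exactly the ceiling described above.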

The Bargain Cassandra Makes

Cassandra makes a different promise. It will accept your writes and make them durable, but it won't promise that every node sees identical data at every instant. Given time, assuming no new writes arrive, all replicas eventually converge to the same state. This is eventual consistency, and accepting it unlocks different capabilities.

BASE stands for Basically Available, Soft State, Eventual Consistency. It was created as a playful counterpoint to ACID, acknowledging that distributed systems require different guarantees than traditional databases.

The architecture embodies this philosophy. Cassandra organizes itself as a ring of nodes, each node an equal peer. There's no primary, no master, no single point of failure. When you write data, you specify a consistency level determining how many replicas must acknowledge before the write succeeds.

Cassandra Consistency Levels:

| Level | Replicas Required | Use Case | Tradeoff |
| --- | --- | --- | --- |
| ONE | 1 replica responds | Fast reads, tolerates stale data | Fastest, least consistent |
| QUORUM | Majority responds (n/2 + 1) | Balanced consistency and performance | Good middle ground |
| ALL | All replicas respond | Strongest consistency | Slowest, sacrifices availability |
| LOCAL_QUORUM | Majority in local datacenter | Multi-datacenter deployments | Balances latency and consistency |
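
The snippets in this article use a simplified client object for brevity. With the DataStax Node.js driver specifically, the consistency level is passed per query through the consistencies enum; a minimal sketch, with contact points, datacenter name, keyspace, and table names as placeholder assumptions:

src/cassandra/client.ts
import { Client, types } from 'cassandra-driver';

// Contact points, datacenter, and keyspace are placeholders.
const client = new Client({
  contactPoints: ['10.0.0.1', '10.0.0.2', '10.0.0.3'],
  localDataCenter: 'dc1',
  keyspace: 'app',
});

// Same cluster, different points on the consistency/availability spectrum per query.
export function readFeed(userId: string) {
  return client.execute(
    'SELECT * FROM feed WHERE user_id = ?',
    [userId],
    { prepare: true, consistency: types.consistencies.one } // fast, may be stale
  );
}

export function writeProfile(userId: string, name: string) {
  return client.execute(
    'UPDATE profiles SET name = ? WHERE user_id = ?',
    [name, userId],
    { prepare: true, consistency: types.consistencies.localQuorum } // majority in the local DC
  );
}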

This peer-to-peer design provides remarkable availability. If one node fails, others continue serving requests. If an entire datacenter loses connectivity, the cluster degrades gracefully, with each datacenter serving requests using its local replicas. As long as enough replicas are reachable to satisfy the requested consistency level, the system keeps going: it doesn't stop, doesn't refuse writes, doesn't block waiting for unreachable nodes.

src/social/post-tweet.ts
async function postTweet(authorId: string, content: string) {
  const postId = generateId();
  
  // Write to author's timeline (fast, eventual consistency)
  await cassandra.execute(
    `INSERT INTO posts (user_id, post_id, content, created_at) 
     VALUES (?, ?, ?, ?)`,
    [authorId, postId, content, Date.now()],
    { consistency: 'QUORUM' }
  );
  
  // Fan out to 10M followers happens async in background
  await queue.publish('fanout', { authorId, postId });
  
  return { postId };
}

Twitter popularized this pattern. A celebrity with 50M followers posts, and the write fans out toward 50M timelines (in practice, the very largest accounts are merged in at read time rather than fanned out on every post). A Cassandra-style cluster absorbs that write load by distributing it across hundreds of nodes. A single PostgreSQL primary would choke.
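
The fan-out itself happens in the background worker that consumes the event published by postTweet. A sketch of that worker; the queue API, the followers and timelines tables, and the paging details are assumptions:

src/social/fanout-worker.ts
// Consumes 'fanout' events and copies the post into each follower's timeline
// partition, so reading a timeline later is a single-partition lookup per user.
queue.subscribe('fanout', async ({ authorId, postId }: { authorId: string; postId: string }) => {
  let pageState: string | undefined;

  do {
    // Page through followers instead of loading millions of rows at once
    const page = await cassandra.execute(
      'SELECT follower_id FROM followers WHERE user_id = ?',
      [authorId],
      { consistency: 'ONE', fetchSize: 5000, pageState }
    );

    for (const row of page.rows) {
      // Each insert lands on a different partition, spreading load across the ring
      await cassandra.execute(
        `INSERT INTO timelines (user_id, post_id, author_id, created_at)
         VALUES (?, ?, ?, ?)`,
        [row.follower_id, postId, authorId, Date.now()],
        { consistency: 'ONE' }
      );
    }

    pageState = page.pageState;
  } while (pageState);
});

The loop is deliberately the simplest possible version; at real scale you would batch the timeline writes and bound the concurrency.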

The reward is scale. Cassandra handles millions of writes per second by distributing them across dozens or hundreds of nodes. Adding capacity is straightforward. Add nodes to the cluster, and they automatically join the ring and start handling traffic. No reconfiguration, no downtime, no migration. The cluster rebalances itself, and throughput scales linearly with node count.

When Eventual Consistency Betrays You

The failure modes are subtle and often don't surface until production load exposes them. Consider user registration. The application checks if an email address exists, and if not, creates a new account. In PostgreSQL with ACID transactions, this works correctly under heavy concurrent load. The database's isolation mechanisms ensure two concurrent registration attempts for the same email serialize correctly.
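
For contrast, the safe PostgreSQL version relies on a unique index on email: the database arbitrates the race, and the losing request gets a predictable error. A sketch in the knex style used above; the error class is hypothetical:

src/auth/register-postgres.ts
// users.email has a UNIQUE constraint, so concurrent registrations can't both succeed
async function registerUser(email: string, username: string) {
  try {
    const [user] = await postgres('users')
      .insert({ id: generateId(), email, username })
      .returning('*');
    return user;
  } catch (err: any) {
    if (err.code === '23505') {
      // PostgreSQL unique_violation: someone registered this email a moment earlier
      throw new EmailAlreadyRegisteredError(email);
    }
    throw err;
  }
}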

In Cassandra, this check-then-write pattern is fundamentally unsafe. Two concurrent requests might both check for the email, both see no existing account, and both create new accounts with identical email addresses. Cassandra doesn't have uniqueness constraints to prevent this. Last write wins according to timestamp, so one account eventually overwrites the other, but meanwhile your system has inconsistent data.

src/auth/register-problem.ts
// Request 1 and Request 2 arrive 50ms apart for the same email

// Both requests check whether the email exists
const existing = await cassandra.execute(
  "SELECT * FROM users WHERE email = ?",
  [email],
  { consistency: 'QUORUM' }
);

// Both see no existing user: neither insert has happened yet, so even a
// QUORUM read returns nothing. This is a classic check-then-write race.
if (existing.rows.length === 0) {
  // BOTH requests create accounts with the same email, different IDs
  await cassandra.execute(
    "INSERT INTO users (id, email, username) VALUES (?, ?, ?)",
    [generateId(), email, username]
  );
}

Cassandra doesn't enforce uniqueness constraints the way PostgreSQL does: nothing stops you from inserting multiple rows with the same email. Lightweight transactions (INSERT ... IF NOT EXISTS) can close this particular race at a real latency cost, but for account data most teams simply keep the authoritative copy in PostgreSQL.
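
If the data has to live in Cassandra, the lightweight-transaction form is the partial fix: INSERT ... IF NOT EXISTS runs a Paxos round among the partition's replicas and reports whether the row was actually created. A sketch, assuming email is the table's partition key and reusing the hypothetical error class from the PostgreSQL example:

src/auth/register-lwt.ts
// Lightweight transaction: the insert applies only if no row with this email
// exists yet. Correct, but several times slower than a plain write.
const result = await cassandra.execute(
  'INSERT INTO users (email, id, username) VALUES (?, ?, ?) IF NOT EXISTS',
  [email, generateId(), username],
  { consistency: 'QUORUM' }
);

// LWT results include an [applied] column saying whether this insert won the race
const applied = result.first()['[applied]'];
if (!applied) {
  throw new EmailAlreadyRegisteredError(email);
}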

Counter operations present similar problems. Incrementing a counter seems simple: read the current value, add one, write the new value. In an eventually consistent system, this read-modify-write is unsafe. Two concurrent increments might both read ten, both compute eleven, and both write eleven. One increment vanishes. Cassandra provides dedicated counter columns that apply increments and decrements server-side to solve this, but counters have their own limitations: updates are not idempotent, so a retried increment after a timeout can count twice; counter columns can't be mixed with regular columns or batched atomically with other writes; and counter reads are still eventually consistent.
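
What the counter column looks like in practice: a dedicated table whose only non-key column is a counter, updated with relative increments instead of read-modify-write (the table name is illustrative):

src/analytics/view-counter.ts
// Schema (CQL), created once:
//   CREATE TABLE product_view_counts (
//     product_id text PRIMARY KEY,
//     views counter
//   );

// Counters are modified with UPDATE, never INSERT, and the delta is applied
// server-side, so concurrent increments don't overwrite each other.
async function recordView(productId: string) {
  await cassandra.execute(
    'UPDATE product_view_counts SET views = views + 1 WHERE product_id = ?',
    [productId],
    { consistency: 'ONE' }
  );
}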

The read-your-writes problem creates confusing user experiences. A user updates their profile, and the application confirms success. They immediately reload the page and see their old profile. Eventually the new data propagates to all replicas, but for seconds or minutes, the user sees stale data.

Read-your-writes can be mitigated in several ways: choose consistency levels so reads and writes overlap (with a replication factor of 3, writing at QUORUM and reading at QUORUM means R + W = 4 > 3, so every read hits at least one replica that saw the write), pin each user's requests to the same coordinator with sticky sessions, or, in a hybrid setup, serve a user's own recent changes from the authoritative PostgreSQL copy rather than from the asynchronously replicated Cassandra one.

The Hybrid Architecture That Actually Works

No production system of meaningful scale uses only ACID or only BASE. The architecture that works is hybrid, routing operations to databases whose guarantees match requirements.

Consider an e-commerce platform. When a customer completes a purchase, that operation touches multiple datastores. The order itself goes into PostgreSQL. This data requires ACID guarantees. The transaction must atomically deduct from inventory, create the order record, and store payment information. Any failure must roll back everything.

Once the order is placed, analytics events start firing. The application records that this user purchased these products, at this time, from this location. These events flow into Cassandra. The exact order doesn't matter. If one event is delayed by a network partition, it eventually arrives. If events are slightly out of sequence, analytics models tolerate it.

src/ecommerce/hybrid-approach.ts
class HybridDataLayer {
  // Critical path: PostgreSQL
  async createOrder(userId: string, items: CartItem[]) {
    return await postgres.transaction(async (trx) => {
      // Check inventory, deduct stock, create order
      // ACID guarantees all-or-nothing
    });
  }
  
  // High-volume path: Cassandra
  async recordProductView(userId: string, productId: string) {
    // Fire-and-forget, eventual consistency fine
    await cassandra.execute(
      `INSERT INTO product_views (product_id, user_id, viewed_at) 
       VALUES (?, ?, ?)`,
      [productId, userId, Date.now()],
      { consistency: 'ONE' }
    );
  }
}

Data Storage Decision Matrix:

| Data Type | Database | Consistency | Why |
| --- | --- | --- | --- |
| User accounts | PostgreSQL | ACID | Cannot have duplicate emails |
| Orders | PostgreSQL | ACID | Financial data, atomicity required |
| Payments | PostgreSQL | ACID | Zero tolerance for errors |
| Inventory | PostgreSQL | ACID | Can't oversell products |
| Product views | Cassandra | Eventual | High volume, approximate counts acceptable |
| Activity feed | Cassandra | Eventual | Slight delays don't matter |
| Session data | Cassandra | Eventual | Temporary, high churn |
| Audit logs | Cassandra | Eventual | Write-heavy, rarely read |

The boundary between these worlds requires careful management. When a PostgreSQL write triggers Cassandra updates, those updates happen asynchronously. A message queue decouples the two systems. The PostgreSQL transaction commits, publishes an event to a queue, and returns success. Background workers consume these events and update Cassandra. If Cassandra is temporarily unavailable, events queue up and apply later.
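
A sketch of that boundary, following the flow described above: commit in PostgreSQL, publish an event, and let a background worker apply it to Cassandra. The queue API, table names, and event shape are assumptions:

src/ecommerce/order-sync.ts
// Write side: the ACID transaction commits first, then the event is published.
// If the publish can fail independently of the commit, an outbox table written
// inside the transaction (and relayed by a worker) makes the hand-off reliable.
async function placeOrder(userId: string, items: CartItem[]) {
  const [order] = await postgres.transaction(async (trx) => {
    // ...check inventory, deduct stock, insert the order row (ACID)...
    return trx('orders').insert({ user_id: userId, status: 'paid' }).returning('*');
  });

  await queue.publish('order.created', { orderId: order.id, userId });
  return order;
}

// Read side: a background worker applies the event to Cassandra, eventually.
queue.subscribe('order.created', async ({ orderId, userId }: { orderId: string; userId: string }) => {
  await cassandra.execute(
    `INSERT INTO orders_by_user (user_id, order_id, created_at)
     VALUES (?, ?, ?)`,
    [userId, orderId, Date.now()],
    { consistency: 'QUORUM' }
  );
});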

Data flow must be unidirectional: PostgreSQL to Cassandra. Never the reverse. Bidirectional sync creates conflict resolution nightmares that are nearly impossible to solve correctly.

The CAP Theorem in Production

The CAP theorem is often oversimplified to "pick two of three," as if consistency, availability, and partition tolerance are knobs you turn independently. The reality is more nuanced.

Network partitions will occur. This isn't theoretical. In a distributed system spanning multiple datacenters, network issues happen regularly: a misconfigured firewall rule, a cut fiber-optic cable, a BGP routing mistake all cause partitions. Partition tolerance isn't optional. The real choice is between consistency and availability during partitions.

PostgreSQL configured for synchronous replication chooses consistency over availability. If the primary node cannot reach its replicas to confirm they've received a write, the write fails. The system becomes unavailable rather than risk serving inconsistent data. This behavior is exactly what you want for financial transactions, but it means network issues can bring down the system even when the nodes themselves are healthy.

Cassandra chooses availability over consistency. During a network partition, each node continues accepting writes using its local state. When the partition heals, conflict resolution mechanisms reconcile the divergent states. Last-write-wins is the default strategy, meaning some writes are silently discarded. This enables the system to remain available even during major infrastructure failures, but your application must handle cases where different users see different data.

Modern databases blur these lines through tunable consistency. Cassandra lets you choose consistency levels per query, effectively picking different points on the consistency-availability spectrum for different operations. A write at consistency level ALL temporarily sacrifices availability for consistency, blocking until all replicas acknowledge. A write at consistency level ONE sacrifices consistency for availability, returning immediately after a single replica acknowledges.

The CAP theorem forces a choice during partitions, but most of the time there are no partitions. A well-designed system uses strong consistency for critical paths and eventual consistency for everything else.

Making It Work

Data consistency boundaries must be explicit. Every piece of data in your system should have a clear answer to which database is authoritative. For user account information, PostgreSQL is authoritative. Changes happen there first, then asynchronously propagate to Cassandra if needed for high-volume access patterns. For user activity events, Cassandra is authoritative.

Monitoring consistency lag becomes critical. In a hybrid system, eventual consistency means there's always some lag between PostgreSQL and Cassandra. Under normal operation, this lag should be milliseconds or low seconds. If lag grows to minutes, something is wrong. Queue backlog is growing, or Cassandra is struggling with write volume, or background workers are crashing.

src/monitoring/consistency-monitor.ts
async function checkConsistencyLag() {
  // Sample users from PostgreSQL
  const pgUsers = await postgres('users').limit(1000);
  
  let driftCount = 0;
  
  for (const pgUser of pgUsers) {
    const cassUser = await cassandra.execute(
      'SELECT * FROM users_cache WHERE user_id = ?',
      [pgUser.id]
    ).then(r => r.first());
    
    if (!cassUser || cassUser.username !== pgUser.username) {
      driftCount++;
    }
  }
  
  const driftPercent = (driftCount / pgUsers.length) * 100;
  
  if (driftPercent > 5) {
    // alert() stands in for whatever paging/alerting helper the project uses
    alert('DATA DRIFT CRITICAL', { drift: driftPercent });
  }
}

The system must degrade gracefully when one database becomes unavailable. If Cassandra goes down, PostgreSQL should continue serving the application's critical paths. Read performance might suffer, but the application stays functional. If PostgreSQL goes down, critical operations requiring ACID guarantees must fail fast rather than silently writing to Cassandra and creating inconsistent state.
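
A minimal sketch of that policy, with the non-critical Cassandra path degrading and the critical PostgreSQL path failing fast (the FeedItem type and the metrics helper are hypothetical):

src/resilience/degradation.ts
// Non-critical read path: if Cassandra is unavailable, degrade instead of erroring.
export async function getActivityFeed(userId: string): Promise<FeedItem[]> {
  try {
    const result = await cassandra.execute(
      'SELECT * FROM timelines WHERE user_id = ? LIMIT 50',
      [userId],
      { consistency: 'ONE' }
    );
    return result.rows as FeedItem[];
  } catch (err) {
    metrics.increment('feed.degraded'); // hypothetical metrics helper
    return []; // an empty feed beats a 500 for a non-critical feature
  }
}

// Critical path: never fall back to the eventually consistent store.
export async function createOrder(userId: string, items: CartItem[]) {
  // If PostgreSQL is down this throws, and that's the correct behavior:
  // failing fast is safer than silently writing orders somewhere without ACID.
  return postgres.transaction(async (trx) => {
    // ...inventory check, stock deduction, order insert...
  });
}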

Testing failure modes is not optional. In development, it's easy to assume both databases will always be available. In production, one will eventually fail. Systematically testing the failure scenarios (Cassandra down, PostgreSQL down, a network partition between the two databases) surfaces bugs before they cause outages.
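
Even a crude automated test catches regressions in this behavior. A sketch with Jest, stubbing the Cassandra client to simulate an outage; the module paths match the hypothetical files above:

src/resilience/degradation.test.ts
import { getActivityFeed } from './degradation';
import { cassandra } from '../clients'; // hypothetical module exporting the shared client

test('activity feed degrades to empty when Cassandra is down', async () => {
  // Simulate a full Cassandra outage
  jest.spyOn(cassandra, 'execute').mockRejectedValue(new Error('all hosts unavailable'));

  // The application should degrade gracefully, not throw
  await expect(getActivityFeed('user-123')).resolves.toEqual([]);
});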


The decision between ACID and BASE is not a choice between right and wrong. It's choosing different tradeoffs appropriate for different requirements. Strong consistency where correctness cannot be compromised. Eventual consistency where scale and availability matter more than immediate accuracy. PostgreSQL for financial transactions and user accounts. Cassandra for analytics events and activity feeds. Clear boundaries between the two. Unidirectional data flow. Monitoring to ensure consistency lag stays bounded.

This architecture is more complex than using a single database everywhere, but the complexity serves a purpose. It lets each database do what it does best. The result is a system that handles scale gracefully, maintains strong consistency where it matters, and stays available during the network partitions and infrastructure failures that characterize real production environments.
