<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Prateek Codes - Building Scalable Backend Systems</title>
    <description>Learn how to build scalable backend systems with Ruby on Rails, PostgreSQL optimization, database scaling, and practical engineering insights from real-world production experiences.</description>
    <link>https://prateekcodes.com/</link>
    <atom:link href="https://prateekcodes.com/feed.xml" rel="self" type="application/rss+xml"/>
    <pubDate>Thu, 15 Jan 2026 20:55:41 +0000</pubDate>
    <lastBuildDate>Thu, 15 Jan 2026 20:55:41 +0000</lastBuildDate>
    <generator>Jekyll v4.4.1</generator>
    
      <item>
        <title>Ruby::Box Practical Guide: Use Cases and Integration Patterns (Part 2)</title>
        <description>&lt;p&gt;In &lt;a href=&quot;/ruby-4-introduces-ruby-box-for-in-process-isolation-part-1/&quot;&gt;Part 1&lt;/a&gt;, we covered what &lt;code&gt;Ruby::Box&lt;/code&gt; is and how it provides namespace isolation. Now let’s explore practical patterns for integrating it into real applications.&lt;/p&gt;

&lt;h2 id=&quot;use-case-plugin-systems&quot;&gt;Use Case: Plugin Systems&lt;/h2&gt;

&lt;p&gt;Plugin systems benefit significantly from &lt;code&gt;Ruby::Box&lt;/code&gt;. Each plugin runs in its own isolated environment, preventing plugins from interfering with each other or the host application.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class PluginManager
  def initialize
    @plugins = {}
  end

  def load_plugin(name, path)
    box = Ruby::Box.new
    box.require(path)

    # Access the plugin class from within the box
    plugin_class = box.eval(&apos;Plugin&apos;)
    @plugins[name] = {
      box: box,
      instance: plugin_class.new
    }
  end

  def run(name, method, *args)
    plugin = @plugins[name]
    plugin[:instance].public_send(method, *args)
  end

  def unload(name)
    @plugins.delete(name)
    # Box becomes eligible for garbage collection
  end
end

# Usage
manager = PluginManager.new
manager.load_plugin(:markdown, &apos;./plugins/markdown_plugin&apos;)
manager.load_plugin(:syntax_highlight, &apos;./plugins/syntax_plugin&apos;)

# Each plugin has its own isolated environment
# If markdown_plugin patches String, syntax_plugin won&apos;t see it
manager.run(:markdown, :process, content)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This pattern ensures that a misbehaving plugin cannot corrupt the global namespace or break other plugins.&lt;/p&gt;

&lt;h2 id=&quot;use-case-multi-tenant-configuration&quot;&gt;Use Case: Multi-Tenant Configuration&lt;/h2&gt;

&lt;p&gt;Applications serving multiple tenants often need per-tenant configurations. &lt;code&gt;Ruby::Box&lt;/code&gt; provides clean isolation without complex scoping logic.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class TenantContext
  def initialize(tenant_id, config_path)
    @tenant_id = tenant_id
    @box = Ruby::Box.new
    @box.require(config_path)
  end

  def config
    @box.eval(&apos;TenantConfig&apos;)
  end

  def execute(code)
    @box.eval(code)
  end
end

# Each tenant gets isolated configuration
tenant_a = TenantContext.new(&apos;acme&apos;, &apos;./tenants/acme/config&apos;)
tenant_b = TenantContext.new(&apos;globex&apos;, &apos;./tenants/globex/config&apos;)

tenant_a.config.theme      # =&amp;gt; &quot;dark&quot;
tenant_b.config.theme      # =&amp;gt; &quot;light&quot;

# Global variables are isolated too
tenant_a.execute(&apos;$rate_limit = 100&apos;)
tenant_b.execute(&apos;$rate_limit = 500&apos;)

tenant_a.execute(&apos;$rate_limit&apos;)  # =&amp;gt; 100
tenant_b.execute(&apos;$rate_limit&apos;)  # =&amp;gt; 500
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;use-case-running-multiple-gem-versions&quot;&gt;Use Case: Running Multiple Gem Versions&lt;/h2&gt;

&lt;p&gt;During migrations, you might need to run two versions of the same gem simultaneously. &lt;code&gt;Ruby::Box&lt;/code&gt; makes this possible without separate processes.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Load v1 API client in one box.
# Constants are used so the boxes are visible inside the method below
# (top-level local variables are not in scope inside `def`).
V1_BOX = Ruby::Box.new
V1_BOX.eval &amp;lt;&amp;lt;~RUBY
  $LOAD_PATH.unshift(&apos;./vendor/api_client_v1/lib&apos;)
  require &apos;api_client&apos;
RUBY

# Load v2 API client in another box
V2_BOX = Ruby::Box.new
V2_BOX.eval &amp;lt;&amp;lt;~RUBY
  $LOAD_PATH.unshift(&apos;./vendor/api_client_v2/lib&apos;)
  require &apos;api_client&apos;
RUBY

# Compare behavior during migration
def compare_responses(endpoint, params)
  code = &quot;ApiClient.get(&apos;#{endpoint}&apos;, #{params.inspect})&quot;
  v1_response = V1_BOX.eval(code)
  v2_response = V2_BOX.eval(code)

  if v1_response != v2_response
    log_difference(endpoint, v1_response, v2_response)
  end

  v1_response  # Return v1 for now, switch to v2 when ready
end
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;use-case-isolated-monkey-patches-for-testing&quot;&gt;Use Case: Isolated Monkey Patches for Testing&lt;/h2&gt;

&lt;p&gt;Some tests require monkey patches that would pollute the global namespace. &lt;code&gt;Ruby::Box&lt;/code&gt; keeps these contained.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# test_helper.rb
def create_time_frozen_box(frozen_time)
  box = Ruby::Box.new
  box.eval &amp;lt;&amp;lt;~RUBY
    class Time
      def self.now
        Time.new(#{frozen_time.year}, #{frozen_time.month}, #{frozen_time.day})
      end
    end
  RUBY
  box
end

# In your test
def test_subscription_expiry
  box = create_time_frozen_box(Time.new(2026, 1, 1))

  # Test code inside the frozen-time box (assumes Subscription is loaded there)
  box.eval &amp;lt;&amp;lt;~RUBY
    expiry_date = Time.new(2025, 12, 31)
    subscription = Subscription.new(expires_at: expiry_date)
    raise &quot;Expected expired&quot; unless subscription.expired?
  RUBY

  # Time.now is unchanged outside the box
  Time.now  # =&amp;gt; Current actual time
end
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;use-case-shadow-testing&quot;&gt;Use Case: Shadow Testing&lt;/h2&gt;

&lt;p&gt;Run new code paths alongside production code to compare results without affecting users. This pattern is useful for validating refactors or new implementations.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class ShadowRunner
  def initialize(production_box, shadow_box)
    @production = production_box
    @shadow = shadow_box
  end

  def run(method, *args)
    code = &quot;#{method}(#{args.map(&amp;amp;:inspect).join(&apos;, &apos;)})&quot;

    # Production path returns the result
    production_result = @production.eval(code)

    # Shadow path runs asynchronously, logs differences
    Thread.new do
      shadow_result = @shadow.eval(code)

      unless production_result == shadow_result
        # Kernel#warn used here; swap in your structured logger of choice
        warn &quot;Shadow mismatch for #{method}: &quot; \
             &quot;production=#{production_result.inspect} &quot; \
             &quot;shadow=#{shadow_result.inspect}&quot;
      end
    end

    production_result
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;working-around-native-extension-issues&quot;&gt;Working Around Native Extension Issues&lt;/h2&gt;

&lt;p&gt;Native extensions may fail to install with &lt;code&gt;RUBY_BOX=1&lt;/code&gt; enabled. The solution is to separate installation from execution:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Gemfile installation without Boxing
bundle install

# Application execution with Boxing
RUBY_BOX=1 bundle exec ruby app.rb
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;For CI/CD pipelines:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;# .github/workflows/test.yml
jobs:
  test:
    steps:
      - name: Install dependencies
        run: bundle install

      - name: Run tests with Ruby::Box
        run: bundle exec rspec
        env:
          RUBY_BOX: &quot;1&quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;working-around-activesupport-issues&quot;&gt;Working Around ActiveSupport Issues&lt;/h2&gt;

&lt;p&gt;Some ActiveSupport core extensions have compatibility issues. Load them in your main context before creating boxes:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# At application startup, before creating any boxes
require &apos;active_support/core_ext/string/inflections&apos;
require &apos;active_support/core_ext/hash/keys&apos;

# Now create boxes for isolated code
plugin_box = Ruby::Box.new
# Plugins can use the already-loaded extensions
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Alternatively, selectively load only what you need inside boxes:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;box = Ruby::Box.new
box.eval &amp;lt;&amp;lt;~RUBY
  # Load specific extensions that are known to work
  require &apos;active_support/core_ext/object/blank&apos;
RUBY
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;performance-considerations&quot;&gt;Performance Considerations&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;Ruby::Box&lt;/code&gt; adds minimal overhead for most operations:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Method dispatch&lt;/strong&gt;: Slightly more indirection through separate method tables&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Object creation&lt;/strong&gt;: Unaffected; objects pass freely between boxes&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Memory&lt;/strong&gt;: Each box maintains its own class/module definitions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For performance-critical paths, cache class references:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class OptimizedPluginRunner
  def initialize(box)
    @box = box
    # Cache the class reference once
    @processor_class = box.eval(&apos;DataProcessor&apos;)
  end

  def process(data)
    # Use cached reference instead of evaluating each time
    @processor_class.new.process(data)
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;when-to-use-rubybox&quot;&gt;When to Use &lt;code&gt;Ruby::Box&lt;/code&gt;&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Good candidates:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Plugin or extension systems where isolation is critical&lt;/li&gt;
  &lt;li&gt;Multi-tenant applications with per-tenant customizations&lt;/li&gt;
  &lt;li&gt;Testing scenarios requiring invasive monkey patches&lt;/li&gt;
  &lt;li&gt;Gradual migration between gem versions&lt;/li&gt;
  &lt;li&gt;Applications loading third-party code that might conflict&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Poor candidates:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Running untrusted or potentially malicious code (use OS-level sandboxing)&lt;/li&gt;
  &lt;li&gt;Production systems until the feature stabilizes&lt;/li&gt;
  &lt;li&gt;Applications heavily dependent on native extensions&lt;/li&gt;
  &lt;li&gt;Simple applications without isolation requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;migration-strategy&quot;&gt;Migration Strategy&lt;/h2&gt;

&lt;p&gt;If you’re considering &lt;code&gt;Ruby::Box&lt;/code&gt; for an existing application:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Test compatibility&lt;/strong&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Run your test suite with Boxing enabled
RUBY_BOX=1 bundle exec rspec
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Identify issues&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Look for failures related to:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Shared global state across files&lt;/li&gt;
  &lt;li&gt;Assumptions about class modifications being visible everywhere&lt;/li&gt;
  &lt;li&gt;Native extension loading errors&lt;/li&gt;
&lt;/ul&gt;
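
&lt;p&gt;A quick way to surface the first category is to list &lt;code&gt;$&lt;/code&gt;-globals referenced from more than one file. The sketch below is a rough heuristic, not part of &lt;code&gt;Ruby::Box&lt;/code&gt;: the helper name and regex are illustrative, and it will also match globals inside strings and comments.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Flag $globals referenced from multiple files: likely boxing hazards
def shared_globals(root)
  refs = Hash.new { |h, k| h[k] = [] }
  Dir.glob(File.join(root, &apos;**&apos;, &apos;*.rb&apos;)).each do |file|
    File.foreach(file) do |line|
      line.scan(/\$[a-z_][a-zA-Z0-9_]*/).each { |g| refs[g] &amp;lt;&amp;lt; file }
    end
  end
  refs.select { |_, files| files.uniq.size &amp;gt; 1 }
end

shared_globals(&apos;.&apos;).each do |name, files|
  puts &quot;#{name} is shared across: #{files.uniq.join(&apos;, &apos;)}&quot;
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Anything this flags is a candidate for refactoring into explicit state before that code moves into a box.&lt;/p&gt;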

&lt;p&gt;&lt;strong&gt;Step 3: Refactor incrementally&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Start with isolated subsystems that don’t share state with the rest of your application. Move more code into boxes as you gain confidence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4: Monitor in staging&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Run your staging environment with &lt;code&gt;RUBY_BOX=1&lt;/code&gt; before considering production deployment.&lt;/p&gt;

&lt;h2 id=&quot;whats-next-for-rubybox&quot;&gt;What’s Next for &lt;code&gt;Ruby::Box&lt;/code&gt;&lt;/h2&gt;

&lt;p&gt;The Ruby core team has discussed building a higher-level “packages” API on top of &lt;code&gt;Ruby::Box&lt;/code&gt;. This would provide more ergonomic ways to manage gem isolation without manual box management. Track progress in &lt;a href=&quot;https://bugs.ruby-lang.org/issues/21681&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Ruby packages feature discussion (opens in new tab)&quot;&gt;Ruby Issue #21681&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Ruby::Box&lt;/code&gt; solves real problems around namespace pollution and gem conflicts. While still experimental, it’s worth exploring for applications where isolation matters. Start with non-critical paths, understand the limitations, and provide feedback to the Ruby core team as you experiment.&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.ruby-lang.org/en/master/Ruby/Box.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Ruby::Box official documentation (opens in new tab)&quot;&gt;Ruby::Box Official Documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/geeknees/ruby_box_shadow_universe&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Ruby::Box shadow execution example repository (opens in new tab)&quot;&gt;Ruby::Box Shadow Execution Example&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://rubykaigi.org/2025/presentations/tagomoris.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;State of Namespace presentation at RubyKaigi 2025 (opens in new tab)&quot;&gt;RubyKaigi 2025: State of Namespace&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://bugs.ruby-lang.org/issues/21681&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Ruby packages feature discussion (opens in new tab)&quot;&gt;Ruby Issue #21681: Packages API&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Thu, 15 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://prateekcodes.com/ruby-4-ruby-box-practical-guide-part-2/</link>
        <guid isPermaLink="true">https://prateekcodes.com/ruby-4-ruby-box-practical-guide-part-2/</guid>
        
        <category>ruby</category>
        
        <category>ruby-box</category>
        
        <category>namespace</category>
        
        <category>isolation</category>
        
        <category>plugins</category>
        
        <category>multi-tenant</category>
        
        <category>ruby-4</category>
        
        
        <category>Ruby</category>
        
        <category>Ruby 4.0</category>
        
        <category>Isolation</category>
        
      </item>
    
      <item>
        <title>Ruby 4.0 Introduces Ruby::Box for In-Process Isolation (Part 1)</title>
        <description>&lt;p&gt;Ruby 4.0 introduces &lt;code&gt;Ruby::Box&lt;/code&gt;, a feature that provides isolated namespaces within a single Ruby process. This solves a long-standing problem: monkey patches and global modifications from one gem affecting all other code in your application.&lt;/p&gt;

&lt;h2 id=&quot;the-problem-with-shared-namespaces&quot;&gt;The Problem with Shared Namespaces&lt;/h2&gt;

&lt;p&gt;When you load a gem that modifies core classes, those changes affect everything in your Ruby process:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Gem A adds a titleize method to String
class String
  def titleize
    split.map(&amp;amp;:capitalize).join(&apos; &apos;)
  end
end

# Now EVERY piece of code in your process sees this method
# Including Gem B, which might have its own expectations

&quot;hello world&quot;.titleize  # =&amp;gt; &quot;Hello World&quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This becomes problematic when:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Two gems define conflicting methods on the same class&lt;/li&gt;
  &lt;li&gt;A gem’s monkey patch breaks another library’s assumptions&lt;/li&gt;
  &lt;li&gt;You want to test code in isolation from invasive patches&lt;/li&gt;
  &lt;li&gt;You need to run multiple versions of a gem simultaneously&lt;/li&gt;
&lt;/ul&gt;
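
&lt;p&gt;The first failure mode is easy to reproduce in plain Ruby today. When two libraries patch the same method, the definition loaded last silently wins for the entire process:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Gem A&apos;s patch
class String
  def titleize
    split.map(&amp;amp;:capitalize).join(&apos; &apos;)
  end
end

# Gem B later redefines the same method
class String
  def titleize
    upcase
  end
end

# Gem A&apos;s definition is silently gone, everywhere
&quot;hello world&quot;.titleize  # =&amp;gt; &quot;HELLO WORLD&quot;, not &quot;Hello World&quot;
&lt;/code&gt;&lt;/pre&gt;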

&lt;p&gt;Before Ruby 4.0, the only solutions were separate Ruby processes (with IPC overhead) or containers (with even more overhead).&lt;/p&gt;

&lt;h2 id=&quot;ruby-40-enter-rubybox&quot;&gt;Ruby 4.0: Enter Ruby::Box&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;Ruby::Box&lt;/code&gt; creates isolated spaces where code runs with its own class definitions, constants, and global variables. Changes made inside a box stay inside that box.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Enable with environment variable at startup
# RUBY_BOX=1 ruby my_script.rb

# Check if Boxing is available
Ruby::Box.enabled?  # =&amp;gt; true

# Create an isolated box
box = Ruby::Box.new

# Load code that patches String
box.eval &amp;lt;&amp;lt;~RUBY
  class String
    def shout
      upcase + &quot;!!!&quot;
    end
  end
RUBY

# The patch exists only inside the box
box.eval(&apos;&quot;hello&quot;.shout&apos;)  # =&amp;gt; &quot;HELLO!!!&quot;

# Outside the box, String is unchanged
&quot;hello&quot;.shout  # =&amp;gt; NoMethodError: undefined method `shout&apos;
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;understanding-box-types&quot;&gt;Understanding Box Types&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;Ruby::Box&lt;/code&gt; operates with three types of boxes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Root Box&lt;/strong&gt;: Contains all built-in Ruby classes and modules. This is established before any user code runs and serves as the template for other boxes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Main Box&lt;/strong&gt;: Your application’s default execution context. It’s automatically created from the root box when the process starts. This is where your main script runs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User Boxes&lt;/strong&gt;: Custom boxes you create with &lt;code&gt;Ruby::Box.new&lt;/code&gt;. Each is copied from the root box, giving it a clean slate of built-in classes without any modifications from the main box or other user boxes.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Your script runs in the &quot;main&quot; box
Ruby::Box.current  # =&amp;gt; #&amp;lt;Ruby::Box main&amp;gt;

# Create isolated boxes
plugin_box = Ruby::Box.new
another_box = Ruby::Box.new

# Each box is independent
plugin_box.object_id != another_box.object_id  # =&amp;gt; true
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;the-rubybox-api&quot;&gt;The Ruby::Box API&lt;/h2&gt;

&lt;p&gt;The API is straightforward with just a few methods:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Creation
box = Ruby::Box.new

# Loading code
box.require(&apos;some_library&apos;)        # Respects box&apos;s $LOAD_PATH
box.require_relative(&apos;./my_file&apos;)  # Relative to current file
box.load(&apos;script.rb&apos;)              # Direct file execution

# Executing code
box.eval(&apos;1 + 1&apos;)                  # Execute Ruby code as string

# Inspection
Ruby::Box.current    # Returns the currently executing box
Ruby::Box.enabled?   # Check if Boxing is active
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;what-gets-isolated&quot;&gt;What Gets Isolated&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;Ruby::Box&lt;/code&gt; isolates several aspects of the Ruby runtime:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Classes and Constants&lt;/strong&gt;: Reopening a built-in class in one box doesn’t affect other boxes.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;box = Ruby::Box.new
box.eval &amp;lt;&amp;lt;~RUBY
  class Array
    def sum_squares
      map { |n| n ** 2 }.sum
    end
  end
RUBY

box.eval(&apos;[1, 2, 3].sum_squares&apos;)  # =&amp;gt; 14
[1, 2, 3].sum_squares              # =&amp;gt; NoMethodError
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;Global Variables&lt;/strong&gt;: Changes to globals stay within the box.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;box = Ruby::Box.new
box.eval(&apos;$my_config = { debug: true }&apos;)

box.eval(&apos;$my_config&apos;)  # =&amp;gt; { debug: true }
$my_config              # =&amp;gt; nil
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;Top-Level Methods&lt;/strong&gt;: Methods defined at the top level become private instance methods of &lt;code&gt;Object&lt;/code&gt; within that box only.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;box = Ruby::Box.new
box.eval &amp;lt;&amp;lt;~RUBY
  def helper_method
    &quot;I&apos;m only available in this box&quot;
  end
RUBY

box.eval(&apos;helper_method&apos;)  # =&amp;gt; &quot;I&apos;m only available in this box&quot;
helper_method              # =&amp;gt; NoMethodError
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;enabling-rubybox&quot;&gt;Enabling Ruby::Box&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;Ruby::Box&lt;/code&gt; is disabled by default. Enable it by setting the &lt;code&gt;RUBY_BOX&lt;/code&gt; environment variable before the Ruby process starts:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;RUBY_BOX=1 ruby my_application.rb
&lt;/code&gt;&lt;/pre&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: Setting &lt;code&gt;RUBY_BOX&lt;/code&gt; after the process has started has no effect. The boxing infrastructure must be initialized during Ruby’s boot sequence, so the variable must be set before the Ruby process starts.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# This check should be at the top of your application
unless Ruby::Box.enabled?
  warn &quot;Ruby::Box is not enabled. Start with RUBY_BOX=1&quot;
  exit 1
end
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;important-limitations&quot;&gt;Important Limitations&lt;/h2&gt;

&lt;p&gt;Before adopting &lt;code&gt;Ruby::Box&lt;/code&gt;, be aware of these constraints:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Not a Security Sandbox&lt;/strong&gt;: &lt;code&gt;Ruby::Box&lt;/code&gt; provides namespace isolation, not security isolation. Code in a box can still access the filesystem, network, and system resources. Do not use it to run untrusted code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Native Extensions&lt;/strong&gt;: Installing gems with native extensions may fail when &lt;code&gt;RUBY_BOX=1&lt;/code&gt; is set. The workaround is to install gems without the flag, then run your application with it enabled.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Install gems normally
bundle install

# Run with Boxing enabled
RUBY_BOX=1 bundle exec ruby app.rb
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;ActiveSupport Compatibility&lt;/strong&gt;: Some parts of &lt;code&gt;active_support/core_ext&lt;/code&gt; have compatibility issues with &lt;code&gt;Ruby::Box&lt;/code&gt;. Load &lt;code&gt;ActiveSupport&lt;/code&gt; in your main context before creating boxes if needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Experimental Status&lt;/strong&gt;: This feature is experimental in Ruby 4.0. Behavior may change in future versions. The Ruby core team recommends experimentation but advises caution in production environments.&lt;/p&gt;

&lt;h2 id=&quot;file-scope-execution&quot;&gt;File Scope Execution&lt;/h2&gt;

&lt;p&gt;One important detail: &lt;code&gt;Ruby::Box&lt;/code&gt; operates on a file-scope basis. Each &lt;code&gt;.rb&lt;/code&gt; file executes entirely within a single box. Once loaded, all methods and procs defined in that file operate within their originating box, regardless of where they’re called from.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# helper.rb
def process(data)
  # This method always runs in the box where helper.rb was loaded
  data.transform
end

# main.rb
box = Ruby::Box.new
box.require_relative(&apos;helper&apos;)

# Even when called from main, process() runs in box&apos;s context
box.eval(&apos;process(my_data)&apos;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;Ruby::Box&lt;/code&gt; brings a long-requested capability to Ruby: proper namespace isolation without process boundaries. In &lt;a href=&quot;/ruby-4-ruby-box-practical-guide-part-2/&quot;&gt;Part 2&lt;/a&gt;, we’ll explore practical use cases including plugin systems, multi-tenant configurations, and strategies for gradual adoption.&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.ruby-lang.org/en/master/Ruby/Box.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Ruby::Box official documentation (opens in new tab)&quot;&gt;Ruby::Box Official Documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.ruby-lang.org/en/news/2025/12/25/ruby-4-0-0-released/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Ruby 4.0.0 release announcement (opens in new tab)&quot;&gt;Ruby 4.0.0 Release Notes&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://dev.to/ko1/rubybox-digest-introduction-ruby-400-new-feature-3bch&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;ko1&apos;s Ruby::Box introduction on DEV.to (opens in new tab)&quot;&gt;Ruby::Box Introduction by ko1&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://rubyreferences.github.io/rubychanges/4.0.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Ruby 4.0 changes reference (opens in new tab)&quot;&gt;Ruby 4.0 Changes Documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Wed, 14 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://prateekcodes.com/ruby-4-introduces-ruby-box-for-in-process-isolation-part-1/</link>
        <guid isPermaLink="true">https://prateekcodes.com/ruby-4-introduces-ruby-box-for-in-process-isolation-part-1/</guid>
        
        <category>ruby</category>
        
        <category>ruby-box</category>
        
        <category>namespace</category>
        
        <category>isolation</category>
        
        <category>monkey-patching</category>
        
        <category>ruby-4</category>
        
        
        <category>Ruby</category>
        
        <category>Ruby 4.0</category>
        
        <category>Isolation</category>
        
      </item>
    
      <item>
        <title>Rails 8.2 makes enqueue_after_transaction_commit the default</title>
        <description>&lt;p&gt;&lt;a href=&quot;/rails-72-enqueue-after-transaction-commit&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Blog post about Rails 7.2 enqueue_after_transaction_commit (opens in new tab)&quot;&gt;Rails 7.2&lt;/a&gt; introduced &lt;code&gt;enqueue_after_transaction_commit&lt;/code&gt; to prevent race conditions when jobs are enqueued inside database transactions. However, it required explicit opt-in. Rails 8.2 flips the default. Jobs are now automatically deferred until after the transaction commits.&lt;/p&gt;

&lt;h2 id=&quot;the-problem-with-opt-in&quot;&gt;The Problem with Opt-In&lt;/h2&gt;

&lt;p&gt;With the opt-in approach in Rails 7.2, teams had to remember to enable the feature:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;config.active_job.enqueue_after_transaction_commit = :default
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Or configure it per-job:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class WelcomeEmailJob &amp;lt; ApplicationJob
  self.enqueue_after_transaction_commit = :always
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This created inconsistency. Some jobs would be transaction-aware, others would not. The safer behavior required explicit action.&lt;/p&gt;

&lt;h2 id=&quot;rails-82-changes-the-default&quot;&gt;Rails 8.2 Changes the Default&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/rails/rails/pull/55788&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Rails PR making enqueue_after_transaction_commit the default (opens in new tab)&quot;&gt;PR #55788&lt;/a&gt; changes this. When you upgrade to Rails 8.2 and run &lt;code&gt;load_defaults &quot;8.2&quot;&lt;/code&gt;, jobs are automatically deferred until after the transaction commits.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;def create
  User.transaction do
    user = User.create!(user_params)
    WelcomeEmailJob.perform_later(user)  # Deferred until commit
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;No configuration needed. The job waits for the transaction to complete before being dispatched to the queue.&lt;/p&gt;
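
&lt;p&gt;Conceptually, the deferral behaves like registering the enqueue as an after-commit callback. Here is a toy plain-Ruby model of the idea (not the actual Rails internals; the class and job names are illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Minimal model: blocks registered during a &quot;transaction&quot; only run on commit
class ToyTransaction
  def initialize
    @after_commit = []
  end

  def after_commit(&amp;amp;block)
    @after_commit &amp;lt;&amp;lt; block
  end

  def commit!
    @after_commit.each(&amp;amp;:call)
  end
end

queue = []
txn = ToyTransaction.new
txn.after_commit { queue &amp;lt;&amp;lt; :welcome_email_job }

queue  # =&amp;gt; [] (nothing dispatched while the transaction is open)
txn.commit!
queue  # =&amp;gt; [:welcome_email_job]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If the transaction rolls back, &lt;code&gt;commit!&lt;/code&gt; never runs and the job is never dispatched, which is exactly the race the new default eliminates.&lt;/p&gt;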

&lt;h2 id=&quot;opting-out&quot;&gt;Opting Out&lt;/h2&gt;

&lt;p&gt;If you need immediate enqueueing for backward compatibility or specific use cases, you have two options.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Global configuration:&lt;/strong&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;config.active_job.enqueue_after_transaction_commit = false
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;Per-job configuration:&lt;/strong&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class TimeStampedJob &amp;lt; ApplicationJob
  self.enqueue_after_transaction_commit = false
end
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;why-the-global-config-was-restored&quot;&gt;Why the Global Config Was Restored&lt;/h2&gt;

&lt;p&gt;The global configuration option has an interesting history. It was deprecated and removed in Rails 8.1. The team initially wanted each job to declare its own preference. However, changing the default behavior without a global opt-out would break existing applications.&lt;/p&gt;

&lt;p&gt;The PR restored the global configuration specifically to allow apps upgrading to Rails 8.2 to maintain their existing behavior without modifying every job class.&lt;/p&gt;

&lt;h2 id=&quot;when-this-matters&quot;&gt;When This Matters&lt;/h2&gt;

&lt;p&gt;The new default primarily affects jobs enqueued to external queues, such as the Redis-backed Sidekiq or Resque. If you use a database-backed queue like Solid Queue or GoodJob on the same database, the enqueue itself is already part of the same transaction.&lt;/p&gt;

&lt;p&gt;Jobs that do not depend on transaction data can still be configured for immediate enqueueing if needed.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Rails 8.2 makes the safer behavior the default. Jobs enqueued inside transactions automatically wait for the commit, eliminating a common source of race conditions without requiring explicit configuration.&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/rails/rails/pull/55788&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Rails PR making enqueue_after_transaction_commit the default (opens in new tab)&quot;&gt;Pull Request #55788&lt;/a&gt; making this the default&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/rails/rails/pull/51426&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Rails PR introducing enqueue_after_transaction_commit (opens in new tab)&quot;&gt;Pull Request #51426&lt;/a&gt; introducing the feature in Rails 7.2&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/rails-72-enqueue-after-transaction-commit&quot; aria-label=&quot;Blog post about Rails 7.2 enqueue_after_transaction_commit&quot;&gt;Rails 7.2 enqueue_after_transaction_commit&lt;/a&gt; - detailed explanation of the feature&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Wed, 31 Dec 2025 00:00:00 +0000</pubDate>
        <link>https://prateekcodes.com/rails-82-enqueue-after-transaction-commit-default/</link>
        <guid isPermaLink="true">https://prateekcodes.com/rails-82-enqueue-after-transaction-commit-default/</guid>
        
        <category>rails-8-2</category>
        
        <category>active-job</category>
        
        <category>transactions</category>
        
        <category>background-jobs</category>
        
        
        <category>Rails</category>
        
        <category>Rails 8.2</category>
        
        <category>Active Job</category>
        
      </item>
    
      <item>
        <title>Rails 7.2 adds enqueue_after_transaction_commit to prevent job race conditions</title>
        <description>&lt;p&gt;Scheduling background jobs inside database transactions is a common anti-pattern which is a source of several production bugs in Rails applications. The job can execute before the transaction commits, leading to &lt;code&gt;RecordNotFound&lt;/code&gt; or &lt;code&gt;ActiveJob::DeserializationError&lt;/code&gt; because the data it needs does not exist yet. Or worse, the job could run assuming the txn would commit, but it rolls back at a later stage. We don’t need that kind of optimism.&lt;/p&gt;

&lt;p&gt;Rails 7.2 addresses this with &lt;code&gt;enqueue_after_transaction_commit&lt;/code&gt;, which automatically defers job enqueueing until the transaction completes.&lt;/p&gt;

&lt;h2 id=&quot;before&quot;&gt;Before&lt;/h2&gt;

&lt;p&gt;Consider a typical pattern where you create a user and send a welcome email:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class UsersController &amp;lt; ApplicationController
  def create
    User.transaction do
      @user = User.create!(user_params)
      WelcomeEmailJob.perform_later(@user)
    end
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This code works fine in development where your job queue is slow and transactions commit quickly. In production, with a fast Redis-backed queue like Sidekiq and a busy database, the job can start executing before the transaction commits:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Timeline:
1. Transaction begins
2. User INSERT executes (not committed yet)
3. Job enqueued to Redis
4. Sidekiq picks up job immediately
5. Job tries to find User -&amp;gt; RecordNotFound!
6. Transaction commits (too late)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The same problem occurs with &lt;code&gt;after_create&lt;/code&gt; callbacks in models:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class Project &amp;lt; ApplicationRecord
  after_create -&amp;gt; { NotifyParticipantsJob.perform_later(self) }
end
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;the-workaround&quot;&gt;The Workaround&lt;/h3&gt;

&lt;p&gt;The standard fix was to use &lt;code&gt;after_commit&lt;/code&gt; callbacks instead:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class Project &amp;lt; ApplicationRecord
  after_create_commit -&amp;gt; { NotifyParticipantsJob.perform_later(self) }
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Or wrap job scheduling in explicit &lt;code&gt;after_commit&lt;/code&gt; blocks:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class UsersController &amp;lt; ApplicationController
  def create
    User.transaction do
      @user = User.create!(user_params)

      ActiveRecord::Base.connection.after_transaction_commit do
        WelcomeEmailJob.perform_later(@user)
      end
    end
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This worked but had problems:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Easy to forget&lt;/strong&gt;: Using &lt;code&gt;after_create&lt;/code&gt; instead of &lt;code&gt;after_create_commit&lt;/code&gt; is a common mistake&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Scattered logic&lt;/strong&gt;: Job scheduling gets coupled to model callbacks instead of staying in controllers or service objects&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Verbose&lt;/strong&gt;: Wrapping every &lt;code&gt;perform_later&lt;/code&gt; call in &lt;code&gt;after_commit&lt;/code&gt; blocks adds boilerplate&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Testing friction&lt;/strong&gt;: Transaction callbacks behave differently in test environments that wrap each test in a transaction (transactional fixtures or Database Cleaner’s transaction strategy)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/Envek/after_commit_everywhere&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;after_commit_everywhere gem on GitHub (opens in new tab)&quot;&gt;after_commit_everywhere&lt;/a&gt; gem became popular specifically to address this problem. It lets you use &lt;code&gt;after_commit&lt;/code&gt; callbacks anywhere in your application, not just in ActiveRecord models:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class UserRegistrationService
  include AfterCommitEverywhere

  def call(params)
    User.transaction do
      user = User.create!(params)

      after_commit do
        WelcomeEmailJob.perform_later(user)
      end
    end
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The gem hooks into ActiveRecord’s transaction lifecycle and ensures callbacks only fire after the outermost transaction commits. It handles nested transactions correctly and became a go-to solution for service objects that need transaction-safe job scheduling.&lt;/p&gt;

&lt;p&gt;Some teams built their own lightweight wrappers instead:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Custom AsyncRecord class that hooks into transaction callbacks
class AsyncRecord
  def initialize(&amp;amp;block)
    @callback = block
  end

  def has_transactional_callbacks?
    true
  end

  def committed!(*)
    @callback.call
  end

  def rolledback!(*)
    # Do nothing if transaction rolled back
  end
end

# Usage
User.transaction do
  user = User.create!(params)
  record = AsyncRecord.new { WelcomeEmailJob.perform_later(user) }
  user.class.connection.add_transaction_record(record)
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Both approaches worked, but required teams to remember to use them consistently.&lt;/p&gt;
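
&lt;p&gt;The mechanism both approaches rely on can be sketched in a few lines of plain Ruby (the class below is illustrative, not part of any Rails API): callbacks registered during a transaction are held back, then run on commit or discarded on rollback.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Illustrative stand-in for a transaction that defers callbacks
class FakeTransaction
  def initialize
    @after_commit = []
  end

  def after_commit(&amp;amp;block)
    @after_commit &amp;lt;&amp;lt; block
  end

  def commit
    @after_commit.each(&amp;amp;:call)
  end

  def rollback
    @after_commit.clear  # deferred work is simply dropped
  end
end

enqueued = []
tx = FakeTransaction.new
tx.after_commit { enqueued &amp;lt;&amp;lt; :welcome_email }
enqueued.empty?  # =&amp;gt; true, nothing has run yet
tx.commit
enqueued         # =&amp;gt; [:welcome_email]
&lt;/code&gt;&lt;/pre&gt;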

&lt;h2 id=&quot;rails-72&quot;&gt;Rails 7.2&lt;/h2&gt;

&lt;p&gt;Rails 7.2 makes Active Job transaction-aware. Jobs are automatically deferred until the transaction commits, and dropped if it rolls back.&lt;/p&gt;

&lt;p&gt;Enable it globally in your application:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# config/application.rb
config.active_job.enqueue_after_transaction_commit = :default
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now the original code just works:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class UsersController &amp;lt; ApplicationController
  def create
    User.transaction do
      @user = User.create!(user_params)
      WelcomeEmailJob.perform_later(@user)  # Deferred until commit
    end
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The job only gets enqueued after the transaction successfully commits. If the transaction rolls back, the job is never enqueued.&lt;/p&gt;
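
&lt;p&gt;Rollbacks are handled the same way:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;User.transaction do
  @user = User.create!(user_params)
  WelcomeEmailJob.perform_later(@user)
  raise ActiveRecord::Rollback  # transaction rolls back, job is never enqueued
end
&lt;/code&gt;&lt;/pre&gt;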

&lt;h3 id=&quot;configuration-options&quot;&gt;Configuration Options&lt;/h3&gt;

&lt;p&gt;You can control this behavior at three levels:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Global configuration:&lt;/strong&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# config/application.rb
config.active_job.enqueue_after_transaction_commit = :default
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;Per-job configuration:&lt;/strong&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class WelcomeEmailJob &amp;lt; ApplicationJob
  self.enqueue_after_transaction_commit = :always
end

class AuditLogJob &amp;lt; ApplicationJob
  self.enqueue_after_transaction_commit = :never  # Queue immediately
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The available values are:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code&gt;:default&lt;/code&gt; - Let the queue adapter decide the behavior&lt;/li&gt;
  &lt;li&gt;&lt;code&gt;:always&lt;/code&gt; - Always defer until transaction commits&lt;/li&gt;
  &lt;li&gt;&lt;code&gt;:never&lt;/code&gt; - Queue immediately (pre-7.2 behavior)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;checking-enqueue-status&quot;&gt;Checking Enqueue Status&lt;/h3&gt;

&lt;p&gt;Since &lt;code&gt;perform_later&lt;/code&gt; returns immediately even when the job is deferred, you can check if it was actually enqueued:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;User.transaction do
  user = User.create!(user_params)
  job = WelcomeEmailJob.perform_later(user)

  # job.successfully_enqueued? returns false here (still deferred)
end

# After transaction commits, job.successfully_enqueued? returns true
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;model-callbacks-simplified&quot;&gt;Model Callbacks Simplified&lt;/h3&gt;

&lt;p&gt;You can now safely use &lt;code&gt;after_create&lt;/code&gt; for job scheduling without worrying about transaction timing:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class Project &amp;lt; ApplicationRecord
  # This is now safe with enqueue_after_transaction_commit enabled
  after_create -&amp;gt; { NotifyParticipantsJob.perform_later(self) }
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The job automatically waits for any enclosing transaction to complete.&lt;/p&gt;

&lt;h2 id=&quot;when-to-disable&quot;&gt;When to Disable&lt;/h2&gt;

&lt;p&gt;Some scenarios require immediate enqueueing:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Database-backed queues&lt;/strong&gt;: If you use Solid Queue, GoodJob, or Delayed Job with the same database, jobs are part of the same transaction and this deferral is unnecessary&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Fire-and-forget jobs&lt;/strong&gt;: Jobs that do not depend on the transaction data can run immediately&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Time-sensitive operations&lt;/strong&gt;: If you need the job queued at a specific moment regardless of transaction state&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class TimeStampedJob &amp;lt; ApplicationJob
  self.enqueue_after_transaction_commit = :never

  def perform
    # This job needs to capture the exact enqueue time
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;update-rails-82-makes-this-the-default&quot;&gt;Update: Rails 8.2 Makes This the Default&lt;/h2&gt;

&lt;p&gt;Rails 8.2 makes &lt;code&gt;enqueue_after_transaction_commit&lt;/code&gt; the default behavior. Jobs are now automatically deferred until after the transaction commits without requiring explicit configuration.&lt;/p&gt;

&lt;p&gt;See &lt;a href=&quot;/rails-82-enqueue-after-transaction-commit-default&quot; aria-label=&quot;Blog post about Rails 8.2 enqueue_after_transaction_commit default&quot;&gt;Rails 8.2 makes enqueue_after_transaction_commit the default&lt;/a&gt; for details on the change, opting out, and the deprecation history.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;enqueue_after_transaction_commit&lt;/code&gt; eliminates a common source of race conditions in Rails applications. Instead of remembering to use &lt;code&gt;after_commit&lt;/code&gt; callbacks or building custom workarounds, jobs are automatically deferred until transactions complete.&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/rails/rails/pull/51426&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Rails PR introducing enqueue_after_transaction_commit (opens in new tab)&quot;&gt;Pull Request #51426&lt;/a&gt; introducing the feature&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/rails/rails/pull/55788&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Rails PR making enqueue_after_transaction_commit the default in Rails 8.2 (opens in new tab)&quot;&gt;Pull Request #55788&lt;/a&gt; making this the default in Rails 8.2&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/rails/rails/issues/26045&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;DHH&apos;s original issue about job scheduling in transactions (opens in new tab)&quot;&gt;Original Issue #26045&lt;/a&gt; by DHH describing the problem&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://guides.rubyonrails.org/active_job_basics.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Rails Active Job documentation (opens in new tab)&quot;&gt;Active Job Basics Guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Tue, 30 Dec 2025 00:00:00 +0000</pubDate>
        <link>https://prateekcodes.com/rails-72-enqueue-after-transaction-commit/</link>
        <guid isPermaLink="true">https://prateekcodes.com/rails-72-enqueue-after-transaction-commit/</guid>
        
        <category>rails-7-2</category>
        
        <category>active-job</category>
        
        <category>transactions</category>
        
        <category>background-jobs</category>
        
        <category>sidekiq</category>
        
        
        <category>Rails</category>
        
        <category>Rails 7.2</category>
        
        <category>Active Job</category>
        
      </item>
    
      <item>
        <title>Rails 8.2 introduces Rails.app.creds for unified credential management</title>
        <description>&lt;p&gt;Applications often store secrets in both environment variables and encrypted credential files. Migrating between these storage methods or using both simultaneously has traditionally required code changes. Rails 8.2 solves this with &lt;code&gt;Rails.app.creds&lt;/code&gt;, a unified API that checks ENV first, then falls back to encrypted credentials.&lt;/p&gt;

&lt;h2 id=&quot;before&quot;&gt;Before&lt;/h2&gt;

&lt;p&gt;Managing credentials from multiple sources meant mixing different APIs:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class StripeService
  def initialize
    # Check ENV first, fallback to credentials
    @api_key = ENV[&quot;STRIPE_API_KEY&quot;] || Rails.application.credentials.dig(:stripe, :api_key)
    @webhook_secret = ENV.fetch(&quot;STRIPE_WEBHOOK_SECRET&quot;) {
      Rails.application.credentials.stripe&amp;amp;.webhook_secret
    }

    raise &quot;Missing Stripe API key!&quot; unless @api_key
  end
end

class DatabaseConfig
  def connection_url
    # Different syntax for each source
    ENV[&quot;DATABASE_URL&quot;] || Rails.application.credentials.database_url
  end

  def redis_url
    ENV.fetch(&quot;REDIS_URL&quot;, Rails.application.credentials.dig(:redis, :url) || &quot;redis://localhost:6379&quot;)
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This approach has several problems:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Inconsistent APIs between &lt;code&gt;ENV.fetch()&lt;/code&gt; and &lt;code&gt;credentials.dig()&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Manual fallback logic scattered throughout the codebase&lt;/li&gt;
  &lt;li&gt;Code changes required when moving secrets between storage methods&lt;/li&gt;
  &lt;li&gt;Easy to forget nil checks on nested credentials&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rails-82&quot;&gt;Rails 8.2&lt;/h2&gt;

&lt;p&gt;The new &lt;code&gt;Rails.app.creds&lt;/code&gt; provides a consistent interface:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;class StripeService
  def initialize
    @api_key = Rails.app.creds.require(:stripe_api_key)
    @webhook_secret = Rails.app.creds.require(:stripe_webhook_secret)
  end
end

class DatabaseConfig
  def connection_url
    Rails.app.creds.require(:database_url)
  end

  def redis_url
    Rails.app.creds.option(:redis_url, default: &quot;redis://localhost:6379&quot;)
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;require&lt;/code&gt; method mandates that a value exists and raises &lt;code&gt;KeyError&lt;/code&gt; if it is missing from both ENV and encrypted credentials. The &lt;code&gt;option&lt;/code&gt; method degrades gracefully, returning &lt;code&gt;nil&lt;/code&gt; or a supplied default instead.&lt;/p&gt;
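
&lt;p&gt;The difference in a console session (assuming &lt;code&gt;missing_key&lt;/code&gt; is set nowhere):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;Rails.app.creds.require(:missing_key)                      # raises KeyError
Rails.app.creds.option(:missing_key)                       # =&amp;gt; nil
Rails.app.creds.option(:missing_key, default: &quot;fallback&quot;)  # =&amp;gt; &quot;fallback&quot;
&lt;/code&gt;&lt;/pre&gt;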

&lt;h2 id=&quot;nested-keys&quot;&gt;Nested Keys&lt;/h2&gt;

&lt;p&gt;For nested credentials, pass multiple keys. Rails automatically converts them to the appropriate format for each source:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Checks ENV[&quot;AWS__ACCESS_KEY_ID&quot;] first, then credentials.dig(:aws, :access_key_id)
Rails.app.creds.require(:aws, :access_key_id)

# Multi-level nesting
# ENV[&quot;REDIS__CACHE__TTL&quot;] || credentials.dig(:redis, :cache, :ttl)
Rails.app.creds.option(:redis, :cache, :ttl, default: 3600)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The ENV lookup uses double underscores (&lt;code&gt;__&lt;/code&gt;) as separators for nested keys:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;code&gt;:database_url&lt;/code&gt; → &lt;code&gt;ENV[&quot;DATABASE_URL&quot;]&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code&gt;[:aws, :region]&lt;/code&gt; → &lt;code&gt;ENV[&quot;AWS__REGION&quot;]&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code&gt;[:redis, :cache, :ttl]&lt;/code&gt; → &lt;code&gt;ENV[&quot;REDIS__CACHE__TTL&quot;]&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
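
&lt;p&gt;The mapping is mechanical: uppercase each key and join with &lt;code&gt;__&lt;/code&gt;. A one-line helper that mirrors the convention (this is not the Rails implementation, just an illustration):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;def env_key_for(*keys)
  keys.map { |key| key.to_s.upcase }.join(&quot;__&quot;)
end

env_key_for(:database_url)         # =&amp;gt; &quot;DATABASE_URL&quot;
env_key_for(:aws, :region)         # =&amp;gt; &quot;AWS__REGION&quot;
env_key_for(:redis, :cache, :ttl)  # =&amp;gt; &quot;REDIS__CACHE__TTL&quot;
&lt;/code&gt;&lt;/pre&gt;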

&lt;h2 id=&quot;dynamic-defaults&quot;&gt;Dynamic Defaults&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;option&lt;/code&gt; method accepts callable defaults, evaluated only when needed:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;Rails.app.creds.option(:cache_ttl, default: -&amp;gt; { 1.hour })
Rails.app.creds.option(:max_connections, default: -&amp;gt; { calculate_pool_size })
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;env-only-access&quot;&gt;ENV-Only Access&lt;/h2&gt;

&lt;p&gt;Access environment variables directly using the same API via &lt;code&gt;Rails.app.envs&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Only checks ENV, no encrypted credentials fallback
Rails.app.envs.require(:port)
Rails.app.envs.option(:log_level, default: &quot;info&quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;custom-credential-sources&quot;&gt;Custom Credential Sources&lt;/h2&gt;

&lt;p&gt;Under the hood, &lt;code&gt;Rails.app.creds&lt;/code&gt; is powered by &lt;code&gt;ActiveSupport::CombinedConfiguration&lt;/code&gt;, which checks multiple credential sources (called backends) in order. By default, it checks ENV first, then encrypted credentials. You can customize this chain to include external secret managers:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# config/initializers/credentials.rb
Rails.app.creds = ActiveSupport::CombinedConfiguration.new(
  Rails.app.envs,                   # Check ENV first
  VaultConfiguration.new,           # Then HashiCorp Vault
  OnePasswordConfiguration.new,     # Then 1Password
  Rails.app.credentials             # Finally, encrypted credentials
)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each credential source needs to implement &lt;code&gt;require&lt;/code&gt; and &lt;code&gt;option&lt;/code&gt; methods matching the API.&lt;/p&gt;
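
&lt;p&gt;A minimal backend might look like the hypothetical in-memory example below; a Vault or 1Password backend would follow the same shape but fetch values from its own store.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Hypothetical backend that reads from an in-memory hash
class HashConfiguration
  def initialize(data)
    @data = data
  end

  def require(*keys)
    value = option(*keys)
    raise KeyError, &quot;missing credential: #{keys.join(&apos;.&apos;)}&quot; if value.nil?
    value
  end

  def option(*keys, default: nil)
    value = @data.dig(*keys)
    return value unless value.nil?
    default.respond_to?(:call) ? default.call : default
  end
end
&lt;/code&gt;&lt;/pre&gt;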

&lt;h2 id=&quot;railsapp-alias&quot;&gt;Rails.app Alias&lt;/h2&gt;

&lt;p&gt;This feature comes alongside a new &lt;code&gt;Rails.app&lt;/code&gt; alias for &lt;code&gt;Rails.application&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-ruby&quot;&gt;# Before
Rails.application.credentials.aws.access_key_id

# After
Rails.app.credentials.aws.access_key_id
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The shorter alias makes chained method calls more pleasant to read and write.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;Rails.app.creds&lt;/code&gt; eliminates the friction of managing credentials across multiple sources. Secrets can move between ENV and encrypted files without touching application code.&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/rails/rails/pull/56404&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Rails PR 56404 add Rails.app.creds (opens in new tab)&quot;&gt;PR #56404&lt;/a&gt; - Add Rails.app.creds for combined credentials lookup&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/rails/rails/pull/56403&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;Rails PR 56403 add Rails.app alias (opens in new tab)&quot;&gt;PR #56403&lt;/a&gt; - Add Rails.app alias for Rails.application&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Mon, 29 Dec 2025 00:00:00 +0000</pubDate>
        <link>https://prateekcodes.com/rails-8-2-combined-credentials-rails-app-creds/</link>
        <guid isPermaLink="true">https://prateekcodes.com/rails-8-2-combined-credentials-rails-app-creds/</guid>
        
        <category>rails-8</category>
        
        <category>credentials</category>
        
        <category>configuration</category>
        
        <category>environment-variables</category>
        
        <category>secrets-management</category>
        
        
        <category>Rails</category>
        
        <category>Rails 8</category>
        
        <category>Configuration</category>
        
      </item>
    
      <item>
        <title>Understanding PostgreSQL Checkpoints: From WAL to Disk</title>
        <description>&lt;p&gt;PostgreSQL relies on checkpoints to ensure data durability while maintaining performance. Understanding how checkpoints work and their relationship with Write-Ahead Logging is essential for database performance tuning and troubleshooting.&lt;/p&gt;

&lt;p&gt;This post builds on fundamental concepts covered in our PostgreSQL internals series:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/postgres-fundamentals-memory-vs-disk-part-1&quot;&gt;Part 1: Memory vs Disk Performance&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/postgres-fundamentals-database-storage-part-2&quot;&gt;Part 2: How Databases Store Data&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/postgres-fundamentals-transactions-part-3&quot;&gt;Part 3: Transactions and ACID&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/postgres-fundamentals-performance-patterns-part-4&quot;&gt;Part 4: Performance Patterns&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/postgres-fundamentals-wal-deep-dive-part-5&quot;&gt;Part 5: Write-Ahead Logging Deep Dive&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/postgres-fundamentals-monitoring-administration-part-6&quot;&gt;Part 6: Monitoring and Administration&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;write-ahead-logging-the-foundation&quot;&gt;Write-Ahead Logging: The Foundation&lt;/h2&gt;

&lt;p&gt;Checkpoints work hand-in-hand with &lt;a href=&quot;/postgres-fundamentals-wal-deep-dive-part-5&quot;&gt;Write-Ahead Logging (WAL)&lt;/a&gt;. When PostgreSQL modifies data, changes are written to WAL files first (sequential, fast) before updating data pages in memory. Modified pages (called “dirty pages”) accumulate in &lt;code&gt;shared_buffers&lt;/code&gt;, and eventually these changes need to be written to the actual data files. That’s where checkpoints come in.&lt;/p&gt;

&lt;h2 id=&quot;what-happens-during-a-checkpoint&quot;&gt;What Happens During a Checkpoint&lt;/h2&gt;

&lt;p&gt;A checkpoint is PostgreSQL’s process of writing all dirty pages from shared buffers to disk. It creates a known recovery point and ensures data durability.&lt;/p&gt;

&lt;h3 id=&quot;the-checkpoint-process&quot;&gt;The Checkpoint Process&lt;/h3&gt;

&lt;p&gt;When a checkpoint occurs:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Checkpoint starts&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;PostgreSQL marks the current WAL position as the checkpoint location&lt;/li&gt;
      &lt;li&gt;This position is the recovery starting point if a crash occurs&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Dirty pages are written&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;All modified data pages in &lt;code&gt;shared_buffers&lt;/code&gt; are flushed to disk&lt;/li&gt;
      &lt;li&gt;This happens gradually to avoid I/O spikes (controlled by &lt;code&gt;checkpoint_completion_target&lt;/code&gt;)&lt;/li&gt;
      &lt;li&gt;Pages are written in order to minimize random I/O&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Checkpoint completes&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;A checkpoint record is written to WAL&lt;/li&gt;
      &lt;li&gt;The &lt;code&gt;pg_control&lt;/code&gt; file is updated with the new checkpoint location&lt;/li&gt;
      &lt;li&gt;Old WAL files (before the checkpoint) can now be recycled or archived&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;
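
&lt;p&gt;You can inspect the most recent checkpoint location that recovery would start from:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Checkpoint record location and the REDO point recovery starts from
SELECT checkpoint_lsn, redo_lsn FROM pg_control_checkpoint();
&lt;/code&gt;&lt;/pre&gt;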

&lt;h3 id=&quot;what-triggers-a-checkpoint&quot;&gt;What Triggers a Checkpoint?&lt;/h3&gt;

&lt;p&gt;PostgreSQL creates checkpoints based on:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Time&lt;/strong&gt;: &lt;code&gt;checkpoint_timeout&lt;/code&gt; parameter (default: 5 minutes)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;WAL volume&lt;/strong&gt;: &lt;code&gt;max_wal_size&lt;/code&gt; parameter (default: 1GB)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Manual trigger&lt;/strong&gt;: &lt;code&gt;CHECKPOINT&lt;/code&gt; command&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Shutdown&lt;/strong&gt;: Always creates a checkpoint during clean shutdown&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Force an immediate checkpoint
CHECKPOINT;
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;checkpoint-impact-on-performance&quot;&gt;Checkpoint Impact on Performance&lt;/h3&gt;

&lt;p&gt;Checkpoints involve heavy I/O (writing potentially gigabytes of dirty pages), which can cause temporary performance degradation. Understanding &lt;a href=&quot;/postgres-fundamentals-performance-patterns-part-4&quot;&gt;performance trade-offs&lt;/a&gt; helps you balance durability with speed. PostgreSQL spreads checkpoint writes over time using &lt;code&gt;checkpoint_completion_target&lt;/code&gt; (default: 0.9) to minimize I/O spikes.&lt;/p&gt;

&lt;h2 id=&quot;monitoring-checkpoint-activity&quot;&gt;Monitoring Checkpoint Activity&lt;/h2&gt;

&lt;p&gt;For detailed monitoring techniques, see &lt;a href=&quot;/postgres-fundamentals-monitoring-administration-part-6&quot;&gt;Part 6: Monitoring and Administration&lt;/a&gt;. Key metrics to track:&lt;/p&gt;

&lt;h3 id=&quot;check-checkpoint-statistics&quot;&gt;Check Checkpoint Statistics&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- PostgreSQL 17+
SELECT * FROM pg_stat_checkpointer;

-- PostgreSQL 16 and earlier
SELECT * FROM pg_stat_bgwriter;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What to look for:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;High &lt;code&gt;checkpoints_req&lt;/code&gt; (or &lt;code&gt;num_requested&lt;/code&gt; in v17+) means checkpoints are happening too frequently&lt;/li&gt;
  &lt;li&gt;Large &lt;code&gt;checkpoint_write_time&lt;/code&gt; (or &lt;code&gt;write_time&lt;/code&gt; in v17+) indicates heavy I/O load&lt;/li&gt;
&lt;/ul&gt;
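
&lt;p&gt;A quick ratio makes this easier to judge (column names here assume PostgreSQL 17+):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Share of checkpoints triggered by WAL volume rather than the timer
SELECT num_timed,
       num_requested,
       round(100.0 * num_requested / NULLIF(num_timed + num_requested, 0), 1)
         AS requested_pct
FROM pg_stat_checkpointer;
&lt;/code&gt;&lt;/pre&gt;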

&lt;h3 id=&quot;monitor-wal-generation-rate&quot;&gt;Monitor WAL Generation Rate&lt;/h3&gt;

&lt;p&gt;High WAL generation can trigger frequent checkpoints. See &lt;a href=&quot;/postgres-fundamentals-monitoring-administration-part-6#pg_stat_wal-wal-activity&quot;&gt;Part 6&lt;/a&gt; for detailed WAL monitoring queries and interpretation.&lt;/p&gt;

&lt;h2 id=&quot;tuning-checkpoint-behavior&quot;&gt;Tuning Checkpoint Behavior&lt;/h2&gt;

&lt;p&gt;Key parameters to adjust (see &lt;a href=&quot;/postgres-fundamentals-performance-patterns-part-4#checkpoint-trade-off-alert&quot;&gt;Part 4: Performance Patterns&lt;/a&gt; for trade-offs):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-conf&quot;&gt;checkpoint_timeout = 15min          # Default: 5min
max_wal_size = 4GB                  # Default: 1GB
checkpoint_completion_target = 0.9  # Default: 0.9
log_checkpoints = on
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Guidelines:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Increase &lt;code&gt;max_wal_size&lt;/code&gt; if &lt;code&gt;checkpoints_req&lt;/code&gt; is high&lt;/li&gt;
  &lt;li&gt;Increase &lt;code&gt;checkpoint_timeout&lt;/code&gt; for write-heavy workloads&lt;/li&gt;
  &lt;li&gt;Keep &lt;code&gt;checkpoint_completion_target&lt;/code&gt; at 0.9 to avoid I/O spikes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For broader PostgreSQL performance optimization, see &lt;a href=&quot;/postgresql-query-optimization-guide&quot;&gt;PostgreSQL query optimization guide&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Checkpoints are PostgreSQL’s mechanism for persisting in-memory changes to disk, creating recovery points, and managing WAL files. They balance performance with durability by batching writes and spreading I/O over time. Watch for frequent requested checkpoints and long write times as signals for tuning opportunities.&lt;/p&gt;

&lt;p&gt;For deeper understanding, explore the &lt;a href=&quot;/postgres-fundamentals-memory-vs-disk-part-1&quot;&gt;PostgreSQL internals series&lt;/a&gt;, or dive into &lt;a href=&quot;/postgresql-explain-analyze-deep-dive&quot;&gt;PostgreSQL EXPLAIN ANALYZE&lt;/a&gt; for query optimization.&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/wal-configuration.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL WAL Configuration documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: WAL Configuration&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/wal-intro.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL WAL introduction documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: Reliability and the Write-Ahead Log&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-BGWRITER-VIEW&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL pg_stat_bgwriter documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: pg_stat_bgwriter&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Fri, 17 Oct 2025 00:00:00 +0000</pubDate>
        <link>https://prateekcodes.com/understanding-postgres-checkpoints/</link>
        <guid isPermaLink="true">https://prateekcodes.com/understanding-postgres-checkpoints/</guid>
        
        <category>postgres</category>
        
        <category>wal</category>
        
        <category>checkpoints</category>
        
        <category>performance</category>
        
        <category>database-internals</category>
        
        
        <category>PostgreSQL</category>
        
        <category>Database</category>
        
      </item>
    
      <item>
        <title>PostgreSQL Fundamentals: Monitoring and Administration Tools (Part 6)</title>
        <description>&lt;p&gt;In &lt;a href=&quot;/postgres-fundamentals-wal-deep-dive-part-5&quot;&gt;Part 5&lt;/a&gt;, we learned how Write-Ahead Logging works internally. Now let’s explore the tools PostgreSQL provides for monitoring and administering these systems.&lt;/p&gt;

&lt;p&gt;This is Part 6 (final part) of a series on PostgreSQL internals.&lt;/p&gt;

&lt;h2 id=&quot;system-views-your-window-into-postgresql&quot;&gt;System Views: Your Window Into PostgreSQL&lt;/h2&gt;

&lt;p&gt;PostgreSQL exposes extensive information through system views. These views are your primary tool for understanding what’s happening inside the database.&lt;/p&gt;

&lt;h3 id=&quot;pg_stat_checkpointer-checkpoint-statistics-postgresql-17&quot;&gt;pg_stat_checkpointer: Checkpoint Statistics (PostgreSQL 17+)&lt;/h3&gt;

&lt;p&gt;The most important view for checkpoint monitoring:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT * FROM pg_stat_checkpointer;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Key columns:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT
    num_timed,              -- Scheduled checkpoints (time-based)
    num_requested,          -- Requested checkpoints (WAL-based or manual)
    write_time,             -- Milliseconds spent writing files
    sync_time,              -- Milliseconds spent syncing files
    buffers_written,        -- Buffers written during checkpoints
    stats_reset            -- When stats were last reset
FROM pg_stat_checkpointer;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Before PostgreSQL 17, checkpoint statistics were in &lt;code&gt;pg_stat_bgwriter&lt;/code&gt; with column names &lt;code&gt;checkpoints_timed&lt;/code&gt;, &lt;code&gt;checkpoints_req&lt;/code&gt;, &lt;code&gt;checkpoint_write_time&lt;/code&gt;, &lt;code&gt;checkpoint_sync_time&lt;/code&gt;, and &lt;code&gt;buffers_checkpoint&lt;/code&gt;.&lt;/p&gt;

&lt;h3 id=&quot;interpreting-pg_stat_checkpointer&quot;&gt;Interpreting pg_stat_checkpointer&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Example output:
num_timed:       1250    -- Mostly time-based (good)
num_requested:   45      -- Few requested (good)
write_time:      450000  -- 450 seconds total writing
sync_time:       2500    -- 2.5 seconds total syncing
buffers_written: 500000  -- 500k buffers written at checkpoints
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What to look for:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;High num_requested&lt;/strong&gt;: Checkpoints happening too frequently
    &lt;ul&gt;
      &lt;li&gt;Solution: Increase &lt;code&gt;max_wal_size&lt;/code&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;High write_time&lt;/strong&gt;: Checkpoint I/O is slow
    &lt;ul&gt;
      &lt;li&gt;Solution: Increase &lt;code&gt;checkpoint_completion_target&lt;/code&gt;&lt;/li&gt;
      &lt;li&gt;Or: Improve disk I/O performance&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;High buffers_written relative to checkpoint frequency&lt;/strong&gt;: Large checkpoints
    &lt;ul&gt;
      &lt;li&gt;Solution: More frequent checkpoints or increase &lt;code&gt;shared_buffers&lt;/code&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;pg_stat_wal-wal-activity&quot;&gt;pg_stat_wal: WAL Activity&lt;/h3&gt;

&lt;p&gt;Monitor WAL generation and flush activity:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT
    wal_records,        -- Total WAL records generated
    wal_fpi,           -- Full page images written
    wal_bytes,         -- Total bytes written to WAL
    wal_buffers_full,  -- Times WAL buffer was full
    wal_write,         -- Number of WAL writes
    wal_sync,          -- Number of WAL syncs (fsync)
    wal_write_time,    -- Time spent writing WAL (ms)
    wal_sync_time,     -- Time spent syncing WAL (ms)
    stats_reset
FROM pg_stat_wal;
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;calculating-wal-generation-rate&quot;&gt;Calculating WAL Generation Rate&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Record current WAL stats
CREATE TEMP TABLE wal_baseline AS
SELECT
    now() AS measured_at,
    pg_current_wal_lsn() AS wal_lsn,
    wal_bytes
FROM pg_stat_wal;

-- Wait 60 seconds...
SELECT pg_sleep(60);

-- Calculate rate
SELECT
    pg_size_pretty(
        w.wal_bytes - b.wal_bytes
    ) AS wal_generated,
    EXTRACT(EPOCH FROM (now() - b.measured_at)) AS seconds,
    pg_size_pretty(
        (w.wal_bytes - b.wal_bytes) /
        EXTRACT(EPOCH FROM (now() - b.measured_at))
    ) || &apos;/s&apos; AS wal_rate
FROM pg_stat_wal w, wal_baseline b;

-- Example result:
-- wal_generated: 25 MB
-- seconds: 60
-- wal_rate: 427 kB/s
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;pg_stat_database-database-level-stats&quot;&gt;pg_stat_database: Database-Level Stats&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT
    datname,
    xact_commit,           -- Transactions committed
    xact_rollback,         -- Transactions rolled back
    blks_read,            -- Disk blocks read
    blks_hit,             -- Disk blocks found in cache
    tup_inserted,         -- Rows inserted
    tup_updated,          -- Rows updated
    tup_deleted,          -- Rows deleted
    temp_files,           -- Temp files created
    temp_bytes            -- Temp file bytes
FROM pg_stat_database
WHERE datname = current_database();
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Calculate cache hit ratio:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT
    datname,
    round(
        100.0 * blks_hit / nullif(blks_hit + blks_read, 0),
        2
    ) AS cache_hit_ratio
FROM pg_stat_database
WHERE datname = current_database();

-- Healthy databases: 95%+
-- Low ratio: Need more shared_buffers or working set too large
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;pg_stat_statements-query-level-wal-generation&quot;&gt;pg_stat_statements: Query-Level WAL Generation&lt;/h3&gt;

&lt;p&gt;Track which queries generate the most WAL:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Enable extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top WAL generators (PostgreSQL 13+)
SELECT
    substring(query, 1, 60) AS query_preview,
    calls,
    pg_size_pretty(wal_bytes) AS wal_generated,
    pg_size_pretty(wal_bytes / calls) AS wal_per_call,
    round(100.0 * wal_bytes / sum(wal_bytes) OVER (), 2) AS wal_percent
FROM pg_stat_statements
ORDER BY wal_bytes DESC
LIMIT 10;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Once you identify high WAL-generating queries, you can optimize write operations: batch INSERT/UPDATE operations, use &lt;code&gt;COPY&lt;/code&gt; for bulk loads, and consider whether all indexes are necessary.&lt;/p&gt;
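
&lt;p&gt;As a rough sketch of what batching looks like in practice (the &lt;code&gt;events&lt;/code&gt; table here is hypothetical):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- One statement and one commit record per row: WAL-heavy
INSERT INTO events (payload) VALUES (&apos;a&apos;);
INSERT INTO events (payload) VALUES (&apos;b&apos;);

-- Multi-row insert: one statement, one commit, less WAL per row
INSERT INTO events (payload) VALUES (&apos;a&apos;), (&apos;b&apos;), (&apos;c&apos;);

-- COPY for bulk loads: the most WAL-efficient option
COPY events (payload) FROM &apos;/tmp/events.csv&apos; WITH (FORMAT csv);
&lt;/code&gt;&lt;/pre&gt;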

&lt;h3 id=&quot;pg_stat_activity-live-connections&quot;&gt;pg_stat_activity: Live Connections&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT
    pid,
    usename,
    application_name,
    state,
    query_start,
    state_change,
    wait_event_type,
    wait_event,
    substring(query, 1, 50) AS query_preview
FROM pg_stat_activity
WHERE state != &apos;idle&apos;
ORDER BY query_start;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Find long-running queries:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT
    pid,
    now() - query_start AS duration,
    query
FROM pg_stat_activity
WHERE state = &apos;active&apos;
  AND now() - query_start &amp;gt; interval &apos;5 minutes&apos;
ORDER BY duration DESC;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;For long-running queries, you can:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Analyze execution plans&lt;/strong&gt;: Use &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt; to understand why queries are slow. See &lt;a href=&quot;/postgresql-explain-analyze-deep-dive&quot;&gt;PostgreSQL EXPLAIN ANALYZE Deep Dive&lt;/a&gt; for detailed analysis techniques.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Offload reads to replicas&lt;/strong&gt;: Move long-running SELECT queries to read replicas to reduce contention on the primary database. See &lt;a href=&quot;/rails-read-replicas-part-1-understanding-the-basics&quot;&gt;Rails Read Replicas Part 1&lt;/a&gt; for implementation patterns.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;
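
&lt;p&gt;If a runaway query has to be stopped, PostgreSQL&amp;#8217;s standard signal functions can cancel it or, as a last resort, terminate the backend (&lt;code&gt;12345&lt;/code&gt; is a placeholder for a &lt;code&gt;pid&lt;/code&gt; from the query above):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Cancel the current query but keep the connection alive
SELECT pg_cancel_backend(12345);

-- Terminate the whole backend (last resort)
SELECT pg_terminate_backend(12345);
&lt;/code&gt;&lt;/pre&gt;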

&lt;h2 id=&quot;postgresql-logs-checkpoint-and-wal-messages&quot;&gt;PostgreSQL Logs: Checkpoint and WAL Messages&lt;/h2&gt;

&lt;p&gt;Enable detailed logging in &lt;code&gt;postgresql.conf&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-conf&quot;&gt;# Log checkpoints
log_checkpoints = on

# Log long-running statements
log_min_duration_statement = 1000  # Log queries &amp;gt; 1 second

# Log connections and disconnections
log_connections = on
log_disconnections = on

# Set log destination
logging_collector = on
log_directory = &apos;log&apos;
log_filename = &apos;postgresql-%Y-%m-%d_%H%M%S.log&apos;
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;reading-checkpoint-logs&quot;&gt;Reading Checkpoint Logs&lt;/h3&gt;

&lt;p&gt;With &lt;code&gt;log_checkpoints = on&lt;/code&gt;, you’ll see:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;2025-10-17 10:15:42.123 UTC [12345] LOG: checkpoint starting: time
2025-10-17 10:16:11.456 UTC [12345] LOG: checkpoint complete: wrote 2435 buffers (14.9%); 0 WAL file(s) added, 0 removed, 3 recycled; write=29.725 s, sync=0.004 s, total=29.780 s; sync files=7, longest=0.003 s, average=0.001 s; distance=49142 kB, estimate=49142 kB
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Breakdown:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;wrote 2435 buffers (14.9%)    # Dirty pages written (14.9% of shared_buffers)
0 WAL file(s) added           # New WAL segments created
0 removed                     # Old WAL segments deleted
3 recycled                    # WAL segments renamed for reuse
write=29.725 s                # Time spent writing buffers
sync=0.004 s                  # Time spent fsync&apos;ing
total=29.780 s                # Total checkpoint duration
distance=49142 kB             # WAL generated since last checkpoint
estimate=49142 kB             # Estimated WAL to next checkpoint
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What to watch:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;High write time&lt;/strong&gt;: Checkpoint taking too long
    &lt;ul&gt;
      &lt;li&gt;Check disk I/O performance&lt;/li&gt;
      &lt;li&gt;Consider spreading checkpoint over more time&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Frequent checkpoints&lt;/strong&gt;: If you see many “checkpoint starting: xlog” instead of “checkpoint starting: time”
    &lt;ul&gt;
      &lt;li&gt;Increase &lt;code&gt;max_wal_size&lt;/code&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Large distance&lt;/strong&gt;: Generating lots of WAL
    &lt;ul&gt;
      &lt;li&gt;Normal for write-heavy workloads&lt;/li&gt;
      &lt;li&gt;Ensure &lt;code&gt;max_wal_size&lt;/code&gt; is adequate&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;pg_waldump-inspecting-wal-records&quot;&gt;pg_waldump: Inspecting WAL Records&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;pg_waldump&lt;/code&gt; lets you read WAL files directly:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Find WAL files
ls $PGDATA/pg_wal/

# If this doesn&apos;t work, replace PGDATA with output from SHOW data_directory;

# Dump a WAL segment
pg_waldump $PGDATA/pg_wal/000000010000000000000001

# Output shows each WAL record:
rmgr: Heap        len: 54   rec: INSERT off 1 flags 0x00
rmgr: Btree       len: 72   rec: INSERT_LEAF off 5
rmgr: Transaction len: 34   rec: COMMIT 2025-10-17 10:15:42.123456 UTC
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;filtering-wal-records&quot;&gt;Filtering WAL Records&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Show only specific resource manager (Heap = table data)
pg_waldump -r Heap $PGDATA/pg_wal/000000010000000000000001

# Show records from specific LSN range
pg_waldump -s 0/1500000 -e 0/1600000 $PGDATA/pg_wal/000000010000000000000001

# Show statistics summary
pg_waldump --stats $PGDATA/pg_wal/000000010000000000000001
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;understanding-wal-record-output&quot;&gt;Understanding WAL Record Output&lt;/h3&gt;

&lt;pre&gt;&lt;code&gt;rmgr: Heap        len: 54   rec: INSERT off 1
    lsn: 0/01500028, prev: 0/01500000, desc: INSERT off 1 flags 0x00
    blkref #0: rel 1663/16384/16385 blk 0
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Breaking this down:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;rmgr: Heap              # Resource manager (table data)
len: 54                 # Record length in bytes
rec: INSERT             # Operation type
lsn: 0/01500028        # Log Sequence Number
prev: 0/01500000       # Previous record LSN
blkref #0:             # Block reference
  rel 1663/16384/16385 # Relation OID (tablespace/database/relation)
  blk 0                # Block number
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;pg_control-database-cluster-state&quot;&gt;pg_control: Database Cluster State&lt;/h2&gt;

&lt;p&gt;View the control file with the &lt;code&gt;pg_controldata&lt;/code&gt; command-line tool:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;pg_controldata $PGDATA
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Key information:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;pg_control version number:            1300
Catalog version number:               202107181
Database system identifier:           7012345678901234567
Database cluster state:               in production
pg_control last modified:             Thu 17 Oct 2025 10:15:42 AM UTC
Latest checkpoint location:           0/1500000
Latest checkpoint&apos;s REDO location:    0/1480000
Latest checkpoint&apos;s TimeLineID:       1
Latest checkpoint&apos;s full_page_writes: on
Latest checkpoint&apos;s NextXID:          0:1000
Latest checkpoint&apos;s NextOID:          24576
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This shows the last checkpoint LSN, which is crucial for crash recovery.&lt;/p&gt;
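
&lt;p&gt;The same checkpoint information is also exposed through SQL via &lt;code&gt;pg_control_checkpoint()&lt;/code&gt;, so you can query it without shell access to the server:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT checkpoint_lsn, redo_lsn, timeline_id
FROM pg_control_checkpoint();

-- Example output:
-- checkpoint_lsn | redo_lsn  | timeline_id
-- 0/1500000      | 0/1480000 | 1
&lt;/code&gt;&lt;/pre&gt;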

&lt;h2 id=&quot;monitoring-checkpoint-health&quot;&gt;Monitoring Checkpoint Health&lt;/h2&gt;

&lt;p&gt;Create a monitoring query (PostgreSQL 17+):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;CREATE OR REPLACE VIEW checkpoint_health AS
SELECT
    num_timed,
    num_requested,
    round(100.0 * num_requested /
          nullif(num_timed + num_requested, 0), 2
    ) AS req_checkpoint_pct,
    pg_size_pretty(
        buffers_written * 8192::bigint
    ) AS checkpoint_write_size,
    round(
        write_time::numeric /
        nullif(num_timed + num_requested, 0),
        2
    ) AS avg_checkpoint_write_ms,
    round(
        sync_time::numeric /
        nullif(num_timed + num_requested, 0),
        2
    ) AS avg_checkpoint_sync_ms
FROM pg_stat_checkpointer;

-- Check health
SELECT * FROM checkpoint_health;

-- Example output:
-- num_timed: 1200
-- num_requested: 50
-- req_checkpoint_pct: 4.00           ← Good (&amp;lt; 10%)
-- checkpoint_write_size: 4000 MB
-- avg_checkpoint_write_ms: 375.21
-- avg_checkpoint_sync_ms: 2.08
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;For PostgreSQL 16 and earlier&lt;/strong&gt;, use &lt;code&gt;pg_stat_bgwriter&lt;/code&gt; with column names &lt;code&gt;checkpoints_timed&lt;/code&gt;, &lt;code&gt;checkpoints_req&lt;/code&gt;, &lt;code&gt;checkpoint_write_time&lt;/code&gt;, &lt;code&gt;checkpoint_sync_time&lt;/code&gt;, and &lt;code&gt;buffers_checkpoint&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Healthy checkpoint system:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;code&gt;req_checkpoint_pct&lt;/code&gt; &amp;lt; 10%: Most checkpoints are scheduled&lt;/li&gt;
  &lt;li&gt;Reasonable write times: Not overwhelming the I/O system&lt;/li&gt;
  &lt;li&gt;Consistent checkpoint sizes&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;resetting-statistics&quot;&gt;Resetting Statistics&lt;/h2&gt;

&lt;p&gt;Statistics accumulate since the last reset:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Reset all statistics
SELECT pg_stat_reset();

-- Reset bgwriter stats
SELECT pg_stat_reset_shared(&apos;bgwriter&apos;);

-- Reset WAL stats
SELECT pg_stat_reset_shared(&apos;wal&apos;);

-- Check when stats were last reset
SELECT stats_reset FROM pg_stat_bgwriter;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Reset stats to measure recent behavior or after configuration changes.&lt;/p&gt;

&lt;h2 id=&quot;putting-it-all-together&quot;&gt;Putting It All Together&lt;/h2&gt;

&lt;p&gt;A complete checkpoint monitoring query (this one uses the pre-17 &lt;code&gt;pg_stat_bgwriter&lt;/code&gt; column names; on PostgreSQL 17+, substitute the &lt;code&gt;pg_stat_checkpointer&lt;/code&gt; columns shown earlier):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;WITH wal_rate AS (
    SELECT
        pg_size_pretty(wal_bytes) AS total_wal,
        wal_records AS total_records,
        wal_fpi AS full_page_images
    FROM pg_stat_wal
),
checkpoint_stats AS (
    SELECT
        checkpoints_timed + checkpoints_req AS total_checkpoints,
        checkpoints_req,
        round(100.0 * checkpoints_req /
              nullif(checkpoints_timed + checkpoints_req, 0), 2
        ) AS req_pct,
        pg_size_pretty(buffers_checkpoint * 8192::bigint) AS data_written,
        round(checkpoint_write_time::numeric /
              nullif(checkpoints_timed + checkpoints_req, 0), 2
        ) AS avg_write_ms
    FROM pg_stat_bgwriter
)
SELECT
    c.total_checkpoints,
    c.checkpoints_req,
    c.req_pct || &apos;%&apos; AS req_checkpoint_pct,
    w.total_wal,
    w.total_records,
    w.full_page_images,
    c.data_written AS checkpoint_data_written,
    c.avg_write_ms || &apos; ms&apos; AS avg_checkpoint_write_time
FROM checkpoint_stats c, wal_rate w;
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;You now have the foundational knowledge of PostgreSQL internals:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Memory vs disk performance (&lt;a href=&quot;/postgres-fundamentals-memory-vs-disk-part-1&quot;&gt;Part 1&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;How data is stored in pages (&lt;a href=&quot;/postgres-fundamentals-database-storage-part-2&quot;&gt;Part 2&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Transactions and ACID (&lt;a href=&quot;/postgres-fundamentals-transactions-part-3&quot;&gt;Part 3&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Performance trade-offs (&lt;a href=&quot;/postgres-fundamentals-performance-patterns-part-4&quot;&gt;Part 4&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Write-Ahead Logging (&lt;a href=&quot;/postgres-fundamentals-wal-deep-dive-part-5&quot;&gt;Part 5&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Monitoring tools (&lt;a href=&quot;/postgres-fundamentals-monitoring-administration-part-6&quot;&gt;Part 6&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think I missed out on a key topic? Please reach out to me.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Previous&lt;/strong&gt;: &lt;a href=&quot;/postgres-fundamentals-wal-deep-dive-part-5&quot;&gt;Part 5 - Write-Ahead Logging Deep Dive&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Next&lt;/strong&gt;: &lt;a href=&quot;/understanding-postgres-checkpoints&quot;&gt;Understanding PostgreSQL Checkpoints&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/monitoring-stats.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL monitoring statistics documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: Monitoring Stats&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/pgwaldump.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL pg_waldump documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: pg_waldump&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/runtime-config-logging.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL logging configuration documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: Server Log&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Thu, 16 Oct 2025 00:00:00 +0000</pubDate>
        <link>https://prateekcodes.com/postgres-fundamentals-monitoring-administration-part-6/</link>
        <guid isPermaLink="true">https://prateekcodes.com/postgres-fundamentals-monitoring-administration-part-6/</guid>
        
        <category>postgres</category>
        
        <category>database-fundamentals</category>
        
        <category>monitoring</category>
        
        <category>administration</category>
        
        <category>pg-waldump</category>
        
        <category>system-views</category>
        
        
        <category>PostgreSQL</category>
        
        <category>Database</category>
        
      </item>
    
      <item>
        <title>PostgreSQL Fundamentals: Write-Ahead Logging Deep Dive (Part 5)</title>
        <description>&lt;p&gt;In &lt;a href=&quot;/postgres-fundamentals-performance-patterns-part-4&quot;&gt;Part 4&lt;/a&gt;, we learned about database performance trade-offs and why sequential I/O is preferred. Now let’s dive deep into Write-Ahead Logging (WAL), which is PostgreSQL’s solution to the durability problem.&lt;/p&gt;

&lt;p&gt;This is Part 5 of a series on PostgreSQL internals.&lt;/p&gt;

&lt;h2 id=&quot;what-problem-does-wal-solve&quot;&gt;What Problem Does WAL Solve?&lt;/h2&gt;

&lt;p&gt;Remember the durability dilemma from &lt;a href=&quot;/postgres-fundamentals-transactions-part-3&quot;&gt;Part 3&lt;/a&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;UPDATE users SET balance = 500 WHERE id = 1;
COMMIT;  -- Must survive a crash from this point forward
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The naive approach would be:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;Modify the data page in memory&lt;/li&gt;
  &lt;li&gt;Write the page to disk (random I/O, slow)&lt;/li&gt;
  &lt;li&gt;Acknowledge COMMIT&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;But this has problems:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Slow&lt;/strong&gt;: Random writes to data files (10-100ms)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Fragile&lt;/strong&gt;: Partial page writes during crash&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;No batching&lt;/strong&gt;: Can’t combine multiple updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;WAL solves all three problems.&lt;/p&gt;

&lt;h2 id=&quot;how-wal-works-the-big-picture&quot;&gt;How WAL Works: The Big Picture&lt;/h2&gt;

&lt;p&gt;Instead of writing data pages immediately, PostgreSQL writes changes to a sequential log first:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;1. Transaction modifies data
   ↓
2. Write change to WAL (sequential, fast)
   ↓
3. Acknowledge COMMIT (user sees success)
   ↓
4. Later: Apply changes to data files (background)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If PostgreSQL crashes before step 4, the WAL contains everything needed to reconstruct the changes.&lt;/p&gt;

&lt;h3 id=&quot;the-write-ahead-principle&quot;&gt;The Write-Ahead Principle&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Write-Ahead&lt;/strong&gt;: Changes must be logged in WAL before the data page is written to disk.&lt;/p&gt;

&lt;p&gt;This ensures crash recovery always works.&lt;/p&gt;

&lt;h2 id=&quot;wal-file-structure&quot;&gt;WAL File Structure&lt;/h2&gt;

&lt;p&gt;WAL is stored in 16 MB segment files:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Find your data directory first
psql -c &quot;SHOW data_directory;&quot;

# WAL directory (use the path from above)
ls -lh /path/to/data/pg_wal/

# Example output:
-rw------- 1 postgres postgres 16M Oct 17 10:00 000000010000000000000001
-rw------- 1 postgres postgres 16M Oct 17 10:05 000000010000000000000002
-rw------- 1 postgres postgres 16M Oct 17 10:10 000000010000000000000003
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each file is exactly 16 MB and contains many WAL records.&lt;/p&gt;
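
&lt;p&gt;The 16 MB size is only the default; it can be changed at cluster creation time (&lt;code&gt;initdb --wal-segsize&lt;/code&gt;). You can confirm what your cluster uses:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SHOW wal_segment_size;
-- Result: 16MB
&lt;/code&gt;&lt;/pre&gt;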

&lt;h3 id=&quot;wal-segment-naming&quot;&gt;WAL Segment Naming&lt;/h3&gt;

&lt;pre&gt;&lt;code&gt;000000010000000000000001
│      ││              │
│      ││              └─ Segment number (hex)
│      │└──────────────── High 32 bits of LSN
│      └───────────────── Timeline ID
└──────────────────────── Always starts with 00000001
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As the database runs, PostgreSQL creates new segments sequentially.&lt;/p&gt;

&lt;h2 id=&quot;lsn-log-sequence-number&quot;&gt;LSN: Log Sequence Number&lt;/h2&gt;

&lt;p&gt;Every position in WAL has a unique identifier called an LSN (Log Sequence Number):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;LSN format: 0/16A4B80
            │ │
            │ └─ Low 32 bits: byte offset (hex)
            └─── High 32 bits (hex)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;LSN is essentially a 64-bit offset into the infinite WAL stream.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Get current WAL write position
SELECT pg_current_wal_lsn();
-- Result: 0/16A4B80

-- Get current WAL insert position
SELECT pg_current_wal_insert_lsn();
-- Result: 0/16A4C00
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;LSNs increase monotonically. A higher LSN means a later point in time.&lt;/p&gt;
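
&lt;p&gt;Because LSNs are byte offsets, you can subtract them with &lt;code&gt;pg_wal_lsn_diff()&lt;/code&gt; to measure how much WAL lies between two positions:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT pg_wal_lsn_diff(&apos;0/16A4C00&apos;, &apos;0/16A4B80&apos;);
-- Result: 128 (bytes between the two positions)
&lt;/code&gt;&lt;/pre&gt;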

&lt;h2 id=&quot;wal-record-structure&quot;&gt;WAL Record Structure&lt;/h2&gt;

&lt;p&gt;Each change to the database generates a WAL record:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;WAL Record:
┌─────────────────────────────┐
│ Record Header               │  ← Metadata (24 bytes)
│  - Total length             │
│  - Transaction ID           │
│  - Previous record pointer  │
│  - CRC checksum             │
├─────────────────────────────┤
│ Resource Manager Info       │  ← What kind of change?
│  - Heap, Btree, Sequence... │
├─────────────────────────────┤
│ Block References            │  ← Which pages changed?
│  - Page number              │
│  - Fork (main/FSM/VM)       │
├─────────────────────────────┤
│ Main Data                   │  ← The actual change
│  - Old values (for undo)    │
│  - New values (for redo)    │
└─────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;example-wal-record&quot;&gt;Example WAL Record&lt;/h3&gt;

&lt;p&gt;When you run:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;UPDATE users SET balance = 500 WHERE id = 1;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;PostgreSQL generates a WAL record like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Resource Manager: Heap (table data)
Block Reference: Relation 16385, Block 0
Old Tuple: (id=1, balance=400, name=&apos;Alice&apos;)
New Tuple: (id=1, balance=500, name=&apos;Alice&apos;)
Transaction ID: 1234
LSN: 0/16A4B80
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This record contains everything needed to:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Redo&lt;/strong&gt;: Apply the change (crash recovery)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Undo&lt;/strong&gt;: Reverse the change (rollback, though PostgreSQL uses MVCC instead)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;types-of-wal-records&quot;&gt;Types of WAL Records&lt;/h2&gt;

&lt;p&gt;Different operations generate different WAL record types:&lt;/p&gt;

&lt;h3 id=&quot;heap-records-table-data&quot;&gt;Heap Records (Table Data)&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;INSERT INTO users VALUES (1, &apos;Alice&apos;);
-- WAL: HEAP_INSERT, block 0, tuple data [1, &apos;Alice&apos;]

UPDATE users SET name = &apos;Alice Smith&apos; WHERE id = 1;
-- WAL: HEAP_UPDATE, block 0, old tuple, new tuple

DELETE FROM users WHERE id = 1;
-- WAL: HEAP_DELETE, block 0, tuple offset
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;btree-records-index-data&quot;&gt;Btree Records (Index Data)&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;CREATE INDEX idx_users_email ON users(email);
-- WAL: BTREE_INSERT for each index entry

UPDATE users SET email = &apos;newemail@example.com&apos; WHERE id = 1;
-- WAL: BTREE_INSERT (new entry); the old entry is cleaned up later by vacuum
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;transaction-records&quot;&gt;Transaction Records&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;COMMIT;
-- WAL: TRANSACTION_COMMIT, transaction ID, timestamp

ROLLBACK;
-- WAL: TRANSACTION_ABORT, transaction ID
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;checkpoint-records&quot;&gt;Checkpoint Records&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Checkpoint happens
-- WAL: CHECKPOINT, LSN, redo point, next XID, database state
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;how-crash-recovery-works&quot;&gt;How Crash Recovery Works&lt;/h2&gt;

&lt;p&gt;When PostgreSQL starts after a crash:&lt;/p&gt;

&lt;h3 id=&quot;step-1-read-last-checkpoint&quot;&gt;Step 1: Read Last Checkpoint&lt;/h3&gt;

&lt;p&gt;PostgreSQL reads &lt;code&gt;pg_control&lt;/code&gt; to find the last completed checkpoint:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# View control file info (run from the shell, not psql)
pg_controldata $PGDATA

# Shows:
# Latest checkpoint location: 0/1500000
# Prior checkpoint location:  0/1000000
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;step-2-find-redo-point&quot;&gt;Step 2: Find Redo Point&lt;/h3&gt;

&lt;p&gt;The checkpoint record contains the “redo point”, which is the LSN where recovery should start:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Checkpoint at LSN 0/1500000
Redo point: 0/1480000

This means:
- All changes before 0/1480000 are on disk
- Changes from 0/1480000 to crash point need replay
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;step-3-replay-wal&quot;&gt;Step 3: Replay WAL&lt;/h3&gt;

&lt;p&gt;PostgreSQL reads WAL records from the redo point forward:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Read WAL record at 0/1480000:
  HEAP_UPDATE on block 5
  Apply: Load page 5, apply update

Read WAL record at 0/1480100:
  BTREE_INSERT on index block 10
  Apply: Load page 10, insert index entry

... continue until end of WAL ...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each record is applied to reconstruct the database state.&lt;/p&gt;

&lt;h3 id=&quot;step-4-reach-consistent-state&quot;&gt;Step 4: Reach Consistent State&lt;/h3&gt;

&lt;p&gt;When replay completes, the database is consistent up to the crash point. All committed transactions are present, all uncommitted transactions are absent (because COMMIT records weren’t written).&lt;/p&gt;

&lt;h2 id=&quot;handling-corrupted-pages&quot;&gt;Handling Corrupted Pages&lt;/h2&gt;

&lt;p&gt;Crash recovery through WAL replay sounds straightforward, but there’s a subtle problem: what if the data pages themselves become corrupted during a crash? PostgreSQL needs a way to handle partial writes that leave pages in an inconsistent state.&lt;/p&gt;

&lt;h3 id=&quot;the-problem&quot;&gt;The Problem&lt;/h3&gt;

&lt;pre&gt;&lt;code&gt;PostgreSQL writes 8 KB page to disk
Crash happens after 4 KB written
Page is corrupted (half old, half new)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Even with WAL, you can’t replay changes onto a corrupted page.&lt;/p&gt;

&lt;h3 id=&quot;the-solution-full-page-writes&quot;&gt;The Solution: Full Page Writes&lt;/h3&gt;

&lt;p&gt;After each checkpoint, the first modification to a page writes the entire page to WAL:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- First update to a page after checkpoint
UPDATE users SET balance = 500 WHERE id = 1;

-- WAL record contains:
-- 1. Full 8 KB page image (FPW)
-- 2. The change record
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Subsequent updates to the same page only log the change (until next checkpoint).&lt;/p&gt;

&lt;p&gt;This ensures:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;If page is partially written during crash, WAL contains a good copy&lt;/li&gt;
  &lt;li&gt;Recovery can restore the page from WAL, then apply changes&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Check FPW setting
SHOW full_page_writes;  -- Should be &apos;on&apos;

-- Related setting: also write full pages for hint-bit-only changes
SHOW wal_log_hints;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Disabling full page writes improves performance but risks data corruption during crashes.&lt;/p&gt;
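
&lt;p&gt;Before considering that trade-off, you can gauge how much of your WAL volume comes from full page images using &lt;code&gt;pg_stat_wal&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT
    wal_fpi,
    wal_records,
    round(100.0 * wal_fpi / nullif(wal_records, 0), 2) AS fpi_pct
FROM pg_stat_wal;

-- A burst of FPIs right after each checkpoint is normal;
-- a consistently high fpi_pct suggests checkpoints are too frequent
&lt;/code&gt;&lt;/pre&gt;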

&lt;h2 id=&quot;managing-wal-in-memory&quot;&gt;Managing WAL in Memory&lt;/h2&gt;

&lt;p&gt;WAL records need to reach disk to guarantee durability, but writing to disk on every change would be too slow. PostgreSQL uses an in-memory buffer to batch WAL writes efficiently.&lt;/p&gt;

&lt;p&gt;WAL records aren’t written directly to disk. They go through WAL buffers:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Transaction generates WAL record
    ↓
WAL buffers (in memory, typically 16 MB)
    ↓
fsync to disk (when needed)
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;when-wal-is-flushed&quot;&gt;When WAL is Flushed&lt;/h3&gt;

&lt;p&gt;WAL is flushed to disk when:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Transaction commits&lt;/strong&gt;:
    &lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;COMMIT;  -- Forces fsync of WAL up to this point
&lt;/code&gt;&lt;/pre&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;WAL buffer fills&lt;/strong&gt;:
    &lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SHOW wal_buffers;  -- Default: 1/32 of shared_buffers, capped at 16 MB
-- When the buffer is full, flush to disk
&lt;/code&gt;&lt;/pre&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;WAL writer process&lt;/strong&gt;:
    &lt;pre&gt;&lt;code&gt;Every wal_writer_delay (200 ms by default), flush any unwritten WAL
&lt;/code&gt;&lt;/pre&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Check WAL flush stats
SELECT * FROM pg_stat_wal;

-- Key metrics:
-- wal_write: Number of times WAL was written
-- wal_sync:  Number of times WAL was synced (fsync)
-- wal_bytes: Total bytes written to WAL
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;beyond-crash-recovery&quot;&gt;Beyond Crash Recovery&lt;/h2&gt;

&lt;p&gt;While WAL’s primary purpose is crash recovery, it also enables several advanced PostgreSQL features. By preserving a complete history of changes, WAL becomes the foundation for backup strategies and replication.&lt;/p&gt;

&lt;p&gt;For point-in-time recovery and replication, you can archive completed WAL segments:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Enable archiving in postgresql.conf
archive_mode = on
archive_command = &apos;cp %p /mnt/wal_archive/%f&apos;

-- PostgreSQL will archive each 16 MB segment when full
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Archived WAL lets you:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Restore to any point in time&lt;/li&gt;
  &lt;li&gt;Set up streaming replication&lt;/li&gt;
  &lt;li&gt;Build read replicas&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;wal-generation-rate&quot;&gt;WAL Generation Rate&lt;/h2&gt;

&lt;p&gt;Different workloads generate WAL at different rates:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Read-only query (no WAL generated)
SELECT * FROM users;

-- Small insert (~100 bytes of WAL)
INSERT INTO users VALUES (1, &apos;Alice&apos;);

-- Large update with indexes (~1 KB of WAL)
UPDATE users SET email = &apos;newemail@example.com&apos; WHERE id = 1;

-- Create index (MB of WAL)
CREATE INDEX idx_users_email ON users(email);
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;monitoring-wal-generation&quot;&gt;Monitoring WAL Generation&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Current WAL position
SELECT pg_current_wal_lsn();
-- Result: 0/16A4B80

-- Wait 1 minute, check again
SELECT pg_current_wal_lsn();
-- Result: 0/18C7000

-- Calculate WAL generated
SELECT pg_size_pretty(
    pg_wal_lsn_diff(&apos;0/18C7000&apos;, &apos;0/16A4B80&apos;)
);
-- Result: 2.1 MB generated in 1 minute
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The faster WAL is generated, the sooner &lt;code&gt;max_wal_size&lt;/code&gt; is reached and the more frequent checkpoints become.&lt;/p&gt;
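
&lt;p&gt;One way to check whether WAL volume is driving your checkpoints is to compare timed vs. requested checkpoints (columns shown are from &lt;code&gt;pg_stat_bgwriter&lt;/code&gt;; on PostgreSQL 17+ the same counters live in &lt;code&gt;pg_stat_checkpointer&lt;/code&gt;):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Checkpoints triggered by checkpoint_timeout vs. forced by WAL volume
SELECT checkpoints_timed, checkpoints_req
FROM pg_stat_bgwriter;

-- Mostly checkpoints_req → max_wal_size is too small for your WAL rate
&lt;/code&gt;&lt;/pre&gt;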

&lt;h2 id=&quot;wal-and-performance&quot;&gt;WAL and Performance&lt;/h2&gt;

&lt;p&gt;Understanding how WAL affects performance helps you make informed trade-offs between durability and speed. WAL introduces overhead in multiple areas, from write latency to disk space usage. Let’s look at some of the ways:&lt;/p&gt;

&lt;h3 id=&quot;1-write-performance&quot;&gt;1. Write Performance&lt;/h3&gt;

&lt;p&gt;Every transaction writes to WAL:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Fast (sequential I/O)&lt;/li&gt;
  &lt;li&gt;But still I/O (slower than memory)&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Synchronous commit (default, safe)
SET synchronous_commit = on;
-- Every COMMIT waits for WAL fsync

-- Asynchronous commit (faster, less safe)
SET synchronous_commit = off;
-- COMMIT returns immediately, fsync happens later
-- Risk: Lose the most recent commits on crash (up to 3 × wal_writer_delay, ~600ms)
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;2-disk-space&quot;&gt;2. Disk Space&lt;/h3&gt;

&lt;p&gt;WAL segments accumulate:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Each segment is 16 MB by default&lt;/li&gt;
  &lt;li&gt;Kept until checkpoint completes&lt;/li&gt;
  &lt;li&gt;Then recycled or archived&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Check total size of WAL segments on disk
SELECT pg_size_pretty(sum(size)) FROM pg_ls_waldir();
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;3-checkpoint-frequency&quot;&gt;3. Checkpoint Frequency&lt;/h3&gt;

&lt;p&gt;More WAL generation → more frequent checkpoints:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- If you generate 1 GB WAL per minute
-- And max_wal_size = 1GB
-- Checkpoints happen every minute (frequent!)

-- Solution: Increase max_wal_size
max_wal_size = 4GB
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With the recommended solution, can you guess the trade-offs? If you’re not sure, read the previous parts of this series and then revisit this post.&lt;/p&gt;

&lt;h2 id=&quot;whats-next&quot;&gt;What’s Next?&lt;/h2&gt;

&lt;p&gt;Now that you understand how WAL works internally, we can explore the monitoring and administration tools PostgreSQL provides. In &lt;a href=&quot;/postgres-fundamentals-monitoring-administration-part-6&quot;&gt;Part 6&lt;/a&gt;, we’ll learn about system views, log analysis, and how to use &lt;code&gt;pg_waldump&lt;/code&gt; to inspect WAL records.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Previous&lt;/strong&gt;: &lt;a href=&quot;/postgres-fundamentals-performance-patterns-part-4&quot;&gt;Part 4 - Performance Patterns and Trade-offs&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/wal-intro.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL WAL introduction documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: Write-Ahead Logging&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/wal-internals.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL WAL internals documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: WAL Internals&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/wal-configuration.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL WAL configuration documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: WAL Configuration&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Wed, 15 Oct 2025 00:00:00 +0000</pubDate>
        <link>https://prateekcodes.com/postgres-fundamentals-wal-deep-dive-part-5/</link>
        <guid isPermaLink="true">https://prateekcodes.com/postgres-fundamentals-wal-deep-dive-part-5/</guid>
        
        <category>postgres</category>
        
        <category>database-fundamentals</category>
        
        <category>wal</category>
        
        <category>write-ahead-logging</category>
        
        <category>crash-recovery</category>
        
        <category>durability</category>
        
        
        <category>PostgreSQL</category>
        
        <category>Database</category>
        
      </item>
    
      <item>
        <title>PostgreSQL Fundamentals: Performance Patterns and Trade-offs (Part 4)</title>
        <description>&lt;p&gt;In &lt;a href=&quot;/postgres-fundamentals-transactions-part-3&quot;&gt;Part 3&lt;/a&gt;, we learned about transactions and ACID properties. Now let’s explore the performance implications of building a durable database and the trade-offs PostgreSQL makes.&lt;/p&gt;

&lt;p&gt;This is Part 4 of a series on PostgreSQL internals.&lt;/p&gt;

&lt;h2 id=&quot;the-durability-vs-performance-dilemma&quot;&gt;The Durability vs Performance Dilemma&lt;/h2&gt;

&lt;p&gt;Remember from &lt;a href=&quot;/postgres-fundamentals-transactions-part-3&quot;&gt;Part 3&lt;/a&gt;: durability means committed data survives crashes. But guaranteeing durability is expensive.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- When you commit
COMMIT;

-- PostgreSQL must ensure data is on disk
-- But disk writes are slow (See Part 1)
-- This creates a fundamental tension
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Every database faces this challenge. The solution lies in clever I/O patterns.&lt;/p&gt;

&lt;h2 id=&quot;write-amplification-the-overhead&quot;&gt;Write Amplification: The Overhead&lt;/h2&gt;

&lt;p&gt;Write amplification is when you write more data to disk than the actual change requires.&lt;/p&gt;

&lt;h3 id=&quot;example-updating-one-field&quot;&gt;Example: Updating One Field&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- You change one field
UPDATE users SET last_login = NOW() WHERE id = 123;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What actually happens:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Logical change&lt;/strong&gt;: 8 bytes (a timestamp)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Physical write&lt;/strong&gt;: Entire 8 KB page must be written&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That’s 1,000x amplification (8 KB ÷ 8 bytes).&lt;/p&gt;

&lt;h3 id=&quot;why-write-the-whole-page&quot;&gt;Why Write the Whole Page?&lt;/h3&gt;

&lt;p&gt;Disks don’t write 8 bytes at a time. The smallest writable unit is typically 512 bytes or 4 KB (the disk sector/block size). PostgreSQL builds on this reality: it always reads and writes whole 8 KB pages.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;You want to change: [X]
Must write: [XXXXXXXX] ← Entire page
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;write-amplification-with-indexes&quot;&gt;Write Amplification with Indexes&lt;/h3&gt;

&lt;p&gt;It gets worse with indexes:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;UPDATE users SET email = &apos;newemail@example.com&apos; WHERE id = 123;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;PostgreSQL must update:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;The table page (8 KB)&lt;/li&gt;
  &lt;li&gt;The old index entry (mark as deleted)&lt;/li&gt;
  &lt;li&gt;The new index entry (insert)&lt;/li&gt;
  &lt;li&gt;Any other indexes on updated columns&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;One logical change = multiple page writes.&lt;/p&gt;
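
&lt;p&gt;You can see the extra index writes in the WAL volume itself. A sketch, using illustrative LSN values:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT pg_current_wal_lsn();  -- e.g. 0/2000000
UPDATE users SET email = &apos;new@example.com&apos; WHERE id = 123;  -- indexed column
SELECT pg_current_wal_lsn();  -- e.g. 0/2000600

-- Diff the two positions; updating an indexed column generates
-- noticeably more WAL than updating a non-indexed one
SELECT pg_wal_lsn_diff(&apos;0/2000600&apos;, &apos;0/2000000&apos;);
&lt;/code&gt;&lt;/pre&gt;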

&lt;h2 id=&quot;io-batching-this-is-the-way&quot;&gt;I/O Batching: This is the way&lt;/h2&gt;

&lt;p&gt;Instead of writing each change immediately, batch them:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Transaction 1
UPDATE users SET last_login = NOW() WHERE id = 1;
COMMIT;

-- Transaction 2
UPDATE users SET last_login = NOW() WHERE id = 2;
COMMIT;

-- Transaction 3
UPDATE users SET last_login = NOW() WHERE id = 3;
COMMIT;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If users 1, 2, and 3 are on the same page:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Without batching&lt;/strong&gt;: Write the page 3 times&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;With batching&lt;/strong&gt;: Write the page once with all changes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;PostgreSQL batches writes in two ways:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;WAL buffering&lt;/strong&gt;: Buffer multiple WAL records before flushing&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Checkpoint batching&lt;/strong&gt;: Accumulate dirty pages, flush together&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;introducing-fsync&quot;&gt;Introducing fsync&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;fsync&lt;/code&gt; is the system call that forces data to physical disk (not just OS cache). It’s slow but necessary for durability.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;// Simplified PostgreSQL commit
write(wal_fd, wal_record, record_len);  // Write to OS buffer (fast)
fsync(wal_fd);               // Force to physical disk (slow!)
return COMMIT_SUCCESS;       // Now safe to acknowledge
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;fsync-performance&quot;&gt;fsync Performance&lt;/h3&gt;

&lt;pre&gt;&lt;code&gt;Without fsync:  ~1-2 microseconds (OS buffer)
With fsync:     ~1-2 milliseconds (physical disk)

That&apos;s a 1,000x difference!
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If you need 1,000 commits per second and each one waits for its own fsync:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;1,000 commits × 2ms = 2,000ms of fsync time, but a second only has 1,000ms&lt;/li&gt;
  &lt;li&gt;You can only do ~500 commits/second, not 1,000&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;group-commit-batching-fsyncs&quot;&gt;Group Commit: Batching fsyncs&lt;/h3&gt;

&lt;p&gt;PostgreSQL uses group commit to amortize fsync cost:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Transaction 1 commits → Write WAL record → Wait for fsync
Transaction 2 commits → Write WAL record → Wait for fsync
Transaction 3 commits → Write WAL record → Wait for fsync
                                             ↓
                                    Single fsync for all three!
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Multiple transactions share the same fsync operation:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Transaction 1 arrives, starts fsync&lt;/li&gt;
  &lt;li&gt;Transactions 2 and 3 arrive while fsync is happening&lt;/li&gt;
  &lt;li&gt;All three complete when fsync finishes&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- See group commit in action
SELECT
    (SELECT sum(xact_commit) FROM pg_stat_database) AS total_commits,
    wal_sync,
    round((SELECT sum(xact_commit) FROM pg_stat_database)::numeric / wal_sync, 2) AS commits_per_sync
FROM pg_stat_wal;

-- Example output:
-- total_commits | wal_sync  | commits_per_sync
-- --------------+-----------+-----------------
-- 68245891203   | 204512847 | 333.70

-- commits_per_sync of 333.70 means ~334 commits share each fsync
-- Group commit is batching effectively
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;sequential-vs-random-io-revisited&quot;&gt;Sequential vs Random I/O Revisited&lt;/h2&gt;

&lt;p&gt;From &lt;a href=&quot;/postgres-fundamentals-memory-vs-disk-part-1&quot;&gt;Part 1&lt;/a&gt;, we know sequential I/O is faster. Let’s see why this matters for database design.&lt;/p&gt;

&lt;h3 id=&quot;random-write-pattern-slow&quot;&gt;Random Write Pattern (Slow)&lt;/h3&gt;

&lt;p&gt;Updating scattered rows:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;UPDATE users SET last_login = NOW() WHERE id IN (1, 5000, 10000, 50000);

-- These rows are likely on different pages
-- PostgreSQL must write multiple pages scattered across disk
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Disk head (or SSD controller) jumps around:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Write page 1    → seek →
Write page 5000 → seek →
Write page 10000 → seek →
Write page 50000
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each seek adds latency (5-10ms on HDD).&lt;/p&gt;

&lt;h3 id=&quot;sequential-write-pattern-fast&quot;&gt;Sequential Write Pattern (Fast)&lt;/h3&gt;

&lt;p&gt;WAL writes sequentially:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Same updates, but WAL records are written sequentially
[Record 1][Record 2][Record 3][Record 4]...

-- No seeking, just append
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This is why PostgreSQL uses WAL:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Changes go to WAL (sequential, fast)&lt;/li&gt;
  &lt;li&gt;Data pages updated later (random, slow but batched)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;the-write-back-cache-pattern&quot;&gt;The Write-Back Cache Pattern&lt;/h2&gt;

&lt;p&gt;PostgreSQL uses a write-back cache pattern:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;1. Change happens → Write to WAL (fast, sequential)
2. Change happens → Update page in memory (fastest)
3. Later → Flush dirty pages to disk (slow, random, but batched)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This is the same pattern CPUs use with L1/L2 cache and main memory. For more on CPU cache hierarchies, see &lt;a href=&quot;https://people.freebsd.org/~lstewart/articles/cpumemory.pdf&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;CPU memory architecture paper (opens in new tab)&quot;&gt;What Every Programmer Should Know About Memory&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;dirty-pages-accumulate&quot;&gt;Dirty Pages Accumulate&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Transaction 1
UPDATE users SET name = &apos;Alice&apos; WHERE id = 1;
COMMIT;
-- Page 1 is now &quot;dirty&quot; (in memory, not yet written to data file)

-- Transaction 2
UPDATE users SET email = &apos;bob@example.com&apos; WHERE id = 2;
COMMIT;
-- Page 1 is STILL dirty (not written yet)

-- Checkpoint happens
-- Now page 1 with BOTH changes is written once
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Dirty pages stay in &lt;code&gt;shared_buffers&lt;/code&gt; until a &lt;em&gt;checkpoint&lt;/em&gt;.&lt;/p&gt;
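
&lt;p&gt;If the &lt;code&gt;pg_buffercache&lt;/code&gt; extension is available, you can count the dirty pages currently waiting in &lt;code&gt;shared_buffers&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;CREATE EXTENSION IF NOT EXISTS pg_buffercache;

SELECT count(*) AS dirty_pages,
       pg_size_pretty(count(*) * 8192) AS dirty_size
FROM pg_buffercache
WHERE isdirty;
&lt;/code&gt;&lt;/pre&gt;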

&lt;h2 id=&quot;checkpoint-trade-off-alert&quot;&gt;Checkpoint (Trade-off alert)&lt;/h2&gt;

&lt;p&gt;Checkpoints must balance (for a complete understanding of checkpoints, see &lt;a href=&quot;/understanding-postgres-checkpoints&quot;&gt;Understanding PostgreSQL Checkpoints&lt;/a&gt;):&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Frequency&lt;/strong&gt;: How often to flush dirty pages?
    &lt;ul&gt;
      &lt;li&gt;Too frequent: Excessive I/O overhead&lt;/li&gt;
      &lt;li&gt;Too infrequent: Long crash recovery time&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Duration&lt;/strong&gt;: How fast to write dirty pages?
    &lt;ul&gt;
      &lt;li&gt;Too fast: I/O spike, queries slow down&lt;/li&gt;
      &lt;li&gt;Too slow: Checkpoint takes too long&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;PostgreSQL’s solution:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Checkpoint configuration
checkpoint_timeout = 5min          -- Max time between checkpoints
max_wal_size = 1GB                 -- Max WAL before forcing checkpoint
checkpoint_completion_target = 0.9 -- Spread writes over 90% of interval
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;checkpoint_completion_target = 0.9&lt;/code&gt; means:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Checkpoint interval is 5 minutes&lt;/li&gt;
  &lt;li&gt;Spread writes over 4.5 minutes (90%)&lt;/li&gt;
  &lt;li&gt;If ~1 GB of dirty pages must be flushed, write ~230 MB per minute instead of 1 GB in one burst&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;visualizing-checkpoint-io&quot;&gt;Visualizing Checkpoint I/O&lt;/h3&gt;

&lt;pre&gt;&lt;code&gt;Without spreading (checkpoint_completion_target = 0):
I/O │     ┌────┐
    │     │    │
    │─────┘    └───────────────
    Time →

With spreading (checkpoint_completion_target = 0.9):
I/O │     ┌────────────┐
    │    ╱              ╲
    │───┘                └─────
    Time →
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Spreading reduces I/O spikes but increases checkpoint duration.&lt;/p&gt;

&lt;h2 id=&quot;write-amplification-in-postgresql&quot;&gt;Write Amplification in PostgreSQL&lt;/h2&gt;

&lt;p&gt;Let’s calculate actual write amplification:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Simple update
UPDATE users SET status = &apos;active&apos; WHERE id = 123;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Writes required:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;WAL record&lt;/strong&gt;: ~100 bytes (change record + metadata)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Data page&lt;/strong&gt;: 8 KB (entire page)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Index page&lt;/strong&gt;: 8 KB (if updating indexed column)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Total: ~16 KB written for a ~10 byte logical change.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amplification factor: 1,600x&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;But this gets batched:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;100 updates to same page → Write page once at checkpoint&lt;/li&gt;
  &lt;li&gt;Effective amplification: ~16x instead of 1,600x, since the 16 KB of page writes is shared across 100 updates&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;understanding-the-trade-offs&quot;&gt;Understanding the Trade-offs&lt;/h2&gt;

&lt;p&gt;PostgreSQL’s design creates fundamental trade-offs between durability, performance, and resource usage. Let’s explore the key decisions database administrators must make.&lt;/p&gt;

&lt;h2 id=&quot;the-memory-trade-off&quot;&gt;The Memory Trade-off&lt;/h2&gt;

&lt;p&gt;More memory (shared_buffers) = better batching:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Small shared_buffers (128 MB)
-- Pages evicted quickly
-- Less batching opportunity
-- More checkpoints needed

-- Large shared_buffers (8 GB)
-- Pages stay in memory longer
-- More batching opportunity
-- Fewer checkpoints needed
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;But memory is expensive and limited.&lt;/p&gt;

&lt;h2 id=&quot;crash-recovery-time-trade-off&quot;&gt;Crash Recovery Time Trade-off&lt;/h2&gt;

&lt;p&gt;The longer between checkpoints, the more WAL to replay after a crash:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Checkpoint every 5 min → ~5 min of WAL to replay
Checkpoint every 30 min → ~30 min of WAL to replay

Larger checkpoint interval = longer recovery time
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;PostgreSQL balances:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Performance (less frequent checkpoints)&lt;/li&gt;
  &lt;li&gt;Recovery time (more frequent checkpoints)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Default &lt;code&gt;checkpoint_timeout = 5min&lt;/code&gt; is a reasonable middle ground.&lt;/p&gt;
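
&lt;p&gt;You can estimate how much WAL a crash right now would force you to replay: recovery starts from the last checkpoint’s redo position, so the gap to the current LSN is the replay workload.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SELECT pg_size_pretty(
    pg_wal_lsn_diff(pg_current_wal_lsn(),
                    (pg_control_checkpoint()).redo_lsn)
) AS wal_to_replay_after_crash;
&lt;/code&gt;&lt;/pre&gt;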

&lt;h2 id=&quot;practical-example-measuring-write-amplification&quot;&gt;Practical Example: Measuring Write Amplification&lt;/h2&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Create test table
CREATE TABLE write_test (id SERIAL PRIMARY KEY, data TEXT);

-- Record starting WAL position
SELECT pg_current_wal_lsn() AS start_lsn;
-- Result: 0/1000000

-- Insert 1000 rows
INSERT INTO write_test (data)
SELECT repeat(&apos;x&apos;, 100)
FROM generate_series(1, 1000);

-- Record ending WAL position
SELECT pg_current_wal_lsn() AS end_lsn;
-- Result: 0/1500000

-- Calculate WAL generated
SELECT pg_wal_lsn_diff(&apos;0/1500000&apos;, &apos;0/1000000&apos;) AS wal_bytes;
-- Result: 5242880 (5 MB)

-- Logical data size
SELECT pg_size_pretty(pg_relation_size(&apos;write_test&apos;));
-- Result: 128 KB

-- Write amplification: 5 MB WAL / 128 KB data = ~40x
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The WAL is much larger than the data due to:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Transaction metadata&lt;/li&gt;
  &lt;li&gt;Index updates&lt;/li&gt;
  &lt;li&gt;MVCC version information&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;whats-next&quot;&gt;What’s Next?&lt;/h2&gt;

&lt;p&gt;Now that you understand the performance trade-offs databases face, we can dive deep into Write-Ahead Logging. In &lt;a href=&quot;/postgres-fundamentals-wal-deep-dive-part-5&quot;&gt;Part 5&lt;/a&gt;, we’ll explore exactly how WAL works, what’s inside WAL records, and how it enables both durability and crash recovery.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Previous&lt;/strong&gt;: &lt;a href=&quot;/postgres-fundamentals-transactions-part-3&quot;&gt;Part 3 - Transactions and ACID&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/wal-configuration.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL WAL configuration documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: WAL Configuration&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-CHECKPOINTS&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL checkpoint configuration documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: Checkpoint Parameters&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/runtime-config-resource.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL resource consumption documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: Resource Consumption&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Tue, 14 Oct 2025 00:00:00 +0000</pubDate>
        <link>https://prateekcodes.com/postgres-fundamentals-performance-patterns-part-4/</link>
        <guid isPermaLink="true">https://prateekcodes.com/postgres-fundamentals-performance-patterns-part-4/</guid>
        
        <category>postgres</category>
        
        <category>database-fundamentals</category>
        
        <category>performance</category>
        
        <category>optimization</category>
        
        <category>io-patterns</category>
        
        <category>write-amplification</category>
        
        
        <category>PostgreSQL</category>
        
        <category>Database</category>
        
      </item>
    
      <item>
        <title>PostgreSQL Fundamentals: Transactions and ACID (Part 3)</title>
        <description>&lt;p&gt;In &lt;a href=&quot;/postgres-fundamentals-database-storage-part-2&quot;&gt;Part 2&lt;/a&gt;, we learned how PostgreSQL stores data in pages. Now let’s explore transactions, which are the mechanism that keeps your data consistent even when multiple users access it simultaneously or the system crashes.&lt;/p&gt;

&lt;p&gt;This is Part 3 of a series on PostgreSQL internals.&lt;/p&gt;

&lt;h2 id=&quot;what-is-a-transaction&quot;&gt;What Is a Transaction?&lt;/h2&gt;

&lt;p&gt;A transaction is a sequence of database operations that are treated as a single unit of work. Either all operations succeed, or none do.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Start a transaction
BEGIN;

-- Multiple operations
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- Commit: Make changes permanent
COMMIT;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If anything goes wrong between &lt;code&gt;BEGIN&lt;/code&gt; and &lt;code&gt;COMMIT&lt;/code&gt;, you can &lt;code&gt;ROLLBACK&lt;/code&gt; to undo everything:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
-- Oops, error or change your mind
ROLLBACK;  -- First UPDATE is undone
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;the-acid-properties&quot;&gt;The ACID Properties&lt;/h2&gt;

&lt;p&gt;ACID is an acronym for the four properties that guarantee database transactions are processed reliably:&lt;/p&gt;

&lt;h3 id=&quot;1-atomicity-all-or-nothing&quot;&gt;1. Atomicity: All or Nothing&lt;/h3&gt;

&lt;p&gt;A transaction either completes fully or has no effect at all.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN;

-- Transfer $100 from Alice to Bob
UPDATE accounts SET balance = balance - 100 WHERE user_id = 1;  -- Alice
UPDATE accounts SET balance = balance + 100 WHERE user_id = 2;  -- Bob

COMMIT;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If the database crashes after the first UPDATE but before COMMIT:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Both updates are undone&lt;/li&gt;
  &lt;li&gt;No money is lost or created&lt;/li&gt;
  &lt;li&gt;The system is as if the transaction never happened&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is &lt;strong&gt;atomicity&lt;/strong&gt;. The transaction is atomic (indivisible).&lt;/p&gt;

&lt;h3 id=&quot;2-consistency-rules-are-always-enforced&quot;&gt;2. Consistency: Rules Are Always Enforced&lt;/h3&gt;

&lt;p&gt;The database moves from one valid state to another valid state. Constraints are always respected.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;CREATE TABLE accounts (
    user_id INT PRIMARY KEY,
    balance DECIMAL CHECK (balance &amp;gt;= 0)  -- Constraint: no negative balances
);

BEGIN;
UPDATE accounts SET balance = balance - 200 WHERE user_id = 1;
-- If this would make balance negative, the UPDATE itself fails:
-- ERROR: new row for relation &quot;accounts&quot; violates check constraint &quot;accounts_balance_check&quot;
ROLLBACK;  -- The aborted transaction must be rolled back
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The database prevents you from violating constraints. This is &lt;strong&gt;consistency&lt;/strong&gt;.&lt;/p&gt;

&lt;h3 id=&quot;3-isolation-transactions-dont-interfere&quot;&gt;3. Isolation: Transactions Don’t Interfere&lt;/h3&gt;

&lt;p&gt;When multiple transactions run concurrently, each transaction sees a consistent snapshot of the database, as if it’s running alone.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Transaction 1
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE user_id = 1;  -- Returns 500
-- ... doing some work ...

-- Transaction 2 (running at the same time)
BEGIN;
UPDATE accounts SET balance = 1000 WHERE user_id = 1;
COMMIT;

-- Back to Transaction 1
SELECT balance FROM accounts WHERE user_id = 1;  -- Still returns 500!
COMMIT;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Transaction 1 sees a consistent view even though Transaction 2 modified the data. This is &lt;strong&gt;isolation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;PostgreSQL provides multiple isolation levels:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Set isolation level
BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- or
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- or
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each level provides different guarantees about what changes from other transactions you can see.&lt;/p&gt;

&lt;h3 id=&quot;4-durability-committed-data-survives&quot;&gt;4. Durability: Committed Data Survives&lt;/h3&gt;

&lt;p&gt;Once a transaction commits, the changes are permanent, even if the system crashes immediately after.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN;
INSERT INTO orders (user_id, total) VALUES (1, 99.99);
COMMIT;  -- From this point, the data MUST survive

-- Even if PostgreSQL crashes here, the order exists when it restarts
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;em&gt;This is &lt;strong&gt;durability&lt;/strong&gt;, which is the hardest property to implement and the reason Write-Ahead Logging exists.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;how-postgresql-implements-durability&quot;&gt;How PostgreSQL Implements Durability&lt;/h2&gt;

&lt;p&gt;When you commit a transaction, PostgreSQL must ensure the data survives a crash. But writing to disk is slow (remember &lt;a href=&quot;/postgres-fundamentals-memory-vs-disk-part-1&quot;&gt;Part 1&lt;/a&gt;?).&lt;/p&gt;

&lt;p&gt;PostgreSQL’s solution: &lt;strong&gt;Write-Ahead Logging (WAL)&lt;/strong&gt;&lt;/p&gt;

&lt;h3 id=&quot;the-durability-problem&quot;&gt;The Durability Problem&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;COMMIT;  -- User expects this to be permanent

-- But writing all changed pages to disk takes time:
-- - Multiple random disk writes
-- - Could be many pages scattered across disk
-- - Takes 10-100ms

-- What if the system crashes during these writes?
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;the-wal-solution&quot;&gt;The WAL Solution&lt;/h3&gt;

&lt;p&gt;Instead of writing data pages directly:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Write changes to WAL&lt;/strong&gt; (sequential, fast)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Acknowledge COMMIT to user&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Write data pages later&lt;/strong&gt; (in background)&lt;/li&gt;
&lt;/ol&gt;

&lt;pre&gt;&lt;code&gt;Transaction commits:
    ↓
Write to WAL (sequential, fast: 1-2ms)
    ↓
COMMIT acknowledged ✅
    ↓
Later: Write dirty pages to data files
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The WAL is a sequential log of all changes. Sequential writes are fast (see &lt;a href=&quot;/postgres-fundamentals-memory-vs-disk-part-1&quot;&gt;Part 1&lt;/a&gt;).&lt;/p&gt;

&lt;h3 id=&quot;wal-example&quot;&gt;WAL Example&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN;
UPDATE accounts SET balance = 500 WHERE user_id = 1;
COMMIT;

-- Behind the scenes:
-- 1. Change recorded in WAL:
--    &quot;Transaction 12345: Change page 42, offset 10, old value 400, new value 500&quot;
-- 2. WAL flushed to disk (fsync)
-- 3. COMMIT returns to user
-- 4. Dirty page stays in memory (shared_buffers)
-- 5. Background writer eventually writes page 42 to data file
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If PostgreSQL crashes after step 3:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;WAL has the change recorded&lt;/li&gt;
  &lt;li&gt;On restart, PostgreSQL replays the WAL&lt;/li&gt;
  &lt;li&gt;The change is reconstructed&lt;/li&gt;
  &lt;li&gt;Durability preserved ✌🏽&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;isolation-levels-in-detail&quot;&gt;Isolation Levels in Detail&lt;/h2&gt;

&lt;p&gt;PostgreSQL offers four isolation levels, each with different trade-offs. Each level provides different guarantees about what changes from other transactions you can see. For a complete reference, see the &lt;a href=&quot;https://www.postgresql.org/docs/current/transaction-iso.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL transaction isolation documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: Transaction Isolation&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;read-uncommitted-not-really-implemented&quot;&gt;Read Uncommitted (Not Really Implemented)&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In PostgreSQL, this behaves the same as Read Committed. PostgreSQL doesn’t allow reading uncommitted data.&lt;/p&gt;
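<p>You can confirm which level a session is actually running at, and change the session default:</p>

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;SHOW transaction_isolation;  -- read committed (the default)

-- Change the default for the rest of the session:
SET default_transaction_isolation = &apos;repeatable read&apos;;
&lt;/code&gt;&lt;/pre&gt;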

&lt;h3 id=&quot;read-committed-default&quot;&gt;Read Committed (Default)&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each statement sees a snapshot of data committed before that statement began, not before the transaction began:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Transaction 1
BEGIN;
SELECT COUNT(*) FROM orders;  -- Returns 100

-- Transaction 2 (different session)
BEGIN;
INSERT INTO orders VALUES (...);
COMMIT;

-- Back to Transaction 1
SELECT COUNT(*) FROM orders;  -- Returns 101 (sees new commit)
COMMIT;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Two identical queries within the same transaction can therefore return different results (a non-repeatable read).&lt;/p&gt;
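<p>This matters for read-modify-write sequences. Under Read Committed, two sessions can both read the same balance and one update silently overwrites the other (a lost update). One common fix, sketched here against the <code>accounts</code> table from the earlier examples, is to lock the row while reading it:</p>

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN;
-- Lock the row so a concurrent read-modify-write must wait
SELECT balance FROM accounts WHERE user_id = 1 FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE user_id = 1;
COMMIT;  -- a concurrent FOR UPDATE on the same row waits until here
&lt;/code&gt;&lt;/pre&gt;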

&lt;h3 id=&quot;repeatable-read&quot;&gt;Repeatable Read&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The transaction sees a consistent snapshot of the data as of when it started:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Transaction 1
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM orders;  -- Returns 100

-- Transaction 2
BEGIN;
INSERT INTO orders VALUES (...);
COMMIT;

-- Back to Transaction 1
SELECT COUNT(*) FROM orders;  -- Still returns 100 (snapshot isolation)
COMMIT;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The entire transaction sees data as it was at the start.&lt;/p&gt;
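<p>The snapshot comes with a cost: if a Repeatable Read transaction tries to update a row that another transaction changed after the snapshot was taken, PostgreSQL aborts it rather than mixing versions:</p>

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Session A
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
UPDATE accounts SET balance = balance + 10 WHERE user_id = 1;

-- Session B (concurrently)
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
UPDATE accounts SET balance = balance + 20 WHERE user_id = 1;
-- Blocks until Session A commits, then:
-- ERROR: could not serialize access due to concurrent update

-- Session B must ROLLBACK and retry the whole transaction
&lt;/code&gt;&lt;/pre&gt;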

&lt;h3 id=&quot;serializable&quot;&gt;Serializable&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The strongest isolation level. It guarantees that concurrent transactions behave as if they had executed one at a time, in some serial order.&lt;/p&gt;

&lt;p&gt;If concurrent transactions would produce inconsistent results, PostgreSQL aborts one:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Transaction 1: Serializable
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT SUM(balance) FROM accounts;  -- 1000

-- Transaction 2: Serializable
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
INSERT INTO accounts (user_id, balance) VALUES (3, 100);
COMMIT;

-- Back to Transaction 1
INSERT INTO audit_log (total) VALUES (1000);  -- Uses old sum
COMMIT;
-- ERROR: could not serialize access due to read/write dependencies
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;PostgreSQL detects that Transaction 1’s view is stale and aborts it.&lt;/p&gt;

&lt;h2 id=&quot;mvcc-how-postgresql-implements-isolation&quot;&gt;MVCC: How PostgreSQL Implements Isolation&lt;/h2&gt;

&lt;p&gt;PostgreSQL uses Multi-Version Concurrency Control (MVCC). Instead of locking rows for reads, it keeps multiple versions of each row.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Original data
SELECT * FROM accounts WHERE user_id = 1;
-- balance: 500, xmin: 100, xmax: 0

-- Transaction 1 updates
BEGIN;  -- Transaction ID: 101
UPDATE accounts SET balance = 600 WHERE user_id = 1;
COMMIT;

-- Physical storage now has TWO versions:
-- Old: balance 500, xmin: 100, xmax: 101
-- New: balance 600, xmin: 101, xmax: 0
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;When you query, PostgreSQL uses transaction IDs to determine which version is visible:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Snapshots taken before transaction 101 committed: see balance 500&lt;/li&gt;
  &lt;li&gt;Snapshots taken after transaction 101 committed: see balance 600&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because readers and writers operate on different row versions, readers never block writers and writers never block readers.&lt;/p&gt;
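<p>You can inspect the version metadata yourself: every table has hidden system columns, and superseded (dead) row versions linger until <code>VACUUM</code> reclaims them:</p>

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- MVCC metadata on each row version
SELECT xmin, xmax, user_id, balance FROM accounts;

-- Dead row versions waiting for VACUUM
SELECT relname, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = &apos;accounts&apos;;
&lt;/code&gt;&lt;/pre&gt;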

&lt;h2 id=&quot;practical-example-transaction-behavior&quot;&gt;Practical Example: Transaction Behavior&lt;/h2&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Create test table
CREATE TABLE transfers (
    id SERIAL PRIMARY KEY,
    from_account INT,
    to_account INT,
    amount DECIMAL
);

-- Transaction that demonstrates atomicity
BEGIN;
INSERT INTO transfers (from_account, to_account, amount) VALUES (1, 2, 100);
SELECT * FROM transfers WHERE from_account = 1;  -- Shows the new row
ROLLBACK;
SELECT * FROM transfers WHERE from_account = 1;  -- Row is gone (atomicity)

-- Transaction that demonstrates isolation
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM transfers;  -- Let&apos;s say it&apos;s 0

-- In another session, insert a row and commit
-- (Open another terminal)
-- BEGIN; INSERT INTO transfers VALUES (DEFAULT, 1, 2, 50); COMMIT;

-- Back in first session
SELECT COUNT(*) FROM transfers;  -- Still 0 (isolation)
COMMIT;
SELECT COUNT(*) FROM transfers;  -- Now 1 (after commit, you see changes)
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;whats-next&quot;&gt;What’s Next?&lt;/h2&gt;

&lt;p&gt;Now that you understand transactions and why durability requires persisting data to disk, we can explore the performance implications of different I/O patterns. In &lt;a href=&quot;/postgres-fundamentals-performance-patterns-part-4&quot;&gt;Part 4&lt;/a&gt;, we’ll learn about write amplification, I/O batching, and why PostgreSQL makes the design choices it does.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Previous&lt;/strong&gt;: &lt;a href=&quot;/postgres-fundamentals-database-storage-part-2&quot;&gt;Part 2 - How Databases Store Data&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/transaction-iso.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL transaction isolation documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: Transaction Isolation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/mvcc-intro.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL MVCC documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: MVCC&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.postgresql.org/docs/current/wal-intro.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot; aria-label=&quot;PostgreSQL WAL documentation (opens in new tab)&quot;&gt;PostgreSQL Documentation: Reliability and Write-Ahead Logging&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Mon, 13 Oct 2025 00:00:00 +0000</pubDate>
        <link>https://prateekcodes.com/postgres-fundamentals-transactions-part-3/</link>
        <guid isPermaLink="true">https://prateekcodes.com/postgres-fundamentals-transactions-part-3/</guid>
        
        <category>postgres</category>
        
        <category>database-fundamentals</category>
        
        <category>transactions</category>
        
        <category>acid</category>
        
        <category>isolation</category>
        
        <category>consistency</category>
        
        
        <category>PostgreSQL</category>
        
        <category>Database</category>
        
      </item>
    
  </channel>
</rss>
