Handling ACH Returns with Missing or Broken Data
Prevent wrong-customer reversals by forcing every return through: immutable raw evidence → tolerant parsing → confidence scoring → conservative matching.

One-line summary: When a return is ambiguous, don't guess: preserve immutable raw evidence and route it to manual resolution. Auto-actions are allowed only when identity is unique and high-confidence.
Audience: Payments engineers, backend engineers, platform architects.
Reading time: 12–16 minutes.
Prerequisites: ACH returns basics, ability to run a single Ruby script.
Why now: The first time a bank portal export drops trace numbers, your “exact match or bust” system turns into guesswork—and guesswork reverses the wrong customer.
TL;DR:
- Store every inbound return as an immutable raw return system of record (even if parsing fails).
- Maintain a transmission system of record for what you sent (file/batch/entry evidence + hashes).
- Normalize into a Return Case with nullable fields + parse errors.
- Match with tiers (exact → strong → weak → manual review) and output a confidence score.
- Never auto-pick when there are multiple candidates.
- You need two idempotency layers: ingest and ledger actions.
⚠️ Disclaimer: All scenarios, accounts, names, and data used in examples are not real. They are realistic scenarios provided only for educational and illustrative purposes.
Problem Definition
ACH returns are supposed to be trackable. In reality, your “return feed” might be true NACHA, a processor’s transformed JSON, a bank portal CSV export, or an “exceptions report” that silently drops fields. That’s how you end up with returns missing:
- original trace number
- names
- payment identifiers
- batch headers / file headers
- amount (null/zero/mismatched)
- addenda context
When the feed is fully stripped, you’re no longer safely “processing ACH returns.” You’re processing a failure notice that may not be attributable to any specific payment without additional correlation.
The ugly truth: “last4_account_number + amount” fails at scale (and it fails earlier than people think)
❗ Warning: At 10K+ monthly transactions, you will absolutely see collisions on last4_account_number + amount (and even last4_account_number + amount + company_id if you originate repetitive amounts). If your system auto-matches on these heuristics without guardrails, you are statistically guaranteed to reverse the wrong customer.
Rule: If a heuristic yields multiple candidates, you must never auto-pick.
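To make the collision math concrete, here is a back-of-envelope simulation (a sketch; the volume and the four repetitive amounts are illustrative assumptions, not real data):
# collision_sketch.rb -- illustrative simulation, not real customer data
rng = Random.new(42)

transactions = Array.new(10_000) do
  {
    last4: format('%04d', rng.rand(10_000)),                      # 10,000 possible last4 values
    amount_cents: [999, 2_900, 4_900, 12_500].sample(random: rng) # repetitive amounts
  }
end

buckets = transactions.group_by { |t| [t[:last4], t[:amount_cents]] }
colliding = buckets.count { |_, txns| txns.size > 1 }
puts "Buckets where 2+ transactions share last4+amount: #{colliding}"
# ~40,000 possible keys for 10,000 transactions: expect on the order of a
# thousand colliding buckets. Each one is a wrong-customer reversal waiting
# to happen if you auto-match on it.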
When do you need this pattern?
Use this pattern if any of these are true:
- You’re a regulated entity (or audited like one) and need defensible evidence trails
- You process hundreds of returns per month
- Your return feed is not guaranteed to preserve trace numbers (portal exports, transformed files)
- You originate repetitive amounts at scale (subscriptions, tuition, payroll-like patterns)
If you’re early-stage:
- If you have <100 returns/month, it can be rational to manual-review everything (still store immutable raw events). You’re buying correctness while you learn your provider’s quirks.
Cost of getting it wrong (concrete, practical)
Wrong-customer reversals are expensive because they trigger cross-team investigation and reconciliation. The cost is typically hours of staff time plus any customer remediation.
ℹ️ Note: The exact number varies. The point is that “small” matching mistakes become expensive fast when your inbound return data isn’t clean.
Solution Implementation
This pattern has four non-negotiables:
- Immutable Raw Return System of Record: store the raw return payload exactly as received and never mutate it.
- Transmission System of Record (what you sent): maintain immutable evidence of outbound transmissions (file identifiers, hashes, batch/entry metadata, and the identifiers you generated at send-time).
- Confidence-Based Matching: matching is probabilistic; you must produce a confidence score and a rationale.
- Manual Review Lane: if identity is ambiguous, the system must say "I don't know."
Flow
flowchart TD
A["Inbound Return Feed (NACHA File / Portal Export / Processor JSON)"] --> B["Immutable Raw Return System of Record (Store Raw Payload + Delivery Metadata)"]
B --> C["Tolerant Parser (Best-Effort Normalization + Parse Error Capture)"]
C --> D["Return Case (Nullable Fields + Parse Errors + Confidence Score)"]
D --> E["Matching Engine (Exact Keys → Strong Heuristics → Weak Heuristics → Candidate Ranking)"]
E --> F["Ledger Action Gate (Idempotency for Ledger Actions + No Double-Reversal Rules)"]
E --> G["Manual Review Queue (Low Confidence / Multiple Candidates / Stripped Feed)"]
G --> H["Ops Resolution (Attach to Customer + Notes + Finalize)"]
F --> I["Accounting Ledger (Apply Reversal / Adjust Balance / Notify)"]
J["Outbound Transmission System of Record (File Hashes + Batch/Entry Evidence + Trace Numbers at Send-Time)"] --> E
The Three Risks You Must Treat as “Production Stoppers”
1) Last-4 collision math (don’t bury this)
❗ Warning: last4_account_number is a weak identifier. At scale, last4_account_number + amount is effectively a “bucket,” not an identity key. If you auto-match on it, you are committing to wrong-customer actions.
Rule: Multiple candidates → manual review.
2) Idempotency must exist in two places (and you need both)
❗ Warning: Ingest idempotency prevents duplicate events. Ledger-action idempotency prevents duplicate financial actions. You need both.
- Ingest idempotency: “Have I already recorded this raw return payload?”
- Ledger-action idempotency: “Have I already applied a reversal/adjustment for this Return Case?”
If you only do ingest idempotency: reprocessing can still double-reverse. If you only do ledger-action idempotency: you can still lose audit evidence and re-ingest noise incorrectly.
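A minimal sketch of the two layers (the key formats and in-memory store are illustrative assumptions; in production the ledger-action layer is usually a unique constraint in your ledger database):
require 'digest'

# Layer 1: ingest idempotency, keyed on what arrived
def ingest_key(source, filename, payload)
  Digest::SHA256.hexdigest("#{source}|#{filename}|#{payload}")
end

# Layer 2: ledger-action idempotency, keyed on what you are about to do.
# Derived from the Return Case and action type, NOT the raw payload:
# a re-delivered file must never produce a second reversal.
def ledger_action_key(return_case_id, action)
  "#{action}|return_case|#{return_case_id}"
end

APPLIED_LEDGER_ACTIONS = {} # stand-in for a unique index in the ledger DB

def apply_reversal(return_case_id)
  key = ledger_action_key(return_case_id, :reversal)
  return :already_applied if APPLIED_LEDGER_ACTIONS.key?(key)

  APPLIED_LEDGER_ACTIONS[key] = Time.now # record, then post to the ledger
  :applied
end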
3) The “fix corrupt data” trap (engineers WILL try this)
NEVER DO THIS (put this in your code review checklist)
❗ Warning:
- ❌ Never “fix” a corrupt trace number by guessing characters.
- ❌ Never auto-match if your heuristic returns multiple candidates.
- ❌ Never delete raw return events after parsing (“we normalized it already”).
- ❌ Never disable idempotency because “it’s just a backfill/test run.”
When data is corrupt, your job is to preserve evidence and reduce harm, not to fabricate certainty.
Identity Quality: How to behave when the feed is stripped
When you’ve seen missing names, missing payment identifiers, missing batch headers, and “any payment info fully stripped,” treat identity as a first-class field:
identity_quality = strong | medium | weak | none
Suggested mapping:
- strong: valid original trace number (or other immutable payment identifier) → safe to auto-action
- medium: unique match from strong heuristic (e.g., last4 + amount + company_id) → auto-action only if exactly one candidate
- weak: last4 + amount only → collisions likely → manual review unless you have additional constraints
- none: stripped (no trace, no amount, no account signal) → manual review only
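As a pure function, that mapping might look like the sketch below (field names are illustrative; uniqueness still has to be enforced by the matcher, not by this classifier):
def identity_quality(trace:, last4:, amount_cents:, company_id:)
  return :strong if trace                                # immutable payment identifier
  return :medium if last4 && amount_cents && company_id  # strong heuristic (must still be unique)
  return :weak   if last4 && amount_cents                # collision-prone bucket
  :none                                                  # stripped feed
end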
💡 Tip: If identity_quality = none, your best “keys” often come from delivery context (SFTP path, file naming, timestamps) plus your transmission system of record (what you sent) and any provider/bank “detail view” you can query.
Matching Engine: Tiered + Conservative + Recurrence Guardrail
Matching is tiered and conservative. We first match on a payment identifier (trace, internal payment ID). If missing, we use a batch identifier to narrow the search space to a known transmission. If that’s also missing, we reconstruct candidates from batch header context (SEC code, company ID, effective date window) and match using entry evidence (amount, account signal, discretionary data, name). Heuristic combinations are only allowed when they produce exactly one candidate. For recurring payments, we apply a 7–10 banking-day window to avoid misattributing a return to the wrong cycle. When in doubt, the system routes the case to manual review.
Design for Resilience
Discretionary Data as a Correlation Handle
If you populate discretionary_data or addenda fields with a stable internal reference (payment ID suffix, short hash), it becomes a semi-identifier that banks won’t strip. Future-you will thank past-you for embedding correlation handles in fields that survive export transformations.
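For example, you might derive the handle deterministically at send-time (a sketch; note that the raw NACHA entry-detail discretionary data field is only two characters, so longer handles usually belong in addenda or individual-ID fields; verify what your processor and bank exports actually preserve):
require 'digest'

# Deterministic: the same payment always yields the same handle, so you can
# recompute it during matching without a lookup table.
def correlation_handle(payment_id)
  "PAY_#{Digest::SHA256.hexdigest(payment_id.to_s)[0, 4]}"
end

correlation_handle('pmt_8f3a2c') # => "PAY_...." (stable for this payment ID)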
Core Pattern in One Runnable Ruby File
This is the pattern without infrastructure distractions. Copy-paste into ach_return_matcher.rb and run.
#!/usr/bin/env ruby
# ach_return_matcher.rb
# Demonstrates: raw return system of record, tolerant parsing, confidence-based matching
# Run: ruby ach_return_matcher.rb

require 'json'
require 'digest'
require 'date'

# ============================================================================
# DATA STRUCTURES (in-memory for demo)
# ============================================================================

# Simulates your outbound transmission system of record (what you sent)
# Added: file_id, batch_id, discretionary_data, is_recurring
OUTBOUND_TRANSMISSIONS_SOR = [
  {
    id: 1,
    file_id: 'FILE_20240817_A',
    batch_id: 'BATCH_0007',
    trace_number: '061000050001234',
    routing_number: '061000052',
    account_last4: '6789',
    amount_cents: 12_500,
    effective_date: '20240817',
    company_id: 'ACMEPAY001',
    discretionary_data: 'PAY_9f12', # correlation handle you embed at send-time
    is_recurring: true
  },
  # Another entry in the same batch to show narrowing behavior
  {
    id: 2,
    file_id: 'FILE_20240817_A',
    batch_id: 'BATCH_0007',
    trace_number: '061000050009999',
    routing_number: '061000052',
    account_last4: '1111',
    amount_cents: 9_900,
    effective_date: '20240817',
    company_id: 'ACMEPAY001',
    discretionary_data: 'PAY_2a88',
    is_recurring: true
  }
].freeze

# Raw returns system of record (immutable)
$raw_return_events = []

# Return cases (normalized, mutable)
$return_cases = []

# ============================================================================
# CORE PATTERN: Immutable Raw Return System of Record
# ============================================================================
def store_raw_return_event(source:, filename:, payload:)
  # Ingest idempotency key
  idempotency_key = Digest::SHA256.hexdigest("#{source}|#{filename}|#{payload}")
  return nil if $raw_return_events.any? { |e| e[:idempotency_key] == idempotency_key }

  event = {
    id: $raw_return_events.size + 1,
    source: source,
    filename: filename,
    received_at: Time.now,
    idempotency_key: idempotency_key,
    raw_payload: payload # NEVER modify this
  }
  $raw_return_events << event
  event[:id]
end
# ============================================================================
# TOLERANT PARSING: Accept broken data, record errors
# ============================================================================

# Treat empty strings as missing so "" doesn't masquerade as a real value
def presence(value)
  v = value&.strip
  v.nil? || v.empty? ? nil : v
end

def parse_return(raw_payload)
  errors = []
  begin
    data = JSON.parse(raw_payload)
  rescue JSON::ParserError
    return { normalized: {}, batch_context: {}, errors: ['invalid_json'] }
  end

  normalized = {
    return_code: presence(data['return_reason_code']),
    trace_number: presence(data['original_trace_number']),
    routing: presence(data['routing_number']),
    account_last4: presence(data['account_number_last4']),
    amount_cents: data['amount_cents'],
    settlement_date: presence(data['settlement_date']),
    company_id: presence(data['company_id']),
    discretionary_data: presence(data['discretionary_data'])
  }

  # Batch context can come from a provider envelope, filename mapping, or
  # extra fields in transformed feeds.
  batch_context = {
    file_id: presence(data['file_id']),
    batch_id: presence(data['batch_id'])
  }

  if normalized[:trace_number] && !normalized[:trace_number].match?(/^\d{15}$/)
    errors << 'invalid_trace_number'
    normalized[:trace_number] = nil
  end
  if normalized[:routing] && !normalized[:routing].match?(/^\d{9}$/)
    errors << 'invalid_routing'
    normalized[:routing] = nil
  end
  if normalized[:account_last4] && !normalized[:account_last4].match?(/^\d{4}$/)
    errors << 'invalid_last4'
    normalized[:account_last4] = nil
  end
  if normalized[:amount_cents] && (!normalized[:amount_cents].is_a?(Integer) || normalized[:amount_cents] < 0)
    errors << 'invalid_amount'
    normalized[:amount_cents] = nil
  end

  { normalized: normalized, batch_context: batch_context, errors: errors }
end
# ============================================================================
# CONFIDENCE-BASED MATCHING: Tier 0/1/2 + recurrence guardrail
# ============================================================================
# Tier 0: Primary identifier (payment_id, trace, processor txn id)
# Tier 1: Batch identifier (file_id + batch_id narrows search space)
# Tier 2: Batch header + entry evidence (when identifiers are missing)
# Recurrence check: 7-10 banking day window prevents wrong-cycle matches
#
# NOTE: This demo approximates "banking days" using calendar days for simplicity.
# In production, compute true business days using your bank holiday calendar.
def match_return(normalized, batch_context = {})
  # Tier 0: payment identifier (confidence = 1.0)
  if normalized[:trace_number]
    exact = OUTBOUND_TRANSMISSIONS_SOR.find { |e| e[:trace_number] == normalized[:trace_number] }
    return { status: :matched, entry_id: exact[:id], confidence: 1.0, rationale: 'payment_identifier' } if exact
  end

  # Tier 1: batch identifier (confidence = 0.95, narrows search space)
  if batch_context[:batch_id]
    candidates = OUTBOUND_TRANSMISSIONS_SOR.select { |e| e[:batch_id] == batch_context[:batch_id] }

    # Single entry in batch: confirm with amount before matching
    if candidates.size == 1 && normalized[:amount_cents] && candidates.first[:amount_cents] == normalized[:amount_cents]
      return { status: :matched, entry_id: candidates.first[:id], confidence: 0.95, rationale: 'batch_identifier' }
    end

    # If multiple in batch, use additional entry evidence to disambiguate (still conservative)
    if candidates.size > 1
      # Require amount evidence before narrowing: inside a batch, last4 alone
      # is still a bucket, and a nulled amount usually means corrupt entry evidence.
      unless normalized[:amount_cents]
        return { status: :needs_review, confidence: 0.6, rationale: 'missing_amount_in_batch', candidates: candidates.map { |c| c[:id] } }
      end

      # Strongest low-friction signals first (amount + account_last4 + discretionary_data)
      narrowed = candidates.select { |e| e[:amount_cents] == normalized[:amount_cents] }
      if normalized[:account_last4]
        narrowed = narrowed.select { |e| e[:account_last4] == normalized[:account_last4] }
      end
      if normalized[:discretionary_data]
        narrowed = narrowed.select { |e| e[:discretionary_data] == normalized[:discretionary_data] }
      end

      if narrowed.size == 1
        return { status: :matched, entry_id: narrowed.first[:id], confidence: 0.95, rationale: 'batch_identifier_with_entry_evidence' }
      elsif narrowed.size > 1
        return { status: :needs_review, confidence: 0.6, rationale: 'multiple_candidates_in_batch', candidates: narrowed.map { |c| c[:id] } }
      end
    end
  end

  # Tier 2: Batch header context + entry evidence (confidence = 0.85)
  if normalized[:account_last4] && normalized[:amount_cents] && normalized[:company_id]
    candidates = OUTBOUND_TRANSMISSIONS_SOR.select do |e|
      e[:account_last4] == normalized[:account_last4] &&
        e[:amount_cents] == normalized[:amount_cents] &&
        e[:company_id] == normalized[:company_id]
    end

    # Optional extra disambiguator if present
    if normalized[:discretionary_data]
      candidates = candidates.select { |e| e[:discretionary_data] == normalized[:discretionary_data] } if candidates.size > 1
    end

    # Recurrence guardrail: prevent wrong-cycle matches
    if candidates.size == 1
      candidate = candidates.first
      days_since_payment = (Date.today - Date.strptime(candidate[:effective_date], '%Y%m%d')).to_i
      # Concrete guardrail: recurring payments inside a 7-10 banking-day window should not auto-match
      if candidate[:is_recurring] && days_since_payment < 10
        return { status: :needs_review, confidence: 0.6, rationale: 'recurrence_cooldown_window' }
      end
      return { status: :matched, entry_id: candidate[:id], confidence: 0.85, rationale: 'batch_header_entry_evidence' }
    elsif candidates.size > 1
      return { status: :needs_review, confidence: 0.6, rationale: 'multiple_candidates', candidates: candidates.map { |c| c[:id] } }
    end
  end

  { status: :needs_review, confidence: 0.0, rationale: 'insufficient_identity' }
end
# ============================================================================
# RETURN CASE: Normalized view with match state
# ============================================================================
def create_return_case(raw_event_id:, normalized:, batch_context:, errors:, match_result:)
  identity_quality =
    if normalized[:trace_number]
      :strong
    elsif batch_context[:batch_id]
      :medium
    elsif normalized[:account_last4] && normalized[:amount_cents] && normalized[:company_id]
      :weak
    else
      :none
    end

  {
    id: $return_cases.size + 1,
    raw_event_id: raw_event_id,
    batch_id: batch_context[:batch_id],
    file_id: batch_context[:file_id],
    **normalized,
    parse_errors: errors,
    identity_quality: identity_quality,
    match_confidence: match_result[:confidence],
    matched_entry_id: match_result[:entry_id],
    status: match_result[:status],
    rationale: match_result[:rationale],
    candidates: match_result[:candidates],
    created_at: Time.now
  }.tap { |rc| $return_cases << rc }
end
# ============================================================================
# END-TO-END PROCESSOR
# ============================================================================
def process_return_file(source:, filename:, lines:)
  results = { processed: 0, matched: 0, needs_review: 0, duplicates: 0 }
  lines.each do |line|
    next if line.strip.empty?

    raw_event_id = store_raw_return_event(source: source, filename: filename, payload: line)
    if raw_event_id.nil?
      results[:duplicates] += 1
      next
    end

    parsed = parse_return(line)
    match_result = match_return(parsed[:normalized], parsed[:batch_context])
    create_return_case(
      raw_event_id: raw_event_id,
      normalized: parsed[:normalized],
      batch_context: parsed[:batch_context],
      errors: parsed[:errors],
      match_result: match_result
    )
    results[:processed] += 1
    match_result[:status] == :matched ? results[:matched] += 1 : results[:needs_review] += 1
  end
  results
end
# ============================================================================
# DEMO
# ============================================================================
sample_returns = [
  # Perfect match via trace (Tier 0)
  '{"return_reason_code":"R01","original_trace_number":"061000050001234","routing_number":"061000052","account_number_last4":"6789","amount_cents":12500,"settlement_date":"20251029","company_id":"ACMEPAY001","file_id":"FILE_20240817_A","batch_id":"BATCH_0007","discretionary_data":"PAY_9f12"}',
  # Missing trace, match via batch_id narrowing + entry evidence (Tier 1)
  '{"return_reason_code":"R03","original_trace_number":"","routing_number":"061000052","account_number_last4":"6789","amount_cents":12500,"settlement_date":"20251029","company_id":"ACMEPAY001","file_id":"FILE_20240817_A","batch_id":"BATCH_0007"}',
  # Corrupt trace + missing amount = needs review (insufficient entry evidence)
  '{"return_reason_code":"R19","original_trace_number":"06100005000123X","routing_number":"061000052","account_number_last4":"6789","amount_cents":null,"settlement_date":"20251029","company_id":"ACMEPAY001","batch_id":"BATCH_0007"}',
  # Fully stripped (no usable identity) = needs review (identity_quality :none)
  '{"return_reason_code":"R03"}',
  # Duplicate of first (should skip)
  '{"return_reason_code":"R01","original_trace_number":"061000050001234","routing_number":"061000052","account_number_last4":"6789","amount_cents":12500,"settlement_date":"20251029","company_id":"ACMEPAY001","file_id":"FILE_20240817_A","batch_id":"BATCH_0007","discretionary_data":"PAY_9f12"}'
]

puts "Processing ACH returns...\n\n"

results = process_return_file(
  source: 'SFTP_BANK_X',
  filename: 'RETURNS_20251029.ndjson',
  lines: sample_returns
)

puts "Results:"
puts " Processed: #{results[:processed]}"
puts " Matched: #{results[:matched]}"
puts " Needs Review: #{results[:needs_review]}"
puts " Duplicates Skipped: #{results[:duplicates]}"
puts "\n"

puts "Return Cases:\n"
$return_cases.each do |rc|
  puts " Case ##{rc[:id]}:"
  puts " Status: #{rc[:status]}"
  puts " Rationale: #{rc[:rationale]}"
  puts " Identity Quality: #{rc[:identity_quality]}"
  puts " Confidence: #{rc[:match_confidence]}"
  puts " Matched Entry: #{rc[:matched_entry_id] || 'none'}"
  puts " Candidates: #{rc[:candidates] ? rc[:candidates].join(', ') : 'n/a'}"
  puts " Parse Errors: #{rc[:parse_errors].empty? ? 'none' : rc[:parse_errors].join(', ')}"
  puts " File ID: #{rc[:file_id] || 'MISSING'}"
  puts " Batch ID: #{rc[:batch_id] || 'MISSING'}"
  puts " Trace: #{rc[:trace_number] || 'MISSING'}"
  puts " Last4: #{rc[:account_last4] || 'MISSING'}"
  puts " Amount: #{rc[:amount_cents] || 'MISSING'}"
  puts " Discretionary: #{rc[:discretionary_data] || 'MISSING'}"
  puts ""
end
Run it
ruby ach_return_matcher.rb
Expected outcomes:
- Exact trace match → matched (confidence 1.0, identity_quality strong)
- batch_id narrowing + entry evidence → matched (confidence 0.95, identity_quality medium)
- corrupt trace + missing amount → needs_review (identity_quality medium/weak depending on batch context)
- fully stripped return → needs_review (identity_quality none)
- duplicate payload → skipped (ingest idempotency)
Practical Thresholds & Defaults (so engineers don’t improvise)
Suggested auto-action threshold
- Auto-action allowed:
  - Tier 0 (payment identifier): confidence = 1.0 (strong)
  - Tier 1 (batch identifier + unique within batch): confidence ≥ 0.95 (medium)
  - Tier 2 (batch header + entry evidence): confidence ≥ 0.85, only if exactly one candidate and the recurrence guardrail passes
- Manual review required: confidence < 0.85, or more than one candidate, or identity_quality weak/none, or recurrence guardrail triggered
7–10 banking-day recurrence guardrail (make it policy, not folklore)
If a payment is marked recurring (subscriptions/tuition/payroll-like patterns), do not auto-match a return to a candidate whose effective date is within 7–10 banking days of “today” (or within your normal settlement/return latency window). Route to review unless you have a Tier 0 identifier.
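The demo approximates banking days with calendar days; in production, compute true banking days against your bank's holiday calendar. A minimal sketch (the holiday list is a placeholder assumption):
require 'date'

# Placeholder; substitute your bank's published holiday calendar
BANK_HOLIDAYS = [Date.new(2025, 1, 1), Date.new(2025, 7, 4)].freeze

def banking_days_between(from, to, holidays: BANK_HOLIDAYS)
  ((from + 1)..to).count do |d|
    !d.saturday? && !d.sunday? && !holidays.include?(d)
  end
end

# Recurring payment inside the window => do not auto-match, route to review
def recurrence_cooldown?(effective_date, today: Date.today, window_days: 10)
  banking_days_between(effective_date, today) < window_days
end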
Validation & Monitoring
Minimal validation checks
- Duplicate ingest: same payload twice → only one raw event, only one case
- Corrupt JSON → case created with invalid_json
- Invalid trace → trace nulled + parse error recorded
- Multiple candidates → must be needs_review
- Stripped feed → identity_quality none → must be needs_review
- Recurring within 7–10 banking days → must be needs_review unless Tier 0 match
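You can encode these checks as a throwaway script run against the demo file (a sketch using plain assertions; adapt to your test framework):
# validate_matcher.rb -- loads and runs the demo, then checks invariants
require_relative 'ach_return_matcher'

def check(label, cond)
  puts "#{cond ? 'PASS' : 'FAIL'}: #{label}"
end

check 'duplicate payload produced exactly one raw event',
      $raw_return_events.size == 4 # 5 inputs, 1 duplicate
check 'invalid trace was nulled and recorded as a parse error',
      $return_cases.any? { |rc| rc[:parse_errors].include?('invalid_trace_number') }
check 'stripped feed routed to review with identity_quality none',
      $return_cases.any? { |rc| rc[:identity_quality] == :none && rc[:status] == :needs_review }
check 'no auto-match ever carried multiple candidates',
      $return_cases.none? { |rc| rc[:status] == :matched && rc[:candidates] }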
Minimal metrics
- % matched at confidence ≥ 0.95 (Tier 1+) and % matched at 1.0 (Tier 0)
- % needs_review and % identity_quality=none
- % recurrence_cooldown_window
- duplicates skipped
- top parse error codes
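A sketch of deriving these from the demo's in-memory cases (in production, emit counters to your metrics pipeline instead):
total = $return_cases.size.to_f
pct = ->(n) { format('%.1f%%', 100 * n / total) }

puts "matched @ 1.0 (Tier 0):    #{pct.call($return_cases.count { |rc| rc[:match_confidence] == 1.0 })}"
puts "matched >= 0.95 (Tier 1+): #{pct.call($return_cases.count { |rc| rc[:status] == :matched && rc[:match_confidence] >= 0.95 })}"
puts "needs_review:              #{pct.call($return_cases.count { |rc| rc[:status] == :needs_review })}"
puts "identity_quality = none:   #{pct.call($return_cases.count { |rc| rc[:identity_quality] == :none })}"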
❗ Warning: Alert on sudden spikes in identity_quality=none or parse errors; this often indicates a provider/export format change.
Takeaways
- Raw evidence is sacred. If you can’t prove what you received, you can’t debug or defend it.
- The transmission system of record matters. If the return feed is stripped, what you sent is often the only reliable starting point.
- Heuristics are not identity. At scale, last4 collisions are guaranteed.
- Idempotency is two-layer. Ingest and ledger actions are different problems.
- Never “fix” corruption. Your system should become more conservative as data quality declines.
Next steps
- Add an immutable raw return store (even if it’s just an append-only table/S3 bucket + pointer).
- Ensure you have a transmission system of record (file hashes + entry evidence at send-time).
- Implement Tier 0/1/2 matching + recurrence guardrail.
- Ship a manual review lane before enabling auto-actions.
- Start embedding a discretionary/addenda correlation handle in outbound entries.
Acronyms & Definitions
- ACH : Automated Clearing House
- ODFI : Originating Depository Financial Institution
- RDFI : Receiving Depository Financial Institution
- System of record (SoR) : The authoritative, immutable evidence store for “what we received” and “what we sent”
- Accounting ledger : The financial book of record where reversals/adjustments are applied