onchainMay 7, 2026

A proof-of-concept algorithm that matches Tornado Cash deposits to withdrawals using a 4-point behavioral scoring system. Demonstrates how transaction attribution can break mixer privacy guarantees.

Tornado Cash Intelligent Demixer: Transaction Attribution Through Behavioral Analysis

What if we can build an algorithm, that matches deposits to withdrawals in a tornado cash trasaction? Doesn’t that break the whole privacy aspect that it promises? This tutorial build a POC trying to match deposits-withdrawals using a 4-point scoring system.

I received some valuable feedback for this POC in my linkedin post, which I plan to implement soon.

In the spirit of literate programming, this article will have the code files linked in raw format. Just as Donald Knuth’s programs weave together code and explanation, we’ll explore how to match deposits to withdrawals, detect behavioral patterns, and identify network connections in privacy-preserving transactions.

Repository: tornado-cash-intelligent-demixer

Overview

Tornado Cash is a privacy-preserving protocol that breaks the link between deposit and withdrawal addresses using zero-knowledge proofs. However, privacy can be compromised through behavioral patterns, timing analysis, and network graph analysis. This methodology extends the same fund-flow tracing approach used in the Hedgey Finance exploit post-mortem and the TRON mule offramp investigation. This tool demonstrates how to:

Architecture

The system consists of four main components:

1. Data Fetching Layer

The afetch.py module handles all interactions with the Bitquery GraphQL API. It retrieves both transfer events and contract events (Deposit/Withdrawal) to build a complete picture of Tornado Cash activity.

@dataclass
class TornadoTransaction:
    """Represents a Tornado Cash transaction"""
    tx_hash: str
    from_address: str
    to_address: str
    value: str
    block_time: str
    gas: int
    call_signature: str
    transaction_type: str  # 'deposit' or 'withdraw'
    commitment: str = None  # bytes32 from Deposit event
    nullifier: str = None  # bytes32 from Withdrawal event
    recipient: str = None  # address from Withdrawal event
    relayer: str = None  # address from Withdrawal event
    fee: str = None  # uint256 from Withdrawal event

The BitqueryFetcher class provides two primary methods:

2. Configuration Management

The config.py module maintains:

# Default analysis configuration
DEFAULT_TIME_TOLERANCE_SECONDS = 7200  # 2 hours
DEFAULT_NETWORK_WINDOW_DAYS = 14
VALUE_TOLERANCE_PERCENT = 0.01  # 1% tolerance for matching amounts

3. Scoring Algorithm

The scoring.py module implements the matching heuristics. The core function calculate_match_score() combines multiple signals:

def calculate_match_score(
    time_diff_seconds: float,
    tolerance_seconds: float,
    deposit_value: Optional[float],
    withdrawal_value: Optional[float],
    same_contract: bool,
    same_pool: bool,
) -> float:
    """Lower scores indicate better matches"""
    time_score = time_diff_seconds / tolerance_seconds
    amount_score = abs(deposit_value - withdrawal_value) / deposit_value
    contract_bonus = 0.0 if same_contract else 0.3
    pool_bonus = 0.0 if same_pool else 0.5
    return time_score + amount_score + contract_bonus + pool_bonus

Scoring factors:

4. Core Analysis Engine

The tornado_analyzer.py module orchestrates the analysis. The TornadoCashAnalyzer class provides several key methods:

Matching Deposits to Withdrawals

The match_deposits_withdrawals() method implements a greedy matching algorithm:

def match_deposits_withdrawals(
    self,
    tolerance_seconds: int = 1209600,  # 2 weeks
    value_tolerance_percent: float = 0.05
) -> List[Dict]:
    """One-to-one matching using greedy algorithm with scoring"""
    # Generate all candidate pairs
    candidates = []
    for deposit in self.deposits:
        for withdrawal in self.withdrawals:
            if withdrawal_time > deposit_time:
                if time_diff <= tolerance_seconds:
                    score = calculate_match_score(...)
                    candidates.append({...})

    # Sort by score and greedily match
    candidates.sort(key=lambda x: x['score'])
    matches = []
    matched_indices = set()
    for candidate in candidates:
        if candidate['deposit_idx'] not in matched_indices:
            matches.append(candidate)
            matched_indices.add(...)

    return matches

Matching constraints:

  1. Withdrawal must occur after deposit
  2. Time difference within tolerance window (default: 2 weeks)
  3. Amounts match within tolerance (accounts for relayer fees)
  4. One-to-one matching (each deposit/withdrawal matched at most once)

Address Reuse Detection

Privacy is compromised when addresses appear in multiple transactions:

def find_address_reuse(self, transactions: List[TornadoTransaction]) -> Dict[str, int]:
    """Find addresses appearing in multiple transactions"""
    address_counts = Counter()
    for tx in transactions:
        address_counts[tx.from_address] += 1
        address_counts[tx.to_address] += 1
    return {addr: count for addr, count in address_counts.items() if count > 1}

Network Pattern Analysis

The analyze_network_patterns() method builds temporal graphs to identify connected addresses:

def analyze_network_patterns(self, window_days: int = 14) -> Dict:
    """Analyze connections between addresses within time windows"""
    # Group transactions by time windows
    # Build adjacency lists for addresses
    # Identify clusters and patterns

Relayer Analysis

Relayers facilitate withdrawals by paying gas fees. Analysis includes:

def analyze_relayers(
    self,
    contract_addresses: List[str],
    limit: int = 1000,
    network: str = "eth"
) -> Dict:
    """Analyze relayer behavior and patterns"""
    withdrawals = self.get_withdrawal_events(...)
    relayer_stats = defaultdict(lambda: {
        'count': 0,
        'total_fees': 0,
        'recipients': set()
    })
    # Aggregate statistics by relayer address

Nullifier Analysis

Nullifiers prevent double-spending. The analysis tracks:

def analyze_nullifiers(
    self,
    contract_addresses: List[str],
    limit: int = 1000,
    network: str = "eth"
) -> Dict:
    """Analyze nullifier usage patterns"""
    withdrawals = self.get_withdrawal_events(...)
    nullifier_counts = Counter(tx.nullifier for tx in withdrawals)
    # Detect potential double-spends

5. Web Interface

The app.py module provides a Flask-based web UI with several endpoints:

The UI displays:

How De-Mixing Works

The de-mixing algorithm combines multiple heuristics:

1. Temporal Analysis

Transactions occurring close in time are more likely to be related. The default tolerance is 2 weeks, but this can be configured.

2. Value Matching

Withdrawals should match deposit amounts (minus relayer fees). The algorithm allows a 5% tolerance to account for:

3. Contract/Pool Matching

Transactions using the same contract or pool denomination are more likely to be related. The scoring algorithm heavily weights this factor.

4. Behavioral Patterns

5. Greedy Matching Algorithm

The algorithm uses a greedy approach:

  1. Generate all candidate deposit-withdrawal pairs
  2. Score each candidate using the heuristics
  3. Sort candidates by score (lower is better)
  4. Greedily match pairs, ensuring one-to-one correspondence

Usage Example

from tornado_analyzer import TornadoCashAnalyzer
import config

# Initialize analyzer
analyzer = TornadoCashAnalyzer(
    oauth_token="your_bitquery_token",
    network="eth"
)

# Fetch transactions
contracts = config.get_tornado_cash_addresses("eth")
analyzer.get_deposits(contracts, limit=1000)
analyzer.get_withdrawals(contracts, limit=1000)

# Match deposits to withdrawals
matches = analyzer.match_deposits_withdrawals(
    tolerance_seconds=1209600,  # 2 weeks
    value_tolerance_percent=0.05
)

# Analyze patterns
reused_addresses = analyzer.find_address_reuse(
    analyzer.deposits + analyzer.withdrawals
)
network_patterns = analyzer.analyze_network_patterns(window_days=14)

# Generate report
report = analyzer.generate_report(contracts, limit=1000, network="eth")

Key Insights

Privacy Compromises

  1. Address Reuse: Using the same address for multiple deposits/withdrawals links transactions
  2. Timing Patterns: Rapid deposits and withdrawals suggest coordinated activity
  3. Amount Patterns: Using exact same amounts across transactions creates patterns
  4. Relayer Usage: Consistent relayer usage can link transactions

Limitations

Conclusion

This tool demonstrates how behavioral analysis can reveal patterns in privacy-preserving protocols. While Tornado Cash’s cryptographic guarantees remain strong, metadata analysis can provide insights for:

The codebase follows literate programming principles, with each module clearly documented and linked. The implementation is modular, allowing researchers and developers to extend the analysis with additional heuristics or integrate it into larger systems.

Explore the code:

Repository: tornado-cash-intelligent-demixer


Read more from Cryptogrammar