
Nodalync Protocol

A protocol for fair knowledge economics in the age of AI.


Abstract

We propose a protocol for knowledge economics that ensures original contributors receive perpetual, proportional compensation from all downstream value creation. A researcher can publish valuable findings once and receive perpetual royalties as the ecosystem builds upon their work. A writer’s insights compound in value as others synthesize and extend them.

The protocol enables humans to benefit from knowledge compounding—earning from what they know, not just what they continuously produce.

Key Features

  • Cryptographic Provenance — Every piece of knowledge carries its complete derivation history
  • Fair Economics — 95% of transaction value flows to foundational contributors
  • Local-First — Your data stays on your machine, under your control
  • AI-Native — MCP integration for seamless AI agent consumption
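
The 95/5 split above can be illustrated with a toy calculation. This is a hedged sketch, not protocol code; it assumes integer tinybar arithmetic (1 HBAR = 100,000,000 tinybars) and a query whose 95% pool is shared equally by two root contributors.

```shell
# Toy illustration of the 95/5 split (assumptions: integer tinybar math,
# two equal root contributors; not the actual nodalync-econ code).
price_tinybar=1000000                    # 0.01 HBAR in tinybars
synth=$(( price_tinybar * 5 / 100 ))     # 5% synthesis fee
pool=$(( price_tinybar - synth ))        # 95% foundational pool
share=$(( pool / 2 ))                    # two equal root entries
echo "synthesizer=$synth each_root=$share"
```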

Quick Navigation

Getting Started

  • Quick Start — Get your node running in under 5 minutes
  • FAQ — Common questions answered

Protocol

Module Documentation

Applications

  • CLI — Command-line interface
  • MCP Server — AI agent integration

Protocol Layers

| Layer | Name | Contents | Properties |
|---|---|---|---|
| L0 | Raw Inputs | Documents, transcripts, notes | Immutable, publishable, queryable |
| L1 | Mentions | Atomic facts with L0 pointers | Extracted, visible as preview |
| L2 | Entity Graph | Entities + RDF relations | Internal only, never shared |
| L3 | Insights | Emergent patterns and conclusions | Shareable, importable as L0 |

Current Status

Protocol v0.7.1 · CLI v0.10.1

| Layer | Crate | Description |
|---|---|---|
| Protocol | nodalync-crypto | Hashing (SHA-256), Ed25519 signing, PeerId derivation |
| Protocol | nodalync-types | All data structures including L2 Entity Graph |
| Protocol | nodalync-wire | Deterministic CBOR serialization, 21 message types |
| Protocol | nodalync-store | SQLite manifests, filesystem content, settlement queue |
| Protocol | nodalync-valid | Content, provenance, payment, L2 validation |
| Protocol | nodalync-econ | 95/5 revenue distribution, Merkle batching |
| Protocol | nodalync-ops | CREATE, DERIVE, BUILD_L2, MERGE_L2, QUERY |
| Protocol | nodalync-net | libp2p (TCP/Noise/yamux), Kademlia DHT, GossipSub |
| Protocol | nodalync-settle | Hedera settlement, smart contract deployed to testnet |
| App | nodalync-cli | Full CLI with daemon mode, health endpoints, alerting |
| App | nodalync-mcp | MCP server for AI agent integration |

Hedera Testnet

| Resource | Value |
|---|---|
| Contract ID | 0.0.7729011 |
| EVM Address | 0xc6b4bFD28AF2F6999B32510557380497487A60dD |
| HashScan | View Contract |

Nodalync Quick Start

Get your node running and connected to the network in under 5 minutes.

Installation

Choose one of three options:

Option A: Install Script

macOS / Linux:

curl -fsSL https://raw.githubusercontent.com/gdgiangi/nodalync-protocol/main/install.sh | sh

Windows (PowerShell):

irm https://raw.githubusercontent.com/gdgiangi/nodalync-protocol/main/install.ps1 | iex

Or download the latest .exe from Releases and add it to your PATH.

This auto-detects your platform and installs the latest binary with full Hedera settlement support.

Option B: Docker

# Pull or build the image
docker build -t nodalync:latest https://github.com/gdgiangi/nodalync-protocol.git

# Initialize your identity
docker run -it \
  -e NODALYNC_PASSWORD=your-secure-password \
  -v ~/.nodalync:/home/nodalync/.nodalync \
  nodalync:latest init

# Start your node
docker run -d --name nodalync-node \
  -e NODALYNC_PASSWORD=your-secure-password \
  -v ~/.nodalync:/home/nodalync/.nodalync \
  -p 9000:9000 \
  nodalync:latest start

Option C: Build from Source

Requires Rust 1.88+ (and protoc for Hedera support):

# Clone the repo
git clone https://github.com/gdgiangi/nodalync-protocol.git
cd nodalync-protocol

# Build release binary with Hedera support (default, requires protoc)
cargo build --release -p nodalync-cli

# Or build without Hedera support (smaller binary)
cargo build --release -p nodalync-cli --no-default-features

# Add to PATH (no sudo needed)
export PATH="$PWD/target/release:$PATH"

# Or install system-wide
sudo cp target/release/nodalync /usr/local/bin/

Pre-built binaries are also available at Releases.


Step 1: Initialize Your Identity

Set a password and initialize your node identity:

export NODALYNC_PASSWORD=your-secure-password
nodalync init

This will:

  • Generate an Ed25519 keypair (your identity)
  • Create a default configuration file (connects to bootstrap nodes automatically)
  • Set up local storage (SQLite database, content directory)

Note: init fails if an identity already exists. To reinitialize, delete your data directory first (see Troubleshooting below for paths) or use nodalync init --wizard in an interactive terminal to auto-reinitialize.

For an interactive experience that lets you configure network settings, pricing, and settlement mode step by step, use the wizard:

nodalync init --wizard

Check your identity:

nodalync whoami

Step 2: Start Your Node

Foreground mode (see logs, Ctrl+C to stop):

nodalync start

Background mode (daemon):

nodalync start --daemon
nodalync status    # Check status
nodalync stop      # Stop the node

Your node will automatically:

  • Connect to the bootstrap node
  • Discover other peers via DHT
  • Start serving your published content

Step 3: Publish Content

Share knowledge on the network:

# Publish a file with default settings
nodalync publish my-research.md

# Publish with custom price and metadata
nodalync publish my-research.md \
  --price 0.01 \
  --title "My Research Paper" \
  --visibility shared

Visibility levels:

  • private - Local only, never shared
  • unlisted - Available if someone knows the hash
  • shared - Announced to network (default)

List your published content:

nodalync list

Step 4: Search & Query Content

Search the network:

Search matches content titles, descriptions, and tags (not body text):

# Search local content by title/description/tags
nodalync search "research"

# Search entire network
nodalync search "research" --all

Preview content (free, shows metadata only):

nodalync preview <content-hash>

Query full content (paid):

nodalync query <content-hash>

# Save to file
nodalync query <content-hash> --output result.txt

Step 5: Check Your Earnings

When others query your content, you earn HBAR:

# View balance
nodalync balance

# View earnings breakdown
nodalync earnings

# Force settlement (batch payments on-chain)
nodalync settle

Claude / MCP Integration

Connect Claude to your Nodalync node for AI-powered knowledge queries.

Start the MCP Server

Basic (local content only):

nodalync mcp-server \
  --budget 1.0 \
  --auto-approve 0.01

With network search:

nodalync mcp-server \
  --budget 1.0 \
  --auto-approve 0.01 \
  --enable-network

With Hedera settlement (testnet):

nodalync mcp-server \
  --budget 1.0 \
  --auto-approve 0.01 \
  --enable-network \
  --hedera-account-id 0.0.XXXXX \
  --hedera-private-key ~/.nodalync/hedera.key \
  --hedera-contract-id 0.0.7729011 \
  --hedera-network testnet

Options:

  • --budget - Maximum HBAR for this session (default: 1.0)
  • --auto-approve - Auto-approve queries below this price (default: 0.01)
  • --enable-network - Search network peers, not just local content
  • --hedera-account-id - Your Hedera account ID for settlement
  • --hedera-private-key - Path to your Hedera private key file
  • --hedera-contract-id - Settlement contract ID (default: 0.0.7729011)
  • --hedera-network - Network to use: testnet, mainnet, or previewnet

Configure Claude Desktop

Add to your Claude Desktop config (~/.config/claude/mcp.json or similar):

{
  "mcpServers": {
    "nodalync": {
      "command": "nodalync",
      "args": ["mcp-server", "--budget", "1.0", "--auto-approve", "0.01", "--enable-network"],
      "env": {
        "NODALYNC_PASSWORD": "your-secure-password",
        "NODALYNC_HEDERA_ACCOUNT_ID": "0.0.7703962",
        "NODALYNC_HEDERA_CONTRACT_ID": "0.0.7729011",
        "NODALYNC_HEDERA_KEY_PATH": "/Users/you/.nodalync/hedera.key"
      }
    }
  }
}

Note: The private key must be DER-encoded ECDSA format (98 hex characters starting with 303002...).

MCP Tools

When the MCP server is running, AI agents have access to these tools:

| Tool | Description |
|---|---|
| query_knowledge | Query content by hash or natural language (paid) |
| list_sources | Browse available content with metadata |
| search_network | Search connected peers for content (requires --enable-network) |
| preview_content | View content metadata without paying |
| publish_content | Publish new content from the agent |
| synthesize_content | Create L3 synthesis from multiple sources |
| update_content | Create a new version of existing content |
| delete_content | Delete content and set visibility to offline |
| set_visibility | Change content visibility |
| list_versions | List all versions of a content item |
| get_earnings | View earnings breakdown by content |
| status | Node health, budget, channels, and Hedera status |
| deposit_hbar | Deposit HBAR to the settlement contract |
| open_channel | Open a payment channel with a peer |
| close_channel | Close a payment channel |
| close_all_channels | Close all open payment channels |

Local Multi-Node Testing

Test the full publish-query-payment flow across three local nodes using Docker.

Prerequisites: Docker, Docker Compose, and jq (for make test) installed.

Quick Version

# 1. Build the Docker image first (from repo root — this must complete before init)
docker compose build

# 2. Initialize node identities and configs (uses the image you just built)
cd infra/local && make init

# 3. Start the 3-node cluster
make up

# 4. Run the end-to-end test (publish on node1, query from node3)
make test

Important: Step 1 (docker compose build) must complete before Step 2 (make init), because make init runs docker run to generate identities using the built image.

What This Creates

| Container | Role | Host Ports | Internal IP |
|---|---|---|---|
| nodalync-node1 | Bootstrap / seed node | 9001, 8081 | 172.28.0.10 |
| nodalync-node2 | Alice (publisher) | 9002, 8082 | 172.28.0.11 |
| nodalync-node3 | Bob (querier) | 9003, 8083 | 172.28.0.12 |

All nodes use the password testpassword and form a full-mesh via libp2p.

Manual Interaction

# Run any CLI command on a specific node
docker exec -e NODALYNC_PASSWORD=testpassword nodalync-node1 nodalync status
docker exec -e NODALYNC_PASSWORD=testpassword nodalync-node2 nodalync list

# Open a shell inside a node
cd infra/local && make shell-node1

# Publish test content on node1
make publish-test

# View logs
make logs

# Stop the cluster
make down

# Full reset (remove data + reinitialize)
make reset

Available Makefile Targets

Run cd infra/local && make help to see all targets:

| Target | Description |
|---|---|
| make init | Generate node identities and configs (required first) |
| make up | Start the 3-node cluster |
| make down | Stop the cluster |
| make logs | Follow cluster logs |
| make status | Show cluster status and peer IDs |
| make test | Run E2E tests (publish, propagate, query) |
| make clean | Remove containers, volumes, and generated configs |
| make reset | Clean + init (fresh start) |
| make shell-node1 | Open shell in node1 (also node2, node3) |
| make publish-test | Publish test content on node1 |

Two Docker Compose Files

There are two docker-compose.yml files in this repo, each for a different purpose:

| File | Location | Used by | Service names | When to use |
|---|---|---|---|---|
| docker-compose.yml | Repo root | docker compose build | node-bootstrap, node-alice, node-bob | Building the image and custom setups |
| docker-compose.yml | infra/local/ | make up/down/logs | node1, node2, node3 | Standard 3-node testing via Makefile |

Typical workflow: Run docker compose build from the repo root to build the image, then use cd infra/local && make init && make up for the standard 3-node cluster. The Makefile targets use the infra/local/docker-compose.yml internally.

The root docker-compose.yml is useful if you want to customize the cluster (add more nodes, change ports, or integrate with other services). It references infra/local/ for data and configs, so you still need make init first.

Warning: Do not run both compose files at the same time. They use overlapping container names and ports, so running one while the other is active will cause conflicts. Use make down (or docker compose down from the root) to stop one before starting the other.


Manual Two-Node Testing (No Docker)

Test the full publish-query-payment flow between two local nodes without Docker. This requires two terminal windows and uses separate data directories for each node.

1. Set Up Two Identities

# Terminal A — Node A (publisher)
export NODALYNC_PASSWORD=testpassword
export NODALYNC_DATA_DIR=/tmp/nodalync-node-a
nodalync init

# Terminal B — Node B (querier)
export NODALYNC_PASSWORD=testpassword
export NODALYNC_DATA_DIR=/tmp/nodalync-node-b
nodalync init

2. Get Node A’s Peer ID and Start It

# Terminal A
nodalync whoami
# Note the libp2p PeerId (12D3KooW...)

nodalync start
# Note the listening address, e.g. /ip4/127.0.0.1/tcp/9000

3. Configure Node B to Bootstrap from Node A

Edit Node B’s config to point at Node A as a bootstrap peer:

# Terminal B — edit the config file
# The config is at $NODALYNC_DATA_DIR/config.toml

In config.toml, set the bootstrap node to Node A’s address:

[network]
bootstrap_nodes = [
    "/ip4/127.0.0.1/tcp/9000/p2p/<NODE_A_PEER_ID>"
]
listen_addresses = ["/ip4/0.0.0.0/tcp/9001"]

Replace <NODE_A_PEER_ID> with the libp2p PeerId from step 2. Note the different listen port (9001) to avoid conflicts.
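If it helps, the bootstrap entry is just Node A's listen address with its PeerId appended as a `/p2p/` component; a quick sketch (the PeerId below is a placeholder, substitute the value from `nodalync whoami`):

```shell
# Compose the bootstrap multiaddr from Node A's listen address and PeerId.
# Placeholder PeerId; replace with the real 12D3KooW... value.
node_a_peer_id="12D3KooWEXAMPLEPEERID"
echo "/ip4/127.0.0.1/tcp/9000/p2p/${node_a_peer_id}"
```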

4. Publish Content on Node A

# Terminal A (keep the node running in another terminal, or use --daemon)
# If running in foreground, open a third terminal with the same env vars:
export NODALYNC_PASSWORD=testpassword
export NODALYNC_DATA_DIR=/tmp/nodalync-node-a

echo "This is test knowledge content for the Nodalync network." > /tmp/test-content.txt
nodalync publish /tmp/test-content.txt --price 0.01 --title "Test Content"
# Note the content hash from the output

5. Start Node B and Search the Network

# Terminal B
nodalync start
# Wait a few seconds for peer discovery

# In another terminal with Node B's env:
export NODALYNC_PASSWORD=testpassword
export NODALYNC_DATA_DIR=/tmp/nodalync-node-b

nodalync search "Test Content" --all

6. Open a Payment Channel (Requires Hedera)

If you have Hedera testnet credentials configured:

# Terminal B — open a channel to Node A
nodalync open-channel <NODE_A_PEER_ID> --deposit 100

7. Query Content and Verify Payment

# Terminal B — query the content (paid)
nodalync query <CONTENT_HASH>

# Verify payment on both sides
# Terminal A:
nodalync earnings
# Terminal B:
nodalync balance

8. Clean Up

# Stop both nodes (Ctrl+C if foreground, or nodalync stop if daemon)
rm -rf /tmp/nodalync-node-a /tmp/nodalync-node-b /tmp/test-content.txt

Environment Variables

The easiest way to configure Hedera is to use the .env file in the repo root:

# Export all variables from .env
set -a && source .env && set +a

| Variable | Description |
|---|---|
| NODALYNC_PASSWORD | Identity encryption password |
| NODALYNC_DATA_DIR | Data directory (default: platform-specific, see note below) |
| RUST_LOG | Log level (e.g., nodalync=debug) |
| HEDERA_ACCOUNT_ID | Hedera account ID (e.g., 0.0.7703962) |
| HEDERA_CONTRACT_ID | Settlement contract ID (default: 0.0.7729011) |
| HEDERA_PRIVATE_KEY | DER-encoded ECDSA private key as inline hex string (see note below) |

Note: The variables above (HEDERA_*) are read by the CLI settlement path. The MCP server subcommand reads NODALYNC_HEDERA_* prefixed variants (e.g., NODALYNC_HEDERA_ACCOUNT_ID). See nodalync mcp-server --help.
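If you keep a single set of credentials, one option is to mirror the CLI-style variables into the MCP-prefixed names so both subcommands see the same values. A sketch with placeholder IDs (variable names as in the note above and the Claude Desktop example; the values are illustrative only):

```shell
# Mirror CLI settlement variables into the MCP server's prefixed variants.
# Placeholder account/contract IDs; substitute your own.
export HEDERA_ACCOUNT_ID="0.0.7703962"
export HEDERA_CONTRACT_ID="0.0.7729011"
export NODALYNC_HEDERA_ACCOUNT_ID="$HEDERA_ACCOUNT_ID"
export NODALYNC_HEDERA_CONTRACT_ID="$HEDERA_CONTRACT_ID"
echo "$NODALYNC_HEDERA_ACCOUNT_ID $NODALYNC_HEDERA_CONTRACT_ID"
```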

Hedera Private Key Format

IMPORTANT: HEDERA_PRIVATE_KEY is an inline hex string (not a file path). Smart contract operations require ECDSA keys with DER encoding.

| Format | Length | Example Prefix | Works? |
|---|---|---|---|
| DER-encoded ECDSA | 98 hex chars | 3030020100300706052b8104000a04220420... | Yes |
| Raw hex (Ed25519) | 64 hex chars | d21f3bfe69929b1d6e0f37fa9622b96f... | No |

If you have a raw hex key, you need to DER-encode it. Check your account type at HashScan: https://hashscan.io/testnet/account/<account_id>
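Before pointing anything at a key file, a quick shell check can distinguish DER from raw hex. This is a hedged sketch: it only inspects the DER prefix shown in the table above and assumes the key file holds a single hex string (possibly with surrounding whitespace).

```shell
# Sanity-check a key file: DER-encoded ECDSA keys begin with 303002...
keyfile="$HOME/.nodalync/hedera.key"
key=$(tr -d '[:space:]' < "$keyfile")
case "$key" in
  303002*) echo "looks DER-encoded (${#key} hex chars)" ;;
  *)       echo "raw hex or unknown format; DER-encode before use" ;;
esac
```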

To create a Hedera testnet account, visit the Hedera Portal.

CLI vs MCP Key Formats

The CLI and MCP server handle Hedera private keys differently:

| Context | Variable / Flag | Value Type |
|---|---|---|
| CLI settlement | HEDERA_PRIVATE_KEY env var | Inline hex string (DER-encoded ECDSA) |
| MCP server | --hedera-private-key flag or NODALYNC_HEDERA_KEY_PATH env var | File path to a key file on disk |

If using the MCP server, write your key to a file first and pass the path, rather than the inline hex value.


Auto-Deposit (Payment Channels)

When running a node that serves paid content, you need HBAR deposited to the settlement contract before you can accept payment channels from other peers.

MIGRATION (v0.8.x): auto_deposit now defaults to false for security. To restore previous behavior, explicitly set auto_deposit = true in your config.

How It Works

When auto-deposit is enabled, your node will automatically:

  1. On startup: Check if the contract balance is below the minimum (default: 100 HBAR), and deposit if needed (default: 200 HBAR)
  2. On channel acceptance: When a peer tries to open a channel with you, auto-deposit if balance is insufficient and cooldown has elapsed
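The decision in steps 1 and 2 can be sketched as simple threshold logic. This is an illustration only, with made-up numbers; the real checks live in the node.

```shell
# Illustrative auto-deposit decision: deposit a fixed configured amount when
# the contract balance is below the minimum and the cooldown has elapsed.
balance_hbar=80
min_hbar=100
deposit_hbar=200
cooldown_elapsed=1    # 1 = cooldown has elapsed
if [ "$balance_hbar" -lt "$min_hbar" ] && [ "$cooldown_elapsed" -eq 1 ]; then
  echo "deposit $deposit_hbar HBAR"
else
  echo "no deposit needed"
fi
```

Note that the deposit amount is fixed by configuration, matching the "Fixed amount" security note below: it is never derived from the peer's request.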

Configuration

Configure auto-deposit behavior in your config.toml (in your data directory, see Troubleshooting section below for paths):

[settlement]
# Enable auto-deposit (default: false — opt-in for security)
auto_deposit = true

# Minimum balance to maintain in contract (in HBAR)
min_contract_balance_hbar = 100.0

# Amount to deposit when auto-deposit triggers (in HBAR)
auto_deposit_amount_hbar = 200.0

# Maximum deposit to accept/match per channel (in HBAR)
# Caps how much you'll commit when a peer opens a channel with you
max_accept_deposit_hbar = 500.0

Security Notes

  • Deposit cap: The max_accept_deposit_hbar setting limits how much you’ll commit per channel, regardless of what the peer requests
  • Cooldown: Auto-deposits are rate-limited (5 minute cooldown by default) to prevent spam-triggered deposits
  • Fixed amount: Auto-deposit always uses the configured amount, never an amount derived from the peer’s request
  • Cooldown resets on restart: The cooldown timer doesn’t persist across node restarts. The startup auto-deposit check handles the post-restart case separately.

Manual Control

To disable auto-deposit entirely:

[settlement]
auto_deposit = false

Then manually deposit as needed:

nodalync deposit 200

Common Commands Reference

Identity & Node

| Command | Description |
|---|---|
| nodalync init | Set up identity and config (add --wizard for interactive setup) |
| nodalync whoami | Show your identity |
| nodalync start | Start node (foreground) |
| nodalync start --daemon | Start node (background) |
| nodalync status | Show node status |
| nodalync stop | Stop daemon |
| nodalync completions <shell> | Generate shell completions (bash, zsh, fish, powershell) |

Content

| Command | Description |
|---|---|
| nodalync publish <file> [--price <hbar>] [--title "..."] | Publish content |
| nodalync update <hash> <file> | Create a new version of content |
| nodalync delete <hash> | Delete local content |
| nodalync visibility <hash> --level <level> | Change content visibility |
| nodalync versions <hash> | Show version history |
| nodalync list | List your content |
| nodalync search <query> [--all] | Search content (matches title/description/tags) |
| nodalync preview <hash> | View metadata (free) |
| nodalync query <hash> | Get full content (paid) |

Synthesis

| Command | Description |
|---|---|
| nodalync synthesize --sources <h1>,<h2> --output <file> | Create L3 synthesis |
| nodalync build-l2 <hash1> <hash2> ... | Build L2 entity graph from L1 sources |
| nodalync merge-l2 <graph1> <graph2> ... | Merge L2 entity graphs |
| nodalync reference <hash> | Reference external L3 as L0 source |

Economics & Channels

| Command | Description |
|---|---|
| nodalync balance | Check HBAR balance |
| nodalync earnings | View earnings breakdown |
| nodalync deposit <amount> | Deposit HBAR to protocol balance |
| nodalync withdraw <amount> | Withdraw HBAR from protocol balance |
| nodalync settle | Force settlement of pending payments |
| nodalync open-channel <peer-id> --deposit <amount> | Open payment channel (min 100 HBAR) |
| nodalync close-channel <peer-id> | Close payment channel (cooperative) |
| nodalync dispute-channel <peer-id> | Initiate dispute close (24h waiting period) |
| nodalync resolve-dispute <peer-id> | Resolve dispute after waiting period |
| nodalync list-channels | List all payment channels |

MCP

| Command | Description |
|---|---|
| nodalync mcp-server | Start MCP server for AI agents |

Bootstrap Node

Your node connects to this bootstrap node by default:

/dns4/nodalync-bootstrap.eastus.azurecontainer.io/tcp/9000/p2p/12D3KooWMqrUmZm4e1BJTRMWqKHCe1TSX9Vu83uJLEyCGr2dUjYm

Health check: http://nodalync-bootstrap.eastus.azurecontainer.io:8080/health


Troubleshooting

Default data directory:

The data directory varies by platform unless you set NODALYNC_DATA_DIR:

  • macOS: ~/Library/Application Support/io.nodalync.nodalync/
  • Linux: ~/.local/share/nodalync/ (or $XDG_DATA_HOME/nodalync/)
  • Windows: %APPDATA%\nodalync\nodalync\

Set NODALYNC_DATA_DIR to override: export NODALYNC_DATA_DIR=~/.nodalync

Node won’t start:

# Check if already running
nodalync status

# View logs (path shown when starting daemon)
cat ~/Library/Application\ Support/io.nodalync.nodalync/node.stderr.log  # macOS
cat ~/.local/share/nodalync/node.stderr.log  # Linux

Can’t connect to peers:

# Verify bootstrap node is reachable
curl http://nodalync-bootstrap.eastus.azurecontainer.io:8080/health

# Check your firewall allows TCP 9000

Reset everything:

# Remove data directory (check your platform above, or use your NODALYNC_DATA_DIR)
rm -rf ~/Library/Application\ Support/io.nodalync.nodalync/  # macOS
# rm -rf ~/.local/share/nodalync/  # Linux
nodalync init --wizard

Next Steps

Troubleshooting

Nodalync Protocol: Frequently Asked Questions

This document addresses common questions and concerns about the Nodalync protocol design.

Status Legend:

  • Designed — Addressed in protocol/implementation
  • Gap — Known limitation, not yet addressed
  • Deferred — Planned for future work
  • Out of Scope — Intentionally not part of the protocol
  • Known Limitation — Acknowledged tradeoff

Economic & Incentive Questions

1. Can People Game the System with Low-Effort Contributions?

Status: Designed

Concern: Users might add useless or trivial content just to insert themselves into provenance chains and collect unearned payments.

Answer: The protocol’s economic design makes this strategy unprofitable.

Revenue only flows when content is queried. Creating thousands of low-value nodes generates zero income because no one will query them. The market determines value through actual usage, not mere existence in the system.

From the whitepaper (Section 10.2 - Attribution Gaming):

“Revenue distributes only when content is queried. Creating thousands of unused nodes generates no income. The market determines value through actual queries.”

You cannot insert yourself into someone else’s provenance chain. Provenance chains are cryptographically computed when content is created. To be in someone’s root_L0L1[] array, your content must have been:

  1. Queried and paid for by the creator
  2. Used as a source in their derivation

The spec (Section 9.3) enforces that:

“All entries in derived_from MUST have been queried by creator”

This means you can’t retroactively attach yourself to successful content. The only way to earn is to create content valuable enough that others choose to query it and build upon it.

Synthesizers who don’t contribute foundational work earn only 5%. The protocol intentionally rewards original contribution over mere reorganization. A “pure synthesizer” using entirely others’ sources receives only the 5% synthesis fee—this is by design.


2. Will the Platform Get Flooded with Low-Quality Content?

StatusDesigned

Concern: Since uploading L0 content can lead to long-term payouts, people might spam the network with low-quality or copied content, burying valuable material.

Answer: Several mechanisms prevent spam from being profitable or visible.

Protocol-level mechanisms:

  1. Pricing as filter — spam is unprofitable if nobody queries it
  2. Rate limiting — configurable per peer/content hash via AccessControl
  3. Payment bonds — require_bond: bool in AccessControl can require deposits
  4. Reputation — PeerInfo.reputation: int64 tracked per peer
  5. Allowlist/denylist — per-content access control

Discovery is application-layer: The protocol itself doesn’t include search. Discovery occurs through application-layer indexes (search engines, directories, AI agents) that can implement their own quality filtering, reputation systems, and relevance ranking. From the spec (Section 1.4):

“Content discovery/search… Applications index L1 previews and build search UX”
“Content moderation — policy decisions for specific communities/jurisdictions” [Out of scope]

L1 previews enable informed decisions: Before paying for content, users see the L1 summary (extracted mentions, topics, preview). This free preview layer helps users evaluate relevance without payment, making it easy to skip low-quality content.

Philosophy: Bad content doesn’t get queried, therefore doesn’t earn. Applications build quality filters on top.


3. Can Someone Game L3 Provenance by Citing Sources They Don’t Actually Use?

Status: Known Limitation

Concern: Someone could query prestigious sources, claim them in their L3 provenance, but write completely unrelated content—essentially “name-dropping” for credibility.

Answer: This is a real concern. The protocol guarantees cryptographic provenance but not intellectual honesty.

What the protocol guarantees:

  • Cryptographic provenance chain exists
  • You can only claim sources you actually queried and paid for
  • Payment proof exists for every claimed source

What it does NOT guarantee:

  • That the L3 content actually uses the claimed sources intellectually
  • That the L3 is “good” or “honest” synthesis

Why the attack is limited:

  1. They still have to pay for every source they claim
  2. If their L3 is garbage, nobody queries it—no revenue
  3. The sources don’t “endorse” the L3—provenance just means “this L3 paid for access to these sources”

Potential future mitigations:

  • Semantic similarity checking between L3 and claimed sources (application layer)
  • Reputation for L3 creators based on downstream utility
  • ZK proofs for content derivation (mentioned in spec §13.4)

4. Does the Protocol Pay Equal Amounts for Unequal Work?

Status: Designed

Concern: The system pays everyone the same share per root entry, whether someone contributed a single sentence or comprehensive research. This seems unfair.

Answer: This is a deliberate design choice, not an oversight.

Each root entry represents a discrete contribution that was valuable enough to be used. If your single sentence was included in someone’s L3, that means they queried it, paid for it, and found it valuable enough to derive from. The protocol doesn’t judge contribution size—the market does.

The weighting system handles contribution frequency: From the spec (Section 4.5):

“When the same source appears multiple times in a provenance chain (through different derivation paths), it receives proportionally more: a source contributing twice receives twice the share.”

Quality is priced at the source: Content owners set their own prices. Comprehensive, high-quality research can be priced higher than trivial observations.

The alternative (contribution-weighted shares) creates worse problems:

  • Who decides what contribution is “worth more”? This requires subjective judgment.
  • Gaming becomes easier if you can inflate perceived contribution size.
  • Equal weighting is objective and trustless—a hash is either in the provenance chain or it isn’t.
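The quoted weighting rule (Section 4.5) is easy to check with toy numbers. An illustration only, with a made-up pool size in tinybars: a source appearing twice among three root entries receives two shares.

```shell
# Toy weighting example: source A appears twice in the chain, source B once.
pool=950000                      # the 95% pool of one query, in tinybars
entries=3                        # root entries: A, A, B
per_entry=$(( pool / entries ))  # equal share per entry (integer division)
echo "A=$(( per_entry * 2 )) B=$per_entry"
```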

5. Will Early Users Lock In Permanent Advantages?

Status: Designed (Intentional)

Concern: Those who publish first in any topic could lock in lifelong royalties, making it hard for newcomers to compete.

Answer: You’re correct that this is largely unavoidable—and it’s intentional.

This is a feature, not a bug. The protocol’s explicit goal is to reward foundational contributors perpetually. From the whitepaper abstract:

“A researcher can publish valuable findings once and receive perpetual royalties as the ecosystem builds upon their work.”

Factors that prevent an impenetrable moat:

  • New contributions create new chains: If you publish novel research, you create new provenance chains. Later contributors building on YOUR work include YOU in their chains.
  • Quality and relevance matter: Early publication doesn’t guarantee usage. Superior later work will be preferred by synthesizers.
  • Versioning supports improvement: The spec supports content versioning (Section 4.3). Updated versions can be published.

The alternative is worse: Systems that DON’T reward early contributors (like current academic publishing) create no economic incentive for foundational research at all.


6. Can Content Be Reused Forever After a Single Payment?

Status: Designed

Concern: Once someone pays for content, they can cache and reuse it infinitely without paying again. Creators only get paid once.

Answer: This is accurate and intentional.

You’re paying for access, not per-read: Like buying a book, the initial query gives you the content. Rereading your own copy doesn’t generate new payments.

New queries DO trigger new payments: From the whitepaper (Section 5.1):

“Subsequent queries to the same node (for updated information or different query parameters) trigger new payments.”

Derivation requires payment: Creating new L3 content that derives from cached sources still requires having queried (and paid for) each source at least once. From the spec (Section 7.1.5):

“All sources have been queried (payment proof exists)”

The value is in the provenance chain: When you use cached content to create an L3, and others query YOUR L3, revenue flows back through the entire provenance chain to original creators.

Unlimited re-reads would break usability: If every re-read required payment, the system would be unusable for research or synthesis work.


7. How Do Creators Know What to Charge?

Status: Gap

Concern: Without pricing guidance, creators may set inefficient prices.

Answer: The spec explicitly treats pricing as a market function:

“Pricing recommendations — market dynamics emerge from application-layer analytics”

What exists:

  • Economics struct tracks total_queries and total_revenue per content
  • This data is visible to anyone indexing the DHT

What’s missing:

  • No pricing suggestions in the protocol
  • Could be built as an application: “content similar to yours earns X HBAR/query on average”
  • Initial testing uses tiny prices (0.001 HBAR per query) to prove the flow

Unlike prior data marketplaces that failed attempting to solve pricing algorithmically, Nodalync treats price discovery as a market function rather than a protocol function.


Technical Questions

8. How Does Discovery Work Without Knowing the Hash?

Status: Designed

Concern: If content is addressed by hash, how do users find content they don’t already know about?

Answer: Discovery is an application-layer concern, with protocol primitives to support it.

Application developers can:
┌─────────────────────────────────────────────────────────────┐
│  SEARCH ENGINES                                             │
│  - Subscribe to ANNOUNCE broadcasts on DHT                  │
│  - Fetch free PREVIEW for all shared content                │
│  - Index L1 summaries, tags, content types                  │
│  - Build relevance ranking from total_queries, reputation   │
│  - Return content hashes → users query through protocol     │
└─────────────────────────────────────────────────────────────┘

The MCP server has a list_sources tool that shows available content with title, price, preview text, and topics.

Current gap: MCP doesn’t support natural language search yet—use list_sources to discover hashes. Full-text search would be an application-layer index.


9. How Is L1 Extraction Done — Manual or AI?

Status: Designed

Concern: How are atomic facts (L1 mentions) extracted from L0 documents?

Answer: Currently rule-based, with plugin architecture for AI extractors.

Current implementation: Rule-based NLP

pub trait L1Extractor {
    fn extract(&self, content: &[u8], mime_type: Option<&str>) -> Result<Vec<Mention>>;
}

/// Rule-based extractor for MVP
pub struct RuleBasedExtractor;

It splits text into sentences, does basic classification (Claim, Statistic, Definition, etc.), and extracts entities (capitalized words).
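A minimal, std-only sketch of this kind of rule-based pass (the sentence splitting, keyword classification, and capitalized-word heuristics here are illustrative, not the actual RuleBasedExtractor, and the types are simplified):

```rust
/// Illustrative rule-based extraction: split on sentence boundaries,
/// classify by simple keyword/digit heuristics, and collect
/// capitalized tokens as candidate entities.
#[derive(Debug, PartialEq)]
pub enum Classification {
    Claim,
    Statistic,
    Definition,
}

pub struct SimpleMention {
    pub content: String,
    pub classification: Classification,
    pub entities: Vec<String>,
}

pub fn extract_mentions(text: &str) -> Vec<SimpleMention> {
    text.split(|c: char| c == '.' || c == '!' || c == '?')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(|sentence| {
            // Digits or '%' suggest a statistic; "is"/"means" a definition.
            let classification = if sentence.contains('%')
                || sentence.chars().any(|c| c.is_ascii_digit())
            {
                Classification::Statistic
            } else if sentence.contains(" is ") || sentence.contains(" means ") {
                Classification::Definition
            } else {
                Classification::Claim
            };
            // Capitalized words (excluding the sentence-initial word)
            // become entity candidates.
            let entities = sentence
                .split_whitespace()
                .skip(1)
                .filter(|w| w.chars().next().map_or(false, |c| c.is_uppercase()))
                .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_string())
                .collect();
            SimpleMention {
                content: sentence.to_string(),
                classification,
                entities,
            }
        })
        .collect()
}
```

Poor heuristics here simply produce low-value L1s, which is exactly the market signal the spec relies on.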

Future design: Plugin architecture for AI-powered extractors:

pub trait L1ExtractorPlugin: Send + Sync {
    fn name(&self) -> &str;
    fn supported_mime_types(&self) -> Vec<&str>;
    fn extract(&self, content: &[u8], mime_type: &str) -> Result<Vec<Mention>>;
}

Quality enforcement: The spec says “AI extraction quality — pluggable extractors; quality is a market signal.” If your L1s are garbage, nobody queries them, you earn nothing.


10. What About the Cold Start / Chicken-and-Egg Problem?

Status: Designed

Concern: The network needs content to be valuable, but creators won’t publish until there’s demand.

Answer: The spec explicitly acknowledges this is an application-layer concern, not a protocol concern. The protocol provides primitives; bootstrap is left to implementations.

Practical solutions in the design:

  • L1 previews are free — anyone can browse without paying
  • Discovery through DHT ANNOUNCE broadcasts — search engines can subscribe and index
  • Initial plan: Seed with own content first (spec, whitepaper, technical docs), then dogfood with Claude
  • The MCP server lets AI agents query immediately — if even 1 person has good content, an AI can use it

Gap: No automated discovery UX yet. Intentional—prove the economics first, then build the search layer.


11. How Does the Protocol Scale to Millions of Nodes?

Status: Designed (Untested at Scale)

Concern: Can the system handle large-scale adoption?

Answer: DHT design from spec §11:

DHT: Kademlia
- Key space: 256-bit (SHA-256)
- Bucket size: 20
- Alpha (parallelism): 3
- Replication factor: 20

Kademlia scales logarithmically—lookups are O(log n). IPFS uses the same approach and handles millions of nodes.
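For intuition, the hop count grows roughly with log2 of the network size (illustrative arithmetic only; with bucket size 20 the constant factor is lower in practice):

```rust
/// Back-of-envelope Kademlia lookup cost: each hop roughly halves
/// the remaining XOR distance to the target key, so a lookup takes
/// on the order of log2(n) hops in an n-node network.
pub fn expected_hops(nodes: u64) -> u32 {
    (nodes as f64).log2().ceil() as u32
}
```

So a million-node network needs about 20 hops per lookup, not a million.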

Potential bottlenecks:

  • Settlement batching — batches currently settle at 100 HBAR or 1-hour intervals
  • GossipSub for announcements—needs tuning at scale
  • Bootstrap node capacity

Current testing: Single node. Multi-node testing is a future priority.


12. What Are the Privacy Implications?

Status: Known Concern

Concern: Can others monitor what I’m querying through the DHT?

Answer: From spec §13.4:

| Visible to Network | Hidden from Network |
|---|---|
| Content hashes (not content) | Private content (entirely local) |
| L1 previews (for shared content) | Query text (between querier and node) |
| Provenance chains | Unlisted content (unless you have hash) |
| Payment amounts (in settlement batches) | |

Current state: Your query goes directly to the content owner—not routed through random peers. But DHT lookups (finding where content lives) are visible.

Future improvements (from spec):

  • ZK proofs for provenance verification
  • Private settlement channels
  • Onion routing for query privacy

13. What If Hedera Fails? Is Multi-Chain Supported?

Status: Abstracted

Concern: The protocol is tied to Hedera. What happens if Hedera has issues?

Answer: Currently Hedera-specific, but abstracted behind a trait:

#[async_trait]
pub trait Settlement: Send + Sync {
    async fn settle_batch(&self, batch: &SettlementBatch) -> SettleResult<TransactionId>;
    async fn verify_settlement(&self, tx_id: &TransactionId) -> SettleResult<SettlementStatus>;
    async fn open_channel(&self, peer: &PeerId, deposit: u64) -> SettleResult<ChannelId>;
    async fn close_channel(&self, id: &ChannelId, ...) -> SettleResult<TransactionId>;
    // ... deposit, withdraw, dispute, account mapping, etc.
}

Why Hedera was chosen:

  • Fast finality (3-5 seconds)
  • Low cost (~$0.0001/tx)
  • High throughput (10,000+ TPS)
  • Good for micropayment batching

Multi-chain possibility: The Settlement trait could have implementations for Solana, Arbitrum/Optimism (L2s), or even Bitcoin Lightning.

Current priority: Prove the model works on one chain first, then generalize.


14. What Token Does Nodalync Use?

Status: Designed

Concern: Is there an NDL or DNL token?

Answer: Neither. The protocol uses HBAR directly (Hedera’s native token)—no native token.

From spec §12.4:

  • Eliminates token bootstrapping complexity
  • Leverages existing HBAR liquidity and exchanges
  • Avoids securities/regulatory concerns
  • Allows focus on proving the knowledge economics model

All amounts are denominated in tinybars (10⁻⁸ HBAR).
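For example, 0.001 HBAR is 100,000 tinybars. A sketch of the conversion (the helper name is hypothetical; the point is that protocol amounts stay in u64 tinybars, never floats):

```rust
/// Amounts are u64 tinybars; 1 HBAR = 10^8 tinybars. Keeping
/// arithmetic in integers avoids floating-point rounding when
/// distributing payments.
pub const TINYBARS_PER_HBAR: u64 = 100_000_000;

/// Convert a price in milli-HBAR (thousandths of an HBAR) to tinybars.
pub fn milli_hbar_to_tinybars(milli: u64) -> u64 {
    milli * (TINYBARS_PER_HBAR / 1_000)
}
```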


Practical & UX Questions

15. Could AI Tools Accidentally Spend Large Amounts?

Status: Designed

Concern: AI agents might fire off many queries rapidly, leading to unexpected bills.

Answer: This is addressed at the application layer.

Budget controls are an application-layer responsibility. From the whitepaper (Section 7.2):

“Application-level concerns—budget controls, cost previews, spending limits, auto-approve settings—are outside protocol scope.”

The MCP server implementation includes budget tracking:

struct QueryInput {
    query: String,
    budget_hbar: f64,
}

Agents are configured with a session budget and cannot exceed it. When budget is exhausted, queries are rejected.
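A minimal sketch of such a guard, assuming a simple reserve-before-query flow (the real MCP server's bookkeeping may differ):

```rust
/// Illustrative session-budget guard: a query is rejected before
/// any payment is made if it would exceed the remaining budget.
pub struct SessionBudget {
    remaining_tinybars: u64,
}

impl SessionBudget {
    pub fn new(tinybars: u64) -> Self {
        Self { remaining_tinybars: tinybars }
    }

    /// Reserve `price` for a query; Err if it would overspend.
    pub fn try_spend(&mut self, price: u64) -> Result<(), &'static str> {
        if price > self.remaining_tinybars {
            return Err("session budget exhausted");
        }
        self.remaining_tinybars -= price;
        Ok(())
    }
}
```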

Cost preview before execution: The PREVIEW operation is free. Agents can check content price before querying.


16. Can Stolen Content Enter the System?

Status: Known Limitation

Concern: People can upload material they don’t own, and the system has no built-in way to prevent profiting from stolen work.

Answer: The protocol cannot prevent unauthorized uploads at the entry point, but it provides strong deterrence and evidence mechanisms.

Timestamps provide priority evidence: From the whitepaper (Section 10.5):

“Timestamps record when content was published in-system. Earnings are fully visible and auditable. Evidence for legal recourse is built-in, not forensic.”

Audit trails document everything: Every query, every payment, every derivation is logged with cryptographic proof.

Republished content lacks provenance benefits:

“Republished content lacks provenance linkage to the original; the original has earlier timestamps providing evidence of priority.”

Future enhancement: Embedding similarity detection can flag potential copies at the application layer.

Practical advice for creators:

  • Publish to Nodalync first to establish timestamped priority
  • Also establish external prior art (arXiv, journal publication, etc.)
  • The protocol itself can serve as a proof-of-creation layer

17. Is There a GUI or Only CLI?

Status: CLI Only

Concern: How do non-technical users interact with the protocol?

Answer: Currently CLI only.

What exists:

  • nodalync init — setup
  • nodalync publish — publish content
  • nodalync query — query content
  • nodalync search <query> — search local and network content
  • nodalync earnings — view earnings breakdown
  • nodalync balance — check protocol balance
  • nodalync mcp-server — for Claude Desktop / AI agent integration
  • 30+ commands total (run nodalync --help for the full list)

GUI/Web: Not in the current roadmap. Focus is proving economics work first. A web interface would be an application built on top of the protocol.


18. Can I Bulk Import Existing Content?

Status: Deferred

Concern: How do I migrate an existing knowledge base to Nodalync?

Answer: Not built yet. Current workflow is:

  1. nodalync init
  2. nodalync publish <file> one at a time
  3. Extract L1 manually or with rule-based extractor

What would help:

  • Directory scanner that publishes all files
  • Watch mode for auto-publishing new content
  • Integration with existing knowledge bases

Explicitly deferred—after core experience works.


19. How Are Takedown Requests Handled?

Status: Out of Scope

Concern: How do copyright holders request content removal?

Answer: The spec explicitly says this is out of scope:

“Content moderation — policy decisions for specific communities/jurisdictions.” “Takedown mechanisms — legal/policy layer above protocol.”

The protocol is infrastructure: IPFS itself has no takedown mechanism, but applications built on it, such as Pinata, can implement one.

Practical implications:

  • Content is stored on the owner’s node (local-first)
  • Removing your node removes your content
  • No global “delete” because there’s no central storage
  • DMCA-type requests would go to node operators, not the protocol

Summary Table

| Question | Status | Notes |
|---|---|---|
| Gaming with low-effort content | Designed | Revenue only flows on queries; can't insert into others' chains |
| Content flooding/spam | Designed | Market incentives + rate limits + app-layer filtering |
| L3 provenance gaming | Known Limitation | Payment proof required; content quality not verified |
| Equal pay for unequal work | Designed | Market prices quality; weighting handles frequency |
| Early mover advantage | Designed (Intentional) | Feature, not bug; quality still matters |
| Single payment caching | Designed | Standard for information goods; derivation still pays |
| Pricing guidance | Gap | Market signals exist, no recommendations yet |
| Discovery without hash | Designed | list_sources MCP tool; full search is app-layer |
| L1 extraction | Designed | Rule-based MVP, plugin architecture for AI |
| Cold start | Designed | Seed with own content first; prove economics |
| Scalability | Designed (Untested) | Kademlia DHT scales logarithmically |
| Privacy | Known Concern | DHT lookups visible; onion routing planned |
| Multi-chain | Abstracted | Trait exists; Hedera-only for now |
| Token name | Designed | HBAR (no native token) |
| AI runaway spending | Designed | Application-layer budgets; free previews |
| Stolen content | Known Limitation | Timestamps + audit trails for legal recourse |
| GUI | Gap | CLI only; GUI would be app-layer |
| Bulk import | Deferred | Not built yet |
| Takedowns | Out of Scope | Legal/policy layer above protocol |

Document Version: 2.0
Last Updated: January 2026
References: Nodalync Whitepaper, Protocol Specification v0.7.1
Contract: 0.0.7729011 (Hedera Testnet)

Nodalync Protocol Specification

Version: 0.7.1
Author: Gabriel Giangi
Date: January 2026
Status: Draft


Table of Contents

  1. Overview
  2. Notation and Conventions
  3. Cryptographic Primitives
  4. Data Structures
  5. Node State
  6. Message Types
  7. Protocol Operations
  8. State Transitions
  9. Validation Rules
  10. Economic Rules
  11. Network Layer
  12. Settlement Layer
  13. Security Considerations

1. Overview

1.1 Purpose

The Nodalync Protocol enables decentralized knowledge exchange with cryptographic provenance and automatic compensation. Participants publish knowledge (L0), extract atomic facts (L1), build entity graphs (L2), and synthesize insights (L3). Every query triggers payment distributed through the complete provenance chain to foundational contributors.

1.2 Design Principles

  1. Local-first: All content stored on owner’s node
  2. Decentralized: No central authority required
  3. Trustless: Cryptographic verification, not social trust
  4. Fair: 95% of value flows to foundational contributors
  5. Minimal: Protocol specifies only what’s necessary
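Principle 4 can be made concrete with a small integer-arithmetic sketch of the 95/5 split (illustrative only; the rounding policy here, where the owner keeps the dust, is a sketch choice, and the normative rules live in Section 10):

```rust
/// Illustrative 95/5 split: 95% of the query price is divided among
/// foundational (L0/L1) sources pro rata by weight; the remaining 5%
/// is the synthesis fee kept by the content owner. The owner also
/// absorbs integer-division dust in this sketch.
pub fn split_payment(price: u64, root_weights: &[u64]) -> (Vec<u64>, u64) {
    let root_pool = price * 95 / 100;
    let total_weight: u64 = root_weights.iter().sum();
    assert!(total_weight > 0, "provenance requires at least one root");
    let shares: Vec<u64> = root_weights
        .iter()
        .map(|w| root_pool * w / total_weight)
        .collect();
    let distributed: u64 = shares.iter().sum();
    (shares, price - distributed) // owner gets fee plus rounding dust
}
```

A 1,000-tinybar query over two equally weighted roots pays each root 475 tinybars and the owner 50.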

1.3 Protocol Layers

┌─────────────────────────────────────────┐
│          Application Layer              │  (Out of scope)
├─────────────────────────────────────────┤
│          Protocol Layer                 │  ← This specification
│  ┌─────────────────────────────────┐    │
│  │  Content    Provenance  Payment │    │
│  └─────────────────────────────────┘    │
├─────────────────────────────────────────┤
│          Network Layer (libp2p)         │  (Referenced)
├─────────────────────────────────────────┤
│          Settlement Layer (Hedera)      │  (Referenced)
└─────────────────────────────────────────┘

1.4 Scope

Nodalync is infrastructure, not an application. Like Bitcoin provides trustless value transfer without building wallets, Nodalync provides trustless knowledge exchange without building search engines.

In Scope (this protocol specifies):

| Concern | What the protocol provides |
|---|---|
| Content addressing | Deterministic hashing, content types (L0-L3) |
| Provenance | Cryptographic chains linking derivatives to sources |
| Payment | Automatic 95/5 distribution through provenance chains |
| Transport | Message types, encoding, peer-to-peer delivery |
| Settlement | Payment channel state, batch settlement interface |
| Visibility | Private/unlisted/shared access control primitives |

Out of Scope (application layer):

| Concern | Why it's out of scope |
|---|---|
| Content discovery/search | Applications index L1 previews and build search UX |
| Pricing recommendations | Market dynamics emerge from application-layer analytics |
| Content moderation | Policy decisions for specific communities/jurisdictions |
| User interfaces | Wallets, explorers, dashboards are applications |
| AI extraction quality | Pluggable extractors; quality is a market signal |
| Takedown mechanisms | Legal/policy layer above protocol |

1.5 Building on Nodalync

The protocol exposes primitives that enable rich applications:

Application developers can:

┌─────────────────────────────────────────────────────────────┐
│  SEARCH ENGINES                                             │
│  - Subscribe to ANNOUNCE broadcasts on DHT                  │
│  - Fetch free PREVIEW for all shared content                │
│  - Index L1 summaries, tags, content types                  │
│  - Build relevance ranking from total_queries, reputation   │
│  - Return content hashes → users query through protocol     │
├─────────────────────────────────────────────────────────────┤
│  KNOWLEDGE BROWSERS                                         │
│  - Visualize provenance chains (who contributed what)       │
│  - Show payment flows and creator earnings                  │
│  - Navigate L0→L1→L2→L3 relationships                       │
├─────────────────────────────────────────────────────────────┤
│  AI AGENTS (via MCP)                                        │
│  - Query knowledge programmatically                         │
│  - Pay-per-query with automatic source attribution          │
│  - Build L3 synthesis with cryptographic provenance         │
├─────────────────────────────────────────────────────────────┤
│  SPECIALIZED EXTRACTORS                                     │
│  - Implement L1Extractor trait for domain-specific parsing  │
│  - Compete on extraction quality (market selects winners)   │
│  - Offer extraction-as-a-service to non-technical creators  │
├─────────────────────────────────────────────────────────────┤
│  CURATED DIRECTORIES                                        │
│  - Maintain topic-specific indexes                          │
│  - Provide reputation/quality signals                       │
│  - Charge for curation (built on protocol payments)         │
└─────────────────────────────────────────────────────────────┘

Key insight: The protocol doesn’t need full-text search because:

  1. L1 previews are free and public (for shared content)
  2. Anyone can build an index by listening to ANNOUNCE messages
  3. Search is a service that can itself be monetized on the protocol

This mirrors successful infrastructure protocols:

  • Bitcoin → wallets, exchanges, explorers
  • IPFS → Pinata, Filecoin, web3.storage
  • Nodalync → search engines, data browsers, AI agents

2. Notation and Conventions

2.1 Data Types

uint8       Unsigned 8-bit integer
uint32      Unsigned 32-bit integer (big-endian)
uint64      Unsigned 64-bit integer (big-endian)
int64       Signed 64-bit integer (big-endian)
float64     IEEE 754 double-precision float
bytes       Variable-length byte array
string      UTF-8 encoded string
bool        Boolean (0x00 = false, 0x01 = true)
Hash        32 bytes (SHA-256 output)
Signature   64 bytes (Ed25519 signature)
PublicKey   32 bytes (Ed25519 public key)
PeerId      Derived from PublicKey (see 3.2)
Timestamp   uint64 (milliseconds since Unix epoch)
Amount      uint64 (tinybars, 10^-8 HBAR)

2.2 Encoding

All multi-byte integers are big-endian. Structures are serialized using a deterministic CBOR encoding (RFC 8949) with the following rules:

  1. Map keys sorted lexicographically
  2. No indefinite-length arrays or maps
  3. Minimal integer encoding
  4. No floating-point for amounts (use uint64)
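For example, Rust's `to_be_bytes` produces exactly the required wire encoding for an Amount:

```rust
/// All multi-byte integers are big-endian on the wire; for a u64
/// Amount this is the standard network byte order of its 8 bytes.
pub fn encode_amount(tinybars: u64) -> [u8; 8] {
    tinybars.to_be_bytes()
}
```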

2.3 Notation

||          Concatenation
H(x)        SHA-256 hash of x
Sign(k, m)  Ed25519 signature of message m with private key k
Verify(p, m, s)  Verify signature s of message m with public key p
len(x)      Length of x in bytes

3. Cryptographic Primitives

3.1 Hash Function

Algorithm: SHA-256

Content hashes are computed as:

ContentHash(content) = H(
    0x00 ||                    # Domain separator (content)
    len(content) as uint64 ||
    content
)
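The preimage framing follows directly from the definition. A sketch in Rust, showing only the framing (a SHA-256 implementation, e.g. the sha2 crate used by nodalync-crypto, would then be applied to the returned buffer):

```rust
/// Assemble the ContentHash preimage: domain separator, big-endian
/// length, then the raw content bytes. SHA-256 is applied to this
/// buffer to produce the final 32-byte hash (not shown here).
pub fn content_hash_preimage(content: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(1 + 8 + content.len());
    buf.push(0x00); // domain separator: content
    buf.extend_from_slice(&(content.len() as u64).to_be_bytes());
    buf.extend_from_slice(content);
    buf
}
```

The length prefix and domain separator make the framing unambiguous, so no two distinct inputs share a preimage.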

3.2 Identity

Algorithm: Ed25519

Node identity is an Ed25519 keypair. PeerId is derived as:

PeerId = H(
    0x00 ||                    # Key type: Ed25519
    public_key                 # 32 bytes
)[0:20]                        # Truncate to 20 bytes

Human-readable format: ndl1 + base32(PeerId)

Example: ndl1qpzry9x8gf2tvdw0s3jn54khce6mua7l

3.3 Signatures

All protocol messages requiring authentication are signed:

SignedMessage = {
    payload: bytes,
    signer: PeerId,
    signature: Sign(private_key, H(payload))
}

Verification:

Valid(msg) = Verify(
    lookup_public_key(msg.signer),
    H(msg.payload),
    msg.signature
)

3.4 Content Addressing

Content is referenced by its hash. The hash serves as a unique, verifiable identifier.

Given content C:
    hash = ContentHash(C)
    
Anyone receiving C can verify:
    ContentHash(C) == claimed_hash

4. Data Structures

4.1 Content Types

enum ContentType : uint8 {
    L0 = 0x00,      # Raw input (documents, notes, transcripts)
    L1 = 0x01,      # Mentions (extracted atomic facts)
    L2 = 0x02,      # Entity Graph (linked entities and relationships)
    L3 = 0x03       # Insights (emergent synthesis)
}

Knowledge Layer Semantics:

| Layer | Content | Typical Operation | Queryable | Value Added |
|---|---|---|---|---|
| L0 | Raw documents, notes, transcripts | CREATE | Yes | Original source material |
| L1 | Atomic facts extracted from L0 | EXTRACT_L1 | Yes | Structured, quotable claims |
| L2 | Entities and relationships across L1s | BUILD_L2 | No (personal) | Cross-document linking, your perspective |
| L3 | Novel insights synthesizing sources | DERIVE | Yes | Original analysis and conclusions |

L2 is Personal: Your L2 represents your unique perspective — how you link entities, resolve ambiguities, and structure knowledge. It is never shared or queried directly. Its value surfaces when you create L3 insights that others find valuable enough to query.

4.2 Visibility

enum Visibility : uint8 {
    Private   = 0x00,   # Local only, not served
    Unlisted  = 0x01,   # Served if hash known, not announced
    Shared    = 0x02,   # Announced to DHT, publicly queryable
    Offline   = 0x03    # Taken offline by owner, manifest preserved for provenance
}

4.3 Version

struct Version {
    number: uint32,         # Sequential version number (1-indexed)
    previous: Hash?,        # Hash of previous version (null if first)
    root: Hash,             # Hash of first version (stable identifier)
    timestamp: Timestamp    # Creation time
}

Constraints:
    - If number == 1: previous MUST be null, root MUST equal self hash
    - If number > 1: previous MUST NOT be null, root MUST equal previous.root
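These constraints translate directly into a validation check (a sketch: Hash is simplified to a byte array, and the number > 1 case would additionally need the previous manifest loaded to verify root == previous.root):

```rust
/// The Version constraints from 4.3 as a local check.
pub type Hash = [u8; 32];

pub struct Version {
    pub number: u32,
    pub previous: Option<Hash>,
    pub root: Hash,
}

pub fn version_valid(v: &Version, self_hash: &Hash) -> bool {
    match v.number {
        0 => false, // version numbers are 1-indexed
        1 => v.previous.is_none() && v.root == *self_hash,
        // For number > 1, a full check would also load the previous
        // manifest and verify root == previous.root.
        _ => v.previous.is_some(),
    }
}
```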

4.4 Mention (L1)

struct Mention {
    id: Hash,                       # H(content || source_location)
    content: string,                # The atomic fact (max 1000 chars)
    source_location: SourceLocation,
    classification: Classification,
    confidence: Confidence,
    entities: string[]              # Extracted entity names
}

struct SourceLocation {
    type: LocationType,
    reference: string,              # Location identifier
    quote: string?                  # Exact quote (max 500 chars)
}

enum LocationType : uint8 {
    Paragraph = 0x00,
    Page      = 0x01,
    Timestamp = 0x02,
    Line      = 0x03,
    Section   = 0x04
}

enum Classification : uint8 {
    Claim       = 0x00,
    Statistic   = 0x01,
    Definition  = 0x02,
    Observation = 0x03,
    Method      = 0x04,
    Result      = 0x05
}

enum Confidence : uint8 {
    Explicit = 0x00,    # Directly stated in source
    Inferred = 0x01     # Reasonably inferred
}

4.4a Entity Graph (L2)

L2 Entity Graphs are personal knowledge structures. They represent a node’s interpretation and linking of entities across their queried L1 sources. L2 is never directly queried by others — its value surfaces when used to create L3 insights.

struct L2EntityGraph {
    # === Core Identity ===
    id: Hash,                           # H(serialized entities + relationships)
    
    # === Sources ===
    source_l1s: L1Reference[],          # L1 summaries this graph was built from
    source_l2s: Hash[],                 # Other L2 graphs merged/extended (optional)
    
    # === Namespace Prefixes (for compact URIs) ===
    prefixes: PrefixMap,                # Maps short prefixes to full URIs
    
    # === Graph Content ===
    entities: Entity[],                 # Resolved entities
    relationships: Relationship[],      # Relationships between entities
    
    # === Statistics ===
    entity_count: uint32,
    relationship_count: uint32,
    source_mention_count: uint32        # Total mentions linked
}

struct PrefixMap {
    entries: PrefixEntry[]              # Ordered list of prefix mappings
}

struct PrefixEntry {
    prefix: string,                     # Short form, e.g., "schema"
    uri: string                         # Full URI, e.g., "http://schema.org/"
}

# Default prefixes (always available, can be overridden):
#   "ndl"    -> "https://nodalync.io/ontology/"
#   "schema" -> "http://schema.org/"
#   "foaf"   -> "http://xmlns.com/foaf/0.1/"
#   "dc"     -> "http://purl.org/dc/elements/1.1/"
#   "rdf"    -> "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
#   "rdfs"   -> "http://www.w3.org/2000/01/rdf-schema#"
#   "xsd"    -> "http://www.w3.org/2001/XMLSchema#"
#   "owl"    -> "http://www.w3.org/2002/07/owl#"

struct L1Reference {
    l1_hash: Hash,                      # Hash of the L1Summary content
    l0_hash: Hash,                      # The original L0 this L1 came from
    mention_ids_used: Hash[]            # Which specific mentions were used
}

struct Entity {
    id: Hash,                           # Stable entity ID: H(canonical_uri || canonical_label)
    
    # === Identity ===
    canonical_label: string,            # Primary human-readable name (max 200 chars)
    canonical_uri: Uri?,                # Optional: canonical URI (e.g., "dbr:Albert_Einstein")
    aliases: string[],                  # Alternative names/spellings (max 50)
    
    # === Type (RDF-compatible) ===
    entity_types: Uri[],                # e.g., ["schema:Person", "foaf:Person"]
    
    # === Evidence ===
    source_mentions: MentionRef[],      # Which L1 mentions establish this entity
    
    # === Confidence ===
    confidence: float64,                # 0.0 - 1.0, resolution confidence
    resolution_method: ResolutionMethod,
    
    # === Optional Metadata ===
    description: string?,               # Summary description (max 500 chars)
    same_as: Uri[]?                     # Links to external entities (owl:sameAs)
}

# Uri can be:
#   - Full URI: "http://schema.org/Person"
#   - Compact URI (CURIE): "schema:Person" (expanded using prefixes)
#   - Protocol-defined: "ndl:Person" (Nodalync ontology)
type Uri = string

struct MentionRef {
    l1_hash: Hash,                      # Which L1 contains this mention
    mention_id: Hash                    # Specific mention ID within that L1
}

struct Relationship {
    id: Hash,                           # H(subject || predicate || object)
    
    # === Triple ===
    subject: Hash,                      # Entity ID
    predicate: Uri,                     # RDF predicate, e.g., "schema:worksFor"
    object: RelationshipObject,         # Entity ID or literal
    
    # === Evidence ===
    source_mentions: MentionRef[],      # Mentions that support this relationship
    confidence: float64,                # 0.0 - 1.0
    
    # === Temporal (optional) ===
    valid_from: Timestamp?,
    valid_to: Timestamp?
}

enum RelationshipObject {
    EntityRef(Hash),                    # Reference to another entity in this graph
    ExternalRef(Uri),                   # Reference to external entity
    Literal(LiteralValue)               # A typed value
}

struct LiteralValue {
    value: string,                      # The value as string
    datatype: Uri?,                     # XSD datatype, e.g., "xsd:date" (null = plain string)
    language: string?                   # Language tag, e.g., "en" (for strings only)
}

# Standard XSD datatypes (use "xsd:" prefix):
#   xsd:string, xsd:integer, xsd:decimal, xsd:boolean,
#   xsd:date, xsd:dateTime, xsd:anyURI

enum ResolutionMethod : uint8 {
    ExactMatch    = 0x00,               # Same string
    Normalized    = 0x01,               # Case/punctuation normalized
    Alias         = 0x02,               # Known alias matched
    Coreference   = 0x03,               # Pronoun/reference resolved
    ExternalLink  = 0x04,               # Matched via external KB
    Manual        = 0x05,               # Human-verified
    AIAssisted    = 0x06                # ML model assisted
}

Constraints:
    1. len(source_l1s) >= 1              # Must derive from at least one L1
    2. len(entities) >= 1                 # Must have at least one entity
    3. Each entity.id is unique within the graph
    4. Each relationship references valid entity IDs (or external URIs)
    5. All MentionRefs point to valid L1s in source_l1s
    6. 0.0 <= confidence <= 1.0
    7. len(canonical_label) <= 200
    8. len(aliases) <= 50
    9. All URIs are valid (full URI or valid CURIE with known prefix)
    10. entity_count == len(entities)
    11. relationship_count == len(relationships)
    
L2 Visibility:
    - L2 content is ALWAYS Private (never Unlisted or Shared)
    - L2 is never announced to DHT
    - L2 has no price (cannot be queried for payment)
    - L2's value is realized through L3 insights derived from it

4.4b Nodalync Ontology (ndl:)

The protocol defines a minimal ontology at https://nodalync.io/ontology/:

# Entity Types
ndl:Person
ndl:Organization  
ndl:Location
ndl:Concept
ndl:Event
ndl:Work              # Paper, book, article
ndl:Product
ndl:Technology
ndl:Metric            # Quantitative measure
ndl:TimePoint

# Relationship Predicates
ndl:mentions          # L1 mention references entity
ndl:relatedTo         # Generic relationship
ndl:partOf
ndl:createdBy
ndl:locatedIn
ndl:occurredAt
ndl:hasValue
ndl:sameAs            # Equivalent to owl:sameAs

# Provenance Predicates
ndl:derivedFrom       # Content derivation
ndl:extractedFrom     # L1 extracted from L0
ndl:builtFrom         # L2 built from L1s

Nodes are free to use any ontology. The ndl: namespace provides defaults for nodes that don’t need external ontology integration.

4.5 Provenance

struct Provenance {
    root_L0L1: ProvenanceEntry[],   # All foundational sources
    derived_from: Hash[],            # Direct parent hashes
    depth: uint32                    # Max derivation depth from any L0
}

struct ProvenanceEntry {
    hash: Hash,                 # Content hash
    owner: PeerId,              # Owner's node ID
    visibility: Visibility,     # Visibility at time of derivation
    weight: uint32              # Number of times this source appears (for duplicates)
}

Constraints:
    - root_L0L1 contains entries of type L0 or L1 only (never L2 or L3)
    - L0 content: root_L0L1 = [self], derived_from = [], depth = 0
    - L1 content: root_L0L1 = [parent L0], derived_from = [L0 hash], depth = 1
    - L2 content: root_L0L1 = merged roots from source L1s, 
                  derived_from = source L1/L2 hashes, depth = max(source.depth) + 1
    - L3 content: root_L0L1 = merged roots from all sources,
                  derived_from = source hashes, depth = max(source.depth) + 1
    - All entries in derived_from MUST have been queried by creator
    
Provenance Chain Examples:
    Simple chain:
        L0(doc) → L1(mentions) → L2(entities) → L3(insight)
        depth:  0       1            2              3
    
    L3 deriving directly from L1 (valid, skipping L2):
        L0(doc) → L1(mentions) → L3(insight)
        depth:  0       1            2
    
    L3 from mix of L1 and L2:
        L0(doc1) → L1(m1) → L2(graph) ─┐
                                        ├→ L3(insight)
        L0(doc2) → L1(m2) ─────────────┘
        
        L3.provenance = {
            root_L0L1: [doc1, doc2],  # Merged from both paths
            derived_from: [L2.hash, m2],
            depth: 3  # max(2, 1) + 1
        }
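The merge rule can be sketched as follows (simplified types: root hashes are strings, and weights sum on duplicates, matching the weight field in ProvenanceEntry):

```rust
use std::collections::BTreeMap;

/// Simplified provenance merge for a new derivative: union the
/// foundational roots of all sources (summing weights for duplicate
/// roots) and set depth = max(source depth) + 1.
pub struct Prov {
    pub roots: BTreeMap<&'static str, u32>, // root hash -> weight
    pub depth: u32,
}

pub fn derive(sources: &[&Prov]) -> Prov {
    let mut roots = BTreeMap::new();
    let mut depth = 0;
    for s in sources {
        for (root, w) in &s.roots {
            *roots.entry(*root).or_insert(0) += w;
        }
        depth = depth.max(s.depth);
    }
    Prov { roots, depth: depth + 1 }
}
```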

4.6 Access Control

struct AccessControl {
    allowlist: PeerId[]?,       # If set, only these peers can query
    denylist: PeerId[]?,        # These peers are blocked
    require_bond: bool,         # Require payment bond
    bond_amount: Amount?,       # Bond amount if required
    max_queries_per_peer: uint32?   # Rate limit (null = unlimited)
}

Access granted if:
    (allowlist is null OR peer in allowlist) AND
    (denylist is null OR peer NOT in denylist) AND
    (require_bond is false OR peer has posted bond)
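The predicate translates directly into code (a sketch: PeerId is simplified to a string, and bond state is passed in; rate limiting is omitted):

```rust
/// The access-grant predicate from 4.6 as a function.
pub struct AccessControl<'a> {
    pub allowlist: Option<Vec<&'a str>>,
    pub denylist: Option<Vec<&'a str>>,
    pub require_bond: bool,
}

pub fn access_granted(ac: &AccessControl<'_>, peer: &str, has_bond: bool) -> bool {
    ac.allowlist.as_ref().map_or(true, |l| l.contains(&peer))
        && ac.denylist.as_ref().map_or(true, |d| !d.contains(&peer))
        && (!ac.require_bond || has_bond)
}
```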

4.7 Economics

struct Economics {
    price: Amount,              # Price per query (in smallest unit)
    currency: Currency,         # Currency identifier
    total_queries: uint64,      # Total queries served
    total_revenue: Amount       # Total revenue generated
}

enum Currency : uint8 {
    HBAR = 0x00                 # Hedera native token
}

4.8 Manifest

The manifest is the complete metadata for a content item:

struct Manifest {
    # Identity
    hash: Hash,                 # Content hash
    content_type: ContentType,
    owner: PeerId,              # Content owner (serves content, receives synthesis fee)
    
    # Versioning
    version: Version,
    
    # Visibility & Access
    visibility: Visibility,
    access: AccessControl,
    
    # Metadata
    metadata: Metadata,
    
    # Economics
    economics: Economics,
    
    # Provenance
    provenance: Provenance,
    
    # Timestamps
    created_at: Timestamp,
    updated_at: Timestamp
}

struct Metadata {
    title: string,              # Max 200 chars
    description: string?,       # Max 2000 chars
    tags: string[],             # Max 20 tags, each max 50 chars
    content_size: uint64,       # Size in bytes
    mime_type: string?          # MIME type if applicable
}

4.9 L1 Summary (Preview)

struct L1Summary {
    l0_hash: Hash,              # Source L0 hash
    mention_count: uint32,      # Total mentions extracted
    preview_mentions: Mention[], # First N mentions (max 5)
    primary_topics: string[],   # Main topics (max 5)
    summary: string             # 2-3 sentence summary (max 500 chars)
}

5. Node State

5.1 State Components

A node maintains the following state:

struct NodeState {
    # Identity
    identity: Identity,
    
    # Content storage
    content: Map<Hash, ContentRecord>,
    
    # Provenance graph
    provenance_graph: ProvenanceGraph,
    
    # Payment channels
    channels: Map<PeerId, Channel>,
    
    # Peer information
    peers: Map<PeerId, PeerInfo>,
    
    # Query cache (content from others)
    cache: Map<Hash, CachedContent>,
    
    # Settlement queue
    settlement_queue: SettlementEntry[]
}

struct Identity {
    private_key: bytes,         # Ed25519 private key (encrypted at rest)
    public_key: PublicKey,
    peer_id: PeerId
}

struct ContentRecord {
    manifest: Manifest,
    content: bytes,             # Encrypted at rest
    l1_data: L1Summary?,        # Null if L1 not extracted
    local_path: string          # Filesystem path
}

struct PeerInfo {
    peer_id: PeerId,
    public_key: PublicKey,
    addresses: MultiAddr[],     # libp2p multiaddresses
    last_seen: Timestamp,
    reputation: int64           # Reputation score
}

struct CachedContent {
    hash: Hash,
    content: bytes,
    source_peer: PeerId,
    queried_at: Timestamp,
    payment_proof: PaymentProof
}

5.2 Provenance Graph

struct ProvenanceGraph {
    # Upstream edges: what does this content derive from?
    derived_from: Map<Hash, Hash[]>,
    
    # Downstream edges: what derives from this content?
    derivations: Map<Hash, Hash[]>,
    
    # Flattened roots cache
    roots_cache: Map<Hash, ProvenanceEntry[]>
}

Operations:
    add_content(hash, derived_from[]) → updates both directions
    get_roots(hash) → returns flattened root_L0L1
    get_derivations(hash) → returns all downstream content
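The two edge maps and the flattening operations can be sketched in Python. This is a minimal in-memory model for illustration only; names and storage details are not normative:

```python
class ProvenanceGraph:
    def __init__(self):
        self.derived_from = {}   # hash -> upstream source hashes
        self.derivations = {}    # hash -> downstream derived hashes

    def add_content(self, content_hash, sources):
        # Update both edge directions in one step
        self.derived_from[content_hash] = list(sources)
        for s in sources:
            self.derivations.setdefault(s, []).append(content_hash)

    def get_roots(self, content_hash):
        # Flatten transitively down to content with no sources (the L0/L1 roots)
        sources = self.derived_from.get(content_hash, [])
        if not sources:
            return [content_hash]
        roots = []
        for s in sources:
            for r in self.get_roots(s):
                if r not in roots:
                    roots.append(r)
        return roots

    def get_derivations(self, content_hash):
        # Breadth-first walk over everything downstream
        seen, frontier = [], [content_hash]
        while frontier:
            nxt = []
            for x in frontier:
                for d in self.derivations.get(x, []):
                    if d not in seen:
                        seen.append(d)
                        nxt.append(d)
            frontier = nxt
        return seen
```

get_roots bottoms out at the L0/L1 entries, which is the set that payment distribution in 7.2.3 is computed over.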

5.3 Payment Channels

struct Channel {
    peer_id: PeerId,
    state: ChannelState,
    my_balance: Amount,
    their_balance: Amount,
    nonce: uint64,
    last_update: Timestamp,
    pending_payments: Payment[]
}

enum ChannelState : uint8 {
    Opening   = 0x00,
    Open      = 0x01,
    Closing   = 0x02,
    Closed    = 0x03,
    Disputed  = 0x04
}

struct Payment {
    id: Hash,                   # H(channel_id || nonce || amount || recipient)
    nonce: uint64,              # Channel nonce this payment commits to
    amount: Amount,
    recipient: PeerId,
    query_hash: Hash,           # Content that was queried
    provenance: ProvenanceEntry[], # For distribution
    timestamp: Timestamp,
    signature: Signature        # Signed by payer
}
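The payment ID commits to the channel, nonce, amount, and recipient. A sketch of the hash construction; the 8-byte big-endian integer encoding here is an assumption for illustration, since the wire format uses deterministic CBOR:

```python
import hashlib

def payment_id(channel_id: bytes, nonce: int, amount: int, recipient: bytes) -> bytes:
    # id = H(channel_id || nonce || amount || recipient)
    # Fixed-width big-endian integers are illustrative, not normative.
    buf = (channel_id
           + nonce.to_bytes(8, "big")
           + amount.to_bytes(8, "big")
           + recipient)
    return hashlib.sha256(buf).digest()
```

Identical inputs always produce the same ID, so a replayed payment with a stale nonce is detectable by ID alone.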

6. Message Types

6.1 Message Envelope

All protocol messages use a common envelope:

struct Message {
    version: uint8,             # Protocol version (0x01)
    type: MessageType,
    id: Hash,                   # Unique message ID
    timestamp: Timestamp,
    sender: PeerId,
    payload: bytes,             # Type-specific payload
    signature: Signature        # Signs H(version || type || id || timestamp || sender || payload)
}

enum MessageType : uint16 {
    # Discovery (0x01xx)
    ANNOUNCE         = 0x0100,
    ANNOUNCE_UPDATE  = 0x0101,
    SEARCH           = 0x0110,
    SEARCH_RESPONSE  = 0x0111,
    
    # Preview (0x02xx)
    PREVIEW_REQUEST  = 0x0200,
    PREVIEW_RESPONSE = 0x0201,
    
    # Query (0x03xx)
    QUERY_REQUEST    = 0x0300,
    QUERY_RESPONSE   = 0x0301,
    QUERY_ERROR      = 0x0302,
    
    # Version (0x04xx)
    VERSION_REQUEST  = 0x0400,
    VERSION_RESPONSE = 0x0401,
    
    # Channel (0x05xx)
    CHANNEL_OPEN     = 0x0500,
    CHANNEL_ACCEPT   = 0x0501,
    CHANNEL_UPDATE   = 0x0502,
    CHANNEL_CLOSE    = 0x0503,
    CHANNEL_DISPUTE  = 0x0504,
    CHANNEL_CLOSE_ACK = 0x0505,
    
    # Settlement (0x06xx)
    SETTLE_BATCH     = 0x0600,
    SETTLE_CONFIRM   = 0x0601,
    
    # Peer (0x07xx)
    PING             = 0x0700,
    PONG             = 0x0701,
    PEER_INFO        = 0x0710
}
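The hex values are grouped by family in the high byte (0x01 discovery, 0x02 preview, 0x03 query, and so on), so routing by category is a single shift. A small sketch over a representative subset:

```python
from enum import IntEnum

class MessageType(IntEnum):
    # Representative subset of the full enum
    ANNOUNCE = 0x0100
    SEARCH = 0x0110
    QUERY_REQUEST = 0x0300
    CHANNEL_UPDATE = 0x0502
    PING = 0x0700

def family(t: MessageType) -> int:
    # The high byte identifies the message family
    return t >> 8
```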

6.2 Discovery Messages

# ANNOUNCE - Publish content availability to DHT
struct AnnouncePayload {
    hash: Hash,
    content_type: ContentType,
    title: string,
    l1_summary: L1Summary,
    price: Amount,
    addresses: MultiAddr[]
}

# ANNOUNCE_UPDATE - Announce new version
struct AnnounceUpdatePayload {
    version_root: Hash,         # Stable identifier
    new_hash: Hash,             # New version hash
    version_number: uint32,
    title: string,
    l1_summary: L1Summary,
    price: Amount
}

# SEARCH - Query DHT for content
struct SearchPayload {
    query: string,              # Natural language query
    filters: SearchFilters?,
    limit: uint32,              # Max results (1-100)
    offset: uint32              # For pagination
}

struct SearchFilters {
    content_types: ContentType[]?,
    max_price: Amount?,
    min_reputation: int64?,
    created_after: Timestamp?,
    created_before: Timestamp?,
    tags: string[]?
}

# SEARCH_RESPONSE - Search results
struct SearchResponsePayload {
    results: SearchResult[],
    total_count: uint64,
    offset: uint32
}

struct SearchResult {
    hash: Hash,
    content_type: ContentType,
    title: string,
    owner: PeerId,
    l1_summary: L1Summary,
    price: Amount,
    total_queries: uint64,
    relevance_score: float64,    # 0.0 - 1.0
    publisher_addresses: MultiAddr[] # Multiaddresses for reconnection
}

6.3 Preview Messages

# PREVIEW_REQUEST - Request L1 preview (free)
struct PreviewRequestPayload {
    hash: Hash
}

# PREVIEW_RESPONSE - Return L1 preview
struct PreviewResponsePayload {
    hash: Hash,
    manifest: Manifest,         # Full manifest (no content)
    l1_summary: L1Summary
}

6.4 Query Messages

# QUERY_REQUEST - Request content (paid)
struct QueryRequestPayload {
    hash: Hash,
    query: string?,             # Optional: specific question about content
    payment: Payment,
    version: VersionSpec?       # Optional: specific version
}

enum VersionSpec : uint8 {
    Latest = 0x00,
    Number = 0x01,              # Followed by uint32 version number
    Hash   = 0x02               # Followed by Hash
}

# QUERY_RESPONSE - Return content
struct QueryResponsePayload {
    hash: Hash,
    content: bytes,
    manifest: Manifest,           # Contains full provenance chain
    payment_receipt: PaymentReceipt
}

# Whitepaper simplified response fields map to:
#   response.content    → content
#   response.sources[]  → manifest.provenance.root_L0L1[].hash
#   response.provenance → manifest.provenance
#   response.cost       → payment_receipt.amount

struct PaymentReceipt {
    payment_id: Hash,
    amount: Amount,
    timestamp: Timestamp,
    channel_nonce: uint64,
    distributor_signature: Signature    # Owner signs receipt
}

# QUERY_ERROR - Error response
struct QueryErrorPayload {
    hash: Hash,
    error_code: QueryError,
    message: string?
}

enum QueryError : uint16 {
    NOT_FOUND        = 0x0001,
    ACCESS_DENIED    = 0x0002,
    PAYMENT_REQUIRED = 0x0003,
    PAYMENT_INVALID  = 0x0004,
    RATE_LIMITED     = 0x0005,
    VERSION_NOT_FOUND = 0x0006,
    INTERNAL_ERROR   = 0xFFFF
}

6.5 Version Messages

# VERSION_REQUEST - Get version info
struct VersionRequestPayload {
    version_root: Hash          # Stable identifier
}

# VERSION_RESPONSE - Version history
struct VersionResponsePayload {
    version_root: Hash,
    versions: VersionInfo[],
    latest: Hash
}

struct VersionInfo {
    hash: Hash,
    number: uint32,
    timestamp: Timestamp,
    visibility: Visibility,
    price: Amount
}

6.6 Channel Messages

# CHANNEL_OPEN - Request to open payment channel
struct ChannelOpenPayload {
    channel_id: Hash,           # H(initiator || responder || nonce)
    initial_balance: Amount,    # Initiator's deposit
    funding_tx: bytes?          # On-chain funding proof (if required)
}

# CHANNEL_ACCEPT - Accept channel opening
struct ChannelAcceptPayload {
    channel_id: Hash,
    initial_balance: Amount,    # Responder's deposit
    funding_tx: bytes?
}

# CHANNEL_UPDATE - Update channel state (payment)
struct ChannelUpdatePayload {
    channel_id: Hash,
    nonce: uint64,
    balances: ChannelBalances,
    payments: Payment[],        # Payments in this update
    signature: Signature        # Signs the new state
}

struct ChannelBalances {
    initiator: Amount,
    responder: Amount
}

# CHANNEL_CLOSE - Initiate cooperative close
struct ChannelClosePayload {
    channel_id: Hash,
    final_balances: ChannelBalances,
    settlement_tx: bytes        # Proposed on-chain settlement
}

# CHANNEL_CLOSE_ACK - Acknowledge cooperative close
struct ChannelCloseAckPayload {
    channel_id: Hash,
    responder_signature: Signature  # Responder's signature over the close message
}

# CHANNEL_DISPUTE - Dispute channel state
struct ChannelDisputePayload {
    channel_id: Hash,
    claimed_state: ChannelUpdatePayload,    # Highest known state
    evidence: bytes[]           # Supporting evidence
}

6.7 Settlement Messages

# SETTLE_BATCH - Batch settlement request
struct SettleBatchPayload {
    batch_id: Hash,
    entries: SettlementEntry[],
    merkle_root: Hash           # Root of entries merkle tree
}

struct SettlementEntry {
    recipient: PeerId,
    amount: Amount,
    provenance_hashes: Hash[],  # Content hashes for audit
    payment_ids: Hash[]         # Payment IDs included
}

# SETTLE_CONFIRM - Confirm settlement on-chain
struct SettleConfirmPayload {
    batch_id: Hash,
    transaction_id: string,     # On-chain transaction ID
    block_number: uint64,
    timestamp: Timestamp
}

6.8 Peer Messages

# PING
struct PingPayload {
    nonce: uint64
}

# PONG
struct PongPayload {
    nonce: uint64               # Echo back
}

# PEER_INFO - Exchange peer information
struct PeerInfoPayload {
    peer_id: PeerId,
    public_key: PublicKey,
    addresses: MultiAddr[],
    capabilities: Capability[],
    content_count: uint64,
    uptime: uint64              # Seconds since node start
}

enum Capability : uint8 {
    QUERY    = 0x01,            # Can serve queries
    CHANNEL  = 0x02,            # Supports payment channels
    SETTLE   = 0x04,            # Can initiate settlement
    INDEX    = 0x08             # Participates in DHT indexing
}
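Capability values are powers of two, so a peer's full capability set fits in a single byte and membership is a bitwise test:

```python
from enum import IntFlag

class Capability(IntFlag):
    QUERY = 0x01    # Can serve queries
    CHANNEL = 0x02  # Supports payment channels
    SETTLE = 0x04   # Can initiate settlement
    INDEX = 0x08    # Participates in DHT indexing

# A peer advertising query serving and payment channels
caps = Capability.QUERY | Capability.CHANNEL
```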

7. Protocol Operations

7.1 Content Operations

7.1.1 Create

Create new content locally (not yet published).

CREATE(content: bytes, content_type: ContentType, metadata: Metadata) → Hash

Procedure:
    1. hash = ContentHash(content)
    2. version = Version {
           number: 1,
           previous: null,
           root: hash,
           timestamp: now()
       }
    3. provenance = compute_provenance(content_type, sources=[])
    4. manifest = Manifest {
           hash: hash,
           content_type: content_type,
           version: version,
           visibility: Private,
           access: default_access(),
           metadata: metadata,
           economics: Economics { price: 0, currency: HBAR, ... },
           provenance: provenance,
           created_at: now(),
           updated_at: now()
       }
    5. Store content and manifest locally
    6. Return hash
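The steps above can be sketched with a reduced manifest. SHA-256 as ContentHash matches the nodalync-crypto crate; the L0 self-entry in root_L0L1 follows validation rule 9.3 (1):

```python
import hashlib, time

def content_hash(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()

def create(content: bytes, content_type: str, title: str) -> dict:
    # New content starts its own version chain (root == own hash),
    # is Private, and carries depth-0 provenance.
    h = content_hash(content)
    roots = [{"hash": h, "weight": 1}] if content_type == "L0" else []
    return {
        "hash": h,
        "content_type": content_type,
        "version": {"number": 1, "previous": None,
                    "root": h, "timestamp": time.time()},
        "visibility": "Private",
        "metadata": {"title": title, "content_size": len(content)},
        "provenance": {"root_L0L1": roots, "derived_from": [], "depth": 0},
    }
```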

7.1.2 Extract L1

Extract mentions from L0 content.

EXTRACT_L1(hash: Hash) → L1Summary

Preconditions:
    - Content exists locally
    - content_type == L0
    
Procedure:
    1. content = load_content(hash)
    2. mentions = extract_mentions(content)  # AI or rule-based
    3. summary = L1Summary {
           l0_hash: hash,
           mention_count: len(mentions),
           preview_mentions: mentions[0:5],
           primary_topics: extract_topics(mentions),
           summary: generate_summary(content)
       }
    4. Store L1 data with content record
    5. Return summary

7.1.2a Build L2 (Entity Graph)

Build an L2 Entity Graph from one or more L1 sources. L2 is your personal knowledge structure — it is never published or queried by others.

BUILD_L2(source_l1s: Hash[], config: L2BuildConfig?) → Hash

Preconditions:
    - All source L1s have been queried (payment proof exists) OR are your own
    - len(source_l1s) >= 1

Procedure:
    1. Verify all L1 sources are accessible:
       For each l1_hash in source_l1s:
           assert cache.has(l1_hash) OR content.has(l1_hash)
           l1 = load_l1(l1_hash)
           assert l1.content_type == L1
           
    2. Extract entities from mentions:
       raw_entities = []
       For each l1 in source_l1s:
           For each mention in l1.mentions:
               extracted = extract_entities(mention, config.prefixes)
               raw_entities.extend(extracted)
               
    3. Resolve entities (merge duplicates):
       resolved_entities = resolve_entities(raw_entities, config)
       # Handles: exact match, alias resolution, coreference, external KB linking
       # Assigns URIs from configured ontologies
       
    4. Extract relationships:
       relationships = extract_relationships(resolved_entities, source_l1s, config)
       # Uses predicates from configured ontologies (default: ndl:)
       
    5. Build L2 structure:
       l2_graph = L2EntityGraph {
           id: computed after serialization,
           source_l1s: [L1Reference for each l1],
           source_l2s: [],
           prefixes: config.prefixes ?? default_prefixes(),
           entities: resolved_entities,
           relationships: relationships,
           entity_count: len(resolved_entities),
           relationship_count: len(relationships),
           source_mention_count: total_mentions_linked
       }
       
    6. Compute hash:
       content = serialize(l2_graph)   # serialized with id unset
       hash = ContentHash(content)
       l2_graph.id = hash
       
    7. Compute provenance:
       root_entries = []
       For each l1 in source_l1s:
           l1_prov = get_provenance(l1)
           For each entry in l1_prov.root_L0L1:
               merge_or_increment(root_entries, entry)
       
       provenance = Provenance {
           root_L0L1: root_entries,
           derived_from: source_l1s,
           depth: max(l1.provenance.depth for l1 in source_l1s) + 1
       }
       
    8. Create manifest:
       manifest = Manifest {
           hash: hash,
           content_type: L2,
           owner: my_peer_id,
           visibility: Private,           # L2 is ALWAYS private
           economics: Economics { price: 0, ... },  # L2 has no price
           provenance: provenance,
           ...
       }
       
    9. Store content and manifest locally
    10. Return hash

struct L2BuildConfig {
    # Ontology configuration
    prefixes: PrefixMap?,                # Custom prefix mappings
    default_entity_type: Uri?,           # Default: "ndl:Concept"
    
    # Entity resolution settings
    resolution_threshold: float64?,      # Minimum confidence to merge (default: 0.8)
    use_external_kb: bool?,              # Link to external knowledge bases
    external_kb_list: Uri[]?,            # Which KBs: ["http://www.wikidata.org/", ...]
    
    # Relationship extraction
    extract_implicit: bool?,             # Infer implicit relationships
    relationship_predicates: Uri[]?      # Limit to specific predicates
}

7.1.2b Merge L2

Merge several of your own L2 Entity Graphs into a single unified graph. This is useful when you have built separate graphs for different domains and want to unify them.

MERGE_L2(source_l2s: Hash[], config: L2MergeConfig?) → Hash

Preconditions:
    - All source L2s are your own (stored locally)
    - len(source_l2s) >= 2

Procedure:
    1. Load all L2 sources (must be local, L2 is never queried)
    2. Unify prefix mappings (merge, detect conflicts)
    3. Collect all entities and relationships from sources
    4. Cross-graph entity resolution (find same entities in different graphs)
       # Match by: canonical_uri, same_as links, label similarity
    5. Merge relationships (update entity references to merged IDs)
    6. Build new L2 with:
       source_l1s: union of all source L1 references
       source_l2s: the input source_l2s
       prefixes: merged prefix map
    7. Compute provenance:
       root_entries = merge roots from all source_l2s
       provenance = Provenance {
           root_L0L1: root_entries,
           derived_from: source_l2s,
           depth: max(l2.provenance.depth for l2 in source_l2s) + 1
       }
    8. Create manifest with visibility = Private
    9. Store and return hash

struct L2MergeConfig {
    prefixes: PrefixMap?,                # Override prefix mappings
    entity_merge_threshold: float64?,    # Confidence threshold for merging entities
    prefer_source: uint32?               # Index of source to prefer on conflicts
}

7.1.3 Publish

Make content available on the network. Note: L2 content cannot be published.

PUBLISH(hash: Hash, visibility: Visibility, price: Amount, access: AccessControl?) → bool

Preconditions:
    - Content exists locally
    - content_type != L2  # L2 is always private
    - visibility != Private (publishing with Private visibility is a no-op)
    
Procedure:
    1. manifest = load_manifest(hash)
    2. If manifest.content_type == L2:
           Return error("L2 content cannot be published")
    3. manifest.visibility = visibility
    4. manifest.economics.price = price
    5. manifest.access = access ?? default_access()
    6. manifest.updated_at = now()
    7. Save manifest
    
    8. If visibility == Shared:
           l1_summary = get_or_extract_l1(hash)
           announce = AnnouncePayload {
               hash: hash,
               content_type: manifest.content_type,
               title: manifest.metadata.title,
               l1_summary: l1_summary,
               price: price,
               addresses: my_addresses()
           }
           DHT.announce(hash, announce)
           
    9. Return true

7.1.4 Update

Create a new version of existing content.

UPDATE(old_hash: Hash, new_content: bytes) → Hash

Preconditions:
    - Old content exists locally
    - Caller owns old content
    
Procedure:
    1. old_manifest = load_manifest(old_hash)
    2. new_hash = ContentHash(new_content)
    3. new_version = Version {
           number: old_manifest.version.number + 1,
           previous: old_hash,
           root: old_manifest.version.root,
           timestamp: now()
       }
    4. new_manifest = copy(old_manifest)
       new_manifest.hash = new_hash
       new_manifest.version = new_version
       new_manifest.updated_at = now()
    5. Store new content and manifest
    
    6. If old_manifest.visibility == Shared:
           update_announce = AnnounceUpdatePayload {
               version_root: new_manifest.version.root,
               new_hash: new_hash,
               version_number: new_version.number,
               ...
           }
           DHT.announce_update(new_manifest.version.root, update_announce)
           
    7. Return new_hash

7.1.5 Derive (Create L3)

Create an L3 insight from multiple sources.

DERIVE(sources: Hash[], insight_content: bytes, metadata: Metadata) → Hash

Sources may include any combination of:
    - L0 content (raw documents)
    - L1 content (mention collections)
    - L2 content (entity graphs)
    - L3 content (other insights)

Preconditions:
    - All sources have been queried (payment proof exists)
    - At least one source
    
Procedure:
    1. Verify all sources were queried:
       For each source in sources:
           assert cache.has(source) OR content.has(source)
           
    2. Compute provenance:
       root_entries = []
       For each source in sources:
           source_prov = get_provenance(source)
           For each entry in source_prov.root_L0L1:
               merge_or_increment(root_entries, entry)
       
       # Note: For L0/L1 sources, merge their root_L0L1 directly
       #       For L2 sources, merge the L2's root_L0L1 (traces back to L0/L1)
       #       For L3 sources, merge the L3's root_L0L1 (recursive)
           
       provenance = Provenance {
           root_L0L1: root_entries,
           derived_from: sources,
           depth: max(source.provenance.depth for source in sources) + 1
       }
       
    3. hash = ContentHash(insight_content)
    4. Create manifest with content_type = L3, provenance
    5. Store locally
    6. Return hash

Helper merge_or_increment(entries, new_entry):
    existing = find(entries, e => e.hash == new_entry.hash)
    If existing:
        existing.weight += new_entry.weight
    Else:
        entries.append(new_entry with weight=1)
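The helper above, together with the flattening loop from step 2, translates directly. Note that, as specified, a root seen for the first time enters with weight 1:

```python
def merge_or_increment(entries: list, new_entry: dict) -> None:
    # Same root reached through several sources: accumulate its weight
    for e in entries:
        if e["hash"] == new_entry["hash"]:
            e["weight"] += new_entry["weight"]
            return
    entries.append({"hash": new_entry["hash"], "weight": 1})

def compute_root_l0l1(sources: list) -> list:
    # Step 2 of DERIVE: fold every source's root set into one weighted list
    roots = []
    for src in sources:
        for entry in src["provenance"]["root_L0L1"]:
            merge_or_increment(roots, entry)
    return roots
```

The resulting weights are exactly what the query handler in 7.2.3 uses to split the 95% contributor pool.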

7.1.6 Reference L3 as L0 (Import)

Reference an external L3 as foundational input for your own derivations.

REFERENCE_L3_AS_L0(source_l3_hash: Hash) → Reference

Preconditions:
    - L3 has been queried at least once (payment proof exists)
    - Source content_type == L3
    
Procedure:
    1. Verify L3 was queried:
           assert cache.has(source_l3_hash)
           source_manifest = cache[source_l3_hash].manifest
           assert source_manifest.content_type == L3
           
    2. Create reference in local graph:
           reference = Reference {
               hash: source_l3_hash,
               owner: source_manifest.owner,
               treat_as: L0,  # Treat this L3 as foundational for derivations
               imported_at: now()
           }
           
    3. Store reference locally
    4. Return reference

IMPORTANT: This is a reference operation, not data transfer. The actual 
content remains on the original owner's node. "Import" means treating an 
external L3 as foundational input (L0) in your own derivation chains.

When deriving from this reference:
    - The reference is included in derived_from[]
    - The L3's root_L0L1 is merged into the new content's root_L0L1
    - The L3 itself is added to root_L0L1 (the creator becomes a root)
    - Each query to your derived content triggers payments to:
      - You (5% synthesis fee)
      - The L3 creator (as a root contributor)
      - All upstream contributors in the L3's provenance chain

7.2 Query Operations

7.2.1 Discover

Search for content on the network.

DISCOVER(query: string, filters: SearchFilters?) → SearchResult[]

Procedure:
    1. search_payload = SearchPayload {
           query: query,
           filters: filters,
           limit: 50,
           offset: 0
       }
    2. results = DHT.search(search_payload)
    3. Return results sorted by relevance_score

7.2.2 Preview

Get L1 preview for content (free).

PREVIEW(peer: PeerId, hash: Hash) → (Manifest, L1Summary)

Procedure:
    1. Send PREVIEW_REQUEST { hash } to peer
    2. Await PREVIEW_RESPONSE
    3. Verify response.hash == hash
    4. Return (response.manifest, response.l1_summary)

Handler (receiving node):
    1. manifest = load_manifest(request.hash)
    2. If manifest is null:
           Return QUERY_ERROR { NOT_FOUND }
    3. If manifest.visibility == Private:
           Return QUERY_ERROR { NOT_FOUND }  # Don't reveal existence
    4. If manifest.visibility == Unlisted:
           If not check_access(sender, manifest.access):
               Return QUERY_ERROR { ACCESS_DENIED }
    5. l1_summary = load_l1_summary(request.hash)
    6. Return PREVIEW_RESPONSE { hash, manifest, l1_summary }

7.2.3 Query

Request content with payment.

QUERY(peer: PeerId, hash: Hash, query_text: string?) → (bytes, Manifest, PaymentReceipt)

Procedure:
    1. Ensure channel exists with peer:
           If not channels.has(peer):
               CHANNEL_OPEN(peer)
               
    2. Preview first to get price:
           (manifest, _) = PREVIEW(peer, hash)
           price = manifest.economics.price
           
    3. Create payment:
           payment = Payment {
               id: H(channel_id || nonce || price || peer),
               nonce: channel.nonce + 1,
               amount: price,
               recipient: peer,
               query_hash: hash,
               provenance: manifest.provenance.root_L0L1,
               timestamp: now(),
               signature: Sign(my_key, payment_data)
           }
           
    4. Send QUERY_REQUEST { hash, query_text, payment }
    5. Await QUERY_RESPONSE
    
    6. Verify response:
           assert ContentHash(response.content) == hash
           assert response.payment_receipt.amount == price
           
    7. Update channel state:
           channel.my_balance -= price
           channel.nonce += 1
           channel.pending_payments.append(payment)
           
    8. Cache content:
           cache[hash] = CachedContent {
               hash, content, peer, now(), response.payment_receipt
           }
           
    9. Return (response.content, response.manifest, response.payment_receipt)

Handler (receiving node):
    1. manifest = load_manifest(request.hash)
    2. Validate visibility and access (same as PREVIEW)
    
    3. Validate payment:
           assert request.payment.amount >= manifest.economics.price
           assert request.payment.recipient == my_peer_id
           assert Verify(sender_pubkey, payment_data, request.payment.signature)
           assert channel_has_balance(sender, request.payment.amount)
           
    4. Update channel state:
           channel.their_balance -= request.payment.amount
           channel.my_balance += (request.payment.amount * 0.05)  # Synthesis fee
           channel.nonce = max(channel.nonce, request.payment.nonce) + 1
           
    5. Queue distribution:
           For each entry in manifest.provenance.root_L0L1:
               share = (request.payment.amount * 0.95) / total_weight
               queue_settlement(entry.owner, share * entry.weight, hash)
               
    6. Update economics:
           manifest.economics.total_queries += 1
           manifest.economics.total_revenue += request.payment.amount
           
    7. content = load_content(request.hash)
    8. receipt = PaymentReceipt { ... }
    9. Return QUERY_RESPONSE { hash, content, manifest, receipt }
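Steps 4 and 5 of the handler implement the economic split: 5% stays with the serving owner as a synthesis fee, and 95% is distributed across root_L0L1 proportionally to weight. A sketch in integer arithmetic; crediting the rounding remainder to the owner's fee is an assumption here, not part of the spec:

```python
def distribute(amount: int, roots: list) -> dict:
    # 5% synthesis fee to the serving owner, 95% to root contributors
    fee = amount * 5 // 100
    pool = amount - fee
    total_weight = sum(e["weight"] for e in roots)
    shares = {}
    for e in roots:
        share = pool * e["weight"] // total_weight
        shares[e["owner"]] = shares.get(e["owner"], 0) + share
    # Rounding remainder folded into the fee (an assumption, not spec)
    fee += pool - sum(shares.values())
    return {"synthesis_fee": fee, "shares": shares}
```

Every unit of the payment is accounted for: fee plus shares always sums back to the original amount.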

7.3 Channel Operations

7.3.1 Open Channel

CHANNEL_OPEN(peer: PeerId, initial_balance: Amount) → Channel

Procedure:
    1. channel_id = H(my_peer_id || peer || random_nonce())
    2. Send CHANNEL_OPEN { channel_id, initial_balance, funding_tx }
    3. Await CHANNEL_ACCEPT
    4. channel = Channel {
           peer_id: peer,
           state: Open,
           my_balance: initial_balance,
           their_balance: response.initial_balance,
           nonce: 0,
           last_update: now(),
           pending_payments: []
       }
    5. channels[peer] = channel
    6. Return channel

7.3.2 Close Channel

CHANNEL_CLOSE(peer: PeerId) → SettlementEntry[]

Procedure:
    1. channel = channels[peer]
    2. Assert channel.state == Open
    
    3. Create settlement entries from pending payments:
           entries = aggregate_payments(channel.pending_payments)
           
    4. Send CHANNEL_CLOSE { channel_id, final_balances, settlement_tx }
    5. Await acknowledgment or timeout
    
    6. If cooperative:
           Submit settlement to chain
           channel.state = Closed
       Else:
           Initiate dispute resolution
           
    7. Return entries

7.4 Settlement Operations

SETTLE_BATCH(entries: SettlementEntry[]) → TransactionId

Procedure:
    1. batch_id = H(entries || now())
    2. merkle_root = compute_merkle_root(entries)
    
    3. Build on-chain transaction:
           For each entry in entries:
               Add transfer: entry.recipient receives entry.amount
               
    4. Submit transaction to Hedera
    5. Await confirmation
    
    6. Broadcast SETTLE_CONFIRM { batch_id, tx_id, block, timestamp }
    7. Clear settled payments from channels
    
    8. Return tx_id
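The batch commitment in step 2 can be sketched as a standard binary Merkle tree. Duplicating the last node on odd levels is an assumption, since the spec does not pin down the tree shape:

```python
import hashlib

def compute_merkle_root(leaves: list) -> bytes:
    # Pairwise SHA-256 up to a single root; requires at least one leaf
    assert leaves, "settlement batch must contain at least one entry"
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                 # odd level: duplicate last node
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]
```

Each leaf would be the serialized SettlementEntry, letting any recipient audit that their entry is included in the on-chain batch.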

8. State Transitions

8.1 Content State Machine

                    ┌──────────────────────────────────────────┐
                    │                                          │
                    ▼                                          │
┌─────────┐     ┌─────────┐     ┌─────────┐     ┌─────────┐  │
│ (none)  │────▶│ Private │────▶│Unlisted │────▶│ Shared  │──┘
└─────────┘     └─────────┘     └─────────┘     └─────────┘
    │               │               │               │
    │               │               │               │
    │  CREATE       │  PUBLISH      │  PUBLISH      │
    │               │  (unlisted)   │  (shared)     │
    │               │               │               │
    │               │◀──────────────│◀──────────────│
    │               │   UNPUBLISH   │   UNPUBLISH   │
    │               │               │               │
    │               │               │               │
    └───────────────┴───────────────┴───────────────┘
                            │
                            │ DELETE
                            ▼
                      ┌─────────┐
                      │ Deleted │
                      └─────────┘

Valid transitions:
    (none) → Private:    CREATE
    Private → Unlisted:  PUBLISH(visibility=Unlisted)
    Private → Shared:    PUBLISH(visibility=Shared)
    Unlisted → Shared:   PUBLISH(visibility=Shared)
    Unlisted → Private:  UNPUBLISH
    Shared → Unlisted:   UNPUBLISH(keep_unlisted=true)
    Shared → Private:    UNPUBLISH
    Shared → Offline:    TAKE_OFFLINE (manifest preserved for provenance)
    Unlisted → Offline:  TAKE_OFFLINE
    Offline → Shared:    PUBLISH(visibility=Shared)
    Offline → Unlisted:  PUBLISH(visibility=Unlisted)
    Any → Deleted:       DELETE (local only, provenance persists)
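The transition table is small enough to encode directly; a sketch an implementation could use to reject invalid PUBLISH/UNPUBLISH/TAKE_OFFLINE calls:

```python
VALID_TRANSITIONS = {
    ("Private", "Unlisted"), ("Private", "Shared"),
    ("Unlisted", "Shared"), ("Unlisted", "Private"), ("Unlisted", "Offline"),
    ("Shared", "Unlisted"), ("Shared", "Private"), ("Shared", "Offline"),
    ("Offline", "Shared"), ("Offline", "Unlisted"),
}

def can_transition(current: str, target: str) -> bool:
    # DELETE is valid from any state; everything else must be in the table
    return target == "Deleted" or (current, target) in VALID_TRANSITIONS
```

Note that Offline content can only return via PUBLISH to Shared or Unlisted; there is no direct Offline to Private transition.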

8.2 Channel State Machine

┌─────────┐     ┌─────────┐     ┌─────────┐
│ (none)  │────▶│ Opening │────▶│  Open   │
└─────────┘     └─────────┘     └─────────┘
                    │               │   │
                    │ timeout       │   │ UPDATE
                    │               │   └────┐
                    ▼               │        │
              ┌─────────┐          │        │
              │ Failed  │          │◀───────┘
              └─────────┘          │
                                   │
                    ┌──────────────┴──────────────┐
                    │                             │
                    ▼ cooperative                 ▼ unilateral/dispute
              ┌─────────┐                   ┌──────────┐
              │ Closing │                   │ Disputed │
              └─────────┘                   └──────────┘
                    │                             │
                    │ settled                     │ resolved
                    ▼                             ▼
              ┌─────────────────────────────────────┐
              │              Closed                 │
              └─────────────────────────────────────┘

Valid transitions:
    (none) → Opening:    CHANNEL_OPEN sent
    Opening → Open:      CHANNEL_ACCEPT received
    Opening → Failed:    Timeout or rejection
    Open → Open:         CHANNEL_UPDATE (payment)
    Open → Closing:      CHANNEL_CLOSE (cooperative)
    Open → Disputed:     CHANNEL_DISPUTE
    Closing → Closed:    Settlement confirmed
    Disputed → Closed:   Dispute resolved on-chain

8.3 Query State Machine (per request)

┌─────────┐     ┌─────────┐     ┌──────────┐     ┌─────────┐
│Initiate │────▶│ Preview │────▶│ Payment  │────▶│Complete │
└─────────┘     └─────────┘     └──────────┘     └─────────┘
                    │               │
                    │ error         │ error
                    ▼               ▼
              ┌───────────────────────┐
              │        Failed         │
              └───────────────────────┘

States:
    Initiate:   Query started
    Preview:    L1 preview received, evaluating
    Payment:    Payment sent, awaiting content
    Complete:   Content received and verified
    Failed:     Error at any stage

9. Validation Rules

9.1 Content Validation

VALIDATE_CONTENT(content: bytes, manifest: Manifest) → bool

Rules:
    1. ContentHash(content) == manifest.hash
    2. len(content) == manifest.metadata.content_size
    3. len(manifest.metadata.title) <= 200
    4. len(manifest.metadata.description) <= 2000
    5. len(manifest.metadata.tags) <= 20
    6. For each tag: len(tag) <= 50
    7. manifest.content_type in {L0, L1, L2, L3}
    8. manifest.visibility in {Private, Unlisted, Shared, Offline}
    
    # L2-specific validation
    9. If manifest.content_type == L2:
           l2 = deserialize(content) as L2EntityGraph
           assert l2.id == manifest.hash
           assert len(l2.source_l1s) >= 1
           assert len(l2.entities) >= 1
           assert all entity IDs are unique
           assert all relationship entity refs are valid
           assert all MentionRefs point to valid source L1s
           assert l2.entity_count == len(l2.entities)
           assert l2.relationship_count == len(l2.relationships)

9.2 Version Validation

VALIDATE_VERSION(manifest: Manifest, previous: Manifest?) → bool

Rules:
    1. If manifest.version.number == 1:
           manifest.version.previous == null
           manifest.version.root == manifest.hash
           
    2. If manifest.version.number > 1:
           previous != null
           manifest.version.previous == previous.hash
           manifest.version.root == previous.version.root
           manifest.version.number == previous.version.number + 1
           manifest.version.timestamp > previous.version.timestamp
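The two cases translate directly; a sketch over reduced manifests:

```python
def validate_version(manifest: dict, previous) -> bool:
    v = manifest["version"]
    if v["number"] == 1:
        # First version anchors the chain: no predecessor, root is own hash
        return v["previous"] is None and v["root"] == manifest["hash"]
    # Later versions must link back and preserve the stable root
    return (previous is not None
            and v["previous"] == previous["hash"]
            and v["root"] == previous["version"]["root"]
            and v["number"] == previous["version"]["number"] + 1
            and v["timestamp"] > previous["version"]["timestamp"])
```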

9.3 Provenance Validation

VALIDATE_PROVENANCE(manifest: Manifest, sources: Manifest[]) → bool

Rules:
    1. If manifest.content_type == L0:
           manifest.provenance.root_L0L1 == [self_entry]
           manifest.provenance.derived_from == []
           manifest.provenance.depth == 0
           
    2. If manifest.content_type == L1:
           len(manifest.provenance.root_L0L1) >= 1
           len(manifest.provenance.derived_from) == 1
           derived_from[0] is an L0 hash
           All root_L0L1 entries are type L0
           manifest.provenance.depth == 1
           
    3. If manifest.content_type == L2:
           len(manifest.provenance.root_L0L1) >= 1
           len(manifest.provenance.derived_from) >= 1
           All derived_from are L1 or L2 hashes
           All root_L0L1 entries are type L0 or L1
           manifest.provenance.depth >= 2
           
    4. If manifest.content_type == L3:
           len(manifest.provenance.root_L0L1) >= 1
           len(manifest.provenance.derived_from) >= 1
           All derived_from hashes exist in sources
           All root_L0L1 entries are type L0 or L1
           
    5. root_L0L1 computation is correct:
           computed = compute_root_L0L1(sources)
           manifest.provenance.root_L0L1 == computed
           
    6. Depth is correct:
           manifest.provenance.depth == max(s.provenance.depth for s in sources) + 1
           
    7. No self-reference:
           manifest.hash not in manifest.provenance.derived_from
           manifest.hash not in [e.hash for e in manifest.provenance.root_L0L1]
           
    8. No cycles in provenance graph
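
Two of the rules above (rule 6, depth computation, and rule 7, no self-reference) can be sketched in isolation. The types here are simplified stand-ins, not the reference implementation's structures.

```rust
// Content hash, simplified to a fixed byte array for illustration.
#[derive(Clone, PartialEq)]
pub struct Hash(pub [u8; 32]);

pub struct Provenance {
    pub derived_from: Vec<Hash>,
    pub root_l0l1: Vec<Hash>,
    pub depth: u32,
}

// Rule 6: depth == max(source depths) + 1; an L0 with no sources has depth 0.
pub fn check_depth(manifest_depth: u32, source_depths: &[u32]) -> bool {
    match source_depths.iter().max() {
        Some(max) => manifest_depth == *max + 1,
        None => manifest_depth == 0,
    }
}

// Rule 7: a manifest may not appear in its own provenance.
pub fn check_no_self_reference(own_hash: &Hash, prov: &Provenance) -> bool {
    !prov.derived_from.contains(own_hash) && !prov.root_l0l1.contains(own_hash)
}
```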

9.4 Payment Validation

VALIDATE_PAYMENT(payment: Payment, channel: Channel, manifest: Manifest) → bool

Rules:
    1. payment.amount >= manifest.economics.price
    2. payment.recipient == manifest_owner
    3. payment.query_hash == manifest.hash
    4. channel.state == Open
    5. channel.their_balance >= payment.amount  # Payer has funds
    6. payment.nonce > channel.nonce  # No replay
    7. Verify(payer_pubkey, payment_data, payment.signature)
    8. payment.provenance == manifest.provenance.root_L0L1

9.5 Message Validation

VALIDATE_MESSAGE(msg: Message) → bool

Rules:
    1. msg.version == PROTOCOL_VERSION
    2. msg.type is valid MessageType
    3. msg.timestamp within acceptable skew (±5 minutes)
    4. msg.sender is valid PeerId
    5. Verify(lookup_pubkey(msg.sender), H(msg without signature), msg.signature)
    6. msg.payload decodes correctly for msg.type
    7. Payload-specific validation passes

9.6 Access Validation

VALIDATE_ACCESS(requester: PeerId, manifest: Manifest) → bool

Rules:
    1. If manifest.visibility == Private:
           Return false  # No external access
           
    2. If manifest.visibility == Unlisted:
           If manifest.access.allowlist != null:
               requester in manifest.access.allowlist
           If manifest.access.denylist != null:
               requester not in manifest.access.denylist
               
    3. If manifest.visibility == Shared:
           If manifest.access.denylist != null:
               requester not in manifest.access.denylist
           # Allowlist ignored for Shared (open to all)
           
    4. If manifest.access.require_bond:
           has_bond(requester, manifest.access.bond_amount)
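
The access rules can be sketched as a single predicate. This is an illustrative simplification: `PeerId` is reduced to a `u64`, `has_bond` is a hypothetical callback, and treating `Offline` like `Private` is an assumption (§9.6 does not list the `Offline` visibility explicitly).

```rust
#[derive(Clone, Copy, PartialEq)]
pub enum Visibility { Private, Unlisted, Shared, Offline }

pub struct Access {
    pub allowlist: Option<Vec<u64>>, // PeerIds simplified to u64 for illustration
    pub denylist: Option<Vec<u64>>,
    pub require_bond: bool,
}

pub fn validate_access(requester: u64, vis: Visibility, access: &Access,
                       has_bond: impl Fn(u64) -> bool) -> bool {
    let list_ok = match vis {
        // No external access; Offline treated like Private (an assumption).
        Visibility::Private | Visibility::Offline => return false,
        // Unlisted: both allowlist and denylist apply when present.
        Visibility::Unlisted => {
            access.allowlist.as_ref().map_or(true, |a| a.contains(&requester))
                && access.denylist.as_ref().map_or(true, |d| !d.contains(&requester))
        }
        // Shared: open to all, allowlist ignored, denylist still applies.
        Visibility::Shared => {
            access.denylist.as_ref().map_or(true, |d| !d.contains(&requester))
        }
    };
    list_ok && (!access.require_bond || has_bond(requester))
}
```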

10. Economic Rules

10.1 Revenue Distribution

DISTRIBUTE_REVENUE(payment: Payment) → Distribution[]

Constants:
    SYNTHESIS_FEE = 0.05  # 5%
    ROOT_POOL = 0.95      # 95%

Procedure:
    1. total = payment.amount
    2. owner_share = total * SYNTHESIS_FEE
    3. root_pool = total * ROOT_POOL
    
    4. total_weight = sum(e.weight for e in payment.provenance)
    5. per_weight = root_pool / total_weight
    
    6. distributions = []
    7. For each entry in payment.provenance:
           amount = per_weight * entry.weight
           
           # Owner also gets share if they have roots
           If entry.owner == content_owner:
               owner_share += amount
           Else:
               distributions.append(Distribution {
                   recipient: entry.owner,
                   amount: amount,
                   source_hash: entry.hash
               })
               
    8. distributions.append(Distribution {
           recipient: content_owner,
           amount: owner_share,
           source_hash: payment.query_hash
       })
       
    9. Return distributions

10.2 Distribution Example

Scenario:
    Bob's L3 derives from:
        - Alice's L0 (2 documents)
        - Carol's L0 (1 document)
        - Bob's L0 (2 documents)
    
    Query payment: 100 HBAR

Provenance:
    root_L0L1 = [
        { hash: alice_1, owner: Alice, weight: 1 },
        { hash: alice_2, owner: Alice, weight: 1 },
        { hash: carol_1, owner: Carol, weight: 1 },
        { hash: bob_1, owner: Bob, weight: 1 },
        { hash: bob_2, owner: Bob, weight: 1 }
    ]
    total_weight = 5

Distribution:
    owner_share = 100 * 0.05 = 5 HBAR (Bob's synthesis fee)
    root_pool = 100 * 0.95 = 95 HBAR
    per_weight = 95 / 5 = 19 HBAR

    Alice: 2 * 19 = 38 HBAR
    Carol: 1 * 19 = 19 HBAR
    Bob (roots): 2 * 19 = 38 HBAR
    Bob (synthesis): 5 HBAR
    Bob total: 43 HBAR (5 + 38)

Final:
    Alice: 38 HBAR (38%)
    Carol: 19 HBAR (19%)
    Bob: 43 HBAR (43%)
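
The procedure in §10.1 and the worked example above can be sketched with integer tinybar arithmetic (the spec forbids floating point for amounts). Two details here are assumptions, not spec text: the root pool is computed as `amount - fee` rather than `amount * 95 / 100` (identical when the amount divides evenly), and any rounding remainder from the per-weight division is simply not distributed.

```rust
use std::collections::BTreeMap;

// One root_L0L1 entry, pared down to the fields the distribution uses.
pub struct RootEntry<'a> { pub owner: &'a str, pub weight: u64 }

/// Returns the total payout per owner, in tinybars.
/// Assumes at least one root entry (guaranteed by the §9.3 provenance rules).
pub fn distribute(amount: u64, content_owner: &str, roots: &[RootEntry]) -> BTreeMap<String, u64> {
    let mut owner_share = amount * 5 / 100;   // 5% synthesis fee
    let root_pool = amount - owner_share;     // 95% pool for foundational roots
    let total_weight: u64 = roots.iter().map(|e| e.weight).sum();
    let per_weight = root_pool / total_weight;

    let mut out: BTreeMap<String, u64> = BTreeMap::new();
    for e in roots {
        let share = per_weight * e.weight;
        if e.owner == content_owner {
            owner_share += share;             // owner also earns on their own roots
        } else {
            *out.entry(e.owner.to_string()).or_insert(0) += share;
        }
    }
    out.insert(content_owner.to_string(), owner_share);
    out
}
```

Running this on the §10.2 scenario (100 HBAR = 10^10 tinybars, five roots of weight 1) reproduces the 38/19/43 split.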

10.3 Price Setting

Constraints:
    MIN_PRICE = 1  # 1 tinybar (10^-8 HBAR)
    MAX_PRICE = 10^16  # Practical maximum
    
Rules:
    1. price >= MIN_PRICE
    2. price <= MAX_PRICE
    3. price is uint64 (no floating point)
    4. Owner can change price at any time (takes effect immediately)

10.4 Settlement Batching

BATCH_THRESHOLD = 100 HBAR  # Minimum to trigger auto-settlement
BATCH_INTERVAL = 3600      # Maximum seconds between settlements

Rules:
    1. Settlement triggered when:
           sum(pending_payments) >= BATCH_THRESHOLD
           OR time_since_last_settlement >= BATCH_INTERVAL
           OR channel_closing
           
    2. Batch includes all pending payments across all channels
    3. Payments aggregated by recipient (one entry per recipient)
    4. Merkle root allows any recipient to verify inclusion

11. Network Layer

11.1 Transport

The protocol uses libp2p for peer-to-peer communication:

Transports:
    - TCP (primary)
    - QUIC (preferred when available)
    - WebSocket (browser compatibility)
    
Multiplexing:
    - yamux
    - mplex (fallback)
    
Security:
    - Noise protocol (XX handshake pattern)
    - TLS 1.3 (fallback)

11.2 Discovery

DHT: Kademlia
    - Key space: 256-bit (SHA-256)
    - Bucket size: 20
    - Alpha (parallelism): 3
    - Replication factor: 20

Content records stored at:
    key = H(content_hash)
    value = AnnouncePayload (signed)
    
Version updates stored at:
    key = H("version:" || version_root)
    value = AnnounceUpdatePayload (signed)
    
Search index:
    - Local inverted index per node
    - Gossip-based index synchronization
    - Semantic embeddings for similarity search

11.3 Peer Discovery

Bootstrap nodes:
    - Hardcoded list of well-known nodes
    - DNS-based discovery (TXT records)
    
Peer exchange:
    - Nodes share peer lists periodically
    - Prefer peers with high uptime and low latency
    
NAT traversal:
    - STUN for address discovery
    - Relay nodes for symmetric NAT
    - Hole punching via DCUtR

11.4 Message Routing

Direct messages:
    - Point-to-point when peer is known
    - DHT lookup to find peer addresses
    
Broadcast messages:
    - GossipSub for protocol announcements
    - Topic: /nodalync/announce/1.0.0
    
Request-response:
    - Dedicated protocol streams
    - Timeout: 30 seconds default
    - Retry: 3 attempts with exponential backoff
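
The retry schedule can be sketched as below. The 1-second base delay in the usage example is an assumption; the spec fixes only the 30-second timeout and the attempt count.

```rust
use std::time::Duration;

// Exponential backoff: delay doubles on each successive attempt.
pub fn backoff_delays(attempts: u32, base: Duration) -> Vec<Duration> {
    (0..attempts).map(|i| base * 2u32.pow(i)).collect()
}
```

With `backoff_delays(3, Duration::from_secs(1))`, the three attempts wait 1 s, 2 s, and 4 s.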

12. Settlement Layer

12.1 Chain Selection

Primary: Hedera Hashgraph

Rationale:
    - Fast finality (3-5 seconds)
    - Low cost (~$0.0001/tx)
    - High throughput (10,000+ TPS)
    - Suitable for micropayment batching

12.2 On-Chain Data

Settlement Contract State:
    balances: Map<AccountId, Amount>        # Token balances
    channels: Map<ChannelId, ChannelState>  # Channel states
    attestations: Map<Hash, Attestation>    # Content attestations

struct Attestation {
    content_hash: Hash,
    owner: AccountId,
    timestamp: Timestamp,
    provenance_root: Hash  # Merkle root of root_L0L1
}

struct ChannelState {
    participants: [AccountId, AccountId],
    balances: [Amount, Amount],
    nonce: uint64,
    status: ChannelStatus
}

12.3 Contract Operations

// Deposit tokens to protocol
deposit(amount: Amount)
    Requires: sender has sufficient tokens
    Effects: balances[sender] += amount

// Withdraw tokens from protocol
withdraw(amount: Amount)
    Requires: balances[sender] >= amount
    Effects: balances[sender] -= amount, transfer to sender

// Attest content publication
attest(content_hash: Hash, provenance_root: Hash)
    Requires: caller is content owner
    Effects: attestations[content_hash] = Attestation { ... }

// Open payment channel
openChannel(peer: AccountId, myDeposit: Amount, peerDeposit: Amount)
    Requires: both parties sign, sufficient balances
    Effects: Create channel, lock deposits

// Update channel state (cooperative)
updateChannel(channelId: ChannelId, newState: ChannelState, signatures: [Sig, Sig])
    Requires: Both signatures valid, nonce > current nonce
    Effects: Update channel state

// Close channel (cooperative)
closeChannel(channelId: ChannelId, finalState: ChannelState, signatures: [Sig, Sig])
    Requires: Both signatures valid
    Effects: Distribute balances, delete channel

// Dispute channel (unilateral)
disputeChannel(channelId: ChannelId, claimedState: ChannelState, signature: Sig)
    Requires: Valid signature from one party
    Effects: Start dispute period (24 hours)

// Resolve dispute
resolveDispute(channelId: ChannelId)
    Requires: Dispute period elapsed
    Effects: Apply highest-nonce state, close channel

// Batch settlement
settleBatch(entries: SettlementEntry[], merkleProofs: MerkleProof[])
    Requires: Valid merkle proofs, sufficient channel balances
    Effects: Transfer amounts to recipients

12.4 Currency

Currency: HBAR (Hedera native token)
    Decimals: 8 (1 HBAR = 10^8 tinybars)

The protocol uses HBAR directly for all payments. This decision:
    - Eliminates token bootstrapping complexity
    - Leverages existing HBAR liquidity and exchanges
    - Avoids securities/regulatory concerns
    - Allows focus on proving the knowledge economics model

All amounts in the protocol are denominated in tinybars (10^-8 HBAR).

13. Security Considerations

13.1 Threat Model

Assumptions:
    - Network is asynchronous and unreliable
    - Adversaries can delay or drop messages
    - Adversaries can create unlimited identities (Sybil)
    - Adversaries cannot break cryptographic primitives
    - Majority of economic stake is honest

Threats addressed:
    1. Content theft (copying after query)
    2. Payment fraud (fake payments, double-spending)
    3. Provenance manipulation (false attribution)
    4. Eclipse attacks (isolating nodes)
    5. Denial of service
    
Threats NOT addressed (out of scope):
    1. Content quality/accuracy
    2. Legal disputes over IP
    3. Privacy of query patterns
    4. Nation-state level attacks

13.2 Mitigations

Content theft:
    - Mitigation: Audit trail, timestamps, legal recourse
    - Note: Cannot prevent, only detect and prove
    
Payment fraud:
    - Mitigation: Cryptographic signatures, channel states
    - Settlement disputes resolve on-chain with evidence
    
Provenance manipulation:
    - Mitigation: Content-addressed hashing
    - Cannot claim derivation without querying (payment proof)
    
Eclipse attacks:
    - Mitigation: Multiple bootstrap nodes, peer diversity requirements
    - Monitor for unusual peer behavior
    
Denial of service:
    - Mitigation: Rate limiting, require payment bonds
    - Reputation system penalizes bad actors

13.3 Key Management

Private key storage:
    - Encrypted at rest (AES-256-GCM)
    - Key derived from user password (Argon2id)
    - Optional hardware security module support
    
Key rotation:
    - Supported via identity update message
    - Old key signs authorization for new key
    - Grace period for transition
    
Recovery:
    - Optional mnemonic backup (BIP-39)
    - Social recovery (threshold signatures) - future

13.4 Privacy Considerations

Visible to network:
    - Content hashes (not content)
    - L1 previews (for shared content)
    - Provenance chains
    - Payment amounts (in settlement batches)
    
Hidden from network:
    - Private content (entirely local)
    - Query text (between querier and node)
    - Unlisted content (unless you have hash)
    
Future improvements:
    - ZK proofs for provenance verification
    - Private settlement channels
    - Onion routing for query privacy

Appendix A: Wire Formats

A.1 Message Encoding

All messages use deterministic CBOR encoding:

Message wire format:
    [0x00]                  # Protocol magic byte
    [version: uint8]        # Protocol version
    [type: uint16]          # Message type
    [length: uint32]        # Payload length
    [payload: bytes]        # CBOR-encoded payload
    [signature: 64 bytes]   # Ed25519 signature
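
The framing above can be sketched using only the standard library. The signature is taken as an already-computed 64-byte Ed25519 signature, and the multi-byte integers are written big-endian, which is an assumption the wire format does not state explicitly.

```rust
// Frame a message per Appendix A.1: magic, version, type, length, payload, signature.
pub fn frame_message(version: u8, msg_type: u16, payload: &[u8], signature: &[u8; 64]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + payload.len() + 64);
    out.push(0x00);                                               // protocol magic byte
    out.push(version);                                            // protocol version
    out.extend_from_slice(&msg_type.to_be_bytes());               // message type (u16)
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes()); // payload length (u32)
    out.extend_from_slice(payload);                               // CBOR-encoded payload
    out.extend_from_slice(signature);                             // Ed25519 signature
    out
}
```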

A.2 Hash Computation

ContentHash:
    H(
        [0x00]              # Domain separator for content
        [length: uint64]    # Content length
        [content: bytes]    # Raw content
    )

MessageHash (for signing):
    H(
        [0x01]              # Domain separator for messages
        [version: uint8]
        [type: uint16]
        [id: 32 bytes]
        [timestamp: uint64]
        [sender: 20 bytes]
        [payload_hash: 32 bytes]  # H(payload)
    )

ChannelStateHash:
    H(
        [0x02]              # Domain separator for channels
        [channel_id: 32 bytes]
        [nonce: uint64]
        [initiator_balance: uint64]
        [responder_balance: uint64]
    )

Appendix B: Constants

PROTOCOL_VERSION = 0x01
PROTOCOL_MAGIC = 0x00

# Timing
MESSAGE_TIMEOUT_MS = 30000
CHANNEL_DISPUTE_PERIOD_MS = 86400000  # 24 hours
MAX_CLOCK_SKEW_MS = 300000  # 5 minutes

# Limits
MAX_CONTENT_SIZE = 104857600  # 100 MB
MAX_MESSAGE_SIZE = 10485760   # 10 MB
MAX_MENTIONS_PER_L0 = 1000
MAX_SOURCES_PER_L3 = 100
MAX_PROVENANCE_DEPTH = 100
MAX_TAGS = 20
MAX_TAG_LENGTH = 50
MAX_TITLE_LENGTH = 200
MAX_DESCRIPTION_LENGTH = 2000

# L2 Entity Graph limits
MAX_ENTITIES_PER_L2 = 10000
MAX_RELATIONSHIPS_PER_L2 = 50000
MAX_ALIASES_PER_ENTITY = 50
MAX_CANONICAL_LABEL_LENGTH = 200
MAX_PREDICATE_LENGTH = 100
MAX_ENTITY_DESCRIPTION_LENGTH = 500
MAX_SOURCE_L1S_PER_L2 = 100
MAX_SOURCE_L2S_PER_MERGE = 20

# Economics
MIN_PRICE = 1  # Smallest unit
SYNTHESIS_FEE_NUMERATOR = 5
SYNTHESIS_FEE_DENOMINATOR = 100  # 5%
SETTLEMENT_BATCH_THRESHOLD = 10000000000  # 100 HBAR (10^10 tinybars)
SETTLEMENT_BATCH_INTERVAL_MS = 3600000  # 1 hour

# DHT
DHT_BUCKET_SIZE = 20
DHT_ALPHA = 3
DHT_REPLICATION = 20

Appendix C: Error Codes

# Query Errors (0x0001 - 0x00FF)
NOT_FOUND        = 0x0001  # Content does not exist
ACCESS_DENIED    = 0x0002  # Not authorized
PAYMENT_REQUIRED = 0x0003  # No payment provided
PAYMENT_INVALID  = 0x0004  # Payment validation failed
RATE_LIMITED     = 0x0005  # Too many requests
VERSION_NOT_FOUND= 0x0006  # Specific version not found

# Channel Errors (0x0100 - 0x01FF)
CHANNEL_NOT_FOUND    = 0x0100
CHANNEL_CLOSED       = 0x0101
INSUFFICIENT_BALANCE = 0x0102
INVALID_NONCE        = 0x0103
INVALID_SIGNATURE    = 0x0104

# Validation Errors (0x0200 - 0x02FF)
INVALID_HASH        = 0x0200
INVALID_PROVENANCE  = 0x0201
INVALID_VERSION     = 0x0202
INVALID_MANIFEST    = 0x0203
CONTENT_TOO_LARGE   = 0x0204

# L2 Entity Graph Errors (0x0210 - 0x021F)
L2_INVALID_STRUCTURE    = 0x0210  # Malformed L2EntityGraph
L2_MISSING_SOURCE       = 0x0211  # Source L1 not found
L2_ENTITY_LIMIT         = 0x0212  # Too many entities
L2_RELATIONSHIP_LIMIT   = 0x0213  # Too many relationships
L2_INVALID_ENTITY_REF   = 0x0214  # Relationship references invalid entity
L2_CYCLE_DETECTED       = 0x0215  # Circular entity reference
L2_INVALID_URI          = 0x0216  # Invalid URI or CURIE format
L2_CANNOT_PUBLISH       = 0x0217  # L2 content cannot be published

# Network Errors (0x0300 - 0x03FF)
PEER_NOT_FOUND      = 0x0300
CONNECTION_FAILED   = 0x0301
TIMEOUT             = 0x0302

# Internal Errors (0xFF00 - 0xFFFF)
INTERNAL_ERROR      = 0xFFFF

Appendix D: Reference Implementation Notes

The reference implementation SHOULD:

  1. Use Rust for memory safety and performance
  2. Use libp2p-rs for networking
  3. Use SQLite for local storage
  4. Use RocksDB for high-performance caching
  5. Provide both CLI and library interfaces
  6. Support WASM compilation for browser nodes (future)

Directory structure:

nodalync/
├── Cargo.toml
├── src/
│   ├── lib.rs           # Library root
│   ├── main.rs          # CLI entry point
│   ├── types/           # Data structures
│   ├── crypto/          # Cryptographic operations
│   ├── storage/         # Local storage
│   ├── network/         # P2P networking
│   ├── protocol/        # Protocol operations
│   ├── channels/        # Payment channels
│   └── settlement/      # Chain settlement
├── tests/
└── docs/

End of Protocol Specification

Version History:

  • 0.7.1 (February 2026): Added CHANNEL_CLOSE_ACK message type; added Offline transitions to content state machine; fixed validation rule §9.1 to include Offline visibility
  • 0.3.0 (January 2026): Added SEARCH protocol for network-wide content discovery, ManifestFilter with text search
  • 0.2.1-draft (January 2026): Changed currency from NDL token to HBAR (Hedera native)
  • 0.2.0-draft (January 2026): Added L2 Entity Graph as protocol-level content type
  • 0.1.0-draft (January 2025): Initial draft

Nodalync Architecture

This document defines the module structure, dependencies, and implementation order for the Nodalync protocol.

Module Dependency Graph

                  ┌──────────────────┐     ┌──────────────────┐
                  │  nodalync-cli    │     │  nodalync-mcp    │
                  │  (binary crate)  │     │  (MCP server)    │
                  └────────┬─────────┘     └────────┬─────────┘
                           │                        │
                           └───────────┬────────────┘
                                       │
          ┌────────────────────────────┼────────────────────────┐
          │                            │                        │
          ▼                            │                        ▼
   ┌─────────────┐                     │                 ┌──────────────┐
   │ nodalync-net│                     │                 │nodalync-settle│
   │  (P2P/DHT)  │                     │                 │   (chain)    │
   └──────┬──────┘                     │                 └──────┬───────┘
          │                            │                        │
          │        ┌───────────────────┘                        │
          │        │                                            │
          │        ▼                                            │
          │ ┌─────────────┐                                     │
          ├─│ nodalync-ops│                                     │
          │ │ (operations)│                                     │
          │ └──────┬──────┘                                     │
          │        │                                            │
          │ ┌──────┴──────┐                                     │
          │ │             │                                     │
          ▼ ▼             ▼                                     ▼
   ┌─────────────┐  ┌───────────┐ ┌───────────┐  ┌───────────┐
   │nodalync-wire│  │nodalync-  │ │nodalync-  │  │nodalync-  │
   │(serialization)│ │  store   │ │  valid    │  │   econ    │
   └──────┬──────┘  └─────┬─────┘ └─────┬─────┘  └─────┬─────┘
          │               │             │              │
          └───────────────┴──────┬──────┴──────────────┘
                                 │
                                 ▼
                         ┌─────────────┐
                         │nodalync-types│
                         │ (all structs)│
                         └──────┬──────┘
                                │
                                ▼
                         ┌─────────────┐
                         │nodalync-crypto│
                         │(hash, sign)  │
                         └─────────────┘

Note: nodalync-net depends on nodalync-ops to dispatch incoming messages to the appropriate handlers.

Crates Overview

| Crate | Purpose | Spec Sections | Dependencies |
|---|---|---|---|
| nodalync-crypto | Hashing, signing, identity | §3 | None (external: sha2, ed25519-dalek) |
| nodalync-types | All data structures | §4 | crypto |
| nodalync-wire | Message serialization/deserialization | §6, Appendix A | types |
| nodalync-store | Local content & manifest storage | §5 | types |
| nodalync-valid | All validation rules | §9 | types |
| nodalync-econ | Revenue distribution math | §10 | types |
| nodalync-ops | Protocol operations (CREATE, QUERY, etc.) | §7 | store, valid, econ, wire |
| nodalync-net | P2P networking, DHT | §11 | wire, ops |
| nodalync-settle | Blockchain settlement | §12 | econ, types |
| nodalync-cli | Command-line interface | | all |
| nodalync-mcp | MCP server for AI agents | | ops, store, net, settle |

Key Interfaces (Traits)

Each crate exposes traits that define its contract. Implementations can vary (e.g., in-memory vs SQLite storage) but must satisfy the trait.

nodalync-crypto

pub trait ContentHasher {
    fn hash(content: &[u8]) -> Hash;
    fn verify(content: &[u8], expected: &Hash) -> bool;
}

pub trait Signer {
    fn sign(&self, message: &[u8]) -> Signature;
    fn verify(public_key: &PublicKey, message: &[u8], signature: &Signature) -> bool;
}

pub trait Identity {
    fn generate() -> Self;
    fn public_key(&self) -> &PublicKey;
    fn peer_id(&self) -> PeerId;
    fn sign(&self, message: &[u8]) -> Signature;
}

nodalync-store

pub trait ContentStore {
    fn store(&mut self, hash: &Hash, content: &[u8]) -> Result<()>;
    fn load(&self, hash: &Hash) -> Result<Option<Vec<u8>>>;
    fn exists(&self, hash: &Hash) -> bool;
    fn delete(&mut self, hash: &Hash) -> Result<()>;
}

pub trait ManifestStore {
    fn store(&mut self, manifest: &Manifest) -> Result<()>;
    fn load(&self, hash: &Hash) -> Result<Option<Manifest>>;
    fn list(&self, filter: ManifestFilter) -> Result<Vec<Manifest>>;
    fn update(&mut self, manifest: &Manifest) -> Result<()>;
}

pub trait ProvenanceGraph {
    fn add(&mut self, hash: &Hash, derived_from: &[Hash]) -> Result<()>;
    fn get_roots(&self, hash: &Hash) -> Result<Vec<ProvenanceEntry>>;
    fn get_derivations(&self, hash: &Hash) -> Result<Vec<Hash>>;
}

nodalync-valid

pub trait Validator {
    fn validate_content(&self, content: &[u8], manifest: &Manifest) -> Result<()>;
    fn validate_version(&self, manifest: &Manifest, previous: Option<&Manifest>) -> Result<()>;
    fn validate_provenance(&self, manifest: &Manifest, sources: &[Manifest]) -> Result<()>;
    fn validate_payment(&self, payment: &Payment, channel: &Channel, manifest: &Manifest) -> Result<()>;
    fn validate_message(&self, message: &Message) -> Result<()>;
    fn validate_access(&self, requester: &PeerId, manifest: &Manifest) -> Result<()>;
}

nodalync-econ

pub trait Distributor {
    fn distribute(&self, payment: &Payment, provenance: &[ProvenanceEntry]) -> Vec<Distribution>;
    fn calculate_batch(&self, payments: &[Payment]) -> SettlementBatch;
}

nodalync-ops

pub trait Operations {
    // Content operations
    fn create(&mut self, content: &[u8], content_type: ContentType, metadata: Metadata) -> Result<Hash>;
    fn publish(&mut self, hash: &Hash, visibility: Visibility, price: Amount) -> Result<()>;
    fn update(&mut self, old_hash: &Hash, new_content: &[u8]) -> Result<Hash>;
    fn derive(&mut self, sources: &[Hash], insight: &[u8], metadata: Metadata) -> Result<Hash>;
    
    // Query operations
    fn preview(&self, hash: &Hash) -> Result<(Manifest, L1Summary)>;
    fn query(&mut self, hash: &Hash, payment: Payment) -> Result<QueryResponse>;
}

nodalync-net

pub trait Network {
    fn announce(&self, hash: &Hash, manifest: &Manifest) -> Result<()>;
    fn search(&self, query: &str, filters: SearchFilters) -> Result<Vec<SearchResult>>;
    fn send(&self, peer: &PeerId, message: Message) -> Result<()>;
    fn receive(&mut self) -> Result<(PeerId, Message)>;
}

nodalync-settle

pub trait Settlement {
    fn submit_batch(&self, batch: SettlementBatch) -> Result<TransactionId>;
    fn verify_settlement(&self, tx_id: &TransactionId) -> Result<SettlementStatus>;
    fn open_channel(&self, peer: &PeerId, deposit: Amount) -> Result<ChannelId>;
    fn close_channel(&self, channel_id: &ChannelId) -> Result<TransactionId>;
}

Testing Strategy

Each crate has three test levels:

  1. Unit tests — Test individual functions

    • Location: src/*.rs (inline #[cfg(test)] modules)
    • Run: cargo test -p nodalync-{crate}
  2. Integration tests — Test crate as a whole

    • Location: crates/nodalync-{crate}/tests/
    • Run: cargo test -p nodalync-{crate} --test '*'
  3. Spec compliance tests — Verify against spec validation rules

    • Location: crates/nodalync-{crate}/tests/spec_compliance.rs
    • These tests are derived directly from spec §9
    • Each test references the specific spec section it validates

Error Handling

All crates use a common error type:

// In nodalync-types
#[derive(Debug, thiserror::Error)]
pub enum NodalyncError {
    #[error("Content validation failed: {0}")]
    ContentValidation(String),
    
    #[error("Provenance validation failed: {0}")]
    ProvenanceValidation(String),
    
    #[error("Payment validation failed: {0}")]
    PaymentValidation(String),
    
    #[error("Storage error: {0}")]
    Storage(String),
    
    #[error("Network error: {0}")]
    Network(String),
    
    #[error("Settlement error: {0}")]
    Settlement(String),
    
    // Maps to spec Appendix C error codes
    #[error("Protocol error {code}: {message}")]
    Protocol { code: u16, message: String },
}

Configuration

Node configuration lives in a platform-specific data directory (unless overridden by NODALYNC_DATA_DIR):

  • macOS: ~/Library/Application Support/io.nodalync.nodalync/config.toml
  • Linux: ~/.local/share/nodalync/config.toml (or $XDG_DATA_HOME/nodalync/)
  • Windows: %APPDATA%\nodalync\nodalync\config.toml

Example config.toml (generated by nodalync init):

[identity]
keyfile = "<data_dir>/identity/keypair.key"

[storage]
content_dir = "<data_dir>/content"
database = "<data_dir>/nodalync.db"
cache_dir = "<data_dir>/cache"
cache_max_size_mb = 1000

[network]
enabled = true
listen_addresses = ["/ip4/0.0.0.0/tcp/9000"]
bootstrap_nodes = [
    "/dns4/nodalync-bootstrap.eastus.azurecontainer.io/tcp/9000/p2p/12D3KooWMqrUmZm4e1BJTRMWqKHCe1TSX9Vu83uJLEyCGr2dUjYm",
]

[settlement]
network = "hedera-testnet"
auto_deposit = false

[economics]
default_price = 0.10  # In HBAR

File Layout

The codebase uses a workspace with two groups of crates:

crates/
├── protocol/                # Core protocol crates (v0.7.x)
│   ├── nodalync-crypto/
│   ├── nodalync-types/
│   ├── nodalync-wire/
│   ├── nodalync-store/
│   ├── nodalync-valid/
│   ├── nodalync-econ/
│   ├── nodalync-ops/
│   ├── nodalync-net/
│   └── nodalync-settle/
└── apps/                    # Application crates (v0.10.x)
    ├── nodalync-cli/
    └── nodalync-mcp/

Each crate typically contains:

crates/{group}/nodalync-{module}/
├── Cargo.toml
├── src/
│   ├── lib.rs          # Public API, re-exports
│   └── ...             # Module-specific files
└── tests/
    └── ...             # Integration and compliance tests

Nodalync: A Protocol for Fair Knowledge Economics

Gabriel Giangi
gabegiangi@gmail.com

Abstract

We propose a protocol for knowledge economics that ensures original contributors receive perpetual, proportional compensation from all downstream value creation. A researcher can publish valuable findings once and receive perpetual royalties as the ecosystem builds upon their work. A writer’s insights compound in value as others synthesize and extend them. The protocol enables humans to benefit from knowledge compounding—earning from what they know, not just what they continuously produce. The protocol structures knowledge into four layers where source material (L0) forms an immutable foundation from which all derivative value flows. Cryptographic provenance chains link every insight back to its roots. A pay-per-query model routes 95% of each transaction to foundational contributors regardless of derivation depth. Users add references to shared nodes freely; payment occurs only when content is actually queried—flowing through the entire provenance chain to compensate everyone who contributed. The reference implementation includes Model Context Protocol (MCP) integration as the standard interface for AI agent consumption, creating immediate demand from agentic systems. The result is infrastructure where contributing valuable foundational knowledge once creates perpetual economic participation in all derivative work.

1. Introduction

The digital economy has systematically failed knowledge creators. Researchers publish findings that become foundational to entire industries, receiving citations but not compensation. Writers produce content that trains AI models worth billions, with no mechanism for attribution or payment. The problem is architectural: existing systems cannot track how knowledge compounds through chains of derivation, and even when they can, enforcement mechanisms collapse under market pressure.

Current approaches require continuous production. Creators must constantly generate new content to maintain income. This model favors aggregators who consolidate others’ work over original contributors who establish foundations. When insight A enables insight B which enables insight C, creator A receives nothing from C’s value despite providing the foundation. The result is a knowledge economy where humans must work perpetually, never able to benefit from the compounding value of their past contributions.

We propose a protocol that inverts this dynamic. By structuring knowledge into layers with cryptographic provenance and a pay-per-query transaction model, we ensure value flows backward through derivation chains to original contributors every time knowledge is used. Foundational contributors—those who provide source material—receive proportional compensation automatically with each query. A researcher can publish valuable findings once and receive perpetual royalties as the ecosystem builds upon their work. A domain expert’s knowledge compounds in value as others synthesize and extend it. The protocol enables humans to earn from what they know, not just what they continuously produce—creating a path toward economic participation that does not require perpetual labor.

The protocol serves as a knowledge layer between humans and AI. Any agent can query personal knowledge bases through standard interfaces, with every query triggering automatic compensation to all contributors in the provenance chain. This creates infrastructure for a fair knowledge economy—one that bridges the historical gap between research and commerce, enabling foundational contributors to participate economically in all derivative value their work enables.

2. Prior Work

The components of this protocol draw from established systems. Content-addressed storage, pioneered by Git and formalized by IPFS, provides cryptographic integrity guarantees through hash-based identification. Merkle trees enable efficient verification with logarithmic proof sizes. The Model Context Protocol, released by Anthropic and now stewarded by the Linux Foundation, provides a standard interface for AI systems to consume external resources.

Prior attempts at data marketplaces—Ocean Protocol, Streamr, Azure Data Marketplace—failed primarily on the pricing problem: data value varies dramatically by context, and sellers consistently could not determine appropriate prices. NFT royalty systems failed differently: royalties were never enforced on-chain but relied on marketplace cooperation, which collapsed under competitive pressure when platforms began offering zero-royalty trading to attract volume.

Academic citation systems demonstrate that attribution without compensation creates no economic incentive for foundational contribution. Publishers capture margins while authors receive prestige as a substitute for payment. This protocol proposes that attribution and compensation must be unified—provenance chains that simultaneously prove contribution and trigger payment.

Our contribution is not novel components but their integration into a coherent system with a pay-per-query model that ensures compensation flows to all contributors every time knowledge is used. There is no upfront purchase to bypass, no secondary market to circumvent—every query to every node triggers payment through the entire provenance chain.

3. Knowledge Layers

The protocol structures all knowledge into four distinct layers with specific properties:

| Layer | Name | Contents | Properties |
|-------|------|----------|------------|
| L0 | Raw Inputs | Documents, transcripts, notes | Immutable, publishable, queryable |
| L1 | Mentions | Atomic facts with L0 pointers | Extracted, visible as preview |
| L2 | Entity Graph | Entities + RDF relations | Internal only, never shared |
| L3 | Insights | Emergent patterns and conclusions | Shareable, importable as L0 |

L0 represents raw source material—documents, transcripts, notes, research. L0 is immutable once published; updates are published as new versions (see Section 4.2). When shared, L0 content remains on the owner’s node and is accessed only through paid queries.

L1 consists of atomic facts extracted from L0, each maintaining a pointer to its source. L1 serves as a preview layer: when browsing shared content, users see L1 mentions as a summary of what the L0 contains. This enables informed decisions about what to query without requiring payment to evaluate relevance.

L2 is the synthesis layer used for internal organization. It represents entities and the RDF relations between them (subject-predicate-object triples), enabling structured queries across source material. L2 is never shared because it represents reorganization rather than new creation—preventing value extraction through mere restructuring.

L3 represents genuinely emergent insights—conclusions abstract enough to constitute new intellectual property. L3 can be shared and queried like L0. When imported into another user’s graph, L3 functions as their L0, enabling knowledge to compound across ownership boundaries while preserving attribution chains.

4. Provenance

Every node in the system stores its complete derivation history through content-addressed hashing. When content is created or modified, a hash is computed over its contents. This hash serves as a unique identifier enabling trustless verification—identical content produces identical hashes regardless of where or when it is created.
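The hashing step described above can be sketched in a few lines of Python. The reference implementation uses SHA-256 (per the nodalync-crypto crate); this standalone sketch uses the same digest, with content modeled as raw bytes:

```python
import hashlib

def content_hash(content: bytes) -> str:
    # Identical bytes always yield the same identifier,
    # regardless of where or when they are hashed.
    return hashlib.sha256(content).hexdigest()

h1 = content_hash(b"Alice's research notes")
h2 = content_hash(b"Alice's research notes")
assert h1 == h2  # trustless verification: same content, same hash
```

Because the hash is derived purely from content, any party can recompute it to verify integrity without trusting the party that served the bytes.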

4.1 Node Structure

Each node maintains:

hash: content-addressed identifier for this version
derived_from[]: hashes of content directly contributing to this node
root_L0L1[]: flattened array of all ultimate L0+L1 sources with weights
timestamp: creation time for ordering and staleness detection
previous_version: hash of prior version (null if original)
version_root: hash of first version in chain (stable identifier)

The root_L0L1 array is the key structure for revenue distribution. Regardless of how many intermediate derivation steps occur (L2 synthesis, L3 insight generation), every node maintains direct reference to all foundational sources. An L3 derived from another L3 (imported as L0) inherits the original L3’s root_L0L1 array, extending rather than replacing the provenance chain.

This creates cryptographic proof of contribution. If Alice’s L0 hash appears in Bob’s L3’s root_L0L1 array, Alice’s contribution is provable without requiring social trust or centralized verification. The provenance is in the data structure itself.
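A minimal sketch of the §4.1 structure and the membership check, with hashes modeled as strings and arrays as Python lists (the types and the names `alice_l0`, `bob_l0` are illustrative assumptions, not normative):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    # Field names follow §4.1; concrete types are illustrative.
    hash: str
    derived_from: List[str] = field(default_factory=list)  # direct parents
    root_L0L1: List[str] = field(default_factory=list)     # all ultimate L0+L1 sources
    timestamp: int = 0
    previous_version: Optional[str] = None
    version_root: str = ""

def provably_contributed(source_hash: str, node: Node) -> bool:
    # Alice's contribution to Bob's node is a membership check on
    # root_L0L1 -- no social trust or central registry required.
    return source_hash in node.root_L0L1

bob_l3 = Node(hash="bob_l3", root_L0L1=["alice_l0", "bob_l0"])
assert provably_contributed("alice_l0", bob_l3)
```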

4.2 Versioning

L0 is immutable once published. Updates are published as new nodes with new hashes. The previous_version field links to the prior version; the version_root field provides a stable identifier across all versions of the same content.

When Alice updates her L0:

new_L0.previous_version = old_L0.hash
new_L0.version_root = old_L0.version_root (or old_L0.hash if original)

Old versions remain accessible. Users who added references to v1 continue using v1; they can add references to v2 separately if desired. Provenance chains reference specific versions, preserving the historical record of what actually contributed to what. This ensures derivations remain valid even as sources evolve.
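The update rule above can be expressed as a small helper, with nodes modeled as plain dicts (an illustrative assumption; field names follow §4.1):

```python
def publish_update(old_version: dict, new_content_hash: str) -> dict:
    # §4.2: the new node links back to the prior version and carries
    # the stable version_root forward; an original (version_root unset)
    # becomes the root of its own chain.
    return {
        "hash": new_content_hash,
        "previous_version": old_version["hash"],
        "version_root": old_version["version_root"] or old_version["hash"],
    }

v1 = {"hash": "h1", "version_root": None}   # original publication
v2 = publish_update(v1, "h2")
v3 = publish_update(v2, "h3")
# v2 and v3 both carry version_root "h1", the stable identifier
```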

5. Transactions

The protocol operates on a pay-per-query model. Adding references is free; payment occurs when content is actually queried.

5.1 Reference and Query

Users discover shared content through network indexes that expose metadata: title, L1 mentions (as summary), hash, owner, visibility tier, and version information. This metadata is visible without payment, enabling informed decisions about relevance.

To use content, users add a reference (pointer) to their personal graph. Adding a reference is free—no content is transferred, only a hash is stored locally. The actual content remains on the owner’s node.

When the user (or their agent) queries the reference, the protocol triggers a transaction:

  1. Query request sent to content owner’s node
  2. Payment verified via handshake
  3. Response delivered to requester
  4. Revenue distributed through provenance chain

The query response can be cached and re-read locally without additional payment. The initial query is logged as “viewed,” enabling local access to already-received content. Subsequent queries to the same node (for updated information or different query parameters) trigger new payments.

5.2 Derivation

To create an L3 that derives from external sources, the user must have queried (and paid for) each source at least once. This ensures foundational contributors are compensated before their work is incorporated into derivative content.

When L3 is created, the full provenance chain is computed:

new_L3.root_L0L1 = union of all source.root_L0L1 arrays

Every foundational source that contributed to any input is included. When this L3 is later queried by others, revenue flows to all contributors in the chain.
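The union computation is straightforward; a sketch with sources modeled as dicts and root arrays as Python sets (an illustrative simplification):

```python
def derive_l3_roots(sources: list) -> set:
    # §5.2: the new L3's root set is the union of every
    # source's root_L0L1 array.
    roots = set()
    for src in sources:
        roots |= set(src["root_L0L1"])
    return roots

# Two sources sharing one foundational document:
roots = derive_l3_roots([
    {"root_L0L1": ["doc_a", "doc_b"]},
    {"root_L0L1": ["doc_b", "doc_c"]},
])
# roots == {"doc_a", "doc_b", "doc_c"}
```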

5.3 L3 Import

When a user queries an L3 and imports it as their own L0, the full provenance chain inherits forward:

imported_L0.root_L0L1 = original_L3.root_L0L1 ∪ {original_L3.hash}

The original L3 creator joins the root contributor set. All upstream sources remain tracked. Any subsequent L3 created using this imported knowledge will distribute revenue to all contributors in the extended chain.
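The inheritance rule above, sketched with nodes as dicts (the `"imported:"` prefix is a hypothetical local naming convention, not part of the protocol):

```python
def import_l3_as_l0(original_l3: dict) -> dict:
    # §5.3: the import inherits the full upstream chain and adds
    # the original L3 itself to the root contributor set.
    return {
        "hash": "imported:" + original_l3["hash"],  # illustrative local id
        "root_L0L1": set(original_l3["root_L0L1"]) | {original_l3["hash"]},
    }

imported = import_l3_as_l0({"hash": "l3x", "root_L0L1": ["doc_a", "doc_b"]})
# imported["root_L0L1"] == {"doc_a", "doc_b", "l3x"}
```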

6. Revenue Distribution

Every query triggers revenue distribution through the entire provenance chain.

6.1 Distribution Formula

For a query generating value V to a node with root contributor set R:

owner_share = 0.05 × V
root_pool = 0.95 × V
per_root_share = root_pool / |R|

The node owner retains 5% as synthesis incentive. The remaining 95% splits equally among all L0+L1 roots in the provenance chain. All roots are weighted equally regardless of content type or derivation distance. A single query distributes payment to every contributor who helped create that knowledge.

When the same source appears multiple times in a provenance chain (through different derivation paths), each appearance counts toward |R|: a source contributing twice receives twice the per-root share. R is thus effectively a multiset rather than a deduplicated set.

6.2 Rationale for 95/5

This distribution inverts typical platform economics, where intermediaries capture 10-45% of value. The inversion is intentional: foundational knowledge is systematically undervalued in current markets. Researchers, domain experts, and original thinkers provide the substrate on which all synthesis depends, yet receive nothing from downstream value creation. The 95% allocation to foundational contributors corrects this market failure.

The 5% synthesis fee may appear to disincentivize synthesis, but this concern misunderstands the mechanism. Consider a concrete example: Bob creates an L3 insight using 2 of Alice’s L0 documents, 1 of Carol’s L0 documents, and 2 of his own L0 documents. When queried for 100 tokens:

Bob (owner + 2 roots): 5 + (2/5 × 95) = 43 tokens
Alice (2 roots): 2/5 × 95 = 38 tokens
Carol (1 root): 1/5 × 95 = 19 tokens

Bob receives 43% despite the 5% synthesis fee because he also contributed foundational material. The protocol incentivizes synthesizers to also be contributors. A pure synthesizer using entirely others’ sources receives only the 5% floor—this is by design. The incentive structure rewards those who contribute original knowledge, not those who merely reorganize others’ work.
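The worked example can be checked directly. A sketch of the §6.1 formula with multiset weighting, contributors identified by name for illustration:

```python
def distribute(value: float, owner: str, roots: list) -> dict:
    # §6.1: owner keeps 5% as synthesis fee; 95% splits across the
    # root list, where each appearance counts once (so a source
    # contributing twice earns twice the per-root share).
    payouts = {owner: 0.05 * value}
    per_share = 0.95 * value / len(roots)
    for contributor in roots:
        payouts[contributor] = payouts.get(contributor, 0.0) + per_share
    return payouts

# Bob's L3: 2 of Alice's L0s, 1 of Carol's, 2 of Bob's own; 100-token query.
p = distribute(100, "bob", ["alice", "alice", "carol", "bob", "bob"])
# p["bob"] == 43.0, p["alice"] == 38.0, p["carol"] == 19.0
```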

The 5% synthesis fee is not the endgame for valuable synthesis. If an L3 is foundational enough that others build upon it (import as their L0), the original synthesizer becomes part of their root_L0L1[] arrays. The protocol incentivizes creating insights worth building on, not just worth querying. First-order queries earn 5%; becoming foundational for others’ work is where compounding happens.

6.3 Compounding Returns

The mechanism creates exponential potential for foundational contributors. Consider Alice’s L0 document over three generations of derivation:

Direct queries: 10 users query Alice's L0
Second-order: 10 L3s built on Alice's L0, each queried 10× = 100 payments
Third-order: 100 L3s each enable 10 more = 1,000 payments

Alice’s single L0 contribution earns from all downstream queries. She need not create L3s herself to benefit from the ecosystem building on her work. Contributing valuable foundational knowledge once creates perpetual economic participation—enabling earlier exit from continuous production while maintaining income as others build on one’s contributions.
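The growth pattern above is geometric in the fan-out. A quick arithmetic check, using the fan-out of 10 assumed in the example:

```python
# §6.3 sketch: each generation multiplies the number of query-bearing
# derivatives by the fan-out (10 here, an illustrative assumption).
fan_out = 10
payments_per_generation = [fan_out ** g for g in range(1, 4)]
# [10, 100, 1000] payments reaching Alice across three generations
total = sum(payments_per_generation)  # 1110
```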

6.4 Fairness Priorities

Fairness priorities are embedded in protocol design at three levels:

Fair distribution (highest priority): The 95/5 split inverts typical platform economics. Equal root weighting distributes value across all foundational contributors. The more sources an L3 builds upon, the more widely value distributes—rewarding comprehensive synthesis that draws from diverse foundations.

Fair contribution: No gatekeeping on L0 publication. No credentials required. No institutional approval necessary. The market determines value, not committees. Anyone can contribute foundational knowledge; quality is determined by whether others choose to build upon it.

Fair access: Access enables contribution. The protocol supports tiered pricing (commercial/academic/individual), a commons layer for explicitly open contributions, and contributor credits for those who publish L0. These mechanisms ensure that the protocol does not create a knowledge economy accessible only to the wealthy.

7. Agent Integration

The protocol exposes a query interface that any application can consume. The reference implementation includes a Model Context Protocol (MCP) integration as the standard interface for AI agent consumption. MCP, originally developed by Anthropic and now stewarded by the Linux Foundation, provides a standardized way for AI systems to access external resources. Any MCP-compatible agent can query knowledge nodes through this integration layer, with every query automatically triggering compensation through the protocol’s payment mechanism.

7.1 Query Mechanism

Agents submit queries through the MCP integration layer, which translates them into protocol QUERY operations. The protocol returns structured responses with provenance metadata:

response.content: answer to query
response.sources[]: hashes of nodes accessed
response.provenance[]: full derivation chain
response.cost: payment amount for this query

The response includes everything needed for the agent (or its operator) to verify sources and confirm payment. Provenance is embedded in the response, not stored externally. The MCP layer can add application-specific fields (confidence scores, formatted answers) while the protocol handles content delivery and payment.
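As a sketch, the response shape and a client-side consistency check might look as follows; the field names follow §7.1, but the values and the `verify_sources` helper are illustrative assumptions, not part of the protocol:

```python
response = {
    "content": "Answer synthesized from the queried node",
    "sources": ["hash_a", "hash_b"],                  # nodes accessed
    "provenance": ["hash_root", "hash_a", "hash_b"],  # full derivation chain
    "cost": 100,                                      # payment for this query
}

def verify_sources(resp: dict) -> bool:
    # Agent-side sanity check: every accessed source should appear
    # somewhere in the embedded derivation chain.
    return all(s in resp["provenance"] for s in resp["sources"])
```

Because provenance is embedded in the response itself, this verification needs no external lookup.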

7.2 Payment Handling

The protocol handles the handshake: payment verification triggers response delivery, and revenue distributes through the provenance chain. Application-level concerns—budget controls, cost previews, spending limits, auto-approve settings—are outside protocol scope. Implementations may offer cost estimates before query execution, user-defined budgets for agent sessions, or approval workflows for high-value queries.

7.3 Transparency

The protocol’s message structure provides complete audit data: every query includes timestamp, sender identity, and content hash; every response includes sources accessed; every payment includes the full revenue distribution. Applications can log these protocol events to build comprehensive audit trails—providing transparency into AI knowledge consumption that is impossible with current web scraping approaches.

8. Privacy and Visibility

The protocol is local-first. All data remains on the owner’s node. No centralized storage, no uploads to external platforms. Queries deliver responses; content itself never transfers permanently. This inverts the current paradigm where users upload data to platforms—instead, agents come to users.

8.1 Visibility Tiers

Content owners choose visibility per node:

| Tier | Discoverable | Addable by Others | Queryable |
|------|--------------|-------------------|-----------|
| Private | No | No | No (personal use only) |
| Unlisted | No | Yes (if hash known) | Yes (pay-per-query) |
| Shared | Yes | Yes | Yes (pay-per-query) |

Private nodes exist only for personal use—internal organization, drafts, sensitive material. They cannot be discovered, referenced, or queried by others.

Unlisted nodes are queryable but not discoverable. Owners share hashes directly with specific users or groups. This enables selective sharing: grant access to collaborators without public exposure.

Shared nodes are fully public—discoverable through network indexes, addable by anyone, queryable with standard pay-per-query economics.
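The tier rules reduce to a small lookup table. A sketch in Python (simplified: the "if hash known" condition on unlisted addability is not modeled):

```python
# §8.1 visibility rules as predicates; tier names from the table.
TIERS = {
    "private":  {"discoverable": False, "addable": False, "queryable": False},
    "unlisted": {"discoverable": False, "addable": True,  "queryable": True},
    "shared":   {"discoverable": True,  "addable": True,  "queryable": True},
}

def can_query(tier: str, is_owner: bool) -> bool:
    # Private content is usable only by its owner; unlisted and
    # shared content are queryable pay-per-query.
    return is_owner or TIERS[tier]["queryable"]
```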

8.2 Private Sources in Provenance

A shared L3 may derive from private L0 sources. In this case:

  • The private source’s hash appears in root_L0L1[]—its existence is visible.
  • The private source’s content remains inaccessible—others cannot query it.
  • The private source’s owner still receives their share of revenue when the L3 is queried.
  • Others see “private source” in provenance—they know it exists but cannot access it.

This enables selective disclosure: publish valuable insights while keeping underlying research private. Consumers trust the synthesis or they don’t—provenance shows that sources exist even if content is not verifiable.

8.3 Identity Privacy

Contributors choose their identity level per contribution. The protocol supports named contributions (full identity attached) and pseudonymous contributions (wallet address only). Provenance hashes are public and enable verification; the identity behind those hashes is configurable.

Future enhancement: Zero-knowledge verified contributions would allow contributors to prove membership in a verified set (e.g., “verified researcher”) without revealing specific identity. This requires additional infrastructure (contributor registries, ZK proof verification) and is planned for a future protocol version.

9. Network

Nodes operate independently, storing their own knowledge graphs and serving their own queries. Discovery occurs through a decentralized index where nodes publish metadata about shared content without revealing the content itself.

Settlement uses smart contracts for payment verification and distribution. When a query executes, the contract verifies payment and distributes revenue according to the provenance chain. Minimal data goes on-chain: payment flows and attestations. Content and queries remain off-chain.

This hybrid architecture—off-chain content, on-chain economics—preserves privacy while enabling trustless compensation.

9.1 Governance

The governance model remains under development. Design goals include: decentralization where possible, market-driven decision-making for most parameters, and protections ensuring broad participation rather than plutocracy. Options under consideration include one-node-one-vote, quadratic voting, and contribution-weighted governance. The final model will be determined through community input prior to mainnet launch.

10. Threat Model

We identify and address the primary attack vectors against the protocol.

10.1 Sybil Attacks

Without identity verification, actors could create multiple pseudonymous identities to claim foundational portions of knowledge. The protocol is identity-agnostic by design—we do not require identity verification at the base layer.

Instead, economic incentives align behavior. Quality content earns; spam does not. The market determines which sources are valuable through query volume. A fragmented identity strategy—creating many accounts with thin contributions—produces no advantage because revenue distributes based on which sources are actually queried, not how many sources exist.

Furthermore, reputation accrues to consistent identity. A single account with many high-quality contributions becomes discoverable and trusted. Fragmenting across pseudonyms sacrifices this reputation benefit. Optional reputation layers can build on the base protocol for contexts requiring stronger identity guarantees.

10.2 Attribution Gaming

Actors might attempt to insert themselves into provenance chains through trivial contributions or synthetic chains between controlled addresses. The protocol does not prevent this at the technical layer—but economic incentives make it unprofitable.

Revenue distributes only when content is queried. Creating thousands of unused nodes generates no income. The market determines value through actual queries. Synthetic chains between controlled addresses simply redistribute funds within the attacker’s own wallet.

10.3 Content Copying

After querying content, a user could theoretically republish it as their own. This is a limitation of any system providing information access. However, several factors mitigate this risk: republished content lacks provenance linkage to the original; the original has earlier timestamps providing evidence of priority; copied content cannot benefit from the original’s reputation or query history; and audit trails document the original query, providing evidence for legal recourse.

10.4 Disputes

The protocol does not adjudicate disputes—it provides evidence. Provenance chains are cryptographic fact: a hash is either in root_L0L1[] or it is not. For suspected plagiarism or parallel discovery:

  • Embedding similarity detection can flag potential copies at the application layer.
  • Audit trails document access patterns, showing who queried what and when.
  • Two independent derivation chains arriving at similar insights is valuable data, not necessarily a conflict—it may indicate robust conclusions.
  • Legal systems handle disputes; the protocol provides complete evidence for those systems to adjudicate.

10.5 External Plagiarism

The protocol cannot prevent unauthorized publication of external work at the entry point. Someone could publish an externally-created paper as their own L0. However, the protocol makes such theft transparent and traceable:

  • Timestamps record when content was published in-system.
  • Earnings are fully visible and auditable.
  • Evidence for legal recourse is built-in, not forensic.
  • Contributors are encouraged to establish external prior art (journal publication, arXiv, timestamps) as the authoritative record of original creation.

Alternatively, the protocol itself can serve as a proof-of-creation layer—publish to Nodalync first as timestamped record, then pursue traditional publication.

11. Limitations

The protocol does not solve all problems in knowledge economics. We acknowledge the following limitations.

Pricing discovery. The protocol does not determine what queries should cost. Owners set prices; the market accepts or rejects them. This may result in inefficient pricing, particularly in early stages before market norms emerge. However, unlike prior data marketplaces that failed attempting to solve pricing algorithmically, we treat price discovery as a market function rather than a protocol function.

Cold start. The protocol’s value increases with participation. Early adopters face a network with limited content and few users. We expect adoption to begin in specific domains where knowledge value is clear (research, technical documentation, domain expertise) before expanding to broader use cases.

Regulatory uncertainty. Immutable provenance chains may conflict with data protection regulations requiring deletion rights. Implementations must consider jurisdictional requirements. The separation of content (deletable at the node) from provenance hashes (persistent) provides partial mitigation, but legal analysis is required for specific deployments.

Not all knowledge should be monetized. The protocol creates an option for compensation, not a mandate. Commons-based knowledge sharing remains valuable and should continue. The protocol complements rather than replaces open knowledge systems—it provides a path for those who wish to receive compensation without requiring everyone to participate in economic exchange.

12. Conclusion

The Nodalync protocol creates infrastructure for fair knowledge economics. By structuring knowledge into layers with cryptographic provenance, implementing pay-per-query transactions, and distributing revenue through complete derivation chains, the protocol ensures that foundational contributors receive perpetual, proportional compensation from all downstream value creation.

Foundational contributors are the substrate of this economy. A researcher, writer, or domain expert can contribute valuable source material once and benefit as the ecosystem builds upon their work. They need not continuously produce, need not create sophisticated L3 insights, need not compete with aggregators. The protocol routes value backward through derivation chains automatically—creating a path to economic participation that does not require perpetual labor.

For AI systems, the protocol provides a standard interface for consuming human knowledge while respecting attribution and compensation. Every query triggers payment to all contributors in the provenance chain. This creates sustainable infrastructure for AI-human knowledge exchange—not extraction without attribution, but transaction with fair compensation.

The alternative to this protocol is not the knowledge commons—it is the current reality where AI systems train on human knowledge with no mechanism for attribution or payment. The protocol offers a third path: knowledge that flows freely through derivation chains while ensuring that those who contribute to that flow receive proportional benefit.

We propose this as the knowledge layer between humans and AI: infrastructure where contributing valuable knowledge creates perpetual economic participation in all derivative work.

References

[1] Nakamoto, S. (2008). Bitcoin: A Peer-to-Peer Electronic Cash System.

[2] Anthropic. (2024). Model Context Protocol Specification.

[3] Benet, J. (2014). IPFS - Content Addressed, Versioned, P2P File System.

[4] Merkle, R. (1988). A Digital Signature Based on a Conventional Encryption Function.

[5] Douceur, J. (2002). The Sybil Attack. IPTPS.

[6] World Wide Web Consortium. (2014). RDF 1.1 Concepts and Abstract Syntax.

Nodalync Protocol Specification — L2 Entity Graph Addendum

Version: 0.2.0-draft
Date: January 2026
Status: Draft Addendum to v0.7.1


Summary of Changes

This addendum elevates L2 (Entity Graph) from internal-only to a protocol-level content type, while keeping it as personal/private content. Key design decisions:

  1. Complete provenance chain: L0 → L1 → L2 → L3
  2. L2 is personal: Your L2 represents your unique perspective — it is never queried by others
  3. URI-based ontology: Entity types and relationship predicates use URIs for RDF interoperability
  4. L3 derives from L2: Your insights (L3) are built on your knowledge graph (L2)

Design Philosophy

L2 is Your Perspective

L2 represents how you understand and link entities across the documents you’ve studied. Two people reading the same papers might build very different L2 graphs based on:

  • Which entities they consider important
  • How they resolve ambiguous references
  • What relationships they infer
  • Which external ontologies they use

This is valuable intellectual work, but it’s personal. Your L2 is never directly monetized — its value surfaces when you create L3 insights that others find valuable.

Economic Model

Alice's L0 (document)
       ↓
Bob queries Alice's L0 → Alice gets paid
       ↓
Bob extracts L1 from Alice's L0
       ↓
You query Bob's L1 → Bob gets paid (Alice gets root share)
       ↓
You build L2 from Bob's L1 (YOUR perspective)
       ↓
You create L3 insight from YOUR L2
       ↓
Eve queries YOUR L3 → You get 5% synthesis fee
                    → Alice gets 95% (she's in root_L0L1)

Your L2 work is “invisible” economically — the compensation comes from your L3 insights.

URI-Based Ontology

Instead of closed enums, L2 uses URIs for extensibility:

# Entity types (can be any ontology)
entity_types: ["schema:Person", "foaf:Person"]
entity_types: ["ndl:Concept"]
entity_types: ["http://example.org/ontology#CustomType"]

# Relationship predicates
predicate: "schema:worksFor"
predicate: "ndl:mentions"
predicate: "http://purl.org/dc/terms/creator"

This enables:

  • Standard ontologies (Schema.org, FOAF, Dublin Core)
  • Custom domain-specific ontologies
  • Interoperability with semantic web tools
  • No protocol changes needed for new types

1. Updated Data Structures

§4.1 Content Types (REPLACE)

enum ContentType : uint8 {
    L0 = 0x00,      # Raw input (documents, notes, transcripts)
    L1 = 0x01,      # Mentions (extracted atomic facts)
    L2 = 0x02,      # Entity Graph (linked entities and relationships)
    L3 = 0x03       # Insights (emergent synthesis)
}

Knowledge Layer Semantics:

| Layer | Content | Typical Operation | Value Added |
|-------|---------|-------------------|-------------|
| L0 | Raw documents, notes, transcripts | CREATE | Original source material |
| L1 | Atomic facts extracted from L0 | EXTRACT_L1 | Structured, quotable claims |
| L2 | Entities and relationships across L1s | BUILD_L2 | Cross-document linking, entity resolution |
| L3 | Novel insights synthesizing sources | DERIVE | Original analysis and conclusions |

§4.4a Entity Graph (L2) (NEW SECTION)

Insert after §4.4 Mention:

struct L2EntityGraph {
    # === Core Identity ===
    id: Hash,                           # H(serialized entities + relationships)
    
    # === Sources ===
    source_l1s: L1Reference[],          # L1 summaries this graph was built from
    source_l2s: Hash[],                 # Other L2 graphs merged/extended (optional)
    
    # === Graph Content ===
    entities: Entity[],                 # Resolved entities
    relationships: Relationship[],      # Relationships between entities
    
    # === Statistics ===
    entity_count: uint32,
    relationship_count: uint32,
    source_mention_count: uint32        # Total mentions linked
}

struct L1Reference {
    l1_hash: Hash,                      # Hash of the L1Summary content
    l0_hash: Hash,                      # The original L0 this L1 came from
    mention_ids_used: Hash[]            # Which specific mentions were used
}

struct Entity {
    id: Hash,                           # Stable entity ID: H(canonical_label || entity_type)
    canonical_label: string,            # Primary name (max 200 chars)
    aliases: string[],                  # Alternative names/spellings (max 50)
    entity_type: EntityType,
    
    # === Evidence ===
    source_mentions: MentionRef[],      # Which L1 mentions establish this entity
    
    # === Confidence ===
    confidence: float64,                # 0.0 - 1.0, resolution confidence
    resolution_method: ResolutionMethod,
    
    # === Optional Metadata ===
    description: string?,               # Summary description (max 500 chars)
    external_ids: ExternalId[]?         # Links to external knowledge bases
}

struct MentionRef {
    l1_hash: Hash,                      # Which L1 contains this mention
    mention_id: Hash                    # Specific mention ID within that L1
}

struct ExternalId {
    system: string,                     # e.g., "wikidata", "orcid", "doi"
    identifier: string                  # The ID in that system
}

struct Relationship {
    id: Hash,                           # H(subject || predicate || object)
    subject: Hash,                      # Entity ID
    predicate: string,                  # Relationship type (max 100 chars)
    object: RelationshipObject,         # Entity ID or literal
    
    # === Evidence ===
    source_mentions: MentionRef[],      # Mentions that support this relationship
    confidence: float64,                # 0.0 - 1.0
    
    # === Temporal (optional) ===
    valid_from: Timestamp?,
    valid_to: Timestamp?
}

enum RelationshipObject {
    EntityRef(Hash),                    # Reference to another entity
    Literal(LiteralValue)               # A value (string, number, date)
}

struct LiteralValue {
    value_type: LiteralType,
    value: string                       # Encoded value
}

enum LiteralType : uint8 {
    String    = 0x00,
    Integer   = 0x01,
    Float     = 0x02,
    Date      = 0x03,                   # ISO 8601
    DateTime  = 0x04,                   # ISO 8601
    Boolean   = 0x05,
    Uri       = 0x06
}

enum EntityType : uint8 {
    Person       = 0x00,
    Organization = 0x01,
    Location     = 0x02,
    Concept      = 0x03,
    Event        = 0x04,
    Work         = 0x05,                # Paper, book, article, etc.
    Product      = 0x06,
    Technology   = 0x07,
    Metric       = 0x08,                # Quantitative measure
    TimePoint    = 0x09,
    Other        = 0xFF
}

enum ResolutionMethod : uint8 {
    ExactMatch    = 0x00,               # Same string
    Normalized    = 0x01,               # Case/punctuation normalized
    Alias         = 0x02,               # Known alias matched
    Coreference   = 0x03,               # Pronoun/reference resolved
    ExternalLink  = 0x04,               # Matched via external KB
    Manual        = 0x05,               # Human-verified
    AIAssisted    = 0x06                # ML model assisted
}

Constraints:

L2 Entity Graph constraints:
    1. len(source_l1s) >= 1              # Must derive from at least one L1
    2. len(entities) >= 1                 # Must have at least one entity
    3. Each entity.id is unique within the graph
    4. Each relationship references valid entity IDs
    5. All MentionRefs point to valid L1s in source_l1s
    6. 0.0 <= confidence <= 1.0
    7. len(canonical_label) <= 200
    8. len(aliases) <= 50
    9. len(predicate) <= 100
    10. entity_count == len(entities)
    11. relationship_count == len(relationships)

§4.4b L2 Summary (Preview) (NEW SECTION)

For previewing L2 content without revealing the full graph:

struct L2Summary {
    l2_hash: Hash,                      # Hash of the full L2EntityGraph
    entity_count: uint32,
    relationship_count: uint32,
    source_l1_count: uint32,
    
    # === Preview (free) ===
    top_entities: EntityPreview[],      # Top 10 entities by mention count
    entity_type_distribution: TypeCount[], # How many of each type
    relationship_types: string[],       # List of predicates used (max 20)
    
    # === Quality Indicators ===
    avg_confidence: float64,
    cross_document_links: uint32        # Entities appearing in multiple L1s
}

struct EntityPreview {
    id: Hash,
    canonical_label: string,
    entity_type: EntityType,
    mention_count: uint32,              # How many mentions support this entity
    relationship_count: uint32          # Relationships involving this entity
}

struct TypeCount {
    entity_type: EntityType,
    count: uint32
}

§4.5 Provenance (UPDATED)

Update the constraints to include L2:

struct Provenance {
    root_L0L1: ProvenanceEntry[],       # All foundational L0/L1 sources
    derived_from: Hash[],                # Direct parent hashes (any content type)
    depth: uint32                        # Max derivation depth from any L0
}

Constraints:
    - root_L0L1 contains entries of type L0 or L1 only (never L2 or L3)
    - L0 content: root_L0L1 = [self], derived_from = [], depth = 0
    - L1 content: root_L0L1 = [parent L0], derived_from = [L0 hash], depth = 1
    - L2 content: root_L0L1 = merged roots from source L1s, 
                  derived_from = source L1 hashes, depth = max(source.depth) + 1
    - L3 content: root_L0L1 = merged roots from all sources,
                  derived_from = source hashes, depth = max(source.depth) + 1
    - All entries in derived_from MUST have been queried by creator

Provenance Chain Examples:

Simple chain:
    L0(doc) → L1(mentions) → L2(entities) → L3(insight)
    depth:  0       1            2              3

Branching:
    L0(doc1) → L1(m1) ─┐
                       ├→ L2(graph) → L3(insight)
    L0(doc2) → L1(m2) ─┘
    
    L2.provenance = {
        root_L0L1: [doc1, doc2],
        derived_from: [m1, m2],
        depth: 2
    }
    
    L3.provenance = {
        root_L0L1: [doc1, doc2],  # Inherited from L2
        derived_from: [L2.hash],
        depth: 3
    }

L3 deriving directly from L1 (skipping L2):
    L0(doc) → L1(mentions) → L3(insight)
    
    L3.provenance = {
        root_L0L1: [doc],
        derived_from: [mentions],
        depth: 2
    }
    
L3 deriving from mix of L1 and L2:
    L0(doc1) → L1(m1) → L2(graph) ─┐
                                    ├→ L3(insight)
    L0(doc2) → L1(m2) ─────────────┘
    
    L3.provenance = {
        root_L0L1: [doc1, doc2],  # Merged from both paths
        derived_from: [L2.hash, m2],
        depth: 3  # max(L2.depth=2, m2.depth=1) + 1
    }
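The merge rules in the examples above can be sketched as follows. `Prov` is a simplified stand-in (root hashes as strings, no `ProvenanceEntry` weights); the real structure is defined in §4.5.

```rust
// Sketch: provenance for a derived item — root_L0L1 is the union of the
// sources' roots, and depth is max(source depth) + 1.
use std::collections::BTreeSet;

#[derive(Clone)]
struct Prov {
    root_l0l1: BTreeSet<&'static str>, // root hashes, as strings for illustration
    depth: u32,
}

fn derive_prov(sources: &[Prov]) -> Prov {
    let mut roots = BTreeSet::new();
    for s in sources {
        roots.extend(s.root_l0l1.iter().cloned());
    }
    let depth = sources.iter().map(|s| s.depth).max().unwrap_or(0) + 1;
    Prov { root_l0l1: roots, depth }
}

fn main() {
    // Mixed case: L2(graph) at depth 2 and L1(m2) at depth 1 feed one L3.
    let l2 = Prov { root_l0l1: ["doc1"].into_iter().collect(), depth: 2 };
    let m2 = Prov { root_l0l1: ["doc2"].into_iter().collect(), depth: 1 };
    let l3 = derive_prov(&[l2, m2]);
    assert_eq!(l3.depth, 3); // max(2, 1) + 1
    assert_eq!(l3.root_l0l1.len(), 2); // doc1 and doc2 merged
}
```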

2. Updated Message Types

§6.2 Discovery Messages (UPDATED)

L2 content can be announced and searched like any other content:

# AnnouncePayload for L2
When content_type == L2:
    l1_summary field is replaced with l2_summary: L2Summary
    
# SearchResult for L2  
struct SearchResult {
    hash: Hash,
    content_type: ContentType,
    title: string,
    owner: PeerId,
    # Type-specific preview:
    l1_summary: L1Summary?,      # If L0 or L1
    l2_summary: L2Summary?,      # If L2
    price: Amount,
    total_queries: uint64,
    relevance_score: float64,
    publisher_addresses: string[]  # Multiaddresses for reconnection
}

§6.3a L2 Preview Messages (NEW)

# L2_PREVIEW_REQUEST = 0x0210
struct L2PreviewRequestPayload {
    hash: Hash
}

# L2_PREVIEW_RESPONSE = 0x0211
struct L2PreviewResponsePayload {
    hash: Hash,
    manifest: Manifest,
    l2_summary: L2Summary
}

§6.1 MessageType (UPDATED)

Add new message types:

enum MessageType : uint16 {
    # ... existing types ...
    
    # L2 Preview (0x02xx range, after Preview)
    L2_PREVIEW_REQUEST   = 0x0210,
    L2_PREVIEW_RESPONSE  = 0x0211,
}

3. Updated Protocol Operations

§7.1.2a Build L2 (Entity Graph) (NEW OPERATION)

Insert after §7.1.2 Extract L1:

BUILD_L2(source_l1s: Hash[], config: L2BuildConfig?) → Hash

Purpose:
    Build an L2 Entity Graph from one or more L1 sources.
    This operation performs entity extraction, resolution, and relationship inference.

Preconditions:
    - All source L1s have been queried (payment proof exists)
    - len(source_l1s) >= 1

Procedure:
    1. Verify all L1 sources were queried:
       For each l1_hash in source_l1s:
           assert cache.has(l1_hash) OR content.has(l1_hash)
           l1 = load_l1(l1_hash)
           assert l1.content_type == L1
           
    2. Extract entities from mentions:
       raw_entities = []
       For each l1 in source_l1s:
           For each mention in l1.mentions:
               extracted = extract_entities(mention)
               raw_entities.extend(extracted)
               
    3. Resolve entities (merge duplicates):
       resolved_entities = resolve_entities(raw_entities, config)
       # This handles:
       #   - Exact string matching
       #   - Alias resolution
       #   - Coreference resolution
       #   - External KB linking (optional)
       
    4. Extract relationships:
       relationships = extract_relationships(resolved_entities, source_l1s)
       
    5. Build L2 structure:
       l2_graph = L2EntityGraph {
           id: computed after serialization,
           source_l1s: [L1Reference for each l1],
           source_l2s: [],
           entities: resolved_entities,
           relationships: relationships,
           entity_count: len(resolved_entities),
           relationship_count: len(relationships),
           source_mention_count: total_mentions_linked
       }
       
    6. Compute hash:
       content = serialize(l2_graph)
       hash = ContentHash(content)
       l2_graph.id = hash
       
    7. Compute provenance:
       root_entries = []
       For each l1 in source_l1s:
           l1_prov = get_provenance(l1)
           For each entry in l1_prov.root_L0L1:
               merge_or_increment(root_entries, entry)
       
       provenance = Provenance {
           root_L0L1: root_entries,
           derived_from: source_l1s,
           depth: max(l1.provenance.depth for l1 in source_l1s) + 1
       }
       
    8. Create manifest:
       manifest = Manifest {
           hash: hash,
           content_type: L2,
           owner: my_peer_id,
           version: Version { number: 1, previous: null, root: hash, ... },
           visibility: Private,
           provenance: provenance,
           ...
       }
       
    9. Store content and manifest locally
    10. Return hash

struct L2BuildConfig {
    # Entity resolution settings
    resolution_threshold: float64?,     # Minimum confidence to merge (default: 0.8)
    use_external_kb: bool?,             # Link to external knowledge bases
    external_kb_list: string[]?,        # Which KBs to use: ["wikidata", "dbpedia"]
    
    # Relationship extraction
    extract_implicit: bool?,            # Infer relationships not explicitly stated
    relationship_types: string[]?       # Limit to specific predicates
}
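Step 3 of the procedure (entity resolution) is left to implementers; as a minimal sketch, the `Normalized` resolution method could merge raw labels whose case- and punctuation-normalized forms match. `normalize` and `resolve_entities` here are illustrative only; real resolvers add alias tables, coreference, and external KB links.

```rust
// Sketch of the simplest resolution pass (ResolutionMethod::Normalized):
// merge raw entity labels by normalized form and count supporting mentions.
use std::collections::BTreeMap;

fn normalize(label: &str) -> String {
    label
        .chars()
        .filter(|c| c.is_alphanumeric() || c.is_whitespace())
        .collect::<String>()
        .to_lowercase()
}

/// Returns (normalized label, mention count) per resolved entity.
fn resolve_entities(raw: &[&str]) -> BTreeMap<String, u32> {
    let mut resolved: BTreeMap<String, u32> = BTreeMap::new();
    for label in raw {
        *resolved.entry(normalize(label)).or_insert(0) += 1;
    }
    resolved
}

fn main() {
    let raw = ["Einstein", "einstein", "Einstein,", "Princeton"];
    let resolved = resolve_entities(&raw);
    assert_eq!(resolved.len(), 2);          // duplicates merged
    assert_eq!(resolved["einstein"], 3);    // three supporting mentions
}
```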

§7.1.2b Merge L2 (NEW OPERATION)

Merge multiple L2 graphs into one:

MERGE_L2(source_l2s: Hash[], config: L2MergeConfig?) → Hash

Purpose:
    Combine multiple L2 Entity Graphs, resolving entities across them.
    Creates a unified knowledge graph from multiple domain-specific graphs.

Preconditions:
    - All source L2s have been queried (payment proof exists)
    - len(source_l2s) >= 2

Procedure:
    1. Verify all L2 sources were queried
    
    2. Collect all entities and relationships from sources
    
    3. Cross-graph entity resolution:
       # Find same entities appearing in different graphs
       merged_entities = resolve_across_graphs(source_l2s, config)
       
    4. Merge relationships (update entity references)
    
    5. Build new L2 with:
       source_l1s: union of all source L1 references
       source_l2s: the input source_l2s
       
    6. Compute provenance:
       # Roots come from all underlying L1s (via source L2s)
       root_entries = merge roots from all source_l2s
       
       provenance = Provenance {
           root_L0L1: root_entries,
           derived_from: source_l2s,
           depth: max(l2.provenance.depth for l2 in source_l2s) + 1
       }
       
    7. Store and return hash

§7.1.5 Derive (Create L3) (UPDATED)

L3 can now derive from L2 in addition to L0, L1, and other L3:

DERIVE(sources: Hash[], insight_content: bytes, metadata: Metadata) → Hash

Sources may include:
    - L0 content (raw documents)
    - L1 content (mention collections)
    - L2 content (entity graphs)
    - L3 content (other insights)
    
All sources must have been queried (payment proof exists).

Provenance computation:
    For L0/L1 sources: merge their root_L0L1 directly
    For L2 sources: merge the L2's root_L0L1 (which traces back to L0/L1)
    For L3 sources: merge the L3's root_L0L1 (recursive)
    
    derived_from = all source hashes
    depth = max(source.provenance.depth) + 1

§7.2.2a L2 Preview (NEW)

L2_PREVIEW(hash: Hash) → (Manifest, L2Summary)

Procedure:
    1. Send L2_PREVIEW_REQUEST to content owner
    2. Receive L2_PREVIEW_RESPONSE
    3. Validate manifest
    4. Return (manifest, l2_summary)
    
Cost: Free (like L1 preview)

4. Updated Validation Rules

§9.1 Content Validation (UPDATED)

VALIDATE_CONTENT(content: bytes, manifest: Manifest) → bool

Rules:
    # ... existing rules 1-6 ...
    7. manifest.content_type in {L0, L1, L2, L3}  # Updated
    8. manifest.visibility in {Private, Unlisted, Shared}
    
    # L2-specific validation
    9. If manifest.content_type == L2:
           l2 = deserialize(content) as L2EntityGraph
           assert l2.id == manifest.hash
           assert len(l2.source_l1s) >= 1
           assert len(l2.entities) >= 1
           assert all entity IDs are unique
           assert all relationship entity refs are valid
           assert all MentionRefs point to valid source L1s
           assert l2.entity_count == len(l2.entities)
           assert l2.relationship_count == len(l2.relationships)

§9.3 Provenance Validation (UPDATED)

VALIDATE_PROVENANCE(manifest: Manifest, sources: Manifest[]) → bool

Rules:
    1. If manifest.content_type == L0:
           manifest.provenance.root_L0L1 == [self_entry]
           manifest.provenance.derived_from == []
           manifest.provenance.depth == 0
           
    2. If manifest.content_type == L1:
           len(manifest.provenance.root_L0L1) >= 1
           manifest.provenance.derived_from contains exactly one L0 hash
           manifest.provenance.depth == 1
           All root_L0L1 entries are type L0
           
    3. If manifest.content_type == L2:
           len(manifest.provenance.root_L0L1) >= 1
           len(manifest.provenance.derived_from) >= 1
           All derived_from are L1 or L2 hashes
           All root_L0L1 entries are type L0 or L1
           manifest.provenance.depth >= 2
           depth == max(source.depth) + 1
           
    4. If manifest.content_type == L3:
           len(manifest.provenance.root_L0L1) >= 1
           len(manifest.provenance.derived_from) >= 1
           All derived_from hashes exist in sources
           All root_L0L1 entries are type L0 or L1
           depth == max(source.depth) + 1
           
    5. For all types:
           Computed root_L0L1 matches declared root_L0L1
           No cycles in derived_from graph
           All weights > 0
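Rule 5's cycle check can be sketched as a depth-first search over the `derived_from` edges. The hash-to-parents map here is a simplified stand-in for manifest lookups.

```rust
// Sketch: detecting cycles in the derived_from graph (validation rule 5)
// via DFS with an "on current path" set.
use std::collections::{HashMap, HashSet};

fn has_cycle<'a>(edges: &HashMap<&'a str, Vec<&'a str>>) -> bool {
    fn visit<'a>(
        node: &'a str,
        edges: &HashMap<&'a str, Vec<&'a str>>,
        on_path: &mut HashSet<&'a str>,
        done: &mut HashSet<&'a str>,
    ) -> bool {
        if done.contains(node) {
            return false;
        }
        if !on_path.insert(node) {
            return true; // node revisited on the current derivation path: cycle
        }
        for &parent in edges.get(node).into_iter().flatten() {
            if visit(parent, edges, on_path, done) {
                return true;
            }
        }
        on_path.remove(node);
        done.insert(node);
        false
    }
    let mut done = HashSet::new();
    edges
        .keys()
        .any(|&n| visit(n, edges, &mut HashSet::new(), &mut done))
}

fn main() {
    let mut ok = HashMap::new();
    ok.insert("l3", vec!["l2"]);
    ok.insert("l2", vec!["l1"]);
    assert!(!has_cycle(&ok)); // normal derivation chain

    let mut bad = HashMap::new();
    bad.insert("a", vec!["b"]);
    bad.insert("b", vec!["a"]);
    assert!(has_cycle(&bad)); // mutual derivation is rejected
}
```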

5. Economic Rules (UPDATED)

§10.1 Revenue Distribution (UPDATED)

The distribution formula remains unchanged. L2 fits into it as follows:

  1. When an L2 is queried directly — the L2 creator receives the synthesis fee (5%), plus a root-pool share for any L0/L1 roots they contributed
  2. When an L2 is used in an L3 — the L2’s root_L0L1 is merged into the L3’s provenance, so the underlying L0/L1 creators are paid

Important: L2 creators do NOT automatically get compensation when their L2 is derived from. Instead:

  • The root_L0L1 (which traces back through the L2 to original L0/L1) gets paid
  • If the L2 creator also created some of those L0/L1s, they get that share
  • The L2 creator’s work is compensated when someone queries the L2

This maintains the principle: value flows to foundational contributors (L0/L1), while L2/L3 creators earn through synthesis fees when their content is queried.

§10.2 Distribution Example (UPDATED)

Extended scenario with L2:

    Alice creates L0 (document)
    Bob extracts L1 from Alice's L0
    Carol builds L2 entity graph from Bob's L1
    Dave creates L3 insight from Carol's L2
    
    Eve queries Dave's L3 for 100 HBAR

Provenance chain:
    L0 (Alice) → L1 (Bob, depth=1) → L2 (Carol, depth=2) → L3 (Dave, depth=3)

    Dave's L3 provenance:
        root_L0L1 = [{ hash: alice_l0, owner: Alice, weight: 1 }]
        derived_from = [carol_l2]
        depth = 3

Distribution of 100 HBAR payment:
    Dave (L3 owner, synthesis fee): 5 HBAR
    Root pool: 95 HBAR

    Only root_L0L1 entries share the pool:
        Alice (L0 owner): 95 HBAR

    Carol receives nothing from THIS query.
    Carol earns when someone queries HER L2 directly.

What if Carol also contributed an L0?
    If Carol had created L0_carol that Bob also used:
        root_L0L1 = [
            { hash: alice_l0, owner: Alice, weight: 1 },
            { hash: carol_l0, owner: Carol, weight: 1 }
        ]

    Then distribution would be:
        Dave (synthesis): 5 HBAR
        Alice (1/2 root pool): 47.5 HBAR
        Carol (1/2 root pool): 47.5 HBAR
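The split above can be sketched in a few lines of integer arithmetic over tinybars (10^-8 HBAR). The weighted division mirrors the root-pool rule; owner names as strings are illustrative.

```rust
// Sketch of the §10 split: 5% synthesis fee to the content owner, 95% root
// pool divided by weight among root_L0L1 entries. Amounts in tinybars.
type Amount = u64;

fn distribute(total: Amount, root_weights: &[(&str, u32)]) -> (Amount, Vec<(String, Amount)>) {
    let synthesis_fee = total * 5 / 100;
    let root_pool = total - synthesis_fee;
    let weight_sum: u64 = root_weights.iter().map(|&(_, w)| w as u64).sum();
    let shares = root_weights
        .iter()
        .map(|&(owner, w)| (owner.to_string(), root_pool * w as u64 / weight_sum))
        .collect();
    (synthesis_fee, shares)
}

fn main() {
    // Eve pays 100 HBAR = 10_000_000_000 tinybars for Dave's L3,
    // with Alice and Carol each holding one root entry.
    let total: Amount = 10_000_000_000;
    let (fee, shares) = distribute(total, &[("alice", 1), ("carol", 1)]);
    assert_eq!(fee, 500_000_000);            // Dave's 5 HBAR synthesis fee
    assert_eq!(shares[0].1, 4_750_000_000);  // Alice: 47.5 HBAR
    assert_eq!(shares[1].1, 4_750_000_000);  // Carol: 47.5 HBAR
}
```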

6. Appendix Updates

Appendix B: Constants (ADD)

# L2 Entity Graph limits
MAX_ENTITIES_PER_L2 = 10000
MAX_RELATIONSHIPS_PER_L2 = 50000
MAX_ALIASES_PER_ENTITY = 50
MAX_CANONICAL_LABEL_LENGTH = 200
MAX_PREDICATE_LENGTH = 100
MAX_ENTITY_DESCRIPTION_LENGTH = 500
MAX_SOURCE_L1S_PER_L2 = 100
MAX_SOURCE_L2S_PER_MERGE = 20

Appendix C: Error Codes (ADD)

# L2 specific errors
L2_INVALID_STRUCTURE    = 0x0210    # Malformed L2EntityGraph
L2_MISSING_SOURCE       = 0x0211    # Source L1 not found
L2_ENTITY_LIMIT         = 0x0212    # Too many entities
L2_RELATIONSHIP_LIMIT   = 0x0213    # Too many relationships
L2_INVALID_ENTITY_REF   = 0x0214    # Relationship references invalid entity
L2_CYCLE_DETECTED       = 0x0215    # Circular entity reference

7. Migration Notes

Backward Compatibility

  • Existing L0 → L1 → L3 chains remain valid
  • L2 is optional; protocols can continue without it
  • Nodes that don’t understand L2 treat it as an unknown content type
  • Network upgrade is additive (no breaking changes)

Migration Phases

  1. Phase 1: Add L2 data structures to types
  2. Phase 2: Add L2 validation rules
  3. Phase 3: Add BUILD_L2 operation
  4. Phase 4: Update DERIVE to accept L2 sources
  5. Phase 5: Add L2 preview messages
  6. Phase 6: Update DHT announcements

8. Design Rationale

Why L2 at Protocol Level?

  1. Complete Provenance: Without L2, the provenance chain has a gap. Entity resolution work is invisible.

  2. Fair Compensation: Building high-quality entity graphs requires significant effort (manual curation, ML models, external KB integration). This work deserves compensation.

  3. Reusability: A well-built entity graph is valuable to many consumers. Making it a first-class content type enables this.

  4. Interoperability: Protocol-level standardization ensures L2 graphs from different nodes are compatible.

Why L0/L1 Remain the Roots?

The economic model preserves foundational value:

  • L0/L1 represent irreducible source material
  • L2/L3 are transformations that add value but depend on foundations
  • Synthesis fees (5%) compensate L2/L3 creators for their work
  • Root pool (95%) ensures original contributors are always paid

This prevents value extraction where intermediaries capture all revenue without compensating sources.

L2 Implementation Flexibility

The spec defines structures but not algorithms:

  • Entity extraction: Rule-based, NLP, or ML
  • Entity resolution: String matching, embedding similarity, or external KB
  • Relationship extraction: Dependency parsing, pattern matching, or LLM

Implementers choose appropriate methods for their use case.

Module: nodalync-crypto

Source: Protocol Specification §3

Overview

This module provides all cryptographic primitives for the Nodalync protocol. It has no internal dependencies and should be implemented first.

Dependencies

External only:

  • sha2 — SHA-256 implementation
  • ed25519-dalek — Ed25519 signatures
  • rand — Random number generation
  • bs58 — Base58 encoding (for human-readable IDs)

§3.1 Hash Function

Algorithm: SHA-256

Content hashes are computed as:

ContentHash(content) = H(
    0x00 ||                    # Domain separator for content
    len(content) as uint64 ||  # Big-endian length prefix
    content                    # Raw content bytes
)

Implementation Notes

  • Use domain separator 0x00 to prevent hash collision across different uses
  • Length is encoded as big-endian uint64
  • Returns 32-byte hash

Test Cases

  1. Determinism: Same content → same hash
  2. Uniqueness: Different content → different hash (probabilistic)
  3. Domain separation: ContentHash(x) ≠ H(x) (raw hash without prefix)
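The preimage framing above can be sketched with the standard library alone. A real node feeds this byte string through SHA-256 (e.g., via the sha2 crate); only the framing is shown here.

```rust
// Sketch: building the ContentHash preimage — domain separator 0x00,
// big-endian u64 length prefix, then the raw content bytes.
fn content_hash_preimage(content: &[u8]) -> Vec<u8> {
    let mut preimage = Vec::with_capacity(9 + content.len());
    preimage.push(0x00);                                          // domain separator
    preimage.extend_from_slice(&(content.len() as u64).to_be_bytes()); // big-endian length
    preimage.extend_from_slice(content);                          // raw content
    preimage
}

fn main() {
    let p = content_hash_preimage(b"hello");
    assert_eq!(p[0], 0x00);                    // content domain byte
    assert_eq!(&p[1..9], &5u64.to_be_bytes()); // big-endian length prefix
    assert_eq!(&p[9..], b"hello");             // content follows
}
```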

§3.2 Identity

Algorithm: Ed25519

Keypair Generation

#![allow(unused)]
fn main() {
fn generate_keypair() -> (PrivateKey, PublicKey)
}

PeerId Derivation

PeerId is derived from public key:

PeerId = H(
    0x00 ||                    # Key type: Ed25519
    public_key                 # 32 bytes
)[0:20]                        # Truncate to 20 bytes

Human-Readable Format

Format: ndl1 + base32(PeerId)

Example: ndl1qpzry9x8gf2tvdw0s3jn54khce6mua7l

Implementation Notes

  • PeerId is 20 bytes (160 bits) — sufficient entropy, compact
  • Prefix ndl1 identifies Nodalync addresses (like bc1 for Bitcoin)
  • Use Bech32 or similar for human-readable encoding with checksum

Test Cases

  1. Determinism: Same public key → same PeerId
  2. Roundtrip: encode → decode → original PeerId
  3. Checksum: Invalid checksum rejected

§3.3 Signatures

All protocol messages requiring authentication are signed.

Signature Creation

#![allow(unused)]
fn main() {
fn sign(private_key: &PrivateKey, message: &[u8]) -> Signature
}

Internally:

signature = Ed25519_Sign(private_key, H(message))

Signature Verification

#![allow(unused)]
fn main() {
fn verify(public_key: &PublicKey, message: &[u8], signature: &Signature) -> bool
}

Internally:

Ed25519_Verify(public_key, H(message), signature)

SignedMessage Structure

#![allow(unused)]
fn main() {
pub struct SignedMessage {
    pub payload: Vec<u8>,
    pub signer: PeerId,
    pub signature: Signature,
}
}

Test Cases

  1. Valid signature: Sign → Verify succeeds
  2. Tampered message: Modify payload → Verify fails
  3. Wrong key: Verify with different public key → fails
  4. Truncated signature: Short signature → fails

§3.4 Content Addressing

Content is referenced by its hash. The hash serves as a unique, verifiable identifier.

Verification

#![allow(unused)]
fn main() {
fn verify_content(content: &[u8], expected_hash: &Hash) -> bool {
    content_hash(content) == *expected_hash
}
}

Test Cases

  1. Valid content: Verify succeeds
  2. Tampered content: Single byte change → Verify fails

Data Types

#![allow(unused)]
fn main() {
/// 32-byte SHA-256 hash
pub struct Hash(pub [u8; 32]);

/// Ed25519 private key (32 bytes, keep secret)
pub struct PrivateKey([u8; 32]);

/// Ed25519 public key (32 bytes)
pub struct PublicKey(pub [u8; 32]);

/// Ed25519 signature (64 bytes)
pub struct Signature(pub [u8; 64]);

/// Truncated hash of public key (20 bytes)
pub struct PeerId(pub [u8; 20]);

/// Milliseconds since Unix epoch
pub type Timestamp = u64;
}

Public API

#![allow(unused)]
fn main() {
// Content hashing
pub fn content_hash(content: &[u8]) -> Hash;
pub fn verify_content(content: &[u8], expected: &Hash) -> bool;

// Identity
pub fn generate_identity() -> (PrivateKey, PublicKey);
pub fn peer_id_from_public_key(public_key: &PublicKey) -> PeerId;
pub fn peer_id_to_string(peer_id: &PeerId) -> String;
pub fn peer_id_from_string(s: &str) -> Result<PeerId, ParseError>;

// Signing
pub fn sign(private_key: &PrivateKey, message: &[u8]) -> Signature;
pub fn verify(public_key: &PublicKey, message: &[u8], signature: &Signature) -> bool;
}

Appendix: Hash Domain Separators

From spec Appendix A.2:

UseDomain ByteDescription
Content0x00Content hashing
Messages0x01Message signing
Channels0x02Channel state

These ensure hashes computed for different purposes never collide.

Module: nodalync-types

Source: Protocol Specification §4

Overview

This module defines all data structures used across the protocol. No logic, just definitions with validation constraints documented.

Dependencies

  • nodalync-crypto — Hash, PeerId, Signature types
  • serde — Serialization derives

§4.1 ContentType

#![allow(unused)]
fn main() {
#[repr(u8)]
pub enum ContentType {
    /// Raw input (documents, notes, transcripts)
    L0 = 0x00,
    /// Mentions (extracted atomic facts)
    L1 = 0x01,
    /// Entity Graph (personal knowledge structure) - always private
    L2 = 0x02,
    /// Insights (emergent synthesis)
    L3 = 0x03,
}
}

Knowledge Layer Semantics:

LayerQueryablePurpose
L0YesOriginal source material
L1YesStructured, quotable claims
L2NoYour personal perspective (cross-document linking)
L3YesOriginal analysis and conclusions

Note: L2 is personal — always visibility = Private, never announced, never queried by others.


§4.2 Visibility

#![allow(unused)]
fn main() {
#[repr(u8)]
pub enum Visibility {
    /// Local only, not served to others
    Private = 0x00,
    /// Served if hash known, not announced to DHT
    Unlisted = 0x01,
    /// Announced to DHT, publicly queryable
    Shared = 0x02,
}
}

§4.3 Version

#![allow(unused)]
fn main() {
pub struct Version {
    /// Sequential version number (1-indexed)
    pub number: u32,
    /// Hash of previous version (None if first version)
    pub previous: Option<Hash>,
    /// Hash of first version (stable identifier across versions)
    pub root: Hash,
    /// Creation timestamp
    pub timestamp: Timestamp,
}
}

Constraints:

  • If number == 1: previous MUST be None, root MUST equal content hash
  • If number > 1: previous MUST be Some, root MUST equal previous.root
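A minimal sketch of enforcing these two constraints, with `Hash` replaced by a string stand-in and the previous version's root passed in by the caller:

```rust
// Sketch: checking the Version constraints above.
struct Version {
    number: u32,
    previous: Option<String>, // hash of previous version, as a string stand-in
    root: String,             // hash of the first version
}

fn valid_version(v: &Version, content_hash: &str, prev_root: Option<&str>) -> bool {
    if v.number == 1 {
        // First version: no predecessor, root is its own content hash.
        v.previous.is_none() && v.root == content_hash
    } else {
        // Later versions: must link back, root carries over unchanged.
        v.previous.is_some() && prev_root == Some(v.root.as_str())
    }
}

fn main() {
    let first = Version { number: 1, previous: None, root: "h1".into() };
    assert!(valid_version(&first, "h1", None));

    let second = Version { number: 2, previous: Some("h1".into()), root: "h1".into() };
    assert!(valid_version(&second, "h2", Some("h1")));

    // First version must not point at a previous hash.
    let bad = Version { number: 1, previous: Some("h0".into()), root: "h1".into() };
    assert!(!valid_version(&bad, "h1", None));
}
```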

§4.4 Mention (L1)

#![allow(unused)]
fn main() {
pub struct Mention {
    /// H(content || source_location)
    pub id: Hash,
    /// The atomic fact (max 1000 chars)
    pub content: String,
    /// Where in L0 this fact came from
    pub source_location: SourceLocation,
    /// Type of fact
    pub classification: Classification,
    /// How certain we are this is in the source
    pub confidence: Confidence,
    /// Extracted entity names
    pub entities: Vec<String>,
}

pub struct SourceLocation {
    pub location_type: LocationType,
    /// Location identifier (paragraph number, page, timestamp, etc.)
    pub reference: String,
    /// Exact quote from source (max 500 chars)
    pub quote: Option<String>,
}

#[repr(u8)]
pub enum LocationType {
    Paragraph = 0x00,
    Page = 0x01,
    Timestamp = 0x02,
    Line = 0x03,
    Section = 0x04,
}

#[repr(u8)]
pub enum Classification {
    Claim = 0x00,
    Statistic = 0x01,
    Definition = 0x02,
    Observation = 0x03,
    Method = 0x04,
    Result = 0x05,
}

#[repr(u8)]
pub enum Confidence {
    /// Directly stated in source
    Explicit = 0x00,
    /// Reasonably inferred
    Inferred = 0x01,
}
}

§4.4a Entity Graph (L2)

L2 represents your personal knowledge graph — how you link entities across documents you’ve studied.

URI Type

#![allow(unused)]
fn main() {
/// URI for RDF interoperability
/// Can be:
///   - Full URI: "http://schema.org/Person"
///   - Compact URI (CURIE): "schema:Person" (expanded using prefixes)
///   - Protocol-defined: "ndl:Person"
pub type Uri = String;
}

Prefix Mapping

#![allow(unused)]
fn main() {
/// Maps short prefixes to full URI namespaces
pub struct PrefixMap {
    pub entries: Vec<PrefixEntry>,
}

pub struct PrefixEntry {
    /// Short prefix, e.g., "schema"
    pub prefix: String,
    /// Full URI namespace, e.g., "http://schema.org/"
    pub uri: String,
}

impl Default for PrefixMap {
    fn default() -> Self {
        Self {
            entries: vec![
                PrefixEntry { prefix: "ndl".into(), uri: "https://nodalync.io/ontology/".into() },
                PrefixEntry { prefix: "schema".into(), uri: "http://schema.org/".into() },
                PrefixEntry { prefix: "foaf".into(), uri: "http://xmlns.com/foaf/0.1/".into() },
                PrefixEntry { prefix: "dc".into(), uri: "http://purl.org/dc/elements/1.1/".into() },
                PrefixEntry { prefix: "rdf".into(), uri: "http://www.w3.org/1999/02/22-rdf-syntax-ns#".into() },
                PrefixEntry { prefix: "rdfs".into(), uri: "http://www.w3.org/2000/01/rdf-schema#".into() },
                PrefixEntry { prefix: "xsd".into(), uri: "http://www.w3.org/2001/XMLSchema#".into() },
                PrefixEntry { prefix: "owl".into(), uri: "http://www.w3.org/2002/07/owl#".into() },
            ],
        }
    }
}
}
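A sketch of CURIE expansion against the prefix map, as the `Uri` comments describe. The slice-of-pairs representation is a stand-in for `PrefixMap`; exact pass-through rules for unknown prefixes are an assumption here, not spec text.

```rust
// Sketch: expanding a compact URI (CURIE) like "schema:Person" against the
// prefix map. Full URIs pass through; unknown prefixes are left unchanged.
fn expand_curie(uri: &str, prefixes: &[(&str, &str)]) -> String {
    if uri.starts_with("http://") || uri.starts_with("https://") {
        return uri.to_string(); // already a full URI
    }
    if let Some((prefix, local)) = uri.split_once(':') {
        if let Some(&(_, ns)) = prefixes.iter().find(|&&(p, _)| p == prefix) {
            return format!("{ns}{local}");
        }
    }
    uri.to_string()
}

fn main() {
    let prefixes = [
        ("schema", "http://schema.org/"),
        ("ndl", "https://nodalync.io/ontology/"),
    ];
    assert_eq!(expand_curie("schema:Person", &prefixes), "http://schema.org/Person");
    assert_eq!(expand_curie("ndl:Concept", &prefixes), "https://nodalync.io/ontology/Concept");
    assert_eq!(expand_curie("http://schema.org/Person", &prefixes), "http://schema.org/Person");
}
```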

L2 Entity Graph

#![allow(unused)]
fn main() {
pub struct L2EntityGraph {
    /// H(serialized entities + relationships)
    pub id: Hash,
    
    // === Sources ===
    /// L1 summaries this graph was built from
    pub source_l1s: Vec<L1Reference>,
    /// Other L2 graphs merged/extended (for MERGE_L2)
    pub source_l2s: Vec<Hash>,
    
    // === Namespace Prefixes ===
    pub prefixes: PrefixMap,
    
    // === Graph Content ===
    pub entities: Vec<Entity>,
    pub relationships: Vec<Relationship>,
    
    // === Statistics ===
    pub entity_count: u32,
    pub relationship_count: u32,
    pub source_mention_count: u32,
}

pub struct L1Reference {
    /// Hash of the L1Summary content
    pub l1_hash: Hash,
    /// The original L0 this L1 came from
    pub l0_hash: Hash,
    /// Which specific mentions were used (empty = all)
    pub mention_ids_used: Vec<Hash>,
}
}

Entity

#![allow(unused)]
fn main() {
pub struct Entity {
    /// Stable entity ID: H(canonical_uri || canonical_label)
    pub id: Hash,
    
    // === Identity ===
    /// Primary human-readable name (max 200 chars)
    pub canonical_label: String,
    /// Canonical URI, e.g., "dbr:Albert_Einstein"
    pub canonical_uri: Option<Uri>,
    /// Alternative names/spellings (max 50)
    pub aliases: Vec<String>,
    
    // === Type (RDF-compatible) ===
    /// e.g., ["schema:Person", "foaf:Person"]
    pub entity_types: Vec<Uri>,
    
    // === Evidence ===
    /// Which L1 mentions establish this entity
    pub source_mentions: Vec<MentionRef>,
    
    // === Confidence ===
    /// 0.0 - 1.0, resolution confidence
    pub confidence: f64,
    pub resolution_method: ResolutionMethod,
    
    // === Optional Metadata ===
    /// Summary description (max 500 chars)
    pub description: Option<String>,
    /// owl:sameAs links to external entities
    pub same_as: Option<Vec<Uri>>,
}

pub struct MentionRef {
    /// Which L1 contains this mention
    pub l1_hash: Hash,
    /// Specific mention ID within that L1
    pub mention_id: Hash,
}

#[repr(u8)]
pub enum ResolutionMethod {
    /// Same string
    ExactMatch = 0x00,
    /// Case/punctuation normalized
    Normalized = 0x01,
    /// Known alias matched
    Alias = 0x02,
    /// Pronoun/reference resolved
    Coreference = 0x03,
    /// Matched via external KB
    ExternalLink = 0x04,
    /// Human-verified
    Manual = 0x05,
    /// ML model assisted
    AIAssisted = 0x06,
}
}

Relationship

#![allow(unused)]
fn main() {
pub struct Relationship {
    /// H(subject || predicate || object)
    pub id: Hash,
    
    // === Triple ===
    /// Entity ID (subject)
    pub subject: Hash,
    /// RDF predicate URI, e.g., "schema:worksFor"
    pub predicate: Uri,
    /// Entity ID, external ref, or literal value
    pub object: RelationshipObject,
    
    // === Evidence ===
    /// Mentions that support this relationship
    pub source_mentions: Vec<MentionRef>,
    /// 0.0 - 1.0
    pub confidence: f64,
    
    // === Temporal (optional) ===
    pub valid_from: Option<Timestamp>,
    pub valid_to: Option<Timestamp>,
}

pub enum RelationshipObject {
    /// Reference to another entity in this graph
    EntityRef(Hash),
    /// Reference to external entity by URI
    ExternalRef(Uri),
    /// A typed literal value
    Literal(LiteralValue),
}

pub struct LiteralValue {
    /// The value as string
    pub value: String,
    /// XSD datatype URI, e.g., "xsd:date" (None = plain string)
    pub datatype: Option<Uri>,
    /// Language tag, e.g., "en" (for strings only)
    pub language: Option<String>,
}
}

L2 Build/Merge Configuration

#![allow(unused)]
fn main() {
pub struct L2BuildConfig {
    /// Custom prefix mappings (merged with defaults)
    pub prefixes: Option<PrefixMap>,
    /// Default entity type if not detected, default: "ndl:Concept"
    pub default_entity_type: Option<Uri>,
    /// Minimum confidence to merge entities (default: 0.8)
    pub resolution_threshold: Option<f64>,
    /// Link to external knowledge bases
    pub use_external_kb: Option<bool>,
    /// Which KBs: ["http://www.wikidata.org/", ...]
    pub external_kb_list: Option<Vec<Uri>>,
    /// Infer implicit relationships
    pub extract_implicit: Option<bool>,
    /// Limit to specific predicates
    pub relationship_predicates: Option<Vec<Uri>>,
}

pub struct L2MergeConfig {
    /// Override prefix mappings
    pub prefixes: Option<PrefixMap>,
    /// Confidence threshold for cross-graph entity merging
    pub entity_merge_threshold: Option<f64>,
    /// Index of source to prefer on conflicts
    pub prefer_source: Option<u32>,
}
}

L2 Constraints:

  • visibility MUST be Private (L2 is never shared)
  • economics.price MUST be 0 (L2 is never queried)
  • source_l1s.len() >= 1
  • entities.len() >= 1
  • All entity IDs unique within graph
  • All relationship entity refs point to valid entities or external URIs
  • All MentionRefs point to valid L1s in source_l1s
  • 0.0 <= confidence <= 1.0

§4.5 Provenance

#![allow(unused)]
fn main() {
pub struct Provenance {
    /// All foundational L0+L1 sources
    pub root_L0L1: Vec<ProvenanceEntry>,
    /// Direct parent hashes (immediate sources)
    pub derived_from: Vec<Hash>,
    /// Max derivation depth from any L0
    pub depth: u32,
}

pub struct ProvenanceEntry {
    /// Content hash
    pub hash: Hash,
    /// Owner's node ID
    pub owner: PeerId,
    /// Visibility at time of derivation
    pub visibility: Visibility,
    /// Weight for duplicate handling
    /// (same source appearing multiple times gets higher weight)
    pub weight: u32,
}
}

Constraints:

  • root_L0L1 contains only L0/L1 entries (never L2 or L3)
  • For L0: root_L0L1 = [self], derived_from = [], depth = 0
  • For L1: root_L0L1 = [parent L0], derived_from = [L0 hash], depth = 1
  • For L2: root_L0L1 = merged from source L1s, derived_from = L1/L2 hashes, depth >= 2
  • For L3: root_L0L1.len() >= 1, derived_from.len() >= 1, depth = max(sources) + 1
  • All hashes in derived_from must have been queried by creator (or owned)
  • No self-reference allowed

§4.6 AccessControl

#![allow(unused)]
fn main() {
pub struct AccessControl {
    /// If set, only these peers can query (None = all allowed)
    pub allowlist: Option<Vec<PeerId>>,
    /// These peers are blocked (None = none blocked)
    pub denylist: Option<Vec<PeerId>>,
    /// Require payment bond to query
    pub require_bond: bool,
    /// Bond amount if required
    pub bond_amount: Option<Amount>,
    /// Rate limit per peer (None = unlimited)
    pub max_queries_per_peer: Option<u32>,
}
}

Access Logic:

Access granted if:
    (allowlist is None OR peer in allowlist) AND
    (denylist is None OR peer NOT in denylist) AND
    (require_bond is false OR peer has posted bond)
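The access predicate above translates directly into boolean logic. A sketch with `PeerId` simplified to a string slice and the bond lookup reduced to a flag:

```rust
// Sketch of the AccessControl check: allowlist, denylist, and bond
// requirement combine with AND, exactly as the access logic states.
fn access_granted(
    peer: &str,
    allowlist: Option<&[&str]>,
    denylist: Option<&[&str]>,
    require_bond: bool,
    has_bond: bool,
) -> bool {
    let allowed = allowlist.map_or(true, |list| list.contains(&peer)); // None = all allowed
    let denied = denylist.map_or(false, |list| list.contains(&peer));  // None = none blocked
    allowed && !denied && (!require_bond || has_bond)
}

fn main() {
    // No lists, no bond required: everyone may query.
    assert!(access_granted("alice", None, None, false, false));
    // Allowlist in force: unlisted peers are refused.
    assert!(!access_granted("mallory", Some(&["alice"]), None, false, false));
    // Bond required but not posted.
    assert!(!access_granted("alice", None, None, true, false));
    // Denylist blocks even allowlisted peers.
    assert!(!access_granted("eve", Some(&["eve"]), Some(&["eve"]), false, false));
}
```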

§4.7 Economics

#![allow(unused)]
fn main() {
pub struct Economics {
    /// Price per query (in tinybars, 10^-8 HBAR)
    pub price: Amount,
    /// Currency identifier
    pub currency: Currency,
    /// Total queries served
    pub total_queries: u64,
    /// Total revenue generated
    pub total_revenue: Amount,
}

#[repr(u8)]
pub enum Currency {
    /// Hedera native token (1 HBAR = 10^8 tinybars)
    HBAR = 0x00,
}

/// Amount in tinybars (10^-8 HBAR)
pub type Amount = u64;
}

§4.8 Manifest

The complete metadata for a content item:

#![allow(unused)]
fn main() {
pub struct Manifest {
    // === Identity ===
    /// Content hash (unique identifier)
    pub hash: Hash,
    /// Type of content
    pub content_type: ContentType,
    /// Owner's peer ID (receives synthesis fee, serves content)
    pub owner: PeerId,
    
    // === Versioning ===
    pub version: Version,
    
    // === Visibility & Access ===
    pub visibility: Visibility,
    pub access: AccessControl,
    
    // === Metadata ===
    pub metadata: Metadata,
    
    // === Economics ===
    pub economics: Economics,
    
    // === Provenance ===
    pub provenance: Provenance,
    
    // === Timestamps ===
    pub created_at: Timestamp,
    pub updated_at: Timestamp,
}

pub struct Metadata {
    /// Max 200 chars
    pub title: String,
    /// Max 2000 chars
    pub description: Option<String>,
    /// Max 20 tags, each max 50 chars
    pub tags: Vec<String>,
    /// Size in bytes
    pub content_size: u64,
    /// MIME type if applicable
    pub mime_type: Option<String>,
}

§4.9 L1Summary (Preview)

pub struct L1Summary {
    /// Source L0 hash
    pub l0_hash: Hash,
    /// Total mentions extracted
    pub mention_count: u32,
    /// First N mentions (max 5)
    pub preview_mentions: Vec<Mention>,
    /// Main topics (max 5)
    pub primary_topics: Vec<String>,
    /// 2-3 sentence summary (max 500 chars)
    pub summary: String,
}

Additional Types

Payment Channel

pub struct Channel {
    /// Unique channel identifier: H(initiator || responder || nonce)
    pub channel_id: Hash,
    pub peer_id: PeerId,
    pub state: ChannelState,
    pub my_balance: Amount,
    pub their_balance: Amount,
    pub nonce: u64,
    pub last_update: Timestamp,
    pub pending_payments: Vec<Payment>,
}

#[repr(u8)]
pub enum ChannelState {
    Opening = 0x00,
    Open = 0x01,
    Closing = 0x02,
    Closed = 0x03,
    Disputed = 0x04,
}

pub struct Payment {
    /// H(channel_id || nonce || amount || recipient)
    pub id: Hash,
    /// Channel this payment belongs to
    /// NOTE: Not in spec §5.3 but added for implementation convenience
    /// (needed to compute id, lookup payments by channel)
    pub channel_id: Hash,
    pub amount: Amount,
    pub recipient: PeerId,
    /// Content that was queried
    pub query_hash: Hash,
    /// For distribution to all root contributors
    pub provenance: Vec<ProvenanceEntry>,
    pub timestamp: Timestamp,
    /// Signed by payer
    pub signature: Signature,
}

Distribution

pub struct Distribution {
    pub recipient: PeerId,
    pub amount: Amount,
    /// Which source this is for
    pub source_hash: Hash,
}

Settlement

pub struct SettlementEntry {
    pub recipient: PeerId,
    pub amount: Amount,
    /// Content hashes for audit
    pub provenance_hashes: Vec<Hash>,
    /// Payment IDs included
    pub payment_ids: Vec<Hash>,
}

pub struct SettlementBatch {
    pub batch_id: Hash,
    pub entries: Vec<SettlementEntry>,
    /// Root of entries merkle tree
    pub merkle_root: Hash,
}

Constants (from Appendix B)

pub mod constants {
    use super::Amount;
    
    // Limits
    pub const MAX_CONTENT_SIZE: u64 = 104_857_600;  // 100 MB
    pub const MAX_MESSAGE_SIZE: u64 = 10_485_760;   // 10 MB
    pub const MAX_MENTIONS_PER_L0: u32 = 1000;
    pub const MAX_SOURCES_PER_L3: u32 = 100;
    pub const MAX_PROVENANCE_DEPTH: u32 = 100;
    pub const MAX_TAGS: usize = 20;
    pub const MAX_TAG_LENGTH: usize = 50;
    pub const MAX_TITLE_LENGTH: usize = 200;
    pub const MAX_DESCRIPTION_LENGTH: usize = 2000;
    pub const MAX_SUMMARY_LENGTH: usize = 500;
    pub const MAX_MENTION_CONTENT_LENGTH: usize = 1000;
    pub const MAX_QUOTE_LENGTH: usize = 500;
    
    // L2 Entity Graph limits
    pub const MAX_ENTITIES_PER_L2: u32 = 10_000;
    pub const MAX_RELATIONSHIPS_PER_L2: u32 = 50_000;
    pub const MAX_ALIASES_PER_ENTITY: usize = 50;
    pub const MAX_CANONICAL_LABEL_LENGTH: usize = 200;
    pub const MAX_PREDICATE_LENGTH: usize = 100;
    pub const MAX_ENTITY_DESCRIPTION_LENGTH: usize = 500;
    pub const MAX_SOURCE_L1S_PER_L2: usize = 100;
    pub const MAX_SOURCE_L2S_PER_MERGE: usize = 20;
    
    // Economics
    pub const MIN_PRICE: Amount = 1;
    pub const MAX_PRICE: Amount = 10_000_000_000_000_000;  // 10^16
    pub const SYNTHESIS_FEE_NUMERATOR: u64 = 5;
    pub const SYNTHESIS_FEE_DENOMINATOR: u64 = 100;  // 5%
    pub const SETTLEMENT_BATCH_THRESHOLD: Amount = 10_000_000_000;  // 100 HBAR
    pub const SETTLEMENT_BATCH_INTERVAL_MS: u64 = 3_600_000;  // 1 hour
    
    // Timing
    pub const MESSAGE_TIMEOUT_MS: u64 = 30_000;
    pub const CHANNEL_DISPUTE_PERIOD_MS: u64 = 86_400_000;  // 24 hours
    pub const MAX_CLOCK_SKEW_MS: u64 = 300_000;  // 5 minutes
    
    // DHT
    pub const DHT_BUCKET_SIZE: usize = 20;
    pub const DHT_ALPHA: usize = 3;
    pub const DHT_REPLICATION: usize = 20;
}
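As a worked example of the synthesis fee constants: with a 5% fee, a 1,000-tinybar query sends 50 tinybars to the synthesizer and 950 to root contributors. A sketch (truncating integer division is an assumption; the spec may specify rounding differently):

```rust
pub const SYNTHESIS_FEE_NUMERATOR: u64 = 5;
pub const SYNTHESIS_FEE_DENOMINATOR: u64 = 100; // 5%

/// Split a query price into (synthesis_fee, contributor_share).
/// Truncating division is an assumption, not taken from the spec text.
pub fn fee_split(price: u64) -> (u64, u64) {
    let fee = price * SYNTHESIS_FEE_NUMERATOR / SYNTHESIS_FEE_DENOMINATOR;
    (fee, price - fee)
}
```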

Error Types

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u16)]
pub enum ErrorCode {
    // Query Errors (0x0001 - 0x00FF)
    NotFound = 0x0001,
    AccessDenied = 0x0002,
    PaymentRequired = 0x0003,
    PaymentInvalid = 0x0004,
    RateLimited = 0x0005,
    VersionNotFound = 0x0006,
    
    // Channel Errors (0x0100 - 0x01FF)
    ChannelNotFound = 0x0100,
    ChannelClosed = 0x0101,
    InsufficientBalance = 0x0102,
    InvalidNonce = 0x0103,
    InvalidSignature = 0x0104,
    
    // Validation Errors (0x0200 - 0x02FF)
    InvalidHash = 0x0200,
    InvalidProvenance = 0x0201,
    InvalidVersion = 0x0202,
    InvalidManifest = 0x0203,
    ContentTooLarge = 0x0204,
    
    // L2 Entity Graph Errors (0x0210 - 0x021F)
    L2InvalidStructure = 0x0210,
    L2MissingSource = 0x0211,
    L2EntityLimit = 0x0212,
    L2RelationshipLimit = 0x0213,
    L2InvalidEntityRef = 0x0214,
    L2CycleDetected = 0x0215,
    L2InvalidUri = 0x0216,
    L2CannotPublish = 0x0217,
    
    // Network Errors (0x0300 - 0x03FF)
    PeerNotFound = 0x0300,
    ConnectionFailed = 0x0301,
    Timeout = 0x0302,
    
    // Internal Errors
    InternalError = 0xFFFF,
}

Implementation Notes

  1. All types should derive Debug, Clone, PartialEq, Eq where sensible
  2. All types should derive Serialize, Deserialize for wire format
  3. Use #[serde(rename_all = "snake_case")] for consistent JSON representation
  4. Consider #[non_exhaustive] for enums to allow future extension
  5. Implement Default for types where a sensible default exists

Module: nodalync-wire

Source: Protocol Specification §6, Appendix A

Overview

Message serialization and deserialization. Defines the wire format for all protocol messages.

Dependencies

  • nodalync-types — All data structures
  • ciborium — CBOR encoding

Message Envelope (§6.1)

pub struct Message {
    /// Protocol version (0x01)
    pub version: u8,
    /// Message type
    pub message_type: MessageType,
    /// Unique message ID
    pub id: Hash,
    /// Creation timestamp
    pub timestamp: Timestamp,
    /// Sender's peer ID
    pub sender: PeerId,
    /// Type-specific payload (CBOR encoded)
    pub payload: Vec<u8>,
    /// Signs H(version || type || id || timestamp || sender || payload_hash)
    pub signature: Signature,
}

#[repr(u16)]
pub enum MessageType {
    // Discovery (0x01xx)
    Announce = 0x0100,
    AnnounceUpdate = 0x0101,
    Search = 0x0110,
    SearchResponse = 0x0111,
    
    // Preview (0x02xx)
    PreviewRequest = 0x0200,
    PreviewResponse = 0x0201,
    
    // Query (0x03xx)
    QueryRequest = 0x0300,
    QueryResponse = 0x0301,
    QueryError = 0x0302,
    
    // Version (0x04xx)
    VersionRequest = 0x0400,
    VersionResponse = 0x0401,
    
    // Channel (0x05xx)
    ChannelOpen = 0x0500,
    ChannelAccept = 0x0501,
    ChannelUpdate = 0x0502,
    ChannelClose = 0x0503,
    ChannelDispute = 0x0504,
    
    // Settlement (0x06xx)
    SettleBatch = 0x0600,
    SettleConfirm = 0x0601,
    
    // Peer (0x07xx)
    Ping = 0x0700,
    Pong = 0x0701,
    PeerInfo = 0x0710,
}

Payload Types (§6.2 - §6.8)

Discovery Payloads

pub struct AnnouncePayload {
    pub hash: Hash,
    pub content_type: ContentType,
    pub title: String,
    pub l1_summary: L1Summary,
    pub price: Amount,
    pub addresses: Vec<String>,  // Multiaddrs
}

pub struct SearchPayload {
    pub query: String,
    pub filters: Option<SearchFilters>,
    pub limit: u32,
    pub offset: u32,
}

pub struct SearchFilters {
    pub content_types: Option<Vec<ContentType>>,
    pub max_price: Option<Amount>,
    pub min_reputation: Option<i64>,
    pub created_after: Option<Timestamp>,
    pub created_before: Option<Timestamp>,
    pub tags: Option<Vec<String>>,
}

pub struct SearchResult {
    pub hash: Hash,
    pub content_type: ContentType,
    pub title: String,
    pub owner: PeerId,
    pub l1_summary: L1Summary,
    pub price: Amount,
    pub total_queries: u64,
    pub relevance_score: f64,
    /// Publisher's reachable multiaddresses for reconnection
    pub publisher_addresses: Vec<String>,
}

Query Payloads

pub struct QueryRequestPayload {
    pub hash: Hash,
    pub query: Option<String>,
    pub payment: Payment,
    pub version_spec: Option<VersionSpec>,
}

pub enum VersionSpec {
    Latest,
    Number(u32),
    Hash(Hash),
}

pub struct QueryResponsePayload {
    pub hash: Hash,
    pub content: Vec<u8>,
    pub manifest: Manifest,
    pub payment_receipt: PaymentReceipt,
}

pub struct PaymentReceipt {
    pub payment_id: Hash,
    pub amount: Amount,
    pub timestamp: Timestamp,
    pub channel_nonce: u64,
    pub distributor_signature: Signature,
}

pub struct QueryErrorPayload {
    pub hash: Hash,
    pub error_code: ErrorCode,
    pub message: Option<String>,
}

Channel Payloads

pub struct ChannelOpenPayload {
    pub channel_id: Hash,
    pub initial_balance: Amount,
    pub funding_tx: Option<Vec<u8>>,
}

pub struct ChannelAcceptPayload {
    pub channel_id: Hash,
    pub initial_balance: Amount,
    pub funding_tx: Option<Vec<u8>>,
}

pub struct ChannelUpdatePayload {
    pub channel_id: Hash,
    pub nonce: u64,
    pub balances: ChannelBalances,
    pub payments: Vec<Payment>,
    pub signature: Signature,
}

pub struct ChannelBalances {
    pub initiator: Amount,
    pub responder: Amount,
}

pub struct ChannelClosePayload {
    pub channel_id: Hash,
    pub final_balances: ChannelBalances,
    /// Proposed on-chain settlement transaction
    pub settlement_tx: Vec<u8>,
}

pub struct ChannelDisputePayload {
    pub channel_id: Hash,
    /// Highest known state
    pub claimed_state: ChannelUpdatePayload,
    /// Supporting evidence
    pub evidence: Vec<Vec<u8>>,
}

Version Payloads

pub struct VersionRequestPayload {
    /// Stable version root identifier
    pub version_root: Hash,
}

pub struct VersionResponsePayload {
    pub version_root: Hash,
    pub versions: Vec<VersionInfo>,
    pub latest: Hash,
}

pub struct VersionInfo {
    pub hash: Hash,
    pub number: u32,
    pub timestamp: Timestamp,
    pub visibility: Visibility,
    pub price: Amount,
}

Settlement Payloads

pub struct SettleBatchPayload {
    pub batch_id: Hash,
    pub entries: Vec<SettlementEntry>,
    /// Root of entries merkle tree
    pub merkle_root: Hash,
    /// Signature from batch creator
    pub signature: Signature,
}

pub struct SettlementEntry {
    pub recipient: PeerId,
    pub amount: Amount,
    /// Content hashes for audit trail
    pub provenance_hashes: Vec<Hash>,
    /// Payment IDs included in this entry
    pub payment_ids: Vec<Hash>,
}

pub struct SettleConfirmPayload {
    pub batch_id: Hash,
    /// On-chain transaction ID
    pub transaction_id: String,
    pub block_number: u64,
    pub timestamp: Timestamp,
}

Peer Payloads

pub struct PingPayload {
    pub nonce: u64,
}

pub struct PongPayload {
    pub nonce: u64,
}

pub struct PeerInfoPayload {
    pub peer_id: PeerId,
    pub public_key: PublicKey,
    pub addresses: Vec<String>,  // Multiaddrs
    pub capabilities: Vec<Capability>,
    pub content_count: u64,
    pub uptime: u64,  // Seconds
}

#[repr(u8)]
pub enum Capability {
    /// Can serve queries
    Query = 0x01,
    /// Supports payment channels
    Channel = 0x02,
    /// Can initiate settlement
    Settle = 0x04,
    /// Participates in DHT indexing
    Index = 0x08,
}

Announce Update Payload

pub struct AnnounceUpdatePayload {
    /// Stable version root identifier
    pub version_root: Hash,
    /// New version hash
    pub new_hash: Hash,
    pub version_number: u32,
    pub title: String,
    pub l1_summary: L1Summary,
    pub price: Amount,
}

Wire Format (Appendix A)

Encoding Rules

  1. CBOR encoding (RFC 8949) with deterministic rules:

    • Map keys sorted lexicographically
    • No indefinite-length arrays or maps
    • Minimal integer encoding
    • No floating-point for amounts (use u64)
  2. Message wire format:

[0x00]                  # Protocol magic byte
[version: u8]           # Protocol version
[type: u16 BE]          # Message type
[length: u32 BE]        # Payload length
[payload: bytes]        # CBOR-encoded payload
[signature: 64 bytes]   # Ed25519 signature
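The framing above can be sketched as a byte-level encoder (field order and widths only; signature production and CBOR encoding are out of scope here, and the function name is illustrative):

```rust
/// Frame a message per the wire layout: magic, version, type (u16 BE),
/// payload length (u32 BE), payload bytes, 64-byte Ed25519 signature.
pub fn frame_message(version: u8, msg_type: u16, payload: &[u8], sig: &[u8; 64]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + 1 + 2 + 4 + payload.len() + 64);
    out.push(0x00); // protocol magic byte
    out.push(version);
    out.extend_from_slice(&msg_type.to_be_bytes());
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    out.extend_from_slice(sig);
    out
}
```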

Hash Computation

// Content hash (domain separator 0x00)
fn content_hash(content: &[u8]) -> Hash {
    let mut hasher = Sha256::new();
    hasher.update(&[0x00]);  // Domain separator
    hasher.update(&(content.len() as u64).to_be_bytes());
    hasher.update(content);
    Hash(hasher.finalize().into())
}

// Message hash for signing (domain separator 0x01)
fn message_hash(msg: &Message) -> Hash {
    let mut hasher = Sha256::new();
    hasher.update(&[0x01]);  // Domain separator
    hasher.update(&[msg.version]);
    hasher.update(&(msg.message_type as u16).to_be_bytes());
    hasher.update(&msg.id.0);
    hasher.update(&msg.timestamp.to_be_bytes());
    hasher.update(&msg.sender.0);
    hasher.update(&content_hash(&msg.payload).0);
    Hash(hasher.finalize().into())
}

// Channel state hash (domain separator 0x02)
fn channel_state_hash(channel_id: &Hash, nonce: u64, balances: &ChannelBalances) -> Hash {
    let mut hasher = Sha256::new();
    hasher.update(&[0x02]);  // Domain separator
    hasher.update(&channel_id.0);
    hasher.update(&nonce.to_be_bytes());
    hasher.update(&balances.initiator.to_be_bytes());
    hasher.update(&balances.responder.to_be_bytes());
    Hash(hasher.finalize().into())
}

Public API

// Encoding
pub fn encode_message(msg: &Message) -> Result<Vec<u8>, EncodeError>;
pub fn encode_payload<T: Serialize>(payload: &T) -> Result<Vec<u8>, EncodeError>;

// Decoding
pub fn decode_message(bytes: &[u8]) -> Result<Message, DecodeError>;
pub fn decode_payload<T: DeserializeOwned>(bytes: &[u8]) -> Result<T, DecodeError>;

// Message construction helpers
pub fn create_message(
    message_type: MessageType,
    payload: Vec<u8>,
    identity: &Identity,
) -> Message;

// Validation (checks format, not semantic validity)
pub fn validate_message_format(msg: &Message) -> Result<(), FormatError>;

Test Cases

  1. Roundtrip: Encode → Decode → identical message
  2. Determinism: Same message → same bytes (important for signatures)
  3. Invalid magic byte: Reject
  4. Invalid version: Reject
  5. Truncated message: Reject
  6. Invalid CBOR: Reject
  7. Signature mismatch: Reject

Module: nodalync-store

Source: Protocol Specification §5

Overview

Local storage for content, manifests, provenance graph, and payment channels.

Dependencies

  • nodalync-types — All data structures
  • rusqlite — SQLite for structured data
  • directories — Platform-specific paths

Storage Layout

~/.nodalync/
├── config.toml              # Node configuration
├── identity/
│   ├── keypair.key          # Ed25519 private key (encrypted)
│   └── peer_id              # Public identity
├── content/
│   └── {hash_prefix}/
│       └── {hash}           # Raw content files
├── nodalync.db              # SQLite: manifests, provenance, channels
└── cache/
    └── {hash_prefix}/
        └── {hash}           # Cached content from queries

§5.1 State Components

NodeState

pub struct NodeState {
    pub identity: Identity,
    pub content: ContentStore,
    pub manifests: ManifestStore,
    pub provenance: ProvenanceGraph,
    pub channels: ChannelStore,
    pub cache: CacheStore,
}

Identity Storage

Private key encrypted at rest:

  • Encryption: AES-256-GCM
  • Key derivation: Argon2id from user password
  • Nonce: Random 12 bytes, stored with ciphertext

§5.2 Provenance Graph

Bidirectional graph for efficient traversal:

pub trait ProvenanceGraph {
    /// Add content with its derivation sources
    fn add(&mut self, hash: &Hash, derived_from: &[Hash]) -> Result<()>;
    
    /// Get all root L0+L1 sources (flattened)
    fn get_roots(&self, hash: &Hash) -> Result<Vec<ProvenanceEntry>>;
    
    /// Get all content derived from this hash
    fn get_derivations(&self, hash: &Hash) -> Result<Vec<Hash>>;
    
    /// Check if A is an ancestor of B
    fn is_ancestor(&self, ancestor: &Hash, descendant: &Hash) -> Result<bool>;
}

SQL Schema:

-- Forward edges
CREATE TABLE derived_from (
    content_hash BLOB NOT NULL,
    source_hash BLOB NOT NULL,
    PRIMARY KEY (content_hash, source_hash)
);

-- Cached flattened roots (for performance)
CREATE TABLE root_cache (
    content_hash BLOB NOT NULL,
    root_hash BLOB NOT NULL,
    owner BLOB NOT NULL,
    visibility INTEGER NOT NULL,
    weight INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (content_hash, root_hash)
);

CREATE INDEX idx_derivations ON derived_from(source_hash);
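The weight column of root_cache counts how many derivation paths reach each root. An in-memory sketch of that flattening, with hashes simplified to u64 and cycles assumed absent (the no-self-reference rule is validated elsewhere):

```rust
use std::collections::HashMap;

/// Sketch of root flattening with weight accumulation, mirroring the
/// root_cache table. A node with no derived_from edges is a root L0/L1.
pub fn flatten_roots(edges: &HashMap<u64, Vec<u64>>, hash: u64) -> HashMap<u64, u64> {
    fn walk(edges: &HashMap<u64, Vec<u64>>, h: u64, weights: &mut HashMap<u64, u64>) {
        match edges.get(&h) {
            Some(sources) if !sources.is_empty() => {
                for &s in sources {
                    walk(edges, s, weights);
                }
            }
            // No sources: this is a root; each visit is one derivation path.
            _ => *weights.entry(h).or_insert(0) += 1,
        }
    }
    let mut weights = HashMap::new();
    walk(edges, hash, &mut weights);
    weights
}
```

Reaching the same root via multiple paths increases its weight, which is the behavior exercised by the "Weight accumulation" test case below.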

§5.3 Payment Channels

pub trait ChannelStore {
    fn create(&mut self, peer: &PeerId, channel: Channel) -> Result<()>;
    fn get(&self, peer: &PeerId) -> Result<Option<Channel>>;
    fn update(&mut self, peer: &PeerId, channel: &Channel) -> Result<()>;
    fn list_open(&self) -> Result<Vec<(PeerId, Channel)>>;
    fn add_payment(&mut self, peer: &PeerId, payment: Payment) -> Result<()>;
    fn get_pending_payments(&self, peer: &PeerId) -> Result<Vec<Payment>>;
    fn clear_payments(&mut self, peer: &PeerId, payment_ids: &[Hash]) -> Result<()>;
}

Trait Definitions

ContentStore

pub trait ContentStore {
    /// Store content, returns hash
    fn store(&mut self, content: &[u8]) -> Result<Hash>;
    
    /// Store content with known hash (for verification)
    fn store_verified(&mut self, hash: &Hash, content: &[u8]) -> Result<()>;
    
    /// Load content by hash
    fn load(&self, hash: &Hash) -> Result<Option<Vec<u8>>>;
    
    /// Check if content exists
    fn exists(&self, hash: &Hash) -> bool;
    
    /// Delete content
    fn delete(&mut self, hash: &Hash) -> Result<()>;
    
    /// Get content size without loading
    fn size(&self, hash: &Hash) -> Result<Option<u64>>;
}

ManifestStore

pub trait ManifestStore {
    fn store(&mut self, manifest: &Manifest) -> Result<()>;
    fn load(&self, hash: &Hash) -> Result<Option<Manifest>>;
    fn update(&mut self, manifest: &Manifest) -> Result<()>;
    fn delete(&mut self, hash: &Hash) -> Result<()>;
    
    /// List manifests with optional filtering
    fn list(&self, filter: ManifestFilter) -> Result<Vec<Manifest>>;
    
    /// Get all versions of content by version_root
    fn get_versions(&self, version_root: &Hash) -> Result<Vec<Manifest>>;
}

pub struct ManifestFilter {
    pub visibility: Option<Visibility>,
    pub content_type: Option<ContentType>,
    pub created_after: Option<Timestamp>,
    pub created_before: Option<Timestamp>,
    pub limit: Option<u32>,
    pub offset: Option<u32>,
}

CacheStore

pub trait CacheStore {
    /// Cache content from a query
    fn cache(&mut self, entry: CachedContent) -> Result<()>;
    
    /// Get cached content
    fn get(&self, hash: &Hash) -> Result<Option<CachedContent>>;
    
    /// Check if cached
    fn is_cached(&self, hash: &Hash) -> bool;
    
    /// Evict old entries (LRU)
    fn evict(&mut self, max_size_bytes: u64) -> Result<u64>;
    
    /// Clear all cache
    fn clear(&mut self) -> Result<()>;
}

pub struct CachedContent {
    pub hash: Hash,
    pub content: Vec<u8>,
    pub source_peer: PeerId,
    pub queried_at: Timestamp,
    /// NOTE: Spec §5.1 says "PaymentProof" but that type is undefined.
    /// Using PaymentReceipt from §6.4 instead.
    pub payment_proof: PaymentReceipt,
}

SettlementQueueStore

The settlement queue stores pending distributions until batch settlement. nodalync-ops writes to this queue after processing queries. nodalync-settle reads from this queue to create settlement batches.

pub trait SettlementQueueStore {
    /// Add a distribution to the queue
    fn enqueue(&mut self, distribution: QueuedDistribution) -> Result<()>;
    
    /// Get all pending distributions
    fn get_pending(&self) -> Result<Vec<QueuedDistribution>>;
    
    /// Get pending distributions for a specific recipient
    fn get_pending_for(&self, recipient: &PeerId) -> Result<Vec<QueuedDistribution>>;
    
    /// Get total pending amount across all recipients
    fn get_pending_total(&self) -> Result<Amount>;
    
    /// Mark distributions as settled (by payment IDs)
    fn mark_settled(&mut self, payment_ids: &[Hash], batch_id: &Hash) -> Result<()>;
    
    /// Get last settlement timestamp
    fn get_last_settlement_time(&self) -> Result<Option<Timestamp>>;
    
    /// Set last settlement timestamp
    fn set_last_settlement_time(&mut self, timestamp: Timestamp) -> Result<()>;
}

pub struct QueuedDistribution {
    /// Original payment ID this distribution came from
    pub payment_id: Hash,
    /// Recipient of this distribution
    pub recipient: PeerId,
    /// Amount owed
    pub amount: Amount,
    /// Source content hash (for audit)
    pub source_hash: Hash,
    /// When the original query happened
    pub queued_at: Timestamp,
}
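The Appendix B constants imply a batch trigger: settle when the pending total reaches 100 HBAR or an hour has elapsed. A sketch of that decision (combining the two conditions with OR is an assumption, not mandated by the spec text here):

```rust
pub const SETTLEMENT_BATCH_THRESHOLD: u64 = 10_000_000_000; // 100 HBAR in tinybars
pub const SETTLEMENT_BATCH_INTERVAL_MS: u64 = 3_600_000; // 1 hour

/// Decide whether to create a settlement batch from the queue.
/// OR-combining threshold and interval is an assumption.
pub fn should_settle(pending_total: u64, now_ms: u64, last_settlement_ms: u64) -> bool {
    pending_total >= SETTLEMENT_BATCH_THRESHOLD
        || now_ms.saturating_sub(last_settlement_ms) >= SETTLEMENT_BATCH_INTERVAL_MS
}
```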

SQL Schema (Full)

-- Manifests
CREATE TABLE manifests (
    hash BLOB PRIMARY KEY,
    content_type INTEGER NOT NULL,
    version_number INTEGER NOT NULL,
    version_previous BLOB,
    version_root BLOB NOT NULL,
    version_timestamp INTEGER NOT NULL,
    visibility INTEGER NOT NULL,
    title TEXT NOT NULL,
    description TEXT,
    tags TEXT,  -- JSON array
    content_size INTEGER NOT NULL,
    mime_type TEXT,
    price INTEGER NOT NULL,
    total_queries INTEGER NOT NULL DEFAULT 0,
    total_revenue INTEGER NOT NULL DEFAULT 0,
    created_at INTEGER NOT NULL,
    updated_at INTEGER NOT NULL,
    -- Access control stored as JSON
    access_control TEXT NOT NULL
);

CREATE INDEX idx_manifests_visibility ON manifests(visibility);
CREATE INDEX idx_manifests_version_root ON manifests(version_root);
CREATE INDEX idx_manifests_created ON manifests(created_at);

-- L1 Summaries
CREATE TABLE l1_summaries (
    l0_hash BLOB PRIMARY KEY,
    mention_count INTEGER NOT NULL,
    preview_mentions TEXT NOT NULL,  -- JSON
    primary_topics TEXT NOT NULL,     -- JSON
    summary TEXT NOT NULL
);

-- Payment Channels
CREATE TABLE channels (
    peer_id BLOB PRIMARY KEY,
    state INTEGER NOT NULL,
    my_balance INTEGER NOT NULL,
    their_balance INTEGER NOT NULL,
    nonce INTEGER NOT NULL,
    last_update INTEGER NOT NULL
);

-- Pending Payments
CREATE TABLE payments (
    id BLOB PRIMARY KEY,
    channel_peer BLOB NOT NULL,
    amount INTEGER NOT NULL,
    recipient BLOB NOT NULL,
    query_hash BLOB NOT NULL,
    provenance TEXT NOT NULL,  -- JSON
    timestamp INTEGER NOT NULL,
    signature BLOB NOT NULL,
    settled INTEGER NOT NULL DEFAULT 0,
    FOREIGN KEY (channel_peer) REFERENCES channels(peer_id)
);

CREATE INDEX idx_payments_channel ON payments(channel_peer);
CREATE INDEX idx_payments_settled ON payments(settled);

-- Cache metadata (content stored on filesystem)
CREATE TABLE cache (
    hash BLOB PRIMARY KEY,
    source_peer BLOB NOT NULL,
    queried_at INTEGER NOT NULL,
    size_bytes INTEGER NOT NULL,
    payment_receipt TEXT NOT NULL  -- JSON
);

CREATE INDEX idx_cache_queried ON cache(queried_at);

-- Settlement Queue
CREATE TABLE settlement_queue (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    payment_id BLOB NOT NULL,
    recipient BLOB NOT NULL,
    amount INTEGER NOT NULL,
    source_hash BLOB NOT NULL,
    queued_at INTEGER NOT NULL,
    settled INTEGER NOT NULL DEFAULT 0,
    batch_id BLOB  -- Set when settled
);

CREATE INDEX idx_settlement_queue_recipient ON settlement_queue(recipient);
CREATE INDEX idx_settlement_queue_settled ON settlement_queue(settled);

-- Settlement metadata
CREATE TABLE settlement_meta (
    key TEXT PRIMARY KEY,
    value TEXT NOT NULL
);
-- Stores: last_settlement_time

Test Cases

  1. Content roundtrip: Store → Load → identical
  2. Manifest CRUD: Create, read, update, delete
  3. Provenance graph: Add edges → get_roots returns correct set
  4. Weight accumulation: Same source via multiple paths → weight increases
  5. Channel state: Open → payments → state updates correctly
  6. Cache eviction: LRU eviction frees correct entries
  7. Concurrent access: Multiple readers, single writer
  8. Settlement queue enqueue: Add distribution → retrievable
  9. Settlement queue totals: Multiple distributions → correct sum
  10. Settlement queue mark settled: Mark as settled → no longer in pending
  11. Settlement queue by recipient: Filter by recipient works

Module: nodalync-valid

Source: Protocol Specification §9

Overview

Implements all validation rules for the protocol, returning detailed errors for debugging.

Dependencies

  • nodalync-types — All data structures
  • nodalync-crypto — Hash verification

Validation Trait

pub trait Validator {
    fn validate_content(&self, content: &[u8], manifest: &Manifest) -> Result<(), ValidationError>;
    fn validate_version(&self, manifest: &Manifest, previous: Option<&Manifest>) -> Result<(), ValidationError>;
    fn validate_provenance(&self, manifest: &Manifest, sources: &[Manifest]) -> Result<(), ValidationError>;
    fn validate_payment(&self, payment: &Payment, channel: &Channel, manifest: &Manifest) -> Result<(), ValidationError>;
    fn validate_message(&self, message: &Message) -> Result<(), ValidationError>;
    fn validate_access(&self, requester: &PeerId, manifest: &Manifest) -> Result<(), ValidationError>;
}

§9.1 Content Validation

fn validate_content(content: &[u8], manifest: &Manifest) -> Result<()> {
    // 1. Hash matches
    ensure!(
        content_hash(content) == manifest.hash,
        ContentValidation("hash mismatch")
    );
    
    // 2. Size matches
    ensure!(
        content.len() as u64 == manifest.metadata.content_size,
        ContentValidation("size mismatch")
    );
    
    // 3. Title length
    ensure!(
        manifest.metadata.title.len() <= MAX_TITLE_LENGTH,
        ContentValidation("title too long")
    );
    
    // 4. Description length
    if let Some(ref desc) = manifest.metadata.description {
        ensure!(
            desc.len() <= MAX_DESCRIPTION_LENGTH,
            ContentValidation("description too long")
        );
    }
    
    // 5. Tags
    ensure!(
        manifest.metadata.tags.len() <= MAX_TAGS,
        ContentValidation("too many tags")
    );
    for tag in &manifest.metadata.tags {
        ensure!(
            tag.len() <= MAX_TAG_LENGTH,
            ContentValidation("tag too long")
        );
    }
    
    // 6. Valid enums
    ensure!(
        matches!(manifest.content_type, ContentType::L0 | ContentType::L1 | ContentType::L2 | ContentType::L3),
        ContentValidation("invalid content type")
    );
    ensure!(
        matches!(manifest.visibility, Visibility::Private | Visibility::Unlisted | Visibility::Shared),
        ContentValidation("invalid visibility")
    );
    
    // 7. L2-specific validation
    if manifest.content_type == ContentType::L2 {
        validate_l2_content(content, manifest)?;
    }
    
    Ok(())
}

§9.1a L2 Content Validation

#![allow(unused)]
fn main() {
fn validate_l2_content(content: &[u8], manifest: &Manifest) -> Result<()> {
    // L2 MUST be private
    ensure!(
        manifest.visibility == Visibility::Private,
        L2Validation("L2 must be private")
    );
    
    // L2 MUST have zero price
    ensure!(
        manifest.economics.price == 0,
        L2Validation("L2 must have zero price")
    );
    
    // Deserialize and validate structure
    let l2: L2EntityGraph = deserialize(content)
        .map_err(|_| L2Validation("invalid L2 structure"))?;
    
    // ID matches
    ensure!(
        l2.id == manifest.hash,
        L2Validation("L2 id must match manifest hash")
    );
    
    // Must have at least one source L1
    ensure!(
        !l2.source_l1s.is_empty(),
        L2Validation("L2 must have at least one source L1")
    );
    ensure!(
        l2.source_l1s.len() <= MAX_SOURCE_L1S_PER_L2,
        L2Validation("too many source L1s")
    );
    
    // Must have at least one entity
    ensure!(
        !l2.entities.is_empty(),
        L2Validation("L2 must have at least one entity")
    );
    ensure!(
        l2.entities.len() <= MAX_ENTITIES_PER_L2 as usize,
        L2Validation("too many entities")
    );
    
    // Relationship limits
    ensure!(
        l2.relationships.len() <= MAX_RELATIONSHIPS_PER_L2 as usize,
        L2Validation("too many relationships")
    );
    
    // Counts match
    ensure!(
        l2.entity_count as usize == l2.entities.len(),
        L2Validation("entity_count mismatch")
    );
    ensure!(
        l2.relationship_count as usize == l2.relationships.len(),
        L2Validation("relationship_count mismatch")
    );
    
    // Validate prefix map
    validate_prefix_map(&l2.prefixes)?;
    
    // Validate all entities
    let mut entity_ids: HashSet<Hash> = HashSet::new();
    for entity in &l2.entities {
        validate_entity(entity, &l2.prefixes, &l2.source_l1s)?;
        ensure!(
            entity_ids.insert(entity.id),
            L2Validation("duplicate entity ID")
        );
    }
    
    // Validate all relationships
    for rel in &l2.relationships {
        validate_relationship(rel, &entity_ids, &l2.prefixes, &l2.source_l1s)?;
    }
    
    Ok(())
}

fn validate_prefix_map(prefixes: &PrefixMap) -> Result<()> {
    let mut seen_prefixes: HashSet<&str> = HashSet::new();
    for entry in &prefixes.entries {
        ensure!(
            !entry.prefix.is_empty(),
            L2Validation("empty prefix")
        );
        ensure!(
            !entry.uri.is_empty(),
            L2Validation("empty URI")
        );
        ensure!(
            entry.uri.ends_with('/') || entry.uri.ends_with('#'),
            L2Validation("prefix URI must end with / or #")
        );
        ensure!(
            seen_prefixes.insert(&entry.prefix),
            L2Validation("duplicate prefix")
        );
    }
    Ok(())
}

fn validate_entity(
    entity: &Entity,
    prefixes: &PrefixMap,
    source_l1s: &[L1Reference],
) -> Result<()> {
    // Label constraints
    ensure!(
        !entity.canonical_label.is_empty(),
        L2Validation("empty canonical_label")
    );
    ensure!(
        entity.canonical_label.len() <= MAX_CANONICAL_LABEL_LENGTH,
        L2Validation("canonical_label too long")
    );
    
    // Aliases
    ensure!(
        entity.aliases.len() <= MAX_ALIASES_PER_ENTITY,
        L2Validation("too many aliases")
    );
    
    // Validate entity type URIs
    for uri in &entity.entity_types {
        validate_uri(uri, prefixes)?;
    }
    
    // Validate canonical_uri if present
    if let Some(ref uri) = entity.canonical_uri {
        validate_uri(uri, prefixes)?;
    }
    
    // Validate same_as URIs if present
    if let Some(ref same_as) = entity.same_as {
        for uri in same_as {
            validate_uri(uri, prefixes)?;
        }
    }
    
    // Confidence in range
    ensure!(
        entity.confidence >= 0.0 && entity.confidence <= 1.0,
        L2Validation("confidence out of range")
    );
    
    // All mention refs point to valid L1s
    let valid_l1_hashes: HashSet<_> = source_l1s.iter().map(|r| &r.l1_hash).collect();
    for mention_ref in &entity.source_mentions {
        ensure!(
            valid_l1_hashes.contains(&mention_ref.l1_hash),
            L2Validation("mention ref points to unknown L1")
        );
    }
    
    // Description length
    if let Some(ref desc) = entity.description {
        ensure!(
            desc.len() <= MAX_ENTITY_DESCRIPTION_LENGTH,
            L2Validation("entity description too long")
        );
    }
    
    Ok(())
}

fn validate_relationship(
    rel: &Relationship,
    entity_ids: &HashSet<Hash>,
    prefixes: &PrefixMap,
    source_l1s: &[L1Reference],
) -> Result<()> {
    // Subject must exist
    ensure!(
        entity_ids.contains(&rel.subject),
        L2Validation("relationship subject not found")
    );
    
    // Predicate must be valid URI
    validate_uri(&rel.predicate, prefixes)?;
    
    // Object validation
    match &rel.object {
        RelationshipObject::EntityRef(hash) => {
            ensure!(
                entity_ids.contains(hash),
                L2Validation("relationship object entity not found")
            );
        }
        RelationshipObject::ExternalRef(uri) => {
            validate_uri(uri, prefixes)?;
        }
        RelationshipObject::Literal(lit) => {
            if let Some(ref dt) = lit.datatype {
                validate_uri(dt, prefixes)?;
            }
        }
    }
    
    // Confidence in range
    ensure!(
        rel.confidence >= 0.0 && rel.confidence <= 1.0,
        L2Validation("relationship confidence out of range")
    );
    
    // Temporal validity
    if let (Some(from), Some(to)) = (rel.valid_from, rel.valid_to) {
        ensure!(from <= to, L2Validation("valid_from > valid_to"));
    }
    
    // Mention refs
    let valid_l1_hashes: HashSet<_> = source_l1s.iter().map(|r| &r.l1_hash).collect();
    for mention_ref in &rel.source_mentions {
        ensure!(
            valid_l1_hashes.contains(&mention_ref.l1_hash),
            L2Validation("relationship mention ref points to unknown L1")
        );
    }
    
    Ok(())
}
}

§9.1b URI/CURIE Validation

#![allow(unused)]
fn main() {
/// Validate a URI or CURIE
fn validate_uri(uri: &Uri, prefixes: &PrefixMap) -> Result<()> {
    ensure!(!uri.is_empty(), L2Validation("empty URI"));
    
    if uri.contains("://") {
        // Full URI - basic syntax check
        ensure!(
            uri.starts_with("http://") || uri.starts_with("https://"),
            L2Validation("URI must be http(s)")
        );
    } else if let Some(colon_pos) = uri.find(':') {
        // CURIE - check prefix exists
        let prefix = &uri[..colon_pos];
        let has_prefix = prefixes.entries.iter().any(|e| e.prefix == prefix);
        ensure!(
            has_prefix,
            L2Validation(format!("unknown prefix: {}", prefix))
        );
    } else {
        // No scheme or prefix - invalid
        return Err(L2Validation("URI must be full URI or valid CURIE"));
    }
    
    Ok(())
}

/// Expand a CURIE to full URI
pub fn expand_curie(curie: &str, prefixes: &PrefixMap) -> Result<String> {
    if curie.contains("://") {
        // Already a full URI
        return Ok(curie.to_string());
    }
    
    if let Some(colon_pos) = curie.find(':') {
        let prefix = &curie[..colon_pos];
        let local = &curie[colon_pos + 1..];
        
        for entry in &prefixes.entries {
            if entry.prefix == prefix {
                return Ok(format!("{}{}", entry.uri, local));
            }
        }
        Err(L2Validation(format!("unknown prefix: {}", prefix)))
    } else {
        Err(L2Validation("not a valid CURIE"))
    }
}
}
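The expansion rule above can be exercised with a stand-alone sketch; the `PrefixMap` is reduced here to `(prefix, uri)` string pairs and the return type to `Option`, so this is an illustration of the lookup logic, not the crate's real API.

```rust
/// Expand a CURIE against a simplified prefix table (illustrative sketch).
fn expand_curie(curie: &str, prefixes: &[(&str, &str)]) -> Option<String> {
    if curie.contains("://") {
        return Some(curie.to_string()); // already a full URI
    }
    let colon = curie.find(':')?; // no colon => not a CURIE
    let (prefix, local) = (&curie[..colon], &curie[colon + 1..]);
    prefixes
        .iter()
        .find(|(p, _)| *p == prefix)
        .map(|(_, uri)| format!("{}{}", uri, local))
}

fn main() {
    let prefixes = [("foaf", "http://xmlns.com/foaf/0.1/")];
    assert_eq!(
        expand_curie("foaf:Person", &prefixes).as_deref(),
        Some("http://xmlns.com/foaf/0.1/Person")
    );
    // Unknown prefixes and bare words are rejected.
    assert_eq!(expand_curie("unknown:Thing", &prefixes), None);
    assert_eq!(expand_curie("plainword", &prefixes), None);
}
```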

§9.2 Version Validation

#![allow(unused)]
fn main() {
fn validate_version(manifest: &Manifest, previous: Option<&Manifest>) -> Result<()> {
    let v = &manifest.version;
    
    if v.number == 1 {
        // First version
        ensure!(v.previous.is_none(), VersionValidation("v1 must have no previous"));
        ensure!(v.root == manifest.hash, VersionValidation("v1 root must equal hash"));
    } else {
        // Subsequent version
        ensure!(v.previous.is_some(), VersionValidation("v2+ must have previous"));
        
        if let Some(prev) = previous {
            ensure!(
                v.previous.as_ref() == Some(&prev.hash),
                VersionValidation("previous hash mismatch")
            );
            ensure!(
                v.root == prev.version.root,
                VersionValidation("root must equal previous root")
            );
            ensure!(
                v.number == prev.version.number + 1,
                VersionValidation("version number must increment by 1")
            );
            ensure!(
                v.timestamp > prev.version.timestamp,
                VersionValidation("timestamp must be after previous")
            );
        }
    }
    
    Ok(())
}
}
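The v2+ chain rules can be summarized in a small stand-alone sketch, with `u64` standing in for hashes and a bare struct standing in for the real `Version` type (both are simplifications for illustration): a successor must link to the previous manifest's hash, keep the same root, increment the number by exactly 1, and move the timestamp forward.

```rust
// Simplified stand-in for the protocol's Version (u64 hashes, illustrative).
struct Version {
    number: u64,
    previous: Option<u64>,
    root: u64,
    timestamp: u64,
}

/// The four §9.2 successor checks, collapsed into one predicate.
fn valid_successor(prev: &Version, prev_hash: u64, next: &Version) -> bool {
    next.previous == Some(prev_hash)
        && next.root == prev.root
        && next.number == prev.number + 1
        && next.timestamp > prev.timestamp
}

fn main() {
    let v1 = Version { number: 1, previous: None, root: 100, timestamp: 10 };
    let v2 = Version { number: 2, previous: Some(100), root: 100, timestamp: 20 };
    assert!(valid_successor(&v1, 100, &v2));

    // Skipping a version number is rejected.
    let v4 = Version { number: 4, previous: Some(100), root: 100, timestamp: 30 };
    assert!(!valid_successor(&v1, 100, &v4));
}
```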

§9.3 Provenance Validation

#![allow(unused)]
fn main() {
fn validate_provenance(manifest: &Manifest, sources: &[Manifest]) -> Result<()> {
    let prov = &manifest.provenance;
    
    match manifest.content_type {
        ContentType::L0 => {
            // L0: self-referential provenance
            ensure!(
                prov.root_L0L1.len() == 1,
                ProvenanceValidation("L0 must have exactly one root (self)")
            );
            ensure!(
                prov.root_L0L1[0].hash == manifest.hash,
                ProvenanceValidation("L0 root must be self")
            );
            ensure!(
                prov.derived_from.is_empty(),
                ProvenanceValidation("L0 must not derive from anything")
            );
            ensure!(
                prov.depth == 0,
                ProvenanceValidation("L0 depth must be 0")
            );
        }
        ContentType::L1 => {
            // L1: extracted from exactly one L0
            ensure!(
                !prov.root_L0L1.is_empty(),
                ProvenanceValidation("L1 must have at least one root")
            );
            ensure!(
                prov.derived_from.len() == 1,
                ProvenanceValidation("L1 must derive from exactly one L0")
            );
            ensure!(
                prov.depth == 1,
                ProvenanceValidation("L1 depth must be 1")
            );
            // All roots must be L0
            for root in &prov.root_L0L1 {
                if let Some(source) = sources.iter().find(|s| s.hash == root.hash) {
                    ensure!(
                        source.content_type == ContentType::L0,
                        ProvenanceValidation("L1 roots must all be L0")
                    );
                }
            }
        }
        ContentType::L2 => {
            // L2: built from L1s (and optionally other L2s)
            ensure!(
                !prov.root_L0L1.is_empty(),
                ProvenanceValidation("L2 must have at least one root")
            );
            ensure!(
                !prov.derived_from.is_empty(),
                ProvenanceValidation("L2 must derive from at least one source")
            );
            ensure!(
                prov.depth >= 2,
                ProvenanceValidation("L2 depth must be >= 2")
            );
            
            // All roots must be L0 or L1 (never L2 or L3)
            for root in &prov.root_L0L1 {
                if let Some(source) = sources.iter().find(|s| s.hash == root.hash) {
                    ensure!(
                        matches!(source.content_type, ContentType::L0 | ContentType::L1),
                        ProvenanceValidation("L2 roots must be L0 or L1 only")
                    );
                }
            }
            
            // derived_from must be L1 or L2
            for df in &prov.derived_from {
                if let Some(source) = sources.iter().find(|s| s.hash == *df) {
                    ensure!(
                        matches!(source.content_type, ContentType::L1 | ContentType::L2),
                        ProvenanceValidation("L2 must derive from L1 or L2")
                    );
                }
            }
            
            // Verify root_L0L1 computation
            let computed_roots = compute_root_L0L1(sources);
            ensure!(
                roots_match(&prov.root_L0L1, &computed_roots),
                ProvenanceValidation("root_L0L1 computation mismatch")
            );
            
            // Verify depth
            let expected_depth = sources.iter()
                .map(|s| s.provenance.depth)
                .max()
                .unwrap_or(0) + 1;
            ensure!(
                prov.depth == expected_depth,
                ProvenanceValidation("depth mismatch")
            );
        }
        ContentType::L3 => {
            // L3: must derive from sources (L0, L1, L2, or other L3)
            ensure!(
                !prov.root_L0L1.is_empty(),
                ProvenanceValidation("L3 must have at least one root")
            );
            ensure!(
                !prov.derived_from.is_empty(),
                ProvenanceValidation("L3 must derive from at least one source")
            );
            
            // All roots must be L0 or L1 (never L2 or L3)
            for root in &prov.root_L0L1 {
                if let Some(source) = sources.iter().find(|s| s.hash == root.hash) {
                    ensure!(
                        matches!(source.content_type, ContentType::L0 | ContentType::L1),
                        ProvenanceValidation("L3 roots must be L0 or L1 only")
                    );
                }
            }
            
            // All derived_from must exist in sources
            let source_hashes: HashSet<_> = sources.iter().map(|s| &s.hash).collect();
            for df in &prov.derived_from {
                ensure!(
                    source_hashes.contains(df),
                    ProvenanceValidation("derived_from references unknown source")
                );
            }
            
            // Verify root_L0L1 computation
            let computed_roots = compute_root_L0L1(sources);
            ensure!(
                roots_match(&prov.root_L0L1, &computed_roots),
                ProvenanceValidation("root_L0L1 computation mismatch")
            );
            
            // Verify depth
            let expected_depth = sources.iter()
                .map(|s| s.provenance.depth)
                .max()
                .unwrap_or(0) + 1;
            ensure!(
                prov.depth == expected_depth,
                ProvenanceValidation("depth mismatch")
            );
        }
    }
    
    // Common checks for all types
    // No self-reference
    ensure!(
        !prov.derived_from.contains(&manifest.hash),
        ProvenanceValidation("cannot derive from self")
    );
    ensure!(
        !prov.root_L0L1.iter().any(|e| e.hash == manifest.hash),
        ProvenanceValidation("cannot be own root")
    );
    
    // No cycles (basic check - full cycle detection is expensive)
    ensure!(
        prov.depth <= MAX_PROVENANCE_DEPTH,
        ProvenanceValidation("provenance too deep")
    );
    
    Ok(())
}
}
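The depth rule used for both L2 and L3 above is the same computation: a derived item's depth is one more than the deepest of its sources. A stand-alone sketch (with `u64` depths, purely illustrative):

```rust
/// Depth of a derived item: max over source depths, plus one.
fn expected_depth(source_depths: &[u64]) -> u64 {
    source_depths.iter().copied().max().unwrap_or(0) + 1
}

fn main() {
    // L2 built from an L0 (depth 0) and an L1 (depth 1) sits at depth 2.
    assert_eq!(expected_depth(&[0, 1]), 2);
    // L3 over an L2 chain goes one deeper than the deepest source.
    assert_eq!(expected_depth(&[2, 1, 0]), 3);
    // The empty case falls back to depth 1, though an item with no sources
    // would already have been rejected by the derived_from checks above.
    assert_eq!(expected_depth(&[]), 1);
}
```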

§9.4 Payment Validation

#![allow(unused)]
fn main() {
fn validate_payment(payment: &Payment, channel: &Channel, manifest: &Manifest) -> Result<()> {
    // 1. Amount sufficient
    ensure!(
        payment.amount >= manifest.economics.price,
        PaymentValidation("insufficient payment")
    );
    
    // 2. Correct recipient
    ensure!(
        payment.recipient == manifest_owner(manifest),
        PaymentValidation("wrong recipient")
    );
    
    // 3. Query hash matches
    ensure!(
        payment.query_hash == manifest.hash,
        PaymentValidation("query hash mismatch")
    );
    
    // 4. Channel is open
    ensure!(
        channel.state == ChannelState::Open,
        PaymentValidation("channel not open")
    );
    
    // 5. Sufficient balance
    ensure!(
        channel.their_balance >= payment.amount,
        PaymentValidation("insufficient channel balance")
    );
    
    // 6. Nonce is valid (prevents replay)
    ensure!(
        payment_nonce(payment) > channel.nonce,
        PaymentValidation("invalid nonce (replay?)")
    );
    
    // 7. Signature valid
    let payer_pubkey = lookup_public_key(&payment_payer(payment, channel))?;
    ensure!(
        verify_payment_signature(&payer_pubkey, payment),
        PaymentValidation("invalid signature")
    );
    
    // 8. Provenance matches manifest
    ensure!(
        provenance_matches(&payment.provenance, &manifest.provenance.root_L0L1),
        PaymentValidation("provenance mismatch")
    );
    
    Ok(())
}
}

§9.5 Message Validation

#![allow(unused)]
fn main() {
fn validate_message(msg: &Message) -> Result<()> {
    // 1. Protocol version
    ensure!(
        msg.version == PROTOCOL_VERSION,
        MessageValidation("unsupported protocol version")
    );
    
    // 2. Valid message type
    ensure!(
        is_valid_message_type(msg.message_type),
        MessageValidation("invalid message type")
    );
    
    // 3. Timestamp within skew
    let now = current_timestamp();
    let skew = if msg.timestamp > now {
        msg.timestamp - now
    } else {
        now - msg.timestamp
    };
    ensure!(
        skew <= MAX_CLOCK_SKEW_MS,
        MessageValidation("timestamp outside acceptable range")
    );
    
    // 4. Valid sender
    ensure!(
        is_valid_peer_id(&msg.sender),
        MessageValidation("invalid sender peer ID")
    );
    
    // 5. Signature valid
    let pubkey = lookup_public_key(&msg.sender)?;
    let msg_hash = message_hash(msg);
    ensure!(
        verify(&pubkey, &msg_hash.0, &msg.signature),
        MessageValidation("invalid signature")
    );
    
    // 6. Payload decodes
    ensure!(
        payload_decodes_for_type(&msg.payload, msg.message_type),
        MessageValidation("payload decode failed")
    );
    
    Ok(())
}
}

§9.6 Access Validation

#![allow(unused)]
fn main() {
fn validate_access(requester: &PeerId, manifest: &Manifest) -> Result<()> {
    match manifest.visibility {
        Visibility::Private => {
            // Private: never accessible externally
            return Err(AccessValidation("content is private"));
        }
        Visibility::Unlisted => {
            // Check allowlist if set
            if let Some(ref allowlist) = manifest.access.allowlist {
                ensure!(
                    allowlist.contains(requester),
                    AccessValidation("not in allowlist")
                );
            }
            // Check denylist if set
            if let Some(ref denylist) = manifest.access.denylist {
                ensure!(
                    !denylist.contains(requester),
                    AccessValidation("in denylist")
                );
            }
        }
        Visibility::Shared => {
            // Only check denylist (allowlist ignored for Shared)
            if let Some(ref denylist) = manifest.access.denylist {
                ensure!(
                    !denylist.contains(requester),
                    AccessValidation("in denylist")
                );
            }
        }
    }
    
    // Check bond requirement
    if manifest.access.require_bond {
        ensure!(
            has_bond(requester, manifest.access.bond_amount.unwrap_or(0)),
            AccessValidation("bond required")
        );
    }
    
    Ok(())
}
}

Error Types

#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum ValidationError {
    #[error("Content validation failed: {0}")]
    ContentValidation(String),
    
    #[error("Version validation failed: {0}")]
    VersionValidation(String),
    
    #[error("Provenance validation failed: {0}")]
    ProvenanceValidation(String),
    
    #[error("Payment validation failed: {0}")]
    PaymentValidation(String),
    
    #[error("Message validation failed: {0}")]
    MessageValidation(String),
    
    #[error("Access validation failed: {0}")]
    AccessValidation(String),
    
    #[error("L2 validation failed: {0}")]
    L2Validation(String),
    
    #[error("Publish validation failed: {0}")]
    PublishValidation(String),
}
}

§9.7 Publish Validation

#![allow(unused)]
fn main() {
/// Validate that content can be published
fn validate_publish(manifest: &Manifest, visibility: Visibility) -> Result<()> {
    // L2 can NEVER be published
    if manifest.content_type == ContentType::L2 {
        return Err(PublishValidation("L2 content cannot be published"));
    }
    
    // Cannot publish to a more restricted visibility
    // (e.g., can't go from Shared back to Unlisted via PUBLISH)
    // This is handled by UNPUBLISH operation instead
    
    Ok(())
}
}

Test Cases

For each validation function, test:

  1. Valid input passes
  2. Each invalid condition is caught
  3. Error message is descriptive
  4. Edge cases (empty arrays, zero values, max values)

L2-specific tests:

  1. L2 with visibility != Private fails
  2. L2 with price != 0 fails
  3. L2 with empty entities fails
  4. L2 with duplicate entity IDs fails
  5. L2 with invalid entity reference in relationship fails
  6. L2 with invalid URI/CURIE fails
  7. L2 with unknown prefix fails
  8. L2 PUBLISH attempt fails
  9. CURIE expansion works correctly
  10. Confidence values outside [0,1] fail
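Test case 10 reduces to a range check; a minimal sketch of that check as a free function (not the real validator API, which reports a `ValidationError` instead of a bool):

```rust
/// Confidence values must lie in the closed interval [0, 1].
fn confidence_in_range(c: f64) -> bool {
    (0.0..=1.0).contains(&c)
}

fn main() {
    // Boundary values are accepted.
    assert!(confidence_in_range(0.0));
    assert!(confidence_in_range(1.0));
    // Out-of-range values fail, per test case 10.
    assert!(!confidence_in_range(1.5));
    assert!(!confidence_in_range(-0.1));
}
```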

Module: nodalync-econ

Source: Protocol Specification §10

Overview

Revenue distribution calculations. Pure functions, no I/O.

Key Design Decision: The settlement contract distributes payments to ALL root contributors directly. When Bob queries Alice’s L3 (which derives from Carol’s L0), the settlement contract pays:

  • Alice: 5% synthesis fee + her root shares
  • Carol: her root shares
  • Any other root contributors: their shares

This ensures trustless distribution — Alice cannot withhold payment from Carol.

Dependencies

  • nodalync-types — ProvenanceEntry, Distribution, Amount

§10.1 Revenue Distribution

Constants

#![allow(unused)]
fn main() {
/// Synthesis fee: 5%
pub const SYNTHESIS_FEE_NUMERATOR: u64 = 5;
pub const SYNTHESIS_FEE_DENOMINATOR: u64 = 100;

/// Root pool: 95%
pub const ROOT_POOL_NUMERATOR: u64 = 95;
pub const ROOT_POOL_DENOMINATOR: u64 = 100;

/// Settlement threshold: 100 HBAR (in tinybars)
pub const SETTLEMENT_BATCH_THRESHOLD: Amount = 10_000_000_000;

/// Settlement interval: 1 hour
pub const SETTLEMENT_BATCH_INTERVAL_MS: u64 = 3_600_000;
}

Distribution Function

#![allow(unused)]
fn main() {
/// Distribute payment revenue to owner and root contributors.
/// 
/// # Arguments
/// * `payment_amount` - Total payment received
/// * `owner` - Content owner (receives synthesis fee)
/// * `provenance` - All root L0+L1 sources with weights
/// 
/// # Returns
/// Vec of distributions to each recipient
pub fn distribute_revenue(
    payment_amount: Amount,
    owner: &PeerId,
    provenance: &[ProvenanceEntry],
) -> Vec<Distribution> {
    let mut distributions = Vec::new();
    
    // Calculate shares
    let owner_share = payment_amount * SYNTHESIS_FEE_NUMERATOR / SYNTHESIS_FEE_DENOMINATOR;
    let root_pool = payment_amount * ROOT_POOL_NUMERATOR / ROOT_POOL_DENOMINATOR;
    
    // Total weight across all roots
    let total_weight: u64 = provenance.iter().map(|e| e.weight as u64).sum();
    
    if total_weight == 0 {
        // Edge case: no roots (shouldn't happen for valid L3)
        distributions.push(Distribution {
            recipient: owner.clone(),
            amount: payment_amount,
            source_hash: Hash::default(), // Owner's own content
        });
        return distributions;
    }
    
    // Per-weight share (integer division, remainder goes to owner)
    let per_weight = root_pool / total_weight;
    let mut distributed: Amount = 0;
    
    // Group by owner to aggregate payments
    let mut owner_amounts: HashMap<PeerId, Amount> = HashMap::new();
    
    for entry in provenance {
        let amount = per_weight * (entry.weight as u64);
        distributed += amount;
        
        *owner_amounts.entry(entry.owner.clone()).or_default() += amount;
    }
    
    // Add synthesis fee to owner (may already have root shares).
    // Dust from BOTH integer divisions (the 5% fee split and the per-weight
    // split) goes to the owner, so distributions always sum to payment_amount.
    let remainder = payment_amount - owner_share - distributed;
    *owner_amounts.entry(owner.clone()).or_default() += owner_share + remainder;
    
    // Convert to distributions
    for (recipient, amount) in owner_amounts {
        if amount > 0 {
            distributions.push(Distribution {
                recipient,
                amount,
                source_hash: Hash::default(), // Aggregated
            });
        }
    }
    
    distributions
}
}

Example (from spec)

Scenario:
    Bob's L3 derives from:
        - Alice's L0 (weight: 2)
        - Carol's L0 (weight: 1)
        - Bob's L0 (weight: 2)
    Total weight: 5
    
    Query payment: 100 HBAR

Distribution:
    owner_share = 100 * 5/100 = 5 HBAR (Bob's synthesis fee)
    root_pool = 100 * 95/100 = 95 HBAR
    per_weight = 95 / 5 = 19 HBAR

    Alice: 2 * 19 = 38 HBAR
    Carol: 1 * 19 = 19 HBAR
    Bob (roots): 2 * 19 = 38 HBAR
    Bob (synthesis): 5 HBAR
    Bob total: 43 HBAR

Final:
    Alice: 38 HBAR (38%)
    Carol: 19 HBAR (19%)
    Bob: 43 HBAR (43%)
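The worked example can be checked with a stand-alone sketch of the same arithmetic. The types here (`u64` amounts, `&str` peer ids) are simplifications rather than the crate's real API, and rounding dust from the integer divisions is returned to the owner so the shares always sum to the payment.

```rust
use std::collections::HashMap;

/// Simplified §10.1 distribution: 5% synthesis fee to the owner,
/// the rest split across roots in proportion to their weights.
fn distribute(payment: u64, owner: &str, roots: &[(&str, u64)]) -> HashMap<String, u64> {
    let owner_share = payment * 5 / 100;
    let root_pool = payment * 95 / 100;
    // Assumes at least one root with nonzero weight (valid L3 provenance).
    let total_weight: u64 = roots.iter().map(|(_, w)| *w).sum();
    let per_weight = root_pool / total_weight;

    let mut out: HashMap<String, u64> = HashMap::new();
    let mut distributed = 0u64;
    for (peer, weight) in roots {
        let amount = per_weight * weight;
        distributed += amount;
        *out.entry(peer.to_string()).or_default() += amount;
    }
    // Synthesis fee plus rounding dust go to the owner.
    *out.entry(owner.to_string()).or_default() += payment - owner_share - distributed + owner_share;
    out
}

fn main() {
    // The scenario above: 100 HBAR, weights Alice 2, Carol 1, Bob 2.
    let d = distribute(100, "bob", &[("alice", 2), ("carol", 1), ("bob", 2)]);
    assert_eq!(d["alice"], 38);
    assert_eq!(d["carol"], 19);
    assert_eq!(d["bob"], 43); // 38 root shares + 5 synthesis fee
}
```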

§10.3 Price Constraints

#![allow(unused)]
fn main() {
pub const MIN_PRICE: Amount = 1;
pub const MAX_PRICE: Amount = 10_000_000_000_000_000; // 10^16

pub fn validate_price(price: Amount) -> Result<(), EconError> {
    if price < MIN_PRICE {
        return Err(EconError::PriceTooLow);
    }
    if price > MAX_PRICE {
        return Err(EconError::PriceTooHigh);
    }
    Ok(())
}
}

§10.4 Settlement Batching

#![allow(unused)]
fn main() {
/// Aggregate payments into settlement batch.
/// 
/// Combines all pending payments, aggregating by recipient.
pub fn create_settlement_batch(
    payments: &[Payment],
) -> SettlementBatch {
    let mut by_recipient: HashMap<PeerId, (Amount, Vec<Hash>, Vec<Hash>)> = HashMap::new();
    
    for payment in payments {
        // Distribute this payment
        let distributions = distribute_revenue(
            payment.amount,
            &payment.recipient,
            &payment.provenance,
        );
        
        for dist in distributions {
            let entry = by_recipient.entry(dist.recipient.clone()).or_default();
            entry.0 += dist.amount;
            if !entry.1.contains(&dist.source_hash) {
                entry.1.push(dist.source_hash);
            }
            if !entry.2.contains(&payment.id) {
                entry.2.push(payment.id.clone());
            }
        }
    }
    
    let entries: Vec<SettlementEntry> = by_recipient
        .into_iter()
        .map(|(recipient, (amount, provenance_hashes, payment_ids))| {
            SettlementEntry {
                recipient,
                amount,
                provenance_hashes,
                payment_ids,
            }
        })
        .collect();
    
    let batch_id = compute_batch_id(&entries);
    let merkle_root = compute_merkle_root(&entries);
    
    SettlementBatch {
        batch_id,
        entries,
        merkle_root,
    }
}

/// Check if settlement should be triggered.
pub fn should_settle(
    pending_total: Amount,
    last_settlement: Timestamp,
    now: Timestamp,
) -> bool {
    // Threshold reached
    if pending_total >= SETTLEMENT_BATCH_THRESHOLD {
        return true;
    }
    
    // Interval elapsed
    if now - last_settlement >= SETTLEMENT_BATCH_INTERVAL_MS {
        return true;
    }
    
    false
}
}
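The trigger logic above is a pure two-condition predicate, so it can be restated as a stand-alone sketch with `u64` stand-ins for `Amount` and `Timestamp` (the constant values mirror §10.1):

```rust
const THRESHOLD: u64 = 10_000_000_000; // 100 HBAR in tinybars
const INTERVAL_MS: u64 = 3_600_000; // 1 hour

/// Settle when either the pending total reaches the threshold
/// or the batching interval has elapsed since the last settlement.
fn should_settle(pending_total: u64, last_settlement: u64, now: u64) -> bool {
    pending_total >= THRESHOLD || now - last_settlement >= INTERVAL_MS
}

fn main() {
    assert!(should_settle(THRESHOLD, 0, 0)); // threshold reached
    assert!(should_settle(0, 0, INTERVAL_MS)); // interval elapsed
    assert!(!should_settle(THRESHOLD - 1, 0, INTERVAL_MS - 1)); // neither yet
}
```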

Merkle Root Computation

#![allow(unused)]
fn main() {
/// Compute merkle root of settlement entries.
/// Allows any recipient to verify their inclusion.
pub fn compute_merkle_root(entries: &[SettlementEntry]) -> Hash {
    if entries.is_empty() {
        return Hash::default();
    }
    
    // Leaf hashes
    let mut hashes: Vec<Hash> = entries
        .iter()
        .map(|e| hash_settlement_entry(e))
        .collect();
    
    // Build tree
    while hashes.len() > 1 {
        let mut next_level = Vec::new();
        for chunk in hashes.chunks(2) {
            if chunk.len() == 2 {
                next_level.push(hash_pair(&chunk[0], &chunk[1]));
            } else {
                next_level.push(chunk[0].clone());
            }
        }
        hashes = next_level;
    }
    
    hashes.pop().unwrap()
}

fn hash_settlement_entry(entry: &SettlementEntry) -> Hash {
    let mut hasher = Sha256::new();
    hasher.update(&entry.recipient.0);
    hasher.update(&entry.amount.to_be_bytes());
    // ... hash other fields
    Hash(hasher.finalize().into())
}

fn hash_pair(a: &Hash, b: &Hash) -> Hash {
    let mut hasher = Sha256::new();
    hasher.update(&a.0);
    hasher.update(&b.0);
    Hash(hasher.finalize().into())
}
}
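The tree-building loop's one subtlety is odd levels: a leftover leaf is promoted unchanged to the next level. A dependency-free toy illustration, using `std`'s `DefaultHasher` purely as a stand-in for SHA-256 (not the protocol's actual hash):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Combine two nodes into a parent (toy hash, not SHA-256).
fn pair(a: u64, b: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (a, b).hash(&mut h);
    h.finish()
}

/// Same structure as §10.4: pair up each level, promote odd leftovers.
fn merkle_root(leaves: &[u64]) -> Option<u64> {
    let mut level = leaves.to_vec();
    if level.is_empty() {
        return None;
    }
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|c| if c.len() == 2 { pair(c[0], c[1]) } else { c[0] })
            .collect();
    }
    Some(level[0])
}

fn main() {
    // With three leaves, the third is promoted and paired at the next level.
    let root = merkle_root(&[1, 2, 3]).unwrap();
    assert_eq!(root, pair(pair(1, 2), 3));
    assert_eq!(merkle_root(&[]), None);
}
```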

Public API

#![allow(unused)]
fn main() {
// Distribution
pub fn distribute_revenue(
    payment_amount: Amount,
    owner: &PeerId,
    provenance: &[ProvenanceEntry],
) -> Vec<Distribution>;

// Batching
pub fn create_settlement_batch(payments: &[Payment]) -> SettlementBatch;
pub fn should_settle(pending_total: Amount, last_settlement: Timestamp, now: Timestamp) -> bool;

// Validation
pub fn validate_price(price: Amount) -> Result<(), EconError>;

// Merkle proofs
pub fn compute_merkle_root(entries: &[SettlementEntry]) -> Hash;
pub fn create_merkle_proof(entries: &[SettlementEntry], index: usize) -> MerkleProof;
pub fn verify_merkle_proof(root: &Hash, entry: &SettlementEntry, proof: &MerkleProof) -> bool;
}

Test Cases

  1. Basic distribution: 100 tokens, single root → 95 to root, 5 to owner
  2. Multiple roots: Verify equal per-weight distribution
  3. Owner is root: Owner gets synthesis fee + root share
  4. Rounding: Integer division remainder goes to owner
  5. Zero payment: Handle gracefully
  6. Empty provenance: All to owner
  7. Batch aggregation: Multiple payments to same recipient aggregate
  8. Merkle proof: Create proof, verify proof
  9. Settlement trigger: Threshold triggers, interval triggers

Module: nodalync-ops

Source: Protocol Specification §7

Overview

Core protocol operations. Combines storage, validation, and economics to implement the protocol’s business logic.

Key Design Decisions:

  1. L1 Extraction: Rule-based NLP for MVP. Future: plugin architecture for AI integration.
  2. Channel Auto-Open: When querying a peer with no channel, auto-open with configurable minimum deposit. Return PAYMENT_REQUIRED if insufficient funds.
  3. Settlement Queue: This module WRITES to the settlement queue (in nodalync-store). The nodalync-settle module READS from it.
  4. Payment Distribution: All distributions go to the settlement queue. The settlement contract pays ALL recipients (owner + all root contributors).

Dependencies

  • nodalync-types — All data structures
  • nodalync-crypto — Hashing, signing
  • nodalync-store — Persistence (including settlement queue)
  • nodalync-valid — Validation
  • nodalync-econ — Revenue distribution
  • nodalync-wire — Message encoding

Operations Trait

#![allow(unused)]
fn main() {
#[async_trait]
pub trait Operations {
    // === Content Operations ===
    
    /// Create new content locally (not yet published)
    async fn create(
        &mut self,
        content: &[u8],
        content_type: ContentType,
        metadata: Metadata,
    ) -> Result<Hash>;
    
    /// Extract L1 mentions from L0 content (rule-based for MVP)
    async fn extract_l1(&mut self, hash: &Hash) -> Result<L1Summary>;
    
    /// Build L2 entity graph from L1 sources (always private)
    async fn build_l2(
        &mut self,
        source_l1s: &[Hash],
        config: Option<L2BuildConfig>,
    ) -> Result<Hash>;
    
    /// Merge multiple of your own L2 graphs into one
    async fn merge_l2(
        &mut self,
        source_l2s: &[Hash],
        config: Option<L2MergeConfig>,
    ) -> Result<Hash>;
    
    /// Publish content to the network (NOT allowed for L2)
    async fn publish(
        &mut self,
        hash: &Hash,
        visibility: Visibility,
        price: Amount,
        access: Option<AccessControl>,
    ) -> Result<()>;
    
    /// Unpublish content (set to Private)
    async fn unpublish(&mut self, hash: &Hash) -> Result<()>;
    
    /// Create new version of existing content
    async fn update(&mut self, old_hash: &Hash, new_content: &[u8]) -> Result<Hash>;
    
    /// Create L3 from multiple sources (can include L0, L1, L2, L3)
    async fn derive(
        &mut self,
        sources: &[Hash],
        insight: &[u8],
        metadata: Metadata,
    ) -> Result<Hash>;
    
    /// Reference external L3 as L0 for derivations
    async fn reference_l3_as_l0(&mut self, l3_hash: &Hash) -> Result<()>;
    
    // === Query Operations ===
    
    /// Get L1 preview (free)
    async fn preview(&self, hash: &Hash) -> Result<(Manifest, L1Summary)>;
    
    /// Query content (paid) - auto-opens channel if needed
    async fn query(&mut self, hash: &Hash, payment: Payment) -> Result<QueryResponse>;
    
    /// Get version history for content
    async fn get_versions(&self, version_root: &Hash) -> Result<Vec<VersionInfo>>;
    
    // === Visibility Operations ===
    
    /// Change content visibility (NOT allowed for L2)
    async fn set_visibility(&mut self, hash: &Hash, visibility: Visibility) -> Result<()>;
    
    /// Update access control
    async fn set_access(&mut self, hash: &Hash, access: AccessControl) -> Result<()>;
    
    // === Channel Operations ===
    
    /// Open payment channel with peer
    async fn open_channel(&mut self, peer: &PeerId, deposit: Amount) -> Result<Hash>;
    
    /// Accept incoming channel open request
    async fn accept_channel(&mut self, channel_id: &Hash, deposit: Amount) -> Result<()>;
    
    /// Update channel state (after payment)
    async fn update_channel(&mut self, channel_id: &Hash, payment: &Payment) -> Result<()>;
    
    /// Close channel cooperatively
    async fn close_channel(&mut self, channel_id: &Hash) -> Result<()>;
    
    /// Dispute channel with on-chain evidence
    async fn dispute_channel(&mut self, channel_id: &Hash, state: &ChannelUpdatePayload) -> Result<()>;
    
    // === Settlement Operations ===
    
    /// Trigger settlement batch (called by nodalync-settle or manually)
    async fn trigger_settlement(&mut self) -> Result<Option<SettlementBatch>>;
}
}

§7.1.1 CREATE

#![allow(unused)]
fn main() {
async fn create(
    &mut self,
    content: &[u8],
    content_type: ContentType,
    metadata: Metadata,
) -> Result<Hash> {
    // Reject L2 and L3 through CREATE - they have dedicated operations
    match content_type {
        ContentType::L2 => {
            return Err(Error::InvalidOperation(
                "Use build_l2() for L2 content".into()
            ));
        }
        ContentType::L3 => {
            return Err(Error::InvalidOperation(
                "Use derive() for L3 content".into()
            ));
        }
        ContentType::L0 | ContentType::L1 => {}
    }
    
    // 1. Compute hash
    let hash = content_hash(content);
    
    // 2. Create version (v1)
    let version = Version {
        number: 1,
        previous: None,
        root: hash.clone(),
        timestamp: current_timestamp(),
    };
    
    // 3. Compute provenance (L0/L1: self-referential)
    let provenance = Provenance {
        root_L0L1: vec![ProvenanceEntry {
            hash: hash.clone(),
            owner: self.identity.peer_id(),
            visibility: Visibility::Private,
            weight: 1,
        }],
        derived_from: vec![],
        depth: if content_type == ContentType::L0 { 0 } else { 1 },
    };
    
    // 4. Create manifest (includes owner)
    let manifest = Manifest {
        hash: hash.clone(),
        content_type,
        owner: self.identity.peer_id(),
        version,
        visibility: Visibility::Private,
        access: AccessControl::default(),
        metadata,
        economics: Economics {
            price: 0,
            currency: Currency::HBAR,
            total_queries: 0,
            total_revenue: 0,
        },
        provenance,
        created_at: current_timestamp(),
        updated_at: current_timestamp(),
    };
    
    // 5. Validate
    self.validator.validate_content(content, &manifest)?;
    
    // 6. Store
    self.content_store.store_verified(&hash, content)?;
    self.manifest_store.store(&manifest)?;
    
    Ok(hash)
}
}

§7.1.2 EXTRACT_L1 (Rule-Based MVP)

L1 extraction identifies atomic facts in L0 content. For the MVP, we use rule-based NLP; future versions will support a plugin architecture for AI-powered extraction.

#![allow(unused)]
fn main() {
/// L1 Extraction trait for pluggable implementations
pub trait L1Extractor {
    fn extract(&self, content: &[u8], mime_type: Option<&str>) -> Result<Vec<Mention>>;
}

/// Rule-based extractor for MVP
pub struct RuleBasedExtractor;

impl L1Extractor for RuleBasedExtractor {
    fn extract(&self, content: &[u8], mime_type: Option<&str>) -> Result<Vec<Mention>> {
        let text = std::str::from_utf8(content)?;
        let mut mentions = Vec::new();
        
        // Split into sentences (basic approach)
        let sentences: Vec<&str> = text
            .split(|c| c == '.' || c == '!' || c == '?')
            .filter(|s| !s.trim().is_empty())
            .collect();
        
        for (idx, sentence) in sentences.iter().enumerate() {
            let trimmed = sentence.trim();
            if trimmed.len() < 10 || trimmed.len() > 1000 {
                continue; // Skip too short or too long
            }
            
            // Basic classification heuristics
            let classification = classify_sentence(trimmed);
            
            // Extract entities (basic: capitalized words)
            let entities = extract_entities(trimmed);
            
            let mention = Mention {
                id: content_hash(format!("{}:{}", idx, trimmed).as_bytes()),
                content: trimmed.to_string(),
                source_location: SourceLocation {
                    location_type: LocationType::Paragraph,
                    reference: format!("sentence_{}", idx),
                    quote: Some(trimmed.chars().take(500).collect()),
                },
                classification,
                confidence: Confidence::Explicit,
                entities,
            };
            
            mentions.push(mention);
        }
        
        Ok(mentions)
    }
}

fn classify_sentence(sentence: &str) -> Classification {
    let lower = sentence.to_lowercase();
    
    if lower.contains('%') || lower.contains("percent") || 
       lower.chars().any(|c| c.is_numeric()) {
        Classification::Statistic
    } else if lower.starts_with("according to") || lower.contains("claims") ||
              lower.contains("argues") || lower.contains("suggests") {
        Classification::Claim
    } else if lower.contains("defined as") || lower.contains("refers to") ||
              lower.contains("is a") || lower.contains("are a") {
        Classification::Definition
    } else if lower.contains("method") || lower.contains("approach") ||
              lower.contains("technique") || lower.contains("process") {
        Classification::Method
    } else if lower.contains("found") || lower.contains("result") ||
              lower.contains("showed") || lower.contains("demonstrated") {
        Classification::Result
    } else {
        Classification::Observation
    }
}

fn extract_entities(sentence: &str) -> Vec<String> {
    // Basic entity extraction: individual capitalized words (punctuation trimmed)
    sentence
        .split_whitespace()
        .filter(|w| w.chars().next().map(|c| c.is_uppercase()).unwrap_or(false))
        .filter(|w| w.len() > 1)
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_string())
        .filter(|w| !w.is_empty())
        .collect()
}

async fn extract_l1(&mut self, hash: &Hash) -> Result<L1Summary> {
    // 1. Load content
    let content = self.content_store.load(hash)?
        .ok_or(Error::NotFound)?;
    let manifest = self.manifest_store.load(hash)?
        .ok_or(Error::NotFound)?;
    
    // 2. Extract mentions using configured extractor
    let mentions = self.l1_extractor.extract(&content, manifest.metadata.mime_type.as_deref())?;
    
    // 3. Generate summary
    let primary_topics: Vec<String> = mentions.iter()
        .flat_map(|m| m.entities.iter().cloned())
        .take(5)
        .collect();
    
    let summary = if !mentions.is_empty() {
        format!(
            "Contains {} mentions covering topics: {}",
            mentions.len(),
            primary_topics.join(", ")
        )
    } else {
        "No structured mentions extracted.".to_string()
    };
    
    // 4. Create L1Summary
    let l1_summary = L1Summary {
        l0_hash: hash.clone(),
        mention_count: mentions.len() as u32,
        preview_mentions: mentions.iter().take(5).cloned().collect(),
        primary_topics,
        summary: summary.chars().take(500).collect(),
    };
    
    // 5. Store L1 data
    self.l1_store.store(hash, &l1_summary)?;
    
    Ok(l1_summary)
}
}
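The entity heuristic above captures any capitalized token, including sentence-initial words. A self-contained check of that limitation, reusing the function verbatim:

```rust
fn extract_entities(sentence: &str) -> Vec<String> {
    // Same heuristic as above: keep capitalized tokens, trim punctuation
    sentence
        .split_whitespace()
        .filter(|w| w.chars().next().map(|c| c.is_uppercase()).unwrap_or(false))
        .filter(|w| w.len() > 1)
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_string())
        .filter(|w| !w.is_empty())
        .collect()
}

fn main() {
    // Proper nouns are captured...
    assert_eq!(extract_entities("Alice met Bob in Paris."), vec!["Alice", "Bob", "Paris"]);
    // ...but so is any sentence-initial capital, a known MVP limitation.
    assert_eq!(extract_entities("The quick brown fox jumps."), vec!["The"]);
}
```

Analogously, the check order in `classify_sentence` matters: any sentence containing a digit classifies as `Statistic` before the `Claim` or `Definition` patterns are ever tested.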

Future Plugin Architecture:

#![allow(unused)]
fn main() {
pub trait L1ExtractorPlugin: Send + Sync {
    fn name(&self) -> &str;
    fn supported_mime_types(&self) -> Vec<&str>;
    fn extract(&self, content: &[u8], mime_type: &str) -> Result<Vec<Mention>>;
}

// Example: AI-powered extractor (future)
pub struct OpenAIExtractor {
    api_key: String,
    model: String,
}

impl L1ExtractorPlugin for OpenAIExtractor {
    fn name(&self) -> &str { "openai" }
    fn supported_mime_types(&self) -> Vec<&str> { vec!["text/plain", "text/markdown"] }
    fn extract(&self, content: &[u8], mime_type: &str) -> Result<Vec<Mention>> {
        // Call OpenAI API...
        todo!()
    }
}
}

§7.1.2a BUILD_L2 (Entity Graph)

Build a personal L2 entity graph from L1 sources. L2 is always private and never directly monetized.

#![allow(unused)]
fn main() {
async fn build_l2(
    &mut self,
    source_l1s: &[Hash],
    config: Option<L2BuildConfig>,
) -> Result<Hash> {
    let config = config.unwrap_or_default();
    
    // 1. Validate we have at least one source
    if source_l1s.is_empty() {
        return Err(Error::InvalidOperation("build_l2 requires at least one L1 source".into()));
    }
    if source_l1s.len() > MAX_SOURCE_L1S_PER_L2 {
        return Err(Error::InvalidOperation("too many L1 sources".into()));
    }
    
    // 2. Load and verify all L1 sources (must be queried or owned)
    let mut l1_refs = Vec::new();
    let mut all_mentions: Vec<(Hash, Mention)> = Vec::new();
    
    for l1_hash in source_l1s {
        // Check if we have it (either owned, or cached from a paid query)
        let manifest = match self.manifest_store.load(l1_hash)? {
            Some(m) => m,
            None => self.cache.get_manifest(l1_hash)?
                .ok_or(Error::L2MissingSource)?,
        };
        
        if manifest.content_type != ContentType::L1 {
            return Err(Error::InvalidOperation("source must be L1".into()));
        }
        
        // Load L1 summary to get mentions
        let l1_summary = self.l1_store.load(l1_hash)?
            .ok_or(Error::L2MissingSource)?;
        
        l1_refs.push(L1Reference {
            l1_hash: l1_hash.clone(),
            l0_hash: l1_summary.l0_hash.clone(),
            mention_ids_used: vec![], // Empty means "all mentions used"
        });
        
        // Collect mentions with their L1 hash for reference
        for mention in &l1_summary.preview_mentions {
            all_mentions.push((l1_hash.clone(), mention.clone()));
        }
    }
    
    // 3. Extract entities from mentions
    let raw_entities = extract_entities_from_mentions(&all_mentions, &config)?;
    
    // 4. Resolve entities (merge duplicates, link to external KBs)
    let prefixes = config.prefixes.clone().unwrap_or_default();
    let resolved_entities = resolve_entities(raw_entities, &config)?;
    
    if resolved_entities.is_empty() {
        return Err(Error::InvalidOperation("no entities extracted".into()));
    }
    
    // 5. Extract relationships
    let relationships = extract_relationships(&resolved_entities, &all_mentions, &config)?;
    
    // 6. Build L2 graph
    let mut l2_graph = L2EntityGraph {
        id: Hash::default(), // Computed below
        source_l1s: l1_refs,
        source_l2s: vec![],
        prefixes,
        entities: resolved_entities.clone(),
        relationships: relationships.clone(),
        entity_count: resolved_entities.len() as u32,
        relationship_count: relationships.len() as u32,
        source_mention_count: all_mentions.len() as u32,
    };
    
    // 7. Serialize and compute hash
    let content = serialize(&l2_graph)?;
    let hash = content_hash(&content);
    l2_graph.id = hash.clone();
    
    // 8. Compute provenance (merge from all source L1s)
    let mut root_entries: Vec<ProvenanceEntry> = Vec::new();
    let mut max_depth = 0u32;
    
    for l1_hash in source_l1s {
        let manifest = match self.manifest_store.load(l1_hash)? {
            Some(m) => m,
            None => self.cache.get_manifest(l1_hash)?
                .ok_or(Error::L2MissingSource)?,
        };
        
        for entry in &manifest.provenance.root_L0L1 {
            merge_or_increment(&mut root_entries, entry.clone());
        }
        max_depth = max_depth.max(manifest.provenance.depth);
    }
    
    let provenance = Provenance {
        root_L0L1: root_entries,
        derived_from: source_l1s.to_vec(),
        depth: max_depth + 1,
    };
    
    // 9. Create manifest (L2 is ALWAYS private with zero price)
    let manifest = Manifest {
        hash: hash.clone(),
        content_type: ContentType::L2,
        owner: self.identity.peer_id(),
        version: Version {
            number: 1,
            previous: None,
            root: hash.clone(),
            timestamp: current_timestamp(),
        },
        visibility: Visibility::Private,  // L2 is ALWAYS private
        access: AccessControl::default(),
        metadata: Metadata {
            title: format!("Entity Graph ({} entities)", resolved_entities.len()),
            description: None,
            tags: vec![],
            content_size: content.len() as u64,
            mime_type: Some("application/x-nodalync-l2".into()),
        },
        economics: Economics {
            price: 0,  // L2 is ALWAYS free (never queried)
            currency: Currency::HBAR,
            total_queries: 0,
            total_revenue: 0,
        },
        provenance,
        created_at: current_timestamp(),
        updated_at: current_timestamp(),
    };
    
    // 10. Validate
    self.validator.validate_content(&content, &manifest)?;
    
    // 11. Store
    self.content_store.store_verified(&hash, &content)?;
    self.manifest_store.store(&manifest)?;
    
    Ok(hash)
}

/// Helper: Extract entities from mentions
fn extract_entities_from_mentions(
    mentions: &[(Hash, Mention)],
    config: &L2BuildConfig,
) -> Result<Vec<Entity>> {
    let mut entities = Vec::new();
    let default_type = config.default_entity_type.clone()
        .unwrap_or_else(|| "ndl:Concept".into());
    
    for (l1_hash, mention) in mentions {
        for entity_name in &mention.entities {
            // Create entity with mention reference
            let entity = Entity {
                id: content_hash(format!("{}:{}", entity_name, default_type).as_bytes()),
                canonical_label: entity_name.clone(),
                canonical_uri: None,
                aliases: vec![],
                entity_types: vec![default_type.clone()],
                source_mentions: vec![MentionRef {
                    l1_hash: l1_hash.clone(),
                    mention_id: mention.id.clone(),
                }],
                confidence: 0.8,  // Default confidence
                resolution_method: ResolutionMethod::ExactMatch,
                description: None,
                same_as: None,
            };
            entities.push(entity);
        }
    }
    
    Ok(entities)
}
}
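`merge_or_increment` is used here and again in `merge_l2`, but is not defined in this section. A minimal sketch, assuming entries are keyed by content hash with additive weights (the `String` hash and the trimmed-down struct are simplified stand-ins for the protocol's `Hash` and full `ProvenanceEntry` types):

```rust
#[derive(Clone, Debug, PartialEq)]
struct ProvenanceEntry {
    hash: String, // stand-in for the protocol's Hash type
    weight: u32,
}

/// If an entry with the same hash already exists, add the weights;
/// otherwise append the new entry.
fn merge_or_increment(entries: &mut Vec<ProvenanceEntry>, new: ProvenanceEntry) {
    match entries.iter_mut().find(|e| e.hash == new.hash) {
        Some(existing) => existing.weight += new.weight,
        None => entries.push(new),
    }
}

fn main() {
    let mut roots = vec![ProvenanceEntry { hash: "a".into(), weight: 1 }];
    merge_or_increment(&mut roots, ProvenanceEntry { hash: "a".into(), weight: 2 });
    merge_or_increment(&mut roots, ProvenanceEntry { hash: "b".into(), weight: 1 });
    assert_eq!(roots.len(), 2);
    assert_eq!(roots[0].weight, 3); // duplicate "a" merged additively
}
```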

§7.1.2b MERGE_L2

Merge several of your own L2 graphs into a single unified graph.

#![allow(unused)]
fn main() {
async fn merge_l2(
    &mut self,
    source_l2s: &[Hash],
    config: Option<L2MergeConfig>,
) -> Result<Hash> {
    let config = config.unwrap_or_default();
    
    // 1. Validate
    if source_l2s.len() < 2 {
        return Err(Error::InvalidOperation("merge_l2 requires at least 2 sources".into()));
    }
    if source_l2s.len() > MAX_SOURCE_L2S_PER_MERGE {
        return Err(Error::InvalidOperation("too many L2 sources".into()));
    }
    
    // 2. Load all L2 sources (must be local - L2 is never queried)
    let mut all_entities: Vec<Entity> = Vec::new();
    let mut all_relationships: Vec<Relationship> = Vec::new();
    let mut all_l1_refs: Vec<L1Reference> = Vec::new();
    let mut merged_prefixes = PrefixMap::default();
    let mut root_entries: Vec<ProvenanceEntry> = Vec::new();
    let mut max_depth = 0u32;
    
    for l2_hash in source_l2s {
        // Must be local (owned)
        let manifest = self.manifest_store.load(l2_hash)?
            .ok_or(Error::NotFound)?;
        
        if manifest.content_type != ContentType::L2 {
            return Err(Error::InvalidOperation("source must be L2".into()));
        }
        if manifest.owner != self.identity.peer_id() {
            return Err(Error::InvalidOperation("can only merge your own L2s".into()));
        }
        
        // Load L2 content
        let content = self.content_store.load(l2_hash)?
            .ok_or(Error::NotFound)?;
        let l2_graph: L2EntityGraph = deserialize(&content)?;
        
        // Collect entities, relationships, refs
        all_entities.extend(l2_graph.entities);
        all_relationships.extend(l2_graph.relationships);
        all_l1_refs.extend(l2_graph.source_l1s);
        
        // Merge prefixes (later ones override earlier)
        for entry in l2_graph.prefixes.entries {
            merged_prefixes.entries.retain(|e| e.prefix != entry.prefix);
            merged_prefixes.entries.push(entry);
        }
        
        // Merge provenance
        for entry in &manifest.provenance.root_L0L1 {
            merge_or_increment(&mut root_entries, entry.clone());
        }
        max_depth = max_depth.max(manifest.provenance.depth);
    }
    
    // 3. Deduplicate L1 refs
    let mut unique_l1_refs: Vec<L1Reference> = Vec::new();
    for l1_ref in all_l1_refs {
        if !unique_l1_refs.iter().any(|r| r.l1_hash == l1_ref.l1_hash) {
            unique_l1_refs.push(l1_ref);
        }
    }
    
    // 4. Cross-graph entity resolution
    let threshold = config.entity_merge_threshold.unwrap_or(0.8);
    let resolved_entities = merge_entities(&all_entities, threshold)?;
    
    // 5. Update relationship entity references
    let entity_id_map = build_entity_id_map(&all_entities, &resolved_entities);
    let resolved_relationships = update_relationship_refs(&all_relationships, &entity_id_map)?;
    
    // 6. Build merged L2
    let mut l2_graph = L2EntityGraph {
        id: Hash::default(),
        source_l1s: unique_l1_refs,
        source_l2s: source_l2s.to_vec(),
        prefixes: config.prefixes.clone().unwrap_or(merged_prefixes),
        entities: resolved_entities.clone(),
        relationships: resolved_relationships.clone(),
        entity_count: resolved_entities.len() as u32,
        relationship_count: resolved_relationships.len() as u32,
        source_mention_count: resolved_entities.iter()
            .map(|e| e.source_mentions.len())
            .sum::<usize>() as u32,
    };
    
    // 7. Hash
    let content = serialize(&l2_graph)?;
    let hash = content_hash(&content);
    l2_graph.id = hash.clone();
    
    // 8. Provenance
    let provenance = Provenance {
        root_L0L1: root_entries,
        derived_from: source_l2s.to_vec(),
        depth: max_depth + 1,
    };
    
    // 9. Create manifest
    let manifest = Manifest {
        hash: hash.clone(),
        content_type: ContentType::L2,
        owner: self.identity.peer_id(),
        version: Version {
            number: 1,
            previous: None,
            root: hash.clone(),
            timestamp: current_timestamp(),
        },
        visibility: Visibility::Private,
        access: AccessControl::default(),
        metadata: Metadata {
            title: format!("Merged Entity Graph ({} entities)", resolved_entities.len()),
            description: None,
            tags: vec![],
            content_size: content.len() as u64,
            mime_type: Some("application/x-nodalync-l2".into()),
        },
        economics: Economics {
            price: 0,
            currency: Currency::HBAR,
            total_queries: 0,
            total_revenue: 0,
        },
        provenance,
        created_at: current_timestamp(),
        updated_at: current_timestamp(),
    };
    
    // 10. Validate and store
    self.validator.validate_content(&content, &manifest)?;
    self.content_store.store_verified(&hash, &content)?;
    self.manifest_store.store(&manifest)?;
    
    Ok(hash)
}
}
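`merge_entities` and the cross-graph resolution step are not spelled out in this section. A sketch that uses exact case-insensitive label equality as a stand-in for the similarity score compared against `entity_merge_threshold`:

```rust
#[derive(Clone, Debug)]
struct Entity {
    canonical_label: String,
    source_mentions: Vec<String>, // simplified stand-in for MentionRef
}

/// Collapse entities with matching canonical labels, unioning their
/// source mentions. A real implementation would score alias/label
/// similarity against `threshold` rather than require exact equality.
fn merge_entities(entities: &[Entity], _threshold: f64) -> Vec<Entity> {
    let mut merged: Vec<Entity> = Vec::new();
    for e in entities {
        match merged.iter_mut()
            .find(|m| m.canonical_label.eq_ignore_ascii_case(&e.canonical_label))
        {
            Some(existing) => existing.source_mentions.extend(e.source_mentions.iter().cloned()),
            None => merged.push(e.clone()),
        }
    }
    merged
}

fn main() {
    let input = vec![
        Entity { canonical_label: "Hedera".into(), source_mentions: vec!["m1".into()] },
        Entity { canonical_label: "hedera".into(), source_mentions: vec!["m2".into()] },
    ];
    let out = merge_entities(&input, 0.8);
    assert_eq!(out.len(), 1);
    assert_eq!(out[0].source_mentions.len(), 2);
}
```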

§7.1.3 PUBLISH

#![allow(unused)]
fn main() {
async fn publish(
    &mut self,
    hash: &Hash,
    visibility: Visibility,
    price: Amount,
    access: Option<AccessControl>,
) -> Result<()> {
    // 1. Load manifest
    let mut manifest = self.manifest_store.load(hash)?
        .ok_or(Error::NotFound)?;
    
    // 2. L2 can NEVER be published
    if manifest.content_type == ContentType::L2 {
        return Err(Error::L2CannotPublish);
    }
    
    // 3. Validate price
    validate_price(price)?;
    
    // 4. Update manifest
    manifest.visibility = visibility;
    manifest.economics.price = price;
    if let Some(access) = access {
        manifest.access = access;
    }
    manifest.updated_at = current_timestamp();
    
    // 5. Save
    self.manifest_store.update(&manifest)?;
    
    // 6. Announce to network (if Shared)
    if visibility == Visibility::Shared {
        let l1_summary = self.get_or_extract_l1(hash).await?;
        let announce = AnnouncePayload {
            hash: hash.clone(),
            content_type: manifest.content_type,
            title: manifest.metadata.title.clone(),
            l1_summary,
            price,
            addresses: self.network.listen_addresses(),
        };
        self.network.dht_announce(hash, announce).await?;
    }
    
    Ok(())
}
}

§7.1.5 DERIVE (Create L3)

Create an L3 insight from sources. Sources can be any combination of:

  • L0 (raw documents)
  • L1 (mentions)
  • L2 (your entity graphs - must be owned, not queried)
  • L3 (other insights)
#![allow(unused)]
fn main() {
async fn derive(
    &mut self,
    sources: &[Hash],
    insight: &[u8],
    metadata: Metadata,
) -> Result<Hash> {
    // 1. Verify all sources are accessible
    for source in sources {
        let manifest = self.get_manifest(source)?;
        
        match manifest.content_type {
            ContentType::L2 => {
                // L2 must be owned (it's personal, never queried)
                if manifest.owner != self.identity.peer_id() {
                    return Err(Error::InvalidOperation(
                        "can only derive from your own L2".into()
                    ));
                }
            }
            _ => {
                // Other types: must be queried or owned
                if !self.cache.is_cached(source) && !self.content_store.exists(source) {
                    return Err(Error::SourceNotQueried(source.clone()));
                }
            }
        }
    }
    
    // 2. Load source manifests
    let source_manifests: Vec<Manifest> = sources.iter()
        .map(|h| self.get_manifest(h))
        .collect::<Result<Vec<_>>>()?;
    
    // 3. Compute provenance (roots are always L0/L1, traced through L2/L3)
    let mut root_entries: HashMap<Hash, ProvenanceEntry> = HashMap::new();
    
    for source in &source_manifests {
        for entry in &source.provenance.root_L0L1 {
            root_entries.entry(entry.hash.clone())
                .and_modify(|e| e.weight += entry.weight)
                .or_insert(entry.clone());
        }
    }
    
    let max_depth = source_manifests.iter()
        .map(|s| s.provenance.depth)
        .max()
        .unwrap_or(0);
    
    let provenance = Provenance {
        root_L0L1: root_entries.into_values().collect(),
        derived_from: sources.to_vec(),
        depth: max_depth + 1,
    };
    
    // 4. Compute hash
    let hash = content_hash(insight);
    
    // 5. Create version
    let version = Version {
        number: 1,
        previous: None,
        root: hash.clone(),
        timestamp: current_timestamp(),
    };
    
    // 6. Create manifest
    let manifest = Manifest {
        hash: hash.clone(),
        content_type: ContentType::L3,
        owner: self.identity.peer_id(),
        version,
        visibility: Visibility::Private,
        access: AccessControl::default(),
        metadata,
        economics: Economics::default(),
        provenance,
        created_at: current_timestamp(),
        updated_at: current_timestamp(),
    };
    
    // 7. Validate
    self.validator.validate_provenance(&manifest, &source_manifests)?;
    
    // 8. Store
    self.content_store.store_verified(&hash, insight)?;
    self.manifest_store.store(&manifest)?;
    self.provenance_graph.add(&hash, sources)?;
    
    Ok(hash)
}
}
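The depth rule in step 3 is simply `max(source depths) + 1`, which tracks how many derivation hops separate an artifact from its L0/L1 roots. As a worked check:

```rust
/// Depth of a derived artifact: one more than its deepest source.
fn derived_depth(source_depths: &[u32]) -> u32 {
    source_depths.iter().copied().max().unwrap_or(0) + 1
}

fn main() {
    assert_eq!(derived_depth(&[0, 0]), 1);    // derived directly from L0s
    assert_eq!(derived_depth(&[1, 3, 2]), 4); // deepest chain wins
}
```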

§7.2.3 QUERY

#![allow(unused)]
fn main() {
/// Configuration for channel auto-open
pub struct ChannelConfig {
    /// Minimum deposit when auto-opening a channel
    pub min_deposit: Amount,
    /// Default deposit for new channels
    pub default_deposit: Amount,
}

impl Default for ChannelConfig {
    fn default() -> Self {
        Self {
            min_deposit: 10_000_000_000,  // 100 HBAR minimum
            default_deposit: 100_000_000_000,  // 1000 HBAR default
        }
    }
}

async fn query(&mut self, hash: &Hash, payment: Payment) -> Result<QueryResponse> {
    // As requester
    
    // 1. Get preview for price check and owner discovery
    let (manifest, _) = self.preview(hash).await?;
    let owner = &manifest.owner;
    
    // 2. Ensure channel exists - AUTO-OPEN if not
    if !self.channels.exists(owner) {
        // Check if we have sufficient balance for auto-open
        let balance = self.get_available_balance().await?;
        if balance < self.config.channel.min_deposit {
            return Err(Error::PaymentRequired {
                message: format!(
                    "No channel with {} and insufficient balance to auto-open. Need {} HBAR minimum.",
                    owner, self.config.channel.min_deposit
                ),
            });
        }
        
        // Auto-open channel with default deposit
        let deposit = std::cmp::min(balance, self.config.channel.default_deposit);
        self.open_channel(owner, deposit).await?;
    }
    
    // 3. Validate payment amount
    if payment.amount < manifest.economics.price {
        return Err(Error::PaymentInsufficient);
    }
    
    // 4. Check channel balance
    let channel = self.channels.get(owner)?
        .ok_or(Error::ChannelNotFound)?;
    if channel.my_balance < payment.amount {
        return Err(Error::InsufficientChannelBalance);
    }
    
    // 5. Send request
    let request = QueryRequestPayload {
        hash: hash.clone(),
        query: None,
        payment: payment.clone(),
        version_spec: None,
    };
    let response = self.network.send_query(owner, request).await?;
    
    // 6. Verify response
    if content_hash(&response.content) != *hash {
        return Err(Error::ContentHashMismatch);
    }
    
    // 7. Update channel state
    self.channels.debit(owner, payment.amount)?;
    self.channels.add_payment(owner, payment)?;
    
    // 8. Cache content
    self.cache.cache(CachedContent {
        hash: hash.clone(),
        content: response.content.clone(),
        source_peer: owner.clone(),
        queried_at: current_timestamp(),
        payment_proof: response.payment_receipt.clone(),
    })?;
    
    Ok(response)
}
}
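The `ChannelConfig` amounts above are denominated in tinybars (on Hedera, 1 HBAR = 100,000,000 tinybars), so the inline comments can be sanity-checked directly:

```rust
const TINYBARS_PER_HBAR: u64 = 100_000_000;

fn hbar_to_tinybars(hbar: u64) -> u64 {
    hbar * TINYBARS_PER_HBAR
}

fn main() {
    assert_eq!(hbar_to_tinybars(100), 10_000_000_000);    // min_deposit: 100 HBAR
    assert_eq!(hbar_to_tinybars(1000), 100_000_000_000);  // default_deposit: 1000 HBAR
}
```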

Query Handler (receiving side)

The handler queues ALL distributions to the settlement queue. The settlement contract will distribute to all recipients (owner + all root contributors).

#![allow(unused)]
fn main() {
async fn handle_query_request(
    &mut self,
    sender: &PeerId,
    request: QueryRequestPayload,
) -> Result<QueryResponsePayload> {
    // 1. Load manifest
    let manifest = self.manifest_store.load(&request.hash)?
        .ok_or(Error::NotFound)?;
    
    // 2. Validate access
    self.validator.validate_access(sender, &manifest)?;
    
    // 3. Validate payment
    let channel = self.channels.get(sender)?
        .ok_or(Error::ChannelNotFound)?;
    self.validator.validate_payment(&request.payment, &channel, &manifest)?;
    
    // 4. Update channel state (credit the payment)
    self.channels.credit(sender, request.payment.amount)?;
    self.channels.increment_nonce(sender)?;
    
    // 5. Calculate distributions and queue ALL of them
    // The settlement contract will pay everyone, including us
    let distributions = distribute_revenue(
        request.payment.amount,
        &manifest.owner,
        &manifest.provenance.root_L0L1,
    );
    
    for dist in distributions {
        self.settlement_queue.enqueue(QueuedDistribution {
            payment_id: request.payment.id.clone(),
            recipient: dist.recipient,
            amount: dist.amount,
            source_hash: dist.source_hash,
            queued_at: current_timestamp(),
        })?;
    }
    
    // 6. Update economics
    let mut updated_manifest = manifest.clone();
    updated_manifest.economics.total_queries += 1;
    updated_manifest.economics.total_revenue += request.payment.amount;
    self.manifest_store.update(&updated_manifest)?;
    
    // 7. Check if settlement should be triggered
    let pending_total = self.settlement_queue.get_pending_total()?;
    let last_settlement = self.settlement_queue.get_last_settlement_time()?;
    if should_settle(pending_total, last_settlement.unwrap_or(0), current_timestamp()) {
        // Queue settlement for async processing
        self.settlement_trigger.notify();
    }
    
    // 8. Load and return content
    let content = self.content_store.load(&request.hash)?
        .ok_or(Error::ContentNotFound)?;
    
    let receipt_data = encode_receipt_data(&request.payment, channel.nonce + 1)?;
    let receipt = PaymentReceipt {
        payment_id: request.payment.id.clone(),
        amount: request.payment.amount,
        timestamp: current_timestamp(),
        channel_nonce: channel.nonce + 1,
        distributor_signature: self.identity.sign(&receipt_data)?,
    };
    
    Ok(QueryResponsePayload {
        hash: request.hash,
        content,
        manifest: updated_manifest,
        payment_receipt: receipt,
    })
}
}
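`distribute_revenue` itself is not shown in this section. A sketch consistent with the protocol's stated 95% flow to foundational contributors: the root pool is split proportionally by provenance weight, and the owner takes the remainder (the exact handling of integer-division dust is an assumption here, not a protocol rule):

```rust
#[derive(Debug)]
struct Distribution {
    recipient: String, // stand-in for PeerId
    amount: u64,       // tinybars
}

/// Split a payment: 95% to root L0/L1 contributors proportional to
/// their provenance weight; the remainder (including rounding dust)
/// goes to the content owner. The 95% figure is the protocol's stated
/// split; the dust rule is illustrative.
fn distribute_revenue(total: u64, owner: &str, roots: &[(String, u32)]) -> Vec<Distribution> {
    let root_pool = total * 95 / 100;
    let total_weight: u64 = roots.iter().map(|(_, w)| u64::from(*w)).sum();
    let mut out: Vec<Distribution> = roots
        .iter()
        .map(|(peer, w)| Distribution {
            recipient: peer.clone(),
            amount: root_pool * u64::from(*w) / total_weight.max(1),
        })
        .collect();
    let distributed: u64 = out.iter().map(|d| d.amount).sum();
    out.push(Distribution { recipient: owner.into(), amount: total - distributed });
    out
}

fn main() {
    let roots = vec![("alice".to_string(), 2), ("bob".to_string(), 1)];
    let dists = distribute_revenue(300, "carol", &roots);
    assert_eq!(dists.iter().map(|d| d.amount).sum::<u64>(), 300); // conservation
    assert_eq!(dists[0].amount, 190); // alice: 285 * 2/3
    assert_eq!(dists[1].amount, 95);  // bob:   285 * 1/3
}
```

Note that every payment conserves value: the queued distributions always sum to the payment amount.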

§7.1.6 REFERENCE_L3_AS_L0

#![allow(unused)]
fn main() {
async fn reference_l3_as_l0(&mut self, l3_hash: &Hash) -> Result<()> {
    // 1. Verify the L3 was queried (present in the paid cache)
    self.cache.get(l3_hash)?
        .ok_or(Error::SourceNotQueried(l3_hash.clone()))?;
    
    // 2. Verify it's an L3
    let manifest = self.get_manifest(l3_hash)?;
    if manifest.content_type != ContentType::L3 {
        return Err(Error::NotAnL3);
    }
    
    // 3. Store reference
    // Note: This is a reference, not a copy. The content stays
    // in cache/remote. When deriving, we use this hash in sources.
    self.references.add_l3_reference(l3_hash, &manifest)?;
    
    Ok(())
}
}

§7.3 Channel Operations

§7.3.1 CHANNEL_OPEN

#![allow(unused)]
fn main() {
async fn open_channel(&mut self, peer: &PeerId, deposit: Amount) -> Result<Hash> {
    // 1. Generate channel ID
    let nonce = rand::random::<u64>();
    let channel_id = content_hash(
        &[self.identity.peer_id().0.as_slice(), peer.0.as_slice(), &nonce.to_be_bytes()].concat()
    );
    
    // 2. Create channel state
    let channel = Channel {
        channel_id: channel_id.clone(),
        peer_id: peer.clone(),
        state: ChannelState::Opening,
        my_balance: deposit,
        their_balance: 0,
        nonce: 0,
        last_update: current_timestamp(),
        pending_payments: vec![],
    };
    
    // 3. Store locally
    self.channels.create(peer, channel)?;
    
    // 4. Send open request
    let open_msg = ChannelOpenPayload {
        channel_id: channel_id.clone(),
        initial_balance: deposit,
        funding_tx: None,  // Off-chain for now, on-chain funding optional
    };
    
    let response = self.network.send_channel_open(peer, open_msg).await?;
    
    // 5. Process accept response
    self.handle_channel_accept(&channel_id, &response)?;
    
    Ok(channel_id)
}
}

§7.3.2 CHANNEL_ACCEPT (Handler)

#![allow(unused)]
fn main() {
async fn handle_channel_open(
    &mut self,
    sender: &PeerId,
    request: ChannelOpenPayload,
) -> Result<ChannelAcceptPayload> {
    // 1. Validate channel doesn't already exist
    if self.channels.exists(sender) {
        return Err(Error::ChannelAlreadyExists);
    }
    
    // 2. Choose our matching deposit (taken from node config)
    let our_deposit = self.config.channel.default_deposit;
    
    // 3. Create channel state
    let channel = Channel {
        channel_id: request.channel_id.clone(),
        peer_id: sender.clone(),
        state: ChannelState::Open,
        my_balance: our_deposit,
        their_balance: request.initial_balance,
        nonce: 0,
        last_update: current_timestamp(),
        pending_payments: vec![],
    };
    
    // 4. Store
    self.channels.create(sender, channel)?;
    
    // 5. Return accept
    Ok(ChannelAcceptPayload {
        channel_id: request.channel_id,
        initial_balance: our_deposit,
        funding_tx: None,
    })
}

fn handle_channel_accept(&mut self, channel_id: &Hash, accept: &ChannelAcceptPayload) -> Result<()> {
    // Update channel to Open state with peer's deposit
    let channel = self.channels.get_by_id(channel_id)?
        .ok_or(Error::ChannelNotFound)?;
    
    let mut updated = channel.clone();
    updated.state = ChannelState::Open;
    updated.their_balance = accept.initial_balance;
    updated.last_update = current_timestamp();
    
    self.channels.update(&updated)?;
    Ok(())
}
}

§7.3.3 CHANNEL_CLOSE

Cooperative channel close flow:

  1. Initiator creates settlement_tx proposal
  2. Send ChannelClosePayload to peer
  3. Peer verifies and signs
  4. Either party submits signed tx to chain
#![allow(unused)]
fn main() {
async fn close_channel(&mut self, channel_id: &Hash) -> Result<()> {
    // 1. Get channel
    let channel = self.channels.get_by_id(channel_id)?
        .ok_or(Error::ChannelNotFound)?;
    
    // 2. Compute final balances
    let final_balances = ChannelBalances {
        initiator: channel.my_balance,
        responder: channel.their_balance,
    };
    
    // 3. Create proposed settlement transaction bytes
    let settlement_tx = self.settlement.create_close_tx_bytes(
        channel_id,
        &final_balances,
    );
    
    // 4. Sign the proposal
    let my_signature = self.identity.sign(&settlement_tx)?;
    
    // 5. Send close request to peer
    let close_msg = ChannelClosePayload {
        channel_id: channel_id.clone(),
        final_balances: final_balances.clone(),
        settlement_tx: settlement_tx.clone(),
    };
    
    let response = self.network.send_channel_close(&channel.peer_id, close_msg).await?;
    
    // 6. Peer's response includes their signature - submit to chain
    // (The response handler on peer side also signs the settlement_tx)
    self.settlement.close_channel(
        channel_id,
        final_balances,
        [my_signature, response.peer_signature],
    ).await?;
    
    // 7. Update local state
    self.channels.set_state(channel_id, ChannelState::Closed)?;
    
    Ok(())
}
}

§7.3.4 CHANNEL_DISPUTE

#![allow(unused)]
fn main() {
async fn dispute_channel(&mut self, channel_id: &Hash, our_state: &ChannelUpdatePayload) -> Result<()> {
    // 1. Submit dispute to chain with our latest signed state
    self.settlement.dispute_channel(channel_id, our_state).await?;
    
    // 2. Update local state
    self.channels.set_state(channel_id, ChannelState::Disputed)?;
    
    // 3. Wait for dispute period (24 hours) - handled by settlement module
    Ok(())
}
}

§7.4 Version Operations

handle_version_request

#![allow(unused)]
fn main() {
async fn handle_version_request(
    &mut self,
    _sender: &PeerId,
    request: VersionRequestPayload,
) -> Result<VersionResponsePayload> {
    // 1. Get all versions for this root
    let versions = self.manifest_store.get_versions(&request.version_root)?;
    
    if versions.is_empty() {
        return Err(Error::NotFound);
    }
    
    // 2. Find latest
    let latest = versions.iter()
        .max_by_key(|m| m.version.number)
        .unwrap();
    
    // 3. Convert to VersionInfo
    let version_infos: Vec<VersionInfo> = versions.iter()
        .map(|m| VersionInfo {
            hash: m.hash.clone(),
            number: m.version.number,
            timestamp: m.version.timestamp,
            visibility: m.visibility,
            price: m.economics.price,
        })
        .collect();
    
    Ok(VersionResponsePayload {
        version_root: request.version_root,
        versions: version_infos,
        latest: latest.hash.clone(),
    })
}
}

§7.5 Settlement Operations

trigger_settlement

Called periodically or when threshold reached. Creates batch and submits to chain.

#![allow(unused)]
fn main() {
async fn trigger_settlement(&mut self) -> Result<Option<SettlementBatch>> {
    // 1. Check if settlement needed
    let pending_total = self.settlement_queue.get_pending_total()?;
    let last_settlement = self.settlement_queue.get_last_settlement_time()?;
    
    if !should_settle(pending_total, last_settlement.unwrap_or(0), current_timestamp()) {
        return Ok(None);
    }
    
    // 2. Get pending distributions
    let pending = self.settlement_queue.get_pending()?;
    if pending.is_empty() {
        return Ok(None);
    }
    
    // 3. Create batch (aggregates by recipient)
    let payments: Vec<Payment> = pending.iter()
        .map(|d| reconstruct_payment(d))
        .collect();
    
    let batch = create_settlement_batch(&payments);
    
    // 4. Submit to chain
    let tx_id = self.settlement.settle_batch(batch.clone()).await?;
    
    // 5. Mark as settled
    let payment_ids: Vec<Hash> = pending.iter().map(|d| d.payment_id.clone()).collect();
    self.settlement_queue.mark_settled(&payment_ids, &batch.batch_id)?;
    self.settlement_queue.set_last_settlement_time(current_timestamp())?;
    
    // 6. Broadcast confirmation
    let confirm = SettleConfirmPayload {
        batch_id: batch.batch_id.clone(),
        transaction_id: tx_id.to_vec(),
        timestamp: current_timestamp(),
    };
    self.network.broadcast_settlement_confirm(confirm).await?;
    
    Ok(Some(batch))
}
}
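trigger_settlement relies on a should_settle predicate that is not shown above. A minimal sketch of the rule it implies — settle when either the pending total or the elapsed time crosses a limit (the constant values here are assumptions, not spec-mandated numbers):

```rust
// Sketch of the should_settle predicate referenced by trigger_settlement.
// SETTLEMENT_THRESHOLD and SETTLEMENT_INTERVAL_MS are assumed values.
const SETTLEMENT_THRESHOLD: u64 = 100_000_000;            // pending value, in tinybars
const SETTLEMENT_INTERVAL_MS: u64 = 24 * 60 * 60 * 1000;  // max time between batches

fn should_settle(pending_total: u64, last_settlement: u64, now: u64) -> bool {
    // Settle when enough value has accumulated, or enough time has passed.
    pending_total >= SETTLEMENT_THRESHOLD
        || now.saturating_sub(last_settlement) >= SETTLEMENT_INTERVAL_MS
}
```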

Public API Summary

#![allow(unused)]
fn main() {
// Content lifecycle
pub async fn create(...) -> Result<Hash>;           // L0/L1 only
pub async fn extract_l1(...) -> Result<L1Summary>;  // L0 → L1
pub async fn build_l2(...) -> Result<Hash>;         // L1s → L2 (always private)
pub async fn merge_l2(...) -> Result<Hash>;         // L2s → L2 (always private)
pub async fn publish(...) -> Result<()>;            // NOT allowed for L2
pub async fn unpublish(...) -> Result<()>;
pub async fn update(...) -> Result<Hash>;
pub async fn derive(...) -> Result<Hash>;           // Any sources → L3
pub async fn reference_l3_as_l0(...) -> Result<()>;

// Querying (L2 is never queried)
pub async fn preview(...) -> Result<(Manifest, L1Summary)>;
pub async fn query(...) -> Result<QueryResponse>;
pub async fn get_versions(...) -> Result<Vec<VersionInfo>>;

// Visibility/access (L2 is always private)
pub async fn set_visibility(...) -> Result<()>;
pub async fn set_access(...) -> Result<()>;

// Channel operations
pub async fn open_channel(...) -> Result<Hash>;
pub async fn accept_channel(...) -> Result<()>;
pub async fn close_channel(...) -> Result<()>;
pub async fn dispute_channel(...) -> Result<()>;

// Settlement (L2 is invisible to settlement)
pub async fn trigger_settlement(...) -> Result<Option<SettlementBatch>>;

// Handlers (for incoming messages - no L2 handlers needed)
pub async fn handle_preview_request(...) -> Result<PreviewResponsePayload>;
pub async fn handle_query_request(...) -> Result<QueryResponsePayload>;
pub async fn handle_version_request(...) -> Result<VersionResponsePayload>;
pub async fn handle_channel_open(...) -> Result<ChannelAcceptPayload>;
pub async fn handle_channel_close(...) -> Result<ChannelClosePayload>;
}

Test Cases

Content Lifecycle

  1. Create L0: Creates content, hash matches, owner set
  2. Create L2 via create(): Fails with “Use build_l2()”
  3. Create L3 via create(): Fails with “Use derive()”
  4. Extract L1: Rule-based extraction produces mentions

L2 Entity Graph

  1. Build L2 from L1s: Creates private L2, entities extracted
  2. Build L2 no sources: Fails
  3. Build L2 from non-L1: Fails
  4. Build L2 from unqueried L1: Fails
  5. Merge L2s: Combines entities, updates relationships
  6. Merge L2s from different owners: Fails (“can only merge your own L2s”)
  7. Merge single L2: Fails (requires >= 2)
  8. L2 is always private: visibility forced to Private
  9. L2 has zero price: price forced to 0
  10. Publish L2: Fails with L2CannotPublish

L3 Derivation

  1. Derive L3 from L1: Works, provenance correct
  2. Derive L3 from L2: Works if owned, provenance traces to L0/L1
  3. Derive L3 from someone else’s L2: Fails
  4. Derive L3 from mix: L0, L1, L2, L3 all work together

Publishing

  1. Publish L0/L1/L3: Works, visibility changes
  2. Unpublish: Visibility returns to Private
  3. Update version: New hash, version links correctly

Query Flow

  1. Query flow: Request → auto-open channel → payment → response → cache
  2. Query with existing channel: Uses existing channel
  3. Query insufficient balance: Returns PAYMENT_REQUIRED
  4. Query L2: Not possible (L2 is always private)
  5. Access denied: Private content returns NotFound
  6. Unlisted access: With hash works, without fails
  7. Insufficient payment: Rejected

Economics

  1. L3 from L2 provenance: root_L0L1 contains original L0/L1, not L2
  2. Settlement for L3: L2 creator gets nothing, L0/L1 creators paid

Other Operations

  1. Reference L3: Only works if queried first
  2. Channel open: Creates channel, both sides have state
  3. Channel close: Cooperative close submits to chain
  4. Channel dispute: Submits dispute with latest state
  5. Version request: Returns all versions for root
  6. Settlement trigger: Creates batch, submits to chain
  7. Settlement threshold: Triggers when threshold reached
  8. Settlement interval: Triggers after time elapsed

Module: nodalync-net

Source: Protocol Specification §11

Overview

P2P networking using libp2p. Handles peer discovery, DHT, and message routing.

Key Design Decisions:

  1. Hash-Only Lookup for MVP: The protocol supports hash-based content discovery only. Keyword/semantic search is an application-layer concern and out of scope for the core protocol. Users discover content via external channels (social media, links, recommendations) and use the protocol to query by hash.

  2. DHT stores: content_hash -> AnnouncePayload mapping. This allows anyone with a hash to find the content owner’s addresses and metadata.

  3. No search index: The DHT is NOT an inverted index. Future application-layer services can build search functionality on top of the protocol.

Dependencies

  • nodalync-types — All data structures
  • nodalync-wire — Message encoding
  • nodalync-ops — Operation handlers
  • libp2p — P2P networking stack

§11.1 Transport

#![allow(unused)]
fn main() {
pub fn build_transport(identity: &Keypair) -> Boxed<(PeerId, StreamMuxerBox)> {
    let tcp = tcp::tokio::Transport::new(tcp::Config::default().nodelay(true));
    
    tcp.upgrade(Version::V1)
        .authenticate(noise::Config::new(identity).unwrap())
        .multiplex(yamux::Config::default())
        .boxed()
}
}

Supported transports:

  • TCP (primary)
  • QUIC (optional, for better performance)
  • WebSocket (optional, for browser nodes)

Security:

  • Noise protocol (XX handshake pattern)

Multiplexing:

  • yamux (primary)
  • mplex (fallback)

§11.2 Discovery (DHT)

Kademlia Configuration

#![allow(unused)]
fn main() {
pub fn build_kademlia(peer_id: PeerId) -> Kademlia<MemoryStore> {
    let mut config = KademliaConfig::default();
    config.set_query_timeout(Duration::from_secs(60));
    config.set_replication_factor(NonZeroUsize::new(DHT_REPLICATION).unwrap());
    
    let store = MemoryStore::new(peer_id);
    Kademlia::with_config(peer_id, store, config)
}

// Constants from spec
const DHT_BUCKET_SIZE: usize = 20;
const DHT_ALPHA: usize = 3;
const DHT_REPLICATION: usize = 20;
}

Content Announcement

#![allow(unused)]
fn main() {
/// Announce content availability to DHT
/// Stores: content_hash -> AnnouncePayload
pub async fn dht_announce(&mut self, hash: &Hash, payload: AnnouncePayload) -> Result<()> {
    let key = Key::new(&hash.0);
    let value = encode_payload(&payload)?;
    
    self.kademlia.put_record(Record::new(key, value), Quorum::Majority).await?;
    
    Ok(())
}

/// Lookup content by hash (the ONLY lookup mechanism in protocol)
/// Returns owner's addresses and metadata if found
pub async fn dht_get(&mut self, hash: &Hash) -> Result<Option<AnnouncePayload>> {
    let key = Key::new(&hash.0);
    
    match self.kademlia.get_record(key).await {
        Ok(record) => {
            let payload: AnnouncePayload = decode_payload(&record.value)?;
            Ok(Some(payload))
        }
        Err(GetRecordError::NotFound) => Ok(None),
        Err(e) => Err(e.into()),
    }
}

/// Remove content announcement from DHT
pub async fn dht_remove(&mut self, hash: &Hash) -> Result<()> {
    let key = Key::new(&hash.0);
    self.kademlia.remove_record(&key).await?;
    Ok(())
}
}

Note on Search:

The protocol does NOT include keyword search. The DHT only supports exact hash lookups. Content discovery happens through application-layer mechanisms:

  • External search services (could index L1 summaries)
  • Social sharing (users share links containing hashes)
  • Recommendations (applications can build on provenance data)
  • Curated directories (third parties can maintain topic indexes)

This keeps the protocol minimal and focused on trustless content exchange.
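Since discovery is hash-based, a shared link only needs to carry the hash itself. A sketch of application-layer link parsing, assuming a hypothetical nodalync://content/&lt;hash&gt; scheme (the protocol does not define a URI format — this is purely illustrative):

```rust
/// Extract a content hash from a shared link like nodalync://content/<64-hex-chars>.
/// The link scheme is an assumption; the protocol only specifies lookup by hash.
fn parse_content_link(link: &str) -> Option<String> {
    let hex = link.strip_prefix("nodalync://content/")?;
    if hex.len() == 64 && hex.chars().all(|c| c.is_ascii_hexdigit()) {
        Some(hex.to_ascii_lowercase())
    } else {
        None
    }
}
```

An application would feed the resulting hash straight into `dht_get` to find the owner's addresses.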


§11.3 Peer Discovery

Bootstrap

#![allow(unused)]
fn main() {
const BOOTSTRAP_NODES: &[&str] = &[
    "/dns4/bootstrap1.nodalync.io/tcp/9000/p2p/12D3KooW...",
    "/dns4/bootstrap2.nodalync.io/tcp/9000/p2p/12D3KooW...",
];

pub async fn bootstrap(&mut self) -> Result<()> {
    for addr in BOOTSTRAP_NODES {
        let addr: Multiaddr = addr.parse()?;
        self.swarm.dial(addr)?;
    }
    
    // Bootstrap Kademlia
    self.kademlia.bootstrap()?;
    
    Ok(())
}
}

Peer Exchange

#![allow(unused)]
fn main() {
/// Advertise our peer info to all connected peers (peer exchange)
pub async fn exchange_peers(&mut self) -> Result<()> {
    for peer in self.connected_peers() {
        let msg = Message::new(
            MessageType::PeerInfo,
            encode_payload(&PeerInfoPayload {
                peer_id: self.peer_id(),
                public_key: self.public_key(),
                addresses: self.listen_addresses(),
                capabilities: self.capabilities(),
                content_count: self.content_count(),
                uptime: self.uptime(),
            })?,
            &self.identity,
        );
        self.send(&peer, msg).await?;
    }
    
    Ok(())
}
}

§11.4 Message Routing

Request-Response Protocol

#![allow(unused)]
fn main() {
#[derive(NetworkBehaviour)]
pub struct NodalyncBehaviour {
    kademlia: Kademlia<MemoryStore>,
    request_response: request_response::Behaviour<NodalyncCodec>,
    gossipsub: gossipsub::Behaviour,
    identify: identify::Behaviour,
}

pub struct NodalyncCodec;

impl request_response::Codec for NodalyncCodec {
    type Protocol = &'static str;
    type Request = Message;
    type Response = Message;
    
    fn protocol(&self) -> Self::Protocol {
        "/nodalync/1.0.0"
    }
    
    async fn read_request(&mut self, io: &mut impl AsyncRead) -> io::Result<Self::Request> {
        let bytes = read_length_prefixed(io, MAX_MESSAGE_SIZE).await?;
        decode_message(&bytes).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
    }
    
    async fn read_response(&mut self, io: &mut impl AsyncRead) -> io::Result<Self::Response> {
        let bytes = read_length_prefixed(io, MAX_MESSAGE_SIZE).await?;
        decode_message(&bytes).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
    }
    
    async fn write_request(&mut self, io: &mut impl AsyncWrite, msg: Self::Request) -> io::Result<()> {
        let bytes = encode_message(&msg)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        write_length_prefixed(io, &bytes).await
    }
    
    async fn write_response(&mut self, io: &mut impl AsyncWrite, msg: Self::Response) -> io::Result<()> {
        let bytes = encode_message(&msg)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        write_length_prefixed(io, &bytes).await
    }
}
}

Send/Receive

#![allow(unused)]
fn main() {
/// Send message to specific peer
pub async fn send(&mut self, peer: &PeerId, message: Message) -> Result<Message> {
    let response = self.request_response
        .send_request(peer, message)
        .await
        .map_err(|e| Error::Network(e.to_string()))?;
    
    Ok(response)
}

/// Broadcast announcement via GossipSub
pub async fn broadcast(&mut self, message: Message) -> Result<()> {
    let topic = gossipsub::IdentTopic::new("/nodalync/announce/1.0.0");
    let bytes = encode_message(&message)?;
    
    self.gossipsub.publish(topic, bytes)?;
    
    Ok(())
}
}

Timeouts and Retries

#![allow(unused)]
fn main() {
const MESSAGE_TIMEOUT: Duration = Duration::from_secs(30);
const MAX_RETRIES: usize = 3;

pub async fn send_with_retry(&mut self, peer: &PeerId, message: Message) -> Result<Message> {
    let mut last_error = None;
    
    for attempt in 0..MAX_RETRIES {
        match timeout(MESSAGE_TIMEOUT, self.send(peer, message.clone())).await {
            Ok(Ok(response)) => return Ok(response),
            Ok(Err(e)) => {
                last_error = Some(e);
                // Exponential backoff
                tokio::time::sleep(Duration::from_millis(100 * 2_u64.pow(attempt as u32))).await;
            }
            Err(_) => {
                last_error = Some(Error::Timeout);
            }
        }
    }
    
    Err(last_error.unwrap())
}
}
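The retry loop above backs off exponentially: 100 ms, 200 ms, 400 ms for attempts 0 through 2. Factored out as a helper to make the schedule explicit:

```rust
/// Exponential backoff delay matching the send_with_retry loop above:
/// 100ms * 2^attempt (attempts 0, 1, 2 -> 100ms, 200ms, 400ms).
fn backoff_ms(attempt: u32) -> u64 {
    100 * 2u64.pow(attempt)
}
```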

Network Trait

#![allow(unused)]
fn main() {
#[async_trait]
pub trait Network {
    // Discovery (hash-based only)
    async fn dht_announce(&mut self, hash: &Hash, payload: AnnouncePayload) -> Result<()>;
    async fn dht_get(&mut self, hash: &Hash) -> Result<Option<AnnouncePayload>>;
    async fn dht_remove(&mut self, hash: &Hash) -> Result<()>;
    
    // Messaging
    async fn send(&mut self, peer: &PeerId, message: Message) -> Result<Message>;
    async fn broadcast(&mut self, message: Message) -> Result<()>;
    
    // Specific message helpers
    async fn send_preview_request(&mut self, peer: &PeerId, hash: &Hash) -> Result<PreviewResponsePayload>;
    async fn send_query(&mut self, peer: &PeerId, request: QueryRequestPayload) -> Result<QueryResponsePayload>;
    async fn send_channel_open(&mut self, peer: &PeerId, request: ChannelOpenPayload) -> Result<ChannelAcceptPayload>;
    async fn send_channel_close(&mut self, peer: &PeerId, request: ChannelClosePayload) -> Result<ChannelClosePayload>;
    async fn broadcast_settlement_confirm(&mut self, confirm: SettleConfirmPayload) -> Result<()>;
    
    // Peer management
    fn connected_peers(&self) -> Vec<PeerId>;
    fn listen_addresses(&self) -> Vec<Multiaddr>;
    async fn dial(&mut self, addr: Multiaddr) -> Result<()>;
    
    // Event loop
    async fn next_event(&mut self) -> NetworkEvent;
}

pub enum NetworkEvent {
    MessageReceived { peer: PeerId, message: Message },
    PeerConnected(PeerId),
    PeerDisconnected(PeerId),
    DhtPutComplete { key: Hash, success: bool },
    DhtGetResult { key: Hash, value: Option<Vec<u8>> },
}
}

Test Cases

  1. Bootstrap: Connect to bootstrap nodes
  2. DHT announce/lookup: Announce content, find it from another node by hash
  3. DHT remove: Remove announcement, no longer findable
  4. Request-response: Send query, receive response
  5. Timeout: Slow peer triggers timeout
  6. Retry: Failed request retries
  7. Peer discovery: Find peers through DHT
  8. GossipSub: Broadcast reaches subscribers
  9. Channel messages: Open/close flow works
  10. Settlement broadcast: Confirm reaches all peers

Module: nodalync-settle

Source: Protocol Specification §12

Overview

Blockchain settlement on Hedera Hashgraph. Handles deposits, withdrawals, channel management, and batch settlement.

Key Design Decision: The settlement contract distributes payments to ALL recipients directly. When a settlement batch is submitted, the contract pays:

  • Content owners (5% synthesis fee + any root shares they have)
  • All root contributors (their proportional shares)

This ensures trustless distribution — content owners cannot withhold payments from upstream contributors. All recipients must have Hedera accounts to receive payments.
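As an illustration of that rule, a minimal sketch of the 95/5 split (the function name and rounding policy are assumptions; the authoritative economics live in nodalync-econ):

```rust
/// Illustrative 95/5 split: 5% synthesis fee to the content owner, 95%
/// divided among root contributors in proportion to their weights.
/// Hypothetical sketch, not the spec's API. Assumes at least one nonzero
/// weight; integer rounding dust goes to the owner.
fn split_payment(total: u64, root_weights: &[u64]) -> (u64, Vec<u64>) {
    let pool = total * 95 / 100;
    let weight_sum: u64 = root_weights.iter().sum();
    let shares: Vec<u64> = root_weights.iter().map(|w| pool * w / weight_sum).collect();
    let distributed: u64 = shares.iter().sum();
    (total - distributed, shares) // owner receives the fee plus rounding remainder
}
```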

Dependencies

  • nodalync-types — Settlement types
  • nodalync-econ — Batch creation
  • hedera-sdk — Hedera integration

§12.1 Chain Selection

Primary chain: Hedera Hashgraph

Rationale:

  • Fast finality (3-5 seconds)
  • Low cost (~$0.0001/tx)
  • High throughput (10,000+ TPS)
  • Enterprise backing (helps with non-crypto user trust)

§12.2 On-Chain Data

Contract State

// Simplified representation of on-chain state

contract NodalyncSettlement {
    // Token balances
    mapping(address => uint256) public balances;
    
    // Payment channels
    struct Channel {
        address participant1;
        address participant2;
        uint256 balance1;
        uint256 balance2;
        uint64 nonce;
        ChannelStatus status;
    }
    mapping(bytes32 => Channel) public channels;
    
    // Content attestations
    struct Attestation {
        bytes32 contentHash;
        address owner;
        uint64 timestamp;
        bytes32 provenanceRoot;
    }
    mapping(bytes32 => Attestation) public attestations;
}

§12.3 Contract Operations

EVM Address Handling

Critical for ECDSA accounts: When interacting with the settlement contract, the EVM address used by msg.sender differs based on account key type:

| Key Type | EVM Address (msg.sender) |
|----------|--------------------------|
| ECDSA | Derived from public key: keccak256(uncompressed_pubkey)[12:] |
| Ed25519 | Zero-padded account number: 0x000...{account_num_hex} |

For ECDSA accounts, AccountId::to_solidity_address() returns the wrong address for contract storage lookups. The contract uses msg.sender (the key-derived address) when storing balances, but queries using to_solidity_address() will look up the wrong slot.

To get the correct EVM address for any account:

curl -s "https://testnet.mirrornode.hedera.com/api/v1/accounts/0.0.ACCOUNT_ID" | jq '.evm_address'
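For Ed25519 accounts the long-zero address can also be computed locally. A sketch, assuming shard and realm are 0 as in the 0.0.N examples above:

```rust
/// Long-zero EVM address for an Ed25519 Hedera account 0.0.<account_num>.
/// Assumption: shard and realm are 0; ECDSA accounts must NOT use this —
/// their address is key-derived, as described above.
fn ed25519_evm_address(account_num: u64) -> String {
    format!("0x{:040x}", account_num)
}
```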

Deposit/Withdraw

Important: Deposits must call the contract’s deposit() payable function to update the internal balances mapping. A simple TransferTransaction sends HBAR but does NOT update the contract’s balance tracking.

#![allow(unused)]
fn main() {
pub async fn deposit(&self, amount: Amount) -> Result<TransactionId> {
    // CORRECT: Call the contract's deposit() payable function
    let tx = ContractExecuteTransaction::new()
        .contract_id(self.contract_id)
        .gas(100_000)
        .payable_amount(Hbar::from_tinybars(amount as i64))
        .function("deposit")
        .execute(&self.client)
        .await?;

    let receipt = tx.get_receipt(&self.client).await?;
    Ok(receipt.transaction_id)
}

pub async fn withdraw(&self, amount: Amount) -> Result<TransactionId> {
    let tx = ContractExecuteTransaction::new()
        .contract_id(self.contract_id)
        .gas(100_000)
        .function("withdraw")
        .function_parameters(ContractFunctionParameters::new().add_uint256(amount))
        .execute(&self.client)
        .await?;

    // Check the receipt — a successful submit can still revert on-chain
    let receipt = tx.get_receipt(&self.client).await?;
    Ok(receipt.transaction_id)
}
}

Content Attestation

#![allow(unused)]
fn main() {
pub async fn attest(
    &self,
    content_hash: &Hash,
    provenance_root: &Hash,
) -> Result<TransactionId> {
    let tx = ContractExecuteTransaction::new()
        .contract_id(self.contract_id)
        .function("attest")
        .function_parameters(
            ContractFunctionParameters::new()
                .add_bytes32(&content_hash.0)
                .add_bytes32(&provenance_root.0)
        )
        .execute(&self.client)
        .await?;
    
    Ok(tx.transaction_id)
}
}

Channel Operations

#![allow(unused)]
fn main() {
pub async fn open_channel(
    &self,
    peer: &AccountId,
    my_deposit: Amount,
    peer_deposit: Amount,
) -> Result<(ChannelId, TransactionId)> {
    let channel_id = compute_channel_id(&self.account_id, peer);
    
    let tx = ContractExecuteTransaction::new()
        .contract_id(self.contract_id)
        .function("openChannel")
        .function_parameters(
            ContractFunctionParameters::new()
                .add_bytes32(&channel_id.0)
                .add_address(peer)
                .add_uint256(my_deposit)
                .add_uint256(peer_deposit)
        )
        .execute(&self.client)
        .await?;
    
    Ok((channel_id, tx.transaction_id))
}

pub async fn close_channel(
    &self,
    channel_id: &ChannelId,
    final_balances: ChannelBalances,
    signatures: [Signature; 2],
) -> Result<TransactionId> {
    // NOTE: The spec's ChannelClosePayload.settlement_tx is the encoded
    // bytes of this on-chain call. Both parties must agree on final_balances
    // and sign before submitting.
    let tx = ContractExecuteTransaction::new()
        .contract_id(self.contract_id)
        .function("closeChannel")
        .function_parameters(
            ContractFunctionParameters::new()
                .add_bytes32(&channel_id.0)
                .add_uint256(final_balances.initiator)
                .add_uint256(final_balances.responder)
                .add_bytes(&signatures[0].0)
                .add_bytes(&signatures[1].0)
        )
        .execute(&self.client)
        .await?;
    
    Ok(tx.transaction_id)
}

/// Create settlement_tx bytes for ChannelClosePayload
pub fn create_close_tx_bytes(
    &self,
    channel_id: &ChannelId,
    final_balances: &ChannelBalances,
) -> Vec<u8> {
    // Encode the proposed close transaction for P2P negotiation
    let mut bytes = Vec::new();
    bytes.extend_from_slice(&channel_id.0);
    bytes.extend_from_slice(&final_balances.initiator.to_be_bytes());
    bytes.extend_from_slice(&final_balances.responder.to_be_bytes());
    bytes
}

pub async fn dispute_channel(
    &self,
    channel_id: &ChannelId,
    claimed_state: &ChannelUpdatePayload,
) -> Result<TransactionId> {
    let tx = ContractExecuteTransaction::new()
        .contract_id(self.contract_id)
        .function("disputeChannel")
        .function_parameters(
            ContractFunctionParameters::new()
                .add_bytes32(&channel_id.0)
                .add_uint64(claimed_state.nonce)
                .add_uint256(claimed_state.balances.initiator)
                .add_uint256(claimed_state.balances.responder)
                .add_bytes(&claimed_state.signature.0)
        )
        .execute(&self.client)
        .await?;
    
    Ok(tx.transaction_id)
}

/// Resolve a dispute after the dispute period (24 hours).
/// The contract will use the highest-nonce state submitted during the dispute period.
pub async fn resolve_dispute(
    &self,
    channel_id: &ChannelId,
) -> Result<TransactionId> {
    // After CHANNEL_DISPUTE_PERIOD_MS (24 hours), anyone can call resolve
    let tx = ContractExecuteTransaction::new()
        .contract_id(self.contract_id)
        .function("resolveDispute")
        .function_parameters(
            ContractFunctionParameters::new()
                .add_bytes32(&channel_id.0)
        )
        .execute(&self.client)
        .await?;
    
    Ok(tx.transaction_id)
}

/// Submit a counter-claim during dispute period with a higher nonce state
pub async fn counter_dispute(
    &self,
    channel_id: &ChannelId,
    better_state: &ChannelUpdatePayload,
) -> Result<TransactionId> {
    // If we have a state with higher nonce, submit it to win the dispute
    let tx = ContractExecuteTransaction::new()
        .contract_id(self.contract_id)
        .function("counterDispute")
        .function_parameters(
            ContractFunctionParameters::new()
                .add_bytes32(&channel_id.0)
                .add_uint64(better_state.nonce)
                .add_uint256(better_state.balances.initiator)
                .add_uint256(better_state.balances.responder)
                .add_bytes(&better_state.signature.0)
        )
        .execute(&self.client)
        .await?;
    
    Ok(tx.transaction_id)
}
}
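The dispute rules above reduce to two small predicates — counter only with a strictly higher nonce, and resolve only after the 24-hour window. A sketch (helper names are illustrative):

```rust
// Dispute-window helpers matching the flow above (names are illustrative).
const CHANNEL_DISPUTE_PERIOD_MS: u64 = 24 * 60 * 60 * 1000;

/// Counter a dispute only if our signed state has a strictly higher nonce.
fn should_counter_dispute(our_nonce: u64, disputed_nonce: u64) -> bool {
    our_nonce > disputed_nonce
}

/// resolveDispute may be called once the 24-hour window has elapsed.
fn dispute_period_elapsed(dispute_started: u64, now: u64) -> bool {
    now.saturating_sub(dispute_started) >= CHANNEL_DISPUTE_PERIOD_MS
}
```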

Batch Settlement

#![allow(unused)]
fn main() {
pub async fn settle_batch(&self, batch: SettlementBatch) -> Result<TransactionId> {
    // Encode batch entries
    let entries_encoded: Vec<Vec<u8>> = batch.entries
        .iter()
        .map(|e| encode_settlement_entry(e))
        .collect();
    
    let tx = ContractExecuteTransaction::new()
        .contract_id(self.contract_id)
        .function("settleBatch")
        .function_parameters(
            ContractFunctionParameters::new()
                .add_bytes32(&batch.batch_id.0)
                .add_bytes32(&batch.merkle_root.0)
                .add_bytes_array(&entries_encoded)
        )
        .execute(&self.client)
        .await?;
    
    Ok(tx.transaction_id)
}
}
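The batch's merkle_root can be built by pairwise hashing of the encoded entries. A structural sketch — the real protocol hashes with SHA-256 from nodalync-crypto; here the hash function is passed in so only the tree shape is shown:

```rust
/// Build a merkle root over encoded settlement entries by pairwise hashing.
/// Structural sketch: `h` stands in for SHA-256 (nodalync-crypto in the real
/// protocol). An odd leaf at any level is paired with itself.
fn merkle_root(leaves: &[Vec<u8>], h: &dyn Fn(&[u8]) -> Vec<u8>) -> Vec<u8> {
    let mut level: Vec<Vec<u8>> = leaves.iter().map(|l| h(l)).collect();
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| {
                let mut buf = pair[0].clone();
                // Duplicate the last node when the level has odd length.
                buf.extend_from_slice(pair.get(1).unwrap_or(&pair[0]));
                h(&buf)
            })
            .collect();
    }
    level.into_iter().next().unwrap_or_default()
}
```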

Settlement Trait

#![allow(unused)]
fn main() {
#[async_trait]
pub trait Settlement {
    // Balance management
    async fn deposit(&self, amount: Amount) -> Result<TransactionId>;
    async fn withdraw(&self, amount: Amount) -> Result<TransactionId>;
    async fn get_balance(&self) -> Result<Amount>;
    
    // Attestations
    async fn attest(&self, content_hash: &Hash, provenance_root: &Hash) -> Result<TransactionId>;
    async fn get_attestation(&self, content_hash: &Hash) -> Result<Option<Attestation>>;
    
    // Channels
    async fn open_channel(&self, peer: &AccountId, my_deposit: Amount, peer_deposit: Amount) -> Result<(ChannelId, TransactionId)>;
    async fn close_channel(&self, channel_id: &ChannelId, final_state: ChannelBalances, signatures: [Signature; 2]) -> Result<TransactionId>;
    async fn dispute_channel(&self, channel_id: &ChannelId, state: &ChannelUpdatePayload) -> Result<TransactionId>;
    async fn counter_dispute(&self, channel_id: &ChannelId, better_state: &ChannelUpdatePayload) -> Result<TransactionId>;
    async fn resolve_dispute(&self, channel_id: &ChannelId) -> Result<TransactionId>;
    
    // Batch settlement - distributes to ALL recipients in the batch
    async fn settle_batch(&self, batch: SettlementBatch) -> Result<TransactionId>;
    async fn verify_settlement(&self, tx_id: &TransactionId) -> Result<SettlementStatus>;
}

pub enum SettlementStatus {
    Pending,
    Confirmed { block: u64, timestamp: Timestamp },
    Failed { reason: String },
}
}

Configuration

[settlement]
# Hedera network: mainnet, testnet, previewnet
network = "testnet"

# Account ID (format: 0.0.12345)
account_id = "0.0.12345"

# Private key (or path to file)
private_key_path = "~/.nodalync/hedera.key"

# Contract ID
contract_id = "0.0.67890"

# Gas limits
max_gas_attest = 100000
max_gas_settle = 500000

Test Cases (Testnet)

  1. Deposit: Deposit tokens → balance increases
  2. Withdraw: Withdraw tokens → balance decreases
  3. Attest: Create attestation → retrievable on-chain
  4. Channel lifecycle: Open → update → close
  5. Dispute initiation: Submit dispute → channel enters Disputed state
  6. Counter dispute: Submit higher-nonce state → wins dispute
  7. Dispute resolution: After 24h → resolve settles to highest nonce
  8. Batch settlement: Multiple recipients settled in one tx
  9. Batch distribution: All root contributors receive correct amounts
  10. Merkle verification: Prove inclusion in batch

Debugging & Verification

Verify Transactions On-Chain

After any settlement operation, always verify on-chain status:

# Check recent transactions - should show CONTRACTCALL, not just CRYPTOTRANSFER
curl -s "https://testnet.mirrornode.hedera.com/api/v1/transactions?account.id=0.0.ACCOUNT&limit=5&order=desc" \
  | jq '.transactions[] | {timestamp: .consensus_timestamp, type: .name, result: .result}'

# Check contract calls specifically
curl -s "https://testnet.mirrornode.hedera.com/api/v1/contracts/0.0.7729011/results?limit=5&order=desc" \
  | jq '.results[] | {timestamp, from, result: .error_message}'

Check Contract State

# View all storage slots
curl -s "https://testnet.mirrornode.hedera.com/api/v1/contracts/0.0.7729011/state" | jq '.state'

# Query balance for an address (balances mapping, selector 0x27e235e3)
# Replace EVM_ADDRESS with 40 hex chars (no 0x prefix)
curl -s -X POST "https://testnet.mirrornode.hedera.com/api/v1/contracts/call" \
  -H "Content-Type: application/json" \
  -d '{
    "block": "latest",
    "data": "0x27e235e3000000000000000000000000EVM_ADDRESS",
    "to": "0xc6b4bFD28AF2F6999B32510557380497487A60dD"
  }' | jq '.result'

Check Event Logs

# View deposit/withdraw events (shows actual credited address)
curl -s "https://testnet.mirrornode.hedera.com/api/v1/contracts/0.0.7729011/results/logs?order=desc&limit=10" \
  | jq '.logs[] | {timestamp, topics, data}'

Common Issues

| Symptom | Cause | Solution |
|---------|-------|----------|
| Transaction shows CRYPTOTRANSFER, not CONTRACTCALL | Using TransferTransaction instead of ContractExecuteTransaction | Use ContractExecuteTransaction with payable_amount() |
| Balance query returns 0 after deposit | Wrong EVM address for ECDSA accounts | Use key-derived evm_address from mirror node |
| CONTRACT_REVERT_EXECUTED | Contract logic rejected the call | Check function parameters, balances, or channel state |
| CLI shows success but contract reverts | Receipt status not properly checked | Verify via mirror node API |

Contract Function Selectors

| Function | Selector | Notes |
|----------|----------|-------|
| deposit() | 0xd0e30db0 | Payable, no parameters |
| withdraw(uint256) | 0x2e1a7d4d | Amount in tinybars |
| balances(address) | 0x27e235e3 | Public mapping getter |
| openChannel(bytes32,address,uint256,uint256) | 0xcf027915 | channelId, peer, deposit1, deposit2 |
| closeChannel(bytes32,uint256,uint256,bytes) | varies | channelId, bal1, bal2, signatures |
| settleBatch(bytes32,bytes32,bytes[]) | varies | batchId, merkleRoot, entries |

Module: nodalync-cli

Source: Not in spec (application layer)

Overview

Command-line interface for interacting with a Nodalync node. User-facing binary.

Dependencies

  • All nodalync-* crates
  • clap — Argument parsing
  • indicatif — Progress bars
  • colored — Terminal colors

Commands

Identity

# Initialize new identity
nodalync init
> Identity created: ndl1abc123...
> Configuration saved to <data_dir>/config.toml

# Show identity
nodalync whoami
> PeerId: ndl1abc123...
> Public Key: 0x...
> Addresses: /ip4/0.0.0.0/tcp/9000

Content Management

# Publish content
nodalync publish <file> [--price <amount>] [--visibility <private|unlisted|shared>]
> Hashing content...
> Extracting L1 mentions... (23 found)
> Published: a1b2c3d4e5f6...
> Price: 0.10 HBAR
> Visibility: shared

# List local content
nodalync list [--visibility <filter>]
> SHARED (3)
>   a1b2c3d4e5f6... "Research Paper" v3, 0.10 HBAR, 847 queries
>   b7c8d9e0f1a2... "Analysis" v1, 0.05 HBAR, 234 queries
>
> PRIVATE (2)
>   d9e0f1a2b3c4... "Draft Ideas" v4
>   e5f6a7b8c9d0... "Personal Notes" v1

# Update content (new version)
nodalync update <hash> <new-file>
> Previous: a1b2c3d4e5f6... (v1)
> New: b7c8d9e0f1a2... (v2)
> Version root: a1b2c3d4e5f6...

# Show versions
nodalync versions <hash>
> Version root: a1b2c3d4e5f6...
> v1: a1b2c3d4e5f6... (2025-01-15) - shared
> v2: b7c8d9e0f1a2... (2025-01-20) - shared [latest]

# Change visibility
nodalync visibility <hash> --level <private|unlisted|shared>
> Visibility updated: a1b2c3d4e5f6... → shared

# Delete (local only)
nodalync delete <hash>
> Deleted: a1b2c3d4e5f6... (local copy only, provenance preserved)

Discovery & Querying

# Search network
nodalync search "climate change mitigation" [--limit <n>]
> Found 47 results
> [1] b7c8d9e0f1a2... "IPCC Report Summary" by ndl1def... (0.05/query, 847 queries)
>     Preview: Global temperatures have risen 1.1°C since pre-industrial...
> [2] c3d4e5f6a7b8... "Carbon Capture Analysis" by ndl1ghi... (0.12/query, 234 queries)
>     Preview: Current carbon capture technology can sequester...

# Preview content (free)
nodalync preview <hash>
> Title: "IPCC Report Summary"
> Owner: ndl1def...
> Price: 0.05 HBAR
> Queries: 847
> 
> L1 Mentions (5 of 23):
> - Global temperatures have risen 1.1°C since pre-industrial
> - Net-zero by 2050 requires 45% emission reduction by 2030
> - ...

# Query content (paid)
nodalync query <hash>
> Querying b7c8d9e0f1a2...
> Payment: 0.05 HBAR
> Content saved to ./cache/b7c8d9e0f1a2...

Synthesis

# Create L3 insight from sources
nodalync synthesize --sources <hash1>,<hash2>,... --output <file>
> Verifying sources queried... ✓
> Computing provenance (12 roots)...
> L3 hash: f1a2b3c4d5e6...
> 
> Publish now? [y/n/set price]: 0.15
> Published: f1a2b3c4d5e6... (0.15 HBAR, shared)

# Reference external L3 as L0
nodalync reference <l3-hash>
> Referencing a1b2c3d4e5f6... as L0 for future derivations

Economics

# Check balance
nodalync balance
> Protocol Balance: 127.50 HBAR
> Pending Earnings: 4.23 HBAR
> Pending Settlement: 12 payments
>
> Breakdown:
>   Direct queries: 89.20 HBAR
>   Root contributions: 38.30 HBAR

# Earnings by content
nodalync earnings [--content <hash>]
> Top earning content:
>   a1b2c3d4e5f6... "Research Paper": 45.30 HBAR (234 queries)
>   b7c8d9e0f1a2... "Analysis": 23.10 HBAR (462 queries, as root)

# Deposit tokens
nodalync deposit <amount>
> Depositing 50.00 HBAR...
> Transaction: 0x...
> New balance: 177.50 HBAR

# Withdraw tokens
nodalync withdraw <amount>
> Withdrawing 100.00 HBAR...
> Transaction: 0x...
> New balance: 77.50 HBAR

# Force settlement
nodalync settle
> Settling 12 pending payments...
> Batch ID: 0a1b2c3d4e5f...
> Transaction: 0x...
> Settled: 4.23 HBAR to 5 recipients

Payment Channels

# Open payment channel with peer
nodalync open-channel <peer-id> --deposit 100
> Channel opened: 4d5e6f7a8b9c...
> Peer: ndl1abc123...
> State: Open
> My Balance: 100.00 HBAR
> Their Balance: 100.00 HBAR

# List all payment channels
nodalync list-channels
> Payment Channels: 3 channels (2 open)
>   1a2b3c4d5e6f... ndl1abc... [Open] my: 0.85 HBAR / their: 1.15 HBAR
>   2b3c4d5e6f7a... ndl1def... [Open] my: 2.30 HBAR / their: 0.70 HBAR (5 pending)
>   3c4d5e6f7a8b... ndl1ghi... [Closed] my: 0.00 HBAR / their: 0.00 HBAR

# Close payment channel
nodalync close-channel <peer-id>
> Channel closed: 4d5e6f7a8b9c...
> Peer: ndl1abc123...
> Final Balance: my: 0.85 HBAR / their: 1.15 HBAR

Node Management

# Start node (foreground)
nodalync start
> Starting Nodalync node...
> PeerId: 12D3KooW...
> Listening on /ip4/0.0.0.0/tcp/9000
> Connected to 12 peers
> DHT bootstrapped

# Start with health endpoint (for containers/monitoring)
nodalync start --health --health-port 8080
> Starting Nodalync node...
> PeerId: 12D3KooW...
> Health endpoint: http://0.0.0.0:8080/health
> Metrics endpoint: http://0.0.0.0:8080/metrics

# Start as daemon (background)
nodalync start --daemon
> Nodalync daemon started (PID: 12345)
> PeerId: 12D3KooW...

# Node status
nodalync status
> Node: running (PID: 12345)
> PeerId: 12D3KooW...
> Uptime: 4h 23m
> Peers: 12 connected
> Content: 5 shared, 2 private
> Pending: 12 payments (4.23 HBAR)

# Stop daemon
nodalync stop
> Shutting down gracefully...
> Flushing pending operations...
> Node stopped

Health Endpoints (when --health flag is used):

| Endpoint | Content-Type | Description |
|---|---|---|
| `GET /health` | `application/json` | `{"status":"ok","connected_peers":N,"uptime_secs":M}` |
| `GET /metrics` | `text/plain` | Prometheus metrics format |

Prometheus Metrics:

  • nodalync_connected_peers — Current peer count
  • nodalync_peer_events_total{event} — Connect/disconnect events
  • nodalync_dht_operations_total{op,result} — DHT put/get operations
  • nodalync_gossipsub_messages_total — Broadcast messages received
  • nodalync_settlement_batches_total{status} — Settlement batches
  • nodalync_settlement_latency_seconds — Settlement operation latency
  • nodalync_queries_total — Total queries processed
  • nodalync_query_latency_seconds — Query latency histogram
  • nodalync_uptime_seconds — Node uptime
  • nodalync_node_info{version,peer_id} — Node metadata

CLI Structure

use std::path::PathBuf;

use clap::{Parser, Subcommand, ValueEnum};

#[derive(Parser)]
#[command(name = "nodalync")]
#[command(about = "Nodalync Protocol CLI")]
pub struct Cli {
    #[command(subcommand)]
    pub command: Commands,

    /// Path to config file (default: <data_dir>/config.toml)
    #[arg(short, long)]
    pub config: Option<PathBuf>,

    /// Output format
    #[arg(short, long, default_value = "human")]
    pub format: OutputFormat,
}

#[derive(Subcommand)]
pub enum Commands {
    /// Initialize new identity
    Init,

    /// Show identity info
    Whoami,

    /// Publish content
    Publish {
        file: PathBuf,
        #[arg(short, long)]
        price: Option<f64>,
        #[arg(short, long, default_value = "shared")]
        visibility: Visibility,
    },

    /// List local content
    List {
        #[arg(short, long)]
        visibility: Option<Visibility>,
    },

    /// Search network
    Search {
        query: String,
        #[arg(short, long, default_value = "10")]
        limit: u32,
    },

    /// Preview content (free)
    Preview { hash: String },

    /// Query content (paid)
    Query { hash: String },

    /// Create L3 synthesis
    Synthesize {
        #[arg(short, long, value_delimiter = ',')]
        sources: Vec<String>,
        #[arg(short, long)]
        output: PathBuf,
    },

    /// Check balance
    Balance,

    /// Start node
    Start {
        #[arg(short, long)]
        daemon: bool,

        /// Enable HTTP health endpoint
        #[arg(long)]
        health: bool,

        /// Port for health endpoint (default: 8080)
        #[arg(long, default_value = "8080")]
        health_port: u16,
    },

    /// Node status
    Status,

    /// Stop node
    Stop,

    /// Open payment channel
    OpenChannel {
        peer_id: String,
        #[arg(short, long)]
        deposit: f64,
    },

    /// Close payment channel
    CloseChannel { peer_id: String },

    /// List payment channels
    ListChannels,

    // ... more commands
}

#[derive(Clone, Copy, ValueEnum)]
pub enum OutputFormat {
    Human,
    Json,
}

Output Formatting

pub trait Render {
    fn render_human(&self) -> String;
    fn render_json(&self) -> String;

    /// Dispatch on the --format flag.
    fn render(&self, format: OutputFormat) -> String {
        match format {
            OutputFormat::Human => self.render_human(),
            OutputFormat::Json => self.render_json(),
        }
    }
}

impl Render for SearchResult {
    fn render_human(&self) -> String {
        // String::truncate mutates in place; take the first 80 chars instead
        let preview: String = self.l1_summary.summary.chars().take(80).collect();
        format!(
            "{} \"{}\" by {} ({}/query, {} queries)\n    Preview: {}",
            self.hash.short(),
            self.title,
            self.owner.short(),
            format_amount(self.price),
            self.total_queries,
            preview,
        )
    }

    fn render_json(&self) -> String {
        serde_json::to_string_pretty(self).unwrap()
    }
}

Error Handling

use clap::Parser;
use colored::Colorize;

pub fn run() -> Result<()> {
    let cli = Cli::parse();
    
    match cli.command {
        Commands::Publish { file, price, visibility } => {
            let result = publish(&file, price, visibility)?;
            println!("{}", result.render(cli.format));
        }
        // ...
    }
    
    Ok(())
}

fn main() {
    if let Err(e) = run() {
        eprintln!("{}: {}", "Error".red().bold(), e);
        std::process::exit(1);
    }
}

Configuration

Configuration is stored in a platform-specific data directory (set NODALYNC_DATA_DIR to override):

  • macOS: ~/Library/Application Support/io.nodalync.nodalync/config.toml
  • Linux: ~/.local/share/nodalync/config.toml
  • Windows: %APPDATA%\nodalync\nodalync\config.toml

[identity]
keyfile = "<data_dir>/identity/keypair.key"

[storage]
content_dir = "<data_dir>/content"
database = "<data_dir>/nodalync.db"
cache_dir = "<data_dir>/cache"
cache_max_size_mb = 1000

[network]
enabled = true
listen_addresses = ["/ip4/0.0.0.0/tcp/9000"]
bootstrap_nodes = [
    "/dns4/nodalync-bootstrap.eastus.azurecontainer.io/tcp/9000/p2p/12D3KooWMqrUmZm4e1BJTRMWqKHCe1TSX9Vu83uJLEyCGr2dUjYm",
]

[settlement]
network = "hedera-testnet"
auto_deposit = false

[economics]
default_price = 0.1  # In HBAR
auto_settle_threshold = 100.0  # In HBAR

[display]
default_format = "human"
show_previews = true
max_search_results = 20

Test Cases

  1. init: Creates identity and config
  2. publish: File hashed, L1 extracted, announced
  3. search: Returns results from network
  4. query: Pays and retrieves content
  5. synthesize: Creates L3 with correct provenance
  6. balance: Shows correct amounts
  7. JSON output: Valid JSON for all commands
  8. Error messages: Helpful, actionable errors
  9. open-channel: Opens channel, both sides have state
  10. list-channels: Shows all channels with states
  11. close-channel: Cooperative close, settles on-chain

Module 11: MCP Server

The nodalync-mcp crate provides an MCP (Model Context Protocol) server that enables AI assistants like Claude to query knowledge from a local Nodalync node.

Quick Start

1. Build the CLI

cargo build --release -p nodalync-cli

2. Initialize a Node

./target/release/nodalync init

3. Configure Claude Desktop

Add to your Claude Desktop MCP config (typically ~/.config/claude/mcp.json on macOS/Linux):

{
  "mcpServers": {
    "nodalync": {
      "command": "/path/to/nodalync",
      "args": ["mcp-server", "--budget", "1.0", "--auto-approve", "0.01"]
    }
  }
}

4. Restart Claude Desktop

Quit and reopen Claude Desktop to load the MCP server.

CLI Usage

# Start MCP server with defaults (1 HBAR budget, 0.01 auto-approve)
nodalync mcp-server

# Custom budget and auto-approve threshold
nodalync mcp-server --budget 5.0 --auto-approve 0.1

Options

| Flag | Default | Description |
|---|---|---|
| `--budget`, `-b` | 1.0 | Total session budget in HBAR |
| `--auto-approve`, `-a` | 0.01 | Auto-approve queries under this HBAR amount |

MCP Tools

When the MCP server is running, AI agents have access to these tools:

| Tool | Description |
|---|---|
| `query_knowledge` | Query content by hash or natural language (paid) |
| `list_sources` | Browse available content with metadata |
| `search_network` | Search connected peers for content (requires `--enable-network`) |
| `preview_content` | View content metadata without paying |
| `publish_content` | Publish new content from the agent |
| `synthesize_content` | Create L3 synthesis from multiple sources |
| `update_content` | Create a new version of existing content |
| `delete_content` | Delete content and set visibility to offline |
| `set_visibility` | Change content visibility |
| `list_versions` | List all versions of a content item |
| `get_earnings` | View earnings breakdown by content |
| `status` | Node health, budget, channels, and Hedera status |
| `deposit_hbar` | Deposit HBAR to the settlement contract |
| `open_channel` | Open a payment channel with a peer |
| `close_channel` | Close a payment channel |
| `close_all_channels` | Close all open payment channels |

Note: Natural language queries are not yet supported for query_knowledge. Use list_sources or search_network to discover content hashes first.

MCP Resources

knowledge://{hash}

Direct content access by hash. Use list_sources to discover available hashes.

URI Format: knowledge://<base58-encoded-hash>

Example:

knowledge://5dY7Kx9mT2...

Returns the content directly. Payment is handled automatically from session budget.

Architecture

┌──────────────┐     stdio      ┌─────────────────┐
│ Claude       │ ◄────────────► │ nodalync        │
│ Desktop      │     MCP        │ mcp-server      │
└──────────────┘                └────────┬────────┘
                                         │
                        ┌────────────────┼────────────────┐
                        │                │                │
                        ▼                ▼                ▼
                ┌─────────────┐  ┌─────────────┐  ┌─────────────┐
                │ nodalync-   │  │ nodalync-   │  │ Event Loop  │
                │ store       │  │ net         │  │ (background)│
                │ (local)     │  │ (P2P)       │  │             │
                └─────────────┘  └─────────────┘  └─────────────┘

Event Processing

When --enable-network is used, the MCP server spawns a background event loop that processes incoming network events (e.g., ChannelAccept messages). This enables full payment channel lifecycle support:

  1. Channel Open: Server sends ChannelOpen to peer
  2. Event Loop: Receives ChannelAccept from peer
  3. State Transition: Channel moves from Opening → Open
  4. Payments: Channel is ready for micropayments

Budget System

The budget system prevents runaway spending:

  1. Session Budget: Total HBAR available for the session
  2. Auto-Approve Threshold: Queries below this cost are approved automatically
  3. Atomic Tracking: Thread-safe spending with compare_exchange

// Budget is tracked atomically
pub fn try_spend(&self, amount: Amount) -> Result<Amount, McpError> {
    // Atomic compare-and-swap ensures thread safety
    // ...
}

Error Handling

| Error | Cause | Resolution |
|---|---|---|
| `BudgetExceeded` | Query cost > remaining budget | Increase budget or use smaller queries |
| `ContentNotFound` | Hash doesn't exist locally | Ensure content is published |
| `StorageError` | Database issues | Check permissions, disk space |

Testing

# Run MCP crate tests
cargo test -p nodalync-mcp

# Test server manually
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | ./target/release/nodalync mcp-server