Nodalync Protocol
A protocol for fair knowledge economics in the age of AI.
Abstract
We propose a protocol for knowledge economics that ensures original contributors receive perpetual, proportional compensation from all downstream value creation. A researcher can publish valuable findings once and receive perpetual royalties as the ecosystem builds upon their work. A writer’s insights compound in value as others synthesize and extend them.
The protocol enables humans to benefit from knowledge compounding—earning from what they know, not just what they continuously produce.
Key Features
- Cryptographic Provenance — Every piece of knowledge carries its complete derivation history
- Fair Economics — 95% of transaction value flows to foundational contributors
- Local-First — Your data stays on your machine, under your control
- AI-Native — MCP integration for seamless AI agent consumption
Quick Navigation
Getting Started
- Quick Start — Get your node running in under 5 minutes
- FAQ — Common questions answered
Protocol
- Specification — Complete protocol specification
- Architecture — Module structure and dependencies
- Whitepaper — Protocol design and economics
- L2 Addendum — Entity graph details
Module Documentation
- Crypto — Hashing, signing, identity
- Types — Data structures
- Wire — Serialization
- Store — Storage layer
- Validation — Validation rules
- Economics — Revenue distribution
- Operations — Protocol operations
- Networking — P2P layer
- Settlement — Hedera integration
Applications
- CLI — Command-line interface
- MCP Server — AI agent integration
Protocol Layers
| Layer | Name | Contents | Properties |
|---|---|---|---|
| L0 | Raw Inputs | Documents, transcripts, notes | Immutable, publishable, queryable |
| L1 | Mentions | Atomic facts with L0 pointers | Extracted, visible as preview |
| L2 | Entity Graph | Entities + RDF relations | Internal only, never shared |
| L3 | Insights | Emergent patterns and conclusions | Shareable, importable as L0 |
Current Status
Protocol v0.7.1 · CLI v0.10.1
| Layer | Crate | Description |
|---|---|---|
| Protocol | nodalync-crypto | Hashing (SHA-256), Ed25519 signing, PeerId derivation |
| Protocol | nodalync-types | All data structures including L2 Entity Graph |
| Protocol | nodalync-wire | Deterministic CBOR serialization, 21 message types |
| Protocol | nodalync-store | SQLite manifests, filesystem content, settlement queue |
| Protocol | nodalync-valid | Content, provenance, payment, L2 validation |
| Protocol | nodalync-econ | 95/5 revenue distribution, Merkle batching |
| Protocol | nodalync-ops | CREATE, DERIVE, BUILD_L2, MERGE_L2, QUERY |
| Protocol | nodalync-net | libp2p (TCP/Noise/yamux), Kademlia DHT, GossipSub |
| Protocol | nodalync-settle | Hedera settlement, smart contract deployed to testnet |
| App | nodalync-cli | Full CLI with daemon mode, health endpoints, alerting |
| App | nodalync-mcp | MCP server for AI agent integration |
Hedera Testnet
| Resource | Value |
|---|---|
| Contract ID | 0.0.7729011 |
| EVM Address | 0xc6b4bFD28AF2F6999B32510557380497487A60dD |
| HashScan | View Contract |
Nodalync Quick Start
Get your node running and connected to the network in under 5 minutes.
Installation
Choose one of three options:
Option A: One-Line Install (Recommended)
macOS / Linux:
curl -fsSL https://raw.githubusercontent.com/gdgiangi/nodalync-protocol/main/install.sh | sh
Windows (PowerShell):
irm https://raw.githubusercontent.com/gdgiangi/nodalync-protocol/main/install.ps1 | iex
Or download the latest .exe from Releases and add it to your PATH.
This auto-detects your platform and installs the latest binary with full Hedera settlement support.
Option B: Docker
# Pull or build the image
docker build -t nodalync:latest https://github.com/gdgiangi/nodalync-protocol.git
# Initialize your identity
docker run -it \
-e NODALYNC_PASSWORD=your-secure-password \
-v ~/.nodalync:/home/nodalync/.nodalync \
nodalync:latest init
# Start your node
docker run -d --name nodalync-node \
-e NODALYNC_PASSWORD=your-secure-password \
-v ~/.nodalync:/home/nodalync/.nodalync \
-p 9000:9000 \
nodalync:latest start
Option C: Build from Source
Requires Rust 1.88+ (and protoc for Hedera support):
# Clone the repo
git clone https://github.com/gdgiangi/nodalync-protocol.git
cd nodalync-protocol
# Build release binary with Hedera support (default, requires protoc)
cargo build --release -p nodalync-cli
# Or build without Hedera support (smaller binary)
cargo build --release -p nodalync-cli --no-default-features
# Add to PATH (no sudo needed)
export PATH="$PWD/target/release:$PATH"
# Or install system-wide
sudo cp target/release/nodalync /usr/local/bin/
Pre-built binaries are also available at Releases.
Step 1: Initialize Your Identity
Set a password and initialize your node identity:
export NODALYNC_PASSWORD=your-secure-password
nodalync init
This will:
- Generate an Ed25519 keypair (your identity)
- Create a default configuration file (connects to bootstrap nodes automatically)
- Set up local storage (SQLite database, content directory)
Note:
`init` fails if an identity already exists. To reinitialize, delete your data directory first (see Troubleshooting below for paths) or use `nodalync init --wizard` in an interactive terminal to auto-reinitialize.
For an interactive experience that lets you configure network settings, pricing, and settlement mode step by step, use the wizard:
nodalync init --wizard
Check your identity:
nodalync whoami
Step 2: Start Your Node
Foreground mode (see logs, Ctrl+C to stop):
nodalync start
Background mode (daemon):
nodalync start --daemon
nodalync status # Check status
nodalync stop # Stop the node
Your node will automatically:
- Connect to the bootstrap node
- Discover other peers via DHT
- Start serving your published content
Step 3: Publish Content
Share knowledge on the network:
# Publish a file with default settings
nodalync publish my-research.md
# Publish with custom price and metadata
nodalync publish my-research.md \
--price 0.01 \
--title "My Research Paper" \
--visibility shared
Visibility levels:
- `private` — Local only, never shared
- `unlisted` — Available if someone knows the hash
- `shared` — Announced to network (default)
List your published content:
nodalync list
Step 4: Search & Query Content
Search the network:
Search matches content titles, descriptions, and tags (not body text):
# Search local content by title/description/tags
nodalync search "research"
# Search entire network
nodalync search "research" --all
Preview content (free, shows metadata only):
nodalync preview <content-hash>
Query full content (paid):
nodalync query <content-hash>
# Save to file
nodalync query <content-hash> --output result.txt
Step 5: Check Your Earnings
When others query your content, you earn HBAR:
# View balance
nodalync balance
# View earnings breakdown
nodalync earnings
# Force settlement (batch payments on-chain)
nodalync settle
Claude / MCP Integration
Connect Claude to your Nodalync node for AI-powered knowledge queries.
Start the MCP Server
Basic (local content only):
nodalync mcp-server \
--budget 1.0 \
--auto-approve 0.01
With network search:
nodalync mcp-server \
--budget 1.0 \
--auto-approve 0.01 \
--enable-network
With Hedera settlement (testnet):
nodalync mcp-server \
--budget 1.0 \
--auto-approve 0.01 \
--enable-network \
--hedera-account-id 0.0.XXXXX \
--hedera-private-key ~/.nodalync/hedera.key \
--hedera-contract-id 0.0.7729011 \
--hedera-network testnet
Options:
- `--budget` — Maximum HBAR for this session (default: 1.0)
- `--auto-approve` — Auto-approve queries below this price (default: 0.01)
- `--enable-network` — Search network peers, not just local content
- `--hedera-account-id` — Your Hedera account ID for settlement
- `--hedera-private-key` — Path to your Hedera private key file
- `--hedera-contract-id` — Settlement contract ID (default: 0.0.7729011)
- `--hedera-network` — Network to use: testnet, mainnet, or previewnet
Configure Claude Desktop
Add to your Claude Desktop config (~/.config/claude/mcp.json or similar):
{
"mcpServers": {
"nodalync": {
"command": "nodalync",
"args": ["mcp-server", "--budget", "1.0", "--auto-approve", "0.01", "--enable-network"],
"env": {
"NODALYNC_PASSWORD": "your-secure-password",
"NODALYNC_HEDERA_ACCOUNT_ID": "0.0.7703962",
"NODALYNC_HEDERA_CONTRACT_ID": "0.0.7729011",
"NODALYNC_HEDERA_KEY_PATH": "/Users/you/.nodalync/hedera.key"
}
}
}
}
Note: The private key must be DER-encoded ECDSA format (98 hex characters starting with 303002...).
MCP Tools
When the MCP server is running, AI agents have access to these tools:
| Tool | Description |
|---|---|
| `query_knowledge` | Query content by hash or natural language (paid) |
| `list_sources` | Browse available content with metadata |
| `search_network` | Search connected peers for content (requires `--enable-network`) |
| `preview_content` | View content metadata without paying |
| `publish_content` | Publish new content from the agent |
| `synthesize_content` | Create L3 synthesis from multiple sources |
| `update_content` | Create a new version of existing content |
| `delete_content` | Delete content and set visibility to offline |
| `set_visibility` | Change content visibility |
| `list_versions` | List all versions of a content item |
| `get_earnings` | View earnings breakdown by content |
| `status` | Node health, budget, channels, and Hedera status |
| `deposit_hbar` | Deposit HBAR to the settlement contract |
| `open_channel` | Open a payment channel with a peer |
| `close_channel` | Close a payment channel |
| `close_all_channels` | Close all open payment channels |
Local Multi-Node Testing
Test the full publish-query-payment flow across three local nodes using Docker.
Prerequisites: Docker, Docker Compose, and jq (for make test) installed.
Quick Version
# 1. Build the Docker image first (from repo root — this must complete before init)
docker compose build
# 2. Initialize node identities and configs (uses the image you just built)
cd infra/local && make init
# 3. Start the 3-node cluster
make up
# 4. Run the end-to-end test (publish on node1, query from node3)
make test
Important: Step 1 (docker compose build) must complete before Step 2 (make init),
because make init runs docker run to generate identities using the built image.
What This Creates
| Container | Role | Host Port | Internal IP |
|---|---|---|---|
| `nodalync-node1` | Bootstrap / seed node | 9001, 8081 | 172.28.0.10 |
| `nodalync-node2` | Alice (publisher) | 9002, 8082 | 172.28.0.11 |
| `nodalync-node3` | Bob (querier) | 9003, 8083 | 172.28.0.12 |
All nodes use the password testpassword and form a full-mesh via libp2p.
Manual Interaction
# Run any CLI command on a specific node
docker exec -e NODALYNC_PASSWORD=testpassword nodalync-node1 nodalync status
docker exec -e NODALYNC_PASSWORD=testpassword nodalync-node2 nodalync list
# Open a shell inside a node
cd infra/local && make shell-node1
# Publish test content on node1
make publish-test
# View logs
make logs
# Stop the cluster
make down
# Full reset (remove data + reinitialize)
make reset
Available Makefile Targets
Run cd infra/local && make help to see all targets:
| Target | Description |
|---|---|
| `make init` | Generate node identities and configs (required first) |
| `make up` | Start the 3-node cluster |
| `make down` | Stop the cluster |
| `make logs` | Follow cluster logs |
| `make status` | Show cluster status and peer IDs |
| `make test` | Run E2E tests (publish, propagate, query) |
| `make clean` | Remove containers, volumes, and generated configs |
| `make reset` | Clean + init (fresh start) |
| `make shell-node1` | Open shell in node1 (also `node2`, `node3`) |
| `make publish-test` | Publish test content on node1 |
Two Docker Compose Files
There are two docker-compose.yml files in this repo, each for a different purpose:
| File | Location | Used by | Service names | When to use |
|---|---|---|---|---|
| `docker-compose.yml` | Repo root | `docker compose build` | `node-bootstrap`, `node-alice`, `node-bob` | Building the image and custom setups |
| `docker-compose.yml` | `infra/local/` | `make up`/`down`/`logs` | `node1`, `node2`, `node3` | Standard 3-node testing via Makefile |
Typical workflow: Run docker compose build from the repo root to build the image, then use cd infra/local && make init && make up for the standard 3-node cluster. The Makefile targets use the infra/local/docker-compose.yml internally.
The root docker-compose.yml is useful if you want to customize the cluster (add more nodes, change ports, or integrate with other services). It references infra/local/ for data and configs, so you still need make init first.
Warning: Do not run both compose files at the same time. They use overlapping container names and ports, so running one while the other is active will cause conflicts. Use make down (or docker compose down from the root) to stop one before starting the other.
Manual Two-Node Testing (No Docker)
Test the full publish-query-payment flow between two local nodes without Docker. This requires two terminal windows and uses separate data directories for each node.
1. Set Up Two Identities
# Terminal A — Node A (publisher)
export NODALYNC_PASSWORD=testpassword
export NODALYNC_DATA_DIR=/tmp/nodalync-node-a
nodalync init
# Terminal B — Node B (querier)
export NODALYNC_PASSWORD=testpassword
export NODALYNC_DATA_DIR=/tmp/nodalync-node-b
nodalync init
2. Get Node A’s Peer ID and Start It
# Terminal A
nodalync whoami
# Note the libp2p PeerId (12D3KooW...)
nodalync start
# Note the listening address, e.g. /ip4/127.0.0.1/tcp/9000
3. Configure Node B to Bootstrap from Node A
Edit Node B’s config to point at Node A as a bootstrap peer:
# Terminal B — edit the config file
# The config is at $NODALYNC_DATA_DIR/config.toml
In config.toml, set the bootstrap node to Node A’s address:
[network]
bootstrap_nodes = [
"/ip4/127.0.0.1/tcp/9000/p2p/<NODE_A_PEER_ID>"
]
listen_addresses = ["/ip4/0.0.0.0/tcp/9001"]
Replace <NODE_A_PEER_ID> with the libp2p PeerId from step 2. Note the different listen port (9001) to avoid conflicts.
4. Publish Content on Node A
# Terminal A (keep the node running in another terminal, or use --daemon)
# If running in foreground, open a third terminal with the same env vars:
export NODALYNC_PASSWORD=testpassword
export NODALYNC_DATA_DIR=/tmp/nodalync-node-a
echo "This is test knowledge content for the Nodalync network." > /tmp/test-content.txt
nodalync publish /tmp/test-content.txt --price 0.01 --title "Test Content"
# Note the content hash from the output
5. Start Node B and Search
# Terminal B
nodalync start
# Wait a few seconds for peer discovery
# In another terminal with Node B's env:
export NODALYNC_PASSWORD=testpassword
export NODALYNC_DATA_DIR=/tmp/nodalync-node-b
nodalync search "Test Content" --all
6. Open a Payment Channel (Requires Hedera)
If you have Hedera testnet credentials configured:
# Terminal B — open a channel to Node A
nodalync open-channel <NODE_A_PEER_ID> --deposit 100
7. Query Content and Verify Payment
# Terminal B — query the content (paid)
nodalync query <CONTENT_HASH>
# Verify payment on both sides
# Terminal A:
nodalync earnings
# Terminal B:
nodalync balance
8. Clean Up
# Stop both nodes (Ctrl+C if foreground, or nodalync stop if daemon)
rm -rf /tmp/nodalync-node-a /tmp/nodalync-node-b /tmp/test-content.txt
Environment Variables
The easiest way to configure Hedera is to use the .env file in the repo root:
# Export all variables from .env
set -a && source .env && set +a
| Variable | Description |
|---|---|
| `NODALYNC_PASSWORD` | Identity encryption password |
| `NODALYNC_DATA_DIR` | Data directory (default: platform-specific, see note below) |
| `RUST_LOG` | Log level (e.g., `nodalync=debug`) |
| `HEDERA_ACCOUNT_ID` | Hedera account ID (e.g., 0.0.7703962) |
| `HEDERA_CONTRACT_ID` | Settlement contract ID (default: 0.0.7729011) |
| `HEDERA_PRIVATE_KEY` | DER-encoded ECDSA private key as inline hex string (see note below) |
Note: The variables above (HEDERA_*) are read by the CLI settlement path. The MCP server subcommand reads NODALYNC_HEDERA_* prefixed variants (e.g., NODALYNC_HEDERA_ACCOUNT_ID). See nodalync mcp-server --help.
Hedera Private Key Format
IMPORTANT: HEDERA_PRIVATE_KEY is an inline hex string (not a file path). Smart contract operations require ECDSA keys with DER encoding.
| Format | Length | Example Prefix | Works? |
|---|---|---|---|
| DER-encoded ECDSA | 98 hex chars | 3030020100300706052b8104000a04220420... | Yes |
| Raw hex (Ed25519) | 64 hex chars | d21f3bfe69929b1d6e0f37fa9622b96f... | No |
If you have a raw hex key, you need to DER-encode it. Check your account type at HashScan:
https://hashscan.io/testnet/account/<account_id>
To create a Hedera testnet account, visit the Hedera Portal.
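Based on the format table above, a quick sanity check on a key string can catch the raw-hex case before a settlement call fails. This is an illustrative helper, not part of the CLI; the function name is an assumption:

```rust
// Heuristic check based on the format table above: DER-encoded ECDSA keys
// are 98 hex characters starting with "3030"; raw 64-character hex keys
// (typically Ed25519) will NOT work for smart contract operations.
// Illustrative helper only — not part of the nodalync CLI.
fn looks_like_der_ecdsa(key_hex: &str) -> bool {
    key_hex.len() == 98
        && key_hex.starts_with("3030")
        && key_hex.chars().all(|c| c.is_ascii_hexdigit())
}

fn main() {
    // Pad the documented prefix out to the full 98-character length.
    let prefix = "3030020100300706052b8104000a04220420";
    let der = format!("{prefix}{}", "a".repeat(98 - prefix.len()));
    assert!(looks_like_der_ecdsa(&der));

    // A raw 64-character hex key fails the check.
    assert!(!looks_like_der_ecdsa(&"d".repeat(64)));
}
```

If the check fails on your key, verify the account's key type on HashScan before attempting DER conversion.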
CLI vs MCP Key Formats
The CLI and MCP server handle Hedera private keys differently:
| Context | Variable / Flag | Value Type |
|---|---|---|
| CLI settlement | HEDERA_PRIVATE_KEY env var | Inline hex string (DER-encoded ECDSA) |
| MCP server | --hedera-private-key flag or NODALYNC_HEDERA_KEY_PATH env var | File path to a key file on disk |
If using the MCP server, write your key to a file first and pass the path, rather than the inline hex value.
Auto-Deposit (Payment Channels)
When running a node that serves paid content, you need HBAR deposited to the settlement contract before you can accept payment channels from other peers.
MIGRATION (v0.8.x):
`auto_deposit` now defaults to `false` for security. To restore the previous behavior, explicitly set `auto_deposit = true` in your config.
How It Works
When auto-deposit is enabled, your node will automatically:
- On startup: Check if the contract balance is below the minimum (default: 100 HBAR), and deposit if needed (default: 200 HBAR)
- On channel acceptance: When a peer tries to open a channel with you, auto-deposit if balance is insufficient and cooldown has elapsed
Configuration
Configure auto-deposit behavior in your config.toml (in your data directory, see Troubleshooting section below for paths):
[settlement]
# Enable auto-deposit (default: false — opt-in for security)
auto_deposit = true
# Minimum balance to maintain in contract (in HBAR)
min_contract_balance_hbar = 100.0
# Amount to deposit when auto-deposit triggers (in HBAR)
auto_deposit_amount_hbar = 200.0
# Maximum deposit to accept/match per channel (in HBAR)
# Caps how much you'll commit when a peer opens a channel with you
max_accept_deposit_hbar = 500.0
Security Notes
- Deposit cap: The `max_accept_deposit_hbar` setting limits how much you'll commit per channel, regardless of what the peer requests
- Cooldown: Auto-deposits are rate-limited (5-minute cooldown by default) to prevent spam-triggered deposits
- Fixed amount: Auto-deposit always uses the configured amount, never an amount derived from the peer’s request
- Cooldown resets on restart: The cooldown timer doesn’t persist across node restarts. The startup auto-deposit check handles the post-restart case separately.
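The decision logic described above can be sketched as follows. The type and function names are assumptions for illustration, not the actual `nodalync-settle` API:

```rust
// Sketch of the auto-deposit decision: deposit a fixed configured amount
// only when auto-deposit is enabled, the cooldown has elapsed, and the
// contract balance has fallen below the configured minimum.
// Names and types here are illustrative, not the nodalync-settle API.
struct SettlementConfig {
    auto_deposit: bool,
    min_contract_balance_hbar: f64,
    auto_deposit_amount_hbar: f64,
}

fn auto_deposit_amount(
    cfg: &SettlementConfig,
    contract_balance_hbar: f64,
    cooldown_elapsed: bool,
) -> Option<f64> {
    if cfg.auto_deposit && cooldown_elapsed && contract_balance_hbar < cfg.min_contract_balance_hbar {
        // Always the fixed configured amount — never derived from a peer's request.
        Some(cfg.auto_deposit_amount_hbar)
    } else {
        None
    }
}

fn main() {
    let cfg = SettlementConfig {
        auto_deposit: true,
        min_contract_balance_hbar: 100.0,
        auto_deposit_amount_hbar: 200.0,
    };
    assert_eq!(auto_deposit_amount(&cfg, 40.0, true), Some(200.0)); // low balance: deposit
    assert_eq!(auto_deposit_amount(&cfg, 150.0, true), None);       // balance OK: skip
    assert_eq!(auto_deposit_amount(&cfg, 40.0, false), None);       // cooldown active: skip
}
```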
Manual Control
To disable auto-deposit entirely:
[settlement]
auto_deposit = false
Then manually deposit as needed:
nodalync deposit 200
Common Commands Reference
Identity & Node
| Command | Description |
|---|---|
| `nodalync init` | Set up identity and config (add `--wizard` for interactive setup) |
| `nodalync whoami` | Show your identity |
| `nodalync start` | Start node (foreground) |
| `nodalync start --daemon` | Start node (background) |
| `nodalync status` | Show node status |
| `nodalync stop` | Stop daemon |
| `nodalync completions <shell>` | Generate shell completions (bash, zsh, fish, powershell) |
Content
| Command | Description |
|---|---|
| `nodalync publish <file> [--price <hbar>] [--title "..."]` | Publish content |
| `nodalync update <hash> <file>` | Create a new version of content |
| `nodalync delete <hash>` | Delete local content |
| `nodalync visibility <hash> --level <level>` | Change content visibility |
| `nodalync versions <hash>` | Show version history |
| `nodalync list` | List your content |
| `nodalync search <query> [--all]` | Search content (matches title/description/tags) |
| `nodalync preview <hash>` | View metadata (free) |
| `nodalync query <hash>` | Get full content (paid) |
Synthesis
| Command | Description |
|---|---|
| `nodalync synthesize --sources <h1>,<h2> --output <file>` | Create L3 synthesis |
| `nodalync build-l2 <hash1> <hash2> ...` | Build L2 entity graph from L1 sources |
| `nodalync merge-l2 <graph1> <graph2> ...` | Merge L2 entity graphs |
| `nodalync reference <hash>` | Reference external L3 as L0 source |
Economics & Channels
| Command | Description |
|---|---|
| `nodalync balance` | Check HBAR balance |
| `nodalync earnings` | View earnings breakdown |
| `nodalync deposit <amount>` | Deposit HBAR to protocol balance |
| `nodalync withdraw <amount>` | Withdraw HBAR from protocol balance |
| `nodalync settle` | Force settlement of pending payments |
| `nodalync open-channel <peer-id> --deposit <amount>` | Open payment channel (min 100 HBAR) |
| `nodalync close-channel <peer-id>` | Close payment channel (cooperative) |
| `nodalync dispute-channel <peer-id>` | Initiate dispute close (24h waiting period) |
| `nodalync resolve-dispute <peer-id>` | Resolve dispute after waiting period |
| `nodalync list-channels` | List all payment channels |
MCP
| Command | Description |
|---|---|
| `nodalync mcp-server` | Start MCP server for AI agents |
Bootstrap Node
Your node connects to this bootstrap node by default:
/dns4/nodalync-bootstrap.eastus.azurecontainer.io/tcp/9000/p2p/12D3KooWMqrUmZm4e1BJTRMWqKHCe1TSX9Vu83uJLEyCGr2dUjYm
Health check: http://nodalync-bootstrap.eastus.azurecontainer.io:8080/health
Troubleshooting
Default data directory:
The data directory varies by platform unless you set NODALYNC_DATA_DIR:
- macOS: `~/Library/Application Support/io.nodalync.nodalync/`
- Linux: `~/.local/share/nodalync/` (or `$XDG_DATA_HOME/nodalync/`)
- Windows: `%APPDATA%\nodalync\nodalync\`
Set NODALYNC_DATA_DIR to override: export NODALYNC_DATA_DIR=~/.nodalync
Node won’t start:
# Check if already running
nodalync status
# View logs (path shown when starting daemon)
cat ~/Library/Application\ Support/io.nodalync.nodalync/node.stderr.log # macOS
cat ~/.local/share/nodalync/node.stderr.log # Linux
Can’t connect to peers:
# Verify bootstrap node is reachable
curl http://nodalync-bootstrap.eastus.azurecontainer.io:8080/health
# Check your firewall allows TCP 9000
Reset everything:
# Remove data directory (check your platform above, or use your NODALYNC_DATA_DIR)
rm -rf ~/Library/Application\ Support/io.nodalync.nodalync/ # macOS
# rm -rf ~/.local/share/nodalync/ # Linux
nodalync init --wizard
Next Steps
- Read the Protocol Spec to understand how Nodalync works
- Explore the Architecture for module details
- Check the FAQ for common questions
- Join the Discord community!
Nodalync Protocol: Frequently Asked Questions
This document addresses common questions and concerns about the Nodalync protocol design.
Status Legend:
- Designed — Addressed in protocol/implementation
- Gap — Known limitation, not yet addressed
- Deferred — Planned for future work
- Out of Scope — Intentionally not part of the protocol
- Known Limitation — Acknowledged tradeoff
Economic & Incentive Questions
1. Can People Game the System with Low-Effort Contributions?
| Status | Designed |
|---|---|
Concern: Users might add useless or trivial content just to insert themselves into provenance chains and collect unearned payments.
Answer: The protocol’s economic design makes this strategy unprofitable.
Revenue only flows when content is queried. Creating thousands of low-value nodes generates zero income because no one will query them. The market determines value through actual usage, not mere existence in the system.
From the whitepaper (Section 10.2 - Attribution Gaming):
“Revenue distributes only when content is queried. Creating thousands of unused nodes generates no income. The market determines value through actual queries.”
You cannot insert yourself into someone else’s provenance chain. Provenance chains are cryptographically computed when content is created. To be in someone’s root_L0L1[] array, your content must have been:
- Queried and paid for by the creator
- Used as a source in their derivation
The spec (Section 9.3) enforces that:
“All entries in derived_from MUST have been queried by creator”
This means you can’t retroactively attach yourself to successful content. The only way to earn is to create content valuable enough that others choose to query it and build upon it.
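The enforcement rule quoted above can be sketched as a simple validation pass. The types and names below are illustrative only; the actual `nodalync-valid` API may differ:

```rust
// Sketch of the spec §9.3 rule: every entry in derived_from MUST have been
// queried (and paid for) by the creator. Without a payment proof, a source
// cannot appear in the provenance chain, so nobody can be retroactively
// attached to successful content. Types here are illustrative.
use std::collections::HashSet;

fn validate_derivation(
    derived_from: &[&str],
    payment_proofs: &HashSet<&str>, // hashes the creator holds payment proofs for
) -> Result<(), String> {
    for source in derived_from {
        if !payment_proofs.contains(source) {
            return Err(format!("no payment proof for source {source}"));
        }
    }
    Ok(())
}

fn main() {
    let proofs: HashSet<&str> = ["hash_a", "hash_b"].into_iter().collect();
    // Sources the creator actually queried validate fine.
    assert!(validate_derivation(&["hash_a", "hash_b"], &proofs).is_ok());
    // An unqueried source is rejected.
    assert!(validate_derivation(&["hash_c"], &proofs).is_err());
}
```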
Synthesizers who don’t contribute foundational work earn only 5%. The protocol intentionally rewards original contribution over mere reorganization. A “pure synthesizer” using entirely others’ sources receives only the 5% synthesis fee—this is by design.
2. Will the Platform Get Flooded with Low-Quality Content?
| Status | Designed |
|---|---|
Concern: Since uploading L0 content can lead to long-term payouts, people might spam the network with low-quality or copied content, burying valuable material.
Answer: Several mechanisms prevent spam from being profitable or visible.
Protocol-level mechanisms:
- Pricing as filter — spam is unprofitable if nobody queries it
- Rate limiting — configurable per peer/content hash via `AccessControl`
- Payment bonds — `require_bond: bool` in `AccessControl` can require deposits
- Reputation — `PeerInfo.reputation: int64` tracked per peer
- Allowlist/denylist — per-content access control
Discovery is application-layer: The protocol itself doesn’t include search. Discovery occurs through application-layer indexes (search engines, directories, AI agents) that can implement their own quality filtering, reputation systems, and relevance ranking. From the spec (Section 1.4):
“Content discovery/search… Applications index L1 previews and build search UX” “Content moderation — policy decisions for specific communities/jurisdictions” [Out of scope]
L1 previews enable informed decisions: Before paying for content, users see the L1 summary (extracted mentions, topics, preview). This free preview layer helps users evaluate relevance without payment, making it easy to skip low-quality content.
Philosophy: Bad content doesn’t get queried, therefore doesn’t earn. Applications build quality filters on top.
3. Can Someone Game L3 Provenance by Citing Sources They Don’t Actually Use?
| Status | Known Limitation |
|---|---|
Concern: Someone could query prestigious sources, claim them in their L3 provenance, but write completely unrelated content—essentially “name-dropping” for credibility.
Answer: This is a real concern. The protocol guarantees cryptographic provenance but not intellectual honesty.
What the protocol guarantees:
- Cryptographic provenance chain exists
- You can only claim sources you actually queried and paid for
- Payment proof exists for every claimed source
What it does NOT guarantee:
- That the L3 content actually uses the claimed sources intellectually
- That the L3 is “good” or “honest” synthesis
Why the attack is limited:
- They still have to pay for every source they claim
- If their L3 is garbage, nobody queries it—no revenue
- The sources don’t “endorse” the L3—provenance just means “this L3 paid for access to these sources”
Potential future mitigations:
- Semantic similarity checking between L3 and claimed sources (application layer)
- Reputation for L3 creators based on downstream utility
- ZK proofs for content derivation (mentioned in spec §13.4)
4. Does the Protocol Pay Equal Amounts for Unequal Work?
| Status | Designed |
|---|---|
Concern: The system pays everyone the same share per root entry, whether someone contributed a single sentence or comprehensive research. This seems unfair.
Answer: This is a deliberate design choice, not an oversight.
Each root entry represents a discrete contribution that was valuable enough to be used. If your single sentence was included in someone’s L3, that means they queried it, paid for it, and found it valuable enough to derive from. The protocol doesn’t judge contribution size—the market does.
The weighting system handles contribution frequency: From the spec (Section 4.5):
“When the same source appears multiple times in a provenance chain (through different derivation paths), it receives proportionally more: a source contributing twice receives twice the share.”
Quality is priced at the source: Content owners set their own prices. Comprehensive, high-quality research can be priced higher than trivial observations.
The alternative (contribution-weighted shares) creates worse problems:
- Who decides what contribution is “worth more”? This requires subjective judgment.
- Gaming becomes easier if you can inflate perceived contribution size.
- Equal weighting is objective and trustless—a hash is either in the provenance chain or it isn’t.
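The equal-weighting rule described above can be sketched in a few lines. This is a minimal illustration, assuming a 95/5 split in integer tinybar with equal shares per root entry; the function name, units, and rounding behavior are assumptions, not the `nodalync-econ` implementation:

```rust
// Illustrative sketch of the 95/5 split with equal weighting per root entry.
// A source appearing twice in the provenance chain (via different derivation
// paths) accumulates two shares — "a hash is either in the chain or it isn't".
// Assumes at least one root entry; integer division leaves rounding dust.
use std::collections::HashMap;

fn distribute(total_tinybar: u64, root_entries: &[&str]) -> (u64, HashMap<String, u64>) {
    let synth_share = total_tinybar * 5 / 100;        // 5% to the synthesizer
    let pool = total_tinybar - synth_share;           // 95% to root contributors
    let per_entry = pool / root_entries.len() as u64; // equal share per entry
    let mut payouts: HashMap<String, u64> = HashMap::new();
    for hash in root_entries {
        *payouts.entry(hash.to_string()).or_insert(0) += per_entry;
    }
    (synth_share, payouts)
}

fn main() {
    // Source "a" appears twice (two derivation paths), "b" once.
    let (synth, payouts) = distribute(1_000_000, &["a", "b", "a"]);
    assert_eq!(synth, 50_000);          // 5% synthesis fee
    assert_eq!(payouts["a"], 633_332);  // two equal shares
    assert_eq!(payouts["b"], 316_666);  // one equal share
}
```

No subjective weighting enters the computation: payout follows only from how many times a hash appears in the chain.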
5. Will Early Users Lock In Permanent Advantages?
| Status | Designed (Intentional) |
|---|---|
Concern: Those who publish first in any topic could lock in lifelong royalties, making it hard for newcomers to compete.
Answer: You’re correct that this is largely unavoidable—and it’s intentional.
This is a feature, not a bug. The protocol’s explicit goal is to reward foundational contributors perpetually. From the whitepaper abstract:
“A researcher can publish valuable findings once and receive perpetual royalties as the ecosystem builds upon their work.”
Factors that prevent an impenetrable moat:
- New contributions create new chains: If you publish novel research, you create new provenance chains. Later contributors building on YOUR work include YOU in their chains.
- Quality and relevance matter: Early publication doesn’t guarantee usage. Superior later work will be preferred by synthesizers.
- Versioning supports improvement: The spec supports content versioning (Section 4.3). Updated versions can be published.
The alternative is worse: Systems that DON’T reward early contributors (like current academic publishing) create no economic incentive for foundational research at all.
6. Can Content Be Reused Forever After a Single Payment?
| Status | Designed |
|---|---|
Concern: Once someone pays for content, they can cache and reuse it infinitely without paying again. Creators only get paid once.
Answer: This is accurate and intentional.
You’re paying for access, not per-read: Like buying a book, the initial query gives you the content. Rereading your own copy doesn’t generate new payments.
New queries DO trigger new payments: From the whitepaper (Section 5.1):
“Subsequent queries to the same node (for updated information or different query parameters) trigger new payments.”
Derivation requires payment: Creating new L3 content that derives from cached sources still requires having queried (and paid for) each source at least once. From the spec (Section 7.1.5):
“All sources have been queried (payment proof exists)”
The value is in the provenance chain: When you use cached content to create an L3, and others query YOUR L3, revenue flows back through the entire provenance chain to original creators.
Unlimited re-reads would break usability: If every re-read required payment, the system would be unusable for research or synthesis work.
7. How Do Creators Know What to Charge?
| Status | Gap |
|---|---|
Concern: Without pricing guidance, creators may set inefficient prices.
Answer: The spec explicitly treats pricing as a market function:
“Pricing recommendations — market dynamics emerge from application-layer analytics”
What exists:
- `Economics` struct tracks `total_queries` and `total_revenue` per content
- This data is visible to anyone indexing the DHT
What’s missing:
- No pricing suggestions in the protocol
- Could be built as an application: “content similar to yours earns X HBAR/query on average”
- Initial testing uses tiny prices (0.001 HBAR per query) to prove the flow
Unlike prior data marketplaces that failed by trying to solve pricing algorithmically, Nodalync treats price discovery as a market function rather than a protocol function.
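The application-layer analytic suggested above ("content similar to yours earns X HBAR/query on average") could be computed directly from the public `Economics` fields. The aggregation function below is hypothetical; only the `total_queries`/`total_revenue` field names come from the spec:

```rust
// Sketch of an application-layer pricing analytic built on the publicly
// indexable Economics data (total_queries, total_revenue per content).
// The struct layout, unit (assumed tinybar), and function are illustrative.
struct Economics {
    total_queries: u64,
    total_revenue: u64, // assumed tinybar
}

/// Average revenue per query across a set of comparable content items.
fn avg_revenue_per_query(comparable: &[Economics]) -> Option<u64> {
    let queries: u64 = comparable.iter().map(|e| e.total_queries).sum();
    let revenue: u64 = comparable.iter().map(|e| e.total_revenue).sum();
    if queries == 0 { None } else { Some(revenue / queries) }
}

fn main() {
    let comparable = vec![
        Economics { total_queries: 120, total_revenue: 600_000 },
        Economics { total_queries: 80, total_revenue: 200_000 },
    ];
    // 800_000 tinybar over 200 queries → 4_000 tinybar/query on average.
    assert_eq!(avg_revenue_per_query(&comparable), Some(4_000));
}
```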
Technical Questions
8. How Does Discovery Work Without Knowing the Hash?
| Status | Designed |
|---|---|
Concern: If content is addressed by hash, how do users find content they don’t already know about?
Answer: Discovery is an application-layer concern, with protocol primitives to support it.
Application developers can:
┌─────────────────────────────────────────────────────────────┐
│ SEARCH ENGINES │
│ - Subscribe to ANNOUNCE broadcasts on DHT │
│ - Fetch free PREVIEW for all shared content │
│ - Index L1 summaries, tags, content types │
│ - Build relevance ranking from total_queries, reputation │
│ - Return content hashes → users query through protocol │
└─────────────────────────────────────────────────────────────┘
The MCP server has a list_sources tool that shows available content with title, price, preview text, and topics.
Current gap: MCP doesn’t support natural language search yet—use list_sources to discover hashes. Full-text search would be an application-layer index.
9. How Is L1 Extraction Done — Manual or AI?
| Status |
|---|
| Designed |
Concern: How are atomic facts (L1 mentions) extracted from L0 documents?
Answer: Currently rule-based, with plugin architecture for AI extractors.
Current implementation: Rule-based NLP
pub trait L1Extractor {
    fn extract(&self, content: &[u8], mime_type: Option<&str>) -> Result<Vec<Mention>>;
}

/// Rule-based extractor for MVP
pub struct RuleBasedExtractor;
It splits text into sentences, does basic classification (Claim, Statistic, Definition, etc.), and extracts entities (capitalized words).
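A minimal sketch of that kind of rule-based pass. The actual rules of `RuleBasedExtractor` are not published here, so the heuristics below (digits or a percent sign suggest a Statistic, "is defined as" suggests a Definition, capitalized words are entity candidates) are illustrative assumptions:

```rust
#[derive(Debug, PartialEq)]
enum Class { Claim, Statistic, Definition }

// Assumed heuristics, not the real extractor's rules.
fn classify(sentence: &str) -> Class {
    if sentence.contains('%') || sentence.chars().any(|c| c.is_ascii_digit()) {
        Class::Statistic
    } else if sentence.contains("is defined as") {
        Class::Definition
    } else {
        Class::Claim
    }
}

/// Entity candidates: capitalized words, skipping the sentence-initial word.
fn entities(sentence: &str) -> Vec<String> {
    sentence
        .split_whitespace()
        .skip(1)
        .filter(|w| w.chars().next().map_or(false, |c| c.is_uppercase()))
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_string())
        .collect()
}
```

Heuristics like these are cheap and deterministic, which is why they suit an MVP; the plugin trait below exists so better extractors can replace them.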
Future design: Plugin architecture for AI-powered extractors:
pub trait L1ExtractorPlugin: Send + Sync {
    fn name(&self) -> &str;
    fn supported_mime_types(&self) -> Vec<&str>;
    fn extract(&self, content: &[u8], mime_type: &str) -> Result<Vec<Mention>>;
}
Quality enforcement: The spec says “AI extraction quality — pluggable extractors; quality is a market signal.” If your L1s are garbage, nobody queries them, you earn nothing.
10. What About the Cold Start / Chicken-and-Egg Problem?
| Status |
|---|
| Designed |
Concern: The network needs content to be valuable, but creators won’t publish until there’s demand.
Answer: The spec explicitly acknowledges this is an application-layer concern, not a protocol concern. The protocol provides primitives; bootstrap is left to implementations.
Practical solutions in the design:
- L1 previews are free — anyone can browse without paying
- Discovery through DHT ANNOUNCE broadcasts — search engines can subscribe and index
- Initial plan: Seed with own content first (spec, whitepaper, technical docs), then dogfood with Claude
- The MCP server lets AI agents query immediately — if even 1 person has good content, an AI can use it
Gap: No automated discovery UX yet. Intentional—prove the economics first, then build the search layer.
11. How Does the Protocol Scale to Millions of Nodes?
| Status |
|---|
| Designed (Untested at Scale) |
Concern: Can the system handle large-scale adoption?
Answer: DHT design from spec §11:
DHT: Kademlia
- Key space: 256-bit (SHA-256)
- Bucket size: 20
- Alpha (parallelism): 3
- Replication factor: 20
Kademlia scales logarithmically—lookups are O(log n). IPFS uses the same approach and handles millions of nodes.
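A back-of-envelope sketch of what logarithmic scaling means in hops. This is only the coarse O(log₂ n) bound; real Kademlia lookups do better thanks to k-buckets (size 20) and parallel queries (alpha 3), which this estimate ignores:

```rust
/// Rough worst-case lookup hop count for a network of `n_nodes`,
/// using the plain log2(n) bound (actual Kademlia performs better).
fn worst_case_hops(n_nodes: u64) -> u32 {
    (n_nodes as f64).log2().ceil() as u32
}
```

Growing the network from a thousand to a million nodes only doubles the bound, which is why the same design carries IPFS at scale.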
Potential bottlenecks:
- Settlement batching—currently batches at 100 HBAR or 1 hour intervals
- GossipSub for announcements—needs tuning at scale
- Bootstrap node capacity
Current testing: Single node. Multi-node testing is a future priority.
12. What Are the Privacy Implications?
| Status |
|---|
| Known Concern |
Concern: Can others monitor what I’m querying through the DHT?
Answer: From spec §13.4:
| Visible to Network | Hidden from Network |
|---|---|
| Content hashes (not content) | Private content (entirely local) |
| L1 previews (for shared content) | Query text (between querier and node) |
| Provenance chains | Unlisted content (unless you have hash) |
| Payment amounts (in settlement batches) | |
Current state: Your query goes directly to the content owner—not routed through random peers. But DHT lookups (finding where content lives) are visible.
Future improvements (from spec):
- ZK proofs for provenance verification
- Private settlement channels
- Onion routing for query privacy
13. What If Hedera Fails? Is Multi-Chain Supported?
| Status |
|---|
| Abstracted |
Concern: The protocol is tied to Hedera. What happens if Hedera has issues?
Answer: Currently Hedera-specific, but abstracted behind a trait:
#[async_trait]
pub trait Settlement: Send + Sync {
    async fn settle_batch(&self, batch: &SettlementBatch) -> SettleResult<TransactionId>;
    async fn verify_settlement(&self, tx_id: &TransactionId) -> SettleResult<SettlementStatus>;
    async fn open_channel(&self, peer: &PeerId, deposit: u64) -> SettleResult<ChannelId>;
    async fn close_channel(&self, id: &ChannelId, /* ... */) -> SettleResult<TransactionId>;
    // ... deposit, withdraw, dispute, account mapping, etc.
}
Why Hedera was chosen:
- Fast finality (3-5 seconds)
- Low cost (~$0.0001/tx)
- High throughput (10,000+ TPS)
- Good for micropayment batching
Multi-chain possibility: The Settlement trait could have implementations for Solana, Arbitrum/Optimism (L2s), or even Bitcoin Lightning.
Current priority: Prove the model works on one chain first, then generalize.
14. What Token Does Nodalync Use?
| Status |
|---|
| Designed |
Concern: Is there an NDL or DNL token?
Answer: Neither. The protocol uses HBAR directly (Hedera’s native token)—no native token.
From spec §12.4:
- Eliminates token bootstrapping complexity
- Leverages existing HBAR liquidity and exchanges
- Avoids securities/regulatory concerns
- Allows focus on proving the knowledge economics model
All amounts are denominated in tinybars (10⁻⁸ HBAR).
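Keeping amounts in tinybars means all protocol arithmetic stays in `u64`, consistent with the spec's "no floating-point for amounts" encoding rule. A small conversion sketch (helper names are illustrative):

```rust
const TINYBARS_PER_HBAR: u64 = 100_000_000; // 10^8

fn hbar_to_tinybars(whole_hbar: u64, fractional_tinybars: u64) -> u64 {
    whole_hbar * TINYBARS_PER_HBAR + fractional_tinybars
}

/// Splits a tinybar amount into (whole HBAR, remaining tinybars).
fn tinybars_to_hbar_parts(tinybars: u64) -> (u64, u64) {
    (tinybars / TINYBARS_PER_HBAR, tinybars % TINYBARS_PER_HBAR)
}
```

For example, the 0.001 HBAR test price mentioned earlier is 100,000 tinybars.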
Practical & UX Questions
15. Could AI Tools Accidentally Spend Large Amounts?
| Status |
|---|
| Designed |
Concern: AI agents might fire off many queries rapidly, leading to unexpected bills.
Answer: This is addressed at the application layer.
Budget controls are application-layer responsibility: From the whitepaper (Section 7.2):
“Application-level concerns—budget controls, cost previews, spending limits, auto-approve settings—are outside protocol scope.”
The MCP server implementation includes budget tracking:
struct QueryInput {
    query: String,
    budget_hbar: f64,
}
Agents are configured with a session budget and cannot exceed it. When budget is exhausted, queries are rejected.
Cost preview before execution: The PREVIEW operation is free. Agents can check content price before querying.
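A sketch of the budget guard described above, as an application might implement it. The types and names here are illustrative, not the MCP server's actual implementation:

```rust
// Session budget tracked in tinybars, the protocol's integer unit.
struct SessionBudget {
    limit_tinybars: u64,
    spent_tinybars: u64,
}

impl SessionBudget {
    /// Reject any query whose price would push spending past the limit;
    /// the price can be checked first via the free PREVIEW operation.
    fn try_spend(&mut self, price: u64) -> Result<(), String> {
        let remaining = self.limit_tinybars - self.spent_tinybars;
        if price > remaining {
            return Err(format!(
                "query rejected: price {} exceeds remaining budget {}",
                price, remaining
            ));
        }
        self.spent_tinybars += price;
        Ok(())
    }
}
```

Because PREVIEW is free, an agent can always learn a price before committing, so the guard only ever rejects, never surprises.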
16. Can Stolen Content Enter the System?
| Status |
|---|
| Known Limitation |
Concern: People can upload material they don’t own, and the system has no built-in way to prevent profiting from stolen work.
Answer: The protocol cannot prevent unauthorized uploads at the entry point, but it provides strong deterrence and evidence mechanisms.
Timestamps provide priority evidence: From the whitepaper (Section 10.5):
“Timestamps record when content was published in-system. Earnings are fully visible and auditable. Evidence for legal recourse is built-in, not forensic.”
Audit trails document everything: Every query, every payment, every derivation is logged with cryptographic proof.
Republished content lacks provenance benefits:
“Republished content lacks provenance linkage to the original; the original has earlier timestamps providing evidence of priority.”
Future enhancement: Embedding similarity detection can flag potential copies at the application layer.
Practical advice for creators:
- Publish to Nodalync first to establish timestamped priority
- Also establish external prior art (arXiv, journal publication, etc.)
- The protocol itself can serve as a proof-of-creation layer
17. Is There a GUI or Only CLI?
| Status |
|---|
| CLI Only |
Concern: How do non-technical users interact with the protocol?
Answer: Currently CLI only.
What exists:
- `nodalync init` — setup
- `nodalync publish` — publish content
- `nodalync query` — query content
- `nodalync search <query>` — search local and network content
- `nodalync earnings` — view earnings breakdown
- `nodalync balance` — check protocol balance
- `nodalync mcp-server` — for Claude Desktop / AI agent integration
- 30+ commands total (run `nodalync --help` for the full list)
GUI/Web: Not in the current roadmap. Focus is proving economics work first. A web interface would be an application built on top of the protocol.
18. Can I Bulk Import Existing Content?
| Status |
|---|
| Deferred |
Concern: How do I migrate an existing knowledge base to Nodalync?
Answer: Not built yet. Current workflow is:
1. `nodalync init`
2. `nodalync publish <file>` one at a time
3. Extract L1 manually or with rule-based extractor
What would help:
- Directory scanner that publishes all files
- Watch mode for auto-publishing new content
- Integration with existing knowledge bases
Explicitly deferred—after core experience works.
19. Is There a Takedown Mechanism for Copyright?
| Status |
|---|
| Out of Scope |
Concern: How do copyright holders request content removal?
Answer: The spec explicitly says this is out of scope:
“Content moderation — policy decisions for specific communities/jurisdictions.” “Takedown mechanisms — legal/policy layer above protocol.”
The protocol is infrastructure: IPFS itself has no takedown mechanism, but Pinata (an application built on it) does, and the same division applies here.
Practical implications:
- Content is stored on the owner’s node (local-first)
- Removing your node removes your content
- No global “delete” because there’s no central storage
- DMCA-type requests would go to node operators, not the protocol
Summary Table
| Question | Status | Notes |
|---|---|---|
| Gaming with low-effort content | Designed | Revenue only flows on queries; can’t insert into others’ chains |
| Content flooding/spam | Designed | Market incentives + rate limits + app-layer filtering |
| L3 provenance gaming | Known Limitation | Payment proof required; content quality not verified |
| Equal pay for unequal work | Designed | Market prices quality; weighting handles frequency |
| Early mover advantage | Designed (Intentional) | Feature, not bug; quality still matters |
| Single payment caching | Designed | Standard for information goods; derivation still pays |
| Pricing guidance | Gap | Market signals exist, no recommendations yet |
| Discovery without hash | Designed | list_sources MCP tool; full search is app-layer |
| L1 extraction | Designed | Rule-based MVP, plugin architecture for AI |
| Cold start | Designed | Seed with own content first; prove economics |
| Scalability | Designed (Untested) | Kademlia DHT scales logarithmically |
| Privacy | Known Concern | DHT lookups visible; onion routing planned |
| Multi-chain | Abstracted | Trait exists; Hedera-only for now |
| Token name | Designed | HBAR (no native token) |
| AI runaway spending | Designed | Application-layer budgets; free previews |
| Stolen content | Known Limitation | Timestamps + audit trails for legal recourse |
| GUI | Gap | CLI only; GUI would be app-layer |
| Bulk import | Deferred | Not built yet |
| Takedowns | Out of Scope | Legal/policy layer above protocol |
Document Version: 2.0
Last Updated: January 2026
References: Nodalync Whitepaper, Protocol Specification v0.7.1
Contract: 0.0.7729011 (Hedera Testnet)
Nodalync Protocol Specification
Version: 0.7.1
Author: Gabriel Giangi
Date: January 2026
Status: Draft
Table of Contents
- Overview
- Notation and Conventions
- Cryptographic Primitives
- Data Structures
- Node State
- Message Types
- Protocol Operations
- State Transitions
- Validation Rules
- Economic Rules
- Network Layer
- Settlement Layer
- Security Considerations
1. Overview
1.1 Purpose
The Nodalync Protocol enables decentralized knowledge exchange with cryptographic provenance and automatic compensation. Participants publish knowledge (L0), extract atomic facts (L1), build entity graphs (L2), and synthesize insights (L3). Every query triggers payment distributed through the complete provenance chain to foundational contributors.
1.2 Design Principles
- Local-first: All content stored on owner’s node
- Decentralized: No central authority required
- Trustless: Cryptographic verification, not social trust
- Fair: 95% of value flows to foundational contributors
- Minimal: Protocol specifies only what’s necessary
1.3 Protocol Layers
┌─────────────────────────────────────────┐
│ Application Layer │ (Out of scope)
├─────────────────────────────────────────┤
│ Protocol Layer │ ← This specification
│ ┌─────────────────────────────────┐ │
│ │ Content Provenance Payment │ │
│ └─────────────────────────────────┘ │
├─────────────────────────────────────────┤
│ Network Layer (libp2p) │ (Referenced)
├─────────────────────────────────────────┤
│ Settlement Layer (Hedera) │ (Referenced)
└─────────────────────────────────────────┘
1.4 Scope
Nodalync is infrastructure, not an application. Just as Bitcoin provides trustless value transfer without building wallets, Nodalync provides trustless knowledge exchange without building search engines.
In Scope (this protocol specifies):
| Concern | What the protocol provides |
|---|---|
| Content addressing | Deterministic hashing, content types (L0-L3) |
| Provenance | Cryptographic chains linking derivatives to sources |
| Payment | Automatic 95/5 distribution through provenance chains |
| Transport | Message types, encoding, peer-to-peer delivery |
| Settlement | Payment channel state, batch settlement interface |
| Visibility | Private/unlisted/shared access control primitives |
Out of Scope (application layer):
| Concern | Why it’s out of scope |
|---|---|
| Content discovery/search | Applications index L1 previews and build search UX |
| Pricing recommendations | Market dynamics emerge from application-layer analytics |
| Content moderation | Policy decisions for specific communities/jurisdictions |
| User interfaces | Wallets, explorers, dashboards are applications |
| AI extraction quality | Pluggable extractors; quality is a market signal |
| Takedown mechanisms | Legal/policy layer above protocol |
1.5 Building on Nodalync
The protocol exposes primitives that enable rich applications:
Application developers can:
┌─────────────────────────────────────────────────────────────┐
│ SEARCH ENGINES │
│ - Subscribe to ANNOUNCE broadcasts on DHT │
│ - Fetch free PREVIEW for all shared content │
│ - Index L1 summaries, tags, content types │
│ - Build relevance ranking from total_queries, reputation │
│ - Return content hashes → users query through protocol │
├─────────────────────────────────────────────────────────────┤
│ KNOWLEDGE BROWSERS │
│ - Visualize provenance chains (who contributed what) │
│ - Show payment flows and creator earnings │
│ - Navigate L0→L1→L2→L3 relationships │
├─────────────────────────────────────────────────────────────┤
│ AI AGENTS (via MCP) │
│ - Query knowledge programmatically │
│ - Pay-per-query with automatic source attribution │
│ - Build L3 synthesis with cryptographic provenance │
├─────────────────────────────────────────────────────────────┤
│ SPECIALIZED EXTRACTORS │
│ - Implement L1Extractor trait for domain-specific parsing │
│ - Compete on extraction quality (market selects winners) │
│ - Offer extraction-as-a-service to non-technical creators │
├─────────────────────────────────────────────────────────────┤
│ CURATED DIRECTORIES │
│ - Maintain topic-specific indexes │
│ - Provide reputation/quality signals │
│ - Charge for curation (built on protocol payments) │
└─────────────────────────────────────────────────────────────┘
Key insight: The protocol doesn’t need full-text search because:
- L1 previews are free and public (for shared content)
- Anyone can build an index by listening to ANNOUNCE messages
- Search is a service that can itself be monetized on the protocol
This mirrors successful infrastructure protocols:
- Bitcoin → wallets, exchanges, explorers
- IPFS → Pinata, Filecoin, web3.storage
- Nodalync → search engines, data browsers, AI agents
2. Notation and Conventions
2.1 Data Types
uint8 Unsigned 8-bit integer
uint32 Unsigned 32-bit integer (big-endian)
uint64 Unsigned 64-bit integer (big-endian)
int64 Signed 64-bit integer (big-endian)
float64 IEEE 754 double-precision float
bytes Variable-length byte array
string UTF-8 encoded string
bool Boolean (0x00 = false, 0x01 = true)
Hash 32 bytes (SHA-256 output)
Signature 64 bytes (Ed25519 signature)
PublicKey 32 bytes (Ed25519 public key)
PeerId Derived from PublicKey (see 3.2)
Timestamp uint64 (milliseconds since Unix epoch)
Amount uint64 (tinybars, 10^-8 HBAR)
2.2 Encoding
All multi-byte integers are big-endian. Structures are serialized using a deterministic CBOR encoding (RFC 8949) with the following rules:
- Map keys sorted lexicographically
- No indefinite-length arrays or maps
- Minimal integer encoding
- No floating-point for amounts (use uint64)
2.3 Notation
|| Concatenation
H(x) SHA-256 hash of x
Sign(k, m) Ed25519 signature of message m with private key k
Verify(p, m, s) Verify signature s of message m with public key p
len(x) Length of x in bytes
3. Cryptographic Primitives
3.1 Hash Function
Algorithm: SHA-256
Content hashes are computed as:
ContentHash(content) = H(
0x00 || # Domain separator (content)
len(content) as uint64 ||
content
)
3.2 Identity
Algorithm: Ed25519
Node identity is an Ed25519 keypair. PeerId is derived as:
PeerId = H(
0x00 || # Key type: Ed25519
public_key # 32 bytes
)[0:20] # Truncate to 20 bytes
Human-readable format: ndl1 + base32(PeerId)
Example: ndl1qpzry9x8gf2tvdw0s3jn54khce6mua7l
3.3 Signatures
All protocol messages requiring authentication are signed:
SignedMessage = {
payload: bytes,
signer: PeerId,
signature: Sign(private_key, H(payload))
}
Verification:
Valid(msg) = Verify(
lookup_public_key(msg.signer),
H(msg.payload),
msg.signature
)
3.4 Content Addressing
Content is referenced by its hash. The hash serves as a unique, verifiable identifier.
Given content C:
hash = ContentHash(C)
Anyone receiving C can verify:
ContentHash(C) == claimed_hash
4. Data Structures
4.1 Content Types
enum ContentType : uint8 {
L0 = 0x00, # Raw input (documents, notes, transcripts)
L1 = 0x01, # Mentions (extracted atomic facts)
L2 = 0x02, # Entity Graph (linked entities and relationships)
L3 = 0x03 # Insights (emergent synthesis)
}
Knowledge Layer Semantics:
| Layer | Content | Typical Operation | Queryable | Value Added |
|---|---|---|---|---|
| L0 | Raw documents, notes, transcripts | CREATE | Yes | Original source material |
| L1 | Atomic facts extracted from L0 | EXTRACT_L1 | Yes | Structured, quotable claims |
| L2 | Entities and relationships across L1s | BUILD_L2 | No (personal) | Cross-document linking, your perspective |
| L3 | Novel insights synthesizing sources | DERIVE | Yes | Original analysis and conclusions |
L2 is Personal: Your L2 represents your unique perspective — how you link entities, resolve ambiguities, and structure knowledge. It is never shared or queried directly. Its value surfaces when you create L3 insights that others find valuable enough to query.
4.2 Visibility
enum Visibility : uint8 {
Private = 0x00, # Local only, not served
Unlisted = 0x01, # Served if hash known, not announced
Shared = 0x02, # Announced to DHT, publicly queryable
Offline = 0x03 # Taken offline by owner, manifest preserved for provenance
}
4.3 Version
struct Version {
number: uint32, # Sequential version number (1-indexed)
previous: Hash?, # Hash of previous version (null if first)
root: Hash, # Hash of first version (stable identifier)
timestamp: Timestamp # Creation time
}
Constraints:
- If number == 1: previous MUST be null, root MUST equal self hash
- If number > 1: previous MUST NOT be null, root MUST equal previous.root
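These two constraints translate directly into a validation check. A sketch with `Hash` narrowed to `[u8; 32]` for illustration:

```rust
struct Version {
    number: u32,
    previous: Option<[u8; 32]>,
    root: [u8; 32],
}

/// `self_hash` is the hash of the content carrying this Version;
/// `prev_root` is the root recorded by the previous version, if any.
fn version_valid(v: &Version, self_hash: [u8; 32], prev_root: Option<[u8; 32]>) -> bool {
    if v.number == 1 {
        // First version: no predecessor, root is its own hash.
        v.previous.is_none() && v.root == self_hash
    } else {
        // Later versions: predecessor required, root inherited unchanged.
        v.previous.is_some() && prev_root == Some(v.root)
    }
}
```

Because `root` never changes across versions, it serves as the stable identifier that ANNOUNCE_UPDATE messages key on.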
4.4 Mention (L1)
struct Mention {
id: Hash, # H(content || source_location)
content: string, # The atomic fact (max 1000 chars)
source_location: SourceLocation,
classification: Classification,
confidence: Confidence,
entities: string[] # Extracted entity names
}
struct SourceLocation {
type: LocationType,
reference: string, # Location identifier
quote: string? # Exact quote (max 500 chars)
}
enum LocationType : uint8 {
Paragraph = 0x00,
Page = 0x01,
Timestamp = 0x02,
Line = 0x03,
Section = 0x04
}
enum Classification : uint8 {
Claim = 0x00,
Statistic = 0x01,
Definition = 0x02,
Observation = 0x03,
Method = 0x04,
Result = 0x05
}
enum Confidence : uint8 {
Explicit = 0x00, # Directly stated in source
Inferred = 0x01 # Reasonably inferred
}
4.4a Entity Graph (L2)
L2 Entity Graphs are personal knowledge structures. They represent a node’s interpretation and linking of entities across their queried L1 sources. L2 is never directly queried by others — its value surfaces when used to create L3 insights.
struct L2EntityGraph {
# === Core Identity ===
id: Hash, # H(serialized entities + relationships)
# === Sources ===
source_l1s: L1Reference[], # L1 summaries this graph was built from
source_l2s: Hash[], # Other L2 graphs merged/extended (optional)
# === Namespace Prefixes (for compact URIs) ===
prefixes: PrefixMap, # Maps short prefixes to full URIs
# === Graph Content ===
entities: Entity[], # Resolved entities
relationships: Relationship[], # Relationships between entities
# === Statistics ===
entity_count: uint32,
relationship_count: uint32,
source_mention_count: uint32 # Total mentions linked
}
struct PrefixMap {
entries: PrefixEntry[] # Ordered list of prefix mappings
}
struct PrefixEntry {
prefix: string, # Short form, e.g., "schema"
uri: string # Full URI, e.g., "http://schema.org/"
}
# Default prefixes (always available, can be overridden):
# "ndl" -> "https://nodalync.io/ontology/"
# "schema" -> "http://schema.org/"
# "foaf" -> "http://xmlns.com/foaf/0.1/"
# "dc" -> "http://purl.org/dc/elements/1.1/"
# "rdf" -> "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# "rdfs" -> "http://www.w3.org/2000/01/rdf-schema#"
# "xsd" -> "http://www.w3.org/2001/XMLSchema#"
# "owl" -> "http://www.w3.org/2002/07/owl#"
struct L1Reference {
l1_hash: Hash, # Hash of the L1Summary content
l0_hash: Hash, # The original L0 this L1 came from
mention_ids_used: Hash[] # Which specific mentions were used
}
struct Entity {
id: Hash, # Stable entity ID: H(canonical_uri || canonical_label)
# === Identity ===
canonical_label: string, # Primary human-readable name (max 200 chars)
canonical_uri: Uri?, # Optional: canonical URI (e.g., "dbr:Albert_Einstein")
aliases: string[], # Alternative names/spellings (max 50)
# === Type (RDF-compatible) ===
entity_types: Uri[], # e.g., ["schema:Person", "foaf:Person"]
# === Evidence ===
source_mentions: MentionRef[], # Which L1 mentions establish this entity
# === Confidence ===
confidence: float64, # 0.0 - 1.0, resolution confidence
resolution_method: ResolutionMethod,
# === Optional Metadata ===
description: string?, # Summary description (max 500 chars)
same_as: Uri[]? # Links to external entities (owl:sameAs)
}
# Uri can be:
# - Full URI: "http://schema.org/Person"
# - Compact URI (CURIE): "schema:Person" (expanded using prefixes)
# - Protocol-defined: "ndl:Person" (Nodalync ontology)
type Uri = string
struct MentionRef {
l1_hash: Hash, # Which L1 contains this mention
mention_id: Hash # Specific mention ID within that L1
}
struct Relationship {
id: Hash, # H(subject || predicate || object)
# === Triple ===
subject: Hash, # Entity ID
predicate: Uri, # RDF predicate, e.g., "schema:worksFor"
object: RelationshipObject, # Entity ID or literal
# === Evidence ===
source_mentions: MentionRef[], # Mentions that support this relationship
confidence: float64, # 0.0 - 1.0
# === Temporal (optional) ===
valid_from: Timestamp?,
valid_to: Timestamp?
}
enum RelationshipObject {
EntityRef(Hash), # Reference to another entity in this graph
ExternalRef(Uri), # Reference to external entity
Literal(LiteralValue) # A typed value
}
struct LiteralValue {
value: string, # The value as string
datatype: Uri?, # XSD datatype, e.g., "xsd:date" (null = plain string)
language: string? # Language tag, e.g., "en" (for strings only)
}
# Standard XSD datatypes (use "xsd:" prefix):
# xsd:string, xsd:integer, xsd:decimal, xsd:boolean,
# xsd:date, xsd:dateTime, xsd:anyURI
enum ResolutionMethod : uint8 {
ExactMatch = 0x00, # Same string
Normalized = 0x01, # Case/punctuation normalized
Alias = 0x02, # Known alias matched
Coreference = 0x03, # Pronoun/reference resolved
ExternalLink = 0x04, # Matched via external KB
Manual = 0x05, # Human-verified
AIAssisted = 0x06 # ML model assisted
}
Constraints:
1. len(source_l1s) >= 1 # Must derive from at least one L1
2. len(entities) >= 1 # Must have at least one entity
3. Each entity.id is unique within the graph
4. Each relationship references valid entity IDs (or external URIs)
5. All MentionRefs point to valid L1s in source_l1s
6. 0.0 <= confidence <= 1.0
7. len(canonical_label) <= 200
8. len(aliases) <= 50
9. All URIs are valid (full URI or valid CURIE with known prefix)
10. entity_count == len(entities)
11. relationship_count == len(relationships)
L2 Visibility:
- L2 content is ALWAYS Private (never Unlisted or Shared)
- L2 is never announced to DHT
- L2 has no price (cannot be queried for payment)
- L2's value is realized through L3 insights derived from it
4.4b Nodalync Ontology (ndl:)
The protocol defines a minimal ontology at https://nodalync.io/ontology/:
# Entity Types
ndl:Person
ndl:Organization
ndl:Location
ndl:Concept
ndl:Event
ndl:Work # Paper, book, article
ndl:Product
ndl:Technology
ndl:Metric # Quantitative measure
ndl:TimePoint
# Relationship Predicates
ndl:mentions # L1 mention references entity
ndl:relatedTo # Generic relationship
ndl:partOf
ndl:createdBy
ndl:locatedIn
ndl:occurredAt
ndl:hasValue
ndl:sameAs # Equivalent to owl:sameAs
# Provenance Predicates
ndl:derivedFrom # Content derivation
ndl:extractedFrom # L1 extracted from L0
ndl:builtFrom # L2 built from L1s
Nodes are free to use any ontology. The ndl: namespace provides defaults
for nodes that don’t need external ontology integration.
4.5 Provenance
struct Provenance {
root_L0L1: ProvenanceEntry[], # All foundational sources
derived_from: Hash[], # Direct parent hashes
depth: uint32 # Max derivation depth from any L0
}
struct ProvenanceEntry {
hash: Hash, # Content hash
owner: PeerId, # Owner's node ID
visibility: Visibility, # Visibility at time of derivation
weight: uint32 # Number of times this source appears (for duplicates)
}
Constraints:
- root_L0L1 contains entries of type L0 or L1 only (never L2 or L3)
- L0 content: root_L0L1 = [self], derived_from = [], depth = 0
- L1 content: root_L0L1 = [parent L0], derived_from = [L0 hash], depth = 1
- L2 content: root_L0L1 = merged roots from source L1s,
derived_from = source L1/L2 hashes, depth = max(source.depth) + 1
- L3 content: root_L0L1 = merged roots from all sources,
derived_from = source hashes, depth = max(source.depth) + 1
- All entries in derived_from MUST have been queried by creator
Provenance Chain Examples:
Simple chain:
L0(doc) → L1(mentions) → L2(entities) → L3(insight)
depth: 0 1 2 3
L3 deriving directly from L1 (valid, skipping L2):
L0(doc) → L1(mentions) → L3(insight)
depth: 0 1 2
L3 from mix of L1 and L2:
L0(doc1) → L1(m1) → L2(graph) ─┐
├→ L3(insight)
L0(doc2) → L1(m2) ─────────────┘
L3.provenance = {
root_L0L1: [doc1, doc2], # Merged from both paths
derived_from: [L2.hash, m2],
depth: 3 # max(2, 1) + 1
}
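The merge-and-increment rule in these examples can be sketched as follows, with hashes narrowed to `u8` and only the `root_L0L1` and `depth` fields modeled (illustrative types, not the implementation's):

```rust
use std::collections::BTreeMap;

// (root hash, weight) pairs plus derivation depth, per section 4.5.
struct Prov {
    roots: Vec<(u8, u32)>,
    depth: u32,
}

/// Derivative provenance: roots are merged with weights summed for
/// duplicate sources, and depth is max(source depths) + 1.
fn derive(sources: &[&Prov]) -> Prov {
    let mut merged: BTreeMap<u8, u32> = BTreeMap::new();
    for s in sources {
        for (hash, weight) in &s.roots {
            *merged.entry(*hash).or_insert(0) += weight;
        }
    }
    Prov {
        roots: merged.into_iter().collect(),
        depth: sources.iter().map(|s| s.depth).max().unwrap_or(0) + 1,
    }
}
```

Running this on the mixed example above (an L2 at depth 2 over doc1, plus an L1 at depth 1 over doc2) yields merged roots for both documents at depth max(2, 1) + 1.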
4.6 Access Control
struct AccessControl {
allowlist: PeerId[]?, # If set, only these peers can query
denylist: PeerId[]?, # These peers are blocked
require_bond: bool, # Require payment bond
bond_amount: Amount?, # Bond amount if required
max_queries_per_peer: uint32? # Rate limit (null = unlimited)
}
Access granted if:
(allowlist is null OR peer in allowlist) AND
(denylist is null OR peer NOT in denylist) AND
(require_bond is false OR peer has posted bond)
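The grant predicate transcribes directly to code. A sketch with peer ids narrowed to `u8` for illustration (the rate-limit field is omitted since it needs per-peer counters):

```rust
struct AccessControl {
    allowlist: Option<Vec<u8>>, // if set, only these peers can query
    denylist: Option<Vec<u8>>,  // these peers are blocked
    require_bond: bool,
}

/// Section 4.6's access predicate: allowlisted (or no allowlist),
/// not denylisted, and bonded if a bond is required.
fn access_granted(ac: &AccessControl, peer: u8, has_bond: bool) -> bool {
    ac.allowlist.as_ref().map_or(true, |a| a.contains(&peer))
        && ac.denylist.as_ref().map_or(true, |d| !d.contains(&peer))
        && (!ac.require_bond || has_bond)
}
```

Note that a `null` allowlist means open access, while a `null` denylist blocks nobody, so the default-constructed control is fully open.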
4.7 Economics
struct Economics {
price: Amount, # Price per query (in smallest unit)
currency: Currency, # Currency identifier
total_queries: uint64, # Total queries served
total_revenue: Amount # Total revenue generated
}
enum Currency : uint8 {
HBAR = 0x00 # Hedera native token
}
4.8 Manifest
The manifest is the complete metadata for a content item:
struct Manifest {
# Identity
hash: Hash, # Content hash
content_type: ContentType,
owner: PeerId, # Content owner (serves content, receives synthesis fee)
# Versioning
version: Version,
# Visibility & Access
visibility: Visibility,
access: AccessControl,
# Metadata
metadata: Metadata,
# Economics
economics: Economics,
# Provenance
provenance: Provenance,
# Timestamps
created_at: Timestamp,
updated_at: Timestamp
}
struct Metadata {
title: string, # Max 200 chars
description: string?, # Max 2000 chars
tags: string[], # Max 20 tags, each max 50 chars
content_size: uint64, # Size in bytes
mime_type: string? # MIME type if applicable
}
4.9 L1 Summary (Preview)
struct L1Summary {
l0_hash: Hash, # Source L0 hash
mention_count: uint32, # Total mentions extracted
preview_mentions: Mention[], # First N mentions (max 5)
primary_topics: string[], # Main topics (max 5)
summary: string # 2-3 sentence summary (max 500 chars)
}
5. Node State
5.1 State Components
A node maintains the following state:
struct NodeState {
# Identity
identity: Identity,
# Content storage
content: Map<Hash, ContentRecord>,
# Provenance graph
provenance_graph: ProvenanceGraph,
# Payment channels
channels: Map<PeerId, Channel>,
# Peer information
peers: Map<PeerId, PeerInfo>,
# Query cache (content from others)
cache: Map<Hash, CachedContent>,
# Settlement queue
settlement_queue: SettlementEntry[]
}
struct Identity {
private_key: bytes, # Ed25519 private key (encrypted at rest)
public_key: PublicKey,
peer_id: PeerId
}
struct ContentRecord {
manifest: Manifest,
content: bytes, # Encrypted at rest
l1_data: L1Summary?, # Null if L1 not extracted
local_path: string # Filesystem path
}
struct PeerInfo {
peer_id: PeerId,
public_key: PublicKey,
addresses: MultiAddr[], # libp2p multiaddresses
last_seen: Timestamp,
reputation: int64 # Reputation score
}
struct CachedContent {
hash: Hash,
content: bytes,
source_peer: PeerId,
queried_at: Timestamp,
payment_proof: PaymentProof
}
5.2 Provenance Graph
struct ProvenanceGraph {
# Forward edges: what does this content derive from?
derived_from: Map<Hash, Hash[]>,
# Backward edges: what derives from this content?
derivations: Map<Hash, Hash[]>,
# Flattened roots cache
roots_cache: Map<Hash, ProvenanceEntry[]>
}
Operations:
add_content(hash, derived_from[]) → updates both directions
get_roots(hash) → returns flattened root_L0L1
get_derivations(hash) → returns all downstream content
5.3 Payment Channels
struct Channel {
peer_id: PeerId,
state: ChannelState,
my_balance: Amount,
their_balance: Amount,
nonce: uint64,
last_update: Timestamp,
pending_payments: Payment[]
}
enum ChannelState : uint8 {
Opening = 0x00,
Open = 0x01,
Closing = 0x02,
Closed = 0x03,
Disputed = 0x04
}
struct Payment {
id: Hash, # H(channel_id || nonce || amount || recipient)
amount: Amount,
recipient: PeerId,
query_hash: Hash, # Content that was queried
provenance: ProvenanceEntry[], # For distribution
timestamp: Timestamp,
signature: Signature # Signed by payer
}
6. Message Types
6.1 Message Envelope
All protocol messages use a common envelope:
struct Message {
version: uint8, # Protocol version (0x01)
type: MessageType,
id: Hash, # Unique message ID
timestamp: Timestamp,
sender: PeerId,
payload: bytes, # Type-specific payload
signature: Signature # Signs H(version || type || id || timestamp || sender || payload)
}
enum MessageType : uint16 {
# Discovery (0x01xx)
ANNOUNCE = 0x0100,
ANNOUNCE_UPDATE = 0x0101,
SEARCH = 0x0110,
SEARCH_RESPONSE = 0x0111,
# Preview (0x02xx)
PREVIEW_REQUEST = 0x0200,
PREVIEW_RESPONSE = 0x0201,
# Query (0x03xx)
QUERY_REQUEST = 0x0300,
QUERY_RESPONSE = 0x0301,
QUERY_ERROR = 0x0302,
# Version (0x04xx)
VERSION_REQUEST = 0x0400,
VERSION_RESPONSE = 0x0401,
# Channel (0x05xx)
CHANNEL_OPEN = 0x0500,
CHANNEL_ACCEPT = 0x0501,
CHANNEL_UPDATE = 0x0502,
CHANNEL_CLOSE = 0x0503,
CHANNEL_DISPUTE = 0x0504,
CHANNEL_CLOSE_ACK = 0x0505,
# Settlement (0x06xx)
SETTLE_BATCH = 0x0600,
SETTLE_CONFIRM = 0x0601,
# Peer (0x07xx)
PING = 0x0700,
PONG = 0x0701,
PEER_INFO = 0x0710
}
6.2 Discovery Messages
# ANNOUNCE - Publish content availability to DHT
struct AnnouncePayload {
hash: Hash,
content_type: ContentType,
title: string,
l1_summary: L1Summary,
price: Amount,
addresses: MultiAddr[]
}
# ANNOUNCE_UPDATE - Announce new version
struct AnnounceUpdatePayload {
version_root: Hash, # Stable identifier
new_hash: Hash, # New version hash
version_number: uint32,
title: string,
l1_summary: L1Summary,
price: Amount
}
# SEARCH - Query DHT for content
struct SearchPayload {
query: string, # Natural language query
filters: SearchFilters?,
limit: uint32, # Max results (1-100)
offset: uint32 # For pagination
}
struct SearchFilters {
content_types: ContentType[]?,
max_price: Amount?,
min_reputation: int64?,
created_after: Timestamp?,
created_before: Timestamp?,
tags: string[]?
}
# SEARCH_RESPONSE - Search results
struct SearchResponsePayload {
results: SearchResult[],
total_count: uint64,
offset: uint32
}
struct SearchResult {
hash: Hash,
content_type: ContentType,
title: string,
owner: PeerId,
l1_summary: L1Summary,
price: Amount,
total_queries: uint64,
relevance_score: float64, # 0.0 - 1.0
publisher_addresses: string[] # Multiaddresses for reconnection
}
6.3 Preview Messages
# PREVIEW_REQUEST - Request L1 preview (free)
struct PreviewRequestPayload {
hash: Hash
}
# PREVIEW_RESPONSE - Return L1 preview
struct PreviewResponsePayload {
hash: Hash,
manifest: Manifest, # Full manifest (no content)
l1_summary: L1Summary
}
6.4 Query Messages
# QUERY_REQUEST - Request content (paid)
struct QueryRequestPayload {
hash: Hash,
query: string?, # Optional: specific question about content
payment: Payment,
version: VersionSpec? # Optional: specific version
}
enum VersionSpec : uint8 {
Latest = 0x00,
Number = 0x01, # Followed by uint32 version number
Hash = 0x02 # Followed by Hash
}
# QUERY_RESPONSE - Return content
struct QueryResponsePayload {
hash: Hash,
content: bytes,
manifest: Manifest, # Contains full provenance chain
payment_receipt: PaymentReceipt
}
# Whitepaper simplified response fields map to:
# response.content → content
# response.sources[] → manifest.provenance.root_L0L1[].hash
# response.provenance → manifest.provenance
# response.cost → payment_receipt.amount
struct PaymentReceipt {
payment_id: Hash,
amount: Amount,
timestamp: Timestamp,
channel_nonce: uint64,
distributor_signature: Signature # Owner signs receipt
}
# QUERY_ERROR - Error response
struct QueryErrorPayload {
hash: Hash,
error_code: QueryError,
message: string?
}
enum QueryError : uint16 {
NOT_FOUND = 0x0001,
ACCESS_DENIED = 0x0002,
PAYMENT_REQUIRED = 0x0003,
PAYMENT_INVALID = 0x0004,
RATE_LIMITED = 0x0005,
VERSION_NOT_FOUND = 0x0006,
INTERNAL_ERROR = 0xFFFF
}
6.5 Version Messages
# VERSION_REQUEST - Get version info
struct VersionRequestPayload {
version_root: Hash # Stable identifier
}
# VERSION_RESPONSE - Version history
struct VersionResponsePayload {
version_root: Hash,
versions: VersionInfo[],
latest: Hash
}
struct VersionInfo {
hash: Hash,
number: uint32,
timestamp: Timestamp,
visibility: Visibility,
price: Amount
}
6.6 Channel Messages
# CHANNEL_OPEN - Request to open payment channel
struct ChannelOpenPayload {
channel_id: Hash, # H(initiator || responder || nonce)
initial_balance: Amount, # Initiator's deposit
funding_tx: bytes? # On-chain funding proof (if required)
}
# CHANNEL_ACCEPT - Accept channel opening
struct ChannelAcceptPayload {
channel_id: Hash,
initial_balance: Amount, # Responder's deposit
funding_tx: bytes?
}
# CHANNEL_UPDATE - Update channel state (payment)
struct ChannelUpdatePayload {
channel_id: Hash,
nonce: uint64,
balances: ChannelBalances,
payments: Payment[], # Payments in this update
signature: Signature # Signs the new state
}
struct ChannelBalances {
initiator: Amount,
responder: Amount
}
# CHANNEL_CLOSE - Initiate cooperative close
struct ChannelClosePayload {
channel_id: Hash,
final_balances: ChannelBalances,
settlement_tx: bytes # Proposed on-chain settlement
}
# CHANNEL_CLOSE_ACK - Acknowledge cooperative close
struct ChannelCloseAckPayload {
channel_id: Hash,
responder_signature: Signature # Responder's signature over the close message
}
# CHANNEL_DISPUTE - Dispute channel state
struct ChannelDisputePayload {
channel_id: Hash,
claimed_state: ChannelUpdatePayload, # Highest known state
evidence: bytes[] # Supporting evidence
}
6.7 Settlement Messages
# SETTLE_BATCH - Batch settlement request
struct SettleBatchPayload {
batch_id: Hash,
entries: SettlementEntry[],
merkle_root: Hash # Root of entries merkle tree
}
struct SettlementEntry {
recipient: PeerId,
amount: Amount,
provenance_hashes: Hash[], # Content hashes for audit
payment_ids: Hash[] # Payment IDs included
}
# SETTLE_CONFIRM - Confirm settlement on-chain
struct SettleConfirmPayload {
batch_id: Hash,
transaction_id: string, # On-chain transaction ID
block_number: uint64,
timestamp: Timestamp
}
6.8 Peer Messages
# PING
struct PingPayload {
nonce: uint64
}
# PONG
struct PongPayload {
nonce: uint64 # Echo back
}
# PEER_INFO - Exchange peer information
struct PeerInfoPayload {
peer_id: PeerId,
public_key: PublicKey,
addresses: MultiAddr[],
capabilities: Capability[],
content_count: uint64,
uptime: uint64 # Seconds since node start
}
enum Capability : uint8 {
QUERY = 0x01, # Can serve queries
CHANNEL = 0x02, # Supports payment channels
SETTLE = 0x04, # Can initiate settlement
INDEX = 0x08 # Participates in DHT indexing
}
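The capability values are powers of two, so a peer's capability set can travel as a single bitmask. A sketch of that check (`supports` is an illustrative helper, not part of the wire format):

```python
from enum import IntFlag

class Capability(IntFlag):
    QUERY   = 0x01  # Can serve queries
    CHANNEL = 0x02  # Supports payment channels
    SETTLE  = 0x04  # Can initiate settlement
    INDEX   = 0x08  # Participates in DHT indexing

def supports(peer_caps: int, required: Capability) -> bool:
    """True if the peer advertises every required capability bit."""
    return (Capability(peer_caps) & required) == required
```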
7. Protocol Operations
7.1 Content Operations
7.1.1 Create
Create new content locally (not yet published).
CREATE(content: bytes, content_type: ContentType, metadata: Metadata) → Hash
Procedure:
1. hash = ContentHash(content)
2. version = Version {
number: 1,
previous: null,
root: hash,
timestamp: now()
}
3. provenance = compute_provenance(content_type, sources=[])
4. manifest = Manifest {
hash: hash,
content_type: content_type,
version: version,
visibility: Private,
access: default_access(),
metadata: metadata,
economics: Economics { price: 0, currency: HBAR, ... },
provenance: provenance,
created_at: now(),
updated_at: now()
}
5. Store content and manifest locally
6. Return hash
7.1.2 Extract L1
Extract mentions from L0 content.
EXTRACT_L1(hash: Hash) → L1Summary
Preconditions:
- Content exists locally
- content_type == L0
Procedure:
1. content = load_content(hash)
2. mentions = extract_mentions(content) # AI or rule-based
3. summary = L1Summary {
l0_hash: hash,
mention_count: len(mentions),
preview_mentions: mentions[0:5],
primary_topics: extract_topics(mentions),
summary: generate_summary(content)
}
4. Store L1 data with content record
5. Return summary
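A toy rule-based extractor makes the shape of the result concrete. This sketch treats each sentence as one mention and skips topic extraction entirely; production nodes would use an AI or NLP extractor, and the field names simply mirror L1Summary:

```python
import re

def extract_l1(l0_hash: str, content: str) -> dict:
    """Naive rule-based sketch of EXTRACT_L1: one mention per sentence."""
    mentions = [s.strip() for s in re.split(r"(?<=[.!?])\s+", content) if s.strip()]
    return {
        "l0_hash": l0_hash,
        "mention_count": len(mentions),
        "preview_mentions": mentions[:5],  # first 5 served free in previews
        "primary_topics": [],              # topic extraction omitted in this sketch
        "summary": mentions[0] if mentions else "",
    }
```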
7.1.2a Build L2 (Entity Graph)
Build an L2 Entity Graph from one or more L1 sources. L2 is your personal knowledge structure — it is never published or queried by others.
BUILD_L2(source_l1s: Hash[], config: L2BuildConfig?) → Hash
Preconditions:
- All source L1s have been queried (payment proof exists) OR are your own
- len(source_l1s) >= 1
Procedure:
1. Verify all L1 sources are accessible:
For each l1_hash in source_l1s:
assert cache.has(l1_hash) OR content.has(l1_hash)
l1 = load_l1(l1_hash)
assert l1.content_type == L1
2. Extract entities from mentions:
raw_entities = []
For each l1 in source_l1s:
For each mention in l1.mentions:
extracted = extract_entities(mention, config.prefixes)
raw_entities.extend(extracted)
3. Resolve entities (merge duplicates):
resolved_entities = resolve_entities(raw_entities, config)
# Handles: exact match, alias resolution, coreference, external KB linking
# Assigns URIs from configured ontologies
4. Extract relationships:
relationships = extract_relationships(resolved_entities, source_l1s, config)
# Uses predicates from configured ontologies (default: ndl:)
5. Build L2 structure:
l2_graph = L2EntityGraph {
id: computed after serialization,
source_l1s: [L1Reference for each l1],
source_l2s: [],
prefixes: config.prefixes ?? default_prefixes(),
entities: resolved_entities,
relationships: relationships,
entity_count: len(resolved_entities),
relationship_count: len(relationships),
source_mention_count: total_mentions_linked
}
6. Compute hash:
content = serialize(l2_graph)
hash = ContentHash(content)
l2_graph.id = hash
7. Compute provenance:
root_entries = []
For each l1 in source_l1s:
l1_prov = get_provenance(l1)
For each entry in l1_prov.root_L0L1:
merge_or_increment(root_entries, entry)
provenance = Provenance {
root_L0L1: root_entries,
derived_from: source_l1s,
depth: max(l1.provenance.depth for l1 in source_l1s) + 1
}
8. Create manifest:
manifest = Manifest {
hash: hash,
content_type: L2,
owner: my_peer_id,
visibility: Private, # L2 is ALWAYS private
economics: Economics { price: 0, ... }, # L2 has no price
provenance: provenance,
...
}
9. Store content and manifest locally
10. Return hash
struct L2BuildConfig {
# Ontology configuration
prefixes: PrefixMap?, # Custom prefix mappings
default_entity_type: Uri?, # Default: "ndl:Concept"
# Entity resolution settings
resolution_threshold: float64?, # Minimum confidence to merge (default: 0.8)
use_external_kb: bool?, # Link to external knowledge bases
external_kb_list: Uri[]?, # Which KBs: ["http://www.wikidata.org/", ...]
# Relationship extraction
extract_implicit: bool?, # Infer implicit relationships
relationship_predicates: Uri[]? # Limit to specific predicates
}
7.1.2b Merge L2
Combine several of your own L2 Entity Graphs into one. This is useful when you have built separate graphs for different domains and want a single unified view.
MERGE_L2(source_l2s: Hash[], config: L2MergeConfig?) → Hash
Preconditions:
- All source L2s are your own (stored locally)
- len(source_l2s) >= 2
Procedure:
1. Load all L2 sources (must be local, L2 is never queried)
2. Unify prefix mappings (merge, detect conflicts)
3. Collect all entities and relationships from sources
4. Cross-graph entity resolution (find same entities in different graphs)
# Match by: canonical_uri, same_as links, label similarity
5. Merge relationships (update entity references to merged IDs)
6. Build new L2 with:
source_l1s: union of all source L1 references
source_l2s: the input source_l2s
prefixes: merged prefix map
7. Compute provenance:
root_entries = merge roots from all source_l2s
provenance = Provenance {
root_L0L1: root_entries,
derived_from: source_l2s,
depth: max(l2.provenance.depth for l2 in source_l2s) + 1
}
8. Create manifest with visibility = Private
9. Store and return hash
struct L2MergeConfig {
prefixes: PrefixMap?, # Override prefix mappings
entity_merge_threshold: float64?, # Confidence threshold for merging entities
prefer_source: uint32? # Index of source to prefer on conflicts
}
7.1.3 Publish
Make content available on the network. Note: L2 content cannot be published.
PUBLISH(hash: Hash, visibility: Visibility, price: Amount, access: AccessControl?) → bool
Preconditions:
- Content exists locally
- content_type != L2 # L2 is always private
- visibility != Private (publishing with Private visibility is a no-op)
Procedure:
1. manifest = load_manifest(hash)
2. If manifest.content_type == L2:
Return error("L2 content cannot be published")
3. manifest.visibility = visibility
4. manifest.economics.price = price
5. manifest.access = access ?? default_access()
6. manifest.updated_at = now()
7. Save manifest
8. If visibility == Shared:
l1_summary = get_or_extract_l1(hash)
announce = AnnouncePayload {
hash: hash,
content_type: manifest.content_type,
title: manifest.metadata.title,
l1_summary: l1_summary,
price: price,
addresses: my_addresses()
}
DHT.announce(hash, announce)
9. Return true
7.1.4 Update
Create a new version of existing content.
UPDATE(old_hash: Hash, new_content: bytes) → Hash
Preconditions:
- Old content exists locally
- Caller owns old content
Procedure:
1. old_manifest = load_manifest(old_hash)
2. new_hash = ContentHash(new_content)
3. new_version = Version {
number: old_manifest.version.number + 1,
previous: old_hash,
root: old_manifest.version.root,
timestamp: now()
}
4. new_manifest = copy(old_manifest)
new_manifest.hash = new_hash
new_manifest.version = new_version
new_manifest.updated_at = now()
5. Store new content and manifest
6. If old_manifest.visibility == Shared:
update_announce = AnnounceUpdatePayload {
version_root: new_manifest.version.root,
new_hash: new_hash,
version_number: new_version.number,
...
}
DHT.announce_update(new_manifest.version.root, update_announce)
7. Return new_hash
7.1.5 Derive (Create L3)
Create an L3 insight from multiple sources.
DERIVE(sources: Hash[], insight_content: bytes, metadata: Metadata) → Hash
Sources may include any combination of:
- L0 content (raw documents)
- L1 content (mention collections)
- L2 content (entity graphs)
- L3 content (other insights)
Preconditions:
- All sources have been queried (payment proof exists)
- At least one source
Procedure:
1. Verify all sources were queried:
For each source in sources:
assert cache.has(source) OR content.has(source)
2. Compute provenance:
root_entries = []
For each source in sources:
source_prov = get_provenance(source)
For each entry in source_prov.root_L0L1:
merge_or_increment(root_entries, entry)
# Note: For L0/L1 sources, merge their root_L0L1 directly
# For L2 sources, merge the L2's root_L0L1 (traces back to L0/L1)
# For L3 sources, merge the L3's root_L0L1 (recursive)
provenance = Provenance {
root_L0L1: root_entries,
derived_from: sources,
depth: max(source.provenance.depth for source in sources) + 1
}
3. hash = ContentHash(insight_content)
4. Create manifest with content_type = L3, provenance
5. Store locally
6. Return hash
Helper merge_or_increment(entries, new_entry):
existing = find(entries, e => e.hash == new_entry.hash)
If existing:
existing.weight += new_entry.weight
Else:
entries.append(new_entry with weight=1)
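The helper translates directly into code. A sketch with a minimal ProvenanceEntry shape (only the hash, owner, and weight fields are used here); it follows the pseudocode exactly, including starting newly appended roots at weight 1:

```python
from dataclasses import dataclass, replace

@dataclass
class ProvenanceEntry:
    hash: str    # content hash of the root L0/L1
    owner: str   # PeerId of the root contributor
    weight: int  # how many times this root contributes

def merge_or_increment(entries: list, new_entry: ProvenanceEntry) -> None:
    """Faithful sketch of the merge_or_increment helper above."""
    for e in entries:
        if e.hash == new_entry.hash:
            e.weight += new_entry.weight          # existing root: accumulate weight
            return
    entries.append(replace(new_entry, weight=1))  # new root: weight starts at 1 (per the spec)
```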
7.1.6 Reference L3 as L0 (Import)
Reference an external L3 as foundational input for your own derivations.
REFERENCE_L3_AS_L0(source_l3_hash: Hash) → Reference
Preconditions:
- L3 has been queried at least once (payment proof exists)
- Source content_type == L3
Procedure:
1. Verify L3 was queried:
assert cache.has(source_l3_hash)
source_manifest = cache[source_l3_hash].manifest
assert source_manifest.content_type == L3
2. Create reference in local graph:
reference = Reference {
hash: source_l3_hash,
owner: source_manifest.owner,
treat_as: L0, # Treat this L3 as foundational for derivations
imported_at: now()
}
3. Store reference locally
4. Return reference
IMPORTANT: This is a reference operation, not data transfer. The actual
content remains on the original owner's node. "Import" means treating an
external L3 as foundational input (L0) in your own derivation chains.
When deriving from this reference:
- The reference is included in derived_from[]
- The L3's root_L0L1 is merged into the new content's root_L0L1
- The L3 itself is added to root_L0L1 (the creator becomes a root)
- Each query to your derived content triggers payments to:
- You (5% synthesis fee)
- The L3 creator (as a root contributor)
- All upstream contributors in the L3's provenance chain
7.2 Query Operations
7.2.1 Discover
Search for content on the network.
DISCOVER(query: string, filters: SearchFilters?) → SearchResult[]
Procedure:
1. search_payload = SearchPayload {
query: query,
filters: filters,
limit: 50,
offset: 0
}
2. results = DHT.search(search_payload)
3. Return results sorted by relevance_score
7.2.2 Preview
Get L1 preview for content (free).
PREVIEW(peer: PeerId, hash: Hash) → (Manifest, L1Summary)
Procedure:
1. Send PREVIEW_REQUEST { hash } to peer
2. Await PREVIEW_RESPONSE
3. Verify response.hash == hash
4. Return (response.manifest, response.l1_summary)
Handler (receiving node):
1. manifest = load_manifest(request.hash)
2. If manifest is null:
Return QUERY_ERROR { NOT_FOUND }
3. If manifest.visibility == Private:
Return QUERY_ERROR { NOT_FOUND } # Don't reveal existence
4. If manifest.visibility == Unlisted:
If not check_access(sender, manifest.access):
Return QUERY_ERROR { ACCESS_DENIED }
5. l1_summary = load_l1_summary(request.hash)
6. Return PREVIEW_RESPONSE { hash, manifest, l1_summary }
7.2.3 Query
Request content with payment.
QUERY(peer: PeerId, hash: Hash, query_text: string?) → (bytes, Manifest, PaymentReceipt)
Procedure:
1. Ensure channel exists with peer:
If not channels.has(peer):
CHANNEL_OPEN(peer)
2. Preview first to get price:
(manifest, _) = PREVIEW(peer, hash)
price = manifest.economics.price
3. Create payment:
payment = Payment {
id: H(channel_id || nonce || price || peer),
amount: price,
recipient: peer,
query_hash: hash,
provenance: manifest.provenance.root_L0L1,
timestamp: now(),
signature: Sign(my_key, payment_data)
}
4. Send QUERY_REQUEST { hash, query_text, payment }
5. Await QUERY_RESPONSE
6. Verify response:
assert ContentHash(response.content) == hash
assert response.payment_receipt.amount == price
7. Update channel state:
channel.my_balance -= price
channel.nonce += 1
channel.pending_payments.append(payment)
8. Cache content:
cache[hash] = CachedContent {
hash, content, peer, now(), response.payment_receipt
}
9. Return (response.content, response.manifest, response.payment_receipt)
Handler (receiving node):
1. manifest = load_manifest(request.hash)
2. Validate visibility and access (same as PREVIEW)
3. Validate payment:
assert request.payment.amount >= manifest.economics.price
assert request.payment.recipient == my_peer_id
assert Verify(sender_pubkey, payment_data, request.payment.signature)
assert channel_has_balance(sender, request.payment.amount)
4. Update channel state:
channel.their_balance -= request.payment.amount
channel.my_balance += (request.payment.amount * 0.05) # Synthesis fee
channel.nonce = max(channel.nonce, request.payment.nonce) + 1
5. Queue distribution:
For each entry in manifest.provenance.root_L0L1:
share = (request.payment.amount * 0.95) / total_weight
queue_settlement(entry.owner, share * entry.weight, hash)
6. Update economics:
manifest.economics.total_queries += 1
manifest.economics.total_revenue += request.payment.amount
7. content = load_content(request.hash)
8. receipt = PaymentReceipt { ... }
9. Return QUERY_RESPONSE { hash, content, manifest, receipt }
7.3 Channel Operations
7.3.1 Open Channel
CHANNEL_OPEN(peer: PeerId, initial_balance: Amount) → Channel
Procedure:
1. channel_id = H(my_peer_id || peer || random_nonce())
2. Send CHANNEL_OPEN { channel_id, initial_balance, funding_tx }
3. Await CHANNEL_ACCEPT
4. channel = Channel {
peer_id: peer,
state: Open,
my_balance: initial_balance,
their_balance: response.initial_balance,
nonce: 0,
last_update: now(),
pending_payments: []
}
5. channels[peer] = channel
6. Return channel
7.3.2 Close Channel
CHANNEL_CLOSE(peer: PeerId) → SettlementEntry[]
Procedure:
1. channel = channels[peer]
2. Assert channel.state == Open
3. Create settlement entries from pending payments:
entries = aggregate_payments(channel.pending_payments)
4. Send CHANNEL_CLOSE { channel_id, final_balances, settlement_tx }
5. Await acknowledgment or timeout
6. If cooperative:
Submit settlement to chain
channel.state = Closed
Else:
Initiate dispute resolution
7. Return entries
7.4 Settlement Operations
SETTLE_BATCH(entries: SettlementEntry[]) → TransactionId
Procedure:
1. batch_id = H(entries || now())
2. merkle_root = compute_merkle_root(entries)
3. Build on-chain transaction:
For each entry in entries:
Add transfer: entry.recipient receives entry.amount
4. Submit transaction to Hedera
5. Await confirmation
6. Broadcast SETTLE_CONFIRM { batch_id, tx_id, block, timestamp }
7. Clear settled payments from channels
8. Return tx_id
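One way to compute the batch merkle_root, assuming SHA-256 and duplicating the last node on odd levels; the spec only requires that recipients can verify inclusion, so the exact tree shape is an implementation choice:

```python
import hashlib

def compute_merkle_root(leaves: list) -> bytes:
    """Binary Merkle root over serialized settlement entries.

    SHA-256 and last-node duplication are assumptions of this sketch.
    """
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate last node on odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]
```

A recipient who holds their SettlementEntry plus the sibling hashes on the path to the root can recompute and check inclusion without seeing the full batch.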
8. State Transitions
8.1 Content State Machine
┌──────────────────────────────────────────┐
│ │
▼ │
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ (none) │────▶│ Private │────▶│Unlisted │────▶│ Shared │──┘
└─────────┘ └─────────┘ └─────────┘ └─────────┘
│ │ │ │
│ │ │ │
│ CREATE │ PUBLISH │ PUBLISH │
│ │ (unlisted) │ (shared) │
│ │ │ │
│ │◀──────────────│◀──────────────│
│ │ UNPUBLISH │ UNPUBLISH │
│ │ │ │
│ │ │ │
└───────────────┴───────────────┴───────────────┘
│
│ DELETE
▼
┌─────────┐
│ Deleted │
└─────────┘
Valid transitions:
(none) → Private: CREATE
Private → Unlisted: PUBLISH(visibility=Unlisted)
Private → Shared: PUBLISH(visibility=Shared)
Unlisted → Shared: PUBLISH(visibility=Shared)
Unlisted → Private: UNPUBLISH
Shared → Unlisted: UNPUBLISH(keep_unlisted=true)
Shared → Private: UNPUBLISH
Shared → Offline: TAKE_OFFLINE (manifest preserved for provenance)
Unlisted → Offline: TAKE_OFFLINE
Offline → Shared: PUBLISH(visibility=Shared)
Offline → Unlisted: PUBLISH(visibility=Unlisted)
Any → Deleted: DELETE (local only, provenance persists)
8.2 Channel State Machine
┌─────────┐ ┌─────────┐ ┌─────────┐
│ (none) │────▶│ Opening │────▶│ Open │
└─────────┘ └─────────┘ └─────────┘
│ │ │
│ timeout │ │ UPDATE
│ │ └────┐
▼ │ │
┌─────────┐ │ │
│ Failed │ │◀───────┘
└─────────┘ │
│
┌──────────────┴──────────────┐
│ │
▼ cooperative ▼ unilateral/dispute
┌─────────┐ ┌──────────┐
│ Closing │ │ Disputed │
└─────────┘ └──────────┘
│ │
│ settled │ resolved
▼ ▼
┌─────────────────────────────────────┐
│ Closed │
└─────────────────────────────────────┘
Valid transitions:
(none) → Opening: CHANNEL_OPEN sent
Opening → Open: CHANNEL_ACCEPT received
Opening → Failed: Timeout or rejection
Open → Open: CHANNEL_UPDATE (payment)
Open → Closing: CHANNEL_CLOSE (cooperative)
Open → Disputed: CHANNEL_DISPUTE
Closing → Closed: Settlement confirmed
Disputed → Closed: Dispute resolved on-chain
8.3 Query State Machine (per request)
┌─────────┐ ┌─────────┐ ┌──────────┐ ┌─────────┐
│Initiate │────▶│ Preview │────▶│ Payment │────▶│Complete │
└─────────┘ └─────────┘ └──────────┘ └─────────┘
│ │
│ error │ error
▼ ▼
┌───────────────────────┐
│ Failed │
└───────────────────────┘
States:
Initiate: Query started
Preview: L1 preview received, evaluating
Payment: Payment sent, awaiting content
Complete: Content received and verified
Failed: Error at any stage
9. Validation Rules
9.1 Content Validation
VALIDATE_CONTENT(content: bytes, manifest: Manifest) → bool
Rules:
1. ContentHash(content) == manifest.hash
2. len(content) == manifest.metadata.content_size
3. len(manifest.metadata.title) <= 200
4. len(manifest.metadata.description) <= 2000
5. len(manifest.metadata.tags) <= 20
6. For each tag: len(tag) <= 50
7. manifest.content_type in {L0, L1, L2, L3}
8. manifest.visibility in {Private, Unlisted, Shared, Offline}
# L2-specific validation
9. If manifest.content_type == L2:
l2 = deserialize(content) as L2EntityGraph
assert l2.id == manifest.hash
assert len(l2.source_l1s) >= 1
assert len(l2.entities) >= 1
assert all entity IDs are unique
assert all relationship entity refs are valid
assert all MentionRefs point to valid source L1s
assert l2.entity_count == len(l2.entities)
assert l2.relationship_count == len(l2.relationships)
9.2 Version Validation
VALIDATE_VERSION(manifest: Manifest, previous: Manifest?) → bool
Rules:
1. If manifest.version.number == 1:
manifest.version.previous == null
manifest.version.root == manifest.hash
2. If manifest.version.number > 1:
previous != null
manifest.version.previous == previous.hash
manifest.version.root == previous.version.root
manifest.version.number == previous.version.number + 1
manifest.version.timestamp > previous.version.timestamp
9.3 Provenance Validation
VALIDATE_PROVENANCE(manifest: Manifest, sources: Manifest[]) → bool
Rules:
1. If manifest.content_type == L0:
manifest.provenance.root_L0L1 == [self_entry]
manifest.provenance.derived_from == []
manifest.provenance.depth == 0
2. If manifest.content_type == L1:
len(manifest.provenance.root_L0L1) >= 1
len(manifest.provenance.derived_from) == 1
derived_from[0] is an L0 hash
All root_L0L1 entries are type L0
manifest.provenance.depth == 1
3. If manifest.content_type == L2:
len(manifest.provenance.root_L0L1) >= 1
len(manifest.provenance.derived_from) >= 1
All derived_from are L1 or L2 hashes
All root_L0L1 entries are type L0 or L1
manifest.provenance.depth >= 2
4. If manifest.content_type == L3:
len(manifest.provenance.root_L0L1) >= 1
len(manifest.provenance.derived_from) >= 1
All derived_from hashes exist in sources
All root_L0L1 entries are type L0 or L1
5. root_L0L1 computation is correct:
computed = compute_root_L0L1(sources)
manifest.provenance.root_L0L1 == computed
6. Depth is correct:
manifest.provenance.depth == max(s.provenance.depth for s in sources) + 1
7. No self-reference:
manifest.hash not in manifest.provenance.derived_from
manifest.hash not in [e.hash for e in manifest.provenance.root_L0L1]
8. No cycles in provenance graph
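Rule 8 is a standard DAG check. A depth-first sketch over a map from content hash to its derived_from list; hashes absent from the map are treated as external roots with no outgoing edges:

```python
def has_cycle(derived_from: dict) -> bool:
    """Detect cycles in the provenance graph (rule 8).

    A valid provenance graph is a DAG, so any back edge is a violation.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {h: WHITE for h in derived_from}

    def visit(h: str) -> bool:
        color[h] = GRAY
        for parent in derived_from.get(h, []):
            c = color.get(parent, WHITE)
            if c == GRAY:  # back edge: cycle found
                return True
            if c == WHITE and parent in derived_from and visit(parent):
                return True
        color[h] = BLACK
        return False

    return any(color[h] == WHITE and visit(h) for h in derived_from)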
9.4 Payment Validation
VALIDATE_PAYMENT(payment: Payment, channel: Channel, manifest: Manifest) → bool
Rules:
1. payment.amount >= manifest.economics.price
2. payment.recipient == manifest_owner
3. payment.query_hash == manifest.hash
4. channel.state == Open
5. channel.their_balance >= payment.amount # Payer has funds
6. payment.nonce > channel.nonce # No replay
7. Verify(payer_pubkey, payment_data, payment.signature)
8. payment.provenance == manifest.provenance.root_L0L1
9.5 Message Validation
VALIDATE_MESSAGE(msg: Message) → bool
Rules:
1. msg.version == PROTOCOL_VERSION
2. msg.type is valid MessageType
3. msg.timestamp within acceptable skew (±5 minutes)
4. msg.sender is valid PeerId
5. Verify(lookup_pubkey(msg.sender), H(msg without signature), msg.signature)
6. msg.payload decodes correctly for msg.type
7. Payload-specific validation passes
9.6 Access Validation
VALIDATE_ACCESS(requester: PeerId, manifest: Manifest) → bool
Rules:
1. If manifest.visibility == Private:
Return false # No external access
2. If manifest.visibility == Unlisted:
If manifest.access.allowlist != null:
requester in manifest.access.allowlist
If manifest.access.denylist != null:
requester not in manifest.access.denylist
3. If manifest.visibility == Shared:
If manifest.access.denylist != null:
requester not in manifest.access.denylist
# Allowlist ignored for Shared (open to all)
4. If manifest.access.require_bond:
has_bond(requester, manifest.access.bond_amount)
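The access rules compose into a single predicate. A sketch over dict-shaped manifests; the bond check (rule 4) is omitted since it needs channel state:

```python
def validate_access(requester: str, manifest: dict) -> bool:
    """Sketch of VALIDATE_ACCESS rules 1-3 (bond check omitted)."""
    vis = manifest["visibility"]
    acc = manifest.get("access", {})
    if vis == "Private":
        return False                  # no external access, ever
    if vis == "Unlisted":
        allow = acc.get("allowlist")  # allowlist only applies to Unlisted
        if allow is not None and requester not in allow:
            return False
    deny = acc.get("denylist")        # denylist applies to Unlisted and Shared
    if deny is not None and requester in deny:
        return False
    return True
```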
10. Economic Rules
10.1 Revenue Distribution
DISTRIBUTE_REVENUE(payment: Payment) → Distribution[]
Constants:
SYNTHESIS_FEE = 0.05 # 5%
ROOT_POOL = 0.95 # 95%
Procedure:
1. total = payment.amount
2. owner_share = total * SYNTHESIS_FEE
3. root_pool = total * ROOT_POOL
4. total_weight = sum(e.weight for e in payment.provenance)
5. per_weight = root_pool / total_weight
6. distributions = []
7. For each entry in payment.provenance:
amount = per_weight * entry.weight
# Owner also gets share if they have roots
If entry.owner == content_owner:
owner_share += amount
Else:
distributions.append(Distribution {
recipient: entry.owner,
amount: amount,
source_hash: entry.hash
})
8. distributions.append(Distribution {
recipient: content_owner,
amount: owner_share,
source_hash: payment.query_hash
})
9. Return distributions
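The procedure above can be sketched as per-recipient totals. Provenance entries are (hash, owner, weight) tuples here, and amounts are floats for readability even though the protocol uses integer tinybars:

```python
from collections import defaultdict

SYNTHESIS_FEE = 0.05  # 5% to the synthesizing owner
ROOT_POOL = 0.95      # 95% split across root contributors by weight

def distribute_revenue(amount: float, provenance: list, content_owner: str) -> dict:
    """Sketch of DISTRIBUTE_REVENUE, returning total payout per recipient."""
    totals = defaultdict(float)
    totals[content_owner] = amount * SYNTHESIS_FEE  # synthesis fee
    total_weight = sum(w for _, _, w in provenance)
    per_weight = amount * ROOT_POOL / total_weight
    for _, owner, weight in provenance:
        totals[owner] += per_weight * weight        # root shares (owner included)
    return dict(totals)
```

Run on the Section 10.2 scenario (100 HBAR, five weight-1 roots, Bob as owner), this yields Alice 38, Carol 19, Bob 43.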
10.2 Distribution Example
Scenario:
Bob's L3 derives from:
- Alice's L0 (2 documents)
- Carol's L0 (1 document)
- Bob's L0 (2 documents)
Query payment: 100 HBAR
Provenance:
root_L0L1 = [
{ hash: alice_1, owner: Alice, weight: 1 },
{ hash: alice_2, owner: Alice, weight: 1 },
{ hash: carol_1, owner: Carol, weight: 1 },
{ hash: bob_1, owner: Bob, weight: 1 },
{ hash: bob_2, owner: Bob, weight: 1 }
]
total_weight = 5
Distribution:
owner_share = 100 * 0.05 = 5 HBAR (Bob's synthesis fee)
root_pool = 100 * 0.95 = 95 HBAR
per_weight = 95 / 5 = 19 HBAR
Alice: 2 * 19 = 38 HBAR
Carol: 1 * 19 = 19 HBAR
Bob (roots): 2 * 19 = 38 HBAR
Bob (synthesis): 5 HBAR
Bob total: 43 HBAR (5 + 38)
Final:
Alice: 38 HBAR (38%)
Carol: 19 HBAR (19%)
Bob: 43 HBAR (43%)
10.3 Price Setting
Constraints:
MIN_PRICE = 1 # 1 tinybar (10^-8 HBAR)
MAX_PRICE = 10^16 # Practical maximum
Rules:
1. price >= MIN_PRICE
2. price <= MAX_PRICE
3. price is uint64 (no floating point)
4. Owner can change price at any time (takes effect immediately)
10.4 Settlement Batching
BATCH_THRESHOLD = 100 HBAR # Minimum to trigger auto-settlement
BATCH_INTERVAL = 3600 # Maximum seconds between settlements
Rules:
1. Settlement triggered when:
sum(pending_payments) >= BATCH_THRESHOLD
OR time_since_last_settlement >= BATCH_INTERVAL
OR channel_closing
2. Batch includes all pending payments across all channels
3. Payments aggregated by recipient (one entry per recipient)
4. Merkle root allows any recipient to verify inclusion
11. Network Layer
11.1 Transport
The protocol uses libp2p for peer-to-peer communication:
Transports:
- TCP (primary)
- QUIC (preferred when available)
- WebSocket (browser compatibility)
Multiplexing:
- yamux
- mplex (fallback)
Security:
- Noise protocol (XX handshake pattern)
- TLS 1.3 (fallback)
11.2 Discovery
DHT: Kademlia
- Key space: 256-bit (SHA-256)
- Bucket size: 20
- Alpha (parallelism): 3
- Replication factor: 20
Content records stored at:
key = H(content_hash)
value = AnnouncePayload (signed)
Version updates stored at:
key = H("version:" || version_root)
value = AnnounceUpdatePayload (signed)
Search index:
- Local inverted index per node
- Gossip-based index synchronization
- Semantic embeddings for similarity search
11.3 Peer Discovery
Bootstrap nodes:
- Hardcoded list of well-known nodes
- DNS-based discovery (TXT records)
Peer exchange:
- Nodes share peer lists periodically
- Prefer peers with high uptime and low latency
NAT traversal:
- STUN for address discovery
- Relay nodes for symmetric NAT
- Hole punching via DCUtR
11.4 Message Routing
Direct messages:
- Point-to-point when peer is known
- DHT lookup to find peer addresses
Broadcast messages:
- GossipSub for protocol announcements
- Topic: /nodalync/announce/1.0.0
Request-response:
- Dedicated protocol streams
- Timeout: 30 seconds default
- Retry: 3 attempts with exponential backoff
12. Settlement Layer
12.1 Chain Selection
Primary: Hedera Hashgraph
Rationale:
- Fast finality (3-5 seconds)
- Low cost (~$0.0001/tx)
- High throughput (10,000+ TPS)
- Suitable for micropayment batching
12.2 On-Chain Data
Settlement Contract State:
balances: Map<AccountId, Amount> # Token balances
channels: Map<ChannelId, ChannelState> # Channel states
attestations: Map<Hash, Attestation> # Content attestations
struct Attestation {
content_hash: Hash,
owner: AccountId,
timestamp: Timestamp,
provenance_root: Hash # Merkle root of root_L0L1
}
struct ChannelState {
participants: [AccountId, AccountId],
balances: [Amount, Amount],
nonce: uint64,
status: ChannelStatus
}
12.3 Contract Operations
// Deposit tokens to protocol
deposit(amount: Amount)
Requires: sender has sufficient tokens
Effects: balances[sender] += amount
// Withdraw tokens from protocol
withdraw(amount: Amount)
Requires: balances[sender] >= amount
Effects: balances[sender] -= amount, transfer to sender
// Attest content publication
attest(content_hash: Hash, provenance_root: Hash)
Requires: caller is content owner
Effects: attestations[content_hash] = Attestation { ... }
// Open payment channel
openChannel(peer: AccountId, myDeposit: Amount, peerDeposit: Amount)
Requires: both parties sign, sufficient balances
Effects: Create channel, lock deposits
// Update channel state (cooperative)
updateChannel(channelId: ChannelId, newState: ChannelState, signatures: [Sig, Sig])
Requires: Both signatures valid, nonce > current nonce
Effects: Update channel state
// Close channel (cooperative)
closeChannel(channelId: ChannelId, finalState: ChannelState, signatures: [Sig, Sig])
Requires: Both signatures valid
Effects: Distribute balances, delete channel
// Dispute channel (unilateral)
disputeChannel(channelId: ChannelId, claimedState: ChannelState, signature: Sig)
Requires: Valid signature from one party
Effects: Start dispute period (24 hours)
// Resolve dispute
resolveDispute(channelId: ChannelId)
Requires: Dispute period elapsed
Effects: Apply highest-nonce state, close channel
// Batch settlement
settleBatch(entries: SettlementEntry[], merkleProofs: MerkleProof[])
Requires: Valid merkle proofs, sufficient channel balances
Effects: Transfer amounts to recipients
12.4 Currency
Currency: HBAR (Hedera native token)
Decimals: 8 (1 HBAR = 10^8 tinybars)
The protocol uses HBAR directly for all payments. This decision:
- Eliminates token bootstrapping complexity
- Leverages existing HBAR liquidity and exchanges
- Avoids securities/regulatory concerns
- Allows focus on proving the knowledge economics model
All amounts in the protocol are denominated in tinybars (10^-8 HBAR).
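Converting between user-facing HBAR and protocol tinybars should avoid binary floats. A sketch using Decimal (the helper names are illustrative):

```python
from decimal import Decimal

TINYBARS_PER_HBAR = 10 ** 8  # HBAR has 8 decimals

def hbar_to_tinybars(hbar: str) -> int:
    """Convert a decimal HBAR string to integer tinybars, without float rounding."""
    return int(Decimal(hbar) * TINYBARS_PER_HBAR)

def tinybars_to_hbar(tb: int) -> Decimal:
    """Convert integer tinybars back to an exact decimal HBAR amount."""
    return Decimal(tb) / TINYBARS_PER_HBAR
```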
13. Security Considerations
13.1 Threat Model
Assumptions:
- Network is asynchronous and unreliable
- Adversaries can delay or drop messages
- Adversaries can create unlimited identities (Sybil)
- Adversaries cannot break cryptographic primitives
- Majority of economic stake is honest
Threats addressed:
1. Content theft (copying after query)
2. Payment fraud (fake payments, double-spending)
3. Provenance manipulation (false attribution)
4. Eclipse attacks (isolating nodes)
5. Denial of service
Threats NOT addressed (out of scope):
1. Content quality/accuracy
2. Legal disputes over IP
3. Privacy of query patterns
4. Nation-state level attacks
13.2 Mitigations
Content theft:
- Mitigation: Audit trail, timestamps, legal recourse
- Note: Cannot prevent, only detect and prove
Payment fraud:
- Mitigation: Cryptographic signatures, channel states
- Settlement disputes resolve on-chain with evidence
Provenance manipulation:
- Mitigation: Content-addressed hashing
- Cannot claim derivation without querying (payment proof)
Eclipse attacks:
- Mitigation: Multiple bootstrap nodes, peer diversity requirements
- Monitor for unusual peer behavior
Denial of service:
- Mitigation: Rate limiting, require payment bonds
- Reputation system penalizes bad actors
13.3 Key Management
Private key storage:
- Encrypted at rest (AES-256-GCM)
- Key derived from user password (Argon2id)
- Optional hardware security module support
Key rotation:
- Supported via identity update message
- Old key signs authorization for new key
- Grace period for transition
Recovery:
- Optional mnemonic backup (BIP-39)
- Social recovery (threshold signatures) - future
13.4 Privacy Considerations
Visible to network:
- Content hashes (not content)
- L1 previews (for shared content)
- Provenance chains
- Payment amounts (in settlement batches)
Hidden from network:
- Private content (entirely local)
- Query text (between querier and node)
- Unlisted content (unless you have hash)
Future improvements:
- ZK proofs for provenance verification
- Private settlement channels
- Onion routing for query privacy
Appendix A: Wire Formats
A.1 Message Encoding
All messages use deterministic CBOR encoding:
Message wire format:
[0x00] # Protocol magic byte
[version: uint8] # Protocol version
[type: uint16] # Message type
[length: uint32] # Payload length
[payload: bytes] # CBOR-encoded payload
[signature: 64 bytes] # Ed25519 signature
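For concreteness, the frame above can be assembled in Rust. This is a sketch: big-endian (network) byte order for the multi-byte integer fields is an assumption, and CBOR encoding and signing are out of scope here:

```rust
/// Assemble the A.1 wire frame from its parts.
/// Byte order for `type` and `length` is assumed big-endian.
fn frame(version: u8, msg_type: u16, payload: &[u8], signature: &[u8; 64]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(8 + payload.len() + 64);
    buf.push(0x00);                                               // protocol magic byte
    buf.push(version);                                            // protocol version
    buf.extend_from_slice(&msg_type.to_be_bytes());               // message type: uint16
    buf.extend_from_slice(&(payload.len() as u32).to_be_bytes()); // payload length: uint32
    buf.extend_from_slice(payload);                               // CBOR-encoded payload
    buf.extend_from_slice(signature);                             // Ed25519 signature
    buf
}
```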
A.2 Hash Computation
ContentHash:
H(
[0x00] # Domain separator for content
[length: uint64] # Content length
[content: bytes] # Raw content
)
MessageHash (for signing):
H(
[0x01] # Domain separator for messages
[version: uint8]
[type: uint16]
[id: 32 bytes]
[timestamp: uint64]
[sender: 20 bytes]
[payload_hash: 32 bytes] # H(payload)
)
ChannelStateHash:
H(
[0x02] # Domain separator for channels
[channel_id: 32 bytes]
[nonce: uint64]
[initiator_balance: uint64]
[responder_balance: uint64]
)
Appendix B: Constants
PROTOCOL_VERSION = 0x01
PROTOCOL_MAGIC = 0x00
# Timing
MESSAGE_TIMEOUT_MS = 30000
CHANNEL_DISPUTE_PERIOD_MS = 86400000 # 24 hours
MAX_CLOCK_SKEW_MS = 300000 # 5 minutes
# Limits
MAX_CONTENT_SIZE = 104857600 # 100 MB
MAX_MESSAGE_SIZE = 10485760 # 10 MB
MAX_MENTIONS_PER_L0 = 1000
MAX_SOURCES_PER_L3 = 100
MAX_PROVENANCE_DEPTH = 100
MAX_TAGS = 20
MAX_TAG_LENGTH = 50
MAX_TITLE_LENGTH = 200
MAX_DESCRIPTION_LENGTH = 2000
# L2 Entity Graph limits
MAX_ENTITIES_PER_L2 = 10000
MAX_RELATIONSHIPS_PER_L2 = 50000
MAX_ALIASES_PER_ENTITY = 50
MAX_CANONICAL_LABEL_LENGTH = 200
MAX_PREDICATE_LENGTH = 100
MAX_ENTITY_DESCRIPTION_LENGTH = 500
MAX_SOURCE_L1S_PER_L2 = 100
MAX_SOURCE_L2S_PER_MERGE = 20
# Economics
MIN_PRICE = 1 # Smallest unit
SYNTHESIS_FEE_NUMERATOR = 5
SYNTHESIS_FEE_DENOMINATOR = 100 # 5%
SETTLEMENT_BATCH_THRESHOLD = 10000000000 # 100 HBAR (10^10 tinybars)
SETTLEMENT_BATCH_INTERVAL_MS = 3600000 # 1 hour
# DHT
DHT_BUCKET_SIZE = 20
DHT_ALPHA = 3
DHT_REPLICATION = 20
Appendix C: Error Codes
# Query Errors (0x0001 - 0x00FF)
NOT_FOUND = 0x0001 # Content does not exist
ACCESS_DENIED = 0x0002 # Not authorized
PAYMENT_REQUIRED = 0x0003 # No payment provided
PAYMENT_INVALID = 0x0004 # Payment validation failed
RATE_LIMITED = 0x0005 # Too many requests
VERSION_NOT_FOUND = 0x0006 # Specific version not found
# Channel Errors (0x0100 - 0x01FF)
CHANNEL_NOT_FOUND = 0x0100
CHANNEL_CLOSED = 0x0101
INSUFFICIENT_BALANCE = 0x0102
INVALID_NONCE = 0x0103
INVALID_SIGNATURE = 0x0104
# Validation Errors (0x0200 - 0x02FF)
INVALID_HASH = 0x0200
INVALID_PROVENANCE = 0x0201
INVALID_VERSION = 0x0202
INVALID_MANIFEST = 0x0203
CONTENT_TOO_LARGE = 0x0204
# L2 Entity Graph Errors (0x0210 - 0x021F)
L2_INVALID_STRUCTURE = 0x0210 # Malformed L2EntityGraph
L2_MISSING_SOURCE = 0x0211 # Source L1 not found
L2_ENTITY_LIMIT = 0x0212 # Too many entities
L2_RELATIONSHIP_LIMIT = 0x0213 # Too many relationships
L2_INVALID_ENTITY_REF = 0x0214 # Relationship references invalid entity
L2_CYCLE_DETECTED = 0x0215 # Circular entity reference
L2_INVALID_URI = 0x0216 # Invalid URI or CURIE format
L2_CANNOT_PUBLISH = 0x0217 # L2 content cannot be published
# Network Errors (0x0300 - 0x03FF)
PEER_NOT_FOUND = 0x0300
CONNECTION_FAILED = 0x0301
TIMEOUT = 0x0302
# Internal Errors (0xFF00 - 0xFFFF)
INTERNAL_ERROR = 0xFFFF
Appendix D: Reference Implementation Notes
The reference implementation SHOULD:
- Use Rust for memory safety and performance
- Use libp2p-rs for networking
- Use SQLite for local storage
- Use RocksDB for high-performance caching
- Provide both CLI and library interfaces
- Support WASM compilation for browser nodes (future)
Directory structure:
nodalync/
├── Cargo.toml
├── src/
│ ├── lib.rs # Library root
│ ├── main.rs # CLI entry point
│ ├── types/ # Data structures
│ ├── crypto/ # Cryptographic operations
│ ├── storage/ # Local storage
│ ├── network/ # P2P networking
│ ├── protocol/ # Protocol operations
│ ├── channels/ # Payment channels
│ └── settlement/ # Chain settlement
├── tests/
└── docs/
End of Protocol Specification
Version History:
- 0.7.1 (February 2026): Added CHANNEL_CLOSE_ACK message type; added Offline transitions to content state machine; fixed validation rule §9.1 to include Offline visibility
- 0.3.0 (January 2026): Added SEARCH protocol for network-wide content discovery, ManifestFilter with text search
- 0.2.1-draft (January 2026): Changed currency from NDL token to HBAR (Hedera native)
- 0.2.0-draft (January 2026): Added L2 Entity Graph as protocol-level content type
- 0.1.0-draft (January 2025): Initial draft
Nodalync Architecture
This document defines the module structure, dependencies, and implementation order for the Nodalync protocol.
Module Dependency Graph
┌──────────────────┐ ┌──────────────────┐
│ nodalync-cli │ │ nodalync-mcp │
│ (binary crate) │ │ (MCP server) │
└────────┬─────────┘ └────────┬─────────┘
│ │
└───────────┬────────────┘
│
┌────────────────────────────┼────────────────────────┐
│ │ │
▼ │ ▼
┌─────────────┐ │ ┌──────────────┐
│ nodalync-net│ │ │nodalync-settle│
│ (P2P/DHT) │ │ │ (chain) │
└──────┬──────┘ │ └──────┬───────┘
│ │ │
│ ┌───────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
├─│ nodalync-ops│ │
│ │ (operations)│ │
│ └──────┬──────┘ │
│ │ │
│ ┌──────┴──────┐ │
│ │ │ │
▼ ▼ ▼ ▼
┌─────────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐
│nodalync-wire│ │nodalync- │ │nodalync- │ │nodalync- │
│(serialization)│ │ store │ │ valid │ │ econ │
└──────┬──────┘ └─────┬─────┘ └─────┬─────┘ └─────┬─────┘
│ │ │ │
└───────────────┴──────┬──────┴──────────────┘
│
▼
┌─────────────┐
│nodalync-types│
│ (all structs)│
└──────┬──────┘
│
▼
┌─────────────┐
│nodalync-crypto│
│(hash, sign) │
└─────────────┘
Note: nodalync-net depends on nodalync-ops to dispatch incoming messages to the appropriate handlers.
Crates Overview
| Crate | Purpose | Spec Sections | Dependencies |
|---|---|---|---|
| nodalync-crypto | Hashing, signing, identity | §3 | None (external: sha2, ed25519-dalek) |
| nodalync-types | All data structures | §4 | crypto |
| nodalync-wire | Message serialization/deserialization | §6, Appendix A | types |
| nodalync-store | Local content & manifest storage | §5 | types |
| nodalync-valid | All validation rules | §9 | types |
| nodalync-econ | Revenue distribution math | §10 | types |
| nodalync-ops | Protocol operations (CREATE, QUERY, etc.) | §7 | store, valid, econ, wire |
| nodalync-net | P2P networking, DHT | §11 | wire, ops |
| nodalync-settle | Blockchain settlement | §12 | econ, types |
| nodalync-cli | Command-line interface | — | all |
| nodalync-mcp | MCP server for AI agents | — | ops, store, net, settle |
Key Interfaces (Traits)
Each crate exposes traits that define its contract. Implementations can vary (e.g., in-memory vs SQLite storage) but must satisfy the trait.
nodalync-crypto
pub trait ContentHasher {
    fn hash(content: &[u8]) -> Hash;
    fn verify(content: &[u8], expected: &Hash) -> bool;
}

pub trait Signer {
    fn sign(&self, message: &[u8]) -> Signature;
    fn verify(public_key: &PublicKey, message: &[u8], signature: &Signature) -> bool;
}

pub trait Identity {
    fn generate() -> Self;
    fn public_key(&self) -> &PublicKey;
    fn peer_id(&self) -> PeerId;
    fn sign(&self, message: &[u8]) -> Signature;
}
nodalync-store
pub trait ContentStore {
    fn store(&mut self, hash: &Hash, content: &[u8]) -> Result<()>;
    fn load(&self, hash: &Hash) -> Result<Option<Vec<u8>>>;
    fn exists(&self, hash: &Hash) -> bool;
    fn delete(&mut self, hash: &Hash) -> Result<()>;
}

pub trait ManifestStore {
    fn store(&mut self, manifest: &Manifest) -> Result<()>;
    fn load(&self, hash: &Hash) -> Result<Option<Manifest>>;
    fn list(&self, filter: ManifestFilter) -> Result<Vec<Manifest>>;
    fn update(&mut self, manifest: &Manifest) -> Result<()>;
}

pub trait ProvenanceGraph {
    fn add(&mut self, hash: &Hash, derived_from: &[Hash]) -> Result<()>;
    fn get_roots(&self, hash: &Hash) -> Result<Vec<ProvenanceEntry>>;
    fn get_derivations(&self, hash: &Hash) -> Result<Vec<Hash>>;
}
nodalync-valid
pub trait Validator {
    fn validate_content(&self, content: &[u8], manifest: &Manifest) -> Result<()>;
    fn validate_version(&self, manifest: &Manifest, previous: Option<&Manifest>) -> Result<()>;
    fn validate_provenance(&self, manifest: &Manifest, sources: &[Manifest]) -> Result<()>;
    fn validate_payment(&self, payment: &Payment, channel: &Channel, manifest: &Manifest) -> Result<()>;
    fn validate_message(&self, message: &Message) -> Result<()>;
    fn validate_access(&self, requester: &PeerId, manifest: &Manifest) -> Result<()>;
}
nodalync-econ
pub trait Distributor {
    fn distribute(&self, payment: &Payment, provenance: &[ProvenanceEntry]) -> Vec<Distribution>;
    fn calculate_batch(&self, payments: &[Payment]) -> SettlementBatch;
}
nodalync-ops
pub trait Operations {
    // Content operations
    fn create(&mut self, content: &[u8], content_type: ContentType, metadata: Metadata) -> Result<Hash>;
    fn publish(&mut self, hash: &Hash, visibility: Visibility, price: Amount) -> Result<()>;
    fn update(&mut self, old_hash: &Hash, new_content: &[u8]) -> Result<Hash>;
    fn derive(&mut self, sources: &[Hash], insight: &[u8], metadata: Metadata) -> Result<Hash>;

    // Query operations
    fn preview(&self, hash: &Hash) -> Result<(Manifest, L1Summary)>;
    fn query(&mut self, hash: &Hash, payment: Payment) -> Result<QueryResponse>;
}
nodalync-net
pub trait Network {
    fn announce(&self, hash: &Hash, manifest: &Manifest) -> Result<()>;
    fn search(&self, query: &str, filters: SearchFilters) -> Result<Vec<SearchResult>>;
    fn send(&self, peer: &PeerId, message: Message) -> Result<()>;
    fn receive(&mut self) -> Result<(PeerId, Message)>;
}
nodalync-settle
pub trait Settlement {
    fn submit_batch(&self, batch: SettlementBatch) -> Result<TransactionId>;
    fn verify_settlement(&self, tx_id: &TransactionId) -> Result<SettlementStatus>;
    fn open_channel(&self, peer: &PeerId, deposit: Amount) -> Result<ChannelId>;
    fn close_channel(&self, channel_id: &ChannelId) -> Result<TransactionId>;
}
Testing Strategy
Each crate has three test levels:
1. Unit tests — Test individual functions
   - Location: `src/*.rs` (inline `#[cfg(test)]` modules)
   - Run: `cargo test -p nodalync-{crate}`
2. Integration tests — Test crate as a whole
   - Location: `crates/nodalync-{crate}/tests/`
   - Run: `cargo test -p nodalync-{crate} --test '*'`
3. Spec compliance tests — Verify against spec validation rules
   - Location: `crates/nodalync-{crate}/tests/spec_compliance.rs`
   - These tests are derived directly from spec §9
   - Each test references the specific spec section it validates
Error Handling
All crates use a common error type:
// In nodalync-types
#[derive(Debug, thiserror::Error)]
pub enum NodalyncError {
    #[error("Content validation failed: {0}")]
    ContentValidation(String),
    #[error("Provenance validation failed: {0}")]
    ProvenanceValidation(String),
    #[error("Payment validation failed: {0}")]
    PaymentValidation(String),
    #[error("Storage error: {0}")]
    Storage(String),
    #[error("Network error: {0}")]
    Network(String),
    #[error("Settlement error: {0}")]
    Settlement(String),
    // Maps to spec Appendix C error codes
    #[error("Protocol error {code}: {message}")]
    Protocol { code: u16, message: String },
}
Configuration
Node configuration lives in a platform-specific data directory (unless overridden by NODALYNC_DATA_DIR):
- macOS: `~/Library/Application Support/io.nodalync.nodalync/config.toml`
- Linux: `~/.local/share/nodalync/config.toml` (or `$XDG_DATA_HOME/nodalync/`)
- Windows: `%APPDATA%\nodalync\nodalync\config.toml`
Example `config.toml` (generated by `nodalync init`):
[identity]
keyfile = "<data_dir>/identity/keypair.key"
[storage]
content_dir = "<data_dir>/content"
database = "<data_dir>/nodalync.db"
cache_dir = "<data_dir>/cache"
cache_max_size_mb = 1000
[network]
enabled = true
listen_addresses = ["/ip4/0.0.0.0/tcp/9000"]
bootstrap_nodes = [
"/dns4/nodalync-bootstrap.eastus.azurecontainer.io/tcp/9000/p2p/12D3KooWMqrUmZm4e1BJTRMWqKHCe1TSX9Vu83uJLEyCGr2dUjYm",
]
[settlement]
network = "hedera-testnet"
auto_deposit = false
[economics]
default_price = 0.10 # In HBAR
File Layout
The codebase uses a workspace with two groups of crates:
crates/
├── protocol/ # Core protocol crates (v0.7.x)
│ ├── nodalync-crypto/
│ ├── nodalync-types/
│ ├── nodalync-wire/
│ ├── nodalync-store/
│ ├── nodalync-valid/
│ ├── nodalync-econ/
│ ├── nodalync-ops/
│ ├── nodalync-net/
│ └── nodalync-settle/
└── apps/ # Application crates (v0.10.x)
├── nodalync-cli/
└── nodalync-mcp/
Each crate typically contains:
crates/{group}/nodalync-{module}/
├── Cargo.toml
├── src/
│ ├── lib.rs # Public API, re-exports
│ └── ... # Module-specific files
└── tests/
└── ... # Integration and compliance tests
Nodalync: A Protocol for Fair Knowledge Economics
Gabriel Giangi
gabegiangi@gmail.com
Abstract
We propose a protocol for knowledge economics that ensures original contributors receive perpetual, proportional compensation from all downstream value creation. A researcher can publish valuable findings once and receive perpetual royalties as the ecosystem builds upon their work. A writer’s insights compound in value as others synthesize and extend them. The protocol enables humans to benefit from knowledge compounding—earning from what they know, not just what they continuously produce. The protocol structures knowledge into four layers where source material (L0) forms an immutable foundation from which all derivative value flows. Cryptographic provenance chains link every insight back to its roots. A pay-per-query model routes 95% of each transaction to foundational contributors regardless of derivation depth. Users add references to shared nodes freely; payment occurs only when content is actually queried—flowing through the entire provenance chain to compensate everyone who contributed. The reference implementation includes Model Context Protocol (MCP) integration as the standard interface for AI agent consumption, creating immediate demand from agentic systems. The result is infrastructure where contributing valuable foundational knowledge once creates perpetual economic participation in all derivative work.
1. Introduction
The digital economy has systematically failed knowledge creators. Researchers publish findings that become foundational to entire industries, receiving citations but not compensation. Writers produce content that trains AI models worth billions, with no mechanism for attribution or payment. The problem is architectural: existing systems cannot track how knowledge compounds through chains of derivation, and even when they can, enforcement mechanisms collapse under market pressure.
Current approaches require continuous production. Creators must constantly generate new content to maintain income. This model favors aggregators who consolidate others’ work over original contributors who establish foundations. When insight A enables insight B which enables insight C, creator A receives nothing from C’s value despite providing the foundation. The result is a knowledge economy where humans must work perpetually, never able to benefit from the compounding value of their past contributions.
We propose a protocol that inverts this dynamic. By structuring knowledge into layers with cryptographic provenance and a pay-per-query transaction model, we ensure value flows backward through derivation chains to original contributors every time knowledge is used. Foundational contributors—those who provide source material—receive proportional compensation automatically with each query. A researcher can publish valuable findings once and receive perpetual royalties as the ecosystem builds upon their work. A domain expert’s knowledge compounds in value as others synthesize and extend it. The protocol enables humans to earn from what they know, not just what they continuously produce—creating a path toward economic participation that does not require perpetual labor.
The protocol serves as a knowledge layer between humans and AI. Any agent can query personal knowledge bases through standard interfaces, with every query triggering automatic compensation to all contributors in the provenance chain. This creates infrastructure for a fair knowledge economy—one that bridges the historical gap between research and commerce, enabling foundational contributors to participate economically in all derivative value their work enables.
2. Prior Work
The components of this protocol draw from established systems. Content-addressed storage, pioneered by Git and formalized by IPFS, provides cryptographic integrity guarantees through hash-based identification. Merkle trees enable efficient verification with logarithmic proof sizes. The Model Context Protocol, released by Anthropic and now stewarded by the Linux Foundation, provides a standard interface for AI systems to consume external resources.
Prior attempts at data marketplaces—Ocean Protocol, Streamr, Azure Data Marketplace—failed primarily on the pricing problem: data value varies dramatically by context, and sellers consistently could not determine appropriate prices. NFT royalty systems failed differently: royalties were never enforced on-chain but relied on marketplace cooperation, which collapsed under competitive pressure when platforms began offering zero-royalty trading to attract volume.
Academic citation systems demonstrate that attribution without compensation creates no economic incentive for foundational contribution. Publishers capture margins while authors receive prestige as a substitute for payment. This protocol proposes that attribution and compensation must be unified—provenance chains that simultaneously prove contribution and trigger payment.
Our contribution is not novel components but their integration into a coherent system with a pay-per-query model that ensures compensation flows to all contributors every time knowledge is used. There is no upfront purchase to bypass, no secondary market to circumvent—every query to every node triggers payment through the entire provenance chain.
3. Knowledge Layers
The protocol structures all knowledge into four distinct layers with specific properties:
| Layer | Name | Contents | Properties |
|---|---|---|---|
| L0 | Raw Inputs | Documents, transcripts, notes | Immutable, publishable, queryable |
| L1 | Mentions | Atomic facts with L0 pointers | Extracted, visible as preview |
| L2 | Entity Graph | Entities + RDF relations | Internal only, never shared |
| L3 | Insights | Emergent patterns and conclusions | Shareable, importable as L0 |
L0 represents raw source material—documents, transcripts, notes, research. L0 is immutable once published; updates are published as new versions (see Section 4.2). When shared, L0 content remains on the owner’s node and is accessed only through paid queries.
L1 consists of atomic facts extracted from L0, each maintaining a pointer to its source. L1 serves as a preview layer: when browsing shared content, users see L1 mentions as a summary of what the L0 contains. This enables informed decisions about what to query without requiring payment to evaluate relevance.
L2 is the synthesis layer used for internal organization. It represents entities and the RDF relations between them (subject-predicate-object triples), enabling structured queries across source material. L2 is never shared because it represents reorganization rather than new creation—preventing value extraction through mere restructuring.
L3 represents genuinely emergent insights—conclusions abstract enough to constitute new intellectual property. L3 can be shared and queried like L0. When imported into another user’s graph, L3 functions as their L0, enabling knowledge to compound across ownership boundaries while preserving attribution chains.
4. Provenance
Every node in the system stores its complete derivation history through content-addressed hashing. When content is created or modified, a hash is computed over its contents. This hash serves as a unique identifier enabling trustless verification—identical content produces identical hashes regardless of where or when it is created.
4.1 Node Structure
Each node maintains:
hash: content-addressed identifier for this version
derived_from[]: hashes of content directly contributing to this node
root_L0L1[]: flattened array of all ultimate L0+L1 sources with weights
timestamp: creation time for ordering and staleness detection
previous_version: hash of prior version (null if original)
version_root: hash of first version in chain (stable identifier)
The root_L0L1 array is the key structure for revenue distribution. Regardless of how many intermediate derivation steps occur (L2 synthesis, L3 insight generation), every node maintains direct reference to all foundational sources. An L3 derived from another L3 (imported as L0) inherits the original L3’s root_L0L1 array, extending rather than replacing the provenance chain.
This creates cryptographic proof of contribution. If Alice’s L0 hash appears in Bob’s L3’s root_L0L1 array, Alice’s contribution is provable without requiring social trust or centralized verification. The provenance is in the data structure itself.
4.2 Versioning
L0 is immutable once published. Updates are published as new nodes with new hashes. The previous_version field links to the prior version; the version_root field provides a stable identifier across all versions of the same content.
When Alice updates her L0:
new_L0.previous_version = old_L0.hash
new_L0.version_root = old_L0.version_root (or old_L0.hash if original)
Old versions remain accessible. Users who added references to v1 continue using v1; they can add references to v2 separately if desired. Provenance chains reference specific versions, preserving the historical record of what actually contributed to what. This ensures derivations remain valid even as sources evolve.
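The linking rule above can be sketched in Rust using the §4.1 fields (32-byte hashes assumed; `NodeMeta` is an illustrative name, not a spec type):

```rust
/// Version-linking fields from §4.1 (subset relevant to §4.2).
struct NodeMeta {
    hash: [u8; 32],
    previous_version: Option<[u8; 32]>, // None for the original version
    version_root: [u8; 32],             // stable identifier across the chain
}

/// Publish an update: link back to the prior version and carry the
/// version_root forward unchanged.
fn new_version(old: &NodeMeta, new_hash: [u8; 32]) -> NodeMeta {
    NodeMeta {
        hash: new_hash,
        previous_version: Some(old.hash),
        version_root: old.version_root,
    }
}
```

For an original node, `version_root` is simply its own hash, as the rule above states.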
5. Transactions
The protocol operates on a pay-per-query model. Adding references is free; payment occurs when content is actually queried.
5.1 Reference and Query
Users discover shared content through network indexes that expose metadata: title, L1 mentions (as summary), hash, owner, visibility tier, and version information. This metadata is visible without payment, enabling informed decisions about relevance.
To use content, users add a reference (pointer) to their personal graph. Adding a reference is free—no content is transferred, only a hash is stored locally. The actual content remains on the owner’s node.
When the user (or their agent) queries the reference, the protocol triggers a transaction:
- Query request sent to content owner’s node
- Payment verified via handshake
- Response delivered to requester
- Revenue distributed through provenance chain
The query response can be cached and re-read locally without additional payment. The initial query is logged as “viewed,” enabling local access to already-received content. Subsequent queries to the same node (for updated information or different query parameters) trigger new payments.
5.2 Derivation
To create an L3 that derives from external sources, the user must have queried (and paid for) each source at least once. This ensures foundational contributors are compensated before their work is incorporated into derivative content.
When L3 is created, the full provenance chain is computed:
new_L3.root_L0L1 = union of all source.root_L0L1 arrays
Every foundational source that contributed to any input is included. When this L3 is later queried by others, revenue flows to all contributors in the chain.
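A minimal sketch of this union in Rust. The root array is kept as a list rather than a set so that multiplicity is preserved, since §6.1 pays a source that appears through multiple derivation paths proportionally more; whether the protocol deduplicates at this step is left to the validation rules:

```rust
/// §5.2: the new L3's root set is the union of all source root arrays.
/// Duplicates are kept deliberately (see §6.1 on proportional shares).
fn derive_roots(sources: &[Vec<[u8; 32]>]) -> Vec<[u8; 32]> {
    let mut roots = Vec::new();
    for source_roots in sources {
        roots.extend_from_slice(source_roots); // inherit every foundational root
    }
    roots
}
```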
5.3 L3 Import
When a user queries an L3 and imports it as their own L0, the full provenance chain inherits forward:
imported_L0.root_L0L1 = original_L3.root_L0L1 ∪ {original_L3.hash}
The original L3 creator joins the root contributor set. All upstream sources remain tracked. Any subsequent L3 created using this imported knowledge will distribute revenue to all contributors in the extended chain.
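The inheritance rule can be sketched as a simple append (illustrative function name; as in §5.2, multiplicity handling is left to the validation spec):

```rust
/// §5.3: the imported L0 carries the original L3's full root set
/// plus the L3's own hash, so its creator joins the contributor set.
fn import_roots(original_roots: &[[u8; 32]], original_hash: [u8; 32]) -> Vec<[u8; 32]> {
    let mut roots = original_roots.to_vec();
    roots.push(original_hash);
    roots
}
```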
6. Revenue Distribution
Every query triggers revenue distribution through the entire provenance chain.
6.1 Distribution Formula
For a query generating value V to a node with root contributor set R:
owner_share = 0.05 × V
root_pool = 0.95 × V
per_root_share = root_pool / |R|
The node owner retains 5% as synthesis incentive. The remaining 95% splits equally among all L0+L1 roots in the provenance chain. All roots are weighted equally regardless of content type or derivation distance. A single query distributes payment to every contributor who helped create that knowledge.
When the same source appears multiple times in a provenance chain (through different derivation paths), it receives proportionally more: a source contributing twice receives twice the share.
6.2 Rationale for 95/5
This distribution inverts typical platform economics, where intermediaries capture 10-45% of value. The inversion is intentional: foundational knowledge is systematically undervalued in current markets. Researchers, domain experts, and original thinkers provide the substrate on which all synthesis depends, yet receive nothing from downstream value creation. The 95% allocation to foundational contributors corrects this market failure.
The 5% synthesis fee may appear to disincentivize synthesis, but this concern misunderstands the mechanism. Consider a concrete example: Bob creates an L3 insight using 2 of Alice’s L0 documents, 1 of Carol’s L0 documents, and 2 of his own L0 documents. When queried for 100 tokens:
Bob (owner + 2 roots): 5 + (2/5 × 95) = 43 tokens
Alice (2 roots): 2/5 × 95 = 38 tokens
Carol (1 root): 1/5 × 95 = 19 tokens
Bob receives 43% despite the 5% synthesis fee because he also contributed foundational material. The protocol incentivizes synthesizers to also be contributors. A pure synthesizer using entirely others’ sources receives only the 5% floor—this is by design. The incentive structure rewards those who contribute original knowledge, not those who merely reorganize others’ work.
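The worked example can be reproduced with integer arithmetic (a sketch; `distribute` is an illustrative name, and remainder handling for pools that do not divide evenly is left to the economics spec):

```rust
use std::collections::HashMap;

/// §6.1 split: 5% to the owner, 95% divided equally across root entries.
/// A root listed twice receives two shares.
fn distribute(value: u64, owner: &str, roots: &[&str]) -> HashMap<String, u64> {
    let owner_share = value * 5 / 100;                         // synthesis fee
    let per_root = (value - owner_share) / roots.len() as u64; // equal root shares
    let mut shares: HashMap<String, u64> = HashMap::new();
    *shares.entry(owner.to_string()).or_insert(0) += owner_share;
    for root in roots {
        *shares.entry((*root).to_string()).or_insert(0) += per_root;
    }
    shares
}
```

With Bob as owner and the root multiset {bob, bob, alice, alice, carol}, a 100-token query yields 43/38/19, matching the figures above.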
The 5% synthesis fee is not the endgame for valuable synthesis. If an L3 is foundational enough that others build upon it (import as their L0), the original synthesizer becomes part of their root_L0L1[] arrays. The protocol incentivizes creating insights worth building on, not just worth querying. First-order queries earn 5%; becoming foundational for others’ work is where compounding happens.
6.3 Compounding Returns
The mechanism creates exponential potential for foundational contributors. Consider Alice’s L0 document over three generations of derivation:
Direct queries: 10 users query Alice's L0
Second-order: 10 L3s built on Alice's L0, each queried 10× = 100 payments
Third-order: 100 L3s each enable 10 more = 1,000 payments
Alice’s single L0 contribution earns from all downstream queries. She need not create L3s herself to benefit from the ecosystem building on her work. Contributing valuable foundational knowledge once creates perpetual economic participation—enabling earlier exit from continuous production while maintaining income as others build on one’s contributions.
6.4 Fairness Priorities
Fairness priorities are embedded in protocol design at three levels:
Fair distribution (highest priority): The 95/5 split inverts typical platform economics. Equal root weighting distributes value across all foundational contributors. The more sources an L3 builds upon, the more widely value distributes—rewarding comprehensive synthesis that draws from diverse foundations.
Fair contribution: No gatekeeping on L0 publication. No credentials required. No institutional approval necessary. The market determines value, not committees. Anyone can contribute foundational knowledge; quality is determined by whether others choose to build upon it.
Fair access: Access enables contribution. The protocol supports tiered pricing (commercial/academic/individual), a commons layer for explicitly open contributions, and contributor credits for those who publish L0. These mechanisms ensure that the protocol does not create a knowledge economy accessible only to the wealthy.
7. Agent Integration
The protocol exposes a query interface that any application can consume. The reference implementation includes a Model Context Protocol (MCP) integration as the standard interface for AI agent consumption. MCP, originally developed by Anthropic and now stewarded by the Linux Foundation, provides a standardized way for AI systems to access external resources. Any MCP-compatible agent can query knowledge nodes through this integration layer, with every query automatically triggering compensation through the protocol’s payment mechanism.
7.1 Query Mechanism
Agents submit queries through the MCP integration layer, which translates them into protocol QUERY operations. The protocol returns structured responses with provenance metadata:
response.content: answer to query
response.sources[]: hashes of nodes accessed
response.provenance[]: full derivation chain
response.cost: payment amount for this query
The response includes everything needed for the agent (or its operator) to verify sources and confirm payment. Provenance is embedded in the response, not stored externally. The MCP layer can add application-specific fields (confidence scores, formatted answers) while the protocol handles content delivery and payment.
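The response shape can be written down as a plain struct (field names follow the list above; the concrete Rust types are illustrative assumptions):

```rust
/// §7.1 query response as delivered through the MCP integration layer.
struct QueryResponse {
    content: String,           // answer to the query
    sources: Vec<[u8; 32]>,    // hashes of nodes accessed
    provenance: Vec<[u8; 32]>, // full derivation chain
    cost: u64,                 // payment amount, in tinybars
}
```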
7.2 Payment Handling
The protocol handles the handshake: payment verification triggers response delivery, and revenue distributes through the provenance chain. Application-level concerns—budget controls, cost previews, spending limits, auto-approve settings—are outside protocol scope. Implementations may offer cost estimates before query execution, user-defined budgets for agent sessions, or approval workflows for high-value queries.
7.3 Transparency
The protocol’s message structure provides complete audit data: every query includes timestamp, sender identity, and content hash; every response includes sources accessed; every payment includes the full revenue distribution. Applications can log these protocol events to build comprehensive audit trails—providing transparency into AI knowledge consumption that is impossible with current web scraping approaches.
8. Privacy and Visibility
The protocol is local-first. All data remains on the owner’s node. No centralized storage, no uploads to external platforms. Queries deliver responses; content itself never transfers permanently. This inverts the current paradigm where users upload data to platforms—instead, agents come to users.
8.1 Visibility Tiers
Content owners choose visibility per node:
| Tier | Discoverable | Addable by Others | Queryable |
|---|---|---|---|
| Private | No | No | No (personal use only) |
| Unlisted | No | Yes (if hash known) | Yes (pay-per-query) |
| Shared | Yes | Yes | Yes (pay-per-query) |
Private nodes exist only for personal use—internal organization, drafts, sensitive material. They cannot be discovered, referenced, or queried by others.
Unlisted nodes are queryable but not discoverable. Owners share hashes directly with specific users or groups. This enables selective sharing: grant access to collaborators without public exposure.
Shared nodes are fully public—discoverable through network indexes, addable by anyone, queryable with standard pay-per-query economics.
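The tier rules in the table reduce to two predicates, sketched below. The enum mirrors the spec's three tiers; the helper names are illustrative.

```rust
// Visibility tiers as defined in §8.1.
#[derive(Clone, Copy)]
enum Visibility { Private, Unlisted, Shared }

/// Discoverable through network indexes? Only Shared content.
fn discoverable(v: Visibility) -> bool {
    matches!(v, Visibility::Shared)
}

/// Queryable by others (pay-per-query)? Unlisted content qualifies,
/// but only callers who already hold the hash can address it.
fn queryable_by_others(v: Visibility) -> bool {
    matches!(v, Visibility::Unlisted | Visibility::Shared)
}

fn main() {
    assert!(!discoverable(Visibility::Private));
    assert!(!discoverable(Visibility::Unlisted));
    assert!(discoverable(Visibility::Shared));
    assert!(!queryable_by_others(Visibility::Private));
    assert!(queryable_by_others(Visibility::Unlisted));
}
```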
8.2 Private Sources in Provenance
A shared L3 may derive from private L0 sources. In this case:
- The private source’s hash appears in root_L0L1[]—its existence is visible.
- The private source’s content remains inaccessible—others cannot query it.
- The private source’s owner still receives their share of revenue when the L3 is queried.
- Others see “private source” in provenance—they know it exists but cannot access it.
This enables selective disclosure: publish valuable insights while keeping underlying research private. Consumers trust the synthesis or they don’t—provenance shows that sources exist even if content is not verifiable.
8.3 Identity Privacy
Contributors choose their identity level per contribution. The protocol supports named contributions (full identity attached) and pseudonymous contributions (wallet address only). Provenance hashes are public and enable verification; the identity behind those hashes is configurable.
Future enhancement: Zero-knowledge verified contributions would allow contributors to prove membership in a verified set (e.g., “verified researcher”) without revealing specific identity. This requires additional infrastructure (contributor registries, ZK proof verification) and is planned for a future protocol version.
9. Network
Nodes operate independently, storing their own knowledge graphs and serving their own queries. Discovery occurs through a decentralized index where nodes publish metadata about shared content without revealing the content itself.
Settlement uses smart contracts for payment verification and distribution. When a query executes, the contract verifies payment and distributes revenue according to the provenance chain. Minimal data goes on-chain: payment flows and attestations. Content and queries remain off-chain.
This hybrid architecture—off-chain content, on-chain economics—preserves privacy while enabling trustless compensation.
9.1 Governance
The governance model remains under development. Design goals include: decentralization where possible, market-driven decision-making for most parameters, and protections ensuring broad participation rather than plutocracy. Options under consideration include one-node-one-vote, quadratic voting, and contribution-weighted governance. The final model will be determined through community input prior to mainnet launch.
10. Threat Model
We identify and address the primary attack vectors against the protocol.
10.1 Sybil Attacks
Without identity verification, actors could create multiple pseudonymous identities in an attempt to capture foundational positions in provenance chains. The protocol is identity-agnostic by design—we do not require identity verification at the base layer.
Instead, economic incentives align behavior. Quality content earns; spam does not. The market determines which sources are valuable through query volume. A fragmented identity strategy—creating many accounts with thin contributions—produces no advantage because revenue distributes based on which sources are actually queried, not how many sources exist.
Furthermore, reputation accrues to consistent identity. A single account with many high-quality contributions becomes discoverable and trusted. Fragmenting across pseudonyms sacrifices this reputation benefit. Optional reputation layers can build on the base protocol for contexts requiring stronger identity guarantees.
10.2 Attribution Gaming
Actors might attempt to insert themselves into provenance chains through trivial contributions or synthetic chains between controlled addresses. The protocol does not prevent this at the technical layer—but economic incentives make it unprofitable.
Revenue distributes only when content is queried. Creating thousands of unused nodes generates no income. The market determines value through actual queries. Synthetic chains between controlled addresses simply redistribute funds within the attacker’s own wallet.
10.3 Content Copying
After querying content, a user could theoretically republish it as their own. This is a limitation of any system providing information access. However, several factors mitigate this risk: republished content lacks provenance linkage to the original; the original has earlier timestamps providing evidence of priority; copied content cannot benefit from the original’s reputation or query history; and audit trails document the original query, providing evidence for legal recourse.
10.4 Disputes
The protocol does not adjudicate disputes—it provides evidence. Provenance chains are cryptographic fact: a hash is either in root_L0L1[] or it is not. For suspected plagiarism or parallel discovery:
- Embedding similarity detection can flag potential copies at the application layer.
- Audit trails document access patterns, showing who queried what and when.
- Two independent derivation chains arriving at similar insights is valuable data, not necessarily a conflict—it may indicate robust conclusions.
- Legal systems handle disputes; the protocol provides complete evidence for those systems to adjudicate.
10.5 External Plagiarism
The protocol cannot prevent unauthorized publication of external work at the entry point. Someone could publish an externally-created paper as their own L0. However, the protocol makes such theft transparent and traceable:
- Timestamps record when content was published in-system.
- Earnings are fully visible and auditable.
- Evidence for legal recourse is built-in, not forensic.
- Contributors are encouraged to establish external prior art (journal publication, arXiv, timestamps) as the authoritative record of original creation.
Alternatively, the protocol itself can serve as a proof-of-creation layer—publish to Nodalync first as timestamped record, then pursue traditional publication.
11. Limitations
The protocol does not solve all problems in knowledge economics. We acknowledge the following limitations.
Pricing discovery. The protocol does not determine what queries should cost. Owners set prices; the market accepts or rejects them. This may result in inefficient pricing, particularly in early stages before market norms emerge. However, unlike prior data marketplaces that failed attempting to solve pricing algorithmically, we treat price discovery as a market function rather than a protocol function.
Cold start. The protocol’s value increases with participation. Early adopters face a network with limited content and few users. We expect adoption to begin in specific domains where knowledge value is clear (research, technical documentation, domain expertise) before expanding to broader use cases.
Regulatory uncertainty. Immutable provenance chains may conflict with data protection regulations requiring deletion rights. Implementations must consider jurisdictional requirements. The separation of content (deletable at the node) from provenance hashes (persistent) provides partial mitigation, but legal analysis is required for specific deployments.
Not all knowledge should be monetized. The protocol creates an option for compensation, not a mandate. Commons-based knowledge sharing remains valuable and should continue. The protocol complements rather than replaces open knowledge systems—it provides a path for those who wish to receive compensation without requiring everyone to participate in economic exchange.
12. Conclusion
The Nodalync protocol creates infrastructure for fair knowledge economics. By structuring knowledge into layers with cryptographic provenance, implementing pay-per-query transactions, and distributing revenue through complete derivation chains, the protocol ensures that foundational contributors receive perpetual, proportional compensation from all downstream value creation.
Foundational contributors are the substrate of this economy. A researcher, writer, or domain expert can contribute valuable source material once and benefit as the ecosystem builds upon their work. They need not continuously produce, need not create sophisticated L3 insights, need not compete with aggregators. The protocol routes value backward through derivation chains automatically—creating a path to economic participation that does not require perpetual labor.
For AI systems, the protocol provides a standard interface for consuming human knowledge while respecting attribution and compensation. Every query triggers payment to all contributors in the provenance chain. This creates sustainable infrastructure for AI-human knowledge exchange—not extraction without attribution, but transaction with fair compensation.
The alternative to this protocol is not the knowledge commons—it is the current reality where AI systems train on human knowledge with no mechanism for attribution or payment. The protocol offers a third path: knowledge that flows freely through derivation chains while ensuring that those who contribute to that flow receive proportional benefit.
We propose this as the knowledge layer between humans and AI: infrastructure where contributing valuable knowledge creates perpetual economic participation in all derivative work.
References
[1] Nakamoto, S. (2008). Bitcoin: A Peer-to-Peer Electronic Cash System.
[2] Anthropic. (2024). Model Context Protocol Specification.
[3] Benet, J. (2014). IPFS - Content Addressed, Versioned, P2P File System.
[4] Merkle, R. (1988). A Digital Signature Based on a Conventional Encryption Function.
[5] Douceur, J. (2002). The Sybil Attack. IPTPS.
[6] World Wide Web Consortium. (2014). RDF 1.1 Concepts and Abstract Syntax.
Nodalync Protocol Specification — L2 Entity Graph Addendum
Version: 0.2.0-draft
Date: January 2026
Status: Draft Addendum to v0.7.1
Summary of Changes
This addendum elevates L2 (Entity Graph) from internal-only to a protocol-level content type, while keeping it as personal/private content. Key design decisions:
- Complete provenance chain: L0 → L1 → L2 → L3
- L2 is personal by default: Your L2 represents your unique perspective — it is created Private and is queried by others only if you later choose to share it
- URI-based ontology: Entity types and relationship predicates use URIs for RDF interoperability
- L3 derives from L2: Your insights (L3) are built on your knowledge graph (L2)
Design Philosophy
L2 is Your Perspective
L2 represents how you understand and link entities across the documents you’ve studied. Two people reading the same papers might build very different L2 graphs based on:
- Which entities they consider important
- How they resolve ambiguous references
- What relationships they infer
- Which external ontologies they use
This is valuable intellectual work, but it’s personal. A Private L2 is never directly monetized — its value surfaces when you create L3 insights that others find valuable. (A shared L2 earns the standard synthesis fee when queried directly; see §10.1.)
Economic Model
Alice's L0 (document)
↓
Bob queries Alice's L0 → Alice gets paid
↓
Bob extracts L1 from Alice's L0
↓
You query Bob's L1 → Bob gets paid (Alice gets root share)
↓
You build L2 from Bob's L1 (YOUR perspective)
↓
You create L3 insight from YOUR L2
↓
Eve queries YOUR L3 → You get 5% synthesis fee
→ Alice gets 95% (she's in root_L0L1)
Your L2 work is “invisible” economically — the compensation comes from your L3 insights.
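The payment step at the end of the diagram can be sketched as a pro-rata split. `split_query_payment` is an illustrative name, whole-unit amounts and integer division stand in for the implementation's real rounding rules.

```rust
/// Split a query payment per the model above: 5% synthesis fee to
/// the queried node's owner, 95% divided among root_L0L1 entries
/// pro rata by weight. (Sketch only; rounding is simplified.)
fn split_query_payment(amount: u64, roots: &[(&str, u64)]) -> (u64, Vec<(String, u64)>) {
    let synthesis_fee = amount * 5 / 100; // 5% to the L3 creator
    let root_pool = amount - synthesis_fee; // 95% to foundational sources
    let total_weight: u64 = roots.iter().map(|(_, w)| w).sum();
    let shares = roots
        .iter()
        .map(|(owner, w)| (owner.to_string(), root_pool * w / total_weight))
        .collect();
    (synthesis_fee, shares)
}

fn main() {
    // Eve pays 100 to query your L3; Alice's L0 is the only root.
    let (fee, shares) = split_query_payment(100, &[("alice", 1)]);
    assert_eq!(fee, 5); // your synthesis fee
    assert_eq!(shares, vec![("alice".to_string(), 95)]); // Alice's root share
    println!("fee={} shares={:?}", fee, shares);
}
```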
URI-Based Ontology
Instead of closed enums, L2 uses URIs for extensibility:
# Entity types (can be any ontology)
entity_types: ["schema:Person", "foaf:Person"]
entity_types: ["ndl:Concept"]
entity_types: ["http://example.org/ontology#CustomType"]
# Relationship predicates
predicate: "schema:worksFor"
predicate: "ndl:mentions"
predicate: "http://purl.org/dc/terms/creator"
This enables:
- Standard ontologies (Schema.org, FOAF, Dublin Core)
- Custom domain-specific ontologies
- Interoperability with semantic web tools
- No protocol changes needed for new types
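A minimal sketch of expanding prefixed terms to full URIs, assuming a client-side prefix table; the mappings shown are common defaults, not protocol-mandated.

```rust
use std::collections::HashMap;

/// Expand a prefixed term ("CURIE") such as schema:Person into a
/// full URI. Full URIs and unknown prefixes pass through unchanged.
/// (Illustrative helper; the spec only requires URI-valued terms.)
fn expand_curie(term: &str, prefixes: &HashMap<&str, &str>) -> String {
    if term.starts_with("http://") || term.starts_with("https://") {
        return term.to_string(); // already a full URI
    }
    match term.split_once(':') {
        Some((prefix, local)) => match prefixes.get(prefix) {
            Some(base) => format!("{}{}", base, local),
            None => term.to_string(), // unknown prefix: leave as-is
        },
        None => term.to_string(),
    }
}

fn main() {
    let prefixes = HashMap::from([
        ("schema", "https://schema.org/"),
        ("foaf", "http://xmlns.com/foaf/0.1/"),
    ]);
    assert_eq!(expand_curie("schema:Person", &prefixes), "https://schema.org/Person");
    assert_eq!(
        expand_curie("http://example.org/ontology#CustomType", &prefixes),
        "http://example.org/ontology#CustomType"
    );
}
```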
1. Updated Data Structures
§4.1 Content Types (REPLACE)
enum ContentType : uint8 {
L0 = 0x00, # Raw input (documents, notes, transcripts)
L1 = 0x01, # Mentions (extracted atomic facts)
L2 = 0x02, # Entity Graph (linked entities and relationships)
L3 = 0x03 # Insights (emergent synthesis)
}
Knowledge Layer Semantics:
| Layer | Content | Typical Operation | Value Added |
|---|---|---|---|
| L0 | Raw documents, notes, transcripts | CREATE | Original source material |
| L1 | Atomic facts extracted from L0 | EXTRACT_L1 | Structured, quotable claims |
| L2 | Entities and relationships across L1s | BUILD_L2 | Cross-document linking, entity resolution |
| L3 | Novel insights synthesizing sources | DERIVE | Original analysis and conclusions |
§4.4a Entity Graph (L2) (NEW SECTION)
Insert after §4.4 Mention:
struct L2EntityGraph {
# === Core Identity ===
id: Hash, # H(serialized entities + relationships)
# === Sources ===
source_l1s: L1Reference[], # L1 summaries this graph was built from
source_l2s: Hash[], # Other L2 graphs merged/extended (optional)
# === Graph Content ===
entities: Entity[], # Resolved entities
relationships: Relationship[], # Relationships between entities
# === Statistics ===
entity_count: uint32,
relationship_count: uint32,
source_mention_count: uint32 # Total mentions linked
}
struct L1Reference {
l1_hash: Hash, # Hash of the L1Summary content
l0_hash: Hash, # The original L0 this L1 came from
mention_ids_used: Hash[] # Which specific mentions were used
}
struct Entity {
id: Hash, # Stable entity ID: H(canonical_label || entity_type)
canonical_label: string, # Primary name (max 200 chars)
aliases: string[], # Alternative names/spellings (max 50)
entity_type: EntityType,
# === Evidence ===
source_mentions: MentionRef[], # Which L1 mentions establish this entity
# === Confidence ===
confidence: float64, # 0.0 - 1.0, resolution confidence
resolution_method: ResolutionMethod,
# === Optional Metadata ===
description: string?, # Summary description (max 500 chars)
external_ids: ExternalId[]? # Links to external knowledge bases
}
struct MentionRef {
l1_hash: Hash, # Which L1 contains this mention
mention_id: Hash # Specific mention ID within that L1
}
struct ExternalId {
system: string, # e.g., "wikidata", "orcid", "doi"
identifier: string # The ID in that system
}
struct Relationship {
id: Hash, # H(subject || predicate || object)
subject: Hash, # Entity ID
predicate: string, # Relationship type (max 100 chars)
object: RelationshipObject, # Entity ID or literal
# === Evidence ===
source_mentions: MentionRef[], # Mentions that support this relationship
confidence: float64, # 0.0 - 1.0
# === Temporal (optional) ===
valid_from: Timestamp?,
valid_to: Timestamp?
}
enum RelationshipObject {
EntityRef(Hash), # Reference to another entity
Literal(LiteralValue) # A value (string, number, date)
}
struct LiteralValue {
value_type: LiteralType,
value: string # Encoded value
}
enum LiteralType : uint8 {
String = 0x00,
Integer = 0x01,
Float = 0x02,
Date = 0x03, # ISO 8601
DateTime = 0x04, # ISO 8601
Boolean = 0x05,
Uri = 0x06
}
enum EntityType : uint8 {
Person = 0x00,
Organization = 0x01,
Location = 0x02,
Concept = 0x03,
Event = 0x04,
Work = 0x05, # Paper, book, article, etc.
Product = 0x06,
Technology = 0x07,
Metric = 0x08, # Quantitative measure
TimePoint = 0x09,
Other = 0xFF
}
enum ResolutionMethod : uint8 {
ExactMatch = 0x00, # Same string
Normalized = 0x01, # Case/punctuation normalized
Alias = 0x02, # Known alias matched
Coreference = 0x03, # Pronoun/reference resolved
ExternalLink = 0x04, # Matched via external KB
Manual = 0x05, # Human-verified
AIAssisted = 0x06 # ML model assisted
}
Constraints:
L2 Entity Graph constraints:
1. len(source_l1s) >= 1 # Must derive from at least one L1
2. len(entities) >= 1 # Must have at least one entity
3. Each entity.id is unique within the graph
4. Each relationship references valid entity IDs
5. All MentionRefs point to valid L1s in source_l1s
6. 0.0 <= confidence <= 1.0
7. len(canonical_label) <= 200
8. len(aliases) <= 50
9. len(predicate) <= 100
10. entity_count == len(entities)
11. relationship_count == len(relationships)
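Several of these constraints (2, 3, 6, 10) can be checked mechanically, as in this pared-down sketch; the structs are reduced to the fields the checks need.

```rust
// Pared-down structures for illustration only.
struct Entity { id: u64, confidence: f64 }
struct L2Graph { entities: Vec<Entity>, entity_count: u32 }

/// Check a subset of the L2 constraints listed above.
fn validate_l2(g: &L2Graph) -> bool {
    // Constraint 2: at least one entity.
    if g.entities.is_empty() { return false; }
    // Constraint 10: declared count matches actual count.
    if g.entity_count as usize != g.entities.len() { return false; }
    // Constraint 3: entity IDs unique within the graph.
    let mut ids: Vec<u64> = g.entities.iter().map(|e| e.id).collect();
    ids.sort_unstable();
    ids.dedup();
    if ids.len() != g.entities.len() { return false; }
    // Constraint 6: confidence in [0.0, 1.0].
    g.entities.iter().all(|e| (0.0..=1.0).contains(&e.confidence))
}

fn main() {
    let ok = L2Graph { entities: vec![Entity { id: 1, confidence: 0.9 }], entity_count: 1 };
    assert!(validate_l2(&ok));
    let bad = L2Graph { entities: vec![Entity { id: 1, confidence: 1.5 }], entity_count: 1 };
    assert!(!validate_l2(&bad)); // confidence out of range
}
```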
§4.4b L2 Summary (Preview) (NEW SECTION)
For previewing L2 content without revealing the full graph:
struct L2Summary {
l2_hash: Hash, # Hash of the full L2EntityGraph
entity_count: uint32,
relationship_count: uint32,
source_l1_count: uint32,
# === Preview (free) ===
top_entities: EntityPreview[], # Top 10 entities by mention count
entity_type_distribution: TypeCount[], # How many of each type
relationship_types: string[], # List of predicates used (max 20)
# === Quality Indicators ===
avg_confidence: float64,
cross_document_links: uint32 # Entities appearing in multiple L1s
}
struct EntityPreview {
id: Hash,
canonical_label: string,
entity_type: EntityType,
mention_count: uint32, # How many mentions support this entity
relationship_count: uint32 # Relationships involving this entity
}
struct TypeCount {
entity_type: EntityType,
count: uint32
}
§4.5 Provenance (UPDATED)
Update the constraints to include L2:
struct Provenance {
root_L0L1: ProvenanceEntry[], # All foundational L0/L1 sources
derived_from: Hash[], # Direct parent hashes (any content type)
depth: uint32 # Max derivation depth from any L0
}
Constraints:
- root_L0L1 contains entries of type L0 or L1 only (never L2 or L3)
- L0 content: root_L0L1 = [self], derived_from = [], depth = 0
- L1 content: root_L0L1 = [parent L0], derived_from = [L0 hash], depth = 1
- L2 content: root_L0L1 = merged roots from source L1s,
derived_from = source L1 hashes, depth = max(source.depth) + 1
- L3 content: root_L0L1 = merged roots from all sources,
derived_from = source hashes, depth = max(source.depth) + 1
- All entries in derived_from MUST have been queried by creator
Provenance Chain Examples:
Simple chain:
L0(doc) → L1(mentions) → L2(entities) → L3(insight)
depth: 0 1 2 3
Branching:
L0(doc1) → L1(m1) ─┐
├→ L2(graph) → L3(insight)
L0(doc2) → L1(m2) ─┘
L2.provenance = {
root_L0L1: [doc1, doc2],
derived_from: [m1, m2],
depth: 2
}
L3.provenance = {
root_L0L1: [doc1, doc2], # Inherited from L2
derived_from: [L2.hash],
depth: 3
}
L3 deriving directly from L1 (skipping L2):
L0(doc) → L1(mentions) → L3(insight)
L3.provenance = {
root_L0L1: [doc],
derived_from: [mentions],
depth: 2
}
L3 deriving from mix of L1 and L2:
L0(doc1) → L1(m1) → L2(graph) ─┐
├→ L3(insight)
L0(doc2) → L1(m2) ─────────────┘
L3.provenance = {
root_L0L1: [doc1, doc2], # Merged from both paths
derived_from: [L2.hash, m2],
depth: 3 # max(2, 1) + 1
}
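The root-merging and depth rules above can be sketched as follows; types are pared down to strings, and `derive_provenance` is an illustrative helper, not a normative algorithm.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
struct Provenance {
    root_l0l1: BTreeSet<String>, // foundational L0/L1 hashes
    derived_from: Vec<String>,   // direct parent hashes
    depth: u32,                  // max derivation depth from any L0
}

/// Derive provenance for new content: merge root sets from all
/// sources and set depth = max(source depth) + 1.
fn derive_provenance(sources: &[(&str, &Provenance)]) -> Provenance {
    let mut roots = BTreeSet::new();
    let mut depth = 0;
    for (_, p) in sources {
        roots.extend(p.root_l0l1.iter().cloned()); // merge roots
        depth = depth.max(p.depth);
    }
    Provenance {
        root_l0l1: roots,
        derived_from: sources.iter().map(|(h, _)| h.to_string()).collect(),
        depth: depth + 1,
    }
}

fn main() {
    // Branching example above: L2 built from two L1s (depth 1 each)
    // merges roots [doc1, doc2] and has depth max(1, 1) + 1 = 2.
    let m1 = Provenance { root_l0l1: BTreeSet::from(["doc1".to_string()]), derived_from: vec!["doc1".to_string()], depth: 1 };
    let m2 = Provenance { root_l0l1: BTreeSet::from(["doc2".to_string()]), derived_from: vec!["doc2".to_string()], depth: 1 };
    let l2 = derive_provenance(&[("m1", &m1), ("m2", &m2)]);
    assert_eq!(l2.depth, 2);
    assert_eq!(l2.root_l0l1, BTreeSet::from(["doc1".to_string(), "doc2".to_string()]));
    let l3 = derive_provenance(&[("l2_hash", &l2)]);
    assert_eq!(l3.depth, 3); // L3 from the L2 alone
}
```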
2. Updated Message Types
§6.2 Discovery Messages (UPDATED)
L2 content can be announced and searched like any other content:
# AnnouncePayload for L2
When content_type == L2:
l1_summary field is replaced with l2_summary: L2Summary
# SearchResult for L2
struct SearchResult {
hash: Hash,
content_type: ContentType,
title: string,
owner: PeerId,
# Type-specific preview:
l1_summary: L1Summary?, # If L0 or L1
l2_summary: L2Summary?, # If L2
price: Amount,
total_queries: uint64,
relevance_score: float64,
publisher_addresses: string[] # Multiaddresses for reconnection
}
§6.3a L2 Preview Messages (NEW)
# L2_PREVIEW_REQUEST = 0x0210
struct L2PreviewRequestPayload {
hash: Hash
}
# L2_PREVIEW_RESPONSE = 0x0211
struct L2PreviewResponsePayload {
hash: Hash,
manifest: Manifest,
l2_summary: L2Summary
}
§6.1 MessageType (UPDATED)
Add new message types:
enum MessageType : uint16 {
# ... existing types ...
# L2 Preview (0x02xx range, after Preview)
L2_PREVIEW_REQUEST = 0x0210,
L2_PREVIEW_RESPONSE = 0x0211,
}
3. Updated Protocol Operations
§7.1.2a Build L2 (Entity Graph) (NEW OPERATION)
Insert after §7.1.2 Extract L1:
BUILD_L2(source_l1s: Hash[], config: L2BuildConfig?) → Hash
Purpose:
Build an L2 Entity Graph from one or more L1 sources.
This operation performs entity extraction, resolution, and relationship inference.
Preconditions:
- All source L1s have been queried (payment proof exists)
- len(source_l1s) >= 1
Procedure:
1. Verify all L1 sources were queried:
For each l1_hash in source_l1s:
assert cache.has(l1_hash) OR content.has(l1_hash)
l1 = load_l1(l1_hash)
assert l1.content_type == L1
2. Extract entities from mentions:
raw_entities = []
For each l1 in source_l1s:
For each mention in l1.mentions:
extracted = extract_entities(mention)
raw_entities.extend(extracted)
3. Resolve entities (merge duplicates):
resolved_entities = resolve_entities(raw_entities, config)
# This handles:
# - Exact string matching
# - Alias resolution
# - Coreference resolution
# - External KB linking (optional)
4. Extract relationships:
relationships = extract_relationships(resolved_entities, source_l1s)
5. Build L2 structure:
l2_graph = L2EntityGraph {
id: computed after serialization,
source_l1s: [L1Reference for each l1],
source_l2s: [],
entities: resolved_entities,
relationships: relationships,
entity_count: len(resolved_entities),
relationship_count: len(relationships),
source_mention_count: total_mentions_linked
}
6. Compute hash:
content = serialize(l2_graph)
hash = ContentHash(content)
l2_graph.id = hash
7. Compute provenance:
root_entries = []
For each l1 in source_l1s:
l1_prov = get_provenance(l1)
For each entry in l1_prov.root_L0L1:
merge_or_increment(root_entries, entry)
provenance = Provenance {
root_L0L1: root_entries,
derived_from: source_l1s,
depth: max(l1.provenance.depth for l1 in source_l1s) + 1
}
8. Create manifest:
manifest = Manifest {
hash: hash,
content_type: L2,
owner: my_peer_id,
version: Version { number: 1, previous: null, root: hash, ... },
visibility: Private,
provenance: provenance,
...
}
9. Store content and manifest locally
10. Return hash
struct L2BuildConfig {
# Entity resolution settings
resolution_threshold: float64?, # Minimum confidence to merge (default: 0.8)
use_external_kb: bool?, # Link to external knowledge bases
external_kb_list: string[]?, # Which KBs to use: ["wikidata", "dbpedia"]
# Relationship extraction
extract_implicit: bool?, # Infer relationships not explicitly stated
relationship_types: string[]? # Limit to specific predicates
}
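The resolution algorithm in step 3 is implementation-defined. As one minimal strategy, corresponding to the `Normalized` resolution method, labels can be matched case- and punctuation-insensitively; all names here are illustrative.

```rust
use std::collections::HashMap;

/// Normalize a label: drop punctuation, lowercase, collapse spaces.
fn normalize(label: &str) -> String {
    label.chars()
        .filter(|c| c.is_alphanumeric() || c.is_whitespace())
        .collect::<String>()
        .to_lowercase()
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
}

/// Merge raw mention labels into resolved entities keyed by the
/// normalized label; returns (canonical_label, mention_count) pairs,
/// using the first-seen spelling as canonical. (Sketch only.)
fn resolve(raw: &[&str]) -> Vec<(String, usize)> {
    let mut groups: HashMap<String, (String, usize)> = HashMap::new();
    for label in raw {
        let key = normalize(label);
        let entry = groups.entry(key).or_insert((label.to_string(), 0));
        entry.1 += 1; // count supporting mentions
    }
    let mut out: Vec<_> = groups.into_values().collect();
    out.sort();
    out
}

fn main() {
    let resolved = resolve(&["Ada Lovelace", "ada lovelace.", "Babbage"]);
    assert_eq!(resolved.len(), 2); // two entities after merging
    assert!(resolved.iter().any(|(l, n)| l == "Ada Lovelace" && *n == 2));
}
```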
§7.1.2b Merge L2 (NEW OPERATION)
Merge multiple L2 graphs into one:
MERGE_L2(source_l2s: Hash[], config: L2MergeConfig?) → Hash
Purpose:
Combine multiple L2 Entity Graphs, resolving entities across them.
Creates a unified knowledge graph from multiple domain-specific graphs.
Preconditions:
- All source L2s have been queried (payment proof exists)
- len(source_l2s) >= 2
Procedure:
1. Verify all L2 sources were queried
2. Collect all entities and relationships from sources
3. Cross-graph entity resolution:
# Find same entities appearing in different graphs
merged_entities = resolve_across_graphs(source_l2s, config)
4. Merge relationships (update entity references)
5. Build new L2 with:
source_l1s: union of all source L1 references
source_l2s: the input source_l2s
6. Compute provenance:
# Roots come from all underlying L1s (via source L2s)
root_entries = merge roots from all source_l2s
provenance = Provenance {
root_L0L1: root_entries,
derived_from: source_l2s,
depth: max(l2.provenance.depth for l2 in source_l2s) + 1
}
7. Store and return hash
§7.1.5 Derive (Create L3) (UPDATED)
L3 can now derive from L2 in addition to L0, L1, and other L3:
DERIVE(sources: Hash[], insight_content: bytes, metadata: Metadata) → Hash
Sources may include:
- L0 content (raw documents)
- L1 content (mention collections)
- L2 content (entity graphs)
- L3 content (other insights)
All sources must have been queried (payment proof exists).
Provenance computation:
For L0/L1 sources: merge their root_L0L1 directly
For L2 sources: merge the L2's root_L0L1 (which traces back to L0/L1)
For L3 sources: merge the L3's root_L0L1 (recursive)
derived_from = all source hashes
depth = max(source.provenance.depth) + 1
§7.2.2a L2 Preview (NEW)
L2_PREVIEW(hash: Hash) → (Manifest, L2Summary)
Procedure:
1. Send L2_PREVIEW_REQUEST to content owner
2. Receive L2_PREVIEW_RESPONSE
3. Validate manifest
4. Return (manifest, l2_summary)
Cost: Free (like L1 preview)
4. Updated Validation Rules
§9.1 Content Validation (UPDATED)
VALIDATE_CONTENT(content: bytes, manifest: Manifest) → bool
Rules:
# ... existing rules 1-6 ...
7. manifest.content_type in {L0, L1, L2, L3} # Updated
8. manifest.visibility in {Private, Unlisted, Shared}
# L2-specific validation
9. If manifest.content_type == L2:
l2 = deserialize(content) as L2EntityGraph
assert l2.id == manifest.hash
assert len(l2.source_l1s) >= 1
assert len(l2.entities) >= 1
assert all entity IDs are unique
assert all relationship entity refs are valid
assert all MentionRefs point to valid source L1s
assert l2.entity_count == len(l2.entities)
assert l2.relationship_count == len(l2.relationships)
§9.3 Provenance Validation (UPDATED)
VALIDATE_PROVENANCE(manifest: Manifest, sources: Manifest[]) → bool
Rules:
1. If manifest.content_type == L0:
manifest.provenance.root_L0L1 == [self_entry]
manifest.provenance.derived_from == []
manifest.provenance.depth == 0
2. If manifest.content_type == L1:
len(manifest.provenance.root_L0L1) >= 1
manifest.provenance.derived_from contains exactly one L0 hash
manifest.provenance.depth == 1
All root_L0L1 entries are type L0
3. If manifest.content_type == L2:
len(manifest.provenance.root_L0L1) >= 1
len(manifest.provenance.derived_from) >= 1
All derived_from are L1 or L2 hashes
All root_L0L1 entries are type L0 or L1
manifest.provenance.depth >= 2
depth == max(source.depth) + 1
4. If manifest.content_type == L3:
len(manifest.provenance.root_L0L1) >= 1
len(manifest.provenance.derived_from) >= 1
All derived_from hashes exist in sources
All root_L0L1 entries are type L0 or L1
depth == max(source.depth) + 1
5. For all types:
Computed root_L0L1 matches declared root_L0L1
No cycles in derived_from graph
All weights > 0
5. Economic Rules (UPDATED)
§10.1 Revenue Distribution (UPDATED)
The distribution formula remains unchanged. L2 creators receive payment when:
- Their L2 is queried directly — They get the synthesis fee (5%) plus any roots they contributed
- Their L2 is used in an L3 — Their L2’s root_L0L1 is merged, so underlying L0/L1 creators are paid
Important: L2 creators do NOT automatically get compensation when their L2 is derived from. Instead:
- The root_L0L1 (which traces back through the L2 to original L0/L1) gets paid
- If the L2 creator also created some of those L0/L1s, they get that share
- The L2 creator’s work is compensated when someone queries the L2
This maintains the principle: value flows to foundational contributors (L0/L1), while L2/L3 creators earn through synthesis fees when their content is queried.
§10.2 Distribution Example (UPDATED)
Extended scenario with L2:
Alice creates L0 (document)
Bob extracts L1 from Alice's L0
Carol builds L2 entity graph from Bob's L1
Dave creates L3 insight from Carol's L2
Eve queries Dave's L3 for 100 HBAR
Provenance chain:
L0 (Alice) → L1 (Bob, depth=1) → L2 (Carol, depth=2) → L3 (Dave, depth=3)
Dave's L3 provenance:
root_L0L1 = [{ hash: alice_l0, owner: Alice, weight: 1 }]
derived_from = [carol_l2]
depth = 3
Distribution of 100 HBAR payment:
Dave (L3 owner, synthesis fee): 5 HBAR
Root pool: 95 HBAR
Only root_L0L1 entries share the pool:
Alice (L0 owner): 95 HBAR
Carol receives nothing from THIS query.
Carol earns when someone queries HER L2 directly.
What if Carol also contributed an L0?
If Carol had created L0_carol that Bob also used:
root_L0L1 = [
{ hash: alice_l0, owner: Alice, weight: 1 },
{ hash: carol_l0, owner: Carol, weight: 1 }
]
Then distribution would be:
Dave (synthesis): 5 HBAR
Alice (1/2 root pool): 47.5 HBAR
Carol (1/2 root pool): 47.5 HBAR
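The worked numbers above can be reproduced with a small pro-rata helper, using fractional amounts for clarity; `distribute` is an illustrative name, not a protocol function.

```rust
/// Distribute a query payment: 5% synthesis fee to the queried
/// node's owner, 95% split among root_L0L1 entries by weight.
fn distribute(amount: f64, roots: &[(&str, f64)]) -> (f64, Vec<(String, f64)>) {
    let synthesis_fee = amount * 0.05; // 5% synthesis fee
    let pool = amount - synthesis_fee; // 95% root pool
    let total: f64 = roots.iter().map(|(_, w)| w).sum();
    let shares = roots.iter()
        .map(|(owner, w)| (owner.to_string(), pool * w / total))
        .collect();
    (synthesis_fee, shares)
}

fn main() {
    // Scenario above: Carol also contributed an L0, so she splits
    // the root pool equally with Alice; Dave keeps the synthesis fee.
    let roots = [("alice", 1.0), ("carol", 1.0)];
    let (dave_fee, shares) = distribute(100.0, &roots);
    assert_eq!(dave_fee, 5.0);
    assert_eq!(shares[0], ("alice".to_string(), 47.5));
    assert_eq!(shares[1], ("carol".to_string(), 47.5));
}
```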
6. Appendix Updates
Appendix B: Constants (ADD)
# L2 Entity Graph limits
MAX_ENTITIES_PER_L2 = 10000
MAX_RELATIONSHIPS_PER_L2 = 50000
MAX_ALIASES_PER_ENTITY = 50
MAX_CANONICAL_LABEL_LENGTH = 200
MAX_PREDICATE_LENGTH = 100
MAX_ENTITY_DESCRIPTION_LENGTH = 500
MAX_SOURCE_L1S_PER_L2 = 100
MAX_SOURCE_L2S_PER_MERGE = 20
Appendix C: Error Codes (ADD)
# L2 specific errors
L2_INVALID_STRUCTURE = 0x0210 # Malformed L2EntityGraph
L2_MISSING_SOURCE = 0x0211 # Source L1 not found
L2_ENTITY_LIMIT = 0x0212 # Too many entities
L2_RELATIONSHIP_LIMIT = 0x0213 # Too many relationships
L2_INVALID_ENTITY_REF = 0x0214 # Relationship references invalid entity
L2_CYCLE_DETECTED = 0x0215 # Circular entity reference
7. Migration Notes
Backward Compatibility
- Existing L0 → L1 → L3 chains remain valid
- L2 is optional; implementations can continue without it
- Nodes that don’t understand L2 treat it as unknown content type
- Network upgrade is additive (no breaking changes)
Recommended Upgrade Path
- Phase 1: Add L2 data structures to types
- Phase 2: Add L2 validation rules
- Phase 3: Add BUILD_L2 operation
- Phase 4: Update DERIVE to accept L2 sources
- Phase 5: Add L2 preview messages
- Phase 6: Update DHT announcements
8. Design Rationale
Why L2 at Protocol Level?
- Complete Provenance: Without L2, the provenance chain has a gap. Entity resolution work is invisible.
- Fair Compensation: Building high-quality entity graphs requires significant effort (manual curation, ML models, external KB integration). This work deserves compensation.
- Reusability: A well-built entity graph is valuable to many consumers. Making it a first-class content type enables this.
- Interoperability: Protocol-level standardization ensures L2 graphs from different nodes are compatible.
Why L0/L1 Remain the Roots?
The economic model preserves foundational value:
- L0/L1 represent irreducible source material
- L2/L3 are transformations that add value but depend on foundations
- Synthesis fees (5%) compensate L2/L3 creators for their work
- Root pool (95%) ensures original contributors are always paid
This prevents value extraction where intermediaries capture all revenue without compensating sources.
L2 Implementation Flexibility
The spec defines structures but not algorithms:
- Entity extraction: Rule-based, NLP, or ML
- Entity resolution: String matching, embedding similarity, or external KB
- Relationship extraction: Dependency parsing, pattern matching, or LLM
Implementers choose appropriate methods for their use case.
Module: nodalync-crypto
Source: Protocol Specification §3
Overview
This module provides all cryptographic primitives for the Nodalync protocol. It has no internal dependencies and should be implemented first.
Dependencies
External only:
- sha2 — SHA-256 implementation
- ed25519-dalek — Ed25519 signatures
- rand — Random number generation
- bs58 — Base58 encoding (for human-readable IDs)
§3.1 Hash Function
Algorithm: SHA-256
Content hashes are computed as:
ContentHash(content) = H(
0x00 || # Domain separator for content
len(content) as uint64 || # Big-endian length prefix
content # Raw content bytes
)
Implementation Notes
- Use domain separator 0x00 to prevent hash collisions across different uses
- Length is encoded as big-endian uint64
- Returns 32-byte hash
Test Cases
- Determinism: Same content → same hash
- Uniqueness: Different content → different hash (probabilistic)
- Domain separation: ContentHash(x) ≠ H(x) (raw hash without prefix)
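The preimage layout and the test cases above can be demonstrated directly. SHA-256 is not in Rust's standard library, so std's DefaultHasher stands in here purely to show determinism and domain separation; a real node would use a SHA-256 implementation such as the sha2 crate listed under Dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Build the domain-separated preimage: 0x00 || len as u64 BE || content.
fn preimage(content: &[u8]) -> Vec<u8> {
    let mut bytes = vec![0x00u8]; // domain separator for content
    bytes.extend_from_slice(&(content.len() as u64).to_be_bytes()); // big-endian length
    bytes.extend_from_slice(content); // raw content bytes
    bytes
}

/// Stand-in for SHA-256 (illustration only; NOT cryptographic here).
fn stand_in_hash(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(bytes);
    h.finish()
}

fn content_hash(content: &[u8]) -> u64 {
    stand_in_hash(&preimage(content))
}

fn main() {
    // Determinism: same content, same hash.
    assert_eq!(content_hash(b"hello"), content_hash(b"hello"));
    // Domain separation: ContentHash(x) differs from hashing raw x.
    assert_ne!(content_hash(b"hello"), stand_in_hash(b"hello"));
    // Preimage layout: 1 separator byte + 8 length bytes + content.
    assert_eq!(preimage(b"hello").len(), 1 + 8 + 5);
}
```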
§3.2 Identity
Algorithm: Ed25519
Keypair Generation
fn generate_keypair() -> (PrivateKey, PublicKey)
PeerId Derivation
PeerId is derived from public key:
PeerId = H(
0x00 || # Key type: Ed25519
public_key # 32 bytes
)[0:20] # Truncate to 20 bytes
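The key-type prefix and 20-byte truncation can be sketched as follows. `stand_in_digest` is a hypothetical, non-cryptographic placeholder for SHA-256 (which is not in Rust's standard library); only the preimage layout and truncation shown are from the spec.

```rust
/// Derive a 20-byte PeerId: H(0x00 || public_key)[0:20].
fn peer_id(public_key: &[u8; 32]) -> [u8; 20] {
    // Build the preimage: key-type byte 0x00, then the 32-byte key.
    let mut preimage = Vec::with_capacity(33);
    preimage.push(0x00);
    preimage.extend_from_slice(public_key);
    // Stand-in digest: a real node would use SHA-256 here.
    let digest = stand_in_digest(&preimage);
    // Truncate the 32-byte digest to the first 20 bytes.
    let mut id = [0u8; 20];
    id.copy_from_slice(&digest[..20]);
    id
}

/// Hypothetical 32-byte digest via xor-folding; illustration only,
/// NOT collision-resistant.
fn stand_in_digest(bytes: &[u8]) -> [u8; 32] {
    let mut d = [0u8; 32];
    for (i, b) in bytes.iter().enumerate() {
        d[i % 32] ^= b.rotate_left((i % 8) as u32);
    }
    d
}

fn main() {
    let pk = [7u8; 32];
    let id = peer_id(&pk);
    assert_eq!(id.len(), 20);      // PeerId is 20 bytes
    assert_eq!(peer_id(&pk), id);  // deterministic
}
```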
Human-Readable Format
Format: ndl1 + base32(PeerId)
Example: ndl1qpzry9x8gf2tvdw0s3jn54khce6mua7l
Implementation Notes
- PeerId is 20 bytes (160 bits) — sufficient entropy, compact
- Prefix ndl1 identifies Nodalync addresses (like bc1 for Bitcoin)
- Use Bech32 or similar for human-readable encoding with a checksum
Test Cases
- Determinism: Same public key → same PeerId
- Roundtrip: encode → decode → original PeerId
- Checksum: Invalid checksum rejected
§3.3 Signatures
All protocol messages requiring authentication are signed.
Signature Creation
#![allow(unused)]
fn main() {
fn sign(private_key: &PrivateKey, message: &[u8]) -> Signature
}
Internally:
signature = Ed25519_Sign(private_key, H(message))
Signature Verification
#![allow(unused)]
fn main() {
fn verify(public_key: &PublicKey, message: &[u8], signature: &Signature) -> bool
}
Internally:
Ed25519_Verify(public_key, H(message), signature)
SignedMessage Structure
#![allow(unused)]
fn main() {
pub struct SignedMessage {
pub payload: Vec<u8>,
pub signer: PeerId,
pub signature: Signature,
}
}
Test Cases
- Valid signature: Sign → Verify succeeds
- Tampered message: Modify payload → Verify fails
- Wrong key: Verify with different public key → fails
- Truncated signature: Short signature → fails
§3.4 Content Addressing
Content is referenced by its hash. The hash serves as a unique, verifiable identifier.
Verification
#![allow(unused)]
fn main() {
fn verify_content(content: &[u8], expected_hash: &Hash) -> bool {
ContentHash(content) == expected_hash
}
}
Test Cases
- Valid content: Verify succeeds
- Tampered content: Single byte change → Verify fails
Data Types
#![allow(unused)]
fn main() {
/// 32-byte SHA-256 hash
pub struct Hash(pub [u8; 32]);
/// Ed25519 private key (32 bytes, keep secret)
pub struct PrivateKey([u8; 32]);
/// Ed25519 public key (32 bytes)
pub struct PublicKey(pub [u8; 32]);
/// Ed25519 signature (64 bytes)
pub struct Signature(pub [u8; 64]);
/// Truncated hash of public key (20 bytes)
pub struct PeerId(pub [u8; 20]);
/// Milliseconds since Unix epoch
pub type Timestamp = u64;
}
Public API
#![allow(unused)]
fn main() {
// Content hashing
pub fn content_hash(content: &[u8]) -> Hash;
pub fn verify_content(content: &[u8], expected: &Hash) -> bool;
// Identity
pub fn generate_identity() -> (PrivateKey, PublicKey);
pub fn peer_id_from_public_key(public_key: &PublicKey) -> PeerId;
pub fn peer_id_to_string(peer_id: &PeerId) -> String;
pub fn peer_id_from_string(s: &str) -> Result<PeerId, ParseError>;
// Signing
pub fn sign(private_key: &PrivateKey, message: &[u8]) -> Signature;
pub fn verify(public_key: &PublicKey, message: &[u8], signature: &Signature) -> bool;
}
Appendix: Hash Domain Separators
From spec Appendix A.2:
| Use | Domain Byte | Description |
|---|---|---|
| Content | 0x00 | Content hashing |
| Messages | 0x01 | Message signing |
| Channels | 0x02 | Channel state |
These ensure hashes computed for different purposes never collide.
Module: nodalync-types
Source: Protocol Specification §4
Overview
This module defines all data structures used across the protocol. It contains no logic, only type definitions with their validation constraints documented.
Dependencies
- nodalync-crypto — Hash, PeerId, Signature types
- serde — Serialization derives
§4.1 ContentType
#![allow(unused)]
fn main() {
#[repr(u8)]
pub enum ContentType {
/// Raw input (documents, notes, transcripts)
L0 = 0x00,
/// Mentions (extracted atomic facts)
L1 = 0x01,
/// Entity Graph (personal knowledge structure) - always private
L2 = 0x02,
/// Insights (emergent synthesis)
L3 = 0x03,
}
}
Knowledge Layer Semantics:
| Layer | Queryable | Purpose |
|---|---|---|
| L0 | Yes | Original source material |
| L1 | Yes | Structured, quotable claims |
| L2 | No | Your personal perspective (cross-document linking) |
| L3 | Yes | Original analysis and conclusions |
Note: L2 is personal — always visibility = Private, never announced, never queried by others.
§4.2 Visibility
#![allow(unused)]
fn main() {
#[repr(u8)]
pub enum Visibility {
/// Local only, not served to others
Private = 0x00,
/// Served if hash known, not announced to DHT
Unlisted = 0x01,
/// Announced to DHT, publicly queryable
Shared = 0x02,
}
}
§4.3 Version
#![allow(unused)]
fn main() {
pub struct Version {
/// Sequential version number (1-indexed)
pub number: u32,
/// Hash of previous version (None if first version)
pub previous: Option<Hash>,
/// Hash of first version (stable identifier across versions)
pub root: Hash,
/// Creation timestamp
pub timestamp: Timestamp,
}
}
Constraints:
- If number == 1: previous MUST be None, root MUST equal the content hash
- If number > 1: previous MUST be Some, root MUST equal previous.root
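The two version-chain constraints can be expressed directly. A sketch with simplified stand-in types (the real Hash is 32 bytes; `prev_root` is assumed to be looked up from the manifest named by `previous`):

```rust
/// Simplified stand-ins for the protocol types, for illustration only.
#[derive(Clone, Copy, PartialEq, Debug)]
struct Hash(u8); // real hashes are 32 bytes

struct Version {
    number: u32,
    previous: Option<Hash>,
    root: Hash,
}

/// Check the version-chain constraints above. `prev_root` is the root of the
/// version named by `previous`, looked up by the caller.
fn validate_version(v: &Version, content_hash: Hash, prev_root: Option<Hash>) -> bool {
    match v.number {
        0 => false, // version numbers are 1-indexed
        1 => v.previous.is_none() && v.root == content_hash,
        _ => v.previous.is_some() && prev_root == Some(v.root),
    }
}

fn main() {
    let first = Version { number: 1, previous: None, root: Hash(7) };
    assert!(validate_version(&first, Hash(7), None));
    let second = Version { number: 2, previous: Some(Hash(7)), root: Hash(7) };
    assert!(validate_version(&second, Hash(9), Some(Hash(7))));
}
```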
§4.4 Mention (L1)
#![allow(unused)]
fn main() {
pub struct Mention {
/// H(content || source_location)
pub id: Hash,
/// The atomic fact (max 1000 chars)
pub content: String,
/// Where in L0 this fact came from
pub source_location: SourceLocation,
/// Type of fact
pub classification: Classification,
/// How certain we are this is in the source
pub confidence: Confidence,
/// Extracted entity names
pub entities: Vec<String>,
}
pub struct SourceLocation {
pub location_type: LocationType,
/// Location identifier (paragraph number, page, timestamp, etc.)
pub reference: String,
/// Exact quote from source (max 500 chars)
pub quote: Option<String>,
}
#[repr(u8)]
pub enum LocationType {
Paragraph = 0x00,
Page = 0x01,
Timestamp = 0x02,
Line = 0x03,
Section = 0x04,
}
#[repr(u8)]
pub enum Classification {
Claim = 0x00,
Statistic = 0x01,
Definition = 0x02,
Observation = 0x03,
Method = 0x04,
Result = 0x05,
}
#[repr(u8)]
pub enum Confidence {
/// Directly stated in source
Explicit = 0x00,
/// Reasonably inferred
Inferred = 0x01,
}
}
§4.4a Entity Graph (L2)
L2 represents your personal knowledge graph — how you link entities across documents you’ve studied.
URI Type
#![allow(unused)]
fn main() {
/// URI for RDF interoperability
/// Can be:
/// - Full URI: "http://schema.org/Person"
/// - Compact URI (CURIE): "schema:Person" (expanded using prefixes)
/// - Protocol-defined: "ndl:Person"
pub type Uri = String;
}
Prefix Mapping
#![allow(unused)]
fn main() {
/// Maps short prefixes to full URI namespaces
pub struct PrefixMap {
pub entries: Vec<PrefixEntry>,
}
pub struct PrefixEntry {
/// Short prefix, e.g., "schema"
pub prefix: String,
/// Full URI namespace, e.g., "http://schema.org/"
pub uri: String,
}
impl Default for PrefixMap {
fn default() -> Self {
Self {
entries: vec![
PrefixEntry { prefix: "ndl".into(), uri: "https://nodalync.io/ontology/".into() },
PrefixEntry { prefix: "schema".into(), uri: "http://schema.org/".into() },
PrefixEntry { prefix: "foaf".into(), uri: "http://xmlns.com/foaf/0.1/".into() },
PrefixEntry { prefix: "dc".into(), uri: "http://purl.org/dc/elements/1.1/".into() },
PrefixEntry { prefix: "rdf".into(), uri: "http://www.w3.org/1999/02/22-rdf-syntax-ns#".into() },
PrefixEntry { prefix: "rdfs".into(), uri: "http://www.w3.org/2000/01/rdf-schema#".into() },
PrefixEntry { prefix: "xsd".into(), uri: "http://www.w3.org/2001/XMLSchema#".into() },
PrefixEntry { prefix: "owl".into(), uri: "http://www.w3.org/2002/07/owl#".into() },
],
}
}
}
}
L2 Entity Graph
#![allow(unused)]
fn main() {
pub struct L2EntityGraph {
/// H(serialized entities + relationships)
pub id: Hash,
// === Sources ===
/// L1 summaries this graph was built from
pub source_l1s: Vec<L1Reference>,
/// Other L2 graphs merged/extended (for MERGE_L2)
pub source_l2s: Vec<Hash>,
// === Namespace Prefixes ===
pub prefixes: PrefixMap,
// === Graph Content ===
pub entities: Vec<Entity>,
pub relationships: Vec<Relationship>,
// === Statistics ===
pub entity_count: u32,
pub relationship_count: u32,
pub source_mention_count: u32,
}
pub struct L1Reference {
/// Hash of the L1Summary content
pub l1_hash: Hash,
/// The original L0 this L1 came from
pub l0_hash: Hash,
/// Which specific mentions were used (empty = all)
pub mention_ids_used: Vec<Hash>,
}
}
Entity
#![allow(unused)]
fn main() {
pub struct Entity {
/// Stable entity ID: H(canonical_uri || canonical_label)
pub id: Hash,
// === Identity ===
/// Primary human-readable name (max 200 chars)
pub canonical_label: String,
/// Canonical URI, e.g., "dbr:Albert_Einstein"
pub canonical_uri: Option<Uri>,
/// Alternative names/spellings (max 50)
pub aliases: Vec<String>,
// === Type (RDF-compatible) ===
/// e.g., ["schema:Person", "foaf:Person"]
pub entity_types: Vec<Uri>,
// === Evidence ===
/// Which L1 mentions establish this entity
pub source_mentions: Vec<MentionRef>,
// === Confidence ===
/// 0.0 - 1.0, resolution confidence
pub confidence: f64,
pub resolution_method: ResolutionMethod,
// === Optional Metadata ===
/// Summary description (max 500 chars)
pub description: Option<String>,
/// owl:sameAs links to external entities
pub same_as: Option<Vec<Uri>>,
}
pub struct MentionRef {
/// Which L1 contains this mention
pub l1_hash: Hash,
/// Specific mention ID within that L1
pub mention_id: Hash,
}
#[repr(u8)]
pub enum ResolutionMethod {
/// Same string
ExactMatch = 0x00,
/// Case/punctuation normalized
Normalized = 0x01,
/// Known alias matched
Alias = 0x02,
/// Pronoun/reference resolved
Coreference = 0x03,
/// Matched via external KB
ExternalLink = 0x04,
/// Human-verified
Manual = 0x05,
/// ML model assisted
AIAssisted = 0x06,
}
}
Relationship
#![allow(unused)]
fn main() {
pub struct Relationship {
/// H(subject || predicate || object)
pub id: Hash,
// === Triple ===
/// Entity ID (subject)
pub subject: Hash,
/// RDF predicate URI, e.g., "schema:worksFor"
pub predicate: Uri,
/// Entity ID, external ref, or literal value
pub object: RelationshipObject,
// === Evidence ===
/// Mentions that support this relationship
pub source_mentions: Vec<MentionRef>,
/// 0.0 - 1.0
pub confidence: f64,
// === Temporal (optional) ===
pub valid_from: Option<Timestamp>,
pub valid_to: Option<Timestamp>,
}
pub enum RelationshipObject {
/// Reference to another entity in this graph
EntityRef(Hash),
/// Reference to external entity by URI
ExternalRef(Uri),
/// A typed literal value
Literal(LiteralValue),
}
pub struct LiteralValue {
/// The value as string
pub value: String,
/// XSD datatype URI, e.g., "xsd:date" (None = plain string)
pub datatype: Option<Uri>,
/// Language tag, e.g., "en" (for strings only)
pub language: Option<String>,
}
}
L2 Build/Merge Configuration
#![allow(unused)]
fn main() {
pub struct L2BuildConfig {
/// Custom prefix mappings (merged with defaults)
pub prefixes: Option<PrefixMap>,
/// Default entity type if not detected, default: "ndl:Concept"
pub default_entity_type: Option<Uri>,
/// Minimum confidence to merge entities (default: 0.8)
pub resolution_threshold: Option<f64>,
/// Link to external knowledge bases
pub use_external_kb: Option<bool>,
/// Which KBs: ["http://www.wikidata.org/", ...]
pub external_kb_list: Option<Vec<Uri>>,
/// Infer implicit relationships
pub extract_implicit: Option<bool>,
/// Limit to specific predicates
pub relationship_predicates: Option<Vec<Uri>>,
}
pub struct L2MergeConfig {
/// Override prefix mappings
pub prefixes: Option<PrefixMap>,
/// Confidence threshold for cross-graph entity merging
pub entity_merge_threshold: Option<f64>,
/// Index of source to prefer on conflicts
pub prefer_source: Option<u32>,
}
}
L2 Constraints:
- visibility MUST be Private (L2 is never shared)
- economics.price MUST be 0 (L2 is never queried)
- source_l1s.len() >= 1
- entities.len() >= 1
- All entity IDs unique within graph
- All relationship entity refs point to valid entities or external URIs
- All MentionRefs point to valid L1s in source_l1s
- 0.0 <= confidence <= 1.0
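Several of these constraints are purely structural and easy to check in one pass. A minimal sketch over a pared-down graph type (illustrative, not the real L2EntityGraph):

```rust
/// Pared-down L2 graph carrying just the fields the checks below need;
/// illustrative, not the real L2EntityGraph.
struct L2Graph {
    source_l1_count: usize,
    entity_ids: Vec<u64>, // real IDs are 32-byte hashes
    confidences: Vec<f64>,
}

fn validate_l2(g: &L2Graph) -> bool {
    let mut ids = g.entity_ids.clone();
    ids.sort_unstable();
    g.source_l1_count >= 1
        && !g.entity_ids.is_empty()
        && ids.windows(2).all(|w| w[0] != w[1])                  // entity IDs unique
        && g.confidences.iter().all(|c| (0.0..=1.0).contains(c)) // confidence in range
}

fn main() {
    let good = L2Graph { source_l1_count: 1, entity_ids: vec![1, 2], confidences: vec![0.9] };
    assert!(validate_l2(&good));
    let dup = L2Graph { source_l1_count: 1, entity_ids: vec![1, 1], confidences: vec![0.9] };
    assert!(!validate_l2(&dup)); // duplicate entity ID
}
```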
§4.5 Provenance
#![allow(unused)]
fn main() {
pub struct Provenance {
/// All foundational L0+L1 sources
pub root_L0L1: Vec<ProvenanceEntry>,
/// Direct parent hashes (immediate sources)
pub derived_from: Vec<Hash>,
/// Max derivation depth from any L0
pub depth: u32,
}
pub struct ProvenanceEntry {
/// Content hash
pub hash: Hash,
/// Owner's node ID
pub owner: PeerId,
/// Visibility at time of derivation
pub visibility: Visibility,
/// Weight for duplicate handling
/// (same source appearing multiple times gets higher weight)
pub weight: u32,
}
}
Constraints:
- root_L0L1 contains only L0/L1 entries (never L2 or L3)
- For L0: root_L0L1 = [self], derived_from = [], depth = 0
- For L1: root_L0L1 = [parent L0], derived_from = [L0 hash], depth = 1
- For L2: root_L0L1 = merged from source L1s, derived_from = L1/L2 hashes, depth >= 2
- For L3: root_L0L1.len() >= 1, derived_from.len() >= 1, depth = max(sources) + 1
- All hashes in derived_from must have been queried by the creator (or owned)
- No self-reference allowed
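The depth rule for derived content is one line of logic. A sketch (assumption: source depths come from already-validated manifests looked up by the caller):

```rust
/// Sketch of the L3 depth rule: one more than the deepest source.
/// Source depths are assumed to come from already-validated manifests.
fn derived_depth(source_depths: &[u32]) -> Option<u32> {
    source_depths.iter().max().map(|d| d + 1) // None if there are no sources
}

fn main() {
    assert_eq!(derived_depth(&[0, 1, 2]), Some(3)); // deepest source is 2
    assert_eq!(derived_depth(&[]), None);           // violates derived_from.len() >= 1
}
```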
§4.6 AccessControl
#![allow(unused)]
fn main() {
pub struct AccessControl {
/// If set, only these peers can query (None = all allowed)
pub allowlist: Option<Vec<PeerId>>,
/// These peers are blocked (None = none blocked)
pub denylist: Option<Vec<PeerId>>,
/// Require payment bond to query
pub require_bond: bool,
/// Bond amount if required
pub bond_amount: Option<Amount>,
/// Rate limit per peer (None = unlimited)
pub max_queries_per_peer: Option<u32>,
}
}
Access Logic:
Access granted if:
(allowlist is None OR peer in allowlist) AND
(denylist is None OR peer NOT in denylist) AND
(require_bond is false OR peer has posted bond)
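The access predicate transcribes directly into code. A sketch with PeerIds simplified to integers (the real type is a 20-byte hash; bond bookkeeping is assumed to happen elsewhere):

```rust
/// PeerIds simplified to u64 for the sketch; the real type is a 20-byte hash.
struct AccessControl {
    allowlist: Option<Vec<u64>>,
    denylist: Option<Vec<u64>>,
    require_bond: bool,
}

/// Direct transcription of the access predicate above.
fn access_granted(ac: &AccessControl, peer: u64, has_bond: bool) -> bool {
    ac.allowlist.as_ref().map_or(true, |l| l.contains(&peer))
        && ac.denylist.as_ref().map_or(true, |d| !d.contains(&peer))
        && (!ac.require_bond || has_bond)
}

fn main() {
    let open = AccessControl { allowlist: None, denylist: None, require_bond: false };
    assert!(access_granted(&open, 42, false));
    let gated = AccessControl { allowlist: Some(vec![1]), denylist: None, require_bond: true };
    assert!(!access_granted(&gated, 42, true)); // not on allowlist
    assert!(access_granted(&gated, 1, true));   // allowlisted, bond posted
}
```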
§4.7 Economics
#![allow(unused)]
fn main() {
pub struct Economics {
/// Price per query (in tinybars, 10^-8 HBAR)
pub price: Amount,
/// Currency identifier
pub currency: Currency,
/// Total queries served
pub total_queries: u64,
/// Total revenue generated
pub total_revenue: Amount,
}
#[repr(u8)]
pub enum Currency {
/// Hedera native token (1 HBAR = 10^8 tinybars)
HBAR = 0x00,
}
/// Amount in tinybars (10^-8 HBAR)
pub type Amount = u64;
}
§4.8 Manifest
The complete metadata for a content item:
#![allow(unused)]
fn main() {
pub struct Manifest {
// === Identity ===
/// Content hash (unique identifier)
pub hash: Hash,
/// Type of content
pub content_type: ContentType,
/// Owner's peer ID (receives synthesis fee, serves content)
pub owner: PeerId,
// === Versioning ===
pub version: Version,
// === Visibility & Access ===
pub visibility: Visibility,
pub access: AccessControl,
// === Metadata ===
pub metadata: Metadata,
// === Economics ===
pub economics: Economics,
// === Provenance ===
pub provenance: Provenance,
// === Timestamps ===
pub created_at: Timestamp,
pub updated_at: Timestamp,
}
pub struct Metadata {
/// Max 200 chars
pub title: String,
/// Max 2000 chars
pub description: Option<String>,
/// Max 20 tags, each max 50 chars
pub tags: Vec<String>,
/// Size in bytes
pub content_size: u64,
/// MIME type if applicable
pub mime_type: Option<String>,
}
}
§4.9 L1Summary (Preview)
#![allow(unused)]
fn main() {
pub struct L1Summary {
/// Source L0 hash
pub l0_hash: Hash,
/// Total mentions extracted
pub mention_count: u32,
/// First N mentions (max 5)
pub preview_mentions: Vec<Mention>,
/// Main topics (max 5)
pub primary_topics: Vec<String>,
/// 2-3 sentence summary (max 500 chars)
pub summary: String,
}
}
Additional Types
Payment Channel
#![allow(unused)]
fn main() {
pub struct Channel {
/// Unique channel identifier: H(initiator || responder || nonce)
pub channel_id: Hash,
pub peer_id: PeerId,
pub state: ChannelState,
pub my_balance: Amount,
pub their_balance: Amount,
pub nonce: u64,
pub last_update: Timestamp,
pub pending_payments: Vec<Payment>,
}
#[repr(u8)]
pub enum ChannelState {
Opening = 0x00,
Open = 0x01,
Closing = 0x02,
Closed = 0x03,
Disputed = 0x04,
}
pub struct Payment {
/// H(channel_id || nonce || amount || recipient)
pub id: Hash,
/// Channel this payment belongs to
/// NOTE: Not in spec §5.3 but added for implementation convenience
/// (needed to compute id, lookup payments by channel)
pub channel_id: Hash,
pub amount: Amount,
pub recipient: PeerId,
/// Content that was queried
pub query_hash: Hash,
/// For distribution to all root contributors
pub provenance: Vec<ProvenanceEntry>,
pub timestamp: Timestamp,
/// Signed by payer
pub signature: Signature,
}
}
Distribution
#![allow(unused)]
fn main() {
pub struct Distribution {
pub recipient: PeerId,
pub amount: Amount,
/// Which source this is for
pub source_hash: Hash,
}
}
Settlement
#![allow(unused)]
fn main() {
pub struct SettlementEntry {
pub recipient: PeerId,
pub amount: Amount,
/// Content hashes for audit
pub provenance_hashes: Vec<Hash>,
/// Payment IDs included
pub payment_ids: Vec<Hash>,
}
pub struct SettlementBatch {
pub batch_id: Hash,
pub entries: Vec<SettlementEntry>,
/// Root of entries merkle tree
pub merkle_root: Hash,
}
}
Constants (from Appendix B)
#![allow(unused)]
fn main() {
pub mod constants {
use super::Amount;
// Limits
pub const MAX_CONTENT_SIZE: u64 = 104_857_600; // 100 MB
pub const MAX_MESSAGE_SIZE: u64 = 10_485_760; // 10 MB
pub const MAX_MENTIONS_PER_L0: u32 = 1000;
pub const MAX_SOURCES_PER_L3: u32 = 100;
pub const MAX_PROVENANCE_DEPTH: u32 = 100;
pub const MAX_TAGS: usize = 20;
pub const MAX_TAG_LENGTH: usize = 50;
pub const MAX_TITLE_LENGTH: usize = 200;
pub const MAX_DESCRIPTION_LENGTH: usize = 2000;
pub const MAX_SUMMARY_LENGTH: usize = 500;
pub const MAX_MENTION_CONTENT_LENGTH: usize = 1000;
pub const MAX_QUOTE_LENGTH: usize = 500;
// L2 Entity Graph limits
pub const MAX_ENTITIES_PER_L2: u32 = 10_000;
pub const MAX_RELATIONSHIPS_PER_L2: u32 = 50_000;
pub const MAX_ALIASES_PER_ENTITY: usize = 50;
pub const MAX_CANONICAL_LABEL_LENGTH: usize = 200;
pub const MAX_PREDICATE_LENGTH: usize = 100;
pub const MAX_ENTITY_DESCRIPTION_LENGTH: usize = 500;
pub const MAX_SOURCE_L1S_PER_L2: usize = 100;
pub const MAX_SOURCE_L2S_PER_MERGE: usize = 20;
// Economics
pub const MIN_PRICE: Amount = 1;
pub const MAX_PRICE: Amount = 10_000_000_000_000_000; // 10^16
pub const SYNTHESIS_FEE_NUMERATOR: u64 = 5;
pub const SYNTHESIS_FEE_DENOMINATOR: u64 = 100; // 5%
pub const SETTLEMENT_BATCH_THRESHOLD: Amount = 10_000_000_000; // 100 HBAR
pub const SETTLEMENT_BATCH_INTERVAL_MS: u64 = 3_600_000; // 1 hour
// Timing
pub const MESSAGE_TIMEOUT_MS: u64 = 30_000;
pub const CHANNEL_DISPUTE_PERIOD_MS: u64 = 86_400_000; // 24 hours
pub const MAX_CLOCK_SKEW_MS: u64 = 300_000; // 5 minutes
// DHT
pub const DHT_BUCKET_SIZE: usize = 20;
pub const DHT_ALPHA: usize = 3;
pub const DHT_REPLICATION: usize = 20;
}
}
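The two settlement constants suggest a batching trigger: settle when the pending total crosses the threshold or the batch interval has elapsed. This either/or semantics is an assumption here, not spec-mandated; the authoritative rules belong to the settlement module.

```rust
// Constants from Appendix B (tinybars / milliseconds).
const SETTLEMENT_BATCH_THRESHOLD: u64 = 10_000_000_000; // 100 HBAR
const SETTLEMENT_BATCH_INTERVAL_MS: u64 = 3_600_000;    // 1 hour

/// Hypothetical trigger: settle when the pending total crosses the threshold
/// or the batch interval has elapsed (assumed semantics, not spec-mandated).
fn should_settle(pending_total: u64, last_settlement_ms: u64, now_ms: u64) -> bool {
    pending_total >= SETTLEMENT_BATCH_THRESHOLD
        || now_ms.saturating_sub(last_settlement_ms) >= SETTLEMENT_BATCH_INTERVAL_MS
}

fn main() {
    assert!(should_settle(10_000_000_000, 0, 1)); // threshold reached
    assert!(should_settle(0, 0, 3_600_000));      // interval elapsed
    assert!(!should_settle(1, 0, 1));             // neither condition met
}
```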
Error Types
#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u16)]
pub enum ErrorCode {
// Query Errors (0x0001 - 0x00FF)
NotFound = 0x0001,
AccessDenied = 0x0002,
PaymentRequired = 0x0003,
PaymentInvalid = 0x0004,
RateLimited = 0x0005,
VersionNotFound = 0x0006,
// Channel Errors (0x0100 - 0x01FF)
ChannelNotFound = 0x0100,
ChannelClosed = 0x0101,
InsufficientBalance = 0x0102,
InvalidNonce = 0x0103,
InvalidSignature = 0x0104,
// Validation Errors (0x0200 - 0x02FF)
InvalidHash = 0x0200,
InvalidProvenance = 0x0201,
InvalidVersion = 0x0202,
InvalidManifest = 0x0203,
ContentTooLarge = 0x0204,
// L2 Entity Graph Errors (0x0210 - 0x021F)
L2InvalidStructure = 0x0210,
L2MissingSource = 0x0211,
L2EntityLimit = 0x0212,
L2RelationshipLimit = 0x0213,
L2InvalidEntityRef = 0x0214,
L2CycleDetected = 0x0215,
L2InvalidUri = 0x0216,
L2CannotPublish = 0x0217,
// Network Errors (0x0300 - 0x03FF)
PeerNotFound = 0x0300,
ConnectionFailed = 0x0301,
Timeout = 0x0302,
// Internal Errors
InternalError = 0xFFFF,
}
}
Implementation Notes
- All types should derive Debug, Clone, PartialEq, Eq where sensible
- All types should derive Serialize, Deserialize for the wire format
- Use #[serde(rename_all = "snake_case")] for consistent JSON representation
- Consider #[non_exhaustive] for enums to allow future extension
- Implement Default for types where a sensible default exists
Module: nodalync-wire
Source: Protocol Specification §6, Appendix A
Overview
Message serialization and deserialization. Defines the wire format for all protocol messages.
Dependencies
- nodalync-types — All data structures
- ciborium — CBOR encoding
Message Envelope (§6.1)
#![allow(unused)]
fn main() {
pub struct Message {
/// Protocol version (0x01)
pub version: u8,
/// Message type
pub message_type: MessageType,
/// Unique message ID
pub id: Hash,
/// Creation timestamp
pub timestamp: Timestamp,
/// Sender's peer ID
pub sender: PeerId,
/// Type-specific payload (CBOR encoded)
pub payload: Vec<u8>,
/// Signs H(version || type || id || timestamp || sender || payload_hash)
pub signature: Signature,
}
#[repr(u16)]
pub enum MessageType {
// Discovery (0x01xx)
Announce = 0x0100,
AnnounceUpdate = 0x0101,
Search = 0x0110,
SearchResponse = 0x0111,
// Preview (0x02xx)
PreviewRequest = 0x0200,
PreviewResponse = 0x0201,
// Query (0x03xx)
QueryRequest = 0x0300,
QueryResponse = 0x0301,
QueryError = 0x0302,
// Version (0x04xx)
VersionRequest = 0x0400,
VersionResponse = 0x0401,
// Channel (0x05xx)
ChannelOpen = 0x0500,
ChannelAccept = 0x0501,
ChannelUpdate = 0x0502,
ChannelClose = 0x0503,
ChannelDispute = 0x0504,
// Settlement (0x06xx)
SettleBatch = 0x0600,
SettleConfirm = 0x0601,
// Peer (0x07xx)
Ping = 0x0700,
Pong = 0x0701,
PeerInfo = 0x0710,
}
}
Payload Types (§6.2 - §6.8)
Discovery Payloads
#![allow(unused)]
fn main() {
pub struct AnnouncePayload {
pub hash: Hash,
pub content_type: ContentType,
pub title: String,
pub l1_summary: L1Summary,
pub price: Amount,
pub addresses: Vec<String>, // Multiaddrs
}
pub struct SearchPayload {
pub query: String,
pub filters: Option<SearchFilters>,
pub limit: u32,
pub offset: u32,
}
pub struct SearchFilters {
pub content_types: Option<Vec<ContentType>>,
pub max_price: Option<Amount>,
pub min_reputation: Option<i64>,
pub created_after: Option<Timestamp>,
pub created_before: Option<Timestamp>,
pub tags: Option<Vec<String>>,
}
pub struct SearchResult {
pub hash: Hash,
pub content_type: ContentType,
pub title: String,
pub owner: PeerId,
pub l1_summary: L1Summary,
pub price: Amount,
pub total_queries: u64,
pub relevance_score: f64,
/// Publisher's reachable multiaddresses for reconnection
pub publisher_addresses: Vec<String>,
}
}
Query Payloads
#![allow(unused)]
fn main() {
pub struct QueryRequestPayload {
pub hash: Hash,
pub query: Option<String>,
pub payment: Payment,
pub version_spec: Option<VersionSpec>,
}
pub enum VersionSpec {
Latest,
Number(u32),
Hash(Hash),
}
pub struct QueryResponsePayload {
pub hash: Hash,
pub content: Vec<u8>,
pub manifest: Manifest,
pub payment_receipt: PaymentReceipt,
}
pub struct PaymentReceipt {
pub payment_id: Hash,
pub amount: Amount,
pub timestamp: Timestamp,
pub channel_nonce: u64,
pub distributor_signature: Signature,
}
pub struct QueryErrorPayload {
pub hash: Hash,
pub error_code: ErrorCode,
pub message: Option<String>,
}
}
Channel Payloads
#![allow(unused)]
fn main() {
pub struct ChannelOpenPayload {
pub channel_id: Hash,
pub initial_balance: Amount,
pub funding_tx: Option<Vec<u8>>,
}
pub struct ChannelAcceptPayload {
pub channel_id: Hash,
pub initial_balance: Amount,
pub funding_tx: Option<Vec<u8>>,
}
pub struct ChannelUpdatePayload {
pub channel_id: Hash,
pub nonce: u64,
pub balances: ChannelBalances,
pub payments: Vec<Payment>,
pub signature: Signature,
}
pub struct ChannelBalances {
pub initiator: Amount,
pub responder: Amount,
}
pub struct ChannelClosePayload {
pub channel_id: Hash,
pub final_balances: ChannelBalances,
/// Proposed on-chain settlement transaction
pub settlement_tx: Vec<u8>,
}
pub struct ChannelDisputePayload {
pub channel_id: Hash,
/// Highest known state
pub claimed_state: ChannelUpdatePayload,
/// Supporting evidence
pub evidence: Vec<Vec<u8>>,
}
}
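Two invariants a responder would plausibly check before countersigning a ChannelUpdate are a strictly increasing nonce and conservation of the channel's total balance. This is a sketch under those assumptions; the real state also carries payments and signatures, validated elsewhere.

```rust
/// Minimal channel snapshot for the sketch; the real state also carries
/// payments and signatures (validated elsewhere).
struct ChannelSnapshot {
    nonce: u64,
    initiator: u64,
    responder: u64,
}

/// Assumed invariants on a channel update: strictly increasing nonce and
/// conservation of the channel's total balance.
fn plausible_update(old: &ChannelSnapshot, new: &ChannelSnapshot) -> bool {
    new.nonce > old.nonce
        && new.initiator + new.responder == old.initiator + old.responder
}

fn main() {
    let old = ChannelSnapshot { nonce: 1, initiator: 600, responder: 400 };
    let ok = ChannelSnapshot { nonce: 2, initiator: 550, responder: 450 };
    assert!(plausible_update(&old, &ok));
    let bad = ChannelSnapshot { nonce: 2, initiator: 550, responder: 460 }; // mints 10
    assert!(!plausible_update(&old, &bad));
}
```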
Version Payloads
#![allow(unused)]
fn main() {
pub struct VersionRequestPayload {
/// Stable version root identifier
pub version_root: Hash,
}
pub struct VersionResponsePayload {
pub version_root: Hash,
pub versions: Vec<VersionInfo>,
pub latest: Hash,
}
pub struct VersionInfo {
pub hash: Hash,
pub number: u32,
pub timestamp: Timestamp,
pub visibility: Visibility,
pub price: Amount,
}
}
Settlement Payloads
#![allow(unused)]
fn main() {
pub struct SettleBatchPayload {
pub batch_id: Hash,
pub entries: Vec<SettlementEntry>,
/// Root of entries merkle tree
pub merkle_root: Hash,
/// Signature from batch creator
pub signature: Signature,
}
pub struct SettlementEntry {
pub recipient: PeerId,
pub amount: Amount,
/// Content hashes for audit trail
pub provenance_hashes: Vec<Hash>,
/// Payment IDs included in this entry
pub payment_ids: Vec<Hash>,
}
pub struct SettleConfirmPayload {
pub batch_id: Hash,
/// On-chain transaction ID
pub transaction_id: String,
pub block_number: u64,
pub timestamp: Timestamp,
}
}
Peer Payloads
#![allow(unused)]
fn main() {
pub struct PingPayload {
pub nonce: u64,
}
pub struct PongPayload {
pub nonce: u64,
}
pub struct PeerInfoPayload {
pub peer_id: PeerId,
pub public_key: PublicKey,
pub addresses: Vec<String>, // Multiaddrs
pub capabilities: Vec<Capability>,
pub content_count: u64,
pub uptime: u64, // Seconds
}
#[repr(u8)]
pub enum Capability {
/// Can serve queries
Query = 0x01,
/// Supports payment channels
Channel = 0x02,
/// Can initiate settlement
Settle = 0x04,
/// Participates in DHT indexing
Index = 0x08,
}
}
Announce Update Payload
#![allow(unused)]
fn main() {
pub struct AnnounceUpdatePayload {
/// Stable version root identifier
pub version_root: Hash,
/// New version hash
pub new_hash: Hash,
pub version_number: u32,
pub title: String,
pub l1_summary: L1Summary,
pub price: Amount,
}
}
Wire Format (Appendix A)
Encoding Rules
- CBOR encoding (RFC 8949) with deterministic rules:
  - Map keys sorted lexicographically
  - No indefinite-length arrays or maps
  - Minimal integer encoding
  - No floating-point for amounts (use u64)
- Message wire format:
[0x00] # Protocol magic byte
[version: u8] # Protocol version
[type: u16 BE] # Message type
[length: u32 BE] # Payload length
[payload: bytes] # CBOR-encoded payload
[signature: 64 bytes] # Ed25519 signature
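The frame layout above can be built with plain byte manipulation. A sketch (payload and signature here are placeholders; real frames carry a CBOR-encoded payload and an Ed25519 signature):

```rust
/// Sketch of the Appendix A frame layout. Payload and signature here are
/// placeholders; real frames carry CBOR payloads and Ed25519 signatures.
fn frame(version: u8, msg_type: u16, payload: &[u8], signature: &[u8; 64]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + payload.len() + 64);
    out.push(0x00);                                               // protocol magic byte
    out.push(version);                                            // protocol version
    out.extend_from_slice(&msg_type.to_be_bytes());               // message type, u16 BE
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes()); // payload length, u32 BE
    out.extend_from_slice(payload);                               // CBOR-encoded payload
    out.extend_from_slice(signature);                             // Ed25519 signature
    out
}

fn main() {
    let f = frame(0x01, 0x0300, b"hi", &[0u8; 64]);
    assert_eq!(f.len(), 1 + 1 + 2 + 4 + 2 + 64);
    assert_eq!(&f[2..4], &[0x03, 0x00]); // QueryRequest, big-endian
    assert_eq!(&f[4..8], &[0, 0, 0, 2]); // payload length
}
```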
Hash Computation
#![allow(unused)]
fn main() {
// Content hash (domain separator 0x00)
fn content_hash(content: &[u8]) -> Hash {
let mut hasher = Sha256::new();
hasher.update(&[0x00]); // Domain separator
hasher.update(&(content.len() as u64).to_be_bytes());
hasher.update(content);
Hash(hasher.finalize().into())
}
// Message hash for signing (domain separator 0x01)
fn message_hash(msg: &Message) -> Hash {
let mut hasher = Sha256::new();
hasher.update(&[0x01]); // Domain separator
hasher.update(&[msg.version]);
hasher.update(&(msg.message_type as u16).to_be_bytes());
hasher.update(&msg.id.0);
hasher.update(&msg.timestamp.to_be_bytes());
hasher.update(&msg.sender.0);
hasher.update(&content_hash(&msg.payload).0);
Hash(hasher.finalize().into())
}
// Channel state hash (domain separator 0x02)
fn channel_state_hash(channel_id: &Hash, nonce: u64, balances: &ChannelBalances) -> Hash {
let mut hasher = Sha256::new();
hasher.update(&[0x02]); // Domain separator
hasher.update(&channel_id.0);
hasher.update(&nonce.to_be_bytes());
hasher.update(&balances.initiator.to_be_bytes());
hasher.update(&balances.responder.to_be_bytes());
Hash(hasher.finalize().into())
}
}
Public API
#![allow(unused)]
fn main() {
// Encoding
pub fn encode_message(msg: &Message) -> Result<Vec<u8>, EncodeError>;
pub fn encode_payload<T: Serialize>(payload: &T) -> Result<Vec<u8>, EncodeError>;
// Decoding
pub fn decode_message(bytes: &[u8]) -> Result<Message, DecodeError>;
pub fn decode_payload<T: DeserializeOwned>(bytes: &[u8]) -> Result<T, DecodeError>;
// Message construction helpers
pub fn create_message(
message_type: MessageType,
payload: Vec<u8>,
identity: &Identity,
) -> Message;
// Validation (checks format, not semantic validity)
pub fn validate_message_format(msg: &Message) -> Result<(), FormatError>;
}
Test Cases
- Roundtrip: Encode → Decode → identical message
- Determinism: Same message → same bytes (important for signatures)
- Invalid magic byte: Reject
- Invalid version: Reject
- Truncated message: Reject
- Invalid CBOR: Reject
- Signature mismatch: Reject
Module: nodalync-store
Source: Protocol Specification §5
Overview
Local storage for content, manifests, provenance graph, and payment channels.
Dependencies
- nodalync-types — All data structures
- rusqlite — SQLite for structured data
- directories — Platform-specific paths
Storage Layout
~/.nodalync/
├── config.toml # Node configuration
├── identity/
│ ├── keypair.key # Ed25519 private key (encrypted)
│ └── peer_id # Public identity
├── content/
│ └── {hash_prefix}/
│ └── {hash} # Raw content files
├── nodalync.db # SQLite: manifests, provenance, channels
└── cache/
└── {hash_prefix}/
└── {hash} # Cached content from queries
§5.1 State Components
NodeState
#![allow(unused)]
fn main() {
pub struct NodeState {
pub identity: Identity,
pub content: ContentStore,
pub manifests: ManifestStore,
pub provenance: ProvenanceGraph,
pub channels: ChannelStore,
pub cache: CacheStore,
}
}
Identity Storage
Private key encrypted at rest:
- Encryption: AES-256-GCM
- Key derivation: Argon2id from user password
- Nonce: Random 12 bytes, stored with ciphertext
§5.2 Provenance Graph
Bidirectional graph for efficient traversal:
#![allow(unused)]
fn main() {
pub trait ProvenanceGraph {
/// Add content with its derivation sources
fn add(&mut self, hash: &Hash, derived_from: &[Hash]) -> Result<()>;
/// Get all root L0+L1 sources (flattened)
fn get_roots(&self, hash: &Hash) -> Result<Vec<ProvenanceEntry>>;
/// Get all content derived from this hash
fn get_derivations(&self, hash: &Hash) -> Result<Vec<Hash>>;
/// Check if A is an ancestor of B
fn is_ancestor(&self, ancestor: &Hash, descendant: &Hash) -> Result<bool>;
}
}
SQL Schema:
-- Forward edges
CREATE TABLE derived_from (
content_hash BLOB NOT NULL,
source_hash BLOB NOT NULL,
PRIMARY KEY (content_hash, source_hash)
);
-- Cached flattened roots (for performance)
CREATE TABLE root_cache (
content_hash BLOB NOT NULL,
root_hash BLOB NOT NULL,
owner BLOB NOT NULL,
visibility INTEGER NOT NULL,
weight INTEGER NOT NULL DEFAULT 1,
PRIMARY KEY (content_hash, root_hash)
);
CREATE INDEX idx_derivations ON derived_from(source_hash);
§5.3 Payment Channels
#![allow(unused)]
fn main() {
pub trait ChannelStore {
fn create(&mut self, peer: &PeerId, channel: Channel) -> Result<()>;
fn get(&self, peer: &PeerId) -> Result<Option<Channel>>;
fn update(&mut self, peer: &PeerId, channel: &Channel) -> Result<()>;
fn list_open(&self) -> Result<Vec<(PeerId, Channel)>>;
fn add_payment(&mut self, peer: &PeerId, payment: Payment) -> Result<()>;
fn get_pending_payments(&self, peer: &PeerId) -> Result<Vec<Payment>>;
fn clear_payments(&mut self, peer: &PeerId, payment_ids: &[Hash]) -> Result<()>;
}
}
Trait Definitions
ContentStore
#![allow(unused)]
fn main() {
pub trait ContentStore {
/// Store content, returns hash
fn store(&mut self, content: &[u8]) -> Result<Hash>;
/// Store content with known hash (for verification)
fn store_verified(&mut self, hash: &Hash, content: &[u8]) -> Result<()>;
/// Load content by hash
fn load(&self, hash: &Hash) -> Result<Option<Vec<u8>>>;
/// Check if content exists
fn exists(&self, hash: &Hash) -> bool;
/// Delete content
fn delete(&mut self, hash: &Hash) -> Result<()>;
/// Get content size without loading
fn size(&self, hash: &Hash) -> Result<Option<u64>>;
}
}
ManifestStore
#![allow(unused)]
fn main() {
pub trait ManifestStore {
fn store(&mut self, manifest: &Manifest) -> Result<()>;
fn load(&self, hash: &Hash) -> Result<Option<Manifest>>;
fn update(&mut self, manifest: &Manifest) -> Result<()>;
fn delete(&mut self, hash: &Hash) -> Result<()>;
/// List manifests with optional filtering
fn list(&self, filter: ManifestFilter) -> Result<Vec<Manifest>>;
/// Get all versions of content by version_root
fn get_versions(&self, version_root: &Hash) -> Result<Vec<Manifest>>;
}
pub struct ManifestFilter {
pub visibility: Option<Visibility>,
pub content_type: Option<ContentType>,
pub created_after: Option<Timestamp>,
pub created_before: Option<Timestamp>,
pub limit: Option<u32>,
pub offset: Option<u32>,
}
}
CacheStore
#![allow(unused)]
fn main() {
pub trait CacheStore {
/// Cache content from a query
fn cache(&mut self, entry: CachedContent) -> Result<()>;
/// Get cached content
fn get(&self, hash: &Hash) -> Result<Option<CachedContent>>;
/// Check if cached
fn is_cached(&self, hash: &Hash) -> bool;
/// Evict old entries (LRU)
fn evict(&mut self, max_size_bytes: u64) -> Result<u64>;
/// Clear all cache
fn clear(&mut self) -> Result<()>;
}
pub struct CachedContent {
pub hash: Hash,
pub content: Vec<u8>,
pub source_peer: PeerId,
pub queried_at: Timestamp,
/// NOTE: Spec §5.1 says "PaymentProof" but that type is undefined.
/// Using PaymentReceipt from §6.4 instead.
pub payment_proof: PaymentReceipt,
}
}
SettlementQueueStore
The settlement queue stores pending distributions until batch settlement.
nodalync-ops writes to this queue after processing queries.
nodalync-settle reads from this queue to create settlement batches.
#![allow(unused)]
fn main() {
pub trait SettlementQueueStore {
/// Add a distribution to the queue
fn enqueue(&mut self, distribution: QueuedDistribution) -> Result<()>;
/// Get all pending distributions
fn get_pending(&self) -> Result<Vec<QueuedDistribution>>;
/// Get pending distributions for a specific recipient
fn get_pending_for(&self, recipient: &PeerId) -> Result<Vec<QueuedDistribution>>;
/// Get total pending amount across all recipients
fn get_pending_total(&self) -> Result<Amount>;
/// Mark distributions as settled (by payment IDs)
fn mark_settled(&mut self, payment_ids: &[Hash], batch_id: &Hash) -> Result<()>;
/// Get last settlement timestamp
fn get_last_settlement_time(&self) -> Result<Option<Timestamp>>;
/// Set last settlement timestamp
fn set_last_settlement_time(&mut self, timestamp: Timestamp) -> Result<()>;
}
pub struct QueuedDistribution {
/// Original payment ID this distribution came from
pub payment_id: Hash,
/// Recipient of this distribution
pub recipient: PeerId,
/// Amount owed
pub amount: Amount,
/// Source content hash (for audit)
pub source_hash: Hash,
/// When the original query happened
pub queued_at: Timestamp,
}
}
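For illustration, a minimal in-memory implementation of this queue might look like the sketch below. The type aliases (`Hash`, `PeerId`, `Amount`, `Timestamp`) and the `MemQueue` struct are simplified stand-ins, not the real `nodalync-types` definitions; settlement is tracked per payment ID, matching the `mark_settled` semantics above.

```rust
use std::collections::HashSet;

// Simplified stand-ins for the real nodalync-types definitions.
type Hash = [u8; 32];
type PeerId = String;
type Amount = u64;
type Timestamp = u64;

#[derive(Clone)]
pub struct QueuedDistribution {
    pub payment_id: Hash,
    pub recipient: PeerId,
    pub amount: Amount,
    pub source_hash: Hash,
    pub queued_at: Timestamp,
}

#[derive(Default)]
pub struct MemQueue {
    entries: Vec<QueuedDistribution>,
    settled: HashSet<Hash>,
}

impl MemQueue {
    /// nodalync-ops side: append a distribution after processing a query.
    pub fn enqueue(&mut self, d: QueuedDistribution) {
        self.entries.push(d);
    }
    /// nodalync-settle side: everything not yet marked settled.
    pub fn get_pending(&self) -> Vec<QueuedDistribution> {
        self.entries
            .iter()
            .filter(|d| !self.settled.contains(&d.payment_id))
            .cloned()
            .collect()
    }
    pub fn get_pending_total(&self) -> Amount {
        self.get_pending().iter().map(|d| d.amount).sum()
    }
    /// Settling a payment ID settles every distribution derived from it.
    pub fn mark_settled(&mut self, payment_ids: &[Hash]) {
        self.settled.extend(payment_ids.iter().copied());
    }
}
```

Note that one payment fans out into several queued distributions (one per recipient), so marking the payment ID settled clears all of them at once.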
---
## SQL Schema (Full)
```sql
-- Manifests
CREATE TABLE manifests (
hash BLOB PRIMARY KEY,
content_type INTEGER NOT NULL,
version_number INTEGER NOT NULL,
version_previous BLOB,
version_root BLOB NOT NULL,
version_timestamp INTEGER NOT NULL,
visibility INTEGER NOT NULL,
title TEXT NOT NULL,
description TEXT,
tags TEXT, -- JSON array
content_size INTEGER NOT NULL,
mime_type TEXT,
price INTEGER NOT NULL,
total_queries INTEGER NOT NULL DEFAULT 0,
total_revenue INTEGER NOT NULL DEFAULT 0,
created_at INTEGER NOT NULL,
updated_at INTEGER NOT NULL,
-- Access control stored as JSON
access_control TEXT NOT NULL
);
CREATE INDEX idx_manifests_visibility ON manifests(visibility);
CREATE INDEX idx_manifests_version_root ON manifests(version_root);
CREATE INDEX idx_manifests_created ON manifests(created_at);
-- L1 Summaries
CREATE TABLE l1_summaries (
l0_hash BLOB PRIMARY KEY,
mention_count INTEGER NOT NULL,
preview_mentions TEXT NOT NULL, -- JSON
primary_topics TEXT NOT NULL, -- JSON
summary TEXT NOT NULL
);
-- Payment Channels
CREATE TABLE channels (
peer_id BLOB PRIMARY KEY,
state INTEGER NOT NULL,
my_balance INTEGER NOT NULL,
their_balance INTEGER NOT NULL,
nonce INTEGER NOT NULL,
last_update INTEGER NOT NULL
);
-- Pending Payments
CREATE TABLE payments (
id BLOB PRIMARY KEY,
channel_peer BLOB NOT NULL,
amount INTEGER NOT NULL,
recipient BLOB NOT NULL,
query_hash BLOB NOT NULL,
provenance TEXT NOT NULL, -- JSON
timestamp INTEGER NOT NULL,
signature BLOB NOT NULL,
settled INTEGER NOT NULL DEFAULT 0,
FOREIGN KEY (channel_peer) REFERENCES channels(peer_id)
);
CREATE INDEX idx_payments_channel ON payments(channel_peer);
CREATE INDEX idx_payments_settled ON payments(settled);
-- Cache metadata (content stored on filesystem)
CREATE TABLE cache (
hash BLOB PRIMARY KEY,
source_peer BLOB NOT NULL,
queried_at INTEGER NOT NULL,
size_bytes INTEGER NOT NULL,
payment_receipt TEXT NOT NULL -- JSON
);
CREATE INDEX idx_cache_queried ON cache(queried_at);
-- Settlement Queue
CREATE TABLE settlement_queue (
id INTEGER PRIMARY KEY AUTOINCREMENT,
payment_id BLOB NOT NULL,
recipient BLOB NOT NULL,
amount INTEGER NOT NULL,
source_hash BLOB NOT NULL,
queued_at INTEGER NOT NULL,
settled INTEGER NOT NULL DEFAULT 0,
batch_id BLOB -- Set when settled
);
CREATE INDEX idx_settlement_queue_recipient ON settlement_queue(recipient);
CREATE INDEX idx_settlement_queue_settled ON settlement_queue(settled);
-- Settlement metadata
CREATE TABLE settlement_meta (
key TEXT PRIMARY KEY,
value TEXT NOT NULL
);
-- Stores: last_settlement_time
```
Test Cases
- Content roundtrip: Store → Load → identical
- Manifest CRUD: Create, read, update, delete
- Provenance graph: Add edges → get_roots returns correct set
- Weight accumulation: Same source via multiple paths → weight increases
- Channel state: Open → payments → state updates correctly
- Cache eviction: LRU eviction frees correct entries
- Concurrent access: Multiple readers, single writer
- Settlement queue enqueue: Add distribution → retrievable
- Settlement queue totals: Multiple distributions → correct sum
- Settlement queue mark settled: Mark as settled → no longer in pending
- Settlement queue by recipient: Filter by recipient works
Module: nodalync-valid
Source: Protocol Specification §9
Overview
Implements all validation rules for the protocol, returning detailed errors for debugging.
Dependencies
- nodalync-types — All data structures
- nodalync-crypto — Hash verification
Validation Trait
#![allow(unused)]
fn main() {
pub trait Validator {
fn validate_content(&self, content: &[u8], manifest: &Manifest) -> Result<(), ValidationError>;
fn validate_version(&self, manifest: &Manifest, previous: Option<&Manifest>) -> Result<(), ValidationError>;
fn validate_provenance(&self, manifest: &Manifest, sources: &[Manifest]) -> Result<(), ValidationError>;
fn validate_payment(&self, payment: &Payment, channel: &Channel, manifest: &Manifest) -> Result<(), ValidationError>;
fn validate_message(&self, message: &Message) -> Result<(), ValidationError>;
fn validate_access(&self, requester: &PeerId, manifest: &Manifest) -> Result<(), ValidationError>;
}
}
§9.1 Content Validation
#![allow(unused)]
fn main() {
fn validate_content(content: &[u8], manifest: &Manifest) -> Result<()> {
// 1. Hash matches
ensure!(
content_hash(content) == manifest.hash,
ContentValidation("hash mismatch")
);
// 2. Size matches
ensure!(
content.len() as u64 == manifest.metadata.content_size,
ContentValidation("size mismatch")
);
// 3. Title length
ensure!(
manifest.metadata.title.len() <= MAX_TITLE_LENGTH,
ContentValidation("title too long")
);
// 4. Description length
if let Some(ref desc) = manifest.metadata.description {
ensure!(
desc.len() <= MAX_DESCRIPTION_LENGTH,
ContentValidation("description too long")
);
}
// 5. Tags
ensure!(
manifest.metadata.tags.len() <= MAX_TAGS,
ContentValidation("too many tags")
);
for tag in &manifest.metadata.tags {
ensure!(
tag.len() <= MAX_TAG_LENGTH,
ContentValidation("tag too long")
);
}
// 6. Valid enums
ensure!(
matches!(manifest.content_type, ContentType::L0 | ContentType::L1 | ContentType::L2 | ContentType::L3),
ContentValidation("invalid content type")
);
ensure!(
matches!(manifest.visibility, Visibility::Private | Visibility::Unlisted | Visibility::Shared),
ContentValidation("invalid visibility")
);
// 7. L2-specific validation
if manifest.content_type == ContentType::L2 {
validate_l2_content(content, manifest)?;
}
Ok(())
}
}
§9.1a L2 Content Validation
#![allow(unused)]
fn main() {
fn validate_l2_content(content: &[u8], manifest: &Manifest) -> Result<()> {
// L2 MUST be private
ensure!(
manifest.visibility == Visibility::Private,
L2Validation("L2 must be private")
);
// L2 MUST have zero price
ensure!(
manifest.economics.price == 0,
L2Validation("L2 must have zero price")
);
// Deserialize and validate structure
let l2: L2EntityGraph = deserialize(content)
.map_err(|_| L2Validation("invalid L2 structure"))?;
// ID matches
ensure!(
l2.id == manifest.hash,
L2Validation("L2 id must match manifest hash")
);
// Must have at least one source L1
ensure!(
!l2.source_l1s.is_empty(),
L2Validation("L2 must have at least one source L1")
);
ensure!(
l2.source_l1s.len() <= MAX_SOURCE_L1S_PER_L2,
L2Validation("too many source L1s")
);
// Must have at least one entity
ensure!(
!l2.entities.is_empty(),
L2Validation("L2 must have at least one entity")
);
ensure!(
l2.entities.len() <= MAX_ENTITIES_PER_L2 as usize,
L2Validation("too many entities")
);
// Relationship limits
ensure!(
l2.relationships.len() <= MAX_RELATIONSHIPS_PER_L2 as usize,
L2Validation("too many relationships")
);
// Counts match
ensure!(
l2.entity_count as usize == l2.entities.len(),
L2Validation("entity_count mismatch")
);
ensure!(
l2.relationship_count as usize == l2.relationships.len(),
L2Validation("relationship_count mismatch")
);
// Validate prefix map
validate_prefix_map(&l2.prefixes)?;
// Validate all entities
let mut entity_ids: HashSet<Hash> = HashSet::new();
for entity in &l2.entities {
validate_entity(entity, &l2.prefixes, &l2.source_l1s)?;
ensure!(
entity_ids.insert(entity.id),
L2Validation("duplicate entity ID")
);
}
// Validate all relationships
for rel in &l2.relationships {
validate_relationship(rel, &entity_ids, &l2.prefixes, &l2.source_l1s)?;
}
Ok(())
}
fn validate_prefix_map(prefixes: &PrefixMap) -> Result<()> {
let mut seen_prefixes: HashSet<&str> = HashSet::new();
for entry in &prefixes.entries {
ensure!(
!entry.prefix.is_empty(),
L2Validation("empty prefix")
);
ensure!(
!entry.uri.is_empty(),
L2Validation("empty URI")
);
ensure!(
entry.uri.ends_with('/') || entry.uri.ends_with('#'),
L2Validation("prefix URI must end with / or #")
);
ensure!(
seen_prefixes.insert(&entry.prefix),
L2Validation("duplicate prefix")
);
}
Ok(())
}
fn validate_entity(
entity: &Entity,
prefixes: &PrefixMap,
source_l1s: &[L1Reference],
) -> Result<()> {
// Label constraints
ensure!(
!entity.canonical_label.is_empty(),
L2Validation("empty canonical_label")
);
ensure!(
entity.canonical_label.len() <= MAX_CANONICAL_LABEL_LENGTH,
L2Validation("canonical_label too long")
);
// Aliases
ensure!(
entity.aliases.len() <= MAX_ALIASES_PER_ENTITY,
L2Validation("too many aliases")
);
// Validate entity type URIs
for uri in &entity.entity_types {
validate_uri(uri, prefixes)?;
}
// Validate canonical_uri if present
if let Some(ref uri) = entity.canonical_uri {
validate_uri(uri, prefixes)?;
}
// Validate same_as URIs if present
if let Some(ref same_as) = entity.same_as {
for uri in same_as {
validate_uri(uri, prefixes)?;
}
}
// Confidence in range
ensure!(
entity.confidence >= 0.0 && entity.confidence <= 1.0,
L2Validation("confidence out of range")
);
// All mention refs point to valid L1s
let valid_l1_hashes: HashSet<_> = source_l1s.iter().map(|r| &r.l1_hash).collect();
for mention_ref in &entity.source_mentions {
ensure!(
valid_l1_hashes.contains(&mention_ref.l1_hash),
L2Validation("mention ref points to unknown L1")
);
}
// Description length
if let Some(ref desc) = entity.description {
ensure!(
desc.len() <= MAX_ENTITY_DESCRIPTION_LENGTH,
L2Validation("entity description too long")
);
}
Ok(())
}
fn validate_relationship(
rel: &Relationship,
entity_ids: &HashSet<Hash>,
prefixes: &PrefixMap,
source_l1s: &[L1Reference],
) -> Result<()> {
// Subject must exist
ensure!(
entity_ids.contains(&rel.subject),
L2Validation("relationship subject not found")
);
// Predicate must be valid URI
validate_uri(&rel.predicate, prefixes)?;
// Object validation
match &rel.object {
RelationshipObject::EntityRef(hash) => {
ensure!(
entity_ids.contains(hash),
L2Validation("relationship object entity not found")
);
}
RelationshipObject::ExternalRef(uri) => {
validate_uri(uri, prefixes)?;
}
RelationshipObject::Literal(lit) => {
if let Some(ref dt) = lit.datatype {
validate_uri(dt, prefixes)?;
}
}
}
// Confidence in range
ensure!(
rel.confidence >= 0.0 && rel.confidence <= 1.0,
L2Validation("relationship confidence out of range")
);
// Temporal validity
if let (Some(from), Some(to)) = (rel.valid_from, rel.valid_to) {
ensure!(from <= to, L2Validation("valid_from > valid_to"));
}
// Mention refs
let valid_l1_hashes: HashSet<_> = source_l1s.iter().map(|r| &r.l1_hash).collect();
for mention_ref in &rel.source_mentions {
ensure!(
valid_l1_hashes.contains(&mention_ref.l1_hash),
L2Validation("relationship mention ref points to unknown L1")
);
}
Ok(())
}
}
§9.1b URI/CURIE Validation
#![allow(unused)]
fn main() {
/// Validate a URI or CURIE
fn validate_uri(uri: &Uri, prefixes: &PrefixMap) -> Result<()> {
ensure!(!uri.is_empty(), L2Validation("empty URI"));
if uri.contains("://") {
// Full URI - basic syntax check
ensure!(
uri.starts_with("http://") || uri.starts_with("https://"),
L2Validation("URI must be http(s)")
);
} else if let Some(colon_pos) = uri.find(':') {
// CURIE - check prefix exists
let prefix = &uri[..colon_pos];
let has_prefix = prefixes.entries.iter().any(|e| e.prefix == prefix);
ensure!(
has_prefix,
L2Validation(format!("unknown prefix: {}", prefix))
);
} else {
// No scheme or prefix - invalid
return Err(L2Validation("URI must be full URI or valid CURIE"));
}
Ok(())
}
/// Expand a CURIE to full URI
pub fn expand_curie(curie: &str, prefixes: &PrefixMap) -> Result<String> {
if curie.contains("://") {
// Already a full URI
return Ok(curie.to_string());
}
if let Some(colon_pos) = curie.find(':') {
let prefix = &curie[..colon_pos];
let local = &curie[colon_pos + 1..];
for entry in &prefixes.entries {
if entry.prefix == prefix {
return Ok(format!("{}{}", entry.uri, local));
}
}
Err(L2Validation(format!("unknown prefix: {}", prefix)))
} else {
Err(L2Validation("not a valid CURIE"))
}
}
}
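The expansion rule above can be restated as a standalone sketch. Here plain `(prefix, base URI)` pairs stand in for the protocol's `PrefixMap`, and `Option` replaces the `L2Validation` error path; the function name mirrors the spec's `expand_curie` but this version is illustrative only.

```rust
/// Expand a CURIE against a set of (prefix, base URI) pairs.
/// Full URIs pass through untouched; unknown prefixes yield None.
pub fn expand_curie(curie: &str, prefixes: &[(&str, &str)]) -> Option<String> {
    if curie.contains("://") {
        return Some(curie.to_string()); // already a full URI
    }
    let colon = curie.find(':')?; // no colon: not a valid CURIE
    let (prefix, local) = (&curie[..colon], &curie[colon + 1..]);
    prefixes
        .iter()
        .find(|(p, _)| *p == prefix)
        .map(|(_, base)| format!("{}{}", base, local))
}
```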
§9.2 Version Validation
#![allow(unused)]
fn main() {
fn validate_version(manifest: &Manifest, previous: Option<&Manifest>) -> Result<()> {
let v = &manifest.version;
if v.number == 1 {
// First version
ensure!(v.previous.is_none(), VersionValidation("v1 must have no previous"));
ensure!(v.root == manifest.hash, VersionValidation("v1 root must equal hash"));
} else {
// Subsequent version
ensure!(v.previous.is_some(), VersionValidation("v2+ must have previous"));
if let Some(prev) = previous {
ensure!(
v.previous.as_ref() == Some(&prev.hash),
VersionValidation("previous hash mismatch")
);
ensure!(
v.root == prev.version.root,
VersionValidation("root must equal previous root")
);
ensure!(
v.number == prev.version.number + 1,
VersionValidation("version number must increment by 1")
);
ensure!(
v.timestamp > prev.version.timestamp,
VersionValidation("timestamp must be after previous")
);
}
}
Ok(())
}
}
§9.3 Provenance Validation
#![allow(unused)]
fn main() {
fn validate_provenance(manifest: &Manifest, sources: &[Manifest]) -> Result<()> {
let prov = &manifest.provenance;
match manifest.content_type {
ContentType::L0 => {
// L0: self-referential provenance
ensure!(
prov.root_L0L1.len() == 1,
ProvenanceValidation("L0 must have exactly one root (self)")
);
ensure!(
prov.root_L0L1[0].hash == manifest.hash,
ProvenanceValidation("L0 root must be self")
);
ensure!(
prov.derived_from.is_empty(),
ProvenanceValidation("L0 must not derive from anything")
);
ensure!(
prov.depth == 0,
ProvenanceValidation("L0 depth must be 0")
);
}
ContentType::L1 => {
// L1: extracted from exactly one L0
ensure!(
!prov.root_L0L1.is_empty(),
ProvenanceValidation("L1 must have at least one root")
);
ensure!(
prov.derived_from.len() == 1,
ProvenanceValidation("L1 must derive from exactly one L0")
);
ensure!(
prov.depth == 1,
ProvenanceValidation("L1 depth must be 1")
);
// All roots must be L0
for root in &prov.root_L0L1 {
if let Some(source) = sources.iter().find(|s| s.hash == root.hash) {
ensure!(
source.content_type == ContentType::L0,
ProvenanceValidation("L1 roots must all be L0")
);
}
}
}
ContentType::L2 => {
// L2: built from L1s (and optionally other L2s)
ensure!(
!prov.root_L0L1.is_empty(),
ProvenanceValidation("L2 must have at least one root")
);
ensure!(
!prov.derived_from.is_empty(),
ProvenanceValidation("L2 must derive from at least one source")
);
ensure!(
prov.depth >= 2,
ProvenanceValidation("L2 depth must be >= 2")
);
// All roots must be L0 or L1 (never L2 or L3)
for root in &prov.root_L0L1 {
if let Some(source) = sources.iter().find(|s| s.hash == root.hash) {
ensure!(
matches!(source.content_type, ContentType::L0 | ContentType::L1),
ProvenanceValidation("L2 roots must be L0 or L1 only")
);
}
}
// derived_from must be L1 or L2
for df in &prov.derived_from {
if let Some(source) = sources.iter().find(|s| s.hash == *df) {
ensure!(
matches!(source.content_type, ContentType::L1 | ContentType::L2),
ProvenanceValidation("L2 must derive from L1 or L2")
);
}
}
// Verify root_L0L1 computation
let computed_roots = compute_root_L0L1(sources);
ensure!(
roots_match(&prov.root_L0L1, &computed_roots),
ProvenanceValidation("root_L0L1 computation mismatch")
);
// Verify depth
let expected_depth = sources.iter()
.map(|s| s.provenance.depth)
.max()
.unwrap_or(0) + 1;
ensure!(
prov.depth == expected_depth,
ProvenanceValidation("depth mismatch")
);
}
ContentType::L3 => {
// L3: must derive from sources (L0, L1, L2, or other L3)
ensure!(
!prov.root_L0L1.is_empty(),
ProvenanceValidation("L3 must have at least one root")
);
ensure!(
!prov.derived_from.is_empty(),
ProvenanceValidation("L3 must derive from at least one source")
);
// All roots must be L0 or L1 (never L2 or L3)
for root in &prov.root_L0L1 {
if let Some(source) = sources.iter().find(|s| s.hash == root.hash) {
ensure!(
matches!(source.content_type, ContentType::L0 | ContentType::L1),
ProvenanceValidation("L3 roots must be L0 or L1 only")
);
}
}
// All derived_from must exist in sources
let source_hashes: HashSet<_> = sources.iter().map(|s| &s.hash).collect();
for df in &prov.derived_from {
ensure!(
source_hashes.contains(df),
ProvenanceValidation("derived_from references unknown source")
);
}
// Verify root_L0L1 computation
let computed_roots = compute_root_L0L1(sources);
ensure!(
roots_match(&prov.root_L0L1, &computed_roots),
ProvenanceValidation("root_L0L1 computation mismatch")
);
// Verify depth
let expected_depth = sources.iter()
.map(|s| s.provenance.depth)
.max()
.unwrap_or(0) + 1;
ensure!(
prov.depth == expected_depth,
ProvenanceValidation("depth mismatch")
);
}
}
// Common checks for all types
// No self-reference
ensure!(
!prov.derived_from.contains(&manifest.hash),
ProvenanceValidation("cannot derive from self")
);
ensure!(
!prov.root_L0L1.iter().any(|e| e.hash == manifest.hash),
ProvenanceValidation("cannot be own root")
);
// No cycles (basic check - full cycle detection is expensive)
ensure!(
prov.depth <= MAX_PROVENANCE_DEPTH,
ProvenanceValidation("provenance too deep")
);
Ok(())
}
}
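The L2 and L3 branches above recompute `expected_depth` the same way. As a standalone sketch (the function name is illustrative, not part of the spec):

```rust
/// Depth rule from §9.3: a derived item sits one level above its
/// deepest source. An item with no sources would land at depth 1,
/// which is why L0 (depth 0) never reaches this computation.
pub fn expected_depth(source_depths: &[u32]) -> u32 {
    source_depths.iter().copied().max().unwrap_or(0) + 1
}
```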
§9.4 Payment Validation
#![allow(unused)]
fn main() {
fn validate_payment(payment: &Payment, channel: &Channel, manifest: &Manifest) -> Result<()> {
// 1. Amount sufficient
ensure!(
payment.amount >= manifest.economics.price,
PaymentValidation("insufficient payment")
);
// 2. Correct recipient
ensure!(
payment.recipient == manifest_owner(manifest),
PaymentValidation("wrong recipient")
);
// 3. Query hash matches
ensure!(
payment.query_hash == manifest.hash,
PaymentValidation("query hash mismatch")
);
// 4. Channel is open
ensure!(
channel.state == ChannelState::Open,
PaymentValidation("channel not open")
);
// 5. Sufficient balance
ensure!(
channel.their_balance >= payment.amount,
PaymentValidation("insufficient channel balance")
);
// 6. Nonce is valid (prevents replay)
ensure!(
payment_nonce(payment) > channel.nonce,
PaymentValidation("invalid nonce (replay?)")
);
// 7. Signature valid
let payer_pubkey = lookup_public_key(&payment_payer(payment, channel))?;
ensure!(
verify_payment_signature(&payer_pubkey, payment),
PaymentValidation("invalid signature")
);
// 8. Provenance matches manifest
ensure!(
provenance_matches(&payment.provenance, &manifest.provenance.root_L0L1),
PaymentValidation("provenance mismatch")
);
Ok(())
}
}
§9.5 Message Validation
#![allow(unused)]
fn main() {
fn validate_message(msg: &Message) -> Result<()> {
// 1. Protocol version
ensure!(
msg.version == PROTOCOL_VERSION,
MessageValidation("unsupported protocol version")
);
// 2. Valid message type
ensure!(
is_valid_message_type(msg.message_type),
MessageValidation("invalid message type")
);
// 3. Timestamp within skew
let now = current_timestamp();
let skew = if msg.timestamp > now {
msg.timestamp - now
} else {
now - msg.timestamp
};
ensure!(
skew <= MAX_CLOCK_SKEW_MS,
MessageValidation("timestamp outside acceptable range")
);
// 4. Valid sender
ensure!(
is_valid_peer_id(&msg.sender),
MessageValidation("invalid sender peer ID")
);
// 5. Signature valid
let pubkey = lookup_public_key(&msg.sender)?;
let msg_hash = message_hash(msg);
ensure!(
verify(&pubkey, &msg_hash.0, &msg.signature),
MessageValidation("invalid signature")
);
// 6. Payload decodes
ensure!(
payload_decodes_for_type(&msg.payload, msg.message_type),
MessageValidation("payload decode failed")
);
Ok(())
}
}
§9.6 Access Validation
#![allow(unused)]
fn main() {
fn validate_access(requester: &PeerId, manifest: &Manifest) -> Result<()> {
match manifest.visibility {
Visibility::Private => {
// Private: never accessible externally
return Err(AccessValidation("content is private"));
}
Visibility::Unlisted => {
// Check allowlist if set
if let Some(ref allowlist) = manifest.access.allowlist {
ensure!(
allowlist.contains(requester),
AccessValidation("not in allowlist")
);
}
// Check denylist if set
if let Some(ref denylist) = manifest.access.denylist {
ensure!(
!denylist.contains(requester),
AccessValidation("in denylist")
);
}
}
Visibility::Shared => {
// Only check denylist (allowlist ignored for Shared)
if let Some(ref denylist) = manifest.access.denylist {
ensure!(
!denylist.contains(requester),
AccessValidation("in denylist")
);
}
}
}
// Check bond requirement
if manifest.access.require_bond {
ensure!(
has_bond(requester, manifest.access.bond_amount.unwrap_or(0)),
AccessValidation("bond required")
);
}
Ok(())
}
}
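The visibility matrix above (Private always denied externally, Unlisted honoring both lists, Shared honoring only the denylist) can be condensed into a toy predicate. Plain strings stand in for `PeerId`, and the bond check is omitted; this is a sketch of the decision table, not the real validator.

```rust
pub enum Visibility {
    Private,
    Unlisted,
    Shared,
}

/// Toy restatement of §9.6: returns true if access would be granted.
pub fn access_ok(
    requester: &str,
    visibility: &Visibility,
    allowlist: Option<&[&str]>,
    denylist: Option<&[&str]>,
) -> bool {
    match visibility {
        // Private content is never accessible externally.
        Visibility::Private => false,
        // Unlisted: must be in the allowlist (if set) and not denied.
        Visibility::Unlisted => {
            allowlist.map_or(true, |a| a.contains(&requester))
                && denylist.map_or(true, |d| !d.contains(&requester))
        }
        // Shared: allowlist is ignored; only the denylist applies.
        Visibility::Shared => denylist.map_or(true, |d| !d.contains(&requester)),
    }
}
```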
Error Types
#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum ValidationError {
#[error("Content validation failed: {0}")]
ContentValidation(String),
#[error("Version validation failed: {0}")]
VersionValidation(String),
#[error("Provenance validation failed: {0}")]
ProvenanceValidation(String),
#[error("Payment validation failed: {0}")]
PaymentValidation(String),
#[error("Message validation failed: {0}")]
MessageValidation(String),
#[error("Access validation failed: {0}")]
AccessValidation(String),
#[error("L2 validation failed: {0}")]
L2Validation(String),
#[error("Publish validation failed: {0}")]
PublishValidation(String),
}
}
§9.7 Publish Validation
#![allow(unused)]
fn main() {
/// Validate that content can be published
fn validate_publish(manifest: &Manifest, visibility: Visibility) -> Result<()> {
// L2 can NEVER be published
if manifest.content_type == ContentType::L2 {
return Err(PublishValidation("L2 content cannot be published"));
}
// Cannot publish to a more restricted visibility
// (e.g., can't go from Shared back to Unlisted via PUBLISH)
// This is handled by UNPUBLISH operation instead
Ok(())
}
}
Test Cases
For each validation function, test:
- Valid input passes
- Each invalid condition is caught
- Error message is descriptive
- Edge cases (empty arrays, zero values, max values)
L2-specific tests:
- L2 with visibility != Private fails
- L2 with price != 0 fails
- L2 with empty entities fails
- L2 with duplicate entity IDs fails
- L2 with invalid entity reference in relationship fails
- L2 with invalid URI/CURIE fails
- L2 with unknown prefix fails
- L2 PUBLISH attempt fails
- CURIE expansion works correctly
- Confidence values outside [0,1] fail
Module: nodalync-econ
Source: Protocol Specification §10
Overview
Revenue distribution calculations. Pure functions, no I/O.
Key Design Decision: The settlement contract distributes payments to ALL root contributors directly. When Bob queries Alice’s L3 (which derives from Carol’s L0), the settlement contract pays:
- Alice: 5% synthesis fee + her root shares
- Carol: her root shares
- Any other root contributors: their shares
This ensures trustless distribution — Alice cannot withhold payment from Carol.
Dependencies
- nodalync-types — ProvenanceEntry, Distribution, Amount
§10.1 Revenue Distribution
Constants
#![allow(unused)]
fn main() {
/// Synthesis fee: 5%
pub const SYNTHESIS_FEE_NUMERATOR: u64 = 5;
pub const SYNTHESIS_FEE_DENOMINATOR: u64 = 100;
/// Root pool: 95%
pub const ROOT_POOL_NUMERATOR: u64 = 95;
pub const ROOT_POOL_DENOMINATOR: u64 = 100;
/// Settlement threshold: 100 HBAR (in tinybars)
pub const SETTLEMENT_BATCH_THRESHOLD: Amount = 10_000_000_000;
/// Settlement interval: 1 hour
pub const SETTLEMENT_BATCH_INTERVAL_MS: u64 = 3_600_000;
}
Distribution Function
#![allow(unused)]
fn main() {
/// Distribute payment revenue to owner and root contributors.
///
/// # Arguments
/// * `payment_amount` - Total payment received
/// * `owner` - Content owner (receives synthesis fee)
/// * `provenance` - All root L0+L1 sources with weights
///
/// # Returns
/// Vec of distributions to each recipient
pub fn distribute_revenue(
payment_amount: Amount,
owner: &PeerId,
provenance: &[ProvenanceEntry],
) -> Vec<Distribution> {
let mut distributions = Vec::new();
// Calculate shares
let owner_share = payment_amount * SYNTHESIS_FEE_NUMERATOR / SYNTHESIS_FEE_DENOMINATOR;
let root_pool = payment_amount * ROOT_POOL_NUMERATOR / ROOT_POOL_DENOMINATOR;
// Total weight across all roots
let total_weight: u64 = provenance.iter().map(|e| e.weight as u64).sum();
if total_weight == 0 {
// Edge case: no roots (shouldn't happen for valid L3)
distributions.push(Distribution {
recipient: owner.clone(),
amount: payment_amount,
source_hash: Hash::default(), // Owner's own content
});
return distributions;
}
// Per-weight share (integer division, remainder goes to owner)
let per_weight = root_pool / total_weight;
let mut distributed: Amount = 0;
// Group by owner to aggregate payments
let mut owner_amounts: HashMap<PeerId, Amount> = HashMap::new();
for entry in provenance {
let amount = per_weight * (entry.weight as u64);
distributed += amount;
*owner_amounts.entry(entry.owner.clone()).or_default() += amount;
}
// Add synthesis fee to owner (may already have root shares)
let remainder = root_pool - distributed; // Rounding dust
*owner_amounts.entry(owner.clone()).or_default() += owner_share + remainder;
// Convert to distributions
for (recipient, amount) in owner_amounts {
if amount > 0 {
distributions.push(Distribution {
recipient,
amount,
source_hash: Hash::default(), // Aggregated
});
}
}
distributions
}
}
Example (from spec)
Scenario:
Bob's L3 derives from:
- Alice's L0 (weight: 2)
- Carol's L0 (weight: 1)
- Bob's L0 (weight: 2)
Total weight: 5
Query payment: 100 HBAR
Distribution:
owner_share = 100 * 5/100 = 5 HBAR (Bob's synthesis fee)
root_pool = 100 * 95/100 = 95 HBAR
per_weight = 95 / 5 = 19 HBAR
Alice: 2 * 19 = 38 HBAR
Carol: 1 * 19 = 19 HBAR
Bob (roots): 2 * 19 = 38 HBAR
Bob (synthesis): 5 HBAR
Bob total: 43 HBAR
Final:
Alice: 38 HBAR (38%)
Carol: 19 HBAR (19%)
Bob: 43 HBAR (43%)
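The arithmetic in this example can be checked with a stripped-down version of `distribute_revenue`. Plain strings and `(recipient, weight)` pairs stand in for `PeerId` and `ProvenanceEntry`; the function name `split` is illustrative.

```rust
use std::collections::HashMap;

/// Minimal sketch of the §10.1 split: 5% synthesis fee to the owner,
/// 95% root pool divided by weight, rounding dust back to the owner.
pub fn split(payment: u64, owner: &str, roots: &[(&str, u64)]) -> HashMap<String, u64> {
    let owner_share = payment * 5 / 100;
    let root_pool = payment * 95 / 100;
    let total_weight: u64 = roots.iter().map(|(_, w)| w).sum();
    let per_weight = root_pool / total_weight;

    let mut out: HashMap<String, u64> = HashMap::new();
    let mut distributed = 0;
    for (who, w) in roots {
        let amt = per_weight * w;
        distributed += amt;
        *out.entry(who.to_string()).or_default() += amt;
    }
    // Synthesis fee plus integer-division remainder goes to the owner.
    *out.entry(owner.to_string()).or_default() += owner_share + (root_pool - distributed);
    out
}
```

Running it on the spec's scenario reproduces the 38 / 19 / 43 split, and the amounts always sum back to the original payment.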
§10.3 Price Constraints
#![allow(unused)]
fn main() {
pub const MIN_PRICE: Amount = 1;
pub const MAX_PRICE: Amount = 10_000_000_000_000_000; // 10^16
pub fn validate_price(price: Amount) -> Result<(), EconError> {
if price < MIN_PRICE {
return Err(EconError::PriceTooLow);
}
if price > MAX_PRICE {
return Err(EconError::PriceTooHigh);
}
Ok(())
}
}
§10.4 Settlement Batching
#![allow(unused)]
fn main() {
/// Aggregate payments into settlement batch.
///
/// Combines all pending payments, aggregating by recipient.
pub fn create_settlement_batch(
payments: &[Payment],
) -> SettlementBatch {
let mut by_recipient: HashMap<PeerId, (Amount, Vec<Hash>, Vec<Hash>)> = HashMap::new();
for payment in payments {
// Distribute this payment
let distributions = distribute_revenue(
payment.amount,
&payment.recipient,
&payment.provenance,
);
for dist in distributions {
let entry = by_recipient.entry(dist.recipient.clone()).or_default();
entry.0 += dist.amount;
if !entry.1.contains(&dist.source_hash) {
entry.1.push(dist.source_hash);
}
if !entry.2.contains(&payment.id) {
entry.2.push(payment.id.clone());
}
}
}
let entries: Vec<SettlementEntry> = by_recipient
.into_iter()
.map(|(recipient, (amount, provenance_hashes, payment_ids))| {
SettlementEntry {
recipient,
amount,
provenance_hashes,
payment_ids,
}
})
.collect();
let batch_id = compute_batch_id(&entries);
let merkle_root = compute_merkle_root(&entries);
SettlementBatch {
batch_id,
entries,
merkle_root,
}
}
/// Check if settlement should be triggered.
pub fn should_settle(
pending_total: Amount,
last_settlement: Timestamp,
now: Timestamp,
) -> bool {
// Threshold reached
if pending_total >= SETTLEMENT_BATCH_THRESHOLD {
return true;
}
// Interval elapsed
if now - last_settlement >= SETTLEMENT_BATCH_INTERVAL_MS {
return true;
}
false
}
}
Merkle Root Computation
#![allow(unused)]
fn main() {
/// Compute merkle root of settlement entries.
/// Allows any recipient to verify their inclusion.
pub fn compute_merkle_root(entries: &[SettlementEntry]) -> Hash {
if entries.is_empty() {
return Hash::default();
}
// Leaf hashes
let mut hashes: Vec<Hash> = entries
.iter()
.map(|e| hash_settlement_entry(e))
.collect();
// Build tree
while hashes.len() > 1 {
let mut next_level = Vec::new();
for chunk in hashes.chunks(2) {
if chunk.len() == 2 {
next_level.push(hash_pair(&chunk[0], &chunk[1]));
} else {
next_level.push(chunk[0].clone());
}
}
hashes = next_level;
}
hashes.pop().unwrap()
}
fn hash_settlement_entry(entry: &SettlementEntry) -> Hash {
let mut hasher = Sha256::new();
hasher.update(&entry.recipient.0);
hasher.update(&entry.amount.to_be_bytes());
// ... hash other fields
Hash(hasher.finalize().into())
}
fn hash_pair(a: &Hash, b: &Hash) -> Hash {
let mut hasher = Sha256::new();
hasher.update(&a.0);
hasher.update(&b.0);
Hash(hasher.finalize().into())
}
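The pairing scheme in `compute_merkle_root` (hash adjacent pairs, promote an odd leaf unchanged, repeat until one hash remains) can be illustrated with a self-contained toy. `u64` values and std's `DefaultHasher` stand in for SHA-256 hashes here; the real protocol uses the 32-byte `Hash` type.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash as StdHash, Hasher};

/// Toy combining hash for a pair of child hashes.
fn h(parts: &[u64]) -> u64 {
    let mut hasher = DefaultHasher::new();
    parts.hash(&mut hasher);
    hasher.finish()
}

/// Fold leaf hashes pairwise; an odd trailing leaf is promoted as-is.
pub fn merkle_root(mut level: Vec<u64>) -> Option<u64> {
    if level.is_empty() {
        return None;
    }
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|c| if c.len() == 2 { h(&[c[0], c[1]]) } else { c[0] })
            .collect();
    }
    Some(level[0])
}
```

A single leaf is its own root, and swapping two leaves changes the root, which is what lets a recipient prove inclusion of their exact `SettlementEntry`.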
}
Public API
#![allow(unused)]
fn main() {
// Distribution
pub fn distribute_revenue(
payment_amount: Amount,
owner: &PeerId,
provenance: &[ProvenanceEntry],
) -> Vec<Distribution>;
// Batching
pub fn create_settlement_batch(payments: &[Payment]) -> SettlementBatch;
pub fn should_settle(pending_total: Amount, last_settlement: Timestamp, now: Timestamp) -> bool;
// Validation
pub fn validate_price(price: Amount) -> Result<(), EconError>;
// Merkle proofs
pub fn compute_merkle_root(entries: &[SettlementEntry]) -> Hash;
pub fn create_merkle_proof(entries: &[SettlementEntry], index: usize) -> MerkleProof;
pub fn verify_merkle_proof(root: &Hash, entry: &SettlementEntry, proof: &MerkleProof) -> bool;
}
Test Cases
- Basic distribution: 100 tokens, single root → 95 to root, 5 to owner
- Multiple roots: Verify equal per-weight distribution
- Owner is root: Owner gets synthesis fee + root share
- Rounding: Integer division remainder goes to owner
- Zero payment: Handle gracefully
- Empty provenance: All to owner
- Batch aggregation: Multiple payments to same recipient aggregate
- Merkle proof: Create proof, verify proof
- Settlement trigger: Threshold triggers, interval triggers
Module: nodalync-ops
Source: Protocol Specification §7
Overview
Core protocol operations. Combines storage, validation, and economics to implement the protocol’s business logic.
Key Design Decisions:
- L1 Extraction: Rule-based NLP for MVP. Future: plugin architecture for AI integration.
- Channel Auto-Open: When querying a peer with no channel, auto-open with configurable minimum deposit. Return PAYMENT_REQUIRED if insufficient funds.
- Settlement Queue: This module WRITES to the settlement queue (in nodalync-store). The nodalync-settle module READS from it.
- Payment Distribution: All distributions go to the settlement queue. The settlement contract pays ALL recipients (owner + all root contributors).
Dependencies
- nodalync-types — All data structures
- nodalync-crypto — Hashing, signing
- nodalync-store — Persistence (including settlement queue)
- nodalync-valid — Validation
- nodalync-econ — Revenue distribution
- nodalync-wire — Message encoding
Operations Trait
#![allow(unused)]
fn main() {
#[async_trait]
pub trait Operations {
// === Content Operations ===
/// Create new content locally (not yet published)
async fn create(
&mut self,
content: &[u8],
content_type: ContentType,
metadata: Metadata,
) -> Result<Hash>;
/// Extract L1 mentions from L0 content (rule-based for MVP)
async fn extract_l1(&mut self, hash: &Hash) -> Result<L1Summary>;
/// Build L2 entity graph from L1 sources (always private)
async fn build_l2(
&mut self,
source_l1s: &[Hash],
config: Option<L2BuildConfig>,
) -> Result<Hash>;
/// Merge multiple of your own L2 graphs into one
async fn merge_l2(
&mut self,
source_l2s: &[Hash],
config: Option<L2MergeConfig>,
) -> Result<Hash>;
/// Publish content to the network (NOT allowed for L2)
async fn publish(
&mut self,
hash: &Hash,
visibility: Visibility,
price: Amount,
access: Option<AccessControl>,
) -> Result<()>;
/// Unpublish content (set to Private)
async fn unpublish(&mut self, hash: &Hash) -> Result<()>;
/// Create new version of existing content
async fn update(&mut self, old_hash: &Hash, new_content: &[u8]) -> Result<Hash>;
/// Create L3 from multiple sources (can include L0, L1, L2, L3)
async fn derive(
&mut self,
sources: &[Hash],
insight: &[u8],
metadata: Metadata,
) -> Result<Hash>;
/// Reference external L3 as L0 for derivations
async fn reference_l3_as_l0(&mut self, l3_hash: &Hash) -> Result<()>;
// === Query Operations ===
/// Get L1 preview (free)
async fn preview(&self, hash: &Hash) -> Result<(Manifest, L1Summary)>;
/// Query content (paid) - auto-opens channel if needed
async fn query(&mut self, hash: &Hash, payment: Payment) -> Result<QueryResponse>;
/// Get version history for content
async fn get_versions(&self, version_root: &Hash) -> Result<Vec<VersionInfo>>;
// === Visibility Operations ===
/// Change content visibility (NOT allowed for L2)
async fn set_visibility(&mut self, hash: &Hash, visibility: Visibility) -> Result<()>;
/// Update access control
async fn set_access(&mut self, hash: &Hash, access: AccessControl) -> Result<()>;
// === Channel Operations ===
/// Open payment channel with peer
async fn open_channel(&mut self, peer: &PeerId, deposit: Amount) -> Result<Hash>;
/// Accept incoming channel open request
async fn accept_channel(&mut self, channel_id: &Hash, deposit: Amount) -> Result<()>;
/// Update channel state (after payment)
async fn update_channel(&mut self, channel_id: &Hash, payment: &Payment) -> Result<()>;
/// Close channel cooperatively
async fn close_channel(&mut self, channel_id: &Hash) -> Result<()>;
/// Dispute channel with on-chain evidence
async fn dispute_channel(&mut self, channel_id: &Hash, state: &ChannelUpdatePayload) -> Result<()>;
// === Settlement Operations ===
/// Trigger settlement batch (called by nodalync-settle or manually)
async fn trigger_settlement(&mut self) -> Result<Option<SettlementBatch>>;
}
}
§7.1.1 CREATE
#![allow(unused)]
fn main() {
async fn create(
&mut self,
content: &[u8],
content_type: ContentType,
metadata: Metadata,
) -> Result<Hash> {
// Reject L2 and L3 through CREATE - they have dedicated operations
match content_type {
ContentType::L2 => {
return Err(Error::InvalidOperation(
"Use build_l2() for L2 content".into()
));
}
ContentType::L3 => {
return Err(Error::InvalidOperation(
"Use derive() for L3 content".into()
));
}
ContentType::L0 | ContentType::L1 => {}
}
// 1. Compute hash
let hash = content_hash(content);
// 2. Create version (v1)
let version = Version {
number: 1,
previous: None,
root: hash.clone(),
timestamp: current_timestamp(),
};
// 3. Compute provenance (L0/L1: self-referential)
let provenance = Provenance {
root_L0L1: vec![ProvenanceEntry {
hash: hash.clone(),
owner: self.identity.peer_id(),
visibility: Visibility::Private,
weight: 1,
}],
derived_from: vec![],
depth: if content_type == ContentType::L0 { 0 } else { 1 },
};
// 4. Create manifest (includes owner)
let manifest = Manifest {
hash: hash.clone(),
content_type,
owner: self.identity.peer_id(),
version,
visibility: Visibility::Private,
access: AccessControl::default(),
metadata,
economics: Economics {
price: 0,
currency: Currency::HBAR,
total_queries: 0,
total_revenue: 0,
},
provenance,
created_at: current_timestamp(),
updated_at: current_timestamp(),
};
// 5. Validate
self.validator.validate_content(content, &manifest)?;
// 6. Store
self.content_store.store_verified(&hash, content)?;
self.manifest_store.store(&manifest)?;
Ok(hash)
}
}
§7.1.2 EXTRACT_L1 (Rule-Based MVP)
L1 extraction identifies atomic facts from L0 content. For MVP, we use rule-based NLP. Future versions will support a plugin architecture for AI-powered extraction.
#![allow(unused)]
fn main() {
/// L1 Extraction trait for pluggable implementations
pub trait L1Extractor {
fn extract(&self, content: &[u8], mime_type: Option<&str>) -> Result<Vec<Mention>>;
}
/// Rule-based extractor for MVP
pub struct RuleBasedExtractor;
impl L1Extractor for RuleBasedExtractor {
fn extract(&self, content: &[u8], mime_type: Option<&str>) -> Result<Vec<Mention>> {
let text = std::str::from_utf8(content)?;
let mut mentions = Vec::new();
// Split into sentences (basic approach)
let sentences: Vec<&str> = text
.split(|c| c == '.' || c == '!' || c == '?')
.filter(|s| !s.trim().is_empty())
.collect();
for (idx, sentence) in sentences.iter().enumerate() {
let trimmed = sentence.trim();
if trimmed.len() < 10 || trimmed.len() > 1000 {
continue; // Skip too short or too long
}
// Basic classification heuristics
let classification = classify_sentence(trimmed);
// Extract entities (basic: capitalized words)
let entities = extract_entities(trimmed);
let mention = Mention {
id: content_hash(format!("{}:{}", idx, trimmed).as_bytes()),
content: trimmed.to_string(),
source_location: SourceLocation {
location_type: LocationType::Paragraph,
reference: format!("sentence_{}", idx),
quote: Some(trimmed.chars().take(500).collect()),
},
classification,
confidence: Confidence::Explicit,
entities,
};
mentions.push(mention);
}
Ok(mentions)
}
}
fn classify_sentence(sentence: &str) -> Classification {
let lower = sentence.to_lowercase();
if lower.contains('%') || lower.contains("percent") ||
lower.chars().any(|c| c.is_numeric()) {
Classification::Statistic
} else if lower.starts_with("according to") || lower.contains("claims") ||
lower.contains("argues") || lower.contains("suggests") {
Classification::Claim
} else if lower.contains("defined as") || lower.contains("refers to") ||
lower.contains("is a") || lower.contains("are a") {
Classification::Definition
} else if lower.contains("method") || lower.contains("approach") ||
lower.contains("technique") || lower.contains("process") {
Classification::Method
} else if lower.contains("found") || lower.contains("result") ||
lower.contains("showed") || lower.contains("demonstrated") {
Classification::Result
} else {
Classification::Observation
}
}
fn extract_entities(sentence: &str) -> Vec<String> {
// Basic entity extraction: capitalized multi-word sequences
sentence
.split_whitespace()
.filter(|w| w.chars().next().map(|c| c.is_uppercase()).unwrap_or(false))
.filter(|w| w.len() > 1)
.map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_string())
.filter(|w| !w.is_empty())
.collect()
}
async fn extract_l1(&mut self, hash: &Hash) -> Result<L1Summary> {
// 1. Load content
let content = self.content_store.load(hash)?
.ok_or(Error::NotFound)?;
let manifest = self.manifest_store.load(hash)?
.ok_or(Error::NotFound)?;
// 2. Extract mentions using configured extractor
let mentions = self.l1_extractor.extract(&content, manifest.metadata.mime_type.as_deref())?;
// 3. Generate summary
let primary_topics: Vec<String> = mentions.iter()
.flat_map(|m| m.entities.iter().cloned())
.take(5)
.collect();
let summary = if !mentions.is_empty() {
format!(
"Contains {} mentions covering topics: {}",
mentions.len(),
primary_topics.join(", ")
)
} else {
"No structured mentions extracted.".to_string()
};
// 4. Create L1Summary
let l1_summary = L1Summary {
l0_hash: hash.clone(),
mention_count: mentions.len() as u32,
preview_mentions: mentions.iter().take(5).cloned().collect(),
primary_topics,
summary: summary.chars().take(500).collect(),
};
// 5. Store L1 data
self.l1_store.store(hash, &l1_summary)?;
Ok(l1_summary)
}
}
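The sentence-splitting step of `RuleBasedExtractor` can be exercised in isolation; `split_candidates` below is a hypothetical standalone reduction of that logic (same terminators and 10..=1000 character window), not part of the module API:

```rust
/// Standalone sketch of the MVP splitting step: split on sentence
/// terminators, trim, and drop candidates outside the length window.
fn split_candidates(text: &str) -> Vec<String> {
    text.split(|c| c == '.' || c == '!' || c == '?')
        .map(str::trim)
        .filter(|s| s.len() >= 10 && s.len() <= 1000)
        .map(str::to_string)
        .collect()
}

fn main() {
    let text = "Rust is memory safe. OK. What about performance? It is fast enough here!";
    let candidates = split_candidates(text);
    // "OK" is dropped (under 10 chars); the other three sentences survive.
    assert_eq!(candidates.len(), 3);
    assert_eq!(candidates[0], "Rust is memory safe");
}
```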
Future Plugin Architecture:
#![allow(unused)]
fn main() {
pub trait L1ExtractorPlugin: Send + Sync {
fn name(&self) -> &str;
fn supported_mime_types(&self) -> Vec<&str>;
fn extract(&self, content: &[u8], mime_type: &str) -> Result<Vec<Mention>>;
}
// Example: AI-powered extractor (future)
pub struct OpenAIExtractor {
api_key: String,
model: String,
}
impl L1ExtractorPlugin for OpenAIExtractor {
fn name(&self) -> &str { "openai" }
fn supported_mime_types(&self) -> Vec<&str> { vec!["text/plain", "text/markdown"] }
fn extract(&self, content: &[u8], mime_type: &str) -> Result<Vec<Mention>> {
// Call OpenAI API...
todo!()
}
}
}
§7.1.2a BUILD_L2 (Entity Graph)
Build a personal L2 entity graph from L1 sources. L2 is always private and never directly monetized.
#![allow(unused)]
fn main() {
async fn build_l2(
&mut self,
source_l1s: &[Hash],
config: Option<L2BuildConfig>,
) -> Result<Hash> {
let config = config.unwrap_or_default();
// 1. Validate we have at least one source
if source_l1s.is_empty() {
return Err(Error::InvalidOperation("build_l2 requires at least one L1 source".into()));
}
if source_l1s.len() > MAX_SOURCE_L1S_PER_L2 {
return Err(Error::InvalidOperation("too many L1 sources".into()));
}
// 2. Load and verify all L1 sources (must be queried or owned)
let mut l1_refs = Vec::new();
let mut all_mentions: Vec<(Hash, Mention)> = Vec::new();
for l1_hash in source_l1s {
// Check if we have it (either owned or cached from query)
let manifest = self.manifest_store.load(l1_hash)
.or_else(|_| self.cache.get_manifest(l1_hash))
.ok_or(Error::L2MissingSource)?;
if manifest.content_type != ContentType::L1 {
return Err(Error::InvalidOperation("source must be L1".into()));
}
// Load L1 summary to get mentions
let l1_summary = self.l1_store.load(l1_hash)?
.ok_or(Error::L2MissingSource)?;
l1_refs.push(L1Reference {
l1_hash: l1_hash.clone(),
l0_hash: l1_summary.l0_hash.clone(),
mention_ids_used: vec![], // Empty = all mentions used
});
// Collect mentions with their L1 hash for reference
for mention in &l1_summary.preview_mentions {
all_mentions.push((l1_hash.clone(), mention.clone()));
}
}
// 3. Extract entities from mentions
let raw_entities = extract_entities_from_mentions(&all_mentions, &config)?;
// 4. Resolve entities (merge duplicates, link to external KBs)
let prefixes = config.prefixes.clone().unwrap_or_default();
let resolved_entities = resolve_entities(raw_entities, &config)?;
if resolved_entities.is_empty() {
return Err(Error::InvalidOperation("no entities extracted".into()));
}
// 5. Extract relationships
let relationships = extract_relationships(&resolved_entities, &all_mentions, &config)?;
// 6. Build L2 graph
let mut l2_graph = L2EntityGraph {
id: Hash::default(), // Computed below
source_l1s: l1_refs,
source_l2s: vec![],
prefixes,
entities: resolved_entities.clone(),
relationships: relationships.clone(),
entity_count: resolved_entities.len() as u32,
relationship_count: relationships.len() as u32,
source_mention_count: all_mentions.len() as u32,
};
// 7. Serialize and compute hash
let content = serialize(&l2_graph)?;
let hash = content_hash(&content);
l2_graph.id = hash.clone();
// 8. Compute provenance (merge from all source L1s)
let mut root_entries: Vec<ProvenanceEntry> = Vec::new();
let mut max_depth = 0u32;
for l1_hash in source_l1s {
let manifest = self.manifest_store.load(l1_hash)
.or_else(|_| self.cache.get_manifest(l1_hash))?;
for entry in &manifest.provenance.root_L0L1 {
merge_or_increment(&mut root_entries, entry.clone());
}
max_depth = max_depth.max(manifest.provenance.depth);
}
let provenance = Provenance {
root_L0L1: root_entries,
derived_from: source_l1s.to_vec(),
depth: max_depth + 1,
};
// 9. Create manifest (L2 is ALWAYS private with zero price)
let manifest = Manifest {
hash: hash.clone(),
content_type: ContentType::L2,
owner: self.identity.peer_id(),
version: Version {
number: 1,
previous: None,
root: hash.clone(),
timestamp: current_timestamp(),
},
visibility: Visibility::Private, // L2 is ALWAYS private
access: AccessControl::default(),
metadata: Metadata {
title: format!("Entity Graph ({} entities)", resolved_entities.len()),
description: None,
tags: vec![],
content_size: content.len() as u64,
mime_type: Some("application/x-nodalync-l2".into()),
},
economics: Economics {
price: 0, // L2 is ALWAYS free (never queried)
currency: Currency::HBAR,
total_queries: 0,
total_revenue: 0,
},
provenance,
created_at: current_timestamp(),
updated_at: current_timestamp(),
};
// 10. Validate
self.validator.validate_content(&content, &manifest)?;
// 11. Store
self.content_store.store_verified(&hash, &content)?;
self.manifest_store.store(&manifest)?;
Ok(hash)
}
/// Helper: Extract entities from mentions
fn extract_entities_from_mentions(
mentions: &[(Hash, Mention)],
config: &L2BuildConfig,
) -> Result<Vec<Entity>> {
let mut entities = Vec::new();
let default_type = config.default_entity_type.clone()
.unwrap_or_else(|| "ndl:Concept".into());
for (l1_hash, mention) in mentions {
for entity_name in &mention.entities {
// Create entity with mention reference
let entity = Entity {
id: content_hash(format!("{}:{}", entity_name, default_type).as_bytes()),
canonical_label: entity_name.clone(),
canonical_uri: None,
aliases: vec![],
entity_types: vec![default_type.clone()],
source_mentions: vec![MentionRef {
l1_hash: l1_hash.clone(),
mention_id: mention.id.clone(),
}],
confidence: 0.8, // Default confidence
resolution_method: ResolutionMethod::ExactMatch,
description: None,
same_as: None,
};
entities.push(entity);
}
}
Ok(entities)
}
}
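`merge_or_increment` is called during provenance merging above but not defined in this section. A plausible sketch over a simplified entry type (the real `ProvenanceEntry` also carries owner and visibility):

```rust
/// Simplified provenance entry: root hash plus accumulated weight.
#[derive(Clone, Debug, PartialEq)]
struct Entry {
    hash: u64,
    weight: u64,
}

/// If an entry for the same root hash already exists, add the weights;
/// otherwise append the new entry.
fn merge_or_increment(entries: &mut Vec<Entry>, new: Entry) {
    match entries.iter_mut().find(|e| e.hash == new.hash) {
        Some(existing) => existing.weight += new.weight,
        None => entries.push(new),
    }
}

fn main() {
    let mut entries = vec![Entry { hash: 1, weight: 1 }];
    merge_or_increment(&mut entries, Entry { hash: 1, weight: 2 });
    merge_or_increment(&mut entries, Entry { hash: 2, weight: 1 });
    assert_eq!(entries.len(), 2);
    assert_eq!(entries[0].weight, 3); // hash 1 merged: 1 + 2
}
```

This mirrors the `HashMap` entry-based aggregation used in DERIVE; both produce one weighted entry per distinct root.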
§7.1.2b MERGE_L2
Merge multiple of your own L2 graphs into a unified graph.
#![allow(unused)]
fn main() {
async fn merge_l2(
&mut self,
source_l2s: &[Hash],
config: Option<L2MergeConfig>,
) -> Result<Hash> {
let config = config.unwrap_or_default();
// 1. Validate
if source_l2s.len() < 2 {
return Err(Error::InvalidOperation("merge_l2 requires at least 2 sources".into()));
}
if source_l2s.len() > MAX_SOURCE_L2S_PER_MERGE {
return Err(Error::InvalidOperation("too many L2 sources".into()));
}
// 2. Load all L2 sources (must be local - L2 is never queried)
let mut all_entities: Vec<Entity> = Vec::new();
let mut all_relationships: Vec<Relationship> = Vec::new();
let mut all_l1_refs: Vec<L1Reference> = Vec::new();
let mut merged_prefixes = PrefixMap::default();
let mut root_entries: Vec<ProvenanceEntry> = Vec::new();
let mut max_depth = 0u32;
for l2_hash in source_l2s {
// Must be local (owned)
let manifest = self.manifest_store.load(l2_hash)?
.ok_or(Error::NotFound)?;
if manifest.content_type != ContentType::L2 {
return Err(Error::InvalidOperation("source must be L2".into()));
}
if manifest.owner != self.identity.peer_id() {
return Err(Error::InvalidOperation("can only merge your own L2s".into()));
}
// Load L2 content
let content = self.content_store.load(l2_hash)?
.ok_or(Error::NotFound)?;
let l2_graph: L2EntityGraph = deserialize(&content)?;
// Collect entities, relationships, refs
all_entities.extend(l2_graph.entities);
all_relationships.extend(l2_graph.relationships);
all_l1_refs.extend(l2_graph.source_l1s);
// Merge prefixes (later ones override earlier)
for entry in l2_graph.prefixes.entries {
merged_prefixes.entries.retain(|e| e.prefix != entry.prefix);
merged_prefixes.entries.push(entry);
}
// Merge provenance
for entry in &manifest.provenance.root_L0L1 {
merge_or_increment(&mut root_entries, entry.clone());
}
max_depth = max_depth.max(manifest.provenance.depth);
}
// 3. Deduplicate L1 refs
let mut unique_l1_refs: Vec<L1Reference> = Vec::new();
for l1_ref in all_l1_refs {
if !unique_l1_refs.iter().any(|r| r.l1_hash == l1_ref.l1_hash) {
unique_l1_refs.push(l1_ref);
}
}
// 4. Cross-graph entity resolution
let threshold = config.entity_merge_threshold.unwrap_or(0.8);
let resolved_entities = merge_entities(&all_entities, threshold)?;
// 5. Update relationship entity references
let entity_id_map = build_entity_id_map(&all_entities, &resolved_entities);
let resolved_relationships = update_relationship_refs(&all_relationships, &entity_id_map)?;
// 6. Build merged L2
let mut l2_graph = L2EntityGraph {
id: Hash::default(),
source_l1s: unique_l1_refs,
source_l2s: source_l2s.to_vec(),
prefixes: config.prefixes.clone().unwrap_or(merged_prefixes),
entities: resolved_entities.clone(),
relationships: resolved_relationships.clone(),
entity_count: resolved_entities.len() as u32,
relationship_count: resolved_relationships.len() as u32,
source_mention_count: resolved_entities.iter()
.map(|e| e.source_mentions.len())
.sum::<usize>() as u32,
};
// 7. Hash
let content = serialize(&l2_graph)?;
let hash = content_hash(&content);
l2_graph.id = hash.clone();
// 8. Provenance
let provenance = Provenance {
root_L0L1: root_entries,
derived_from: source_l2s.to_vec(),
depth: max_depth + 1,
};
// 9. Create manifest
let manifest = Manifest {
hash: hash.clone(),
content_type: ContentType::L2,
owner: self.identity.peer_id(),
version: Version {
number: 1,
previous: None,
root: hash.clone(),
timestamp: current_timestamp(),
},
visibility: Visibility::Private,
access: AccessControl::default(),
metadata: Metadata {
title: format!("Merged Entity Graph ({} entities)", resolved_entities.len()),
description: None,
tags: vec![],
content_size: content.len() as u64,
mime_type: Some("application/x-nodalync-l2".into()),
},
economics: Economics {
price: 0,
currency: Currency::HBAR,
total_queries: 0,
total_revenue: 0,
},
provenance,
created_at: current_timestamp(),
updated_at: current_timestamp(),
};
// 10. Validate and store
self.validator.validate_content(&content, &manifest)?;
self.content_store.store_verified(&hash, &content)?;
self.manifest_store.store(&manifest)?;
Ok(hash)
}
}
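`merge_entities` is called above with a similarity threshold but not defined in this section. One plausible strategy, sketched over a simplified entity type, treats case-insensitive label equality as similarity 1.0 and merges when the similarity meets the threshold:

```rust
/// Simplified entity: label plus merged source-mention count.
#[derive(Clone, Debug)]
struct Entity {
    label: String,
    mentions: u32,
}

/// Greedy cross-graph resolution: fold each entity into the first already-
/// merged entity whose similarity meets the threshold, else keep it as new.
fn merge_entities(entities: &[Entity], threshold: f64) -> Vec<Entity> {
    let mut merged: Vec<Entity> = Vec::new();
    for e in entities {
        let similar = merged.iter_mut().find(|m| {
            // Trivial similarity: 1.0 for label equality, 0.0 otherwise.
            let sim = if m.label.eq_ignore_ascii_case(&e.label) { 1.0 } else { 0.0 };
            sim >= threshold
        });
        match similar {
            Some(m) => m.mentions += e.mentions,
            None => merged.push(e.clone()),
        }
    }
    merged
}

fn main() {
    let entities = vec![
        Entity { label: "Hedera".into(), mentions: 2 },
        Entity { label: "hedera".into(), mentions: 1 },
        Entity { label: "HBAR".into(), mentions: 1 },
    ];
    let merged = merge_entities(&entities, 0.8);
    assert_eq!(merged.len(), 2);
    assert_eq!(merged[0].mentions, 3);
}
```

A production resolver would use a real string or embedding similarity so the 0.8 default is meaningful on near-duplicate labels.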
§7.1.3 PUBLISH
#![allow(unused)]
fn main() {
async fn publish(
&mut self,
hash: &Hash,
visibility: Visibility,
price: Amount,
access: Option<AccessControl>,
) -> Result<()> {
// 1. Load manifest
let mut manifest = self.manifest_store.load(hash)?
.ok_or(Error::NotFound)?;
// 2. L2 can NEVER be published
if manifest.content_type == ContentType::L2 {
return Err(Error::L2CannotPublish);
}
// 3. Validate price
validate_price(price)?;
// 4. Update manifest
manifest.visibility = visibility;
manifest.economics.price = price;
if let Some(access) = access {
manifest.access = access;
}
manifest.updated_at = current_timestamp();
// 5. Save
self.manifest_store.update(&manifest)?;
// 6. Announce to network (if Shared)
if visibility == Visibility::Shared {
let l1_summary = self.get_or_extract_l1(hash).await?;
let announce = AnnouncePayload {
hash: hash.clone(),
content_type: manifest.content_type,
title: manifest.metadata.title.clone(),
l1_summary,
price,
addresses: self.network.listen_addresses(),
};
self.network.dht_announce(hash, announce).await?;
}
Ok(())
}
}
§7.1.5 DERIVE (Create L3)
Create L3 insight from sources. Sources can be any combination of:
- L0 (raw documents)
- L1 (mentions)
- L2 (your entity graphs - must be owned, not queried)
- L3 (other insights)
#![allow(unused)]
fn main() {
async fn derive(
&mut self,
sources: &[Hash],
insight: &[u8],
metadata: Metadata,
) -> Result<Hash> {
// 1. Verify all sources are accessible
for source in sources {
let manifest = self.get_manifest(source)?;
match manifest.content_type {
ContentType::L2 => {
// L2 must be owned (it's personal, never queried)
if manifest.owner != self.identity.peer_id() {
return Err(Error::InvalidOperation(
"can only derive from your own L2".into()
));
}
}
_ => {
// Other types: must be queried or owned
if !self.cache.is_cached(source) && !self.content_store.exists(source) {
return Err(Error::SourceNotQueried(source.clone()));
}
}
}
}
// 2. Load source manifests
let source_manifests: Vec<Manifest> = sources.iter()
.map(|h| self.get_manifest(h))
.collect::<Result<Vec<_>>>()?;
// 3. Compute provenance (roots are always L0/L1, traced through L2/L3)
let mut root_entries: HashMap<Hash, ProvenanceEntry> = HashMap::new();
for source in &source_manifests {
for entry in &source.provenance.root_L0L1 {
root_entries.entry(entry.hash.clone())
.and_modify(|e| e.weight += entry.weight)
.or_insert(entry.clone());
}
}
let max_depth = source_manifests.iter()
.map(|s| s.provenance.depth)
.max()
.unwrap_or(0);
let provenance = Provenance {
root_L0L1: root_entries.into_values().collect(),
derived_from: sources.to_vec(),
depth: max_depth + 1,
};
// 4. Compute hash
let hash = content_hash(insight);
// 5. Create version
let version = Version {
number: 1,
previous: None,
root: hash.clone(),
timestamp: current_timestamp(),
};
// 6. Create manifest
let manifest = Manifest {
hash: hash.clone(),
content_type: ContentType::L3,
owner: self.identity.peer_id(),
version,
visibility: Visibility::Private,
access: AccessControl::default(),
metadata,
economics: Economics::default(),
provenance,
created_at: current_timestamp(),
updated_at: current_timestamp(),
};
// 7. Validate
self.validator.validate_provenance(&manifest, &source_manifests)?;
// 8. Store
self.content_store.store_verified(&hash, insight)?;
self.manifest_store.store(&manifest)?;
self.provenance_graph.add(&hash, sources)?;
Ok(hash)
}
}
§7.2.3 QUERY
#![allow(unused)]
fn main() {
/// Configuration for channel auto-open
pub struct ChannelConfig {
/// Minimum deposit when auto-opening a channel
pub min_deposit: Amount,
/// Default deposit for new channels
pub default_deposit: Amount,
}
impl Default for ChannelConfig {
fn default() -> Self {
Self {
min_deposit: 10_000_000_000, // 100 HBAR minimum
default_deposit: 100_000_000_000, // 1000 HBAR default
}
}
}
async fn query(&mut self, hash: &Hash, payment: Payment) -> Result<QueryResponse> {
// As requester
// 1. Get preview for price check and owner discovery
let (manifest, _) = self.preview(hash).await?;
let owner = &manifest.owner;
// 2. Ensure channel exists - AUTO-OPEN if not
if !self.channels.exists(owner) {
// Check if we have sufficient balance for auto-open
let balance = self.get_available_balance().await?;
if balance < self.config.channel.min_deposit {
return Err(Error::PaymentRequired {
message: format!(
"No channel with {} and insufficient balance to auto-open. Need {} tinybars minimum.",
owner, self.config.channel.min_deposit
),
});
}
// Auto-open channel with default deposit
let deposit = std::cmp::min(balance, self.config.channel.default_deposit);
self.open_channel(owner, deposit).await?;
}
// 3. Validate payment amount
if payment.amount < manifest.economics.price {
return Err(Error::PaymentInsufficient);
}
// 4. Check channel balance
let channel = self.channels.get(owner)?
.ok_or(Error::ChannelNotFound)?;
if channel.my_balance < payment.amount {
return Err(Error::InsufficientChannelBalance);
}
// 5. Send request
let request = QueryRequestPayload {
hash: hash.clone(),
query: None,
payment: payment.clone(),
version_spec: None,
};
let response = self.network.send_query(owner, request).await?;
// 6. Verify response
if content_hash(&response.content) != *hash {
return Err(Error::ContentHashMismatch);
}
// 7. Update channel state
self.channels.debit(owner, payment.amount)?;
self.channels.add_payment(owner, payment)?;
// 8. Cache content
self.cache.cache(CachedContent {
hash: hash.clone(),
content: response.content.clone(),
source_peer: owner.clone(),
queried_at: current_timestamp(),
payment_proof: response.payment_receipt.clone(),
})?;
Ok(response)
}
}
Query Handler (receiving side)
The handler queues ALL distributions to the settlement queue. The settlement contract will distribute to all recipients (owner + all root contributors).
#![allow(unused)]
fn main() {
async fn handle_query_request(
&mut self,
sender: &PeerId,
request: QueryRequestPayload,
) -> Result<QueryResponsePayload> {
// 1. Load manifest
let manifest = self.manifest_store.load(&request.hash)?
.ok_or(Error::NotFound)?;
// 2. Validate access
self.validator.validate_access(sender, &manifest)?;
// 3. Validate payment
let channel = self.channels.get(sender)?
.ok_or(Error::ChannelNotFound)?;
self.validator.validate_payment(&request.payment, &channel, &manifest)?;
// 4. Update channel state (credit the payment)
self.channels.credit(sender, request.payment.amount)?;
self.channels.increment_nonce(sender)?;
// 5. Calculate distributions and queue ALL of them
// The settlement contract will pay everyone, including us
let distributions = distribute_revenue(
request.payment.amount,
&manifest.owner,
&manifest.provenance.root_L0L1,
);
for dist in distributions {
self.settlement_queue.enqueue(QueuedDistribution {
payment_id: request.payment.id.clone(),
recipient: dist.recipient,
amount: dist.amount,
source_hash: dist.source_hash,
queued_at: current_timestamp(),
})?;
}
// 6. Update economics
let mut updated_manifest = manifest.clone();
updated_manifest.economics.total_queries += 1;
updated_manifest.economics.total_revenue += request.payment.amount;
self.manifest_store.update(&updated_manifest)?;
// 7. Check if settlement should be triggered
let pending_total = self.settlement_queue.get_pending_total()?;
let last_settlement = self.settlement_queue.get_last_settlement_time()?;
if should_settle(pending_total, last_settlement.unwrap_or(0), current_timestamp()) {
// Queue settlement for async processing
self.settlement_trigger.notify();
}
// 8. Load and return content
let content = self.content_store.load(&request.hash)?
.ok_or(Error::ContentNotFound)?;
let receipt_data = encode_receipt_data(&request.payment, channel.nonce + 1)?;
let receipt = PaymentReceipt {
payment_id: request.payment.id.clone(),
amount: request.payment.amount,
timestamp: current_timestamp(),
channel_nonce: channel.nonce + 1,
distributor_signature: self.identity.sign(&receipt_data)?,
};
Ok(QueryResponsePayload {
hash: request.hash,
content,
manifest: updated_manifest,
payment_receipt: receipt,
})
}
}
§7.1.6 REFERENCE_L3_AS_L0
#![allow(unused)]
fn main() {
async fn reference_l3_as_l0(&mut self, l3_hash: &Hash) -> Result<()> {
// 1. Verify L3 was queried
let cached = self.cache.get(l3_hash)?
.ok_or(Error::SourceNotQueried(l3_hash.clone()))?;
// 2. Verify it's an L3
let manifest = self.get_manifest(l3_hash)?;
if manifest.content_type != ContentType::L3 {
return Err(Error::NotAnL3);
}
// 3. Store reference
// Note: This is a reference, not a copy. The content stays
// in cache/remote. When deriving, we use this hash in sources.
self.references.add_l3_reference(l3_hash, &manifest)?;
Ok(())
}
}
§7.3 Channel Operations
§7.3.1 CHANNEL_OPEN
#![allow(unused)]
fn main() {
async fn open_channel(&mut self, peer: &PeerId, deposit: Amount) -> Result<Hash> {
// 1. Generate channel ID
let nonce = rand::random::<u64>();
let channel_id = content_hash(
&[self.identity.peer_id().0.as_slice(), peer.0.as_slice(), &nonce.to_be_bytes()].concat()
);
// 2. Create channel state
let channel = Channel {
channel_id: channel_id.clone(),
peer_id: peer.clone(),
state: ChannelState::Opening,
my_balance: deposit,
their_balance: 0,
nonce: 0,
last_update: current_timestamp(),
pending_payments: vec![],
};
// 3. Store locally
self.channels.create(peer, channel)?;
// 4. Send open request
let open_msg = ChannelOpenPayload {
channel_id: channel_id.clone(),
initial_balance: deposit,
funding_tx: None, // Off-chain for now, on-chain funding optional
};
let response = self.network.send_channel_open(peer, open_msg).await?;
// 5. Process accept response
self.handle_channel_accept(&channel_id, &response)?;
Ok(channel_id)
}
}
§7.3.2 CHANNEL_ACCEPT (Handler)
#![allow(unused)]
fn main() {
async fn handle_channel_open(
&mut self,
sender: &PeerId,
request: ChannelOpenPayload,
) -> Result<ChannelAcceptPayload> {
// 1. Validate channel doesn't already exist
if self.channels.exists(sender) {
return Err(Error::ChannelAlreadyExists);
}
// 2. Decide on our deposit (could be configurable)
let our_deposit = self.config.channel.default_deposit;
// 3. Create channel state
let channel = Channel {
channel_id: request.channel_id.clone(),
peer_id: sender.clone(),
state: ChannelState::Open,
my_balance: our_deposit,
their_balance: request.initial_balance,
nonce: 0,
last_update: current_timestamp(),
pending_payments: vec![],
};
// 4. Store
self.channels.create(sender, channel)?;
// 5. Return accept
Ok(ChannelAcceptPayload {
channel_id: request.channel_id,
initial_balance: our_deposit,
funding_tx: None,
})
}
fn handle_channel_accept(&mut self, channel_id: &Hash, accept: &ChannelAcceptPayload) -> Result<()> {
// Update channel to Open state with peer's deposit
let channel = self.channels.get_by_id(channel_id)?
.ok_or(Error::ChannelNotFound)?;
let mut updated = channel.clone();
updated.state = ChannelState::Open;
updated.their_balance = accept.initial_balance;
updated.last_update = current_timestamp();
self.channels.update(&updated)?;
Ok(())
}
}
§7.3.3 CHANNEL_CLOSE
Cooperative channel close flow:
- Initiator creates settlement_tx proposal
- Send ChannelClosePayload to peer
- Peer verifies and signs
- Either party submits signed tx to chain
#![allow(unused)]
fn main() {
async fn close_channel(&mut self, channel_id: &Hash) -> Result<()> {
// 1. Get channel
let channel = self.channels.get_by_id(channel_id)?
.ok_or(Error::ChannelNotFound)?;
// 2. Compute final balances
let final_balances = ChannelBalances {
initiator: channel.my_balance,
responder: channel.their_balance,
};
// 3. Create proposed settlement transaction bytes
let settlement_tx = self.settlement.create_close_tx_bytes(
channel_id,
&final_balances,
);
// 4. Sign the proposal
let my_signature = self.identity.sign(&settlement_tx)?;
// 5. Send close request to peer
let close_msg = ChannelClosePayload {
channel_id: channel_id.clone(),
final_balances: final_balances.clone(),
settlement_tx: settlement_tx.clone(),
};
let response = self.network.send_channel_close(&channel.peer_id, close_msg).await?;
// 6. Peer's response includes their signature - submit to chain
// (The response handler on peer side also signs the settlement_tx)
self.settlement.close_channel(
channel_id,
final_balances,
[my_signature, response.peer_signature],
).await?;
// 7. Update local state
self.channels.set_state(channel_id, ChannelState::Closed)?;
Ok(())
}
}
§7.3.4 CHANNEL_DISPUTE
#![allow(unused)]
fn main() {
async fn dispute_channel(&mut self, channel_id: &Hash, our_state: &ChannelUpdatePayload) -> Result<()> {
// 1. Submit dispute to chain with our latest signed state
self.settlement.dispute_channel(channel_id, our_state).await?;
// 2. Update local state
self.channels.set_state(channel_id, ChannelState::Disputed)?;
// 3. Wait for dispute period (24 hours) - handled by settlement module
Ok(())
}
}
§7.4 Version Operations
handle_version_request
#![allow(unused)]
fn main() {
async fn handle_version_request(
&mut self,
_sender: &PeerId,
request: VersionRequestPayload,
) -> Result<VersionResponsePayload> {
// 1. Get all versions for this root
let versions = self.manifest_store.get_versions(&request.version_root)?;
if versions.is_empty() {
return Err(Error::NotFound);
}
// 2. Find latest
let latest = versions.iter()
.max_by_key(|m| m.version.number)
.unwrap();
// 3. Convert to VersionInfo
let version_infos: Vec<VersionInfo> = versions.iter()
.map(|m| VersionInfo {
hash: m.hash.clone(),
number: m.version.number,
timestamp: m.version.timestamp,
visibility: m.visibility,
price: m.economics.price,
})
.collect();
Ok(VersionResponsePayload {
version_root: request.version_root,
versions: version_infos,
latest: latest.hash.clone(),
})
}
}
§7.5 Settlement Operations
trigger_settlement
Called periodically or when the pending total crosses the settlement threshold. Creates a batch and submits it to the chain.
#![allow(unused)]
fn main() {
async fn trigger_settlement(&mut self) -> Result<Option<SettlementBatch>> {
// 1. Check if settlement needed
let pending_total = self.settlement_queue.get_pending_total()?;
let last_settlement = self.settlement_queue.get_last_settlement_time()?;
if !should_settle(pending_total, last_settlement.unwrap_or(0), current_timestamp()) {
return Ok(None);
}
// 2. Get pending distributions
let pending = self.settlement_queue.get_pending()?;
if pending.is_empty() {
return Ok(None);
}
// 3. Create batch (aggregates by recipient)
let payments: Vec<Payment> = pending.iter()
.map(|d| reconstruct_payment(d))
.collect();
let batch = create_settlement_batch(&payments);
// 4. Submit to chain
let tx_id = self.settlement.settle_batch(batch.clone()).await?;
// 5. Mark as settled
let payment_ids: Vec<Hash> = pending.iter().map(|d| d.payment_id.clone()).collect();
self.settlement_queue.mark_settled(&payment_ids, &batch.batch_id)?;
self.settlement_queue.set_last_settlement_time(current_timestamp())?;
// 6. Broadcast confirmation
let confirm = SettleConfirmPayload {
batch_id: batch.batch_id.clone(),
transaction_id: tx_id.to_vec(),
timestamp: current_timestamp(),
};
self.network.broadcast_settlement_confirm(confirm).await?;
Ok(Some(batch))
}
}
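The `should_settle` check in step 1 combines a value threshold with a time interval. A minimal self-contained sketch, with illustrative constants (the real threshold and interval come from the spec/configuration):

```rust
// Assumed illustrative constants -- the protocol's actual values are configured.
const SETTLEMENT_THRESHOLD: u64 = 100_000_000; // tinybars (assumption)
const SETTLEMENT_INTERVAL_MS: u64 = 24 * 60 * 60 * 1000; // 24h (assumption)

fn should_settle(pending_total: u64, last_settlement: u64, now: u64) -> bool {
    // Settle once enough value has accumulated...
    pending_total >= SETTLEMENT_THRESHOLD
        // ...or once the interval has elapsed and anything is pending.
        || (pending_total > 0
            && now.saturating_sub(last_settlement) >= SETTLEMENT_INTERVAL_MS)
}

fn main() {
    assert!(should_settle(150_000_000, 0, 1_000)); // over threshold
    assert!(!should_settle(10, 0, 1_000)); // too small, too soon
    assert!(should_settle(10, 0, SETTLEMENT_INTERVAL_MS)); // interval elapsed
}
```

Either condition alone suffices, which matches the two settlement test cases below ("Settlement threshold" and "Settlement interval").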
Public API Summary
#![allow(unused)]
fn main() {
// Content lifecycle
pub async fn create(...) -> Result<Hash>; // L0/L1 only
pub async fn extract_l1(...) -> Result<L1Summary>; // L0 → L1
pub async fn build_l2(...) -> Result<Hash>; // L1s → L2 (always private)
pub async fn merge_l2(...) -> Result<Hash>; // L2s → L2 (always private)
pub async fn publish(...) -> Result<()>; // NOT allowed for L2
pub async fn unpublish(...) -> Result<()>;
pub async fn update(...) -> Result<Hash>;
pub async fn derive(...) -> Result<Hash>; // Any sources → L3
pub async fn reference_l3_as_l0(...) -> Result<()>;
// Querying (L2 is never queried)
pub async fn preview(...) -> Result<(Manifest, L1Summary)>;
pub async fn query(...) -> Result<QueryResponse>;
pub async fn get_versions(...) -> Result<Vec<VersionInfo>>;
// Visibility/access (L2 is always private)
pub async fn set_visibility(...) -> Result<()>;
pub async fn set_access(...) -> Result<()>;
// Channel operations
pub async fn open_channel(...) -> Result<Hash>;
pub async fn accept_channel(...) -> Result<()>;
pub async fn close_channel(...) -> Result<()>;
pub async fn dispute_channel(...) -> Result<()>;
// Settlement (L2 is invisible to settlement)
pub async fn trigger_settlement(...) -> Result<Option<SettlementBatch>>;
// Handlers (for incoming messages - no L2 handlers needed)
pub async fn handle_preview_request(...) -> Result<PreviewResponsePayload>;
pub async fn handle_query_request(...) -> Result<QueryResponsePayload>;
pub async fn handle_version_request(...) -> Result<VersionResponsePayload>;
pub async fn handle_channel_open(...) -> Result<ChannelAcceptPayload>;
pub async fn handle_channel_close(...) -> Result<ChannelClosePayload>;
}
Test Cases
Content Lifecycle
- Create L0: Creates content, hash matches, owner set
- Create L2 via create(): Fails with “Use build_l2()”
- Create L3 via create(): Fails with “Use derive()”
- Extract L1: Rule-based extraction produces mentions
L2 Entity Graph
- Build L2 from L1s: Creates private L2, entities extracted
- Build L2 no sources: Fails
- Build L2 from non-L1: Fails
- Build L2 from unqueried L1: Fails
- Merge L2s: Combines entities, updates relationships
- Merge L2s from different owners: Fails (“can only merge your own L2s”)
- Merge single L2: Fails (requires >= 2)
- L2 is always private: visibility forced to Private
- L2 has zero price: price forced to 0
- Publish L2: Fails with L2CannotPublish
L3 Derivation
- Derive L3 from L1: Works, provenance correct
- Derive L3 from L2: Works if owned, provenance traces to L0/L1
- Derive L3 from someone else’s L2: Fails
- Derive L3 from mix: L0, L1, L2, L3 all work together
Publishing
- Publish L0/L1/L3: Works, visibility changes
- Unpublish: Visibility returns to Private
- Update version: New hash, version links correctly
Query Flow
- Query flow: Request → auto-open channel → payment → response → cache
- Query with existing channel: Uses existing channel
- Query insufficient balance: Returns PAYMENT_REQUIRED
- Query L2: Not possible (L2 is always private)
- Access denied: Private content returns NotFound
- Unlisted access: With hash works, without fails
- Insufficient payment: Rejected
Economics
- L3 from L2 provenance: root_L0L1 contains original L0/L1, not L2
- Settlement for L3: L2 creator gets nothing, L0/L1 creators paid
Other Operations
- Reference L3: Only works if queried first
- Channel open: Creates channel, both sides have state
- Channel close: Cooperative close submits to chain
- Channel dispute: Submits dispute with latest state
- Version request: Returns all versions for root
- Settlement trigger: Creates batch, submits to chain
- Settlement threshold: Triggers when threshold reached
- Settlement interval: Triggers after time elapsed
Module: nodalync-net
Source: Protocol Specification §11
Overview
P2P networking using libp2p. Handles peer discovery, DHT, and message routing.
Key Design Decisions:
- Hash-Only Lookup for MVP: The protocol supports hash-based content discovery only. Keyword/semantic search is an application-layer concern and out of scope for the core protocol. Users discover content via external channels (social media, links, recommendations) and use the protocol to query by hash.
- DHT stores: the content_hash -> AnnouncePayload mapping. This allows anyone with a hash to find the content owner’s addresses and metadata.
- No search index: The DHT is NOT an inverted index. Future application-layer services can build search functionality on top of the protocol.
Dependencies
- nodalync-types — All data structures
- nodalync-wire — Message encoding
- nodalync-ops — Operation handlers
- libp2p — P2P networking stack
§11.1 Transport
#![allow(unused)]
fn main() {
pub fn build_transport(identity: &Keypair) -> Boxed<(PeerId, StreamMuxerBox)> {
let tcp = tcp::tokio::Transport::new(tcp::Config::default().nodelay(true));
let transport = tcp
.upgrade(Version::V1)
.authenticate(noise::Config::new(&identity).unwrap())
.multiplex(yamux::Config::default())
.boxed();
transport
}
}
Supported transports:
- TCP (primary)
- QUIC (optional, for better performance)
- WebSocket (optional, for browser nodes)
Security:
- Noise protocol (XX handshake pattern)
Multiplexing:
- yamux (primary)
- mplex (fallback)
§11.2 Discovery (DHT)
Kademlia Configuration
#![allow(unused)]
fn main() {
pub fn build_kademlia(peer_id: PeerId) -> Kademlia<MemoryStore> {
let mut config = KademliaConfig::default();
config.set_query_timeout(Duration::from_secs(60));
config.set_replication_factor(NonZeroUsize::new(DHT_REPLICATION).unwrap());
let store = MemoryStore::new(peer_id);
Kademlia::with_config(peer_id, store, config)
}
// Constants from spec
const DHT_BUCKET_SIZE: usize = 20;
const DHT_ALPHA: usize = 3;
const DHT_REPLICATION: usize = 20;
}
Content Announcement
#![allow(unused)]
fn main() {
/// Announce content availability to DHT
/// Stores: content_hash -> AnnouncePayload
pub async fn dht_announce(&mut self, hash: &Hash, payload: AnnouncePayload) -> Result<()> {
let key = Key::new(&hash.0);
let value = encode_payload(&payload)?;
self.kademlia.put_record(Record::new(key, value), Quorum::Majority).await?;
Ok(())
}
/// Lookup content by hash (the ONLY lookup mechanism in protocol)
/// Returns owner's addresses and metadata if found
pub async fn dht_get(&mut self, hash: &Hash) -> Result<Option<AnnouncePayload>> {
let key = Key::new(&hash.0);
match self.kademlia.get_record(key).await {
Ok(record) => {
let payload: AnnouncePayload = decode_payload(&record.value)?;
Ok(Some(payload))
}
Err(GetRecordError::NotFound) => Ok(None),
Err(e) => Err(e.into()),
}
}
/// Remove content announcement from DHT
pub async fn dht_remove(&mut self, hash: &Hash) -> Result<()> {
let key = Key::new(&hash.0);
self.kademlia.remove_record(&key).await?;
Ok(())
}
}
Note on Search:
The protocol does NOT include keyword search. The DHT only supports exact hash lookups. Content discovery happens through application-layer mechanisms:
- External search services (could index L1 summaries)
- Social sharing (users share links containing hashes)
- Recommendations (applications can build on provenance data)
- Curated directories (third parties can maintain topic indexes)
This keeps the protocol minimal and focused on trustless content exchange.
§11.3 Peer Discovery
Bootstrap
const BOOTSTRAP_NODES: &[&str] = &[
"/dns4/bootstrap1.nodalync.io/tcp/9000/p2p/12D3KooW...",
"/dns4/bootstrap2.nodalync.io/tcp/9000/p2p/12D3KooW...",
];
pub async fn bootstrap(&mut self) -> Result<()> {
for addr in BOOTSTRAP_NODES {
let addr: Multiaddr = addr.parse()?;
self.swarm.dial(addr)?;
}
// Bootstrap Kademlia
self.kademlia.bootstrap()?;
Ok(())
}
Peer Exchange
#![allow(unused)]
fn main() {
/// Exchange peer lists with connected peers
pub async fn exchange_peers(&mut self) -> Result<()> {
let my_peers: Vec<PeerInfo> = self.connected_peers()
.iter()
.map(|p| self.get_peer_info(p))
.collect();
for peer in self.connected_peers() {
let msg = Message::new(
MessageType::PeerInfo,
encode_payload(&PeerInfoPayload {
peer_id: self.peer_id(),
public_key: self.public_key(),
addresses: self.listen_addresses(),
capabilities: self.capabilities(),
content_count: self.content_count(),
uptime: self.uptime(),
})?,
&self.identity,
);
self.send(&peer, msg).await?;
}
Ok(())
}
}
§11.4 Message Routing
Request-Response Protocol
#![allow(unused)]
fn main() {
#[derive(NetworkBehaviour)]
pub struct NodalyncBehaviour {
kademlia: Kademlia<MemoryStore>,
request_response: request_response::Behaviour<NodalyncCodec>,
gossipsub: gossipsub::Behaviour,
identify: identify::Behaviour,
}
pub struct NodalyncCodec;
impl request_response::Codec for NodalyncCodec {
type Protocol = &'static str;
type Request = Message;
type Response = Message;
fn protocol(&self) -> Self::Protocol {
"/nodalync/1.0.0"
}
async fn read_request(&mut self, io: &mut impl AsyncRead) -> io::Result<Self::Request> {
let bytes = read_length_prefixed(io, MAX_MESSAGE_SIZE).await?;
decode_message(&bytes).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
async fn write_response(&mut self, io: &mut impl AsyncWrite, msg: Self::Response) -> io::Result<()> {
let bytes = encode_message(&msg)?;
write_length_prefixed(io, &bytes).await
}
// read_response and write_request are implemented analogously (omitted for brevity)
}
}
Send/Receive
#![allow(unused)]
fn main() {
/// Send message to specific peer
pub async fn send(&mut self, peer: &PeerId, message: Message) -> Result<Message> {
let response = self.request_response
.send_request(peer, message)
.await
.map_err(|e| Error::Network(e.to_string()))?;
Ok(response)
}
/// Broadcast announcement via GossipSub
pub async fn broadcast(&mut self, message: Message) -> Result<()> {
let topic = gossipsub::IdentTopic::new("/nodalync/announce/1.0.0");
let bytes = encode_message(&message)?;
self.gossipsub.publish(topic, bytes)?;
Ok(())
}
}
Timeouts and Retries
#![allow(unused)]
fn main() {
const MESSAGE_TIMEOUT: Duration = Duration::from_secs(30);
const MAX_RETRIES: usize = 3;
pub async fn send_with_retry(&mut self, peer: &PeerId, message: Message) -> Result<Message> {
let mut last_error = None;
for attempt in 0..MAX_RETRIES {
match timeout(MESSAGE_TIMEOUT, self.send(peer, message.clone())).await {
Ok(Ok(response)) => return Ok(response),
Ok(Err(e)) => {
last_error = Some(e);
// Exponential backoff
tokio::time::sleep(Duration::from_millis(100 * 2_u64.pow(attempt as u32))).await;
}
Err(_) => {
last_error = Some(Error::Timeout);
}
}
}
Err(last_error.unwrap())
}
}
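The retry loop above sleeps for 100ms × 2^attempt between attempts, so with MAX_RETRIES = 3 the backoff schedule is 100ms, 200ms, 400ms. A small sketch of just the delay computation:

```rust
// Compute the exponential-backoff delays used by send_with_retry.
fn backoff_schedule(retries: u32, base_ms: u64) -> Vec<u64> {
    (0..retries)
        .map(|attempt| base_ms * 2_u64.pow(attempt))
        .collect()
}

fn main() {
    assert_eq!(backoff_schedule(3, 100), vec![100, 200, 400]);
}
```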
Network Trait
#![allow(unused)]
fn main() {
#[async_trait]
pub trait Network {
// Discovery (hash-based only)
async fn dht_announce(&mut self, hash: &Hash, payload: AnnouncePayload) -> Result<()>;
async fn dht_get(&mut self, hash: &Hash) -> Result<Option<AnnouncePayload>>;
async fn dht_remove(&mut self, hash: &Hash) -> Result<()>;
// Messaging
async fn send(&mut self, peer: &PeerId, message: Message) -> Result<Message>;
async fn broadcast(&mut self, message: Message) -> Result<()>;
// Specific message helpers
async fn send_preview_request(&mut self, peer: &PeerId, hash: &Hash) -> Result<PreviewResponsePayload>;
async fn send_query(&mut self, peer: &PeerId, request: QueryRequestPayload) -> Result<QueryResponsePayload>;
async fn send_channel_open(&mut self, peer: &PeerId, request: ChannelOpenPayload) -> Result<ChannelAcceptPayload>;
async fn send_channel_close(&mut self, peer: &PeerId, request: ChannelClosePayload) -> Result<ChannelClosePayload>;
async fn broadcast_settlement_confirm(&mut self, confirm: SettleConfirmPayload) -> Result<()>;
// Peer management
fn connected_peers(&self) -> Vec<PeerId>;
fn listen_addresses(&self) -> Vec<Multiaddr>;
async fn dial(&mut self, addr: Multiaddr) -> Result<()>;
// Event loop
async fn next_event(&mut self) -> NetworkEvent;
}
pub enum NetworkEvent {
MessageReceived { peer: PeerId, message: Message },
PeerConnected(PeerId),
PeerDisconnected(PeerId),
DhtPutComplete { key: Hash, success: bool },
DhtGetResult { key: Hash, value: Option<Vec<u8>> },
}
}
Test Cases
- Bootstrap: Connect to bootstrap nodes
- DHT announce/lookup: Announce content, find it from another node by hash
- DHT remove: Remove announcement, no longer findable
- Request-response: Send query, receive response
- Timeout: Slow peer triggers timeout
- Retry: Failed request retries
- Peer discovery: Find peers through DHT
- GossipSub: Broadcast reaches subscribers
- Channel messages: Open/close flow works
- Settlement broadcast: Confirm reaches all peers
Module: nodalync-settle
Source: Protocol Specification §12
Overview
Blockchain settlement on Hedera Hashgraph. Handles deposits, withdrawals, channel management, and batch settlement.
Key Design Decision: The settlement contract distributes payments to ALL recipients directly. When a settlement batch is submitted, the contract pays:
- Content owners (5% synthesis fee + any root shares they have)
- All root contributors (their proportional shares)
This ensures trustless distribution — content owners cannot withhold payments from upstream contributors. All recipients must have Hedera accounts to receive payments.
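The split described above can be sketched as plain arithmetic. The equal-weight split among roots below is an assumption for illustration; the protocol derives actual shares from provenance data:

```rust
// Illustrative 5%/95% split: owner takes the synthesis fee, the remaining
// pool goes to root contributors. Equal weighting per root is an assumption.
fn distribute(amount: u64, roots: &[&str]) -> (u64, Vec<(String, u64)>) {
    let synthesis_fee = amount * 5 / 100; // 5% to the content owner
    let pool = amount - synthesis_fee; // 95% to root contributors
    let per_root = pool / roots.len() as u64;
    let mut shares: Vec<(String, u64)> =
        roots.iter().map(|r| (r.to_string(), per_root)).collect();
    // Assign integer-division remainder to the first root so the batch sums exactly.
    shares[0].1 += pool - per_root * roots.len() as u64;
    (synthesis_fee, shares)
}

fn main() {
    let (fee, shares) = distribute(1_000, &["alice", "bob", "carol"]);
    assert_eq!(fee, 50); // 5% synthesis fee
    assert_eq!(shares.iter().map(|(_, a)| a).sum::<u64>(), 950); // 95% pool
}
```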
Dependencies
- nodalync-types — Settlement types
- nodalync-econ — Batch creation
- hedera-sdk — Hedera integration
§12.1 Chain Selection
Primary chain: Hedera Hashgraph
Rationale:
- Fast finality (3-5 seconds)
- Low cost (~$0.0001/tx)
- High throughput (10,000+ TPS)
- Enterprise backing (helps with non-crypto user trust)
§12.2 On-Chain Data
Contract State
// Simplified representation of on-chain state
contract NodalyncSettlement {
// Token balances
mapping(address => uint256) public balances;
// Payment channels
struct Channel {
address participant1;
address participant2;
uint256 balance1;
uint256 balance2;
uint64 nonce;
ChannelStatus status;
}
mapping(bytes32 => Channel) public channels;
// Content attestations
struct Attestation {
bytes32 contentHash;
address owner;
uint64 timestamp;
bytes32 provenanceRoot;
}
mapping(bytes32 => Attestation) public attestations;
}
§12.3 Contract Operations
EVM Address Handling
Critical for ECDSA accounts: When interacting with the settlement contract, the EVM address
used by msg.sender differs based on account key type:
| Key Type | EVM Address (msg.sender) |
|---|---|
| ECDSA | Derived from public key: keccak256(uncompressed_pubkey)[12:] |
| Ed25519 | Simple padded account number: 0x000...{account_num_hex} |
For ECDSA accounts, AccountId::to_solidity_address() returns the wrong address for
contract storage lookups. The contract uses msg.sender (the key-derived address) when
storing balances, but queries using to_solidity_address() will look up the wrong slot.
To get the correct EVM address for any account:
curl -s "https://testnet.mirrornode.hedera.com/api/v1/accounts/0.0.ACCOUNT_ID" | jq '.evm_address'
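For Ed25519 accounts, the "long-zero" EVM address in the table above is simply the account number, big-endian, left-padded to 20 bytes. A sketch of that formatting (ECDSA accounts must use the key-derived `evm_address` from the mirror node instead):

```rust
// Long-zero EVM address for an Ed25519 Hedera account:
// the account number hex-encoded and zero-padded to 40 hex chars.
fn long_zero_evm_address(account_num: u64) -> String {
    format!("0x{:040x}", account_num)
}

fn main() {
    assert_eq!(
        long_zero_evm_address(12345), // 12345 = 0x3039
        "0x0000000000000000000000000000000000003039"
    );
}
```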
Deposit/Withdraw
Important: Deposits must call the contract’s deposit() payable function to update
the internal balances mapping. A simple TransferTransaction sends HBAR but does NOT
update the contract’s balance tracking.
#![allow(unused)]
fn main() {
pub async fn deposit(&self, amount: Amount) -> Result<TransactionId> {
// CORRECT: Call the contract's deposit() payable function
let tx = ContractExecuteTransaction::new()
.contract_id(self.contract_id)
.gas(100_000)
.payable_amount(Hbar::from_tinybars(amount as i64))
.function("deposit")
.execute(&self.client)
.await?;
// Wait for the receipt to confirm the call succeeded on-chain
tx.get_receipt(&self.client).await?;
Ok(tx.transaction_id)
}
pub async fn withdraw(&self, amount: Amount) -> Result<TransactionId> {
let tx = ContractExecuteTransaction::new()
.contract_id(self.contract_id)
.gas(100_000)
.function("withdraw")
.function_parameters(ContractFunctionParameters::new().add_uint256(amount))
.execute(&self.client)
.await?;
Ok(tx.transaction_id)
}
}
Content Attestation
#![allow(unused)]
fn main() {
pub async fn attest(
&self,
content_hash: &Hash,
provenance_root: &Hash,
) -> Result<TransactionId> {
let tx = ContractExecuteTransaction::new()
.contract_id(self.contract_id)
.function("attest")
.function_parameters(
ContractFunctionParameters::new()
.add_bytes32(&content_hash.0)
.add_bytes32(&provenance_root.0)
)
.execute(&self.client)
.await?;
Ok(tx.transaction_id)
}
}
Channel Operations
#![allow(unused)]
fn main() {
pub async fn open_channel(
&self,
peer: &AccountId,
my_deposit: Amount,
peer_deposit: Amount,
) -> Result<(ChannelId, TransactionId)> {
let channel_id = compute_channel_id(&self.account_id, peer);
let tx = ContractExecuteTransaction::new()
.contract_id(self.contract_id)
.function("openChannel")
.function_parameters(
ContractFunctionParameters::new()
.add_bytes32(&channel_id.0)
.add_address(peer)
.add_uint256(my_deposit)
.add_uint256(peer_deposit)
)
.execute(&self.client)
.await?;
Ok((channel_id, tx.transaction_id))
}
pub async fn close_channel(
&self,
channel_id: &ChannelId,
final_balances: ChannelBalances,
signatures: [Signature; 2],
) -> Result<TransactionId> {
// NOTE: The spec's ChannelClosePayload.settlement_tx is the encoded
// bytes of this on-chain call. Both parties must agree on final_balances
// and sign before submitting.
let tx = ContractExecuteTransaction::new()
.contract_id(self.contract_id)
.function("closeChannel")
.function_parameters(
ContractFunctionParameters::new()
.add_bytes32(&channel_id.0)
.add_uint256(final_balances.initiator)
.add_uint256(final_balances.responder)
.add_bytes(&signatures[0].0)
.add_bytes(&signatures[1].0)
)
.execute(&self.client)
.await?;
Ok(tx.transaction_id)
}
/// Create settlement_tx bytes for ChannelClosePayload
pub fn create_close_tx_bytes(
&self,
channel_id: &ChannelId,
final_balances: &ChannelBalances,
) -> Vec<u8> {
// Encode the proposed close transaction for P2P negotiation
let mut bytes = Vec::new();
bytes.extend_from_slice(&channel_id.0);
bytes.extend_from_slice(&final_balances.initiator.to_be_bytes());
bytes.extend_from_slice(&final_balances.responder.to_be_bytes());
bytes
}
pub async fn dispute_channel(
&self,
channel_id: &ChannelId,
claimed_state: &ChannelUpdatePayload,
) -> Result<TransactionId> {
let tx = ContractExecuteTransaction::new()
.contract_id(self.contract_id)
.function("disputeChannel")
.function_parameters(
ContractFunctionParameters::new()
.add_bytes32(&channel_id.0)
.add_uint64(claimed_state.nonce)
.add_uint256(claimed_state.balances.initiator)
.add_uint256(claimed_state.balances.responder)
.add_bytes(&claimed_state.signature.0)
)
.execute(&self.client)
.await?;
Ok(tx.transaction_id)
}
/// Resolve a dispute after the dispute period (24 hours).
/// The contract will use the highest-nonce state submitted during the dispute period.
pub async fn resolve_dispute(
&self,
channel_id: &ChannelId,
) -> Result<TransactionId> {
// After CHANNEL_DISPUTE_PERIOD_MS (24 hours), anyone can call resolve
let tx = ContractExecuteTransaction::new()
.contract_id(self.contract_id)
.function("resolveDispute")
.function_parameters(
ContractFunctionParameters::new()
.add_bytes32(&channel_id.0)
)
.execute(&self.client)
.await?;
Ok(tx.transaction_id)
}
/// Submit a counter-claim during dispute period with a higher nonce state
pub async fn counter_dispute(
&self,
channel_id: &ChannelId,
better_state: &ChannelUpdatePayload,
) -> Result<TransactionId> {
// If we have a state with higher nonce, submit it to win the dispute
let tx = ContractExecuteTransaction::new()
.contract_id(self.contract_id)
.function("counterDispute")
.function_parameters(
ContractFunctionParameters::new()
.add_bytes32(&channel_id.0)
.add_uint64(better_state.nonce)
.add_uint256(better_state.balances.initiator)
.add_uint256(better_state.balances.responder)
.add_bytes(&better_state.signature.0)
)
.execute(&self.client)
.await?;
Ok(tx.transaction_id)
}
}
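The dispute/counter-dispute flow above reduces to one rule: among all signed states submitted during the dispute window, the highest nonce wins. A minimal sketch of that selection, using a simplified state type in place of ChannelUpdatePayload:

```rust
// Simplified stand-in for ChannelUpdatePayload (signatures elided).
#[derive(Clone, Debug, PartialEq)]
struct SubmittedState {
    nonce: u64,
    initiator: u64,
    responder: u64,
}

// The contract resolves a dispute to the highest-nonce submitted state.
fn winning_state(submissions: &[SubmittedState]) -> Option<&SubmittedState> {
    submissions.iter().max_by_key(|s| s.nonce)
}

fn main() {
    let states = [
        SubmittedState { nonce: 7, initiator: 60, responder: 40 },
        SubmittedState { nonce: 9, initiator: 55, responder: 45 },
    ];
    assert_eq!(winning_state(&states).unwrap().nonce, 9);
}
```

This is why a counter-dispute only needs to present a state with a higher nonce than the disputer's claim.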
Batch Settlement
#![allow(unused)]
fn main() {
pub async fn settle_batch(&self, batch: SettlementBatch) -> Result<TransactionId> {
// Encode batch entries
let entries_encoded: Vec<Vec<u8>> = batch.entries
.iter()
.map(|e| encode_settlement_entry(e))
.collect();
let tx = ContractExecuteTransaction::new()
.contract_id(self.contract_id)
.function("settleBatch")
.function_parameters(
ContractFunctionParameters::new()
.add_bytes32(&batch.batch_id.0)
.add_bytes32(&batch.merkle_root.0)
.add_bytes_array(&entries_encoded)
)
.execute(&self.client)
.await?;
Ok(tx.transaction_id)
}
}
Settlement Trait
#![allow(unused)]
fn main() {
#[async_trait]
pub trait Settlement {
// Balance management
async fn deposit(&self, amount: Amount) -> Result<TransactionId>;
async fn withdraw(&self, amount: Amount) -> Result<TransactionId>;
async fn get_balance(&self) -> Result<Amount>;
// Attestations
async fn attest(&self, content_hash: &Hash, provenance_root: &Hash) -> Result<TransactionId>;
async fn get_attestation(&self, content_hash: &Hash) -> Result<Option<Attestation>>;
// Channels
async fn open_channel(&self, peer: &AccountId, deposit: Amount) -> Result<ChannelId>;
async fn close_channel(&self, channel_id: &ChannelId, final_state: ChannelBalances, signatures: [Signature; 2]) -> Result<TransactionId>;
async fn dispute_channel(&self, channel_id: &ChannelId, state: &ChannelUpdatePayload) -> Result<TransactionId>;
async fn counter_dispute(&self, channel_id: &ChannelId, better_state: &ChannelUpdatePayload) -> Result<TransactionId>;
async fn resolve_dispute(&self, channel_id: &ChannelId) -> Result<TransactionId>;
// Batch settlement - distributes to ALL recipients in the batch
async fn settle_batch(&self, batch: SettlementBatch) -> Result<TransactionId>;
async fn verify_settlement(&self, tx_id: &TransactionId) -> Result<SettlementStatus>;
}
pub enum SettlementStatus {
Pending,
Confirmed { block: u64, timestamp: Timestamp },
Failed { reason: String },
}
}
Configuration
[settlement]
# Hedera network: mainnet, testnet, previewnet
network = "testnet"
# Account ID (format: 0.0.12345)
account_id = "0.0.12345"
# Private key (or path to file)
private_key_path = "~/.nodalync/hedera.key"
# Contract ID
contract_id = "0.0.67890"
# Gas limits
max_gas_attest = 100000
max_gas_settle = 500000
Test Cases (Testnet)
- Deposit: Deposit tokens → balance increases
- Withdraw: Withdraw tokens → balance decreases
- Attest: Create attestation → retrievable on-chain
- Channel lifecycle: Open → update → close
- Dispute initiation: Submit dispute → channel enters Disputed state
- Counter dispute: Submit higher-nonce state → wins dispute
- Dispute resolution: After 24h → resolve settles to highest nonce
- Batch settlement: Multiple recipients settled in one tx
- Batch distribution: All root contributors receive correct amounts
- Merkle verification: Prove inclusion in batch
Debugging & Verification
Verify Transactions On-Chain
After any settlement operation, always verify on-chain status:
# Check recent transactions - should show CONTRACTCALL, not just CRYPTOTRANSFER
curl -s "https://testnet.mirrornode.hedera.com/api/v1/transactions?account.id=0.0.ACCOUNT&limit=5&order=desc" \
| jq '.transactions[] | {timestamp: .consensus_timestamp, type: .name, result: .result}'
# Check contract calls specifically
curl -s "https://testnet.mirrornode.hedera.com/api/v1/contracts/0.0.7729011/results?limit=5&order=desc" \
| jq '.results[] | {timestamp, from, result: .error_message}'
Check Contract State
# View all storage slots
curl -s "https://testnet.mirrornode.hedera.com/api/v1/contracts/0.0.7729011/state" | jq '.state'
# Query balance for an address (balances mapping, selector 0x27e235e3)
# Replace EVM_ADDRESS with 40 hex chars (no 0x prefix)
curl -s -X POST "https://testnet.mirrornode.hedera.com/api/v1/contracts/call" \
-H "Content-Type: application/json" \
-d '{
"block": "latest",
"data": "0x27e235e3000000000000000000000000EVM_ADDRESS",
"to": "0xc6b4bFD28AF2F6999B32510557380497487A60dD"
}' | jq '.result'
Check Event Logs
# View deposit/withdraw events (shows actual credited address)
curl -s "https://testnet.mirrornode.hedera.com/api/v1/contracts/0.0.7729011/results/logs?order=desc&limit=10" \
| jq '.logs[] | {timestamp, topics, data}'
Common Issues
| Symptom | Cause | Solution |
|---|---|---|
| Transaction shows CRYPTOTRANSFER not CONTRACTCALL | Using TransferTransaction instead of ContractExecuteTransaction | Use ContractExecuteTransaction with payable_amount() |
| Balance query returns 0 after deposit | Wrong EVM address for ECDSA accounts | Use key-derived evm_address from mirror node |
| CONTRACT_REVERT_EXECUTED | Contract logic rejected the call | Check function parameters, balances, or channel state |
| CLI shows success but contract reverts | Receipt status not properly checked | Verify via mirror node API |
Contract Function Selectors
| Function | Selector | Notes |
|---|---|---|
| deposit() | 0xd0e30db0 | Payable, no parameters |
| withdraw(uint256) | 0x2e1a7d4d | Amount in tinybars |
| balances(address) | 0x27e235e3 | Public mapping getter |
| openChannel(bytes32,address,uint256,uint256) | 0xcf027915 | channelId, peer, deposit1, deposit2 |
| closeChannel(bytes32,uint256,uint256,bytes) | varies | channelId, bal1, bal2, signatures |
| settleBatch(bytes32,bytes32,bytes[]) | varies | batchId, merkleRoot, entries |
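Contract amounts are denominated in tinybars while the CLI displays HBAR; on Hedera, 1 HBAR = 100,000,000 tinybars. A small conversion sketch (the `format_hbar` helper name is illustrative, not from the spec):

```rust
// Fixed Hedera denomination: 1 HBAR = 100,000,000 tinybars.
const TINYBARS_PER_HBAR: u64 = 100_000_000;

// Render a tinybar amount as a full-precision HBAR string.
fn format_hbar(tinybars: u64) -> String {
    format!(
        "{}.{:08}",
        tinybars / TINYBARS_PER_HBAR,
        tinybars % TINYBARS_PER_HBAR
    )
}

fn main() {
    assert_eq!(format_hbar(150_000_000), "1.50000000");
    assert_eq!(format_hbar(5_000_000), "0.05000000");
}
```

A display layer would typically trim trailing zeros (e.g. "0.05 HBAR" as in the CLI transcripts below).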
Module: nodalync-cli
Source: Not in spec (application layer)
Overview
Command-line interface for interacting with a Nodalync node. User-facing binary.
Dependencies
- All nodalync-* crates
- clap — Argument parsing
- indicatif — Progress bars
- colored — Terminal colors
Commands
Identity
# Initialize new identity
nodalync init
> Identity created: ndl1abc123...
> Configuration saved to <data_dir>/config.toml
# Show identity
nodalync whoami
> PeerId: ndl1abc123...
> Public Key: 0x...
> Addresses: /ip4/0.0.0.0/tcp/9000
Content Management
# Publish content
nodalync publish <file> [--price <amount>] [--visibility <private|unlisted|shared>]
> Hashing content...
> Extracting L1 mentions... (23 found)
> Published: a1b2c3d4e5f6...
> Price: 0.10 HBAR
> Visibility: shared
# List local content
nodalync list [--visibility <filter>]
> SHARED (3)
> a1b2c3d4e5f6... "Research Paper" v3, 0.10 HBAR, 847 queries
> b7c8d9e0f1a2... "Analysis" v1, 0.05 HBAR, 234 queries
>
> PRIVATE (2)
> d9e0f1a2b3c4... "Draft Ideas" v4
> e5f6a7b8c9d0... "Personal Notes" v1
# Update content (new version)
nodalync update <hash> <new-file>
> Previous: a1b2c3d4e5f6... (v1)
> New: b7c8d9e0f1a2... (v2)
> Version root: a1b2c3d4e5f6...
# Show versions
nodalync versions <hash>
> Version root: a1b2c3d4e5f6...
> v1: a1b2c3d4e5f6... (2025-01-15) - shared
> v2: b7c8d9e0f1a2... (2025-01-20) - shared [latest]
# Change visibility
nodalync visibility <hash> --level <private|unlisted|shared>
> Visibility updated: a1b2c3d4e5f6... → shared
# Delete (local only)
nodalync delete <hash>
> Deleted: a1b2c3d4e5f6... (local copy only, provenance preserved)
Discovery & Querying
# Search network
nodalync search "climate change mitigation" [--limit <n>]
> Found 47 results
> [1] b7c8d9e0f1a2... "IPCC Report Summary" by ndl1def... (0.05/query, 847 queries)
> Preview: Global temperatures have risen 1.1°C since pre-industrial...
> [2] c3d4e5f6a7b8... "Carbon Capture Analysis" by ndl1ghi... (0.12/query, 234 queries)
> Preview: Current carbon capture technology can sequester...
# Preview content (free)
nodalync preview <hash>
> Title: "IPCC Report Summary"
> Owner: ndl1def...
> Price: 0.05 HBAR
> Queries: 847
>
> L1 Mentions (5 of 23):
> - Global temperatures have risen 1.1°C since pre-industrial
> - Net-zero by 2050 requires 45% emission reduction by 2030
> - ...
# Query content (paid)
nodalync query <hash>
> Querying b7c8d9e0f1a2...
> Payment: 0.05 HBAR
> Content saved to ./cache/b7c8d9e0f1a2...
Synthesis
# Create L3 insight from sources
nodalync synthesize --sources <hash1>,<hash2>,... --output <file>
> Verifying sources queried... ✓
> Computing provenance (12 roots)...
> L3 hash: f1a2b3c4d5e6...
>
> Publish now? [y/n/set price]: 0.15
> Published: f1a2b3c4d5e6... (0.15 HBAR, shared)
# Reference external L3 as L0
nodalync reference <l3-hash>
> Referencing a1b2c3d4e5f6... as L0 for future derivations
Economics
# Check balance
nodalync balance
> Protocol Balance: 127.50 HBAR
> Pending Earnings: 4.23 HBAR
> Pending Settlement: 12 payments
>
> Breakdown:
> Direct queries: 89.20 HBAR
> Root contributions: 38.30 HBAR
# Earnings by content
nodalync earnings [--content <hash>]
> Top earning content:
> a1b2c3d4e5f6... "Research Paper": 45.30 HBAR (234 queries)
> b7c8d9e0f1a2... "Analysis": 23.10 HBAR (462 queries, as root)
# Deposit tokens
nodalync deposit <amount>
> Depositing 50.00 HBAR...
> Transaction: 0x...
> New balance: 177.50 HBAR
# Withdraw tokens
nodalync withdraw <amount>
> Withdrawing 100.00 HBAR...
> Transaction: 0x...
> New balance: 77.50 HBAR
# Force settlement
nodalync settle
> Settling 12 pending payments...
> Batch ID: 0a1b2c3d4e5f...
> Transaction: 0x...
> Settled: 4.23 HBAR to 5 recipients
Payment Channels
# Open payment channel with peer
nodalync open-channel <peer-id> --deposit 100
> Channel opened: 4d5e6f7a8b9c...
> Peer: ndl1abc123...
> State: Open
> My Balance: 100.00 HBAR
> Their Balance: 100.00 HBAR
# List all payment channels
nodalync list-channels
> Payment Channels: 3 channels (2 open)
> 1a2b3c4d5e6f... ndl1abc... [Open] my: 0.85 HBAR / their: 1.15 HBAR
> 2b3c4d5e6f7a... ndl1def... [Open] my: 2.30 HBAR / their: 0.70 HBAR (5 pending)
> 3c4d5e6f7a8b... ndl1ghi... [Closed] my: 0.00 HBAR / their: 0.00 HBAR
# Close payment channel
nodalync close-channel <peer-id>
> Channel closed: 4d5e6f7a8b9c...
> Peer: ndl1abc123...
> Final Balance: my: 0.85 HBAR / their: 1.15 HBAR
Node Management
# Start node (foreground)
nodalync start
> Starting Nodalync node...
> PeerId: 12D3KooW...
> Listening on /ip4/0.0.0.0/tcp/9000
> Connected to 12 peers
> DHT bootstrapped
# Start with health endpoint (for containers/monitoring)
nodalync start --health --health-port 8080
> Starting Nodalync node...
> PeerId: 12D3KooW...
> Health endpoint: http://0.0.0.0:8080/health
> Metrics endpoint: http://0.0.0.0:8080/metrics
# Start as daemon (background)
nodalync start --daemon
> Nodalync daemon started (PID: 12345)
> PeerId: 12D3KooW...
# Node status
nodalync status
> Node: running (PID: 12345)
> PeerId: 12D3KooW...
> Uptime: 4h 23m
> Peers: 12 connected
> Content: 5 shared, 2 private
> Pending: 12 payments (4.23 HBAR)
# Stop daemon
nodalync stop
> Shutting down gracefully...
> Flushing pending operations...
> Node stopped
Health Endpoints (when --health flag is used):
| Endpoint | Content-Type | Description |
|---|---|---|
| GET /health | application/json | {"status":"ok","connected_peers":N,"uptime_secs":M} |
| GET /metrics | text/plain | Prometheus metrics format |
Prometheus Metrics:
- nodalync_connected_peers — Current peer count
- nodalync_peer_events_total{event} — Connect/disconnect events
- nodalync_dht_operations_total{op,result} — DHT put/get operations
- nodalync_gossipsub_messages_total — Broadcast messages received
- nodalync_settlement_batches_total{status} — Settlement batches
- nodalync_settlement_latency_seconds — Settlement operation latency
- nodalync_queries_total — Total queries processed
- nodalync_query_latency_seconds — Query latency histogram
- nodalync_uptime_seconds — Node uptime
- nodalync_node_info{version,peer_id} — Node metadata
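The /metrics endpoint emits these in the Prometheus text exposition format. A minimal sketch rendering two of the metrics listed above (metric names from this document; the TYPE lines and helper signature are illustrative):

```rust
// Render a couple of the node's metrics in Prometheus text format.
fn render_metrics(connected_peers: u64, uptime_secs: u64) -> String {
    let mut out = String::new();
    out.push_str("# TYPE nodalync_connected_peers gauge\n");
    out.push_str(&format!("nodalync_connected_peers {}\n", connected_peers));
    out.push_str("# TYPE nodalync_uptime_seconds counter\n");
    out.push_str(&format!("nodalync_uptime_seconds {}\n", uptime_secs));
    out
}

fn main() {
    let body = render_metrics(12, 3600);
    assert!(body.contains("nodalync_connected_peers 12"));
    assert!(body.contains("nodalync_uptime_seconds 3600"));
}
```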
CLI Structure
use std::path::PathBuf;
use clap::{Parser, Subcommand, ValueEnum};

#[derive(Parser)]
#[command(name = "nodalync")]
#[command(about = "Nodalync Protocol CLI")]
pub struct Cli {
    #[command(subcommand)]
    pub command: Commands,

    /// Path to config file (default: <data_dir>/config.toml)
    #[arg(short, long)]
    pub config: Option<PathBuf>,

    /// Output format
    #[arg(short, long, default_value = "human")]
    pub format: OutputFormat,
}

#[derive(Subcommand)]
pub enum Commands {
    /// Initialize new identity
    Init,
    /// Show identity info
    Whoami,
    /// Publish content
    Publish {
        file: PathBuf,
        #[arg(short, long)]
        price: Option<f64>,
        #[arg(short, long, default_value = "shared")]
        visibility: Visibility,
    },
    /// List local content
    List {
        #[arg(short, long)]
        visibility: Option<Visibility>,
    },
    /// Search network
    Search {
        query: String,
        #[arg(short, long, default_value = "10")]
        limit: u32,
    },
    /// Preview content (free)
    Preview { hash: String },
    /// Query content (paid)
    Query { hash: String },
    /// Create L3 synthesis
    Synthesize {
        #[arg(short, long, value_delimiter = ',')]
        sources: Vec<String>,
        #[arg(short, long)]
        output: PathBuf,
    },
    /// Check balance
    Balance,
    /// Start node
    Start {
        #[arg(short, long)]
        daemon: bool,
        /// Enable HTTP health endpoint
        #[arg(long)]
        health: bool,
        /// Port for health endpoint (default: 8080)
        #[arg(long, default_value = "8080")]
        health_port: u16,
    },
    /// Node status
    Status,
    /// Stop node
    Stop,
    /// Open payment channel
    OpenChannel {
        peer_id: String,
        #[arg(short, long)]
        deposit: f64,
    },
    /// Close payment channel
    CloseChannel { peer_id: String },
    /// List payment channels
    ListChannels,
    // ... more commands
}

#[derive(Clone, Copy, ValueEnum)]
pub enum OutputFormat {
    Human,
    Json,
}
Output Formatting
pub trait Render {
    fn render_human(&self) -> String;
    fn render_json(&self) -> String;
}

impl Render for SearchResult {
    fn render_human(&self) -> String {
        format!(
            "{} \"{}\" by {} ({}/query, {} queries)\n  Preview: {}",
            self.hash.short(),
            self.title,
            self.owner.short(),
            format_amount(self.price),
            self.total_queries,
            // String::truncate mutates in place and returns ();
            // take the first 80 chars instead
            self.l1_summary.summary.chars().take(80).collect::<String>(),
        )
    }

    fn render_json(&self) -> String {
        serde_json::to_string_pretty(self).unwrap()
    }
}
Error Handling
use clap::Parser;
use colored::Colorize; // for .red().bold()

pub fn run() -> Result<()> {
    let cli = Cli::parse();
    match cli.command {
        Commands::Publish { file, price, visibility } => {
            let result = publish(&file, price, visibility)?;
            println!("{}", result.render(cli.format));
        }
        // ...
    }
    Ok(())
}

fn main() {
    if let Err(e) = run() {
        eprintln!("{}: {}", "Error".red().bold(), e);
        std::process::exit(1);
    }
}
Configuration
Configuration is stored in a platform-specific data directory (set NODALYNC_DATA_DIR to override):
- macOS: ~/Library/Application Support/io.nodalync.nodalync/config.toml
- Linux: ~/.local/share/nodalync/config.toml
- Windows: %APPDATA%\nodalync\nodalync\config.toml
[identity]
keyfile = "<data_dir>/identity/keypair.key"
[storage]
content_dir = "<data_dir>/content"
database = "<data_dir>/nodalync.db"
cache_dir = "<data_dir>/cache"
cache_max_size_mb = 1000
[network]
enabled = true
listen_addresses = ["/ip4/0.0.0.0/tcp/9000"]
bootstrap_nodes = [
"/dns4/nodalync-bootstrap.eastus.azurecontainer.io/tcp/9000/p2p/12D3KooWMqrUmZm4e1BJTRMWqKHCe1TSX9Vu83uJLEyCGr2dUjYm",
]
[settlement]
network = "hedera-testnet"
auto_deposit = false
[economics]
default_price = 0.1 # In HBAR
auto_settle_threshold = 100.0 # In HBAR
[display]
default_format = "human"
show_previews = true
max_search_results = 20
Test Cases
- init: Creates identity and config
- publish: File hashed, L1 extracted, announced
- search: Returns results from network
- query: Pays and retrieves content
- synthesize: Creates L3 with correct provenance
- balance: Shows correct amounts
- JSON output: Valid JSON for all commands
- Error messages: Helpful, actionable errors
- open-channel: Opens channel, both sides have state
- list-channels: Shows all channels with states
- close-channel: Cooperative close, settles on-chain
Module 11: MCP Server
The nodalync-mcp crate provides an MCP (Model Context Protocol) server that enables AI assistants like Claude to query knowledge from a local Nodalync node.
Quick Start
1. Build the CLI
cargo build --release -p nodalync-cli
2. Initialize a Node
./target/release/nodalync init
3. Configure Claude Desktop
Add to your Claude Desktop MCP config (claude_desktop_config.json, typically under ~/Library/Application Support/Claude/ on macOS or ~/.config/Claude/ on Linux):
{
"mcpServers": {
"nodalync": {
"command": "/path/to/nodalync",
"args": ["mcp-server", "--budget", "1.0", "--auto-approve", "0.01"]
}
}
}
4. Restart Claude Desktop
Quit and reopen Claude Desktop to load the MCP server.
CLI Usage
# Start MCP server with defaults (1 HBAR budget, 0.01 auto-approve)
nodalync mcp-server
# Custom budget and auto-approve threshold
nodalync mcp-server --budget 5.0 --auto-approve 0.1
Options
| Flag | Default | Description |
|---|---|---|
| --budget, -b | 1.0 | Total session budget in HBAR |
| --auto-approve, -a | 0.01 | Auto-approve queries under this HBAR amount |
MCP Tools
When the MCP server is running, AI agents have access to these tools:
| Tool | Description |
|---|---|
| query_knowledge | Query content by hash or natural language (paid) |
| list_sources | Browse available content with metadata |
| search_network | Search connected peers for content (requires --enable-network) |
| preview_content | View content metadata without paying |
| publish_content | Publish new content from the agent |
| synthesize_content | Create L3 synthesis from multiple sources |
| update_content | Create a new version of existing content |
| delete_content | Delete content and set visibility to offline |
| set_visibility | Change content visibility |
| list_versions | List all versions of a content item |
| get_earnings | View earnings breakdown by content |
| status | Node health, budget, channels, and Hedera status |
| deposit_hbar | Deposit HBAR to the settlement contract |
| open_channel | Open a payment channel with a peer |
| close_channel | Close a payment channel |
| close_all_channels | Close all open payment channels |
Note: Natural language queries are not yet supported for query_knowledge. Use list_sources or search_network to discover content hashes first.
MCP Resources
knowledge://{hash}
Direct content access by hash. Use list_sources to discover available hashes.
URI Format: knowledge://<base58-encoded-hash>
Example:
knowledge://5dY7Kx9mT2...
Returns the content directly. Payment is handled automatically from session budget.
Architecture
┌──────────────┐ stdio ┌─────────────────┐
│ Claude │ ◄────────────► │ nodalync │
│ Desktop │ MCP │ mcp-server │
└──────────────┘ └────────┬────────┘
│
┌────────────────┼────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ nodalync- │ │ nodalync- │ │ Event Loop │
│ store │ │ net │ │ (background)│
│ (local) │ │ (P2P) │ │ │
└─────────────┘ └─────────────┘ └─────────────┘
Event Processing
When --enable-network is used, the MCP server spawns a background event loop that processes incoming network events (e.g., ChannelAccept messages). This enables full payment channel lifecycle support:
- Channel Open: Server sends ChannelOpen to peer
- Event Loop: Receives ChannelAccept from peer
- State Transition: Channel moves from Opening → Open
- Payments: Channel is ready for micropayments
Budget System
The budget system prevents runaway spending:
- Session Budget: Total HBAR available for the session
- Auto-Approve Threshold: Queries below this cost are approved automatically
- Atomic Tracking: Thread-safe spending with compare_exchange
// Budget is tracked atomically
pub fn try_spend(&self, amount: Amount) -> Result<Amount, McpError> {
    // Atomic compare-and-swap ensures thread safety
}
Error Handling
| Error | Cause | Resolution |
|---|---|---|
| BudgetExceeded | Query cost > remaining budget | Increase budget or use smaller queries |
| ContentNotFound | Hash doesn’t exist locally | Ensure content is published |
| StorageError | Database issues | Check permissions, disk space |
Testing
# Run MCP crate tests
cargo test -p nodalync-mcp
# Test server manually
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | ./target/release/nodalync mcp-server