Complete reference for the CoCoX distributed on-chain analytics platform — Data Coins, Analysis Coins, native R, SQL engine, scrapers, and API.
CoCoX is a distributed on-chain research platform. Anyone can mint structured data tables (Data Coins) and executable R research contracts (Analysis Coins) that run natively on-chain via offchain workers. No external R dependency — all statistical functions are pure Rust.
  SCRAPERS (standalone Node.js)         WEB UI (cocox/*.html)
  FRED  BLS  Census  IMF  World         Data Wallet | Coin IDE | Data Explorer
  Canada  Asia  DeFi                                  │
          │                                           │
          ▼ mint_data_coin                            ▼ /api/cocox-v2/*
┌────────────────────────────────────────────────────────────┐
│  SUBSTRATE CHAIN (pallet-analysis)                         │
│                                                            │
│   ┌─ DATA COINS ───────────────────────────────────────┐   │
│   │ Columns(coin, col, chunk) → Vec<DataValue>         │   │
│   │ Per-column, 1024 rows/chunk                        │   │
│   │ ReadOnly | AppendOnly | Derived(sql_expr)          │   │
│   └────────────────────────────────────────────────────┘   │
│                                                            │
│   ┌─ ANALYSIS COINS ───────────────────────────────────┐   │
│   │ r_script + steps: Vec<ComputeStep>                 │   │
│   │ state: Pending → Solving → Solved → Verified       │   │
│   │ StepResults(coin, step) → bytes                    │   │
│   └─────────────┬──────────────────────────────────────┘   │
│                 │                                          │
│   ┌─ OFFCHAIN WORKER (per block) ──────────────────────┐   │
│   │ DataFetch step → sql::execute_query()              │   │
│   │ NativeFn step → native_r::execute_native()         │   │
│   │ submit_step(signed) → StepResults storage          │   │
│   └────────────────────────────────────────────────────┘   │
│                                                            │
│   ┌─ NATIVE R ─────────────────────────────────────────┐   │
│   │ lm | glm | t.test | cor.test | chisq.test          │   │
│   │ benford | arima | garch | pca | johansen | var     │   │
│   │ sharpe | beta | maxdrawdown | sortino | calmar     │   │
│   │ summary | predict                                  │   │
│   └────────────────────────────────────────────────────┘   │
└────────────────────────────────────────────────────────────┘
Data Coins are on-chain structured tables with per-column chunked storage. Each coin has a declared schema, a provenance source, and a read/write mode. They are public — any Analysis Coin can reference any Data Coin.
| Type | Storage | JSON Example |
|---|---|---|
| Float | (i64 mantissa, u8 decimals) | {Float: [100, 2]} = 1.00 |
| Int | i64 | {Int: 42} |
| Text | Vec<u8> | {Text: "FRED"} |
| TimeStamp | u64 (epoch seconds) | {TimeStamp: 978307200} |
| Bool | bool | {Bool: true} |
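The Float row stores a fixed-point pair rather than a binary float. A minimal Python sketch of that encoding follows; the helper names `decode_float` and `encode_float` are hypothetical and not part of the CoCoX API, and only the `(mantissa, decimals)` pairing comes from the table above.

```python
from decimal import Decimal


def decode_float(mantissa: int, decimals: int) -> Decimal:
    """Turn a (mantissa, decimals) pair into an exact decimal value."""
    # scaleb(-decimals) shifts the decimal point left without float rounding.
    return Decimal(mantissa).scaleb(-decimals)


def encode_float(text: str) -> tuple[int, int]:
    """Split a decimal string into a (mantissa, decimals) pair."""
    value = Decimal(text)
    # A negative exponent counts the digits after the decimal point.
    decimals = max(0, -value.as_tuple().exponent)
    mantissa = int(value.scaleb(decimals))
    return mantissa, decimals


print(decode_float(100, 2))   # the table's example pair
print(encode_float("1.00"))
```

Keeping the value as an integer mantissa plus a digit count avoids platform-dependent rounding, which fits the determinism the platform claims for its computations.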
| Mode | Behavior | Example |
|---|---|---|
| ReadOnly | Minted once, never changes | Research dataset snapshot |
| AppendOnly | Creator adds rows over time | FRED series, price feeds |
| Derived(expr) | Auto-evaluated SQL referencing other coins. Updates every 600 blocks. | Real GDP = GDP / CPI |
| Field | Type | Values |
|---|---|---|
| category | Category | Unknown, InterestRates, Inflation, GDP, Employment, Housing, Trade, Equities, FX, Commodities, Crypto, DeFi, FiscalDebt, MoneySupply, YieldCurve, Volatility, Derivatives, Other |
| frequency | Frequency | Unknown, Daily, Weekly, Monthly, Quarterly, Annual, RealTime |
| geography | Geography | Global, US, EU, Eurozone, UK, Japan, China, India, Canada, Australia, Korea, Singapore, Other |
| tags | Vec<Vec<u8>> | Arbitrary labels: ["monthly", "seasonally_adjusted"] |
Each column is stored in chunks of 1024 values. A Data Coin with 1M rows per column needs 977 chunks (roughly 1,000), ~8 MB per Float column. Storage scales from 300 rows (FRED quarterly) to millions. The limit is configurable via the MaxChunks constant (default 10,000 chunks, about 10M rows).
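The chunk arithmetic can be checked with a few lines of Python. This is only a sketch: `CHUNK_ROWS` and `MAX_CHUNKS` mirror the numbers quoted in the text, while the function names are illustrative and not taken from the pallet.

```python
CHUNK_ROWS = 1024     # values per chunk, as stated in the text
MAX_CHUNKS = 10_000   # default MaxChunks quoted in the text


def chunk_count(rows: int) -> int:
    """Number of chunks needed to hold `rows` values (ceiling division)."""
    return -(-rows // CHUNK_ROWS)


def locate(row: int) -> tuple[int, int]:
    """Map a row index to (chunk index, offset inside that chunk)."""
    return divmod(row, CHUNK_ROWS)


print(chunk_count(317))          # a small quarterly series fits in one chunk
print(chunk_count(1_000_000))    # one million rows
print(locate(2050))              # third chunk, offset 2
print(MAX_CHUNKS * CHUNK_ROWS)   # row capacity at the default limit
```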
| # | Method | Access | Purpose |
|---|---|---|---|
| 5 | mint_data_coin | Signed | Create coin with schema + optional seed rows |
| 6 | append_rows | Creator only | Add rows (validates against schema, AppendOnly only) |
| 7 | update_cell | Creator only | Edit a single cell |
| 10 | deactivate_data_coin | Creator only | Close coin, unlocks COCO deposit |
Automated scrapers run every 30 minutes against public economic APIs. Each observation is appended to an on-chain Data Coin:
1. FRED scraper fetches latest GDP, CPI, UNRATE etc. from api.stlouisfed.org
2. EU scraper fetches ECB rates, Eurostat GDP, World Bank indicators
3. Asia scraper fetches RBI (India), ABS (Australia), PBoC (China), etc.
↓
ensureCoin("GDP"): scan chain for source=FredSeries("GDP") → cache coinId
appendRows(coinId, [[Int:timestamp, Float:value]]) → appended to chain
↓
If coin doesn't exist yet, mintDataCoin creates it with metadata:
category=GDP | frequency=Quarterly | geography=US | source=FredSeries("GDP")
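The ensure-then-append flow above can be sketched in Python against an in-memory stand-in for the chain. Everything here is illustrative: `FakeChain`, `Scraper`, and their methods are hypothetical names, and the real scrapers are Node.js scripts whose client API is not shown in this document.

```python
class FakeChain:
    """In-memory stand-in for the chain; not the real client API."""

    def __init__(self):
        self.coins = {}    # coin_id -> {"source", "meta", "rows"}
        self.next_id = 0

    def find_by_source(self, source):
        for coin_id, coin in self.coins.items():
            if coin["source"] == source:
                return coin_id
        return None

    def mint_data_coin(self, source, meta):
        coin_id = self.next_id
        self.next_id += 1
        self.coins[coin_id] = {"source": source, "meta": meta, "rows": []}
        return coin_id

    def append_rows(self, coin_id, rows):
        self.coins[coin_id]["rows"].extend(rows)


class Scraper:
    def __init__(self, chain):
        self.chain = chain
        self.cache = {}    # series name -> coin id

    def ensure_coin(self, series, meta):
        """Reuse a cached or existing coin; mint only if none is found."""
        if series in self.cache:
            return self.cache[series]
        source = f'FredSeries("{series}")'
        coin_id = self.chain.find_by_source(source)
        if coin_id is None:
            coin_id = self.chain.mint_data_coin(source, meta)
        self.cache[series] = coin_id
        return coin_id


chain = FakeChain()
scraper = Scraper(chain)
meta = {"category": "GDP", "frequency": "Quarterly", "geography": "US"}
first = scraper.ensure_coin("GDP", meta)
chain.append_rows(first, [(978307200, 10470.2)])   # made-up observation
second = scraper.ensure_coin("GDP", meta)          # served from the cache
```

Caching the coin ID after the first lookup means a scraper run scans the chain once per series rather than once per observation.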
Data coins are active on-chain. Full list at Data Explorer.
| Coin ID | Series | Source | Rows | Category | Geo | Freq |
|---|---|---|---|---|---|---|
| 33 | GDP | FredSeries | 317 | GDP | US | Quarterly |
| 35 | UNRATE | FredSeries | 939 | Employment | US | Monthly |
| 44 | HOUST | FredSeries | 808 | Housing | US | Monthly |
| 45 | DGS10 | FredSeries | 16,083 | InterestRates | US | Daily |
| 49 | eurostat:gdp | ExternalRef | 2 | GDP | EU | Quarterly |
Data Coins live on-chain and can be referenced by any Analysis Coin, SQL query, or R script running within the platform. They don't need to be downloaded or exported — you work with them directly inside the Coin IDE.
Scrapers pull economic data every 30 min
↓
Data Coins store it on-chain (columnar, 1024 rows/chunk)
↓
Analysis Coins reference Data Coins by ID or alias
↓
R scripts run inside the chain (native Rust, no external R)
↓
Results stored as Analysis Coin step outputs
Every coin has a numeric ID and a source name. In the Data Explorer, browse the catalog to find coins. In the Coin IDE, reference them by ID with an alias:
# In the Coin IDE, declare inputs like:
# Coin #33 → alias "gdp"
# Coin #35 → alias "unemp"
# Coin #45 → alias "rates"
# Coin #44 → alias "housing"
# Quick lookup in the explorer:
# Open /cocox/data-explorer.html?coin=33
# Browse full catalog at /cocox/data-explorer.html
# Reference snippets shown in the Coin view tab
Both SQL and R run natively on-chain. Use whichever fits your analysis.
-- Last 10 GDP rows
SELECT * FROM coin[33] ORDER BY date DESC LIMIT 10
-- Unemployment over 5%
SELECT date, value FROM coin[35] WHERE value > 5
-- GDP + Unemployment joined
SELECT a.date, a.value AS gdp, b.value AS unemp FROM coin[33] a JOIN coin[35] b ON a.date = b.date
-- Average 10-year yield
SELECT AVG(value) FROM coin[45]
# Alias: coin #33 → "gdp"
# Linear model
lm(gdp$value ~ unemp$value)
# Summary
summary(unemp)
# T-test
t.test(gdp$value, mu = 20000)
# Correlation
cor.test(gdp$value, unemp$value)
# Benford's law
benford(rates$value)
# ARIMA forecast
arima(housing$value, c(1,1,1))
# Beta between assets
beta(rates$value, sp500$value)
Input coins: #33 (GDP) + #35 (UNRATE). Linear model with 317 quarterly observations.
# In Coin IDE: set coins #33→"gdp", #35→"unemp"
model <- lm(gdp$value ~ unemp$value, data = merge(gdp, unemp, by = "date"))
summary(model)
# Output: R², coefficients, p-values, stored as Analysis Coin step result
Input coin: #45 (DGS10, 16,083 daily observations). Benford's law test on first digits.
# In Coin IDE: set coin #45→"rates"
result <- benford(rates$value)
# Output: χ², p-value. p < 0.05 suggests abnormal digit distribution.
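For readers unfamiliar with the test, here is a rough Python sketch of the first-digit computation behind a Benford check. It is not the pallet's Rust implementation: it only counts leading digits, builds the expected counts from log10(1 + 1/d), and returns the χ² statistic, leaving out the p-value. Skipping zeros is an assumption made for this sketch.

```python
import math
from decimal import Decimal


def first_digit(value) -> int:
    """Leading non-zero digit of a finite number (0 only for zero itself)."""
    # Decimal drops leading zeros, so digits[0] is the leading digit.
    return Decimal(str(abs(value))).as_tuple().digits[0]


def benford_chi2(values):
    """Return (chi2, observed counts, expected counts) for digits 1..9."""
    observed = [0] * 9
    for value in values:
        digit = first_digit(value)
        if digit:                       # assumption: zeros are skipped
            observed[digit - 1] += 1
    total = sum(observed)
    expected = [total * math.log10(1 + 1 / d) for d in range(1, 10)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected)) if total else 0.0
    return chi2, observed, expected


chi2, observed, expected = benford_chi2([1, 19, 0.02, 300, 0])
```

The expected proportions sum to one because the terms log10((d + 1) / d) telescope to log10(10).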
Input coin: #44 (HOUST, 808 monthly observations). ARIMA(1,1,1) time series forecast.
# In Coin IDE: set coin #44→"housing"
fit <- arima(housing$value, order = c(1,1,1))
# Output: coefficients, AIC, BIC, fitted values.
Create a Derived Data Coin that auto-updates. No R needed — pure SQL that re-evaluates every 600 blocks.
-- In the coin creator, set mode = Derived with this SQL:
SELECT a.date, a.value / b.value AS real_gdp FROM coin[33] a JOIN coin[36] b ON a.date = b.date
-- This creates a new coin that updates automatically
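In effect, the Derived query is an inner join on date followed by a division. The Python sketch below illustrates that semantics with plain floats and made-up rows; the function name `derive_ratio` is hypothetical, and skipping zero denominators is an assumption, since the document does not say how the SQL engine handles division by zero.

```python
def derive_ratio(numerator_rows, denominator_rows):
    """Inner-join two (date, value) series on date and divide the values."""
    denominators = dict(denominator_rows)
    return [
        (date, value / denominators[date])
        for date, value in numerator_rows
        # Dates missing from either side drop out, as in an inner join.
        if date in denominators and denominators[date] != 0
    ]


# Made-up rows for illustration only, not real GDP or CPI figures.
gdp = [(1, 200.0), (2, 300.0), (3, 400.0)]
cpi = [(1, 100.0), (3, 200.0)]
real_gdp = derive_ratio(gdp, cpi)
```

Because a quarterly series and a monthly series only share some dates, the derived coin ends up with at most as many rows as the sparser input.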
| Coin | Series | Resolve Name | R Alias Example |
|---|---|---|---|
| 33 | Gross Domestic Product | "GDP" or "fred/GDP" | gdp <- coin["fred/GDP"] |
| 35 | Unemployment Rate | "UNRATE" | unemp <- coin["UNRATE"] |
| 44 | Housing Starts | "HOUST" | housing <- coin["HOUST"] |
| 45 | 10-Year Treasury Rate | "DGS10" | rates <- coin["DGS10"] |
| 36 | CPI All Urban Consumers | "CPIAUCSL" | cpi <- coin["CPIAUCSL"] |
| 40 | Federal Funds Rate | "FEDFUNDS" | fedfunds <- coin["FEDFUNDS"] |
| 49 | EU27 GDP | "eurostat:gdp" | eu_gdp <- coin["eurostat:gdp"] |
Analysis Coins are executable R research contracts. Each coin contains an R script, input Data Coin references, and a directed acyclic graph of computation steps that execute against on-chain data.
1. MINT: User submits R script + parsed steps → AnalysisCoins(id) storage
2. SOLVING: OCW picks up steps with dependencies met → dispatches execution
3. REMINT: Each step result advances coin state; the coin persists and its state updates
4. SOLVED: All steps complete → coin.state = Solved
5. VERIFIED: Challenge window passes (2 blocks for testnet)
| Type | Executes | Output |
|---|---|---|
| DataFetch(query) | SQL query against Data Coins | Matrix: [rowCount:u32, colCount:u32, values...] |
| NativeFunction(fn) | Pure Rust statistical function | Typed binary (see function catalog for byte layouts) |
The step graph must be a valid DAG: no cycles, no self-references, dependencies precede dependents. Rejected at mint time if invalid. Steps must have ascending step_id with dependencies pointing to lower IDs.
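These rules are simple enough to check in a single pass. The Python sketch below shows one way to do it; the `(step_id, dependencies)` tuple shape and the error strings are assumptions for illustration, not the pallet's actual types.

```python
def validate_steps(steps):
    """Check a list of (step_id, [dependency ids]); return an error or None."""
    seen = set()
    previous = None
    for step_id, deps in steps:
        if previous is not None and step_id <= previous:
            return f"step {step_id}: ids must be strictly ascending"
        for dep in deps:
            if dep == step_id:
                return f"step {step_id}: self-reference"
            if dep not in seen:
                return f"step {step_id}: dependency {dep} is not an earlier step"
        seen.add(step_id)
        previous = step_id
    return None


print(validate_steps([(0, []), (1, [0]), (2, [0, 1])]))   # valid graph
print(validate_steps([(0, [1]), (1, [])]))                # forward reference
```

Requiring every dependency to be an earlier step rules out cycles by construction, so no separate cycle search is needed.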
| Input | Meaning |
|---|---|
| DataCoinColumn(id, col) | Reads a specific column from a Data Coin |
| PriorStep(id) | Consumes the output of a previous step (dependency) |
| Constant(val) | Inline constant value |
| # | Method | Access | Purpose |
|---|---|---|---|
| 8 | mint_analysis_coin | Signed | Create coin with R script + steps + data refs |
| 9 | submit_step | Signed | Submit result for a single step (by worker or test script) |
Native Rust SQL parser. No external libraries. Deterministic at a given block height — same query at same block always produces the same result. Columnar storage access: reads Columns(coin_id, col_idx, chunk_idx) storage maps directly.
SELECT col1, col2, AGG(col3)
FROM coin[N] [AS alias]
[JOIN coin[M] ON a.col_i = b.col_j]
[WHERE col OP value [AND|OR ...]]
[GROUP BY col]
[ORDER BY col ASC|DESC]
[LIMIT n]
SUM(col) AVG(col) COUNT(col) COUNT(*) MAX(col) MIN(col)
= == != <> < > <= >= LIKE
All statistical functions are pure Rust — zero external R dependency. Compiled into the pallet binary. Deterministic: same input bytes always produce the same output bytes. 20 functions across 6 categories.
| Function | Method | Output Bytes |
|---|---|---|
| LinearModel | OLS via Gaussian elimination | [k:u32, n:u32, R²:f64, coefs:f64×k, SE:f64×k, t:f64×k, p:f64×k, F:f64, σ²:f64, residuals:f64×n] |
| Glm | IRLS with logit link (25 iterations) | [k:u32, n:u32, deviance:f64, coefs:f64×k] |
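A client reading a LinearModel step result has to walk this variable-length layout field by field. The Python sketch below does that with the standard `struct` module. It assumes little-endian encoding with no padding between fields, which this document does not state, and `decode_linear_model` is a hypothetical helper name. The sample bytes are synthetic.

```python
import struct


def decode_linear_model(buf: bytes) -> dict:
    """Decode [k, n, R², coefs, SE, t, p, F, σ², residuals] from raw bytes."""
    k, n = struct.unpack_from("<II", buf, 0)   # assumed little-endian u32 pair
    offset = 8

    def take(count: int) -> list:
        """Read `count` consecutive f64 values and advance the cursor."""
        nonlocal offset
        values = struct.unpack_from(f"<{count}d", buf, offset)
        offset += 8 * count
        return list(values)

    (r_squared,) = take(1)
    coefs, std_errs, t_stats, p_values = take(k), take(k), take(k), take(k)
    f_stat, sigma2 = take(2)
    residuals = take(n)
    return {
        "k": k, "n": n, "r_squared": r_squared, "coefs": coefs,
        "std_errs": std_errs, "t_stats": t_stats, "p_values": p_values,
        "f_stat": f_stat, "sigma2": sigma2, "residuals": residuals,
    }


# Synthetic buffer with k=2 coefficients and n=3 residuals.
sample = struct.pack(
    "<II14d", 2, 3,
    0.5, 1.0, 2.0, 0.25, 0.5, 4.0, 4.0, 0.0625, 0.03125, 9.0, 0.125,
    0.5, -0.25, -0.25,
)
model = decode_linear_model(sample)
```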
| Function | Method | Output Bytes |
|---|---|---|
| TTest | Welch one-sample | [mean:f64, t:f64, df:f64, p:f64, SE:f64, n:u32] |
| Correlation | Pearson | [r:f64, t:f64, df:f64, p:f64, n:u32] |
| ChiSq | Pearson's chi-squared | [χ²:f64, df:f64, p:f64, n:u32] |
| Benford | First-digit law test | [χ²:f64, p:f64, total:u32, obs:u32×9, exp:f64×9, n:u32] |
| Function | Method | Output Bytes |
|---|---|---|
| Arima | Conditional sum-of-squares | [p:u32, d:u32, q:u32, mean:f64, σ²:f64, logLik:f64, AIC:f64, BIC:f64, n:u32, ar:f64×p, ma:f64×q] |
| Garch | GARCH(1,1) via MLE gradient descent | [p:u32, q:u32, n:u32, ω:f64, α:f64, β:f64, AIC:f64, BIC:f64, fitted_vol:f64×n] |
| Function | Method | Output Bytes |
|---|---|---|
| SharpeRatio | mean / std × √n | [sharpe:f64, mean:f64, std:f64, n:u32] |
| VaRHistorical | Sort + percentile | [var95:f64, var99:f64, n:u32] |
| VaRParametric | mean − z × σ | [var95:f64, var99:f64, mean:f64, std:f64, n:u32] |
| MaxDrawdown | Peak-to-trough scan | [max_dd:f64, n:u32] |
| Beta | cov(a,b) / var(b) | [beta:f64, alpha:f64, n:u32] |
| SortinoRatio | mean / downside_std × √n | [sortino:f64, down_std:f64, n:u32] |
| CalmarRatio | annual_return / max_dd | [calmar:f64, max_dd:f64, n:u32] |
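Two of the formulas in this table are easy to reproduce for a sanity check. The Python sketch below follows the table's expressions for SharpeRatio and MaxDrawdown. It makes two assumptions that the table leaves open: the standard deviation is the sample one (n − 1 denominator), and the drawdown input is a series of positive price or equity levels rather than returns.

```python
import math
import statistics


def sharpe_ratio(returns):
    """mean / std × √n, using the sample standard deviation (assumption)."""
    mean = statistics.fmean(returns)
    std = statistics.stdev(returns)
    return mean / std * math.sqrt(len(returns))


def max_drawdown(levels):
    """Largest peak-to-trough decline as a fraction of the running peak."""
    peak = levels[0]
    worst = 0.0
    for level in levels:
        peak = max(peak, level)                  # running maximum so far
        worst = max(worst, (peak - level) / peak)
    return worst


sharpe = sharpe_ratio([0.01, 0.03])
drawdown = max_drawdown([100, 120, 90, 110, 80])   # peak 120, trough 80
```

The drawdown scan needs only one pass because the worst decline always ends at some point measured against the highest level seen before it.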
| Function | Method | Output Bytes |
|---|---|---|
| Pca | Power iteration | [d:u32, n:u32, totalVar:f64, ev:f64×d, cumVar:f64×d, vectors:f64×d×d] |
| Johansen | Canonical correlation | [k:u32, n:u32, eigenvalues:f64×k, trace:f64×k, eigen:f64×k] |
| VarModel | OLS per equation | [k:u32, lag:u32, n:u32, coefs:f64×k×k×lag] |
| Function | Behavior |
|---|---|
| Summary | Passes through prior step result bytes unchanged |
| Predict | Linear predictor: ŷ = Xβ from model coefficients + new data. Output: [n:u32, predictions:f64×n] |
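The Predict row reduces to a dot product per new observation. The Python sketch below computes ŷ = Xβ and packs the result in the stated `[n:u32, predictions:f64×n]` order. It assumes the first coefficient is the intercept and that values are little-endian, neither of which this document confirms; the function names are hypothetical.

```python
import struct


def predict(coefs, rows):
    """ŷ for each row, treating coefs[0] as the intercept (assumption)."""
    intercept, *slopes = coefs
    return [intercept + sum(b * x for b, x in zip(slopes, row)) for row in rows]


def encode_predictions(predictions) -> bytes:
    """Pack [n:u32, predictions:f64×n], assuming little-endian, no padding."""
    return struct.pack(f"<I{len(predictions)}d", len(predictions), *predictions)


y_hat = predict([1.0, 2.0], [[0.0], [1.5]])   # intercept 1, slope 2
payload = encode_predictions(y_hat)
```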
9 standalone Node.js scripts that fetch data and mint it as Data Coins on the chain. All run independently — no server integration needed.
Offchain workers run on each validator node. They dispatch Analysis Coin steps every block.
dispatch_analysis_steps(block_number):
For each AnalysisCoin in Pending or Solving:
For each step with dependencies met:
if DataFetch(query):
→ parse "coin[N]" from SQL
→ read Columns(N, col, chunk) across all chunks
→ zip columns into rows → encode as typed binary
→ submit_step(signed) → StepResults storage
if NativeFunction(fn):
→ read input data from StepResults or Data Coins
→ execute pure Rust function via native_r::execute_native()
→ submit_step(signed) → StepResults storage
If all steps now solved:
→ coin.state = Solved
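The dispatch loop above can be sketched in Python to show how step readiness and coin state interact. This is a simplified model, not the pallet code: the coin is a plain dict, `execute` stands in for both the SQL and native-function paths, and signing and on-chain submission are left out.

```python
def ready_steps(steps, results):
    """Unsolved step ids whose dependencies all have stored results."""
    return [
        step_id
        for step_id, deps in steps.items()
        if step_id not in results and all(dep in results for dep in deps)
    ]


def dispatch(coin, execute):
    """One per-block pass: run every step that is ready at the start of the pass."""
    if coin["state"] not in ("Pending", "Solving"):
        return
    for step_id in ready_steps(coin["steps"], coin["results"]):
        inputs = [coin["results"][dep] for dep in coin["steps"][step_id]]
        coin["results"][step_id] = execute(step_id, inputs)
        coin["state"] = "Solving"
    if len(coin["results"]) == len(coin["steps"]):
        coin["state"] = "Solved"


coin = {"state": "Pending", "steps": {0: [], 1: [0], 2: [0, 1]}, "results": {}}
states = []
for _block in range(3):
    # Toy executor: a step's result is its id plus the sum of its inputs.
    dispatch(coin, lambda step_id, inputs: step_id + sum(inputs))
    states.append(coin["state"])
```

In this model a step that becomes ready during a pass waits for the next block, so a chain of three dependent steps takes three passes to reach Solved.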
| Method | Endpoint | Purpose |
|---|---|---|
| GET | /wallet/:address | All coins for an address (data + analysis + subtokens) |
| GET | /coins/list | All Data Coin IDs + names + row counts |
| GET | /coins/:id | Coin metadata: schema, mode, source, tags, category, rows, creator |
| GET | /coins/:id/rows?from=N&to=M | Paginated rows. Returns typed JSON values per cell. |
| POST | /coins/:id/query | Run SQL query. Body: {"query":"SELECT ... FROM coin[N] ..."} |
| POST | /parse-r | Parse R script → step DAG. Body: {"script":"...","refs":[...]} |
| GET | /series-search?source=X&q=Y | Search FRED or BLS series catalog |
| Method | Endpoint | Purpose |
|---|---|---|
| GET | /stats | Chain stats: subtoken/collision/worker counts |
| GET | /subtokens | Active surveillance subtokens |
| GET | /sec/filings | Search SEC 13F, Form 4, SC 13G filings |
| GET | /sec/health | SEC scraper status: running, rate limiting, last poll |
| GET | /sec/institutions | List of 59 tracked institutional investors |
| GET | /contracts/inspect/:chain/:address | Token/contract profile (ETH/SOL/TRON) |
| GET | /entanglement/graph/:address | Wallet relationship graph |
# List all Data Coins
curl https://cocoio.cc/api/cocox-v2/coins/list
# → {"success":true,"data":{"coins":[{"coinId":0,"name":"x","rows":10}],"total":3}}
# Get coin detail
curl https://cocoio.cc/api/cocox-v2/coins/0
# → {"success":true,"data":{"coinId":0,"schema":[{"name":"x","dataType":"Float(0)"}],"rows":10}}
# Get paginated rows
curl "https://cocoio.cc/api/cocox-v2/coins/0/rows?from=0&to=5"
# Run SQL query
curl -X POST https://cocoio.cc/api/cocox-v2/coins/0/query \
-H "Content-Type: application/json" \
-d '{"query":"SELECT x, y FROM coin[0] WHERE x > 5 ORDER BY y DESC"}'
# View wallet
curl https://cocoio.cc/api/cocox-v2/wallet/5GrwvaEF5zXb...
# Parse R script
curl -X POST https://cocoio.cc/api/cocox-v2/parse-r \
-H "Content-Type: application/json" \
-d '{"script":"model <- lm(y ~ x, data = d)","refs":[{"coinId":0,"alias":"d"}]}'
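The same SQL query call can be made from Python using only the standard library. The sketch below mirrors the curl example: the base URL, path, and body shape are copied from the commands above, while the function names are illustrative. Building the request is separated from sending it so the request can be inspected offline; the shape of the query response is not documented here, so `run_query` simply returns the parsed JSON.

```python
import json
import urllib.request

BASE = "https://cocoio.cc/api/cocox-v2"


def build_query_request(coin_id: int, sql: str) -> urllib.request.Request:
    """Build the POST /coins/:id/query request without sending it."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE}/coins/{coin_id}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(coin_id: int, sql: str):
    """Send the query and return the decoded JSON response."""
    request = build_query_request(coin_id, sql)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


request = build_query_request(0, "SELECT x, y FROM coin[0] WHERE x > 5 ORDER BY y DESC")
```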
This tutorial walks through the full workflow: get data → explore it → create an analysis coin → submit steps → read and interpret results.
cd coco-substrate && cargo build --release
./target/release/coco-node --dev --base-path=/tmp/coco-dev \
--rpc-port=9933 --unsafe-rpc-external --pool-type single-state \
--enable-offchain-indexing true --no-prometheus
export FRED_API_KEY=your_key
cd scripts
node fred-scraper.js preview # View data, no writes
node fred-scraper.js update # Mint Data Coins on chain
Go to Data Explorer, enter Coin ID 0. Browse rows, run SQL, see chart.
curl https://cocoio.cc/api/cocox-v2/coins/0
curl "https://cocoio.cc/api/cocox-v2/coins/0/rows?from=0&to=20"
Go to Coin IDE → Analysis Coin tab. Write:
model <- lm(y ~ x, data = mydata)
summary(model)
Add input coin (ID 0, alias "mydata"). Click "Parse & Preview Steps" to see the DAG. Click "Mint".
Submit step results via the test script or Coin IDE. Then check your wallet:
curl https://cocoio.cc/api/cocox-v2/wallet/5GrwvaEF5zXb...
# → { "analysisCoins": [{ "coinId": 0, "state": "Solved", "steps": 2, "solved": 2 }] }
The LM result bytes encode: coefficients, SE, t-stats, p-values, R², F-statistic, and residuals. See the function catalog above for exact byte layouts.
| Metric | Meaning | Good Value |
|---|---|---|
| R² | Variance explained (0–1) | > 0.7 |
| Coefficients | Effect size per predictor | Context-dependent |
| SE | Standard error of estimate | Smaller = better |
| t-statistic | Signal/noise ratio (coefficient / SE) | \|t\| > 2 |
| p-value | Probability of a result at least this extreme if the null hypothesis is true | < 0.05 |
| F-statistic | Overall model significance | > 4 |
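As a worked example of reading these metrics, the t-statistic of each coefficient is the coefficient divided by its standard error, and the rule of thumb in the table flags values beyond 2 in absolute terms. The Python sketch below uses made-up numbers; the function names and the default threshold argument are illustrative.

```python
def t_statistics(coefs, std_errs):
    """t = coefficient / standard error, one value per predictor."""
    return [coef / se for coef, se in zip(coefs, std_errs)]


def flag_significant(coefs, std_errs, threshold=2.0):
    """True where |t| exceeds the rule-of-thumb threshold."""
    return [abs(t) > threshold for t in t_statistics(coefs, std_errs)]


# Made-up coefficients and standard errors for illustration.
t_values = t_statistics([1.5, -0.2], [0.5, 0.4])
flags = flag_significant([1.5, -0.2], [0.5, 0.4])
```

Here the first predictor clears the threshold and the second does not, so only the first would usually be read as a reliable effect.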
lm()
Find the relationship between a dependent variable and predictors. Classic OLS.
# R: model <- lm(y ~ x1 + x2, data = mydata)
# Input: y column + one or more x columns
# Output: coefs, SE, t, p, R², F, residuals
# Use: Predicting GDP from unemployment + inflation
t.test()
# R: result <- t.test(returns, mu = 0)
# Input: 1 numeric column
# Output: mean, t, df, p, SE
# Use: Testing if average returns differ from zero
benford()
# R: result <- benford(amounts)
# Input: 1 numeric column
# Output: χ², p, obs/exp counts per digit
# Use: Forensic accounting. p < 0.05 flags a first-digit distribution that deviates from Benford's law and merits review
cor.test()
# R: result <- cor.test(x, y, method = "pearson")
# Input: 2 numeric columns
# Output: r, t, df, p
# Use: Checking if two assets move together
glm()
# R: model <- glm(recession ~ spread + unemployment, data = macro)
# Input: 0/1 outcome + predictors
# Output: logit coefs, deviance
# Use: Predicting binary events like recession probability
garch()
# R: fit <- garch(returns)
# Input: time series (≥30 obs)
# Output: ω, α, β, AIC, BIC, fitted vol
# Use: Modeling volatility clustering. Backbone of risk management
The deploy package at DEPLOY_PACKAGE/ contains everything needed to run a CoCo node with scrapers.
# Quick deploy
unzip DEPLOY_PACKAGE.zip -d /opt/coco
cd /opt/coco
cp systemd/*.service /etc/systemd/system/
cp .env.template .env # Fill in API keys
systemctl daemon-reload
systemctl enable coco-node coco-scraper-scheduler
systemctl start coco-node
sleep 30
systemctl start coco-scraper-scheduler
# Health check
curl -H "Content-Type: application/json" -d '{"id":1,"jsonrpc":"2.0","method":"system_health","params":[]}' http://localhost:9933
node scripts/fred-scraper.js preview
Required keys are listed in NEED_KEYS. Free API keys are available from FRED, BLS, Census, and IMF.
curl -H "Content-Type: application/json" -d '{"id":1,"jsonrpc":"2.0","method":"system_health","params":[]}' http://localhost:9933
curl /api/cocox/sec/health. If consecutive429 > 5, the scraper is backing off. It auto-resets on successful polls.
state_getRuntimeVersion. If chain spec != node binary spec, rebuild + deploy WASM upgrade.