Stack
The non-default product choices in the system. Commodity primitives (S3, EventBridge, IAM) omitted by design — the picks that explain a design decision are the picks worth listing.
Agentic layer
- LangGraph
Multi-agent orchestration with
Send()fan-out semantics and custom dict-keyed reducers. Six sector teams + macro economist + CIO + LLM-as-judge run as graph nodes with typed state. Vanilla LangChain doesn't compose for parallel multi-team coordination with state merging. - Anthropic Claude — tiered
Haiku for per-ticker quant, qual, and peer-review (~12 calls per Saturday run); Sonnet only for synthesis (macro economist, CIO batch evaluation, nuanced judge cases). Sonnet is ~5× Haiku — reserving it for synthesis keeps per-run cost stable.
- LangSmith
Auto-tracing on every production LLM call. The trajectory validator polls LangSmith for graph-correctness invariants (required nodes present, sector_team_node appears exactly 6 times) post-
graph.invoke(). Free tier covers a personal-scale workload. - Pydantic
Typed agent outputs (
with_structured_output(..., include_raw=True)with strict-mode parse-error contract across every LLM-output site).
Data & retrieval
- ArcticDB (over parquet on S3)
Feature store for ~50 features × ~900 tickers × 10y. Migration was driven by a real performance bottleneck: per-feature parquet scans on S3 were slow on the read side and OOMed c5.large training instances. ArcticDB's symbol-keyed range queries pull only the slice each consumer needs.
- Neon — serverless Postgres + pgvector
RAG corpus for SEC filings, 8-Ks, earnings transcripts, and rolling investment theses. HNSW indexing for fast vector search; serverless tier means no idle compute cost when the weekly research run isn't firing.
- Voyage `voyage-3-lite`
512-dimensional embeddings tuned for financial-domain text. Cheaper than OpenAI's
text-embedding-3-*tier with better fit on the SEC-filing corpus the qual agents retrieve over. - Polygon.io
Primary market-data source for adjusted OHLCV and intraday data; yfinance retained as cross-validation only, not as a silent fallback (silent-fallback patterns have been a repeat offender for masking real failures and were retired across the data layer).
- FRED + FMP
Macro indicators and supplemental fundamentals.
- VectorBT
Historical portfolio simulation in the backtester.
ML tools
- Python 3.12 / 3.13
Primary language across all eight repos. Lambda runs 3.12; local development on 3.13.
- LightGBM
Layer-1 gradient-boosted models in the predictor meta-ensemble (momentum + volatility).
- pandas + numpy
Feature engineering, signal scoring, metric computation.
Compute & orchestration
- AWS Step Functions (over Airflow / Dagster / Prefect)
Three pipelines (Saturday weekly research, weekday morning, EOD post-close) orchestrated as state machines. Native to the AWS account that already runs Lambda + EC2 + S3, so no new infrastructure to operate. Execution history is itself the audit trail.
- AWS Lambda
Stateless agent calls, predictor inference, LLM-as-judge, rationale clustering, replay-concordance, replay-counterfactual. Container-image deploys via ECR for the Python 3.12 research Lambda.
- AWS EC2
t3.smallfor the executor daemon (stateful, holds the IB Gateway connection),t3.microfor the dashboard,c5.largespot for batch (predictor training + backtester). Right-sized per workload — whenpredictor_data_prepOOMed on c5.large, the response was a pandas refactor (~1.1 GB → ~91 MB resident), not a bump to xlarge. - CloudFormation
Infrastructure-as-Code for Step Functions, Lambda functions, IAM roles, EventBridge rules. Drift detector compares CloudFormation stamps against live AWS state weekly.
Surfaces
- Astro (this site)
Static-site generator for the apex marketing surface — zero JS shipped by default, fast TTFB, build-time SEO. The harness identity benefits from a marketing surface that loads as instantly as the instrument it describes.
- Streamlit
Live data console at
live.nousergon.aiand the private interview-demo console atconsole.nousergon.ai. Pragmatic pick for read-only monitoring: a custom React dashboard would be weeks of work for the same outcome. - IBC (Interactive Brokers Gateway)
Paper-account broker connection on the executor host. Hard runtime check refuses to connect to non-paper accounts (account ID must start with "D") — defense against accidental live-account wiring.