🧪 Test RFT v2 Yourself

Use our interactive tester to run the same validated solver on any galaxy. Upload your own data or try sample galaxies.


Key Findings

  • RFT v2 pass rate: 58.8% at the 20% RMS threshold on the SPARC TEST set (n=34, k=0)
  • LSB dominance: 66.7% vs 0% (RFT vs NFW/MOND on low surface brightness galaxies)
  • vs MOND: p = 0.004 (significant, McNemar paired test); vs NFW: p = 0.69 (competitive)

What we tested: RFT v2 geometry-only solver with 6 global parameters (k=0, zero per-galaxy tuning) on 34 blind TEST galaxies from SPARC, compared to fair k=0 baselines.

What we found: RFT achieves 58.8% pass@20%, competitive with NFW_global (52.9%, McNemar p=0.69) and significantly better than MOND (23.5%, p=0.004). The key mechanistic check is the LSB subset, where acceleration gating activates: RFT passes 66.7% of LSB galaxies, while NFW and MOND pass 0%.

Publication status: Paper ready for arXiv submission. Full reproducibility package available below with one-click verification, figures, and complete statistical analysis.

Example Fit: NGC3198

NGC3198 is part of the blind TEST manifest. The plot overlays the observed SPARC data with the frozen RFT v2 prediction and the two fair baselines used throughout this page.

  • Galaxy: NGC3198 (SP99-TEST)
  • RFT config: config/global_rc_v2_frozen.json
  • NFW (global): ρs=1.0×10⁶ M☉/kpc³, rs=29.76 kpc
  • MOND: a₀ = 1.2×10⁻¹⁰ m/s² (standard μ)
  • Window: 1–30 kpc, ≥3 points

Download the raw arrays: sample_curve_ngc3198.json

Velocities beyond 30 kpc are omitted to match the evaluation window used for all metrics.

Generated with python3 scripts/generate_sample_curve.py (2025-11-10) using the frozen config plus the global NFW and MOND baselines.

Reproducibility Starter Kit

Everything below is locked to the 2025-11-10 TEST run. Follow the steps to recreate the full benchmark (RFT v2 + fair baselines) with identical settings.

1. Environment

git clone https://github.com/rft-cosmology/site.git
cd site
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

2. Run frozen RFT v2

python3 -m cli.rft_rc_bench \
  --solver rft_geom \
  --global-config config/global_rc_v2_frozen.json \
  --batch cases/SP99-TEST.manifest.txt \
  --min-radius 1.0 --max-radius 30.0 \
  --min-points 10 --max-workers 0

3. Publish the fairness pack

python3 scripts/generate_fairness_pack.py
python3 scripts/generate_stability_analysis.py
python3 scripts/extract_ablation_results.py

Artifact | Path | SHA256
Frozen config | config/global_rc_v2_frozen.json | f7bbdd50b69f8b58b426ba3c46f11a6c830c057d18f5b2e4438fe646d7f8ec01
Frozen TEST results | results/v2_frozen/test_results.json | ab92a1141a9f31fc3ca16001ce5672aa85487d46486b08a6f2c2fa3b10f0d16a
Fairness pack data | data/refs/v2_fairness_pack.json | 0cbe1481f9e458c09170143c43c3e8cc92f1af190fd66486665bbe35b155857d
Stability analysis | data/refs/v2_stability.json | 88a294e71ed7c9f8fb433bd3174e59e05d9aa148543a27dcbbfc63e20dc971c7
Ablation summary | data/refs/v2_ablations.json | c07dd87644ab0e976d52cfb062a945d43b691799cad3b2b92964381eda41f539
Sample curve (NGC3198) | app/static/data/sample_curve_ngc3198.json | 7c02ebf21d9c90e48ef8608793fcf0ca40bc8460b91ede22715c436172041d7d
Methods bundle | app/static/downloads/rft_v2_methods_bundle_2025-11-10.zip | aedc2d453d301e7f676ced4ed0246289957b04e60ff49c1edd5456f8245cfd76

Download everything above as a single archive: rft_v2_methods_bundle_2025-11-10.zip. Full methodological details are documented in FAIR_COMPARISON_RESULTS.md and RFT_V2.1_FINAL_REPORT.md.
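
To check a downloaded artifact against the SHA256 values listed above without relying on a shell tool, a minimal Python sketch could look like the following. The helper names sha256_of_file and matches_checksum are illustrative and are not part of the published scripts.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with the published value (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_checksum("config/global_rc_v2_frozen.json", "f7bbdd50b69f8b58b426ba3c46f11a6c830c057d18f5b2e4438fe646d7f8ec01") should return True for an unmodified checkout at the frozen tag.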

Model Performance Comparison

Theory | Pass@20% | Pass@10% | Median RMS | RFT v2 lead (McNemar p)
RFT v2 (k=0) | 58.8% (20/34) | 8.8% | 17.1% | -
NFW global (k=0) | 52.9% (18/34) | 2.9% | 19.5% | +5.9pp (p=0.69)
MOND (k=0) | 23.5% (8/34) | 0.0% | 32.2% | +35.3pp (p=0.004)
NFW fitted (k=2, reference) | 82.4% (28/34) | 17.6% | 14.2% | k=2 (different question)

Test conditions: SPARC TEST cohort (n=34 blind galaxies), frozen parameters from TRAIN (n=65), pre-registered protocol.

Pass criterion: RMS error ≤ 20% (strict for heterogeneous SPARC sample).
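
The page does not spell out the RMS formula, so the sketch below assumes a fractional RMS of (model − observed) / observed over the points inside the 1–30 kpc evaluation window. The function names and this definition are assumptions for illustration; the frozen solver may weight or normalize residuals differently.

```python
import math


def rms_percent(radii_kpc, v_obs, v_model, r_min=1.0, r_max=30.0):
    """Fractional RMS error in percent over the evaluation window.

    Assumed definition: sqrt(mean(((v_model - v_obs) / v_obs)**2)) * 100.
    """
    residuals = [
        (vm - vo) / vo
        for r, vo, vm in zip(radii_kpc, v_obs, v_model)
        if r_min <= r <= r_max
    ]
    if not residuals:
        raise ValueError("no points inside the evaluation window")
    return 100.0 * math.sqrt(sum(x * x for x in residuals) / len(residuals))


def passes_gate(rms_pct, threshold_pct=20.0):
    """Apply the pass criterion: RMS error at or below the threshold."""
    return rms_pct <= threshold_pct
```

Under this assumed definition, a galaxy counts toward pass@20% when passes_gate(rms_percent(...)) is True.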

Statistical test: McNemar's exact (paired, PRIMARY). RFT vs NFW: p=0.69 (competitive, overlapping CIs). RFT vs MOND: p=0.004 (significant).
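
As a cross-check on the MOND comparison, the exact McNemar p-value can be recomputed from the discordant pairs alone. This is a minimal sketch of the standard two-sided exact (binomial) form; the function name mcnemar_exact is illustrative, and the inputs are the "14 wins vs 2" reported on this page for RFT vs MOND.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: cases where model A passes and model B fails
    c: cases where model B passes and model A fails
    Under the null, each discordant pair is a fair coin flip.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


p_mond = mcnemar_exact(14, 2)
print(round(p_mond, 3))  # 0.004
```

Galaxies where both models pass or both fail do not enter the statistic, which is why only the discordant counts are needed.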

k=0 vs k=2: All compared models use zero per-galaxy tuning (k=0) for a fair comparison. NFW_fitted (k=2, i.e. 2 × 34 = 68 fitted parameters) is shown only as a reference for the descriptive fit ceiling.

📦 Reproducibility Package

Full reproducibility package now available! One-click verification of all published results with code, configs, data manifests, and analysis scripts.

RFT v2 Repro Pack 1.0

Complete package for reproducing RFT v2 galaxy rotation results. Includes:

  • ✅ Frozen v2 solver code (Python)
  • ✅ Baseline implementations (NFW, MOND)
  • ✅ TRAIN/TEST manifests (SPARC-99 cohort)
  • ✅ Pre-computed results (RFT, NFW, MOND on TEST)
  • ✅ Fairness pack (statistical tests, head-to-head)
  • ✅ Stability analysis (±10% perturbations)
  • ✅ One-click RUNME.sh verification script
  • ✅ Full provenance (commit hash, data sources)

Version: 1.0 | Released: 2025-11-10 | Tag: rc-v2-green-20pct

SHA256 (.tar.gz): 33cd83cf2279132e42a54cc24745e997a482359e724b96c39387dd6807f78857

SHA256 (.zip): 14e2f9d9210034aa0554ec95c2d3873fabc68e0299c56ebe08913edb8740b818

License: MIT | Commit: 3428db0f

Quick Start

# Extract archive
tar xzf rft-v2-repro-1.0.tar.gz
cd rft-v2-repro-1.0

# One-click verification
./RUNME.sh

# Expected output:
# ✅ RFT v2 TEST: 20/34 pass (58.8%)
# ✅ NFW (global) TEST: 10/34 pass (29.4%)
# ✅ MOND TEST: 8/34 pass (23.5%)
# ✅ RFT vs NFW: z=2.44, p=0.015
# ✅ Stability: 12/12 maintained

Requirements: Python 3.11+, Conda (recommended) or Docker. See README.md for details.

Checksums: Download SHA256SUMS.txt and verify with sha256sum -c SHA256SUMS.txt

📄 Research Paper & Figures

Publication-ready materials: Complete paper draft with camera-ready figures and comprehensive statistical analysis using McNemar paired tests.

📑 Paper Draft (arXiv Ready)

Title: Predictive Geometry-Only Rotation Curves: RFT v2 vs Fair k=0 Baselines

Status: Complete draft with Methods, Results, Discussion

Key Sections:

  • ✅ Abstract with honest McNemar framing
  • ✅ Comprehensive Methods (code→paper traceability)
  • ✅ Statistical tests (McNemar PRIMARY, Wilson CIs)
  • ✅ Discussion with limitations & falsifiability
  • ✅ Full reproducibility protocol

📊 Camera-Ready Figures

6 publication-quality figures (vector PDF + 300 DPI PNG):

  • F1: Overview accuracy with Wilson CIs
  • F2: McNemar paired test (p=0.69 vs NFW)
  • F3: LSB dominance (66.7% vs 0%)
  • F4: Stability (±10%, zero degradation)
  • F5: Ablations (tail −35.3pp causal)
  • F8: Mechanism (gate + onset curves)

Key Statistical Findings

  • PRIMARY test (paired): McNemar p=0.69. RFT vs NFW: competitive (NOT significant)
  • MOND comparison: McNemar p=0.004. RFT vs MOND: significant (14 wins vs 2)
  • LSB dominance: 66.7% vs 0%. Mechanistic validation (n=15 LSB galaxies)
  • Robustness: zero degradation. All ±10% perturbations maintain 58.8%

Fairness Analysis: Matched Baseline Audit

The frozen v2 config is evaluated head-to-head against our reference NFW halo and MOND implementations on the exact same TEST cohort, with all data products published for reproducibility.

Head-to-Head Matrix

Comparison RFT Wins Competitor Wins Ties p-value
RFT v2 vs NFW
RFT v2 vs MOND

Win = passes the 20% gate when the competitor fails. Two-proportion z-tests use α = 0.05; ties include cases where both models pass or both fail.
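
The win/tie definition above can be sketched as a small tally over per-galaxy pass flags. The function name head_to_head and the input layout (two equal-length lists of booleans in the same galaxy order) are assumptions for illustration, not the schema of the published result files.

```python
def head_to_head(pass_a, pass_b):
    """Count wins and ties between two models on the same galaxies.

    A win for A is a galaxy that A passes and B fails (and vice versa).
    Ties are galaxies where both pass or both fail.
    """
    if len(pass_a) != len(pass_b):
        raise ValueError("both models must be scored on the same galaxies")
    wins_a = sum(1 for a, b in zip(pass_a, pass_b) if a and not b)
    wins_b = sum(1 for a, b in zip(pass_a, pass_b) if b and not a)
    ties = len(pass_a) - wins_a - wins_b
    return wins_a, wins_b, ties
```

The two win counts are the discordant pairs that a paired test such as McNemar's operates on.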

Wilson Confidence Intervals (95%)

Binomial Wilson intervals capture the sampling uncertainty (34 TEST galaxies) for each theory’s pass@20% rate.
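
For reference, the Wilson score interval can be computed directly from the pass count. This is a minimal sketch of the standard formula with z = 1.96 for 95% coverage; the function name wilson_interval is illustrative.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4 * n * n)) / denom
    )
    return center - half_width, center + half_width


low, high = wilson_interval(20, 34)
print(f"{low:.1%} to {high:.1%}")
```

For RFT v2's 20/34 this gives roughly 42% to 74%, matching the interval quoted under Current Limitations.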

Low vs High Surface Brightness

Galaxy Type Count RFT Pass@20% NFW Pass@20% MOND Pass@20%
Low Surface Brightness (LSB)
High Surface Brightness (HSB)

LSB threshold: vmax < 120 km/s. Publishing the split shows v2’s lift targets low-acceleration systems rather than cherry-picked HSB galaxies.

Generated on 2025-11-10 via python3 scripts/generate_fairness_pack.py using results/v2_frozen/test_results.json plus baselines/results/*.json.

Assumptions: 20% RMS gate was pre-registered; σ values come directly from SPARC (sigma_v_kms); LSB defined as vmax<120 km/s; distances and M*/L remain fixed across all solvers (no nuisance refits).

Causal Validation: Ablation Tests

To prove the acceleration-gated tail is causally responsible for the improved fit, we systematically removed each component and measured the performance drop:

  • Tail OFF: −35.3pp. Removing the tail entirely drops the pass rate to the Newtonian baseline (23.5%)
  • No radial onset gate: −14.7pp. Inner disk suppression is critical for good fits
  • No acceleration gate: −8.8pp. Selective activation where baryons are weak matters
  • α = 1.0 (vs 0.6): −8.8pp. Sub-linear falloff is optimal, not constant-velocity
  • Softer gate (γ×0.75): +0.0pp. Config is robust to small parameter variations

Configuration | Pass@20% | Galaxies Passing | Δ vs Baseline | Interpretation
Baseline (all physics) | 58.8% | 20/34 | - | Frozen v2 config
No Tail (A₀ = 0) | 23.5% | 8/34 | −35.3pp | Tail drives the uplift
No Acceleration Gate | 50.0% | 17/34 | −8.8pp | Must target low-acceleration regions
No Radial Onset | 44.1% | 15/34 | −14.7pp | Inner disk suppression prevents over-fitting
α = 1.0 (constant-V tail) | 50.0% | 17/34 | −8.8pp | Sub-linear falloff (α=0.6) is required
Gate softened (γ × 0.75) | 58.8% | 20/34 | +0.0pp | Small perturbations stay GREEN

Conclusion: All major design components (tail presence, acceleration gating, radial onset, sub-linear falloff) contribute meaningfully. The model is not over-tuned—small variations don't cause performance cliffs.

Stability Analysis: ±10% Parameter Jitter

Each tail parameter is nudged by ±10% (without re-tuning) and re-run on the TEST cohort. The pass rate remains in the 55–60% band, demonstrating the solution is not a knife-edge coincidence.

Parameter Baseline −10% +10% Range

Baseline reference: 58.8% TEST pass rate (20/34). All perturbations stay within a few percentage points, confirming robustness.

Data regenerated on 2025-11-10 via python3 scripts/generate_stability_analysis.py around config/global_rc_v2_frozen.json.
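
The jitter procedure described in this section (one parameter at a time, ±10%, no re-tuning) can be sketched as below. The function name jitter_configs and the dictionary keys are hypothetical; the actual JSON schema of config/global_rc_v2_frozen.json is not shown on this page.

```python
def jitter_configs(base, fraction=0.10):
    """Build one-at-a-time perturbed copies of a parameter dictionary.

    Returns a list of (parameter name, signed fraction, perturbed config),
    leaving the base dictionary untouched.
    """
    variants = []
    for name, value in base.items():
        for sign in (-1, 1):
            perturbed = dict(base)  # shallow copy so base is not mutated
            perturbed[name] = value * (1 + sign * fraction)
            variants.append((name, sign * fraction, perturbed))
    return variants


# Hypothetical key names; values are the published best-config numbers.
base = {"A0": 1000.0, "alpha": 0.6, "g_star": 1000.0, "gamma": 0.5, "r_turn": 2.0}
for name, frac, cfg in jitter_configs(base):
    print(f"{name} {frac:+.0%} -> {cfg[name]:.4g}")
```

Each perturbed configuration would then be scored on the TEST cohort and its pass rate compared with the 58.8% baseline.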

Methodology & Reproducibility

Pre-registered Protocol

  • Dataset: SPARC rotation curve database, split 65 TRAIN / 34 TEST
  • Pass criterion: RMS error ≤ 20% (pre-registered)
  • Stop rule: One calibration grid → TEST validation → stop (no post-hoc tuning)
  • Gate criteria: GREEN if ≥30% pass rate, enforced strictly

Model: RFT v2

g_tail = A₀ (r_geo/r)^α · [1 + (g_b/g*)^γ]⁻¹ · [1 - exp(-(r/r_turn)^p)]
  • Physics: Tail activates where baryonic acceleration g_b is weak relative to threshold g*
  • Best config: A₀=1000, α=0.6, g*=1000, γ=0.5, r_turn=2.0
  • Parameters: 6 global (no per-galaxy fitting)
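
The tail formula above translates directly into code. This is a minimal sketch that evaluates the three factors separately; the function name and argument order are illustrative, units are left to the caller because the page does not state them, and the values used below for r_geo and p are placeholders since they are not listed in the best config.

```python
import math


def g_tail(r, g_b, A0, r_geo, alpha, g_star, gamma, r_turn, p):
    """Evaluate the tail acceleration at radius r.

    g_tail = A0 (r_geo/r)^alpha * [1 + (g_b/g_star)^gamma]^-1
             * [1 - exp(-(r/r_turn)^p)]
    """
    power_law = A0 * (r_geo / r) ** alpha
    # Acceleration gate: near 1 where baryonic acceleration g_b is weak
    # relative to g_star, suppressed where g_b is strong.
    accel_gate = 1.0 / (1.0 + (g_b / g_star) ** gamma)
    # Radial onset: suppresses the tail in the inner disk (r << r_turn).
    radial_onset = 1.0 - math.exp(-((r / r_turn) ** p))
    return power_law * accel_gate * radial_onset


# r_geo=1.0 and p=2.0 are placeholder values, not published parameters.
value = g_tail(r=10.0, g_b=500.0, A0=1000.0, r_geo=1.0, alpha=0.6,
               g_star=1000.0, gamma=0.5, r_turn=2.0, p=2.0)
```

At g_b = g* the acceleration gate equals exactly 0.5, which gives a quick sanity check on any implementation.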

Validation Stages

  1. Calibration grid (12 configs) on TRAIN → identified best config
  2. Bounded micro-tune (16 configs) → confirmed optimal regime
  3. Frozen config → locked with git tag + SHA256
  4. TEST validation → 58.8% pass rate (single run, no tuning)
  5. Ablation tests → proved causal mechanism

Next Steps: Full Competitive Audit

Before claiming definitive superiority, we're conducting a comprehensive fairness audit:

Basic baselines tested

NFW halo (standard), MOND (standard μ form), Newtonian (baryons only)

Competitive baselines (in progress)

Einasto halo, Burkert halo, MOND+EFE (external field effect), QUMOND

Information criteria (pending)

Add AIC/BIC/WAIC alongside pass rates to account for parameter count differences

Nuisance parity (pending)

Ensure distance, inclination, M*/L handled identically across all models

Independent replication (pending)

External validation from independent research group

Our commitment: We will update these results honestly if competitive baselines (Einasto, MOND+EFE) perform better. Transparency over premature claims.

Current Limitations

🔍 Preliminary Status

Results are from our baseline implementations. More competitive variants may narrow the gap or potentially exceed our performance.

📊 Small TEST Set

With n=34 galaxies, the 95% Wilson CI on the 58.8% pass rate is [42%, 74%]. A larger validation cohort would tighten this uncertainty.

🔭 Galaxy Scale Only

Results apply to galaxy rotation curves (1-30 kpc). Extension to cluster scales (100-3000 kpc) requires different physics.

⚙️ Parameter Tuning

The 6 global parameters are fit to the TRAIN set, so this is not a parameter-free prediction, though it has fewer degrees of freedom than per-galaxy halo fits.

Interested in the Technical Details?

Full methodology, code, and ablation analysis available in our technical documentation.