Deterministic F4-Statistic Regression for Admixture Modeling
In this post, I’ll cover how to build a transparent, deterministic admixture modeler from scratch in R. The main advantage of this approach is automated combinatorial testing: instead of manually specifying and running each candidate ancestry model one by one in qpAdm, the script systematically tests hundreds or even thousands of source combinations in seconds and ranks them by fit quality. qpAdm is the “industry standard” and provides sophisticated covariance weighting through block jackknife procedures, but it requires manual intervention for each model specification and can often feel like a “black box.” This script offers a transparent, complementary approach that uses the same mathematical foundation (f4-statistic regression with quadratic programming constraints) to explore the model space rapidly. For example, with 10 potential sources, testing all 2-way, 3-way, and 4-way combinations means 45 + 120 + 210 = 375 separate qpAdm runs, versus a single automated run with this script. ...
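
To make the combinatorics concrete, here is a minimal base-R sketch of how the candidate models could be enumerated. The source labels (`Source1` ... `Source10`) and the variable names are placeholders I made up for illustration, and this is not necessarily how the full script builds its model list. It only shows that the 2-way, 3-way, and 4-way combinations of 10 sources add up to 375 models.

```r
# Hypothetical source labels; substitute your own population names.
sources <- paste0("Source", 1:10)

# Model sizes to test: 2-way, 3-way, and 4-way mixtures.
model_sizes <- 2:4

# Enumerate every combination of sources for each model size.
# combn(..., simplify = FALSE) returns a list of character vectors,
# and unlist(recursive = FALSE) flattens the per-size lists into one list.
models <- unlist(
  lapply(model_sizes, function(k) combn(sources, k, simplify = FALSE)),
  recursive = FALSE
)

# Number of models per size, computed directly as binomial coefficients.
counts <- vapply(
  model_sizes,
  function(k) choose(length(sources), k),
  numeric(1)
)
names(counts) <- paste0(model_sizes, "-way")

print(counts)          # 45, 120, 210
print(length(models))  # 375
```

Each element of `models` is a character vector of source names, so a fitting function can be applied to the whole list in a single loop or `lapply` call, with the results sorted by whatever fit metric you choose.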