The origins of the Proto-Anatolians are often treated as one of the more obscure problems in Indo-European archaeogenetics, but the genetic data may be less ambiguous than this framing suggests. Since Anatolian is widely regarded as the earliest-splitting branch of Indo-European, the relevant question is whether the earlier Eneolithic steppe-related ancestry behind Yamnaya, especially the Caucasus-Lower Volga or CLV component, also moved south of the Caucasus and into Anatolia. For this purpose, I use Progress-2 specifically as a practical proxy for the north Caucasus-facing part of this Eneolithic steppe-related ancestry.

Languages are obviously not genetics, and ancient DNA does not identify speech communities by itself. But this caveat should not become a license to ignore demographic evidence whenever it points in an inconvenient direction. A repeated, statistically supported ancestry signal along a coherent geographic route is not proof of language, but it is evidence about movement and population formation. If competing scenarios are allowed to rest on linguistic reconstruction, archaeological interpretation, and historical inference, then genetic evidence should not be excluded simply because it is probabilistic rather than deductive.

In this post, I present evidence consistent with an eastern route into Anatolia, through or around the Caucasus.


The Caucasus Route and Arrival from the East

To test whether Eneolithic steppe-related ancestry appeared in the Southern Caucasus by the Chalcolithic, I will begin with $f_4$-statistics in the following arrangement:

$$f_4(\text{Outgroup}, \text{Progress-2}; \text{Neolithic baseline}, \text{Target})$$

This test asks whether the target shares more alleles with Progress-2 than the Neolithic baseline does. A positive result would mean that the target is shifted toward Eneolithic north Caucasus steppe-related ancestry relative to that baseline.

A significantly positive result would therefore suggest that the target cannot be explained as simply lying on the Southern Caucasus Neolithic cline, but instead carries additional affinity to Eneolithic steppe-related ancestry. This provides a first test of whether ancestry related to groups north of the Caucasus had already entered the Southern Caucasus by the Chalcolithic.
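The sign convention of this test can be illustrated with a minimal sketch. It assumes that $f_4$ is estimated as the average product of allele-frequency differences across SNPs; the function name `f4` and all frequencies below are hypothetical toy values, not real data, and real tools additionally estimate a standard error with a block jackknife.

```python
def f4(freq_a, freq_b, freq_c, freq_d):
    """Mean of (a - b) * (c - d) across SNPs for f4(A, B; C, D)."""
    products = [(a - b) * (c - d)
                for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)]
    return sum(products) / len(products)

# Toy allele frequencies at four SNPs (hypothetical numbers).
outgroup  = [0.0, 0.0, 1.0, 0.0]
progress2 = [1.0, 0.8, 0.2, 0.6]
baseline  = [0.1, 0.2, 0.9, 0.1]
target    = [0.4, 0.4, 0.7, 0.3]  # shifted toward progress2 relative to baseline

value = f4(outgroup, progress2, baseline, target)
print(f"f4(Outgroup, Progress-2; Baseline, Target) = {value:.4f}")

# Swapping the baseline and target positions flips the sign.
swapped = f4(outgroup, progress2, target, baseline)
print(f"swapped = {swapped:.4f}")
```

In this toy case the target is closer to the Progress-2 frequencies than the baseline is at every SNP, so the statistic is positive, which is the direction tested in this section.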

With Areni-1 Chalcolithic in the target position

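As a first example, I place Areni-1 Chalcolithic in the target position, with Ju_hoan_North as the outgroup and Mentesh Tepe Neolithic as the Southern Caucasus Neolithic baseline:

| Target | $f_4$ | SE | Z | P |
| --- | --- | --- | --- | --- |
| Areni-1 Chalcolithic | 0.00219 | 0.000541 | 4.05 | 5.03e-5 |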

This indicates that Areni-1 Chalcolithic shares significantly more alleles with Progress-2 than Mentesh Tepe Neolithic does. So, it does not behave as a simple continuation of the Southern Caucasus Neolithic baseline, but instead shows excess affinity to Eneolithic steppe-related ancestry from north of the Caucasus.
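The Z and P values reported for this test can be approximately recomputed from the estimate and its standard error. This is a minimal sketch, assuming that P is a two-sided tail probability under a standard normal distribution; `z_and_p` is a hypothetical helper name. Because the inputs are the rounded published values, the recomputed P differs slightly from the reported one.

```python
import math

def z_and_p(estimate, standard_error):
    """Return the Z-score and a two-sided normal p-value."""
    z = estimate / standard_error
    # For a standard normal, P(|X| > |z|) equals erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Rounded values reported for Areni-1 Chalcolithic.
z, p = z_and_p(0.00219, 0.000541)
print(f"Z = {z:.2f}, two-sided P = {p:.2e}")
```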

The point here is that this ancestry signal is already visible south of the Caucasus by the Chalcolithic. This supports the first step of the eastern-route argument: before turning to Anatolia itself, we can already observe a detectable movement, or at least possible gene flow, linking Eneolithic steppe-related groups north of the Caucasus with populations on its southern side.

With Arslantepe Late Chalcolithic in the target position

As the next step, I place Arslantepe Late Chalcolithic in the target position. Arslantepe is relevant here because it lies in eastern Anatolia, to the west of the Caucasus, and because a Late Chalcolithic individual (ART038) from the site carries Y-DNA haplogroup R1b-V1636. This is the same paternal lineage found among the men buried in the kurgans at Progress-2, the Eneolithic steppe-related population used here as the northern Caucasus-facing reference. Arslantepe is therefore an obvious test case for whether the ancestry signal seen south of the Caucasus also becomes detectable in the Upper Euphrates region.

First, using Çayönü Neolithic as an eastern Anatolian Neolithic baseline:

$$f_4(\text{Ju\_hoan\_North}, \text{Progress-2}; \text{Çayönü Neolithic}, \text{Arslantepe Late Chalcolithic})$$

| Baseline | Target | $f_4$ | SE | Z | P |
| --- | --- | --- | --- | --- | --- |
| Çayönü Neolithic | Arslantepe Late Chalcolithic | 0.00160 | 0.000390 | 4.11 | 3.94e-5 |

This result is clearly positive. Arslantepe Late Chalcolithic shares significantly more alleles with the Eneolithic steppe-related source than Çayönü Neolithic does, suggesting that it cannot be modeled as a simple continuation of the local eastern Anatolian Neolithic baseline.

The same test can also be repeated against the Southern Caucasus Neolithic baseline used above:

$$f_4(\text{Ju\_hoan\_North}, \text{Progress-2}; \text{Mentesh Tepe Neolithic}, \text{Arslantepe Late Chalcolithic})$$

| Baseline | Target | $f_4$ | SE | Z | P |
| --- | --- | --- | --- | --- | --- |
| Mentesh Tepe Neolithic | Arslantepe Late Chalcolithic | 0.00132 | 0.000579 | 2.28 | 0.0223 |

Here the signal is weaker, but it still points in the same direction. Arslantepe Late Chalcolithic shows more affinity to Eneolithic steppe-related ancestry than the Southern Caucasus Neolithic baseline, although the result is only suggestive rather than strongly significant.

Taken together, these two tests are important because they move the signal from the Southern Caucasus into eastern Anatolia. Against the local eastern Anatolian Neolithic baseline, the excess is clear; against the Southern Caucasus Neolithic baseline, it is more modest but still positive. This fits the expectation of an eastern route, where steppe-related ancestry first appears south of the Caucasus and then becomes detectable in eastern Anatolia during the Late Chalcolithic, potentially already in a gradually diluted form.


Having shown excess allele sharing with Eneolithic steppe-related ancestry along a Caucasus route, I will now try to quantify that ancestry formally with qpAdm in several northern Near Eastern groups relevant to this question.

My preference is to treat qpAdm as a convex ancestry-modeling problem. In this setup, the left populations are the proposed ancestry sources, while the right populations serve as external anchors that are differentially related to those sources. I therefore prefer a well-constrained and stable right-population setup over qpAdm rotation strategies, where increasingly close or related populations are moved to the right side in search of a better fit. In my view, a carefully chosen right set should test the model rather than optimize it.
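The convex-mixture idea can be sketched for the two-source case. This is only an illustration under simplifying assumptions: it uses ordinary (unweighted) least squares on hypothetical vectors of $f_4$-statistics against the right populations, whereas qpAdm itself weights by a jackknife covariance matrix and adds a rank test. The function name `two_source_weight` and all numbers are invented for the example.

```python
def two_source_weight(target_f4, source_a_f4, source_b_f4):
    """Least-squares weight w in [0, 1] with target ~ w*A + (1 - w)*B."""
    diff = [a - b for a, b in zip(source_a_f4, source_b_f4)]
    resid = [t - b for t, b in zip(target_f4, source_b_f4)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        raise ValueError("sources are indistinguishable against this right set")
    w = sum(r * d for r, d in zip(resid, diff)) / denom
    # Convexity constraint: weights are non-negative and sum to one.
    return min(1.0, max(0.0, w))

# Hypothetical f4 vectors of each population against four right populations.
local = [0.010, -0.004, 0.006, 0.002]
steppe = [-0.002, 0.008, -0.001, 0.009]
# A target built as an exact 85/15 mixture of the two sources.
target = [0.85 * l + 0.15 * s for l, s in zip(local, steppe)]

w_local = two_source_weight(target, local, steppe)
print(f"local weight = {w_local:.3f}, steppe weight = {1.0 - w_local:.3f}")
```

The sketch also shows why the right populations must be differentially related to the sources: if both sources had identical vectors, the denominator would vanish and the weights could not be estimated.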

For the following models, I use this right-population set:

Ju_hoan_North
Iraq_PPNA
Georgia_KotiasKlde_Mesolithic
Russia_Vologda_Mesolithic
Switzerland_Epipaleolithic
Tajikistan_Mesolithic
Turkey_Epipaleolithic
Israel_Natufian
Iran_BeltCave_Mesolithic

Arslantepe Late Chalcolithic

For Arslantepe Late Chalcolithic, I model the target as a two-way mixture of Çayönü PPN and Progress-2 Eneolithic Steppe.

The model is accepted with a good fit:

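| Target | Source | Weight | SE | Z |
| --- | --- | --- | --- | --- |
| Arslantepe Late Chalcolithic | Çayönü PPN | 0.866 | 0.0255 | 34.0 |
| Arslantepe Late Chalcolithic | Progress-2 Eneolithic Steppe | 0.134 | 0.0255 | 5.25 |

| Model | $f_4$ rank | dof | chisq | P |
| --- | --- | --- | --- | --- |
| Çayönü PPN + Progress-2 Eneolithic Steppe | 1 | 7 | 6.26 | 0.510 |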

The model estimates Arslantepe Late Chalcolithic as approximately 86.6% Çayönü PPN-related and 13.4% Progress-2 Eneolithic Steppe-related, with the steppe-related component being clearly significant.
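The relationship between chisq, dof, and P in the model fit can be checked with a short sketch. It assumes that P is the upper-tail probability of a chi-square distribution with the stated degrees of freedom; `chi2_sf` is a hypothetical helper written with the standard library only, using the recurrence for the regularized upper incomplete gamma function.

```python
import math

def chi2_sf(chisq, dof):
    """Upper-tail probability of a chi-square distribution (integer dof >= 1)."""
    x = chisq / 2.0
    if x <= 0.0:
        return 1.0
    # Start from Q(1, x) = exp(-x) for even dof, Q(1/2, x) = erfc(sqrt(x)) for odd dof.
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    # Recurrence: Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1).
    while a < dof / 2.0:
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q

# Full two-way model for Arslantepe Late Chalcolithic: chisq 6.26 on 7 dof.
p_full = chi2_sf(6.26, 7)
# Model without the steppe source: chisq 33.6 on 8 dof.
p_no_steppe = chi2_sf(33.6, 8)
print(f"full model P = {p_full:.3f}, Çayönü-only P = {p_no_steppe:.2e}")
```

A large P means the model's residuals are compatible with sampling noise, which is why the two-way model counts as accepted while the reduced model does not.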

The popdrop results are also informative:

| Dropped source | dof | chisq | P | Interpretation |
| --- | --- | --- | --- | --- |
| None | 7 | 6.26 | 0.510 | Full model accepted |
| Progress-2 Eneolithic Steppe | 8 | 33.6 | 4.81e-5 | Steppe source required |
| Çayönü PPN | 8 | 672 | 7.87e-140 | Local Anatolian source required |

When Progress-2 Eneolithic Steppe is removed, the model fails, which shows that the steppe-related source is not simply decorative but necessary for the fit. At the same time, the overwhelming failure after removing Çayönü PPN confirms that most of the ancestry remains local eastern Anatolian-related.

This result matches the earlier $f_4$-statistics well. Arslantepe Late Chalcolithic carries a mostly local eastern Anatolian ancestry profile, but with a significant Eneolithic steppe-related contribution. In quantitative terms, this contribution is modest, around 13%, but it is statistically required and fits the pattern expected from gradual dilution along an eastern route into Anatolia.

Arslantepe ART038, Royal Tomb

The same model can also be applied to ART038, the R1b-V1636 individual from the Arslantepe Royal Tomb:

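| Target | Source | Weight | SE | Z |
| --- | --- | --- | --- | --- |
| ART038 | Çayönü PPN | 0.885 | 0.0460 | 19.2 |
| ART038 | Progress-2 Eneolithic Steppe | 0.115 | 0.0460 | 2.49 |

| Model | $f_4$ rank | dof | chisq | P |
| --- | --- | --- | --- | --- |
| Çayönü PPN + Progress-2 Eneolithic Steppe | 1 | 7 | 4.48 | 0.723 |

| Dropped source | dof | chisq | P | Interpretation |
| --- | --- | --- | --- | --- |
| None | 7 | 4.48 | 0.723 | Full model accepted |
| Progress-2 Eneolithic Steppe | 8 | 12.1 | 0.148 | Çayönü-only model still accepted |
| Çayönü PPN | 8 | 363 | 1.24e-73 | Local Anatolian source required |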

The model estimates ART038 as about 88.5% Çayönü PPN-related and 11.5% Progress-2 Eneolithic Steppe-related. On its own, the steppe-related component is only weakly supported, with a Z-score of 2.49, but adding this source improves the fit markedly, lowering the chisq from 12.1 in the Çayönü-only model (8 degrees of freedom) to 4.48 in the two-way model (7 degrees of freedom).

At the same time, the steppe source is not strictly required here, since the Çayönü-only model remains formally acceptable with $p = 0.148$. This should be interpreted cautiously, especially because this is a single individual rather than a population average, making the result more vulnerable to quality-related issues and individual-level noise. Still, given the improved fit, the direction of the estimate, and the paternal link to Progress-2 through R1b-V1636, including Eneolithic steppe-related ancestry in the model is reasonable, though not required in this individual case.
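The improvement in fit for ART038 can be put on a rough scale with a nested-model comparison. This is a sketch under the assumption that the difference in chisq between the Çayönü-only model (8 dof) and the two-way model (7 dof) is approximately chi-square distributed with 1 degree of freedom; `nested_p` is a hypothetical helper name, and the inputs are the rounded values reported above.

```python
import math

def nested_p(chisq_reduced, chisq_full):
    """P-value for a chisq improvement when one source is added (1 dof)."""
    delta = chisq_reduced - chisq_full
    if delta <= 0.0:
        return 1.0
    # Upper tail of a chi-square with 1 dof: erfc(sqrt(delta / 2)).
    return math.erfc(math.sqrt(delta / 2.0))

# ART038: Çayönü-only chisq 12.1 versus two-way chisq 4.48.
p_nested = nested_p(12.1, 4.48)
print(f"chisq improvement = {12.1 - 4.48:.2f}, nested P = {p_nested:.4f}")
```

A small nested P indicates that the extra source explains more than sampling noise would, even where the simpler model is not formally rejected on its own.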

Tilbeşar Höyük (Gaziantep) Bronze Age, I14649

A further relevant case is I14649 from Bronze Age Tilbeşar Höyük, an R1b-V1636 individual from the Gaziantep region. The site lies roughly 50 km west of Carchemish, the later Neo-Hittite capital. Historically, this sample predates written evidence for Anatolian speakers in the region, so it cannot be treated as linguistically identifiable in any direct sense.

What makes this individual interesting is the apparent mobility of R1b-V1636-bearing groups only a few centuries after its appearance at Late Chalcolithic Arslantepe. By the Bronze Age, the same paternal lineage is found farther southwest at Tilbeşar Höyük, suggesting that the movement did not end at Arslantepe. Instead, it may reflect a broader movement of people, or at least male-mediated ancestry, from the Anatolian Upper Euphrates region into southeastern Anatolia and the northern Levantine frontier zone, a region that later also becomes relevant for Luwian-speaking groups.

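| Target | Source | Weight | SE | Z |
| --- | --- | --- | --- | --- |
| Tilbeşar Höyük BA, I14649 | Çayönü PPN | 0.852 | 0.0534 | 16.0 |
| Tilbeşar Höyük BA, I14649 | Progress-2 Eneolithic Steppe | 0.148 | 0.0534 | 2.77 |

| Model | $f_4$ rank | dof | chisq | P |
| --- | --- | --- | --- | --- |
| Çayönü PPN + Progress-2 Eneolithic Steppe | 1 | 7 | 9.47 | 0.221 |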

This gives an estimate of roughly 85.2% Çayönü PPN-related and 14.8% Progress-2 Eneolithic Steppe-related ancestry. The model passes with $p = 0.221$, and the steppe-related component is nominally significant with $Z = 2.77$.

I do not claim that this is necessarily the best or most realistic model for this individual. The purpose here is more limited: even with a constrained right-population setup, the model passes and detects a meaningful Eneolithic steppe-related component. It is also notable that a single-source Çayönü model does not pass, making the addition of a steppe-related source difficult to dismiss in this specific test.


Oylum Höyük Middle Bronze Age

A further test can be made with the Middle Bronze Age average from Oylum Höyük. This is useful because it moves the analysis beyond single individuals and asks whether a similar signal is also visible at the population level in southeastern Anatolia.

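| Target | Source | Weight | SE | Z |
| --- | --- | --- | --- | --- |
| Oylum Höyük MBA | Çayönü PPN | 0.912 | 0.0251 | 36.3 |
| Oylum Höyük MBA | Progress-2 Eneolithic Steppe | 0.0878 | 0.0251 | 3.49 |

| Model | $f_4$ rank | dof | chisq | P |
| --- | --- | --- | --- | --- |
| Çayönü PPN + Progress-2 Eneolithic Steppe | 1 | 7 | 11.1 | 0.135 |

| Dropped source | dof | chisq | P | Interpretation |
| --- | --- | --- | --- | --- |
| None | 7 | 11.1 | 0.135 | Full model accepted |
| Progress-2 Eneolithic Steppe | 8 | 23.5 | 0.00275 | Steppe source required |
| Çayönü PPN | 8 | 815 | 1.21e-170 | Local Anatolian source required |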

The model estimates Oylum Höyük Middle Bronze Age as approximately 91.2% Çayönü PPN-related and 8.8% Progress-2 Eneolithic Steppe-related. It passes with $p = 0.135$, while the steppe-related component is statistically significant with $Z = 3.49$.

As before, I do not claim that this is necessarily the best model for Oylum Höyük MBA. There may be better alternatives using more intermediate populations closer in time and geography.

It should also be noted that Çayönü PPN is not being used here as a generic Mesopotamian Neolithic source, like Shanidar PPNB. Çayönü is closer to the broader Chalcolithic and Bronze Age Anatolian and Levantine profile, which makes it a more appropriate regional baseline for this test.


Kalehöyük, Kārum and Old Hittite Periods

The same two-way model can also be applied to Kalehöyük, first in the Kārum period and then in the Old Hittite period. This is especially relevant because the Old Hittite period, roughly 1750 to 1500 BCE, now falls within a historically Anatolian-speaking context.

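| Target | Source | Weight | SE | Z |
| --- | --- | --- | --- | --- |
| Kalehöyük Kārum Period | Çayönü PPN | 0.872 | 0.0304 | 28.6 |
| Kalehöyük Kārum Period | Progress-2 Eneolithic Steppe | 0.128 | 0.0304 | 4.20 |

| Model | $f_4$ rank | dof | chisq | P |
| --- | --- | --- | --- | --- |
| Çayönü PPN + Progress-2 Eneolithic Steppe | 1 | 7 | 7.92 | 0.340 |

| Dropped source | dof | chisq | P | Interpretation |
| --- | --- | --- | --- | --- |
| None | 7 | 7.92 | 0.340 | Full model accepted |
| Progress-2 Eneolithic Steppe | 8 | 25.7 | 0.00119 | Steppe source required |
| Çayönü PPN | 8 | 555 | 1.10e-114 | Local Anatolian source required |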

For the Kārum period average, the model estimates about 87.2% Çayönü PPN-related and 12.8% Progress-2 Eneolithic Steppe-related ancestry. The model passes with $p = 0.340$, and the steppe-related component is significant with $Z = 4.20$. Removing the steppe source causes the model to fail.

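| Target | Source | Weight | SE | Z |
| --- | --- | --- | --- | --- |
| Kalehöyük Old Hittite Period | Çayönü PPN | 0.817 | 0.0315 | 26.0 |
| Kalehöyük Old Hittite Period | Progress-2 Eneolithic Steppe | 0.183 | 0.0315 | 5.81 |

| Model | $f_4$ rank | dof | chisq | P |
| --- | --- | --- | --- | --- |
| Çayönü PPN + Progress-2 Eneolithic Steppe | 1 | 7 | 3.08 | 0.878 |

| Dropped source | dof | chisq | P | Interpretation |
| --- | --- | --- | --- | --- |
| None | 7 | 3.08 | 0.878 | Full model accepted |
| Progress-2 Eneolithic Steppe | 8 | 36.7 | 1.32e-5 | Steppe source required |
| Çayönü PPN | 8 | 524 | 5.50e-108 | Local Anatolian source required |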

For the Old Hittite period average, the estimate rises to about 18.3% Progress-2 Eneolithic Steppe-related ancestry, and the fit is very strong with $p = 0.878$. Again, the steppe-related source is required, since removing it causes the model to fail.

The Old Hittite period fit is especially noteworthy. A model using a rather eastern Anatolian Neolithic source such as Çayönü PPN might not be the first expectation for central Anatolia, yet it produces a good fit, especially given that the right-population set is fairly constrained.

Replacing Çayönü PPN with a more central Anatolian Neolithic source, Tepecik-Çiftlik, does not improve the fit. In fact, the two-way model with Tepecik-Çiftlik and Progress-2 Eneolithic Steppe fails with $p = 0.00134$, even though the estimated steppe-related ancestry remains around 12.2%. This makes the stronger Çayönü-based fit more notable.


Ovaören MA2213, Early Bronze Age II

One final central Anatolian case is Ovaören MA2213 from Early Bronze Age II. Interestingly, the full Ovaören average of three samples does not pass in this setup, but MA2213 individually does. This is also the Ovaören individual reported to share an IBD link of 15.2 cM with Vonyuchka-1 in the North Caucasus steppes, making it especially useful for testing steppe-related connections into Bronze Age Anatolia.

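| Target | Source | Weight | SE | Z |
| --- | --- | --- | --- | --- |
| Ovaören MA2213, EBA II | Çayönü PPN | 0.853 | 0.0355 | 24.0 |
| Ovaören MA2213, EBA II | Progress-2 Eneolithic Steppe | 0.147 | 0.0355 | 4.14 |

| Model | $f_4$ rank | dof | chisq | P |
| --- | --- | --- | --- | --- |
| Çayönü PPN + Progress-2 Eneolithic Steppe | 1 | 7 | 6.42 | 0.491 |

| Dropped source | dof | chisq | P | Interpretation |
| --- | --- | --- | --- | --- |
| None | 7 | 6.42 | 0.491 | Full model accepted |
| Progress-2 Eneolithic Steppe | 8 | 27.7 | 5.30e-4 | Steppe source required |
| Çayönü PPN | 8 | 486 | 6.43e-100 | Local Anatolian source required |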

The model estimates MA2213 as roughly 85.3% Çayönü PPN-related and 14.7% Progress-2 Eneolithic Steppe-related. The model passes comfortably with $p = 0.491$, and the steppe-related component is significant with $Z = 4.14$. Removing the steppe source causes the model to fail.

Summary

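| Target | Steppe-related estimate | Z | Model P-value |
| --- | --- | --- | --- |
| Arslantepe LC | 13.4% | 5.25 | 0.510 |
| ART038 | 11.5% | 2.49 | 0.723 |
| Tilbeşar BA I14649 | 14.8% | 2.77 | 0.221 |
| Oylum MBA | 8.8% | 3.49 | 0.135 |
| Kalehöyük Kārum | 12.8% | 4.20 | 0.340 |
| Kalehöyük Old Hittite | 18.3% | 5.81 | 0.878 |
| Ovaören MA2213 | 14.7% | 4.14 | 0.491 |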
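As a consistency check on the summary, the Z-scores can be recomputed as weight divided by standard error, using the rounded steppe weights and SEs reported in the individual sections. The sketch also prints an approximate 95% interval for each estimate, which assumes normally distributed errors and is not part of the qpAdm output itself.

```python
# (target, steppe weight, SE, reported Z), copied from the sections above.
rows = [
    ("Arslantepe LC", 0.134, 0.0255, 5.25),
    ("ART038", 0.115, 0.0460, 2.49),
    ("Tilbeşar BA I14649", 0.148, 0.0534, 2.77),
    ("Oylum MBA", 0.0878, 0.0251, 3.49),
    ("Kalehöyük Kārum", 0.128, 0.0304, 4.20),
    ("Kalehöyük Old Hittite", 0.183, 0.0315, 5.81),
    ("Ovaören MA2213", 0.147, 0.0355, 4.14),
]

max_gap = 0.0
for name, weight, se, z_reported in rows:
    z = weight / se
    # Approximate 95% interval under a normal assumption (weight +/- 1.96 SE).
    low, high = weight - 1.96 * se, weight + 1.96 * se
    max_gap = max(max_gap, abs(z - z_reported))
    print(f"{name:24s} Z = {z:5.2f}  interval = {low:6.1%} to {high:6.1%}")

print(f"largest gap between recomputed and reported Z: {max_gap:.3f}")
```

Small gaps are expected here because the published weights and standard errors are rounded. The intervals also make visible how much wider the uncertainty is for the single individuals than for the site averages.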

Conclusion

Taken together, the results point to a consistent pattern. Eneolithic steppe-related ancestry is first detectable south of the Caucasus, then appears in eastern Anatolia, and later remains visible in several Chalcolithic and Bronze Age Anatolian contexts. The signal is not large, and in some cases it is clearly diluted, but it is repeatedly detectable and often statistically required. Nor should the signal necessarily be expected to be large. If early Proto-Anatolian speakers first existed as a subculture within a mostly local Anatolian environment, rather than as a mass population already spreading across Anatolia, then even a modest ancestry signal could be historically meaningful. This is worth keeping in mind, especially against exaggerated maps of Luwian or Anatolian-speaking territory that project much later distributions too far back, often with an excessive focus on western Anatolia.

This does not mean that every individual carrying such ancestry was necessarily an Anatolian speaker, nor that a simple two-way qpAdm model is the final word on the ancestry of these populations. More intermediate sources and more regionally specific models may improve individual fits in some cases. But the broader direction of the evidence is difficult to ignore: the eastern route is not merely a theoretical possibility. It is supported by a trail of genetic signals moving from the Eneolithic steppe and northern Caucasus zone, through the Southern Caucasus, and into Anatolia.

By contrast, the Balkan route remains harder to reconcile with this pattern. It would require the relevant ancestry to enter Anatolia from the west or northwest, yet the clearest signals discussed here appear first in the Southern Caucasus, then in eastern Anatolia, and only later in central and southeastern Anatolia.