mergeit is part of the EIGENSOFT package and can be used to merge exactly two EIGENSTRAT datasets without converting to PACKEDPED format first. In this post, I’ll show how to merge the sample we created in Pseudohaploid Genotyping for Ancient DNA: BAM to EIGENSTRAT with the AADR dataset.


Setting up EIGENSOFT

The easiest way to get mergeit is to install EIGENSOFT via conda:

conda install -c bioconda eigensoft

If you haven’t installed conda yet, see the Miniconda setup in the pseudohaploid genotyping post.

Alternatively, if you prefer to compile from source, see: From EIGENSTRAT to PACKEDPED.
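
Whichever route you take, it helps to confirm that mergeit is actually on your PATH before continuing. Here is a minimal sketch; check_tool is a hypothetical helper name of my own, not something shipped with EIGENSOFT:

```shell
# Report whether a command is available on PATH.
# check_tool is a hypothetical helper, not part of EIGENSOFT.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

check_tool mergeit || echo "Activate the environment where eigensoft is installed and try again."
```

If the command is not found after a conda install, the usual cause is that the environment you installed into is not the active one.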


Setting up a parameter file

Like other EIGENSOFT tools, mergeit requires a parameter file:

geno1: aadr.geno
snp1: aadr.snp
ind1: aadr.ind

geno2: eigenstrat_output.geno
snp2: eigenstrat_output.snp
ind2: eigenstrat_output.ind

genooutfilename: merged.geno
snpoutfilename: merged.snp
indoutfilename: merged.ind

Note: The first dataset (geno1, snp1, ind1) acts as the reference: only SNPs present in dataset 1 will appear in the output. Since we’re merging our sample into AADR, put AADR as dataset 1. Adjust the input prefixes as needed.
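
If you expect to merge more than once, you could also generate the parameter file from the shell instead of editing it by hand. The following is only a sketch: it assumes the three files of each dataset share a common prefix, and write_mergeit_par is a hypothetical helper name, not an EIGENSOFT tool.

```shell
# Write a mergeit parameter file from three file prefixes.
# Usage: write_mergeit_par REF_PREFIX NEW_PREFIX OUT_PREFIX [PARFILE]
# write_mergeit_par is a hypothetical helper, not part of EIGENSOFT.
write_mergeit_par() {
    ref=$1
    new=$2
    out=$3
    par=${4:-mergeit.par}
    cat > "$par" <<EOF
geno1: ${ref}.geno
snp1: ${ref}.snp
ind1: ${ref}.ind

geno2: ${new}.geno
snp2: ${new}.snp
ind2: ${new}.ind

genooutfilename: ${out}.geno
snpoutfilename: ${out}.snp
indoutfilename: ${out}.ind
EOF
}

# Reproduces the parameter file shown above, with AADR as dataset 1.
write_mergeit_par aadr eigenstrat_output merged
```

The first argument becomes dataset 1, so the reference dataset goes first, matching the note above.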

Save this as mergeit.par and run:

mergeit -p mergeit.par

This produces merged.geno, merged.snp, and merged.ind, ready for downstream analysis with ADMIXTOOLS.
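
Before moving on, a quick sanity check on the output can catch a bad merge early. The sketch below only counts lines in the .snp and .ind files (one line per SNP and one per individual), so it does not depend on how the .geno file is encoded. count_lines and check_merge are hypothetical helper names, and the file prefixes are the ones used in this post.

```shell
# Count the lines in a file (one per SNP in .snp, one per individual in .ind).
count_lines() {
    wc -l < "$1" | tr -d ' '
}

# Print input and output sizes side by side.
# Usage: check_merge REF_PREFIX NEW_PREFIX OUT_PREFIX
# Both functions are hypothetical helpers, not part of EIGENSOFT.
check_merge() {
    ref=$1
    new=$2
    out=$3
    echo "individuals: $(count_lines "$ref.ind") + $(count_lines "$new.ind") -> $(count_lines "$out.ind")"
    echo "SNPs: $(count_lines "$ref.snp") (dataset 1), $(count_lines "$new.snp") (dataset 2) -> $(count_lines "$out.snp")"
}

# Only run the comparison once the merge has produced its output.
if [ -f merged.ind ]; then
    check_merge aadr eigenstrat_output merged
fi
```

You would expect the merged .ind file to list the individuals from both inputs, and the merged SNP count to be no larger than that of dataset 1. A SNP count far below what you expect is worth investigating before running any downstream analysis.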