PLINK PCA Tutorial: Running PCA in PLINK (Commands + Output)

In this post, I’ll demonstrate how to perform a PCA on a PLINK dataset. Before we begin, we need to prepare the subset of samples we want to analyze by extracting their entries from the .fam file. First, though, we need to identify the samples of interest, for example those from a specific population such as Sardinians. The easiest way is to open the corresponding .ind file and look at the population column, which is the third column in each row. Open the file in a text editor and search for the population name, in this case Sardinian. ...
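The manual lookup described above can also be scripted. The following is a minimal sketch, not the post's own method: it builds two tiny hypothetical files (`demo.ind` and `demo.fam`, with made-up sample IDs) and uses `awk` to pick the rows whose third column is `Sardinian`, then writes the matching family and individual IDs to a list. It assumes the .ind file has three whitespace-separated columns (sample ID, sex, population) and that the sample ID appears as the second column of the .fam file.

```shell
#!/bin/sh
# Hypothetical miniature .ind file (sample ID, sex, population); demo data only.
workdir=$(mktemp -d)
cat > "$workdir/demo.ind" <<'EOF'
S1 M Sardinian
S2 F French
S3 F Sardinian
EOF

# Hypothetical matching .fam file (FID, IID, father, mother, sex, phenotype).
cat > "$workdir/demo.fam" <<'EOF'
1 S1 0 0 1 1
2 S2 0 0 2 1
3 S3 0 0 2 1
EOF

# Step 1: collect the sample IDs whose third column equals the population label.
awk '$3 == "Sardinian" { print $1 }' "$workdir/demo.ind" > "$workdir/ids.txt"

# Step 2: look those IDs up in the .fam file and write "FID IID" pairs.
# NR == FNR is true only while awk reads the first file (ids.txt).
awk 'NR == FNR { want[$1] = 1; next } ($2 in want) { print $1, $2 }' \
    "$workdir/ids.txt" "$workdir/demo.fam" > "$workdir/keep.txt"

cat "$workdir/keep.txt"
```

A two-column list like `keep.txt` is the shape PLINK 1.9 expects for its `--keep` option, so a later call along the lines of `plink --bfile dataset --keep keep.txt --pca --out sardinian_pca` could restrict the PCA to those samples; the `dataset` and `sardinian_pca` prefixes here are placeholders, not names from the post.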

July 29, 2025

Converting EIGENSTRAT to PACKEDPED

The files downloaded in the previous blog post are in EIGENSTRAT format. In this post, we’ll look at how to convert them to PACKEDPED format, which allows for easier downstream processing using the PLINK toolset. With PLINK, it becomes straightforward to extract sample subsets, filter SNPs, and perform a wide range of analyses.

Downloading PLINK

I use PLINK 1.9. While there is a newer version (2.0), I prefer 1.9 because it includes several features that were deprecated or removed in the newer release. ...
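For the EIGENSTRAT-to-PACKEDPED conversion itself, one common route is the `convertf` program from the EIGENSOFT package. The excerpt does not say which tool the post uses, so treat the following as a sketch under that assumption. It writes a parameter file and only runs `convertf` if the program and the input files are actually present; the `dataset.*` file names are placeholders.

```shell
#!/bin/sh
# Write a convertf parameter file. All dataset.* names are placeholders.
pardir=$(mktemp -d)
cat > "$pardir/par.PACKEDPED" <<'EOF'
genotypename:    dataset.geno
snpname:         dataset.snp
indivname:       dataset.ind
outputformat:    PACKEDPED
genotypeoutname: dataset.bed
snpoutname:      dataset.bim
indivoutname:    dataset.fam
EOF

# Run the conversion only when convertf is installed and the input exists,
# so the sketch is harmless on a machine without EIGENSOFT.
if command -v convertf >/dev/null 2>&1 && [ -f dataset.geno ]; then
    convertf -p "$pardir/par.PACKEDPED"
else
    echo "convertf or dataset.geno not found; parameter file written to $pardir/par.PACKEDPED"
fi
```

The three output files (.bed, .bim, .fam) together form the binary fileset that PLINK reads with `--bfile dataset`.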

July 29, 2025

Download Ancient & Modern DNA (AADR Tutorial)

A Linux environment is unavoidable when it comes to bioinformatics data processing and preparation. You can use your favorite distribution. For Windows users, the Windows Subsystem for Linux (WSL) provides a good alternative to dual booting or setting up a full virtual machine.

Installing WSL with Debian

Open PowerShell as Administrator and run:

    wsl --install -d Debian

Once installed, update the system:

    sudo apt update && sudo apt upgrade -y

Downloading A Genetic Dataset

Before doing PCA, ADMIXTURE, qpAdm, etc., you need actual genotype data. A good and comprehensive resource is the Allen Ancient DNA Resource (AADR). ...

July 29, 2025