This is the first post in a series in which I process an ancient DNA (aDNA) sample from raw FASTQ sequences to EIGENSTRAT format for use with ADMIXTOOLS. The series is a complete walkthrough, including the actual aDNA-specific BWA parameter settings that are rarely documented elsewhere.
System Requirements

All commands are written for Debian/Ubuntu-based Linux systems.
CPU: At least 8 cores recommended. BWA-MEM scales efficiently up to 12–16 threads; beyond this, you will likely encounter diminishing returns due to memory bandwidth saturation.

RAM: 16 GB is enough for moderate-coverage samples. 20–32 GB can help with deeply sequenced datasets, mainly by keeping sorting and downstream processing in memory.

Storage & Network: A fast internet connection is recommended. This tutorial uses a small sample of approximately 3.5 GB of compressed FASTQs, but ancient DNA samples vary widely in sequencing depth and can easily reach or exceed 10 GB of compressed FASTQs per individual.

If you're limited by hardware or bandwidth, consider using a cloud computing instance (Google Cloud, AWS, or DigitalOcean). You can SSH into the instance and run the pipeline there. I use this approach myself, since my internet connection is slow.
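If you want to check a machine against these recommendations before starting, a quick pre-flight script along these lines may help. This is only a sketch: the `check_min` helper is a name I made up for illustration, the core and RAM thresholds mirror the numbers above, and the 50 GB free-disk threshold is my own assumption rather than a hard requirement. It relies on `nproc`, `/proc/meminfo`, and `df`, which are available on standard Debian/Ubuntu installs.

```shell
#!/usr/bin/env bash
# Pre-flight check for the recommendations above (illustrative sketch only).

MIN_CORES=8      # from the CPU recommendation
MIN_RAM_GB=16    # from the RAM recommendation
MIN_DISK_GB=50   # ASSUMPTION: not stated in the text; adjust to your dataset

# Hypothetical helper: prints OK or WARN depending on whether
# the actual value meets the recommended minimum.
check_min() {
    local label=$1 actual=$2 minimum=$3
    if [ "$actual" -ge "$minimum" ]; then
        echo "OK: $label = $actual (recommended >= $minimum)"
    else
        echo "WARN: $label = $actual (recommended >= $minimum)"
    fi
}

# Number of processing units available to this shell.
cores=$(nproc)

# Total RAM in GiB, rounded to the nearest integer (MemTotal is in kB).
ram_gb=$(awk '/^MemTotal:/ {printf "%.0f", $2 / 1024 / 1024}' /proc/meminfo)

# Free space in GiB on the filesystem holding the current directory.
disk_gb=$(df -Pk . | awk 'NR==2 {printf "%.0f", $4 / 1024 / 1024}')

check_min "CPU cores" "${cores:-0}" "$MIN_CORES"
check_min "RAM (GiB)" "${ram_gb:-0}" "$MIN_RAM_GB"
check_min "Free disk (GiB)" "${disk_gb:-0}" "$MIN_DISK_GB"
```

Run it from the directory where you plan to store the FASTQs, so that the disk check looks at the right filesystem. A WARN line is not fatal; it just means the pipeline may run slower or need more care with temporary files.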
...