Deciding WTF To Do (Nov. 2014)

This was a stressful time in my graduate school career. I felt torn by indecision about which wet lab method was best for getting the data I wanted, given that I had very little research funding. When I wrote a few grant proposals in Spring 2014, I had chosen the ezRAD method, pooling 20 individuals from a site per library with one unique barcode per library. As the name suggests, this method is technically straightforward because it uses standard Illumina TruSeq preparation kits, with the additional benefits of eliminating PCR-induced bias and not requiring sonication. I had enough money for one kit of 24 barcodes and 1 lane of Illumina HiSeq sequencing. Perfect! But in October, I met with a couple of faculty members (one doing plant phylogeography using high-throughput sequencing, the other a bioinformatics/population genomics guy) and they strongly advised against the pooling idea. While there was some support in the literature (Gautier 2013, Molecular Ecology) for the ability to get accurate allele frequencies from pooled data, numerous other papers (such as Anderson 2014, Molecular Ecology) cast doubt. By pooling individuals, I would limit the information I could glean from my sequencing data (i.e., observed heterozygosity or any form of haplotype analysis) and would also be making an a priori assumption that the sites I collected from were indeed separate populations.

Alright, so pooling was out. But if I could barely afford 24 barcodes, how could I possibly afford enough unique barcodes to sequence 96 individuals on a lane?? Fortunately, a curator at the Field Museum offered to share her lab’s Genotyping-by-Sequencing (Elshire 2011) adaptors and barcodes for free. The only problem was that they only had 48 barcodes, and my goal was to sequence at least 96 individuals per lane (each lane costs $1100-$1800). This act of scientific kindness led me down another path of obsessive pros and cons lists. Numerous grad students, postdocs, and professors (some I had never met and only stalked on the internet) kindly put up with my frantic emails as I tried to figure out wtf to do. Long story short, I decided to accept the 48 GBS barcodes and use a combinatorial index approach, as in Double Digest RADSeq. Excessive pros/cons lists attached.
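To see why combinatorial indexing solves the barcode shortage, here is a minimal sketch. It assumes each individual is tagged with one of the 48 shared barcodes plus one of two second indices; the barcode and index labels and the sample names are hypothetical placeholders, not the actual sequences from the shared kit.

```python
from itertools import product

# Hypothetical labels: 48 shared GBS barcodes and 2 second indices.
barcodes = [f"BC{i:02d}" for i in range(1, 49)]
indices = ["IDX_A", "IDX_B"]

# Every (barcode, index) pair can identify one individual.
combos = list(product(barcodes, indices))
print(len(combos))  # 96

# Assign 96 individuals (hypothetical names) to unique pairs.
samples = [f"ind_{n:03d}" for n in range(1, 97)]
assignment = dict(zip(samples, combos))
```

With only two second indices, 48 barcodes already cover a full lane of 96 individuals; each additional index would add another 48 unique combinations.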

Cost of Pooling with Different Methods

Method:
  1. Make 40 libraries, 2 for each population, each containing 20 individuals. Sequence on 1 lane. Then resequence individuals from a subset (e.g., 8 pops) on another lane to get more loci and better coverage. These can be pops that were not sequenced well previously or pops of interest.
    1. Use ApeKI and GBS
    2. Use ezRAD and REs of choice
    3. ddRAD
  2. Make 40 libraries, 2 for each population, each containing 20 individuals. Sequence on 1 lane. Resequence 8 pools that were poorly sequenced or of greater interest, OR sequence 20 on each lane.
    1. Use ApeKI and GBS.
    2. Use ezRAD and REs of choice
    3. ddRAD
GBS (vs ezRAD)
Pros
Cons
$2000 cheaper
uneven sequencing of loci: may need to throw out loci. 2nd run def required
Advice on protocol
Not sensitive to methylation
blocked by some CpG methylation
PCR bias
~65,000 fragments
PCR cleanup vs Ampure beads

1a) 40, then 96 GBS

| Item | Estimated Cost |
|---|---|
| Sequencing | $2200 (if joined with another group, otherwise $3600) |
| Adapters | ~$200 for Y |
| RE | ~$168 |
| Ligase | $80-$200 |
| PCR purification kit | $220-$524 |
| Bioanalyzer | $1088 |
| Total | $3956-$5780 |

1b) 40, 96 ezRAD

| Item | Estimated Cost |
|---|---|
| Sequencing | $3600 |
| Adapters | $2880 |
| RE | $236 |
| PCR purification kit | $0 |
| Bioanalyzer | $1088 |
| Total | $7104 |

1c) and d) 40(48) to 40+96 ddRAD

| Item | Estimated Cost |
|---|---|
| Sequencing | $3600 |
| Adapters | $4350 |
| RE | $300 |
| Ligase | $80-$201 |
| Ampure beads | $945 |
| Pippin Prep | $40-$120 |
| Bioanalyzer | $320-$1088 |
| Total | $9636-$10,604 |

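A quick way to sanity-check the pooling totals is to add up the line items. The sketch below uses the figures from the three pooling tables, with each cost written as a (low, high) pair; the dictionary layout and the function name `total_range` are just for illustration. Run this way, the GBS range matches the listed total, but the ezRAD items sum to $7804 rather than the $7104 listed, and the ddRAD low end comes to $9635 rather than $9636, so one entry in each of those tables presumably differs from what was actually added up.

```python
# Line items from the pooling tables, as (low, high) dollar amounts.
# Single-valued costs are written as (x, x).
pooling_costs = {
    "GBS": {
        "Sequencing": (2200, 3600),
        "Adapters": (200, 200),
        "RE": (168, 168),
        "Ligase": (80, 200),
        "PCR purification kit": (220, 524),
        "Bioanalyzer": (1088, 1088),
    },
    "ezRAD": {
        "Sequencing": (3600, 3600),
        "Adapters": (2880, 2880),
        "RE": (236, 236),
        "PCR purification kit": (0, 0),
        "Bioanalyzer": (1088, 1088),
    },
    "ddRAD": {
        "Sequencing": (3600, 3600),
        "Adapters": (4350, 4350),
        "RE": (300, 300),
        "Ligase": (80, 201),
        "Ampure beads": (945, 945),
        "Pippin Prep": (40, 120),
        "Bioanalyzer": (320, 1088),
    },
}


def total_range(items):
    """Sum the low ends and the high ends of every line item."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high


for method, items in pooling_costs.items():
    print(method, total_range(items))
```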
Not Pooling:
 
Method:
  1. Use GBS with ApeKI.
    1. 96 libraries of individuals using GBS with ApeKI on one lane. 5 from each of 19 populations or 6 from 16 pops. Look at sequencing, then do another lane with 96 individuals.
    2. 48 on two lanes (so as not to fiddle with adaptors).  ~7 from 13 populations.
  2. 96 libraries of individuals on one lane using ezRAD. 5 from 19 populations or 6 from 16.
    1. Adjust sequencing of additional individuals/pops
  3. 96 on one lane using ddRAD. Resequence additional on 2nd lane.
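The per-population numbers in the options above come from splitting 96 library slots evenly across populations. A small sketch of that arithmetic (the function name `per_population` is made up for illustration):

```python
def per_population(slots, n_pops):
    """Return (individuals per population, unused slots) for an even split."""
    each, leftover = divmod(slots, n_pops)
    return each, leftover


for n_pops in (19, 16, 13):
    each, leftover = per_population(96, n_pops)
    print(f"{n_pops} populations: {each} individuals each, {leftover} slots left over")
```

Sixteen populations fill the 96 slots exactly at 6 individuals each, while 19 populations at 5 each leave one slot unused and 13 populations at 7 each leave five.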
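
1a) 96 then 96 GBS

| Item | Estimated Cost |
|---|---|
| Sequencing | $1800-$3600 |
| Adapters | $200 |
| RE | $0-$150 |
| Ligase | $80-$201 |
| PCR Cleanup | $330-$524 |
| Bioanalyzer | $768-$1536 |
| Total | $3178 ($33 per individual, 96 individuals) to $6211 ($32 per individual, 192 individuals) |
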
1b) 96 then 96 GBS (real)

| Item | Estimated Cost |
|---|---|
| Sequencing | $3600 |
| Primers | $200 |
| RE | $256 |
| Ligase | $256 |
| PCR Cleanup | $279-$389 |
| Bioanalyzer | $16 |
| NEB Taq 2X Master Mix | $56 |
| Total | $4663-$4773 (~$24.50 per individual, 192 individuals) |

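2) 96 then 96 ezRAD

| Item | Estimated Cost |
|---|---|
| Sequencing | $1800-$3600 |
| Adapters | $2880 |
| RE | $150-$300 |
| Bioanalyzer | $768-$1536 (12 ($96) to 24 ($192)) |
| qPCR | $72-$144 |
| Total | $5000 ($52 per individual, 96 individuals) to $7116 ($37 per individual, 192 individuals) |
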
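3) 96 then 96 ddRAD

| Item | Estimated Cost |
|---|---|
| Sequencing | $1800-$3600 |
| Adapters | $4350 |
| RE | $300 |
| Ligase | $80-$201 |
| Ampure beads | $945 |
| Pippin Prep | $80-$160 |
| Bioanalyzer | $768-$1536 |
| Total | $8323 ($86 per individual, 96 individuals) to $11,092 ($57 per individual, 192 individuals) |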
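The per-individual figures next to each total are just the total divided by the number of individuals sequenced (96 for one lane, 192 for two). A minimal sketch using the stated totals from the non-pooled estimate tables; the dictionary keys and the function name `cost_per_individual` are illustrative only:

```python
# Stated totals from the non-pooled estimates:
# (total for 96 individuals, total for 192 individuals), in dollars.
totals = {
    "GBS": (3178, 6211),
    "ezRAD": (5000, 7116),
    "ddRAD": (8323, 11092),
}


def cost_per_individual(total, n_individuals):
    """Average cost of one sequenced individual."""
    return total / n_individuals


for method, (one_lane, two_lanes) in totals.items():
    low_n = cost_per_individual(one_lane, 96)
    high_n = cost_per_individual(two_lanes, 192)
    print(f"{method}: ${low_n:.2f} each for 96, ${high_n:.2f} each for 192")
```

In every case the second lane lowers the per-individual cost, because the fixed costs (adapters especially) are spread over twice as many samples.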