IMPORTANT CONSIDERATIONS
Before submitting your order for NGS services to ACGT, please consider the following issues:
Library Construction
Library preparation takes 1 to 4 days, depending on the application. The input material is fragmented DNA or RNA. Libraries typically consist of carefully size-selected fragments, ranging from 25 to 600 bp, to which specific adapters are ligated, followed by PCR amplification. Once a library has been constructed and validated, it is fixed to the flow cell by hybridizing the fragments to a lawn of oligos complementary to the adapter sequences. Bridge amplification is then performed to create millions of dense clusters.
Run Length
The clusters are sequenced by incorporation of fluorescent nucleotides (one base per cycle). The TruSeq sequencing process takes between 3 hours and 6 days, depending on read length and the platform used. Typically, paired end runs of 75 to 150 cycles per read are sufficient for most applications, although de novo sequencing and metagenomic analysis may require longer reads (up to 300 cycles), while ChIP-Seq and miRNA-Seq use shorter reads.
Paired Ends
Sequencing from both ends of the adapter-ligated fragments is typically performed to maximize data output. After the first read is complete, the paired end protocol essentially flips the fragments attached to the flow cell and sequences them from the other direction. In addition to generating more data, paired end sequencing yields more positional information, which facilitates the evaluation of alternate splice junctions, indels, etc. The standard paired end protocol can evaluate sequences 100 to 600 bases apart. Clients who intend to evaluate longer distances (3 to 15 kb) between sequences can request the Mate Pair protocol to generate their libraries and should contact ACGT concerning the requirements for this service. Note: Paired end samples cannot be run on the same flow cell as single read samples.
Coverage
Because reads generated with the Illumina® technology are relatively short (50 to 300 bp), multi-fold sequence coverage is recommended to ensure efficient assembly of the contigs. For re-sequencing of genomes and transcriptomes, and for ChIP analysis, we recommend 10- to 100-fold coverage, while de novo sequencing requires 100- to 300-fold coverage. Polyploid genomes, such as those of many plants, require even higher coverage, as does variant analysis in tumor samples (up to 500-fold). The level of coverage, together with the expected sequence size and the number of sequencing cycles per run, determines the amount of flow cell space and the number of runs required for the experiment (see Table 1).
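The relationship between coverage, read length, and sequence size described above can be sketched with the common approximation coverage = (number of reads × read length) / genome size. The Python sketch below is illustrative only: the function name and the example values are assumptions, not ACGT specifications, and a real project should leave extra margin for duplicate reads, low-quality reads, and uneven coverage.

```python
def reads_required(genome_size_bp, target_coverage, read_length_bp, paired_end=True):
    """Estimate the reads (or read pairs) needed for a target average coverage.

    Rearranges coverage = reads * read_length / genome_size into
    reads = coverage * genome_size / read_length, rounded up.
    """
    total_bases = genome_size_bp * target_coverage
    # Integer ceiling division avoids floating-point rounding issues.
    reads = -(-total_bases // read_length_bp)
    if paired_end:
        # Each fragment contributes two reads, so halve (rounding up) to get pairs.
        return -(-reads // 2)
    return reads


# Hypothetical example: 5 Mb bacterial genome, 50X coverage, 150-cycle paired end reads.
pairs = reads_required(5_000_000, 50, 150)
print(f"Read pairs needed: {pairs:,}")
```

This estimate gives a lower bound on sequencing depth; the sample loading actually recommended for each platform is the one listed in Table 1.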
Multiplexing
Clients who wish to run more than one sample per flow cell or lane can take advantage of the Illumina® indexing system. A six- to eight-base index sequence is added to the adapter and sequenced in an additional short read. The number of available indices varies by kit. Currently, up to 384 indices are available from various commercial sources. Please note that the number of readable clusters per sample drops in proportion to the number of indices used per lane. Also note that index sequencing still requires a separate library to be prepared for each sample.
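The proportional drop in clusters per sample can be illustrated with a short calculation. This is a minimal sketch that assumes clusters are split evenly among the indexed samples (in practice pooling is never perfectly balanced); the function names and the lane output of 300 million clusters are hypothetical, and only the limit of 384 indices comes from the text above.

```python
def reads_per_sample(clusters_per_lane, n_samples):
    """Clusters available to each sample if the lane is shared evenly."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return clusters_per_lane // n_samples


def max_samples_per_lane(clusters_per_lane, reads_needed_per_sample, max_indices=384):
    """Largest number of samples that still receive the required reads each,
    limited by the number of available indices."""
    return min(clusters_per_lane // reads_needed_per_sample, max_indices)


# Hypothetical lane output; substitute the figure for the actual platform and kit.
lane_clusters = 300_000_000
print(reads_per_sample(lane_clusters, 30))                # clusters per sample with 30 indices
print(max_samples_per_lane(lane_clusters, 10_000_000))    # samples needing 10M reads each
```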
Project Timeline
ACGT will make its best effort to process samples in the order received. To use the instruments as efficiently as possible and to minimize the cost to the user, ACGT will attempt to accumulate a full set of samples (single or multiplexed) to load a flow cell before initiating a run. Clients who do not wish to wait until a full set of samples has accumulated can expedite their order by agreeing to pay the cost of the whole run or to share this cost with another researcher. Please contact us if you would like to have your order expedited or to get a cost estimate for your project. NOTE: For certain applications, ACGT does not keep the necessary kit(s) onsite, and it may take one week or more to receive the kit from Illumina® or other providers.
Table 1. Sample loading (number of samples per run) on ACGT Illumina® platforms, based on expected average coverage
Test Article | Average Coverage | MiSeq v3 kits | NextSeq500 High Output Run | HiSeq4000, 1 Lane |
BACs and other large episomes | 200X + | 384 | 384 | 384 |
DNA/RNA virus (>50% viral genome in sample) | 100X + | 384 | 384 | 384 |
Bacterial (5 Mb) / Fungal (20 Mb) WGS re-sequencing | 50X | 45 – 11 | 384 – 96 | 384 – 72 |
Bacterial (5 Mb) / Fungal (20 Mb) WGS de novo | 200X | 11 – 3 | 96 – 30 | 60 – 22 |
Insect genome (< 0.5 Gb) re-sequencing* | 30X | 1 | 6 | 4.5 |
Insect genome (< 0.5 Gb) de novo* | 100X | 0.25 | 2 | 1 |
Human Genome re-sequencing*,** | 30X | 0.15 | 1.3 | 0.92 |
Human or Mouse Expanded Exome (est. 50Mb) | 80X | 3 | 30 | 22 |
Large genome (plant or vertebrate) de novo* | 100X | NA | 0.4 | 0.3 |
ChIP-Seq, 10M single reads | Varies | 2 | 40 | 30 |
Bacterial RNA-Seq, 5M PE reads | Varies | 4 | 80 | 60 |
Fungal RNA-Seq, 10M PE reads | Varies | 2 | 40 | 30 |
Transcriptome Analysis (De novo), 60M PE reads | Varies | 0.35 | 6 | 4.5 |
Transcriptome Analysis (Re-seq), 30M PE reads | Varies | 0.7 | 13 | 9 |
small RNA Analysis, 5M single reads | Varies | 4 | 80 | 60 |
Amplicon sequencing, 10M PE reads | Varies | 2 | 40 | 30 |
* Same coverage and loading considerations apply to Whole Genome Methylation Sequencing of these genomes
** Same coverage and loading considerations apply to most vertebrate genomes
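The coverage-based rows of Table 1 can be approximated by dividing the usable yield of a run by the bases needed per sample (genome size × coverage). The sketch below makes two assumptions that the table does not state explicitly: the ceiling of 384 samples is taken to reflect the maximum number of indices, and the run yield of 12 Gb is a hypothetical placeholder rather than a specification of any platform listed.

```python
def samples_per_run(run_yield_bp, genome_size_bp, target_coverage, max_indices=384):
    """Approximate number of samples that fit in one run at a target coverage.

    Returns a fraction below 1 when a single sample needs more than one run.
    """
    bases_per_sample = genome_size_bp * target_coverage
    return min(run_yield_bp / bases_per_sample, max_indices)


# Hypothetical run yield in bases; replace with the real figure for the kit used.
run_yield = 12_000_000_000
print(samples_per_run(run_yield, 5_000_000, 50))       # 5 Mb genome at 50X
print(samples_per_run(run_yield, 50_000, 100))         # small genome, limited by indices
print(samples_per_run(run_yield, 3_000_000_000, 30))   # large genome, fraction of a sample
```

A result below 1 means that several runs (the reciprocal of the value, rounded up) would be needed for one sample.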