Shaun Wilkinson

Recently Published

The ‘insect’ R package and bioinformatic pipeline.
The insect R package is a pipeline for analysing next-generation sequencing (NGS) amplicon libraries using informatic sequence classification trees. The pipeline takes a machine-learning approach: it uses a set of training sequences from GenBank and other databases to ‘learn’ a classification tree, which is then used to assign taxonomic IDs to query sequences generated on an NGS platform such as Illumina MiSeq. The package also includes a suite of functions for FASTQ/FASTA sequence parsing, de-multiplexing, paired-end read stitching, primer trimming, quality filtering, de-replication, and re-replication, among other operations.
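To make the de-replication and re-replication steps concrete, here is a minimal sketch of the general idea in Python. It is not the insect package's code or API: the function names `dereplicate` and `rereplicate`, the toy reads, and the placeholder labels are all hypothetical. The sketch assumes that de-replication collapses identical reads to a unique set while remembering which unique sequence each read maps to, and that re-replication expands per-unique results back to one entry per original read.

```python
def dereplicate(seqs):
    """Collapse identical sequences (hypothetical helper, not insect's API).

    Returns the unique sequences in order of first appearance, plus a list
    of pointers mapping each original read to its unique sequence.
    """
    index = {}      # sequence -> position in `unique`
    unique = []
    pointers = []
    for seq in seqs:
        if seq not in index:
            index[seq] = len(unique)
            unique.append(seq)
        pointers.append(index[seq])
    return unique, pointers


def rereplicate(values, pointers):
    """Expand per-unique values back to one value per original read."""
    return [values[i] for i in pointers]


# Toy reads standing in for quality-filtered amplicon sequences.
reads = ["ACGT", "TTGA", "ACGT", "ACGT", "TTGA", "GGCC"]
unique, pointers = dereplicate(reads)

# Placeholder labels; a real pipeline would classify each unique sequence once.
labels = ["taxon_%d" % i for i in range(len(unique))]
per_read_labels = rereplicate(labels, pointers)
```

The point of the round trip is that an expensive step such as classification only has to run once per unique sequence, and the results can still be reported for every read.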