Recently Published
Scraping Surveys
This R file practices writing a function that reads in a website URL for the American Gut Project, filters by certain accession ID numbers, and then filters for a specific response to plot as a pie chart.
Scrape Sequencing Tutorial
This R Lab was a tutorial for us to practice downloading accession numbers as one NCBI file to reduce the time it would take to retrieve sequences for genes of interest by hand.
For this tutorial, I searched for MT CO1 gene sequences for Aedes aegypti and received 28 hits. I then used read.GenBank to retrieve the sequences.
Cleaning Textual Data
The purpose of this R lab was to read one survey response into R from the American Gut Project and reformat the .txt file into a more human-friendly output by cleaning the file.
Visualizing Glossary
This R Lab aimed to take 9 terms/functions we have been working with and make a glossary to see visualizations of them in use.
datetime
character
numeric
boolean
array
vector
data frame
list
tibble
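As a rough illustration of the nine terms above, the sketch below builds one toy object per term in base R and reports its class. All values are made up for the example, and the tibble step assumes the tibble package may or may not be installed, so it is skipped when the package is missing.

```r
# Toy values, one per glossary term (all values here are made up).
examples <- list(
  datetime   = as.POSIXct("2016-07-04 13:30:00", tz = "UTC"),
  character  = "Aedes aegypti",
  numeric    = 3.14,
  boolean    = TRUE,                        # R calls this type "logical"
  array      = array(1:24, dim = c(2, 3, 4)),
  vector     = c(1.5, 2.5, 3.5),
  data_frame = data.frame(id = 1:2, label = c("a", "b")),
  list       = list(1, "two", FALSE)
)

# Report the first class of each object.
classes <- vapply(examples, function(x) class(x)[1], character(1))
print(classes)

# A tibble needs the tibble package, so only build one if it is installed.
if (requireNamespace("tibble", quietly = TRUE)) {
  tb <- tibble::tibble(id = 1:2, label = c("a", "b"))
  print(class(tb))
}
```

Note that a plain numeric vector reports the class "numeric", since in R a vector is the basic container rather than a separate class.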
Defining Factors
The purpose of this R Lab was to reformat data into a more viewer-friendly format.
I changed the format of the months, stored as numbers (1-12), and of the time in hours, stored as integers (1-23), converting the hours to an h:m (00:00) format.
I also ensured the data would be ordered chronologically rather than alphabetically, and confirmed this by checking the factor levels and creating a visualization of bike use reported per month.
Data was from: https://www.kaggle.com/code/samratp/bike-share-analysis/output?select=NYC-2016-Summary.csv
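A minimal sketch of the factor idea in this lab, using a small made-up data frame rather than the Kaggle file. The column names (month, hour) and the choice to show months as month names are assumptions for illustration, not details taken from the lab itself.

```r
# Toy data standing in for the bike share summary (values are made up).
rides <- data.frame(
  month = c(3, 1, 12, 1, 7, 3),
  hour  = c(8, 17, 0, 23, 9, 8)
)

# Month numbers -> month names, with levels fixed in calendar order.
rides$month_name <- factor(month.name[rides$month], levels = month.name)

# Hours -> zero-padded "HH:00" labels, with levels in clock order.
hour_labels <- sprintf("%02d:00", 0:23)
rides$hour_label <- factor(sprintf("%02d:00", rides$hour), levels = hour_labels)

# Levels confirm chronological order; alphabetical order would start with "April".
print(levels(rides$month_name))

# Counts per month come out in calendar order because of the factor levels.
month_counts <- table(rides$month_name)
print(month_counts[month_counts > 0])
```

Passing month_counts to barplot() would give a per-month chart whose bars follow the same calendar order.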
Practicing PCA: R Lab
To practice principal component analysis (PCA), I followed a tutorial covering the analysis, data transformation, and visualization.
Tutorial: https://bioinfo4all.wordpress.com/2021/01/31/tutorial-6-how-to-do-principal-component-analysis-pca-in-r/
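For readers who want to see the general shape of a PCA in R, here is a small sketch using base R's prcomp() on the built-in iris data set. Both the function and the data set are assumptions chosen for illustration; the linked tutorial may use different data and functions.

```r
# Built-in iris data: four numeric measurements plus a species label.
measurements <- iris[, 1:4]

# Center and scale each variable so none dominates because of its units.
pca <- prcomp(measurements, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Scores (samples projected onto the first two components), ready for plotting.
scores <- data.frame(pca$x[, 1:2], Species = iris$Species)
head(scores)
```

Plotting scores$PC1 against scores$PC2, colored by Species, is the usual first visualization.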
Data Transformation
This R Lab used a data set on transcriptomics in yeast. We were tasked with filtering the data for a treatment condition, merging expression data with condition and label data frames, pivoting the data, and creating a tibble of mean and median counts. The data were visualized in a violin plot, with the mean and median count points colored pink and sky blue, respectively.
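The pivot, merge, filter, and summarize steps can be sketched roughly as follows. This version uses base R data frames and made-up counts instead of the yeast data and a tibble, and every name in it (expr, conditions, s1 to s4, "treated") is hypothetical.

```r
# Toy wide-format counts (made-up values): one row per gene, one column per sample.
expr <- data.frame(
  gene = c("geneA", "geneB"),
  s1 = c(10, 200), s2 = c(14, 180), s3 = c(30, 90), s4 = c(34, 110)
)
# Hypothetical sample annotation.
conditions <- data.frame(
  sample    = c("s1", "s2", "s3", "s4"),
  condition = c("control", "control", "treated", "treated")
)

# Pivot to long format: one row per gene-sample pair.
samples <- c("s1", "s2", "s3", "s4")
long <- reshape(expr, direction = "long", idvar = "gene",
                varying = samples, v.names = "count",
                timevar = "sample", times = samples)

# Merge counts with the condition labels, then keep one treatment condition.
merged  <- merge(long, conditions, by = "sample")
treated <- subset(merged, condition == "treated")

# Mean and median count per gene within the treated samples.
summary_stats <- data.frame(
  gene         = sort(unique(treated$gene)),
  mean_count   = as.numeric(tapply(treated$count, treated$gene, mean)),
  median_count = as.numeric(tapply(treated$count, treated$gene, median))
)
print(summary_stats)
```

The long-format table is the shape a violin plot needs, and the summary table supplies the mean and median points to overlay on it.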
Encode Expression Data Visualization Lab
This week's project was to read in a .tsv file with gene counts for 16,000+ identified genes and use exploratory data analysis to create 3 plots that give biologically relevant information about gene expression in the brain of a 90-year-old woman in the RUSH Alzheimer's study.
Three initial plots were created following a tutorial to get oriented with the data set, then three unique plots were created to further visualize differences in gene expression.
Data is from: https://www.encodeproject.org/experiments/ENCSR562BUN/
Plot Shimadzu MALDI-TOF spectra
MALDI-TOF mass spec data were analyzed in R to create box plot and column plot visualizations and to identify overlapping data points.