Recently Published
Subsetting SNP Dataset by Sample ID
This project uses for loops in a parameterized Rmd to subset SARS-CoV-2 Illumina short reads (PRJNA656695) by sample ID and datasheet on Linux host machines. I analyzed the short reads with R (dplyr, ggplot2).
Baur module 3 assignment
Assignment for module 3 of Coursera data products course
IBM HR Analytics
The project uses descriptive statistics, visualization (scatter plots, violin plots, bar graphs), and statistical tests (hypothesis testing, logistic regression) to explore the relationship between various factors and employee attrition.
WQD7004 Group Assignment
Detecting Online Payment Fraud Using Machine Learning Models
Lecturer: Dr. Ang
Group 9
| Matric | Full Name |
|-------------|----------------|
| 23121328 | Mohammed Iqram |
| 24052516 | LI JUNMING |
| 22106713 | LI YUEXIN |
| 23108677 | ZHAO ZITONG |
| 23111676 | LIU YICONG |
SNPs in Genes ORF7b and S from States with Potential Transfer of SARS-CoV-2 from Odocoileus virginianus to Homo sapiens
This project is an exercise in exploratory data analysis of White-tailed deer Illumina short reads (ID: PRJNA984950). The SARS-CoV-2 SNP variants found within different populations of White-tailed deer across the United States were called using FastQC, Trimmomatic, BWA, and vcftools. My goal was to assess how location relates to SARS-CoV-2 variants by examining the distribution of collected samples and checking whether certain genes had more SNPs in certain states. I was also interested in whether statewide SNP trends coincided with SNP trends in North Carolina and Massachusetts, two states where White-tailed deer may be a potential reservoir for SARS-CoV-2.
Basic of R
Basic of R