Recently Published
MSI prediction
Logistic Regression and Random Forest models for MSI status prediction, trained on the combined training datasets.
MSI RF ver2
Random Forest model for MSI status prediction, trained on the TCGA-COAD cancer samples.
MSI RF ver1
Random Forest model for MSI status prediction, trained on the combined training datasets.
Random Forest Model Testing
Benchmarking HanniganGD_2018 data
OmicsMLRepo-cMD
cMD metadata overview
PathML AnVIL Implementation Test
Pilot: Implementing PathML's stain normalization step in AnVIL/Terra
Mouse Input on RAVmodel_Human
Apply RAVmodel_Human_C2 on mice T-cell data
microbiomeABX metadata
Metadata analysis for microbiomeABX R21
SIAMCAT Raymond Vincent
Test SIAMCAT modeling on Raymond and Vincent datasets
biobakeR
Run bioBakery wmx workflow in Terra using the RunTerraWorkflow and biobakeR packages
Simulate random RAVs
Compare random RAVs and PCSSs
RAV184 and RAV312
Manual curation of two RAVs
HGNChelper failed cases
Examples of not-corrected inputs for HGNChelper
CRC Fig.4C with 18 datasets
Repeat CRC paper's Figure 4C with all 18 datasets
CRC Fig.4C with 10 datasets
Repeat CRC paper's Figure 4C with 10 validation datasets
bioBakeryR
bioBakeryR vignettes ver2
bioBakeryR_SE
Organize bioBakeryR outputs in SummarizedExperiment
multiDatasets
PCAmodel comparison: Default model
1399 Studies
PCAmodel comparison: more studies
10PCs
PCAmodel comparison: top 10 PCs
clNum4
PCAmodel comparison: fewer clusters
GSEA on gene subsets
raw counts from 483 TCGA-COAD tumor samples --> log2(TPM+1) --> subset to 13,934 genes --> rowNorm --> PCA --> geneList from PC2 --> GSEA
GSEA on gene subsets
raw counts from 483 TCGA-COAD tumor samples --> log2(TPM+1) --> subset to 13,934 genes --> PCA --> geneList from PC2 --> GSEA
GSEA on gene subsets
raw counts from 483 TCGA-COAD tumor samples --> log2(TPM+1) --> subset to 13,934 genes --> rowSums --> GSEA
Test normalization on GSEA
raw counts from 483 TCGA-COAD tumor samples --> log2(TPM+1) --> rowNorm --> PCA --> build a gene list from PC2 --> GSEA
Test normalization on GSEA
raw counts from 483 TCGA-COAD tumor samples --> log2(TPM+1) --> PCA --> build a gene list from PC3 --> GSEA
Test normalization on GSEA
raw counts from 483 TCGA-COAD tumor samples --> log2(TPM+1) --> PCA --> build a gene list from PC1 --> GSEA
Test normalization on GSEA
raw counts from 483 TCGA-COAD tumor samples --> log2(TPM+1) --> PCA --> build a gene list from PC2 --> GSEA
Test normalization on GSEA
raw counts from 483 TCGA-COAD tumor samples --> log2(TPM+1) --> build a gene list based on rowSums --> GSEA
Validation Interactive Plot
Human B-cell expression dataset, validated with PCAmodel, plotted in an interactive graph.
PCAmodel with MSigDB C7
PCAmodel from refine.bio data without controls. Subset to common genes with MSigDB C7
Validation on B-cell
B-cell dataset validation using PCAmodel from hierarchical clustering
BRCA Signature GSEA Visualization
Visualization of GSEA results for Cl4935_263
Hierarchical Clustering refine.bio with two controls
Negative control data + the subset of TCGA-COAD
Validation
GSEA on PCcluster 263 (Cl4935_263) to test breast-related signature
Graph-based Clustering
Test graph-based clustering using Girvan-Newman algorithm
Hierarchical Clustering with different k
Finding the minimum k that separates negative controls
ClusterR vs. Kmeans
Compare ClusterR (different seeding) and Kmeans (different algorithm).
Clustering 5 PCs collection with controls
Use top 5 PCs from 676 refine.bio + 20 controls (pos & neg)
Search TOPMed Cohort
Search heart-disease related TOPMed cohorts
Different PCAmodels from the same training datasets
Three different PCAmodels from recount2 datasets based on their annotation database.
PureCN Purity/Ploidy
EDA on outliers
13_PCAmodels_vs_multiPLIER
loading_perData (using PCA+clustering) and multiPLIER Z matrix from the same gene set
avgLoadings from perData vs. megaData
Build avgLoadings from two different processing approaches.
CNVWorkflow Output
Analysis of CNVWorkflow output: TMB, Mutational Signature, Variant Classification Accuracy
CNVWorkflow PureCN
Run PureCN for CNV analysis
CNVWorkflow Input
Prepare and pre-process input files for PureCN
CNVWorkflow
2018 NYC R/Bioconductor Meetup
CNVWorkflow_example_output
2018 NYC R/Bioconductor Meetup