JoPaPi

John Pauline Pineda

Recently Published

Case Study : Characterizing Life Expectancy Drivers Across Countries Using Model-Agnostic Interpretation Methods for Black-Box Models
This case study aims to develop an interpretable regression model that provides robust and reliable estimates of life expectancy from an optimal set of observations and predictors, while delivering accurate predictions when applied to new, unseen data.
Methods : Exploring Penalized Models for Handling High-Dimensional Survival Data
This presentation illustrates various penalized survival models for high-dimensional data in R.
Methods : Exploring and Visualizing Extracted Dimensions from Principal Component Algorithms
This presentation illustrates various exploration and visualization methods for extracted dimensions obtained from principal component algorithms in R.
Methods : Sample Size and Power Calculations for Tests Comparing Proportions in Clinical Research
This presentation illustrates various sample size and power calculations for clinical research proportion comparison tests in R.
Methods : Sample Size and Power Calculations for Tests Comparing Means in Clinical Research
This presentation illustrates various sample size and power calculations for clinical research mean comparison tests in R.
Methods : Comparing Oversampling and Undersampling Algorithms for Class Imbalance Treatment
This presentation illustrates various oversampling and undersampling algorithms for augmenting imbalanced data prior to model training in R.
Methods : Exploring Performance Evaluation Metrics for Survival Prediction
This presentation illustrates various performance metrics which are adaptable to censoring conditions for evaluating survival model predictions in R.
Methods : Exploring Robust Logistic Regression Models for Handling Quasi-Complete Separation
This presentation illustrates various robust model variants applied to handle quasi-complete or complete separation during logistic regression modelling in R.
Methods : Estimating Outlier Scores Using Density and Distance-Based Anomaly Detection Algorithms
This presentation illustrates various density and distance-based anomaly detection algorithms for estimating outlier scores in R.
Methods : Estimating Outlier Scores Using Isolation Forest-Based Anomaly Detection Algorithms
This presentation illustrates various isolation forest-based anomaly detection algorithms for estimating outlier scores in R.
Methods : Identifying Multivariate Outliers Using Density-Based Clustering Algorithms
This presentation illustrates various density-based clustering algorithms for identifying multivariate outliers in R.
Methods : Exploring Dichotomization Thresholding Strategies for Optimal Classification
This presentation illustrates various dichotomization thresholding strategies for optimally classifying categorical responses in R.
Methods : Implementing Gradient Descent Algorithm in Estimating Regression Coefficients
This presentation illustrates the implementation of the gradient descent algorithm in estimating the coefficients of a linear regression model in R.
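As a rough illustration of the idea, and not the presentation's own code (which is written in R), the following Python sketch fits a one-predictor linear regression by batch gradient descent on the mean squared error. The function name, learning rate, iteration count and toy data are all assumptions chosen for the example.

```python
def gradient_descent(xs, ys, learning_rate=0.05, n_iter=5000):
    """Fit y = b0 + b1 * x by minimizing mean squared error (hypothetical helper)."""
    n = len(xs)
    b0, b1 = 0.0, 0.0  # start both coefficients at zero
    for _ in range(n_iter):
        # Residuals of the current fit: predicted minus observed
        residuals = [b0 + b1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to b0 and b1
        grad_b0 = (2.0 / n) * sum(residuals)
        grad_b1 = (2.0 / n) * sum(r * x for r, x in zip(residuals, xs))
        # Step against the gradient
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


# Toy data generated exactly from y = 1 + 2x (an assumption for illustration)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
b0, b1 = gradient_descent(xs, ys)
print(round(b0, 4), round(b1, 4))
```

With noise-free data like this, the estimates should approach the generating intercept and slope; on real data the learning rate usually needs tuning, and predictors are often standardized first.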
Methods : Formulating Segmented Groups Using Clustering Algorithms
This presentation illustrates various clustering algorithms for segmenting information in R.
Methods : Extracting Information Using Dimensionality Reduction Algorithms
This presentation illustrates various dimensionality reduction algorithms for extracting information in R.
Methods : Remedial Procedures for Skewed Data with Extreme Outliers
This presentation illustrates various remedial procedures for handling skewed data with extreme outliers for classification in R.
Methods : Selecting Informative Predictors Using Simulated Annealing and Genetic Algorithms
This presentation illustrates the implementation of simulated annealing and genetic algorithms in selecting informative predictors for a modelling problem in R.
Methods : Selecting Informative Predictors Using Univariate Filters
This presentation illustrates the implementation of univariate filters in selecting informative predictors for a modelling problem in R.
Methods : Selecting Informative Predictors Using Recursive Feature Elimination
This presentation illustrates the implementation of recursive feature elimination in selecting informative predictors for a modelling problem in R.
Methods : Evaluating Model-Independent Feature Importance for Predictors with Dichotomous Categorical Responses
This presentation illustrates various model-independent feature importance metrics for predictors with dichotomous categorical responses in R.
Methods : Evaluating Model-Independent Feature Importance for Predictors with Numeric Responses
This presentation illustrates various model-independent feature importance metrics for predictors with numeric responses in R.
Methods : Cost-Sensitive Learning for Severe Class Imbalance
This presentation illustrates various cost-sensitive learning procedures for handling class imbalance in R.
Methods : Remedial Procedures in Handling Imbalanced Data for Classification
This presentation illustrates various remedial procedures for handling imbalanced data for classification in R.
Methods : Evaluating Hyperparameter Tuning Strategies and Resampling Distributions
This presentation illustrates various evaluation procedures for hyperparameter tuning strategies and resampling distributions in R.
Methods : Modelling Multiclass Categorical Responses for Prediction
This presentation illustrates various predictive modelling procedures for multiclass categorical responses in R.
Methods : Modelling Dichotomous Categorical Responses for Prediction
This presentation illustrates various predictive modelling procedures for dichotomous categorical responses in R.
Methods : Modelling Numeric Responses for Prediction
This presentation illustrates various predictive modelling procedures for numeric responses in R.
Methods : Resampling Procedures for Model Hyperparameter Tuning and Internal Validation
This presentation illustrates various resampling procedures as applied during model hyperparameter tuning and internal validation in R.
Methods : Clinical Research Prediction Model Development and Evaluation for Prognosis
This presentation illustrates best practices for developing and evaluating prognostic models for clinical research using R.
Methods : Missing Data Pattern Analysis, Imputation Method Evaluation and Post-Imputation Diagnostics
This presentation illustrates various missing data pattern analyses, imputation method evaluations and post-imputation diagnostics in R.
Methods : Survival Analysis and Descriptive Modelling for a Three-Group Right-Censored Data with Time-Independent Variables Using Cox Proportional Hazards Model
This presentation illustrates the survival analysis and descriptive modelling steps for three-group right-censored data with time-independent variables using the Cox Proportional Hazards Model in R.
Methods : Survival Analysis and Descriptive Modelling for a Two-Group Right-Censored Data with Time-Independent Variables Using Cox Proportional Hazards Model
This presentation illustrates the survival analysis and descriptive modelling steps for two-group right-censored data with time-independent variables using the Cox Proportional Hazards Model in R.
Methods : Treatment Comparison Tests Between a Single Factor Variable (2-Level) and a Single Response Variable (Numeric)
This presentation illustrates various treatment comparison tests applied to data with a single factor variable (2-level) and a single response variable (numeric) in R.
Methods : Data Quality Assessment, Preprocessing and Exploration for a Regression Modelling Problem
This presentation illustrates various data quality assessment, preprocessing and exploration methods in R.
Methods : Data Quality Assessment, Preprocessing and Exploration for a Classification Modelling Problem
This presentation illustrates various data quality assessment, preprocessing and exploration methods in R.
Projects : Analysis Dashboard for the Structural Topic Modeling Study of Philippine Broadsheets
This project summarizes the exploratory data analysis conducted for the pre- and post-structural topic modelling results involving the words, topics and documents obtained from the media contents of Philippine broadsheets.
Projects : Logistic Regression for Immunization Survey Data
This project implements the logit model, a binary-response GLM, to understand the factors driving the knowledge and awareness of survey respondents on immunization. Y corresponds to the measured knowledge on immunization, π represents the probability of having sufficient knowledge on immunization, and n refers to the observations for the 635 respondents used for the model.
Projects : Likert Scale Reliability Assessment
This project evaluates the reliability of statements used to measure an abstract concept through Likert scales. The median test was the primary tool used to identify the statements that effectively delineated responses from the survey respondents.
Exercises : Combined and Separate Ratio Estimation
This exercise uses combined and separate ratio estimation in a stratified random sample to estimate the population total of a certain variable.
Exercises : Ratio Estimation
This exercise uses ratio estimation in a simple random sample to estimate the population mean of a certain variable.
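A minimal sketch of the classical ratio estimator of a population mean, assuming an auxiliary variable x is measured on the same sampled units and its population mean is known. The function name and the sample values are hypothetical and are not taken from the exercise.

```python
def ratio_estimate_mean(y_sample, x_sample, x_pop_mean):
    """Ratio estimator of the population mean of y (hypothetical helper).

    Scales the known population mean of the auxiliary variable x by the
    sample ratio r = sum(y) / sum(x).
    """
    r_hat = sum(y_sample) / sum(x_sample)  # estimated ratio of y to x
    return r_hat * x_pop_mean


# Made-up simple random sample of four units
y = [12.0, 15.0, 9.0, 24.0]   # variable of interest
x = [4.0, 5.0, 3.0, 8.0]      # auxiliary variable
estimate = ratio_estimate_mean(y, x, x_pop_mean=6.0)
print(estimate)  # 18.0
```

The estimator tends to be more precise than the plain sample mean when y is roughly proportional to x.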
Exercises : ANOVA Table for Simple and Stratified Random Sampling
This exercise generates the ANOVA table for simple and stratified random sampling methods.
Exercises : Simple and Stratified Random Sampling
This exercise estimates the population mean using simple and stratified random sampling.
Exercises : General Linear Hypothesis
This exercise computes the T²-statistic used in testing hypotheses on linear combinations of the elements of the mean vector μ. Under the general linear hypothesis (GLH), the interest is in testing the following hypotheses: (1) H₀ : Aμ = b and (2) Hₐ : Aμ ≠ b
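For reference, the statistic for the hypotheses above usually takes the form sketched below. The setup is an assumption, not stated in the exercise description: a random sample of size n from a p-variate normal population, with sample mean vector x̄, sample covariance matrix S, and a q × p matrix A of full row rank q.

```latex
T^2 = n\,(A\bar{x} - b)^{\top} \left(A S A^{\top}\right)^{-1} (A\bar{x} - b),
\qquad
\frac{n - q}{q\,(n - 1)}\; T^2 \sim F_{q,\,n-q} \quad \text{under } H_0 .
```

H₀ is rejected at level α when the scaled statistic exceeds the upper-α quantile of the F distribution with q and n − q degrees of freedom.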
Exercises : Inference on Single Mean Vectors
This exercise implements simultaneous inference for means using the Roy-Bose simultaneous confidence intervals and the Bonferroni method of multiple comparisons.
Exercises : Poisson Point Patterns
This exercise implements simple analysis for Poisson point patterns as applied on spatial data.