Easy web publishing from R
Write R Markdown documents in RStudio.
Share them here on RPubs.
(It’s free, and couldn’t be simpler!)
Recently Published
Boston Healthcare Map
Map of major healthcare institutions and schools in Boston, along with their national rankings
Gravity Model
Model of Massachusetts trade with foreign nations, with Ireland highlighted to show the impact of historical ties on modern trade trends.
Diamonds Dataset Visualization: Histogram of Diamond Prices, Density Plot of Carat Weight, and Boxplot of Diamond Prices
This visualization explores the diamonds dataset to examine the distribution and spread of the data, identify the dominant price groups, and detect outliers using a boxplot.
Diamonds Dataset Visualization: Color Distribution, Clarity Grade, and Cut Proportions
This visualization explores the diamonds dataset to examine the distribution of diamond colors, the relationship between color and clarity, and the proportion of each cut type per color.
Activity M1.1: Programming Fundamentals
Develop basic skills for handling information and commands in R.
Document
This is the first assignment
SwiftKey Capstone: Exploratory Data Analysis of HC Corpora
This project is part of the Johns Hopkins University Data Science Specialization Capstone. The objective is to explore large-scale English text datasets (blogs, news, and Twitter) and build the foundation for a next-word prediction model similar to those used in mobile smart keyboards.
The analysis includes:
Basic dataset summaries (file size, line counts, maximum line length)
Exploratory analysis of text structure
Sampling and cleaning strategies suitable for large corpora
Word frequency analysis (unigrams and bigrams)
Distribution of words per line
Vocabulary coverage analysis (50% and 90% token coverage)
Preliminary modeling strategy for n-gram backoff prediction
The results highlight differences between text sources (short Twitter messages vs. long blog entries), motivate efficient sampling techniques, and inform the design of a responsive Shiny application for deployment.
The next phase of the capstone will implement an optimized n-gram model with a backoff strategy and deploy it via Shiny for real-time next-word prediction.
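The word frequency and vocabulary coverage steps listed above can be illustrated with a minimal sketch. It is written in Python rather than the R used in the capstone, and it is not the author's code: the helper names `tokenize` and `words_for_coverage` are hypothetical, the tokenizer is deliberately naive, and the three sample lines are invented toy data standing in for the HC Corpora files.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (naive tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def words_for_coverage(counts, target):
    """Return how many of the most frequent words are needed so that their
    combined count reaches `target` (a fraction between 0 and 1) of all tokens."""
    total = sum(counts.values())
    running = 0
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        running += freq
        if running >= target * total:
            return rank
    return len(counts)


# Toy stand-in for a sampled corpus (assumption: not taken from the real data).
sample_lines = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "a cat and a dog",
]

unigrams = Counter()
bigrams = Counter()
for line in sample_lines:
    tokens = tokenize(line)
    unigrams.update(tokens)
    # Pair each token with its successor; bigrams never span two lines.
    bigrams.update(zip(tokens, tokens[1:]))

print("Top unigrams:", unigrams.most_common(3))
print("Top bigrams:", bigrams.most_common(3))
print("Words needed for 50% coverage:", words_for_coverage(unigrams, 0.5))
print("Words needed for 90% coverage:", words_for_coverage(unigrams, 0.9))
```

On a real corpus the same counting would run over a random sample of lines, which is one reason the project description stresses sampling strategies for large files.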