
mjgrav2001

Mark Jack

Recently Published

Plot
Presentation for Coursera Capstone Project
This presentation summarizes the NLP capstone project - developing an interactive app for word prediction using a smoothing algorithm. Several libraries are loaded. The creation of a corpus of documents from the three text data files relies mostly on the library 'quanteda'. It quickly tokenizes the corpus and removes text features such as punctuation, numbers and white space, and converts words to lowercase. The processing time for the complete text data is considerable, so a corpus is created only for a sample of the documents. Unigrams, bigrams, trigrams and quadgrams are generated via quanteda's document-feature matrix (dfm) format. A dfm allows for quick and easy analysis of the most frequently occurring ngrams.
NLP Presentation, Coursera Capstone Project
This presentation summarizes the NLP capstone project - developing an interactive app for word prediction using a smoothing algorithm. Several libraries are loaded. The creation of a corpus of documents from the three text data files relies mostly on the library 'quanteda'. It quickly tokenizes the corpus and removes text features such as punctuation, numbers and white space, and converts words to lowercase. The processing time for the complete text data is considerable, so a corpus is created only for a sample of the documents. Unigrams, bigrams, trigrams and quadgrams are generated via quanteda's document-feature matrix (dfm) format. A dfm allows for quick and easy analysis of the most frequently occurring ngrams.
Coursera Capstone Project Presentation - NLP
Coursera Capstone Project - A natural language processing (NLP) model with 'ngram' continuation probabilities via Kneser-Ney smoothing. Mark A. Jack, November 14, 2016.
Publish Document
Coursera Capstone Project - A natural language processing (NLP) model with 'ngram' continuation probabilities via Kneser-Ney smoothing.
CapstoneProject-Milestone
Milestone report (RMarkdown file) for NLP capstone project of Coursera 'Data Science Specialization'. Author: Mark A. Jack, March 25, 2016.
Manual versus automatic transmission - a Shiny app to generate histograms of miles per gallon with adjustable bin size
Slides created with Rpresentation to present the Shiny app for the final project of the Coursera course 'Developing Data Products' in the 'Data Science Specialization' (Johns Hopkins University). 5 slides total.
RepRes_CP2
Analysis of NOAA Weather Data for Peer Assessment 2 of ‘Reproducible Research’: this R Markdown file describes the results of Peer Assessment 2.
RepData_CP2