Recently Published
Milestone Report - NLP
The goal of the Capstone Project is to apply all techniques learned during the Data Science Specialization for building a brand new application: analysis of text data and natural language processing.
This work involves several tasks: (i) Understanding the problem; (ii) Data acquisition and cleaning; (iii) Exploratory analysis; (iv) Statistical modeling; (v) Predictive modeling; (vi) Creative exploration; (vii) Creating a data product; and (viii) Creating a short slide deck pitching the data product.
The current Milestone Report aims to explain the major features of the data, as well as to summarize plans for creating the prediction algorithm and Shiny app.
Wearable Computing: Self-tracking and quality of the movement activities
Nowadays, it is possible to collect a large amount of data about personal activity by using devices such as Jawbone Up, Nike FuelBand, and Fitbit. These types of devices are part of the quantified self movement, as people like to quantify how much of a particular activity they do (but they rarely quantify how well they do it).
The goal of this project is to use data from accelerometers on the belt, forearm, arm, and dumbbell of 6 participants to quantify how well they perform an exercise. The participants were asked to perform barbell lifts correctly and incorrectly in 5 different ways (see Refs. [1-2] for more details), i.e., 10 repetitions in five different fashions.
To predict the manner in which the participants did the exercise, a model is built and described step by step. Additionally, the machine learning algorithm will be applied to the 20 test cases (the results of these tests form a second part of the project, which is not included in the current report).
The U.S. NOAA Storm Data Analysis by type of harmful events, and by events with greatest economic consequences
Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project involves exploring the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
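As a rough illustration of ranking event types by harm to public health, the following minimal Python sketch sums fatalities and injuries per event type. The field names (`event_type`, `fatalities`, `injuries`) and all values are hypothetical placeholders, not records from the NOAA database, and this is not the code used in the report itself.

```python
from collections import defaultdict

# Hypothetical records; the values are made up for illustration and are
# not taken from the NOAA storm database.
records = [
    {"event_type": "TORNADO", "fatalities": 3, "injuries": 40},
    {"event_type": "FLOOD", "fatalities": 1, "injuries": 2},
    {"event_type": "TORNADO", "fatalities": 0, "injuries": 15},
    {"event_type": "HEAT", "fatalities": 5, "injuries": 0},
]


def casualties_by_event_type(rows):
    """Sum fatalities and injuries per event type, most harmful first."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["event_type"]] += row["fatalities"] + row["injuries"]
    # Sort by total casualties in descending order.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = casualties_by_event_type(records)
for event_type, total in ranking:
    print(event_type, total)
```

The same grouping could be applied to property damage estimates to rank events by economic consequences.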
The best Canadian state to live, based on the individual income statistics
This is an example of a Shiny web app, with supporting documentation, that aims to answer the following question:
What is the best Canadian province to live in, given a desired salary?
The user enters the desired salary and, after clicking Submit, the dashboard shows the number of tax filers in each province who reported the entered salary.
In addition, the user can see other important information: Income per Source, Income per Range, and three Key Performance Indicators (KPIs), namely Top Income Range, Total Number of Tax Filers, and Top Predicted Income per Source.
Both ui.R and server.R are available in the GitHub repository: https://github.com/lilianabraescu/Developing-Data-Products-The-best-Canadian-state-to-live
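The lookup behind the dashboard can be sketched roughly as follows. This is a minimal Python illustration, not the app's actual R code from ui.R or server.R; the income brackets, the filer counts, and the function name `filers_per_province` are all hypothetical, and it assumes the data is organized as income ranges per province.

```python
# Hypothetical income brackets and filer counts; not real tax statistics.
filers = [
    {"province": "Ontario", "low": 50_000, "high": 75_000, "count": 200},
    {"province": "Ontario", "low": 75_000, "high": 100_000, "count": 90},
    {"province": "Quebec", "low": 50_000, "high": 75_000, "count": 120},
    {"province": "Quebec", "low": 75_000, "high": 100_000, "count": 60},
    {"province": "Alberta", "low": 75_000, "high": 100_000, "count": 80},
]


def filers_per_province(rows, salary):
    """Return {province: filer count} for the bracket containing `salary`."""
    result = {}
    for row in rows:
        # A bracket includes its lower bound and excludes its upper bound.
        if row["low"] <= salary < row["high"]:
            result[row["province"]] = result.get(row["province"], 0) + row["count"]
    return result


counts = filers_per_province(filers, 80_000)
print(counts)
```

A salary that falls outside every bracket yields an empty result, which a dashboard would need to handle explicitly.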
R Markdown Presentation & Plotly
Developing Data Products Assignment:
Create a web page presentation using R Markdown that features a plot created with Plotly.
Host your webpage on either GitHub Pages, RPubs, or NeoCities.
Your webpage must contain the date that you created the document, and it must contain a plot created with Plotly.
Interactive Map: Top Attractions in Montreal
Using R Markdown and Leaflet, an interactive map was created with top tourist attractions in Montreal, Canada.
Custom markers were added on the map, based on the latitude and longitude of each place.
Representative pictures of each tourist attraction were used as marker icons.
Further details about places of interest can be found through the links associated with each popup icon.