RPubs

Calculating & Comparing NMI & Jaccard Coefficient scores

Calculating & Comparing NMI & Jaccard Coefficient scores for the given clusters

almost 5 years ago

Clustering Based On Geographical Locations

Using R programming & K Means algorithm of clustering, given location data points are assigned to 3 clusters.

almost 5 years ago

Predicting On The Titanic Survival

The project covers predicting survivals on Titanic ship based on various machine learning algorithms. I predicted the survival after conducting following steps 1) Feature engineering 2) Missing value imputation 3) Exploratory Data Analysis 4) Statistical Inference 5) Building Models for Prediction

almost 9 years ago

Initial Solution for Titanic: Machine Learning from Disaster

1) Feature engineering 2) Missing value imputation 3) Exploratory Data Analysis 4) Building Models for Prediction

almost 9 years ago

Influence of Professor's Looks On Students' Teaching Evaluations

This project statistically tests whether the gender bias and looks of the professor affect the professor’s teaching evaluation done by the students.

about 9 years ago

Investment Portfolio Analysis And Allocation To Get Optimum Profit Using Sharpe Ratio

In this project, historical stocks/bonds/commodities prices of any listed company/bod/commodity in the world are fetched and compared automatically. Going further, you can choose the stocks/bonds/commodities for the portfolio and using this project, you can allocate your funds by merely updating the 'stocks' vector and the project suggest the best allocation of funds for your stocks to get the maximum returns on your portfolio at any time.

about 9 years ago

Getting, Comparing And Visualizing Stock Data from Yahoo Finance

This project extracts the stock data from Yahoo Finance. Different Stocks's data are compared. Comparative charts are plotted for Annualized Returns.

about 9 years ago

Presentation on Application of NLP For Predicting The Next Word

This is a presentation for predicting the next word based on last 1,2 or 3 words using Natural Language Processing (NLP). This presentation contains link for codes on Git-Hub and Shiny Server.

about 9 years ago

Predicting The Next Word Using Natural Language Programming

I created a robust program using n-grams table to clean the given text/word/phrase and then split the text. The program computes the next word based on the last one or two or three or more than 3 words in a given text. The application is also posted on Shiny Server.

about 9 years ago

Building An N-gram Model With Natural Language Processing (NLP)

The goal here is to build simple model for the relationship between words.This is the first step in building a predictive text mining application.Using the exploratory analysis, I am going to build a basic n-gram model for predicting the next word based on the previous 1, 2, or 3 words.Also, my model will handle the unseen n-grams.

about 9 years ago

NLP

The goal here is to build simple model for the relationship between words.This is the first step in building a predictive text mining application.Using the exploratory analysis, I am going to build a basic n-gram model for predicting the next word based on the previous 1, 2, or 3 words.Also, my model will handle the unseen n-grams.

over 9 years ago

Analysis of NOAA Storm in USA

In one of the other data analytics project named ‘Analysis of NOAA Storm in USA’, my goal of the project was to analyze the storms in USA and track the characteristics of major storms & weather events in USA. Using exploratory data analysis, I analyzed over 9 lac observations and concluded that tornados are the most devastating events in USA while flood causes the highest socioeconomic damage in USA.

over 9 years ago

Qualitative Human Activity Recognition

In one of the predictive data analytics machine learning project named ‘Qualitative Human Activity Recognition’, my goal was to predict using data from accelerometers on the belt, forearm, arm and dumbbell of 6 participants. They were asked to perform barbell lifts correctly and incorrectly in 5 different ways. After cleaning the dataset, I created training & test datasets. I then computed area under ROC curve for training set to check goodness of fit. I employed predictions models like Decision Tree Model (DTM), Linear Discriminant Analysis (LDA) Method, Random Forest (RF) Method. I tested accuracy of each of these prediction algorithms. I predicted using RF Method with 99.41 % accuracy and 0.59% out of sample error rate.

over 9 years ago

Human Activity Recognition using Smartphones

One of the most exciting areas in all of data science right now is wearable computing. Companies like Fitbit, Nike, and Jawbone Up are racing to develop the most advanced algorithms to attract new users. The experiments have been carried out with a group of 30 volunteers within an age bracket of 19-48 years for this project named ‘Human Activity Recognition using Smartphones Dataset’. Each person performed six activities (WALKING, WALKING UPSTAIRS, WALKING DOWNSTAIRS, SITTING, STANDING, LAYING) wearing a smartphone (Samsung Galaxy S II) on the waist. The raw dataset hence created was processed to produce an independent tidy dataset of mean and standard deviations of each of the 30 participants for each of the above-mentioned activities.

over 9 years ago

Analysis Of Air Pollution(PM2.5) In USA

In the project titled as ‘Analysis Of Air Pollution(PM2.5) In USA’, I performed exploratory data analysis to determine change in PM2.5 emissions across USA and in cities like Baltimore and Los Angles. I also investigated the sources of PM2.5 emissions and analyzed the trend of emissions. It was concluded that although the PM2.5 emissions across USA have declined but some states have witnessed zigzag trends. Different trends were also noticed for different sources of pollutions in some of the states.

over 9 years ago

Analysis Of Household Electric Power Consumption in USA

While analyzing household electric power consumption on typical days, I studied global active power, voltage, global reactive power and meter-wise energy consumption together and separately. Sub-meters 1,2&3 were installed at kitchen, laundry room and air conditioner plus water heater respectively. It was found that kitchen consumes most power in comparison when all meters are in use.

over 9 years ago

Statistical Analysis Of Tooth Growth for 60 Guinea Pigs

As a statistical analysis project, I analyzed the tooth growth data for 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, (orange juice or ascorbic acid (a form of vitamin C and coded as VC). Based on p value which was larger than 0.05, I arrived at the conclusion that the observed difference of tooth length means across supplement types was statistically not different from 0.However, I also found that tooth length indeed varies across groups of dose levels and all the variations are statistically significant.

over 9 years ago

Statistical Analysis Of Tooth Growth for 60 Guinea Pigs

As a statistical analysis project, I analyzed the tooth growth data for 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, (orange juice or ascorbic acid (a form of vitamin C and coded as VC). Based on p value which was larger than 0.05, I arrived at the conclusion that the observed difference of tooth length means across supplement types was statistically not different from 0.However, I also found that tooth length indeed varies across groups of dose levels and all the variations are statistically significant.

over 9 years ago

R Presentation Using Interactive Graphics Package 'Plotly'

over 9 years ago

Use of Interactive Graphic Packages Like Shiny

Utilizing interactive graphic packages like shiny, I created summary, structure, data and plots pertaining to a dataset in the output panel using reactive function. In this project, I used various widgets like Radio buttons, Slider and Checkboxes.

over 9 years ago

Analysing And Ranking 4000 Medicare Certified Hospitals In USA Based On The Prescribed Quality Standards

The project named ‘Analyzing And Ranking 4000 Medicare Certified Hospitals In USA Based On The Prescribed Quality Standards’ was aimed at assessing and ranking quality standards across certified hospitals. With the help of R-Programming, I plotted 30-day mortality rates for Heart Attack, Pneumonia And Heart Failure and subsequently determined the best hospital in a state. I also ranked all the hospitals in a state and across all the states for the given outcome of medical disease by using R-Programming.

over 9 years ago

Quantifying A Relationship Between A Set Of Variables And MPG Using Regression Analysis

While working on the project named ‘Quantifying A Relationship Between A Set Of Variables And MPG (Miles Per Gallon) Using Regression Analysis’, I used Regression Analysis & Exploratory Data Analysis techniques to study mainly the effect of transmission types (i.e. automatic or manual) on MPG.T-test concluded that mean of manual transmission is about 7.25 MPG more than annual transmission. Then, I developed 4 linear regression models and selected the one with highest Adjusted R-squared value, lowest residual standard error and least RSS (Residual Sum of Squares). I also observed from the conditional plots that cars that are lighter in weight with a manual transmission and cars that are heavier in weight with an automatic transmission have higher MPG values.

over 9 years ago

RPubs

abhijitjantre

Abhijit Jantre

Recently Published

Calculating & Comparing NMI & Jaccard Coefficient scores

Clustering Based On Geographical Locations

Predicting On The Titanic Survival

Initial Solution for Titanic: Machine Learning from Disaster

Influence of Professor's Looks On Students' Teaching Evaluations

Investment Portfolio Analysis And Allocation To Get Optimum Profit Using Sharpe Ratio

Getting, Comparing And Visualizing Stock Data from Yahoo Finance

Presentation on Application of NLP For Predicting The Next Word

Predicting The Next Word Using Natural Language Programming

Building An N-gram Model With Natural Language Processing (NLP)

NLP

Analysis of NOAA Storm in USA

Qualitative Human Activity Recognition

Human Activity Recognition using Smartphones

Analysis Of Air Pollution(PM2.5) In USA

Analysis Of Household Electric Power Consumption in USA

Statistical Analysis Of Tooth Growth for 60 Guinea Pigs

Statistical Analysis Of Tooth Growth for 60 Guinea Pigs

R Presentation Using Interactive Graphics Package 'Plotly'

Use of Interactive Graphic Packages Like Shiny

Analysing And Ranking 4000 Medicare Certified Hospitals In USA Based On The Prescribed Quality Standards

Quantifying A Relationship Between A Set Of Variables And MPG Using Regression Analysis

Reproducible Research Programming Assignment Peer Graded Project 2

Data Products Week 4 Programming Assignment

Data Products Week 3 Programming Assignment (Plotly)

Data Products Week 2 Programming Assignment

Coursera Data Products Week 2 Assignment

My First R Presentation

Sign In

abhijitjantre

Abhijit Jantre

Recently Published