attarwala

Murtuza Attarwala

Recently Published

Datamining Capstone Final Report
This final report reviews all six tasks (topic modeling, cuisine similarity, dish discovery, mining popular dishes, restaurant recommendation, and restaurant hygiene prediction), summarizing the algorithms used for each task and the usefulness of the results obtained.
Restaurant Hygiene Prediction using Yelp Reviews
This report uses the Yelp hygiene dataset to predict whether a restaurant will pass its hygiene inspection. The dataset consists of a training subset of 546 restaurants, used to train the classifier, and a testing subset of 12753 restaurants, used to evaluate the classifier's performance.
Mining Popular Indian Restaurants
This report mines the Yelp restaurant review dataset to discover knowledge about cuisines. We first mine the dataset to find the common or popular dishes of a particular cuisine; for this report, the author chose Indian cuisine. Once the popular dishes are known, we mine the dataset again to find restaurants that serve those dishes. This report only determines the popular restaurants for the most popular Indian dish, chicken tikka.
Mining Popular Indian Dishes
This report mines the Yelp restaurant review dataset to discover knowledge about cuisines. We mine the dataset to find the common or popular dishes of a particular cuisine; for this report, the author chose Indian cuisine.
Dish Discovery for Indian Cuisine from Yelp Restaurant Reviews
This report mines the Yelp restaurant review dataset to discover knowledge about cuisines, in particular the common or popular dishes of a given cuisine. When you try a new cuisine, you typically don't know beforehand which dishes it offers. For this task, we identify the dishes available for a cuisine by building a dish recognizer. The author chose to explore the dishes of Indian cuisine.
Cuisine Similarity from Yelp Restaurant Reviews
This report mines the Yelp restaurant review dataset to discover knowledge about cuisines. We mine the dataset to visualize the landscape of different cuisines and their similarities. The resulting cuisine map helps users see which cuisines are available and how they relate to one another, making it easier to discover and explore unfamiliar cuisines.
Topic Model for Yelp Restaurant Reviews
This document explores the topics that people discuss in Yelp restaurant reviews. The topic model is generated using LDA, with the number of topics set to 10. The document also compares the topic models for positive and negative reviews of restaurants serving Chinese cuisine.
Next Word Prediction
Next-word prediction using an n-gram probability model, built for the capstone project of Coursera's Data Science Specialization.
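To illustrate the general idea of n-gram next-word prediction, here is a minimal Python sketch using bigrams (n = 2). It is not the author's implementation: the function names `build_bigram_model` and `predict_next` and the toy corpus are assumptions made for illustration, and a real model would typically use higher-order n-grams with smoothing or backoff.

```python
from collections import Counter, defaultdict


def build_bigram_model(text):
    """Count, for every word, how often each other word follows it."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = model.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Toy corpus, invented purely for illustration.
corpus = "the cat sat on the mat and the cat slept"
model = build_bigram_model(corpus)
print(predict_next(model, "the"))
```

In this toy corpus, "the" is followed by "cat" twice and "mat" once, so the sketch predicts "cat". A word that never appears in the corpus yields `None`.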
n-gram prediction exploratory analysis
Exploratory analysis for an n-gram prediction model, built for the capstone project of the Coursera Data Science Specialization in collaboration with SwiftKey.
Analyzing approval of President Obama's handling of Afghanistan war
This report asks whether veterans (individuals who have ever served on active duty in the Armed Forces) are more or less likely than non-veterans to approve of President Obama's handling of the war in Afghanistan.
Analysis of human life and property damage by natural calamities between 1950 and 2011
Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern. This project explores the U.S. National Oceanic and Atmospheric Administration's storm database to answer the following questions: 1. Across the United States, which types of events (as indicated by the EVTYPE variable) are most harmful with respect to population health? 2. Across the United States, which types of events have the greatest economic consequences?
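The first question amounts to grouping records by EVTYPE and ranking the groups by a harm measure. The Python sketch below shows that shape under stated assumptions: the column names `FATALITIES` and `INJURIES`, the choice to sum them as the harm measure, the helper name `harm_by_event_type`, and the toy rows are all illustrative and are not taken from the report or from real data.

```python
from collections import defaultdict


def harm_by_event_type(rows):
    """Sum fatalities plus injuries per EVTYPE, sorted from most to least harmful."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["EVTYPE"]] += row["FATALITIES"] + row["INJURIES"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up rows for illustration only; these are NOT real NOAA figures.
toy_rows = [
    {"EVTYPE": "TORNADO", "FATALITIES": 2, "INJURIES": 10},
    {"EVTYPE": "FLOOD", "FATALITIES": 1, "INJURIES": 0},
    {"EVTYPE": "TORNADO", "FATALITIES": 0, "INJURIES": 3},
]
ranking = harm_by_event_type(toy_rows)
print(ranking)
```

The second question could follow the same pattern with a damage column in place of the casualty counts.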