Emissions Analysis
This project analyzes trends in fine particulate matter (PM2.5) emissions in the United States over a ten-year period (1999–2008) using data from the Environmental Protection Agency's (EPA) National Emissions Inventory (NEI). PM2.5 is a dangerous air pollutant that poses significant health risks, including respiratory and cardiovascular issues. The goal of this analysis is to investigate how PM2.5 emissions have changed over time and to identify the primary sources contributing to these trends.
The analysis examines both nationwide and localized trends. At the national level, total PM2.5 emissions were analyzed to assess whether overall pollution levels have decreased over time. Baltimore City, Maryland, was used as a case study to explore emissions trends by source type, including on-road, non-road, point, and non-point sources. Additionally, emissions from coal combustion-related sources were studied to evaluate changes driven by energy sector policies. The project also focused on motor vehicle emissions in Baltimore City and compared these trends with those observed in Los Angeles County, California, to highlight regional differences.
The analysis shows a clear decrease in total PM2.5 emissions in the United States between 1999 and 2008. In Baltimore City, reductions were observed across most source types, with particularly notable declines in motor vehicle emissions, likely due to cleaner technologies and stricter regulations. At the national level, PM2.5 emissions from coal combustion-related sources also decreased, which likely reflects changes in energy sector practices. The comparison between Baltimore and Los Angeles showed differing trends in motor vehicle emissions, which may be influenced by factors such as urban infrastructure and population density.
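The national and Baltimore totals described above come down to summing emissions per year, optionally within a subset of the records. The following is a minimal sketch in base R, not the project's actual code. It assumes a table with `fips`, `type`, `year`, and `Emissions` columns; the column names, the source-type labels, and all values are assumptions made up for illustration. The FIPS code 24510 identifies Baltimore City.

```r
# Toy stand-in for the NEI table; values are invented for illustration only.
nei <- data.frame(
  fips      = c("24510", "24510", "06037", "24510", "24510", "06037"),
  type      = c("POINT", "ON-ROAD", "ON-ROAD", "POINT", "ON-ROAD", "ON-ROAD"),
  year      = c(1999, 1999, 1999, 2008, 2008, 2008),
  Emissions = c(10, 4, 40, 6, 1, 42),
  stringsAsFactors = FALSE
)

# Total emissions per year across all records.
national <- aggregate(Emissions ~ year, data = nei, FUN = sum)

# Totals per year and source type for Baltimore City only.
baltimore <- subset(nei, fips == "24510")
by_type <- aggregate(Emissions ~ year + type, data = baltimore, FUN = sum)

print(national)
print(by_type)
```

The same two-step pattern (subset, then aggregate by year) would apply to the coal combustion and motor vehicle analyses, with a different filter in the subset step.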
The project presents its findings in six visualizations:
Total PM2.5 emissions across the United States.
PM2.5 emissions in Baltimore City by source type.
Grouped bar plots showing source-type trends in Baltimore.
Trends in coal combustion-related emissions at the national level.
Motor vehicle emissions in Baltimore City over time.
A comparison of motor vehicle emissions between Baltimore City and Los Angeles County.
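A comparison such as the one between Baltimore City and Los Angeles County can be sketched as a year-by-county table, a percent change per county, and a grouped bar plot. This sketch uses base R `barplot` in place of ggplot2 and gridExtra so that it runs without extra packages. The yearly values are invented and do not reproduce the NEI figures.

```r
# Toy motor vehicle totals (invented values), one row per county and year.
mv <- data.frame(
  county    = rep(c("Baltimore City", "Los Angeles County"), each = 4),
  year      = rep(c(1999, 2002, 2005, 2008), times = 2),
  Emissions = c(8, 5, 4, 2, 100, 110, 115, 105)
)

# Years as rows, counties as columns.
totals <- xtabs(Emissions ~ year + county, data = mv)

# Percent change from the first to the last year, per county.
pct_change <- 100 * (totals[nrow(totals), ] - totals[1, ]) / totals[1, ]
print(round(pct_change, 1))

# Grouped bars: one group per year, one bar per county.
pdf(NULL)  # null device, so the sketch also runs without a display
barplot(t(totals), beside = TRUE, legend.text = TRUE,
        xlab = "Year", ylab = "PM2.5 emissions",
        main = "Motor vehicle emissions by county")
dev.off()
```

Because the two counties can differ greatly in absolute emissions, the percent change is often easier to compare than the raw bars.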
The data for this project comes from the EPA's National Emissions Inventory (NEI), a comprehensive database of air pollution emissions across the United States. More information about the NEI is available on the EPA's official website (EPA NEI Database).
This project demonstrates data analysis, visualization, and reporting in R. The visualizations were built with ggplot2 and gridExtra. Through this analysis, the project highlights the progress made in reducing harmful emissions and the challenges that remain in improving air quality.
Electric Power Consumption: Data Analysis and Visualization
Description
This project explores electric power consumption patterns using data-driven techniques to generate meaningful insights. The analysis is performed using R and includes various visualizations that highlight key trends and metrics related to energy usage.
Overview
The dataset records household energy usage over time, broken down into sub-metering categories. The analysis looks for patterns and trends in that usage and relies on visualization to make the consumption dynamics easy to read.
Key Features
Histogram of Global Active Power:
This plot displays the distribution of global active power usage in the dataset. It provides insights into the most common energy consumption levels.
Time Series of Global Active Power:
A line plot showing variations in global active power over time. This plot reveals temporal trends and patterns in energy usage.
Energy Sub-Metering Comparison:
A multi-line plot comparing energy consumption across three different sub-metering categories. It highlights the distribution of energy usage in various sections of a household.
Multi-Panel Plot:
A consolidated view of various metrics, including global active power, voltage, sub-metering, and global reactive power. This multi-panel visualization provides a holistic understanding of energy usage.
Methodology
The dataset is filtered and processed in R using data.table for efficient data manipulation. Visualization techniques are implemented using both ggplot2 and base R plotting functions. The outputs include both single-variable plots and multi-panel layouts for comprehensive analysis.
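The filtering step can be sketched as follows. This version uses base R rather than data.table so that it runs without extra packages. The semicolon-separated layout, the `?` marker for missing values, the column names, and the two-day window are all assumptions for illustration, and the values are invented.

```r
# Toy sample in a semicolon-separated layout with "?" for missing values;
# the layout and column names are assumptions, the values are invented.
raw <- "Date;Time;Global_active_power
31/1/2007;23:59:00;1.2
1/2/2007;00:00:00;0.3
1/2/2007;00:01:00;?
2/2/2007;12:00:00;2.5
3/2/2007;00:00:00;3.7"

power <- read.table(text = raw, header = TRUE, sep = ";",
                    na.strings = "?", stringsAsFactors = FALSE)

# Keep only the days of interest (a hypothetical two-day window).
power <- subset(power, Date %in% c("1/2/2007", "2/2/2007"))

# Combine date and time into one timestamp for time series plots.
power$datetime <- as.POSIXct(paste(power$Date, power$Time),
                             format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

str(power)
```

Declaring the missing-value marker at read time keeps the power column numeric, which avoids a separate conversion step before plotting.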
Objectives
This project has two main objectives:
Provide an accessible and reproducible framework for energy consumption analysis.
Generate actionable insights into energy usage patterns using visualizations.
Output
The analysis includes four key visualizations:
A histogram showcasing the distribution of global active power.
A time series plot for tracking energy usage over time.
A multi-line plot for comparing energy sub-metering trends.
A multi-panel visualization summarizing key metrics.
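A multi-panel layout like the one listed above can be built with base R graphics by setting `par(mfrow = ...)` before drawing. The sketch below is one possible arrangement under stated assumptions: the data is randomly generated, and the column names (`active`, `voltage`, `sub1` to `sub3`, `reactive`) are hypothetical.

```r
# Toy measurements (randomly generated), one row per minute.
set.seed(1)
n <- 60
d <- data.frame(
  datetime = as.POSIXct("2007-02-01 00:00:00", tz = "UTC") + 60 * (0:(n - 1)),
  active   = runif(n, 0.2, 4),
  voltage  = rnorm(n, 240, 2),
  sub1     = rpois(n, 1),
  sub2     = rpois(n, 2),
  sub3     = rpois(n, 8),
  reactive = runif(n, 0, 0.5)
)

pdf(NULL)                    # null device, so the sketch runs without a display
old <- par(mfrow = c(2, 2))  # 2 x 2 grid, filled row by row

plot(d$datetime, d$active, type = "l", xlab = "", ylab = "Global active power")
plot(d$datetime, d$voltage, type = "l", xlab = "", ylab = "Voltage")

# Three sub-metering series share one panel; ylim covers all of them.
plot(d$datetime, d$sub1, type = "l", xlab = "", ylab = "Energy sub-metering",
     ylim = range(d$sub1, d$sub2, d$sub3))
lines(d$datetime, d$sub2, col = "red")
lines(d$datetime, d$sub3, col = "blue")
legend("topright", legend = c("sub1", "sub2", "sub3"),
       col = c("black", "red", "blue"), lty = 1, bty = "n")

plot(d$datetime, d$reactive, type = "l", xlab = "", ylab = "Global reactive power")

panels <- par("mfrow")  # record the layout before restoring the old settings
par(old)
dev.off()
```

To write the figure to a file instead, the `pdf(NULL)` call could be replaced with a `png()` call that names an output file.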
Technology Used
This project uses R for both data processing and visualization. Key libraries and techniques include:
data.table: For efficient data processing and filtering.
Base R plotting functions: For creating multi-panel visualizations.
ggplot2: For creating clear and aesthetically pleasing plots.
Data Provided by: UC Irvine Machine Learning Repository
http://archive.ics.uci.edu/ml/
Outcome of Heart Attack Care Measures
A histogram of 30-day mortality rates from heart attack shows that most hospitals have a rate between 14 and 16 percent, where the tallest bars sit in the central portion of the chart. The distribution is roughly bell-shaped with a slight right skew: most hospitals fall within the central range, while a few have noticeably higher rates, suggesting variability in care quality. Outliers at both ends of the histogram reveal hospitals with exceptionally low or high mortality rates, which are candidates for further investigation or improvement in healthcare outcomes.
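The reading of the histogram above can be reproduced in outline with base R. The sketch below assumes the rates arrive as text with a "Not Available" marker for hospitals without a reported value; that marker, the bin edges, and all values are assumptions for illustration, not the Hospital Compare data.

```r
# Toy rates as they might arrive in a CSV read as text (invented values);
# "Not Available" marks hospitals without a reported rate.
rates_raw <- c("14.3", "15.1", "Not Available", "16.0", "13.2",
               "15.6", "18.9", "14.8", "Not Available", "15.3")

# Non-numeric entries become NA; the coercion warning is expected here.
rates <- suppressWarnings(as.numeric(rates_raw))

# Bin the rates without drawing, then find the most populated bin.
h <- hist(rates, breaks = seq(12, 20, by = 2), plot = FALSE)
modal <- which.max(h$counts)
cat("Most common range:", h$breaks[modal], "to", h$breaks[modal + 1], "\n")
```

Calling `hist(rates)` without `plot = FALSE` would draw the chart itself; missing values are dropped before binning in either case.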
The data comes from the Hospital Compare website (http://hospitalcompare.hhs.gov), run by the U.S. Department of Health and Human Services.