Recently Published
ALY 3040 - Week 1 assignment: census sample
The US Census Bureau (https://www.census.gov/) serves as the nation's leading provider of data about its people and economy.
These data are provided at several geographic levels: state, county, city, census tract, and census block group.
Census Block Groups (CBGs) are the highest resolution at which the census data are provided.
Each CBG is identified by a unique 11-digit ID (e.g., 10010201002).
The file census_sample.csv includes the following columns:
1- CBG ID (cbg_id)
2- Median Income (median_income)
3- Median Age (median_age)
4- Percentage of the population that is White (white_ppl)
5- Percentage of the population that is Asian (asian_ppl)
6- Percentage of the population that is Black or African American (black_ppl)
7- Percentage of the population that is Hispanic or Latino (hispanic_ppl)
8- Percentage of the population with a Professional College education degree (edu_college)
9- Percentage of the population with a bachelor's degree (edu_bachelors)
10- Percentage of the population with a master's degree (edu_masters)
11- Percentage of the population with a Ph.D. degree (edu_phd)
Download the file and use Python or R to read the file and perform the following:
1- Read the data and view the first 5 rows (use the head function).
2- Provide details about the data type in each column.
3- Remove the rows that have missing values (NA).
4- Keep only the first 1000 rows for the rest of the analysis.
5- Create a new column named 'edu_degree' that is the sum of all degree holders (i.e., Professional College, Bachelors, Masters, and PhD).
6- Remove columns edu_college, edu_bachelors, edu_masters, and edu_phd.
7- Use visualization methods you learned in class to explore the data.
8- Use pair plots and correlation plots to find patterns in the data.
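As a rough illustration of steps 1 through 6 and the numeric side of step 8, here is a minimal pandas sketch. It is one possible approach, not a model answer. The helper name `prepare`, the constant `DEGREE_COLS`, and the inline `SAMPLE_CSV` rows are assumptions made up for this example; the real data would come from census_sample.csv, and the values shown here are invented.

```python
import io

import pandas as pd

# The four degree columns listed in the assignment.
DEGREE_COLS = ["edu_college", "edu_bachelors", "edu_masters", "edu_phd"]


def prepare(df: pd.DataFrame, n_rows: int = 1000) -> pd.DataFrame:
    """Steps 3-6: drop NA rows, keep the first n_rows, build edu_degree."""
    out = df.dropna().head(n_rows).copy()
    # Row-wise sum of the four degree percentages.
    out["edu_degree"] = out[DEGREE_COLS].sum(axis=1)
    # Remove the original degree columns once they are combined.
    return out.drop(columns=DEGREE_COLS)


# Tiny made-up stand-in for census_sample.csv (illustrative values only).
SAMPLE_CSV = """cbg_id,median_income,median_age,white_ppl,asian_ppl,black_ppl,hispanic_ppl,edu_college,edu_bachelors,edu_masters,edu_phd
10010201001,50000,38.5,80.0,1.0,15.0,4.0,5.0,10.0,4.0,1.0
10010201002,,41.0,70.0,2.0,20.0,8.0,6.0,12.0,5.0,2.0
10010202001,42000,35.2,40.0,3.0,50.0,7.0,4.0,8.0,2.0,0.5
10010203001,61000,44.1,85.0,5.0,5.0,5.0,7.0,20.0,9.0,3.0
"""

# Steps 1-2: read the data, view the first rows, inspect the column types.
# With the real file this would be pd.read_csv("census_sample.csv").
raw = pd.read_csv(io.StringIO(SAMPLE_CSV))
print(raw.head(5))
print(raw.dtypes)

# Steps 3-6: clean and reshape the data.
clean = prepare(raw)
print(clean)

# Step 8 (numeric part): pairwise Pearson correlations, excluding the ID.
corr = clean.drop(columns="cbg_id").corr()
print(corr.round(2))
```

For the plots themselves, `pandas.plotting.scatter_matrix(clean)` draws a basic pair plot when matplotlib is installed, and the `corr` table above can be passed to whichever heatmap function was covered in class.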
You should write a report that covers all the steps above including the code, results, and your comments.
Finally, include a section of at most one page in which you explain the interesting patterns you observed in the data.
Submit your report file.
Data Manipulation Using R
English Version of "Análise População Carcerária de 2016 a 2019 no Brasil"
INFOPEN Report
Analysis of the INFOPEN database from 2016 to 2019