Rudaiba Tarannum

Recently Published

Data Science Project
Data Science Project: Exploring the iris flower dataset. Rudaiba Tarannum, YWCA Higher Secondary Girls’ School. February 27, 2024.
Data Science Project
Select one dataset that you find interesting from the “datasets” folder (datasets), or use your own research dataset. A personal dataset must contain:
- at least 2 numerical columns
- one target column

Steps:
- Download the dataset and move it to your RStudio working directory (or any directory you are comfortable with).
- Create a notebook file in RStudio and load the dataset into a variable.
- Write a few sentences to explain the dataset:
  - Identify the numerical columns.
  - Identify the categorical columns.
  - Identify the target variable (the last column is the target variable).
- Remove all categorical columns except the target column from the dataset. From here on, the numerical columns are called features and the last column is called the target or class.
- Use the ggplot2 library functions to plot the following, and explain each figure in 2-3 sentences:
  - Scatter plots for 3 combinations of any two features, with each class (target) shown in a different colour.
  - A boxplot of every feature column, with each class (target) shown in a different colour. With 5 features and 2 target categories, you will create 5 plots, each containing 2 boxes.
  - A violin plot of every feature column, with each class (target) shown in a different colour. With 5 features and 2 target categories, you will create 5 plots, each containing 2 violins.
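The plotting steps might be sketched roughly as follows. This is only an illustration under assumptions: it uses R's built-in `iris` data frame as a stand-in for the chosen dataset (iris has no categorical columns other than the target, so nothing needs removing), and the variable names `features`, `target`, `p_scatter`, `box_plots`, and `violin_plots` are made up for the example.

```r
library(ggplot2)

# Assumption: the built-in iris data frame stands in for the chosen dataset.
data(iris)

# Features are the numerical columns; the last column (Species) is the target.
features <- names(iris)[sapply(iris, is.numeric)]
target <- names(iris)[ncol(iris)]

# Scatter plot of one pair of features, coloured by class.
# Repeat with two other feature pairs to cover 3 combinations.
p_scatter <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, colour = Species)) +
  geom_point()

# One boxplot per feature, with one box per class.
box_plots <- lapply(features, function(f) {
  ggplot(iris, aes(x = .data[[target]], y = .data[[f]], fill = .data[[target]])) +
    geom_boxplot() +
    labs(x = target, y = f, fill = target)
})

# One violin plot per feature, with one violin per class.
violin_plots <- lapply(features, function(f) {
  ggplot(iris, aes(x = .data[[target]], y = .data[[f]], fill = .data[[target]])) +
    geom_violin() +
    labs(x = target, y = f, fill = target)
})

print(p_scatter)
print(box_plots[[1]])
print(violin_plots[[1]])
```

Building the per-feature plots with `lapply` keeps the code the same length no matter how many features the dataset has; printing each list element in its own notebook chunk leaves room for the 2-3 sentence explanation.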
Assignment - 2
Given the following data frame:

    exam_score = data.frame(
      ID = c(1, 2, 3, 4, 5),
      Name = c("Alice", "Bob", "David", "John", "Jenny"),
      Age = c(20, 25, 30, 22, 18),
      Score = c(100, 78, 90, 55, 81)
    )

- Create the data frame.
- Add 2 new rows.
- Add a new column called “Income”. This column should be numerical.
- Find the max, min, median, sum, mean, standard deviation, variance, and quantiles of the columns Age, Score, and Income.
- Find the correlation between:
  - Age and Score
  - Age and Income
  - Score and Income
- Select rows where the score is greater than or equal to 80.
- Select rows with an age in the range 20 to 30.
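One possible way to work through these tasks in base R is sketched below. The two extra rows and all Income values are invented purely for illustration, and the helper names (`new_rows`, `stats`, `high_scorers`, `age_20_30`) are not part of the assignment.

```r
exam_score <- data.frame(
  ID = c(1, 2, 3, 4, 5),
  Name = c("Alice", "Bob", "David", "John", "Jenny"),
  Age = c(20, 25, 30, 22, 18),
  Score = c(100, 78, 90, 55, 81)
)

# Add 2 new rows (hypothetical values) by row-binding a second data frame.
new_rows <- data.frame(
  ID = c(6, 7),
  Name = c("Maya", "Omar"),
  Age = c(27, 21),
  Score = c(88, 67)
)
exam_score <- rbind(exam_score, new_rows)

# Add a numerical Income column (hypothetical values, one per row).
exam_score$Income <- c(1200, 2500, 4000, 1800, 900, 3100, 1500)

# Summary statistics for Age, Score, and Income, one column per variable.
numeric_cols <- c("Age", "Score", "Income")
stats <- sapply(exam_score[numeric_cols], function(x) {
  c(max = max(x), min = min(x), median = median(x), sum = sum(x),
    mean = mean(x), sd = sd(x), var = var(x))
})
print(stats)
print(sapply(exam_score[numeric_cols], quantile))

# Pairwise correlations.
print(cor(exam_score$Age, exam_score$Score))
print(cor(exam_score$Age, exam_score$Income))
print(cor(exam_score$Score, exam_score$Income))

# Row selection with logical conditions.
high_scorers <- exam_score[exam_score$Score >= 80, ]
age_20_30 <- exam_score[exam_score$Age >= 20 & exam_score$Age <= 30, ]
print(high_scorers)
print(age_20_30)
```

Here the age range is treated as inclusive at both ends; if the assignment intends an exclusive bound, the comparison operators would need adjusting.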
Assignment - 1
Question 1 [8]
Imagine you have a DNA sequence consisting of the letters "A", "C", "G", and "T".
- Create two variables: gene1 with 8 characters and gene2 with 3 characters.
- Concatenate the two variables.
- Create a substring from gene1, from index 2 to 5.
- Does the pattern of gene2 exist in gene1? If yes, find the index.
Here is an example of a DNA sequence of length 5:

    DNA = "ACCGT"

Question 2 [2]
Select any two integers and store them in two variables. Perform the following arithmetic operations on the two numbers, store each result in a variable, and print it.
1. Add
2. Subtract
3. Multiply
4. Divide
5. Power
6. Modulo
The “Add” operation is done for you as an example:

    var1 = 10
    var2 = 3
    result_add = var1 + var2
    print(result_add)
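As a rough illustration of both questions, the sketch below uses base R string functions and arithmetic operators. The sequences chosen for `gene1` and `gene2` are arbitrary examples, and the `result_*` names simply extend the naming of the provided “Add” example.

```r
# Question 1: string operations on example DNA sequences (arbitrary values).
gene1 <- "ACGTTGCA"  # 8 characters
gene2 <- "GTT"       # 3 characters

# Concatenate without a separator.
combined <- paste0(gene1, gene2)
print(combined)

# Substring of gene1 from index 2 to 5 (R indices start at 1, both ends inclusive).
sub_gene <- substr(gene1, 2, 5)
print(sub_gene)

# regexpr() gives the start index of the first match, or -1 if there is none.
# fixed = TRUE treats gene2 as a literal string rather than a regular expression.
match_pos <- as.integer(regexpr(gene2, gene1, fixed = TRUE))
if (match_pos > 0) {
  print(paste("gene2 found in gene1 at index", match_pos))
} else {
  print("gene2 not found in gene1")
}

# Question 2: arithmetic on two integers.
var1 <- 10
var2 <- 3
result_add <- var1 + var2
result_subtract <- var1 - var2
result_multiply <- var1 * var2
result_divide <- var1 / var2
result_power <- var1 ^ var2
result_modulo <- var1 %% var2

print(result_add)
print(result_subtract)
print(result_multiply)
print(result_divide)
print(result_power)
print(result_modulo)
```

`regexpr()` reports only the first occurrence; `gregexpr()` is the base R function to reach for if every occurrence of the pattern is needed.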
R programming and data science (Class - 1)
This is an R program.