Enter your name and EID here

This is the dataset you will be working with:

bank_churners <- readr::read_csv("https://wilkelab.org/SDS375/datasets/bank_churners.csv")

bank_churners
## # A tibble: 10,127 x 21
##    CLIENTNUM Attrition_Flag Customer_Age Gender Dependent_count Education_Level
##        <dbl> <chr>                 <dbl> <chr>            <dbl> <chr>          
##  1 768805383 Existing Cust…           45 M                    3 High School    
##  2 818770008 Existing Cust…           49 F                    5 Graduate       
##  3 713982108 Existing Cust…           51 M                    3 Graduate       
##  4 769911858 Existing Cust…           40 F                    4 High School    
##  5 709106358 Existing Cust…           40 M                    3 Uneducated     
##  6 713061558 Existing Cust…           44 M                    2 Graduate       
##  7 810347208 Existing Cust…           51 M                    4 Unknown        
##  8 818906208 Existing Cust…           32 M                    0 High School    
##  9 710930508 Existing Cust…           37 M                    3 Uneducated     
## 10 719661558 Existing Cust…           48 M                    2 Graduate       
## # … with 10,117 more rows, and 15 more variables: Marital_Status <chr>,
## #   Income_Category <chr>, Card_Category <chr>, Months_on_book <dbl>,
## #   Total_Relationship_Count <dbl>, Months_Inactive_12_mon <dbl>,
## #   Contacts_Count_12_mon <dbl>, Credit_Limit <dbl>, Total_Revolving_Bal <dbl>,
## #   Avg_Open_To_Buy <dbl>, Total_Amt_Chng_Q4_Q1 <dbl>, Total_Trans_Amt <dbl>,
## #   Total_Trans_Ct <dbl>, Total_Ct_Chng_Q4_Q1 <dbl>,
## #   Avg_Utilization_Ratio <dbl>

More information about the dataset can be found here: https://www.kaggle.com/sakshigoyal7/credit-card-customers

Part 1

Question: Is the attrition rate related to income level?

To answer this question, create a summary table and one visualization. The summary table should have three columns: income category, existing customers, and attrited customers. The last two columns should give the number of customers in the respective category at each income level.

The visualization should show the relative proportion of existing and attrited customers at each income level.

For both the table and the visualization, make sure that income categories are presented in a meaningful order. For simplicity, you can eliminate the income level “Unknown” from your analysis.
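As a rough illustration of what "relative proportion" can look like in a plot, here is a minimal sketch on a made-up toy dataset. The data frame `toy` and its columns `size` and `status` are hypothetical and are not part of bank_churners. The sketch assumes ggplot2 is installed and uses `geom_bar(position = "fill")`, which is one common way to draw stacked bars scaled to proportions; other approaches are possible.

```r
library(ggplot2)

# Hypothetical toy data, NOT the bank_churners dataset.
# The factor levels are given explicitly so the bars appear in a meaningful order.
toy <- data.frame(
  size = factor(
    c("small", "large", "medium", "small", "large", "medium", "small"),
    levels = c("small", "medium", "large")
  ),
  status = c("kept", "kept", "returned", "returned", "kept", "kept", "kept")
)

# position = "fill" stacks the counts and rescales each bar to a total height of 1,
# so the fill colors show proportions rather than raw counts.
p <- ggplot(toy, aes(x = size, fill = status)) +
  geom_bar(position = "fill") +
  ylab("proportion")

p
```

Because every bar is rescaled to the same height, differences in the share of each `status` are easy to compare across categories even when the categories differ in size.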

Hints:

  1. To make sure that the income levels are in a meaningful order, use fct_relevel(). Note that arrange() will order based on factor levels if you arrange by a factor.

  2. To generate the summary table, you will have to use pivot_wider() at the very end of your processing pipeline.
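The two hints can be sketched on a small made-up dataset. Everything below (the tibble `toy`, its columns `size` and `status`, and the level names) is hypothetical and only illustrates the mechanics, assuming dplyr, tidyr, and forcats are installed. It is not meant as the solution for bank_churners.

```r
library(dplyr)
library(tidyr)
library(forcats)

# Hypothetical toy data, NOT the bank_churners dataset.
toy <- tibble(
  size   = c("small", "large", "medium", "small", "large", "medium", "small", "unknown"),
  status = c("kept", "kept", "returned", "returned", "kept", "kept", "kept", "kept")
)

toy_summary <- toy %>%
  # drop the category we do not want to analyze
  filter(size != "unknown") %>%
  # put the levels into a meaningful order instead of the alphabetical default
  mutate(size = fct_relevel(size, "small", "medium", "large")) %>%
  # count rows per combination; the result is sorted by factor level
  count(size, status) %>%
  # spread the status values into their own columns, one row per size;
  # combinations that never occur are filled with 0
  pivot_wider(names_from = status, values_from = n, values_fill = 0)

toy_summary
```

The result has one row per `size`, in the order small, medium, large, and one count column per `status` value. Calling `pivot_wider()` last keeps the earlier steps in long format, where filtering and counting are simplest.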

Introduction: Your introduction here.

Approach: Your approach here.

Analysis:

# Your R code here

Discussion: Your discussion of results here.

Part 2

Question: Your question here.

Introduction: Your introduction here.

Approach: Your approach here.

Analysis:

# Your R code here

Discussion: Your discussion of results here.