This project analyzes factors that may influence the probability of customer loan default using a real-world dataset from a financial institution. The analysis focuses on identifying trends and relationships across demographic and financial variables.
- Understand which customer attributes contribute to higher or lower loan default risk.
- Provide insights that can help credit divisions make more informed lending decisions.
- Apply data cleaning, transformation, and exploratory analysis techniques.
- Number of Children: Are customers with more children more likely to default?
- Family Status: How does marital status influence default probability?
- Income Level: Do low-income customers default more often?
- Loan Purpose: Are loans for education or car purchases riskier?
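
Each of the questions above comes down to comparing the share of defaulting customers across the categories of one attribute. The following is a minimal sketch of that comparison in pandas, not the notebook's actual code. The column names (`children`, `family_status`, `debt`), the 0/1 encoding of the default flag, and the helper `default_rate_by` are all assumptions made for illustration, and the data is a made-up toy sample.

```python
import pandas as pd

# Toy sample for illustration only; the column names and the 0/1 "debt"
# flag (1 = defaulted) are assumptions, not taken from the source data.
df = pd.DataFrame({
    "children": [0, 0, 1, 1, 2, 2],
    "family_status": ["married", "single", "married", "single", "married", "married"],
    "debt": [0, 1, 0, 0, 1, 1],
})


def default_rate_by(frame, column, target="debt"):
    """Return customer count and default share for each category of `column`."""
    summary = frame.groupby(column)[target].agg(
        customers="count",     # group size, useful for judging reliability
        default_rate="mean",   # mean of a 0/1 flag is the share of defaults
    )
    return summary.sort_values("default_rate", ascending=False)


rates = default_rate_by(df, "children")
print(rates)
```

Reporting the group size next to the rate helps avoid over-reading categories with very few customers. The same helper can be called with `"family_status"` or any other categorical column.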
- Python (pandas, matplotlib, seaborn)
- Jupyter Notebook
- credit_analysis.ipynb: the main notebook containing all analysis steps
- credit_scoring_eng.csv: source data file
- Outliers and missing values were addressed using median imputation and basic filtering.
- The dataset was cleaned for inconsistencies like casing differences and invalid values (e.g., age = 0, children = -1).
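
The cleaning steps above could look roughly like the sketch below. It is an illustration under stated assumptions rather than the notebook's implementation: the column names (`education`, `children`, `age`, `total_income`) and the helper `clean` are hypothetical, the data is a toy sample, and dropping rows with invalid values is only one possible choice (the original analysis may have corrected them instead).

```python
import numpy as np
import pandas as pd

# Toy sample for illustration only; all column names are assumptions.
raw = pd.DataFrame({
    "education": ["Bachelor", "BACHELOR", "secondary", "Secondary"],
    "children": [1, -1, 0, 2],
    "age": [35, 40, 0, 51],
    "total_income": [30000.0, 28000.0, 25000.0, np.nan],
})


def clean(frame):
    """Normalize casing, filter invalid rows, and impute income with the median."""
    out = frame.copy()
    # Casing differences: treat "Bachelor" and "BACHELOR" as the same category.
    out["education"] = out["education"].str.strip().str.lower()
    # Basic filtering (assumed here): drop rows with children < 0 or age <= 0.
    out = out[(out["children"] >= 0) & (out["age"] > 0)].copy()
    # Median imputation: fill missing income with the median of the kept rows.
    out["total_income"] = out["total_income"].fillna(out["total_income"].median())
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

The median is used instead of the mean because income distributions are typically skewed, so a few very high earners would pull a mean-based fill value upward.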
Nabilla Hafsah Caesaredia