GitHub - Fedesgh/Building_Credit_Risk_Classifier_Using_Bagging_Kneighbors: Problem statement about modeling the target vector and an attempt to improve metrics

Motivation

The motivation for this repository is the set of difficulties the dataset presents when we define the target and the features. One of these problems is that several features leak information about the outcome.

There are several attempts on Kaggle with low metrics, particularly when the training set is restricted to features that were available before the loan was granted. We want to try to improve on them:

https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data

We use various data preprocessing techniques, such as SelectKBest with information value, binning, up-sampling with imbalanced-learn (imblearn), OneHotEncoder, and imputers.
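As an illustration of ranking features by information value, here is a minimal sketch in plain Python. It is not the notebook's implementation: the function names (`bin_by_edges`, `information_value`, `select_k_best`), the smoothing constant `eps`, and the toy data are all assumptions made for this example. It uses the standard definition, IV = sum over bins of (pct_good - pct_bad) * ln(pct_good / pct_bad).

```python
import bisect
import math
from collections import defaultdict


def bin_by_edges(values, edges):
    """Map each numeric value to a bin index given sorted bin edges."""
    return [bisect.bisect_right(edges, v) for v in values]


def information_value(bins, target, eps=0.5):
    """Information value of a binned feature against a binary target.

    bins   -- bin label for each row
    target -- 1 for a good loan, 0 for a bad loan, aligned with bins
    eps    -- smoothing count added to every cell to avoid log(0)
              (an assumption of this sketch, not taken from the repository)
    """
    good = defaultdict(float)
    bad = defaultdict(float)
    for b, y in zip(bins, target):
        if y == 1:
            good[b] += 1
        else:
            bad[b] += 1

    labels = set(good) | set(bad)
    total_good = sum(good.values()) + eps * len(labels)
    total_bad = sum(bad.values()) + eps * len(labels)

    iv = 0.0
    for b in labels:
        pct_good = (good[b] + eps) / total_good
        pct_bad = (bad[b] + eps) / total_bad
        iv += (pct_good - pct_bad) * math.log(pct_good / pct_bad)
    return iv


def select_k_best(binned_features, target, k):
    """Return the names of the k features with the highest information value."""
    scores = {
        name: information_value(bins, target)
        for name, bins in binned_features.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: one feature separates good from bad loans, the other does not.
target = [1, 1, 1, 1, 0, 0, 0, 0]
binned_features = {
    "informative": bin_by_edges([10, 12, 11, 13, 50, 55, 52, 60], edges=[30]),
    "uninformative": ["a", "b", "a", "b", "a", "b", "a", "b"],
}
best = select_k_best(binned_features, target, k=1)
```

With scikit-learn, a scoring function of this kind could be passed to `SelectKBest` as `score_func`, provided it is adapted to return one score per feature column.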

Problems at defining the target

loan_status (our target) has the following values:

Current
Fully Paid
Charged Off
Late (31-120 days)
In Grace Period
Does not meet the credit policy. Status:Fully Paid
Late (16-30 days)
Default
Does not meet the credit policy. Status:Charged Off

The main point to consider is that these values belong to different moments in the life span of a loan.

Values that mark the end of a loan:

Fully Paid
Charged Off
Does not meet the credit policy. Status:Fully Paid
Default
Does not meet the credit policy. Status:Charged Off

Values that belong to the middle of a loan's term:

Current
Late (31-120 days)
Late (16-30 days)

In Grace Period, in contrast, belongs to the beginning.

On top of this, we should consider that:

All loans, regardless of how they end, were at some earlier point "In Grace Period".

All loans, regardless of how they end, were at some earlier point Current and/or Late.

Our target

"Good loans" (1):

Fully Paid

"Bad loans" (0):

Charged Off
Does not meet the credit policy. Status:Fully Paid
Default
Does not meet the credit policy. Status:Charged Off

We consider only the end-of-loan categories in the target, and the X_train set should contain only features whose values were known before the loan was granted.
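The target definition above can be sketched as follows. This is an illustrative example rather than the notebook's code: the helper `build_dataset`, the whitelist `PRE_GRANT_FEATURES`, and the sample rows are assumptions, and the column names other than `loan_status` are assumed to match the dataset.

```python
# End-of-loan statuses, labeled as described in "Our target".
GOOD_STATUSES = {"Fully Paid"}
BAD_STATUSES = {
    "Charged Off",
    "Default",
    "Does not meet the credit policy. Status:Fully Paid",
    "Does not meet the credit policy. Status:Charged Off",
}

# Hypothetical whitelist of features known before the loan was granted.
# Columns filled in after origination (for example payment totals) are left
# out so they cannot leak the outcome into the training set.
PRE_GRANT_FEATURES = ["loan_amnt", "annual_inc", "dti"]


def build_dataset(rows):
    """Return (X, y), keeping only finished loans and pre-grant features."""
    X, y = [], []
    for row in rows:
        status = row["loan_status"]
        if status in GOOD_STATUSES:
            label = 1
        elif status in BAD_STATUSES:
            label = 0
        else:
            # Current, Late and In Grace Period loans have not ended yet,
            # so their final outcome is unknown and they are skipped.
            continue
        X.append({name: row.get(name) for name in PRE_GRANT_FEATURES})
        y.append(label)
    return X, y


# Made-up rows for illustration only.
rows = [
    {"loan_status": "Fully Paid", "loan_amnt": 5000, "annual_inc": 40000,
     "dti": 12.5, "total_pymnt": 5600},
    {"loan_status": "Current", "loan_amnt": 8000, "annual_inc": 52000,
     "dti": 20.1, "total_pymnt": 1200},
    {"loan_status": "Charged Off", "loan_amnt": 3000, "annual_inc": 25000,
     "dti": 30.0, "total_pymnt": 700},
]
X, y = build_dataset(rows)
```

Here the Current loan is dropped, and `total_pymnt` never reaches `X` because it is only known after the loan was granted.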

Folders and files (history: 11 commits):

venvcredit
First_model_strict.ipynb
LICENSE
README.md
result.jpg
