WoE & IV
Task: Import the attached woe_iv.csv file, then add a new variable maturity.g to the imported data frame db, defined by grouping the values of the variable maturity into 5 groups of approximately equal size (by number of observations). Then:
calculate the WoE and IV of the new variable maturity.g with respect to the binary dependent variable bo;
calculate the WoE and IV of the new variable maturity.g with respect to the continuous dependent variable co.
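For the binary case, the quantities asked for can be written compactly. The notation here is an assumption introduced for illustration: $g_i$ and $b_i$ are the numbers of good (bo = 0) and bad (bo = 1) observations in group $i$, $G$ and $B$ are their totals over all $k$ groups. This matches the natural-log definition used in the R code on this page (woe = log(dist.g / dist.b)); other sources sometimes flip the ratio, which only changes the sign of WoE and leaves IV unchanged.

$$
\mathrm{WoE}_i = \ln\!\left(\frac{g_i / G}{b_i / B}\right), \qquad
\mathrm{IV} = \sum_{i=1}^{k} \left(\frac{g_i}{G} - \frac{b_i}{B}\right)\mathrm{WoE}_i
$$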
> # run the following commands if the packages are not already installed
> #install.packages("Hmisc")
> #install.packages("dtplyr")
> #install.packages("dplyr")
> library(Hmisc)
> library(dtplyr)
> library(dplyr)
>
> # import the woe_iv.csv file
> db <- read.csv("woe_iv.csv", header = TRUE)
> str(db)
'data.frame':	10000 obs. of  3 variables:
 $ bo      : int  0 0 0 0 0 0 0 0 0 0 ...
 $ co      : num  0.1361 0.0941 0.0847 0.0122 0.0122 ...
 $ maturity: int  18 9 12 12 12 10 8 6 18 24 ...
> # bo - good (0) / bad (1) indicator
> table(db$bo)
   0    1
9500  500
> # create loan maturity groups
> db$maturity.g <- cut2(db$maturity, g = 5)
> # create a lazy data.table object (dtplyr)
> db <- lazy_dt(db)
> db
Source: local data table [10,000 x 4]
Call: `_DT1`
     bo     co maturity maturity.g
  <int>  <dbl>    <int> <fct>
1     0 0.136        18 [14,22)
2     0 0.0941        9 [ 4,11)
3     0 0.0847       12 [11,14)
4     0 0.0122       12 [11,14)
5     0 0.0122       12 [11,14)
6     0 0.0122       10 [ 4,11)
# ... with 9,994 more rows
# Use as.data.table()/as.data.frame()/as_tibble() to access results
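The grouping step above relies on the `g` argument of `cut2()`, which requests a number of quantile groups. As a rough illustration of the same idea in base R, the sketch below cuts a vector at its quintiles. The data are simulated and hypothetical (not the woe_iv.csv values), and `cut2()` may pick its cut points differently, so the group boundaries are not expected to match.

```r
# Hypothetical maturities for illustration only (not the woe_iv.csv data)
set.seed(1)
maturity <- sample(4:72, size = 1000, replace = TRUE)

# Quintile cut points; unique() guards against tied quantiles
brks <- unique(quantile(maturity, probs = seq(0, 1, by = 0.2)))

# Left-closed intervals, with the last interval closed on both sides
grp <- cut(maturity, breaks = brks, include.lowest = TRUE, right = FALSE)

# Group sizes should be roughly, not exactly, equal because of ties
table(grp)
```

With integer-valued data such as loan maturities, ties at the cut points are the reason the groups come out only approximately equal in size.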
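> # per-group counts: observations (no), goods (ng), bads (nb)
> bo.s <- db %>%
+   group_by(maturity.g) %>%
+   summarise(no = n(),
+             ng = sum(bo %in% 0),
+             nb = sum(bo)) %>%
+   mutate(dr = nb / no) %>%                # default rate per group
+   ungroup() %>%
+   mutate(so = sum(no),                    # totals over all groups
+          sg = sum(ng),
+          sb = sum(nb),
+          dist.g = ng / sg,                # group share of all goods
+          dist.b = nb / sb,                # group share of all bads
+          woe = log(dist.g / dist.b),      # weight of evidence
+          iv.c = (dist.g - dist.b) * woe,  # group contribution to IV
+          iv.s = sum(iv.c))                # total information value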
> as.data.frame(bo.s)
  maturity.g   no   ng  nb         dr    so   sg  sb    dist.g dist.b
1    [ 4,11) 2009 1961  48 0.02389248 10000 9500 500 0.2064211  0.096
2    [11,14) 2024 1942  82 0.04051383 10000 9500 500 0.2044211  0.164
3    [14,22) 2174 2058 116 0.05335787 10000 9500 500 0.2166316  0.232
4    [22,26) 1856 1772  84 0.04525862 10000 9500 500 0.1865263  0.168
5    [26,72] 1937 1767 170 0.08776458 10000 9500 500 0.1860000  0.340
          woe        iv.c      iv.s
1  0.76556984 0.084535027 0.1893244
2  0.22031542 0.008905381 0.1893244
3 -0.06853925 0.001053340 0.1893244
4  0.10460835 0.001938007 0.1893244
5 -0.60319894 0.092892637 0.1893244
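As a cross-check, the WoE and IV figures for bo can be recomputed in base R from the per-group good and bad counts alone. This is a minimal sketch, not part of the original solution: the counts are copied by hand from the ng and nb columns of the output above, and the vector names simply mirror the column names.

```r
# Good (ng) and bad (nb) counts per maturity group, copied from the output above
grp <- c("[ 4,11)", "[11,14)", "[14,22)", "[22,26)", "[26,72]")
ng  <- c(1961, 1942, 2058, 1772, 1767)
nb  <- c(48, 82, 116, 84, 170)

dist.g <- ng / sum(ng)            # share of all goods falling in each group
dist.b <- nb / sum(nb)            # share of all bads falling in each group
woe    <- log(dist.g / dist.b)    # weight of evidence per group
iv.c   <- (dist.g - dist.b) * woe # contribution of each group to IV
iv     <- sum(iv.c)               # information value of maturity.g

data.frame(maturity.g = grp, woe = round(woe, 4), iv.c = round(iv.c, 4))
round(iv, 4)
```

Under this definition a positive WoE marks a group with relatively more goods than bads. Here the shortest maturities, [ 4,11), have the highest WoE and the longest, [26,72], the lowest, which agrees with the default rates in the dr column.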