Use Case 3
Štefan Lyócsa
Masaryk University
FinTech Management

Outline
• Review
• Lasso logistic regression
• Vertex-level attributes
• Network Logistic regression model
• Network Logistic Lasso regression model

Goal
The goal is to predict loan default directly (1 = loan defaulted, 0 = otherwise). A loan is considered defaulted if its return is negative. For this purpose we use logistic regression. We use a factor network model to extract network variables, hoping to improve the prediction of default.
• We create an adjacency matrix.
• We calculate vertex-level network variables.
• We augment logistic regression models with the new variables.
• We compare the forecasting accuracy.

Review
Recall that in logistic regression we model the logarithm of the odds, which we refer to as logit(p_i):
$$\log(ODDS_i) = \log\left(\frac{p_i}{1-p_i}\right) = \operatorname{logit}(p_i) = \beta_0 + \beta_1 x_i$$
Coefficients are estimated as:
$$\max_{\beta_0,\beta_1} LL(\beta_0,\beta_1) = \sum_{i=1}^{n}\left[y_i\log(p_i) + (1-y_i)\log(1-p_i)\right]$$

We estimate the logistic regression on our P2P dataset, as before.
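To make the log-likelihood maximisation recalled above concrete, here is a minimal sketch on simulated data (not the P2P dataset). The names `x`, `y`, `loglik`, `fit_optim` and `fit_glm` are hypothetical and chosen for illustration only. The sketch maximises LL numerically and compares the result with `glm`.

```r
# Simulated data: one predictor x and a binary outcome y (illustrative only)
set.seed(1)
n <- 500
x <- rnorm(n)
p <- 1 / (1 + exp(-(-0.5 + 1.2 * x)))   # true default probabilities
y <- rbinom(n, size = 1, prob = p)

# Log-likelihood LL(beta0, beta1) of the logistic regression
loglik <- function(beta) {
  eta <- beta[1] + beta[2] * x          # logit(p_i) = beta0 + beta1 * x_i
  p_i <- 1 / (1 + exp(-eta))            # inverse logit
  sum(y * log(p_i) + (1 - y) * log(1 - p_i))
}

# Maximise LL numerically (optim minimises, hence the sign flip)
fit_optim <- optim(c(0, 0), function(b) -loglik(b), method = "BFGS")

# Reference fit with glm
fit_glm <- glm(y ~ x, family = binomial(link = "logit"))

beta_optim <- fit_optim$par
beta_glm <- unname(coef(fit_glm))
print(rbind(optim = beta_optim, glm = beta_glm))
```

Both approaches should give nearly identical coefficients, since `glm` with a logit link maximises the same log-likelihood.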
Load the dataset again.
Number of loans for which we want to predict default:
• NF = 500
• N = dim(DT)[1]
Storing the dependent variable for each of the samples:
• DT$DEF = (DT$RR2<0)*1
The sample used to estimate the model:
• S1 = DT[1:(N-NF),]
The sample used to predict loan default out-of-sample:
• S2 = DT[(N-NF+1):N,]

Estimate the standard model:
• m4 = glm(formula = DEF ~ new+ver3+ver4+lfi+lee+luk+lrs+ls…
    female+lamt+int+durm+educprim+educbasic+educvocat+educsec…
    espem+esfue+essem+esent+esret+dures+exper+linctot+noliab+…
    lamntplr+lamteprl+nopearlyrep,
    family = binomial(link = "logit"), data = S1)
• summary(m4)
Predict defaults:
• ypred = predict(m4, newdata = S2)
• ypred = exp(ypred)/(1+exp(ypred))
• ytrue = S2$DEF

Evaluate default predictions
• True positive rate (TPR): the ratio of the number of correctly predicted defaults to the number of true defaults. A large TPR means we are good at predicting defaults.
• False positive rate (FPR): the ratio of the number of wrongly predicted defaults to the number of true non-defaults. A low FPR means we are good at predicting non-defaults.
The higher the TPR and the lower the FPR the better, thus we wish to maximize the ratio TPR/FPR. Plotting TPR (y-axis) against FPR (x-axis) gives the receiver operating characteristic curve, the so-called ROC.
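The TPR and FPR definitions above can be sketched in base R as follows. The vectors `ytrue_toy` and `ypred_toy` and the helper `rates` are hypothetical and only serve as a toy illustration; they are not part of the P2P workflow. Sweeping the cutoff traces the ROC curve, and the trapezoidal rule gives the area under it.

```r
# Toy example (hypothetical values, not the P2P data)
ytrue_toy <- c(1, 0, 1, 1, 0, 0, 1, 0)                   # 1 = default
ypred_toy <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1)   # predicted probabilities

# FPR and TPR when a default is predicted for probabilities >= cutoff
rates <- function(cutoff, ytrue, ypred) {
  yhat <- as.numeric(ypred >= cutoff)
  tpr <- sum(yhat == 1 & ytrue == 1) / sum(ytrue == 1)
  fpr <- sum(yhat == 1 & ytrue == 0) / sum(ytrue == 0)
  c(FPR = fpr, TPR = tpr)
}

rates(0.5, ytrue_toy, ypred_toy)   # FPR = 0.25, TPR = 0.75

# Sweep the cutoff from high to low to trace the ROC curve
cutoffs <- c(Inf, sort(unique(ypred_toy), decreasing = TRUE))
roc_pts <- t(sapply(cutoffs, rates, ytrue = ytrue_toy, ypred = ypred_toy))

# Area under the ROC curve by the trapezoidal rule
auc_toy <- sum(diff(roc_pts[, "FPR"]) *
               (head(roc_pts[, "TPR"], -1) + tail(roc_pts[, "TPR"], -1)) / 2)
auc_toy                            # 0.6875
```

In practice a dedicated package handles ties and plotting; the sketch only shows where the points on the curve come from.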
• library(pROC)
• roc_obj <- roc(ytrue, ypred)
• plot(roc_obj, legacy.axes = TRUE, xlab = "False positive rate", ylab = "True positive rate")
Area under the curve (AUC):
• LOGIT = auc(roc_obj)

Lasso logistic regression
In the standard logistic model, the coefficients are estimated as:
$$\max_{\beta_0,\beta} \sum_{i=1}^{n}\left[y_i\log(p_i) + (1-y_i)\log(1-p_i)\right]$$
In the LASSO logistic model, the coefficients are estimated as:
$$\max_{\beta_0,\beta} \sum_{i=1}^{n}\left[y_i\log(p_i) + (1-y_i)\log(1-p_i)\right] - \lambda\sum_{j=1}^{k}|\beta_j|$$
where k is the number of explanatory variables and λ ≥ 0 controls the strength of the penalty.

Preparing data
Define the matrices of input and output variables:
• indep = as.matrix(DT[1:(N-NF), c('new','ver3','ver4','lfi',…
    'undG','female','lamt','int','durm','educprim','educbasic',
    'educvocat','educsec','msmar','msco','mssi','msdi','nrodep',
    'espem','esfue','essem','esent','esret','dures','exper',
    'linctot','noliab','lliatot','norli','noplo','lamountplo',
    'lamntplr','lamteprl','nopearlyrep','Deg','Hac',paste('g',…
• dep = DT[1:(N-NF),'DEF']
• pred = as.matrix(DT[(N-NF+1):N, c('new','ver3','ver4','lfi',…
    'undG','female','lamt','int','durm','educprim','educbasic',
    'educvocat','educsec','msmar','msco','mssi','msdi','nrodep',
    'espem','esfue','essem','esent','esret','dures','exper',
    'linctot','noliab','lliatot','norli','noplo','lamountplo',
    'lamntplr','lamteprl','nopearlyrep','Deg','Hac',paste('g',…
• ytrue = DT[(N-NF+1):N,'DEF']

Estimating the model:
• m5 = cv.glmnet(x=indep, y=dep, type.measure='auc', alpha=1…
• coef(m5, s = "lambda.1se")
Predicting defaults:
• ypred = predict(m5, newx=pred, s=m5$lambda.1se)
• ypred = exp(ypred)/(1+exp(ypred))
• roc_obj <- roc(ytrue, ypred)
• lines(roc_obj, col='red')
• LOGIT_LASSO = auc(roc_obj)

Create network in R
Import the data again (but do not delete anything from the previous Use Case). We arbitrarily select variables that we think might identify a bad loan:
• X = DT[,c('int','durm','linctot','noliab')]
Run the following function:
• AM = FN_SVD(X, p=0.75, gam=0.10)
• g = graph_from_adjacency_matrix(AM, mode = 'undirected', weighted = TRUE)
We can visualize the Network Factor Model:
• plot(g, graph = 'NFM', vertex.label=NA, vertex.size = 3, main = 'Network factor model of the P2P applicants networks')

Vertex-level attributes
• vertex degree,
• harmonic centrality,
• community detection (Louvain method).
To address the issue of isolated vertices, one can assume that the shortest distance between vertex i and an isolated vertex j is ∞, while conveniently assuming that 1/∞ = 0.
Harmonic centrality is therefore:
$$H_i = \sum_{j \neq i} \frac{1}{d(i,j)}$$
where d(i,j) is the length of the shortest path from vertex i to vertex j in the network.

Vertex-level attributes in R
The following function calculates centrality and community:
• NetDscr = BVC(g)
Now add the variables to the dataset:
• DT$Deg = NetDscr$VCentrality[,1]
• DT$Hac = NetDscr$VCentrality[,2]
• DT = data.frame(DT, NetDscr$Community)

Network Logistic regression model
• S1 = DT[1:(N-NF),]
• S2 = DT[(N-NF+1):N,]
• m6 = glm(formula = DEF ~ new+ver3+ver4+lfi+lee+luk+lrs+ls…
    female+lamt+int+durm+educprim+educbasic+educvocat+educsec…
    espem+esfue+essem+esent+esret+dures+exper+linctot+noliab+…
    lamntplr+lamteprl+nopearlyrep+Deg+Hac+g1+g2+g3+g4,
    family = binomial(link = "logit"), data = S1)
• summary(m6)
Predicting defaults:
• ypred = predict(m6, newdata = S2)
• ypred = exp(ypred)/(1+exp(ypred))
• ytrue = S2$DEF
• roc_obj <- roc(ytrue, ypred)
• lines(roc_obj, col='green')
• LOGIT_N = auc(roc_obj)

Network Logistic Lasso regression model
Preparing data
Define the matrices of input and output variables. The dependent variable is the default indicator DEF, as in the other logistic models:
• indep = as.matrix(DT[1:(N-NF), c('new','ver3','ver4','lfi',…
    'undG','female','lamt','int','durm','educprim','educbasic',
    'educvocat','educsec','msmar','msco','mssi','msdi','nrodep',
    'espem','esfue','essem','esent','esret','dures','exper',
    'linctot','noliab','lliatot','norli','noplo','lamountplo',
    'lamntplr','lamteprl','nopearlyrep','Deg','Hac',paste('g',…
• dep = DT[1:(N-NF),'DEF']
• pred = as.matrix(DT[(N-NF+1):N, c('new','ver3','ver4','lfi',…
    'undG','female','lamt','int','durm','educprim','educbasic',
    'educvocat','educsec','msmar','msco','mssi','msdi','nrodep',
    'espem','esfue','essem','esent','esret','dures','exper',
    'linctot','noliab','lliatot','norli','noplo','lamountplo',
    'lamntplr','lamteprl','nopearlyrep','Deg','Hac',paste('g',…
• ytrue = DT[(N-NF+1):N,'DEF']

Estimating the model:
• m7 = cv.glmnet(x=indep, y=dep, type.measure='auc', alpha=1…
• coef(m7, s = "lambda.1se")
Predicting defaults:
• ypred = predict(m7, newx=pred, s=m7$lambda.1se)
• ypred = exp(ypred)/(1+exp(ypred))
• roc_obj <- roc(ytrue, ypred)
• lines(roc_obj, col='blue')
• LOGIT_N_LASSO = auc(roc_obj)

Comparing the forecasting accuracy:
• AUC = c(LOGIT, LOGIT_LASSO, LOGIT_N, LOGIT_N_LASSO)
• names(AUC) = c("Logit","Logit-L","Logit-N","Logit-NL")
• AUC = sort(AUC, decreasing=TRUE)
• cbind(AUC)

               AUC
Logit-N   0.8338517
Logit     0.8326703
Logit-NL  0.8305726
Logit-L   0.8300904
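As a supplement to the vertex-level attributes section, the following minimal sketch shows how vertex degree and harmonic centrality, with the convention 1/∞ = 0 for isolated vertices, could be computed in base R on a toy graph. The internals of `BVC` are not shown in these slides, so this is not its implementation; the adjacency matrix `A` and the names `deg`, `D` and `hac` are hypothetical.

```r
# Toy undirected, unweighted graph on 5 vertices (hypothetical):
# path 1 - 2 - 3 - 4, vertex 5 is isolated
A <- matrix(0, 5, 5)
edges <- rbind(c(1, 2), c(2, 3), c(3, 4))
A[edges] <- 1
A[edges[, 2:1]] <- 1          # make the adjacency matrix symmetric

# Vertex degree: number of neighbours
deg <- rowSums(A)

# All-pairs shortest paths (Floyd-Warshall); unreachable pairs stay Inf
D <- ifelse(A > 0, 1, Inf)
diag(D) <- 0
for (k in seq_len(nrow(D))) {
  D <- pmin(D, outer(D[, k], D[k, ], "+"))
}

# Harmonic centrality: sum of 1 / d(i, j) over j != i; 1 / Inf is 0 in R
invD <- 1 / D
diag(invD) <- 0
hac <- rowSums(invD)

cbind(Deg = deg, Hac = hac)
```

The isolated vertex 5 gets degree 0 and harmonic centrality 0, while the two middle vertices of the path get the highest centrality (2.5).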