Loan Risk Prediction with Keras Logistic Regression
2022-04-22 21:20:00 【洗剪吹队长】
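Analytics Vidhya project: loan risk prediction
Dream Housing Finance handles all kinds of home loans. The company operates across urban, semi-urban, and rural areas. A customer first applies for a home loan, and the company then validates the customer's eligibility for it.
Problem
The company wants to automate the loan eligibility process in real time, based on the customer details provided in the online application form. These details include gender, marital status, education, number of dependents, income, loan amount, credit history, and others. To automate the process, the company has posed the problem of identifying the customer segments that are eligible for a loan amount, so that it can specifically target those customers. A partial dataset is provided for this purpose.
Obtaining and analyzing the raw data
Dataset: https://datahack.analyticsvidhya.com/contest/practice-problem-loan-prediction-iii/#ProblemStatement
Register and download the training set file and the to-be-predicted (test) set file.
Inspecting the data shows missing values. Removing every row that contains a NaN with df.dropna() leaves only about 2/3 of the data, so dropping rows is not an option; the missing values have to be filled column by column.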
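One way to check how much data df.dropna() would discard is to count the missing values per column and compare row counts before and after. The sketch below uses a small made-up DataFrame; the values are invented for illustration and only the column names match the contest data.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the training set; the values are invented for illustration.
toy = pd.DataFrame({
    'Gender': ['Male', None, 'Female', 'Male'],
    'LoanAmount': [128.0, 66.0, np.nan, 120.0],
    'Credit_History': [1.0, 1.0, 0.0, 1.0],
})

# Number of missing values in each column
missing_per_column = toy.isna().sum()
print(missing_per_column)

# Fraction of rows that would survive dropna()
kept_fraction = len(toy.dropna()) / len(toy)
print(kept_fraction)
```

If the surviving fraction is low, as it is for this dataset, filling each column separately keeps more rows for training.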

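Preprocessing the data
First merge the training set and the to-be-predicted set so that the variables are re-encoded consistently.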
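The post does not show the merge itself. A minimal sketch of it, assuming pd.concat is used, follows. The two small frames are placeholders for the downloaded files, which would normally be loaded with pd.read_csv, and the names train_df, test_df, and n_train are hypothetical.

```python
import pandas as pd

# Toy stand-ins for the downloaded files (hypothetical values).
train_df = pd.DataFrame({
    'Loan_ID': ['LP1', 'LP2'],
    'Gender': ['Male', 'Female'],
    'Loan_Status': ['Y', 'N'],
})
test_df = pd.DataFrame({
    'Loan_ID': ['LP3'],
    'Gender': ['Male'],
})

# Stack the training rows on top of the rows to be predicted.
# ignore_index=True yields one continuous 0..n-1 index, so the first
# len(train_df) rows are exactly the training set.
df = pd.concat([train_df, test_df], ignore_index=True)
n_train = len(train_df)

print(df.shape)
print(df['Loan_Status'].isna().tolist())
```

The rows to be predicted have no Loan_Status, so that column is NaN for them after the merge. In the real data the boundary n_train is 614, the number used later to split the array again.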
import numpy as np
import pandas as pd
# Map categorical values to 0/1 and fill missing values with the (approximate) column mean
df['Gender'] = df['Gender'].map({'Male': 0, 'Female': 1}).fillna(0.5)  # inspect the distribution with df['Gender'].value_counts()
df['Married'] = df['Married'].map({'No': 0, 'Yes': 1}).fillna(0.5)
df['Dependents'] = df['Dependents'].fillna('0')  # stays a string category; one-hot encoded below
df['Self_Employed'] = df['Self_Employed'].map({'No': 0, 'Yes': 1}).fillna(0.11)
df['LoanAmount'] = df['LoanAmount'].fillna(df['LoanAmount'].mean())
df['Loan_Amount_Term'] = df['Loan_Amount_Term'].fillna(df['Loan_Amount_Term'].mean())
df['Credit_History'] = df['Credit_History'].fillna(0.83)
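# Scale values into the range 0 to 1 by dividing by the column maximum
df['ApplicantIncome'] = df['ApplicantIncome']/df['ApplicantIncome'].max()
df['CoapplicantIncome'] = df['CoapplicantIncome']/df['CoapplicantIncome'].max()
df['LoanAmount'] = df['LoanAmount']/df['LoanAmount'].max()
df['Loan_Amount_Term'] = df['Loan_Amount_Term']/df['Loan_Amount_Term'].max()
# One-hot encode the nominal and ordinal columns
x_OneHot_df = pd.get_dummies(data=df, columns=['Dependents', 'Education', 'Property_Area'])
# Convert the pandas DataFrame to a NumPy array
ndarray = x_OneHot_df.values
# Split the data into features (X) and target (Y)
X = np.delete(ndarray, (0, 9), axis=1).astype('float32')  # drop the ID column and the target column, then cast to float
Y = (ndarray[:614, 9] == 'Y').astype('float32')  # 614 is the boundary between the training set and the to-be-predicted set; 'Y' (approved) becomes 1
pre_X = X[614:]
X = X[:614]
print(X.shape, Y.shape, pre_X.shape)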
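The one-hot step can be illustrated on a small made-up frame. Each listed column is replaced by one indicator column per distinct value. In the full dataset, Dependents, Education, and Property_Area have 4, 2, and 3 distinct values respectively, which together with the 8 remaining feature columns would account for the 17 model inputs. The toy values below are invented for illustration.

```python
import pandas as pd

# Toy frame with one numeric and two categorical columns (hypothetical values).
toy = pd.DataFrame({
    'ApplicantIncome': [0.2, 0.5, 1.0],
    'Education': ['Graduate', 'Not Graduate', 'Graduate'],
    'Property_Area': ['Urban', 'Rural', 'Semiurban'],
})

# Replace each categorical column by one indicator column per distinct value
encoded = pd.get_dummies(data=toy, columns=['Education', 'Property_Area'])
print(list(encoded.columns))

# Cast everything to float32 so the matrix can be fed to a neural network
features = encoded.to_numpy(dtype='float32')
print(features.shape)
```

Within each encoded group exactly one indicator is set per row, so the two Education columns, for example, always sum to 1.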
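Creating and training the model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# Two small hidden layers followed by a sigmoid output for binary classification
model = Sequential([
    Dense(8, activation='relu', input_dim=17),
    Dense(8, activation='relu'),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
# Hold out 30% of the training rows for validation
history = model.fit(x=X, y=Y, validation_split=0.3, epochs=30, batch_size=16)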
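Note that validation_split=0.3 does not sample rows at random: Keras sets aside the last fraction of the arrays, before any shuffling, as validation data. The sketch below estimates the resulting row counts for the 614 training rows, assuming floor-based rounding of the split point. The helper split_counts is hypothetical and not part of Keras.

```python
import math

def split_counts(n_samples, validation_split):
    """Return (n_train, n_val) when the last fraction of rows is held out."""
    # Rows before split_at are used for training, the rest for validation
    split_at = int(math.floor(n_samples * (1.0 - validation_split)))
    return split_at, n_samples - split_at

n_train, n_val = split_counts(614, 0.3)
print(n_train, n_val)
```

Because the held-out rows are simply the tail of the array, it may be worth shuffling the rows beforehand if the file has any meaningful ordering.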
After tuning, the training process is plotted as follows
import matplotlib.pyplot as plt
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss=history.history['loss']
val_loss=history.history['val_loss']
epochs_range = range(30)
plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

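Predicting and saving the results
pre_Y = model.predict(pre_X)
result = []
# Threshold the predicted probability at 0.5
for p in pre_Y.ravel():
    if p < 0.5:
        result.append('Rejected')
    else:
        result.append('Approved')
df.iloc[614:, -1] = result  # Loan_Status is the last column of df
pre_df = df.iloc[614:]
print(pre_df['Loan_Status'].value_counts())
pre_df.to_excel('pre_result.xlsx')
# Approved 299
# Rejected 68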
Copyright notice
This article was written by [洗剪吹队长]. Please include the original link when reposting. Thanks.
https://blog.csdn.net/xx1132856201/article/details/106155173