Daily learning records - reading custom data sets
2022-04-22 06:07:00 【Lithium salt block】
Reading a custom dataset with sklearn
import csv
from sklearn.utils import Bunch


# Read the watermelon dataset
def readWatermelonDataSet():
    FeatureNames = []
    FeatureList = []
    LabelList = []
    # Raw string so the backslashes in the Windows path are taken literally
    with open(r"E:\My Word\study\RL0314\data.csv", "r") as ifile:
        reader = csv.reader(ifile)
        cnt = 0
        for row in reader:
            if cnt == 0:  # Header row: read the attribute names
                FeatureNames = row[1:len(row) - 1]
            else:  # Data rows: read the features and the label
                FeatureList.append(row[1:len(row) - 1])
                LabelList.append(row[len(row) - 1])
            cnt = cnt + 1
    print(FeatureNames)
    print(FeatureList)
    print(LabelList)
    return Bunch(
        data=FeatureList,
        target=LabelList,
        feature_names=FeatureNames,
    )
Note: to pass the data directly to sklearn's algorithms, the dataset must be numeric. If you add the other (non-numeric) columns of the watermelon dataset, later steps will raise errors, so those columns need to be preprocessed first.
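One common way to do that preprocessing is one-hot encoding with pandas. The following is only a minimal sketch: the column names `color` and `density` and all values are hypothetical and are not taken from the dataset used in this post.

```python
import pandas as pd

# Hypothetical rows mixing a categorical and a numeric column
# (illustrative values only, not the post's data).
raw = pd.DataFrame(
    {
        "color": ["green", "black", "green", "white"],
        "density": [0.697, 0.774, 0.634, 0.608],
    }
)

# One-hot encode the categorical column; numeric columns pass through unchanged.
encoded = pd.get_dummies(raw, columns=["color"], dtype=float)

print(encoded.columns.tolist())
print(encoded.to_numpy())
```

After encoding, every column is numeric, so the resulting array can be used as the `data` field of the `Bunch`.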
The dataset used here looks like this:
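Judging from the reader code above, the CSV has an ID column first, the feature columns in the middle, and the label in the last column. A minimal sketch of that layout follows; the column names and values are hypothetical stand-ins rather than the actual file, and `read_rows` is an illustrative helper that also converts the features to floats.

```python
import csv
import io

# Hypothetical sample in the layout the reader expects:
# an ID column, then feature columns, then the label in the last column.
SAMPLE_CSV = """id,density,sugar,label
1,0.697,0.460,1
2,0.774,0.376,1
3,0.666,0.091,0
"""


def read_rows(fileobj):
    """Split CSV rows into feature names, numeric feature rows, and labels."""
    reader = csv.reader(fileobj)
    header = next(reader)
    feature_names = header[1:-1]  # drop the ID column and the label column
    features, labels = [], []
    for row in reader:
        features.append([float(v) for v in row[1:-1]])
        labels.append(row[-1])
    return feature_names, features, labels


names, X, y = read_rows(io.StringIO(SAMPLE_CSV))
print(names)
print(X)
print(y)
```

Taking a file object instead of a hard-coded path makes the same function usable on a real file (`open(path, newline="")`) or on an in-memory string for a quick check.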

Complete code for generating the decision tree:
import csv
from sklearn.utils import Bunch
from sklearn import tree
from sklearn.model_selection import train_test_split
import pandas as pd
import graphviz
import os


# Read the watermelon dataset
def readWatermelonDataSet():
    FeatureNames = []
    FeatureList = []
    LabelList = []
    # Raw string so the backslashes in the Windows path are taken literally
    with open(r"E:\My Word\study\RL0314\data.csv", "r") as ifile:
        reader = csv.reader(ifile)
        cnt = 0
        for row in reader:
            if cnt == 0:  # Header row: read the attribute names
                FeatureNames = row[1:len(row) - 1]
            else:  # Data rows: read the features and the label
                FeatureList.append(row[1:len(row) - 1])
                LabelList.append(row[len(row) - 1])
            cnt = cnt + 1
    print(FeatureNames)
    print(FeatureList)
    print(LabelList)
    return Bunch(
        data=FeatureList,
        target=LabelList,
        feature_names=FeatureNames,
    )


def main():
    watermelon = readWatermelonDataSet()  # Watermelon data
    # Preview the features and labels side by side
    df = pd.concat([pd.DataFrame(watermelon.data), pd.DataFrame(watermelon.target)], axis=1)
    print(df)
    # 30% test set, 70% training set
    Xtrain, Xtest, Ytrain, Ytest = train_test_split(watermelon.data, watermelon.target, test_size=0.3)
    # Build the model
    clf = tree.DecisionTreeClassifier(criterion="entropy")  # Instantiate a classification tree
    clf = clf.fit(Xtrain, Ytrain)
    score = clf.score(Xtest, Ytest)
    print(score)  # Accuracy on the test set
    dot_data = tree.export_graphviz(
        clf,
        feature_names=watermelon.feature_names,
        # class_names must follow the ascending order of the label values
        class_names=["Good melon", "Bad melon"],
        filled=True,
        rounded=True,
        special_characters=True,
        fontname="Microsoft YaHei",
    )
    graph = graphviz.Source(dot_data)
    # Make the Graphviz binaries discoverable before rendering
    os.environ["PATH"] += os.pathsep + 'D:/DiyProgram/graphviz/bin/'
    graph.render("watermelon1", view=True)


if __name__ == "__main__":
    main()
Running results:
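To inspect a fitted tree without installing Graphviz, `sklearn.tree.export_text` returns the same split rules as plain text. This is a minimal sketch using hypothetical numeric data rather than the watermelon CSV; the feature names `density` and `sugar` and the fixed `random_state` are assumptions made for the example.

```python
from sklearn import tree

# Hypothetical numeric features [density, sugar] and labels (1 = good, 0 = bad).
X = [
    [0.70, 0.46],
    [0.77, 0.38],
    [0.63, 0.26],
    [0.67, 0.09],
    [0.24, 0.27],
    [0.34, 0.10],
]
y = [1, 1, 1, 0, 0, 0]

# Same criterion as the listing above; random_state fixed for repeatability.
clf = tree.DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X, y)

# export_text returns the tree's split rules as an indented string.
report = tree.export_text(clf, feature_names=["density", "sugar"])
print(report)
```

The text report is handy for a quick check on machines where the Graphviz binaries are not on `PATH`.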


Copyright notice
This article was written by [Lithium salt block]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204220539556775.html