[Qixi Festival: fleecing couples based on the music they listen to] Does background music affect a couple's choice of wine?
2022-08-05 00:59:00 【sunny day qt01】
Qixi Festival special column
Introduction
Qixi is here, and couples are filling up the high-end hotels. I used to work in hotel sales, and sales at this time of year are generally good. But how do you stand out from the competition? The answer has to do with the background music in the bar. Sometimes couples order no music, sometimes they order French music (French accordion), and some order Italian music (Italian accordion). The wines on sale are French, Italian, and other.
How do we fleece these young lovers? That calls for feature selection, a technique from data mining.
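Feature selection methods
Invalid variables
Irrelevant variables, redundant variables
Statistical feature selection
Only a few methods are covered here:
Variance thresholding, chi-square test, ANOVA and t-test, Pearson correlation coefficient
Selection among highly correlated features (redundant variables)
Model-based feature selection
Decision trees, logistic regression, random forests, XGBoost
The model selects the variables automatically.
Recursive feature selection
Features are eliminated step by step until their number falls within a specified range.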

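When the number of inputs grows, the amount of data must grow too, otherwise the model becomes unstable.
Invalid variables
Irrelevant variables, redundant variables

Redundancy: when two variables are too highly correlated, they probably describe nearly the same concept, so one of them is redundant. They can be merged, or one field can even be dropped, because the two carry largely the same information.
Irrelevancy: take X4 and X3 as examples. When X4 grows, the target value visibly changes. When X3 changes, the predicted value is random, so X3 is irrelevant and contributes no information.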

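Statistical feature selection
VT (variance thresholding): compute the variance of each numeric field. If it falls below some value, the field carries too little information.
The field must not be standardized before its variance is computed. After Z-score standardization, for example, every field has variance 1 and mean 0.
A threshold has to be chosen to decide whether the field is dropped.
Binary variables: encode one value as 1 and the other as 0, and the variance is P(1-P) (do this feature transformation first).

The larger the variance, the more information the field carries. For a binary field the maximum is 0.25, reached at P = 0.5.
Of course, this measure has nothing to do with the target.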
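The variance-threshold rule above can be sketched in plain Python. This is a minimal illustration rather than the author's code: the column names, the values, and the 0.05 threshold are all assumptions made for the example.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical toy columns; names and values are made up for illustration.
columns = {
    "almost_constant": [1.0, 1.0, 1.0, 1.0, 1.1],
    "spread_out": [1.0, 5.0, 9.0, 2.0, 7.0],
    "is_member": [1.0, 0.0, 1.0, 1.0, 0.0],  # binary field encoded as 0/1
}

THRESHOLD = 0.05  # assumed cut-off; it has to be chosen per data set


def variance_filter(cols, threshold):
    """Return the names of the columns whose population variance exceeds threshold."""
    return [name for name, values in cols.items() if pvariance(values) > threshold]


def standardize(values):
    """Z-score standardization: subtract the mean, divide by the standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


kept = variance_filter(columns, THRESHOLD)

# For a 0/1 field the variance equals p * (1 - p), where p is the share of ones.
p = sum(columns["is_member"]) / len(columns["is_member"])
binary_variance = pvariance(columns["is_member"])

# After standardization every column has variance 1, so the filter is useless.
standardized_variance = pvariance(standardize(columns["spread_out"]))

print(kept, binary_variance, standardized_variance)
```

The last line is why the raw values have to be used: once a field is standardized its variance is 1 by construction, so the threshold can no longer separate informative fields from near-constant ones.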
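Pearson correlation coefficient:
Selection among highly correlated features (redundant variables):
Highly correlated fields turn up often, and the information they carry is duplicated. Use the Pearson correlation coefficient to check how strongly two fields are correlated. If it is greater than 0.95, drop one of them.
To decide which one to keep, compare how strongly variable 1 and variable 2 are each related to the target.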
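As a rough sketch of this rule, the snippet below computes the Pearson coefficient by hand and, for each pair above 0.95, keeps whichever field is more correlated with the target. The function names `pearson` and `drop_redundant`, the field names, and the numbers are hypothetical and exist only for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def drop_redundant(features, target, limit=0.95):
    """For each pair with |r| > limit, drop the field less correlated with target."""
    kept = list(features)
    for a in features:
        for b in features:
            # a < b visits every unordered pair of names exactly once.
            if a < b and a in kept and b in kept:
                if abs(pearson(features[a], features[b])) > limit:
                    ra = abs(pearson(features[a], target))
                    rb = abs(pearson(features[b], target))
                    kept.remove(a if ra < rb else b)
    return kept


# Hypothetical data: two nearly proportional price fields and an unrelated one.
features = {
    "price_eur": [10, 20, 30, 40, 50],
    "price_usd": [11, 22, 33, 44, 56],
    "age": [30, 25, 41, 22, 35],
}
target = [1, 2, 3, 4, 6]

kept = drop_redundant(features, target)
print(kept)
```

Here the two price fields are almost perfectly correlated, so only the one that tracks the target slightly better survives, while the unrelated field is left alone.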
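Statistical tests:
These examine the relationship between an input field and the target field.
Categorical input field: the chi-square test measures the association between the input field and the target field.
Numeric input field: ANOVA (when the target field has more than two values) or the t-test (when the target field has only two values, such as yes or no) tests the association between the input field and the target field.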
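For the numeric case, the one-way ANOVA F statistic can be computed by hand as a quick illustration. The sketch assumes a numeric input field split into groups by the value of a three-class target. The function name `anova_f` and the sample values are invented for the example.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of numeric samples, one per class."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical numeric input values, grouped by the three target classes.
groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]

f_stat = anova_f(groups)
print(f_stat)
```

A large F means the group means differ by much more than the scatter inside each group, which suggests the input field is related to the target. The statistic is then compared against an F table with (k - 1, N - k) degrees of freedom.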
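Example (a chi-square test, since both fields are categorical): does background music affect customers' mood? We look at the relationship between music (the input field) and wine purchases.
Music: none, French accordion, Italian accordion
Wine: French, Italian, other
Statistic

For every cell, take the actual sales count minus the expected count, square the difference, divide by the expected count, and sum over all cells: chi-square = Σ (observed - expected)² / expected.


The second table holds the expected frequencies. Assuming music and wine are independent, each expected count is the probability of the music type times the probability of the wine type, times the total of 243.
Subtract the expected table from the observed table cell by cell, square each difference, divide by the expected count, and add everything up.
The larger the resulting value, the stronger the evidence of an association. The critical values to compare against can be found in a chi-square table.
First compute the chi-square value, then look up the corresponding probability in the table. If it is below the 0.05 significance level, such data would be very unlikely if the two fields were unrelated, so the hypothesis that they are independent is rejected.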
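The calculation described above can be sketched as follows. The observed counts are illustrative values assumed for this example, not taken from the article's tables, which are not reproduced in the text. They were chosen only so that they total 243. The helper names are likewise hypothetical, and the closed-form p-value used here is valid only for an even number of degrees of freedom.

```python
from math import exp, factorial

# Illustrative counts (assumed). Rows are wine types, columns are music types:
# no music, French accordion, Italian accordion.
observed = [
    [30, 39, 30],  # French wine
    [11, 1, 19],   # Italian wine
    [43, 35, 35],  # other wine
]


def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    exp_table = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, exp_table)
        for o, e in zip(obs_row, exp_row)
    )


def p_value_even_df(stat, df):
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    half = stat / 2
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))


stat = chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p = p_value_even_df(stat, df)
print(round(stat, 2), df, p)
```

With these counts the statistic comes out near 18.3 on 4 degrees of freedom, well above the 0.05 critical value of 9.488, so independence would be rejected for this table.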
Conclusion
We can conclude that wine choice and music are significantly associated, and we can put that to work. When couples are listening to Italian accordion, we sell Italian wine. When they are listening to French accordion, we sell French wine. That is prescribing the right medicine for the symptom, and that is how we fleece them.