33 Basic Statistics - One-Sample Nonparametric Tests
2022-08-09 03:33:00 【paper limit】
1.The purpose of the chi-square test and its basic idea
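The purpose of the chi-square test is to use the distribution of the sample data to check whether the population distribution agrees with an expected distribution or some theoretical distribution. The null hypothesis is that the population from which the sample is drawn does not differ significantly from the expected or theoretical distribution.
The basic idea of the chi-square test is as follows. Suppose a number of observations are drawn at random from a random variable $X$, and the observed frequencies with which they fall into $k$ mutually disjoint subsets of $X$ follow a multinomial distribution. As the sample size becomes large, the statistic built from these frequencies approximately follows a chi-square distribution. On this basis, a test of the population distribution of $X$ can start from an analysis of the observed frequencies.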
Under the assumption that the null hypothesis holds, if the probability that a value of the variable falls in the $i$-th subset is $p_i$, the corresponding expected frequency is $np_i$, where $n$ is the sample size. The distribution of the expected frequencies represents the theoretical distribution when the null hypothesis holds, and a chi-square statistic can be used to test whether the actual distribution differs significantly from the expected one. A typical chi-square statistic is the Pearson statistic, defined as:
$$X^2=\sum_{i=1}^k{\frac{\left(\text{observed frequency}_i-\text{expected frequency}_i\right)^2}{\text{expected frequency}_i}}$$
Under the null hypothesis, $X^2$ approximately follows a chi-square distribution with $k-1$ degrees of freedom. The larger the value of $X^2$, the greater the gap between the observed frequency distribution and the expected distribution. SPSS computes $X^2$ automatically and derives the corresponding $p$ value from the chi-square distribution.
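If the $p$ value is smaller than the significance level, reject the null hypothesis and conclude that the population distribution differs significantly from the expected or theoretical distribution. Conversely, if the $p$ value is greater than the significance level, the null hypothesis is not rejected, and the population distribution is regarded as consistent with the expected or theoretical distribution.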
Example: https://blog.csdn.net/snowdroptulip/article/details/78770088 (a fairly detailed worked example).
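The Pearson statistic and its $p$ value described in this section can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names `pearson_chi_square` and `chi_square_sf` and the die-roll counts are made up for the example, and the upper-tail probability is computed with a simple recurrence that assumes integer degrees of freedom. It is not how SPSS computes the result.

```python
import math

def pearson_chi_square(observed, probabilities):
    """Return (X^2, degrees of freedom) for observed counts and null probabilities p_i."""
    n = sum(observed)
    # Expected frequency of subset i under the null hypothesis is n * p_i.
    expected = [n * p for p in probabilities]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

def chi_square_sf(x, df):
    """Upper-tail probability P(chi2_df > x), assuming df is a positive integer."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 and df = 2, then the recurrence
    # Q(df + 2) = Q(df) + half**(df / 2) * exp(-half) / Gamma(df / 2 + 1).
    if df % 2 == 1:
        q, d = math.erfc(math.sqrt(half)), 1
    else:
        q, d = math.exp(-half), 2
    while d < df:
        q += half ** (d / 2) * math.exp(-half) / math.gamma(d / 2 + 1)
        d += 2
    return q

# Hypothetical data: 60 rolls of a die, tested against a fair die (p_i = 1/6).
observed = [8, 12, 9, 11, 10, 10]
probabilities = [1 / 6] * 6
stat, df = pearson_chi_square(observed, probabilities)
p_value = chi_square_sf(stat, df)
print(f"X^2 = {stat:.4f}, df = {df}, p = {p_value:.4f}")
```

With these made-up counts the $p$ value is far above any usual significance level, so the null hypothesis of a fair die would not be rejected.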
2.Binomial distribution test
The purpose of the binomial distribution test is to check whether the observed proportions of the two categories in the sample equal a specified test proportion. The null hypothesis is that the population from which the sample is drawn does not differ significantly from the specified binomial distribution.
The binomial test uses an exact method for small samples and an approximate method for large samples. The exact method computes the probability that the number of successes in $n$ trials is less than or equal to $x$, that is, $P\left\{ X\leqslant x \right\} =\sum_{i=0}^x{C_{n}^{i}p^iq^{n-i}}$, where $q=1-p$. For large samples a $Z$ test statistic is used. Under the null hypothesis, $Z$ approximately follows a normal distribution and is defined as $Z=\frac{x\pm 0.5-np}{\sqrt{np\left( 1-p \right)}}$. This formula includes a continuity correction: add 0.5 when $x$ is less than $n/2$, and subtract 0.5 when $x$ is greater than $n/2$.
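Both branches of the binomial test can be sketched directly from the two formulas above. The function names and the numbers in the example (20 trials, 5 successes, test proportion 0.5) are hypothetical. The text does not say what to do when $x$ equals $n/2$ exactly, so the sketch assumes no correction in that case.

```python
import math

def binomial_exact_cdf(x, n, p):
    """Exact P{X <= x}: sum of C(n, i) * p^i * q^(n - i) for i = 0..x."""
    q = 1.0 - p
    return sum(math.comb(n, i) * p ** i * q ** (n - i) for i in range(x + 1))

def binomial_z(x, n, p):
    """Large-sample Z statistic with the continuity correction from the text."""
    if x < n / 2:
        corrected = x + 0.5
    elif x > n / 2:
        corrected = x - 0.5
    else:
        corrected = x  # assumption: no correction when x equals n/2
    return (corrected - n * p) / math.sqrt(n * p * (1 - p))

def standard_normal_cdf(z):
    """Standard normal cumulative probability, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

# Hypothetical example: 5 successes in 20 trials, test proportion 0.5.
n, x, p = 20, 5, 0.5
exact = binomial_exact_cdf(x, n, p)
z = binomial_z(x, n, p)
print(f"exact P(X <= {x}) = {exact:.4f}")
print(f"Z = {z:.4f}, normal approximation = {standard_normal_cdf(z):.4f}")
```

Printing the exact and approximate lower-tail probabilities side by side shows how close the normal approximation is for a given sample size.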
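3.Runs test
The nature of the runs test: first, the variable must be dichotomous, that is, it takes only two values, such as a gender variable. A run is a maximal stretch of consecutive identical values in the sequence. Second, the purpose of the runs test is to judge whether the order in which the observations occur is random. The runs test is the simplest way to assess randomness.
Therefore, in the one-sample test the null hypothesis is that the sequence is random. The runs test for two independent samples, by contrast, checks whether the two samples come from populations with the same distribution. Its null hypothesis is that the populations of the two independent samples do not differ significantly.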
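As a rough illustration of the one-sample case, the sketch below counts the runs in a dichotomous sequence and standardises the count. The text above gives no formula, so the mean and variance used here are the usual large-sample (Wald-Wolfowitz) approximation, brought in as an assumption. The function names and the example sequence are hypothetical.

```python
import math

def count_runs(sequence):
    """Number of maximal stretches of identical consecutive values."""
    if not sequence:
        return 0
    return 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)

def runs_test_z(sequence):
    """Return (runs, Z) using the large-sample normal approximation (an assumption)."""
    values = sorted(set(sequence))
    if len(values) != 2:
        raise ValueError("the runs test needs a dichotomous sequence")
    n1 = sum(1 for v in sequence if v == values[0])
    n2 = len(sequence) - n1
    n = n1 + n2
    runs = count_runs(sequence)
    # Expected number of runs and its variance when the order is random.
    mean = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return runs, (runs - mean) / math.sqrt(variance)

# Hypothetical sequence of two categories, A and B.
sequence = "AABBBABBAA"
runs, z = runs_test_z(sequence)
print(f"runs = {runs}, Z = {z:.4f}")
```

A value of Z far from zero in either direction (too few or too many runs) would count as evidence against a random order.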
4.One-sample K-S test (Kolmogorov-Smirnov)
This method is a goodness-of-fit test that compares the observed cumulative distribution function of a variable with a specified theoretical distribution, mainly the normal, uniform, and Poisson distributions, among others.
The null hypothesis of the one-sample K-S test is that the population from which the sample is drawn does not differ significantly from the specified theoretical distribution.
The basic idea is as follows:
Under the assumption that the null hypothesis holds, first compute the theoretical cumulative probability $F(X)$ of each sample observation under the theoretical distribution. Next, compute the actual (empirical) cumulative probability $S(X)$ of each observation, and then the difference $D(X)$ between the actual and theoretical cumulative probabilities. Finally, with the observations sorted in ascending order, take the largest absolute difference in this sequence, $D=\underset{1\leqslant i\leqslant n}{\max}\left( |S\left( X_i \right) -F\left( X_i \right) |,|S\left( X_{i-1} \right) -F\left( X_i \right) |\right)$. This $D$ is the K-S statistic.
In small samples, when the null hypothesis holds, the $D$ statistic follows the Kolmogorov distribution. In large samples, when the null hypothesis holds, $\sqrt{n}D$ approximately follows the Kolmogorov distribution, whose distribution function is $K(x)=0$ for $x\leqslant 0$ and $K\left( x \right) =\sum_{j=-\infty}^{\infty}{\left( -1 \right) ^j\exp \left( -2j^2x^2 \right)}$ for $x>0$.
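The series for $K(x)$ can be evaluated numerically by truncating the sum. The sketch below is a minimal illustration: the function name `kolmogorov_cdf`, the truncation limits, and the example values of $n$ and $D$ are assumptions made for the example. It uses the fact that the terms for $j$ and $-j$ are equal, so only positive $j$ needs to be summed.

```python
import math

def kolmogorov_cdf(x, max_terms=10000, tol=1e-17):
    """Truncated evaluation of K(x) = sum over all integers j of (-1)^j exp(-2 j^2 x^2)."""
    if x <= 0:
        return 0.0
    total = 1.0  # the j = 0 term
    for j in range(1, max_terms + 1):
        term = math.exp(-2.0 * j * j * x * x)
        total += 2.0 * (-1) ** j * term  # +j and -j contribute the same value
        if term < tol:
            break
    # Clamp rounding noise; for extremely small x the truncation may be inaccurate.
    return min(1.0, max(0.0, total))

# Hypothetical large-sample example: n = 100 observations and D = 0.136.
n, d = 100, 0.136
p_value = 1.0 - kolmogorov_cdf(math.sqrt(n) * d)
print(f"sqrt(n) * D = {math.sqrt(n) * d:.3f}, approximate p = {p_value:.4f}")
```

The approximate $p$ value is one minus $K$ evaluated at $\sqrt{n}D$, which only makes sense in the large-sample setting described above.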
K-S test steps:
1)State the hypotheses
2)Compute the empirical distribution function and the theoretical distribution function from the sample data, and substitute them into
$D=\underset{1\leqslant i\leqslant n}{\max}\left( |S\left( X_i \right) -F\left( X_i \right) |,|S\left( X_{i-1} \right) -F\left( X_i \right) |\right)$
3)Look up the table to find the critical value $D_n(\alpha)$
4)Make the decision
If the $D$ computed from the sample satisfies $D>D_n(\alpha)$, reject the null hypothesis. Otherwise the fit is considered satisfactory, that is, the sample is regarded as coming from the specified theoretical distribution.
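The steps above can be sketched as follows. The function names, the three-point sample, and the choice of a uniform distribution on $[0, 1]$ as the theoretical distribution are hypothetical. The critical value is left as an argument because it has to come from a K-S table for the given $n$ and $\alpha$.

```python
def ks_statistic(sample, cdf):
    """K-S statistic D for a sample against a theoretical CDF given as a callable."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # S(X_i) = i / n and S(X_{i-1}) = (i - 1) / n for the sorted sample.
        d = max(d, abs(i / n - f), abs((i - 1) / n - f))
    return d

def ks_decision(d, critical_value):
    """True means reject the null hypothesis, i.e. D exceeds the tabulated D_n(alpha)."""
    return d > critical_value

def uniform_cdf(x):
    """Theoretical CDF of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))

# Hypothetical sample tested against the uniform distribution on [0, 1].
sample = [0.1, 0.4, 0.7]
d = ks_statistic(sample, uniform_cdf)
print(f"D = {d:.4f}")
```

Passing the theoretical distribution as a callable keeps the same function usable for a normal or any other fully specified distribution.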