Complex AB experiment
2022-08-10 06:50:00 【Coco-Lele】
1. Basic questions
Classification of test metrics
A/B test metrics fall into two categories: absolute-value metrics and ratio metrics. The variance is calculated differently for the two.
Ratio metrics split further by denominator: those whose denominator is a number of users (retention rate, conversion rate, etc.) and those whose denominator is a number of actions (click-through rate per impression).
When the denominator is a number of users, the randomization unit and the analysis unit are the same, so a $z$-test can be used. When the denominator is a number of actions, the analysis units are not independent, so the delta method should be used.
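For the user-denominator case, a minimal sketch of a two-sided two-proportion $z$-test is shown here. The function name, argument names, and the counts are illustrative assumptions, not taken from the original post.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    conv_* are converted-user counts and n_* are user counts per group
    (hypothetical argument names).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up counts: 1,000 of 10,000 users convert in A, 1,100 of 10,000 in B.
z, p_value = two_proportion_z_test(1000, 10000, 1100, 10000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

This only applies when each user contributes one observation; for action-denominator metrics the standard error above would be too small.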
Cumulative value over multiple days
The metric is aggregated over all days of the experiment before it is computed. For example, for the per-capita frequency of a behavior, the numerator is the total number of times the behavior occurred during the experiment and the denominator is the deduplicated number of users who entered the group during the experiment.
Advantages: it preserves independence between samples, and it increases the sample size, so significance can improve as data accumulates.
Multi-day accumulation of retention rate: calculate the retention rate of each day's new users, then take the average weighted by the number of new users per day.
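The weighted retention calculation can be sketched as follows. The input layout (one `(new_users, retained_users)` pair per day) and the numbers are assumptions made for illustration.

```python
def pooled_retention(daily_cohorts):
    """Average daily retention rates, weighted by each day's new-user count.

    daily_cohorts is a list of (new_users, retained_users) pairs, one per day
    (a hypothetical input layout).
    """
    total_new = sum(new for new, _ in daily_cohorts)
    # Each day's rate (kept / new) is weighted by that day's new-user count.
    weighted_sum = sum((kept / new) * new for new, kept in daily_cohorts if new > 0)
    return weighted_sum / total_new


cohorts = [(1000, 400), (500, 250), (1500, 450)]  # made-up numbers
simple_average = sum(kept / new for new, kept in cohorts) / len(cohorts)
print(pooled_retention(cohorts), simple_average)
```

The weighted value equals total retained users divided by total new users, so it differs from the unweighted average of daily rates whenever cohort sizes differ.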
When A/B testing cannot be used
- The intervention variable cannot be controlled (for example, when studying the impact of watching live streams on users, you cannot force some users to watch while keeping others from watching)
- The experiment would take up too much traffic
- The policy could harm the user experience
A/B experiment steps
Determine the experiment strategy; define the metrics to observe; calculate the sample size (from the significance level, the statistical power, the minimum improvement in the metric that needs to be detected, and the metric variance); develop and launch the experiment; collect the data.
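The sample-size step can be sketched with the common textbook approximation $n = 2\sigma^2 (z_{1-\alpha/2} + z_{1-\beta})^2 / \delta^2$ per group. This assumes a two-sided test, two equal-sized groups, and equal variance; the function name and the example numbers are illustrative, not from the original post.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(variance, mde, alpha=0.05, power=0.8):
    """Approximate users needed per group to detect an absolute lift of `mde`.

    variance: metric variance; mde: minimum detectable effect (absolute);
    alpha: significance level; power: statistical power.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * variance * (z_alpha + z_beta) ** 2 / mde ** 2)


# Made-up example: baseline conversion rate 10%, so variance = p * (1 - p),
# and we want to detect an absolute lift of 1 percentage point.
baseline = 0.10
print(sample_size_per_group(baseline * (1 - baseline), 0.01))
```

Halving the minimum detectable effect roughly quadruples the required sample, which is why the choice of minimum improvement dominates the traffic budget.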
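When the A/B result is not significant
- Check whether the minimum sample size has been reached
- Use DID (difference-in-differences) to eliminate fixed differences between the groups
- Check the experiment pipeline to see whether the strategy actually reached every user (if penetration is low, PSM, i.e. propensity score matching, can be used)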
2. Delta method test
See the previous post. The delta method applies when the randomization unit and the analysis unit differ.
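As a reminder of how this works, here is a minimal sketch of the delta-method variance for a ratio metric such as click-through rate, where users are the randomization unit and impressions are the analysis unit. It uses the standard first-order approximation $\mathrm{Var}(\bar{X}/\bar{Y}) \approx \frac{1}{n\bar{Y}^2}\left(s_X^2 - 2R\,s_{XY} + R^2 s_Y^2\right)$ with $R = \bar{X}/\bar{Y}$. Function names and data are assumptions for illustration and may differ from the previous post.

```python
from math import sqrt
from statistics import mean


def ratio_and_delta_variance(numerators, denominators):
    """Ratio of means and its delta-method variance from per-user totals.

    numerators[i] / denominators[i] are one user's clicks / impressions
    (hypothetical per-user aggregates).
    """
    n = len(numerators)
    mean_x, mean_y = mean(numerators), mean(denominators)
    ratio = mean_x / mean_y
    var_x = sum((x - mean_x) ** 2 for x in numerators) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in denominators) / (n - 1)
    cov_xy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(numerators, denominators)
    ) / (n - 1)
    variance = (var_x - 2 * ratio * cov_xy + ratio ** 2 * var_y) / (n * mean_y ** 2)
    return ratio, variance


def delta_z_statistic(group_a, group_b):
    """z statistic for the difference of two ratio metrics (B minus A)."""
    ratio_a, var_a = ratio_and_delta_variance(*group_a)
    ratio_b, var_b = ratio_and_delta_variance(*group_b)
    return (ratio_b - ratio_a) / sqrt(var_a + var_b)


# Made-up per-user (clicks, impressions) totals for two small groups.
a = ([1, 0, 3, 2], [10, 5, 20, 15])
b = ([2, 1, 4, 2], [10, 6, 19, 14])
print(ratio_and_delta_variance(*a), delta_z_statistic(a, b))
```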
3. Bayesian testing
Advantages:
- There is no need to compute a required sample size in advance.
- The posterior distribution of the parameter is obtained, so both the probability that the metric improved and the size of the lift can be quantified.
Bayesian vs. frequentist, basic theory:
Prior distribution $\pi(\theta)$ + sample data $P(X \mid \theta)$ = posterior distribution $\pi(\theta \mid X)$ (formally, $\pi(\theta \mid X) \propto \pi(\theta)\,P(X \mid \theta)$)
Conjugate prior: the beta distribution and the binomial distribution
If $\theta \sim \mathrm{Beta}(\alpha, \beta)$ and $X \mid \theta \sim \mathrm{Binomial}(n, \theta)$, then $\theta \mid X \sim \mathrm{Beta}(x + \alpha,\ n - x + \beta)$, where $x$ is the observed number of successes.
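The conjugate update and the "probability of improvement" can be sketched as follows. The uniform $\mathrm{Beta}(1, 1)$ prior, the Monte Carlo approach, the function names, and the counts are all assumptions chosen for illustration.

```python
import random


def beta_posterior(successes, trials, alpha=1.0, beta=1.0):
    """Posterior Beta parameters after observing `successes` in `trials`."""
    return successes + alpha, trials - successes + beta


def prob_b_beats_a(post_a, post_b, draws=20_000, seed=0):
    """Monte Carlo estimate of P(theta_B > theta_A) and of the mean lift."""
    rng = random.Random(seed)
    wins, lift_sum = 0, 0.0
    for _ in range(draws):
        theta_a = rng.betavariate(*post_a)
        theta_b = rng.betavariate(*post_b)
        wins += theta_b > theta_a
        lift_sum += theta_b - theta_a
    return wins / draws, lift_sum / draws


# Made-up data: 100 of 1,000 users convert in A, 130 of 1,000 in B.
post_a = beta_posterior(100, 1000)
post_b = beta_posterior(130, 1000)
probability, expected_lift = prob_b_beats_a(post_a, post_b)
print(post_a, post_b, probability, expected_lift)
```

Sampling from both posteriors is a simple way to get the two quantities named above; closed-form or numerical-integration alternatives exist but are not shown here.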
4. Different hypothesis tests
$z$-test: mean test for large samples (no particular distribution is required, by the central limit theorem; it does not matter whether the variance is known, because for $n > 30$ the $t$ distribution is close to the standard normal distribution)
$t$-test: mean test for small samples of normal data ($n$ less than 30, variance unknown)
$F$-test: test for homogeneity of variance; also used in one-way ANOVA, which tests whether the levels of a categorical variable affect the outcome.
Chi-square test: in essence, it tests whether the observed sample frequencies agree with the expected frequencies. It can be used to test the association between two discrete variables (contingency table) and to test how well an observed distribution fits an expected one. It is a nonparametric test used mostly for categorical variables.
$\chi^2 = \sum (X - E)^2 / E$
Setting $E = np$, the statistic can be written in terms of squared, approximately normal variables, which is where the chi-square distribution comes from.
K-S test: tests whether the sample follows a specific distribution by looking at the maximum difference between the sample's cumulative distribution and the theoretical cumulative distribution.
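The chi-square and K-S statistics described above can be sketched in plain Python as follows. Only the test statistics are computed; p-values are left out because they need the chi-square and Kolmogorov distributions, which are not in the standard library. Function names and data are illustrative assumptions.

```python
from statistics import NormalDist


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables are independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Made-up data: a 2x2 table (group x converted) and a small numeric sample.
print(chi_square_statistic([[10, 20], [20, 10]]))
print(ks_statistic([-1.2, -0.3, 0.1, 0.4, 1.5], NormalDist().cdf))
```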
DID
$y = \alpha_1 \cdot \text{treatment} + \alpha_2 \cdot \text{post} + \alpha_3 \cdot \text{treatment} \cdot \text{post} + u$
$\alpha_3$ represents the net effect of the policy.
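When treatment and post are both 0/1 indicators and an intercept is added to the regression (an assumption; the formula above omits it), the OLS estimate of $\alpha_3$ equals the before/after change in the treatment group minus the before/after change in the control group. A minimal sketch using group means, with a hypothetical row layout and made-up numbers:

```python
from statistics import mean


def did_estimate(rows):
    """Difference-in-differences estimate from (treatment, post, y) rows.

    treatment and post must each be 0 or 1; every one of the four
    (treatment, post) cells needs at least one observation.
    """
    cells = {(t, p): [] for t in (0, 1) for p in (0, 1)}
    for treatment, post, y in rows:
        cells[(treatment, post)].append(y)
    cell_mean = {key: mean(values) for key, values in cells.items()}
    treated_change = cell_mean[(1, 1)] - cell_mean[(1, 0)]
    control_change = cell_mean[(0, 1)] - cell_mean[(0, 0)]
    return treated_change - control_change


# Made-up observations: (treatment, post, y)
rows = [
    (0, 0, 10), (0, 0, 12), (0, 1, 11), (0, 1, 13),
    (1, 0, 14), (1, 0, 16), (1, 1, 18), (1, 1, 20),
]
print(did_estimate(rows))
```

The fixed gap between the groups before the experiment cancels out of the estimate, which is what "eliminating fixed differences" refers to.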
Parallel trends test