Common visualization charts (V): scatter plot
2022-04-23 10:54:00 【The big pig of the little pig family】
One. Introduction to scatter plots
A scatter plot, also called an X-Y chart, displays all data as points in a rectangular (Cartesian) coordinate system to show how strongly the variables influence each other. The position of each point is determined by the values of the variables.
By observing how the data points are distributed on a scatter plot, we can infer the correlation between the variables. If the variables are uncorrelated, the plot shows randomly scattered points. If they are correlated, most of the points cluster densely and follow a visible trend. The main types of correlation are:
- Positive correlation (the values of both variables increase together).
- Negative correlation (the value of one variable increases while the other decreases).
- No correlation.
- Linear correlation.
- Exponential correlation.
Scatter plots are often combined with a regression line, which summarizes the existing data and can be used for prediction.
A scatter plot is a good graphical tool for variables that are closely related, but whose relationship is not as exact as a mathematical or physical formula. During the analysis, keep in mind that a correlation between two variables does not establish a causal relationship; other influencing factors may also need to be considered.
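The strength and direction of a linear correlation can be put into a number with the Pearson correlation coefficient. The following is a minimal sketch in plain Python; the function name pearson_r and the three small data sets are made up for illustration and do not come from the original article.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of the deviations from the means (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # y rises with x: r is close to +1
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # y falls as x rises: r is close to -1
r_none = pearson_r(x, [3, 1, 4, 1, 3])   # no linear trend: r is close to 0
print(r_pos, r_neg, r_none)
```

A value near +1 or -1 corresponds to points lying close to a straight line, while a value near 0 only rules out a linear trend. A non-linear relationship such as an exponential one still has to be judged from the plot itself.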
Two. Composition of a scatter plot
A standard scatter plot includes at least the following parts:
- Vertical axis: represents the values of one variable.
- Horizontal axis: represents the values of the other variable.
- Points: each point marks one (X, Y) pair.
- Regression line: the line that best fits all the points.
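For a straight regression line, "best fit" usually means ordinary least squares: the slope and intercept are chosen so that the sum of the squared vertical distances from the points to the line is as small as possible. A minimal sketch in plain Python follows; the function name fit_line and the five sample points are hypothetical and were chosen to lie exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up sample data (not taken from any real data set)
heights = [63.0, 65.0, 67.0, 69.0, 71.0]
weights = [110.0, 120.0, 130.0, 140.0, 150.0]
slope, intercept = fit_line(heights, weights)
print(slope, intercept)
```

With real, noisy data the points will not sit on the line, but the same two formulas still give the line with the smallest squared error.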
Three. Application scenarios
Suitable data: two continuous data fields.
Main function: observing the distribution of the data.
Applicable amount of data: unlimited.
Remarks: to make the distribution easier to see, set the transparency or color of the data points.
Suitable scenarios:
- Displaying and comparing values. A scatter plot shows not only trends but also the shape of data clusters and how the points in the data cloud relate to each other.
Unsuitable scenarios:
- Displaying the proportion of each category in the data.
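The remark about transparency matters most when many points overlap. The sketch below draws the same 5000 randomly generated points twice, once fully opaque and once with alpha=0.1, so that dense regions appear darker in the second panel. The data is synthetic and the variable names are only illustrative.

```python
import random
import matplotlib.pyplot as plt

random.seed(0)
# Synthetic, positively correlated data (illustrative only)
xs = [random.gauss(0, 1) for _ in range(5000)]
ys = [x + random.gauss(0, 0.5) for x in xs]

fig, (ax_opaque, ax_alpha) = plt.subplots(1, 2, figsize=(10, 4), sharex=True, sharey=True)

# Left: opaque markers hide how many points overlap in the center
ax_opaque.scatter(xs, ys, s=10)
ax_opaque.set_title("alpha = 1 (default)")

# Right: translucent markers make dense regions visibly darker
points = ax_alpha.scatter(xs, ys, s=10, alpha=0.1)
ax_alpha.set_title("alpha = 0.1")

# Call plt.show() to display the figure in an interactive session
```

A suitable alpha value depends on the number of points and the marker size, so it is usually found by trial.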
Four. Implementation
In matplotlib, the scatter function is used to draw scatter plots. The function is described as follows:
scatter(x, y, s=None, c=None, marker=None, cmap=None, norm=None,vmin=None, vmax=None, alpha=None, linewidths=None, *,edgecolors=None, plotnonfinite=False, data=None, **kwargs)
Parameter 1: x, y: the coordinates of the data points.
Parameter 2: s: a number or array-like; the marker size, in points squared.
Parameter 3: c: an array or array-like (or a single color); the marker colors.
Parameter 4: marker: a valid marker string; the marker style (default: 'o').
Parameter 5: cmap: the colormap used when c is an array of numbers.
Parameter 6: norm: a Normalize instance that scales the values in c to the range 0 to 1 before they are mapped to colors; by default a linear scaling is used.
Parameters 7, 8: vmin, vmax: when norm is not given, the data values that map to the two ends of the colormap; if None, the minimum and maximum of c are used.
Parameter 9: alpha: a float between 0 (transparent) and 1 (opaque); the transparency of the markers.
Parameter 10: linewidths: a number or array-like; the line width of the marker edges.
Parameter 11: verts: not part of the signature shown above. In older matplotlib versions it was a sequence of (x, y) vertices used to build a custom marker when marker was None; it has since been deprecated and removed.
Parameter 12: edgecolors: a color or sequence of colors (or 'face' / 'none'); the marker edge color.
Parameter 13: plotnonfinite: Boolean; whether to draw points whose c value is inf, -inf, or nan. If True, those points are drawn in the colormap's "bad" color (see Colormap.set_bad).
Parameter 14: **kwargs: additional keyword arguments, passed on to the underlying Collection instance.
Return value: the PathCollection instance that was created.
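To see how several of these parameters work together, here is a small sketch on random data. The choice of values (marker size growing with x, color driven by x + y, the viridis colormap) is arbitrary and only meant to show the parameters in use.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import PathCollection

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = rng.uniform(0, 10, 200)
values = x + y            # numbers that are mapped to colors through cmap
sizes = 20 + 10 * x       # marker areas in points squared, one per point

fig, ax = plt.subplots()
sc = ax.scatter(
    x, y,
    s=sizes,              # per-point marker size
    c=values,             # per-point value to color-map
    cmap="viridis",
    vmin=0, vmax=20,      # data range covered by the colormap
    marker="o",
    alpha=0.7,
    linewidths=0.5,
    edgecolors="black",
)
# The returned PathCollection can be handed to colorbar()
fig.colorbar(sc, ax=ax, label="x + y")

# Call plt.show() to display the figure in an interactive session
```

Keeping the return value is useful whenever the plot needs a colorbar or the points are to be restyled later.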
Take the SOCR-HeightWeight.csv dataset as an example. It records the height and weight of 25,000 individuals. With height on the horizontal axis and weight on the vertical axis, we examine the relationship between the two variables. The complete code is as follows:
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
import numpy as np
import pandas as pd

plt.rcParams['font.sans-serif'] = ['SimHei']  # Font with Chinese glyphs (only needed for Chinese labels)
plt.rcParams['axes.unicode_minus'] = False  # Render the minus sign correctly with that font
# The seaborn styles were renamed to "seaborn-v0_8-*" in matplotlib 3.6
style = 'seaborn-v0_8-dark-palette'
plt.style.use(style if style in plt.style.available else 'seaborn-dark-palette')

df = pd.read_csv("SOCR-HeightWeight.csv", index_col=0)
height = df["Height(Inches)"].values.reshape(-1, 1)
weight = df["Weight(Pounds)"].values.reshape(-1, 1)

# Fit weight = coef * height + intercept by ordinary least squares
model = LinearRegression()
model.fit(height, weight)
coef = model.coef_[0, 0]
intercept = model.intercept_[0]

height_avg = height.mean()
weight_avg = weight.mean()

# Split the points into four quadrants around the averages; plot at most 3000 points per quadrant
quadrant1 = df[(df["Height(Inches)"] >= height_avg) & (df["Weight(Pounds)"] >= weight_avg)]
quadrant1_height = quadrant1["Height(Inches)"][:3000]
quadrant1_weight = quadrant1["Weight(Pounds)"][:3000]
plt.scatter(quadrant1_height, quadrant1_weight, alpha=0.3, label="Scatter points, first quadrant")
quadrant2 = df[(df["Height(Inches)"] <= height_avg) & (df["Weight(Pounds)"] >= weight_avg)]
quadrant2_height = quadrant2["Height(Inches)"][:3000]
quadrant2_weight = quadrant2["Weight(Pounds)"][:3000]
plt.scatter(quadrant2_height, quadrant2_weight, alpha=0.3, label="Scatter points, second quadrant")
quadrant3 = df[(df["Height(Inches)"] <= height_avg) & (df["Weight(Pounds)"] <= weight_avg)]
quadrant3_height = quadrant3["Height(Inches)"][:3000]
quadrant3_weight = quadrant3["Weight(Pounds)"][:3000]
plt.scatter(quadrant3_height, quadrant3_weight, alpha=0.3, label="Scatter points, third quadrant")
quadrant4 = df[(df["Height(Inches)"] >= height_avg) & (df["Weight(Pounds)"] <= weight_avg)]
quadrant4_height = quadrant4["Height(Inches)"][:3000]
quadrant4_weight = quadrant4["Weight(Pounds)"][:3000]
plt.scatter(quadrant4_height, quadrant4_weight, alpha=0.3, label="Scatter points, fourth quadrant")

# Draw the average weight (horizontal) and average height (vertical) lines
plt.hlines(weight_avg, height.min(), height.max(), ls="--", color='r', lw=2, label='Average weight')
plt.vlines(height_avg, weight.min(), weight.max(), ls='--', color='k', lw=2, label='Average height')

# Draw the regression line over the observed height range
x = np.arange(height.min(), height.max(), 0.05)
y = coef * x + intercept
plt.plot(x, y, lw=2, color="darkgray", label="Regression line of height and weight")

plt.title("Scatter plot of height and weight", fontsize=25, fontweight="bold")
plt.xlabel("Height (Inches)", fontsize=20)
plt.ylabel("Weight (Pounds)", fontsize=20)
plt.legend(fontsize=15)
plt.show()
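The resulting figure (not reproduced here) shows the four quadrant groups of points, the two average lines, and the regression line.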
Copyright notice
This article was written by [The big pig of the little pig family]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204230617063254.html