dcast, id.vars, measure.vars: data splitting and merging
2022-04-23 13:01:00 【qq_ fifty-two million eight hundred and thirteen thousand one h】
R and data.table: melt/dcast (data splitting and merging)
Foreword: reshaping data is a bit like kneading dough. First melt() breaks the data down into long format, then dcast() reshapes it into the form you want.
The reshape2 package:
melt: converts wide-format data to long format.
cast: converts long-format data to wide format. (dcast returns a data frame; acast returns a vector, matrix, or array.)
Note: melt means "melting" the data, i.e. converting it from wide to long.
Besides restoring the original layout, cast can also aggregate the data.
dcast outputs a data frame. Each variable on the left side of the formula becomes a column in the result; each variable on the right side is treated as a factor, and each of its levels produces a column in the result.
The tidyr package:
gather: converts wide data to a longer form, analogous to melt in reshape2.
spread: converts long data to a wider form, analogous to the cast functions in reshape2.
The data.table package:
data.table's melt and dcast are enhanced extensions of the functions of the same name in reshape2.
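As one illustration of what the data.table versions add, the sketch below melts two groups of columns into two value columns in a single call using patterns(). The table `wide` and its column names are hypothetical and are not part of the original example.

```r
library(data.table)

# Hypothetical wide table: height and weight, each recorded at two visits.
wide <- data.table(
  id       = 1:3,
  height_1 = c(150, 160, 170),
  height_2 = c(151, 161, 171),
  weight_1 = c(50, 60, 70),
  weight_2 = c(52, 61, 69)
)

# patterns() groups the measure columns by regular expression, so each
# group is melted into its own value column.
long <- melt(
  wide,
  id.vars       = "id",
  measure.vars  = patterns("^height_", "^weight_"),
  variable.name = "visit",
  value.name    = c("height", "weight")
)
print(long)
```

The `visit` column records which column of each group a row came from (the first or the second), so it plays the role of the visit number here.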
library(data.table)
ID <- c(NA,1,2,2)
Time <- c(1,2,NA,1)
X1 <- c(5,3,NA,2)
X2 <- c(NA,5,1,4)
mydata <- data.table(ID,Time,X1,X2)
mydata
##    ID Time X1 X2
## 1: NA    1  5 NA
## 2:  1    2  3  5
## 3:  2   NA NA  1
## 4:  2    1  2  4
md <- melt(mydata, id.vars=c("ID","Time")) # or md <- melt(mydata, id.vars=1:2)
# melt makes each row a unique combination of identifiers and variable
md # the first two columns (ID, Time) are the id columns; all other columns are melted
##    ID Time variable value
## 1: NA    1       X1     5
## 2:  1    2       X1     3
## 3:  2   NA       X1    NA
## 4:  2    1       X1     2
## 5: NA    1       X2    NA
## 6:  1    2       X2     5
## 7:  2   NA       X2     1
## 8:  2    1       X2     4
The measured columns are stacked into two new columns: variable, which records which measured variable a row came from, and value, which holds that variable's value. This row-per-measurement layout is called the long data format.
melt: melting a data set restructures it so that each measured variable gets its own row, together with the identifier variables needed to uniquely determine that measurement.
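The description above can be checked directly. The following is a minimal sketch on a hypothetical table `scores` (not the author's data) whose (ID, Time) key is complete, confirming that melt yields one row per original row and measured column, and that the identifiers plus `variable` pin down each value.

```r
library(data.table)

# Hypothetical data: a complete (ID, Time) key and two measured columns.
scores <- data.table(
  ID   = c(1, 1, 2, 2),
  Time = c(1, 2, 1, 2),
  X1   = c(5, 3, 6, 2),
  X2   = c(6, 5, 1, 4)
)

long <- melt(scores, id.vars = c("ID", "Time"))

# Each original row contributes one long row per measured column.
n_measure <- length(setdiff(names(scores), c("ID", "Time")))
rows_match <- nrow(long) == nrow(scores) * n_measure

# (ID, Time, variable) uniquely identifies every value.
key_unique <- uniqueN(long, by = c("ID", "Time", "variable")) == nrow(long)

print(long)
print(c(rows_match = rows_match, key_unique = key_unique))
```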
str(mydata)
## Classes 'data.table' and 'data.frame': 4 obs. of 4 variables:
## $ ID : num NA 1 2 2
## $ Time: num 1 2 NA 1
## $ X1 : num 5 3 NA 2
## $ X2 : num NA 5 1 4
## - attr(*, ".internal.selfref")=<externalptr>
str(md)
## Classes 'data.table' and 'data.frame': 8 obs. of 4 variables:
## $ ID : num NA 1 2 2 NA 1 2 2
## $ Time : num 1 2 NA 1 1 2 NA 1
## $ variable: Factor w/ 2 levels "X1","X2": 1 1 1 1 2 2 2 2
## $ value : num 5 3 NA 2 NA 5 1 4
## - attr(*, ".internal.selfref")=<externalptr>
setcolorder(md,c("ID","variable","Time","value")) ## setcolorder() reorders the columns (by reference)
md
##    ID variable Time value
## 1: NA       X1    1     5
## 2:  1       X1    2     3
## 3:  2       X1   NA    NA
## 4:  2       X1    1     2
## 5: NA       X2    1    NA
## 6:  1       X2    2     5
## 7:  2       X2   NA     1
## 8:  2       X2    1     4
mdr <- melt(mydata, id.vars=c("ID","Time"),variable.name="Xzl",value.name="Vzl",na.rm = TRUE) # variable.name and value.name rename the two new columns; na.rm = TRUE drops rows whose value is NA
mdr
##    ID Time Xzl Vzl
## 1: NA    1  X1   5
## 2:  1    2  X1   3
## 3:  2    1  X1   2
## 4:  1    2  X2   5
## 5:  2   NA  X2   1
## 6:  2    1  X2   4
mdr1 <- melt(mydata, id.vars=c("ID","Time"),variable.name="Xzl",value.name="Vzl",measure.vars=c("X1"),na.rm = TRUE) # measure.vars selects which columns to melt
mdr1
##    ID Time Xzl Vzl
## 1: NA    1  X1   5
## 2:  1    2  X1   3
## 3:  2    1  X1   2
md[Time==1]
##    ID variable Time value
## 1: NA       X1    1     5
## 2:  2       X1    1     2
## 3: NA       X2    1    NA
## 4:  2       X2    1     4
md[Time==2]
##    ID variable Time value
## 1:  1       X1    2     3
## 2:  1       X2    2     5
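Note that neither filter above returns the rows whose Time is NA: in data.table, a condition in `i` that evaluates to NA is treated as FALSE. A short sketch, rebuilding `md` so the block runs on its own, shows how those rows can be selected explicitly with is.na():

```r
library(data.table)

# Rebuild the long table from the text.
mydata <- data.table(ID = c(NA, 1, 2, 2), Time = c(1, 2, NA, 1),
                     X1 = c(5, 3, NA, 2), X2 = c(NA, 5, 1, 4))
md <- melt(mydata, id.vars = c("ID", "Time"))

# Time == 1 evaluates to NA where Time is missing; those rows are dropped.
time1 <- md[Time == 1]

# Rows with a missing Time have to be requested explicitly.
time_na <- md[is.na(Time)]

print(time1)
print(time_na)
```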
# Casting (reshaping long data back to wide)
# rowvar1 + rowvar2 + ... ~ colvar1 + colvar2 + ...
# In this formula, rowvar1 + rowvar2 + ... defines the set of crossed variables that determine the rows, and colvar1 + colvar2 + ... defines the set of crossed variables that determine the columns.
newmd<- dcast(md, ID~variable, mean)
newmd
##    ID X1  X2
## 1:  1  3 5.0
## 2:  2 NA 2.5
## 3: NA  5  NA
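In the result above, ID 2 has NA for X1 because mean() returns NA when any of its inputs is missing. A minimal sketch, assuming you would rather skip missing values inside each cell: extra arguments to dcast are forwarded to the aggregating function, so na.rm = TRUE reaches mean(). The name `newmd_narm` is introduced here for illustration only.

```r
library(data.table)

# Rebuild the long table from the text so this block runs on its own.
mydata <- data.table(ID = c(NA, 1, 2, 2), Time = c(1, 2, NA, 1),
                     X1 = c(5, 3, NA, 2), X2 = c(NA, 5, 1, 4))
md <- melt(mydata, id.vars = c("ID", "Time"))

# na.rm = TRUE is passed through to mean(), so each cell is the mean of
# its non-missing values only.
newmd_narm <- dcast(md, ID ~ variable, fun.aggregate = mean, na.rm = TRUE)
print(newmd_narm)
```

A cell whose values are all missing (here the NA row of ID under X2) comes out as NaN rather than NA, because mean() of an empty set is NaN.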
newmd2<- dcast(md, ID+variable~Time)
newmd2
##    ID variable  1  2 NA
## 1:  1       X1 NA  3 NA
## 2:  1       X2 NA  5 NA
## 3:  2       X1  2 NA NA
## 4:  2       X2  4 NA  1
## 5: NA       X1  5 NA NA
## 6: NA       X2 NA NA NA
# ID+variable~Time: one row per (ID, variable) combination and one column per value of Time (1, 2, NA), similar to an Excel pivot table
newmd3<- dcast(md, ID~variable+Time)
newmd3 # one column per (variable, Time) combination: variable is X1 or X2, Time is 1, 2, or NA; similar to an Excel pivot table
##    ID X1_1 X1_2 X1_NA X2_1 X2_2 X2_NA
## 1:  1   NA    3    NA   NA    5    NA
## 2:  2    2   NA    NA    4   NA     1
## 3: NA    5   NA    NA   NA   NA    NA
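The examples above put Time or variable on the column side. Putting all the identifiers on the left and variable on the right undoes the melt. The sketch below uses a hypothetical table `wide` with no missing keys (an assumption made so that row order is unambiguous) to show the round trip:

```r
library(data.table)

# Hypothetical wide table, already sorted by its (ID, Time) key.
wide <- data.table(ID = c(1, 1, 2, 2), Time = c(1, 2, 1, 2),
                   X1 = c(5, 3, 6, 2), X2 = c(6, 5, 1, 4))

# Wide -> long.
long <- melt(wide, id.vars = c("ID", "Time"))

# Long -> wide: identifiers on the left, the melted variable on the right.
restored <- dcast(long, ID + Time ~ variable, value.var = "value")

# Compare values only; attributes such as the key set by dcast are ignored.
round_trip_ok <- isTRUE(all.equal(restored, wide, check.attributes = FALSE))
print(restored)
print(round_trip_ok)
```

The round trip only works without an aggregating function when each (ID, Time, variable) combination occurs once; otherwise dcast has to aggregate and the original rows cannot be recovered.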
Even if you are only a speck of dust in this world, take charge of your own fate. Like a sunflower, welcome the sunshine and bloom bravely.
Copyright notice
This article was written by [qq_ fifty-two million eight hundred and thirteen thousand one h]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/04/202204231258305791.html