dcast, id.vars, measure.vars: data splitting and merging
2022-04-23 13:01:00 【qq_ fifty-two million eight hundred and thirteen thousand one h】
R and data.table: melt/dcast (data splitting and merging)
Foreword: reshaping data is a bit like kneading dough. First melt() breaks the data down, then dcast() reshapes it into the desired form.
reshape2 package:
melt: converts wide-format data into long format.
cast: converts long-format data into wide format. (dcast returns a data frame; acast returns a vector, matrix, or array.)
Note: melt means "fusing" the data; what it actually does is convert the data from wide to long.
Besides restoring the original layout, cast can also aggregate the data.
dcast outputs a data frame. Each variable on the left side of the formula becomes a column in the result; each variable on the right side is treated as a factor, and each of its levels produces a column in the result.
tidyr package:
gather: converts wide data into long form, analogous to melt in reshape2.
spread: converts long data into wide form, analogous to the cast functions in reshape2.
data.table package:
The melt and dcast functions in data.table are enhanced extensions of the functions of the same name in reshape2.
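To make the wide-to-long and long-to-wide relationship concrete, here is a minimal round-trip sketch. The table `wide` and its columns (`id`, `a`, `b`) are made up for illustration and are not part of the original post.

```r
library(data.table)

# Hypothetical wide table: one row per id, one column per measurement.
wide <- data.table(id = 1:3, a = c(10, 20, 30), b = c(1.5, 2.5, 3.5))

# Wide -> long: one row per (id, variable) pair.
long <- melt(wide, id.vars = "id",
             variable.name = "variable", value.name = "value")

# Long -> wide: one column per level of `variable`.
back <- dcast(long, id ~ variable, value.var = "value")

print(long)
print(back)
```

In this sketch, `back` should hold the same ids and values as `wide`, which is the sense in which dcast "restores" melted data.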
library(data.table)
ID <- c(NA,1,2,2)
Time <- c(1,2,NA,1)
X1 <- c(5,3,NA,2)
X2 <- c(NA,5,1,4)
mydata <- data.table(ID,Time,X1,X2)
mydata
## ID Time X1 X2
## 1: NA 1 5 NA
## 2: 1 2 3 5
## 3: 2 NA NA 1
## 4: 2 1 2 4
md <- melt(mydata, id.vars = c("ID", "Time"))  # or: md <- melt(mydata, id.vars = 1:2)
# melt() makes each row a unique combination of identifier columns and measured variable
md  # ID and Time are the id columns; all remaining columns (X1, X2) are melted
## ID Time variable value
## 1: NA 1 X1 5
## 2: 1 2 X1 3
## 3: 2 NA X1 NA
## 4: 2 1 X1 2
## 5: NA 1 X2 NA
## 6: 1 2 X2 5
## 7: 2 NA X2 1
## 8: 2 1 X2 4
melt stacks the measured columns into two new columns: variable, which records which original column each value came from, and value, which holds the corresponding value. This row-per-measurement layout is called the long data format.
melt: melting a dataset restructures it so that each measured variable occupies its own row, together with the identifier variables needed to uniquely identify that measurement.
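The "uniquely identify" property can be checked directly. The sketch below uses a hypothetical table `scores` (the names `student`, `term`, `math`, `art` are assumptions for illustration) and verifies that the id columns plus the variable column form a unique key after melting.

```r
library(data.table)

# Hypothetical data: two id columns and two measured columns.
scores <- data.table(student = c("s1", "s2"), term = c(1L, 1L),
                     math = c(90, 75), art = c(80, 85))

long_scores <- melt(scores,
                    id.vars = c("student", "term"),
                    measure.vars = c("math", "art"),
                    variable.name = "subject",
                    value.name = "score")

# 2 rows x 2 measured columns gives 4 long rows.
print(nrow(long_scores))

# 0 means no (student, term, subject) combination is repeated.
print(anyDuplicated(long_scores, by = c("student", "term", "subject")))
```

If the id columns alone did not identify a row in the wide table, this check would report a duplicate, which is a hint that dcast will later need an aggregation function.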
str(mydata)
## Classes 'data.table' and 'data.frame': 4 obs. of 4 variables:
## $ ID : num NA 1 2 2
## $ Time: num 1 2 NA 1
## $ X1 : num 5 3 NA 2
## $ X2 : num NA 5 1 4
## - attr(*, ".internal.selfref")=<externalptr>
str(md)
## Classes 'data.table' and 'data.frame': 8 obs. of 4 variables:
## $ ID : num NA 1 2 2 NA 1 2 2
## $ Time : num 1 2 NA 1 1 2 NA 1
## $ variable: Factor w/ 2 levels "X1","X2": 1 1 1 1 2 2 2 2
## $ value : num 5 3 NA 2 NA 5 1 4
## - attr(*, ".internal.selfref")=<externalptr>
setcolorder(md, c("ID", "variable", "Time", "value"))  # setcolorder() reorders the columns in place
md
## ID variable Time value
## 1: NA X1 1 5
## 2: 1 X1 2 3
## 3: 2 X1 NA NA
## 4: 2 X1 1 2
## 5: NA X2 1 NA
## 6: 1 X2 2 5
## 7: 2 X2 NA 1
## 8: 2 X2 1 4
mdr <- melt(mydata, id.vars = c("ID", "Time"), variable.name = "Xzl", value.name = "Vzl", na.rm = TRUE)  # variable.name and value.name rename the two new columns; na.rm = TRUE drops rows whose value is NA
mdr
## ID Time Xzl Vzl
## 1: NA 1 X1 5
## 2: 1 2 X1 3
## 3: 2 1 X1 2
## 4: 1 2 X2 5
## 5: 2 NA X2 1
## 6: 2 1 X2 4
mdr1 <- melt(mydata, id.vars = c("ID", "Time"), variable.name = "Xzl", value.name = "Vzl", measure.vars = c("X1"), na.rm = TRUE)  # measure.vars selects which columns to melt
mdr1
## ID Time Xzl Vzl
## 1: NA 1 X1 5
## 2: 1 2 X1 3
## 3: 2 1 X1 2
md[Time==1]
## ID variable Time value
## 1: NA X1 1 5
## 2: 2 X1 1 2
## 3: NA X2 1 NA
## 4: 2 X2 1 4
md[Time==2]
## ID variable Time value
## 1: 1 X1 2 3
## 2: 1 X2 2 5
# Casting with aggregation
# rowvar1 + rowvar2 + ... ~ colvar1 + colvar2 + ...
# In this formula, rowvar1 + rowvar2 + ... defines the set of crossed variables that determine the rows, and colvar1 + colvar2 + ... defines the set of crossed variables that determine the columns.
newmd<- dcast(md, ID~variable, mean)
newmd
## ID X1 X2
## 1: 1 3 5.0
## 2: 2 NA 2.5
## 3: NA 5 NA
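In the output above, ID 2 gets NA for X1 because mean() returns NA whenever one of its inputs is missing. A minimal sketch of one way to skip missing values during aggregation follows; the table `long` is a hypothetical stand-in for md with the missing-ID rows left out, and the anonymous function is my own choice rather than something from the original post.

```r
library(data.table)

# Hypothetical long table shaped like md, without the rows whose ID is NA.
long <- data.table(
  ID       = c(1, 2, 2, 1, 2, 2),
  variable = c("X1", "X1", "X1", "X2", "X2", "X2"),
  value    = c(3, NA, 2, 5, 1, 4)
)

# Aggregate duplicates per (ID, variable) cell, ignoring missing values.
agg <- dcast(long, ID ~ variable,
             fun.aggregate = function(x) mean(x, na.rm = TRUE),
             value.var = "value")
print(agg)
```

Naming value.var explicitly avoids relying on dcast guessing which column holds the values.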
newmd2<- dcast(md, ID+variable~Time)
newmd2
## ID variable 1 2 NA
## 1: 1 X1 NA 3 NA
## 2: 1 X2 NA 5 NA
## 3: 2 X1 2 NA NA
## 4: 2 X2 4 NA 1
## 5: NA X1 5 NA NA
## 6: NA X2 NA NA NA
# ID + variable ~ Time: each (ID, variable) pair forms a row and each Time value (1, 2, NA) becomes a column, similar to an Excel pivot table
newmd3<- dcast(md, ID~variable+Time)
newmd3  # each combination of variable (X1, X2) and Time (1, 2, NA) becomes a column, similar to an Excel pivot table
## ID X1_1 X1_2 X1_NA X2_1 X2_2 X2_NA
## 1: 1 NA 3 NA NA 5 NA
## 2: 2 2 NA NA 4 NA 1
## 3: NA 5 NA NA NA NA NA
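The column names above are built by joining the right-hand-side values with "_". As a small sketch of how that naming works, and of how empty cells can be given a value other than NA, the example below uses dcast's `fill` argument on a hypothetical table `long` (the data are assumptions for illustration).

```r
library(data.table)

# Hypothetical long table with one value per (ID, variable, Time).
long <- data.table(
  ID       = c(1, 1, 2, 2),
  variable = c("X1", "X2", "X1", "X2"),
  Time     = c(1, 1, 2, 2),
  value    = c(3, 5, 2, 4)
)

# Two variables on the right-hand side: one column per (variable, Time) pair.
# fill = 0 replaces cells that have no matching row.
wide <- dcast(long, ID ~ variable + Time, value.var = "value", fill = 0)

print(names(wide))
print(wide)
```

Whether 0 is a sensible fill depends on the data; leaving the default NA keeps "no observation" distinguishable from a real zero.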
Even if you are only a speck of dust in the world, take control of your own fate; like a sunflower, welcome the sunshine and bloom bravely.
Copyright notice
This article was written by [qq_ fifty-two million eight hundred and thirteen thousand one h]; please include the original link when reposting. Thanks.
https://yzsam.com/2022/04/202204231258305791.html