dcast, id.vars, measure.vars: data splitting and merging
2022-04-23 13:01:00 【qq_ fifty-two million eight hundred and thirteen thousand one h】
R and data.table: melt/dcast (data splitting and merging)
Preface: reshaping data is a bit like kneading dough. First melt() breaks the data down into a long format, and then dcast() reshapes it into the desired form.
reshape2 package:
melt: converts wide-format data into long format.
cast: converts long-format data into wide format (dcast returns a data frame; acast returns a vector, matrix, or array).
Note: melt means "melting" the data, which in practice converts it from wide to long.
Besides restoring the original layout, cast can also aggregate the data.
dcast outputs a data frame. Each variable on the left side of the formula becomes a column in the result; each variable on the right side is treated as a factor, and each of its levels produces a column in the result.
tidyr package:
gather: converts wide data into a longer form, analogous to melt in reshape2.
spread: converts long data into a wider form, analogous to the cast functions in reshape2.
data.table package:
The melt and dcast functions in data.table are enhanced extensions of the functions of the same name in reshape2.
library(data.table)
ID <- c(NA,1,2,2)
Time <- c(1,2,NA,1)
X1 <- c(5,3,NA,2)
X2 <- c(NA,5,1,4)
mydata <- data.table(ID,Time,X1,X2)
mydata
## ID Time X1 X2
## 1: NA 1 5 NA
## 2: 1 2 3 5
## 3: 2 NA NA 1
## 4: 2 1 2 4
md <- melt(mydata, id.vars = c("ID", "Time")) # or: md <- melt(mydata, id.vars = 1:2)
# melt makes each row a unique identifier-variable combination
md # ID and Time are kept as id columns; all other columns (X1, X2) are melted
## ID Time variable value
## 1: NA 1 X1 5
## 2: 1 2 X1 3
## 3: 2 NA X1 NA
## 4: 2 1 X1 2
## 5: NA 1 X2 NA
## 6: 1 2 X2 5
## 7: 2 NA X2 1
## 8: 2 1 X2 4
melt stacks the measured columns into two new columns: variable, which records which original column each value came from, and value, which holds the corresponding value. Data arranged row by row like this is called the long format.
melt: melting a data set restructures it so that each measured variable has its own row, together with the identifier variables needed to uniquely identify that measurement.
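As a minimal sketch of that definition, the snippet below melts a small table with two identifier columns and two measured columns. The table `wide` and its values are made up for illustration and are not part of the original post.

```r
library(data.table)

# Hypothetical example data: two identifier columns, two measured columns.
wide <- data.table(
  ID   = c(1, 1, 2),
  Time = c(1, 2, 1),
  X1   = c(5, 3, 2),
  X2   = c(6, 5, 4)
)

# Name the identifier columns explicitly; the remaining columns are melted.
long <- melt(wide, id.vars = c("ID", "Time"))

# One row per (ID, Time, variable) combination: 3 rows x 2 measured columns.
print(nrow(long))             # 6
print(levels(long$variable))  # "X1" "X2"
```

The measured columns are stacked one after another, so all X1 values come first, followed by all X2 values.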
str(mydata)
## Classes 'data.table' and 'data.frame': 4 obs. of 4 variables:
## $ ID : num NA 1 2 2
## $ Time: num 1 2 NA 1
## $ X1 : num 5 3 NA 2
## $ X2 : num NA 5 1 4
## - attr(*, ".internal.selfref")=<externalptr>
str(md)
## Classes 'data.table' and 'data.frame': 8 obs. of 4 variables:
## $ ID : num NA 1 2 2 NA 1 2 2
## $ Time : num 1 2 NA 1 1 2 NA 1
## $ variable: Factor w/ 2 levels "X1","X2": 1 1 1 1 2 2 2 2
## $ value : num 5 3 NA 2 NA 5 1 4
## - attr(*, ".internal.selfref")=<externalptr>
setcolorder(md, c("ID","variable","Time","value")) ## setcolorder() changes the column order by reference.
md
## ID variable Time value
## 1: NA X1 1 5
## 2: 1 X1 2 3
## 3: 2 X1 NA NA
## 4: 2 X1 1 2
## 5: NA X2 1 NA
## 6: 1 X2 2 5
## 7: 2 X2 NA 1
## 8: 2 X2 1 4
mdr <- melt(mydata, id.vars = c("ID","Time"), variable.name = "Xzl", value.name = "Vzl", na.rm = TRUE) # variable.name and value.name rename the two new columns; na.rm = TRUE drops rows whose value is NA
mdr
## ID Time Xzl Vzl
## 1: NA 1 X1 5
## 2: 1 2 X1 3
## 3: 2 1 X1 2
## 4: 1 2 X2 5
## 5: 2 NA X2 1
## 6: 2 1 X2 4
mdr1 <- melt(mydata, id.vars = c("ID","Time"), variable.name = "Xzl", value.name = "Vzl", measure.vars = c("X1"), na.rm = TRUE) # measure.vars selects which columns to melt
mdr1
## ID Time Xzl Vzl
## 1: NA 1 X1 5
## 2: 1 2 X1 3
## 3: 2 1 X1 2
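The effect of `na.rm` and `measure.vars` can be checked on a small table. This is only a sketch with made-up data; `wide`, `full`, `kept`, and `only_x1` are illustrative names, not objects from the original post.

```r
library(data.table)

# Hypothetical example data with one missing value in each measured column.
wide <- data.table(
  ID   = c(1, 2, 3),
  Time = c(1, 1, 2),
  X1   = c(5, NA, 2),
  X2   = c(NA, 5, 4)
)

# All melted rows, including those with a missing value.
full <- melt(wide, id.vars = c("ID", "Time"),
             variable.name = "Xzl", value.name = "Vzl")

# Same call with na.rm = TRUE: rows where Vzl is NA are left out.
kept <- melt(wide, id.vars = c("ID", "Time"),
             variable.name = "Xzl", value.name = "Vzl", na.rm = TRUE)

# measure.vars limits the melt to X1; X2 does not appear in the result.
only_x1 <- melt(wide, id.vars = c("ID", "Time"), measure.vars = "X1",
                variable.name = "Xzl", value.name = "Vzl", na.rm = TRUE)

print(c(full = nrow(full), kept = nrow(kept), only_x1 = nrow(only_x1)))
```

Here `full` has 6 rows, `kept` has 4 (the two NA values are removed), and `only_x1` has 2.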
md[Time==1]
## ID variable Time value
## 1: NA X1 1 5
## 2: 2 X1 1 2
## 3: NA X2 1 NA
## 4: 2 X2 1 4
md[Time==2]
## ID variable Time value
## 1: 1 X1 2 3
## 2: 1 X2 2 5
# Casting (reshaping, optionally with aggregation)
# rowvar1 + rowvar2 + ... ~ colvar1 + colvar2 + ...
# In this formula, rowvar1 + rowvar2 + ... defines the set of crossed variables that determine the rows, and colvar1 + colvar2 + ... defines the set of crossed variables that determine the columns.
newmd<- dcast(md, ID~variable, mean)
newmd
## ID X1 X2
## 1: 1 3 5.0
## 2: 2 NA 2.5
## 3: NA 5 NA
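In the result above, X1 for ID 2 is NA because mean() returns NA whenever any of its inputs is NA. One possible way around this, sketched here on made-up data (`long`, `wide_default`, and `wide_narm` are illustrative names), is to pass an aggregation function that removes missing values first.

```r
library(data.table)

# Hypothetical long-format data; ID 2 has one missing X1 value.
long <- data.table(
  ID       = c(1, 1, 2, 2, 2, 2),
  variable = c("X1", "X2", "X1", "X1", "X2", "X2"),
  value    = c(3, 5, NA, 2, 1, 4)
)

# Plain mean(): a single NA in a group makes the whole cell NA.
wide_default <- dcast(long, ID ~ variable,
                      fun.aggregate = mean, value.var = "value")

# Wrapping mean() so that missing values are dropped before averaging.
wide_narm <- dcast(long, ID ~ variable,
                   fun.aggregate = function(x) mean(x, na.rm = TRUE),
                   value.var = "value")

print(wide_default)
print(wide_narm)
```

Whether dropping missing values is appropriate depends on the data; the wrapper only makes that choice explicit.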
newmd2<- dcast(md, ID+variable~Time)
newmd2
## ID variable 1 2 NA
## 1: 1 X1 NA 3 NA
## 2: 1 X2 NA 5 NA
## 3: 2 X1 2 NA NA
## 4: 2 X2 4 NA 1
## 5: NA X1 5 NA NA
## 6: NA X2 NA NA NA
# ID + variable ~ Time: each (ID, variable) pair forms a row and each Time value (1, 2, NA) forms a column, similar to an Excel pivot table
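To illustrate how the two sides of the formula shape the result, the sketch below melts a small made-up table and casts it back in two ways. The names `wide`, `long`, `restored`, and `by_time` are illustrative, and `value.var` is given explicitly rather than left for dcast to guess.

```r
library(data.table)

# Hypothetical example data, already sorted by ID and Time.
wide <- data.table(
  ID   = c(1, 1, 2),
  Time = c(1, 2, 1),
  X1   = c(5, 3, 2),
  X2   = c(6, 5, 4)
)
long <- melt(wide, id.vars = c("ID", "Time"))

# ID + Time on the left identify the rows; each level of `variable`
# on the right becomes a column, which restores the wide layout.
restored <- dcast(long, ID + Time ~ variable, value.var = "value")

# ID + variable on the left, Time on the right: one column per Time value.
by_time <- dcast(long, ID + variable ~ Time, value.var = "value")

print(restored)
print(by_time)
```

In `by_time`, ID 2 has no observation at Time 2, so those cells are filled with NA.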
newmd3<- dcast(md, ID~variable+Time)
newmd3 # one column per combination of variable (X1, X2) and Time (1, 2, NA), similar to an Excel pivot table
## ID X1_1 X1_2 X1_NA X2_1 X2_2 X2_NA
## 1: 1 NA 3 NA NA 5 NA
## 2: 2 2 NA NA 4 NA 1
## 3: NA 5 NA NA NA NA NA
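The column names above are built by joining the right-hand-side values with an underscore. A small sketch of how the `sep` argument of data.table's dcast changes that separator, using made-up data (`long`, `underscored`, and `dotted` are illustrative names):

```r
library(data.table)

# Hypothetical long-format data with two variables and two time points.
long <- data.table(
  ID       = c(1, 1, 2, 1, 1, 2),
  Time     = c(1, 2, 1, 1, 2, 1),
  variable = c("X1", "X1", "X1", "X2", "X2", "X2"),
  value    = c(5, 3, 2, 6, 5, 4)
)

# Default separator: column names such as X1_1 and X2_2.
underscored <- dcast(long, ID ~ variable + Time, value.var = "value")

# Custom separator: column names such as X1.1 and X2.2.
dotted <- dcast(long, ID ~ variable + Time, value.var = "value", sep = ".")

print(names(underscored))
print(names(dotted))
```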
Even if you are only a speck of dust in this world, take charge of your own fate; like a sunflower, greet the sunshine and bloom bravely.
Copyright notice
This article was written by [qq_ fifty-two million eight hundred and thirteen thousand one h]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/04/202204231258305791.html