SAS data processing technology (1)
2022-08-10 23:56:00 【metaX】
SAS data processing techniques
In most cases, the DATA step is a good choice for data processing. It provides:
Flexible programming capabilities
A rich set of data processing functions
Other tools and techniques
The SORT, SQL, and TRANSPOSE procedures are very useful for processing and transforming data.
The SAS macro facility makes code more flexible.
Debugging techniques help identify logic errors.
SAS program flow
Review of key points
The OUT= option of PROC SORT creates a new output data set instead of overwriting the original data set.
PROC FORMAT creates user-defined formats and informats. By default, the formats it creates are stored in the WORK.FORMATS catalog.
libname orion 's:\workshop';
data work.qtr1salesrep;
proc sort data=work.qtr1salesrep;
proc format;
value $ctryfmt 'AU'='Australia'
'US'='United States';
run;
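(The answer is b; a is the DATA statement and c is the SET statement.)
Variables created by the INPUT statement and by assignment statements are reinitialized (set to missing) at the start of each DATA step iteration. Variables read by the SET statement are not reinitialized; their values are only overwritten when the next observation is read.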
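The reinitialization rule can be seen in a small sketch. It assumes the `sashelp.class` sample table that ships with SAS; the data set name `demo` and the variable `Flag` are hypothetical names chosen for illustration.

```sas
/* Sketch: Flag is created by an assignment statement, so it is reset
   to missing at the top of every DATA step iteration. Name comes from
   the SET statement and is not reset. */
data demo;
   set sashelp.class(obs=3);
   if _n_ = 1 then Flag = 'first';   /* assigned only on iteration 1 */
run;

proc print data=demo;
   var Name Flag;
run;
```

Only the first observation should carry `Flag='first'`; the other two show a blank value because nothing assigns `Flag` on those iterations. Adding a RETAIN statement for `Flag` would keep the value across iterations.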
Transposing data with the OUTPUT statement
Writing multiple observations per input record
In this example, each input record generates three observations. Each new observation contains an identifier, ID, and one test value, SCORE.
data A;
input ID $ score1-score3;
drop score1-score3; /* the original score columns are not written to the output data set */
score=score1;output;
score=score2;output;
score=score3;output;
cards;
02126 99 96 94
02128 89 90 88
;
proc print;
run;
Creating multiple SAS data sets in one DATA step
A single DATA step can create multiple data sets by listing the output data set names in the DATA statement. Naming a data set in an OUTPUT statement directs the current observation to one or more specific data sets.
data usa australia other;
set orion.employee_addresses;
if Country='AU' then output australia;
else if Country='US' then output usa;
else output other;
run;
Or, equivalently, with a SELECT group:
data usa australia other;
set orion.employee_addresses;
select (Country);
when ('US') output usa;
when ('AU') output australia;
otherwise output other;
end;
run;
Within a SELECT group, a DO-END group lets a WHEN clause execute multiple statements when its expression is true.
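A minimal sketch of DO-END groups inside SELECT follows. To stay self-contained it uses the `sashelp.cars` sample table rather than the `orion` library; the output data set names and the `Note` variable are hypothetical.

```sas
/* Sketch: each WHEN clause runs two statements by wrapping them
   in a DO-END group. */
data usa europe other;
   length Note $ 20;
   set sashelp.cars(keep=Make Model Origin MSRP);
   select (Origin);
      when ('USA') do;
         Note = 'domestic';
         output usa;
      end;
      when ('Europe') do;
         Note = 'imported (Europe)';
         output europe;
      end;
      otherwise do;
         Note = 'imported (other)';
         output other;
      end;
   end;
run;
```

Without the DO-END group, only the single statement that follows `when (...)` belongs to that clause.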
Use conditional statements to control which observations are written to which data set.
In the DATA step, the DROP and KEEP statements control which variables are written to the output data set.
By default, SAS processes every observation in the input data set, one at a time. The FIRSTOBS= and OBS= options control which observations are processed.
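As a hedged illustration of these controls, the sketch below reads a slice of the `sashelp.class` sample table and drops two variables. The data set name `class_subset` is hypothetical.

```sas
/* Sketch: FIRSTOBS= and OBS= select observations 5 through 10
   (OBS= names the last observation to process, not a count);
   DROP removes two variables from the output data set. */
data class_subset;
   set sashelp.class(firstobs=5 obs=10);
   drop Height Weight;
run;

proc print data=class_subset;
run;
```

Because OBS= is the number of the last observation to read, this step processes six observations, not ten.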
Creating an accumulator variable
The general form of the sum statement:
variable + expression;
The sum statement creates the variable on the left of the plus sign if it does not already exist, initializes it to 0 before the first iteration of the DATA step, and retains its value automatically across iterations. At execution it adds the value of the expression to the variable, ignoring missing values.
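The behavior of the sum statement can be checked with a small sketch on inline data. The data set name `running` and the variables `Amount` and `Total` are hypothetical.

```sas
/* Sketch: Total starts at 0, is retained between iterations,
   and the missing Amount on the second record is ignored. */
data running;
   input Amount;
   Total + Amount;
   datalines;
10
.
5
20
;
run;

proc print data=running;
run;
```

`Total` takes the values 10, 10, 15, and 35. Writing `Total = Total + Amount;` instead would produce missing values, because that form neither retains `Total` nor ignores missing operands.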
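Accumulating totals for grouped data
With a BY statement, the DATA step creates the First. and Last. variables, which mark the first and last observation of each BY group. Use them to reset and accumulate a total within each group, and use a subsetting IF statement to output only the last observation of each group.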
data deptsals(keep=Dept Deptsal);
set SalSort;
by Dept;
if First.Dept then Deptsal=0;
Deptsal+Salary;
if Last.Dept;
run;
proc sort data=sashelp.cars
out=cars;
by Make;
run;
data total_cars(keep=Make MSRP sum_price);
set cars;
by Make;
if first.Make then sum_price=0;
sum_price+MSRP;
if last.Make;
run;
proc print data=total_cars noobs;
var Make sum_price;
format sum_price dollar10.2;
run;
Different kinds of data call for different input styles
Column input, formatted input, and list input are the three ways an INPUT statement can read data.
Method | When to use |
---|---|
Column input | Standard data in fixed columns |
Formatted input | Standard and nonstandard data in fixed columns |
List input | Standard and nonstandard data separated by blanks or other delimiters |
Column input
To be read with column input, the data must be:
in fixed fields
standard character or numeric values (for example 58 -23 67.23 00.99 5.67E5 1.2E-2)
The general form of the INPUT statement for column input:
INPUT variable <$> startcol-endcol...;
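A minimal sketch of column input on inline data follows. The data set name `people`, the variable names, and the column positions are assumptions made for illustration.

```sas
/* Sketch: each variable is read from a fixed column range.
   Name is character ($) in columns 1-10; Age and Height are
   standard numeric values in columns 11-13 and 15-19. */
data people;
   input Name $ 1-10 Age 11-13 Height 15-19;
   datalines;
Alice      34 165.5
Bob        28 180.2
;
run;

proc print data=people;
run;
```

Because the columns are named explicitly, the fields can be listed in any order in the INPUT statement, and a field can be skipped simply by not naming its columns.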
Reading raw data files with formatted input
The general form of the INPUT statement for formatted input:
INPUT pointer-control variable informat...;
Formatted input reads data by moving the input pointer to the starting position, naming the variable, and specifying the informat. For example: input @5 FirstName $10.;
Column pointer controls: @n moves the pointer to column n; +n moves the pointer forward n columns.
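The sketch below combines both column pointer controls with informats for nonstandard data. The data set name `sales`, the variable names, and the record layout are assumptions made for illustration.

```sas
/* Sketch: @n jumps to an absolute column, +n skips forward.
   mmddyy10. reads a date such as 01/15/2022; comma9. reads a
   number with embedded commas. */
data sales;
   input @1 ID $4.
         @6 SaleDate mmddyy10.
         +1 Amount comma9.;
   format SaleDate date9. Amount dollar10.2;
   datalines;
A001 01/15/2022 1,250.50
B002 02/03/2022 12,000.00
;
run;

proc print data=sales;
run;
```

After `SaleDate` is read from columns 6 through 15, the pointer sits at column 16, so `+1` moves it to column 17, where the amount begins.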
Controlling when records are loaded
Reading multiple records from the raw data file as one observation:
DATA SAS-data-set;
INFILE 'raw-data-file-name';
INPUT specifications;
INPUT specifications;
<additional SAS statements>
RUN;
Each INPUT statement loads a new record by default. Alternatively, a line pointer control in a single INPUT statement controls when a new record is loaded:
DATA SAS-data-set;
INFILE 'raw-data-file-name';
INPUT specifications / specifications;
<additional SAS statements>
RUN;
When SAS encounters a "/", it loads the next record. Line pointer controls: #n loads record n; / loads the next record.
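A small sketch of the "/" line pointer control on inline data follows. The data set name `contacts`, the variable names, and the two-line record layout are assumptions made for illustration.

```sas
/* Sketch: every observation spans two raw records. The slash
   loads the second record before City and State are read. */
data contacts;
   input Name $ 1-20
       / City $ 1-15 State $ 17-18;
   datalines;
Ann Lee
Cary            NC
Raj Patel
Austin          TX
;
run;

proc print data=contacts;
run;
```

The same layout could be written with absolute line pointers as `input #1 Name $ 1-20 #2 City $ 1-15 State $ 17-18;`.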
Reading a subset of a raw data file with mixed record types.
Other techniques for list input
Data at the end of a record is missing:
INFILE 'raw-data-file' MISSOVER;
Missing data is represented by two consecutive delimiters:
INFILE 'file-name' DSD;
Each record contains multiple observations:
INPUT var1 var2 var3 ... varn @@;
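The three list input techniques above can be sketched on inline data. The data set names `phones` and `scores`, the variable names, and the sample values are hypothetical; `infile datalines` stands in for a raw data file.

```sas
/* Sketch 1: DSD treats two consecutive commas as a missing value;
   MISSOVER sets trailing variables to missing when a record ends
   early instead of reading from the next record. */
data phones;
   infile datalines dsd missover;
   input Name :$12. Phone :$12. Mobile :$12.;
   datalines;
Ann,555-0101,555-0102
Raj,,555-0199
Mei,555-0123
;
run;

/* Sketch 2: the double trailing @@ holds the record across
   iterations, so one raw record yields several observations. */
data scores;
   input ID $ Score @@;
   datalines;
A01 90 A02 85 A03 77
A04 68 A05 93
;
run;
```

In `phones`, Raj has a missing `Phone` and Mei has a missing `Mobile`. In `scores`, the two raw records produce five observations.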