Introduction to data analysis | Kaggle Titanic mission (III) -> Exploratory data analysis
2022-04-23 10:33:00 【Ape knowledge】

Series index: Introduction to data analysis | Kaggle Titanic mission
I. Exploratory data analysis
This part mainly covers sorting with pandas, arithmetic operations, and the descriptive statistics function describe().
(1) Create some sample data
# Imports used throughout this section
import numpy as np
import pandas as pd
# Build a numeric DataFrame
frame = pd.DataFrame(np.arange(8).reshape((2, 4)),
                     index=['2', '1'],
                     columns=['d', 'a', 'b', 'c'])
frame
pd.DataFrame(): creates a DataFrame object
np.arange(8).reshape((2, 4)): generates a two-dimensional array (2×4); first row: 0, 1, 2, 3; second row: 4, 5, 6, 7
index=['2', '1']: the row labels (row index) of the DataFrame
columns=['d', 'a', 'b', 'c']: the column labels (column index) of the DataFrame
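As a quick check of the description above, the following sketch rebuilds the same frame and inspects its shape, labels, and contents. It uses only the names already introduced in this section.

```python
import numpy as np
import pandas as pd

# Same construction as in the text
frame = pd.DataFrame(np.arange(8).reshape((2, 4)),
                     index=['2', '1'],
                     columns=['d', 'a', 'b', 'c'])

print(frame.shape)              # (2, 4): two rows, four columns
print(list(frame.index))        # row labels: ['2', '1']
print(list(frame.columns))      # column labels: ['d', 'a', 'b', 'c']
print(frame.loc['2'].tolist())  # first row: [0, 1, 2, 3]
print(frame['d'].tolist())      # column 'd': [0, 4]
```

Note that reshape fills the array row by row, which is why 0 to 3 end up in the first row and not in the first column.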
(2) Sorting
# The by parameter names the column to sort on; ascending selects the sort direction (ascending/descending)
frame.sort_values(by='c', ascending=True)
# Sort the row index in ascending order
frame.sort_index()
# Sort the column index in ascending order
frame.sort_index(axis=1)
# Sort the column index in descending order
frame.sort_index(axis=1, ascending=False)
# Sort by two columns at once, both in descending order
frame.sort_values(by=['a', 'c'], ascending=False)
When sorting by two columns, the columns are applied in order: rows are sorted by the first column, and ties in it are broken by the next column.
For example: sort_values(by=['a', 'c'], ascending=[False, True])
This sorts by a in descending order and, where the values of a are equal, by c in ascending order.
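The sample frame in this section has no repeated values, so the tie-breaking rule is hard to see on it. Here is a minimal sketch on a small made-up DataFrame (the name `tied` and its values are hypothetical, chosen only so that column a contains ties):

```python
import pandas as pd

# Hypothetical data: column 'a' contains repeated values
tied = pd.DataFrame({'a': [1, 2, 2, 1], 'c': [9, 3, 7, 5]},
                    index=['w', 'x', 'y', 'z'])

# Sort by 'a' descending; break ties in 'a' by 'c' ascending
result = tied.sort_values(by=['a', 'c'], ascending=[False, True])
print(result)
print(list(result.index))  # ['x', 'y', 'z', 'w']
```

Rows x and y share a == 2 and are ordered by c (3 before 7); rows z and w share a == 1 and are likewise ordered by c (5 before 9).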
(3) Arithmetic with pandas
frame1_a = pd.DataFrame(np.arange(9.).reshape(3, 3),
                        columns=['a', 'b', 'c'],
                        index=['one', 'two', 'three'])
frame1_b = pd.DataFrame(np.arange(12.).reshape(4, 3),
                        columns=['a', 'e', 'c'],
                        index=['first', 'one', 'two', 'second'])
frame1_a
# Add frame1_a and frame1_b
frame1_a + frame1_b
【Reminder】 Adding two DataFrames returns a new DataFrame. Values whose row and column labels match are added together; any position that exists in only one of the two becomes a null value, NaN.
Of course, DataFrame supports many other arithmetic operations, such as subtraction and division. Interested readers can consult the "Arithmetic and Data Alignment" section in Chapter 5 of "Python for Data Analysis", or look for more learning materials online.
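The alignment behaviour can be checked directly on the two frames defined above. The sketch below also shows `DataFrame.add` with `fill_value=0`, which is not used in the original text; it is included here only as one common way to avoid NaN when a value is present in just one of the two frames.

```python
import numpy as np
import pandas as pd

frame1_a = pd.DataFrame(np.arange(9.).reshape(3, 3),
                        columns=['a', 'b', 'c'],
                        index=['one', 'two', 'three'])
frame1_b = pd.DataFrame(np.arange(12.).reshape(4, 3),
                        columns=['a', 'e', 'c'],
                        index=['first', 'one', 'two', 'second'])

# Plain addition: only labels present in both frames produce a number
total = frame1_a + frame1_b
print(total)
print(total.loc['one', 'a'])   # 0.0 + 3.0 = 3.0
print(total.loc['one', 'b'])   # NaN: column 'b' exists only in frame1_a

# add() with fill_value=0 treats a value missing on one side as 0;
# positions missing on both sides still come out as NaN
filled = frame1_a.add(frame1_b, fill_value=0)
print(filled.loc['one', 'b'])    # 1.0 + 0 = 1.0
print(filled.loc['three', 'e'])  # NaN: absent from both frames
```

The result covers the union of both row indexes and both sets of columns, so it has 5 rows and 4 columns.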
# Call the describe function to see basic statistics of the data in frame2
frame2.describe()
'''
count : number of samples
mean  : mean of the sample data
std   : standard deviation of the sample data
min   : minimum value of the sample data
25%   : 25th percentile of the sample data
50%   : 50th percentile (median) of the sample data
75%   : 75th percentile of the sample data
max   : maximum value of the sample data
'''
''' Look at the basic statistics of the fare column in the Titanic dataset '''
# The column label depends on how the data was loaded; the Kaggle CSV names this column 'Fare'
text[' The fare '].describe()
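To make the fields returned by describe() concrete, here is a small sketch on a made-up Series. The name `fares` and its four values are hypothetical and are not taken from the Titanic data.

```python
import pandas as pd

# Hypothetical fare values, for illustration only
fares = pd.Series([10.0, 20.0, 30.0, 40.0], name='fare')

stats = fares.describe()
print(stats)

print(stats['count'])  # 4.0
print(stats['mean'])   # 25.0
print(stats['25%'])    # 17.5 (linear interpolation between 10 and 20)
print(stats['50%'])    # 25.0 (the median)
print(stats['max'])    # 40.0
```

The std entry is the sample standard deviation (dividing by n - 1), which is about 12.91 for these values.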
The series "Introduction to data analysis | Kaggle Titanic mission" is updated continuously. Likes, bookmarks, and follows are welcome.
Previous: Introduction to data analysis | Kaggle Titanic mission (II) -> pandas basics
Next: Introduction to data analysis | Kaggle Titanic mission (IV) -> Data cleaning and feature processing
My knowledge is limited, so please point out any shortcomings of this article in the comment section below. If it helped you, a like would be appreciated.
I share interesting, substantial, and useful content from time to time. You are welcome to subscribe to and follow my blog; I look forward to meeting you here.
Copyright notice
This article was written by [Ape knowledge]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204230619310794.html