Introduction to Data Analysis | Kaggle Titanic Task (III): Exploratory Data Analysis
2022-04-23 10:33:00 【Ape knowledge】
Series index: Introduction to Data Analysis | Kaggle Titanic Task
1. Exploratory data analysis
This part covers sorting with pandas, arithmetic between DataFrames, and the summary-statistics function describe().
(1) Create sample data
import numpy as np
import pandas as pd

# Build a numeric DataFrame
frame = pd.DataFrame(np.arange(8).reshape((2, 4)),
                     index=['2', '1'],
                     columns=['d', 'a', 'b', 'c'])
frame
pd.DataFrame(): creates a DataFrame object
np.arange(8).reshape((2, 4)): generates a two-dimensional 2×4 array; first row: 0, 1, 2, 3; second row: 4, 5, 6, 7
index=['2', '1']: the row labels (index) of the DataFrame
columns=['d', 'a', 'b', 'c']: the column labels of the DataFrame
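To make the layout concrete, here is a minimal sketch that rebuilds the same frame and inspects its row labels, column labels, and one cell. The variable name data is introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# The 2x4 array is filled row by row: [0, 1, 2, 3] then [4, 5, 6, 7]
data = np.arange(8).reshape((2, 4))
frame = pd.DataFrame(data, index=['2', '1'], columns=['d', 'a', 'b', 'c'])

print(data.tolist())         # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(list(frame.index))     # ['2', '1']  (row labels)
print(list(frame.columns))   # ['d', 'a', 'b', 'c']  (column labels)
print(frame.loc['1', 'a'])   # 5  (row labelled '1', column labelled 'a')
```

Note that the labels are strings and are deliberately out of order, which is what makes the sorting examples in the next part visible.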
(2) Sort
# by names the column to sort on; ascending sets the direction (True = ascending, False = descending)
frame.sort_values(by='c', ascending=True)
# Sort by row index in ascending order
frame.sort_index()
# Sort by column index in ascending order
frame.sort_index(axis=1)
# Sort by column index in descending order
frame.sort_index(axis=1, ascending=False)
# Sort by two columns at once, both in descending order
frame.sort_values(by=['a', 'c'], ascending=False)
When sorting by two columns, the columns are applied in order: rows are sorted by the first column, and rows with equal values in it are then ordered by the next column.
For example: sort_values(by=['a', 'c'], ascending=[False, True])
This sorts by a in descending order and, where the values of a are equal, by c in ascending order.
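The frame built earlier has no tied values, so the tie-breaking rule is not visible there. The sketch below uses a small made-up frame (demo and its values are hypothetical, chosen only so that column a contains ties) to show the second column taking over.

```python
import pandas as pd

# Hypothetical data: column 'a' has ties (two 1s and two 2s)
demo = pd.DataFrame({'a': [1, 2, 2, 1], 'c': [9, 7, 3, 5]},
                    index=['w', 'x', 'y', 'z'])

# 'a' descending first; within equal 'a', 'c' ascending
result = demo.sort_values(by=['a', 'c'], ascending=[False, True])
print(list(result.index))   # ['y', 'x', 'z', 'w']
```

Rows x and y share a = 2, so they are ordered by c (3 before 7); the same happens for z and w with a = 1.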
(3) Arithmetic with pandas
frame1_a = pd.DataFrame(np.arange(9.).reshape(3, 3),
columns=['a', 'b', 'c'],
index=['one', 'two', 'three'])
frame1_b = pd.DataFrame(np.arange(12.).reshape(4, 3),
columns=['a', 'e', 'c'],
index=['first', 'one', 'two', 'second'])
frame1_a
# Add frame1_a and frame1_b
frame1_a + frame1_b
[Note] Adding two DataFrames returns a new DataFrame. Values whose row and column labels match in both frames are added; any position that has no match in the other frame becomes NaN.
DataFrames support many other arithmetic operations as well, such as subtraction and division. Interested readers can see the "Arithmetic and Data Alignment" section of Chapter 5 of "Python for Data Analysis", or look for further learning material online.
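As a rough illustration of the alignment rule, the sketch below repeats the addition and then tries the method forms add, sub, and div. Using fill_value=0 is one possible way to treat a value missing on one side as zero; it is shown here as an option, not as something the original text prescribes. The names total, filled, diff, and ratio are introduced only for this example.

```python
import numpy as np
import pandas as pd

frame1_a = pd.DataFrame(np.arange(9.).reshape(3, 3),
                        columns=['a', 'b', 'c'],
                        index=['one', 'two', 'three'])
frame1_b = pd.DataFrame(np.arange(12.).reshape(4, 3),
                        columns=['a', 'e', 'c'],
                        index=['first', 'one', 'two', 'second'])

# Only labels present in BOTH frames get a value:
# rows 'one' and 'two', columns 'a' and 'c'
total = frame1_a + frame1_b
print(total.loc['one', 'a'])            # 3.0  (0.0 + 3.0)
print(int(total.notna().sum().sum()))   # 4 non-NaN cells

# fill_value=0 substitutes 0 when a value is missing on one side only;
# a cell missing in both frames is still NaN
filled = frame1_a.add(frame1_b, fill_value=0)
print(filled.loc['three', 'a'])         # 6.0  (6.0 + 0)

# Subtraction and division align on labels in the same way
diff = frame1_a.sub(frame1_b)
ratio = frame1_a.div(frame1_b)
print(ratio.loc['two', 'a'])            # 0.5  (3.0 / 6.0)
```

The result always carries the union of both frames' row and column labels, which is why it is larger than either input.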
# Call describe() to view basic statistics of frame2
frame2.describe()
'''
count : number of non-null values in the sample
mean  : mean of the sample
std   : standard deviation of the sample
min   : minimum value of the sample
25%   : 25th percentile of the sample
50%   : 50th percentile (median) of the sample
75%   : 75th percentile of the sample
max   : maximum value of the sample
'''
# Basic statistics for the fare column ('票价') of the Titanic dataset
text['票价'].describe()
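To see how each row of the describe() output is computed, here is a small sketch on made-up numbers. The Series fares and its values are hypothetical and are not taken from the Titanic data.

```python
import pandas as pd

# Hypothetical values, including one missing entry
fares = pd.Series([10.0, 20.0, 30.0, 40.0, None], name='fare')
stats = fares.describe()

print(stats['count'])   # 4.0   (the missing value is not counted)
print(stats['mean'])    # 25.0
print(stats['25%'])     # 17.5  (linear interpolation between 10 and 20)
print(stats['50%'])     # 25.0  (median)
print(stats['max'])     # 40.0
# std is the sample standard deviation (divides by n - 1)
print(round(stats['std'], 4))
```

For numeric data describe() skips missing values, so count can be smaller than the length of the column; comparing the two is a quick way to spot missing data.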
Previous: Introduction to Data Analysis | Kaggle Titanic Task (II): pandas Basics
Next: Introduction to Data Analysis | Kaggle Titanic Task (IV): Data Cleaning and Feature Processing
Copyright notice
This article was written by [Ape knowledge]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204230619310794.html