Pandas Basics
Getting
>>> s['b']
-5
>>> df[1:]
Pandas
The Pandas library is built on NumPy and provides easy-to-use
data structures and data analysis tools for the Python
programming language.
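A minimal sketch of the usual import convention and of the NumPy connection mentioned above. The array contents and the column names `x` and `y` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Wrap a NumPy array in a DataFrame (column names are illustrative).
arr = np.array([[1, 2], [3, 4]])
frame = pd.DataFrame(arr, columns=['x', 'y'])

# Convert back to a plain NumPy array.
back = frame.to_numpy()
print(type(back))
print(frame)
```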
Dropping
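The examples for this heading did not survive in this copy. The sketch below shows one common way to drop entries, assuming a Series `s` and a DataFrame `df` like the ones used elsewhere in this sheet. The Series values other than `b = -5` are illustrative assumptions.

```python
import pandas as pd

s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])  # illustrative values
df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasília'],
    'Population': [11190846, 1303171035, 207847528],
})

# Drop rows of a Series by index label; a new Series is returned.
s_dropped = s.drop(['a', 'c'])

# Drop a column of a DataFrame (axis=1 means columns); df itself is unchanged.
df_dropped = df.drop('Country', axis=1)

print(s_dropped)
print(df_dropped)
```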
Asking For Help
>>> help(pd.Series.loc)
Result of df[1:]:
   Country    Capital  Population
1    India  New Delhi  1303171035
2   Brazil   Brasília   207847528
By Position
>>> df.iloc[0, 0]
'Belgium'
DataFrame
A two-dimensional labeled
data structure with columns
of potentially different types
Columns: Country, Capital, Population
Index: 0, 1, 2
   Country    Capital  Population
0  Belgium   Brussels    11190846
1    India  New Delhi  1303171035
2   Brazil   Brasília   207847528
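A minimal sketch of how the DataFrame used in this sheet could be built from a dictionary of columns. The variable name `data` is an assumption; the values are the ones listed in the sheet.

```python
import pandas as pd

data = {
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasília'],
    'Population': [11190846, 1303171035, 207847528],
}

# Passing columns= fixes the column order explicitly.
df = pd.DataFrame(data, columns=['Country', 'Capital', 'Population'])

print(df)
print(df.dtypes)  # each column has its own dtype
```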
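By Label/Position
Note: df.ix was removed in pandas 1.0. With the default index 0, 1, 2, df.loc gives the same results.
>>> df.loc[2]
Country          Brazil
Capital        Brasília
Population    207847528
>>> df.loc[:, 'Capital']
0     Brussels
1    New Delhi
2     Brasília
>>> df.loc[1, 'Capital']
'New Delhi'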
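The difference between label-based and position-based selection only becomes visible once the index labels are not 0, 1, 2. The sketch below assumes the DataFrame from this sheet and re-indexes it by `Country` to show this; `by_country` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasília'],
    'Population': [11190846, 1303171035, 207847528],
})

# With the default integer index, label 2 and position 2 are the same row.
row_by_label = df.loc[2]
row_by_position = df.iloc[2]

# A single value by row label and column label.
capital = df.loc[1, 'Capital']

# With a string index, .loc takes labels and .iloc takes positions.
by_country = df.set_index('Country')
brazil = by_country.loc['Brazil']
also_brazil = by_country.iloc[2]

print(capital)
print(brazil)
```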
Boolean Indexing
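The examples for this heading are missing from this copy. As a hedged sketch, boolean indexing filters with a mask of True/False values. The Series values other than `b = -5` and the population threshold are assumptions chosen for illustration.

```python
import pandas as pd

s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])  # illustrative values
df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasília'],
    'Population': [11190846, 1303171035, 207847528],
})

# Keep the elements where the value is NOT greater than 1.
small = s[~(s > 1)]

# Combine conditions with | (or) and & (and); parentheses are required.
outer = s[(s < -1) | (s > 2)]

# Filter DataFrame rows with a condition on one column.
big = df[df['Population'] > 1_200_000_000]

print(small)
print(big)
```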
Setting
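A short sketch of setting values by label, based on the `s['a'] = 6` example that appears in this sheet. The starting Series values are illustrative assumptions, and the extra label `e` is hypothetical.

```python
import pandas as pd

s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])  # illustrative values

# Overwrite the value stored under an existing label.
s['a'] = 6

# Assigning to a label that does not exist yet appends a new entry.
s['e'] = 1

print(s)
```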
Basic Information
>>> df.shape                    (rows, columns)
>>> df.index                    Describe index
>>> df.columns                  Describe DataFrame columns
>>> df.info()                   Info on DataFrame
>>> df.count()                  Number of non-NA values
Summary
>>> df.sum()                    Sum of values
>>> df.cumsum()                 Cumulative sum of values
>>> df.min()/df.max()           Minimum/maximum values
>>> df.idxmin()/df.idxmax()     Minimum/maximum index value
>>> df.describe()               Summary statistics
>>> df.mean()                   Mean of values
>>> df.median()                 Median of values
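The calls above can be tried on the DataFrame from this sheet, as in the sketch below. Because that frame mixes text and numeric columns, the numeric reductions are applied to the `Population` column or restricted with `numeric_only=True`; recent pandas versions may raise an error otherwise.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasília'],
    'Population': [11190846, 1303171035, 207847528],
})

shape = df.shape            # (rows, columns)
non_na = df.count()         # non-NA values per column

pop = df['Population']
total = pop.sum()           # sum of values
running = pop.cumsum()      # cumulative sum
largest_at = pop.idxmax()   # index label of the maximum
middle = pop.median()       # median of values

stats = df.describe()                # summary statistics (numeric columns)
means = df.mean(numeric_only=True)   # mean of numeric columns only

print(shape)
print(stats)
```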
Applying Functions
>>> f = lambda x: x*2
>>> df.apply(f)                 Apply function
>>> df.applymap(f)              Apply function element-wise
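A minimal sketch of the two calls on a small numeric frame (`num` and its values are hypothetical). `apply` passes each column to the function as a Series, while the element-wise variant calls it once per cell. In pandas 2.1 and later `applymap` is deprecated in favor of `DataFrame.map`, so the sketch uses whichever is available.

```python
import pandas as pd

num = pd.DataFrame({'x': [1, 2, 3], 'y': [10, 20, 30]})  # illustrative data

f = lambda x: x * 2

# f receives each column as a whole Series.
doubled = num.apply(f)

# A function that reduces a column to one number yields a Series.
col_range = num.apply(lambda col: col.max() - col.min())

# Element-wise application: DataFrame.map where it exists, else applymap.
if hasattr(num, 'map'):
    elementwise = num.map(f)
else:
    elementwise = num.applymap(f)

print(doubled)
print(col_range)
```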
Data Alignment
a    10.0
b     NaN
c     5.0
d     7.0
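A sketch of how data alignment produces NaN values. Arithmetic between two Series matches entries by index label, and a label missing from one operand gives NaN. The values of `s` and the second Series `s3` are assumptions, chosen so the results match the outputs listed in this sheet.

```python
import pandas as pd

# Assumed inputs: s3 has no entry for label 'b'.
s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])
s3 = pd.Series([7, -2, 3], index=['a', 'c', 'd'])

# Plain addition: 'b' exists only in s, so the result there is NaN.
total = s + s3

# add() with fill_value treats the missing entry as 0 instead.
filled = s.add(s3, fill_value=0)

print(total)
print(filled)
```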
I/O
>>> pd.read_csv('file.csv', header=None, nrows=5)
>>> df.to_csv('myDataFrame.csv')
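A hedged round-trip sketch of the two CSV calls. It writes to a temporary directory so that no real file is touched. Note that `to_csv` is a DataFrame method, and that `header=None` makes `read_csv` treat the header line as an ordinary data row.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasília'],
    'Population': [11190846, 1303171035, 207847528],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'myDataFrame.csv')

    # index=False leaves the row labels out of the file.
    df.to_csv(path, index=False)

    # Read everything back, using the first line as column names.
    restored = pd.read_csv(path)

    # Read only the first two lines and do not treat any line as a header.
    preview = pd.read_csv(path, header=None, nrows=2)

print(restored)
print(preview)
```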
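Setting
>>> s['a'] = 6                       Set index a of Series s to 6
Sort & Rank
>>> df.sort_values(by='Country')     Sort by the values in a column
>>> s.sort_values()                  Sort a Series by its values
>>> df.rank()                        Assign ranks to entries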
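A small sketch of sorting and ranking with current pandas method names (`sort_values` replaces the older `sort_index(by=...)` and `order()` spellings). The Series values other than `b = -5` are illustrative, and ranking is shown on a single numeric column.

```python
import pandas as pd

s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])  # illustrative values
df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasília'],
    'Population': [11190846, 1303171035, 207847528],
})

# Sort rows by the values of one column.
by_country = df.sort_values(by='Country')

# Sort a Series by its values (ascending by default).
s_sorted = s.sort_values()

# Rank entries: the smallest value gets rank 1.0.
ranks = df['Population'].rank()

print(by_country)
print(ranks)
```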
By Label
>>> df.loc[0, 'Country']             Select single value by row and column labels
'Belgium'
>>> df.iat[0, 0]                     Select single value by row and column position
'Belgium'
Series
a    10.0
b    -5.0
c     5.0
d     7.0
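A minimal sketch of building a Series like the one shown above and inspecting its parts. The index labels `a` to `d` are an assumption, since the labels did not survive in this copy.

```python
import pandas as pd

# Values from the listing above; index labels are assumed.
s = pd.Series([10.0, -5.0, 5.0, 7.0], index=['a', 'b', 'c', 'd'])

labels = s.index.tolist()   # the index labels
values = s.to_numpy()       # the underlying values as a NumPy array
one = s['b']                # look up a single value by label

print(s)
print(labels, one)
```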
DataCamp