
Chapter – 11

Python Pandas – II – Dataframes and Other Operations


Characteristics of DataFrame :

Characteristics of a DataFrame are as follows-

• It has two indexes, i.e. two axes: the row index (axis 0) and the column index (axis 1).

• It is somewhat like a spreadsheet, where the row index is called the index and the
column index is called the column name.

• Indexes can be numbers, strings or letters.

• Its columns can hold data of different types.

• It is value-mutable, i.e. its values can be changed at any time.

• It is size-mutable, i.e. we can add or delete rows/columns.

Creating and Displaying a DataFrame :

A DataFrame object can be created by passing data in a two-dimensional (2D) format.


import pandas as pd
<dataFrameObject> = pd.DataFrame(<a 2D Data Structure>,\
[columns=<column sequence>],[index=<index sequence>])
• A DataFrame object can be created by passing data values in various forms, such as:
    • Two-dimensional dictionaries
    • 2D ndarrays (NumPy arrays)
    • Series type objects
    • Another DataFrame object

1. Creating a DataFrame Object from a 2-D Dictionary


(a) Creating a DataFrame Object from a 2-D Dictionary having
values as lists/ndarrays :
We can have a two-dimensional dictionary wherein the value part
consists of either lists or ndarrays.
Specifying our own indexes : pass an index sequence through the index argument of the DataFrame( ) function.
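A minimal sketch of case (a), first with the default index and then with our own index. The names students, df_default and df_custom and all the values are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical data: each key becomes a column name,
# each list holds the values of that column.
students = {
    "Name": ["Asha", "Ravi", "Meena"],
    "Marks": [82, 91, 77],
}

# Without an index argument the rows are labelled 0, 1, 2.
df_default = pd.DataFrame(students)

# With an index argument we supply one label per row.
df_custom = pd.DataFrame(students, index=["R1", "R2", "R3"])

print(df_default)
print(df_custom)
```

The index sequence must have as many labels as there are rows, otherwise pandas raises an error.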
(b) Creating a dataframe from a 2D dictionary having values as
dictionary objects :
A 2D dictionary can have values as dictionary objects too. E.g.

df1 = pd.DataFrame(diSales)

NOTE : While creating a dataframe with a nested or 2D dictionary, pandas interprets
the outer dict keys as the columns and the inner keys as the row indices.
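The note can be checked with a small sketch. The name diSales follows the text, but its contents here are made-up values, assumed only for illustration.

```python
import pandas as pd

# Hypothetical sales figures: outer keys are years, inner keys are quarters.
diSales = {
    "yr1": {"Qtr1": 34500, "Qtr2": 56000},
    "yr2": {"Qtr1": 44900, "Qtr2": 46100},
}

df1 = pd.DataFrame(diSales)

print(df1.columns.tolist())  # outer keys -> ['yr1', 'yr2']
print(df1.index.tolist())    # inner keys -> ['Qtr1', 'Qtr2']
print(df1)
```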
2. Creating a DataFrame Object from a 2-D ndarray :

The column names alone, or the column names and the index, can be user defined
through the columns and index arguments.
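A short sketch of the three variants, assuming a hypothetical 2 x 3 array: default labels, user-defined column names, and user-defined column names with an index.

```python
import numpy as np
import pandas as pd

# Hypothetical 2-D array with 2 rows and 3 columns.
arr = np.array([[1, 2, 3], [4, 5, 6]])

# Default labels: rows 0..1 and columns 0..2.
df_a = pd.DataFrame(arr)

# User-defined column names.
df_b = pd.DataFrame(arr, columns=["One", "Two", "Three"])

# User-defined column names and index.
df_c = pd.DataFrame(arr, columns=["One", "Two", "Three"], index=["A", "B"])

print(df_a)
print(df_b)
print(df_c)
```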
3. Creating a DataFrame object from a 2D dictionary with values as
Series Objects :

The DataFrame object has its columns assigned from the keys (0, 1, 2) of the
dictionary object and its index assigned from the indexes
('Delhi', 'Mumbai', 'Kolkata', 'Chennai') of the Series objects, which are the
values of the dictionary object.
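A sketch of this case, assuming three hypothetical Series that share the city index named in the text. The numbers and the variable names are invented for illustration.

```python
import pandas as pd

cities = ["Delhi", "Mumbai", "Kolkata", "Chennai"]

# Hypothetical Series objects, all indexed by city.
s0 = pd.Series([10, 20, 30, 40], index=cities)
s1 = pd.Series([1, 2, 3, 4], index=cities)
s2 = pd.Series([100, 200, 300, 400], index=cities)

# Dictionary keys 0, 1, 2 become the columns;
# the Series index becomes the row index.
df = pd.DataFrame({0: s0, 1: s1, 2: s2})

print(df)
```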
4. Creating a DataFrame Object from another DataFrame Object

We can pass an existing DataFrame object to DataFrame( ) and it will create
another dataframe object having the same data. For example, a dataframe dtf4
created from an existing dataframe dtf3 has the same contents as dtf3.
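A minimal sketch using the names dtf3 and dtf4 from the text; the data itself is hypothetical.

```python
import pandas as pd

# Hypothetical source dataframe.
dtf3 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# Passing an existing DataFrame creates a second DataFrame object.
dtf4 = pd.DataFrame(dtf3)

print(dtf3.equals(dtf4))  # True: same labels and same values
print(dtf4 is dtf3)       # False: two separate objects
```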

Displaying a DataFrame : a dataframe is displayed in the same way as other
variables and objects, in either of two ways.
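The two usual ways, sketched with hypothetical data, are print( ) and typing the object's name at an interactive prompt.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})  # hypothetical data

# Way 1: print() works in scripts and at the interactive prompt.
print(df)

# Way 2: at an interactive prompt or in a notebook cell, typing just the
# name shows the dataframe; inside a script this line displays nothing.
df
```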

DataFrame Attributes :

All information related to a DataFrame object (such as its size, its datatypes
etc.) is available through its attributes.

<DataFrame object>.<attribute name>


(a) Retrieving index (axis 0) , Columns (axis 1), axes’ details and data
types of columns :
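A sketch of the four attributes involved here: index, columns, axes and dtypes. The dataframe is hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Asha", "Ravi", "Meena"], "Marks": [82, 91, 77]},
    index=["R1", "R2", "R3"],
)

print(df.index)    # row labels (axis 0)
print(df.columns)  # column labels (axis 1)
print(df.axes)     # list holding both: [row labels, column labels]
print(df.dtypes)   # data type of each column
```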

(b) Retrieving size (number of elements), shape and number of dimensions :
use the attributes size, shape and ndim respectively.
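For a hypothetical dataframe with 3 rows and 2 columns, the three attributes give:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Asha", "Ravi", "Meena"], "Marks": [82, 91, 77]})

print(df.size)   # 6      -> rows x columns
print(df.shape)  # (3, 2) -> (rows, columns)
print(df.ndim)   # 2      -> a DataFrame is always two-dimensional
```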

(c) Checking whether a dataframe is empty : use the attribute empty.
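A short sketch with one hypothetical non-empty dataframe and one created without any data:

```python
import pandas as pd

df = pd.DataFrame({"Marks": [82, 91, 77]})  # hypothetical data
df_blank = pd.DataFrame()                   # no rows, no columns

print(df.empty)        # False
print(df_blank.empty)  # True
```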

(d) Getting number of rows in a dataframe : len(<DF object>) returns the
number of rows in a dataframe.
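For instance, with a hypothetical dataframe of three rows:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Asha", "Ravi", "Meena"], "Marks": [82, 91, 77]})

n_rows = len(df)
print(n_rows)  # 3
```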

(e) Getting count of non-NA values in dataframe : <DF object>.count( ) gives
the count of non-NaN values for each column.
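A sketch with hypothetical data containing missing values, so that the per-column counts differ:

```python
import numpy as np
import pandas as pd

# Hypothetical data: column A has one NaN, column B has two.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [np.nan, np.nan, 6.0],
})

counts = df.count()  # one count per column, NaN values are skipped
print(counts)        # A -> 2, B -> 1
```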
(f) NumPy representation of DataFrame : the values attribute gives the values
of a dataframe object as a NumPy array.
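A minimal sketch, assuming a small all-numeric dataframe:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})  # hypothetical data

arr = df.values  # 2-D NumPy array; row and column labels are dropped
print(type(arr))
print(arr)
```

When the columns have different data types, the resulting array uses a common type that can hold all of them, often object.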

Selecting or Accessing Data :

Selecting/Accessing a column

<DataFrame object>[<column name>]

Or

<DataFrame object>.<column name>


Selecting / Accessing Multiple Columns : Give a list having multiple
column names inside square brackets.
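The forms above can be sketched as follows. The dataframe and its column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena"],
    "Marks": [82, 91, 77],
    "Grade": ["B", "A", "C"],
})

one_col = df["Marks"]            # single column -> Series
same_col = df.Marks              # dot notation, same result
two_cols = df[["Name", "Grade"]] # list of names -> DataFrame

print(one_col)
print(two_cols)
```

Dot notation only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method; square brackets always work.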
Selecting / Accessing a Subset from a Dataframe using
Row/Columns Names:

<DataFrameObject>.loc[<startrow>:<endrow>, <startcolumn>:<endcolumn>]

To Access Multiple Rows:

<DF object>.loc[<start row> : <end row>, :]

To Access Selective Columns :

<DF object>.loc[: ,<start column>:<end column>]


To access range of columns from a range of rows :

<DF object>.loc[<start row> : <end row>, <start column> : <endcolumn>]
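A sketch of the three loc forms, assuming a hypothetical dataframe indexed by city. Note that a label slice with loc includes both the start and the end label.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame(
    {
        "Population": [10, 20, 30, 40],
        "Hospitals": [1, 2, 3, 4],
        "Schools": [100, 200, 300, 400],
    },
    index=["Delhi", "Mumbai", "Kolkata", "Chennai"],
)

rows = df.loc["Mumbai":"Kolkata", :]                    # range of rows, all columns
cols = df.loc[:, "Population":"Hospitals"]              # all rows, range of columns
block = df.loc["Delhi":"Mumbai", "Hospitals":"Schools"] # range of both

print(rows)
print(cols)
print(block)
```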

Obtaining a Subset/Slice from a DataFrame using Row/Column Numeric Index/Position:

<DF object>.iloc[<start row index> : <end row index>, <start column index> : <end column index>]
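A sketch with a hypothetical dataframe. Unlike loc, an iloc slice follows normal Python slicing, so the end position is excluded.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame(
    {
        "Population": [10, 20, 30, 40],
        "Hospitals": [1, 2, 3, 4],
        "Schools": [100, 200, 300, 400],
    },
    index=["Delhi", "Mumbai", "Kolkata", "Chennai"],
)

# Rows at positions 0 and 1, columns at positions 1 and 2.
subset = df.iloc[0:2, 1:3]
print(subset)
```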


Selecting / Accessing Individual Value :

(i) <DF object>.<column>[<row name or row numeric index>]

Example :
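A minimal sketch of form (i), assuming a hypothetical dataframe with row labels R1 to R3. It uses the row name only.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Asha", "Ravi", "Meena"], "Marks": [82, 91, 77]},
    index=["R1", "R2", "R3"],
)

# Pick the column first, then the row label inside that column.
value = df.Marks["R2"]
print(value)  # 91
```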

(ii) Using the at or iat attribute with a DF object

<DF object>.at[<row name>, <column name>]

<DF object>.iat[<numeric row index>, <numeric column index>]
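Both attributes can be sketched on the same hypothetical dataframe: at takes labels, iat takes integer positions.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Asha", "Ravi", "Meena"], "Marks": [82, 91, 77]},
    index=["R1", "R2", "R3"],
)

by_label = df.at["R2", "Marks"]  # row name, column name
by_position = df.iat[1, 1]       # second row, second column

print(by_label, by_position)  # 91 91
```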


Assigning / Modifying Data Values in DataFrame:

(a) To change or add a column :

<DF object>[<column name>] = <new value>

(b) To change or add a row :

<DF object>.at[<row name>, :] = <new value>

OR

<DF object>.loc[<row name>, :] = <new value>

(c) To change or modify a single data value:

<DF object>.<column name>[<row name/label>] = <new value>
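The three cases can be sketched as below, with hypothetical data. As an assumption of this sketch, the row cases use loc and the single-value case uses at, since these forms assign directly on the dataframe; assigning through a selected column (the chained form) may trigger warnings in recent pandas versions.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Asha", "Ravi", "Meena"], "Marks": [82, 91, 77]},
    index=["R1", "R2", "R3"],
)

# (a) Column: an existing name changes the column, a new name adds one.
df["Marks"] = [80, 90, 70]
df["Section"] = "A"  # a single value is repeated for every row

# (b) Row: an existing label changes the row, a new label adds one.
df.loc["R1", :] = ["Asha", 85, "B"]
df.loc["R4", :] = ["Kiran", 65, "A"]

# (c) Single value: row label and column label together.
df.at["R2", "Marks"] = 95

print(df)
```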


Adding and Deleting Columns in DataFrame:

Assigning a list of values to <DF object>[<column name>] modifies an existing
column or adds a new one:

Existing column – it will change the data values

Non-existing column – it will add a new column.

Other ways of adding a column to a dataframe:
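A sketch of bracket assignment for an existing and a non-existing column, followed by two alternatives that pandas provides, loc and assign( ). The choice of these two alternatives is an assumption of this sketch, and all names and values are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Asha", "Ravi", "Meena"], "Marks": [82, 91, 77]})

# Existing column: the data values change.
df["Marks"] = [85, 93, 79]

# Non-existing column: a new column is added.
df["Grade"] = ["B", "A", "C"]

# Alternative 1: loc with ":" for all rows and a new column name.
df.loc[:, "Section"] = ["X", "Y", "X"]

# Alternative 2: assign() returns a new dataframe and leaves df unchanged.
df2 = df.assign(Passed=[True, True, True])

print(df)
print(df2)
```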

Deleting Columns: use del

del <DF object>[<column name>]
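For instance, with a hypothetical dataframe:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena"],
    "Marks": [82, 91, 77],
    "Grade": ["B", "A", "C"],
})

# del removes the column from df itself.
del df["Grade"]

print(df.columns.tolist())  # ['Name', 'Marks']
```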


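Using the <DF object>.iterrows() function

The iterrows() function yields horizontal subsets of a dataframe, each in the
form of a row index and a Series object containing the values of that row.

Example : Using the iterrows() function to extract rows one by one, each as a
Series object: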
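A minimal sketch, assuming a hypothetical dataframe with row labels R1 to R3:

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Asha", "Ravi", "Meena"], "Marks": [82, 91, 77]},
    index=["R1", "R2", "R3"],
)

row_labels = []
for row_label, row_series in df.iterrows():
    # row_series is a Series whose index holds the column names.
    row_labels.append(row_label)
    print("Row:", row_label)
    print(row_series)
```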
Using the <DF object>.iteritems() function

The iteritems() function yields vertical subsets of a dataframe, each in the
form of a column index and a Series object containing the values for all rows
in that column. In pandas 2.0 and later, iteritems() has been removed and
items() does the same job.

Example : Using iteritems() to extract data from a dataframe column-wise:
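A sketch with hypothetical data. It calls items(), which behaves the same way and is available in both older and current pandas versions; on a pandas version before 2.0, iteritems() could be used in its place.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Asha", "Ravi", "Meena"], "Marks": [82, 91, 77]},
    index=["R1", "R2", "R3"],
)

col_names = []
for col_name, col_series in df.items():
    # col_series is a Series whose index holds the row labels.
    col_names.append(col_name)
    print("Column:", col_name)
    print(col_series)
```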
