Introduction to Python - Numpy
Overview of Numpy
- Numpy is a package that provides additional functionality for working with arrays, which is often useful in data science.
- Typically Numpy is imported as np.
- np.array() will cast a list (or other collection) as a numpy array.
- You can slice an array in the same way you can slice a list.
import numpy as np
a = np.array([0, 1, 2, 3, 4, 5, 6])
print('A is of type:', type(a))
print('Print the entire array:', a)
print('Print the first value:', a[0])
print('Print the first three values:', a[0:3])
print('Print from the third value to the end of the array:', a[2:])
print('Print the last value of a numpy array:', a[-1])
print('Print everything except the last two values:', a[:-2])
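As a small illustration of the casting described above, the sketch below passes a list, a tuple, and a nested list to np.array() and inspects the results. The variable names are illustrative and do not come from the original notebook.

```python
import numpy as np

from_list = np.array([1, 2, 3])          # cast a list
from_tuple = np.array((1.5, 2.5, 3.5))   # cast a tuple
nested = np.array([[1, 2], [3, 4]])      # a nested list becomes a 2-D array

print(type(from_list))    # each result is a numpy.ndarray
print(from_tuple.dtype)   # the element type is inferred from the data
print(nested.shape)       # (rows, columns) of the 2-D array
print(from_list[0:2])     # slicing works as it does for a list
```

Note that a numpy array holds elements of a single type, which is why the tuple of decimals is stored as floats.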
Arrays and Functions
- A really powerful aspect of arrays is the capability to do calculations over entire arrays at once.
- Numpy provides a large number of mathematical functions; the full list is in the Numpy documentation.
- Often it is possible to do calculations directly or via np functions, as shown below.
import numpy as np
a = np.array([1, 2, 3, 4, 5, 6])
b1=10*a
b2=np.multiply(10,a)
c1=a+b1
c2=np.add(a,b1) #This is an alternate way of adding
d=np.log(a)
e=np.sqrt(a)
f=a**2 #This squares each value.
np.square([-1j, 1]) #Function form of squaring; it also handles complex input. The result is not stored here.
print('Print the entire array a:', a)
print('Print the entire array b1:', b1)
print('Print the entire array b2:', b2)
print('Print the entire array c1:', c1)
print('Print the entire array c2:', c2)
print('Print the entire array d:', d)
print('Print the entire array e:', e)
print('Print the entire array f:', f)
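To make the idea of calculating over a whole array more concrete, here is a minimal sketch that compares an explicit Python loop with the equivalent array expression. The names looped and vectorized are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

# Element-wise calculation with a plain Python loop
looped = [10 * value for value in a.tolist()]

# The same calculation applied to the whole array at once
vectorized = 10 * a

print(looped)
print(vectorized)

# The operator form and the function form give the same result
print(np.array_equal(a + vectorized, np.add(a, vectorized)))
```

The array expression is shorter, and on large arrays it is typically much faster than looping in Python.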
Creating and Manipulating Numpy Arrays
- The arange function will generate an array of evenly spaced values.
- Reshape changes the structure of the array to n rows and m columns: a=a.reshape(n, m)
- ones will create an array with all ones and zeros will create one with all zeros.
- Reshaping can get an array into the appropriate structure, but make sure that the size fits the appropriate dimensions: n times m must equal the number of elements.
import numpy as np
a = np.arange(15)
print(a)
a2 = np.arange( 0, 15, 1 ) #Alternate specification with np.arange(start, stop, step); stop is excluded
print(a2)
a=a.reshape(3, 5)
print(a)
b= np.ones(shape=(3, 5), dtype=float)
print(b)
c= np.zeros(shape=(3, 5), dtype=int)
print(c)
d= np.full((3, 5), 4, dtype=int)
print(d)
e= np.arange( 0, 1.5, .1 ).reshape(3,5) #String together creations and reshaping. Also can use decimals.
print(e)
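The point about sizes fitting the dimensions can be checked directly. The sketch below shows a reshape that fits, one where numpy infers a dimension from -1, and one that fails. The flag reshape_failed is a hypothetical name used only to record the outcome.

```python
import numpy as np

a = np.arange(15)

# 15 elements fit a 3 x 5 layout
print(a.reshape(3, 5).shape)

# -1 asks numpy to work out that dimension from the array size
inferred = a.reshape(5, -1)
print(inferred.shape)

# 15 elements cannot fill a 4 x 4 layout, so reshape raises ValueError
try:
    a.reshape(4, 4)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print('Reshape failed:', err)
```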
Generating Random Numpy Data
- Random data is often useful, and we will be using it to demonstrate some initial techniques.
- Often you want random but repeatable results, so that, for example, a test has a consistent average on a random array. For this we need to set a seed. You only have to do this once.
np.random.seed([2335])
a = np.random.uniform(50, 150, 10) #Generate 10 values from a uniform distribution between 50 and 150
b = np.random.standard_normal(10) #Generate 10 values from a normal distribution with mean 0 and standard deviation 1
print(a)
print(b)
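To see what "random but repeatable" means in practice, the following sketch draws the same numbers twice by resetting the seed. The seed is reset a second time only for the demonstration; the seed value and the names first and second are arbitrary choices.

```python
import numpy as np

np.random.seed(2335)  # an integer seed; the value itself is arbitrary
first = np.random.uniform(50, 150, 5)

np.random.seed(2335)  # resetting the same seed restarts the sequence
second = np.random.uniform(50, 150, 5)

print(first)
print(second)
print(np.array_equal(first, second))  # the two draws are identical
```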
Combining Numpy Arrays
- concatenate will string a list of numpy arrays together: np.concatenate([a,b])
- vstack will stack numpy arrays vertically, as rows.
- hstack will stack numpy arrays horizontally, side by side.
a = np.arange(5)
b=np.concatenate([a,a])
c=np.vstack([a,a])
d=np.hstack([c,c])
print('a:',a,'\nb:',b,'\nc:',c,'\nd:',d)
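One way to keep the three combining functions apart is to compare the shapes they produce. This is a minimal sketch that reuses the same calls as the cell above and prints only the shapes.

```python
import numpy as np

a = np.arange(5)

b = np.concatenate([a, a])  # joins end to end: shape (10,)
c = np.vstack([a, a])       # stacks as rows: shape (2, 5)
d = np.hstack([c, c])       # joins side by side: shape (2, 10)

print(b.shape, c.shape, d.shape)
```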
Slicing Single Dimension Numpy Arrays
- Slicing arrays includes three numbers, a[start:stop:step], but not all are required.
- Defaults: start = 0, stop = the end of the array, and step = 1.
- To print the entire array, leave start/stop/step blank: a[::]
e= np.arange( 0, 15, 1 )
print(e)
#[start:end:step]
print("This is the start, end, and step:",e[2:9:3])
print("Print every other:",e[::2])
print("Print starting at 2 and ending at 9, default step 1:",e[2:9])
print("Print all:",e[::])
print("Print all:",e[:])
print("Print all:",e)
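The defaults can be verified by writing them out explicitly. The sketch below also shows two common variations that the cell above does not cover: a negative step and a negative start. The names reversed_e and last_three are illustrative.

```python
import numpy as np

e = np.arange(0, 15, 1)

# Omitted values fall back to the defaults, so these select the same elements
print(np.array_equal(e[::], e[0:len(e):1]))

# A negative step walks backwards through the array
reversed_e = e[::-1]
print(reversed_e)

# Negative indices count from the end: the last three values
last_three = e[-3:]
print(last_three)
```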
Numpy Arrays From External Datasets
- We can take a list from an external dataset and change it to a numpy array.
#First let's download some data.
!wget https://raw.githubusercontent.com/rpi-techfundamentals/spring2019-materials/master/input/iris.csv
import csv
data=[]
with open('iris.csv', newline='') as csv_file:
    csv_file_object = csv.reader(csv_file, delimiter=',')
    header = next(csv_file_object) # read the header row separately
    for row in csv_file_object:
        data.append(row) # add each row to the data list
data = np.array(data)
print(data)
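One detail worth knowing is that csv.reader returns every field as text, so the resulting array holds strings rather than numbers. The sketch below illustrates this with a tiny in-memory CSV. The sample rows are made up for illustration and are not taken from iris.csv, and io.StringIO stands in for an open file.

```python
import csv
import io

import numpy as np

# Made-up sample rows for illustration only; these are not from iris.csv
sample_text = "length,width,label\n1.0,2.0,x\n3.0,4.0,y\n"

reader = csv.reader(io.StringIO(sample_text), delimiter=',')
header = next(reader)            # first row holds the column names
rows = [row for row in reader]   # remaining rows, each a list of strings

sample = np.array(rows)
print(header)
print(sample)
print(sample.dtype.kind)         # 'U' means the values are stored as text
```

Numeric columns therefore need an explicit conversion, such as astype, before any arithmetic.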
Slicing Two-dimensional Numpy Arrays
- We can slice arrays with array[row, column], where row and column each use the start:stop:step form, as in one-dimensional arrays.
- We can specify the type with .astype(np.float64)
- For a full list of Numpy types, see the Numpy documentation.
- If we create a one-dimensional array from a two-dimensional numpy array, it will also be a numpy array of the same type.
#We can slice the array several different ways and generate new variables.
irisdata=data[0::,0:4:].astype(np.float64) #This will select only the first 4 columns and change the type to float (np.float_ was removed in NumPy 2.0)
irisdata=data[:,0:4].astype(np.float64) #Equivalent, shorter form of the same slice
iristype=data[0::,4:5:] #This will select only the type column.
print(irisdata,'\n',iristype)
#This selects column 0 and assigns it to a new variable.
newvariable=irisdata[::,0:1:]
#This will sum up column 0
final=irisdata[::,0:1:].sum()
print(type(newvariable))
#print(newvariable)
print(final)
#This will take the mean of column 0
print('mean:', irisdata[::,0:1:].mean())
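Because two-dimensional slicing is easier to follow on a small array, here is a minimal sketch using a hypothetical 3 x 4 array named grid instead of the iris data. It contrasts a column slice with a single column index and shows a per-column mean.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)   # hypothetical 3 x 4 array holding 0..11
print(grid)

column_2d = grid[:, 0:1]   # a slice keeps two dimensions: shape (3, 1)
column_1d = grid[:, 0]     # a single index drops to one dimension: shape (3,)
print(column_2d.shape, column_1d.shape)

# Both forms hold the same values, so the totals agree
print(column_2d.sum(), column_1d.sum())

# mean(axis=0) averages down the rows, giving one mean per column
print(grid.mean(axis=0))
```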
Credits
Copyright AnalyticsDojo 2016. This work is licensed under the Creative Commons Attribution 4.0 International license agreement.