Overview
Teaching: 15 min
Exercises: 15 minQuestions
How can I create my own functions?
Objectives
Learn about the advantages of modular programming
Explain and identify the difference between function definition and function call.
Write a function that takes a small, fixed number of arguments and produces a single result.
def
with a name, parameters, and a block of code.def
.def print_greeting():
"""Print hello"""
print('Hello!')
print_greeting()
Hello!
def increase_magnitude(x=5,i=2):
"""This increases the magnitude of a number by by 10 to the power of i"""
x = x * (10 ** i)
return x
increase_magnitude(50, 3)
50000
return
occurs:
We wish to determine the appropriate files to work with for our analysis. We could write a for loop and perform the appropriate operations within the body of the loop. Let’s write some basic functions for this instead. We’ll start by working on the console interactively on the console to figure out what it is that we want to do.
filepath = csvs[0]
with filepath.open('r') as f:
output = []
for line in range(2):
output.append(f.readline())
output
['https://s3.amazonaws.com/fcp-indi/data/Projects/ABIDE_Initiative/Outputs/freesurfer/5.1/Pitt_0050002/mri/T1.mgz,50002,abide_initiative\n',
'https://s3.amazonaws.com/fcp-indi/data/Projects/ABIDE_Initiative/Outputs/freesurfer/5.1/Pitt_0050003/mri/T1.mgz,50003,abide_initiative\n']
We’ll now encapsulate that in a function. In addition we wish to:
from pathlib import Path
def get_lines(filepath,nlines=2):
"""Return a list of length nlines, for as many lines of the file"""
with filepath.open('r') as f:
output = []
for line in range(nlines):
output.append(f.readline())
return output
Let’s save this function to a file so that we can maintain a module of functions that are useful for reuse:
%save my_funcs N # enter the last line number
The following commands were written to file `my_funcs.py`:
from pathlib import Path
def get_lines(filepath,nlines=2):
"""Return a list of length nlines, for as many lines of the file"""
with filepath.open('r') as f:
output = []
for line in range(nlines):
output.append(f.readline())
return output
Another useful operation we would like is to count the number of lines in the file:
def count_lines(filepath):
"""Count the number of lines in a text file"""
if filepath.stat().st_size == 0:
return 0
else:
with filepath.open('r') as f:
for i, _ in enumerate(f):
pass
return i+1 # because enumerate starts counting at 0
count_lines(file_path)
8060
%save my_funcs N # the line number of the function definition
The following commands were written to file `my_funcs.py`:
def count_lines(filepath):
"""Count the number of lines in a text file"""
if filepath.stat().st_size == 0:
return 0
else:
with filepath.open('r') as f:
for i, _ in enumerate(f):
pass
return i+1 # because enumerate starts counting at 0
Finally we’ll write a function that encapsulates the functionality of our previous functions:
def get_file_info(filepath,nlines=2):
""""
get_file_info(filepath[, nlines]) -> list
Return a line count and the first few lines of a file as a string"""
output = filepath.name + ': Num lines: ' + str(count_lines(filepath)) + '\n\n'
output += ''.join(get_lines(filepath, nlines))
return output
%save -a my_funcs N # the number of the last command
The following commands were written to file `my_funcs.py`:
def get_file_info(filepath,nlines=2):
""""
get_file_info(filepath[, nlines]) -> list
Return a line count and the first few lines of a file as a string"""
output = filepath.name + ': Num lines: ' + str(count_lines(filepath)) + '\n\n'
output += ''.join(get_lines(filepath, nlines))
return output
Now we can look through the csv files in metasearch to see if they contain useful data for our analysis:
for f in csvs:
print(get_file_info(f))
...
ngl_wyoming.csv: Num lines: 8267
d_first_name,d_mid_name,d_last_name,d_suffix,d_birth_date,d_death_date,section_id,row_num,site_num,cem_name,cem_addr_one,cem_addr_two,city,state,zip,cem_url,cem_phone,relationship,v_first_name,v_mid_name,v_last_name,v_suffix,branch,rank,war
"Paul","Frederick","Kratzer","","05/12/1924","04/28/2007","C","","764","OREGON TRAIL STATE VETERANS CEMETERY","80 VETERANS ROAD","BOX 669","EVANSVILLE","WY","82636","","307-235-6673","Veteran (Self)","Paul","Frederick","Kratzer","","US NAVY","GM2","WORLD WAR II",
nutrients.csv: Num lines: 7638
name,group,"protein (g)","calcium (g)","sodium (g)","fiber (g)","vitaminc (g)","potassium (g)","carbohydrate (g)","sugars (g)","fat (g)","water (g)",calories,"saturated (g)","monounsat (g)","polyunsat (g)"
"Beverage, instant breakfast powder, chocolate, not reconstituted","Dairy and Egg Products",19.9,0.285,0.385,0.4,0.0769,0.947,66.2,65.8,1.4,7.4,357,0.56,0.314,0.278
We probably don’t want all this output permanently although it can be useful now. For now let’s save the list of csv files along with how many lines they have:
for f in csvs:
print(f, 0)
%save -a metasearch_analysis.py N # the line number of the above command
The following commands were written to file `metasearch_analysis.py`:
for f in csvs:
print(f, 0)
Running scripts in debugger mode
Now is a good time to try using the ipdb debugger to see how one can observe what is happening as the flow of execution through the file occurs..
Order of Operations
The example above:
result = print_date(1871, 3, 19) print('result of call is:', result)
printed:
1871/3/19 result of call is: None
Explain why the two lines of output appeared in the order they did.
Find the python scripts
Write a function that calls at least one of the functions developed in the lesson. When given a directory as input the function should return a list with as many elements as there are python files (.py) in the directory. Each element should contain the name and the number of lines for a specific python file (.py).
Solution
def get_py_files(dir_path,nlines=0): """Returns a list of the python files in the dir_path dir_path should be a Path object from pathlib with a name of a directory"""" py_files = dir_path.glob('**/*.py') output = [] for f in py_files: output.append(get_file_info(f,nlines)) return output print(*get_py_files(metasearch_dir))
data_not_in_repo/metasearch/crawler/brain-development/extract.py: Num lines: 12 data_not_in_repo/metasearch/docs/data/merge_csv.py: Num lines: 25
Key Points
Break programs down into functions to make them easier to understand.
Define a function using
def
with a name, parameters, and a block of code.Defining a function does not run it.
Arguments in call are matched to parameters in definition.
Functions may return a result to their caller using
return
.