Writing Efficient Python Code

Introduction

Efficient code refers to code that meets two key criteria:
First, efficient code is fast and has a small latency between execution and returning a result
Second, efficient code allocates resources skillfully and isn't subjected to unnecessary overhead
Although your definition of fast runtime and small memory usage may depend on the task at hand, the goal of writing efficient code is still to reduce both latency and overhead
Pythonic code is code that follows the best practices and guiding principles of Python. Pythonic code tends to be less verbose and easier to interpret
The Zen of Python is a list of 19 aphorisms that summarize Python's design philosophy
It is published as a Python Enhancement Proposal (PEP 20) and can be displayed by running import this in any Python interpreter, including the IPython console
Among other techniques, writing efficient Python code makes use of list comprehensions:
- newList = [expression(element) for element in oldList if condition]
A list comprehension provides a shorter syntax for creating a new list based on the values of an existing list
Advantages of List Comprehension
- Often faster than an equivalent for loop that calls append.
- Requires fewer lines of code.
- Transforms an iterative statement into a single expression.
Example of finding names with 6 letters or more from a list 'names':
better_list = []
for name in names:
    if len(name) >= 6:
        better_list.append(name)
Pythonic way using list comprehension:
best_list = [name for name in names if len(name) >= 6]

Reducing loops to built-in functions

1. range()

Create a list of numbers from 0 to 10 (the stop value, 11, is excluded):
num = range(0, 11)
list_num = list(num)
Create a list of even numbers using range(start, stop, step):
even_nums = range(2, 11, 2)
even_nums_list = list(even_nums)

2. enumerate()

enumerate creates an (index, item) pair for each item in the object provided, with the index starting at 0 by default
We can also specify the starting index of enumerate with the keyword argument start: indexes = enumerate(list_a, start=5)
Example: for a list names = ['Jerry', 'Kramer', 'Elaine', 'George', 'Newman'] ordered according to arrival, attach an index representing the arrival order:
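A minimal sketch of the arrival-order example. The variable names `indexed_names` and `indexed_names_from_one` are illustrative and not from the original notes.

```python
names = ['Jerry', 'Kramer', 'Elaine', 'George', 'Newman']

# enumerate yields (index, item) pairs; list() materializes them
indexed_names = list(enumerate(names))

# start=1 makes the index read as a 1-based arrival position
indexed_names_from_one = list(enumerate(names, start=1))

print(indexed_names[:2])           # [(0, 'Jerry'), (1, 'Kramer')]
print(indexed_names_from_one[:2])  # [(1, 'Jerry'), (2, 'Kramer')]
```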

3. map()

The map function provides a quick and clean way to apply a function to each element of an iterable without writing a for loop
map() can also be used with a lambda (an anonymous function)
This lets us apply a function that we've defined on the fly to our original list
Example to convert names to uppercase:
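One way the uppercase conversion could look, reusing the `names` list from the enumerate example. The result names `names_upper` and `names_upper_lambda` are illustrative.

```python
names = ['Jerry', 'Kramer', 'Elaine', 'George', 'Newman']

# map applies str.upper to each name; list() consumes the lazy map object
names_upper = list(map(str.upper, names))

# The same result with a lambda defined on the fly
names_upper_lambda = list(map(lambda name: name.upper(), names))

print(names_upper)  # ['JERRY', 'KRAMER', 'ELAINE', 'GEORGE', 'NEWMAN']
```

In Python 3, `map` returns an iterator, so wrapping it in `list()` is what actually produces the values.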

Power of NumPy Arrays

NumPy arrays provide a fast and memory efficient alternative to Python lists
To create one (after import numpy as np): nums_arr = np.array(range(5))

1. NumPy Array Homogeneity

NumPy arrays are homogeneous, which means that they must contain elements of the same type
Homogeneity allows NumPy arrays to be more memory efficient and faster than Python lists
Requiring all elements to be the same type eliminates the per-element data type checking that Python lists need
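A short sketch of what homogeneity looks like in practice, assuming NumPy is installed. The names `mixed_arr` and `mixed_list` are illustrative.

```python
import numpy as np

# All integers: NumPy picks a single integer dtype for the whole array
nums_arr = np.array(range(5))
print(nums_arr.dtype)   # an integer dtype; the exact width is platform-dependent

# Mixing ints and a float: every element is upcast to a float
mixed_arr = np.array([1, 2.5, 3])
print(mixed_arr.dtype)  # float64

# A plain list keeps each element's own type
mixed_list = [1, 2.5, 3]
print([type(x).__name__ for x in mixed_list])  # ['int', 'float', 'int']
```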

2. NumPy Array Broadcasting

When analyzing data, you'll often want to perform operations over entire collections of values quickly
Lists don't support broadcasting, so operations on an entire list require a for loop or a list comprehension
NumPy arrays are advantageous because of their broadcasting functionality
NumPy arrays vectorize operations, so they are performed on all elements of an object at once. This allows us to efficiently perform calculations over entire arrays
For example, to square all elements in an array, we square the array itself: np_arr ** 2
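The contrast between a list and an array can be sketched as follows, assuming NumPy is installed. The sample values in `nums` are made up for illustration.

```python
import numpy as np

nums = [-2, -1, 0, 1, 2]

# A list needs an explicit loop or comprehension (nums ** 2 raises TypeError)
squared_list = [n ** 2 for n in nums]

# The array applies the operation to every element at once
np_arr = np.array(nums)
squared_arr = np_arr ** 2

print(squared_list)  # [4, 1, 0, 1, 4]
print(squared_arr)   # [4 1 0 1 4]
```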

3. NumPy Array Indexing

Another advantage of NumPy arrays is their indexing capabilities
When comparing basic indexing between a one-dimensional array and list, the capabilities are identical
When using two-dimensional arrays and lists, the advantages of arrays are clear, as lists require more verbose syntax
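A small sketch of the two-dimensional case, assuming NumPy is installed. The data in `nums2` is made up for illustration.

```python
import numpy as np

nums2 = [[1, 2, 3],
         [4, 5, 6]]
nums2_arr = np.array(nums2)

# Single element: chained brackets for the list, one bracket pair for the array
print(nums2[0][1])      # 2
print(nums2_arr[0, 1])  # 2

# First column: the list needs a comprehension, the array only a slice
first_col_list = [row[0] for row in nums2]
first_col_arr = nums2_arr[:, 0]
print(first_col_list)   # [1, 4]
print(first_col_arr)    # [1 4]
```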

4. NumPy Array Boolean Indexing
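The notes leave this heading empty. As a hedged illustration of what boolean indexing usually refers to, the sketch below filters an array with a boolean mask; the sample data and variable names are assumptions, not taken from the original notes.

```python
import numpy as np

nums = [-2, -1, 0, 1, 2]
nums_arr = np.array(nums)

# Comparing an array to a scalar yields a boolean mask of the same shape
mask = nums_arr > 0

# Indexing with the mask keeps only the elements where the mask is True
positives = nums_arr[mask]
print(positives)  # [1 2]

# A list needs a comprehension to do the same filtering
positives_list = [n for n in nums if n > 0]
print(positives_list)  # [1, 2]
```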

Wrap it all

Timing and Profiling Code

Timeit

%timeit and %%timeit are IPython magic commands: %timeit is prefixed to a single line of code, and %%timeit is written on its own line before a block of code
Timeit reports runtimes in the following units (listed from smallest to largest): nanoseconds (ns), microseconds (µs), milliseconds (ms), seconds (s)
The number of runs (-r) represents how many iterations you'd like to use to estimate the runtime. The number of loops (-n) represents how many times you'd like the code to be executed per run
To specify the arguments for 2 runs and 10 loops: %timeit -r2 -n10 expression_to_time
A simple comparison of creating data structures using formal names and literal syntax shows that literal syntax is faster:
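Outside IPython, a similar comparison can be sketched with the standard-library `timeit` module, which the magic command builds on. This is an illustration rather than the original notes' example: the `repeat` and `number` arguments play the role of `-r` and `-n`, and the measured times will vary by machine.

```python
import timeit

# repeat=2 and number=10 mirror -r2 -n10:
# two runs, each executing the statement ten times
small_runs = timeit.repeat("list()", repeat=2, number=10)
print(len(small_runs))  # 2 (one total time per run)

# Compare formal name vs. literal syntax; more loops give a steadier estimate
formal_time = min(timeit.repeat("list()", repeat=5, number=100_000))
literal_time = min(timeit.repeat("[]", repeat=5, number=100_000))

print(f"list(): {formal_time:.6f} s per 100,000 loops")
print(f"[]:     {literal_time:.6f} s per 100,000 loops")
```

On CPython the literal is typically the faster of the two, because `list()` involves a name lookup and a function call.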

Profiling Code

%timeit works well with bite-sized code
If we want to time a large code base or see the line-by-line runtimes within a function, we use code profiling
Code profiling is a technique used to describe how long, and how often, various parts of a program are executed
The beauty of a code profiler is its ability to gather summary statistics on individual pieces of our code without timing each piece separately with %timeit

Code profiling for runtime

We will focus on the line_profiler package to profile a function's runtime line-by-line
To install it: pip install line_profiler. To use it in IPython, load the extension with %load_ext line_profiler
Syntax of usage: %lprun -f name_of_function full_function_call(arg1, arg2) # -f implies that we are profiling a function
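As a sketch of how the syntax maps onto a concrete call, the block below defines a hypothetical function, `convert_units`, which is not from the original notes. The magic commands appear as comments because they only run inside an IPython session with line_profiler installed.

```python
# Hypothetical function to profile (an assumption for illustration)
def convert_units(heights_cm, weights_kg):
    """Convert heights to inches and weights to pounds."""
    heights_in = [h * 0.39370 for h in heights_cm]
    weights_lb = [w * 2.20462 for w in weights_kg]
    return heights_in, weights_lb

heights = [188.0, 191.0, 183.0]
weights = [95.0, 135.0, 74.0]
result = convert_units(heights, weights)

# In IPython, the same call would be profiled line by line with:
#   %load_ext line_profiler
#   %lprun -f convert_units convert_units(heights, weights)
```

The name after `-f` is the function object to profile, and the rest of the line is the exact call to run.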
Note that the total times reported by %lprun and %timeit do not match. Remember, %timeit uses multiple loops in order to calculate an average and standard deviation of time, whereas %lprun times a single call and adds its own profiling overhead

Code profiling for memory usage

Just like we've used code profiling to gather detailed stats on runtimes, we can also use code profiling to analyze the memory allocation for each line of code in our code base
We'll use the memory_profiler package, which is used in much the same way as the line_profiler package
To install: pip install memory_profiler
Syntax of usage: %mprun -f name_of_function full_function_call(arg1, arg2)
One drawback to using %mprun is that any function profiled for memory consumption must be defined in a filename.py file and imported
Steps:
%load_ext memory_profiler
from filename import name_of_function
%mprun -f name_of_function full_function_call(arg1, arg2)
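The steps above can be sketched with a hypothetical module. The function body below is an assumption chosen so that each line allocates a visible amount of memory; only the names `filename` and `name_of_function` come from the notes. The IPython commands are shown as comments because they require memory_profiler and an IPython session.

```python
# Contents of a hypothetical module saved as filename.py
def name_of_function(n):
    """Build two lists so that each line has its own memory footprint."""
    squares = [i ** 2 for i in range(n)]        # allocates a list of n ints
    evens = [s for s in squares if s % 2 == 0]  # allocates a second, smaller list
    return len(squares), len(evens)

# In IPython, after saving the file:
#   %load_ext memory_profiler
#   from filename import name_of_function
#   %mprun -f name_of_function name_of_function(1_000_000)
```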
Note that the memory is reported in mebibytes (MiB). One mebibyte is 1,048,576 bytes, whereas one megabyte is 1,000,000 bytes; for our purposes, we can assume they are close enough to mean the same thing
The memory_profiler package inspects memory consumption by querying the operating system. This might be slightly different from the amount of memory that is actually used by the Python interpreter
Thus, results may differ between platforms and even between runs
