2.4 Using Python

This section covers the base functionality of Python. Additional functions and explanations relating to specific methods or algorithms are provided in their respective chapters in this book. Note that Python 3.6 or higher should be used (Python 2.7 is a legacy version with which some of the code from this book will not work).
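To confirm that the interpreter you are running meets this requirement, you can inspect `sys.version_info`. The snippet below is a minimal sketch; the variable names `required`, `current` and `version_ok` are illustrative only.

```python
import sys

# Minimum version stated in the text above
required = (3, 6)

# sys.version_info behaves like a tuple: (major, minor, micro, ...)
current = tuple(sys.version_info[:2])

version_ok = current >= required
print("Running Python {}.{}".format(*current))
print("Meets the 3.6+ requirement:", version_ok)
```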

2.4.1 Python setup

There are a number of ways to set up Python on your machine. The three most common are the Anaconda distribution, the Miniconda distribution, and the standard Python installation.

Both the Miniconda and Anaconda distributions use the conda package manager, which lets you download and install additional Python packages. The standard Python installation uses the pip package installer for the same purpose.

2.4.1.1 What’s the difference between pip and conda?

In short, pip only installs Python packages, so it gives us no easy way to install additional non-Python libraries. In contrast, conda is a packaging tool and installer which handles library dependencies outside of Python-only packages, as well as the Python packages themselves.

You can use conda and pip side by side, but you cannot use them interchangeably: pip cannot install conda-format packages.

2.4.1.2 What’s the difference between Miniconda and Anaconda?

The differences are as follows:

  • Miniconda = Python + conda
  • Anaconda = Miniconda + conda install anaconda

In other words, Anaconda contains roughly 160 more Python packages than the Miniconda distribution.

Take note that these additional packages may result in a total installation time of ~40-60 minutes for Anaconda. This also means that if you need to reinstall Anaconda, you will need to wait ~20-30 minutes for the uninstall process to complete, and then an additional 40 - 60 minutes for the installation to complete.

2.4.1.3 Which version should I choose?

For classes, it is recommended to choose the Anaconda distribution, as it contains most of the packages needed.

Alternatively, you can install Miniconda and the appropriate packages, e.g. see the beginning of Ch. 3.11 or Ch. 4.11. Installing Miniconda should take less time than installing Anaconda, which also helps if you need to reinstall it later.

Finally, choose the standard Python installation only if you have some programming experience and are comfortable managing package installation yourself, which may require configuring additional library dependencies manually.

Only use one method to set up your Python environment, as having more than one installation may cause software conflicts!

2.4.1.4 Installing Python via the Anaconda distribution

Note: the Anaconda website has since changed, both in its design and in its address, which is now www.anaconda.com.

We can install the Anaconda distribution of Python as follows:

Download the appropriate version depending on your operating system:

Make sure you download Anaconda for the latest version of Python:

Again, do not use Python 2.7 as the code syntax and package compatibility will break.

When installing Anaconda, make sure that the following boxes are checked (unless you already have an existing non-Anaconda Python distribution installed):

Finally, after installing Anaconda, launch the Anaconda Navigator and go to the packages:

There, check for any updates:

Then, navigate back and update JupyterLab:

After updating JupyterLab, you can update the remaining packages by opening the terminal:

and inputting:

2.4.2 A quick way to launch JupyterLab

On some systems, launching the Anaconda Navigator may take some time. Since we are only interested in JupyterLab, we can make things easier by creating an executable for JupyterLab with a custom home directory. To do so, first create a folder called PrEcon on your desktop:

Then, open up Notepad and input:

Replace YOUR_PC_USER with your PC user and save the file on your desktop as JupyterLab.bat. In my case this is:

Make sure that you have selected ‘All Files’ for the file type.

Once you double-click on the .bat file, JupyterLab will open in a browser window. Do not close the terminal window, as this will close JupyterLab!

2.4.3 Introductory Python tutorial

Note that most of the functions and methods used in this book will be provided in each chapter. This section serves only as a quick introduction to the basic functionality of Python.

In general, it is recommended to do either the Introduction to Python tutorials or The Python language from the Scipy Lecture Notes for a quick introduction without any additional software requirements.

On the other hand, similarly to R’s swirl package, we can install PyCharm Edu and get an interactive tutorial (unlike R, here we need to use a different application, instead of an additional package).

2.4.3.1 Python language tutorial using PyCharm

Download PyCharm Edu and install it. Make sure that you already have Anaconda (or, alternatively, base Python, but not both, as this may cause errors) installed before installing PyCharm Edu.

Once installed, start PyCharm Edu:

If you are certain that everything installed correctly, click Learn to browse courses and select Introduction to Python. Alternatively, to verify that everything works correctly, you can click Create New Project.

You can name your project anything you want and click Create.

In case you get a Python error that python_d.exe is not found when PyCharm creates the project, see this question on Stack Overflow.

Inside the Project select File -> Learn -> Browse Courses:

in the new dialog window select Introduction to Python:

and click Join.

Note:

  • The Interpreter:... may be different (for example, conda if you are using Anaconda) - don’t change it unless you know what you are doing - PyCharm Edu selects an available interpreter automatically.
  • In case you get a message that the PyCharm interpreter is not configured (even though you have selected a Python version/interpreter) - wait for the Indexing... to finalize and restart PyCharm.

Finally, the selected course will be loaded:

  • The left window is the available lessons.
  • The middle window is your code and input window - note the highlighted text type your name, where you need to input your name in the first task.
  • The right window contains the description of the task, as well as allows you to look at the hints, if you get stuck.

After inputting the required fields, you can click the green arrow to run your code in the script file:

The bottom window will automatically open and show the output of the script.

After examining the output and feeling confident about your answer, click the Check button. You can examine the output of the script by clicking on Run in the bottom-left:

If you want to try some other commands and examine their output - you can click on Python console and type some commands in the console at the bottom to execute them one-by-one (as opposed to the script file in the middle window, which executes all of the commands if you press Check or click the previously mentioned green arrow to execute the code).

Finally, click Next to go to the next lesson.

If you accidentally opened more than one tutorial, you can manage your existing projects (open previously saved projects or delete existing ones) via File -> Open Recent -> Manage Projects:

This interactive tutorial will help you familiarize yourself with the basic functionality and syntax of the Python programming language.

After getting familiar with Python itself, we can move on to JupyterLab, where we will examine how we can blend together Python code, its output, comments, text formatting and mathematical formulas in one document. This makes it easier to keep templates and examples of data analysis tasks, with model estimation code and result interpretation, without having to spend extra time copy-pasting them into some other document.

2.4.3.2 Introductory JupyterLab notebook tutorial

Launch JupyterLab and create a new notebook file:

and rename it to python_intro:

There are three different cells to choose from:

  • Code - this type of cell treats the input as Python code (because we created a Python notebook);
  • Markdown - this type of cell treats the input as markdown code;
  • Raw - the input is treated as raw text;

You also have a menu to:

  • save changes to your notebook;
  • add a new cell of the selected type to your notebook;
  • cut a selected cell;
  • copy a selected cell;
  • paste the copied cells;
  • run a selected cell;
  • stop a selected running cell;
  • restart the notebook kernel - this clears the current workspace of any variables and loaded packages and is somewhat equivalent to restarting RStudio.

Next, create three different blocks with the following:

  • Code cell with:
  • Markdown cell with:
  • Raw cell with:

You can either compile a selected cell by pressing CTRL + ENTER, or all the cells with:

Notice that the Raw cell doesn’t produce any output and doesn’t compile any LaTeX / Markdown code.

2.4.3.3 Python programming language at a glance

Below we present some code examples of Python's syntax. Explanations are minimal - the idea is to have quick examples with output to verify how Python works. For more in-depth examples, see the previous subsection.

2.4.3.3.1 Strings

Run the following code and verify that you understand what happened to the output:

Assign, print and transform strings:

## this is a sentence
## This is a sentence
## This Is A Sentence
## THIS IS A SENTENCE
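The code that produced this output is not reproduced here. A minimal sketch that yields the same four lines, assuming the built-in string methods `capitalize()`, `title()` and `upper()` were used (the variable name `x` is an assumption), is:

```python
# Assign a string to a variable and print it
x = "this is a sentence"
print(x)

# capitalize(): only the first character becomes upper case
print(x.capitalize())

# title(): the first character of every word becomes upper case
print(x.title())

# upper(): every character becomes upper case
print(x.upper())
```

None of these methods changes `x` itself; each one returns a new string.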

Split a string into a list of words and select different elements from the list:

## ['this', 'is', 'a', 'sentence']
## this
## ['this']
## ['this', 'is']
## ['this', 'is', 'a']
## sentence
## ['a', 'sentence']
## ['this', 'is']
## ['this', 'is', 'a']
## ['this', 'is']
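One way to obtain the ten lines above is with `split()` followed by indexing and slicing. This is a sketch under the assumption that the original used these exact slices; several different slices give the same result (for example `words[0:2]` and `words[:2]`).

```python
x = "this is a sentence"

# split() with no arguments splits on white-space
words = x.split()
print(words)        # the full list of words

print(words[0])     # first element (a string, not a list)
print(words[0:1])   # slice holding only the first element (a list)
print(words[0:2])   # first two elements
print(words[0:3])   # first three elements
print(words[-1])    # last element
print(words[2:])    # from the third element to the end
print(words[:2])    # everything before index 2
print(words[:-1])   # everything except the last element
print(words[:-2])   # everything except the last two elements
```

Note that indexing starts at 0 and that the end index of a slice is excluded.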

Combine different strings:

## this is a
## AB
## this is a dog
## this is a DOG
## This is a dog.
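The output above can be reproduced with `str.join()` and the `+` operator. The sketch below is one possible reconstruction; the variable names `words`, `animal` and `sentence` are assumptions.

```python
words = ["this", "is", "a"]
animal = "dog"

# join() glues list elements together using the given separator
print(" ".join(words))

# + concatenates two strings without adding any separator
print("A" + "B")

# Build a longer sentence from the pieces
sentence = " ".join(words) + " " + animal
print(sentence)

# String methods can be applied to a single piece before combining
print(" ".join(words) + " " + animal.upper())

# ... or to the combined result
print(sentence.capitalize() + ".")
```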

Trim white-space, add line breaks and tab spacing:

## '     a     '
## 'a     '
## '     a'
## 'a'
##  this is a sentence
## 
## this is a sentence
## ----
##      this is a sentence
## ----

2.4.3.3.2 Numbers

Run the following code and verify that you understand what happened to the output:

Assign values to variables, print the values with a string text and perform basic math operations:

## x = 2 y = 3
## 5
## -1
## 6
## 11
## 15
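A sketch consistent with the six lines above is given below. The last two expressions are assumptions chosen to match the values 11 and 15; they also illustrate that multiplication is evaluated before addition unless parentheses are used.

```python
x = 2
y = 3

# print() separates its arguments with a single space
print("x =", x, "y =", y)

print(x + y)        # addition
print(x - y)        # subtraction
print(x * y)        # multiplication
print(x + y * 3)    # multiplication first: 2 + 9
print((x + y) * 3)  # parentheses first: 5 * 3
```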

Raise a value to different powers:

## 1
## 2
## 4
## 9
## 8
## 5.0

2.4.3.3.3 Lists

A list can store multiple values. The values need not be of the same type.

Print different items in a list, combine different lists, etc.:

## ['dog', 11, 'cat', 13.5]
## dog
## 11
## []
## ['dog', 11, 'cat', 13.5, 'dog', 11, 'cat', 13.5]
## ['dog', 11, 'cat', 13.5, 'car', 5.0, 'pencil', 3.5]
## DOG
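The following sketch reproduces the output above. The list contents are taken from the output itself; the way the empty list is obtained (a slice past the end of `x`) is an assumption.

```python
x = ['dog', 11, 'cat', 13.5]
y = ['car', 5.0, 'pencil', 3.5]

print(x)             # the whole list
print(x[0])          # first item
print(x[1])          # second item
print(x[4:])         # a slice past the end is an empty list
print(x + x)         # + concatenates lists
print(x + y)         # ... including lists with different contents
print(x[0].upper())  # string methods work on string items
```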

Note the different data types:

## <class 'list'>
## <class 'list'>
## <class 'str'>
## <class 'int'>
## <class 'str'>
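The built-in `type()` function reports the class of an object. The expressions below are an assumed reconstruction that matches the five lines of output; the point is that a slice of a list is again a list, while a single item has its own type.

```python
x = ['dog', 11, 'cat', 13.5]

print(type(x))          # the list itself
print(type(x[0:2]))     # a slice of a list is still a list
print(type(x[0]))       # a single string item
print(type(x[1]))       # a single integer item
print(type(str(x[1])))  # an integer converted to a string
```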

Change items in the list:

## ['dog', 11, 'cat', 13.5]
## ['car', '11', 'cat', '13.5']
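Items are changed by assigning to an index. A minimal sketch matching the output above, assuming the numeric items were converted with `str()`:

```python
x = ['dog', 11, 'cat', 13.5]
print(x)

# Replace the first item
x[0] = 'car'

# Convert the numeric items to strings in place
x[1] = str(x[1])
x[3] = str(x[3])
print(x)
```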

Add or remove items from a list:

## []
## ['dog', 'cat', 'dog & cat']
## ['dog', 'cow', 'cat', 'dog & cat']
## ['cow', 'cat', 'dog & cat']
## ['cow', 'cat']
## cat
## ['cow']
## ['car', 'cow']
## 'car'
## ['cow']
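The sequence above can be reproduced with `append()`, `insert()`, `remove()`, `del` and `pop()`. This is a sketch; the original may have used different calls for some steps (for example `pop()` instead of `del`).

```python
x = []
print(x)

# append() adds an item to the end of the list
x.append('dog')
x.append('cat')
x.append('dog & cat')
print(x)

# insert() adds an item before the given index
x.insert(1, 'cow')
print(x)

# remove() deletes the first item equal to the given value
x.remove('dog')
print(x)

# del removes the item at a given index (here: the last one)
del x[-1]
print(x)

# pop() removes the last item and returns it
last = x.pop()
print(last)
print(x)

x.insert(0, 'car')
print(x)

# pop(0) removes and returns the first item;
# repr() shows the quotes, as an interactive session would
first = x.pop(0)
print(repr(first))
print(x)
```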

Finding elements in a list:

## ['cow', 'cat']
## 0
## 1

We get an error if we try to print an index of an item which is not in the list:

## Error in py_call_impl(callable, dots$args, dots$keywords): ValueError: 'dog' is not in list
## 
## Detailed traceback: 
##   File "<string>", line 1, in <module>
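A sketch of the lookups above, together with two common ways of avoiding the error: checking membership with `in` first, or catching the `ValueError`. The fallback messages are illustrative.

```python
x = ['cow', 'cat']
print(x)

# index() returns the position of the first matching item
print(x.index('cow'))
print(x.index('cat'))

# Option 1: test membership before asking for the index
if 'dog' in x:
    print(x.index('dog'))
else:
    print("'dog' is not in the list")

# Option 2: ask for the index and handle the error
try:
    position = x.index('dog')
except ValueError:
    position = None
    print("index() raised a ValueError")
```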

List with numeric values:

## Minimum: -2
## Maximum: 10
## Total: 33
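The list behind this output is not shown. The sketch below uses hypothetical values chosen so that the minimum, maximum and total agree with the output above.

```python
# Hypothetical values (not from the original) with min -2, max 10, sum 33
numbers = [1, -2, 10, 7, 8, 9]

print("Minimum:", min(numbers))
print("Maximum:", max(numbers))
print("Total:", sum(numbers))
```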

Sort a list and print its length:

## ['cat', 'cow', 'dog', 'dog & cat']
## ['dog', 'cow', 'cat', 'dog & cat']
## ['cat', 'cow', 'dog', 'dog & cat']
## ['dog & cat', 'dog', 'cow', 'cat']
## This list contains 4 elements
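The output above illustrates the difference between `sorted()`, which returns a new list, and the `sort()` method, which reorders the list in place. A sketch that reproduces it:

```python
x = ['dog', 'cow', 'cat', 'dog & cat']

# sorted() returns a new sorted list and leaves x untouched
print(sorted(x))
print(x)

# sort() changes x itself
x.sort()
print(x)

# reverse=True sorts in descending order
x.sort(reverse=True)
print(x)

# len() returns the number of elements
print("This list contains " + str(len(x)) + " elements")
```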

Split strings:

## This is a sentence.
## ['This', 'is', 'a', 'sentence.']
## ['This is ', ' sentence.']
## <class 'str'>
## 19
## <class 'str'>
## T
## This i
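Strings behave like sequences of characters, so they can be split, measured and sliced. A sketch that matches the output above (the variable name `s` is an assumption):

```python
s = "This is a sentence."
print(s)

# Split on white-space, then on the letter "a"
print(s.split())
print(s.split("a"))

# The string itself, its length, and a single character
print(type(s))
print(len(s))
print(type(s[0]))  # a single character is also a str
print(s[0])

# Slicing works on strings the same way as on lists
print(s[0:6])
```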

Note that some of the methods, like insert(), remove(), sort(), pop(), etc., change the original elements in x. This is because lists are so-called mutable objects. Mutable objects can be changed after they are created, and they are passed by object reference instead of by value.

This means that if two variables refer to the same list, modifying the list through one of them also changes what the other one shows:

## [1, 2, 3]
## [1, 2, 3]
## [1, 2, 3, -1]
## [1, 2, 3, -1]
## [1, 2, 3, -1, 10]
## [1, 2, 3, -1, 10]
## [11, 2, 3, -1, 10]
## [11, 2, 3, -1, 10]
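The paired output above can be reproduced by assigning one list to a second name and then modifying it through either name. The sketch below also uses `list.copy()` for contrast; the variable names are assumptions.

```python
x = [1, 2, 3]
y = x              # y is a second name for the same list object
print(x)
print(y)

x.append(-1)       # modify through x ...
print(x)
print(y)           # ... and y shows the change

y.append(10)       # modify through y ...
print(x)           # ... and x shows the change
print(y)

x[0] = 11
print(x)
print(y)

# 'is' confirms that both names point to one object
print(x is y)

# A copy is a separate object, so later changes to x do not affect it
z = x.copy()
x.append(0)
print(z)
```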

A workaround is to explicitly create a new variable, instead of a reference:

## [1, 2, 3]
## [1, 2, 3]
## [1, 2, 3]
## [1, 2, 3, -1]
## [1, 2, 3]
## [1, 2, 3]
## [1, 2, 3, -1]
## [1, 2, 3, 10]
## [1, 2, 3]
## [1, 2, 3, -1]
## [1, 2, 3, 10]
## [11, 2, 3]

2.4.3.3.5 Loops

We can loop through each item in a list. Note that we need to transform any non-strings to strings if we want to print and concatenate the value into a string:

## Item in the list: car
## Item in the list: 11
## Item in the list: pencil
## Item in the list: 13.5
## Item 0 is: car
## Item 1 is: 11
## Item 2 is: pencil
## Item 3 is: 13.5
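A sketch of two loops that produce the output above: the first iterates over the items directly, the second over their index positions. The list contents are taken from the output; the name `items` is an assumption.

```python
items = ['car', 11, 'pencil', 13.5]

# Loop over the items; str() is needed because + only joins strings
for item in items:
    print("Item in the list: " + str(item))

# Loop over the index positions 0, 1, 2, 3
for i in range(len(items)):
    print("Item " + str(i) + " is: " + str(items[i]))
```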

Format a list as a numbered list via enumerate():

## 1) car ;
## 2) 11 ;
## 3) pencil ;
## 4) 13.5 ;

In the above example our numbered list started at 1. The list index numbers and the list values are substituted into the {} placeholders. Each list number is formatted as i), followed by the list element value, with the ; symbol appended to the end. If we wanted, we could change or remove these extra formatting options.
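The numbered list described here can be sketched with `enumerate()` and `str.format()`. The `start=1` argument makes the numbering begin at 1 instead of 0; the exact call used in the original is an assumption.

```python
items = ['car', 11, 'pencil', 13.5]

numbered = []
# enumerate() yields (number, value) pairs; start=1 shifts the numbering
for i, item in enumerate(items, start=1):
    # Each {} is replaced, in order, by an argument of format()
    line = "{}) {} ;".format(i, item)
    numbered.append(line)
    print(line)
```

Changing the template string, for example to `"{}. {}"`, changes or removes the extra formatting.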

We can also create the formatting in a different way:

## Item index is: 0, item #1, item: This
## Item index is: 1, item #2, item: Is
## Item index is: 2, item #3, item: A
## Item index is: 3, item #4, item: List
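One alternative way to build such lines is an f-string, available from Python 3.6 onwards. This is a sketch only; the original may have used a different formatting approach, such as the `%` operator.

```python
items = ['This', 'Is', 'A', 'List']

lines = []
for index, item in enumerate(items):
    # Expressions inside {} are evaluated and inserted into the string
    line = f"Item index is: {index}, item #{index + 1}, item: {item}"
    lines.append(line)
    print(line)
```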

range() function:

## 1
## 2
## 3
## 4
## [1, 2, 3, 4]

2.4.3.3.6 Tuples

Tuples are sequences, just like lists. The differences are that tuples cannot be changed, unlike lists, and that tuples use parentheses, whereas lists use square brackets. Tuples are immutable, which means you cannot update or change the values of tuple elements. You can, however, take portions of existing tuples and create new tuples from them.
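These properties can be checked with a short sketch. The tuple contents and variable names are hypothetical and are not meant to reproduce the output that follows.

```python
colors = ("green", "blue", "red")

# Indexing works exactly as for lists
print(colors[0])

# Assigning to an element raises a TypeError because tuples are immutable
try:
    colors[0] = "yellow"
    changed = True
except TypeError:
    changed = False
    print("Tuples cannot be modified in place")

# Slicing and concatenation create new tuples instead
subset = colors[:2]
extended = colors + ("black",)  # note the trailing comma for a 1-tuple
print(subset)
print(extended)
```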

## Item index is: 0, item #1, item: This
## Item index is: 1, item #2, item: Is
## Item index is: 2, item #3, item: A
## Item index is: 3, item #4, item: Tuple
## Value saved (equal to 1).
## Value saved (equal to 1).
## The color is green.
## My #1 color is green
## My #1 color is green
## The numbers are 7, 23, and 42.

2.4.3.3.9 Classes

Classes allow combining information and behaviour. For an example, see 2.7.7.
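As a brief illustration of the idea, the sketch below defines a hypothetical `Pet` class (not the example from 2.7.7): the attributes hold the information and the method defines the behaviour.

```python
class Pet:
    """A hypothetical class combining data (attributes) and behaviour (methods)."""

    def __init__(self, name, sound):
        # Information stored on each object
        self.name = name
        self.sound = sound

    def speak(self):
        # Behaviour that uses the stored information
        return self.name + " says " + self.sound


dog = Pet("dog", "woof")
print(dog.name)
print(dog.speak())
```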