Direct software installation

This section describes the steps needed to directly install the software on your machine.

Note

If you ever need to update your software, you might need to fully remove any existing (older) versions beforehand. Furthermore, sometimes updating R or Python libraries might also result in problems with some of your existing code.

These notes are compiled in a Docker container running Ubuntu, whereas the steps for direct software installation are provided for the Windows operating system only.

Tip

In general, we recommend that Windows, Linux and macOS users follow the containerized setup section.

Installing Python

There are a number of ways to set up Python on your machine. The three most common are the standard Python installation, Miniconda and Anaconda.

Both the Miniconda and Anaconda distributions include the conda package manager, which is used to download and install additional packages. The standard Python installation uses the pip package to download and install additional Python-only packages. In contrast, conda is a general packaging tool and installer that handles library dependencies outside of Python-only packages, as well as the Python packages themselves.

Important

Due to the 2020 changes to Anaconda's licensing terms for commercial use (see this Stack Overflow answer), it is less of a headache to use the standard Python installation. You can read more in the official Anaconda blog announcement.

The Python version that we will be using is:

Python 3.11.7

Python versions are specified as X.YY.ZZ, where X is the major version number, YY is the minor version number and ZZ is the micro version number. For example, 3.11.7 has major version 3, minor version 11 and micro version 7. The code in these notes should work as long as you have the same major and minor versions, along with identical package versions.
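As a minimal sketch of the version rule above, the following Python snippet splits a version string into its three parts and checks whether the running interpreter has the same major and minor version as these notes. The helper names `parse_version` and `matches_notes` are hypothetical and chosen only for this illustration.

```python
import sys

# Major and minor version used in these notes (3.11.7 -> 3 and 11)
EXPECTED = (3, 11)


def parse_version(text: str) -> tuple[int, int, int]:
    """Split a version string such as '3.11.7' into (major, minor, micro)."""
    major, minor, micro = (int(part) for part in text.split("."))
    return major, minor, micro


def matches_notes(version, expected=EXPECTED) -> bool:
    """Return True when the major and minor parts equal the expected pair."""
    return tuple(version[:2]) == tuple(expected)


print(parse_version("3.11.7"))
print("Running interpreter matches the notes:", matches_notes(sys.version_info))
```

The micro version is deliberately ignored in the comparison, since only the major and minor versions need to agree.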

Follow the steps outlined below:

  1. Go to the official download page and get the 64-bit Python installation with the same major and minor versions as the ones outlined above.
  2. Follow the installation steps. If you are on Windows, make sure that you select add python.exe to PATH at the beginning of the installation and Disable path length limit at the end of the installation. If you are using a different operating system, make sure that Python is added to the system’s PATH variable.
  3. Open the terminal. On Windows, press Windows key + R, then type in cmd and press Enter.
  4. In the terminal type in python --version and press Enter. The resulting Python version should be the same as the one you’ve installed.
  5. Since you’ve added Python to PATH, you can also check the location of your Python executable by typing where python in the terminal.
Tip

On Ubuntu, the equivalent command is which python, which gives the following result:

/opt/python/3.11.7/bin/python
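The same check can also be done from within Python itself, which works on any operating system. The sketch below is only an illustration; the helper name `python_on_path` is hypothetical.

```python
import shutil
import sys


def python_on_path(name: str = "python"):
    """Return the first executable called `name` found on PATH, or None."""
    return shutil.which(name)


# Path of the interpreter that is running this script
print("Running interpreter:", sys.executable)

# First `python` found on PATH; None means PATH contains no such executable
print("python on PATH:", python_on_path())
```

If the two paths differ, the terminal is picking up a different Python installation than the one you are currently running.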

After completing the above steps you will now have the base installation of Python. We will also need a number of additional libraries, which we will install after having set up R.

Installing R

The R version that we will be using is:

R version 4.3.2 (2023-10-31)

Installing R is pretty straightforward:

  1. Go to R and select the version that matches the one outlined above and install the software.
  2. Download and install RTools (Windows only!). Select the version that corresponds to the version of R that you’ve installed. Once RStudio is installed (see the RStudio section below), typing rstudioapi::versionInfo()$long_version in its console should give you the version of RStudio.

On Windows, R might not be added to the system PATH by default, so we will run R through RStudio, which we install below.

(Optional) Installing Quarto

While not directly used in these notes, Quarto is quite useful for rendering scientific documentation. As an example, these notes are written using Quarto. Newer versions of RStudio already include Quarto support, but you can install an up-to-date version of Quarto from here.

The version of Quarto that was used to render these notes is provided below:

1.3.450

Installing RStudio

The version of RStudio that we’re using is:

2023.09.1+494 (Desert Sunflower) for Ubuntu Jammy

It is structured as YYYY.MM.N+ZZZ, where YYYY is the year and MM is the month of the first release date, N is either 0, 1 or 2, indicating updated versions released at later dates, and ZZZ is the build number.
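To make the version format concrete, here is a small Python sketch that splits such a version string into its numeric parts. The function name `parse_rstudio_version` and the field names `update` (for N) and `build` (for ZZZ) are labels chosen for this illustration only.

```python
import re

# YYYY.MM.N+ZZZ, e.g. 2023.09.1+494
PATTERN = re.compile(
    r"^(?P<year>\d{4})\.(?P<month>\d{2})\.(?P<update>\d+)\+(?P<build>\d+)$"
)


def parse_rstudio_version(text: str) -> dict[str, int]:
    """Split a version such as '2023.09.1+494' into its numeric fields."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"Not in YYYY.MM.N+ZZZ format: {text!r}")
    return {name: int(value) for name, value in match.groupdict().items()}


print(parse_rstudio_version("2023.09.1+494"))
```

Only the numeric part is parsed here; the release name and platform that follow it (for example "(Desert Sunflower) for Ubuntu Jammy") are not handled.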

  1. Go to the RStudio download page and check whether the version number matches the one above. If it doesn’t, go to the older version download page and select the appropriate link based on the YYYY.MM.N format. Then make sure to download the RStudio Desktop version, which is free.
  2. Follow the installation steps and install RStudio.
  3. Open RStudio and in the Console window type in R.Version() to make sure that you have the correct version of R installed.

We can now install libraries for both programming languages.

Installing libraries

Many different libraries are available for R and Python. Below, we will list the ways to install specific library versions for most of the libraries [1].

In addition, we might need to configure additional software:

  • polars library, which is available for both R and Python. If you get an error when trying to install polars in either R or Python, then you might need to install Rust.
  • CmdStanR and CmdStanPy libraries utilize the cmdstan interface to Stan. Thankfully, cmdstan can be directly downloaded and installed by calling a function from either the cmdstanr or cmdstanpy package.

Firstly, go to Posit Package Manager and select your operating system. Then in the “Snapshots” section select "Yes, always install packages from the date I choose" and select the following date:

2023-12-08

Begin by creating the following file named environment.txt:

AER
DAAG
DT
GGally
IRdisplay
IRkernel
MASS
Matrix
OECD
RCurl
ROCR
Rcpp
Rcrawler
arrow
astsa
bookdown
caTools
car
caret
cowplot
crayon
data.table
devtools
doParallel
doSNOW
dplyr
dyn
dynlm
e1071
eurostat
fGarch
fUnitRoots
duckdb
fansi
feather
fma
fontawesome
foreach
forecast
fpp
fpp2
fpp3
gdata
ggiraph
ggplot2
ggvis
glmnet
gplots
gt
htmltools
htmlwidgets
imputeTS
kableExtra
knitr
languageserver
fst
lars
latex2exp
lattice
leaps
lmtest
lrmest
lubridate
mFilter
markdown
mfx
mice
microbenchmark
mlr3
mlr3verse
multcomp
nnet
nortest
orcutt
pROC
pak
patchwork
pbdZMQ
plm
plotly
polynom
posterior
prophet
quantmod
randomForest
readxl
renv
repr
reshape2
purrr
qs
reticulate
rmarkdown
rpart
rpart.plot
rstudioapi
rugarch
rzmq
sandwich
seasonal
shiny
shinydashboard
shinythemes
rvest
spdep
stargazer
stringr
swirl
tempdisagg
tidymodels
tidyverse
tree
tsDyn
tseries
txtplot
urca
vars
viridisLite
waveslim
writexl
yaml
zoo
skimr
vroom

Set your working directory in RStudio to the same place as the environment.txt file. Then, inside RStudio run the following command (replace the https://packagemanager.posit.co/cran/__linux__/jammy/2023-12-08 address with the Posit Package Manager URL for your operating system and the same date):

options(repos = c(CRAN = "https://packagemanager.posit.co/cran/__linux__/jammy/2023-12-08"))
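The snapshot date is the last path segment of the repository address. As a rough Python sketch, assuming that the URL shown by Posit Package Manager for your operating system also ends with the date (or with latest), the date can be swapped in as follows. The helper `with_snapshot_date` is hypothetical, and the input URL is only an example.

```python
from datetime import date
from urllib.parse import urlsplit, urlunsplit


def with_snapshot_date(repo_url: str, snapshot: str) -> str:
    """Return repo_url with its last path segment replaced by the snapshot date."""
    date.fromisoformat(snapshot)  # raises ValueError for a malformed date
    parts = urlsplit(repo_url)
    segments = parts.path.rstrip("/").split("/")
    segments[-1] = snapshot  # assumption: the last segment is the date or 'latest'
    return urlunsplit(parts._replace(path="/".join(segments)))


# Example input: the Ubuntu (jammy) repository address with 'latest' as the last segment
url = "https://packagemanager.posit.co/cran/__linux__/jammy/latest"
print(with_snapshot_date(url, "2023-12-08"))
```

Always compare the result with the address that Posit Package Manager itself displays for your operating system before using it in options(repos = ...).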

Finally, run the following code:

# Pass argument to script to change working directory to the file location
# e.g., Rscript /tmp/install_libraries.R /tmp/
args <- commandArgs(trailingOnly = TRUE)
if(length(args) > 0){
  setwd(args[1])
}

## Read the file with the list of libraries:
lib_list <- unname(unlist(na.omit(read.table("./environment.txt"))))

## Drop any packages that are already installed as base or recommended packages:
installed <- installed.packages()
## drop = FALSE keeps the matrix shape even if only one package matches
new_packages <- installed[installed[, "Package"] %in% lib_list, , drop = FALSE]
new_packages <- data.frame(new_packages[, c("Package", "Version", "Priority"), drop = FALSE])
new_packages <- new_packages[(new_packages$Priority %in% c("base", "recommended")), ]
if(nrow(new_packages) > 0){
  lib_list <- setdiff(lib_list, unique(new_packages$Package))
}

## Install the remaining libraries (uses the CRAN repository set in options(repos);
## in the Docker setup this is set by the dockerfile)
install.packages(lib_list)

## Other libraries, that are outside of CRAN (only latest versions):
install.packages("polars", repos = "https://rpolars.r-universe.dev/bin/linux/jammy/4.3") 
install.packages("cmdstanr", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))
# https://pkg.yangzhuoranyang.com/tsdl/articles/tsdl.html
devtools::install_github("FinYang/tsdl")

The versions of the installed libraries are as follows:

           Package   Version
1              AER    1.2-10
2            arrow  14.0.0.2
3            astsa       2.0
4         bookdown      0.37
5              car     3.1-2
6            caret    6.0-94
7          caTools    1.18.2
8          cowplot     1.1.1
9           crayon     1.5.2
10            DAAG    1.25.4
11      data.table    1.14.8
12        devtools     2.4.5
13      doParallel    1.0.17
14          doSNOW    1.0.20
15           dplyr     1.1.4
16              DT      0.30
17          duckdb   0.9.2-1
18             dyn   0.2-9.6
19           dynlm     0.3-6
20           e1071    1.7-14
21        eurostat     3.8.2
22           fansi     1.0.5
23         feather     0.3.5
24          fGarch   4031.90
25             fma       2.5
26     fontawesome     0.5.2
27         foreach     1.5.2
28        forecast    8.21.1
29             fpp       0.5
30            fpp2       2.5
31            fpp3       0.5
32             fst     0.9.8
33      fUnitRoots   4021.80
34           gdata     3.0.0
35          GGally     2.2.0
36         ggiraph     0.8.7
37         ggplot2     3.4.4
38           ggvis     0.4.8
39          glmnet     4.1-8
40          gplots     3.1.3
41              gt    0.10.0
42       htmltools     0.5.7
43     htmlwidgets     1.6.4
44        imputeTS       3.3
45       IRdisplay       1.1
46        IRkernel     1.3.2
47      kableExtra     1.3.4
48           knitr      1.45
49  languageserver    0.3.16
50            lars       1.3
51       latex2exp     0.9.6
52         lattice    0.21-9
53           leaps       3.1
54          lmtest    0.9-40
55          lrmest       3.0
56       lubridate     1.9.3
57        markdown      1.12
58            MASS    7.3-60
59          Matrix   1.6-1.1
60         mFilter     0.1-5
61             mfx     1.2-2
62            mice    3.16.0
63  microbenchmark    1.4.10
64            mlr3    0.17.0
65       mlr3verse     0.2.8
66        multcomp    1.4-25
67            nnet    7.3-19
68         nortest     1.0-4
69            OECD     0.2.5
70          orcutt       2.3
71             pak     0.7.0
72       patchwork     1.1.3
73          pbdZMQ    0.3-10
74             plm     2.6-3
75          plotly    4.10.3
76         polynom     1.4-1
77       posterior     1.5.0
78            pROC    1.18.5
79         prophet       1.0
80           purrr     1.0.2
81              qs    0.25.7
82        quantmod    0.4.25
83    randomForest   4.7-1.1
84            Rcpp    1.0.11
85        Rcrawler   0.1.9-1
86           RCurl 1.98-1.13
87          readxl     1.4.3
88            renv     1.0.3
89            repr     1.1.6
90        reshape2     1.4.4
91      reticulate    1.34.0
92       rmarkdown      2.25
93            ROCR    1.0-11
94           rpart    4.1.21
95      rpart.plot     3.1.1
96      rstudioapi    0.15.0
97         rugarch     1.5-1
98           rvest     1.0.3
99            rzmq    0.9.12
100       sandwich     3.0-2
101       seasonal     1.9.0
102          shiny     1.8.0
103 shinydashboard     0.7.2
104    shinythemes     1.2.0
105          skimr     2.1.5
106          spdep     1.3-1
107      stargazer     5.2.3
108        stringr     1.5.1
109          swirl     2.4.5
110     tempdisagg     1.1.1
111     tidymodels     1.1.1
112      tidyverse     2.0.0
113           tree    1.0-43
114          tsDyn    11.0.4
115        tseries   0.10-55
116        txtplot     1.0-4
117           urca     1.3-3
118           vars     1.6-0
119    viridisLite     0.4.2
120          vroom     1.6.5
121       waveslim     1.8.4
122        writexl     1.4.2
123           yaml     2.3.7
124            zoo    1.8-12

Create a file called requirements.txt and add the following libraries [2]:

absl-py==2.1.0
aiohttp==3.9.3
aiohttp-cors==0.7.0
aiosignal==1.3.1
altair==5.2.0
anyio==4.2.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
arviz==0.17.0
asciitree==0.3.3
asttokens==2.4.1
astunparse==1.6.3
async-lru==2.0.4
attrs==23.2.0
Babel==2.14.0
bambi==0.13.0
beartype==0.17.0
beautifulsoup4==4.12.3
bleach==6.1.0
blessed==1.20.0
blinker==1.7.0
bokeh==2.4.3
cachetools==5.3.2
certifi==2024.2.2
cffi==1.16.0
cftime==1.6.3
charset-normalizer==3.3.2
click==8.1.7
click-default-group==1.2.4
cloudpickle==3.0.0
cloup==2.1.2
cmdstanpy==1.2.0
colorama==0.4.6
colorful==0.5.6
comm==0.2.1
commonmark==0.9.1
cons==0.4.6
contourpy==1.2.0
cycler==0.12.1
dask==2024.1.1
datar==0.15.4
datatable==1.1.0
debugpy==1.8.0
decorator==5.1.1
defusedxml==0.7.1
diot==0.2.3
distlib==0.3.8
distributed==2024.1.1
duckdb==0.9.2
etuples==0.3.9
executing==2.0.1
fasteners==0.19
fastjsonschema==2.19.1
fastprogress==1.0.3
filelock==3.13.1
flatbuffers==23.5.26
fonttools==4.47.2
formulae==0.5.1
fqdn==1.5.1
frozenlist==1.4.1
fsspec==2023.12.2
gast==0.5.4
gitdb==4.0.11
GitPython==3.1.41
glcontext==2.5.0
google-api-core==2.16.2
google-auth==2.27.0
google-auth-oauthlib==1.2.0
google-pasta==0.2.0
googleapis-common-protos==1.62.0
gpustat==1.1.1
graphviz==0.20.1
great-tables==0.2.0
greenlet==3.0.3
griffe==0.40.0
grpcio==1.60.1
h5netcdf==1.3.0
h5py==3.10.0
htmltools==0.5.1
idna==3.6
importlib-metadata==7.0.1
importlib-resources==6.1.1
inflection==0.5.1
ipykernel==6.29.0
ipython==8.21.0
ipywidgets==8.1.1
isoduration==20.11.0
isosurfaces==0.1.0
jedi==0.19.1
Jinja2==3.1.3
joblib==1.3.2
json5==0.9.14
jsonpointer==2.4
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
jupyter==1.0.0
jupyter-cache==1.0.0
jupyter-console==6.6.3
jupyter-events==0.9.0
jupyter-lsp==2.2.2
jupyter_client==8.6.0
jupyter_core==5.7.1
jupyter_server==2.12.5
jupyter_server_terminals==0.5.2
jupyterlab==4.0.12
jupyterlab-lsp==5.0.2
jupyterlab-widgets==3.0.9
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.2
keras==2.15.0
kiwisolver==1.4.5
lckr_jupyterlab_variableinspector==3.2.1
libclang==16.0.6
llvmlite==0.42.0
locket==1.0.0
logical-unification==0.4.6
manim==0.18.0
ManimPango==0.5.0
mapbox-earcut==1.0.1
Markdown==3.5.2
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.8.2
matplotlib-inline==0.1.6
mdurl==0.1.2
miniKanren==1.0.3
mistune==3.0.2
mizani==0.9.3
ml-dtypes==0.2.0
moderngl==5.10.0
moderngl-window==2.4.4
modin==0.26.1
modin-spreadsheet==0.1.2
mpmath==1.3.0
msgpack==1.0.7
multidict==6.0.5
multipledispatch==1.0.0
nbclient==0.9.0
nbconvert==7.14.2
nbformat==5.9.2
nest-asyncio==1.6.0
netCDF4==1.6.5
networkx==3.2.1
notebook==7.0.7
notebook_shim==0.2.3
numba==0.59.0
numcodecs==0.12.1
numexpr==2.9.0
numpy==1.26.3
nvidia-ml-py==12.535.133
oauthlib==3.2.2
opencensus==0.11.4
opencensus-context==0.1.3
opt-einsum==3.3.0
overrides==7.7.0
packaging==23.2
pandas==2.1.4
pandocfilters==1.5.1
parso==0.8.3
partd==1.4.1
patsy==0.5.6
pexpect==4.9.0
Pillow==9.5.0
pipda==0.13.1
platformdirs==4.2.0
plotnine==0.12.4
plum-dispatch==2.3.2
polars==0.20.6
prometheus-client==0.19.0
prompt-toolkit==3.0.43
protobuf==4.23.4
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
py-spy==0.3.14
pyarrow==15.0.0
pyasn1==0.5.1
pyasn1-modules==0.3.0
pycairo==1.25.1
pycparser==2.21
pydantic==1.10.14
pydeck==0.8.1b0
pydub==0.25.1
pyglet==2.0.10
Pygments==2.17.2
pymc==5.10.3
pyparsing==3.1.1
pyrr==0.10.3
pytensor==2.18.6
python-dateutil==2.8.2
python-json-logger==2.0.7
python-simpleconf==0.6.0
pytz==2024.1
PyYAML==6.0.1
pyzmq==25.1.2
qtconsole==5.5.1
QtPy==2.4.1
quartodoc==0.7.2
ray==2.9.1
referencing==0.33.0
requests==2.31.0
requests-oauthlib==1.3.1
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.7.0
rpds-py==0.17.1
rsa==4.9
scikit-learn==1.4.0
scipy==1.12.0
screeninfo==0.8.1
seaborn==0.13.2
Send2Trash==1.8.2
simplug==0.3.2
siuba==0.4.2
six==1.16.0
skia-pathops==0.7.4
skimpy==0.0.14
smart-open==6.4.0
smmap==5.0.1
sniffio==1.3.0
sortedcontainers==2.4.0
soupsieve==2.5
sphobjinv==2.3.1
SQLAlchemy==2.0.25
srt==3.5.3
stack-data==0.6.3
stanio==0.3.0
statsmodels==0.14.1
streamlit==1.31.0
svgelements==1.9.6
sympy==1.12
tabulate==0.9.0
tblib==3.0.0
tenacity==8.2.3
tensorboard==2.15.1
tensorboard-data-server==0.7.2
tensorflow==2.15.0.post1
tensorflow-estimator==2.15.0
tensorflow-io-gcs-filesystem==0.35.0
termcolor==2.4.0
terminado==0.18.0
threadpoolctl==3.2.0
tinycss2==1.2.1
toml==0.10.2
toolz==0.12.1
torch==2.2.0+cpu
torchaudio==2.2.0+cpu
torchvision==0.17.0+cpu
tornado==6.4
tqdm==4.66.1
traitlets==5.14.1
typeguard==4.1.5
types-python-dateutil==2.8.19.20240106
typing_extensions==4.9.0
tzdata==2023.4
tzlocal==5.2
ujson==5.9.0
uri-template==1.3.0
urllib3==2.2.0
validators==0.22.0
virtualenv==20.25.0
watchdog==3.0.0
wcwidth==0.2.13
webcolors==1.13
webencodings==0.5.1
websocket-client==1.7.0
Werkzeug==3.0.1
widgetsnbextension==4.0.9
wrapt==1.14.1
xarray==2024.1.1
xarray-datatree==0.0.13
xarray-einstats==0.7.0
yarl==1.9.4
zarr==2.16.1
zict==3.0.0
zipp==3.17.0

Finally, you can install the libraries by opening the terminal in the same location as requirements.txt and running:

run in terminal
python -m pip install --upgrade pip
python -m pip install --upgrade setuptools
python -m pip install -r requirements.txt --find-links https://download.pytorch.org/whl/torch_stable.html
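After the installation finishes, it can be useful to confirm that the installed versions match the pinned ones. The following is a minimal sketch using the standard-library importlib.metadata module; the helper names `parse_requirements` and `find_mismatches` are hypothetical, and the check is a plain string comparison of version numbers.

```python
from importlib import metadata
from pathlib import Path


def parse_requirements(lines):
    """Yield (name, version) pairs from lines of the form 'name==version'."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        yield name.strip(), version.strip()


def find_mismatches(pins):
    """Return {name: (pinned, installed)}; installed is None if the package is missing."""
    mismatches = {}
    for name, pinned in pins:
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


requirements = Path("requirements.txt")
if requirements.exists():
    pins = parse_requirements(requirements.read_text().splitlines())
    for name, (pinned, installed) in find_mismatches(pins).items():
        print(f"{name}: pinned {pinned}, installed {installed}")
else:
    print("requirements.txt not found in the current directory")
```

Run it from the same directory as requirements.txt; no output from the loop means that every pinned package was found with the expected version.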

Configuring R for JupyterLab

Open up R and run the following code:

reticulate::py_available(initialize = TRUE)
IRkernel::installspec(user=FALSE)

This will make R available in JupyterLab.
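To confirm that the kernel was registered, you can list the kernels known to Jupyter. The Python sketch below calls jupyter kernelspec list --json and looks for a kernel named ir, which is the default name used by IRkernel; the helper names are hypothetical, and the check assumes that the default kernel name was not changed.

```python
import json
import shutil
import subprocess


def kernel_names(listing_json: str) -> list[str]:
    """Extract kernel names from the output of `jupyter kernelspec list --json`."""
    return sorted(json.loads(listing_json).get("kernelspecs", {}))


def installed_kernels() -> list[str]:
    """Ask Jupyter for its registered kernels; empty list if jupyter is unavailable."""
    if shutil.which("jupyter") is None:
        return []
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return kernel_names(result.stdout)


names = installed_kernels()
print("Kernels:", names)
print("R kernel registered:", "ir" in names)
```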

Visual Studio Code (vscode)

Download Visual Studio Code here and install it. In this case feel free to download the newest version, since we can use vscode for various tasks, such as:

  • Writing a dockerfile.
  • Running R code.
  • Running Python code.
  • Running JupyterLab.

RStudio is better suited for most R-related tasks, but it may crash in certain cases (e.g. long calculations with multiple plot outputs), so it is useful to have vscode as a fallback IDE for R.

vscode extensions

One important feature of vscode is the extensive number of extensions available for various programming languages and processes. A number of useful extensions for vscode are:

All of these (and many more) extensions can be found directly in vscode by pressing Ctrl+Shift+X and searching the marketplace.


  1. Unfortunately, some libraries in R (e.g. polars, tsdl and the Stan-related libraries) do not provide an easy way to download older versions. Thankfully, the specific library versions are listed in this section, which will hopefully help preserve some reproducibility.

  2. Note that this library list was generated on Ubuntu, but should work on Windows. If some libraries are unavailable on Windows, remove them: such OS-specific libraries are secondary dependencies of the main packages (some main packages are: statsmodels, pandas, torch, etc.). pip will automatically install any dependencies not explicitly listed in the requirements.txt file.