dspa
Description
This repository contains a subset of the Python code associated with the distributed Split and Augment Gibbs sampler (DSPA) described in the following paper. It can be used to reproduce the results reported in Table 4.
[1] P.-A. Thouvenin, A. Repetti, P. Chainais, submitted to JCGS, vol. ..., no. ..., pp. ..., year.
Authors: P.-A. Thouvenin, A. Repetti, P. Chainais
Installation
- Clone the current repository and create an anaconda environment as follows.
# If you need to install Miniconda
wget -P path/to/install https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash path/to/install/Miniconda3-latest-Linux-x86_64.sh # say yes to the default options
source ~/.bashrc # or .zshrc, depending on your shell
conda install -c anaconda conda-build # provides the `conda develop` command used below
# Cloning the repo
git clone https://gitlab.cristal.univ-lille.fr/pthouven/dspa.git
cd dspa
# Create anaconda environment
conda create --name demo-jcgs --file conda-osx-64.lock # or linux-64, win-64
# Install the proposed Python library in development mode in the environment
conda activate demo-jcgs
conda develop src/
# equivalent in pip:
# pip install -e .
- To avoid possible file-locking issues with h5py, add the following line to your ~/.zshrc file (or ~/.bashrc).
export HDF5_USE_FILE_LOCKING='FALSE'
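If the installation steps are run more than once, the export line can end up duplicated in the startup file. The following is a minimal sketch, assuming bash or POSIX sh, of a way to add it only when it is missing. The function name `add_hdf5_locking_setting` is hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of the repository): append the export line
# to a shell startup file only if the exact same line is not already there.
add_hdf5_locking_setting() {
    rc_file="$1"
    export_line="export HDF5_USE_FILE_LOCKING='FALSE'"
    # Create the file if it does not exist yet
    touch "$rc_file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF "$export_line" "$rc_file"; then
        printf '%s\n' "$export_line" >> "$rc_file"
    fi
}

# Example usage (adapt the file name to your shell):
# add_hdf5_locking_setting "$HOME/.zshrc"
```

Open a new terminal (or source the startup file) for the setting to take effect.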
Experiments
- The .sh scripts provided in the examples/deconvolution folder require tmux to be installed. Each script opens a tmux session in the background, in which the provided Python script is run. To run a representative example of the experiments from the examples/deconvolution folder, configure and run the submit_local_serial.sh script (serial version of the sampler) or the submit_local.sh script (distributed sampler). Both contain similar parameters to be configured (more details are given in these 2 files).
- An example to reproduce the results of the strong scaling experiment from Table 4, using the submit_local.sh and submit_metric_p.sh scripts, is reported below.
conda activate demo-jcgs
# * generate the data (only required once)
# set dataflag='--data' in submit_local.sh, then execute
bash submit_local.sh
# check name of the tmux session, called session_name in the following
tmux list-sessions
tmux a -t session_name # once the work is done, kill the session (ctrl+b, then type :kill-session)
# * run the sampler
# set dataflag='' in submit_local.sh, then execute
bash submit_local.sh
# check name of the tmux session, called session_name in the following
tmux list-sessions
tmux a -t session_name
# to detach from a session (leave it running in the background), press ctrl+b, then d
# once the work is done, kill the session, either from within tmux (ctrl+b, then type :kill-session) or from a normal terminal with
tmux kill-session -t session_name
# * deactivating the conda environment (when no longer needed)
conda deactivate
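Since the submission scripts rely on tmux and on the conda environment, a quick check before launching the commands above can save a failed run. The following is a minimal sketch, assuming bash or POSIX sh. The helper names `require_command` and `check_requirements` are hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of the repository): succeed only if the
# given command can be found by the current shell.
require_command() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "error: '$1' not found in PATH" >&2
    return 1
}

# Check several tools and report every missing one, instead of stopping
# at the first failure. Returns 0 only if all tools are available.
check_requirements() {
    missing=0
    for tool in "$@"; do
        require_command "$tool" || missing=1
    done
    return "$missing"
}

# Example usage: only launch the script if both tools are available
# check_requirements tmux conda && bash submit_local.sh
```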
- To quickly inspect the content of an .h5 file from the terminal (e.g., one of the checkpoint files produced), the following commands may prove useful.
# replace <filename> by the name of your file
h5dump --header <filename>.h5 # displays the name and size of all variables contained in the file
h5dump <filename>.h5 # displays the value of all the variables saved in the file
h5dump -d "/GroupFoo/databar[1,1;2,3;3,19;1,1]" <filename>.h5 # displays part of the dataset "databar" from the group GroupFoo within a given .h5 file
h5dump -d dset <filename>.h5 # displays the content of a dataset dset
h5ls -d <filename>.h5/dset # displays the content of a dataset dset
df -h --total # check total disk usage
du -sh ~ # disk usage for the current user (in ~ directory)
Please check the h5py documentation for further details.
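When a run produces several checkpoint files, the header inspection above can be applied to a whole folder at once. The following is a minimal sketch, assuming bash or POSIX sh and that `h5dump` is installed. The function name `inspect_h5_headers` is hypothetical, and the folder passed to it is whichever directory holds your .h5 files.

```shell
# Hypothetical helper (not part of the repository): print the header
# (names and sizes of the stored variables) of every .h5 file in a folder.
inspect_h5_headers() {
    h5_dir="${1:-.}"   # default to the current directory
    for h5_file in "$h5_dir"/*.h5; do
        # Skip the unexpanded pattern when the folder contains no .h5 file
        [ -e "$h5_file" ] || continue
        echo "== $h5_file =="
        h5dump --header "$h5_file"
    done
}

# Example usage:
# inspect_h5_headers path/to/checkpoints
```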
Versioning
This project will use SemVer for versioning, once the API is fully specified.
License
The project is licensed under the GPL-3.0 license.