lapinskim/ngs19

 
 

Materials for #NGSchool2019 - Machine Learning for Biomedicine

Here you will find the materials for the workshops, hackathons and lectures at #NGSchool2019, together with installation instructions and tips for running the software needed to participate.

Table of Contents

General instructions

Colab

Google Colab is an online service in which you can run Jupyter notebooks (and even use a limited amount of GPU time!). It comes with some preloaded libraries, which makes it easier to teach and run tutorials without spending too much time on fixing dependencies.

Working on Prometheus

To access the cluster:

ssh [email protected]

Clone NGSchool repo:

git clone https://github.com/NGSchoolEU/ngs19.git

Folders for workshops that are meant to be run on the cluster contain *.slurm files with the job descriptions.

To run a job:

sbatch jobname.slurm

The job log will be created in the current directory as jobname-log-JOBID.txt.
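When jobs are submitted from a script, the job ID printed by sbatch can be used to work out the expected log file name. The following is a minimal Python sketch, assuming sbatch reports the submission in its usual form, "Submitted batch job <JOBID>", and that the log follows the jobname-log-JOBID.txt pattern described above. The helper names parse_job_id and log_file_name are illustrative and are not part of this repository.

```python
import re


def parse_job_id(sbatch_output: str) -> str:
    """Extract the numeric job ID from the text printed by sbatch."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"no job ID found in: {sbatch_output!r}")
    return match.group(1)


def log_file_name(job_name: str, job_id: str) -> str:
    """Build the expected log name (assumed pattern: jobname-log-JOBID.txt)."""
    return f"{job_name}-log-{job_id}.txt"


if __name__ == "__main__":
    # Example text only; on the cluster this would be the captured
    # standard output of `sbatch jobname.slurm`.
    example_output = "Submitted batch job 123456"
    job_id = parse_job_id(example_output)
    print(log_file_name("jobname", job_id))
```

The same job ID can then be passed to scancel if the job needs to be cancelled.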

To check if the job is in the queue:

squeue -u $USER

To cancel the job:

scancel JOBID

The majority of the workshops will be run inside notebooks; how to use them on the cluster is described in Intro to HPC.

Talks

Guillaume Filion - "An experiment on anti-academic research"

Workshops

Intro to HPC

tutor: Klemens Noga

The website with information about the workshop can be accessed here.

Intro to R

tutor: Maja Kuzman

Intro to Python

tutor: Kasia Kędzierska

The whole workshop will be executed in a Jupyter notebook and will rely on several Python packages. In the directory you can find a setup_check.sh script that you can run to see if your environment satisfies all requirements.

Install the requirements and check that they are satisfied:

bash intro_to_python/setup_check.sh

Requirements:

  • python3
  • Jupyter
  • python3 modules:
    • numpy
    • pandas
    • matplotlib
    • scipy
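As a complement to setup_check.sh, the listed python3 modules can also be checked from within Python itself. The following is a minimal sketch, not a reproduction of what the script does: the function name missing_modules is illustrative, and the check only looks for the four modules listed above (it does not verify Jupyter or any package versions).

```python
import importlib.util

# Import names of the python3 modules listed above.
REQUIRED_MODULES = ["numpy", "pandas", "matplotlib", "scipy"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are available.")
```

Using importlib.util.find_spec locates each module without importing it, so the check stays fast even for large packages.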

Intro to Stats

tutor: German Demidov

Unsupervised learning

tutor: Kasia Kędzierska

Slides: unsupervised_learning/unsupervised_learning_slides.pdf

The workshop will be run in an R notebook. We will work locally, and the following packages are required.

Requirements:

  • R 3.5+
  • tidyverse 1.2.1+
  • factoextra 1.0.5+
  • ggpubr 0.2+
  • ggsci 2.9+
  • MASS 7.3-50+
  • tsne 0.1-3+
  • umap 0.2.3.1+

The following code installs any of these packages that cannot be loaded:

required_packages <- c("tidyverse", "factoextra", "ggpubr", 
                       "ggsci", "MASS", "tsne", "umap")

for (pkg in required_packages) {
  if(!require(pkg, character.only = TRUE, 
              quietly = TRUE, 
              warn.conflicts = FALSE)) {
    print(paste0("Warning! Installing package: ", pkg, "."))
    install.packages(pkg)
  } 
}

print("All done! :)")

Bayesian Inference

tutor: Roman Cheplyaka

Run the following commands either in RStudio or in an interactive R session:

required_packages <- c("rstan", "StanHeaders", "magrittr", "reshape2", 
                       "forcats", "stringr", "dplyr", "purrr", "readr",
                       "tidyr", "tibble")

for (pkg in required_packages) {
  if(!require(pkg, character.only = TRUE, 
              quietly = TRUE, 
              warn.conflicts = FALSE)) {
    print(paste0("Warning! Installing package: ", pkg, "."))
    install.packages(pkg)
  } 
}

print("All done! :)")

Natural language processing

tutor: Noura Al Moubayed

Installation guidelines

  1. Install miniconda

Start by installing miniconda.

https://docs.conda.io/en/latest/miniconda.html

  2. Create conda environment

To keep things simple, create the environment from the yml file nlp/workshop.yml:

conda env create -f nlp/workshop.yml

  3. Install the missing package from a local copy

a. Copy the file from USB

Because the file is large (>1 GB), we are distributing en_core_web_lg on USB sticks on site. If you copy the file from a USB stick, change the pip install command below to point to the location of the file.

b. Copy from server

If you did not copy the file from a USB stick, copy it from the local server.

scp <your-user>@10.0.0.200:/srv/en_core_web_lg-2.2.0.tar.gz ~/

Now, install it.

# python -m spacy download en_core_web_lg
conda activate workshop
pip install /path/to/folder/with/en_core_web_lg-2.2.0.tar.gz

  4. Clone the repository

Make sure your local copy of the GitHub repository is up to date and unpack one of the files in the nlp directory. The file is gzipped to reduce its size.

git pull origin master
gunzip nlp/tutorial_features.pkl.gz
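On systems where the gunzip command is not available, the file can be unpacked with the Python standard library instead. The following is a minimal sketch under that assumption; the function name gunzip_file and its keep_original parameter are illustrative. Unlike gunzip, it keeps the compressed file by default.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(gz_path, keep_original=True):
    """Decompress a .gz file next to itself and return the output path."""
    gz_path = Path(gz_path)
    if gz_path.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got: {gz_path}")
    out_path = gz_path.with_suffix("")  # drops only the trailing ".gz"
    # Stream the data so large files are not loaded into memory at once.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    if not keep_original:
        gz_path.unlink()  # mimic gunzip, which removes the compressed file
    return out_path
```

For the workshop file this would be called as gunzip_file("nlp/tutorial_features.pkl.gz") from the repository root.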

Running the workshop

cd nlp
conda activate workshop
jupyter notebook

Reinforcement Learning

tutor: Robert Loftin

In order to run locally:

conda create --name reinforced python=3.7
conda activate reinforced
pip install numpy==1.17.3
pip install gym==0.15.3
pip install matplotlib==3.0.3
#pip install torch==1.3.0
conda install pytorch torchvision cpuonly -c pytorch
pip install chainer
pip install minerl
pip install opencv-python-headless
pip install roboschool
conda install jupyter
conda install -c anaconda openjdk
jupyter-notebook

Deep learning methods for genomics

tutor: Ron Schwessinger

Slides

The hands-on part of the seminar will be run in a Google Colab notebook, so a Google account is required. Additional information can be found in this repo, but there is no need to install anything for the workshop.

Deep Generative Models for dimensionality reduction

tutor: Kaspar Märtens

Link to slides

In the hands-on part of the tutorial, we will implement an Autoencoder on MNIST data. See the Google Colab notebook for Autoencoders on MNIST.

For those interested, there is also an additional Colab notebook for Variational Autoencoders.

Tree based methods

tutor: Rosa Karlic

You will work locally in RStudio. Execute the following code to install the packages:

required_packages <- c("caret", "rpart", "e1071",
                       "ranger", "dplyr", "randomForest", "rpart.plot",
                       "ipred", "bst", "plyr")

for (pkg in required_packages) {
  if(!require(pkg, character.only = TRUE, 
              quietly = TRUE, 
              warn.conflicts = FALSE)) {
    print(paste0("Warning! Installing package: ", pkg, "."))
    install.packages(pkg)
  } 
}

print("All done! :)")

Lasso workshop

tutor: Tim Padvitski

You will work locally in RStudio. Execute the following code to install the packages:

required_packages <- c("c060", "glmnet", "igraph")

for (pkg in required_packages) {
  if(!require(pkg, character.only = TRUE, 
              quietly = TRUE, 
              warn.conflicts = FALSE)) {
    print(paste0("Warning! Installing package: ", pkg, "."))
    install.packages(pkg)
  } 
}

print("All done! :)")

Hackathons

Dilated Convolutional Neural Nets for DNase-seq and ATAC-seq footprinting

Requirements:

  • python3
  • Keras with TensorFlow v1.14 as the backend
  • numpy
  • scikit-learn
  • Google account for working in Colab notebooks

Literature:
