
RNAscClust


Synopsis


RNAscClust is a pipeline to cluster a set of structured RNAs taking their respective structural conservation into account. The aim of RNAscClust is to aid the discovery of families and classes of ncRNAs.

The input to RNAscClust is a set of multiple structural alignments of RNA sequences. Each alignment contains an RNA sequence from a species of interest structurally aligned to homologous sequences. RNAscClust computes minimum free-energy structures for each sequence from the species of interest using conserved base pairs as prior information for the folding. The sequences originating from the organism of interest are then clustered using a graph kernel-based strategy, which identifies common structural features.

Download


The source code is available as a tarball.

Latest release: RNAscClust 1.1.1

Previous releases: Releases

Installation and Usage


Instructions on installation and usage of the source package can be found in the file README.md included in the downloaded tarball.
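As a minimal sketch of how to get to those instructions, the helper below unpacks a downloaded tarball and prints the location of the README.md it contains. The function name `unpack_and_find_readme`, the target directory, and the tarball file name in the usage comment are assumptions made for illustration; substitute the name of the file you actually downloaded.

```shell
# Unpack a source tarball and print the path of the first README.md found in it.
# Assumption: this helper and all names in it are illustrative, not part of RNAscClust.
unpack_and_find_readme() {
    tarball="$1"        # path to the downloaded tarball
    dest="${2:-.}"      # directory to unpack into (default: current directory)
    mkdir -p "$dest" || return 1
    # tar detects the compression format when reading an archive
    tar -xf "$tarball" -C "$dest" || return 1
    find "$dest" -type f -name 'README.md' | head -n 1
}

# Example (hypothetical file name):
#   less "$(unpack_and_find_readme RNAscClust-1.1.1.tar.gz src)"
```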

RNAscClust Docker Image


RNAscClust is available as a Docker image on Docker Hub. With the image, the pipeline can be set up in a few minutes without installing any of the dependencies manually. The container also makes it easy to reproduce all figures and tables shown in the Results section of the RNAscClust paper (see the reference under Publication) by executing a short sequence of command-line instructions. The RNAscClust Docker container supports multi-core execution.

Docker Acquisition and Usage Recipe

docker pull mmiladi/rnascclust:latest
docker run -it -h dockersgeserver mmiladi/rnascclust:latest

The Docker container will start up with examples on how to run the pipeline and evaluations.

Example Commands


# In a terminal on the host system:
docker pull mmiladi/rnascclust:latest
# Mount ./cluster_evaluation from the host so that the results remain available after the container exits.
docker run -it -v "$(pwd)/cluster_evaluation:/cluster_evaluation" -h dockersgeserver mmiladi/rnascclust:latest
# Inside the Docker container:
cd / && bash /rnascclust/bin/clustering/run_clustering_docker.sh >cluster_evaluation/run_clustering_docker.log 2>&1

The clustering takes about 2 hours. Afterwards, the directory cluster_evaluation contains the figures (.pdf) and tables (.txt), named as in the manuscript.
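To check on the host that the run produced output, a small helper along the following lines can count the generated files. This is a sketch under assumptions: the function name `summarize_results` is made up for illustration, and it searches the directory recursively because the exact layout inside cluster_evaluation is not described here.

```shell
# Count the figure (.pdf) and table (.txt) files below a results directory.
# Assumption: illustrative helper, not part of the RNAscClust distribution.
summarize_results() {
    out_dir="${1:-cluster_evaluation}"
    if [ ! -d "$out_dir" ]; then
        echo "missing directory: $out_dir" >&2
        return 1
    fi
    # tr strips the padding that some wc implementations add
    n_pdf=$(find "$out_dir" -type f -name '*.pdf' | wc -l | tr -d ' ')
    n_txt=$(find "$out_dir" -type f -name '*.txt' | wc -l | tr -d ' ')
    echo "figures (.pdf): $n_pdf"
    echo "tables (.txt): $n_txt"
}

# Example, run from the host directory that was mounted into the container:
#   summarize_results cluster_evaluation
```

If both counts are zero, the log file written by the clustering command (cluster_evaluation/run_clustering_docker.log) is the first place to look.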

Benchmark Data Sets


The data sets used for benchmarking in the paper are available for download.

License


The software is available under the GNU General Public License, version 3 (GPLv3).

Publication


Milad Miladi*, Alexander Junge*, Fabrizio Costa, Stefan E. Seemann, Jakob Hull Havgaard, Jan Gorodkin, and Rolf Backofen. RNAscClust: clustering RNA sequences using structure conservation and graph based motifs (2016). Bioinformatics 33, no. 14 (2017): 2089-2096.

*these authors contributed equally to this work

Contact


RNAscClust is developed by the Chair for Bioinformatics, University of Freiburg and the Center for non-coding RNA in Technology and Health (RTH), University of Copenhagen.

For scientific questions, please contact:

For technical questions, please contact: