Nucleosome positioning analysis and predictions

Below is a manually curated collection of online tools relevant to nucleosome positioning. This list is constantly updated; comments are very welcome. The database of experimental nucleosome positioning in different cell types has been moved to a separate page. For protein-DNA interactions of non-histone proteins and TFs, see the section on TF-DNA binding. Also, have a look at the epigenetic modifications section of the site.

*How to cite: Teif V.B. (2016). Nucleosome positioning: resources and tools online. Briefings in Bioinformatics 17, 745-757.  | Published version | Author’s PDF

| MNase-seq analysis software | nucleosome positioning prediction | experimental nucleosome datasets |

               1) Software to process nucleosome mapping experiments (sorted alphabetically):

BINOCh: Binding Inference from Nucleosome Occupancy Changes (He et al. 2010; Meyer et al. 2011). This is a Python package that identifies putative enhancers by comparing nucleosome occupancy in two cell conditions and analyzing DNA motifs near nucleosome centres and edges. It requires sorted BED files as input and relies for peak calling on the software NPS developed by the same group. Applicable to single- and paired-end sequencing.

CAM: A Quality Control Pipeline For MNase-Seq Data. CAM takes either a raw sequencing file or an aligned file as input (supporting both paired-end and single-end data) and provides multiple QC measurements and nucleosome organization profiles on potentially functionally related regions for a given MNase-seq dataset. CAM also includes 268 historical MNase-seq datasets from human and mouse as a reference atlas.

ChIPseqR: Analysis of ChIP-Seq experiments using R; an R package included in Bioconductor (Humburg et al. 2011). ChIPseqR takes mapped reads as input and outputs nucleosome centres and their scores. It can produce basic statistical graphs using standard R functions. Applicable to single-end sequencing.

DANPOS and DANPOS2: Dynamic Analysis of Nucleosome Positioning and Occupancy by Sequencing (Chen et al. 2013). This is a Python package, which reports changes in location, fuzziness, or occupancy for a given nucleosome or any genomic region. It allows generating aggregate profile plots and heatmaps for subsets of genomic regions. Applicable to paired-end sequencing.
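The three quantities named above (location, fuzziness and occupancy) can be illustrated with a small sketch. This is not DANPOS code: the function names are hypothetical, and the definitions used here (occupancy as the number of fragment midpoints in a window, location as their mean, fuzziness as their standard deviation) are simplifying assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def summarize_peak(fragments, window_start, window_end):
    """Summarize one candidate nucleosome from paired-end fragments.

    fragments: iterable of (start, end) coordinates of sequenced fragments.
    Only fragments whose midpoint (approximate dyad) falls inside
    [window_start, window_end) are counted. Returns None if there are none.
    """
    dyads = [(s + e) / 2 for s, e in fragments
             if window_start <= (s + e) / 2 < window_end]
    if not dyads:
        return None
    return {
        "occupancy": len(dyads),     # assumed definition: number of midpoints
        "location": mean(dyads),     # assumed definition: mean midpoint
        "fuzziness": pstdev(dyads),  # assumed definition: spread of midpoints
    }

def compare_conditions(peak_a, peak_b):
    """Difference (condition B minus condition A) for each summary value."""
    return {key: peak_b[key] - peak_a[key] for key in peak_a}

# Example: the same window in two hypothetical conditions.
control = summarize_peak([(100, 247), (102, 249), (98, 245)], 150, 200)
treated = summarize_peak([(110, 257), (90, 237)], 150, 200)
print(compare_conditions(control, treated))
```

A larger fuzziness value in one condition indicates that fragment midpoints are more spread out, that is, the nucleosome is less well positioned there.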

Dimnp: identifying differential nucleosome regions in multiple samples. The method is described in (Liu et al., 2017). The main difference from previous methods is the possibility to work with more than two experimental conditions.

DiNuP: A systematic approach to identify regions of differential nucleosome positioning (Fu et al. 2012). DiNuP compares the nucleosome profiles generated by high-throughput sequencing between different conditions. It provides a statistical P-value for each identified differential region and empirically estimates the False Discovery Rate (FDR), which serves as a cutoff to separate differential regions from background noise when two samples have different sequencing depths.

DPNuc: Identifying Nucleosome Positions Based on the Dirichlet Process Mixture Model (Chen et al., 2015). In this method, Markov chain Monte Carlo (MCMC) simulations are employed to determine the mixture model with no need of prior knowledge about nucleosomes. The authors claim that this approach can more reliably detect the size distribution of nucleosomes. Web server address not indicated (?).

iNPS: The authors developed an improved version of the NPS nucleosome peak calling algorithm, which they claim outperforms the original (Chen et al. 2014). Applicable to paired-end sequencing.

MLM: A Multi-Layer Method to analyze microarray nucleosome positioning data. Matlab code is available for download (Di Gesu et al. 2009).

NOrMAL: Accurate nucleosome positioning using a modified Gaussian mixture model. C++ code and executables are provided for download (Polishko et al. 2012). It is a command line tool designed to resolve overlapping nucleosomes and extract extra information (“fuzziness”, probability, etc.) about nucleosome placement. Newer software called PuFFIN, developed by the same authors, is claimed to outperform NOrMAL (see below). Applicable to paired-end sequencing.

NPS: Nucleosome Positioning from Sequencing (Zhang et al. 2008). This is a Python-based nucleosome peak caller, recommended for use together with the software BINOCh from the same group (see above). Applicable to single-end sequencing.

NSeq: a multithreaded Java application for finding positioned nucleosomes from sequencing data (Nellore et al. 2012). NSeq includes a user-friendly graphical interface written in Java. It computes FDRs for candidate nucleosomes from Monte Carlo simulations, plots nucleosome coverage and centers, and exploits the availability of multiple processor cores by parallelizing its computations. NSeq analyzes alignment data in BAM, SAM, or BED format. It assumes that the data are single-end. 
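The idea of estimating an FDR from simulations can be sketched generically. The code below is not NSeq's procedure; it is a minimal empirical FDR estimate under the assumption that each candidate nucleosome has a score and that the same scoring has been applied to several simulated (null) datasets. The function name and arguments are hypothetical.

```python
def empirical_fdr(observed_scores, null_scores_per_sim, threshold):
    """Estimate the FDR among observed candidates scoring >= threshold.

    observed_scores: scores of candidate nucleosomes in the real data.
    null_scores_per_sim: one list of scores per simulated null dataset.
    The estimate is (mean number of null candidates passing the threshold)
    divided by (number of observed candidates passing it), capped at 1.
    """
    n_observed = sum(score >= threshold for score in observed_scores)
    if n_observed == 0:
        return 0.0  # nothing is called, so there are no false discoveries
    null_counts = [sum(score >= threshold for score in sim)
                   for sim in null_scores_per_sim]
    mean_null = sum(null_counts) / len(null_counts)
    return min(1.0, mean_null / n_observed)

# Example with made-up scores: three observed candidates pass the threshold,
# and on average half a candidate passes in the simulated data.
print(empirical_fdr([5, 8, 9, 3], [[2, 6], [1, 3]], threshold=5))
```

In practice the threshold would be scanned to find the lowest score at which the estimated FDR stays below a chosen level.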

NucDe: Mapping nucleosome-linker boundaries (Kuan et al. 2009). This is an R package mapping nucleosome-linker boundaries from both MNase-ChIP-seq and MNase-seq data using a non-homogeneous hidden-state model based on first order differences of experimental data along genomic coordinates. Applicable to single-end sequencing.

NucHunter: Inferring nucleosome positions with their histone mark annotation from ChIP-seq data (Mammana et al. 2013). It uses data from histone ChIP-seq experiments to infer positioned nucleosomes, and can predict positioned nucleosomes from one or multiple BAM files, e.g. taking into account a control experiment. Applicable to paired-end sequencing.

NucleoATAC: A Python package for calling nucleosomes using ATAC-Seq data (Schep et al. 2015). It requires as input sorted, aligned paired-end reads in BAM format, a FASTA file with the genome reference and a sorted BED file with non-overlapping regions for which nucleosome analysis is to be performed. These regions will generally be broad open-chromatin regions. It outputs nucleosome calls and occupancy. Applicable to paired-end sequencing.
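Because the region file is expected to be sorted and non-overlapping, it can help to check it before running an analysis. The following is a minimal sketch in plain Python and is not part of NucleoATAC; the function name `check_bed_regions` is hypothetical, and the sketch assumes a tab-delimited BED file with lines grouped by chromosome and sorted by start coordinate.

```python
import io

def check_bed_regions(handle):
    """Return a list of problems found in a BED stream (empty list = OK).

    Assumes tab-delimited lines starting with chrom, start, end.
    Regions must be grouped by chromosome, sorted by start, non-overlapping.
    """
    problems = []
    seen_chroms = set()
    prev_chrom, prev_start, prev_end = None, -1, -1
    for lineno, line in enumerate(handle, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and common header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            problems.append(f"line {lineno}: fewer than 3 columns")
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if start >= end:
            problems.append(f"line {lineno}: start >= end")
        if chrom != prev_chrom:
            if chrom in seen_chroms:
                problems.append(f"line {lineno}: {chrom} is not contiguous")
            seen_chroms.add(chrom)
        elif start < prev_start:
            problems.append(f"line {lineno}: not sorted by start")
        elif start < prev_end:  # BED intervals are half-open
            problems.append(f"line {lineno}: overlaps previous region")
        prev_chrom, prev_start, prev_end = chrom, start, end
    return problems

# Example: the second region starts before the first one ends.
example = io.StringIO("chr1\t100\t500\nchr1\t400\t900\n")
print(check_bed_regions(example))
```

For a file on disk, the same function can be called as `check_bed_regions(open(path))`, where `path` is the location of the BED file.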

NucleoFinder: A statistical approach for the detection of nucleosome positions (Becker et al. 2013). This is an R package, which addresses both the positional heterogeneity across cells and experimental biases. Applicable to paired-end sequencing.

nucleR: Non-parametric nucleosome positioning. This is an R package included in Bioconductor (Flores and Orozco 2011). It handles both NGS and tiling array experiments. The software is integrated with standard genomics R packages and allows in situ visualization as well as export of results to common genome browser formats. Applicable to paired-end sequencing.

NucMapR: a package for chemical mapping of nucleosome positioning. The algorithm is described in Xi et al., 2014, and was used in Brogaard et al., 2012 and Voong et al., 2016.

NucPosSimulator: Deriving non-overlapping nucleosome configurations from MNase-seq data (Schopflin et al. 2013). It uses a Monte Carlo approach to determine the most probable nucleosome positions from overlapping and ambiguous DNA reads in high-throughput sequencing experiments. In contrast to peak-calling procedures, NucPosSimulator probes many possible solutions and can apply simulated annealing, a heuristic optimization method, to search for an optimal solution to complex positioning problems. Applicable to paired-end sequencing.

NucTools: analysis of chromatin feature occupancy profiles from high-throughput sequencing data. The method and its applications are described in (Vainshtein et al., 2017).

NUCwave: Nucleosome occupancy maps from MNase-seq, ChIP-seq and CC-seq (Quintales et al. 2014). It is a Python package that generates nucleosome occupancy maps from MNase-seq, ChIP-seq and chemical cleavage (CC-seq) data, for both single-end and paired-end reads. It requires input files in Bowtie output format. Applicable to single- and paired-end sequencing.

Perl scripts to analyze MNase-seq experiments (Cole et al. 2012). The authors have listed the code in supplementary materials of their publication, which is useful for other developers.

PING and PING 2.0: Probabilistic inference for nucleosome positioning with MNase-based or sonicated short-read data. An R package for nucleosome peak calling integrated in Bioconductor (Zhang et al. 2012; Woo et al. 2013). The authors say that PING compares favorably to NPS and TemplateFilter in scalability, accuracy and robustness.

PuFFIN: A parameter-free method to build genome-wide nucleosome maps from paired-end sequencing data (Polishko et al. 2014). PuFFIN is a command line tool for accurate placement of nucleosomes based on paired-end reads. It was designed to place non-overlapping nucleosomes using the extra length information present in paired-end datasets. PuFFIN is written in Python and was released in 2014. The authors claim that it outperforms their earlier tool NOrMAL as well as NSeq, NPS and Template Filtering. It returns nucleosome positions, the width of the peak, a confidence score and fuzziness. Applicable to paired-end sequencing.

Python scripts for the study “Cell-free DNA Comprises an In Vivo Nucleosome Footprint that Informs Its Tissues-Of-Origin” (Snyder et al., 2016).

TemplateFilter: Perl source code and executable files for nucleosome positioning data processing (Weiner et al. 2010). Applicable to single-end NGS sequencing.

Tiling array analysis (Yuan et al. 2005). Matlab code, complemented by the MLM and nucleR packages (see above). Applicable to tiling microarray experiments for nucleosome positioning.

Skyline nucleosome browser: a web-based application for the identification of nucleosome peaks over the genome (Belch et al. 2010).


                2) Software to predict preferential nucleosome positions from DNA sequence

The table below contains >20 algorithms that can be ordered by feature; alternatively you can search by keywords


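Description | Web-interface | Local installation | Di-/tri-nucleotide | Periodicity | k-mers | Empirical features
(Where an entry ends with six +/- flags, they correspond, in order, to the six feature columns after Description.)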
ICM Web: ICM Web allows users to assess nucleosome stability and fold any sequence of DNA into a 3D model of chromatin (Sereda & Bishop, 2010; Stolz & Bishop, 2010). The model is displayed in the visual browser JSmol or can be downloaded. ICM takes a DNA sequence and generates (i) a nucleosome energy level diagram, (ii) coarse-grained representations of free DNA and chromatin and (iii) plots of the helical parameters (Tilt, Roll, Twist, Shift, Slide and Rise) as a function of position. +----+
FineStr: Single-base-resolution nucleosome mapping server (Gabdank et al, 2010; Trifonov, 2010). The analysis is performed using the probe based on the 117-bp DNA bendability matrix derived from C. elegans. The authors suggested the universality of this pattern for other species. +-+++-
iNuc-PhysChem: Identifying nucleosomal or linker sequences from physicochemical properties (Chen et al, 2012). The algorithm identifies nucleosomal sequences by incorporating twelve physicochemical properties defined elsewhere, such as A-philicity, base stacking, B-DNA twist, bendability, bending stiffness, DNA denaturation energy, Z-DNA potential. The model was trained on data from H. sapiens, C. elegans and D. melanogaster. +++-++
iNuc-PseKNC: A sequence-based predictor for nucleosome positioning in genomes with pseudo k-tuple nucleotide composition (Guo et al, 2014). This is another software package from the developers of iNuc-PhysChem. Here, the samples of DNA sequences were formulated using six basic DNA local structural properties trained on datasets from H. sapiens, C. elegans and D. melanogaster. +-+-++
LeNup: Learning Nucleosome positioning from DNA sequences with improved convolutional neural networks. LeNup is an open-source Python package that uses convolutional neural networks to predict nucleosome positioning in the H. sapiens, C. elegans, D. melanogaster and S. cerevisiae genomes, trained on benchmark datasets. -+--+-
Mapping_CC: Displays the nucleosome predictions based on the DNA dinucleotide correlation pattern. This algorithm was initially associated with one of the first high-throughput genome-wide nucleosome maps in yeast (Ioshikhes et al, 2006). An updated version is available at -+++--
MOSAICS: Methodologies for Optimization and Sampling in Computational Studies (Minary & Levitt, 2014; Krawczyk, 2018). Perl scripts and a precompiled package to perform training-free atomistic prediction of nucleosome occupancy based on all-atom force field calculations. The effect of DNA methylation can be taken into account. -++--+
NucEnerGen: Nucleosome energetics predictions based on high throughput sequencing (Locke et al, 2010). It utilizes dynamic programming to calculate allowed nucleosome configurations and the Percus equation to infer sequence-dependent energies from the experimental occupancy profiles. -++++-
nuMap: A web application implementing the YR and W/S schemes to predict nucleosome positioning (Alharbi et al, 2014). The methodology is based on the sequence-dependent anisotropic bending, which dictates how DNA is wrapped around a histone octamer. This application allows users to specify a number of options such as schemes and parameters for threading calculation and provides multiple layout formats. +-++-+
NuPoP: Nucleosome Positioning Prediction Engine (Wang et al, 2008; Xi et al, 2010). NuPoP is built upon a duration hidden Markov model, in which the linker DNA length is explicitly modeled. NuPoP outputs the Viterbi prediction, nucleosome occupancy score (from backward and forward algorithms) and nucleosome affinity score. NuPoP is available in three formats: a web server prediction engine, a stand-alone Fortran program, and an R package. The latter two can predict nucleosome positioning for a DNA sequence of any length.
Nu-OSCAR: Nucleosome-Occupancy Study for Cis-elements Accurate Recognition. It identifies binding sites of known transcription factors while also taking into account nucleosome occupancy around these sites in promoter regions. The derivation of the algorithm is based on a biophysical view of interactions between protein factors and nucleosomal DNA.
nuScore: A nucleosome-positioning score calculator based on the DNA curvature properties (Tolstorukov et al, 2008). This software allows an important type of analysis, where a user enters many sequences to calculate the average nucleosome energy profile.
N-score: MATLAB and Python codes using a wavelet analysis based model for predicting nucleosome positions from DNA sequence (Yuan & Liu, 2008).
NXSensor: Prediction of nucleosome-excluding sequences based on DNA bending properties (Luykx et al, 2006). It takes as input DNA sequences in FASTA format, and outputs nucleosome-excluding or nucleosome-favouring segments.
Segal Lab nucleosome positioning prediction (Field et al, 2008; Kaplan et al, 2009; Segal et al, 2006). This is one of the most popular tools in this class, realized as a web server (allows analyzing a limited number of DNA sequences), and a stand-alone application which can be installed on a local cluster. It allows calculating nucleosome occupancy or nucleosome start site probability profiles of non-overlapping nucleosomes; alternatively, it is possible to calculate the net nucleosome formation energy profile. It uses machine learning for energy assignment based on the training datasets and dynamic programming to sample nucleosome configurations (similar to NucEnerGen, NuPoP and the algorithm of van Noort and co-authors).
Phase: A web server for prediction of the nucleosome formation probability based on (i) the 10-11 bp periodicities of dinucleotides and (ii) the typical pattern "linker-nucleosome-linker" defined by the authors (Levitsky et al, 2014).
RECON: A web server for prediction of the nucleosome formation potential learned from dinucleotide frequencies distribution for nucleosome positioning sequences (Levitsky, 2004; Levitsky et al, 1999).
SymCurv: A program for nucleosome positioning prediction (Nikolaou et al, 2010). It calculates the curvature of the DNA sequence and uses a greedy algorithm to parse the sequence into nucleosome-bound and nucleosome-free segments.
Schiessel Lab nucleosome positioning prediction. This resource contains software packages based on two types of algorithms: nucleosome mutation Monte Carlo (Eslami-Mossallam et al, 2016) and nucleosome positioning with Markov chains (Tompitak et al, 2017). The latter combines the "mutation Monte Carlo" method with dynamic programming similar to NucEnerGen, NuPoP and the algorithm of Segal and co-authors mentioned above.
Trifonov’s strong nucleosomes: Based on the discovery of strong nucleosome positioning sequences, which are visible as regular arrays in genomic sequence (Nibhani & Trifonov, 2015; Salih et al, 2015), the program from Trifonov’s lab finds a specific class of strongly positioned nucleosomes of the RR/YY and TA periodic types. +-++--
van Noort Lab nucleosome positioning prediction (van der Heijden et al, 2012). This algorithm is based on dinucleotide distributions, but unlike other methods based on dinucleotide distributions it does not use machine learning and accounts only for the dinucleotide periodicity. In addition, this method uses dynamic programming to account for size exclusion and the Percus equation to assign nucleosome affinities (similar to NucEnerGen, NuPoP and the algorithms of Segal and co-authors and Schiessel and co-authors mentioned above).
G-Dash: A Genome Dashboard Integrating Modeling and Informatics. G-Dash unites the Interactive Chromatin Modeling (ICM) tools with the Biodalliance genome browser and the JSMol molecular viewer to rapidly fold any DNA sequence into atomic or coarse-grained models of DNA, nucleosomes or chromatin. As a chromatin modeling tool, G-Dash enables users to specify nucleosome positions from various experimental or theoretical sources, interactively manipulate nucleosomes, and assign different conformational states to each nucleosome. As an informatics tool, it displays data associated with 3D structures as tracks in a genome browser. Described in Li et al., 2018.
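Several entries in the table above (NucEnerGen, the Segal Lab and van Noort Lab tools) use dynamic programming to account for steric exclusion between nucleosomes. The following is a minimal sketch of that general idea, not the code of any listed tool. It assumes a per-position energy `energies[i]` (in units of kT) for a nucleosome starting at base pair i, a fixed footprint `width`, and a chemical potential `mu`; all names are hypothetical. For long sequences a real implementation would work in log space to avoid overflow.

```python
import math

def nucleosome_occupancy(energies, width=147, mu=0.0):
    """Start-site probabilities and occupancy for non-overlapping nucleosomes.

    energies[i]: energy (in kT) of a nucleosome covering bp i .. i+width-1.
    Returns (start_prob, occupancy), two lists of length len(energies).
    """
    n = len(energies)
    # Boltzmann weight of a nucleosome starting at i (zero if it would not fit).
    weight = [math.exp(mu - e) if i + width <= n else 0.0
              for i, e in enumerate(energies)]

    # fwd[i]: partition function of the first i base pairs.
    fwd = [1.0] * (n + 1)
    for i in range(1, n + 1):
        fwd[i] = fwd[i - 1]                      # bp i-1 is free
        if i >= width:                           # or a nucleosome ends at bp i-1
            fwd[i] += fwd[i - width] * weight[i - width]

    # bwd[i]: partition function of base pairs i .. n-1.
    bwd = [1.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        bwd[i] = bwd[i + 1]                      # bp i is free
        if i + width <= n:                       # or a nucleosome starts at bp i
            bwd[i] += weight[i] * bwd[i + width]

    z = fwd[n]                                   # total partition function
    start_prob = [fwd[i] * weight[i] * bwd[i + width] / z if i + width <= n
                  else 0.0 for i in range(n)]

    # Occupancy of bp j: sum of start probabilities over the last `width` starts.
    occupancy, running = [], 0.0
    for j in range(n):
        running += start_prob[j]
        if j >= width:
            running -= start_prob[j - width]
        occupancy.append(running)
    return start_prob, occupancy
```

With a flat energy landscape of three base pairs and a footprint of two, the three allowed configurations (empty, start at 0, start at 1) are equally likely, so `nucleosome_occupancy([0.0, 0.0, 0.0], width=2)` gives start probabilities of 1/3 at positions 0 and 1.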


 4) Experimental protocols for high-throughput nucleosome positioning experiments

“Chromatin Remodeling: Methods and Protocols”, Springer Protocols, 2012.

This special issue includes updated protocols from many leading labs in the field.

Protocols for Chromatin Assembly and Analysis from

This collection contains ~30 protocols from different publications.

Cuddapah et al. (2009) Cold Spring Harb. Protoc.

This protocol includes purification of human CD4+ T cells from lymphocytes and chromatin fragmentation using micrococcal nuclease (MNase) digestion, followed by chromatin immunoprecipitation (ChIP) and construction of a library for Illumina/Solexa sequencing.

Segal et al. (2006) Nature 442, 772-778

This is the supplementary materials file from the paper of Segal et al., which describes their methods.

Erbay Yigit (2008) Digestion of HeLa Nuclei by Micrococcal Nuclease (MNase)

A user-friendly protocol, understandable for a beginner

Nucleosome mapping with MNase, Tsukiyama Lab

Nucleosome spacing assay, Tsukiyama Lab
