The Institute for Systems Biology RMBlast Download

Overview
RMBlast is a RepeatMasker compatible version of the standard NCBI blastn program. The primary difference between this distribution and the NCBI distribution is the addition of a new program "rmblastn" for use with RepeatMasker and RepeatModeler.

RMBlast supports RepeatMasker searches by adding a few necessary features to the stock NCBI blastn program. These include:

  • Support for custom matrices (without Karlin-Altschul (KA) statistics).
  • Support for cross_match-like complexity-adjusted scoring. Cross_match is Phil Green's seeded Smith-Waterman search algorithm.
  • Support for cross_match-like masklevel filtering.

Installation

We provide pre-built binaries of RMBlast as a convenience to save users the expense of compiling RMBlast from source themselves. Although we include the same binaries as in NCBI's builds, they are not intended to be used as a general-purpose BLAST+ distribution.

At this time our macOS binaries are neither signed nor notarized. The first time you run each program in the package, you may need to "Allow" it in "System Settings -> Security and Privacy", then confirm that you want to "Open" it. For more details see https://support.apple.com/en-us/HT202491 (scroll down to "How to open an app that hasn't been notarized or is from an unidentified developer"). Alternatively, you can compile the programs from source, or download the binaries with a program (e.g. 'curl') that does not yet implement this security feature.

We do not build binaries for all platforms. If your platform is unsupported or the binaries do not work on your machine, please compile RMBlast from source instead (instructions below).

Latest Release: 2.14.1

To use the pre-built binaries:

  1. Download Pre-compiled Package:
    Download the RMBlast package for your platform:
  2. Unpack RMBlast Distribution:
    Unpack the distribution in a place that will be accessible to RepeatMasker (e.g. /usr/local).
    • cd /usr/local
    • tar zxvf rmblast-2.14.1-x64-linux.tar.gz
  3. Configure RepeatMasker/RepeatModeler:
    To use the new search engine with RepeatMasker or RepeatModeler, run/re-run the configure program in the RepeatMasker directory and the configure program in the RepeatModeler directory.
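
As a rough sketch of the unpack step above, the helper below extracts a downloaded tarball under a chosen prefix and prints the path of the rmblastn executable it finds. The function name unpack_rmblast is hypothetical (it is not part of the RMBlast distribution), and the archive's internal directory layout is not assumed; the helper simply searches for rmblastn after extraction.

```shell
# Hypothetical helper, not part of the RMBlast distribution.
# Usage: unpack_rmblast TARBALL [PREFIX]
unpack_rmblast() {
    tarball=$1
    prefix=${2:-/usr/local}
    mkdir -p "$prefix" || return 1
    # Extract the gzip-compressed archive into PREFIX.
    tar -xzf "$tarball" -C "$prefix" || return 1
    # The archive layout is not assumed: search for the rmblastn executable.
    find "$prefix" -type f -name rmblastn | head -n 1
}
```

For example, `unpack_rmblast rmblast-2.14.1-x64-linux.tar.gz /usr/local` would unpack the Linux package named in step 2. Writing to /usr/local typically requires root privileges.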

To compile from source:

  1. Download NCBI Blast+ and rmblast patch file:
    ncbi-blast-2.14.1+-src.tar.gz
    isb-2.14.1+-rmblast.patch.gz
  2. Install Dependencies:
    You will need a C++ compiler and essential build tools such as make and autotools.
    • Debian and derivatives (Ubuntu, Mint, etc.): apt install build-essential
    • Fedora: yum groupinstall "Development Tools"
    Other missing dependencies might be detected by the configure step later, depending on your operating system.
  3. Unpack Distribution:
    Unpack the distribution in your home directory or in a temporary location (e.g. /tmp).
    • cd /mytmp/location/
    • tar zxvf ncbi-blast-2.14.1+-src.tar.gz
    • gunzip isb-2.14.1+-rmblast.patch.gz
  4. Patch:
    • cd ncbi-blast-2.14.1+-src
    • patch -p1 < ../isb-2.14.1+-rmblast.patch
  5. Build:
    To compile the programs for installation in /usr/local/rmblast run:
    • cd c++ (from the ncbi-blast-2.14.1+-src directory entered in step 4)
    • ./configure --with-mt --without-debug --without-krb5 --without-openssl --with-projects=scripts/projects/rmblastn/project.lst --prefix=/usr/local/rmblast
    • make
      • make -j can be used to parallelize the build on multiprocessor systems, e.g. make -j2 to dedicate two cores to the build process.
    • make install
  6. Configure RepeatMasker/RepeatModeler:
    To use the new search engine with RepeatMasker or RepeatModeler, run/re-run the configure program in the RepeatMasker directory and the configure program in the RepeatModeler directory.
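
The unpack, patch, and build steps above can be collected into one reviewable command list. The sketch below only prints the commands for a given version, install prefix, and job count; it runs nothing. The function name rmblast_build_plan is hypothetical, the default job count comes from getconf _NPROCESSORS_ONLN (falling back to 1), and the commands assume they are run in order from the directory that holds the two downloaded files.

```shell
# Hypothetical helper: print (do not run) the source-build commands.
# Usage: rmblast_build_plan [VERSION] [PREFIX] [JOBS]
rmblast_build_plan() {
    version=${1:-2.14.1}
    prefix=${2:-/usr/local/rmblast}
    # Default to the number of online processors, or 1 if that is unavailable.
    njobs=${3:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}
    cat <<EOF
tar zxvf ncbi-blast-${version}+-src.tar.gz
gunzip isb-${version}+-rmblast.patch.gz
cd ncbi-blast-${version}+-src
patch -p1 < ../isb-${version}+-rmblast.patch
cd c++
./configure --with-mt --without-debug --without-krb5 --without-openssl --with-projects=scripts/projects/rmblastn/project.lst --prefix=${prefix}
make -j${njobs}
make install
EOF
}

# Print the plan for the latest release, using two build jobs.
rmblast_build_plan 2.14.1 /usr/local/rmblast 2
```

One way to use it is to save the output to a file, review it, and then run it with `sh -e` so the build stops at the first failing command. Passing 2.14.0 as the first argument prints the equivalent commands for the previous release.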

Previous Release: 2.14.0

To use the pre-built binaries:

  1. Download Pre-compiled Package:
    Download the RMBlast package for your platform:
  2. Unpack RMBlast Distribution:
    Unpack the distribution in a place that will be accessible to RepeatMasker (e.g. /usr/local).
    • cd /usr/local
    • tar zxvf rmblast-2.14.0-x64-linux.tar.gz
  3. Configure RepeatMasker/RepeatModeler:
    To use the new search engine with RepeatMasker or RepeatModeler, run/re-run the configure program in the RepeatMasker directory and the configure program in the RepeatModeler directory.

To compile from source:

  1. Download NCBI Blast+ and rmblast patch file:
    ncbi-blast-2.14.0+-src.tar.gz
    isb-2.14.0+-rmblast.patch.gz
  2. Install Dependencies:
    You will need a C++ compiler and essential build tools such as make and autotools.
    • Debian and derivatives (Ubuntu, Mint, etc.): apt install build-essential
    • Fedora: yum groupinstall "Development Tools"
    Other missing dependencies might be detected by the configure step later, depending on your operating system.
  3. Unpack Distribution:
    Unpack the distribution in your home directory or in a temporary location (e.g. /tmp).
    • cd /mytmp/location/
    • tar zxvf ncbi-blast-2.14.0+-src.tar.gz
    • gunzip isb-2.14.0+-rmblast.patch.gz
  4. Patch:
    • cd ncbi-blast-2.14.0+-src
    • patch -p1 < ../isb-2.14.0+-rmblast.patch
  5. Build:
    To compile the programs for installation in /usr/local/rmblast run:
    • cd c++ (from the ncbi-blast-2.14.0+-src directory entered in step 4)
    • ./configure --with-mt --without-debug --without-krb5 --without-openssl --with-projects=scripts/projects/rmblastn/project.lst --prefix=/usr/local/rmblast
    • make
      • make -j can be used to parallelize the build on multiprocessor systems, e.g. make -j2 to dedicate two cores to the build process.
    • make install
  6. Configure RepeatMasker/RepeatModeler:
    To use the new search engine with RepeatMasker or RepeatModeler, run/re-run the configure program in the RepeatMasker directory and the configure program in the RepeatModeler directory.

RMBlast Usage Statistics Reporting

Starting with NCBI BLAST+ version 2.11.0, the standalone command-line tools report usage statistics back to NCBI including program version, values of some command-line parameters, and the sizes of databases and queries. Sequence data is not included among these statistics. This data is used for quality monitoring and to gauge the usage of BLAST+ executables by the community. More information about this system and an example of collected data is available at https://www.ncbi.nlm.nih.gov/books/NBK563686/.

Because RMBlast is primarily intended for use with RepeatMasker and RepeatModeler, we make changes to usage reporting in both our source/patch and binary distributions:

  • Usage statistics are sent to the repeatmasker.org web server, instead of to NCBI.
  • The programs do not collect or submit a database name ("db_name"). Names of sequence files used with RepeatMasker and RepeatModeler are usually chosen by users or automatically generated and are neither necessary nor useful for us to collect.

Similarly to NCBI, we use these statistics for the purpose of quality monitoring and to gauge the popularity of RMBlast.

If you wish to opt out of this usage reporting for any reason, you can set the environment variable BLAST_USAGE_REPORT=false. On Unix-like systems, this can be done with the "export" or "setenv" commands depending on your shell. For more details and information on other methods to opt out, see NCBI's documentation of this feature at https://www.ncbi.nlm.nih.gov/books/NBK563686/.
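
A minimal example of the opt-out described above, for Bourne-style shells. The csh/tcsh form is shown only as a comment because it uses different syntax.

```shell
# Bourne-style shells (sh, bash, zsh): applies to every program started
# from this shell session, including rmblastn.
export BLAST_USAGE_REPORT=false

# csh/tcsh equivalent (a different shell, so shown as a comment):
#   setenv BLAST_USAGE_REPORT false

# Confirm the value that child processes will inherit.
sh -c 'echo "BLAST_USAGE_REPORT=$BLAST_USAGE_REPORT"'
```

To make the setting persistent, the export line can be added to your shell's startup file; which file that is depends on the shell you use.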

 Arian Smit - Institute for Systems Biology
 Robert Hubley - Institute for Systems Biology
 Jeb Rosen - Institute for Systems Biology
 
 National Center for Biotechnology Information ( NCBI )
 
 And special thanks to:
   Tom Madden,
   Christiam Camacho,
   George Coulouris,
   Denis Vakatov,
   Aaron Ucko,
   Ning Ma
 from NCBI for allowing me to bug them with questions about
 the BLAST source code, assistance with finding resources,
 and lending an ear to a tired programmer.

 This work is funded by NHGRI grant # RO1 HG002939

Release Notes
RMBlast 2.14.1
  • Enabled the blastn option "db_soft_mask" in rmblastn.
RMBlast 2.14.0
  • Masklevel filtering of results generated by multi-threaded executions did not always return exactly the same alignments. If multiple equivalent alignments are possible between the query and subject, the previous version of RMBlast could filter different (but score-equivalent) results depending on the order in which the threads completed. This version corrects the issue by applying a secondary sort on subject ID to equivalent-scoring alignments.
  • Updated to current NCBI release.
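
The idea behind the secondary sort in the 2.14.0 note can be illustrated with the standard sort utility. This is only an illustration of tie-breaking, not RMBlast's implementation: the "score subject_id" line format, the sample values, and the function name order_alignments are all made up for the example.

```shell
# Illustration only: each line is "score subject_id".
# Sort by score (numeric, descending), then break ties on subject ID so the
# result does not depend on the order in which lines arrive.
order_alignments() {
    LC_ALL=C sort -k1,1nr -k2,2
}

# Two arrival orders, as might result from threads finishing at different times.
printf '%s\n' '250 seqB' '300 seqC' '250 seqA' | order_alignments
printf '%s\n' '250 seqA' '250 seqB' '300 seqC' | order_alignments
```

Both invocations print the lines in the same order, because the tie between the two score-250 entries is always resolved by subject ID.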
RMBlast 2.13.0
  • Updated to current NCBI release.
  • This release introduced query-based threading, which greatly improves the speed of our RepeatModeler program.
  • The rmblastn program now calculates the raw and adjusted versions of the Kimura divergence stats for our RepeatMasker/RepeatModeler alignments.
RMBlast 2.11.0
  • Updated to current NCBI release.
  • This release introduced opt-out reporting of usage statistics, which in the RMBlast distributions are instead sent to www.repeatmasker.org.
RMBlast 2.10.0
  • Updated to current NCBI release.
RMBlast 2.9.0+-p2
  • Fix directory layout and include additional binaries also provided in NCBI's releases.
RMBlast 2.9.0+-p1
  • Fix a bug that would occasionally cause a crash or garbled alignments.
RMBlast 2.9.0+
  • Updated to current NCBI release.
  • Provide pre-compiled RMBlast and BLAST+ binaries for RepeatMasker and RepeatModeler.
RMBlast 2.6.0+
  • Updated to current NCBI release.
RMBlast 2.2.27+
  • First integrated release of NCBI BLAST+ toolkit and RMBlast.
RMBlast [1.2] NCBI Blast 2.2.23+
  • First release of RMBlast.
RMBlast is licensed under the Open Source License v2.1.
Institute for Systems Biology
This server is made possible by funding from the National Human Genome Research Institute (NHGRI grant # RO1 HG002939).