The Institute for Systems Biology RepeatMasker Web Server

RepeatMasker screens DNA sequences in FASTA format against the Repbase-derived RepeatMasker library of repetitive elements or against the Dfam database and returns a masked query sequence ready for database searches. RepeatMasker also generates a table annotating the masked regions.
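As a rough illustration of what "masked" means here, the sketch below applies a list of annotated intervals to a query sequence. It assumes hard masking replaces bases with "N" and soft masking lowercases them. The function name, the 0-based half-open coordinates, and the example interval are hypothetical and are not taken from RepeatMasker itself.

```python
def mask_sequence(seq, intervals, soft=False):
    """Mask 0-based, half-open (start, end) intervals in seq.

    Assumption: hard masking writes 'N'; soft masking lowercases the bases.
    """
    chars = list(seq)
    for start, end in intervals:
        # Clamp each interval to the bounds of the sequence.
        start = max(start, 0)
        end = min(end, len(chars))
        for i in range(start, end):
            chars[i] = chars[i].lower() if soft else "N"
    return "".join(chars)


query = "ACGTACGTAAAAAAAAGGCC"
repeats = [(8, 16)]  # hypothetical annotation covering the poly-A run
hard_masked = mask_sequence(query, repeats)
soft_masked = mask_sequence(query, repeats, soft=True)
```

With these inputs, hard_masked keeps the flanking bases and replaces the eight-base run with N characters, while soft_masked lowercases the same run.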

Reference: A.F.A. Smit, R. Hubley & P. Green, unpublished data. Current Version: open-4.0.9 ( Dfam: 3.0 only* )

PLEASE READ FIRST: As of May 20, 2019, GIRI has rescinded the working agreement that allowed this service to use the RepBase RepeatMasker edition. At this time we can only offer masking with the open Dfam 3.0 database, which now includes consensus sequences in addition to profile hidden Markov models for many transposable element families. For the time being, users who require RepBase will need to purchase a commercial or academic license and run RepeatMasker locally. We are working to expand the Dfam database and invite you to visit Dfam ( http://www.dfam.org ) for more information.

Check Current Queue Status


Basic Options

Sequence:
Select a sequence file to process or paste the sequence(s) in FASTA format. Large sequences will be queued and may take a while to process.
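Before submitting, it can help to confirm that the input really is FASTA. The following is a minimal sketch of a parser, assuming the common convention that each record starts with a ">" header line followed by one or more sequence lines. The function name parse_fasta is hypothetical, and this is not the validation the server itself performs.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        else:
            if header is None:
                raise ValueError("sequence data found before any '>' header line")
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


records = parse_fasta(">seq1 example\nACGT\nTTGA\n>seq2\nGGCC\n")
```

Multi-line sequences are joined into a single string per record, so the length of each sequence can be checked before deciding whether to expect queueing.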
Search Engine: rmblast hmmer cross_match abblast

Select the search engine to use when searching the sequence. Cross_match is slower but often more sensitive than the other engines. ABBlast ( formerly known as WUBlast ) is very fast at a slight cost in sensitivity. RMBlast is a RepeatMasker-compatible version of the NCBI BLAST tool suite. HMMER uses the new nhmmer program to search sequences against the new Dfam database ( human only ).
Speed/Sensitivity: rush quick default slow

Select the sensitivity of your search. The more sensitive the search, the longer the processing time.
DNA source:
Select a species from the drop-down box, or select "Other.." and enter a species name in the text box. Try the protein-based RepeatMasker if the repeat database for your species is small.
Return Format: html tar file
Select the format for the results of your search. The "tar" option will return the results as a compressed archive file, and "html" will present the results as a summary web page with links to the individual data files.
Return Method: html email
The "html" return method will run RepeatMasker on your sequence and return the results immediately to your web browser, provided your sequences are short. The "email" return method will email you when your results are ready.
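For users running RepeatMasker locally, the basic options above correspond roughly to command-line flags. The sketch below only builds an argument list and does not run anything. The flag names ( -engine, -species, -s, -q, -qq ) and the engine spelling "crossmatch" are assumptions based on the RepeatMasker command-line help as commonly documented, so check them against the output of your installed version's help before relying on them. The function and dictionary names are hypothetical.

```python
# Assumed mapping from the web form's speed choices to command-line flags.
SPEED_FLAGS = {"rush": "-qq", "quick": "-q", "default": None, "slow": "-s"}

# Assumed mapping from the web form's engine names to -engine values.
ENGINE_NAMES = {
    "rmblast": "rmblast",
    "hmmer": "hmmer",
    "cross_match": "crossmatch",
    "abblast": "abblast",
}


def build_command(fasta_path, species, engine="rmblast", speed="default"):
    """Build an argument list for a local RepeatMasker run (not executed here)."""
    cmd = ["RepeatMasker", "-engine", ENGINE_NAMES[engine]]
    speed_flag = SPEED_FLAGS[speed]
    if speed_flag is not None:
        cmd.append(speed_flag)
    cmd += ["-species", species, fasta_path]
    return cmd


command = build_command("query.fa", "human", engine="cross_match", speed="slow")
```

Returning a list rather than a single string keeps the arguments safe to pass to subprocess.run without shell quoting.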

Lineage Annotation Options

If your query sequence is mammalian, RepeatMasker can determine if a repeat instance is expected to be present in one or more other mammalian species. This information can be used to annotate the RepeatMasker output or control the masking process.

Comparison Species: Annotate lineage specific repeats in your output with respect to this comparison species.
Lineage Specific Masking: Strong Weak
Do not mask satellites and simple repeats
Mask repeats not found in the first comparison species if the evidence is "strong" or "weak". If masking is selected, you may also elect to exclude satellites and simple repeats from being masked.
Additional Comparison Species: Select an additional species for lineage specific comparison.

Advanced Options

Alignment Options: Select how you would like alignments displayed.
Masking Options: Select how you would like your sequence masked.
Contamination Check: Check for contamination in your sequence.
Repeat Options: Select the types of repeats you would like to mask.
Artifact Check: Check for bacterial insertion elements within your sequence before masking interspersed repeats.
Matrix: Select a specific GC level for your sequence.
Divergence Cutoff: Only mask repeats that are less divergent from the consensus than a specific percentage.
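The divergence cutoff can be pictured as a simple filter over annotated repeats. The sketch below assumes each hit carries a percent divergence from its consensus and keeps only hits strictly below the cutoff. The RepeatHit record, its field names, and the example values are hypothetical and do not reflect RepeatMasker's internal data structures.

```python
from dataclasses import dataclass


@dataclass
class RepeatHit:
    name: str          # repeat family name (hypothetical example values below)
    start: int         # start position in the query
    end: int           # end position in the query
    divergence: float  # percent divergence from the consensus


def filter_by_divergence(hits, cutoff):
    """Keep hits that are less divergent from the consensus than cutoff percent."""
    return [hit for hit in hits if hit.divergence < cutoff]


hits = [
    RepeatHit("AluY", 100, 400, 4.2),
    RepeatHit("L2", 900, 1200, 27.5),
]
young_repeats = filter_by_divergence(hits, cutoff=10.0)
```

Here only the first hit passes a 10 percent cutoff, so only that region would be masked under this reading of the option.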


RepeatMasker "open-3.0" is licensed under the Open Source License v2.1.
To obtain a copy of the software to run locally ( on UNIX systems ) go here.

RepeatMasker written and supported by: Arian Smit & Robert Hubley

If you would like to mirror this site please send an email to the web site manager (webmaster@repeatmasker.systemsbiology.org).


Institute for Systems Biology
This server is made possible by funding from the National Human Genome Research Institute (NHGRI grant # R01 HG002939).