This list of protein subcellular localization prediction tools includes software, databases, and web services that are used for protein subcellular localization prediction.

Some of the included tools are commonly used to infer location from predicted structural properties, such as signal peptides or transmembrane helices; these tools output predictions of those features rather than specific locations. Software related to protein structure prediction may also appear in lists of protein structure prediction software.


  • Descriptions sourced from the entry in the registry (used under a CC-BY license) are indicated by an "entry" link.
| Name | Description | References | Year |
|---|---|---|---|
| AAIndexLoc | Machine-learning-based algorithm that uses the amino acid index to predict protein subcellular localization based on its sequence. (entry) | [1] | 2008 |
| APSLAP | Prediction of apoptosis protein subcellular localization. | [2] | 2013 |
| AtSubP | A highly accurate subcellular localization prediction tool for annotating the Arabidopsis thaliana proteome. (entry) | [3] | 2010 |
| BaCelLo | BaCelLo is a predictor for the subcellular localization of proteins in eukaryotes. (entry) | [4] | 2006 |
| BAR+ | BAR+ is a server for the structural and functional annotation of protein sequences. (entry) | [5] | 2011 |
| BAR | BAR 3.0 is a server for the annotation of protein sequences relying on a comparative large-scale analysis of the entire UniProt. Given a sequence, BAR 3.0 annotates, when possible, function (Gene Ontology), structure (Protein Data Bank) and protein domains (Pfam). If the sequence falls into a cluster with one or more structural templates, the server provides an alignment to the template(s) based on the cluster HMM (HMM profile), from which a 3D model can be computed directly. Cluster HMMs are available for download. (entry) | [6][5] | 2017 |
| BASys | BASys (Bacterial Annotation System) is a tool for automated annotation of bacterial genomic (chromosomal and plasmid) sequences including gene/protein names, GO functions, COG functions, possible paralogues and orthologues, molecular weights, isoelectric points, operon structures, subcellular localization, signal peptides, transmembrane regions, secondary structures, 3-D structures, reactions, and pathways. (entry) | [7] | 2005 |
| BOMP | The beta-barrel Outer Membrane protein Predictor (BOMP) takes one or more FASTA-formatted polypeptide sequences from Gram-negative bacteria as input and predicts whether or not they are beta-barrel integral outer membrane proteins. (entry) | [8] | 2004 |
| BPROMPT | Bayesian PRediction Of Membrane Protein Topology (BPROMPT) uses a Bayesian Belief Network to combine the results of other membrane protein prediction methods for a protein sequence. (entry) | [9] | 2003 |
| BUSCA | BUSCA (Bologna Unified Subcellular Component Annotator) is a web server for predicting protein subcellular localization. BUSCA integrates different tools to predict localization-related protein features as well as tools for discriminating subcellular localization of both globular and membrane proteins. (entry) | [10] | 2018 |
| Cell-PLoc | A package of web servers for predicting subcellular localization of proteins in various organisms. | [11] | 2008 |
| CELLO | CELLO uses a two-level Support Vector Machine system to assign localizations to both prokaryotic and eukaryotic proteins. | [12][13] | 2006 |
| ClubSub-P | ClubSub-P is a database of cluster-based subcellular localization (SCL) predictions for Archaea and Gram-negative bacteria. | [14] | 2011 |
| CoBaltDB | CoBaltDB is a platform that provides easy access to the results of multiple localization tools and support for predicting prokaryotic protein localizations. | [15] | 2010 |
| ComiR | ComiR is a web tool for combinatorial microRNA (miRNA) target prediction. Given a messenger RNA (mRNA) in the human, mouse, fly or worm genome, ComiR predicts whether that mRNA is targeted by a set of miRNAs. (entry) | [16] | 2013 |
| cropPAL | A data portal to access the compendium of data on crop protein subcellular locations. (entry) | [17] | 2016 |
| DAS-TMfilter | DAS (Dense Alignment Surface) is based on low-stringency dot-plots of the query sequence against a set of library sequences (non-homologous membrane proteins) using a previously derived, special scoring matrix. The method provides a high-precision hydrophobicity profile for the query, from which the location of the potential transmembrane segments can be obtained. The novelty of the DAS-TMfilter algorithm is a second prediction cycle to predict TM segments in the sequences of the TM-library. (entry) | [18] | 2002 |
| DeepLoc | Prediction of eukaryotic protein subcellular localization using deep learning. (entry) | [19] | 2017 |
| Light Attention | Deep learning architecture for predicting eukaryotic subcellular localization, with a web server that predicts 10 locations for an arbitrary number of sequences, which can be uploaded as .fasta files or copy-pasted. (entry) | [20] | 2021 |
| DIANA-microT v5.0 | Web server which predicts targets for miRNAs and provides functional information on the predicted miRNA:target gene interaction from various online biological resources. Updates enable the association of miRNAs to diseases through bibliographic analysis and connection to the UCSC genome browser. Updates include sophisticated workflows. (entry) | [21][22] | 2013 |
| DrugBank | DrugBank is a bioinformatics/cheminformatics resource that combines detailed drug (i.e. chemical) data with comprehensive drug target (i.e. protein) information. The database contains >4100 drug entries including >800 FDA approved small molecule and biotech drugs as well as >3200 experimental drugs. Additionally, >14,000 protein or drug target sequences are linked to these drug entries. (entry) | [23] | 2006 |
| E.Coli Index | Comprehensive guide to information relating to E. coli; home of Echobase, a database of E. coli genes characterized since the completion of the genome. (entry) | [24] | 2009 |
| ePlant | A suite of open-source web-based tools for the visualization of large-scale data sets from the model organism Arabidopsis thaliana. It can be applied to any model organism. Its modules include a sequence conservation explorer that includes homology relationships and single nucleotide polymorphism data, a protein structure model explorer, a molecular interaction network explorer, a gene product subcellular localization explorer, and a gene expression pattern explorer. (entry) | [25] | 2011 |
| ESLpred | ESLpred is a tool for predicting subcellular localization of proteins using support vector machines. The predictions are based on dipeptide and amino acid composition, and physico-chemical properties. (entry) | [26] | 2004 |
| Euk-mPLoc 2.0 | Predicting the subcellular localization of eukaryotic proteins with both single and multiple sites. | [27] | 2010 |
| HIT | A comprehensive and fully curated database for Herb Ingredients' Targets (HIT). Herbal ingredients with protein target information were carefully curated. The molecular target information covers proteins that are directly or indirectly activated or inhibited, protein binders, and enzymes whose substrates or products are those compounds. Genes that are up- or down-regulated under treatment with individual ingredients are also included. In addition, the experimental condition, observed bioactivity and various references are provided for the user's reference. The database can be queried via keyword search or similarity search. Crosslinks have been made to TTD, DrugBank, KEGG, PDB, Uniprot, Pfam, NCBI, TCM-ID and other databases. (entry) | [28] | 2011 |
| HMMTOP | Prediction of transmembrane helices and topology of proteins. (entry) | [29][30] | 2001 |
| HSLpred | Predicts the subcellular localization of human proteins, based on various types of residue composition of proteins, using an SVM technique. (entry) | [31] | 2005 |
| idTarget | idTarget is a web server for identifying biomolecular targets of small chemical molecules with robust scoring functions and a divide-and-conquer docking approach. idTarget screens against protein structures in PDB. (entry) | [32] | 2012 |
| iLoc-Cell | Predictor for subcellular locations of human proteins with multiple sites. (entry) | [33] | 2012 |
| KnowPredsite | A knowledge-based approach to predict the localization site(s) of both single-localized and multi-localized proteins for all eukaryotes. | [34] | 2009 |
| lncRNAdb | The lncRNAdb database contains a comprehensive list of long noncoding RNAs (lncRNAs) that have been shown to have, or to be associated with, biological functions in eukaryotes, as well as messenger RNAs that have regulatory roles. Each entry contains referenced information about the RNA, including sequences, structural information, genomic context, expression, subcellular localization, conservation, functional evidence and other relevant information. lncRNAdb can be searched by querying published RNA names and aliases, sequences, species and associated protein-coding genes, as well as terms contained in the annotations, such as the tissues in which the transcripts are expressed and associated diseases. In addition, lncRNAdb is linked to the UCSC Genome Browser for visualization and to the Noncoding RNA Expression Database (NRED) for expression information from a variety of sources. (entry) | [35] | 2011 |
| Loc3D | LOC3D is a database of predicted subcellular localization for eukaryotic proteins of known three-dimensional (3D) structure and includes tools to predict the subcellular localization for submitted protein sequences. (entry) | [36][37][38] | 2005 |
| LOCATE | LOCATE is a curated database that houses data describing the membrane organization and subcellular localization of mouse proteins. (entry) | [39] | 2006 |
| LocDB | LocDB is a manually curated database with experimental annotations for the subcellular localizations of proteins in Homo sapiens (HS, human) and Arabidopsis thaliana (AT, thale cress). Each database entry contains the experimentally derived localization in Gene Ontology (GO) terminology, the experimental annotation of localization, localization predictions by state-of-the-art methods and, where available, the type of experimental information. LocDB is searchable by keyword, protein name and subcellular compartment, as well as by identifiers from UniProt, Ensembl and TAIR resources. (entry) | [40] | 2011 |
| LOCtarget | LOCtarget is a tool for predicting, and a database of pre-computed predictions for, subcellular localization of eukaryotic and prokaryotic proteins. Several methods are employed to make the predictions, including text analysis of SWISS-PROT keywords, nuclear localization signals, and the use of neural networks. (entry) | [41] | 2004 |
| LOCtree | Prediction based on mimicking the cellular sorting mechanism using a hierarchical implementation of support vector machines. LOCtree is a comprehensive predictor incorporating predictions based on PROSITE/PFAM signatures as well as SwissProt keywords. | [37] | 2005 |
| LocTree2 | Framework to predict localization in life's three domains, including globular and membrane proteins (3 classes for archaea, 6 for bacteria and 18 for eukaryota). The method works well even for protein fragments. It uses a hierarchical system of support vector machines that imitates the cascading mechanism of cellular sorting. The method reaches high levels of sustained performance (eukaryota: Q18=65%, bacteria: Q6=84%). LocTree2 also accurately distinguishes membrane and non-membrane proteins. According to its authors, it compared favorably with top methods when tested on new data. (entry) | [42] | 2012 |
| LocTree3 | Prediction of protein subcellular localization in 18 classes for eukaryota, 6 for bacteria and 3 for archaea. (entry) | [42][43] | 2014 |
| MARSpred | Prediction method for discrimination between Mitochondrial-AARSs and Cytosolic-AARSs. (entry) | [44] | 2012 |
| MDLoc | Dependency-based protein subcellular location predictor. (entry) | [45] | 2015 |
| MemLoci | Predictor for the subcellular localization of proteins associated with or inserted in eukaryotic membranes. (entry) | [46] | 2011 |
| MemPype | Prediction of topology and subcellular localization of eukaryotic membrane proteins. (entry) | [47] | 2011 |
| MetaLocGramN | Meta subcellular localization predictor for Gram-negative proteins. MetaLocGramN is a gateway to a number of primary prediction methods of various types: signal peptide, beta-barrel, transmembrane helix and subcellular localization predictors. In the authors' benchmark, MetaLocGramN outperformed other SCL prediction methods, reaching an average Matthews correlation coefficient of 0.806, a 12% improvement in predictive capability compared to PSORTb3. MetaLocGramN can be run via SOAP. | [48] | 2012 |
| MirZ | MirZ is a web server for the evaluation and analysis of miRNA. It integrates two miRNA resources: the smiRNAdb miRNA expression atlas and the E1MMo miRNA target prediction algorithm. (entry) | [49] | 2009 |
| MitPred | Web server specifically trained to predict proteins that are destined to localize in mitochondria, particularly in yeast and animals. (entry) | [50] | 2006 |
| MultiLoc | An SVM-based prediction engine for a wide range of subcellular locations. | [51] | 2006 |
| Mycosub | Web server that predicts the subcellular localizations of mycobacterial proteins based on optimal tripeptide compositions. (entry) | [52] | 2015 |
| NetNES | Prediction of leucine-rich nuclear export signals (NES) in eukaryotic proteins. (entry) | [53] | 2004 |
| ngLOC | ngLOC is an n-gram-based Bayesian classifier that predicts subcellular localization of proteins in both prokaryotes and eukaryotes. The overall prediction accuracy varies from 85.3% to 91.4% across species. (entry) | [54] | 2007 |
| OBCOL | Software designed to perform organelle-based colocalisation analysis from multi-fluorophore microscopy 2D, 3D and 4D cell imaging. (entry) | [55] | 2009 |
| PA-SUB | PA-SUB (Proteome Analyst Specialized Subcellular Localization Server) can be used to predict the subcellular localization of proteins using established machine learning techniques. (entry) | [56][57] | 2004 |
| PharmMapper | PharmMapper is a web server that identifies potential drug targets from its PharmTargetDB for a given input molecule. Potential targets are identified from a prediction of the spatial arrangement of features essential for a given molecule to interact with a target. (entry) | [58] | 2010 |
| PlantLoc | PlantLoc is a web server for predicting plant protein subcellular localization by substantiality motif. (entry) | [59] | 2013 |
| PRED-TMBB | PRED-TMBB is a tool that takes a Gram-negative bacterial protein sequence as input and predicts the transmembrane strands and the probability of it being an outer membrane beta-barrel protein. The user has a choice of three different decoding methods. (entry) | [60][61] | 2004 |
| PredictNLS | Prediction and analysis of nuclear localization signals. (entry) | [62] | 2000 |
| PredictProtein Open | Prediction of various aspects of protein structure and function. A user may submit a query to the server without registration. (entry) | [63][64][65][66] | 2014 |
| PREP Suite | The PREP (Predictive RNA Editors for Plants) suite predicts sites of RNA editing based on the principle that editing in plant organelles increases the conservation of proteins across species. Predictors for mitochondrial genes, chloroplast genes, and alignments input by the user are included. (entry) | [67][68] | 2009 |
| ProLoc-GO | ProLoc-GO is an efficient sequence-based method that mines informative Gene Ontology terms to predict protein subcellular localization. (entry) | [69] | 2008 |
| ProLoc | Evolutionary support vector machine (ESVM) based classifier with automatic selection from a large set of physicochemical composition (PCC) features, designed as an accurate system for predicting protein subnuclear localization. (entry) | [70] | 2007 |
| Protegen | Protegen is a web-based database and analysis system that curates, stores and analyzes protective antigens. Protegen includes basic antigen information and experimental evidence curated from peer-reviewed articles. It also includes detailed gene/protein information (e.g. DNA and protein sequences, and COG classification). Different antigen features, such as protein weight and pI, and subcellular localizations of bacterial proteins are precomputed. (entry) | [71] | 2011 |
| Proteome Analyst | Proteome Analyst is a high-throughput tool for predicting properties for each protein in a proteome. The user provides a proteome in FASTA format, and the system employs PSI-BLAST, PSIPRED and MODELLER to predict protein function and subcellular localization. Proteome Analyst uses machine-learned classifiers to predict properties such as GO molecular function. User-supplied training data can also be used to create custom classifiers. (entry) | [57] | 2004 |
| ProTox | ProTox is a web server for the in silico prediction of oral toxicities of small molecules in rodents. (entry) | [72][73] | 2018 |
| PSLpred | Method for predicting the subcellular localization of proteins from prokaryotic genomes. (entry) | [74] | 2005 |
| PSORTb | PSORTb (for “bacterial” PSORT) is a high-precision localization prediction method for bacterial proteins. PSORTb has remained the most precise bacterial protein subcellular localization (SCL) predictor since it was first made available in 2003. The listed PSORTb version offers improved recall, higher proteome-scale prediction coverage, and new refined localization subcategories. It is the first SCL predictor specifically geared for all prokaryotes, including archaea and bacteria with atypical membrane/cell wall topologies. (entry) | [75] | 2010 |
| PSORTdb | PSORTdb (part of the PSORT family) is a database of protein subcellular localizations for bacteria and archaea that contains both information determined through laboratory experimentation (ePSORTdb dataset) and computational predictions (cPSORTdb dataset). (entry) | [76][77] | 2010 |
| psRobot | psRobot is a web-based tool for plant small RNA meta-analysis. psRobot computes stem-loop small RNA prediction, which aligns user-uploaded sequences to the selected genome, extracts their predicted precursors, and predicts whether the precursors can fold into a stem-loop-shaped secondary structure. psRobot also computes small RNA target prediction, which predicts the possible targets of user-provided small RNA sequences from the selected transcript library. (entry) | [78] | 2012 |
| pTARGET | pTARGET predicts the subcellular localization of eukaryotic proteins based on the occurrence patterns of location-specific protein functional domains and the amino acid compositional differences in proteins from nine distinct subcellular locations. (entry) | [79][80] | 2006 |
| RegPhos | RegPhos is a database for exploration of the phosphorylation network associated with an input of genes/proteins. Subcellular localization information is also included. (entry) | [81] | 2011 |
| RepTar | RepTar is a database of miRNA target predictions, based on the RepTar algorithm that is independent of evolutionary conservation considerations and is not limited to seed pairing sites. (entry) | [82] | 2011 |
| RNApredator | RNApredator is a web server for the prediction of bacterial sRNA targets. The user can choose from a large selection of genomes. Accessibility of the target to the sRNA is considered. (entry) | [83] | 2011 |
| S-PSorter | A novel cell structure-driven classifier construction approach for predicting image-based protein subcellular location by employing prior biological structural information. (entry) | [84] | 2016 |
| SChloro | Prediction of protein sub-chloroplastic localization. (entry) | [85] | 2017 |
| SCLAP | An adaptive boosting method for predicting subchloroplast localization of plant proteins. | [86] | 2013 |
| SCLpred | Protein subcellular localization prediction by N-to-1 neural networks. | [87] | 2011 |
| SCLpred-EMS | Subcellular localization prediction of endomembrane system and secretory pathway proteins by deep N-to-1 convolutional neural networks. | [88] | 2020 |
| SCLpred-MEM | Subcellular localization prediction of membrane proteins by deep N-to-1 convolutional neural networks. | [89] | 2021 |
| SecretomeP | Predictions of non-classical (i.e. not signal peptide triggered) protein secretion. (entry) | [90][91] | 2005 |
| SemiBiomarker | Semi-supervised protocol that can use unlabeled cancer protein data in model construction through an iterative and incremental training strategy. It can result in improved accuracy and sensitivity of subcellular location difference detection. (entry) | [92] | 2015 |
| SherLoc | An SVM-based predictor combining MultiLoc with text-based features derived from PubMed abstracts. | [93] | 2007 |
| SUBA3 | A subcellular localisation database for Arabidopsis proteins, with an online search interface. (entry) | [94][95] | 2014 |
| SubChlo | Computational system for predicting a protein's subchloroplast location from its primary sequence. It assigns a protein whose subcellular location is the chloroplast to one of four parts: envelope (which consists of the outer membrane and inner membrane), thylakoid lumen, stroma and thylakoid membrane. (entry) | [96] | 2009 |
| SuperPred | The SuperPred web server compares the structural fingerprint of an input molecule to a database of drugs connected to their drug targets and affected pathways. Because the biological effect is well predictable when the structural similarity is sufficient, the web server allows prognoses about the medical indication area of novel compounds and helps find new leads for known targets. Such information can be useful in drug classification and target prediction. (entry) | [97] | 2008 |
| SuperTarget | Web resource for analyzing drug-target interactions. Integrates drug-related information associated with medical indications, adverse drug effects, drug metabolism, pathways and Gene Ontology (GO) terms for target proteins. (entry) | [98] | 2012 |
| SwissTargetPrediction | SwissTargetPrediction is a web server for predicting the targets of bioactive small molecules. Using a combination of 2D and 3D similarity measures, it compares the query molecule to a library of 280 000 compounds active on more than 2000 targets of 5 different organisms. (entry) | [99][100] | 2014 |
| T3DB | The Toxin and Toxin-Target Database (T3DB) is a bioinformatics resource that compiles comprehensive information about common or ubiquitous toxins and their toxin-targets. Each T3DB record (ToxCard) contains over 80 data fields providing detailed information on chemical properties and descriptors, toxicity values, protein and gene sequences (for both targets and toxins), molecular and cellular interaction data, toxicological data, mechanistic information and references. This information has been manually extracted and manually verified from numerous sources, including other electronic databases, government documents, textbooks and scientific journals. A key focus of the T3DB is on providing "depth" over "breadth", with detailed descriptions, mechanisms of action, and information on toxins and toxin-targets. Potential applications of the T3DB include clinical metabolomics, toxin target prediction, toxicity prediction and toxicology education. (entry) | [101] | 2010 |
| TALE-NT | Transcription activator-like (TAL) Effector-Nucleotide Targeter 2.0 (TALE-NT) is a suite of web-based tools that allows for custom design of TAL effector repeat arrays for desired targets and prediction of TAL effector binding sites. (entry) | [102] | 2012 |
| TarFisDock | Target Fishing Dock (TarFisDock) is a web server that docks small molecules with protein structures in the Potential Drug Target Database (PDTD) in an effort to discover new drug targets. (entry) | [103] | 2006 |
| TargetRNA | TargetRNA is a web-based tool for identifying mRNA targets of small non-coding RNAs in bacterial species. (entry) | [104] | 2008 |
| TargetP | Prediction of N-terminal sorting signals. | [105] | 2000 |
| TDR Targets | Tropical Disease Research (TDR) database, designed and developed to facilitate the rapid identification and prioritization of molecular targets for drug development, focusing on pathogens responsible for neglected human diseases. The database integrates pathogen-specific genomic information with functional data for genes collected from various sources, including literature curation. Information can be browsed and queried. (entry) | [106] | 2012 |
| TetraMito | Sequence-based predictor for identifying the submitochondrial location of proteins. (entry) | [107] | 2013 |
| TMBETA-NET | Tool that predicts transmembrane beta strands in an outer membrane protein from its amino acid sequence. (entry) | [108][109] | 2005 |
| TMHMM | Prediction of transmembrane helices to identify transmembrane proteins. | [110] | 2001 |
| TMpred | The TMpred program makes a prediction of membrane-spanning regions and their orientation. The algorithm is based on the statistical analysis of TMbase, a database of naturally occurring transmembrane proteins. (entry) | [111] | 1993 |
| TPpred 1.0 | Organelle-targeting peptide prediction. (entry) | [112] | 2013 |
| TPpred 2.0 | Mitochondrial targeting peptide prediction. (entry) | [113][112] | 2015 |
| TPpred 3.0 | Organelle-targeting peptide detection and cleavage-site prediction. (entry) | [113] | 2015 |
| TTD | The Therapeutic Target Database (TTD) has been developed to provide information about therapeutic targets and corresponding drugs. TTD includes information about successful, clinical trial and research targets; approved, clinical trial and experimental drugs linked to their primary targets; new ways to access data by drug mode of action; recursive search of related targets or drugs; similarity target and drug searching; customized and whole data download; and standardized target IDs. (entry) | [114] | 2010 |
| UM-PPS | The University of Minnesota Pathway Prediction System (UM-PPS) is a web tool that recognizes functional groups in organic compounds that are potential targets of microbial catabolic reactions and predicts transformations of these groups based on biotransformation rules. Multi-level predictions are made. (entry) | [115] | 2008 |
| WoLF PSORT | WoLF PSORT is an extension of the PSORT II program for protein subcellular location prediction. (entry) | [116] | 2007 |
| YLoc | YLoc is a web server for the prediction of subcellular localization. Predictions are explained and the biological properties used for the prediction are highlighted. In addition, a confidence estimate rates the reliability of individual predictions. (entry) | [117] | 2010 |
| Zinc Finger Tools | Zinc Finger Tools provides several tools for selecting zinc finger protein target sites and for designing the proteins that will target them. (entry) | [118][119][120][121][122][123] | 2006 |


