ExPASy: SIB bioinformatics resource portal
Panu Artimo¹, Manohar Jonnalagedda¹,², Konstantin Arnold³, Delphine Baratin⁴, Gabor Csardi⁵, Edouard de Castro⁴, Séverine Duvaud⁴, Volker Flegel¹, Arnaud Fortier¹, Elisabeth Gasteiger⁴, Aurélien Grosdidier², Céline Hernandez¹, Vassilios Ioannidis¹, Dmitry Kuznetsov¹, Robin Liechti¹, Sébastien Moretti¹,⁶, Khaled Mostaguir⁴, Nicole Redaschi⁴, Grégoire Rossier¹, Ioannis Xenarios¹,⁴,⁷ and Heinz Stockinger¹,*
¹Vital-IT Group, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
²Molecular Modelling Group, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
³Computational Structural Biology Group, University of Basel, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
⁴Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
⁵Department of Statistics, Harvard University, Cambridge, USA
⁶Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
⁷Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
Received December 16, 2011; Revised April 10, 2012; Accepted April 17, 2012
ABSTRACT
ExPASy (http://www.expasy.org) has a worldwide
reputation as one of the main bioinformatics resources
for proteomics. It has now evolved,
becoming an extensible and integrative portal accessing
many scientific resources, databases and
software tools in different areas of life sciences.
Scientists can now seamlessly access a
wide range of resources in many different domains,
such as proteomics, genomics, phylogeny/evolution,
systems biology, population genetics,
transcriptomics, etc. The individual resources (databases,
web-based and downloadable software
tools) are hosted in a ‘decentralized’ way by different
groups of the SIB Swiss Institute of Bioinformatics
and partner institutions. Specifically, a single web
portal provides a common entry point to a wide
range of resources developed and operated by different
SIB groups and external institutions. The
portal features a search function across ‘selected’
resources. Additionally, the availability and usage
of resources are monitored. The portal is aimed at
both expert users and people who are not familiar
with a specific domain in life sciences. The new
web interface provides, in particular, visual
guidance for newcomers to ExPASy.
INTRODUCTION
Historically, ExPASy (1,2) was one of the ﬁrst web servers
in the life sciences community, known as ‘Expert Protein
Analysis System’. We have now turned it into a bioinformatics
resource portal that serves not only the ﬁeld of
proteomics but also other domains of life sciences. The
major changes and their rationale are described below:
(i) comprehensive list of high-quality resources
provided by the SIB Swiss Institute of
Bioinformatics and several external institutions.
Swiss-Prot has been and still is the ﬂagship knowledge
base of the SIB and one of the main
proteomics-related resources on ExPASy.
However, in the last decade, the SIB has grown
to a Swiss-wide institute that covers all major bioinformatics
domains, including proteomics,
genomics, transcriptomics, evolution, systems
biology, etc. One goal of the new ExPASy portal
is to serve as entry point to all scientiﬁc databases
and tools provided by the SIB, and we revisited the
original focus on proteomics accordingly;
(ii) federated portal rather than a single server. The
original ExPASy server hosted resources from
mainly two groups located in Geneva (with a few
notable exceptions, such as SWISS-MODEL). With
the inclusion of resources from more than 20
groups in ﬁve different cities (Geneva, Lausanne,
*To whom correspondence should be addressed. Tel: +41 21 692 40 89; Fax: +41 21 692 40 65. Email: Heinz.Stockinger@isb-sib.ch;
helpdesk@expasy.org
The authors wish it to be known that, in their opinion, the ﬁrst two authors should be regarded as joint First Authors.
Published online 31 May 2012 Nucleic Acids Research, 2012, Vol. 40, Web Server issue W597–W603
doi:10.1093/nar/gks400
© The Author(s) 2012. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Berne, Basel and Zurich), located at seven Swiss
higher education institutions, ExPASy has been
turned into a federated portal that provides access
to resources at different locations. Each resource is
maintained independently but many of them still
follow a common release cycle or database update
schedule. Keeping independent release cycles for
some resources allows more ﬂexibility for the individual
resource providers;
(iii) new search features: a ﬂexible search functionality
is provided to allow proper access to the independent
resources now federated under the new ExPASy
portal. This relies on a standardized search interface
that permits querying several different resources
in parallel;
(iv) visual guidance: we aim at serving a wide range of
users including newcomers not familiar with the
ﬁelds covered by ExPASy. We have added an
attractive visual interface that is addressed to new
users even if they have little experience with the
provided domains and technologies; and
(v) monitor the quality and usage of resources:
providing high-quality usable resources is of major
importance. Consequently, we regularly monitor all
resources on the portal and provide the possibility
to notify the resource providers in case of availability
issues. Additionally, usage statistics are collected
to monitor the popularity of resources.
Figure 1 shows the homepage of the new version of
ExPASy, including a new logo reﬂecting the SIB’s corporate
identity. Different ‘categories’ beyond proteomics can
be selected in the menu item ‘Categories’ on the left-hand
side (cf. ‘Access to a large set of bioinformatics resources’
section). A query feature is directly accessible from the
page header (cf. ‘Cross-resource search interface’
section). In the centre of the page, a set of resources is
promoted (‘Featuring today’) to allow users to discover
new resources, which might help them in their work. These
featured links are updated every week and are drawn from the
list of all SIB resources available via the portal.
Additionally, the most ‘Popular resources’ are directly
accessible in a pane on the right side. An alphabetical
resources list can be accessed under the menu item
‘Resources A..Z’. Finally, ‘Latest News’ about resources
is displayed and made available via an RSS feed (see
RSS icon). In the remainder of this article, we will
present details of the new features as well as some hints
for experienced users of the previous ExPASy version.
MAIN FEATURES
Access to a large set of bioinformatics resources
The previous ExPASy server was mainly dedicated to
bioinformatics related to proteomics (protein identiﬁcation,
post-translational modiﬁcations, etc.), and only
Figure 1. ExPASy homepage.
some of the resources were applicable to both proteomes
and genomes (like sequence similarity search or sequence
alignment). The new portal provides access to over 130
resources (cf. Appendix A1 for the full list of resources)
from more than 20 different SIB groups. To reﬂect the
different scientiﬁc directions of the SIB’s groups, the following
scientiﬁc categories are currently available: proteomics,
genomics, structural bioinformatics, systems
biology, phylogeny/evolution, population genetics, transcriptomics,
biophysics, imaging and drug design. To facilitate
access to the numerous resources in proteomics
and genomics, additional subcategories are available in
the respective menu items.
Within each category, resources are classiﬁed as
‘Databases’ and/or as ‘Tools’. Most resources have web
interfaces (especially databases) but others may consist of
a downloadable software package with a command line or
graphical interface.
Each single resource is documented with the following
information to allow for a more speciﬁc identiﬁcation and
description (Figure 2).
(i) Resource name and description.
(ii) Indication of SIB group that maintains the
resource.
(iii) Scientiﬁc category.
(iv) Keywords: each resource is tagged with terms from a
controlled vocabulary to facilitate resource classiﬁcation
and search beyond simple categorization e.g.
sequence comparison, ChIP-seq, etc.
(v) URL for the web interface and for download if
available.
(vi) Software type: a resource can either be a website,
command line interface, GUI, library, etc.
(vii) Status (green check box): indicates whether a
resource is accessible [cf. ‘Monitoring (availability
and usage checks)’ section].
A detailed search feature to ﬁnd resources on the portal
is available in the page header (see ‘search’ button in top/
centre of the ExPASy home page). One needs to select
‘Find Resources’ to discover (query) resources by name,
keyword, category or description. If this feature is used,
the information shown below becomes visible (Figure 2).
If a search term is found (for instance ‘UniProtKB’), it is
highlighted for the speciﬁc resource(s). The second main
usage of the query engine will be described in the
‘Cross-resource search interface’ section.
Internal implementation detail: all the resource-speciﬁc
information is stored in a relational database back-end.
This allows for a very dynamic website in case any of the
130 or more resources change. The portal is implemented
using the web framework CakePHP (http://
cakephp.org).
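The resource description and the ‘Find Resources’ search can be sketched as follows. This is a minimal illustration in Python rather than the portal's CakePHP and relational database implementation; the field names, the sample entries and the `find_resources` function are assumptions made for the example, not the portal's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """One portal entry; field names are illustrative, not the real schema."""
    name: str
    description: str
    group: str
    category: str
    keywords: List[str] = field(default_factory=list)
    url: str = ""
    software_type: str = "website"


def find_resources(resources, term):
    """Return (resource, matched_fields) pairs for a case-insensitive term.

    The matched field names tell the caller where to highlight the term.
    """
    needle = term.lower()
    hits = []
    for res in resources:
        matched = []
        if needle in res.name.lower():
            matched.append("name")
        if needle in res.description.lower():
            matched.append("description")
        if needle in res.category.lower():
            matched.append("category")
        if any(needle in kw.lower() for kw in res.keywords):
            matched.append("keywords")
        if matched:
            hits.append((res, matched))
    return hits


# Placeholder entries; descriptions and keywords are invented for the example.
catalogue = [
    Resource("UniProtKB", "Protein knowledgebase", "Swiss-Prot Group",
             "proteomics", ["protein sequence"]),
    Resource("SWISS-MODEL Repository", "Database of protein structure models",
             "Computational Structural Biology Group",
             "structural bioinformatics", ["molecular modelling"]),
]
```

Calling `find_resources(catalogue, "uniprot")` returns the UniProtKB entry together with the list of fields in which the term occurred, which is the information a page would need to highlight the match.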
Cross-resource search interface
Although most of the resources provide speciﬁc search
functionalities, a single search querying a set of resources
at the same time is often more convenient. For instance,
the search can include resources that are not yet
known to the user but may still be useful. Following this
idea, a REST-based cross-resource search protocol has been
designed (Figure 3); the detailed specification of the interface
can be found at http://wiki.isb-sib.ch/web-team/Sib-resource_queryinterface.
It is already implemented
by almost 20 SIB resources, such as
ENZYME, MyHits, STRING, UniProtKB, ViralZone,
PROSITE, SWISS-MODEL Repository, etc.
This query feature is available in the header of the page
via the item ‘Query all databases’ (Figure 3). In this case, a
standard text-based query e.g. ‘human’ is sent in parallel
to all compatible resources, i.e. the ones that implement
the ExPASy query interface. As soon as one of the latter
has performed the search, the number of matches (hits) is
displayed, and a link is added to the query results in the
resource’s original interface. If available, an additional
description of the result set is shown. The query can also
be restricted by specifying a category, e.g. only query
resources in the ﬁeld of ‘phylogeny/evolution’ or even by
specifying an individual resource.
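The parallel dispatch described above can be illustrated with a short Python sketch. It does not reproduce the actual ExPASy query interface: the `query_all` function, the shape of the `backends` mapping and the stand-in search functions are hypothetical, and a real back-end would issue an HTTP request to each resource instead of computing a number locally.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def query_all(backends, term, category=None):
    """Send `term` to every matching back-end in parallel.

    `backends` maps a resource name to (category, search_function); each
    search_function takes the term and returns a hit count. Results are
    yielded as (resource_name, hits) in completion order, so a page can
    show each count as soon as that resource has answered.
    """
    selected = {
        name: search
        for name, (cat, search) in backends.items()
        if category is None or cat == category
    }
    if not selected:
        return
    with ThreadPoolExecutor(max_workers=len(selected)) as pool:
        futures = {pool.submit(search, term): name
                   for name, search in selected.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                yield name, future.result()
            except Exception:
                # A failing resource must not block the other results.
                yield name, None


# Stand-in back-ends (hypothetical): real ones would query each resource.
demo_backends = {
    "resource_a": ("proteomics", lambda term: len(term)),
    "resource_b": ("phylogeny/evolution", lambda term: 2 * len(term)),
}
```

Passing `category="proteomics"` restricts the query to back-ends registered under that category, mirroring the category restriction offered by the portal.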
In addition to the default behaviour (searching for
full-text matches), the search engine automatically
recognizes formatted data types (called ‘query types’), such
as UniProtKB accession numbers, PDBID, EnsemblID,
Figure 2. Detailed information describing a resource.
etc., and sends the query only to resources supporting the
speciﬁed query type. The search can be extended beyond
these automatically selected resources, by clicking on the
button ‘Search again as text’, which initiates a full-text
search to all registered resources.
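One plausible way to recognize query types is pattern matching on the query string, sketched below in Python. The patterns and the names `QUERY_TYPES`, `detect_query_type` and `select_resources` are assumptions for illustration; the UniProtKB pattern follows the accession format published by UniProt, whereas the Ensembl and PDB patterns are simplified and are not taken from the portal.

```python
import re

# Ordered from most to least specific; all patterns are illustrative.
QUERY_TYPES = [
    ("uniprotkb_ac", re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}")),
    ("ensembl_id", re.compile(r"ENS[A-Z]*[GTP][0-9]{11}")),
    ("pdb_id", re.compile(r"[0-9][A-Za-z0-9]{3}")),
]


def detect_query_type(query):
    """Return the first query type whose pattern matches the whole query."""
    text = query.strip()
    for name, pattern in QUERY_TYPES:
        if pattern.fullmatch(text):
            return name
    return "text"


def select_resources(supported, query):
    """Pick the resources that declare support for the detected type.

    `supported` maps a resource name to the set of query types it accepts.
    """
    qtype = detect_query_type(query)
    names = sorted(n for n, types in supported.items() if qtype in types)
    return qtype, names
```

Pattern-based detection is ambiguous by nature: with these patterns a four-digit year such as ‘2012’ is classified as a PDB identifier, which is one reason why a fallback such as ‘Search again as text’ is useful.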
Visual guidance
As an alternative to the standard, text-based interface that
relies on scientiﬁc categories (which mainly targets expert
users), resources can be accessed via a visually attractive
interface. For instance, a biologist working at the cellular
level might want to ﬁnd corresponding tools and databases
in another domain. The different graphical entry
points are depicted in Figure 4. Once selected, databases
and tools will be displayed (Figure 5).
Once a category has been selected, the list of resources
shown can be narrowed down by a classical keyword-based
search feature, or by a tag cloud showing the
most frequent yet discriminative keywords among
displayed resources. In the example in Figure 5, the
Figure 3. Overview of resources that can be queried via the cross-resource search interface (A) (Query all databases). If a resource implements the
cross-resource search interface (B) (indicated by the red box named ‘Query interface’ and the arrow leading to the resource), ExPASy will query the
resource directly. Other resources that do not implement this speciﬁc interface (even if they have other query features), are not included in ExPASy’s
parallel query.
Figure 4. Entry point for the visual guidance interface—includes categories to be selected.
keyword ‘molecular modelling’ has been selected
(cf. ‘Selected keywords’ in the top left of Figure 5),
and resources related to this keyword are indicated in
the right pane.
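The article does not specify how the tag cloud weighs frequency against discriminative power. As a minimal sketch under that caveat, the Python code below counts keywords over the displayed resources and drops any keyword attached to every one of them, since selecting it would not narrow the list; `tag_cloud` and `narrow` are hypothetical names.

```python
from collections import Counter


def tag_cloud(displayed, max_tags=10):
    """Return (keyword, count) pairs for the tag cloud.

    `displayed` maps a resource name to its keywords. A keyword carried by
    every displayed resource is skipped because it cannot discriminate.
    """
    counts = Counter()
    for keywords in displayed.values():
        counts.update(set(keywords))  # count each keyword once per resource
    total = len(displayed)
    useful = [(kw, n) for kw, n in counts.items() if n < total]
    useful.sort(key=lambda item: (-item[1], item[0]))
    return useful[:max_tags]


def narrow(displayed, keyword):
    """Keep only the resources tagged with the selected keyword."""
    return {name: kws for name, kws in displayed.items() if keyword in kws}
```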
Monitoring (availability and usage checks)
The federated portal lists resources from numerous groups
in various locations with different levels of expertise.
However, a common important goal is to provide high-quality
resources that are available to the research community,
i.e. service downtime should be reduced to a
minimum. ExPASy provides a monitoring feature that
periodically checks if a service is accessible and sends
Email notiﬁcations in case of errors. The status of a
resource is then displayed on the portal (green sign if
ok, red sign if not ok). Additionally, the service up-time
is displayed (Figure 6).
If a resource provides a web service interface, detailed
functionality tests can be done. Since the EMBRACE
registry (3) already provides a service to register and
execute client-side tests for SOAP services, its interface is
used and interrogated for service information. All these
checks are performed by a Python script, which can also
be extended to add other useful tests. For instance, one
can test for additional functionality using example queries.
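A periodic availability check of this kind might look like the following Python sketch. It is not the script used by the portal: the function names, the in-memory `history` structure and the use of `print` as the default notifier are assumptions, and a production version would send email and persist its results.

```python
import urllib.error
import urllib.request


def http_check(url, timeout=10):
    """Return True if `url` answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Covers HTTP errors, network failures and malformed URLs.
        return False


def run_checks(resources, history, check=http_check, notify=print):
    """Run one monitoring pass over all resources.

    `resources` maps a name to its URL and `history` maps a name to the
    list of past results. `check` and `notify` are injectable so that
    extra tests or a different alert channel can be plugged in.
    """
    status = {}
    for name, url in resources.items():
        ok = check(url)
        history.setdefault(name, []).append(ok)
        status[name] = ok
        if not ok:
            notify(f"{name} is not reachable at {url}")
    return status


def uptime_percent(history, name):
    """Share of successful checks for one resource, or None if unchecked."""
    results = history.get(name, [])
    if not results:
        return None
    return 100.0 * sum(results) / len(results)
```

Such a pass would typically be run at fixed intervals by a scheduler; the status returned by `run_checks` corresponds to the green or red sign, and `uptime_percent` to the displayed service up-time.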
Service availability is one issue. Another question for
users of a portal is: how popular is a speciﬁc resource?
The information is not visible publicly (i.e. the statistics
are accessible only to resource providers and portal operators)
but the most popular resources on the portal are
ranked according to internally collected usage statistics,
e.g. UniProtKB appears on the top of the page.
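Ranking by usage while keeping the figures private can be sketched in a few lines of Python. The `popular_resources` function and the flat list of accessed resource names are assumptions for the example; the article does not describe how the usage statistics are stored.

```python
from collections import Counter


def popular_resources(access_log, top=5):
    """Return resource names ordered by access count, most used first.

    Only names are returned, so the counts themselves stay internal.
    Ties are broken alphabetically to keep the ranking stable.
    """
    counts = Counter(access_log)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top]]
```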
Additional changes and hints for established users
We have also used the opportunity to critically review the
resources on the previous ExPASy server, update them
and phase out some of them. More than 40 resources
that were used frequently and for which the complete source code
was still available have been repackaged, recompiled and
migrated to 64-bit architectures. Whenever needed, the
user interfaces have been adapted. The repackaging also
resulted in new URLs for most of the resources, i.e. the
package name is now explicitly visible in the URL (either
*.expasy.org/<resourcename> or
<resourcename>.expasy.org). Example of a new URL:
http://web.expasy.org/translate instead of
http://www.expasy.org/tools/dna.html. However, old URLs will
remain available for some time and are automatically
redirected to new URLs.
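The redirection of old URLs amounts to a lookup from legacy paths to new locations, as in the Python sketch below. Only the single mapping shown is taken from the text; the `LEGACY_PATHS` table and the `redirect_target` function are hypothetical, and in practice such rules are usually configured in the web server itself.

```python
from urllib.parse import urlsplit

# Only this entry comes from the article; a real table would be much longer.
LEGACY_PATHS = {
    "/tools/dna.html": "http://web.expasy.org/translate",
}


def redirect_target(old_url):
    """Return the new URL for a legacy ExPASy URL, or None if unknown."""
    parts = urlsplit(old_url)
    if parts.hostname != "www.expasy.org":
        return None
    return LEGACY_PATHS.get(parts.path)
```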
The portal also has a new way to request help, i.e. we
provide a single point of entry rather than several different
Email addresses. Typically, we ask users to specify the
resource for which they have a speciﬁc question, i.e. one
can select from all existing resources. In this way, the
Figure 5. Tag cloud in the visual guidance interface.
ExPASy helpdesk can better respond to the user’s questions
and address them to the correct people. Although
resources are provided by several different groups, all
providers can be contacted via a single form. That is a
substantial improvement with respect to the previous
version. Users only need to remember one place to ask
questions (i.e., to post comments), whereas the ExPASy
team takes care of ﬁnding the right contact person, i.e. we
provide a professional helpdesk with ﬁrst and second level
support.
Mirror sites external to the SIB were discontinued for
technical reasons but the new federated portal has a
fail-over mirror at a second site in Switzerland (Basel)
that can be used in case of technical interventions, etc.
The portal still provides access to the proteomics tools
links page and its classiﬁcations via the menu item ‘Links/
Documentation – Proteomics software tools’. Additionally,
the ‘Life Science Directory’ (previously known as Amos’
links) is still accessible. However, both pages are static and
are currently no longer maintained. This way, the portal
still provides access to resources that are not maintained by
the SIB. However, for a new release of ExPASy, it is
planned that external resources (i.e. proteomics tools
listed on the static page mentioned before) will be explicitly
listed on the portal, included in the main search and displayed
in a similar way as SIB resources. Additionally,
some resources such as BLAST have been developed by
external groups but the SIB operates the service and
provides support.
CONCLUSION
Since 28 June 2011, the new ExPASy SIB Bioinformatics
Resource portal has been online to serve the life sciences
community. The new portal has been designed and
developed by the SIB Web Team. It is fully operational
and is accessed several thousand times a day by researchers
from all over the world.
ACKNOWLEDGEMENTS
We thank the Steering Committee of the Web Team for
the support: Ron Appel, Lydie Bougueleret, Frédérique
Lisacek, Irene Perovsek, Torsten Schwede, Christian von
Mering and Ioannis Xenarios. Thanks also to all ExPASy
users who have provided feedback to improve the portal.
FUNDING
Swiss State Secretariat for Education and Research,
in part. Funding for open access charge: SIB.
Conﬂict of interest statement. None declared.
REFERENCES
1. Appel,R.D., Bairoch,A. and Hochstrasser,D.F. (1994) A new
generation of information retrieval tools for biologists: the
example of the ExPASy WWW server. Trends Biochem. Sci., 19,
258–260.
2. Gasteiger,E., Gattiker,A., Hoogland,C., Ivanyi,I., Appel,R.D. and
Bairoch,A. (2003) ExPASy: The proteomics server for in-depth
protein knowledge and analysis. Nucleic Acids Res., 31,
3784–3788.
3. Pettifer,S., Ison,J., Kalas,M., Thorne,D., McDermott,P.,
Jonassen,I., Liaquat,A., Fernández,J.M., Rodriguez,J.M., INB-Partners
et al. (2010) The EMBRACE web service collection.
Nucleic Acids Res., 38, W719–W723.
APPENDIX A1
In April 2012, the following SIB resources were listed on
the portal:
AACompIdent, AACompSim, ALF, AllAll, Arlequin,
Association Viewer, BayeScan, Bgee, Biochemical
Pathways, BLAST, boxshade, ChIP-Seq, Click2Drug,
CLIPZ, ClustalW, Codon Suite, COILS, Compute pI/
MW, CT-CBN, Decrease redundancy, Dotlet, efmtool,
ElMMo, EMBnet services, ENZYME, EPD, ESTscan,
ExpressionView, FastEpistasis, fastsimcoal, FetchGWI /
Figure 6. Monitoring information for a specific resource on the portal. This information is available if one clicks on the resource's status sign
displayed on the top right in Figure 2.
tagger, FindMod, FindPept, Genome History,
GlycanMass, GlycoMod, GlycoSuiteDB, GMM,
GPSDB, HAMAP, HamapScan, HCD/CID spectra
merger, ImageMaster / Melanie, ImmunoDB, ISA,
Iscan, IsotopIdent, LALIGN, Linear Classiﬁcation,
MADAP, Make2D-DB II, MALDIPepQuant,
MAMOT, MARA, MARCOIL, MassSearch,
MIAPEGelDB, miROrtho, Mltree, MLTreeMap,
MOSAIC Software Repository, Msight, MyDomains,
MyHits, Myristoylator, Newick Utilities, neXtProt,
nfswatch, OMA, OpenFlu, OpenStructure, OrthoDB,
PANDITplus, PaxDb, PepPepSearch, PeptideCutter,
PeptideMass, Phylogenetic Tree, Phylogibbs, pIcarver,
Ping pong algorithm, PRATT, pROC, Prolune,
PROSITE, Protein Model Portal, Protein Spotlight,
ProtParam, ProtScale, QMEAN, QuickMod, RandSeq,
RaxML, ScanProsite, Selectome, Sequence Similarity
Maps (SSM), ShoRAH, SIBsim4, SIM, smirnaDB,
Soaplab services, SSA, STRING, SugarBind, Sulﬁnator,
SuperTree, SWISS-2DPAGE, SWISS-MODEL
Repository, SWISS-MODEL Workspace, SwissPdbViewer,
SwissDock, SwissParam, SwissRegulon,
SwissSidechain, SwissVar, T-Coffee, TagIdent, TagScan,
TCS, The Systems Biology Research Tool, TMPred,
Translate, TreeGen, TriFLe, tromer, UniPathway,
UniProtKB, UniProtKB/Swiss-Prot, Vertex Cover,
ViralZone, Vital-IT, World-2DPAGE Constellation,
World-2DPAGE Repository and ZFN-Site.