## Tutorial

This tutorial walks through the main options of the SPACER server, explains the requirements for the input data and the user-set parameters, and illustrates the output with explanations of the results and their meaning.

The server’s pipeline consists of several steps, some of which are optional depending on the user’s task.

We use the tetrameric enzyme phosphofructokinase (PFK), a classic example of allostery. The enzyme is allosterically inhibited by phosphoenolpyruvate (PEP) and activated by ADP, which binds to the same site. In the presence of PEP, it binds one of its two substrates, fructose-6-phosphate (F6P), cooperatively. We use the crystal structure with the allosteric activator ADP bound (PDB ID 4pfk).

### Input data

Typically, the user provides a four-character PDB ID as input. It is also possible to upload a custom file in PDB format. In the latter case, the user is responsible for providing the correct assembly for an oligomeric structure.

In our example we type in the PDB ID 4pfk.

### Biological Assembly

The server connects to the EBI PISA database and uses the best-matching assembly for the input PDB ID. The user can also take lower-ranked assemblies provided by PISA and upload them to SPACER manually.

For the PFK enzyme (PDB ID 4pfk), the best assembly is retrieved from PISA and displayed in the embedded Jmol viewer on the Bio Assembly page. The subunits of the structure are shown in different colors.

To display results mapped onto structures, SPACER requires that the user's browser support the Jmol Java applet. It is recommended to check that Java is working in your browser before starting.

### Sessions

Some calculation steps may take from half an hour up to a day, depending on the size of the protein assembly and the parameters. For example, calculations on chaperones (PLoS Comp. Biol. e1002301, 2011) take up to 40 hours per structure. SPACER lets the user track the progress of the tasks.

It is also possible to close the browser window and continue later when the results are ready. Each session has a unique five- or six-character identifier (case-sensitive) associated with the particular analysis run. To restore a session later, provide the session ID on the SPACER home page or use the bookmark link obtained via the sessions menu.

### Sites

Initially, the user is provided with the list of ligand-binding sites extracted from the given PDB structure. SPACER can also predict new sites, and known catalytic and allosteric sites can be added manually.

A site can be visualized on the structure by clicking its icon button; the corresponding site is then highlighted in Jmol. Note that the structure is not oriented automatically, so the user may have to rotate it to locate the highlighted site.

There are two options at this stage:

• predict more potential allosteric/catalytic sites;
• analyze communication between known sites.

If the user is interested in identifying potential ligand-binding sites, this can be done using the local closeness and binding leverage tools.

Alternatively, the user can pick a residue and set the radius of a surface patch around it, thereby defining a custom site of interest. The user-annotated site is added to the list of sites.
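As an illustration of the residue-plus-radius idea, here is a minimal Python sketch. It is not SPACER's own selection procedure, which is not described here. It assumes one representative point per residue (for example the Cα atom), ignores surface exposure, and uses hypothetical residue labels and toy coordinates.

```python
import math

def patch_around(center_id, radius, coords):
    """Return the IDs of residues whose representative point lies within
    `radius` angstroms of the chosen residue (the center itself included).
    Assumption: one point per residue; surface exposure is not checked."""
    center = coords[center_id]
    return sorted(
        res_id for res_id, xyz in coords.items()
        if math.dist(center, xyz) <= radius
    )

# Toy coordinates (angstroms) keyed by hypothetical residue labels.
coords = {
    "A:10": (0.0, 0.0, 0.0),
    "A:11": (3.8, 0.0, 0.0),
    "A:12": (7.6, 0.0, 0.0),
    "A:50": (0.0, 5.0, 0.0),
    "A:90": (20.0, 20.0, 0.0),
}
site = patch_around("A:10", 6.0, coords)
print(site)
```

A larger radius simply pulls more residues into the patch, so the radius controls how coarse the custom site is.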

## Predict sites

### Local closeness

The user selects a number $$m$$ between 1 and 4. Here $$m$$ is the maximum distance, counted in edges of the residue interaction graph, at which neighbors are included; each residue in the protein is a node of this graph. The local closeness of degree $$m$$ for a node is defined as $$C_m = \sum_{k=1}^{m} \frac{n_k}{k^2}$$ where $$n_k$$ is the number of nodes whose shortest distance from the given node is exactly $$k$$. A degree of four ($$m = 4$$) is recommended for regular calculations, because with this value only residues closer than 30-40 Å contribute, which roughly corresponds to the length scale of single domains. A value of $$m = 1$$ is recommended when small cavities on the surface are to be investigated. The output is the structure colored according to local closeness, with values normalized to the range 0-100.

Two orientations of the PFK structure. The surface areas with the highest local closeness are shown in red.
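The definition of local closeness can be sketched in a few lines of Python using a breadth-first search. This is an illustration on a toy graph, not the server's implementation. How the residue interaction graph is built is not covered, and the min-max rescaling to 0-100 is an assumed normalization scheme, since the exact one used by SPACER is not specified here.

```python
from collections import deque

def local_closeness(graph, node, m=4):
    """C_m = sum_{k=1..m} n_k / k^2, where n_k is the number of nodes whose
    shortest-path distance from `node` is exactly k."""
    dist = {node: 0}
    queue = deque([node])
    shell_counts = [0] * (m + 1)  # shell_counts[k] holds n_k
    while queue:
        current = queue.popleft()
        if dist[current] == m:
            continue  # nodes beyond distance m do not contribute
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                shell_counts[dist[neighbor]] += 1
                queue.append(neighbor)
    return sum(shell_counts[k] / k**2 for k in range(1, m + 1))

def normalize_0_100(values):
    """Min-max rescaling to 0-100 (assumed scheme, for illustration only)."""
    lo, hi = min(values.values()), max(values.values())
    span = (hi - lo) or 1.0
    return {key: 100.0 * (v - lo) / span for key, v in values.items()}

# Toy residue interaction graph: a chain 0-1-2-3-4 with a branch 2-5.
graph = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
scores = {n: local_closeness(graph, n, m=4) for n in graph}
scaled = normalize_0_100(scores)
```

In this toy graph the branch point (node 2) has three direct neighbors and two second neighbors, so its score is 3 + 2/4 = 3.5, the highest in the graph.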

Reference Mitternacht S, Berezovsky IN (2011) A geometry-based generic predictor for catalytic and allosteric sites. Protein Eng Des Sel 24: 405-409.

### Binding leverage

Binding leverage measures the ability of a binding site to couple to the intrinsic motions of a protein. It quantifies the cost of deforming the binding site when a ligand is present and resists the motion. Conformational changes are simulated using a coarse-grained normal mode potential. The binding leverage $$L_A$$ for a set of normal modes $$A$$ is calculated as $$L_A = \sum_{k \in A} \Delta U_k$$ where $$\Delta U_k$$ is the change in potential energy, under mode $$k$$, of the springs between all pairs of $$C_\alpha$$ atoms $$i$$ and $$j$$ whose connecting line passes within 3.5 Å of any ligand atom: $$\Delta U = \frac{k}{2} \sum_{ij} \Delta d_{ij}^{2}$$ In this last expression the prefactor $$k$$ is the spring constant, not the mode index, and $$\Delta d_{ij}$$ is the change in the distance between atoms $$i$$ and $$j$$. The output consists of the ten potential catalytic and effector binding sites with the highest binding leverage.

Two orientations of the same PFK structure show the surface areas with highest binding leverage in red.
Note that the sites with highest leverage correspond to known ligand-binding sites.
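The binding leverage formulas can be illustrated with a small pure-Python sketch. It makes several assumptions that are not taken from the server: the "connecting line" is treated as the segment between the two Cα atoms, each normal mode is given as one displacement vector per Cα atom, $$\Delta d_{ij}$$ is the change in the Cα-Cα distance under that displacement, and the spring constant (named `spring_k` here to avoid the clash with the mode index) defaults to 1.

```python
import math
from itertools import combinations

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (3D tuples)."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    ab_len2 = sum(c * c for c in ab)
    t = 0.0 if ab_len2 == 0 else sum(x * y for x, y in zip(ap, ab)) / ab_len2
    t = max(0.0, min(1.0, t))  # clamp so the closest point stays on the segment
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.dist(p, closest)

def delta_u(ca, mode, ligand, spring_k=1.0, cutoff=3.5):
    """Delta U = (k/2) * sum_ij (Delta d_ij)^2 over C-alpha pairs whose
    connecting segment passes within `cutoff` angstroms of a ligand atom."""
    moved = [tuple(r[i] + u[i] for i in range(3)) for r, u in zip(ca, mode)]
    total = 0.0
    for i, j in combinations(range(len(ca)), 2):
        if any(point_segment_distance(atom, ca[i], ca[j]) < cutoff
               for atom in ligand):
            dd = math.dist(moved[i], moved[j]) - math.dist(ca[i], ca[j])
            total += dd * dd
    return 0.5 * spring_k * total

def binding_leverage(ca, modes, ligand, spring_k=1.0):
    """L_A: sum of Delta U over the modes in the set A."""
    return sum(delta_u(ca, mode, ligand, spring_k) for mode in modes)

# Toy system: three C-alpha atoms, one ligand atom near the 0-1 segment.
ca = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 30.0, 0.0)]
ligand = [(5.0, 1.0, 0.0)]
zero = (0.0, 0.0, 0.0)
modes = [
    [zero, (1.0, 0.0, 0.0), zero],   # stretches pair 0-1 by 1 angstrom
    [(-2.0, 0.0, 0.0), zero, zero],  # stretches pair 0-1 by 2 angstroms
]
leverage = binding_leverage(ca, modes, ligand)
```

Only the pair 0-1 passes close to the ligand atom, so only its stretching contributes: 0.5 from the first mode and 2.0 from the second.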

Reference Mitternacht S, Berezovsky IN (2011) Binding leverage as a molecular basis for allosteric regulation. PLoS Comput Biol 7: e1002148.

## Allosteric communication

Allosteric communication can be calculated for up to four sites from the list, which includes known catalytic and allosteric sites as well as sites customized by the user or predicted by the server. The communication is characterized by the leverage coupling.

### Leverage coupling

Leverage coupling provides a quantitative characteristic of allosteric communication. The strength of communication between two sites $$P$$ and $$Q$$ is defined as the dot product of the binding leverage vectors of these sites, $$\lambda_P$$ and $$\lambda_Q$$: $$D_{PQ} = \lambda_P \cdot \lambda_Q$$.

The binding leverage vector of site $$P$$ is defined as $$\lambda_P = (\tilde \lambda_{P_{1}}, \dots, \tilde \lambda_{P_{n}})$$, where $$\tilde \lambda_{P_{\mu}}$$ is the binding leverage of site $$P$$ caused by normal mode $$\mu$$: $$\tilde \lambda_{P_\mu} = \frac{\sum_{i \in P} \lambda_{i\mu}}{\lvert P \rvert}$$ Here $$\lvert P \rvert$$ is the number of residues in $$P$$. The normalized leverage coupling $$C_{PQ} = D_{PQ}^2 / ( D_{PP}D_{QQ} )$$ lies in the range $$0 \leq C_{PQ} \leq 1$$. The normalization is necessary for the analysis of big oligomeric structures and molecular machines, where the conformational change at binding sites is small compared to the large-scale functional motions. The output shows the matrices $$D_{PQ}$$ and $$C_{PQ}$$ for the selected sites.
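To make the two coupling measures concrete, the following Python sketch computes $$D_{PQ}$$ and $$C_{PQ}$$ from per-residue, per-mode leverages. The input layout (a dictionary mapping a residue ID to a list with one leverage value per normal mode) and the numbers are hypothetical, and the denominator of the site average is taken to be the number of residues in the site.

```python
def site_leverage_vector(site, residue_leverage):
    """Average the per-residue leverages over the site, mode by mode."""
    n_modes = len(next(iter(residue_leverage.values())))
    return [
        sum(residue_leverage[i][mu] for i in site) / len(site)
        for mu in range(n_modes)
    ]

def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def leverage_coupling(site_p, site_q, residue_leverage):
    """Return (D_PQ, C_PQ) for two sites given as lists of residue IDs."""
    lam_p = site_leverage_vector(site_p, residue_leverage)
    lam_q = site_leverage_vector(site_q, residue_leverage)
    d_pq = dot(lam_p, lam_q)
    # C_PQ = D_PQ^2 / (D_PP * D_QQ); undefined if a site has zero leverage.
    c_pq = d_pq**2 / (dot(lam_p, lam_p) * dot(lam_q, lam_q))
    return d_pq, c_pq

# Hypothetical leverages: residue ID -> one value per normal mode.
residue_leverage = {
    1: [2.0, 0.0, 1.0],
    2: [4.0, 0.0, 1.0],
    3: [0.0, 3.0, 0.0],
    4: [3.0, 0.0, 1.0],
}
coupled = leverage_coupling([1, 2], [4], residue_leverage)
uncoupled = leverage_coupling([1, 2], [3], residue_leverage)
```

Sites that respond to the same modes in the same proportions give $$C_{PQ} = 1$$, while sites that respond to disjoint sets of modes give $$C_{PQ} = 0$$. The bounds follow from the Cauchy-Schwarz inequality.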

$$D_{PQ}$$ and $$C_{PQ}$$ matrices. The last column and row of each matrix correspond to the background (denoted BG), defined as the whole set of residues excluding those involved in the listed sites. Clicking any cell in the matrix highlights the cell in green and the corresponding pair of sites in Jmol.

Jmol shows the pair of sites. The corresponding sets of residues P and Q are shown in orange and green.

In addition to calculating the leverage coupling for pairs of sites, it is possible to explore the allosteric communication between one site and the rest of the structure. The user selects a site from the list, and the structure is colored according to the value of $$D_{P_{i}}$$, where $$i$$ is the index of the residue.
Communication between one of the ADP-binding sites and the rest of the structure.
Two orientations of the same structure.
There is weak communication between the ADP-binding site and the F6P-binding active sites.

However, the communication between different ADP-binding sites is strong.

Communication between one of the F6P-binding sites and the rest of the structure.
Two orientations of the same structure.
F6P-binding sites are strongly communicating.

However, the F6P-binding (active) site is weakly communicating with the ADP-binding (regulatory) sites.
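The site-to-structure view can be sketched in the same spirit. The snippet below assumes that the per-residue value $$D_{P_{i}}$$ is the dot product of the site's leverage vector with the leverage vector of the single residue $$i$$. This reading is an assumption made for illustration; the data layout and numbers are hypothetical.

```python
def residue_coupling(site, residue_leverage):
    """Return {residue ID: D_Pi} for a selected site P.
    Assumption: D_Pi is the dot product of the site-averaged leverage
    vector with the leverage vector of residue i."""
    n_modes = len(next(iter(residue_leverage.values())))
    lam_p = [
        sum(residue_leverage[i][mu] for i in site) / len(site)
        for mu in range(n_modes)
    ]
    return {
        i: sum(a * b for a, b in zip(lam_p, lam_i))
        for i, lam_i in residue_leverage.items()
    }

# Hypothetical leverages: residue ID -> one value per normal mode.
residue_leverage = {
    1: [2.0, 0.0, 1.0],
    2: [4.0, 0.0, 1.0],
    3: [0.0, 3.0, 0.0],
    4: [3.0, 0.0, 1.0],
}
per_residue = residue_coupling([1, 2], residue_leverage)
strongest = max(per_residue, key=per_residue.get)
```

Mapping such values onto a color scale would reproduce the kind of picture described above, with strongly coupled residues standing out from weakly coupled ones.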

Reference Mitternacht S, Berezovsky IN (2011) Coherent conformational degrees of freedom as a structural basis for allosteric communication. PLoS Comput Biol 7: e1002301.