Recent developments in high-throughput reverse genetics (refs. 1,2) have revolutionized our ability to map gene function and interactions (refs. 3–6). The power of these approaches depends on their ability to identify functionally associated genes, which elicit similar phenotypic changes across several perturbations (chemical, environmental or genetic) when knocked out (refs. 7–9). However, owing to the large number of perturbations, these approaches have been limited to growth or morphological readouts (ref. 10). Here we use a high-content biochemical readout, thermal proteome profiling (ref. 11), to measure the proteome-wide protein abundance and thermal stability in response to 121 genetic perturbations in Escherichia coli. We show that thermal stability, and therefore the state and interactions of essential proteins, is commonly modulated, raising the possibility of studying a protein group that is particularly inaccessible to genetics. We find that functionally associated proteins have coordinated changes in abundance and thermal stability across perturbations, owing to their co-regulation and physical interactions (with proteins, metabolites or cofactors). Finally, we provide mechanistic insights into previously determined growth phenotypes (ref. 12) that go beyond the deleted gene. These data represent a rich resource for inferring protein functions and interactions.
The thermal proteome profiling data are available at http://ecoliTPP.shiny.embl.de. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD016589. The mass spectrometry metabolomics data have been deposited to the MassIVE repository with the dataset identifier MSV000084632. Data for protein complexes, pathways, and operons were retrieved from Ecocyc v21.1 (https://ecocyc.org/; ref. 56). The STRING database v10.5 was used (https://string-db.org/; ref. 57). Data referring to protein localization were retrieved from STEPdb v1.0 (http://stepdb.eu/; ref. 61). Cellular processes targeted by mutants in this study were derived from the Clusters of Orthologous Groups (COG) database (https://www.ncbi.nlm.nih.gov/research/cog-project/; ref. 62). Gene ontology annotations (release: 2020-01-01) were downloaded from http://geneontology.org.
Beltrao, P., Cagney, G. & Krogan, N. J. Quantitative genetic interactions reveal biological modularity. Cell 141, 739–745 (2010).
Costanzo, M. et al. Global genetic networks and the genotype-to-phenotype relationship. Cell 177, 85–100 (2019).
Typas, A. et al. Regulation of peptidoglycan synthesis by outer-membrane proteins. Cell 143, 1097–1109 (2010).
Gray, A. N. et al. Coordination of peptidoglycan synthesis and outer membrane constriction during Escherichia coli cell division. eLife 4, (2015).
Surma, M. A. et al. A lipid E-MAP identifies Ubx2 as a critical regulator of lipid saturation and lipid bilayer stress. Mol. Cell 51, 519–530 (2013).
Collins, S. R. et al. Functional dissection of protein complexes involved in yeast chromosome biology using a genetic interaction map. Nature 446, 806–810 (2007).
Nichols, R. J. et al. Phenotypic landscape of a bacterial cell. Cell 144, 143–156 (2011).
Costanzo, M. et al. A global genetic interaction network maps a wiring diagram of cellular function. Science 353, aaf1420 (2016).
Price, M. N. et al. Mutant phenotypes for thousands of bacterial genes of unknown function. Nature 557, 503–509 (2018).
Kritikos, G. et al. A tool named Iris for versatile high-throughput phenotyping in microorganisms. Nat. Microbiol. 2, 17014 (2017).
Savitski, M. M. et al. Tracking cancer drugs in living cells by thermal profiling of the proteome. Science 346, 1255784 (2014).
Herrera-Dominguez, L. & Typas, A. Exploring the dark (and not so dark) genome of E. coli, https://ecoli-darkgen.shinyapps.io/app-1/ (2020).
Babu, M. et al. Global landscape of cell envelope protein complexes in Escherichia coli. Nat. Biotechnol. 36, 103–112 (2018).
Wan, C. et al. Panorama of ancient metazoan macromolecular complexes. Nature 525, 339–344 (2015).
Tan, C. S. H. et al. Thermal proximity coaggregation for system-wide profiling of protein complex dynamics in cells. Science 359, 1170–1177 (2018).
Martinez Molina, D. et al. Monitoring drug target engagement in cells and tissues using the cellular thermal shift assay. Science 341, 84–87 (2013).
Bantscheff, M., Lemeer, S., Savitski, M. M. & Kuster, B. Quantitative mass spectrometry in proteomics: critical review update from 2007 to the present. Anal. Bioanal. Chem. 404, 939–965 (2012).
Mateus, A. et al. Thermal proteome profiling for interrogating protein interactions. Mol. Syst. Biol. 16, e9232 (2020).
Sridharan, S. et al. Proteome-wide solubility and thermal stability profiling reveals distinct regulatory roles for ATP. Nat. Commun. 10, 1155 (2019).
Mateus, A. et al. Thermal proteome profiling in bacteria: probing protein state in vivo. Mol. Syst. Biol. 14, e8242 (2018).
Becher, I. et al. Pervasive protein thermal stability variation during the cell cycle. Cell 173, 1495–1507 (2018).
Huang, J. X. et al. High throughput discovery of functional protein modifications by hotspot thermal profiling. Nat. Methods 16, 894–901 (2019).
Potel, C. M. et al. Impact of phosphorylation on thermal stability of proteins. Preprint at https://doi.org/10.1101/2020.01.14.903849 (2020).
Smith, I. R. et al. Identification of phosphosites that alter protein thermal stability. Preprint at https://doi.org/10.1101/2020.01.14.904300 (2020).
Becher, I. et al. Thermal profiling reveals phenylalanine hydroxylase as an off-target of panobinostat. Nat. Chem. Biol. 12, 908–910 (2016).
Baba, T. et al. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol. Syst. Biol. 2, 2006.0008 (2006).
Werner, T. et al. Ion coalescence of neutron encoded TMT 10-plex reporter ions. Anal. Chem. 86, 3594–3601 (2014).
Parker, D. J., Demetci, P. & Li, G. W. Rapid accumulation of motility-activating mutations in resting liquid culture of Escherichia coli. J. Bacteriol. 201, e00259-19 (2019).
Palmer, T. & Berks, B. C. The twin-arginine translocation (Tat) protein export pathway. Nat. Rev. Microbiol. 10, 483–496 (2012).
Koo, B. M. et al. Construction and analysis of two genome-scale deletion libraries for Bacillus subtilis. Cell Syst. 4, 291–305 (2017).
Qi, L. S. et al. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152, 1173–1183 (2013).
Lawson, M. J. et al. In situ genotyping of a pooled strain library after characterizing complex phenotypes. Mol. Syst. Biol. 13, 947 (2017).
Peters, J. M. et al. A comprehensive, CRISPR-based functional analysis of essential genes in bacteria. Cell 165, 1493–1506 (2016).
Kustatscher, G. et al. Co-regulation map of the human proteome enables identification of protein functions. Nat. Biotechnol. 37, 1361–1371 (2019).
Romanov, N. et al. Disentangling genetic and environmental effects on the proteotypes of individuals. Cell 177, 1308–1318 (2019).
Havugimana, P. C. et al. A census of human soluble protein complexes. Cell 150, 1068–1081 (2012).
Lalanne, J. B. et al. Evolutionary convergence of pathway-specific enzyme expression stoichiometry. Cell 173, 749–761 (2018).
Ghatak, S., King, Z. A., Sastry, A. & Palsson, B. O. The y-ome defines the 35% of Escherichia coli genes that lack experimental evidence of function. Nucleic Acids Res. 47, 2446–2454 (2019).
Mateus, A., Määttä, T. A. & Savitski, M. M. Thermal proteome profiling: unbiased assessment of protein state through heat-induced stability changes. Proteome Sci. 15, 13 (2017).
Shiver, A. L. et al. A chemical-genomic screen of neglected antibiotics reveals illicit transport of kasugamycin and blasticidin S. PLoS Genet. 12, e1006124 (2016).
Rousset, F. et al. Genome-wide CRISPR-dCas9 screens in E. coli identify essential genes and phage host factors. PLoS Genet. 14, e1007749 (2018).
Liu, X. et al. High-throughput CRISPRi phenotyping identifies new essential genes in Streptococcus pneumoniae. Mol. Syst. Biol. 13, 931 (2017).
Aoki, S. K. et al. Contact-dependent growth inhibition requires the essential outer membrane protein BamA (YaeT) as the receptor and the inner membrane transport protein AcrB. Mol. Microbiol. 70, 323–340 (2008).
Bernard, C. S., Sadasivam, M., Shiomi, D. & Margolin, W. An altered FtsA can compensate for the loss of essential cell division protein FtsN in Escherichia coli. Mol. Microbiol. 64, 1289–1305 (2007).
Malinverni, J. C. et al. YfiO stabilizes the YaeT complex and is essential for outer membrane protein assembly in Escherichia coli. Mol. Microbiol. 61, 151–164 (2006).
Sampson, B. A., Misra, R. & Benson, S. A. Identification and characterization of a new gene of Escherichia coli K-12 involved in outer membrane permeability. Genetics 122, 491–501 (1989).
Grenier, F., Matteau, D., Baby, V. & Rodrigue, S. Complete genome sequence of Escherichia coli BW25113. Genome Announc. 2, e01038-14 (2014).
Hughes, C. S. et al. Ultrasensitive proteome analysis using paramagnetic bead technology. Mol. Syst. Biol. 10, 757 (2014).
Hughes, C. S. et al. Single-pot, solid-phase-enhanced sample preparation for proteomics experiments. Nat. Protocols 14, 68–85 (2019).
Franken, H. et al. Thermal proteome profiling for unbiased identification of direct and indirect drug targets using multiplexed quantitative mass spectrometry. Nat. Protocols 10, 1567–1593 (2015).
Huber, W., von Heydebreck, A., Sültmann, H., Poustka, A. & Vingron, M. Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics 18 (Suppl 1), S96–S104 (2002).
Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
Cui, L. et al. A CRISPRi screen in E. coli reveals sequence-specific toxicity of dCas9. Nat. Commun. 9, 1912 (2018).
Haeussler, M. et al. Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR. Genome Biol. 17, 148 (2016).
Robin, X. et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12, 77 (2011).
Keseler, I. M. et al. The EcoCyc database: reflecting new knowledge about Escherichia coli K-12. Nucleic Acids Res. 45 (D1), D543–D550 (2017).
Szklarczyk, D. et al. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47 (D1), D607–D613 (2019).
Guzman, L. M., Belin, D., Carson, M. J. & Beckwith, J. Tight regulation, modulation, and high-level expression by vectors containing the arabinose PBAD promoter. J. Bacteriol. 177, 4121–4130 (1995).
Otsuka, Y. et al. GenoBase: comprehensive resource database of Escherichia coli K-12. Nucleic Acids Res. 43, D606–D617 (2015).
Saka, K. et al. A complete set of Escherichia coli open reading frames in mobile plasmids facilitating genetic studies. DNA Res. 12, 63–68 (2005).
Orfanoudaki, G. & Economou, A. Proteome-wide subcellular topologies of E. coli polypeptides database (STEPdb). Mol. Cell. Proteomics 13, 3674–3687 (2014).
Tatusov, R. L., Galperin, M. Y., Natale, D. A. & Koonin, E. V. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 28, 33–36 (2000).
We thank P. Phapale for help with metabolomics analysis, and H. Link and D. Bikard for strains and plasmids for the dCas9 work. This work was supported by the European Molecular Biology Laboratory. A.M. and K.M. were supported by a fellowship from the EMBL Interdisciplinary Postdoc (EI3POD) programme under Marie Skłodowska-Curie Actions COFUND (grant number 664726). C.V.G. is the recipient of an EMBO long-term postdoctoral fellowship and an add-on fellowship from the Christiane Nüsslein-Volhard-Stiftung. A.T. is supported by an ERC consolidator grant, uCARE.
The authors declare no competing interests.
Peer review information Nature thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Biological replicates show good reproducibility, with differences revealing biological phenomena.
a, Rarefaction analysis of the proteome coverage (proteins with at least two unique peptides in each mass spectrometry run) as a function of the number of mass spectrometry runs. b, Distribution of log2-transformed fold change differences between the two biological replicates. c, Scatter plot of protein fold changes between all biological replicate measurements (n = 1,512,475; all proteins, all temperatures, all mutants). r depicts Pearson correlation. d, Reproducibility of protein fold changes between biological replicate measurements at each temperature. e, Examples of replicate correlation for specific mutants, highlighting that flagellar proteins are common outliers in one of the two clones (nΔhemX = 13,150, nΔybaB = 12,313, nΔclpA = 12,950, nΔmrcB = 12,604, nΔfur = 12,543, nΔmlaA = 12,559, nΔlpp = 12,719; all proteins, all temperatures). f, Polymerase chain reaction of the promoter region of the flhDC operon (schematic on top) demonstrates the presence of insertions in mutant clones (gel on bottom, n = 1; for gel source data see Supplementary Fig. 2) with high flagellar protein expression (FliC fold-changes at the two lowest temperatures of each mutant replicate used as a proxy for abundance). g, Scatter plot of abundance and thermal stability z-scores of all proteins in all mutants (n = 170,150). r depicts Pearson correlation. h, Distribution of the number of mutants in which a protein is significantly altered (n = 1,764 proteins). Box plots as in Fig. 2a. i, Distribution of the number of proteins that are significantly altered in each mutant (n = 121 mutants). Box plots as in Fig. 2a.
Extended Data Fig. 2 Cellular processes targeted in this study and changes in thermal stability reflect protein complex architecture in E. coli mutants.
a, Distribution of cellular processes targeted in this study compared to the general distribution of the E. coli genome using Clusters of Orthologous Groups (COG). b, c, Schematic representation of protein complexes targeted by genetic perturbations in this study. The missing protein (encoded by the deleted gene) is highlighted by a dashed line and other complex members are coloured according to their thermal stability (b) or abundance (c) in that mutant. *|z-score| > 1.96 and q-value ≤ 0.05. ΔtolC data are from ref. 20.
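The significance criterion marked by the asterisk in b and c can be written as a simple predicate. The following is a minimal sketch in Python; the function name, the default arguments and the example scores are hypothetical illustrations and are not taken from the study's analysis code.

```python
def is_significant(z_score, q_value, z_cutoff=1.96, q_cutoff=0.05):
    """Return True when |z-score| > z_cutoff and q-value <= q_cutoff.

    The cut-offs default to the values stated in the caption; the
    function itself is an illustrative assumption.
    """
    return abs(z_score) > z_cutoff and q_value <= q_cutoff


# Hypothetical (z-score, q-value) pairs for four proteins in one mutant.
scores = {
    "protein_A": (2.4, 0.01),
    "protein_B": (-1.2, 0.001),
    "protein_C": (-3.1, 0.04),
    "protein_D": (2.8, 0.30),
}

# Keep only proteins that pass both thresholds.
significant_hits = [
    name for name, (z, q) in scores.items() if is_significant(z, q)
]
print(significant_hits)
```

Both conditions must hold, so a large z-score with a poor q-value (protein_D in this made-up example) is not called significant.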
Extended Data Fig. 3 Protein co-expression patterns provide insight into gene expression regulation.
a, Correlation of DegP and OmpF log2-transformed fold changes to control in each of the genetic perturbations probed here (n = 120, as OmpF is not detected in ΔompF) at each temperature (colour coded; n = 10). Mutants that lead to cell envelope stress (highlighted), and therefore to activation of the stress response (see also b), lead to upregulation of DegP and downregulation of OmpF. b, Schematic representation of regulation of the degP and ompF genes. The CpxAR two-component system regulates both genes, while EnvZ/OmpR regulates only ompF. Heat map shows Spearman’s rank correlation (calculated as in Fig. 3a) for proteins involved in regulation of degP and ompF.
Extended Data Fig. 4
a, Distribution of thermal stability z-scores of all proteins in the iron-sulfur cluster biosynthesis mutants, ΔiscA, ΔiscS and ΔiscU, according to their Gene Ontology annotation as iron-sulfur cluster binding proteins (nΔiscA = 41, nΔiscS = 41, nΔiscU = 40) or not (nΔiscA = 1,400, nΔiscS = 1,415, nΔiscU = 1,314). Box plots as in Fig. 2a. Significance assessed with two-sided Wilcoxon signed-rank test (PΔiscA = 3.9 × 10−5, PΔiscS = 9.5 × 10−11, PΔiscU = 7.7 × 10−5). b, Volcano plot of proteins that significantly change in their thermal stability (highlighted in red) in ΔtatB, showing that CueO is thermally destabilized. c, Total and periplasmic protein extraction of different CueO constructs shows that deletion of the Tat signal peptide (Δ28) and the full-length construct in ΔtatB retain CueO protein levels, but only a small fraction makes it to the periplasm. CueO was detected using mouse monoclonal anti-Flag antibody and goat anti-mouse IgG-HRP (n = 1). An SDS–PAGE gel was run in parallel and stained with Coomassie to confirm that periplasmic extraction was successful (n = 1). d, Cellular thermal shift assay (CETSA) of CueO fused to Flag peptide, either using the full-length protein (WT) or a version lacking the first 28 amino acids (Δ28; corresponding to the Tat signal peptide). Experiments performed in living cells in ΔcueO strain. CueO was detected using mouse monoclonal anti-Flag antibody and goat anti-mouse IgG-HRP (n = 1). As a loading control, run on the same gel, rabbit anti-LpoB antibody (ref. 3) and goat anti-rabbit IgG-HRP were used (n = 1). e, As in d, but comparing the thermal stability of CueO fused to Flag peptide, either in ΔcueO (WT) or ΔcueOΔtatB (Δ) live cells (n = 1). f, As in d, but comparing thermal stability of Δ28-CueO in ΔcueO strain and full-length CueO in ΔcueOΔtatB (n = 1). g, CETSA of Δ28-CueO in lysate of ΔcueO strain after addition of 4 mM CuCl2 or the same volume of vehicle (n = 1). For gel source data see Supplementary Fig. 2.
Extended Data Fig. 5
a, log2-transformed fold change of FtsK protein levels in each mutant compared to control at each temperature. FtsK is strongly thermally destabilized in the ΔphoP mutant, and the ftsK knockdown is synthetically lethal with the phoP deletion (Fig. 2d). b, As in a, but for ParC. ParC is strongly thermally stabilized in the ΔclpS mutant and thermally destabilized in the ΔphoP mutant, and the parC knockdown is synthetically lethal with both. Synthetic lethality is also apparent in the ΔahpC, ΔamiA and ΔenvC mutants, despite the absence of changes in ParC thermal stability (Fig. 2e).
Extended Data Fig. 6 Protein correlation profiling recapitulates known biological interactions with abundance and thermal stability data having different contribution to functional associations.
a, Distribution of Spearman’s rank correlation of all protein pair comparisons compared to known operons, protein complexes, and metabolic pathways. Distribution statistics refer to all protein pairs. b, ROC analysis based on the decreasing absolute Spearman’s rank correlation compared to interactions in the STRING database at different cut-offs of the combined STRING score. c–e, Spearman’s rank correlation of protein pairs belonging to the same operon (c), protein complex (d), or metabolic pathway (e) using solely changes in abundance (x axis) or changes in thermal stability (y axis). Protein pairs belonging to the same operon are highlighted in purple. Distributions of Spearman’s rank correlation are shown outside the axes. n = 446 for operons, n = 348 for protein complexes, and n = 801 for metabolic pathways. Proteins belonging to the same operon or complex mostly have coordinated abundance changes, while proteins belonging to the same pathway often also have coordinated thermal stability. f, Schematic representation of the UDP-N-acetylmuramoyl-pentapeptide biosynthesis pathway. g, Example of a protein pair (DdlA and MurC) co-changing in thermal stability (rS = 0.79), but not abundance (rS = −0.13) across 81 genetic perturbations. Each data point corresponds to the abundance or thermal stability z-score in one of the genetic perturbations (colour-coded). h, Heat map of Spearman’s rank correlation of all quantified members of the UDP-N-acetylmuramoyl-pentapeptide biosynthesis pathway based on co-changes in abundance (top triangle) or thermal stability alone (bottom triangle).
Extended Data Fig. 7
a, Heat map of Spearman’s rank correlation (lower triangle; based on protein abundance and thermal stability data across 121 mutants, as in Fig. 3a) and the physical distance (upper triangle; based on ribosome structure, PDB: 4YBB, and using the centres of mass of each protein) between the ribosome members. At the bottom, 30S and 50S ribosomal subunits are shown in purple and green, respectively, and lower triangle data are clustered hierarchically. b, c, High-resolution structure of the ribosome coloured according to the heat map clusters from a (b) or 30S and 50S ribosomal subunits (c). d–g, ATP synthase members (d, e; PDB: 5T4O) and respiratory complex I (f, g; PDB: 4HEA), as in a–c. h, Closely located members of protein complexes are more likely to be similarly regulated across different conditions. Spearman’s rank correlation plotted against the distance between complex subunits for the three complexes represented in the figure, with an apparent negative correlation. Box plots are as in Fig. 2a.
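The Spearman's rank correlations reported in these heat maps compare two proteins' score profiles across mutants. As a minimal sketch of how such a coefficient can be computed, the Python below ranks each profile (averaging tied ranks) and then takes the Pearson correlation of the ranks. The function names and the example z-scores are hypothetical; this is not the study's own pipeline, which may differ in how it combines abundance and thermal stability data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical z-scores of two proteins across six mutants.
protein_1 = [0.2, -1.5, 2.1, 0.9, -0.3, 1.4]
protein_2 = [0.1, -2.0, 1.8, 1.1, -0.6, 0.7]
print(round(spearman(protein_1, protein_2), 2))
```

Because only ranks enter the calculation, the coefficient is insensitive to the scale of the scores and to monotonic but non-linear relationships.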
Extended Data Fig. 8 GO enrichments of co-changing partners of proteins of unknown function can reveal their function.
Examples of links between proteins of unknown function and GO terms that their co-changing proteins are enriched in. Some of these links are supported by external evidence (node colour, see Supplementary Discussion). Edges are coloured according to the enrichment P value using the Fisher’s exact test after correction for multiple comparison with the Benjamini–Hochberg procedure.
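The enrichment statistics described above combine a Fisher's exact test per GO term with Benjamini–Hochberg correction across terms. The Python below is a minimal sketch of both steps under stated assumptions: the test is taken to be one-sided (over-representation), which the caption does not specify, and all function names and counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact P value for over-representation.

    N: proteins in the background; K: background proteins with the GO term;
    n: co-changing partners; k: partners carrying the GO term.
    Returns the hypergeometric probability of observing k or more.
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts (k, n, K, N) for three GO terms tested for one protein.
tests = [(6, 20, 40, 1700), (2, 20, 150, 1700), (1, 20, 300, 1700)]
raw_p = [fisher_enrichment_p(*t) for t in tests]
adj_p = benjamini_hochberg(raw_p)
for counts, p, q in zip(tests, raw_p, adj_p):
    print(counts, f"P = {p:.3g}", f"adjusted P = {q:.3g}")
```

Exact integer binomial coefficients keep the tail sum free of rounding error until the final division, which is convenient for the small counts typical of GO term tests.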
Extended Data Fig. 9 Metabolite levels correlate with thermal stability of enzyme producing or using the metabolite.
a, b, Scatter plot of metabolite log2 fold-changes in mutant compared to wild-type strain (y axis) and protein abundance (a) or thermal stability (b) in each mutant for enzymes that directly interact with the metabolite (x axis) (n = 19 mutants, except for G6P/F6P–PhoA (n = 7), 2-oxoglutarate–SucA (n = 18), succinate–SdhD (n = 12), malate–FumA (n = 6), and malate–FumB (n = 12)). r depicts the Pearson correlation coefficient for each metabolite-enzyme pair. The black line represents the linear fit and the grey shading the 95% confidence interval of the fit. c, Twenty strains used for targeted metabolomics analysis. d, Distribution of Pearson correlation coefficients for metabolite levels in each mutant and abundance or thermal stability of enzymes that directly interact with the metabolite (upstream and downstream of metabolite, as in a and b). Box plots are as in Fig. 2a, with all data points shown on top of the box plots (nG6P/F6P = 6, nPEP = 5, nPyruvate = 8, n2-oxoglutarate = 4, nSuccinate = 6, nMalate = 9).
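The per-pair Pearson coefficient and linear fit shown in a and b can be sketched as follows. This is an illustrative Python example, not the study's code: the function name and data are hypothetical, mutants in which either value is missing are dropped (an assumption made here to mirror the varying n in the caption), and the 95% confidence interval of the fit is not computed.

```python
from math import sqrt


def pearson_and_fit(x, y):
    """Return (r, slope, intercept) for paired values, skipping missing pairs.

    x: enzyme score per mutant (abundance or thermal stability).
    y: metabolite log2 fold change per mutant.
    None marks a value that was not quantified in that mutant.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least three complete pairs")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant values")
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx  # ordinary least-squares slope of y on x
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical enzyme thermal stability scores and metabolite log2 fold changes.
stability = [0.5, -1.2, None, 2.0, 0.1, -0.4]
metabolite = [0.3, -0.9, 0.2, 1.1, None, -0.1]
r, slope, intercept = pearson_and_fit(stability, metabolite)
print(f"r = {r:.2f}, fit: y = {slope:.2f}x + {intercept:.2f}")
```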
Extended Data Fig. 10 Changes in protein abundance and thermal stability explain growth phenotypes of E. coli mutants.
a, Scatter plot of number of significantly affected proteins (abundance or thermal stability) in each mutant (x axis) and the number of significant growth phenotypes of the same mutant (y axis; data from ref. 12). P refers to the correlation P value and n to the number of mutants. b, Scatter plot of MdtK abundance in mutants profiled in this study and their sensitivity to 80 mM metformin (data from ref. 12; r = 0.44; n = 119 mutants). c, d, Spot assay for the indicated strains overexpressing mdtK, ahpC or cpxA, or a control empty plasmid in plates containing 0–80 mM metformin. Cells were diluted to OD578 = 0.5, serially diluted in tenfold steps, and spotted on LB agar plates containing 10 μg ml−1 tetracycline (to maintain plasmid), 0.1 mM IPTG (to induce expression of encoded gene), and metformin as indicated. e, As in b, but showing correlation of RecR abundance and UV exposure for 18 s (r = 0.53; n = 99 mutants). f, Schematic representation of the ybaB-recR operon and protein abundance scores in the ΔybaB mutant. g, Spot assay for the indicated strains overexpressing ybaB, recR, or a control empty plasmid after exposure to UV with a total energy of 85 mJ cm−2 or control non-exposed plate. Cells were diluted to OD578 = 0.1 and then serially diluted in tenfold steps, and spotted on LB agar plates containing 50 μg ml−1 ampicillin (to maintain plasmid) and 0.1 mM IPTG (to induce expression of encoded gene).
This file contains Supplementary Figures 1-2 and Supplementary Discussion.
Details on mutants used in this study.
Proteins identified in this study.
Results from mass spectrometry experiments. Relative log2 fold-changes of mutant vs control.
GO enrichment of highly variable proteins.
Abundance and thermal stability scores.
Protein correlation profiling.
GO enrichment of highly correlated proteins for each protein.
Glycolysis and TCA cycle metabolite levels in 19 mutants.
Significant correlations between growth phenotypes in different chemical and environmental perturbations and protein abundance or thermal stability. Only correlations with at least 60 data points and adjusted P value < 0.01 were considered significant.
Cite this article
Mateus, A., Hevler, J., Bobonis, J. et al. The functional proteome landscape of Escherichia coli. Nature 588, 473–478 (2020). https://doi.org/10.1038/s41586-020-3002-5