DirectContacts2: A network of direct physical protein interactions derived from high throughput mass spectrometry experiments
Proteins carry out cellular functions by self-assembling into functional complexes, a process that depends on direct physical interactions between components. While tools like AlphaFold and RoseTTAFold have advanced structure prediction, they remain difficult to scale to the full human proteome. DirectContacts2 addresses this challenge by integrating diverse large-scale protein interaction datasets, including AP/MS (BioPlex1–3, Boldt et al., Hein et al.), biochemical fractionation (Wan et al.), proximity labeling (Gupta et al., Youn et al.), and RNA pulldown (Treiber et al.), to predict whether ~26 million human protein pairs interact directly or indirectly.
Funding
NIH R00, NSF/BBSRC
Citation
Erin R. Claussen, Miles D Woodcock-Girard, Samantha N Fischer, Kevin Drew
References
Kevin Drew, Christian L. Müller, Richard Bonneau, Edward M. Marcotte (2017) Identifying direct contacts between protein complex subunits from their conditional dependence in proteomics datasets. PLOS Computational Biology 13(10): e1005625. https://doi.org/10.1371/journal.pcbi.1005625
Samantha N. Fischer, Erin R Claussen, Savvas Kourtis, Sara Sdelci, Sandra Orchard, Henning Hermjakob, Georg Kustatscher, Kevin Drew. hu.MAP3.0: Atlas of human protein complexes by integration of > 25,000 proteomic experiments. Molecular Systems Biology 1–33 (2025). doi: 10.1038/s44320-025-00121-5.
Erickson, Nick, Jonas Mueller, Alexander Shirkov, Hang Zhang, Pedro Larroy, Mu Li, and Alexander Smola. "Autogluon-tabular: Robust and accurate automl for structured data." arXiv preprint arXiv:2003.06505 (2020).
Huttlin et al. Dual proteome-scale networks reveal cell-specific remodeling of the human interactome. Cell. 2021 May 27;184(11):3022-3040.e28. doi: 10.1016/j.cell.2021.04.011.
Huttlin et al. Architecture of the human interactome defines protein communities and disease networks. Nature. 2017 May 25;545(7655):505-509. doi: 10.1038/nature22366.
Treiber et al. A Compendium of RNA-Binding Proteins that Regulate MicroRNA Biogenesis. Mol Cell. 2017 Apr 20;66(2):270-284.e13. doi: 10.1016/j.molcel.2017.03.014.
Boldt et al. An organelle-specific protein landscape identifies novel diseases and molecular mechanisms. Nat Commun. 2016 May 13;7:11491. doi: 10.1038/ncomms11491.
Youn et al. High-Density Proximity Mapping Reveals the Subcellular Organization of mRNA-Associated Granules and Bodies. Mol Cell. 2018 Feb 1;69(3):517-532.e11. doi: 10.1016/j.molcel.2017.12.020.
Gupta et al. A Dynamic Protein Interaction Landscape of the Human Centrosome-Cilium Interface. Cell. 2015 Dec 3;163(6):1484-99. doi: 10.1016/j.cell.2015.10.065.
Wan, Borgeson et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015 Sep 17;525(7569):339-44. doi: 10.1038/nature14877. Epub 2015 Sep 7.
Hein et al. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell. 2015 Oct 22;163(3):712-23. doi: 10.1016/j.cell.2015.09.053. Epub 2015 Oct 22.
Huttlin et al. The BioPlex Network: A Systematic Exploration of the Human Interactome. Cell. 2015 Jul 16;162(2):425-40. doi: 10.1016/j.cell.2015.06.043.
Reimand et al. g:Profiler-a web server for functional interpretation of gene lists (2016 update). Nucleic Acids Res. 2016 Jul 8;44(W1):W83-9. doi: 10.1093/nar/gkw199.
Associated Code
Code examples using the DirectContacts2 model can be found on our GitHub. All feature matrices and associated files can be found in the DirectContacts2 dataset.
Usage
Accessing and using the model
DirectContacts2 was constructed using AutoGluon, an AutoML tool. Its TabularPredictor class is used to train, test, and make predictions with the model.
AutoGluon can be installed with pip:
$ pip install autogluon==0.8.2
Then it can be imported as:
>>> from autogluon.tabular import TabularPredictor
Note that AutoGluon version 0.8.2 must be used to perform operations with our model.
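As a minimal sketch of the steps above, the following checks the installed AutoGluon version and then loads a saved predictor. The helper names (`installed_autogluon_version`, `is_supported`, `load_predictor`, `predict_direct_contacts`) and the `model_dir` path are hypothetical and are not part of the DirectContacts2 code base; the sketch also assumes the model files have already been downloaded to a local directory and that AutoGluon was installed through the `autogluon` package as shown above.

```python
from importlib import metadata
from typing import Optional

# Version required by this model card.
REQUIRED_AUTOGLUON = "0.8.2"


def installed_autogluon_version() -> Optional[str]:
    """Return the installed `autogluon` version, or None if it is not installed."""
    try:
        return metadata.version("autogluon")
    except metadata.PackageNotFoundError:
        return None


def is_supported(version: Optional[str]) -> bool:
    """True only when the given version string matches the required version."""
    return version == REQUIRED_AUTOGLUON


def load_predictor(model_dir: str):
    """Load a saved TabularPredictor from a local directory.

    `model_dir` is a hypothetical path to an already downloaded copy of the model.
    """
    found = installed_autogluon_version()
    if not is_supported(found):
        raise RuntimeError(
            f"autogluon=={REQUIRED_AUTOGLUON} is required, found {found!r}"
        )
    # Imported lazily so the version check can run without AutoGluon installed.
    from autogluon.tabular import TabularPredictor

    return TabularPredictor.load(model_dir)


def predict_direct_contacts(predictor, features):
    """Return per-class probabilities for each row of a feature matrix.

    `features` is assumed to be a pandas DataFrame with the same feature
    columns the model was trained on.
    """
    return predictor.predict_proba(features)


# Example usage (path is a placeholder):
# predictor = load_predictor("path/to/DirectContacts2_model")
# probabilities = predict_direct_contacts(predictor, feature_matrix)
```

Checking the version before importing gives a clearer error message than a failed load; the notebooks on GitHub remain the reference for the actual workflow.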
To use the model and make predictions, we provide two full code examples as Jupyter notebooks: one using the full feature matrix and one using the test feature matrix.
All feature matrices can be pulled using the HuggingFace 'datasets' library; examples are available on our GitHub and on our DirectContacts2 HuggingFace dataset.
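Pulling a feature matrix with the 'datasets' library might look roughly like the sketch below. The function name `load_feature_matrix` is hypothetical, and the repository ID and file name are deliberately left as parameters because they are not given here; the sketch further assumes the file is in a format that `load_dataset` can infer from its extension, such as CSV or Parquet.

```python
def load_feature_matrix(repo_id: str, data_file: str, split: str = "train"):
    """Download one feature matrix from a HuggingFace dataset repository.

    `repo_id` and `data_file` are placeholders to be filled in from the
    DirectContacts2 dataset page. Returns a pandas DataFrame.
    """
    if not repo_id or not data_file:
        raise ValueError("both repo_id and data_file must be non-empty")

    # Imported lazily so argument validation works without `datasets` installed.
    from datasets import load_dataset

    # When `data_files` is a single file, its rows land in the "train" split.
    dataset = load_dataset(repo_id, data_files=data_file, split=split)
    return dataset.to_pandas()


# Example usage (both arguments are placeholders):
# features = load_feature_matrix("<dataset-repo-id>", "<feature-matrix-file>")
```

The resulting DataFrame can then be passed to a loaded TabularPredictor for prediction.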
Model card authors
Samantha Fischer (sfisch6@uic.edu)