Toward Personal Proteomics

Last week Attila Csordas of EBI announced the initial results of our collaboration, which aims to catalog his salivary proteome and measure changes in it. We co-released these results publicly on PRIDE, EBI’s proteomic data repository, and on Proteome Cluster, our pipeline platform for proteomic data analysis. Here we lay out our motivations and goals for undertaking this collaboration.

The fields of genomics and proteomics have long been searching for biomolecular markers for use in disease diagnosis and prognosis. Marker development has been difficult. There are many reasons for this, but the multifactorial and heterogeneous nature of disease is prime among them. It has long been known that cancer is not a single disease, and studies funded by NCI’s The Cancer Genome Atlas have illustrated this dramatically (Jones, et al., Parsons, et al.). A recent study of kidney cancer published in NEJM, which has garnered significant attention, demonstrated that substantial genetic heterogeneity exists at different sites within a single tumor. This complicates the use of discrete sets of biomarkers derived from an individual as a tool to guide therapy even within that same individual. Broad use of biomarkers derived from multifactorial and heterogeneous diseases across individuals will be a much larger challenge.

It is known that markers of disease can be detected in various biofluids such as blood and saliva. Our hypothesis is that early indicators of disease may be detectable as changes in the baseline molecular profiles of these fluids. Testing this requires establishing baseline profiles within and across healthy individuals over the course of daily life and then regularly monitoring those profiles for change. There is also growing appreciation for the role that the organisms constituting the microbiome play in maintaining general health and possibly in disease. Even where the microbiome is not itself involved in a disease, the microbiome in a given microenvironment may change in response as the disease alters that local microenvironment. These derivative changes may be more easily detected than the changes which drive the disease.
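The baseline-then-monitor idea can be illustrated with a minimal sketch. It is not our pipeline's method. It assumes each sample has already been reduced to a mapping from protein identifier to an abundance value, and it uses a plain per-protein z-score against the baseline samples as a stand-in for whatever statistic proves appropriate. The function names, the abundance representation and the threshold of 3.0 are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def baseline_stats(baseline_samples):
    """Per-protein (mean, sample standard deviation) across baseline samples.

    Each sample is a dict of protein identifier -> abundance (hypothetical
    representation). A protein missing from a sample is treated as 0.0.
    At least two baseline samples are needed for a standard deviation.
    """
    proteins = set().union(*(sample.keys() for sample in baseline_samples))
    stats = {}
    for protein in proteins:
        values = [sample.get(protein, 0.0) for sample in baseline_samples]
        stats[protein] = (mean(values), stdev(values))
    return stats


def flag_changes(stats, new_sample, z_threshold=3.0):
    """Return protein -> z-score for proteins that deviate from baseline.

    Proteins with zero baseline variance are skipped because a z-score
    is undefined for them; this is a simplification of the sketch.
    """
    flagged = {}
    for protein, (mu, sigma) in stats.items():
        if sigma == 0:
            continue
        z = (new_sample.get(protein, 0.0) - mu) / sigma
        if abs(z) >= z_threshold:
            flagged[protein] = z
    return flagged
```

A real analysis would also have to deal with normalization between runs, missing values and multiple testing across thousands of proteins, none of which this sketch attempts.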

To that end we established a collaboration with Attila, a scientist experienced in working with and interpreting complex proteomic data, to establish a baseline salivary proteomic profile. When searching the data we included a library of the organisms currently associated with the human oral microbiome, which number over 1000. Proteome Cluster permits interrogation of large proteomic data sets against very large sequence libraries in a reasonable time frame by utilizing on-demand, scalable compute clusters on Amazon Web Services infrastructure.

At this early stage we are optimizing tools and methods throughout the pipeline, including sample collection, sample preparation, enrichment and fractionation, liquid chromatography and mass spectrometry, and data searching, analysis and presentation. In later posts we will discuss some of the issues these methods raise for salivary proteomic profiling of human and oral microbiome-derived proteins, including potential confounders, susceptibility to bias and the inherent challenge of detecting very small changes against a very large, very noisy background.

Cross-posted to
