R Dannenfelser, M Nome, A Tahiri, J Ursini-Siegel, HK Moen Vollan, VD Haakensen, Å Helland, B Naume, C Caldas, A-L Børresen-Dale, VN Kristensen, OG Troyanskaya
The tumor microenvironment is now widely recognized for its role in tumor progression, treatment response, and clinical outcome. The intratumoral immunological landscape, in particular, has been shown to exert both pro-tumorigenic and anti-tumorigenic effects. Identifying immunologically active or silent tumors may be an important indication for the administration of therapy, and detecting early infiltration patterns may uncover factors that contribute to early risk. Thus far, direct, detailed studies of the cellular composition of tumor infiltrates have been limited: some studies give approximate quantifications using immunohistochemistry, while other, smaller studies obtain detailed measurements by isolating cells from excised tumors and sorting them by flow cytometry. Here we use a machine learning-based approach to identify lymphocyte markers with which we can quantify the presence of B cells, cytotoxic T lymphocytes, T-helper 1 cells, and T-helper 2 cells in any gene expression data set, and we apply this approach to studies of breast tissue. By leveraging over 2,100 samples from existing large-scale studies, we find inherent cell heterogeneity in clinically characterized immune infiltrates, a strong link between estrogen receptor activity and infiltration in normal and tumor tissues, changes in infiltration with genomic complexity, and characteristic differences in lymphocyte expression among molecular groupings. With our extensible methodology for capturing cell-type-specific signal, we systematically studied immune infiltration in breast cancer, finding an inverse correlation between beneficial lymphocyte infiltration and estrogen receptor activity in normal breast tissue, and reduced infiltration in estrogen receptor-negative tumors with high genomic complexity.