知識的價值不在于占有,而在于使用。

生信自學網-速科生物-生物信息學數據庫挖掘視頻教程

當前位置: 主頁 > 腫瘤免疫 >

免疫浸潤分析方法信息基本原理都在這里

時間:2019-07-30 11:19來源:生信自學網 作者:樂偉 點擊:
去卷積方法 GSEA分析方法
下載本文文獻,關注微信公眾號:biowolf_cn,回復“免疫浸潤”
1. 去卷積方法

CIBERSORT

網址:
https://cibersort.stanford.edu/
CIBERSORT是2015年在Nature Methods發表的一個方法,現在被引量為84,其方法摘要為:
a method for characterizing cell composition of complex tissues from their gene expression profiles. When applied to enumeration of hematopoietic subsets in RNA mixtures from fresh, frozen, and fixed tissues, including solid tumors, CIBERSORT outperformed other methods with respect to noise, unknown mixture content, and closely related cell types. CIBERSORT should enable large-scale analysis of RNA mixtures for cellular biomarkers and therapeutic targets (http://cibersort.stanford.edu).
在Cell Reports那篇文章提供的TCIA網址中得到了關于LM22的細胞占比數據,之前不理解,這篇文章有詳細的說明:
To assess the feasibility of leukocyte deconvolution from bulk tumors, we designed and validated a leukocyte gene signature matrix, termed LM22. It contains 547 genes that distinguish 22 human hematopoietic cell phenotypes, including seven T cell types, naïve and memory B cells, plasma cells, NK cells, and myeloid subsets (Supplementary Table 1, Supplementary Fig. 2, and Online Methods). Cell subsets can be further grouped into 11 major leukocyte types based on shared lineage (Supplementary Table 1).
雖然在這篇文章末尾提到了
We anticipate that CIBERSORT will prove valuable for analysis of cellular heterogeneity in microarray or RNA-Seq data derived from fresh, frozen, and fixed specimens, thereby complementing methods that require living cells as input.
表明文章發表時,CIBERSORT還沒有用于RNAseq數據(這點在TIMER中以一個優勢被提及)。但從Cell Reports文章《Pan-cancer Immunogenomic Analyses Reveal Genotype-Immunophenotype Relationships and Predictors of Response to Checkpoint Blockade》創建的TCIA網站來看,CIBERSORT方法已經應用到了TCGA的RNAseq數據上。

TIMER

《Comprehensive analyses of tumor immunity: implications for cancer immunotherapy》是2016年發表于Genome Biology的一篇計算免疫浸潤的方法。

它的主要工作結果為:
We developed a computational method to estimate the abundance of six tumor-infiltrating immune cell types (B cells, CD4 T cells, CD8 T cells, neutrophils, macrophages, and dendritic cells) to study 23 cancer types in The Cancer Genome Atlas (TCGA)
設計的大致流程如下:
The prerequisite data include tumor purity, tumor gene expression profiles (as transcript per million reads (TPM)) from TCGA, and an external reference dataset of purified immune cells. Tumor purity is critical to selecting genes informative for deconvolving immune cells in the tumor tissue and was inferred from copy number alteration data using an R package, CHAT, we have developed [15] (Fig. 1a). Our purity estimation method has been validated using diluted series with known tumor/normal mixture ratios and agreed with previous methods and orthogonal estimations [16]. The distributions of tumor purity showed large variations among different samples across most of the 23 TCGA cancer types (Additional file 1: Figure S1). For each cancer dataset, batch effects between TCGA and the external reference data sets were removed using ComBat [17] (Fig. 1b). Next we selected genes whose expression levels are negatively correlated with tumor purity (Fig. 1c; Additional file 1: Figure S2; Additional file 2: Table S1), an indication that these genes are expressed by stromal cells in the tumor microenvironment. Across all 23 cancers, informative genes selected from the above steps are significantly enriched for a predefined immune signature [18] (Fig. 1d). This result indicates that large numbers of immune cell-specific genes are highly expressed in the tumor microenvironment. Finally, we used constrained least squares fitting [19] on the informative immune signature genes to infer the abundance of the six immune cell types (Fig. 1e).
它指出了自己開發的方法相對于CIBERSORT的優點:
Our work first provided a systematic prognostic landscape of different tumor-infiltrating immune cells in diverse cancer types. We compared our results with two recent studies on the same topic [6, 11]. The method used in Gentles et al., CIBERSORT [35], is currently only applicable to microarray data, thus unable to analyze the TCGA RNA-seq data. Therefore, our immune component estimation is a unique addition to TCGA for future integrative analyses of tumor–immune interactions. By including more immune cell types into regression, CIBERSORT inference also suffered from statistical co-linearity that might have resulted in biased estimations (Additional file 7: Table S6; Additional file 8). Due to this limitation, although Gentles et al. studied more cell types, they reported few significant prognostic immune predictors, without correction for other clinical confounders. In contrast, we observed many more significant clinical associations with the correction of multiple cofactors.
值得注意的是,在文末提供的方法部分,作者詳細地介紹了方法的計算流程,重點強調了
To note, coefficients f estimated using this method are the relative abundance of immune cells. The scale of the estimation of an individual immune cell type is determined by the variance of the corresponding reference data Xr. Therefore, f are not comparable between cancer types or different immune cells. Source codes for TIMER and downstream statistical analysis as well as related data files are available at http://cistrome.org/TIMER/download.html.
說明最后的計算結果是一個系數,它表明的只是免疫細胞相對的豐富度。
2. GSEA
個人認為最值得關注的還是Cell Reports的文章Pan-cancer Immunogenomic Analyses Reveal Genotype-Immunophenotype Relationships and Predictors of Response to Checkpoint Blockade。它于2017年發表,本身的背景中就有提及前面兩種方法去卷積方法。

文章對TIMER的觀點是:
RNA expression data corrected for tumor purity were used to estimate infiltration of B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells ([Li et al., 2016](javascript:void(0);)). However, although such analyses of few major cell types are helpful for identifying clinical associations, higher resolution of the TIL landscape is required in order to dissect tumor-immune cell interactions and identify prognostic and predictive markers.
因此,該文章提供了更為細致的免疫浸潤組分圖景,有近30種細胞類型
Thus, it is of utmost importance to provide a comprehensive view of the intratumoral immune landscape including memory cells, cytotoxic cells (CD8+ T cells, natural killer [NK] cells, and NK T [NKT] cells), as well as immunosuppressive cells (Tregs and myeloid-derived suppressor cells [MDSCs]).
we estimated 28 subpopulations of TILs including major types related to adaptive immunity: activated T cells, central memory (Tcm), effector memory (Tem) CD4+ and CD8+ T cells, gamma delta T (Tγδ) cells, T helper 1 (Th1) cells, Th2 cells, Th17 cells, regulatory T cells (Treg), follicular helper T cells (Tfh), activated, immature, and memory B cells, as well as cell types related to innate immunity, such as macrophages, monocytes, mast cells, eosinophils, neutrophils, activated, plasmacytoid, and immature dendritic cells (DCs), NK cells, natural killer T (NKT) cells, and MDSCs.
在構建的TCIA中,它使用了CIBERSORT去卷積方法進行免疫浸潤的計算,方法是通過將TCGA的RNAseq數據轉化為CIBERSORT能夠處理的MicroArray數據集。
重點是,該文章提供了一種新的免疫浸潤細胞組分計算的思路:就是使用基因富集分析。
下面介紹了該方法及其優點
Our approach is based on the use of metagenes, i.e., non-overlapping sets of genes that are representative for specific immune cell subpopulations and are neither expressed in CRC cell lines nor in normal tissue. The expression of these sets of metagenes is then used to analyze statistical enrichment using gene set enrichment analysis (GSEA). The advantage of the metagene approach is the robustness of the method due to two characteristics: (1) the use of a set of genes instead of single genes that represent one immune subpopulation, because the use of single genes as markers for immune subpopulations can be misleading as many genes are expressed in different cell types; and (2) the assessment of relative expression changes of a set of genes in relation to the expression of all other genes in a sample. Thus, the calculations are less sensitive to noise resulting from sample impurity or sample preparation compared with the deconvolution methods.
這兩種方法都能夠在TCIA上使用:
We built a model to transform RNA-sequencing data to microarray data considering the TCGA samples for which microarray and RNA-sequencing data were available (n = 550; see Experimental Procedures). We present here only the results from the GSEA method, which depicts a more comprehensive picture of the tumor-suppressive or tumor-promoting roles of TILs. However, we make both GSEA and deconvolution data available on the TCIA website.

Enrichment bubble plot.png
該方法提供了免疫細胞在病人樣本中的富集度(上圖是以絕對數目顯示,也可以顯示富集的相對比例),通過NES(averaged normalized enrichment score)與FDR進行樣本的預篩。默認設定為NES>0, Q-value(FDR) < 0.1。可以下載相應的文件,包含樣本名(Barcode),NES和Q-value 3列。
入門免疫浸潤,可以學習生信自學網推出的免疫浸潤系列課程:

TCGA腫瘤免疫細胞浸潤模式挖掘
GEO數據庫免疫細胞浸潤視頻
甲基化免疫細胞浸潤模式
TCGA數據庫腫瘤微環境
TCGA數據庫腫瘤突變負荷

下載本文文獻,關注微信公眾號:biowolf_cn,回復“免疫浸潤”


責任編輯:樂偉
作者申明:本文版權屬于生信自學網(微信號:18520221056)未經授權,一律禁止轉載!
加生信自學網群
BioWolf二維碼生成器
頂一下
(1)
100%
踩一下
(0)
0%
------分隔線----------------------------
發表評論
請自覺遵守互聯網相關的政策法規,嚴禁發布色情、暴力、反動的言論。
評價:
表情:
用戶名: 驗證碼:點擊我更換圖片
TCGA腫瘤微環境
推薦內容
單基因發文套路
m6A