Statistical method mcRigor improves the robustness of metacell partitioning in single-cell data analysis

This article was written by Pan Liu, a postdoctoral researcher at UCLA and the Fred Hutchinson Cancer Center. Pan is the first author of the mcRigor article in Nature Communications.

Single-cell technologies have advanced rapidly in recent years, providing unprecedented opportunities to study cellular differentiation, dynamic changes in cell populations, and gene regulation. In addition to the widely used single-cell RNA sequencing (scRNA-seq) [1,2], new modalities such as single-cell sequencing of chromatin accessibility (scATAC-seq) [3,4] and joint profiling of gene expression and chromatin accessibility (scMultiome) [5] have enabled the dissection of cellular heterogeneity at single-cell resolution across multiple omics layers. However, the data produced by these technologies are generally sparse, mainly because only a limited fraction of molecules is captured from each cell, and because of incomplete reverse transcription and uneven amplification. As a result, many expressed genes go undetected, which complicates downstream analysis [6].

Fig. 1. The mcRigor publication.

To reduce data sparsity and noise, researchers proposed the "metacell" concept, in which cells with similar profiles are aggregated into one representative unit, a metacell, whose expression is obtained by aggregating the profiles of its member cells, thus strengthening the signal and reducing the noise. However, existing metacell construction methods tend to produce different metacell partitions and are very sensitive to hyperparameter settings, especially the average metacell size [7]. This inconsistency makes it difficult for users to determine which metacell partition is reliable and which metacell profiles can actually be trusted. As a result, the robustness of downstream analysis decreases, and the potential of metacells as a standard data representation across tasks and omics modalities remains limited.
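
As a rough illustration of the aggregation step described above, the following Python sketch averages the profiles of the cells assigned to each metacell and compares the proportion of zeros before and after. The function name `aggregate_metacells`, the use of a plain average, and the toy count matrix are assumptions made for this example; they are not taken from the mcRigor package or from any particular metacell method.

```python
import numpy as np

def aggregate_metacells(counts, labels):
    """Average the profiles of the cells assigned to each metacell.

    counts: (n_cells, n_genes) array of observed counts.
    labels: length-n_cells array; labels[i] is the metacell of cell i.
    Returns (metacell_ids, profiles) with one averaged row per metacell.
    """
    counts = np.asarray(counts, dtype=float)
    labels = np.asarray(labels)
    metacell_ids = np.unique(labels)
    profiles = np.vstack([counts[labels == m].mean(axis=0) for m in metacell_ids])
    return metacell_ids, profiles

# Toy data (made up): 6 cells x 4 genes, grouped into 2 metacells of 3 cells.
counts = np.array([
    [0, 2, 0, 1],
    [1, 0, 0, 2],
    [0, 1, 0, 0],
    [3, 0, 1, 0],
    [0, 0, 2, 0],
    [4, 0, 0, 0],
])
labels = np.array([0, 0, 0, 1, 1, 1])

ids, profiles = aggregate_metacells(counts, labels)
zero_before = float(np.mean(counts == 0))   # sparsity at the single-cell level
zero_after = float(np.mean(profiles == 0))  # sparsity at the metacell level
print(profiles)
print(zero_before, zero_after)
```

In this toy case the averaged profiles contain fewer zeros than the single-cell matrix. The averaging is only meaningful if the cells in each group are truly similar, which is the assumption that the rest of this article examines.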

Our Nature Communications paper [8] provides a rigorous definition of a metacell based on a two-layer model of single-cell sequencing: the upper layer captures the biological variation in true expression, while the lower layer describes the measurement process that generates the observed expression from the true expression. Building on this definition, we developed mcRigor, a statistical framework for detecting dubious metacells within a given partition and for selecting the optimal metacell method and hyperparameter among candidate method-hyperparameter configurations.
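
To make the two-layer idea concrete, here is a small simulation sketch in Python. The specific distributions are assumptions chosen for illustration (gamma-distributed true expression in the upper layer, Poisson sampling in the lower layer); the article does not state which distributions the paper uses. The sketch contrasts a group of cells that share one true expression profile with a group whose true expression varies from cell to cell.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_genes = 200, 5

# Upper layer (biology): each cell's true expression level per gene.
# Homogeneous group: every cell shares the same true expression profile.
true_homog = np.tile(rng.gamma(shape=2.0, scale=2.0, size=n_genes), (n_cells, 1))
# Heterogeneous group: the true expression itself differs between cells.
true_heter = rng.gamma(shape=2.0, scale=2.0, size=(n_cells, n_genes))

# Lower layer (measurement): observed counts are drawn given the true
# expression. Poisson sampling is an assumption of this sketch.
obs_homog = rng.poisson(true_homog)
obs_heter = rng.poisson(true_heter)

# Pooled variance-to-mean ratio. Under a Poisson lower layer it stays near 1
# when the upper layer is constant; biological variation in the upper layer
# adds variance on top of the measurement noise.
fano_homog = obs_homog.var(axis=0).sum() / obs_homog.mean(axis=0).sum()
fano_heter = obs_heter.var(axis=0).sum() / obs_heter.mean(axis=0).sum()
print(round(float(fano_homog), 2), round(float(fano_heter), 2))
```

Under this view, a group of cells is a sensible metacell when all of the variation among its members can be attributed to the lower layer.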

mcRigor detects and removes dubious metacells, and its extended version, mcRigor two-step, goes further by breaking dubious metacells into single cells and regrouping them into more reliable metacells. This improves the reliability of downstream analyses such as gene co-expression and enhancer-gene association analysis (Fig. 2). In addition, mcRigor provides a unified evaluation procedure for benchmarking different metacell construction methods, giving investigators reliable guidance in method selection.

In the first part of our paper [8], we present the mcRigor method for detecting dubious metacells. Specifically, mcRigor quantifies the internal heterogeneity of each metacell using a feature-correlation-based statistic, mcDiv, which measures how far the feature correlation matrix deviates from independence. The rationale is that if all member cells share the same true expression levels and the observed variation among them comes only from the measurement process, the features should be approximately independent. mcRigor then constructs the null distribution of mcDiv using a novel double permutation procedure and flags as dubious the metacells that deviate significantly from this null (Fig. 2a).
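
The reasoning above can be sketched in Python with a simplified stand-in. The statistic below (mean squared off-diagonal gene-gene correlation) and the single within-gene permutation null are assumptions made for illustration: they are not the mcDiv statistic or the double permutation procedure of the paper, whose exact definitions are not given in this article. The function names and the simulated data are likewise hypothetical.

```python
import numpy as np

def correlation_divergence(x):
    """Mean squared off-diagonal gene-gene correlation of a cells x genes matrix."""
    corr = np.corrcoef(x, rowvar=False)
    off_diag = corr[~np.eye(corr.shape[0], dtype=bool)]
    return float(np.mean(off_diag ** 2))

def permutation_pvalue(x, n_perm=200, seed=0):
    """Compare the observed statistic with a within-gene permutation null."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    x = x[:, x.std(axis=0) > 0]  # drop constant genes (correlation undefined)
    observed = correlation_divergence(x)
    # Shuffling each gene independently across cells keeps every gene's
    # marginal distribution but breaks gene-gene dependence.
    null = np.array([
        correlation_divergence(rng.permuted(x, axis=0)) for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p_value)

rng = np.random.default_rng(1)
n_genes = 20
# Homogeneous metacell: 60 cells sharing one true expression profile.
profile = rng.uniform(3, 10, size=n_genes)
homog = rng.poisson(profile, size=(60, n_genes))
# Heterogeneous metacell: a mixture of two cell states with opposite programs.
state_a = np.r_[np.full(10, 12.0), np.full(10, 2.0)]
state_b = state_a[::-1]
heter = np.vstack([
    rng.poisson(state_a, size=(30, n_genes)),
    rng.poisson(state_b, size=(30, n_genes)),
])

stat_homog, p_homog = permutation_pvalue(homog)
stat_heter, p_heter = permutation_pvalue(heter)
print(stat_homog, p_homog)
print(stat_heter, p_heter)
```

In the mixed group, genes that differ between the two states become strongly correlated across cells, so the statistic lands far in the tail of the permutation null. In the homogeneous group, the only variation is measurement noise and the gene-gene correlations stay close to what shuffling produces.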

On both synthetic and real PBMC datasets, mcRigor correctly distinguished trustworthy metacells from dubious ones (Fig. 2b-c). We also demonstrate the effectiveness of mcRigor in improving the reliability of downstream analysis. In the cell-line data analysis, removing the dubious metacells increases the signal-to-noise ratio of the cell-cycle marker genes (Fig. 2d). In COVID-19 data with healthy controls, mcRigor eliminates spurious gene co-expression caused by dubious metacells and reveals strong co-expression of immune response modules (Fig. 2e). In scMultiome data analysis, mcRigor improves the inference of enhancer-gene associations, filtering out false positives while preserving signals consistent with those observed at the single-cell level (Fig. 2f).

Fig. 2. mcRigor detects dubious metacells and improves downstream analysis of scRNA-seq and multiome (RNA + ATAC) data. a, Schematic of the mcRigor method for detecting dubious metacells. b, mcRigor successfully captures metacell heterogeneity and finds dubious metacells within the partition produced by a metacell partitioning method on synthetic data. c, Dubious metacells identified by mcRigor exhibit internal heterogeneity and can occasionally appear as outliers, while trustworthy metacells remain internally homogeneous. d, mcRigor enhances cell-cycle marker gene expression patterns within cell lines. e, mcRigor reveals enriched co-expression of the adaptive immune response gene module (highlighted in yellow) in COVID-19 samples (bottom row) compared to healthy controls (top row). f, Applying mcRigor to the original metacell partition from the SEACells paper improves enhancer-gene association analysis (left) and produces more reliable findings (right).

Fig. 3. mcRigor optimizes metacell method and hyperparameter selection for various single-cell data. a, Schematic of the mcRigor method for optimizing metacell partitioning, which uses a score as a metric of the trade-off between sparsity reduction and signal preservation, with the performance of metacell partitions on synthetic data shown as an example. b, Line plots showing the zero proportions of the metacell partitions produced by three methods (MetaCell, SEACells, and SuperCell) across different granularity levels (γ). The zero proportions of the optimized metacell partitions (triangles) closely match the zero proportion observed in the smRNA FISH data (red line). c, mcRigor guides metacell method and hyperparameter selection for differential gene expression analysis. In (b) and (c), colored triangles indicate the optimal values of γ selected by mcRigor for the three methods. d, The mcRigor-optimized metacell partition better reveals the trajectories of immune cells in the body compared to the original metacell partition from the Zman-seq study [9].

In the second part of our paper [8], we present the mcRigor method for evaluating metacell partitions and optimizing hyperparameters. By measuring the trustworthiness of the metacells against the data, mcRigor assigns an overall evaluation score to each candidate partition and automatically selects the optimal method and hyperparameter configuration among all candidates, giving data analysts a data-driven guide for making informed decisions (Fig. 3a).
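
The selection step can be pictured as scoring every candidate configuration and keeping the best one. The Python sketch below is only a schematic of that idea: the `Candidate` fields, the scoring formula (one minus the dubious rate minus the zero rate), the method names, and all numbers are hypothetical placeholders, not the score defined in the paper or values from any dataset.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    method: str          # metacell construction method (placeholder name)
    gamma: int           # granularity level, i.e. average metacell size
    dubious_rate: float  # fraction of cells in metacells flagged as dubious
    zero_rate: float     # fraction of zeros left in the metacell profiles

def score(c: Candidate) -> float:
    """Hypothetical score penalizing unreliable metacells and leftover sparsity."""
    return 1.0 - c.dubious_rate - c.zero_rate

def select_best(candidates):
    """Return the candidate configuration with the highest score."""
    return max(candidates, key=score)

# Made-up numbers for illustration only: small metacells stay sparse,
# large metacells are more often heterogeneous.
candidates = [
    Candidate("method_A", 10, dubious_rate=0.02, zero_rate=0.60),
    Candidate("method_A", 50, dubious_rate=0.10, zero_rate=0.30),
    Candidate("method_A", 100, dubious_rate=0.45, zero_rate=0.15),
    Candidate("method_B", 10, dubious_rate=0.05, zero_rate=0.55),
    Candidate("method_B", 50, dubious_rate=0.20, zero_rate=0.28),
    Candidate("method_B", 100, dubious_rate=0.60, zero_rate=0.12),
]

best = select_best(candidates)
print(best.method, best.gamma, round(score(best), 2))
```

The design point is that neither extreme wins: very small metacells leave the data sparse, while very large ones merge heterogeneous cells, so a single score over both criteria lets methods and granularity levels be compared on the same scale.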

We demonstrate the usefulness of this optimization in various downstream tasks. For example, the zero proportion of the mcRigor-optimized metacells closely matches the gold-standard zero proportion measured with smRNA FISH, showing its ability to distinguish biological zeros (Fig. 3b). In differential expression analysis, the results based on mcRigor-optimized metacells agree well with those obtained from bulk RNA-seq data, showing improved reliability (Fig. 3c). In time-course data, mcRigor-optimized metacells improve trajectory inference and exhibit gene-expression dynamics consistent with experimental evidence (Fig. 3d).

The mcRigor R package and online tutorials are available

Full paper available at

References:

1. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171-181 (2014).

2. Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202-1214 (2015).

3. Buenrostro, J. D. et al. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486-490 (2015).

4. Cusanovich, D. A. et al. Multiplex single-cell profiling of chromatin accessibility by combinatorial cellular indexing. Science 348, 910-914 (2015).

5. Cao, J. et al. Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361, 1380-1385 (2018).

6. Jiang, R., Sun, T., Song, D. & Li, J. J. Statistics or biology: the zero-inflation controversy about scRNA-seq data. Genome Biol. 23, 31 (2022).

7. Bilous, M., Hérault, L., Gabriel, A. A., Teleman, M. & Gfeller, D. Building and analyzing metacells in single-cell genomics data. Mol. Syst. Biol. 20, 744-766 (2024).

8. Liu, P. & Li, J. J. mcRigor: a statistical method to enhance the rigor of metacell partitioning in single-cell data analysis. bioRxiv (2024) doi: 10.1101/2024.10.30.621093.

9. Kirschenbaum, D. et al. Time-resolved single-cell transcriptomics defines immune trajectories in glioblastoma. Cell 187, 149-165.e23 (2024).
