CritiCal-C Tutorial

A comprehensive guide to understanding and using the cytosine regulation analysis tool


Welcome to CritiCal-C

CritiCal-C is a cutting-edge web-based platform that identifies critical cytosines in gene promoter sequences, which are pivotal for gene regulation across species.

CritiCal-C integrates multiple data sources, including in-house datasets, Plant Ensembl, UniProt, and KEGG, and offers advanced visualization tools to help researchers understand cytosine-specific effects on gene regulation. This tutorial guides users through performing cytosine analysis and extracting biological insights into gene regulation mechanisms.

Cytosine Regulation
Cytosine modifications, particularly methylation, are key epigenetic marks that influence gene expression. CritiCal-C identifies which cytosines are most biologically impactful under multiple conditions.
Analysis Methods
The platform employs three complementary analytical approaches: individual cytosine knockout, a 100 bp sliding window, and cytosine pair combination assessment. All three are analyzed using Spearman correlation and presented through interactive visualizations.
Integrated Platform
The platform combines in-house processed datasets with data from the Plant Ensembl, UniProt, and KEGG databases. The interface also calls external APIs, including UniProt, QuickGO, and the KEGG REST API, for protein annotations, gene ontology data, and pathway visualizations.

Platform Overview

The CritiCal-C platform enables comprehensive cytosine regulation analysis through three core modules: Gene Overview, Visualization Tools, and Functional Enrichment Analysis. This integrated approach helps researchers explore cytosine-specific regulatory effects and uncover underlying biological mechanisms.

Gene Overview

The Gene Overview section provides comprehensive information about the selected gene, including:

Gene ID and Symbol: Unique identifiers and common name for the gene
Gene Length and Chromosome: Physical characteristics and genomic location
UniProt ID and Description: Protein information and functional description
2 kb Upstream Region: Sequence information for the promoter region

Visualization Graphs

CritiCal-C provides three powerful visualization methods to analyze cytosine regulation:

Single 'C' knockout
'C' knocked out in combination
'C' knocked out in 100 bp overlapping windows
Select different visualization methods to explore cytosine correlation patterns in different ways.

Interactive demo (not reproduced here): a promoter sequence view with A, T, C, G, and critical C highlighted, a gene expression readout, and a correlation heatmap.

CritiCal-C Analysis Demo

This simulation identifies critical cytosines in a sample gene promoter region that influence gene expression. Click on a cytosine (C) to analyze its impact. Critical cytosines often act as molecular switches through methylation status, serving as binding sites for transcription factors.

Research Applications: Precision epigenome editing, disease biomarkers, methylation tools, and understanding epigenetic mechanisms in development and disease.

Interactive demo panels (not reproduced here):
Combination panel: select two cytosines to test their combined effect on gene expression relative to the baseline (100%); includes an interaction network that can be filtered by effect strength.
Sliding window panel: reports window position, cytosines in window, expression, and critical window, with a plot of regulatory effect by window.

Functional Enrichment

The Functional Enrichment analysis helps you understand the biological significance of cytosine regulation:

GO Terms Analysis: Explore gene ontology terms associated with regulated genes
KEGG Pathway Analysis: Identify relevant biological pathways

Spearman Correlation & Grad-CAM

Spearman correlation analysis is a core method in CritiCal-C for identifying critical cytosines that are significantly associated with gene expression patterns. Traditional methods such as systematic cytosine knockout (applied individually or to groups) aid in gene annotation and database development, but they are time-consuming and impractical for real-time criticality detection. To address these limitations, CritiCal-C implements Gradient-weighted Class Activation Mapping (Grad-CAM), an explainable AI (XAI) technique that highlights which input features most influence a neural network's predictions. Grad-CAM pinpoints influential cytosines by analyzing the gradients of target outputs with respect to input features. This hybrid strategy combines statistical validation (Spearman correlation) with efficient, interpretable deep learning, enabling rapid and biologically meaningful discovery of regulatory cytosines.
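The Grad-CAM idea described above can be sketched for a one-dimensional convolutional layer. This is a minimal illustration in plain Python, not the CritiCal-C implementation: the function name `grad_cam_1d` and the toy activation and gradient values are assumptions made for the example. Each channel is weighted by the average gradient of the target output over positions, the weighted activations are summed per position, and negative values are clipped to zero.

```python
def grad_cam_1d(activations, gradients):
    """Grad-CAM for one sequence.

    activations, gradients: lists of K channels, each a list of L positions,
    taken from a convolutional layer (hypothetical inputs for this sketch).
    Returns one importance score per position.
    """
    # Channel weight = gradient of the target output averaged over positions.
    weights = [sum(g) / len(g) for g in gradients]
    length = len(activations[0])
    cam = []
    for pos in range(length):
        score = sum(w * a[pos] for w, a in zip(weights, activations))
        cam.append(max(score, 0.0))  # ReLU keeps only positive evidence
    return cam


# Toy values: 2 channels, 4 sequence positions.
activations = [[0.0, 1.0, 2.0, 0.0], [1.0, 0.0, 1.0, 3.0]]
gradients = [[0.5, 0.5, 0.5, 0.5], [-1.0, -1.0, -1.0, -1.0]]
cam = grad_cam_1d(activations, gradients)
print(cam)
```

In a real model the activations and gradients would come from the deep learning framework's automatic differentiation, and the resulting map would be aligned back to promoter coordinates to score individual cytosines.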

How Spearman Correlation & Grad-CAM Work

CritiCal-C combines Spearman correlation and Grad-CAM to identify gene-regulating cytosines. Spearman correlation measures the statistical impact of cytosine knockout on gene expression, while Grad-CAM uses explainable AI to pinpoint the most influential cytosines through deep learning analysis. By integrating both methods, with Spearman correlation for broad statistical screening and Grad-CAM for precise functional prioritization, the platform overcomes the limitations of traditional approaches and provides statistically rigorous, biologically insightful results. These findings are presented in interactive visualizations, enabling researchers to efficiently identify key cytosines for experimental validation.
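For readers unfamiliar with Spearman correlation, the sketch below computes it from scratch as the Pearson correlation of average ranks. How CritiCal-C pairs its inputs is not specified in this tutorial; the example assumes, purely for illustration, that predicted expression across several conditions is compared before and after knocking out one cytosine. All values are made up.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical predicted expression across five conditions.
baseline = [1.0, 2.0, 3.0, 4.0, 5.0]
after_knockout = [1.1, 1.9, 3.2, 3.9, 5.3]  # condition ranking unchanged
rho = spearman(baseline, after_knockout)
print(round(rho, 6))
```

Under this assumed setup, a rho close to 1 means the knockout barely changes how expression ranks across conditions, while a lower rho would flag the cytosine as more influential.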

Single Cytosine Analysis

Single-cytosine analysis in CritiCal-C uses our deep learning model to systematically evaluate the regulatory influence of individual cytosines within 2 kb promoter regions. This foundational approach identifies critical regulatory cytosines that function as molecular switches in gene expression control, offering nucleotide-level resolution of epigenetic regulation mechanisms.

How Single Cytosine Analysis Works

Single cytosine analysis in CritiCal-C evaluates each cytosine within a 2 kb promoter region through an iterative knockout approach using our deep learning model. The process follows these steps:

  1. First, the model predicts baseline gene expression with all cytosines intact.
  2. Then, individual cytosines are computationally "knocked out" one at a time.
  3. Gene expression is re-predicted after each knockout, and the change from baseline quantifies that cytosine's regulatory impact.
  4. Finally, Grad-CAM identifies the most critical cytosines through gradient-weighted class activation mapping of the model's convolutional layers.

This approach enables precise identification of critical regulatory sites with nucleotide-level resolution.
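The knockout steps above can be sketched as a simple loop. This is an illustration under stated assumptions, not the platform's code: `toy_predict` is a hypothetical stand-in for the deep learning model, and a "knockout" is assumed here to mean replacing the cytosine with a masked base (`N`); the tutorial does not specify how knockouts are encoded.

```python
def toy_predict(seq):
    """Hypothetical stand-in for the expression model: rewards 'CG' sites."""
    return 1.0 + 0.5 * seq.count("CG")


def single_knockout_effects(seq, predict, mask="N"):
    """Knock out each cytosine in turn and record the expression change."""
    baseline = predict(seq)  # step 1: all cytosines intact
    effects = {}
    for pos, base in enumerate(seq):
        if base != "C":
            continue
        mutated = seq[:pos] + mask + seq[pos + 1:]  # step 2: mask one C
        effects[pos] = predict(mutated) - baseline  # step 3: change vs baseline
    return baseline, effects


promoter = "ACGTCCGA"  # toy sequence; real promoters are about 2 kb
baseline, effects = single_knockout_effects(promoter, toy_predict)
for pos, delta in sorted(effects.items(), key=lambda kv: kv[1]):
    print(pos, delta)
```

Sorting by the expression change puts the cytosines with the strongest negative effect first. The sketch scans only the given strand.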

Biological Implications

  1. Identifies Key Epigenetic Switches: Pinpoints individual cytosines whose methylation status acts as a binary on/off switch for gene expression, revealing precise control points in regulatory regions.
  2. Enables Precision Epigenome Editing: Provides high-resolution targets for CRISPR/dCas9 or base editors, allowing single-nucleotide-level manipulation of gene expression without disrupting other regulatory elements.
  3. Uncovers Condition-Specific Regulation: Detects cytosines that gain or lose regulatory importance under specific stresses (e.g., a cytosine critical for drought response but neutral under standard conditions), linking epigenetic plasticity to environmental adaptation.

Combination Analysis

CritiCal-C's pair analysis identifies cooperative cytosine pairs through dual knockout simulations, revealing synergistic regulatory impacts beyond single-site effects. This approach uncovers epigenetic haplotypes and guides targeted editing strategies, with results visualized through interactive arc plots showing critical cytosine interactions.

How Combination Analysis Works

Combination analysis in CritiCal-C proceeds as follows: (1) all possible cytosine pairs within a promoter region are systematically tested via dual knockout, and (2) synergistic effects are quantified by comparing the observed expression change against the expected additive impact of the two single knockouts. The method reveals regulatory relationships in which specific cytosine pairs collectively control gene expression more strongly than their individual effects would suggest.
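The comparison of observed and expected additive effects can be written as synergy = observed pair effect - (effect of first + effect of second). The sketch below is a hedged illustration of that arithmetic: `toy_predict`, the masking base `N`, and the toy sequence are assumptions for the example, and the scoring used by CritiCal-C itself may differ.

```python
from itertools import combinations


def toy_predict(seq):
    """Hypothetical model with redundancy: any one 'CG' site keeps expression high."""
    return 2.0 if "CG" in seq else 1.0


def knock_out(seq, positions, mask="N"):
    """Return a copy of seq with the given positions masked."""
    chars = list(seq)
    for p in positions:
        chars[p] = mask
    return "".join(chars)


def pair_synergy(seq, predict):
    """Observed dual-knockout effect minus the sum of single-knockout effects."""
    baseline = predict(seq)
    c_positions = [i for i, base in enumerate(seq) if base == "C"]
    single = {p: predict(knock_out(seq, [p])) - baseline for p in c_positions}
    synergy = {}
    for i, j in combinations(c_positions, 2):
        observed = predict(knock_out(seq, [i, j])) - baseline
        expected = single[i] + single[j]
        synergy[(i, j)] = observed - expected
    return synergy


synergy = pair_synergy("ACGTCG", toy_predict)
print(synergy)
```

In this toy case each single knockout has no effect because the other site compensates, yet removing both lowers expression, so the pair receives a nonzero synergy score. The number of pairs grows quadratically with the number of cytosines, which matters for a full 2 kb promoter.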

Biological Implications

  1. Synergistic Gene Regulation: Reveals how specific cytosine pairs work together to amplify or suppress gene expression, uncovering complex epigenetic control mechanisms.
  2. Precision Epigenetic Engineering Targets: Identifies high-impact cytosine combinations for CRISPR/dCas9 editing, enabling more effective gene modulation.
  3. Breeding & Trait Optimization: Provides epigenetic markers for selecting crop varieties with optimal gene expression patterns under environmental stress.

Sliding Window Analysis

Sliding window analysis identifies spatially clustered regulatory hotspots by analyzing cytosine effects across 100 bp overlapping windows in promoter regions.

How Sliding Window Works

Sliding window analysis in CritiCal-C systematically scans promoter regions using 100 bp overlapping windows to identify spatially clustered regulatory hotspots. For each window, the method knocks out all cytosines it contains while preserving the surrounding sequence, and then quantifies the impact on predicted gene expression. This approach reveals positional trends in cytosine-dependent regulation. The analysis outputs include (1) window-level effect scores that rank regulatory influence and (2) a visualization of positional effect patterns along promoters. Together, these enable targeted investigation of spatially organized epigenetic control elements that single-cytosine analyses might miss.
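A minimal sketch of the window scan follows. The 100 bp window size comes from the text; the 50 bp step, the masking base `N`, and `toy_predict` are assumptions made for illustration, since the tutorial states only that the windows overlap.

```python
def toy_predict(seq):
    """Hypothetical stand-in for the expression model: scales with C count."""
    return seq.count("C") / 10


def window_effects(seq, predict, window=100, step=50, mask="N"):
    """Mask every C inside each window and record the expression change.

    Returns a list of (start, end, cytosines_in_window, effect) tuples.
    If the sequence length minus the window is not a multiple of the step,
    the tail of the sequence is not covered by a final window.
    """
    baseline = predict(seq)
    results = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        end = min(start + window, len(seq))
        segment = seq[start:end]
        mutated = seq[:start] + segment.replace("C", mask) + seq[end:]
        results.append((start, end, segment.count("C"), predict(mutated) - baseline))
    return results


promoter = "AC" * 100  # toy 200 bp sequence
results = window_effects(promoter, toy_predict)
for start, end, n_c, effect in results:
    print(start, end, n_c, effect)
```

Ranking the tuples by the size of the effect gives the window-level scores described above; a window whose effect stands out from its neighbours would be a candidate hotspot.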

Biological Implications

  1. Identifies Functional cis-Regulatory Modules: Reveals spatially organized epigenetic control units where clustered cytosines jointly regulate gene expression, explaining how distal promoter elements coordinate transcriptional responses.
  2. Uncovers Position-Dependent Regulatory Logic: Demonstrates that the functional impact of a cytosine depends on its genomic context (e.g., cytosines near transcription start sites vs. distal regions).
  3. Guides Precision Epigenome Editing: Prioritizes high-impact genomic intervals (rather than single sites) for CRISPR/dCas9 targeting, improving efficiency in epigenetic crop engineering.

Functional Enrichment Analysis

Functional enrichment analysis in CritiCal-C helps elucidate the biological significance of genes containing critical cytosines. By examining the functional annotations associated with these genes, researchers can identify enriched biological processes, molecular functions, and cellular components.
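The tutorial does not state which statistical test CritiCal-C uses for enrichment. A common choice for this kind of analysis is the one-sided hypergeometric test, sketched below as an assumption rather than a description of the platform. The function name and the gene counts are illustrative only.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Probability of seeing at least k annotated genes in a sample of n.

    N: genes in the background, K: background genes carrying the GO term,
    n: genes in the list of interest, k: listed genes carrying the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Illustrative numbers: 20 background genes, 5 annotated with the term,
# 4 genes in the list of interest, 2 of which carry the term.
p = hypergeom_pvalue(k=2, n=4, K=5, N=20)
print(round(p, 4))
```

When many GO terms are tested at once, the resulting p-values are normally adjusted for multiple testing before any term is reported as enriched.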

Gene Ontology (GO) Enrichment

Biological Process
GO Term       Description
GO:0006629    Lipid Metabolic Process
GO:0006631    Fatty Acid Metabolic Process

Molecular Function
GO Term       Description
GO:0003674    Molecular_Function
GO:0003824    Catalytic Activity
GO:0005515    Protein Binding

Cellular Component
GO Term       Description
GO:0005575    Cellular_Component
GO:0005622    Intracellular Anatomical Structure

KEGG Pathway Analysis

KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis provides insights into the metabolic and signaling pathways associated with the identified genes. This helps researchers understand how critical cytosines might influence specific biological pathways, offering a systems-level perspective on gene function.

Pathway ID    Pathway Name
ath00062      Fatty acid elongation - Arabidopsis thaliana (thale cress)

Users can explore these pathways in detail by clicking a pathway ID, which redirects to the KEGG database and displays the complete pathway with genes containing critical cytosines highlighted.
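For readers who want to build such links themselves, the sketch below assembles KEGG URLs for a pathway ID. It is a hedged example: the URL patterns reflect KEGG's public linking and REST conventions as understood here and should be checked against the KEGG documentation, the helper names are hypothetical, and the gene identifier is an illustrative placeholder that does not come from this tutorial. The code only builds strings and makes no network requests.

```python
def kegg_pathway_url(pathway_id, gene_ids=()):
    """Link to a KEGG pathway map, optionally asking KEGG to highlight genes.

    Assumes the 'pathway/<id>+<entry>+...' linking convention.
    """
    base = "https://www.kegg.jp/pathway/" + pathway_id
    return base + "".join("+" + gene for gene in gene_ids)


def kegg_rest_url(pathway_id):
    """KEGG REST 'get' operation returning the pathway's flat-file entry."""
    return "https://rest.kegg.jp/get/" + pathway_id


pathway = "ath00062"          # pathway ID from the table above
example_genes = ["AT1G01120"]  # illustrative placeholder gene ID

print(kegg_pathway_url(pathway))
print(kegg_pathway_url(pathway, example_genes))
print(kegg_rest_url(pathway))
```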