Illustration of the running workflow of the web server and the standalone program
Description
PTFSpot: Deep co-learning on transcription factors and their binding regions attains impeccable universality in plants.
Running the standalone program
Requirements
- Python3 (version 3.6 or higher)
- numpy (version 1.23.5)
- keras (version 2.13.1)
- tensorflow (version 2.13.1)
- plotly (version 5.11.0)
- pandas (version 1.5.0)
- bayesian-optimization (version 1.2.0)
- bedops (install with: sudo apt-get install bedops -y)
- bedtools (install with: sudo apt-get install bedtools -y)
- AlphaFold2 generated PDB files (AlphaFold2: https://alphafold.ebi.ac.uk/)
To build a model with hyperparameter tuning
Example: python3 hyper_param.py file_for_tuning
Input file description
file_for_tuning = file containing a label (0/1) and a sequence (positive and negative instances), separated by a tab, with one instance per line. A minimal tuning sketch follows the example table below.
| Label | Sequence |
| ----- | -------- |
| 1 | TGATAAACAAAGTGTGTAACATCACCTCATCTACATGTGTGATTTTTTTTTTGAATATAGACAACTTTTTAGTCAGAGTAGTGAGTATAGTGAGTTTCTGTAGAGAAGCTCATCTTAGAATTATTCATGTATTCCACTACTAAAATGTATTCCACTACT |
| 0 | AGATCTACAAGAGAAGATAAGTTTGAGGCAAATTCGAGATCTGGAAGCTGGTTTTCTCTTTACAAATAACACTAACCCTACCATCAAATCAAGAAAGGAGGCTTTGAACAAATAGCTTGATTGAAGTATGAAGTGGCTCGGTGGGCGACGATGA |
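The sketch below is a minimal illustration, not the authors' hyper_param.py: it reads a file in the format above and tunes two hyperparameters with the bayesian-optimization package. The objective function is a runnable placeholder standing in for actual model training.

```python
# A minimal sketch, NOT the authors' hyper_param.py: reads file_for_tuning
# and tunes two hyperparameters with the bayesian-optimization package.
import pandas as pd
from bayes_opt import BayesianOptimization

# Tab-separated, no header: one label (0/1) and one sequence per line.
data = pd.read_csv("file_for_tuning", sep="\t", header=None,
                   names=["label", "sequence"])

def objective(learning_rate, dropout_rate):
    # Placeholder objective: in the real pipeline this would train the
    # Transformer on `data` and return a validation score. A dummy value
    # keeps the sketch runnable.
    return -abs(learning_rate - 0.5) - dropout_rate

optimizer = BayesianOptimization(
    f=objective,
    pbounds={"learning_rate": (1e-4, 1.0), "dropout_rate": (0.0, 0.5)},
    random_state=1,
)
optimizer.maximize(init_points=5, n_iter=20)
print(optimizer.max)  # best hyperparameter set found
```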
Output file description
param.txt = Optimized hyperparameters for the Transformer.
| Name | Value |
| ---- | ----- |
| learning_rate | 0.583 |
| activation | relu |
| activation2 | selu |
| activation3 | LeakyRelu |
| batch_size | 40 |
| embed_dim | 28 |
| epochs | 20 |
| num_heads | 14 |
| ff_dim | 14 |
| neurons | 38 |
| neurons2 | 12 |
| dropout_rate | 0.16 |
| dropout_rate2 | 0.17 |
| Optimizer | Adadelta |
ptfspot.h5 = Hyperparameter-optimized trained model for the Transformer.
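To show how the tuned values map onto a model, here is a minimal Keras sketch of a single Transformer encoder block wired with the param.txt values above. The 159-step input length and this exact layer layout are assumptions for illustration, not the published PTFSpot architecture.

```python
# A minimal sketch wiring the param.txt values into a single Transformer
# encoder block. Input length and layer layout are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

embed_dim, num_heads, ff_dim = 28, 14, 14

inputs = layers.Input(shape=(159, embed_dim))  # pre-embedded sequence (assumed length)
attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)(inputs, inputs)
attn = layers.Dropout(0.16)(attn)                      # dropout_rate
x = layers.LayerNormalization(epsilon=1e-6)(inputs + attn)
ffn = layers.Dense(ff_dim, activation="relu")(x)       # activation
ffn = layers.Dense(embed_dim)(ffn)
ffn = layers.Dropout(0.17)(ffn)                        # dropout_rate2
x = layers.LayerNormalization(epsilon=1e-6)(x + ffn)
x = layers.GlobalAveragePooling1D()(x)
x = layers.Dense(38, activation="selu")(x)             # neurons, activation2
x = layers.Dense(12)(x)                                # neurons2
x = layers.LeakyReLU()(x)                              # activation3
outputs = layers.Dense(1, activation="sigmoid")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer=tf.keras.optimizers.Adadelta(learning_rate=0.583),
              loss="binary_crossentropy", metrics=["accuracy"])
model.save("ptfspot_sketch.h5")  # illustrative artifact, analogous to ptfspot.h5
```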
Note
Always place your AlphaFold2 generated TF PDB file in the "pdb" folder (an example is provided).
This project was developed and executed on the open-source Ubuntu Linux platform (https://ubuntu.com/).
Running script
Module: Transformer-DenseNet system (identifies TF binding regions within a sequence using the corresponding AlphaFold2 generated TF protein structure)
To detect TF binding regions, execute the complete execution shell script from the parent directory. It expects the following arguments:
- Name of the FASTA file containing the sequences
- Complete path of the folder that contains the FASTA file
- Name of the TF PDB file generated by AlphaFold2 (without ".pdb")
- Complete path of the folder containing the PTFSpot scripts

Example FASTA input:
>seq
TGATAAACAAAGTGTGTAACATCACCTCATCTACATGTGTGATTTTTTTTTTGAATATAGACAACTTTTTAGTCAGAGT
AGTGAGTATAGTGAGTTTCTGTAGAGAAGCTCATCTTAGAATTATTCATGTATTCCACTACTAAAATGTATTCCACTAC
TAGATCTACAAGAGAAGATAAGTTTGAGGCAAATTCGAGATCTGGAAGCTGGTTTTCTCTTTACAAATAACACTAACCC
TACCATCAAATCAAGAAAGGAGGCTTTGAACAAATAGCTTGATTGAAGTATGAAGTGGCTCGGTGGGCGACGATGAGCA
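An illustrative invocation, assuming run_PTFSpot.sh as the script name (the actual script name in the repository may differ; only the argument order follows the list above):
Example: bash run_PTFSpot.sh ABF2_genomic_sequence.fa /path/to/fasta_folder ABF2 /path/to/PTFSpot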
Output description
The TF binding region detection module produces output in the following format:
ABF2_genomic_sequence.txt = TF binding regions result (final result: ID, Start, End).
| Sequence ID | Start | End |
| ----------- | ----- | --- |
| seq1 | 30 | 190 |
| seq2 | 10 | 170 |
| seq3 | 21 | 151 |
| seq4 | 73 | 233 |
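Since bedops and bedtools are listed requirements, downstream interval work is a natural next step. The snippet below is a hypothetical helper, not part of PTFSpot, that converts the result table to BED; it assumes tab-separated columns, a header row, and 1-based inclusive coordinates.

```python
# Hypothetical helper (not part of PTFSpot): converts the ID/Start/End
# result table to BED so the hits can be intersected with bedtools/bedops.
import csv

with open("ABF2_genomic_sequence.txt") as src, open("ABF2_hits.bed", "w") as dst:
    rows = csv.reader(src, delimiter="\t")
    next(rows)  # skip the header row (assumed present)
    for seq_id, start, end in rows:
        # BED is 0-based and half-open, so shift the start by one.
        dst.write(f"{seq_id}\t{int(start) - 1}\t{end}\n")
```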
To generate a line plot, execute the following command (the generated plots are interactive):
python3 make-plot.py seq1.csv
[Figure: example of a generated interactive line plot]
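For orientation, here is a minimal sketch of the kind of interactive plotly line plot that make-plot.py produces; the column names "position" and "score" are assumptions about the layout of seq1.csv.

```python
# A minimal sketch of an interactive line plot; assumed CSV column names.
import pandas as pd
import plotly.express as px

df = pd.read_csv("seq1.csv")
fig = px.line(df, x="position", y="score",
              title="TF binding signal along seq1")
fig.show()  # opens the interactive plot in a browser
```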