Download datasets
This study used a total of 37 datasets. The columns of the table below are defined as follows:
Max length: The maximum length of any peptide sequence within the dataset.
Min length: The minimum length of any peptide sequence within the dataset.
Train positive: The number of bioactive peptides used for model training.
Train negative: The number of non-bioactive peptides used for model training.
Test positive: The number of bioactive peptides used for model evaluation.
Test negative: The number of non-bioactive peptides used for model evaluation.
Note: In the provided Excel file, peptide bioactivity is encoded as follows:
· 0 = Bioactive (Positive)
· 1 = Non-bioactive (Negative)
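The column definitions and the label encoding above can be tied together in a short sketch. It assumes one split has already been read into a list of `(sequence, label)` pairs; the function name `summarize` and this record layout are illustrative assumptions, not part of the provided files. Label `0` is treated as positive, following the encoding stated in the note.

```python
from typing import Dict, Iterable, Tuple

# Label encoding as stated in the note above (assumed to hold for every file).
POSITIVE_LABEL = 0  # bioactive
NEGATIVE_LABEL = 1  # non-bioactive


def summarize(records: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Compute the length range and class counts for one split.

    `records` is a hypothetical in-memory form of one file:
    an iterable of (peptide_sequence, label) pairs.
    """
    records = list(records)
    if not records:
        raise ValueError("no records")
    lengths = [len(seq) for seq, _ in records]
    labels = [label for _, label in records]
    # Reject anything other than the two documented label values.
    unknown = set(labels) - {POSITIVE_LABEL, NEGATIVE_LABEL}
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    return {
        "max_length": max(lengths),
        "min_length": min(lengths),
        "positive": labels.count(POSITIVE_LABEL),
        "negative": labels.count(NEGATIVE_LABEL),
    }


# Toy records for illustration only; they are not taken from the datasets.
toy = [("AKLV", 0), ("GG", 1), ("MKTAYIA", 0)]
print(summarize(toy))
```

Applying such a function to a training file and a test file separately would give the values reported in the Train and Test columns, provided the files follow the layout assumed here.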
No. | Activity | Max length | Min length | Train positive | Train negative | Test positive | Test negative | Download |
---|---|---|---|---|---|---|---|---|
1 | Anti-hypertensive | 81 | 5 | 931 | 931 | 386 | 386 | Download training set; Download test set |
2 | DPP-IV inhibitory | 90 | 2 | 532 | 532 | 133 | 133 | Download training set; Download test set |
3 | Bitter | 39 | 2 | 256 | 256 | 64 | 64 | Download training set; Download test set |
4 | Umami | 39 | 2 | 112 | 241 | 28 | 61 | Download training set; Download test set |
5 | Anti-microbial | 180 | 11 | 3876 | 9552 | 2584 | 6369 | Download training set; Download test set |
6 | Anti-malarial main | 183 | 2 | 111 | 1708 | 28 | 427 | Download training set; Download test set |
7 | Anti-malarial alternative | 81 | 2 | 111 | 542 | 28 | 135 | Download training set; Download test set |
8 | Quorum sensing | 64 | 5 | 200 | 200 | 20 | 20 | Download training set; Download test set |
9 | Anti-cancer main | 50 | 3 | 689 | 689 | 172 | 172 | Download training set; Download test set |
10 | Anti-cancer alternative | 50 | 3 | 776 | 776 | 194 | 194 | Download training set; Download test set |
11 | Anti-MRSA strains | 70 | 11 | 118 | 678 | 30 | 169 | Download training set; Download test set |
12 | Tumor T cell antigens | 20 | 8 | 442 | 442 | 111 | 111 | Download training set; Download test set |
13 | Blood–brain barrier | 50 | 5 | 326 | 326 | 99 | 99 | Download training set; Download test set |
14 | Anti-parasitic activity | 254 | 5 | 255 | 1863 | 46 | 46 | Download training set; Download test set |
15 | Neuro peptide | 100 | 5 | 1940 | 1940 | 485 | 485 | Download training set; Download test set |
16 | Anti-bacterial | 30 | 3 | 4540 | 5809 | 1112 | 1475 | Download training set; Download test set |
17 | Anti-fungal activity | 100 | 4 | 1168 | 1168 | 291 | 291 | Download training set; Download test set |
18 | Anti-angio activity | 67 | 11 | 107 | 107 | 28 | 28 | Download training set; Download test set |
19 | Anti-viral activity | 51 | 4 | 3272 | 3272 | 818 | 818 | Download training set; Download test set |
20 | Toxicity | 50 | 13 | 1621 | 1663 | 312 | 270 | Download training set; Download test set |
21 | Anti-inflammatory peptides | 30 | 7 | 690 | 1009 | 173 | 253 | Download training set; Download test set |
22 | Anti-oxidant activity | 20 | 2 | 848 | 848 | 212 | 212 | Download training set; Download test set |
24 | IL-5 inducing peptides main | 20 | 9 | 1525 | 6027 | 382 | 1552 | Download training set; Download test set |
25 | IL-5 inducing peptides alternative | 20 | 9 | 1525 | 1526 | 382 | 381 | Download training set; Download test set |
26 | IL-6 inducing peptides | 25 | 8 | 292 | 2393 | 73 | 598 | Download training set; Download test set |
27 | IL-13 inducing peptides | 35 | 8 | 250 | 2326 | 63 | 582 | Download training set; Download test set |
28 | Anti-tubercular Peptides RD | 61 | 5 | 199 | 199 | 47 | 47 | Download training set; Download test set |
29 | Cell-penetrating | 61 | 10 | 364 | 369 | 98 | 85 | Download training set; Download test set |
29 | Cell-penetrating MLCCP | 61 | 6 | 573 | 574 | 157 | 2184 | Download training set; Download test set |
30 | Tumor-homing peptides main | 31 | 4 | 521 | 521 | 130 | 130 | Download training set; Download test set |
31 | Tumor-homing peptides small | 10 | 4 | 375 | 375 | 94 | 94 | Download training set; Download test set |
32 | Anti-coronavirus activity | 100 | 6 | 649 | 651 | 65 | 2231 | Download training set; Download test set |
33 | Viral integrase inhibitory peptides VINIP | 58 | 8 | 110 | 650 | 12 | 71 | Download training set; Download test set |
34 | Viral integrase inhibitory peptides INI | 62 | 8 | 110 | 459 | 12 | 51 | Download training set; Download test set |
35 | Anti-diabetic | 41 | 11 | 188 | 188 | 48 | 48 | Download training set; Download test set |
36 | Biofilm inhibitory | 54 | 5 | 201 | 80 | 10 | 10 | Download training set; Download test set |
37 | Hemolytic | 100 | 5 | 433 | 423 | 663 | 1999 | Download training set; Download test set |
The dataset size of bioactive peptides
The 37 datasets are divided into four categories: Disease Related, Antimicrobial Related, Immune Related, and Peptide Property. Train_pos: positive samples in the training set. Train_neg: negative samples in the training set. Test_pos: positive samples in the independent test set. Test_neg: negative samples in the independent test set.
The amino acid composition of bioactive peptides
The figure shows the amino acid composition of each dataset; darker shades of orange indicate that an amino acid makes up a higher proportion of the dataset. Train_pos: positive samples in the training set. Train_neg: negative samples in the training set. Test_pos: positive samples in the independent test set. Test_neg: negative samples in the independent test set.
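As a rough illustration of what an amino acid composition is, the sketch below pools all residues in a set of sequences and reports each standard amino acid's share. The function name `amino_acid_composition` is hypothetical. Pooling residues across sequences, rather than averaging per-peptide fractions, is an assumption and may differ from how the figure was actually produced.

```python
from collections import Counter
from typing import Dict, Iterable

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequences: Iterable[str]) -> Dict[str, float]:
    """Return each standard residue's share of all standard residues."""
    counts: Counter = Counter()
    for seq in sequences:
        # Non-standard characters (e.g. X) are ignored in this sketch.
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Toy sequences for illustration only; they are not taken from the datasets.
toy = ["AAK", "KL"]
comp = amino_acid_composition(toy)
print(comp["A"], comp["K"], comp["L"])
```

Running this separately on the Train_pos, Train_neg, Test_pos, and Test_neg sequences of one dataset would give four composition vectors of the kind a heat map can display.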
The length distribution of peptides
The width of each violin at a given length reflects how many peptides have that length: wider regions mean more peptides of that length, and narrower regions mean fewer. Train_pos: positive samples in the training set. Train_neg: negative samples in the training set. Test_pos: positive samples in the independent test set. Test_neg: negative samples in the independent test set.
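A violin plot is drawn from the individual peptide lengths in each subset. The counts behind such a plot can be tabulated with a minimal sketch like the one below; the function name `length_counts` is a hypothetical helper. Lengths shared by many peptides correspond to the wide regions of a violin.

```python
from collections import Counter
from typing import Dict, Iterable


def length_counts(sequences: Iterable[str]) -> Dict[int, int]:
    """Map each peptide length to the number of peptides with that length."""
    counts = Counter(len(seq) for seq in sequences)
    # Sort by length so the result reads like a histogram.
    return dict(sorted(counts.items()))


# Toy sequences for illustration only; they are not taken from the datasets.
toy = ["AK", "GL", "MKTA", "AKLVG", "GG"]
print(length_counts(toy))
```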