Statistics about the GENCODE Release M2

These statistics are derived from the GTF file that contains annotation for the main chromosomes only.

For details about how these statistics are calculated, please see the README_stats.txt file.
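
As a rough illustration of how per-biotype counts can be tallied from a GTF file, here is a minimal Python sketch. It is not the procedure described in README_stats.txt. It assumes GENCODE-style attributes (gene_type on "gene" records, transcript_type on "transcript" records), and the function name count_biotypes is hypothetical.

```python
import re
from collections import Counter

# Assumption: attributes follow the GENCODE GTF convention, e.g.
#   gene_id "..."; gene_type "protein_coding"; transcript_type "retained_intron";
# Files from other sources may use different keys (e.g. gene_biotype).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def count_biotypes(lines):
    """Return (genes, transcripts) Counters keyed by biotype.

    `lines` is any iterable of GTF lines, such as an open file handle.
    """
    genes, transcripts = Counter(), Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        feature = fields[2]
        attrs = dict(ATTR_RE.findall(fields[8]))
        if feature == "gene":
            genes[attrs.get("gene_type", "unknown")] += 1
        elif feature == "transcript":
            transcripts[attrs.get("transcript_type", "unknown")] += 1
    return genes, transcripts
```

To use it, pass an open file handle for a locally downloaded GTF file to count_biotypes. Counting gene_type on gene records and transcript_type on transcript records separately would explain why a biotype such as retained_intron can have transcripts but no genes.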

General stats

Total No of Genes                                    38924
Protein-coding genes                                 22572
Long non-coding RNA genes                             4074
Small non-coding RNA genes                            5853
Pseudogenes                                           5948
- polymorphic pseudogenes                               15
- pseudogenes                                         5931
Immunoglobulin/T-cell receptor gene segments
- protein coding segments                              477
- pseudogenes                                            2
Total No of Transcripts                              94545
Protein-coding transcripts                           47394
- full length protein-coding                         38260
- partial length protein-coding                       9134
Nonsense mediated decay transcripts                   4134
Long non-coding RNA loci transcripts                  6053

Total No of distinct translations                    38785
Genes that have more than one distinct translation    7934
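
The general figures above can be cross-checked with simple arithmetic. The short Python sketch below uses only numbers copied from this page. Counting the two immunoglobulin/T-cell receptor pseudogenes as part of the Pseudogenes figure is an assumption inferred from the sums, not something stated in README_stats.txt.

```python
# Figures copied from the "General stats" list on this page.
total_genes = 38924
gene_categories = {
    "protein_coding": 22572,
    "long_noncoding_rna": 4074,
    "small_noncoding_rna": 5853,
    "pseudogenes": 5948,
    "ig_tr_protein_coding_segments": 477,
}

# Assumption: the Pseudogenes figure includes the 2 Ig/TR pseudogenes.
pseudogene_parts = {
    "polymorphic": 15,
    "pseudogenes": 5931,
    "ig_tr_pseudogenes": 2,
}

protein_coding_transcripts = 47394
protein_coding_parts = {"full_length": 38260, "partial_length": 9134}

genes_ok = sum(gene_categories.values()) == total_genes
pseudogenes_ok = sum(pseudogene_parts.values()) == gene_categories["pseudogenes"]
transcripts_ok = sum(protein_coding_parts.values()) == protein_coding_transcripts

print(genes_ok, pseudogenes_ok, transcripts_ok)  # prints: True True True
```

Under these assumptions the five gene categories add up to the total gene count, and the full-length and partial-length figures add up to the protein-coding transcript count.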

Further details on this version's gene and transcript types

biotype                              genes  transcripts
3prime_overlapping_ncrna                 1            1
antisense                             1476         2066
IG_C_gene                               13           15
IG_D_gene                               25           25
IG_J_gene                               88           88
IG_LV_gene                             304          304
IG_V_gene                                2            2
IG_V_pseudogene                          1            2
lincRNA                               1792         2518
miRNA                                 1973         1973
misc_RNA                               590          590
Mt_rRNA                                  2            2
Mt_tRNA                                 22           22
non_stop_decay                           0            5
nonsense_mediated_decay                  0         4134
polymorphic_pseudogene                  15           19
processed_pseudogene                     0         4759
processed_transcript                   705        12877
protein_coding                       22572        47394
pseudogene                            5931          261
retained_intron                          0        12607
rRNA                                   353          353
sense_intronic                          90           98
sense_overlapping                       10           27
snoRNA                                1530         1530
snRNA                                 1383         1383
TR_V_gene                               45           60
TR_V_pseudogene                          1            1
transcribed_processed_pseudogene         0          122
transcribed_unprocessed_pseudogene       0           94
translated_processed_pseudogene          0           12
translated_unprocessed_pseudogene        0            1
unitary_pseudogene                       0           17
unprocessed_pseudogene                   0         1183
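
As a consistency check, the per-biotype table can be summed and compared with the totals reported under "General stats" (38924 genes, 94545 transcripts). The Python sketch below embeds the table as copied from this page. The two biotype groupings are assumptions inferred from the numbers adding up to the reported category counts. They are not taken from README_stats.txt.

```python
# Biotype table copied from this page: biotype, genes, transcripts.
TABLE = """\
3prime_overlapping_ncrna 1 1
antisense 1476 2066
IG_C_gene 13 15
IG_D_gene 25 25
IG_J_gene 88 88
IG_LV_gene 304 304
IG_V_gene 2 2
IG_V_pseudogene 1 2
lincRNA 1792 2518
miRNA 1973 1973
misc_RNA 590 590
Mt_rRNA 2 2
Mt_tRNA 22 22
non_stop_decay 0 5
nonsense_mediated_decay 0 4134
polymorphic_pseudogene 15 19
processed_pseudogene 0 4759
processed_transcript 705 12877
protein_coding 22572 47394
pseudogene 5931 261
retained_intron 0 12607
rRNA 353 353
sense_intronic 90 98
sense_overlapping 10 27
snoRNA 1530 1530
snRNA 1383 1383
TR_V_gene 45 60
TR_V_pseudogene 1 1
transcribed_processed_pseudogene 0 122
transcribed_unprocessed_pseudogene 0 94
translated_processed_pseudogene 0 12
translated_unprocessed_pseudogene 0 1
unitary_pseudogene 0 17
unprocessed_pseudogene 0 1183
"""

# Parse each row into {biotype: (genes, transcripts)}.
rows = {}
for line in TABLE.splitlines():
    biotype, genes, transcripts = line.split()
    rows[biotype] = (int(genes), int(transcripts))

gene_total = sum(g for g, _ in rows.values())
transcript_total = sum(t for _, t in rows.values())

# Assumed groupings, inferred from the sums (not from README_stats.txt).
SMALL_NCRNA = {"miRNA", "misc_RNA", "Mt_rRNA", "Mt_tRNA", "rRNA", "snoRNA", "snRNA"}
LONG_NCRNA = {"3prime_overlapping_ncrna", "antisense", "lincRNA",
              "processed_transcript", "sense_intronic", "sense_overlapping"}
small_ncrna_genes = sum(rows[b][0] for b in SMALL_NCRNA)
long_ncrna_genes = sum(rows[b][0] for b in LONG_NCRNA)

print(gene_total, transcript_total)         # prints: 38924 94545
print(small_ncrna_genes, long_ncrna_genes)  # prints: 5853 4074
```

Both column sums match the reported totals, and under the assumed groupings the small and long non-coding RNA gene counts match the general figures as well.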