File Format Specifications – PanKbase
This document provides an overview of required formats for the different file types hosted by PanKbase.
Sequencing Data
Sequence Alignments
Normalized Genomic Signal
Gene Quantifications
- Format: Tab-delimited text
- Description: Contains gene or transcript-level counts and normalized expression values (e.g., from bulk RNA-seq).
- Specification: GTEx Gene Quantifications Format
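
As a rough illustration of what "tab-delimited text" implies for a consumer, the sketch below reads a small quantification table and checks that every row has the same number of fields as the header. The column names (`gene_id`, `count`, `tpm`) and the helper `read_quantifications` are hypothetical placeholders, not taken from the GTEx specification; the linked specification remains the authority on the actual columns.

```python
import csv
import io


def read_quantifications(handle):
    """Read a tab-delimited table and return (header, rows as dicts)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    rows = []
    for line_number, fields in enumerate(reader, start=2):
        if not fields:
            continue  # tolerate blank lines
        if len(fields) != len(header):
            raise ValueError(
                f"line {line_number}: expected {len(header)} fields, got {len(fields)}"
            )
        rows.append(dict(zip(header, fields)))
    return header, rows


# Hypothetical example content; column names are assumptions for illustration.
example = io.StringIO(
    "gene_id\tcount\ttpm\n"
    "GENE_A\t120\t15.2\n"
    "GENE_B\t0\t0.0\n"
)
header, rows = read_quantifications(example)
```

Values are kept as strings here; converting counts and normalized values to numeric types would depend on the columns the final specification defines.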
Gene Count Matrix
- Format: Not specified
- Description: Matrix of gene expression values across multiple samples or cells.
QTL Summary Statistics
- Format: Tab-delimited text
- Description: Results from QTL mapping (e.g., eQTL, caQTL).
- Specification: To be added
Genetic Association Summary Statistics
- Format: Tab-delimited text
- Description: Results from GWAS or other genetic association analyses.
- Specification: To be added
Gene Sets
- Format: .gmt (Gene Matrix Transposed)
- Description: Lists of genes grouped by pathway, function, or disease association.
- Specification: GMT Format – Broad GSEA Wiki
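
In the GMT format, each line describes one gene set as tab-separated fields: the set name, a description, and then the member genes. A minimal parsing sketch is shown below; the helper name `parse_gmt` and the example set names are hypothetical and only serve to illustrate the layout.

```python
import io


def parse_gmt(handle):
    """Parse GMT lines into {set_name: (description, [genes])}."""
    gene_sets = {}
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(
                f"line {line_number}: expected name, description, and at least one gene"
            )
        name, description, *genes = fields
        # Drop empty trailing fields left by stray tabs.
        gene_sets[name] = (description, [gene for gene in genes if gene])
    return gene_sets


# Hypothetical example content for illustration only.
example = io.StringIO(
    "BETA_CELL_MARKERS\thypothetical example set\tINS\tMAFA\tPDX1\n"
    "ALPHA_CELL_MARKERS\thypothetical example set\tGCG\tARX\n"
)
gene_sets = parse_gmt(example)
```

Because rows may have different numbers of genes, the file is not a rectangular table, so a line-by-line split is used here instead of a generic table reader.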