Extra data for our long-read RNA-seq paper (PNAS 2014)



  1. Visualizing reads on the UCSC genome browser

    • all consensus-split-mapped-molecules ("CSMMs" - meaning each intron in the mapping respects the splice site consensus) can be inspected visually on the UCSC browser using this link



  2. consensus-split-mapped-molecules ("CSMMs" - meaning each intron in the mapping respects the splice site consensus) for three cell-lines can be found

    • here for GM12878,
    • here for GM12891,
    • and here for GM12892
    • note that alignments overlapping ribosomal RNA genes have been removed
    • note also that this file contains a comment line for display on the UCSC browser. Uploading all reads to the UCSC browser is possible but may take a long time. We advise copying the comment line and all alignments in a genomic region of interest into a smaller file and uploading that smaller file to the UCSC browser instead
    • on a Linux terminal you could select all alignments located on chr11 between 10 Mb and 11 Mb using the following command (and then submit the result to the UCSC browser)
      1. zcat supplementalFileS1.gff.gz | awk '{if($1=="track"){print; next;} if($1!="chr11" || $5<10000000 || $4>11000000){next;} print ; }' | gzip -c > test.gff.gz
    • similarly treated data for a panel of human organs can be found here
    • similarly treated data generated on the 454 platform for the human cell lines K562 and HelaS3 can be found here. These reads are about half as long (ca. 520 b) but were sequenced ten times as deeply
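
The chr11 command above can be generalized to any region. The following is a minimal sketch, not part of the original pipeline: the function name `extract_region` and the output file name in the usage comment are hypothetical, and it assumes the same file layout as the command above (a leading `track` comment line, chromosome in column 1, start and end in columns 4 and 5). It uses `gzip -dc` instead of `zcat` only because `zcat` behaves differently on some non-Linux systems.

```shell
# Hypothetical helper: print the UCSC track line plus every alignment
# that lies on <chrom> and is not entirely outside <start>..<end>.
# Usage (output file name is just an example):
#   extract_region supplementalFileS1.gff.gz chr11 10000000 11000000 | gzip -c > region.gff.gz
extract_region() {
    gzip -dc "$1" | awk -v chrom="$2" -v start="$3" -v end="$4" '
        # keep the comment line that the UCSC browser needs for display
        $1 == "track" { print; next }
        # same filter as the one-liner: drop other chromosomes and
        # features ending before <start> or starting after <end>
        $1 == chrom && $5 >= start && $4 <= end
    '
}
```

The result is written to standard output uncompressed, so it can be inspected directly or piped through `gzip -c` before uploading.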



  3. Long-read (PacBio) fastq data

    • you will be able to download the raw input data from the SRA under accession SRP036136. You can also follow this link directly: http://www.ncbi.nlm.nih.gov/sra/SRP036136
    • To make your life easier you can also get all the CCS for



  4. a long-read enhanced Gencode annotation

    • a Gencode 15 annotation that we enhanced using PacBio long reads can be found here



  5. Illumina 101-bp-PE fastq data

    • please get these directly from the SRA using this link http://www.ncbi.nlm.nih.gov/sra/SRP036136



  6. Original h5 files