wiki:DataManagement/ProjectData

Version 2 (modified by laurent, 13 years ago)

~/gcc/groups/gonl

Directory structure for data management. On the Groningen cluster this would be target/gpfs2/gcc/groups/gonl. The gonl group has read permission on everything; some folders are also writable.

  • /tools
    • symlink to the /gcc/tools folder (except the folders you don't have access to)
  • /resources
    • symlink to the /gcc/resources folder (except the folders you don't have access to)
  • /home
    • one folder per member of this group
  • /general
    • presentations, publications, other stuff
  • /projects
    • /batchX
      • /rawdata
        • here is a list of fq.gz files
      • /results
        • /alignment
          • here is a list of bam files
        • /stats
          • here is one file per QC tool
        • /snp
          • here is one vcf file per analysis run
      • /logs
      • /intermediate_results
        • whatever is needed, will be empty at end of project
    • /gwas_data
      • /rawdata
        • the originally provided PLINK or GenomeStudio files
      • /results
        • here are the cleaned-up genotypes in the agreed-upon format
      • /logs
    • /groningen_immunochip
      • /rawdata
      • /results
    • /pilot
      • /rawdata
        • /alignment
          • symlinks to the raw alignments used -> /first_batch/rawdata/some.aligned.cleaned.bam
      • /results
        • /snp
        • /indel
        • /cnv
      • /logs
    • /bgi
      • batchX: contains all the analysis data from BGI. Please see...for more information
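
The /projects/batchX layout above could be set up with a small script. The following is only a sketch, not an existing tool on the cluster: the names BATCH_LAYOUT, create_batch_skeleton and leftover_intermediates are hypothetical, and the folder list is copied from the batchX entry above. The second function checks the rule that intermediate_results should be empty at the end of a project.

```python
from pathlib import Path

# Sub-folders of one sequencing batch, as listed under /projects/batchX.
# (Hypothetical constant; adjust if the agreed layout changes.)
BATCH_LAYOUT = [
    "rawdata",                # fq.gz files
    "results/alignment",      # bam files
    "results/stats",          # one file per QC tool
    "results/snp",            # one vcf file per analysis run
    "logs",
    "intermediate_results",   # must be empty at end of project
]


def create_batch_skeleton(projects_dir, batch_name):
    """Create the empty folder tree for one batch and return its path."""
    batch_dir = Path(projects_dir) / batch_name
    for sub in BATCH_LAYOUT:
        # exist_ok=True makes the call safe to repeat on an existing batch.
        (batch_dir / sub).mkdir(parents=True, exist_ok=True)
    return batch_dir


def leftover_intermediates(batch_dir):
    """Return the files still present under intermediate_results."""
    inter = Path(batch_dir) / "intermediate_results"
    return sorted(p for p in inter.rglob("*") if p.is_file())
```

For example, create_batch_skeleton("projects", "batch1") would create projects/batch1 with the sub-folders listed above, and leftover_intermediates("projects/batch1") would return an empty list as long as nothing has been written to intermediate_results. Permissions and the symlinks under /tools and /resources are not handled here.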