What a provenance file is and how to create one

A provenance file is a JSON-formatted file that describes the sources of the input data fed into cellmaps_pipeline. It is required to maintain a chain of history for FAIRSCAPE.

Template provenance file:

{
  "name": "Name for pipeline run",
  "organization-name": "Name of lab or group. Ex: Ideker",
  "project-name": "Name of funding source or project",
  "cell-line": "Name of cell line. Ex: U2OS",
  "treatment": "Name of treatment, Ex: untreated",
  "release": "Name of release. Example: 0.1 alpha",
  "gene-set": "Name of gene set. Example chromatin",
  "edgelist": {
    "name": "Name of dataset",
    "author": "Author of dataset",
    "version": "Version of dataset",
    "date-published": "Date dataset was published",
    "description": "Description of dataset",
    "data-format": "Format of data"
  },
  "baitlist": {
    "name": "Name of dataset",
    "author": "Author of dataset",
    "version": "Version of dataset",
    "date-published": "Date dataset was published",
    "description": "Description of dataset",
    "data-format": "Format of data"
  },
  "samples": {
    "name": "Name of dataset",
    "author": "Author of dataset",
    "version": "Version of dataset",
    "date-published": "Date dataset was published",
    "description": "Description of dataset",
    "data-format": "Format of data"
  },
  "unique": {
    "name": "Name of dataset",
    "author": "Author of dataset",
    "version": "Version of dataset",
    "date-published": "Date dataset was published",
    "description": "Description of dataset",
    "data-format": "Format of data"
  }
}
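Before handing a hand-edited provenance file to the pipeline, it can help to confirm that no key from the template was dropped. The following is a minimal sketch, not part of cellmaps_pipeline: the function names find_missing_keys and check_provenance_file are hypothetical, and the key lists are copied from the template above. Whether the pipeline itself requires exactly this set of keys is an assumption.

```python
import json

# Key names copied from the template above. Whether cellmaps_pipeline
# enforces exactly this set is an assumption of this sketch.
TOP_LEVEL_KEYS = ["name", "organization-name", "project-name",
                  "cell-line", "treatment", "release", "gene-set"]
DATASET_SECTIONS = ["edgelist", "baitlist", "samples", "unique"]
DATASET_KEYS = ["name", "author", "version",
                "date-published", "description", "data-format"]


def find_missing_keys(provenance):
    """Return template keys absent from provenance (hypothetical helper).

    Top-level keys are reported by name, dataset fields as
    "section.field", and a section that is missing or is not a
    JSON object is reported by its section name alone.
    """
    missing = [key for key in TOP_LEVEL_KEYS if key not in provenance]
    for section in DATASET_SECTIONS:
        dataset = provenance.get(section)
        if not isinstance(dataset, dict):
            missing.append(section)
            continue
        missing.extend(f"{section}.{key}"
                       for key in DATASET_KEYS if key not in dataset)
    return missing


def check_provenance_file(path):
    """Load a provenance JSON file and list its missing template keys."""
    with open(path) as handle:
        return find_missing_keys(json.load(handle))
```

Calling check_provenance_file("provenance.json") returns an empty list when every key from the template is present. The check looks only at key names, not at whether the placeholder values were replaced.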

The above template provenance file can be created in a few ways:

By copying the JSON text from the help output of cellmaps_pipelinecmd.py, like so:

cellmaps_pipelinecmd.py -h

Or by writing the JSON directly to a file (provenance.json in the example below) with this command-line invocation:

cellmaps_pipelinecmd.py . --example_provenance > provenance.json
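Once provenance.json exists, its placeholder strings still need to be replaced with real values. That can be done in any text editor. For scripted runs, the sketch below shows one possible approach using only the Python standard library. The function fill_provenance and its parameters are hypothetical and are not part of cellmaps_pipeline.

```python
import json


def fill_provenance(template_path, output_path, top_level, datasets):
    """Replace placeholder values in a template provenance file.

    Hypothetical helper, not a cellmaps_pipeline API.
    top_level: dict mapping top-level keys (e.g. "cell-line") to values.
    datasets:  dict mapping a section name (e.g. "edgelist") to a dict
               of the fields to set within that section.
    Keys not mentioned keep whatever value the template had.
    """
    with open(template_path) as handle:
        provenance = json.load(handle)

    # Overwrite only the top-level keys the caller supplied.
    provenance.update(top_level)

    # Merge per-dataset fields into each section, creating it if absent.
    for section, fields in datasets.items():
        provenance.setdefault(section, {}).update(fields)

    with open(output_path, "w") as handle:
        json.dump(provenance, handle, indent=2)
    return provenance
```

For example, fill_provenance("provenance.json", "my_provenance.json", {"cell-line": "U2OS", "treatment": "untreated"}, {"edgelist": {"data-format": "tsv"}}) would write a copy with those three values set. The output file name and the "tsv" value are illustrative only.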

Or, if the input datasets are already registered with FAIRSCAPE:

TODO

Note

FAIRSCAPE registration documentation is coming soon…