Programmatic usage
The pipeline can be invoked programmatically in two ways: serially, running
every step in order via ProgrammaticPipelineRunner (as seen in the example
below), or via SLURM using SLURMPipelineRunner.
The example below runs the pipeline using example data. This may take up to an hour to run.
import os
import json
from cellmaps_pipeline.runner import ProgrammaticPipelineRunner
from cellmaps_pipeline.runner import CellmapsPipeline

# load the provenance as a dict
with open(os.path.join('examples', 'provenance.json'), 'r') as f:
    json_prov = json.load(f)

runner = ProgrammaticPipelineRunner(outdir='testrun',
                                    samples=os.path.join('examples', 'samples.csv'),
                                    unique=os.path.join('examples', 'unique.csv'),
                                    edgelist=os.path.join('examples', 'edgelist.tsv'),
                                    baitlist=os.path.join('examples', 'baitlist.tsv'),
                                    model_path='https://github.com/CellProfiling/densenet/releases/download/v0.1.0/external_crop512_focal_slov_hardlog_class_densenet121_dropout_i768_aug2_5folds_fold0_final.pth',
                                    provenance=json_prov,
                                    ppi_cutoffs=[0.001, 0.01],
                                    input_data_dict={})

pipeline = CellmapsPipeline(outdir='testrun',
                            runner=runner,
                            input_data_dict={})

print('Status code: ' + str(pipeline.run()))
Note
The example above assumes the repository has been cloned locally and that the Python interpreter is started from the base directory of the repository, so that the relative examples paths resolve.
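Because the run can take up to an hour, it may be worth confirming that the example input files resolve from the current working directory before starting it. The following is a minimal sketch using only the Python standard library. The helper name find_missing_inputs and the EXAMPLE_INPUTS list are hypothetical and are not part of cellmaps_pipeline; the file names are taken from the example above.

```python
import os

# Input files referenced by the example above, relative to the examples directory.
EXAMPLE_INPUTS = ['provenance.json', 'samples.csv', 'unique.csv',
                  'edgelist.tsv', 'baitlist.tsv']


def find_missing_inputs(examples_dir='examples', names=EXAMPLE_INPUTS):
    """Return the paths under examples_dir that are not regular files.

    Hypothetical helper for illustration; not part of cellmaps_pipeline.
    """
    paths = [os.path.join(examples_dir, name) for name in names]
    return [p for p in paths if not os.path.isfile(p)]


if __name__ == '__main__':
    missing = find_missing_inputs()
    if missing:
        print('Missing input files: ' + ', '.join(missing))
    else:
        print('All example inputs found')
```

If the list comes back non-empty, the interpreter was most likely not started from the base directory of the repository.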