`epiout.dataset`

Module Contents

Dataset object to read bed and bam files and count reads for each peak.

class epiout.dataset.EpiOutDataset(bed, alignments, njobs=1, slack=200, subset_chrom=False)

Dataset object to read bed and bam files and count reads for each peak.

Parameters

bed – path to bed file or pyranges object.
alignments – path to a metadata file, a list of paths to bam files, or a dict mapping sample name to bam file path.
njobs – number of jobs to run in parallel during counting.
slack – slack in bp used when merging peaks (200 by default).
subset_chrom – subset chromosomes to only those in the bam files.
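
The list and dict forms accepted by alignments can be brought to a single sample-name-to-path mapping. The following is a minimal sketch of that idea, not the library's implementation: the helper name normalize_alignments is hypothetical, deriving a sample name from the file stem is an assumption, and the metadata-file form is left out because its layout is not described here.

```python
from pathlib import Path


def normalize_alignments(alignments):
    """Return a {sample_name: bam_path} dict from a list or dict of paths.

    Hypothetical helper for illustration; not part of epiout.
    """
    if isinstance(alignments, dict):
        # Already keyed by sample name; only coerce keys and paths to str.
        return {str(name): str(path) for name, path in alignments.items()}
    if isinstance(alignments, (list, tuple)):
        # Assumption: the file name without its extension is the sample name.
        return {Path(p).stem: str(p) for p in alignments}
    # The metadata file format is not documented here, so it is not sketched.
    raise NotImplementedError("metadata file parsing is not sketched here")


samples = normalize_alignments(["data/s1.bam", "data/s2.bam"])
print(samples)
```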

read_bed(self, bed, slack=200, subset_chrom=False)

Read the bed file, merge overlapping peaks with a slack of slack bp (200 by default), subset chromosomes to chr1, chr2, …, chrX, chrY, chrM if subset_chrom is True, and sort by chromosome and start position.
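
Merging with slack can be illustrated in plain Python. This is a sketch under stated assumptions rather than the method's actual code: the function name merge_with_slack is hypothetical, peaks are (chrom, start, end) tuples, and two peaks on the same chromosome are merged when the gap between them is at most slack bp.

```python
def merge_with_slack(peaks, slack=200):
    """Merge (chrom, start, end) peaks whose gap is at most `slack` bp.

    Hypothetical helper for illustration; not part of epiout.
    """
    merged = []
    # Sorting tuples orders by chromosome name (lexicographically), then start.
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= slack:
            # Close enough to the previous peak: extend it.
            last = merged[-1]
            merged[-1] = (chrom, last[1], max(last[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


peaks = [
    ("chr1", 100, 300),
    ("chr1", 450, 600),    # 150 bp after the first peak, so it is merged
    ("chr1", 1000, 1200),  # 400 bp after the merged peak, so it stays separate
    ("chr2", 50, 80),
]
merged = merge_with_slack(peaks)
print(merged)
```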

read_alignments(self, alignments)

static count_reads(gr, bam, mapq=10)

Read bam file and count reads for each peak.
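
The counting step can be sketched without a bam reader. The example below is illustrative only: count_reads_in_peaks is a hypothetical name, reads are given as (chrom, start, end, mapq) tuples instead of being read from a bam file, intervals are treated as half-open, and mapq is assumed to be a minimum mapping quality threshold.

```python
def count_reads_in_peaks(peaks, reads, mapq=10):
    """Count reads overlapping each peak, skipping low-quality reads.

    Hypothetical helper for illustration; not part of epiout.
    """
    counts = [0] * len(peaks)
    for chrom, r_start, r_end, r_mapq in reads:
        # Assumption: reads below the mapping quality threshold are ignored.
        if r_mapq < mapq:
            continue
        for i, (p_chrom, p_start, p_end) in enumerate(peaks):
            # Half-open interval overlap test on the same chromosome.
            if chrom == p_chrom and r_start < p_end and r_end > p_start:
                counts[i] += 1
    return counts


peaks = [("chr1", 100, 200), ("chr1", 500, 600)]
reads = [
    ("chr1", 150, 180, 30),  # inside the first peak
    ("chr1", 190, 210, 5),   # overlaps the first peak but fails the mapq filter
    ("chr1", 590, 640, 10),  # overlaps the second peak
    ("chr2", 100, 200, 60),  # different chromosome
]
counts = count_reads_in_peaks(peaks, reads)
print(counts)
```

The nested loop is quadratic and only meant to show the logic; an indexed bam reader would fetch reads per region instead.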

Parameters

static _filters(df_raw, min_count=100, min_percent_sample=0.5)

Parameters

min_count – minimum count required in at least one sample.
min_num_sample – minimum number of samples in which a peak has at least one read.
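
One possible reading of these two thresholds is sketched below. It is an assumption-laden illustration, not the library's code: filter_peaks is a hypothetical name, counts are held in a plain dict of per-sample lists instead of a data frame, and min_percent_sample is interpreted as the minimum fraction of samples with at least one read.

```python
def filter_peaks(counts, min_count=100, min_percent_sample=0.5):
    """Keep peaks that pass both thresholds.

    Hypothetical helper for illustration; not part of epiout.
    `counts` maps a peak name to a list of read counts, one per sample.
    """
    kept = {}
    for peak, row in counts.items():
        # At least one sample must reach min_count.
        has_min_count = max(row) >= min_count
        # Assumption: fraction of samples with one or more reads.
        frac_with_reads = sum(c > 0 for c in row) / len(row)
        if has_min_count and frac_with_reads >= min_percent_sample:
            kept[peak] = row
    return kept


counts = {
    "peak_a": [120, 3, 0, 8],    # passes both thresholds
    "peak_b": [150, 0, 0, 0],    # only 25% of samples have reads
    "peak_c": [40, 35, 20, 10],  # no sample reaches 100 reads
}
kept = filter_peaks(counts)
print(sorted(kept))
```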

count(self, mapq=10, min_count=100, min_percent_sample=0.5)

Count reads for each peak, then filter peaks by minimum count and by the minimum number of samples with at least one read.

Parameters