pkgsrc-WIP-changes archive


wip/nanofilt: import nanofilt version 2.8.0.20210223 as wip/nanofilt



Module Name:	pkgsrc-wip
Committed By:	Brook Milligan <brook%nmsu.edu@localhost>
Pushed By:	brook
Date:		Mon Jun 7 18:24:17 2021 -0600
Changeset:	1233821547c4237853a764c76510f206f69885c6

Modified Files:
	Makefile
Added Files:
	nanofilt/DESCR
	nanofilt/Makefile
	nanofilt/PLIST
	nanofilt/distinfo

Log Message:
wip/nanofilt: import nanofilt version 2.8.0.20210223 as wip/nanofilt

Filter and trim long-read sequencing data.

Filtering on quality and/or read length, and optional trimming after
passing filters.  Reads from stdin, writes to stdout.  Optionally
reads directly from an uncompressed file specified on the command
line.  Intended uses:

- directly after fastq extraction
- prior to mapping
- in a stream between extraction and mapping

Because the calculated read quality can differ from the quality
summarized by albacore, the script has accepted an optional
`--summary` argument since v1.1.0.  Given the sequencing_summary.txt
file from albacore, it filters on the quality scores from that
summary, which is also faster.

### Examples

gunzip -c reads.fastq.gz | NanoFilt -q 10 -l 500 --headcrop 50 \
       | minimap2 genome.fa - | samtools sort -O BAM -@24 -o alignment.bam -
gunzip -c reads.fastq.gz | NanoFilt -q 12 --headcrop 75 \
       | gzip > trimmed-reads.fastq.gz
gunzip -c reads.fastq.gz | NanoFilt -q 10 | gzip > highQuality-reads.fastq.gz

To see a diff of this commit:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=1233821547c4237853a764c76510f206f69885c6

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

diffstat:
 Makefile          |  1 +
 nanofilt/DESCR    | 24 ++++++++++++++++++++++++
 nanofilt/Makefile | 27 +++++++++++++++++++++++++++
 nanofilt/PLIST    | 20 ++++++++++++++++++++
 nanofilt/distinfo |  6 ++++++
 5 files changed, 78 insertions(+)

diffs:
diff --git a/Makefile b/Makefile
index 09124e1235..61cb336f07 100644
--- a/Makefile
+++ b/Makefile
@@ -2463,6 +2463,7 @@ SUBDIR+=	n2n
 SUBDIR+=	naev
 SUBDIR+=	nag
 SUBDIR+=	nagios-plugin-mysql_health
+SUBDIR+=	nanofilt
 SUBDIR+=	nanoget
 SUBDIR+=	nanomath
 SUBDIR+=	nanostat
diff --git a/nanofilt/DESCR b/nanofilt/DESCR
new file mode 100644
index 0000000000..ec02fd5ba5
--- /dev/null
+++ b/nanofilt/DESCR
@@ -0,0 +1,24 @@
+Filter and trim of long read sequencing data.
+
+Filtering on quality and/or read length, and optional trimming after
+passing filters.  Reads from stdin, writes to stdout.  Optionally
+reads directly from an uncompressed file specified on the command
+line.  Intended uses:
+
+- directly after fastq extraction
+- prior to mapping
+- in a stream between extraction and mapping
+
+Due to a discrepancy between calculated read quality and the quality
+as summarized by albacore this script takes since v1.1.0 optionally
+also a `--summary` argument. Using this argument with the
+sequencing_summary.txt file from albacore will do the filtering using
+the quality scores from the summary. It's also faster.
+
+### Examples
+
+gunzip -c reads.fastq.gz | NanoFilt -q 10 -l 500 --headcrop 50 \
+       | minimap2 genome.fa - | samtools sort -O BAM -@24 -o alignment.bam -
+gunzip -c reads.fastq.gz | NanoFilt -q 12 --headcrop 75 \
+       | gzip > trimmed-reads.fastq.gz
+gunzip -c reads.fastq.gz | NanoFilt -q 10 | gzip > highQuality-reads.fastq.gz
diff --git a/nanofilt/Makefile b/nanofilt/Makefile
new file mode 100644
index 0000000000..9ff5eea3fa
--- /dev/null
+++ b/nanofilt/Makefile
@@ -0,0 +1,27 @@
+# $NetBSD$
+
+GITHUB_PROJECT=	nanofilt
+GITHUB_TAG=	76147c1
+DISTNAME=	NanoFilt-2.8.0.20210223
+CATEGORIES=	biology python
+MASTER_SITES=	${MASTER_SITE_GITHUB:=wdecoster/}
+DIST_SUBDIR=	${GITHUB_PROJECT}
+
+MAINTAINER=	pkgsrc-users%NetBSD.org@localhost
+HOMEPAGE=	https://github.com/wdecoster/nanofilt
+COMMENT=	Filtering and trimming of Oxford Nanopore sequencing data
+LICENSE=	mit
+
+DEPENDS+=	${PYPKGPREFIX}-biopython>=0:../../biology/py-biopython
+DEPENDS+=	${PYPKGPREFIX}-pandas>=0.22.0:../../math/py-pandas
+
+WRKSRC=		${WRKDIR}/nanofilt-76147c18855e7a1df11c87e91cf587dd9bd72a6d
+USE_LANGUAGES=	# none
+
+EGG_NAME=	${DISTNAME:C/\.[[:digit:]]+$$//}
+
+post-install:
+	rm -r ${DESTDIR}${PREFIX}/${PYSITELIB}/scripts
+
+.include "../../lang/python/egg.mk"
+.include "../../mk/bsd.pkg.mk"
diff --git a/nanofilt/PLIST b/nanofilt/PLIST
new file mode 100644
index 0000000000..f825640817
--- /dev/null
+++ b/nanofilt/PLIST
@@ -0,0 +1,20 @@
+@comment $NetBSD$
+bin/NanoFilt
+${PYSITELIB}/${EGG_INFODIR}/PKG-INFO
+${PYSITELIB}/${EGG_INFODIR}/SOURCES.txt
+${PYSITELIB}/${EGG_INFODIR}/dependency_links.txt
+${PYSITELIB}/${EGG_INFODIR}/entry_points.txt
+${PYSITELIB}/${EGG_INFODIR}/requires.txt
+${PYSITELIB}/${EGG_INFODIR}/top_level.txt
+${PYSITELIB}/nanofilt/NanoFilt.py
+${PYSITELIB}/nanofilt/NanoFilt.pyc
+${PYSITELIB}/nanofilt/NanoFilt.pyo
+${PYSITELIB}/nanofilt/__init__.py
+${PYSITELIB}/nanofilt/__init__.pyc
+${PYSITELIB}/nanofilt/__init__.pyo
+${PYSITELIB}/nanofilt/utils.py
+${PYSITELIB}/nanofilt/utils.pyc
+${PYSITELIB}/nanofilt/utils.pyo
+${PYSITELIB}/nanofilt/version.py
+${PYSITELIB}/nanofilt/version.pyc
+${PYSITELIB}/nanofilt/version.pyo
diff --git a/nanofilt/distinfo b/nanofilt/distinfo
new file mode 100644
index 0000000000..91e3c4486d
--- /dev/null
+++ b/nanofilt/distinfo
@@ -0,0 +1,6 @@
+$NetBSD$
+
+SHA1 (nanofilt/NanoFilt-2.8.0.20210223-76147c1.tar.gz) = 56a715bae5a6eaf142f486afbb6305f518ea25d5
+RMD160 (nanofilt/NanoFilt-2.8.0.20210223-76147c1.tar.gz) = 9fc8b0a080b696c62284eeed40fd3f85a5328d32
+SHA512 (nanofilt/NanoFilt-2.8.0.20210223-76147c1.tar.gz) = 8e5a53999859d372552055403110480f74ae77b1ebffaa5d161ca4260f792577ec4b8aed71c45b275e4850529bdcfcf05f0c3046c705e99302c8f1567fe2098f
+Size (nanofilt/NanoFilt-2.8.0.20210223-76147c1.tar.gz) = 19024 bytes
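
Note on the EGG_NAME line in nanofilt/Makefile: the `:C` variable
modifier applies a regular-expression substitution that strips the
trailing numeric component (apparently a snapshot date) from DISTNAME.
As a rough illustration only, the same substitution can be reproduced
in plain shell with sed.  The variable names below are hypothetical and
not part of the package, and sed takes a single `$` where the Makefile
needs `$$` to escape it.

```shell
# Illustrative sketch: mirrors ${DISTNAME:C/\.[[:digit:]]+$$//}
# from nanofilt/Makefile using sed instead of make.
distname="NanoFilt-2.8.0.20210223"

# Drop a final ".<digits>" component, leaving the upstream version.
egg_name=$(printf '%s\n' "$distname" | sed -E 's/\.[[:digit:]]+$//')

printf '%s\n' "$egg_name"
```

Under these assumptions the script prints NanoFilt-2.8.0, which is
presumably the name the egg-info directory listed in PLIST is built
from.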

