pkgsrc-WIP-changes archive

GSAlign: Import GSAlign-1.0.22 as wip/GSAlign



Module Name:	pkgsrc-wip
Committed By:	Brook Milligan <brook%nmsu.edu@localhost>
Pushed By:	brook
Date:		Mon Apr 4 15:32:12 2022 -0600
Changeset:	9673bbde77906cb5b4c271d3ef5c6017d60209f8

Modified Files:
	Makefile
Added Files:
	GSAlign/DESCR
	GSAlign/Makefile
	GSAlign/PLIST
	GSAlign/distinfo

Log Message:
GSAlign: Import GSAlign-1.0.22 as wip/GSAlign

GSAlign: an ultra-fast sequence alignment algorithm for intra-species
genome comparison

Personal genomics and comparative genomics are two fields of growing
importance in clinical practice and genome research. Both fields
require sequence alignment to discover sequence conservation and
structural variation. Although many methods have been developed for
genome sequence alignment, some are designed for small genome
comparison and others are inefficient for large genome comparison.
Here, we present GSAlign to handle large genome comparison
efficiently. GSAlign includes three unique features: 1) it is the
first attempt to use the Burrows-Wheeler Transform for genome sequence
alignment; 2) it supports parallel computing; 3) it adopts a
divide-and-conquer strategy to separate a query sequence into regions
that are easy to align and regions that require gapped alignment. With
these features, we demonstrate that GSAlign is efficient and sensitive
in finding both exact matches and differences between two genome
sequences, and that it is much faster than existing state-of-the-art
methods.
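
As an illustration of feature 1 only, here is a minimal Python sketch of
the Burrows-Wheeler Transform and the backward search that BWT-based
indexes use to count exact matches. It is not taken from the GSAlign
sources: the function names bwt and count_occurrences are hypothetical,
and the sorted-rotation construction is a toy approach that is only
practical for short strings, not for whole genomes.

```python
def bwt(text: str) -> str:
    """Burrows-Wheeler Transform via sorted rotations (toy construction)."""
    s = text + "$"  # "$" is a sentinel that sorts before A, C, G and T
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rot[-1] for rot in rotations)


def count_occurrences(bwt_text: str, pattern: str) -> int:
    """Count exact matches of pattern using backward search over the BWT."""
    # first[c] = number of characters in the text that sort before c,
    # i.e. the row where the block of rotations starting with c begins.
    first = {}
    total = 0
    for c in sorted(set(bwt_text)):
        first[c] = total
        total += bwt_text.count(c)

    def rank(c: str, i: int) -> int:
        # Occurrences of c in bwt_text[:i]; a real index would precompute this.
        return bwt_text[:i].count(c)

    lo, hi = 0, len(bwt_text)  # half-open range of matching rows
    for c in reversed(pattern):  # extend the match one character to the left
        if c not in first:
            return 0
        lo = first[c] + rank(c, lo)
        hi = first[c] + rank(c, hi)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    index = bwt("ACGTACGT")  # hypothetical toy reference sequence
    print(count_occurrences(index, "ACG"))
    print(count_occurrences(index, "TT"))
```

Each step of the backward search narrows a range of rows in the sorted
rotation matrix, so the number of steps depends on the pattern length
rather than on the reference length. A production index replaces the
linear-time rank scan above with precomputed occurrence tables.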

# Changes
- 1.0.0: First release version
- 1.0.1: Fixed a bug in <<CheckMemoryUsage>>
- 1.0.2: Fixed a bug when running with multi-threads on Mac PCs
- 1.0.3: Added the Average Nucleotide Identity (ANI) output
- 1.0.4: Fixed a bug on reading input files
- 1.0.5: Fixed a bug on reading input files
- 1.0.6: Modified GSAlign's output and fixed type casting
- 1.0.7: Integrated bwt_index into GSAlign
- 1.0.8: Use bwa_idx_build to build the bwt index instead of using bwt_index
- 1.0.9: Fixed a bug on finding gnuplot's path
- 1.0.10: Fixed a bug on reading input files
- 1.0.12: Added an option "-gp" to specify the path of gnuplot
- 1.0.13: Fixed a bug on reading input files
- 1.0.16: Fixed a bug on reporting coordinate of alignment
- 1.0.17: Fixed a bug on spanning reference sequences
- 1.0.18: Fixed a bug on spanning reference sequences
- 1.0.19: Added the option of "-one" to set one-on-one alignment

To see a diff of this commit:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=9673bbde77906cb5b4c271d3ef5c6017d60209f8

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

diffstat:
 GSAlign/DESCR    | 19 +++++++++++++++++++
 GSAlign/Makefile | 26 ++++++++++++++++++++++++++
 GSAlign/PLIST    |  3 +++
 GSAlign/distinfo |  5 +++++
 Makefile         |  1 +
 5 files changed, 54 insertions(+)

diffs:
diff --git a/GSAlign/DESCR b/GSAlign/DESCR
new file mode 100644
index 0000000000..5b9e5c1415
--- /dev/null
+++ b/GSAlign/DESCR
@@ -0,0 +1,19 @@
+GSAlign: an ultra-fast sequence alignment algorithm for intra-species
+genome comparison
+
+Personal genomics and comparative genomics are two fields that are
+more and more important in clinical practices and genome
+researches. Both fields require sequence alignment to discover
+sequence conservation and structural variation. Though many methods
+have been developed to handle genome sequence alignment, some are
+designed for small genome comparison while some are not efficient for
+large genome comparison. Here, we present GSAlign to handle large
+genome comparison efficiently. GSAlign includes three unique features:
+1) it is the first attempt to use Burrows-Wheeler Transform on genome
+sequence alignment; 2) it supports parallel computing; 3) it adopts a
+divide-and-conquer strategy to separate a query sequence into regions
+that are easy to align and regions that require gapped alignment. With
+all these features, we demonstrated GSAlign is very efficient and
+sensitive in finding both the exact matches and differences between
+two genome sequences and it is much faster than existing
+state-of-the-art methods.
diff --git a/GSAlign/Makefile b/GSAlign/Makefile
new file mode 100644
index 0000000000..8b00f787e0
--- /dev/null
+++ b/GSAlign/Makefile
@@ -0,0 +1,26 @@
+# $NetBSD$
+
+GITHUB_PROJECT=	GSAlign
+GITHUB_TAG=	refs/tags/1.0.22
+DISTNAME=	1.0.22
+PKGNAME=	${GITHUB_PROJECT}-${DISTNAME}
+CATEGORIES=	biology
+MASTER_SITES=	${MASTER_SITE_GITHUB:=hsinnan75/}
+DIST_SUBDIR=	${GITHUB_PROJECT}
+
+MAINTAINER=	pkgsrc-users%NetBSD.org@localhost
+HOMEPAGE=	https://github.com/hsinnan75/GSAlign/
+COMMENT=	Ultra-fast intra-species sequence alignment
+LICENSE=	mit
+
+WRKSRC=		${WRKDIR}/GSAlign-1.0.22
+USE_TOOLS+=	gmake
+USE_LANGUAGES=	c c++
+
+INSTALLATION_DIRS+=	bin
+
+do-install:
+	${INSTALL_PROGRAM} ${WRKSRC}/bin/GSAlign ${DESTDIR}${PREFIX}/bin
+	${INSTALL_PROGRAM} ${WRKSRC}/bin/bwt_index ${DESTDIR}${PREFIX}/bin
+
+.include "../../mk/bsd.pkg.mk"
diff --git a/GSAlign/PLIST b/GSAlign/PLIST
new file mode 100644
index 0000000000..592c087792
--- /dev/null
+++ b/GSAlign/PLIST
@@ -0,0 +1,3 @@
+@comment $NetBSD$
+bin/GSAlign
+bin/bwt_index
diff --git a/GSAlign/distinfo b/GSAlign/distinfo
new file mode 100644
index 0000000000..901bd12892
--- /dev/null
+++ b/GSAlign/distinfo
@@ -0,0 +1,5 @@
+$NetBSD$
+
+BLAKE2s (GSAlign/1.0.22.tar.gz) = 19cb2a58fda632e0347e890a1548e9e5b1dd03069362f270542908aed16d708e
+SHA512 (GSAlign/1.0.22.tar.gz) = 028daf9e245e7c7a3e5333f3689d37d289a81f3460d2f63810e6c9ad42d2a0cf19ae9d847d23d18c006abf9b9ae951838e2e0123b7c730ec089a5fc287bd1874
+Size (GSAlign/1.0.22.tar.gz) = 6281514 bytes
diff --git a/Makefile b/Makefile
index d68bcee8d6..6e0b63635e 100644
--- a/Makefile
+++ b/Makefile
@@ -17,6 +17,7 @@ SUBDIR+=	ETL
 SUBDIR+=	FLIF
 SUBDIR+=	FLIF-git
 SUBDIR+=	GNUMail-pgp
+SUBDIR+=	GSAlign
 SUBDIR+=	GSCommander
 SUBDIR+=	Geomyidae-git
 SUBDIR+=	I2util

