Re: broken math/py-numba and devel/py-llvmlite, even broken approach with llvm?
On Thu, 18 Jan 2024 15:45:15 +0100,
"Dr. Thomas Orgis" <thomas.orgis%uni-hamburg.de@localhost> wrote:
> I think I'll try to rig up a build that does what upstream wants here,
> building llvm just for llvmlite and linking it in statically. The current
> patches may be minimal, but running llvmlite with a stock llvm build,
> let alone one with a differing version, is explicitly not supported.
Please see and judge/test the attached changes that make py-llvmlite and
py-numba at least build for me on a Linux box. LLVM is now built
specifically for llvmlite, just as upstream does it.
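
For the impatient, this is roughly what the new pre-configure target
does (a sketch only; the authoritative flag list and commands are in
the Makefile diff below):

  cd $WRKDIR
  for f in files/llvm14-*.patch; do patch -Np2 -d llvm-14.0.6.src < $f; done
  ln -s llvm-14.0.6.src llvm    # likewise for lld and libunwind
  mkdir build && cd build
  cmake -G'Unix Makefiles' ${LLVM_CMAKE_ARGS} ../llvm
  make && make check-llvm-unit && make install
  # llvmlite's ffi build is then pointed at the private tree via
  # LLVM_CONFIG=$WRKDIR/llvm-inst/bin/llvm-config (see MAKE_ENV below).
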
OpenMP being disabled in the numba package irks me. I'll probably just
patch that out for now for my local builds, as I just have OpenMP
support in my GCC toolchain and don't feel like messing with the
openmp package.
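
A quick smoke test after installing could be something like

  python3 -c 'from llvmlite import binding as llvm; print(llvm.llvm_version_info)'

which should print the version of the statically linked LLVM, (14, 0, 6)
in this case (assuming I remember the attribute name right).
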
Alrighty then,
Thomas
--
Dr. Thomas Orgis
HPC @ Universität Hamburg
Index: devel/py-llvmlite/Makefile
===================================================================
RCS file: /cvsroot/pkgsrc/devel/py-llvmlite/Makefile,v
retrieving revision 1.24
diff -u -r1.24 Makefile
--- devel/py-llvmlite/Makefile 15 Aug 2022 19:14:43 -0000 1.24
+++ devel/py-llvmlite/Makefile 19 Jan 2024 23:50:33 -0000
@@ -1,8 +1,9 @@
# $NetBSD: Makefile,v 1.24 2022/08/15 19:14:43 wiz Exp $
-DISTNAME= llvmlite-0.38.1
+DISTNAME= llvmlite-0.41.1
PKGNAME= ${PYPKGPREFIX}-${DISTNAME}
CATEGORIES= devel python
+
MASTER_SITES= ${MASTER_SITE_PYPI:=l/llvmlite/}
MAINTAINER= pkgsrc-users%NetBSD.org@localhost
@@ -10,20 +11,89 @@
COMMENT= Lightweight LLVM Python binding for writing JIT compilers
LICENSE= 2-clause-bsd
-USE_LANGUAGES= c++14
+# Statically link in a purpose-built LLVM, as upstream urges us to do.
+# Upstream supports only one specific LLVM version per llvmlite release,
+# and only with their patches applied.
+LLVM_VERSION= 14.0.6
+DISTFILES= ${DEFAULT_DISTFILES}
+DISTFILES+= llvm-${LLVM_VERSION}.src.tar.xz
+DISTFILES+= lld-${LLVM_VERSION}.src.tar.xz
+DISTFILES+= libunwind-${LLVM_VERSION}.src.tar.xz
+
+LLVM_SITE= https://github.com/llvm/llvm-project/releases/download/llvmorg-${LLVM_VERSION}/
+SITES.llvm-${LLVM_VERSION}.src.tar.xz= ${LLVM_SITE}
+SITES.lld-${LLVM_VERSION}.src.tar.xz= ${LLVM_SITE}
+SITES.libunwind-${LLVM_VERSION}.src.tar.xz= ${LLVM_SITE}
+
+USE_LANGUAGES= c c++
+USE_CXX_FEATURES= c++14
+# Just for the LLVM build.
+USE_TOOLS= cmake
+
+# See
+# https://github.com/numba/llvmlite/blob/main/conda-recipes/llvmdev/build.sh
+# for the procedure. This is what
+# https://llvmlite.readthedocs.io/en/latest/admin-guide/install.html
+# points to. This needs to be matched up with the correct llvmlite
+# release, as they do not include it in the tarball. Python people think
+# building stuff from source is hard and keep it so :-/
+# I kept some upstream comments inline.
+
+LLVM_CMAKE_ARGS= -DCMAKE_INSTALL_PREFIX=${WRKDIR}/llvm-inst
+LLVM_CMAKE_ARGS+= -DCMAKE_BUILD_TYPE:STRING=Release
+LLVM_CMAKE_ARGS+= -DLLVM_ENABLE_PROJECTS:STRING=lld
+# We explicitly want static linking.
+LLVM_CMAKE_ARGS+= -DBUILD_SHARED_LIBS:BOOL=OFF
+LLVM_CMAKE_ARGS+= -DLLVM_ENABLE_ASSERTIONS:BOOL=ON
+LLVM_CMAKE_ARGS+= -DLINK_POLLY_INTO_TOOLS:BOOL=ON
+# Don't really require libxml2. Turn it off explicitly to avoid accidentally linking to system libs
+LLVM_CMAKE_ARGS+= -DLLVM_ENABLE_LIBXML2:BOOL=OFF
+# Urgh, llvm *really* wants to link to ncurses / terminfo and we *really* do not want it to.
+LLVM_CMAKE_ARGS+= -DHAVE_TERMINFO_CURSES=OFF
+LLVM_CMAKE_ARGS+= -DLLVM_ENABLE_TERMINFO=OFF
+# Sometimes these are reported as unused. Whatever.
+LLVM_CMAKE_ARGS+= -DHAVE_TERMINFO_NCURSES=OFF
+LLVM_CMAKE_ARGS+= -DHAVE_TERMINFO_NCURSESW=OFF
+LLVM_CMAKE_ARGS+= -DHAVE_TERMINFO_TERMINFO=OFF
+LLVM_CMAKE_ARGS+= -DHAVE_TERMINFO_TINFO=OFF
+LLVM_CMAKE_ARGS+= -DHAVE_TERMIOS_H=OFF
+LLVM_CMAKE_ARGS+= -DCLANG_ENABLE_LIBXML=OFF
+LLVM_CMAKE_ARGS+= -DLIBOMP_INSTALL_ALIASES=OFF
+LLVM_CMAKE_ARGS+= -DLLVM_ENABLE_RTTI=OFF
+# Not sure if this should be adapted for pkgsrc.
+LLVM_CMAKE_ARGS+= -DLLVM_TARGETS_TO_BUILD=all
+LLVM_CMAKE_ARGS+= -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=WebAssembly
+# for llvm-lit
+LLVM_CMAKE_ARGS+= -DLLVM_INCLUDE_UTILS=ON
+# doesn't build without the rest of LLVM project
+LLVM_CMAKE_ARGS+= -DLLVM_INCLUDE_BENCHMARKS:BOOL=OFF
+LLVM_CMAKE_ARGS+= -DLLVM_INCLUDE_DOCS=OFF
+LLVM_CMAKE_ARGS+= -DLLVM_INCLUDE_EXAMPLES=OFF
-# https://github.com/numba/llvmlite/pull/802
-BROKEN= "No support for llvm 14 yet."
-# officially supports llvm 11 as of 0.37.0
-MAKE_ENV+= LLVMLITE_SKIP_LLVM_VERSION_CHECK=1
+MAKE_ENV+= LLVM_CONFIG=${WRKDIR}/llvm-inst/bin/llvm-config
# unable to pass LLVM bit-code files to linker
MAKE_ENV.NetBSD+= CXX_FLTO_FLAGS=
MAKE_ENV.NetBSD+= LD_FLTO_FLAGS=
+# Python 3.8 and later are fine.
PYTHON_VERSIONS_INCOMPATIBLE= 27
+# The LLVM build detects lots of stuff outside the build sandbox
+# (a Python it likes, git, ...); just hoping that this does not matter
+# much for the static lib being used by llvmlite.
+
pre-configure:
+ cd ${WRKDIR}/llvm-${LLVM_VERSION}.src && \
+ for f in ${FILESDIR}/llvm*.patch; do patch -Np2 < $$f; done
+ ${LN} -s llvm-${LLVM_VERSION}.src ${WRKDIR}/llvm
+ ${LN} -s lld-${LLVM_VERSION}.src ${WRKDIR}/lld
+ ${LN} -s libunwind-${LLVM_VERSION}.src ${WRKDIR}/libunwind
+ cd ${WRKDIR} && mkdir build && cd build && \
+ cmake -G'Unix Makefiles' ${LLVM_CMAKE_ARGS} ../llvm && \
+ ${MAKE} -j${MAKE_JOBS} && \
+ ${MAKE} -j${MAKE_JOBS} check-llvm-unit && \
+ ${MAKE} install
${SED} -e 's/ -stdlib=libc++//' ${WRKSRC}/ffi/Makefile.freebsd > ${WRKSRC}/ffi/Makefile.netbsd
.include "../../mk/bsd.prefs.mk"
@@ -34,6 +104,5 @@
${DESTDIR}${PREFIX}/${PYSITELIB}/llvmlite/binding/libllvmlite.dylib
.endif
-.include "../../lang/llvm/buildlink3.mk"
.include "../../lang/python/egg.mk"
.include "../../mk/bsd.pkg.mk"
Index: devel/py-llvmlite/PLIST
===================================================================
RCS file: /cvsroot/pkgsrc/devel/py-llvmlite/PLIST,v
retrieving revision 1.6
diff -u -r1.6 PLIST
--- devel/py-llvmlite/PLIST 12 Jan 2022 21:13:50 -0000 1.6
+++ devel/py-llvmlite/PLIST 19 Jan 2024 23:50:33 -0000
@@ -1,4 +1,4 @@
-@comment $NetBSD: PLIST,v 1.6 2022/01/12 21:13:50 wiz Exp $
+@comment $NetBSD$
${PYSITELIB}/${EGG_INFODIR}/PKG-INFO
${PYSITELIB}/${EGG_INFODIR}/SOURCES.txt
${PYSITELIB}/${EGG_INFODIR}/dependency_links.txt
@@ -46,6 +46,9 @@
${PYSITELIB}/llvmlite/binding/options.py
${PYSITELIB}/llvmlite/binding/options.pyc
${PYSITELIB}/llvmlite/binding/options.pyo
+${PYSITELIB}/llvmlite/binding/orcjit.py
+${PYSITELIB}/llvmlite/binding/orcjit.pyc
+${PYSITELIB}/llvmlite/binding/orcjit.pyo
${PYSITELIB}/llvmlite/binding/passmanagers.py
${PYSITELIB}/llvmlite/binding/passmanagers.pyc
${PYSITELIB}/llvmlite/binding/passmanagers.pyo
@@ -85,15 +88,6 @@
${PYSITELIB}/llvmlite/ir/values.py
${PYSITELIB}/llvmlite/ir/values.pyc
${PYSITELIB}/llvmlite/ir/values.pyo
-${PYSITELIB}/llvmlite/llvmpy/__init__.py
-${PYSITELIB}/llvmlite/llvmpy/__init__.pyc
-${PYSITELIB}/llvmlite/llvmpy/__init__.pyo
-${PYSITELIB}/llvmlite/llvmpy/core.py
-${PYSITELIB}/llvmlite/llvmpy/core.pyc
-${PYSITELIB}/llvmlite/llvmpy/core.pyo
-${PYSITELIB}/llvmlite/llvmpy/passes.py
-${PYSITELIB}/llvmlite/llvmpy/passes.pyc
-${PYSITELIB}/llvmlite/llvmpy/passes.pyo
${PYSITELIB}/llvmlite/tests/__init__.py
${PYSITELIB}/llvmlite/tests/__init__.pyc
${PYSITELIB}/llvmlite/tests/__init__.pyo
@@ -112,9 +106,6 @@
${PYSITELIB}/llvmlite/tests/test_ir.py
${PYSITELIB}/llvmlite/tests/test_ir.pyc
${PYSITELIB}/llvmlite/tests/test_ir.pyo
-${PYSITELIB}/llvmlite/tests/test_llvmpy.py
-${PYSITELIB}/llvmlite/tests/test_llvmpy.pyc
-${PYSITELIB}/llvmlite/tests/test_llvmpy.pyo
${PYSITELIB}/llvmlite/tests/test_refprune.py
${PYSITELIB}/llvmlite/tests/test_refprune.pyc
${PYSITELIB}/llvmlite/tests/test_refprune.pyo
Index: devel/py-llvmlite/distinfo
===================================================================
RCS file: /cvsroot/pkgsrc/devel/py-llvmlite/distinfo,v
retrieving revision 1.21
diff -u -r1.21 distinfo
--- devel/py-llvmlite/distinfo 22 May 2022 12:16:59 -0000 1.21
+++ devel/py-llvmlite/distinfo 19 Jan 2024 23:50:33 -0000
@@ -1,9 +1,15 @@
$NetBSD: distinfo,v 1.21 2022/05/22 12:16:59 adam Exp $
-BLAKE2s (llvmlite-0.38.1.tar.gz) = ebc28cc09fccd56c5e0c02398c61a564945c279f3951e6769743538f5153b06b
-SHA512 (llvmlite-0.38.1.tar.gz) = a872a8535173426feaf8af01824a22e0a439a99e67801d8e78397137aebec82ebd53aeb16d797da86f9570f90c3362d00c2180e4d3b6c564d0d490c37b2c4ed6
-Size (llvmlite-0.38.1.tar.gz) = 129131 bytes
-SHA1 (patch-ffi_Makefile.freebsd) = 39a533f17952c73ef7cbfe910bc58166a106448c
-SHA1 (patch-ffi_Makefile.linux) = 64fe000e738b61f0ece5c3b6cb86a1d548955c70
+BLAKE2s (libunwind-14.0.6.src.tar.xz) = 21da632762db6524a46c1f721908b233265afe83728c1de5dd7757c662db0d99
+SHA512 (libunwind-14.0.6.src.tar.xz) = c8f3804c47ac33273238899e5682f9cb52465dcceff0e0ecf9925469620c6c9a62cc2c708a35a0e156b666e1198df52c5fff1da9d5ee3194605dfd62c296b058
+Size (libunwind-14.0.6.src.tar.xz) = 108680 bytes
+BLAKE2s (lld-14.0.6.src.tar.xz) = 2fc265b616bbdbaeecc8385fda204dbc28b1d871d98f4b3b3cd5183c4d6eefc8
+SHA512 (lld-14.0.6.src.tar.xz) = fad97b441f9642b73edd240af2c026259de0951d5ace42779e9e0fcf5e417252a1d744e2fc51e754a45016621ba0c70088177f88695af1c6ce290dd26873b094
+Size (lld-14.0.6.src.tar.xz) = 1366180 bytes
+BLAKE2s (llvm-14.0.6.src.tar.xz) = 2d44946453add45426569fd4187654f83881341c5c0109e4ffacc60e8f73af60
+SHA512 (llvm-14.0.6.src.tar.xz) = 6461bdde27aac17fa44c3e99a85ec47ffb181d0d4e5c3ef1c4286a59583e3b0c51af3c8081a300f45b99524340773a3011380059e3b3a571c3b0a8733e96fc1d
+Size (llvm-14.0.6.src.tar.xz) = 49660136 bytes
+BLAKE2s (llvmlite-0.41.1.tar.gz) = 2da761d269e0be534391778303456a1f71033e65c8e51a6719c70dab07e1ae48
+SHA512 (llvmlite-0.41.1.tar.gz) = f344c49dae8494fc3e7c1b30a516f046d718d7d1aab69bab8d9f636dce3136d3970de40f0c6fd5dc48cd7292699f0afdf1e41264820d4d421ee2d1e14e321e71
+Size (llvmlite-0.41.1.tar.gz) = 146564 bytes
SHA1 (patch-ffi_build.py) = 9a992dd33f624055d5c8bea3986c4243c87b4ccf
-SHA1 (patch-ffi_targets.cpp) = 99f888839916fa42848f9dad2f28468b70cf668f
Index: devel/py-llvmlite/files/llvm14-clear-gotoffsetmap.patch
===================================================================
RCS file: devel/py-llvmlite/files/llvm14-clear-gotoffsetmap.patch
diff -N devel/py-llvmlite/files/llvm14-clear-gotoffsetmap.patch
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ devel/py-llvmlite/files/llvm14-clear-gotoffsetmap.patch 19 Jan 2024 23:50:33 -0000
@@ -0,0 +1,31 @@
+From 322c79fff224389b4df9f24ac22965867007c2fa Mon Sep 17 00:00:00 2001
+From: Graham Markall <gmarkall%nvidia.com@localhost>
+Date: Mon, 13 Mar 2023 21:35:11 +0000
+Subject: [PATCH] RuntimeDyldELF: Clear the GOTOffsetMap when finalizing the
+ load
+
+This needs resetting so that stale entries are not left behind when the
+GOT section and index are reset.
+
+See llvm/llvm#61402: RuntimeDyldELF doesn't clear GOTOffsetMap in
+finalizeLoad(), leading to invalid GOT relocations on AArch64 -
+https://github.com/llvm/llvm-project/issues/61402.
+---
+ llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/llvm-14.0.6.src/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp b/llvm-14.0.6.src/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
+index f92618afdff6..eb3c27a9406a 100644
+--- a/llvm-14.0.6.src/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
++++ b/llvm-14.0.6.src/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
+@@ -2345,6 +2345,7 @@ Error RuntimeDyldELF::finalizeLoad(const ObjectFile &Obj,
+ }
+ }
+
++ GOTOffsetMap.clear();
+ GOTSectionID = 0;
+ CurrentGOTIndex = 0;
+
+--
+2.34.1
+
Index: devel/py-llvmlite/files/llvm14-remove-use-of-clonefile.patch
===================================================================
RCS file: devel/py-llvmlite/files/llvm14-remove-use-of-clonefile.patch
diff -N devel/py-llvmlite/files/llvm14-remove-use-of-clonefile.patch
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ devel/py-llvmlite/files/llvm14-remove-use-of-clonefile.patch 19 Jan 2024 23:50:33 -0000
@@ -0,0 +1,54 @@
+diff -ur a/llvm-14.0.6.src/lib/Support/Unix/Path.inc b/llvm-14.0.6.src/lib/Support/Unix/Path.inc
+--- a/llvm-14.0.6.src/lib/Support/Unix/Path.inc 2022-03-14 05:44:55.000000000 -0400
++++ b/llvm-14.0.6.src/lib/Support/Unix/Path.inc 2022-09-19 11:30:59.000000000 -0400
+@@ -1462,6 +1462,7 @@
+ std::error_code copy_file(const Twine &From, const Twine &To) {
+ std::string FromS = From.str();
+ std::string ToS = To.str();
++ /*
+ #if __has_builtin(__builtin_available)
+ if (__builtin_available(macos 10.12, *)) {
+ // Optimistically try to use clonefile() and handle errors, rather than
+@@ -1490,6 +1491,7 @@
+ // cheaper.
+ }
+ #endif
++ */
+ if (!copyfile(FromS.c_str(), ToS.c_str(), /*State=*/NULL, COPYFILE_DATA))
+ return std::error_code();
+ return std::error_code(errno, std::generic_category());
+diff -ur a/llvm-14.0.6.src/unittests/Support/Path.cpp b/llvm-14.0.6.src/unittests/Support/Path.cpp
+--- a/llvm-14.0.6.src/unittests/Support/Path.cpp 2022-03-14 05:44:55.000000000 -0400
++++ b/llvm-14.0.6.src/unittests/Support/Path.cpp 2022-09-19 11:33:07.000000000 -0400
+@@ -2267,15 +2267,15 @@
+
+ EXPECT_EQ(fs::setPermissions(TempPath, fs::set_uid_on_exe), NoError);
+ EXPECT_TRUE(CheckPermissions(fs::set_uid_on_exe));
+-
++#if !defined(__APPLE__)
+ EXPECT_EQ(fs::setPermissions(TempPath, fs::set_gid_on_exe), NoError);
+ EXPECT_TRUE(CheckPermissions(fs::set_gid_on_exe));
+-
++#endif
+ // Modern BSDs require root to set the sticky bit on files.
+ // AIX and Solaris without root will mask off (i.e., lose) the sticky bit
+ // on files.
+ #if !defined(__FreeBSD__) && !defined(__NetBSD__) && !defined(__OpenBSD__) && \
+- !defined(_AIX) && !(defined(__sun__) && defined(__svr4__))
++ !defined(_AIX) && !(defined(__sun__) && defined(__svr4__)) && !defined(__APPLE__)
+ EXPECT_EQ(fs::setPermissions(TempPath, fs::sticky_bit), NoError);
+ EXPECT_TRUE(CheckPermissions(fs::sticky_bit));
+
+@@ -2297,10 +2297,12 @@
+ EXPECT_TRUE(CheckPermissions(fs::all_perms));
+ #endif // !FreeBSD && !NetBSD && !OpenBSD && !AIX
+
++#if !defined(__APPLE__)
+ EXPECT_EQ(fs::setPermissions(TempPath, fs::all_perms & ~fs::sticky_bit),
+ NoError);
+ EXPECT_TRUE(CheckPermissions(fs::all_perms & ~fs::sticky_bit));
+ #endif
++#endif
+ }
+
+ #ifdef _WIN32
Index: devel/py-llvmlite/files/llvm14-svml.patch
===================================================================
RCS file: devel/py-llvmlite/files/llvm14-svml.patch
diff -N devel/py-llvmlite/files/llvm14-svml.patch
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ devel/py-llvmlite/files/llvm14-svml.patch 19 Jan 2024 23:50:33 -0000
@@ -0,0 +1,2194 @@
+From 9de32f5474f1f78990b399214bdbb6c21f8f098e Mon Sep 17 00:00:00 2001
+From: Ivan Butygin <ivan.butygin%gmail.com@localhost>
+Date: Sun, 24 Jul 2022 20:31:29 +0200
+Subject: [PATCH] Fixes vectorizer and extends SVML support
+
+Fixes vectorizer and extends SVML support
+Patch was updated to fix SVML calling convention issues uncovered by llvm 10.
+In previous versions of patch SVML calling convention was selected based on
+compilation settings. So if you try to call 256bit vector function from avx512
+code function will be called with avx512 cc which is incorrect. To fix this
+SVML cc was separated into 3 different cc for 128, 256 and 512bit vector lengths
+which are selected based on actual input vector length.
+
+Original patch merged several fixes:
+
+1. https://reviews.llvm.org/D47188 patch fixes the problem with improper calls
+to SVML library as it has non-standard calling conventions. So accordingly it
+has SVML calling conventions definitions and code to set CC to the vectorized
+calls. As SVML provides several implementations for the math functions we also
+took into consideration fast attribute and select more fast implementation in
+such case. This work is based on original Matt Masten's work.
+Author: Denis Nagorny
+
+2. https://reviews.llvm.org/D53035 patch implements support to legalize SVML
+calls by breaking down the illegal vector call instruction into multiple legal
+vector call instructions during code generation. Currently the vectorizer does
+not check legality of the generated SVML (or any VECLIB) call instructions, and
+this can lead to potential problems even during vector type legalization. This
+patch addresses this issue by adding a legality check during code generation and
+replaces the illegal SVML call with corresponding legalized instructions.
+(RFC: http://lists.llvm.org/pipermail/llvm-dev/2018-June/124357.html)
+Author: Karthik Senthil
+
+diff --git a/llvm-14.0.6.src/include/llvm/Analysis/TargetLibraryInfo.h b/llvm-14.0.6.src/include/llvm/Analysis/TargetLibraryInfo.h
+index 17d1e3f770c14..110ff08189867 100644
+--- a/llvm-14.0.6.src/include/llvm/Analysis/TargetLibraryInfo.h
++++ b/llvm-14.0.6.src/include/llvm/Analysis/TargetLibraryInfo.h
+@@ -39,6 +39,12 @@ struct VecDesc {
+ NotLibFunc
+ };
+
++enum SVMLAccuracy {
++ SVML_DEFAULT,
++ SVML_HA,
++ SVML_EP
++};
++
+ /// Implementation of the target library information.
+ ///
+ /// This class constructs tables that hold the target library information and
+@@ -157,7 +163,7 @@ class TargetLibraryInfoImpl {
+ /// Return true if the function F has a vector equivalent with vectorization
+ /// factor VF.
+ bool isFunctionVectorizable(StringRef F, const ElementCount &VF) const {
+- return !getVectorizedFunction(F, VF).empty();
++ return !getVectorizedFunction(F, VF, false).empty();
+ }
+
+ /// Return true if the function F has a vector equivalent with any
+@@ -166,7 +172,10 @@ class TargetLibraryInfoImpl {
+
+ /// Return the name of the equivalent of F, vectorized with factor VF. If no
+ /// such mapping exists, return the empty string.
+- StringRef getVectorizedFunction(StringRef F, const ElementCount &VF) const;
++ std::string getVectorizedFunction(StringRef F, const ElementCount &VF, bool IsFast) const;
++
++ Optional<CallingConv::ID> getVectorizedFunctionCallingConv(
++ StringRef F, const FunctionType &FTy, const DataLayout &DL) const;
+
+ /// Set to true iff i32 parameters to library functions should have signext
+ /// or zeroext attributes if they correspond to C-level int or unsigned int,
+@@ -326,8 +335,13 @@ class TargetLibraryInfo {
+ bool isFunctionVectorizable(StringRef F) const {
+ return Impl->isFunctionVectorizable(F);
+ }
+- StringRef getVectorizedFunction(StringRef F, const ElementCount &VF) const {
+- return Impl->getVectorizedFunction(F, VF);
++ std::string getVectorizedFunction(StringRef F, const ElementCount &VF, bool IsFast) const {
++ return Impl->getVectorizedFunction(F, VF, IsFast);
++ }
++
++ Optional<CallingConv::ID> getVectorizedFunctionCallingConv(
++ StringRef F, const FunctionType &FTy, const DataLayout &DL) const {
++ return Impl->getVectorizedFunctionCallingConv(F, FTy, DL);
+ }
+
+ /// Tests if the function is both available and a candidate for optimized code
+diff --git a/llvm-14.0.6.src/include/llvm/AsmParser/LLToken.h b/llvm-14.0.6.src/include/llvm/AsmParser/LLToken.h
+index 78ebb35e0ea4d..3ffb57db8b18b 100644
+--- a/llvm-14.0.6.src/include/llvm/AsmParser/LLToken.h
++++ b/llvm-14.0.6.src/include/llvm/AsmParser/LLToken.h
+@@ -133,6 +133,9 @@ enum Kind {
+ kw_fastcc,
+ kw_coldcc,
+ kw_intel_ocl_bicc,
++ kw_intel_svmlcc128,
++ kw_intel_svmlcc256,
++ kw_intel_svmlcc512,
+ kw_cfguard_checkcc,
+ kw_x86_stdcallcc,
+ kw_x86_fastcallcc,
+diff --git a/llvm-14.0.6.src/include/llvm/IR/CMakeLists.txt b/llvm-14.0.6.src/include/llvm/IR/CMakeLists.txt
+index 0498fc269b634..23bb3de41bc1a 100644
+--- a/llvm-14.0.6.src/include/llvm/IR/CMakeLists.txt
++++ b/llvm-14.0.6.src/include/llvm/IR/CMakeLists.txt
+@@ -20,3 +20,7 @@ tablegen(LLVM IntrinsicsX86.h -gen-intrinsic-enums -intrinsic-prefix=x86)
+ tablegen(LLVM IntrinsicsXCore.h -gen-intrinsic-enums -intrinsic-prefix=xcore)
+ tablegen(LLVM IntrinsicsVE.h -gen-intrinsic-enums -intrinsic-prefix=ve)
+ add_public_tablegen_target(intrinsics_gen)
++
++set(LLVM_TARGET_DEFINITIONS SVML.td)
++tablegen(LLVM SVML.inc -gen-svml)
++add_public_tablegen_target(svml_gen)
+diff --git a/llvm-14.0.6.src/include/llvm/IR/CallingConv.h b/llvm-14.0.6.src/include/llvm/IR/CallingConv.h
+index fd28542465225..096eea1a8e19b 100644
+--- a/llvm-14.0.6.src/include/llvm/IR/CallingConv.h
++++ b/llvm-14.0.6.src/include/llvm/IR/CallingConv.h
+@@ -252,6 +252,11 @@ namespace CallingConv {
+ /// M68k_INTR - Calling convention used for M68k interrupt routines.
+ M68k_INTR = 101,
+
++ /// Intel_SVML - Calling conventions for Intel Short Math Vector Library
++ Intel_SVML128 = 102,
++ Intel_SVML256 = 103,
++ Intel_SVML512 = 104,
++
+ /// The highest possible calling convention ID. Must be some 2^k - 1.
+ MaxID = 1023
+ };
+diff --git a/llvm-14.0.6.src/include/llvm/IR/SVML.td b/llvm-14.0.6.src/include/llvm/IR/SVML.td
+new file mode 100644
+index 0000000000000..5af710404c9d9
+--- /dev/null
++++ b/llvm-14.0.6.src/include/llvm/IR/SVML.td
+@@ -0,0 +1,62 @@
++//===-- Intel_SVML.td - Defines SVML call variants ---------*- tablegen -*-===//
++//
++// The LLVM Compiler Infrastructure
++//
++// This file is distributed under the University of Illinois Open Source
++// License. See LICENSE.TXT for details.
++//
++//===----------------------------------------------------------------------===//
++//
++// This file is used by TableGen to define the different typs of SVML function
++// variants used with -fveclib=SVML.
++//
++//===----------------------------------------------------------------------===//
++
++class SvmlVariant;
++
++def sin : SvmlVariant;
++def cos : SvmlVariant;
++def pow : SvmlVariant;
++def exp : SvmlVariant;
++def log : SvmlVariant;
++def acos : SvmlVariant;
++def acosh : SvmlVariant;
++def asin : SvmlVariant;
++def asinh : SvmlVariant;
++def atan2 : SvmlVariant;
++def atan : SvmlVariant;
++def atanh : SvmlVariant;
++def cbrt : SvmlVariant;
++def cdfnorm : SvmlVariant;
++def cdfnorminv : SvmlVariant;
++def cosd : SvmlVariant;
++def cosh : SvmlVariant;
++def erf : SvmlVariant;
++def erfc : SvmlVariant;
++def erfcinv : SvmlVariant;
++def erfinv : SvmlVariant;
++def exp10 : SvmlVariant;
++def exp2 : SvmlVariant;
++def expm1 : SvmlVariant;
++def hypot : SvmlVariant;
++def invsqrt : SvmlVariant;
++def log10 : SvmlVariant;
++def log1p : SvmlVariant;
++def log2 : SvmlVariant;
++def sind : SvmlVariant;
++def sinh : SvmlVariant;
++def sqrt : SvmlVariant;
++def tan : SvmlVariant;
++def tanh : SvmlVariant;
++
++// TODO: SVML does not currently provide _ha and _ep variants of these fucnctions.
++// We should call the default variant of these functions in all cases instead.
++
++// def nearbyint : SvmlVariant;
++// def logb : SvmlVariant;
++// def floor : SvmlVariant;
++// def fmod : SvmlVariant;
++// def ceil : SvmlVariant;
++// def trunc : SvmlVariant;
++// def rint : SvmlVariant;
++// def round : SvmlVariant;
+diff --git a/llvm-14.0.6.src/lib/Analysis/CMakeLists.txt b/llvm-14.0.6.src/lib/Analysis/CMakeLists.txt
+index aec84124129f4..98286e166fbe2 100644
+--- a/llvm-14.0.6.src/lib/Analysis/CMakeLists.txt
++++ b/llvm-14.0.6.src/lib/Analysis/CMakeLists.txt
+@@ -150,6 +150,7 @@ add_llvm_component_library(LLVMAnalysis
+ DEPENDS
+ intrinsics_gen
+ ${MLDeps}
++ svml_gen
+
+ LINK_LIBS
+ ${MLLinkDeps}
+diff --git a/llvm-14.0.6.src/lib/Analysis/TargetLibraryInfo.cpp b/llvm-14.0.6.src/lib/Analysis/TargetLibraryInfo.cpp
+index 02923c2c7eb14..83abde28a62a4 100644
+--- a/llvm-14.0.6.src/lib/Analysis/TargetLibraryInfo.cpp
++++ b/llvm-14.0.6.src/lib/Analysis/TargetLibraryInfo.cpp
+@@ -110,6 +110,11 @@ bool TargetLibraryInfoImpl::isCallingConvCCompatible(Function *F) {
+ F->getFunctionType());
+ }
+
++static std::string svmlMangle(StringRef FnName, const bool IsFast) {
++ std::string FullName = FnName.str();
++ return IsFast ? FullName : FullName + "_ha";
++}
++
+ /// Initialize the set of available library functions based on the specified
+ /// target triple. This should be carefully written so that a missing target
+ /// triple gets a sane set of defaults.
+@@ -1876,8 +1881,9 @@ void TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib(
+ }
+ case SVML: {
+ const VecDesc VecFuncs[] = {
+- #define TLI_DEFINE_SVML_VECFUNCS
+- #include "llvm/Analysis/VecFuncs.def"
++ #define GET_SVML_VARIANTS
++ #include "llvm/IR/SVML.inc"
++ #undef GET_SVML_VARIANTS
+ };
+ addVectorizableFunctions(VecFuncs);
+ break;
+@@ -1897,20 +1903,51 @@ bool TargetLibraryInfoImpl::isFunctionVectorizable(StringRef funcName) const {
+ return I != VectorDescs.end() && StringRef(I->ScalarFnName) == funcName;
+ }
+
+-StringRef
+-TargetLibraryInfoImpl::getVectorizedFunction(StringRef F,
+- const ElementCount &VF) const {
++std::string TargetLibraryInfoImpl::getVectorizedFunction(StringRef F,
++ const ElementCount &VF,
++ bool IsFast) const {
++ bool FromSVML = ClVectorLibrary == SVML;
+ F = sanitizeFunctionName(F);
+ if (F.empty())
+- return F;
++ return F.str();
+ std::vector<VecDesc>::const_iterator I =
+ llvm::lower_bound(VectorDescs, F, compareWithScalarFnName);
+ while (I != VectorDescs.end() && StringRef(I->ScalarFnName) == F) {
+- if (I->VectorizationFactor == VF)
+- return I->VectorFnName;
++ if (I->VectorizationFactor == VF) {
++ if (FromSVML) {
++ return svmlMangle(I->VectorFnName, IsFast);
++ }
++ return I->VectorFnName.str();
++ }
+ ++I;
+ }
+- return StringRef();
++ return std::string();
++}
++
++static CallingConv::ID getSVMLCallingConv(const DataLayout &DL, const FunctionType &FType)
++{
++ assert(isa<VectorType>(FType.getReturnType()));
++ auto *VecCallRetType = cast<VectorType>(FType.getReturnType());
++ auto TypeBitWidth = DL.getTypeSizeInBits(VecCallRetType);
++ if (TypeBitWidth == 128) {
++ return CallingConv::Intel_SVML128;
++ } else if (TypeBitWidth == 256) {
++ return CallingConv::Intel_SVML256;
++ } else if (TypeBitWidth == 512) {
++ return CallingConv::Intel_SVML512;
++ } else {
++ llvm_unreachable("Invalid vector width");
++ }
++ return 0; // not reachable
++}
++
++Optional<CallingConv::ID>
++TargetLibraryInfoImpl::getVectorizedFunctionCallingConv(
++ StringRef F, const FunctionType &FTy, const DataLayout &DL) const {
++ if (F.startswith("__svml")) {
++ return getSVMLCallingConv(DL, FTy);
++ }
++ return {};
+ }
+
+ TargetLibraryInfo TargetLibraryAnalysis::run(const Function &F,
+diff --git a/llvm-14.0.6.src/lib/AsmParser/LLLexer.cpp b/llvm-14.0.6.src/lib/AsmParser/LLLexer.cpp
+index e3bf41c9721b6..4f9dccd4e0724 100644
+--- a/llvm-14.0.6.src/lib/AsmParser/LLLexer.cpp
++++ b/llvm-14.0.6.src/lib/AsmParser/LLLexer.cpp
+@@ -603,6 +603,9 @@ lltok::Kind LLLexer::LexIdentifier() {
+ KEYWORD(spir_kernel);
+ KEYWORD(spir_func);
+ KEYWORD(intel_ocl_bicc);
++ KEYWORD(intel_svmlcc128);
++ KEYWORD(intel_svmlcc256);
++ KEYWORD(intel_svmlcc512);
+ KEYWORD(x86_64_sysvcc);
+ KEYWORD(win64cc);
+ KEYWORD(x86_regcallcc);
+diff --git a/llvm-14.0.6.src/lib/AsmParser/LLParser.cpp b/llvm-14.0.6.src/lib/AsmParser/LLParser.cpp
+index 432ec151cf8ae..3bd6ee61024b8 100644
+--- a/llvm-14.0.6.src/lib/AsmParser/LLParser.cpp
++++ b/llvm-14.0.6.src/lib/AsmParser/LLParser.cpp
+@@ -1781,6 +1781,9 @@ void LLParser::parseOptionalDLLStorageClass(unsigned &Res) {
+ /// ::= 'ccc'
+ /// ::= 'fastcc'
+ /// ::= 'intel_ocl_bicc'
++/// ::= 'intel_svmlcc128'
++/// ::= 'intel_svmlcc256'
++/// ::= 'intel_svmlcc512'
+ /// ::= 'coldcc'
+ /// ::= 'cfguard_checkcc'
+ /// ::= 'x86_stdcallcc'
+@@ -1850,6 +1853,9 @@ bool LLParser::parseOptionalCallingConv(unsigned &CC) {
+ case lltok::kw_spir_kernel: CC = CallingConv::SPIR_KERNEL; break;
+ case lltok::kw_spir_func: CC = CallingConv::SPIR_FUNC; break;
+ case lltok::kw_intel_ocl_bicc: CC = CallingConv::Intel_OCL_BI; break;
++ case lltok::kw_intel_svmlcc128:CC = CallingConv::Intel_SVML128; break;
++ case lltok::kw_intel_svmlcc256:CC = CallingConv::Intel_SVML256; break;
++ case lltok::kw_intel_svmlcc512:CC = CallingConv::Intel_SVML512; break;
+ case lltok::kw_x86_64_sysvcc: CC = CallingConv::X86_64_SysV; break;
+ case lltok::kw_win64cc: CC = CallingConv::Win64; break;
+ case lltok::kw_webkit_jscc: CC = CallingConv::WebKit_JS; break;
+diff --git a/llvm-14.0.6.src/lib/CodeGen/ReplaceWithVeclib.cpp b/llvm-14.0.6.src/lib/CodeGen/ReplaceWithVeclib.cpp
+index 0ff045fa787e8..175651949ef85 100644
+--- a/llvm-14.0.6.src/lib/CodeGen/ReplaceWithVeclib.cpp
++++ b/llvm-14.0.6.src/lib/CodeGen/ReplaceWithVeclib.cpp
+@@ -157,7 +157,7 @@ static bool replaceWithCallToVeclib(const TargetLibraryInfo &TLI,
+ // and the exact vector width of the call operands in the
+ // TargetLibraryInfo.
+ const std::string TLIName =
+- std::string(TLI.getVectorizedFunction(ScalarName, VF));
++ std::string(TLI.getVectorizedFunction(ScalarName, VF, CI.getFastMathFlags().isFast()));
+
+ LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Looking up TLI mapping for `"
+ << ScalarName << "` and vector width " << VF << ".\n");
+diff --git a/llvm-14.0.6.src/lib/IR/AsmWriter.cpp b/llvm-14.0.6.src/lib/IR/AsmWriter.cpp
+index 179754e275b03..c4e95752c97e8 100644
+--- a/llvm-14.0.6.src/lib/IR/AsmWriter.cpp
++++ b/llvm-14.0.6.src/lib/IR/AsmWriter.cpp
+@@ -306,6 +306,9 @@ static void PrintCallingConv(unsigned cc, raw_ostream &Out) {
+ case CallingConv::X86_RegCall: Out << "x86_regcallcc"; break;
+ case CallingConv::X86_VectorCall:Out << "x86_vectorcallcc"; break;
+ case CallingConv::Intel_OCL_BI: Out << "intel_ocl_bicc"; break;
++ case CallingConv::Intel_SVML128: Out << "intel_svmlcc128"; break;
++ case CallingConv::Intel_SVML256: Out << "intel_svmlcc256"; break;
++ case CallingConv::Intel_SVML512: Out << "intel_svmlcc512"; break;
+ case CallingConv::ARM_APCS: Out << "arm_apcscc"; break;
+ case CallingConv::ARM_AAPCS: Out << "arm_aapcscc"; break;
+ case CallingConv::ARM_AAPCS_VFP: Out << "arm_aapcs_vfpcc"; break;
+diff --git a/llvm-14.0.6.src/lib/IR/Verifier.cpp b/llvm-14.0.6.src/lib/IR/Verifier.cpp
+index 989d01e2e3950..bae7382a36e13 100644
+--- a/llvm-14.0.6.src/lib/IR/Verifier.cpp
++++ b/llvm-14.0.6.src/lib/IR/Verifier.cpp
+@@ -2457,6 +2457,9 @@ void Verifier::visitFunction(const Function &F) {
+ case CallingConv::Fast:
+ case CallingConv::Cold:
+ case CallingConv::Intel_OCL_BI:
++ case CallingConv::Intel_SVML128:
++ case CallingConv::Intel_SVML256:
++ case CallingConv::Intel_SVML512:
+ case CallingConv::PTX_Kernel:
+ case CallingConv::PTX_Device:
+ Assert(!F.isVarArg(), "Calling convention does not support varargs or "
+diff --git a/llvm-14.0.6.src/lib/Target/X86/X86CallingConv.td b/llvm-14.0.6.src/lib/Target/X86/X86CallingConv.td
+index 4dd8a6cdd8982..12e65521215e4 100644
+--- a/llvm-14.0.6.src/lib/Target/X86/X86CallingConv.td
++++ b/llvm-14.0.6.src/lib/Target/X86/X86CallingConv.td
+@@ -498,6 +498,21 @@ def RetCC_X86_64 : CallingConv<[
+ CCDelegateTo<RetCC_X86_64_C>
+ ]>;
+
++// Intel_SVML return-value convention.
++def RetCC_Intel_SVML : CallingConv<[
++ // Vector types are returned in XMM0,XMM1
++ CCIfType<[v4f32, v2f64],
++ CCAssignToReg<[XMM0,XMM1]>>,
++
++ // 256-bit FP vectors
++ CCIfType<[v8f32, v4f64],
++ CCAssignToReg<[YMM0,YMM1]>>,
++
++ // 512-bit FP vectors
++ CCIfType<[v16f32, v8f64],
++ CCAssignToReg<[ZMM0,ZMM1]>>
++]>;
++
+ // This is the return-value convention used for the entire X86 backend.
+ let Entry = 1 in
+ def RetCC_X86 : CallingConv<[
+@@ -505,6 +520,10 @@ def RetCC_X86 : CallingConv<[
+ // Check if this is the Intel OpenCL built-ins calling convention
+ CCIfCC<"CallingConv::Intel_OCL_BI", CCDelegateTo<RetCC_Intel_OCL_BI>>,
+
++ CCIfCC<"CallingConv::Intel_SVML128", CCDelegateTo<RetCC_Intel_SVML>>,
++ CCIfCC<"CallingConv::Intel_SVML256", CCDelegateTo<RetCC_Intel_SVML>>,
++ CCIfCC<"CallingConv::Intel_SVML512", CCDelegateTo<RetCC_Intel_SVML>>,
++
+ CCIfSubtarget<"is64Bit()", CCDelegateTo<RetCC_X86_64>>,
+ CCDelegateTo<RetCC_X86_32>
+ ]>;
+@@ -1064,6 +1083,30 @@ def CC_Intel_OCL_BI : CallingConv<[
+ CCDelegateTo<CC_X86_32_C>
+ ]>;
+
++// X86-64 Intel Short Vector Math Library calling convention.
++def CC_Intel_SVML : CallingConv<[
++
++ // The SSE vector arguments are passed in XMM registers.
++ CCIfType<[v4f32, v2f64],
++ CCAssignToReg<[XMM0, XMM1, XMM2]>>,
++
++ // The 256-bit vector arguments are passed in YMM registers.
++ CCIfType<[v8f32, v4f64],
++ CCAssignToReg<[YMM0, YMM1, YMM2]>>,
++
++ // The 512-bit vector arguments are passed in ZMM registers.
++ CCIfType<[v16f32, v8f64],
++ CCAssignToReg<[ZMM0, ZMM1, ZMM2]>>
++]>;
++
++def CC_X86_32_Intr : CallingConv<[
++ CCAssignToStack<4, 4>
++]>;
++
++def CC_X86_64_Intr : CallingConv<[
++ CCAssignToStack<8, 8>
++]>;
++
+ //===----------------------------------------------------------------------===//
+ // X86 Root Argument Calling Conventions
+ //===----------------------------------------------------------------------===//
+@@ -1115,6 +1158,9 @@ def CC_X86_64 : CallingConv<[
+ let Entry = 1 in
+ def CC_X86 : CallingConv<[
+ CCIfCC<"CallingConv::Intel_OCL_BI", CCDelegateTo<CC_Intel_OCL_BI>>,
++ CCIfCC<"CallingConv::Intel_SVML128", CCDelegateTo<CC_Intel_SVML>>,
++ CCIfCC<"CallingConv::Intel_SVML256", CCDelegateTo<CC_Intel_SVML>>,
++ CCIfCC<"CallingConv::Intel_SVML512", CCDelegateTo<CC_Intel_SVML>>,
+ CCIfSubtarget<"is64Bit()", CCDelegateTo<CC_X86_64>>,
+ CCDelegateTo<CC_X86_32>
+ ]>;
+@@ -1227,3 +1273,27 @@ def CSR_SysV64_RegCall_NoSSE : CalleeSavedRegs<(add RBX, RBP,
+ (sequence "R%u", 12, 15))>;
+ def CSR_SysV64_RegCall : CalleeSavedRegs<(add CSR_SysV64_RegCall_NoSSE,
+ (sequence "XMM%u", 8, 15))>;
++
++// SVML calling convention
++def CSR_32_Intel_SVML : CalleeSavedRegs<(add CSR_32_RegCall_NoSSE)>;
++def CSR_32_Intel_SVML_AVX512 : CalleeSavedRegs<(add CSR_32_Intel_SVML,
++ K4, K5, K6, K7)>;
++
++def CSR_64_Intel_SVML_NoSSE : CalleeSavedRegs<(add RBX, RSI, RDI, RBP, RSP, R12, R13, R14, R15)>;
++
++def CSR_64_Intel_SVML : CalleeSavedRegs<(add CSR_64_Intel_SVML_NoSSE,
++ (sequence "XMM%u", 8, 15))>;
++def CSR_Win64_Intel_SVML : CalleeSavedRegs<(add CSR_64_Intel_SVML_NoSSE,
++ (sequence "XMM%u", 6, 15))>;
++
++def CSR_64_Intel_SVML_AVX : CalleeSavedRegs<(add CSR_64_Intel_SVML_NoSSE,
++ (sequence "YMM%u", 8, 15))>;
++def CSR_Win64_Intel_SVML_AVX : CalleeSavedRegs<(add CSR_64_Intel_SVML_NoSSE,
++ (sequence "YMM%u", 6, 15))>;
++
++def CSR_64_Intel_SVML_AVX512 : CalleeSavedRegs<(add CSR_64_Intel_SVML_NoSSE,
++ (sequence "ZMM%u", 16, 31),
++ K4, K5, K6, K7)>;
++def CSR_Win64_Intel_SVML_AVX512 : CalleeSavedRegs<(add CSR_64_Intel_SVML_NoSSE,
++ (sequence "ZMM%u", 6, 21),
++ K4, K5, K6, K7)>;
+diff --git a/llvm-14.0.6.src/lib/Target/X86/X86ISelLowering.cpp b/llvm-14.0.6.src/lib/Target/X86/X86ISelLowering.cpp
+index 8bb7e81e19bbd..1780ce3fc6467 100644
+--- a/llvm-14.0.6.src/lib/Target/X86/X86ISelLowering.cpp
++++ b/llvm-14.0.6.src/lib/Target/X86/X86ISelLowering.cpp
+@@ -3788,7 +3788,8 @@ void VarArgsLoweringHelper::forwardMustTailParameters(SDValue &Chain) {
+ // FIXME: Only some x86_32 calling conventions support AVX512.
+ if (Subtarget.useAVX512Regs() &&
+ (is64Bit() || (CallConv == CallingConv::X86_VectorCall ||
+- CallConv == CallingConv::Intel_OCL_BI)))
++ CallConv == CallingConv::Intel_OCL_BI ||
++ CallConv == CallingConv::Intel_SVML512)))
+ VecVT = MVT::v16f32;
+ else if (Subtarget.hasAVX())
+ VecVT = MVT::v8f32;
+diff --git a/llvm-14.0.6.src/lib/Target/X86/X86RegisterInfo.cpp b/llvm-14.0.6.src/lib/Target/X86/X86RegisterInfo.cpp
+index 130cb61cdde24..9eec3b25ca9f2 100644
+--- a/llvm-14.0.6.src/lib/Target/X86/X86RegisterInfo.cpp
++++ b/llvm-14.0.6.src/lib/Target/X86/X86RegisterInfo.cpp
+@@ -272,6 +272,42 @@ X86RegisterInfo::getRegPressureLimit(const TargetRegisterClass *RC,
+ }
+ }
+
++namespace {
++std::pair<const uint32_t *, const MCPhysReg *> getSVMLRegMaskAndSaveList(
++ bool Is64Bit, bool IsWin64, CallingConv::ID CC) {
++ assert(CC >= CallingConv::Intel_SVML128 && CC <= CallingConv::Intel_SVML512);
++ unsigned Abi = CC - CallingConv::Intel_SVML128 ; // 0 - 128, 1 - 256, 2 - 512
++
++ const std::pair<const uint32_t *, const MCPhysReg *> Abi64[] = {
++ std::make_pair(CSR_64_Intel_SVML_RegMask, CSR_64_Intel_SVML_SaveList),
++ std::make_pair(CSR_64_Intel_SVML_AVX_RegMask, CSR_64_Intel_SVML_AVX_SaveList),
++ std::make_pair(CSR_64_Intel_SVML_AVX512_RegMask, CSR_64_Intel_SVML_AVX512_SaveList),
++ };
++
++ const std::pair<const uint32_t *, const MCPhysReg *> AbiWin64[] = {
++ std::make_pair(CSR_Win64_Intel_SVML_RegMask, CSR_Win64_Intel_SVML_SaveList),
++ std::make_pair(CSR_Win64_Intel_SVML_AVX_RegMask, CSR_Win64_Intel_SVML_AVX_SaveList),
++ std::make_pair(CSR_Win64_Intel_SVML_AVX512_RegMask, CSR_Win64_Intel_SVML_AVX512_SaveList),
++ };
++
++ const std::pair<const uint32_t *, const MCPhysReg *> Abi32[] = {
++ std::make_pair(CSR_32_Intel_SVML_RegMask, CSR_32_Intel_SVML_SaveList),
++ std::make_pair(CSR_32_Intel_SVML_RegMask, CSR_32_Intel_SVML_SaveList),
++ std::make_pair(CSR_32_Intel_SVML_AVX512_RegMask, CSR_32_Intel_SVML_AVX512_SaveList),
++ };
++
++ if (Is64Bit) {
++ if (IsWin64) {
++ return AbiWin64[Abi];
++ } else {
++ return Abi64[Abi];
++ }
++ } else {
++ return Abi32[Abi];
++ }
++}
++}
++
+ const MCPhysReg *
+ X86RegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
+ assert(MF && "MachineFunction required");
+@@ -327,6 +363,11 @@ X86RegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
+ return CSR_64_Intel_OCL_BI_SaveList;
+ break;
+ }
++ case CallingConv::Intel_SVML128:
++ case CallingConv::Intel_SVML256:
++ case CallingConv::Intel_SVML512: {
++ return getSVMLRegMaskAndSaveList(Is64Bit, IsWin64, CC).second;
++ }
+ case CallingConv::HHVM:
+ return CSR_64_HHVM_SaveList;
+ case CallingConv::X86_RegCall:
+@@ -449,6 +490,11 @@ X86RegisterInfo::getCallPreservedMask(const MachineFunction &MF,
+ return CSR_64_Intel_OCL_BI_RegMask;
+ break;
+ }
++ case CallingConv::Intel_SVML128:
++ case CallingConv::Intel_SVML256:
++ case CallingConv::Intel_SVML512: {
++ return getSVMLRegMaskAndSaveList(Is64Bit, IsWin64, CC).first;
++ }
+ case CallingConv::HHVM:
+ return CSR_64_HHVM_RegMask;
+ case CallingConv::X86_RegCall:
+diff --git a/llvm-14.0.6.src/lib/Target/X86/X86Subtarget.h b/llvm-14.0.6.src/lib/Target/X86/X86Subtarget.h
+index 5d773f0c57dfb..6bdf5bc6f3fe9 100644
+--- a/llvm-14.0.6.src/lib/Target/X86/X86Subtarget.h
++++ b/llvm-14.0.6.src/lib/Target/X86/X86Subtarget.h
+@@ -916,6 +916,9 @@ class X86Subtarget final : public X86GenSubtargetInfo {
+ case CallingConv::X86_ThisCall:
+ case CallingConv::X86_VectorCall:
+ case CallingConv::Intel_OCL_BI:
++ case CallingConv::Intel_SVML128:
++ case CallingConv::Intel_SVML256:
++ case CallingConv::Intel_SVML512:
+ return isTargetWin64();
+ // This convention allows using the Win64 convention on other targets.
+ case CallingConv::Win64:
+diff --git a/llvm-14.0.6.src/lib/Transforms/Utils/InjectTLIMappings.cpp b/llvm-14.0.6.src/lib/Transforms/Utils/InjectTLIMappings.cpp
+index 047bf5569ded3..59897785f156c 100644
+--- a/llvm-14.0.6.src/lib/Transforms/Utils/InjectTLIMappings.cpp
++++ b/llvm-14.0.6.src/lib/Transforms/Utils/InjectTLIMappings.cpp
+@@ -92,7 +92,7 @@ static void addMappingsFromTLI(const TargetLibraryInfo &TLI, CallInst &CI) {
+
+ auto AddVariantDecl = [&](const ElementCount &VF) {
+ const std::string TLIName =
+- std::string(TLI.getVectorizedFunction(ScalarName, VF));
++ std::string(TLI.getVectorizedFunction(ScalarName, VF, CI.getFastMathFlags().isFast()));
+ if (!TLIName.empty()) {
+ std::string MangledName =
+ VFABI::mangleTLIVectorName(TLIName, ScalarName, CI.arg_size(), VF);
+diff --git a/llvm-14.0.6.src/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm-14.0.6.src/lib/Transforms/Vectorize/LoopVectorize.cpp
+index 46ff0994e04e7..f472af5e1a835 100644
+--- a/llvm-14.0.6.src/lib/Transforms/Vectorize/LoopVectorize.cpp
++++ b/llvm-14.0.6.src/lib/Transforms/Vectorize/LoopVectorize.cpp
+@@ -712,6 +712,27 @@ class InnerLoopVectorizer {
+ virtual void printDebugTracesAtStart(){};
+ virtual void printDebugTracesAtEnd(){};
+
++ /// Check legality of given SVML call instruction \p VecCall generated for
++ /// scalar call \p Call. If illegal then the appropriate legal instruction
++ /// is returned.
++ Value *legalizeSVMLCall(CallInst *VecCall, CallInst *Call);
++
++ /// Returns the legal VF for a call instruction \p CI using TTI information
++ /// and vector type.
++ ElementCount getLegalVFForCall(CallInst *CI);
++
++ /// Partially vectorize a given call \p Call by breaking it down into multiple
++ /// calls of \p LegalCall, decided by the variant VF \p LegalVF.
++ Value *partialVectorizeCall(CallInst *Call, CallInst *LegalCall,
++ unsigned LegalVF);
++
++ /// Generate shufflevector instruction for a vector value \p V based on the
++ /// current \p Part and a smaller VF \p LegalVF.
++ Value *generateShuffleValue(Value *V, unsigned LegalVF, unsigned Part);
++
++ /// Combine partially vectorized calls stored in \p CallResults.
++ Value *combinePartialVecCalls(SmallVectorImpl<Value *> &CallResults);
++
+ /// The original loop.
+ Loop *OrigLoop;
+
+@@ -4596,6 +4617,17 @@ static bool mayDivideByZero(Instruction &I) {
+ return !CInt || CInt->isZero();
+ }
+
++static void setVectorFunctionCallingConv(CallInst &CI, const DataLayout &DL,
++ const TargetLibraryInfo &TLI) {
++ Function *VectorF = CI.getCalledFunction();
++ FunctionType *FTy = VectorF->getFunctionType();
++ StringRef VFName = VectorF->getName();
++ auto CC = TLI.getVectorizedFunctionCallingConv(VFName, *FTy, DL);
++ if (CC) {
++ CI.setCallingConv(*CC);
++ }
++}
++
+ void InnerLoopVectorizer::widenCallInstruction(CallInst &I, VPValue *Def,
+ VPUser &ArgOperands,
+ VPTransformState &State) {
+@@ -4664,9 +4696,246 @@ void InnerLoopVectorizer::widenCallInstruction(CallInst &I, VPValue *Def,
+ if (isa<FPMathOperator>(V))
+ V->copyFastMathFlags(CI);
+
++ const DataLayout &DL = V->getModule()->getDataLayout();
++ setVectorFunctionCallingConv(*V, DL, *TLI);
++
++ // Perform legalization of SVML call instruction only if original call
++ // was not Intrinsic
++ if (!UseVectorIntrinsic &&
++ (V->getCalledFunction()->getName()).startswith("__svml")) {
++ // assert((V->getCalledFunction()->getName()).startswith("__svml"));
++ LLVM_DEBUG(dbgs() << "LV(SVML): Vector call inst:"; V->dump());
++ auto *LegalV = cast<Instruction>(legalizeSVMLCall(V, CI));
++ LLVM_DEBUG(dbgs() << "LV: Completed SVML legalization.\n LegalV: ";
++ LegalV->dump());
++ State.set(Def, LegalV, Part);
++ addMetadata(LegalV, &I);
++ } else {
+ State.set(Def, V, Part);
+ addMetadata(V, &I);
++ }
++ }
++}
++
++//===----------------------------------------------------------------------===//
++// Implementation of functions for SVML vector call legalization.
++//===----------------------------------------------------------------------===//
++//
++// Unlike other VECLIBs, SVML needs to be used with target-legal
++// vector types. Otherwise, link failures and/or runtime failures
++// will occur. A motivating example could be -
++//
++// double *a;
++// float *b;
++// #pragma clang loop vectorize_width(8)
++// for(i = 0; i < N; ++i) {
++// a[i] = sin(i); // Legal SVML VF must be 4 or below on AVX
++// b[i] = cosf(i); // VF can be 8 on AVX since 8 floats can fit in YMM
++// }
++//
++// Current implementation of vector code generation in LV is
++// driven based on a single VF (in InnerLoopVectorizer::VF). This
++// inhibits the flexibility of adjusting/choosing different VF
++// for different instructions.
++//
++// Due to this limitation it is much more straightforward to
++// first generate the illegal sin8 (svml_sin8 for SVML vector
++// library) call and then legalize it than trying to avoid
++// generating illegal code from the beginning.
++//
++// A solution for this problem is to check legality of the
++// call instruction right after generating it in vectorizer and
++// if it is illegal we split the call arguments and issue multiple
++// calls to match the legal VF. This is demonstrated currently for
++// the SVML vector library calls (non-intrinsic version only).
++//
++// Future directions and extensions:
++// 1) This legalization example shows us that a good direction
++// for the VPlan framework would be to model the vector call
++// instructions in a way that legal VF for each call is chosen
++// correctly within vectorizer and illegal code generation is
++// avoided.
++// 2) This logic can also be extended to general vector functions
++// i.e. legalization OpenMP decalre simd functions. The
++// requirements needed for this will be documented soon.
++
++Value *InnerLoopVectorizer::legalizeSVMLCall(CallInst *VecCall,
++ CallInst *Call) {
++ ElementCount LegalVF = getLegalVFForCall(VecCall);
++
++ assert(LegalVF.getKnownMinValue() > 1 &&
++ "Legal VF for SVML call must be greater than 1 to vectorize");
++
++ if (LegalVF == VF)
++ return VecCall;
++ else if (LegalVF.getKnownMinValue() > VF.getKnownMinValue())
++ // TODO: handle case when we are underfilling vectors
++ return VecCall;
++
++ // Legal VF for this SVML call is smaller than chosen VF, break it down into
++ // smaller call instructions
++
++ // Convert args, types and return type to match legal VF
++ SmallVector<Type *, 4> NewTys;
++ SmallVector<Value *, 4> NewArgs;
++
++ for (Value *ArgOperand : Call->args()) {
++ Type *Ty = ToVectorTy(ArgOperand->getType(), LegalVF);
++ NewTys.push_back(Ty);
++ NewArgs.push_back(UndefValue::get(Ty));
+ }
++
++ // Construct legal vector function
++ const VFShape Shape =
++ VFShape::get(*Call, LegalVF /*EC*/, false /*HasGlobalPred*/);
++ Function *LegalVectorF = VFDatabase(*Call).getVectorizedFunction(Shape);
++ assert(LegalVectorF != nullptr && "Can't create legal vector function.");
++
++ LLVM_DEBUG(dbgs() << "LV(SVML): LegalVectorF: "; LegalVectorF->dump());
++
++ SmallVector<OperandBundleDef, 1> OpBundles;
++ Call->getOperandBundlesAsDefs(OpBundles);
++ auto LegalV = std::unique_ptr<CallInst>(CallInst::Create(LegalVectorF, NewArgs, OpBundles));
++
++ if (isa<FPMathOperator>(LegalV))
++ LegalV->copyFastMathFlags(Call);
++
++ const DataLayout &DL = VecCall->getModule()->getDataLayout();
++ // Set SVML calling conventions
++ setVectorFunctionCallingConv(*LegalV, DL, *TLI);
++
++ LLVM_DEBUG(dbgs() << "LV(SVML): LegalV: "; LegalV->dump());
++
++ Value *LegalizedCall = partialVectorizeCall(VecCall, LegalV.get(), LegalVF.getKnownMinValue());
++
++ LLVM_DEBUG(dbgs() << "LV(SVML): LegalizedCall: "; LegalizedCall->dump());
++
++ // Remove the illegal call from Builder
++ VecCall->eraseFromParent();
++
++ return LegalizedCall;
++}
++
++ElementCount InnerLoopVectorizer::getLegalVFForCall(CallInst *CI) {
++ const DataLayout DL = CI->getModule()->getDataLayout();
++ FunctionType *CallFT = CI->getFunctionType();
++ // All functions that need legalization should have a vector return type.
++ // This is true for all SVML functions that are currently supported.
++ assert(isa<VectorType>(CallFT->getReturnType()) &&
++ "Return type of call that needs legalization is not a vector.");
++ auto *VecCallRetType = cast<VectorType>(CallFT->getReturnType());
++ Type *ElemType = VecCallRetType->getElementType();
++
++ unsigned TypeBitWidth = DL.getTypeSizeInBits(ElemType);
++ unsigned VectorBitWidth = TTI->getRegisterBitWidth(TargetTransformInfo::RGK_FixedWidthVector);
++ unsigned LegalVF = VectorBitWidth / TypeBitWidth;
++
++ LLVM_DEBUG(dbgs() << "LV(SVML): Type Bit Width: " << TypeBitWidth << "\n");
++ LLVM_DEBUG(dbgs() << "LV(SVML): Current VL: " << VF << "\n");
++ LLVM_DEBUG(dbgs() << "LV(SVML): Vector Bit Width: " << VectorBitWidth
++ << "\n");
++ LLVM_DEBUG(dbgs() << "LV(SVML): Legal Target VL: " << LegalVF << "\n");
++
++ return ElementCount::getFixed(LegalVF);
++}
++
++// Partial vectorization of a call instruction is achieved by making clones of
++// \p LegalCall and overwriting its argument operands with shufflevector
++// equivalent decided based on \p LegalVF and current Part being filled.
++Value *InnerLoopVectorizer::partialVectorizeCall(CallInst *Call,
++ CallInst *LegalCall,
++ unsigned LegalVF) {
++ unsigned NumParts = VF.getKnownMinValue() / LegalVF;
++ LLVM_DEBUG(dbgs() << "LV(SVML): NumParts: " << NumParts << "\n");
++ SmallVector<Value *, 8> CallResults;
++
++ for (unsigned Part = 0; Part < NumParts; ++Part) {
++ auto *ClonedCall = cast<CallInst>(LegalCall->clone());
++
++ // Update the arg operand of cloned call to shufflevector
++ for (unsigned i = 0, ie = Call->arg_size(); i != ie; ++i) {
++ auto *NewOp = generateShuffleValue(Call->getArgOperand(i), LegalVF, Part);
++ ClonedCall->setArgOperand(i, NewOp);
++ }
++
++ LLVM_DEBUG(dbgs() << "LV(SVML): ClonedCall: "; ClonedCall->dump());
++
++ auto *PartialVecCall = Builder.Insert(ClonedCall);
++ CallResults.push_back(PartialVecCall);
++ }
++
++ return combinePartialVecCalls(CallResults);
++}
++
++Value *InnerLoopVectorizer::generateShuffleValue(Value *V, unsigned LegalVF,
++ unsigned Part) {
++ // Example:
++ // Consider the following vector code -
++ // %1 = sitofp <4 x i32> %0 to <4 x double>
++ // %2 = call <4 x double> @__svml_sin4(<4 x double> %1)
++ //
++ // If the LegalVF is 2, we partially vectorize the sin4 call by invoking
++ // generateShuffleValue on the operand %1
++ // If Part = 1, output value is -
++ // %shuffle = shufflevector <4 x double> %1, <4 x double> undef, <2 x i32><i32 0, i32 1>
++ // and if Part = 2, output is -
++ // %shuffle7 =shufflevector <4 x double> %1, <4 x double> undef, <2 x i32><i32 2, i32 3>
++
++ assert(isa<VectorType>(V->getType()) &&
++ "Cannot generate shuffles for non-vector values.");
++ SmallVector<int, 4> ShuffleMask;
++ Value *Undef = UndefValue::get(V->getType());
++
++ unsigned ElemIdx = Part * LegalVF;
++
++ for (unsigned K = 0; K < LegalVF; K++)
++ ShuffleMask.push_back(static_cast<int>(ElemIdx + K));
++
++ auto *ShuffleInst =
++ Builder.CreateShuffleVector(V, Undef, ShuffleMask, "shuffle");
++
++ return ShuffleInst;
++}
++
++// Results of the calls executed by smaller legal call instructions must be
++// combined to match the original VF for later use. This is done by constructing
++// shufflevector instructions in a cumulative fashion.
++Value *InnerLoopVectorizer::combinePartialVecCalls(
++ SmallVectorImpl<Value *> &CallResults) {
++ assert(isa<VectorType>(CallResults[0]->getType()) &&
++ "Cannot combine calls with non-vector results.");
++ auto *CallType = cast<VectorType>(CallResults[0]->getType());
++
++ Value *CombinedShuffle;
++ unsigned NumElems = CallType->getElementCount().getKnownMinValue() * 2;
++ unsigned NumRegs = CallResults.size();
++
++ assert(NumRegs >= 2 && isPowerOf2_32(NumRegs) &&
++ "Number of partial vector calls to combine must be a power of 2 "
++ "(atleast 2^1)");
++
++ while (NumRegs > 1) {
++ for (unsigned I = 0; I < NumRegs; I += 2) {
++ SmallVector<int, 4> ShuffleMask;
++ for (unsigned J = 0; J < NumElems; J++)
++ ShuffleMask.push_back(static_cast<int>(J));
++
++ CombinedShuffle = Builder.CreateShuffleVector(
++ CallResults[I], CallResults[I + 1], ShuffleMask, "combined");
++ LLVM_DEBUG(dbgs() << "LV(SVML): CombinedShuffle:";
++ CombinedShuffle->dump());
++ CallResults.push_back(CombinedShuffle);
++ }
++
++ SmallVector<Value *, 2>::iterator Start = CallResults.begin();
++ SmallVector<Value *, 2>::iterator End = Start + NumRegs;
++ CallResults.erase(Start, End);
++
++ NumElems *= 2;
++ NumRegs /= 2;
++ }
++
++ return CombinedShuffle;
+ }
+
+ void LoopVectorizationCostModel::collectLoopScalars(ElementCount VF) {
+diff --git a/llvm-14.0.6.src/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm-14.0.6.src/lib/Transforms/Vectorize/SLPVectorizer.cpp
+index 644372483edde..342f018b92184 100644
+--- a/llvm-14.0.6.src/lib/Transforms/Vectorize/SLPVectorizer.cpp
++++ b/llvm-14.0.6.src/lib/Transforms/Vectorize/SLPVectorizer.cpp
+@@ -6322,6 +6322,17 @@ Value *BoUpSLP::vectorizeTree(ArrayRef<Value *> VL) {
+ return Vec;
+ }
+
++static void setVectorFunctionCallingConv(CallInst &CI, const DataLayout &DL,
++ const TargetLibraryInfo &TLI) {
++ Function *VectorF = CI.getCalledFunction();
++ FunctionType *FTy = VectorF->getFunctionType();
++ StringRef VFName = VectorF->getName();
++ auto CC = TLI.getVectorizedFunctionCallingConv(VFName, *FTy, DL);
++ if (CC) {
++ CI.setCallingConv(*CC);
++ }
++}
++
+ Value *BoUpSLP::vectorizeTree(TreeEntry *E) {
+ IRBuilder<>::InsertPointGuard Guard(Builder);
+
+@@ -6794,7 +6805,12 @@ Value *BoUpSLP::vectorizeTree(TreeEntry *E) {
+
+ SmallVector<OperandBundleDef, 1> OpBundles;
+ CI->getOperandBundlesAsDefs(OpBundles);
+- Value *V = Builder.CreateCall(CF, OpVecs, OpBundles);
++
++ CallInst *NewCall = Builder.CreateCall(CF, OpVecs, OpBundles);
++ const DataLayout &DL = NewCall->getModule()->getDataLayout();
++ setVectorFunctionCallingConv(*NewCall, DL, *TLI);
++
++ Value *V = NewCall;
+
+ // The scalar argument uses an in-tree scalar so we add the new vectorized
+ // call to ExternalUses list to make sure that an extract will be
+diff --git a/llvm-14.0.6.src/test/CodeGen/Generic/replace-intrinsics-with-veclib.ll b/llvm-14.0.6.src/test/CodeGen/Generic/replace-intrinsics-with-veclib.ll
+index df8b7c498bd00..63a36549f18fd 100644
+--- a/llvm-14.0.6.src/test/CodeGen/Generic/replace-intrinsics-with-veclib.ll
++++ b/llvm-14.0.6.src/test/CodeGen/Generic/replace-intrinsics-with-veclib.ll
+@@ -10,7 +10,7 @@ target triple = "x86_64-unknown-linux-gnu"
+ define <4 x double> @exp_v4(<4 x double> %in) {
+ ; SVML-LABEL: define {{[^@]+}}@exp_v4
+ ; SVML-SAME: (<4 x double> [[IN:%.*]]) {
+-; SVML-NEXT: [[TMP1:%.*]] = call <4 x double> @__svml_exp4(<4 x double> [[IN]])
++; SVML-NEXT: [[TMP1:%.*]] = call <4 x double> @__svml_exp4_ha(<4 x double> [[IN]])
+ ; SVML-NEXT: ret <4 x double> [[TMP1]]
+ ;
+ ; LIBMVEC-X86-LABEL: define {{[^@]+}}@exp_v4
+@@ -37,7 +37,7 @@ declare <4 x double> @llvm.exp.v4f64(<4 x double>) #0
+ define <4 x float> @exp_f32(<4 x float> %in) {
+ ; SVML-LABEL: define {{[^@]+}}@exp_f32
+ ; SVML-SAME: (<4 x float> [[IN:%.*]]) {
+-; SVML-NEXT: [[TMP1:%.*]] = call <4 x float> @__svml_expf4(<4 x float> [[IN]])
++; SVML-NEXT: [[TMP1:%.*]] = call <4 x float> @__svml_expf4_ha(<4 x float> [[IN]])
+ ; SVML-NEXT: ret <4 x float> [[TMP1]]
+ ;
+ ; LIBMVEC-X86-LABEL: define {{[^@]+}}@exp_f32
+diff --git a/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-calls-finite.ll b/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-calls-finite.ll
+index a6e191c3d6923..d6e2e11106949 100644
+--- a/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-calls-finite.ll
++++ b/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-calls-finite.ll
+@@ -39,7 +39,8 @@ for.end: ; preds = %for.body
+ declare double @__exp_finite(double) #0
+
+ ; CHECK-LABEL: @exp_f64
+-; CHECK: <4 x double> @__svml_exp4
++; CHECK: <2 x double> @__svml_exp2
++; CHECK: <2 x double> @__svml_exp2
+ ; CHECK: ret
+ define void @exp_f64(double* nocapture %varray) {
+ entry:
+@@ -99,7 +100,8 @@ for.end: ; preds = %for.body
+ declare double @__log_finite(double) #0
+
+ ; CHECK-LABEL: @log_f64
+-; CHECK: <4 x double> @__svml_log4
++; CHECK: <2 x double> @__svml_log2
++; CHECK: <2 x double> @__svml_log2
+ ; CHECK: ret
+ define void @log_f64(double* nocapture %varray) {
+ entry:
+@@ -159,7 +161,8 @@ for.end: ; preds = %for.body
+ declare double @__pow_finite(double, double) #0
+
+ ; CHECK-LABEL: @pow_f64
+-; CHECK: <4 x double> @__svml_pow4
++; CHECK: <2 x double> @__svml_pow2
++; CHECK: <2 x double> @__svml_pow2
+ ; CHECK: ret
+ define void @pow_f64(double* nocapture %varray, double* nocapture readonly %exp) {
+ entry:
+@@ -190,7 +193,8 @@ declare float @__exp2f_finite(float) #0
+
+ define void @exp2f_finite(float* nocapture %varray) {
+ ; CHECK-LABEL: @exp2f_finite(
+-; CHECK: call <4 x float> @__svml_exp2f4(<4 x float> %{{.*}})
++; CHECK: call intel_svmlcc128 <4 x float> @__svml_exp2f4_ha(<4 x float> %{{.*}})
++; CHECK: call intel_svmlcc128 <4 x float> @__svml_exp2f4_ha(<4 x float> %{{.*}})
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -219,7 +223,8 @@ declare double @__exp2_finite(double) #0
+
+ define void @exp2_finite(double* nocapture %varray) {
+ ; CHECK-LABEL: @exp2_finite(
+-; CHECK: call <4 x double> @__svml_exp24(<4 x double> {{.*}})
++; CHECK: call intel_svmlcc128 <2 x double> @__svml_exp22_ha(<2 x double> {{.*}})
++; CHECK: call intel_svmlcc128 <2 x double> @__svml_exp22_ha(<2 x double> {{.*}})
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -276,7 +281,8 @@ for.end: ; preds = %for.body
+ declare double @__log2_finite(double) #0
+
+ ; CHECK-LABEL: @log2_f64
+-; CHECK: <4 x double> @__svml_log24
++; CHECK: <2 x double> @__svml_log22
++; CHECK: <2 x double> @__svml_log22
+ ; CHECK: ret
+ define void @log2_f64(double* nocapture %varray) {
+ entry:
+@@ -333,7 +339,8 @@ for.end: ; preds = %for.body
+ declare double @__log10_finite(double) #0
+
+ ; CHECK-LABEL: @log10_f64
+-; CHECK: <4 x double> @__svml_log104
++; CHECK: <2 x double> @__svml_log102
++; CHECK: <2 x double> @__svml_log102
+ ; CHECK: ret
+ define void @log10_f64(double* nocapture %varray) {
+ entry:
+@@ -390,7 +397,8 @@ for.end: ; preds = %for.body
+ declare double @__sqrt_finite(double) #0
+
+ ; CHECK-LABEL: @sqrt_f64
+-; CHECK: <4 x double> @__svml_sqrt4
++; CHECK: <2 x double> @__svml_sqrt2
++; CHECK: <2 x double> @__svml_sqrt2
+ ; CHECK: ret
+ define void @sqrt_f64(double* nocapture %varray) {
+ entry:
+diff --git a/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-calls.ll b/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-calls.ll
+index 42c280df6ad02..088bbdcf1aa4a 100644
+--- a/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-calls.ll
++++ b/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-calls.ll
+@@ -48,7 +48,7 @@ declare float @llvm.exp2.f32(float) #0
+
+ define void @sin_f64(double* nocapture %varray) {
+ ; CHECK-LABEL: @sin_f64(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_sin4(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_sin4_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -71,7 +71,7 @@ for.end:
+
+ define void @sin_f32(float* nocapture %varray) {
+ ; CHECK-LABEL: @sin_f32(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_sinf4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_sinf4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -94,7 +94,7 @@ for.end:
+
+ define void @sin_f64_intrinsic(double* nocapture %varray) {
+ ; CHECK-LABEL: @sin_f64_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_sin4(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_sin4_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -117,7 +117,7 @@ for.end:
+
+ define void @sin_f32_intrinsic(float* nocapture %varray) {
+ ; CHECK-LABEL: @sin_f32_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_sinf4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_sinf4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -140,7 +140,7 @@ for.end:
+
+ define void @cos_f64(double* nocapture %varray) {
+ ; CHECK-LABEL: @cos_f64(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_cos4(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_cos4_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -163,7 +163,7 @@ for.end:
+
+ define void @cos_f32(float* nocapture %varray) {
+ ; CHECK-LABEL: @cos_f32(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_cosf4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_cosf4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -186,7 +186,7 @@ for.end:
+
+ define void @cos_f64_intrinsic(double* nocapture %varray) {
+ ; CHECK-LABEL: @cos_f64_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_cos4(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_cos4_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -209,7 +209,7 @@ for.end:
+
+ define void @cos_f32_intrinsic(float* nocapture %varray) {
+ ; CHECK-LABEL: @cos_f32_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_cosf4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_cosf4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -232,7 +232,7 @@ for.end:
+
+ define void @pow_f64(double* nocapture %varray, double* nocapture readonly %exp) {
+ ; CHECK-LABEL: @pow_f64(
+-; CHECK: [[TMP8:%.*]] = call <4 x double> @__svml_pow4(<4 x double> [[TMP4:%.*]], <4 x double> [[WIDE_LOAD:%.*]])
++; CHECK: [[TMP8:%.*]] = call intel_svmlcc256 <4 x double> @__svml_pow4_ha(<4 x double> [[TMP4:%.*]], <4 x double> [[WIDE_LOAD:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -257,7 +257,7 @@ for.end:
+
+ define void @pow_f64_intrinsic(double* nocapture %varray, double* nocapture readonly %exp) {
+ ; CHECK-LABEL: @pow_f64_intrinsic(
+-; CHECK: [[TMP8:%.*]] = call <4 x double> @__svml_pow4(<4 x double> [[TMP4:%.*]], <4 x double> [[WIDE_LOAD:%.*]])
++; CHECK: [[TMP8:%.*]] = call intel_svmlcc256 <4 x double> @__svml_pow4_ha(<4 x double> [[TMP4:%.*]], <4 x double> [[WIDE_LOAD:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -282,7 +282,7 @@ for.end:
+
+ define void @pow_f32(float* nocapture %varray, float* nocapture readonly %exp) {
+ ; CHECK-LABEL: @pow_f32(
+-; CHECK: [[TMP8:%.*]] = call <4 x float> @__svml_powf4(<4 x float> [[TMP4:%.*]], <4 x float> [[WIDE_LOAD:%.*]])
++; CHECK: [[TMP8:%.*]] = call intel_svmlcc128 <4 x float> @__svml_powf4_ha(<4 x float> [[TMP4:%.*]], <4 x float> [[WIDE_LOAD:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -307,7 +307,7 @@ for.end:
+
+ define void @pow_f32_intrinsic(float* nocapture %varray, float* nocapture readonly %exp) {
+ ; CHECK-LABEL: @pow_f32_intrinsic(
+-; CHECK: [[TMP8:%.*]] = call <4 x float> @__svml_powf4(<4 x float> [[TMP4:%.*]], <4 x float> [[WIDE_LOAD:%.*]])
++; CHECK: [[TMP8:%.*]] = call intel_svmlcc128 <4 x float> @__svml_powf4_ha(<4 x float> [[TMP4:%.*]], <4 x float> [[WIDE_LOAD:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -332,7 +332,7 @@ for.end:
+
+ define void @exp_f64(double* nocapture %varray) {
+ ; CHECK-LABEL: @exp_f64(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_exp4(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_exp4_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -355,7 +355,7 @@ for.end:
+
+ define void @exp_f32(float* nocapture %varray) {
+ ; CHECK-LABEL: @exp_f32(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_expf4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_expf4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -378,7 +378,7 @@ for.end:
+
+ define void @exp_f64_intrinsic(double* nocapture %varray) {
+ ; CHECK-LABEL: @exp_f64_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_exp4(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_exp4_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -401,7 +401,7 @@ for.end:
+
+ define void @exp_f32_intrinsic(float* nocapture %varray) {
+ ; CHECK-LABEL: @exp_f32_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_expf4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_expf4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -424,7 +424,7 @@ for.end:
+
+ define void @log_f64(double* nocapture %varray) {
+ ; CHECK-LABEL: @log_f64(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_log4(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_log4_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -447,7 +447,7 @@ for.end:
+
+ define void @log_f32(float* nocapture %varray) {
+ ; CHECK-LABEL: @log_f32(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_logf4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_logf4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -470,7 +470,7 @@ for.end:
+
+ define void @log_f64_intrinsic(double* nocapture %varray) {
+ ; CHECK-LABEL: @log_f64_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_log4(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_log4_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -493,7 +493,7 @@ for.end:
+
+ define void @log_f32_intrinsic(float* nocapture %varray) {
+ ; CHECK-LABEL: @log_f32_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_logf4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_logf4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -516,7 +516,7 @@ for.end:
+
+ define void @log2_f64(double* nocapture %varray) {
+ ; CHECK-LABEL: @log2_f64(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_log24(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_log24_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -539,7 +539,7 @@ for.end:
+
+ define void @log2_f32(float* nocapture %varray) {
+ ; CHECK-LABEL: @log2_f32(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_log2f4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_log2f4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -562,7 +562,7 @@ for.end:
+
+ define void @log2_f64_intrinsic(double* nocapture %varray) {
+ ; CHECK-LABEL: @log2_f64_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_log24(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_log24_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -585,7 +585,7 @@ for.end:
+
+ define void @log2_f32_intrinsic(float* nocapture %varray) {
+ ; CHECK-LABEL: @log2_f32_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_log2f4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_log2f4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -608,7 +608,7 @@ for.end:
+
+ define void @log10_f64(double* nocapture %varray) {
+ ; CHECK-LABEL: @log10_f64(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_log104(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_log104_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -631,7 +631,7 @@ for.end:
+
+ define void @log10_f32(float* nocapture %varray) {
+ ; CHECK-LABEL: @log10_f32(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_log10f4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_log10f4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -654,7 +654,7 @@ for.end:
+
+ define void @log10_f64_intrinsic(double* nocapture %varray) {
+ ; CHECK-LABEL: @log10_f64_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_log104(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_log104_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -677,7 +677,7 @@ for.end:
+
+ define void @log10_f32_intrinsic(float* nocapture %varray) {
+ ; CHECK-LABEL: @log10_f32_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_log10f4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_log10f4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -700,7 +700,7 @@ for.end:
+
+ define void @sqrt_f64(double* nocapture %varray) {
+ ; CHECK-LABEL: @sqrt_f64(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_sqrt4(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_sqrt4_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -723,7 +723,7 @@ for.end:
+
+ define void @sqrt_f32(float* nocapture %varray) {
+ ; CHECK-LABEL: @sqrt_f32(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_sqrtf4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_sqrtf4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -746,7 +746,7 @@ for.end:
+
+ define void @exp2_f64(double* nocapture %varray) {
+ ; CHECK-LABEL: @exp2_f64(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_exp24(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_exp24_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -769,7 +769,7 @@ for.end:
+
+ define void @exp2_f32(float* nocapture %varray) {
+ ; CHECK-LABEL: @exp2_f32(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_exp2f4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_exp2f4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -792,7 +792,7 @@ for.end:
+
+ define void @exp2_f64_intrinsic(double* nocapture %varray) {
+ ; CHECK-LABEL: @exp2_f64_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x double> @__svml_exp24(<4 x double> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc256 <4 x double> @__svml_exp24_ha(<4 x double> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -815,7 +815,7 @@ for.end:
+
+ define void @exp2_f32_intrinsic(float* nocapture %varray) {
+ ; CHECK-LABEL: @exp2_f32_intrinsic(
+-; CHECK: [[TMP5:%.*]] = call <4 x float> @__svml_exp2f4(<4 x float> [[TMP4:%.*]])
++; CHECK: [[TMP5:%.*]] = call intel_svmlcc128 <4 x float> @__svml_exp2f4_ha(<4 x float> [[TMP4:%.*]])
+ ; CHECK: ret void
+ ;
+ entry:
+@@ -836,4 +836,44 @@ for.end:
+ ret void
+ }
+
++; CHECK-LABEL: @atan2_finite
++; CHECK: intel_svmlcc256 <4 x double> @__svml_atan24(
++; CHECK: intel_svmlcc256 <4 x double> @__svml_atan24(
++; CHECK: ret
++
++declare double @__atan2_finite(double, double) local_unnamed_addr #0
++
++define void @atan2_finite([100 x double]* nocapture %varray) local_unnamed_addr #0 {
++entry:
++ br label %for.cond1.preheader
++
++for.cond1.preheader: ; preds = %for.inc7, %entry
++ %indvars.iv19 = phi i64 [ 0, %entry ], [ %indvars.iv.next20, %for.inc7 ]
++ %0 = trunc i64 %indvars.iv19 to i32
++ %conv = sitofp i32 %0 to double
++ br label %for.body3
++
++for.body3: ; preds = %for.body3, %for.cond1.preheader
++ %indvars.iv = phi i64 [ 0, %for.cond1.preheader ], [ %indvars.iv.next, %for.body3 ]
++ %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
++ %1 = trunc i64 %indvars.iv.next to i32
++ %conv4 = sitofp i32 %1 to double
++ %call = tail call fast double @__atan2_finite(double %conv, double %conv4)
++ %arrayidx6 = getelementptr inbounds [100 x double], [100 x double]* %varray, i64 %indvars.iv19, i64 %indvars.iv
++ store double %call, double* %arrayidx6, align 8
++ %exitcond = icmp eq i64 %indvars.iv.next, 100
++ br i1 %exitcond, label %for.inc7, label %for.body3, !llvm.loop !5
++
++for.inc7: ; preds = %for.body3
++ %indvars.iv.next20 = add nuw nsw i64 %indvars.iv19, 1
++ %exitcond21 = icmp eq i64 %indvars.iv.next20, 100
++ br i1 %exitcond21, label %for.end9, label %for.cond1.preheader
++
++for.end9: ; preds = %for.inc7
++ ret void
++}
++
+ attributes #0 = { nounwind readnone }
++!5 = distinct !{!5, !6, !7}
++!6 = !{!"llvm.loop.vectorize.width", i32 8}
++!7 = !{!"llvm.loop.vectorize.enable", i1 true}
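A note on the atan2 test added here: the !5/!6/!7 loop metadata forces a
vectorization width of 8, which is why each CHECK expects *two* 4-wide
__svml_atan24 calls — presumably because a 256-bit AVX register holds only
4 doubles, so the 8-wide request gets legalized into two calls. A minimal
C sketch of source that lowers to IR like this; function and parameter
names are mine, and the *_finite entry point assumes glibc's -ffast-math
redirection:

  #include <math.h>

  /* The pragma on the inner loop is what becomes the
   * !llvm.loop.vectorize.width(8) metadata; with -ffast-math, glibc
   * headers redirect atan2() to __atan2_finite(). */
  void atan2_grid(double a[100][100]) {
    for (int i = 0; i < 100; i++) {
  #pragma clang loop vectorize_width(8)
      for (int j = 0; j < 100; j++)
        a[i][j] = atan2((double)i, (double)(j + 1));
    }
  }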
+diff --git a/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-legal-calls.ll b/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-legal-calls.ll
+new file mode 100644
+index 0000000000000..326c763994343
+--- /dev/null
++++ b/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-legal-calls.ll
+@@ -0,0 +1,513 @@
++; Check legalization of SVML calls, including intrinsic versions (like @llvm.<fn_name>.<type>).
++
++; RUN: opt -vector-library=SVML -inject-tli-mappings -loop-vectorize -force-vector-width=8 -force-vector-interleave=1 -mattr=avx -S < %s | FileCheck %s
++
++target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
++target triple = "x86_64-unknown-linux-gnu"
++
++declare double @sin(double) #0
++declare float @sinf(float) #0
++declare double @llvm.sin.f64(double) #0
++declare float @llvm.sin.f32(float) #0
++
++declare double @cos(double) #0
++declare float @cosf(float) #0
++declare double @llvm.cos.f64(double) #0
++declare float @llvm.cos.f32(float) #0
++
++declare double @pow(double, double) #0
++declare float @powf(float, float) #0
++declare double @llvm.pow.f64(double, double) #0
++declare float @llvm.pow.f32(float, float) #0
++
++declare double @exp(double) #0
++declare float @expf(float) #0
++declare double @llvm.exp.f64(double) #0
++declare float @llvm.exp.f32(float) #0
++
++declare double @log(double) #0
++declare float @logf(float) #0
++declare double @llvm.log.f64(double) #0
++declare float @llvm.log.f32(float) #0
++
++
++define void @sin_f64(double* nocapture %varray) {
++; CHECK-LABEL: @sin_f64(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <4 x double> @__svml_sin4_ha(<4 x double> [[TMP2:%.*]])
++; CHECK: [[TMP3:%.*]] = call intel_svmlcc256 <4 x double> @__svml_sin4_ha(<4 x double> [[TMP4:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to double
++ %call = tail call double @sin(double %conv)
++ %arrayidx = getelementptr inbounds double, double* %varray, i64 %iv
++ store double %call, double* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @sin_f32(float* nocapture %varray) {
++; CHECK-LABEL: @sin_f32(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <8 x float> @__svml_sinf8_ha(<8 x float> [[TMP2:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to float
++ %call = tail call float @sinf(float %conv)
++ %arrayidx = getelementptr inbounds float, float* %varray, i64 %iv
++ store float %call, float* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @sin_f64_intrinsic(double* nocapture %varray) {
++; CHECK-LABEL: @sin_f64_intrinsic(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <4 x double> @__svml_sin4_ha(<4 x double> [[TMP2:%.*]])
++; CHECK: [[TMP3:%.*]] = call intel_svmlcc256 <4 x double> @__svml_sin4_ha(<4 x double> [[TMP4:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to double
++ %call = tail call double @llvm.sin.f64(double %conv)
++ %arrayidx = getelementptr inbounds double, double* %varray, i64 %iv
++ store double %call, double* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @sin_f32_intrinsic(float* nocapture %varray) {
++; CHECK-LABEL: @sin_f32_intrinsic(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <8 x float> @__svml_sinf8_ha(<8 x float> [[TMP2:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to float
++ %call = tail call float @llvm.sin.f32(float %conv)
++ %arrayidx = getelementptr inbounds float, float* %varray, i64 %iv
++ store float %call, float* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @cos_f64(double* nocapture %varray) {
++; CHECK-LABEL: @cos_f64(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <4 x double> @__svml_cos4_ha(<4 x double> [[TMP2:%.*]])
++; CHECK: [[TMP3:%.*]] = call intel_svmlcc256 <4 x double> @__svml_cos4_ha(<4 x double> [[TMP4:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to double
++ %call = tail call double @cos(double %conv)
++ %arrayidx = getelementptr inbounds double, double* %varray, i64 %iv
++ store double %call, double* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @cos_f32(float* nocapture %varray) {
++; CHECK-LABEL: @cos_f32(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <8 x float> @__svml_cosf8_ha(<8 x float> [[TMP2:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to float
++ %call = tail call float @cosf(float %conv)
++ %arrayidx = getelementptr inbounds float, float* %varray, i64 %iv
++ store float %call, float* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @cos_f64_intrinsic(double* nocapture %varray) {
++; CHECK-LABEL: @cos_f64_intrinsic(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <4 x double> @__svml_cos4_ha(<4 x double> [[TMP2:%.*]])
++; CHECK: [[TMP3:%.*]] = call intel_svmlcc256 <4 x double> @__svml_cos4_ha(<4 x double> [[TMP4:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to double
++ %call = tail call double @llvm.cos.f64(double %conv)
++ %arrayidx = getelementptr inbounds double, double* %varray, i64 %iv
++ store double %call, double* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @cos_f32_intrinsic(float* nocapture %varray) {
++; CHECK-LABEL: @cos_f32_intrinsic(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <8 x float> @__svml_cosf8_ha(<8 x float> [[TMP2:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to float
++ %call = tail call float @llvm.cos.f32(float %conv)
++ %arrayidx = getelementptr inbounds float, float* %varray, i64 %iv
++ store float %call, float* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @pow_f64(double* nocapture %varray, double* nocapture readonly %exp) {
++; CHECK-LABEL: @pow_f64(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <4 x double> @__svml_pow4_ha(<4 x double> [[TMP2:%.*]], <4 x double> [[TMP3:%.*]])
++; CHECK: [[TMP4:%.*]] = call intel_svmlcc256 <4 x double> @__svml_pow4_ha(<4 x double> [[TMP5:%.*]], <4 x double> [[TMP6:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to double
++ %arrayidx = getelementptr inbounds double, double* %exp, i64 %iv
++ %tmp1 = load double, double* %arrayidx, align 4
++ %tmp2 = tail call double @pow(double %conv, double %tmp1)
++ %arrayidx2 = getelementptr inbounds double, double* %varray, i64 %iv
++ store double %tmp2, double* %arrayidx2, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @pow_f64_intrinsic(double* nocapture %varray, double* nocapture readonly %exp) {
++; CHECK-LABEL: @pow_f64_intrinsic(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <4 x double> @__svml_pow4_ha(<4 x double> [[TMP2:%.*]], <4 x double> [[TMP3:%.*]])
++; CHECK: [[TMP4:%.*]] = call intel_svmlcc256 <4 x double> @__svml_pow4_ha(<4 x double> [[TMP5:%.*]], <4 x double> [[TMP6:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to double
++ %arrayidx = getelementptr inbounds double, double* %exp, i64 %iv
++ %tmp1 = load double, double* %arrayidx, align 4
++ %tmp2 = tail call double @llvm.pow.f64(double %conv, double %tmp1)
++ %arrayidx2 = getelementptr inbounds double, double* %varray, i64 %iv
++ store double %tmp2, double* %arrayidx2, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @pow_f32(float* nocapture %varray, float* nocapture readonly %exp) {
++; CHECK-LABEL: @pow_f32(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <8 x float> @__svml_powf8_ha(<8 x float> [[TMP2:%.*]], <8 x float> [[WIDE_LOAD:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to float
++ %arrayidx = getelementptr inbounds float, float* %exp, i64 %iv
++ %tmp1 = load float, float* %arrayidx, align 4
++ %tmp2 = tail call float @powf(float %conv, float %tmp1)
++ %arrayidx2 = getelementptr inbounds float, float* %varray, i64 %iv
++ store float %tmp2, float* %arrayidx2, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @pow_f32_intrinsic(float* nocapture %varray, float* nocapture readonly %exp) {
++; CHECK-LABEL: @pow_f32_intrinsic(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <8 x float> @__svml_powf8_ha(<8 x float> [[TMP2:%.*]], <8 x float> [[TMP3:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to float
++ %arrayidx = getelementptr inbounds float, float* %exp, i64 %iv
++ %tmp1 = load float, float* %arrayidx, align 4
++ %tmp2 = tail call float @llvm.pow.f32(float %conv, float %tmp1)
++ %arrayidx2 = getelementptr inbounds float, float* %varray, i64 %iv
++ store float %tmp2, float* %arrayidx2, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @exp_f64(double* nocapture %varray) {
++; CHECK-LABEL: @exp_f64(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <4 x double> @__svml_exp4_ha(<4 x double> [[TMP2:%.*]])
++; CHECK: [[TMP3:%.*]] = call intel_svmlcc256 <4 x double> @__svml_exp4_ha(<4 x double> [[TMP4:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to double
++ %call = tail call double @exp(double %conv)
++ %arrayidx = getelementptr inbounds double, double* %varray, i64 %iv
++ store double %call, double* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @exp_f32(float* nocapture %varray) {
++; CHECK-LABEL: @exp_f32(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <8 x float> @__svml_expf8_ha(<8 x float> [[TMP2:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to float
++ %call = tail call float @expf(float %conv)
++ %arrayidx = getelementptr inbounds float, float* %varray, i64 %iv
++ store float %call, float* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @exp_f64_intrinsic(double* nocapture %varray) {
++; CHECK-LABEL: @exp_f64_intrinsic(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <4 x double> @__svml_exp4_ha(<4 x double> [[TMP2:%.*]])
++; CHECK: [[TMP3:%.*]] = call intel_svmlcc256 <4 x double> @__svml_exp4_ha(<4 x double> [[TMP4:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to double
++ %call = tail call double @llvm.exp.f64(double %conv)
++ %arrayidx = getelementptr inbounds double, double* %varray, i64 %iv
++ store double %call, double* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @exp_f32_intrinsic(float* nocapture %varray) {
++; CHECK-LABEL: @exp_f32_intrinsic(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <8 x float> @__svml_expf8_ha(<8 x float> [[TMP2:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to float
++ %call = tail call float @llvm.exp.f32(float %conv)
++ %arrayidx = getelementptr inbounds float, float* %varray, i64 %iv
++ store float %call, float* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @log_f64(double* nocapture %varray) {
++; CHECK-LABEL: @log_f64(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <4 x double> @__svml_log4_ha(<4 x double> [[TMP2:%.*]])
++; CHECK: [[TMP3:%.*]] = call intel_svmlcc256 <4 x double> @__svml_log4_ha(<4 x double> [[TMP4:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to double
++ %call = tail call double @log(double %conv)
++ %arrayidx = getelementptr inbounds double, double* %varray, i64 %iv
++ store double %call, double* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @log_f32(float* nocapture %varray) {
++; CHECK-LABEL: @log_f32(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <8 x float> @__svml_logf8_ha(<8 x float> [[TMP2:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to float
++ %call = tail call float @logf(float %conv)
++ %arrayidx = getelementptr inbounds float, float* %varray, i64 %iv
++ store float %call, float* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @log_f64_intrinsic(double* nocapture %varray) {
++; CHECK-LABEL: @log_f64_intrinsic(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <4 x double> @__svml_log4_ha(<4 x double> [[TMP2:%.*]])
++; CHECK: [[TMP3:%.*]] = call intel_svmlcc256 <4 x double> @__svml_log4_ha(<4 x double> [[TMP4:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to double
++ %call = tail call double @llvm.log.f64(double %conv)
++ %arrayidx = getelementptr inbounds double, double* %varray, i64 %iv
++ store double %call, double* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++define void @log_f32_intrinsic(float* nocapture %varray) {
++; CHECK-LABEL: @log_f32_intrinsic(
++; CHECK: [[TMP1:%.*]] = call intel_svmlcc256 <8 x float> @__svml_logf8_ha(<8 x float> [[TMP2:%.*]])
++; CHECK: ret void
++;
++entry:
++ br label %for.body
++
++for.body:
++ %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
++ %tmp = trunc i64 %iv to i32
++ %conv = sitofp i32 %tmp to float
++ %call = tail call float @llvm.log.f32(float %conv)
++ %arrayidx = getelementptr inbounds float, float* %varray, i64 %iv
++ store float %call, float* %arrayidx, align 4
++ %iv.next = add nuw nsw i64 %iv, 1
++ %exitcond = icmp eq i64 %iv.next, 1000
++ br i1 %exitcond, label %for.end, label %for.body
++
++for.end:
++ ret void
++}
++
++attributes #0 = { nounwind readnone }
++
+diff --git a/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-legal-codegen.ll b/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-legal-codegen.ll
+new file mode 100644
+index 0000000000000..9422653445dc2
+--- /dev/null
++++ b/llvm-14.0.6.src/test/Transforms/LoopVectorize/X86/svml-legal-codegen.ll
+@@ -0,0 +1,61 @@
++; Check that vector codegen splits an illegal sin8 call into two sin4 calls on AVX for the double data type.
++; The C code used to generate this test:
++
++; #include <math.h>
++;
++; void foo(double *a, int N){
++; int i;
++; #pragma clang loop vectorize_width(8)
++; for (i=0;i<N;i++){
++; a[i] = sin(i);
++; }
++; }
++
++; RUN: opt -vector-library=SVML -inject-tli-mappings -loop-vectorize -force-vector-width=8 -mattr=avx -S < %s | FileCheck %s
++
++; CHECK: [[I1:%.*]] = sitofp <8 x i32> [[I0:%.*]] to <8 x double>
++; CHECK-NEXT: [[S1:%shuffle.*]] = shufflevector <8 x double> [[I1]], <8 x double> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
++; CHECK-NEXT: [[I2:%.*]] = call fast intel_svmlcc256 <4 x double> @__svml_sin4(<4 x double> [[S1]])
++; CHECK-NEXT: [[S2:%shuffle.*]] = shufflevector <8 x double> [[I1]], <8 x double> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
++; CHECK-NEXT: [[I3:%.*]] = call fast intel_svmlcc256 <4 x double> @__svml_sin4(<4 x double> [[S2]])
++; CHECK-NEXT: [[comb:%combined.*]] = shufflevector <4 x double> [[I2]], <4 x double> [[I3]], <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
++; CHECK: store <8 x double> [[comb]], <8 x double>* [[TMP:%.*]], align 8
++
++
++target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
++target triple = "x86_64-unknown-linux-gnu"
++
++; Function Attrs: nounwind uwtable
++define dso_local void @foo(double* nocapture %a, i32 %N) local_unnamed_addr #0 {
++entry:
++ %cmp5 = icmp sgt i32 %N, 0
++ br i1 %cmp5, label %for.body.preheader, label %for.end
++
++for.body.preheader: ; preds = %entry
++ %wide.trip.count = zext i32 %N to i64
++ br label %for.body
++
++for.body: ; preds = %for.body, %for.body.preheader
++ %indvars.iv = phi i64 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
++ %0 = trunc i64 %indvars.iv to i32
++ %conv = sitofp i32 %0 to double
++ %call = tail call fast double @sin(double %conv) #2
++ %arrayidx = getelementptr inbounds double, double* %a, i64 %indvars.iv
++ store double %call, double* %arrayidx, align 8, !tbaa !2
++ %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
++ %exitcond = icmp eq i64 %indvars.iv.next, %wide.trip.count
++ br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !6
++
++for.end: ; preds = %for.body, %entry
++ ret void
++}
++
++; Function Attrs: nounwind
++declare dso_local double @sin(double) local_unnamed_addr #1
++
++!2 = !{!3, !3, i64 0}
++!3 = !{!"double", !4, i64 0}
++!4 = !{!"omnipotent char", !5, i64 0}
++!5 = !{!"Simple C/C++ TBAA"}
++!6 = distinct !{!6, !7}
++!7 = !{!"llvm.loop.vectorize.width", i32 8}
+diff --git a/llvm-14.0.6.src/test/Transforms/Util/add-TLI-mappings.ll b/llvm-14.0.6.src/test/Transforms/Util/add-TLI-mappings.ll
+index e8c83c4d9bd1f..615fdc29176a2 100644
+--- a/llvm-14.0.6.src/test/Transforms/Util/add-TLI-mappings.ll
++++ b/llvm-14.0.6.src/test/Transforms/Util/add-TLI-mappings.ll
+@@ -12,12 +12,12 @@ target triple = "x86_64-unknown-linux-gnu"
+
+ ; COMMON-LABEL: @llvm.compiler.used = appending global
+ ; SVML-SAME: [6 x i8*] [
+-; SVML-SAME: i8* bitcast (<2 x double> (<2 x double>)* @__svml_sin2 to i8*),
+-; SVML-SAME: i8* bitcast (<4 x double> (<4 x double>)* @__svml_sin4 to i8*),
+-; SVML-SAME: i8* bitcast (<8 x double> (<8 x double>)* @__svml_sin8 to i8*),
+-; SVML-SAME: i8* bitcast (<4 x float> (<4 x float>)* @__svml_log10f4 to i8*),
+-; SVML-SAME: i8* bitcast (<8 x float> (<8 x float>)* @__svml_log10f8 to i8*),
+-; SVML-SAME: i8* bitcast (<16 x float> (<16 x float>)* @__svml_log10f16 to i8*)
++; SVML-SAME: i8* bitcast (<2 x double> (<2 x double>)* @__svml_sin2_ha to i8*),
++; SVML-SAME: i8* bitcast (<4 x double> (<4 x double>)* @__svml_sin4_ha to i8*),
++; SVML-SAME: i8* bitcast (<8 x double> (<8 x double>)* @__svml_sin8_ha to i8*),
++; SVML-SAME: i8* bitcast (<4 x float> (<4 x float>)* @__svml_log10f4_ha to i8*),
++; SVML-SAME: i8* bitcast (<8 x float> (<8 x float>)* @__svml_log10f8_ha to i8*),
++; SVML-SAME: i8* bitcast (<16 x float> (<16 x float>)* @__svml_log10f16_ha to i8*)
+ ; MASSV-SAME: [2 x i8*] [
+ ; MASSV-SAME: i8* bitcast (<2 x double> (<2 x double>)* @__sind2 to i8*),
+ ; MASSV-SAME: i8* bitcast (<4 x float> (<4 x float>)* @__log10f4 to i8*)
+@@ -59,9 +59,9 @@ declare float @llvm.log10.f32(float) #0
+ attributes #0 = { nounwind readnone }
+
+ ; SVML: attributes #[[SIN]] = { "vector-function-abi-variant"=
+-; SVML-SAME: "_ZGV_LLVM_N2v_sin(__svml_sin2),
+-; SVML-SAME: _ZGV_LLVM_N4v_sin(__svml_sin4),
+-; SVML-SAME: _ZGV_LLVM_N8v_sin(__svml_sin8)" }
++; SVML-SAME: "_ZGV_LLVM_N2v_sin(__svml_sin2_ha),
++; SVML-SAME: _ZGV_LLVM_N4v_sin(__svml_sin4_ha),
++; SVML-SAME: _ZGV_LLVM_N8v_sin(__svml_sin8_ha)" }
+
+ ; MASSV: attributes #[[SIN]] = { "vector-function-abi-variant"=
+ ; MASSV-SAME: "_ZGV_LLVM_N2v_sin(__sind2)" }
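For the curious: the tablegen backend added below emits exactly the
(scalar name, vector name, vectorization factor) triples that LLVM 14's
VecDesc expects, so TargetLibraryInfo can splice them in via the
GET_SVML_VARIANTS guard. A minimal sketch of the consuming side —
VecDesc and addVectorizableFunctions() are stock LLVM 14 API, but the
include path and array name here are illustrative, not lifted from the
patch:

  #include "llvm/Analysis/TargetLibraryInfo.h"

  using namespace llvm;

  // Splice the TableGen-emitted entries into a VecDesc table.
  // VecDesc is {ScalarFnName, VectorFnName, VectorizationFactor}.
  static const VecDesc SVMLVecFuncs[] = {
  #define GET_SVML_VARIANTS
  #include "llvm/Analysis/SVML.inc" // hypothetical generated file
  #undef GET_SVML_VARIANTS
  };

  static void addSVMLMappings(TargetLibraryInfoImpl &TLI) {
    TLI.addVectorizableFunctions(SVMLVecFuncs);
  }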
+diff --git a/llvm-14.0.6.src/utils/TableGen/CMakeLists.txt b/llvm-14.0.6.src/utils/TableGen/CMakeLists.txt
+index 97df6a55d1b59..199e0285c9e5d 100644
+--- a/llvm-14.0.6.src/utils/TableGen/CMakeLists.txt
++++ b/llvm-14.0.6.src/utils/TableGen/CMakeLists.txt
+@@ -47,6 +47,7 @@ add_tablegen(llvm-tblgen LLVM
+ SearchableTableEmitter.cpp
+ SubtargetEmitter.cpp
+ SubtargetFeatureInfo.cpp
++ SVMLEmitter.cpp
+ TableGen.cpp
+ Types.cpp
+ X86DisassemblerTables.cpp
+diff --git a/llvm-14.0.6.src/utils/TableGen/SVMLEmitter.cpp b/llvm-14.0.6.src/utils/TableGen/SVMLEmitter.cpp
+new file mode 100644
+index 0000000000000..a5aeea48db28b
+--- /dev/null
++++ b/llvm-14.0.6.src/utils/TableGen/SVMLEmitter.cpp
+@@ -0,0 +1,110 @@
++//===------ SVMLEmitter.cpp - Generate SVML function variants -------------===//
++//
++// The LLVM Compiler Infrastructure
++//
++// This file is distributed under the University of Illinois Open Source
++// License. See LICENSE.TXT for details.
++//
++//===----------------------------------------------------------------------===//
++//
++// This TableGen backend emits the scalar-to-SVML function map for TLI.
++//
++//===----------------------------------------------------------------------===//
++
++#include "CodeGenTarget.h"
++#include "llvm/Support/Format.h"
++#include "llvm/TableGen/Error.h"
++#include "llvm/TableGen/Record.h"
++#include "llvm/TableGen/TableGenBackend.h"
++#include <map>
++#include <vector>
++
++using namespace llvm;
++
++#define DEBUG_TYPE "SVMLVariants"
++#include "llvm/Support/Debug.h"
++
++namespace {
++
++class SVMLVariantsEmitter {
++
++ RecordKeeper &Records;
++
++private:
++ void emitSVMLVariants(raw_ostream &OS);
++
++public:
++ SVMLVariantsEmitter(RecordKeeper &R) : Records(R) {}
++
++ void run(raw_ostream &OS);
++};
++} // End anonymous namespace
++
++/// \brief Emit the set of SVML variant function names.
++// The default is to emit the high-accuracy SVML variants until a mechanism
++// is introduced that lets the user select other variants through precision
++// requirements. This code generates mappings to SVML from scalars in the
++// form of LLVM intrinsics, math library calls, or the finite variants of
++// math library calls.
++void SVMLVariantsEmitter::emitSVMLVariants(raw_ostream &OS) {
++
++ const unsigned MinSinglePrecVL = 4;
++ const unsigned MaxSinglePrecVL = 16;
++ const unsigned MinDoublePrecVL = 2;
++ const unsigned MaxDoublePrecVL = 8;
++
++ OS << "#ifdef GET_SVML_VARIANTS\n";
++
++ for (const auto &D : Records.getAllDerivedDefinitions("SvmlVariant")) {
++ StringRef SvmlVariantNameStr = D->getName();
++ // Single Precision SVML
++ for (unsigned VL = MinSinglePrecVL; VL <= MaxSinglePrecVL; VL *= 2) {
++ // Emit the scalar math library function to svml function entry.
++ OS << "{\"" << SvmlVariantNameStr << "f" << "\", ";
++ OS << "\"" << "__svml_" << SvmlVariantNameStr << "f" << VL << "\", "
++ << "ElementCount::getFixed(" << VL << ")},\n";
++
++ // Emit the scalar intrinsic to svml function entry.
++ OS << "{\"" << "llvm." << SvmlVariantNameStr << ".f32" << "\", ";
++ OS << "\"" << "__svml_" << SvmlVariantNameStr << "f" << VL << "\", "
++ << "ElementCount::getFixed(" << VL << ")},\n";
++
++ // Emit the finite math library function to svml function entry.
++ OS << "{\"__" << SvmlVariantNameStr << "f_finite" << "\", ";
++ OS << "\"" << "__svml_" << SvmlVariantNameStr << "f" << VL << "\", "
++ << "ElementCount::getFixed(" << VL << ")},\n";
++ }
++
++ // Double Precision SVML
++ for (unsigned VL = MinDoublePrecVL; VL <= MaxDoublePrecVL; VL *= 2) {
++ // Emit the scalar math library function to svml function entry.
++ OS << "{\"" << SvmlVariantNameStr << "\", ";
++ OS << "\"" << "__svml_" << SvmlVariantNameStr << VL << "\", " << "ElementCount::getFixed(" << VL
++ << ")},\n";
++
++ // Emit the scalar intrinsic to svml function entry.
++ OS << "{\"" << "llvm." << SvmlVariantNameStr << ".f64" << "\", ";
++ OS << "\"" << "__svml_" << SvmlVariantNameStr << VL << "\", " << "ElementCount::getFixed(" << VL
++ << ")},\n";
++
++ // Emit the finite math library function to svml function entry.
++ OS << "{\"__" << SvmlVariantNameStr << "_finite" << "\", ";
++ OS << "\"" << "__svml_" << SvmlVariantNameStr << VL << "\", "
++ << "ElementCount::getFixed(" << VL << ")},\n";
++ }
++ }
++
++ OS << "#endif // GET_SVML_VARIANTS\n\n";
++}
++
++void SVMLVariantsEmitter::run(raw_ostream &OS) {
++ emitSVMLVariants(OS);
++}
++
++namespace llvm {
++
++void EmitSVMLVariants(RecordKeeper &RK, raw_ostream &OS) {
++ SVMLVariantsEmitter(RK).run(OS);
++}
++
++} // End llvm namespace
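To make the emitted mapping concrete: for a hypothetical SvmlVariant
record named "sin", the two loops above produce entries of this shape
(abridged to the smallest vector length per precision; VL then doubles
up to 16 for single and 8 for double precision):

  #ifdef GET_SVML_VARIANTS
  {"sinf", "__svml_sinf4", ElementCount::getFixed(4)},
  {"llvm.sin.f32", "__svml_sinf4", ElementCount::getFixed(4)},
  {"__sinf_finite", "__svml_sinf4", ElementCount::getFixed(4)},
  {"sin", "__svml_sin2", ElementCount::getFixed(2)},
  {"llvm.sin.f64", "__svml_sin2", ElementCount::getFixed(2)},
  {"__sin_finite", "__svml_sin2", ElementCount::getFixed(2)},
  #endif // GET_SVML_VARIANTS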
+diff --git a/llvm-14.0.6.src/utils/TableGen/TableGen.cpp b/llvm-14.0.6.src/utils/TableGen/TableGen.cpp
+index 2d4a45f889be6..603d0c223b33a 100644
+--- a/llvm-14.0.6.src/utils/TableGen/TableGen.cpp
++++ b/llvm-14.0.6.src/utils/TableGen/TableGen.cpp
+@@ -57,6 +57,7 @@ enum ActionType {
+ GenAutomata,
+ GenDirectivesEnumDecl,
+ GenDirectivesEnumImpl,
++ GenSVMLVariants,
+ };
+
+ namespace llvm {
+@@ -138,7 +139,9 @@ cl::opt<ActionType> Action(
+ clEnumValN(GenDirectivesEnumDecl, "gen-directive-decl",
+ "Generate directive related declaration code (header file)"),
+ clEnumValN(GenDirectivesEnumImpl, "gen-directive-impl",
+- "Generate directive related implementation code")));
++ "Generate directive related implementation code"),
++ clEnumValN(GenSVMLVariants, "gen-svml",
++ "Generate SVML variant function names")));
+
+ cl::OptionCategory PrintEnumsCat("Options for -print-enums");
+ cl::opt<std::string> Class("class", cl::desc("Print Enum list for this class"),
+@@ -272,6 +275,9 @@ bool LLVMTableGenMain(raw_ostream &OS, RecordKeeper &Records) {
+ case GenDirectivesEnumImpl:
+ EmitDirectivesImpl(Records, OS);
+ break;
++ case GenSVMLVariants:
++ EmitSVMLVariants(Records, OS);
++ break;
+ }
+
+ return false;
+diff --git a/llvm-14.0.6.src/utils/TableGen/TableGenBackends.h b/llvm-14.0.6.src/utils/TableGen/TableGenBackends.h
+index 71db8dc77b052..86c3a3068c2dc 100644
+--- a/llvm-14.0.6.src/utils/TableGen/TableGenBackends.h
++++ b/llvm-14.0.6.src/utils/TableGen/TableGenBackends.h
+@@ -93,6 +93,7 @@ void EmitExegesis(RecordKeeper &RK, raw_ostream &OS);
+ void EmitAutomata(RecordKeeper &RK, raw_ostream &OS);
+ void EmitDirectivesDecl(RecordKeeper &RK, raw_ostream &OS);
+ void EmitDirectivesImpl(RecordKeeper &RK, raw_ostream &OS);
++void EmitSVMLVariants(RecordKeeper &RK, raw_ostream &OS);
+
+ } // End llvm namespace
+
+diff --git a/llvm-14.0.6.src/utils/vim/syntax/llvm.vim b/llvm-14.0.6.src/utils/vim/syntax/llvm.vim
+index 205db16b7d8cd..2572ab5a59e1b 100644
+--- a/llvm-14.0.6.src/utils/vim/syntax/llvm.vim
++++ b/llvm-14.0.6.src/utils/vim/syntax/llvm.vim
+@@ -104,6 +104,7 @@ syn keyword llvmKeyword
+ \ inreg
+ \ intel_ocl_bicc
+ \ inteldialect
++ \ intel_svmlcc
+ \ internal
+ \ jumptable
+ \ linkonce
Index: devel/py-llvmlite/patches/patch-ffi_Makefile.freebsd
===================================================================
RCS file: devel/py-llvmlite/patches/patch-ffi_Makefile.freebsd
diff -N devel/py-llvmlite/patches/patch-ffi_Makefile.freebsd
--- devel/py-llvmlite/patches/patch-ffi_Makefile.freebsd 14 Jan 2022 19:49:10 -0000 1.2
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,23 +0,0 @@
-$NetBSD: patch-ffi_Makefile.freebsd,v 1.2 2022/01/14 19:49:10 adam Exp $
-
-Add missing source code.
-Add -fPIC for linking.
-
---- ffi/Makefile.freebsd.orig 2021-03-25 14:26:22.000477300 +0000
-+++ ffi/Makefile.freebsd
-@@ -11,13 +11,13 @@ LIBS = $(LLVM_LIBS)
- INCLUDE = core.h
- SRC = assembly.cpp bitcode.cpp core.cpp initfini.cpp module.cpp value.cpp \
- executionengine.cpp transforms.cpp passmanagers.cpp targets.cpp dylib.cpp \
-- linker.cpp object_file.cpp
-+ linker.cpp object_file.cpp custom_passes.cpp
- OUTPUT = libllvmlite.so
-
- all: $(OUTPUT)
-
- $(OUTPUT): $(SRC) $(INCLUDE)
-- $(CXX) -shared $(CXXFLAGS) $(SRC) -o $(OUTPUT) $(LDFLAGS) $(LIBS)
-+ $(CXX) -shared $(CXXFLAGS) $(SRC) -o $(OUTPUT) $(LDFLAGS) $(LIBS) -fPIC
-
- clean:
- rm -rf test
Index: devel/py-llvmlite/patches/patch-ffi_Makefile.linux
===================================================================
RCS file: devel/py-llvmlite/patches/patch-ffi_Makefile.linux
diff -N devel/py-llvmlite/patches/patch-ffi_Makefile.linux
--- devel/py-llvmlite/patches/patch-ffi_Makefile.linux 19 Dec 2019 22:12:43 -0000 1.1
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,13 +0,0 @@
-$NetBSD: patch-ffi_Makefile.linux,v 1.1 2019/12/19 22:12:43 joerg Exp $
-
---- ffi/Makefile.linux.orig 2019-12-19 19:40:48.890888990 +0000
-+++ ffi/Makefile.linux
-@@ -19,7 +19,7 @@ all: $(OUTPUT)
- $(OUTPUT): $(SRC) $(INCLUDE)
- # static-libstdc++ avoids runtime dependencies on a
- # particular libstdc++ version.
-- $(CXX) $(CXX_STATIC_LINK) -shared $(CXXFLAGS) $(SRC) -o $(OUTPUT) $(LDFLAGS) $(LIBS)
-+ $(CXX) $(CXX_STATIC_LINK) -shared $(CXXFLAGS) $(SRC) -o $(OUTPUT) $(LDFLAGS) $(LIBS) -fPIC
-
- clean:
- rm -rf test $(OUTPUT)
Index: devel/py-llvmlite/patches/patch-ffi_targets.cpp
===================================================================
RCS file: devel/py-llvmlite/patches/patch-ffi_targets.cpp
diff -N devel/py-llvmlite/patches/patch-ffi_targets.cpp
--- devel/py-llvmlite/patches/patch-ffi_targets.cpp 14 Jan 2022 19:49:10 -0000 1.2
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,17 +0,0 @@
-$NetBSD: patch-ffi_targets.cpp,v 1.2 2022/01/14 19:49:10 adam Exp $
-
-Stopgap fix for llvm-12+
-https://github.com/numba/llvmlite/pull/802/files
-
---- ffi/targets.cpp.orig 2022-01-14 14:39:38.000000000 +0000
-+++ ffi/targets.cpp
-@@ -233,7 +233,9 @@ LLVMPY_CreateTargetMachine(LLVMTargetRef
- rm = Reloc::DynamicNoPIC;
-
- TargetOptions opt;
-+#if LLVM_VERSION_MAJOR < 12
- opt.PrintMachineCode = PrintMC;
-+#endif
- opt.MCOptions.ABIName = ABIName;
-
- bool jit = JIT;
Index: math/py-numba/Makefile
===================================================================
RCS file: /cvsroot/pkgsrc/math/py-numba/Makefile,v
retrieving revision 1.33
diff -u -r1.33 Makefile
--- math/py-numba/Makefile 1 Aug 2023 23:20:47 -0000 1.33
+++ math/py-numba/Makefile 19 Jan 2024 23:50:33 -0000
@@ -1,6 +1,6 @@
# $NetBSD: Makefile,v 1.33 2023/08/01 23:20:47 wiz Exp $
-DISTNAME= numba-0.55.2
+DISTNAME= numba-0.58.1
PKGNAME= ${PYPKGPREFIX}-${DISTNAME}
CATEGORIES= math python
MASTER_SITES= ${MASTER_SITE_PYPI:=n/numba/}
@@ -10,15 +10,16 @@
COMMENT= NumPy aware dynamic Python compiler using LLVM
LICENSE= 2-clause-bsd
-DEPENDS+= ${PYPKGPREFIX}-llvmlite>=0.38.0:../../devel/py-llvmlite
+DEPENDS+= ${PYPKGPREFIX}-llvmlite>=0.41.0:../../devel/py-llvmlite
DEPENDS+= ${PYPKGPREFIX}-setuptools-[0-9]*:../../devel/py-setuptools
# OpenMP is not portable
+# Really? We should fix that.
MAKE_ENV+= NUMBA_DISABLE_OPENMP=1
USE_LANGUAGES= c c++
-PYTHON_VERSIONS_INCOMPATIBLE= 27 38
+PYTHON_VERSIONS_INCOMPATIBLE= 27 312
USE_PKG_RESOURCES= yes
Index: math/py-numba/PLIST
===================================================================
RCS file: /cvsroot/pkgsrc/math/py-numba/PLIST,v
retrieving revision 1.17
diff -u -r1.17 PLIST
--- math/py-numba/PLIST 14 Jan 2022 19:52:24 -0000 1.17
+++ math/py-numba/PLIST 19 Jan 2024 23:50:33 -0000
@@ -1,6 +1,5 @@
-@comment $NetBSD: PLIST,v 1.17 2022/01/14 19:52:24 adam Exp $
+@comment $NetBSD$
bin/numba-${PYVERSSUFFIX}
-bin/pycc-${PYVERSSUFFIX}
${PYSITELIB}/${EGG_INFODIR}/PKG-INFO
${PYSITELIB}/${EGG_INFODIR}/SOURCES.txt
${PYSITELIB}/${EGG_INFODIR}/dependency_links.txt
@@ -19,17 +18,14 @@
${PYSITELIB}/numba/_dynfunc.c
${PYSITELIB}/numba/_dynfunc.so
${PYSITELIB}/numba/_dynfuncmod.c
-${PYSITELIB}/numba/_hashtable.c
${PYSITELIB}/numba/_hashtable.h
${PYSITELIB}/numba/_helperlib.c
${PYSITELIB}/numba/_helperlib.so
${PYSITELIB}/numba/_helpermod.c
${PYSITELIB}/numba/_lapack.c
-${PYSITELIB}/numba/_npymath_exports.c
${PYSITELIB}/numba/_numba_common.h
${PYSITELIB}/numba/_pymodule.h
${PYSITELIB}/numba/_random.c
-${PYSITELIB}/numba/_typeof.c
${PYSITELIB}/numba/_typeof.h
${PYSITELIB}/numba/_unicodetype_db.h
${PYSITELIB}/numba/_version.py
@@ -127,9 +123,6 @@
${PYSITELIB}/numba/core/cpu_options.py
${PYSITELIB}/numba/core/cpu_options.pyc
${PYSITELIB}/numba/core/cpu_options.pyo
-${PYSITELIB}/numba/core/dataflow.py
-${PYSITELIB}/numba/core/dataflow.pyc
-${PYSITELIB}/numba/core/dataflow.pyo
${PYSITELIB}/numba/core/datamodel/__init__.py
${PYSITELIB}/numba/core/datamodel/__init__.pyc
${PYSITELIB}/numba/core/datamodel/__init__.pyo
@@ -208,6 +201,9 @@
${PYSITELIB}/numba/core/itanium_mangler.py
${PYSITELIB}/numba/core/itanium_mangler.pyc
${PYSITELIB}/numba/core/itanium_mangler.pyo
+${PYSITELIB}/numba/core/llvm_bindings.py
+${PYSITELIB}/numba/core/llvm_bindings.pyc
+${PYSITELIB}/numba/core/llvm_bindings.pyo
${PYSITELIB}/numba/core/lowering.py
${PYSITELIB}/numba/core/lowering.pyc
${PYSITELIB}/numba/core/lowering.pyo
@@ -220,9 +216,6 @@
${PYSITELIB}/numba/core/options.py
${PYSITELIB}/numba/core/options.pyc
${PYSITELIB}/numba/core/options.pyo
-${PYSITELIB}/numba/core/overload_glue.py
-${PYSITELIB}/numba/core/overload_glue.pyc
-${PYSITELIB}/numba/core/overload_glue.pyo
${PYSITELIB}/numba/core/postproc.py
${PYSITELIB}/numba/core/postproc.pyc
${PYSITELIB}/numba/core/postproc.pyo
@@ -268,7 +261,7 @@
${PYSITELIB}/numba/core/runtime/context.py
${PYSITELIB}/numba/core/runtime/context.pyc
${PYSITELIB}/numba/core/runtime/context.pyo
-${PYSITELIB}/numba/core/runtime/nrt.c
+${PYSITELIB}/numba/core/runtime/nrt.cpp
${PYSITELIB}/numba/core/runtime/nrt.h
${PYSITELIB}/numba/core/runtime/nrt.py
${PYSITELIB}/numba/core/runtime/nrt.pyc
@@ -280,6 +273,24 @@
${PYSITELIB}/numba/core/runtime/nrtopt.py
${PYSITELIB}/numba/core/runtime/nrtopt.pyc
${PYSITELIB}/numba/core/runtime/nrtopt.pyo
+${PYSITELIB}/numba/core/rvsdg_frontend/__init__.py
+${PYSITELIB}/numba/core/rvsdg_frontend/__init__.pyc
+${PYSITELIB}/numba/core/rvsdg_frontend/__init__.pyo
+${PYSITELIB}/numba/core/rvsdg_frontend/bcinterp.py
+${PYSITELIB}/numba/core/rvsdg_frontend/bcinterp.pyc
+${PYSITELIB}/numba/core/rvsdg_frontend/bcinterp.pyo
+${PYSITELIB}/numba/core/rvsdg_frontend/rvsdg/__init__.py
+${PYSITELIB}/numba/core/rvsdg_frontend/rvsdg/__init__.pyc
+${PYSITELIB}/numba/core/rvsdg_frontend/rvsdg/__init__.pyo
+${PYSITELIB}/numba/core/rvsdg_frontend/rvsdg/bc2rvsdg.py
+${PYSITELIB}/numba/core/rvsdg_frontend/rvsdg/bc2rvsdg.pyc
+${PYSITELIB}/numba/core/rvsdg_frontend/rvsdg/bc2rvsdg.pyo
+${PYSITELIB}/numba/core/rvsdg_frontend/rvsdg/regionpasses.py
+${PYSITELIB}/numba/core/rvsdg_frontend/rvsdg/regionpasses.pyc
+${PYSITELIB}/numba/core/rvsdg_frontend/rvsdg/regionpasses.pyo
+${PYSITELIB}/numba/core/rvsdg_frontend/rvsdg/regionrenderer.py
+${PYSITELIB}/numba/core/rvsdg_frontend/rvsdg/regionrenderer.pyc
+${PYSITELIB}/numba/core/rvsdg_frontend/rvsdg/regionrenderer.pyo
${PYSITELIB}/numba/core/serialize.py
${PYSITELIB}/numba/core/serialize.pyc
${PYSITELIB}/numba/core/serialize.pyo
@@ -398,9 +409,6 @@
${PYSITELIB}/numba/core/typing/npydecl.py
${PYSITELIB}/numba/core/typing/npydecl.pyc
${PYSITELIB}/numba/core/typing/npydecl.pyo
-${PYSITELIB}/numba/core/typing/randomdecl.py
-${PYSITELIB}/numba/core/typing/randomdecl.pyc
-${PYSITELIB}/numba/core/typing/randomdecl.pyo
${PYSITELIB}/numba/core/typing/setdecl.py
${PYSITELIB}/numba/core/typing/setdecl.pyc
${PYSITELIB}/numba/core/typing/setdecl.pyo
@@ -518,6 +526,9 @@
${PYSITELIB}/numba/cuda/compiler.py
${PYSITELIB}/numba/cuda/compiler.pyc
${PYSITELIB}/numba/cuda/compiler.pyo
+${PYSITELIB}/numba/cuda/cpp_function_wrappers.cu
+${PYSITELIB}/numba/cuda/cuda_fp16.h
+${PYSITELIB}/numba/cuda/cuda_fp16.hpp
${PYSITELIB}/numba/cuda/cuda_paths.py
${PYSITELIB}/numba/cuda/cuda_paths.pyc
${PYSITELIB}/numba/cuda/cuda_paths.pyo
@@ -552,6 +563,9 @@
${PYSITELIB}/numba/cuda/cudadrv/ndarray.py
${PYSITELIB}/numba/cuda/cudadrv/ndarray.pyc
${PYSITELIB}/numba/cuda/cudadrv/ndarray.pyo
+${PYSITELIB}/numba/cuda/cudadrv/nvrtc.py
+${PYSITELIB}/numba/cuda/cudadrv/nvrtc.pyc
+${PYSITELIB}/numba/cuda/cudadrv/nvrtc.pyo
${PYSITELIB}/numba/cuda/cudadrv/nvvm.py
${PYSITELIB}/numba/cuda/cudadrv/nvvm.pyc
${PYSITELIB}/numba/cuda/cudadrv/nvvm.pyo
@@ -582,12 +596,18 @@
${PYSITELIB}/numba/cuda/errors.py
${PYSITELIB}/numba/cuda/errors.pyc
${PYSITELIB}/numba/cuda/errors.pyo
+${PYSITELIB}/numba/cuda/extending.py
+${PYSITELIB}/numba/cuda/extending.pyc
+${PYSITELIB}/numba/cuda/extending.pyo
${PYSITELIB}/numba/cuda/initialize.py
${PYSITELIB}/numba/cuda/initialize.pyc
${PYSITELIB}/numba/cuda/initialize.pyo
${PYSITELIB}/numba/cuda/intrinsic_wrapper.py
${PYSITELIB}/numba/cuda/intrinsic_wrapper.pyc
${PYSITELIB}/numba/cuda/intrinsic_wrapper.pyo
+${PYSITELIB}/numba/cuda/intrinsics.py
+${PYSITELIB}/numba/cuda/intrinsics.pyc
+${PYSITELIB}/numba/cuda/intrinsics.pyo
${PYSITELIB}/numba/cuda/kernels/__init__.py
${PYSITELIB}/numba/cuda/kernels/__init__.pyc
${PYSITELIB}/numba/cuda/kernels/__init__.pyo
@@ -669,6 +689,9 @@
${PYSITELIB}/numba/cuda/simulator/reduction.py
${PYSITELIB}/numba/cuda/simulator/reduction.pyc
${PYSITELIB}/numba/cuda/simulator/reduction.pyo
+${PYSITELIB}/numba/cuda/simulator/vector_types.py
+${PYSITELIB}/numba/cuda/simulator/vector_types.pyc
+${PYSITELIB}/numba/cuda/simulator/vector_types.pyo
${PYSITELIB}/numba/cuda/simulator_init.py
${PYSITELIB}/numba/cuda/simulator_init.pyc
${PYSITELIB}/numba/cuda/simulator_init.pyo
@@ -687,10 +710,6 @@
${PYSITELIB}/numba/cuda/tests/cudadrv/__init__.py
${PYSITELIB}/numba/cuda/tests/cudadrv/__init__.pyc
${PYSITELIB}/numba/cuda/tests/cudadrv/__init__.pyo
-${PYSITELIB}/numba/cuda/tests/cudadrv/data/__init__.py
-${PYSITELIB}/numba/cuda/tests/cudadrv/data/__init__.pyc
-${PYSITELIB}/numba/cuda/tests/cudadrv/data/__init__.pyo
-${PYSITELIB}/numba/cuda/tests/cudadrv/data/jitlink.ptx
${PYSITELIB}/numba/cuda/tests/cudadrv/test_array_attr.py
${PYSITELIB}/numba/cuda/tests/cudadrv/test_array_attr.pyc
${PYSITELIB}/numba/cuda/tests/cudadrv/test_array_attr.pyo
@@ -739,15 +758,18 @@
${PYSITELIB}/numba/cuda/tests/cudadrv/test_inline_ptx.py
${PYSITELIB}/numba/cuda/tests/cudadrv/test_inline_ptx.pyc
${PYSITELIB}/numba/cuda/tests/cudadrv/test_inline_ptx.pyo
-${PYSITELIB}/numba/cuda/tests/cudadrv/test_ir_patch.py
-${PYSITELIB}/numba/cuda/tests/cudadrv/test_ir_patch.pyc
-${PYSITELIB}/numba/cuda/tests/cudadrv/test_ir_patch.pyo
+${PYSITELIB}/numba/cuda/tests/cudadrv/test_is_fp16.py
+${PYSITELIB}/numba/cuda/tests/cudadrv/test_is_fp16.pyc
+${PYSITELIB}/numba/cuda/tests/cudadrv/test_is_fp16.pyo
${PYSITELIB}/numba/cuda/tests/cudadrv/test_linker.py
${PYSITELIB}/numba/cuda/tests/cudadrv/test_linker.pyc
${PYSITELIB}/numba/cuda/tests/cudadrv/test_linker.pyo
${PYSITELIB}/numba/cuda/tests/cudadrv/test_managed_alloc.py
${PYSITELIB}/numba/cuda/tests/cudadrv/test_managed_alloc.pyc
${PYSITELIB}/numba/cuda/tests/cudadrv/test_managed_alloc.pyo
+${PYSITELIB}/numba/cuda/tests/cudadrv/test_mvc.py
+${PYSITELIB}/numba/cuda/tests/cudadrv/test_mvc.pyc
+${PYSITELIB}/numba/cuda/tests/cudadrv/test_mvc.pyo
${PYSITELIB}/numba/cuda/tests/cudadrv/test_nvvm_driver.py
${PYSITELIB}/numba/cuda/tests/cudadrv/test_nvvm_driver.pyc
${PYSITELIB}/numba/cuda/tests/cudadrv/test_nvvm_driver.pyo
@@ -775,6 +797,18 @@
${PYSITELIB}/numba/cuda/tests/cudapy/__init__.py
${PYSITELIB}/numba/cuda/tests/cudapy/__init__.pyc
${PYSITELIB}/numba/cuda/tests/cudapy/__init__.pyo
+${PYSITELIB}/numba/cuda/tests/cudapy/cache_usecases.py
+${PYSITELIB}/numba/cuda/tests/cudapy/cache_usecases.pyc
+${PYSITELIB}/numba/cuda/tests/cudapy/cache_usecases.pyo
+${PYSITELIB}/numba/cuda/tests/cudapy/cache_with_cpu_usecases.py
+${PYSITELIB}/numba/cuda/tests/cudapy/cache_with_cpu_usecases.pyc
+${PYSITELIB}/numba/cuda/tests/cudapy/cache_with_cpu_usecases.pyo
+${PYSITELIB}/numba/cuda/tests/cudapy/extensions_usecases.py
+${PYSITELIB}/numba/cuda/tests/cudapy/extensions_usecases.pyc
+${PYSITELIB}/numba/cuda/tests/cudapy/extensions_usecases.pyo
+${PYSITELIB}/numba/cuda/tests/cudapy/recursion_usecases.py
+${PYSITELIB}/numba/cuda/tests/cudapy/recursion_usecases.pyc
+${PYSITELIB}/numba/cuda/tests/cudapy/recursion_usecases.pyo
${PYSITELIB}/numba/cuda/tests/cudapy/test_alignment.py
${PYSITELIB}/numba/cuda/tests/cudapy/test_alignment.pyc
${PYSITELIB}/numba/cuda/tests/cudapy/test_alignment.pyo
@@ -796,9 +830,15 @@
${PYSITELIB}/numba/cuda/tests/cudapy/test_boolean.py
${PYSITELIB}/numba/cuda/tests/cudapy/test_boolean.pyc
${PYSITELIB}/numba/cuda/tests/cudapy/test_boolean.pyo
+${PYSITELIB}/numba/cuda/tests/cudapy/test_caching.py
+${PYSITELIB}/numba/cuda/tests/cudapy/test_caching.pyc
+${PYSITELIB}/numba/cuda/tests/cudapy/test_caching.pyo
${PYSITELIB}/numba/cuda/tests/cudapy/test_casting.py
${PYSITELIB}/numba/cuda/tests/cudapy/test_casting.pyc
${PYSITELIB}/numba/cuda/tests/cudapy/test_casting.pyo
+${PYSITELIB}/numba/cuda/tests/cudapy/test_cffi.py
+${PYSITELIB}/numba/cuda/tests/cudapy/test_cffi.pyc
+${PYSITELIB}/numba/cuda/tests/cudapy/test_cffi.pyo
${PYSITELIB}/numba/cuda/tests/cudapy/test_compiler.py
${PYSITELIB}/numba/cuda/tests/cudapy/test_compiler.pyc
${PYSITELIB}/numba/cuda/tests/cudapy/test_compiler.pyo
@@ -838,6 +878,9 @@
${PYSITELIB}/numba/cuda/tests/cudapy/test_dispatcher.py
${PYSITELIB}/numba/cuda/tests/cudapy/test_dispatcher.pyc
${PYSITELIB}/numba/cuda/tests/cudapy/test_dispatcher.pyo
+${PYSITELIB}/numba/cuda/tests/cudapy/test_enums.py
+${PYSITELIB}/numba/cuda/tests/cudapy/test_enums.pyc
+${PYSITELIB}/numba/cuda/tests/cudapy/test_enums.pyo
${PYSITELIB}/numba/cuda/tests/cudapy/test_errors.py
${PYSITELIB}/numba/cuda/tests/cudapy/test_errors.pyc
${PYSITELIB}/numba/cuda/tests/cudapy/test_errors.pyo
@@ -952,6 +995,9 @@
${PYSITELIB}/numba/cuda/tests/cudapy/test_record_dtype.py
${PYSITELIB}/numba/cuda/tests/cudapy/test_record_dtype.pyc
${PYSITELIB}/numba/cuda/tests/cudapy/test_record_dtype.pyo
+${PYSITELIB}/numba/cuda/tests/cudapy/test_recursion.py
+${PYSITELIB}/numba/cuda/tests/cudapy/test_recursion.pyc
+${PYSITELIB}/numba/cuda/tests/cudapy/test_recursion.pyo
${PYSITELIB}/numba/cuda/tests/cudapy/test_reduction.py
${PYSITELIB}/numba/cuda/tests/cudapy/test_reduction.pyc
${PYSITELIB}/numba/cuda/tests/cudapy/test_reduction.pyo
@@ -976,9 +1022,15 @@
${PYSITELIB}/numba/cuda/tests/cudapy/test_transpose.py
${PYSITELIB}/numba/cuda/tests/cudapy/test_transpose.pyc
${PYSITELIB}/numba/cuda/tests/cudapy/test_transpose.pyo
+${PYSITELIB}/numba/cuda/tests/cudapy/test_ufuncs.py
+${PYSITELIB}/numba/cuda/tests/cudapy/test_ufuncs.pyc
+${PYSITELIB}/numba/cuda/tests/cudapy/test_ufuncs.pyo
${PYSITELIB}/numba/cuda/tests/cudapy/test_userexc.py
${PYSITELIB}/numba/cuda/tests/cudapy/test_userexc.pyc
${PYSITELIB}/numba/cuda/tests/cudapy/test_userexc.pyo
+${PYSITELIB}/numba/cuda/tests/cudapy/test_vector_type.py
+${PYSITELIB}/numba/cuda/tests/cudapy/test_vector_type.pyc
+${PYSITELIB}/numba/cuda/tests/cudapy/test_vector_type.pyo
${PYSITELIB}/numba/cuda/tests/cudapy/test_vectorize.py
${PYSITELIB}/numba/cuda/tests/cudapy/test_vectorize.pyc
${PYSITELIB}/numba/cuda/tests/cudapy/test_vectorize.pyo
@@ -1009,21 +1061,60 @@
${PYSITELIB}/numba/cuda/tests/cudasim/test_cudasim_issues.py
${PYSITELIB}/numba/cuda/tests/cudasim/test_cudasim_issues.pyc
${PYSITELIB}/numba/cuda/tests/cudasim/test_cudasim_issues.pyo
+${PYSITELIB}/numba/cuda/tests/data/__init__.py
+${PYSITELIB}/numba/cuda/tests/data/__init__.pyc
+${PYSITELIB}/numba/cuda/tests/data/__init__.pyo
+${PYSITELIB}/numba/cuda/tests/data/cuda_include.cu
+${PYSITELIB}/numba/cuda/tests/data/error.cu
+${PYSITELIB}/numba/cuda/tests/data/jitlink.cu
+${PYSITELIB}/numba/cuda/tests/data/jitlink.ptx
+${PYSITELIB}/numba/cuda/tests/data/warn.cu
${PYSITELIB}/numba/cuda/tests/doc_examples/__init__.py
${PYSITELIB}/numba/cuda/tests/doc_examples/__init__.pyc
${PYSITELIB}/numba/cuda/tests/doc_examples/__init__.pyo
+${PYSITELIB}/numba/cuda/tests/doc_examples/ffi/__init__.py
+${PYSITELIB}/numba/cuda/tests/doc_examples/ffi/__init__.pyc
+${PYSITELIB}/numba/cuda/tests/doc_examples/ffi/__init__.pyo
+${PYSITELIB}/numba/cuda/tests/doc_examples/ffi/functions.cu
${PYSITELIB}/numba/cuda/tests/doc_examples/test_cg.py
${PYSITELIB}/numba/cuda/tests/doc_examples/test_cg.pyc
${PYSITELIB}/numba/cuda/tests/doc_examples/test_cg.pyo
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_cpu_gpu_compat.py
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_cpu_gpu_compat.pyc
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_cpu_gpu_compat.pyo
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_ffi.py
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_ffi.pyc
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_ffi.pyo
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_laplace.py
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_laplace.pyc
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_laplace.pyo
${PYSITELIB}/numba/cuda/tests/doc_examples/test_matmul.py
${PYSITELIB}/numba/cuda/tests/doc_examples/test_matmul.pyc
${PYSITELIB}/numba/cuda/tests/doc_examples/test_matmul.pyo
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_montecarlo.py
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_montecarlo.pyc
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_montecarlo.pyo
${PYSITELIB}/numba/cuda/tests/doc_examples/test_random.py
${PYSITELIB}/numba/cuda/tests/doc_examples/test_random.pyc
${PYSITELIB}/numba/cuda/tests/doc_examples/test_random.pyo
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_reduction.py
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_reduction.pyc
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_reduction.pyo
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_sessionize.py
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_sessionize.pyc
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_sessionize.pyo
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_ufunc.py
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_ufunc.pyc
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_ufunc.pyo
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_vecadd.py
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_vecadd.pyc
+${PYSITELIB}/numba/cuda/tests/doc_examples/test_vecadd.pyo
${PYSITELIB}/numba/cuda/tests/nocuda/__init__.py
${PYSITELIB}/numba/cuda/tests/nocuda/__init__.pyc
${PYSITELIB}/numba/cuda/tests/nocuda/__init__.pyo
+${PYSITELIB}/numba/cuda/tests/nocuda/test_function_resolution.py
+${PYSITELIB}/numba/cuda/tests/nocuda/test_function_resolution.pyc
+${PYSITELIB}/numba/cuda/tests/nocuda/test_function_resolution.pyo
${PYSITELIB}/numba/cuda/tests/nocuda/test_import.py
${PYSITELIB}/numba/cuda/tests/nocuda/test_import.pyc
${PYSITELIB}/numba/cuda/tests/nocuda/test_import.pyo
@@ -1036,6 +1127,12 @@
${PYSITELIB}/numba/cuda/types.py
${PYSITELIB}/numba/cuda/types.pyc
${PYSITELIB}/numba/cuda/types.pyo
+${PYSITELIB}/numba/cuda/ufuncs.py
+${PYSITELIB}/numba/cuda/ufuncs.pyc
+${PYSITELIB}/numba/cuda/ufuncs.pyo
+${PYSITELIB}/numba/cuda/vector_types.py
+${PYSITELIB}/numba/cuda/vector_types.pyc
+${PYSITELIB}/numba/cuda/vector_types.pyo
${PYSITELIB}/numba/cuda/vectorizers.py
${PYSITELIB}/numba/cuda/vectorizers.pyc
${PYSITELIB}/numba/cuda/vectorizers.pyo
@@ -1058,6 +1155,9 @@
${PYSITELIB}/numba/experimental/jitclass/decorators.py
${PYSITELIB}/numba/experimental/jitclass/decorators.pyc
${PYSITELIB}/numba/experimental/jitclass/decorators.pyo
+${PYSITELIB}/numba/experimental/jitclass/overloads.py
+${PYSITELIB}/numba/experimental/jitclass/overloads.pyc
+${PYSITELIB}/numba/experimental/jitclass/overloads.pyo
${PYSITELIB}/numba/experimental/structref.py
${PYSITELIB}/numba/experimental/structref.pyc
${PYSITELIB}/numba/experimental/structref.pyo
@@ -1084,9 +1184,15 @@
${PYSITELIB}/numba/misc/findlib.py
${PYSITELIB}/numba/misc/findlib.pyc
${PYSITELIB}/numba/misc/findlib.pyo
+${PYSITELIB}/numba/misc/firstlinefinder.py
+${PYSITELIB}/numba/misc/firstlinefinder.pyc
+${PYSITELIB}/numba/misc/firstlinefinder.pyo
${PYSITELIB}/numba/misc/gdb_hook.py
${PYSITELIB}/numba/misc/gdb_hook.pyc
${PYSITELIB}/numba/misc/gdb_hook.pyo
+${PYSITELIB}/numba/misc/gdb_print_extension.py
+${PYSITELIB}/numba/misc/gdb_print_extension.pyc
+${PYSITELIB}/numba/misc/gdb_print_extension.pyo
${PYSITELIB}/numba/misc/help/__init__.py
${PYSITELIB}/numba/misc/help/__init__.pyc
${PYSITELIB}/numba/misc/help/__init__.pyo
@@ -1111,6 +1217,9 @@
${PYSITELIB}/numba/misc/numba_entry.py
${PYSITELIB}/numba/misc/numba_entry.pyc
${PYSITELIB}/numba/misc/numba_entry.pyo
+${PYSITELIB}/numba/misc/numba_gdbinfo.py
+${PYSITELIB}/numba/misc/numba_gdbinfo.pyc
+${PYSITELIB}/numba/misc/numba_gdbinfo.pyo
${PYSITELIB}/numba/misc/numba_sysinfo.py
${PYSITELIB}/numba/misc/numba_sysinfo.pyc
${PYSITELIB}/numba/misc/numba_sysinfo.pyo
@@ -1158,6 +1267,24 @@
${PYSITELIB}/numba/np/polynomial.py
${PYSITELIB}/numba/np/polynomial.pyc
${PYSITELIB}/numba/np/polynomial.pyo
+${PYSITELIB}/numba/np/random/__init__.py
+${PYSITELIB}/numba/np/random/__init__.pyc
+${PYSITELIB}/numba/np/random/__init__.pyo
+${PYSITELIB}/numba/np/random/_constants.py
+${PYSITELIB}/numba/np/random/_constants.pyc
+${PYSITELIB}/numba/np/random/_constants.pyo
+${PYSITELIB}/numba/np/random/distributions.py
+${PYSITELIB}/numba/np/random/distributions.pyc
+${PYSITELIB}/numba/np/random/distributions.pyo
+${PYSITELIB}/numba/np/random/generator_core.py
+${PYSITELIB}/numba/np/random/generator_core.pyc
+${PYSITELIB}/numba/np/random/generator_core.pyo
+${PYSITELIB}/numba/np/random/generator_methods.py
+${PYSITELIB}/numba/np/random/generator_methods.pyc
+${PYSITELIB}/numba/np/random/generator_methods.pyo
+${PYSITELIB}/numba/np/random/random_methods.py
+${PYSITELIB}/numba/np/random/random_methods.pyc
+${PYSITELIB}/numba/np/random/random_methods.pyo
${PYSITELIB}/numba/np/ufunc/__init__.py
${PYSITELIB}/numba/np/ufunc/__init__.pyc
${PYSITELIB}/numba/np/ufunc/__init__.pyo
@@ -1274,8 +1401,8 @@
${PYSITELIB}/numba/tests/__init__.pyc
${PYSITELIB}/numba/tests/__init__.pyo
${PYSITELIB}/numba/tests/annotation_usecases.py
-${PLIST.py3x}${PYSITELIB}/numba/tests/annotation_usecases.pyc
-${PLIST.py3x}${PYSITELIB}/numba/tests/annotation_usecases.pyo
+${PYSITELIB}/numba/tests/annotation_usecases.pyc
+${PYSITELIB}/numba/tests/annotation_usecases.pyo
${PYSITELIB}/numba/tests/cache_usecases.py
${PYSITELIB}/numba/tests/cache_usecases.pyc
${PYSITELIB}/numba/tests/cache_usecases.pyo
@@ -1303,6 +1430,9 @@
${PYSITELIB}/numba/tests/doc_examples/test_examples.py
${PYSITELIB}/numba/tests/doc_examples/test_examples.pyc
${PYSITELIB}/numba/tests/doc_examples/test_examples.pyo
+${PYSITELIB}/numba/tests/doc_examples/test_interval_example.py
+${PYSITELIB}/numba/tests/doc_examples/test_interval_example.pyc
+${PYSITELIB}/numba/tests/doc_examples/test_interval_example.pyo
${PYSITELIB}/numba/tests/doc_examples/test_jitclass.py
${PYSITELIB}/numba/tests/doc_examples/test_jitclass.pyc
${PYSITELIB}/numba/tests/doc_examples/test_jitclass.pyo
@@ -1315,6 +1445,12 @@
${PYSITELIB}/numba/tests/doc_examples/test_llvm_pass_timings.py
${PYSITELIB}/numba/tests/doc_examples/test_llvm_pass_timings.pyc
${PYSITELIB}/numba/tests/doc_examples/test_llvm_pass_timings.pyo
+${PYSITELIB}/numba/tests/doc_examples/test_numpy_generators.py
+${PYSITELIB}/numba/tests/doc_examples/test_numpy_generators.pyc
+${PYSITELIB}/numba/tests/doc_examples/test_numpy_generators.pyo
+${PYSITELIB}/numba/tests/doc_examples/test_parallel_chunksize.py
+${PYSITELIB}/numba/tests/doc_examples/test_parallel_chunksize.pyc
+${PYSITELIB}/numba/tests/doc_examples/test_parallel_chunksize.pyo
${PYSITELIB}/numba/tests/doc_examples/test_rec_array.py
${PYSITELIB}/numba/tests/doc_examples/test_rec_array.pyc
${PYSITELIB}/numba/tests/doc_examples/test_rec_array.pyo
@@ -1327,6 +1463,9 @@
${PYSITELIB}/numba/tests/doc_examples/test_typed_list_usage.py
${PYSITELIB}/numba/tests/doc_examples/test_typed_list_usage.pyc
${PYSITELIB}/numba/tests/doc_examples/test_typed_list_usage.pyo
+${PYSITELIB}/numba/tests/doctest_usecase.py
+${PYSITELIB}/numba/tests/doctest_usecase.pyc
+${PYSITELIB}/numba/tests/doctest_usecase.pyo
${PYSITELIB}/numba/tests/dummy_module.py
${PYSITELIB}/numba/tests/dummy_module.pyc
${PYSITELIB}/numba/tests/dummy_module.pyo
@@ -1348,9 +1487,15 @@
${PYSITELIB}/numba/tests/gdb/test_break_on_symbol.py
${PYSITELIB}/numba/tests/gdb/test_break_on_symbol.pyc
${PYSITELIB}/numba/tests/gdb/test_break_on_symbol.pyo
+${PYSITELIB}/numba/tests/gdb/test_break_on_symbol_version.py
+${PYSITELIB}/numba/tests/gdb/test_break_on_symbol_version.pyc
+${PYSITELIB}/numba/tests/gdb/test_break_on_symbol_version.pyo
${PYSITELIB}/numba/tests/gdb/test_conditional_breakpoint.py
${PYSITELIB}/numba/tests/gdb/test_conditional_breakpoint.pyc
${PYSITELIB}/numba/tests/gdb/test_conditional_breakpoint.pyo
+${PYSITELIB}/numba/tests/gdb/test_pretty_print.py
+${PYSITELIB}/numba/tests/gdb/test_pretty_print.pyc
+${PYSITELIB}/numba/tests/gdb/test_pretty_print.pyo
${PYSITELIB}/numba/tests/gdb_support.py
${PYSITELIB}/numba/tests/gdb_support.pyc
${PYSITELIB}/numba/tests/gdb_support.pyo
@@ -1393,6 +1538,9 @@
${PYSITELIB}/numba/tests/npyufunc/test_ufuncbuilding.py
${PYSITELIB}/numba/tests/npyufunc/test_ufuncbuilding.pyc
${PYSITELIB}/numba/tests/npyufunc/test_ufuncbuilding.pyo
+${PYSITELIB}/numba/tests/npyufunc/test_update_inplace.py
+${PYSITELIB}/numba/tests/npyufunc/test_update_inplace.pyc
+${PYSITELIB}/numba/tests/npyufunc/test_update_inplace.pyo
${PYSITELIB}/numba/tests/npyufunc/test_vectorize_decor.py
${PYSITELIB}/numba/tests/npyufunc/test_vectorize_decor.pyc
${PYSITELIB}/numba/tests/npyufunc/test_vectorize_decor.pyo
@@ -1522,6 +1670,9 @@
${PYSITELIB}/numba/tests/test_chained_assign.py
${PYSITELIB}/numba/tests/test_chained_assign.pyc
${PYSITELIB}/numba/tests/test_chained_assign.pyo
+${PYSITELIB}/numba/tests/test_chrome_trace.py
+${PYSITELIB}/numba/tests/test_chrome_trace.pyc
+${PYSITELIB}/numba/tests/test_chrome_trace.pyo
${PYSITELIB}/numba/tests/test_cli.py
${PYSITELIB}/numba/tests/test_cli.pyc
${PYSITELIB}/numba/tests/test_cli.pyo
@@ -1588,6 +1739,9 @@
${PYSITELIB}/numba/tests/test_dispatcher.py
${PYSITELIB}/numba/tests/test_dispatcher.pyc
${PYSITELIB}/numba/tests/test_dispatcher.pyo
+${PYSITELIB}/numba/tests/test_doctest.py
+${PYSITELIB}/numba/tests/test_doctest.pyc
+${PYSITELIB}/numba/tests/test_doctest.pyo
${PYSITELIB}/numba/tests/test_dummyarray.py
${PYSITELIB}/numba/tests/test_dummyarray.pyc
${PYSITELIB}/numba/tests/test_dummyarray.pyo
@@ -1630,6 +1784,12 @@
${PYSITELIB}/numba/tests/test_fastmath.py
${PYSITELIB}/numba/tests/test_fastmath.pyc
${PYSITELIB}/numba/tests/test_fastmath.pyo
+${PYSITELIB}/numba/tests/test_findlib.py
+${PYSITELIB}/numba/tests/test_findlib.pyc
+${PYSITELIB}/numba/tests/test_findlib.pyo
+${PYSITELIB}/numba/tests/test_firstlinefinder.py
+${PYSITELIB}/numba/tests/test_firstlinefinder.pyc
+${PYSITELIB}/numba/tests/test_firstlinefinder.pyo
${PYSITELIB}/numba/tests/test_flow_control.py
${PYSITELIB}/numba/tests/test_flow_control.pyc
${PYSITELIB}/numba/tests/test_flow_control.pyo
@@ -1654,6 +1814,9 @@
${PYSITELIB}/numba/tests/test_generators.py
${PYSITELIB}/numba/tests/test_generators.pyc
${PYSITELIB}/numba/tests/test_generators.pyo
+${PYSITELIB}/numba/tests/test_getitem_on_types.py
+${PYSITELIB}/numba/tests/test_getitem_on_types.pyc
+${PYSITELIB}/numba/tests/test_getitem_on_types.pyo
${PYSITELIB}/numba/tests/test_gil.py
${PYSITELIB}/numba/tests/test_gil.pyc
${PYSITELIB}/numba/tests/test_gil.pyo
@@ -1681,6 +1844,9 @@
${PYSITELIB}/numba/tests/test_inlining.py
${PYSITELIB}/numba/tests/test_inlining.pyc
${PYSITELIB}/numba/tests/test_inlining.pyo
+${PYSITELIB}/numba/tests/test_interpreter.py
+${PYSITELIB}/numba/tests/test_interpreter.pyc
+${PYSITELIB}/numba/tests/test_interpreter.pyo
${PYSITELIB}/numba/tests/test_interproc.py
${PYSITELIB}/numba/tests/test_interproc.pyc
${PYSITELIB}/numba/tests/test_interproc.pyo
@@ -1777,6 +1943,9 @@
${PYSITELIB}/numba/tests/test_np_functions.py
${PYSITELIB}/numba/tests/test_np_functions.pyc
${PYSITELIB}/numba/tests/test_np_functions.pyo
+${PYSITELIB}/numba/tests/test_np_randomgen.py
+${PYSITELIB}/numba/tests/test_np_randomgen.pyc
+${PYSITELIB}/numba/tests/test_np_randomgen.pyo
${PYSITELIB}/numba/tests/test_npdatetime.py
${PYSITELIB}/numba/tests/test_npdatetime.pyc
${PYSITELIB}/numba/tests/test_npdatetime.pyo
@@ -1855,6 +2024,9 @@
${PYSITELIB}/numba/tests/test_python_int.py
${PYSITELIB}/numba/tests/test_python_int.pyc
${PYSITELIB}/numba/tests/test_python_int.pyo
+${PYSITELIB}/numba/tests/test_pythonapi.py
+${PYSITELIB}/numba/tests/test_pythonapi.pyc
+${PYSITELIB}/numba/tests/test_pythonapi.pyo
${PYSITELIB}/numba/tests/test_random.py
${PYSITELIB}/numba/tests/test_random.pyc
${PYSITELIB}/numba/tests/test_random.pyo
@@ -1876,6 +2048,9 @@
${PYSITELIB}/numba/tests/test_remove_dead.py
${PYSITELIB}/numba/tests/test_remove_dead.pyc
${PYSITELIB}/numba/tests/test_remove_dead.pyo
+${PYSITELIB}/numba/tests/test_repr.py
+${PYSITELIB}/numba/tests/test_repr.pyc
+${PYSITELIB}/numba/tests/test_repr.pyo
${PYSITELIB}/numba/tests/test_retargeting.py
${PYSITELIB}/numba/tests/test_retargeting.pyc
${PYSITELIB}/numba/tests/test_retargeting.pyo
@@ -1981,6 +2156,9 @@
${PYSITELIB}/numba/tests/test_unpack_sequence.py
${PYSITELIB}/numba/tests/test_unpack_sequence.pyc
${PYSITELIB}/numba/tests/test_unpack_sequence.pyo
+${PYSITELIB}/numba/tests/test_unpickle_without_module.py
+${PYSITELIB}/numba/tests/test_unpickle_without_module.pyc
+${PYSITELIB}/numba/tests/test_unpickle_without_module.pyo
${PYSITELIB}/numba/tests/test_unsafe_intrinsics.py
${PYSITELIB}/numba/tests/test_unsafe_intrinsics.pyc
${PYSITELIB}/numba/tests/test_unsafe_intrinsics.pyo
Index: math/py-numba/distinfo
===================================================================
RCS file: /cvsroot/pkgsrc/math/py-numba/distinfo,v
retrieving revision 1.25
diff -u -r1.25 distinfo
--- math/py-numba/distinfo 27 May 2022 08:29:57 -0000 1.25
+++ math/py-numba/distinfo 19 Jan 2024 23:50:33 -0000
@@ -1,6 +1,6 @@
$NetBSD: distinfo,v 1.25 2022/05/27 08:29:57 adam Exp $
-BLAKE2s (numba-0.55.2.tar.gz) = eada4a93b87c339bff047f01bc58907c3e6e20def1c58be60fa0175d5c99d34c
-SHA512 (numba-0.55.2.tar.gz) = 713e5e4d3e3af40a89481ec3faf39b078a3fc7f2af8df7d150a9d0acd9a857511c35e4b9433682834cd6dd8d3f39492e750320d3cf21d259f5db619316d8c937
-Size (numba-0.55.2.tar.gz) = 2302683 bytes
+BLAKE2s (numba-0.58.1.tar.gz) = 8aa4d16a0c8cf3d4d6fa2cf9c6c53f2dee486a00cc53d3207bb6f6f8174df8cf
+SHA512 (numba-0.58.1.tar.gz) = 5cce612dc385326a7322d88be02f1084a3b9c38e1af729af95d7ebcbae3650904bbd23ddff105958eebc07e45628eb422a6b2dcec3c64be09ccf2903753ba225
+Size (numba-0.58.1.tar.gz) = 2623830 bytes
SHA1 (patch-numba_np_ufunc_workqueue.c) = 139f9685a6a5b9fcf857b49dc0f3555e0e361b54