pkgsrc-WIP-changes archive

apache-arrow: Update to 18.0.0



Module Name:	pkgsrc-wip
Committed By:	Matthew Danielson <matthewd%fastmail.us@localhost>
Pushed By:	matthewd
Date:		Sun Nov 10 14:08:32 2024 -0800
Changeset:	4040783f41bbe97e3b6bb21ad2e9733a61044d10

Modified Files:
	apache-arrow/Makefile
	apache-arrow/PLIST
	apache-arrow/distinfo
	apache-arrow/version.mk

Log Message:
apache-arrow: Update to 18.0.0

Apache Arrow 18.0.0 (2024-10-28 07:00:00+00:00)
Bug Fixes

    GH-36295 - [C++] data corruption when using `group_by` and `aggregate` on large data sets
    GH-39789 - [Go][Parquet] Close current row group when finished writing unbuffered batch (#43326)
    GH-40557 - [C++] Use PutObject request for S3 in OutputStream when only uploading small data (#41564)
    GH-41396 - [Ruby] Add workaround for re2.pc on Ubuntu 20.04 (#43721)
    GH-41481 - [CI] Update how extra environment variables are specified for the integration test docker job (#42009)
    GH-41696 - [Python][Packaging] Bump MACOSX_DEPLOYMENT_TARGET to 12 instead of 11 (#43137)
    GH-41891 - [C++] Clean up implicit fallthrough warnings (#41892)
    GH-41993 - [Go] IPC writer shift voffsets when offsets array does not start from zero (#43176)
    GH-42240 - [R] Fix crash in ParquetFileWriter$WriteTable and add WriteBatch (#42241)
    GH-43046 - [C++] Fix avx2 gather rows more than 2^31 issue in CompareColumnsToRows (#43065)
    GH-43130 - [C++][ArrowFlight] Crash due to UCS thread mode
    GH-43150 - [Docs] Correction needed in pyarrow.compute.microsecond
    GH-43152 - [Release] Require “digest/sha1” explicitly for thread safety (#43154)
    GH-43153 - [R] pull on a grouped query returns the wrong column (#43172)
    GH-43163 - [R] Fix bindings in Math group generics (#43162)
    GH-43167 - [C++] Add workaround for missing Boost dependency of Thrift (#43328)
    GH-43175 - [C++] Skip not Emscripten ready tests in CSV tests (#43724)
    GH-43183 - [C++] Add date{32,64} to date{32,64} cast (#43192)
    GH-43186 - [Go] Use auto-aligned atomic int64 for pqarrow pathbuilders (#43206)
    GH-43194 - [R] R_existsVarInFrame isn’t available earlier than R 4.2 (#43243)
    GH-43202 - [C++][Compute] Detect and explicit error for offset overflow in row table (#43226)
    GH-43211 - [C++] Fix decimal benchmarks to avoid out-of-bounds accesses (#43212)
    GH-43217 - [Java] Remove flight-core shaded jars (#43224)
    GH-43218 - [C++] Resolve Abseil like any other dependency in the build system (#43219)
    GH-43221 - [C++][Parquet] Refactor parquet::encryption::AesEncryptor to use unique_ptr (#43222)
    GH-43228 - [C++] Fix Abseil compile error on GCC 13 (#43157)
    GH-43232 - [Release][Packaging][Python] Add tzdata as conda env requirement to avoid ORC failure (#43233)
    GH-43245 - [Packaging][deb] Add missing libabsl-dev dependency (#43246)
    GH-43267 - [C#] Correctly import sliced arrays through the C Data interface (#44117)
    GH-43270 - [Release] Fix input variables on post-01-tag.sh (#43271)
    GH-43276 - [Go][Parquet] Make DeltaBitPacking Encoders/Decoders Generic (#43279)
    GH-43282 - [Release][Docs][Packaging] Upload correct docs job when uploading binaries (#43283)
    GH-43284 - [Release] Fix version detection timing for bump deb package names on post-12-bump-versions.sh script (#43294)
    GH-43293 - [Docs] Update code block for Installing Java Modules (#43295)
    GH-43299 - [Release][Packaging] Only include pyarrow folder when finding packages on setuptools (#43325)
    GH-43314 - [CI][Java] Delete arrow-maven-plugins from release script (#43313)
    GH-43320 - [Java] fix for SchemaChangeRuntimeException transferring empty FixedSizeListVector (#43321)
    GH-43331 - [C++] Add missing serde methods to Location (#43332)
    GH-43346 - [Docs][Format] Update broken links (#43347)
    GH-43349 - [R] Fix altrep string columns from readr (#43351)
    GH-43357 - [R] Fix some lints (#43338)
    GH-43359 - [Go][Parquet] ReadRowGroups panics with canceled context (#43360)
    GH-43377 - [Java][CI] Java-Jars CI is Failing with a linking error on macOS (#43385)
    GH-43378 - [Java][CI] Don’t configure multithreading when building javadocs (#43674)
    GH-43382 - [C++][Parquet] min-max Statistics doesn’t work well when one of min-max is truncated (#43383)
    GH-43388 - [Python] Give precedence to pycapsule interface in pa.schema(..) (#43486)
    GH-43393 - [C++][Parquet] parquet-dump-footer: Remove redundant link and fix --debug processing (#43375)
    GH-43394 - [Java][Benchmarking] Fix Java benchmarks for Java 17+ (#43395)
    GH-43400 - [C++] Ensure using bundled GoogleTest when we use bundled GoogleTest (#43465)
    GH-43412 - [Java][Benchmarking] Use JDK_JAVA_OPTIONS for JVM arguments (#43411)
    GH-43414 - [C++][Compute] Fix invalid memory access when resizing var-length buffer in row table (#43415)
    GH-43429 - [C++][FlightRPC] Fix Flight UCX build issues (#43430)
    GH-43432 - [Java][Packaging] Clean up java-jars job (#43431)
    GH-43440 - [R] Unable to filter a factor column with %in% (#43446)
    GH-43447 - [C++] Filter out zero length buffers on gRPC transport (#43448)
    GH-43449 - [CI][Conan] Don’t push used images (#43470)
    GH-43463 - [C++][Gandiva] Always use gdv_function_stubs.h in context_helper.cc (#43464)
    GH-43467 - [C++] Add support for the official LZ4 CMake package (#43468)
    GH-43487 - [Python] Sanitize Python reference handling in UDF implementation (#43557)
    GH-43502 - [Java] Fix Java JNI / AMD64 manylinux2014 Java JNI test not test dataset module (#43503)
    GH-43506 - [Java] Fix TestFragmentScanOptions result not match (#43639)
    GH-43554 - [Go] Handle excluded fields (#43555)
    GH-43577 - [Java] getBuffers method needs correction on clear flag usage (#43583)
    GH-43588 - [Python] Allow tuple for rename columns (#43609)
    GH-43618 - [Packaging][Python] Fix vcpkg version detection in macOS wheel build jobs (#43615)
    GH-43627 - [R] Fix summarize() performance regression (pushdown) (#43649)
    GH-43635 - [R][CI] Don’t install Quarto (#43636)
    GH-43665 - [R] Remove references to bindings vignette (#43889)
    GH-43667 - [Java] Keeping Flight default header size consistent between server and client (#43697)
    GH-43707 - [Python] Fix compilation on Cython<3 (#43765)
    GH-43717 - [Java][FlightSQL] Add all ActionTypes to FlightSqlUtils.FLIGHT_SQL_ACTIONS (#43718)
    GH-43735 - [R] AWS SDK fails to build on one of CRAN’s M1 builders (#43736)
    GH-43743 - [CI][Docs] Ensure creating build directory (#43744)
    GH-43748 - [R] Handle package_version in safe_r_metadata (#43895)
    GH-43785 - [Python][CI] Correct PARQUET_TEST_DATA path in wheel tests (#43786)
    GH-43787 - [C++] Register the new Opaque extension type by default (#43788)
    GH-43815 - [CI][Packaging][Python] Avoid uploading wheel to gemfury if version already exists (#43816)
    GH-43837 - [Go][IPC] Consolidate StreamWriter and FileWriter, ensuring that EOS indicator is written in file (#43890)
    GH-43860 - [Go][Parquet] Handle the error correctly (#43861)
    GH-43868 - [CI][Python] Skip test that requires PARQUET_TEST_DATA env on emscripten (#43906)
    GH-43869 - [Java][CI] Flight related failure in the AMD64 Windows Server 2022 Java JDK 11 CI (#43850)
    GH-43870 - [C++][Acero] Fix typos in join benchmark (#43871)
    GH-43877 - [Ruby] Add support for 0 decimal value (#43882)
    GH-43885 - [C++][CI] Catch potential integer overflow in PoolBuffer (#43886)
    GH-43933 - [CI] Remove docker-compose warnings (#43934)
    GH-43952 - [CI] Bump actions/{upload,download}-artifact from 3 to latest v4 in /.github/workflows (#43940)
    GH-43960 - [R] fix str_sub binding to properly handle negative end values (#44141)
    GH-43966 - [Java] Check for nullabilities when comparing StructVector (#43968)
    GH-44046 - [Python] Fix threading issues with borrowed refs and pandas (#44047)
    GH-44050 - [CI][Integration] Execute integration test again (#44051)
    GH-44069 - [Docs][R] Add note to to_arrow() docs about collect/compute (#44094)
    GH-44071 - [C++] Leak S3 structures if finalization happens too late (#44090)
    GH-44076 - [CI] Remove verify-rc-binaries-wheel-macos-11 which is now deprecated (#44077)
    GH-44081 - [C++][Parquet] Fix reported metrics in parquet-arrow-reader-writer-benchmark (#44082)
    GH-44088 - [Java] Fix copyFrom in BaseVariableWidthViewVector (#44078)
    GH-44096 - [C++] Don’t use Boost.Process with Emscripten (#44097)
    GH-44098 - [C++] Add home made _mm256_set_m128i for compilers who are missing it (#44116)
    GH-44122 - [R] Don’t use the new pipe yet (#44123)
    GH-44127 - [CI][R] Fix util_enable_core_dumps.sh path (#44128)
    GH-44153 - [GLib][FlightRPC] Fix closure annotation (#44154)
    GH-44214 - [C++] JsonExtensionType equality check ignores storage type (#44215)
    GH-44218 - [Benchmarking][Python] Avoid uwsgi install failure on macOS (#44221)
    GH-44234 - [CI][C++][AppVeyor] Use conda instead of Mamba (#44235)
    GH-44253 - [CI][Release][Python] Do not verify Python on Ubuntu 20.04 (#44254)
    GH-44256 - [C++][FS][Azure] Fix edgecase where GetFileInfo incorrectly returns NotFound on flat namespace and Azurite (#44302)
    GH-44268 - [Release][Ruby][CI] Pin version of glib used in verification script (#44270)
    GH-44269 - [C++][FS][Azure] Catch missing exceptions on HNS support check (#44274)
    GH-44277 - [CI] Use Miniforge instead of Mambaforge (#44278)
    GH-44297 - [Integration][CI] Skip nanoarrow IPC integration tests for compressed/dictionary-encoded files (#44298)
    GH-44300 - [Integration][Archery] Don’t import unused testers (#44301)
    GH-44303 - [C++][FS][Azure] Fix minor hierarchical namespace bugs (#44307)
    GH-44334 - [C++] Fix S3 error handling in ObjectOutputStream (#44335)
    GH-44337 - [CI][GLib] Fix a flaky StreamDecoder and Buffer test (#44341)
    GH-44342 - [C++] Disable jemalloc by default on ARM (#44380)
    GH-44358 - [Packaging][Debian] Add workaround for CUDA include path (#44359)
    GH-44369 - [CI][Python] Remove ds requirement from test collection on test_dataset.py (#44370)
    GH-44373 - [Packaging][Java] Fix brew link to Python 3.13 on macOS (#44374)
    GH-44381 - [Ruby][Release] Pin not only glib but also python on verification jobs (#44382)
    GH-44386 - [Integration][Release] Pin Python 3.12 for Integration verification when using Conda (#44388)
    GH-44422 - [Packaging][Release][Linux] Upload artifacts before test (#44425)

New Features and Improvements

    GH-15058 - [C++][Python] Native support for UUID (#37298)
    GH-17682 - [C++][Python] Bool8 Extension Type Implementation (#43488)
    GH-17682 - [Go] Bool8 Extension Type Implementation (#43323)
    GH-17682 - [Format] Add Bool8 Canonical Extension Type (#43234)
    GH-25118 - [Python] Make NumPy an optional runtime dependency (#41904)
    GH-28866 - [Java] Java Dataset API ScanOptions expansion (#41646)
    GH-30058 - [Python] Add StructType attribute to access all its fields (#43481)
    GH-30863 - [JS] Use a singleton StructRow proxy handler (#44289)
    GH-32538 - [C++][Parquet] Add JSON canonical extension type (#13901)
    GH-34529 - [C++][Compute] Replace explicit checking with DCHECK for invariants in row segmenter (#44236)
    GH-37756 - [Format][Docs] Document IPC Compression (#43950)
    GH-38041 - [C++][CI] Improve IPC fuzzing seed corpus (#43621)
    GH-38051 - [Java] Remove Java 8 support (#43139)
    GH-38183 - [CI][Python] Use pipx to install GCS testbench (#43852)
    GH-38255 - [Java] Implement Flight SQL Bulk Ingestion (#43551)
    GH-38847 - [Documentation][C++] Explicitly note that compute is optional (#43629)
    GH-39638 - [Docs][R] Add r-universe instructions (#44033)
    GH-39982 - [Java] Add RunEndEncodedVector (#43888)
    GH-40036 - [C++] Azure file system write buffering & async writes (#43096)
    GH-40154 - [C++][Parquet] Separate encoders and decoder (#43972)
    GH-40216 - [Python][CI][Packaging] Don’t upload sdist to scientific-python nightly channel (only wheels) (#43943)
    GH-40216 - [Python][CI][Packaging] Upload nightly wheels to main label of scientific-python-nightly-wheels channel (#43932)
    GH-40216 - [CI][Packaging][Python] Upload pyarrow nightly wheels to scientific python channel on Anaconda (#43862)
    GH-40493 - [GLib][Ruby] Add GArrowStreamDecoder (#44170)
    GH-40570 - [CI] Default environment to Ubuntu 22.04 instead of 20.04 (#44151)
    GH-40860 - [GLib][Parquet] Add gparquet_arrow_file_writer_write_record_batch() (#44001)
    GH-40936 - [Java] Implement Holder-based functions in `ViewVarBinaryVector`
    GH-40937 - [Java] Implement Holder-based functions for ViewVarCharVector & ViewVarBinaryVector (#44187)
    GH-41056 - [GLib][FlightRPC] Add gaflight_client_do_put() and related APIs (#43813)
    GH-41272 - [Java] LargeListViewVector Implementation (#43516)
    GH-41291 - [Java] LargeListViewVector Implementation transferPair implementation (#43637)
    GH-41347 - [FlightRPC][C#] Allow hosting flight server in pre-Kestrel .net versions (#41348)
    GH-41569 - [Java] ListViewVector Implementation for UnionListViewReader (#43077)
    GH-41579 - [C++][Python][Parquet] Support reading/writing key-value metadata from/to ColumnChunkMetaData (#41580)
    GH-41584 - [Java] ListView Implementation for C Data Interface (#43686)
    GH-41585 - [Java] LargeListView Implementation for C Data Interface
    GH-41623 - [Docs][C++] Is arrow::dataset namespace still experimental?
    GH-41640 - [Go] Implement BYTE_STREAM_SPLIT Parquet Encoding (#43066)
    GH-41665 - [Python] Ensure (Chunked)Array/RecordBatch/Table methods don’t crash with non-CPU data
    GH-41673 - [Format][Docs] Add arrow format introductory page (#41593)
    GH-41909 - [C++] Add arrow::ArrayStatistics (#43273)
    GH-41922 - [CI][C++] Update Minio version (#44225)
    GH-41951 - [Java] Add @FormatMethod annotations (#43376)
    GH-42014 - [Python] Let StructArray.from_array accept a type in addition to names or fields (#43047)
    GH-42085 - [Python] Test FlightStreamReader iterator (#42086)
    GH-42102 - [C++][Parquet] Add binary that extracts a footer from a parquet file (#42174)
    GH-42222 - [Python] Add bindings for CopyTo on RecordBatch and Array classes (#42223)
    GH-42247 - [C++] Support casting to and from utf8_view/binary_view (#43302)
    GH-43044 - [R] So-called non-API entry points (#43173)
    GH-43069 - [Python] Use Py_IsFinalizing from pythoncapi_compat.h (#43767)
    GH-43075 - [CI][Crossbow][Docker] Set timeout for docker-tests (#43078)
    GH-43092 - [Swift] Update ArrowData for Nested Types (allow children)
    GH-43095 - [C++] Update bundled vendor/datetime to support for building with libc++ and C++20 (#43094)
    GH-43097 - [C++] Implement PathFromUri support for Azure file system (#43098)
    GH-43114 - [Archery][Dev] Support setuptools-scm >= 8.0.0 (#43156)
    GH-43129 - [C++][Compute] Fix the unnecessary allocation of extra bytes when encoding row table (#43125)
    GH-43141 - [C++][Parquet] Replace use of int with int32_t in the internal Parquet encryption APIs (#43413)
    GH-43142 - [C++][Parquet] Refactor Encryptor API to use arrow::util::span instead of raw pointers (#43195)
    GH-43143 - [C++][Parquet] Default initialize some parquet metadata variables (#43144)
    GH-43160 - [Swift] Add Struct Array (#43161)
    GH-43164 - [C++] Fix CMake link order for AWS SDK (#43230)
    GH-43168 - [Swift] Add buffer and array builders for Struct type (#43171)
    GH-43169 - [Swift] Add StructArray to ArrowReader (#43335)
    GH-43185 - [C++] Suggest a cast when Concatenate fails due to offsets overflow (#43190)
    GH-43187 - [C++] Support basic is_in predicate simplification (#43761)
    GH-43197 - [C++][AzureFS] Ignore password field in URI (#44220)
    GH-43209 - [C++] Add lint for DCHECK in public headers (#43248)
    GH-43229 - [Java] Update Maven project info (#43231)
    GH-43238 - [C++][FlightRPC] Reduce repetition in flight/types.cc in serde functions (#43237)
    GH-43249 - [C++][Parquet] remove useless template parameter of DeltaLengthByteArrayEncoder (#43250)
    GH-43254 - [C++] Always prefer mimalloc to jemalloc (#40875)
    GH-43258 - [C++][Flight] Use a Base CRTP type for the types used in RPC calls (#43255)
    GH-43266 - [C#] Add LargeBinary, LargeString and LargeList array types (#43269)
    GH-43291 - [C++] Expand the ‘take’ function tests to cover more chunked-array cases (#43292)
    GH-43301 - [C++][Parquet] Enhance the comment for ColumnReader/Decoder (#44003)
    GH-43319 - [R][Docs] Update packaging checklist (#43345)
    GH-43329 - [C++] Order classes in flight/types.h according to Flight.proto (#43330)
    GH-43380 - [Java] Add support for cross jdk version testing (#43381)
    GH-43391 - [Python] Add bindings for memory manager and device to Context class (#43392)
    GH-43396 - [Java] Remove/replace jsr305 (#43397)
    GH-43418 - [CI] Add wheels and java-jars to vcpkg group for tasks (#43419)
    GH-43425 - [Java] Upgrade JNI to version 10 (#43424)
    GH-43427 - [C++][Parquet] Deprecate ColumnChunk::file_offset field and no longer write Metadata at end of Chunk (#43428)
    GH-43437 - [Java] Update protobuf from 3.25.1 to 3.25.4 (#43436)
    GH-43443 - [Go][IPC] Infer schema from first record if not specified (#43484)
    GH-43444 - [C++] Add benchmark for binary view builder (#43445)
    GH-43450 - [CI] Temporarily turn off conda jobs that are failing (#43451)
    GH-43453 - [Format] Add Opaque canonical extension type (#43457)
    GH-43454 - [C++][Python] Add Opaque canonical extension type (#43458)
    GH-43455 - [Go] Add Opaque canonical extension type (#43459)
    GH-43456 - [Java] Add Opaque canonical extension type (#43460)
    GH-43469 - [Java] Change the default CompressionCodec.Factory to leverage compression support transparently (#43471)
    GH-43479 - [Java] Change visibility of MemoryUtil.UNSAFE (#43480)
    GH-43483 - [Java][C++] Support more CsvFragmentScanOptions in JNI call (#43482)
    GH-43492 - [C++] Thirdparty: Bump lz4 to 1.10.0 (#43493)
    GH-43495 - [C++][Compute] Widen the row offset of the row table to 64-bit (#43389)
    GH-43500 - [R][CI] Bump dev docs CI job from ubuntu 20.04 (#43501)
    GH-43507 - [C++] Use ViewOrCopyTo instead of CopyTo when pretty printing non-CPU data (#43508)
    GH-43509 - [R] Add link to ?acero from ?list_compute_functions (#44210)
    GH-43512 - [Java] ListViewVector Visitor-based component Integration (#43513)
    GH-43514 - [Python] Deprecate passing build flags to setup.py (#43515)
    GH-43518 - [Python][Packaging][CI] Drop Python 3.8 support (#43970)
    GH-43519 - [Python][CI] Add Python 3.13 conda test build (#44192)
    GH-43519 - [Python][CI][Packaging] Use released versions to build and test wheels on Python 3.13 (#44193)
    GH-43519 - [Python] Set up wheel building for Python 3.13 (#43539)
    GH-43532 - [Python] Remove usage of deprecated pkg_resources in setup.py (#43602)
    GH-43536 - [Python][CI] Add a Crossbow job with the free-threaded build (#43671)
    GH-43536 - [Python] Do not use borrowed references APIs (#43540)
    GH-43536 - [Python] Declare support for free-threading in Cython (#43606)
    GH-43543 - [FlightRPC][C++] Reduce the number of references to protobuf::Any (#43544)
    GH-43548 - [R][CI] Use grep -F to simplify matching or rchk output (#43477)
    GH-43559 - [Python][CI] Add a Crossbow job with a debug CPython interpreter (#43565)
    GH-43578 - [C++] Simplify arrow::ArrayStatistics::ValueType (#43581)
    GH-43591 - [C++][GLib] Don’t install arrow-cuda.pc/arrow-cuda-glib.pc on Windows (#43593)
    GH-43592 - [C++] Remove redundant default constructor/deconstructor in arrow::ArrayStatistics (#43579)
    GH-43594 - [C++] Remove std::optional from arrow::ArrayStatistics::is_{min,max}_exact (#43595)
    GH-43608 - [CI][Archery] Prefer docker compose over docker-compose (#43586)
    GH-43633 - [R] Add tests for packages that might be tricky to roundtrip data to Tables + Parquet files (#43634)
    GH-43638 - [Java] LargeListViewVector RangeEqualVisitor and TypeEqualVisitor integration (#43642)
    GH-43643 - [Java] LargeListViewVector IPC Integration (#43681)
    GH-43669 - [Docs][Dev] Document archery --debug flag in section about docker (#43935)
    GH-43672 - [C#] Schema should be optional on FlightInfo (#43673)
    GH-43677 - [C++][FlightRPC] Move the FlightTestServer to its own .cc and .h files (#43678)
    GH-43680 - [Integration] Unskip nanoarrow in IPC integration tests (#43715)
    GH-43684 - [Python][Dataset] Python / Cython interface to C++ arrow::dataset::Partitioning::Format (#43740)
    GH-43687 - [C++] Compute: fix register kernel SimdLevel for AddMinMax512AggKernels (#43704)
    GH-43688 - [C++] Prevent Snappy from disabling RTTI when bundled (#43706)
    GH-43690 - [Python][CI] Simplify python/requirements-wheel-test.txt file (#43691)
    GH-43702 - [C++][FS][Azure] Use the latest Azurite and update the bundled Azure SDK for C++ to azure-identity_1.9.0 (#43723)
    GH-43703 - [C++][Parquet][CI] Parquet: Introducing more bad_data for testing (#43708)
    GH-43712 - [C++][Parquet] Dataset: Handle num-nulls in Parquet correctly when !HasNullCount() (#43726)
    GH-43719 - [C++] Clarify the way SIMD-enabled agg kernels come from the same code in different compilation units (#43720)
    GH-43727 - [Python] RecordBatch fails gracefully on non-cpu devices (#43729)
    GH-43728 - [Python] ChunkedArray fails gracefully on non-cpu devices (#43795)
    GH-43732 - [Go] Require Go 1.22 or above (#43864)
    GH-43733 - [C++] Fix Scalar boolean handling in row encoder (#43734)
    GH-43738 - [GLib] Add GArrowAzureFileSystem (#43739)
    GH-43746 - [C++] Add support for Boost 1.86 (#43766)
    GH-43758 - [C++] Compute: More comment in RowEncoder (#43763)
    GH-43759 - [C++] Acero: Minor code enhancement for Join (#43760)
    GH-43764 - [Go][FlightSQL] Add NewPreparedStatement function (#43781)
    GH-43768 - [C++] Fix the case when boolean_{any,all} meets constant input with length in Acero (#43799)
    GH-43776 - [C++] Add chunked Take benchmarks with a small selection factor (#43772)
    GH-43790 - [Go][Parquet] Add support for LZ4_RAW compression codec (#43835)
    GH-43796 - [C++] Indent preprocessor directives (#43798)
    GH-43797 - [C++] Attach arrow::ArrayStatistics to arrow::ArrayData (#43801)
    GH-43802 - [GLib] Add GAFlightRecordBatchWriter (#43803)
    GH-43805 - [C++] Enable filesystem automatically when one of ARROW_{AZURE,GCS,HDFS,S3}=ON is specified (#43806)
    GH-43809 - [Docs] Update extension type examples to not use UUID (#44120)
    GH-43814 - [GLib][FlightRPC] Add GAFlightServerClass::do_put (#43999)
    GH-43840 - [CI] Add cuda group to tasks.yml and minor updates for new cuda runner image (#43841)
    GH-43846 - [Python][Packaging] Remove numpy dependency from pyarrow packaging (#44148)
    GH-43854 - [C++] Expose the set of device types where a ChunkedArray is allocated (#43853)
    GH-43872 - [Go][CI] Disable Dependabot for Go (#44102)
    GH-43873 - [Go][CI] Remove Go related test CI (#44143)
    GH-43874 - [CI][Integration][Go] Use apache/arrow-go (#44142)
    GH-43875 - [Go][CI] Remove Go related lint configurations (#44144)
    GH-43878 - [Go][Release] Remove Go related codes from our release scripts (#44172)
    GH-43879 - [Go] Remove go related code (#44293)
    GH-43883 - [CI] Remove Python version guard when installing GCS testbench (#43884)
    GH-43894 - [R] format_aggregation() should print options too (#43896)
    GH-43902 - [Java] Support for Long memory addresses (#43903)
    GH-43907 - [C#][FlightRPC] Add Grpc Call Options support on Flight Client (#43910)
    GH-43927 - [C++] Make ChunkResolver::ResolveMany output a list of ChunkLocations (#43928)
    GH-43944 - [C++][Parquet] Add support for arrow::ArrayStatistics: non zero-copy int based types (#43945)
    GH-43946 - [C++][Parquet] Guard against use of cleared decryptor/encryptor (#43947)
    GH-43953 - [C++] Add tests based on random data and benchmarks to ChunkResolver::ResolveMany (#43954)
    GH-43962 - [Java] Consider warnings as errors for Adapter Module (#43963)
    GH-43964 - [Python] Build macOS and manylinux wheels for free-threading (#43965)
    GH-43967 - [C++] Enhance error message for URI parsing (#43938)
    GH-43969 - [CI][Dev] Prune .dockerignore (#43971)
    GH-43973 - [Python] Table fails gracefully on non-cpu devices (#43974)
    GH-43979 - [CI][C++][Dev] Add cpplint to pre-commit (#43982)
    GH-43983 - [C++][Parquet] Add support for arrow::ArrayStatistics: zero-copy types (#43984)
    GH-43986 - [C++][Acero] Some code cleanup to Grouper (#43988)
    GH-43992 - [C++] Add missing std::move() in array_nested.cc (#43993)
    GH-43996 - [Java] Mark new allocated ArrowSchema as released (#43997)
    GH-43998 - [C++][Docs] Add missing install command in building docs (#44000)
    GH-44006 - [GLib][Parquet] Add gparquet_arrow_file_writer_new_row_group() (#44039)
    GH-44007 - [GLib][Parquet] Add gparquet_arrow_file_writer_new_buffered_row_group() (#44100)
    GH-44008 - [C++][Parquet] Add support for arrow::ArrayStatistics: boolean (#44009)
    GH-44011 - [Java] Consider warnings as errors for C Module (#44012)
    GH-44013 - [Java] Consider warnings as errors for Dataset Module (#44014)
    GH-44016 - [Java] Consider warnings as errors for Format Module (#44017)
    GH-44034 - [Go][Format][FlightRPC] Update go_package in Flight.proto and FlightSql.proto (#44035)
    GH-44036 - [C++] IPC: ipc reader/writer code enhancement (#44019)
    GH-44044 - [Java] Consider warnings as errors for Vector Module (#44045)
    GH-44052 - [C++][Compute] Reduce the complexity of row segmenter (#44053)
    GH-44058 - [CI][Integration] Group logs on GitHub Actions (#44060)
    GH-44062 - [Dev][Archery][Integration] Reduce needless test matrix (#44099)
    GH-44063 - [Python] Deprecate the no longer used serialize/deserialize Pyarrow C++ functions (#44064)
    GH-44072 - [C++][Parquet] Add Float16 reading benchmarks (#44073)
    GH-44079 - [C++][Parquet] Remove deprecated APIs (#44080)
    GH-44085 - [CI][R] Update Ubuntu version for R force test (#44087)
    GH-44095 - [CI][Python] Enable S3 testing on Windows wheel builds (#44093)
    GH-44111 - [CI][Python] Enable S3 tests on macOS CI (#44129)
    GH-44149 - [Packaging][CI] Remove references to deprecated Ubuntu bionic (#44150)
    GH-44155 - [Archery][Integration] Rename “language” to “implementation” (#44156)
    GH-44158 - [Archery][Integration] Add more explanation how --target-implementations works (#44177)
    GH-44167 - [C++][Acero] Add more row segmenter tests (#44166)
    GH-44178 - [GLib][FlightRPC] Add GAFlightCallOptions:timeout (#44181)
    GH-44186 - [C++][Parquet] Fix typo in parquet/column_writer.cc (#40856)
    GH-44194 - [C++] Avoid repeated ArrayData::offset lookups (#44190)
    GH-44206 - [CI][macOS] Drop support for macOS 12 (#44212)
    GH-44222 - [C++][Gandiva] Accept LLVM 19.1 (#44233)
    GH-44229 - [Docs] Add PyArrow to JAX example to the docs (#44230)
    GH-44237 - [C#] Use stack allocated buffer when serializing decimal values (#44238)
    GH-44249 - [C++] Unify simd header includings (#44250)
    GH-44271 - [C#] Add support for Decimal32 and Decimal64 (#44272)
    GH-44273 - [C++][Decimal] Use 0E+1 not 0.E+1 for broader compatibility (#44275)
    GH-44290 - [Java][Flight] Add ActionType description getter (#44291)
    GH-44314 - [Packaging][Python] Use macOS 12 as deployment target to have macOS 12 pyarrow wheels (#44315)
    GH-44347 - [Packaging][C++] Enable Azure file system for deb/rpm (#44348)
    GH-44355 - [Packaging][Python] Disable interactive deb configuration in wheel-manylinux-*-cp313t-* (#44362)
    GH-44415 - [Release][Ruby] Remove pins from glib section of release verification script (#44407)

To see a diff of this commit:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=4040783f41bbe97e3b6bb21ad2e9733a61044d10

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

diffstat:
 apache-arrow/Makefile   |  4 +--
 apache-arrow/PLIST      | 87 +++++++++++--------------------------------------
 apache-arrow/distinfo   |  6 ++--
 apache-arrow/version.mk |  2 +-
 4 files changed, 25 insertions(+), 74 deletions(-)

diffs:
diff --git a/apache-arrow/Makefile b/apache-arrow/Makefile
index 2c8db81680..aa8c5783af 100644
--- a/apache-arrow/Makefile
+++ b/apache-arrow/Makefile
@@ -54,8 +54,8 @@ CMAKE_CONFIGURE_ARGS+=	-DARROW_WITH_LZ4=ON
 CMAKE_CONFIGURE_ARGS+=	-DARROW_PARQUET=ON
 CMAKE_CONFIGURE_ARGS+=	-DPARQUET_BUILD_EXECUTABLES=ON
 CMAKE_CONFIGURE_ARGS+=	-DPARQUET_REQUIRE_ENCRYPTION=ON
-CMAKE_CONFIGURE_ARGS+=	-DARROW_SUBSTRAIT=ON
-CMAKE_CONFIGURE_ARGS+=	-DARROW_FLIGHT=ON
+CMAKE_CONFIGURE_ARGS+=	-DARROW_SUBSTRAIT=OFF
+CMAKE_CONFIGURE_ARGS+=	-DARROW_FLIGHT=OFF
 CMAKE_CONFIGURE_ARGS+=	-DARROW_WITH_SNAPPY=ON
 CMAKE_CONFIGURE_ARGS+=	-DARROW_WITH_ZLIB=ON
 CMAKE_CONFIGURE_ARGS+=	-DARROW_WITH_ZSTD=ON
diff --git a/apache-arrow/PLIST b/apache-arrow/PLIST
index b644ef3f42..e37dff264d 100644
--- a/apache-arrow/PLIST
+++ b/apache-arrow/PLIST
@@ -1,6 +1,7 @@
 @comment $NetBSD$
 bin/arrow-file-to-stream
 bin/arrow-stream-to-file
+bin/parquet-dump-footer
 bin/parquet-dump-schema
 bin/parquet-reader
 bin/parquet-scan
@@ -27,7 +28,6 @@ include/arrow/acero/test_nodes.h
 include/arrow/acero/time_series_util.h
 include/arrow/acero/tpch_node.h
 include/arrow/acero/type_fwd.h
-include/arrow/acero/unmaterialized_table.h
 include/arrow/acero/util.h
 include/arrow/acero/visibility.h
 include/arrow/api.h
@@ -52,6 +52,7 @@ include/arrow/array/builder_union.h
 include/arrow/array/concatenate.h
 include/arrow/array/data.h
 include/arrow/array/diff.h
+include/arrow/array/statistics.h
 include/arrow/array/util.h
 include/arrow/array/validate.h
 include/arrow/buffer.h
@@ -113,20 +114,12 @@ include/arrow/dataset/type_fwd.h
 include/arrow/dataset/visibility.h
 include/arrow/datum.h
 include/arrow/device.h
-include/arrow/engine/api.h
-include/arrow/engine/pch.h
-include/arrow/engine/substrait/api.h
-include/arrow/engine/substrait/extension_set.h
-include/arrow/engine/substrait/extension_types.h
-include/arrow/engine/substrait/options.h
-include/arrow/engine/substrait/relation.h
-include/arrow/engine/substrait/serde.h
-include/arrow/engine/substrait/test_plan_builder.h
-include/arrow/engine/substrait/test_util.h
-include/arrow/engine/substrait/type_fwd.h
-include/arrow/engine/substrait/util.h
-include/arrow/engine/substrait/visibility.h
+include/arrow/device_allocation_type_set.h
+include/arrow/extension/bool8.h
 include/arrow/extension/fixed_shape_tensor.h
+include/arrow/extension/json.h
+include/arrow/extension/opaque.h
+include/arrow/extension/uuid.h
 include/arrow/extension_type.h
 include/arrow/filesystem/api.h
 include/arrow/filesystem/azurefs.h
@@ -141,28 +134,6 @@ include/arrow/filesystem/s3_test_util.h
 include/arrow/filesystem/s3fs.h
 include/arrow/filesystem/test_util.h
 include/arrow/filesystem/type_fwd.h
-include/arrow/flight/api.h
-include/arrow/flight/client.h
-include/arrow/flight/client_auth.h
-include/arrow/flight/client_cookie_middleware.h
-include/arrow/flight/client_middleware.h
-include/arrow/flight/client_tracing_middleware.h
-include/arrow/flight/middleware.h
-include/arrow/flight/otel_logging.h
-include/arrow/flight/pch.h
-include/arrow/flight/platform.h
-include/arrow/flight/server.h
-include/arrow/flight/server_auth.h
-include/arrow/flight/server_middleware.h
-include/arrow/flight/server_tracing_middleware.h
-include/arrow/flight/test_definitions.h
-include/arrow/flight/test_util.h
-include/arrow/flight/transport.h
-include/arrow/flight/transport_server.h
-include/arrow/flight/type_fwd.h
-include/arrow/flight/types.h
-include/arrow/flight/types_async.h
-include/arrow/flight/visibility.h
 include/arrow/io/api.h
 include/arrow/io/buffered.h
 include/arrow/io/caching.h
@@ -228,6 +199,7 @@ include/arrow/testing/gtest_compat.h
 include/arrow/testing/gtest_util.h
 include/arrow/testing/matchers.h
 include/arrow/testing/pch.h
+include/arrow/testing/process.h
 include/arrow/testing/random.h
 include/arrow/testing/uniform_real.h
 include/arrow/testing/util.h
@@ -247,7 +219,6 @@ include/arrow/util/benchmark_util.h
 include/arrow/util/binary_view_util.h
 include/arrow/util/bit_block_counter.h
 include/arrow/util/bit_run_reader.h
-include/arrow/util/bit_stream_utils.h
 include/arrow/util/bit_util.h
 include/arrow/util/bitmap.h
 include/arrow/util/bitmap_builders.h
@@ -303,12 +274,12 @@ include/arrow/util/memory.h
 include/arrow/util/mutex.h
 include/arrow/util/parallel.h
 include/arrow/util/pcg_random.h
+include/arrow/util/prefetch.h
 include/arrow/util/print.h
 include/arrow/util/queue.h
 include/arrow/util/range.h
 include/arrow/util/ree_util.h
 include/arrow/util/regex.h
-include/arrow/util/rle_encoding.h
 include/arrow/util/rows_to_batches.h
 include/arrow/util/simd.h
 include/arrow/util/small_vector.h
@@ -435,7 +406,6 @@ lib/cmake/Arrow/ArrowTargets-release.cmake
 lib/cmake/Arrow/ArrowTargets.cmake
 lib/cmake/Arrow/FindBrotliAlt.cmake
 lib/cmake/Arrow/FindOpenSSLAlt.cmake
-lib/cmake/Arrow/FindProtobufAlt.cmake
 lib/cmake/Arrow/FindSnappyAlt.cmake
 lib/cmake/Arrow/FindglogAlt.cmake
 lib/cmake/Arrow/Findlz4Alt.cmake
@@ -451,15 +421,6 @@ lib/cmake/ArrowDataset/ArrowDatasetConfig.cmake
 lib/cmake/ArrowDataset/ArrowDatasetConfigVersion.cmake
 lib/cmake/ArrowDataset/ArrowDatasetTargets-release.cmake
 lib/cmake/ArrowDataset/ArrowDatasetTargets.cmake
-lib/cmake/ArrowFlight/ArrowFlightConfig.cmake
-lib/cmake/ArrowFlight/ArrowFlightConfigVersion.cmake
-lib/cmake/ArrowFlight/ArrowFlightTargets-release.cmake
-lib/cmake/ArrowFlight/ArrowFlightTargets.cmake
-lib/cmake/ArrowFlight/FindgRPCAlt.cmake
-lib/cmake/ArrowSubstrait/ArrowSubstraitConfig.cmake
-lib/cmake/ArrowSubstrait/ArrowSubstraitConfigVersion.cmake
-lib/cmake/ArrowSubstrait/ArrowSubstraitTargets-release.cmake
-lib/cmake/ArrowSubstrait/ArrowSubstraitTargets.cmake
 lib/cmake/Parquet/FindThriftAlt.cmake
 lib/cmake/Parquet/ParquetConfig.cmake
 lib/cmake/Parquet/ParquetConfigVersion.cmake
@@ -467,42 +428,32 @@ lib/cmake/Parquet/ParquetTargets-release.cmake
 lib/cmake/Parquet/ParquetTargets.cmake
 lib/libarrow.a
 lib/libarrow.so
-lib/libarrow.so.1700
-lib/libarrow.so.1700.0.0
+lib/libarrow.so.1800
+lib/libarrow.so.1800.0.0
 lib/libarrow_acero.a
 lib/libarrow_acero.so
-lib/libarrow_acero.so.1700
-lib/libarrow_acero.so.1700.0.0
+lib/libarrow_acero.so.1800
+lib/libarrow_acero.so.1800.0.0
 lib/libarrow_bundled_dependencies.a
 lib/libarrow_dataset.a
 lib/libarrow_dataset.so
-lib/libarrow_dataset.so.1700
-lib/libarrow_dataset.so.1700.0.0
-lib/libarrow_flight.a
-lib/libarrow_flight.so
-lib/libarrow_flight.so.1700
-lib/libarrow_flight.so.1700.0.0
-lib/libarrow_substrait.a
-lib/libarrow_substrait.so
-lib/libarrow_substrait.so.1700
-lib/libarrow_substrait.so.1700.0.0
+lib/libarrow_dataset.so.1800
+lib/libarrow_dataset.so.1800.0.0
 lib/libparquet.a
 lib/libparquet.so
-lib/libparquet.so.1700
-lib/libparquet.so.1700.0.0
+lib/libparquet.so.1800
+lib/libparquet.so.1800.0.0
 lib/pkgconfig/arrow-acero.pc
 lib/pkgconfig/arrow-compute.pc
 lib/pkgconfig/arrow-csv.pc
 lib/pkgconfig/arrow-dataset.pc
 lib/pkgconfig/arrow-filesystem.pc
-lib/pkgconfig/arrow-flight.pc
 lib/pkgconfig/arrow-json.pc
-lib/pkgconfig/arrow-substrait.pc
 lib/pkgconfig/arrow.pc
 lib/pkgconfig/parquet.pc
 share/arrow/gdb/gdb_arrow.py
-share/arrow/gdb/libarrow.so.1700.0.0-gdb.py
+share/arrow/gdb/libarrow.so.1800.0.0-gdb.py
 share/doc/arrow/LICENSE.txt
 share/doc/arrow/NOTICE.txt
 share/doc/arrow/README.md
-@pkgdir share/gdb/auto-load/home/matthew/pkgsrc/install.20240810/lib
+@pkgdir share/gdb/auto-load/home/matthew/pkgsrc/install.20241018/lib
diff --git a/apache-arrow/distinfo b/apache-arrow/distinfo
index 99188f243e..c4a47c6da0 100644
--- a/apache-arrow/distinfo
+++ b/apache-arrow/distinfo
@@ -3,9 +3,9 @@ $NetBSD$
 BLAKE2s (13.0.0.tar.gz) = b2edcdae20ea56461825c8ccd41f69ac17c3ffaf06b73239d1c62675e1b5ecf4
 SHA512 (13.0.0.tar.gz) = cdc42ddad3353297cf25ea2b6b3f09967f5f388efc26241f2997979fdbbac072819ff771145bc5bfa86cb326cca84b4119e8e6e3f658407961cf203a40603a7f
 Size (13.0.0.tar.gz) = 259967 bytes
-BLAKE2s (apache-arrow-17.0.0.tar.gz) = f43e8c901e26fe2b17ebd36d0a6758ad3dfa5cf5c92cdc908c62b7cb869a2a8a
-SHA512 (apache-arrow-17.0.0.tar.gz) = 4e2a617b8deeb9f94ee085653a721904a75696f0827bcba82b535cc7f4f723066a09914c7fa83c593e51a8a4031e8bf99e563cac1ebb1d89604cb406975d4864
-Size (apache-arrow-17.0.0.tar.gz) = 21822331 bytes
+BLAKE2s (apache-arrow-18.0.0.tar.gz) = 602de222b141621500e7a0598bc443c74a6aabf92c6ada46a7d8b2f4d1da4d18
+SHA512 (apache-arrow-18.0.0.tar.gz) = 4df30ab5561da695eaa864422626b9898555d86ca56835c3b8a8ca93a1dbaf081582bb36e2440d1daf7e1dd48c76941f1152a4f25ce0dbcc1c2abe244a00c05e
+Size (apache-arrow-18.0.0.tar.gz) = 19113236 bytes
 BLAKE2s (jemalloc-5.3.0.tar.bz2) = 285e6145b9d3b575b1ec5cfdae8af40b461149085f001839d64685c0d56e2689
 SHA512 (jemalloc-5.3.0.tar.bz2) = 22907bb052096e2caffb6e4e23548aecc5cc9283dce476896a2b1127eee64170e3562fa2e7db9571298814a7a2c7df6e8d1fbe152bd3f3b0c1abec22a2de34b1
 Size (jemalloc-5.3.0.tar.bz2) = 736023 bytes
diff --git a/apache-arrow/version.mk b/apache-arrow/version.mk
index 9ee843c808..6fabe7e89c 100644
--- a/apache-arrow/version.mk
+++ b/apache-arrow/version.mk
@@ -1,2 +1,2 @@
 # $NetBSD$
-APACHE_ARROW_VERSION=	17.0.0
+APACHE_ARROW_VERSION=	18.0.0

