pkgsrc-WIP-changes archive

apache-arrow: Update to 13.0.0



Module Name:	pkgsrc-wip
Committed By:	Matthew Danielson <matthewd%fastmail.us@localhost>
Pushed By:	matthewd
Date:		Thu Aug 31 14:36:12 2023 -0700
Changeset:	2676f4f19e044a7bcc1482c6b432ee09600703a7

Modified Files:
	apache-arrow/PLIST
	apache-arrow/distinfo
	apache-arrow/version.mk

Log Message:
apache-arrow: Update to 13.0.0

Bug Fixes
    GH-14969 - [R][Docs] Enable pkgdown built-in search (#36374)
    GH-20385 - [C++][Parquet] Reject partial load of an extension type (#33634)
    GH-23870 - [Python] Ensure parquet.write_to_dataset doesn’t create empty files for non-observed dictionary (category) values (#36465)
    GH-32832 - [Go] support building with tinygo (#35723)
    GH-34017 - [Python][FlightRPC][Doc] Fix FlightStreamReader.read_chunk’s docstring (#35583)
    GH-34293 - [Java] Error loading native libraries on Windows (#34312)
    GH-34338 - [Java] Removing the automatic enabling of BaseAllocator.DEBUG on -ea (#36042)
    GH-34351 - [C++][Parquet] Statistics: add detail documentation and tiny optimization (#35989)
    GH-34363 - [C++] Use equal size parts in S3 upload for R2 compatibility (#35808)
    GH-34391 - [C++] Future as-of-join-node hangs on distant times (#34392)
    GH-34523 - [C++] Avoid mixing bundled Abseil and system Abseil (#35387)
    GH-34656 - [CI][Python] Use gemfury tool to upload wheels instead of curl to fix Windows wheel upload (#35032)
    GH-34723 - [Java] Enable log trace for Netty allocator memory usage (#35314)
    GH-34752 - [C++] Add support for LoongArch (#34740)
    GH-34775 - [R] arrow_table: as.data.frame() sometimes returns a tbl and sometimes a data.frame (#35173)
    GH-34884 - [Python] Support pickling pyarrow.dataset PartitioningFactory objects (#36550)
    GH-34884 - [Python] Support pickling pyarrow.dataset Partitioning subclasses (#36462)
    GH-34886 - [Python] Add correct array numpy conversion for Table and RecordBatch (#36242)
    GH-34897 - [R] Ensure that the RStringViewer helper class does not own any Array references (#35812)
    GH-34907 - [Docs][R] Version selector reports that release version is dev (#35103)
    GH-35007 - [C++] Fix reading stdin (#35006)
    GH-35015 - [Go] Fix parquet memleak (#35973)
    GH-35027 - [Go] Use base64.StdEncoding in FixedSizeBinaryBuilder Unmarshal (#35028)
    GH-35053 - [Java] Fix MemoryUtil to support Java 21 (#36370)
    GH-35059 - [C++] Fix “hash_count” for run-end encoded inputs (#35129)
    GH-35101 - [C++] Update deprecated LOCATION target property in ArrowConfig.cmake.in (#35109)
    GH-35107 - [FlightSQL] Use uint8 to refer to 8 bit unsigned integers rather than uint1 (#35108)
    GH-35118 - [Format][FlightSQL] More use int32 to refer to 32-bit integers rather than int (#35213)
    GH-35118 - [FlightSQL] Use int32 to refer to 32-bit integers rather than int (#35120)
    GH-35140 - [R] Rewrite configure script and ensure we don’t use mismatched libarrow (#35147)
    GH-35144 - [C++] Fix a unit test broken when the output order of the aggregate node changed (#35145)
    GH-35177 - [Docs][Python] Suppress “WARNING: autosummary: failed to import serialize” (#35182)
    GH-35179 - [C++] Fix IMPORTED_LOCATION property for Arrow::bundled_dependencies (#35196)
    GH-35188 - [Go] Use AppendValueFromString for extension types in CSV Reader (#35189)
    GH-35190 - [Go] Correctly handle null values in CSV reader (#35191)
    GH-35193 - [Python][Packaging] Enable GCS on Windows wheels (#35255)
    GH-35202 - [Go][Parquet] Fix panic reading nested empty list (#35276)
    GH-35234 - [Go] Fix skip argument to Callers (#35231)
    GH-35240 - [Go][FlightRPC] Fix crash in client middleware (#35241)
    GH-35266 - [GLib][Parquet] Fix a GC bug that parent metadata reference is missing in sub metadata (#35286)
    GH-35266 - [CI][GLib][Parquet] Omit gparquet_column_chunk_metadata_equal() test (#35278)
    GH-35267 - [C#] Serialize TotalBytes and TotalRecords in FlightInfo (#35222)
    GH-35270 - [C++] Use Buffer instead of raw buffer in hash join internals (#35347)
    GH-35297 - [C++][IPC] Fix schema deserialization of map field (#35298)
    GH-35306 - Fix Schema.Fields() to return copy of fields (#35307)
    GH-35310 - [Go] Incorrect value decimal128 from string (#35311)
    GH-35316 - [C++][FlightSQL] Use RowsToBatches() instead of ArrayFromJSON() in SQLite example server (#35322)
    GH-35326 - [Go] Fix *array.List and *array.LargeList ValueOffsets implementation (#35327)
    GH-35346 - [CI][Python] Move gdb from env-file to dockerfile (#35348)
    GH-35352 - [Java] Fix issues with “semi complex” types. (#35353)
    GH-35359 - [C++] FixedSizeListArray.flatten() errors if all elements are null (#35674)
    GH-35360 - [C++] Take offset into account in ScalarHashImpl::ArrayHash() (#35814)
    GH-35363 - [C++] Fix Substrait schema names and for segmented aggregation (#35364)
    GH-35379 - [C++][FlightRPC] Add teardown needed checks to avoid crash on error (#35380)
    GH-35383 - [C++] Prefer max_concurrency over executor capacity to avoid segmentation fault (#35384)
    GH-35406 - [Website][Docs] Missing logo on Arrow docs page
    GH-35413 - [Python] Add concrete floating point array types to pyarrow public API (#35414)
    GH-35421 - [Go] Ensure interface contract between array.X.ValueStr & array.XBuilder.AppendValueFromString (#35457)
    GH-35425 - [R] Tests failures on R < 4.0 due to data.frame conversion (#35432)
    GH-35438 - [Docs] Make corrections to the source docs (#35549)
    GH-35445 - [R] Behavior something like group_by(foo) |> across(everything()) is different from dplyr (#35473)
    GH-35448 - [C++] Fix detection of %z in strptime format (#35449)
    GH-35468 - [C++] Fix Acero var/std for multiple batches (#35469)
    GH-35483 - [CI][C++] Add header for snprintf for Windows (#35484)
    GH-35490 - [Python] Interchange protocol: update tests for string and large_string (#35504)
    GH-35501 - [C++] Fix error C2280 in MSVC (#35683)
    GH-35503 - [CI][Packaging][C++] Snappy patch fails to apply on arm64 windows wheel builds (#35509)
    GH-35521 - [C++] Hash null bitmap only if null count is 0 (#35522)
    GH-35526 - [CI][C++] Fixing arrow::internal::IsNullRunEndEncoded redeclared (#35527)
    GH-35528 - [Java] Fix RangeEqualsVisitor comparing BitVector with different begin index (#35525)
    GH-35534 - [R] Ensure missing grouping variables are added to the beginning of the variable list (#36305)
    GH-35539 - [C++] Remove use of internal header files from public header file (#35592)
    GH-35553 - [Java] Fix unwrap() in NettyArrowBuf (#35554)
    GH-35571 - [C++][CI][Parquet] Change EQ to FLOAT_EQ in Decryption tests (#35605)
    GH-35573 - [Python] pa.FixedShapeTensorArray.to_numpy_ndarray fails on sliced arrays (#36164)
    GH-35576 - [C++] Make Decimal{128,256}::FromReal more accurate (#35997)
    GH-35588 - [Java] returning a constant hashCode for null values, resolves #35588 (#35590)
    GH-35593 - [R] Confusing (NULL) results when using `[[` and `$` to try to extract columns from Datasets
    GH-35596 - [C++][CI] Improve compilation caching with PCG (#35597)
    GH-35599 - [Python] Canonical fixed-shape tensor extension array/type is not picklable. (#35933)
    GH-35606 - [CI][C++][MinGW32] Use more accurate float inputs for decimal test (#35680)
    GH-35617 - [Docs] Current n_buffers use in C API example (#35626)
    GH-35618 - [C++][Doc] Improve doc for Datum (#35794)
    GH-35633 - [R] R builds failing with error ‘Invalid: Timestamps already have a timezone: ‘UTC’. Cannot localize to ‘UTC’’ (#35671)
    GH-35635 - [C++][CI] Preserve root when ignoring host on PathFromUriHelper to fix HDFS tests (#36063)
    GH-35636 - [C++] Extract two expensive test suites from compute-vector-test (#36401)
    GH-35649 - [R] Always call RecordBatchReader::ReadNext() from DuckDB from the main R thread (#36307)
    GH-35651 - [C++] Suppress self-move warning introduced in gcc 13 (#36328)
    GH-35651 - [C++] Don’t use self-move with MinGW (#35653)
    GH-35662 - [CI][C++][MinGW] Avoid crash in FormatTwoDigits() with release build (#35663)
    GH-35665 - [C++][Parquet] DeltaLengthByteArrayEncoder::Put reserve too much space (#35670)
    GH-35675 - [C++] Don’t copy the ArraySpan into the REE ArraySpan (#35677)
    GH-35681 - [Ruby] Add support for #select_columns with empty table (#35682)
    GH-35684 - [Go][Parquet] Fix nil dereference with nil list array (#35690)
    GH-35710 - [R] Followup improvements to new configure script (#36435)
    GH-35712 - [C++][CI] MacOS Disable ASSERT_DEATH in arrow-array-test (#35724)
    GH-35728 - [CI][Python] Move test_total_bytes_allocated to a subprocess to improve reliability (#36355)
    GH-35733 - [Java] Fix minor type in IntervalMonthDayNanoVector ctor (#35734)
    GH-35736 - [C++] Fix compile key_map_avx2.cc (#35737)
    GH-35760 - [C++] C Data Interface helpers should also run checks in non-debug mode (#36215)
    GH-35761 - [Go] Fix map comparison in TypeEqual (#35762)
    GH-35763 - [Go] Fix TypeEqual for lists (#35764)
    GH-35789 - [C++] Remove check_overflow from CumulativeSumOptions (#35790)
    GH-35809 - [C#] Improvements to the C Data Interface (#35810)
    GH-35819 - [GLib][Ruby] Refer dependency objects of GArrowExecutePlan (#35963)
    GH-35833 - [C++] Add support for Abseil 20230125 (#35881)
    GH-35837 - [C++] Acero will hang if StopProducing is called while backpressure is applied on the source node (#35902)
    GH-35838 - [C++] Add backpressure test for asof join node (#35874)
    GH-35838 - [C++] Fix asof join backpressure (#35878)
    GH-35853 - [Python] Fix deprecation warnings from NumPy NEP50 (#35854)
    GH-35858 - [Python] Fixup linting from PR GH-36011 (#36046)
    GH-35858 - [Python] disallow none schema parquet writer (#36011)
    GH-35859 - [Python] Actually change the default row group size to 1Mi (#36012)
    GH-35866 - [Go] Provide a copy in arrow.NestedType.Fields() implementations (#35867)
    GH-35868 - [C++] Occasional TSAN failure on asof-join-node-test (#35904)
    GH-35869 - [R][Release] Undefined symbol _ZN5arrow6Status14AddContextLineEPKciS2_ on test-r-devdocs on maintenance branch for 12.0.1
    GH-35870 - [C++] Add support for changing optimization flags with CMAKE_CXX_FLAGS_DEBUG (#35924)
    GH-35891 - [Doc][Python] Update link to Parquet C++ repository (#35892)
    GH-35911 - [Go] Fix method CastToBytes of decimal256Traits (#35912)
    GH-35943 - [Dev] Ensure link issue works when PR body is empty (#36460)
    GH-35948 - [Go] Only cast int8 and uint8 to float64 when JSON marshaling arrays (#35950)
    GH-35952 - [R] Ensure that schema metadata can actually be set as a named character vector (#35954)
    GH-35960 - [Java] Detect overflow in allocation (#36185)
    GH-35965 - [Go] Fix Decimal256DictionaryBuilder (#35966)
    GH-35982 - [Go] Fix go1.18 broken builds (#35983)
    GH-35988 - [C#] The C data interface implementation can leak on import (#35996)
    GH-36003 - [Packaging][RPM] RPM jobs have a duplicated artifact pattern (#36004)
    GH-36013 - [C++] Disabling bundled OpenTelemetry with Protobuf 3.22+ (#36016)
    GH-36052 - [Go][Parquet] Cross build failures for 386 (#36066)
    GH-36053 - [C++] summarizing a variable results in NA at random, while there is no NA in the subset of data (#36368)
    GH-36076 - [C++] Remove deprecated cli flag (#36077)
    GH-36082 - [Release] Do nothing deb bump minor/patch version by post-11-bump-versions.sh on main (#36083)
    GH-36090 - [C++] Add testing libraries for Acero & Datasets (#36206)
    GH-36117 - [C++] Ensure creating BUILD_OUTPUT_ROOT_DIRECTORY (#36160)
    GH-36121 - [R] Warn for set_io_thread_count() with num_threads < 2 (#36304)
    GH-36168 - [C++][Python] Support halffloat for Arrow list to pandas (#35944)
    GH-36172 - [R] Windows devdocs build failing as it uses libarrow built without JSON capabilities (#36174)
    GH-36176 - [C++] Fix regression for single-key Table sorting (#36179)
    GH-36182 - [Gandiva][C++] Fix substring_index function when index is negative. (#36184)
    GH-36200 - [CI][Docs] Avoid “No space left on device” (#36230)
    GH-36201 - [Python][CI] test_total_bytes_allocated fails on arm64 wheels for manylinux
    GH-36209 - [Java] Upgrade Netty due to security vulnerability (#36211)
    GH-36214 - [C++] Specify FieldPath::Hash as template parameter where possible (#36222)
    GH-36224 - [CI] Update rest api invocations in GitHub scripts (#36225)
    GH-36239 - [CI][C++] Add support for multiple flags for ARROW__FLAGS_ (#36281)
    GH-36245 - [C++] Compile errors with gcc 13
    GH-36257 - [CI][Dev][Archery] bot requires pygithub 1.59.0 or later (#36467)
    GH-36259 - [R] Docs for as_schema description incorrect (#36260)
    GH-36311 - [C++] Fix integer overflows in utf8_slice_codeunits (#36575)
    GH-36327 - [C++][CI] Fix Valgrind failures (#36461)
    GH-36329 - [C++][CI] Use OpenSSL 3 on macOS (#36336)
    GH-36331 - [C++][CI] Sporadic errors in AsofJoinTest (#36356)
    GH-36340 - [Java] Address race condition in allocator logger thread (#36341)
    GH-36346 - [C++] Safe S3 finalization (#36442)
    GH-36349 - [Python][CI] Avoid using ‘build/etc/localtime’ timezone in hypothesis tests (#36391)
    GH-36352 - [Python] Add project_id to GcsFileSystem options (#36376)
    GH-36353 - [R] Fix package version references to be text only and never numeric (#36364)
    GH-36369 - [C++][FlightRPC] Fix a hang bug in FlightClient::Authenticate*() (#36372)
    GH-36396 - [R] Non-existent functions called in array tests (#36397)
    GH-36404 - [CI][C++][Gandiva] Crash tests for JNI build on arm64 macOS
    GH-36446 - [C++] Minor style improvements in ConcatenateImpl (#36463)
    GH-36447 - [C++][CI] arrow-s3fs-test fails on some nightly jobs
    GH-36448 - [C++][CI] vcpkg nightly job fails to build scalar_test.cc
    GH-36449 - [C++][CI] Don’t use -g1 for Python jobs (#36453)
    GH-36451 - [CI][C++] Fix compilation failure on Fedora 35 (#36457)
    GH-36452 - [CI][C++] Test C++20 support with compatible compiler (#36454)
    GH-36456 - [R] Link to correct version of OpenSSL when using autobrew (#36551)
    GH-36475 - [C++][CI] Fix Flight feature verification (#36473)
    GH-36476 - [C++][FlightRPC] Fix uninitialized fields in FlightInfo (#36484)
    GH-36477 - [CI][macOS] Ignore brew update failure on crossbow tasks (#36478)
    GH-36482 - [C++][CI] Fix sporadic test failures in AsofJoinBasicTest (#36499)
    GH-36498 - [Python][CI] Hypothesis nightly test fails with pytz.exceptions.UnknownTimeZoneError: ‘Factory’ (#36508)
    GH-36500 - [CI][Java][JAR] Remove Homebrew’s protobuf (#36515)
    GH-36501 - [CI][Java][JAR] Ensure removing Homebrew’s gRPC packages (#36516)
    GH-36523 - [C++] Fix TSan-detected lock ordering issues in S3 (#36536)
    GH-36524 - [GLib] Suppress a pessimizing-move warning (#36531)
    GH-36537 - [Python] Ensure dataset writer follows default Parquet version of 2.6 (#36538)
    GH-36543 - [CI][Docs] Use -g1 instead of -g for building docs (#36576)
    GH-36598 - [C++][MinGW] Fix build failure with Protobuf 23.4 (#36606)
    GH-36629 - [CI][Python] Skip dask tests due to our non-nanosecond changes in arrow->pandas conversion (#36630)
    GH-36641 - [C++] Remove reference to acero from non-acero file (#36650)
    GH-36659 - [Python] Fix pyarrow.dataset.Partitioning.__eq__ when comparing with other type (#36661)
    GH-36669 - [Go] Guard against garbage in C Data structures (#36670)
    GH-36686 - [C++] Pass CMAKE_OSX_SYSROOT to external projects (#36706)
    GH-36687 - [R] Add correct branch name to autobrew formulae to facilitate local testing (#36689)
    GH-36707 - [C++] Use ARROW_PACKAGE_PREFIX for OPENSSL_ROOT_DIR too (#36710)
    GH-36812 - [C#] Fix C API support to work with .NET desktop framework (#36813)
    GH-36832 - [Packaging][RPM] Remove needless Requires (#36833)
    GH-36892 - [C++] Fix performance regressions in FieldPath::Get (#37032)
    GH-36913 - [C++] Skip empty buffer concatenation to fix UBSan error (#36914)
    GH-36928 - [Java] Make it run well with the netty newest version 4.1.96 (#36926)
    GH-36969 - [R] Disable GCS by default when doing a bundled build on gcc-13 (#37147)
    GH-37019 - [R] Documentation for read_parquet() et al needs updating (#37020)
    GH-37197 - [Java][CI][Packaging] Free some disk space on the java-jars GitHub job (#37198)
    GH-37201 - [CI][Packaging][Java] java-jars job fail on macOS aarch_64

New Features and Improvements
    GH-14790 - [Dev] Avoid extra comment with Closes issue id on PRs (#35811)
    GH-14946 - [C++] Add flattening FieldPath/FieldRef::Get methods (#35197)
    GH-15187 - [Java] Made reader initialization lazy and added new getTransferPair() function that takes in a Field type (#34424)
    GH-18547 - [Java] Support re-emitting dictionaries in ArrowStreamWriter (#35920)
    GH-20047 - [MATLAB] Enable GitHub Actions CI for MATLAB Interface on Windows (#35792)
    GH-21761 - [Python] accept pyarrow scalars in array constructor (#36162)
    GH-26153 - [C++] Share common codes for RecordBatchStreamReader and StreamDecoder (#36344)
    GH-29781 - [C++][Parquet] Switch to use compliant nested types by default (#35146)
    GH-29887 - [C++] Implement dictionary array sorting (#35280)
    GH-31521 - [C++][Flight] Migrate Flight SQL client to Result (#36559)
    GH-32190 - [C++][Compute] Implement cumulative prod, max and min functions (#36020)
    GH-32282 - [R] Update case_when() binding to match changes in dplyr (#35502)
    GH-32335 - [C++][Docs] Add design document for Acero (#35320)
    GH-32605 - [C#] Extend validity buffer api (#35342)
    GH-32605 - [C#] Extend ArrowBuffer.BitmapBuilder to improve performance of array concatenation (#13810)
    GH-32739 - [CI][Docs] Document Docs PR Preview (#35614)
    GH-32763 - [C++] Add FromProto for fetch & sort (#34651)
    GH-33206 - [C++] Add support for StructArray sorting and nested sort keys (#35727)
    GH-33321 - [Python] Support converting to non-nano datetime64 for pandas >= 2.0 (#35656)
    GH-33517 - [C++][Flight] Exercise UCX on CI (#14667)
    GH-33804 - [Python] Add support for manylinux_2_28 wheel (#34818)
    GH-33854 - [MATLAB] Add basic libmexclass integration code to MATLAB interface (#34563)
    GH-33856 - [C#] Implement C Data Interface for C# (#35496)
    GH-33980 - [Docs][Python] Document DataFrame Interchange Protocol implementation and usage (#35835)
    GH-33987 - [R] Support new dplyr .by/by argument (#35667)
    GH-34216 - [Python] Support for reading JSON Datasets With Python (#34586)
    GH-34223 - [Java] Java Substrait Consumer JNI call to ACERO C++ (#34227)
    GH-34375 - [C++][Parquet] Ignore page header stats when page index enabled (#35455)
    GH-34386 - [C++] Add a PathFromUriOrPath method (#34420)
    GH-34436 - [R] Bindings for JSON Dataset (#35055)
    GH-34509 - [C++][Parquet] Improve docstrings for ArrowReaderProperties::batch_size (#36486)
    GH-34722 - [C++][Parquet] Minor: Update wording of Parquet NextPage (#35368)
    GH-34729 - [C++][Python] Enhanced Arrow<->Pandas map/pydict support (#34730)
    GH-34749 - [Java] Make Zstd compression level configurable (#34873)
    GH-34787 - [Python] Accept zero_copy_only=False for ChunkedArray.to_numpy (#35582)
    GH-34788 - [Python][Packaging][CI] Drop Python 3.7 support (#36061)
    GH-34852 - [C++][Go][Java][FlightRPC] Add support for ordered data (#35178)
    GH-34858 - [Swift] Initial reader impl (#34842)
    GH-34868 - [Python] Share docstrings between classes (#34894)
    GH-34911 - [C++] Add first and last aggregator (#34912)
    GH-34918 - [C++] Update vendored double-conversion 3.2.1 (#34919)
    GH-34921 - [C++][Python][Java] Require CMake 3.16 or later (#35921)
    GH-34949 - [C++][Parquet] Enable page index by columns (#35230)
    GH-34971 - [Format] Add non-CPU version of C Data Interface (#34972)
    GH-34979 - [Python] Create a base class for Table and RecordBatch (#34980)
    GH-35004 - [C++] Remove RelationInfo (#35005)
    GH-35033 - [Java][Datasets] Add support for multi-file datasets from Java (#35034)
    GH-35035 - [R] Implement names<- for Schemas (#35172)
    GH-35067 - [JavaScript] toString for signed BigNums (#35067)
    GH-35084 - [Docs][Format] Add how to change format specification (#35174)
    GH-35099 - [CI][Packaging] Upgrade vcpkg to 2023.04.15 Release (#35430)
    GH-35112 - [Python] Expose keys_sorted in python MapType (#35113)
    GH-35124 - [C++] Avoid unnecessary copy when outputting join result (#35114)
    GH-35125 - [C++][Acero] Add a self-defined io-executor in QueryOptions (#35464)
    GH-35130 - [Docs] Document how to become a collaborator to get triage role (#36445)
    GH-35134 - [C++] Add arrow_vendored namespace around double-conversion library (#35135)
    GH-35136 - [Go][FlightSQL] Support backends without CreatePreparedStatement implemented (#35137)
    GH-35162 - [Go] Float16 arithmetic (#35163)
    GH-35164 - [Go] Additional methods for decimal data types (#35165)
    GH-35168 - [CI][Packaging][Conan] Merge upstream changes (#35169)
    GH-35171 - [C++][Parquet] Implement CRC for data page v2 (#35242)
    GH-35180 - [R] Implement bindings for cumsum function (#35339)
    GH-35212 - [Go] Add ability to show full call stack with ARROW_CHECKED_MAX_RETAINED_FRAMES (#35215)
    GH-35228 - [C++][Parquet] Minor: Comment typo fixing in Parquet Reader (#35229)
    GH-35245 - [Java][Dataset][Linux] Enable GCS (#35246)
    GH-35247 - [C++] Add Arrow Substrait support for stddev/variance (#35249)
    GH-35250 - [Python] Add test for datetime column conversion to pandas (#35546)
    GH-35256 - [Go] Add ToMap to Metadata (#35257)
    GH-35264 - [Python] Interchange protocol: test clean-up (#35530)
    GH-35275 - [Java] Ensure VectorSchemaRoot slice returns a new root (#35476)
    GH-35279 - [C++][Parquet] Tools: enhancement Parquet print stats (#35262)
    GH-35282 - [C++] auto enable brotli when enable fuzzing (#35283)
    GH-35290 - [JS] Update dependencies (#35291)
    GH-35302 - [Go] Improve unsupported type error message in pqarrow (#35303)
    GH-35304 - [C++][ORC] Support attributes conversion (#35499)
    GH-35315 - [C++][CMake] Add presets for Flight SQL (#35317)
    GH-35335 - [Python][Docs] Fix docstring of map_ (#35336)
    GH-35361 - [C++] Remove Perl dependency from cpp/build-support/run-test.sh (#35362)
    GH-35375 - [C++][FlightRPC] Add arrow::flight::ServerCallContext::incoming_headers() (#35376)
    GH-35377 - [C++][FlightRPC] Add a ServerCallContext parameter to arrow::flight::ServerAuthHandler methods (#35378)
    GH-35390 - [Python] Consolidate some APIs in Table and RecordBatch (#35396)
    GH-35400 - [R] Import download.file from utils (#35401)
    GH-35403 - [Docs] Support sphinx 6 for building the docs (#36296)
    GH-35411 - [MATLAB] Create a templated C++ Proxy Class for Numeric Arrays (#35479)
    GH-35415 - [Python] RecordBatch string representation includes column preview (#35416)
    GH-35417 - [GLib] Add GArrowRunEndEncodedDataType (#36444)
    GH-35418 - [GLib] Add GArrowRunEndEncodedArray (#36470)
    GH-35435 - [Ruby][Flight] Add ArrowFlight::Client#authenticate_basic (#35436)
    GH-35442 - [C++][FlightRPC] Pass ServerCallContext instead of CallHeaders to ServerMiddlewareFactory::StartCall() (#35454)
    GH-35480 - [MATLAB] Add abstract MATLAB base class called arrow.array.Array (#35491)
    GH-35482 - [Go] Append nulls to values in array.FixedSizeListBuilder.AppendNull (#35481)
    GH-35485 - [CI][Python] Archery formats Python C++ codebase (#35487)
    GH-35489 - [MATLAB] Add CMake build directory to MATLAB .gitignore (#35493)
    GH-35492 - [MATLAB] Add arrow.array.Float32Array MATLAB Class (#35495)
    GH-35500 - [C++][Go][Java][FlightRPC] Add support for result set expiration (#36009)
    GH-35506 - [C++] Support First and Last aggregators in Substrait (#35513)
    GH-35511 - [C++] Util: add memory_pool in SwapEndianArrayData (#36431)
    GH-35515 - [C++][Python] Add non decomposable aggregation UDF (#35514)
    GH-35516 - [R] Add 11.0.0.3 to backwards compatibility matrix (#35517)
    GH-35537 - [MATLAB] Create shared test class utility for numeric arrays (#35556)
    GH-35542 - [R] Implement schema extraction function (#35543)
    GH-35545 - [R] Re-organise reference page on pkgdown site (#36171)
    GH-35550 - [MATLAB] Add public toMATLAB method to arrow.array.Array for converting to MATLAB types (#35551)
    GH-35557 - [MATLAB] Add unsigned integer array MATLAB classes (i.e. UInt8Array, UInt16Array, UInt32Array, UInt64Array) (#35562)
    GH-35558 - [MATLAB] Add signed integer array MATLAB classes (i.e. Int8Array, Int16Array, Int32Array, Int64Array) (#35561)
    GH-35579 - [C++] Support non-named FieldRefs in Parquet scanner (#35798)
    GH-35598 - [MATLAB] Add a public Valid property to the MATLAB arrow.array.<Array> classes to query Null values (i.e. validity bitmap support) (#35655)
    GH-35601 - [R][Documentation] Add missing docs to filesystem.R (#35895)
    GH-35607 - [C++] Support simple Substrait aggregate extensions (#35608)
    GH-35609 - [Docs] Enable the build of subsections of the documentation (#35610)
    GH-35611 - [C++] Remove unnecessary safe operations for ListBuilder and BinaryBuilder (#35613)
    GH-35652 - [Go][Compute] Allow executing Substrait Expressions using Go Compute (#35654)
    GH-35659 - [Swift] Initial Swift IPC writer (#35660)
    GH-35669 - [C++] Update to double-conversion 3.3.0, activate new flags, remove patches (#36002)
    GH-35676 - [MATLAB] Add an InferNulls name-value pair for controlling null value inference during construction of arrow.array.Array (#35827)
    GH-35686 - [Go] Add AppendTime to TimestampBuilder (#35687)
    GH-35693 - [MATLAB] Add Valid as a name-value pair on the arrow.array.Float64Array constructor (#35977)
    GH-35705 - [R] Rename docs page from acero (#36107)
    GH-35706 - [CI] Set minimal permissions on pr_review_trigger.yml (#35708)
    GH-35709 - [R][Documentation] Document passing data to duckdb for windowed aggregates (#35882)
    GH-35711 - [Go] Add Value and GetValueIndex methods to some builders (#35744)
    GH-35729 - [C++][Parquet] Implement batch interface for BloomFilter in Parquet (#35731)
    GH-35746 - [Parquet][C++][Python] Switch default Parquet version to 2.6 (#36137)
    GH-35749 - [C++] Handle run-end encoded filters in compute kernels (#35750)
    GH-35752 - [CI][GLib][Ruby] Pass GITHUB_ACTIONS environment variable to Docker containers (#35753)
    GH-35754 - [CI][GLib] Don’t build static C++ libraries (#35755)
    GH-35757 - [C++][Parquet] using page-encoding-stats to build encodings (#35758)
    GH-35765 - [C++] Split vector_selection.cc into more compilation units (#35751)
    GH-35779 - [R][Documentation] Document workaround for window-like functionality (#35702)
    GH-35783 - [JS] Update dependencies (#35784)
    GH-35786 - [C++] Add pairwise_diff function (#35787)
    GH-35788 - [Swift] bug fixes and change reader/writer to use Result type (#35774)
    GH-35803 - [Doc] Add columns to the Implementation Status tables for Swift (#35862)
    GH-35817 - [Docs][C++] Fix value_counts/unique doc about null handling (#35818)
    GH-35828 - [Go] Add array.WithUnorderedMapKeys option for array.ApproxEqual (#35823)
    GH-35847 - [C++][Thirdparty] Bump xxhash version to v0.8.1 (#35849)
    GH-35871 - [Go] Account for struct validity bitmap in array.ApproxEqual (#35872)
    GH-35879 - [C++] Bump bundled google-cloud-cpp to 2.12.0 (#36119)
    GH-35906 - [Docs] Enable building the documentation without having pyarrow installed (#35907)
    GH-35909 - [Go] Deprecate arrow.MapType.ValueField & arrow.MapType.ValueType methods (#35899)
    GH-35914 - [MATLAB] Integrate the latest libmexclass changes to support error-handling (#35918)
    GH-35915 - [Ruby] Add support for converting function options from Hash automatically (#35927)
    GH-35922 - [C++] Drop support for Debian GNU/Linux buster (10) (#35923)
    GH-35926 - [C++][Parquet] Allow disabling ColumnIndex by disabling statistics (#35958)
    GH-35935 - [C++] Clean interruption of an Acero plan with use_threads=false (#35953)
    GH-35949 - [R] CSV File reader options class objects should print the selected values (#35955)
    GH-35961 - [C++][FlightSQL] Accept Protobuf 3.12.0 or later (#35962)
    GH-35969 - [Swift] use ArrowType instead of ArrowType.info and add binary, time32 and time64 types (#35985)
    GH-35974 - [Go] Don’t panic if importing C Array Stream fails (#35978)
    GH-35975 - [Go] Support importing decimal256 (#35981)
    GH-35979 - [C++] Refactor Acero scalar and hash aggregation into separate files (#35980)
    GH-35984 - [MATLAB] Add null support to all numeric array classes (#36039)
    GH-35987 - [C++] Unpin brew protobuf version (#36087)
    GH-35987 - [C++] Pin brew protobuf version to 21 (#36029)
    GH-35990 - [CI][C++][Windows] Don’t use -l for “choco list” (#35991)
    GH-36006 - [Packaging][RPM] Add support for Amazon Linux 2023 (#36081)
    GH-36008 - [Ruby][Parquet] Add Parquet::ArrowFileReader#each_row_group (#36022)
    GH-36014 - [Go] Allow duplicate field names in structs (#36015)
    GH-36023 - [CI][Ruby][Release] Suppress meaningless progress log from verify-rc-ruby (#36024)
    GH-36025 - [JS] Allow Node.js 18.14 or later in verify-release-candidate.sh (#36089)
    GH-36031 - [JS] Update dependencies (#36032)
    GH-36033 - [JS] Remove BigInt compat (#36034)
    GH-36038 - [Python] Implement __reduce__ on ExtensionType class (#36170)
    GH-36040 - [MATLAB] Add arrow.array.BooleanArray class (#36041)
    GH-36045 - [Python] Improve usability of pc.map_lookup / MapLookupOptions (#36387)
    GH-36047 - [C++][Compute] Add support for duration types to IndexIn and IsIn (#36058)
    GH-36050 - [Docs][C] Fix memory leak in C export documentation (#36051)
    GH-36055 - [JS] Use Node.js 18 in CI (#36147)
    GH-36056 - [CI] Enable Dependabot for GitHub Actions (#36194)
    GH-36059 - [C++][Compute] Reserve space for hashtable for scalar lookup functions (#36067)
    GH-36070 - [Go][Flight] Add Flight Client Cookie Middleware (#36071)
    GH-36072 - [MATLAB] Add MATLAB arrow.tabular.RecordBatch class (#36190)
    GH-36074 - [C++] Clarify docs for ConcatenateTablesOptions::field_merge_options (#36075)
    GH-36092 - [C++] Simplify concurrency in as-of-join node (#36094)
    GH-36095 - [Go] Add doc for pqarrow.FileWriter.WriteBuffered (#36163)
    GH-36096 - [Python] Call from_arrow in Array.to_pandas (#36314)
    GH-36098 - [MATLAB] Change C++ proxy constructors to accept an options struct instead of a cell array containing the arguments (#36108)
    GH-36105 - [Go] Support float16 in csv (#36106)
    GH-36109 - [MATLAB] Store a nullptr as the validity bitmap if all array elements are valid (#36114)
    GH-36120 - [C#] Support schema metadata through the C API (#36122)
    GH-36128 - [C++][Compute] Allow multiplication between duration and all integer types (#36231)
    GH-36129 - [Python] Consolidate common APIs in Table and RecordBatch (#36130)
    GH-36131 - [Docs] Use https://arrow.apache.org/julia/ for Julia URL (#36156)
    GH-36141 - [Go] Support large and fixed types in csv (#36142)
    GH-36151 - [Java] Add volatile declaration to keyPosition in ParallelSearcher (#36152)
    GH-36157 - [C++][Dev] Add support for using python3 to run IWYU (#36159)
    GH-36166 - [C++][MATLAB] Add utility to convert UTF-8 strings to UTF-16 and UTF-16 strings to UTF-8 (#36167)
    GH-36173 - [C++] Add lone high and low code-point test case for UTF8StringToUTF16 (#36383)
    GH-36177 - [MATLAB] Add the Type object hierarchy to the MATLAB interface (#36210)
    GH-36178 - [C++] support prefetching for ReadRangeCache lazy mode (#36180)
    GH-36181 - [Go] add methods AppendNulls and AppendEmptyValues for all builders (#36145)
    GH-36198 - [Go] Remove deprecated equality checks (#36169)
    GH-36203 - [C++] Support casting in both ways for is_in and index_in (#36204)
    GH-36207 - [MATLAB] Add MATLAB autosave files (.asv) to the .gitignore (#36208)
    GH-36212 - [MATLAB] Update README.md to mention support for arrow.array.Array classes (#36213)
    GH-36217 - [MATLAB] Add arrow.array.TimestampArray (#36333)
    GH-36218 - [CI][Go] Run benchmark steps only on the main branch (#36229)
    GH-36218 - [CI][Go] Run benchmark steps only on the main branch (#36219)
    GH-36220 - [CI] Run the “Docker Push” step only on the main branch (#36221)
    GH-36227 - [C++] New GcsOption to set the project id (#36228)
    GH-36232 - [Packaging][Ubuntu] Drop support for Ubuntu 22.10 (kinetic) (#36237)
    GH-36233 - [Packaging][Ubuntu] Add support for Ubuntu 23.04 (lunar) (#36238)
    GH-36234 - [Packaging][Debian] Add support for Debian GNU/Linux trixie (13) (#36285)
    GH-36241 - [Packaging] Drop support for Amazon Linux 2 (#36282)
    GH-36243 - [Dev] Remove PR workflow label as part of merge (#36244)
    GH-36249 - [MATLAB] Create a MATLAB_ASSIGN_OR_ERROR macro to mirror the C++ ARROW_ASSIGN_OR_RAISE macro (#36273)
    GH-36250 - [MATLAB] Add arrow.array.StringArray class (#36366)
    GH-36251 - [MATLAB] Add Type property to arrow.array.Array (#36270)
    GH-36252 - [Python] Add non decomposable hash aggregate UDF (#36253)
    GH-36255 - [C++] Add benchmarks for “if_else” kernel on lists (#36256)
    GH-36264 - [R] Add scalar() function (#36265)
    GH-36271 - [R] Split out R6 classes and convenience functions (#36394)
    GH-36284 - [Python][Parquet] Support write page index in Python API (#36290)
    GH-36287 - [Ruby] Add support for installing arrow-c-glib conda package automatically (#36288)
    GH-36293 - [C++] Use ipc_write_options.memory_pool for compressed buffer and shrink after compression (#36294)
    GH-36297 - [C++][Parquet] Benchmark for non-binary dict encoding (#36298)
    GH-36299 - [R][CI] Remove pkgdown check CI step (#36300)
    GH-36309 - [C++] Add ability to cast between scalars of list-like types (#36310)
    GH-36317 - [C++] Return a BufferVector from CleanListOffsets (#36316)
    GH-36319 - [Go][Parquet] Improved row group writer error messages (#36320)
    GH-36337 - [Ruby] Relax required Apache Arrow C++ version (#36338)
    GH-36342 - [C++] Add missing move semantic to RecordBatch (#36343)
    GH-36345 - [C++] Prefer TypeError over Invalid in IsIn and IndexIn kernels (#36358)
    GH-36359 - [MATLAB] Add support for Timestamp arrays to RecordBatch (#36361)
    GH-36367 - [C++] Add a zipped range utility (#36393)
    GH-36375 - [Java] Added creating MapWriter in ComplexWriter. (#36351)
    GH-36380 - [R] Create convenience function arrow_array (#36381)
    GH-36384 - [Go] Schema: NumFields (#36365)
    GH-36402 - [CI][macOS] Ignore brew update failure (#36403)
    GH-36405 - [C++][ORC] Upgrade ORC to 1.9.0 (#36406)
    GH-36407 - [C++] Add arrow::ipc::Listener::OnSchemaDecoded(schema, filtered_schema) (#36533)
    GH-36408 - [GLib][FlightSQL] Add support for INSERT/UPDATE/DELETE (#36409)
    GH-36414 - [C++] Add missing type_traits.h predicate: is_var_length_list() (#36415)
    GH-36421 - [Java] Enable Support for reading JSON Datasets (#36422)
    GH-36423 - [C++][Compute] Support “or” in Expression::IsSatisfiable (#36424)
    GH-36450 - [CI][Python] Upload wheel artifacts for Windows (#36466)
    GH-36479 - [C++][FlightRPC] Use gRPC version detected by find_package() (#36581)
    GH-36483 - [C++] Make UTF8StringToUTF16 and UTF16StringToUTF8 accept string_views (#36485)
    GH-36492 - [CI][Python] Add Ubuntu 22.04 nightly build (#36480)
    GH-36513 - [Dev][C#] Add Dependabot configuration for NuGet (#36514)
    GH-36541 - [Python][CI] Fixup nopandas build after merge of GH-33321 (#36586)
    GH-36541 - [Python][CI] Ensure the “Without pandas” CI build has no pandas installed (don’t install doc requirements in conda-python image) (#36542)
    GH-36544 - [Swift] Add/change some init methods to public access (#36545)
    GH-36553 - [Python] Improve error message if certain submodule (cython or cpp) is not built (#36554)
    GH-36556 - [CI][C++] Enable S3 in Valgrind build (#36579)
    GH-36560 - [MATLAB] Remove the DeepCopy name-value pair from arrow.array.<Numeric>Array constructors (#36561)
    GH-36568 - [Go] Include Timestamp Zone in ValueStr (#36569)
    GH-36577 - [Dev][C#] Use version-update:semver-major for some packages (#36578)
    GH-36582 - [CI][C++][Homebrew] Backport the latest formula changes (#36583)
    GH-36599 - [MATLAB] Bump libmexclass version to 3465900 (#36600)
    GH-36744 - [Python][Packaging] Add upper pin for cython<3 to pyarrow build dependencies (#36743)
    GH-36746 - [R] Update NEWS.md for 12.0.1.1 release (#36747)
    GH-36756 - [CI][Python] Install Cython < 3.0 on verify-release-candidate script (#36757)
    GH-36805 - [R] Update NEWS.md for 13.0.0 (#36806)
    GH-36839 - [CI][Docs] Update test-ubuntu-default-docs to use GitHub actions instead of Azure (#36840)
    GH-36947 - [CI] Move free up disk space to the Jinja macros to be able to reuse it on docs job (#36948)
    PARQUET-2316 - [C++] Allow partial PreBuffer in the parquet FileReader (#36192)
    PARQUET-2323 - [C++] Use bitmap to store pre-buffered column chunks (#36649)

To see a diff of this commit:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=2676f4f19e044a7bcc1482c6b432ee09600703a7

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

diffstat:
 apache-arrow/PLIST      | 24 +++++++++++++-----------
 apache-arrow/distinfo   |  6 +++---
 apache-arrow/version.mk |  2 +-
 3 files changed, 17 insertions(+), 15 deletions(-)

diffs:
diff --git a/apache-arrow/PLIST b/apache-arrow/PLIST
index c629b7b50b..397e3f3f9e 100644
--- a/apache-arrow/PLIST
+++ b/apache-arrow/PLIST
@@ -3,11 +3,11 @@ bin/arrow-file-to-stream
 bin/arrow-stream-to-file
 include/arrow/acero/accumulation_queue.h
 include/arrow/acero/aggregate_node.h
+include/arrow/acero/api.h
 include/arrow/acero/asof_join_node.h
 include/arrow/acero/benchmark_util.h
 include/arrow/acero/bloom_filter.h
 include/arrow/acero/exec_plan.h
-include/arrow/acero/groupby.h
 include/arrow/acero/hash_join.h
 include/arrow/acero/hash_join_dict.h
 include/arrow/acero/hash_join_node.h
@@ -304,9 +304,11 @@ include/arrow/vendored/double-conversion/bignum.h
 include/arrow/vendored/double-conversion/cached-powers.h
 include/arrow/vendored/double-conversion/diy-fp.h
 include/arrow/vendored/double-conversion/double-conversion.h
+include/arrow/vendored/double-conversion/double-to-string.h
 include/arrow/vendored/double-conversion/fast-dtoa.h
 include/arrow/vendored/double-conversion/fixed-dtoa.h
 include/arrow/vendored/double-conversion/ieee.h
+include/arrow/vendored/double-conversion/string-to-double.h
 include/arrow/vendored/double-conversion/strtod.h
 include/arrow/vendored/double-conversion/utils.h
 include/arrow/vendored/pcg/pcg_extras.hpp
@@ -407,21 +409,21 @@ lib/cmake/Parquet/ParquetTargets-release.cmake
 lib/cmake/Parquet/ParquetTargets.cmake
 lib/libarrow.a
 lib/libarrow.so
-lib/libarrow.so.1200
-lib/libarrow.so.1200.1.0
+lib/libarrow.so.1300
+lib/libarrow.so.1300.0.0
 lib/libarrow_acero.a
 lib/libarrow_acero.so
-lib/libarrow_acero.so.1200
-lib/libarrow_acero.so.1200.1.0
+lib/libarrow_acero.so.1300
+lib/libarrow_acero.so.1300.0.0
 lib/libarrow_bundled_dependencies.a
 lib/libarrow_dataset.a
 lib/libarrow_dataset.so
-lib/libarrow_dataset.so.1200
-lib/libarrow_dataset.so.1200.1.0
+lib/libarrow_dataset.so.1300
+lib/libarrow_dataset.so.1300.0.0
 lib/libparquet.a
 lib/libparquet.so
-lib/libparquet.so.1200
-lib/libparquet.so.1200.1.0
+lib/libparquet.so.1300
+lib/libparquet.so.1300.0.0
 lib/pkgconfig/arrow-acero.pc
 lib/pkgconfig/arrow-compute.pc
 lib/pkgconfig/arrow-csv.pc
@@ -431,8 +433,8 @@ lib/pkgconfig/arrow-json.pc
 lib/pkgconfig/arrow.pc
 lib/pkgconfig/parquet.pc
 share/arrow/gdb/gdb_arrow.py
-share/arrow/gdb/libarrow.so.1200.1.0-gdb.py
+share/arrow/gdb/libarrow.so.1300.0.0-gdb.py
 share/doc/arrow/LICENSE.txt
 share/doc/arrow/NOTICE.txt
 share/doc/arrow/README.md
-@pkgdir share/gdb/auto-load/home/matthew/pkgsrc/install.20230703/lib
+@pkgdir share/gdb/auto-load/home/matthew/pkgsrc/install.20230721/lib
diff --git a/apache-arrow/distinfo b/apache-arrow/distinfo
index 2c52406319..ef81df277e 100644
--- a/apache-arrow/distinfo
+++ b/apache-arrow/distinfo
@@ -3,9 +3,9 @@ $NetBSD$
 BLAKE2s (9.0.1.tar.gz) = a785e1ad5fd5df76c95e7cf9a6eadeb86ffbc46ea4342f49f19381434bd0f78c
 SHA512 (9.0.1.tar.gz) = ed56287f608ccdf5bc5d5fc2918e313e7c4cecdd9ef2c9993a72ea900d9ff662c57ac5326c7a809eb11505c6f39d4599f3f161b97b6e03c65783b824b8d700d2
 Size (9.0.1.tar.gz) = 215065 bytes
-BLAKE2s (apache-arrow-12.0.1.tar.gz) = 9ae10c69fabfaba3e676eff06df7112a3c101d093673dfe01580daf90ed6359e
-SHA512 (apache-arrow-12.0.1.tar.gz) = 551ae200551fcc73b7deddcc5f0b06633159ab1308506901a9086e4e2e34e4437f26d609fdbacba0ebe7d1fe83bdb8e92a268e9e41575d655d5b2d4fbef7a7ce
-Size (apache-arrow-12.0.1.tar.gz) = 20172604 bytes
+BLAKE2s (apache-arrow-13.0.0.tar.gz) = aa96f6ef2b0298fc299d5f37eab5ad4122843b5d036cca9762289f92fc885f25
+SHA512 (apache-arrow-13.0.0.tar.gz) = 3314d79ef20ac2cfc63f2c16fafb30c3f6187c10c6f5ea6ff036f6db766621d7c65401d85bf1e979bd0ecf831fbb0a785467642792d6bf77016f9807243c064e
+Size (apache-arrow-13.0.0.tar.gz) = 20542669 bytes
 BLAKE2s (jemalloc-5.3.0.tar.bz2) = 285e6145b9d3b575b1ec5cfdae8af40b461149085f001839d64685c0d56e2689
 SHA512 (jemalloc-5.3.0.tar.bz2) = 22907bb052096e2caffb6e4e23548aecc5cc9283dce476896a2b1127eee64170e3562fa2e7db9571298814a7a2c7df6e8d1fbe152bd3f3b0c1abec22a2de34b1
 Size (jemalloc-5.3.0.tar.bz2) = 736023 bytes
diff --git a/apache-arrow/version.mk b/apache-arrow/version.mk
index 962ec91acf..2193ba8323 100644
--- a/apache-arrow/version.mk
+++ b/apache-arrow/version.mk
@@ -1,2 +1,2 @@
 # $NetBSD$
-APACHE_ARROW_VERSION=	12.0.1
+APACHE_ARROW_VERSION=	13.0.0


