TensorFlow changelog


Hey there, code wranglers! We've been busy optimizing, fixing, and adding some cool new features to our codebase. Here's the latest scoop on what's new, improved, and bug-fixed. 🚀

  • New feature: We've added the GetDefaultLayout API to the IFRT Proxy, allowing you to easily retrieve default layouts for specified data types, dimensions, devices, and memory kinds. This is a big win for optimizing data placement and access patterns! 🎉

  • Improvement: Reinstated support for cuDNN explicit CUDA graph construction in the GPU backend, thanks to the release of cuDNN frontend v1.11.0. This enhancement is crucial for boosting performance in deep learning apps. 💪

  • New feature: Say hello to collect_symlink_data_aspect, a nifty addition for hunting down symlinked files in target runfiles. This makes file management in the build process more robust and efficient. 🔍

  • New feature: We've added a "copy" button for the full HLO instruction text format in HTML outputs. Now, you can easily copy HLO instruction text directly from the rendered output. Handy, right? 🖱️

  • New feature: Introducing IOPDDL utilities to XLA Auto Sharding's third-party directory. These tools are essential for tackling optimization problems and evaluating solutions. 🛠️

  • New feature: Simplified the ComputeAndCompareLiteral function with an overload that doesn't require an error_spec. This makes testing a breeze! 🌬️

  • Improvement: Enhanced the HLO diff tool to better visualize repetitive computation patterns. This makes it easier to spot and analyze patterns in computation differences. 🔍

  • Bugfix: Addressed a concurrency issue in GPU compiler tests by mutex-guarding the default_device_assignment_ pointer. No more race conditions here! 🏎️💨

  • Bugfix: Fixed undefined behavior in PJRT by correcting how pointers are cast between unrelated types. Safety first! 🚦

  • Bugfix: Improved the conversion of HLO to StableHLO for programs with bounded dynamism. Now, the conversion process handles these programs more robustly. 🔄

  • Improvement: Integrated updates from LLVM, aligning with the latest changes and enhancing TensorFlow's capabilities and performance. ⚙️

  • Chore: We've moved tensorflow/lite/experimental/litert to the google-ai-edge/litert repository, streamlining the codebase for better organization. 📦

That's all for now, folks! Keep coding and stay awesome! 😎

Included Commits

2025-04-04T02:59:28 See commit

The recent commit introduces a new API method, GetDefaultLayout, to the IFRT Proxy within the XLA (Accelerated Linear Algebra) library. This addition involves modifications to several files, including the client and server components of the IFRT Proxy. The GetDefaultLayout method allows users to retrieve the default layout for a specified data type, dimensions, device, and memory kind. The implementation includes creating a request with the necessary parameters, invoking the corresponding RPC method, and handling the response to deserialize the layout.

In addition to the API implementation, the commit also includes updates to the protocol buffer definitions to accommodate the new request and response types for GetDefaultLayout. The backend has been modified to process these requests, ensuring that the necessary logic is in place to handle the retrieval of layout information effectively. Furthermore, tests have been added to validate the functionality of the new API, confirming that it correctly returns the expected layout based on the input parameters. Overall, this commit enhances the capabilities of the IFRT Proxy by providing a mechanism to obtain layout information, which is crucial for optimizing data placement and access patterns in computational tasks.
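The request/response round trip described above can be sketched as follows. Every type and field name below is an illustrative stand-in, not the actual IFRT Proxy API, and the layout encoding is an assumption chosen only to make the sketch concrete (a descending major-to-minor dimension order, a common default for dense layouts).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical request/response mirroring the proxy flow: the client packs
// dtype/dims/device/memory-kind into a request, the server answers with a
// serialized layout that the client deserializes.
struct GetDefaultLayoutRequest {
  int dtype;                      // element type tag
  std::vector<int64_t> dims;      // array dimensions
  int device_id;                  // target device
  std::string memory_kind;        // e.g. "device" or "pinned_host"
};

struct GetDefaultLayoutResponse {
  std::string serialized_layout;  // layout bytes to deserialize client-side
};

// Server-side handler: a stand-in for the backend logic. Here it just emits
// a descending major-to-minor dimension order as the "default" layout.
GetDefaultLayoutResponse HandleGetDefaultLayout(
    const GetDefaultLayoutRequest& req) {
  std::string layout;
  for (int i = static_cast<int>(req.dims.size()) - 1; i >= 0; --i) {
    layout += std::to_string(i);
    if (i > 0) layout += ",";
  }
  return {layout};
}
```

The real implementation additionally defines protobuf messages for the request and response and routes them through the proxy's RPC layer.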

Files changed

  • third_party/xla/xla/python/ifrt_proxy/client/BUILD
  • third_party/xla/xla/python/ifrt_proxy/client/client.cc
  • third_party/xla/xla/python/ifrt_proxy/client/client.h
  • third_party/xla/xla/python/ifrt_proxy/client/rpc_helper.cc
  • third_party/xla/xla/python/ifrt_proxy/client/rpc_helper.h
  • third_party/xla/xla/python/ifrt_proxy/common/ifrt_service.proto
  • third_party/xla/xla/python/ifrt_proxy/server/ifrt_backend.cc
  • third_party/xla/xla/python/ifrt_proxy/server/ifrt_backend.h
  • third_party/xla/xla/python/ifrt_proxy/server/ifrt_backend_test.cc
2025-04-07T18:58:49 See commit

This commit addresses a concurrency issue in the XLA (Accelerated Linear Algebra) GPU compiler tests by introducing mutex protection for the default_device_assignment_ pointer in the HloHardwareIndependentTestBase class. The problem arose when multiple threads simultaneously invoked the ParseAndReturnVerifiedModule method, leading to read-write race conditions that could reset the pointer, a race flagged by ThreadSanitizer (TSAN). To resolve this, the commit adds an absl::Mutex to guard access to default_device_assignment_, ensuring thread safety during its modification.

Changes made in the code include the addition of the mutex in the header file and the implementation of mutex locks in the TearDown method and the GetModuleConfigForTest method to protect against concurrent modifications. This enhancement is expected to stabilize the tests by preventing race conditions related to the device assignment, ultimately improving the reliability of the GPU compiler tests.
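The locking discipline the fix introduces looks roughly like this. The class and method names are simplified stand-ins, and std::mutex stands in for absl::Mutex so the sketch stays self-contained; the point is only that every read and write of the shared pointer goes through the same lock.

```cpp
#include <cassert>
#include <mutex>

// Minimal sketch of the guarded-pointer pattern: concurrent callers of a
// ParseAndReturnVerifiedModule-style method can no longer race on the
// pointer because all access is serialized by one mutex.
class GuardedAssignment {
 public:
  void Set(int* p) {
    std::lock_guard<std::mutex> lock(mu_);
    assignment_ = p;
  }
  int* Get() {
    std::lock_guard<std::mutex> lock(mu_);
    return assignment_;
  }
  void Reset() {  // analogous to the locked clear in TearDown()
    std::lock_guard<std::mutex> lock(mu_);
    assignment_ = nullptr;
  }

 private:
  std::mutex mu_;
  int* assignment_ = nullptr;  // analogous to default_device_assignment_
};
```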

Files changed

  • third_party/xla/xla/hlo/testlib/BUILD
  • third_party/xla/xla/hlo/testlib/hlo_hardware_independent_test_base.cc
  • third_party/xla/xla/hlo/testlib/hlo_hardware_independent_test_base.h
2025-04-07T22:03:36 See commit

This commit introduces enhancements to the HLO (High-Level Operations) diff tool within the XLA (Accelerated Linear Algebra) library, specifically aimed at improving the visualization of repetitive computation patterns. Key changes include the addition of a new class, GraphUrlGenerator, which facilitates the generation of URLs for graph visualizations corresponding to pairs of computations and instructions. This new functionality is integrated into the existing HTML rendering process, allowing for clearer presentation of grouped computations that exhibit similar differences.

Additionally, the commit modifies several existing files to streamline the output of the diff results, including the removal of unnecessary code and the addition of new functions that summarize and print repetitive computation groups. The updates enhance the clarity and usability of the diff tool, making it easier for developers to identify and analyze patterns in computation differences. Overall, these changes contribute to a more efficient and user-friendly experience when working with HLO diffs.
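A minimal shape for a GraphUrlGenerator-style helper might look like the sketch below. This is illustrative only: the class name echoes the real one in graph_url_generator.h, but the constructor, method, and URL format are all invented for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical URL generator: turns a left/right pair of computation (or
// instruction) names into one link the HTML renderer can embed, so grouped
// diffs can point at a visualization of both sides at once.
class UrlGeneratorSketch {
 public:
  explicit UrlGeneratorSketch(std::string base_url)
      : base_url_(std::move(base_url)) {}

  std::string UrlForPair(const std::string& left,
                         const std::string& right) const {
    return base_url_ + "?left=" + left + "&right=" + right;
  }

 private:
  std::string base_url_;  // visualization service endpoint (assumed)
};
```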

Files changed

  • third_party/xla/xla/hlo/tools/hlo_diff/hlo_diff_main.cc
  • third_party/xla/xla/hlo/tools/hlo_diff/hlo_diff_summary.h
  • third_party/xla/xla/hlo/tools/hlo_diff/render/BUILD
  • third_party/xla/xla/hlo/tools/hlo_diff/render/graph_url_generator.h
  • third_party/xla/xla/hlo/tools/hlo_diff/render/hlo_gumgraph_html_renderer.cc
  • third_party/xla/xla/hlo/tools/hlo_diff/render/hlo_gumgraph_html_renderer.h
2025-04-08T04:05:16 See commit

The commit titled "Roll forward 'PR #22292: [GPU] Support cuDNN explicit CUDA graph construction'" reinstates a previously reverted pull request after the underlying issue was resolved with the release of cuDNN frontend version 1.11.0. The original pull request aimed to enhance GPU support by implementing explicit CUDA graph construction using cuDNN, which is crucial for optimizing performance in deep learning applications.

This merge, identified as PR #24464, modifies several files within the XLA project, particularly in the GPU backend and stream executor components, to accommodate the updated cuDNN functionality. The changes include updates to command buffer implementations and header files, indicating a comprehensive integration of the new features across the codebase. Merging this change is a significant step forward in enhancing GPU capabilities within the XLA framework.

Files changed

  • third_party/xla/xla/backends/gpu/runtime/command_buffer_cmd.cc
  • third_party/xla/xla/stream_executor/BUILD
  • third_party/xla/xla/stream_executor/command_buffer.h
  • third_party/xla/xla/stream_executor/cuda/BUILD
  • third_party/xla/xla/stream_executor/cuda/cuda_command_buffer.cc
  • third_party/xla/xla/stream_executor/cuda/cuda_command_buffer.h
  • third_party/xla/xla/stream_executor/cuda/cuda_dnn.cc
  • third_party/xla/xla/stream_executor/cuda/cuda_dnn.h
  • third_party/xla/xla/stream_executor/dnn.h
  • third_party/xla/xla/stream_executor/gpu/BUILD
  • third_party/xla/xla/stream_executor/gpu/gpu_command_buffer.cc
  • third_party/xla/xla/stream_executor/gpu/gpu_command_buffer.h
  • third_party/xla/xla/stream_executor/gpu/gpu_command_buffer_test.cc
  • third_party/xla/xla/stream_executor/rocm/rocm_command_buffer.h
2025-04-08T21:04:07 See commit

This commit introduces a "copy" button feature for the full HLO (High-Level Operations) instruction text format in the HTML output generated by the XLA (Accelerated Linear Algebra) tool. The changes primarily involve modifications to the hlo_gumgraph_html_renderer.cc file, where a new JavaScript function is added to facilitate copying text to the clipboard. The implementation includes enhancements to the CSS for styling the new button, as well as adjustments to existing functions to integrate the copy functionality seamlessly.

In addition to the copy button, the commit refactors several functions related to printing instructions and computations, ensuring that the new feature is incorporated throughout the HTML rendering process. This results in a more user-friendly interface, allowing users to easily copy HLO instruction text directly from the rendered output. Overall, the changes enhance the usability of the tool by providing a convenient way to interact with the instruction text in the HTML format.

Files changed

  • third_party/xla/xla/hlo/tools/hlo_diff/render/hlo_gumgraph_html_renderer.cc
2025-04-08T22:13:06 See commit

The recent commit introduces a new aspect called collect_symlink_data_aspect, designed to identify and collect symlinked files within the target runfiles. This addition enhances the existing functionality of the collect_data_aspect by allowing the collection of files that are specifically linked via symlinks, broadening the scope of file management within the build process. The implementation includes modifications to the python_wheel.bzl file, where new logic has been added to filter and gather files based on their extensions, specifically targeting files that are symlinked.

In addition to the new aspect, the commit also refines the existing data collection methods by adjusting how file extensions are handled and improving the overall structure of the file collection process. The changes include the use of dictionaries for file storage and a more streamlined approach to checking file attributes, which contributes to better performance and clarity in the code. Overall, this commit enhances the build system's capabilities in managing symlinked files, making it more robust and efficient.
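The real change is written in Starlark inside python_wheel.bzl, but the core filtering idea can be sketched in C++ with std::filesystem. Everything below (the function name, the extension filter) is illustrative, not the aspect's actual logic: walk a runfiles-like tree and keep the entries that are themselves symlinks.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Illustrative stand-in for the aspect's filtering: collect the names of
// entries under `root` that are symlinks with a given extension.
std::vector<std::string> CollectSymlinks(const fs::path& root,
                                         const std::string& extension) {
  std::vector<std::string> out;
  for (const auto& entry : fs::recursive_directory_iterator(
           root, fs::directory_options::skip_permission_denied)) {
    // symlink_status() inspects the link itself rather than its target,
    // so regular files with the same extension are excluded.
    if (fs::is_symlink(entry.symlink_status()) &&
        entry.path().extension() == extension) {
      out.push_back(entry.path().filename().string());
    }
  }
  return out;
}

// Tiny helper for setting up test fixtures.
void Touch(const fs::path& p) { std::ofstream(p).put('x'); }
```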

Files changed

  • third_party/xla/third_party/py/python_wheel.bzl
2025-04-09T01:11:13 See commit

This commit integrates updates from the LLVM project, specifically aligning the TensorFlow codebase with the changes made in the LLVM commit identified by cd54cb062bba. The integration involves modifications across several files, including updates to the MLIR (Multi-Level Intermediate Representation) tests and transformations, as well as adjustments to various patches and workspace configurations related to LLVM and its third-party dependencies.

Key changes include updates to the convert_control_to_data_outputs.mlir and convert_control_to_data_outputs.cc files, reflecting the latest LLVM features and improvements. Additionally, the commit introduces a new patch for Triton and modifies existing patches and workspace files to ensure compatibility and functionality with the updated LLVM integration. Overall, this commit enhances the integration of LLVM within TensorFlow, improving the framework's capabilities and performance.

Files changed

  • tensorflow/compiler/mlir/tensorflow/tests/convert_control_to_data_outputs.mlir
  • tensorflow/compiler/mlir/tensorflow/transforms/convert_control_to_data_outputs.cc
  • third_party/llvm/generated.patch
  • third_party/llvm/workspace.bzl
  • third_party/shardy/temporary.patch
  • third_party/shardy/workspace.bzl
  • third_party/triton/llvm_integration/cl744822685.patch
  • third_party/triton/llvm_integration/series.bzl
  • third_party/xla/xla/mlir_hlo/deallocation/transforms/buffer_reuse.cc
2025-04-09T19:22:37 See commit

This commit addresses issues related to the conversion of High-Level Operations (HLO) to StableHLO for programs that exhibit bounded dynamism. It modifies several files within the XLA (Accelerated Linear Algebra) project, particularly enhancing the GetModuleFromHLOText and GetModuleFromHLOProto functions to include a new parameter, emit_mhlo, which determines whether to use the new StableHLO APIs or the legacy MHLO APIs during the conversion process. Additionally, it introduces checks for bounded dynamism in the MhloToStablehlo function, ensuring that if a module contains bounded dynamic types, it first converts to MHLO before proceeding to StableHLO.

The commit also includes the addition of a test case that verifies the correct handling of HLO programs with bounded dynamism, ensuring that the conversion process accurately reflects the expected outcomes. Debugging statements have been added to facilitate tracing and understanding the conversion flow, particularly in scenarios involving bounded dynamism. Overall, these changes aim to enhance the robustness and correctness of the HLO to StableHLO conversion process, accommodating a broader range of program characteristics.
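The routing the commit adds can be sketched as a simple branch. The names below are stand-ins, not the actual XLA translate API; the point is only the decision the description walks through: honor emit_mhlo when set, and otherwise send modules with bounded dynamic types through MHLO before legalizing to StableHLO.

```cpp
#include <cassert>
#include <string>

// Hypothetical module summary; in the real code this would be an MLIR module
// whose types may carry dimension bounds (bounded dynamism).
struct ModuleSketch {
  bool has_bounded_dynamism;
};

// Returns which conversion pipeline the module would take.
std::string ConvertToStableHlo(const ModuleSketch& m, bool emit_mhlo) {
  if (emit_mhlo) return "mhlo";  // caller asked for the legacy MHLO output
  if (m.has_bounded_dynamism) {
    // Bounded dynamic types are not handled by the direct importer, so
    // lower to MHLO first and legalize from there.
    return "hlo -> mhlo -> stablehlo";
  }
  return "hlo -> stablehlo";  // direct StableHLO path
}
```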

Files changed

  • third_party/xla/xla/hlo/tools/hlo_translate.cc
  • third_party/xla/xla/hlo/translate/BUILD
  • third_party/xla/xla/hlo/translate/hlo_to_mhlo/hlo_function_importer.cc
  • third_party/xla/xla/hlo/translate/hlo_to_mhlo/hlo_module_importer.cc
  • third_party/xla/xla/hlo/translate/hlo_to_mhlo/tests/import_bounded_dynamism_stablehlo.mlir
  • third_party/xla/xla/hlo/translate/stablehlo.cc
2025-04-09T20:04:08 See commit

This commit introduces a set of IOPDDL utilities into the XLA (Accelerated Linear Algebra) Auto Sharding module's third-party directory. The additions include several source and header files that define the data structures and methods necessary for problem evaluation and solution generation in the context of optimization tasks. Specifically, the new files include iopddl.cc, which implements functionalities for reading problems and evaluating solutions, and iopddl.h, which defines the basic structures such as Node, Edge, Problem, and associated methods. Additionally, a solver is implemented in solver.cc to generate random solutions within a specified timeout.

The commit also adds a test suite in iopddl_test.cc, which contains various test cases to validate the functionality of the newly introduced utilities, ensuring that they correctly evaluate solutions and handle edge cases such as exceeding usage limits or providing invalid strategy indices. An example JSON file (example.json) is included to illustrate a problem structure that can be used for testing. Overall, these changes enhance the XLA Auto Sharding module by providing essential tools for optimization problem handling and solution evaluation.
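A miniature of the problem/solution/evaluation shape might look like this. The types here are hypothetical simplifications of the Node, Problem, and evaluation logic described above; the real iopddl.h structures are richer (edges, usage intervals, limits), but the contract sketched below matches the description: a solution picks one strategy per node, evaluation sums costs, and invalid strategy indices are rejected.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical node: each entry is the cost of choosing that strategy.
struct NodeSketch {
  std::vector<int64_t> strategy_costs;
};

struct ProblemSketch {
  std::vector<NodeSketch> nodes;
};

using Solution = std::vector<int>;  // one strategy index per node

// Sum the chosen strategy costs; nullopt signals an invalid solution
// (wrong length or an out-of-range strategy index).
std::optional<int64_t> Evaluate(const ProblemSketch& problem,
                                const Solution& s) {
  if (s.size() != problem.nodes.size()) return std::nullopt;
  int64_t total = 0;
  for (size_t i = 0; i < s.size(); ++i) {
    const auto& costs = problem.nodes[i].strategy_costs;
    if (s[i] < 0 || static_cast<size_t>(s[i]) >= costs.size()) {
      return std::nullopt;  // invalid strategy index
    }
    total += costs[s[i]];
  }
  return total;
}
```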

Files changed

  • third_party/xla/xla/hlo/experimental/auto_sharding/BUILD
  • third_party/xla/xla/hlo/experimental/auto_sharding/example.json
  • third_party/xla/xla/hlo/experimental/auto_sharding/iopddl.cc
  • third_party/xla/xla/hlo/experimental/auto_sharding/iopddl.h
  • third_party/xla/xla/hlo/experimental/auto_sharding/iopddl_test.cc
  • third_party/xla/xla/hlo/experimental/auto_sharding/solver.cc
  • third_party/xla/xla/hlo/experimental/auto_sharding/solver.h
2025-04-09T21:48:38 See commit

The recent commit introduces an overload for the ComputeAndCompareLiteral function within the ClientLibraryTestRunnerMixin class, allowing it to be called without specifying an error_spec. This change simplifies the usage of the function by providing a version that defaults the error_spec parameter to std::nullopt, making it easier for users to compare computed results with expected literals without needing to define error specifications explicitly.

The modification involves adding 10 lines of code to the client_library_test_runner_mixin.h file, enhancing the test runner's capabilities in the XLA (Accelerated Linear Algebra) library. This update is expected to streamline the testing process, as it reduces the complexity associated with handling error specifications in comparisons, thereby improving the overall usability of the testing framework.
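The overload pattern itself is easy to see in isolation. The names below are stand-ins for the mixin's ComputeAndCompareLiteral and XLA's ErrorSpec, and the comparison logic is simplified to scalar doubles; what carries over is that the new overload just forwards with std::nullopt, so exact comparison becomes the default when no error spec is given.

```cpp
#include <cassert>
#include <cmath>
#include <optional>

// Simplified stand-in for xla::ErrorSpec.
struct ErrorSpec {
  double abs_tolerance;
};

// Original form: the caller must decide whether to pass an error spec.
bool CompareLiteral(double expected, double actual,
                    std::optional<ErrorSpec> error_spec) {
  if (error_spec.has_value()) {
    return std::fabs(expected - actual) <= error_spec->abs_tolerance;
  }
  return expected == actual;  // no spec: require exact equality
}

// The added convenience overload: forwards with std::nullopt so callers
// no longer need to spell it out.
bool CompareLiteral(double expected, double actual) {
  return CompareLiteral(expected, actual, std::nullopt);
}
```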

Files changed

  • third_party/xla/xla/tests/client_library_test_runner_mixin.h
2025-04-10T01:15:01 See commit

The commit involves the complete removal of the tensorflow/lite/experimental/litert directory, which has been relocated to the google-ai-edge/litert repository. This change signifies a significant restructuring within the TensorFlow Lite framework, indicating a shift in how the experimental LiteRT (Lite Runtime) components are managed and developed.

As a result of this move, numerous files and directories associated with LiteRT, including build files, source code, headers, tests, and various utilities, have been deleted from the TensorFlow Lite repository. This clean-up suggests a focus on streamlining the codebase and potentially improving the organization and accessibility of the LiteRT components in their new location.

Files changed

  • tensorflow/lite/experimental/litert/BUILD
  • tensorflow/lite/experimental/litert/build_common/export_litert_only_darwin.lds
  • tensorflow/lite/experimental/litert/build_common/export_litert_only_linux.lds
  • tensorflow/lite/experimental/litert/build_common/special_rule.bzl
  • tensorflow/lite/experimental/litert/c/BUILD
  • tensorflow/lite/experimental/litert/c/litert_accelerator.cc
  • tensorflow/lite/experimental/litert/c/litert_accelerator.h
  • tensorflow/lite/experimental/litert/c/litert_accelerator_compilation_options.cc
  • tensorflow/lite/experimental/litert/c/litert_accelerator_compilation_options.h
  • tensorflow/lite/experimental/litert/c/litert_accelerator_compilation_options_test.cc
  • tensorflow/lite/experimental/litert/c/litert_accelerator_registration.cc
  • tensorflow/lite/experimental/litert/c/litert_accelerator_registration.h
  • tensorflow/lite/experimental/litert/c/litert_accelerator_registration_test.cc
  • tensorflow/lite/experimental/litert/c/litert_accelerator_test.cc
  • tensorflow/lite/experimental/litert/c/litert_any.h
  • tensorflow/lite/experimental/litert/c/litert_common.cc
  • tensorflow/lite/experimental/litert/c/litert_common.h
  • tensorflow/lite/experimental/litert/c/litert_common_test.cc
  • tensorflow/lite/experimental/litert/c/litert_compilation_options.cc
  • tensorflow/lite/experimental/litert/c/litert_compilation_options.h
  • tensorflow/lite/experimental/litert/c/litert_compilation_options_test.cc
  • tensorflow/lite/experimental/litert/c/litert_environment_options.cc
  • tensorflow/lite/experimental/litert/c/litert_environment_options.h
  • tensorflow/lite/experimental/litert/c/litert_environment_options_test.cc
  • tensorflow/lite/experimental/litert/c/litert_event.cc
  • tensorflow/lite/experimental/litert/c/litert_event.h
  • tensorflow/lite/experimental/litert/c/litert_gl_types.h
  • tensorflow/lite/experimental/litert/c/litert_layout.h
  • tensorflow/lite/experimental/litert/c/litert_logging.h
  • tensorflow/lite/experimental/litert/c/litert_logging_test.cc
  • tensorflow/lite/experimental/litert/c/litert_model.h
  • tensorflow/lite/experimental/litert/c/litert_model_test.cc
  • tensorflow/lite/experimental/litert/c/litert_op_code.h
  • tensorflow/lite/experimental/litert/c/litert_tensor_buffer.h
  • tensorflow/lite/experimental/litert/c/litert_tensor_buffer_requirements_test.cc
  • tensorflow/lite/experimental/litert/c/litert_tensor_buffer_test.cc
  • tensorflow/lite/experimental/litert/cc/litert_accelerator_compilation_options.h
  • tensorflow/lite/experimental/litert/cc/litert_any.h
  • tensorflow/lite/experimental/litert/cc/litert_compilation_options.h
  • tensorflow/lite/experimental/litert/cc/litert_compiled_model.cc
  • tensorflow/lite/experimental/litert/cc/litert_compiled_model.h
  • tensorflow/lite/experimental/litert/cc/litert_compiled_model_integration_test.cc
  • tensorflow/lite/experimental/litert/cc/litert_compiled_model_test.cc
  • tensorflow/lite/experimental/litert/cc/litert_consts.h
  • tensorflow/lite/experimental/litert/cc/litert_detail.h
  • tensorflow/lite/experimental/litert/cc/litert_element_type_test.cc
  • tensorflow/lite/experimental/litert/cc/litert_environment.h
  • tensorflow/lite/experimental/litert/cc/litert_environment_test.cc
  • tensorflow/lite/experimental/litert/cc/litert_expected_test.cc
  • tensorflow/lite/experimental/litert/cc/litert_handle.h
  • tensorflow/lite/experimental/litert/cc/litert_layout.h
  • tensorflow/lite/experimental/litert/cc/litert_macros.cc
  • tensorflow/lite/experimental/litert/cc/litert_macros.h
  • tensorflow/lite/experimental/litert/cc/litert_macros_test.cc
  • tensorflow/lite/experimental/litert/cc/litert_model.cc
  • tensorflow/lite/experimental/litert/cc/litert_model.h
  • tensorflow/lite/experimental/litert/cc/litert_model_predicates.cc
  • tensorflow/lite/experimental/litert/cc/litert_model_predicates.h
  • tensorflow/lite/experimental/litert/cc/litert_model_predicates_test.cc
  • tensorflow/lite/experimental/litert/cc/litert_model_test.cc
  • tensorflow/lite/experimental/litert/cc/litert_op_options.cc
  • tensorflow/lite/experimental/litert/cc/litert_op_options.h
  • tensorflow/lite/experimental/litert/cc/litert_shared_library.cc
  • tensorflow/lite/experimental/litert/cc/litert_tensor_buffer_requirements.h
  • tensorflow/lite/experimental/litert/cc/test_shared_library.cc
  • tensorflow/lite/experimental/litert/compiler/plugin/BUILD
  • tensorflow/lite/experimental/litert/compiler/plugin/algo.cc
  • tensorflow/lite/experimental/litert/compiler/plugin/algo.h
  • tensorflow/lite/experimental/litert/compiler/plugin/algo_test.cc
  • tensorflow/lite/experimental/litert/compiler/plugin/compiler_flags.cc
  • tensorflow/lite/experimental/litert/compiler/plugin/compiler_flags.h
  • tensorflow/lite/experimental/litert/compiler/plugin/compiler_flags_test.cc
  • tensorflow/lite/experimental/litert/core/dispatch_op_schema.cc
  • tensorflow/lite/experimental/litert/core/environment.cc
  • tensorflow/lite/experimental/litert/core/environment.h
  • tensorflow/lite/experimental/litert/core/environment_options.cc
  • tensorflow/lite/experimental/litert/core/environment_options.h
  • tensorflow/lite/experimental/litert/core/environment_options_test.cc
  • tensorflow/lite/experimental/litert/core/environment_test.cc
  • tensorflow/lite/experimental/litert/core/filesystem.cc
  • tensorflow/lite/experimental/litert/core/model/flatbuffer_to_litert.cc
  • tensorflow/lite/experimental/litert/core/model/flatbuffer_to_litert.h
  • tensorflow/lite/experimental/litert/core/model/graph_validation.cc
  • tensorflow/lite/experimental/litert/core/model/graph_validation.h
  • tensorflow/lite/experimental/litert/core/model/ir_allocator.h
  • tensorflow/lite/experimental/litert/core/model/ir_allocator_test.cc
  • tensorflow/lite/experimental/litert/core/model/litert_to_flatbuffer.cc
  • tensorflow/lite/experimental/litert/core/model/model_buffer.cc
  • tensorflow/lite/experimental/litert/core/model/model_buffer.h
  • tensorflow/lite/experimental/litert/core/model/model_buffer_test.cc
  • tensorflow/lite/experimental/litert/core/model/model_file_test_util.cc
  • tensorflow/lite/experimental/litert/core/model/model_graph_test.cc
  • tensorflow/lite/experimental/litert/core/model/model_load.h
  • tensorflow/lite/experimental/litert/core/model/model_serialize.h
  • tensorflow/lite/experimental/litert/core/util/BUILD
  • tensorflow/lite/experimental/litert/core/util/flatbuffer_tools.cc
  • tensorflow/lite/experimental/litert/core/util/flatbuffer_tools.h
  • tensorflow/lite/experimental/litert/core/util/tensor_type_util.cc
  • tensorflow/lite/experimental/litert/core/version.h
  • tensorflow/lite/experimental/litert/python/BUILD
  • tensorflow/lite/experimental/litert/runtime/BUILD
  • tensorflow/lite/experimental/litert/runtime/accelerator.h
  • tensorflow/lite/experimental/litert/runtime/accelerator_model_compilation_data.h
  • tensorflow/lite/experimental/litert/runtime/accelerator_model_compilation_data_test.cc
  • tensorflow/lite/experimental/litert/runtime/accelerator_registry.cc
  • tensorflow/lite/experimental/litert/runtime/accelerator_registry.h
  • tensorflow/lite/experimental/litert/runtime/accelerator_test.cc
  • tensorflow/lite/experimental/litert/runtime/accelerators/BUILD
  • tensorflow/lite/experimental/litert/runtime/accelerators/accelerator_implementation_helper.h
  • tensorflow/lite/experimental/litert/runtime/accelerators/auto_registration.cc
  • tensorflow/lite/experimental/litert/runtime/accelerators/dispatch/BUILD
  • tensorflow/lite/experimental/litert/runtime/accelerators/dispatch/dispatch_accelerator.cc
  • tensorflow/lite/experimental/litert/runtime/accelerators/dispatch/dispatch_accelerator.h
  • tensorflow/lite/experimental/litert/runtime/accelerators/xnnpack/BUILD
  • tensorflow/lite/experimental/litert/runtime/accelerators/xnnpack/xnnpack_accelerator.cc
  • tensorflow/lite/experimental/litert/runtime/accelerators/xnnpack/xnnpack_accelerator.h
  • tensorflow/lite/experimental/litert/runtime/compilation_options.h
  • tensorflow/lite/experimental/litert/runtime/compiled_model.cc
  • tensorflow/lite/experimental/litert/runtime/compiler/BUILD
  • tensorflow/lite/experimental/litert/runtime/compiler/jit_compilation_mediatek_test.cc
  • tensorflow/lite/experimental/litert/runtime/compiler/jit_compilation_qualcomm_test.cc
  • tensorflow/lite/experimental/litert/runtime/dispatch/BUILD
  • tensorflow/lite/experimental/litert/runtime/dispatch/README.md
  • tensorflow/lite/experimental/litert/runtime/dispatch/litert_dispatch.cc
  • tensorflow/lite/experimental/litert/runtime/dmabuf_buffer.cc
  • tensorflow/lite/experimental/litert/runtime/event.cc
  • tensorflow/lite/experimental/litert/runtime/fastrpc_buffer.cc
  • tensorflow/lite/experimental/litert/runtime/gl_buffer.cc
  • tensorflow/lite/experimental/litert/runtime/gl_buffer.h
  • tensorflow/lite/experimental/litert/runtime/gl_texture.cc
  • tensorflow/lite/experimental/litert/runtime/gl_texture.h
  • tensorflow/lite/experimental/litert/runtime/ion_buffer.cc
  • tensorflow/lite/experimental/litert/runtime/opencl/buffer.cc
  • tensorflow/lite/experimental/litert/runtime/opencl/buffer.h
  • tensorflow/lite/experimental/litert/runtime/opencl/buffer_test.cc
  • tensorflow/lite/experimental/litert/runtime/tensor_buffer_requirements.h
  • tensorflow/lite/experimental/litert/test/testdata/add_cst.mlir
  • tensorflow/lite/experimental/litert/test/testdata/add_simple.mlir
  • tensorflow/lite/experimental/litert/test/testdata/cst_multi_subgraph.mlir
  • tensorflow/lite/experimental/litert/test/testdata/fully_connected_3d.mlir
  • tensorflow/lite/experimental/litert/test/testdata/mul_simple.mlir
  • tensorflow/lite/experimental/litert/test/testdata/multi_composite.mlir
  • tensorflow/lite/experimental/litert/test/testdata/multi_op_multi_subgraph.mlir
  • tensorflow/lite/experimental/litert/test/testdata/multi_use_cst.mlir
  • tensorflow/lite/experimental/litert/test/testdata/nested_composite.mlir
  • tensorflow/lite/experimental/litert/test/testdata/one_mul.mlir
  • tensorflow/lite/experimental/litert/test/testdata/rms_norm.mlir
  • tensorflow/lite/experimental/litert/test/testdata/rms_norm_composite.mlir
  • tensorflow/lite/experimental/litert/test/testdata/scala_reshape.mlir
  • tensorflow/lite/experimental/litert/test/testdata/shared_input_cpu_npu.mlir
  • tensorflow/lite/experimental/litert/test/testdata/simple_add_op.mlir
  • tensorflow/lite/experimental/litert/test/testdata/simple_average_poll_2d.mlir
  • tensorflow/lite/experimental/litert/test/testdata/simple_batch_matmul_op.mlir
  • tensorflow/lite/experimental/litert/test/testdata/simple_cascade_model_npu.mlir
2025-04-11T00:17:31 See commit

This commit addresses undefined behavior in the PJRT codebase by correcting how pointers are cast between unrelated types. Specifically, it resolves the use of reinterpret_cast to convert a pointer from a custom PJRT extension struct Foo to a PJRT_Extension_Base*, which is invalid because Foo does not inherit from PJRT_Extension_Base. Since the PJRT C API is plain C, which has no inheritance, the fix instead gives Foo a PJRT_Extension_Base variable as its first field.

The changes made in this commit include updates to various PJRT extension structures and functions, ensuring that they utilize the new base field for proper type casting. This adjustment enhances the safety and correctness of the PJRT code by preventing potential issues arising from invalid pointer conversions. The modifications span several files, impacting both the CPU and GPU implementations of PJRT, and include the addition of necessary fields and adjustments to function calls to reference the new base structure correctly.
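The first-member trick the fix relies on is a standard C idiom, sketched below with invented names (ExtensionBase, FooExtension — not the real PJRT structs). Because the base struct is the first member of a standard-layout struct, a pointer to the whole struct and a pointer to that member share the same address, so converting between them is well-defined.

```cpp
#include <cassert>
#include <cstddef>

extern "C" {
// Stand-in for PJRT_Extension_Base: size tag plus a chain pointer.
struct ExtensionBase {
  size_t struct_size;
  ExtensionBase* next;
};

// Stand-in for a custom extension struct. The base member MUST come first
// so that &foo and &foo.base have the same address.
struct FooExtension {
  ExtensionBase base;
  int foo_specific_field;
};
}

static_assert(offsetof(FooExtension, base) == 0,
              "base must be the first member");

// Valid "upcast": take the address of the embedded base member instead of
// reinterpreting a pointer between unrelated types.
ExtensionBase* AsBase(FooExtension* foo) { return &foo->base; }

// Recovering the extension from the base pointer is well-defined here
// because the base sits at offset zero of a standard-layout struct.
FooExtension* FromBase(ExtensionBase* base) {
  return reinterpret_cast<FooExtension*>(base);
}
```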

Files changed

  • third_party/xla/xla/pjrt/c/pjrt_c_api_cpu_internal.cc
  • third_party/xla/xla/pjrt/c/pjrt_c_api_custom_partitioner_extension.h
  • third_party/xla/xla/pjrt/c/pjrt_c_api_ffi_extension.h
  • third_party/xla/xla/pjrt/c/pjrt_c_api_ffi_internal.cc
  • third_party/xla/xla/pjrt/c/pjrt_c_api_gpu_extension.h
  • third_party/xla/xla/pjrt/c/pjrt_c_api_gpu_internal.cc
  • third_party/xla/xla/pjrt/c/pjrt_c_api_gpu_test.cc
  • third_party/xla/xla/pjrt/c/pjrt_c_api_stream_extension.h
  • third_party/xla/xla/pjrt/c/pjrt_c_api_triton_extension.h
  • third_party/xla/xla/pjrt/c/pjrt_c_api_triton_internal.h
  • third_party/xla/xla/pjrt/pjrt_c_api_client.cc
  • third_party/xla/xla/pjrt/plugin/example_plugin/myplugin_c_pjrt.cc