
Commit

deploy: 2e502fc
guizili0 committed Mar 25, 2024
1 parent c212e1c commit 913d692
Showing 73 changed files with 438 additions and 150 deletions.
4 changes: 2 additions & 2 deletions latest/CODE_OF_CONDUCT.html
Original file line number Diff line number Diff line change
@@ -4,7 +4,7 @@
<meta charset="utf-8" /><meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />

<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Contributor Covenant Code of Conduct &mdash; Intel® Extension for TensorFlow* 0.1.dev1+gb2bad43 documentation</title>
<title>Contributor Covenant Code of Conduct &mdash; Intel® Extension for TensorFlow* 0.1.dev1+g2e502fc documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="_static/css/theme.css" />
<link rel="stylesheet" type="text/css" href="_static/custom.css" />
@@ -225,7 +225,7 @@ <h2>Attribution<a class="headerlink" href="#attribution" title="Permalink to thi
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
<jinja2.runtime.BlockReference object at 0x7f239a770220>
<jinja2.runtime.BlockReference object at 0x7f97e9b529e0>
<p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>


4 changes: 2 additions & 2 deletions latest/SECURITY.html
@@ -4,7 +4,7 @@
<meta charset="utf-8" /><meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />

<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Security Policy &mdash; Intel® Extension for TensorFlow* 0.1.dev1+gb2bad43 documentation</title>
<title>Security Policy &mdash; Intel® Extension for TensorFlow* 0.1.dev1+g2e502fc documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="_static/css/theme.css" />
<link rel="stylesheet" type="text/css" href="_static/custom.css" />
@@ -120,7 +120,7 @@ <h2>Report a Vulnerability<a class="headerlink" href="#report-a-vulnerability" t
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
<jinja2.runtime.BlockReference object at 0x7f239a3a6f20>
<jinja2.runtime.BlockReference object at 0x7f97e9b17c40>
<p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>


27 changes: 17 additions & 10 deletions latest/_sources/docs/README.md.txt
@@ -43,19 +43,22 @@
<tbody>
<tr>
<td colspan="2" align="center"><a href="guide/environment_variables.html">Environment variables</a></td>
<td colspan="2" align="center"><a href="guide/python_api.html">Python API</a></td>
<td colspan="2" align="center"><a href="guide/advanced_auto_mixed_precision.html">Advanced auto mixed precision</a></td>
<td colspan="2" align="center"><a href="guide/itex_fusion.html">Graph optimization</a></td>
<td colspan="2" align="center"><a href="guide/python_api.html">Python API</a></td>
<td colspan="4" align="center"><a href="guide/next_pluggable_device.html">Next Pluggable Device</a></td>
<td colspan="2" align="center"><a href="guide/threadpool.html">CPU Thread Pool</a></td>
<td colspan="2" align="center"><a href="guide/weight_prepack.html">Weight prepack</a></td>
</tr>
<tr>
<td colspan="2" align="center"><a href="guide/itex_ops.html">Custom operator</a></td>
<td colspan="2" align="center"><a href="guide/itex_ops_override.html">Operator override</a></td>
<td colspan="2" align="center"><a href="guide/INT8_quantization.html">INT8 quantization</a></td>
<td colspan="2" align="center"><a href="guide/XPUAutoShard.html">XPUAutoShard</a></td>
<td colspan="2" align="center"><a href="guide/itex_fusion.html">Graph optimization</a></td>
<td colspan="2" align="center"><a href="guide/itex_ops.html">Custom operator</a></td>
<td colspan="4" align="center"><a href="guide/advanced_auto_mixed_precision.html">Advanced auto mixed precision</a></td>
<td colspan="2" align="center"><a href="guide/itex_ops_override.html">Operator override</a></td>
</tr>
<tr>
<td colspan="3" align="center"><a href="guide/INT8_quantization.html">INT8 quantization</a></td>
<td colspan="2" align="center"><a href="guide/XPUAutoShard.html">XPUAutoShard</a></td>
<td colspan="2" align="center"><a href="guide/how_to_enable_profiler.html">GPU profiler</a></td>
<td colspan="2" align="center"><a href="guide/launch.html">CPU launcher</a></td>
<td colspan="2" align="center"><a href="guide/launch.html">CPU launcher</a></td>
<td colspan="2" align="center"><a href="guide/weight_prepack.html">Weight prepack</a></td>
</tr>
</tbody>
<thead>
@@ -94,9 +94,13 @@
Generally, the default configuration of Intel® Extension for TensorFlow\* provides good performance without any code changes.
Intel® Extension for TensorFlow\* also provides simple frontend Python APIs and utilities for advanced users to get more optimized performance with only minor code changes for different kinds of application scenarios. Typically, you only need to add two or three clauses to the original code.
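
As an illustration of that pattern, here is a minimal sketch (assuming the `intel_extension_for_tensorflow` package is installed; the single override call shown is one example of such a clause, not the full API):

```python
# Minimal sketch, assuming intel_extension_for_tensorflow is installed.
# The ops-override call is one example of the "two or three clauses" the
# text mentions; plain TensorFlow code runs unchanged without it.
try:
    import intel_extension_for_tensorflow as itex
    itex.experimental_ops_override()  # route supported Keras ops to ITEX kernels
except ImportError:
    itex = None  # extension absent: fall back to stock TensorFlow
```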

* Next Pluggable Device (NPD)

The Next Pluggable Device (NPD) represents an advanced generation of TensorFlow plugin mechanisms. It not only facilitates a seamless integration of new accelerator plugins for registering devices with TensorFlow without requiring modifications to the TensorFlow codebase, but it also serves as a conduit to OpenXLA via its PJRT plugin. This innovative approach significantly streamlines the process of extending TensorFlow's capabilities with new hardware accelerators, enhancing both efficiency and flexibility.

* Advanced auto mixed precision (AMP)

Low precision data types `bfloat16` and` float16` are natively supported from the `3rd Generation Xeon® Scalable Processors`, code name [Cooper Lake](https://ark.intel.com/content/www/us/en/ark/products/series/204098/3rd-generation-intel-xeon-scalable-processors.html), with `AVX512` instruction set and the Intel® Data Center GPU, which further boosts performance and uses less memory. The lower-precision data types supported by Advanced Auto Mixed Precision (AMP) are fully enabled in Intel® Extension for TensorFlow*.
Low-precision data types `bfloat16` and `float16` are natively supported by the `3rd Generation Xeon® Scalable Processors`, codenamed [Cooper Lake](https://ark.intel.com/content/www/us/en/ark/products/series/204098/3rd-generation-intel-xeon-scalable-processors.html), with the `AVX512` instruction set, and by the Intel® Data Center GPU, which further boosts performance and uses less memory. The lower-precision data types supported by Advanced Auto Mixed Precision (AMP) are fully enabled in Intel® Extension for TensorFlow*.
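
As a sketch, AMP can be switched on before TensorFlow initializes via environment variables (the variable names follow the ITEX environment-variable guide; treat them as illustrative rather than authoritative):

```python
import os

# Sketch: enable Advanced AMP before TensorFlow starts up. The variable
# names follow the ITEX environment-variable guide; treat as illustrative.
os.environ["ITEX_AUTO_MIXED_PRECISION"] = "1"
# Choose the low-precision type: BFLOAT16 on Cooper Lake CPUs, FLOAT16 on GPU.
os.environ["ITEX_AUTO_MIXED_PRECISION_DATA_TYPE"] = "BFLOAT16"
```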

* Graph optimization

80 changes: 80 additions & 0 deletions latest/_sources/docs/guide/next_pluggable_device.md.txt
@@ -0,0 +1,80 @@
# NextPluggableDevice Overview

The NextPluggableDevice (NPD) represents an advanced generation of [PluggableDevice](https://github.com/tensorflow/community/blob/master/rfcs/20200624-pluggable-device-for-tensorflow.html) mechanism. It not only facilitates a seamless integration of new accelerator plugins for registering devices with TensorFlow without requiring modifications to the TensorFlow codebase, but it also serves as a conduit to [OpenXLA (Accelerated Linear Algebra)](https://github.com/openxla/xla) via its [PJRT plugin](https://github.com/openxla/community/blob/main/rfcs/20230123-pjrt-plugin.html).

- [Overview](#NextPluggableDevice-Overview)

- [Why NextPluggableDevice](#Why-NextPluggableDevice)

- [Starting With NextPluggableDevice](#How-to-start-with-XLA-using-NextPluggableDevice)

- [Architecture](#NextPluggableDevice-Architecture)

- [Runtime Switch](#Runtime-Switch-of-NextPluggableDevice-and-PluggableDevice)

## Why NextPluggableDevice

Previously, stock TensorFlow designed and developed the [PluggableDevice](https://github.com/tensorflow/community/blob/master/rfcs/20200624-pluggable-device-for-tensorflow.html) mechanism to support new device extensions without device-specific changes to the TensorFlow code; today, PluggableDevice is tightly integrated with the [StreamExecutor C API](https://github.com/tensorflow/community/pull/257).

However, this tight coupling to StreamExecutor made it difficult for PluggableDevice to stay compatible with [OpenXLA](https://github.com/openxla/xla). For precisely this reason, TensorFlow evolved to use the [PJRT plugin](https://github.com/openxla/community/blob/main/rfcs/20230123-pjrt-plugin.html) as its device API, decoupling the pluggable device from StreamExecutor; the result is implemented as NextPluggableDevice.

## Start with XLA using NextPluggableDevice

Enabling XLA in ITEX is exactly the same as it is in TensorFlow, except that you need to export environment variables first:
```
$ export TF_XLA_FLAGS="--tf_xla_use_device_api=true --tf_xla_auto_jit=2"
$ python
>>> import tensorflow as tf # TensorFlow registers NextPluggableDevice here
>>> @tf.function(experimental_compile=True)
... def add_with_xla(a, b):
... return a + b
>>> a = tf.constant([1.0, 2.0, 3.0])
>>> b = tf.constant([4.0, 5.0, 6.0])
>>> result = add_with_xla(a, b)
>>> print("Result: ", result)
Result: tf.Tensor([5. 7. 9.], shape=(3,), dtype=float32)
```

## NextPluggableDevice Architecture

The NextPluggableDevice represents an advanced generation of the [PluggableDevice](https://github.com/tensorflow/community/blob/master/rfcs/20200624-pluggable-device-for-tensorflow.html) mechanism. Intel® Extension for TensorFlow* integrates the NextPluggableDevice as a new device type, along with the corresponding [PJRT CAPI](https://github.com/tensorflow/tensorflow/blob/master/third_party/xla/xla/pjrt/c/pjrt_c_api.h) for registering its Ops & Kernels, XLA PJRT client, Runtime, as well as the legacy Graph Optimization API and Profiler interface. In this way, it not only facilitates a seamless integration of new accelerator plugins for registering devices with TensorFlow without requiring modifications to the TensorFlow codebase, but it also serves as a conduit to [OpenXLA](https://github.com/openxla/xla) via its [PJRT plugin](https://github.com/openxla/community/blob/main/rfcs/20230123-pjrt-plugin.html).
<p align="center">
<img src="images/npd_architecture.png" alt="npd_architecture.png" />
</p>

### OpenXLA PJRT Plugin

[PJRT](https://github.com/tensorflow/tensorflow/blob/master/third_party/xla/xla/pjrt/c/pjrt_c_api.h) is a uniform Device API in the OpenXLA ecosystem. The long term vision for PJRT is that: (1) frameworks (TensorFlow, PyTorch, JAX, etc.) will call PJRT, which has device-specific implementations that are opaque to the frameworks; (2) each device focuses on implementing PJRT APIs, and can remain opaque to the frameworks.
<p align="center">
<img src="images/openxla_pjrt.png" alt="openxla_pjrt.png" />
</p>


The PJRT API will provide an easy interface with which frameworks can integrate a packaged compiler and runtime solution. It will be the supported interface used by TensorFlow and JAX for all compiler and runtime integration, and as such it will be easy for other compilers and runtimes that implement the PJRT interface to integrate with these systems.

## Runtime Switch of NextPluggableDevice and PluggableDevice
Intel® Extension for TensorFlow* is fully integrated with both PluggableDevice and NextPluggableDevice, and provides a runtime switch mechanism for enhancing both efficiency and flexibility. Figure 3 presents the architectures of the PluggableDevice and NextPluggableDevice.

<p align="center">
<img src="images/npd_pd_architecture.png" alt="npd_pd_architecture.png" />
</p>

Intel® Extension for TensorFlow* offers environment variables to enable or disable each device. Export the variables listed in the table below in your runtime environment to selectively enable or disable the corresponding device.
|Environment Variable|NextPluggableDevice|PluggableDevice|
|:-|:-:|:-:|
|export ITEX_ENABLE_NEXTPLUGGABLE_DEVICE=1| enabled | disabled |
|export ITEX_ENABLE_NEXTPLUGGABLE_DEVICE=0| disabled | enabled |
|export TF_XLA_FLAGS="--tf_xla_use_device_api=true --tf_xla_auto_jit=2"| enabled | disabled |
|default| enabled | disabled |
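
For example, to force the legacy PluggableDevice path for one run (the workload entry point shown is hypothetical):

```shell
# Illustrative: select the device path before launching TensorFlow.
# The variable name comes from the table above.
export ITEX_ENABLE_NEXTPLUGGABLE_DEVICE=0   # disable NPD, use PluggableDevice
echo "NextPluggableDevice enabled: $ITEX_ENABLE_NEXTPLUGGABLE_DEVICE"
# python train.py                           # hypothetical workload entry point
```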

### Check Currently Used Device Type
To identify which device type is currently in use, check the verbose output as shown below:
```
# Using NextPluggableDevice
tensorflow/core/common_runtime/next_pluggable_device/next_pluggable_device_factory.cc:118] Created 1 TensorFlow NextPluggableDevices. Physical device type: XPU

# Using PluggableDevice
tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device -> physical PluggableDevice (device: 0, name: XPU, pci bus id: <undefined>)
```
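
When scripting around these logs, a simple `grep` on the device-factory line distinguishes the two paths (the sample line below is abbreviated from the verbose output above):

```shell
# Illustrative: classify a TensorFlow startup log line by device type.
log_line='Created 1 TensorFlow NextPluggableDevices. Physical device type: XPU'
if printf '%s\n' "$log_line" | grep -q 'NextPluggableDevice'; then
  device_path='NextPluggableDevice'
else
  device_path='PluggableDevice'
fi
echo "$device_path"   # prints NextPluggableDevice for the sample line
```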

2 changes: 1 addition & 1 deletion latest/_static/documentation_options.js
@@ -1,6 +1,6 @@
var DOCUMENTATION_OPTIONS = {
URL_ROOT: document.getElementById("documentation_options").getAttribute('data-url_root'),
VERSION: '0.1.dev1+gb2bad43',
VERSION: '0.1.dev1+g2e502fc',
LANGUAGE: 'en',
COLLAPSE_INDEX: false,
BUILDER: 'html',
4 changes: 2 additions & 2 deletions latest/docker/README.html
@@ -4,7 +4,7 @@
<meta charset="utf-8" /><meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />

<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Intel® Extension for TensorFlow* Docker Container Guide &mdash; Intel® Extension for TensorFlow* 0.1.dev1+gb2bad43 documentation</title>
<title>Intel® Extension for TensorFlow* Docker Container Guide &mdash; Intel® Extension for TensorFlow* 0.1.dev1+g2e502fc documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../_static/css/theme.css" />
<link rel="stylesheet" type="text/css" href="../_static/custom.css" />
@@ -213,7 +213,7 @@ <h2>Verify That Intel GPU is Accessible From TensorFlow<a class="headerlink" hre
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
<jinja2.runtime.BlockReference object at 0x7f239a56c100>
<jinja2.runtime.BlockReference object at 0x7f97e9b15b40>
<p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>


4 changes: 2 additions & 2 deletions latest/docker/tensorflow-serving/README.html
@@ -4,7 +4,7 @@
<meta charset="utf-8" /><meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />

<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Intel® Extension for TensorFlow* Serving - Docker Container Guide &mdash; Intel® Extension for TensorFlow* 0.1.dev1+gb2bad43 documentation</title>
<title>Intel® Extension for TensorFlow* Serving - Docker Container Guide &mdash; Intel® Extension for TensorFlow* 0.1.dev1+g2e502fc documentation</title>
<link rel="stylesheet" type="text/css" href="../../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../../_static/css/theme.css" />
<link rel="stylesheet" type="text/css" href="../../_static/custom.css" />
@@ -169,7 +169,7 @@ <h2>Running the Container<a class="headerlink" href="#running-the-container" tit
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
<jinja2.runtime.BlockReference object at 0x7f239a37d450>
<jinja2.runtime.BlockReference object at 0x7f97e9b6c250>
<p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>


