
Commit

deploy: 2e502fc
guizili0 committed Mar 25, 2024
1 parent c212e1c commit 913d692
Showing 73 changed files with 438 additions and 150 deletions.
4 changes: 2 additions & 2 deletions latest/CODE_OF_CONDUCT.html
Original file line number Diff line number Diff line change
@@ -4,7 +4,7 @@
<meta charset="utf-8" /><meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />

<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Contributor Covenant Code of Conduct &mdash; Intel® Extension for TensorFlow* 0.1.dev1+gb2bad43 documentation</title>
<title>Contributor Covenant Code of Conduct &mdash; Intel® Extension for TensorFlow* 0.1.dev1+g2e502fc documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="_static/css/theme.css" />
<link rel="stylesheet" type="text/css" href="_static/custom.css" />
@@ -225,7 +225,7 @@ <h2>Attribution<a class="headerlink" href="#attribution" title="Permalink to thi
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
<jinja2.runtime.BlockReference object at 0x7f239a770220>
<jinja2.runtime.BlockReference object at 0x7f97e9b529e0>
<p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>


4 changes: 2 additions & 2 deletions latest/SECURITY.html
@@ -4,7 +4,7 @@
<meta charset="utf-8" /><meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />

<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Security Policy &mdash; Intel® Extension for TensorFlow* 0.1.dev1+gb2bad43 documentation</title>
<title>Security Policy &mdash; Intel® Extension for TensorFlow* 0.1.dev1+g2e502fc documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="_static/css/theme.css" />
<link rel="stylesheet" type="text/css" href="_static/custom.css" />
@@ -120,7 +120,7 @@ <h2>Report a Vulnerability<a class="headerlink" href="#report-a-vulnerability" t
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
<jinja2.runtime.BlockReference object at 0x7f239a3a6f20>
<jinja2.runtime.BlockReference object at 0x7f97e9b17c40>
<p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>


27 changes: 17 additions & 10 deletions latest/_sources/docs/README.md.txt
@@ -43,19 +43,22 @@
<tbody>
<tr>
<td colspan="2" align="center"><a href="guide/environment_variables.html">Environment variables</a></td>
<td colspan="2" align="center"><a href="guide/python_api.html">Python API</a></td>
<td colspan="2" align="center"><a href="guide/advanced_auto_mixed_precision.html">Advanced auto mixed precision</a></td>
<td colspan="2" align="center"><a href="guide/itex_fusion.html">Graph optimization</a></td>
<td colspan="2" align="center"><a href="guide/python_api.html">Python API</a></td>
<td colspan="4" align="center"><a href="guide/next_pluggable_device.html">Next Pluggable Device</a></td>
<td colspan="2" align="center"><a href="guide/threadpool.html">CPU Thread Pool</a></td>
<td colspan="2" align="center"><a href="guide/weight_prepack.html">Weight prepack</a></td>
</tr>
<tr>
<td colspan="2" align="center"><a href="guide/itex_ops.html">Custom operator</a></td>
<td colspan="2" align="center"><a href="guide/itex_ops_override.html">Operator override</a></td>
<td colspan="2" align="center"><a href="guide/INT8_quantization.html">INT8 quantization</a></td>
<td colspan="2" align="center"><a href="guide/XPUAutoShard.html">XPUAutoShard</a></td>
<td colspan="2" align="center"><a href="guide/itex_fusion.html">Graph optimization</a></td>
<td colspan="2" align="center"><a href="guide/itex_ops.html">Custom operator</a></td>
<td colspan="4" align="center"><a href="guide/advanced_auto_mixed_precision.html">Advanced auto mixed precision</a></td>
<td colspan="2" align="center"><a href="guide/itex_ops_override.html">Operator override</a></td>
</tr>
<tr>
<td colspan="3" align="center"><a href="guide/INT8_quantization.html">INT8 quantization</a></td>
<td colspan="2" align="center"><a href="guide/XPUAutoShard.html">XPUAutoShard</a></td>
<td colspan="2" align="center"><a href="guide/how_to_enable_profiler.html">GPU profiler</a></td>
<td colspan="2" align="center"><a href="guide/launch.html">CPU launcher</a></td>
<td colspan="2" align="center"><a href="guide/launch.html">CPU launcher</a></td>
<td colspan="2" align="center"><a href="guide/weight_prepack.html">Weight prepack</a></td>
</tr>
</tbody>
<thead>
@@ -94,9 +94,13 @@
Generally, the default configuration of Intel® Extension for TensorFlow\* provides good performance without any code changes.
Intel® Extension for TensorFlow\* also provides simple frontend Python APIs and utilities for advanced users to get more optimized performance with only minor code changes for different kinds of application scenarios. Typically, you only need to add two or three clauses to the original code.
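
As an illustration of that pattern, here is a minimal sketch (assuming the `intel_extension_for_tensorflow` package is installed; the single override call shown is one example of such a clause, not the full API):

```python
# Minimal sketch, assuming intel_extension_for_tensorflow is installed.
# The ops-override call is one example of the "two or three clauses" the
# text mentions; plain TensorFlow code runs unchanged without it.
try:
    import intel_extension_for_tensorflow as itex
    itex.experimental_ops_override()  # route supported Keras ops to ITEX kernels
except ImportError:
    itex = None  # extension absent: fall back to stock TensorFlow
```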

* Next Pluggable Device (NPD)

The Next Pluggable Device (NPD) represents an advanced generation of TensorFlow plugin mechanisms. It not only facilitates a seamless integration of new accelerator plugins for registering devices with TensorFlow without requiring modifications to the TensorFlow codebase, but it also serves as a conduit to OpenXLA via its PJRT plugin. This innovative approach significantly streamlines the process of extending TensorFlow's capabilities with new hardware accelerators, enhancing both efficiency and flexibility.

* Advanced auto mixed precision (AMP)

Low precision data types `bfloat16` and` float16` are natively supported from the `3rd Generation Xeon® Scalable Processors`, code name [Cooper Lake](https://ark.intel.com/content/www/us/en/ark/products/series/204098/3rd-generation-intel-xeon-scalable-processors.html), with `AVX512` instruction set and the Intel® Data Center GPU, which further boosts performance and uses less memory. The lower-precision data types supported by Advanced Auto Mixed Precision (AMP) are fully enabled in Intel® Extension for TensorFlow*.
Low-precision data types `bfloat16` and `float16` are natively supported by the `3rd Generation Xeon® Scalable Processors`, codenamed [Cooper Lake](https://ark.intel.com/content/www/us/en/ark/products/series/204098/3rd-generation-intel-xeon-scalable-processors.html), with the `AVX512` instruction set, and by the Intel® Data Center GPU, which further boosts performance and uses less memory. The lower-precision data types supported by Advanced Auto Mixed Precision (AMP) are fully enabled in Intel® Extension for TensorFlow*.
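
As a sketch, AMP can be switched on before TensorFlow initializes via environment variables (the variable names follow the ITEX environment-variable guide; treat them as illustrative rather than authoritative):

```python
import os

# Sketch: enable Advanced AMP before TensorFlow starts up. The variable
# names follow the ITEX environment-variable guide; treat as illustrative.
os.environ["ITEX_AUTO_MIXED_PRECISION"] = "1"
# Choose the low-precision type: BFLOAT16 on Cooper Lake CPUs, FLOAT16 on GPU.
os.environ["ITEX_AUTO_MIXED_PRECISION_DATA_TYPE"] = "BFLOAT16"
```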

* Graph optimization

80 changes: 80 additions & 0 deletions latest/_sources/docs/guide/next_pluggable_device.md.txt
@@ -0,0 +1,80 @@
# NextPluggableDevice Overview

The NextPluggableDevice (NPD) represents an advanced generation of [PluggableDevice](https://github.com/tensorflow/community/blob/master/rfcs/20200624-pluggable-device-for-tensorflow.html) mechanism. It not only facilitates a seamless integration of new accelerator plugins for registering devices with TensorFlow without requiring modifications to the TensorFlow codebase, but it also serves as a conduit to [OpenXLA (Accelerated Linear Algebra)](https://github.com/openxla/xla) via its [PJRT plugin](https://github.com/openxla/community/blob/main/rfcs/20230123-pjrt-plugin.html).

- [Overview](#NextPluggableDevice-Overview)

- [Why NextPluggableDevice](#Why-NextPluggableDevice)

- [Starting With NextPluggableDevice](#How-to-start-with-XLA-using-NextPluggableDevice)

- [Architecture](#NextPluggableDevice-Architecture)

- [Runtime Switch](#Runtime-Switch-of-NextPluggableDevice-and-PluggableDevice)

## Why NextPluggableDevice

Previously, stock TensorFlow designed and developed the [PluggableDevice](https://github.com/tensorflow/community/blob/master/rfcs/20200624-pluggable-device-for-tensorflow.html) mechanism to support new device extensions without device-specific changes to the TensorFlow code; today, PluggableDevice is tightly integrated with the [StreamExecutor C API](https://github.com/tensorflow/community/pull/257).

However, this tight coupling to StreamExecutor made it difficult for PluggableDevice to stay compatible with [OpenXLA](https://github.com/openxla/xla). For precisely this reason, TensorFlow evolved to use the [PJRT plugin](https://github.com/openxla/community/blob/main/rfcs/20230123-pjrt-plugin.html) as its device API, decoupling the pluggable device from StreamExecutor; the result is implemented as NextPluggableDevice.

## Start with XLA using NextPluggableDevice

Enabling XLA in ITEX is exactly the same as it is in TensorFlow, except that you need to export environment variables first:
```
$ export TF_XLA_FLAGS="--tf_xla_use_device_api=true --tf_xla_auto_jit=2"
$ python
>>> import tensorflow as tf # TensorFlow registers NextPluggableDevice here
>>> @tf.function(experimental_compile=True)
... def add_with_xla(a, b):
... return a + b
>>> a = tf.constant([1.0, 2.0, 3.0])
>>> b = tf.constant([4.0, 5.0, 6.0])
>>> result = add_with_xla(a, b)
>>> print("Result: ", result)
Result: tf.Tensor([5. 7. 9.], shape=(3,), dtype=float32)
```

## NextPluggableDevice Architecture

The NextPluggableDevice represents an advanced generation of the [PluggableDevice](https://github.com/tensorflow/community/blob/master/rfcs/20200624-pluggable-device-for-tensorflow.html) mechanism. Intel® Extension for TensorFlow* integrates the NextPluggableDevice as a new device type, along with the corresponding [PJRT CAPI](https://github.com/tensorflow/tensorflow/blob/master/third_party/xla/xla/pjrt/c/pjrt_c_api.h) for registering its Ops & Kernels, XLA PJRT client, Runtime, as well as the legacy Graph Optimization API and Profiler interface. In this way, it not only facilitates a seamless integration of new accelerator plugins for registering devices with TensorFlow without requiring modifications to the TensorFlow codebase, but it also serves as a conduit to [OpenXLA](https://github.com/openxla/xla) via its [PJRT plugin](https://github.com/openxla/community/blob/main/rfcs/20230123-pjrt-plugin.html).
<p align="center">
<img src="images/npd_architecture.png" alt="npd_architecture.png" />
</p>

### OpenXLA PJRT Plugin

[PJRT](https://github.com/tensorflow/tensorflow/blob/master/third_party/xla/xla/pjrt/c/pjrt_c_api.h) is a uniform Device API in the OpenXLA ecosystem. The long term vision for PJRT is that: (1) frameworks (TensorFlow, PyTorch, JAX, etc.) will call PJRT, which has device-specific implementations that are opaque to the frameworks; (2) each device focuses on implementing PJRT APIs, and can remain opaque to the frameworks.
<p align="center">
<img src="images/openxla_pjrt.png" alt="openxla_pjrt.png" />
</p>


The PJRT API will provide an easy interface with which frameworks can integrate a packaged compiler and runtime solution. It will be the supported interface used by TensorFlow and JAX for all compiler and runtime integration, and as such it will be easy for other compilers and runtimes that implement the PJRT interface to integrate with these systems.

## Runtime Switch of NextPluggableDevice and PluggableDevice
Intel® Extension for TensorFlow* is fully integrated with both PluggableDevice and NextPluggableDevice, and provides a runtime switch mechanism for enhancing both efficiency and flexibility. Figure 3 presents the architectures of the PluggableDevice and NextPluggableDevice.

<p align="center">
<img src="images/npd_pd_architecture.png" alt="npd_pd_architecture.png" />
</p>

Intel® Extension for TensorFlow* offers environment variables to enable or disable each device. Export the variables listed in the table below in your runtime environment to selectively enable or disable the corresponding device.
|Environment Variable|NextPluggableDevice|PluggableDevice|
|:-|:-:|:-:|
|export ITEX_ENABLE_NEXTPLUGGABLE_DEVICE=1| enabled | disabled |
|export ITEX_ENABLE_NEXTPLUGGABLE_DEVICE=0| disabled | enabled |
|export TF_XLA_FLAGS="--tf_xla_use_device_api=true --tf_xla_auto_jit=2"| enabled | disabled |
|default| enabled | disabled |
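
For example, to force the legacy PluggableDevice path for one run (the workload entry point shown is hypothetical):

```shell
# Illustrative: select the device path before launching TensorFlow.
# The variable name comes from the table above.
export ITEX_ENABLE_NEXTPLUGGABLE_DEVICE=0   # disable NPD, use PluggableDevice
echo "NextPluggableDevice enabled: $ITEX_ENABLE_NEXTPLUGGABLE_DEVICE"
# python train.py                           # hypothetical workload entry point
```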

### Check Currently Used Device Type
To identify which device type is currently in use, check the verbose output as shown below:
```
# Using NextPluggableDevice
tensorflow/core/common_runtime/next_pluggable_device/next_pluggable_device_factory.cc:118] Created 1 TensorFlow NextPluggableDevices. Physical device type: XPU

# Using PluggableDevice
tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device -> physical PluggableDevice (device: 0, name: XPU, pci bus id: <undefined>)
```
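
When scripting around these logs, a simple `grep` on the device-factory line distinguishes the two paths (the sample line below is abbreviated from the verbose output above):

```shell
# Illustrative: classify a TensorFlow startup log line by device type.
log_line='Created 1 TensorFlow NextPluggableDevices. Physical device type: XPU'
if printf '%s\n' "$log_line" | grep -q 'NextPluggableDevice'; then
  device_path='NextPluggableDevice'
else
  device_path='PluggableDevice'
fi
echo "$device_path"   # prints NextPluggableDevice for the sample line
```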

2 changes: 1 addition & 1 deletion latest/_static/documentation_options.js
@@ -1,6 +1,6 @@
var DOCUMENTATION_OPTIONS = {
URL_ROOT: document.getElementById("documentation_options").getAttribute('data-url_root'),
VERSION: '0.1.dev1+gb2bad43',
VERSION: '0.1.dev1+g2e502fc',
LANGUAGE: 'en',
COLLAPSE_INDEX: false,
BUILDER: 'html',
4 changes: 2 additions & 2 deletions latest/docker/README.html
@@ -4,7 +4,7 @@
<meta charset="utf-8" /><meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />

<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Intel® Extension for TensorFlow* Docker Container Guide &mdash; Intel® Extension for TensorFlow* 0.1.dev1+gb2bad43 documentation</title>
<title>Intel® Extension for TensorFlow* Docker Container Guide &mdash; Intel® Extension for TensorFlow* 0.1.dev1+g2e502fc documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../_static/css/theme.css" />
<link rel="stylesheet" type="text/css" href="../_static/custom.css" />
@@ -213,7 +213,7 @@ <h2>Verify That Intel GPU is Accessible From TensorFlow<a class="headerlink" hre
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
<jinja2.runtime.BlockReference object at 0x7f239a56c100>
<jinja2.runtime.BlockReference object at 0x7f97e9b15b40>
<p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>


4 changes: 2 additions & 2 deletions latest/docker/tensorflow-serving/README.html
@@ -4,7 +4,7 @@
<meta charset="utf-8" /><meta name="generator" content="Docutils 0.19: https://docutils.sourceforge.io/" />

<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Intel® Extension for TensorFlow* Serving - Docker Container Guide &mdash; Intel® Extension for TensorFlow* 0.1.dev1+gb2bad43 documentation</title>
<title>Intel® Extension for TensorFlow* Serving - Docker Container Guide &mdash; Intel® Extension for TensorFlow* 0.1.dev1+g2e502fc documentation</title>
<link rel="stylesheet" type="text/css" href="../../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../../_static/css/theme.css" />
<link rel="stylesheet" type="text/css" href="../../_static/custom.css" />
@@ -169,7 +169,7 @@ <h2>Running the Container<a class="headerlink" href="#running-the-container" tit
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
<jinja2.runtime.BlockReference object at 0x7f239a37d450>
<jinja2.runtime.BlockReference object at 0x7f97e9b6c250>
<p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>


