Improve documentation around formatting assets and annotations in a dataset (#681)

* Improve documentation around formatting assets and annotations in a dataset
* Explicitly specify supported file extensions for assets
* Improve documentation around using npz files for PointCloudAsset
* Fix documentation references
* Update docs/dataset/advanced-usage/dataset-formatting/computer-vision.md

Co-authored-by: Miller Peterson <miller.peterson@gmail.com>
kolenaIO · Sep 18, 2024
commit 9618a50 · 1 parent 4908c64

Showing 3 changed files with 56 additions and 14 deletions.
diff --git a/docs/dataset/advanced-usage/dataset-formatting/computer-vision.md b/docs/dataset/advanced-usage/dataset-formatting/computer-vision.md
@@ -72,7 +72,40 @@ Below is an example datapoint:
     | --- | --- | --- | --- | --- |
     | `s3://kolena-examples/data/h0.png`| `s3://kolena-examples/data/thumbnail/h0.png` | horse | 153.994 | 84.126 |
 
-## 2D Object Detection
+### Including Assets and Annotations
+
+Kolena supports the inclusion of overlay annotations and asset files as fields in a dataset.
+
+We recommend using the [annotation](../../../reference/annotation.md) and [asset](../../../reference/asset.md) dataclasses
+for ease of annotation and asset manipulation:
+
+```python
+# Creates a single-row DataFrame with an image datapoint, a `bbox` annotation field, and a `mesh` asset file.
+
+import pandas as pd
+from kolena.annotation import BoundingBox
+from kolena.asset import MeshAsset
+
+locator = "s3://kolena-public-examples/coco-2014-val/data/COCO_val2014_000000000294.jpg"
+bbox = BoundingBox(top_left=(27.7, 69.83), bottom_right=(392.61, 427))
+mesh = MeshAsset(locator="s3://kolena-public-examples/a-large-dataset-of-object-scans/data/mesh/00004.ply")
+df = pd.DataFrame([dict(locator=locator, bbox=bbox, mesh=mesh)])
+
+# DataFrame can now be directly uploaded as a dataset
+from kolena.dataset import upload_dataset
+upload_dataset("my-dataset", df, id_fields=["locator"])
+
+# Or serialized to CSV and uploaded through the web UI.
+# If serializing to CSV, use the provided `kolena.io.dataframe_to_csv` function. The pandas `to_csv` method
+# does not adhere to the JSON spec and may serialize malformed objects.
+from kolena.io import dataframe_to_csv
+
+dataframe_to_csv(df, "my-dataset.csv", index=False)
+```
+
+## Specific Workflows
+
+### 2D Object Detection
 
 !!! example
     You can follow this [example 2D object detection ↗](https://github.com/kolenaIO/kolena/blob/trunk/examples/dataset/object_detection_2d/object_detection_2d/upload_dataset.py)
@@ -117,7 +150,7 @@ bboxes = [
     instead of [`pandas.to_csv()`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html).
     `pandas.to_csv` does not serialize Kolena annotation objects in a way that is compatible with the platform.
 
-### Uploading Model Results
+#### Uploading Model Results
 
 Model results contain your model inferences as well as any custom metrics that you wish to monitor on Kolena.
 The data structure of model results is very similar to the structure of a dataset with minor differences.
@@ -151,7 +184,7 @@ to compute your metrics that are supported by Kolena's [Object Detection Task Me
     Follow the [2D Object Detection result upload](https://github.com/kolenaIO/kolena/blob/trunk/examples/dataset/object_detection_2d/object_detection_2d/upload_results.py)
     example for optimal setup.
 
-## 3D Object Detection
+### 3D Object Detection
 
 [`annotations`](../../../reference/annotation.md) are used to visualize overlays on top of images.
 To render 3D Bounding boxes you can use
@@ -181,7 +214,7 @@ To render 3D Bounding boxes you can use
     [`pandas.to_csv()`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html).
     `pandas.to_csv` does not serialize Kolena annotation objects in a way that is compatible with the platform.
 
-### Uploading Model Results
+#### Uploading Model Results
 
 Model results contain your model inferences as well as any custom metrics that you wish to monitor on Kolena.
 The data structure of model results is very similar to the structure of a dataset with minor differences.
@@ -202,7 +235,7 @@ to compute your metrics that are supported by Kolena's [Object Detection Task Me
     Follow the [3D Object Detection result upload script](https://github.com/kolenaIO/kolena/blob/trunk/examples/dataset/object_detection_3d/object_detection_3d/upload_results.py)
     on how to setup both 3D and 2D bounding boxes in your results for multi-modal 3D object detection data.
 
-## Video
+### Video
 
 Videos are best represented in Kolena using the Gallery view. To setup the Gallery view, add links to your video files
 stored on the cloud under the `locator` column. Kolena automatically looks for that column name and renders your video files
@@ -213,7 +246,7 @@ Kolena supports `mov`, `mp4`, `mpeg` and other web browser supported video types
     Bounding box visualization only works for videos with `5`, `15`, `29.97` and `59.94` frame rates.
     Please let us know if you are working with a frame rate outside of the ones mentioned.
 
-### Setting up bounding box annotations on videos
+#### Setting up bounding box annotations on videos
 
 To overlay bounding boxes on videos, you will need to define a new class based on
 [`LabeledBoundingBox`](../../../reference/annotation.md#kolena.annotation.LabeledBoundingBox) or

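The `to_csv` warning in the hunks above can be illustrated without Kolena or pandas. The sketch below uses only the standard library and a hypothetical `bbox` dictionary as a stand-in for an annotation object. It assumes the failure mode is the usual one for nested objects: stringifying them yields a Python `repr`, which is not valid JSON, while an explicit JSON dump produces a cell that parses back cleanly. It is not Kolena's serializer, only a demonstration of why a JSON-aware writer is needed.

```python
import csv
import io
import json

# Hypothetical stand-in for an annotation object; not a Kolena type.
bbox = {"top_left": [27.7, 69.83], "bottom_right": [392.61, 427], "label": None}


def write_cell(value: str) -> str:
    """Write a single-cell CSV row to memory and read the cell back."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow([value])
    buffer.seek(0)
    return next(csv.reader(buffer))[0]


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# str() gives a Python repr: single-quoted keys and `None`, neither of which is JSON.
repr_cell = write_cell(str(bbox))
# json.dumps gives double-quoted keys and `null`, which survive the CSV round trip.
json_cell = write_cell(json.dumps(bbox))

print(is_valid_json(repr_cell))
print(is_valid_json(json_cell))
```

In practice the documented `kolena.io.dataframe_to_csv` should be preferred over hand-rolled serialization like this.
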
diff --git a/docs/dataset/core-concepts/index.md b/docs/dataset/core-concepts/index.md
@@ -66,20 +66,21 @@ as text. Below table outlines what extensions are supported for optimal visualiz
 |----------------|---------------------------------------------------------------------------------------|
 | Image          | `jpg`, `jpeg`, `png`, `gif`, `bmp` and other web browser supported image types.       |
 | Audio          | `flac`, `mp3`, `wav`, `acc`, `ogg`, `ra` and other web browser supported audio types. |
-| Video          | `mov`, `mp4`, `mpeg`, `avi` and other web browser supported video types.              |
+| Video          | `mov`, `mp4`, `mpeg` and other web browser supported video types.                     |
 | Document       | `txt` and `pdf` files.                                                                |
 | Point Cloud    | `pcd` files.                                                                          |
 
 **Assets**: allow you to connect multiple referenced files to each datapoint for visualization and analysis.
 Multiple assets can be attached to a single datapoint.
 
-| Asset Type                                                              | Description                                                    |
-|-------------------------------------------------------------------------|----------------------------------------------------------------|
-| [`ImageAsset`](../../reference/asset.md#kolena.asset.ImageAsset)           | Useful if your data is modeled as multiple related images.     |
-| [`BinaryAsset`](../../reference/asset.md#kolena.asset.BinaryAsset)         | Useful if you want to attach any segmentation or bitmap masks. |
-| [`AudioAsset`](../../reference/asset.md#kolena.asset.AudioAsset)           | Useful if you want to attach an audio file.                    |
-| [`VideoAsset`](../../reference/asset.md#kolena.asset.VideoAsset)           | Useful if you want to attach a video file.                     |
-| [`PointCloudAsset`](../../reference/asset.md#kolena.asset.PointCloudAsset) | Useful for attaching 3D point cloud data.                      |
+| Asset Type                                                                 | Description                                                    | Supported Extensions          |
+|----------------------------------------------------------------------------|----------------------------------------------------------------|-------------------------------|
+| [`ImageAsset`](../../reference/asset.md#kolena.asset.ImageAsset)           | Useful if your data is modeled as multiple related images.     | Same as above reference files |
+| [`BinaryAsset`](../../reference/asset.md#kolena.asset.BinaryAsset)         | Useful if you want to attach any segmentation or bitmap masks. | Any, including `.bin` files   |
+| [`AudioAsset`](../../reference/asset.md#kolena.asset.AudioAsset)           | Useful if you want to attach an audio file.                    | Same as above reference files |
+| [`VideoAsset`](../../reference/asset.md#kolena.asset.VideoAsset)           | Useful if you want to attach a video file.                     | Same as above reference files |
+| [`PointCloudAsset`](../../reference/asset.md#kolena.asset.PointCloudAsset) | Useful for attaching 3D point cloud data.                      | `.pcd`, `.npy`, `.npz`        |
+| [`MeshAsset`](../../reference/asset.md#kolena.asset.MeshAsset)             | Useful for attaching and visualizing 3D mesh files.            | `.ply`                        |
 
 **Annotations**: allow you to visualize overlays on top of datapoints through the use of [`annotation`](../../reference/annotation.md).
 We currently support 10 different types of annotations each enabling a specific modality.

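The new "Supported Extensions" column lends itself to a quick check before upload. The following is a minimal sketch, not part of the `kolena` package: `SUPPORTED_EXTENSIONS` and `has_supported_extension` are hypothetical names, and only the asset types with a closed extension list in the table (`PointCloudAsset`, `MeshAsset`) are covered. Types whose list is open-ended ("other web browser supported types") or unrestricted (`BinaryAsset`) are passed through.

```python
from pathlib import PurePosixPath

# Extensions copied from the asset table; only types with a closed list are included.
SUPPORTED_EXTENSIONS = {
    "PointCloudAsset": {".pcd", ".npy", ".npz"},
    "MeshAsset": {".ply"},
}


def has_supported_extension(asset_type: str, locator: str) -> bool:
    """Hypothetical pre-upload check; not a Kolena API."""
    allowed = SUPPORTED_EXTENSIONS.get(asset_type)
    if allowed is None:
        # No closed list is documented for this type, so nothing is rejected.
        return True
    # Compare case-insensitively on the final path component's suffix.
    return PurePosixPath(locator).suffix.lower() in allowed


mesh_locator = "s3://kolena-public-examples/a-large-dataset-of-object-scans/data/mesh/00004.ply"
print(has_supported_extension("MeshAsset", mesh_locator))
print(has_supported_extension("PointCloudAsset", mesh_locator))
```

Whether the platform itself matches extensions case-insensitively is not stated in the docs; the `.lower()` call is an assumption of this sketch.
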
diff --git a/kolena/asset.py b/kolena/asset.py
@@ -96,6 +96,14 @@ class PointCloudAsset(Asset):
     """
     A three-dimensional point cloud located in a cloud bucket. Points are assumed to be specified in a right-handed,
     Z-up coordinate system with the origin around the sensor that captured the point cloud.
+
+    PointCloudAsset supports the following extensions: `.pcd`, `.npy`, and `.npz`.
+
+    If using an `.npy` or `.npz` file format, PointCloudAsset expects a (N, 3) or (N, 4) shaped numpy array, with each
+    row as an array of `(x, y, z, [intensity])` values.
+
+    If using an `.npz` file, include an `npz_key` when initializing to specify which array in the archive to load:
+    `PointCloudAsset(locator="s3://my-bucket/path/to/my-point-cloud.npz", npz_key="points")`
     """
 
     locator: str
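
The docstring added above describes the expected array layout for `.npz` point clouds. As a minimal sketch of preparing such a file with NumPy, the hypothetical helper `to_point_cloud_npz` below validates the `(N, 3)` or `(N, 4)` shape and stores the array under a named key. The helper name, the sample points, and the in-memory buffer are assumptions for illustration; the upload to a bucket is omitted, and the `PointCloudAsset` call is left as a comment copied from the docstring.

```python
import io

import numpy as np


def to_point_cloud_npz(points: np.ndarray, key: str = "points") -> bytes:
    """Validate an (N, 3) or (N, 4) array and serialize it as .npz bytes under `key`."""
    if points.ndim != 2 or points.shape[1] not in (3, 4):
        raise ValueError(f"expected shape (N, 3) or (N, 4), got {points.shape}")
    buffer = io.BytesIO()
    np.savez(buffer, **{key: points})
    return buffer.getvalue()


# Four illustrative points, one row each, with (x, y, z, intensity) columns.
cloud = np.array(
    [
        [0.0, 0.0, 0.0, 0.10],
        [1.0, 0.0, 0.5, 0.25],
        [0.0, 2.0, 1.0, 0.50],
        [1.5, 2.5, 0.2, 0.90],
    ]
)
payload = to_point_cloud_npz(cloud)

# Round-trip check: the array is retrievable by the same key that `npz_key` would name.
loaded = np.load(io.BytesIO(payload))["points"]
print(loaded.shape)

# After writing `payload` to cloud storage, the docstring's example applies:
# from kolena.asset import PointCloudAsset
# asset = PointCloudAsset(locator="s3://my-bucket/path/to/my-point-cloud.npz", npz_key="points")
```
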