From 4e8a402e1518792c233e271a2a95c8fb73b0852e Mon Sep 17 00:00:00 2001 From: Alan Chen Date: Tue, 20 Aug 2024 17:27:36 -0700 Subject: [PATCH] fix: fluvio/quickstart, sync w/ versioned_docs (#211) - **fix fluvio/quickstart** - **resync versioned_docs** --- docs/fluvio/quickstart.mdx | 12 +- .../connectors/developers/overview.mdx | 4 +- .../connectors/troubleshooting.mdx | 69 +++++++ .../version-0.11.11/fluvio/quickstart.mdx | 12 +- .../tutorials/config/http-cat-fact-basic.yaml | 12 ++ .../config/http-cat-fact-transform.yaml | 19 ++ .../fluvio/tutorials/config/sql-cat-fact.yaml | 27 +++ .../fluvio/tutorials/output-sql.mdx | 195 ++++++++++++++++++ .../fluvio/tutorials/source-http-basic.mdx | 155 ++++++++++++++ .../tutorials/source-http-transformation.mdx | 139 +++++++++++++ 10 files changed, 639 insertions(+), 5 deletions(-) create mode 100644 versioned_docs/version-0.11.11/connectors/troubleshooting.mdx create mode 100644 versioned_docs/version-0.11.11/fluvio/tutorials/config/http-cat-fact-basic.yaml create mode 100644 versioned_docs/version-0.11.11/fluvio/tutorials/config/http-cat-fact-transform.yaml create mode 100644 versioned_docs/version-0.11.11/fluvio/tutorials/config/sql-cat-fact.yaml create mode 100644 versioned_docs/version-0.11.11/fluvio/tutorials/output-sql.mdx create mode 100644 versioned_docs/version-0.11.11/fluvio/tutorials/source-http-basic.mdx create mode 100644 versioned_docs/version-0.11.11/fluvio/tutorials/source-http-transformation.mdx diff --git a/docs/fluvio/quickstart.mdx b/docs/fluvio/quickstart.mdx index 91dda058..2e711017 100644 --- a/docs/fluvio/quickstart.mdx +++ b/docs/fluvio/quickstart.mdx @@ -111,7 +111,17 @@ meta: name: http-quotes type: http-source topic: quotes -http:{#my-explicit-id} +http: + endpoint: https://demo-data.infinyon.com/api/quote + interval: 3s +``` + +### Running the HTTP Connector + +We'll use [Connector Developer Kit (cdk)] to download and run the connector. 
+ +```bash copy="fl" +$ cdk hub download infinyon/http-source@0.3.8 ``` ```bash copy="fl" diff --git a/versioned_docs/version-0.11.11/connectors/developers/overview.mdx b/versioned_docs/version-0.11.11/connectors/developers/overview.mdx index 406a4cca..575606d9 100644 --- a/versioned_docs/version-0.11.11/connectors/developers/overview.mdx +++ b/versioned_docs/version-0.11.11/connectors/developers/overview.mdx @@ -67,9 +67,6 @@ These connectors are not guaranteed to work with latest fluvio: * https://github.com/infinyon/labs-redis-sink-connector * https://github.com/infinyon/duckdb-connector - - -[Rust Installation Guide]: https://www.rust-lang.org/tools/install [Fluvio Connector Development Kit (CDK)]: ../cdk.mdx [Generate a Connector]: ./generate.mdx [Build and Test]: ./build.mdx @@ -77,3 +74,4 @@ These connectors are not guaranteed to work with latest fluvio: [Logging]: ./logging.mdx [Secrets]: ./secrets.mdx [Publish to Connector Hub]: ./publish.mdx +[Install Rust]: https://www.rust-lang.org/tools/install diff --git a/versioned_docs/version-0.11.11/connectors/troubleshooting.mdx b/versioned_docs/version-0.11.11/connectors/troubleshooting.mdx new file mode 100644 index 00000000..1363cc79 --- /dev/null +++ b/versioned_docs/version-0.11.11/connectors/troubleshooting.mdx @@ -0,0 +1,69 @@ +--- +sidebar_position: 200 +title: "Troubleshooting" +description: "Connector Troubleshooting" +--- + +# Connector Build Troubleshooting + +## Multiplatform builds + +Connectors on the Hub can be published for multiple targets so that +they can be run on multiple platforms. This can be accomplished by building with cdk on +the same target it's intended for, or cross compiling for one target from another +platform. Compiling for one target while on another target can be complex, so +this troubleshooting section provides added target toolchain support information +for some common platform/target combinations. 
+ +## MacOS + +For local builds, if `cdk build` does not work, explicitly specify +the target: + +```bash +cdk build --target aarch64-apple-darwin +``` + +## Ubuntu or Debian based Linux Distributions {#ubuntu-debian} + +Build prerequisites for `x86_64-unknown-linux-musl` on Ubuntu from an +`x86_64-unknown-linux-gnu` environment. + +System packages: +```bash +sudo apt install build-essential musl-tools +``` + +To build and test locally, use the following command instead of +`cdk build`: + +```bash +CARGO_TARGET_X86_64_UNKNOWN_LINUX_MUSL_LINKER=x86_64-linux-musl-gcc cdk build +``` + +## Other Rust cargo cross-platform build toolchains + +Connector projects are Rust projects, and different choices exist for cross +compilation. A connector binary built by a Rust toolchain can be +published by `cdk` with the `--no-build` flag. Cross compilation +projects for Rust include: + +- Cargo cross https://github.com/cross-rs/cross +- Cargo zigbuild https://github.com/rust-cross/cargo-zigbuild + +For a working example, see the [connector-publish github workflow]. + +## Windows, WSL, WSL2 + +Fluvio and `cdk` are supported on Windows only through WSL2. WSL2 often installs +Ubuntu as the default Linux distribution. See the [Ubuntu] section for more +build troubleshooting. + +## Infinyon Cloud Certified Connectors + +Infinyon Certified connectors are built for the `aarch64-unknown-linux-musl` target. + +To build and publish for the cloud, InfinyOn often uses a [connector-publish github workflow]. 
+ +[Ubuntu]: #ubuntu-debian +[connector-publish github workflow]: https://github.com/infinyon/fluvio/blob/master/.github/workflows/connector-publish.yml diff --git a/versioned_docs/version-0.11.11/fluvio/quickstart.mdx b/versioned_docs/version-0.11.11/fluvio/quickstart.mdx index 91dda058..2e711017 100644 --- a/versioned_docs/version-0.11.11/fluvio/quickstart.mdx +++ b/versioned_docs/version-0.11.11/fluvio/quickstart.mdx @@ -111,7 +111,17 @@ meta: name: http-quotes type: http-source topic: quotes -http:{#my-explicit-id} +http: + endpoint: https://demo-data.infinyon.com/api/quote + interval: 3s +``` + +### Running the HTTP Connector + +We'll use [Connector Developer Kit (cdk)] to download and run the connector. + +```bash copy="fl" +$ cdk hub download infinyon/http-source@0.3.8 ``` ```bash copy="fl" diff --git a/versioned_docs/version-0.11.11/fluvio/tutorials/config/http-cat-fact-basic.yaml b/versioned_docs/version-0.11.11/fluvio/tutorials/config/http-cat-fact-basic.yaml new file mode 100644 index 00000000..6a4c74ea --- /dev/null +++ b/versioned_docs/version-0.11.11/fluvio/tutorials/config/http-cat-fact-basic.yaml @@ -0,0 +1,12 @@ + +apiVersion: 0.1.0 +meta: + version: 0.3.8 + name: cat-facts + type: http-source + topic: cat-facts + create-topic: true + +http: + endpoint: "https://catfact.ninja/fact" + interval: 10s diff --git a/versioned_docs/version-0.11.11/fluvio/tutorials/config/http-cat-fact-transform.yaml b/versioned_docs/version-0.11.11/fluvio/tutorials/config/http-cat-fact-transform.yaml new file mode 100644 index 00000000..a9c42aa2 --- /dev/null +++ b/versioned_docs/version-0.11.11/fluvio/tutorials/config/http-cat-fact-transform.yaml @@ -0,0 +1,19 @@ +apiVersion: 0.1.0 +meta: + version: 0.3.8 + name: cat-facts-transformed + type: http-source + topic: cat-facts-data-transform + create-topic: true + +http: + endpoint: https://catfact.ninja/fact + interval: 10s + +transforms: + - uses: infinyon/jolt@0.4.1 + with: + spec: + - operation: default + spec: + source: 
"http" diff --git a/versioned_docs/version-0.11.11/fluvio/tutorials/config/sql-cat-fact.yaml b/versioned_docs/version-0.11.11/fluvio/tutorials/config/sql-cat-fact.yaml new file mode 100644 index 00000000..f525a306 --- /dev/null +++ b/versioned_docs/version-0.11.11/fluvio/tutorials/config/sql-cat-fact.yaml @@ -0,0 +1,27 @@ +# sql.yaml +apiVersion: 0.1.0 +meta: + name: simple-cat-facts-sql + type: sql-sink + version: 0.4.3 + topic: cat-facts +sql: + url: "postgres://user:password@db.postgreshost.example/dbname" +transforms: + - uses: infinyon/json-sql@0.2.1 + invoke: insert + with: + mapping: + table: "animalfacts" + map-columns: + "length": + json-key: "length" + value: + type: "int" + default: "0" + required: true + "raw_fact_json": + json-key: "$" + value: + type: "jsonb" + required: true \ No newline at end of file diff --git a/versioned_docs/version-0.11.11/fluvio/tutorials/output-sql.mdx b/versioned_docs/version-0.11.11/fluvio/tutorials/output-sql.mdx new file mode 100644 index 00000000..dd54022b --- /dev/null +++ b/versioned_docs/version-0.11.11/fluvio/tutorials/output-sql.mdx @@ -0,0 +1,195 @@ +--- +sidebar_position: 3 +title: "Streaming data to SQL" +description: "Part 3 of HTTP to SQL Sink tutorial series." +--- + + +# Prerequisites + +This guide uses a `local` Fluvio cluster. If you need to install it, please follow the instructions [here][installation]! + +We will be using a `Postgres` database. You can download it and set it up from the [PostgreSQL] website for your OS. Alternatively, use a cloud service like [ElephantSQL]. + + +# Introduction + +In previous tutorials, we have seen how to read data from external sources and write it to a Fluvio topic. In this tutorial, we will go through how to sink data from a Fluvio topic to an external sink such as a database. + +We will use `sink` connectors. All `sink` connectors consume data from a Fluvio topic and write it to an external system. 
In particular, we will use the `SQL Sink Connector`, which can write to a PostgreSQL or SQLite database. + +Since this connector targets a `SQL` database, the configuration is concerned with mapping the JSON data to SQL columns. The Sink Connector will perform these steps: + +- Read data from the topic. +- Transform the data to a SQL insert statement. +- Send the SQL insert statement to the database. + +The SQL transformation is done using a SmartModule, which allows you to plug in different transformation logic if needed. + +We will be using the topic from the first tutorial, [Streaming from HTTP Source], which streams data to the `cat-facts` topic. Please run that tutorial first to set up the topic. + +As in previous tutorials, we will use `cdk` to manage the connectors. Run the following command to download the connector from the Hub. + +```bash +$ cdk hub download infinyon/sql-sink@0.4.3 +``` + +Then download the SQL SmartModule from the Hub. + +```bash +$ fluvio hub sm download infinyon/json-sql@0.2.1 +``` + +You should then see two SmartModules downloaded, assuming you have already downloaded the `jolt` SmartModule in the previous tutorial. + +```bash +$ fluvio sm list + SMARTMODULE SIZE + infinyon/json-sql@0.2.1 559.6 KB + infinyon/jolt@0.4.1 589.3 KB +``` + +# Sink Connector configuration + +Copy and paste the following config and save it as `sql-cat-fact.yaml`. + +import CodeBlock from '@theme/CodeBlock'; +import CatSQL from '!!raw-loader!./config/sql-cat-fact.yaml'; + +{CatSQL} + +This configuration will read data from the `cat-facts` topic and insert it into the `animalfacts` table in the database. The `json-sql` SmartModule will transform the JSON data into a SQL insert statement. + +Please change the line containing `url` to your database connection string. + +## SQL Mapping + +The SmartModule `json-sql` implements a domain specific language (DSL) to specify a transformation of input JSON to a SQL insert statement. It uses a model similar to a [Django Model], where SQL tables are abstracted into a model. 
The model is then used to generate a SQL insert statement. + +The mapping is designed for translating JSON into SQL. Each column of the table is mapped from a JSON expression. + +For example, here is the mapping for the `length` column: + +```yaml + "length": + json-key: "length" + value: + type: "int" + default: "0" + required: true +``` + +This mapping will take the `length` field from the JSON and insert it into the `length` column in the table. If the `length` field is not found, it will use the default value of `0`. + +# Setting up the Database + +In order to run the connector, you need to create a table in your database. Run the following SQL command in the postgres CLI: + +```sql +# create table animalfacts(length integer, raw_fact_json jsonb); +``` + +You can confirm the table is created: + +```sql +# select * from animalfacts; + length | raw_fact_json +--------+--------------- +(0 rows) + +``` + + +Once you have the config file, you can create the connector using the `cdk deploy start` command. + +```bash +$ cdk deploy start --ipkg infinyon-sql-sink-0.4.3.ipkg --config ./sql-cat-fact.yaml +``` + +You can use `cdk deploy list` to view the status of the connector. + +```bash +$ cdk deploy list + NAME STATUS + simple-cat-facts-sql Running +``` + +# Generating and checking the data + +Fluvio topics allow you to decouple the data source from the data sink. This means both source and sink can be run independently without affecting each other. +You can run the source connector to generate data but it is not required for this demo. + +Here, we will manually produce the same data as in the previous tutorial to the `cat-facts` topic. This way we can control the data and see how it is written to the database. +By default, the sink connector will consume data from the end of the topic, which means it will ignore existing data in the topic. + +Let's produce a single record to the topic. + +``` +$ fluvio produce cat-facts +{"fact":"A cat’s jaw can’t move sideways, so a cat can’t chew large chunks of food.","length":74} +Ok! 
+``` + +Then you can query the database to see the record. + +```sql +# select * from animalfacts; + length | raw_fact_json +--------+------------------------------------------------------------------------------------------------------ + 74 | {"fact": "A cat’s jaw can’t move sideways, so a cat can’t chew large chunks of food.", "length": 74} +(1 row) +``` + +You can add more records to the topic and see how the SQL connector inserts the data into the database. + +``` +$ fluvio produce cat-facts +{"fact":"Unlike humans, cats are usually lefties. Studies indicate that their left paw is typically their dominant paw.","length":110} +Ok! +``` + +```sql +# select * from animalfacts; + length | raw_fact_json +--------+------------------------------------------------------------------------------------------------------------------------------------------- + 74 | {"fact": "A cat’s jaw can’t move sideways, so a cat can’t chew large chunks of food.", "length": 74} + 110 | {"fact": "Unlike humans, cats are usually lefties. Studies indicate that their left paw is typically their dominant paw.", "length": 110} +(2 rows) +``` + +# Cleaning up + +As in the previous tutorials, use `cdk deploy shutdown` to stop the connector. + + +## Conclusion + +This tutorial showed you how to sink data from a Fluvio topic to a SQL database. You can use the same concept to sink data to other databases or systems. + +You can combine this tutorial with previous tutorials to create a complete data pipeline from source to sink. This just requires deploying multiple connectors. + +With Fluvio's event driven architecture, sources and sinks run independently and do not affect each other. You can also chain together multiple sources and sinks to create complex data pipelines. 
+ +## Reference + +* [Fluvio CLI Produce] +* [Fluvio CLI Consume] +* [Fluvio CLI Topic] +* [Fluvio CLI Profile] +* [SmartModule] +* [Transformations] + +[Connector Overview]: connectors/overview.mdx +[Fluvio CLI Produce]: fluvio/cli/fluvio/produce.mdx +[Fluvio CLI Consume]: fluvio/cli/fluvio/consume.mdx +[Fluvio CLI Topic]: fluvio/cli/fluvio/topic.mdx +[Fluvio CLI Profile]: fluvio/cli/fluvio/profile.mdx +[SmartModule]: smartmodules/overview.mdx +[Transformations]: fluvio/concepts/transformations.mdx +[castfact.ninja]: https://catfact.ninja +[PostgreSQL]: https://www.postgresql.org/ +[ElephantSQL]: https://www.elephantsql.com/ +[Configuration]: connectors/configuration.mdx +[SmartModule Hub]: hub/smartmodules/index.md +[installation]: fluvio/quickstart.mdx#install-fluvio +[Django Model]: https://docs.djangoproject.com/en/5.0/topics/db/models/ diff --git a/versioned_docs/version-0.11.11/fluvio/tutorials/source-http-basic.mdx b/versioned_docs/version-0.11.11/fluvio/tutorials/source-http-basic.mdx new file mode 100644 index 00000000..9b473727 --- /dev/null +++ b/versioned_docs/version-0.11.11/fluvio/tutorials/source-http-basic.mdx @@ -0,0 +1,155 @@ +--- +sidebar_position: 1 +title: "Streaming from HTTP Source" +description: "Part 1 of HTTP to SQL Sink tutorial series." +--- + +# Prerequisites + +This guide uses a `local` Fluvio cluster. If you need to install it, please follow the instructions [here][installation]! + + + +# Introduction + +This tutorial will guide you through creating a simple data pipeline that streams data from a website, but it can be applied to any HTTP source. + +This tutorial will cover: +- Connector overview +- How to download and start an Inbound HTTP Connector +- Reading data from the topic +- Stopping the connector + + + +# Connector Overview + +A Fluvio connector lets Fluvio interact with external systems. There are two types of connectors: `Source` and `Sink`. + +The `Source` connector reads data from an external system and writes it to a Fluvio topic. 
+The `Sink` connector reads data from a Fluvio topic and writes it to an external system. + +In this tutorial, we will be using the `HTTP Source Connector` to read data from a website and write it to a topic. It is designed to read data from an HTTP endpoint and write it to a Fluvio topic. + +It can either poll the endpoint at a regular interval or stream data. In this tutorial, we will be polling the endpoint every 10 seconds, as the website is not designed to stream data. + +The connector is configured using a YAML file. Please refer to the [Connector Overview] for more information on connectors and the [Configuration] format. + +For local clusters, the connectors are managed using `cdk`, which is already installed with the Fluvio CLI. You are responsible for managing the lifecycle of the connectors. + +If you are using Infinyon Cloud, the connectors are managed by the platform and you do not need to worry about lifecycle management. + +## Hub + +The Fluvio Hub is a central repository for connectors. You can download, list and upload connectors using the `cdk hub` command. + +A connector downloaded from the Hub contains a binary executable and is packaged as an `ipkg` file. + +In this tutorial, we will be using the `HTTP Source Connector` from the Hub. + +```bash +$ cdk hub download infinyon/http-source@0.3.8 +``` + +This will download the `infinyon-http-source-0.3.8.ipkg` file, which contains the connector binary. + +# Configuration for HTTP Source Connector + +After downloading the connector, you need to create a configuration file for the connector. The following is a basic configuration file for the `HTTP Source Connector` that reads data from the [castfact.ninja] website. + +Copy and paste the following config and save it as `http-cat-facts.yaml`. + +import CodeBlock from '@theme/CodeBlock'; +import CatFactBasic from '!!raw-loader!./config/http-cat-fact-basic.yaml'; + +{CatFactBasic} + +Each connector has a `meta` section, which is the same across all connectors. 
The `http` section is specific to the `HTTP Source Connector`. + +Notice that in the `meta` section, `create-topic` is set to `true`, which means the topic will be created if it does not exist. + +The `interval` is set to `10s` which means the connector will poll the endpoint every 10 seconds. + + +# Starting the Connector + +Once you have the config file, you can create the connector using the `cdk deploy start` command. + +```bash +$ cdk deploy start --ipkg infinyon-http-source-0.3.8.ipkg --config ./http-cat-facts.yaml +``` + +You can use `cdk deploy list` to view the status of the connector. + +```bash +$ cdk deploy list + NAME STATUS + cat-facts Running +``` + +# Checking the data + +Use `fluvio consume` to view the incoming data in the topic `cat-facts` which was created by the connector. + +This command will consume the last 4 records from the topic in stream mode. + +```bash +$ fluvio consume cat-facts -T4 +Consuming records starting 4 from the end of topic 'cat-facts' +{"fact":"A cat lover is called an Ailurophilia (Greek: cat+lover).","length":57} +{"fact":"British cat owners spend roughly 550 million pounds yearly on cat food.","length":71} +{"fact":"Fossil records from two million years ago show evidence of jaguars.","length":67} +{"fact":"Relative to its body size, the clouded leopard has the biggest canines of all animals\u2019 canines. Its dagger-like teeth can be as long as 1.8 inches (4.5 cm).","length":156} +``` +The HTTP connector will poll the endpoint every 10 seconds and write the data to the topic. As you can see, the CLI output will refresh every 10 seconds with new data. + +You can terminate the consume command by pressing `Ctrl+C`. + + +### Cleaning Up + +To stop the traffic, shut down the local connector using the `cdk deploy shutdown` command. + +```bash +$ cdk deploy shutdown --name cat-facts +``` + +You can also delete the topic by using the `fluvio topic delete` command. 
+ +```bash +$ fluvio topic delete cat-facts +``` + +This should result in no connectors running: +```bash +$ cdk deploy list + NAME STATUS +``` + +Note that if you do not delete the topic, `cat-facts` will still exist and you can consume the data from it. If you start the connector again, it will append data to the same topic. + +If you restart the host, the connector will not be running. You will need to start it again. You can automate this by using a service manager like `systemd` or `supervisor`. Alternatively, you can use Docker to run the connector. + +## Conclusion and Next Step + +Congratulations! You have successfully created a data pipeline that reads data from a website and writes it to a topic. + +This tutorial just barely scratches the surface of what you can do with Fluvio and Connectors. + +There are many other source connectors available in the Hub such as `mqtt` and `kafka`. You can explore them using the `cdk hub connector list` command. + +In the next tutorial, [Customizing HTTP Connector], we will introduce `SmartModule`, which can be used to transform the data before writing it to the topic. 
+ +## Reference + +* [Fluvio CLI Consume] +* [Fluvio CLI Topic] +* [Fluvio CLI Profile] + +[Customizing HTTP Connector]: source-http-transformation.mdx +[Connector Overview]: connectors/overview.mdx +[Fluvio CLI Consume]: fluvio/cli/fluvio/consume.mdx +[Fluvio CLI Topic]: fluvio/cli/fluvio/topic.mdx +[Inbound HTTP Connector]: hub/connectors/inbound/http.mdx +[castfact.ninja]: https://catfact.ninja +[installation]: fluvio/quickstart.mdx#install-fluvio diff --git a/versioned_docs/version-0.11.11/fluvio/tutorials/source-http-transformation.mdx b/versioned_docs/version-0.11.11/fluvio/tutorials/source-http-transformation.mdx new file mode 100644 index 00000000..222dc3b5 --- /dev/null +++ b/versioned_docs/version-0.11.11/fluvio/tutorials/source-http-transformation.mdx @@ -0,0 +1,139 @@ +--- +sidebar_position: 2 +title: "Customizing Connector with Transformation" +description: "Part 2 of HTTP to SQL Sink tutorial series." +--- + +# Prerequisites + +This guide uses a `local` Fluvio cluster. If you need to install it, please follow the instructions [here][installation]! + + +# Introduction + +Building on the [Streaming from HTTP Source] tutorial, we will customize the HTTP Source Connector with a transformation that lets you modify the records before they are sent to the topic. + +Why would you want to do this? + +Often, you are only interested in a subset of the data, or you want to add or remove fields, or even change the shape of the record. This is where transformations come in. + +In this tutorial, we will show you how to customize the HTTP Source Connector to add a field to every record before it is sent to the topic. + + +# SmartModule + +A SmartModule is a reusable piece of code that can be attached to a connector to perform a specific task. In this case, we will use a SmartModule to transform the incoming JSON records before they are sent to the topic. + +A SmartModule is executed in a WebAssembly runtime, which means it is very small and secure. 
You can write your own SmartModules in Rust, or use the prepackaged ones from the [SmartModule Hub]. + +Unlike connectors, SmartModules in the Hub are packaged WASM files. + +The SmartModule we will use in this tutorial implements a domain specific language (DSL) called [Jolt] to specify a transformation of input JSON to another shape of JSON data. + +To use the `jolt` SmartModule, we need to download it to the cluster. + +```bash +$ fluvio hub sm download infinyon/jolt@0.4.1 +``` + +Once you have downloaded the SmartModule, you can list the SmartModules in the cluster: + +```bash +$ fluvio sm list + SMARTMODULE SIZE + infinyon/jolt@0.4.1 589.3 KB +``` + + +# Customizing HTTP Connector with JOLT SmartModule + +In order to customize the HTTP Source Connector, we need to create a configuration file that includes the transformation. + +Before we start the modified connector, please shut down any connector that is still running. + +Copy and paste the following config and save it as `http-cat-facts-transform.yaml`. This configuration uses the same endpoint as the previous tutorial but a different connector name and topic name. + +import CodeBlock from '@theme/CodeBlock'; +import CatFactBasic from '!!raw-loader!./config/http-cat-fact-transform.yaml'; + +{CatFactBasic} + +Notice that we just added a `transforms` section, which uses a JOLT specification to insert a new field `source` with the value `http` into every record. + +Once you have the config file, you can create the connector using the `cdk deploy start` command as before. + +```bash +$ cdk deploy start --ipkg infinyon-http-source-0.3.8.ipkg --config ./http-cat-facts-transform.yaml +``` + + +You can use `cdk deploy list` to view the status of the connector. + +```bash +$ cdk deploy list + NAME STATUS + cat-facts-transformed Running +``` + +# Checking the data + +Similar to the previous tutorial, we can use `fluvio consume` to view the incoming data in the topic `cat-facts-data-transform` which was created by the connector. 
+ +```bash +$ fluvio consume cat-facts-data-transform -T4 +Consuming records starting 4 from the end of topic 'cat-facts-data-transform' + +{"fact":"Heat occurs several times a year and can last anywhere from 3 to 15 days.","length":73,"source":"http"} +{"fact":"Cats prefer to remain non-confrontational. They will not fight to show dominance, but rather to stake their territory. Cats will actually go to extremes to avoid one another in order to prevent a possible confrontation.","length":219,"source":"http"} +{"fact":"Cats only sweat through their paws and nowhere else on their body","length":65,"source":"http"} +{"fact":"Unlike dogs, cats do not have a sweet tooth. Scientists believe this is due to a mutation in a key taste receptor.","length":114,"source":"http"} +``` +As you can see, the `source` field has been added to every record. This is the transformation we added to the connector. + +You can terminate the consume command by pressing `Ctrl+C`. + + +### Cleaning up + +Shut down the connector and delete the topic. + +```bash +$ cdk deploy shutdown --name cat-facts-transformed +$ fluvio topic delete cat-facts-data-transform +``` + +## Conclusion and Next Step + +In this tutorial, we showed you how to customize the HTTP Source Connector with a transformation using a SmartModule. + +SmartModule is a powerful tool that can be used to perform complex transformations on the incoming data. + +In the next tutorial, we will show you how to use the sink connector to stream data from a topic to a SQL database. 
+ + +## Reference + +* [Fluvio CLI Produce] +* [Fluvio CLI Consume] +* [Fluvio CLI Topic] +* [Fluvio CLI Profile] +* [SmartModule] +* [SmartModule Rust API] +* [Transformations] + +[Streaming from HTTP Source]: source-http-basic.mdx +[Connector Overview]: connectors/overview.mdx +[Fluvio CLI Produce]: fluvio/cli/fluvio/produce.mdx +[Fluvio CLI Consume]: fluvio/cli/fluvio/consume.mdx +[Fluvio CLI Topic]: fluvio/cli/fluvio/topic.mdx +[Fluvio CLI Profile]: fluvio/cli/fluvio/profile.mdx +[SmartModule]: smartmodules/overview.mdx +[SmartModule Rust API]: https://docs.rs/fluvio-smartmodule/latest/fluvio_smartmodule/ +[Transformations]: fluvio/concepts/transformations.mdx +[Inbound HTTP Connector]: hub/connectors/inbound/http.mdx +[castfact.ninja]: https://catfact.ninja +[Configuration]: connectors/configuration.mdx +[infinyon/jolt@x.y.z]: hub/smartmodules/jolt.mdx +[Jolt]: https://github.com/infinyon/fluvio-jolt +[SmartModule Hub]: hub/smartmodules/index.md +[installation]: fluvio/quickstart.mdx#install-fluvio