Kudu
Supported pipeline types:
|
When you configure the Kudu destination, you specify the connection information for one or more Kudu masters, define the table to use, and optionally define field mappings. By default, the destination writes field data to columns with matching names.
The Kudu destination can use CRUD operations defined in the sdc.operation.type record header attribute to write data. You can define a default operation for records without the header attribute or value. You can also configure whether to use multi-row operations for inserts and deletes, and how to handle records with unsupported operations.For information about Data Collector change data processing and a list of CDC-enabled origins, see Processing Changed Data.
If the destination receives a change data capture log from some origin systems, you must select the format of the change log.
You can also configure the external consistency mode, operation timeouts, and the maximum number of worker threads to use.
Define the CRUD Operation
The Kudu destination can insert, update, delete, or upsert data. The destination writes the records based on the CRUD operation defined in a CRUD operation header attribute or in operation-related stage properties.
You define the CRUD operation in the following ways:
- CRUD record header attribute
- You can define the CRUD operation in a CRUD operation record header attribute. The destination looks for the CRUD operation to use in the sdc.operation.type record header attribute.
- Operation stage properties
- You define a default operation in the destination properties. The destination uses the default operation when the sdc.operation.type record header attribute is not set.
Kudu Data Types
The Kudu destination converts Data Collector data types to the following compatible Kudu data types:
Data Collector Data Type | Kudu Data Type |
---|---|
Boolean | Bool |
Byte | Int8 |
Byte Array | Binary |
Decimal | Decimal. Available in Kudu version 1.7 and later. If using an earlier version of Kudu, configure your pipeline to convert the Decimal data type to a different Kudu data type. |
Double | Double |
Float | Float |
Integer | Int32 |
Long | Int64 or Unixtime_micros. The destination determines the data
type to use based on the mapped Kudu column. The Data Collector Long data type stores millisecond values. The Kudu Unixtime_micros data type stores microsecond values. When converting to the Unixtime_micros data type, the destination multiplies the field value by 1,000 to convert the value to microseconds. |
Short | Int16 |
String | String |
- Character
- Date
- Datetime
- List
- List-Map
- Map
- Time
Configuring a Kudu Destination
Configure a Kudu destination to write to a Kudu cluster.