Data Generator
You might use the Data Generator processor when you want to apply processing that is available only at the field level to an entire record. For example, the Encrypt and Decrypt Fields processor can encrypt data in one or more fields, but cannot encrypt an entire record. To encrypt entire records, use the Data Generator to serialize each record into a single field, then use the Encrypt and Decrypt Fields processor to encrypt that field.
When you configure the Data Generator, you specify the target field and the output type to use, String or Byte Array. You also specify the data format for the serialized record and related properties.
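The difference between the two output types can be sketched in plain Python. This is only an illustration of the idea, not the processor's implementation: the function name `serialize_record`, the `"STRING"` and `"BYTE_ARRAY"` labels, and the choice of JSON with UTF-8 encoding are assumptions made for the example.

```python
import json


def serialize_record(record: dict, output_type: str = "STRING"):
    """Serialize a whole record to JSON, returned as text or as raw bytes.

    Hypothetical helper: the names and the JSON/UTF-8 choice are assumptions.
    """
    # Compact, key-sorted JSON so the result is deterministic.
    text = json.dumps(record, sort_keys=True, separators=(",", ":"))
    if output_type == "STRING":
        return text
    if output_type == "BYTE_ARRAY":
        return text.encode("utf-8")
    raise ValueError(f"unsupported output type: {output_type}")


record = {"id": 7, "name": "Ada"}
as_string = serialize_record(record, "STRING")
as_bytes = serialize_record(record, "BYTE_ARRAY")
print(as_string)  # {"id":7,"name":"Ada"}
```

A byte array output is the natural input for a field-level step that operates on binary data, while a string output is easier to inspect or pass to text-based processing.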
Target Field
When you use the Data Generator processor, you specify the target field for the serialized record.
When you specify a new field path, the processor creates the new field and writes the serialized record to it. When you enter an existing field path, the processor replaces the data in the existing field with the serialized record. When you enter / for the root field, the processor replaces the entire record with a single field containing the serialized record.
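The three target field cases can be modeled with a small Python sketch. It assumes a record is a flat dictionary, that only top-level paths such as `/name` are used, and that the record is serialized as JSON; the function `write_serialized` is a hypothetical name for illustration and is not part of the product.

```python
import json
from typing import Any


def write_serialized(root: dict, target_field: str) -> Any:
    """Return the record's new root value after writing the serialized record.

    Hypothetical model: handles only "/" and top-level paths like "/name".
    """
    serialized = json.dumps(root, sort_keys=True)
    if target_field == "/":
        # Root path: the whole record becomes one field holding the serialized data.
        return serialized
    name = target_field.lstrip("/")
    updated = dict(root)        # copy so the input record is left untouched
    updated[name] = serialized  # creates a new field or overwrites an existing one
    return updated


record = {"id": 7, "name": "Ada"}
new_field = write_serialized(record, "/serialized")  # new field path
replaced = write_serialized(record, "/name")         # existing field path
root_only = write_serialized(record, "/")            # root field
```

In this model, `new_field` keeps `id` and `name` and gains a `serialized` field, `replaced` keeps `id` but overwrites `name`, and `root_only` is just the serialized string.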
Data Formats
- Avro
- The stage writes records based on the Avro schema. You specify the location of the Avro schema definition when you configure the stage.
- Binary
- The stage writes binary data to a single field in the record.
- Delimited
- Generates a record for each delimited line. You can use the following delimited format types:
- Default CSV - File that includes comma-separated values. Ignores empty lines in the file.
- RFC4180 CSV - Comma-separated file that strictly follows RFC4180 guidelines.
- MS Excel CSV - Microsoft Excel comma-separated file.
- MySQL CSV - MySQL comma-separated file.
- Tab-Separated Values - File that includes tab-separated values.
- PostgreSQL CSV - PostgreSQL comma-separated file.
- PostgreSQL Text - PostgreSQL text file.
- Custom - File that uses user-defined delimiter, escape, and quote characters.
- Multi Character Delimited - File that uses multiple user-defined characters to delimit fields and lines, and single user-defined escape and quote characters.
- JSON
- Generates a record for each JSON object. You can process JSON files that include multiple JSON objects or a single JSON array.
- Protobuf
- Generates a record for every protobuf message. By default, the stage assumes messages contain multiple protobuf messages.
- SDC Record
- The stage writes records in the SDC Record data format.
- Text
- The stage writes data from a single text field to the target field. When you configure the stage, you select the field to use.
- XML
- Generates records based on a user-defined delimiter element. Use an XML element directly under the root element or define a simplified XPath expression. If you do not define a delimiter element, the stage treats the XML file as a single record.
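To make the delimited format types more concrete, the following Python sketch writes the same values with comma, tab, and custom delimiters. It uses Python's standard `csv` module as a stand-in, so its quoting rules are not guaranteed to match the stage's format types; the helper `to_delimited` and the sample values are assumptions made for the example.

```python
import csv
import io


def to_delimited(values, **fmt) -> str:
    """Write one list of values as a single delimited line.

    Hypothetical helper; extra keyword arguments are passed to csv.writer.
    """
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n", **fmt).writerow(values)
    return buffer.getvalue()


row = ["7", "Ada, Countess", "math"]

# Comma-separated: the field containing a comma is quoted.
comma_line = to_delimited(row)
# Tab-separated: the comma is no longer special, so nothing is quoted.
tab_line = to_delimited(row, delimiter="\t")
# Custom: user-defined delimiter, quote, and escape characters.
custom_line = to_delimited(row, delimiter="|", quotechar="'", escapechar="\\")

print(comma_line, end="")   # 7,"Ada, Countess",math
```

The same values produce different text depending on the delimiter and quote characters, which is why the format type used to serialize a record must match the one used to read it back.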
Configuring a Data Generator
Configure a Data Generator processor to serialize a record into a single field.