JDBC Multitable Consumer
Depending on your use case, you might use one of the following origins instead:
- Use the JDBC Query Consumer origin to configure your own SQL query to read database data from a single table or from a join of tables.
- Use a changed-data origin to read changed data from a particular database and process subsequent changes.
- Use the Teradata Consumer origin to read data from Teradata Database tables and quickly retrieve large amounts of data.
Use the JDBC Multitable Consumer origin to read multiple tables from one or more schemas in the same database. For example, you might use the origin to perform database replication. By default, the origin processes tables incrementally, using primary key columns or user-defined offset columns to track its progress. You can configure the origin to perform non-incremental processing so that it can also process tables that have no key or offset column. The origin can use multiple threads to enable parallel processing of tables and partitions.
When you configure the JDBC Multitable Consumer origin, you define groups of database tables to read. The origin generates SQL queries based on the table configurations that you define.
When you configure the origin, you specify connection information and custom JDBC configuration properties to determine how the origin connects to the database. You also configure the number of threads to use and the maximum number of queries to run per second. When the source database has high-precision timestamps, such as IBM Db2 TIMESTAMP(9) fields, you can configure the origin to write strings rather than datetime values to maintain the precision.
When you define the table configuration for the groups of tables that you want to process, you can optionally override the default key column and specify the initial offset to use. You can enable non-incremental processing for tables without key or offset columns. You can configure the origin to perform multithreaded partition processing, multithreaded table processing, or use the default - a mix of both. When configuring partitions, you can configure the offset size, number of active partitions, and offset conditions.
You define the strategy that the origin uses to create each batch of data and the number of batches to create from each result set. You can configure advanced properties, such as the initial order to read from tables, connection related properties, and transaction isolation. And you can specify what the origin does when encountering an unsupported data type: convert the data to string or stop the pipeline.
When the pipeline stops, the origin notes where it stopped reading. When the pipeline starts again, the origin continues processing from the last-saved offset by default. You can reset the origin to process all available data, using any initial offsets that you defined.
You can configure advanced connection properties. To use a JDBC version older than 4.0, you specify the driver class name and define a health check query.
The origin can generate events for an event stream. For more information about dataflow triggers and the event framework, see Dataflow Triggers Overview.
Database Vendors and Drivers
The JDBC Multitable Consumer origin can read database data from multiple database vendors.
| Database Vendor | Versions and Drivers |
|---|---|
| MySQL | |
| PostgreSQL | When connecting to a PostgreSQL database, you do not need to install a JDBC driver. Data Collector includes the JDBC driver required for PostgreSQL. |
| Oracle | Oracle 11g with the Oracle 11.2.0 JDBC driver |
| Microsoft SQL Server | SQL Server 2017 with the Microsoft JDBC driver 4.2 for SQL Server |
Oracle Data Types
The JDBC Multitable Consumer origin converts Oracle data types into Data Collector data types.
| Oracle Data Type | Data Collector Data Type |
|---|---|
| Number | Decimal |
| Char | String |
| Varchar | String |
| Varchar2 | String |
| Nchar | String |
| NvarChar2 | String |
| Binary_float | Float |
| Binary_double | Double |
| Date | Datetime |
| Timestamp | Datetime |
| Timestamp with time zone | Zoned_datetime |
| Timestamp with local time zone | Zoned_datetime |
| Long | String |
| Blob | Byte_array |
| Clob | String |
| Nclob | String |
| XMLType | String |
Installing the JDBC Driver
For information about installing additional drivers, see Install External Libraries in the Data Collector documentation.
Working with a MySQL JDBC Driver
When using a MySQL JDBC driver, note the following known issues:
- The driver returns time values to the second. Due to a MySQL JDBC driver issue, the driver cannot return time values to the millisecond; instead, it returns the values to the second. For example, if a column has a value of 20:12:50.581, the driver reads the value as 20:12:50.000.
- The origin might not read new rows created in MySQL while the pipeline is running. When using the default transaction isolation level, the origin might not read new rows that are created in MySQL as the pipeline runs. To resolve this issue, configure the origin to use the read committed transaction isolation level on the Advanced tab.
Table Configuration
When you configure JDBC Multitable Consumer, you define a table configuration for each group of tables that you want to read. A table configuration defines a group of tables that have the same table name pattern, that belong to one or more schemas with the same name pattern, and that have proper primary keys or the same user-defined offset columns.
You can define one or more table configurations.
For example, you can define one table configuration to replicate a database that has a proper primary key for each table. You simply enter the schema name and use the default table name pattern, %, which matches all tables in the schema.

Now say that you want to process the following tables in the SALES schema:
- store_a
- store_b
- store_c
- customers
The three store tables use orderID as the primary key. You want to override the primary key for the customers table, so you define customerID as the offset column for that table. You want to read all available data in the tables, so you do not need to define an initial offset value.
You define one table configuration as follows so that the origin can read the three store tables:
- Schema - SALES
- Table Name Pattern - store%
Then you define the second table configuration as follows so that the origin can read the customers table:
- Schema - SALES
- Table Name Pattern - customers
- Override Offset Columns - enabled
- Offset Columns - customerID
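For reference, here is roughly what the generated queries look like for these two configurations. This is only a sketch: the origin builds its SQL internally, and identifier quoting and the exact form can differ by database.

```sql
-- Tables matched by the store% pattern, ordered by the orderID primary key:
SELECT * FROM SALES.store_a ORDER BY orderID;
SELECT * FROM SALES.store_b ORDER BY orderID;
SELECT * FROM SALES.store_c ORDER BY orderID;

-- The customers table, ordered by the user-defined offset column:
SELECT * FROM SALES.customers ORDER BY customerID;
```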
Let's take a closer look at the schema and table name patterns and offset properties that you define in a table configuration.
Schema and Table Name Patterns
You define the group of tables that the JDBC Multitable Consumer origin reads by defining schema and table name patterns for the table configuration. The origin reads all tables whose names match the table pattern in the schemas whose names match the schema pattern.
The schema and table name patterns use the SQL LIKE syntax. The LIKE syntax uses the percent wildcard (%) to represent any string of zero or more characters. For example, the schema name pattern st% matches schemas whose names start with "st". The default table name pattern, %, matches all tables in the specified schemas.
For more information about valid patterns for the SQL LIKE syntax, see https://msdn.microsoft.com/en-us/library/ms179859.aspx.
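To preview which names a pattern matches, you can test the same LIKE expression against the database catalog. The following is only an illustration of the LIKE semantics, not a query that the origin runs, and it assumes a database that exposes INFORMATION_SCHEMA, such as MySQL, PostgreSQL, or SQL Server:

```sql
-- Schemas whose names start with "st":
SELECT schema_name
FROM information_schema.schemata
WHERE schema_name LIKE 'st%';

-- Tables in those schemas that match the default table name pattern %:
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_schema LIKE 'st%'
  AND table_name LIKE '%';
```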
You can optionally define a schema or table exclusion pattern to exclude some schemas or tables from being read. The schema and table exclusion patterns use a Java-based regular expression, or regex. For more information about using regular expressions with Data Collector, see Regular Expressions Overview.
For example, let's say that you want to read all tables in the US_WEST
and US_EAST schemas except for tables that start with "dept". You enter
the following schema name, table name, and table exclusion pattern:
- Schema - US%
- Table Name Pattern - %
- Table Exclusion Pattern - dept.*
Since you do not need to exclude any schemas, you simply leave the schema exclusion pattern empty.
Or, let's say that you want to read all tables from all schemas, except for the
sys and system schemas. You enter the following
schema name, table name, and schema exclusion pattern and leave the table exclusion
pattern blank:
- Schema - %
- Table Name Pattern - %
- Schema Exclusion Pattern - sys|system
Offset Column and Value
The JDBC Multitable Consumer origin uses an offset column and initial offset value to determine where to start reading data within tables and partitions.
By default, the origin uses the primary key of the tables as the offset column and uses no initial offset value. When you use multithreaded table processing and the table has a composite primary key, the origin uses each primary key as an offset column. You cannot use composite keys with multithreaded partition processing.
When the origin first reads a table, it runs a SQL query of the following form:

SELECT * FROM <table> ORDER BY <offset column_1>, <offset column_2>, ...

where <offset column_n> represents each primary key of the table, such as when the table has a composite primary key. When you restart the pipeline or when the origin switches back to a previously read table, the origin adds a WHERE clause to the SQL query to continue reading from the last saved offset.
To use this default behavior, you do not need to configure any of the offset properties.
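As a sketch of this default behavior, assume a SALES.orders table with a composite primary key on storeID and orderID. The offset values below are stand-ins for the last saved offset, and the exact clause that the origin builds is internal:

```sql
-- First read of the table: order by every primary key column.
SELECT * FROM SALES.orders ORDER BY storeID, orderID;

-- After a restart, or when the origin switches back to the table,
-- a WHERE clause continues from the last saved offset
-- (here, storeID 12 and orderID 4096):
SELECT * FROM SALES.orders
WHERE storeID > 12 OR (storeID = 12 AND orderID > 4096)
ORDER BY storeID, orderID;
```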
You can change the default behavior by configuring the following offset properties:
- Override the primary key as the offset column
- You can override the primary key and define another offset column or columns. Or if the table doesn’t have a primary key, you can define the offset column or columns to use.
- Define an initial offset value
- The initial offset value is a value within the offset column where you want JDBC Multitable Consumer to start reading. When you define an initial offset value, you must first enter the offset column name and then the value. If you are using the default primary key as the offset column, enter the name of the primary key.
- Define additional offset column conditions
- You can use the expression language to define additional conditions that the origin uses to determine where to start reading data. The origin adds the defined condition to the WHERE clause of the SQL query.
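For example, say you use an order_id offset column and define the additional condition transaction_date >= '2017-01-01'. The resulting query looks roughly like this; the table and column names are assumed for illustration:

```sql
-- The defined condition is added to the WHERE clause that the
-- origin builds from the offset column and last saved offset:
SELECT * FROM SALES.transactions
WHERE order_id > 1000
  AND transaction_date >= '2017-01-01'
ORDER BY order_id;
```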
Reading from Views
The JDBC Multitable Consumer origin can read from views in addition to tables.
The origin reads from all tables and views that are included in the defined table configurations. If a table configuration includes views that you do not want to read, simply exclude them from the configuration.
Use the origin to read from simple views that select data from a single table.
We do not recommend using the origin to read from complex views that combine data from two or more tables using joins. If the origin reads from complex views, it runs multiple queries in parallel, which can cause a heavy workload on the database.
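For example, of the following two views (hypothetical DDL), the first is a simple view that the origin can read safely, while the second is a complex view that joins two tables:

```sql
-- Simple view over a single table: suitable for the origin.
CREATE VIEW active_customers AS
  SELECT customerID, name, email
  FROM customers
  WHERE active = 1;

-- Complex view that joins two tables: not recommended, because the
-- origin's parallel queries against the view can put a heavy
-- workload on the database.
CREATE VIEW customer_orders AS
  SELECT c.customerID, c.name, o.orderID, o.total
  FROM customers c
  JOIN orders o ON o.customerID = c.customerID;
```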
Multithreaded Processing Modes
The JDBC Multitable Consumer origin performs parallel processing and enables the creation of a multithreaded pipeline. The origin can use multiple threads to process entire tables or partitions within tables.
By default, the origin performs multithreaded partition processing for the tables that fulfill the partition processing requirements, and performs multithreaded table processing for all other tables. When using the default behavior, the origin notes the tables that allow partition processing in the Data Collector log. When needed, you can configure the origin to require partition processing for all tables or to perform only table processing. You can also allow the single-threaded non-incremental processing of tables when needed.
- Multithreaded table processing - The origin can use up to one thread per table. Can process tables with multiple offset columns.
- Multithreaded partition processing - The origin can use up to one
thread per table partition. Use to process larger volumes of data than
multithreaded table processing.
Multithreaded partition processing requires a single primary key or user-defined offset column of a supported data type, and additional details for partition creation. Tables with composite keys or a key or user-defined offset column of an unsupported data type cannot be partitioned.
You configure the mode with the Multithreaded Partition Processing Mode property:
- Off - Use to perform multithreaded table processing.
Can be used to perform non-incremental loads of tables without key or offset columns, when enabled.
- On (Best Effort) - Use to perform partition processing where possible and allow
multithreaded table processing for tables with multiple key or offset columns.
Can be used to perform non-incremental loads of tables without key or offset columns, when enabled.
- On (Required) - Use to perform partition processing for all specified tables.
Does not allow performing other types of processing for tables that do not meet the partition processing requirements.
Multithreaded Table Processing
As the pipeline runs, each thread connects to the origin system, creates a batch of data, and passes the batch to an available pipeline runner. A pipeline runner is a sourceless pipeline instance - an instance of the pipeline that includes all of the processors and destinations in the pipeline and performs all pipeline processing after the origin.
Each pipeline runner processes one batch at a time, just like a pipeline that runs on a single thread. When the flow of data slows, the pipeline runners wait idly until they are needed, generating an empty batch at regular intervals. You can configure the Runner Idle Time pipeline property to specify the interval or to opt out of empty batch generation.
Multithreaded pipelines preserve the order of records within each batch, just like a single-threaded pipeline. But since batches are processed by different pipeline runners, the order that batches are written to destinations is not ensured.
The order of batch processing depends on many factors. For more information, see Understanding the Processing Queue.
For more information about multithreaded pipelines, see Multithreaded Pipeline Overview.
Example
Say you are reading from ten tables. You set the Number of Threads property to 5 and the Maximum Pool Size property to 6. When you start the pipeline, the origin retrieves the list of tables and creates five threads to read from the first five tables. By default, Data Collector creates a matching number of pipeline runners. Upon filling a batch, each thread passes it to an available pipeline runner for processing.
At any given moment, the five pipeline runners can each process a batch, so this multithreaded pipeline processes up to five batches at a time. When incoming data slows, the pipeline runners sit idle, available for use as soon as the data flow increases.
Multithreaded Partition Processing
By default, the JDBC Multitable Consumer origin performs multithreaded partition processing for all tables that meet the partition processing requirements, and performs table processing for all other tables.
To perform multithreaded processing of partitions within a table, you enable partition processing in the table configuration, then specify the partition size and the maximum number of partitions to use. Limiting the number of partitions also limits the number of threads that can be dedicated to processing data in the table.
When you configure a set of tables for unlimited partitions, the origin creates up to twice as many partitions as the pipeline thread count. For example, if you have 5 threads, the table can have up to 10 partitions.
Similar to multithreaded table processing, each thread reads data from a single partition, and each partition can have a maximum of one thread read from it at a time.
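Conceptually, each partition corresponds to a range of offset values based on the configured partition size. The following sketch assumes a numeric order_id offset column, a partition size of 1,000,000, and ranges starting at 0; the actual SQL that the origin generates is internal:

```sql
-- Partition 1 covers the first range of offset values:
SELECT * FROM SALES.orders
WHERE order_id > 0 AND order_id <= 1000000
ORDER BY order_id;

-- Partition 2 covers the next range, so a second thread can read
-- it at the same time:
SELECT * FROM SALES.orders
WHERE order_id > 1000000 AND order_id <= 2000000
ORDER BY order_id;
```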
When processing partitions, the processing order depends on many factors. For a full description, see Understanding the Processing Queue.
Partition Processing Requirements
To perform multithreaded partition processing for a table, the table must meet the following requirements:
- Single key or offset column
- The table must have a single primary key or user-defined offset column. Performing multithreaded partition processing on a table with composite keys generates an error and stops the pipeline.
- Numeric data type
- To use partition processing, the primary key or user-defined offset column must have a numeric data type that allows arithmetic partitioning.
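For example, of the following two tables (hypothetical DDL), only the first meets the requirements:

```sql
-- Qualifies: a single numeric primary key gives the origin one
-- offset column whose range can be split arithmetically.
CREATE TABLE orders (
  order_id BIGINT PRIMARY KEY,
  total    DECIMAL(10,2)
);

-- Does not qualify: a composite primary key cannot be partitioned.
-- With partition processing required, this table generates an
-- error and stops the pipeline.
CREATE TABLE order_items (
  order_id INT NOT NULL,
  line_no  INT NOT NULL,
  sku      VARCHAR(40),
  PRIMARY KEY (order_id, line_no)
);
```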
Multiple Offset Value Handling
When processing partitions, the JDBC Multitable Consumer origin can process multiple records with the same offset value. For example, the origin can process multiple records with the same timestamp in a transaction_date offset column.
When you stop the pipeline as the origin is processing a series of records with the same offset value, the origin notes the offset. Then, when you restart the pipeline, it starts with a record with the next logical offset value, skipping any unprocessed records that use the same last-saved offset.
For example, say you specified a datetime column as a user-defined offset column, and five records in the table share the same datetime value. Now say you happen to stop the pipeline after it processes the second record. The pipeline stores the datetime value as the offset where it stopped. When you restart the pipeline, processing begins with the next datetime value, skipping the three unprocessed records with the last-saved offset value.
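This skipping behavior follows from the restart query: the origin continues with a strict greater-than comparison against the last-saved offset, so remaining records that share that value are never selected again. A sketch, reusing the datetime offset column from the example above with assumed names and values:

```sql
-- Offset saved when the pipeline stopped: '2017-03-01 10:15:30'.
-- The strict > comparison skips the three unprocessed records
-- that share the saved timestamp:
SELECT * FROM SALES.transactions
WHERE transaction_date > '2017-03-01 10:15:30'
ORDER BY transaction_date;
```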
Best Effort: Processing Non-Compliant Tables
To process tables in a table configuration that might not meet the partition processing requirements, you can use the On (Best Effort) option when you configure the Multithreaded Partition Processing mode property.
When you select the best effort option, the origin performs multithreaded partition processing for all tables that meet the partition processing requirements. The origin performs multithreaded table processing for tables that include multiple key or offset columns. And if you enable non-incremental processing, the origin can also process all tables that do not include key or offset columns.
Non-Incremental Processing
You can configure the JDBC Multitable Consumer origin to perform non-incremental processing for tables with no primary keys or user-defined offset columns. By default, the origin performs incremental processing and does not process tables without a key or offset column.
You can enable non-incremental processing for the set of tables defined in a table configuration.
Use the On (Best Effort) or Off multithreaded partition processing mode to allow non-incremental processing. With either option selected, each non-incremental table is processed by a single thread, as with multithreaded table processing. When the origin performs non-incremental processing of a table, the following behavior applies:
- The origin uses a single thread to process all available data in the table.
- After the origin processes all available data, it records in the offset that the table has been processed. So, if you stop and restart the pipeline after the origin completes all processing, the origin does not reprocess the table.
If you want to reprocess data in the table, you can reset the origin before restarting the pipeline. This resets the origin for all tables that the origin processes.
- If the pipeline stops while the origin is still processing available data, when the pipeline restarts, the origin reprocesses the entire table. This occurs because the table has no key or offset column to allow for tracking progress.
For example, say you configure the origin to use five threads and process a set of tables that includes a table with no key or offset column. To process data in this table, you enable the Enable Non-Incremental Load table configuration property. You also set Multithreaded Partition Processing Mode to On (Best Effort) to allow the origin to use multithreaded partition processing when possible and allow both non-incremental processing and multithreaded table processing when needed.
When you start the pipeline, the origin allocates one thread to the table that requires non-incremental processing. It processes the table data using multithreaded table processing until all data is processed. When the thread completes processing all available data, the origin notes this as part of the offset and the thread becomes available to process data from other tables. In the meantime, the four other threads process data from the rest of the tables using multithreaded partition processing when possible.
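In terms of the generated SQL, non-incremental processing drops the offset handling entirely. A sketch with assumed table names; the actual SQL is internal to the origin:

```sql
-- Incremental processing: the offset column drives both the WHERE
-- clause and the ORDER BY clause.
SELECT * FROM SALES.orders
WHERE order_id > 1000
ORDER BY order_id;

-- Non-incremental processing: no key or offset column, so the
-- origin reads the whole table in a single pass and cannot track
-- progress within it.
SELECT * FROM SALES.audit_log;
```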
Batch Strategy
You can specify the batch strategy to use when processing data. The batch strategy behaves differently depending on whether you use multithreaded table processing or multithreaded partition processing. The behavior can also be affected by the Batches from Result Set property.
Process All Available Rows
- Multithreaded table processing
- When the origin performs multithreaded table processing for all tables, each thread creates multiple batches of data from one table, until all available rows are read from that table.
The thread runs one SQL query for all batches created from the table. Then, the thread switches to the next available table, running another SQL query to read all available rows from that table.
For example, let's say the origin has a batch size of 100 and uses two concurrent threads to read from four tables, each of which contains 1,000 rows. The first thread runs a SQL query to create 10 batches of 100 rows each from table1, while the second thread uses the same strategy to read data from table2.
When table1 and table2 are fully read, the threads switch to table3 and table4 and complete the same process. When the first thread finishes reading from table3, the thread switches back to the next available table to read all available data from the last saved offset.
The number of threads that can process the tables is limited by the Number of Threads property for the origin.
When the tables being processed use both table and partition processing, the threads query the partitions as described below. For details on how the tables and partitions rotate through the processing queue, see Understanding the Processing Queue.
- Multithreaded partition processing
- Multithreaded partition processing is similar to multithreaded table processing, except that it works at the partition level.
Each thread creates multiple batches of data from one partition. The number of batches that it creates and processes at one time is based on the Batches from Result Set property.
Each thread runs one SQL query for the batches to be created from the partition. Then, the thread switches to the next available partition, running another SQL query. For example, if you set the Batches from Result Set property to 3, a thread runs a query to create 3 batches of data from the partition that it processes. When it completes processing the three batches, it becomes available to process the next partition or table in the processing queue.
The number of threads that can process partitions for each table is limited by the Number of Threads property for the origin and the Max Active Partitions table property.
For details on how the tables and partitions rotate through the processing queue, see Understanding the Processing Queue.
Switch Tables
The Switch Tables batch strategy differs greatly depending on whether the origin performs full table or partition processing. The number of batches created and processed at one time is based on the Batches from Result Set property.
- Multithreaded table processing
- When the origin performs multithreaded table processing for all tables, each thread creates a set of batches from one table, and then switches to the next available table to create the next set of batches.
The thread runs an initial SQL query to create the first set of batches from the table. The database caches the remaining rows in a result set for the same thread to access again, and then the thread switches to the next available table. A table is available in the following situations:
- The table does not have an open result set cache. In this case, the thread runs an initial SQL query to create the first batch, caching the remaining rows in a result set in the database.
- The table has an open result set cache created by that same thread. In this case, the thread creates the batch from the result set cache in the database rather than running another SQL query.

A table is not available when it has an open result set cache created by another thread. No other thread can read from that table until the result set is closed.
When you configure a switch table strategy, define the result set cache size and the number of batches that a thread can create from the result set. After a thread creates the configured number of batches, a different thread can read from the table.

Note: By default, the origin instructs the database to cache an unlimited number of rows in the result set, and a thread can create an unlimited number of batches from that result set.

For example, let's say an origin has a batch size of 100 and uses two concurrent threads to read from four tables, each of which contains 10,000 rows. You set the result set cache size to 500 and set the number of batches read from the result set to 5.
Thread1 runs an SQL query on table1, which returns all 10,000 rows. The thread creates a batch when it reads the first 100 rows. The next 400 rows are cached as a result set in the database. Since thread2 is similarly processing table2, thread1 switches to the next available table, table3, and repeats the same process. After creating a batch from table3, thread1 switches back to table1 and retrieves the next batch of rows from the result set that it previously cached in the database.
After thread1 creates five batches using the result set cache for table1, it switches to the next available table. A different thread runs a SQL query to read additional rows from table1, beginning from the last saved offset.
When the tables being processed use both table and partition processing, the threads query the partitions as described below. For details on how the tables and partitions rotate through the processing queue, see Understanding the Processing Queue.
- Multithreaded partition processing
- Multithreaded partition processing is similar to multithreaded table processing, with a twist - each thread creates a set of batches from one partition for a table, then all partitions from the same table are moved to the end of the processing queue. This allows the origin to switch to the next available table.
The behavior around caching the result set and the number of batches to process from the result set is the same, but at a partition level.
For examples of how tables and partitions rotate through the processing queue, see Understanding the Processing Queue.
Initial Table Order Strategy
You can define the initial order that the origin uses to read the tables.
- None
- Reads the tables in the order that they are listed in the database.
- Alphabetical
- Reads the tables in alphabetical order.
- Referential Constraints
- Reads the tables based on the dependencies between the tables. The origin reads the parent table first, and then reads the child tables that refer to the parent table with a foreign key.
The origin uses this table order only for the initial reading of the tables. When threads switch back to previously read tables, they read from the next available table, regardless of the defined order.
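For example, given the following tables (hypothetical DDL), the Referential Constraints order reads customers before orders, because the foreign key in orders makes customers the parent table:

```sql
CREATE TABLE customers (
  customerID INT PRIMARY KEY,
  name       VARCHAR(100)
);

-- Child table: read after customers under the Referential
-- Constraints order because of the foreign key.
CREATE TABLE orders (
  orderID    INT PRIMARY KEY,
  customerID INT,
  total      DECIMAL(10,2),
  FOREIGN KEY (customerID) REFERENCES customers (customerID)
);
```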
Understanding the Processing Queue
The JDBC Multitable Consumer origin maintains a virtual queue to determine the data to process from different tables. The queue includes each table defined in the origin. When a table is to be processed by partition, multiple partitions for the table are added to the queue, limited by the Max Active Partitions property.
The origin rotates and reorganizes the queue based on the Per Batch Strategy property. And it processes data from the queue with the threads specified in the Number of Threads property and the Batches from Result Set property.
Below are some scenarios to help clarify how the queue works.
Multiple Tables, No Partition Processing
Say you have four tables, A, B, C, and D, processed without partitions, and you use the alphabetical initial table order. The initial queue looks like this:

A B C D

When a thread becomes available, it processes data from the first table in the queue. The number of batches is based on the Batches from Result Set property. The processing of the tables depends on how you define the Per Batch Strategy property:
- Process All Available Rows in the Table
- With this batch strategy, threads do not start processing data in the next table until all available data is processed for the preceding table.
- Switch Tables
- With this batch strategy, the order of the queue remains the same, but each thread performs a SQL query to create a set of batches based on the Batches from Result Set property. When it completes processing, it performs the same process with the next table in the queue.
Multiple Partitions, No Table Processing
Say you have tables A, B, and C, and all three tables hold lots of data to process. Tables A and B are configured with a maximum of 3 active partitions. And since table C has the largest volume of data, you allow it an unlimited number of partitions. Again, let's use the alphabetical initial table ordering, so the initial queue looks like this:

A1 A2 A3 B1 B2 B3 C1 C2 C3 C4 C5 C6 C7 C8

A partition remains in the queue until the origin confirms that there is no more data in the partition. When a thread becomes available, it creates a set of batches from the first partition of the first table in the queue. The number of batches is based on the Batches from Result Set property. The order of tables and partitions in the queue depends on how you define the Per Batch Strategy, as follows:
- Process All Available Rows in the Table
- When processing partitions, this batch strategy retains the original order of the queue, but rotates through the partitions as each thread processes a set of batches. Note: In practice, this means that rows from subsequent tables can be processed before a previous table is completed, since available threads continue to pick up partitions from the queue.
- Switch Tables
- When processing partitions, this batch strategy forces all subsequent, consecutive partitions from the same table to the end of the queue each time a thread processes a set of batches from a partition.
Both Partition and Table Processing
When processing a mix of full tables and partitioned tables, the queue basically behaves the same as when processing only partitions, with full tables being processed as partitioned tables with a single partition. Let's walk through it.
Say that table A is processed without partitions, table B has 3 partitions, table C has 8 partitions, and the pipeline uses 4 threads. With the alphabetical initial table order, the initial queue looks like this:

A B1 B2 B3 C1 C2 C3 C4 C5 C6 C7 C8

When a thread becomes available, it processes a set of batches from the first table or partition in the queue. The number of batches is based on the Batches from Result Set property. The order of the queue depends on how you define the Per Batch Strategy, as follows:
- Process All Available Rows in the Table
- With this batch strategy, the queue remains in the basic initial order and rotates as each thread claims a set of batches from the next table or partition. The unpartitioned table A is processed like a table with a single partition.
Note that unpartitioned tables are not processed in full when they move to the front of the queue. For this behavior, configure all tables to be processed without partitions. Or, set the Batches from Result Set property to -1.
When the pipeline starts, the 4 threads process a set of batches from the A table and from partitions B1, B2, and B3. Since the table and partitions all still contain data, they then move to the end of the queue as follows:

C1 C2 C3 C4 C5 C6 C7 C8 A B1 B2 B3

As each thread completes processing, it processes a set of batches from the front of the queue. After each of the 4 threads takes another set of batches, the queue looks like this:

C5 C6 C7 C8 A B1 B2 B3 C1 C2 C3 C4

- Switch Tables
- When processing tables and partitions, this batch strategy forces all
subsequent, consecutive partitions from the same table to the end of the
queue. And it treats unpartitioned tables as a table with a single
partition. As a result, the queue rotation is a simplified version of
processing only partitioned tables.
So we have this initial order:
A B1 B2 B3 C1 C2 C3 C4 C5 C6 C7 C8

The first thread processes a set of batches from table A, and since there are no related partitions, table A simply goes to the end of the queue:

B1 B2 B3 C1 C2 C3 C4 C5 C6 C7 C8 A

The second thread processes a set of batches from B1 and pushes the rest of the table B partitions to the end of the queue; B1 lands at the very end because it contains more data to be processed:

C1 C2 C3 C4 C5 C6 C7 C8 A B2 B3 B1

The third thread processes a set of batches from C1 and pushes the rest of the table C partitions to the end, with C1 taking the last slot:

A B2 B3 B1 C2 C3 C4 C5 C6 C7 C8 C1

And then the fourth thread processes another set of batches from table A and moves A to the end of the queue:

B2 B3 B1 C2 C3 C4 C5 C6 C7 C8 C1 A
JDBC Header Attributes
The JDBC Multitable Consumer origin generates JDBC record header attributes that provide additional information about each record, such as the original data type of a field or the source tables for the record. The origin receives these details from the JDBC driver.
You can use the record:attribute or record:attributeOrDefault functions to access the information in the attributes. For more information about working with record header attributes, see Working with Header Attributes.
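For example, the expression ${record:attribute('jdbc.tables')} returns the comma-separated list of source tables for a record, when the JDBC driver provides that information.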
JDBC record header attributes include a "jdbc" prefix to differentiate the JDBC attributes from other record header attributes.
| JDBC Header Attribute | Description |
|---|---|
| jdbc.tables | Provides a comma-separated list of source tables for the fields in the record. Note: Not all JDBC drivers provide this information. Oracle uses all caps for schema, table, and column names by default. Names can be lower- or mixed-case only if the schema, table, or column was created with quotation marks around the name. |
| jdbc.partition | Provides the full offset key for the partition that produced the record. |
| jdbc.threadNumber | Provides the number of the thread that produced the record. |
| jdbc.<column name>.jdbcType | Provides the numeric value of the original SQL data type for each field in the record. See the Java documentation for a list of the data types that correspond to numeric values. |
| jdbc.<column name>.precision | Provides the original precision for all numeric and decimal fields. |
| jdbc.<column name>.scale | Provides the original scale for all numeric and decimal fields. |
Event Generation
The JDBC Multitable Consumer origin can generate events that you can use in an event stream. When you enable event generation, the origin generates an event when it completes processing the data returned by the specified queries for all tables. It also generates events when it completes processing the data returned from a table and the data returned from a schema.
You can use the events in any logical way. For example:
- With the Pipeline Finisher executor to stop the pipeline and transition the pipeline to a Finished state when the origin completes processing available data.
When you restart a pipeline stopped by the Pipeline Finisher executor, the origin continues processing from the last-saved offset unless you reset the origin.
For an example, see Case Study: Stop the Pipeline.
- With the Email executor to send a custom email after receiving an event.
For an example, see Case Study: Sending Email.
- With a destination to store information about completed queries.
For an example, see Case Study: Event Storage.
For more information about dataflow triggers and the event framework, see Dataflow Triggers Overview.
Event Record
Event records generated by the origin have the following event-related record header attributes:
| Record Header Attribute | Description |
|---|---|
| sdc.event.type | Event type. Uses one of the following types: no-more-data, schema-finished, or table-finished. |
| sdc.event.version | Integer that indicates the version of the event record type. |
| sdc.event.creation_timestamp | Epoch timestamp when the stage created the event. |
The JDBC Multitable Consumer origin can generate the following event records:
- no-more-data
- The JDBC Multitable Consumer origin generates a no-more-data event record when the origin completes processing all data returned by the queries for all tables.
- schema-finished
- The JDBC Multitable Consumer origin generates a schema-finished event record when the origin completes processing all data within a schema.
- table-finished
- The JDBC Multitable Consumer origin generates a table-finished event record when the origin completes processing all data within a table.
Configuring a JDBC Multitable Consumer
Configure a JDBC Multitable Consumer origin to use a JDBC connection to read database data from multiple tables.