Resetting the Origin

Reset the origin when you want the Data Collector to process all available data instead of resuming from the last saved offset. Reset the origin only when the pipeline is not running.

You can reset the origin for the following origin stages:

Amazon S3
Directory
Elasticsearch
File Tail
Google Cloud Storage
Hadoop FS Standalone
HTTP Client
JDBC Multitable Consumer
JDBC Query Consumer
Kinesis Consumer
MapR DB JSON
MapR FS Standalone
MongoDB
MongoDB Oplog
MySQL Binary Log
Salesforce
SFTP/FTP Client
SQL Server CDC Client
SQL Server Change Tracking
Teradata Consumer
Windows Event Log

For these origins, when you stop the pipeline, the Data Collector notes where it stopped processing data. When you restart the pipeline, it continues from where it left off by default. To have the Data Collector process all available data instead, reset the origin. For details specific to an individual origin, see "Resetting the Origin" in the stage documentation.
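The stop, restart, and reset behavior described above can be illustrated with a small toy model. This is only a sketch of the offset semantics, not Data Collector code: the `ToyPipeline` class, its fields, and its methods are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyPipeline:
    """Hypothetical stand-in for a pipeline with a resettable origin."""
    source: List[str]                    # all data available at the origin
    saved_offset: Optional[int] = None   # None means no offset is stored

    def run(self, max_records: int) -> List[str]:
        """Process up to max_records, then stop and save the offset."""
        start = self.saved_offset or 0
        batch = self.source[start:start + max_records]
        self.saved_offset = start + len(batch)
        return batch

    def reset_origin(self) -> None:
        """Discard the saved offset so the next run starts from the beginning."""
        self.saved_offset = None


pipeline = ToyPipeline(source=["r1", "r2", "r3", "r4"])
first_run = pipeline.run(max_records=2)    # processes r1 and r2, then stops
second_run = pipeline.run(max_records=2)   # resumes at r3 from the saved offset
pipeline.reset_origin()                    # done while the pipeline is stopped
after_reset = pipeline.run(max_records=4)  # reprocesses everything from r1
```

In this model the second run picks up where the first one stopped, and only the explicit reset causes already-processed records to be read again.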

You can configure the Kafka and MapR Streams origins to process all available data by specifying an additional Kafka configuration property. For more information, see "Processing All Unread Data" in the stage documentation. You can reset the Azure IoT/Event Hub Consumer origin by deleting offset details in the Microsoft Azure portal. The remaining origin stages process transient data where resetting the origin has no effect.

You can reset the origin for multiple pipelines at the same time from the Home page, or for a single pipeline from the pipeline canvas.

To reset the origin:

1. Select multiple pipelines from the Home page, or view a single pipeline in the pipeline canvas.
2. Click the More icon, and then click Reset Origin.
3. In the Reset Origin Confirmation dialog box, click Yes to reset the origin.
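If you prefer to script the reset instead of using the UI, the following sketch shows how such a request might be built. Everything specific in it is an assumption to verify against the REST API documentation of your own Data Collector instance: the `/rest/v1/pipeline/{pipelineId}/resetOffset` path, the `X-Requested-By` header, the default port 18630, and the hypothetical helper name `build_reset_origin_request`. The code only constructs the request; it does not send it and does not handle authentication.

```python
import urllib.parse
import urllib.request


def build_reset_origin_request(base_url: str, pipeline_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that asks for an origin reset.

    Assumption: the endpoint path and the X-Requested-By header match your
    Data Collector version. Confirm both before relying on this.
    """
    quoted_id = urllib.parse.quote(pipeline_id, safe="")
    url = f"{base_url.rstrip('/')}/rest/v1/pipeline/{quoted_id}/resetOffset"
    return urllib.request.Request(
        url,
        data=b"",                           # empty body
        method="POST",
        headers={"X-Requested-By": "sdc"},  # assumed CSRF-protection header
    )


# Hypothetical host and pipeline ID, for illustration only.
request = build_reset_origin_request("http://localhost:18630", "my pipeline")
```

To send the request, you would pass it to `urllib.request.urlopen` with whatever authentication your instance requires. As with the UI procedure, make sure the pipeline is not running first.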