Resetting the Origin
You can reset the origin when you want the Data Collector to process all available data instead of resuming from the last-saved offset. Reset the origin only when the pipeline is not running. You can reset the origin for the following origin stages:
- Amazon S3
- Directory
- Elasticsearch
- File Tail
- Google Cloud Storage
- Hadoop FS Standalone
- HTTP Client
- JDBC Multitable Consumer
- JDBC Query Consumer
- Kinesis Consumer
- MapR DB JSON
- MapR FS Standalone
- MongoDB
- MongoDB Oplog
- MySQL Binary Log
- Salesforce
- SFTP/FTP Client
- SQL Server CDC Client
- SQL Server Change Tracking
- Teradata Consumer
- Windows Event Log
For these origins, when you stop the pipeline, the Data Collector records where it stopped processing data. When you restart the pipeline, it continues from that point by default. To process all available data instead, reset the origin. For details unique to a particular origin, see "Resetting the Origin" in the stage documentation.
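The offset behavior described above can be illustrated with a small model. This is only a sketch: `ToyOrigin`, `run`, and `reset_origin` are hypothetical names invented for this example and are not part of the Data Collector. The sketch shows a saved offset being recorded when processing stops, reused on the next run, and discarded on reset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyOrigin:
    """Hypothetical stand-in for an origin that tracks a saved offset."""
    records: List[str]
    saved_offset: Optional[int] = None  # None means no offset has been saved

    def run(self, max_records: int) -> List[str]:
        """Process up to max_records, starting at the last-saved offset."""
        start = 0 if self.saved_offset is None else self.saved_offset
        batch = self.records[start:start + max_records]
        # When the run stops, remember where processing ended.
        self.saved_offset = start + len(batch)
        return batch

    def reset_origin(self) -> None:
        """Discard the saved offset so the next run starts from the beginning."""
        self.saved_offset = None


origin = ToyOrigin(records=["r1", "r2", "r3", "r4"])
first_run = origin.run(max_records=2)    # processes r1 and r2, then stops
second_run = origin.run(max_records=2)   # resumes from the saved offset: r3 and r4
origin.reset_origin()
after_reset = origin.run(max_records=4)  # processes all available data again
```

Without the call to `reset_origin`, a third run would return an empty batch, because the saved offset already points past the last record.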
You can configure the Kafka and MapR Streams origins to process all available data by specifying an additional Kafka configuration property. For more information, see "Processing All Unread Data" in the stage documentation. You can reset the Azure IoT/Event Hub Consumer origin by deleting offset details in the Microsoft Azure portal. The remaining origin stages process transient data, so resetting the origin has no effect on them.
You can reset the origin for multiple pipelines at the same time from the Home page, or for a single pipeline from the pipeline canvas.
To reset the origin:
- Select multiple pipelines from the Home page, or view a single pipeline in the pipeline canvas.
- Click the More icon, and then click Reset Origin.
- In the Reset Origin Confirmation dialog box, click Yes to reset the origin.
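The same reset can probably be scripted rather than clicked through. The sketch below only builds the HTTP request and does not send it. It assumes the Data Collector exposes a REST endpoint of the form `/rest/v1/pipeline/{pipelineId}/resetOffset` and expects an `X-Requested-By` header on POST requests. Both are assumptions to verify against the RESTful API reference for your version. The base URL, the pipeline ID, and the helper name `build_reset_origin_request` are placeholders, and authentication is omitted.

```python
from urllib.parse import quote
from urllib.request import Request


def build_reset_origin_request(base_url: str, pipeline_id: str) -> Request:
    """Build (but do not send) a POST request that would reset one pipeline's origin.

    The endpoint path and header are assumptions; confirm them in the
    Data Collector REST API documentation before relying on this.
    """
    safe_id = quote(pipeline_id, safe="")  # escape characters that are unsafe in a URL path
    url = f"{base_url.rstrip('/')}/rest/v1/pipeline/{safe_id}/resetOffset"
    return Request(
        url,
        data=b"",                            # empty body; the request is a plain POST
        method="POST",
        headers={"X-Requested-By": "sdc"},   # assumed header requirement for POST calls
    )


# Placeholder values for illustration only.
req = build_reset_origin_request("http://localhost:18630", "myPipeline01")
```

To send the request, pass `req` to `urllib.request.urlopen` after adding whatever credentials your installation requires. As with the UI steps, the pipeline should be stopped first.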