Transformer User Guide
Index
A
ADLS Gen1 destination
    configuring
    data formats
    overview
    partitions
    prerequisites
    retrieve authentication information
ADLS Gen1 origin
    configuring
    data formats
    overview
    partitions
    prerequisites
    retrieve authentication information
ADLS Gen2 destination
    configuring
    data formats
    overview
    prerequisites
    retrieve configuration details
ADLS Gen2 origin
    configuring
    data formats
    overview
    partitions
    prerequisites
    retrieve configuration details
ADLS stages
    local pipeline prerequisites
Aggregate processor
    aggregate functions
    configuring
    default output fields
    example
    overview
    shuffling of data
Amazon S3 destination
    AWS credentials
    configuring
    data formats
    overview
    partitions
Amazon S3 origin
    data formats
    overview
    partitions
    security
Amazon S3 stages
    local pipeline prerequisites
B
batch pipelines
    case study
    description
browser
    requirements
C
caching
    for origins and processors
    ludicrous mode
case study
    batch pipelines
    streaming pipelines
client deployment mode
    Hadoop YARN cluster
cluster
    Databricks
    Hadoop YARN
    requirements
cluster deployment mode
    Hadoop YARN cluster
conditions
    Delta Lake destination
    Filter processor
    Join processor
    Stream Selector processor
    Window processor
cross join
    Join processor
custom schemas
    application to JSON and delimited data
    DDL schema format
    error handling
    JSON schema format
    origins
D
Databricks
    cluster
    cluster configuration
Databricks pipelines
    staging directory
data formats
    ADLS Gen1 destination
    ADLS Gen1 origin
    ADLS Gen2 destination
    ADLS Gen2 origin
    Amazon S3 destination
    Amazon S3 origin
    File destination
    File origin
    Hive destination
    Kafka destination
    Kafka origin
data preview
    data type display
    overview
data types
    in preview
Deduplicate processor
    configuring
    overview
default output fields
    Aggregate processor
default stream
    Stream Selector
Delta Lake destination
    configuring
    overview
    overwrite condition
    partitions
    schema updates
    write mode
deployment mode
    Hadoop YARN cluster
destinations
    ADLS Gen1
    ADLS Gen2
    Amazon S3
    Delta Lake
    File
    Hive
    JDBC
    Kafka
directory path
    File destination
    File origin
drivers
    JDBC destination
    JDBC origin
E
encryption zones
    using KMS to access HDFS encryption zones
environment variables
    PySpark processor
execution mode
    pipelines
expressions
    Spark SQL Expression processor
F
Field Remover processor
    configuring
    overview
fields
    referencing
File destination
    configuring
    data formats
    directory path
    overview
    partitions
File origin
    configuring
    custom schema
    data formats
    directory path
    overview
    partitions
Filter processor
    configuring
    filter condition
    overview
full outer join
    Join processor
H
Hadoop impersonation mode
    configuring KMS for encryption zones
    lowercasing user names
    overview
Hadoop YARN
    cluster
    deployment mode
    directory requirements
    impersonation
    Kerberos authentication
history
    pipeline run
Hive destination
    additional Hive configuration properties
    configuring
    data formats
    overview
    partitions
Hive origin
    additional Hive configuration properties
    configuring
    full mode query guidelines
    incremental and full query mode
    incremental mode query guidelines
    overview
    partitions
    SQL query
I
impersonation mode
    Hadoop
inner join
    Join processor
inputs variable
    PySpark processor
installation
    overview
    requirements
    Spark cluster mode
    Spark local mode
J
JDBC destination
    configuring
    driver installation
    overview
    partitions
    tested versions and drivers
    write mode
JDBC origin
    configuring
    driver installation
    offset column
    overview
    partitions
    tested versions and drivers
job cluster
    Databricks
Join processor
    condition
    configuring
    criteria
    cross join
    full outer join
    inner join
    join types
    left anti join
    left outer join
    left semi join
    matching fields
    overview
    right outer join
    shuffling of data
join types
    Join processor
K
Kafka destination
    configuring
    data formats
    Kerberos authentication
    message
    overview
    security
    SSL/TLS encryption
Kafka origin
    configuring
    custom schemas
    data formats
    Kerberos authentication
    offsets
    overview
    partitions
    security
    SSL/TLS encryption
Kerberos authentication
    Hadoop YARN cluster
    Kafka destination
    Kafka origin
L
left anti join
    Join processor
left outer join
    Join processor
left semi join
    Join processor
local pipelines
    configuring
lookups
    overview
    streaming example
ludicrous mode
    caching
    enabling
    optimizing pipeline performance
    pipeline statistics
M
message
    Kafka destination
monitoring
    overview
    pausing
    Spark web UI
    viewing statistics
O
offset column
    JDBC
offsets
    Kafka origin
origins
    ADLS Gen1
    ADLS Gen2
    Amazon S3
    caching
    File
    Hive
    JDBC
    Kafka
    multiple
output order
    preview
output variable
    PySpark processor
P
partitioning
    overview
partitions
    ADLS Gen1 destination
    ADLS Gen1 origin
    ADLS Gen2 origin
    Amazon S3 destination
    Amazon S3 origin
    based on origins
    changing
    Delta Lake destination
    File destination
    File origin
    Hive destination
    Hive origin
    initial
    initial number
    JDBC destination
    JDBC origin
    Kafka origin
    Rank processor
performing lookups
    overview
pipeline performance
    ludicrous mode
pipeline run
    history
    summary
pipelines
    comparison with Data Collector
    configuring
    monitoring
    pause monitoring
    previewing
    run history
    Spark configuration
    stage library match requirement
ports
    default
prerequisites
    ADLS and Amazon S3 stages
    PySpark processor
    stage-related
preview
    availability
    color codes
    editing properties
    output order
    overview
    pipeline
    writing to destinations
processor
    output order
processors
    Aggregate
    caching
    Deduplicate
    Field Remover
    Filter
    Join
    Profile
    PySpark
    Rank
    referencing fields
    Repartition
    shuffling of data
    Sort
    Spark SQL Expression
    Spark SQL Query
    Stream Selector
    Type Converter
    Window
Profile processor
    configuring
    output records
    overview
    statistics
proxy users
    Transformer
PySpark processor
    configuring
    custom code
    environment variables
    examples
    inputs variable
    output variable
    overview
    prerequisites
    Python requirements
    referencing fields
Q
query mode
    Hive origin
R
Rank processor
    configuring
    example
    order by
    overview
    partition by
    rank functions
    shuffling of data
repartitioning
    overview
    types
Repartition processor
    configuring
    overview
    shuffling of data
    types
    use cases
right outer join
    Join processor
S
schema updates
    Delta Lake destination
security
    Kafka destination
    Kafka origin
shuffling
    overview
sorting
    multiple fields
Sort processor
    configuring
    multiple fields
    overview
Spark
    run locally
    run on cluster
Spark configuration
    local mode
    pipelines
Spark history server
    monitoring
Spark processing
    description
Spark SQL Expression processor
    expressions
    overview
Spark SQL processor
    configuring
Spark SQL query
    syntax
Spark SQL Query processor
    configuring
    examples
    overview
    query syntax
    referencing fields
Spark web UI
    monitoring
SQL query
    Hive origin
SSL/TLS encryption
    Kafka destination
    Kafka origin
stage library match requirement
    in a pipeline
staging directory
    Databricks pipelines
statistics
    pipeline
    Profile processor
    stages
streaming pipelines
    case study
    description
Stream Selector processor
    conditions
    configuring
    default stream
    overview
T
Technology Preview functionality
    description
Transformer
    architecture
    description
    for Data Collector users
    launching
    proxy users
    spark-submit
    starting
Type Converter processor
    configuring
    field type conversion
    overview
W
Window processor
    conditions
    configuring
    overview
    window types
window types
    Window processor
write mode
    Delta Lake destination
    JDBC destination
© 2019 StreamSets, Inc.