Options
This page describes the configuration settings available for Artie Transfer.
The options below can be specified in a configuration file. Once the file has been created, you can run Artie Transfer like this:
Note: Keys here are written in dot notation for readability. Make sure they are properly nested when you write them into your configuration file. To see sample configuration files, visit the Examples page.
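To make the dot-notation note concrete, here is a minimal Python sketch of how dotted keys map onto nested objects. The helper name nest_dotted_keys and the values are hypothetical and only for illustration; the two keys are taken from the Kafka section of this page.

```python
def nest_dotted_keys(flat):
    """Expand {"a.b": 1} style keys into nested dictionaries."""
    nested = {}
    for dotted_key, value in flat.items():
        parts = dotted_key.split(".")
        node = nested
        # Walk (or create) every intermediate level, then set the leaf.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


flat_settings = {
    "kafka.bootstrapServer": "localhost:9092",  # placeholder value
    "kafka.groupID": "example-group",           # placeholder value
}
nested_settings = nest_dotted_keys(flat_settings)
print(nested_settings)
```

In the configuration file, the nested form is what should appear: a kafka object containing bootstrapServer and groupID, rather than literal keys with dots in them.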
Key | Optional | Description |
---|---|---|
| N | This is the destination. Supported values are currently:
|
| Y | Defaults to Other valid options are Please check the respective sections below on what else is required. |
| Y | DSN for Sentry alerts. If blank, alerts will just go to stdout. |
| Y | Defaults to |
| Y | Defaults to |
| Y | Defaults to |
Kafka
bootstrapServer
Pass in the Kafka bootstrap server. As a best practice, pass in a comma-separated list of bootstrap servers to maintain high availability. This follows the same spec as Kafka.
Type: String
Optional: No
groupID
This is the name of the Kafka consumer group. You can set it to whatever you'd like. Just remember that offsets are associated with a particular consumer group.
Type: String
Optional: No
username + password
If you'd like to use SASL/SCRAM auth, you can pass the username and password.
Type: String
Optional: Yes
enableAWSMSKIAM
Turn this on if you would like to use IAM authentication to communicate with Amazon MSK. If you enable this, make sure to pass in AWS_REGION, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
Type: Boolean
Optional: Yes
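As a pre-flight check for the requirement above, the following Python sketch reports which of the three values are missing when enableAWSMSKIAM is on. It assumes the values are supplied as environment variables, which their names suggest but this page does not state explicitly. The function name missing_msk_iam_env is hypothetical and is not part of Artie Transfer.

```python
import os

REQUIRED_MSK_IAM_ENV = ("AWS_REGION", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_msk_iam_env(kafka_settings, env=None):
    """Return the required variables that are unset or empty.

    Assumption: the values are read from the process environment.
    """
    env = os.environ if env is None else env
    # Nothing is required unless IAM authentication is switched on.
    if not kafka_settings.get("enableAWSMSKIAM", False):
        return []
    return [name for name in REQUIRED_MSK_IAM_ENV if not env.get(name)]


# Placeholder environment with one value deliberately left out.
example_env = {"AWS_REGION": "us-east-1", "AWS_ACCESS_KEY_ID": "example-key-id"}
missing = missing_msk_iam_env({"enableAWSMSKIAM": True}, example_env)
print(missing)
```

Running a check like this before starting Transfer can surface a missing credential earlier than a failed connection to the broker would.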
Topic Configs
TopicConfigs are used at the table level and store configurations such as:
Destination's database, schema and table name.
The data format, and whether there is an idempotent key.
Whether it should do row-based soft deletion or not.
Whether it should drop deleted columns or not.
These are stored in this particular fashion. See Examples for more details.
Key | Optional | Description |
---|---|---|
| N | Name of the database in destination. |
| Y | Name of the table in destination.
|
| N | Name of the schema in Snowflake. Not needed for BigQuery. |
| N | Name of the Kafka topic. |
| N | Name of the column that is used for idempotency. This field is highly recommended.
For example: |
| N | Name of the CDC connector (thus format) we should be expecting to parse against. Currently, the supported values are:
|
| N | Format that Kafka Connect will use for the key. This is called |
| Y | Defaults to |
| Y | Defaults to |
| Y | Comma-separated string of operations for Transfer to skip. Valid values are:
Can be specified like: |
This is getting deprecated in the next Transfer version. Use | Y | Defaults to |
| Y | Defaults to |
| Y | Defaults to |
| Y | Enable this to turn on BigQuery table partitioning.
This is available starting |
BigQuery Partition Settings
This is the object stored under Topic Config.
partitionType
Type of partitioning. We currently support only time-based partitioning, so the only valid value right now is time.
Type: String
Optional: Yes
partitionField
The field or column to partition on.
Type: String
Optional: Yes
partitionBy
The time granularity to use for time partitioning. The only valid value right now is daily.
Type: String
Optional: Yes
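The three partition settings above can be sanity-checked before deployment. The Python sketch below encodes only the valid values listed on this page (time and daily). It assumes that all three fields should be set together whenever the object is present, which this page does not state. The function name and the column name created_at are hypothetical.

```python
VALID_PARTITION_TYPES = {"time"}
VALID_PARTITION_BY = {"daily"}


def partition_settings_errors(settings):
    """Return a list of problems found in a partition settings object."""
    errors = []
    if settings.get("partitionType") not in VALID_PARTITION_TYPES:
        errors.append("partitionType must be: time")
    # Assumption: a partition field is needed whenever partitioning is configured.
    if not settings.get("partitionField"):
        errors.append("partitionField must name a field or column")
    if settings.get("partitionBy") not in VALID_PARTITION_BY:
        errors.append("partitionBy must be: daily")
    return errors


good = {"partitionType": "time", "partitionField": "created_at", "partitionBy": "daily"}
bad = {"partitionType": "time", "partitionField": "created_at", "partitionBy": "hourly"}
print(partition_settings_errors(good))
print(partition_settings_errors(bad))
```

If more partition types or granularities become supported, only the two sets at the top would need to change.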
Google Pub/Sub
projectID
pathToCredentials
This is the path to the credentials for the service account to use. You can re-use the same credentials as BigQuery, or you can use a different service account to support use cases of cross-account transfers.
Type: String
Optional: No
topicConfigs
Follow the same convention as kafka.topicConfigs above.
BigQuery
Key | Optional | Description |
---|---|---|
| Y | Path to the credentials file for Google.
You can also directly inject |
| N | Google Cloud Project ID |
| Y | Location of the BigQuery dataset.
Defaults to |
| N | The default dataset used. This just allows us to connect to BigQuery using data source notation (DSN). |
| Y | Batch size is used to chunk requests to BigQuery's Storage API to avoid the 10 MB limit.
If this is not passed in, we will just default to |
Shared Transfer config
additionalDateFormats
By default, Artie Transfer supports a wide array of date formats. If your layout is not supported, you can specify additional ones here. If you're unsure, please refer to this guide.
Type: List of layouts
Optional: Yes
createAllColumnsIfAvailable
By default, Artie Transfer will only create a column in the destination if the column contains a non-null value. You can override this behavior by setting this value to true.
Type: Boolean
Optional: Yes
Snowflake
Please see the Snowflake page for how to gather these values.
Key | Optional | Description |
---|---|---|
| N | |
| N | Snowflake username |
| N | Snowflake password |
| N | Virtual warehouse name |
| N | Snowflake region. |
Redshift
Key | Optional | Description |
---|---|---|
| N | Host URL
e.g. |
| N | - |
| N | Namespace / Database in Redshift. |
| N | |
| N | |
| N | Bucket where staging files will be stored. Click here to see how to set up an S3 bucket and have it automatically purged based on expiration. |
| Y | The prefix for S3. For example, if the bucket is foo and the prefix is bar, the path becomes: s3://foo/bar/file.txt |
| N | Redshift credentials clause to store staging files into S3. Source |
| Y | Defaults to false.
If this is passed in, Artie Transfer will mask the column value with:
1. If value is a string, |
S3
optionalPrefix
Prefix after the bucket name. If this is specified, Artie Transfer will save the files under s3://artie-transfer/optionalPrefix/...
Type: String
Optional: Yes
bucket
S3 bucket name. Example: foo.
Type: String
Optional: No
awsAccessKeyID
The AWS_ACCESS_KEY_ID for the service account.
Type: String
Optional: No
awsSecretAccessKey
The AWS_SECRET_ACCESS_KEY for the service account.
Type: String
Optional: No
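To illustrate how bucket and optionalPrefix combine, here is a small Python sketch that builds the base location described under optionalPrefix. It only mirrors the s3://bucket/prefix/ pattern shown on this page. How Artie Transfer names the files beneath that location is not covered here, and the function name s3_base_location is hypothetical.

```python
def s3_base_location(bucket, optional_prefix=""):
    """Build the base S3 location from a bucket name and an optional prefix."""
    # Tolerate prefixes written with leading or trailing slashes.
    prefix = optional_prefix.strip("/")
    if prefix:
        return f"s3://{bucket}/{prefix}/"
    return f"s3://{bucket}/"


print(s3_base_location("artie-transfer", "optionalPrefix"))
print(s3_base_location("foo"))
```

Stripping slashes from the prefix is a design choice of this sketch, made so that values such as /bar/ and bar produce the same location.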
Telemetry
An overview of Telemetry can be found here: Overview.
Key | Type | Optional | Description |
---|---|---|---|
| Object | Y | Parent object. See below. |
| String | Y | Provider to export metrics to. Transfer currently only supports: |
| Object | Y | Additional settings block, see below |
| Array | Y | Tags that will appear on every metric, like: |
| String | Y | Optional namespace prefix for metrics. Defaults to |
| String | Y | Address where the statsD agent is running. Defaults to |
| Number | Y | Percentage of data to send. Provide a number between 0 and 1. Defaults to |