ETL - Sources

Source components represent the source of the data to be extracted. Some extractors, such as JDBCExtractor, work without a source, so the source component is optional.

Available Sources

  • file
  • input
  • http


File

Represents a source file from which data is read. Files can be plain text or compressed as tar.gz.

  • Component name: file


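| Parameter | Description                              | Type    | Mandatory | Default value |
|-----------|------------------------------------------|---------|-----------|---------------|
| path      | File path                                | string  | true      | -             |
| lock      | Lock the file during the extraction phase | boolean | false     | false         |
| encoding  | File encoding                            | string  | false     | UTF-8         |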


Extracts from the file "/temp/actor.tar.gz":

{ "file": { "path": "/temp/actor.tar.gz", "lock": true, "encoding": "UTF-8" } }
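The parameter table can also be read as a small validation rule: "path" is mandatory, while "lock" and "encoding" fall back to their defaults. The following is a minimal Python sketch of that rule. The helper name normalize_file_source is hypothetical and is not part of the ETL tool; it only illustrates how the defaults would be applied.

```python
import json

# Defaults taken from the parameter table; "path" has no default.
FILE_SOURCE_DEFAULTS = {"lock": False, "encoding": "UTF-8"}


def normalize_file_source(config_text):
    """Parse a source definition and return the "file" settings with defaults applied.

    Hypothetical helper for illustration; it is not part of the ETL tool.
    """
    config = json.loads(config_text)
    settings = config.get("file")
    if settings is None:
        raise ValueError('expected a "file" source component')
    if "path" not in settings:
        raise ValueError('"path" is mandatory for the file source')
    unknown = set(settings) - {"path", "lock", "encoding"}
    if unknown:
        raise ValueError("unknown parameters: " + ", ".join(sorted(unknown)))
    # Explicit settings override the defaults.
    return {**FILE_SOURCE_DEFAULTS, **settings}


print(normalize_file_source('{ "file": { "path": "/temp/actor.tar.gz" } }'))
```

Rejecting unknown parameters is a design choice of this sketch, intended to catch typos such as "encodng"; the real tool may behave differently.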


Input

Reads data from console input (standard input). This is useful when the ETL runs in a pipe with other tools.

  • Component name: input


| Parameter | Description | Type | Mandatory | Default value |
|-----------|-------------|------|-----------|---------------|
| (none)    |             |      |           |               |


Reads the file from standard input:

cat /etc/csv| "{transformers:[{csv:{}}]}"
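To illustrate what reading from a pipe means in general, here is a minimal Python sketch of a consumer that reads lines from standard input. It is not the ETL tool's implementation; the function name read_lines and the sample rows are made up for illustration.

```python
import io
import sys


def read_lines(stream=sys.stdin):
    """Yield each non-empty line from the stream without its trailing newline."""
    for line in stream:
        line = line.rstrip("\n")
        if line:
            yield line


# Simulated piped input with made-up rows; in a real pipe, call read_lines()
# with no argument so that it reads from sys.stdin.
piped = io.StringIO("id,name\n1,Alice\n2,Bob\n")
for row in read_lines(piped):
    print(row.split(","))
```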


HTTP

Uses an HTTP endpoint as a data source.

  • Component name: http


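| Parameter | Description                                                                     | Type     | Mandatory | Default value |
|-----------|---------------------------------------------------------------------------------|----------|-----------|---------------|
| url       | HTTP URL to invoke                                                              | string   | true      | -             |
| method    | HTTP method; one of "GET", "POST", "PUT", "DELETE", "HEAD", "OPTIONS", "TRACE"  | string   | false     | GET           |
| headers   | Request headers, given as an inner key/value document                           | document | false     | -             |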
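As a rough illustration of how these three parameters map onto an HTTP request, the following Python sketch builds (but does not send) a request with the standard library. The helper build_request and the URL http://localhost:8080/data are assumptions made for this example; they do not come from the ETL tool or from this page.

```python
import urllib.request

# Methods listed in the parameter table.
ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE", "HEAD", "OPTIONS", "TRACE"}


def build_request(http_config):
    """Build (but do not send) a urllib request from an "http" source definition.

    Hypothetical helper for illustration; it is not part of the ETL tool.
    """
    url = http_config.get("url")
    if not url:
        raise ValueError('"url" is mandatory for the http source')
    method = http_config.get("method", "GET")  # default from the table
    if method not in ALLOWED_METHODS:
        raise ValueError("unsupported HTTP method: " + method)
    headers = http_config.get("headers", {})
    return urllib.request.Request(url, headers=headers, method=method)


# http://localhost:8080/data is a placeholder URL, not one from this document.
request = build_request({
    "url": "http://localhost:8080/data",
    "headers": {"User-Agent": "etl-example"},
})
print(request.get_method(), request.full_url, request.header_items())
```

Passing the resulting object to urllib.request.urlopen would perform the call; the sketch stops short of that so it can be run without a network connection.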


Executes a GET request against the URL "", setting the User-Agent header:

{ "http": {
    "url": "",
    "method": "GET",
    "headers": {
      "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36"