Load Data from S3
Load data from Amazon S3 files and optionally store it in Objects with batch processing. This function provides secure S3 connectivity, multi-format file support (CSV, JSON, TXT), configurable parsing options, and automatic type detection.
Technical Name: S3DataLoadFunc
Properties
- Execution Mode: SYNC
- Type: NATIVE
- Category: System Functions
- Function ID: 17930933-4345-4ec6-bc43-fbbaf26f463c
Input Schema
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| connectionId | string (UUID) | UUID of the S3 connection to use for data loading |
| filePath | string | S3 key/path of the file to process (max 1024 characters) |
| fileType | string | Type of file to process: CSV, JSON, or TXT |
| batchSize | integer | Number of records to process in each batch (1-10000, default: 1000) |
| skipRows | integer | Number of rows to skip at the beginning of the file (default: 0) |
| delimiter | string | Delimiter character for CSV files (max 5 characters, default: ",") |
Optional Parameters
| Parameter | Type | Description |
|---|---|---|
| mobjectId | string (UUID) | UUID of the Object where data will be stored (optional for query-only execution) |
| hasHeader | boolean | Whether the file has a header row (default: true) |
| encoding | string | Character encoding of the file: UTF-8, UTF-16, ISO-8859-1, or Windows-1252 (default: UTF-8) |
| datePattern | string | Pattern for parsing date fields (e.g., 'yyyy-MM-dd', max 50 characters) |
| maxRecords | integer | Maximum number of records to process (0 means no limit, default: 0) |
| fieldMappings | object | Optional field mappings to transform source data fields to target MData fields. Key is the source field name; value is the target field name |
[Image placeholder: S3DataLoad function configuration panel]
Input Example
Load CSV File to Object
```json
{
  "connectionId": "123e4567-e89b-12d3-a456-426614174000",
  "filePath": "data/sales/2024/sales_data.csv",
  "fileType": "CSV",
  "mobjectId": "sales-mobject-uuid",
  "batchSize": 1000,
  "skipRows": 1,
  "delimiter": ",",
  "hasHeader": true,
  "encoding": "UTF-8",
  "maxRecords": 0
}
```
Load JSON File with Field Mappings
```json
{
  "connectionId": "s3-connection-uuid",
  "filePath": "exports/customers.json",
  "fileType": "JSON",
  "mobjectId": "customer-mobject-uuid",
  "batchSize": 500,
  "skipRows": 0,
  "fieldMappings": {
    "customer_name": "name",
    "customer_email": "email",
    "customer_phone": "phone"
  },
  "encoding": "UTF-8"
}
```
Query-Only Execution (No Storage)
```json
{
  "connectionId": "s3-connection-uuid",
  "filePath": "reports/data.csv",
  "fileType": "CSV",
  "batchSize": 1000,
  "skipRows": 0,
  "delimiter": ",",
  "hasHeader": true,
  "maxRecords": 100
}
```
Output Schema
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether the operation completed successfully |
| totalRecords | integer | Total number of records found in the file |
| processedRecords | integer | Number of records that were processed |
| createdRecords | integer | Number of MData records successfully created |
| skippedRecords | integer | Number of records skipped due to validation errors |
| errors | array (string) | List of errors encountered during processing |
| warnings | array (string) | List of warnings encountered during processing |
| fileInfo | object | Information about the processed file (fileName, fileSize, lastModified) |
| data | array (object) | Raw data records (only returned when mobjectId is not provided) |
| executionTime | integer | Execution time in milliseconds |
| connectionId | string (UUID) | The S3 connection ID that was used |
| mObjectId | string (UUID) | The Object ID where data was stored (if provided) |
Output Example
Success Response with Data Storage
```json
{
  "success": true,
  "totalRecords": 5000,
  "processedRecords": 5000,
  "createdRecords": 4950,
  "skippedRecords": 50,
  "errors": [],
  "warnings": [
    "50 records skipped due to validation errors"
  ],
  "fileInfo": {
    "fileName": "sales_data.csv",
    "fileSize": 1048576,
    "lastModified": "2024-01-15T10:30:00"
  },
  "executionTime": 2500,
  "connectionId": "s3-connection-uuid",
  "mObjectId": "sales-mobject-uuid"
}
```
Query-Only Response
```json
{
  "success": true,
  "totalRecords": 100,
  "processedRecords": 100,
  "createdRecords": 0,
  "skippedRecords": 0,
  "errors": [],
  "warnings": [],
  "fileInfo": {
    "fileName": "data.csv",
    "fileSize": 51200,
    "lastModified": "2024-01-15T10:30:00"
  },
  "data": [
    {
      "column1": "value1",
      "column2": "value2"
    }
  ],
  "executionTime": 500,
  "connectionId": "s3-connection-uuid"
}
```
[Image placeholder: S3DataLoad output visualization]
How It Works
The function performs the following steps:
- Validate Input: Validates file path, format, parsing parameters, and S3 connection
- Establish S3 Connection: Connects to S3 using the provided connection ID and validates file accessibility
- Parse File: Parses file content based on file type:
  - CSV: Custom delimiters, headers, row skipping, automatic type detection
  - JSON: Array and object parsing with max record limits
  - TXT: Text file parsing with delimiter support
- Transform Data: Applies field mappings if provided and transforms data types
- Process Data:
  - If mobjectId is provided: creates MData records in batches
  - If mobjectId is not provided: returns raw data without storage
- Return Result: Returns processing statistics, file metadata, and optional data
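For intuition, the Python sketch below walks the same flow for a CSV file. It is a minimal illustration, not the function's implementation: boto3 and the bucket/key arguments are assumptions for S3 access, and store_batch stands in for whatever creates the MData records.
```python
import csv
import io

import boto3  # assumed here for S3 access; the real connector is platform-managed

def load_csv_from_s3(bucket, key, delimiter=",", skip_rows=0, batch_size=1000,
                     encoding="UTF-8", field_mappings=None, store_batch=None):
    # Fetch the object; a production version would stream instead of reading fully.
    raw = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    text = io.StringIO(raw.decode(encoding))
    for _ in range(skip_rows):
        next(text, None)  # skipRows: discard leading lines
    reader = csv.DictReader(text, delimiter=delimiter)  # hasHeader=true case
    batch, processed = [], 0
    for row in reader:
        if field_mappings:
            # Rename source fields to target Object fields.
            row = {field_mappings.get(k, k): v for k, v in row.items()}
        batch.append(row)
        processed += 1
        if store_batch and len(batch) >= batch_size:
            store_batch(batch)  # e.g. create MData records for this batch
            batch = []
    if store_batch and batch:
        store_batch(batch)  # flush the final partial batch
    return processed
```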
Supported File Formats
CSV Files
- Custom delimiters (comma, semicolon, tab, etc.)
- Header row detection
- Row skipping
- Automatic type detection (numbers, booleans, dates, strings)
- Custom date pattern parsing
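The sketch below gives a rough picture of how this kind of detection can work (an illustration, not the function's actual implementation): booleans first, then numbers, then an optional date pattern, falling back to string. Note that Python's strptime expects %Y-%m-%d style patterns, whereas the function's datePattern uses the yyyy-MM-dd style.
```python
from datetime import datetime

def detect_type(value, date_pattern=None):
    """Coerce a raw CSV string into bool, int, float, datetime, or str."""
    v = value.strip()
    if v.lower() in ("true", "false"):
        return v.lower() == "true"
    for cast in (int, float):
        try:
            return cast(v)
        except ValueError:
            pass
    if date_pattern:  # e.g. "%Y-%m-%d", the strptime form of 'yyyy-MM-dd'
        try:
            return datetime.strptime(v, date_pattern)
        except ValueError:
            pass
    return v  # no conversion matched: keep the string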
JSON Files
- JSON array parsing (array of objects)
- Single JSON object parsing
- Max record limits
- Nested object support
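A minimal sketch of the array-versus-object handling with a maxRecords limit, assuming the file content has already been decoded to a string:
```python
import json

def parse_json_records(content, max_records=0):
    parsed = json.loads(content)
    # A single top-level object is normalized to a one-element record list.
    records = parsed if isinstance(parsed, list) else [parsed]
    if max_records > 0:  # 0 means no limit
        records = records[:max_records]
    return records
```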
TXT Files
- Delimiter-based parsing
- Encoding support
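Line-oriented TXT parsing can be pictured as below; this sketch assumes one record per line and a header on the first line:
```python
def parse_txt(raw_bytes, delimiter="\t", encoding="UTF-8"):
    lines = raw_bytes.decode(encoding).splitlines()
    header = lines[0].split(delimiter)
    return [dict(zip(header, line.split(delimiter))) for line in lines[1:]]
```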
Use Cases
Bulk Data Import from S3
Import large CSV files from S3 into Objects:
```json
{
  "connectionId": "s3-connection-uuid",
  "filePath": "imports/products.csv",
  "fileType": "CSV",
  "mobjectId": "product-mobject-uuid",
  "batchSize": 5000,
  "skipRows": 1,
  "delimiter": ",",
  "hasHeader": true
}
```
Data Transformation with Field Mappings
Transform and map fields during import:
```json
{
  "connectionId": "s3-connection-uuid",
  "filePath": "data/customers.json",
  "fileType": "JSON",
  "mobjectId": "customer-mobject-uuid",
  "batchSize": 1000,
  "fieldMappings": {
    "first_name": "firstName",
    "last_name": "lastName",
    "email_address": "email"
  }
}
```
Workflow Data Processing
Process S3 files in workflows:
Start → S3DataLoad → TransformData → InsertMData → End
[Image placeholder: Workflow example]
Query-Only File Reading
Read and process S3 files without storage:
```json
{
  "connectionId": "s3-connection-uuid",
  "filePath": "reports/summary.csv",
  "fileType": "CSV",
  "batchSize": 1000,
  "maxRecords": 1000
}
```
Notes
- S3 Connection: Requires a configured S3 connection with proper credentials and bucket access
- File Path: S3 key/path must be valid and accessible with the provided connection credentials
- Batch Processing: Large files are processed in batches to optimize memory usage
- Type Detection: Automatic type detection for numbers, booleans, dates, and strings
- Field Mappings: Use field mappings to transform source field names to target Object field names
- Error Handling: Records that fail validation are skipped while processing continues (see the sketch after these notes)
- Security: File path validation prevents directory traversal attacks
- Encoding: Supports multiple character encodings for international file support
- Date Parsing: Custom date patterns can be specified for date field parsing
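To make the error-handling behavior concrete, here is a hedged sketch of consuming the output schema above; run_function is a hypothetical stand-in for however your platform invokes S3DataLoadFunc, not a documented API:
```python
result = run_function("S3DataLoadFunc", {  # run_function is hypothetical
    "connectionId": "s3-connection-uuid",
    "filePath": "imports/products.csv",
    "fileType": "CSV",
    "mobjectId": "product-mobject-uuid",
})
if not result["success"]:
    raise RuntimeError(f"S3 load failed: {result['errors']}")
for warning in result["warnings"]:
    print("warning:", warning)
if result["skippedRecords"]:
    print(f"{result['skippedRecords']} of {result['totalRecords']} records skipped")
```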
Related Functions
- Load Data from Database - Execute SQL queries against database connections and load results into Objects
- Parse CSV - Parse CSV files with automatic field resolution
- Insert MData - Create new MData records