Load Data from S3
Load data from Amazon S3 files and optionally store it in Objects with batch processing. This function provides secure S3 connectivity, multi-format file support (CSV, JSON, TXT), configurable parsing options, and automatic type detection.
Technical Name: S3DataLoadFunc
Properties
- Execution Mode: SYNC
- Type: NATIVE
- Category: System Functions
- Function ID: 17930933-4345-4ec6-bc43-fbbaf26f463c
Input Schema
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| connectionId | string (UUID) | UUID of the S3 connection to use for data loading |
| filePath | string | S3 key/path of the file to process (max 1024 characters) |
| fileType | string | Type of file to process: CSV, JSON, or TXT |
| batchSize | integer | Number of records to process in each batch (1-10000, default: 1000) |
| skipRows | integer | Number of rows to skip at the beginning of the file (default: 0) |
| delimiter | string | Delimiter character for CSV files (max 5 characters, default: ",") |
Optional Parameters
| Parameter | Type | Description |
|---|---|---|
| mobjectId | string (UUID) | UUID of the Object where data will be stored (optional for query-only execution) |
| hasHeader | boolean | Whether the file has a header row (default: true) |
| encoding | string | Character encoding of the file: UTF-8, UTF-16, ISO-8859-1, or Windows-1252 (default: UTF-8) |
| datePattern | string | Pattern for parsing date fields (e.g., 'yyyy-MM-dd', max 50 characters) |
| maxRecords | integer | Maximum number of records to process (0 means no limit, default: 0) |
| fieldMappings | object | Optional field mappings to transform source data fields to target MData fields. Key is the source field name; value is the target field name |
[Image placeholder: S3DataLoad function configuration panel]
Input Example
Load CSV File to Object
```json
{
  "connectionId": "123e4567-e89b-12d3-a456-426614174000",
  "filePath": "data/sales/2024/sales_data.csv",
  "fileType": "CSV",
  "mobjectId": "sales-mobject-uuid",
  "batchSize": 1000,
  "skipRows": 1,
  "delimiter": ",",
  "hasHeader": true,
  "encoding": "UTF-8",
  "maxRecords": 0
}
```
Load JSON File with Field Mappings
```json
{
  "connectionId": "s3-connection-uuid",
  "filePath": "exports/customers.json",
  "fileType": "JSON",
  "mobjectId": "customer-mobject-uuid",
  "batchSize": 500,
  "skipRows": 0,
  "fieldMappings": {
    "customer_name": "name",
    "customer_email": "email",
    "customer_phone": "phone"
  },
  "encoding": "UTF-8"
}
```
Query-Only Execution (No Storage)
```json
{
  "connectionId": "s3-connection-uuid",
  "filePath": "reports/data.csv",
  "fileType": "CSV",
  "batchSize": 1000,
  "skipRows": 0,
  "delimiter": ",",
  "hasHeader": true,
  "maxRecords": 100
}
```
Output Schema
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether the operation completed successfully |
| totalRecords | integer | Total number of records found in the file |
| processedRecords | integer | Number of records that were processed |
| createdRecords | integer | Number of MData records successfully created |
| skippedRecords | integer | Number of records skipped due to validation errors |
| errors | array (string) | List of errors encountered during processing |
| warnings | array (string) | List of warnings encountered during processing |
| fileInfo | object | Information about the processed file (fileName, fileSize, lastModified) |
| data | array (object) | Raw data records (only returned when mobjectId is not provided) |
| executionTime | integer | Execution time in milliseconds |
| connectionId | string (UUID) | The S3 connection ID that was used |
| mObjectId | string (UUID) | The Object ID where data was stored (if provided) |
Output Example
Success Response with Data Storage
```json
{
  "success": true,
  "totalRecords": 5000,
  "processedRecords": 5000,
  "createdRecords": 4950,
  "skippedRecords": 50,
  "errors": [],
  "warnings": [
    "50 records skipped due to validation errors"
  ],
  "fileInfo": {
    "fileName": "sales_data.csv",
    "fileSize": 1048576,
    "lastModified": "2024-01-15T10:30:00"
  },
  "executionTime": 2500,
  "connectionId": "s3-connection-uuid",
  "mObjectId": "sales-mobject-uuid"
}
```
Query-Only Response
```json
{
  "success": true,
  "totalRecords": 100,
  "processedRecords": 100,
  "createdRecords": 0,
  "skippedRecords": 0,
  "errors": [],
  "warnings": [],
  "fileInfo": {
    "fileName": "data.csv",
    "fileSize": 51200,
    "lastModified": "2024-01-15T10:30:00"
  },
  "data": [
    {
      "column1": "value1",
      "column2": "value2"
    }
  ],
  "executionTime": 500,
  "connectionId": "s3-connection-uuid"
}
```
[Image placeholder: S3DataLoad output visualization]
How It Works
The function performs the following steps:
- Validate Input: Validates file path, format, parsing parameters, and S3 connection
- Establish S3 Connection: Connects to S3 using the provided connection ID and validates file accessibility
- Parse File: Parses file content based on file type:
  - CSV: Custom delimiters, headers, row skipping, automatic type detection
  - JSON: Array and object parsing with max record limits
  - TXT: Text file parsing with delimiter support
- Transform Data: Applies field mappings if provided and transforms data types
- Process Data:
  - If mobjectId is provided: creates MData records in batches
  - If mobjectId is not provided: returns raw data without storage
- Return Result: Returns processing statistics, file metadata, and optional data
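For intuition, the Python sketch below walks the same flow for a CSV file. It is a minimal illustration, not the function's implementation: boto3 and the bucket/key arguments are assumptions for S3 access, and store_batch stands in for whatever creates the MData records.
```python
import csv
import io

import boto3  # assumed here for S3 access; the real connector is platform-managed

def load_csv_from_s3(bucket, key, delimiter=",", skip_rows=0, batch_size=1000,
                     encoding="UTF-8", field_mappings=None, store_batch=None):
    # Fetch the object; a production version would stream instead of reading fully.
    raw = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    text = io.StringIO(raw.decode(encoding))
    for _ in range(skip_rows):
        next(text, None)  # skipRows: discard leading lines
    reader = csv.DictReader(text, delimiter=delimiter)  # hasHeader=true case
    batch, processed = [], 0
    for row in reader:
        if field_mappings:
            # Rename source fields to target Object fields.
            row = {field_mappings.get(k, k): v for k, v in row.items()}
        batch.append(row)
        processed += 1
        if store_batch and len(batch) >= batch_size:
            store_batch(batch)  # e.g. create MData records for this batch
            batch = []
    if store_batch and batch:
        store_batch(batch)  # flush the final partial batch
    return processed
```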
Supported File Formats
CSV Files
- Custom delimiters (comma, semicolon, tab, etc.)
- Header row detection
- Row skipping
- Automatic type detection (numbers, booleans, dates, strings)
- Custom date pattern parsing
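The sketch below gives a rough picture of how this kind of detection can work (an illustration, not the function's actual implementation): booleans first, then numbers, then an optional date pattern, falling back to string. Note that Python's strptime expects %Y-%m-%d style patterns, whereas the function's datePattern uses the yyyy-MM-dd style.
```python
from datetime import datetime

def detect_type(value, date_pattern=None):
    """Coerce a raw CSV string into bool, int, float, datetime, or str."""
    v = value.strip()
    if v.lower() in ("true", "false"):
        return v.lower() == "true"
    for cast in (int, float):
        try:
            return cast(v)
        except ValueError:
            pass
    if date_pattern:  # e.g. "%Y-%m-%d", the strptime form of 'yyyy-MM-dd'
        try:
            return datetime.strptime(v, date_pattern)
        except ValueError:
            pass
    return v  # no conversion matched: keep the string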
JSON Files
- JSON array parsing (array of objects)
- Single JSON object parsing
- Max record limits
- Nested object support
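A minimal sketch of the array-versus-object handling with a maxRecords limit, assuming the file content has already been decoded to a string:
```python
import json

def parse_json_records(content, max_records=0):
    parsed = json.loads(content)
    # A single top-level object is normalized to a one-element record list.
    records = parsed if isinstance(parsed, list) else [parsed]
    if max_records > 0:  # 0 means no limit
        records = records[:max_records]
    return records
```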
TXT Files
- Delimiter-based parsing
- Encoding support
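Line-oriented TXT parsing can be pictured as below; this sketch assumes one record per line and a header on the first line:
```python
def parse_txt(raw_bytes, delimiter="\t", encoding="UTF-8"):
    lines = raw_bytes.decode(encoding).splitlines()
    header = lines[0].split(delimiter)
    return [dict(zip(header, line.split(delimiter))) for line in lines[1:]]
```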
Use Cases
Bulk Data Import from S3
Import large CSV files from S3 into Objects:
```json
{
  "connectionId": "s3-connection-uuid",
  "filePath": "imports/products.csv",
  "fileType": "CSV",
  "mobjectId": "product-mobject-uuid",
  "batchSize": 5000,
  "skipRows": 1,
  "delimiter": ",",
  "hasHeader": true
}
```
Data Transformation with Field Mappings
Transform and map fields during import:
```json
{
  "connectionId": "s3-connection-uuid",
  "filePath": "data/customers.json",
  "fileType": "JSON",
  "mobjectId": "customer-mobject-uuid",
  "batchSize": 1000,
  "fieldMappings": {
    "first_name": "firstName",
    "last_name": "lastName",
    "email_address": "email"
  }
}
```
Workflow Data Processing
Process S3 files in workflows:
Start → S3DataLoad → TransformData → InsertMData → End
[Image placeholder: Workflow example]
Query-Only File Reading
Read and process S3 files without storage:
```json
{
  "connectionId": "s3-connection-uuid",
  "filePath": "reports/summary.csv",
  "fileType": "CSV",
  "batchSize": 1000,
  "maxRecords": 1000
}
```
Notes
- S3 Connection: Requires a configured S3 connection with proper credentials and bucket access
- File Path: S3 key/path must be valid and accessible with the provided connection credentials
- Batch Processing: Large files are processed in batches to optimize memory usage
- Type Detection: Automatic type detection for numbers, booleans, dates, and strings
- Field Mappings: Use field mappings to transform source field names to target Object field names
- Error Handling: Records that fail validation are skipped while processing continues (see the sketch after these notes)
- Security: File path validation prevents directory traversal attacks
- Encoding: Supports multiple character encodings for international file support
- Date Parsing: Custom date patterns can be specified for date field parsing
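To make the error-handling behavior concrete, here is a hedged sketch of consuming the output schema above; run_function is a hypothetical stand-in for however your platform invokes S3DataLoadFunc, not a documented API:
```python
result = run_function("S3DataLoadFunc", {  # run_function is hypothetical
    "connectionId": "s3-connection-uuid",
    "filePath": "imports/products.csv",
    "fileType": "CSV",
    "mobjectId": "product-mobject-uuid",
})
if not result["success"]:
    raise RuntimeError(f"S3 load failed: {result['errors']}")
for warning in result["warnings"]:
    print("warning:", warning)
if result["skippedRecords"]:
    print(f"{result['skippedRecords']} of {result['totalRecords']} records skipped")
```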
Related Functions
- Load Data from Database - Execute SQL queries against database connections and load results into Objects
- Parse CSV - Parse CSV files with automatic field resolution
- Insert MData - Create new MData records