
Load Data from S3

Load data from Amazon S3 files and optionally store it in Objects with batch processing. This function provides secure S3 connectivity, multi-format file support (CSV, JSON, TXT), configurable parsing options, and automatic type detection.

Technical Name: S3DataLoadFunc

Properties

  • Execution Mode: SYNC
  • Type: NATIVE
  • Category: System Functions
  • Function ID: 17930933-4345-4ec6-bc43-fbbaf26f463c

Input Schema

Required Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| connectionId | string (UUID) | UUID of the S3 connection to use for data loading |
| filePath | string | S3 key/path of the file to process (max 1024 characters) |
| fileType | string | Type of file to process: CSV, JSON, or TXT |
| batchSize | integer | Number of records to process in each batch (1-10000; default: 1000) |
| skipRows | integer | Number of rows to skip from the beginning of the file (default: 0) |
| delimiter | string | Delimiter character for CSV files (max 5 characters; default: ",") |

Optional Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| mobjectId | string (UUID) | UUID of the Object where data will be stored; omit for query-only execution |
| hasHeader | boolean | Whether the file has a header row (default: true) |
| encoding | string | Character encoding of the file: UTF-8, UTF-16, ISO-8859-1, or Windows-1252 (default: UTF-8) |
| datePattern | string | Pattern for parsing date fields (e.g., 'yyyy-MM-dd'; max 50 characters) |
| maxRecords | integer | Maximum number of records to process (0 means no limit; default: 0) |
| fieldMappings | object | Optional mappings from source field names (keys) to target MData field names (values), used to transform source data fields during import |

[Image placeholder: S3DataLoad function configuration panel]

Input Example

Load CSV File to Object

{
  "connectionId": "123e4567-e89b-12d3-a456-426614174000",
  "filePath": "data/sales/2024/sales_data.csv",
  "fileType": "CSV",
  "mobjectId": "sales-mobject-uuid",
  "batchSize": 1000,
  "skipRows": 1,
  "delimiter": ",",
  "hasHeader": true,
  "encoding": "UTF-8",
  "maxRecords": 0
}

Load JSON File with Field Mappings

{
  "connectionId": "s3-connection-uuid",
  "filePath": "exports/customers.json",
  "fileType": "JSON",
  "mobjectId": "customer-mobject-uuid",
  "batchSize": 500,
  "skipRows": 0,
  "fieldMappings": {
    "customer_name": "name",
    "customer_email": "email",
    "customer_phone": "phone"
  },
  "encoding": "UTF-8"
}

Query-Only Execution (No Storage)

{
  "connectionId": "s3-connection-uuid",
  "filePath": "reports/data.csv",
  "fileType": "CSV",
  "batchSize": 1000,
  "skipRows": 0,
  "delimiter": ",",
  "hasHeader": true,
  "maxRecords": 100
}

Output Schema

| Field | Type | Description |
| --- | --- | --- |
| success | boolean | Whether the operation completed successfully |
| totalRecords | integer | Total number of records found in the file |
| processedRecords | integer | Number of records that were processed |
| createdRecords | integer | Number of MData records successfully created |
| skippedRecords | integer | Number of records skipped due to validation errors |
| errors | array (string) | List of errors encountered during processing |
| warnings | array (string) | List of warnings encountered during processing |
| fileInfo | object | Information about the processed file (fileName, fileSize, lastModified) |
| data | array (object) | Raw data records (returned only when mobjectId is not provided) |
| executionTime | integer | Execution time in milliseconds |
| connectionId | string (UUID) | The S3 connection ID that was used |
| mObjectId | string (UUID) | The Object ID where data was stored (if provided) |

Output Example

Success Response with Data Storage

{
  "success": true,
  "totalRecords": 5000,
  "processedRecords": 5000,
  "createdRecords": 4950,
  "skippedRecords": 50,
  "errors": [],
  "warnings": [
    "50 records skipped due to validation errors"
  ],
  "fileInfo": {
    "fileName": "sales_data.csv",
    "fileSize": 1048576,
    "lastModified": "2024-01-15T10:30:00"
  },
  "executionTime": 2500,
  "connectionId": "s3-connection-uuid",
  "mObjectId": "sales-mobject-uuid"
}

Query-Only Response

{
  "success": true,
  "totalRecords": 100,
  "processedRecords": 100,
  "createdRecords": 0,
  "skippedRecords": 0,
  "errors": [],
  "warnings": [],
  "fileInfo": {
    "fileName": "data.csv",
    "fileSize": 51200,
    "lastModified": "2024-01-15T10:30:00"
  },
  "data": [
    {
      "column1": "value1",
      "column2": "value2"
    }
  ],
  "executionTime": 500,
  "connectionId": "s3-connection-uuid"
}

[Image placeholder: S3DataLoad output visualization]

How It Works

The function performs the following steps:

  1. Validate Input: Validates file path, format, parsing parameters, and S3 connection
  2. Establish S3 Connection: Connects to S3 using the provided connection ID and validates file accessibility
  3. Parse File: Parses file content based on file type:
    • CSV: Custom delimiters, headers, row skipping, automatic type detection
    • JSON: Array and object parsing with max record limits
    • TXT: Text file parsing with delimiter support
  4. Transform Data: Applies field mappings if provided and transforms data types
  5. Process Data:
    • If mobjectId is provided: Creates MData records in batches
    • If mobjectId is not provided: Returns raw data without storage
  6. Return Result: Returns processing statistics, file metadata, and optional data
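
The end-to-end flow can be pictured with a short Python sketch. This is an illustration of the steps above, not the platform's implementation: boto3 stands in for the configured S3 connection, and store_batch is a hypothetical callback standing in for MData record creation.

import csv
import io
import json

import boto3  # stands in for the platform's managed S3 connection (assumption)

def load_s3_data(bucket, key, file_type, batch_size=1000, skip_rows=0,
                 delimiter=",", has_header=True, field_mappings=None,
                 store_batch=None):
    # Steps 1-2: validate input and fetch the object from S3.
    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    text = body.decode("utf-8")

    # Step 3: parse file content based on file type.
    if file_type == "CSV":
        rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))[skip_rows:]
        header = rows.pop(0) if has_header else [f"column{i + 1}" for i in range(len(rows[0]))]
        records = [dict(zip(header, row)) for row in rows]
    elif file_type == "JSON":
        parsed = json.loads(text)
        records = parsed if isinstance(parsed, list) else [parsed]
    else:  # TXT: one record per line
        records = [{"line": line} for line in text.splitlines()[skip_rows:]]

    # Step 4: apply field mappings (source name -> target name) if provided.
    if field_mappings:
        records = [{field_mappings.get(k, k): v for k, v in r.items()}
                   for r in records]

    # Step 5: create records in batches when a target Object is given,
    # otherwise return the raw data (query-only execution).
    if store_batch is None:
        return records
    for start in range(0, len(records), batch_size):
        store_batch(records[start:start + batch_size])
    return {"processedRecords": len(records)}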

Supported File Formats

CSV Files

  • Custom delimiters (comma, semicolon, tab, etc.)
  • Header row detection
  • Row skipping
  • Automatic type detection (numbers, booleans, dates, strings)
  • Custom date pattern parsing
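
A rough picture of the detection order (booleans, then numbers, then dates, then strings) in Python; this is a sketch of the idea, not the platform's exact rules. Note that the documented datePattern uses Java-style tokens such as 'yyyy-MM-dd', while Python's strptime equivalent is '%Y-%m-%d'.

from datetime import datetime

def detect_type(value, date_pattern="%Y-%m-%d"):
    # Booleans first, then integers and floats, then dates; fall back to string.
    if value.lower() in ("true", "false"):
        return value.lower() == "true"
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    try:
        return datetime.strptime(value, date_pattern)
    except ValueError:
        return value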

JSON Files

  • JSON array parsing (array of objects)
  • Single JSON object parsing
  • Max record limits
  • Nested object support
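
Conceptually, both shapes normalize to a list of records, with maxRecords applied afterwards (0 meaning no limit); a minimal sketch:

import json

def parse_json_records(text, max_records=0):
    parsed = json.loads(text)
    records = parsed if isinstance(parsed, list) else [parsed]
    return records if max_records == 0 else records[:max_records]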

TXT Files

  • Delimiter-based parsing
  • Encoding support
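
A delimiter-and-encoding sketch of TXT parsing; the codec names below are the documented encoding values, which Python's codec lookup also accepts:

def parse_txt(raw_bytes, encoding="UTF-8", delimiter="\t", skip_rows=0):
    # Decode with the requested encoding, then split each line on the delimiter.
    lines = raw_bytes.decode(encoding).splitlines()[skip_rows:]
    return [line.split(delimiter) for line in lines]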

Use Cases

Bulk Data Import from S3

Import large CSV files from S3 into Objects:

{
  "connectionId": "s3-connection-uuid",
  "filePath": "imports/products.csv",
  "fileType": "CSV",
  "mobjectId": "product-mobject-uuid",
  "batchSize": 5000,
  "skipRows": 1,
  "delimiter": ",",
  "hasHeader": true
}

Data Transformation with Field Mappings

Transform and map fields during import:

{
  "connectionId": "s3-connection-uuid",
  "filePath": "data/customers.json",
  "fileType": "JSON",
  "mobjectId": "customer-mobject-uuid",
  "batchSize": 1000,
  "fieldMappings": {
    "first_name": "firstName",
    "last_name": "lastName",
    "email_address": "email"
  }
}
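
Applied to a record, these mappings simply rename keys before storage; unmapped fields are assumed here to pass through unchanged (an assumption, not documented behavior):

def apply_mappings(record, field_mappings):
    # Rename keys per the mapping; keep unmapped keys as-is (assumed behavior).
    return {field_mappings.get(key, key): value for key, value in record.items()}

# {"first_name": "Ada", "email_address": "ada@example.com"}
#   -> {"firstName": "Ada", "email": "ada@example.com"}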

Workflow Data Processing

Process S3 files in workflows:

Start → S3DataLoad → TransformData → InsertMData → End

[Image placeholder: Workflow example]

Query-Only File Reading

Read and process S3 files without storage:

{
  "connectionId": "s3-connection-uuid",
  "filePath": "reports/summary.csv",
  "fileType": "CSV",
  "batchSize": 1000,
  "maxRecords": 1000
}

Notes

  • S3 Connection: Requires a configured S3 connection with proper credentials and bucket access
  • File Path: S3 key/path must be valid and accessible with the provided connection credentials
  • Batch Processing: Large files are processed in batches to optimize memory usage
  • Type Detection: Automatic type detection for numbers, booleans, dates, and strings
  • Field Mappings: Use field mappings to transform source field names to target Object field names
  • Error Handling: Validation errors skip individual records but continue processing
  • Security: File path validation prevents directory traversal attacks
  • Encoding: Supports multiple character encodings for international file support
  • Date Parsing: Custom date patterns can be specified for date field parsing
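
The traversal check mentioned under Security can be approximated like this (an illustrative check only; the platform's actual validation is not documented here):

def is_safe_key(key, max_len=1024):
    # Reject overlong keys, absolute paths, and ".." traversal segments.
    return len(key) <= max_len and not key.startswith("/") and ".." not in key.split("/")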