Batch Framework
New in version MinIO: RELEASE.2022-10-08T20-11-00Z
The Batch Framework was introduced with the replicate job type in mc RELEASE.2022-10-08T20-11-00Z.
Overview
The MinIO Batch Framework allows you to create, manage, monitor, and execute jobs using a YAML-formatted job definition file (a “batch file”). Batch jobs run directly on the MinIO deployment, so they use server-side processing power and are not constrained by the local machine where you run the MinIO Client.
A batch file defines one job task.
Once you start a job, MinIO begins processing it. Time to completion depends on the resources available to the deployment.
If any portion of the job fails, MinIO retries the job up to the number of times defined in the job definition.
The MinIO Batch Framework supports the following job types:
Job Type | Description |
---|---|
replicate | Perform a one-time replication procedure from one MinIO location to another MinIO location. |
keyrotate | Perform a one-time process to cycle the sse-s3 or sse-kms cryptographic keys on objects. |
MinIO Batch CLI
Install the MinIO Client
Define an alias for the MinIO deployment
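The mc batch commands include:
Command | Description |
---|---|
mc batch generate | Creates a basic YAML-formatted template file for the specified job type. |
mc batch start | Launches a batch job from a job definition file. |
mc batch list | Lists the batch jobs currently in progress on a deployment. |
mc batch status | Outputs a real-time summary of job events for an active job. |
mc batch describe | Outputs the job definition for a specified job ID. |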
Access to mc batch
You can use MinIO’s Policy Based Access Control and the administrative policy actions to restrict who can start a batch job, retrieve a list of running jobs, or describe a running job.
Job Types
Replicate
Use the replicate job type to create a batch job that replicates objects from one MinIO deployment to another MinIO target.
New in version MinIO: Server RELEASE.2023-05-04T21-44-30Z
replicate batch jobs also support mc mirror-like behavior when presented with an S3-compatible source or target.
At least one of the deployment locations, either the source or the target, must be local.
The definition file can limit the replication by bucket, prefix, and/or filters to only replicate certain objects.
Changed in version MinIO: Server RELEASE.2023-04-07T05-28-58Z
You can replicate from a remote MinIO deployment to the local deployment that runs the batch job.
For example, you can use a batch job to perform a one-time replication sync to push objects from a bucket on a local deployment at minio-local/invoices/ to a bucket on a remote deployment at minio-remote/invoices.
You can also pull objects from the remote deployment at minio-remote/invoices to the local deployment at minio-local/invoices.
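As a rough sketch of the push case, a job definition might look like the following. The endpoint URL and the credential values are placeholders, not real values. The sketch also assumes that the local side of the job can leave out endpoint and credentials.

```yaml
replicate:
  apiVersion: v1
  # Local source: the deployment that runs the batch job.
  # Assumption: endpoint and credentials can be omitted for the local side.
  source:
    type: minio
    bucket: invoices
  # Remote target: the endpoint and credentials below are placeholders.
  target:
    type: minio
    bucket: invoices
    endpoint: "https://minio-remote.example.net:9000"
    credentials:
      accessKey: REMOTE-ACCESS-KEY
      secretKey: REMOTE-SECRET-KEY
```

For the pull case, the endpoint and credentials would move from target to source, so that the remote deployment becomes the source and the local deployment becomes the target.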
The advantages of Batch Replication over mc mirror include:
Removes the client-to-cluster network as a potential bottleneck
A user needs only permission to start a batch job and no other permissions, because the job runs entirely server-side on the cluster
The job provides retry attempts in the event that objects do not replicate
Batch jobs are one-time, curated processes that allow fine control over replication
(MinIO to MinIO only) The replication process copies object versions from source to target
Changed in version MinIO: Server RELEASE.2023-02-17T17-52-43Z
Run batch replication with multiple workers in parallel by specifying the MINIO_BATCH_REPLICATION_WORKERS environment variable.
Sample YAML Description File for a replicate Job Type
Create a basic replicate job definition file you can edit with mc batch generate.
    replicate:
      apiVersion: v1
      # source of the objects to be replicated
      source:
        type: TYPE # valid values are "s3" or "minio"
        bucket: BUCKET
        prefix: PREFIX
        # endpoint: ENDPOINT
        # credentials:
        #   accessKey: ACCESS-KEY
        #   secretKey: SECRET-KEY
        #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used

      # target where the objects must be replicated
      target:
        type: TYPE # valid values are "s3" or "minio"
        bucket: BUCKET
        prefix: PREFIX
        # endpoint: ENDPOINT
        # credentials:
        #   accessKey: ACCESS-KEY
        #   secretKey: SECRET-KEY
        #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used

      # optional flags based filtering criteria
      # for all source objects
      flags:
        filter:
          newerThan: "7d" # match objects newer than this value (e.g. 7d10h31s)
          olderThan: "7d" # match objects older than this value (e.g. 7d10h31s)
          createdAfter: "date" # match objects created after "date"
          createdBefore: "date" # match objects created before "date"

          # tags:
          #   - key: "name"
          #     value: "pick*" # match objects with tag 'name', with all values starting with 'pick'

          ## NOTE: metadata filter not supported when "source" is non MinIO.
          # metadata:
          #   - key: "content-type"
          #     value: "image/*" # match objects with 'content-type', with all values starting with 'image/'

        notify:
          endpoint: "https://notify.endpoint" # notification endpoint to receive job status events
          token: "Bearer xxxxx" # optional authentication token for the notification endpoint

        retry:
          attempts: 10 # number of retries for the job before giving up
          delay: "500ms" # least amount of delay between each retry
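To illustrate how a bucket, a prefix, and filters can narrow a job, the following sketch pulls only recently written, tagged objects from a remote deployment. The bucket name, prefix, tag key and value, endpoint, and credentials are all hypothetical placeholders. The sketch assumes that retry nests under flags, as in the template that mc batch generate produces.

```yaml
replicate:
  apiVersion: v1
  # Remote source: the endpoint and credentials are placeholders.
  source:
    type: minio
    bucket: invoices
    prefix: "2023/"            # hypothetical prefix: only objects under 2023/
    endpoint: "https://minio-remote.example.net:9000"
    credentials:
      accessKey: REMOTE-ACCESS-KEY
      secretKey: REMOTE-SECRET-KEY
  # Local target: the deployment that runs the batch job.
  target:
    type: minio
    bucket: invoices
  flags:
    filter:
      newerThan: "7d"          # only objects newer than seven days
      tags:
        - key: "status"        # hypothetical tag key
          value: "final*"      # matches tag values that start with 'final'
    retry:
      attempts: 3              # give up after three retries
      delay: "1s"              # wait at least one second between retries
```

Because the source here is a MinIO deployment, a metadata filter could be added as well. The template notes that metadata filters are not supported when the source is not MinIO.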
Key Rotate
New in version MinIO: RELEASE.2023-04-07T05-28-58Z
Use the keyrotate job type to create a batch job that cycles the sse-s3 or sse-kms keys for encrypted objects.
The YAML configuration supports filters to restrict key rotation to a specific set of objects by creation date, tags, metadata, or KMS key. You can also define retry attempts or set a notification endpoint and token.
Sample YAML Description File for a keyrotate Job Type
Create a basic keyrotate job definition file you can edit with mc batch generate.
    keyrotate:
      apiVersion: v1
      bucket: bucket
      prefix:
      encryption:
        type: sse-kms # valid values are sse-s3 and sse-kms

        # The following encryption values only apply for sse-kms type.
        # For sse-s3 key types, MinIO uses the key provided by the MINIO_KMS_KES_KEY_FILE environment variable.
        # The following two values are ignored if type is set to sse-s3.
        key: my-new-keys2 # valid only for sse-kms
        context: <new-kms-key-context> # valid only for sse-kms

      # optional flags based filtering criteria
      flags:
        filter:
          newerThan: "84h" # match objects newer than this value (e.g. 7d10h31s)
          olderThan: "80h" # match objects older than this value (e.g. 7d10h31s)
          createdAfter: "2023-03-02T15:04:05Z07:00" # match objects created after "date"
          createdBefore: "2023-03-02T15:04:05Z07:00" # match objects created before "date"
          tags:
            - key: "name"
              value: "pick*" # match objects with tag 'name', with all values starting with 'pick'
          metadata:
            - key: "content-type"
              value: "image/*" # match objects with 'content-type', with all values starting with 'image/'
          kmskey: "key-id" # match objects with KMS key-id (applicable only for sse-kms)

        # optional entries to add notifications for the job
        notify:
          endpoint: "https://notify.endpoint" # notification endpoint to receive job status events
          token: "Bearer xxxxx" # optional authentication token for the notification endpoint

        # optional entries to add retry attempts if the job is interrupted
        retry:
          attempts: 10 # number of retries for the job before giving up
          delay: "500ms" # least amount of delay between each retry
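For comparison, a pared-down sse-s3 rotation might look like the sketch below. The bucket name, prefix, and filter value are hypothetical. The key and context fields are left out because the template states that they are ignored for sse-s3. The sketch assumes that retry nests under flags, as in the template that mc batch generate produces.

```yaml
keyrotate:
  apiVersion: v1
  bucket: invoices             # hypothetical bucket name
  prefix: "archive/"           # hypothetical prefix
  encryption:
    type: sse-s3               # key and context omitted: ignored for sse-s3
  flags:
    filter:
      olderThan: "30d"         # rotate keys only on objects older than 30 days
    retry:
      attempts: 3              # give up after three retries
      delay: "1s"              # wait at least one second between retries
```

Keeping the filter narrow limits how many objects a single job touches, which may make it easier to monitor the job and to retry it if it is interrupted.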