Custom resources
SaunaFS Operator introduces custom resources: SaunafsCluster, SaunafsMetadataVolume, SaunafsChunkVolume and SaunafsExport.
All these resources are namespaced. A single SaunaFS cluster can consist only of a SaunafsCluster, SaunafsMetadataVolumes, SaunafsChunkVolumes and persistent volume claims from the same namespace.
DANGER
SaunaFS Operator doesn't verify that persistent volume claims are empty before use, nor does it empty them when their Saunafs...Volumes are removed. Be careful when adding new volumes, especially for metadata - you might lose your data if the volume you introduce already contains metadata from another cluster.
SaunafsCluster
This resource represents a single SaunaFS Cluster - you can deploy more than one SaunaFS cluster in a single Kubernetes cluster.
Example manifest:
apiVersion: saunafs.sarkan.io/v1beta1
kind: SaunafsCluster
metadata:
  name: example-cluster
  namespace: saunafs-operator
spec:
  # Decides whether the cluster should be exposed externally (i.e. outside the Kubernetes cluster).
  #
  # If true, then
  # - metadata servers will be exposed using a LoadBalancer service,
  # - chunkservers will be exposed using host ports of Kubernetes nodes.
  exposeExternally: true
  # List of replication goals possible for this cluster. Up to 40 goals can be configured.
  replicationGoal:
    - name: "1"
      replication: "_" # One copy of each file or directory chunk across chunkservers.
    - name: "2"
      replication: "_ _" # Two copies anywhere.
    - name: "3"
      replication: "_ _ _"
    - name: "4"
      replication: "_ _ _ _"
    - name: "5"
      replication: "_ _ _ _ _"
    - name: ec21
      replication: $ec(2,1) # Erasure coding with 2 data and 1 parity parts on all servers.
    - name: ec31
      replication: $ec(3,1)
    - name: ec32
      replication: $ec(3,2)
  # Optional PVC selectors for metadata and chunk storage. When present, the operator will
  # watch for PVCs with these labels and create the corresponding objects automatically.
  pvcSelectors:
    # Create a SaunafsMetadataVolume object automatically for each PVC with this label.
    # Required if the PV selector for metadata storage is specified.
    metadataStorage: example-cluster=metadata
    # Create a SaunafsChunkVolume object automatically for each PVC with this label.
    # Required if the PV selector for chunk storage is specified.
    chunkStorage: example-cluster=chunks
  # Optional PV selectors for metadata and chunk storage. When present, the operator will
  # watch for PVs with these labels and create corresponding PVCs automatically.
  pvSelectors:
    # The created PVCs will automatically be assigned the PVC metadata storage label.
    metadataStorage: example-cluster=metadata
    # The created PVCs will automatically be assigned the PVC chunk storage label.
    chunkStorage: example-cluster=chunks
  # Optional container image overrides.
  images:
    metadataServer: ""
    chunkServer: ""
    elector: ""
  # Optional desired minimum count of chunkservers in a single cluster.
  desiredMinimumChunkserverCount: 5
  # Optional resource requests and limits for SaunaFS components.
  # If not specified, containers will be created without CPU/memory constraints.
  #
  # The resource settings below are designed for workloads of roughly 1 million files
  # and provide enough CPU and memory headroom to handle moderate traffic.
  # For large-scale deployments, consider increasing the
  # master memory limits accordingly.
  #
  # These settings should be adjusted based on the target environment,
  # file count, and expected workload characteristics.
  #
  # Estimated memory consumption for metadata is approximately 500 bytes per stored file.
  # This translates to:
  # - 1 million files ~= 500 MiB
  # - 100 million files ~= 50 GiB
  # - 1 billion files ~= 500 GiB
  resources:
    metadataServer:
      requests:
        cpu: "250m"
        memory: "512Mi"
      limits:
        cpu: "1"
        memory: "2Gi"
    chunkserver:
      requests:
        cpu: "250m"
        memory: "512Mi"
      limits:
        cpu: "2"
        memory: "4Gi"
  # SaunaFS metadata server configurable options.
  metadataConfiguration:
    # Limit glibc malloc arenas to a specific value to reduce virtual memory usage (Linux only). (default is 0)
    limitGlibcMallocArenas: 4
    # Prefer the local chunkserver for local chunk data when available. (default is `true`)
    preferLocalChunkserver: true
    # Keep the specified number of previous metadata files. (default is 1)
    backMetaKeepPrevious: 1
    # Enable automatic recovery of metadata after crashes. (default is `false`)
    autoRecovery: true
    # Set the initial delay in seconds before starting chunk operations. (default is 300)
    operationsDelayInit: 300
    # Set the delay in seconds after a chunkserver disconnection before chunk operations resume. (default is 3600)
    operationsDelayDisconnect: 300
    # Limit the chunks loop to checking no more chunks per second than specified. (default is 100000)
    chunksLoopMaxCps: 100000
    # Ensure the chunks loop checks all chunks within the specified time. (default is 300)
    chunksLoopMinTime: 300
    # Set a hard limit on CPU usage for the chunks loop. (percentage value, default is 60%)
    chunksLoopMaxCPU: 60
    # Define a soft maximum number of chunks to delete on one chunkserver. (default is 10)
    chunksSoftDelLimit: 10
    # Define a hard maximum number of chunks to delete on one chunkserver. (default is 25)
    chunksHardDelLimit: 25
    # Set the maximum number of chunks to replicate to one chunkserver. (default is 2)
    chunksWriteRepLimit: 2
    # Set the maximum number of chunks to replicate from one chunkserver. (default is 10)
    chunksReadRepLimit: 10
    # Set the percentage of endangered chunks to replicate with high priority. (percentage value, default is 0%)
    endangeredChunksPriority: 0
    # Define the maximum capacity of the endangered chunks queue. (default is "1Mi")
    endangeredChunksMaxCapacity: "1Mi"
    # Set the allowable disk usage difference before triggering rebalancing. (percentage value, default is 10%)
    acceptableDifference: 10
    # Allow chunk movement between servers with different labels for balancing. (default is `false`)
    chunksRebalancingBetweenLabels: false
    # Reject clients older than version 1.6.0. (default is `false`)
    rejectOldClients: true
    # Specify the period for bandwidth allocation renegotiation. (in milliseconds, default is 100ms)
    globalIoLimitsRenegotiationPeriodMs: 100
    # Allow data flow after inactivity without waiting, up to the specified number of milliseconds. (default is 10ms)
    globalIoLimitsAccumulateMs: 250
    # Set the frequency of sending metadata checksums to backups. (default is every 50 metadata updates)
    metadataChecksumInterval: 50
    # Define the speed of recalculating metadata checksums in the background. (default is 100 objects per function call)
    metadataChecksumRecalculationSpeed: 100
    # Disable checksum verification while applying the changelog. (default is `false`)
    disableMetadataChecksumVerification: false
    # Prevent inode access time updates on each access. (default is `true`)
    noAtime: true
    # Set the minimum time between metadata dump requests from shadow masters. (in seconds, default is 1800)
    metadataSaveRequestMinPeriod: 1800
    # Retain client session data on the master server for the specified time. Values between 60 and 604800 (one week) are accepted. (in seconds, default is 86400s, i.e. 24 hours)
    sessionSustainTime: 86400
    # Avoid selecting chunkservers with the same IP. (default is `false`)
    avoidSameIpChunkservers: true
    # Specify the number of redundant chunk parts that can be lost before a chunk is considered endangered. (default is 0)
    redundancyLevel: 0
    # Define the number of snapshotted nodes to clone before batch execution. (default is 1000)
    snapshotInitialBatchSize: 1000
    # Set the maximum batch size for snapshot requests. (default is 10000)
    snapshotInitialBatchSizeLimit: 10000
    # Ensure the test files loop checks all files within the specified time. (in seconds, default is 3600s, i.e. 1 hour)
    fileTestLoopMinTime: 3600
    # Set the delay before attempting reconnection to the metadata server. (in seconds, default is 1s)
    masterReconnectionDelay: 1
    # Define the timeout for metadata server connections. (in seconds, default is 60s)
    masterTimeout: 10
    # Add a disk usage load penalty to reduce frequent selections of heavily loaded chunkservers. Values between 0% and 50% are accepted. (percentage value, default is 0%)
    loadFactorPenalty: 0
    # Prioritize data parts to chunkservers with more space, clustering parities on imbalance. (default is `true`)
    prioritizeDataParts: true
    # Set the maximum polling wait time for events, balancing latency and CPU usage. (in milliseconds, default is 50)
    pollTimeoutMs: 50
    # Whether to perform mlockall() to avoid swapping out the sfsmaster process. (default is `false`)
    lockMemory: false
    # Interval for periodic cleaning of reserved files, in milliseconds. (default is 0, i.e. reserved files deletion is disabled)
    emptyReservedFilesPeriodMs: 0
    # Set the log level. Valid levels: "trace", "debug", "info", "warning", "error", "critical", "off". (default is "trace")
    logLevel: "trace"
  # SaunaFS chunkserver configurable options.
  chunksConfiguration:
    # Call fsync() after a chunk is modified. (default is `true`)
    performFsync: true
    # Set the number of threads that handle connections with clients. (default is 4)
    nrOfNetworkWorkers: 4
    # Set the number of threads that the connection to the master may use to process operations on chunks. (default is 10, minimum is 2)
    masterNrOfWorkers: 10
    # Determine whether to remove each chunk from the page cache when closing it. (default is `true`)
    hddAdviseNoCache: true
    # Set the log level. Valid levels: "trace", "debug", "info", "warning", "error", "critical", "off". (default is "trace")
    logLevel: "trace"
    # Limit glibc malloc arenas to a specific value to reduce virtual memory usage (Linux only). (default is 0)
    limitGlibcMallocArenas: 4
    # Whether to perform mlockall() to avoid swapping out the sfschunkserver process. (default is `false`)
    lockMemory: false
    # Set the free space threshold at which a volume is marked as 100% utilized. (default is "4GiB")
    hddLeaveSpaceDefault: "4GiB"
    # Enable CRC checking when reading data from disk. (default is `true`)
    hddCheckCrcWhenReading: true
    # Enable CRC checking when writing data to disk. (default is `true`)
    hddCheckCrcWhenWriting: true
    # Enable the chunkserver to detect zero values in chunk data and free the corresponding file blocks. (default is `false`)
    hddPunchHoles: false
    # Enable the chunkserver to send periodic reports of its I/O load to the master. (default is `false`)
    enableLoadFactor: false
    # Set the number of threads that each network worker may use for disk operations. (default is 4)
    nrOfHddWorkersPerNetworkWorker: 4
    # Set the maximum number of jobs that each network worker may use for disk operations. (default is 1000)
    bgJobsCntPerNetworkWorker: 1000
    # Verify that chunk metadata and data parts exist during a disk scan. (default is `true`)
    statChunksAtDiskScan: true
    # Maximum amount of time in milliseconds that the polling operation will wait for events. Smaller values can reduce latency at the cost of CPU usage. (default is 50)
    pollTimeoutMs: 50
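With the pvcSelectors above, the operator picks up matching PVCs and creates the corresponding Saunafs...Volume objects for them. Below is a minimal sketch of a claim the metadata selector would match, assuming the selector string example-cluster=metadata denotes the label key example-cluster with the value metadata; the claim name, access mode and size are illustrative:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-1
  # Must be in the same namespace as the SaunafsCluster.
  namespace: saunafs-operator
  labels:
    # Matches spec.pvcSelectors.metadataStorage above (assumed key=value format).
    example-cluster: metadata
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi # Illustrative size; size metadata volumes to your expected file count.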
SaunafsMetadataVolume
This resource represents a single PVC for metadata storage. An instance of this resource belongs to a SaunafsCluster instance.
Example manifest:
apiVersion: saunafs.sarkan.io/v1beta1
kind: SaunafsMetadataVolume
metadata:
  name: example-metadata-volume
  namespace: saunafs-operator
spec:
  # Name of the SaunaFS cluster this metadata volume belongs to.
  clusterName: example-cluster
  # Name of the persistent volume claim to use.
  persistentVolumeClaimName: pvc-1
TIP
A SaunaFS Metadata Volume must be in the same namespace as the SaunaFS Cluster it belongs to.
SaunafsChunkVolume
This resource represents a single PVC for chunk storage. An instance of this resource belongs to a SaunafsCluster instance.
Example manifest:
apiVersion: saunafs.sarkan.io/v1beta1
kind: SaunafsChunkVolume
metadata:
  name: example-chunks-volume
  namespace: saunafs-operator
spec:
  # Name of the SaunaFS cluster this chunk volume belongs to.
  clusterName: example-cluster
  # Name of the persistent volume claim to use.
  persistentVolumeClaimName: pvc-2
TIP
A SaunaFS Chunk Volume must be in the same namespace as the SaunaFS Cluster it belongs to.
SaunafsExport
This resource represents a SaunaFS export. An instance of this resource belongs to a SaunafsCluster instance. It serves as access control for sfsmounts.
Example manifest:
apiVersion: saunafs.sarkan.io/v1beta1
kind: SaunafsExport
metadata:
  name: example-saunafs-export
  namespace: saunafs-operator
spec:
  # Name of the SaunaFS cluster this export belongs to.
  clusterName: saunafs-cluster
  # Path to be exported, relative to your SaunaFS root.
  path: "/my-password-protected-export"
  # Comma-separated list of export options; refer to the SaunaFS documentation for the list of possible options. Defaults to 'readonly'.
  options: "rw"
  # Kubernetes secret with the password that should protect the export. The secret must contain a field with the key 'saunafs-export-password' and must be in the same namespace as the SaunaFS Cluster.
  exportSecretName: saunafs-export
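The secret referenced by exportSecretName must carry the password under the saunafs-export-password key and live in the same namespace as the SaunaFS Cluster. A minimal sketch of such a secret (the password value is a placeholder):
apiVersion: v1
kind: Secret
metadata:
  name: saunafs-export
  namespace: saunafs-operator
type: Opaque
stringData:
  saunafs-export-password: "change-me" # Placeholder; replace with your own password.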