JuiceFS
JuiceFS is an S3-backed distributed file system that uses a local Valkey instance for metadata, giving it higher read/write performance than NFS or the SeaweedFS CSI driver. Max uses it to mount SeaweedFS buckets containing video games and virtual-machine disks, which are too latency-sensitive to work well with other network-backed file systems. Valkey acts as a real database here, so its PVCs need to be backed up or you will lose data.
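To make that concrete: JuiceFS keeps file metadata in Valkey and file data in the S3 bucket, so a mount needs connection details for both. Below is a minimal sketch of the kind of Kubernetes Secret the JuiceFS CSI driver reads, assuming its standard secret keys; the hostnames, bucket, and credentials are placeholders, not the values this app actually deploys.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: juicefs-secret
  namespace: juicefs
type: Opaque
stringData:
  # name of the JuiceFS volume (formatted on first use)
  name: juicefs
  # metadata engine: the local Valkey instance (speaks the redis protocol)
  metaurl: redis://:<valkey-password>@valkey.juicefs.svc.cluster.local:6379/0
  # data store: SeaweedFS exposed through its S3 gateway
  storage: s3
  bucket: http://seaweedfs-s3.seaweedfs.svc.cluster.local:8333/juicefs-data
  access-key: <s3-access-key-id>
  secret-key: <s3-secret-key>
```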
There are a few really nice write-ups by SmartMore outlining the organization's use of SeaweedFS + JuiceFS here:
- https://juicefs.com/en/blog/usage-tips/seaweedfs-tikv
- https://juicefs.com/en/blog/user-stories/ai-training-storage-selection-seaweedfs-juicefs
- https://juicefs.com/en/blog/user-stories/deep-learning-ai-storage#Why-choose-SeaweedFS-for-object-storage
Features:
- POSIX Compatible: JuiceFS can be used like a local file system, making it easy to integrate with existing applications.
- HDFS Compatible: JuiceFS is fully compatible with the HDFS API, which can enhance metadata performance.
- S3 Compatible: JuiceFS provides an S3 gateway to implement an S3-compatible access interface.
- Cloud-Native: It is easy to use JuiceFS in Kubernetes via the CSI Driver (see the example StorageClass and PVC after this list).
- Distributed: Each file system can be mounted on thousands of servers at the same time with high-performance concurrent reads and writes and shared data.
- Strong Consistency: Any changes committed to files are immediately visible on all servers.
- Outstanding Performance: JuiceFS achieves millisecond-level latency and nearly unlimited throughput, depending on the scale of the underlying object storage.
- Data Security: JuiceFS supports encryption in transit and encryption at rest.
- File Lock: JuiceFS supports BSD lock (flock) and POSIX lock (fcntl).
- Data Compression: JuiceFS supports the LZ4 and Zstandard compression algorithms to save storage space.
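To expand on the Cloud-Native bullet above, here is a rough sketch of dynamic provisioning through the JuiceFS CSI driver. It assumes the driver is installed with its default `csi.juicefs.com` provisioner name and that a `juicefs-secret` like the one shown earlier exists; treat it as an illustration rather than the exact manifests this app deploys.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: juicefs-sc
provisioner: csi.juicefs.com
reclaimPolicy: Retain
parameters:
  # secret holding the metaurl, storage, bucket, and credentials
  csi.storage.k8s.io/provisioner-secret-name: juicefs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: juicefs
  csi.storage.k8s.io/node-publish-secret-name: juicefs-secret
  csi.storage.k8s.io/node-publish-secret-namespace: juicefs
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: games-library
spec:
  accessModes:
    - ReadWriteMany   # JuiceFS volumes can be mounted by many pods at once
  storageClassName: juicefs-sc
  resources:
    requests:
      storage: 100Gi
```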
Dependencies:
- JuiceFS assumes you already have an existing S3 provider + bucket to store your files in.
- You'll also need a bucket for the Valkey PVC backups.
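For context on that backup bucket: the example config below schedules restic backups of the Valkey PVCs through k8up. A hedged sketch of the kind of k8up Schedule that corresponds to those settings is shown here; the secret names, endpoint, and bucket are illustrative assumptions, not resources the Argo CD app creates verbatim.

```yaml
apiVersion: k8up.io/v1
kind: Schedule
metadata:
  name: juicefs-valkey-backups
  namespace: juicefs
spec:
  backend:
    # restic repository password (the restic_repo_password in the config below)
    repoPasswordSecretRef:
      name: s3-backups-credentials
      key: resticRepoPassword
    s3:
      endpoint: https://s3.example.com
      bucket: juicefs-valkey-backups
      accessKeyIDSecretRef:
        name: s3-backups-credentials
        key: accessKeyID
      secretAccessKeySecretRef:
        name: s3-backups-credentials
        key: secretAccessKey
  backup:
    # same cron syntax as backups.pvc_schedule in the config below
    schedule: "10 0 * * *"
  prune:
    schedule: "30 2 * * 0"
    retention:
      keepLast: 7
```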
Example Config:
```yaml
juicefs:
  enabled: false
  description: |
    [link=https://github.com/juicedata/juicefs]juicefs[/link] JuiceFS is a distributed POSIX file system built on top of Redis and S3 (but we use Valkey and SeaweedFS).
  init:
    enabled: false
    restore:
      # set to true to run a restic restore via a k8up job for the juicefs valkey pvcs
      enabled: false
      # for Cloudnative Postgres operator "cluster" CRD type resources
      cnpg_restore: true
      # restic snapshot ID for each PVC
      restic_snapshot_ids:
        # juicefs-valkey-primary volume pvc snapshot id
        juicefs_primary: "latest"
        # juicefs-valkey-replica volume pvc snapshot id
        juicefs_replica: "latest"
    values: []
  backups:
    # cronjob syntax schedule to run juicefs pvc backups
    pvc_schedule: 10 0 * * *
    # cronjob syntax (with SECONDS field) for postgres backups
    # must happen at least 10 minutes before pvc backups, to avoid corruption
    # due to missing files. This is because the cnpg backup shows as completed
    # before it actually is, due to the wal archive it lists as its end not
    # being in the backup yet
    postgres_schedule: 0 0 0 * * *
    s3:
      # these are for pushing remote backups of your local s3 storage, for speed and cost optimization
      endpoint: ""
      bucket: ""
      region: ""
      secret_access_key:
        value_from:
          env: JUICEFS_S3_BACKUP_SECRET_KEY
      access_key_id:
        value_from:
          env: JUICEFS_S3_BACKUP_ACCESS_ID
    restic_repo_password:
      value_from:
        env: JUICEFS_RESTIC_REPO_PASSWORD
  argo:
    # secret keys to make available to Argo CD ApplicationSets
    secret_keys:
      valkey_url: valkey.juicefs.svc.cluster.local
      valkey_port: "6379"
      valkey_password:
        value_from:
          env: JUICEFS_VALKEY_PASSWORD
      s3_bucket_url: ""
      s3_key_id:
        value_from:
          env: JUICEFS_S3_KEY_ID
      s3_secret_key:
        value_from:
          env: JUICEFS_S3_SECRET_KEY
      s3_dshboard_url: juicefs.vleermuis.tech
    # git repo to install the Argo CD app from
    repo: https://github.com/small-hack/argocd-apps
    # path in the argo repo to point to. Trailing slash very important!
    path: demo/juicefs/app_of_apps/
    # either the branch or tag to point at in the argo repo above
    revision: finish-juicefs
    # kubernetes cluster to install the k8s app into, defaults to Argo CD default
    cluster: https://kubernetes.default.svc
    # namespace to install the k8s app in
    namespace: juicefs
    # recurse directories in the provided git repo
    # if set to false, we will not deploy the CSI driver
    directory_recursion: true
    # source repos for Argo CD App Project (in addition to argo.repo)
    project:
      name: juicefs
      source_repos:
        - https://juicedata.github.io/charts/
        - oci://registry-1.docker.io/
      destination:
        # automatically includes the app's namespace and argocd's namespace
        namespaces: []
```