1.1. Description

This project uses Quarkus, the Supersonic Subatomic Java Framework, and is compiled with Java 21.

If you want to learn more about Quarkus, please visit its website: https://quarkus.io/.

Clone Cloud Store (CCS) simplifies access to Cloud Storage for the major services: Amazon S3 (and S3-compatible implementations), Azure Blob Storage and Google Cloud Storage.

It provides a simple REST API, much simpler than the usual ones, for Quarkus environments.

One of the goals is to simplify the handling of big InputStream files, without having to store them physically on disk or in memory, neither in the client application nor in the front CCS services.

To achieve this, specific features of Quarkus REST services (client and server) are used, such as the ability to send or receive such an InputStream chunk by chunk, with back-pressure control.
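As a rough illustration of the chunk-by-chunk idea (plain JDK code, not the Quarkus or CCS API), the copy loop below never holds more than one fixed-size chunk in memory; reading the next chunk only after the previous one has been written downstream is what provides the natural back pressure:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Illustrative sketch (not the CCS API): copy an InputStream chunk by chunk
// with a bounded buffer, so the whole payload is never held in memory at once.
public class ChunkedCopy {
  static final int CHUNK_SIZE = 8192; // hypothetical chunk size

  static long copyChunked(InputStream in, OutputStream out) throws IOException {
    byte[] chunk = new byte[CHUNK_SIZE];
    long total = 0;
    int read;
    // Each iteration handles at most CHUNK_SIZE bytes: the reader is
    // back-pressured because the next chunk is not read until the previous
    // one has been written downstream.
    while ((read = in.read(chunk)) != -1) {
      out.write(chunk, 0, read);
      total += read;
    }
    out.flush();
    return total;
  }

  public static void main(String[] args) throws IOException {
    byte[] payload = new byte[100_000];
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    long copied = copyChunked(new ByteArrayInputStream(payload), sink);
    System.out.println(copied); // prints 100000
  }
}
```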

Note

It might be possible to use another HTTP client, but one has to take care of the possible limitations of such an HTTP SDK, such as not sending or receiving the whole InputStream in memory on the client side. Apache HttpClient 5 is compatible.

Clone Cloud Store also allows cloning storage between various Cloud sites, even when they use different technologies (for instance, one site using Amazon S3 and another one using Azure Blob Storage):

  • It can be used on one site only, or on multiple sites (no limitation). When a bucket or object is created/deleted on one site, the change is automatically and asynchronously replicated to the other sites. If an object is missing, due to an outage or a local issue, CCS can try to reach a copy synchronously on one of the remote sites and, if it exists, proceed with its local restoration asynchronously.

  • It provides a reconciliation algorithm which compares all sites in order to restore existing Buckets and Objects everywhere. This process is not blocking, meaning the sites can continue to run as usual.

  • This reconciliation process enables Disaster Recovery without interruption of service during the recovery. Note that new creations/deletions of Buckets or Objects are taken into account during reconciliation.

  • This reconciliation process also enables Cloud migration without interruption of service during the cloning. Again, new creations/deletions of Buckets or Objects are taken into account during reconciliation.
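The comparison at the heart of such a reconciliation can be sketched as pure set logic (a hypothetical simplification, not the CCS algorithm, which also handles statuses and concurrent updates): for each site, the objects it lacks compared to the union of all sites are the ones to restore there.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the core of a reconciliation pass (not the CCS
// implementation): given the object listings of each site, compute, per site,
// which objects must be restored there because another site still has them.
public class ReconcileSketch {
  static Map<String, Set<String>> missingPerSite(Map<String, Set<String>> listings) {
    // Union of all objects seen on any site.
    Set<String> all = new TreeSet<>();
    listings.values().forEach(all::addAll);
    Map<String, Set<String>> missing = new HashMap<>();
    for (var e : listings.entrySet()) {
      Set<String> m = new TreeSet<>(all);
      m.removeAll(e.getValue()); // objects this site lacks
      missing.put(e.getKey(), m);
    }
    return missing;
  }

  public static void main(String[] args) {
    Map<String, Set<String>> listings = Map.of(
        "site-A", Set.of("obj1", "obj2"),
        "site-B", Set.of("obj2", "obj3"));
    // site-A must restore obj3, site-B must restore obj1
    System.out.println(missingPerSite(listings));
  }
}
```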

Clone Cloud Store relies on Quarkus and other technologies:

  • A database to store the status of Buckets and Objects: MongoDB or PostgreSQL

  • A topic/queue system for asynchronous actions: Kafka or Pulsar

  • Optionally, Prometheus to collect observability metrics

  • At least 5 JVM processes (more JVMs can be set up to improve reliability and performance):
    - Accessor (1 or more)
    - Accessor for Replicator (1 or more)
    - Replicator (1 or more)
    - Reconciliator (1)
    - Administration (1)

A simpler implementation, with a single kind of JVM (1 or more instances), is available without database, topic or remote-site support: the Accessor Simple Gateway. It allows testing the solution with your application, or moving smoothly to Clone Cloud Store.

1.1.1. Available functionalities

  • Database: MongoDB

  • Topics: Kafka

  • Common
    - Full support for InputStream within Quarkus (through a patch of Quarkus)
    - Full support of the Database choice between MongoDB and PostgreSQL (by configuration)
    - Metrics available for Prometheus or equivalent

  • Accessor
    - Fully functional
    - Includes remote checking when an object is not present locally (by configuration)
    - Includes remote cloning
    - Includes a Public Client and an Internal Client (Quarkus)
    - Includes a Public Client based on Apache HttpClient 5, without any Quarkus dependency
    - Simple Gateway available, with no Database nor remote access or cloning
    - Includes an optional Buffered Accessor relying on local space (only for an unsteady Storage service)

  • Driver - Support of S3, Azure Blob Storage and Google Cloud Storage

  • Replicator - Fully functional for replication or preemptive remote action

  • Topology - Full support for remote Clone Cloud Store sites

  • Ownership - Support for ownership based on Bucket

  • Quarkus patched client: patch needed until Quarkus validates PR 37308

  • Reconciliator
    - Logic in place but no API yet (so no Disaster Recovery or Cloud Migration yet)
    - Initialization of a CCS site from a remote one or from an existing Driver Storage
    - Missing APIs and configurations
    - Will need an extra API on the Replicator
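To give an idea of what the Driver abstraction over S3, Azure Blob Storage and Google Cloud Storage can look like, here is a hypothetical sketch (not the actual CCS Driver API) of an InputStream-based contract, with a trivial in-memory implementation just to exercise it:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch (not the actual CCS Driver API): the Driver layer hides
// which Storage technology (S3, Azure Blob Storage or Google Cloud Storage)
// backs a site behind one common, InputStream-based contract.
public class DriverSketch {
  interface Driver {
    void bucketCreate(String bucket) throws IOException;
    void objectStore(String bucket, String name, InputStream content) throws IOException;
    InputStream objectRead(String bucket, String name) throws IOException;
  }

  // Trivial in-memory implementation, just to exercise the contract.
  static final class MemoryDriver implements Driver {
    private final Map<String, Map<String, byte[]>> buckets = new HashMap<>();
    public void bucketCreate(String bucket) { buckets.put(bucket, new HashMap<>()); }
    public void objectStore(String bucket, String name, InputStream content) throws IOException {
      buckets.get(bucket).put(name, content.readAllBytes());
    }
    public InputStream objectRead(String bucket, String name) {
      return new ByteArrayInputStream(buckets.get(bucket).get(name));
    }
  }

  public static void main(String[] args) throws IOException {
    Driver driver = new MemoryDriver();
    driver.bucketCreate("demo");
    driver.objectStore("demo", "obj1", new ByteArrayInputStream("hello".getBytes()));
    System.out.println(new String(driver.objectRead("demo", "obj1").readAllBytes()));
    // prints: hello
  }
}
```

A real driver would of course delegate to the vendor SDK, but keeping the contract stream-based is what lets every site be treated uniformly regardless of its Storage technology.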

1.1.2. Version notes

1.1.2.1. 0.8.0 2024/02

  • Fully tested Reconciliation steps

  • Accessor buffered upload to limit the side effects of an unsteady Storage service

  • Accessor Ownership and CRUD rights support

  • Administration Topology and Ownership support

  • Add Apache http client for Accessor Public client (no Quarkus dependency)

  • Refactoring on the server side

  • Prepare import from an existing Driver Storage not previously managed by CCS

  • Compression configurable for internal services

  • Optimize Azure Driver and MongoDB bulk operations

  • Add Metrics on Topics and Driver

  • Fix Digest implementation and Proactive Replication implementation

  • Fix doc and API

  • Clean up Logs

1.1.2.2. 0.7.0 2024/01

  • Support of Simple Gateway Accessor

  • First steps on Reconciliator batch

1.1.2.3. 0.6.0 2023/11

  • Patch of Quarkus to support InputStream on client side (upload and download)

1.1.2.4. 0.5.0 2023/10

  • Refactoring and simplification

  • Support of dynamic choice of Database (MongoDB or PostgreSQL) in Common

1.1.2.5. 0.4.0 2023/09

  • Performance improvements

  • Support of Proactive replication from Accessor

1.1.2.6. 0.3.0 2023/07

  • Adding Topology support to Replicator

  • Support of Public Accessor with remote access

1.1.2.7. 0.2.0 2023/01

  • Replicator support with asynchronous replication

  • Internal Accessor support

  • Support of Kafka

1.1.2.8. 0.1.0 2022/06

  • Public Accessor support

  • Driver for Amazon S3 and S3 like support

  • Support of MongoDB

1.1.3. Status logic

Status for Objects and Buckets

1.1.4. Architecture

Architecture on 1 site

Architecture on multiple sites

1.1.4.1. Zoom when using Buffered Accessor

Architecture on 1 site with Buffered option

Note that the Buffered option shall not be used in general, but only if the final Storage service is unsteady and therefore causes issues while uploading new Objects. This option buffers the object to store locally, on local disks (or through NAS/NFS), and then tries to save this locally backed-up object to the Storage service. If the upload succeeds, the local copy is purged. If not, the object is registered for retries in recurring jobs later on.

This option shall be used with caution due to the risk of filling the local storage, leading to a “not enough space on device” error if the Storage service is down for too long.

This option is also available for the Simple Gateway Accessor.
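The buffer-then-retry flow described above can be sketched as follows (a hypothetical simplification, not the CCS code; the `Storage` interface stands in for the real Storage service):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch of the Buffered Accessor idea (not the CCS code):
// write the object to a local buffer file first, then try to push it to the
// Storage service; on success the buffer is purged, on failure it is kept so
// a recurring job can retry the upload later.
public class BufferedUploadSketch {
  // Hypothetical storage abstraction: throws IOException when the service is down.
  interface Storage { void upload(String objectName, byte[] data) throws IOException; }

  static boolean bufferThenUpload(Path bufferDir, Storage storage,
                                  String objectName, byte[] data) throws IOException {
    Path buffer = bufferDir.resolve(objectName);
    Files.write(buffer, data);          // 1) persist locally first
    try {
      storage.upload(objectName, data); // 2) try the real Storage service
      Files.delete(buffer);             // 3a) success: purge the local copy
      return true;
    } catch (IOException e) {
      return false;                     // 3b) failure: keep the buffer for retry
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("ccs-buffer");
    Storage down = (name, data) -> { throw new IOException("storage unsteady"); };
    boolean ok = bufferThenUpload(dir, down, "obj1", new byte[]{1, 2, 3});
    System.out.println(ok + " bufferedKept=" + Files.exists(dir.resolve("obj1")));
    // prints: false bufferedKept=true
  }
}
```

The kept buffer files are exactly what makes the "not enough space on device" risk mentioned above real: every failed upload leaves a file behind until a retry succeeds.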

1.1.5. Disaster Recovery or Cloud Migration

Disaster Recovery

Cloud Migration