Storage policies
Description
To manage data storage efficiently, OpenIO provides a flexible rule-based system named “Storage policies”, which allows the platform administrator to easily organize the way data is stored to fit current and future needs.
In particular, “Storage policies” are the way to manage tiering and data protection.
Prerequisites
The architecture must allow the application of storage policies.
Limitations
A storage policy, and especially a “Data security policy”, cannot be updated once in use, because changing it would impact data already stored with it. However, new storage policies can be added at any time.
Concepts
Slots
Slots are names associated with services, which help the administrator categorize the services of a cluster. Using slots is a way to tag services. For example, the administrator can declare a slot grouping services that share common characteristics, such as storage media type or physical location. These slots are then used in storage pool declarations.
Business rules:
- There is a default slot for each type of service. If the administrator does not assign a service to one or more slots, the service goes into the default slot.
- Each service can be associated with one or several slots.
Configuration:
Slots can be defined in each service configuration file /etc/oio/sds/[NS]/watch/[Service]-X.yml
[Service] : account, meta0, meta1, meta2, rawx, rdir, redis, sqlx
Example:
In the following example, four slots are declared for this rawx service: “rawx”, which is the default slot, and “rawx-ssd”, “rawx-odd”, and “rawx-site1”, which are custom slots added by the administrator.
slots:
- rawx
- rawx-ssd
- rawx-odd
- rawx-site1
Locations
The administrator can declare the location of each service as a dot-separated string like room1.rack1.server2.volume4 (1 to 4 words). Locations can be used in pool configurations in order to find services that are far from each other when doing erasure coding or data replication.
The distance between two services is 4 minus the number of words they have in common, starting from the left.
For example:
- room1.rack1.srv12.vol4 and room1.rack1.srv12.vol5 have a distance of 1 (three words in common).
- room1.rack1.srv12.vol4 and room1.rack2.srv11.vol4 have a distance of 3 (one word in common).
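As an illustration, the rule above can be sketched in a few lines of Python (a restatement of the rule for clarity, not the actual OpenIO implementation):
def location_distance(loc_a, loc_b):
    """Distance between two service locations: 4 minus the number of
    leading dot-separated words the two locations have in common."""
    common = 0
    for a, b in zip(loc_a.split("."), loc_b.split(".")):
        if a != b:
            break
        common += 1
    return 4 - common

# Examples from the documentation above:
print(location_distance("room1.rack1.srv12.vol4", "room1.rack1.srv12.vol5"))  # 1
print(location_distance("room1.rack1.srv12.vol4", "room1.rack2.srv11.vol4"))  # 3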
Business rules:
- If no location is set, the IP address and port of the service are used.
- The dotted string is internally converted to a 64-bit integer (by hashing each word using djb2).
Configuration:
Locations are defined in each service configuration file /etc/oio/sds/[NS]/watch/[Service]-X.yml
[Service] : account, meta0, meta1, meta2, rawx, rdir, redis, sqlx
Example:
location: room1.rack1.srv12.vol4
Pools
Pools define the rules that the load-balancer should follow when picking services. They are configured as a set of targets. Each target declares the number of services to pick, followed by one or more slots to use. The slots are consumed in the same order as they are declared, thus providing a fallback mechanism when more than one slot is assigned to a target.
Business rules:
- There is a default pool for each type of service.
Configuration:
Pools are defined in /etc/oio/sds/[NS]/conscience-X/conscience-X-services.conf
[pool:<pool-name>]
targets=<number-of-services-to-pick>,<slot-name>,<slot-name>;<number-of-services-to-pick>,<slot-name>,<slot-name>
nearby_mode=<boolean>
min_dist=<number>
max_dist=<number>
warn_dist=<number>
Examples:
Examples of pools configuration for rawx services:
[pool:fastrawx3]
# Pick 3 SSD rawx, or any rawx if SSD is not available
targets=3,rawx-ssd,rawx
[pool:rawxevenodd]
# Pick one "even" and one "odd" rawx
targets=1,rawx-even;1,rawx-odd
[pool:rawx2]
# As with rawxevenodd, but with permissive fallback on any rawx
targets=1,rawx-even,rawx;1,rawx-odd,rawx
[pool:zonedrawx3]
# Pick one rawx in Europe, one in USA, one in Asia, or anywhere if none available
targets=1,rawx-europe,rawx;1,rawx-usa,rawx;1,rawx-asia,rawx
[pool:zonedrawx2]
# Pick 2 rawx services in Europe (or in Asia if there are no more in Europe) and 2 in the USA (or in Asia if there are no more in the USA), and ensure a minimum distance of 2 between each service.
targets=2,rawx-europe,rawx-asia;2,rawx-usa,rawx-asia
min_dist=2
Data security policy
The Data Security policy describes the way an object is stored on a storage pool.
Each data security policy is derived from one of the supported security types. For the time being, these are:
- plain replication security (replicated data chunks)
- erasure coding security (data chunks + parity chunks)
Configuration:
Data security policies are defined in the Conscience service.
The configuration file /etc/oio/sds/[NS]/conscience-X/conscience-X-policies.conf describes all the data security definitions available in the namespace
Option | Description
---|---
nb_copy | replication only: defines the number of copies to store
distance | defines the minimum distance between chunks to ensure security
algo | erasure coding only: defines the erasure coding algorithm to use
k | erasure coding only: defines the number of data chunks
m | erasure coding only: defines the number of parity chunks
Examples:
[DATA_SECURITY]
# 3x replication
DUPONETHREE=plain/distance=1,nb_copy=3
# 2x replication
DUPONETWO=plain/distance=1,nb_copy=2
# 6+3 Erasure Coding policy (6 data chunks + 3 parity chunks using Reed Solomon with liberasurecode).
ECISAL63D1=ec/k=6,m=3,algo=liberasurecode_rs_vand,distance=1
Storage policies
Storage Policies are the way to describe the different storage tiers of your storage platform. A storage policy is the combination of a Data Security policy and a target storage pool.
A storage policy can be applied at:
- the object level: in this case, the storage policy is explicitly specified when pushing the object
- the container level: in this case, a storage policy must have previously been defined on the container and is applied to each object pushed into the container
- the namespace level: this is the default behaviour; in this case, the policy defined in /etc/oio/sds/[NS]/conscience-X/conscience-X.conf is used
The storage policy is applied by the meta2 service.
Business rules:
- Each storage policy must contain at least a Data Security policy, and can optionally be restricted to specific pools.
- Storage policies are defined per namespace.
- Pools are not mandatory (in this case, indicate NONE).
Configuration:
All the storage policies available in the namespace are defined in the Conscience service in the configuration file /etc/oio/sds/[NS]/conscience-X/conscience-X-policies.conf
A storage policy is defined as follows:
<Storage-policy-name>=<storage-pool-name>:<data-security-policy-name>
The default storage policy for the namespace is defined in the file /etc/oio/sds.conf.d/[NS]:
ns.storage_policy=THREECOPIES
A storage policy can be defined on a container at container creation:
openio container create --storage-policy <storage_policy> <container-name>
You can also set a storage policy on an existing container using the following command:
openio container set --storage-policy <storage_policy> <container-name>
You can specify the storage policy at object creation with the following command:
openio object create --policy <policy> <filename>
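As an illustration, assuming a TWOCOPIES policy is declared in the namespace (as in the examples below), the container-level commands could look like this; my_account and my_container are placeholder names:
# openio container create --storage-policy TWOCOPIES my_container --oio-account my_account
# openio container set --storage-policy TWOCOPIES my_container --oio-account my_account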
Examples:
THREECOPIES=NONE:DUPONETHREE
TWOCOPIES=NONE:DUPONETWO
ECISAL63D1=NONE:ECISAL63D1
Dynamic Storage Policies
OpenIO designed a dynamic mechanism that automatically selects the best data protection scheme according to the characteristics of the stored object, combining optimal efficiency and data protection. It is thus possible to have several storage policies for each domain / user / bucket, and to assign rules that decide which policy applies, e.g. depending on the size of the object to be stored.
In a context where we do not know the type of file to store, the use of the dynamic data protection mechanism implemented by OpenIO is therefore recommended.
As an example, both x3 replication and 14+4 erasure coding can be assigned for a new bucket, with the rule that files smaller than 128KiB are replicated, while larger files use erasure coding.
As an illustration, for an 8 KB object, we obtain:
- 3 copies: a total capacity of 8 * 3 = 24 KB, and only 3 I/O operations.
- EC 14+4 (assuming a minimum fragment size of 8 KB): 8 * 18 = 144 KB, and 18 I/O operations.
In this case, a multiple-copy (replication) policy is not only better in terms of performance, but also in terms of capacity consumption.
For an 8 MB file, you get:
- 3 copies: a total capacity of 8 * 3 = 24 MB
- EC 14+4: 8 * 18/14 ≈ 10.3 MB
In this case, no matter what the I/O operations are, it is clear that erasure coding saves a lot of storage space.
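The arithmetic above can be reproduced with a short Python sketch (an illustration only; the 8 KB minimum fragment size is the assumption used in the small-object example, not a fixed OpenIO constant):
def replication_cost(object_size, nb_copy=3):
    """Capacity used and number of chunk writes with plain replication."""
    return object_size * nb_copy, nb_copy

def ec_cost(object_size, k=14, m=4, min_fragment=8 * 1024):
    """Capacity used and number of chunk writes with k+m erasure coding,
    assuming each fragment is at least min_fragment bytes."""
    fragment = max(object_size / k, min_fragment)
    return fragment * (k + m), k + m

for size in (8 * 1024, 8 * 1024 * 1024):  # 8 KB and 8 MB
    print(size, replication_cost(size), ec_cost(size))
# 8 KB -> replication: 24 KB, 3 writes | EC 14+4: 144 KB, 18 writes
# 8 MB -> replication: 24 MB, 3 writes | EC 14+4: ~10.3 MB, 18 writes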
Business rules
Dynamic storage policies are managed in the S3/Swift gateway
Configuration
The dynamic storage policies are configured in the oio-swift configuration file /etc/oio/sds/[NS]/oioswift-X/proxy-server.conf
Two parameters must be configured:
- oio_storage_policies:
The list of storage policies that are actually manageable by the Swift gateway. All of them must exist in the target OpenIO SDS platform, and if they require erasure coding support, they must rely on an existing/deployed liberasurecode backend.
- auto_storage_policies:
The dynamic storage policy configuration that will be applied.
The format is: “DEFAULT,POLICY:THRESHOLD[,POLICY:THRESHOLD]*”
- “DEFAULT”: the storage policy to be used when the size of the content is unknown, a notable case being ‘Transfer-Encoding: chunked’ uploads.
- “POLICY:THRESHOLD”: the policy POLICY is used when the size of the content is over THRESHOLD.
At least one POLICY:THRESHOLD rule is required. When several tuples are present, the whole list is iterated upon each upload, and the best match (the rule with the largest THRESHOLD not exceeding the content size) is kept as the actual policy.
Example:
oio_storage_policies=SINGLE,EC123,EC64,THREECOPIES,FOURCOPIES
# The subsequent configuration tells swift to apply EC12+3 for stream uploads,
# 3 replicas for small files (<256kiB), EC6+4 for medium files (>=256kiB
# and <64MiB) then EC12+3 for large files (>= 64MiB).
auto_storage_policies=EC123,THREECOPIES:0,EC64:262144,EC123:67108864
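The selection rule can be illustrated with a small Python sketch (a simplified illustration of the behaviour described above, not the actual oio-swift code):
def select_policy(auto_storage_policies, content_size=None):
    """Pick a storage policy for an upload, following the
    DEFAULT,POLICY:THRESHOLD[,POLICY:THRESHOLD]* format."""
    parts = auto_storage_policies.split(",")
    default = parts[0]
    rules = [(int(p.split(":")[1]), p.split(":")[0]) for p in parts[1:]]
    if content_size is None:          # size unknown (e.g. chunked upload)
        return default
    best = None
    for threshold, policy in sorted(rules):
        if content_size >= threshold:
            best = policy             # keep the largest matching threshold
    return best if best is not None else default

conf = "EC123,THREECOPIES:0,EC64:262144,EC123:67108864"
print(select_policy(conf))                    # EC123 (unknown size)
print(select_policy(conf, 100 * 1024))        # THREECOPIES (100 KiB)
print(select_policy(conf, 10 * 1024 * 1024))  # EC64 (10 MiB)
print(select_policy(conf, 100 * 1024 * 1024)) # EC123 (100 MiB)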
Operation
List the storage policies
List the storage policies declared in the namespace
It is possible to list all the storage policies which are available in the namespace by using the command openio cluster show
# openio cluster show
+-----------------------------+----------------------------------------------------------------------+
| Field | Value |
+-----------------------------+----------------------------------------------------------------------+
| namespace | OPENIO |
| chunksize | 104857600 |
| storage_policy.ERASURECODE | NONE:ERASURECODE |
| storage_policy.ECISAL42D1 | NONE:ECISAL42D1 |
| storage_policy.ECISALC35D1 | NONE:ECISALC35D1 |
| storage_policy.ECISALC75D1 | NONE:ECISALC75D1 |
| storage_policy.ECLIBEC144D1 | NONE:ECLIBEC144D1 |
| storage_policy.ECISAL144D1 | NONE:ECISAL144D1 |
| storage_policy.SINGLE | NONE:NONE |
| storage_policy.ECISAL63D1 | NONE:ECISAL63D1 |
| storage_policy.THREECOPIES | NONE:DUPONETHREE |
| storage_policy.ECLIBEC42D1 | NONE:ECLIBEC42D1 |
| storage_policy.ECLIBEC63D1 | NONE:ECLIBEC63D1 |
| storage_policy.TWOCOPIES | NONE:DUPONETWO |
| data_security.ERASURECODE | ec/k=6,m=3,algo=liberasurecode_rs_vand,distance=1 |
| data_security.ECISAL42D1 | ec/k=4,m=2,algo=isa_l_rs_vand,distance=1 |
| data_security.ECISALC35D1 | ec/k=3,m=5,algo=isa_l_rs_cauchy,distance=1 |
| data_security.ECISALC75D1 | ec/k=7,m=5,algo=isa_l_rs_cauchy,distance=1 |
| data_security.ECLIBEC144D1 | ec/k=14,m=4,algo=liberasurecode_rs_vand,distance=1 |
| data_security.ECISAL144D1 | ec/k=14,m=4,algo=isa_l_rs_vand,distance=1 |
| data_security.DUPONETHREE | plain/distance=1,nb_copy=3 |
| data_security.ECLIBEC123D1 | ec/k=12,m=3,algo=liberasurecode_rs_vand,distance=1 |
| data_security.ECISAL63D1 | ec/k=6,m=3,algo=isa_l_rs_vand,distance=1 |
| data_security.DUPONETWO | plain/distance=1,nb_copy=2 |
| data_security.ECLIBEC42D1 | ec/k=4,m=2,algo=liberasurecode_rs_vand,distance=1 |
| data_security.ECLIBEC63D1 | ec/k=6,m=3,algo=liberasurecode_rs_vand,distance=1 |
| data_security.ECISAL123D1 | ec/k=12,m=3,algo=isa_l_rs_vand,distance=1 |
| flat_bitlength | 17 |
| service_update_policy | meta2=KEEP|3|1|;rdir=KEEP|1|1|user_is_a_service=rawx;sqlx=KEEP|3|1|; |
| storage_policy | THREECOPIES |
+-----------------------------+----------------------------------------------------------------------+
The storage_policy.* fields correspond to all the storage policies declared in the namespace.
The data_security.* fields correspond to all the data security policies declared in the namespace.
The storage_policy field corresponds to the default storage policy of the namespace.
List storage policy applied to a container or an object
It is possible to list the storage policy applied to a container by using the command openio container show
# openio container show my_container --oio-account my_account
+----------------+--------------------------------------------------------------------+
| Field | Value |
+----------------+--------------------------------------------------------------------+
| account | my_account |
| base_name | 7991F2BDBAFB48D7246BFFEF64A79B9800C170A0AF5BFD10697A4290F5F953A2.1 |
| bytes_usage | 0B |
| container | my_container |
| ctime | 1533833142 |
| max_versions | Namespace default |
| objects | 0 |
| quota | Namespace default |
| status | Enabled |
| storage_policy | TWOCOPIES |
+----------------+--------------------------------------------------------------------+
It is possible to list the storage policy applied to an object by using the command openio object show
# openio object show my_container my_object --oio-account my_account
+-----------+----------------------------------+
| Field | Value |
+-----------+----------------------------------+
| account | my_account |
| container | my_container |
| ctime | 1533833952 |
| hash | 9EB03B6E836CEAE565BA79F76C821DDA |
| id | E352E98B037305006B030F50551AA0EE |
| mime-type | application/octet-stream |
| object | my_object |
| policy | SINGLE |
| size | 14 |
| version | 1533833952973557 |
+-----------+----------------------------------+
Change the default storage policy of the namespace
It is possible to change the default storage policy applied to the namespace by changing the ns.storage_policy parameter in the /etc/oio/sds.conf.d/OPENIO configuration file.
Then, all the proxy and meta2 services of the cluster must be restarted:
# gridinit_cmd restart OPENIO-meta2-X
# gridinit_cmd restart OPENIO-oioproxy-X
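Once the services have been restarted, the new default can be verified in the storage_policy field reported by openio cluster show, for example:
# openio cluster show | grep storage_policy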