· 7 years ago · Apr 27, 2018, 04:50 PM
# IPFS block encryption

This document specifies how to implement IPFS block encryption. Note that this specification only deals with symmetric encryption. Asymmetric encryption can later be added on top of this specification.

## CID version 4

We introduce a new CID version 4 that can be used to retrieve and decrypt an encrypted block.

```
<cidv4> ::=
  <multibase-prefix>
  <cid-version>
  <multicodec-content-type>
  <secret-key>
  <multihash-content-address>

<secret-key> ::=
  <key-type>
  <key-length>
  <key-data>
```

CIDv4 extends CIDv1 with a new secret-key element. The secret-key contains a key type that specifies which encryption algorithm to use. The only defined key type is 1, which means encryption with NaCl’s secretbox. Other key types may be defined in the future.

CIDv4 is a private CID that should not be shared over the network or stored in the blockstore. It must first be converted to a public CID, which is a CIDv1 with the same multibase-prefix and multihash-content-address as the CIDv4, but with a multicodec-content-type of raw.

```
<public-cidv1> ::=
  <multibase-prefix> = cidv4-multibase-prefix
  <cid-version> = “1”
  <multicodec-content-type> = “raw”
  <multihash-content-address> = cidv4-multihash-content-address
```
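
As a rough illustration of the two grammars above, the following Python sketch serializes the binary part of a CIDv4 and derives the public CIDv1 from it. The document does not define the wire encoding, so the sketch assumes that every numeric field is an unsigned varint as in CIDv1 and that the multihash is sha2-256. It leaves the multibase prefix to whatever text encoding is applied afterwards. All function names are hypothetical.

```python
import hashlib

KEY_TYPE_SECRETBOX = 1    # the only key type defined in this document
RAW, DAG_PB = 0x55, 0x70  # multicodec codes for raw and dag-pb
SHA2_256 = 0x12           # multihash code for sha2-256


def varint(n: int) -> bytes:
    """Encode n as an unsigned LEB128 varint (assumed field encoding)."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def read_varint(buf: bytes, pos: int) -> tuple:
    """Decode a varint starting at pos; return (value, next position)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def multihash_sha256(data: bytes) -> bytes:
    """Build a sha2-256 multihash: code, digest length, digest."""
    digest = hashlib.sha256(data).digest()
    return varint(SHA2_256) + varint(len(digest)) + digest


def encode_cidv4(codec: int, key: bytes, multihash: bytes) -> bytes:
    """Serialize <cid-version> <content-type> <secret-key> <multihash>."""
    secret_key = varint(KEY_TYPE_SECRETBOX) + varint(len(key)) + key
    return varint(4) + varint(codec) + secret_key + multihash


def public_cidv1(cidv4: bytes) -> bytes:
    """Drop the content type and the key, keep the multihash, mark as raw."""
    version, pos = read_varint(cidv4, 0)
    if version != 4:
        raise ValueError("not a CIDv4")
    _codec, pos = read_varint(cidv4, pos)
    _key_type, pos = read_varint(cidv4, pos)
    key_len, pos = read_varint(cidv4, pos)
    multihash = cidv4[pos + key_len:]  # skip the key data
    return varint(1) + varint(RAW) + multihash
```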

To retrieve an encrypted block, we first fetch the block over the network using the computed CIDv1 and then decrypt it using the embedded key. This should be implemented in a lower-level protocol so that it is fully transparent to applications that use the IPFS API.

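The retrieval path could look roughly like the sketch below. It assumes a sha2-256 multihash and a block laid out as a 24-byte secretbox nonce followed by the ciphertext. `fetch` and `open_box` are hypothetical callbacks standing in for the network layer and the cipher. With PyNaCl, `open_box` could plausibly wrap `nacl.secret.SecretBox(key).decrypt(ciphertext, nonce)`.

```python
import hashlib

NONCE_LEN = 24  # secretbox nonce size in bytes


def retrieve(key, multihash, fetch, open_box):
    """Fetch the block behind the public CID's multihash and decrypt it.

    fetch(multihash) -> blockdata and
    open_box(key, nonce, ciphertext) -> plaintext
    are hypothetical callbacks supplied by the caller.
    """
    blockdata = fetch(multihash)
    # Only sha2-256 multihashes (code 0x12, length 0x20) are handled here.
    if multihash[:2] != b"\x12\x20":
        raise ValueError("unsupported multihash")
    # Check the content address before trusting the fetched bytes.
    if hashlib.sha256(blockdata).digest() != multihash[2:]:
        raise ValueError("block does not match its content address")
    nonce, ciphertext = blockdata[:NONCE_LEN], blockdata[NONCE_LEN:]
    return open_box(key, nonce, ciphertext)
```
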
## Encrypted block

This is the algorithm to create a new encrypted block:

 1. Let the plaintext be the data we want to encrypt
 2. Generate a new random key and nonce
 3. Encrypt the plaintext with the key and nonce to get the ciphertext
 4. Let blockdata be the nonce concatenated with the ciphertext
 5. Generate a CIDv4 containing the key, the content-type of the plaintext and the multihash of the blockdata
 6. Compute the public CIDv1 from the CIDv4 as described above
 7. Create the block from the CIDv1 and the blockdata

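The steps above can be sketched in Python as follows. This is only an illustration under several assumptions: single-byte header fields (so the content-type code must be below 0x80), a sha2-256 multihash, and a hypothetical `seal(key, nonce, plaintext)` callback that returns the secretbox ciphertext. With PyNaCl, `nacl.secret.SecretBox(key).encrypt(plaintext, nonce).ciphertext` should be a suitable candidate for `seal`.

```python
import hashlib
import os

KEY_LEN, NONCE_LEN = 32, 24  # secretbox key and nonce sizes in bytes


def create_encrypted_block(plaintext, codec, seal):
    """Return (cidv4, cidv1, blockdata) for one freshly encrypted block."""
    if not 0 <= codec < 0x80:
        raise ValueError("sketch only handles single-byte content types")
    key = os.urandom(KEY_LEN)                 # step 2: random key
    nonce = os.urandom(NONCE_LEN)             # step 2: random nonce
    ciphertext = seal(key, nonce, plaintext)  # step 3
    blockdata = nonce + ciphertext            # step 4
    multihash = b"\x12\x20" + hashlib.sha256(blockdata).digest()
    # Step 5: version 4, content type, key type 1, key length, key, multihash.
    cidv4 = bytes([4, codec, 1, KEY_LEN]) + key + multihash
    # Step 6: version 1, raw (0x55), same multihash.
    cidv1 = bytes([1, 0x55]) + multihash
    # Step 7: the block is the pair (cidv1, blockdata).
    return cidv4, cidv1, blockdata
```
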
Note that this algorithm is non-deterministic, so if you encrypt the same plaintext twice you will end up with two different blocks.

## Linking to encrypted blocks

IPFS objects can link to other objects, and they can now also link to content addressed by a CIDv4. It is important to note that if an unencrypted object links to an encrypted object in this fashion, the encrypted object is effectively no longer encrypted, because the secret key is now publicly available.

For this reason, it is recommended that any object that links to encrypted data be encrypted itself. This is of particular importance to the “ipfs files” API, as well as the storage of the currently pinned blocks.

## Pinning encrypted blocks

Pinning encrypted data may sound very complicated, but it is actually very easy. If you recursively pin a block by its CIDv4, all the linked blocks will also be indirectly pinned, exactly as with normal blocks. If you pin an encrypted block by its CIDv1, there is no way to tell what data it links to, as it is essentially just a raw block. None of this should require any new code.

When doing garbage collection, the only thing we need to do is convert any CIDv4s into their corresponding CIDv1s just before running the actual garbage collection.

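A minimal sketch of that conversion step, assuming pinned CIDs are available in binary form with single-byte header fields (version, content type, key type, key length). `gc_roots` is a hypothetical helper and not part of any existing IPFS API.

```python
def gc_roots(pins):
    """Map every pinned CID to the public form stored in the blockstore."""
    roots = set()
    for cid in pins:
        if cid[0] == 4:
            # CIDv4 layout: version, content type, key type, key length,
            # key data, multihash. Keep only the multihash.
            key_len = cid[3]
            cid = bytes([1, 0x55]) + cid[4 + key_len:]
        roots.add(cid)
    return roots
```
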
## Open questions

 * Should it be possible to pad the plaintext in order to make all the blocks the same size?
 * CIDv4s are currently very long (96 characters in base58). Is there any way to make them shorter while still maintaining the same level of security and speed?
 * Encryption currently makes the data 40 bytes longer, which means that if the plaintext size is a power of two, the ciphertext size will not be a power of two. It would be nice if we could encrypt without adding extra bytes, but then we would have to store even more in the CIDv4. Is that a good idea or not? Another option is to chunk files at 2^n - 40 bytes instead of the normal 2^n.
 * Should encrypted blocks have their own content-type so other peers know that they are encrypted blocks, or is it better to keep them as raw data so other peers don’t know if it is an encrypted block or not?
 * Should we specify a way to deterministically encrypt blocks in addition to the non-deterministic way?
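
The size figures in the questions above can be reproduced with a little arithmetic. The sketch assumes that the 40 bytes are the 24-byte nonce plus the 16-byte authenticator that secretbox adds, and that a binary CIDv4 has four single-byte header fields, a 32-byte key and a 34-byte sha2-256 multihash. Under those assumptions the base58 text comes out at 95 characters, or 96 with a one-character multibase prefix. The author's actual layout may differ. The helper names are hypothetical, and 2^18 is used purely as an example chunk size.

```python
NONCE_LEN = 24                  # secretbox nonce stored in front of the ciphertext
TAG_LEN = 16                    # authenticator added by secretbox
OVERHEAD = NONCE_LEN + TAG_LEN  # the 40 bytes mentioned above


def block_size(plaintext_len: int) -> int:
    """Size of the stored block for a plaintext of the given length."""
    return plaintext_len + OVERHEAD


def chunk_size(n: int) -> int:
    """Plaintext chunk size that yields a block of exactly 2**n bytes."""
    return 2 ** n - OVERHEAD


def base58_digits(data: bytes) -> int:
    """Count base58 digits of data (leading zero bytes are not counted)."""
    value = int.from_bytes(data, "big")
    digits = 0
    while value:
        value //= 58
        digits += 1
    return digits


# Assumed binary CIDv4: version 4, dag-pb, key type 1, key length 32,
# a dummy 32-byte key, and a dummy sha2-256 multihash.
example_cidv4 = bytes([4, 0x70, 1, 32]) + bytes(32) + b"\x12\x20" + bytes(32)
text_length = 1 + base58_digits(example_cidv4)  # one multibase prefix character
```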