//! # ScrambleDB
//!
//! This document describes `ScrambleDB`, a protocol between several
//! parties for the pseudonymization and non-transitive joining of data.
//!
//! ## Overview and Concepts
//! `ScrambleDB` operates on tables of data, where a table is a collection
//! of attribute entries for entities identified by unique keys.
//!
//!
//! `ScrambleDB` offers two sub-protocols for blindly converting between different
//! table types.
//! ### Conversion from plain tables to pseudonymized columns
//! A plain table contains attribute data organized by (possibly
//! sensitive) entity identifiers, e.g. a table might store attribute
//! data for attributes `Address` and `Date of Birth (DoB)` under the
//! entity identifier `Full Name`:
//!
//! | Full Name (Identifier) | Address | DoB |
//! |------------------------|------------------------------------|---------------|
//! | Bilbo Baggins | 1 Bagshot Row, Hobbiton, the Shire | Sept. 22 1290 |
//! | Frodo Baggins | 1 Bagshot Row, Hobbiton, the Shire | Sept. 22 1368 |
//!
//! The result of `ScrambleDB` pseudonymization of such a
//! table can be thought of as being computed in two steps:
//!
//! 1. Splitting the original table by attributes, resulting in
//! single-column tables, one per attribute, indexed by the original
//! identifier.
//!
//! | Full Name (Identifier) | Address |
//! |------------------------|------------------------------------|
//! | Bilbo Baggins | 1 Bagshot Row, Hobbiton, the Shire |
//! | Frodo Baggins | 1 Bagshot Row, Hobbiton, the Shire |
//!
//! | Full Name (Identifier) | DoB |
//! |------------------------|---------------|
//! | Bilbo Baggins | Sept. 22 1290 |
//! | Frodo Baggins | Sept. 22 1368 |
//!
//! 2. Pseudonymization and shuffling of split columns, such that the original
//! identifiers are replaced by pseudonyms which are unlinkable between
//! different columns.
//!
//! | Pseudonym (Identifier) | Address |
//! |------------------------|------------------------------------|
//! | _pseudo1_ | 1 Bagshot Row, Hobbiton, the Shire |
//! | _pseudo2_ | 1 Bagshot Row, Hobbiton, the Shire |
//!
//! | Pseudonym (Identifier) | DoB |
//! |------------------------|---------------|
//! | _pseudo3_ | Sept. 22 1368 |
//! | _pseudo4_ | Sept. 22 1290 |
//!
//!
//! Since the result of pseudonymizing a plain table is a set of
//! pseudonymized single-column tables we refer to this operation as a
//! _split conversion_.
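//!
//! The two steps above can be sketched in the clear as follows. This is
//! only an illustration of the input/output behaviour, not of the blind
//! protocol itself: `toy_pseudonym`, `split_conversion` and the per-column
//! `u64` keys are hypothetical names, the hash used is *not*
//! cryptographically secure, and sorting by pseudonym merely stands in for
//! a random shuffle.
//!
/*!
```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A plain table row: an identifier plus one value per attribute.
type PlainRow = (String, Vec<String>);
/// A pseudonymized single-column table: (pseudonym, value) rows.
type PseudonymizedColumn = Vec<(u64, String)>;

/// Placeholder pseudonym derivation (assumption, NOT secure): hashes a
/// per-column key together with the identifier, so that the same
/// identifier maps to different pseudonyms in different columns.
fn toy_pseudonym(column_key: u64, identifier: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    column_key.hash(&mut hasher);
    identifier.hash(&mut hasher);
    hasher.finish()
}

/// Split a plain table into one pseudonymized column per attribute.
/// Panics if a row has fewer values than there are column keys.
fn split_conversion(rows: &[PlainRow], column_keys: &[u64]) -> Vec<PseudonymizedColumn> {
    column_keys
        .iter()
        .enumerate()
        .map(|(col, &key)| {
            // Step 1: project onto a single attribute.
            // Step 2: replace the identifier by a column-specific pseudonym.
            let mut column: PseudonymizedColumn = rows
                .iter()
                .map(|(id, values)| (toy_pseudonym(key, id), values[col].clone()))
                .collect();
            // Stand-in for shuffling: order rows by pseudonym so that the
            // original row order is no longer visible.
            column.sort_by_key(|(pseudonym, _)| *pseudonym);
            column
        })
        .collect()
}

fn main() {
    let rows: Vec<PlainRow> = vec![
        (
            "Bilbo Baggins".to_string(),
            vec!["1 Bagshot Row".to_string(), "Sept. 22 1290".to_string()],
        ),
        (
            "Frodo Baggins".to_string(),
            vec!["1 Bagshot Row".to_string(), "Sept. 22 1368".to_string()],
        ),
    ];
    // One hypothetical key per attribute column (Address, DoB).
    let columns = split_conversion(&rows, &[11, 22]);
    assert_eq!(columns.len(), 2);
    // The same identifier receives different pseudonyms per column.
    assert_ne!(
        toy_pseudonym(11, "Bilbo Baggins"),
        toy_pseudonym(22, "Bilbo Baggins")
    );
}
```
*/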
//!
//! ### Conversion from pseudonymized columns to non-transitively joined tables
//! Pseudonymized columns may be selectively re-joined such that the
//! original link between data is restored, but under a fresh pseudonymous
//! identifier instead of the original (sensitive) identifier. In the
//! above example, a join of pseudonymized columns `Address` and `DoB`
//! would result in the following pseudonymized joined table.
//!
//!
//! | Join Pseudonym (Identifier) | Address | DoB |
//! |-----------------------------|------------------------------------|---------------|
//! | _pseudo5_ | 1 Bagshot Row, Hobbiton, the Shire | Sept. 22 1290 |
//! | _pseudo6_ | 1 Bagshot Row, Hobbiton, the Shire | Sept. 22 1368 |
//!
//! The contained pseudonyms are fresh for each join and are
//! non-transitive, i.e. it is not possible to further join two
//! join-results based on the join pseudonym.
//!
//! Since the result of this conversion is a joined table, we refer to the
//! operation as a _join conversion_.
//!
//! ### Data Sources, Stores and Converter
//! `ScrambleDB` is a multiparty protocol where parties serve different
//! roles as origins or destinations of data.
//!
//! Non-pseudonymized data originates at a **data source**.
//!
//! **Data stores** hold pseudonymized data and come in two forms:
//! - The **data lake** is a designated data store which stores
//! pseudonymized data columns fed to it by data sources via the
//! ScrambleDB protocol.
//! - A **data processor** is a data store which acquires pseudonymized
//! joined tables from a data lake via the ScrambleDB protocol.
//!
//! The **converter** facilitates the protocol in an oblivious fashion by
//! blindly performing the two types of conversion operations.
//!
//!
//! ## Cryptographic Preliminaries
//!
//! ### Rerandomizable Public Key Encryption
//! A rerandomizable public key encryption scheme `RPKE` is parameterized by a
//! set of possible plaintexts `Plaintext` as well as a set of ciphertexts
//! `Ciphertext`.
//!
//! It offers the following interface:
//! - Key Generation:
//! ```text
//! fn RPKE.generate_key_pair(randomness) -> (ek, dk)
//!
//! Inputs:
//! randomness
//!
//! Outputs:
//! ek: EncryptionKey
//! dk: DecryptionKey
//! ```
//! - Encryption:
//! ```text
//! fn RPKE.encrypt(ek, msg, randomness) -> ctxt
//!
//! Inputs:
//! ek: EncryptionKey
//! msg: Plaintext
//! randomness
//!
//! Outputs:
//! ctxt: Ciphertext
//! ```
//! - Decryption:
//! ``` text
//! fn RPKE.decrypt(dk, ctxt) -> msg'
//!
//! Inputs:
//! dk: DecryptionKey
//! ctxt: Ciphertext
//!
//! Output:
//! msg': Plaintext
//!
//! Failures:
//! DecryptionFailure
//! ```
//! - Ciphertext rerandomization:
//!
//! ``` text
//! fn RPKE.rerandomize(ek, ctxt, randomness) -> ctxt'
//!
//! Inputs:
//! ek: EncryptionKey
//! ctxt: Ciphertext
//! randomness
//!
//! Output:
//! ctxt': Ciphertext
//! ```
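//!
//! As an illustration of this interface, the following sketch instantiates
//! it with textbook ElGamal, a well-known rerandomizable scheme. It is an
//! assumption made for this example only and not necessarily the scheme
//! used by this crate. The group parameters `P`, `Q` and `G` are toy values
//! that offer no security, keys and plaintexts are plain `u64` values in
//! `1..P`, and `randomness` is passed in explicitly as an integer. Unlike
//! the interface above, this toy decryption never fails.
//!
/*!
```rust
/// Toy group (assumption, insecure): P = 2 * Q + 1 is a safe prime and
/// G generates the subgroup of prime order Q.
const P: u64 = 2039;
const Q: u64 = 1019;
const G: u64 = 4;

/// Square-and-multiply modular exponentiation.
fn mod_pow(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Ciphertext {
    c1: u64,
    c2: u64,
}

/// Returns (ek, dk) with dk in 1..Q and ek = G^dk mod P.
fn generate_key_pair(randomness: u64) -> (u64, u64) {
    let dk = randomness % (Q - 1) + 1;
    (mod_pow(G, dk, P), dk)
}

/// Encrypts msg (expected in 1..P) as (G^r, msg * ek^r).
fn encrypt(ek: u64, msg: u64, randomness: u64) -> Ciphertext {
    let r = randomness % Q;
    Ciphertext {
        c1: mod_pow(G, r, P),
        c2: msg % P * mod_pow(ek, r, P) % P,
    }
}

/// Recovers msg = c2 * c1^(-dk); since c1 has order dividing Q,
/// c1^(Q - dk) equals c1^(-dk).
fn decrypt(dk: u64, ctxt: Ciphertext) -> u64 {
    ctxt.c2 * mod_pow(ctxt.c1, Q - dk, P) % P
}

/// Multiplies in a fresh encryption of 1, changing the ciphertext
/// without changing the plaintext and without needing dk.
fn rerandomize(ek: u64, ctxt: Ciphertext, randomness: u64) -> Ciphertext {
    let s = randomness % Q;
    Ciphertext {
        c1: ctxt.c1 * mod_pow(G, s, P) % P,
        c2: ctxt.c2 * mod_pow(ek, s, P) % P,
    }
}

fn main() {
    let (ek, dk) = generate_key_pair(123);
    let ctxt = encrypt(ek, 42, 77);
    let fresh = rerandomize(ek, ctxt, 7);
    assert_ne!(ctxt, fresh);
    assert_eq!(decrypt(dk, ctxt), 42);
    assert_eq!(decrypt(dk, fresh), 42);
}
```
*/
//!
//! The property that matters here is that `rerandomize` needs only the
//! encryption key, so a party that cannot decrypt can still make a
//! ciphertext unlinkable to its earlier form.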
//!
//! ### Convertible Pseudorandom Function (coPRF)
//! **TODO: Describe coPRF interface**
//!
//! ### Pseudorandom Permutation
//! A pseudorandom permutation (PRP) is a keyed permutation that, to anyone
//! who does not know the key, is computationally indistinguishable from a
//! random permutation. It has the following interface, where `PRPKey` is
//! the set of possible keys for the permutation and `PRPValue` is both the
//! domain and the range of the permutation.
//!
//! - Permutation:
//!
//! ``` text
//! PRP.eval(k, x) -> y
//!
//! Inputs:
//! k: PRPKey
//! x: PRPValue
//!
//! Output:
//! y: PRPValue
//! ```
//!
//! - Inversion:
//!
//! ``` text
//! PRP.inverse(k, x) -> y
//!
//! Inputs:
//! k: PRPKey
//! x: PRPValue
//!
//! Output:
//! y: PRPValue
//! ```
//!
//! We require that for every PRP key `k`, `PRP.eval` and `PRP.inverse` are
//! mutually inverse: applying one and then the other, in either order, is
//! the identity function, i.e. for any `x` in `PRPValue`:
//!
//! ``` text
//! PRP.eval(k, PRP.inverse(k, x)) = PRP.inverse(k, PRP.eval(k, x)) = x
//! ```
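//!
//! The inversion requirement can be illustrated with a small Feistel
//! network, which is a permutation for any choice of round function. This
//! is a sketch under stated assumptions, not the PRP used by this crate:
//! `prp_eval`, `prp_inverse` and `round_function` are hypothetical names,
//! `PRPKey` is modelled as `u64`, `PRPValue` as `u32`, and the round
//! function is a simple integer mix with no claim to pseudorandomness.
//!
/*!
```rust
const ROUNDS: u32 = 4;

/// Toy round function (assumption, not cryptographically secure).
fn round_function(key: u64, round: u32, half: u16) -> u16 {
    let mixed = (key ^ u64::from(half))
        .wrapping_add(u64::from(round))
        .wrapping_mul(0x9E37_79B9_7F4A_7C15)
        .rotate_left(29);
    (mixed >> 48) as u16
}

/// Forward direction: each round maps (L, R) to (R, L xor F(R)).
fn prp_eval(k: u64, x: u32) -> u32 {
    let (mut left, mut right) = ((x >> 16) as u16, x as u16);
    for round in 0..ROUNDS {
        let next_right = left ^ round_function(k, round, right);
        left = right;
        right = next_right;
    }
    (u32::from(left) << 16) | u32::from(right)
}

/// Backward direction: undo the rounds in reverse order, mapping
/// (L', R') back to (R' xor F(L'), L').
fn prp_inverse(k: u64, y: u32) -> u32 {
    let (mut left, mut right) = ((y >> 16) as u16, y as u16);
    for round in (0..ROUNDS).rev() {
        let prev_left = right ^ round_function(k, round, left);
        right = left;
        left = prev_left;
    }
    (u32::from(left) << 16) | u32::from(right)
}

fn main() {
    let k = 0x0123_4567_89AB_CDEF;
    for x in [0u32, 1, 42, 0xDEAD_BEEF, u32::MAX] {
        // Both compositions must be the identity.
        assert_eq!(prp_inverse(k, prp_eval(k, x)), x);
        assert_eq!(prp_eval(k, prp_inverse(k, x)), x);
    }
}
```
*/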
use libcrux::hpke::{aead::AEAD, kdf::KDF, kem::KEM, HPKECiphertext, HPKEConfig, Mode};

/// Security parameter in bytes
const SECPAR_BYTES: usize = 16;

/// The HPKE Configuration used in the implementation of double HPKE
/// encryption using the HPKE single-shot API.
const HPKE_CONF: HPKEConfig = HPKEConfig(
    Mode::mode_base,
    KEM::DHKEM_P256_HKDF_SHA256,
    KDF::HKDF_SHA256,
    AEAD::ChaCha20Poly1305,
);

/// A wrapper type to facilitate (de-)serialization of HPKE
/// ciphertexts to (and from) linear byte vectors.
pub struct SerializedHPKE {
    len_kem_output: u32,
    len_ciphertext: u32,
    bytes: Vec<u8>,
}

impl SerializedHPKE {
    /// Prepare an HPKE ciphertext for serialization by wrapping it in
    /// a `SerializedHPKE`.
    pub fn from_hpke_ct(ct: &HPKECiphertext) -> Self {
        // Layout of `bytes`: KEM output followed by the AEAD ciphertext.
        let mut bytes = ct.0.clone();
        bytes.extend_from_slice(&ct.1);
        Self {
            len_kem_output: ct.0.len() as u32,
            len_ciphertext: ct.1.len() as u32,
            bytes,
        }
    }

    /// Reconstruct an HPKE ciphertext from the wrapper type. This
    /// does not perform validation of the reconstructed ciphertext.
    ///
    /// Everything after the KEM output is treated as the ciphertext.
    ///
    /// # Panics
    /// Panics if `len_kem_output` exceeds the number of stored bytes.
    pub fn to_hpke_ct(&self) -> HPKECiphertext {
        let split = self.len_kem_output as usize;
        HPKECiphertext(self.bytes[..split].to_vec(), self.bytes[split..].to_vec())
    }

    /// Serialize the wrapper type to a byte vector: both lengths as
    /// big-endian `u32` values, followed by the payload bytes.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut bytes = Vec::new();
        bytes.extend_from_slice(&self.len_kem_output.to_be_bytes());
        bytes.extend_from_slice(&self.len_ciphertext.to_be_bytes());
        bytes.extend_from_slice(&self.bytes);
        bytes
    }

    /// Deserialize a wrapped HPKE ciphertext from a byte vector. This
    /// does not perform validation of the deserialized ciphertext.
    ///
    /// # Panics
    /// Panics if `bytes` is shorter than the 8-byte length header.
    pub fn from_bytes(bytes: &[u8]) -> Self {
        let len_kem_output = u32::from_be_bytes(bytes[0..4].try_into().unwrap());
        let len_ciphertext = u32::from_be_bytes(bytes[4..8].try_into().unwrap());
        Self {
            len_kem_output,
            len_ciphertext,
            bytes: bytes[8..].to_vec(),
        }
    }
}

pub mod table;
pub mod setup;
pub mod split;
pub mod join;
pub mod finalize;
pub mod data_transformations;
pub mod data_types;
pub mod error;
#[cfg(feature = "wasm")]
pub mod wasm_demo;
#[cfg(test)]
mod test_util;