# Aug 10, 2021, 12:52 AM
openapi: 3.0.1
info:
  title: MPC4F
  description: "Multiparty Computation for Fairness.\n

    For this API documentation, we'll use the following running example:\n

    1. There are three DCs (Data Custodians) named Alice, Bob, and Charlie.\n

    2. There are four questions in the survey: gender, ethnic group, marital status, and occupational status. For our fairness computation, we encode the survey result in binary. The first bit is true if the surveyed person identifies as male, the second bit is true if the surveyed person identifies as female, ..., and the last bit is true if the surveyed person has retired. By choosing a certain bit of this encoding (by specifying the *offset*), we select the sensitive attribute for which to run our fairness computation. For example, with \"offset = 1\", the fairness computation is run with regard to the attribute *identifies as male*. To simplify the discussion, we assume that these four questions yield 30 binary fields after encoding.\n"
  version: 1.1.0
servers:
- url: http://localhost:8000
tags:
- name: Data Custodian
  description: Data Custodian APIs
- name: Model Provider
  description: Model Provider APIs
paths:
  /dc/status:
    get:
      tags:
      - Data Custodian
      summary: Simple liveness check.
      description: "Checks the status of the called DC. If this DC is running, it returns its setup time.\n

        The CLI command `util ping` calls this API to display the status of each DC."
      responses:
        '200':
          description: Successful operation
          content:
            'text/plain':
              schema:
                type: string
                example: 'Fri, 16 Jul 2021 20:00:00 +0000'
                description: The time at which the DC was set up.
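      # Illustrative addition, not part of the original definition. `x-codeSamples` is a
      # vendor extension (rendered by Redoc; tools that do not know it ignore it).
      # A minimal client sketch, assuming the response body is an RFC 2822 date string
      # like the example above. The function name `dc_setup_time` is hypothetical.
      x-codeSamples:
      - lang: Python
        label: Liveness check (sketch)
        source: |
          from email.utils import parsedate_to_datetime
          from urllib.request import urlopen


          def dc_setup_time(base_url="http://localhost:8000"):
              """Return the DC's setup time as a datetime; raises if the DC is unreachable."""
              with urlopen(base_url + "/dc/status", timeout=5) as resp:
                  return parsedate_to_datetime(resp.read().decode("utf-8").strip())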
  /dc/shares/{rid}:
    post:
      tags:
      - Data Custodian
      summary: Upload the secret shares of the record under the given rid.
      description: "Call this API to upload the secret shares under the specified rid. In our system, the SO calls this API.\n

        In the running example, there are three DCs and 30 binary fields in the survey. Therefore, the SO generates three separate secret shares, one each for Alice, Bob, and Charlie. Each DC receives its secret share, which is a vector of 30 scalars, one per binary field of the survey.\n

        The survey fields are all binary (e.g. the second bit represents whether the person identifies as female), which allows us to perform secret sharing in the following way: the three secret shares (one share for each DC) sum to 1 if the survey field is true for this record, and to 0 if the survey field is false.\n

        We have 30 (binary) fields in the survey, so we need to send each of Alice, Bob, and Charlie 30 scalars in binary. Together, these 30 scalars make up that DC's secret share.\n

        For example, summing the 30th scalar of the secret shares sent to the three DCs yields 1 if and only if the surveyed person indicated that they have retired.\n

        Instead of encoding the binary in JSON, which would transform the binary representation to base64 and slow down the data transmission by approximately 25%, we concatenate the 30 binary scalars (30 secret shares) and send them as a single binary value. If the length of each binary scalar is 256 bits (this is determined by the configuration file), then bits 0-255 represent the first scalar (the secret share for the first survey field), bits 256-511 represent the second scalar, and so on.\n

        To store the secret shares, the DC writes them to its database with the rid as the key.\n

        After successfully storing valid secret shares under the given rid, the DC creates a “received” record in the AD to mark the successful upload. This makes it possible to verify whether all DCs received their secret shares by querying the AD.\n"
      parameters:
      - name: rid
        in: path
        description: Record ID
        required: true
        schema:
          $ref: '#/components/schemas/RecordID'
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/SecretShares'
        required: true
      responses:
        '200':
          description: OK
          content: {}
        '400':
          description: Bad format in the request body
          content:
            text/plain:
              schema:
                type: string
                example: "Error: bad format for secret shares."
        '409':
          description: Conflict. POST requests to this DC are frozen because its data has been dumped to another server and no new import has been performed since.
          content:
            'text/plain':
              schema:
                type: string
                example: "Error: POST requests in this DC are frozen because data in this DC has been dumped to another server but no new imports have been given."
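      # Illustrative addition, not part of the original definition. `x-codeSamples` is a
      # vendor extension (rendered by Redoc; tools that do not know it ignore it).
      # A sketch of the additive sharing and concatenation described above. Assumptions:
      # MODULUS is a placeholder (the real one comes from the deployment's configuration),
      # scalars are packed little-endian, and all function names are hypothetical.
      # How the packed bytes are wrapped for transport is deliberately left out.
      x-codeSamples:
      - lang: Python
        label: Building per-DC secret shares (sketch)
        source: |
          import secrets

          NUM_DCS = 3            # Alice, Bob, Charlie in the running example
          SCALAR_BYTES = 32      # 256-bit scalars, as in the description
          MODULUS = 2**255 - 19  # placeholder modulus (assumption)


          def share_bit(bit):
              """Split one bit into NUM_DCS shares that sum to `bit` modulo MODULUS."""
              shares = [secrets.randbelow(MODULUS) for _ in range(NUM_DCS - 1)]
              shares.append((bit - sum(shares)) % MODULUS)
              return shares


          def pack(scalars):
              """Concatenate scalars, SCALAR_BYTES each (little-endian is an assumption)."""
              return b"".join(s.to_bytes(SCALAR_BYTES, "little") for s in scalars)


          survey = [1] + [0] * 29                         # 30 binary fields
          per_field = [share_bit(b) for b in survey]      # 30 lists of 3 shares
          per_dc = [pack([f[i] for f in per_field]) for i in range(NUM_DCS)]

          # Each DC gets 30 scalars; the shares of every field recombine to its bit.
          assert all(len(p) == 30 * SCALAR_BYTES for p in per_dc)
          assert all(sum(f) % MODULUS == b for f, b in zip(per_field, survey))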
    delete:
      tags:
      - Data Custodian
      summary: Deletes the secret shares of the record under the given rid.
      description: "Call this API to delete the secret shares under the specified rid. It deletes the record with the corresponding rid in the DC database.\n

        If the rid does not exist, an error is returned."
      parameters:
      - name: rid
        in: path
        description: Record ID
        required: true
        schema:
          $ref: '#/components/schemas/RecordID'
      responses:
        '200':
          description: OK
          content: {}
        '400':
          description: Bad request in path parameter. rid does not exist.
          content:
            text/plain:
              schema:
                type: string
                example: "Error: rid does not exist."
  /dc/computation_request:
    post:
      tags:
      - Data Custodian
      summary: Create a new computation request with a specified cid.
      description: "The MP calls this API endpoint to request a fairness computation identified by the cid (Computation ID) passed in. The request body contains (1) the cid, (2) the encrypted inference results, (3) the range proofs, and (4) the fairness computation details, which include the list of rids that participate in this computation as well as the offset of the survey field (in the running example, offset=30 means we're computing fairness with regard to the survey field \"retired or not\").\n

        After the DC successfully receives the request body, it first checks whether the given cid is already used and returns an error if it is. If the cid is unused, the DC loads the fairness computation details identified by the given cid from the AD. The DC then makes sure that the fairness computation details given in the POST request body match those obtained from the AD. In other words, the DC checks that the fairness computation details stored in the AD under the given cid contain the same set of rids and the same offset as the ones given in the request body. If the two sets of information don't match, it returns an error message; otherwise, it schedules the fairness computation for the batch of rids included in this computation.\n

        Because the fairness computation is executed asynchronously and might take a long time (up to 24 hours), information about this computation is maintained in the AD using the cid as the identifier. The information includes the times at which the computation was created, started, and finished; the current status of the computation; the computation result in binary; and the fairness computation details mentioned above.\n

        If the computation is successfully started, the returned string contains the timestamp of the start time of the computation. The AD and the DC database are updated with the computation information accordingly. When the computation is completed, the DC database is updated with the computation result and other metadata, while the AD is only updated with the metadata (computation finish time, etc.).\n

        If the DC encounters errors (e.g. the information obtained from the AD does not match the input information, the input cid is already used, or the database is unavailable in this DC), the related error message is returned instead of the timestamp."
      requestBody:
        description: Computation request (cid, ciphertexts, proof, and fairness computation details)
        content:
          multipart/form-data:
            schema:
              type: object
              properties:
                cid:
                  type: string
                ciphertexts:
                  type: array
                  description: Homomorphically encrypted inference results generated by the MP and sorted by rid.
                  items:
                    type: string
                    format: binary
                proof:
                  type: string
                  description: A zero-knowledge proof (ZKP) showing that all ciphertexts are well-formed. It can be either a separate proof for each ciphertext or a single aggregated proof.
                fairness_computation_details:
                  $ref: '#/components/schemas/FairnessComputationDetails'
        required: true
      responses:
        '200':
          description: OK
          content:
            text/plain:
              schema:
                type: string
                example: "Computation [cid] started at [timestamp] / error message"
        '400':
          description: Bad Request
          content:
            text/plain:
              schema:
                type: string
                example: "Error. The information obtained from the AD does not match the input information / The input cid is already used / Database unavailable in this DC"
        '409':
          description: Conflict. POST requests to this DC are frozen because its data has been dumped to another server and no new import has been performed since.
          content:
            'text/plain':
              schema:
                type: string
                example: "Error: POST requests in this DC are frozen because data in this DC has been dumped to another server but no new imports have been given."
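      # Illustrative addition, not part of the original definition. `x-codeSamples` is a
      # vendor extension (rendered by Redoc; tools that do not know it ignore it).
      # A sketch of the consistency check described above, which compares the details in
      # the request body with those stored in the AD. The function name `details_match`
      # and the dict layout (mirroring FairnessComputationDetails) are assumptions.
      x-codeSamples:
      - lang: Python
        label: Matching fairness computation details (sketch)
        source: |
          def details_match(request_details, ad_details):
              """True if both describe the same offset and the same set of rids."""
              return (request_details["offset"] == ad_details["offset"]
                      and set(request_details["rids"]) == set(ad_details["rids"]))


          # The order of rids does not matter; a different offset is a mismatch.
          assert details_match({"offset": 30, "rids": ["r1", "r2"]},
                               {"offset": 30, "rids": ["r2", "r1"]})
          assert not details_match({"offset": 30, "rids": ["r1"]},
                                   {"offset": 1, "rids": ["r1"]})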
  /dc/computation/{cid}:
    get:
      tags:
      - Data Custodian
      summary: Get the result of the computation with the specified cid.
      description: "The MP calls this endpoint to get the result of the fairness computation (a scalar) with the specified cid.\n

        The DC queries its database using the given cid as the key to retrieve the fairness computation status and (if completed) the result, and then returns it. If the computation is still running, it returns an error message."
      parameters:
      - name: cid
        in: path
        description: Computation ID
        required: true
        schema:
          $ref: '#/components/schemas/ComputationID'
      responses:
        '200':
          description: OK
          content:
            application/octet-stream:
              schema:
                $ref: '#/components/schemas/ComputationResult'
        '409':
          description: Conflict. Computation still running
          content:
            text/plain:
              schema:
                type: string
                example: 'Error: the current computation is still running.'
  /dc/dump:
    get:
      tags:
      - Data Custodian
      summary: Get a dump of all records in the database.
      description: "This is an endpoint for data backup or migration.\n

        When a request to migrate the data from the old DC to the new DC is made, the new DC calls this API endpoint of the old DC to obtain a dump of all records in the old DC's database. Specifically, this API endpoint is called by the new DC during the new DC's `POST /dc/migration_request` process.\n

        Only the DC owner (localhost) or the approved new DC can call this API. After the data dump, the POST methods of the old DC are frozen until the next import in order to avoid data discrepancies between the old and new DC."
      responses:
        '200':
          description: OK
          content:
            application/octet-stream:
              schema:
                type: string
                format: binary
                example: The binary for the dump
        '401':
          description: Unauthorized (not DC owner or the approved new DC). Only the DC owner or the approved new DC can access this API.
          content:
            text/plain:
              schema:
                type: string
                example: 'Error. Only the DC owner or the approved new DC can access this API.'
    put:
      tags:
      - Data Custodian
      summary: Overwrite the current database with records in the dump.
      description: This API endpoint is reserved for manual database backup and restore. During a migration, after the new DC has retrieved the dump from the old DC, the new DC calls this same method to overwrite its own database with the retrieved data.
      requestBody:
        content:
          application/octet-stream:
            schema:
              type: string
              format: binary
              example: The binary for the dump.
        required: true
      responses:
        '200':
          description: OK
          content: {}
  /dc/migration_approval:
    post:
      tags:
      - Data Custodian
      summary: The old DC gives approval for data migration.
      description: Call this endpoint of the old DC to record its approval of a data migration from the old DC to the new DC. Information regarding this migration is identified by the mid, which is passed in the request body. The old DC shows its approval by updating a related field in the AD.
      requestBody:
        description: Migration ID
        content:
          'text/plain':
            schema:
              $ref: '#/components/schemas/MigrationID'
        required: true
      responses:
        '200':
          description: OK
          content: {}
        '400':
          description: Bad format in the request body
          content:
            'text/plain':
              schema:
                type: string
                example: "Error updating the related field in the AD."
  /dc/migration_request:
    post:
      tags:
      - Data Custodian
      summary: The new DC requests to load the data from the old DC.
      description: "Call this endpoint of the new DC to request and process a data migration from the old DC to the new DC. After the new DC receives this API call, it first confirms the request by querying the AD: it checks that the mid is valid, that the old DC matches, and that the old DC has approved the migration request by updating the related field in the AD.\n

        If so, it calls the `GET /dc/dump` API of the old DC and loads the dump.\n

        After successfully loading the dump, the new DC registers the existence of the secret shares in the AD and updates the status of the migration."
      requestBody:
        description: Migration ID
        content:
          'text/plain':
            schema:
              $ref: '#/components/schemas/MigrationID'
        required: true
      responses:
        '200':
          description: OK
          content: {}
        '400':
          description: Bad format in request body
          content:
            'text/plain':
              schema:
                type: string
                example: "mid invalid / Dump in bad format / Error updating the related field in the AD."
        '401':
          description: Unauthorized. Old DC incorrect / Cannot find the approval from the old DC.
          content:
            'text/plain':
              schema:
                type: string
                example: "Old DC incorrect / Cannot find the approval from the old DC."
  /mp/inference_result/{rid}:
    post:
      tags:
      - Model Provider
      summary: Load inference result with specified rid.
      description: "Call this API to upload a plaintext inference result identified by a specific rid, which will be used for a future fairness computation. The inference result is in plaintext, and the MP encrypts it in a later step (namely `GET /mp/start_computation`). The inference result is an integer between 0 and 2^b, where b is a parameter.\n

        The MP stores and manages the plaintext inference results locally in memory, so no database APIs or operations are called in this endpoint.\n

        If the rid has already been used in a previous upload, the inference result is overwritten."
      parameters:
      - name: rid
        in: path
        description: Record ID
        required: true
        schema:
          $ref: '#/components/schemas/RecordID'
      requestBody:
        content:
          'application/octet-stream':
            schema:
              $ref: '#/components/schemas/InferenceResult'
        required: true
      responses:
        '200':
          description: OK
          content: {}
    delete:
      tags:
      - Model Provider
      summary: Remove an inference result with specified rid.
      description: "Call this API to remove a plaintext inference result identified by a specific rid.\n

        The MP stores and manages the plaintext inference results locally, so no database APIs or operations are called in this endpoint.\n

        If the rid does not exist, an error is returned."
      parameters:
      - name: rid
        in: path
        description: Record ID
        required: true
        schema:
          type: string
      responses:
        '200':
          description: OK
          content: {}
        '400':
          description: Bad request. The rid does not exist.
          content:
            'text/plain':
              schema:
                type: string
                example: The given rid does not exist.
  /mp/start_computation:
    get:
      tags:
      - Model Provider
      summary: Run a fairness computation for all loaded inference results.
      description: "Call this endpoint to start a round of fairness computation.\n

        The offset representing the survey field for which we want to conduct the fairness computation is given as the `offset` query parameter. In our running example, `offset=30` means we want to conduct a fairness computation regarding the last survey field: whether the survey taker has retired or not.\n

        First, the MP collects the complete list of rids for all inference results that exist locally (sent to the MP via `POST /mp/inference_result/{rid}`).\n

        Second, the MP queries the AD to retrieve the complete list of rids whose secret shares of the survey results are successfully stored in all DCs (via `POST /dc/shares/{rid}`).\n

        After obtaining both lists, the MP computes their intersection. The rids in the resulting list are the ones that participate in this round's fairness computation.\n

        It then uploads the offset for this round of computation and the list of rids that participate in this fairness computation (these two together are called *fairness computation details*) to the AD.\n

        Then it encrypts the locally stored inference results for the rids in this list and calls every DC's `POST /dc/computation_request` endpoint to request the computation, with the cid, the encrypted inference results, the range proofs, and the fairness computation details as the request body. Inside each DC's `POST /dc/computation_request`, the DC queries the AD to check whether the fairness computation details it received in the request body match the fairness computation details uploaded to the AD under the key cid. Please refer to `POST /dc/computation_request` for more details.\n

        It returns the cid for this computation if it succeeds."
      parameters:
      - name: offset
        in: query
        required: true
        schema:
          type: integer
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ComputationID'
                # type: object
                # properties:
                #   success:
                #     type: boolean
                #   cid:
                #     type: string
                #     example: Computation ID
                #   error:
                #     $ref: '#/components/schemas/Error'
                # description: Computation ID
                # example: cid
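      # Illustrative addition, not part of the original definition. `x-codeSamples` is a
      # vendor extension (rendered by Redoc; tools that do not know it ignore it).
      # A sketch of how the participating rids could be derived from the two lists
      # described above. The function name `participating_rids` and the sample rids are
      # hypothetical. Sorting reflects the "sorted by rid" note on the ciphertexts.
      x-codeSamples:
      - lang: Python
        label: Selecting participating rids (sketch)
        source: |
          def participating_rids(local_rids, stored_rids):
              """rids that have a local inference result AND shares stored in all DCs."""
              return sorted(set(local_rids) & set(stored_rids))


          local_rids = ["r1", "r2", "r3"]    # uploaded via POST /mp/inference_result/{rid}
          stored_rids = ["r2", "r3", "r4"]   # reported by the AD as stored in all DCs
          details = {"offset": 30, "rids": participating_rids(local_rids, stored_rids)}
          assert details == {"offset": 30, "rids": ["r2", "r3"]}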
  /mp/start_migration:
    get:
      tags:
      - Model Provider
      summary: Request a migration between different DCs.
      description: "Request a migration between two different DCs.\n

        Upon receiving the request, the MP uploads the migration information (timestamp, old DC name, new DC name, etc.) to the AD with the mid as the identifier. In addition to the hostname, the public key may also be included in the migration information if the host has not already been specified in the configuration file.\n

        Note that the old DC needs to call its own `POST /dc/migration_approval` endpoint to approve the migration (by updating the AD), and the new DC needs to call its own `POST /dc/migration_request` endpoint to migrate the data from the old DC."
      parameters:
      - name: old_dc_name
        in: query
        required: true
        schema:
          type: string
      - name: new_dc_name
        in: query
        required: true
        schema:
          type: string
      responses:
        '200':
          description: OK
          content: {}
            # application/json:
            #   schema:
            #     type: object
            #     properties:
            #       success:
            #         type: boolean
            #       error:
            #         $ref: '#/components/schemas/Error'

  /mp/computation/{cid}:
    get:
      tags:
      - Model Provider
      summary: Get the status for a certain computation identified by cid.
      description: "Get the status for a certain computation identified by cid.\n

        The MP queries the AD to see whether all DCs have finished their computation for this cid. If the AD indicates that all computations for the cid have finished, the MP calls every DC's `GET /dc/computation/{cid}` API endpoint to retrieve the result of that DC's computation. After receiving the result from every DC, the MP aggregates (sums up) the results and then decrypts the sum to obtain the final result of this fairness computation. If the fairness computations on all DCs are successful, the decrypted final result is returned. Otherwise, the related errors are returned."
      parameters:
      - name: cid
        in: path
        description: Computation ID
        required: true
        schema:
          $ref: '#/components/schemas/ComputationID'
      responses:
        '200':
          description: OK
          content:
            text/plain:
              schema:
                $ref: '#/components/schemas/ComputationResult'
            # application/json:
            #   schema:
            #     type: object
            #     properties:
            #       success:
            #         type: boolean
            #       computation_result:
            #         $ref: '#/components/schemas/ComputationResult'
            #       error:
            #         $ref: '#/components/schemas/Error'
components:
  schemas:
    MigrationID:
      type: string
      example: "mid"
    ComputationID:
      type: string
      example: "cid"
    RecordID:
      type: string
      example: "rid"
    Error:
      type: string
      example: If an error occurred, the message is displayed here.
    InferenceResult:
      type: string
      format: binary
    ComputationResult:
      type: string
      format: binary
      example: Computation result (in binary)
    FairnessComputationDetails:
      type: object
      properties:
        offset:
          type: integer
          format: u8
        rids:
          type: array
          items:
            type: string
    SecretShares:
      type: object
      properties:
        shares:
          type: array
          items:
            type: string
            format: base64
            # example: AQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
          example: [
              "AQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=",
              "AgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=",
              "AwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="
            ]
        # length:
        #   type: integer
        #   description: The length of one secret share scalar (in binary).
        #   example: [the length of one secret share scalar]

    SurveyResult:
      type: object
      properties:
        rid:
          type: integer
          format: int64
        attributes:
          type: array
          items:
            type: string
        timestamp:
          type: string
533 type: string