# Social Graph Database

Aggregation and indexing are key features of social networks. Aggregation is necessary for tracking global interactions such as likes and follows, while indexing is needed to construct feeds, run search queries, manage spam, and otherwise interpret the social graph database. Since we cannot remove these centralized roles, the next best thing is to ensure that **anyone can perform them with algorithmic transparency**. In other words, any node should be able to verifiably contribute to a function: resume execution from the last checkpoint, perform an operation, and produce a proof of that operation.

> Self-authenticating data provides a scalability advantage by enabling store-and-forward caches. Aggregators in a self-authenticating network can host data on behalf of smaller providers without reducing trust in the data's authenticity. With verifiable computation, these aggregators will even be able to produce computed views – metrics, follow graphs, search indexes, and more – while still preserving the trustworthiness of the data. This topological flexibility is key for creating global views of activity from many different origins.
>
> source: [Bluesky blog](https://blueskyweb.xyz/blog/3-6-2022-a-self-authenticating-social-protocol)

### User data stores <a href="#user-data-stores" id="user-data-stores"></a>

In a decentralized social network, users can host their databases on their own infrastructure. This forms a federated network of servers hosting user information, where no ordering is assumed between the data stores.

Each data store is identified by a keypair. The keypair signs the data store root on each mutation, proving the authenticity of every update.
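As a minimal sketch of this idea: the store recomputes its root hash on every mutation and re-signs it. Note that the "signature" here is a toy keyed hash standing in for a real public-key scheme such as Ed25519, and the `DataStore` type and its methods are illustrative, not part of any Proxima API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy "signature": a keyed hash standing in for a real
/// public-key signature (e.g. Ed25519) over the root.
fn sign(root: u64, secret_key: u64) -> u64 {
    let mut h = DefaultHasher::new();
    root.hash(&mut h);
    secret_key.hash(&mut h);
    h.finish()
}

/// A user data store: on every mutation it recomputes its root
/// and re-signs it, so consumers can verify each update.
struct DataStore {
    secret_key: u64,
    records: Vec<String>,
    root: u64,
    signature: u64,
}

impl DataStore {
    fn new(secret_key: u64) -> Self {
        let mut ds = DataStore { secret_key, records: Vec::new(), root: 0, signature: 0 };
        ds.resign();
        ds
    }

    /// Append a record, then refresh the signed root.
    fn mutate(&mut self, record: String) {
        self.records.push(record);
        self.resign();
    }

    fn resign(&mut self) {
        let mut h = DefaultHasher::new();
        self.records.hash(&mut h);
        self.root = h.finish();
        self.signature = sign(self.root, self.secret_key);
    }

    /// Check that `signature` commits to `root` under `key`.
    /// (With a real scheme this would take a public key.)
    fn verify(root: u64, signature: u64, key: u64) -> bool {
        sign(root, key) == signature
    }
}

fn main() {
    let mut ds = DataStore::new(42);
    ds.mutate("follow:alice".to_string());
    println!("root={} verified={}", ds.root, DataStore::verify(ds.root, ds.signature, 42));
}
```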

### Use-case: Aggregated "follow" relations <a href="#use-case-aggregated-follow-relations" id="use-case-aggregated-follow-relations"></a>

It would be valuable to construct a social-graph database of “follow” relations between users' data stores. This database is a secondary index; the user data stores are the primary/canonical store. Our aggregator will watch some set of user data stores for changes to their followers, and then update its secondary index.

#### Indexer for queries <a href="#indexer-for-queries" id="indexer-for-queries"></a>

A few useful queries (reads) to include are:

```rust
- isFollowing(from: DatastoreID, to: DatastoreID) -> boolean
- listFollowers(to: DatastoreID) -> DatastoreID[]
- listFollows(from: DatastoreID) -> DatastoreID[]
- countFollowers(to: DatastoreID) -> number
- countFollows(from: DatastoreID) -> number
```
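A minimal in-memory sketch of such a secondary index, with method names mirroring the queries above. `DatastoreId` is simplified to a string, and the edge set is kept in both directions so every query is a single map lookup; this is illustrative only, not the Maru indexer itself.

```rust
use std::collections::{HashMap, HashSet};

type DatastoreId = String;

/// Secondary index over "follow" edges, stored in both
/// directions so reads never scan the whole graph.
#[derive(Default)]
struct FollowIndex {
    follows: HashMap<DatastoreId, HashSet<DatastoreId>>,   // from -> set of to
    followers: HashMap<DatastoreId, HashSet<DatastoreId>>, // to -> set of from
}

impl FollowIndex {
    /// Record a follow edge in both directions.
    fn insert(&mut self, from: &str, to: &str) {
        self.follows.entry(from.to_string()).or_default().insert(to.to_string());
        self.followers.entry(to.to_string()).or_default().insert(from.to_string());
    }

    fn is_following(&self, from: &str, to: &str) -> bool {
        self.follows.get(from).map_or(false, |s| s.contains(to))
    }

    fn list_followers(&self, to: &str) -> Vec<DatastoreId> {
        self.followers.get(to).map_or(Vec::new(), |s| s.iter().cloned().collect())
    }

    fn list_follows(&self, from: &str) -> Vec<DatastoreId> {
        self.follows.get(from).map_or(Vec::new(), |s| s.iter().cloned().collect())
    }

    fn count_followers(&self, to: &str) -> usize {
        self.followers.get(to).map_or(0, |s| s.len())
    }

    fn count_follows(&self, from: &str) -> usize {
        self.follows.get(from).map_or(0, |s| s.len())
    }
}

fn main() {
    let mut idx = FollowIndex::default();
    idx.insert("alice", "bob");
    idx.insert("carol", "bob");
    println!("bob has {} followers", idx.count_followers("bob"));
}
```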

To handle queries, we construct an [indexer](https://maru.docs.proxima.one/use-cases/indexer.html). Frank McSherry describes how to translate these queries into operators for differential dataflow in a graph database [here](https://github.com/frankmcsherry/blog/blob/master/posts/2016-07-17.md#an-example). The aggregator includes a proof for each operator used in the query.

#### Provable mutations <a href="#provable-mutations" id="provable-mutations"></a>

In response to updates in the user data stores, the aggregator must mutate its secondary database to match:

```rust
- addFollow(from: DatastoreID, to: DatastoreID, prf: Proof)
- removeFollow(from: DatastoreID, to: DatastoreID, prf: Proof)
```

Just as with queries, we write operators that react to these updates and include a proof for each operator. Additionally, the aggregator includes a proof of authenticity for the update (i.e. that it was signed by the data store's keypair).
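A hedged sketch of the mutation side is below. The `Proof` type and `verify_proof` check are placeholders for the real authenticity proof, which in practice would verify the data store keypair's signature over the update; the edge set is held in both directions so reads stay cheap.

```rust
use std::collections::{HashMap, HashSet};

type DatastoreId = String;

/// Placeholder for the authenticity proof carried by each
/// update; a real proof would contain the store's signature.
struct Proof {
    valid: bool,
}

/// Stand-in for checking the update's signature against the
/// originating data store's public key.
fn verify_proof(prf: &Proof) -> bool {
    prf.valid
}

#[derive(Default)]
struct Aggregator {
    follows: HashMap<DatastoreId, HashSet<DatastoreId>>,
    followers: HashMap<DatastoreId, HashSet<DatastoreId>>,
}

impl Aggregator {
    /// Apply addFollow only if the update's proof verifies.
    /// Returns whether the mutation was accepted.
    fn add_follow(&mut self, from: &str, to: &str, prf: &Proof) -> bool {
        if !verify_proof(prf) {
            return false;
        }
        self.follows.entry(from.to_string()).or_default().insert(to.to_string());
        self.followers.entry(to.to_string()).or_default().insert(from.to_string());
        true
    }

    /// Apply removeFollow, deleting the edge in both directions.
    fn remove_follow(&mut self, from: &str, to: &str, prf: &Proof) -> bool {
        if !verify_proof(prf) {
            return false;
        }
        if let Some(s) = self.follows.get_mut(from) {
            s.remove(to);
        }
        if let Some(s) = self.followers.get_mut(to) {
            s.remove(from);
        }
        true
    }
}

fn main() {
    let mut agg = Aggregator::default();
    agg.add_follow("alice", "bob", &Proof { valid: true });
    println!("edge present: {}", agg.follows["alice"].contains("bob"));
}
```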

### Recursive proofs <a href="#recursive-proofs" id="recursive-proofs"></a>

The aggregator produces a proof for each operator. As the number of queries and mutations grows, the number of operator proofs quickly becomes too large for a client to verify. To reduce the number of proofs, we use the [proof pipeline](https://docs.proxima.one/proxima/how-maru-works/architecture#proof-pipeline) to prove them all in parallel and merge all of the proofs together.
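The shape of that aggregation can be sketched as a pairwise reduction: merge proofs two at a time, layer by layer, until one root proof remains, so a client verifies a single proof instead of one per operator. In this toy version a "proof" is just a hash and merging is hashing; the real pipeline composes succinct proofs recursively, but the tree structure (and the per-layer parallelism) is the same.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy merge of two "proofs" (hashes). In the real pipeline this
/// is recursive proof composition of two operator proofs.
fn merge(a: u64, b: u64) -> u64 {
    let mut h = DefaultHasher::new();
    a.hash(&mut h);
    b.hash(&mut h);
    h.finish()
}

/// Reduce a batch of operator proofs to a single root proof.
/// Each layer's merges are independent, so a real pipeline can
/// run them in parallel; an odd proof is carried up unmerged.
fn aggregate(mut proofs: Vec<u64>) -> Option<u64> {
    if proofs.is_empty() {
        return None;
    }
    while proofs.len() > 1 {
        proofs = proofs
            .chunks(2)
            .map(|pair| if pair.len() == 2 { merge(pair[0], pair[1]) } else { pair[0] })
            .collect();
    }
    Some(proofs[0])
}

fn main() {
    let root = aggregate(vec![1, 2, 3, 4]);
    println!("root proof: {:?}", root);
}
```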

