Bitcoin-Powered Database

From Bitcoin Wiki
Revision as of 04:56, 13 July 2012 by Ripper234 (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Bounty jar for this project.

As far I know, the Bitcoin blockchain is pretty much the only data structure that is both global and tamper-proof.

  • global - There is one global instance of the blockchain (up to forks ... which are then resolved).
  • tamper-proof - If a block enters the blockchain, after 6+ confirmations this block can't be feasible faked or altered.

A Bitcoin-powered database would be an API over the blockchain, that would expose a subset of the usual CRUD database operations. In fact, it would be an append-only data structure, because nothing can ever be really deleted from the blockchain - so only CREATE and READ operations will be implemented. Object Versioning will be used to emulate updates and deletions.

Use Cases

Undeletedable, Auditable Asset ledger

On May 2012, Bitcoinica was hacked, and its ledger database was deleted. If its ledger were managed on a global, tamper-proof database, this would not have been possible.

Such a ledger can be constructed in a way where each balance is encrypted by a key known only to the site operator and the owner of the balance, in order to keep the account balance private. While a Bitcoin-Powered Database might not have adequet performance characteristics to perform as the backend of most websites, it can be used as a persistent data backup, that can be independently verified by a websites users.

Implementation

Regular databases are composed of a server and a client library. In this case, the "server" is all the miner nodes, so only a client library is needed. All operations are encoded as Bitcoin transactions.

We will use a primitive PUT(message) that encodes the parameter message in the blockchain using a Block chain message service.

CREATE(guid, version, encoding, data, owner_pub_key, hash_algorithm, signature)

This operation creates a new record.

  • guid - a unique ID identifying this object, generated by the client.
  • version - an object version number.
  • encoding - enum identifying the encoding of the data (e.g. 0 for json, 1 for zipped json, ... other formats can be added if needed)
  • data - application specific. The data should be decoded according to the ***encoding*** parameter.
  • hash_algorithm - An enum specifying which cryptographic hash function is used to verify data (0 for RSA+bcrypt, ...).
  • owner_pub_key - A public key identifying the owner of this object (e.g. using RSA)
  • signature - hash over all the other parameters - e.g. (bcrypt(RSA_encrypt(private_key, (guid, version, encoding, data, hash_algorithm, owner_pub_key))).

The first time a CREATE operation is used with a particular guid, its version should be equal to 0, and any owner_pub_key can be provided. Any subsequent CREATE op with this guid must have the same owner_pub_key, and must have a version equal to 1+max(previous versions of objects with the same guid).

Any invalid CREATEs are ignored (e.g. CREATE with incorrect version or wrong signature).

It should be mentioned that CREATE operations cost some amount of Bitcoin to be "performed" by the network - the exact amount depends on the implementation of the PUT operation (The raw implementation of CREATE is just to PUT a message with the concatenation of its parameters).

READ(guid)

This operation returns an array of all the valid version of an object with this guid (or an empty array if the object does not exist). It is naively implemented by traversing the blockchain and looking for the first message with this guid, storing the owner_pub_key, and moving forward from there, identifying the valid versions and discarding invalid ones. This naive implementation is very costly (O(blockchain size)), but can be cached.

Sample implementation of a ledger

Trading websites such as Mt. Gox, GLBSE and Bitcoinica use a ledger - a system that records, for each user and asset class (BTC, USD, securities), the quantity of assets he owns at any given moment in time. Here is a simple way for such sites to implement an auditable public ledger over BPD.

  • For every user, we create both a unique encryption key and a guid. These are sent to the user immediately upon registration, and are also stored in the website's normal database.
  • For every transaction on the site, a statement comprising all of the user's assets is encoded in some format, and then written to the BPD in encrypted form, using the user's guid and encryption key.
  • At any point when the user wishes to verify his assets on the webiste, he performs a READ operation with his guid, and then decrypts his balances using his decryption key. He can then validate his current and previous account.

Note, this system does not reduce the security of the website, because all data in the BPD is encrypted (we assume the user's computer is trojan-free - if this is an issue, the user may opt-out of receiving a copy of his encryption key ... this is still preferable to not using a BPD, because at least the site's owner would have a sure way of authenticating the ledger, that can't be deleted or tampered with ... with the assumption that the guids and encryption keys are secure and backed up).