This repository has been archived by the owner on May 12, 2021. It is now read-only.

A few questions/thoughts #4

Open
whyrusleeping opened this issue Mar 22, 2017 · 14 comments

Comments

@whyrusleeping

Hey, this is a great initiative 👍 for making opaque strings self-describing :)

@kumavis asked @diasdavid and me on IRC to take a look at this and give some feedback, so i figured i'd start an issue for discussion.

It seems what we're really doing here is coming up with a scheme for encoding PKI identities (an ethereum address being roughly the hash of a public key).

Some initial things i want to consider:

  • What would an address on bitcoin look like? How would that be represented?
  • How difficult would it be to represent an ipfs peer ID with this format?
  • Do you think the checksum could be optional? i.e. allow un-checksummed mnids ?
  • The version byte should probably be a varint; this allows future extensibility
  • The network IDs should be in a table somewhere, that way people using this spec don't overlap on their selections
  • Using a segment of the genesis block hash seems like a nice idea, but i would be worried about the possibility of a collision if this spec becomes more adopted. A global table seems the better approach
  • The hash function used for the checksum at the end should be specified by the version (version 1 == sha3, version 2 maybe something different if sha3 breaks?). Hardcoding these things always makes me nervous
  • Consider using multibase instead of strictly base58. Some applications might want a case insensitive encoding (like filenames on non-linux OSes)

@daviddias

One of the other questions we had during this discussion was:

  • What about network hard forks? How can we make sure the two networks stay distinct if the ID is derived from the genesis block?

@kumavis

kumavis commented Mar 22, 2017

@whyrusleeping worth noting that the public key may not be available, either because:

  • there isn't one (e.g. a contract's address is just pseudo-random) or
  • the account hasn't made a signature public yet (e.g. it hasn't signed a transaction).

@kumavis

kumavis commented Mar 22, 2017

one other note from IRC convo was:

the 1 byte version prefix can be a varint pretty easily
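
As a rough illustration of the varint idea, here is a minimal sketch of an unsigned varint in the LEB128 style (7 payload bits per byte, high bit set on every byte except the last). The function names are made up for this example and are not part of MNID; the assumption is simply that the version prefix would be encoded this way.

```python
from typing import Tuple


def encode_uvarint(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, least significant group first."""
    if n < 0:
        raise ValueError("unsigned varint cannot encode negative numbers")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)         # final byte: high bit clear
            return bytes(out)


def decode_uvarint(data: bytes) -> Tuple[int, int]:
    """Return (value, number_of_bytes_consumed) from the start of data."""
    value = 0
    shift = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")
```

With this encoding every value below 128 is still a single byte with the same numeric value, so a version prefix of 1 stays `0x01` and only larger version numbers grow to two or more bytes.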

@whyrusleeping
Author

Some further related reading:
https://github.com/WebOfTrustInfo/rebooting-the-web-of-trust-fall2016/blob/master/draft-documents/DID-Spec-Implementers-Draft-01.pdf

Not necessarily what you want to do, but definitely a worthwhile read in the same thought space

@whyrusleeping
Author

@kumavis good point about it not necessarily being public keys. I think they also took that into account in the DID spec i linked.

@coder5876

Thanks for the input @whyrusleeping @diasdavid! I think it would be exciting if we could have a "multiaddress" for blockchain addresses. As for a list of IDs, this list here is pretty well used with BIP32 wallets:

https://github.com/satoshilabs/slips/blob/master/slip-0044.md

We expect there to be a ton of private networks where we should probably derive the ID (4 bytes right now) from the genesis block hash or similar intrinsic data, so it's a bit unclear how to decide whether to use a simple number or the 4 bytes.

@diasdavid The decision to use the block hash was because we expect hard forks to mainly be meaningful in large public networks that would have a code in the official list anyway. For small private chains it seems less likely that there will be hard forks, but I may also be wrong about this.

@whyrusleeping
Author

@christianlundkvist my issue with using some hash for the ID is that, since you said there will be "a ton of private networks", the birthday paradox gives us a 50% chance of a collision in the 32-bit integer space once we have around 80,000 different IDs.
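
The birthday estimate can be checked with the usual approximation p ≈ 1 - exp(-n(n-1) / 2N), where N is the size of the ID space. This is a small sketch with made-up function names, assuming IDs are uniformly distributed over the 32-bit space (which is what taking 4 bytes of a hash would give you).

```python
import math


def collision_probability(n_ids: int, space_bits: int = 32) -> float:
    """Approximate chance that at least two of n_ids uniform IDs collide."""
    space = 2 ** space_bits
    return 1.0 - math.exp(-n_ids * (n_ids - 1) / (2 * space))


def ids_for_probability(p: float, space_bits: int = 32) -> float:
    """Approximate number of IDs at which the collision chance reaches p."""
    space = 2 ** space_bits
    return math.sqrt(2 * space * math.log(1 / (1 - p)))


if __name__ == "__main__":
    print(round(ids_for_probability(0.5)))           # 50% point for 32-bit IDs
    print(round(collision_probability(80_000), 3))   # chance at 80,000 IDs
```

Under this approximation the 50% point for 32-bit IDs lands at roughly 77,000 networks, so "around 80,000" is the right order of magnitude; by 80,000 the collision chance is already slightly above one half.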

@coder5876

Good point @whyrusleeping! I'm thinking that if we have a version byte then hash lengths could be increased in the future. I also think if we have ~100k different networks maintaining a global table would be pretty difficult. Hmm, tricky... 🤔

@whyrusleeping
Author

@christianlundkvist I think maintaining a global table will be less work than it seems. The way i see this working easily is groups can allocate ranges of the table to themselves, and then define their more concrete meanings locally, and if they feel compelled, PR the specifics back to the main table.

For example, ethereum could grab the range 0x000040 to 0x000100, and the global table would note that those are ethereum IDs. The ethereum group can then define its own subtable within that allocated range to be whatever they want. It's a kind of tiered consensus/partitioned locking approach.
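
A tiny sketch of how such a tiered table might be looked up. Only the ethereum range comes from the example above; the second range, the subtable entries, and all names here are hypothetical placeholders, not real assignments.

```python
import bisect
from typing import Optional

# Global table: sorted, non-overlapping (first_id, last_id_inclusive, owner).
# The "example-group" range is invented purely for illustration.
ALLOCATIONS = [
    (0x000040, 0x000100, "ethereum"),
    (0x000101, 0x0001FF, "example-group"),
]
_STARTS = [first for first, _, _ in ALLOCATIONS]

# Local subtable maintained by the owning group (hypothetical contents).
ETHEREUM_SUBTABLE = {
    0x000040: "hypothetical-mainnet",
    0x000041: "hypothetical-testnet",
}


def owner_of(network_id: int) -> Optional[str]:
    """Return the group that owns network_id, or None if it is unallocated."""
    i = bisect.bisect_right(_STARTS, network_id) - 1
    if i < 0:
        return None
    first, last, owner = ALLOCATIONS[i]
    return owner if first <= network_id <= last else None
```

The global table only needs to change when a group claims a new range; everything inside a range, such as `ETHEREUM_SUBTABLE` here, can be changed by its owner without touching the shared table.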

@jashmenn

jashmenn commented Jun 17, 2017

I've been looking to do the same thing and this project is very close, but I agree w/ @whyrusleeping that it's worth considering a global table. It will be necessary to keep some sort of table anyway. E.g. the upcoming BTC fork comes to mind: both chains share a genesis block, so a cross-network identifier needs to store the post-fork identifiers.

It seems to me that you could essentially accomplish this by using multihash with a different set of constants. If you:

  • use a varint network id / network code
  • use a varint size
  • drop the version number
  • drop the checksum
  • use multibase

You would essentially have multihash but instead of labeling the function that created a digest, you're labeling an address as belonging to a network.
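
To make the layout concrete, here is a minimal sketch of the byte format described above: `<varint network code><varint size><address bytes>`, with no version and no checksum. The function names and the network code used in the usage note are assumptions for illustration; the multibase text encoding is left out.

```python
from typing import Tuple


def _uvarint(n: int) -> bytes:
    """Unsigned varint: 7 bits per byte, high bit set on all but the last byte."""
    if n < 0:
        raise ValueError("negative values are not supported")
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def _read_uvarint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read a varint at buf[pos:], returning (value, next_position)."""
    value = shift = 0
    while pos < len(buf):
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7
    raise ValueError("truncated varint")


def encode_addr(network_code: int, address: bytes) -> bytes:
    """Label raw address bytes with the network they belong to."""
    return _uvarint(network_code) + _uvarint(len(address)) + address


def decode_addr(blob: bytes) -> Tuple[int, bytes]:
    """Split a labeled address back into (network_code, address bytes)."""
    code, pos = _read_uvarint(blob, 0)
    size, pos = _read_uvarint(blob, pos)
    address = blob[pos:pos + size]
    if len(address) != size or pos + size != len(blob):
        raise ValueError("length prefix does not match payload")
    return code, address
```

For example, `encode_addr(0x40, bytes(20))` yields the two prefix bytes `0x40 0x14` followed by the 20 address bytes, and `decode_addr` recovers the pair. The length check stands in only loosely for the dropped checksum: it catches truncation, not typos.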

It doesn't even seem necessary to me to store the type of address on that particular network. That is you wouldn't necessarily need to store in this library the distinction between a Bitcoin P2PKH vs. P2SH address. It seems to me that the network identifier would suffice. It's enough to know that an address is intended for e.g. the ethereum test network vs. production.

That said, I realize this may be too big of a change for a specification you're already using. I might just fork multihash for now, because while this idea of labeled bytes is quite similar, multihash seems focused on labeling the output of a specific type of function.

@whyrusleeping
Author

whyrusleeping commented Jun 17, 2017 via email

@jashmenn

My main motivation is to have a protocol that allows self-describing addresses, specifically to be used for sending value (tokens, filecoin, btc).

So in the context of this library mnid (or something else we create? Maybe named multichainaddr?), I'm not sure we should store the protocol-specific scoping (p2pkh, p2sh, etc.) in this identifier. My first thought is that we leave that distinction to the software that ends up performing the value transfer.

If we did include it here, you either have to:

  • store more bytes that specify the scoping (e.g. 4 bytes for the SegWit BTC fork and 4 bytes for a p2sh address) or
  • implement the functions to parse and distinguish the scopings for every network

In the first case, we'd end up maintaining a secondary fixed list of scopings, but the bigger issue is that the scope would be self-identified and potentially open to error. (That is, marking an address scope as something it actually isn't). In the second case, it seems like a lot of work to reimplement these.

That said, for my purposes anyway, differentiating scopings here isn't one of my goals. My vision is to have a "multichainaddr" value, much like multihash, that given the value and the protocol you know what network and address to send token-value to.

Earlier, you asked about IPFS Peer IDs -- It's tricky to find the right abstraction because an "address" could mean a lot of different things: is it a transformation of a public key? is it a human-readable string identifier? an IPFS peer ID? (Or even an email address that has a valid PayPal account, if you want to go there.)

My needs are more around establishing a protocol for transmitting token-value and less about identifying any possible hash-like address. But there doesn't seem to be any real technical reason why you couldn't include a network-identifier for IPFS Peer IDs. Though if that were the case, instead of a name like multichainaddr, a name like multihashaddr may be a better fit (or maybe it would just be confusing with multihash and multiaddr already in the mix).

@whyrusleeping
Author

@jashmenn in the case of bitcoin (or any system that uses bitcoin style scripting) you must encode whether it's a p2pkh or p2sh, otherwise you cannot properly create the transaction (the output script differs between the two).
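
To illustrate the point, here is a minimal sketch of the two standard Bitcoin output script templates built from the same 20-byte hash. The `output_script` helper and its `kind` strings are invented for this example; the opcode values are the standard Bitcoin script ones.

```python
# Standard Bitcoin script opcodes used by the two templates.
OP_DUP = 0x76
OP_HASH160 = 0xA9
OP_EQUAL = 0x87
OP_EQUALVERIFY = 0x88
OP_CHECKSIG = 0xAC


def output_script(kind: str, hash160: bytes) -> bytes:
    """Build the output script for a 20-byte hash, given the address type."""
    if len(hash160) != 20:
        raise ValueError("expected a 20-byte hash")
    push = bytes([20]) + hash160  # push-20-bytes opcode followed by the hash
    if kind == "p2pkh":
        # OP_DUP OP_HASH160 <pubkey hash> OP_EQUALVERIFY OP_CHECKSIG
        return bytes([OP_DUP, OP_HASH160]) + push + bytes([OP_EQUALVERIFY, OP_CHECKSIG])
    if kind == "p2sh":
        # OP_HASH160 <script hash> OP_EQUAL
        return bytes([OP_HASH160]) + push + bytes([OP_EQUAL])
    raise ValueError(f"unknown address type: {kind}")
```

The same 20 bytes produce a 25-byte script for p2pkh and a 23-byte script for p2sh, so the hash and the network alone do not tell the sender which output to create.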

> differentiating scopings here isn't one of my goals.

I really think it is, actually. You want to select the scope of which network/blockchain/fork-of-that-blockchain/other-variants so you can be as unambiguous as possible.

@felixwatts

Would it make sense to prefix with a scheme so that the MNID is a URI? e.g.

mnid:2nQtiQG6Cgm1GYTBaaKAgr76uY7iSexUkqX
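
A small sketch of what wrapping and unwrapping such a URI could look like. The helper names are hypothetical and the `mnid` scheme is only the proposal above, not a registered scheme; the payload is treated as an opaque string and is not validated as an MNID.

```python
SCHEME = "mnid"


def to_uri(mnid: str) -> str:
    """Prefix an encoded MNID string with the proposed scheme."""
    return f"{SCHEME}:{mnid}"


def from_uri(uri: str) -> str:
    """Strip the scheme and return the encoded MNID string."""
    scheme, sep, rest = uri.partition(":")
    # URI schemes are case-insensitive, so "MNID:" is accepted too.
    if not sep or scheme.lower() != SCHEME or not rest:
        raise ValueError("not an mnid URI")
    return rest
```

Only the scheme is compared case-insensitively; the base58 payload is case-sensitive and is passed through unchanged.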


6 participants