| Internet Engineering Task Force G. D'Amore, Ed. |
| Internet-Draft |
| Intended status: Informational October 26, 2016 |
| Expires: April 29, 2017 |
| |
| |
| ZeroTier Mapping for Scalability Protocols |
| sp-zerotier-mapping-01 |
| |
| Abstract |
| |
| This document defines the ZeroTier mapping for scalability protocols. |
| |
| Status of This Memo |
| |
| This Internet-Draft is submitted in full conformance with the |
| provisions of BCP 78 and BCP 79. |
| |
| Internet-Drafts are working documents of the Internet Engineering |
| Task Force (IETF). Note that other groups may also distribute |
| working documents as Internet-Drafts. The list of current Internet- |
| Drafts is at http://datatracker.ietf.org/drafts/current/. |
| |
| Internet-Drafts are draft documents valid for a maximum of six months |
| and may be updated, replaced, or obsoleted by other documents at any |
| time. It is inappropriate to use Internet-Drafts as reference |
| material or to cite them other than as "work in progress." |
| |
| This Internet-Draft will expire on April 29, 2017. |
| |
| Copyright Notice |
| |
| Copyright (c) 2016 IETF Trust and the persons identified as the |
| document authors. All rights reserved. |
| |
| This document is subject to BCP 78 and the IETF Trust's Legal |
| Provisions Relating to IETF Documents |
| (http://trustee.ietf.org/license-info) in effect on the date of |
| publication of this document. Please review these documents |
| carefully, as they describe your rights and restrictions with respect |
| to this document. Code Components extracted from this document must |
| include Simplified BSD License text as described in Section 4.e of |
| the Trust Legal Provisions and are provided without warranty as |
| described in the Simplified BSD License. |
| |
| D'Amore Expires April 29, 2017 [Page 1] |
| |
| Internet-Draft ZeroTier mapping for SPs October 2016 |
| |
| |
| 1. Underlying protocol |
| |
| ZeroTier presents an 802.3-style layer 2 network, where frames may be |
| exchanged as if they were Ethernet frames. Virtual broadcast |
| domains are created within a numbered "network", and frames may then |
| be exchanged with any peers on that network. |
| |
| Frames may arrive in any order, or be lost, just as with Ethernet |
| (best effort delivery), but they are strongly protected by a |
| cryptographic checksum, so frames that do arrive will be uncorrupted. |
| Furthermore, ZeroTier guarantees that a given frame will be received |
| at most once. |
| |
| Each application on a ZeroTier network has its own address, called a |
| ZeroTier ID (ZTID), which is globally unique -- this is generated |
| from a hash of the public key associated with the application. |
| |
| A given application may participate in multiple ZeroTier networks. |
| |
| We assume each nanomsg application will have its own ZeroTier ID, |
| and will not use more than one. Management of these IDs, as well as |
| the underlying key pairs, is out of scope of this document. |
| |
| ZeroTier networks have a maximum payload MTU of 2800 bytes. However, |
| they also have an "optimum" MTU, based upon the underlying networks |
| (typically UDP) and overheads that are used to exchange such packets. |
| For our purposes we will assume this to be approximately 1400 bytes. |
| These values can change on different networks. |
| |
| 2. Packet layout |
| |
| Each nanomsg message sent over ZeroTier consists of one or more |
| fragments, where each fragment is mapped to a single underlying |
| ZeroTier L2 frame. We use the EtherType value 0x0901 to indicate the |
| nanomsg over ZeroTier protocol (number to be registered with IEEE). |
| |
| Each frame shall be prepended with the following header: |
| |
|  0                   1                   2                   3    |
|  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1  |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| |              destination mac address (bytes 0-3)              | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| |  destination mac (bytes 4-5)  |    source mac (bytes 0-1)     | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| |                source mac address (bytes 2-5)                 | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| |     0x09      |     0x01      |  op   | flags |    version    | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| |        fragment offset        |        fragment length        | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| |       destination port        |          source port          | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| |             type              |          message ID           | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| |     payload... |
| +-+-+-+-+-+-+-+- |
| |
| |
| All numeric fields are in big-endian byte order. |
| |
| As shown above, each frame starts like a normal Ethernet frame, with |
| destination, source, and an EtherType of 0x0901. |
| |
| The op is a field that indicates the type of message being sent. The |
| following values are defined: DATA (0), CONN (1), DISC (2), PING (3), |
| and ERR (4). These are discussed further below. Implementations |
| MUST discard messages where the op is not one of these. |
| |
| There are two flags defined. The first is MF (1), which indicates |
| that a message is fragmented, and more fragments follow. This flag |
| may only be set on DATA messages. The last fragment of a message |
| will not have this flag set. |
| |
| The second flag is AK (2), which indicates that a given message is a |
| reply to an earlier message. This is only valid for the CONN, DISC, |
| and PING message types. |
| |
| Note that the MF and the AK flag bits are mutually exclusive. |
| |
| The version byte MUST be set to 0x1. Implementations MUST discard |
| any messages received for any other version. |
| |
| The fragment length and offset are given in terms of octets, and only |
| include the payload. For example, the first fragment of a message |
| bearing a 2000 byte payload, itself only carrying 1400 bytes of that |
| payload would have the MF bit set in flags, offset 0, and length |
| 1400. The second fragment would have the MF bit clear, length 600, |
| and offset 1400. |
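| |
| The fragmentation rule above can be sketched as follows. This is a |
| minimal illustration, not a reference implementation; the function |
| name fragment() and the default fragment size of 1400 octets (the |
| "optimum" MTU assumed earlier) are choices made for this example. |

```python
def fragment(payload: bytes, frag_size: int = 1400):
    """Split a payload into (offset, length, more_fragments, chunk) tuples.

    frag_size is an assumed tuning value; the only hard limits taken from
    the text are the 65535-octet message limit and the 2800-octet frame.
    """
    if len(payload) > 65535:
        raise ValueError("payload exceeds the 65535-octet limit")
    if not 0 < frag_size < 2800:
        raise ValueError("fragment size must be between 1 and 2799")
    fragments = []
    offset = 0
    while True:
        chunk = payload[offset:offset + frag_size]
        # MF is set on every fragment except the last one.
        more = offset + len(chunk) < len(payload)
        fragments.append((offset, len(chunk), more, chunk))
        offset += len(chunk)
        if not more:
            return fragments
```

| For a 2000 byte payload this yields two fragments: offset 0 with |
| length 1400 and MF set, then offset 1400 with length 600 and MF |
| clear. |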
| |
| As a single fragment cannot exceed the size of a ZeroTier frame, the |
| high order four bits of the fragment length MUST be zero, and the |
| value encoded MUST be less than 2800. |
| |
| Each fragment for a given message must carry the same message ID. |
| Implementations MUST initialize this to a random value, and MUST |
| increment this each time a new message is sent. |
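| |
| A possible message ID source is sketched below. The class name is |
| hypothetical, and wrapping modulo 2^16 is an assumption based on the |
| message ID sharing a 32-bit header row with the type field; the text |
| above does not state what happens on overflow. |

```python
import secrets


class MessageIdGenerator:
    """Hands out message IDs: random start, then +1 per new message."""

    def __init__(self):
        # Random initial value, as required for the message ID.
        self._next = secrets.randbelow(1 << 16)

    def next_id(self) -> int:
        current = self._next
        # Assumption: the counter wraps within a 16-bit field.
        self._next = (self._next + 1) & 0xFFFF
        return current
```

| All fragments of one message would reuse the value returned by a |
| single next_id() call. |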
| |
| The port fields are used to discriminate different uses, allowing one |
| application to have multiple connections or sockets open. The |
| purpose is analogous to TCP port numbers, except that instead of the |
| operating system performing the discrimination the application or |
| library code must do so. |
| |
| The type field is the numeric SP protocol ID, in big-endian form. |
| When receiving a message for a port, if the SP protocol ID does not |
| match the SP protocol expected on that port, the implementation MUST |
| discard this message. |
| |
| The maximum payload size, and hence the maximum SP message that may |
| be transmitted using this transport, is 65535 octets. |
| Implementations are encouraged to restrict this further. |
| |
| Note that it is not by accident that the payload is 32-bit aligned in |
| this message format. |
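| |
| The header layout and the discard rules described in this section |
| can be sketched with Python's struct module. The names below are |
| illustrative only. The sketch assumes that op occupies the high four |
| bits and flags the low four bits of the octet preceding the version |
| byte; that split is an assumption of this example. |

```python
import struct

# dst mac, src mac, EtherType, op/flags, version, fragment offset,
# fragment length, destination port, source port, type, message ID
HEADER = struct.Struct("!6s6sHBBHHHHHH")
ETHERTYPE = 0x0901
VERSION = 0x01
OP_DATA, OP_CONN, OP_DISC, OP_PING, OP_ERR = range(5)
FLAG_MF, FLAG_AK = 0x1, 0x2


def pack_header(dst_mac, src_mac, op, flags, offset, length,
                dst_port, src_port, sp_type, msg_id):
    # Assumed packing: op in the high nibble, flags in the low nibble.
    return HEADER.pack(dst_mac, src_mac, ETHERTYPE, (op << 4) | flags,
                       VERSION, offset, length, dst_port, src_port,
                       sp_type, msg_id)


def parse_header(frame):
    """Return a dict of header fields, or None if the frame is discarded."""
    if len(frame) < HEADER.size:
        return None
    (dst, src, ethertype, op_flags, version, offset, length,
     dst_port, src_port, sp_type, msg_id) = HEADER.unpack_from(frame)
    op, flags = op_flags >> 4, op_flags & 0x0F
    if ethertype != ETHERTYPE or version != VERSION:
        return None
    if op not in (OP_DATA, OP_CONN, OP_DISC, OP_PING, OP_ERR):
        return None
    if flags & FLAG_MF and flags & FLAG_AK:
        return None  # MF and AK are mutually exclusive
    if flags & FLAG_MF and op != OP_DATA:
        return None  # MF is only valid on DATA
    if flags & FLAG_AK and op not in (OP_CONN, OP_DISC, OP_PING):
        return None  # AK is only valid on CONN, DISC, and PING
    if length >= 2800:
        return None  # a fragment cannot exceed a ZeroTier frame
    return {"dst": dst, "src": src, "op": op, "flags": flags,
            "offset": offset, "length": length, "dst_port": dst_port,
            "src_port": src_port, "type": sp_type, "msg_id": msg_id,
            "payload": frame[HEADER.size:]}
```

| HEADER.size is 28 octets, a multiple of four, which is why the |
| payload that follows is 32-bit aligned. |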
| |
| Source and destination MAC addresses shall be constructed |
| algorithmically from the relevant ZeroTier IDs. |
| |
| Note that at this time, broadcast and multicast are not supported by |
| this mapping. (A future update may resolve this.) |
| |
| 3. DATA messages |
| |
| DATA messages carry SP protocol payload data. They can only be sent |
| on an established session (see CONN messages below), and are never |
| acknowledged. |
| |
| 4. CONN messages |
| |
| CONN frames represent a session establishment. They allow a peer to |
| advertise its port number to a remote peer, and to verify that a peer |
| is responsive. The payload for the CONN frame is a 4 byte (big- |
| endian) value, consisting of the SP protocol ID of the sender. |
| |
| The connection is initiated by the initiator sending this message, |
| with its own SP protocol ID. The AK flag will in this case be clear. |
| |
| The responder will acknowledge this by replying with its SP protocol |
| ID in the 4-byte payload, with the AK flag set. |
| |
| Alternatively, a responder may reject the connection attempt by |
| sending a suitably formed ERR message (see below). |
| |
| If a sender does not receive a reply, it SHOULD retry this message |
| before giving up and reporting an error to the user. |
| |
| If a CONN frame is received for a session that already exists, the |
| receiver MUST reply. The CONN request is idempotent. |
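| |
| A responder's handling of a CONN request might look like the sketch |
| below. The function name, the listeners and sessions dictionaries, |
| and the session key are all hypothetical; the error codes used are |
| those defined in the ERR messages section. |

```python
import struct

OP_CONN, OP_ERR = 1, 4
FLAG_AK = 0x2
ERR_NO_LISTENER = 0x01
ERR_PROTOCOL = 0x04


def handle_conn(listeners, sessions, peer, dst_port, src_port, payload):
    """Return the (op, flags, payload) reply for a CONN request.

    listeners maps a local port to the local SP protocol ID; sessions
    maps (peer, peer port, local port) to the peer's SP protocol ID.
    """
    if len(payload) != 4:
        return (OP_ERR, 0, bytes([ERR_PROTOCOL]) + b"bad CONN payload")
    local_proto = listeners.get(dst_port)
    if local_proto is None:
        return (OP_ERR, 0, bytes([ERR_NO_LISTENER]) + b"no listener")
    (peer_proto,) = struct.unpack("!I", payload)
    # Idempotent: a repeated CONN for an existing session keeps the
    # session as it is and simply gets the same reply again.
    sessions.setdefault((peer, src_port, dst_port), peer_proto)
    return (OP_CONN, FLAG_AK, struct.pack("!I", local_proto))
```

| Because the reply does not depend on whether the session already |
| exists, a retried CONN from the initiator is harmless. |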
| |
| 5. DISC messages |
| |
| DISC messages are used to request a session be terminated. This |
| notifies the remote sender that no more data will be sent or |
| accepted, and the session resources may be released. There is no |
| payload. The party closing the session sends this with the AK flag |
| clear. There is no acknowledgement. |
| |
| 6. PING messages |
| |
| In order to keep session state, implementations will generally store |
| data for each session. In order to prevent a stale session from |
| consuming these resources forever, and in order to keep underlying |
| ZeroTier sessions alive, a PING message may be sent. This message |
| has no payload. |
| |
| The sender MUST leave the AK bit clear. If the PING is |
| successful, then the responder MUST reply with a PING message with |
| the AK bit set. |
| |
| In the event of an error, an implementation MAY reply with an ERR |
| message. |
| |
| Implementations MUST NOT initiate PING messages if they have either |
| received or sent other session messages recently. |
| |
| Implementations shall wait T1 seconds without session traffic before |
| initiating a PING message the first time. In the absence of a reply, |
| up to N further attempts shall be made, separated by T2 seconds. If |
| no reply to the Nth attempt is received after T2 seconds have passed, |
| then the remote peer should be assumed offline or dead, and the |
| session closed. |
| |
| It is recommended that T1, T2, and N all be configurable, with |
| recommended default values of 60 seconds, 10 seconds, and 5 attempts, |
| respectively. With these values, sessions that appear dead after 2 |
| minutes will be closed, and their resources reclaimed. |
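| |
| The 2 minute figure follows from T1 + (N + 1) * T2. A small sketch |
| of the resulting schedule is given below; the function name is |
| hypothetical and times are seconds since the last session activity. |

```python
def ping_schedule(t1=60, t2=10, n=5):
    """Return (send_times, give_up_time) for an unresponsive peer."""
    # One initial PING after T1, then N further attempts T2 apart.
    send_times = [t1 + i * t2 for i in range(n + 1)]
    # The session is closed T2 seconds after the final attempt.
    return send_times, send_times[-1] + t2
```

| With the defaults, PINGs go out at 60, 70, 80, 90, 100, and 110 |
| seconds, and the session is closed at 120 seconds. |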
| |
| 7. ERR messages |
| |
| ERR messages indicate a failure in the session, and abruptly |
| terminate the session. The payload for these messages consists of a |
| single byte error code, followed by an ASCII message describing the |
| error (not terminated by zero). This message shall not be more than |
| 128 bytes in length. |
| |
| The following error codes are defined: |
| |
| 0x01 No party listening at that address or port. |
| |
| 0x02 No such session found. |
| |
| 0x03 SP protocol ID invalid. |
| |
| 0x04 Generic protocol error. |
| |
| 0x05 Message size too big. |
| |
| 0xff Other uncategorized error. |
| |
| Implementations MUST discard any session state upon receiving an ERR |
| message. These messages are not acknowledged. |
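| |
| Encoding and decoding an ERR payload could be sketched as follows. |
| The function names are hypothetical, and the sketch assumes that the |
| 128 byte limit applies to the ASCII text rather than to the whole |
| payload. |

```python
ERROR_CODES = {
    0x01: "No party listening at that address or port.",
    0x02: "No such session found.",
    0x03: "SP protocol ID invalid.",
    0x04: "Generic protocol error.",
    0x05: "Message size too big.",
    0xFF: "Other uncategorized error.",
}


def encode_err(code: int, text: str) -> bytes:
    message = text.encode("ascii")
    # Assumption: the limit covers the text only, not the code byte.
    if len(message) > 128:
        raise ValueError("error text longer than 128 bytes")
    return bytes([code]) + message  # no terminating zero


def decode_err(payload: bytes):
    """Return (code, standard meaning, sender-supplied text)."""
    if not payload:
        raise ValueError("ERR payload must carry an error code")
    code = payload[0]
    text = payload[1:].decode("ascii", errors="replace")
    return code, ERROR_CODES.get(code, "Unknown error code"), text
```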
| |
| 8. Reassembly Guidelines |
| |
| Implementations MUST accept and reassemble fragmented DATA messages. |
| Implementations MUST discard fragmented messages of other types. |
| |
| Messages larger than the ZeroTier MTU (2800) MUST be fragmented. |
| |
| Implementations SHOULD limit the number of unassembled messages |
| retained for reassembly, to minimize the likelihood of intentional |
| abuse. It is suggested that at most 2 unassembled messages be |
| retained. It is further suggested that the unassembled fragments be |
| discarded if 2 or more unfragmented messages arrive before a message |
| is reassembled, or if more than 5 seconds pass before the reassembly |
| is complete. |
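| |
| One way to apply these guidelines is sketched below. The class and |
| its method are hypothetical. The sketch keeps at most 2 partial |
| messages and drops any that are older than 5 seconds; when the limit |
| is reached it evicts the oldest partial message, which is a |
| simplification of the suggestion above rather than an exact |
| rendering of it. |

```python
import time


class Reassembler:
    def __init__(self, max_pending=2, timeout=5.0, clock=time.monotonic):
        self.max_pending = max_pending
        self.timeout = timeout
        self.clock = clock
        # (peer, message ID) -> {"start", "parts", "total"}
        self.pending = {}

    def add(self, peer, msg_id, offset, more, data):
        """Feed one DATA fragment; return the full message or None."""
        now = self.clock()
        # Discard partial messages that have waited too long.
        for key in [k for k, v in self.pending.items()
                    if now - v["start"] > self.timeout]:
            del self.pending[key]
        key = (peer, msg_id)
        entry = self.pending.get(key)
        if entry is None:
            if not more and offset == 0:
                return data  # unfragmented message, nothing to store
            while len(self.pending) >= self.max_pending:
                oldest = min(self.pending,
                             key=lambda k: self.pending[k]["start"])
                del self.pending[oldest]
            entry = {"start": now, "parts": {}, "total": None}
            self.pending[key] = entry
        entry["parts"][offset] = data
        if not more:
            # The fragment with MF clear fixes the total message size.
            entry["total"] = offset + len(data)
        if entry["total"] is None:
            return None
        # Complete only if the fragments cover the message contiguously.
        message = bytearray()
        for off in sorted(entry["parts"]):
            if off != len(message):
                return None
            message += entry["parts"][off]
        if len(message) != entry["total"]:
            return None
        del self.pending[key]
        return bytes(message)
```

| Fragments may be fed in any order, since frames may arrive in any |
| order; the message is returned once the final fragment has been |
| seen and no gaps remain. |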
| |
| 9. Ports |
| |
| The port numbers are 16-bit fields, allowing a single ZeroTier ID to |
| service multiple application layer protocols, which could be treated |
| as separate end points, or as separate sockets in the application. |
| The implementation is responsible for discriminating on these and |
| delivering to the appropriate consumer. |
| |
| As with UDP or TCP, it is intended that each party have its own port |
| number, and that a pair of ports (combined with ZeroTier IDs) be used |
| to identify a single conversation. |
| |
| An SP server should allocate a port number for advertisement. It is |
| expected that clients will generate ephemeral port numbers. |
| |
| Implementations are free to choose how to allocate port numbers, but |
| it is recommended that manually configured port numbers be small, |
| with the high order bit clear, and that numbers >= 32768 (high order |
| bit set) be used for ephemeral allocations. |
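| |
| An ephemeral port allocator following this recommendation might be |
| sketched as below. The function name and the in_use set are |
| hypothetical; the sketch picks randomly among ports with the high |
| order bit set (0x8000 through 0xFFFF). |

```python
import secrets


def allocate_ephemeral_port(in_use: set) -> int:
    """Pick an unused port with the high order bit set."""
    taken = sum(1 for p in in_use if p >= 0x8000)
    if taken >= 0x8000:
        raise RuntimeError("no ephemeral ports left")
    while True:
        port = 0x8000 | secrets.randbelow(0x8000)
        if port not in in_use:
            return port
```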
| |
| It is recommended that separate short queues (perhaps just one or two |
| messages long) be kept per local port in implementations, to prevent |
| head-of-line blocking issues where backpressure on one consumer |
| (perhaps just a single thread or socket) blocks others. |
| |
| 10. URI Format |
| |
| The URI scheme used to represent ZeroTier addresses makes use of |
| ZeroTier IDs, ZeroTier network IDs, and our own 16-bit ports. |
| |
| The format shall be nnzt://<nwid>/<ztid>:<port>, where the <nwid> |
| component represents the 16-digit hexadecimal ZeroTier network ID, |
| the <ztid> represents the 10-digit hexadecimal ZeroTier Device ID, |
| and the <port> is the 16-bit port number previously described. |
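| |
| A parser for this format could be sketched as follows. The function |
| name is hypothetical, and the sketch assumes the port is written in |
| decimal, which the format above does not state. The IDs in the usage |
| note are made up for illustration. |

```python
import re

_NNZT_URI = re.compile(
    r"nnzt://([0-9a-fA-F]{16})/([0-9a-fA-F]{10}):([0-9]{1,5})")


def parse_nnzt_uri(uri: str):
    """Return (network ID, ZeroTier ID, port) as integers."""
    match = _NNZT_URI.fullmatch(uri)
    if match is None:
        raise ValueError("not a valid nnzt URI: %r" % uri)
    port = int(match.group(3))  # assumption: decimal port number
    if port > 0xFFFF:
        raise ValueError("port does not fit in 16 bits")
    return int(match.group(1), 16), int(match.group(2), 16), port
```

| For example, parse_nnzt_uri("nnzt://8056c2e21c000001/a1b2c3d4e5:5555") |
| returns the network ID, the ZeroTier ID, and port 5555. |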
| |
| 11. IANA Considerations |
| |
| This memo includes no request to IANA. |
| |
| 12. Security Considerations |
| |
| The mapping is not intended to provide any security beyond what |
| ZeroTier itself provides. |
| |
| Author's Address |
| |
| Garrett D'Amore (editor) |
| |
| Email: garrett@damore.org |