Quick references

Manta HTTP Quick Reference

HTTP Status Codes in Manta

HTTP Headers in Manta

Requests using "100-continue"

HTTP allows clients to specify a header called Expect: 100-continue to request that the server validate the request headers before the client sends the rest of it. For example, suppose a client wants to upload a 10 GiB object to /foo/stor/bar/obj1, but /foo/stor/bar does not exist. With Expect: 100-continue, the server can immediately send a "404 Not Found" response (because the parent directory doesn’t exist). Without this header, HTTP would require that the client send the entire 10 GiB request.

When Expect: 100-continue is specified with the request headers, then the client waits for a 100-continue response before proceeding to send the body of the request.

We mention this behavior because error handling for requests that do not use 100-continue can be surprising. For example, when the client doesn’t specify this header, the server might still choose to send a 400 or 500-level response immediately, but it must still wait for the client to send the whole request. There have been bugs in the past where the server did not read the rest of the request, resulting in a memory leak and a timeout from the client’s perspective (because the client has no reason to read a response before it has even finished sending the request, if it didn’t use 100-continue).
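As a rough illustration of the client side of this exchange, the sketch below decides what to do after reading the first status line that follows the request headers. The function name `next_step` and its return values are hypothetical and are not part of any Manta client; the sketch also assumes that anything other than a 100 is a final response.

```python
def next_step(status_line: str) -> str:
    """Decide what a client that sent Expect: 100-continue should do next.

    status_line is the first line of the server's reply to the request
    headers, for example "HTTP/1.1 100 Continue".
    """
    parts = status_line.split(None, 2)
    if len(parts) < 2 or not parts[1].isdigit():
        raise ValueError("malformed status line: %r" % status_line)
    code = int(parts[1])
    if code == 100:
        # The headers passed initial validation: send the request body now.
        return "send-body"
    # Treated here as a final response: the body is never sent.
    return "abort-with-%d" % code
```

In the 10 GiB example above, a "404 Not Found" status line would lead this client to give up before sending any of the body.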

Streaming vs. fixed-size requests

In order to frame HTTP requests and responses, one of two modes must be used:

A request or response can specify a content-length header that indicates exactly how many bytes of data will be contained in the body; or
A request or response can specify transfer-encoding: chunked, which indicates that the body will be sent in chunks, each of which is preceded by its size.

Manta treats these two modes a little differently. If an upload request has a content-length, then Manta ensures that the storage nodes chosen to store the data have enough physical space available. Requests with transfer-encoding: chunked are called streaming uploads. For these uploads, the server assumes a maximum content length and uses it to validate that the chosen storage nodes have enough physical space. The maximum content length for a streaming upload can be overridden using the max-content-length header.

See also the next section on [_validating_the_contents_of_requests_and_responses].
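To make the two framing modes concrete, here is a minimal sketch that produces the framing header and body bytes for each. The function names are hypothetical, and a real request would also carry a request line and other headers; only the framing-related parts are shown.

```python
def frame_fixed(body: bytes) -> bytes:
    """Frame a body whose size is known up front using content-length."""
    head = b"content-length: %d\r\n\r\n" % len(body)
    return head + body


def frame_chunked(chunks) -> bytes:
    """Frame a body of unknown total size as a chunked stream."""
    out = [b"transfer-encoding: chunked\r\n\r\n"]
    for chunk in chunks:
        if chunk:  # a zero-length chunk would end the stream early
            # Each chunk is preceded by its size in hexadecimal.
            out.append(b"%x\r\n" % len(chunk) + chunk + b"\r\n")
    out.append(b"0\r\n\r\n")  # terminating zero-size chunk
    return b"".join(out)
```

In both modes the reader can tell where the body ends: either after the stated number of bytes, or at the zero-size chunk.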

Validating the contents of requests and responses

It’s critical that clients and servers validate the body of responses and requests. Some types of corruption are impossible to report any other way.

Corrupted requests and responses can manifest in a number of ways:

the sender may stop sending after too few bytes
the sender may send EOF after sending too few bytes
the sender may send too many bytes
the body may have the right number of bytes, but have incorrect bytes

Importantly, because of the two modes of transfer described above (under [_streaming_vs_fixed_size_requests]), the reader of a request or response always knows how many bytes to expect. In the cases above:

If the sender stops sending bytes after too few bytes (but the socket is still open for writes in both directions), then the reader will fail the operation due to a timeout. For example, if the client does this, then the server will report a 408 error. The client must implement a timeout for this case to cover the case where the server fails in this way.
If the sender sends EOF after too few bytes, this would be a bad request or response. If a client did this, then the server would report a 400 error. The client must implement a check for this case to cover the case where the server fails in this way. At this point in the HTTP operation, the client may have already read a successful response (i.e., a 200), and it needs to be sophisticated enough to treat it as an error anyway.
If the sender sends too many bytes, then the request or response would be complete, but the next request or response on the same connection would likely be invalid.

When possible, clients and servers should generally send a Content-MD5 header. This allows the remote side to compute an MD5 checksum on the body and verify that the correct bytes were sent. For object downloads, Manta always stores the MD5 computed from the original upload and it always provides the Content-MD5 header on responses. If clients provide a Content-MD5 header on uploads, then Manta always validates the body it receives against that checksum. When both of these mechanisms are used by both client and server, a client can be sure of end-to-end integrity.

Note: It’s been noted that MD5 checksums are deprecated for security purposes due to the risk of collisions. While they are likely not appropriate for security, MD5 collisions remain rare enough for MD5 checksums to be used for basic integrity checks.
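The checks described in this section can be sketched as follows for a body that has already been read in full. This is an illustration only: `check_body` is a hypothetical helper, it assumes the expected length and checksum were taken from the headers, and it does not cover the timeout case, which has to be handled while reading from the socket.

```python
import base64
import hashlib


def check_body(body: bytes, content_length=None, content_md5=None) -> str:
    """Return "ok" or a short description of the first problem found."""
    if content_length is not None:
        if len(body) < content_length:
            return "too few bytes"
        if len(body) > content_length:
            return "too many bytes"
    if content_md5 is not None:
        # Content-MD5 carries the base64 encoding of the raw MD5 digest.
        actual = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
        if actual != content_md5:
            return "md5 mismatch"
    return "ok"
```

A client would run a check like this even after reading a 200 status, since the status line arrives before the body does.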

Muskie log entry properties

Below is a summary of the most relevant fields for an audit log entry. (Note that Muskie sometimes writes out log entries unrelated to the completion of an HTTP request. Only log entries with "audit": true represent completion of an HTTP request. Other log entries have other fields.)

General Muskie-provided properties

JSON property	Example value	Meaning
`audit`	`true`	If `true`, this entry describes completion of an HTTP request. Otherwise, this is some other type of log entry, and many of the fields below may not apply.
`latency`	26	Time in milliseconds between when Muskie started processing this request and when the response headers were sent. This is commonly called time to first byte. See also building a request timeline. This should generally match the `x-response-time` response header.
`operation`	`getstorage`	Manta-defined token that describes the type of operation. In this case, `getstorage` refers to an HTTP `GET` from a user’s `stor` directory.
`req`	See specific properties below.	Object describing the incoming request
`req.method`	`GET`	HTTP method for this request (specified by the client)
`req.url`	`"/poseidon/stor/manta_gc/mako/1.stor.staging.joyent.us?limit=1024"`	URL (path) provided for this request (specified by the client)
`req.headers`	{ "accept": "/", "x-request-id": "a080d88b-8e42-4a98-a6ec-12e1b0dbf612", "date": "Tue, 01 Aug 2017 03:03:13 GMT", "authorization": "Signature keyId=\"/poseidon/keys/ef:0e:27:45:c5:95:4e:92:ba:ab:03:17:e5:3a:60:14\",algorithm=\"rsa-sha256\",headers=\"date\",signature=\"...\"", "user-agent": "restify/1.4.1 (ia32-sunos; v8/3.14.5.9; OpenSSL/1.0.1i) node/0.10.32", "accept-version": "~1.0", "host": "manta.staging.joyent.us", "connection": "keep-alive", "x-forwarded-for": "::ffff:172.27.4.22" }	Headers provided with this request (specified by the client). The `Date` header is particularly useful to note, as this usually reflects the timestamp (on the client) when the client generated the request. This is useful when constructing a request timeline. In particular, problems with the network (timeouts and retransmissions) or queueing any time before Muskie starts processing the request can be identified using this header, provided that the client clock is not too far off from the server clock.
`req.caller`	{ "login": "poseidon", "uuid": "4d649f41-cf87-ca1d-c2c0-bb6a9004311d", "groups": [ "operators" ], "user": null }	Object describing the account making this request. This is not necessarily the owner of the resource (`req.owner`). The two commonly differ when the caller uses operator privileges to access objects in someone else’s account or when any user makes an authenticated request to access public data in some other user’s account.
`req.caller.login`	`"poseidon"`	For authenticated requests, the name of the account that made the request.
`req.caller.uuid`	`"4d649f41-cf87-ca1d-c2c0-bb6a9004311d"`	For authenticated requests, the unique identifier for the account that made the request.
`req.caller.groups`	`[ "operators" ]`	For authenticated requests, a list of groups that the caller is part of. Generally, the only interesting group is `"operators"`, which grants the caller privileges to read from and write to any account.
`req.caller.user`	`null`	For authenticated requests from a subuser of the account, the name of the subuser account.
`req.owner`	`"4d649f41-cf87-ca1d-c2c0-bb6a9004311d"`	Unique identifier for the account that owns the requested resource. This is generally the uuid of the account at the start of the URL (i.e., for a request of `"/poseidon/stor"`, this would be the uuid of the account `poseidon`).
`res`	See specific properties below.	Describes the HTTP response sent by Muskie to the client.
`res.statusCode`	200	HTTP-level status code.
`res.headers`	{ "last-modified": "Sat, 22 Mar 2014 01:17:01 GMT", "content-type": "application/x-json-stream; type=directory", "result-set-size": 1, "date": "Tue, 01 Aug 2017 03:03:13 GMT", "server": "Manta", "x-request-id": "a080d88b-8e42-4a98-a6ec-12e1b0dbf612", "x-response-time": 26, "x-server-name": "204ac483-7e7e-4083-9ea2-c9ea22f459fd" }	Headers sent in the response from Muskie to the client. Among the most useful is the `x-request-id` header, which should uniquely identify this request. You can use this to correlate observations from the client or other parts of the system.
`route`	`"getstorage"`	Identifies the name of the restify route that handled this request.
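As an example of how these fields fit together, the following sketch picks out audit entries from Muskie log lines and pulls out the fields most often needed for a request timeline. The function name `summarize` and the shape of its result are our own invention; the field names are the ones in the table above.

```python
import json
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def summarize(line: str):
    """Summarize one log line, or return None if it is not an audit entry."""
    entry = json.loads(line)
    if entry.get("audit") is not True:
        return None
    # Bunyan "time": when the entry was logged (ISO 8601, UTC).
    logged = datetime.strptime(entry["time"], "%Y-%m-%dT%H:%M:%S.%fZ")
    logged = logged.replace(tzinfo=timezone.utc)
    # Client "Date" header: when the client generated the request.
    sent = parsedate_to_datetime(entry["req"]["headers"]["date"])
    if sent.tzinfo is None:
        sent = sent.replace(tzinfo=timezone.utc)
    return {
        "request_id": entry["res"]["headers"]["x-request-id"],
        "operation": entry["operation"],
        "status": entry["res"]["statusCode"],
        "latency_ms": entry["latency"],
        "client_to_log_ms": round((logged - sent) / timedelta(milliseconds=1)),
    }
```

A `client_to_log_ms` value much larger than `latency_ms` suggests time spent in the network or in queueing before Muskie started processing the request. Keep in mind that the `Date` header only has one-second resolution and depends on the client's clock.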

Muskie-provided properties for debugging only

JSON property	Example value	Meaning
`entryShard`	`"tcp://3.moray.staging.joyent.us:2020"`	When present, this indicates the shard that was queried for the metadata for `req.url`. Unfortunately, this field is not currently present when Muskie fails to fetch metadata, either because of a Moray failure or just because the metadata is missing (i.e., the path doesn’t exist).
`err`	`false`	Error associated with this request, if any. See [_details_about_specific_error_messages].
`objectId`	`"bf54fb8a-6cb5-4683-8655-f9ad90b984d4"`	When present, this is the unique identifier for the Manta object identified by `req.url` when the request was made. This is helpful when trying to verify that a request fetched the exact object that you expect (and not another object that had the same name at the time).
`parentShard`	`"tcp://2.moray.staging.joyent.us:2020"`	When present, this indicates the shard that was queried for the metadata for the parent directory of `req.url`. This is only present when the parent metadata was fetched (which is common for PUT requests, but not GET or DELETE requests). Unfortunately, this field is not currently present when Muskie fails to fetch metadata, either because of a Moray failure or just because the metadata is missing (i.e., the path doesn’t exist).
`logicalRemoteAddress`	`"172.27.4.22"`	The (remote) IP address of the client connected to Manta. Note that clients aren’t connected directly to Muskie. When using TLS ("https" URLs), clients connect to `stud` in the `loadbalancer` component. Stud connects to `haproxy` in the same container. `haproxy` in the load balancer container connects to another `haproxy` instance in the Muskie container. That `haproxy` instance connects to a Muskie process. The client’s IP is passed through this chain and recorded in `logicalRemoteAddress`.
`remoteAddress`, `remotePort`	`"127.0.0.1"`, `64628`	The IP address and port of the TCP connection over which this request was received. Generally, Muskie only connects directly to an `haproxy` inside the same zone, so the remote address will usually be `127.0.0.1`. Neither of these fields is generally interesting except when debugging interactions with the local `haproxy`.
`req.timers`	{ "earlySetup": 32, "parseDate": 8, "parseQueryString": 28, "handler-3": 127, "checkIfPresigned": 3, "enforceSSL": 3, "ensureDependencies": 5, "_authSetup": 5, "preSignedUrl": 3, "checkAuthzScheme": 4, "parseAuthTokenHandler": 36, "signatureHandler": 73, "parseKeyId": 59, "loadCaller": 133, "verifySignature": 483, "parseHttpAuthToken": 5, "loadOwner": 268, "getActiveRoles": 43, "gatherContext": 27, "setup": 225, "getMetadata": 5790, "storageContext": 8, "authorize": 157, "ensureEntryExists": 3, "assertMetadata": 3, "getDirectoryCount": 7903, "getDirectory": 10245 }	An object describing the time in microseconds for each phase of the request processing pipeline. This is useful for identifying latency. The names in this object are the names of functions inside Muskie responsible for the corresponding phase of request processing.
`sharksContacted`	[ { "shark": "1.stor.staging.joyent.us", "result": "ok", "timeToFirstByte": 2, "timeTotal": 902, "_startTime": 1509505866032 }, { "shark": "2.stor.staging.joyent.us", "result": "ok", "timeToFirstByte": 1, "timeTotal": 870, "_startTime": 1509505866033 } ]	This field should be present for Manta requests that make requests to individual storage nodes. The value is an array of storage nodes contacted as part of the request, including the result of this subrequest, when it started, and how long it took. For GET requests, these subrequests are GET requests from individual storage nodes hosting a copy of the object requested. These subrequests happen serially, and we stop as soon as one completes. For PUT requests, the storage node subrequests are PUT requests to individual storage nodes on which a copy of the new object will be stored. If all goes well, you’ll see N sharks contacted, all successfully, where N is the client’s requested durability level (typically 2), and the requests will be concurrent with each other. If any of these fail, Manta will try another N sharks, and up to one more set of N. For durability level 2, you may see up to 6 sharks contacted: three sets of two. The sets would be sequential, while the two requests within each set run concurrently.
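When looking for latency in `req.timers`, it usually helps to rank the phases by time spent. A minimal sketch, assuming `timers` is the `req.timers` object from an audit entry (the helper name `slowest_phases` is hypothetical):

```python
def slowest_phases(timers: dict, count: int = 3):
    """Return (phase, microseconds, share of total) for the slowest phases."""
    total = sum(timers.values()) or 1  # avoid dividing by zero
    ranked = sorted(timers.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, usec, usec / total) for name, usec in ranked[:count]]
```

For the example entry above, the top of the ranking would be `getDirectory`, `getDirectoryCount`, and `getMetadata`, all of which involve fetching metadata.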

Bunyan-provided properties

JSON property	Example value	Meaning
`time`	`"2017-08-01T03:03:13.985Z"`	ISO 8601 timestamp closest to when the log entry was generated.
`hostname`	`"204ac483-7e7e-4083-9ea2-c9ea22f459fd"`	The hostname of the system that generated the log entry. For us, this is generally a uuid corresponding to the zonename of the Muskie container.
`pid`	`79465`	The pid of the process that generated the log entry.
`level`	`30`	Bunyan-defined log level. This is a numeric value corresponding to conventional values like `'debug'`, `'info'`, `'warn'`, etc. You can filter based on level using the `bunyan` command.
`msg`	`"handled: 200"`	For Muskie audit log entries, the message is always `"handled: "` followed by the HTTP level status code.
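For scripts that process these entries without the `bunyan` command, the numeric levels and the audit message can be decoded with something like the sketch below. The helper names are hypothetical; the numeric values are Bunyan's standard levels.

```python
# Bunyan's standard numeric log levels.
BUNYAN_LEVELS = {
    10: "trace", 20: "debug", 30: "info",
    40: "warn", 50: "error", 60: "fatal",
}


def level_name(level: int) -> str:
    """Map a numeric Bunyan level to its conventional name."""
    return BUNYAN_LEVELS.get(level, "lvl%d" % level)


def handled_status(msg: str):
    """Extract the HTTP status from an audit "msg", or None if absent."""
    prefix = "handled: "
    rest = msg[len(prefix):]
    if msg.startswith(prefix) and rest.isdigit():
        return int(rest)
    return None
```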

XXX talk about common stack traces? XXX that should include 503 from 'No storage nodes available for this request'

Debugging tools quick reference

Glossary of jargon

bounce (as in: "bounce a box", "bounce a service")	Bouncing a box or a service means restarting it. Bouncing a box usually means rebooting a server. Bouncing a service usually means restarting an SMF service (killing any running processes and allowing the system to restart them).
bound (as in: "CPU-bound", "disk-bound", "I/O-bound")	A program or a workload is said to be "X-bound" for some resource X when its performance is limited by that resource. For example, the performance of a CPU-bound process is limited by the amount of CPU available to it. "Disk-bound" (or "I/O-bound") usually means that a process or workload is limited by the I/O performance of the storage subsystem, which may be a collection of disks organized into a ZFS pool.
box	A box is a physical server (as opposed to a virtual machine or container).
container/zone/VM	A container is a lightweight virtualized environment, usually having its own process namespace, networking stack, filesystems, and so on. For most purposes, a container looks like a complete instance of the operating system, but there may be many containers running within one instance of the OS. They generally cannot interact with each other except through narrow channels like the network. The illumos implementation of containers is called zones. SmartOS also runs hardware-based virtual machines inside zones (i.e., a heavyweight hardware-virtualized environment within the lightweight OS-virtualized environment), and while those are technically running in a container, the term container is usually only applied to zones not running a hardware-based virtualization environment. For historical reasons, within Triton and SmartOS, zones are sometimes called VMs, though that term sometimes refers only to the hardware virtualized variety. The three terms are often used interchangeably (and also interchangeably with instance, since most components are deployed within their own container).
headroom	Idle capacity for a resource. For example, we say there’s CPU headroom on a box when some CPUs are idle some of the time. This usually means the system is capable of doing more work (at least with respect to this resource).
instance (general, SAPI)	Like service, instance can refer to a number of different things, including a member of a SAPI service or SMF service. Most commonly, "instance" refers to a member of a SAPI service.
latency	Latency refers to how much time an operation takes. It can apply to any discrete operation: a disk I/O request, a database transaction, a remote procedure call, a system call, establishment of a TCP connection, an HTTP request, and so on.
out of (as in: "out of CPU")	We sometimes say a box is out of a resource when that resource is fully utilized (i.e., "out of CPU" when all CPUs are busy).
pegged, slammed, swamped	These are all synonyms for being out of some resource. "The CPUs are pegged" means a box has very little CPU headroom (i.e., the CPUs are mostly fully utilized). You can also say "one CPU is pegged" (i.e., that CPU is fully utilized). You might also say "the disks are swamped" (i.e., they’re nearly always busy doing I/O). See also saturated.
saturated	A resource is saturated when processes are failing to use the resource because it’s already fully utilized. For example, when CPUs are saturated, threads that are ready to run have to wait in queues. When a network port is saturated, packets are dropped. Similar to pegged, but more precise.
service (general)	Service can refer to a SAPI service (see below), an SMF service (see below), or it may be used more generally to describe almost any useful function provided by a software component. As a verb (e.g., "this process is servicing requests"), it usually means "to process [requests]".
service (SAPI)	Within SAPI (the Triton facility for managing configuration and deployment of cloud applications like Manta), a service refers to a collection of instances providing similar functionality. It usually describes a type of component (e.g., "storage" or "webapi") that may have many instances. These instances usually share images and configuration, and within SAPI, the service is the place where such configuration is stored.
service (SMF)	Within the operating system, an SMF service is a piece of configuration that usually describes long-running programs that should be automatically restarted under various failure conditions. For example, we define an SMF service for "mahi-v2" (our authentication cache) so that the operating system automatically starts the service upon boot and restarts it if the process exits or dumps core. (Within SMF, it’s actually instances of a service that get started, stopped, restarted, and so on. For many services, there’s only one "default" instance, and the terms are often used interchangeably. Usually someone will say "I restarted the mahi-v2 service" rather than "I restarted the sole instance of the mahi-v2 service". However, for some services (notably "muskie", "moray", "electric-moray", and "binder") we do deploy multiple instances, and it may be important to be more precise (e.g., "three of the muskie instances in this zone are in maintenance").) See `smf(5)`.
shard	A shard generally refers to a database that makes up a fraction of a larger logical database. For example, the Manta metadata tier is one logical data store, but it’s divided into a number of equally-sized shards. In sharded systems like this, incoming requests are directed to individual shards in a deterministic way based on some sharding key. (Many systems use a customer id for this purpose. Manta traditionally uses the name of the parent directory of the resource requested.) In Manta, each shard typically uses 2-3 databases for high availability, but these aren’t separate shards because they’re exact copies. Sharding typically refers to a collection of disjoint databases that together make up a much larger dataset.
tail latency	When discussing a collection of operations, tail latency refers to the latency of the slowest operations (i.e., the tail of the distribution). This is often quantified using a high-numbered percentile. For example, if the 99th percentile of requests is 300ms, then 99% of requests have latency at most 300ms. As compared with an average or median latency, the 99th percentile better summarizes the latency of the slowest requests.
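To make the percentile definition under tail latency concrete, here is a small sketch using the nearest-rank method. This is one of several common ways to compute percentiles, and monitoring tools may use a different one.

```python
import math


def percentile(latencies, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not latencies:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(latencies)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]
```

With 100 samples of 1 ms through 100 ms, `percentile(samples, 99)` is 99 and the median is 50, which shows how the high percentile tracks the slowest requests while the median does not.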

Code	HTTP	Meaning in Manta
100-continue	-	The client requested extra initial validation, and the server has not yet rejected the request.
200	`OK`	Most commonly used for successful GETs
201	`Created`	Most commonly used for creating jobs and multipart uploads (not object PUT operations)
204	`No Content`	Used for successful directory creations, directory removals, object uploads, object deletes, snaplink creation, and a handful of other operations
400	`Bad Request`	The client sent an invalid HTTP request (e.g., an incorrect MD5 checksum)
401	`Unauthorized`	The client sent an invalid or unsupported signature, or it did not send any signature.
403	`Forbidden`	The client failed to authenticate, or it authenticated and was not allowed to access the resource.
408	`Request Timeout`	The server did not receive a complete request from the client within a reasonable timeout.
409	`Conflict`	The client sent an invalid combination of parameters for an API request.
412	`Precondition Failed`	The client issued a conditional request and the conditions were not true. (For example, this could have been a PUT-if-the-object-does-not-already-exist, and the object already existed.)
413	`Request Entity Too Large`	The client attempted a streaming upload and sent more bytes than were allowed based on the `max-content-length` header. See [_request_has_exceeded_bytes] for details.
429	`Too Many Requests`	The client is being rate-limited by the server because it issued too many requests in too short a period.
499	(not in HTTP)	The 499 status is used to indicate that the client appeared to abandon the request. (In this case, it’s not possible to send a response. The 499 code is used for internal logging and statistics.) This was originally used in nginx.
500	`Internal Server Error`	Catch-all code for a failure to process this request.
502	`Bad Gateway`	Historically, this code was emitted by Manta when requests took more than two minutes to complete. This was an artifact of the load balancer. Modern versions of Manta report this as a 503.
503	`Service Unavailable`	This code generally indicates that the system is overloaded and cannot process more work. In practice, this currently means that a particular metadata shard’s queue is full, that Muskie took too long to respond to the request, or that there aren’t enough working storage nodes with enough disk space to satisfy this upload.
507	`Not Enough Space`	The Manta deployment is out of physical disk space for new objects. See [_not_enough_free_space_for_mb] for details.
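When summarizing many requests, it can be handy to group the status codes in the table above into broad classes. The grouping below is only a sketch with hypothetical names. It treats 499 separately because, as noted in the table, no response is actually sent in that case.

```python
from collections import Counter


def classify(status: int) -> str:
    """Place a status code into a broad class for summary purposes."""
    if status == 499:
        return "client-abandoned"  # logged only; no response is sent
    if 200 <= status < 300:
        return "success"
    if 400 <= status < 500:
        return "client-error"
    if 500 <= status < 600:
        return "server-error"
    return "other"


def tally(statuses) -> Counter:
    """Count how many status codes fall into each class."""
    return Counter(classify(s) for s in statuses)
```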

Header	Request/Response	Origin	Meaning
`Content-Length`	Both	HTTP	See [_streaming_vs_fixed_size_requests].
`Content-MD5`	Both	HTTP	MD5 checksum of the body of a request or response. It’s essential that clients and servers validate this on receipt.
`Content-Type`	Both	HTTP, Manta	Describes the type (i.e., MIME type) of the body of the request or response. Manta understands a special content-type for directories called `application/json; type=directory`, which represents a Manta directory.
`Date`	Both	HTTP	The time when the request or response was generated. This is often useful when debugging for putting together a timeline.
`Transfer-encoding: chunked`	Both	HTTP	See [_streaming_vs_fixed_size_requests].
any header starting with `m-`	Both	Manta	Arbitrary user-provided headers.
`Result-Set-Size`	Response	Manta	For GET or HEAD requests on directories, this header indicates how many items are in the directory.
`x-request-id`	Both	Manta	A unique identifier for this request. This can be used to locate details about a request in Manta logs. Clients may specify this header on requests, in which case Manta will use the requested id. Otherwise, Manta will generate one and provide it with the response.
`x-server-name`	Response	Manta	A unique identifier for the frontend instance that handled this request. Specifically, this identifies the "webapi" zone that handled the request.

Tool	Where you run it	Has manual page?	Purpose
`manta-oneach(1)`	headnode GZ or "manta" zones	Yes	Run arbitrary commands in various types of Manta zones
`manta-login(1)`	headnode GZ or "manta" zones	Yes	Open a shell in a particular Manta zone
`mlocate`	"webapi" zone	No	Fetch metadata for an object (including what shard it’s on)
`moray(1)` tools	"moray", "electric-moray" zones	Yes	Fetch rows directly from Moray
`moraystat.d`	"moray" zones	No	Shows running stats about Moray RPC activity
`pgsqlstat` tools	"postgres" zones (need to be copied in as needed)	No	Report on PostgreSQL activity
`bunyan`	Anywhere	Yes	Format bunyan-format log files. With `-p PID`, shows live verbose log entries from a process.
`curl`	Anywhere	Yes	`curl` is (among other things) a general-purpose HTTP client. It can be used to make test requests to Manta itself as well as various components within Manta, including authcache and storage.
`proc(1)` tools (also called the `ptools`, which includes `pfiles`, `pstack`, and others)	Anywhere	Yes	Inspect various properties of a process, including its open files, thread stacks, working directory, signal mask, etc.
`netstat(1M)`	Anywhere	Yes	Shows information about the networking stack, including open TCP connections and various counters (including error counters).
`vfsstat(1M)`	Anywhere	Yes	Shows running stats related to applications' use of the filesystem (e.g., reads and writes)
`prstat(1M)`	Anywhere	Yes	Shows running stats related to applications' use of CPU and memory
`mpstat(1M)`	Anywhere	Yes	Shows running stats related to system-wide CPU usage
`zonememstat(1M)`	Anywhere	Yes	Shows running stats related to zone-wide memory usage
`mdb_v8`	Anywhere	No	Inspect JavaScript-level state in core files from Node.js processes.
