Test against Scylla alternator #27
I came. I saw. I failed. scylladb/scylladb#9240 |
It's possible to run ScyllaDB Alternator in a container, if one uses the standard (non-nightly) build.
What can be done?
|
@dimaqq Have you run tests with aiodynamo? Can you elaborate on the versions (of both) you've been using? I discovered aiodynamo recently and enjoyed moving code off slow boto3, and it was great so far. But now I've tried to test ScyllaDB with the DynamoDB API, and my aiodynamo code just doesn't work (while my boto3 code works fine). Example of the issue I have:
import boto3
dynamodb = boto3.resource('dynamodb', endpoint_url='http://localhost:8000',
                          region_name='None', aws_access_key_id='None', aws_secret_access_key='None')
dynamodb.batch_write_item(RequestItems={
    'usertable': [{'PutRequest': {
        'Item': {'key': 'test', 'x': {'hello': 'world'}}
    }}]
})
And now the aiodynamo code replicating the example above (doesn't work):
import asyncio
from aiohttp import ClientSession
from aiodynamo.client import Client, URL
from aiodynamo.credentials import Credentials
from aiodynamo.http.aiohttp import AIOHTTP
from aiodynamo.expressions import HashKey
async def main():
    async with ClientSession() as session:
        client = Client(AIOHTTP(session), Credentials.auto(), region="None", endpoint=URL("http://localhost:8000"))
        await client.put_item("usertable", item={"key": "test", "x": {"hello": "world"}})
asyncio.run(main())
It hangs for a while and gives this output:
My system is Ubuntu 21.04, I ran ScyllaDB in the docker-compose, and I've tried I just installed fresh version of boto3 and aiodynamo[aiohttp] from pypi for this repro. Any ideas? |
I think |
Hm, I'm not sure about that. On the one hand, you are right, in the repro-case I don't have any environment configured. But in my actual application I have gimmick I'll add it to the repro case to make sure it's not an issue. |
scylla returns the wrong (or at least a different) mimetype in JSON responses, and the aiohttp adaptor fails due to that. Either use the httpx adaptor or change the aiohttp one to ignore mimetypes. With
|
When you say that do you mean something like https://github.com/HENNGE/aiodynamo/blob/master/src/aiodynamo/http/aiohttp.py#L57, or something else? I'm trying to understand how easy that is to tweak, and whether it's worth it for me. |
change that to |
Eh, that still didn't make it work for me. |
I run it with |
Ohh, woops. I think it was my fault. At some point when tweaking repro code I changed something in the url, and the throttling error didn't help. Can confirm httpx and tweaked aiohttp works for me. I guess I will stick with httpx though, if aiohttp fix won't be in the lib, since I really have no desire to maintain my fork for this. |
...or not. After plugging httpx into my application, I saw request time go from 0.03s with the aiohttp adapter to 0.7s with httpx. All I changed was going from:
from aiodynamo.http.aiohttp import AIOHTTP # @nocheckin: fork required for this to work.
from aiohttp import ClientSession
self.aioclient = ClientSession()
self.aiodynamo = Client(AIOHTTP(self.aioclient), Credentials.auto(), self.region, endpoint=URL(self.db_url))
to:
from aiodynamo.http.httpx import HTTPX
from httpx import AsyncClient
self.aioclient = AsyncClient()
self.aiodynamo = Client(HTTPX(self.aioclient), Credentials.auto(), self.region, endpoint=URL(self.db_url))
And these numbers (0.03s and 0.7-1s) are similar for both scylladb and dynamodb-local for me, so I guess that's another issue. I just haven't used httpx before. Am I doing something terribly wrong here? |
I can't see anything obviously wrong here. ~1s response times are pretty bad 😱 Off the top of my head, I'd consider two aspects:
|
Although the DynamoDB API responses are JSON, additional conventions apply to these responses - such as how error codes are encoded in JSON. For this reason, DynamoDB uses the content type `application/x-amz-json-1.0` instead of the standard `application/json` in its responses. Until this patch, Scylla used `application/json` in its responses. This unexpected content-type didn't bother any of the AWS libraries which we tested, but it does bother the aiodynamo library (see HENNGE/aiodynamo#27).
Moreover, we should return the x-amz-json-1.0 content type for future proofing: It turns out that AWS already defined x-amz-json-1.1 - see: https://awslabs.github.io/smithy/1.0/spec/aws/aws-json-1_1-protocol.html The 1.1 content type differs (only) in how it encodes error replies. If one day DynamoDB starts to use this new reply format (it doesn't yet) and if DynamoDB libraries need to differentiate between the two reply formats, Alternator had better return the right one.
This patch also includes a new test that the Content-Type header is returned with the expected value. The test passes on DynamoDB, and after this patch it starts to pass on Alternator as well.
Fixes #9554.
Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211031094621.1193387-1-nyh@scylladb.com>
Alternator does not support HTTP/2 (neither does DynamoDB or DynamoDB Local, by the way), so I doubt that's related. This is a wild guess, but unexplained fraction-of-a-second delays could be a bad interaction between Nagle's algorithm and delayed ACK:
You can verify in wireshark if the timing makes sense for this explanation. If it's this problem, you can try setting the TCP_NODELAY option on the client's socket - to disable Nagle's algorithm. Even more efficient is to use TCP_CORK - to explicitly tell the kernel to only send one packet after several write system calls. I don't know |
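For reference, a minimal sketch of what setting TCP_NODELAY looks like on a plain Python socket. The helper name `make_nodelay_socket` is made up for illustration; how (or whether) aiohttp or httpx expose this on their own connections is not shown here and would need checking in their docs.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 asks the kernel to send small writes immediately
    # instead of waiting to coalesce them with later writes.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Read the option back; a non-zero value means Nagle's algorithm is off.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

TCP_CORK is Linux-specific, so it is left out of this sketch to keep it portable.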
Actually... I have used aiodynamo+httpx in the past, and the performance was fine. |
Okay, I think the slowness is my fault (my profiler's). I was using (Haven't tested with httpx though, since I have no need for it as aiohttp is fixed in the scylla now) |
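One way to cross-check numbers like these without a profiler attached is plain wall-clock timing. This is only a sketch: `time_calls` and `fake_request` are hypothetical names, and `fake_request` is a stand-in that sleeps where a real call such as `client.put_item(...)` would go.

```python
import asyncio
import statistics
import time


async def time_calls(make_request, repeats=20):
    """Return per-call wall-clock durations in seconds for an async callable."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        await make_request()
        durations.append(time.perf_counter() - start)
    return durations


async def fake_request():
    # Stand-in for a real request; replace with the call under test.
    await asyncio.sleep(0.01)


durations = asyncio.run(time_calls(fake_request, repeats=5))
median_seconds = statistics.median(durations)
print(f"median: {median_seconds:.4f}s over {len(durations)} calls")
```

If the median here is far below what the profiler reports for the same call, the profiler overhead is the likely culprit.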
Although the DynamoDB API responses are JSON, additional conventions apply to these responses - such as how error codes are encoded in JSON. For this reason, DynamoDB uses the content type `application/x-amz-json-1.0` instead of the standard `application/json` in its responses. Until this patch, Scylla used `application/json` in its responses. This unexpected content-type didn't bother any of the AWS libraries which we tested, but it does bother the aiodynamo library (see HENNGE/aiodynamo#27). Moreover, we should return the x-amz-json-1.0 content type for future proofing: It turns out that AWS already defined x-amz-json-1.1 - see: https://awslabs.github.io/smithy/1.0/spec/aws/aws-json-1_1-protocol.html The 1.1 content type differs (only) in how it encodes error replies. If one day DynamoDB starts to use this new reply format (it doesn't yet) and if DynamoDB libraries will need to differenciate between the two reply formats, Alternator better return the right one. This patch also includes a new test that the Content-Type header is returned with the expected value. The test passes on DynamoDB, and after this patch it starts to pass on Alternator as well. Fixes #9554. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211031094621.1193387-1-nyh@scylladb.com> (cherry picked from commit 6ae0ea0)
I gave it a go, again... but I'm a bit stuck:
|
and then
|
@dimaqq what does "request failed" mean? Was there an HTTP error? What was the content of the HTTP reply? |
Right, here's a quick fix to get benchmarks running:
  await response.json(
-     content_type="application/x-amz-json-1.0", encoding="utf-8"
+     content_type=None, encoding="utf-8"
  ),
Looks like ScyllaDB Alternator returns a different MIME type than Amazon DynamoDB. cc @nyh |
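As far as I know, passing `content_type=None` to aiohttp's `response.json()` skips the content-type check entirely. A stricter alternative would be to accept both types seen in this thread and reject everything else. The sketch below is standard library only; `parse_json_body` and `ACCEPTED_JSON_TYPES` are hypothetical names, not part of aiodynamo or aiohttp.

```python
import json

# Content types treated as JSON here: the first is what DynamoDB sends,
# the second is what Alternator sent before the fix discussed in this thread.
ACCEPTED_JSON_TYPES = {"application/x-amz-json-1.0", "application/json"}


def parse_json_body(content_type_header: str, body: bytes) -> dict:
    """Decode a JSON body if its media type is one of the accepted ones."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    if media_type not in ACCEPTED_JSON_TYPES:
        raise ValueError(f"unexpected content type: {media_type!r}")
    return json.loads(body.decode("utf-8"))


dynamodb_style = parse_json_body("application/x-amz-json-1.0", b'{"Item": {}}')
alternator_style = parse_json_body("application/json; charset=utf-8", b'{"Item": {}}')
```

Compared with disabling the check, this still fails loudly on something like an HTML error page from a proxy.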
@dimaqq I thought I already fixed the mime type (scylladb/scylladb#9554) - which version of Scylla are you using? Can you please verify with "docker pull" that you are using a recent version, not some version that was called "latest" a year ago? |
It's what gets pulled today: |
Query performance (against single node, running in Docker, backed by host mount-bound volume, Linux laptop SSD)
So, officially faster than DynamoDB (global tables, max provisioning) which topped out at ~5K rows/s IIRC. |
My mime-type fix reached 4.5 only 4 weeks ago (scylladb/scylladb@5d7064e) so this version is not recent enough for this fix. |
Nightly cannot start up at the moment, with the same arguments as normal build:
I think I've seen this before, IIRC that's due to Scylla refusing to listen to |
I can't reproduce the above failure. I got a slightly newer nightly, but it worked:
@syuu1228 does this scylla-housekeeping error seem familiar? Could it explain why scylla-server is not coming up? I don't think Scylla is listening on 0.0.0.0 - why/where would it do that? A different problem might be that Scylla insists on listening on port 10000 (the REST API) on 127.0.0.1 by default - not the address you give it. Maybe that's a problem in your docker setup somehow (it works on mine...). You can try to override the REST API address with the "--api-address" option and see if it changes anything. |
scylladb/scylladb#5796 (comment)
ScyllaDB is a fast reimplementation of Cassandra, and they have a dynamodb compatibility layer called alternator.
I've run some basic tests against their docker image. It would be awesome to run a performance test now that aiodynamo is so much faster :)