GATEWAY_TIMEOUT encountered when building full GFE for release 3470 #153
@pbashyal-nmdp: Looks like it timed out with GATEWAY_TIMEOUT. Does retrying work?
Retrying has the same effect. I ran it in a debugger and it raised the following for the exact same allele:

```
Exception has occurred: TypeError
can only concatenate str (not "ApiException") to str

During handling of the above exception, another exception occurred:

  File "/Users/ammon/Documents/00-Projects/nmdp-bioinformatics/02-Repositories/gfe-db/gfe-db/pipeline/jobs/build/src/app.py", line 372, in gfe_from_allele
    features, gfe = gfe_maker.get_gfe(ann, locus)
  File "/Users/ammon/Documents/00-Projects/nmdp-bioinformatics/02-Repositories/gfe-db/gfe-db/pipeline/jobs/build/src/app.py", line 581, in <module>
    gfe = gfe_from_allele(
```

The same error is also in the original error message. I'll try working with seq-ann and making individual API calls to the feature service with this allele to see if I can get to the bottom of it.
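For what it's worth, the `TypeError` above looks like a secondary bug that masks the real failure: concatenating a string with the caught exception object instead of its message. A minimal sketch of the pattern and the fix, where `ApiException` is a stand-in class and `describe_failure` is hypothetical (not gfe-db's actual code):

```python
class ApiException(Exception):
    """Stand-in for the API client's exception type."""


def describe_failure(exc):
    # Buggy pattern: "prefix" + exc raises TypeError when exc is an
    # exception object, which hides the original GATEWAY_TIMEOUT error.
    #     return "GFE build failed: " + exc
    # Fix: convert the exception to a string explicitly.
    return "GFE build failed: " + str(exc)


print(describe_failure(ApiException("GATEWAY_TIMEOUT")))
# prints: GFE build failed: GATEWAY_TIMEOUT
```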
When I built this one allele on its own it worked fine, with no timeout. I wonder if gfe-db is overloading the feature service API during the build. I wouldn't be surprised, because the build proceeds extremely rapidly, processing ~30,000 alleles in 15-20 minutes, which works out to roughly 25-33 alleles per second. That places a constant, very high load on the API for the duration of the build. For alleles that hit timeouts, I think a good approach would be to decouple the retry mechanism from the build in gfe-db. I did some math, and even a minimal retry policy could drastically increase the build time and cost if even 20 alleles out of ~30,000 fail. Decoupling the retry logic would also make it easier to set an alarm threshold if lots of alleles are failing. I'll follow up on this in a separate issue for gfe-db.
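Decoupling retries could look like the sketch below: record failures during the main pass instead of retrying inline, then retry only the failed alleles afterwards with exponential backoff. All names here (`retry_with_backoff`, `flaky_build`) are hypothetical, not gfe-db's actual implementation:

```python
import time


def retry_with_backoff(fn, attempts=3, base_delay=1.0):
    """Call fn(); on failure, sleep base_delay * 2**attempt and retry.
    Re-raises the last exception once attempts are exhausted."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))


# Demo with a stub that fails twice before succeeding, standing in for a
# feature-service call that intermittently hits GATEWAY_TIMEOUT.
calls = {"n": 0}

def flaky_build():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("GATEWAY_TIMEOUT")
    return "gfe"

result = retry_with_backoff(flaky_build, attempts=3, base_delay=0.01)
```

Running the retries as a separate pass (or a separate job entirely) means a handful of slow alleles can't stall the other ~30,000, and the size of the failed set becomes a natural metric to alarm on.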
Yes, the feature service is getting overloaded. I'll look into adding a caching layer for the service. That should help with other uses as well.
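Even before a server-side cache exists, a client-side sketch shows the idea: memoize lookups keyed by the request parameters, since a full build re-requests many identical features. `get_feature` and its parameters below are hypothetical stand-ins, not the feature service's real API:

```python
from functools import lru_cache


@lru_cache(maxsize=100_000)
def get_feature(locus, term, rank, sequence):
    # Hypothetical stand-in for a feature-service HTTP call; the real
    # service would look up or assign an accession for this feature.
    return hash((locus, term, rank, sequence)) % 10_000


# The second identical request is served from the cache, never hitting
# the network.
get_feature("HLA-A", "exon", 1, "ATGGCC")
get_feature("HLA-A", "exon", 1, "ATGGCC")
info = get_feature.cache_info()
```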
I guess another option would be to rate-limit requests, though that would increase the build time. The current build runs on a c5d.2xlarge at $0.384 per hour and takes around 20 minutes (roughly $0.13 per build), so it could run a lot longer before cost becomes any kind of an issue. Honestly, I think rate-limiting might be easier to implement than caching on the API side.
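A client-side rate limit can be as simple as enforcing a minimum interval between requests. A minimal sketch (the `RateLimiter` class below is hypothetical, not part of gfe-db):

```python
import time


class RateLimiter:
    """Blocks in wait() so calls proceed at most max_per_second per second."""

    def __init__(self, max_per_second):
        self.min_interval = 1.0 / max_per_second
        self.last = 0.0

    def wait(self):
        elapsed = time.monotonic() - self.last
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last = time.monotonic()


# At e.g. 10 requests/second, a ~30,000-allele build would take ~50
# minutes, so the limit needs tuning against what the service tolerates.
limiter = RateLimiter(max_per_second=100)
start = time.monotonic()
for _ in range(5):
    limiter.wait()
elapsed = time.monotonic() - start
```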
seq-ann 1.1.0 threw an error during the gfe-db build job with the following stacktrace:
This occurred in the context of a full build for all alleles for version 3470.
Here is the call in the script that failed:
https://github.com/abk7777/gfe-db/blob/135d179e1f9295c5b8b72bfbd61d8789db30f0e2/gfe-db/pipeline/jobs/build/src/app.py#L580-L582
Using these dependencies: