Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test(sim): maker connext-client crash before taker order settlement #1758

Draft
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

LePremierHomme
Copy link
Contributor

@LePremierHomme LePremierHomme commented Jul 26, 2020

Implementing a new instability test scenario where, during a swap, the maker’s outgoing payment ETH swap client crashes while the taker is delaying his settlement.


I've implemented a similar one for LND, but it looks like that if the maker LND is crashing, the taker cannot settle.
The cannonical xud don't even reach that stage:

  1. maker gets a local LND error and sends swapFailed packet, so the taker fails the deal.
  2. taker LND invoice subscription gets an error, so he delete the subscription.

If custom code is introduced, the taker still fails to settle the invoice (UnexpectedClientError). This scenario also causes the channel to get closed.
That's why I thought there's no point to keep this test, and I removed the test code (I still have it stashed in case we'll want it).

@ghost
Copy link

ghost commented Jul 27, 2020

After 2 attempts I was able to get them passing locally. The first 2 attempts failed with failing to setup xud network.

@ghost
Copy link

ghost commented Jul 27, 2020

@sangaman please also try to run them locally to see if you run into similar issues?

@kilrau kilrau requested a review from raladev July 29, 2020 10:34
@raladev raladev removed their request for review July 29, 2020 10:58
@kilrau kilrau added the P2 mid priority label Aug 10, 2020
@kilrau
Copy link
Contributor

kilrau commented Sep 3, 2020

Can you give this a rebase and see if this still works? @LePremierHomme

Le Premier Homme added 2 commits September 30, 2020 14:51
# Conflicts:
#	test/simulation/custom-xud.patch
@LePremierHomme
Copy link
Contributor Author

The test case doesn't pass anymore, probably due to recent changes.

As described, the test simulate the maker's connext client crash while taker is delaying the settlement. After the settlement, connext client is re-started and connection is verified. But then the recovery procedure doesn't kicking in, although its interval was adjusted to 2sec. Looks like there's an error connecting with LND-BTC, although its still alive, and later the invoice subscription is being deleted.

The test fails because the maker's BTC funds aren't recovered, and the channel remains with active htlcs.

@sangaman @erkarl any ideas?

Maker xud log

02/10/2020 07:05:44.089 [SWAPS] debug: Executing maker code to resolve hash
02/10/2020 07:05:44.090 [DB] trace: Executing (default): SELECT `id`, `nodePubKey`, `addressesText`, `lastAddressText`, `banned`, `createdAt`, `updatedAt` FROM `nodes` AS `Node` WHERE `Node`.`nodePubKey` = '02d637dcc5c5a646325c14aebb2a57de736bba3cb1095043bd3e27b2c8bf6ba77e';
02/10/2020 07:05:44.089 [SWAPS] debug: Setting SendingPayment phase for deal 47bcf1891136817648272550c755e4cc9552eab05bb6a0139bd4cd31914d4a2e
02/10/2020 07:05:44.092 [DB] trace: Executing (default): INSERT INTO `swapdeals` (`rHash`,`role`,`state`,`phase`,`nodeId`,`orderId`,`localId`,`proposedQuantity`,`quantity`,`takerAmount`,`takerCurrency`,`takerPubKey`,`takerCltvDelta`,`makerCltvDelta`,`makerAmount`,`makerCurrency`,`createTime`) VALUES ('47bcf1891136817648272550c755e4cc9552eab05bb6a0139bd4cd31914d4a2e',1,0,3,1,'b02e2790-047d-11eb-9a69-81bf42dec0c3','maker_order_id',100,100,4000,'ETH','indra816wrzU3dyoDh2npTixsHKYSqMbKSj3LrLMZx4zCTFB4mQ3Tvr',200,62,100,'BTC',1601622343891) ON CONFLICT (`rHash`) DO UPDATE SET `rHash`=EXCLUDED.`rHash`,`proposedQuantity`=EXCLUDED.`proposedQuantity`,`orderId`=EXCLUDED.`orderId`,`quantity`=EXCLUDED.`quantity`,`makerAmount`=EXCLUDED.`makerAmount`,`takerAmount`=EXCLUDED.`takerAmount`,`makerCurrency`=EXCLUDED.`makerCurrency`,`takerCurrency`=EXCLUDED.`takerCurrency`,`takerCltvDelta`=EXCLUDED.`takerCltvDelta`,`takerPubKey`=EXCLUDED.`takerPubKey`,`localId`=EXCLUDED.`localId`,`phase`=EXCLUDED.`phase`,`state`=EXCLUDED.`state`,`role`=EXCLUDED.`role`,`createTime`=EXCLUDED.`createTime`,`makerCltvDelta`=EXCLUDED.`makerCltvDelta`,`nodeId`=EXCLUDED.`nodeId`;
02/10/2020 07:05:44.096 [CONNEXT] debug: sending payment of 40000000000000 with hash 0x47bcf1891136817648272550c755e4cc9552eab05bb6a0139bd4cd31914d4a2e to indra816wrzU3dyoDh2npTixsHKYSqMbKSj3LrLMZx4zCTFB4mQ3Tvr
02/10/2020 07:05:44.097 [CONNEXT] trace: sending request to /hashlock-transfer: {"amount":"40000000000000","assetId":"0x0000000000000000000000000000000000000000","timelock":"200","lockHash":"0x47bcf1891136817648272550c755e4cc9552eab05bb6a0139bd4cd31914d4a2e","recipient":"indra816wrzU3dyoDh2npTixsHKYSqMbKSj3LrLMZx4zCTFB4mQ3Tvr"}
02/10/2020 07:05:44.544 [LND-BTC] debug: new channel inbound capacity: 7349900
02/10/2020 07:05:44.594 [CONNEXT] trace: sending request to /balance/0x0000000000000000000000000000000000000000
02/10/2020 07:05:44.596 [CONNEXT] verbose: collateralized channel for ETH
02/10/2020 07:05:44.644 [CONNEXT] trace: new inbound capacity (collateral) for ETH of 0
02/10/2020 07:05:45.442 [HTTP] debug: http server request json: {"id":"2801b972-3f54-43a6-ab11-eefcc740c60f","data":{"amount":{"_hex":"0x246139ca8000","_isBigNumber":true},"appIdentityHash":"0xcaaba9bf1a9bedab6f175a1ec58a458b8c8f651d59bf1903d417c4df92f3dabf","assetId":"0x0000000000000000000000000000000000000000","meta":{"recipient":"indra816wrzU3dyoDh2npTixsHKYSqMbKSj3LrLMZx4zCTFB4mQ3Tvr","sender":"indra5b3QBEELA4YXUJjoSmDdMA625K4f7XyJ7Avq131yURcCUE84JX","senderAssetId":"0x0000000000000000000000000000000000000000","paymentId":"0x5fc5efb63315fdf8bfb9ff638ded40a90cbad4f5389fc4ada3c214cbe8199727","timelock":"200"},"sender":"indra5b3QBEELA4YXUJjoSmDdMA625K4f7XyJ7Avq131yURcCUE84JX","transferMeta":{"lockHash":"0x47bcf1891136817648272550c755e4cc9552eab05bb6a0139bd4cd31914d4a2e","expiry":{"_hex":"0xde","_isBigNumber":true},"timelock":"200"},"type":"HashLockTransferApp","paymentId":"0x5fc5efb63315fdf8bfb9ff638ded40a90cbad4f5389fc4ada3c214cbe8199727","recipient":"indra816wrzU3dyoDh2npTixsHKYSqMbKSj3LrLMZx4zCTFB4mQ3Tvr"}}
02/10/2020 07:05:45.443 [CONNEXT] warn: received unexpected incoming transfer created event with rHash 47bcf1891136817648272550c755e4cc9552eab05bb6a0139bd4cd31914d4a2e, units: 40000000000000, timelock 200, token address 0x0000000000000000000000000000000000000000, and paymentId 0x5fc5efb63315fdf8bfb9ff638ded40a90cbad4f5389fc4ada3c214cbe8199727
02/10/2020 07:05:45.463 [HTTP] debug: http server request json: {"id":"2801b972-3f54-43a6-ab11-eefcc740c60f","data":{"type":"HashLockTransferApp","amount":{"_hex":"0x246139ca8000","_isBigNumber":true},"appIdentityHash":"0xcaaba9bf1a9bedab6f175a1ec58a458b8c8f651d59bf1903d417c4df92f3dabf","assetId":"0x0000000000000000000000000000000000000000","sender":"indra5b3QBEELA4YXUJjoSmDdMA625K4f7XyJ7Avq131yURcCUE84JX","meta":{"recipient":"indra816wrzU3dyoDh2npTixsHKYSqMbKSj3LrLMZx4zCTFB4mQ3Tvr","sender":"indra5b3QBEELA4YXUJjoSmDdMA625K4f7XyJ7Avq131yURcCUE84JX","senderAssetId":"0x0000000000000000000000000000000000000000","paymentId":"0x5fc5efb63315fdf8bfb9ff638ded40a90cbad4f5389fc4ada3c214cbe8199727","timelock":"200"},"paymentId":"0x5fc5efb63315fdf8bfb9ff638ded40a90cbad4f5389fc4ada3c214cbe8199727","recipient":"indra816wrzU3dyoDh2npTixsHKYSqMbKSj3LrLMZx4zCTFB4mQ3Tvr","transferMeta":{"expiry":{"_hex":"0xde","_isBigNumber":true},"timelock":"200","lockHash":"0x47bcf1891136817648272550c755e4cc9552eab05bb6a0139bd4cd31914d4a2e"}}}
02/10/2020 07:05:45.464 [CONNEXT] warn: received unexpected incoming transfer created event with rHash 47bcf1891136817648272550c755e4cc9552eab05bb6a0139bd4cd31914d4a2e, units: 40000000000000, timelock 200, token address 0x0000000000000000000000000000000000000000, and paymentId 0x5fc5efb63315fdf8bfb9ff638ded40a90cbad4f5389fc4ada3c214cbe8199727
02/10/2020 07:05:46.076 [SWAPS] info: CUSTOM_SCENARIO: INSTABILITY::MAKER_CLIENT_CRASH_WHILE_SENDING
02/10/2020 07:05:46.076 [SWAPS] info: CLIENT_TYPE ConnextClient, CLIENT_PID: 5194
02/10/2020 07:05:46.091 [RPC] trace: received call /xudrpc.Xud/GetBalance
02/10/2020 07:05:46.098 [RPC] trace: call /xudrpc.Xud/GetBalance succeeded
02/10/2020 07:05:47.573 [CONNEXT] trace: sending request to /balance/0x0000000000000000000000000000000000000000
02/10/2020 07:05:47.576 [CONNEXT] info: new status: Disconnected
02/10/2020 07:05:47.576 [CONNEXT] error: Error: connect ECONNREFUSED 0.0.0.0:34713
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
02/10/2020 07:05:47.577 [CONNEXT] error: failed to update total outbound capacity: Error: connect ECONNREFUSED 0.0.0.0:34713
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
02/10/2020 07:05:51.575 [CONNEXT] info: trying to verify connection to connext
02/10/2020 07:05:51.576 [CONNEXT] trace: sending request to /health
02/10/2020 07:05:51.578 [CONNEXT] error: Error: connect ECONNREFUSED 0.0.0.0:34713
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
02/10/2020 07:05:51.578 [CONNEXT] error: could not verify connection to connext, retrying in 5000 ms: Error: connect ECONNREFUSED 0.0.0.0:34713
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
02/10/2020 07:05:56.580 [CONNEXT] info: trying to verify connection to connext
02/10/2020 07:05:56.580 [CONNEXT] trace: sending request to /health
02/10/2020 07:05:56.582 [CONNEXT] error: Error: connect ECONNREFUSED 0.0.0.0:34713
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
02/10/2020 07:05:56.582 [CONNEXT] error: could not verify connection to connext, retrying in 5000 ms: Error: connect ECONNREFUSED 0.0.0.0:34713
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
02/10/2020 07:06:01.583 [CONNEXT] info: trying to verify connection to connext
02/10/2020 07:06:01.583 [CONNEXT] trace: sending request to /health
02/10/2020 07:06:01.584 [CONNEXT] error: Error: connect ECONNREFUSED 0.0.0.0:34713
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
02/10/2020 07:06:01.585 [CONNEXT] error: could not verify connection to connext, retrying in 5000 ms: Error: connect ECONNREFUSED 0.0.0.0:34713
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
02/10/2020 07:06:02.227 [RPC] trace: received call /xudrpc.Xud/GetInfo
02/10/2020 07:06:02.238 [CONNEXT] trace: sending request to /version
02/10/2020 07:06:02.238 [CONNEXT] trace: sending request to /config
02/10/2020 07:06:02.248 [RPC] trace: call /xudrpc.Xud/GetInfo succeeded
02/10/2020 07:06:06.586 [CONNEXT] info: trying to verify connection to connext
02/10/2020 07:06:06.587 [CONNEXT] trace: sending request to /health
02/10/2020 07:06:06.590 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"sponsor enrich spin artwork cradle clean error near idea found gospel open satisfy firm dilemma issue maximum trouble juice govern curve heart large enjoy"}
02/10/2020 07:06:06.605 [CONNEXT] trace: sending request to /connect: {"mnemonic":"sponsor enrich spin artwork cradle clean error near idea found gospel open satisfy firm dilemma issue maximum trouble juice govern curve heart large enjoy"}
02/10/2020 07:06:06.608 [CONNEXT] trace: sending request to /subscribe: {"event":"CONDITIONAL_TRANSFER_UNLOCKED_EVENT","webhook":"http://127.0.0.1:33605/preimage"}
02/10/2020 07:06:06.622 [CONNEXT] trace: sending request to /subscribe: {"event":"CONDITIONAL_TRANSFER_CREATED_EVENT","webhook":"http://127.0.0.1:33605/incoming-transfer"}
02/10/2020 07:06:06.633 [CONNEXT] trace: sending request to /subscribe: {"event":"DEPOSIT_CONFIRMED_EVENT","webhook":"http://127.0.0.1:33605/deposit-confirmed"}
02/10/2020 07:06:06.646 [CONNEXT] info: new status: ConnectionVerified
02/10/2020 07:06:06.647 [P2P] trace: Sent NodeStateUpdate packet to 02d637dcc5c5a646325c14aebb2a57de736bba3cb1095043bd3e27b2c8bf6ba77e (FruitType): "{\"body\":{\"addresses\":[{\"host\":\"127.0.0.1\",\"port\":35773}],\"connextIdentifier\":\"indra5b3QBEELA4YXUJjoSmDdMA625K4f7XyJ7Avq131yURcCUE84JX\",\"lndPubKeys\":{\"BTC\":\"02d11b334adf21679111b056724a9440421f47e8e94861a606c149f2c942ac26ac\",\"LTC\":\"03a6fdea14b24745be8834e061c476cee5b3a1769224b8f52c5bcbe7d290d7b85e\"},\"lndUris\":{\"BTC\":[\"02d11b334adf21679111b056724a9440421f47e8e94861a606c149f2c942ac26ac@127.0.0.1:43415\"],\"LTC\":[\"03a6fdea14b24745be8834e061c476cee5b3a1769224b8f52c5bcbe7d290d7b85e@127.0.0.1:39391\"]},\"pairs\":[\"LTC/BTC\",\"BTC/ETH\"],\"tokenIdentifiers\":{\"BTC\":\"bitcoin-simnet\",\"ETH\":\"0x0000000000000000000000000000000000000000\",\"LTC\":\"litecoin-simnet\"}},\"header\":{\"id\":\"bdc96e50-047d-11eb-9a69-81bf42dec0c3\"}}"
02/10/2020 07:06:08.578 [CONNEXT] trace: sending request to /balance/0x0000000000000000000000000000000000000000
02/10/2020 07:06:08.606 [CONNEXT] trace: new inbound capacity (collateral) for ETH of 10004000
02/10/2020 07:06:11.577 [CONNEXT] trace: sending request to /balance/0x0000000000000000000000000000000000000000
02/10/2020 07:06:11.605 [CONNEXT] trace: new inbound capacity (collateral) for ETH of 10004000
02/10/2020 07:06:13.054 [P2P] trace: Sent Ping packet to 02d637dcc5c5a646325c14aebb2a57de736bba3cb1095043bd3e27b2c8bf6ba77e (FruitType): "{\"header\":{\"id\":\"c19b36d0-047d-11eb-9a69-81bf42dec0c3\"}}"
02/10/2020 07:06:13.058 [P2P] trace: Received Pong packet from 02d637dcc5c5a646325c14aebb2a57de736bba3cb1095043bd3e27b2c8bf6ba77e (FruitType): "{\"header\":{\"id\":\"c19bac00-047d-11eb-918f-b1983d10f3af\",\"reqId\":\"c19b36d0-047d-11eb-9a69-81bf42dec0c3\"}}"
02/10/2020 07:06:13.059 [P2P] trace: Received Ping packet from 02d637dcc5c5a646325c14aebb2a57de736bba3cb1095043bd3e27b2c8bf6ba77e (FruitType): "{\"header\":{\"id\":\"c19bac01-047d-11eb-918f-b1983d10f3af\"}}"
02/10/2020 07:06:13.060 [P2P] trace: Sent Pong packet to 02d637dcc5c5a646325c14aebb2a57de736bba3cb1095043bd3e27b2c8bf6ba77e (FruitType): "{\"header\":{\"id\":\"c19c2130-047d-11eb-9a69-81bf42dec0c3\",\"reqId\":\"c19bac01-047d-11eb-918f-b1983d10f3af\"}}"
02/10/2020 07:06:14.579 [CONNEXT] trace: sending request to /balance/0x0000000000000000000000000000000000000000
02/10/2020 07:06:14.605 [CONNEXT] trace: new inbound capacity (collateral) for ETH of 10004000
02/10/2020 07:06:17.559 [CONNEXT] trace: sending request to /balance/0x0000000000000000000000000000000000000000
02/10/2020 07:06:17.588 [CONNEXT] trace: new inbound capacity (collateral) for ETH of 10004000
02/10/2020 07:06:20.559 [CONNEXT] trace: sending request to /balance/0x0000000000000000000000000000000000000000
02/10/2020 07:06:20.584 [CONNEXT] trace: new inbound capacity (collateral) for ETH of 10004000
02/10/2020 07:06:23.559 [CONNEXT] trace: sending request to /balance/0x0000000000000000000000000000000000000000
02/10/2020 07:06:23.585 [CONNEXT] trace: new inbound capacity (collateral) for ETH of 10004000
02/10/2020 07:06:26.558 [CONNEXT] trace: sending request to /balance/0x0000000000000000000000000000000000000000
02/10/2020 07:06:26.584 [CONNEXT] trace: new inbound capacity (collateral) for ETH of 10004000
02/10/2020 07:06:29.559 [CONNEXT] trace: sending request to /balance/0x0000000000000000000000000000000000000000
02/10/2020 07:06:29.586 [CONNEXT] trace: new inbound capacity (collateral) for ETH of 10004000
02/10/2020 07:06:32.228 [RPC] trace: received call /xudrpc.Xud/GetBalance
02/10/2020 07:06:32.235 [RPC] trace: call /xudrpc.Xud/GetBalance succeeded
02/10/2020 07:06:32.519 [LND-BTC] trace: error on pendingChannels: 2 UNKNOWN: the client has been shutdown
02/10/2020 07:06:32.519 [LND-BTC] error: failed to update total outbound capacity: Error: 2 UNKNOWN: the client has been shutdown
    at Object.exports.createStatusError (/custom-xud-vol/node_modules/grpc/src/common.js:91:15)
    at Object.onReceiveStatus (/custom-xud-vol/node_modules/grpc/src/client_interceptors.js:1209:28)
    at InterceptingListener._callNext (/custom-xud-vol/node_modules/grpc/src/client_interceptors.js:568:42)
    at InterceptingListener.onReceiveStatus (/custom-xud-vol/node_modules/grpc/src/client_interceptors.js:618:8)
    at callback (/custom-xud-vol/node_modules/grpc/src/client_interceptors.js:847:24)
02/10/2020 07:06:32.520 [LND-BTC] trace: error on listChannels: 2 UNKNOWN: channel event store shutting down
02/10/2020 07:06:32.558 [CONNEXT] trace: sending request to /balance/0x0000000000000000000000000000000000000000
02/10/2020 07:06:32.572 [LND-BTC] debug: deleted invoice subscription for 47bcf1891136817648272550c755e4cc9552eab05bb6a0139bd4cd31914d4a2e
02/10/2020 07:06:32.588 [CONNEXT] trace: new inbound capacity (collateral) for ETH of 10004000
02/10/2020 07:06:33.356 [RPC] trace: received call /xudrpc.Xud/Shutdown
02/10/2020 07:06:33.357 [GLOBAL] info: XUD is shutting down
02/10/2020 07:06:33.358 [LND-BTC] info: new status: Disconnected
02/10/2020 07:06:33.358 [LND-LTC] info: new status: Disconnected
02/10/2020 07:06:33.359 [CONNEXT] info: new status: Disconnected
02/10/2020 07:06:33.360 [P2P] debug: Peer 02d637dcc5c5a646325c14aebb2a57de736bba3cb1095043bd3e27b2c8bf6ba77e (FruitType): closing socket. reason: Shutdown
02/10/2020 07:06:33.362 [HTTP] info: http server has closed
02/10/2020 07:06:33.362 [RPC] trace: call /xudrpc.Xud/Shutdown succeeded
02/10/2020 07:06:33.363 [P2P] trace: Sent Disconnecting packet to 02d637dcc5c5a646325c14aebb2a57de736bba3cb1095043bd3e27b2c8bf6ba77e (FruitType): "{\"body\":{\"reason\":9},\"header\":{\"id\":\"cdb5d100-047d-11eb-9a69-81bf42dec0c3\"}}"
02/10/2020 07:06:33.365 [ORDERBOOK] debug: removed all orders for peer 02d637dcc5c5a646325c14aebb2a57de736bba3cb1095043bd3e27b2c8bf6ba77e (FruitType)
02/10/2020 07:06:33.368 [RPC] info: gRPC server completed shutdown
02/10/2020 07:06:33.370 [GLOBAL] info: XUD shutdown gracefully

@sangaman
Copy link
Collaborator

sangaman commented Oct 9, 2020

The test case doesn't pass anymore, probably due to recent changes.

As described, the test simulate the maker's connext client crash while taker is delaying the settlement. After the settlement, connext client is re-started and connection is verified. But then the recovery procedure doesn't kicking in, although its interval was adjusted to 2sec. Looks like there's an error connecting with LND-BTC, although its still alive, and later the invoice subscription is being deleted.

The test fails because the maker's BTC funds aren't recovered, and the channel remains with active htlcs.

@sangaman @erkarl any ideas?

In those logs it does look like the Connext client gets shut down but it doesn't look like the send payment call to Connext actually fails - I don't see the logging statement from this line https://github.com/ExchangeUnion/xud/blob/master/lib/swaps/Swaps.ts#L1057 that I'd expect if a sendpayment call failed. I'm honestly not sure why that is - could it be that the client is down too briefly?

If the call doesn't fail, it will get stuck waiting for a response from that call which appears to be what is happening. I have no explanation for why LND-BTC decides to fail after a while. Maybe it's hitting some timeout within the simulation test? As for why the invoice gets deleted, that's what we do when the invoice subscription to lnd gives us an error.

@kilrau kilrau marked this pull request as draft October 19, 2020 13:48
@kilrau
Copy link
Contributor

kilrau commented Dec 2, 2020

Let's leave this open as draft as-is and give this another shot after switch to vector like you said @LePremierHomme

@sangaman sangaman removed their request for review October 5, 2022 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants