Commit 215da6a

author
Manav Kumar
committed
[yugabyte#28102] YSQL: de-refer the rule before unlocking the route
Summary:
On running the connection burst test, the following core was generated:

(lldb) target create "/home/yugabyte/yb-software/yugabyte-2024.2.3.0-b116-centos-x86_64/bin/odyssey" --core "/home/yugabyte/cores/core_41219_1752696376_!home!yugabyte!yb-software!yugabyte-2024.2.3.0-b116-centos-x86_64!bin!odyssey"
Core file '/home/yugabyte/cores/core_41219_1752696376_!home!yugabyte!yb-software!yugabyte-2024.2.3.0-b116-centos-x86_64!bin!odyssey' (x86_64) was loaded.
(lldb) bt all
error: odyssey GetDIE for DIE 0x3c is outside of its CU 0x66d45
* thread #1, name = 'odyssey', stop reason = signal SIGSEGV
  * frame #0: 0x0000564340e2cc6f odyssey`od_backend_connect(server=0x00005138fc5ef6c0, context="", route_params=0x0000000000000000, client=0x00005138ff7a2580) at backend.c:815:19
    frame #1: 0x0000564340e2a80e odyssey`od_frontend_attach(client=0x00005138ff7a2580, context="", route_params=0x0000000000000000) at frontend.c:305:8
    frame #2: 0x0000564340e26b11 odyssey`od_frontend_remote [inlined] od_frontend_attach_and_deploy(client=0x00005138ff7a2580, context=<unavailable>) at frontend.c:361:11
    frame #3: 0x0000564340e26afe odyssey`od_frontend_remote(client=0x00005138ff7a2580) at frontend.c:2120:13
    frame #4: 0x0000564340e22d65 odyssey`od_frontend(arg=0x00005138ff7a2580) at frontend.c:2756:12
    frame #5: 0x0000564340e4b912 odyssey`mm_scheduler_main(arg=0x00005138fc218dc0) at scheduler.c:17:2
    frame #6: 0x0000564340e4bb77 odyssey`mm_context_runner at context.c:28:2

The top frame points to

storage = route->rule->storage;

meaning the rule had already been set to NULL, which led to the crash above.

The root cause is a race condition in object cleanup. While cleaning up a route, the rule associated with it was unreferenced (unref) outside the lock protecting the route object. This allows a scenario where one thread proceeds to clean up the rule while another thread simultaneously acquires the lock on the same route and attempts to use its rule pointer, which is by then a dangling pointer.

This diff moves the unref of the rule object into the code block where the lock on the route object is already held. This change ensures atomic handling of the route and its associated rule, preventing any concurrent access to an invalid pointer.

Jira: DB-17729

Test Plan: Jenkins: all tests

Reviewers: skumar, vikram.damle, asrinivasan, arpit.saxena

Reviewed By: skumar

Subscribers: svc_phabricator, yql

Differential Revision: https://phorge.dev.yugabyte.com/D45583
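
The locking pattern described in the summary can be illustrated with a minimal, self-contained sketch. This is not Odyssey code: `demo_rule_t`, `demo_route_t`, and every `demo_*` function are hypothetical names introduced only for illustration, and clearing the pointer to NULL after the unref is an assumption of the sketch rather than something the commit states. The point it shows is that the cleanup path drops the rule reference while still holding the route lock, so a reader that takes the same lock never observes a half-released rule.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared, reference-counted rule. */
typedef struct {
	atomic_int refs;
} demo_rule_t;

/* Hypothetical stand-in for a route that holds one reference to a rule. */
typedef struct {
	pthread_mutex_t lock;
	demo_rule_t *rule;
} demo_route_t;

/* Create a route that takes one reference on the given rule. */
static demo_route_t *demo_route_new(demo_rule_t *rule)
{
	demo_route_t *route = malloc(sizeof(*route));
	if (route == NULL)
		return NULL;
	pthread_mutex_init(&route->lock, NULL);
	atomic_fetch_add(&rule->refs, 1);
	route->rule = rule;
	return route;
}

/* Reader path: use route->rule only while holding the route lock.
 * Returns the rule's reference count, or -1 if the rule was released. */
static int demo_route_rule_refs(demo_route_t *route)
{
	int refs = -1;
	pthread_mutex_lock(&route->lock);
	if (route->rule != NULL)
		refs = atomic_load(&route->rule->refs);
	pthread_mutex_unlock(&route->lock);
	return refs;
}

/* Cleanup path: unref the rule BEFORE unlocking the route, so no
 * reader can acquire the lock and see a rule that is being released. */
static void demo_route_release_rule(demo_route_t *route)
{
	pthread_mutex_lock(&route->lock);
	if (route->rule != NULL) {
		assert(atomic_load(&route->rule->refs) > 0);
		atomic_fetch_sub(&route->rule->refs, 1);
		route->rule = NULL; /* sketch-only assumption */
	}
	pthread_mutex_unlock(&route->lock);
}

/* Free the route object once its rule reference has been dropped. */
static void demo_route_free(demo_route_t *route)
{
	pthread_mutex_destroy(&route->lock);
	free(route);
}
```

In this sketch a caller would invoke `demo_route_release_rule()` and then `demo_route_free()`, mirroring the unref, unlock, free order that the diff establishes.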
1 parent 0c5ebd2 commit 215da6a

1 file changed

+3
-2
lines changed

src/odyssey/sources/router.c

Lines changed: 3 additions & 2 deletions
@@ -384,10 +384,11 @@ static inline int od_router_gc_cb(od_route_t *route, void **argv)
 		instance->yb_stats[index].user_oid = -1;
 	}
 
+	/* unref route rule */
+	od_rules_unref(route->rule);
 	od_route_unlock(route);
 
-	/* unref route rule and free route object */
-	od_rules_unref(route->rule);
+	/* free route object */
 	od_route_free(route);
 	return 0;
 done:
