Open
Labels
priority/low: Nice to have but not critical to the project's success
type/bug: Something isn't working correctly or is broken
Description
The Beat process crashes intermittently with the following traceback:
[2025-12-06 11:36:55,055: ERROR/Beat] Process Beat
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/dbm/sqlite3.py", line 83, in _execute
    return closing(self._cx.execute(*args, **kwargs))
                   ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: disk I/O error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.13/site-packages/billiard/process.py", line 323, in _bootstrap
    self.run()
    ~~~~~~~~^^
  File "/usr/local/lib/python3.13/site-packages/celery/beat.py", line 718, in run
    self.service.start(embedded_process=True)
    ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/celery/beat.py", line 649, in start
    self.scheduler._do_sync()
    ~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/local/lib/python3.13/site-packages/celery/beat.py", line 428, in _do_sync
    self.sync()
    ~~~~~~~~~^^
  File "/usr/local/lib/python3.13/site-packages/celery/beat.py", line 599, in sync
    self._store.sync()
    ~~~~~~~~~~~~~~~~^^
  File "/usr/local/lib/python3.13/shelve.py", line 168, in sync
    self[key] = entry
    ~~~~^^^^^
  File "/usr/local/lib/python3.13/shelve.py", line 125, in __setitem__
    self.dict[key.encode(self.keyencoding)] = f.getvalue()
    ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/dbm/sqlite3.py", line 100, in __setitem__
    self._execute(STORE_KV, (key, value))
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/dbm/sqlite3.py", line 85, in _execute
    raise error(str(exc))
dbm.sqlite3.error: disk I/O error
I found a possible cause.
Celery beat persists its schedule through shelve, which on Python 3.13 is backed by dbm.sqlite3 (see the traceback above), and this conflicts with how a helm upgrade proceeds.
Helm starts the new resources first and only then terminates the old ones, so during an upgrade beat is unable to write to the schedule database, presumably because the old and new instances overlap on the same file. As a workaround, I removed the schedule database and then ran the upgrade. So far it seems to work fine.
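For anyone who wants to script the workaround, here is a minimal sketch of removing the schedule file before beat starts. The helper name `remove_schedule_files` and the suffix list are my assumptions, not something from the chart: the list covers the bare file, the usual dbm sidecar extensions, and the sidecar files SQLite may create. Adjust the path to wherever your deployment keeps the schedule.

```python
import os
import tempfile

# Assumed suffixes: the bare schedule file, common dbm sidecars,
# and the journal/WAL files SQLite may leave next to the database.
SUFFIXES = ("", ".db", ".dat", ".bak", ".dir", "-journal", "-wal", "-shm")


def remove_schedule_files(schedule_path):
    """Delete the schedule file and any sidecars; return the paths removed."""
    removed = []
    for suffix in SUFFIXES:
        candidate = schedule_path + suffix
        try:
            os.remove(candidate)
        except FileNotFoundError:
            # This variant does not exist; nothing to clean up.
            continue
        removed.append(candidate)
    return removed


if __name__ == "__main__":
    # Demo against a throwaway directory instead of a real deployment.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "celerybeat-schedule")
        for suffix in ("", "-wal"):
            open(path + suffix, "wb").close()
        print(remove_schedule_files(path))
```

Keep in mind that deleting the file also discards the stored last-run times, so beat starts again from the configured schedule.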
Proposal
Use Redis for scheduler persistence.
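One possible way to do this, as a sketch only: the third-party celery-redbeat package provides a Redis-backed scheduler. The choice of that package, the helper name `redis_beat_settings`, the `REDIS_URL` environment variable, and the default URL are all my assumptions; the project may prefer a different Redis scheduler.

```python
import os


def redis_beat_settings(redis_url=None):
    """Build Celery settings that point beat at a Redis-backed scheduler.

    Assumes the celery-redbeat package is installed in the beat image.
    """
    # REDIS_URL and the fallback below are placeholder assumptions.
    url = redis_url or os.environ.get("REDIS_URL", "redis://localhost:6379/1")
    return {
        # Replace the default file-based PersistentScheduler.
        "beat_scheduler": "redbeat.RedBeatScheduler",
        # Where RedBeat keeps the schedule state.
        "redbeat_redis_url": url,
    }


if __name__ == "__main__":
    print(redis_beat_settings("redis://redis:6379/1"))
```

The resulting dict could be applied with `app.conf.update(redis_beat_settings())` on the existing Celery app. With the schedule in Redis there is no local file for the old and new beat pods to overlap on during an upgrade.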