SQLAlchemy error "Lost connection to MySQL server during query" and how to fix it

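I recently ran into a SQLAlchemy "lost connection" error during development. This post records how I solved it.
Error message

The backend is written in Python with FastAPI + SQLAlchemy. One API request failed with the following error: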
[2023-03-24 06:36:35 +0000] [217] [ERROR] Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.8/dist-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/fastapi/applications.py", line 199, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/errors.py", line 181, in __call__
    raise exc from None
  File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/base.py", line 26, in __call__
    await response(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/responses.py", line 224, in __call__
    await run_until_first_complete(
  File "/usr/local/lib/python3.8/dist-packages/starlette/concurrency.py", line 24, in run_until_first_complete
    [task.result() for task in done]
  File "/usr/local/lib/python3.8/dist-packages/starlette/concurrency.py", line 24, in <listcomp>
    [task.result() for task in done]
  File "/usr/local/lib/python3.8/dist-packages/starlette/responses.py", line 216, in stream_response
    async for chunk in self.body_iterator:
  File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/base.py", line 56, in body_stream
    task.result()
  File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/base.py", line 38, in coro
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/starlette_exporter/middleware.py", line 289, in __call__
    await self.app(scope, receive, wrapped_send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/exceptions.py", line 82, in __call__
    raise exc from None
  File "/usr/local/lib/python3.8/dist-packages/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 580, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 241, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 55, in app
    await response(scope, receive, send)
  File "/usr/local/lib/python3.8/dist-packages/starlette/responses.py", line 146, in __call__
    await self.background()
  File "/usr/local/lib/python3.8/dist-packages/starlette/background.py", line 35, in __call__
    await task()
  File "/usr/local/lib/python3.8/dist-packages/starlette/background.py", line 20, in __call__
    await run_in_threadpool(self.func, *self.args, **self.kwargs)
  File "/usr/local/lib/python3.8/dist-packages/starlette/concurrency.py", line 40, in run_in_threadpool
    return await loop.run_in_executor(None, func, *args)
  File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/app/ymir_app/app/libs/datasets.py", line 330, in ats_import_dataset_in_backgroud
    task = crud.task.create_placeholder(
  File "/app/ymir_app/app/crud/crud_task.py", line 81, in create_placeholder
    db.commit()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 1428, in commit
    self._transaction.commit(_to_root=self.future)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 829, in commit
    self._prepare_impl()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 808, in _prepare_impl
    self.session.flush()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 3298, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 3438, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 3398, in _flush
    flush_context.execute()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
    rec.execute(self)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/unitofwork.py", line 630, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/persistence.py", line 242, in save_obj
    _emit_insert_statements(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/persistence.py", line 1219, in _emit_insert_statements
    result = connection._execute_20(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1582, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1451, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1813, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1994, in _handle_dbapi_exception
    util.raise_(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1770, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line 717, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/local/lib/python3.8/dist-packages/pymysql/cursors.py", line 148, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.8/dist-packages/pymysql/cursors.py", line 310, in _query
    conn.query(q)
  File "/usr/local/lib/python3.8/dist-packages/pymysql/connections.py", line 548, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.8/dist-packages/pymysql/connections.py", line 775, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.8/dist-packages/pymysql/connections.py", line 1156, in read
    first_packet = self.connection._read_packet()
  File "/usr/local/lib/python3.8/dist-packages/pymysql/connections.py", line 692, in _read_packet
    packet_header = self._read_bytes(4)
  File "/usr/local/lib/python3.8/dist-packages/pymysql/connections.py", line 748, in _read_bytes
    raise err.OperationalError(
sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')
[SQL: INSERT INTO task (name, hash, type, state, parameters, config, percent, duration, error_code, user_id, project_id, dataset_id, model_stage_id, is_terminated, is_deleted, last_message_datetime, create_datetime, update_datetime) VALUES (%(name)s, %(hash)s, %(type)s, %(state)s, %(parameters)s, %(config)s, %(percent)s, %(duration)s, %(error_code)s, %(user_id)s, %(project_id)s, %(dataset_id)s, %(model_stage_id)s, %(is_terminated)s, %(is_deleted)s, %(last_message_datetime)s, %(create_datetime)s, %(update_datetime)s)]
[parameters: {'name': 't0000001000012b2ae341679639795', 'hash': 't0000001000012b2ae341679639795', 'type': 5, 'state': 1, 'parameters': '{"group_name": "from_ats_6579a9116a", "description": null, "project_id": 12, "input_url": null, "input_dataset_id": null, "input_dataset_name": null, "input_path": "/data/ymir-workplace/ymir-sharing/3c87e23bb8904b638a9479d6e68aea23", "strategy": 4, "source": 5, "import_type": 5}', 'config': None, 'percent': 0, 'duration': None, 'error_code': None, 'user_id': 1, 'project_id': 12, 'dataset_id': None, 'model_stage_id': None, 'is_terminated': 0, 'is_deleted': 0, 'last_message_datetime': datetime.datetime(2023, 3, 24, 6, 36, 35, 351864), 'create_datetime': datetime.datetime(2023, 3, 24, 6, 36, 35, 351870), 'update_datetime': datetime.datetime(2023, 3, 24, 6, 36, 35, 351873)}]
(Background on this error at: http://sqlalche.me/e/14/e3q8)
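The key error is:
sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')
Searching online turned up many suggested fixes, including:

  • Set SQLAlchemy's connection recycle time to 10 minutes with pool_recycle
    engine = create_engine(url, pool_recycle=600)
  • Test each connection when it is checked out of the pool with pool_pre_ping
    engine = create_engine("mysql+pymysql://user:pw@host/db", pool_pre_ping=True, pool_recycle=1800)
  • Avoid the connection pool
    engine = create_engine("mysql+pymysql://user:pw@host/db", pool_pre_ping=True, pool_recycle=-1)
    (Note: pool_recycle=-1 is the default and only turns recycling off. Disabling pooling itself requires poolclass=NullPool.)
  • Check the connection timeout configured on the database, such as MySQL's wait_timeout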
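For reference, the pool options from the list can be put side by side in a minimal sketch. The SQLite URL is a placeholder chosen only so the snippet runs without a MySQL server (an assumption, not the original setup), and the variable names are illustrative. NullPool is included because it is the SQLAlchemy option that actually turns pooling off.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.pool import NullPool

# Placeholder URL (assumption): replace with something like
# "mysql+pymysql://user:pw@host/db" in a real deployment.
URL = "sqlite://"

# Replace connections older than 10 minutes and test each connection
# when it is checked out of the pool.
pooled_engine = create_engine(URL, pool_recycle=600, pool_pre_ping=True)

# No pooling at all: every checkout opens a new DBAPI connection,
# and releasing it closes the connection.
unpooled_engine = create_engine(URL, poolclass=NullPool)

# Both checks happen here, at checkout time.
with pooled_engine.connect() as conn:
    conn.execute(text("SELECT 1"))
```

Note that both pool_recycle and pool_pre_ping act at the moment a connection is checked out, not while it is already in use.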
None of these steps solved the problem, so I took a closer look at why the error occurs.
Analyzing the cause

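Taken literally, the error means the database connection was lost during a query. The connection in question is the one held by the session. This endpoint runs a heavy task: it downloads a large dataset, usually more than 20 GB, so it is designed as an asynchronous endpoint. When the request arrives, the endpoint obtains a database session, handles the simple work, returns a success status right away, and leaves the time-consuming work to a background task. The background task here is FastAPI's built-in feature, intended for small time-consuming jobs such as sending emails. The lost connection happens inside the background task.
The task flow, abstracted: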

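  • The session is obtained when the user calls the endpoint
  • The asynchronous endpoint returns immediately
  • The background task downloads the dataset for about 30 minutes
  • When the download finishes, the task updates the status in the database, and the error occurs.
So the flow shows that the error is caused by holding the session for too long. The session is obtained at the start of the request and passed into the background task. When it is used again about 30 minutes later, the lost connection error occurs.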
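The timeline can be reproduced with a toy model. This is not SQLAlchemy or MySQL code: ToyServer and ToyConnection are hypothetical classes, the clock is simulated, and the 600-second idle timeout is an assumed value, since the original setup does not state what closed the connection.

```python
class ToyServer:
    """Drops any connection that has been idle longer than wait_timeout seconds."""

    def __init__(self, wait_timeout):
        self.wait_timeout = wait_timeout
        self.now = 0.0  # simulated clock, in seconds


class ToyConnection:
    """A connection that only notices it was dropped when it is used again."""

    def __init__(self, server):
        self.server = server
        self.last_used = server.now

    def execute(self, sql):
        idle = self.server.now - self.last_used
        if idle > self.server.wait_timeout:
            raise ConnectionError("Lost connection to MySQL server during query")
        self.last_used = self.server.now
        return "ok"


server = ToyServer(wait_timeout=600)   # assumed timeout: 10 minutes
held = ToyConnection(server)           # obtained when the request arrives
held.execute("SELECT ...")             # quick work before the response is sent
server.now += 30 * 60                  # background download, no database traffic

try:
    held_outcome = held.execute("INSERT ...")
except ConnectionError as exc:
    held_outcome = str(exc)

fresh = ToyConnection(server)          # obtained only after the download
fresh_outcome = fresh.execute("INSERT ...")

print(held_outcome)   # Lost connection to MySQL server during query
print(fresh_outcome)  # ok
```

The connection that sat idle through the download fails on its next use, while one opened after the download succeeds.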
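Solution

Once the cause is known, the remedy is straightforward: when the background task updates the database after the 30-minute download, obtain a new session instead of reusing the earlier one. That solved the problem.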
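A minimal sketch of that approach follows. It is not the original project's code: fresh_session, download_dataset, and import_dataset_in_background are hypothetical names, the SQLite URL is a placeholder so the snippet runs without MySQL, and the SELECT 1 stands in for the real status update.

```python
from contextlib import contextmanager

from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

# Placeholder URL (assumption); use the real MySQL URL in practice.
engine = create_engine("sqlite://", pool_pre_ping=True)
SessionLocal = sessionmaker(bind=engine)


@contextmanager
def fresh_session():
    """Yield a short-lived session and always release its connection."""
    db = SessionLocal()
    try:
        yield db
        db.commit()
    except Exception:
        db.rollback()
        raise
    finally:
        db.close()


def download_dataset(path):
    """Hypothetical stand-in for the long download; it touches no database."""
    return f"downloaded:{path}"


def import_dataset_in_background(path):
    # The request's session is deliberately not passed in.
    result = download_dataset(path)
    # Open a session only now, when the database is actually needed.
    with fresh_session() as db:
        db.execute(text("SELECT 1"))  # stand-in for the status update
    return result
```

With FastAPI, such a function would be scheduled with background_tasks.add_task(import_dataset_in_background, path), passing plain values rather than the session. Because the session is created after the download, its connection goes through the pool's checkout at that point, which is when pool_pre_ping and pool_recycle apply.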
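The cause was not obvious at first because, according to the official documentation, pool_recycle is responsible for recycling connections, and combining it with pool_pre_ping, which tests a connection before it is used, should prevent this kind of disconnect. Yet the configured pool_recycle did not seem to take effect. One plausible explanation is that both options act only when a connection is checked out of the pool, so they do nothing for a connection that a long-lived session already holds.
The problem may well come from a mistake in my own configuration, but the approach is still useful for this class of problem: when troubleshooting a similar error, ask whether a session is being held for too long.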
Appendix: the guessing process



Source: https://www.cnblogs.com/goldsunshine/p/17304427.html