
When a running query in ClickHouse will not terminate

maengis 2023. 10. 20. 11:31

ALTER requests to ClickHouse were failing with the error below.

clickhouse_driver.errors.ServerException: Code: 159.
DB::Exception: Watching task /clickhouse/task_queue/ddl/query-0000363944 is executing longer than distributed_ddl_task_timeout (=180) seconds. There are 1 unfinished hosts (0 of them are currently active), they are going to execute the query in background. Stack trace:

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c7498f7 in /usr/bin/clickhouse
1. DB::Exception::Exception<String&, long&, unsigned long&, unsigned long&>(int, FormatStringHelperImpl<std::type_identity<String&>::type, std::type_identity<long&>::type, std::type_identity<unsigned long&>::type, std::type_identity<unsigned long&>::type>, String&, long&, unsigned long&, unsigned long&) @ 0x000000001248ef29 in /usr/bin/clickhouse
2. DB::DDLQueryStatusSource::generate() @ 0x000000001248d953 in /usr/bin/clickhouse
3. DB::ISource::tryGenerate() @ 0x0000000013393258 in /usr/bin/clickhouse
4. DB::ISource::work() @ 0x0000000013392d8a in /usr/bin/clickhouse
5. DB::ExecutionThreadContext::executeTask() @ 0x00000000133aac5a in /usr/bin/clickhouse
6. DB::PipelineExecutor::executeStepImpl(unsigned long, std::atomic<bool>*) @ 0x00000000133a1710 in /usr/bin/clickhouse
7. DB::PipelineExecutor::execute(unsigned long, bool) @ 0x00000000133a09a0 in /usr/bin/clickhouse
8. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<true>::ThreadFromGlobalPoolImpl<DB::PullingAsyncPipelineExecutor::pull(DB::Chunk&, unsigned long)::$_0>(DB::PullingAsyncPipelineExecutor::pull(DB::Chunk&, unsigned long)::$_0&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x00000000133ae66f in /usr/bin/clickhouse
9. void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>>(void*) @ 0x000000000c832d27 in /usr/bin/clickhouse
10. start_thread @ 0x0000000000007dd5 in /usr/lib64/libpthread-2.17.so
11. __clone @ 0x00000000000fe02d in /usr/lib64/libc-2.17.so
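The message already names the blocked DDL queue entry and how many hosts have not finished it. As a rough sketch, the parser below pulls those values out so they can be logged or alerted on. The regular expression is my own and is based only on the wording of the message above; other ClickHouse versions may phrase it differently.

```python
import re

# Pattern for the Code 159 message shown above. The exact wording is an
# assumption taken from this one log line and may vary between versions.
DDL_TIMEOUT_RE = re.compile(
    r"Watching task (?P<task>\S+) is executing longer than "
    r"distributed_ddl_task_timeout \(=(?P<timeout>\d+)\) seconds\. "
    r"There are (?P<unfinished>\d+) unfinished hosts "
    r"\((?P<active>\d+) of them are currently active\)"
)


def parse_ddl_timeout(message: str):
    """Return the task path, timeout and host counts, or None if no match."""
    m = DDL_TIMEOUT_RE.search(message)
    if m is None:
        return None
    return {
        "task": m.group("task"),
        "timeout_seconds": int(m.group("timeout")),
        "unfinished_hosts": int(m.group("unfinished")),
        "active_hosts": int(m.group("active")),
    }
```

Calling `parse_ddl_timeout(str(exc))` on the exception above should give the task path `/clickhouse/task_queue/ddl/query-0000363944` and a timeout of 180 seconds.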

 

On the ClickHouse side, one particular ALTER query was stuck in the running state, so later DDL queries were not being processed.

I looked up the query ID with SHOW PROCESSLIST and ran KILL QUERY WHERE query_id = '<query_id>' ASYNC, but the query was not killed.
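A minimal sketch of that lookup and kill step in Python. It reads `system.processes` (the table behind SHOW PROCESSLIST) and also returns the `address` column, which is the host the query came from. The function names, the 180-second threshold and the `ILIKE 'ALTER%'` filter are my own choices for illustration. `client` is assumed to be anything with an `execute(sql)` method that returns rows, such as `clickhouse_driver.Client`. KILL QUERY only requests termination, so a query can still stay alive after this.

```python
# Long-running ALTERs, oldest first, with the client address that sent them.
LONG_RUNNING_ALTERS_SQL = """
SELECT query_id, user, address, elapsed, query
FROM system.processes
WHERE query ILIKE 'ALTER%'
ORDER BY elapsed DESC
"""


def build_kill_sql(query_id: str, sync: bool = False) -> str:
    # Escape backslashes and single quotes so the ID is a safe string literal.
    escaped = query_id.replace("\\", "\\\\").replace("'", "\\'")
    mode = "SYNC" if sync else "ASYNC"
    return f"KILL QUERY WHERE query_id = '{escaped}' {mode}"


def kill_long_running_alters(client, min_elapsed_seconds: float = 180.0):
    """Request termination of old ALTERs; return (query_id, address) pairs."""
    killed = []
    rows = client.execute(LONG_RUNNING_ALTERS_SQL)
    for query_id, user, address, elapsed, query in rows:
        if elapsed >= min_elapsed_seconds:
            client.execute(build_kill_sql(query_id))
            killed.append((query_id, address))
    return killed
```

The returned addresses are the useful part when the kill does not take effect, because they point at the machine that is still holding the connection.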

 

Googling turned up the following issue.

https://github.com/ClickHouse/ClickHouse/issues/10076

 

KILL QUERY cannot work immediately in SELECT count() FROM Engine=Merge table · Issue #10076 · ClickHouse/ClickHouse


 

According to the comments, a query issued from a remote client can stay alive and cannot be killed.

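It looked like the connection had to be dropped on the remote server that issued the query, so I checked that server and found a uwsgi process that should have been restarted by a deployment but had been alive for more than three days.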

After I killed and restarted that uwsgi, the query on the ClickHouse side disappeared as well.
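To spot this kind of leftover process sooner, something like the sketch below could run on the application server. It assumes a Linux procps `ps` that supports the `etimes` column (elapsed seconds). The helper names and the three-day default are hypothetical, chosen only to match the situation above.

```python
import subprocess


def parse_ps(output: str):
    """Parse `ps -o pid= -o etimes= -o args=` output into (pid, seconds, command)."""
    rows = []
    for line in output.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            rows.append((int(parts[0]), int(parts[1]), parts[2]))
    return rows


def stale_processes(rows, name: str, max_age_seconds: int):
    """Rows whose command line contains `name` and that are older than the limit."""
    return [r for r in rows if name in r[2] and r[1] > max_age_seconds]


def find_stale_uwsgi(max_age_seconds: int = 3 * 24 * 3600):
    # Separate -o options with a trailing "=" suppress the header line.
    out = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "etimes=", "-o", "args="],
        capture_output=True, text=True, check=True,
    ).stdout
    return stale_processes(parse_ps(out), "uwsgi", max_age_seconds)
```

`find_stale_uwsgi()` only reports candidates; whether to kill and restart them is still a manual decision, as it was here.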
