[Bug] Duplicate column: ??\n\n\t0# doris::Status doris::Status::create<true>(doris::TStatus const&) #238

Open
2 of 3 tasks
qijinkui opened this issue Nov 6, 2024 · 2 comments

qijinkui commented Nov 6, 2024

Search before asking

  • I have searched the issues and found no similar issues.

Version

doris-spark-connector:master
spark 3.1.2

What's Wrong?

"TxnId": 1425436,
"Label": "spark_streamload_20241106_205702_9dd5634a98214aa08d5bb0254bee8930",
"Comment": "",
"TwoPhaseCommit": "false",
"Status": "Fail",
"Message": "[ANALYSIS_ERROR]TStatus: errCode = 2, detailMessage = Duplicate column: ??\n\n\t0# doris::Status doris::Status::create<true>(doris::TStatus const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187\n\t1# doris::StreamLoadAction::_process_put(doris::HttpRequest*, std::shared_ptr<doris::StreamLoadContext>) at /opt/tiger/compile_path/src/code.byted.org/emr/doris/be/src/common/status.h:446\n\t2# doris::StreamLoadAction::on_header(doris::HttpRequest*, std::shared_ptr<doris::StreamLoadContext>) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701\n\t3# doris::StreamLoadAction::on_header(doris::HttpRequest*) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701\n\t4# doris::EvHttpServer::on_header(evhttp_request*) at /opt/tiger/compile_path/src/code.byted.org/emr/doris/be/src/http/ev_http_server.cpp:255\n\t5# ?\n\t6# bufferevent_run_readcb\n\t7# ?\n\t8# ?\n\t9# ?\n\t10# ?\n\t11# std::_Function_handler<void (), doris::EvHttpServer::start()::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/atomicity.h:98\n\t12# doris::ThreadPool::dispatch_thread() at /opt/tiger/compile_path/src/code.byted.org/emr/doris/be/src/util/threadpool.cpp:0\n\t13# doris::Thread::supervise_thread(void*) at /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562\n\t14# start_thread\n\t15# __clone\n"

What You Expected?

Writing to columns with Chinese names should be supported.

How to Reproduce?

https://github.com/apache/doris-spark-connector/blob/master/spark-doris-connector/src/test/scala/org/apache/doris/spark/sql/TestSparkConnector.scala

def dataframeWriteTest(): Unit = {
  val session = SparkSession.builder().master("local[*]").getOrCreate()
  val df = session.createDataFrame(Seq(
    (1, "zhangsan-1", 18),
    (2, "lisi-2", 19),
    (3, "wangwu-1", 20)
  )).toDF("id", "名称", "年龄")

  df.write
    .format("doris")
    .option("doris.query.port", "9030")
    .option("doris.fenodes", dorisFeNodes)
    .option("doris.table.identifier", dorisTable)
    .option("user", dorisUser)
    .option("password", dorisPwd)
    .option("doris.table.pk.keys", "id")
    .option("doris.connection.jdbc.url", "jdbc:mysql://localhost:9030")
    //.option("doris.write.fields", "id,名称,年龄")
    .option("sink.batch.size", 2)
    .option("sink.max-retries", 2)
    //.mode("overwrite")
    .save()

  // set global enable_unicode_name_support=true

  session.stop()
}

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

qijinkui commented Nov 6, 2024

Doris version: 2.0.10

gnehil commented Dec 2, 2024

This is caused by an encoding problem when the HTTP client sends a request header that contains non-ASCII characters. We are working on a fix.
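
For illustration, here is a minimal Scala sketch of how an encoding problem of this kind could lead to the `Duplicate column: ??` message. It assumes (this is not confirmed in the thread) that the column list travels in a request header and that the header value passes through an ASCII-only encoder. `HeaderEncodingSketch` and `throughAscii` are hypothetical names for this example, not connector code.

```scala
import java.nio.charset.StandardCharsets

object HeaderEncodingSketch {
  // Hypothetical helper: simulates a header value passing through an
  // ASCII-only encoder. Characters that US-ASCII cannot represent are
  // replaced with '?' by String.getBytes.
  def throughAscii(value: String): String =
    new String(value.getBytes(StandardCharsets.US_ASCII), StandardCharsets.US_ASCII)

  def main(args: Array[String]): Unit = {
    // Column list from the reproduction above (assumed to be sent as a header).
    val columns = "id,名称,年龄"
    val received = throughAscii(columns)
    println(received) // id,??,??

    // Both Chinese names are now the same string, so the list has a duplicate.
    val names = received.split(",")
    println(names.distinct.length < names.length) // true
  }
}
```

Under that assumption, both two-character Chinese column names collapse to the same `??` string, which would explain why the backend reports a duplicate column rather than an unknown one.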
