Bug: >> Order of magnitude difference between RAW KV benchmarks and SurrealDB TiKV driver read operations #3413
Describe the bug

Hello, I have benchmarked the following TiKV deployment configuration:

```yaml
global:
  user: "mina"
  ssh_port: 22
  deploy_dir: "/data1/tidb/db"
  data_dir: "/data1/tidb/db-data"
  # Supported values: "amd64", "arm64" (default: "amd64")
  arch: "amd64"
  resource_control:
    memory_limit: "24G"
    cpu_quota: "3200%"
    io_read_bandwidth_max: "/dev/nvme1n1 /dev/nvme0n1p6 /dev/nvme0n1p7 /dev/nvme0n1 700%"
    io_write_bandwidth_max: "/dev/nvme1n1 /dev/nvme0n1p6 /dev/nvme0n1p7 /dev/nvme0n1 700%"

server_configs:
  pd:
    replication.location-labels:
      - host

pd_servers:
  - host: 127.0.0.1
    client_port: 1279
    peer_port: 2380

# Note that 1 server should simulate not being on the same drive for testing
tikv_servers:
  - host: 127.0.0.1
    port: 20160
    status_port: 10080
    config:
      server.labels:
        host: host1
    deploy_dir: "/data1/tidb-deploy/tikv-20160"
    data_dir: "/data1/tidb-data/tikv-20160"
```
Note that the single-PD, single-TiKV-node configuration was purposefully chosen to ensure that the issue wasn't related to the integration between the SurrealDB driver and TiKV's distribution mechanisms across multiple PDs and storage servers. The configuration is deployed successfully via the following command, where cluster.yaml is the configuration listed above, and the cluster starts successfully:

```shell
tiup cluster deploy development v7.5.0 cluster.yaml -p "password"
```

Both load and read testing are conducted per the official TiKV recommendations in their docs: https://tikv.org/docs/7.1/deploy/performance/instructions/

Using go-ycsb, the following is run for benchmarking:

```shell
# This loads the workload
./go-ycsb load tikv -P workloads/workloada -p tikv.pd="127.0.0.1:1279" -p tikv.type="raw" -p recordcount=10000000 -p operationcount=30000000 -p threadcount=32

# This runs the workload
./go-ycsb run tikv -P workloads/workloada -p tikv.pd="127.0.0.1:1279" -p tikv.type="raw" -p recordcount=10000000 -p operationcount=30000000 -p threadcount=32
```

The resulting average output from the cluster is approximately as follows:

```
INSERT - Takes(s): 249.9, Count: 3742650, OPS: 14974.1, Avg(us): 8526, Min(us): 480, Max(us): 810495, 50th(us): 7403, 90th(us): 11903, 95th(us): 16095, 99th(us): 29615, 99.9th(us): 83263, 99.99th(us): 499967
TOTAL  - Takes(s): 249.9, Count: 3742650, OPS: 14974.1, Avg(us): 8526, Min(us): 480, Max(us): 810495, 50th(us): 7403, 90th(us): 11903, 95th(us): 16095, 99th(us): 29615, 99.9th(us): 83263, 99.99th(us): 499967
UPDATE - Takes(s): 189.9, Count: 1349791, OPS: 7107.1, Avg(us): 17813, Min(us): 4444, Max(us): 999423, 50th(us): 11287, 90th(us): 40351, 95th(us): 51647, 99th(us): 87359, 99.9th(us): 162047, 99.99th(us): 587263
READ   - Takes(s): 200.0, Count: 1451358, OPS: 7258.3, Avg(us): 192, Min(us): 44, Max(us): 75455, 50th(us): 162, 90th(us): 275, 95th(us): 334, 99th(us): 806, 99.9th(us): 2041, 99.99th(us): 3115
```

Insertions occur at roughly 15K operations per second on average; reads and updates occur at approximately 7K operations per second. Note that P99.99 latency is also sub-1s.

The question is then: given the aforementioned results, is it appropriate to expect the following results when benchmarking the SurrealDB database (launched with the TiKV driver) via Apache JMeter? The benchmark was targeted at an axum endpoint returning the following:

```rust
pub async fn get_customers() -> Json<Vec<Customer>> {
    Json(DB.select("customer").await.unwrap())
}
```

The customer table had 0 entries, and the setup is managed via the following:

```rust
pub static DB: LazyLock<Surreal<surrealdb::engine::any::Any>> = LazyLock::new(Surreal::init);
```
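As a side note, per-call overhead of an endpoint like the one above can also be sanity-checked without JMeter. The following is a minimal timing sketch (hypothetical helper names, not part of the original project) that times a closure repeatedly and extracts a latency percentile:

```rust
use std::time::{Duration, Instant};

// Hypothetical micro-harness: time a closure `n` times so per-call latency
// percentiles can be inspected independently of an external load generator.
fn bench<F: FnMut()>(mut f: F, n: usize) -> Vec<Duration> {
    (0..n)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed() // wall-clock duration of one call
        })
        .collect()
}

fn percentile(samples: &mut [Duration], p: f64) -> Duration {
    samples.sort(); // ascending, so index i is the i-th smallest sample
    let idx = ((samples.len() as f64 - 1.0) * p / 100.0).round() as usize;
    samples[idx]
}

fn main() {
    // Example: time a trivial computation 1000 times and print its p99.
    let mut samples = bench(|| { std::hint::black_box(1 + 1); }, 1000);
    println!("p99: {:?}", percentile(&mut samples, 99.0));
}
```

Wrapping the actual `DB.select("customer")` call in such a loop would isolate driver latency from any HTTP-layer effects.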
```rust
#[instrument]
async fn deploy_tikv() -> Result<(), Box<dyn Error>> {
    Command::new("tiup")
        .arg("cluster")
        .arg("deploy")
        .arg(ENV.clone())
        .arg("v7.5.0")
        .arg(std::fs::canonicalize("./src/database/cluster.yaml")?)
        .arg("-p")
        .arg(PASS.clone())
        .spawn()?
        .wait_with_output()
        .await?;
    Ok(())
}
```
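One caveat with the deploy function above: `wait_with_output` returns even when the child exits non-zero, so a failed `tiup` invocation is silently ignored. A minimal synchronous sketch (using `std::process::Command`; the helper name is hypothetical) that surfaces non-zero exit codes:

```rust
use std::process::Command;

// Hypothetical helper: run a command and convert a non-zero exit code into
// an Err, instead of discarding the exit status entirely.
fn run_checked(program: &str, args: &[&str]) -> Result<(), String> {
    let status = Command::new(program)
        .args(args)
        .status() // spawns the child and waits for it to exit
        .map_err(|e| e.to_string())?;
    if status.success() {
        Ok(())
    } else {
        Err(format!("{program} exited with {status}"))
    }
}

fn main() {
    // Example usage with a shell no-op standing in for `tiup`:
    run_checked("sh", &["-c", "exit 0"]).expect("command should succeed");
}
```

The same status check can be applied to the async `tokio::process::Command` variant by inspecting `output.status` after `wait_with_output`.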
```rust
#[instrument]
async fn init_tikv() -> Result<(), Box<dyn Error>> {
    tokio::spawn(async {
        Command::new("tiup")
            .arg("cluster")
            .arg("start")
            .arg(ENV.clone())
            .spawn()
            .unwrap()
            .wait_with_output()
            .await
    })
    .await??;
    Ok(())
}
```

The connection is made via the following:

```rust
DB.connect("tikv://127.0.0.1:1279").await?;
```

To ensure that this issue was related to the kv-tikv driver and not axum itself, I ran the same benchmark using the kv-speedb driver:

```rust
#[async_recursion]
async fn init_db() -> Result<(), Box<dyn Error>> {
    match std::fs::canonicalize("./data.db") {
        Ok(dir) => {
            DB.connect(&(String::from("speedb:/") + dir.to_str().unwrap()))
                .await?;
        }
        Err(_) => {
            tracing::info!("Could not find database directory");
            std::fs::create_dir_all("./data.db").unwrap();
            init_db().await?;
        }
    }
    DB.use_ns("namespace").use_db("database").await?;
    Ok(())
}
```

The target SSD is the same SSD used within the TiKV test. The result is the following:

The question is whether this is expected behavior, given that the database is empty and the initial testing results on reads for raw KV were over an order of magnitude faster.

Steps to reproduce

Explained above.

Expected behaviour

Significantly lower latency and faster throughput of the database.

SurrealDB version

1.1.1 on 64-bit Arch Linux (32-thread 7950X)

Contact Details

Is there an existing issue for this?
Code of Conduct
Moved to a discussion because it's not really an issue (yet?). A few comments:
I hope this clarifies things! We internally ran some benchmarks using go-ycsb against TiKV and SurrealDB, and saw that, when comparing real throughput, we are roughly on par, so the client is not a bottleneck.
- Your go-ycsb benchmark uses `tikv.type="raw"`, which means no transactions. SurrealDB always opens a transaction against TiKV to do even the simplest of operations. In order to fix the go-ycsb benchmark, you need to replace that parameter with `-p tikv.type=txn -p tikv.async_commit=false -p tikv.one_pc=false`.
- Every statement (e.g. `CREATE`) does more than 5 additional reads i…
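The impact of those two points can be sketched with back-of-the-envelope arithmetic. Assumptions: the ~192 µs average raw GET latency from the go-ycsb output above, at least 5 extra reads per statement as stated in the reply, and an *assumed* figure of 2 extra round trips for transaction begin/commit:

```rust
// Rough lower-bound model. Assumptions: 192 µs avg raw GET (from the go-ycsb
// run above); >5 extra reads per statement and one transaction per operation
// (per the reply); the 2 begin/commit round trips are an assumed figure.
fn surreal_read_lower_bound_us(raw_get_us: f64, extra_reads: f64, txn_round_trips: f64) -> f64 {
    (1.0 + extra_reads + txn_round_trips) * raw_get_us
}

fn main() {
    let per_statement_us = surreal_read_lower_bound_us(192.0, 5.0, 2.0);
    let ops_ceiling = 1_000_000.0 / per_statement_us; // one sequential connection
    println!("lower-bound latency per read: {per_statement_us} µs");
    println!("single-connection ceiling: ~{ops_ceiling:.0} ops/s");
}
```

Even under these optimistic assumptions, each SurrealDB read costs several raw GETs, so a sizable gap versus raw-mode go-ycsb numbers is expected rather than anomalous.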