
[Bug]: [benchmark][cluster] Garbage collection on MinIO: residual data, and data not deleted at the configured time #33097

wangting0128 opened this issue May 16, 2024 · 6 comments
@wangting0128 (Contributor)

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: milvus-io-lru-dev-2721816-20240507
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka): pulsar
- SDK version(e.g. pymilvus v2.0.0rc2): 2.4.2
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

argo task: fouramf-sbfcd

server:

NAME                                                              READY   STATUS        RESTARTS       AGE     IP              NODE         NOMINATED NODE   READINESS GATES
lru-500g-etcd-0                                                   1/1     Running       0              26d     10.104.20.193   4am-node22   <none>           <none>
lru-500g-milvus-standalone-65588948-kkd7h                         1/1     Running       0              9d      10.104.31.141   4am-node34   <none>           <none>
lru-500g-pulsar-bookie-0                                          1/1     Running       0              26d     10.104.20.195   4am-node22   <none>           <none>
lru-500g-pulsar-bookie-1                                          1/1     Running       0              26d     10.104.29.179   4am-node35   <none>           <none>
lru-500g-pulsar-bookie-2                                          1/1     Running       0              26d     10.104.26.180   4am-node32   <none>           <none>
lru-500g-pulsar-broker-0                                          1/1     Running       1 (25d ago)    26d     10.104.1.8      4am-node10   <none>           <none>
lru-500g-pulsar-proxy-0                                           1/1     Running       0              22d     10.104.6.68     4am-node13   <none>           <none>
lru-500g-pulsar-recovery-0                                        1/1     Running       0              22d     10.104.6.67     4am-node13   <none>           <none>
lru-500g-pulsar-zookeeper-0                                       1/1     Running       1 (20d ago)    26d     10.104.20.194   4am-node22   <none>           <none>
lru-500g-pulsar-zookeeper-1                                       1/1     Running       0              26d     10.104.29.181   4am-node35   <none>           <none>
lru-500g-pulsar-zookeeper-2                                       1/1     Running       0              26d     10.104.34.8     4am-node37   <none>           <none>
[screenshot: server pod status, 2024-05-16 16:15:32]

MinIO garbage collection happened after only 40 minutes:
[screenshot: MinIO bucket after GC]

Residual data:

./index_files/449193006024386963/1/449193006017977748/449193006024386876/HNSW_7:
total 4.0K
drwxr-xr-x 2 1000 1000  28 Apr 21 12:52 27299f1c-06a0-4be1-803f-ef7d800b9a54
-rw-r--r-- 1 1000 1000 361 Apr 21 12:52 xl.meta

./index_files/449193006024386963/1/449193006017977748/449193006024386876/HNSW_7/27299f1c-06a0-4be1-803f-ef7d800b9a54:
total 4.7M
-rw-r--r-- 1 1000 1000 4.7M Apr 21 12:52 part.1

client pod name: fouramf-sbfcd-833673770
client log:
[screenshot: client log]

Expected Behavior

No response

Steps To Reproduce

1. [2024-05-16 06:33:09,140 -  INFO - fouram]: Drop collection 'fouram_270m'
2. [2024-05-16 06:33:09,145 -  INFO - fouram]: Create collection fouram_dlxQWZ9R 
3. [2024-05-16 06:34:03,094 -  INFO - fouram]: Drop collection 'fouram_dlxQWZ9R'
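For reference, a minimal pymilvus (2.4.x) sketch of these three steps; the endpoint and the schema are placeholders, not the benchmark's actual ones:

from pymilvus import (
    Collection, CollectionSchema, FieldSchema, DataType, connections, utility
)

connections.connect(host="127.0.0.1", port="19530")  # placeholder endpoint

# 1. Drop the old collection
utility.drop_collection("fouram_270m")

# 2. Create a new collection (hypothetical minimal schema)
schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("float_vector", DataType.FLOAT_VECTOR, dim=128),
])
collection = Collection("fouram_dlxQWZ9R", schema)

# 3. Drop it again; from here on, GC is expected to honor
#    dataCoord.gc.dropTolerance before removing the collection's files
utility.drop_collection("fouram_dlxQWZ9R")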

Milvus Log

minio data:

bash-4.4$ cd export/lru-500g/lru500g/
bash-4.4$ ls -l
total 16
drwxr-xr-x 7 1000 1000 45056 May 16 07:32 index_files
drwxr-xr-x 3 1000 1000    40 May 16 07:05 stats_log
bash-4.4$ ls -lh -R 
.:
total 16K
drwxr-xr-x 7 1000 1000 44K May 16 07:32 index_files
drwxr-xr-x 3 1000 1000  40 May 16 07:05 stats_log

./index_files:
total 0
drwxr-xr-x 3 1000 1000 23 Apr 22 16:04 449193006024386963
drwxr-xr-x 3 1000 1000 23 May  7 09:08 449193006028593157
drwxr-xr-x 3 1000 1000 23 Apr 22 17:04 449193006028593241
drwxr-xr-x 3 1000 1000 23 Apr 22 17:04 449193006028593307
drwxr-xr-x 4 1000 1000 36 May 16 07:21 449594851695789997

./index_files/449193006024386963:
total 0
drwxr-xr-x 3 1000 1000 40 Apr 21 12:52 1

./index_files/449193006024386963/1:
total 0
drwxr-xr-x 3 1000 1000 40 Apr 21 12:52 449193006017977748

./index_files/449193006024386963/1/449193006017977748:
total 0
drwxr-xr-x 3 1000 1000 28 Apr 21 13:57 449193006024386876

./index_files/449193006024386963/1/449193006017977748/449193006024386876:
total 0
drwxr-xr-x 3 1000 1000 77 Apr 21 12:52 HNSW_7

./index_files/449193006024386963/1/449193006017977748/449193006024386876/HNSW_7:
total 4.0K
drwxr-xr-x 2 1000 1000  28 Apr 21 12:52 27299f1c-06a0-4be1-803f-ef7d800b9a54
-rw-r--r-- 1 1000 1000 361 Apr 21 12:52 xl.meta

./index_files/449193006024386963/1/449193006017977748/449193006024386876/HNSW_7/27299f1c-06a0-4be1-803f-ef7d800b9a54:
total 4.7M
-rw-r--r-- 1 1000 1000 4.7M Apr 21 12:52 part.1

./index_files/449193006028593157:
total 0
drwxr-xr-x 3 1000 1000 40 Apr 21 14:44 1

./index_files/449193006028593157/1:
total 0
drwxr-xr-x 3 1000 1000 40 Apr 21 14:44 449193005838895425

./index_files/449193006028593157/1/449193005838895425:
total 0
drwxr-xr-x 8 1000 1000 124 Apr 21 16:00 449193005786153519

./index_files/449193006028593157/1/449193005838895425/449193005786153519:
total 0
drwxr-xr-x 2 1000 1000 10 Apr 21 14:50 HNSW_45
drwxr-xr-x 2 1000 1000 10 Apr 21 14:50 HNSW_46
drwxr-xr-x 2 1000 1000 10 Apr 21 14:50 HNSW_47
drwxr-xr-x 2 1000 1000 10 Apr 21 14:50 HNSW_48
drwxr-xr-x 2 1000 1000 10 Apr 21 14:50 HNSW_50
drwxr-xr-x 2 1000 1000 10 Apr 21 14:50 HNSW_51

./index_files/449193006028593157/1/449193005838895425/449193005786153519/HNSW_45:
total 0

./index_files/449193006028593157/1/449193005838895425/449193005786153519/HNSW_46:
total 0

./index_files/449193006028593157/1/449193005838895425/449193005786153519/HNSW_47:
total 0

./index_files/449193006028593157/1/449193005838895425/449193005786153519/HNSW_48:
total 0

./index_files/449193006028593157/1/449193005838895425/449193005786153519/HNSW_50:
total 0

./index_files/449193006028593157/1/449193005838895425/449193005786153519/HNSW_51:
total 0

./index_files/449193006028593241:
total 0
drwxr-xr-x 3 1000 1000 40 Apr 21 14:52 1

./index_files/449193006028593241/1:
total 0
drwxr-xr-x 3 1000 1000 40 Apr 21 14:52 449193006017977748

./index_files/449193006028593241/1/449193006017977748:
total 0
drwxr-xr-x 4 1000 1000 46 Apr 21 16:00 449193006028593139

./index_files/449193006028593241/1/449193006017977748/449193006028593139:
total 0
drwxr-xr-x 2 1000 1000 10 Apr 21 14:53 HNSW_4
drwxr-xr-x 2 1000 1000 10 Apr 21 14:53 HNSW_6

./index_files/449193006028593241/1/449193006017977748/449193006028593139/HNSW_4:
total 0

./index_files/449193006028593241/1/449193006017977748/449193006028593139/HNSW_6:
total 0

./index_files/449193006028593307:
total 0
drwxr-xr-x 3 1000 1000 40 Apr 21 15:01 1

./index_files/449193006028593307/1:
total 0
drwxr-xr-x 3 1000 1000 40 Apr 21 15:01 449193006028593210

./index_files/449193006028593307/1/449193006028593210:
total 0
drwxr-xr-x 3 1000 1000 28 Apr 21 16:00 449193006028593225

./index_files/449193006028593307/1/449193006028593210/449193006028593225:
total 0
drwxr-xr-x 2 1000 1000 10 Apr 21 15:01 HNSW_1

./index_files/449193006028593307/1/449193006028593210/449193006028593225/HNSW_1:
total 0

./index_files/449594851695789997:
total 0
drwxr-xr-x 3 1000 1000 40 May  7 09:22 1
drwxr-xr-x 3 1000 1000 40 May  7 09:31 2

./index_files/449594851695789997/1:
total 0
drwxr-xr-x 3 1000 1000 40 May  7 09:22 449193005891583751

./index_files/449594851695789997/1/449193005891583751:
total 0
drwxr-xr-x 4 1000 1000 48 May  7 09:50 449239199803509465

./index_files/449594851695789997/1/449193005891583751/449239199803509465:
total 0
drwxr-xr-x 2 1000 1000 10 May  7 09:22 HNSW_10
drwxr-xr-x 2 1000 1000 10 May  7 09:22 HNSW_11

./index_files/449594851695789997/1/449193005891583751/449239199803509465/HNSW_10:
total 0

./index_files/449594851695789997/1/449193005891583751/449239199803509465/HNSW_11:
total 0

./index_files/449594851695789997/2:
total 0
drwxr-xr-x 3 1000 1000 40 May  7 09:31 449193005891583751

./index_files/449594851695789997/2/449193005891583751:
total 0
drwxr-xr-x 3 1000 1000 29 May  7 09:50 449239199803509465

./index_files/449594851695789997/2/449193005891583751/449239199803509465:
total 0
drwxr-xr-x 2 1000 1000 10 May  7 09:31 HNSW_12

./index_files/449594851695789997/2/449193005891583751/449239199803509465/HNSW_12:
total 0

./stats_log:
total 0
drwxr-xr-x 3 1000 1000 40 May 16 07:05 449193005785809078

./stats_log/449193005785809078:
total 0
drwxr-xr-x 3 1000 1000 40 May 16 06:59 449193005902001133

./stats_log/449193005785809078/449193005902001133:
total 0
drwxr-xr-x 3 1000 1000 25 Apr 21 23:37 449246820256973531

./stats_log/449193005785809078/449193005902001133/449246820256973531:
total 0
drwxr-xr-x 3 1000 1000 40 Apr 21 23:37 100

./stats_log/449193005785809078/449193005902001133/449246820256973531/100:
total 0
drwxr-xr-x 2 1000 1000 10 Apr 21 23:37 449246820256973665

./stats_log/449193005785809078/449193005902001133/449246820256973531/100/449246820256973665:
total 0

Anything else?

milvus.yaml

autoIndex:
  params:
    build: '{"M": 18,"efConstruction": 240,"index_type": "HNSW", "metric_type": "IP"}'
common:
  DiskIndex:
    BeamWidthRatio: 4
    BuildNumThreadsRatio: 1
    LoadNumThreadRatio: 8
    MaxDegree: 56
    PQCodeBudgetGBRatio: 0.125
    SearchCacheBudgetGBRatio: 0.1
    SearchListSize: 100
  bloomFilterSize: 100000
  buildIndexThreadPoolRatio: 0.75
  chanNamePrefix:
    cluster: by-dev
    dataCoordSegmentInfo: segment-info-channel
    dataCoordStatistic: datacoord-statistics-channel
    dataCoordTimeTick: datacoord-timetick-channel
    queryTimeTick: queryTimeTick
    replicateMsg: replicate-msg
    rootCoordDelta: rootcoord-delta
    rootCoordDml: rootcoord-dml
    rootCoordStatistics: rootcoord-statistics
    rootCoordTimeTick: rootcoord-timetick
    search: search
    searchResult: searchResult
  defaultIndexName: _default_idx
  defaultPartitionName: _default
  entityExpiration: -1
  gracefulStopTimeout: 1800
  gracefulTime: 5000
  indexSliceSize: 16
  locks:
    metrics:
      enable: false
    threshold:
      info: 500
      warn: 1000
  maxBloomFalsePositive: 0.05
  preCreatedTopic:
    enabled: false
    names:
    - topic1
    - topic2
    timeticker: timetick-channel
  security:
    authorizationEnabled: false
    tlsMode: 0
  session:
    retryTimes: 30
    ttl: 30
  simdType: auto
  storage:
    enablev2: false
    scheme: s3
  storageType: remote
  subNamePrefix:
    dataCoordSubNamePrefix: dataCoord
    dataNodeSubNamePrefix: dataNode
    proxySubNamePrefix: proxy
    queryNodeSubNamePrefix: queryNode
    rootCoordSubNamePrefix: rootCoord
  threadCoreCoefficient:
    highPriority: 10
    lowPriority: 1
    middlePriority: 5
  traceLogMode: 0
  ttMsgEnabled: true
  usePartitionKeyAsClusteringKey: false
  useVectorAsClusteringKey: false
dataCoord:
  address: localhost
  channel:
    balanceInterval: 360
    balanceSilentDuration: 300
    watchTimeoutInterval: 300
  compaction:
    clustering:
      autoEnable: false
      dropTolerance: 86400
      enable: true
      gcInterval: 600
      maxCentroidsNum: 10240
      maxInterval: 86400
      maxSegmentSize: 128m
      maxTrainSizeRatio: 0.8
      minCentroidsNum: 128
      minInterval: 3600
      newDataRatioThreshold: 0.2
      newDataSizeThreshold: 256m
      preferSegmentSize: 64m
      stateCheckInterval: 10
      timeout: 3600
      triggerInterval: 600
    enableAutoCompaction: true
    indexBasedCompaction: true
    levelzero:
      forceTrigger:
        deltalogMinNum: 10
        minSize: 8388608
    maxParallelTaskNum: 10
    rpcTimeout: 10
  enableActiveStandby: false
  enableCompaction: true
  enableGarbageCollection: true
  gc:
    dropTolerance: 10800
    interval: 3600
    missingTolerance: 3600
    scanInterval: 168
  grpc:
    clientMaxRecvSize: 536870912
    clientMaxSendSize: 268435456
    serverMaxRecvSize: 268435456
    serverMaxSendSize: 536870912
  import:
    filesPerPreImportTask: 2
    maxImportFileNumPerReq: 1024
    taskRetention: 10800
    waitForIndex: true
  ip: null
  port: 13333
  segment:
    assignmentExpiration: 2000
    compactableProportion: 0.85
    diskSegmentMaxSize: 2048
    enableLevelZero: true
    expansionRate: 1.25
    maxBinlogFileNumber: 32
    maxIdleTime: 600
    maxLife: 86400
    maxSize: 1024
    minSizeFromIdleToSealed: 16
    sealProportion: 0.12
    smallProportion: 0.5
dataNode:
  channel:
    updateChannelCheckpointMaxParallel: 10
    workPoolSize: -1
  clusteringCompaction:
    memoryBufferRatio: 0.1
  dataSync:
    flowGraph:
      maxParallelism: 1024
      maxQueueLength: 16
    maxParallelSyncMgrTasks: 256
    skipMode:
      coldTime: 60
      enable: true
      skipNum: 4
  grpc:
    clientMaxRecvSize: 536870912
    clientMaxSendSize: 268435456
    serverMaxRecvSize: 268435456
    serverMaxSendSize: 536870912
  import:
    maxConcurrentTaskNum: 16
    maxImportFileSizeInGB: 16
  ip: null
  memory:
    forceSyncEnable: true
    forceSyncSegmentNum: 1
    watermarkCluster: 0.5
    watermarkStandalone: 0.2
  port: 21124
  segment:
    deleteBufBytes: 67108864
    insertBufSize: 16777216
    syncPeriod: 600
  timetick:
    byRPC: true
etcd:
  data:
    dir: default.etcd
  endpoints:
  - lru-500g-etcd:2379
  kvSubPath: kv
  log:
    level: info
    path: stdout
  metaSubPath: meta
  rootPath: by-dev
  ssl:
    enabled: false
    tlsCACert: /path/to/ca.pem
    tlsCert: /path/to/etcd-client.pem
    tlsKey: /path/to/etcd-client-key.pem
    tlsMinVersion: 1.3
  use:
    embed: false
gpu:
  initMemSize: 0
  maxMemSize: 0
grpc:
  client:
    backoffMultiplier: 2
    compressionEnabled: false
    dialTimeout: 200
    initialBackOff: 0.2
    keepAliveTime: 10000
    keepAliveTimeout: 20000
    maxBackoff: 10
    maxMaxAttempts: 10
  log:
    clientMaxRecvSize: 536870912
    clientMaxSendSize: 268435456
    level: WARNING
    serverMaxRecvSize: 268435456
    serverMaxSendSize: 536870912
indexCoord:
  address: localhost
  bindIndexNodeMode:
    address: localhost:22930
    enable: false
    nodeID: 0
    withCred: false
  enableActiveStandby: false
  port: 31000
  segment:
    minSegmentNumRowsToEnableIndex: 1024
indexNode:
  enableDisk: true
  grpc:
    clientMaxRecvSize: 536870912
    clientMaxSendSize: 268435456
    serverMaxRecvSize: 268435456
    serverMaxSendSize: 536870912
  ip: null
  maxDiskUsagePercentage: 95
  port: 21121
  scheduler:
    buildParallel: 1
localStorage:
  path: /var/lib/milvus/data/
log:
  file:
    maxAge: 10
    maxBackups: 20
    maxSize: 300
    rootPath: ""
  format: text
  level: debug
  stdout: true
messageQueue: pulsar
metastore:
  type: etcd
minio:
  accessKeyID: miniolru500g
  address: minio-1.minio
  bucketName: lru-500g
  cloudProvider: aws
  iamEndpoint: null
  logLevel: fatal
  port: 9000
  region: null
  requestTimeoutMs: 10000
  rootPath: lru500g
  secretAccessKey: miniolru500g
  ssl:
    tlsCACert: /path/to/public.crt
  useIAM: false
  useSSL: false
  useVirtualHost: false
mq:
  type: pulsar
proxy:
  accessLog:
    enable: false
    formatters:
      base:
        format: '[$time_now] [ACCESS] <$user_name: $user_addr> $method_name [status:
          $method_status] [code: $error_code] [sdk: $sdk_version] [msg: $error_msg]
          [traceID: $trace_id] [timeCost: $time_cost]'
      query:
        format: '[$time_now] [ACCESS] <$user_name: $user_addr> $method_name [status:
          $method_status] [code: $error_code] [sdk: $sdk_version] [msg: $error_msg]
          [traceID: $trace_id] [timeCost: $time_cost] [database: $database_name] [collection:
          $collection_name] [partitions: $partition_name] [expr: $method_expr]'
        methods:
        - Query
        - Search
        - Delete
  connectionCheckIntervalSeconds: 120
  connectionClientInfoTTLSeconds: 86400
  ginLogSkipPaths: /
  ginLogging: true
  grpc:
    clientMaxRecvSize: 67108864
    clientMaxSendSize: 268435456
    serverMaxRecvSize: 67108864
    serverMaxSendSize: 268435456
  healthCheckTimeout: 3000
  http:
    debug_mode: false
    enabled: true
  internalPort: 19529
  ip: null
  maxConnectionNum: 10000
  maxDimension: 32768
  maxFieldNum: 64
  maxNameLength: 255
  maxShardNum: 16
  maxTaskNum: 1024
  maxVectorFieldNum: 4
  msgStream:
    timeTick:
      bufSize: 512
  port: 19530
  slowQuerySpanInSeconds: 5
  timeTickInterval: 200
pulsar:
  address: lru-500g-pulsar-proxy
  enableClientMetrics: false
  maxMessageSize: 5242880
  namespace: default
  port: 6650
  requestTimeout: 60
  tenant: public
  webport: 80
queryCoord:
  address: localhost
  autoBalance: true
  autoHandoff: true
  balanceIntervalSeconds: 60
  balancer: ScoreBasedBalancer
  brokerTimeout: 5000
  channelTaskTimeout: 60000
  checkHandoffInterval: 5000
  checkInterval: 1000
  distPullInterval: 500
  enableActiveStandby: false
  globalRowCountFactor: 0.1
  growingRowCountWeight: 4
  grpc:
    clientMaxRecvSize: 536870912
    clientMaxSendSize: 268435456
    serverMaxRecvSize: 268435456
    serverMaxSendSize: 536870912
  heartbeatAvailableInterval: 10000
  ip: null
  loadTimeoutSeconds: 600
  memoryUsageMaxDifferencePercentage: 30
  overloadedMemoryThresholdPercentage: 90
  port: 19531
  reverseUnBalanceTolerationFactor: 1.3
  scoreUnbalanceTolerationFactor: 0.05
  segmentTaskTimeout: 120000
  taskExecutionCap: 256
  taskMergeCap: 1
queryNode:
  cache:
    enabled: true
    memoryLimit: 2147483648
    readAheadPolicy: willneed
    warmup: sync
  dataSync:
    flowGraph:
      maxParallelism: 1024
      maxQueueLength: 16
  diskCacheCapacityLimit: 51539607552
  enableDisk: true
  enableSegmentPrune: false
  grouping:
    enabled: true
    maxNQ: 1000
    topKMergeRatio: 20
  grpc:
    clientMaxRecvSize: 536870912
    clientMaxSendSize: 268435456
    serverMaxRecvSize: 268435456
    serverMaxSendSize: 536870912
  ip: null
  lazyLoadRequestResourceRetryInterval: 2000
  lazyLoadRequestResourceTimeout: 5000
  lazyloadEnabled: true
  lazyloadWaitTimeout: 300000
  loadMemoryUsageFactor: 1
  maxDiskUsagePercentage: 95
  mmap:
    mmapEnabled: true
  port: 21123
  scheduler:
    cpuRatio: 10
    maxReadConcurrentRatio: 1
    maxTimestampLag: 86400
    receiveChanSize: 10240
    scheduleReadPolicy:
      enableCrossUserGrouping: false
      maxPendingTask: 10240
      maxPendingTaskPerUser: 1024
      name: fifo
      taskQueueExpire: 60
    unsolvedQueueSize: 10240
  segcore:
    cgoPoolSizeRatio: 2
    chunkRows: 128
    exprEvalBatchSize: 8192
    interimIndex:
      buildParallelRate: 0.5
      enableIndex: true
      memExpansionRate: 1.15
      nlist: 128
      nprobe: 16
    knowhereThreadPoolNumRatio: 4
  stats:
    publishInterval: 1000
  useStreamComputing: true
quotaAndLimits:
  compactionRate:
    enabled: false
    max: -1
  ddl:
    collectionRate: -1
    enabled: false
    partitionRate: -1
  dml:
    bulkLoadRate:
      collection:
        max: -1
      max: -1
    deleteRate:
      collection:
        max: -1
      max: -1
    enabled: false
    insertRate:
      collection:
        max: -1
      max: -1
    upsertRate:
      collection:
        max: -1
      max: -1
  dql:
    enabled: false
    queryRate:
      collection:
        max: -1
      max: -1
    searchRate:
      collection:
        max: -1
      max: -1
  enabled: true
  flushRate:
    collection:
      max: -1
    enabled: false
    max: -1
  indexRate:
    enabled: false
    max: -1
  limitReading:
    coolOffSpeed: 0.9
    forceDeny: false
    queueProtection:
      enabled: false
      nqInQueueThreshold: -1
      queueLatencyThreshold: -1
    resultProtection:
      enabled: false
      maxReadResultRate: -1
  limitWriting:
    diskProtection:
      diskQuota: -1
      diskQuotaPerCollection: -1
      enabled: true
    forceDeny: false
    growingSegmentsSizeProtection:
      enabled: false
      highWaterLevel: 0.4
      lowWaterLevel: 0.2
      minRateRatio: 0.5
    memProtection:
      dataNodeMemoryHighWaterLevel: 0.95
      dataNodeMemoryLowWaterLevel: 0.85
      enabled: true
      queryNodeMemoryHighWaterLevel: 0.95
      queryNodeMemoryLowWaterLevel: 0.85
    ttProtection:
      enabled: false
      maxTimeTickDelay: 300
  limits:
    maxCollectionNum: 65536
    maxCollectionNumPerDB: 65536
  quotaCenterCollectInterval: 3
rootCoord:
  address: localhost
  dmlChannelNum: 16
  enableActiveStandby: false
  grpc:
    clientMaxRecvSize: 536870912
    clientMaxSendSize: 268435456
    serverMaxRecvSize: 268435456
    serverMaxSendSize: 536870912
  ip: null
  maxDatabaseNum: 64
  maxGeneralCapacity: 65536
  maxPartitionNum: 4096
  minSegmentSizeToEnableIndex: 1024
  port: 53100
tikv:
  endpoints: 127.0.0.1:2389
  kvSubPath: kv
  metaSubPath: meta
  rootPath: by-dev
tls:
  caPemPath: configs/cert/ca.pem
  serverKeyPath: configs/cert/server.key
  serverPemPath: configs/cert/server.pem
trace:
  exporter: stdout
  jaeger:
    url: null
  otlp:
    endpoint: null
    secure: true
  sampleFraction: 0
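For context on the timing complaint: under the dataCoord.gc settings above, the files of a dropped segment should survive for at least dropTolerance seconds and be removed at most about one gc.interval later. A back-of-the-envelope sketch (values copied from the config; it assumes GC honors dropTolerance for dropped collections, which the later comments show it currently does not):

gc_interval = 3600       # dataCoord.gc.interval: GC scan runs once per hour
drop_tolerance = 10800   # dataCoord.gc.dropTolerance: keep dropped data for 3 h

earliest_min = drop_tolerance / 60                    # 180.0 min before anything may go
worst_case_min = (drop_tolerance + gc_interval) / 60  # 240.0 min at the latest
print(earliest_min, worst_case_min)  # far more than the ~40 minutes observed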
wangting0128 added the kind/bug, needs-triage, and test/benchmark labels on May 16, 2024
@yanliang567 (Contributor)

/unassign

yanliang567 added the triage/accepted label, removed needs-triage, and added this issue to the 2.4.lru milestone on May 16, 2024
@xiaofan-luan (Contributor)

/assign @chyezh
Could you help verify this?

@chyezh (Contributor) commented May 21, 2024

From the log:

[2024/05/19 18:50:51.497 +00:00] [INFO] [datacoord/garbage_collector.go:517] ["garbageCollector will recycle index files"] [buildID=449193006024386963]

The index file has been GC'd by Milvus, but MinIO didn't actually delete it.

I tried deleting it offline; the delete method appears to behave correctly on the client side.
[screenshot: client-side delete call]

204 No Content is returned from the server side.
[screenshot: server response]
That is a success HTTP code for the MinIO client.
https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObject.html
[screenshot: AWS DeleteObject documentation]

Is this a bug in MinIO?
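For anyone re-checking this offline, a minimal boto3 sketch of the same delete-and-verify flow (endpoint, credentials, bucket, and rootPath are taken from the milvus.yaml above; the prefix is the residual directory from this issue):

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://minio-1.minio:9000",   # minio.address/port, useSSL: false
    aws_access_key_id="miniolru500g",
    aws_secret_access_key="miniolru500g",
)

prefix = "lru500g/index_files/449193006024386963/"
# Delete every object under the residual prefix, one by one
for page in s3.get_paginator("list_objects_v2").paginate(Bucket="lru-500g", Prefix=prefix):
    for obj in page.get("Contents", []):
        resp = s3.delete_object(Bucket="lru-500g", Key=obj["Key"])
        # Note: S3 DeleteObject returns 204 No Content even for keys that do
        # not exist, so a 204 alone does not prove the object was removed.
        print(obj["Key"], resp["ResponseMetadata"]["HTTPStatusCode"])

# Verify: the prefix should now list zero keys
remaining = s3.list_objects_v2(Bucket="lru-500g", Prefix=prefix)
print("remaining:", remaining.get("KeyCount", 0))

If delete_object returns 204 but the key still appears in a later listing, the failure is on the server side, which matches what the screenshots above suggest.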

@chyezh (Contributor) commented May 21, 2024

It cannot be deleted from the MinIO console either.

@wangting0128 (Contributor, Author)

Another problem in this issue is that the GC time seems inconsistent with the configured time.

@chyezh (Contributor) commented May 21, 2024

DropCollection doesn't update the DroppedAt field of the segment, so there is no tolerance-time wait when GC'ing the segment. Will fix it in the next version.

[2024/05/16 06:33:15.336 +00:00] [INFO] [datacoord/services.go:554] ["receive DropVirtualChannel request"] [traceID=3e9b4e63e2e7e2339060c74aa6c2de90] [channelName=by-dev-rootcoord-dml_0_449193005785809078v0]
[2024/05/16 06:33:15.336 +00:00] [DEBUG] [datacoord/meta.go:842] ["meta update: update drop channel segment info"] [channel=by-dev-rootcoord-dml_0_449193005785809078v0]
[2024/05/16 06:33:15.336 +00:00] [DEBUG] [datacoord/meta.go:1587] ["updating segment state and updating metrics"] [segmentID=449193005786090852] ["old state"=Flushed] ["new state"=Dropped] ["# of rows"=350000]
[2024/05/16 06:33:15.336 +00:00] [DEBUG] [datacoord/meta.go:1587] ["updating segment state and updating metrics"] [segmentID=449193005786161686] ["old state"=Flushed] ["new state"=Dropped] ["# of rows"=300000]
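To illustrate the failure mode: a hypothetical sketch of the tolerance check GC is presumably meant to apply (should_recycle and dropped_at are illustrative names, not actual Milvus code; the constant comes from the milvus.yaml above):

import time

DROP_TOLERANCE_S = 10800  # dataCoord.gc.dropTolerance from the config above

def should_recycle(dropped_at: float, now: float | None = None) -> bool:
    # If DropCollection never sets the drop timestamp, dropped_at stays 0
    # (epoch), the elapsed time is effectively infinite, and this check
    # passes immediately -- dropped data is recycled with no grace period.
    now = time.time() if now is None else now
    return now - dropped_at >= DROP_TOLERANCE_S

print(should_recycle(0))            # True: unset timestamp -> immediate GC (the observed bug)
print(should_recycle(time.time()))  # False: respects the 3 h tolerance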
