Misleading Date Format Pattern for CassandraSourceConnector #901

Open
jp-9 opened this issue Dec 22, 2022 · 0 comments

jp-9 commented Dec 22, 2022

Issue Guidelines

Please review these questions before submitting any issue.

What version of the Stream Reactor are you reporting this issue for?

bbd3c5b

Are you running the correct version of Kafka/Confluent for the Stream reactor release?

Yes

Do you have a supported version of the data source/sink, i.e. Cassandra 3.0.9?

Yes

Have you read the docs?

Yes

What is the expected behaviour?

The CassandraDateFormatter uses the date format pattern: "yyyy-MM-dd HH:mm:ss.SSS'Z'"

  class CassandraDateFormatter {
    private val dateFormatPattern = "yyyy-MM-dd HH:mm:ss.SSS'Z'"  // <----- Hardcoded pattern

    def parse(date: String): Date = {
      val dateFormatter = new SimpleDateFormat(dateFormatPattern)
      dateFormatter.parse(date)
    }

    def format(date: Date): String = {
      val dateFormatter = new SimpleDateFormat(dateFormatPattern)
      dateFormatter.format(date)
    }

    def getYear(date: Date): Option[Int] = {
      val dateFormatter = new SimpleDateFormat("yyyy");
      dateFormatter.format(date).toIntOption
    }
  }

When setting my initial offset, I want to specify it in UTC, so intuitively you would write something like this:
connect.cassandra.initial.offset=2022-12-22 18:00:00.000Z <--- the Z at the end usually indicates that this is a UTC+00 date.

However, the format pattern that is actually implemented is misleading. The date string must end in a Z in order to be parsed at all, but because the Z in the pattern is quoted, it is matched as a literal character and ignored when determining the time zone; the date is interpreted in the JVM's default time zone instead. For the trailing Z to indicate UTC, the pattern has to be "yyyy-MM-dd HH:mm:ss.SSSX" (https://docs.oracle.com/en/java/javase/12/docs/api/java.base/java/text/SimpleDateFormat.html)
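To make the suggested pattern concrete, here is a minimal Java sketch of a formatter built on "yyyy-MM-dd HH:mm:ss.SSSX". The class name `UtcOffsetFormat` and its helper methods are hypothetical and are not part of the Stream Reactor code base. One assumption worth calling out: the formatter's time zone is pinned to UTC, because with `X` the `format` call otherwise emits the local offset (for example `-05`) rather than `Z`.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper for illustration only; not the connector's actual class.
public class UtcOffsetFormat {
    // X is an ISO 8601 zone designator, so a trailing "Z" is read as UTC.
    static final String PATTERN = "yyyy-MM-dd HH:mm:ss.SSSX";

    // SimpleDateFormat is not thread-safe, so build a fresh instance per call.
    static SimpleDateFormat newFormatter() {
        SimpleDateFormat formatter = new SimpleDateFormat(PATTERN);
        // Pin the zone so format() writes "Z" instead of the JVM's local offset.
        formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
        return formatter;
    }

    static Date parse(String date) {
        try {
            return newFormatter().parse(date);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + date, e);
        }
    }

    static String format(Date date) {
        return newFormatter().format(date);
    }

    public static void main(String[] args) {
        Date offset = parse("2022-12-22 18:00:00.000Z");
        System.out.println(offset.getTime()); // epoch milliseconds
        System.out.println(format(offset));   // round-trips to the same string
    }
}
```

With this pattern the parsed instant no longer depends on the time zone of the machine running the connector, and an explicit offset such as `-05` in the input is also honoured.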

An example to illustrate my point, assuming I am in UTC-05:00 (Eastern Standard Time).

According to ISO 8601, "2022-12-22 12:00:00.000Z" should be Thu Dec 22 07:00:00 EST 2022.

>> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS'Z'").parse("2022-12-22 12:00:00.000Z")
Output:
❌ Thu Dec 22 12:00:00 EST 2022

>> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSX").parse("2022-12-22 12:00:00.000Z")
Output:
✔ Thu Dec 22 07:00:00 EST 2022
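The comparison above depends on the machine's default time zone. The following sketch reproduces it on any machine by setting the formatter's zone explicitly to America/New_York, which should behave like a JVM whose default zone is Eastern Time. The class `QuotedZDemo` and the method `epochMillis` are illustrative names, not existing code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Illustrative reproduction; names are made up for this example.
public class QuotedZDemo {
    // Parses `input` with `pattern`, using `zoneId` for any fields the
    // input does not pin to a zone itself, and returns epoch milliseconds.
    static long epochMillis(String pattern, String zoneId, String input) {
        SimpleDateFormat formatter = new SimpleDateFormat(pattern);
        formatter.setTimeZone(TimeZone.getTimeZone(zoneId));
        try {
            return formatter.parse(input).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + input, e);
        }
    }

    public static void main(String[] args) {
        String input = "2022-12-22 12:00:00.000Z";
        String zone = "America/New_York";

        // Quoted 'Z' is a literal: 12:00 is read as local (EST) time.
        long quoted = epochMillis("yyyy-MM-dd HH:mm:ss.SSS'Z'", zone, input);
        // X parses the Z as a UTC designator: 12:00 is read as UTC.
        long iso = epochMillis("yyyy-MM-dd HH:mm:ss.SSSX", zone, input);

        System.out.println((quoted - iso) / 3_600_000L + " hours apart");
    }
}
```

In December, when New York is on EST (UTC-05:00), the two instants are five hours apart, matching the outputs shown above.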

Was this design intentional? Is there a way to set the initial offset in Zulu Time?

** Edit: Accidentally included some of my test code in the CassandraDateFormatter copy
