Newer versions of the library on maven central #96

Open
jayantshekhar opened this issue Oct 21, 2019 · 5 comments

Comments

@jayantshekhar

System Information

  • spark_2.2.0-1.2.5
  • Spark 2.3 and later

Describe the problem

EMR clusters that use Spark 2.3 and later have newer versions of the SageMaker Spark jars.

However, those versions are not available on Maven Central: https://mvnrepository.com/artifact/com.amazonaws/sagemaker-spark

Is there a plan to release builds for Spark 2.3 and later to Maven Central? If not, are there any recommendations for running on later EMR versions?

Minimal repo / logs

@laurenyu
Contributor

Unfortunately, we don't have any plans to upgrade the current Spark version, but we are always re-evaluating our roadmap based on customer feedback!

@jayantshekhar
Author

Thanks for that Lauren!

I'm trying to understand this. Is Spark-SageMaker on the roadmap, and would you recommend that users continue building solutions on it?

Is there something else you would like us to use when integrating with SageMaker, especially when running jobs on EMR?

@nadiaya
Contributor

nadiaya commented Nov 11, 2019

Please refer to our documentation for Spark support: https://docs.aws.amazon.com/sagemaker/latest/dg/apache-spark.html

We are evaluating our roadmap and will add support for the latest version in the future.

@jayantshekhar
Author

Thanks a lot Nadia!
Will keep an eye on it and look forward to support for Spark 2.3 and Spark 2.4.

@ehameyie

I had issues running sagemaker_pyspark on EMR 5.22, per this closed issue. I was able to get it working without problems and confirmed this with AWS tech support. The changes I had to apply are listed in my comments on the closed issue linked above. I figured I'd also post here in case it benefits anyone else.

One question, though. It appears that the sagemaker_pyspark SDK is not updated as often as the sagemaker Python SDK. Should we not be concerned, because sagemaker_pyspark is a wrapper for the sagemaker Python SDK? Or is it indeed a lower priority on your roadmap and therefore receives less support?
