
"unable to evaluate payload provided" with KMeans Clustering Algorithm #102

Open
hdamani09 opened this issue Oct 31, 2019 · 1 comment

System Information

  • Spark:
  • SDK Version: spark_2.2.0-1.2.5
  • Spark Version: 2.4.3
  • Algorithm: KMeans

Describe the problem

Hi,
I trained a model with the SageMaker Spark SDK's KMeans algorithm on a sample CSV dataset, with the hyperparameters shown below:

"hyperParameters": {
        "feature_dim": "6",
        "k": "10",
        "mini_batch_size": "15"
}

The dataframe after feature engineering and vector assembly is as follows:

+-------------------+---+---+-----+---------------+----+----+----------------------------------+
|Result_of_Treatment|sex|age|Time |Number_of_Warts|Type|Area|features |
+-------------------+---+---+-----+---------------+----+----+----------------------------------+
|0 |1 |17 |9.25 |12 |1 |10 |[0.0,1.0,17.0,9.25,12.0,1.0,10.0] |
|0 |1 |17 |11.5 |2 |1 |10 |[0.0,1.0,17.0,11.5,2.0,1.0,10.0] |
|0 |1 |23 |10.25|7 |3 |72 |[0.0,1.0,23.0,10.25,7.0,3.0,72.0] |
|0 |1 |29 |11.75|5 |1 |96 |[0.0,1.0,29.0,11.75,5.0,1.0,96.0] |
|0 |1 |34 |11.25|1 |3 |150 |[0.0,1.0,34.0,11.25,1.0,3.0,150.0]|
|0 |1 |34 |12.0 |1 |3 |150 |[0.0,1.0,34.0,12.0,1.0,3.0,150.0] |
|0 |1 |40 |11.5 |9 |2 |80 |[0.0,1.0,40.0,11.5,9.0,2.0,80.0] |
|0 |1 |50 |8.0 |1 |3 |132 |[0.0,1.0,50.0,8.0,1.0,3.0,132.0] |
|0 |1 |50 |8.75 |11 |3 |132 |[0.0,1.0,50.0,8.75,11.0,3.0,132.0]|
|0 |1 |63 |2.75 |3 |3 |20 |[0.0,1.0,63.0,2.75,3.0,3.0,20.0] |
|0 |1 |67 |3.75 |11 |3 |20 |[0.0,1.0,67.0,3.75,11.0,3.0,20.0] |
|0 |2 |23 |11.75|12 |3 |72 |[0.0,2.0,23.0,11.75,12.0,3.0,72.0]|
|0 |2 |24 |9.5 |3 |3 |20 |[0.0,2.0,24.0,9.5,3.0,3.0,20.0] |
|0 |2 |27 |8.75 |2 |1 |6 |[0.0,2.0,27.0,8.75,2.0,1.0,6.0] |
|0 |2 |32 |12.0 |4 |3 |750 |[0.0,2.0,32.0,12.0,4.0,3.0,750.0] |
|0 |2 |34 |11.25|3 |3 |150 |[0.0,2.0,34.0,11.25,3.0,3.0,150.0]|
|0 |2 |34 |12.0 |3 |3 |95 |[0.0,2.0,34.0,12.0,3.0,3.0,95.0] |
|0 |2 |35 |8.5 |6 |3 |100 |[0.0,2.0,35.0,8.5,6.0,3.0,100.0] |
|0 |2 |36 |10.5 |4 |1 |8 |[0.0,2.0,36.0,10.5,4.0,1.0,8.0] |
|1 |1 |15 |3.5 |2 |1 |4 |[1.0,1.0,15.0,3.5,2.0,1.0,4.0] |
+-------------------+---+---+-----+---------------+----+----+----------------------------------+
only showing top 20 rows

root
|-- Result_of_Treatment: integer (nullable = true)
|-- sex: integer (nullable = true)
|-- age: integer (nullable = true)
|-- Time: double (nullable = true)
|-- Number_of_Warts: integer (nullable = true)
|-- Type: integer (nullable = true)
|-- Area: integer (nullable = true)
|-- features: vector (nullable = true)
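One detail worth checking (my observation, not stated elsewhere in this report): the assembled `features` vector shown above has 7 elements, apparently including the `Result_of_Treatment` label, while `feature_dim` is set to 6. A standalone sanity check, with the hyperparameters and the first row hard-coded purely for illustration (plain Python, no Spark or SageMaker dependency):

```python
# Hyperparameters as given in the issue above.
hyper_parameters = {"feature_dim": "6", "k": "10", "mini_batch_size": "15"}

# First row of the assembled `features` column from the dataframe dump.
features = [0.0, 1.0, 17.0, 9.25, 12.0, 1.0, 10.0]

expected_dim = int(hyper_parameters["feature_dim"])
actual_dim = len(features)

if actual_dim != expected_dim:
    # A dimension mismatch like this is one plausible reason the endpoint
    # would reject the payload with a 400 "unable to evaluate payload provided".
    print(f"Mismatch: feature_dim={expected_dim}, "
          f"but the assembled vectors have {actual_dim} elements.")
```

If this check fires, dropping the label column from the `VectorAssembler` input columns (or raising `feature_dim` to match) would be the first thing to try.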

When I try to transform this dataframe with the KMeans model, I get the following exception stack trace. Can someone help me understand and resolve why this is occurring?

Minimal repro / logs

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 14.0 failed 4 times, most recent failure: Lost task 0.3 in stage 14.0 (TID 16, ip-172-21-87-93.aws.com, executor 1): com.amazonaws.services.sagemakerruntime.model.ModelErrorException: Received client error (400) from hd-kmeans-Model-20191031-184713 with message "unable to evaluate payload provided". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/hd-kmeans-endpoint-20191031-184713 in account 820784505615 for more information. (Service: AmazonSageMakerRuntime; Status Code: 424; Error Code: ModelError; Request ID: 5b811028-e055-4a53-b0f3-3c7df73565ca)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1712)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1367)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512)
    at com.amazonaws.services.sagemakerruntime.AmazonSageMakerRuntimeClient.doInvoke(AmazonSageMakerRuntimeClient.java:235)
    at com.amazonaws.services.sagemakerruntime.AmazonSageMakerRuntimeClient.invoke(AmazonSageMakerRuntimeClient.java:211)
    at com.amazonaws.services.sagemakerruntime.AmazonSageMakerRuntimeClient.executeInvokeEndpoint(AmazonSageMakerRuntimeClient.java:175)
    at com.amazonaws.services.sagemakerruntime.AmazonSageMakerRuntimeClient.invokeEndpoint(AmazonSageMakerRuntimeClient.java:151)
    at com.amazonaws.services.sagemaker.sparksdk.transformation.util.RequestBatchIterator.hasNext(RequestBatchIterator.scala:133)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$13$$anon$1.hasNext(WholeStageCodegenExec.scala:636)
    at org.apache.spark.sql.execution.columnar.CachedRDDBuilder$$anonfun$1$$anon$1.hasNext(InMemoryRelation.scala:125)
    at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221)
    at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:299)
    at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1165)
    at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
    at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
    at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
    at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
    at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:121)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Driver stacktrace:
Cause: com.amazonaws.services.sagemakerruntime.model.ModelErrorException: Received client error (400) from hd-kmeans-Model-20191031-184713 with message "unable to evaluate payload provided". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/hd-kmeans-endpoint-20191031-184713 in account 820784505615 for more information. (Service: AmazonSageMakerRuntime; Status Code: 424; Error Code: ModelError; Request ID: 5b811028-e055-4a53-b0f3-3c7df73565ca)

nadiaya (Contributor) commented Nov 6, 2019

Driver stacktrace: Cause : com.amazonaws.services.sagemakerruntime.model.ModelErrorException: Received client error (400) from hd-kmeans-Model-20191031-184713 with message "unable to evaluate payload provided". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/hd-kmeans-endpoint-20191031-184713 in account 820784505615 for more information. (Service: AmazonSageMakerRuntime; Status Code: 424; Error Code: ModelError; Request ID: 5b811028-e055-4a53-b0f3-3c7df73565ca)

Could you provide logs of the error from https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/hd-kmeans-endpoint-20191031-184713 in account 820784505615?
