Updated Python libraries to work with current Scala libraries in dataframes #113

Open
wants to merge 7 commits into
base: master

stevebuckingham

Using the point, polygon, and polyline types in PySpark was broken by changes in the Scala libraries: the references that are now required, and the way the types are serialized and deserialized.

These changes re-enable DataFrame functionality in the Python libraries.

codecov-io commented May 15, 2017

Codecov Report

Merging #113 into master will increase coverage by 12.71%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           master     #113       +/-   ##
===========================================
+ Coverage    76.7%   89.41%   +12.71%     
===========================================
  Files          42       48        +6     
  Lines        1395     1550      +155     
  Branches      103      106        +3     
===========================================
+ Hits         1070     1386      +316     
+ Misses        325      164      -161

| Impacted Files | Coverage Δ |
|---|---|
| ...in/scala/magellan/mapreduce/ShapeInputFormat.scala | 90.62% <0%> (-9.38%) ⬇️ |
| src/main/scala/magellan/BoundingBox.scala | 93.47% <0%> (-6.53%) ⬇️ |
| src/main/scala/magellan/io/ShapeWritable.scala | 55.55% <0%> (-4.45%) ⬇️ |
| src/main/scala/magellan/Line.scala | 86.88% <0%> (-4.35%) ⬇️ |
| ...scala/org/apache/spark/sql/types/PolyLineUDT.scala | 88.88% <0%> (-4.22%) ⬇️ |
| .../scala/org/apache/spark/sql/types/PolygonUDT.scala | 93.75% <0%> (-2.55%) ⬇️ |
| src/main/scala/magellan/index/ZOrderCurve.scala | 92.68% <0%> (-1.61%) ⬇️ |
| src/main/scala/magellan/PolyLine.scala | 89.18% <0%> (-1.09%) ⬇️ |
| ...ain/scala/magellan/mapreduce/ShapefileReader.scala | 96.15% <0%> (-0.82%) ⬇️ |
| src/main/scala/magellan/index/Indexer.scala | 100% <0%> (ø) ⬆️ |

... and 36 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e74c03a...58226de.

@harsha2010
Owner

@stevebuckingham, were you able to test this on a cluster? Do joins etc. on a distributed DataFrame work in Python, or are more changes needed to fix the Python scripts?

@harsha2010
Owner

Running sbt/sbt clean test shows a compilation failure in the Python module:

Listing '/Users/kernelfish/projects/magellan/python'...
Listing '/Users/kernelfish/projects/magellan/python/magellan'...
Compiling '/Users/kernelfish/projects/magellan/python/magellan/types.py'...
*** File "/Users/kernelfish/projects/magellan/python/magellan/types.py", line 443
print Point(1,1)
^
SyntaxError: invalid syntax

@stevebuckingham
Author

I'll update that. I've been compiling on Python 2.7, where that statement is valid. I'll import print_function from __future__ and update the code. I've got within joins between points and polygons working, but I am having a problem getting intersect to work with a polyline and a polygon (which I think it should, looking at the Scala code). So I suggest you reject this pull request, and I can submit a new one when I fix both of the above.
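
For reference, a minimal sketch of the print fix that compiles on both Python 2.7 and Python 3. The Point class here is a hypothetical stand-in written only for this example; it is not the actual class from python/magellan/types.py.

```python
# Must be the first statement in the module (after any docstring).
# On Python 2.7 it turns print into a function; on Python 3 it is a no-op.
from __future__ import print_function


class Point(object):
    """Hypothetical stand-in for the Magellan Point type (illustration only)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return "Point({}, {})".format(self.x, self.y)


# The statement form `print Point(1,1)` is a SyntaxError on Python 3;
# the function-call form below is valid on both interpreters.
print(Point(1, 1))  # prints: Point(1, 1)
```

With the import in place, any remaining statement-style print in the same file fails on 2.7 as well, so leftover occurrences show up before the build runs under Python 3.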
