SPARKC-686 scala 2.13 support #1361
Conversation
while keeping compatibility with scala 2.12:
- Applied automated scala collection migration via [scalafix](https://github.com/scala/scala-collection-compat)
- Conditionally add dependencies on `org.scala-lang.modules.scala-collection-compat` and `org.scala-lang.modules.scala-parallel-collections` for scala 2.13 (a hedged sbt sketch follows below)
- Scala version specific implementations of `SparkILoop`, `ParIterable`, `GenericJavaRowReaderFactory` and `CanBuildFrom`
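A minimal sbt sketch of what such conditional dependencies can look like (the module versions here are illustrative assumptions, not taken from this PR):

```scala
// build.sbt (sketch): pull in the compat modules only when building for scala 2.13
libraryDependencies ++= {
  CrossVersion.partialVersion(scalaVersion.value) match {
    case Some((2, 13)) => Seq(
      "org.scala-lang.modules" %% "scala-collection-compat"    % "2.11.0", // version assumed
      "org.scala-lang.modules" %% "scala-parallel-collections" % "1.0.4"   // version assumed
    )
    case _ => Seq.empty
  }
}
```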
with canonical way to map values
because Stream is deprecated and results in a stack overflow on scala 2.13
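A minimal sketch of this kind of migration, assuming `LazyList` as the replacement (the commit message does not name one; the function is hypothetical):

```scala
// Stream is deprecated on 2.13, and deeply recursive uses can overflow the stack.
def pagesOld(start: Int): Stream[Int] = start #:: pagesOld(start + 1)

// LazyList is the 2.13 replacement; on 2.12, scala-collection-compat backports it
// as scala.collection.compat.immutable.LazyList.
def pagesNew(start: Int): LazyList[Int] = start #:: pagesNew(start + 1)

pagesNew(0).take(3).toList // List(0, 1, 2), tail evaluated lazily
```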
`java.lang.ClassCastException: scala.collection.mutable.ArrayBuffer cannot be cast to scala.collection.immutable.Seq`
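For context, a hedged illustration of why this surfaces only on 2.13 (toy code, not from the PR):

```scala
import scala.collection.mutable.ArrayBuffer

val buf = ArrayBuffer(1, 2, 3)

// On 2.12, scala.Seq was collection.Seq, so casting an ArrayBuffer to Seq
// happened to succeed. On 2.13, scala.Seq is collection.immutable.Seq, so
// buf.asInstanceOf[Seq[Int]] throws the ClassCastException quoted above.

// An explicit copy compiles and behaves the same on both versions:
val fixed: Seq[Int] = buf.toList
```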
so we don't need to trawl through the (long) log output to find out which test failed. Annotate only, which doesn't require check permission.
@jtgrabowski The intermittence of the failing tests doesn't seem to depend on the particular version of scala or cassandra used. Rather, I seem to have more success in the early hours in my timezone (Indonesia). Could it be that we run into timing problems due to varying amounts of resources available to github actions at different times of the day?
I don't know what is causing this, the errors seem unrelated to the PR.
No worries! Anything I could do in the meantime to help prepare a new release?
as requested https://github.com/SamTheisens/spark-cassandra-connector/pull/1#discussion_r1263734916
This is great. Left minor comments.
```diff
@@ -12,7 +11,9 @@ import com.datastax.spark.connector.cql.CassandraConnector
 import com.datastax.spark.connector.embedded.SparkTemplate._
 import com.datastax.spark.connector.rdd.partitioner.EndpointPartition
 import com.datastax.spark.connector.writer.AsyncExecutor
+import spire.ClassTag
```
Use scala.reflect.ClassTag
Fixed in [SPARKC-686] Fix incorrect import
```diff
@@ -771,7 +771,7 @@ class TableWriterSpec extends SparkCassandraITFlatSpecBase with DefaultCluster {
     verifyKeyValueTable("key_value")
   }

-  it should "be able to append and prepend elements to a C* list" in {
+  it should "be able to.append and.prepend elements to a C* list" in {
```
Could you remove the dots from the test name here and the others below?
```diff
@@ -175,7 +175,7 @@ class GettableDataToMappedTypeConverter[T : TypeTag : ColumnMapper](
       for ((s, _) <- columnMap.setters)
         yield (s, ReflectionUtil.methodParamTypes(targetType, s).head)
     val setterColumnTypes: Map[String, ColumnType[_]] =
-      columnMap.setters.mapValues(columnType)
+      columnMap.setters.map{case (k, v) => (k, columnType(v))}.toMap
```
Just curious, why did you choose `.map().toMap` over `.view.mapValues...`?
IIRC I was struggling to find a syntax that works in both 2.12 and 2.13. It looks like `mapValues` was actually fine as long as the return type is converted back to a map.
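A small sketch of the two spellings being compared, with toy data rather than the connector's types:

```scala
val setters = Map("a" -> 1, "b" -> 2)
def columnType(i: Int): String = s"type$i"

// Strict rebuild of the Map; compiles on both 2.12 and 2.13.
val viaMap: Map[String, String] =
  setters.map { case (k, v) => (k, columnType(v)) }.toMap

// mapValues also compiles on both, but on 2.13 it is deprecated and returns
// a lazy scala.collection.MapView, so the trailing .toMap is required.
val viaMapValues: Map[String, String] =
  setters.mapValues(columnType).toMap
```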
[SPARKC-686] Replace closure with idiomatic mapValues
The .toMap is necessary for scala 2.13 as the function returns a `scala.collection.MapView` instead of a Map.
LGTM! Thank you for this awesome contribution.
Please sign the CLA (https://cla.datastax.com/) if you haven't already.
Yes, I've signed.
Thank you @SamTheisens ! A new scc version should be released next week.
@jtgrabowski anything I could help out with towards creating a release?
@SamTheisens the release is now done. The artifacts should appear in the public repository shortly. Thanks again!
Awesome! Thanks a lot!
I tested this only today; I was following the other branch. Everything works now. Thank you!
Description
How did the Spark Cassandra Connector Work or Not Work Before this Patch
Spark Cassandra Connector did not work with Scala 2.13 before this patch.
General Design of the patch
This is another attempt inspired by #1349. The approach has been as follows:
- Applied automated scala collection migration via [scalafix](https://github.com/scala/scala-collection-compat)
- Conditionally add dependencies on `org.scala-lang.modules.scala-collection-compat` and `org.scala-lang.modules.scala-parallel-collections` for scala 2.13
- Scala version specific implementations of `SparkILoop`, `ParIterable`, `GenericJavaRowReaderFactory` and `CanBuildFrom` (a sketch of the source layout follows below)
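The version-specific implementations can live in per-version source directories. A minimal sbt sketch of that layout, assuming the common `src/main/scala-2.12` / `src/main/scala-2.13` convention (not verified against this PR's build):

```scala
// build.sbt (sketch): compile a version-specific directory alongside src/main/scala
Compile / unmanagedSourceDirectories += {
  val base = (Compile / sourceDirectory).value
  CrossVersion.partialVersion(scalaVersion.value) match {
    case Some((2, 13)) => base / "scala-2.13"
    case _             => base / "scala-2.12"
  }
}
```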
How Has This Been Tested?
All automated tests have passed in my fork, but the (2.12.11, 3.11.10) build seems to time out intermittently at cluster startup.
Checklist: