
SPARKC-686 scala 2.13 support #1361


Merged · 11 commits · Jul 28, 2023

Conversation

@SamTheisens (Contributor) commented Jul 11, 2023

Description

How did the Spark Cassandra Connector Work or Not Work Before this Patch

Spark Cassandra Connector did not work with Scala 2.13 before this patch.

General Design of the patch

This is another attempt inspired by #1349. The approach has been as follows:

  • Applied automated Scala collection migration via scalafix
  • Conditionally added dependencies on org.scala-lang.modules.scala-collection-compat and org.scala-lang.modules.scala-parallel-collections for Scala 2.13 (see the sbt sketch after this list)
  • Scala-version-specific implementations of SparkILoop, ParIterable, GenericJavaRowReaderFactory and CanBuildFrom.
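A minimal build.sbt sketch, not taken from this PR's actual build definition, of how such conditional dependencies can be wired up; the artifact versions are illustrative:

```scala
// Illustrative sbt snippet: pull in the compatibility libraries only when the
// build's Scala binary version is 2.13; older Scala versions get no extra deps.
libraryDependencies ++= {
  CrossVersion.partialVersion(scalaVersion.value) match {
    case Some((2, 13)) =>
      Seq(
        "org.scala-lang.modules" %% "scala-collection-compat"    % "2.11.0",
        "org.scala-lang.modules" %% "scala-parallel-collections" % "1.0.4"
      )
    case _ => Seq.empty
  }
}
```

For the version-specific sources, sbt 1.x also picks up src/main/scala-2.12 and src/main/scala-2.13 alongside src/main/scala by default, which is one common place to keep per-version implementations such as the SparkILoop and CanBuildFrom shims mentioned above.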

How Has This Been Tested?

All automated tests have passed in my fork, but the (2.12.11, 3.11.10) build seems to time out intermittently at cluster startup:

java.lang.RuntimeException: The command [ccm, create, ccm_1, -i, 127.12.0., -v, 5.1.24, --dse, --config-dir=/tmp/ccm6059075771339504008] was killed after 10 minutes

Checklist:

  • I have a ticket in the OSS JIRA
  • I have performed a self-review of my own code
  • Locally all tests pass (make sure tests fail without your patch)

Commit notes (branch: feature/SPARKC-686-scala-213-support):

  • "… while keeping compatibility with scala 2.12":
    - Applied automated scala collection migration via scalafix (https://github.com/scala/scala-collection-compat)
    - Conditionally add dependencies on `org.scala-lang.modules.scala-collection-compat` and `org.scala-lang.modules.scala-parallel-collections` for scala 2.13
    - Scala version specific implementations of `SparkILoop`, `ParIterable`, `GenericJavaRowReaderFactory` and `CanBuildFrom`.
  • "… with canonical way to map values"
@SamTheisens changed the title from "Feature/sparkc 686 scala 213 support" to "SPARKC-686 scala 213 support" on Jul 11, 2023
@SamTheisens changed the title from "SPARKC-686 scala 213 support" to "SPARKC-686 scala 2.13 support" on Jul 11, 2023
Further commit notes (same branch):

  • "… because Stream is deprecated and results in a stack overflow on scala 2.13"
  • "… `java.lang.ClassCastException: scala.collection.mutable.ArrayBuffer cannot be cast to scala.collection.immutable.Seq`"
  • "… so we don't need to trawl through the (long) log output to find out which test failed. Annotate only, which doesn't require check permission."
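One commit note above mentions that Stream is deprecated and caused a stack overflow on Scala 2.13. A small illustration of the usual replacement, LazyList, assuming nothing about the connector's actual code (the paging helper below is hypothetical):

```scala
// Illustrative only: scala.Stream is deprecated on 2.13 and LazyList is the
// replacement; unlike Stream, LazyList is lazy in both head and tail.
object LazyListExample {
  // Hypothetical helper: lazily page through rows without materializing all pages.
  def pages(fetch: Int => Seq[String]): LazyList[Seq[String]] =
    LazyList.from(0).map(fetch).takeWhile(_.nonEmpty)

  def main(args: Array[String]): Unit = {
    val rows  = Vector.tabulate(25)(i => s"row-$i")
    val byTen = (page: Int) => rows.slice(page * 10, (page + 1) * 10)
    pages(byTen).foreach(p => println(p.mkString(", ")))
  }
}
```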
@SamTheisens (Contributor, author) commented:

@jtgrabowski The intermittent test failures don't seem to depend on the particular version of Scala or Cassandra used. Rather, I seem to have more success in the early hours in my timezone (Indonesia). Re-running only the failed tests tends to get all the lights green.

Could it be that we run into timing problems due to varying amounts of resources available to GitHub Actions at different times of the day?

[screenshot: CI check results, 2023-07-14 at 15:44]

@jtgrabowski (Contributor):

I don't know what is causing this; the errors seem unrelated to the PR.
I apologize, but I won't be able to take a look until the 24th of July. Let's revisit then.

@SamTheisens (Contributor, author):

> I don't know what is causing this; the errors seem unrelated to the PR. I apologize, but I won't be able to take a look until the 24th of July. Let's revisit then.

No worries! Is there anything I could do in the meantime to help prepare a new release?

@jtgrabowski (Contributor) left a review comment:

This is great. Left minor comments.

@@ -12,7 +11,9 @@ import com.datastax.spark.connector.cql.CassandraConnector
 import com.datastax.spark.connector.embedded.SparkTemplate._
 import com.datastax.spark.connector.rdd.partitioner.EndpointPartition
 import com.datastax.spark.connector.writer.AsyncExecutor
+import spire.ClassTag
@jtgrabowski (Contributor): Use scala.reflect.ClassTag
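A tiny sketch of the suggested standard-library import in use; the helper below is hypothetical and not code from this PR:

```scala
import scala.reflect.ClassTag

// A ClassTag in scope lets the typed pattern recover the erased element type at runtime.
def firstOfType[T: ClassTag](xs: Seq[Any]): Option[T] =
  xs.collectFirst { case t: T => t }
```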

@@ -771,7 +771,7 @@ class TableWriterSpec extends SparkCassandraITFlatSpecBase with DefaultCluster {
     verifyKeyValueTable("key_value")
   }

-  it should "be able to append and prepend elements to a C* list" in {
+  it should "be able to.append and.prepend elements to a C* list" in {
@jtgrabowski (Contributor): Could you remove dots from test name here and others below?


@@ -175,7 +175,7 @@ class GettableDataToMappedTypeConverter[T : TypeTag : ColumnMapper](
       for ((s, _) <- columnMap.setters)
       yield (s, ReflectionUtil.methodParamTypes(targetType, s).head)
     val setterColumnTypes: Map[String, ColumnType[_]] =
-      columnMap.setters.mapValues(columnType)
+      columnMap.setters.map{case (k, v) => (k, columnType(v))}.toMap
@jtgrabowski (Contributor): Just curious, why did you choose .map().toMap over .view.mapValues...?

@SamTheisens (Contributor, author) commented Jul 28, 2023:

IIRC I was struggling to find a syntax that works in both 2.12 and 2.13. It looks like mapValues was actually fine as long as the return type is converted back to a map.
Commit: [SPARKC-686] Replace closure with idiomatic mapValues (branch: feature/SPARKC-686-scala-213-support). The .toMap is necessary for Scala 2.13, as the function returns a `scala.collection.MapView` instead of a Map.
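A minimal sketch of the MapView behaviour the commit note describes, assuming Scala 2.13; the map contents are illustrative and not taken from the connector:

```scala
object MapValuesCompat {
  def main(args: Array[String]): Unit = {
    val setters = Map("setName" -> "name", "setAge" -> "age")

    // Cross-version friendly: map the entries and get a strict Map directly.
    val viaMap: Map[String, Int] =
      setters.map { case (k, v) => (k, v.length) }

    // Scala 2.13 idiom: .view.mapValues returns a scala.collection.MapView,
    // so an explicit .toMap is needed to materialize a strict Map again.
    val viaView: Map[String, Int] =
      setters.view.mapValues(_.length).toMap

    println(viaMap == viaView) // true
  }
}
```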
@jtgrabowski (Contributor) left a review comment:

LGTM! Thank you for this awesome contribution.
Please sign the CLA (https://cla.datastax.com/) if you haven't already.

@SamTheisens (Contributor, author):

> LGTM! Thank you for this awesome contribution. Please sign the CLA (https://cla.datastax.com/) if you haven't already.

Yes, I've signed.

@jtgrabowski merged commit 5a25f7f into apache:master on Jul 28, 2023
@jtgrabowski (Contributor):

Thank you @SamTheisens! A new scc version should be released next week.

@SamTheisens (Contributor, author):

> Thank you @SamTheisens! A new scc version should be released next week.

@jtgrabowski anything I could help out with towards creating a release?

@jtgrabowski (Contributor):

@SamTheisens the release is now done. The artifacts should appear in the public repository shortly. Thanks again!

@SamTheisens (Contributor, author):

> @SamTheisens the release is now done. The artifacts should appear in the public repository shortly. Thanks again!

Awesome! Thanks a lot!

@jesperancinha:

I tested this only today. I was following the other branch. Everything works now. Thank you!

eappere added a commit to eappere/spark-cassandra-connector that referenced this pull request Dec 5, 2024
eappere added a commit to criteo-forks/spark-cassandra-connector that referenced this pull request Dec 6, 2024