[Spark][Infra] Drop support for Spark 3.5 and formally pin to released Spark 4.0.1 #5616
Conversation
This reverts commit 8fd880f.
numberOfAddFiles = checkpointDataIter.getNumberOfAddActions();
} catch (FileAlreadyExistsException faee) {
  throw new CheckpointAlreadyExistsException(version);
} catch (IOException io) {
Upgrading the Hadoop version changes this error class.
Hm .. I wonder what the change was?
I'm not sure what changed in Hadoop, but instead of seeing a FileAlreadyExistsException we now see an IOException with a FileAlreadyExistsException as the cause. We have this tested (at least one test fails without this fix).
They seem similar enough, so I didn't look further into it. Seems like a minor API difference.
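To make the difference concrete, here is a minimal, hypothetical Java sketch (not the actual Delta Kernel code) of handling both behaviors: the pre-upgrade case where FileAlreadyExistsException is thrown directly, and the post-upgrade case where it arrives wrapped as the cause of an IOException. The writeCheckpointFile helper is made up for illustration; CheckpointAlreadyExistsException and version come from the quoted diff.

import java.io.IOException;
import org.apache.hadoop.fs.FileAlreadyExistsException;

// Hypothetical sketch: surface the same domain exception whether the
// file-already-exists failure is thrown directly or wrapped in an IOException.
void writeCheckpointOrThrow(long version) throws IOException, CheckpointAlreadyExistsException {
  try {
    writeCheckpointFile(version); // hypothetical helper that writes the checkpoint file
  } catch (FileAlreadyExistsException faee) {
    // Old behavior: the exception is thrown directly.
    throw new CheckpointAlreadyExistsException(version);
  } catch (IOException io) {
    if (io.getCause() instanceof FileAlreadyExistsException) {
      // New behavior after the Hadoop upgrade: unwrap the cause.
      throw new CheckpointAlreadyExistsException(version);
    }
    throw io; // unrelated I/O failure, rethrow as-is
  }
}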
In theory I could do the source code shims + test code shims separately if that would help. Let me know if that makes reviews easier (not sure if anyone wants to review the shim code changes closely, or if tests passing & code compiling is enough).
kernel-spark/src/main/java/io/delta/kernel/spark/catalog/SparkTable.java
echo "❌ Cache MISS - will download dependencies"
fi
- name: Run tests
# Run unit tests with JDK 17. These unit tests depend on Spark, and Spark 4.0+ is JDK 17.
Would this comment be better placed beside java-version: "17"?
@@ -1,59 +0,0 @@
name: "Delta Spark Master"
I suggest we update the PR title to say we drop support for Spark 3.5 and Spark master compilation?
I mean we haven't actually been compiling with Spark master in a while... (as we're using a very stale snapshot). But I can make the title clearer.
Hm. Sorry, I'm still confused. Here we are deleting our job to compile against Spark "master", right? (Perhaps it was a stale master.)
But does "Drop support for Spark 3.5 and formally pin to released Spark 4.0.1" reflect that?
That seems like an important highlight, sorry, and I want to make sure my understanding is correct.
I think calling it Spark master before was misleading; in fact, in the previous PR we renamed the Spark version spec to spark40Snapshot instead of master. Saying we are removing Spark master is misleading, considering we were never actually compiling with Spark master. We will fix that in future PRs.
It would be more correct to say spark_master_test.yaml was incorrectly named this whole time.
// Changes in 4.1.0
// TODO: change in type hierarchy due to removal of DeltaThrowableConditionShim
ProblemFilters.exclude[MissingTypesProblem]("io.delta.exceptions.*")
@reviewers this seems safe to me, considering no one should be catching DeltaThrowableConditionShim... but I would like additional opinions.
).configureUnidoc()

/*
TODO: readd delta-iceberg on Spark 4.0+
@lzlfred Hey Fred, we will be releasing on both Spark 4.0 and Spark 4.1 in the next release, so we will need to update this build to work for that.
also tracking the todo at #5326
).configureUnidoc()

/*
TODO: compilation broken for Spark 4.0
tracking at #5326
@linzhou-db @littlegrasscao FYI, can you please look into fixing this once I merge this PR?
val lookupSparkVersion: PartialFunction[(Int, Int), String] = {
  // version 4.0.0-preview1
  case (major, minor) if major >= 4 => "4.0.0-preview1"
  // TODO: how to run integration tests for multiple Spark versions
tracking at #5326
with open("python/README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

# TODO: once we support multiple Spark versions update this to be compatible with both
tracking at #5326
Which Delta project/connector is this regarding?
Description
PART OF #5326
Contains the following changes:
In a future PR
(these will require adding new shims, but in different areas)
How was this patch tested?
Unit tests + ran integration tests locally (Python, Scala + pip)
Tracking open TODOs at #5326