@koperagen
Collaborator

column arithmetics are essentially obsolete IMO, doesn't make sense to keep this page around
@koperagen koperagen added this to the 1.0.0-Beta5 milestone Dec 5, 2025
@koperagen koperagen requested a review from Jolanrensen December 5, 2025 15:32
@koperagen koperagen self-assigned this Dec 5, 2025
@koperagen koperagen added the documentation Improvements or additions to documentation (not KDocs) label Dec 5, 2025
@Jolanrensen
Collaborator

related issues: #530, #265

I wouldn't say they're obsolete actually. I've seen AI recommend colA + colB on multiple occasions. If we can, why shouldn't we support such a thing?

@koperagen
Collaborator Author

koperagen commented Dec 8, 2025

I think they are obsolete in the sense that we don't have cases where we'd recommend using them. At least I don't. dataframe is not pandas, and to be useful in Kotlin, most of the time you need to operate on individual objects / values. That is the default for the majority of our operations, except for the rare groupBy { s.length() }. But even there I'd recommend groupBy { expr { s.length } }, because we do have DataColumn<String>.length(), but not DataColumn<List>.size().

We can keep these functions around (I don't suggest deprecating them), but I prefer to migrate examples away from them whenever I see them. If you have specific examples where you find them useful, please share.
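
For readers following along, the row-based grouping mentioned above could look roughly like this. It is a minimal sketch that assumes the String API (`"s"<String>()`) instead of generated accessors; the sample data and the key name "length" are made up for illustration and are not taken from the docs page.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data; only the column name "s" comes from the comment above.
val words: DataFrame<*> = dataFrameOf("s" to columnOf("a", "bb", "cc"))

// The grouping key is computed per row with expr { },
// so no column-level length() function is needed.
val byLength = words.groupBy { expr("length") { "s"<String>().length } }

fun main() {
    // One output row per distinct string length, with the group size.
    byLength.count().print()
}
```

The same pattern should work for values that have no column-level helper, such as the size of a list stored in a cell.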

@Jolanrensen
Collaborator

@koperagen I think most useful examples are from Kandy:

Let's say you have some data with values and corrections of those values:

val df = dataFrameOf(
    "ID" to columnOf(1L, 2L, 3L),
    "score" to columnOf(10, 20, 30),
    "correction" to columnOf(-0.5, 0.0, 0.5),
)

It would be nice to plot the immediate results of the scores with correction without having to add a new "temporary" column, just for plotting:

df.plot { 
    bars { 
        x(ID)
        y(score + correction)
    }
}

(this is currently not possible between 2 columns, and there is no easy way to zip two columns... There's add {} but this requires an extra cell in notebooks (for now))
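
For reference, the add {} workaround mentioned here might look like the sketch below. It assumes the String API for row access, and "corrected" is a hypothetical column name chosen for illustration.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

val df = dataFrameOf(
    "ID" to columnOf(1L, 2L, 3L),
    "score" to columnOf(10, 20, 30),
    "correction" to columnOf(-0.5, 0.0, 0.5),
)

// Row-based alternative to `score + correction`:
// the sum is computed for each row and stored in a new column.
// "corrected" is a made-up name for this sketch.
val withCorrected = df.add("corrected") { "score"<Int>() + "correction"<Double>() }

fun main() {
    withCorrected.print()
}
```

The extra column can then be passed to the plot by name, at the cost of the additional step described above.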

"column operator value" or "value operator value" can usually be replaced with something like expr { column + value } or a column.map { it + value }. But it may be nice to be able to do a x(columnA / 100.0) at once in Kandy.

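As a rough illustration of the column.map { } replacement for "column operator value", here is a minimal sketch on a standalone column. The variable names are hypothetical and the values reuse the score column from the example above.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// A standalone DataColumn<Int>; in practice this would come from a DataFrame.
val score = columnOf(10, 20, 30)

// Equivalent in spirit to `score / 100.0`, written with a per-value transform.
val scaled = score.map { it / 100.0 }

fun main() {
    println(scaled.toList())
}
```

This covers the single-column case only; combining two columns still needs a row-based expression such as add { } or expr { }.
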
@koperagen
Collaborator Author

@koperagen I think most useful examples are from Kandy:

True! I can easily imagine that; I remember writing similar code myself.
I suggest removing the page, and when we decide to bring it back, let's state the "goal" of that API:

  1. Most of the time column operations are not needed in dataframe, unlike pandas, because we provide row-based APIs
  2. The recommended "adapter" functions, like map or expr
  3. An example where it might still be useful, and only then which APIs we provide

@Jolanrensen
Collaborator

@koperagen good plan :) Certainly better to remove it than have a // TODO on a page

@koperagen koperagen merged commit 94bbcf9 into master Dec 8, 2025
4 checks passed