@Dristro
@Dristro Dristro commented Feb 21, 2025

Noticed that .data_as_pandas returns a df containing both colors and labels, whereas .data_as_X_y returns colors rather than labels, which isn't very useful when interpreting and working with the data.

Update: Modified the ScatterWidget class so that .data_as_X_y now returns labels instead of colors.
Question: Was there a specific reason for returning color instead of label? Happy to discuss if there's a particular use case for it!

Returning the labels instead of the colors when using .data_as_X_y. It's more useful imo.
@koaning
Owner

koaning commented Feb 21, 2025

I need to think about this one a bit. I don't mind also having the labels in there but I think we want the colors too if we want to make matplotlib charts with the same data that have the same colors as the drawing widget.
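
To illustrate the concern, here is a minimal sketch of how the drawing colors could still be recovered when only labels are returned as y. It assumes rows that carry both a color and a label, as the dataframe variant is described as doing; the `rows` data, the column names, and the `label_to_color` helper are hypothetical and not part of the library.

```python
# Hypothetical rows, shaped like what .data_as_pandas is described as returning.
# The column names and values here are assumptions for illustration only.
rows = [
    {"x": 1.0, "y": 2.0, "color": "#1f77b4", "label": "a"},
    {"x": 3.0, "y": 1.5, "color": "#ff7f0e", "label": "b"},
    {"x": 0.5, "y": 0.2, "color": "#1f77b4", "label": "a"},
]


def label_to_color(rows):
    """Build a label -> color lookup from rows that carry both fields."""
    mapping = {}
    for row in rows:
        # setdefault keeps the first color seen for each label
        mapping.setdefault(row["label"], row["color"])
    return mapping


palette = label_to_color(rows)
labels = [row["label"] for row in rows]

# One color per point, looked up from its label
point_colors = [palette[label] for label in labels]
```

A list like `point_colors` could then be passed as the `c=` argument of matplotlib's `plt.scatter` so the chart matches the widget.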

@Dristro
Author

Dristro commented Feb 22, 2025

I see your point. Would it help to introduce two separate properties .data_as_X_y_colors and .data_as_X_y_labels? It’s a bit redundant but would allow for both functionalities.

Alternatively, .data_as_X_y could be a function with a parameter, something like type='colors' or 'labels'. That way, the user can choose what they need dynamically.

From my experience, I tend to use labels more than colors, so having that flexibility would be useful. What do you think?
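
A rough sketch of the parameterized variant, to make the idea concrete. `ScatterSketch` is a hypothetical stand-in and not the actual ScatterWidget code; the point format is an assumption, and the parameter is called `target` here only to avoid shadowing Python's built-in `type`.

```python
class ScatterSketch:
    """Hypothetical stand-in for the widget, for illustration only."""

    def __init__(self, points):
        # Each point is assumed to be a dict with "x", "y", "color" and "label" keys.
        self.points = points

    def data_as_X_y(self, target="labels"):
        """Return (X, y), where y holds either labels or colors."""
        if target not in ("labels", "colors"):
            raise ValueError(
                f"target must be 'labels' or 'colors', got {target!r}"
            )
        key = "label" if target == "labels" else "color"
        # X is a list of [x, y] coordinate pairs, one per drawn point
        X = [[p["x"], p["y"]] for p in self.points]
        y = [p[key] for p in self.points]
        return X, y


sketch = ScatterSketch(
    [
        {"x": 1.0, "y": 2.0, "color": "#1f77b4", "label": "a"},
        {"x": 3.0, "y": 1.5, "color": "#ff7f0e", "label": "b"},
    ]
)
X, y_labels = sketch.data_as_X_y()               # labels by default
_, y_colors = sketch.data_as_X_y(target="colors")
```

One trade-off: turning the property into a method would be a breaking change for existing callers of `.data_as_X_y`, whereas two separate properties would not.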

@koaning
Owner

koaning commented Mar 6, 2025

I guess it makes sense to return actual labels for data_as_X_y. If the user wants the colors, the dataframe variant can still contain the extra column.

Could you add a unit test for this PR?


2 participants