
Conversation

@timcreatedit

As you can see from #70, I expected to be able to use this package directly from my normal testWidgets tests, just like I would with matchesGoldenFile.

As a POC, I would propose something like this to make usage of this package as frictionless as possible. Things that are still TBD:

  • What do we do with the runner variants (e.g. testGoldenSceneOnIOS)? I personally would've expected them to be part of the Gallery functionality itself, so that I can just create the same gallery on different platforms.
  • While I think it probably works in all cases like this, I can't verify 100% accuracy, since the goldens don't pass right now. The failure scenes look equivalent to me, but it would of course be better if we first make sure the goldens pass on main.
  • Probably more; for now, this PR's main purpose is to gather feedback.

@matthew-carroll
Contributor

I don't think you "need" to use testGoldenScene(). You can replicate all the code that's in it: https://github.com/Flutter-Bounty-Hunters/flutter_test_goldens/blob/main/lib/src/test_runners.dart#L241

But now you've just moved that responsibility onto yourself for every test.

What do we do with the runner variants (e.g. testGoldenSceneOnIOS)? I personally would've expected them to be part of the Gallery functionality itself, so that I can just create the same gallery on different platforms.

There are some tests that you only want to run on iOS. That runner is a convenience to configure the platform to iOS. If you want a gallery with multiple platforms, you don't need to use that runner.
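(For illustration only, here is a minimal sketch of configuring the platform yourself inside a plain testWidgets test. debugDefaultTargetPlatformOverride is the standard Flutter mechanism; it's an assumption that this is roughly equivalent to what testGoldenSceneOnIOS does internally.)

import 'package:flutter/foundation.dart';
import 'package:flutter_test/flutter_test.dart';

void main() {
  testWidgets('gallery golden on iOS', (tester) async {
    // Force iOS behavior for this test only, and restore the default
    // platform when the test finishes.
    debugDefaultTargetPlatformOverride = TargetPlatform.iOS;
    addTearDown(() {
      debugDefaultTargetPlatformOverride = null;
    });

    // ... pump the gallery and run the golden comparison here ...
  });
}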

Contributor

@matthew-carroll left a comment

I did some review. I can't tell how much of this is actually intentional, because it looks like you auto-reformatted the files you opened, so it's mostly noise. Please try to revert and apply just the changes you intend to make.

as golden_toolkit;

/// Remember if fonts have already been loaded in this isolate.
bool _fontsLoaded = false;
Contributor

Is this actually needed? What's the current behavior that you're trying to fix with this?

Author

It seems to me that, right now, if I have multiple testGoldenScene tests in the same suite, they will all read the font manifest, even though the fonts have already been loaded. The same thing would happen with my change, but even if we don't move font loading somewhere else, it seems like wasted work?
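(A minimal sketch of the intent behind the _fontsLoaded flag above. The helper name _ensureFontsLoaded is hypothetical; the actual loading still goes through the package's existing TestFonts.loadAppFonts().)

/// Remember if fonts have already been loaded in this isolate.
bool _fontsLoaded = false;

/// Hypothetical helper: loads the app fonts at most once per isolate,
/// skipping the manifest read on subsequent calls.
Future<void> _ensureFontsLoaded() async {
  if (_fontsLoaded) {
    return;
  }
  await TestFonts.loadAppFonts();
  _fontsLoaded = true;
}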

Contributor

I'm just looking for clarity as to whether this actually prevents additional work, or whether we're just repeating something that's already tracked inside the font-loading behavior. So without this, we're re-reading the manifest on every call even if the fonts are already loaded? (I haven't looked at that in a while.)

// TODO: Return a success/failure report that we can publish to the test output.
await _compareGoldens(tester, _fileName, screenshots);
FtgLog.pipeline.finer("Done comparing goldens.");
await TestFonts.loadAppFonts();
Contributor

We don't want to force this on everyone. If people want standard Ahem goldens, they need to be able to get them.

FtgLog.pipeline.finer("Done comparing goldens.");
await TestFonts.loadAppFonts();

tester.view
Contributor

We don't wanna force this either.

I see that you're trying to merge things together, but that's actually the opposite of what a toolkit should do. Higher-level conveniences that add default behaviors are fine. But forcing decisions on everyone at the center of the tool will lead to angry devs who get to a certain point of adoption and then realize they can't control something that they need to control.

@timcreatedit
Author

I can't tell how much of this is actually intentional, because it looks like you auto-reformatted the files you opened, so it's mostly noise. Please try to revert and apply just the changes you intend to make.

Sorry for that, I didn't notice. I did this quickly at the end of my workday so we can discuss the idea, hence the draft status. Thank you for taking the time to go through it anyway!

We don't wanna force this either.

I see that you're trying to merge things together, but that's actually the opposite of what a toolkit should do. Higher-level conveniences that add default behaviors are fine. But forcing decisions on everyone at the center of the tool will lead to angry devs who get to a certain point of adoption and then realize they can't control something that they need to control.

I think I see where you are coming from, but then – at least to me – the current API isn't really intuitive. You are saying that we don't need testGoldenScene, but when you use the core functionality without it, it breaks.

Before this change, users would wrap their galleries in testGoldenScene, which sets up the view so that galleries actually lay out properly. My intent here was to move this setup to the gallery itself, since it needs it for its run.

I think on the flip side, it is more confusing to have the test function set up a different view for the entirety of its run, since it affects all other test code as well.

For font loading, I understand why moving it to the gallery and timeline run() is too restrictive, but it also seems weird to have it in the test runner. If I want to render a gallery using Ahem, I have to just know to set up the View correctly myself, but not to load the fonts, since testGoldenScene implicitly did both for me in other cases.

Expected behavior for me would be to set up tester.view only for the run() of the Gallery/Timeline, and have font rendering behavior be opt-in in another way (a function call beforehand, or maybe a method on the Gallery while you build it). It's a bit trickier, of course, since you can't unload fonts once they're loaded 🥲
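(For illustration, a sketch of what the explicit opt-in could look like from a plain testWidgets test. This assumes TestFonts is imported from the package; skipping the call would keep the default Ahem rendering.)

testWidgets('my gallery golden', (tester) async {
  // Opt in to real fonts explicitly; omit this call to keep Ahem goldens.
  await TestFonts.loadAppFonts();

  // ... build the Gallery/Timeline and call run() here ...
});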

Let me know what you think.

Author

I took the liberty to commit this, but I understand if you don't want this here. I thought it might help other VS Code users like me who have the "format on save" setting turned on.

@timcreatedit
Author

But now you've just moved that responsibility onto yourself for every test.

I suppose another way to deal with this is for the gallery layouts to actually work with different DPRs. Then setting up the view correctly wouldn't be as essential for using them.
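(In the meantime, a minimal sketch of doing that setup yourself in a plain testWidgets test. It just mirrors the view configuration that testGoldenScene() applies, as shown in the implementation quoted below.)

testWidgets('gallery golden', (tester) async {
  // Match the view configuration that testGoldenScene() applies, and make
  // sure it gets reset even if the test fails.
  tester.view
    ..devicePixelRatio = 1.0
    ..platformDispatcher.textScaleFactorTestValue = 1.0;
  addTearDown(tester.view.reset);

  // ... build the Gallery and run the golden comparison here ...
});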

@matthew-carroll
Contributor

I think I see where you are coming from, but then – at least to me – the current API isn't really intuitive. You are saying that we don't need testGoldenScene, but when you use the core functionality without it, it breaks.

Unless I'm missing something, I don't think this is true. I don't think the core functionality is broken - it looked like you just didn't account for the standard behavior of testWidgets(). If you're going to use testWidgets() then you've got to account for that choice.

Is there actually something broken in there, or are you just dealing with the consequences of Flutter setting up 3.0 DPI, etc?

Before this change, users would wrap their galleries in testGoldenScene, which sets up the view so that galleries actually lay out properly. My intent here was to move this setup to the gallery itself, since it needs it for its run.

testGoldenScene() has nothing to do with layout. This is the entire implementation of that method:

@isGoldenScene
@isTest
void testGoldenScene(
  String description,
  WidgetTesterCallback test, {
  bool? skip,
  Timeout? timeout,
  bool semanticsEnabled = true,
  TestVariant<Object?> variant = const DefaultTestVariant(),
  dynamic tags,
  int? retry,
}) {
  testWidgets(
    description,
    (tester) async {
      await TestFonts.loadAppFonts();

      tester.view
        ..devicePixelRatio = 1.0
        ..platformDispatcher.textScaleFactorTestValue = 1.0;

      try {
        await test(tester);
      } finally {
        tester.view.reset();
      }
    },
    skip: skip,
    variant: variant,
    timeout: timeout,
    semanticsEnabled: semanticsEnabled,
    tags: tags,
    retry: retry,
  );
}

I think on the flip side, it is more confusing to have the test function set up a different view for the entirety of its run, since it affects all other test code as well.

Please see the implementation above. We call tester.view.reset() after every test run.

For font loading, I understand why moving it to the gallery and timeline run() is too restrictive, but it also seems weird to have it in the test runner. If I want to render a gallery using Ahem, I have to just know to set up the View correctly myself, but not to load the fonts, since testGoldenScene implicitly did both for me in other cases.

I'm not sure there's any situation here that isn't a problem, because Flutter has made working with fonts in a test a problem. The question is, what's the least bad option we can provide? I'm open to options, but I think it's critical that we not put developers in a position where they can't make a choice about when/if a non-undoable thing is done.

Expected behavior for me would be to set up tester.view only for the run() of the Gallery/Timeline, and have font rendering behavior be opt-in in another way (a function call beforehand, or maybe a method on the Gallery while you build it). It's a bit trickier, of course, since you can't unload fonts once they're loaded 🥲

Why do you want to control tester.view with run()? This would seem to imply that you want to stick multiple golden tests in a single test run. I think that's probably not advisable for multiple reasons. On the other hand, if there's one golden generation/comparison per test, then I'm not sure what's gained by moving that control into run()?
