Skip to content

Conversation

@bxf12315
Copy link
Contributor

@bxf12315 bxf12315 commented Nov 3, 2025

No description provided.

@bxf12315 bxf12315 requested a review from ctron November 3, 2025 14:47
Copy link
Contributor

@ctron ctron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running that change over and over again, it would always set the same value on the same entity.

I think we should have this a bit more random. At least when it comes to the values. One idea could be to use a timestamp as a value.

Another idea could be to pick a random advisory, instead of using only a single (configured) one.

@bxf12315
Copy link
Contributor Author

Running that change over and over again, it would always set the same value on the same entity.

I think we should have this a bit more random. At least when it comes to the values. One idea could be to use a timestamp as a value.

Another idea could be to pick a random advisory, instead of using only a single (configured) one.

I chose the second approach. Please review it again.

@ctron
Copy link
Contributor

ctron commented Nov 12, 2025

Another idea could be to pick a random advisory, instead of using only a single (configured) one.

I chose the second approach. Please review it again.

Great. Sorry for not being clear enough. I think in this case the values should not be configured. There seems to be no benefit of doing that. But simply always fetch it from the database.

@bxf12315
Copy link
Contributor Author

The corresponding pull request. Thanks
guacsec/trustify-scale-test-runs#30

@ctron
Copy link
Contributor

ctron commented Nov 26, 2025

The corresponding pull request. Thanks guacsec/trustify-scale-test-runs#30

I am not sure how this relates to the request of not having the value configured, but evaluated from the database?

@bxf12315
Copy link
Contributor Author

bxf12315 commented Nov 27, 2025

The corresponding pull request. Thanks guacsec/trustify-scale-test-runs#30

I am not sure how this relates to the request of not having the value configured, but evaluated from the database?

If put_advisory_labels and patch_advisory_labels are empty in the scenario, these two methods will be skipped. Does trustify-scale-test-run regenerate the scenario file on every run? Do you mean removing put_advisory_labels and patch_advisory_labels from the scenario and reading them from the database each time we run the scale test? However, in this case, no database is specified when the test is executed.
At the moment, the logic in this PR is to query 20 IDs from the database for put_advisory_labels and patch_advisory_labels. Then, here (https://github.com/guacsec/trustify-scale-testing/pull/78/files#diff-6991c4c6d61c5d119cd2983b5f8f30c3b5681dddb103d2978c7ab4175b2567f2R153), these IDs are read and used. As the scale test currently reads its data from the scenario, it seems logical to write those IDs into the scenario file. Would this approach be considered incorrect? If not, could you please provide some more detailed clarification? Thanks.

@ctron
Copy link
Contributor

ctron commented Nov 27, 2025

Another idea could be to pick a random advisory, instead of using only a single (configured) one.

I chose the second approach. Please review it again.

Great. Sorry for not being clear enough. I think in this case the values should not be configured. There seems to be no benefit of doing that. But simply always fetch it from the database.

Right, this is what I tried to say here. Those values should not be configured. But just picked randomly from the database, each time it runs. I think this could be done by using the API. Or it could be done by directly accessing the database. I think using the API might make it easier to run, as no DB connection would be required.

It may be beneficial to split this up into multiple steps.

@bxf12315
Copy link
Contributor Author

bxf12315 commented Nov 27, 2025

Another idea could be to pick a random advisory, instead of using only a single (configured) one.

I chose the second approach. Please review it again.

Great. Sorry for not being clear enough. I think in this case the values should not be configured. There seems to be no benefit of doing that. But simply always fetch it from the database.

Right, this is what I tried to say here. Those values should not be configured. But just picked randomly from the database, each time it runs. I think this could be done by using the API. Or it could be done by directly accessing the database. I think using the API might make it easier to run, as no DB connection would be required.

It may be beneficial to split this up into multiple steps.

Does this mean that before performing each put or patch action, an ID needs to be queried first, just like it is done when uploading and deleting advisories?
Do we need to retain the put_advisory_labels and patch_advisory_labels attributes in the scenario? If so, should the corresponding advisory IDs be randomly fetched via REST API at the start of the scale test and assigned to these attributes? Is that correct?

@bxf12315
Copy link
Contributor Author

The latest commit randomly selects an advisory ID and then performs put or patch operations on it. However, it results in two requests being made.
Screenshot 2025-11-27 at 22 44 20
That seems reasonable as a benchmark, don’t you think?

"download_advisory": "24ae57c3-4b57-4f4e-82c1-83ae26059a89",
"get_advisory": "24ae57c3-4b57-4f4e-82c1-83ae26059a89"
"get_advisory": "24ae57c3-4b57-4f4e-82c1-83ae26059a89",
"put_advisory_lables": [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove those arrays.

src/restapi.rs Outdated
user: &mut GooseUser,
) -> Result<String, Box<TransactionError>> {
// Generate random 3-digit offset
let offset = rand::thread_rng().gen_range(0..=999);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of assuming a maximum number, you can (on startup) query the total number.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please check again.

delete_counter.clone()
), name: format!("delete_sbom_from_pool_sequential[{} SBOMs]", pool.len()))
),
name: format!("delete_sbom_from_pool_sequential[{} SBOMs]", pool.len()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure "pool" is the right term now.

use goose::goose::{GooseUser, TransactionResult};
use goose::goose::{GooseMethod, GooseRequest, GooseUser, TransactionError, TransactionResult};
use reqwest::{Client, RequestBuilder};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove that empty lone and have it sorted.

src/restapi.rs Outdated
use urlencoding::encode;

// Static variable to store advisory total count
static ADVISORY_TOTAL: OnceCell<u64> = OnceCell::const_new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having a global static, wouldn't it be possible to perform this lookup once during the start of the application and pass it in to functions?

src/restapi.rs Outdated

/// Send Advisory labels request using PUT method
pub async fn put_advisory_labels(user: &mut GooseUser) -> TransactionResult {
let advisory_id = list_advisory_random_single(user).await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having this as part of the function will pollute the measurements for this operation. as it does add more: the loading of the advisory and possibly the loading of all advisories.

To my understanding, this could be split into two steps and information be passed along as GooseUserData.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for Goose, its transactions seem to be executed in a random order — may I ask how we can ensure they are executed sequentially? Wouldn’t this add more complexity?
The best approach would actually be to read a certain number of IDs from the database and then perform operations on them randomly, just like in the first version of the PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned (with a link) on the other PR, I don't think that's the case. The round robin schedule (which is default) keeps the order we define.

The idea of the tests is to see how operations change over time. If we combine operations, we might not be aware of changes, or which operation is the cause for a change. As Goose seems to support splitting it up and supporting handover of state, we should leverage that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean exchanging data through session? Session values are only valid for the current user, but these actions should be callable by different users, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the link you mentioned earlier? Could you share it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused about a few points and would appreciate your clarification:

1. This PR does not involve uploading advisories; the advisory upload is handled in a separate PR.

Yes it is. It's unfortunate, as it could go together with this one.

2. If we do not use `GooseScheduler::Serial`, how can we ensure that looking up the advisory ID is always executed before the PUT or PATCH request? In my testing, after setting parameters via `GooseUserData`, reading them back is not guaranteed to happen on the same `GooseUser`.(`GooseScheduler:: RoundRobin `)

To my understanding of https://book.goose.rs/config/scheduler.html#round-robin-scheduler ... this is what would work for as, and is the default. Serial would work too, but it not the default.

3. In this PR, there are three operations:
   
   * Compute the total number of advisories
   * Randomly select one advisory ID
   * Perform a PUT or PATCH on the labels for this ID
   
   Should these three operations be grouped into a dedicated `Scenario`?

The compute part is only required to limit the test. It should only be executed once. Therefore, there is no need to have this as part of the metrics. Randomly selecting an advisory is required with every operation. However its time to run should not pollute the time of the actual operation.

   And in the "upload advisory" case, would there then be two operations:
   
   * Upload the advisory file
   * Delete this file

If using GooseUserData is not appropriate for the second point, what would be the recommended way to pass parameters between these steps?

I think that is the appropriate way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, GooseUserData does not solve this problem, because it is only valid for the current GooseUser, and in different transactions there may be different GooseUsers executing the steps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, GooseUserData does not solve this problem, because it is only valid for the current GooseUser, and in different transactions there may be different GooseUsers executing the steps.

I am not sure how you come to this conclusion. I created an example which should show how to use it and which should prove it does what we require: https://github.com/ctron/goose-session-example

Please let me know if there are any flaws in this example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, give me a bit of time to make the changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After rechecking the previous code related to GooseUserData, I found that set_on_start is the key factor. If GooseUserData is not set in a set_on_start transaction, it is very hard to ensure that the data is later read from the same GooseUser.​

I have already made a new revision following the logic we discussed earlier; could you please review it? If it looks good to you, I will update the upload Advisory PR accordingly.

src/restapi.rs Outdated
use urlencoding::encode;

/// Get advisory total count
pub async fn get_advisory_total(user: &mut GooseUser) -> TransactionResult {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't do this is a transaction, but once on the startup of the application (if required) and pass in the value as u64 to functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have already moved get_advisory_total out of the transaction, but running it in the transaction startup should be a more reasonable approach. This is the result of running get_advisory_total in the transaction startup.get_advisory_total is not included in the metrics list.
Screenshot 2025-12-15 at 21 33 57

New version code, a lot of additional logic checks were introduced, and a new environment variable is also required. Do you think this level of change is really worth it?

src/restapi.rs Outdated
Comment on lines 33 to 34
/// List advisory with random offset and limit=1
pub async fn list_advisory_random_single(user: &mut GooseUser) -> TransactionResult {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// List advisory with random offset and limit=1
pub async fn list_advisory_random_single(user: &mut GooseUser) -> TransactionResult {
/// Get a random advisory ID
pub async fn find_random_advisory(user: &mut GooseUser) -> TransactionResult {

src/restapi.rs Outdated
if let Some(id) = first_item.get("uuid").and_then(|u| u.as_str()) {
log::info!("Listing advisory with offset {}: {}", offset, id);

let goose_user_data =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, please re-use the user data from the top of the function.

src/restapi.rs Outdated
pub async fn list_advisory_random_single(user: &mut GooseUser) -> TransactionResult {
// Get the total count first
let total = {
let goose_user_data = user
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code for getting the user data is used quite often in the same way. Please extract this into an own function. As part of the GooseUserData struct.

src/restapi.rs Outdated
let response = user.get("/api/v2/advisory").await?;
let json_data = response.response?.json::<serde_json::Value>().await?;
/// Fetch advisory total count once at application startup
pub async fn get_advisory_total(host: Option<String>) -> Result<u64, anyhow::Error> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it looks like the value is never None, can't we just use host: &str?

src/restapi.rs Outdated

log::info!("Fetching advisory total from: {}", url);

let response = client
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it you don't re-use the client, you can also use request::get directly.

src/main.rs Outdated
.set_weight(5)?;
// Register advisory label transactions if host is available.
// Since the scenario object doesn't provide host information, we use environment
let host = s.host.clone().or_else(|| std::env::var("HOST").ok());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there maybe a way to get this from the Goose configuration?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried for a long time and so far have not found a way to obtain it from the Goose configuration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants