I am benchmarking N2V2 prediction on files of different sizes (330 MB to 5.13 GB) on a local server with an A6000 GPU (48 GB GDDR6) and on an HPC cluster with A100 GPUs (40 GB HBM2). Performance is similar for the smaller files, but for the larger files prediction is up to 50% faster on the local server, even though we expected the opposite. Does N2V2 use FP32 or FP16 in the backend? Can it make use of Tensor Cores? And is there frequent data transfer between GPU memory, the CPU cache, and RAM? Could someone provide details on this?
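
In case it is useful, here is a minimal sketch of how I have been checking the active precision, assuming N2V2 runs on the TensorFlow/Keras backend used by the original `n2v` package (if the implementation in question is a PyTorch-based port, the equivalent knobs would be `torch.autocast` and `torch.backends.cuda.matmul.allow_tf32`):

```python
# Sketch: inspect the active compute precision, assuming a
# TensorFlow/Keras backend (as in the original n2v package).
import tensorflow as tf

# Keras defaults to float32 unless a mixed-precision policy was set;
# only with a reduced-precision policy are matmuls/convs dispatched
# to Tensor Cores as float16.
print(tf.keras.mixed_precision.global_policy())

# Enabling mixed precision explicitly (hypothetical for N2V2 -- the
# model would have to be built after setting the policy):
# tf.keras.mixed_precision.set_global_policy("mixed_float16")

# On Ampere GPUs (A6000, A100), TF32 can also route float32 matmuls
# through Tensor Cores; TensorFlow enables it by default on Ampere.
print(tf.config.experimental.tensor_float_32_execution_enabled())
```

If the policy reports plain `float32` and TF32 execution is disabled, the kernels would run on the regular CUDA cores, which might explain why the A100's Tensor Cores give it no edge here.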