What happens when QueryPerformanceCounter is called?

2021-12-18 00:00:00 windows winapi timing c++

I'm looking into the exact implications of using QueryPerformanceCounter in our system and am trying to understand its impact on the application. Running it on my 4-core, single-CPU machine, I can see that a call takes around 230ns. On a 24-core, 4-CPU Xeon it takes around 1.4ms. More interestingly, on my machine, calls made from multiple threads don't affect each other, but on the multi-CPU machine the threads interact in some way that causes them to block each other. Is there some shared resource on the bus that they all query? What exactly happens when I call QueryPerformanceCounter, and what does it really measure?

Recommended answer

Windows QueryPerformanceCounter() has logic to determine the number of processors and invoke synchronization logic if necessary. It attempts to use the TSC register, but on multiprocessor systems this register is not guaranteed to be synchronized between processors (and, more importantly, can vary greatly due to intelligent downclocking and sleep states).

MSDN says that it doesn't matter which processor this is called on, so the overhead you are seeing may come from extra synchronization code for that situation. Also remember that the call can invoke a bus transfer, so you may be seeing bus contention delays.

If possible, try using SetThreadAffinityMask() to bind the thread to a specific processor. Otherwise you might just have to live with the delay, or you could try a different timer (for example, take a look at http://en.wikipedia.org/wiki/High_Precision_Event_Timer).
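
To check whether pinning actually helps on a given machine, something along these lines could be used: pin the calling thread with SetThreadAffinityMask(), then time a tight loop of QueryPerformanceCounter() calls and compare the average against the unpinned case. This is only a rough sketch. The helper names (read_ticks, ticks_per_second, pin_current_thread_to_cpu, average_call_cost_ns) are made up for illustration, and the std::chrono branch is an assumption added so the code also builds off Windows; it does not measure QueryPerformanceCounter() there.

```cpp
#include <chrono>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: read the current tick count.
// On Windows this is QueryPerformanceCounter(); elsewhere it falls back to
// std::chrono::steady_clock (an assumption, only so the sketch is portable).
inline std::int64_t read_ticks() {
#ifdef _WIN32
    LARGE_INTEGER now;
    QueryPerformanceCounter(&now);
    return now.QuadPart;
#else
    return std::chrono::steady_clock::now().time_since_epoch().count();
#endif
}

// Hypothetical helper: number of ticks per second for read_ticks().
inline std::int64_t ticks_per_second() {
#ifdef _WIN32
    LARGE_INTEGER freq;
    QueryPerformanceFrequency(&freq);
    return freq.QuadPart;
#else
    using period = std::chrono::steady_clock::period;
    return period::den / period::num;
#endif
}

// Hypothetical helper: bind the calling thread to one logical processor.
// Returns true on success. Off Windows it does nothing and returns false.
inline bool pin_current_thread_to_cpu(unsigned cpu_index) {
#ifdef _WIN32
    if (cpu_index >= sizeof(DWORD_PTR) * 8) {
        return false;  // index does not fit in a single affinity mask
    }
    const DWORD_PTR mask = static_cast<DWORD_PTR>(1) << cpu_index;
    // SetThreadAffinityMask returns the previous mask, or 0 on failure.
    return SetThreadAffinityMask(GetCurrentThread(), mask) != 0;
#else
    (void)cpu_index;
    return false;
#endif
}

// Average cost of one read_ticks() call, in nanoseconds, measured by
// timing `iterations` back-to-back calls with the counter itself.
inline double average_call_cost_ns(int iterations) {
    if (iterations <= 0) {
        return 0.0;
    }
    const std::int64_t start = read_ticks();
    std::int64_t last = start;
    for (int i = 0; i < iterations; ++i) {
        last = read_ticks();
    }
    const double elapsed_ns = static_cast<double>(last - start) * 1e9 /
                              static_cast<double>(ticks_per_second());
    return elapsed_ns / iterations;
}
```

A comparison would call average_call_cost_ns(1000000) once before and once after pin_current_thread_to_cpu(0), ideally from several threads at the same time to reproduce the contention described in the question. Note that the affinity mask stays in effect for the thread until it is changed again.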
