Where does the branch predictor store its prediction data?

aerosayan · November 14, 2020, 15:36

Where does the branch predictor store its prediction data?

I ask this question because I don't understand where the speculative execution pipeline of the CPU stores its prediction data.

As per my understanding, the Instruction Cache (ICACHE) will prefetch and load the code that is to be executed next. So it makes sense that the branch predictors will also load their prediction data from somewhere. I don't know from where, though.

For example:

Code:

// Solver core loop
// code ...
// code ...
// code ...


// Calculate flux

if (flux == ROE_FLUX)
{
  // Calculate Roe flux
}
else if (flux == VAN_LEER_FLUX)
{
  // Calculate Van Leer flux
}


// code ...
// code ...
// code ...

In the example given above, the flux calculation happens in the core of the solver loop. However, as we can see, the code has to decide which flux to calculate. In a conventional solver only a single type of flux is used, so only one of those branches will be taken for all of the iterations and the other will never execute.

I want to ensure that the branch predictor will always select the correct branch and not suffer from branch misprediction penalties.

So, will the branch predictor always know which branch to select (and if so, where does it load this data from), or will it always have to test the two branches and suffer from occasional branch mispredictions?

I'm not looking to do premature optimizations, but just to understand the process better.

Thanks

aero_head · November 14, 2020, 16:24

Hello Sayan,

I am pretty sure more context is needed but I'm not sure if it would be from you or from someone who knows more about the software.

I think branch prediction works well when the same branch is taken on every iteration, so I am not sure there would be a noticeable performance impact. I know you are trying to understand the process, though. That depends on how complicated the code in the first if branch is, and it gets into some deep hardware execution details.

I am unsure whether you are writing the code that runs in the main loop or whether some other code is being used. If you have access to code that runs beforehand, you could set up a function pointer or use some other trick to avoid the if statement inside the loop. I am not sure you would have that ability, though, because something like this runs iterations of the solver on several cores, so it doesn't follow the typical execution flow.
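A minimal sketch of the function pointer idea, assuming the flux type is chosen once during initialization. All names here (FluxType, roe_flux, van_leer_flux, select_flux, accumulate_flux) are hypothetical, and the two kernels are placeholders rather than real Roe or Van Leer schemes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; nothing here is taken from a real solver.
enum class FluxType { Roe, VanLeer };

// Placeholder kernels. Real schemes would compute a numerical flux from the
// left and right states; these only need to be distinguishable.
double roe_flux(double left, double right)      { return 0.5 * (left + right); }
double van_leer_flux(double left, double right) { return left - right; }

using FluxFn = double (*)(double, double);

// Called once during solver initialization, outside the hot loop.
FluxFn select_flux(FluxType type) {
    switch (type) {
        case FluxType::Roe:     return roe_flux;
        case FluxType::VanLeer: return van_leer_flux;
    }
    assert(false && "unknown flux type");
    return nullptr;
}

// The core loop has no if/else on the flux type, only an indirect call.
double accumulate_flux(FluxFn flux, const std::vector<double>& u) {
    double sum = 0.0;
    for (std::size_t i = 0; i + 1 < u.size(); ++i) {
        sum += flux(u[i], u[i + 1]);
    }
    return sum;
}
```

Calling accumulate_flux(select_flux(FluxType::Roe), u) picks the kernel once and then reuses it. Note that the indirect call is itself a predicted branch and usually blocks inlining, so this moves the decision rather than removing it; whether it helps would need to be measured.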

That being said, do you have more code available for us to look at to try to help more?

flotus1 · November 14, 2020, 17:28

The big question here is: where and how do you assign a value to "flux"?

aerosayan · November 15, 2020, 04:42

Quote:

Originally Posted by flotus1

The big question here is: where and how do you assign a value to "flux"?

The value for flux is set during the initialization step of the solver. The value isn't known at compile time, so the compiler can't perform dead-code or dead-branch elimination. The value is guaranteed to be present in the L1 cache: all of the Plain Old Data variables in the performance-critical section are kept together in memory so that they are loaded in a single cache line and are always available for use.
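One way to recover compile-time branch removal for a value that is only known at run time is to compile the loop once per flux type and dispatch a single time before entering it. The sketch below assumes that approach; solver_loop, run_solver and the arithmetic inside are hypothetical stand-ins.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical names; the loop bodies are placeholders, not real flux schemes.
enum class FluxType { Roe, VanLeer };

// One copy of the loop is instantiated per flux type. Inside each copy F is a
// compile-time constant, so an optimizing compiler can discard the side of
// the if that cannot run (C++17 "if constexpr" would make this explicit).
template <FluxType F>
double solver_loop(const std::vector<double>& u) {
    double sum = 0.0;
    for (std::size_t i = 0; i + 1 < u.size(); ++i) {
        if (F == FluxType::Roe) {
            sum += 0.5 * (u[i] + u[i + 1]);   // stand-in for Roe flux
        } else {
            sum += u[i] - u[i + 1];           // stand-in for Van Leer flux
        }
    }
    return sum;
}

// The run-time value is inspected exactly once, outside the hot loop.
double run_solver(FluxType flux, const std::vector<double>& u) {
    switch (flux) {
        case FluxType::Roe:     return solver_loop<FluxType::Roe>(u);
        case FluxType::VanLeer: return solver_loop<FluxType::VanLeer>(u);
    }
    return 0.0;  // unreachable for valid enum values
}
```

The cost is one extra instantiation of the loop per flux type in the binary, which is usually acceptable for a handful of schemes.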

arjun · November 15, 2020, 05:34

Do you also want us to guess the software you are talking about?? Just wondering.

aerosayan · November 15, 2020, 06:23

Quote:

Originally Posted by arjun

Do you also want us to guess the software you are talking about?? Just wondering.

Writing my own code using C++ and Fortran 2003.

Question asked was regarding how modern pipelined microprocessors handle code execution in their Speculative Execution optimizer to reduce the cost of conditional branching, and not any commercial software.

It is a hardware related question.

flotus1 · November 15, 2020, 08:14

Since it's a run-time constant, there are certainly some tricks you could pull off to avoid branch prediction entirely.
But at the same time, branch prediction should have no issues handling this. After all, it's always the same code path, which is the best-case scenario short of knowing the code path at compile time. You could use a profiler to verify what's happening.
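To illustrate why an always-identical code path is the easy case, here is a toy model of a single 2-bit saturating counter, a classic textbook prediction scheme. It is only a sketch: real predictors keep many such entries in small on-chip tables indexed by branch address and history, build them up at run time from observed outcomes, and are considerably more elaborate. The names TwoBitCounter and count_mispredictions and the starting state are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Toy model of one 2-bit saturating counter.
// States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
struct TwoBitCounter {
    int state = 1;  // assumed starting state: weakly "not taken"

    bool predict() const { return state >= 2; }

    // Move one step toward the observed outcome, saturating at 0 and 3.
    void update(bool taken) {
        if (taken) {
            if (state < 3) ++state;
        } else {
            if (state > 0) --state;
        }
    }
};

// Replays a sequence of branch outcomes and counts wrong predictions.
std::size_t count_mispredictions(const std::vector<bool>& outcomes) {
    TwoBitCounter counter;
    std::size_t misses = 0;
    for (bool taken : outcomes) {
        if (counter.predict() != taken) ++misses;
        counter.update(taken);
    }
    return misses;
}
```

With this model, a branch that is taken on every one of 1000 iterations mispredicts only once during warm-up, while a strictly alternating pattern mispredicts every time. A branch on a run-time constant behaves like the first case.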

arjun · November 15, 2020, 13:08

Quote:

Originally Posted by aerosayan

Writing my own code using C++ and Fortran 2003.

Question asked was regarding how modern pipelined microprocessors handle code execution in their Speculative Execution optimizer to reduce the cost of conditional branching, and not any commercial software.

It is a hardware related question.

None of this is clear from your original posting.
Good luck.
