Multi Core, Multi CPU

ChristianWN
Posts: 4
Joined: Tue Aug 08, 2017 12:55 pm

Postby ChristianWN » Tue Aug 08, 2017 1:30 pm

Hi, I run a live setup on my dual-Xeon workstation. My approach is to run one DAW/live host per core and let the audio interface and its driver handle the audio synchronization. This way I avoid pops and clicks. But I am still looking for a DAW/live host that can use all the available cores, so I can run all my VST plugins inside it, which would be more practical. Right now I use VSTHost, but it cannot manage more than 4 instances, and I need 6 or more...
Searching the internet for a live host, I found LP2. Can you shed some light on how LP2 uses CPU cores? Could each plugin chain be mapped to a different core? And could the user have the option to choose? If not, is there a way to run more instances of LP2?

Thanks for any insight.
nikolai
Posts: 1231
Joined: Fri Sep 04, 2009 1:05 pm
Location: Norway

Re: Multi Core, Multi CPU

Postby nikolai » Wed Aug 09, 2017 12:40 pm

Hi Christian
Interesting that you should ask this now, as I'm investigating this topic these days.
ChristianWN wrote: Can you shed some light on how LP2 uses cpu cores?

Yes, big tech-nerd alert for the following explanation though.
First, software does not normally choose CPU cores itself; by default it is the OS that decides which core the work runs on, and with good reason.
The way it works is that the software has multiple "threads"; these are like separate "tracks" of code.
The OS then decides how these should be divided across the CPU cores.
Ideally, you would think that one thread per plugin, or per chain, would be a great idea, but it does not work in practice.
LiveProfessor uses around 15 threads all in all when idling: graphics, timers, audio processing, etc.
The current version does not seem to balance the audio processing load evenly across all cores.
I have a test version that does that, but what I have found recently is that when pushing latency down to the 32-64 sample area, the overhead (extra work) of managing and separating the tasks makes it less efficient again. So there is a big gain at higher latency, but a loss at lower latency.
That is why the current version kind of plays it safe. I am working on it though; maybe I'll add an option for the user so that each system can be tweaked for best performance.
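
To put rough numbers on the latency trade-off described above, here is a small worked sketch. The 48 kHz sample rate and the fixed 0.1 ms of thread-coordination cost per callback are illustrative assumptions, not measurements from LiveProfessor.

```python
# Worked example: how much of each audio callback a fixed
# thread-coordination cost eats at different buffer sizes.
# ASSUMPTIONS (not from the thread): 48 kHz sample rate and a fixed
# 0.1 ms of scheduling/synchronization cost per callback.

SAMPLE_RATE_HZ = 48_000
SYNC_COST_MS = 0.1


def callback_deadline_ms(buffer_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Time available to process one buffer before the next one is due."""
    return buffer_samples / sample_rate_hz * 1000.0


def overhead_fraction(buffer_samples, sync_cost_ms=SYNC_COST_MS):
    """Share of the deadline consumed by the fixed coordination cost."""
    return sync_cost_ms / callback_deadline_ms(buffer_samples)


if __name__ == "__main__":
    for size in (32, 64, 128, 1024):
        print(f"{size:5d} samples: "
              f"deadline {callback_deadline_ms(size):7.3f} ms, "
              f"overhead {overhead_fraction(size):6.2%}")
```

Under these assumptions, a 32-sample buffer leaves about 0.67 ms per callback, so the fixed cost takes roughly 15% of it, while at 1024 samples the same cost is under 0.5%. That is one plausible reading of why splitting the work pays off at higher latency and hurts at lower latency.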
ChristianWN
Posts: 4
Joined: Tue Aug 08, 2017 12:55 pm

Re: Multi Core, Multi CPU

Postby ChristianWN » Wed Aug 09, 2017 4:45 pm

Hi, thanks for the reply. I would like to suggest a brainstorming session, and I can also help test the multi-core support. I have time, a dual-CPU Xeon computer, and years of deep understanding of digital audio. I was also a programmer for years. I post my findings on the RME forums. My email is christianwn [at] gmail [dot] com.

I see what you are saying, but consider this; there are many factors to take into account:

Normally you can let the OS decide the thread distribution, as you said yourself. I don't think this is a good idea for low-latency audio. It may work under optimal conditions, but it is risky.
So you need to use affinity in your code to force the load onto a specific core. This is also because of the L1/L2 cache: you want everything to stay on the same core and cache, not fly around between cores.
The hard part, from my perspective, is that you would have to do asynchronous audio mixing inside LP2; here you can weigh complexity against just getting something to work. You would then have a main thread that communicates with the ASIO driver, and you would async/mix all the other threads into that thread.
There is also a workaround for this asyncing and mixing: you can let the audio interface/driver do the work. I use an RME UCX, so I can only speak for that. The way you do it is this: you have each chain in LP2 go to a separate input and output, with no mixing in LP2. Then you use the mixer included with the audio interface to do all the summing/mixing and routing. You can then use hardware loopback channels to mix even more. This way LP2 stays away from synchronization issues. You then basically run each chain as a program within a program.
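
A minimal sketch of the affinity idea from the points above, in Python for brevity. It assumes a platform where Python exposes `os.sched_setaffinity` (Linux does; macOS and Windows do not), and the helper name `pin_current_thread_to_core` is made up for this example. A real audio host would do the equivalent from native code.

```python
# Sketch: pin the caller to one core so its working set stays in that
# core's caches. os.sched_setaffinity only exists on some Unix platforms
# (notably Linux), so the helper reports failure elsewhere instead of
# raising. The helper name is hypothetical.
import os


def pin_current_thread_to_core(core):
    """Try to restrict the caller to `core`; return True on success."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # platform does not expose this API
    try:
        # pid 0 means "the caller" (on Linux, the calling thread)
        os.sched_setaffinity(0, {core})
    except (OSError, ValueError):
        return False  # core does not exist or is not permitted
    return True


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        allowed = sorted(os.sched_getaffinity(0))
        print("allowed cores:", allowed)
        print("pinned:", pin_current_thread_to_core(allowed[0]))
    else:
        print("CPU affinity is not exposed by the os module here")
```

Whether pinning actually helps is exactly what is being debated in this thread, so this is best treated as something to switch on and off in a test build and measure.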

Even though NUMA programming is meant for multi-CPU systems, it can also be used on single-CPU systems, because a single-CPU system is basically a NUMA system with one CPU/node. Look at these articles:

https://software.intel.com/en-us/articl ... s-for-numa
https://stackoverflow.com/questions/246 ... e-a-thread
http://www.rossbencina.com/code/real-ti ... or-nothing

This one has 2 videos further down.
https://forum.juce.com/t/c-threads-vs-j ... s/17272/16


Regards
Christian W. Nyman
nikolai
Posts: 1231
Joined: Fri Sep 04, 2009 1:05 pm
Location: Norway

Re: Multi Core, Multi CPU

Postby nikolai » Wed Aug 09, 2017 10:54 pm

ChristianWN wrote: Hi, thanks for the reply. I would like to suggest a brainstorming and I can also help out testing the multi-core support. I have time and a dual cpu xeon computer and years of deep understanding of digital audio. I was also a programmer for years. I post my findings on the RME forums. My email is christianwn [at] gmail [dot] com.

Glad to hear, I'll take you up on that.
ChristianWN wrote: So you need to use affinity in your programming code to force the load onto a specific core.

Well, recommendations seem to vary on this. The argument against it is that there are many, many other threads that do jump around, and by assigning threads to specific cores we get in the way of the OS.
It would be cool to have a test version with options to turn these things on and off.
Right now there is one audio thread per core (with realtime priority), but they are managed by the OS.
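
As a structural sketch only, this is roughly what "worker threads process the chains, one thread mixes the result" looks like. It is not LiveProfessor's code: the plugins here are hypothetical per-sample functions, and because of CPython's global interpreter lock the workers will not really run in parallel. The point is the fan-out and the wait before mixing, which is where the coordination overhead comes from.

```python
# Structural sketch: fan chains out to a worker pool, then mix in the
# calling thread. ASSUMPTIONS: a "plugin" is a hypothetical function
# applied to each sample, a "chain" is a list of plugins run in series.
# CPython's GIL prevents true parallelism here; a real host would use
# native realtime threads.
import os
from concurrent.futures import ThreadPoolExecutor


def run_chain(chain, block):
    """Run every plugin of one chain in series over a block of samples."""
    for plugin in chain:
        block = [plugin(sample) for sample in block]
    return block


def process_block(chains, block, pool):
    """Process all chains on the pool, then sum their outputs."""
    futures = [pool.submit(run_chain, chain, block) for chain in chains]
    outputs = [f.result() for f in futures]  # wait for every chain
    return [sum(samples) for samples in zip(*outputs)]


if __name__ == "__main__":
    chains = [
        [lambda s: s * 0.5],                    # chain 1: one gain stage
        [lambda s: s * 2.0, lambda s: s + 1.0]  # chain 2: two stages
    ]
    pool = ThreadPoolExecutor(max_workers=os.cpu_count() or 1)
    print(process_block(chains, [1.0, 2.0], pool))
    pool.shutdown()
```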
ChristianWN wrote: There is also a workaround for this asyncing and mixing. You can let the audio-interface/driver do this work. I use RME UCX so I can only speak for that. The way you do it is: You have each chain in LP2 go to a separate input and output, no mixing in the LP2. Then you use the included mixer for the audio-interface to do all the summing/mixing and routing. You can then use hardware loopback channels to mix even more. This way LP2 will stay away from synchronization issues. You then basically run each chain as a program within a program.

Are you saying this can be done without adding an additional buffer (and thereby latency)?
ChristianWN
Posts: 4
Joined: Tue Aug 08, 2017 12:55 pm

Re: Multi Core, Multi CPU

Postby ChristianWN » Thu Aug 10, 2017 12:32 am

nikolai wrote: Are you saying this can be done without adding an additional buffer (and thereby latency)?


Yes, I do this right now with 4 instances of the live host (DAW) VSTHost. I run each instance on a separate core (affinity), connected to separate inputs and outputs. Then I mix the tracks in the mixer of the audio interface. I send the final mix and reverb to LP2 via loopback, and then everything goes out to the speakers via analog out 1 and 2. This gives no synchronization overhead, such as cross-CPU syncing. The loopback has no latency; it is actually ahead of time and needs a delay. I cannot speak for other audio interfaces regarding loopback and latency, but there should not be any noticeable delay. This would save you from doing all the really intricate synchronization work in LP2 and from reinventing the wheel. This is what I would love to see in a live host: tight integration with the audio interface mixer. RME's TotalMix can even be controlled via Mackie Control... Perhaps not the easiest setup, but I know it can work really well.
Lytz1
Posts: 30
Joined: Mon Jul 11, 2016 3:06 pm

Re: Multi Core, Multi CPU

Postby Lytz1 » Wed Aug 16, 2017 12:46 am

Well, all I can say is that something is very wrong and messed up with regard to multi-core processing on OSX...

I am peaking over 90% in LP2, with crackles and dropouts galore... On the other hand, Activity Monitor shows the busiest core at around 50%, with not much going on anywhere else.
And that's on a 12-core Mac Pro... That's not good...

[Attachment: screenshot.png (64.26 KiB)]
nikolai
Posts: 1231
Joined: Fri Sep 04, 2009 1:05 pm
Location: Norway

Re: Multi Core, Multi CPU

Postby nikolai » Fri Aug 18, 2017 11:19 am

Hi
Ok, so I've been experimenting a bit with this.

Here are some test versions for you:
OSX:
https://www.dropbox.com/s/aayv6oxc935z9 ... X.zip?dl=1

Windows 64Bit:
https://www.dropbox.com/s/5fu3jzyqjtymm ... 4.zip?dl=1

Windows 32Bit
https://www.dropbox.com/s/m2nl1tgu4bc60 ... 6.zip?dl=1

Inside the Audio & Midi options, where you select the audio device, there is now a little button called "Use More Threads".
This is a switch between the old processing method and the new test version.
Please let me know if you notice any difference, better or worse.

The processing is now divided, much like in the wire view.
So with one single chain with 50 plugins inside, you will not notice much difference, as the plugins all need to be processed in series anyway.
50 chains with 1 plugin in each should be divided evenly across the cores.
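
The two cases above can be illustrated with a toy model. This is not LiveProfessor's actual scheduler; it assumes every plugin costs one time unit, that a chain must run in series on a single core, and that chains are handed out greedily to the least-loaded core. The 12 cores match the Mac Pro mentioned earlier in the thread.

```python
# Toy model of dividing plugin chains across cores.
# ASSUMPTIONS (not LiveProfessor's real scheduler): every plugin costs
# one time unit, a chain runs in series on one core, and chains are
# assigned longest-first to whichever core currently has the least work.
import heapq


def time_per_buffer(chain_lengths, cores):
    """Return the work done by the busiest core for one audio buffer."""
    loads = [0] * cores  # min-heap of accumulated work per core
    heapq.heapify(loads)
    for length in sorted(chain_lengths, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + length)
    return max(loads)


if __name__ == "__main__":
    cores = 12
    print("1 chain x 50 plugins:", time_per_buffer([50], cores))      # 50
    print("50 chains x 1 plugin:", time_per_buffer([1] * 50, cores))  # 5
```

In this model the single long chain takes 50 units no matter how many cores there are, while 50 short chains finish in 5 units on 12 cores, ignoring the coordination overhead discussed earlier.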
Lytz1
Posts: 30
Joined: Mon Jul 11, 2016 3:06 pm

Re: Multi Core, Multi CPU

Postby Lytz1 » Sat Aug 19, 2017 9:00 pm

Very *very* good job you did here. On OSX. Cuts CPU from 90% red-clipping easily to around 40%. (+/- 5%)

Thanks,
tL.
nikolai
Posts: 1231
Joined: Fri Sep 04, 2009 1:05 pm
Location: Norway

Re: Multi Core, Multi CPU

Postby nikolai » Tue Aug 22, 2017 3:33 pm

Good to hear.
How is it performing at lower buffer sizes?
That is when it starts to get tricky.
PGangi
Posts: 1
Joined: Wed Aug 30, 2017 5:04 am

Re: Multi Core, Multi CPU

Postby PGangi » Wed Aug 30, 2017 5:09 am

Bringing this post back up. I was experiencing this on an 8-core Mac Pro, with my DSP hovering around 95%, then quickly dropping to 75%ish and then going back up to 85%. I downloaded the test version, and that helped bring the DSP down to the 45% area, but I still experienced the quick jumps in DSP usage. I am currently running at a buffer size of 128 samples, but would love to get it down to 32 or even 16 if I can, since I am using this in a live setting. Any insight as to why the DSP is jumping like that?
nikolai
Posts: 1231
Joined: Fri Sep 04, 2009 1:05 pm
Location: Norway

Re: Multi Core, Multi CPU

Postby nikolai » Wed Aug 30, 2017 9:41 am

PGangi wrote: Bringing this post back up. I was experiencing this on an 8 Core MacPro with my DSP hovering around 95% then quickly dropping to 75%ish and then back up to 85%. I downloaded the test version and that helped bring the DSP down to the 45% area but I still experienced the quick jumps in DSP usage. I am currently running at a buffer size of 128 samples, but would love to get it down to 32 or even 16 if I can since I am using this in a live setting. Any insight as to why the DSP is jumping like that?


Hi, what OS version are you running? Spikes in CPU usage like this are usually caused by other services running in the background. Also, if you are on USB, make sure that nothing else shares the USB bus.
Lytz1
Posts: 30
Joined: Mon Jul 11, 2016 3:06 pm

Re: Multi Core, Multi CPU

Postby Lytz1 » Fri Sep 01, 2017 1:15 pm

Quickly checked this. At low latency I also get fluctuations in CPU. Not really high spikes, but strong fluctuations.
However, I think this is kind of normal for other MT-optimized DAWs as well.

(I have to add that I can *still* play back projects at a buffer size of 128 that fall flat on their face at 1024 without that option turned on.)
