Today we take another look at the Ryzen platform and discuss further optimizations. Memory is always a question that comes up, and historically it hasn't had much impact for audio, where the performance bottleneck often ends up elsewhere in the setup.
Even with the previous generations of Ryzen, the advised optimal memory sat around 2666MHz (first generation) to 3200MHz (second generation). In our own testing, moving up from 2666MHz to 3200MHz on either generation didn't get us any favourable results in audio benchmarking, although it did help with video rendering workloads.
As such, I went with the previously suggested best memory when testing around launch. AMD has publicly outlined that the optimum speed is now 3733MHz with CAS 16 timings, as this puts the memory on a perfect 1:1 ratio with the internal Infinity Fabric bus.
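To put some numbers on that 1:1 claim: DDR memory transfers twice per clock, so the advertised 3733MT/s figure corresponds to a real memory clock of roughly 1866MHz, which is the speed the Infinity Fabric clock (FCLK) needs to match for a 1:1 ratio. A quick sketch of the arithmetic (the function name is purely illustrative):

```python
def fclk_for_ratio(ddr_rate_mts, ratio=1.0):
    """DDR transfers data twice per clock cycle, so the real memory
    clock (MCLK) is half the advertised MT/s figure. A 1:1 ratio
    means the Infinity Fabric clock (FCLK) runs at the same speed."""
    mclk = ddr_rate_mts / 2
    return mclk * ratio

print(fclk_for_ratio(3733))  # DDR4-3733 -> 1866.5 MHz FCLK
print(fclk_for_ratio(3200))  # DDR4-3200 -> 1600.0 MHz FCLK
```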
At this point 3733MHz RAM is still not overly common, and even more uncommon are the super low CAS 16 kits. I've currently got a 3733MHz kit being shipped to me (although only CAS 17) for further testing when it arrives; I'll keep that for the full retest in the coming week.
The results I have today are more of a comparison to show some basic gains at a slightly cheaper price point. Memory above 3600MHz carries a sizable price premium, and some of you may be wondering what gains can be achieved at which price points.
To do this testing I've got results generated using the 3200MHz RAM from the previous testing, 3600MHz RAM with CAS 18 (the standard kits we use here), and then the same 3600MHz RAM clocked up to 3733MHz, which in real terms ended up running at around 3725MHz in Windows.
Stock CPU 3200MHz RAM
Stock CPU 3600MHz RAM
Stock CPU 3725MHz RAM
The DAWBench DSP test gave us some small gains at the 64 buffer, with the improvement becoming much more apparent at larger buffer sizes; we're talking closer to 8% at the 512 buffer.
Stock CPU 3200MHz RAM
CPU Overload Point
Stock CPU 3600MHz RAM
CPU Overload Point
Stock CPU 3725MHz RAM
CPU Overload Point
What we can see here is a similar small gain moving from 3200MHz to 3600MHz, with the improvement being fairly marginal overall at this level.
Clocking the RAM up towards its advised 3733MHz in this instance produced more notable gains, with improvements in excess of 10% being seen at most buffer sizes. I'll also note that between the 3600MHz and 3725MHz results, the memory hole started to disappear as the CPU overload point moved upwards. I suspect, and remain hopeful, that when we see perfectly matched 3733MHz RAM with the CAS 16 timings AMD has advised, we'll finally see that performance hole disappear for good.
Given that 3600MHz RAM is only about 10% more costly than 3200MHz, that's a no-brainer of an upgrade, but the jump above that to 3733MHz can easily cost twice as much again, depending on the quantity and size of the RAM sticks you need.
I'd expect memory costs to continue to drop over the coming months as many firms ramp up 3733MHz production. Our own supplier was caught on the back foot, having already killed off their 3733MHz lines due to a lack of customer interest before the AMD launch; only now are they rapidly bringing back old lines and looking to flesh out their ranges to support the popular new platform.
In regards to overclocking, the advice AMD put forward early on appears to hold true with faster memory installed. In initial testing I overclocked the systems running 3200MHz memory and saw some solid gains. With the faster memory we see the same, if not better, gains, and we can also run the CPU cooler at stock clocks.
I did note that with an overclocked chip running 3600MHz RAM, the memory performance hole pretty much disappeared completely, but the system wasn't stable under heavy loads, and there is no way you would want to run that in a production environment.
Indeed, it seems that overclocking is more or less impossible when taking the memory over 3200MHz at this time, although given the performance boost we see with the faster RAM this isn’t a complaint. This might even improve in the future as the BIOSes get optimized and better high-speed memory continues to arrive, but it’s very much something to be aware of if buying a machine at this point in the lifecycle.
One thing the results have left me wondering, especially with the closing of the gap as we approach the 3733MHz optimum, is whether this has always been the case. 3733MHz RAM didn't exist when the first Ryzen generation arrived, and I'm not even sure it was widely available when Ryzen 2 launched. Even now it carries a rather hefty price premium, and I have to ponder whether this is simply a case of the memory market catching up to the Ryzen platform. Has Ryzen so far simply been ahead of its time?
The last bit of testing I'm going to carry out over the coming week is a retest incorporating the information we've picked up since the first look. It'll be running stock clocks with the 3733MHz RAM that is shipping to us now, using a non-hybrid version of the freshly expanded test setup.
In light of recent testing on the new AMD platform, a number of questions arose and I’m going to spend some time working through those over a couple of follow up articles.
The first one to tackle is Cubase, which I ended up pulling from the testing this time due to uncertainty over the results being returned. This was to be the first time using Cubase 10 in the benchmarks, a change I was keen to make in light of the adjustments to the engine to resolve the MMCSS issues that crept into the previous build due to low-level OS changes. We had been working with a tweaked registry workaround in C9.5, so we were rather keen to see what other gains were to be had in working with the latest iteration of the software.
Whilst I’m looking at this I also want to start off by tackling another question in the process, namely the one of ASIOGuard.
ASIOGuard is there to stop dropouts and overloading whilst recording; it's essentially another buffer designed to keep you safe from digital gremlins. It also means you're trading off some degree of performance overhead in order to achieve that extra stability.
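For a sense of what any extra buffering costs, the latency a buffer adds scales directly with its size at a given sample rate. A rough sketch, assuming a 48kHz sample rate:

```python
def buffer_latency_ms(buffer_samples, sample_rate_hz=48000):
    """One-way latency contributed by a buffer of the given size."""
    return buffer_samples / sample_rate_hz * 1000

for size in (64, 128, 256, 512):
    print(f"{size:>4} samples -> {buffer_latency_ms(size):.1f} ms")
```

ASIOGuard effectively adds pre-processing headroom on top of the ASIO buffer, which is where the stability-for-overhead trade comes from.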
Normally we test with ASIOGuard disabled, essentially because we're looking to test the hardware and not ASIOGuard itself. The first result I want to post is Cubase 10 with ASIOGuard Off/Low/Normal/High at various buffer settings.
Firstly, you'll note that the ASIOGuard-off setting performs far better, although this still isn't quite what I would have expected.
CPU Overload Point
AG On Low
CPU Overload Point
AG On Norm
CPU Overload Point
AG On High
CPU Overload Point
So, as shown above, ASIOGuard rather skews the performance for us in testing. You can note that, with a CPU overload point of roughly 90% maximum, the ASIOGuard-off setting gave us both the highest total polyphony and leveraged the most CPU in that test.
Now, with ASIOGuard on, this isn't the whole story. At each buffer setting the total performance was still there above the points where I drew the line; however, I couldn't cleanly push past the points I've indicated.
What do I mean by this?
With DAWBench testing, the way we take the metric is to simply keep adding more and more instances of whichever plugin it might be until the point where the audio overloads, and then we pull back slightly and take the measurement.
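In pseudocode terms (the function name and step size are illustrative, not part of DAWBench itself), the procedure looks something like this:

```python
def find_overload_point(plays_cleanly, step=10, max_instances=5000):
    """Sketch of the DAWBench approach: keep adding plugin/instrument
    instances until the audio engine overloads, then back off slightly
    and record that figure. `plays_cleanly(n)` stands in for the
    manual listening check performed at each step."""
    n = 0
    while n + step <= max_instances and plays_cleanly(n + step):
        n += step
    return n  # last instance count that still played without breakup

# toy stand-in: pretend the engine falls over above 560 instances
print(find_overload_point(lambda n: n <= 560))  # -> 560
```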
What I was seeing here was the audio breaking up and then not coming back until I reduced the active channels back to the point that I’ve recorded.
So, for example, the chart above shows ASIOGuard – Low overloading on the 512 buffer at 560 notes. If I keep adding more instances until the point it crackles and falls over, it’s more like 1100 poly with 95% CPU.
So, why are the results on paper looking so low?
Because whilst I can build up to 1100 instances, I then cannot start/stop playback cleanly without replicating the audio cut-out and recovery issue noted above.
So, say I take it to 1000-note polyphony and the audio is playing away fine. If I stop the project at this point, the audio will stop playing. If I then start Cubase playing again, it will immediately lock up, refusing to play audio until I reduce the count back to the point noted on the chart.
Essentially it’s behaving as if it’s overloading and choking, which doesn’t make for a smooth session when recording.
So, the next question that comes to mind, is this an inherent issue inside of the Cubase 10 engine?
Above we see the same set of ASIOGuard On/Off tests running on a 9900K at 4.9GHz all-core with 2666MHz RAM.
CPU Overload Point
AG On Low
CPU Overload Point
AG On Norm
CPU Overload Point
AG On High
CPU Overload Point
The first thing to note is that the ASIOGuard-off setting does offer the sort of result curve we would expect to see in this testing situation; with a minimum of 90% CPU being leveraged, rising quickly to 100%, it's performing as we would hope.
ASIOGuard itself is designed to sit as a safety buffer. At tighter settings you can see where it fails to keep up, as the CPU overloads at lower buffer settings; when working as intended, it trades off performance for stability at lower ASIO buffers, whilst allowing potentially a little more overhead to be extracted at more relaxed settings.
That aside, the results above should indicate why we prefer to run any Cubase testing with ASIOGuard disabled: the results are more balanced, as we're testing just the hardware and not ASIOGuard itself.
What was also apparent was that I wasn't seeing the "rubber banding" effect on the Intel system; the point where it fell over was pretty much its audio break-up point.
There was none of this being able to push it 200% past its highest start/stop result; in the Intel testing, the point where it started to crackle proved to be the same point where it failed the stop/start part of the test.
So, on the Intel setup, these were the respective results for the Cubase and Reaper testing, where the performance curves look as we'd expect in regards to the point of audio drop-out in each instance.
Intel 9900K Test
Cubase (AG Off)
CPU Overload Point
CPU Overload Point
AMD 3900X Test
Cubase (AG Off)
CPU Overload Point
CPU Overload Point
So, the reason I ultimately dropped Cubase from this round was the above. I just wasn't sure at the time why the results were skewed in the fashion they were, and I wanted to go with a test that I considered less aggressive in trying to optimize its own handling.
To note, I did a similar shoot-out on Ryzen 1 & 2 setups but wasn't able to close the gap in any meaningful way, although I was using an older build of the sequencer engine at the time. It should be noted that I'm seeing the memory hole tighten up slightly with RAM faster than the 3200MHz recommended on the last generation, which AMD's recommendations now eschew in favour of the newer 3733MHz kits they've noted as the optimum speed for working with Ryzen.
I can't help but wonder if this was always the case and it was simply masked by the prohibitively high price of 3600MHz+ RAM two years ago (3800MHz is still rather costly at the time of writing). Is this a case of Infinity Fabric making it to market a number of years before the supporting hardware was widely available to the general public?
At the moment the VI test is being updated, so I'll look to do a pure VI test in the coming days and will republish with updated results, as well as delving further into the memory handling side. I'll note that the performance curve I saw in testing this time mirrored my first run with the hybrid test build, but I'm also keen to see how it plays out in a full retest across the board.
To draw this article to a close: Cubase 10 on the Intel side appears to be behaving as expected, but the AMD handling has proven erratic enough for me to question whether it gives the hardware being examined a fair test. For Cubase users, and importantly for those of you working with large sample libraries, this raises questions about the suitability of AMD for handling your workload.
For the rest of us, it raises the question of whether it's Cubase or Reaper that is the exception to the rule here, and right now I'm not well placed to answer it. I understand there are further builds in the pipeline, so more testing will be carried out as and when they arrive.
*Please note, further testing is currently ongoing due to new test builds being made available and due to community feedback requesting a few different usage scenarios. Some interesting results are being seen, updates will be posted in detail over the coming week.*
The AMD Ryzen 3000 series has been keenly anticipated; in fact, the last time there was this much buzz around a release was quite possibly the first Ryzen generation a couple of years back. At that time we saw the platform pull AMD back into the limelight, and whilst the results were mixed across many usage scenarios, it was clear the platform certainly had the potential to live up to the hype.
In the interim we’ve of course had the 2000 series, which built upon the gains we’d already seen and continued to close the gap even further. AMD, of course, has continued tweaking the platform throughout this period, acknowledging some of the internal latency issues we also saw in the first round of testing and generally showing positive improvements along the way.
I'm coming to this with a short delay after launch due to a shortage of hardware over the first week. The mainstream reviewers looked to have got their hands on chips in the week prior to launch, so there is already a lot of coverage out there regarding the more common applications, and the hardware has performed well. The upside is that I had only put one of the chips across the bench before the launch-day AGESA BIOS update surfaced; after applying it I saw a small improvement in performance, so I ensured that all testing was done with it in place.
I’m going to be putting the 3600, 3700X & 3900X through their paces here and as normal I’ll be looking to max turbo clock them where I can. This ensures we don’t have a slowest core scenario tipping things into overloading earlier than necessary, but it does mean that on a stock setup you should allow for some variance.
The 3600, for a non-X series chip, did well, allowing us to take it to a steady 4.2GHz across all six of its dual-threaded cores. The 3700X allowed us to max out all eight of its cores at 4.4GHz, and the 3900X managed 4.3GHz whilst sitting around 70 degrees even under maximum load. I managed 4.4GHz on the 3900X, but not without seeing a huge increase in temperatures alongside it.
Whilst the promotional headlines have been focused on the 7nm die shrink, it should be noted that the entire architecture has received an overhaul in the process with AMD noting a sizable 15% increase in IPC performance. Other notable improvements include further tuning to the internal memory latencies and a sizable increase to the L3 cache, both of which should be beneficial to our performance scores.
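As a rough sanity check on what that IPC figure means: per-core throughput scales approximately with IPC times clock speed, so a 15% IPC uplift at an unchanged clock is roughly 15% more work per core, before any clock differences are factored in. A minimal sketch, assuming that simple multiplicative model:

```python
def per_core_speedup(ipc_gain, new_clock_ghz, old_clock_ghz):
    """Approximate per-core throughput change from an IPC uplift
    combined with a clock-speed change (simple IPC x clock model)."""
    return (1 + ipc_gain) * (new_clock_ghz / old_clock_ghz) - 1

# 15% IPC gain at the same clock -> roughly 15% per-core uplift
print(f"{per_core_speedup(0.15, 4.3, 4.3):.0%}")
```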
For the Ryzen testing I'm using the Asus TUF X570 board, which was the first of the new range to land in the office. It has been fully updated with the latest BIOS prior to testing and is running 3200MHz Corsair LPX memory.
With news over the past 12 months of security concerns and various performance affecting patches that have since followed, I’ve set up a new test bench where the Windows 10 build being used is the current 1903 with all drivers being freshly installed. Also given all these changes I’ve benched a number of the Intel chips in this round of testing, with both of the Z390 and X299 boards being fully updated Asus Prime boards.
On top of that reinstall, and having exceeded the benchmarking headroom in the last round with the largest available chips, I've made a few modifications to the standard DAWBench tests this time, as I suspected I'd run the risk of easily maxing out the tests in their default forms.
With the DAWBench DSP test, the SGA1566 plugin now has all instances set to the high-performance setting, running at 24/48 and this gives us plenty of headroom for our needs.
The Kontakt-based DAWBench VI test, on the other hand, I would expect the CPUs we have here to outgrow quite quickly.
I’ve attempted to soak up some of the available performance we have on offer by applying two instances of the SGA1566 plugin in high-performance mode to each of the 16 hidden sine tracks, which on my 9900K testbed took up about 50% of the available performance. This should give us a reasonable baseline to start from and still have the ability to check for any performance affected by latency.
Since the Native Instruments KA6 interface has had a generational jump and the older model is now discontinued, I've retired the old testing interface. I've switched to an RME Babyface for this round of testing and shall be sticking with that going forward.
Another change is that both sets of tests are being reported using Reaper this time. I completed the testing initially using both Reaper and Cubase, but upon looking over the results I saw an irregularity that sent me back to retest again using Reaper for both sets, more of which I shall cover in the results section.
Having made all those changes, please be aware that these results and prior results are in no way comparable. I've changed the following and have retested all CPUs listed in the results.
OS version. BIOS versions. Reaper version. Cubase removed. Different audio interface. Modified DAWBench versions.
The one benchmark that can still be used to loosely compare is CPU-Z and that is where we shall start.
I’ve used the inbuilt 9900K metric as a baseline comparison. I’ll note that the result they have recorded is within 10% of my last round of testing, so it seems quite fitting to use it here.
So first up DAWBench wise is the classic DSP test, running the SGA1566 variant as covered up top.
This test sees us stacking up plugin instances on a thread-by-thread basis until the whole CPU hits breaking point. It's an impressive result from the sub-£200 AMD, and the top-end 3900X is trading blows with the £1000+ Intel chips. This is our raw performance test and, as hoped, the results are impressive.
The second chart we have is the DAWBench VI Kontakt test, which I imagine is going to be the interesting one for most readers following on from prior write-ups.
And interesting it certainly was. The cross-core latency we've seen in earlier models has gone on the 3600, and we were hitting 95% at the 64 buffer on the 3700X, with 100% leveraged at the 128 buffer; both are certainly welcome sights.
The 3900X, with its new dual-chiplet design, was the only model not to come away with a clean sheet, although given we have an extra die section to deal with, this might not prove a huge surprise. I would expect it to mature over future iterations in much the same fashion as the other chips below it in the range have done, and no doubt AMD will fine-tune this new design further.
With the first round of testing, as noted up top, I used Cubase 10 initially for the DAWBench VI testing, and everything looked great right up until the final test. On the 3900X I saw a 30% performance drop at the 64 buffer, with only 70% of the CPU being used at maximum load; the 128 buffer gave me 80% to 90%, with the full CPU being leveraged at the 256 buffer and above, mirroring the low-buffer latency we've seen in previous generations.
It's at this point that I wondered if we'd see any difference with a sequencer switch. It hadn't helped in previous testing, but C10 had a few major changes under the hood over earlier versions, and I was keen to see if any of those had an impact here. I rebuilt the new test in Reaper and took another look, with Reaper offering differing results: I still saw a performance hole at the 64 and 128 buffers, but this time it was more like 80% (64 buffer) and 90% (128 buffer) of the CPU being leveraged before it started to top out.
So, it's interesting to note the variance between sequencers and how efficiently Kontakt appears to run within each of them. Upon seeing this I completely re-benched in Reaper, and those are the results presented, but do be aware of the sequencer variance that appears to be in play.
It's also worth noting that some sequencers may not be able to address the full 32 threads efficiently, even if they can currently see them. I can foresee a lot of optimization being required by various DAW coders in order to ensure their software can keep up with the new hardware that is currently emerging.
So, overall thoughts are of being largely impressed at each given price point. I don't think I'd drop as low as the 3600 personally, but the 3700X has a strong claim as a superb all-rounder at the entry level, and both of these chips seem to have largely shaken off any lingering concerns about internal latency handling.
The 3900X still exhibits the noted performance latency, although it seems to vary between applications, and we don't see it occurring with either the Reaper or Cubase test on the Intel side. I wouldn't normally be happy seeing anything drop out at 70% or 80% load, but there is certainly an argument that it still offers reasonable value, as even then it exceeds the 9900K, which currently sits at around the same price.
Certainly, anyone working above a 128 buffer has little to no concern there as it appears to recover in full by the 256 buffer.
So there we have it, a great first outing for AMD’s 7nm design. I’ve seen comments aplenty about the lack of overclocking capabilities and yes we’ve come up short of the all core clock that I was aiming for in two of the tests, but I do kind of expect that from any first generation chip after a die shrink. I’ve certainly no doubt that we’ll have refinements over the next couple of years that will successfully extract every last bit of performance from the Zen 2 platform.
My only reservation at this time is compatibility with third-party hardware, mainly audio interfaces. We saw some compatibility issues on Ryzen 1 & 2 with some PCIe sound cards and some USB-based interfaces. ASMedia have a bit of a poor reputation on Intel boards where they've provided the third-party USB3 solution, as audio devices don't tend to play too well with them. We saw similar incidents with the implementation packaged on the generation-one Ryzen boards, although thankfully it was less common on Ryzen gen 2.
Ryzen Gen 3 has an AMD-designed USB implementation, albeit built around an ASMedia package, and at this point I've little idea how it will hold up with all the devices we have available. I was testing with a Babyface Pro this time around, so that's validated, but I would certainly check user groups for any compatibility issues with your key devices prior to buying.
Looking forward, unsurprisingly, details of Intel's next refresh have started to leak across various sites. The Comet Lake refresh has a 10-core chip and various price reductions being dangled via those leaks, which obviously look to challenge this Ryzen release when they arrive.
Whilst some people might already be rolling their eyes at the timing of these leaks, those who remember back to the last time we were entrenched in some good ole CPU wars will recognise that this is pretty much business as usual, and I can see price wars on the horizon as AMD snatches more and more market share.
But that’s all still to come in the future. Right now, for the time being, the third iteration AMD Ryzen series is easily their most compelling offering yet.
Over the past year or so the Intel i9 9900K has proven itself to be a superb flagship CPU amongst our mid-range systems. With the chip having some of the best IPC (instruction per clock-cycle) results that we’ve seen on any platform and the ability to leverage up to 16 threads of that performance, it has proven the perfect fit for many studio upgrades already this year.
Indeed you would have to spend a sizable chunk of cash more in the enthusiast range above it, in order to find a chip that can stand up to this powerhouse. Needless to say, it’s proven to be one of the most popular CPUs we’ve ever supplied when it comes to audio workstations.
When I took a look at the chip at launch, we realised that cramming this many cores into a CPU of this size would raise questions about heat dissipation, and that we would have to consider the cooling carefully. This set us out to construct a new solution for our ultimate 9900K-based system, and the result is the TW390.
So what sets this apart from our other systems?
A few years back I took a look at a number of cases that featured bottom-to-top cooling solutions. Heat rises, and given the majority of our noise comes from fans having to force air through the case, the thought process here was one of letting nature take its course.
At the time the testing went well, but the available fans couldn’t prove themselves capable and quiet enough for us to introduce them into our system range. Super large fans were still relatively unheard of at that time and of the options available to us back then, none of them had been specifically developed with quietness in mind.
Fast forward to the present and this design idea has been revisited by a few more case manufacturers; in the meantime, more and more firms have begun to concentrate on larger-format fans and specifically on making them more sonically pleasing.
The perfect time to take another look perhaps?
Enter Cooler Master in the shape of the MasterCase SL600M.
The case features the required bottom to top airflow with a set of 200mm mounting points at the bottom of the case. We set out to test all the fans that would fit and see which ones sonically would stand up to the task in hand.
We found a winner and they were LED-based. It was at this point that we decided that we’d run with this new case design and introduce our first low noise, windowed, illuminated and overclocked studio solution.
The cooling allows us to run the 9900K at its dual-core turbo, but we have more than enough room to overclock all cores to that same turbo speed of 4.9GHz. In real terms, this means no changing of CPU clock speeds on the fly and that ensures that the power is always there when recording. No concerns here about potentially being caught short by your system power saving as the performance is always available when you need it.
We've chosen a board ideal for the task, as the model here features full Asus AuraSync functionality. This allows you to have full control of all the connected addressable LEDs both inside and outside of the machine. We've set it up with 4 strips around the case and those colour-changing fans, but you can also hook up a keyboard, mouse, mouse mats, monitors and anything else that features similar compatible RGB LEDs, including the Philips Hue home system, and have your whole room sync with the system!
Those fans, however, are not just about the light show. With all this power to cool, you would expect them to be working hard and generating a sizable amount of background noise. This system, however, is tuned to run the fans at under 700RPM even at 100% CPU load, and in conjunction with the BeQuiet! cooler inside the machine, it remains well below our demo room noise floor of 28dB(A) at less than a metre away!
This is a very, very quiet machine… at least sonically. To see more check out the video showcase below.
This system is available now with your choice of storage and memory options allowing up to 64GB of RAM to be included.
News that will possibly only affect a small number of users, but ultimately one that will make their composing life better. The FLS limit in Windows has become more and more apparent in recent years as computer performance has improved and the total number of plugins found within any given project has scaled too.
The FLS (Fiber Local Storage) slot limit has always been in place and caps the total number of unique plugins that can run in a process. Multiple copies of the same plugin can share code and run efficiently, so the previous limit of 128 slots was perceived to be high enough for the average user.
The problem we've seen over the years is that some plugins don't use their resources efficiently and can consume multiple slots each. Some reports have shown software eating up to 7 or 8 slots per unique plugin, so it should be apparent that, for users working with large templates and intensive projects, the ceiling was ever looming.
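The arithmetic shows why the old ceiling loomed for template-heavy users: at the reported worst case of 8 slots per plugin, 128 slots only stretches to 16 unique plugins. A quick sketch (the 4000 figure is the rough headroom quoted for the new build):

```python
OLD_FLS_LIMIT = 128   # historical per-process slot ceiling
NEW_FLS_LIMIT = 4000  # rough figure quoted for the new build

def max_unique_plugins(slots_per_plugin, limit):
    """Worst-case count of unique plugins before slots run out."""
    return limit // slots_per_plugin

print(max_unique_plugins(1, OLD_FLS_LIMIT))  # well-behaved plugins: 128
print(max_unique_plugins(8, OLD_FLS_LIMIT))  # worst case reported: 16
print(max_unique_plugins(8, NEW_FLS_LIMIT))  # new limit, worst case: 500
```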
Today's Insiders announcement sees a new FLS limit included in the download, and should public testing prove positive, we would expect to see it rolled out to everyone later in the year.
The new limit raises the headroom to well over 4000 slots, so it should keep even the most demanding user going for a good few years to come!
As PCs get more powerful, musicians have created increasingly complex projects with more tracks, more instruments, and deeper effects chains. As a result, some of those musicians were running up against a FLS (Fiber Local Storage) slot allocation ceiling that prevented them from loading into their DAWs (Digital Audio Workstations) as many unique plugins as they’d like. This build greatly raises that per-process FLS slot allocation ceiling, allowing loading potentially thousands of unique plugins. Beyond musicians, this change will positively impact any application that dynamically loads hundreds or thousands of unique DLLs that have statically-linked Visual C++ runtimes, or otherwise allocate FLS slots.
In what’s become a fairly regular feature in the calendar these days, we see the yearly update to Cubase making its appearance once more as we hurtle towards the final month of the year.
This time around sees us receive a full version update as we move on to Cubase 10. With the full version releases we expect to see plenty of new features creep in whilst the smaller updates and fixes tend to be the focus on the .5 release, so what exactly do we have in store this time around?
Working through the “what’s new” list for interesting updates and the first one that stands out is a revised channel strip promising to extend the functionality and usability of its included modules, with new metering elements offering direct visual feedback for each of those modules.
MixConsole snapshots now allow you to set up alternative mixes for your project and A/B compare the results within seconds. By letting you save your current mix into a tab within MixConsole, you can then instantly switch between them at any time, adding notes to each snapshot as you go. You can even mix and match by taking part of a mix, such as the EQ settings from one snapshot, and applying it to another.
Next up we see a dedicated audio alignment tool being introduced, which no doubt is the sort of functionality that is going to be very well received by many users out there.
Variaudio gets its own overhaul with improved workflow and even more creative tools. Smart controls aim to speed up your workflow by allowing direct control of all parameters at each segment. Promising micro pitch level adjustments for smooth drifts and transitions and the capability to push it to the extreme in order to achieve popular extreme pitch effects, this is another tool update that is no doubt going to make a lot of users happy.
Groove Agent gets an update to SE5. Alongside that on the plugin side is a redesigned and updated REVerence, along with a collection of Vintage Verb settings and a completely new "Distroyer" processor capable of adding subtle warmth or utterly destroying your audio for those extreme effects.
But GUI changes and new tools, whilst all nice to have, are not going to be the highlight for a lot of users this time around. For the power users amongst us, the biggest bugbear for the last few years has been the 14 thread limit we’ve been seeing after last year’s Creators Update moved the goalposts for Cubase and how it handles multiple cores, which certainly left a lot of users frustrated at lost performance overhead. Well, the good news is that we’re being promised “significant improvement” this time around, and we’ve heard that they’ve been working on this for a while behind the scenes, so this alone could prove an extremely worthwhile upgrade for anyone running 8 or more physical cores.
The new Latency Monitor in the MixConsole promises to give you enhanced control whilst you’re monitoring and recording by displaying both the sum of the latencies and the individual latency of each plug-in in the effects chain. This should make it easier to track down any painful lag when working with effects chains in a live situation and should prove to be of benefit to many users.
Side-chaining, which for a long time was a bit of a weak spot for Cubase, gets another overhaul this time around, with further refinements to the process. The new simplified method allows you to create the desired routing in just a few clicks: activate side-chaining in your FX plug-in, select the source from the track list and away you go.
Other interesting technical updates include adding support for 32-bit integer and 64-bit float audio formats, AAF import and export options, along with additional MPE support for those users making use of capable controllers.
Productivity wise there is a host of improved “editing to picture” features for those doing sound for film work as well as a full “Virtual Reality production suite” featuring a whole host of tools specifically designed for producing VR content all the way from the recording to mastering stages.
All in all, an interesting set of updates and some much-needed fixes carried out behind the scenes. We’re certainly looking forward to road testing this edition in the near future.
Coffee Lake has been with us now for just over a year and it’s been a rather turbulent period for Intel. AMD’s continued gains over the last 12 – 18 months have marked a change in the marketplace, and the first generation Coffee Lake launch perhaps felt a little rushed last time around, especially as Intel attempted to respond to the opening volley in the now ongoing CPU wars.
This time around I find myself looking over the selection of chips in front of me, and the key question on my mind right now is “have they managed to extract the platform’s potential this time around?”
So, I’ve got 3 different models here all new to the Intel mid-range:
1. The new flagship in the form of the 8 core + Hyper-Threading i9 9900K, running at 3.6GHz with a turbo clock of 5GHz out of its (oddly shaped) box.
Chip is being run at all core 5GHz
2. The i7 9700K featuring 8 cores but no Hyper-Threading. The chip is clocked at 3.6GHz with a 4.9GHz turbo out of its rather more normally shaped box.
Chip is being run at all core 4.9GHz
3. Lastly, the i5 9600K in another boring box. 6 cores, no Hyper-Threading and 3.7GHz with 4.6GHz on the turbo.
Chip is being run at all core 4.6GHz
So, we see some firsts here and some repositioning in the range. The i9’s go mainstream and, in this case, we’re seeing a few key differences. The big one is that it’s the first time we’ve seen Intel put out an 8 core mainstream chip. Given we only got our first mainstream 6 core back on the last range refresh, it’s good to see them being pushed into cramming more value onto the die this time around.
The i9’s are also promising us solder under the heat-spreader this time around, rather than the paste found in models elsewhere in the range, so this should in theory help with overclocking for those wishing to push them a bit more.
The i7 & i5 models this time around are limited to 8 cores and 6 cores respectively, with no Hyper-Threading. Whilst it helps to differentiate between the respective ranges, it is going to come as a bit of a shock to anyone used to the current i5/i7 naming convention. On first thought, we wondered if this meant that we could expect the new 8 core with no HT to be outperformed by the older 6 core + HT models, although this could very well come down to specific workloads.
Hyper-Threading by its very nature is based around stealing unused clock cycles to get more work done, so if your workload is already thrashing the CPU, then having Hyper-Threading isn’t really going to have much of an impact. In previous testing I’ve tended to note anywhere between 20% and 60% gains with it turned on depending upon the software in use, so it could be argued that having an extra 2 real cores could equate to somewhere in the region of 4 or even more lost Hyper-Threads (once again, workload permitting). We’ve also got to consider clock and IPC gains here, so playing the 9600K & 9700K off against their predecessors is certainly going to be interesting.
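To make that trade-off concrete, here's a rough back-of-envelope sketch (my own illustrative model, not measured data) of how many "effective" cores a Hyper-Threaded chip offers at the 20%-60% gain figures mentioned above, compared against a chip with more real cores but no HT:

```python
# Back-of-envelope model: if Hyper-Threading adds a fractional gain per
# physical core, how does an older 6C/12T part compare to a new 8C/8T one?

def ht_equivalent_cores(physical_cores: int, ht_gain: float) -> float:
    """Effective core count, given a fractional Hyper-Threading gain (0.0-0.6)."""
    return physical_cores * (1.0 + ht_gain)

for gain in (0.2, 0.4, 0.6):
    old = ht_equivalent_cores(6, gain)   # e.g. an 8700K-style 6C/12T chip
    new = ht_equivalent_cores(8, 0.0)    # e.g. a 9700K-style 8C/8T chip
    print(f"HT gain {gain:.0%}: 6C+HT ~ {old:.1f} effective cores vs 8C = {new:.1f}")
```

At a 20% gain the 8 real cores win comfortably, but once the workload extracts 40% or more from Hyper-Threading, the older 6C/12T design pulls level or ahead, which lines up with the mixed results discussed below.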
So let’s get down to it.
All the standard tests to start with and nothing unusual going on so far. Whilst they are all clocked fairly close together as far as the cores go, you can note differing amounts of L3 cache on each of the chips, which is no doubt going to help a little in both the single and multi-core benchmarks.
So, on with the DAWBench SGA DSP test, and we can see the 3 new chips in yellow above. Starting with the 9600K, the obvious comparison here is against its predecessor and frankly, it’s a little underwhelming, with somewhere between a 1% and 10% increase depending upon the buffer in play, scaling upwards as the buffer size is increased.
The 9700K is next, and we get to compare its new configuration of 8 true cores and no Hyper-Threading, which also comes off poorly here when compared against the older 8700K, with the results showing a 20% – 40% drop off against Intel’s own previous generation class leader.
The loss of Hyper-Threading really looks to have impacted the new generation here, at least under the DAWBench classic test. I do get the thought process behind the chip design itself, as the largest new segment in recent years that seems to have captured the marketing teams’ imagination has been the rise of content creators who live stream. True cores are far more beneficial for that sort of content generation, especially for gamers who wish to live stream at the same time, so I fully understand this design choice; in fact, it could be argued that this style of chip would be preferable for anyone working live. But for anyone looking for raw performance in the studio, it’s all a bit disappointing so far.
The flagship here, however, is no longer the i7 model but rather the i9 9900K, and it’s here at least where things make rather more sense. It’s the first time that we’ve seen an 8 core in Intel’s mid-range line up, and looking at the result above, it looks to have settled itself just above the 7820X from the Intel Enthusiast range (X299); to be fair, on paper at least, it makes perfect sense that it would replace that chip.
It’s the same core count, a few generations newer and clocked higher, so it was always going to be a contender. What it does mean, however, is that once again we see one of Intel’s mid-range chips start to cannibalize their own enthusiast class of chips. In fact, we’ve now reached the point where the lower end i7 enthusiast class has had a dearth of releases over the last 15 months and has largely been killed off, where in the same period AMD has successfully taken a sizable bite out of that part of the market space too, and we see them continue to take advantage of Intel’s lack of new competing models.
Indeed, in the chart here, sat above it, we see the large core count AMDs as well as the older generation i9’s outlining exactly what this test is good at, which is small files being spread efficiently over all the available processing space. Honestly, the results here once again don’t really give us any surprises as to how and where the chips are being positioned in the range.
Switching over to the DAWBench VI Kontakt based test, we see a more interesting picture as the higher single core clocks appear to give us a welcome boost here. If there’s one thing it really outlines for us, it’s that the Kontakt handling looks to benefit from higher IPC figures all around.
Having the dedicated cores looks to help when working at tighter ASIO buffer settings on both the 9600K and 9700K, although we can see that this benefit disappears on the 9700K once we slacken that setting off to around the 256 buffer. It appears at this point that the Hyper-Threading on the older 8700K finally gets a bit of room to breathe and flex its stuff once you open up the buffer far enough, and this in itself is interesting information.
Thinking about this from a live point of view, where you’re aiming for the tightest RTL score and quite likely to be making use of Rompler style libraries, this does outline that going with these new all-real-core chips might well pay off for you in this situation. However, if you’re working in the studio, the loss of performance at the larger buffer settings, at least in comparison with the older generation, might once again prove a little perplexing.
Taking a look at the i9 9900K by comparison, it starts to make more sense again, doing rather a good job of once more making the older 7820X chip irrelevant. There is less challenge up this end of the chart from the red team, largely due to the lack of solid benchmarks obtained in the last round, which you can catch up on if you hit the link.
What this means is that the options here seem to be becoming even more divided. It’s been pointed out that the higher latency jobs the Zen chips were excelling at are still applicable to all sorts of media editors, and with each additional chip it becomes ever clearer that these results remain very scenario dependent: Kontakt’s way of working tends to favour highly clocked cores and larger IPC figures over the workload being spread across more numerous but slower cores.
Before I round up I just want to throw out a couple of additional charts. I didn’t get a chance to do it with all of them, but I did record the i9 9900K at both stock and at the all core overclock, largely so you can see the difference it can make by setting it to the all core turbo.
Depending on the test and buffer size it’s up to around 8% in these benchmarks, although this can grow as you use more complex chains of processing in your projects. A chip is only really as strong as its weakest core, as once you max out any given core you begin to run the risk of audio artefacts creeping in.
I mention this specifically with the i9 9900K as a lot of premium boards have been shipping with 5GHz profiles now for a few years and it’s rather easy to hit the results I’m showing above with a halfway decent cooler solution. Above that, you’ll probably want to move to a water cooling solution, with 5.2GHz looking to be the target for anyone wanting to really drive it.
I’ll also note that the i7 9700K was running comfortably just below 80 degrees by the time I all-core turbo’d it, whereas the i5 9600K was sitting nicely around the 60 degrees mark even with Prime95 absolutely thrashing it, so I reckon anyone wanting true cores only might have quite a chunk of headroom there to play with if they want to tinker.
So, overall, what are my final thoughts?
The i5 9600K and i7 9700K both feel like a step backwards for our part of the market to a degree. Sure, they have some strengths and I’ll come back to the example of low latency machines for live use again being a prospective user base, but their value proposition in comparison to other chips already out there is where it really falls over in the studio.
Having a sideways move in overall performance is a little disappointing, but we’re seeing an initial street price on the i5 9600K of around £350 against the i5 8600K’s historical showing of around £250. Similarly, the 8700K was around £350 for most of its lifecycle and the 9700K sits at £499 at launch, so we’re seeing price increases across those ranges, although I suspect as supply catches up with the initial demand we may find some price realignment over the coming months, and I wouldn’t be all that surprised to see the new chips reflect older price points once the market stabilises. This is a fairly common occurrence with any new chip release, but admittedly it leaves me feeling rather underwhelmed given all I’ve discussed already from a performance point of view.
The i9 9900K, on the other hand, replaces the 7820X, which spent most of its lifecycle between £400 – £500 in the UK; the i9 9900K has landed at £599. Even assuming it drifts down over the coming months, we’re still essentially looking at a £100 mark up over the older model.
The DAWBench classic test here shows us mixed gains depending upon the workload, and it’s up against the AMDs, which still manage to outperform it within this test. By contrast, the DAWBench VI test flips the picture, with the 9900K outperforming the other chips on the chart, keeping in mind the Threadripper results previously.
So, does even the i9 9900K make sense? Well, yes, it’s the one that really does here. With the change to the Z390 platform, we see a cost saving over the older X299 platform, complete with a more advanced feature set. With the cost differences between boards often totalling and surpassing the £100 mark, the overall cost of going with an i9 9900K over an i7 7820X looks to come out in the i9’s favour, and that’s before considering the performance gains it offers.
There’s additional good news on the other previous sticking point with the Z390 platform for some users: its restricted memory capability, as the four slots could only handle a maximum of 64GB. We’ve seen an announcement recently, however, that double stacked DIMMs supporting this platform will start being offered over the coming months, so hopefully it shouldn’t be all that long until these boards can handle 128GB as well.
Overall, this feels like Intel’s real response to AMD’s advances last year, although given the swift execution and release of the second generation Zen chips, perhaps they are still a tad on the back foot here. It’s kinda where Coffee Lake should have been last time around, and it’s of course good to see more power in the mid-range. It does leave me questioning where exactly this leaves the enthusiast class, as anything less than an i9 on that platform is going to prove poor value at this point, and given the age of that platform I can’t help but think the next Intel enthusiast platform can’t be all that far off now.
It feels like this is the repositioning that Intel needed in order to put its own range back into some context, but it may not prove to be the change that everyone was looking for, at least in our small corner of the market.
At the very least, the i9 9900K emerges as a rather strong contender for us audio users, and I suspect any other i9 based refresh over the coming months is going to make this all make a whole load more sense when the dust settles. But with AMD already promising updates to its own platform, and tweaks to their memory balancing announced for the next few weeks, Intel may have to work even harder over the coming months.
We’ve always found that Steinberg’s UR interface series have proven to be solid options at their respective price points, normally proving to be extremely capable all-rounders with great pre-amps & signal path, as well as a general tank-like construction.
Production packs have always been popular with users who are just starting out and with this one we see what looks to be a complete solution for anyone wishing to get started with writing and recording music at home.
In the box you can expect to find the standard UR22 recording pack:
Studio Condenser Microphone ST-M01
Studio headphones ST-H01
Cubasis LE download info (this is the iOS version of Cubase LE, so users can also use the interface/mic/headphones on an iPad).
But that’s not all! The bonus to be found within this limited edition pack is that you also receive a full copy of Cubase Artist edition 9.5 (including Elicencer) as well as a full copy of Wavelab Elements 9.5.
With the package expected to come in around £342 this pretty much equates to a free copy of Cubase Artist which in itself has a value of over £200, making this a superb deal for anyone wishing to invest in a complete solution to get going.
The package is available for pre-order now and we are expecting to be shipping them over the next few days.
Please be aware that this offer is extremely limited and won’t be around for long!
Native Instruments have really gone all out this year, with a huge range refresh taking place and a raft of new products fresh out of Berlin.
The flagship collection package reaches its 12th revision and adds even more instruments and sound libraries to the collection.
Amongst the headline updates here is a follow up to the now legendary “Massive” synth, with Massive X making it into the pack. The original Massive had a huge impact upon the softsynth world as well as EDM and dance music in general, so whilst current details are still scarce, the promise of a whole new engine and a modular workflow is certainly going to have people eagerly awaiting the full reveal in the coming months.
We also see the release of Kontakt 6 as another heavy hitter, along with various sound libraries like Session Strings 2 and a host of extra Kontakt expansion packs, such as the Middle East Discovery series, Analog Dreams, Ethereal Earth and Hybrid Keys.
The new TRK-01 “Kick and Bass” instrument is in there too, as well as creative effects like Phasis, Choral and Flair making up a sizable number of new and improved tools on this revision release.
For a more complete list of all that can be found in the new Komplete 12 bundle, click the image below to expand.
Komplete Kontrol – A Series
Taking the already popular Kontrol keyboard range and releasing a cut-down version makes so much sense to us that we wonder how it hasn’t happened sooner. They promise the same great design experience we’ve come to expect from the “S” series models, whilst giving you the option to pick one up without all the bells and whistles.
That’s not to say they are featureless: with plenty of pre-mapped controls for all of your most used N.I. software libraries, they’ll certainly help to improve the workflow of anyone who already makes good use of packages like Kontakt.
The keyboards will be shipping with a solid collection of plug-ins too, offering versions of Komplete Instruments and Effects as well as the Maschine Essentials libraries. They will be available in three different sizes: 25, 49 and 61 note editions.
Komplete Kontrol – S88 MKII
Not to be left out, the “S” series gets an update too, although just the largest S88 model at this time. The rest of the range has already had their MK2 editions out for some time, so really we’re just seeing the S88 being brought into line with the rest of the range, but nonetheless it’s always good to see an update.
Some of the smart play features feel like they might be a little unneeded for someone willing to invest this much in a good playing keyboard, but it nonetheless continues to feature one of the best in class fully weighted Fatar keybeds, and offering the same great software packages as the A-series mentioned above makes it a serious contender for anyone considering a keyboard upgrade in the future.
Maschine Mikro Mk3
Another model refresh and the big news here is the new touch strip, designed to add even more hands-on control. Not such a major overhaul here, but for anyone who’s been thinking about picking up a Mikro anyway, then more features to play with are always great I’m sure we can all agree!
Traktor Pro and Traktor S2 & S4 Mk3’s.
Traktor 3 is due to ship with the two new controller revisions and offers up a number of improvements under the hood. The audio engine now uses Elastique 3 time stretching for noticeable audio improvements whilst playing.
The interface GUI has been redesigned to offer additional clarity in use, and a few other new knobs and buttons have been added, most notably the one knob “Mixer FX” control that can be assigned to each channel independently.
The S2 has had some hands-on optimizations in the shape of up-sized jog wheels and increased touch sensitivity, giving users even more control when mixing and scratching.
The big change here, however, is in the S4 model, which sees the introduction of Haptic Drive which offers us high-torque, motorized jog wheels that provide performers with haptic feedback in three modes: Jog Mode, Turntable Mode, and Beatgrid Adjust Mode.
DJs can now feel cue points and loops when scrolling through tracks, and enable Turntable Mode for natural-feeling beatmatching while nudging and stalling the jog wheels. Interfacing software and hardware within Haptic Drive™ technology means that its functionality can be expanded, and will grow over the course of future updates, giving DJs even more ways to interact with their music.
The S4 places vital performance information on the hardware itself, keeping everything DJs need to know front and centre in the booth. High-resolution displays on each deck show a waveform strip, track title, loop length and activation, key, and BPM, as well as Stem and Remix Deck components when performing with Stems and samples. Further visual feedback is provided by RGB light rings surrounding each jog wheel, which visualize deck selection, tempo, and track-end warnings.
It’s not all hardware and software today though, as a number of online services have had updates too.
Sounds.com is due to exit beta and is a service offering an ever-growing collection of over 700,000 high-quality loops and samples from over 250 providers for a monthly subscription fee.
We’re advised that this is the culmination of months of continuous expansion and improvement based on user feedback, with the aim to provide musicians of all levels an easy and affordable way to access an ever-growing library of high-quality loops and samples with plenty more to be added going forward.
Also receiving a redesign is Metapop, NI’s online hub for music makers to share, connect, and collaborate with like-minded creators. The site relaunch brings a newly redesigned Metapop with a new user experience focused on making it easier to connect, share, and collaborate with fellow music makers.
After listening to the feedback from its users, NI has added new features like groups and chats, helping creators to find their community, connect with others, and get feedback on their music. Additionally, a new mobile version of the site lets users upload, connect, and listen to newly released samples – whether at home or on-the-go.
Lastly “Loop Loft” who joined the N.I. stable back in January this year is also receiving a site redesign. The Loop Loft itself continues to provide an ever-growing source of inspirational content to help producers learn and grow, including exclusive tutorials, interviews, and tips from leading industry professionals.
I’m the first to admit that I’m a little late to the table with this write-up. The original 2990WX sample arrived whilst I was on leave and was quickly placed into a video rig and sent out for review, meaning I’ve had to locate another one at a later date. Along with that, I’m honestly a little overwhelmed with how much interest this £1700 workstation grade CPU has generated with the public in recent weeks, as I really didn’t expect this level of interest in a chip at this sort of price point.
I’ve also approached this with a little trepidation due to earlier testing. As someone noted over on the GS forum, the 2990WX might not prove all that interesting for audio due to the design layout of the cores and the limitations we’ve seen previously with memory addressing inside of studio-based systems. They were certainly right there, as the first generation failed to blow me away, and I still have a number of reservations about the underlying design of this technology that could potentially be amplified by this new release. During the initial testing of the 2990WX this time around, the 1950X replacement also arrived with us in the shape of the 2950X, and given some of the results of the 2990WX I thought throwing it into the mix might prove a handy comparison.
Why bring all this up at all? Well, because everything I discussed back then is still completely relevant. In fact, I’m going to go as far as to suggest that anyone who doesn’t understand what I’m referring to at this point should head over to last year’s 1950X coverage and bring themselves up to speed before venturing any further.
Back again? Up to speed?
Then I shall begin.
The 2990WX is the new flagship within the AMD consumer range and features a 32 core / 64 thread design. It has a base clock of 3GHz with a max twin core turbo of 4.2GHz and an advised power draw of 250W TDP.
I won’t split hairs. It’s a beast… something I’m sure most people reading this are well aware of given the past week or so’s publicity.
In fact, for offline rendering, I could close the article right there. If you’re a video editor on this page and don’t happen to care about audio (hello… you might be lost, but welcome regardless) then you should feel secure in picking up one of these right now if you have the resources and the need for more power in your workshop.
But as was proven with the release of the 1950X, a smooth running audio PC for a lot of users is largely pinned on how great it is at real-time rendering, which is a whole different ballgame.
In the 1950X article I linked up top, I went into a great deal of detail regarding where the performance holes existed. I found that low latency response was sluggish, resulting in a loss of performance overhead that left it in a less than ideal place for audio orientated systems. My theory was that NUMA load optimization for offline workloads was leaving the whole setup poorly suited to real-time workloads like ASIO based audio handling.
In the weeks following that article, we saw AMD release BIOS updates and application tweaks to try and resolve the NUMA addressing latency I had discussed in the original article, largely to no avail as far as the average audio user was concerned. In AMD’s defence, they were optimizing it further for tasks that didn’t include the sort of demands that real-time audio places upon it, so whilst I understand the improvements were successful in the markets they were designed to help, few of those happened to be audio-centric.
At the time it was just a theory, but my conclusion was largely that if this is as integral to the design as I thought it might be, then it would take a whole architecture redesign to reduce the latency to levels that would keep us rather demanding pro users happy.
The 2990WX we see here today is not the architecture change we would require for that to happen: where the 1950X has 2 dies in one chip, the 2990WX is now running a 4 die configuration, which has the potential to amplify any previous design choices. If I was right about hard NUMA being the root of the lag in the first generation, then on paper it looks like we can expect this to only get worse this time around, due to the extra data paths and the potential extra distance the internal data routing might have to cope with.
The 2950X, by comparison, is an update to the older 1950X and maintains 2 functional dies, with tweaks to the chip’s performance. Given the similar architecture, I would expect this to perform similarly to the older chip, whilst making gains from the process refinements and tweaks enacted within this newer model. I’ll note that all core overclocking is improved this time around, and a stable 4GHz was quick and easy to achieve.
OK, so let’s run through the standard benchmarking and see what’s going on.
As normal, I’ve locked both of the chips off at an all-core turbo. As with a lot of these higher core count chips, I’ve not managed to hit a stable all-core max turbo clock, which would have been 4.2GHz, instead settling for 3.8GHz on the 2990WX and 4GHz on the 2950X, both of which perform fine with air cooling.
I’ve spoken to our video team about this and they managed to hit a stable 4.1GHz on the 2990WX using a Corsair H100, so it looks like you can eke out a bit more performance if noise is less of a consideration in your environment.
If you’re not aware from previous coverage why I do this: when you’re running a turbo with a large spread between the max and minimum clock speeds, the problem with real-time audio is that when 1 core falls over, they all fall over. So, whilst you might have 2 cores running at 4.2GHz, the moment one of the cores still running at 3.2GHz fails to keep up, the whole lot will come tumbling down with it. Locking the cores off will give you a smoother operating experience overall, and I’m always keen to find a stable level of performance like this when doing this sort of testing.
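The weakest-core effect can be sketched in a few lines. This is a toy model of my own for illustration, not a benchmark: in a real-time audio callback every core must finish its share before the buffer deadline, so the slowest core in the group sets the usable ceiling.

```python
# Toy model: usable real-time throughput is capped by the slowest core,
# since one late core blows the whole audio buffer deadline.

def realtime_throughput(core_clocks_ghz):
    """Usable work per buffer = slowest core's clock * number of cores."""
    return min(core_clocks_ghz) * len(core_clocks_ghz)

mixed  = [4.2, 4.2, 3.2, 3.2, 3.2, 3.2]   # wide turbo spread: 2 fast, 4 slow
locked = [3.8] * 6                         # all cores locked to one stable clock

print(realtime_throughput(mixed))   # the 3.2GHz cores set the ceiling here
print(realtime_throughput(locked))  # lower peak clock, but higher usable total
```

Even though the mixed setup has a higher peak clock, the locked configuration comes out ahead for real-time work, which is exactly why I lock the cores off for this testing.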
I don’t always remember to run this benchmark, although this time I’ve made the effort, as Geekbench doesn’t appear to support this many cores at this point. Handily enough, I did at least run this over the 1950X last time, which returned results of 428 on the single core and 9209 on the multi-core.
Given that the 2990WX looks to be pulling twice the performance and physically has twice the number of cores, it looks to all be scaling rather well at this point. The 2950X, on the other hand, sees around a 10%-15% gain on the single and multi-core scores over the previous generation.
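As a quick sanity check on those scaling claims, here's the arithmetic laid out (illustrative only: the implied figures below are derived from the percentages quoted above, not separately measured scores):

```python
# Reported 1950X baseline scores from last time.
base_single, base_multi = 428, 9209

# 2990WX: "twice the performance" on the multi-core, with twice the cores.
implied_2990wx_multi = base_multi * 2
print(implied_2990wx_multi)

# 2950X: the observed 10%-15% generational gain over the 1950X.
for gain in (0.10, 0.15):
    print(round(base_single * (1 + gain)), round(base_multi * (1 + gain)))
```

So a multi-core score in the region of 18,400 for the 2990WX would represent near-linear scaling from the 1950X's 9209.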
Moving onwards and the first test result here is the SGA DAWBench DSP test.
This initial test is very promising, as was the older 1950X testing. Raw performance wise, we’re talking power by the bucket load, I really can’t stress that enough, with both chips performing well in what is essentially a very CPU-centric test.
At the lowest buffer we see it being exceeded by the older chip, so what is going on there? Well, we’re seeing a repeat of the pattern exhibited by the 1950X, where there is an impact to performance at tighter buffers, and it does appear that at the very tightest buffer setting we’re seeing some additional inefficiency caused by the additional dies, although this does resolve itself when we move up a buffer setting.
Last time we scaled up from 70% load being accessible at a 64 buffer; this time, I imagine due to the extra dies being used, we see the lowest setting overloading around the 65% load level and then scaling up by 10% every time we double the buffer.
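That scaling pattern is simple enough to lay out explicitly (this is just a restatement of the numbers reported above, assuming the 10%-per-doubling pattern holds across the range):

```python
import math

def usable_load(buffer_size: int, base_pct: float = 65.0, step_pct: float = 10.0) -> float:
    """Usable CPU load at a given ASIO buffer, starting from 65% at a
    64-sample buffer and gaining 10 points per buffer doubling."""
    doublings = int(math.log2(buffer_size // 64))
    return min(100.0, base_pct + step_pct * doublings)

for buf in (64, 128, 256, 512):
    print(buf, usable_load(buf))
```

That puts the expected figures at roughly 65% / 75% / 85% / 95% across the 64 to 512 buffer range.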
As a note, when I pulled that 512 buffer result this time around, it returned 529 instances.
The 2950X, by comparison, returned load handling around 85% on a 64 buffer, rising to 95% at a 256. That’s an improvement on our first look at the original 1950X chip, although I’ll note I was also seeing this improved handling when I did the 1950X retest a few months ago using the newer SGA1156 test that has replaced the classic DSP test, so this might be down to the change in benchmarks over the last year, or it could be down to the BIOS level changes they’ve made since the original generation launch.
So far, so reasonable. A lot of users, even those with the most demanding of latency requirements can get away with a 128 buffer on the better audio interfaces and the performance levels seen at a 128 buffer, at least in this test are easily the highest single chip results that I’ve seen so far.
In fact, knowing we’re losing 40% of the overhead on the 2990WX is really frustrating when you understand the sort of performance we could be seeing otherwise. But even with that in mind, if you wanted to go all out and grab the most powerful option you can, wouldn’t this still make sense?
Well, that test measures pure CPU performance, and in the 1950X testing the irregularities really started to manifest in the DAWBench Kontakt test, which depends just as heavily on the memory addressing side of things.
Normally I would insert a chart here to show how that testing panned out.
But I can’t.
It started off pretty well. I fired it up with a 64 buffer and started adding load to the project. I made it up to around 70% CPU load on the first attempt before the whole project collapsed on me and started to overload. I slackened it off by muting the tracks and took it back down to around 35% load where it stabilised, but from this point onwards I couldn’t take it above 35% without it overloading, not until I restarted the project.
I then tried again at each buffer setting up to 512 and it repeated the pattern each time.
I talked this one through with Vin, the creator of the various DAWBench suites, and a number of ideas were kicked about, some of which I’ve dived further into.
One line of thought concerned the DAW itself. I was still using Cubase, and the last 8.5 build specifically, precisely because C9 has a load-balancing problem with high core count CPUs that is currently being worked on. The older 8.5 build is noted as not exhibiting the same issue due to a difference in the engine, and during this round of testing Windows itself showed fairly balanced loads mapped across all of the cores when I watched the performance meter. Even so, historically, exceeding 32 cores has always been questionable inside many of the DAW clients.
So, to counter this concern, I ran the same tests under Reaper and saw much the same result: I could push projects to maybe 65%-70%, and then the audio would distort as the chip overloaded, and this wouldn’t resolve itself until the sequencer was closed and reloaded.
So what is going on there? If I were to speculate: NUMA memory addressing is designed to allocate the nearest RAM channel to its nearest physical core, and not to use other RAM channels until a core’s local channel is full.
Knowing that, I suspect the outcome here is that it maintains optimal handling up until that 70% level, and once it figures out that the local RAM channel is overloaded it starts allocating data on the fly as it sees fit. Reallocating that data to one of the other three dies means it gets buffered and then written to a secondary memory location, adding latency when the data is recalled in a later buffer cycle; if the buffer cycle completes before the data can be recalled, audio is lost.
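As a purely illustrative sketch of that theory, the following toy model treats the local memory channel as having a fixed capacity, with anything spilling to a remote die paying a heavy recall penalty; the capacity, cost, and deadline numbers here are invented for the example, not measured:

```python
LOCAL_CAPACITY = 0.70  # fraction of total load the local channel can serve (assumed)
LOCAL_COST = 1.0       # relative recall cost for node-local data
REMOTE_COST = 10.0     # relative recall cost once data spills to a remote die (assumed)
DEADLINE = 1.0         # relative time budget of one buffer cycle

def buffer_cycle_ok(load):
    """Return True if a buffer cycle at this load would meet its deadline."""
    local = min(load, LOCAL_CAPACITY)
    remote = max(load - LOCAL_CAPACITY, 0.0)
    recall_time = local * LOCAL_COST + remote * REMOTE_COST
    return recall_time <= DEADLINE

for pct in (35, 65, 70, 75, 80):
    print(f"{pct}% load -> {'ok' if buffer_cycle_ok(pct / 100) else 'overload'}")
```

The point is the shape, not the numbers: performance holds right up to the local capacity and then falls off a cliff rather than degrading gracefully, which matches the behaviour seen in the Kontakt test.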
In short, we’re seeing the same outcome as the first generation 1950X but amplified by the additional resources that now need to be managed.
This way of working is the whole point of hard NUMA addressing, and indeed it is the optimal design for most workstation workloads where multiple chips (or die clusters in this case) need to be managed. It’s a superb optimization for many workloads, from database servers through to offline render farms, but for anything requiring super-tight real-time memory allocation it remains a poor way of doing things.
As I’ve said previously, this is nothing new for anyone who deals with multi-CPU workstations, where NUMA management has been a topic of interest to designers for decades now. There has always been a performance hit for dealing with multiple CPUs in our type of workflow, and it’s largely why I’ve always shied away from multi-chip Xeon-based systems, as they too exhibit this to a certain extent.
Much like the first generation 1950X with its two dies, we see similar memory addressing latency when we use two separate Xeons, and this has always been the case. I would never use four of those together in a cluster for this sort of work simply due to that latency, so the overall outcome with four dies being used in this fashion isn’t all that surprising.
I also tried retesting with SMT turned off, so it could only access the 32 physical cores, in order to rule out a multi-threading problem. The CPU usage didn’t quite double at each buffer setting, instead settling around the 70% total usage mark, but the total number of usable tracks remained the same, and once again going over this led to the audio collapsing quite rapidly.
So, much like the first generation, the handling of VST instruments, and especially those which are memory heavy, looks like it may not be the best sort of workload for this arrangement. That remains a shame, especially as one of the other great concerns from last time, heat, has been addressed to quite some degree. Running the 2990WX even with an overclock didn’t really see it get much above 70 degrees, and that was on air. Given that the advised TDP here is 250W at stock, rising quickly when overclocked even to the point of doubling the power draw, the temperatures for a core count this huge are rather impressive.

I think there is a lot for Intel to pay attention to here in regards to thermals, and the news that the forthcoming i9s are finally going to be soldered again makes a whole load of sense given what we’ve seen here with the AMD solutions. If anything, it’s just a shame it took the competition pulling this out of the hat before they took notice of all the requests from their own customers over recent years for it to be brought back.
Still, that’s the great thing about a competitive marketplace and very much what we like to see. Going forward I don’t really see these performance quirks changing within the Threadripper range, much the same way I never expect them to change within the Xeon ecosystem. Both chip ranges are designed for certain tasks and optimized in certain ways, which ultimately makes them largely unsuitable for low latency audio work, no matter how much they excel in other segments.
There is some argument here for users who may not require ultra-tight real-time performance. It’s been brought to my attention in the past that users like mastering guys could have a lot of scope for using the performance available here and if they are doing video production work too, well, that only strengthens the argument.
On paper that all makes sense, and although I haven’t tested along those lines specifically, the results indicate that even the trickiest of loads for these CPUs seem to stabilise at 512 and above with 80%+ of the CPU being accessible, even in the worst case scenario. I have to wonder how it would stand up in mixed media scenarios, although I would hope that in any situation where you render offline you should be able to leverage the maximum overhead from these chips.
I suspect the other upshot of this testing might be a revisit of the total CPU core count that each DAW package can access these days. The last time I did a group test was about half a decade ago, and certainly all the packages look to have upped their game since then. Even so, I doubt anyone working on a sequencer engine even three years ago would have envisioned a core count such as the one offered by the 2950X here, let alone the monstrous core count found in the 2990WX.
AMD’s Zen core IPC gains this generation, as we’ve already seen with the Ryzen refresh earlier in the year, were around the 12% mark, and that looks to have translated faithfully into the Threadripper series with the 2950X model. One of AMD’s big shouting points at launch was just how scalable the Zen package is simply by upping the die count, and that’s clear from the raw performance offered by the 2990WX; they really have proven just how effective this platform can be when dealing with the workloads it’s designed for.
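As a rough sanity check on those claims, the quoted ~12% IPC uplift and the doubled core count compound straightforwardly (illustrative arithmetic with a hypothetical baseline score, not benchmark data):

```python
ipc_gain = 1.12      # ~12% IPC uplift quoted for the Zen refresh
baseline = 100.0     # hypothetical previous-generation single-core score

# Single-core: the IPC uplift alone predicts the 10-15% gain seen on the 2950X.
new_single = baseline * ipc_gain

# Multi-core: doubling the core count (16 -> 32) on top of the IPC gain
# predicts roughly 2.24x throughput, before any NUMA penalties bite.
multi_scale = (32 / 16) * ipc_gain

print(round(new_single, 1), round(multi_scale, 2))  # -> 112.0 2.24
```

Which is broadly the picture the benchmarks paint: near-faithful IPC scaling on the 2950X, and roughly doubled raw throughput on the 2990WX in the workloads it suits.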
One day I just hope they manage to find a way of making it applicable to the more demanding of us studio users too.