Breaking Boundaries with Data Parallel C++ | oneAPI | Intel Software

[MUSIC PLAYING]

Welcome to another episode of Tech Decoded. I'm Lindsay Michelet with Intel. I'm joined today by Alice Chan. Alice is a vice president in our Intel Architecture, Software, and Graphics group, and the general manager of our Compilers and Languages team. She's held numerous leadership positions here at Intel since joining in 1995 and is an expert in compiler technologies and products. Welcome, Alice. It's nice to have you with us.

Thanks, Lindsay. It's really nice to be here.

Alice, Intel has introduced the oneAPI initiative, and as part of that, the Data Parallel C++ programming language. Why is this important
to the industry now?

Well, let's just talk about oneAPI and what it is. It is really an industry initiative, led by Intel, to get better productivity and a common methodology for software development across many different architectures. Data-centric workloads and applications have evolved quite a bit in the last couple of years and have become extremely diverse: you have scalar, vector, and matrix programming. And because of that, there's also a lot of innovation in hardware. Our CPUs do scalar programming really well, and vector too. Then there are the GPU and the AI accelerator, which are good at vector and matrix programming. And then you have the FPGA, which is something completely different, a different paradigm.

Today, software developers have a hard time bringing all of these things together. It's a big challenge. Every one of these hardware targets has its own silo of tools and developer languages, and a lot of these are even proprietary. Without an open specification, you don't know where the language is heading. You don't know what's going on. You don't know whether the next version is going to be compatible with your code. It's a big problem. It means that productivity is going to suffer, and re-use of code becomes very difficult once you have different, proprietary languages.

So Data Parallel C++ is really an evolution that Intel is putting together to address this issue: to bring productivity and optimum performance, which is really the most important thing, across architectures. It's a change of paradigm in how people think about programming across diverse hardware, and that's what oneAPI, together with Data Parallel C++, strives to accomplish. And because it's open, you want to bring in the community, bring in different vendors, so that everybody can come together and innovate. So Intel is absolutely leading this effort because it's so important
for the industry today.

Can we dive into more specifics on Data Parallel C++? What is it? And why do we need it, as opposed to using OpenCL or C++?

Yep, very good question. So Data Parallel C++ is really an evolution. There's really no strong programming language today that works across architectures the way we just described. We need something stronger: something that can address matrix and vector parallel programming, and a language that has the capability to offload to accelerators for heterogeneous computing.

C++ is good. C++ is a very broadly adopted language. It is a very strong language that actually gives very good performance. However, something is missing: the standard today, unless you add other things to it, doesn't really do heterogeneous programming.

OpenCL is a good innovation, actually. It was led by the Khronos Group, and when it got started, it was very promising. However, companies that joined that initiative started breaking away from it. They made their own extensions, the standards started diverging, and eventually they favored their own proprietary languages rather than staying in the consortium. It just kind of splintered.

Because of that, the Khronos Group built new innovation on top of what they learned from OpenCL. There is a new language, a programming model called SYCL, that they put together based on pure C++. So if you know C++, this
becomes much simpler. The language is much simpler. It supports heterogeneous programming, like OpenCL, and it has a new, open specification, with many companies joining together to put it together. So we took a look at it, and it is very promising. And we started to innovate on top of this foundation: C++ underneath, SYCL in between, and Data Parallel C++ on top.

Why do we need to do that? There are gaps in SYCL, missing features without which it is very hard to get to that optimum performance we were talking about across hardware, especially for things like the FPGA, or the GPU that is on the Intel roadmap for the near future. So we have extensions, and we have innovation on top of that. And that's basically what Data Parallel C++ is.

And also, another thing is that it's open. It really breaks apart from the silos of proprietary, vendor-locked languages that we have today for the GPU. And also, the base is C++.
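To make that concrete, here is a minimal sketch of what a DPC++/SYCL kernel looks like. This example is not from the interview; it assumes a SYCL 2020 implementation such as Intel's DPC++ compiler (compiled with something like `icpx -fsycl vector_add.cpp`):

```cpp
// Vector addition written as plain C++ plus the SYCL library.
#include <sycl/sycl.hpp>

#include <iostream>
#include <vector>

int main() {
    constexpr size_t N = 1024;
    std::vector<float> a(N, 1.0f), b(N, 2.0f), c(N, 0.0f);

    // The queue picks a default device: CPU, GPU, or another accelerator.
    sycl::queue q;

    {
        // Buffers let the runtime manage data movement between host and device;
        // results are written back to the vectors when the buffers are destroyed.
        sycl::buffer<float> bufA(a), bufB(b), bufC(c);

        q.submit([&](sycl::handler& h) {
            sycl::accessor A(bufA, h, sycl::read_only);
            sycl::accessor B(bufB, h, sycl::read_only);
            sycl::accessor C(bufC, h, sycl::write_only);
            // The kernel is an ordinary C++ lambda, run once per index.
            h.parallel_for(sycl::range<1>(N),
                           [=](sycl::id<1> i) { C[i] = A[i] + B[i]; });
        });
    }  // Buffers go out of scope here, so c now holds the results.

    std::cout << "c[0] = " << c[0] << '\n';
}
```

Everything in the sketch is standard C++ syntax; the offloading comes entirely from the `sycl::` library types and the kernel lambda.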
with a lot of other very common programming model and
languages outside, like OpenMP, as you described, Fortran,
of course, C++, MPI. So a lot of things
will just come together and work together. And so basically, that’s
what Data Parallel C++ is. Let’s talk a little bit more
about oneAPI and Data Parallel C++ being open, being based on industry standards. Why is this so important?

Well, without a standard, without an open specification, you can't have the community innovate together. You can't work together with other vendors. We can't invite partners and say, hey, let's do this together. The perfect world would be to have everything in C++, because, as I say, C++ is powerful. However, the C++ standard is moving quite slowly; it can be 7 or 10 years before they come up with a standard, and we need something that moves a little bit faster. SYCL has that. SYCL moves fast, but there are gaps, so Intel is stepping in to make sure that we fill those gaps.

Here is how we did it: we actually have a GitHub today. There is an open-source implementation of the DPC++ compiler, and the specification is also open. This means that any vendors who would like to implement it for their hardware have all the information to do so. And in fact, once we did that, other community members and vendors came to join us in this effort. So that's actually really exciting. It really changed the way people look at how you can program a GPU today. That's really the solution: we basically opened
this up to everybody.

I agree. It is exciting, and I think it feels like we're almost at an inflection point in history, where we're looking at simplifying programming across a heterogeneous environment using DPC++. Have we seen anything like this in history before?

I remember that. [LAUGHTER] That was, what, 10 years ago already. Many years ago, Intel had another big initiative, and it's actually still going on. When our CPUs became more and more complex, we started adding more and more vector capability. Parallel programming became really important for developers to get the maximum performance, and in order to do that, there's a paradigm shift: the way you write a program is completely different from when you only write serial or scalar code. So at that time, we made a similar effort, but this one is even bigger, in my opinion. We had tool suites, we had a lot of training, and we had a lot of new innovation in the language, especially in OpenMP, that we pushed out.

The world is moving along. Parallel programming is still important, but offloading computation to accelerators and GPUs is playing an important part in the data-centric world. So that is another shift that I'm seeing, on top of the parallel programming paradigm. We need to make sure that we educate developers to understand how to offload, how to take the most advantage of the devices made available to them today. And that's what oneAPI is trying to do. And it's really not just Intel: the whole world, the whole ecosystem of hardware vendors will be benefiting from this also.

Agreed. Alice, what kind of performance
can we expect across the four different architectures that you mentioned from Data Parallel C++? And really, how confident are you that we chose the right language?

We're pretty confident. [LAUGHTER] Right? So go back to what I said earlier: we have a really good foundation as a language. C++ is sitting right there, and we're adding the capability to do offloading and heterogeneous programming much more easily. Actually, the style of programming is similar to the proprietary languages of today, so if you are familiar with those, it's a pretty easy port.

The most important thing is what I said earlier: we're adding open extensions, with specifications, to fill the gaps. For example, for the GPU that is on the Intel roadmap, parallel programming is important, so there are extensions that will support vectorization. And then there's the FPGA. That is a whole different beast, with its own requirements, such as how pipes are implemented. Those are just simple examples, along with how memory is managed between the devices and the host. All of these are innovations that we're putting into the language, with open specifications, so that everybody can help out. And we're actually also getting input from other vendors on what they would like to see in the language to help them.

So I think, with a good foundation together with the innovation that we're doing across these architectures, yeah, we're pretty confident we'll get the performance that we're looking for.

Yeah. What about using C++ and OpenMP? How would that compare
to Data Parallel C++?

Right. So we do have a lot of customers who are very familiar with using OpenMP on top of C++ for both parallel programming and offloading. If you want to do that, Intel is absolutely going to continue to provide it for you, so you can use all the OpenMP pragmas to do threading, vectorization, and offloading. That will absolutely be supported on the Intel roadmap for our GPU moving forward.

However, if you are more familiar with the GPU style of programming, writing kernels and offloading to a device, DPC++ might be a better fit, because the style in which you program is very similar; you will have an easier time porting over. And moreover, under the oneAPI brand, DPC++ is really the language of choice if you want to go beyond just CPU and GPU: we're going to reach the FPGA, and we're going to reach the AI accelerator. So that's the difference.

The oneAPI Toolkit, are we going
to see anything new in it that we haven't seen before?

Yeah. Well, there's a lot that will be familiar and will always be there. Intel has a very rich set of performance libraries, and if you are familiar with writing programs through those APIs, they will continue to be there, highly optimized across platforms. The debugging tools too; of course, when you write a program, you need to know how to debug it. All of those will be there, and the same goes for the analysis and advisor tools.

However, richer features are coming. With the analysis tool, you'll be able to dig deeper into the hardware to figure out what's going on; you can see the profile of your applications and tune them better. And then we also have an advisor that will actually give you hints and advice on how to do things better. We did that for parallel programming; this time, you will get advice on which parts of the code are best to move over to a device to get better performance.

So at a high level, the tools are similar, but the features are going to be enhanced to support this cross-architecture initiative.

Yeah, it's an exciting time.

Yeah, it is. This may be a tough one, but
where do you see DPC++ five years from now?

It's going to be a very exciting five years, because we're really changing the way developers think about programming across devices. And once we get the message across, and once people have a chance to take a look at what we're doing, I think they will really appreciate it, because you're talking about re-use and productivity with optimum performance. So with this message, with what we are delivering, I think it's going to generate a lot of excitement in the industry. This means that I expect adoption; there will be a lot of adoption. Of course, first of all, there's going to be a lot of people checking it out, and hopefully they will come together and innovate with us within the community. So yes, by five years' time, I expect a lot of adoption across the industry.

So if there's one thing that you
want to tell developers today about DPC++, what would it be?

We are serious. [CHUCKLES] We're really putting a lot of effort and resources into doing this right, because we believe it is really important to the industry. And it's real; it's already here. As I said, there's already a GitHub, and we'll have products rolling out, so check it out. It is going to change the way you develop, and it's going to revolutionize the industry. So yeah, come join us. It's an exciting time.

Absolutely. Alice, thank you for joining us today.

Most welcome.

For more information on Data Parallel C++, visit our Tech Decoded website or the Intel Developer Zone. My name is Lindsay Michelet. Thank you for joining us for another episode of Tech Decoded. We'll see you next time.

[MUSIC PLAYING]
