UC Berkeley Cloud Computing Meetup 001

I’m going to make my part of this very
brief. I mainly want to thank a whole lot of people, and I want to start with our speakers; they'll be coming up afterwards and we'll do them in a row. Shane Knapp from the RISELab is going to talk about his work managing AWS and other cloud provisioning for researchers and students. Lindsey Heagy will talk about her work doing large-scale research computing as a member of the Jupyter project – it's very cool. Then Lukas from DevSpace, who is here somewhere, will talk. They're a SkyDeck company and they make managing Kubernetes clusters pain-free, so we're all going to want to hear about that. I want to thank my co-organizers. So
Anthony [Anthony stands up], he’s been great in really getting this
off the ground as one of the first movers, sort of talking about getting
a community around cloud here. Sibyl Chen is the Senior Program Director at the SkyDeck, and she’s gonna be speaking a little bit about what’s going on in the SkyDeck, and Jason Christopher from Research IT. I also want to give a shout out to Jean
Cheng from the Academic Innovation Studio. If you haven’t been to
it, go to ais.berkeley.edu. She is an expert on dealing with innovation in
higher ed and thinking about technology in creative ways, and she was also great
at helping us think through how a Meetup should work. I should point out
my two bosses are here. Larry Conrad is the University’s Chief Information Officer
and Jenn Stringer… [off camera: "she just walked out"] …Okay, she just walked out [laughter] — my brand new boss. And then I want to thank our sponsors. Caroline Winnet couldn't be here; she's the Executive Director of SkyDeck. Everything you see here, from the sign on the top of the building to the new space they're building out on the third floor, is a product of her vision and the amazing team that she's put together. Gordon also has done an amazing job helping us get all the logistics ready for tonight. And I want to give a shout out to Cathryn Carson, who's going to be speaking in a minute. She is the faculty
lead of the Data Sciences Education Program and a Professor of History here. To get a new division, and an interdisciplinary division at that, off the ground requires a lot of intellectual and other heavy lifting, and a lot of the credit goes to Cathryn; she would probably deny that, but… [Cathryn, off camera: "I will."] Right. [Applause]

So, dealing with technologists, I'm used to a lot of skepticism. So one question I wanted to tackle up front — why a cloud meetup? Like, why are we doing this? And I would posit that the answer is right here in the room. So I'd like to gather some data about this.
So I want to do a quick poll. So anyone who is an academic — a student, faculty or
researcher, raise your hand. So how many? So we’ve got a bunch of them.
And then, anyone on the staff here, if you’re on the staff raise your hand — so
we’ve got a lot — this includes our IT staff. Let’s see, any SkyDeck companies or other entrepreneurial ecosystem people? So, we have probably three or four companies represented there. And then who just found out about this on meetup.com and is just local? Great. And how many people are Berkeley alumni? So we have a lot of alumni also here. So that is why. Research universities are very decentralized – we barely even talk to each other about stuff. The hardest part
of my job is to try to get people to do things more similarly and
less divergently, when we really have no sticks and our carrots are fairly pathetic. So getting people together and talking and starting with the people
side of this I think, is really the way to go, and that’s why we’re doing this. So to sort of put that to the test also —
it’s a Meetup — so people are here to meet each other, so would you please take
a minute and introduce yourselves to the people on the other side of you, and just
say what it is you do — someone you haven't met before, and what brought you here. [Off camera: lots of conversation.]

Thanks, Bill. I'm just delighted to be part of welcoming you to this first cloud Meetup, and I don't want to stand in between you and Shane or any
of the presentations that will follow, but I do want to bring out and make
visible and articulate this great energy and enthusiasm and bottom-up
innovation that Bill and Anthony and all of the organizers and all of you have
made possible coming together here. Bill mentioned that the Division of
Data Sciences is a new, emergent, and very, very fast-moving organization. And
because we try to teach at scale in new and innovative ways, we found ourselves
moving faster and faster into the cloud in ways that we had some inkling of when
we started the program back three and a half years ago, but the acceleration of
the technological developments and the uptake have been so fascinating to be
part of watching. As Bill mentioned, I’m a historian, and I think this is a moment of
radical fast change, and to be able to see a community coming together from
inside and outside the University — to discuss where it’s going, and how we can
adapt to it — it is going to be really thrilling to be able to be part of. And so I feel the
energy of the room, I am really excited to see each of the presentations, and I above all want to reinforce Bill's sense that we can do this
together, that each of us brings something to the
table here, and to be able to see the University move, in particular, as fast
into this space as we’ve moved into data science education, touching thousands of
students a semester through our cloud-based platform — that gives me great
hope. We can bring that energy into moving our institutions forward. And so
I’m thrilled for the chance to be up here at SkyDeck, which also has that
spirit, and would like to pass it over to a brief presentation about the opportunities that you have here too.

Thank you so much. I just wanted to give a quick welcome to all of you. By a
quick show of hands, for how many of you is this your very first time to the SkyDeck? Wow! Okay great — well yeah, what a pleasure to host all of you guys here. When Bill mentioned the idea of hosting at SkyDeck, Caroline and the rest of the team immediately said yes.
We’re really excited, and anything that we can do to support the community, we’re
happy to. I think some of the upcoming events may also be here as well, so we look forward to seeing you many times. But a few quick words about SkyDeck. For those of you that don't know too much about us, we're one
of the tech accelerators in Berkeley, and right now we have a hundred and twenty
start-up teams coming through our office every semester. There are two
tracks. There is a cohort track — those are the twenty teams that we invest in, usually $100K each. And we work really closely with them to help them get to a point where they can successfully raise their seed round or Series A. And then there's another hundred teams on the hotdesk track. Those hundred teams are a little bit earlier stage; maybe they have a prototype or MVP [minimum viable product] and they're iterating. And all of these teams
come to SkyDeck and access our programming; we have countless workshops, fireside chats and lunch-and-learns, and we also have a vast advisor network. Bill
is actually one of our Sky Advisors, so we pair our teams with advisors who are
experts in the industry that they’re in and they get really strategic help and
advice and warm introductions. For any of you guys who may know founders that are
looking for an environment or ecosystem that they can plug into and really get
some help, SkyDeck is here; we're here to really support Cal founders. It's really easy to plug into our network at whatever stage your startup is at. So, our next application session
opens on April 1st. If you know any founders that you think might be
interested in learning more, then on April 1st send them to skydeck.berkeley.edu, and they can very easily apply to join our ecosystem. We
usually have about a thousand applicants. Yes, we’re anticipating a
thousand applications for the next cohort, so it is competitive, but again, we're all about supporting founders and start-ups around Berkeley. So, thank you, Bill, for inviting us to host, and we're happy to support you. Hope you all enjoy the evening.

Hey everybody. My name is Shane Knapp, and I work in the computer science division, specifically for the RISELab. We do a lot of cool stuff –
graduate research lab, work a lot on machine learning, reinforcement learning,
AI model serving…I also manage the technical staff across a bunch of
different research labs as well. And I run the Apache Spark open source build system, so I've got a few hats. On top of all that, we host a massive AWS consolidated billing family. We have over 175 linked accounts — I should have actually checked today; I think we're closer to two hundred. [pause] We also support other cloud providers, but not as much as AWS. Amazon is one of our corporate sponsors
for the lab, so we deal with a lot of research credits, we actually don’t
really pay our own bills. Thank you Matt Jamieson for that. So why
the cloud? Especially in research, you know, we focus on really high-accuracy, low-latency, secure systems, and the cloud gives us a really good opportunity here: we let the researchers run all this stuff themselves, and they can configure their systems however they need. Think low latency, availability; if they want brand-new shiny GPUs, or systems with 128 gigs of RAM or even more, they can just get that. We have a lot of students doing a lot of research, and hardware is expensive, so the upfront cost of buying a box of 8 GPUs, that costs something. And then the opex: how do you manage these boxes? You have to have the staff hired to take care of them. So cloud's pretty much a no-brainer, you
would think. So how do we do it? Linked accounts. So we have a central paying account. And I'm actually going to just jump in really quick: this is not going to be a technical talk about how we deploy things in the cloud, because we actually don't manage any of that. We give the students and the researchers the power to do this themselves. We don't want to get in the way. And with so many linked accounts, there's no way I could come up with some sort of infrastructure that could be shared amongst all these students, with different projects doing different things. So with Amazon as our sponsor, we have a consolidated billing family. We get a lot
of credits as a sponsorship deal. We have people invited into our org, and
they don’t see a bill, so they just spend spend spend and we pay for it. Also the
reason why we like the cloud is that by the time students actually get into grad school,
they’ve been using AWS or other cloud providers for years, so this is a real no-brainer
to them, like they know what to do when they come in, they know how to spin up
instances, they know how to save their AMIs. You know, I think I've had to give probably two hours of technical support on using AWS in the five years I have been at Cal. I'm okay with that. I mean, seriously. And again, with all the permutations of configurations (memory, CPUs, GPUs, networking, lambdas), it's a really powerful, flexible system for researchers. And it works in conjunction with bare metal. We actually do have a pretty
significant number of machines up in Soda Hall. And we work with the
researchers, so they can actually prototype their experiments on bare
metal, figure out what they want to run and how they’re going to run it, and then
fine tune their spends and their systems on AWS. It’s a lot less work for us for
the most part and we provide a lot of feedback to industry partners. Okay so
what’s the result? So, many GPU and CPU hours were
consumed and much research was completed. I mean, we do a lot of work. Look up this one project called Pywren, which came out of a partnership between the RISELab and BCCI [Berkeley Center for Computational Imaging]: lambdas at scale, it's amazing. Happy grad students and faculty. Many, many, many credits burned. Lots of stress for me for a while, and a lot more work than I initially anticipated; out of it I wrote about a thousand lines of Python to parse billing and released that as open source. Not only is it a thousand lines of Python, it's an n-ary tree implementation in Python, to model the organizational units.
And the other thing that happened is we significantly expanded our local pool of GPUs. We were burning at one point over $2,600 a day on Amazon. So doing the math, I went to my PI and said, hey, take ten days of that GPU usage (that's $26,000) and we can buy an amazingly good, expandable box for that much money. So it's interesting: we moved to the
cloud, and found out all the problems with the cloud. We're not a startup; we can't say that our QA team, our dev team, and our release teams are going to use just this amount of resources. It's like, oh, an NSDI paper is due, and spend just spiked up. So we just had to figure out a good balance of actual on-prem bare metal versus what's out there in the ether. So how is it to manage something like this? The first point says it all:
it’s easy except when it’s not. We’re not centrally managed, and we don’t have
insight into what all these students are doing with the cloud. Billing is the biggest problem we have with the cloud. How much are you
going to spend? Can you forecast spend? Can you figure out like — there’s a
paper due, so what’s the burn gonna look like leading up to the paper? Have I
mentioned the number of accounts and projects we have? The learning curve to
get people to migrate — it’s really not that bad. Going back to bursty
workflows and unique usage patterns, I mean, you never know
when someone has a paper due, or someone’s got a job interview at some
university and they have to get the paper out. Billing again — I really can’t stress this
enough, if you’re using research credits, the built-in billing system in AWS does
not support this very well at all. I can’t get how many credits I have
remaining short of going to a web page and copying and pasting. There’s no API access for stuff like this, so it just makes it really difficult. Other problems,
like the cost and availability of newer instance types, specifically GPUs and FPGAs now; we're doing a lot of work with those. That has gotten better as Amazon has been expanding their hardware. And then there's the thousand lines of Python just to parse a billing file. That's it. Thank you.

Hello, it's good to be here. So I'm going to be telling you a bit about the Jupyter project and how it is enabling geoscience. So first, I want to give you just a bit of an idea of my background. So I
moved here just at the beginning of the year. I did my PhD at UBC in Vancouver
in geophysics, and so a lot of my thesis work was looking at electromagnetic inverse problems. You fly a helicopter around and collect some electromagnetic data, and then from that what we try to do is back out a 3-D physical
property model of the subsurface. And so that could be important, especially in
California right now, when people are trying to understand how much
groundwater we have, or where there are clay layers that protect the groundwater. So
that’s about my thesis work. And how that brought me into open-source software is that you need to be able to solve partial differential equations. We need
to figure out and try and estimate what those data should look like if we know
what the earth model is. And then from that, we can try and formulate an
optimization problem to figure out that 3-D subsurface model from the data that
we collected. So I ended up writing a lot of code and that brought me into the
open-source software world, and so that brought me into Python and also into
Jupyter. And with that, seeing how powerful a tool it was for my own research, I got very excited about applications for education as well, and so I have
done a lot of work looking at how to make education much more interactive
using tools like Jupyter. So if we ask the question what actually is driving
progress in the geosciences? What do geoscientists think about when they’re
doing research? It’s really this interwoven path of theory and ideas that
we check with observations and data. And then we also need simulations and
computations to try and understand what’s going on. So that image here on
the left that is a magnetic map. And so this is very high-resolution magnetic
data; I just picked one patch of it, but the structure covers the whole Earth. There are over 11 million lines of surveys that were all connected to make this data set, so there's a lot going on in there. From there we can… there's
actually a lot of evidence for plate tectonics. So you can see a lot of the
striping, so you see the striping just off the coast of California? That’s
one of the key insights that let us know that plate tectonics is going on. So
that’s just one observation from the data. But then one of the things that we
want to do, is actually take those data and try and perform some computations to
figure out what the crustal structure is. So that involves a lot of solving Maxwell's equations. And this is where the cloud starts coming in. For quite a number of
years we could work with data on our local hardware and solve these
equations locally, but now the data is getting big enough, and the problems are
getting complex enough that we need to move to the cloud. So, just highlighting some of the problems we encounter in research: there are challenges with
software. So traditionally a lot of academic research has been driven by
proprietary software. It’s not interoperable, so a research group over
here learns how to solve the magnetic
equation, a research group over here solves Maxwell’s equations, and they just
don’t connect. And so that really hinders a lot of progress. Now we’re getting to the
point where the datasets are large enough that you can’t work with them locally. So we need to be accessing them on the cloud. And then this imposes challenges on
researchers. You need to learn a whole new set of tools, and you need to learn
how to transfer your prototyping workflow from your laptop up to the cloud.
And so I think it's important to remember who the audience is, and what these tools need to accomplish here. We're not doing computing for computing's
sake, we need the tools to solve a problem. We need to be able to enable
exploration. Researchers need to be able to interact with the code to
gain insights. And then at the end of the day, we need to be able to communicate
those results and share them both with other researchers and with the
general public. And all of these things, hopefully, should not be posing too much
cognitive load on the researchers. We want to be focusing on the research, not
fighting with the computer. So I want to give you just a bit of a tour of what a geoscientist nowadays is using in a computational workflow. I'll start
with the scientific Python ecosystem. Similar diagrams could be made for Julia and R and several other open source languages, but Python is one of the
largest and growing communities at this point in time. So it’s probably the most
commonly used in the open geoscience world. So at the bottom there, we have Python. And then as you move up through these layers, you're able to leverage the pieces below you. So for example, NumPy takes care of linear algebra; when I go to write my differential equations, I don't actually have to go
and figure out all the matrix solving and all of that sort of stuff, that’s
just taken care of for me. One piece that I want to focus on here
is Jupyter. So although this is inside of the Python ecosystem, Jupyter is actually
language agnostic, so we can equally plug Jupyter into a Julia diagram or an R
diagram. And I want to show you a bit of what it enables. So when people ask
what is Jupyter – it's a very big question – because if you're just trying to define the ecosystem (I've just written down a few of the packages there at the bottom), it's
very big. There’s lots of software that goes into Jupyter. But if we describe the
mission, I would call it a community of people and an ecosystem of
tools, all dedicated to interactive computing. And so the way most people are aware of Jupyter is through the Jupyter Notebook. So the Jupyter Notebook is a
document that combines text and equations with software code that you can actually run in the document, as well as outputs. Those can be figures; it also plugs in things like widgets, so you can actually start making your code much richer and much more interactive. So that runs on your local machine. The Jupyter Notebook itself, we can think of as being composed of three different things: it is a document (the document of equations and code that you're looking at), but it's also an interface and an environment. The document itself is actually represented as just a JSON structure. And what that means is we can actually put different interfaces on it. Something like nteract, which Netflix uses, is built on the same document structure, and it's just a simpler interface; similar with RStudio. So depending on what problem you need to solve, we can compose different pieces to deal with the problem that you're working on.
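Since the on-disk format is plain JSON (the nbformat spec), you can poke at a notebook with nothing but the standard library. A minimal sketch, assuming some notebook file named analysis.ipynb:

```python
# A .ipynb file is just JSON; "analysis.ipynb" is a placeholder filename.
import json

with open("analysis.ipynb") as f:
    nb = json.load(f)

# Top-level version fields of the notebook document format (nbformat).
print(nb["nbformat"], nb["nbformat_minor"])

# Each cell records its type, its source text, and (for code cells) outputs.
for cell in nb["cells"]:
    source = "".join(cell["source"])
    print(f"[{cell['cell_type']}] {source[:60]!r}")
```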
So that's a running Jupyter instance; but how do we actually connect researchers with that compute environment? This is where JupyterHub comes in. JupyterHub can be deployed on an HPC system or up on the cloud,
and it actually allows users to log in and access their own Jupyter environment.
So it handles authentication, it handles the resource allocation, storage, all of
those sorts of things, that you as a researcher just don’t want to have to
think about. And what's great is, it's the exact same interface as what I would be running on my local laptop. So the interface and my interaction have not
fundamentally changed. So once you’ve gone through and actually done some of
your research, how do you actually share this now with collaborators or publish
it with your paper? This is where the Binder project comes in. Binder combines a lot of the infrastructure that JupyterHub has for building your software environment and deploying it on the cloud, to let you generate, from your GitHub repository, a URL that you can give to anybody and that will spin up a compute instance for them. So if you have a GitHub repository of Jupyter Notebooks and you define what needs to be in the software environment, Binder can build that for you and give you a link that you can pass on to anyone to reproduce your results.
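As a sketch of the pattern (user, repo, and branch are placeholders): on the public mybinder.org deployment, a repository of notebooks plus an environment spec such as requirements.txt or environment.yml becomes launchable at a URL of the form:

```
https://mybinder.org/v2/gh/<user>/<repo>/<branch>
```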
that’s a fair bit of tech, but what does this actually have to do with the
geosciences and how does this enable geoscience research? And this is where
I’m very excited to see the progress that’s being made in the Pangeo project.
The way that they've branded themselves is as a community-driven effort for big-data geosciences. And what that actually means is that they have combined a whole bunch of tools in the open-source ecosystem in a way that's tailored to geoscientists. They have advocated for analysis-ready data, so they've worked with NOAA and NASA and quite a number of other large organizations to get the data stored in a cloud-friendly format up on the cloud. Then they've been working with researchers to help implement scalable components like dask and xarray in their code, which lets you parallelize research code. Then, by using something like JupyterHub, you can package all of that up and deploy it on the cloud, so now you've got your compute resources next to your data and you can start performing your geoscience analysis.
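As a hedged sketch in Python of what that workflow looks like: the Zarr store URL and the variable name below are placeholders (and reading a gs:// path would also require gcsfs), but the lazy-open, lazy-reduce, then compute pattern is the Pangeo-style idiom being described.

```python
import xarray as xr

# Lazily open an analysis-ready, cloud-hosted dataset (placeholder URL).
ds = xr.open_zarr(
    "gs://some-bucket/sea-surface-temperature.zarr",
    chunks={"time": 100},  # dask chunking: nothing is loaded into memory yet
)

# Build a lazy computation graph: a mean over the time dimension.
sst_mean = ds["sst"].mean(dim="time")

# .compute() executes the graph, potentially on a dask cluster running
# next to the data (for example, launched from a JupyterHub on the cloud).
result = sst_mean.compute()
print(result)
```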
So Pangeo, in a lot of ways, is just a collection of resources tailored to geoscientists. They have deployed on both HPC systems as well as the cloud, and
again, what’s neat about this, is if you are a researcher doing atmospheric research
and some of your data is on NCAR and some of it is on AWS, you’re going to be using the same Jupyter interface to interact with your data no matter what.
So they’ve deployed several different instances that are tailored to different
communities: some looking at polar research, some looking at oceanography, some at hydrology. There are even groups that have adapted
this for neuroscience. So it’s not just geoscience, this whole idea is starting to expand.
And so they’ve been successful in just the past couple of years of
gaining thousands of users at this point, which i think is a real testament to the
way that they’ve developed. And the way that they’ve developed and I think the
way that Jupyter has developed, and some key pieces that I think make both of these projects successful, are that they are modular building blocks. So we’re
building building blocks that interoperate with each other and with
software that already exists and is used in the open source ecosystem. The other
thing that I think makes these projects successful, is that they are built in
conjunction with researchers who are using them — so the Pangeo project
includes software developers you are with Anaconda, but it also includes
researchers who are doing hydrology work and so by having them working together
on one project, you learn a lot and you make sure that the tools that you
are building are always solving the researchers problems. So with that,
there’s lots of ways to get in touch, we use Discourse in the Jupyter community
as a way to keep in touch and have conversations, so if you’re interested in
finding out more or have more questions, please join us there. [Applause]

Hi, I'm Lukas from DevSpace, and we're
one of the Skydeck cohort teams. We actually met Bill at an advisor meet
and greet at one of the office hours, I think, and he invited us to the cloud computing meetup here; it's very exciting to be here. Thank you for inviting us. And I hope I can talk a little bit about an exciting topic: the product that we're building. Part of it is open-source, part of it is proprietary, and we're
dealing with NoOps Kubernetes. Essentially, the idea behind DevSpace
is this: a user that wants to make use of Kubernetes needs to set up a cluster, and setting up and maintaining a cluster is a lot of work, or costs a lot of money. But actually, a regular user just wants to deploy applications, just wants to use deployments, services, and ingresses in Kubernetes, and we make that possible without users having to set up their own cluster, manage their own cluster, and take on all the stress. Let's first take a look at what my team is made
up of. So we have two other co-founders: my name's Lukas, and then there's also Fabien, my CTO, who's here too, and then Daniel, who handles the business side of things in our start-up. We have a couple of advisors, like Abby Kearns, the head of Cloud Foundry, Michael Aday, and Mona Sabet, who has a venture
firm, and we are very happy to be part of Berkeley’s SkyDeck accelerator. We’re
actually a team from Germany, so an international team, so if I’m pronouncing
things kind of funny, that’s where it comes from. So, we just moved here
in January, currently transforming our company to a U.S. entity, and we are hoping to build the future of cloud native. So let’s talk a little bit about why cloud native is important. There have been plenty of studies showing that companies
are adopting cloud technologies. According to KPMG, 65% of companies already use cloud technologies; 25% of developers use Docker, and 75% of them say that it accelerates their workflow. So not only is cloud native important, but so are containers and cloud-native technologies in general, everything that the Cloud Native Computing Foundation stands for. And what is very important — a lot of
new technologies are cloud only, as Gartner says here: leading-edge computing technology is
more often cloud only. That’s why we really need to care about moving towards
cloud technologies. So why am I talking about Kubernetes? Well I’m just showing
this quote here from Jim Zemlin, the executive director of the Linux Foundation, saying that Kubernetes is becoming the Linux of the cloud; it's kind of like the operating system for running cloud computing technologies.
Kubernetes will be this technology really leveraging the cloud, and you can see
in the background the Google Trends chart for the search term
Kubernetes — so it’s really exploding — the adoption of
Kubernetes. So why is Kubernetes exploding? Well first of all, we can
accelerate build processes, we can accelerate deployments with it, so we
can essentially get faster releases with Kubernetes. We can handle infrastructure as code as well with Kubernetes. We get zero downtime, rolling updates, and just infinite scalability, adopted from companies like (wow, it actually works, great) Google and Amazon and IBM, which helped to build Kubernetes. So we can learn from their best practices. And we have reduced
vendor lock-in because Kubernetes is very portable. I mean Kubernetes runs
on AWS. It runs with one click on GCP, it also runs on your bare metal clusters, it
runs on hybrid scenarios — it’s so flexible in terms of deployment, that it
really allows this vision of completely portable applications across platforms. And that’s why more and more industry leaders build
on Kubernetes to outperform their competitors. So why aren’t you using
Kubernetes? Well, when we ask that of the people that we
meet, the most common answer is that Kubernetes is very complicated, it’s too complicated for us, we don’t want to deal with it. Maybe that’s because we talk to a lot of small companies, a lot of startups, or a
lot of small innovation teams in larger companies, but that’s a very common answer. Another answer is that we have no time to migrate. We hear — our systems are running on AWS, we have our instances there, we have our pipelines there, we can’t move
to Kubernetes. It’s a lot of effort to migrate. And another common issue is, we don’t
have an ops team that really knows how to operate Kubernetes, that knows how to manage Kubernetes. And those problems are actually real. So that is not something
that is just a perception of people, it is fair, Kubernetes is very complicated,
you need a lot of experience to run it, and it takes time to migrate, and that’s
why we started an open-source project called DevSpace CLI, to kind of mitigate those issues. DevSpace CLI is a kind of Swiss Army knife
for Kubernetes. It helps you containerize any project very quickly, move it to
Kubernetes, then deploy it on top of Kubernetes, debug it on top of
Kubernetes, and if you’re really really looking for adrenaline, than you can also develop stuff directly on top of Kubernetes cluster. So
that’s the ultimate stuff that we don’t build stuff in our local environment, we build on top of Kubernetes too. DevSpace CLI is open source. It’s on GitHub, it has
an Apache license, has more than 1,200 commits so far, and has over 500
GitHub stars. So, if you’re wondering why — say you’re a startup, so you’re probably not just doing open source software, we also have a platform which is called DevSpace
Cloud, which offers hosted Kubernetes namespaces, so very easily you can get a
namespace instead of an entire cluster. You can do everything in that namespace.
You can create this namespace and have automatic SSL, automatic domain connections, a built-in private registry, and full access with kubectl, Helm, and all the other tools in the Kubernetes ecosystem. So in this namespace, you're really the master of it. You know, you're the admin; you can do whatever you want. One Space is free, so it's very easy to try out, and we're
currently working on getting DevSpace Cloud out on-prem too, so that a company can say, we want to offer that service to our development teams without having to use the hosted solution. So they take DevSpace Cloud, install it on their Kubernetes cluster (it works on any Kubernetes cluster), and then it can provide that service for their development teams. And from the user's point of view, DevSpace CLI and DevSpace Cloud work
very well together. So there's only one command in DevSpace CLI which is tied to DevSpace Cloud, and that's `devspace create space`, which actually creates a namespace and isolates that namespace within a Kubernetes cluster; all the rest is usable within the CLI tool. And that's what I want to show you right now. So this project here is a regular React
application. React is a front-end framework, so it might not be the best
use-case for running on top of Kubernetes, but it’s very easy to
demonstrate it with React because it's very easy to set up. And if we look at this project here on the left-hand side, it has no Dockerfile; it's not a containerized application. So the only thing I did before the session was run `create-react-app`, that's a standard way of creating a React application; it just says "Hello World", nothing else. And I installed DevSpace CLI, obviously. So if I want to containerize this project with
DevSpace CLI, I just run `devspace init`, and what DevSpace does now is detect my programming language: it takes a look at my project, analyzes it, and then asks a couple of questions. For example, it asks you to select the programming language of your project — it's already detected that we're using JavaScript here, but we can also switch to Python, or we can say none in case it didn't detect the correct language, and it will create some basic template which you then need to work on. But in the case of JavaScript we have a working version, so we can just select JavaScript here, and you can see that what it's done now is create a Dockerfile over here… So, I have a Dockerfile over here; you
see that there’s no Git repository, so you can actually see the files that have
changed. We see this .dockerignore file and Dockerfile. This Dockerfile is very basic, it
just inherits from a node image, then creates a folder app in our image,
marks it as a working directory. It copies the package JSON, which kind of defines
the application, the dependencies. Then it installs the dependencies with npm-install, then copies the rest of the application code, and then it starts our application,
so it marks the entry point as npm-start. If we run our initializing
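Spelled out, the generated Dockerfile looks roughly like this. This is a sketch of what's described above; the exact base image tag and layout depend on the DevSpace CLI version:

```dockerfile
FROM node:lts-alpine     # inherit from a node image (tag illustrative)
WORKDIR /app             # create /app and mark it as the working directory
COPY package.json ./     # copy the package.json that defines the dependencies
RUN npm install          # install the dependencies
COPY . .                 # copy the rest of the application code
CMD ["npm", "start"]     # entry point: npm start
```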
If we run our initializing command, `devspace init`, now, we will also see that a new folder has been created, chart, and that's actually where the Helm chart will be placed. There's one further question that DevSpace asks us: which port is the application listening on? For React applications that's port 3000, so I'm just confirming the default port here. And then there's this
question — do we want to use DevSpace Cloud or do we want to use our own
cluster? So if you’re already having kubectl installed, if you’re already connected to your kube cloud cluster you just say no, deploy it to your own namespace and work with DevSpace CLI without DevSpace Cloud, but if your organization in the
future is using DevSpace Cloud Enterprise or using DevSpace Cloud as a
free version, you can just go ahead and say yes here. A login window pops up, and
then we just log in with GitHub. Now that we're logged in, we go back to the command line, and we see our project is initialized: we have this configuration folder here, .devspace, and we have this chart folder here, which defines how our application is deployed to Kubernetes. And now we can very easily run `devspace create space` in this case, which creates a new namespace in DevSpace Cloud. As I said, if you're not using DevSpace Cloud, this step is not necessary, but if you are, then you can very easily get a Kubernetes namespace now. And then we can run `devspace deploy` to actually deploy our application to this newly created namespace. What DevSpace is doing right now is authenticating against the Docker registry — in this case it uses the Docker registry of DevSpace Cloud, but we can also tell it in the configuration to push to Docker Hub, push to our GitLab registry; really, any Docker registry would work in this case. It authenticates with the local credentials, so you can just log in with `docker login`, and our application is built as an image, pushed to this registry, and deployed to Kubernetes. And the last step would be `devspace open`, which actually opens our application: it takes a look at the ingress that we have defined and it sees the URL that our application is connected to, and as soon as the application is ready and our pods are started, it will open up the URL where we can actually see the application running. If we want to check the status of our deployment, we can also use kubectl for that. So, whenever you run – oh, we see the application running now…
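Put together, the end-to-end flow of the demo looks roughly like this. The commands are the ones named in the talk, but flags and exact subcommands may differ across DevSpace CLI versions, and the app name is a placeholder:

```sh
npx create-react-app hello-world    # scaffold the demo React app
cd hello-world
devspace init                       # detect language, generate Dockerfile + chart
devspace create space hello-world   # optional: get a namespace on DevSpace Cloud
devspace deploy                     # build the image, push it, deploy to Kubernetes
devspace open                       # look up the ingress URL and open the app
kubectl get pods                    # standard tooling still works in the namespace
```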
So, it took me probably just three or four minutes to get this React application running on top of Kubernetes, and you need only a minimum of Kubernetes experience for that. Actually, you only need a little bit of knowledge about your application: you need to know
the programming language, you need to know the port, and it creates all the
resources that you need. And the great thing is that DevSpace doesn’t limit
you to anything. I mean, you can deploy applications with Heroku in a minute; you just install a CLI and run two or three console commands. But then you're locked into Heroku. In our case, you have a Dockerfile and a Helm chart now, and you could deploy to any Kubernetes cluster. It's really just standard: we're not changing anything in your application, you don't have to adapt anything, we don't lock you in. It's pure Kubernetes resources, a pure Dockerfile, and it would work with any CI/CD pipeline. And the cool thing is, if we have kubectl installed, we can actually take a look, for example, at the pods in our namespace and see that we have a Tiller service running. We deploy with Helm charts… So we have this server-side component
called Tiller, which actually runs our applications, and then we have our application running in a pod (you can see it here, the "4778" and so on), which runs our containers. And it's very easy to directly access that Kubernetes namespace without even going through DevSpace, so you could extend this in any way you want; it's compatible with any standard Kubernetes procedures and other CLI tools that you might already be using, or that you find later on and want to use, and that's really the power of it. And there are a couple of other convenience commands that you can see here. There's, for example, `devspace analyze`, which
automatically analyzes issues with your application. So let's say our application wouldn't start correctly: our pod would keep on crashing, our container would have an error log and would crash all the time. Then we'd need to see which pod is crashing and get the logs of that, and it's a lot of repetitive work. Or let's say a service doesn't have an endpoint: we'd have to first get a list of services, get a list of endpoints, and see which endpoints are not mapping to the service. All these kinds of checks that you would have to do if your application is not showing up at its intended URL, those checks are all inside of `devspace analyze`. It checks all these things in parallel and just gives you a report of issues, which really shows you the problems. So for example, if the entry point is not working, the analyzer will tell you: this pod has issues with the entry point, this image is not working, this was the last output. It's very easy to fix that, and it doesn't take 20 minutes to find out where the issue in Kubernetes actually is. And there's a couple of
other commands, like `devspace logs` and `devspace enter` to actually open a terminal. And as I said, there's one more command, which is called `devspace dev`, which actually starts our application in development mode. What happens here is a very similar procedure to the deployment, but the difference is that we overwrite the entrypoints of the images. That means we do not actually start the containers normally; we start them in a sleep mode. The containers are there and the application is packaged in there, but it's not being started. With that, we see here that we also do port forwarding for the port that we specified beforehand, and for a couple of other ports that we may specify later, and we do code synchronization between our local repository and the remote containers. And now we actually have a terminal inside the running pod, inside the running container, which has not started the application. You can see that beforehand we marked /app as the working directory; we end up in that directory, and can now run `npm start` here to start our React application, and then our React application starts up inside the container that runs remotely. Then we can easily change any JavaScript file, it will be directly synchronized into the container, and our application will restart.
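In command form, the development loop just described looks roughly like this (a sketch; the forwarded port and working directory follow the demo's React setup):

```sh
devspace dev   # deploy in dev mode: entrypoints sleep, port 3000 is forwarded,
               # and local files sync into the remote container; you get a
               # terminal inside the pod, in the /app working directory
npm start      # then start React inside the remote container
```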
So if you have hot reloading (like nodemon, for example; I think Flask and other Python tools have this for most web frameworks), that works very easily: if I change a file, the application hot reloads. And we don't need to rebuild the image,
restart the container, and that saves a lot of time if we’re actually debugging
issues that might happen if we have four or five microservices, which communicate
with each other, which is very hard to reproduce in the local development environment. Actually, running React always takes a couple of seconds… and there we go, and now we have the output of this React application in development mode, and we can just access the application on localhost, just as if it were running on our local
machine. If I change a file now it will automatically reload, restart the
application, and we can also test it again on localhost. And this port
forwarding is very interesting because we can also attach remote debuggers
very easily without worrying about authentication. I can just point my Visual Studio Code, in this case, at localhost port 6675 to listen to the debugger, start that debugger in the container – and actually remote-debug
inside a Kubernetes environment. So you can see there’s a lot of possibilities
in how you use DevSpace CLI, and DevSpace Cloud as a kind of extension, to make it even easier for users to get started with Kubernetes. And if you have any questions regarding the open-source project, regarding DevSpace Cloud, or regarding SkyDeck and the program, just come up to me later on; I'm really happy to chat. Thank you. (applause)
