
DOING BIG DATA FOR REAL WITH DOCKER
MESOSPHERE DCOS
Elizabeth Lingg
elizabeth@mesosphere.io
AGENDA
1. Intro
2. Mesosphere, Docker, and DCOS Overview
3. Big Data Container Orchestration using DCOS and Docker
4. Demo
5. Q&A
INTRO
Engineering Manager @ Mesosphere
M.S. Computer Science with a Specialization in Artificial
Intelligence from Stanford
B.S. Computer Science with a Minor in Math, B.S. Policy
and Management from Carnegie Mellon
Experience in AI, Big Data, and Systems
Enjoys applying Distributed Systems to Manage and
Reason Over Large Amounts of Data
MESOS
Provides primitives to author datacenter-native apps.
PRIMITIVES
Resources (cpu, mem, disk, ports)
Asset fetching
Task state tracking
API for the datacenter
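As a rough illustration of the resource primitive, the sketch below models a bundle of cpu, mem, disk and ports and checks whether a task fits inside an offer. The `Resources` class and the `fits` function are hypothetical names made up for this example; they are not part of the Mesos API, and the units (MB for mem and disk) are an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resources:
    """Hypothetical bundle of the four resource kinds named on the slide."""
    cpus: float
    mem: int    # assumed to be MB
    disk: int   # assumed to be MB
    ports: frozenset = field(default_factory=frozenset)


def fits(task: Resources, offer: Resources) -> bool:
    """Return True if every resource the task needs is covered by the offer."""
    return (
        task.cpus <= offer.cpus
        and task.mem <= offer.mem
        and task.disk <= offer.disk
        and task.ports <= offer.ports  # subset check on the port set
    )


offer = Resources(cpus=4.0, mem=8192, disk=51200,
                  ports=frozenset(range(31000, 31010)))
task = Resources(cpus=3.0, mem=6144, disk=30720, ports=frozenset({31000}))
too_big = Resources(cpus=8.0, mem=1024, disk=1024)

print(fits(task, offer))
print(fits(too_big, offer))
```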
STATUS QUO IS STATIC
PARTITIONING
AND USE OF VIRTUAL MACHINES
MESOS LETS US TREAT A CLUSTER OF
NODES...
AS ONE BIG COMPUTER
 
Not as individual Not as VMs
machines
BUT AS COMPUTATIONAL
RESOURCES LIKE CORES, MEMORY,
DISKS, ETC.
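One way to picture the "one big computer" idea is to sum the capacity of every node into a single pool. The node list and the `pool` function below are invented for illustration only; a real Mesos cluster tracks this through resource offers rather than a static table.

```python
# Hypothetical per-node capacities (mem and disk assumed to be MB).
nodes = [
    {"cpus": 8, "mem": 32768, "disk": 512000},
    {"cpus": 16, "mem": 65536, "disk": 1024000},
    {"cpus": 8, "mem": 32768, "disk": 512000},
]


def pool(nodes):
    """Add up each resource kind across all nodes."""
    total = {}
    for node in nodes:
        for name, amount in node.items():
            total[name] = total.get(name, 0) + amount
    return total


cluster = pool(nodes)
print(cluster)
```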
WE LOVE CONTAINERS
MOST MODERN APPLICATIONS ARE A WEB OF
CONTAINERS
A CONTAINER ORCHESTRATION PLATFORM
Containerization in Mesos, a brief history
MESOSPHERE DCOS
Software to provide a complete OS: init, cron, apt-get,
discovery, routing
Beautiful web UI and CLI
Support
Ecosystem of DCOS Services
Mesos Master and Mesos Workers Running in Docker
Containers
DCOS UI
DCOS CLI
$ dcos
Command line utility for the Mesosphere Datacenter Operating
System (DCOS). The Mesosphere DCOS is a distributed operating
system built around Apache Mesos. This utility provides tools
for easy management of a DCOS installation.

Available DCOS commands:

    config      Get and set DCOS CLI configuration properties
    help        Display command line usage information
    marathon    Deploy and manage applications on the DCOS
    node        Manage DCOS nodes
    package     Install and manage DCOS software packages
    service     Manage DCOS services
    task        Manage DCOS tasks
BIG DATA DISTRIBUTED
APPLICATIONS ON DCOS
Mesos Master and Mesos Workers Running in Docker
Containers
Distributed Applications Running in Containers on the
Mesos Workers
Container Orchestration done by Apache Mesos
Resource Allocation and Scaling Managed by Apache
Mesos
BIG DATA DISTRIBUTED
APPLICATIONS ON DCOS
Popular Distributed Apps easily deployed on a single
DCOS Cluster
Kafka, Cassandra, HDFS, Spark, and other Big Data
Services
Health checks and failure recovery are automated
APPLICATION NETWORKING
Use the CLI or REST APIs to interact with the
services
Mesos DNS Resolution
Docker networking is done mainly through host-mode
networking, which works seamlessly
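To make the Mesos DNS point concrete, the sketch below builds the kind of name a task is typically reachable under, following the `<task>.<framework>.<domain>` pattern with `mesos` as the default domain. The helper name `mesos_dns_name` and the task name `web` are hypothetical, and the exact naming in a given cluster depends on its Mesos-DNS configuration.

```python
def mesos_dns_name(task: str, framework: str = "marathon",
                   domain: str = "mesos") -> str:
    """Build a Mesos-DNS style hostname for a task (assumed naming pattern)."""
    return f"{task}.{framework}.{domain}"


name = mesos_dns_name("web")
print(name)

# Inside a cluster that runs Mesos-DNS, the name could then be resolved
# with the standard library, for example:
#   import socket
#   socket.gethostbyname(name)
```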
DATA SECURITY
Services storing secure data run on private worker nodes
in the cluster
Private nodes can only be accessed through VPN
As needed, services choose what is exposed through a
proxy running on a public node
Distributed applications can authenticate with the Master
using Framework Authentication (Kerberos option)
EXAMPLE: SIMPLE DOCKER APP ON
DCOS
{
  "id": "/mesosphere/cd-demo-app",
  "instances": 1,
  "cpus": 1,
  "mem": 512,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "mesosphere/cd-demo-app:$tag",
      "network": "BRIDGE",
      "portMappings": [
        {
          "servicePort": 28080,
          "containerPort": 80,
          "hostPort": 0,
          "protocol": "tcp"
        }
      ]
    }
  }
}
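The `$tag` placeholder in the image name has to be filled in before the definition is submitted. A minimal sketch of that step, assuming the placeholder is substituted with Python's `string.Template`: the tag value `v1` and the function name `render_app` are hypothetical, and the closing brackets of the JSON are a completion of the truncated slide.

```python
import json
from string import Template

# App definition from the slide, closed off so that it parses as JSON.
APP_TEMPLATE = Template("""
{
  "id": "/mesosphere/cd-demo-app",
  "instances": 1,
  "cpus": 1,
  "mem": 512,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "mesosphere/cd-demo-app:$tag",
      "network": "BRIDGE",
      "portMappings": [
        {
          "servicePort": 28080,
          "containerPort": 80,
          "hostPort": 0,
          "protocol": "tcp"
        }
      ]
    }
  }
}
""")


def render_app(tag: str) -> dict:
    """Substitute the image tag and parse the result as JSON."""
    return json.loads(APP_TEMPLATE.substitute(tag=tag))


app = render_app("v1")  # "v1" is a made-up tag
print(app["container"]["docker"]["image"])
```

The rendered definition could then be saved to a file and deployed through the `marathon` subcommand of the DCOS CLI.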
EXAMPLE: CASSANDRA DCOS
SERVICE
FEATURES
Managed node configuration
Health Monitoring
REST API
DNS Names for nodes
Multiple Rings in one cluster
INSTALL
$ dcos package install cassandra

CUSTOMIZABLE INSTALL OPTIONS


{
  "cassandra": {
    "cluster-name": "dev",
    "resources": {
      "cpus": 3.0,
      "mem": 6144,
      "disk": 30720
    }
  }
}

$ dcos package install cassandra --options=options.json
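The options file can also be generated rather than written by hand, which helps when several environments differ only in name and size. The sketch below reproduces the structure shown above; the function name `cassandra_options` is hypothetical, and treating `mem` and `disk` as MB is an assumption.

```python
import json


def cassandra_options(cluster_name: str, cpus: float,
                      mem: int, disk: int) -> dict:
    """Build the install options in the shape shown on the slide."""
    return {
        "cassandra": {
            "cluster-name": cluster_name,
            "resources": {"cpus": cpus, "mem": mem, "disk": disk},
        }
    }


options = cassandra_options("dev", cpus=3.0, mem=6144, disk=30720)

# Print the JSON; redirect it to options.json before running the install.
print(json.dumps(options, indent=2))
```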


INSTALLING
HEALTHY
REST API
GET /node/all

GET /health/cluster/report

POST /node/{node}/replace

POST /cluster/repair/start

POST /scale/nodes?nodeCount={count}
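A small client-side sketch of how these endpoints might be addressed. Only the paths come from the list above; the base URL, the node name `node-3`, and the helper names are hypothetical, and no request is actually sent.

```python
from urllib.parse import quote, urlencode

# Hypothetical address of the Cassandra service's REST API.
BASE_URL = "http://cassandra-service.example:9000"


def endpoint(path: str, **query) -> str:
    """Join the base URL, a path, and optional query parameters."""
    url = BASE_URL + path
    if query:
        url += "?" + urlencode(query)
    return url


def list_nodes():
    return ("GET", endpoint("/node/all"))


def replace_node(node: str):
    # Quote the node name so it is safe to place inside a URL path.
    return ("POST", endpoint(f"/node/{quote(node, safe='')}/replace"))


def scale_nodes(count: int):
    return ("POST", endpoint("/scale/nodes", nodeCount=count))


for method, url in (list_nodes(), replace_node("node-3"), scale_nodes(5)):
    print(method, url)
```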
DEMO!
Q&A
THANKS!
LET'S CHAT!
WE'RE HIRING!
DCOS: mesosphere.com
Join: mesosphere.com/careers/