Spark Environment Setup 03

The earlier notes covered integrating Spark with Hadoop; this one covers integrating Spark with HBase.

1. Get the HBase classpath

#Remove the netty and jetty jars from this classpath, otherwise they will conflict with Spark's own jars
HBASE_PATH=`/home/hadoop/Deploy/hbase-1.1.2/bin/hbase classpath`

2. Start the Spark shell

bin/spark-shell --driver-class-path $HBASE_PATH

3. Simple operations

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.io.ImmutableBytesWritable

val conf = HBaseConfiguration.create()
conf.set(TableInputFormat.INPUT_TABLE, "inpatient_hb")

val admin = new HBaseAdmin(conf)
admin.isTableAvailable("inpatient_hb")
res1: Boolean = true

val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat], classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable], classOf[org.apache.hadoop.hbase.client.Result])

hBaseRDD.count()
2017-01-03 20:46:29,854 INFO  [main] scheduler.DAGScheduler (Logging.scala:logInfo(58)) - Job 0 finished: count at <console>:36, took 23.170739 s
res2: Long = 115077
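
Each element of hBaseRDD is a (rowkey, Result) pair, so the usual RDD operations apply once the cells are decoded. A minimal sketch of pulling one column out of every row follows; the pinfo:NAME family/qualifier is an assumption borrowed from the Hive/HBase notes further down, so adjust it to the real table layout.

import org.apache.hadoop.hbase.util.Bytes

//hBaseRDD comes from the newAPIHadoopRDD call above: RDD[(ImmutableBytesWritable, Result)]
//"pinfo" and "NAME" are assumed column family/qualifier names, not verified against the table
val names = hBaseRDD.map { case (_, result) =>
  val rowKey = Bytes.toString(result.getRow)
  val name = Option(result.getValue(Bytes.toBytes("pinfo"), Bytes.toBytes("NAME")))
    .map(b => Bytes.toString(b))
    .getOrElse("")
  (rowKey, name)
}
names.take(5).foreach(println)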

Spark Environment Setup 02

1. Start Spark

sbin/start-all.sh

The Spark master web UI at http://hiup:8080/ shows the cluster status.

2. Start the shell

bin/spark-shell

3. Test

$ ./run-example SparkPi 10
Pi is roughly 3.140408

$ ./run-example SparkPi 100
Pi is roughly 3.1412528

$ ./run-example SparkPi 1000
Pi is roughly 3.14159016

4. Basic operations

#Load data from HDFS
scala> var textFile=sc.textFile("/usr/hadoop/inpatient.txt")

#First line
scala> textFile.first()
res1: String = "(contents of the first line)"

#First line, split on commas, first column (inpatient number)
textFile.first().split(",")(0)
res2: String = "0000718165"

#First line, split on commas, column 5 (cost)
textFile.first().split(",")(5)
res3: String = "100.01"

#Number of lines
scala> textFile.count()
res4: Long = 115411

#Number of lines containing "ICU"
textFile.filter(line=>line.contains("ICU")).count()
res5: Long = 912

#Length of each line
var lineLengths = textFile.map(s=>s.length)

#Total length
var totalLength = lineLengths.reduce((a,b)=>a+b)
totalLength: Int = 32859905

#Maximum cost
textFile.map(line=>if(line.split(",").size==30) line.split(",")(23).replace("\"","") else "0").reduce((a,b)=>if(a.toDouble>b.toDouble) a else b)
res6: String = 300
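
The reduce above assumes every kept line has a parseable number in column 23; a slightly more defensive sketch (same column-layout assumption) treats unparseable values as 0:

import scala.util.Try

//Defensive variant of the max-cost reduce: non-numeric values in column 23 fall back to 0.0
val maxCost = textFile
  .filter(line => line.split(",").size == 30)
  .map(line => Try(line.split(",")(23).replace("\"", "").toDouble).getOrElse(0.0))
  .reduce((a, b) => math.max(a, b))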

#Define a class
@SerialVersionUID(100L)
class PATIENT(var PATIENT_NO : String,var NAME : String,var SEX_CODE : String,var BIRTHDATE : String,var BALANCE_COST : String) extends Serializable 

#Create an instance
var p=new PATIENT("PATIENT_NO","NAME","SEX_CODE","BIRTHDATE","BALANCE_COST")

#Define a map function
def mapFunc(line:String) : PATIENT = {
var cols=line.split(",")
return new PATIENT(cols(0),cols(1),cols(2),cols(3),cols(4))
}
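
As an aside, a case class would express the same record a little more idiomatically: it is serializable out of the box and needs no explicit @SerialVersionUID. This is an alternative sketch, not part of the original session:

//Alternative sketch: a case class instead of the hand-written serializable PATIENT class
case class Patient(patientNo: String, name: String, sexCode: String, birthdate: String, balanceCost: String)

def toPatient(line: String): Patient = {
  val cols = line.split(",")
  Patient(cols(0), cols(1), cols(2), cols(3), cols(4))
}

//Used the same way as mapFunc below, e.g. textFile.filter(_.split(",").size == 30).map(toPatient)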

#Maximum cost
textFile.filter(line=>line.split(",").size==30).map(mapFunc).reduce((a,b)=>if(a.BALANCE_COST.replace("\"","").toDouble>b.BALANCE_COST.replace("\"","").toDouble) a else b).BALANCE_COST

#Maximum cost among male patients
textFile.filter(line=>line.split(",").size==30).map(mapFunc).filter(p=>p.SEX_CODE=="\"M\"").reduce((a,b)=>if(a.BALANCE_COST.replace("\"","").toDouble>b.BALANCE_COST.replace("\"","").toDouble) a else b).BALANCE_COST

#Maximum cost among female patients
textFile.filter(line=>line.split(",").size==30).map(mapFunc).filter(p=>p.SEX_CODE=="\"F\"").reduce((a,b)=>if(a.BALANCE_COST.replace("\"","").toDouble>b.BALANCE_COST.replace("\"","").toDouble) a else b).BALANCE_COST
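
The two per-sex reductions above each rescan the data; here is a sketch of computing both maxima in a single pass with reduceByKey, reusing mapFunc and the same quoting/column assumptions:

//Single pass: maximum BALANCE_COST per SEX_CODE
//Assumes BALANCE_COST always parses as a number once the surrounding quotes are stripped
val maxCostBySex = textFile
  .filter(line => line.split(",").size == 30)
  .map(mapFunc)
  .map(p => (p.SEX_CODE.replace("\"", ""), p.BALANCE_COST.replace("\"", "").toDouble))
  .reduceByKey((a, b) => if (a > b) a else b)
maxCostBySex.collect().foreach(println)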

#Exit the shell
scala> :quit

Spark Environment Setup 01

1. Download scala-2.11.1 and extract it to /usr/scala/scala-2.11.1

2. Download spark-2.0.0-bin-hadoop2.4 and extract it to /home/hadoop/Deploy/spark-2.0.0
(* To follow the later notes, the combination hadoop-2.5.2, hbase-1.1.2, hive-1.2.1 and spark-2.0.0 is recommended.)

3. Copy spark-env.sh.template to spark-env.sh and add the following lines

export JAVA_HOME=/usr/java/jdk1.7.0_79
export SCALA_HOME=/usr/scala/scala-2.11.1/
export SPARK_MASTER_IP=hiup01
export SPARK_WORKER_MEMORY=1g
export HADOOP_CONF_DIR=/home/hadoop/Deploy/hadoop-2.5.2/etc/hadoop

4. Copy slaves.template to slaves and add the following lines

hiup01
hiup02
hiup03

5. Copy scala-2.11.1 and spark-2.0.0 to hiup02 and hiup03

6. The environment setup is complete; a quick verification from the shell is sketched below.
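
The spark://hiup01:7077 URL below is an assumption based on SPARK_MASTER_IP=hiup01 and the default standalone port, so substitute the actual master URL shown on the web UI:

//Start the shell against the standalone master (assumed URL):
//  bin/spark-shell --master spark://hiup01:7077
//Then run a trivial job to confirm the workers pick it up:
println(sc.master)                          //expect spark://hiup01:7077
val n = sc.parallelize(1 to 100000).count() //a small distributed job
println(n)                                  //expect 100000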

Setting Up a Private Docker Registry

1. Install the registry

# sudo apt-get install docker docker-registry

2. Push an image
2.1. Allow HTTP on the client

$ sudo vi /etc/default/docker
#Add this line
DOCKER_OPTS="--insecure-registry 192.168.130.191:5000"

2.2. Push the image

#List local images
$ sudo docker images
REPOSITORY                           TAG                 IMAGE ID            CREATED             SIZE
elasticsearch                        5.1                 747929f3b12a        2 weeks ago         352.6 MB

#Tag the image for the private registry
$ sudo docker tag elasticsearch:5.1 192.168.130.191:5000/elasticsearch:5.1

#List local images again
$ sudo docker images
REPOSITORY                           TAG                 IMAGE ID            CREATED             SIZE
elasticsearch                        5.1                 747929f3b12a        2 weeks ago         352.6 MB
192.168.130.191:5000/elasticsearch   5.1                 747929f3b12a        2 weeks ago         352.6 MB

#Push the image
$ sudo docker push 192.168.130.191:5000/elasticsearch:5.1
The push refers to a repository [192.168.130.191:5000/elasticsearch]
cea33faf9668: Pushed
c3707daa9b07: Pushed
a56b404460eb: Pushed
5e48ecb24792: Pushed
f86173bb67f3: Pushed
c87433dfa8d7: Pushed
c9dbd14c23f0: Pushed
b5b4ba1cb64d: Pushed
15ba1125d6c0: Pushed
bd25fcff1b2c: Pushed
8d9c6e6ceb37: Pushed
bc3b6402e94c: Pushed
223c0d04a137: Pushed
fe4c16cbf7a4: Pushed
5.1: digest: sha256:14ec0b594c0bf1b007debc12e3a16a99aee74964724ac182bc851fec3fc5d2b0 size: 3248

3. Query the registry

$ curl -X GET http://192.168.130.191:5000/v2/_catalog
{"repositories":["alpine","elasticsearch","jetty","mongo","mysql","nginx","openjdk","redis","registry","ubuntu","zookeeper"]}

$ curl -X GET http://192.168.130.191:5000/v2/elasticsearch/tags/list
{"name":"elasticsearch","tags":["5.1"]}

#The two search queries below always return 404; the v2 registry API has no search endpoint (only /v2/_catalog for listing), so this is expected
$ curl -X GET http://192.168.130.191:5000/v2/search?q=elasticsearch
$ sudo docker search 192.168.130.191:5000/elasticsearch

4. Pull the image

$ sudo docker pull 192.168.130.191:5000/elasticsearch:5.1
5.1: Pulling from elasticsearch
386a066cd84a: Pull complete
75ea84187083: Pull complete
3e2e387eb26a: Pull complete
eef540699244: Pull complete
1624a2f8d114: Pull complete
7018f4ec6e0a: Pull complete
6ca3bc2ad3b3: Pull complete
424638b495a6: Pull complete
2ff72d0b7bea: Pull complete
d0d6a2049bf2: Pull complete
003b957bd67f: Pull complete
14d23bc515af: Pull complete
923836f4bd50: Pull complete
c0b5750bf0f7: Pull complete
Digest: sha256:14ec0b594c0bf1b007debc12e3a16a99aee74964724ac182bc851fec3fc5d2b0
Status: Downloaded newer image for 192.168.130.191:5000/elasticsearch:5.1

5. Delete the image

#Deletion uses the manifest digest (the sha256 from the push output above), not the tag, and the registry must have deletes enabled
$ curl -X DELETE http://192.168.130.191:5000/v2/elasticsearch/manifests/sha256:14ec0b594c0bf1b007debc12e3a16a99aee74964724ac182bc851fec3fc5d2b0

Reference: GitHub

The Official elasticsearch Docker Image Fails to Start

The official elasticsearch 5.1 Docker image fails at startup with:

ERROR: bootstrap checks failed
max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

That is because vm.max_map_count is below Elasticsearch's minimum requirement of 262144. There are two ways to change it:

#Takes effect immediately (lost on reboot)
sudo sysctl -w vm.max_map_count=262144
#Permanent
sudo vi /etc/sysctl.conf
#Add this line
vm.max_map_count=262144

#Reload the configuration
sudo sysctl -p

Common Docker Image Commands (Compose)

1. Pull images

#ubuntu-16.04.1-server-amd64
sudo apt-get install docker
sudo apt-get install docker-compose
#Pull the images
sudo docker pull mysql:5.7
sudo docker pull redis:3.2
sudo docker pull mongo:3.4
sudo docker pull jetty:9.3-jre8
sudo docker pull nginx:1.11
sudo docker pull elasticsearch:5.1
sudo docker pull ubuntu:16.04

2. Create a network

sudo docker network create hiup

3. Prepare the yml file
3.1. yml, version 1 format

h01-mysql02:
  image: mysql:5.7
  mem_limit: 128m
  cpu_shares: 100
  tty: true
  hostname: h01-mysql02
  net: hiup
  ports:
    - "3306:3306"
  volumes:
   - /home/hiup/docker/data/mysql/var/lib/mysql:/var/lib/mysql
   - /home/hiup/docker/data/mysql/etc/mysql/conf.d:/etc/mysql/conf.d
  environment:
   - MYSQL_ROOT_PASSWORD=hiup
  log_driver: "json-file"
  log_opt:
    max-size: "10m"
    max-file: "10"

h01-redis02:
  image: redis:3.2
  mem_limit: 128m
  cpu_shares: 100
  tty: true
  hostname: h01-redis02
  net: hiup
  volumes:
   - /home/hiup/docker/data/redis/etc/redis/:/etc/redis/
   - /home/hiup/docker/data/redis/data:/data
  ports:
   - "6379:6379"
  command: redis-server /etc/redis/redis.conf
  log_driver: "json-file"
  log_opt:
    max-size: "10m"
    max-file: "10"

h01-mongo02:
  image: mongo:3.4
  mem_limit: 128m
  cpu_shares: 100
  tty: true
  hostname: h01-mongo02
  net: hiup
  ports:
   - "27017:27017"
  volumes:
   - /home/hiup/docker/data/mongo/etc/mongod.conf:/etc/mongod.conf
   - /home/hiup/docker/data/mongo/data/db:/data/db
  log_driver: "json-file"
  log_opt:
    max-size: "10m"
    max-file: "10"

h01-jetty02:
  image: jetty:9.3-jre8
  mem_limit: 128m
  cpu_shares: 100
  tty: true
  hostname: h01-jetty02
  net: hiup
  ports:
   - "8080:8080"
  volumes:
   - /home/hiup/docker/data/jetty/usr/local/jetty/etc:/usr/local/jetty/etc
   - /home/hiup/docker/data/jetty/webapps:/var/lib/jetty/webapps
  log_driver: "json-file"
  log_opt:
    max-size: "10m"
    max-file: "10"

h01-nginx02:
  image: nginx:1.11
  mem_limit: 128m
  cpu_shares: 100
  tty: true
  hostname: h01-nginx02
  net: hiup
  ports:
   - "80:80"
  volumes:
   - /home/hiup/docker/data/nginx/etc/nginx/nginx.conf:/etc/nginx/nginx.conf
   - /home/hiup/docker/data/nginx/usr/share/nginx/html:/usr/share/nginx/html
  log_driver: "json-file"
  log_opt:
    max-size: "10m"
    max-file: "10"

h01-es02:
  image: elasticsearch:5.1
  mem_limit: 640m
  cpu_shares: 100
  tty: true
  hostname: h01-es02
  net: hiup
  ports:
   - "9200:9200"
   - "9300:9300"
  volumes:
   - /home/hiup/docker/data/es/usr/share/elasticsearch/config:/usr/share/elasticsearch/config
   - /home/hiup/docker/data/es/usr/share/elasticsearch/data:/usr/share/elasticsearch/data
  command: elasticsearch
  log_driver: "json-file"
  log_opt:
    max-size: "10m"
    max-file: "10"

h01-ubuntu02:
  image: ubuntu:16.04
  mem_limit: 128m
  cpu_shares: 100
  tty: true
  hostname: h01-ubuntu02
  net: hiup
  #ports:
  #volumes:
  command: /bin/bash
  log_driver: "json-file"
  log_opt:
    max-size: "10m"
    max-file: "10"

3.2. yml, version 2 format

version: '2'
services:
  h01-mysql02:
    image: mysql:5.7
    mem_limit: 128m
    cpu_shares: 100
    tty: true
    hostname: h01-mysql02
    network_mode: hiup
    ports:
      - "3306:3306"
    volumes:
     - /home/hiup/docker/data/mysql/var/lib/mysql:/var/lib/mysql
     - /home/hiup/docker/data/mysql/etc/mysql/conf.d:/etc/mysql/conf.d
    environment:
     - MYSQL_ROOT_PASSWORD=hiup
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "10"
  
  h01-redis02:
    image: redis:3.2
    mem_limit: 128m
    cpu_shares: 100
    tty: true
    hostname: h01-redis02
    network_mode: hiup
    volumes:
     - /home/hiup/docker/data/redis/etc/redis/:/etc/redis/
     - /home/hiup/docker/data/redis/data:/data
    ports:
     - "6379:6379"
    command: redis-server /etc/redis/redis.conf
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "10"
  
  h01-mongo02:
    image: mongo:3.4
    mem_limit: 128m
    cpu_shares: 100
    tty: true
    hostname: h01-mongo02
    network_mode: hiup
    ports:
     - "27017:27017"
    volumes:
     - /home/hiup/docker/data/mongo/etc/mongod.conf:/etc/mongod.conf
     - /home/hiup/docker/data/mongo/data/db:/data/db
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "10"
  
  h01-jetty02:
    image: jetty:9.3-jre8
    mem_limit: 128m
    cpu_shares: 100
    tty: true
    hostname: h01-jetty02
    network_mode: hiup
    ports:
     - "8080:8080"
    volumes:
     - /home/hiup/docker/data/jetty/usr/local/jetty/etc:/usr/local/jetty/etc
     - /home/hiup/docker/data/jetty/webapps:/var/lib/jetty/webapps
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "10"
  
  h01-nginx02:
    image: nginx:1.11
    mem_limit: 128m
    cpu_shares: 100
    tty: true
    hostname: h01-nginx02
    network_mode: hiup
    ports:
     - "80:80"
    volumes:
     - /home/hiup/docker/data/nginx/etc/nginx/nginx.conf:/etc/nginx/nginx.conf
     - /home/hiup/docker/data/nginx/usr/share/nginx/html:/usr/share/nginx/html
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "10"
  
  h01-es02:
    image: elasticsearch:5.1
    mem_limit: 640m
    cpu_shares: 100
    tty: true
    hostname: h01-es02
    network_mode: hiup
    ports:
     - "9200:9200"
     - "9300:9300"
    volumes:
     - /home/hiup/docker/data/es/usr/share/elasticsearch/config:/usr/share/elasticsearch/config
     - /home/hiup/docker/data/es/usr/share/elasticsearch/data:/usr/share/elasticsearch/data
    command: elasticsearch
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "10"
  
  h01-ubuntu02:
    image: ubuntu:16.04
    mem_limit: 128m
    cpu_shares: 100
    tty: true
    hostname: h01-ubuntu02
    network_mode: hiup
    #ports:
    #volumes:
    command: /bin/bash
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "10"

4. Run

sudo docker-compose up -d

5. Configuration file layout

.
├── es
│   └── usr
│       └── share
│           └── elasticsearch
│               ├── config
│               │   ├── elasticsearch.yml
│               │   ├── log4j2.properties
│               │   └── scripts
│               └── data
│                   └── nodes
│                       └── 0
│                           ├── node.lock
│                           └── _state
│                               ├── global-0.st
│                               └── node-0.st
├── jetty
│   ├── usr
│   │   └── local
│   │       └── jetty
│   │           └── etc
│   │               ├── example-quickstart.xml
│   │               ├── gcloud-memcached-session-context.xml
│   │               ├── gcloud-session-context.xml
│   │               ├── hawtio.xml
│   │               ├── home-base-warning.xml
│   │               ├── jamon.xml
│   │               ├── jdbcRealm.properties
│   │               ├── jetty-alpn.xml
│   │               ├── jetty-annotations.xml
│   │               ├── jetty-cdi.xml
│   │               ├── jetty.conf
│   │               ├── jetty-debuglog.xml
│   │               ├── jetty-debug.xml
│   │               ├── jetty-deploy.xml
│   │               ├── jetty-gcloud-memcached-sessions.xml
│   │               ├── jetty-gcloud-session-idmgr.xml
│   │               ├── jetty-gcloud-sessions.xml
│   │               ├── jetty-gzip.xml
│   │               ├── jetty-http2c.xml
│   │               ├── jetty-http2.xml
│   │               ├── jetty-http-forwarded.xml
│   │               ├── jetty-https.xml
│   │               ├── jetty-http.xml
│   │               ├── jetty-infinispan.xml
│   │               ├── jetty-ipaccess.xml
│   │               ├── jetty-jaas.xml
│   │               ├── jetty-jdbc-sessions.xml
│   │               ├── jetty-jmx-remote.xml
│   │               ├── jetty-jmx.xml
│   │               ├── jetty-logging.xml
│   │               ├── jetty-lowresources.xml
│   │               ├── jetty-monitor.xml
│   │               ├── jetty-nosql.xml
│   │               ├── jetty-plus.xml
│   │               ├── jetty-proxy-protocol-ssl.xml
│   │               ├── jetty-proxy-protocol.xml
│   │               ├── jetty-proxy.xml
│   │               ├── jetty-requestlog.xml
│   │               ├── jetty-rewrite-customizer.xml
│   │               ├── jetty-rewrite.xml
│   │               ├── jetty-setuid.xml
│   │               ├── jetty-spring.xml
│   │               ├── jetty-ssl-context.xml
│   │               ├── jetty-ssl.xml
│   │               ├── jetty-started.xml
│   │               ├── jetty-stats.xml
│   │               ├── jetty-threadlimit.xml
│   │               ├── jetty.xml
│   │               ├── jminix.xml
│   │               ├── jolokia.xml
│   │               ├── krb5.ini
│   │               ├── README.spnego
│   │               ├── rewrite-compactpath.xml
│   │               ├── spnego.conf
│   │               ├── spnego.properties
│   │               └── webdefault.xml
│   └── webapps
│       └── jvmjsp.war
├── mongo
│   ├── data
│   │   └── db
│   │       ├── collection-0-4376730799513530636.wt
│   │       ├── collection-2-4376730799513530636.wt
│   │       ├── collection-5-4376730799513530636.wt
│   │       ├── diagnostic.data
│   │       │   └── metrics.2016-12-27T08-57-50Z-00000
│   │       ├── index-1-4376730799513530636.wt
│   │       ├── index-3-4376730799513530636.wt
│   │       ├── index-4-4376730799513530636.wt
│   │       ├── index-6-4376730799513530636.wt
│   │       ├── journal
│   │       │   ├── WiredTigerLog.0000000001
│   │       │   ├── WiredTigerPreplog.0000000001
│   │       │   └── WiredTigerPreplog.0000000002
│   │       ├── _mdb_catalog.wt
│   │       ├── mongod.lock
│   │       ├── sizeStorer.wt
│   │       ├── storage.bson
│   │       ├── WiredTiger
│   │       ├── WiredTigerLAS.wt
│   │       ├── WiredTiger.lock
│   │       ├── WiredTiger.turtle
│   │       └── WiredTiger.wt
│   └── etc
│       └── mongod.conf
├── mysql
│   ├── etc
│   │   └── mysql
│   │       └── conf.d
│   │           ├── docker.cnf
│   │           └── mysql.cnf
│   └── var
│       └── lib
│           └── mysql
│               ├── auto.cnf
│               ├── ib_buffer_pool
│               ├── ibdata1
│               ├── ib_logfile0
│               ├── ib_logfile1
│               ├── mysql [error opening dir]
│               ├── performance_schema [error opening dir]
│               └── sys [error opening dir]
├── nginx
│   ├── etc
│   │   └── nginx
│   │       └── nginx.conf
│   └── usr
│       └── share
│           └── nginx
│               └── html
│                   ├── 50x.html
│                   └── index.html
└── redis
    ├── data
    │   └── dump.rdb
    └── etc
        └── redis
            ├── redis.conf
            └── sentinel.conf

6. Install local client tools

sudo apt-get install mysql-client
sudo apt-get install redis-tools
sudo apt-get install mongodb-clients
sudo apt-get install curl

Common Docker Image Commands (Shell)

1. Pull images

#ubuntu-16.04.1-server-amd64
sudo apt-get install docker
sudo apt-get install docker-compose
#Pull the images
sudo docker pull mysql:5.7
sudo docker pull redis:3.2
sudo docker pull mongo:3.4
sudo docker pull jetty:9.3-jre8
sudo docker pull nginx:1.11
sudo docker pull elasticsearch:5.1
sudo docker pull ubuntu:16.04

2. Create a network

sudo docker network create hiup

3. Start the containers
3.1. First start of each container

#mysql
sudo docker run --net=hiup --name h01-mysql01 -h h01-mysql01 -p3306:3306 -c 100 -m 128m -e MYSQL_ROOT_PASSWORD=hiup -v /home/hiup/docker/data/mysql/var/lib/mysql:/var/lib/mysql -v /home/hiup/docker/data/mysql/etc/mysql/conf.d:/etc/mysql/conf.d -itd mysql:5.7

#redis
sudo docker run --net=hiup --name h01-redis01 -h h01-redis01 -p6379:6379 -c 100 -m 128m  -v /home/hiup/docker/data/redis/etc/redis/:/etc/redis/ -v /home/hiup/docker/data/redis/data:/data -itd redis:3.2 redis-server /etc/redis/redis.conf
#The option below enables persistence
#redis-server --appendonly yes

#mongodb
sudo docker run --net=hiup --name h01-mongo01 -h h01-mongo01 -p27017:27017 -c 100 -m 128m -v /home/hiup/docker/data/mongo/etc/mongod.conf:/etc/mongod.conf -v /home/hiup/docker/data/mongo/data/db:/data/db -itd mongo:3.4
#Enable authentication with
#--auth

#jetty
sudo docker run --net=hiup --name h01-jetty01 -h h01-jetty01 -p8080:8080 -c 100 -m 128m -v /home/hiup/docker/data/jetty/usr/local/jetty/etc:/usr/local/jetty/etc -v /home/hiup/docker/data/jetty/webapps:/var/lib/jetty/webapps -itd jetty:9.3-jre8
#Default environment variables
#JETTY_HOME    =  /usr/local/jetty
#JETTY_BASE    =  /var/lib/jetty
#TMPDIR        =  /tmp/jetty
#Deploy dir is /var/lib/jetty/webapps
#Memory setting
#-e JAVA_OPTIONS="-Xmx1g"
#List the configuration options
#--list-config

#nginx
sudo docker run --net=hiup --name h01-nginx01 -h h01-nginx01 -p80:80 -c 100 -m 128m -v /home/hiup/docker/data/nginx/etc/nginx/nginx.conf:/etc/nginx/nginx.conf -v /home/hiup/docker/data/nginx/usr/share/nginx/html:/usr/share/nginx/html -itd nginx:1.11

#elasticsearch
sudo docker run --net=hiup --name h01-es01 -h h01-es01 -p9200:9200 -p9300:9300 -c 100 -m 640m -v /home/hiup/docker/data/es/usr/share/elasticsearch/config:/usr/share/elasticsearch/config -v /home/hiup/docker/data/es/usr/share/elasticsearch/data:/usr/share/elasticsearch/data -itd elasticsearch:5.1

#ubuntu
sudo docker run --net=hiup --name h01-ubuntu01 -h h01-ubuntu01 -c 100 -m 128m -itd ubuntu:16.04
sudo docker attach h01-ubuntu01

3.2. Subsequent starts of the containers (n > 1)

sudo docker start h01-mysql01
sudo docker start h01-redis01
sudo docker start h01-mongo01
sudo docker start h01-jetty01
sudo docker start h01-nginx01
sudo docker start h01-es01
sudo docker start h01-ubuntu01

4. Configuration file layout

.
├── es
│   └── usr
│       └── share
│           └── elasticsearch
│               ├── config
│               │   ├── elasticsearch.yml
│               │   ├── log4j2.properties
│               │   └── scripts
│               └── data
│                   └── nodes
│                       └── 0
│                           ├── node.lock
│                           └── _state
│                               ├── global-0.st
│                               └── node-0.st
├── jetty
│   ├── usr
│   │   └── local
│   │       └── jetty
│   │           └── etc
│   │               ├── example-quickstart.xml
│   │               ├── gcloud-memcached-session-context.xml
│   │               ├── gcloud-session-context.xml
│   │               ├── hawtio.xml
│   │               ├── home-base-warning.xml
│   │               ├── jamon.xml
│   │               ├── jdbcRealm.properties
│   │               ├── jetty-alpn.xml
│   │               ├── jetty-annotations.xml
│   │               ├── jetty-cdi.xml
│   │               ├── jetty.conf
│   │               ├── jetty-debuglog.xml
│   │               ├── jetty-debug.xml
│   │               ├── jetty-deploy.xml
│   │               ├── jetty-gcloud-memcached-sessions.xml
│   │               ├── jetty-gcloud-session-idmgr.xml
│   │               ├── jetty-gcloud-sessions.xml
│   │               ├── jetty-gzip.xml
│   │               ├── jetty-http2c.xml
│   │               ├── jetty-http2.xml
│   │               ├── jetty-http-forwarded.xml
│   │               ├── jetty-https.xml
│   │               ├── jetty-http.xml
│   │               ├── jetty-infinispan.xml
│   │               ├── jetty-ipaccess.xml
│   │               ├── jetty-jaas.xml
│   │               ├── jetty-jdbc-sessions.xml
│   │               ├── jetty-jmx-remote.xml
│   │               ├── jetty-jmx.xml
│   │               ├── jetty-logging.xml
│   │               ├── jetty-lowresources.xml
│   │               ├── jetty-monitor.xml
│   │               ├── jetty-nosql.xml
│   │               ├── jetty-plus.xml
│   │               ├── jetty-proxy-protocol-ssl.xml
│   │               ├── jetty-proxy-protocol.xml
│   │               ├── jetty-proxy.xml
│   │               ├── jetty-requestlog.xml
│   │               ├── jetty-rewrite-customizer.xml
│   │               ├── jetty-rewrite.xml
│   │               ├── jetty-setuid.xml
│   │               ├── jetty-spring.xml
│   │               ├── jetty-ssl-context.xml
│   │               ├── jetty-ssl.xml
│   │               ├── jetty-started.xml
│   │               ├── jetty-stats.xml
│   │               ├── jetty-threadlimit.xml
│   │               ├── jetty.xml
│   │               ├── jminix.xml
│   │               ├── jolokia.xml
│   │               ├── krb5.ini
│   │               ├── README.spnego
│   │               ├── rewrite-compactpath.xml
│   │               ├── spnego.conf
│   │               ├── spnego.properties
│   │               └── webdefault.xml
│   └── webapps
│       └── jvmjsp.war
├── mongo
│   ├── data
│   │   └── db
│   │       ├── collection-0-4376730799513530636.wt
│   │       ├── collection-2-4376730799513530636.wt
│   │       ├── collection-5-4376730799513530636.wt
│   │       ├── diagnostic.data
│   │       │   └── metrics.2016-12-27T08-57-50Z-00000
│   │       ├── index-1-4376730799513530636.wt
│   │       ├── index-3-4376730799513530636.wt
│   │       ├── index-4-4376730799513530636.wt
│   │       ├── index-6-4376730799513530636.wt
│   │       ├── journal
│   │       │   ├── WiredTigerLog.0000000001
│   │       │   ├── WiredTigerPreplog.0000000001
│   │       │   └── WiredTigerPreplog.0000000002
│   │       ├── _mdb_catalog.wt
│   │       ├── mongod.lock
│   │       ├── sizeStorer.wt
│   │       ├── storage.bson
│   │       ├── WiredTiger
│   │       ├── WiredTigerLAS.wt
│   │       ├── WiredTiger.lock
│   │       ├── WiredTiger.turtle
│   │       └── WiredTiger.wt
│   └── etc
│       └── mongod.conf
├── mysql
│   ├── etc
│   │   └── mysql
│   │       └── conf.d
│   │           ├── docker.cnf
│   │           └── mysql.cnf
│   └── var
│       └── lib
│           └── mysql
│               ├── auto.cnf
│               ├── ib_buffer_pool
│               ├── ibdata1
│               ├── ib_logfile0
│               ├── ib_logfile1
│               ├── mysql [error opening dir]
│               ├── performance_schema [error opening dir]
│               └── sys [error opening dir]
├── nginx
│   ├── etc
│   │   └── nginx
│   │       └── nginx.conf
│   └── usr
│       └── share
│           └── nginx
│               └── html
│                   ├── 50x.html
│                   └── index.html
└── redis
    ├── data
    │   └── dump.rdb
    └── etc
        └── redis
            ├── redis.conf
            └── sentinel.conf

5. Install local client tools

sudo apt-get install mysql-client
sudo apt-get install redis-tools
sudo apt-get install mongodb-clients
sudo apt-get install curl

Docker Networking Modes

1. Common Docker networking concepts
Port expose:
declares that an image exposes a given port

Port binding:
maps a container port to a port on the host

Linking:
if container B is linked to container A, B can reach A

Network:
the networking mode of a container

2. Docker supports the following networking modes by default:
none: no networking
host: shares the host's network interface
bridge: routed through the docker0 bridge
container: shares another container's network stack; containers that share it can reach each other
user-defined network: containers on the same network can reach each other and know each other's hostnames
user-defined overlay network: mainly for communication between containers on different hosts

Tired; this is just an outline for now...

Importing Data Between Hive and HBase

1. Hive to HBase
1.1. Create the Hive table

create table inpatient_hv(
PATIENT_NO String COMMENT 'Inpatient number',
NAME String COMMENT 'Name',
SEX_CODE String COMMENT 'Sex',
BIRTHDATE TIMESTAMP COMMENT 'Birth date',
BALANCE_COST String COMMENT 'Total cost')
COMMENT 'Basic inpatient information'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '\\'
STORED AS TEXTFILE;

1.2. Load data into the Hive table

load data inpath '/usr/hadoop/inpatient.txt' into table inpatient_hv;

1.3. Create the HBase-backed Hive table

create table inpatient_hb(
PATIENT_NO String COMMENT 'Inpatient number',
NAME String COMMENT 'Name',
SEX_CODE String COMMENT 'Sex',
BIRTHDATE TIMESTAMP COMMENT 'Birth date',
BALANCE_COST String COMMENT 'Total cost')
COMMENT 'Basic inpatient information'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,pinfo:NAME,pinfo:SEX_CODE,pinfo:BIRTHDATE,pinfo:BALANCE_COST")
TBLPROPERTIES ("hbase.table.name" = "inpatient_hb");

1.4. Copy the data from Hive into HBase

INSERT OVERWRITE TABLE inpatient_hb SELECT * FROM inpatient_hv;

2. HBase to Hive
2.1. Create the HBase table

create 'inpatient_hb','pinfo'

2.2. Load data into the HBase table

./hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="," -Dimporttsv.columns=HBASE_ROW_KEY,pinfo:INPATIENT_NO,pinfo:NAME,pinfo:SEX_CODE,pinfo:BIRTHDATE,pinfo:BALANCE_COST inpatient_hb /usr/hadoop/inpatient.txt

2.3. Create the Hive tables

#Create the external table backed by HBase
create external table inpatient_hb(
PATIENT_NO String COMMENT 'Inpatient number',
INPATIENT_NO String COMMENT 'Admission serial number',
NAME String COMMENT 'Name',
SEX_CODE String COMMENT 'Sex',
BIRTHDATE TIMESTAMP COMMENT 'Birth date',
BALANCE_COST String COMMENT 'Total cost')
COMMENT 'Basic inpatient information'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,pinfo:INPATIENT_NO,pinfo:NAME,pinfo:SEX_CODE,pinfo:BIRTHDATE,pinfo:BALANCE_COST")
TBLPROPERTIES ("hbase.table.name" = "inpatient_hb");

#Create the native Hive table
create table inpatient_hv(
PATIENT_NO String COMMENT 'Inpatient number',
INPATIENT_NO String COMMENT 'Admission serial number',
NAME String COMMENT 'Name',
SEX_CODE String COMMENT 'Sex',
BIRTHDATE TIMESTAMP COMMENT 'Birth date',
BALANCE_COST String COMMENT 'Total cost')
COMMENT 'Basic inpatient information'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '\\'
STORED AS TEXTFILE;

2.4. Copy the data from HBase into Hive

INSERT OVERWRITE TABLE inpatient_hv SELECT * FROM inpatient_hb;

Hive Environment Setup 04

1. Create the table

create table inpatient(
PATIENT_NO String COMMENT 'Inpatient number',
INPATIENT_NO String COMMENT 'Admission serial number',
NAME String COMMENT 'Name',
SEX_CODE String COMMENT 'Sex',
BIRTHDATE TIMESTAMP COMMENT 'Birth date',
BALANCE_COST String COMMENT 'Total cost')
COMMENT 'Basic inpatient information'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,pinfo:INPATIENT_NO,pinfo:NAME,pinfo:SEX_CODE,pinfo:BIRTHDATE,pinfo:BALANCE_COST")
TBLPROPERTIES ("hbase.table.name" = "inpatient");

2. Load data into HBase
2.1. Direct import with ImportTsv

./hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="," -Dimporttsv.columns=HBASE_ROW_KEY,pinfo:INPATIENT_NO,pinfo:NAME,pinfo:SEX_CODE,pinfo:BIRTHDATE,pinfo:BALANCE_COST inpatient /usr/hadoop/inpatient.txt
......
2016-12-22 10:33:36,985 INFO  [main] client.RMProxy: Connecting to ResourceManager at hadoop-master/172.16.172.13:8032
2016-12-22 10:33:37,340 INFO  [main] Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2016-12-22 10:33:43,450 INFO  [main] input.FileInputFormat: Total input paths to process : 1
2016-12-22 10:33:44,640 INFO  [main] mapreduce.JobSubmitter: number of splits:1
2016-12-22 10:33:44,952 INFO  [main] Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2016-12-22 10:33:47,173 INFO  [main] mapreduce.JobSubmitter: Submitting tokens for job: job_1482371551462_0002
2016-12-22 10:33:50,830 INFO  [main] impl.YarnClientImpl: Submitted application application_1482371551462_0002
2016-12-22 10:33:51,337 INFO  [main] mapreduce.Job: The url to track the job: http://hadoop-master:8088/proxy/application_1482371551462_0002/
2016-12-22 10:33:51,338 INFO  [main] mapreduce.Job: Running job: job_1482371551462_0002
2016-12-22 10:34:39,499 INFO  [main] mapreduce.Job: Job job_1482371551462_0002 running in uber mode : false
2016-12-22 10:34:39,572 INFO  [main] mapreduce.Job:  map 0% reduce 0%
2016-12-22 10:35:48,228 INFO  [main] mapreduce.Job:  map 1% reduce 0%
2016-12-22 10:36:06,876 INFO  [main] mapreduce.Job:  map 3% reduce 0%
2016-12-22 10:36:09,981 INFO  [main] mapreduce.Job:  map 5% reduce 0%
2016-12-22 10:36:13,739 INFO  [main] mapreduce.Job:  map 7% reduce 0%
2016-12-22 10:36:17,592 INFO  [main] mapreduce.Job:  map 10% reduce 0%
2016-12-22 10:36:22,891 INFO  [main] mapreduce.Job:  map 12% reduce 0%
2016-12-22 10:36:45,217 INFO  [main] mapreduce.Job:  map 17% reduce 0%
2016-12-22 10:37:14,914 INFO  [main] mapreduce.Job:  map 20% reduce 0%
2016-12-22 10:37:35,739 INFO  [main] mapreduce.Job:  map 25% reduce 0%
2016-12-22 10:37:39,013 INFO  [main] mapreduce.Job:  map 34% reduce 0%
2016-12-22 10:38:24,289 INFO  [main] mapreduce.Job:  map 42% reduce 0%
2016-12-22 10:38:36,644 INFO  [main] mapreduce.Job:  map 49% reduce 0%
2016-12-22 10:38:57,618 INFO  [main] mapreduce.Job:  map 54% reduce 0%
2016-12-22 10:39:00,808 INFO  [main] mapreduce.Job:  map 56% reduce 0%
2016-12-22 10:39:07,879 INFO  [main] mapreduce.Job:  map 58% reduce 0%
2016-12-22 10:39:11,489 INFO  [main] mapreduce.Job:  map 60% reduce 0%
2016-12-22 10:39:24,708 INFO  [main] mapreduce.Job:  map 62% reduce 0%
2016-12-22 10:39:29,188 INFO  [main] mapreduce.Job:  map 63% reduce 0%
2016-12-22 10:39:34,165 INFO  [main] mapreduce.Job:  map 65% reduce 0%
2016-12-22 10:40:12,473 INFO  [main] mapreduce.Job:  map 66% reduce 0%
2016-12-22 10:40:39,471 INFO  [main] mapreduce.Job:  map 73% reduce 0%
2016-12-22 10:40:40,910 INFO  [main] mapreduce.Job:  map 74% reduce 0%
2016-12-22 10:40:42,936 INFO  [main] mapreduce.Job:  map 75% reduce 0%
2016-12-22 10:40:46,471 INFO  [main] mapreduce.Job:  map 77% reduce 0%
2016-12-22 10:40:50,495 INFO  [main] mapreduce.Job:  map 79% reduce 0%
2016-12-22 10:40:53,267 INFO  [main] mapreduce.Job:  map 81% reduce 0%
2016-12-22 10:41:06,843 INFO  [main] mapreduce.Job:  map 83% reduce 0%
2016-12-22 10:41:13,140 INFO  [main] mapreduce.Job:  map 92% reduce 0%
2016-12-22 10:41:22,305 INFO  [main] mapreduce.Job:  map 93% reduce 0%
2016-12-22 10:41:27,671 INFO  [main] mapreduce.Job:  map 96% reduce 0%
2016-12-22 10:41:48,688 INFO  [main] mapreduce.Job:  map 100% reduce 0%
2016-12-22 10:43:20,552 INFO  [main] mapreduce.Job: Job job_1482371551462_0002 completed successfully
2016-12-22 10:43:28,574 INFO  [main] mapreduce.Job: Counters: 31
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=127746
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=43306042
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=2
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=0
        Job Counters
                Launched map tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=460404
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=460404
                Total vcore-seconds taken by all map tasks=460404
                Total megabyte-seconds taken by all map tasks=471453696
        Map-Reduce Framework
                Map input records=115411
                Map output records=115152
                Input split bytes=115
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=26590
                CPU time spent (ms)=234550
                Physical memory (bytes) snapshot=83329024
                Virtual memory (bytes) snapshot=544129024
                Total committed heap usage (bytes)=29036544
        ImportTsv
                Bad Lines=259
        File Input Format Counters
                Bytes Read=43305927
        File Output Format Counters
                Bytes Written=0

2.2. Import with completebulkload

#Add the following line to /etc/profile
#export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$HBASE_HOME/lib/*"
./hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="," -Dimporttsv.bulk.output=/usr/hadoop/inpatient.tmp -Dimporttsv.columns=HBASE_ROW_KEY,pinfo:INPATIENT_NO,pinfo:NAME,pinfo:SEX_CODE,pinfo:BIRTHDATE,pinfo:BALANCE_COST inpatient /usr/hadoop/inpatient.txt
......
2016-12-22 12:26:04,496 INFO  [main] client.RMProxy: Connecting to ResourceManager at hadoop-master/172.16.172.13:8032
2016-12-22 12:26:12,411 INFO  [main] input.FileInputFormat: Total input paths to process : 1
2016-12-22 12:26:12,563 INFO  [main] mapreduce.JobSubmitter: number of splits:1
2016-12-22 12:26:12,577 INFO  [main] Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2016-12-22 12:26:13,220 INFO  [main] mapreduce.JobSubmitter: Submitting tokens for job: job_1482371551462_0005
2016-12-22 12:26:13,764 INFO  [main] impl.YarnClientImpl: Submitted application application_1482371551462_0005
2016-12-22 12:26:13,832 INFO  [main] mapreduce.Job: The url to track the job: http://hadoop-master:8088/proxy/application_1482371551462_0005/
2016-12-22 12:26:13,833 INFO  [main] mapreduce.Job: Running job: job_1482371551462_0005
2016-12-22 12:26:35,952 INFO  [main] mapreduce.Job: Job job_1482371551462_0005 running in uber mode : false
2016-12-22 12:26:36,156 INFO  [main] mapreduce.Job:  map 0% reduce 0%
2016-12-22 12:27:15,839 INFO  [main] mapreduce.Job:  map 3% reduce 0%
2016-12-22 12:27:18,868 INFO  [main] mapreduce.Job:  map 53% reduce 0%
2016-12-22 12:27:21,981 INFO  [main] mapreduce.Job:  map 58% reduce 0%
2016-12-22 12:27:29,195 INFO  [main] mapreduce.Job:  map 67% reduce 0%
2016-12-22 12:27:41,582 INFO  [main] mapreduce.Job:  map 83% reduce 0%
2016-12-22 12:27:52,819 INFO  [main] mapreduce.Job:  map 85% reduce 0%
2016-12-22 12:27:59,189 INFO  [main] mapreduce.Job:  map 93% reduce 0%
2016-12-22 12:28:07,498 INFO  [main] mapreduce.Job:  map 100% reduce 0%
2016-12-22 12:29:11,199 INFO  [main] mapreduce.Job:  map 100% reduce 67%
2016-12-22 12:29:24,353 INFO  [main] mapreduce.Job:  map 100% reduce 70%
2016-12-22 12:29:32,324 INFO  [main] mapreduce.Job:  map 100% reduce 74%
2016-12-22 12:29:37,001 INFO  [main] mapreduce.Job:  map 100% reduce 79%
2016-12-22 12:29:38,011 INFO  [main] mapreduce.Job:  map 100% reduce 82%
2016-12-22 12:29:41,038 INFO  [main] mapreduce.Job:  map 100% reduce 84%
2016-12-22 12:29:45,082 INFO  [main] mapreduce.Job:  map 100% reduce 88%
2016-12-22 12:29:48,115 INFO  [main] mapreduce.Job:  map 100% reduce 90%
2016-12-22 12:29:51,154 INFO  [main] mapreduce.Job:  map 100% reduce 92%
2016-12-22 12:29:54,186 INFO  [main] mapreduce.Job:  map 100% reduce 94%
2016-12-22 12:29:57,205 INFO  [main] mapreduce.Job:  map 100% reduce 97%
2016-12-22 12:30:00,236 INFO  [main] mapreduce.Job:  map 100% reduce 100%
2016-12-22 12:30:06,388 INFO  [main] mapreduce.Job: Job job_1482371551462_0005 completed successfully
2016-12-22 12:30:09,203 INFO  [main] mapreduce.Job: Counters: 50
        File System Counters
                FILE: Number of bytes read=237707880
                FILE: Number of bytes written=357751428
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=43306042
                HDFS: Number of bytes written=195749237
                HDFS: Number of read operations=8
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters
                Launched map tasks=1
                Launched reduce tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=99691
                Total time spent by all reduces in occupied slots (ms)=83330
                Total time spent by all map tasks (ms)=99691
                Total time spent by all reduce tasks (ms)=83330
                Total vcore-seconds taken by all map tasks=99691
                Total vcore-seconds taken by all reduce tasks=83330
                Total megabyte-seconds taken by all map tasks=102083584
                Total megabyte-seconds taken by all reduce tasks=85329920
        Map-Reduce Framework
                Map input records=115411
                Map output records=115152
                Map output bytes=118397787
                Map output materialized bytes=118853937
                Input split bytes=115
                Combine input records=115152
                Combine output records=115077
                Reduce input groups=115077
                Reduce shuffle bytes=118853937
                Reduce input records=115077
                Reduce output records=3337137
                Spilled Records=345231
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=2017
                CPU time spent (ms)=38130
                Physical memory (bytes) snapshot=383750144
                Virtual memory (bytes) snapshot=1184014336
                Total committed heap usage (bytes)=231235584
        ImportTsv
                Bad Lines=259
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=43305927
        File Output Format Counters
                Bytes Written=195749237

#Load the HFiles generated above into the inpatient table
./hadoop jar /home/hadoop/Deploy/hbase-1.1.2/lib/hbase-server-1.1.2.jar completebulkload /usr/hadoop/inpatient.tmp inpatient
......
16/12/22 12:42:04 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x4df040780x0, quorum=localhost:2181, baseZNode=/hbase
16/12/22 12:42:04 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
16/12/22 12:42:04 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
16/12/22 12:42:04 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x5924755d380005, negotiated timeout = 90000
16/12/22 12:42:06 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x7979cd9c connecting to ZooKeeper ensemble=localhost:2181
16/12/22 12:42:06 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x7979cd9c0x0, quorum=localhost:2181, baseZNode=/hbase
16/12/22 12:42:06 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
16/12/22 12:42:06 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
16/12/22 12:42:07 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x5924755d380006, negotiated timeout = 90000
16/12/22 12:42:07 WARN mapreduce.LoadIncrementalHFiles: Skipping non-directory hdfs://hadoop-master:9000/usr/hadoop/inpatient.tmp/_SUCCESS
16/12/22 12:42:08 INFO hfile.CacheConfig: CacheConfig:disabled
16/12/22 12:42:08 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://hadoop-master:9000/usr/hadoop/inpatient.tmp/pinfo/7ee330c0f66c4d36b5d614a337d3929f first=" last="B301150360"
16/12/22 12:42:08 INFO client.ConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
16/12/22 12:42:08 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x5924755d380006
16/12/22 12:42:08 INFO zookeeper.ZooKeeper: Session: 0x5924755d380006 closed
16/12/22 12:42:08 INFO zookeeper.ClientCnxn: EventThread shut down

3. Query from Hive

hive> select * from inpatient limit 1;
OK
......
Time taken: 12.419 seconds, Fetched: 1 row(s)

hive> select count(*) from inpatient;
Query ID = hadoop_20161222114304_b247c745-a6ec-4e52-b76d-daefb657ac20
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1482371551462_0004, Tracking URL = http://hadoop-master:8088/proxy/application_1482371551462_0004/
Kill Command = /home/hadoop/Deploy/hadoop-2.5.2/bin/hadoop job  -kill job_1482371551462_0004
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2016-12-22 11:44:22,634 Stage-1 map = 0%,  reduce = 0%
2016-12-22 11:45:08,704 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.74 sec
2016-12-22 11:45:50,754 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 8.19 sec
MapReduce Total cumulative CPU time: 8 seconds 190 msec
Ended Job = job_1482371551462_0004
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 8.19 sec   HDFS Read: 13353 HDFS Write: 7 SUCCESS
Total MapReduce CPU Time Spent: 8 seconds 190 msec
OK
115077
Time taken: 170.801 seconds, Fetched: 1 row(s)
