Kafka Standalone Docker Container

Now that we know how to install Kafka on a Raspberry Pi, we want to move the whole setup into a Docker container. Over the course of this article we will create a Dockerfile that serves as the basis for the Docker container. The configuration files used for the setup have already been explained in the Raspberry Pi article.

Dockerfile – Preparation

For our Dockerfile we create a separate folder. Inside it we create a file named dockerfile and a folder named files. The files folder holds every file that should end up in the image for the Docker container.
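
The layout described above can be created from a shell in a few commands. This is only a sketch: the project folder name kafka-docker is an assumption, since the article does not name the folder.

```shell
#!/bin/sh
# Sketch: create the project layout for the image build.
# PROJECT_DIR is a hypothetical name; the article does not prescribe one.
PROJECT_DIR="${PROJECT_DIR:-kafka-docker}"

mkdir -p "$PROJECT_DIR/files"     # folder for everything copied into the image
touch "$PROJECT_DIR/dockerfile"   # empty build file, filled in the next steps

ls -a "$PROJECT_DIR"
```

All later paths such as files/zoo.cfg are relative to this project folder.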

Once the file and the folder exist, open the file dockerfile in an editor and fill it with the following lines:

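FROM centos:latest

# Install updates and a few useful tools.
# On CentOS, dig and nslookup ship in bind-utils (dnsutils is the Debian package name).
RUN yum update -y
RUN yum install vim wget bind-utils -y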

WORKDIR /app

The first line specifies which base image to use. After that, updates and a few useful programs are installed. Finally, WORKDIR creates the folder /app and makes it the working directory; this is where we will install our application in the following steps.

Dockerfile – Java Installation

We start with the installation of Java. For this, the file jdk-8u191-linux-x64.rpm must be placed in the files folder. The file can be downloaded from the Oracle download page. Once the file is in place, the Dockerfile can be extended with the following lines:

#java 
COPY files/jdk-8u191-linux-x64.rpm /app/
RUN yum install /app/jdk-8u191-linux-x64.rpm -y
ENV JAVA_HOME /usr/java/latest

Here the RPM package is copied from the files folder into the /app folder and then installed. After the installation, the environment variable JAVA_HOME is set.
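
Because COPY aborts the build when a source file is missing, it can be worth checking the files folder before building. The following is a minimal sketch of such a check; missing_files is a hypothetical helper name and not part of Docker.

```shell
#!/bin/sh
# Sketch: report files that the Dockerfile's COPY lines expect but that are absent.
# missing_files is a hypothetical helper name.
missing_files() {
  # $1 = folder to look in, remaining arguments = expected file names
  dir=$1
  shift
  result=0
  for name in "$@"; do
    if [ ! -f "$dir/$name" ]; then
      echo "missing: $dir/$name"
      result=1
    fi
  done
  return "$result"
}

if missing_files files jdk-8u191-linux-x64.rpm; then
  echo "all build inputs present"
else
  echo "add the files listed above before running the build"
fi
```

The same function can be called with further file names as more files are added to the files folder.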

Dockerfile – Zookeeper Installation

For Zookeeper we create the file zoo.cfg in the files folder and fill it with the following content:

# The number of milliseconds of each tick
tickTime=5000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/app/zookeeper/data
# the port at which the clients will connect
clientPort=2181

server.1=localhost:2888:3888
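
The limits in this file are counted in ticks, so the effective timeouts follow from multiplying them by tickTime. The sketch below does that arithmetic for the values above; read_setting is a hypothetical helper, and the settings are repeated inline so the example runs on its own.

```shell
#!/bin/sh
# Sketch: derive the effective ZooKeeper timeouts from zoo.cfg-style settings.
# read_setting is a hypothetical helper, not part of ZooKeeper.
read_setting() {
  # $1 = config text, $2 = key; prints the value of the first "key=value" line
  printf '%s\n' "$1" | sed -n "s/^$2=//p" | head -n 1
}

zoo_cfg='tickTime=5000
initLimit=10
syncLimit=5'

tick_ms=$(read_setting "$zoo_cfg" tickTime)
init_limit=$(read_setting "$zoo_cfg" initLimit)
sync_limit=$(read_setting "$zoo_cfg" syncLimit)

init_timeout_ms=$(( tick_ms * init_limit ))   # initial synchronization phase
sync_timeout_ms=$(( tick_ms * sync_limit ))   # request-to-acknowledgement window

echo "initial synchronization may take up to ${init_timeout_ms} ms"
echo "an acknowledgement may take up to ${sync_timeout_ms} ms"
```

With tickTime=5000 this gives 50 seconds for the initial synchronization and 25 seconds for an acknowledgement.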

Next we create a second file in the files folder, named log4j.properties.

# Define some default values that can be overridden by system properties
zookeeper.root.logger=INFO, CONSOLE
zookeeper.console.threshold=INFO
zookeeper.log.dir=/app/zookeeper/logs
zookeeper.log.file=zookeeper.log
zookeeper.log.threshold=DEBUG
zookeeper.tracelog.dir=.
zookeeper.tracelog.file=zookeeper_trace.log

#
# ZooKeeper Logging Configuration
#

# Format is "<default threshold> (, <appender>)+

# DEFAULT: console appender only
log4j.rootLogger=${zookeeper.root.logger}

# Example with rolling log file
#log4j.rootLogger=DEBUG, CONSOLE, ROLLINGFILE

# Example with rolling log file and tracing
#log4j.rootLogger=TRACE, CONSOLE, ROLLINGFILE, TRACEFILE

#
# Log INFO level and above messages to the console
#
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.Threshold=${zookeeper.console.threshold}
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n

#
# Add ROLLINGFILE to rootLogger to get log file output
#    Log DEBUG level and above messages to a log file
log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLINGFILE.Threshold=${zookeeper.log.threshold}
log4j.appender.ROLLINGFILE.File=${zookeeper.log.dir}/${zookeeper.log.file}

# Max log file size of 10MB
log4j.appender.ROLLINGFILE.MaxFileSize=10MB
# uncomment the next line to limit number of backup files
#log4j.appender.ROLLINGFILE.MaxBackupIndex=10

log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n


#
# Add TRACEFILE to rootLogger to get log file output
#    Log DEBUG level and above messages to a log file
log4j.appender.TRACEFILE=org.apache.log4j.FileAppender
log4j.appender.TRACEFILE.Threshold=TRACE
log4j.appender.TRACEFILE.File=${zookeeper.tracelog.dir}/${zookeeper.tracelog.file}

log4j.appender.TRACEFILE.layout=org.apache.log4j.PatternLayout
### Notice we are including log4j's NDC here (%x)
log4j.appender.TRACEFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L][%x] - %m%n

Now that the files needed for Zookeeper have been created, we continue with the Dockerfile.

#Zookeeper
RUN wget http://www-eu.apache.org/dist/zookeeper/zookeeper-3.4.13/zookeeper-3.4.13.tar.gz
RUN tar -zxvf zookeeper-3.4.13.tar.gz -C /app
RUN mv /app/zookeeper-3.4.13/ /app/zookeeper
COPY files/zoo.cfg /app/zookeeper/conf/
COPY files/log4j.properties /app/zookeeper/conf/
RUN mkdir /app/zookeeper/logs

This section of the Dockerfile starts by downloading Zookeeper. Zookeeper is then unpacked and the directory renamed. After that, the previously created configuration files are copied and a folder for the Zookeeper log files is created.
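
To see what the tar and mv steps leave behind without downloading anything, the same commands can be tried on a dummy archive. This is a sketch only: the temporary directory and the empty stand-in archive are assumptions for illustration; the archive name and the target folder names are taken from the Dockerfile.

```shell
#!/bin/sh
# Sketch: replay the unpack-and-rename steps on a local dummy archive.
work=$(mktemp -d)   # scratch area standing in for the image file system

# Build a stand-in archive with the same top-level folder as the real download.
mkdir -p "$work/src/zookeeper-3.4.13/conf"
tar -czf "$work/zookeeper-3.4.13.tar.gz" -C "$work/src" zookeeper-3.4.13

# The steps from the Dockerfile, with "$work/app" in place of /app.
mkdir -p "$work/app"
tar -zxf "$work/zookeeper-3.4.13.tar.gz" -C "$work/app"
mv "$work/app/zookeeper-3.4.13" "$work/app/zookeeper"
mkdir "$work/app/zookeeper/logs"

ls "$work/app/zookeeper"
```

Renaming the versioned folder to a fixed name keeps the later paths, such as /app/zookeeper/conf, independent of the Zookeeper version.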

Dockerfile – Kafka Installation

For Kafka, too, we first have to create a file in the files folder, named kafka-run-class.sh.

#!/bin/bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

if [ $# -lt 1 ];
then
  echo "USAGE: $0 [-daemon] [-name servicename] [-loggc] classname [opts]"
  exit 1
fi

# CYGWIN == 1 if Cygwin is detected, else 0.
if [[ $(uname -a) =~ "CYGWIN" ]]; then
  CYGWIN=1
else
  CYGWIN=0
fi

if [ -z "$INCLUDE_TEST_JARS" ]; then
  INCLUDE_TEST_JARS=false
fi

# Exclude jars not necessary for running commands.
regex="(-(test|test-sources|src|scaladoc|javadoc)\.jar|jar.asc)$"
should_include_file() {
  if [ "$INCLUDE_TEST_JARS" = true ]; then
    return 0
  fi
  file=$1
  if [ -z "$(echo "$file" | egrep "$regex")" ] ; then
    return 0
  else
    return 1
  fi
}

base_dir=$(dirname $0)/..

if [ -z "$SCALA_VERSION" ]; then
  SCALA_VERSION=2.11.12
fi

if [ -z "$SCALA_BINARY_VERSION" ]; then
  SCALA_BINARY_VERSION=$(echo $SCALA_VERSION | cut -f 1-2 -d '.')
fi

# run ./gradlew copyDependantLibs to get all dependant jars in a local dir
shopt -s nullglob
for dir in "$base_dir"/core/build/dependant-libs-${SCALA_VERSION}*;
do
  CLASSPATH="$CLASSPATH:$dir/*"
done

for file in "$base_dir"/examples/build/libs/kafka-examples*.jar;
do
  if should_include_file "$file"; then
    CLASSPATH="$CLASSPATH":"$file"
  fi
done

if [ -z "$UPGRADE_KAFKA_STREAMS_TEST_VERSION" ]; then
  clients_lib_dir=$(dirname $0)/../clients/build/libs
  streams_lib_dir=$(dirname $0)/../streams/build/libs
  rocksdb_lib_dir=$(dirname $0)/../streams/build/dependant-libs-${SCALA_VERSION}
else
  clients_lib_dir=/opt/kafka-$UPGRADE_KAFKA_STREAMS_TEST_VERSION/libs
  streams_lib_dir=$clients_lib_dir
  rocksdb_lib_dir=$streams_lib_dir
fi


for file in "$clients_lib_dir"/kafka-clients*.jar;
do
  if should_include_file "$file"; then
    CLASSPATH="$CLASSPATH":"$file"
  fi
done

for file in "$streams_lib_dir"/kafka-streams*.jar;
do
  if should_include_file "$file"; then
    CLASSPATH="$CLASSPATH":"$file"
  fi
done

if [ -z "$UPGRADE_KAFKA_STREAMS_TEST_VERSION" ]; then
  for file in "$base_dir"/streams/examples/build/libs/kafka-streams-examples*.jar;
  do
    if should_include_file "$file"; then
      CLASSPATH="$CLASSPATH":"$file"
    fi
  done
else
  VERSION_NO_DOTS=`echo $UPGRADE_KAFKA_STREAMS_TEST_VERSION | sed 's/\.//g'`
  SHORT_VERSION_NO_DOTS=${VERSION_NO_DOTS:0:((${#VERSION_NO_DOTS} - 1))} # remove last char, ie, bug-fix number
  for file in "$base_dir"/streams/upgrade-system-tests-$SHORT_VERSION_NO_DOTS/build/libs/kafka-streams-upgrade-system-tests*.jar;
  do
    if should_include_file "$file"; then
      CLASSPATH="$CLASSPATH":"$file"
    fi
  done
fi

for file in "$rocksdb_lib_dir"/rocksdb*.jar;
do
  CLASSPATH="$CLASSPATH":"$file"
done

for file in "$base_dir"/tools/build/libs/kafka-tools*.jar;
do
  if should_include_file "$file"; then
    CLASSPATH="$CLASSPATH":"$file"
  fi
done

for dir in "$base_dir"/tools/build/dependant-libs-${SCALA_VERSION}*;
do
  CLASSPATH="$CLASSPATH:$dir/*"
done

for cc_pkg in "api" "transforms" "runtime" "file" "json" "tools" "basic-auth-extension"
do
  for file in "$base_dir"/connect/${cc_pkg}/build/libs/connect-${cc_pkg}*.jar;
  do
    if should_include_file "$file"; then
      CLASSPATH="$CLASSPATH":"$file"
    fi
  done
  if [ -d "$base_dir/connect/${cc_pkg}/build/dependant-libs" ] ; then
    CLASSPATH="$CLASSPATH:$base_dir/connect/${cc_pkg}/build/dependant-libs/*"
  fi
done

# classpath addition for release
for file in "$base_dir"/libs/*;
do
  if should_include_file "$file"; then
    CLASSPATH="$CLASSPATH":"$file"
  fi
done

for file in "$base_dir"/core/build/libs/kafka_${SCALA_BINARY_VERSION}*.jar;
do
  if should_include_file "$file"; then
    CLASSPATH="$CLASSPATH":"$file"
  fi
done
shopt -u nullglob

if [ -z "$CLASSPATH" ] ; then
  echo "Classpath is empty. Please build the project first e.g. by running './gradlew jar -PscalaVersion=$SCALA_VERSION'"
  exit 1
fi

# JMX settings
if [ -z "$KAFKA_JMX_OPTS" ]; then
  KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false  -Dcom.sun.management.jmxremote.ssl=false "
fi

# JMX port to use
if [  $JMX_PORT ]; then
  KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT "
fi

# Log directory to use
if [ "x$LOG_DIR" = "x" ]; then
  LOG_DIR="$base_dir/logs"
fi

# Log4j settings
if [ -z "$KAFKA_LOG4J_OPTS" ]; then
  # Log to console. This is a tool.
  LOG4J_DIR="$base_dir/config/tools-log4j.properties"
  # If Cygwin is detected, LOG4J_DIR is converted to Windows format.
  (( CYGWIN )) && LOG4J_DIR=$(cygpath --path --mixed "${LOG4J_DIR}")
  KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:${LOG4J_DIR}"
else
  # create logs directory
  if [ ! -d "$LOG_DIR" ]; then
    mkdir -p "$LOG_DIR"
  fi
fi

# If Cygwin is detected, LOG_DIR is converted to Windows format.
(( CYGWIN )) && LOG_DIR=$(cygpath --path --mixed "${LOG_DIR}")
KAFKA_LOG4J_OPTS="-Dkafka.logs.dir=$LOG_DIR $KAFKA_LOG4J_OPTS"

# Generic jvm settings you want to add
if [ -z "$KAFKA_OPTS" ]; then
  KAFKA_OPTS=""
fi

# Set Debug options if enabled
if [ "x$KAFKA_DEBUG" != "x" ]; then

    # Use default ports
    DEFAULT_JAVA_DEBUG_PORT="5005"

    if [ -z "$JAVA_DEBUG_PORT" ]; then
        JAVA_DEBUG_PORT="$DEFAULT_JAVA_DEBUG_PORT"
    fi

    # Use the defaults if JAVA_DEBUG_OPTS was not set
    DEFAULT_JAVA_DEBUG_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=${DEBUG_SUSPEND_FLAG:-n},address=$JAVA_DEBUG_PORT"
    if [ -z "$JAVA_DEBUG_OPTS" ]; then
        JAVA_DEBUG_OPTS="$DEFAULT_JAVA_DEBUG_OPTS"
    fi

    echo "Enabling Java debug options: $JAVA_DEBUG_OPTS"
    KAFKA_OPTS="$JAVA_DEBUG_OPTS $KAFKA_OPTS"
fi

# Which java to use
if [ -z "$JAVA_HOME" ]; then
  JAVA="java"
else
  JAVA="$JAVA_HOME/bin/java"
fi

# Memory options
if [ -z "$KAFKA_HEAP_OPTS" ]; then
  KAFKA_HEAP_OPTS="-Xmx256M"
fi

# JVM performance options
if [ -z "$KAFKA_JVM_PERFORMANCE_OPTS" ]; then
  KAFKA_JVM_PERFORMANCE_OPTS="-client -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark -XX:+DisableExplicitGC -Djava.awt.headless=true"
fi
# version option
for args in "$@" ; do
  if [ "$args" = "--version" ]; then
    exec $JAVA $KAFKA_HEAP_OPTS $KAFKA_JVM_PERFORMANCE_OPTS $KAFKA_GC_LOG_OPTS $KAFKA_JMX_OPTS $KAFKA_LOG4J_OPTS -cp $CLASSPATH $KAFKA_OPTS "kafka.utils.VersionInfo"
  fi
done

while [ $# -gt 0 ]; do
  COMMAND=$1
  case $COMMAND in
    -name)
      DAEMON_NAME=$2
      CONSOLE_OUTPUT_FILE=$LOG_DIR/$DAEMON_NAME.out
      shift 2
      ;;
    -loggc)
      if [ -z "$KAFKA_GC_LOG_OPTS" ]; then
        GC_LOG_ENABLED="true"
      fi
      shift
      ;;
    -daemon)
      DAEMON_MODE="true"
      shift
      ;;
    *)
      break
      ;;
  esac
done

# GC options
GC_FILE_SUFFIX='-gc.log'
GC_LOG_FILE_NAME=''
if [ "x$GC_LOG_ENABLED" = "xtrue" ]; then
  GC_LOG_FILE_NAME=$DAEMON_NAME$GC_FILE_SUFFIX

  # The first segment of the version number, which is '1' for releases before Java 9
  # it then becomes '9', '10', ...
  # Some examples of the first line of `java --version`:
  # 8 -> java version "1.8.0_152"
  # 9.0.4 -> java version "9.0.4"
  # 10 -> java version "10" 2018-03-20
  # 10.0.1 -> java version "10.0.1" 2018-04-17
  # We need to match to the end of the line to prevent sed from printing the characters that do not match
  JAVA_MAJOR_VERSION=$($JAVA -version 2>&1 | sed -E -n 's/.* version "([0-9]*).*$/\1/p')
  if [[ "$JAVA_MAJOR_VERSION" -ge "9" ]] ; then
    KAFKA_GC_LOG_OPTS="-Xlog:gc*:file=$LOG_DIR/$GC_LOG_FILE_NAME:time,tags:filecount=10,filesize=102400"
  else
    KAFKA_GC_LOG_OPTS="-Xloggc:$LOG_DIR/$GC_LOG_FILE_NAME -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M"
  fi
fi

# Remove a possible colon prefix from the classpath (happens at lines like `CLASSPATH="$CLASSPATH:$file"` when CLASSPATH is blank)
# Syntax used on the right side is native Bash string manipulation; for more details see
# http://tldp.org/LDP/abs/html/string-manipulation.html, specifically the section titled "Substring Removal"
CLASSPATH=${CLASSPATH#:}

# If Cygwin is detected, classpath is converted to Windows format.
(( CYGWIN )) && CLASSPATH=$(cygpath --path --mixed "${CLASSPATH}")

# Launch mode
if [ "x$DAEMON_MODE" = "xtrue" ]; then
  nohup $JAVA $KAFKA_HEAP_OPTS $KAFKA_JVM_PERFORMANCE_OPTS $KAFKA_GC_LOG_OPTS $KAFKA_JMX_OPTS $KAFKA_LOG4J_OPTS -cp $CLASSPATH $KAFKA_OPTS "$@" > "$CONSOLE_OUTPUT_FILE" 2>&1 < /dev/null &
else
  exec $JAVA $KAFKA_HEAP_OPTS $KAFKA_JVM_PERFORMANCE_OPTS $KAFKA_GC_LOG_OPTS $KAFKA_JMX_OPTS $KAFKA_LOG4J_OPTS -cp $CLASSPATH $KAFKA_OPTS "$@"
fi

Now the Dockerfile can be extended further. As with Zookeeper, the Kafka application is downloaded, unpacked, and renamed. After that we define two environment variables and copy the previously created file.

#Kafka
RUN wget http://www-eu.apache.org/dist/kafka/2.0.0/kafka_2.12-2.0.0.tgz
RUN tar -zxvf kafka_2.12-2.0.0.tgz -C /app/
RUN mv /app/kafka_2.12-2.0.0/ /app/kafka
ENV JMX_PORT=${JMX_PORT:-9999}
ENV KAFKA_HEAP_OPTS="-Xmx256M -Xms128M"
COPY files/kafka-run-class.sh /app/kafka/bin
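
The two ENV lines take effect because kafka-run-class.sh only applies its own defaults when a variable is empty. The sketch below reduces that pattern to the heap option; heap_opts is a hypothetical function name used only for this illustration.

```shell
#!/bin/sh
# Sketch of the "use the preset value, else fall back to a default" pattern.
# heap_opts is a hypothetical name; the check itself mirrors kafka-run-class.sh.
heap_opts() {
  if [ -z "$KAFKA_HEAP_OPTS" ]; then
    KAFKA_HEAP_OPTS="-Xmx256M"
  fi
  echo "$KAFKA_HEAP_OPTS"
}

# Without a preset value the script default applies (subshell keeps the unset local).
without_env=$(unset KAFKA_HEAP_OPTS; heap_opts)

# With the value from the Dockerfile's ENV line the default is skipped.
with_env=$(KAFKA_HEAP_OPTS="-Xmx256M -Xms128M"; heap_opts)

echo "default:  $without_env"
echo "from ENV: $with_env"

# ${VAR:-word} is the inline form of the same idea, as used for JMX_PORT.
jmx_port=$(unset JMX_PORT; echo "${JMX_PORT:-9999}")
echo "JMX port: $jmx_port"
```

Note that in a Dockerfile the ${JMX_PORT:-9999} expansion is resolved at build time against earlier ENV and ARG values, not against the shell that starts the build.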

Dockerfile – Starting Multiple Services

Starting several services in one Docker container is not entirely straightforward, because a container has only one start command, so two start scripts cannot simply be listed one after the other. There are two ways around this: either you write a start script that launches several services, or you use the tool supervisor. I chose a start script here that starts Zookeeper and Kafka one after the other. It is stored in the files folder under the name start.sh and looks as follows:

#!/bin/bash

# Start the first process
/app/zookeeper/bin/zkServer.sh start 
status=$?
if [ $status -ne 0 ]; then
  echo "Failed to start zookeeper: $status"
  exit $status
fi

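# Start the second process
# Note: without -daemon, kafka-server-start.sh stays in the foreground, so the
# lines below only run once Kafka has exited.
/app/kafka/bin/kafka-server-start.sh /app/kafka/config/server.properties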
status=$?
if [ $status -ne 0 ]; then
  echo "Failed to start kafka: $status"
  exit $status
fi

# Naive check runs checks once a minute to see if either of the processes exited.
# This illustrates part of the heavy lifting you need to do if you want to run
# more than one service in a container. The container exits with an error
# if it detects that either of the processes has exited.
# Otherwise it loops forever, waking up every 60 seconds

while sleep 60; do
  ps aux |grep zookeeper |grep -q -v grep
  PROCESS_1_STATUS=$?
  ps aux |grep kafka |grep -q -v grep
  PROCESS_2_STATUS=$?
  # If the greps above find anything, they exit with 0 status
  # If they are not both 0, then something is wrong
  if [ $PROCESS_1_STATUS -ne 0 -o $PROCESS_2_STATUS -ne 0 ]; then
    echo "One of the processes has already exited."
    exit 1
  fi
done
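
The ps and grep check matches any process whose command line contains the search word. A tighter alternative, which is not what the script above does, is to record a PID and test it with kill -0. The following sketch shows that technique with sleep standing in for a service; is_alive is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: liveness check by PID instead of grepping the process list.
# is_alive is a hypothetical helper; sleep stands in for a real service.
is_alive() {
  # kill -0 sends no signal; it only reports whether the PID can be signalled.
  kill -0 "$1" 2>/dev/null
}

sleep 30 &
service_pid=$!

if is_alive "$service_pid"; then
  echo "service is running"
fi

# Stop the stand-in service and collect its exit status.
kill "$service_pid"
wait "$service_pid" 2>/dev/null || true

if ! is_alive "$service_pid"; then
  echo "service has exited"
fi
```

This only works for processes whose PID the script knows, for example a service started in the background with & from the same script.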

Now the last steps can be added to the Dockerfile. First I declare the two ports that will later be used for communication. Then the start script is copied into the image and its permissions are set. Finally, the script is registered as the container's command, so that the application starts when the container starts.

EXPOSE 2181 9092

COPY files/start.sh /app/start.sh
RUN chmod 777 /app/start.sh
CMD /app/start.sh

The image can now be built by running the following command in the folder that contains the Dockerfile:

docker build -t kafka .
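
Once the image is built, a container can be started from it with the two exposed ports published to the host. The sketch below only composes and prints such a command so it can be reviewed first; the container name kafka-standalone is an assumption, while the image tag kafka and the ports come from the steps above.

```shell
#!/bin/sh
# Sketch: compose a docker run command that publishes the two exposed ports.
# CONTAINER_NAME is a hypothetical name; "kafka" is the image tag from the build.
IMAGE="kafka"
CONTAINER_NAME="kafka-standalone"

run_cmd="docker run -d --name $CONTAINER_NAME"
for port in 2181 9092; do
  run_cmd="$run_cmd -p $port:$port"   # host port : container port
done
run_cmd="$run_cmd $IMAGE"

# Print the command for review; paste it into a shell to start the container.
echo "$run_cmd"
```

Publishing the ports with -p is needed because EXPOSE alone only documents them and does not make them reachable from the host.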

If you would rather not copy everything by hand, you can also pull the project from my GitHub repository.

About Christian Piazzi
